Caching
Example queries to run on top of cached requests.
Below are helpful examples for querying cached logs. Run these queries in Velvet's AI SQL editor, or any other tool you're comfortable with.
Cached logs
Caching unlocks additional metadata stored with each log. If the velvet-cache-enabled header is set, the gateway will respond with a velvet-cache-status header. velvet-cache-status will be one of HIT, MISS, or NONE/UNKNOWN.
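As a rough illustration of reading that header on the client side, the sketch below normalizes a response's headers and reports the cache status. The header name comes from the text above. The helper names cache_status and served_from_cache are hypothetical, and treating every value other than HIT or MISS as NONE/UNKNOWN is an assumption, not documented gateway behavior.

```python
from typing import Mapping


def cache_status(headers: Mapping[str, str]) -> str:
    """Return 'HIT', 'MISS', or 'NONE/UNKNOWN' from gateway response headers.

    Hypothetical helper: only the header name is taken from the docs.
    """
    # HTTP header names are case-insensitive, so normalize before lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    raw = normalized.get("velvet-cache-status", "").strip().upper()
    # Assumption: anything other than an explicit HIT or MISS is treated
    # as "not cached / unknown".
    return raw if raw in ("HIT", "MISS") else "NONE/UNKNOWN"


def served_from_cache(headers: Mapping[str, str]) -> bool:
    """True only when the gateway reported a cache HIT."""
    return cache_status(headers) == "HIT"
```

Any HTTP client's response headers can be passed in as a plain mapping, for example served_from_cache(dict(response.headers)).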
Example SQL queries
Show the difference in cost between requests with caching enabled and requests without it.
SELECT
  (metadata->'cache'->>'enabled')::boolean AS cache_enabled,
  SUM((metadata->'cost'->>'input_cost')::numeric) AS total_input_cost,
  SUM((metadata->'cost'->>'output_cost')::numeric) AS total_output_cost,
  SUM((metadata->'cost'->>'total_cost')::numeric) AS total_cost
FROM llm_logs
GROUP BY cache_enabled
ORDER BY cache_enabled DESC;
Break down expected vs. actual token costs.
SELECT
  COALESCE(metadata->'usage'->>'model', 'unknown') AS model,
  SUM((metadata->'cost'->>'total_cost')::numeric) AS actual_total_cost,
  SUM((metadata->'expected_cost'->>'total_cost')::numeric) AS expected_total_cost
FROM llm_logs
WHERE metadata->'usage'->>'model' IS NOT NULL
GROUP BY model
ORDER BY model ASC;
Log metadata
Caching unlocks additional metadata stored with each log. Refer to this example when querying cached requests.
{
  "cache": {
    "key": "4b2af868add63c97308b3133062aed384afb1be7fd81f225da3b8d113d8af086",
    "value": "log_gz42yh5ecgd2e22q",
    "status": "HIT",
    "enabled": true
  },
  "model": "gpt-4o-2024-05-13",
  "stream": false,
  "cost": {
    "input_cost": 0,
    "total_cost": 0,
    "output_cost": 0,
    "input_cost_cents": 0,
    "total_cost_cents": 0,
    "output_cost_cents": 0
  },
  "usage": {
    "model": "gpt-4o-2024-05-13",
    "total_tokens": 0,
    "calculated_by": "js-tiktoken",
    "prompt_tokens": 0,
    "completion_tokens": 0
  },
  "expected_cost": {
    "input_cost": 0.00585,
    "total_cost": 0.00669,
    "output_cost": 0.00084,
    "input_cost_cents": 0.585,
    "total_cost_cents": 0.669,
    "output_cost_cents": 0.084
  },
  "expected_usage": {
    "model": "gpt-4o-2024-05-13",
    "total_tokens": 1226,
    "calculated_by": "openai",
    "prompt_tokens": 1170,
    "completion_tokens": 56
  }
}
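To show how these fields relate, here is a minimal sketch that estimates what a single cached request saved. It assumes, as the sample suggests but the docs do not state outright, that expected_cost holds what the request would have cost without the cache and cost holds what was actually charged. The function name cache_savings is hypothetical, and the sample below is trimmed to the two fields it reads.

```python
import json


def cache_savings(metadata: dict) -> float:
    """Estimated savings for one log: expected total cost minus actual total cost.

    Assumption: 'expected_cost' is the uncached cost and 'cost' is the
    amount actually charged. Returns 0.0 when no expected cost is recorded.
    """
    expected = metadata.get("expected_cost", {}).get("total_cost")
    actual = metadata.get("cost", {}).get("total_cost", 0)
    if expected is None:
        return 0.0
    return float(expected) - float(actual)


# Trimmed copy of the sample metadata shown above.
sample = json.loads("""
{
  "cache": {"status": "HIT", "enabled": true},
  "cost": {"total_cost": 0},
  "expected_cost": {"total_cost": 0.00669}
}
""")

print(f"{cache_savings(sample):.5f}")  # prints 0.00669
```

The same subtraction can be done in SQL by comparing the expected_total_cost and actual_total_cost columns from the query in the previous section.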