Batch API
Example queries to run on top of batch requests.
With OpenAI’s batch API, you can process large volumes of data asynchronously at lower cost, with higher rate limits and a guaranteed 24-hour turnaround. See the OpenAI Batch API docs.
Velvet's proxy gives you extra data to observe and analyze the batch process.
Use cases:
- Text generation: Write review summaries for every product page
- Classification: Add categories to a large dataset of cancellation requests
- Embeddings: Attach a vector embedding to each article so you can measure relatedness
Below are some example SQL queries that might be helpful when querying batch logs. You can run these queries in the Velvet AI SQL editor, or any other tool you're comfortable with.
Batch logs
Each batch file upload unfurls the JSONL lines into individual requests (one log per line). The response column of each log stays null until a successful /v1/files/:id/content call runs. Once it completes, we expand the returned JSONL line, update the corresponding record’s response column, and merge the metadata.
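For example, here is a minimal sketch for listing the rows still waiting on a response, using the llm_logs table and columns from the example queries below:

SELECT
  id,
  metadata -> 'batch' ->> 'custom_id' AS custom_id,
  created_at
FROM
  llm_logs
WHERE
  metadata ? 'batch'
  AND response IS NULL;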
Example SQL queries
Show model, temperature, input_file_id, and custom_id for each batch with a custom_id like infer_attributes_from_person.
SELECT
id,
(metadata ->> 'model') AS model,
(request -> 'body' ->> 'temperature') AS temperature,
(metadata -> 'batch' -> 'input_file' ->> 'id') AS input_file_id,
(metadata -> 'batch' ->> 'custom_id') AS custom_id,
response,
metadata,
created_at,
updated_at
FROM
llm_logs
WHERE
metadata -> 'batch' ->> 'custom_id' LIKE 'infer_attributes_from_person%'
ORDER BY
  COALESCE(updated_at, created_at) DESC;
Show total batch rows, completed, and incomplete (i.e. /v1/files/:id/content hasn’t been executed yet).
SELECT
COUNT(*) AS total_rows_with_batch,
COUNT(CASE
WHEN request IS NOT NULL AND response IS NOT NULL THEN 1
END) AS completed_rows,
COUNT(CASE
WHEN request IS NOT NULL AND response IS NULL THEN 1
END) AS incomplete_rows
FROM
llm_logs
WHERE
metadata ? 'batch';
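Once rows complete, the merged metadata includes cost and usage objects (see the metadata examples below). This sketch rolls those up per input file, assuming the numeric fields shown in those examples:

SELECT
  metadata -> 'batch' -> 'input_file' ->> 'id' AS input_file_id,
  SUM((metadata -> 'cost' ->> 'total_cost')::numeric) AS total_cost,
  SUM((metadata -> 'usage' ->> 'total_tokens')::int) AS total_tokens
FROM
  llm_logs
WHERE
  metadata ? 'batch'
GROUP BY
  1;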
For calls with a custom_id like infer_attributes_from_person, see what temperature is being set, alongside the request messages and response content.
SELECT
  request -> 'body' ->> 'temperature' AS temperature,
  request -> 'body' -> 'messages' AS messages,
  response -> 'body' -> 'choices' -> 0 -> 'message' ->> 'content' AS content
FROM
llm_logs
WHERE
metadata -> 'batch' ->> 'custom_id' LIKE 'infer_attributes_from_person%'
ORDER BY
COALESCE(updated_at, created_at) DESC;
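The metadata also carries an error field (null in the examples below). Assuming a failed row stores a non-null value there, a quick sketch for surfacing batch rows that recorded an error:

SELECT
  id,
  metadata -> 'batch' ->> 'custom_id' AS custom_id,
  metadata -> 'error' AS error
FROM
  llm_logs
WHERE
  metadata ? 'batch'
  AND metadata ->> 'error' IS NOT NULL;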
Metadata example
Refer to these examples when querying your batch requests.
batch_example.jsonl (with two completion requests)
{"custom_id": "request-1", "method": "POST", "url": "/v1/chat/completions", "body": {"model": "gpt-3.5-turbo-0125", "messages": [{"role": "system", "content": "You are a helpful assistant."},{"role": "user", "content": "Hello velvet!"}],"max_tokens": 1000}}
{"custom_id": "request-2", "method": "POST", "url": "/v1/chat/completions", "body": {"model": "gpt-3.5-turbo-0125", "messages": [{"role": "system", "content": "You are an unhelpful assistant."},{"role": "user", "content": "Hello world!"}],"max_tokens": 1000}}
Completion request 1 metadata
{
"cost": {
"input_cost": 0.000005,
"total_cost": 0.000012,
"output_cost": 0.000007,
"input_cost_cents": 0.0005,
"total_cost_cents": 0.0012,
"output_cost_cents": 0.0007
},
"batch": {
"id": "batch_req_vfgFkPOcs5XISaULjJRawiJC",
"custom_id": "request-1",
"input_file": {
"id": "file-TbeJuO1LUW0Kk3jDB3nPkuWp",
"bytes": 517,
"log_id": "log_mtxycg38r2k1e160",
"object": "file",
"status": "processed",
"purpose": "batch",
"filename": "./batch_example.jsonl",
"created_at": 1724346043,
"status_details": null
},
"output_file": {
"id": "file-ZLOUc7ny6ovSDojNP78nmp6d",
"log_id": "log_56qvrmmxnf04e9ay"
}
},
"error": null,
"model": "gpt-3.5-turbo-0125",
"usage": {
"model": "gpt-3.5-turbo-0125",
"total_tokens": 29,
"calculated_by": "openai",
"prompt_tokens": 20,
"completion_tokens": 9
},
"provider": "openai",
"expected_cost": {
"input_cost": 0.000005,
"total_cost": 0.000012,
"output_cost": 0.000007,
"input_cost_cents": 0.0005,
"total_cost_cents": 0.0012,
"output_cost_cents": 0.0007
},
"expected_usage": {
"model": "gpt-3.5-turbo-0125",
"total_tokens": 29,
"calculated_by": "openai",
"prompt_tokens": 20,
"completion_tokens": 9
}
}
Completion request 2 metadata
{
"cost": {
"input_cost": 0.000006,
"total_cost": 0.000018,
"output_cost": 0.000013,
"input_cost_cents": 0.0006,
"total_cost_cents": 0.0018,
"output_cost_cents": 0.0013
},
"batch": {
"id": "batch_req_poqf5DCmLNjuRH8UeS4Ukn8A",
"custom_id": "request-2",
"input_file": {
"id": "file-TbeJuO1LUW0Kk3jDB3nPkuWp",
"bytes": 517,
"log_id": "log_mtxycg38r2k1e160",
"object": "file",
"status": "processed",
"purpose": "batch",
"filename": "./batch_example.jsonl",
"created_at": 1724346043,
"status_details": null
},
"output_file": {
"id": "file-ZLOUc7ny6ovSDojNP78nmp6d",
"log_id": "log_56qvrmmxnf04e9ay"
}
},
"error": null,
"model": "gpt-3.5-turbo-0125",
"usage": {
"model": "gpt-3.5-turbo-0125",
"total_tokens": 39,
"calculated_by": "openai",
"prompt_tokens": 22,
"completion_tokens": 17
},
"provider": "openai",
"expected_cost": {
"input_cost": 0.000006,
"total_cost": 0.000018,
"output_cost": 0.000013,
"input_cost_cents": 0.0006,
"total_cost_cents": 0.0018,
"output_cost_cents": 0.0013
},
"expected_usage": {
"model": "gpt-3.5-turbo-0125",
"total_tokens": 39,
"calculated_by": "openai",
"prompt_tokens": 22,
"completion_tokens": 17
}
}