OpenAI caching

Enable caching for OpenAI requests

To enable caching, first make sure your provider client is initialized to route requests through the Velvet proxy. Then add the velvet-cache-enabled header, set to "true", when sending a request to the provider’s endpoint.


Example code snippets

Initialization

import OpenAI from "openai";

const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
  baseURL: "https://gateway.usevelvet.com/api/openai/v1/",
  defaultHeaders: {
    "velvet-auth": process.env.VELVET_API_KEY,
  },
});
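
A missing key is an easy way to misconfigure the proxy, so it can help to check both environment variables before the client is created. The following is a minimal sketch, not Velvet's recommended setup: velvetClientOptions is a hypothetical helper name, and it only assembles the same options object shown in the snippet above.

```typescript
// Hypothetical helper (not part of the Velvet or OpenAI SDKs): assembles the
// options object from the Initialization snippet and fails fast on missing keys.
function velvetClientOptions(env: Record<string, string | undefined>) {
  const apiKey = env.OPENAI_API_KEY;
  const velvetKey = env.VELVET_API_KEY;
  if (!apiKey) throw new Error("OPENAI_API_KEY is not set");
  if (!velvetKey) throw new Error("VELVET_API_KEY is not set");
  return {
    apiKey,
    baseURL: "https://gateway.usevelvet.com/api/openai/v1/",
    defaultHeaders: { "velvet-auth": velvetKey },
  };
}

// Example wiring, assuming the `OpenAI` import from the snippet above:
// const openai = new OpenAI(velvetClientOptions(process.env));
```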

Chat Completions API

Velvet lets you create cache keys with an optional automatic expiration. By default, a cache key doesn't expire, so the request below caches its response with no expiration.

const completion = await openai.chat.completions.create(
  {
    model: "gpt-4o-mini",
    messages: [{ role: "system", content: "You are a helpful assistant." }],
  },
  {
    headers: {
      "velvet-cache-enabled": "true",
    },
  }
);

Chat Completions API with TTL

To set a time-to-live (TTL) expiration, pass max-age={TTL} in the velvet-cache-ttl header, where {TTL} is measured in seconds. For instance, max-age=300 sets a 5-minute expiration on the cache key.

const completion = await openai.chat.completions.create(
  {
    model: "gpt-4o-mini",
    messages: [{ role: "system", content: "You are a helpful assistant." }],
  },
  {
    headers: {
      "velvet-cache-enabled": "true",
      "velvet-cache-ttl": "max-age=300", // 5-minute expiration
    },
  }
);
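
When the TTL comes from configuration rather than a literal, the header value can be built from a number. The sketch below is illustrative only: velvetCacheHeaders is a hypothetical helper name, and the check for a non-negative whole number of seconds is an assumption, since the accepted range of max-age is not specified here.

```typescript
// Hypothetical helper (not part of the Velvet or OpenAI SDKs): builds the
// per-request cache headers from an optional TTL in seconds.
function velvetCacheHeaders(ttlSeconds?: number): Record<string, string> {
  const headers: Record<string, string> = { "velvet-cache-enabled": "true" };
  if (ttlSeconds !== undefined) {
    // Assumption: only non-negative whole seconds are meaningful.
    if (!Number.isInteger(ttlSeconds) || ttlSeconds < 0) {
      throw new RangeError("ttlSeconds must be a non-negative integer");
    }
    headers["velvet-cache-ttl"] = `max-age=${ttlSeconds}`;
  }
  return headers;
}

// Headers for a 5-minute expiration; omit the argument for no expiration.
console.log(velvetCacheHeaders(5 * 60));
```

The returned object can be passed as the headers option in the second argument of openai.chat.completions.create, in place of the literal object used above.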

Chat Completions API with TTL and invalidation

The velvet-cache-ttl header also supports cache invalidation. To invalidate a cached item, set max-age=0 in the header. An invalidation request refreshes the cache with new data.

const completion = await openai.chat.completions.create(
  {
    model: "gpt-4o-mini",
    messages: [{ role: "system", content: "You are a helpful assistant." }],
  },
  {
    headers: {
      "velvet-cache-enabled": "true",
      "velvet-cache-ttl": "max-age=0", // Invalidate the cached item
    },
  }
);
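
In application code, the choice between reading from the cache and forcing a refresh is often a single flag. One possible shape is sketched below, assuming a hypothetical wrapper named sendWithCache; the send callback stands in for whatever function issues the request, and only the two headers described above are used.

```typescript
// Hypothetical wrapper (not part of the Velvet or OpenAI SDKs).
// `send` is any function that issues a request with the given headers.
function sendWithCache<T>(
  send: (headers: Record<string, string>) => T,
  forceRefresh = false,
): T {
  const headers: Record<string, string> = { "velvet-cache-enabled": "true" };
  if (forceRefresh) {
    // max-age=0 invalidates the cached item so the cache is refreshed.
    headers["velvet-cache-ttl"] = "max-age=0";
  }
  return send(headers);
}

// Example wiring, assuming the `openai` client from the Initialization snippet
// and a request body named `body`:
// const completion = await sendWithCache(
//   (headers) => openai.chat.completions.create(body, { headers }),
//   true, // force a refresh
// );
```

Keeping the request function as a parameter means the header logic can be exercised with a stub, without calling the provider.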