Anthropic caching

Enable caching for Anthropic requests

To enable caching, first make sure the Velvet proxy is configured when you initialize the Anthropic client. Then send the velvet-cache-enabled header, set to "true", with each request to the provider's endpoint that you want cached.


Example code snippets

Initialization

import Anthropic from "@anthropic-ai/sdk";

const anthropic = new Anthropic({
  apiKey: process.env.ANTHROPIC_API_KEY,
  baseURL: "https://gateway.usevelvet.com/api/anthropic/",
  defaultHeaders: {
    "velvet-auth": process.env.VELVET_API_KEY,
  },
});

Messages API

Velvet lets you create cache keys that expire automatically. By default, a cache key does not expire.

await anthropic.messages.create(
  {
    model: "claude-3-5-sonnet-20240620",
    max_tokens: 1024,
    messages: [{ role: "user", content: "You are a helpful assistant." }],
  },
  {
    headers: {
      "velvet-cache-enabled": "true",
    },
  }
);
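If every request from a client should be cached, the header could also be sent as a default header rather than per request. The following is a minimal sketch, assuming Velvet treats velvet-cache-enabled the same way whether it arrives as a default header or a per-request header. velvetClientOptions is a hypothetical helper, not part of Velvet or the Anthropic SDK.

```typescript
// Hypothetical helper: builds the options object passed to `new Anthropic(...)`.
// Assumption: Velvet reads velvet-cache-enabled identically from default headers.
function velvetClientOptions(
  anthropicKey: string,
  velvetKey: string,
  cacheByDefault: boolean
): { apiKey: string; baseURL: string; defaultHeaders: Record<string, string> } {
  const defaultHeaders: Record<string, string> = { "velvet-auth": velvetKey };
  if (cacheByDefault) {
    // Sent with every request made through this client.
    defaultHeaders["velvet-cache-enabled"] = "true";
  }
  return {
    apiKey: anthropicKey,
    baseURL: "https://gateway.usevelvet.com/api/anthropic/",
    defaultHeaders,
  };
}
```

Pass the result to the client constructor, for example new Anthropic(velvetClientOptions(anthropicKey, velvetKey, true)). Headers supplied on an individual request, such as velvet-cache-ttl, are still sent alongside the defaults.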

Messages API with TTL

To set a time-to-live (TTL) on a cache key, pass max-age={TTL} in the velvet-cache-ttl header, where the TTL is measured in seconds. For instance, max-age=300 expires the cache key after 5 minutes.

await anthropic.messages.create(
  {
    model: "claude-3-5-sonnet-20240620",
    max_tokens: 1024,
    messages: [{ role: "user", content: "You are a helpful assistant." }],
  },
  {
    headers: {
      "velvet-cache-enabled": "true",
      "velvet-cache-ttl": "max-age=300", // 5-minute expiration
    },
  }
);
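Because the TTL is given in seconds, it can help to compute the header value from a named duration instead of hard-coding the string. A minimal sketch follows. velvetCacheTtl is a hypothetical helper, and the whole-number check is an assumption rather than a documented Velvet requirement.

```typescript
// Hypothetical helper: formats a duration in seconds as a velvet-cache-ttl value.
// Assumption: the TTL should be a non-negative whole number of seconds.
function velvetCacheTtl(seconds: number): string {
  if (!isFinite(seconds) || seconds < 0 || Math.floor(seconds) !== seconds) {
    throw new RangeError("TTL must be a non-negative whole number of seconds");
  }
  return `max-age=${seconds}`;
}

const FIVE_MINUTES_IN_SECONDS = 5 * 60;

// Headers for a cached request whose cache key expires after five minutes.
const ttlRequestHeaders = {
  "velvet-cache-enabled": "true",
  "velvet-cache-ttl": velvetCacheTtl(FIVE_MINUTES_IN_SECONDS),
};
```

The ttlRequestHeaders object can be passed as the headers option of anthropic.messages.create in place of the literal values.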

Messages API with TTL and invalidation

The velvet-cache-ttl header also supports cache invalidation. To invalidate a cached item, set max-age=0 in the velvet-cache-ttl header. An invalidation request refreshes the cache with new data.

await anthropic.messages.create(
  {
    model: "claude-3-5-sonnet-20240620",
    max_tokens: 1024,
    messages: [{ role: "user", content: "You are a helpful assistant." }],
  },
  {
    headers: {
      "velvet-cache-enabled": "true",
      "velvet-cache-ttl": "max-age=0", // Invalidate the cached item
    },
  }
);
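The three header combinations on this page (caching only, caching with a TTL, and invalidation) can be collected behind one small function. This is a sketch under stated assumptions: CacheOptions and velvetCacheHeaders are hypothetical names, and only the header names and max-age values come from the examples above.

```typescript
// Hypothetical options for a single cached request.
interface CacheOptions {
  ttlSeconds?: number; // omit to keep the default (no expiration)
  refresh?: boolean; // true sends max-age=0 to invalidate the cached item
}

// Hypothetical helper: builds the Velvet cache headers for one request.
function velvetCacheHeaders(options: CacheOptions = {}): Record<string, string> {
  const headers: Record<string, string> = { "velvet-cache-enabled": "true" };
  if (options.refresh) {
    // Invalidation takes priority over any TTL passed in.
    headers["velvet-cache-ttl"] = "max-age=0";
  } else if (options.ttlSeconds !== undefined) {
    headers["velvet-cache-ttl"] = `max-age=${options.ttlSeconds}`;
  }
  return headers;
}
```

For example, passing { headers: velvetCacheHeaders({ refresh: true }) } as the second argument to anthropic.messages.create sends the same headers as the invalidation request above, while velvetCacheHeaders({ ttlSeconds: 300 }) matches the TTL request.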