Use cases

Examples of more complex evaluations

This section includes use case guides for a few common evaluation scenarios.

  • JSON evaluation: Validate JSON outputs from LLMs.
  • RAG evaluation: Evaluate the RAG process: verify LLM outputs and test end-to-end performance.
  • AI agent evaluation: Evaluate agents to avoid unreliable reasoning, inconsistent data, and hallucinations.

See each nested guide for a complete walkthrough.
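
As a rough illustration of the first scenario (validating JSON outputs from LLMs), the check can be sketched in plain Python using only the standard library. This is not the approach taken in the nested guide; the function name `validate_json_output` and the `required_keys` parameter are hypothetical and chosen only for this example.

```python
import json


def validate_json_output(text, required_keys):
    """Check a raw LLM response string; return (ok, reason).

    Hypothetical helper for illustration only.
    """
    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        # The response is not parseable JSON at all.
        return False, f"invalid JSON: {exc.msg}"
    if not isinstance(data, dict):
        # Valid JSON, but not the object shape we expect.
        return False, "top-level value is not an object"
    missing = [key for key in required_keys if key not in data]
    if missing:
        return False, f"missing keys: {missing}"
    return True, "ok"


if __name__ == "__main__":
    # Example response string; in practice this would come from the model.
    response = '{"answer": "42", "sources": []}'
    print(validate_json_output(response, ["answer", "sources"]))
```

A check like this only covers syntax and the presence of expected keys; validating value types or nested structure would need a schema-based approach.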