Use cases
Examples of more complex evaluations
This section includes use case guides for a few common evaluation scenarios.
- JSON evaluation: Validate JSON outputs from LLMs.
- RAG evaluation: Evaluate the RAG process and verify LLM outputs with end-to-end performance testing.
- AI agent evaluation: Evaluate agents to avoid unreliable reasoning, inconsistent data, and hallucinations.
See each nested guide for a complete walkthrough.
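As a rough illustration of the simplest scenario, JSON evaluation, the sketch below checks that a raw LLM response parses as a JSON object and contains a set of expected keys. It uses only the Python standard library; the function name `validate_llm_json` and the example keys are hypothetical and are not part of any product API described in the nested guides.

```python
import json


def validate_llm_json(raw_output, required_keys):
    """Return (is_valid, problems) for a raw LLM response expected to be a JSON object.

    Hypothetical helper for illustration only.
    """
    try:
        parsed = json.loads(raw_output)
    except json.JSONDecodeError as exc:
        # The response is not parseable JSON at all.
        return False, [f"not valid JSON: {exc.msg}"]

    if not isinstance(parsed, dict):
        # Valid JSON, but not the object shape we expect (e.g. a list or a string).
        return False, ["top-level value is not a JSON object"]

    # Collect every expected key that the model left out.
    problems = [f"missing key: {key}" for key in required_keys if key not in parsed]
    return not problems, problems


if __name__ == "__main__":
    ok, problems = validate_llm_json('{"name": "Ada"}', ["name", "age"])
    print(ok, problems)
```

A check like this only covers structure. Whether the values themselves are correct is a separate question that the individual guides address.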