Tools for LangChain, CrewAI, AutoGPT and other agent frameworks
THUDM: Benchmark for evaluating LLMs as agents
Exploding Gradients: Evaluation framework for RAG pipelines
Promptfoo: Test and evaluate LLM prompts
Confident AI: Unit testing for LLM applications
Giskard: Testing framework for ML models
TruEra: Evaluation and tracking for LLM experiments