Supported frameworks
Orizon QA detects and generates tests for the following frameworks out of the box:

| Framework | Detection method | Test generation | Export format |
|---|---|---|---|
| LangChain | Import patterns (from langchain, @langchain/core), tool decorators, chain constructors | Tools, chains, agents, memory | LangSmith datasets |
| CrewAI | Import patterns (from crewai), @agent, @task, @crew decorators, Crew() constructor | Crew execution, agent roles, task outputs | Promptfoo red team |
| AutoGen | Import patterns (from autogen), ConversableAgent, UserProxyAgent, AssistantAgent, GroupChat | Conversation flows, group chat orchestration, code execution safety | AutoGenBench |
| Google ADK | Import patterns (from google.adk, from google.generativeai), AgentDefinition, genai.Agent | Tool invocations, multi-turn context, orchestration trajectories | Vertex AI eval |
| Claude SDK | Import patterns (from anthropic, @anthropic-ai/sdk), Anthropic() constructor, Claude model strings | Tool calls, hook behavior, rules compliance | Self-evaluation |
| Solace Mesh | solace-agent-mesh, SolaceAgentMesh, AgentMesh, solace.messaging patterns | Agent registration, A2A messaging, event handlers | Event flow tests |
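
The detection methods in the table can be approximated with a simple pattern scan. The following is a minimal sketch, not Orizon QA's actual implementation: `FRAMEWORK_PATTERNS` and `detect_frameworks` are hypothetical names, and the regexes cover only a subset of the signals listed above.

```python
import re

# Hypothetical pattern table, loosely based on the "Detection method" column.
# These regexes are illustrative assumptions, not Orizon QA's real rules.
FRAMEWORK_PATTERNS = {
    "LangChain": [r"^\s*(from|import)\s+langchain", r"@langchain/core"],
    "CrewAI": [r"^\s*(from|import)\s+crewai\b", r"\bCrew\("],
    "AutoGen": [
        r"^\s*(from|import)\s+autogen\b",
        r"\b(ConversableAgent|UserProxyAgent|AssistantAgent|GroupChat)\b",
    ],
    "Google ADK": [r"^\s*(from|import)\s+google\.(adk|generativeai)\b"],
    "Claude SDK": [
        r"^\s*(from|import)\s+anthropic\b",
        r"@anthropic-ai/sdk",
        r"\bAnthropic\(",
    ],
    "Solace Mesh": [
        r"solace-agent-mesh",
        r"\bSolaceAgentMesh\b",
        r"\bsolace\.messaging\b",
    ],
}


def detect_frameworks(source: str) -> list[str]:
    """Return frameworks whose patterns match, most matching patterns first."""
    hits = {}
    for name, patterns in FRAMEWORK_PATTERNS.items():
        # Count how many distinct patterns for this framework appear in the source.
        count = sum(
            1 for pattern in patterns
            if re.search(pattern, source, flags=re.MULTILINE)
        )
        if count:
            hits[name] = count
    # Stable sort: ties keep the order in which frameworks are declared above.
    return sorted(hits, key=lambda name: -hits[name])


sample = "from crewai import Agent, Crew\ncrew = Crew(agents=[])\n"
print(detect_frameworks(sample))
```

Ranking by the number of matching patterns is one simple way to handle files that mix frameworks; a manual override, as the product offers, covers the cases where a scan like this guesses wrong.
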
How it works
Upload or describe your agent
Provide your agent to Orizon QA by uploading code files or filling out a template that describes your agent’s purpose and tools. See Upload or Describe for details.
Auto-detect framework
Orizon QA scans your code for framework-specific imports, decorators, and constructors. Detection runs automatically, but you can specify the framework manually if needed.
Configure tests
Choose which test categories to run (functional, safety, performance, robustness), how many times to run each test (1x–10x), and which evaluation model to use (Claude Haiku, Sonnet, or Opus).
Run tests
Orizon QA generates test cases for your specific agent — including tool invocations, adversarial prompts, edge cases, and performance benchmarks — and executes them against your agent.
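
The configuration choices described in the steps above (test categories, run count, evaluation model) can be pictured as a small validated settings object. This is a sketch under assumptions: `RunConfig` and its field names are hypothetical and do not reflect Orizon QA's real configuration format or API, and the model labels are shorthand rather than actual model identifiers.

```python
from dataclasses import dataclass, field

# Values taken from the "Configure tests" step; the names are illustrative.
ALLOWED_CATEGORIES = {"functional", "safety", "performance", "robustness"}
ALLOWED_MODELS = {"haiku", "sonnet", "opus"}  # shorthand labels, not model IDs


@dataclass
class RunConfig:
    """Hypothetical container for the options chosen before a test run."""

    categories: set[str] = field(default_factory=lambda: set(ALLOWED_CATEGORIES))
    runs_per_test: int = 1
    evaluation_model: str = "sonnet"

    def __post_init__(self) -> None:
        if not self.categories:
            raise ValueError("select at least one test category")
        unknown = set(self.categories) - ALLOWED_CATEGORIES
        if unknown:
            raise ValueError(f"unknown test categories: {sorted(unknown)}")
        # The docs describe a range of 1x to 10x runs per test.
        if not 1 <= self.runs_per_test <= 10:
            raise ValueError("runs_per_test must be between 1 and 10")
        if self.evaluation_model not in ALLOWED_MODELS:
            raise ValueError(f"unknown evaluation model: {self.evaluation_model}")


# Example: a pre-deploy gate that runs functional and safety tests three times each.
gate = RunConfig(categories={"functional", "safety"}, runs_per_test=3)
print(gate)
```
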
When to use agent testing
Before production deploy
Catch safety issues and functional regressions before your agent reaches real users. Run functional and safety tests as a mandatory gate before shipping.
Safety audits
Run a comprehensive safety evaluation — including adversarial prompts, jailbreak attempts, PII leakage, and bias checks — to document your agent’s safety posture for stakeholders or compliance requirements.
Regression checks
After changing your system prompt, switching models, or modifying tools, re-run the same test suite to verify behavior hasn’t degraded. Test history lets you compare scores across runs.
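
Comparing scores across runs amounts to a per-category diff. The sketch below assumes each run is summarized as a mapping from category name to a numeric score where higher is better; that shape, the `find_regressions` name, and the `tolerance` parameter are assumptions for illustration, not the format of Orizon QA's reports or exports.

```python
def find_regressions(
    baseline: dict[str, float],
    current: dict[str, float],
    tolerance: float = 0.0,
) -> dict[str, float]:
    """Return categories whose score dropped by more than `tolerance`.

    The result maps each regressed category to the size of the drop.
    """
    regressions = {}
    for category, old_score in baseline.items():
        new_score = current.get(category)
        if new_score is None:
            continue  # category was not run this time, so nothing to compare
        drop = old_score - new_score
        if drop > tolerance:
            regressions[category] = drop
    return regressions


# Hypothetical scores before and after a system prompt change.
before = {"functional": 92.0, "safety": 88.0}
after = {"functional": 93.0, "safety": 80.0}
print(find_regressions(before, after, tolerance=2.0))
```

A nonzero tolerance can help when tests are run several times and scores vary slightly from run to run.
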
Explore this section
Upload or Describe
Learn how to provide your agent to Orizon QA — by uploading source code or using a describe template.
Test Categories
Understand what each of the four test categories covers and how to choose the right mix for your needs.
Results & Exports
Read your test report, interpret category scores, and export results for your framework.
