CMVK Review
Cross-Model Verification Kit: Multi-model consensus for agent safety, correctness, and policy compliance.
What is CMVK?
CMVK (Cross-Model Verification Kit) is AgentOS's approach to ensuring agent safety. Instead of relying on a single AI model's judgment, CMVK submits your agent code to multiple independent AI models for review. The agent is approved for deployment only when the models reach the required level of consensus: 75% (3 of 4 models) by default, or all models in strict mode.
Consensus Process
Participating Models
- GPT-4: OpenAI's flagship model
- Claude 3: Anthropic's safety-focused model
- Gemini Pro: Google's advanced model
- Llama 3: Meta's open model
Running CMVK Review
1. Request Review
AgentOS: 🔬 Initiating CMVK Review
Submitting to verification models...
├── GPT-4: Analyzing...
├── Claude 3: Analyzing...
├── Gemini Pro: Analyzing...
└── Llama 3: Analyzing...
Please wait while models review your agent.
This typically takes 10-30 seconds.
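The transcript above shows all four models analyzing at the same time. As a rough illustration of that fan-out, here is a minimal sketch using `asyncio`. It is not AgentOS's implementation: `Verdict`, `stub_review`, and `review_all` are hypothetical names, and the stub only simulates latency instead of calling a real model.

```python
import asyncio
from dataclasses import dataclass


@dataclass(frozen=True)
class Verdict:
    model: str
    approved: bool
    notes: str


async def stub_review(model: str, agent_code: str, delay: float = 0.01) -> Verdict:
    """Hypothetical stand-in for a real model call; it only simulates latency."""
    await asyncio.sleep(delay)  # pretend to wait on a remote model
    return Verdict(model=model, approved=True, notes="stub verdict, no real analysis")


async def review_all(agent_code: str, models: list[str]) -> list[Verdict]:
    # Start every review at once; gather returns results in the order of `models`.
    tasks = [stub_review(model, agent_code) for model in models]
    return list(await asyncio.gather(*tasks))


MODELS = ["GPT-4", "Claude 3", "Gemini Pro", "Llama 3"]
```

Calling `asyncio.run(review_all(code, MODELS))` returns one verdict per model, in the same order as `MODELS`. Because the reviews run concurrently, total wait time is governed by the slowest reviewer, not by the sum of all four.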
2. Review Results
AgentOS: 🔬 CMVK Review Complete
Consensus: ✅ APPROVED (4/4 models agree)
Individual Verdicts:
GPT-4: Agent follows best practices for data handling. No security concerns detected. Rate limiting is properly implemented.
Claude 3: Safety policies are comprehensive. Error handling prevents data leaks. Logging excludes PII as required.
Gemini Pro: Code structure is clean and maintainable. Resource usage is within acceptable limits.
Llama 3: No unsafe operations detected. API calls are properly authenticated.
Handling Rejections
When models don't reach consensus, you'll see detailed feedback:
AgentOS: 🔬 CMVK Review Complete
Consensus: ❌ NOT REACHED (2/4 models raised concerns)
Individual Verdicts:
GPT-4: Core functionality is safe and well-implemented.
Claude 3: Concern: Line 45 logs user email addresses without masking. This violates GDPR data minimization requirements.
Gemini Pro: Concern: No rate limiting on external API calls. Could lead to quota exhaustion or abuse.
Llama 3: No critical issues found.
Required Actions:
1. Mask email addresses in log statements (line 45)
2. Add rate limiting for external API calls
Fix these issues and run @agentos review again.
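The two required actions above might be addressed along the following lines. This is a sketch under stated assumptions, not the fix CMVK prescribes: `mask_emails`, `EmailMaskingFilter`, and `SlidingWindowLimiter` are hypothetical helpers, the email pattern is deliberately simplified, and the limiter is one of several possible rate-limiting strategies.

```python
import logging
import re
import time
from collections import deque

# Simplified pattern for illustration; it does not cover every valid address.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def mask_emails(text: str) -> str:
    """Replace anything that looks like an email address with a placeholder."""
    return EMAIL_RE.sub("[redacted-email]", text)


class EmailMaskingFilter(logging.Filter):
    """Logging filter that redacts email addresses before a record is emitted."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.msg = mask_emails(record.getMessage())
        record.args = ()  # the message is already fully formatted
        return True


class SlidingWindowLimiter:
    """Allow at most `max_calls` calls within any `period`-second window."""

    def __init__(self, max_calls: int, period: float, clock=time.monotonic):
        self.max_calls = max_calls
        self.period = period
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self._calls = deque()

    def try_acquire(self) -> bool:
        now = self.clock()
        # Drop timestamps that have left the window.
        while self._calls and now - self._calls[0] >= self.period:
            self._calls.popleft()
        if len(self._calls) < self.max_calls:
            self._calls.append(now)
            return True
        return False
```

Attach the filter with `logger.addFilter(EmailMaskingFilter())`, and call `limiter.try_acquire()` before each external API request, skipping or delaying the request when it returns `False`.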
Review Categories
CMVK evaluates agents across multiple dimensions:
- Credentials handling, injection vulnerabilities, authentication
- Policy compliance, action boundaries, failure modes
- GDPR, HIPAA, SOC2, PCI-DSS requirements
- Error handling, retry logic, graceful degradation
- Memory usage, API quotas, execution time
Configuration
Customize CMVK review settings:
# Strict mode requires 100% consensus
# Default mode allows 75% consensus (3/4 models)
Review Modes:
--strict    All models must approve
--standard  75% consensus required (default)
--quick     Uses 2 models for faster review
Use --strict mode for production deployments and agents handling sensitive data. Use --quick mode during development for faster iteration.
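The consensus thresholds for the strict and standard modes can be expressed as a simple share-of-approvals check. The sketch below is illustrative only: `REQUIRED_SHARE` and `consensus_reached` are hypothetical names, not part of AgentOS. The approval threshold for --quick is not stated here, so that mode is left out of the sketch.

```python
from fractions import Fraction

# Required share of approving models per mode, as described above.
REQUIRED_SHARE = {
    "strict": Fraction(1),       # all models must approve
    "standard": Fraction(3, 4),  # 75% consensus (default)
}


def consensus_reached(approvals: list[bool], mode: str = "standard") -> bool:
    """Return True if the share of approving models meets the mode's threshold."""
    if not approvals:
        raise ValueError("at least one verdict is required")
    share = Fraction(sum(approvals), len(approvals))
    return share >= REQUIRED_SHARE[mode]
```

With four models, three approvals pass in standard mode but fail in strict mode. Two approvals out of four fail in both, which matches the "NOT REACHED (2/4 models raised concerns)" example. Exact fractions are used so the 3/4 boundary is compared without floating-point rounding.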
Why Multi-Model?
Different AI models have different strengths and blind spots. By requiring consensus across multiple models from different providers, CMVK catches issues that any single model might miss:
- Claude excels at safety and ethical considerations
- GPT-4 is strong at code quality and best practices
- Gemini catches performance and resource issues
- Llama provides an open-source perspective
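To make the blind-spot argument concrete, here is a toy sketch in which each reviewer checks for only one kind of problem. The reviewers are hypothetical string checks, not real models, and all names are illustrative. The point is only that code one reviewer approves can still be flagged by another.

```python
from typing import Callable, Optional

# A check inspects agent code and returns a concern, or None if it sees no problem.
Check = Callable[[str], Optional[str]]


def flags_eval(code: str) -> Optional[str]:
    return "uses eval()" if "eval(" in code else None


def flags_unbounded_loop(code: str) -> Optional[str]:
    return "unbounded loop" if "while True" in code else None


# Hypothetical reviewers, each with a different blind spot.
CHECKS: dict[str, Check] = {
    "reviewer-a": flags_eval,
    "reviewer-b": flags_unbounded_loop,
}


def review(code: str, checks: dict[str, Check]) -> dict[str, Optional[str]]:
    """Run every check and map each reviewer's name to its concern (or None)."""
    return {name: check(code) for name, check in checks.items()}


def all_approve(results: dict[str, Optional[str]]) -> bool:
    return all(concern is None for concern in results.values())
```

For the input `"while True:\n    fetch()"`, `reviewer-a` alone reports nothing, while the combined review surfaces the unbounded loop and `all_approve` returns `False`.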