Debugging & Optimization

AI-powered error diagnosis, performance optimization, and troubleshooting for your agents.

Debugging Features

🔍 Error Diagnosis

AI analyzes error messages, stack traces, and logs to identify root causes

💡 Fix Suggestions

Get code fixes with explanations you can apply with one click

⚡ Performance Analysis

Identify bottlenecks, memory leaks, and optimization opportunities

📊 Execution Traces

Step-by-step execution visualization with timing data

Diagnosing Errors

When your agent fails, use the debug command to understand why:

User: @agentos debug

AgentOS: 🔍 Analyzing Agent Error...

Error Detected:
TypeError: Cannot read property 'data' of undefined
    at processResponse (agent.ts:45:23)
    at fetchData (agent.ts:32:10)

Root Cause Analysis:
The API response handler doesn't check for a missing response.
When the API returns an error status, the response object is
undefined, so reading its `data` property throws.

Suggested Fix:
// Before (line 45):
const result = response.data;

// After:
const result = response?.data ?? null;
// Compare against null explicitly so valid falsy payloads (0, '') pass
if (result === null) {
  throw new Error('API returned empty response');
}

Prevention:
Add response validation before accessing properties.
Consider using a type guard or schema validation.

[Apply Fix] [Show More Context] [Explain Further]
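
The prevention advice above mentions a type guard. A minimal sketch of that idea follows, assuming the response is an object with a `data` property. The names `ApiResponse`, `isApiResponse`, and `extractData` are hypothetical and are not part of AgentOS.

```typescript
// Hypothetical response shape; the real shape depends on the API your agent calls.
interface ApiResponse {
  data: unknown;
}

// Type guard: narrows an unknown value to ApiResponse when it is a
// non-null object whose `data` property is neither null nor undefined.
function isApiResponse(value: unknown): value is ApiResponse {
  if (typeof value !== "object" || value === null) {
    return false;
  }
  const data = (value as { data?: unknown }).data;
  return data !== undefined && data !== null;
}

// Validates before accessing properties, so a missing response produces
// a descriptive error instead of a TypeError.
function extractData(response: unknown): unknown {
  if (!isApiResponse(response)) {
    throw new Error("API returned empty response");
  }
  return response.data;
}
```

Checking for `null` and `undefined` explicitly, rather than for falsiness, lets payloads such as `0` or an empty string through.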

The "Why" Command

Ask in natural language why something happened:

User: @agentos why did this agent fail?

AgentOS: 🤔 Analyzing failure...

Summary:
Your agent failed because it exceeded the rate limit on the
external API. The API allows 100 requests per minute, but
your agent made 150 requests in 45 seconds.

Timeline:
[12:00:00] Agent started
[12:00:15] First batch processed (50 requests)
[12:00:30] Second batch processed (50 requests)
[12:00:45] Third batch started (50 requests)
[12:00:46] Rate limit hit (429 Too Many Requests)
[12:00:47] Agent crashed

Recommendation:
Add rate limiting to your agent:

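// Helper: resolve after `ms` milliseconds
const delay = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Add a delay between requests to stay within 100 requests/minute
async function processBatch(items) {
  for (const item of items) {
    await processItem(item);
    await delay(600); // 60,000 ms / 100 requests = 600 ms per request
  }
}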
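
A fixed delay is the simplest approach, but it slows every request even when the agent is well under the limit. One alternative is a sliding-window limiter that only waits once the budget for the current window is used up. The sketch below is an assumption-laden illustration: `SlidingWindowLimiter` and `processWithLimit` are hypothetical names, not AgentOS features, and the limit of 100 requests per minute is taken from the example above.

```typescript
// Sliding-window rate limiter: allows at most `limit` calls per `windowMs`.
class SlidingWindowLimiter {
  private timestamps: number[] = [];

  constructor(
    private readonly limit: number,
    private readonly windowMs: number,
  ) {}

  // Returns how many milliseconds the caller must wait before the next
  // request is allowed (0 means it may be sent now).
  waitTime(now: number): number {
    // Keep only the requests that are still inside the window.
    this.timestamps = this.timestamps.filter((t) => now - t < this.windowMs);
    if (this.timestamps.length < this.limit) {
      return 0;
    }
    // The oldest request in the window determines when a slot frees up.
    const oldest = Math.min(...this.timestamps);
    return oldest + this.windowMs - now;
  }

  // Records that a request was sent at time `now`.
  record(now: number): void {
    this.timestamps.push(now);
  }
}

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Runs `task` for each item without exceeding the limiter's budget.
async function processWithLimit<T>(
  items: T[],
  task: (item: T) => Promise<void>,
  limiter: SlidingWindowLimiter,
): Promise<void> {
  for (const item of items) {
    // Re-check after sleeping in case the timer fired slightly early.
    let wait = limiter.waitTime(Date.now());
    while (wait > 0) {
      await sleep(wait);
      wait = limiter.waitTime(Date.now());
    }
    limiter.record(Date.now());
    await task(item);
  }
}
```

For the API in the example, the limiter would be created as `new SlidingWindowLimiter(100, 60_000)`.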

Common Error Patterns

🔴 Authentication Failures

API key expired, invalid credentials, or missing permissions

Fix: Check that your secrets are set correctly:
@agentos check secrets for my-api-agent

🔴 Timeout Errors

Agent took too long to complete, usually due to slow external APIs

Fix: Add timeout handling and retry logic:
@agentos add timeout handling to this agent

🔴 Policy Violations

Agent attempted an action blocked by safety policies

Fix: Review what action was blocked:
@agentos explain policy violation
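
To make the "timeout handling and retry logic" fix for timeout errors more concrete, here is a minimal sketch of what such code might look like. It is not the code AgentOS generates; `withTimeout`, `backoffDelay`, and `retryWithTimeout` are hypothetical helper names, and the option values are placeholders.

```typescript
// Rejects if `promise` does not settle within `ms` milliseconds.
function withTimeout<T>(promise: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(
      () => reject(new Error(`Timed out after ${ms}ms`)),
      ms,
    );
    promise.then(
      (value) => {
        clearTimeout(timer);
        resolve(value);
      },
      (error) => {
        clearTimeout(timer);
        reject(error);
      },
    );
  });
}

// Delay before the retry that follows attempt number `attempt` (0-based):
// baseMs * 2^attempt, capped at maxMs.
function backoffDelay(attempt: number, baseMs: number, maxMs: number): number {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

interface RetryOptions {
  attempts: number; // total number of tries, including the first
  timeoutMs: number; // per-attempt timeout
  baseDelayMs: number; // first backoff delay
  maxDelayMs: number; // upper bound on the backoff delay
}

// Calls `operation` until it succeeds or the attempts are used up,
// applying a timeout to each attempt and exponential backoff between them.
async function retryWithTimeout<T>(
  operation: () => Promise<T>,
  options: RetryOptions,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < options.attempts; attempt++) {
    try {
      return await withTimeout(operation(), options.timeoutMs);
    } catch (error) {
      lastError = error;
      if (attempt < options.attempts - 1) {
        const waitMs = backoffDelay(attempt, options.baseDelayMs, options.maxDelayMs);
        await new Promise<void>((resolve) => setTimeout(resolve, waitMs));
      }
    }
  }
  throw lastError;
}
```

Note that `withTimeout` only stops waiting; it does not cancel the underlying operation. Cancelling an in-flight request would need support from the client library in use.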

Performance Optimization

Analyze and optimize your agent's performance:

User: @agentos optimize

AgentOS: ⚡ Performance Analysis

Current Performance:
- Execution Time: 2.3s
- Memory Usage: 85MB
- API Calls:
- Est. Cost/Run: $0.003

Optimization Opportunities:

1. Batch API Calls (-40% execution time)
   You're making 12 individual API calls that could be
   batched into 2 bulk requests.

2. Add Caching (-30% API calls)
   Static lookup data is fetched every run. Consider
   caching for 1 hour.

3. Stream Large Data (-50% memory)
   Loading entire dataset into memory. Use streaming
   for better memory efficiency.

Optimized Version:
[Generate Optimized Code] [Explain Changes]
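
The first two suggestions in the sample output, batching and caching, can be sketched in a few lines. This is an illustration under assumptions rather than the optimized code AgentOS produces: `chunk` and `TtlCache` are hypothetical names, and whether your API accepts bulk requests depends on the API itself.

```typescript
// Splits `items` into arrays of at most `size` elements, so that many
// single-item calls can be replaced by a few bulk requests.
function chunk<T>(items: T[], size: number): T[][] {
  if (size <= 0) {
    throw new Error("size must be positive");
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Minimal time-to-live cache. The clock is injectable so expiry can be
// tested without waiting.
class TtlCache<V> {
  private readonly entries = new Map<string, { value: V; expiresAt: number }>();

  constructor(
    private readonly ttlMs: number,
    private readonly now: () => number = () => Date.now(),
  ) {}

  // Returns the cached value, or undefined if it is absent or expired.
  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (entry === undefined) {
      return undefined;
    }
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key);
      return undefined;
    }
    return entry.value;
  }

  // Stores a value that expires `ttlMs` milliseconds from now.
  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}
```

With 12 items and a batch size of 6, `chunk` yields the 2 bulk requests mentioned in the output, and `new TtlCache(60 * 60 * 1000)` gives the suggested one-hour cache for static lookup data.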

Execution Traces

View detailed execution traces for debugging:

User: @agentos trace last-run

AgentOS: 📊 Execution Trace

Run ID: run_7f3a9b2c
Duration: 2.34 seconds
Status: Success

Timeline:
┌────────────┬────────────────────────────────────────┐
│ Time       │ Event                                  │
├────────────┼────────────────────────────────────────┤
│ +0.000s    │ Agent initialized                      │
│ +0.025s    │ Loaded configuration                   │
│ +0.050s    │ Connected to database                  │
│ +0.120s    │ Query executed (50 rows)               │
│ +0.450s    │ API call: POST /process                │
│ +1.200s    │ API response received                  │
│ +1.250s    │ Processing results...                  │
│ +2.100s    │ Webhook sent: success                  │
│ +2.340s    │ Agent completed                        │
└────────────┴────────────────────────────────────────┘

Resource Usage:
- Peak Memory: 78MB
- Network In: 1.2MB
- Network Out: 0.3MB
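
A timeline like the one above boils down to recording a label together with the time elapsed since the run started. The sketch below shows one way to collect such events in your own agent code. `Tracer` and `TraceEvent` are hypothetical names and do not describe how AgentOS records traces internally.

```typescript
// One entry in the timeline: milliseconds since the run started, plus a label.
interface TraceEvent {
  offsetMs: number;
  label: string;
}

// Collects timestamped events relative to the moment the tracer was created.
// The clock is injectable so the output can be tested deterministically.
class Tracer {
  private readonly events: TraceEvent[] = [];
  private readonly start: number;

  constructor(private readonly now: () => number = () => Date.now()) {
    this.start = this.now();
  }

  // Records an event at the current offset from the start of the run.
  mark(label: string): void {
    this.events.push({ offsetMs: this.now() - this.start, label });
  }

  // Formats events as "+0.025s  Loaded configuration" style lines.
  format(): string[] {
    return this.events.map(
      (event) => `+${(event.offsetMs / 1000).toFixed(3)}s  ${event.label}`,
    );
  }
}
```

Calling `mark` before and after each external call makes the slow steps stand out when the formatted lines are logged.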

Interactive Debugging

Have a conversation to debug complex issues:

User: @agentos my agent works locally but fails in production

AgentOS: Let me help troubleshoot this. A few questions:

1. Does the production environment have all required
   environment variables set?
   
2. Are there any network restrictions (firewall, VPN)
   that might block API calls?

3. Is the production runtime using the same Node/Python
   version as local?

Let me check your deployment configuration...

Found potential issues:
⚠️ Secret `DATABASE_URL` is set locally but not in
   GitHub Actions secrets
⚠️ Production uses Node 18, local uses Node 20

Would you like me to:
[Check Secrets] [Compare Environments] [View Logs]
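
Both issues found in this example, a missing secret and a runtime version mismatch, can also be caught by a startup check in the agent itself. The following is a minimal sketch: `missingEnvVars` and `majorVersion` are hypothetical helpers, and the variable names passed in are placeholders for whatever your agent requires.

```typescript
// Returns the names from `required` that are unset or blank in `env`.
function missingEnvVars(
  required: string[],
  env: Record<string, string | undefined>,
): string[] {
  return required.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}

// Extracts the major version from strings such as "v18.17.0" or "20.11.1".
// Returns NaN when the string does not start with a version number.
function majorVersion(version: string): number {
  const match = /^v?(\d+)/.exec(version);
  return match ? Number(match[1]) : NaN;
}
```

In a Node.js agent, this could run at startup as `missingEnvVars(["DATABASE_URL"], process.env)`, with `majorVersion(process.version)` compared against the version used locally, failing fast with a clear message instead of crashing mid-run.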

💡 Pro Tip: Enable verbose logging during development with @agentos set log-level debug. Remember to switch back to info for production.
