🤖 AutoGen Integration

Govern Microsoft AutoGen agents with kernel-level safety. Message interception, code execution controls, and group chat governance, all with one line of code.


$ pip install "agent-os-kernel[autogen]"

Overview

AutoGen enables powerful multi-agent conversations where agents can execute code, collaborate on tasks, and autonomously iterate on solutions. Agent OS provides kernel-level governance to ensure these capabilities remain safe and controlled.

💬

Message Interception

Monitor and filter all messages between agents. Block sensitive content, enforce communication policies, and audit all conversations.

🔒

Code Execution Control

Sandbox code execution with configurable policies. Control file access, network permissions, and execution timeouts.

👥

Group Chat Governance

Govern entire group chats with unified policies. Control agent selection, iteration limits, and conversation flow.

📊

Full Audit Trail

Complete visibility into all agent actions. Track messages, code executions, and policy decisions with flight recorder integration.

Installation

1

Install Agent OS with AutoGen Support

# Install with AutoGen extras (quoted so shells such as zsh do not expand the brackets)
pip install "agent-os-kernel[autogen]"

# Or install everything
pip install "agent-os-kernel[all]"

# Verify installation
python -c "from agent_os.integrations import autogen_kernel; print('✓ AutoGen integration ready')"

2

Verify AutoGen Installation

# Ensure AutoGen is installed (v0.2.0+ required)
# Quote the requirement so the shell does not treat ">" as output redirection
pip install "pyautogen>=0.2.0"

# Verify compatibility
python -c "import autogen; print(f'AutoGen version: {autogen.__version__}')"

3

Configure Your Environment

# Set up your LLM API key
export OPENAI_API_KEY="your-api-key"

# Optional: Configure Agent OS kernel
export AGENT_OS_POLICY="strict"
export AGENT_OS_AUDIT_LOG="./agent_os_audit.jsonl"

Wrapping AssistantAgent

The AssistantAgent is the primary AI-powered agent in AutoGen. Wrap it with Agent OS to add governance without changing your existing code.

1

Basic Wrapping

import os

from autogen import AssistantAgent
from agent_os.integrations import autogen_kernel

# Configure your LLM
llm_config = {
    "model": "gpt-4-turbo",
    "api_key": os.environ["OPENAI_API_KEY"],
    "temperature": 0.7
}

# Create your AutoGen assistant
assistant = AssistantAgent(
    name="coding_assistant",
    system_message="You are a helpful coding assistant.",
    llm_config=llm_config
)

# Wrap with Agent OS governance
governed_assistant = autogen_kernel.wrap(assistant)

# The assistant now operates under kernel protection

2

Configuring Policies

from agent_os.integrations import autogen_kernel
from agent_os.policies import Policy

# Define a custom policy for the assistant
assistant_policy = Policy(
    name="assistant_policy",
    rules=[
        # Limit response length
        {"action": "generate", "max_tokens": 2000},
        
        # Block certain topics
        {"action": "generate", "blocked_topics": ["politics", "violence"]},
        
        # Require code review for certain languages
        {"action": "code_suggest", "languages": ["python", "javascript"], "require_review": True}
    ]
)

# Wrap with custom policy
governed_assistant = autogen_kernel.wrap(
    assistant,
    policy=assistant_policy,
    audit_log=True,
    max_consecutive_auto_reply=10
)

3

Message Filtering

from agent_os.integrations import autogen_kernel
from agent_os.filters import ContentFilter, PIIFilter

# Create content filters
content_filter = ContentFilter(
    block_patterns=[
        r"\b(password|secret|api_key)\s*=\s*['\"][^'\"]+['\"]",  # Block hardcoded secrets
        r"\b(DROP|DELETE|TRUNCATE)\s+TABLE\b",  # Block destructive SQL
    ],
    redact_patterns=[
        r"\b\d{3}-\d{2}-\d{4}\b",  # Redact SSN
        r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b",  # Redact emails
    ]
)

pii_filter = PIIFilter(
    detect_types=["ssn", "credit_card", "phone", "email"],
    action="redact"  # or "block"
)

# Wrap with filters
governed_assistant = autogen_kernel.wrap(
    assistant,
    message_filters=[content_filter, pii_filter],
    filter_incoming=True,
    filter_outgoing=True
)
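
To check what the patterns in the ContentFilter example would match before deploying them, they can be exercised with Python's standard re module, independent of Agent OS. The following is a minimal sketch: the apply_filters helper and the "[REDACTED]" marker are assumptions made for illustration and are not part of the ContentFilter API. The patterns are adapted from the example above.

```python
import re

# Patterns adapted from the ContentFilter example above.
BLOCK_PATTERNS = [
    r"\b(password|secret|api_key)\s*=\s*['\"][^'\"]+['\"]",  # hardcoded secrets
    r"\b(DROP|DELETE|TRUNCATE)\s+TABLE\b",                    # destructive SQL
]
REDACT_PATTERNS = [
    r"\b\d{3}-\d{2}-\d{4}\b",                                 # SSN-like numbers
    r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b",    # email addresses
]


def apply_filters(message):
    """Hypothetical helper: return (blocked, text_after_redaction)."""
    # Blocking wins over redaction: a blocked message is dropped entirely.
    if any(re.search(pattern, message) for pattern in BLOCK_PATTERNS):
        return True, ""
    for pattern in REDACT_PATTERNS:
        message = re.sub(pattern, "[REDACTED]", message)
    return False, message


print(apply_filters("password = 'hunter2'"))
print(apply_filters("Contact jane@example.com, SSN 123-45-6789"))
```

Note that these patterns are case-sensitive as written, so a lowercase "drop table" would pass the block list unless the regular expressions are compiled with re.IGNORECASE.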

Wrapping UserProxyAgent

The UserProxyAgent handles human input and code execution. This is where kernel-level governance is most critical for safety.

1

Basic UserProxy Wrapping

from autogen import UserProxyAgent
from agent_os.integrations import autogen_kernel

# Create your UserProxyAgent
user_proxy = UserProxyAgent(
    name="user_proxy",
    human_input_mode="TERMINATE",  # or "ALWAYS", "NEVER"
    max_consecutive_auto_reply=10,
    code_execution_config={
        "work_dir": "./workspace",
        "use_docker": False  # We'll use Agent OS sandboxing instead
    }
)

# Wrap with Agent OS for code execution governance
governed_user_proxy = autogen_kernel.wrap(
    user_proxy,
    code_policy="sandboxed",  # "sandboxed", "restricted", "permissive"
    execution_timeout=30,
    max_iterations=10
)

2

Human Input Governance

from agent_os.integrations import autogen_kernel

# Configure human input handling
governed_user_proxy = autogen_kernel.wrap(
    user_proxy,
    human_input_config={
        # Validate human input before processing
        "validate_input": True,
        
        # Filter sensitive content from human input
        "filter_pii": True,
        
        # Log all human interactions
        "audit_human_input": True,
        
        # Require confirmation for certain actions
        "confirm_actions": ["file_write", "network_request", "code_execute"]
    }
)

# Now human input is also governed by the kernel

3

Initiating Governed Conversations

# Both agents are now governed
governed_assistant = autogen_kernel.wrap(assistant, policy="strict")
governed_user_proxy = autogen_kernel.wrap(user_proxy, code_policy="sandboxed")

# Start a governed conversation
governed_user_proxy.initiate_chat(
    governed_assistant,
    message="Write a Python script to analyze this CSV file and generate a summary report."
)

# All messages and code executions are:
# ✓ Filtered through kernel policies
# ✓ Logged to flight recorder
# ✓ Sandboxed for safety
# ✓ Subject to iteration limits

Group Chat Governance

AutoGen's GroupChat enables multiple agents to collaborate. Agent OS provides unified governance across all participants.

1

Wrapping GroupChat

from autogen import GroupChat, GroupChatManager, AssistantAgent, UserProxyAgent
from agent_os.integrations import autogen_kernel

# Create multiple agents
coder = AssistantAgent(name="coder", llm_config=llm_config,
    system_message="You write code to solve problems.")
reviewer = AssistantAgent(name="reviewer", llm_config=llm_config,
    system_message="You review code for bugs and improvements.")
tester = AssistantAgent(name="tester", llm_config=llm_config,
    system_message="You write tests for the code.")
user_proxy = UserProxyAgent(name="user", code_execution_config={"work_dir": "./workspace"})

# Create the group chat
group_chat = GroupChat(
    agents=[user_proxy, coder, reviewer, tester],
    messages=[],
    max_round=20
)

# Create the manager
manager = GroupChatManager(groupchat=group_chat, llm_config=llm_config)

# Wrap the entire group chat with governance
governed_manager = autogen_kernel.wrap_group_chat(
    manager,
    policy="collaborative",
    max_rounds=20,
    audit_all_messages=True
)

2

Agent Selection Policies

from agent_os.integrations import autogen_kernel
from agent_os.policies import GroupPolicy

# Define policies for agent selection
group_policy = GroupPolicy(
    name="dev_team_policy",
    rules=[
        # Require reviewer after coder
        {"after_agent": "coder", "require_agent": "reviewer"},
        
        # Limit consecutive replies from same agent
        {"max_consecutive_same_agent": 2},
        
        # Require user approval before code execution
        {"before_action": "code_execute", "require_agent": "user"},
        
        # Block certain agent transitions
        {"from_agent": "tester", "blocked_to": ["tester"]},  # Tester can't reply to self
    ],
    
    # Trust levels for inter-agent communication
    trust_matrix={
        "coder": {"reviewer": 0.9, "tester": 0.8, "user": 1.0},
        "reviewer": {"coder": 0.9, "tester": 0.9, "user": 1.0},
        "tester": {"coder": 0.8, "reviewer": 0.9, "user": 1.0},
    }
)

governed_manager = autogen_kernel.wrap_group_chat(
    manager,
    policy=group_policy
)
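
One way to reason about selection rules such as "require reviewer after coder" and "max_consecutive_same_agent" is as a function from the speaker history to the set of agents allowed to speak next. The sketch below is plain Python and does not use Agent OS or AutoGen; the allowed_next_speakers helper and its parameters are hypothetical names chosen for illustration, and the real GroupPolicy evaluation may work differently.

```python
def allowed_next_speakers(history, agents, max_consecutive=2, required_after=None):
    """Hypothetical helper: which agents may speak next, given speaker history.

    history: list of agent names in speaking order.
    required_after: dict mapping an agent to the agent that must follow it.
    """
    required_after = required_after or {}
    if not history:
        return list(agents)

    last = history[-1]
    # Rule: a required follow-up agent takes priority over everything else.
    if last in required_after:
        return [required_after[last]]

    # Rule: cap the number of consecutive turns taken by the same agent.
    streak = 0
    for name in reversed(history):
        if name != last:
            break
        streak += 1
    if streak >= max_consecutive:
        return [agent for agent in agents if agent != last]
    return list(agents)


agents = ["user", "coder", "reviewer", "tester"]
rules = {"coder": "reviewer"}
print(allowed_next_speakers(["user", "coder"], agents, required_after=rules))
print(allowed_next_speakers(["reviewer", "reviewer"], agents, required_after=rules))
```

The first call returns only the reviewer, because the coder spoke last. The second call excludes the reviewer, who has already taken two consecutive turns.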

3

Conversation Flow Control

from agent_os.integrations import autogen_kernel

# Advanced conversation control
governed_manager = autogen_kernel.wrap_group_chat(
    manager,
    flow_control={
        # Terminate conditions
        "terminate_on": [
            {"keyword": "TASK_COMPLETE"},
            {"consensus": 0.8},  # 80% of agents agree
            {"max_messages": 100}
        ],
        
        # Escalation rules
        "escalate_to_human": [
            {"on_error": True},
            {"on_timeout": True},
            {"on_budget_exceed": True}
        ],
        
        # Rate limiting
        "rate_limit": {
            "messages_per_minute": 30,
            "tokens_per_minute": 50000
        },
        
        # Deadlock detection
        "detect_loops": True,
        "loop_threshold": 5  # Same pattern repeated 5 times
    }
)

# Start the governed group chat
user_proxy.initiate_chat(
    governed_manager,
    message="Build a REST API for user management with authentication."
)
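
The detect_loops and loop_threshold settings describe stopping when the same pattern repeats a number of times. A simple version of that idea can be sketched in plain Python, assuming that "pattern" means an identical block of consecutive messages. The detect_loop function and its max_pattern_len parameter are hypothetical; the detection logic actually used by Agent OS is not documented here and may rely on similarity rather than exact equality.

```python
def detect_loop(messages, threshold=5, max_pattern_len=4):
    """Hypothetical check: True if the conversation ends with the same
    block of messages repeated `threshold` times in a row."""
    for size in range(1, max_pattern_len + 1):
        needed = size * threshold
        if len(messages) < needed:
            break  # Larger blocks need even more messages, so stop here.
        tail = messages[-needed:]
        pattern = tail[:size]
        # Compare every block of `size` messages in the tail to the first one.
        if all(tail[i:i + size] == pattern for i in range(0, needed, size)):
            return True
    return False


ping_pong = ["Please fix the test.", "Done."] * 5
print(detect_loop(ping_pong))                       # two-message block repeated 5 times
print(detect_loop(["a", "b", "c", "d", "e"]))       # no repetition
```

Exact matching like this only catches verbatim repetition; agents that rephrase the same request each round would need a fuzzier comparison.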

Code Execution Control

AutoGen agents can execute code autonomously. Agent OS provides multiple layers of protection to ensure code execution remains safe.

1

Sandboxed Execution

from agent_os.integrations import autogen_kernel
from agent_os.sandbox import SandboxConfig

# Configure the sandbox
sandbox_config = SandboxConfig(
    # Isolation level
    isolation="container",  # "process", "container", "vm"
    
    # Resource limits
    max_memory_mb=512,
    max_cpu_percent=50,
    max_execution_time=30,
    
    # File system access
    filesystem={
        "work_dir": "./workspace",
        "read_paths": ["./data", "./config"],
        "write_paths": ["./workspace/output"],
        "blocked_paths": ["/etc", "/root", "~/.ssh"]
    },
    
    # Network access
    network={
        "enabled": False,  # Disable by default
        "allowed_hosts": [],
        "allowed_ports": []
    }
)

# Apply sandbox to UserProxyAgent
governed_user_proxy = autogen_kernel.wrap(
    user_proxy,
    sandbox=sandbox_config
)
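
The write_paths and blocked_paths settings imply a check that runs before any file is written. A minimal sketch of such a check with the standard pathlib module is shown below. The is_write_allowed function is a hypothetical illustration, not the Agent OS implementation, and the directories used are made-up examples. The important detail is resolving the path first, so that ".." segments cannot escape an allowed directory.

```python
from pathlib import Path


def is_write_allowed(target, write_paths, blocked_paths):
    """Hypothetical check: True if `target` resolves inside an allowed write
    path and outside every blocked path."""
    resolved = Path(target).expanduser().resolve()

    def inside(base):
        base_resolved = Path(base).expanduser().resolve()
        return resolved == base_resolved or base_resolved in resolved.parents

    # Blocked paths take precedence over allowed paths.
    if any(inside(blocked) for blocked in blocked_paths):
        return False
    return any(inside(allowed) for allowed in write_paths)


write_paths = ["/srv/agent/workspace/output"]   # example directory, assumed
blocked_paths = ["/etc", "/root"]

print(is_write_allowed("/srv/agent/workspace/output/report.csv",
                       write_paths, blocked_paths))
# A traversal attempt resolves outside the allowed directory and is rejected.
print(is_write_allowed("/srv/agent/workspace/output/../../../../etc/passwd",
                       write_paths, blocked_paths))
```

A path check of this kind only restricts code that goes through it; process, container, or VM isolation is still what prevents generated code from bypassing the check entirely.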

2

Code Analysis Before Execution

from agent_os.integrations import autogen_kernel
from agent_os.analysis import CodeAnalyzer

# Configure code analysis
code_analyzer = CodeAnalyzer(
    # Static analysis rules
    block_imports=[
        "subprocess", "os.system", "eval", "exec",
        "pickle", "marshal", "__import__"
    ],
    
    # Pattern-based blocking
    block_patterns=[
        r"open\s*\(\s*['\"]\/etc",  # Block /etc access
        r"requests\.get\s*\(",       # Block HTTP requests
        r"socket\.",                 # Block raw sockets
        r"shutil\.rmtree",           # Block recursive delete
    ],
    
    # Complexity limits
    max_lines=500,
    max_functions=20,
    max_nesting_depth=5,
    
    # Require approval for certain operations
    require_approval=[
        "file_write",
        "external_command",
        "database_query"
    ]
)

governed_user_proxy = autogen_kernel.wrap(
    user_proxy,
    code_analyzer=code_analyzer,
    pre_execution_hook=True  # Analyze before every execution
)
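
To illustrate what a pre-execution static check can look like, the sketch below uses Python's standard ast module to find blocked imports and blocked built-in calls without running the code. It is an assumption-laden illustration, not the CodeAnalyzer implementation: find_violations, BLOCKED_MODULES, and BLOCKED_CALLS are hypothetical names. It treats modules (such as subprocess) and built-in calls (such as eval) separately, because the two appear as different node types in the syntax tree.

```python
import ast

BLOCKED_MODULES = {"subprocess", "pickle", "marshal"}
BLOCKED_CALLS = {"eval", "exec", "__import__"}


def find_violations(source):
    """Hypothetical pre-execution check: list blocked imports and calls."""
    violations = []
    for node in ast.walk(ast.parse(source)):
        # Collect module names from "import x" and "from x import y".
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            names = [node.module or ""]
        else:
            names = []
        for name in names:
            if name.split(".")[0] in BLOCKED_MODULES:
                violations.append(f"import {name}")
        # Flag direct calls to blocked built-ins such as eval(...).
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Name)
                and node.func.id in BLOCKED_CALLS):
            violations.append(f"call {node.func.id}()")
    return violations


print(find_violations("import subprocess\nresult = eval('1 + 1')"))
print(find_violations("import pandas as pd\nprint(pd.__version__)"))
```

Static checks of this kind can be evaded, for example by reaching a built-in through getattr, which is one reason to combine them with sandboxed execution rather than rely on them alone.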

3

Execution Policies

from agent_os.integrations import autogen_kernel
from agent_os.policies import ExecutionPolicy

# Define execution policy
exec_policy = ExecutionPolicy(
    name="safe_execution",
    
    # Language restrictions
    allowed_languages=["python", "bash"],
    python_version="3.10",
    
    # Package restrictions
    allowed_packages=[
        "pandas", "numpy", "matplotlib", "seaborn",
        "scikit-learn", "requests", "json", "csv"
    ],
    blocked_packages=[
        "subprocess", "multiprocessing", "ctypes"
    ],
    
    # Execution limits
    max_retries=3,
    retry_on_error=True,
    
    # Output limits
    max_output_size_kb=100,
    truncate_output=True,
    
    # Rollback on failure
    enable_rollback=True,
    snapshot_before_execution=True
)

governed_user_proxy = autogen_kernel.wrap(
    user_proxy,
    execution_policy=exec_policy
)

# Now all code execution is governed by the policy

Example: Governed Coding Assistant

A complete example showing how to build a production-ready, governed coding assistant with AutoGen and Agent OS.

"""
Governed Coding Assistant with AutoGen and Agent OS
====================================================

A production-ready coding assistant with:
- Kernel-level governance
- Sandboxed code execution
- Full audit trail
- Policy-based controls
"""

import os
from autogen import AssistantAgent, UserProxyAgent, GroupChat, GroupChatManager
from agent_os.integrations import autogen_kernel
from agent_os.policies import Policy, ExecutionPolicy, GroupPolicy
from agent_os.sandbox import SandboxConfig
from agent_os.audit import FlightRecorder

# =============================================================================
# 1. Configure the Flight Recorder for Auditing
# =============================================================================

recorder = FlightRecorder(
    output_path="./audit_logs/coding_assistant.jsonl",
    capture_level="detailed",  # "basic", "detailed", "full"
    include_timestamps=True,
    include_token_counts=True
)

# =============================================================================
# 2. Define Policies
# =============================================================================

# Policy for the coding assistant
coder_policy = Policy(
    name="coder_policy",
    rules=[
        # Response quality
        {"action": "generate", "require_explanation": True},
        {"action": "code_suggest", "require_comments": True},
        
        # Safety
        {"action": "generate", "blocked_topics": ["hacking", "exploits", "malware"]},
        {"action": "code_suggest", "blocked_patterns": [
            r"eval\s*\(",
            r"exec\s*\(",
            r"__import__\s*\("
        ]}
    ]
)

# Policy for code execution
exec_policy = ExecutionPolicy(
    name="safe_execution",
    allowed_languages=["python"],
    allowed_packages=["pandas", "numpy", "matplotlib", "seaborn", "requests", "json"],
    max_execution_time=30,
    max_memory_mb=256,
    max_output_size_kb=50
)

# Sandbox configuration
sandbox = SandboxConfig(
    isolation="process",
    filesystem={
        "work_dir": "./workspace",
        "write_paths": ["./workspace"],
        "blocked_paths": ["/etc", "/root", "~"]
    },
    network={"enabled": False}
)

# =============================================================================
# 3. Create and Wrap Agents
# =============================================================================

# LLM configuration
llm_config = {
    "model": "gpt-4-turbo",
    "api_key": os.environ["OPENAI_API_KEY"],
    "temperature": 0.3  # Lower temperature for coding tasks
}

# Create the coding assistant
coder = AssistantAgent(
    name="coder",
    system_message="""You are an expert Python developer. You write clean, 
    well-documented code with proper error handling. Always explain your 
    approach before writing code.""",
    llm_config=llm_config
)

# Create the code reviewer
reviewer = AssistantAgent(
    name="reviewer",
    system_message="""You are a senior code reviewer. Review code for:
    - Bugs and logic errors
    - Security vulnerabilities
    - Performance issues
    - Code style and best practices
    Provide specific, actionable feedback.""",
    llm_config=llm_config
)

# Create the user proxy with code execution
user_proxy = UserProxyAgent(
    name="user",
    human_input_mode="TERMINATE",
    max_consecutive_auto_reply=10,
    code_execution_config={
        "work_dir": "./workspace",
        "use_docker": False
    }
)

# =============================================================================
# 4. Wrap Agents with Agent OS Governance
# =============================================================================

# Wrap the coder with policy governance
governed_coder = autogen_kernel.wrap(
    coder,
    policy=coder_policy,
    audit_log=True,
    flight_recorder=recorder
)

# Wrap the reviewer
governed_reviewer = autogen_kernel.wrap(
    reviewer,
    policy=coder_policy,
    audit_log=True,
    flight_recorder=recorder
)

# Wrap the user proxy with execution controls
governed_user_proxy = autogen_kernel.wrap(
    user_proxy,
    execution_policy=exec_policy,
    sandbox=sandbox,
    flight_recorder=recorder
)

# =============================================================================
# 5. Create Governed Group Chat
# =============================================================================

# Group policy for the development team
group_policy = GroupPolicy(
    name="dev_team",
    rules=[
        # Require review after coding
        {"after_agent": "coder", "require_agent": "reviewer"},
        
        # Limit iterations
        {"max_consecutive_same_agent": 2}
    ]
)

# Create the group chat
group_chat = GroupChat(
    agents=[governed_user_proxy, governed_coder, governed_reviewer],
    messages=[],
    max_round=15
)

# Create and wrap the manager
manager = GroupChatManager(groupchat=group_chat, llm_config=llm_config)

governed_manager = autogen_kernel.wrap_group_chat(
    manager,
    policy=group_policy,
    flight_recorder=recorder,
    flow_control={
        "terminate_on": [{"keyword": "TASK_COMPLETE"}],
        "detect_loops": True
    }
)

# =============================================================================
# 6. Run the Governed Conversation
# =============================================================================

def main():
    """Run the governed coding assistant."""
    
    print("=" * 60)
    print("Governed Coding Assistant")
    print("Agent OS Kernel: Active")
    print("=" * 60)
    
    # Start the conversation
    governed_user_proxy.initiate_chat(
        governed_manager,
        message="""
        Create a Python script that:
        1. Reads a CSV file containing sales data
        2. Calculates monthly totals and growth rates
        3. Generates a summary report with visualizations
        4. Saves the report as a PDF
        
        The CSV has columns: date, product, quantity, price
        """
    )
    
    # Print audit summary
    print("\n" + "=" * 60)
    print("Audit Summary")
    print("=" * 60)
    
    summary = recorder.get_summary()
    print(f"Total messages: {summary['total_messages']}")
    print(f"Code executions: {summary['code_executions']}")
    print(f"Policy violations: {summary['policy_violations']}")
    print(f"Blocked actions: {summary['blocked_actions']}")
    print(f"Total tokens used: {summary['total_tokens']}")

if __name__ == "__main__":
    main()

✓ What This Example Provides

Multi-agent collaboration with coder and reviewer agents
Sandboxed code execution with resource limits
Policy-based governance for safe code generation
Complete audit trail with flight recorder
Group chat governance with agent selection rules
Loop detection and automatic termination
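
Because the audit trail in this example is written to a .jsonl file, it can also be inspected with standard tooling after a run. The sketch below counts events by type using only the json module. The record schema is an assumption: it supposes each line is a JSON object with an "event_type" field, using event names like those listed for the flight recorder elsewhere on this page. The actual fields written by FlightRecorder may differ.

```python
import json
from collections import Counter


def summarize_audit_log(lines):
    """Count events by type from an iterable of JSON Lines records.

    Assumes each record has an "event_type" field; the real schema may differ.
    """
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # Skip blank lines.
        record = json.loads(line)
        counts[record.get("event_type", "unknown")] += 1
    return dict(counts)


# Made-up sample records for illustration only.
sample = [
    '{"event_type": "message_sent", "agent": "coder"}',
    '{"event_type": "code_execution", "agent": "user"}',
    '{"event_type": "action_blocked", "agent": "user", "reason": "network disabled"}',
    '{"event_type": "message_sent", "agent": "reviewer"}',
]
print(summarize_audit_log(sample))
```

Since an open file is an iterable of lines, the same function can be applied to a real log with `with open("./audit_logs/coding_assistant.jsonl") as f: summarize_audit_log(f)`.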

Troubleshooting

❌ Import Error: autogen_kernel not found

The AutoGen integration is not installed or there's a version mismatch.

# Solution: Reinstall with AutoGen extras
pip uninstall agent-os-kernel
pip install "agent-os-kernel[autogen]"

# Verify both packages
pip show pyautogen agent-os-kernel

# Check version compatibility
python -c "
import autogen
from agent_os.integrations import autogen_kernel
print(f'AutoGen: {autogen.__version__}')
print('Agent OS AutoGen integration: OK')
"

❌ Code Execution Blocked Unexpectedly

The sandbox or code analyzer is blocking legitimate code.

# Debug: Check what's being blocked
from agent_os.integrations import autogen_kernel

governed_agent = autogen_kernel.wrap(
    agent,
    debug=True,  # Enable debug logging
    log_blocked_actions=True
)

# Check the audit log (capture_level is set on the recorder, as in the other examples)
from agent_os.audit import FlightRecorder
recorder = FlightRecorder(
    output_path="./debug_audit.jsonl",
    capture_level="full"
)

governed_agent = autogen_kernel.wrap(
    agent,
    flight_recorder=recorder
)

# After running, inspect blocked actions
blocked = recorder.query(action_type="blocked")
for action in blocked:
    print(f"Blocked: {action['reason']}")
    print(f"Code: {action['code'][:100]}...")

# Solution: Adjust your policy or sandbox config
from agent_os.sandbox import SandboxConfig

sandbox = SandboxConfig(
    # Add the blocked import to allowed list
    allowed_imports=["pandas", "numpy", "your_blocked_import"],
    
    # Or relax the filesystem permissions
    filesystem={
        "read_paths": ["./data", "/path/you/need"]
    }
)

❌ Group Chat Loops or Doesn't Terminate

Agents are stuck in a conversation loop or don't reach termination.

# Solution: Enable loop detection and set clear termination conditions
from autogen import AssistantAgent
from agent_os.integrations import autogen_kernel

governed_manager = autogen_kernel.wrap_group_chat(
    manager,
    flow_control={
        # Multiple termination conditions (any can trigger)
        "terminate_on": [
            {"keyword": "TASK_COMPLETE"},
            {"keyword": "TERMINATE"},
            {"max_messages": 50},
            {"max_rounds": 15},
            {"consensus": 0.7}  # 70% agents agree task is done
        ],
        
        # Loop detection
        "detect_loops": True,
        "loop_threshold": 3,  # Detect after 3 similar exchanges
        "loop_action": "escalate",  # "terminate", "escalate", "warn"
        
        # Timeout
        "conversation_timeout": 300,  # 5 minutes max
        
        # Stall detection
        "stall_threshold": 5,  # No progress after 5 messages
        "stall_action": "prompt_user"
    }
)

# Also ensure agents have clear termination instructions
assistant = AssistantAgent(
    name="assistant",
    system_message="""... When the task is complete, respond with 
    'TASK_COMPLETE' to end the conversation.""",
    llm_config=llm_config
)

❌ Messages Being Filtered Incorrectly

Legitimate content is being blocked by message filters.

# Debug: See what's being filtered
from agent_os.filters import ContentFilter

content_filter = ContentFilter(
    block_patterns=[...],
    debug=True,  # Log all filter decisions
    log_matches=True
)

# Test your filter against specific content
test_message = "Your message that's being blocked"
result = content_filter.analyze(test_message)
print(f"Would block: {result.would_block}")
print(f"Matched patterns: {result.matched_patterns}")
print(f"Redactions: {result.redactions}")

# Solution: Refine your patterns
content_filter = ContentFilter(
    block_patterns=[
        # Be more specific with patterns
        r"\bpassword\s*=\s*['\"][^'\"]{8,}['\"]",  # Only block actual passwords
    ],
    
    # Use allowlist for false positives
    allow_patterns=[
        r"password\s*validation",  # Allow discussion of password validation
        r"set.*password.*policy",  # Allow password policy discussions
    ],
    
    # Adjust sensitivity
    case_sensitive=False,
    
    # Use redaction instead of blocking for borderline cases
    redact_instead_of_block=True
)

❌ Performance Issues with Governance Overhead

Agent responses are slow due to governance checks.

# Solution: Optimize governance configuration
from agent_os.integrations import autogen_kernel
from agent_os.sandbox import SandboxConfig

# Use lighter-weight policies for performance
governed_agent = autogen_kernel.wrap(
    agent,
    policy="basic",  # "basic" instead of "strict"
    
    # Async policy checks (don't block on every message)
    async_policy_check=True,
    
    # Cache policy decisions
    cache_policy_decisions=True,
    cache_ttl=60,  # Cache for 60 seconds
    
    # Reduce audit overhead
    audit_level="basic",  # "basic", "detailed", "full"
    sample_rate=0.1,  # Only audit 10% of messages
    
    # Skip certain checks for trusted operations
    skip_checks_for=["internal_messages", "status_updates"]
)

# For code execution, use process isolation instead of container
sandbox = SandboxConfig(
    isolation="process",  # Faster than "container"
    
    # Pre-warm the sandbox
    pre_warm=True,
    
    # Reuse sandbox between executions
    persistent=True
)
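
The cache_policy_decisions and cache_ttl options trade freshness for speed: a cached decision is reused until it expires. The general technique can be sketched as a small time-to-live cache. The TTLCache class below is a hypothetical illustration, not Agent OS code, and the cache key shown (agent name plus action) is an assumption. The clock is injectable so the expiry behavior can be demonstrated without waiting.

```python
import time


class TTLCache:
    """Hypothetical cache of policy decisions that expire after `ttl` seconds."""

    def __init__(self, ttl=60.0, clock=time.monotonic):
        self.ttl = ttl
        self.clock = clock
        self._entries = {}

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, stored_at = entry
        if self.clock() - stored_at >= self.ttl:
            del self._entries[key]  # Expired: force a fresh policy check.
            return None
        return value

    def put(self, key, value):
        self._entries[key] = (value, self.clock())


# Demonstration with a fake clock so the example is deterministic.
fake_now = [0.0]
cache = TTLCache(ttl=60, clock=lambda: fake_now[0])
cache.put(("coder", "generate"), "allow")

fake_now[0] = 59.0
print(cache.get(("coder", "generate")))  # still cached

fake_now[0] = 60.0
print(cache.get(("coder", "generate")))  # expired, returns None
```

With any cache of this kind, a decision made under an old policy can remain in effect for up to the TTL after the policy changes, so shorter TTLs suit policies that are updated often.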

❌ Flight Recorder Not Capturing Events

Audit logs are empty or missing expected events.

# Solution: Ensure flight recorder is properly connected
from agent_os.audit import FlightRecorder
from agent_os.integrations import autogen_kernel

# Create recorder with verbose settings
recorder = FlightRecorder(
    output_path="./audit_logs/debug.jsonl",
    capture_level="full",
    
    # Ensure immediate writes
    buffer_size=1,  # Write every event immediately
    flush_on_event=True,
    
    # Capture all event types
    event_types=[
        "message_sent", "message_received",
        "code_execution", "policy_check",
        "action_blocked", "action_allowed",
        "error", "warning"
    ]
)

# Connect to ALL governed agents
governed_assistant = autogen_kernel.wrap(
    assistant,
    flight_recorder=recorder
)

governed_user_proxy = autogen_kernel.wrap(
    user_proxy,
    flight_recorder=recorder  # Same recorder instance
)

governed_manager = autogen_kernel.wrap_group_chat(
    manager,
    flight_recorder=recorder  # Same recorder instance
)

# Verify recording is working
print(f"Recorder active: {recorder.is_active}")
print(f"Output path: {recorder.output_path}")

# After conversation, flush and check
recorder.flush()
print(f"Events recorded: {recorder.event_count}")

Still having issues? Open an issue on GitHub or check the API Reference.
