| layout | default |
|---|---|
| title | OpenHands Tutorial - Chapter 1: Getting Started |
| nav_order | 1 |
| has_children | false |
| parent | OpenHands Tutorial |
Welcome to Chapter 1: Getting Started with OpenHands. In this part of the OpenHands Tutorial: Autonomous Software Engineering Workflows, you will build an intuitive mental model first, then move into concrete implementation details and practical production tradeoffs.
Install OpenHands, understand its architecture, and execute your first autonomous coding task.
OpenHands is a powerful autonomous AI software engineering agent that can perform complex coding tasks independently. This chapter covers installation, basic concepts, and your first hands-on experience with autonomous development.
# Minimum requirements
- Python 3.10+
- 8GB RAM (16GB recommended)
- Linux/macOS/Windows
- Node.js 18+ (for frontend components)
# For GPU acceleration (optional)
- CUDA 11.8+ (NVIDIA GPUs)
- ROCm 5.4+ (AMD GPUs)

```bash
# Install via pip
pip install openhands

# Or install from source for latest features
git clone https://github.com/All-Hands-AI/OpenHands.git
cd OpenHands
pip install -e .

# Install with all optional dependencies (quoted for shells like zsh)
pip install "openhands[all]"
```

OpenHands requires API keys for the underlying language models:
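Before installing, it is worth confirming that your interpreter meets the Python 3.10+ requirement listed above. A small self-contained check (independent of OpenHands itself):

```python
import sys

MIN_VERSION = (3, 10)  # minimum Python for OpenHands, per the requirements above

def meets_requirement(version=None, minimum=MIN_VERSION):
    """Return True if `version` (default: the running interpreter) is new enough."""
    if version is None:
        version = sys.version_info
    return tuple(version[:2]) >= minimum

print("Python OK" if meets_requirement() else "Python too old for OpenHands")
```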
```bash
# Set environment variables
export OPENAI_API_KEY="your-openai-api-key"
export ANTHROPIC_API_KEY="your-anthropic-api-key"  # Alternative

# For other providers
export TOGETHER_API_KEY="your-together-api-key"
export DEEPINFRA_API_KEY="your-deepinfra-api-key"
```

Create a configuration file for OpenHands:
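Since a missing key only surfaces at the first model call, it can help to fail fast. A small helper (hypothetical, not part of the OpenHands API) that checks the environment variables above:

```python
import os

PROVIDER_KEYS = [  # environment variables checked, in order of preference
    "OPENAI_API_KEY",
    "ANTHROPIC_API_KEY",
    "TOGETHER_API_KEY",
    "DEEPINFRA_API_KEY",
]

def configured_providers(env=os.environ):
    """Return the provider key names that are set and non-empty."""
    return [name for name in PROVIDER_KEYS if env.get(name)]

found = configured_providers()
if found:
    print(f"Found credentials for: {', '.join(found)}")
else:
    print("No provider API key set; export one before running OpenHands")
```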
```toml
# config.toml
[core]
workspace_base = "./workspace"
persist_sandbox = false
run_as_openhands = true
runtime_startup_timeout = 120

[security]
confirmation_mode = false
security_analyzer = ""
allow_file_operations = true

[sandbox]
container_image = "ghcr.io/all-hands-ai/openhands:main"
local_runtime = "eventstream"
runtime_startup_timeout = 120
runtime_startup_env_vars = {}
enable_auto_fix = false
use_host_network = false
runtime_cls = "openhands.runtime.impl.eventstream.EventStreamRuntime"

[llm]
model = "gpt-4"
api_key = "your-api-key"
custom_llm_provider = ""
embedding_model = "local"  # or "openai"
embedding_base_url = ""
max_input_tokens = 0
max_output_tokens = 0
input_cost_per_token = 0.0
output_cost_per_token = 0.0
ollama_base_url = ""
drop_params = false
```

At its core, OpenHands combines an agent, a runtime, and a task specification:

```python
from openhands import OpenHands, Task
from openhands.runtime import get_runtime
from openhands.agent import Agent

# Main components
agent = Agent()          # The AI agent
runtime = get_runtime()  # Execution environment
task = Task()            # Task specification

# Combined system
openhands = OpenHands(
    agent=agent,
    runtime=runtime
)
```

OpenHands supports different agent types for various scenarios:
```python
from openhands.agent import CodeActAgent, BrowsingAgent, PlannerAgent

# CodeActAgent: Primary agent for coding tasks
code_agent = CodeActAgent()

# BrowsingAgent: For web-related tasks
browse_agent = BrowsingAgent()

# PlannerAgent: For complex multi-step planning
planner_agent = PlannerAgent()

# Custom agent configuration
custom_agent = CodeActAgent(
    llm_config={
        "model": "gpt-4",
        "api_key": "your-key",
        "temperature": 0.1,  # Lower temperature for coding
        "max_tokens": 4096
    }
)
```

OpenHands provides different execution environments:
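One way to think about the agent types above is as a routing table from task category to agent. The mapping below is purely illustrative (OpenHands selects agents via its own configuration, not this helper):

```python
# Map task categories to the agent class name that suits them.
# Categories and routing here are illustrative, not an OpenHands API.
AGENT_FOR_TASK = {
    "code": "CodeActAgent",
    "web": "BrowsingAgent",
    "plan": "PlannerAgent",
}

def pick_agent(task_kind):
    """Return the agent class name for a task kind, defaulting to CodeActAgent."""
    return AGENT_FOR_TASK.get(task_kind, "CodeActAgent")

print(pick_agent("web"))   # BrowsingAgent
print(pick_agent("docs"))  # falls back to CodeActAgent
```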
```python
from openhands.runtime import LocalRuntime, DockerRuntime, RemoteRuntime

# Local runtime (direct execution)
local_runtime = LocalRuntime()

# Docker runtime (sandboxed execution)
docker_runtime = DockerRuntime(
    container_image="ghcr.io/all-hands-ai/openhands:main",
    workspace_base="/workspace"
)

# Remote runtime (for distributed setups)
remote_runtime = RemoteRuntime(
    host="remote-server",
    port=8080
)
```

With a runtime in place, you can execute your first autonomous coding task:

```python
from openhands import OpenHands

# Initialize OpenHands
openhands = OpenHands()

# Define a simple task
task = "Create a Python function that calculates the factorial of a number"

# Execute the task
result = openhands.run(
    task=task,
    workspace="./my_first_project"
)

# Check the result
print("Task completed!")
print(f"Generated code: {result.code}")
print(f"Execution output: {result.output}")
print(f"Files created: {result.files_created}")
```

The result object contains detailed information about the run:
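For the factorial task above, the agent's generated code might resemble the following. This is an illustrative hand-written version, not actual agent output:

```python
def factorial(n: int) -> int:
    """Return n! for a non-negative integer n."""
    if n < 0:
        raise ValueError("factorial is undefined for negative numbers")
    result = 1
    for i in range(2, n + 1):
        result *= i
    return result

print(factorial(5))  # 120
```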
```python
# Inspect the run metadata
print(f"Task: {result.task}")
print(f"Status: {result.status}")  # 'completed', 'failed', etc.
print(f"Duration: {result.execution_time} seconds")

# Code generated by the agent
print("Generated code:")
print(result.code)

# Files created/modified
for file_path in result.files_created:
    print(f"Created: {file_path}")
for file_path in result.files_modified:
    print(f"Modified: {file_path}")

# Any errors or issues
if result.errors:
    print("Errors encountered:")
    for error in result.errors:
        print(f"  - {error}")
```

For longer back-and-forth work, OpenHands also supports an interactive session:

```python
from openhands import OpenHands

# Start an interactive session
openhands = OpenHands()

# Begin interactive mode
session = openhands.start_interactive_session(
    workspace="./interactive_workspace"
)

print("OpenHands interactive mode started!")
print("Type 'help' for commands, 'exit' to quit")

while True:
    user_input = input("> ")
    if user_input.lower() == 'exit':
        break
    elif user_input.lower() == 'help':
        print("Commands:")
        print("  help   - Show this help")
        print("  status - Show current status")
        print("  files  - List workspace files")
        print("  <task> - Execute a coding task")
        continue

    # Execute the user's task
    try:
        result = session.execute_task(user_input)
        print(f"Result: {result.output}")
        if result.code:
            print(f"Code generated: {result.code}")
    except Exception as e:
        print(f"Error: {e}")
```

Tasks can also be specified as structured dictionaries. A simple specification:
```python
simple_task = {
    "description": "Create a hello world function in Python",
    "language": "python",
    "requirements": ["Function should return 'Hello, World!'"],
    "constraints": ["Use proper function naming", "Include docstring"]
}

# Execute simple task
result = openhands.run(task=simple_task)
```

A complex task can spell out components, requirements, and constraints in detail:
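Since task specifications are plain dictionaries, it can help to validate them before submission. A minimal checker (a hypothetical helper, not part of OpenHands; the field names follow the examples in this chapter):

```python
REQUIRED_FIELDS = {"description"}  # a task must at least describe itself
OPTIONAL_FIELDS = {"language", "requirements", "constraints", "components"}

def validate_task(task):
    """Return a list of problems with a task dict; an empty list means it looks OK."""
    problems = []
    missing = REQUIRED_FIELDS - task.keys()
    if missing:
        problems.append(f"missing fields: {sorted(missing)}")
    unknown = task.keys() - REQUIRED_FIELDS - OPTIONAL_FIELDS
    if unknown:
        problems.append(f"unknown fields: {sorted(unknown)}")
    for list_field in ("requirements", "constraints"):
        if list_field in task and not isinstance(task[list_field], list):
            problems.append(f"{list_field} must be a list")
    return problems

print(validate_task({"description": "hello world", "requirements": ["x"]}))  # []
```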
```python
complex_task = {
    "description": "Build a REST API for a task management system",
    "components": {
        "backend": {
            "framework": "FastAPI",
            "database": "SQLite",
            "models": ["User", "Task", "Category"],
            "endpoints": [
                "POST /users - Create user",
                "GET /users/{id} - Get user",
                "POST /tasks - Create task",
                "GET /tasks - List tasks",
                "PUT /tasks/{id} - Update task",
                "DELETE /tasks/{id} - Delete task"
            ]
        },
        "frontend": {
            "framework": "React",
            "components": ["TaskList", "TaskForm", "UserProfile"],
            "features": ["CRUD operations", "Real-time updates"]
        },
        "testing": {
            "unit_tests": True,
            "integration_tests": True,
            "api_tests": True
        },
        "documentation": {
            "api_docs": True,
            "readme": True,
            "deployment_guide": True
        }
    },
    "requirements": [
        "Use proper error handling",
        "Implement input validation",
        "Add authentication/authorization",
        "Include comprehensive tests",
        "Create deployment configuration"
    ],
    "constraints": [
        "Follow REST API best practices",
        "Use type hints in Python",
        "Implement proper database relationships",
        "Ensure security best practices"
    ]
}

# Execute complex task
result = openhands.run(
    task=complex_task,
    max_execution_time=1800,  # 30 minutes
    save_progress=True
)
```

Workspaces give tasks a dedicated project directory to work in:

```python
from openhands.workspace import Workspace

# Create a new workspace
workspace = Workspace("./my_project")

# Initialize workspace with template
workspace.init_from_template("python-fastapi")

# Or start with empty workspace
workspace.create_empty()

# Workspace operations
print(f"Workspace path: {workspace.path}")
print(f"Files: {workspace.list_files()}")

# Create files
workspace.create_file("main.py", "print('Hello, World!')")

# Read files
content = workspace.read_file("main.py")
print(f"File content: {content}")

# Execute commands in workspace
result = workspace.run_command("python main.py")
print(f"Command output: {result.stdout}")
```

A persistent workspace keeps state across sessions, which is useful for long-running development projects:
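Under any runtime, the workspace operations above boil down to ordinary file and process handling. A rough stand-in using only the standard library, to illustrate the semantics (this is not the OpenHands implementation):

```python
import subprocess
from pathlib import Path

class MiniWorkspace:
    """Toy stand-in mirroring create_file / read_file / run_command semantics."""

    def __init__(self, path):
        self.path = Path(path)
        self.path.mkdir(parents=True, exist_ok=True)

    def create_file(self, name, content):
        (self.path / name).write_text(content)

    def read_file(self, name):
        return (self.path / name).read_text()

    def run_command(self, command):
        # Run relative to the workspace so tasks cannot rely on the caller's CWD
        return subprocess.run(command, shell=True, cwd=self.path,
                              capture_output=True, text=True)

ws = MiniWorkspace("./scratch_ws")
ws.create_file("main.py", "print('Hello, World!')")
print(ws.read_file("main.py"))
print(ws.run_command("echo hello from the workspace").stdout.strip())
```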
```python
persistent_workspace = Workspace("./persistent_project", persistent=True)

# State is remembered between OpenHands sessions

# Save workspace state
persistent_workspace.save_state()

# Load workspace state in new session
loaded_workspace = Workspace.load_state("./persistent_project")
```

When a task fails, inspect the exception, the result's error details, and the logs:

```python
result = None
try:
    result = openhands.run(task="Create a Python web server")
except Exception as e:
    print(f"Execution failed: {e}")

# Get detailed error information
if result is not None and hasattr(result, 'error_details'):
    print("Error details:")
    print(result.error_details)

# Check logs
logs = openhands.get_logs()
print("Recent logs:")
for log_entry in logs[-10:]:  # Last 10 entries
    print(f"  {log_entry['timestamp']}: {log_entry['message']}")
```

Debug mode records a step-by-step execution trace:
```python
# Enable debugging mode for detailed execution tracing
openhands.enable_debug_mode()

# Run task with debugging
result = openhands.run(
    task="Debug this Python function: def add(a, b): return a + b",
    debug=True
)

# Access debug information
debug_info = result.debug_info
print("Execution trace:")
for step in debug_info['trace']:
    print(f"  Step {step['step']}: {step['action']}")
    print(f"  Result: {step['result']}")
    print(f"  Duration: {step['duration']}ms")
```

OpenHands runs in secure sandboxed environments by default:
```python
# Configure sandbox security
secure_openhands = OpenHands(
    runtime=DockerRuntime(
        container_image="ghcr.io/all-hands-ai/openhands:main",
        security_opts=[
            "--cap-drop=ALL",  # Drop all capabilities
            "--network=none",  # No network access
            "--read-only",     # Read-only root filesystem
            "--tmpfs=/tmp"     # Writable temp directory
        ]
    )
)

# Execute task in secure sandbox
result = secure_openhands.run(
    task="Create a file processing script",
    security_level="high"
)
```

Execution permissions can be narrowed further per capability:
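The security options above are ordinary `docker run` flags. If you wanted to reproduce the sandbox by hand, the equivalent command line could be assembled like this (a sketch; the image name is taken from the configuration above, the `-w` workdir is an assumption):

```python
def docker_sandbox_command(image, workspace="/workspace"):
    """Assemble an equivalent `docker run` invocation as an argument list."""
    return [
        "docker", "run", "--rm",
        "--cap-drop=ALL",  # drop all Linux capabilities
        "--network=none",  # no network access
        "--read-only",     # read-only root filesystem
        "--tmpfs=/tmp",    # writable temp directory
        "-w", workspace,
        image,
    ]

cmd = docker_sandbox_command("ghcr.io/all-hands-ai/openhands:main")
print(" ".join(cmd))
```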
```python
# Configure execution permissions
permissions = {
    "file_operations": {
        "read": True,
        "write": True,
        "delete": False,  # Disable file deletion
        "execute": True
    },
    "network_access": {
        "outbound": False,  # No internet access
        "localhost": True   # Allow localhost connections
    },
    "command_execution": {
        "allowed_commands": ["python", "pip", "npm", "node"],
        "blocked_commands": ["rm", "sudo", "curl", "wget"]
    }
}

# Apply permissions
openhands.set_permissions(permissions)
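Enforcement of the `command_execution` policy above can be sketched as a simple first-word check (hypothetical logic, not OpenHands internals):

```python
def command_allowed(command_line, policy):
    """Check the first word of a shell command against allow/block lists."""
    program = command_line.strip().split()[0]
    if program in policy.get("blocked_commands", []):
        return False
    allowed = policy.get("allowed_commands")
    # An explicit allow-list means anything not on it is rejected
    return program in allowed if allowed else True

policy = {
    "allowed_commands": ["python", "pip", "npm", "node"],
    "blocked_commands": ["rm", "sudo", "curl", "wget"],
}
print(command_allowed("python main.py", policy))           # True
print(command_allowed("curl http://example.com", policy))  # False
```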
```python
# Configure for optimal performance
high_perf_openhands = OpenHands(
    runtime=DockerRuntime(
        container_image="ghcr.io/all-hands-ai/openhands:main",
        resources={
            "cpu": "2.0",     # 2 CPU cores
            "memory": "4g",   # 4GB RAM
            "gpu": "1",       # 1 GPU (if available)
            "storage": "10g"  # 10GB storage
        }
    ),
    agent=CodeActAgent(
        llm_config={
            "model": "gpt-4",
            "temperature": 0.1,    # Lower temperature for consistency
            "max_tokens": 2048,    # Reasonable token limit
            "cache_enabled": True  # Enable response caching
        }
    )
)
```

Caching avoids repeating expensive operations:
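The `cache_enabled` and TTL settings in this section amount to a time-to-live cache over repeated work. Its behavior can be sketched in a few lines (illustrative only; the real cache is internal to OpenHands):

```python
import time

class TTLCache:
    """Minimal time-to-live cache for (prompt -> response) pairs."""

    def __init__(self, ttl_seconds=3600):
        self.ttl = ttl_seconds
        self._store = {}  # key -> (expiry_timestamp, value)

    def get(self, key, now=None):
        now = time.time() if now is None else now
        entry = self._store.get(key)
        if entry is None or entry[0] < now:
            return None  # missing or expired
        return entry[1]

    def put(self, key, value, now=None):
        now = time.time() if now is None else now
        self._store[key] = (now + self.ttl, value)

cache = TTLCache(ttl_seconds=3600)
cache.put("write factorial", "def factorial(n): ...", now=0)
print(cache.get("write factorial", now=10))    # hit: within TTL
print(cache.get("write factorial", now=4000))  # None: expired
```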
```python
# Enable caching for repeated operations
openhands.enable_caching(
    cache_dir="./openhands_cache",
    max_cache_size="10g",
    ttl_seconds=3600  # 1 hour cache TTL
)

# Cache will automatically store and reuse:
# - Model responses for similar queries
# - Compiled code artifacts
# - Dependency installations
# - Test execution results
```

In this chapter, we've covered:
- Installation and Setup - Getting OpenHands running with proper configuration
- Core Architecture - Understanding agents, runtimes, and task execution
- Basic Task Execution - Running your first autonomous coding tasks
- Workspace Management - Working with project directories and files
- Security and Permissions - Safe execution in sandboxed environments
- Performance Optimization - Configuring for optimal resource usage
OpenHands represents a significant advancement in AI-assisted software development, capable of handling complex, multi-step coding tasks autonomously.
- Autonomous Execution: OpenHands can complete entire development workflows independently
- Secure Sandboxing: Code execution happens in isolated, secure environments
- Flexible Configuration: Adaptable to different project requirements and constraints
- Comprehensive Results: Detailed output including code, execution results, and metadata
- Production Ready: Configurable security, performance, and resource management
Next, we'll explore basic operations - file manipulation, command execution, and environment management.
Ready for the next chapter? Chapter 2: Basic Operations
The overall architecture at a glance:

```mermaid
flowchart TD
    A[Docker Container] --> B[OpenHands Runtime]
    B --> C[Agent Controller]
    C --> D[LLM Provider]
    C --> E[Sandbox Executor]
    E --> F[File Operations]
    E --> G[Shell Commands]
    E --> H[Browser Automation]
    D --> I[Plan and Actions]
    I --> E
```

Generated for Awesome Code Docs