diff --git a/docs/copilot-bot.md b/docs/copilot-bot.md
new file mode 100644
index 0000000..ca379fc
--- /dev/null
+++ b/docs/copilot-bot.md
@@ -0,0 +1,336 @@
+# CopilotBot
+
+CopilotBot is a Microbots agent that delegates the entire agent loop to the **GitHub Copilot runtime**. Unlike other Microbots bots (ReadingBot, WritingBot, etc.) where Microbots manages the LLM ↔ tool loop, CopilotBot lets the Copilot runtime handle planning, tool invocation, file edits, shell commands, and multi-turn reasoning — all within a secure Docker sandbox.
+
+## Prerequisites
+
+- **Docker** — a running Docker daemon
+- **Python 3.11+** (Microbots requires `python >= 3.11`)
+- **One of the following** for authentication:
+ - A GitHub Copilot subscription (for native Copilot auth), **or**
+ - API credentials for any OpenAI-compatible, Azure OpenAI, or Anthropic endpoint (BYOK — no Copilot subscription needed)
+
+## Installation
+
+```bash
+pip install microbots[ghcp]
+```
+
+This installs the `github-copilot-sdk` package alongside Microbots.
+
+!!! note
+ You do **not** need to install `copilot-cli` on your host machine. Microbots automatically installs and runs it inside the Docker container during initialization.
+
+## Quick Start
+
+```python
+from microbots.bot.CopilotBot import CopilotBot
+
+bot = CopilotBot(
+ model="gpt-4.1",
+ folder_to_mount="/path/to/your/project",
+ github_token="ghp_your_github_token",
+)
+
+result = bot.run("Fix the failing unit tests and make sure all tests pass.")
+
+print(result.status) # True if successful
+print(result.result) # The agent's final response
+print(result.error) # Error message if status is False
+
+bot.stop()
+```
+
+## Authentication Methods
+
+CopilotBot supports multiple authentication methods. Only the first requires a GitHub Copilot subscription; the BYOK methods do not.
+
+### 1. GitHub Token (Native Copilot Auth)
+
+Pass a GitHub token directly or let Microbots discover it from the environment.
+
+```python
+# Option A: Pass explicitly
+bot = CopilotBot(
+ model="gpt-4.1",
+ folder_to_mount="/path/to/project",
+ github_token="ghp_your_token",
+)
+
+# Option B: Set environment variable (auto-discovered)
+# export GITHUB_TOKEN="ghp_your_token"
+# — or —
+# export COPILOT_GITHUB_TOKEN="ghp_your_token"
+# — or —
+# export GH_TOKEN="ghp_your_token"
+
+bot = CopilotBot(
+ model="gpt-4.1",
+ folder_to_mount="/path/to/project",
+)
+```
+
+If no token is provided and no environment variable is set, Microbots will attempt to retrieve a token from a local GitHub Copilot login (e.g. via `copilot login`).
+
+**Token discovery order:** explicit `github_token` → `COPILOT_GITHUB_TOKEN` → `GITHUB_TOKEN` → `GH_TOKEN` → local Copilot login.
+
+!!! note
+ The local Copilot login fallback requires `copilot-cli` to be installed on your **host** machine and a valid login session in your home directory (e.g. via `copilot login`). If `copilot-cli` is not installed or no login is found, this step is skipped.
+
+### 2. BYOK — API Key (No Copilot Subscription Required)
+
+Use your own API key and endpoint. This works with any OpenAI-compatible API, Anthropic, or Azure OpenAI — no GitHub Copilot subscription needed.
+
+#### OpenAI
+
+```python
+bot = CopilotBot(
+ model="gpt-4.1",
+ folder_to_mount="/path/to/project",
+ api_key="sk-your-openai-key",
+ base_url="https://api.openai.com/v1",
+ provider_type="openai", # default, can be omitted
+)
+```
+
+#### Anthropic
+
+```python
+bot = CopilotBot(
+ model="claude-sonnet-4.5",
+ folder_to_mount="/path/to/project",
+ api_key="sk-ant-your-key",
+ base_url="https://api.anthropic.com",
+ provider_type="anthropic",
+)
+```
+
+#### Azure OpenAI
+
+```python
+bot = CopilotBot(
+ model="my-gpt4-deployment",
+ folder_to_mount="/path/to/project",
+ api_key="your-azure-api-key",
+ base_url="https://your-resource.openai.azure.com",
+ provider_type="azure",
+ azure_api_version="2024-10-21",
+)
+```
+
+#### Using `wire_api` for newer models
+
+For models that use the Responses API (e.g. GPT-5 series), set `wire_api="responses"`:
+
+```python
+bot = CopilotBot(
+ model="gpt-5",
+ folder_to_mount="/path/to/project",
+ api_key="sk-your-key",
+ base_url="https://api.openai.com/v1",
+ wire_api="responses",
+)
+```
+
+### 3. BYOK — Bearer Token
+
+If your provider uses bearer token authentication instead of an API key:
+
+```python
+bot = CopilotBot(
+ model="gpt-4.1",
+ folder_to_mount="/path/to/project",
+ bearer_token="your-bearer-token",
+ base_url="https://your-endpoint.com/v1",
+)
+```
+
+!!! note
+ When both `api_key` and `bearer_token` are provided, `bearer_token` takes precedence.
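The precedence can be seen in how the provider config picks a credential (a simplified standalone sketch of the selection step in `_build_provider_config` later in this diff; the function name here is illustrative, not part of the public API):

```python
def pick_credential(api_key=None, bearer_token=None):
    # bearer_token wins when both are supplied, per the note above
    if bearer_token:
        return {"bearer_token": bearer_token}
    if api_key:
        return {"api_key": api_key}
    return {}
```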
+
+### 4. BYOK — Token Provider (e.g. Azure AD)
+
+For environments that use dynamic token authentication (such as Azure AD managed identity), pass a callable that returns a fresh token:
+
+```python
+from azure.identity import DefaultAzureCredential
+
+credential = DefaultAzureCredential()
+
+def get_token():
+ return credential.get_token("https://cognitiveservices.azure.com/.default").token
+
+bot = CopilotBot(
+ model="my-gpt4-deployment",
+ folder_to_mount="/path/to/project",
+ base_url="https://your-resource.openai.azure.com",
+ provider_type="azure",
+ azure_api_version="2024-10-21",
+ token_provider=get_token,
+)
+```
+
+The `token_provider` must be a callable that returns a non-empty string. It is called once at initialization time.
+
+### 5. BYOK — Environment Variables
+
+Configure BYOK entirely through environment variables without changing any code:
+
+```bash
+export COPILOT_BYOK_BASE_URL="https://api.openai.com/v1"
+export COPILOT_BYOK_API_KEY="sk-your-key"
+export COPILOT_BYOK_PROVIDER_TYPE="openai" # optional, defaults to "openai"
+export COPILOT_BYOK_MODEL="gpt-4.1" # optional, overrides the model param
+export COPILOT_BYOK_WIRE_API="completions" # optional
+export COPILOT_BYOK_AZURE_API_VERSION="2024-10-21" # optional, for Azure only
+```
+
+Then create the bot without any auth parameters:
+
+```python
+bot = CopilotBot(
+ folder_to_mount="/path/to/project",
+)
+```
+
+You can also use `COPILOT_BYOK_BEARER_TOKEN` instead of `COPILOT_BYOK_API_KEY` for bearer-token authentication.
+
+## Authentication Priority
+
+When multiple auth methods are configured simultaneously, CopilotBot resolves them in this order:
+
+| Priority | Method | Condition |
+|----------|--------|-----------|
+| 1 | Explicit API key / bearer token | `api_key` or `bearer_token` parameter is set |
+| 2 | Environment variables | `COPILOT_BYOK_BASE_URL` + `COPILOT_BYOK_API_KEY` or `COPILOT_BYOK_BEARER_TOKEN` |
+| 3 | Token provider | `token_provider` parameter is set |
+| 4 | Native GitHub Copilot | `github_token` or `GITHUB_TOKEN` / `COPILOT_GITHUB_TOKEN` / `GH_TOKEN` env vars |
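The table above can be condensed into a tiny standalone resolver (a simplified illustration only — the return values are descriptive labels, and the real `resolve_auth_config` additionally validates `base_url` and builds a full provider dict):

```python
import os

def sketch_resolve(api_key=None, bearer_token=None,
                   token_provider=None, github_token=None):
    # 1. Explicit BYOK credentials win
    if api_key or bearer_token:
        return "byok-explicit"
    # 2. Then the COPILOT_BYOK_* environment variables
    if os.environ.get("COPILOT_BYOK_BASE_URL") and (
        os.environ.get("COPILOT_BYOK_API_KEY")
        or os.environ.get("COPILOT_BYOK_BEARER_TOKEN")
    ):
        return "byok-env"
    # 3. Then a dynamic token provider (e.g. Azure AD)
    if token_provider is not None:
        return "byok-token-provider"
    # 4. Finally, native GitHub Copilot auth
    return "native-copilot"
```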
+
+## Parameters
+
+### Constructor
+
+| Parameter | Type | Default | Description |
+|-----------|------|---------|-------------|
+| `model` | `str` | `"gpt-4.1"` | Model name (e.g. `"gpt-4.1"`, `"claude-sonnet-4.5"`). No provider prefix needed. |
+| `folder_to_mount` | `str` | `None` | Absolute path to the folder to mount into the sandbox. |
+| `permission` | `PermissionLabels` | `READ_WRITE` | Mount permission — `READ_ONLY` or `READ_WRITE`. |
+| `environment` | `LocalDockerEnvironment` | `None` | Pre-created Docker environment. Auto-created if not provided. |
+| `additional_tools` | `list[ToolAbstract]` | `[]` | Extra tools to install in the sandbox. |
+| `github_token` | `str` | `None` | GitHub token for native Copilot auth. |
+| `api_key` | `str` | `None` | API key for BYOK. |
+| `bearer_token` | `str` | `None` | Bearer token for BYOK. |
+| `base_url` | `str` | `None` | API endpoint URL for BYOK. |
+| `provider_type` | `str` | `"openai"` | BYOK provider: `"openai"`, `"azure"`, or `"anthropic"`. |
+| `wire_api` | `str` | `None` | API format: `"completions"` or `"responses"`. |
+| `azure_api_version` | `str` | `None` | Azure API version (for `provider_type="azure"` only). |
+| `token_provider` | `Callable[[], str]` | `None` | Callable returning a bearer token string. |
+
+### `run()` method
+
+| Parameter | Type | Default | Description |
+|-----------|------|---------|-------------|
+| `task` | `str` | *(required)* | Natural-language description of the task. |
+| `additional_mounts` | `list[Mount]` | `None` | Extra folders to copy into the container. |
+| `timeout_in_seconds` | `int` | `600` | Maximum wall-clock time for the agent run. |
+| `streaming` | `bool` | `False` | Enable streaming delta events (logged at DEBUG level). |
+
+### Return value — `BotRunResult`
+
+| Field | Type | Description |
+|-------|------|-------------|
+| `status` | `bool` | `True` if the agent completed successfully. |
+| `result` | `str` or `None` | The agent's final response text. |
+| `error` | `str` or `None` | Error description if `status` is `False`. |
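A typical pattern is to branch on `status` before reading `result`. Sketched here with a stand-in dataclass mirroring the fields above (the real `BotRunResult` is imported from `microbots.MicroBot`; `summarize` is a hypothetical helper):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class BotRunResult:
    # Stand-in mirroring the documented fields
    status: bool
    result: Optional[str]
    error: Optional[str]

def summarize(run: BotRunResult) -> str:
    # Read `result` only on success; otherwise surface the error
    if run.status:
        return f"OK: {run.result}"
    return f"FAILED: {run.error}"
```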
+
+## Examples
+
+### Read-only code analysis
+
+```python
+from microbots.bot.CopilotBot import CopilotBot
+from microbots.constants import PermissionLabels
+
+bot = CopilotBot(
+ model="gpt-4.1",
+ folder_to_mount="/path/to/project",
+ permission=PermissionLabels.READ_ONLY,
+ github_token="ghp_your_token",
+)
+
+result = bot.run("Analyze the codebase and list all public API endpoints.")
+print(result.result)
+bot.stop()
+```
+
+### Fix a bug with BYOK (OpenAI)
+
+```python
+from microbots.bot.CopilotBot import CopilotBot
+
+bot = CopilotBot(
+ model="gpt-4.1",
+ folder_to_mount="/path/to/project",
+ api_key="sk-your-openai-key",
+ base_url="https://api.openai.com/v1",
+)
+
+result = bot.run(
+ "The login form crashes when email contains a '+'. Fix the validation logic.",
+ timeout_in_seconds=300,
+)
+print(result.result)
+bot.stop()
+```
+
+### Using additional tools
+
+```python
+from microbots.bot.CopilotBot import CopilotBot
+from microbots.tools.internal_tool import InternalTool
+
+my_tool = InternalTool(tool_definition_path="path/to/tool.yaml")
+
+bot = CopilotBot(
+ model="gpt-4.1",
+ folder_to_mount="/path/to/project",
+ github_token="ghp_your_token",
+ additional_tools=[my_tool],
+)
+
+result = bot.run("Use the custom tool to lint and then fix all issues.")
+bot.stop()
+```
+
+!!! warning
+ `ExternalTool` is **not supported** with CopilotBot. Only tools that run inside the Docker container (internal tools) can be used.
+
+### Mounting additional folders at runtime
+
+```python
+from microbots.bot.CopilotBot import CopilotBot
+from microbots.extras.mount import Mount
+from microbots.constants import PermissionLabels
+
+bot = CopilotBot(
+    model="gpt-4.1",
+    folder_to_mount="/path/to/project",
+    github_token="ghp_your_token",
+)
+
+extra = Mount("/path/to/test-data", "/workdir/test-data", PermissionLabels.READ_ONLY)
+result = bot.run(
+ "Run the integration tests using the data in /workdir/test-data.",
+ additional_mounts=[extra],
+)
+bot.stop()
+```
+
+## Cleanup
+
+Always call `bot.stop()` when you are done. This tears down the SDK client, the CLI server, and the Docker container:
+
+```python
+bot.stop()
+```
+
+`stop()` is idempotent — calling it multiple times is safe. It is also called automatically when the object is garbage-collected, but explicit cleanup is recommended.
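Because `stop()` is idempotent, a `try`/`finally` wrapper (or a small context manager like the hypothetical one below) guarantees cleanup even when `run()` raises. The `_FakeBot` is a stand-in for illustration; in real use you would pass a `CopilotBot` instance:

```python
from contextlib import contextmanager

@contextmanager
def managed(bot):
    # Always call stop(), even if the body raises
    try:
        yield bot
    finally:
        bot.stop()

class _FakeBot:
    # Hypothetical stand-in used only to demonstrate the pattern
    def __init__(self):
        self.stopped = False
    def stop(self):
        self.stopped = True

fake = _FakeBot()
with managed(fake):
    pass  # bot.run(...) would go here
```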
diff --git a/mkdocs.yml b/mkdocs.yml
index d7e50a8..9fa483b 100644
--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -41,6 +41,9 @@ markdown_extensions:
nav:
- Getting Started:
- Home: index.md
+ - Guides:
+ - CopilotBot: copilot-bot.md
+ - Authentication: authentication.md
- Blogs:
- blog/index.md
- "Microbots : Safety First Agentic Workflow": blog/microbots-safety-first-ai-agent.md
diff --git a/pyproject.toml b/pyproject.toml
index becf06a..fd5f0ef 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -21,6 +21,10 @@ classifiers = [
]
requires-python = ">=3.11"
+[project.optional-dependencies]
+ghcp = ["github-copilot-sdk"]
+azure_ad = ["azure-identity>=1.15.0"]
+
[tool.setuptools.dynamic]
dependencies = { file = ["requirements.txt"] }
@@ -30,9 +34,6 @@ dependencies = { file = ["requirements.txt"] }
[tool.setuptools.packages.find]
where = ["src"]
-[project.optional-dependencies]
-azure_ad = ["azure-identity>=1.15.0"]
-
[project.urls]
"Source Repo" = "https://github.com/microsoft/minions"
Issues = "https://github.com/microsoft/minions/issues"
\ No newline at end of file
diff --git a/src/microbots/__init__.py b/src/microbots/__init__.py
index cc344c0..7415ffd 100644
--- a/src/microbots/__init__.py
+++ b/src/microbots/__init__.py
@@ -1,4 +1,4 @@
-from microbots.bot import ReadingBot, WritingBot, LogAnalysisBot, BrowsingBot, AgentBoss
+from microbots.bot import ReadingBot, WritingBot, LogAnalysisBot, BrowsingBot, AgentBoss, CopilotBot
from microbots.MicroBot import MicroBot, BotRunResult
__all__ = [
"ReadingBot",
@@ -6,6 +6,7 @@
"LogAnalysisBot",
"BrowsingBot",
"AgentBoss",
+ "CopilotBot",
"MicroBot",
"BotRunResult"
]
diff --git a/src/microbots/bot/CopilotBot.py b/src/microbots/bot/CopilotBot.py
new file mode 100644
index 0000000..bb8e656
--- /dev/null
+++ b/src/microbots/bot/CopilotBot.py
@@ -0,0 +1,716 @@
+"""
+CopilotBot — A wrapper around the GitHub Copilot SDK.
+
+Unlike MicroBot (which manages its own LLM ↔ shell agent loop), CopilotBot
+delegates the **entire agent loop to the Copilot runtime**. Copilot handles
+planning, tool invocation (file edits, shell commands, web requests, etc.)
+and multi-turn reasoning autonomously.
+
+CopilotBot reuses the Microbots infrastructure:
+ - Docker sandbox (LocalDockerEnvironment) for isolated execution
+ - Mount system for folder access control
+ - ToolAbstract lifecycle (install → verify → setup) for additional tools
+ - copilot-cli is installed **inside** the container and run in headless
+ server mode; the SDK on the host connects to it via TCP.
+
+Architecture:
+
+ Host Docker Container
+ ───── ────────────────
+ CopilotBot copilot-cli --headless --port
+ │ │
+ ├── Copilot SDK ──TCP──────────┘
+ │ (ExternalServerConfig)
+ │
+ ├── additional tools
+ │ (define_tool → SDK session)
+ │
+ └── BotRunResult
+
+Prerequisites:
+ - pip install microbots[ghcp] (github-copilot-sdk)
+ - Docker daemon running
+ - GitHub authentication (GITHUB_TOKEN / COPILOT_GITHUB_TOKEN or copilot login)
+"""
+
+import asyncio
+import os
+import time
+import threading
+from collections.abc import Callable
+from logging import getLogger
+from typing import Optional
+
+from microbots.constants import (
+ DOCKER_WORKING_DIR,
+ PermissionLabels,
+)
+from microbots.environment.local_docker.LocalDockerEnvironment import (
+ LocalDockerEnvironment,
+)
+from microbots.extras.mount import Mount, MountType
+from microbots.MicroBot import BotRunResult
+from microbots.tools.external_tool import ExternalTool
+from microbots.tools.tool import ToolAbstract
+from microbots.utils.copilot_auth import get_copilot_token
+from microbots.utils.network import get_free_port # still used for _create_environment
+
+logger = getLogger(" CopilotBot ")
+
+# Default model when none is specified (just the deployment name, no provider prefix)
+_DEFAULT_MODEL = "gpt-4.1"
+
+# Time (seconds) to wait for copilot-cli to start inside the container
+_CLI_STARTUP_TIMEOUT = 60
+
+# copilot-cli port inside the container
+_CONTAINER_CLI_PORT = 4321
+
+# Environment variable names for BYOK configuration
+_BYOK_ENV_PROVIDER_TYPE = "COPILOT_BYOK_PROVIDER_TYPE"
+_BYOK_ENV_BASE_URL = "COPILOT_BYOK_BASE_URL"
+_BYOK_ENV_API_KEY = "COPILOT_BYOK_API_KEY"
+_BYOK_ENV_BEARER_TOKEN = "COPILOT_BYOK_BEARER_TOKEN"
+_BYOK_ENV_WIRE_API = "COPILOT_BYOK_WIRE_API"
+_BYOK_ENV_AZURE_API_VERSION = "COPILOT_BYOK_AZURE_API_VERSION"
+_BYOK_ENV_MODEL = "COPILOT_BYOK_MODEL"
+
+
+def resolve_auth_config(
+ model: str = _DEFAULT_MODEL,
+ github_token: Optional[str] = None,
+ api_key: Optional[str] = None,
+ bearer_token: Optional[str] = None,
+ base_url: Optional[str] = None,
+ provider_type: Optional[str] = None,
+ wire_api: Optional[str] = None,
+ azure_api_version: Optional[str] = None,
+ token_provider: Optional[Callable[[], str]] = None,
+) -> tuple[str, Optional[str], Optional[dict]]:
+ """Resolve authentication and provider configuration for CopilotBot.
+
+ Determines whether to use BYOK (Bring Your Own Key) or native GitHub
+ Copilot authentication, and builds the appropriate provider config.
+
+ Priority order:
+ 1. Explicit ``api_key`` or ``bearer_token`` with ``base_url`` → BYOK
+ 2. Environment variables (``COPILOT_BYOK_*``) → BYOK
+ 3. ``token_provider`` (e.g. Azure AD token provider) → BYOK with bearer token
+ 4. GitHub token → native Copilot authentication
+
+ Parameters
+ ----------
+ model : str
+ Model name (e.g. ``"gpt-4.1"``, ``"claude-sonnet-4.5"``).
+ github_token : Optional[str]
+ GitHub token for native Copilot auth.
+ api_key : Optional[str]
+ API key for BYOK provider.
+ bearer_token : Optional[str]
+ Bearer token for BYOK (takes precedence over ``api_key``).
+ base_url : Optional[str]
+ API endpoint URL for BYOK provider.
+ provider_type : Optional[str]
+ Provider type: ``"openai"``, ``"azure"``, or ``"anthropic"``.
+ wire_api : Optional[str]
+ API format: ``"completions"`` or ``"responses"``.
+ azure_api_version : Optional[str]
+ Azure API version (only for ``type: "azure"``).
+ token_provider : Optional[Callable[[], str]]
+ Callable that returns a bearer token string (e.g. Azure AD
+ token provider). The token is fetched once at config resolution
+ time. For long-running sessions, create a new session with a
+ refreshed token.
+
+ Returns
+ -------
+ tuple[str, Optional[str], Optional[dict]]
+ ``(model, github_token, provider_config)`` where
+ ``provider_config`` is *None* for native Copilot auth or a dict
+ suitable for the ``provider`` kwarg of ``create_session``.
+
+ Raises
+ ------
+ ValueError
+ If BYOK is requested but ``base_url`` is missing, or if
+ ``token_provider`` is not a valid callable.
+ """
+
+ # ── 1. Explicit api_key / bearer_token ───────────────────────────
+ if api_key or bearer_token:
+ if not base_url:
+ raise ValueError(
+ "BYOK requires a base_url when api_key or bearer_token is provided."
+ )
+ provider = _build_provider_config(
+ provider_type=provider_type or "openai",
+ base_url=base_url,
+ api_key=api_key,
+ bearer_token=bearer_token,
+ wire_api=wire_api,
+ azure_api_version=azure_api_version,
+ )
+ logger.info("🔑 BYOK auth resolved via explicit credentials (type=%s)", provider["type"])
+ return model, None, provider
+
+ # ── 2. Environment variables ─────────────────────────────────────
+ env_base_url = os.environ.get(_BYOK_ENV_BASE_URL)
+ env_api_key = os.environ.get(_BYOK_ENV_API_KEY)
+ env_bearer_token = os.environ.get(_BYOK_ENV_BEARER_TOKEN)
+
+ if env_base_url and (env_api_key or env_bearer_token):
+ env_model = os.environ.get(_BYOK_ENV_MODEL, model)
+ provider = _build_provider_config(
+ provider_type=os.environ.get(_BYOK_ENV_PROVIDER_TYPE, "openai"),
+ base_url=env_base_url,
+ api_key=env_api_key,
+ bearer_token=env_bearer_token,
+ wire_api=os.environ.get(_BYOK_ENV_WIRE_API),
+ azure_api_version=os.environ.get(_BYOK_ENV_AZURE_API_VERSION),
+ )
+ logger.info("🔑 BYOK auth resolved via environment variables (type=%s)", provider["type"])
+ return env_model, None, provider
+
+ # ── 3. Token provider (e.g. Azure AD) ────────────────────────────
+ if token_provider:
+ if not callable(token_provider):
+ raise ValueError("token_provider must be a callable that returns a string token.")
+ resolved_url = base_url or env_base_url
+ if not resolved_url:
+ raise ValueError(
+ "BYOK with token_provider requires a base_url (pass it directly "
+ "or set COPILOT_BYOK_BASE_URL)."
+ )
+ try:
+ token = token_provider()
+ except Exception as e:
+ raise ValueError(f"token_provider failed during validation: {e}") from e
+ if not isinstance(token, str) or not token:
+ raise ValueError("token_provider must return a non-empty string token.")
+
+ provider = _build_provider_config(
+ provider_type=provider_type or os.environ.get(_BYOK_ENV_PROVIDER_TYPE, "openai"),
+ base_url=resolved_url,
+ bearer_token=token,
+ wire_api=wire_api or os.environ.get(_BYOK_ENV_WIRE_API),
+ azure_api_version=azure_api_version or os.environ.get(_BYOK_ENV_AZURE_API_VERSION),
+ )
+ logger.info("🔑 BYOK auth resolved via token_provider (type=%s)", provider["type"])
+ return model, None, provider
+
+ # ── 4. Native GitHub Copilot auth ────────────────────────────────
+ resolved_github_token = (
+ github_token
+ or os.environ.get("COPILOT_GITHUB_TOKEN")
+ or os.environ.get("GITHUB_TOKEN")
+ or os.environ.get("GH_TOKEN")
+ or get_copilot_token()
+ )
+ logger.info("🔑 Using native GitHub Copilot authentication")
+ return model, resolved_github_token, None
+
+
+def _build_provider_config(
+ provider_type: str,
+ base_url: str,
+ api_key: Optional[str] = None,
+ bearer_token: Optional[str] = None,
+ wire_api: Optional[str] = None,
+ azure_api_version: Optional[str] = None,
+) -> dict:
+ """Build the ``provider`` dict accepted by ``create_session``."""
+ config: dict = {
+ "type": provider_type,
+ "base_url": base_url,
+ }
+ # bearer_token takes precedence over api_key per SDK docs
+ if bearer_token:
+ config["bearer_token"] = bearer_token
+ elif api_key:
+ config["api_key"] = api_key
+
+ if wire_api:
+ config["wire_api"] = wire_api
+
+ if provider_type == "azure" and azure_api_version:
+ config["azure"] = {"api_version": azure_api_version}
+
+ return config
+
+
+class CopilotBot:
+ """Wrapper around the GitHub Copilot SDK with a sandboxed Docker environment.
+
+ The Copilot runtime manages the agent loop (planning, tool calls,
+ multi-turn reasoning). CopilotBot sets up the sandbox, installs
+ copilot-cli inside it, connects the SDK, and exposes a simple
+ ``run(task)`` interface.
+
+ Parameters
+ ----------
+ model : str
+ Copilot model name (e.g. ``"gpt-4.1"``, ``"claude-sonnet-4.5"``).
+ Unlike MicroBot, no ``/`` prefix is needed.
+ folder_to_mount : str
+ Absolute host path to mount into the sandbox.
+ permission : PermissionLabels
+ Mount permission — READ_ONLY or READ_WRITE. Defaults to READ_WRITE.
+ environment : Optional[LocalDockerEnvironment]
+ Pre-created environment. One is created automatically when *None*.
+ additional_tools : Optional[list[ToolAbstract]]
+ Extra Microbots tools to install in the sandbox. Their
+ ``usage_instructions_to_llm`` are appended to the system message
+ and, where possible, they are registered as SDK custom tools.
+ github_token : Optional[str]
+ Explicit GitHub token. Falls back to ``GITHUB_TOKEN`` /
+ ``COPILOT_GITHUB_TOKEN`` env vars. Used only when BYOK is not
+ configured.
+ api_key : Optional[str]
+ API key for BYOK provider. When provided with ``base_url``,
+ bypasses GitHub Copilot auth and uses the key directly.
+ bearer_token : Optional[str]
+ Bearer token for BYOK provider. Takes precedence over ``api_key``.
+ base_url : Optional[str]
+ API endpoint URL for BYOK (e.g.
+ ``"https://api.openai.com/v1"``).
+ provider_type : Optional[str]
+ BYOK provider type: ``"openai"``, ``"azure"``, or
+ ``"anthropic"``. Defaults to ``"openai"``.
+ wire_api : Optional[str]
+ API format: ``"completions"`` (default) or ``"responses"``
+ (for GPT-5 series).
+ azure_api_version : Optional[str]
+ Azure API version string (only for ``provider_type="azure"``).
+ token_provider : Optional[Callable[[], str]]
+ A callable returning a bearer token (e.g. Azure AD token
+ provider). Requires ``base_url``.
+ """
+
+ def __init__(
+ self,
+ model: str = _DEFAULT_MODEL,
+ folder_to_mount: Optional[str] = None,
+ permission: PermissionLabels = PermissionLabels.READ_WRITE,
+ environment: Optional[LocalDockerEnvironment] = None,
+ additional_tools: Optional[list[ToolAbstract]] = None,
+ github_token: Optional[str] = None,
+ api_key: Optional[str] = None,
+ bearer_token: Optional[str] = None,
+ base_url: Optional[str] = None,
+ provider_type: Optional[str] = None,
+ wire_api: Optional[str] = None,
+ azure_api_version: Optional[str] = None,
+ token_provider: Optional[Callable[[], str]] = None,
+ ):
+ try:
+ from copilot import CopilotClient, ExternalServerConfig
+ from copilot.types import PermissionHandler
+ except ImportError:
+ raise ImportError(
+ "CopilotBot requires the github-copilot-sdk package. "
+ "Install with: pip install microbots[ghcp]"
+ )
+
+ self.additional_tools = additional_tools or []
+
+ # ── Resolve auth: BYOK vs native GitHub Copilot ─────────────
+ self.model, self.github_token, self._provider_config = resolve_auth_config(
+ model=model,
+ github_token=github_token,
+ api_key=api_key,
+ bearer_token=bearer_token,
+ base_url=base_url,
+ provider_type=provider_type,
+ wire_api=wire_api,
+ azure_api_version=azure_api_version,
+ token_provider=token_provider,
+ )
+
+ # ── Mount setup ─────────────────────────────────────────────
+ self.folder_to_mount: Optional[Mount] = None
+ if folder_to_mount:
+ sandbox_path = f"/{DOCKER_WORKING_DIR}/{os.path.basename(folder_to_mount)}"
+ self.folder_to_mount = Mount(folder_to_mount, sandbox_path, permission)
+
+ # ── Docker environment ──────────────────────────────────────
+ self.environment = environment
+ if not self.environment:
+ self._create_environment()
+
+        # ── Validate tools (ExternalTool is not supported) and ──────
+        # ── install additional tools inside the container ───────────
+ for tool in self.additional_tools:
+ if isinstance(tool, ExternalTool):
+ raise ValueError(
+ f"CopilotBot does not support ExternalTool '{tool.name}'. "
+ f"copilot-cli runs inside the Docker container, so only "
+ f"internal (container-side) tools are allowed."
+ )
+
+ logger.info("🔧 Installing additional tool '%s'...", tool.name)
+ tool.install_tool(self.environment)
+ tool.verify_tool_installation(self.environment)
+ logger.info("✅ Tool '%s' installed and verified", tool.name)
+
+ # ── Install & start copilot-cli inside the container ────────
+ self._install_copilot_cli()
+ self._start_copilot_cli_server()
+
+ # ── Background event loop for async SDK calls ───────────────
+ self._loop = asyncio.new_event_loop()
+ self._thread = threading.Thread(target=self._loop.run_forever, daemon=True)
+ self._thread.start()
+
+ # ── Connect SDK to in-container CLI ─────────────────────────
+ container_ip = self.environment.get_ipv4_address()
+ self._client = CopilotClient(
+ ExternalServerConfig(url=f"{container_ip}:{_CONTAINER_CLI_PORT}")
+ )
+ self._run_async(self._client.start())
+ self._PermissionHandler = PermissionHandler
+
+ logger.info(
+ "✅ CopilotBot initialised — model=%s, cli=%s:%d",
+ self.model,
+ container_ip,
+ _CONTAINER_CLI_PORT,
+ )
+
+ # ──────────────────────────────────────────────────────────────────
+ # Public API
+ # ──────────────────────────────────────────────────────────────────
+
+ def run(
+ self,
+ task: str,
+ additional_mounts: Optional[list[Mount]] = None,
+ timeout_in_seconds: int = 600,
+ streaming: bool = False,
+ ) -> BotRunResult:
+ """Send *task* to the Copilot agent and wait for completion.
+
+ The Copilot runtime manages the full agent loop autonomously —
+ planning, tool invocation, multi-turn reasoning, and file edits
+ all happen inside the sandboxed environment.
+
+ Parameters
+ ----------
+ task : str
+ A natural-language description of the task.
+ additional_mounts : Optional[list[Mount]]
+ Extra folders to copy into the container before running.
+ timeout_in_seconds : int
+ Maximum wall-clock time for the agent run.
+ streaming : bool
+ Whether to enable streaming delta events (logged at DEBUG level).
+
+ Returns
+ -------
+ BotRunResult
+ status=True on success with the agent's final message in *result*,
+ or status=False with an error description.
+ """
+ logger.info("🚀 Starting CopilotBot run — task: %.120s...", task)
+
+ # Setup additional tools (env vars, files, setup_commands)
+ for tool in self.additional_tools:
+ logger.info("⚙️ Setting up tool '%s'", tool.name)
+ tool.setup_tool(self.environment)
+
+ # Mount additional folders
+ for mount in additional_mounts or []:
+ self._mount_additional(mount)
+
+ # Build system message with tool instructions
+ system_content = self._build_system_message()
+
+ try:
+ result_text = self._run_async(
+ self._execute_session(
+ task=task,
+ system_content=system_content,
+ timeout=timeout_in_seconds,
+ streaming=streaming,
+ )
+ )
+ logger.info("✅ CopilotBot run completed successfully")
+ return BotRunResult(status=True, result=result_text, error=None)
+ except Exception as e:
+ logger.exception("❌ CopilotBot run failed: %s", e)
+ return BotRunResult(status=False, result=None, error=str(e))
+
+ def stop(self):
+ """Tear down the SDK client, CLI server, and Docker environment."""
+ if getattr(self, "_stopped", False):
+ return
+ self._stopped = True
+
+ # Stop the SDK client (best-effort, with timeout to avoid deadlock)
+ try:
+ if self._loop.is_running():
+ future = asyncio.run_coroutine_threadsafe(
+ self._client.stop(), self._loop
+ )
+ future.result(timeout=10)
+ except Exception:
+ pass
+
+ # Shut down the background event loop
+ try:
+ self._loop.call_soon_threadsafe(self._loop.stop)
+ self._thread.join(timeout=5)
+ except Exception:
+ pass
+
+ if self.environment:
+ self.environment.stop()
+ self.environment = None
+ logger.info("🛑 CopilotBot stopped")
+
+ def __del__(self):
+ try:
+ self.stop()
+ except Exception:
+ pass
+
+ # ──────────────────────────────────────────────────────────────────
+ # Private — environment & CLI setup
+ # ──────────────────────────────────────────────────────────────────
+
+ def _create_environment(self):
+ free_port = get_free_port()
+ self.environment = LocalDockerEnvironment(
+ port=free_port,
+ folder_to_mount=self.folder_to_mount,
+ )
+
+ def _install_copilot_cli(self):
+ """Install copilot-cli inside the Docker container."""
+ logger.info("📦 Installing copilot-cli inside container...")
+
+ # Install Node.js (required for copilot-cli via npm)
+ install_commands = [
+ # Remove stale third-party repos that may have expired GPG keys
+ "rm -f /etc/apt/sources.list.d/yarn.list",
+ # Install Node.js 22.x (copilot-cli requires Node 22+)
+ "apt-get update -qq && apt-get install -y -qq curl ca-certificates > /dev/null 2>&1",
+ "curl -fsSL https://deb.nodesource.com/setup_22.x | bash - > /dev/null 2>&1",
+ "apt-get install -y -qq nodejs > /dev/null 2>&1",
+ # Install copilot-cli globally
+ "npm install -g @github/copilot > /dev/null 2>&1",
+ ]
+
+ for cmd in install_commands:
+ result = self.environment.execute(cmd, timeout=300)
+ if result.return_code != 0:
+ raise RuntimeError(
+ f"Failed to install copilot-cli: {cmd}\n"
+ f"stdout: {result.stdout}\nstderr: {result.stderr}"
+ )
+
+ # Verify installation
+ verify = self.environment.execute("copilot --version")
+ if verify.return_code != 0:
+ raise RuntimeError(
+ f"copilot-cli installation verification failed: {verify.stderr}"
+ )
+ logger.info("✅ copilot-cli installed: %s", verify.stdout.strip())
+
+ def _start_copilot_cli_server(self):
+ """Start copilot-cli in headless server mode inside the container.
+
+ The CLI listens on ``_CONTAINER_CLI_PORT`` inside the container.
+ The host connects directly to the container's bridge-network IP.
+ Authentication is handled via the GITHUB_TOKEN environment variable
+ injected into the container.
+ """
+ # Inject the GitHub token into the container for native Copilot auth.
+ # When BYOK is active, authentication is handled via the provider
+ # config passed to create_session — no container-side token needed.
+ if self.github_token and not self._provider_config:
+ self.environment.execute(
+ f'export GITHUB_TOKEN="{self.github_token}"', sensitive=True
+ )
+ self.environment.execute(
+ f'export COPILOT_GITHUB_TOKEN="{self.github_token}"', sensitive=True
+ )
+
+ # Start copilot in headless mode in the background
+ # Using nohup + & to run it as a background process inside the container's shell
+ start_cmd = (
+ f"nohup copilot --headless --port {_CONTAINER_CLI_PORT} "
+ f"> /var/log/copilot-cli.log 2>&1 &"
+ )
+ result = self.environment.execute(start_cmd)
+ if result.return_code != 0:
+ raise RuntimeError(
+ f"Failed to start copilot-cli server: {result.stderr}"
+ )
+
+ # Wait for the server to be ready
+ self._wait_for_cli_ready()
+ logger.info(
+ "✅ copilot-cli headless server running on container port %d",
+ _CONTAINER_CLI_PORT,
+ )
+
+ def _wait_for_cli_ready(self):
+ """Poll until the copilot-cli server is accepting connections."""
+ import socket as _socket
+
+ container_ip = self.environment.get_ipv4_address()
+ deadline = time.time() + _CLI_STARTUP_TIMEOUT
+ while time.time() < deadline:
+ try:
+ sock = _socket.create_connection(
+ (container_ip, _CONTAINER_CLI_PORT), timeout=2
+ )
+ sock.close()
+ return
+ except (ConnectionRefusedError, OSError):
+ time.sleep(1)
+ raise TimeoutError(
+ f"copilot-cli did not become ready within {_CLI_STARTUP_TIMEOUT}s "
+ f"on {container_ip}:{_CONTAINER_CLI_PORT}"
+ )
+
+ # ──────────────────────────────────────────────────────────────────
+ # Private — SDK session & execution
+ # ──────────────────────────────────────────────────────────────────
+
+ def _run_async(self, coro):
+ """Submit an async coroutine to the background loop and block."""
+ future = asyncio.run_coroutine_threadsafe(coro, self._loop)
+ return future.result()
+
+ async def _execute_session(
+ self,
+ task: str,
+ system_content: str,
+ timeout: int,
+ streaming: bool,
+ ) -> str:
+ """Create a Copilot session, send the task, and collect the result."""
+ from copilot.generated.session_events import SessionEventType
+
+ session_kwargs = {
+ "model": self.model,
+ "on_permission_request": self._PermissionHandler.approve_all,
+ "streaming": streaming,
+ "hooks": {
+ "on_pre_tool_use": self._on_pre_tool_use,
+ "on_post_tool_use": self._on_post_tool_use,
+ },
+ }
+
+ if self._provider_config:
+ session_kwargs["provider"] = self._provider_config
+
+ if system_content:
+ session_kwargs["system_message"] = {"content": system_content}
+
+ logger.info("📡 Creating Copilot session (model=%s, streaming=%s, byok=%s)", self.model, streaming, self._provider_config is not None)
+ logger.debug("Session kwargs: %s", session_kwargs)
+ session = await self._client.create_session(**session_kwargs)
+
+ collected_text = []
+ done_event = asyncio.Event()
+
+ def _on_event(event):
+ if event.type == SessionEventType.ASSISTANT_MESSAGE:
+ if event.data and event.data.content:
+ collected_text.append(event.data.content)
+ logger.info("💬 Assistant message received (%d chars)", len(event.data.content))
+ elif event.type == SessionEventType.ASSISTANT_MESSAGE_DELTA:
+ if event.data and event.data.delta_content:
+ logger.debug("📝 %s", event.data.delta_content)
+ elif event.type == SessionEventType.SESSION_IDLE:
+ logger.info("⏹️ Session idle — agent finished processing")
+ done_event.set()
+ else:
+ logger.debug("📨 Session event: %s", event.type)
+
+ session.on(_on_event)
+
+ # Send the task prompt and wait for completion
+ logger.info("📤 Sending task to Copilot agent...")
+ logger.debug("Task content: %s", task)
+ response = await session.send_and_wait(task, timeout=float(timeout))
+
+ # If send_and_wait returned a full response, use it
+ if response and response.data and response.data.content:
+ logger.info("✅ Received response from send_and_wait with %d chars", len(response.data.content))
+ logger.debug("Response content: %s", response.data.content)
+ await session.disconnect()
+ return response.data.content
+
+ # Otherwise wait for the collected events
+ if not collected_text:
+ try:
+ await asyncio.wait_for(done_event.wait(), timeout=float(timeout))
+ except asyncio.TimeoutError:
+ logger.warning("⏱️ Timed out waiting for session idle after %ds", timeout)
+
+ await session.disconnect()
+
+ if collected_text:
+ return collected_text[-1] # Return the last assistant message
+
+ return "Agent completed without producing a final message."
+
+ def _build_system_message(self) -> str:
+ """Compose the system message from mount info and tool instructions."""
+ parts = []
+
+ if self.folder_to_mount:
+ parts.append(
+ f"The working directory is mounted at {self.folder_to_mount.sandbox_path}. "
+ f"You can access files using paths relative to or absolute from that directory."
+ )
+
+ for tool in self.additional_tools:
+ if tool.usage_instructions_to_llm:
+ parts.append(tool.usage_instructions_to_llm)
+
+ return "\n\n".join(parts)
+
+ # ──────────────────────────────────────────────────────────────────
+ # Private — SDK hooks for tool-use logging
+ # ──────────────────────────────────────────────────────────────────
+
+ async def _on_pre_tool_use(self, input_data, invocation):
+ """Hook called before each tool execution — log the call."""
+ tool_name = input_data.get("toolName", "unknown")
+ tool_args = input_data.get("toolArgs", {})
+ logger.info("➡️ Tool call: %s — args: %s", tool_name, tool_args)
+ return {"permissionDecision": "allow"}
+
+ async def _on_post_tool_use(self, input_data, invocation):
+ """Hook called after each tool execution — log the result."""
+ tool_name = input_data.get("toolName", "unknown")
+ result = input_data.get("toolResult", "")
+ result_str = str(result)
+ logger.debug("Tool '%s' full execution result: %s", tool_name, result_str)
+ # Truncate long results so the info-level log stays readable
+ if len(result_str) > 500:
+ result_str = result_str[:500] + "... (truncated)"
+ logger.info("⬅️ Tool result: %s — output: %s", tool_name, result_str)
+ return {}
+
+ # ──────────────────────────────────────────────────────────────────
+ # Private — mount helpers
+ # ──────────────────────────────────────────────────────────────────
+
+ def _mount_additional(self, mount: Mount):
+ """Copy an additional folder into the running container."""
+ if mount.mount_type != MountType.COPY:
+ raise ValueError(
+ "Only COPY mount type is supported for additional mounts"
+ )
+ if not self.environment.copy_to_container(
+ mount.host_path_info.abs_path, mount.sandbox_path
+ ):
+ raise ValueError(
+ f"Failed to copy additional mount: "
+ f"{mount.host_path_info.abs_path} -> {mount.sandbox_path}"
+ )
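The TCP readiness poll in `_wait_for_cli_ready` generalizes to any service started in the background of a container. A minimal standalone sketch (the helper name `wait_for_port` and its parameters are illustrative, not part of the Microbots API):

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 10.0, interval: float = 0.1) -> bool:
    """Poll until a TCP endpoint accepts connections; False if the deadline passes."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        try:
            # create_connection performs the full TCP handshake, so success
            # means the server's listen socket is actually up
            with socket.create_connection((host, port), timeout=2):
                return True
        except (ConnectionRefusedError, OSError):
            time.sleep(interval)
    return False
```

Note that `_wait_for_cli_ready` raises `TimeoutError` instead of returning `False`, so a CLI that never comes up fails bot initialization loudly rather than silently.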
diff --git a/src/microbots/bot/__init__.py b/src/microbots/bot/__init__.py
index fd6cd62..25e61b5 100644
--- a/src/microbots/bot/__init__.py
+++ b/src/microbots/bot/__init__.py
@@ -3,5 +3,6 @@
from .WritingBot import WritingBot
from .BrowsingBot import BrowsingBot
from .AgentBoss import AgentBoss
+from .CopilotBot import CopilotBot
-__all__ = ["LogAnalysisBot", "ReadingBot", "WritingBot", "BrowsingBot", "AgentBoss"]
+__all__ = ["LogAnalysisBot", "ReadingBot", "WritingBot", "BrowsingBot", "AgentBoss", "CopilotBot"]
diff --git a/src/microbots/environment/Environment.py b/src/microbots/environment/Environment.py
index f2fdabd..83df871 100644
--- a/src/microbots/environment/Environment.py
+++ b/src/microbots/environment/Environment.py
@@ -33,3 +33,24 @@ def copy_from_container(self, src_path: str, dest_path: str) -> bool:
f"{self.__class__.__name__} does not support copying files from container. "
f"This is an optional feature - only implement if needed for your use case."
)
+
+ def get_ipv4_address(self) -> str:
+ """Return the IPv4 address of the running environment.
+
+ This allows host-side code to connect directly to services
+ running inside the environment without port forwarding.
+
+ Returns
+ -------
+ str
+ The IPv4 address of the environment.
+
+ Raises
+ ------
+ NotImplementedError
+ If the environment does not support retrieving its IP address.
+ """
+ raise NotImplementedError(
+ f"{self.__class__.__name__} does not support retrieving its IP address. "
+ f"This is an optional feature - only implement if needed for your use case."
+ )
diff --git a/src/microbots/environment/local_docker/LocalDockerEnvironment.py b/src/microbots/environment/local_docker/LocalDockerEnvironment.py
index 874b80d..d91e98e 100644
--- a/src/microbots/environment/local_docker/LocalDockerEnvironment.py
+++ b/src/microbots/environment/local_docker/LocalDockerEnvironment.py
@@ -153,6 +153,18 @@ def _teardown_overlay_mount(self):
except Exception as e:
logger.error("❌ Failed to teardown overlay mount: %s", e)
+ def get_ipv4_address(self) -> str:
+ """Return the container's IPv4 address on the Docker bridge network."""
+ if not self.container:
+ raise RuntimeError("No active container to get IP address from")
+
+ self.container.reload()
+ networks = self.container.attrs["NetworkSettings"]["Networks"]
+ container_ip = next(iter(networks.values()))["IPAddress"]
+ if not container_ip:
+ raise RuntimeError("Could not determine container IP address")
+ return container_ip
+
def stop(self):
"""Stop and remove the container"""
if self.container:
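The attribute navigation in `get_ipv4_address` can be isolated into a pure function, which makes the single-network assumption explicit. The helper below is an illustrative sketch over the `NetworkSettings.Networks` shape that docker-py exposes via `container.attrs`; the function name is not part of Microbots:

```python
def first_network_ip(network_settings: dict) -> str:
    """Return the IPAddress of the first network a container is attached to.

    Assumes the container sits on exactly one network (the default bridge);
    with multiple networks, dict insertion order decides which address wins.
    """
    networks = network_settings["Networks"]
    ip = next(iter(networks.values()))["IPAddress"]
    if not ip:
        raise RuntimeError("Could not determine container IP address")
    return ip
```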
diff --git a/src/microbots/utils/copilot_auth.py b/src/microbots/utils/copilot_auth.py
new file mode 100644
index 0000000..4d3aeeb
--- /dev/null
+++ b/src/microbots/utils/copilot_auth.py
@@ -0,0 +1,40 @@
+"""Utility to read GitHub Copilot CLI credentials from ~/.copilot/config.json."""
+
+import json
+from logging import getLogger
+from pathlib import Path
+from typing import Optional
+
+logger = getLogger(__name__)
+
+COPILOT_CONFIG_PATH = Path.home() / ".copilot" / "config.json"
+
+
+def get_copilot_token(config_path: Path = COPILOT_CONFIG_PATH) -> Optional[str]:
+ """Extract the OAuth token from the Copilot CLI config file.
+
+ The Copilot CLI stores credentials in ``~/.copilot/config.json`` after
+ ``copilot auth login``. This function reads the first available token
+ from the ``copilot_tokens`` map.
+
+ Returns ``None`` if the file doesn't exist or contains no tokens.
+ """
+ if not config_path.is_file():
+ logger.debug("Copilot config not found at %s", config_path)
+ return None
+
+ try:
+ data = json.loads(config_path.read_text(encoding="utf-8"))
+ except (json.JSONDecodeError, OSError) as exc:
+ logger.warning("Failed to read Copilot config at %s: %s", config_path, exc)
+ return None
+
+ tokens = data.get("copilot_tokens", {})
+ if not tokens:
+ logger.debug("No copilot_tokens found in %s", config_path)
+ return None
+
+ # Return the first available token
+ token = next(iter(tokens.values()))
+ logger.debug("Resolved Copilot token from %s", config_path)
+ return token
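The lookup above can be exercised end to end against a throwaway config file. This sketch re-implements the same logic inline so it runs standalone; the sample host key and token value are made up:

```python
import json
from pathlib import Path
from tempfile import TemporaryDirectory
from typing import Optional


def read_first_copilot_token(config_path: Path) -> Optional[str]:
    # Same contract as get_copilot_token: first value of copilot_tokens, else None
    if not config_path.is_file():
        return None
    try:
        data = json.loads(config_path.read_text(encoding="utf-8"))
    except (json.JSONDecodeError, OSError):
        return None
    tokens = data.get("copilot_tokens", {})
    return next(iter(tokens.values())) if tokens else None


with TemporaryDirectory() as tmp:
    cfg = Path(tmp) / "config.json"
    cfg.write_text(json.dumps({"copilot_tokens": {"github.com": "tok_example"}}))
    print(read_first_copilot_token(cfg))                 # tok_example
    print(read_first_copilot_token(Path(tmp) / "nope"))  # None
```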
diff --git a/src/microbots/utils/multi_agent_log_parser.py b/src/microbots/utils/multi_agent_log_parser.py
index ebdf9fb..0183e61 100644
--- a/src/microbots/utils/multi_agent_log_parser.py
+++ b/src/microbots/utils/multi_agent_log_parser.py
@@ -3,19 +3,21 @@
Parse microbots info.log files into markdown trajectory files.
Usage:
- python multi_agent_log_parser.py <name>_info.log [output_dir]
+ python multi_agent_log_parser.py <name>_info.log [output_dir] [--single-file]
-Creates:
- <name>_trajectory/
+Creates either:
+ <name>_trajectory/
main_agent.md
sub_agent_1.md
sub_agent_2.md
...
+Or with --single-file:
+ <name>_trajectory.md
-The info.log file should be named as <name>_info.log.
-A directory <name>_trajectory will be created with all the markdown files.
+The log file name (minus _info.log or .log suffix) determines the output name.
"""
+import argparse
import re
import os
import sys
@@ -26,6 +28,18 @@
# ─────────────────────────── Data Classes ───────────────────────────
+@dataclass
+class SetupInfo:
+ """Captured setup information before the agent starts working."""
+ container_id: str = ""
+ image: str = ""
+ host_port: str = ""
+ working_dir: str = ""
+ volume_mappings: List[str] = field(default_factory=list)
+ tools_installed: List[str] = field(default_factory=list)
+ files_copied: List[str] = field(default_factory=list)
+
+
@dataclass
class Step:
"""Represents a single step in an agent's execution."""
@@ -39,7 +53,6 @@ class Step:
is_sub_agent_call: bool = False
sub_agent_task: str = ""
sub_agent_index: int = -1 # index into the test case's sub_agents list
- is_final: bool = False # True if this represents LLM final thoughts
@dataclass
@@ -51,6 +64,7 @@ class Agent:
final_thoughts: str = ""
completed: bool = False
max_iterations_reached: bool = False
+ error_message: str = ""
@dataclass
@@ -59,12 +73,21 @@ class TestCase:
name: str = ""
main_agent: Optional[Agent] = None
sub_agents: List[Agent] = field(default_factory=list)
+ setup: SetupInfo = field(default_factory=SetupInfo)
# ─────────────────────────── Log Parsing ───────────────────────────
-# Regex for parsing log line timestamps
+# Format: TIMESTAMP MODULE LEVEL CONTENT
+# e.g. "2026-03-26 12:45:20,277 microbots.environment.local_docker.LocalDockerEnvironment INFO ..."
+# e.g. "2026-03-26 12:46:35,819 MicroBot INFO ℹ️ TASK STARTED : ..."
+# e.g. "2026-03-26 12:49:30,653 🤖 MicroBot-Sub INFO Sub-agent completed..."
LOG_LINE_RE = re.compile(
+ r'^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+(.*?)\s+(INFO|ERROR|WARNING|DEBUG)\s(.*)$'
+)
+
+# Legacy format: TIMESTAMP [LEVEL] CONTENT
+LOG_LINE_LEGACY_RE = re.compile(
r'^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) \[(INFO|ERROR|WARNING|DEBUG)\] (.*)$'
)
@@ -73,8 +96,11 @@ def parse_log_entries(log_path: str) -> List[dict]:
"""
Parse a log file into a list of entries.
Multi-line log entries (continuation lines without timestamps) are joined.
+ Supports both the current log format (TIMESTAMP MODULE LEVEL CONTENT) and
+ the legacy format (TIMESTAMP [LEVEL] CONTENT).
- Returns a list of dicts: {'timestamp': str, 'level': str, 'content': str, 'line_num': int}
+ Returns a list of dicts:
+ {'timestamp': str, 'level': str, 'module': str, 'content': str, 'line_num': int}
"""
entries = []
current_entry = None
@@ -82,24 +108,36 @@ def parse_log_entries(log_path: str) -> List[dict]:
with open(log_path, 'r', encoding='utf-8', errors='replace') as f:
for line_num, raw_line in enumerate(f, 1):
line = raw_line.rstrip('\n')
+
+ # Try current format first, then legacy
match = LOG_LINE_RE.match(line)
if match:
- # Save previous entry
if current_entry is not None:
entries.append(current_entry)
current_entry = {
'timestamp': match.group(1),
- 'level': match.group(2),
- 'content': match.group(3),
+ 'module': match.group(2).strip(),
+ 'level': match.group(3),
+ 'content': match.group(4),
'line_num': line_num,
}
else:
- # Continuation of previous entry
- if current_entry is not None:
- current_entry['content'] += '\n' + line
- # else: lines before any log entry (skip)
+ legacy = LOG_LINE_LEGACY_RE.match(line)
+ if legacy:
+ if current_entry is not None:
+ entries.append(current_entry)
+ current_entry = {
+ 'timestamp': legacy.group(1),
+ 'module': '',
+ 'level': legacy.group(2),
+ 'content': legacy.group(3),
+ 'line_num': line_num,
+ }
+ else:
+ # Continuation of previous entry
+ if current_entry is not None:
+ current_entry['content'] += '\n' + line
- # Don't forget the last entry
if current_entry is not None:
entries.append(current_entry)
@@ -111,33 +149,26 @@ def parse_log_entries(log_path: str) -> List[dict]:
def extract_task_from_microbot_sub(command: str) -> str:
"""Extract the --task argument from a microbot_sub command."""
- # Normalize escaped quotes: \" -> "
normalized = command.replace('\\"', '"').replace('\\n', '\n')
- # Try to find --task "..." followed by " --iterations or end
match = re.search(r'--task\s+"(.*?)"\s+--(?:iterations|timeout)', normalized, re.DOTALL)
if match:
return match.group(1).strip()
- # Try to find --task "..." at end of command
match = re.search(r'--task\s+"(.*?)"\s*$', normalized, re.DOTALL)
if match:
return match.group(1).strip()
- # Try single quotes
match = re.search(r"--task\s+'(.*?)'\s+--(?:iterations|timeout)", normalized, re.DOTALL)
if match:
return match.group(1).strip()
- # Fallback: grab everything after --task " until the last " before --iterations
match = re.search(r'--task\s+"(.+)', normalized, re.DOTALL)
if match:
text = match.group(1)
- # Try to find closing quote before --iterations or --timeout
iter_match = re.search(r'"\s+--(?:iterations|timeout)', text)
if iter_match:
return text[:iter_match.start()].strip()
- # Try the last quote
quote_end = text.rfind('"')
if quote_end > 0:
return text[:quote_end].strip()
@@ -145,6 +176,50 @@ def extract_task_from_microbot_sub(command: str) -> str:
return command
+def _extract_setup_info(entries: List[dict]) -> SetupInfo:
+ """Extract environment setup information from log entries before the first TASK STARTED."""
+ setup = SetupInfo()
+ for entry in entries:
+ content = entry['content']
+ if 'TASK STARTED' in content:
+ break
+
+ # Container start
+ m = re.search(r'Started container (\w+) with image (\S+) on host port (\d+)', content)
+ if m:
+ setup.container_id = m.group(1)
+ setup.image = m.group(2)
+ setup.host_port = m.group(3)
+ continue
+
+ # Working directory
+ m = re.search(r'Created working directory at (\S+)', content)
+ if m:
+ setup.working_dir = m.group(1)
+ continue
+
+ # Volume mapping
+ if 'Volume mapping:' in content:
+ setup.volume_mappings.append(content.split('Volume mapping:', 1)[1].strip())
+ continue
+
+ # Tool installed
+ m = re.search(r'Successfully (?:installed|set up|setup) (?:external )?tool:\s*(\S+)', content)
+ if m:
+ tool_name = m.group(1)
+ if tool_name not in setup.tools_installed:
+ setup.tools_installed.append(tool_name)
+ continue
+
+ # Files copied to container
+ m = re.search(r'Successfully copied (.+?) to container:(.+)', content)
+ if m:
+ setup.files_copied.append(f"{m.group(1).strip()} → {m.group(2).strip()}")
+ continue
+
+ return setup
+
+
def build_test_cases(entries: List[dict]) -> List[TestCase]:
"""
Walk through log entries and build a list of TestCase objects,
@@ -153,11 +228,10 @@ def build_test_cases(entries: List[dict]) -> List[TestCase]:
test_cases = []
current_test: Optional[TestCase] = None
- # Agent tracking
- agent_stack: List[Agent] = [] # stack: [main_agent, sub_agent, ...]
+ agent_stack: List[Agent] = []
current_step: Optional[Step] = None
- pending_sub_agent_step: Optional[Step] = None # main agent step that called microbot_sub
- current_field: Optional[str] = None # track what we're collecting multi-line for
+ pending_sub_agent_step: Optional[Step] = None
+ current_field: Optional[str] = None
def current_agent() -> Optional[Agent]:
return agent_stack[-1] if agent_stack else None
@@ -191,22 +265,18 @@ def finalize_test_case():
continue
# ── Task started ──
- if 'ℹ️ TASK STARTED' in content:
+ if 'TASK STARTED' in content:
task_text = content.split('TASK STARTED', 1)[1].lstrip(' :').strip()
new_agent = Agent(task=task_text)
if not current_test:
- # No test case context yet, create one from filename
current_test = TestCase(name="unknown")
if not current_test.main_agent:
- # First agent = main agent
new_agent.is_main = True
current_test.main_agent = new_agent
agent_stack = [new_agent]
else:
- # Sub-agent
- # Use the task from the microbot_sub command if available
if pending_sub_agent_step and pending_sub_agent_step.sub_agent_task:
new_agent.task = pending_sub_agent_step.sub_agent_task
elif task_text:
@@ -215,7 +285,6 @@ def finalize_test_case():
sub_idx = len(current_test.sub_agents)
current_test.sub_agents.append(new_agent)
- # Link the parent step to this sub-agent
if pending_sub_agent_step:
pending_sub_agent_step.sub_agent_index = sub_idx
pending_sub_agent_step = None
@@ -227,16 +296,15 @@ def finalize_test_case():
continue
# ── Task completed ──
- if '🔚 TASK COMPLETED' in content:
+ if 'TASK COMPLETED' in content:
agent = current_agent()
if agent:
agent.completed = True
- current_field = None # Stop accumulating text
+ current_field = None
continue
# ── Sub-agent completed message ──
if 'Sub-agent completed successfully with output:' in content:
- # Pop sub-agent from stack
if len(agent_stack) > 1:
agent_stack.pop()
current_step = None
@@ -249,13 +317,22 @@ def finalize_test_case():
if agent and not agent.is_main:
agent.max_iterations_reached = True
agent.completed = False
- # Pop sub-agent from stack
+ agent.error_message = content
if len(agent_stack) > 1:
agent_stack.pop()
current_step = None
current_field = None
continue
+ # ── Failed to parse sub-agent command ──
+ if level == 'ERROR' and 'Failed to parse microbot_sub command' in content:
+ if current_step:
+ current_step.is_blocked = True
+ current_step.blocked_reason = content
+ pending_sub_agent_step = None
+ current_field = None
+ continue
+
# ── Max iterations reached ──
if level == 'ERROR' and 'Max iterations' in content:
agent = current_agent()
@@ -275,8 +352,8 @@ def finalize_test_case():
continue
# ── LLM final thoughts ──
- if '💭 LLM final thoughts:' in content:
- text = content.split('💭 LLM final thoughts:', 1)[1].strip()
+ if 'LLM final thoughts:' in content:
+ text = content.split('LLM final thoughts:', 1)[1].strip()
agent = current_agent()
if agent:
agent.final_thoughts = text
@@ -284,17 +361,16 @@ def finalize_test_case():
continue
# ── LLM thoughts ──
- if '💭 LLM thoughts:' in content:
- text = content.split('💭 LLM thoughts:', 1)[1].strip()
+ if 'LLM thoughts:' in content and 'final' not in content.split('LLM thoughts:')[0].lower():
+ text = content.split('LLM thoughts:', 1)[1].strip()
if current_step:
current_step.thought = text
current_field = 'thought'
continue
# ── LLM tool call ──
- if '➡️ LLM tool call :' in content:
- cmd = content.split('➡️ LLM tool call :', 1)[1].strip()
- # Remove surrounding quotes if present
+ if 'LLM tool call' in content and ':' in content.split('LLM tool call')[1]:
+ cmd = content.split('LLM tool call', 1)[1].split(':', 1)[1].strip()
if cmd.startswith('"') and cmd.endswith('"'):
cmd = cmd[1:-1]
if current_step:
@@ -307,22 +383,29 @@ def finalize_test_case():
continue
# ── Command output ──
- if '⬅️ Command output:' in content:
- text = content.split('⬅️ Command output:', 1)[1].strip()
+ if 'Command output:' in content:
+ text = content.split('Command output:', 1)[1].strip()
if current_step:
current_step.output = text
current_field = 'output'
continue
# ── Dangerous command blocked ──
- if '⚠️ Dangerous command detected' in content:
+ if 'Dangerous command detected' in content:
if current_step:
current_step.is_blocked = True
- current_step.blocked_reason = content
+ # Parse REASON/ALTERNATIVE from multi-line content
+ lines = content.split('\n')
+ current_step.blocked_reason = lines[0]
+ for bline in lines[1:]:
+ if bline.startswith('REASON:'):
+ current_step.blocked_reason = bline
+ elif bline.startswith('ALTERNATIVE:'):
+ current_step.blocked_alternative = bline
current_field = 'blocked'
continue
- # ── REASON / ALTERNATIVE for blocked commands ──
+ # ── REASON / ALTERNATIVE for blocked commands (separate entries) ──
if current_field == 'blocked' and current_step:
if content.startswith('REASON:'):
current_step.blocked_reason = content
@@ -332,7 +415,10 @@ def finalize_test_case():
# ── Invoking MicroBotSubAgent ──
if 'Invoking MicroBotSubAgent with task:' in content:
- # This is just a log message; the sub-agent TASK STARTED follows
+ continue
+
+ # ── Memory tool operations ──
+ if 'Memory file created:' in content or 'Memory file updated:' in content:
continue
# ── Multi-line continuation for known fields ──
@@ -366,20 +452,13 @@ def finalize_test_case():
agent.final_thoughts = content
continue
- # Finalize last test case
finalize_test_case()
-
return test_cases
# ─────────────────────────── Markdown Generation ───────────────────────────
-def escape_md(text: str) -> str:
- """Escape text for markdown display (minimal escaping for code blocks)."""
- return text
-
-
def truncate_text(text: str, max_lines: int = 200) -> str:
"""Truncate text if it exceeds max_lines."""
lines = text.split('\n')
@@ -388,7 +467,34 @@ def truncate_text(text: str, max_lines: int = 200) -> str:
return text
-def generate_step_md(step: Step, sub_agent_filename: str = "") -> str:
+def generate_setup_md(setup: SetupInfo) -> str:
+ """Generate markdown for the setup/environment section."""
+ if not setup.container_id and not setup.tools_installed:
+ return ""
+
+ md = "## 🔧 Environment Setup\n\n"
+
+ if setup.container_id:
+ md += f"- **Container:** `{setup.container_id}` (image: `{setup.image}`, port: {setup.host_port})\n"
+ if setup.working_dir:
+ md += f"- **Working directory:** `{setup.working_dir}`\n"
+ for vol in setup.volume_mappings:
+ md += f"- **Volume:** {vol}\n"
+
+ if setup.tools_installed:
+ md += f"- **Tools:** {', '.join(setup.tools_installed)}\n"
+
+ if setup.files_copied:
+ md += "\n<details>\n<summary>Files copied to container</summary>\n\n"
+ for fc in setup.files_copied:
+ md += f"- {fc}\n"
+ md += "\n</details>\n"
+
+ md += "\n---\n\n"
+ return md
+
+
+def generate_step_md(step: Step, sub_agent_filename: str = "", heading_level: int = 3) -> str:
"""Generate markdown for a single step as a collapsible details section."""
status = "🚫 Blocked" if step.is_blocked else ""
if step.is_sub_agent_call:
@@ -396,39 +502,36 @@ def generate_step_md(step: Step, sub_agent_filename: str = "") -> str:
summary = f"Step {step.number}"
if status:
- summary += f" - {status}"
+ summary += f" — {status}"
- # Build brief description from the thought (first sentence)
if step.thought:
first_line = step.thought.split('\n')[0]
if len(first_line) > 120:
first_line = first_line[:117] + "..."
summary += f": {first_line}"
+ h = '#' * heading_level
+
+ md = f"<details>\n<summary>{summary}</summary>\n\n"
- # Thought section
if step.thought:
- md += "### 💭 Thought\n\n"
+ md += f"{h} 💭 Thought\n\n"
md += f"{step.thought}\n\n"
- # Blocked command warning
if step.is_blocked:
- md += "### ⚠️ Command Blocked\n\n"
+ md += f"{h} ⚠️ Command Blocked\n\n"
if step.blocked_reason:
md += f"> {step.blocked_reason}\n"
if step.blocked_alternative:
md += f"> {step.blocked_alternative}\n"
md += "\n"
- # Command section
if step.command:
- md += "### ➡️ Command\n\n"
+ md += f"{h} ➡️ Command\n\n"
if step.is_sub_agent_call:
md += "**Sub-agent invocation:**\n\n"
if sub_agent_filename:
md += f"📎 **[View Sub-Agent Trajectory]({sub_agent_filename})**\n\n"
- # Show the task
if step.sub_agent_task:
+ md += "\n<details>\n<summary>Sub-agent task description</summary>\n\n"
md += f"```\n{step.sub_agent_task}\n```\n\n"
@@ -436,9 +539,8 @@ def generate_step_md(step: Step, sub_agent_filename: str = "") -> str:
else:
md += f"```bash\n{step.command}\n```\n\n"
- # Output section
if step.output:
- md += "### ⬅️ Output\n\n"
+ md += f"{h} ⬅️ Output\n\n"
output_text = truncate_text(step.output)
md += f"```\n{output_text}\n```\n\n"
@@ -446,9 +548,19 @@ def generate_step_md(step: Step, sub_agent_filename: str = "") -> str:
return md
+def _agent_status_str(agent: Agent) -> str:
+ if agent.completed:
+ return "✅ Completed"
+ if agent.max_iterations_reached:
+ return "❌ Failed (max iterations / timeout)"
+ return "❓ Unknown"
+
+
def generate_main_agent_md(test_case: TestCase) -> str:
"""Generate the main agent markdown file content."""
- md = f"# 🤖 Main Agent Trajectory: {test_case.name}\n\n"
+ md = f"# 🤖 Agent Trajectory: {test_case.name}\n\n"
+
+ md += generate_setup_md(test_case.setup)
if test_case.main_agent and test_case.main_agent.task:
md += "## Task\n\n"
@@ -471,7 +583,6 @@ def generate_main_agent_md(test_case: TestCase) -> str:
md += "---\n\n"
- # Summary
if agent.completed:
md += "## ✅ Task Completed\n\n"
if agent.final_thoughts:
@@ -480,7 +591,6 @@ def generate_main_agent_md(test_case: TestCase) -> str:
md += "## ❌ Max Iterations Reached\n\n"
md += "The agent did not complete the task within the maximum allowed iterations.\n\n"
- # Sub-agent index
if test_case.sub_agents:
md += "## 📋 Sub-Agents\n\n"
md += "| # | Task | Status | Link |\n"
@@ -490,7 +600,7 @@ def generate_main_agent_md(test_case: TestCase) -> str:
first_line = clean.split('\n')[0]
task_summary = first_line[:80] + "..." if len(first_line) > 80 else first_line
task_summary = task_summary.replace('|', '\\|')
- status = "✅ Completed" if sub.completed else "❌ Failed"
+ status = _agent_status_str(sub)
link = f"[sub_agent_{i + 1}.md](sub_agent_{i + 1}.md)"
md += f"| {i + 1} | {task_summary} | {status} | {link} |\n"
md += "\n"
@@ -501,22 +611,18 @@ def generate_main_agent_md(test_case: TestCase) -> str:
def clean_task_text(task: str) -> str:
"""Clean up a task string: remove microbot_sub prefix, escaped quotes, etc."""
text = task.strip()
- # Remove microbot_sub --task "..." wrapper if present
if text.startswith('microbot_sub'):
match = re.search(r'--task\s+["\'](.+)', text, re.DOTALL)
if match:
text = match.group(1)
- # Remove trailing quote + flags
text = re.sub(r'["\']\s*--(?:iterations|timeout).*$', '', text, flags=re.DOTALL)
text = text.strip().strip('"').strip("'").strip()
- # Unescape
- text = text.replace('\\"', '"').replace('\\n', '\n').replace("\\'" , "'")
+ text = text.replace('\\"', '"').replace('\\n', '\n').replace("\\'", "'")
return text
def generate_sub_agent_md(sub_agent: Agent, index: int, test_case_name: str) -> str:
"""Generate a sub-agent markdown file content."""
- # Clean and use the first line of the task as heading
clean_task = clean_task_text(sub_agent.task)
task_heading = clean_task.split('\n')[0] if clean_task else f"Sub-Agent {index + 1}"
if len(task_heading) > 150:
@@ -539,14 +645,114 @@ def generate_sub_agent_md(sub_agent: Agent, index: int, test_case_name: str) ->
md += "---\n\n"
- # Summary
if sub_agent.completed:
md += "## ✅ Task Completed\n\n"
if sub_agent.final_thoughts:
md += f"{sub_agent.final_thoughts}\n\n"
elif sub_agent.max_iterations_reached:
md += "## ❌ Max Iterations Reached\n\n"
- md += "The sub-agent did not complete the task within the maximum allowed iterations.\n\n"
+ if sub_agent.error_message:
+ md += f"> {sub_agent.error_message}\n\n"
+ else:
+ md += "The sub-agent did not complete the task within the maximum allowed iterations.\n\n"
+
+ return md
+
+
+# ─────────────────────────── Single-File Mode ───────────────────────────
+
+
+def generate_single_file_md(test_case: TestCase) -> str:
+ """Generate a single markdown file containing the main agent and all sub-agents."""
+ md = f"# 🤖 Agent Trajectory: {test_case.name}\n\n"
+
+ md += generate_setup_md(test_case.setup)
+
+ # Table of contents
+ if test_case.sub_agents:
+ md += "## 📑 Table of Contents\n\n"
+ md += "- [Main Agent](#main-agent)\n"
+ for i, sub in enumerate(test_case.sub_agents):
+ clean = clean_task_text(sub.task)
+ first_line = clean.split('\n')[0][:60]
+ md += f"- [Sub-Agent {i + 1}: {first_line}](#sub-agent-{i + 1})\n"
+ md += "\n---\n\n"
+
+ # Main agent section
+ md += "## Main Agent\n\n"
+
+ if test_case.main_agent and test_case.main_agent.task:
+ md += "### Task\n\n"
+ task_text = test_case.main_agent.task
+ if len(task_text) > 500:
+ md += f"<details>\n<summary>Full task description</summary>\n\n{task_text}\n\n</details>\n\n"
+ else:
+ md += f"{task_text}\n\n"
+
+ md += "---\n\n"
+ md += "### Steps\n\n"
+
+ if test_case.main_agent:
+ agent = test_case.main_agent
+ for step in agent.steps:
+ sub_ref = ""
+ if step.is_sub_agent_call and step.sub_agent_index >= 0:
+ sub_ref = f"#sub-agent-{step.sub_agent_index + 1}"
+ md += generate_step_md(step, sub_agent_filename=sub_ref, heading_level=4)
+
+ md += "---\n\n"
+
+ if agent.completed:
+ md += "### ✅ Task Completed\n\n"
+ if agent.final_thoughts:
+ md += f"{agent.final_thoughts}\n\n"
+ elif agent.max_iterations_reached:
+ md += "### ❌ Max Iterations Reached\n\n"
+
+ # Sub-agent summary table
+ if test_case.sub_agents:
+ md += "### 📋 Sub-Agents Summary\n\n"
+ md += "| # | Task | Status |\n"
+ md += "|---|------|--------|\n"
+ for i, sub in enumerate(test_case.sub_agents):
+ clean = clean_task_text(sub.task)
+ first_line = clean.split('\n')[0]
+ task_summary = first_line[:80] + "..." if len(first_line) > 80 else first_line
+ task_summary = task_summary.replace('|', '\\|')
+ status = _agent_status_str(sub)
+ md += f"| [{i + 1}](#sub-agent-{i + 1}) | {task_summary} | {status} |\n"
+ md += "\n"
+
+ # Sub-agent sections
+ for i, sub in enumerate(test_case.sub_agents):
+ clean_task = clean_task_text(sub.task)
+ task_heading = clean_task.split('\n')[0] if clean_task else f"Sub-Agent {i + 1}"
+ if len(task_heading) > 120:
+ task_heading = task_heading[:117] + "..."
+
+ md += f"\n---\n\n## Sub-Agent {i + 1}\n\n"
+ md += f"**{task_heading}**\n\n"
+
+ if clean_task and '\n' in clean_task:
+ md += "<details>\n<summary>Full task description</summary>\n\n"
+ md += f"```\n{clean_task}\n```\n\n"
+ md += "</details>\n\n"
+
+ md += "### Steps\n\n"
+
+ for step in sub.steps:
+ md += generate_step_md(step, heading_level=4)
+
+ md += "---\n\n"
+
+ if sub.completed:
+ md += "### ✅ Task Completed\n\n"
+ if sub.final_thoughts:
+ md += f"{sub.final_thoughts}\n\n"
+ elif sub.max_iterations_reached:
+ md += "### ❌ Max Iterations Reached\n\n"
+ if sub.error_message:
+ md += f"> {sub.error_message}\n\n"
return md
@@ -554,19 +760,19 @@ def generate_sub_agent_md(sub_agent: Agent, index: int, test_case_name: str) ->
# ─────────────────────────── Main ───────────────────────────
-def parse_and_generate(log_path: str, output_base_dir: str = None):
+def parse_and_generate(log_path: str, output_base_dir: str = None, single_file: bool = False):
"""
Parse an info.log file and generate markdown trajectory files.
Args:
log_path: Path to the info.log file
output_base_dir: Base directory for output. If None, uses the log file's directory.
+ single_file: If True, generate a single markdown file instead of a directory.
"""
if not os.path.isfile(log_path):
print(f"Error: Log file not found: {log_path}")
sys.exit(1)
- # Derive test case name from filename
basename = os.path.basename(log_path)
if basename.endswith('_info.log'):
default_test_name = basename[:-len('_info.log')]
@@ -580,44 +786,59 @@ def parse_and_generate(log_path: str, output_base_dir: str = None):
print(f"Parsing log file: {log_path}")
- # Parse
entries = parse_log_entries(log_path)
print(f" Parsed {len(entries)} log entries")
+ # Extract setup info before building test cases
+ setup = _extract_setup_info(entries)
+
test_cases = build_test_cases(entries)
print(f" Found {len(test_cases)} test case(s)")
if not test_cases:
- # If no test case boundaries found, create a single test case
print(" No test case boundaries found, treating entire log as one test case")
- tc = TestCase(name=default_test_name)
- # Re-parse with a dummy test case
test_cases = _build_single_test_case(entries, default_test_name)
- for tc in test_cases:
- # Create output directory
- trajectory_dir = os.path.join(output_base_dir, f"{tc.name}_trajectory")
- os.makedirs(trajectory_dir, exist_ok=True)
- print(f"\n Test case: {tc.name}")
- print(f" Output directory: {trajectory_dir}")
-
- # Generate main agent markdown
- main_md = generate_main_agent_md(tc)
- main_path = os.path.join(trajectory_dir, "main_agent.md")
- with open(main_path, 'w', encoding='utf-8') as f:
- f.write(main_md)
- main_steps = len(tc.main_agent.steps) if tc.main_agent else 0
- print(f" Created: main_agent.md ({main_steps} steps)")
+ # Attach setup info to first test case
+ if test_cases:
+ test_cases[0].setup = setup
- # Generate sub-agent markdowns
- for i, sub in enumerate(tc.sub_agents):
- sub_md = generate_sub_agent_md(sub, i, tc.name)
- sub_path = os.path.join(trajectory_dir, f"sub_agent_{i + 1}.md")
- with open(sub_path, 'w', encoding='utf-8') as f:
- f.write(sub_md)
- print(f" Created: sub_agent_{i + 1}.md ({len(sub.steps)} steps)")
+ for tc in test_cases:
+ tc_name = tc.name if tc.name != "unknown" else default_test_name
+ tc.name = tc_name
- print(f"\nDone! Generated trajectory files for {len(test_cases)} test case(s).")
+ main_steps = len(tc.main_agent.steps) if tc.main_agent else 0
+ sub_count = len(tc.sub_agents)
+ print(f"\n Test case: {tc_name}")
+ print(f" Main agent: {main_steps} steps, {sub_count} sub-agent(s)")
+
+ if single_file:
+ # Single file mode
+ md = generate_single_file_md(tc)
+ out_path = os.path.join(output_base_dir, f"{tc_name}_trajectory.md")
+ with open(out_path, 'w', encoding='utf-8') as f:
+ f.write(md)
+ print(f" Created: {out_path}")
+ else:
+ # Multi-file mode
+ trajectory_dir = os.path.join(output_base_dir, f"{tc_name}_trajectory")
+ os.makedirs(trajectory_dir, exist_ok=True)
+ print(f" Output directory: {trajectory_dir}")
+
+ main_md = generate_main_agent_md(tc)
+ main_path = os.path.join(trajectory_dir, "main_agent.md")
+ with open(main_path, 'w', encoding='utf-8') as f:
+ f.write(main_md)
+ print(f" Created: main_agent.md ({main_steps} steps)")
+
+ for i, sub in enumerate(tc.sub_agents):
+ sub_md = generate_sub_agent_md(sub, i, tc_name)
+ sub_path = os.path.join(trajectory_dir, f"sub_agent_{i + 1}.md")
+ with open(sub_path, 'w', encoding='utf-8') as f:
+ f.write(sub_md)
+ print(f" Created: sub_agent_{i + 1}.md ({len(sub.steps)} steps)")
+
+ print(f"\nDone! Generated trajectory for {len(test_cases)} test case(s).")
return test_cases
@@ -628,6 +849,7 @@ def _build_single_test_case(entries: List[dict], name: str) -> List[TestCase]:
"""
fake_boundary = {
'timestamp': '2000-01-01 00:00:00,000',
+ 'module': '',
'level': 'INFO',
'content': f'Test directory set up at: /fake/{name}',
'line_num': 0,
@@ -636,17 +858,17 @@ def _build_single_test_case(entries: List[dict], name: str) -> List[TestCase]:
def main():
- if len(sys.argv) < 2:
- print("Usage: python multi_agent_log_parser.py _info.log [output_dir]")
- print("\nParses an info.log file and generates markdown trajectory files.")
- print("The log file should be named as _info.log.")
- print("A directory _trajectory will be created with all markdown files.")
- sys.exit(1)
-
- log_path = sys.argv[1]
- output_dir = sys.argv[2] if len(sys.argv) > 2 else None
-
- parse_and_generate(log_path, output_dir)
+ parser = argparse.ArgumentParser(
+ description="Parse microbots info.log files into markdown trajectory files."
+ )
+ parser.add_argument("log_file", help="Path to the info.log file to parse")
+ parser.add_argument("output_dir", nargs="?", default=None,
+ help="Output directory (default: same directory as log file)")
+ parser.add_argument("--single-file", action="store_true",
+ help="Generate a single markdown file instead of a directory with separate files")
+
+ args = parser.parse_args()
+ parse_and_generate(args.log_file, args.output_dir, args.single_file)
if __name__ == '__main__':
diff --git a/test/bot/test_copilot_auth.py b/test/bot/test_copilot_auth.py
new file mode 100644
index 0000000..57dd871
--- /dev/null
+++ b/test/bot/test_copilot_auth.py
@@ -0,0 +1,67 @@
+"""Unit tests for microbots.utils.copilot_auth.get_copilot_token."""
+
+import json
+import os
+import sys
+from pathlib import Path
+
+import pytest
+
+sys.path.insert(
+ 0, os.path.abspath(os.path.join(os.path.dirname(__file__), "../../src"))
+)
+
+from microbots.utils.copilot_auth import get_copilot_token
+
+
+# ---------------------------------------------------------------------------
+# Unit tests
+# ---------------------------------------------------------------------------
+
+@pytest.mark.unit
+class TestGetCopilotToken:
+ """Tests for get_copilot_token()."""
+
+ def test_returns_none_when_file_missing(self, tmp_path):
+ """Returns None when the config file does not exist."""
+ missing = tmp_path / "nonexistent.json"
+ assert get_copilot_token(config_path=missing) is None
+
+ def test_returns_none_on_invalid_json(self, tmp_path):
+ """Returns None and logs a warning when the file contains invalid JSON."""
+ bad_file = tmp_path / "config.json"
+ bad_file.write_text("this is not json", encoding="utf-8")
+ assert get_copilot_token(config_path=bad_file) is None
+
+ def test_returns_none_when_no_copilot_tokens_key(self, tmp_path):
+ """Returns None when the JSON has no 'copilot_tokens' key."""
+ cfg = tmp_path / "config.json"
+ cfg.write_text(json.dumps({"other_key": "value"}), encoding="utf-8")
+ assert get_copilot_token(config_path=cfg) is None
+
+ def test_returns_none_when_copilot_tokens_empty(self, tmp_path):
+ """Returns None when 'copilot_tokens' is an empty dict."""
+ cfg = tmp_path / "config.json"
+ cfg.write_text(json.dumps({"copilot_tokens": {}}), encoding="utf-8")
+ assert get_copilot_token(config_path=cfg) is None
+
+ def test_returns_first_token(self, tmp_path):
+ """Returns the first token value from 'copilot_tokens'."""
+ cfg = tmp_path / "config.json"
+ cfg.write_text(
+ json.dumps({"copilot_tokens": {"host1": "token-abc", "host2": "token-xyz"}}),
+ encoding="utf-8",
+ )
+ token = get_copilot_token(config_path=cfg)
+ assert token == "token-abc"
+
+    @pytest.mark.skipif(
+        hasattr(os, "geteuid") and os.geteuid() == 0,
+        reason="running as root bypasses file permissions, so the OSError path is not exercised",
+    )
+    def test_returns_none_on_os_error(self, tmp_path):
+ """Returns None when the file cannot be read (OSError)."""
+ cfg = tmp_path / "config.json"
+ cfg.write_text("{}", encoding="utf-8")
+ cfg.chmod(0o000) # remove read permission
+ try:
+ result = get_copilot_token(config_path=cfg)
+ assert result is None
+ finally:
+ cfg.chmod(0o644) # restore permissions for cleanup
diff --git a/test/bot/test_copilot_bot.py b/test/bot/test_copilot_bot.py
new file mode 100644
index 0000000..5804d4c
--- /dev/null
+++ b/test/bot/test_copilot_bot.py
@@ -0,0 +1,1441 @@
+"""
+Unit and integration tests for CopilotBot.
+
+Unit tests mock the copilot SDK and Docker environment to verify the
+wiring and lifecycle. Integration tests (marked ``@pytest.mark.integration``)
+require a real Docker daemon, copilot-cli, and GitHub authentication.
+"""
+
+import importlib
+import os
+import shutil
+import subprocess
+import sys
+from unittest.mock import AsyncMock, MagicMock, Mock, patch
+
+import pytest
+
+sys.path.insert(
+ 0, os.path.abspath(os.path.join(os.path.dirname(__file__), "../../src"))
+)
+
+# ---------------------------------------------------------------------------
+# Mock the copilot SDK before importing CopilotBot (optional dependency)
+# ---------------------------------------------------------------------------
+_mock_copilot = MagicMock()
+_mock_copilot.CopilotClient = MagicMock
+_mock_copilot.ExternalServerConfig = MagicMock
+
+_mock_session = MagicMock()
+_mock_session.PermissionRequestResult = MagicMock
+
+_mock_events = MagicMock()
+_mock_events.SessionEventType = MagicMock()
+_mock_events.SessionEventType.ASSISTANT_MESSAGE = "assistant.message"
+_mock_events.SessionEventType.ASSISTANT_MESSAGE_DELTA = "assistant.message_delta"
+_mock_events.SessionEventType.SESSION_IDLE = "session.idle"
+
+_mock_tools = MagicMock()
+_mock_tools.Tool = MagicMock
+_mock_tools.ToolInvocation = MagicMock
+_mock_tools.ToolResult = MagicMock
+_mock_tools.define_tool = MagicMock
+
+_mock_types = MagicMock()
+_mock_types.PermissionHandler = MagicMock()
+_mock_types.PermissionHandler.approve_all = MagicMock()
+
+sys.modules.setdefault("copilot", _mock_copilot)
+sys.modules.setdefault("copilot.session", _mock_session)
+sys.modules.setdefault("copilot.generated.session_events", _mock_events)
+sys.modules.setdefault("copilot.tools", _mock_tools)
+sys.modules.setdefault("copilot.types", _mock_types)
+
+# Reload to pick up mock
+if "microbots.bot.CopilotBot" in sys.modules:
+ importlib.reload(sys.modules["microbots.bot.CopilotBot"])
+
+from microbots.MicroBot import BotRunResult
+
+
+def _restore_real_copilot_modules():
+ """Remove mock copilot modules from sys.modules and reload CopilotBot.
+
+ This allows integration tests to use the real copilot SDK instead of
+ the mocks injected at module level for unit tests.
+ """
+ mock_keys = [k for k in sys.modules if k == "copilot" or k.startswith("copilot.")]
+ for key in mock_keys:
+ del sys.modules[key]
+ # Also force CopilotBot to re-import the real SDK on next import
+ if "microbots.bot.CopilotBot" in sys.modules:
+ del sys.modules["microbots.bot.CopilotBot"]
+
+
+# ---------------------------------------------------------------------------
+# Helpers
+# ---------------------------------------------------------------------------
+
+def _copilot_cli_available():
+ return shutil.which("copilot") is not None
+
+
+def _copilot_sdk_installed():
+ try:
+ from importlib.metadata import version
+ version("github-copilot-sdk")
+ return True
+ except Exception:
+ return False
+
+
+def _copilot_auth_available():
+ if os.environ.get("GITHUB_TOKEN") or os.environ.get("COPILOT_GITHUB_TOKEN"):
+ return True
+ if shutil.which("gh"):
+ try:
+ result = subprocess.run(
+ ["gh", "auth", "status"], capture_output=True, timeout=10
+ )
+ return result.returncode == 0
+ except Exception:
+ pass
+ return False
+
+
+# ---------------------------------------------------------------------------
+# Unit test fixtures
+# ---------------------------------------------------------------------------
+
+@pytest.fixture
+def mock_environment():
+ """Create a mock LocalDockerEnvironment."""
+ env = MagicMock()
+ env.port = 9000
+ env.container_port = 8080
+ env.container = MagicMock()
+ env.container.id = "abc123def456"
+ env.image = "kavyasree261002/shell_server:latest"
+ env.working_dir = "/tmp/mock_workdir"
+ env.folder_to_mount = None
+ env.overlay_mount = False
+
+ # Make execute return success by default
+ success_return = MagicMock()
+ success_return.return_code = 0
+ success_return.stdout = "copilot version 1.0.0"
+ success_return.stderr = ""
+ env.execute = MagicMock(return_value=success_return)
+ env.copy_to_container = MagicMock(return_value=True)
+ env.stop = MagicMock()
+ env.get_ipv4_address = MagicMock(return_value="172.17.0.2")
+ return env
+
+
+@pytest.fixture
+def mock_copilot_session():
+ """Mock Copilot SDK session."""
+ session = AsyncMock()
+ session.disconnect = AsyncMock()
+
+ response = Mock()
+ response.data = Mock()
+ response.data.content = "Task completed successfully."
+ session.send_and_wait = AsyncMock(return_value=response)
+ session.on = MagicMock()
+ return session
+
+
+@pytest.fixture
+def mock_copilot_client(mock_copilot_session):
+ """Mock CopilotClient."""
+ client = AsyncMock()
+ client.start = AsyncMock()
+ client.stop = AsyncMock()
+ client.create_session = AsyncMock(return_value=mock_copilot_session)
+ return client
+
+
+@pytest.fixture
+def copilot_bot(mock_environment, mock_copilot_client):
+ """Create a CopilotBot with all external dependencies mocked."""
+ with (
+ patch("microbots.bot.CopilotBot.LocalDockerEnvironment", return_value=mock_environment),
+ patch("microbots.bot.CopilotBot.get_free_port", side_effect=[9000]),
+ patch("microbots.bot.CopilotBot.CopilotBot._install_copilot_cli"),
+ patch("microbots.bot.CopilotBot.CopilotBot._start_copilot_cli_server"),
+ patch("microbots.bot.CopilotBot.CopilotBot._wait_for_cli_ready"),
+ patch("copilot.CopilotClient", return_value=mock_copilot_client),
+ patch("copilot.ExternalServerConfig", return_value=MagicMock()),
+ ):
+ from microbots.bot.CopilotBot import CopilotBot
+ bot = CopilotBot(
+ model="gpt-4.1",
+ environment=mock_environment,
+ github_token="ghp_test_token_123",
+ )
+ yield bot
+ # Stop the event loop thread properly before teardown
+ try:
+ bot._loop.call_soon_threadsafe(bot._loop.stop)
+ bot._thread.join(timeout=2)
+ except Exception:
+ pass
+ bot.environment = None # Prevent stop() from trying env.stop() again
+
+
+# ---------------------------------------------------------------------------
+# Unit tests
+# ---------------------------------------------------------------------------
+
+@pytest.mark.unit
+class TestCopilotBotInit:
+ """Tests for CopilotBot initialisation."""
+
+ def test_stores_model(self, copilot_bot):
+ assert copilot_bot.model == "gpt-4.1"
+
+ def test_stores_github_token(self, copilot_bot):
+ assert copilot_bot.github_token == "ghp_test_token_123"
+
+ def test_environment_assigned(self, copilot_bot, mock_environment):
+ assert copilot_bot.environment is mock_environment
+
+ def test_additional_tools_default_empty(self, copilot_bot):
+ assert copilot_bot.additional_tools == []
+
+ def test_rejects_external_tool(self, mock_environment, mock_copilot_client):
+ """CopilotBot raises ValueError if an ExternalTool is passed."""
+ from microbots.tools.external_tool import ExternalTool
+
+ ext_tool = MagicMock(spec=ExternalTool)
+ ext_tool.name = "my_external"
+
+ with (
+ patch("microbots.bot.CopilotBot.LocalDockerEnvironment", return_value=mock_environment),
+ patch("microbots.bot.CopilotBot.get_free_port", side_effect=[9000]),
+ patch("microbots.bot.CopilotBot.CopilotBot._install_copilot_cli"),
+ patch("microbots.bot.CopilotBot.CopilotBot._start_copilot_cli_server"),
+ patch("microbots.bot.CopilotBot.CopilotBot._wait_for_cli_ready"),
+ patch("copilot.CopilotClient", return_value=mock_copilot_client),
+ patch("copilot.ExternalServerConfig", return_value=MagicMock()),
+ ):
+ from microbots.bot.CopilotBot import CopilotBot
+ with pytest.raises(ValueError, match="does not support ExternalTool"):
+ CopilotBot(
+ model="gpt-4.1",
+ environment=mock_environment,
+ additional_tools=[ext_tool],
+ github_token="ghp_test",
+ )
+
+ def test_import_error_without_sdk(self):
+ """CopilotBot raises ImportError when copilot SDK is not installed."""
+ # Temporarily remove the mock so the import fails
+ saved = sys.modules.get("copilot")
+ try:
+ sys.modules["copilot"] = None # Force ImportError on import
+ # Need to reload the module
+ if "microbots.bot.CopilotBot" in sys.modules:
+ importlib.reload(sys.modules["microbots.bot.CopilotBot"])
+ from microbots.bot.CopilotBot import CopilotBot as CB
+ with pytest.raises(ImportError, match="github-copilot-sdk"):
+ CB(model="gpt-4.1")
+ finally:
+ sys.modules["copilot"] = saved
+ if "microbots.bot.CopilotBot" in sys.modules:
+ importlib.reload(sys.modules["microbots.bot.CopilotBot"])
+
+
+@pytest.mark.unit
+class TestCopilotBotRun:
+ """Tests for CopilotBot.run()."""
+
+ def test_run_returns_bot_run_result(self, copilot_bot):
+ result = copilot_bot.run("Fix the bug in main.py")
+ assert isinstance(result, BotRunResult)
+
+ def test_run_success(self, copilot_bot):
+ result = copilot_bot.run("Fix the bug in main.py")
+ assert result.status is True
+ assert result.error is None
+ assert result.result is not None
+
+ def test_run_calls_tool_setup(self, copilot_bot, mock_environment):
+ mock_tool = MagicMock()
+ mock_tool.usage_instructions_to_llm = "Use tool X"
+ mock_tool.install_commands = []
+ mock_tool.verify_commands = []
+ copilot_bot.additional_tools = [mock_tool]
+
+ copilot_bot.run("test task")
+ mock_tool.setup_tool.assert_called_once_with(mock_environment)
+
+ def test_run_handles_exception(self, copilot_bot):
+ """Run returns failure BotRunResult on exceptions."""
+ with patch.object(copilot_bot, "_run_async", side_effect=RuntimeError("boom")):
+ result = copilot_bot.run("test")
+ assert result.status is False
+ assert "boom" in result.error
+
+
+@pytest.mark.unit
+class TestCopilotBotSystemMessage:
+ """Tests for system message construction."""
+
+ def test_system_message_empty_no_mount_no_tools(self, copilot_bot):
+ msg = copilot_bot._build_system_message()
+ assert msg == ""
+
+ def test_system_message_includes_mount_path(self, mock_environment, mock_copilot_client):
+ with (
+ patch("microbots.bot.CopilotBot.LocalDockerEnvironment", return_value=mock_environment),
+ patch("microbots.bot.CopilotBot.get_free_port", side_effect=[9000]),
+ patch("microbots.bot.CopilotBot.CopilotBot._install_copilot_cli"),
+ patch("microbots.bot.CopilotBot.CopilotBot._start_copilot_cli_server"),
+ patch("microbots.bot.CopilotBot.CopilotBot._wait_for_cli_ready"),
+ patch("copilot.CopilotClient", return_value=mock_copilot_client),
+ patch("copilot.ExternalServerConfig", return_value=MagicMock()),
+ ):
+ from microbots.bot.CopilotBot import CopilotBot
+ from microbots.extras.mount import Mount
+ mount = Mount("/tmp/test_repo", "/workdir/test_repo", "READ_WRITE")
+ bot = CopilotBot(
+ model="gpt-4.1",
+ environment=mock_environment,
+ github_token="ghp_test",
+ )
+ bot.folder_to_mount = mount
+ msg = bot._build_system_message()
+ assert "/workdir/test_repo" in msg
+ bot.stop()
+
+ def test_system_message_includes_tool_instructions(self, copilot_bot):
+ mock_tool = MagicMock()
+ mock_tool.usage_instructions_to_llm = "# Use browser command"
+ copilot_bot.additional_tools = [mock_tool]
+
+ msg = copilot_bot._build_system_message()
+ assert "browser" in msg
+
+
+@pytest.mark.unit
+class TestCopilotBotStop:
+ """Tests for CopilotBot.stop()."""
+
+ def test_stop_cleans_environment(self, copilot_bot, mock_environment):
+ copilot_bot.stop()
+ mock_environment.stop.assert_called_once()
+
+ def test_stop_idempotent(self, copilot_bot, mock_environment):
+ copilot_bot.stop()
+ copilot_bot.stop() # Should not raise
+
+
+@pytest.mark.unit
+class TestCopilotBotCLIInstall:
+ """Tests for copilot-cli installation logic."""
+
+ def test_install_cli_calls_execute(self, mock_environment):
+ from microbots.bot.CopilotBot import CopilotBot
+
+ with (
+ patch("microbots.bot.CopilotBot.get_free_port", side_effect=[9000]),
+ patch("microbots.bot.CopilotBot.CopilotBot._start_copilot_cli_server"),
+ patch("microbots.bot.CopilotBot.CopilotBot._wait_for_cli_ready"),
+ patch("copilot.CopilotClient", return_value=AsyncMock()),
+ patch("copilot.ExternalServerConfig", return_value=MagicMock()),
+ ):
+ bot = CopilotBot(
+ model="gpt-4.1",
+ environment=mock_environment,
+ github_token="ghp_test",
+ )
+ # _install_copilot_cli was called during __init__
+ # Verify that execute was called with npm install command
+            calls = [str(c) for c in mock_environment.execute.call_args_list]
+            install_calls = [c for c in calls if "npm install" in c or "copilot" in c]
+            assert len(install_calls) > 0, "Expected copilot-cli install commands"
+ bot.stop()
+
+ def test_install_cli_raises_on_failure(self, mock_environment):
+ from microbots.bot.CopilotBot import CopilotBot
+
+ fail_return = MagicMock()
+ fail_return.return_code = 1
+ fail_return.stdout = ""
+ fail_return.stderr = "npm ERR! not found"
+ mock_environment.execute = MagicMock(return_value=fail_return)
+
+ with (
+ patch("microbots.bot.CopilotBot.get_free_port", side_effect=[9000]),
+ patch("microbots.bot.CopilotBot.CopilotBot._start_copilot_cli_server"),
+ patch("microbots.bot.CopilotBot.CopilotBot._wait_for_cli_ready"),
+ patch("copilot.CopilotClient", return_value=AsyncMock()),
+ patch("copilot.ExternalServerConfig", return_value=MagicMock()),
+ ):
+ with pytest.raises(RuntimeError, match="Failed to install copilot-cli"):
+ CopilotBot(
+ model="gpt-4.1",
+ environment=mock_environment,
+ github_token="ghp_test",
+ )
+
+
+# ---------------------------------------------------------------------------
+# Unit tests — resolve_auth_config and BYOK
+# ---------------------------------------------------------------------------
+
+@pytest.mark.unit
+class TestResolveAuthConfig:
+ """Tests for the standalone resolve_auth_config function."""
+
+ def test_explicit_api_key_returns_byok_provider(self):
+ from microbots.bot.CopilotBot import resolve_auth_config
+
+ model, gh_token, provider = resolve_auth_config(
+ model="gpt-4.1",
+ api_key="sk-test-key",
+ base_url="https://api.openai.com/v1",
+ )
+ assert model == "gpt-4.1"
+ assert gh_token is None
+ assert provider is not None
+ assert provider["type"] == "openai"
+ assert provider["base_url"] == "https://api.openai.com/v1"
+ assert provider["api_key"] == "sk-test-key"
+ assert "bearer_token" not in provider
+
+ def test_explicit_bearer_token_takes_precedence_over_api_key(self):
+ from microbots.bot.CopilotBot import resolve_auth_config
+
+ _, _, provider = resolve_auth_config(
+ model="gpt-4.1",
+ api_key="sk-key",
+ bearer_token="my-bearer",
+ base_url="https://api.openai.com/v1",
+ )
+ assert provider["bearer_token"] == "my-bearer"
+ assert "api_key" not in provider
+
+ def test_explicit_api_key_without_base_url_raises(self):
+ from microbots.bot.CopilotBot import resolve_auth_config
+
+ with pytest.raises(ValueError, match="base_url"):
+ resolve_auth_config(model="gpt-4.1", api_key="sk-test")
+
+ def test_azure_provider_type_includes_api_version(self):
+ from microbots.bot.CopilotBot import resolve_auth_config
+
+ _, _, provider = resolve_auth_config(
+ model="gpt-4.1",
+ api_key="azure-key",
+ base_url="https://my-resource.openai.azure.com",
+ provider_type="azure",
+ azure_api_version="2024-10-21",
+ )
+ assert provider["type"] == "azure"
+ assert provider["azure"] == {"api_version": "2024-10-21"}
+
+ def test_wire_api_included_when_set(self):
+ from microbots.bot.CopilotBot import resolve_auth_config
+
+ _, _, provider = resolve_auth_config(
+ model="gpt-5",
+ api_key="key",
+ base_url="https://endpoint.com/v1",
+ wire_api="responses",
+ )
+ assert provider["wire_api"] == "responses"
+
+ def test_env_vars_resolve_byok(self, monkeypatch):
+ from microbots.bot.CopilotBot import resolve_auth_config
+
+ monkeypatch.setenv("COPILOT_BYOK_BASE_URL", "https://env-endpoint.com/v1")
+ monkeypatch.setenv("COPILOT_BYOK_API_KEY", "env-key")
+ monkeypatch.setenv("COPILOT_BYOK_PROVIDER_TYPE", "anthropic")
+ monkeypatch.setenv("COPILOT_BYOK_MODEL", "claude-sonnet-4.5")
+
+ model, gh_token, provider = resolve_auth_config(model="gpt-4.1")
+ assert model == "claude-sonnet-4.5"
+ assert gh_token is None
+ assert provider["type"] == "anthropic"
+ assert provider["base_url"] == "https://env-endpoint.com/v1"
+ assert provider["api_key"] == "env-key"
+
+ def test_env_vars_bearer_token(self, monkeypatch):
+ from microbots.bot.CopilotBot import resolve_auth_config
+
+ monkeypatch.setenv("COPILOT_BYOK_BASE_URL", "https://endpoint.com/v1")
+ monkeypatch.setenv("COPILOT_BYOK_BEARER_TOKEN", "env-bearer")
+
+ _, _, provider = resolve_auth_config(model="gpt-4.1")
+ assert provider["bearer_token"] == "env-bearer"
+ assert "api_key" not in provider
+
+ def test_env_vars_ignored_when_explicit_key_provided(self, monkeypatch):
+ from microbots.bot.CopilotBot import resolve_auth_config
+
+ monkeypatch.setenv("COPILOT_BYOK_BASE_URL", "https://env-endpoint.com/v1")
+ monkeypatch.setenv("COPILOT_BYOK_API_KEY", "env-key")
+
+ _, _, provider = resolve_auth_config(
+ model="gpt-4.1",
+ api_key="explicit-key",
+ base_url="https://explicit.com/v1",
+ )
+ assert provider["api_key"] == "explicit-key"
+ assert provider["base_url"] == "https://explicit.com/v1"
+
+ def test_token_provider_returns_byok_with_bearer(self):
+ from microbots.bot.CopilotBot import resolve_auth_config
+
+ _, _, provider = resolve_auth_config(
+ model="gpt-4.1",
+ base_url="https://azure.endpoint.com/v1",
+ token_provider=lambda: "ad-token-123",
+ )
+ assert provider["bearer_token"] == "ad-token-123"
+ assert "api_key" not in provider
+
+ def test_token_provider_without_base_url_raises(self):
+ from microbots.bot.CopilotBot import resolve_auth_config
+
+ with pytest.raises(ValueError, match="base_url"):
+ resolve_auth_config(
+ model="gpt-4.1",
+ token_provider=lambda: "token",
+ )
+
+ def test_token_provider_not_callable_raises(self):
+ from microbots.bot.CopilotBot import resolve_auth_config
+
+ with pytest.raises(ValueError, match="callable"):
+ resolve_auth_config(
+ model="gpt-4.1",
+ base_url="https://endpoint.com/v1",
+ token_provider="not-a-callable",
+ )
+
+ def test_token_provider_returning_empty_raises(self):
+ from microbots.bot.CopilotBot import resolve_auth_config
+
+ with pytest.raises(ValueError, match="non-empty"):
+ resolve_auth_config(
+ model="gpt-4.1",
+ base_url="https://endpoint.com/v1",
+ token_provider=lambda: "",
+ )
+
+ def test_token_provider_exception_raises(self):
+ from microbots.bot.CopilotBot import resolve_auth_config
+
+ def bad_provider():
+ raise RuntimeError("auth failed")
+
+ with pytest.raises(ValueError, match="auth failed"):
+ resolve_auth_config(
+ model="gpt-4.1",
+ base_url="https://endpoint.com/v1",
+ token_provider=bad_provider,
+ )
+
+ def test_fallback_to_github_token(self):
+ from microbots.bot.CopilotBot import resolve_auth_config
+
+ model, gh_token, provider = resolve_auth_config(
+ model="gpt-4.1",
+ github_token="ghp_test123",
+ )
+ assert model == "gpt-4.1"
+ assert gh_token == "ghp_test123"
+ assert provider is None
+
+ def test_default_provider_type_is_openai(self):
+ from microbots.bot.CopilotBot import resolve_auth_config
+
+ _, _, provider = resolve_auth_config(
+ model="m", api_key="k", base_url="https://x.com/v1"
+ )
+ assert provider["type"] == "openai"
+
+ def test_anthropic_provider_type(self):
+ from microbots.bot.CopilotBot import resolve_auth_config
+
+ _, _, provider = resolve_auth_config(
+ model="claude-sonnet-4.5",
+ api_key="ant-key",
+ base_url="https://api.anthropic.com",
+ provider_type="anthropic",
+ )
+ assert provider["type"] == "anthropic"
+
+
+@pytest.mark.unit
+class TestCopilotBotBYOKInit:
+ """Tests for CopilotBot initialisation with BYOK parameters."""
+
+ def test_byok_api_key_sets_provider_config(self, mock_environment, mock_copilot_client):
+ with (
+ patch("microbots.bot.CopilotBot.LocalDockerEnvironment", return_value=mock_environment),
+ patch("microbots.bot.CopilotBot.get_free_port", side_effect=[9000]),
+ patch("microbots.bot.CopilotBot.CopilotBot._install_copilot_cli"),
+ patch("microbots.bot.CopilotBot.CopilotBot._start_copilot_cli_server"),
+ patch("microbots.bot.CopilotBot.CopilotBot._wait_for_cli_ready"),
+ patch("copilot.CopilotClient", return_value=mock_copilot_client),
+ patch("copilot.ExternalServerConfig", return_value=MagicMock()),
+ ):
+ from microbots.bot.CopilotBot import CopilotBot
+ bot = CopilotBot(
+ model="gpt-4.1",
+ environment=mock_environment,
+ api_key="sk-byok-key",
+ base_url="https://api.openai.com/v1",
+ )
+ assert bot._provider_config is not None
+ assert bot._provider_config["api_key"] == "sk-byok-key"
+ assert bot.github_token is None
+ bot.stop()
+
+ def test_byok_token_provider_sets_provider_config(self, mock_environment, mock_copilot_client):
+ with (
+ patch("microbots.bot.CopilotBot.LocalDockerEnvironment", return_value=mock_environment),
+ patch("microbots.bot.CopilotBot.get_free_port", side_effect=[9000]),
+ patch("microbots.bot.CopilotBot.CopilotBot._install_copilot_cli"),
+ patch("microbots.bot.CopilotBot.CopilotBot._start_copilot_cli_server"),
+ patch("microbots.bot.CopilotBot.CopilotBot._wait_for_cli_ready"),
+ patch("copilot.CopilotClient", return_value=mock_copilot_client),
+ patch("copilot.ExternalServerConfig", return_value=MagicMock()),
+ ):
+ from microbots.bot.CopilotBot import CopilotBot
+ bot = CopilotBot(
+ model="gpt-4.1",
+ environment=mock_environment,
+ base_url="https://azure.endpoint.com/v1",
+ token_provider=lambda: "ad-token-xyz",
+ )
+ assert bot._provider_config is not None
+ assert bot._provider_config["bearer_token"] == "ad-token-xyz"
+ assert bot.github_token is None
+ bot.stop()
+
+ def test_native_auth_has_no_provider_config(self, copilot_bot):
+ assert copilot_bot._provider_config is None
+ assert copilot_bot.github_token == "ghp_test_token_123"
+
+
+# ---------------------------------------------------------------------------
+# Helper context manager shared by several new test classes
+# ---------------------------------------------------------------------------
+
+def _standard_init_patches(mock_environment, mock_copilot_client):
+    """Return the standard tuple of patch context managers for CopilotBot init."""
+ return (
+ patch("microbots.bot.CopilotBot.LocalDockerEnvironment", return_value=mock_environment),
+ patch("microbots.bot.CopilotBot.get_free_port", side_effect=[9000]),
+ patch("microbots.bot.CopilotBot.CopilotBot._install_copilot_cli"),
+ patch("microbots.bot.CopilotBot.CopilotBot._start_copilot_cli_server"),
+ patch("microbots.bot.CopilotBot.CopilotBot._wait_for_cli_ready"),
+ patch("copilot.CopilotClient", return_value=mock_copilot_client),
+ patch("copilot.ExternalServerConfig", return_value=MagicMock()),
+ )
+
+
+# ---------------------------------------------------------------------------
+# Unit tests — __init__ with folder_to_mount and auto-created environment
+# ---------------------------------------------------------------------------
+
+@pytest.mark.unit
+class TestCopilotBotInitFolderMount:
+ """Tests for __init__ paths: folder_to_mount string and auto environment."""
+
+ def test_folder_to_mount_creates_mount_object(self, mock_environment, mock_copilot_client):
+ """When folder_to_mount string is provided, a Mount is stored."""
+ with (
+ patch("microbots.bot.CopilotBot.LocalDockerEnvironment", return_value=mock_environment),
+ patch("microbots.bot.CopilotBot.get_free_port", side_effect=[9000]),
+ patch("microbots.bot.CopilotBot.CopilotBot._install_copilot_cli"),
+ patch("microbots.bot.CopilotBot.CopilotBot._start_copilot_cli_server"),
+ patch("microbots.bot.CopilotBot.CopilotBot._wait_for_cli_ready"),
+ patch("copilot.CopilotClient", return_value=mock_copilot_client),
+ patch("copilot.ExternalServerConfig", return_value=MagicMock()),
+ ):
+ from microbots.bot.CopilotBot import CopilotBot
+ bot = CopilotBot(
+ model="gpt-4.1",
+ environment=mock_environment,
+ github_token="ghp_test",
+ folder_to_mount="/tmp/test_repo",
+ )
+ assert bot.folder_to_mount is not None
+ assert "test_repo" in bot.folder_to_mount.sandbox_path
+ bot.stop()
+
+ def test_auto_creates_environment_when_none(self, mock_environment, mock_copilot_client):
+ """When environment=None, LocalDockerEnvironment is instantiated."""
+ mock_environment.get_ipv4_address = MagicMock(return_value="172.17.0.2")
+ with (
+ patch(
+ "microbots.bot.CopilotBot.LocalDockerEnvironment",
+ return_value=mock_environment,
+ ) as mock_lde,
+ patch("microbots.bot.CopilotBot.get_free_port", return_value=9000),
+ patch("microbots.bot.CopilotBot.CopilotBot._install_copilot_cli"),
+ patch("microbots.bot.CopilotBot.CopilotBot._start_copilot_cli_server"),
+ patch("microbots.bot.CopilotBot.CopilotBot._wait_for_cli_ready"),
+ patch("copilot.CopilotClient", return_value=mock_copilot_client),
+ patch("copilot.ExternalServerConfig", return_value=MagicMock()),
+ ):
+ from microbots.bot.CopilotBot import CopilotBot
+ bot = CopilotBot(
+ model="gpt-4.1",
+ github_token="ghp_test",
+ )
+ mock_lde.assert_called_once()
+ assert bot.environment is mock_environment
+ bot.stop()
+
+
+# ---------------------------------------------------------------------------
+# Unit tests — non-ExternalTool installation in __init__
+# ---------------------------------------------------------------------------
+
+@pytest.mark.unit
+class TestCopilotBotNonExternalToolInstall:
+ """Tests that non-ExternalTool tools are installed during __init__."""
+
+ def test_non_external_tool_install_and_verify_called(
+ self, mock_environment, mock_copilot_client
+ ):
+ """install_tool and verify_tool_installation are called for regular tools."""
+ from microbots.tools.tool import ToolAbstract
+
+ mock_tool = MagicMock(spec=ToolAbstract)
+ mock_tool.name = "my_tool"
+ mock_tool.usage_instructions_to_llm = "Use my_tool"
+
+ with (
+ patch("microbots.bot.CopilotBot.LocalDockerEnvironment", return_value=mock_environment),
+ patch("microbots.bot.CopilotBot.get_free_port", side_effect=[9000]),
+ patch("microbots.bot.CopilotBot.CopilotBot._install_copilot_cli"),
+ patch("microbots.bot.CopilotBot.CopilotBot._start_copilot_cli_server"),
+ patch("microbots.bot.CopilotBot.CopilotBot._wait_for_cli_ready"),
+ patch("copilot.CopilotClient", return_value=mock_copilot_client),
+ patch("copilot.ExternalServerConfig", return_value=MagicMock()),
+ ):
+ from microbots.bot.CopilotBot import CopilotBot
+ bot = CopilotBot(
+ model="gpt-4.1",
+ environment=mock_environment,
+ github_token="ghp_test",
+ additional_tools=[mock_tool],
+ )
+ mock_tool.install_tool.assert_called_once_with(mock_environment)
+ mock_tool.verify_tool_installation.assert_called_once_with(mock_environment)
+ bot.stop()
+
+
+# ---------------------------------------------------------------------------
+# Unit tests — _install_copilot_cli verification failure
+# ---------------------------------------------------------------------------
+
+@pytest.mark.unit
+class TestCopilotBotCLIVerification:
+ """Tests that copilot-cli verification failure raises RuntimeError."""
+
+ def test_install_cli_verify_fails_raises(self, mock_environment):
+ """RuntimeError raised when install commands succeed but 'copilot --version' fails."""
+ from microbots.bot.CopilotBot import CopilotBot
+
+ success_ret = MagicMock()
+ success_ret.return_code = 0
+ success_ret.stdout = ""
+ success_ret.stderr = ""
+
+ fail_ret = MagicMock()
+ fail_ret.return_code = 1
+ fail_ret.stdout = ""
+ fail_ret.stderr = "command not found: copilot"
+
+ def side_effect(cmd, **kwargs):
+ if "copilot --version" in cmd:
+ return fail_ret
+ return success_ret
+
+ mock_environment.execute = MagicMock(side_effect=side_effect)
+
+ with (
+ patch("microbots.bot.CopilotBot.get_free_port", side_effect=[9000]),
+ patch("microbots.bot.CopilotBot.CopilotBot._start_copilot_cli_server"),
+ patch("microbots.bot.CopilotBot.CopilotBot._wait_for_cli_ready"),
+ patch("copilot.CopilotClient", return_value=AsyncMock()),
+ patch("copilot.ExternalServerConfig", return_value=MagicMock()),
+ ):
+ with pytest.raises(RuntimeError, match="verification failed"):
+ CopilotBot(
+ model="gpt-4.1",
+ environment=mock_environment,
+ github_token="ghp_test",
+ )
+
+
+# ---------------------------------------------------------------------------
+# Unit tests — _start_copilot_cli_server
+# ---------------------------------------------------------------------------
+
+@pytest.mark.unit
+class TestCopilotBotStartServer:
+ """Tests for _start_copilot_cli_server()."""
+
+ def _make_bot_for_server_test(self, mock_environment, mock_copilot_client, github_token=None, provider_config=None):
+ """Build a CopilotBot with _start_copilot_cli_server NOT patched."""
+ with (
+ patch("microbots.bot.CopilotBot.LocalDockerEnvironment", return_value=mock_environment),
+ patch("microbots.bot.CopilotBot.get_free_port", side_effect=[9000]),
+ patch("microbots.bot.CopilotBot.CopilotBot._install_copilot_cli"),
+ patch("microbots.bot.CopilotBot.CopilotBot._wait_for_cli_ready"),
+ patch("copilot.CopilotClient", return_value=mock_copilot_client),
+ patch("copilot.ExternalServerConfig", return_value=MagicMock()),
+ ):
+ from microbots.bot.CopilotBot import CopilotBot
+ if github_token:
+ bot = CopilotBot(
+ model="gpt-4.1",
+ environment=mock_environment,
+ github_token=github_token,
+ )
+ elif provider_config:
+ bot = CopilotBot(
+ model="gpt-4.1",
+ environment=mock_environment,
+ api_key="sk-key",
+ base_url="https://api.openai.com/v1",
+ )
+ else:
+ bot = CopilotBot(
+ model="gpt-4.1",
+ environment=mock_environment,
+ )
+ return bot
+
+ def test_start_server_injects_github_token(self, mock_environment, mock_copilot_client):
+ """Token injection calls when github_token is set without provider_config."""
+ bot = self._make_bot_for_server_test(
+ mock_environment, mock_copilot_client, github_token="ghp_server_test"
+ )
+ execute_args = [str(c) for c in mock_environment.execute.call_args_list]
+ assert any("GITHUB_TOKEN" in a for a in execute_args)
+ assert any("COPILOT_GITHUB_TOKEN" in a for a in execute_args)
+ bot.stop()
+
+ def test_start_server_skips_token_injection_for_byok(
+ self, mock_environment, mock_copilot_client
+ ):
+ """No token injection when BYOK provider_config is active."""
+ bot = self._make_bot_for_server_test(
+ mock_environment, mock_copilot_client, provider_config=True
+ )
+ execute_args = [str(c) for c in mock_environment.execute.call_args_list]
+ assert not any("GITHUB_TOKEN" in a for a in execute_args)
+ bot.stop()
+
+ def test_start_server_raises_on_execute_failure(self, mock_environment, mock_copilot_client):
+ """RuntimeError raised when start_cmd execute fails."""
+ fail_ret = MagicMock()
+ fail_ret.return_code = 1
+ fail_ret.stderr = "failed to start"
+
+ success_ret = MagicMock()
+ success_ret.return_code = 0
+ success_ret.stdout = ""
+ success_ret.stderr = ""
+
+ def side_effect(cmd, **kwargs):
+ if "copilot --headless" in cmd:
+ return fail_ret
+ return success_ret
+
+ mock_environment.execute = MagicMock(side_effect=side_effect)
+
+ with (
+ patch("microbots.bot.CopilotBot.LocalDockerEnvironment", return_value=mock_environment),
+ patch("microbots.bot.CopilotBot.get_free_port", side_effect=[9000]),
+ patch("microbots.bot.CopilotBot.CopilotBot._install_copilot_cli"),
+ patch("microbots.bot.CopilotBot.CopilotBot._wait_for_cli_ready"),
+ patch("copilot.CopilotClient", return_value=mock_copilot_client),
+ patch("copilot.ExternalServerConfig", return_value=MagicMock()),
+ ):
+ from microbots.bot.CopilotBot import CopilotBot
+ with pytest.raises(RuntimeError, match="Failed to start copilot-cli server"):
+ CopilotBot(
+ model="gpt-4.1",
+ environment=mock_environment,
+ github_token="ghp_test",
+ )
+
+
+# ---------------------------------------------------------------------------
+# Unit tests — _wait_for_cli_ready
+# ---------------------------------------------------------------------------
+
+@pytest.mark.unit
+class TestCopilotBotWaitReady:
+ """Tests for _wait_for_cli_ready() called directly on a minimal instance."""
+
+ def _make_minimal_bot(self):
+ """Return a bare CopilotBot instance with only environment set."""
+ from microbots.bot.CopilotBot import CopilotBot
+
+ bot = object.__new__(CopilotBot)
+ mock_env = MagicMock()
+ mock_env.get_ipv4_address = MagicMock(return_value="127.0.0.1")
+ bot.environment = mock_env
+ return bot
+
+ def test_wait_for_cli_ready_success(self):
+ """Returns immediately when socket connection succeeds."""
+ bot = self._make_minimal_bot()
+ mock_sock = MagicMock()
+ with patch("socket.create_connection", return_value=mock_sock) as mock_conn:
+ bot._wait_for_cli_ready()
+ mock_conn.assert_called_once()
+ mock_sock.close.assert_called_once()
+
+ def test_wait_for_cli_ready_timeout(self):
+ """Raises TimeoutError when connections always fail past the deadline."""
+ bot = self._make_minimal_bot()
+ with (
+ patch("socket.create_connection", side_effect=ConnectionRefusedError()),
+ patch("microbots.bot.CopilotBot.time") as mock_time,
+ ):
+ # First call sets deadline (0 + _CLI_STARTUP_TIMEOUT), second exceeds it
+ mock_time.time.side_effect = [0, 9999]
+ mock_time.sleep = MagicMock()
+ with pytest.raises(TimeoutError, match="copilot-cli did not become ready"):
+ bot._wait_for_cli_ready()
+
+ def test_wait_for_cli_ready_oserror_retries(self):
+ """OSError is caught and retried like ConnectionRefusedError."""
+ bot = self._make_minimal_bot()
+ mock_sock = MagicMock()
+ # First attempt raises OSError, second attempt succeeds
+ with patch("socket.create_connection", side_effect=[OSError("network error"), mock_sock]):
+ with patch("microbots.bot.CopilotBot.time") as mock_time:
+ mock_time.time.side_effect = [0, 1, 2]
+ mock_time.sleep = MagicMock()
+ bot._wait_for_cli_ready()
+ mock_sock.close.assert_called_once()
+
+
+# ---------------------------------------------------------------------------
+# Unit tests — run() with additional_mounts
+# ---------------------------------------------------------------------------
+
+@pytest.mark.unit
+class TestCopilotBotRunMounts:
+ """Tests for run() with additional_mounts parameter."""
+
+ def test_run_with_additional_mounts_calls_mount_additional(self, copilot_bot):
+ """_mount_additional is called for each mount in additional_mounts."""
+ from microbots.extras.mount import Mount, MountType
+
+ mock_mount = MagicMock(spec=Mount)
+ mock_mount.mount_type = MountType.COPY
+ mock_mount.host_path_info = MagicMock()
+ mock_mount.host_path_info.abs_path = "/tmp/extra"
+ mock_mount.sandbox_path = "/workdir/extra"
+
+ copilot_bot.environment.copy_to_container = MagicMock(return_value=True)
+
+ with patch.object(copilot_bot, "_mount_additional") as mock_ma:
+ copilot_bot.run("test task", additional_mounts=[mock_mount])
+ mock_ma.assert_called_once_with(mock_mount)
+
+
+# ---------------------------------------------------------------------------
+# Unit tests — _execute_session
+# ---------------------------------------------------------------------------
+
+@pytest.mark.unit
+class TestCopilotBotExecuteSession:
+ """Tests for _execute_session() paths."""
+
+ def test_execute_session_includes_provider_config(self, mock_environment, mock_copilot_client):
+ """provider is added to session kwargs when _provider_config is set."""
+ import asyncio
+
+ session = AsyncMock()
+ response = Mock()
+ response.data = Mock()
+ response.data.content = "done"
+ session.send_and_wait = AsyncMock(return_value=response)
+ session.on = MagicMock()
+ session.disconnect = AsyncMock()
+ mock_copilot_client.create_session = AsyncMock(return_value=session)
+
+ with (
+ patch("microbots.bot.CopilotBot.LocalDockerEnvironment", return_value=mock_environment),
+ patch("microbots.bot.CopilotBot.get_free_port", side_effect=[9000]),
+ patch("microbots.bot.CopilotBot.CopilotBot._install_copilot_cli"),
+ patch("microbots.bot.CopilotBot.CopilotBot._start_copilot_cli_server"),
+ patch("microbots.bot.CopilotBot.CopilotBot._wait_for_cli_ready"),
+ patch("copilot.CopilotClient", return_value=mock_copilot_client),
+ patch("copilot.ExternalServerConfig", return_value=MagicMock()),
+ ):
+ from microbots.bot.CopilotBot import CopilotBot
+ bot = CopilotBot(
+ model="gpt-4.1",
+ environment=mock_environment,
+ api_key="sk-key",
+ base_url="https://api.openai.com/v1",
+ )
+ result = asyncio.run(
+ bot._execute_session(
+ task="do something",
+ system_content="",
+ timeout=30,
+ streaming=False,
+ )
+ )
+ _, call_kwargs = mock_copilot_client.create_session.call_args
+ assert "provider" in call_kwargs
+ assert result == "done"
+ bot.stop()
+
+ def test_execute_session_includes_system_message(self, mock_environment, mock_copilot_client):
+ """system_message is added to session kwargs when system_content is non-empty."""
+ import asyncio
+
+ session = AsyncMock()
+ response = Mock()
+ response.data = Mock()
+ response.data.content = "done"
+ session.send_and_wait = AsyncMock(return_value=response)
+ session.on = MagicMock()
+ session.disconnect = AsyncMock()
+ mock_copilot_client.create_session = AsyncMock(return_value=session)
+
+ with (
+ patch("microbots.bot.CopilotBot.LocalDockerEnvironment", return_value=mock_environment),
+ patch("microbots.bot.CopilotBot.get_free_port", side_effect=[9000]),
+ patch("microbots.bot.CopilotBot.CopilotBot._install_copilot_cli"),
+ patch("microbots.bot.CopilotBot.CopilotBot._start_copilot_cli_server"),
+ patch("microbots.bot.CopilotBot.CopilotBot._wait_for_cli_ready"),
+ patch("copilot.CopilotClient", return_value=mock_copilot_client),
+ patch("copilot.ExternalServerConfig", return_value=MagicMock()),
+ ):
+ from microbots.bot.CopilotBot import CopilotBot
+ bot = CopilotBot(
+ model="gpt-4.1",
+ environment=mock_environment,
+ github_token="ghp_test",
+ )
+ asyncio.run(
+ bot._execute_session(
+ task="do something",
+ system_content="You are a helper.",
+ timeout=30,
+ streaming=False,
+ )
+ )
+ _, call_kwargs = mock_copilot_client.create_session.call_args
+ assert "system_message" in call_kwargs
+ assert call_kwargs["system_message"]["content"] == "You are a helper."
+ bot.stop()
+
+ def test_execute_session_returns_collected_event_text(
+ self, mock_environment, mock_copilot_client
+ ):
+ """Returns last collected text when send_and_wait returns no content."""
+ import asyncio
+
+ session = AsyncMock()
+ # send_and_wait returns response with no content
+ empty_response = Mock()
+ empty_response.data = Mock()
+ empty_response.data.content = ""
+ session.send_and_wait = AsyncMock(return_value=empty_response)
+ session.disconnect = AsyncMock()
+
+ captured = []
+
+ def capture_on(callback):
+ captured.append(callback)
+
+ session.on = MagicMock(side_effect=capture_on)
+ mock_copilot_client.create_session = AsyncMock(return_value=session)
+
+ with (
+ patch("microbots.bot.CopilotBot.LocalDockerEnvironment", return_value=mock_environment),
+ patch("microbots.bot.CopilotBot.get_free_port", side_effect=[9000]),
+ patch("microbots.bot.CopilotBot.CopilotBot._install_copilot_cli"),
+ patch("microbots.bot.CopilotBot.CopilotBot._start_copilot_cli_server"),
+ patch("microbots.bot.CopilotBot.CopilotBot._wait_for_cli_ready"),
+ patch("copilot.CopilotClient", return_value=mock_copilot_client),
+ patch("copilot.ExternalServerConfig", return_value=MagicMock()),
+ ):
+ from microbots.bot.CopilotBot import CopilotBot
+ from copilot.generated.session_events import SessionEventType
+
+ bot = CopilotBot(
+ model="gpt-4.1",
+ environment=mock_environment,
+ github_token="ghp_test",
+ )
+
+ # Simulate an ASSISTANT_MESSAGE event arriving before send_and_wait returns
+ async def _send_and_wait_with_event(task, timeout):
+ if captured:
+ msg_event = Mock()
+ msg_event.type = SessionEventType.ASSISTANT_MESSAGE
+ msg_event.data = Mock()
+ msg_event.data.content = "from event"
+ captured[0](msg_event)
+ return empty_response
+
+ session.send_and_wait = _send_and_wait_with_event
+
+ result = asyncio.run(
+ bot._execute_session(
+ task="do something",
+ system_content="",
+ timeout=30,
+ streaming=False,
+ )
+ )
+ assert result == "from event"
+ bot.stop()
+
+ def test_execute_session_returns_fallback_when_no_content(
+ self, mock_environment, mock_copilot_client
+ ):
+ """Returns fallback message when no text is collected at all."""
+ import asyncio
+
+ session = AsyncMock()
+ empty_response = Mock()
+ empty_response.data = Mock()
+ empty_response.data.content = ""
+ session.send_and_wait = AsyncMock(return_value=empty_response)
+ session.on = MagicMock()
+ session.disconnect = AsyncMock()
+ mock_copilot_client.create_session = AsyncMock(return_value=session)
+
+ with (
+ patch("microbots.bot.CopilotBot.LocalDockerEnvironment", return_value=mock_environment),
+ patch("microbots.bot.CopilotBot.get_free_port", side_effect=[9000]),
+ patch("microbots.bot.CopilotBot.CopilotBot._install_copilot_cli"),
+ patch("microbots.bot.CopilotBot.CopilotBot._start_copilot_cli_server"),
+ patch("microbots.bot.CopilotBot.CopilotBot._wait_for_cli_ready"),
+ patch("copilot.CopilotClient", return_value=mock_copilot_client),
+ patch("copilot.ExternalServerConfig", return_value=MagicMock()),
+ ):
+ from microbots.bot.CopilotBot import CopilotBot
+ bot = CopilotBot(
+ model="gpt-4.1",
+ environment=mock_environment,
+ github_token="ghp_test",
+ )
+ result = asyncio.run(
+ bot._execute_session(
+ task="do something",
+ system_content="",
+ timeout=1,
+ streaming=False,
+ )
+ )
+ assert "without producing" in result
+ bot.stop()
+
+ def test_execute_session_on_event_handlers(self, mock_environment, mock_copilot_client):
+ """_on_event handles ASSISTANT_MESSAGE_DELTA, SESSION_IDLE, and unknown types."""
+ import asyncio
+
+ session = AsyncMock()
+ empty_response = Mock()
+ empty_response.data = Mock()
+ empty_response.data.content = ""
+ session.disconnect = AsyncMock()
+
+ captured = []
+
+ def capture_on(callback):
+ captured.append(callback)
+
+ session.on = MagicMock(side_effect=capture_on)
+
+ async def _send_and_wait_with_events(task, timeout):
+ if captured:
+ cb = captured[0]
+ # ASSISTANT_MESSAGE_DELTA with delta_content
+ delta_event = Mock()
+ delta_event.type = "assistant.message_delta"
+ delta_event.data = Mock()
+ delta_event.data.delta_content = "partial"
+ cb(delta_event)
+
+ # SESSION_IDLE
+ idle_event = Mock()
+ idle_event.type = "session.idle"
+ cb(idle_event)
+
+ # Unknown event type
+ unknown_event = Mock()
+ unknown_event.type = "some.other.event"
+ cb(unknown_event)
+
+ # ASSISTANT_MESSAGE with no content (data.content is empty)
+ msg_empty = Mock()
+ msg_empty.type = "assistant.message"
+ msg_empty.data = Mock()
+ msg_empty.data.content = ""
+ cb(msg_empty)
+
+ return empty_response
+
+ session.send_and_wait = _send_and_wait_with_events
+ mock_copilot_client.create_session = AsyncMock(return_value=session)
+
+ with (
+ patch("microbots.bot.CopilotBot.LocalDockerEnvironment", return_value=mock_environment),
+ patch("microbots.bot.CopilotBot.get_free_port", side_effect=[9000]),
+ patch("microbots.bot.CopilotBot.CopilotBot._install_copilot_cli"),
+ patch("microbots.bot.CopilotBot.CopilotBot._start_copilot_cli_server"),
+ patch("microbots.bot.CopilotBot.CopilotBot._wait_for_cli_ready"),
+ patch("copilot.CopilotClient", return_value=mock_copilot_client),
+ patch("copilot.ExternalServerConfig", return_value=MagicMock()),
+ ):
+ from microbots.bot.CopilotBot import CopilotBot
+ bot = CopilotBot(
+ model="gpt-4.1",
+ environment=mock_environment,
+ github_token="ghp_test",
+ )
+ # Should not raise — covers all branches of _on_event
+ asyncio.run(
+ bot._execute_session(
+ task="do something",
+ system_content="",
+ timeout=5,
+ streaming=False,
+ )
+ )
+ bot.stop()
+
+
+# ---------------------------------------------------------------------------
+# Unit tests — _on_pre_tool_use and _on_post_tool_use hooks
+# ---------------------------------------------------------------------------
+
+@pytest.mark.unit
+class TestCopilotBotToolUseHooks:
+ """Tests for _on_pre_tool_use() and _on_post_tool_use() async hooks."""
+
+    def test_on_pre_tool_use_returns_allow(self, copilot_bot):
+        """The pre-tool-use hook allows every tool invocation."""
+        import asyncio
+
+ result = asyncio.run(
+ copilot_bot._on_pre_tool_use(
+ {"toolName": "bash", "toolArgs": {"command": "ls"}},
+ None,
+ )
+ )
+ assert result == {"permissionDecision": "allow"}
+
+    def test_on_pre_tool_use_missing_keys(self, copilot_bot):
+        """An event payload with missing keys still yields an allow decision."""
+        import asyncio
+
+ result = asyncio.run(copilot_bot._on_pre_tool_use({}, None))
+ assert result == {"permissionDecision": "allow"}
+
+    def test_on_post_tool_use_returns_empty_dict(self, copilot_bot):
+        """The post-tool-use hook returns an empty dict for a normal result."""
+        import asyncio
+
+ result = asyncio.run(
+ copilot_bot._on_post_tool_use(
+ {"toolName": "bash", "toolResult": "output here"},
+ None,
+ )
+ )
+ assert result == {}
+
+    def test_on_post_tool_use_truncates_long_result(self, copilot_bot):
+        """A very long tool result is handled without raising."""
+        import asyncio
+
+ long_result = "x" * 600
+ # Should not raise even with a very long result string
+ result = asyncio.run(
+ copilot_bot._on_post_tool_use(
+ {"toolName": "bash", "toolResult": long_result},
+ None,
+ )
+ )
+ assert result == {}
+
+    def test_on_post_tool_use_missing_keys(self, copilot_bot):
+        """An event payload with missing keys returns an empty dict."""
+        import asyncio
+
+ result = asyncio.run(copilot_bot._on_post_tool_use({}, None))
+ assert result == {}
+
+
+# ---------------------------------------------------------------------------
+# Unit tests — _mount_additional
+# ---------------------------------------------------------------------------
+
+@pytest.mark.unit
+class TestCopilotBotMountAdditional:
+ """Tests for _mount_additional()."""
+
+ def test_mount_additional_non_copy_raises(self, copilot_bot):
+ """ValueError raised for non-COPY mount type."""
+        from microbots.extras.mount import MountType
+
+        mock_mount = MagicMock()
+        mock_mount.mount_type = MountType.MOUNT  # not COPY
+ with pytest.raises(ValueError, match="Only COPY mount type"):
+ copilot_bot._mount_additional(mock_mount)
+
+ def test_mount_additional_copy_fails_raises(self, copilot_bot):
+ """ValueError raised when copy_to_container returns False."""
+ from microbots.extras.mount import MountType
+
+ mock_mount = MagicMock()
+ mock_mount.mount_type = MountType.COPY
+ mock_mount.host_path_info = MagicMock()
+ mock_mount.host_path_info.abs_path = "/host/path"
+ mock_mount.sandbox_path = "/workdir/path"
+
+ copilot_bot.environment.copy_to_container = MagicMock(return_value=False)
+ with pytest.raises(ValueError, match="Failed to copy additional mount"):
+ copilot_bot._mount_additional(mock_mount)
+
+ def test_mount_additional_copy_succeeds(self, copilot_bot):
+ """No error raised when copy_to_container succeeds."""
+ from microbots.extras.mount import MountType
+
+ mock_mount = MagicMock()
+ mock_mount.mount_type = MountType.COPY
+ mock_mount.host_path_info = MagicMock()
+ mock_mount.host_path_info.abs_path = "/host/path"
+ mock_mount.sandbox_path = "/workdir/path"
+
+ copilot_bot.environment.copy_to_container = MagicMock(return_value=True)
+ copilot_bot._mount_additional(mock_mount) # should not raise
+
+
+# ---------------------------------------------------------------------------
+# Integration tests — require real Docker + copilot-cli + auth
+# ---------------------------------------------------------------------------
+
+_skip_no_copilot_cli = pytest.mark.skipif(
+ not _copilot_cli_available(),
+ reason="GitHub Copilot CLI not installed (copilot not in PATH)",
+)
+
+_skip_no_copilot_sdk = pytest.mark.skipif(
+ not _copilot_sdk_installed(),
+ reason="github-copilot-sdk not installed (pip install microbots[ghcp])",
+)
+
+_skip_no_copilot_auth = pytest.mark.skipif(
+ not _copilot_auth_available(),
+ reason="No GitHub auth available (set GITHUB_TOKEN or run 'gh auth login')",
+)
+
+
+@_skip_no_copilot_cli
+@_skip_no_copilot_sdk
+@_skip_no_copilot_auth
+@pytest.mark.integration
+@pytest.mark.slow
+class TestCopilotBotIntegration:
+ """End-to-end integration tests with real Copilot SDK."""
+
+ def test_simple_task(self, test_repo, issue_1):
+ """CopilotBot can fix a simple syntax error."""
+ _restore_real_copilot_modules()
+ from microbots.bot.CopilotBot import CopilotBot
+
+ issue_text = issue_1[0]
+ verify_function = issue_1[1]
+
+ bot = CopilotBot(
+ model="gpt-4.1",
+ folder_to_mount=str(test_repo),
+ permission="READ_WRITE",
+ )
+
+ try:
+ result = bot.run(
+ issue_text,
+ timeout_in_seconds=300,
+ )
+ assert result.status is True, f"CopilotBot failed: {result.error}"
+ verify_function(test_repo)
+ finally:
+ bot.stop()
+
+
+# ---------------------------------------------------------------------------
+# BYOK helpers
+# ---------------------------------------------------------------------------
+
+def _byok_openai_available():
+ """Check if OpenAI BYOK credentials are configured via env vars."""
+ return bool(
+ os.environ.get("OPEN_AI_KEY")
+ and os.environ.get("OPEN_AI_END_POINT")
+ )
+
+
+_skip_no_byok_openai = pytest.mark.skipif(
+ not _byok_openai_available(),
+ reason="OpenAI BYOK not configured (set OPEN_AI_KEY and OPEN_AI_END_POINT)",
+)
+
+
+@_skip_no_copilot_cli
+@_skip_no_copilot_sdk
+@_skip_no_byok_openai
+@pytest.mark.integration
+@pytest.mark.slow
+class TestCopilotBotBYOKOpenAIIntegration:
+ """End-to-end integration tests for CopilotBot with OpenAI BYOK."""
+
+ def test_byok_openai_simple_task(self, test_repo, issue_1):
+ """CopilotBot can fix a simple syntax error using OpenAI BYOK credentials."""
+ _restore_real_copilot_modules()
+ from microbots.bot.CopilotBot import CopilotBot
+
+ issue_text = issue_1[0]
+ verify_function = issue_1[1]
+
+ api_key = os.environ["OPEN_AI_KEY"]
+ base_url = os.environ["OPEN_AI_END_POINT"]
+ model = os.getenv(
+ "AZURE_OPENAI_DEPLOYMENT_NAME", "mini-swe-agent-gpt5"
+ )
+
+ bot = CopilotBot(
+ model=model,
+ folder_to_mount=str(test_repo),
+ permission="READ_WRITE",
+ api_key=api_key,
+ base_url=base_url,
+ provider_type="openai",
+ )
+
+ try:
+ assert bot._provider_config is not None
+ assert bot._provider_config["type"] == "openai"
+ assert bot.github_token is None
+
+ result = bot.run(
+ issue_text,
+ timeout_in_seconds=300,
+ )
+ assert result.status is True, f"CopilotBot BYOK run failed: {result.error}"
+ verify_function(test_repo)
+ finally:
+ bot.stop()
diff --git a/test/bot/test_writing_bot_copilot.py b/test/bot/test_writing_bot_copilot.py
new file mode 100644
index 0000000..c515c17
--- /dev/null
+++ b/test/bot/test_writing_bot_copilot.py
@@ -0,0 +1,130 @@
+"""
+Integration test for CopilotApi — end-to-end code fix using GitHub Copilot.
+
+This test uses the WritingBot with the `github-copilot` provider to fix
+a real syntax error (missing colon) from the SWE-agent test repository.
+
+Prerequisites:
+ - GitHub Copilot CLI installed and in PATH (`copilot --version`)
+ - Authenticated via `copilot` login or GITHUB_TOKEN / GH_TOKEN env var
+ - Active GitHub Copilot subscription
+ - Install the ghcp extra: `pip install microbots[ghcp]`
+ - Docker daemon running
+
+Usage:
+------
+ # Run the integration test:
+ pytest test/bot/test_writing_bot_copilot.py -v -m "integration"
+"""
+
+import os
+import shutil
+import subprocess
+import sys
+
+import pytest
+
+sys.path.insert(
+ 0, os.path.abspath(os.path.join(os.path.dirname(__file__), "../../src"))
+)
+
+import logging
+
+logging.basicConfig(level=logging.INFO)
+
+from microbots import WritingBot, BotRunResult
+
+
+def _copilot_cli_available():
+ """Check if the Copilot CLI is installed and accessible."""
+ return shutil.which("copilot") is not None
+
+
+def _copilot_sdk_installed():
+ """Check if the github-copilot-sdk package is installed."""
+ try:
+ import copilot # noqa: F401
+ return True
+ except ImportError:
+ return False
+
+
+def _copilot_auth_available():
+ """Check if GitHub authentication is available for Copilot."""
+ if os.environ.get("GITHUB_TOKEN"):
+ return True
+ # Check if gh CLI is authenticated
+ if shutil.which("gh"):
+ try:
+ result = subprocess.run(
+ ["gh", "auth", "status"],
+ capture_output=True, timeout=10,
+ )
+ return result.returncode == 0
+ except Exception:
+ pass
+ return False
+
+
+skip_no_copilot_cli = pytest.mark.skipif(
+ not _copilot_cli_available(),
+ reason="GitHub Copilot CLI not installed (copilot not in PATH)",
+)
+
+skip_no_copilot_sdk = pytest.mark.skipif(
+ not _copilot_sdk_installed(),
+ reason="github-copilot-sdk not installed (pip install microbots[ghcp])",
+)
+
+skip_no_copilot_auth = pytest.mark.skipif(
+ not _copilot_auth_available(),
+ reason="No GitHub auth available (set GITHUB_TOKEN or run 'gh auth login')",
+)
+
+
+@skip_no_copilot_cli
+@skip_no_copilot_sdk
+@skip_no_copilot_auth
+@pytest.mark.integration
+@pytest.mark.slow
+def test_writing_bot_copilot_fixes_syntax_error(test_repo, issue_1):
+ """
+ End-to-end test: WritingBot with GitHub Copilot fixes a syntax error.
+
+ The test-repo contains `tests/missing_colon.py` with a SyntaxError
+ (missing colon on a function definition). The WritingBot should:
+ 1. Read the error description
+ 2. Find the faulty file
+ 3. Fix the syntax error (add the missing colon)
+ 4. Verify the fix by running the script
+
+ After the bot completes, `verify_function` confirms the fix by
+ executing the script and asserting returncode == 0.
+ """
+ issue_text = issue_1[0]
+ verify_function = issue_1[1]
+
+ model = "github-copilot/gpt-4.1"
+
+ writing_bot = WritingBot(
+ model=model,
+ folder_to_mount=str(test_repo),
+ )
+
+ response: BotRunResult = writing_bot.run(
+ issue_text,
+ max_iterations=25,
+ timeout_in_seconds=300,
+ )
+
+ print(
+ f"Status: {response.status}, "
+ f"Result: {response.result}, "
+ f"Error: {response.error}"
+ )
+
+ assert response.status is True, (
+ f"WritingBot did not complete the task. Error: {response.error}"
+ )
+
+ # Verify the fix actually works: run the script, expect exit code 0
+ verify_function(test_repo)
diff --git a/test/environment/local_docker/test_local_docker_environment.py b/test/environment/local_docker/test_local_docker_environment.py
index 266f9f2..e67f495 100644
--- a/test/environment/local_docker/test_local_docker_environment.py
+++ b/test/environment/local_docker/test_local_docker_environment.py
@@ -492,3 +492,54 @@ def test_execute_redacts_command_when_sensitive(self, caplog):
assert "" in caplog.text
assert "SECRET_TOKEN" not in caplog.text
assert "abc123" not in caplog.text
+
+
+@pytest.mark.unit
+class TestGetIpv4Address:
+ """Unit tests for LocalDockerEnvironment.get_ipv4_address"""
+
+ def _make_env(self):
+ """Create a bare LocalDockerEnvironment without calling __init__"""
+ env = LocalDockerEnvironment.__new__(LocalDockerEnvironment)
+ env.deleted = True
+ env.container = None
+ return env
+
+ def test_returns_ip_address_from_container_networks(self):
+ """get_ipv4_address returns the IP from the first Docker network"""
+ env = self._make_env()
+ env.container = Mock()
+ env.container.attrs = {
+ "NetworkSettings": {
+ "Networks": {
+ "bridge": {"IPAddress": "172.17.0.2"}
+ }
+ }
+ }
+
+ result = env.get_ipv4_address()
+
+ env.container.reload.assert_called_once()
+ assert result == "172.17.0.2"
+
+ def test_raises_runtime_error_when_no_container(self):
+ """get_ipv4_address raises RuntimeError when container is None"""
+ env = self._make_env()
+
+ with pytest.raises(RuntimeError, match="No active container"):
+ env.get_ipv4_address()
+
+ def test_raises_runtime_error_when_ip_is_empty(self):
+ """get_ipv4_address raises RuntimeError when IP address is empty"""
+ env = self._make_env()
+ env.container = Mock()
+ env.container.attrs = {
+ "NetworkSettings": {
+ "Networks": {
+ "bridge": {"IPAddress": ""}
+ }
+ }
+ }
+
+ with pytest.raises(RuntimeError, match="Could not determine container IP address"):
+ env.get_ipv4_address()
diff --git a/test/swe-bench-test/run_swe_bench.py b/test/swe-bench-test/run_swe_bench.py
index ee3da52..ea6c01d 100644
--- a/test/swe-bench-test/run_swe_bench.py
+++ b/test/swe-bench-test/run_swe_bench.py
@@ -11,10 +11,55 @@
0, os.path.abspath(os.path.join(os.path.dirname(__file__), "../../src"))
)
-from microbots import AgentBoss
+from microbots import AgentBoss, CopilotBot
+
+LOG_DIR = Path(__file__).parent.resolve() / "logs"
+LOG_DIR.mkdir(parents=True, exist_ok=True)
+
+LOG_FORMAT = logging.Formatter("%(asctime)s [%(levelname)s] %(name)s: %(message)s")
logger = logging.getLogger(__name__)
-logging.basicConfig(level=logging.INFO)
+logger.setLevel(logging.DEBUG)
+
+# Console output (always active)
+console_handler = logging.StreamHandler()
+console_handler.setLevel(logging.INFO)
+console_handler.setFormatter(LOG_FORMAT)
+logger.addHandler(console_handler)
+
+# Track per-instance file handlers so they can be swapped between test cases
+_active_file_handlers: list[logging.Handler] = []
+
+
+def setup_instance_logging(instance_id: str):
+ """Create per-instance log directory and swap file handlers."""
+ root = logging.getLogger()
+
+ # Remove previous instance file handlers
+ for h in _active_file_handlers:
+ root.removeHandler(h)
+ h.close()
+ _active_file_handlers.clear()
+
+ instance_log_dir = LOG_DIR / instance_id
+ instance_log_dir.mkdir(parents=True, exist_ok=True)
+
+ # Info log file
+ info_handler = logging.FileHandler(instance_log_dir / "info.log")
+ info_handler.setLevel(logging.INFO)
+ info_handler.setFormatter(LOG_FORMAT)
+
+ # Debug log file
+ debug_handler = logging.FileHandler(instance_log_dir / "debug.log")
+ debug_handler.setLevel(logging.DEBUG)
+ debug_handler.setFormatter(LOG_FORMAT)
+
+ root.setLevel(logging.DEBUG)
+ root.addHandler(info_handler)
+ root.addHandler(debug_handler)
+ _active_file_handlers.extend([info_handler, debug_handler])
+
+ logger.info("Logging for instance %s -> %s", instance_id, instance_log_dir)
# Verification method
# `pip install swebench`
@@ -94,6 +139,23 @@ def run_agent(dataset):
)
+def run_copilot_agent(dataset):
+ bot = CopilotBot(
+        model="gpt-4.1",
+ folder_to_mount=str(TEST_DIR / dataset['instance_id']),
+ permission="READ_WRITE",
+ )
+ try:
+ result = bot.run(
+            task=dataset['problem_statement']
+            + (f"\n\nHint: {dataset['hints_text']}" if dataset['hints_text'] else ""),
+ timeout_in_seconds=3600 * 4, # 4 hours
+ )
+ if not result.status:
+ logger.error(f"CopilotBot failed on {dataset['instance_id']}: {result.error}")
+ finally:
+ bot.stop()
+
+
def generate_prediction(dataset):
repo_path = TEST_DIR / dataset['instance_id']
diff_output = subprocess.run(
@@ -136,6 +198,7 @@ def test_swe_bench():
datasets = load_dataset(SWE_BENCH_SUITE, split="test")
for instance in selected_dataset:
+ setup_instance_logging(instance)
dataset = datasets.filter(lambda x: x['instance_id'] == instance)[0]
logger.info(f"DATASET: {pprint(dataset)}")
setup_test_directory(dataset)
@@ -146,5 +209,19 @@ def test_swe_bench():
verify_fix()
+def test_swe_bench_copilot():
+ datasets = load_dataset(SWE_BENCH_SUITE, split="test")
+
+ for instance in selected_dataset:
+ setup_instance_logging(instance)
+ dataset = datasets.filter(lambda x: x['instance_id'] == instance)[0]
+        logger.info(f"DATASET: {dataset}")  # pprint() returns None, so interpolate the dict directly
+ setup_test_directory(dataset)
+ run_copilot_agent(dataset)
+ generate_prediction(dataset)
+
+ verify_fix()
+
+
if __name__ == "__main__":
- test_swe_bench()
\ No newline at end of file
+ test_swe_bench_copilot()
\ No newline at end of file