Skip to content

AgentCoreMemorySessionManager: retrieval_config context accumulates O(turns × records) across multi-turn conversations #420

@jariy17

Description

@jariy17

Summary

When using AgentCoreMemorySessionManager with retrieval_config, the <user_context> XML injected by retrieve_customer_context() persists in agent.messages and repeats on every turn, growing O(turns × records).

Observed with 5 turns and 3 preferences: 3,655 chars of repeated context (83% of total message content).

Turn 1: [user+XML, assistant]
Turn 2: [user+XML, assistant, user+XML, assistant]          ← XML appears twice
Turn 3: [user+XML, assistant, user+XML, assistant, user+XML, assistant]  ← 3×

Strands Integration Affected

from bedrock_agentcore.memory.integrations.strands.config import AgentCoreMemoryConfig
from bedrock_agentcore.memory.integrations.strands.session_manager import AgentCoreMemorySessionManager

Root Cause

In session_manager.py, retrieve_customer_context() (lines 754–829) is registered as a MessageAddedEvent callback (line 840). On each turn it unconditionally inserts XML context at event.agent.messages[-1]["content"][0] (line 823):

event.agent.messages[-1]["content"].insert(
    0, {"text": f"<{self.config.context_tag}>{context_text}</{self.config.context_tag}>"}
)

Three missing controls cause the accumulation:

  1. No deduplication checkretrieve_customer_context() does not check if context is already present before injecting.
  2. No filtering on message restorelist_messages() (lines 688–707) reloads messages with previously-injected context intact. The existing _filter_restored_tool_context() (lines 709–734) strips toolUse/toolResult blocks but has no equivalent for <user_context> tags.
  3. Mutable in-place modification — The injected context is saved as part of the message content, so it reappears when the message history is reloaded on the next turn.

Impact

  • Token waste: Grows linearly per turn. 10 turns × 5 records × 500 chars/record = ~25KB of duplicate context.
  • Context window pressure: In long-running sessions this can consume a significant portion of the model's context window.
  • Cost: Unnecessary input token charges scale with conversation length.

Reproduction

from bedrock_agentcore.memory.integrations.strands.config import AgentCoreMemoryConfig, RetrievalConfig
from bedrock_agentcore.memory.integrations.strands.session_manager import AgentCoreMemorySessionManager
from strands import Agent

config = AgentCoreMemoryConfig(
    memory_id=MEMORY_ID,
    session_id=session_id,
    actor_id=actor_id,
    retrieval_config=RetrievalConfig(
        memory_id=MEMORY_ID,
        namespace="preferences",
    ),
)

with AgentCoreMemorySessionManager(config, region_name=REGION) as sm:
    agent = Agent(model=model, session_manager=sm)
    # After 3+ turns, inspect agent.messages — <user_context> XML appears N times
    # where N = number of completed turns

Suggested Fix Directions

Option A (Deduplication before injection):
Before line 823, check if the last message already contains the context tag and skip injection if present.

Option B (Filter on restore — analogous to existing tool context filtering):
Add a _filter_restored_user_context() method that strips previously-injected <user_context> tags from restored messages, similar to how _filter_restored_tool_context() already handles toolUse/toolResult.

Option C (Both):
Strip on restore (Option B) for correctness + deduplicate on inject (Option A) as a safety net.

Current Workaround

Skip retrieval_config and use the memory tool directly for LTM access:

config = AgentCoreMemoryConfig(
    memory_id=MEMORY_ID, session_id=sid, actor_id=aid,
    # no retrieval_config → no auto-injection
)
memory_tool = AgentCoreMemoryToolProvider(
    memory_id=MEMORY_ID, actor_id=aid, session_id=sid, namespace=ns, region=REGION,
)
with AgentCoreMemorySessionManager(config, region_name=REGION) as sm:
    agent = Agent(model=model, session_manager=sm, tools=memory_tool.tools)

Test Gap

The existing integration test test_session_manager_with_retrieval_config_adds_context (line ~140) only verifies context exists after 2 turns but does not assert that context does not accumulate across turns.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions