Endpoint

POST /v1/memories/retrieve

Authentication

Authorization: ApiKey mem_...

Request body

{
  "external_user_id": "customer-123",
  "query": "How should I respond to this user?",
  "limit": 5,
  "categories": ["preference", "goal"],
  "agent_id": "support-bot",
  "time_filter_days": 30,
  "format": "bullets",
  "context_max_tokens": 500
}

Schema

Field               Type                        Required  Notes
external_user_id    string                      Yes       End-user identifier inside your tenant
query               string                      Yes       Natural-language retrieval query
limit               integer                     No        Default 10, max 50
categories          MemoryCategory[]            No        Optional category filters
agent_id            string | null               No        Optional agent filter
time_filter_days    integer | null              No        Return only memories created in the last N days
format              "bullets" | "json" | "xml"  No        Default bullets
context_max_tokens  integer                     No        Default 500; max context budget for system_prompt_addition
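
The schema above can be checked on the client before a request is sent. The following is a minimal sketch, not an official client: BASE_URL is a placeholder (only the path is documented here), build_retrieve_request is a hypothetical helper name, and both the lower bound of 1 on limit and the Content-Type header are assumptions.

```python
import json
import urllib.request

# Assumption: placeholder host; the documentation only specifies the path.
BASE_URL = "https://api.example.com"
ALLOWED_FORMATS = {"bullets", "json", "xml"}


def build_retrieve_request(api_key, external_user_id, query, *, limit=10,
                           categories=None, agent_id=None,
                           time_filter_days=None, fmt="bullets",
                           context_max_tokens=500):
    """Validate arguments against the schema and return an unsent Request."""
    if not external_user_id or not query:
        raise ValueError("external_user_id and query are required")
    # The documented maximum is 50; the minimum of 1 is an assumption.
    if not 1 <= limit <= 50:
        raise ValueError("limit must be between 1 and 50")
    if fmt not in ALLOWED_FORMATS:
        raise ValueError(f"format must be one of {sorted(ALLOWED_FORMATS)}")

    payload = {
        "external_user_id": external_user_id,
        "query": query,
        "limit": limit,
        "format": fmt,
        "context_max_tokens": context_max_tokens,
    }
    # Optional filters are only included when the caller sets them.
    if categories:
        payload["categories"] = list(categories)
    if agent_id is not None:
        payload["agent_id"] = agent_id
    if time_filter_days is not None:
        payload["time_filter_days"] = time_filter_days

    return urllib.request.Request(
        f"{BASE_URL}/v1/memories/retrieve",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"ApiKey {api_key}",
            "Content-Type": "application/json",  # assumed for a JSON body
        },
        method="POST",
    )
```

The returned object can be passed to urllib.request.urlopen to perform the call. Building the request separately keeps validation testable without network access.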

Response

{
  "data": [
    {
      "id": "2b8f5f87-bbd4-4f84-9f5f-0cba5033f058",
      "content": "User prefers concise technical explanations and Python examples.",
      "category": "preference",
      "importance_score": 8.5,
      "last_accessed": "2026-04-17T09:45:00Z",
      "relevance_score": 0.962341,
      "context_snippet": "- User prefers concise technical explanations and Python examples."
    }
  ],
  "cached": false,
  "system_prompt_addition": "What you know about this user:\n- User prefers concise technical explanations and Python examples.",
  "context_token_count": 18,
  "memories_from_hot_tier": 0,
  "clarification_question": null,
  "request_id": "b0eb46a4-8794-44c8-b2a9-8f2dfbb4176c",
  "timestamp": "2026-04-17T09:45:03Z"
}

Response schema

Top-level fields

Field                   Type                  Meaning
data                    MemorySearchResult[]  Ranked memory results
cached                  boolean               Whether retrieval came from the hot cache
system_prompt_addition  string                Prompt-ready memory context
context_token_count     integer | null        Token count of the built context when available
memories_from_hot_tier  integer               Number of returned memories served from Redis hot tier
clarification_question  string | null         Optional user-facing question for resolving a pending conflict
request_id              string                Trace id
timestamp               datetime              Response timestamp
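
One way a caller might consume these fields is to append system_prompt_addition to its own system prompt and surface clarification_question when it is present. This is a sketch under assumptions: the helper names are hypothetical, and joining the two prompt parts with a blank line is a choice made here, not something the API prescribes.

```python
def build_system_prompt(base_prompt, response_body):
    """Append the prompt-ready memory context to an existing system prompt."""
    addition = response_body.get("system_prompt_addition") or ""
    if not addition:
        # No memory context was returned; leave the prompt unchanged.
        return base_prompt
    return f"{base_prompt}\n\n{addition}"


def pending_clarification(response_body):
    """Return the clarification question, or None when there is none."""
    return response_body.get("clarification_question")
```

When pending_clarification returns a string, the application can decide whether to show it to the user before continuing.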

Domain-aware retrieval

If a tenant has a domain schema enabled, the same retrieve endpoint returns domain-aware context in system_prompt_addition. For example, an EdTech tenant may receive tutoring context about exam goals, weak topics, learning style, or forgetting-stage review urgency. The request shape does not change:
{
  "external_user_id": "student_123",
  "query": "teach this student trigonometry identities",
  "limit": 8,
  "context_max_tokens": 600
}
Use the optional domain profile endpoints only when you need structured data for a UI. Normal model calls do not need them.

MemorySearchResult

Field             Type             Meaning
id                string           Memory id
content           string           Memory text
category          string           Memory category
importance_score  float            Importance score
last_accessed     datetime | null  Last access timestamp
relevance_score   float            Final retrieval score
context_snippet   string           Single-memory rendering in the selected format

format examples

The format value controls how MemoryOS renders both context_snippet and system_prompt_addition.

bullets

{
  "format": "bullets"
}
Example system_prompt_addition:
What you know about this user:
- User prefers concise technical explanations and Python examples.

json

{
  "format": "json"
}
Example system_prompt_addition:
{
  "preference": [
    "User prefers concise technical explanations and Python examples."
  ]
}
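
If the json format is used, the caller may want the context as a data structure instead of prompt text. The sketch below assumes that system_prompt_addition arrives as a string holding a JSON object shaped like the example above (category name mapped to a list of memory texts); memories_by_category is a hypothetical helper name.

```python
import json


def memories_by_category(system_prompt_addition):
    """Parse a json-format context string into {category: [memory, ...]}."""
    try:
        parsed = json.loads(system_prompt_addition)
    except json.JSONDecodeError:
        # Not valid JSON (for example, a different format was requested).
        return {}
    if not isinstance(parsed, dict):
        return {}
    # Keep only entries that match the assumed shape.
    return {key: list(value) for key, value in parsed.items()
            if isinstance(value, list)}
```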

xml

{
  "format": "xml"
}
Example system_prompt_addition:
What you know about this user:
<memory_context>
  <memory category="preference">
    User prefers concise technical explanations and Python examples.
  </memory>
</memory_context>

Context token limit

Use context_max_tokens to limit the size of system_prompt_addition.
{
  "external_user_id": "customer-123",
  "query": "How should I answer this user?",
  "limit": 10,
  "format": "bullets",
  "context_max_tokens": 300
}
MemoryOS drops lower-importance memories first when the context is too large. It does not truncate mid-sentence.
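
The dropping behavior described above can be illustrated with a small client-side sketch. It is not the MemoryOS implementation: the whitespace-based token count, the decision to measure context_snippet only, and the name fit_to_budget are all assumptions made for illustration.

```python
def fit_to_budget(memories, max_tokens,
                  count_tokens=lambda text: len(text.split())):
    """Drop whole memories, least important first, until the budget fits."""
    kept = list(memories)  # preserve the original ranking order
    while kept and sum(count_tokens(m["context_snippet"]) for m in kept) > max_tokens:
        # Remove the lowest-importance memory entirely; nothing is truncated.
        kept.remove(min(kept, key=lambda m: m["importance_score"]))
    return kept
```

Because memories are removed as whole units, every snippet that survives is complete, which matches the guarantee that nothing is cut mid-sentence.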