Reflection

Reflection is the mechanism that transforms raw accumulated experience into coherent personality. The ablation study in Park et al. (2023) showed it is the most critical component: without it, agents accumulate raw memories but cannot form coherent beliefs, and they behave less believably over time despite having more data. This is analogous to human memory consolidation during sleep, where raw hippocampal memories are replayed and consolidated into neocortical long-term storage.

Why Reflection Exists

The Stanford Generative Agents paper (Park et al., UIST 2023) ran controlled ablation studies removing each memory component:

| Component Removed | Impact |
|---|---|
| Observation (raw experience) | Degraded performance |
| Planning | Degraded performance |
| Reflection | Most significant degradation |

Without reflection, agents accumulated raw memories but could not form coherent higher-level beliefs. Reflection is the bottleneck that converts experience into identity.

Trigger Conditions

Reflection fires under two conditions (dual trigger):

  1. Periodic: interaction_count - last_reflection_at >= REFLECTION_EVERY (default: 20)
  2. Event-driven: positive recent shift activity with pending insights or staged updates

Cooldown

Reflection will not fire if fewer than REFLECTION_EVERY // 2 interactions have occurred since the last reflection. At the default REFLECTION_EVERY=20, this means a minimum 10-interaction cooldown. The cooldown is checked first — neither periodic nor event-driven triggers can override it. This prevents thrashing when a burst of high-ESS interactions occurs in quick succession.
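
A minimal sketch of the combined trigger-plus-cooldown check, assuming hypothetical attribute names (interaction_count, last_reflection_at, pending_insights, staged_updates, recent_shift_activity) on an agent-state object:

```python
REFLECTION_EVERY = 20  # default period from the docs above

def should_reflect(state) -> bool:
    """Dual-trigger reflection check with the cooldown evaluated first."""
    since_last = state.interaction_count - state.last_reflection_at

    # Cooldown: neither trigger can override this guard.
    if since_last < REFLECTION_EVERY // 2:
        return False

    # 1. Periodic trigger.
    if since_last >= REFLECTION_EVERY:
        return True

    # 2. Event-driven trigger: recent shift activity plus material to consolidate.
    return state.recent_shift_activity > 0 and bool(
        state.pending_insights or state.staged_updates
    )
```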

The Reflection Pipeline

```mermaid
flowchart TD
    Trigger[Reflection Triggered] --> Decay[1. Decay Unreinforced Beliefs]
    Decay --> Retrieve[2. Retrieve Recent Episodes]
    Retrieve --> Prepare[3. Assemble Context]
    Prepare --> LLM[4. LLM Consolidation]
    LLM --> Validate{5. Validate Snapshot}
    Validate -->|Pass| Update[Update Snapshot]
    Validate -->|Fail| Reject[Reject, Keep Old]
    Update --> Health[6. Health Check]
    Health --> Clear[7. Clear Pending Insights]
    Clear --> Log[Log Reflection Event]
```

Step 1: Decay Beliefs

Before consolidation, stale beliefs are evaluated via the BELIEF_DECAY_PROMPT and marked RETAIN, DECAY, or FORGET, with conservative fallback behavior when the model's verdict cannot be parsed.

See Opinion Dynamics for the full decay formula.
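
One plausible reading of "conservative fallback" in code form: any verdict that is not exactly DECAY or FORGET leaves the belief untouched. The attribute names and the decay factor here are illustrative, not the actual implementation:

```python
def apply_decay_verdict(belief, verdict: str) -> None:
    """Apply a BELIEF_DECAY_PROMPT verdict to a single belief."""
    verdict = verdict.strip().upper()
    if verdict == "FORGET":
        belief.archived = True       # excluded from future contexts
    elif verdict == "DECAY":
        belief.confidence *= 0.8     # illustrative factor; see Opinion Dynamics
    # Anything else (RETAIN, or a malformed response) changes nothing,
    # so a bad LLM output can never silently erase a belief.
```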

Step 2: Retrieve Episodes

Recent non-archived episodes are loaded from the Neo4j graph via list_recent_episode_context, capped at min(REFLECTION_EVERY, 10) results.

Step 3: Assemble Context

The reflection prompt receives:

  • Current personality snapshot
  • Structured traits (opinions, topics, disagreement rate)
  • Current beliefs with confidence, evidence count, last reinforced
  • Pending insights (accumulated since last reflection)
  • Recent episode summaries
  • Recent personality shifts
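
A sketch of how Steps 2 and 3 fit together, reusing REFLECTION_EVERY from the earlier sketch; the keyword arguments to list_recent_episode_context and the state accessors are assumptions:

```python
def assemble_reflection_context(state, graph) -> dict:
    """Gather everything the reflection prompt needs (Steps 2 and 3)."""
    episodes = graph.list_recent_episode_context(
        limit=min(REFLECTION_EVERY, 10),   # Step 2: capped retrieval
        include_archived=False,
    )
    return {
        "snapshot": state.personality_snapshot,
        "traits": state.structured_traits,   # opinions, topics, disagreement rate
        "beliefs": [
            (b.text, b.confidence, b.evidence_count, b.last_reinforced)
            for b in state.beliefs
            if not b.archived
        ],
        "pending_insights": state.pending_insights,
        "episodes": episodes,
        "recent_shifts": state.recent_shifts,
    }
```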

Step 4: LLM Consolidation

The REFLECTION_PROMPT tasks are ordered deliberately:

  1. PRESERVE — all existing personality traits unless directly contradicted by new evidence; removing a trait is losing identity
  2. INTEGRATE — pending insights naturally into the narrative
  3. SYNTHESIZE — higher-order patterns (e.g., "I notice I tend to value X")
  4. Inject specificity — if the personality has become generic, add detail from strongest recent insights

This PRESERVE-first ordering is inspired by Open Character Training (2025), which argues that personality changes must be robust. According to Persona Vectors research (2025), losing a specific trait means losing the neural activation patterns that encode it.
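
The actual REFLECTION_PROMPT is not reproduced here, but a skeleton conveying the deliberate ordering might look like the following; all wording is illustrative:

```python
REFLECTION_PROMPT_SKELETON = """\
You are consolidating an agent's personality. Perform these tasks IN ORDER:

1. PRESERVE: keep every existing personality trait unless new evidence
   directly contradicts it. Removing a trait is losing identity.
2. INTEGRATE: weave the pending insights below into the narrative naturally.
3. SYNTHESIZE: name higher-order patterns you notice, e.g.
   "I notice I tend to value X".
4. INJECT SPECIFICITY: if the narrative has become generic, add concrete
   detail from the strongest recent insights.

Current snapshot:
{snapshot}

Pending insights:
{pending_insights}
"""
```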

Step 5: Validate Output

Before committing the new snapshot:

  • Minimum length: ≥ 30 characters
  • Retention ratio: len(new) / len(old) ≥ 0.6 (60%)

This catches catastrophic content loss from LLM rewrites. Without validation, a single bad reflection could destroy many interactions' worth of personality development.
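
Both checks fit in a few lines; a sketch that returns whichever snapshot should be kept:

```python
MIN_SNAPSHOT_LEN = 30      # minimum length check
MIN_RETENTION_RATIO = 0.6  # new snapshot must keep >= 60% of old length

def validate_snapshot(old: str, new: str) -> str:
    """Step 5: return the new snapshot only if it passes both checks."""
    if len(new) < MIN_SNAPSHOT_LEN:
        return old   # reject: degenerate output
    if old and len(new) / len(old) < MIN_RETENTION_RATIO:
        return old   # reject: catastrophic content loss
    return new
```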

Step 6: Health Check

After a successful update, vocabulary diversity is checked:

  • unique_ratio = len(set(words)) / len(words)
  • If unique_ratio < 0.4: log warning — possible personality collapse (repetitive, generic text)
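
The diversity check is simple enough to sketch directly from the two bullets above:

```python
import logging

COLLAPSE_THRESHOLD = 0.4

def check_snapshot_health(snapshot: str) -> float:
    """Step 6: warn when low vocabulary diversity suggests collapse."""
    words = snapshot.split()
    unique_ratio = len(set(words)) / len(words) if words else 0.0
    if unique_ratio < COLLAPSE_THRESHOLD:
        logging.warning("possible personality collapse: unique_ratio=%.2f",
                        unique_ratio)
    return unique_ratio
```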

Step 7: Clear and Update

  • pending_insights cleared
  • last_reflection_at = interaction_count

Accumulate-Then-Consolidate Pattern

Sonality deliberately avoids per-interaction snapshot rewrites:

| Per Interaction | During Reflection |
|---|---|
| Extract one-sentence insight | Consolidate all pending insights |
| Append to pending_insights | Rewrite snapshot narrative |
| Update opinion vectors (math only) | Validate and commit |
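
In code terms, the per-interaction path is append-only, while the snapshot is rewritten only inside reflection. This sketch reuses the functions from the step-by-step sketches above; the method names on state and llm are hypothetical:

```python
def on_interaction(state, insight: str, opinion_delta) -> None:
    """Per-interaction path: cheap bookkeeping, no LLM rewrite."""
    state.pending_insights.append(insight)   # one-sentence insight
    state.opinions.apply(opinion_delta)      # math-only vector update
    state.interaction_count += 1

def on_reflection(state, llm) -> None:
    """Reflection path: consolidate, validate, commit, clear."""
    new = llm.consolidate(assemble_reflection_context(state, state.graph))
    state.personality_snapshot = validate_snapshot(state.personality_snapshot, new)
    check_snapshot_health(state.personality_snapshot)
    state.pending_insights.clear()                       # Step 7
    state.last_reflection_at = state.interaction_count   # Step 7
```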

Why Not Per-Interaction Rewrites?

ABBEL (2025) demonstrated that "belief bottlenecks" — forcing information through a compressed state — outperform full conversation history. But frequent rewrites introduce the Broken Telephone effect:

\[P(\text{survive}, N) = p^N\]

At p = 0.95 per rewrite, after 40 rewrites only 12.9% of initial distinctive traits survive. By accumulating insights and consolidating periodically, the number of rewrites drops from ~40 per 100 interactions to ~5, dramatically improving trait survival.
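
The figures follow directly from the formula and can be checked in two lines:

```python
p = 0.95               # per-rewrite trait survival probability
print(f"{p**40:.1%}")  # ~40 rewrites per 100 interactions -> 12.9%
print(f"{p**5:.1%}")   # ~5 rewrites after batching -> 77.4%
```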

Research Grounding

| Source | Key Finding |
|---|---|
| Park et al. (2023) | Reflection ablation: most critical component for believable agents |
| Sleep-time Compute (arXiv:2504.13171) | +13–18% accuracy with background consolidation, 5× compute savings |
| SAGE (arXiv:2409.00872) | Ebbinghaus-based memory management: 2.26× improvement |
| EvolveR (arXiv:2510.16079) | Experience lifecycle closure enables learning from experience |
| ABBEL (2025) | Belief bottleneck: compact state outperforms full context; frequent rewrites lose minority opinions |
| IROTE (2025) | Experience-based reflection can amplify errors; validation layers mitigate |

Known Risks

Reflection is the highest-variance component. When it works, it produces coherent higher-level beliefs from raw experience. When it fails, it can destroy personality through a single bad rewrite. The validation layer mitigates catastrophic failure, but subtle information loss still accumulates.

Wholesale rewrite vs. targeted edit. The current implementation does a full snapshot rewrite. Broken Telephone research shows wholesale regeneration loses more information than targeted editing. A future improvement: structured opinion slots that the LLM cannot delete during reflection.


Next:

  • Anti-Sycophancy — how reflection interacts with the eight defensive layers.
  • Personality Development — the teaching methodology and expected reflection outputs at each interaction milestone.