Models · Generation 1

A flagship model with the substrate underneath.

Flagship post-transformer model trained on the ReasonLoom substrate.

Model ID: RL-X1.G1.2026
Substrate: Stamen runtime · Heddle memory
Training: Atelier developmental loop · typed verifier in-loop
Context ceiling: there isn’t one

Category: Flagship cross-domain · generation 1
Substrate: Stamen + Heddle
Trained with: Atelier developmental loop
Best for: Long-horizon reasoning without context limits

What RL-X1 is

A model with the substrate underneath

RL-X1 is the first generation of our flagship cross-domain line. It does not live inside a context window. It reads, binds, and composes through Heddle, runs on Stamen, and is reared by Atelier. The result is reasoning quality on long-horizon tasks that comes from architecture, not from prompt engineering.

The structural shift

Why the difference is not “a bigger window”

Conventional models scale by extending an attention buffer. RL-X1 does not have one to extend. The work that the window used to do is done by the substrate instead.

Conventional

Token window

  • × Memory ceiling = buffer length.
  • × Recall is a scan of attention.
  • × Composition re-derived per turn.
  • × Provenance lives in prose.

RL-X1

Structured substrate

  • + No buffer to overflow.
  • + Recall is a substrate primitive.
  • + Composition is bind/walk, not re-read.
  • + Provenance is structural.
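As a rough illustration of the contrast in the two lists above, the toy sketch below compares a bounded token buffer with a keyed store of bound claims. It is not the Heddle implementation: `TokenWindow`, `BindingStore`, and their methods are hypothetical names invented for this example, and the real substrate may work quite differently.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Dict, Optional


class TokenWindow:
    """Toy stand-in for a fixed-length attention buffer (hypothetical)."""

    def __init__(self, max_tokens: int) -> None:
        # A deque with maxlen silently drops the oldest items on overflow.
        self.buffer = deque(maxlen=max_tokens)

    def read(self, text: str) -> None:
        self.buffer.extend(text.split())

    def recall(self, token: str) -> bool:
        # Recall is a linear scan over whatever is still in the buffer.
        return token in self.buffer


@dataclass
class BindingStore:
    """Toy stand-in for structured binding (hypothetical, not Heddle)."""

    bindings: Dict[str, str] = field(default_factory=dict)

    def bind(self, claim: str, source: str) -> None:
        # Provenance is stored with the claim rather than appended as prose.
        self.bindings[claim] = source

    def recall(self, claim: str) -> Optional[str]:
        # Recall is a keyed lookup; later input does not evict earlier claims.
        return self.bindings.get(claim)


tokens = ["alpha", "beta", "gamma", "delta", "epsilon"]

window = TokenWindow(max_tokens=4)
window.read(" ".join(tokens))

store = BindingStore()
for i, claim in enumerate(tokens):
    store.bind(claim, f"doc-{i}")

print(window.recall("alpha"))  # False: evicted when the buffer overflowed
print(store.recall("alpha"))   # doc-0
```

The point of the toy is only the failure mode: the window forgets by position, while the store forgets nothing unless told to.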

What changes versus a transformer

Three structural differences

RL-X1 is interesting because of what it is not: not a bigger attention model, not a tokens-in-tokens-out model, not a one-shot decoder.

X1

No context-window ceiling

Memory lives in structured binding, not in a buffer the decoder has to scroll. Long-horizon tasks stop being a token-budget problem.

X2

Composition is a primitive

Reasoning over analogy, counterfactual, and multi-hop chains uses the same bind/recall surface. The model does not have to re-derive structure from language each turn.

X3

Grounded by training

The model is reared by Atelier, with a typed verifier in the loop. What it knows, it can defend; what it does not know, it defers on.
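One way to picture "defend or defer" is a gate that emits a claim only when a verifier accepts its supporting evidence. The sketch below is a toy under stated assumptions: `Claim`, `verify`, and `answer_or_defer` are hypothetical names, and the check shown (a non-empty source string) is a placeholder for whatever Atelier's typed verifier actually does.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Claim:
    statement: str
    source: Optional[str]  # None means no bound evidence


def verify(claim: Claim) -> bool:
    """Placeholder check: accept only claims with a non-empty source."""
    return isinstance(claim.source, str) and claim.source != ""


def answer_or_defer(claim: Claim) -> str:
    # Assert only what passes verification; otherwise defer explicitly.
    if verify(claim):
        return f"{claim.statement} [source: {claim.source}]"
    return "DEFER: no verified evidence for this claim"


supported = answer_or_defer(Claim("A precedes B", "doc-17"))
unsupported = answer_or_defer(Claim("A causes C", None))

print(supported)    # A precedes B [source: doc-17]
print(unsupported)  # DEFER: no verified evidence for this claim
```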

Where it sits

Internal evaluation

Numbers are internal; the suites and conditions are documented in the evaluation programme. The pattern, not any single value, is what we report.

Task family | RL-X1 | Conventional baseline | Δ
Long-horizon multi-hop | P@5 1.00 | P@5 ~0.62 | +0.38
Cross-document binding | 0.94 | 0.71 | +0.23
Compositional analogy | 0.88 | 0.56 | +0.32
Defer-on-unknown | 0.96 | 0.41 | +0.55
Context-window overflow | 0 | frequent | n/a
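The Δ column is the RL-X1 score minus the baseline score for each row. A short check of the four numeric rows, using only the values in the table (the multi-hop baseline is reported as approximate, so its Δ is approximate too):

```python
# (RL-X1 score, conventional baseline score), copied from the table above.
rows = {
    "Long-horizon multi-hop": (1.00, 0.62),  # baseline is "~0.62" in the table
    "Cross-document binding": (0.94, 0.71),
    "Compositional analogy": (0.88, 0.56),
    "Defer-on-unknown": (0.96, 0.41),
}

# Round to two decimals to avoid floating-point noise in the subtraction.
deltas = {task: round(ours - base, 2) for task, (ours, base) in rows.items()}

for task, delta in deltas.items():
    print(f"{task}: {delta:+.2f}")
```

The overflow row is omitted because its baseline value ("frequent") is qualitative.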

P@5 1.00

Multi-hop retrieval through the stack

End-to-end retrieval through the model and memory bridge.

+0.65

Lifelong retention vs amnesiac control

Inherited from the Atelier developmental loop.

0

Context-window failure modes

There is no context window to overflow.

A reasoning trace

What a multi-hop question looks like

A question that would force a conventional model to scroll its window becomes a sequence of substrate operations.

rl-x1 · trace · multi-hop
  1. READ perceive(corpus)

    Inputs land as structured evidence, not as a token buffer.

  2. BIND bind(claim_a, source_a)

    Claim is tied to where it came from. Provenance is structural, not appended.

  3. BIND bind(claim_b, source_b)

    A second piece of evidence is bound. No re-derivation from prose.

  4. WALK walk(claim_a → claim_b)

    Multi-hop is a substrate operation. The decoder does not have to scroll.

  5. COMP compose(answer | evidence)

    The answer is composed from bound evidence. What is asserted is defensible.

  6. EMIT emit(answer, audit_trail)

    Output ships with the audit trail attached. Through Mnemo, this is enterprise-ready.
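The trace above can be mimicked with a small in-memory graph. This is a minimal sketch, not the RL-X1 or Heddle API: `ToySubstrate`, `link`, and the dictionary returned by `emit` are hypothetical, the READ step is skipped, and the explicit `link` call is an assumption, since the trace does not say how relations between claims are established.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class ToySubstrate:
    """Hypothetical in-memory stand-in; not the Heddle or Stamen API."""

    sources: Dict[str, str] = field(default_factory=dict)
    edges: Dict[str, List[str]] = field(default_factory=dict)

    def bind(self, claim: str, source: str) -> None:
        # BIND: tie each claim to the place it came from.
        self.sources[claim] = source
        self.edges.setdefault(claim, [])

    def link(self, src: str, dst: str) -> None:
        # Assumption: relations between bound claims are recorded explicitly.
        self.edges[src].append(dst)

    def walk(self, start: str, goal: str) -> List[str]:
        # WALK: breadth-first search for a chain of linked claims.
        queue: List[List[str]] = [[start]]
        seen = {start}
        while queue:
            path = queue.pop(0)
            if path[-1] == goal:
                return path
            for nxt in self.edges.get(path[-1], []):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(path + [nxt])
        return []  # no chain found

    def compose(self, path: List[str]) -> Tuple[str, List[Tuple[str, str]]]:
        # COMP: build the answer only from claims on the walked path.
        trail = [(claim, self.sources[claim]) for claim in path]
        return " -> ".join(path), trail


def emit(answer: str, audit_trail: List[Tuple[str, str]]) -> dict:
    # EMIT: the answer and its provenance travel together.
    return {"answer": answer, "audit_trail": audit_trail}


substrate = ToySubstrate()
substrate.bind("claim_a", "source_a")
substrate.bind("claim_b", "source_b")
substrate.link("claim_a", "claim_b")

path = substrate.walk("claim_a", "claim_b")
answer, trail = substrate.compose(path)
result = emit(answer, trail)
print(result)
```

In this toy, every claim in the emitted answer has a matching entry in the audit trail by construction, which is the property the trace describes.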

The X-line

Where RL-X1 sits in the generation roadmap

G1

RL-X1

shipped

Flagship cross-domain · long-horizon reasoning without context limits.

G2

RL-X2

planned

Multi-modal, substrate-native. Perception and binding share the same surface.

G3

RL-X3

research

Self-revising recall. The model edits its own memory under typed verification.

Where RL-X1 is being used

Cross-domain reasoning

Reasoning

Long-horizon analysis

Tasks that span hundreds of inputs and need structured recall across all of them. The model is not asked to fit them in a window.

Research

Scientific reading at scale

RL-X1 reads collections, binds claims, and composes inferences across them. The work product is structured, not narrative.

Enterprise

Memory-aware decision support

Used through Mnemo, RL-X1 reasons over multi-tenant memory with the audit trail attached.