Pourquoi nous publions les expériences qui ont échoué

Chaque affirmation sur ce site repose sur un test réfutable — et lorsque le test répond non, nous rapportons ce non. Cette discipline est le produit.

March 2026 Responsabilité

It is easy to publish wins. The harder, more valuable discipline is publishing the experiments that were supposed to win and did not — because a research programme you can only see the highlights of is one you cannot trust. We try to run the other way.

Every headline on this site sits behind a falsifiable test with a threshold set before the run. The point of fixing the bar in advance is that it removes the temptation to move it afterwards.

Falsifiable by design

A continuous-time model only counts as a win if it wins where timing matters and honestly loses where it does not. A memory result only counts if it holds on a held-out, leakage-safe split, not a slice chosen after the fact. When a verifier cannot run, the reward is reported as undefined — never quietly replaced with a number that makes the chart look finished.

  • Thresholds fixed before the run, not after.
  • Held-out, leakage-safe splits for anything that claims to generalise.
  • A win must come with the matching loss: where does this not work?
  • No faked signals — an unrunnable check is reported as undefined.

The negatives we stand behind

We have falsified two paradigm-sized claims of our own. The first: that a new architecture is what drives grounded learning — it is not; the rearing method is, and a conventional backbone reared the same way ties. The second: that our small-model efficiency edge holds at frontier scale — it does not; beyond a point a Transformer pulls ahead, and we publish the crossover instead of hiding it.

The negatives are not a confession. They are the reason the positive numbers are worth anything.

ReasonLoom research principles

Why it is the product

For the people who will rely on these systems — clinicians, compliance teams, engineers shipping to constrained devices — the most useful thing we can offer is not a bigger number. It is a number you can act on, with its limits stated next to it. That is what falsifiable research buys, and it is the standard we hold every release to.

← Retour aux actualités