Why we publish failed experiments

Every claim on this site is backed by a falsifiable test, and when the test says no, we report it. This discipline is the product.

March 2026 · Accountability

It is easy to publish wins. The harder, more valuable discipline is publishing the experiments that were supposed to win and did not, because a research programme that shows only its highlights is one you cannot trust. We try to do the opposite.

Every headline on this site sits behind a falsifiable test with a threshold set before the run. The point of fixing the bar in advance is that it removes the temptation to move it afterwards.

Falsifiable by design

A continuous-time model only counts as a win if it wins where timing matters and honestly loses where it does not. A memory result only counts if it holds on a held-out, leakage-safe split, not a slice chosen after the fact. When a verifier cannot run, the reward is reported as undefined — never quietly replaced with a number that makes the chart look finished.

  • Thresholds fixed before the run, not after.
  • Held-out, leakage-safe splits for anything that claims to generalise.
  • A win must come with the matching loss: where does this not work?
  • No faked signals — an unrunnable check is reported as undefined.
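The rules above can be sketched in a few lines of Python. This is an illustrative sketch only, not our actual evaluation harness: the names `PreRegisteredTest`, `evaluate`, and `broken_verifier`, and the numbers 0.80 and 0.75, are assumptions made up for the example. It shows a threshold that cannot be edited once it is set, and a verifier failure that is reported as undefined rather than replaced with a number.

```python
import math
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class PreRegisteredTest:
    """A claim and its pass bar (hypothetical structure for illustration)."""
    claim: str
    threshold: float  # fixed before the run; frozen=True blocks later edits


def evaluate(test: PreRegisteredTest,
             verifier: Callable[[], Optional[float]]) -> dict:
    """Run the verifier and report the outcome without inventing a score."""
    try:
        score = verifier()
    except Exception:
        score = None  # the verifier could not run
    if score is None or math.isnan(score):
        # Undefined stays undefined: no fallback value is substituted.
        return {"claim": test.claim, "score": None, "verdict": "undefined"}
    verdict = "pass" if score >= test.threshold else "fail"
    return {"claim": test.claim, "score": score, "verdict": verdict}


def broken_verifier() -> Optional[float]:
    """Stand-in for a check that cannot run (hypothetical)."""
    raise RuntimeError("verifier unavailable")


# Illustrative values only; 0.80 and 0.75 are not real results.
prereg = PreRegisteredTest(claim="holds on held-out split", threshold=0.80)
print(evaluate(prereg, lambda: 0.75)["verdict"])     # fail
print(evaluate(prereg, broken_verifier)["verdict"])  # undefined
```

Freezing the dataclass is one simple way to make "moving the bar afterwards" an error rather than a quiet edit; in practice the same effect could come from committing thresholds to version control before the run.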

The negatives we stand behind

We have falsified two paradigm-sized claims of our own. The first: that a new architecture is what drives grounded learning — it is not; the rearing method is, and a conventional backbone reared the same way ties. The second: that our small-model efficiency edge holds at frontier scale — it does not; beyond a point a Transformer pulls ahead, and we publish the crossover instead of hiding it.

"The negatives are not a confession. They are the reason the positive numbers are worth anything."

From the ReasonLoom research principles

Why it is the product

For the people who will rely on these systems — clinicians, compliance teams, engineers shipping to constrained devices — the most useful thing we can offer is not a bigger number. It is a number you can act on, with its limits stated next to it. That is what falsifiable research buys, and it is the standard we hold every release to.
