Compress, don't memorise
We aim for compact mechanisms that look like the laws of physics, not for embeddings that look like the data.
One architecture on a continuous axis — reared on the stack.
Explore LoominumEnterprise
Research
Science
About
Compressing the laws underneath protein dynamics into physically-invariant signals.
Holobiont is a science program built around a specific idea: there are physically-invariant signals underneath protein dynamics that compress better than any amount of memorisation. We pursue those signals, audit them ruthlessly against retrieval-axis leakage, and publish the negatives when an attractive feature turns out to be measurement artifact.
add features until the leaderboard moves and ship the leaderboard, even when the lift is structural artifact.
biases the pipeline toward features that look like physics — compact, invariant, auditable — and retires the rest before they reach production.
We aim for compact mechanisms that look like the laws of physics, not for embeddings that look like the data.
Pre-computed retrieval axes are notoriously easy to leak test labels through. We default to nested cross-validation and report leakage budgets explicitly.
Where a feature class hurts performance once leakage is removed, we say so. The product is the mechanism, not the leaderboard.
Column statistics — Shannon entropy and amino-acid frequency — look attractive at low coverage. Past a coverage budget around the middle of the chart, they begin to hurt the best estimator. We retired them for production use.
Each decision is a measured statement, not a marketing one. Where a result hurt, the result and the retirement live on the page.
A large fraction of an apparently strong feature's lift was attributable to leakage through pre-computed k-NN axes. The methodology was tightened accordingly.
Shannon and frequency column statistics hurt our best estimator beyond a certain coverage budget. We retired them for production use and kept exploring direct-coupling pairs.
The current pipeline is biased toward features that look like physics. Each candidate has an explicit leakage audit and an explicit coverage budget before it ships.
Protein dynamics is the obvious place to put pressure on the "compress, don't memorise" idea. Holobiont is where we run that pressure — and where we have already retired features that looked attractive but did not survive the audit.
The mechanism-first audit travels across our research programs — from evaluation discipline through to alignment posture. Where a feature does not survive here, it does not ship anywhere.