How Akros will score measured food-photo runs.
This page is the full version of what /accuracy summarises. No Akros food-photo accuracy figure will be published until measured image rows and Akros estimates are attached. Methodology revisions are versioned at the bottom of this page.
1. Fixtures
The launch benchmark accepts source-tracked meal fixtures only:
•Measured meal photos captured by the team or licensed contributors, with gram weights, kcal, protein, carbs, fat, image source, licence, and rights notes.
•Public-domain USDA/FNDDS ground-truth rows where ingredient weights and nutrition are auditable. These rows are not photo-accuracy evidence until benchmark images and Akros estimates are attached.
We do not include users' own logs in this benchmark. Leakage of user-generated data into the test set is the most common way published accuracy numbers get gamed, so the benchmark fixtures stay separate from production traffic.
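The fixture fields listed above can be sketched as a single record type. This is a minimal illustration only: the class name, field names, and the `counts_as_photo_evidence` helper are assumptions made for the example, not Akros's actual schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Fixture:
    """One benchmark row. All names here are hypothetical."""
    fixture_id: str
    grams: float
    kcal: float
    protein_g: float
    carbs_g: float
    fat_g: float
    image_source: Optional[str]  # None for a ground-truth row with no photo yet
    licence: Optional[str]       # None until image rights are recorded
    rights_notes: str = ""
    akros_kcal_estimate: Optional[float] = None

    def counts_as_photo_evidence(self) -> bool:
        # A row is photo-accuracy evidence only once a sourced, licensed
        # image and an Akros estimate are both attached.
        return (
            self.image_source is not None
            and self.licence is not None
            and self.akros_kcal_estimate is not None
        )
```

Under this sketch, a USDA/FNDDS row with no image would be stored but would not count towards any published photo-accuracy number.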
2. Scoring
Primary metrics: pass rate, plus p50 and p95 absolute kcal error against labelled truth, reported as a percentage of the labelled kcal. For multi-component meals, ground truth is composed from the ingredient weights before the estimate is scored.
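The kcal scoring described above might look roughly like the following sketch. It assumes that truth is the sum of each ingredient's weight times its kcal per 100 g, that the error is expressed relative to the labelled kcal, and that percentiles use the nearest-rank definition. None of these choices is confirmed by this page, and the function names and nutrient figures are made up for illustration.

```python
import math


def compose_truth_kcal(ingredients):
    """Sum kcal over (grams, kcal_per_100g) pairs for one meal."""
    return sum(grams * kcal_per_100g / 100.0 for grams, kcal_per_100g in ingredients)


def abs_pct_error(estimate_kcal, truth_kcal):
    """Absolute kcal error as a percentage of the labelled truth."""
    return abs(estimate_kcal - truth_kcal) / truth_kcal * 100.0


def percentile_nearest_rank(values, p):
    """Nearest-rank percentile: the smallest value with at least p% of data at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


# Illustrative two-component meal; the per-100 g figures are invented.
meal = [(150.0, 130.0), (120.0, 165.0)]
truth = compose_truth_kcal(meal)                 # 393.0
print(round(abs_pct_error(430.0, truth), 1))     # 9.4

errors = [5.0, 10.0, 15.0, 20.0, 40.0]
print(percentile_nearest_rank(errors, 50))       # 15.0
print(percentile_nearest_rank(errors, 95))       # 40.0
```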
Secondary metrics: protein, carbs, and fat absolute error in grams, p50/p95 latency, row-level failures, and image-provenance audit results. We publish kcal and latency first because they are the numbers a user can interpret without a statistics lesson.
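We use percentile latency, not mean latency. Slow outliers should still be visible in p95, but they should not make the typical meal look slower than it is. A single 30-second outlier reaches p95 only in small runs (fewer than about 20 meals); in larger runs it takes several slow meals to move p95.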
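As a small illustration of why the mean is avoided, the sketch below compares mean, p50, and p95 for ten made-up latencies: nine meals at 4 seconds and one at 30 seconds. The nearest-rank percentile definition is an assumption, since this page does not say which definition is used.

```python
import math
import statistics


def percentile_nearest_rank(values, p):
    """Nearest-rank percentile over a non-empty list of numbers."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical latencies in seconds: nine typical meals and one slow outlier.
latencies = [4.0] * 9 + [30.0]

mean_latency = statistics.mean(latencies)          # pulled upward by the outlier
p50_latency = percentile_nearest_rank(latencies, 50)
p95_latency = percentile_nearest_rank(latencies, 95)

print(mean_latency, p50_latency, p95_latency)      # 6.6 4.0 30.0
```

In this ten-meal example the mean rises to 6.6 seconds even though nine of ten meals took 4 seconds, while p50 stays at 4 seconds and p95 reports the 30-second outlier.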
3. Exclusions
A meal is excluded from the run only for one of three reasons, each logged in the run's notes field:
•The image fails our pre-flight validation (corrupt file, stripped EXIF metadata, or smaller than 512px).
•The labelled truth lists ingredients not present in USDA / AUSNUT (~0.4%).
•The pipeline returns an HTTP 5xx after the configured 3-attempt retry. We do not silently drop these: the failure rate is reported as a separate number alongside the error metrics.
We do not exclude meals because Akros performed poorly on them. The temptation is real; the rule above is the discipline.
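The three exclusion rules above can be expressed as one small decision function. This is a hedged sketch: the enum, the function names, and the boolean inputs are hypothetical, and the pre-flight and ingredient checks are assumed to have been computed elsewhere. Note that the function never looks at the estimate's error, which mirrors the rule that poor performance is not a reason for exclusion.

```python
from enum import Enum
from typing import Optional, Sequence


class Exclusion(Enum):
    PREFLIGHT = "image failed pre-flight validation"
    UNMAPPED_INGREDIENT = "labelled truth lists ingredients missing from reference tables"
    PIPELINE_5XX = "HTTP 5xx after all retry attempts"


MAX_ATTEMPTS = 3  # the configured retry count stated on this page


def exclusion_reason(
    image_valid: bool,
    ingredients_mapped: bool,
    attempt_statuses: Sequence[int],
) -> Optional[Exclusion]:
    """Return the logged reason for excluding a meal, or None to score it."""
    if not image_valid:
        return Exclusion.PREFLIGHT
    if not ingredients_mapped:
        return Exclusion.UNMAPPED_INGREDIENT
    attempts = list(attempt_statuses)[:MAX_ATTEMPTS]
    if len(attempts) == MAX_ATTEMPTS and all(500 <= s <= 599 for s in attempts):
        return Exclusion.PIPELINE_5XX
    return None


def failure_rate(reasons: Sequence[Optional[Exclusion]]) -> float:
    """Share of meals lost to pipeline 5xx errors, reported separately."""
    if not reasons:
        return 0.0
    return sum(r is Exclusion.PIPELINE_5XX for r in reasons) / len(reasons)
```

Returning the reason, rather than a bare boolean, keeps each exclusion ready to be written to the run's notes field.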
4. Competitor comparisons
We compare Akros only to vendors that publish per-meal accuracy data. If a vendor only publishes a single MAE figure without dataset disclosure, we cite that figure (with a source link on the headline page) and do not run a head-to-head.
The Cal AI 20-50% MAE range is from an independent journalist test (linked on the headline page). We do not have a per-meal Cal AI distribution to score against ours, and we will not fabricate one.
When a comparable vendor publishes per-meal accuracy data, we will add them to the weekly run with their dataset and our matched filter. Until then, we cite ranges.
5. Confidence intervals
The first public run will not claim statistical confidence beyond the fixture size. The launch gate is operational: at least 25 measured, source-tracked image rows, p50 kcal error at or below 15%, p95 kcal error at or below 30%, and p95 latency at or below 15 seconds.
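The launch gate above is a conjunction of four thresholds, which can be checked mechanically. The sketch below uses the thresholds stated on this page; the `RunSummary` type and its field names are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class RunSummary:
    """Aggregate numbers for one benchmark run. Field names are hypothetical."""
    measured_image_rows: int
    p50_kcal_error_pct: float
    p95_kcal_error_pct: float
    p95_latency_s: float


def launch_gate_failures(run: RunSummary) -> List[str]:
    """Return the gate conditions the run misses; an empty list means it passes."""
    failures = []
    if run.measured_image_rows < 25:
        failures.append("fewer than 25 measured, source-tracked image rows")
    if run.p50_kcal_error_pct > 15.0:
        failures.append("p50 kcal error above 15%")
    if run.p95_kcal_error_pct > 30.0:
        failures.append("p95 kcal error above 30%")
    if run.p95_latency_s > 15.0:
        failures.append("p95 latency above 15 seconds")
    return failures
```

Listing every missed condition, instead of stopping at the first, makes a failed run easier to report in full.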
Once the benchmark has enough rows for tighter confidence intervals, we will add those intervals to the methodology and avoid calling noise an improvement.
6. What this page is not
Not a peer-reviewed publication. Not a regulated medical-device validation. Not a claim that Akros's calorie estimate is suitable for clinical decision-making. The point of the public benchmark is to expose changes in our own pipeline to the people using it — not to support a clinical assertion.
If you need clinical-grade nutrition tracking, you should be weighing your food and working with a registered dietitian. We say this on every paid screen and we say it here too.
7. Revisions
v1 · 14 May 2026 · initial publication.
Akros is a personal wellness app. It is not a medical device, does not provide medical advice, and is not a substitute for consultation with a licensed clinician.