Faceless. Independent. Nobody pays for a verdict.

Before you pay for an AI tool, find out if it passes the acid test.

I put every tool in a category through the same battery, score it out of 100, and publish a dated PASS, BORDERLINE or FAIL. The inputs are public, so you can run the test yourself and tell me where I'm wrong. A new battery lands every couple of weeks.

A new test every couple of weeks. No spam, unsubscribe anytime. By subscribing you agree to receive the newsletter and to our Privacy Policy.

Same published inputs, dated and re-runnable. No sponsorships; the verdict isn't for sale.

protocol v1.0 · running battery

The public scorecard

Every verdict stays on the record.

Each test adds a row. A re-test gets its own dated row, so nothing quietly gets overwritten. The back catalogue is the proof I'm not just chasing whatever launched this week.
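
The "nothing gets overwritten" rule is easy to picture as an append-only log. What follows is a minimal sketch, not the scorecard's real storage: the `Row` and `Scorecard` names, the fields, and the tool, dates and scores are all made up for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass(frozen=True)  # frozen: a published row cannot be edited in place
class Row:
    tool: str
    as_of: date
    score: int


class Scorecard:
    """Append-only log: rows can be added, never replaced or deleted."""

    def __init__(self) -> None:
        self._rows: List[Row] = []

    def add(self, row: Row) -> None:
        self._rows.append(row)

    def history(self, tool: str) -> List[Row]:
        """Every dated row for one tool, oldest first."""
        matching = (r for r in self._rows if r.tool == tool)
        return sorted(matching, key=lambda r: r.as_of)


card = Scorecard()
card.add(Row("ExampleTool", date(2025, 1, 10), 61))  # made-up first test
card.add(Row("ExampleTool", date(2025, 3, 2), 78))   # made-up re-test, new row
print(len(card.history("ExampleTool")))  # 2
```

The re-test lands beside the original row rather than on top of it, so both readings stay visible.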

RESULTS · scored out of 100 · updated each battery
Tool · Category · As of · Score · Verdict · Cost / result · Buy or skip

Rows marked pending are queued for the next run. Bands: PASS 75 and up, BORDERLINE 55 to 74, FAIL under 55.
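
As a rough sketch, the bands above could be written like this in Python. The function name `verdict` is made up for illustration, and treating scores as whole numbers from 0 to 100 is an assumption.

```python
def verdict(score: int) -> str:
    """Map a 0-100 score to its band, using the thresholds stated above."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score >= 75:
        return "PASS"
    if score >= 55:
        return "BORDERLINE"
    return "FAIL"


print(verdict(82))  # PASS
print(verdict(55))  # BORDERLINE
print(verdict(54))  # FAIL
```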

How the acid test works

One protocol, run the same way every time.

Every tool in a category meets the same battery and the same scoring. The number I care about most is the cost of a result you can actually use.

01

Same battery for every tool

One task set, one set of inputs, run identically. No friendly demos and no improvising to flatter a particular product.

02

Scored across seven things

Quality, reliability, speed, setup friction, cost per result, workflow fit, and the limits nobody advertises. It all collapses into one number out of 100.
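
One way seven sub-scores could collapse into a single number is a weighted average. The sketch below is only that: the weights are not published on this page, so equal weights are an assumption, and the names `CRITERIA` and `overall_score` are invented for illustration.

```python
from typing import Dict, Optional

# The seven things listed above, as hypothetical dictionary keys.
CRITERIA = (
    "quality",
    "reliability",
    "speed",
    "setup_friction",
    "cost_per_result",
    "workflow_fit",
    "unadvertised_limits",
)


def overall_score(
    sub_scores: Dict[str, float],
    weights: Optional[Dict[str, float]] = None,
) -> int:
    """Weighted average of seven 0-100 sub-scores, rounded to a whole number."""
    if set(sub_scores) != set(CRITERIA):
        raise ValueError("expected exactly the seven criteria")
    if weights is None:
        # Assumption: every criterion counts equally.
        weights = {name: 1.0 for name in CRITERIA}
    total_weight = sum(weights[name] for name in CRITERIA)
    weighted_sum = sum(sub_scores[name] * weights[name] for name in CRITERIA)
    return round(weighted_sum / total_weight)


example = {name: 65.0 for name in CRITERIA}
example["quality"] = 100.0
print(overall_score(example))  # 70
```

Passing a different `weights` dictionary changes how much each criterion counts without touching the rest of the calculation.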

03

The cost of a usable result

Sticker price lies. I track what one output you can trust actually costs, once the retries and the re-dos are in.
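
The arithmetic behind that is simple: total spend across every attempt, divided by the outputs you would actually keep. A minimal sketch, assuming a flat price per attempt; the function name and the figures are made up for illustration.

```python
def cost_per_usable_result(
    price_per_attempt: float, attempts: int, usable_outputs: int
) -> float:
    """Total spend on all attempts divided by the number of usable outputs."""
    if attempts < 0 or usable_outputs < 0 or usable_outputs > attempts:
        raise ValueError("need 0 <= usable_outputs <= attempts")
    if usable_outputs == 0:
        return float("inf")  # paid for attempts, got nothing usable
    return price_per_attempt * attempts / usable_outputs


# Made-up figures: 0.50 per attempt, 10 attempts, 4 outputs worth keeping.
print(cost_per_usable_result(0.50, 10, 4))  # 1.25
```

In that made-up case the sticker price says 0.50, but each result you can trust costs 1.25 once the six discarded attempts are counted.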

04

Dated, and you can repeat it

Every verdict is pinned to a version and a date, with the inputs published so you can run it yourself and check me.

Dip it, read the verdict

A litmus strip doesn't care about the marketing, and neither does the score. Drag it: the same paper gives every tool the same reading.

82/100 PASS
0 · 10 · 20 · 30 · 40 · 50 · 60 · 70 · 80 · 90 · 100
PASS (75 and up): take it. BORDERLINE (55 to 74): only for a specific job. FAIL (under 55): skip it.

Get the verdict before you buy

See the receipts, not the marketing.

You get the scorecard, the raw inputs, and a straight buy-or-skip call, every couple of weeks.
