Methodology

How we score AI subscriptions

We answer one question as objectively as we can: which AI subscription is the best? Instead of unexplained opinion scores, every score is derived from a feature matrix that you can inspect. Here is exactly how.

1. A 70+ parameter matrix

For each subscription we record concrete parameters across eight areas: model quality (reasoning, math, coding, research, agentic tool use, long context, multimodal, multilingual, instruction following, factual reliability, creative quality, safety), capabilities (images, voice, video, web, citations, files, data, code, custom assistants, memory, automation), workflow fit, apps & API, privacy & compliance, reliability & support, value, and business features – plus technical specs and limits.

2. Turning parameters into points

Capability parameters score yes = 1, partial = 0.5, no = 0. Quality parameters use a 4-tier rating (●●●● = 1.0, ●●●○ = 0.75, ●●○○ = 0.5, ●○○○ = 0.25) based on recognized benchmarks and general standing. Numeric specs (context window, price, upload size, number of models) are scored in tiers. Price is inverted, so a cheaper plan scores higher, and a very large context window is capped so that a single huge number cannot decide the ranking.
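
As a rough illustration of the mapping described above, here is a minimal Python sketch. The yes/partial/no values and the 4-tier values come from the text; the function names and every tier boundary for price and context window are hypothetical, chosen only for the example, since the actual boundaries are not stated here.

```python
# Points for capability parameters, as stated in the text.
CAPABILITY_POINTS = {"yes": 1.0, "partial": 0.5, "no": 0.0}


def capability_points(value: str) -> float:
    """Map a yes/partial/no capability to points."""
    return CAPABILITY_POINTS[value]


def quality_points(filled_dots: int) -> float:
    """Map a 4-tier rating (1 to 4 filled dots) to 0.25, 0.5, 0.75 or 1.0."""
    if not 1 <= filled_dots <= 4:
        raise ValueError("a quality rating has 1 to 4 filled dots")
    return filled_dots / 4


# Hypothetical price tiers: (inclusive upper bound in USD per month, points).
# Price is inverted, so cheaper plans land in higher-scoring tiers.
PRICE_TIERS = [(10.0, 1.0), (20.0, 0.75), (30.0, 0.5)]


def price_points(monthly_price: float) -> float:
    """Return the points of the first tier the price fits into."""
    for upper_bound, points in PRICE_TIERS:
        if monthly_price <= upper_bound:
            return points
    return 0.25


# Hypothetical context-window tiers: (inclusive lower bound in tokens, points).
# The top tier acts as the cap: anything at or above it scores 1.0 and no more.
CONTEXT_TIERS = [(1_000_000, 1.0), (200_000, 0.75), (100_000, 0.5)]


def context_points(tokens: int) -> float:
    """Return the points of the highest tier the context window reaches."""
    for lower_bound, points in CONTEXT_TIERS:
        if tokens >= lower_bound:
            return points
    return 0.25
```

With these assumed tiers, a 10-million-token context window earns the same points as a 1-million-token one, which is the capping behaviour the text describes.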

3. Category scores

Each category score (0–100) is the average of its parameter points, scaled from the 0–1 point range to 0–100. The highest score wins that category.
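
The averaging step can be sketched in a few lines of Python. This assumes each parameter point lies between 0 and 1 and that the average is multiplied by 100 to reach the 0–100 range; the function name and the sample points are hypothetical.

```python
def category_score(points: list[float]) -> float:
    """Average a category's parameter points (each 0 to 1) and scale to 0-100."""
    if not points:
        raise ValueError("a category needs at least one parameter")
    return 100 * sum(points) / len(points)


# Made-up example: one "yes", one "partial", one "no" and one three-dot rating.
example_points = [1.0, 0.5, 0.0, 0.75]
print(category_score(example_points))  # 56.25
```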

Category	Weight	What it measures
Intelligence & model quality	30%	Reasoning, math, coding, research, agentic tool use, long context, multimodal, multilingual, instruction following, factual reliability, creative quality, prompt tolerance and safety.
Features & capabilities	20%	Concrete capabilities: images, voice, video, web, citations, files, data, code, custom assistants, projects, canvas, memory, automation.
Workflow fit	12%	How the tool fits real work: multi-model use, side-by-side compare, collaboration, document/presentation/spreadsheet studios, OCR, exports, team space.
Apps & ecosystem	9%	Apps (iOS, Android, desktop), browser extension, public API and integrations.
Privacy & compliance	8%	Training opt-out, regional data residency, retention, certifications, GDPR/DPA, HIPAA and availability.
Reliability & support	7%	Speed (response, first-token, heavy tasks), uptime, rate limits, consistency, support and data export.
Value & pricing	7%	Price, free tier, annual discount, models included and student discount.
Business & teams	7%	Team plans, SSO, admin, audit logs, enterprise tier and shared workspaces.

4. The final score

The final score out of 100 is the weighted sum of the eight category scores. The top score is the overall champion. Every number is shown on the matrix page, so you can change the weights in your head and re-judge.
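
For readers who would rather not re-weight in their head, here is a minimal Python sketch of the weighted sum. The weights are the ones in the table above; the short category keys, the function name and the sample category scores are hypothetical. Dividing by the total weight is an addition of this sketch, so that custom weights need not sum to exactly 1.

```python
# Category weights from the table above, keyed by hypothetical short names.
WEIGHTS = {
    "intelligence": 0.30,
    "features": 0.20,
    "workflow": 0.12,
    "apps": 0.09,
    "privacy": 0.08,
    "reliability": 0.07,
    "value": 0.07,
    "business": 0.07,
}


def final_score(category_scores: dict[str, float],
                weights: dict[str, float] = WEIGHTS) -> float:
    """Weighted average of 0-100 category scores, normalized by total weight."""
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    weighted = sum(weights[name] * category_scores[name] for name in weights)
    return weighted / total_weight


# Made-up category scores for one subscription.
scores = {
    "intelligence": 90, "features": 80, "workflow": 70, "apps": 60,
    "privacy": 50, "reliability": 75, "value": 65, "business": 55,
}
print(round(final_score(scores), 2))  # 74.45

# Re-judging with your own priorities, e.g. privacy counted as heavily as intelligence.
my_weights = dict(WEIGHTS, privacy=0.30)
print(round(final_score(scores, my_weights), 2))
```

Because the result is normalized, any set of non-negative weights can be passed in, including whole numbers such as 3, 2, 1.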

5. Factual vs editorial

Two kinds of input go into a score. Factual parameters (does it generate images? file size cap? context window? price? SSO?) are objective — taken from official pages and product testing. Editorial parameters (quality ratings for reasoning, coding, writing, reliability) are judgements informed by recognized benchmarks and hands-on use, expressed on a 4-tier scale. The weights between categories are editorial too. We mark which is which so you can disagree with the editorial parts and re-judge using the raw matrix.

6. Update frequency & no fake exactness

We re-check prices, features and standings regularly and date the page (currently 14 June 2026). Specs and prices are shown as approximate ("~") on purpose — pretending a file cap or message limit is exact when providers change them monthly would be dishonest. Always confirm current details on the official page before subscribing; our pricing archive links every source.

7. Limitations of benchmarks

No score settles "which AI is best" for everyone. Benchmarks can be gamed, leaderboards can reflect prompt selection rather than real quality, and your workload may weight categories differently than we do. Treat this as a transparent starting point, not gospel — change the weights in your head and re-read the matrix.

8. How we differ from arena-style leaderboards

Vote-based arenas (e.g. LMArena) answer one question: which model wins a single chat battle? That is useful but it ignores price, file limits, privacy, team features and real workflow. We answer a different, commercial question: which AI subscription is worth paying for? Our scope is the paid product — value, files, limits, privacy and workflow fit — not just raw model IQ.

9. Disclosure

This is an independent, experimental project and is not affiliated with OpenAI, Anthropic, Google, xAI, Microsoft or Perplexity. One of the subscriptions we score, MultipleChat, is an all-in-one aggregator; we include it on the same factual criteria as the others and you can verify every one of its rows in the matrix. Scores follow this method, not payment, and referral links never change a score. Found an error? Tell us — we fix it.
