↳ THE METHOD

An engine reads every AI clause. Then we cite it.

AIRIN is a public record of how 402 AI platforms treat user content, prompts, outputs, and training rights. Every finding ships with the verbatim policy quote that supports it.

What is AIRIN?

AIRIN is an AI-vendor policy intelligence network that tracks public terms, privacy policies, and AI addenda for commercial AI tools. It extracts verbatim clauses, maps risk patterns, monitors policy changes, and packages cited evidence into platform records, Ask answers, and PDF reports. AIRIN is informational only, not legal advice.

01 · AUTOMATED PIPELINE

Every clause is scored by an AI reasoning engine.

Policy documents are fetched automatically three times per week and split into clauses by paragraph and section. Each clause is evaluated against five specific questions about IP ownership and privacy risk. Ratings are computed deterministically from those per-clause evaluations.

02 · VERBATIM ONLY

No paraphrase. Every finding ships with the source quote.

Every surface verdict on a platform record is paired with the verbatim language from the source policy, the exact section reference, and a link back to the original document.

03 · CONTINUOUS UPDATES

When a policy changes, the rating re-computes automatically.

When a source document changes, the affected clauses are re-evaluated automatically and the rating recomputed. Changes appear in the Updates feed within 24 hours of detection.

↳ THE SIX SURFACES WE TRACK

Every record · every surface · every tier.

PROMPT OWNERSHIP

Who owns the text, images, or files you submit as input.

We look for explicit ownership-retention language and any rights the user grants to the platform via license or assignment.

OUTPUT OWNERSHIP

Who owns the AI-generated content you receive.

We capture the platform's position on output rights, including any conditions on output ownership (compliance with usage policies, attribution, etc.).

MODEL TRAINING

Whether your submissions train future models.

We capture the default behavior (on or off), the visibility of the opt-out, and any tier-by-tier carve-outs.

COMMERCIAL USE

Whether you can use outputs for business, monetization, or resale.

We capture restrictions, revenue thresholds, and any tier-gated commercial provisions.

DATA RETENTION

How long the platform stores your prompts, outputs, and account data.

We capture the stated retention window, exceptions (T&S, legal hold, feedback), and any zero-retention configurations.

TIER DIFFERENCES

How free, pro, team, enterprise, and API tiers diverge.

We capture which surfaces materially change between tiers — especially training defaults, retention windows, and commercial use protections.

↳ THE FIVE QUESTIONS

The engine answers five questions for every clause.

1. Does this give the platform rights to user inputs beyond service delivery?
2. Does this affect who owns the outputs?
3. Does this permit training on user content?
4. Does this restrict commercial use of outputs?
5. How long is user data retained?

↳ HOW AIRIN PROVES A REPORT

The PDF and the site use the same proof chain.

Every visual report is meant to be easy to read, but it still has to prove its work. These five checks are the simple version of what the report shows.

Verbatim capture

AIRIN starts with the vendor's own policy text. If the quote is not in the source, it does not belong in the report.

Snapshot hash

Each source snapshot keeps a SHA-256 hash, so the evidence can be checked again later.
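As a rough illustration of how a stored hash lets anyone re-check the evidence later, the sketch below fingerprints a snapshot and verifies it against a recorded value. The function names and the choice of UTF-8 encoding are assumptions made for this example, not AIRIN's published implementation.

```python
import hashlib


def snapshot_hash(snapshot_text: str) -> str:
    """Return the SHA-256 hex digest of a snapshot (UTF-8 bytes assumed)."""
    return hashlib.sha256(snapshot_text.encode("utf-8")).hexdigest()


def verify_snapshot(snapshot_text: str, recorded_hash: str) -> bool:
    """True only if the text still hashes to the value recorded at capture."""
    return snapshot_hash(snapshot_text) == recorded_hash


if __name__ == "__main__":
    captured = "You retain ownership of your inputs."
    recorded = snapshot_hash(captured)
    print(verify_snapshot(captured, recorded))        # unchanged text
    print(verify_snapshot(captured + " ", recorded))  # any edit breaks the match
```

Even a single added space changes the digest, which is what makes the hash useful as a re-checkable baseline.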

Deterministic rubric

The same clause should produce the same risk rating every time. Ratings are computed from the rubric, not guessed.

Silence is caution

If a policy does not clearly protect a right, the report says that plainly instead of filling in a friendly assumption.

Separate reader views

Creator, GRC, and counsel views stay separate so one audience does not flatten what another audience needs to know.

↳ THE RATING RULES

Deterministic. No judgment calls.

The overall rating is the most severe single-clause verdict. The same clauses always produce the same rating — the decision table below is the entire logic.

LOW

No clause meets the MED or HIGH criteria.

MED

Any single clause grants a broad (non-sublicensable) license to inputs, OR enables training with an opt-out, OR restricts commercial use of outputs.

HIGH

Any single clause grants sublicensable rights to inputs, OR claims output ownership, OR permits training with no opt-out.
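The decision table above can be sketched in a few lines. This is a minimal illustration, assuming each clause has already been reduced to a set of yes/no flags; the `ClauseEval` field names are hypothetical and chosen only to mirror the wording of the LOW, MED, and HIGH rules.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Iterable


class Rating(IntEnum):
    LOW = 0
    MED = 1
    HIGH = 2


@dataclass(frozen=True)
class ClauseEval:
    # Hypothetical per-clause flags; names are illustrative only.
    broad_input_license: bool = False         # broad, non-sublicensable license to inputs
    sublicensable_input_rights: bool = False
    claims_output_ownership: bool = False
    permits_training: bool = False
    training_opt_out: bool = False
    restricts_commercial_use: bool = False


def clause_verdict(c: ClauseEval) -> Rating:
    """Apply the HIGH criteria first, then MED, otherwise LOW."""
    if (c.sublicensable_input_rights
            or c.claims_output_ownership
            or (c.permits_training and not c.training_opt_out)):
        return Rating.HIGH
    if (c.broad_input_license
            or (c.permits_training and c.training_opt_out)
            or c.restricts_commercial_use):
        return Rating.MED
    return Rating.LOW


def overall_rating(clauses: Iterable[ClauseEval]) -> Rating:
    """The overall rating is the most severe single-clause verdict."""
    return max((clause_verdict(c) for c in clauses), default=Rating.LOW)
```

Because the function is a pure mapping from flags to a rating, the same clause flags always yield the same result, which is the property the rules describe.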

⚠ CROSS-DOCUMENT CONTRADICTION SCAN

Our pipeline continuously scans for conflicting statements across the different legal documents published by the same platform (e.g., between its Terms of Service and its Privacy Policy). When two documents give directly contradictory answers to the same question, the result is flagged as a Conflict.

Because conflicts represent severe legal and compliance exposure in corporate reviews, platforms with active contradictions are automatically rated at least MED (HIGH when the clause severity warrants it) to ensure reviewer visibility. The record also carries a warning banner that highlights the specific conflicting excerpts.
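One way to picture the contradiction scan is as a grouping step: collect each document's answer to each question, then flag any question where different documents disagree. The sketch below is a simplification that treats any two differing answers as contradictory; the tuple layout and the `find_conflicts` name are assumptions for illustration.

```python
from collections import defaultdict


def find_conflicts(answers):
    """Flag questions answered differently by different documents.

    `answers` is an iterable of (document, question, answer) tuples.
    Returns {question: {answer: [documents giving that answer]}}.
    """
    by_question = defaultdict(lambda: defaultdict(set))
    for document, question, answer in answers:
        by_question[question][answer].add(document)

    conflicts = {}
    for question, by_answer in by_question.items():
        documents = set().union(*by_answer.values())
        # A conflict needs more than one answer and more than one document.
        if len(by_answer) > 1 and len(documents) > 1:
            conflicts[question] = {a: sorted(d) for a, d in by_answer.items()}
    return conflicts
```

Keeping the documents attached to each answer is what would let a warning banner show both conflicting excerpts side by side.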

↳ THE WORKFLOW

From source document to public record.

01 · Source fetched.
Every customer-facing policy document (Consumer ToS, Privacy Policy, Commercial ToS, Usage Policy, and any AI-specific addenda) is fetched automatically three times per week.

02 · Clauses extracted.
Each document is split into clauses by paragraph and section, mapped to the 13 risk surfaces, and stored with its verbatim quote and a deep link to the exact section it came from.

03 · Clauses scored.
Each clause is evaluated by the reasoning engine against the five questions, and the deterministic decision table computes the rating. Example: Claude (Anthropic), rating MED, driver “Outputs limited to non-commercial use in evaluation context.”

04 · Published and monitored.
The record is published with verbatim quotes, citations, the computed rating, and an auto-generated driver. Documents are re-crawled three times per week; any change re-evaluates the affected clauses and appears in the Updates feed within 24 hours.

↳ DETERMINISM

Ratings are computed, not editorialized.

The reasoning engine runs at a fixed temperature, so the same clause yields the same answer on every run. The decision table above is the entire rating logic — there are no manual overrides. Sponsors have no influence over ratings.

↳ NOT LEGAL ADVICE

This is a public record, not counsel.

AIRIN is informational. Findings are summaries of public policy language and do not constitute legal advice. Before relying on a finding for a contractual decision, consult counsel.

↳ GRC AUDIT STANDARDS

The 3-pass verification SOP

Every vendor review follows the same standardized 3-pass extraction and verification process, so legal counsel and procurement managers can audit each step:

PASS 01 · AUTOMATED INGEST

Playwright headless crawlers capture raw text snapshots from live policy URIs. Documents are hashed (SHA-256) to establish an immutable verification baseline.

PASS 02 · CLAUSE EXTRACTION

An extraction engine pulls verbatim clauses, maps them to the 13 risk surfaces, and checks each against a 40-character sentinel from the source text for original-source verification.

PASS 03 · AUTOMATED VERIFICATION

A second automated pass re-checks every citation against the stored snapshot and validates its exact source coordinates before the record is published. No human edits the ratings; they are computed, not editorialized.

Which surfaces drive the rating

We don't blend the 13 risk surfaces into a weighted average: the overall rating is set by the single most severe clause (see the rating rules above). But the surfaces aren't equal in what counts as severe. A broad grant over your prompts or outputs, or training on your inputs, jeopardizes your IP most directly, so the HIGH/MED thresholds are strictest there.

In rough order of how often they drive a MED or HIGH for the buyers we serve:

1 · Ownership

Prompt & output grants

2 · Training

Training on inputs & opt-out

3 · Retention

How long inputs are kept

4 · Commercial

Limits on using outputs

↳ THE BENCHMARK RUBRIC · V1.0

In a landscape where most platforms claim broad rights, 35 of 375 earn Exemplary for creators.

9% of benchmarked platforms reach the top band on the creator lens (45 of 375, 12%, on the enterprise lens). Exemplary is calibrated to be genuinely scarce: demanding criteria, a moderate curve set against the real field, and an absolute requirement of zero dealbreakers.

Every number in this section is computed live from the verified corpus and recalculates as coverage grows. The curve in rubric v1.0 is provisional and will be re-frozen against the fuller population after the current coverage sweep — recorded as a new rubric version, never a silent shift. Every published band stores the rubric version that produced it.

TWO LENSES, NEVER BLENDED

Every platform is assessed twice: once for creators (your prompts, your outputs, your IP) and once for enterprises (data use, retention, subprocessors, audit rights). The two assessments are independent — a platform can be Strong for one audience and Caution for the other, and we never average that away.

SILENCE = CAUTION

A policy that doesn't address a criterion is penalized, not given the benefit of the doubt. If a vendor wants credit for protective behavior, the policy has to say it — in writing, where we can quote it. This rewards vendor clarity and is the defensible default for a legal-adjacent public claim.

The four dealbreakers

Four clause patterns are disqualifying by design. One dealbreaker imposes a heavy band penalty (a vendor that is otherwise excellent can recover no higher than Adequate). Two or more impose a hard ceiling: the band can never exceed Caution, regardless of everything else, because multiple rights-grabs cannot be outweighed. A dealbreaker only ever trips on the verbatim text of a verified clause, never on silence and never on inference, and every tripped dealbreaker links to the exact clause that tripped it.

Output license-grab

If the platform takes a broad or perpetual license over what you create with it, your ownership of your own work is compromised at the root — no other clause can repair that.

Training without opt-out

If your inputs and outputs feed model training and the policy offers no way to decline, you cannot contain where your content goes.

Indefinite retention

Retention with no stated bound and no deletion right means your data's exposure never ends.

Third-party sublicensing

Rights that can be passed to third parties escape every promise the platform itself makes you.

We publish which dealbreakers exist, why each is disqualifying, and the band consequences they carry. We do not publish the detection patterns that decide whether a given clause trips one — publishing those would let vendors reword around them rather than fix the underlying rights problem. (The same protection applies to our capture-gate thresholds.)
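The band consequences of dealbreakers can be sketched as a cap applied after the rubric has produced a band. One assumption is flagged loudly here: the text says only that a single dealbreaker leaves an otherwise excellent vendor no higher than Adequate, so modelling that penalty as a simple cap at Adequate is a guess at the mechanism, not the published rule. The function name is hypothetical.

```python
# Bands from worst to best, as listed in the rubric.
BANDS = ["Severe", "Caution", "Adequate", "Strong", "Exemplary"]


def apply_dealbreakers(base_band: str, dealbreakers: int) -> str:
    """Cap a band according to the number of tripped dealbreakers.

    Assumption: one dealbreaker is modelled as a cap at Adequate;
    two or more cap the band at Caution, as the rubric states.
    """
    if dealbreakers < 0:
        raise ValueError("dealbreakers must be non-negative")
    if dealbreakers == 0:
        return base_band
    cap = "Adequate" if dealbreakers == 1 else "Caution"
    # A cap only ever lowers a band; it never raises a weaker one.
    return BANDS[min(BANDS.index(base_band), BANDS.index(cap))]
```

Under this model Exemplary is reachable only with zero dealbreakers, which matches the stated requirement for the top band.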

The five bands

Exemplary: The scarce apex. Demanding criteria met, zero dealbreakers. Deliberately rare.

Strong: Clearly protective posture across the rubric.

Adequate: Workable, with caveats worth reading.

Caution: Significant rights or data exposure; read the cited clauses before relying on it. Also the ceiling for any platform with 2+ dealbreakers.

Severe: Broad rights-taking or exposure across criteria.

Under the hood, each lens computes a continuous criterion-weighted score; band cutoffs are set as a moderate curve against the real distribution. Only the band is ever published — the continuous number is internal (it would be gameable, and it is false precision for a legal-adjacent public claim). Tapping any band reaches the specific verified findings and verbatim citations that drove it; a band you can't trace to source would violate our own charter.

When a platform gets a band — and when it doesn't

FULL

Both core documents (Terms + Privacy) captured and read in full through the gates → both lenses receive bands.

PROVISIONAL-PARTIAL

One core document verified, the other not yet → only the lens that document supports is banded; the other shows an honest "assessment pending" gap with a way to point us at the missing document. We never band from documents we haven't read.

NOT SHOWN

A platform appears on this site only when we hold at least one fully-verified document from the confirmed-correct company, with no unresolved review flag. If we aren't sure the evidence is about the right company — or we have no verified evidence at all — we keep working on it and show nothing, rather than something we can't stand behind. This applies identically to humans and AI agents using our APIs.
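The three publication states above reduce to a small decision function. The sketch below is illustrative only: the parameter names are hypothetical, and it does not model which lens each core document supports, since the text does not specify that mapping.

```python
def publication_status(terms_verified: bool,
                       privacy_verified: bool,
                       company_confirmed: bool,
                       review_flag_open: bool) -> str:
    """Return FULL, PROVISIONAL-PARTIAL, or NOT SHOWN for a platform."""
    # Wrong-company doubt or an unresolved review flag blocks publication.
    if not company_confirmed or review_flag_open:
        return "NOT SHOWN"
    if terms_verified and privacy_verified:
        return "FULL"                  # both lenses receive bands
    if terms_verified or privacy_verified:
        return "PROVISIONAL-PARTIAL"   # only the supported lens is banded
    return "NOT SHOWN"                 # no verified evidence at all
```

Checking identity and review flags first means a platform with two verified documents still stays hidden if there is any doubt that the evidence belongs to the right company.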

Every band, everywhere it appears, is an automated assessment against this published rubric — not legal advice. Before relying on a finding for a contractual decision, read the cited clauses and consult counsel.

↳ THE EVIDENTIARY PIPELINE

From source document to band — the chain of custody.

Nothing on this site exists without passing every step below, in order. Each step either verifies evidence or refuses it — there is no third option, and no human edits the outcome.

Automated processing disclosure

All policy discovery, clause extraction, risk scoring, and publication decisions on this platform are performed by automated systems without human review. No AIRIN employee reads, edits, or approves individual findings before they are published. Ratings are computed deterministically from verbatim source text per the rubric above — they are not editorialized or curated. If you believe a rating is incorrect, email hello@airinetwork.com with the specific clause and we will re-run the extraction against the current document.

Document capture

The live policy is fetched and frozen to an immutable, SHA-256-hashed snapshot. The snapshot — not the live page — is the evidence everything below refers to.

Rejected here: nothing yet — capture only records.

Gate 1 — capture completeness

The snapshot must be the real document: cookie walls, login screens, CAPTCHA shells, and truncated fetches are detected and refused.

Rejected here: walls, stubs, partial fetches — they produce zero findings.

Gate 2 — whole-document read

The document is exhaustively segmented and every segment is classified. Coverage is measured against the document's own structure — skimming is structurally impossible.

Rejected here: incomplete coverage, laundered segments, phantom quotes.

Finding + verbatim citation

Each finding is an exact substring of the hashed snapshot, with a structural locator (e.g. “§ 4.3”) and a deep link to the clause in the live document.

Rejected here: any quote that is not an exact substring of the snapshot.
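The exact-substring rule is straightforward to express in code. The following is a minimal sketch, assuming the snapshot is held as a single string; `locate_quote` is a hypothetical helper name, and the returned character offsets are one possible way to record where a quote sits in the snapshot.

```python
from typing import Optional, Tuple


def locate_quote(snapshot: str, quote: str) -> Optional[Tuple[int, int]]:
    """Return (start, end) character offsets of the quote, or None.

    The match is exact: no case folding, no whitespace normalisation,
    so a paraphrase or a lightly edited quote is rejected.
    """
    if not quote:
        return None
    start = snapshot.find(quote)
    if start == -1:
        return None
    return start, start + len(quote)


if __name__ == "__main__":
    snapshot = "4.3 You retain ownership of your inputs."
    print(locate_quote(snapshot, "You retain ownership"))       # (4, 24)
    print(locate_quote(snapshot, "Users retain ownership"))     # None
```

Slicing the snapshot with the returned offsets reproduces the quote character for character, which is what makes the citation re-checkable against the hashed snapshot.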

Criterion classification

Every finding is classified into one of the 13 published rubric criteria (prompt ownership, output ownership, training use, retention, …).

Rejected here: nothing is invented — a criterion with no clause stays silent (and silence is penalized, not excused).

Band

The rubric scores each lens (creator / enterprise) from the classified, cited findings and publishes a band. Every band can be traced back through this exact chain.

Withheld here: any platform without a verified document from the confirmed-correct company gets no band and no page.

↳ HOW WE KEEP RATINGS HONEST

Every rating change is recorded, dated, and auditable.

A rating is only useful if you can check where it came from and see when it changed. These five mechanisms are built into the pipeline that publishes every record — they are how the system works, not a policy promise.

Verbatim citations

Every finding pairs its verdict with the exact clause quoted from the platform's own policy, plus a link back to the source document. If the quote cannot be verified against the stored source, the finding does not publish.

Snapshots and change tracking

Every policy document is captured as a snapshot with a SHA-256 fingerprint. New captures are compared against the stored snapshot, so when a platform edits its policy the change is detected, dated, and traceable to the exact text that moved. This is what powers Watch alerts and the Updates feed.
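A hedged sketch of the comparison step: if the fingerprints of the stored and fresh captures match, nothing changed; if they differ, a line-level diff shows the text that moved. Using Python's `difflib` and comparing line by line are choices made for this example, not a description of AIRIN's actual differ.

```python
import difflib
import hashlib


def fingerprint(text: str) -> str:
    """SHA-256 hex digest of the capture (UTF-8 bytes assumed)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def changed_lines(stored: str, fresh: str) -> list:
    """Return removed ('- ') and added ('+ ') lines, or [] if unchanged."""
    if fingerprint(stored) == fingerprint(fresh):
        return []  # identical captures: skip the diff entirely
    diff = difflib.ndiff(stored.splitlines(), fresh.splitlines())
    # ndiff also emits '  ' (unchanged) and '? ' (hint) lines; drop those.
    return [line for line in diff if line.startswith(("- ", "+ "))]
```

The hash comparison is the cheap first check; the diff runs only when a capture has actually changed.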

An audit trail that cannot be rewritten

When the classification system changes a rating or label, it writes a new row to an append-only audit table: the old value, the new value, the date, and the method that made the change. The database itself rejects edits and deletions on that table — history can be added to, never rewritten or erased.
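To show what "the database itself rejects edits and deletions" can look like, the sketch below uses SQLite triggers as a stand-in. The text does not name the database engine, so SQLite, the table name, and the column names are all assumptions for illustration.

```python
import sqlite3

SCHEMA = """
CREATE TABLE rating_audit (
    id         INTEGER PRIMARY KEY,
    platform   TEXT NOT NULL,
    old_value  TEXT,
    new_value  TEXT NOT NULL,
    changed_on TEXT NOT NULL,
    method     TEXT NOT NULL
);
-- Any UPDATE or DELETE on the audit table is aborted by the database.
CREATE TRIGGER rating_audit_no_update BEFORE UPDATE ON rating_audit
BEGIN
    SELECT RAISE(ABORT, 'rating_audit is append-only');
END;
CREATE TRIGGER rating_audit_no_delete BEFORE DELETE ON rating_audit
BEGIN
    SELECT RAISE(ABORT, 'rating_audit is append-only');
END;
"""


def open_audit_db() -> sqlite3.Connection:
    """Create an in-memory database with the append-only audit table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def record_change(conn, platform, old_value, new_value, changed_on, method):
    """Append one audit row; rows can be added but never edited or removed."""
    conn.execute(
        "INSERT INTO rating_audit "
        "(platform, old_value, new_value, changed_on, method) "
        "VALUES (?, ?, ?, ?, ?)",
        (platform, old_value, new_value, changed_on, method),
    )
    conn.commit()
```

With the triggers in place, an attempted `UPDATE` or `DELETE` raises a database error in the calling code, so the rule is enforced by the storage layer, not by application discipline.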

Blind multi-model review

Clauses are classified by multiple AI models working independently — no model sees another's answer. Disagreements go to adjudication, and the settled answers build a gold set of examples the classifier is scored against. Accuracy is measured against that gold set, not assumed.
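The review flow can be sketched in two small functions: one that accepts a label only when every model agrees and otherwise routes the clause to adjudication, and one that scores a classifier against the gold set. The data shapes and function names here are hypothetical.

```python
def review(labels_by_model: dict) -> dict:
    """Accept a label only on unanimous agreement between models.

    `labels_by_model` maps a model name to the label it assigned
    independently, without seeing any other model's answer.
    """
    if not labels_by_model:
        raise ValueError("at least one model label is required")
    distinct = set(labels_by_model.values())
    if len(distinct) == 1:
        return {"label": distinct.pop(), "needs_adjudication": False}
    return {"label": None, "needs_adjudication": True}


def accuracy(predictions: dict, gold: dict) -> float:
    """Share of gold-set clauses the classifier labelled correctly."""
    scored = [clause_id for clause_id in gold if clause_id in predictions]
    if not scored:
        raise ValueError("no overlap between predictions and gold set")
    correct = sum(predictions[c] == gold[c] for c in scored)
    return correct / len(scored)
```

Only clauses present in the gold set count toward accuracy, so the metric reflects settled answers and ignores clauses still awaiting adjudication.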

An accuracy gate in CI

The citation-verification suite runs automatically on every code change, in continuous integration, alongside the type and build checks. A change that would break the machinery that checks quotes against sources does not ship.

For GRC teams and counsel: this means a rating you relied on can be traced — to the clause it quotes, to the dated snapshot it was read from, and to a change record if it was ever revised.

↳ CLAUSE INTELLIGENCE

The record is clause-first, not summary-first.

AIRIN stores canonical policy clauses after the citation gate passes. Pattern matches, stance events, Ask answers, timelines, and reports all point back to those same clauses, so the product can become more searchable without becoming less auditable.

Canonical clauses

A canonical clause is a verbatim excerpt from a gate-verified policy snapshot, stored with surface, source URL, deep link, structural citation, character offsets, and snapshot metadata.

Risk patterns

Risk patterns are deterministic matches over canonical clause text. They identify families such as training use, IP license, data retention, dispute resolution, and tier conditionality.

Stance events

A stance event turns a matched clause into a time-stamped policy position: for example, training permitted, training excluded, opt-out available, broad license granted, or retention exception present.

Policy evolution

Evolution diffs compare stance events across captures. A change is labeled improved, worsened, or changed only when before and after citations exist.
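A minimal sketch of the labelling rule, assuming each stance event carries its stance and a citation. The `SEVERITY` ordering below is invented purely for illustration (the text does not publish how stances are ranked); the one rule taken from the text is that no label is produced unless both the before and after events are cited.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical ranking: lower is more protective of the user.
SEVERITY = {
    "training excluded": 0,
    "opt-out available": 1,
    "training permitted": 2,
}


@dataclass(frozen=True)
class StanceEvent:
    stance: str
    citation: str  # verbatim quote backing the stance; empty if missing


def label_change(before: Optional[StanceEvent],
                 after: Optional[StanceEvent]) -> Optional[str]:
    """Return 'improved', 'worsened', 'changed', or None (no label)."""
    if before is None or after is None:
        return None
    if not before.citation or not after.citation:
        return None  # both sides must be cited before any label is given
    if before.stance == after.stance:
        return None
    b, a = SEVERITY.get(before.stance), SEVERITY.get(after.stance)
    if b is None or a is None or a == b:
        return "changed"  # a real change, but direction cannot be ranked
    return "improved" if a < b else "worsened"
```

Falling back to "changed" when a stance is not in the ranking avoids asserting a direction the evidence does not support.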

Clause retrieval

Ask and machine search retrieve canonical clauses directly. Embeddings may rank candidate clauses, but they never generate facts, ratings, or legal conclusions.

No manual approval

Clause intelligence preserves AIRIN's no-human-review posture. If a citation cannot be verified against the stored source, it does not publish as a clause signal.

What semantic retrieval is allowed to do

Semantic retrieval can make Ask feel natural by finding clauses that use different wording than the question. It cannot fill gaps. If the retrieved clause does not contain the answer, AIRIN refuses or says the database does not contain a verified clause for that question.


↳ FOUND AN ERROR?

If we got something wrong, we want to know.

Every record has a “submit a correction” link. Substantiated corrections are credited in the record’s history.

