Know if your AI is using the right logic.

ClearMetric is the testing platform for AI data agents. It checks the method behind AI-generated answers against your business rules, so teams catch the wrong table, join, or filter before the answer reaches a decision.

Logic drift caught
Asked through an AI data agent

What was Q3 net revenue?

ClearMetric · your rule
Active every run
Net revenue must use fact_orders and subtract refunds.
Run · Apr 2 · uses fact_orders · Matches
Run · May 6 · uses fact_orders · Matches
Run · Jun 1 · uses fact_orders_v2 · Wrong table

— The problem

The AI looks confident.
The logic is wrong.

Ask the same question twice and your agent can build it two different ways — a different table, an extra filter, the wrong grain. These aren't typos; they're the wrong method. The number still looks plausible, so it lands in a board deck before anyone checks.

~21%

On Spider 2.0 — the leading enterprise text-to-SQL benchmark — state-of-the-art AI solves only about one in five real tasks.

Spider 2.0 benchmark

71%

of data teams fear incorrect AI output reaching stakeholders — now the fastest-rising concern in the field.

dbt · 2026
— What it does

Define the right way.
Catch every drift.

One metric, end to end: write the rule, run the question through ClearMetric's agent, and compare the method it actually generated against that rule.

Ecommerce Warehouse · Define
Define (4)
All · Drifting · 4
Revenue · 2
West region revenue
orders · 3 logic
Active customers
customers · 2 logic

West region revenue

Experiment

Net revenue for the West sales region only.

Owner: No Owner
Primary table: orders
Logic: grain · order · region = 'West' · status = 'completed'
Tables & joins
orders: order_id, region, status, net_amount

Ecommerce Warehouse · Experiment

West region revenue Q3

Run checks

West region revenue · checked against your rule · 2 total runs

Agent: ClearMetric agent
Tool answers
ClearMetric agent · drifted · $4.20M · Yesterday
West region revenue Q3
SELECT SUM(net_amount) FROM orders WHERE quarter = 'Q3'

Ecommerce Warehouse · Experiment

Checked against your rule

ClearMetric agent

West region revenue · 2 total runs

Your rule: orders · status = 'completed' · region = 'West'
Generated method
drifted · $4.20M · Yesterday
SELECT SUM(net_amount) FROM orders WHERE quarter = 'Q3' -- region = 'West' never applied
Drifted: the region filter was dropped, so the query returned every region ($4.20M, not West-only).
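
The comparison above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not ClearMetric's implementation: the `Rule` class, the `check_method` function, and the plain text matching are all hypothetical names and choices made for this example. A real checker would likely parse the SQL instead of matching strings.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule shape: a primary table plus filters that must appear."""
    table: str
    required_filters: tuple[str, ...]


def normalize(sql: str) -> str:
    """Drop SQL line comments, lowercase, and collapse whitespace.

    Lowercasing also lowercases string literals, a simplification for this sketch.
    """
    without_comments = re.sub(r"--[^\n]*", "", sql)
    return re.sub(r"\s+", " ", without_comments).strip().lower()


def check_method(rule: Rule, generated_sql: str) -> list[str]:
    """Return drift findings; an empty list means the method matches the rule."""
    sql = normalize(generated_sql)
    findings = []
    # The word boundary keeps "orders_v2" from passing as "orders".
    if not re.search(rf"\bfrom {re.escape(rule.table.lower())}\b", sql):
        findings.append(f"wrong table: expected {rule.table}")
    for condition in rule.required_filters:
        if normalize(condition) not in sql:
            findings.append(f"missing filter: {condition}")
    return findings


# Rule and generated query taken from the West region revenue example.
rule = Rule(table="orders", required_filters=("region = 'West'", "status = 'completed'"))
generated = "SELECT SUM(net_amount) FROM orders WHERE quarter = 'Q3' -- region = 'West' never applied"
findings = check_method(rule, generated)
print(findings)
```

Checked against the rule exactly as written, this sketch reports the missing region filter and also the missing status filter, since neither condition appears in the generated query.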
— Why method, not number

We check the method,
not the number.

Numbers move with new data — that's fine. The method shouldn't.

Same question, each time you run it: “What was Q3 net revenue?”
Method stable · Method changed
Mar 1 · $4.18M · same method
Apr 1 · $4.20M · same method
May 1 · $4.19M · same method
May 15 · $4.23M · same method
Jun 1 · $3.91M · table changed
The number wiggled all spring — $4.18M → $4.23M — and stayed green, because the method held. ClearMetric flagged only the run where the table silently changed: fact_orders → fact_orders_v2.
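
One way to picture this: record a method fingerprint for each run and compare it to the expected method, leaving the number out of the comparison entirely. The sketch below is an illustration only. The `Run` record, `method_of`, and `flag_drift` are hypothetical names, and treating the method as "table plus filters" is an assumption made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Run:
    """Hypothetical record of one run: date, number returned, and method used."""
    date: str
    value_musd: float
    table: str
    filters: frozenset = frozenset()


def method_of(run: Run) -> tuple:
    """The fingerprint deliberately excludes the number."""
    return (run.table, run.filters)


def flag_drift(runs: list, expected_method: tuple) -> list:
    """Return the dates of runs whose method differs from the expected one."""
    return [run.date for run in runs if method_of(run) != expected_method]


# Values from the Q3 net revenue timeline.
runs = [
    Run("Mar 1", 4.18, "fact_orders"),
    Run("Apr 1", 4.20, "fact_orders"),
    Run("May 1", 4.19, "fact_orders"),
    Run("May 15", 4.23, "fact_orders"),
    Run("Jun 1", 3.91, "fact_orders_v2"),
]
expected = ("fact_orders", frozenset())
drifted = flag_drift(runs, expected)
print(drifted)  # ['Jun 1']
```

The four spring runs return different numbers but share one fingerprint, so only the June run is flagged.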
— When a drift is found

A flag, not a dead end.

ClearMetric is observational — it tells you what's wrong and where. From a flagged drift, you have two honest moves.

01

Update the rule

If the business definition genuinely changed, the AI may be right. Move your standard forward.

02

Fix the agent's context

Correct the table descriptions or instructions in the source tool so the agent stops reaching for the wrong logic.

The signal is the value: you didn't even know your AI was drifting.
— Where it fits

Run your metrics.
See where the logic holds.

ClearMetric runs each metric's question through its own agent against the metadata you connect — then scores where the method matches the rule you defined and where it drifts.

Read-only metadata
ClearMetric agent
runs your metric questions
Test results this run
Net revenue · matches
Active customers · matches
Gross margin · wrong table
Pipeline ARR · matches
Churn rate · extra filter
5 metrics tested · 3 match · 2 drift
Works outside your tools · read-only metadata · no row-level data
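
The summary line for a run is a simple tally of per-metric outcomes. A minimal sketch, assuming each metric ends a run with one outcome label; the `results` dictionary and `summarize` function are hypothetical and only mirror the example run shown here.

```python
from collections import Counter

# Outcomes copied from the example run; "matches" means the method agreed with the rule.
results = {
    "Net revenue": "matches",
    "Active customers": "matches",
    "Gross margin": "wrong table",
    "Pipeline ARR": "matches",
    "Churn rate": "extra filter",
}


def summarize(results: dict) -> str:
    """Count every outcome other than "matches" as a drift."""
    counts = Counter(
        "match" if outcome == "matches" else "drift" for outcome in results.values()
    )
    return f"{len(results)} metrics tested · {counts['match']} match · {counts['drift']} drift"


summary = summarize(results)
print(summary)  # 5 metrics tested · 3 match · 2 drift
```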
— Why ClearMetric

Not a catalog. Not a semantic layer.
Not a dashboard monitor.

Catalogs

Inventory tables.

Don't check whether the AI uses them right.

Documents

Semantic layers

Define metrics.

Don't see what your AI actually generated.

Calculates

Data monitors

Watch the data.

Don't check whether the logic was right.

Monitors

ClearMetric

Tests the logic.

Checks the method generated for the question — and flags when it drifts.

Verifies
— Who it's for

Built for high-stakes numbers.

ClearMetric earns its place where a wrong AI answer is expensive — the metrics that reach the people who make decisions. If occasional wrongness doesn't hurt, you don't need it yet.

Finance & revenue reporting · Board & exec metrics · Auditor-facing numbers · Regulated reporting

For the analytics lead putting an AI agent in front of these — test it before you trust it.

— FAQ

Common questions.

support@clearmetric.ai

ClearMetric runs your metric questions through its own agent against the metadata you connect, reproducing the same logic errors AI data tools make: wrong table, join, filter, or grain. You verify the method against the rules you defined and catch drift before it reaches a production answer.
You don't need a complete set of rules. Start with the few metrics that matter most (the numbers that reach finance, the board, or auditors) and define those rules first. Most teams already have strong opinions on how their top metrics should be computed, even if those opinions were never written down. ClearMetric is where they get written down and enforced.
ClearMetric tells you what drifted and where. It doesn't change your AI. A flagged drift gives you two moves: update the rule if the definition changed, or correct the agent's context in the source tool. The value is the signal: most teams don't know their AI is drifting until something is already wrong.
No row-level data. ClearMetric works from the metadata you connect and stores your metric rules and run history — the methods your AI generated over time. Credentials are encrypted, access is scoped to your org, and your production tools are never touched.
Numbers change for legitimate reasons like a new quarter or new data, and flagging every change would bury you in false alarms. The method is what shouldn't change. If the same question quietly starts using a different table or filter, that is the real signal. ClearMetric watches the method and shows the number only to tell you how much a drift cost.
Connect a metadata source, define a metric, run the question, and you see your first match or drift in the first session. Nothing to instrument, and no change to how your team works.

Test the logic before
you trust the answer.

A 30-minute walkthrough on your stack.