XeroML is now in early access: request your invite

Governance layer for agents.

Compliance observability for every AI agent in finance — logged, auditable, regulator-ready.

Financial institutions face an average compliance breach cost of $14.8M. AI agents are scaling faster than governance, and XeroML closes the gap. Plug in, evaluate, prove compliance.

INTEGRATION TIME < 30 min

FRAMEWORKS SUPPORTED 20+

REAL-TIME COMPLIANCE TRACKING

Every agent decision. Audited in real time.

XeroML wraps your AI agents with a compliance observability layer — every decision is logged, mapped to the applicable regulation, and scored before it reaches production. No manual audits. No spreadsheets. No lag.

🔍

Live Agent Monitoring

Every lending, fraud, and payment decision tracked in real time — not reviewed weeks later in spreadsheets

⚖

AI Audit Agent

An autonomous agent reads decision logs, scores against live regulations, and flags violations before regulators do

📋

Regulator-Ready Outputs

Adverse action notices, audit packs, and CCO reports — auto-generated in seconds, not weeks

1. USER / REQUEST: REQUEST.RECEIVED()
   KYC submission, loan approval, or fraud check enters your AI workflow.

2. AGENT: AGENT.EVALUATE()
   Agent analyzes context, reasons over policy, and proposes a decision with evidence.
   DECISION.PROPOSE() · CONTEXT.SUMMARY()

3. COMPLIANCE SDK INTERCEPT: RULE.VERIFY()
   SDK intercepts the decision in real time and checks ECOA, SR 11-7, Fair Lending, and BSA/AML rules before execution.
   APPROVE / BLOCK · COMPLIANCE.FLAGS · AUDIT.TRAIL · REASONING.LOG

4. SYSTEM / RECORD: DECISION.EXECUTE()
   Approved decisions run, blocked actions halt, and every outcome is sealed in immutable logs.

Agent decision flow

XeroML intercept at step 3

Real-time compliance evaluation
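The intercept at step 3 can be pictured with a minimal sketch. This is not the XeroML SDK: `Decision`, `intercept`, and the two example rules are hypothetical names chosen for illustration, and real regulatory checks would be far richer than these.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Decision:
    action: str                      # e.g. "approve_loan"
    reasoning: str                   # human-readable explanation from the agent
    context: dict = field(default_factory=dict)


# A rule returns a flag string when violated, or None when satisfied.
Rule = Callable[[Decision], Optional[str]]


def requires_reasoning(d: Decision) -> Optional[str]:
    return None if d.reasoning.strip() else "missing human-readable reasoning"


def no_protected_attributes(d: Decision) -> Optional[str]:
    protected = {"race", "gender", "zip"}
    used = protected & set(d.context.get("features_used", []))
    return f"protected attributes used: {sorted(used)}" if used else None


RULES = [requires_reasoning, no_protected_attributes]


def intercept(decision: Decision, rules: list, audit_trail: list) -> str:
    """Check a proposed decision against every rule before it executes."""
    flags = [flag for rule in rules if (flag := rule(decision)) is not None]
    verdict = "BLOCK" if flags else "APPROVE"
    # Every outcome is recorded, whether it was approved or blocked.
    audit_trail.append(
        {"action": decision.action, "verdict": verdict, "flags": flags}
    )
    return verdict
```

Calling `intercept(decision, RULES, trail)` on each proposed decision returns the verdict and appends one record to `trail`, so approved and blocked outcomes share a single log.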

END-TO-END EVAL & TRACING

See every step. Score every output.

Full Workflow Traces

Trace ML pipelines, LLM calls, retrieval steps, and agent decisions in a single view

LLM Judge Evaluation

Calibrated judges score every output for accuracy, grounding, compliance, and coherence

Real-Time Monitoring

Latency, token cost, eval scores, and drift detection across every agent run
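As a rough illustration of what per-run tracing involves, here is a minimal sketch of a decorator that records latency and status for each call. It is an assumption-laden stand-in, not the XeroML collector: `traced` and the in-memory `TRACES` list are hypothetical.

```python
import functools
import time

TRACES: list = []   # in-memory stand-in for a trace collector


def traced(name: str):
    """Record latency and success/error status for every call of the wrapped function."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            status = "error"
            try:
                result = fn(*args, **kwargs)
                status = "ok"
                return result
            finally:
                # Runs on both success and failure, so no call goes unrecorded.
                TRACES.append({
                    "span": name,
                    "status": status,
                    "latency_ms": (time.perf_counter() - start) * 1000.0,
                })
        return wrapper
    return decorator


@traced("credit-risk-assessor")
def assess_risk(applicant: dict) -> dict:
    dti = applicant["monthly_debt"] / applicant["monthly_income"]
    return {"dti": dti, "high_risk": dti > 0.43}
```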

XeroML Trace Explorer

The fastest way to evaluate, debug, and improve your financial AI workflows.

Explore Trace Explorer

Connect your agent

Drop in our SDK with 3 lines of code. Supports LangChain, LlamaIndex, CrewAI, custom frameworks, and any OpenTelemetry-compatible pipeline.

Configure LLM judges

Calibrated judges score factual grounding, numerical accuracy, bias/fair lending, and reasoning coherence. Failing scores block deployment.

Set compliance rules

Map regulatory requirements to automated checks: fair lending, advisory, PII masking, data protection. Flag decisions lacking explainability.

Deploy to production

One Docker Compose command deploys XeroML inside your VPC. No external dependencies, no data egress, and production-ready in under 30 minutes.

Monitor & improve

Live dashboards for traces, eval scores, and compliance checks. Low scores surface prompt fixes automatically.

Connect a new agent

xeroml==2.1.0 · pip install xeroml · OpenTelemetry compatible

Python SDK

    # xeroml==2.1.0 · 3 lines to full observability
    import os
    from xeroml import trace

    trace.init(
        project="loan-underwriting",
        api_key=os.environ["XEROML_KEY"],
    )

    # llm and risk_prompt are defined elsewhere in your application
    @trace.agent("credit-risk-assessor")
    def assess_risk(applicant: dict) -> dict:
        return llm.run(risk_prompt, applicant)

LangChain

    import os
    from langchain.agents import initialize_agent
    from langchain_openai import ChatOpenAI
    from xeroml.integrations import XeromlCallbackHandler

    handler = XeromlCallbackHandler(
        project="loan-underwriting",
        api_key=os.environ["XEROML_KEY"],
    )

    # credit_check and kyc_lookup are your own LangChain tools
    agent = initialize_agent(
        tools=[credit_check, kyc_lookup],
        llm=ChatOpenAI(model="gpt-4o"),
        callbacks=[handler],
    )

CrewAI

    import os
    from crewai import Agent
    from xeroml.integrations import xeroml_crewai

    xeroml_crewai.init(
        project="loan-underwriting",
        api_key=os.environ["XEROML_KEY"],
    )

    # credit_score_tool and income_verifier are your own CrewAI tools
    risk_agent = Agent(
        role="Credit Risk Analyst",
        goal="Assess borrower creditworthiness",
        tools=[credit_score_tool, income_verifier],
    )

Connected · loan-underwriting 847 traces collected

LLM Judge scores

7d eval trend

0.94 ↑ +0.26

Factual grounding PASS

Verifies outputs against RBI/SEBI regulatory data and live credit bureau feeds

97.3% grounded Threshold: ≥ 95%

Numerical accuracy FAIL

Checks EMI, IRR, debt-service coverage ratios, and amortization tables

88.1% correct Threshold: ≥ 99.5%

Bias / fair lending PASS

Detects protected-class correlates (race, gender, zip) in loan decision reasoning

0 bias signals Threshold: 0 flags

Reasoning coherence PASS

Validates chain-of-thought auditability for Basel III model risk documentation

0.94 coherence score Threshold: ≥ 0.90
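The pass/fail logic behind this panel can be sketched as a simple threshold gate. The scores and thresholds below are the ones shown in the panel; `JudgeResult` and `deployment_gate` are hypothetical names, and how XeroML actually computes or aggregates judge scores is not specified here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JudgeResult:
    name: str
    score: float
    threshold: float
    higher_is_better: bool = True   # False for counts such as bias signals

    @property
    def passed(self) -> bool:
        if self.higher_is_better:
            return self.score >= self.threshold
        return self.score <= self.threshold


# Values taken from the "LLM Judge scores" panel.
RESULTS = [
    JudgeResult("factual_grounding", 0.973, 0.95),
    JudgeResult("numerical_accuracy", 0.881, 0.995),
    JudgeResult("bias_signals", 0, 0, higher_is_better=False),
    JudgeResult("reasoning_coherence", 0.94, 0.90),
]


def deployment_gate(results):
    """Return (allowed, failing_judges); any failing judge blocks deployment."""
    failing = [r.name for r in results if not r.passed]
    return (not failing, failing)
```

With the panel's numbers, the gate reports that deployment is blocked by the numerical accuracy judge alone.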

Compliance rules

loan-underwriting 5 rules · 4 active Basel III + SEBI IA

Fair lending: explainability audit trail · CRITICAL · RBI/2024
ECOA-compliant: every loan decision must include human-readable reasoning. Auto-reject if explainability score < 0.80.

Investment advisory: suitability assessment · CRITICAL · SEBI IA/2013
SEBI IA Regulations 2013: recommendations must pass suitability scoring against client risk profile. Threshold: 0.85.

PII masking: sensitive identifiers · HIGH
Block any trace containing unmasked PII before storage. Regex + NER dual detection.

Agent drift: output distribution monitoring · MEDIUM
Alert when agent output diverges > 5% from 30-day rolling baseline.

Data protection: consent verification · HIGH
Verify user data processing consent before executing any agent pipeline.
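To make the PII masking rule concrete, here is a minimal sketch of the regex half of such a check. The two patterns (email address and US-style SSN) are illustrative assumptions only, the NER pass is omitted, and `mask_pii` and `safe_to_store` are hypothetical helpers rather than XeroML functions.

```python
import re

# Illustrative patterns only; a production rule set would cover many more identifiers.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def mask_pii(text: str):
    """Replace each detected identifier with a [LABEL] token; return (masked_text, hit_count)."""
    total = 0
    for label, pattern in PII_PATTERNS.items():
        text, hits = pattern.subn(f"[{label}]", text)
        total += hits
    return text, total


def safe_to_store(text: str) -> bool:
    """A trace payload is storable only if no unmasked identifier remains."""
    return mask_pii(text)[1] == 0
```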

Deploy to your VPC

✓ Eval gates passed 3/3 active judges

✓ Compliance rules active 4/4 enabled

✓ PII scan clean 0 leaks detected

$ xeroml deploy --env production --vpc internal
Validating eval gates.............. ✓ 3/3 passed
Compliance rules check............. ✓ 4/4 active
PII scan on trace schema........... ✓ clean

Pulling images (registry.xeroml.io):
  ✓ collector:2.1.0     pulled   0.8s
  ✓ eval-engine:2.1.0   pulled   1.4s
  ✓ dashboard:2.1.0     pulled   0.6s

Starting services:
  ✓ xeroml-collector    healthy  0.4s
  ✓ xeroml-eval-engine  healthy  1.1s
  ✓ xeroml-api          healthy  0.8s
  ✓ postgres + redis    healthy  0.3s

✓ Deployed in 28s · No data left VPC

Dashboard live at traces.internal:3000 VPC-ONLY

Data residency: in-cluster (Mumbai region) SOC2

Ingesting traces — 847 received in last 5 min LIVE

COMPLIANCE INTELLIGENCE

Score every decision. Prove compliance.

Regulatory Scoring

Automated checks mapped to RBI, SEBI, ECOA, and Basel III — every agent output scored before it reaches production

Explainability Reports

Generate ECOA-compliant adverse action notices and audit-ready reasoning trails for every AI decision

Drift Alerts

Statistical drift detection (PSI, KS tests) triggers alerts when agent behavior deviates from approved baselines
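For readers unfamiliar with the two statistics named above, here is a minimal pure-Python sketch of each. The population stability index is computed over pre-binned proportions and the KS statistic over raw samples; the function names and the example bins are assumptions for illustration, not XeroML internals.

```python
import math
from bisect import bisect_right


def psi(expected, actual, eps=1e-6):
    """Population stability index between two lists of bin proportions."""
    if len(expected) != len(actual):
        raise ValueError("both distributions must use the same bins")
    total = 0.0
    for e, a in zip(expected, actual):
        e, a = max(e, eps), max(a, eps)   # avoid log(0) on empty bins
        total += (a - e) * math.log(a / e)
    return total


def ks_statistic(x, y):
    """Two-sample Kolmogorov-Smirnov statistic: the largest gap between empirical CDFs."""
    xs, ys = sorted(x), sorted(y)
    gap = 0.0
    for point in sorted(set(xs) | set(ys)):
        cdf_x = bisect_right(xs, point) / len(xs)
        cdf_y = bisect_right(ys, point) / len(ys)
        gap = max(gap, abs(cdf_x - cdf_y))
    return gap


baseline = [0.25, 0.25, 0.25, 0.25]   # approved baseline, four score bins
current = [0.10, 0.20, 0.30, 0.40]    # this week's distribution
drift = psi(baseline, current)        # compare against an alert threshold such as 0.25
```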

Compliance scorecard

loan-underwriting

RBI SEBI ECOA

Fair lending — explainability ECOA

0.96 PASS

Advisory suitability SEBI IA

0.91 PASS

KYC verification RBI

0.74 FAIL

PII masking DPDP

0.99 PASS

Data protection — consent DPDP

1.00 PASS

7d compliance

96% +14%

Explainability report

Adverse Action Notice Agent: loan-underwriting

RPT-3102

DECISION

Loan application #LN-91205 — Denied

ADVERSE ACTION FACTORS

1. Debt-to-income ratio: 47% (exceeds 43% maximum)
2. Credit utilization: 82% (exceeds 30% preferred)
3. Employment tenure: 7 months (below 12-month minimum)
4. Derogatory marks: 2 collections in last 24 months

COMPLIANCE

ECOA §1002.9 adverse action notice — auto-generated

All 5 regulatory checks passed
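The adverse action factors in this report follow a threshold pattern that can be sketched as below. The 43%, 30%, and 12-month limits come from the report itself; treating any collection as a derogatory factor is an assumption, and `Applicant` and `adverse_action_factors` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Applicant:
    dti_pct: float
    credit_utilization_pct: float
    employment_months: int
    collections_24m: int


def adverse_action_factors(a: Applicant) -> list:
    """List the human-readable reasons that count against the application."""
    factors = []
    if a.dti_pct > 43:
        factors.append(f"Debt-to-income ratio: {a.dti_pct:.0f}% (exceeds 43% maximum)")
    if a.credit_utilization_pct > 30:
        factors.append(
            f"Credit utilization: {a.credit_utilization_pct:.0f}% (exceeds 30% preferred)"
        )
    if a.employment_months < 12:
        factors.append(
            f"Employment tenure: {a.employment_months} months (below 12-month minimum)"
        )
    if a.collections_24m > 0:   # assumption: any collection counts as a factor
        factors.append(
            f"Derogatory marks: {a.collections_24m} collections in last 24 months"
        )
    return factors
```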

Drift monitor — live

PSI 0.31 Above 0.25 threshold

KS Stat 0.18 Approaching 0.20 limit

APPROVAL RATE 67.2% ↑ 8.1% from baseline

AVG RISK SCORE 0.41 ↓ 0.12 from baseline

14d approval rate

67.2%

PSI 0.31 on unsecured personal loans — approval distribution shifted beyond SR 11-7 threshold CRITICAL

Advisory suitability KS statistic 0.18 — approaching regulatory review trigger HIGH

⟳ Auto-remediation active LIVE

Action Shadow mode — loan-underwriting v2.4 → v2.3 fallback

Triggered 12m ago · PSI breach on unsecured personal loans

Audit 3 entries logged · risk-team notified via Slack

Compliance Intelligence

Scores every agent decision against your regulatory frameworks, surfaces explainability gaps, and generates audit-ready reports.

Explore Compliance Engine

Regulatory scoring

Every agent output is checked against your compliance rules — fair lending, advisory suitability, KYC, PII masking. Failures block production.
Explainability reports

Auto-generate audit-ready reasoning trails for every flagged decision. One click to export for regulators.
Drift detection

Continuous monitoring compares agent behavior against approved baselines. Get alerted the moment output distributions shift.

CONTINUOUS IMPROVEMENT

Your agents get better with every run

Regulatory Judges

ECOA, TILA, SR 11-7, and fair lending judges score every agent output before it reaches production

Backtest & Validate

Champion-challenger testing against historical decisions — verify SR 11-7 thresholds before promoting to production

Audit Trail

Every improvement logged with timestamps, score deltas, and approvals — exportable for SR 11-7 model governance audits

The Regulatory Improvement Loop

LLM judges mapped to ECOA, TILA, and SR 11-7 score every output. Violations are diagnosed, fixes backtested against historical decisions, and every improvement is audit-logged.

See How It Works

Regulatory judge scores

Every agent output scored by judges mapped to ECOA, TILA, SR 11-7, and fair lending frameworks. Failures block deployment.
Violation diagnosis

When a regulatory judge fails, XeroML identifies the specific violation, traces the root cause, and generates a regulation-compliant prompt fix.
Backtest & validate

Apply the fix, run champion-challenger testing against historical decisions, compare before/after across all judges, then promote.
Improvement audit trail

Every fix logged with timestamps, score deltas, and approvals. Export the full trail for SR 11-7 audits.

Regulatory judge scores — loan-underwriting

loan-underwriting

ECOA TILA SR 11-7 HMDA

ECOA · ECOA adverse action completeness · 0.94 · PASS

TILA · TILA APR calculation accuracy · 0.68 · FAIL (threshold 0.95)

SR 11-7 · SR 11-7 model risk documentation · 0.97 · PASS

HMDA · Fair lending bias detection · 0.91 · PASS

RBI · RBI KYC verification completeness · 0.96 · PASS

Violation diagnosis

TILA APR ACCURACY: 0.68 · REGULATORY VIOLATION

TILA Regulation Z, 12 CFR 1026.22 — APR disclosure must be accurate within 1/8 of 1 percentage point

ROOT CAUSE

The agent computed APR using a simple interest formula instead of the actuarial method required by Reg Z. 5 of 12 loan disclosures contained APR errors exceeding the tolerance of 1/8 of 1 percentage point.

SUGGESTED PROMPT FIX

- Calculate APR for {loan_terms} using standard interest formula
+ Calculate APR for {loan_terms} using actuarial method per 12 CFR 1026.22(a). Verify result within 0.125% tolerance. Use {api.reg_z_calculator} for validation.
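To see why the two methods diverge, consider a simplified sketch for a loan with equal monthly payments, no fees, and no odd first period. In that case the actuarial APR is the monthly rate that discounts the payments back to the amount financed, multiplied by 12. The function names are hypothetical and this is not the validation calculator referenced in the prompt fix.

```python
def monthly_payment(principal: float, annual_rate_pct: float, n_months: int) -> float:
    """Level payment on a fully amortizing loan."""
    r = annual_rate_pct / 100 / 12
    return principal * r / (1 - (1 + r) ** -n_months)


def present_value(payment: float, monthly_rate: float, n_months: int) -> float:
    return sum(payment / (1 + monthly_rate) ** k for k in range(1, n_months + 1))


def actuarial_apr(amount_financed: float, payment: float, n_months: int) -> float:
    """Solve for the monthly rate by bisection, then annualize (rate x 12)."""
    lo, hi = 1e-9, 1.0
    for _ in range(200):
        mid = (lo + hi) / 2
        if present_value(payment, mid, n_months) > amount_financed:
            lo = mid    # payments are worth too much, so the rate must be higher
        else:
            hi = mid
    return (lo + hi) / 2 * 12 * 100


def simple_interest_rate(amount_financed: float, payment: float, n_months: int) -> float:
    """Naive estimate: total finance charge spread evenly over the original balance."""
    finance_charge = payment * n_months - amount_financed
    return finance_charge / amount_financed / (n_months / 12) * 100


pay = monthly_payment(10_000, 12.0, 36)
apr = actuarial_apr(10_000, pay, 36)
naive = simple_interest_rate(10_000, pay, 36)
within_tolerance = abs(naive - apr) <= 0.125
```

On this example the naive figure lands several percentage points below the actuarial APR, far outside the 0.125-point tolerance, because it ignores that the balance declines with every payment.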

Backtest & validate

Champion-challenger test · backtested against 847 historical loan disclosures

Judge                   Champion   Challenger   Delta
ECOA adverse action     0.94       0.95         +0.01
TILA APR accuracy       0.68       0.97         +0.29
SR 11-7 model risk      0.97       0.98         +0.01
Fair lending bias       0.91       0.93         +0.02
RBI KYC verification    0.96       0.96          0.00

All 5 regulatory judges passing STAGING

847/847 historical disclosures within TILA tolerance VERIFIED

SR 11-7 model validation threshold met COMPLIANT
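The promotion decision can be sketched as a comparison of per-judge scores, treating the current production prompt as the champion and the fixed prompt as the challenger. Scores are taken from the backtest above and the 0.95 TILA threshold from the judge panel; `compare` and `should_promote` are hypothetical helpers, and requiring no regression on any judge is an assumption.

```python
CHAMPION = {
    "ecoa_adverse_action": 0.94,
    "tila_apr_accuracy": 0.68,
    "sr_11_7_model_risk": 0.97,
    "fair_lending_bias": 0.91,
    "rbi_kyc_verification": 0.96,
}
CHALLENGER = {
    "ecoa_adverse_action": 0.95,
    "tila_apr_accuracy": 0.97,
    "sr_11_7_model_risk": 0.98,
    "fair_lending_bias": 0.93,
    "rbi_kyc_verification": 0.96,
}


def compare(champion: dict, challenger: dict) -> dict:
    """Per-judge score delta, challenger minus champion."""
    return {k: round(challenger[k] - champion[k], 2) for k in champion}


def should_promote(champion: dict, challenger: dict, min_scores: dict) -> bool:
    """Promote only if no judge regresses and every stated threshold is met."""
    no_regression = all(challenger[k] >= champion[k] for k in champion)
    meets_thresholds = all(challenger[k] >= t for k, t in min_scores.items())
    return no_regression and meets_thresholds


deltas = compare(CHAMPION, CHALLENGER)
promote = should_promote(CHAMPION, CHALLENGER, {"tila_apr_accuracy": 0.95})
```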

Improvement audit trail — loan-underwriting

Cumulative improvement · +18.4% avg regulatory score · 42 days

Mar 14 TILA APR calculation fixed to actuarial method +0.29 S. Chen, Model Risk · APPROVED

Mar 8 Added adverse action factor #5 (payment history) +0.06 J. Patel, Compliance · APPROVED

Feb 28 KYC verification timeout increased to 30s +0.11 P. Sharma, Risk · APPROVED

Feb 19 Bias mitigation: income normalization added +0.04 S. Chen, Model Risk · APPROVED

Feb 12 SR 11-7 documentation prompt enrichment +0.08 Compliance Bot · AUTO
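One common way to make an audit trail like this tamper-evident is to chain each entry to the hash of the previous one. The sketch below illustrates that idea with the standard library only; it is an assumption about how such a log could work, not a description of XeroML's storage format, and `append_entry` and `verify` are hypothetical names.

```python
import hashlib
import json

GENESIS = "0" * 64   # placeholder hash for the first entry


def _digest(entry: dict, prev_hash: str) -> str:
    payload = json.dumps(entry, sort_keys=True) + prev_hash
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_entry(trail: list, entry: dict) -> None:
    """Append an entry whose hash covers both its content and the previous hash."""
    prev_hash = trail[-1]["hash"] if trail else GENESIS
    trail.append(
        {"entry": entry, "prev_hash": prev_hash, "hash": _digest(entry, prev_hash)}
    )


def verify(trail: list) -> bool:
    """Recompute the chain; any edited or reordered entry breaks it."""
    prev_hash = GENESIS
    for record in trail:
        if record["prev_hash"] != prev_hash:
            return False
        if record["hash"] != _digest(record["entry"], prev_hash):
            return False
        prev_hash = record["hash"]
    return True


trail = []
append_entry(trail, {"date": "Mar 8", "change": "Added adverse action factor #5",
                     "delta": 0.06, "approver": "J. Patel"})
append_entry(trail, {"date": "Mar 14", "change": "TILA APR calculation fixed to actuarial method",
                     "delta": 0.29, "approver": "S. Chen"})
```

Exporting the trail is then a matter of serializing the list, for example with `json.dumps(trail, indent=2)`, and an auditor can rerun `verify` on the exported copy.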

REQUEST EARLY ACCESS

Stop wondering if your AI agents will pass the next regulatory exam.

XeroML gives compliance teams real-time proof that every AI decision meets ECOA, TILA, SR 11-7, and Fair Lending requirements. Deployed in your VPC in under 30 minutes. No data leaves your infrastructure.

Book a Demo
