Billing Event Schema Design: The Engineering Decision That Determines Your Compliance Posture

Cristian Curteanu



The billing event schema is decided before you write the first line of billing code. Most engineering teams do it the other way around: instrument first, get pricing in front of customers, then discover that the session timestamps in their billing events just became PHI — or that the float precision errors in their quantity field compound across 80 million events into a $2,400 rounding discrepancy on a single invoice.

The schema defines what your billing pipeline can measure. It determines which fields become compliance liabilities. It fixes the granularity ceiling for every audit trail you’ll ever produce. And it is expensive to change in production — customers have integration dependencies on your billing event format, your historical invoice data is stored in the current schema, and a schema migration touches the entire pipeline simultaneously.

This article covers the engineering decisions that go into a billing event schema that works correctly under real operating conditions: idempotency key construction, field-level compliance design, decimal precision, schema versioning, and vertical-specific examples across Dev Tools, Healthtech, and IoT.


Why the Schema Comes Before the Pricing Model

Most teams discover this order problem after the fact. The sequence typically goes:

  1. Build the product, instrument it for observability (logs, traces, metrics)
  2. Decide on a pricing model
  3. Try to build billing on top of existing instrumentation
  4. Discover the instrumentation doesn’t capture what the pricing model requires

A CI/CD platform that instruments at the build-step level cannot produce a per-build-minute invoice without re-instrumenting. A telehealth platform that logs session IDs for debugging can’t remove them from the billing pipeline without a schema migration. An IoT platform that aggregates device readings to daily fleet totals for dashboards cannot reconstruct per-device billing retroactively.

The instrumentation granularity decision is a one-way door. You can aggregate fine-grained data upward; you cannot disaggregate coarse-grained data downward. The correct sequence:

1. Choose your billable metric
2. Design the billing event schema to capture that metric
3. Instrument your product to emit billing events at that granularity
4. Build the billing pipeline on top of the event stream
5. Set your pricing rates

Rates can change without touching the schema. The schema is fixed once it’s in production with real data behind it.


The Two-Layer Pattern

The most important structural decision in billing event schema design is what goes into the billing event and what stays in the originating system. This is particularly critical in regulated industries — but the discipline of separating billable metadata from operational records is good practice in every vertical.

Layer 1: The billing event (what goes to the billing pipeline)

Contains only the fields needed to calculate, aggregate, and audit the invoice. No content, no user-identifiable data, no operational context that isn’t relevant to billing.

{
  "event_id":         "sha256:a3f9c2d1e4b7...",
  "schema_version":   "2",
  "customer_id":      "cust_9f2a8e31",
  "metric":           "api_call",
  "quantity":         "1",
  "timestamp":        "2026-03-16T14:22:00.000Z",
  "source_reference": "req_8b3e9c4d"
}

Layer 2: The source record (stays in your application database)

Contains the operational detail that the billing event references. Never flows to the billing pipeline.

{
  "request_id":    "req_8b3e9c4d",
  "user_id":       "usr_00012847",
  "endpoint":      "/v2/inference",
  "model":         "gpt-4o",
  "input_tokens":  1847,
  "output_tokens": 312,
  "duration_ms":   847,
  "ip_address":    "192.168.1.44",
  "session_id":    "sess_7a1b2c3d"
}

The source_reference in Layer 1 (req_8b3e9c4d) is enough to resolve a billing dispute — trace back to the source record, verify the event occurred, produce the evidence. But the billing pipeline has no visibility into the user identity, request content, or session context.
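A minimal sketch of that resolution path, with in-memory dicts standing in for the two stores (the `resolve_dispute` helper and the store layout are illustrative, not a fixed API):

```python
# Layer 2 store: operational records keyed by request ID — an illustrative
# stand-in for the application database, never exposed to the billing pipeline.
source_records = {
    "req_8b3e9c4d": {
        "request_id": "req_8b3e9c4d",
        "user_id":    "usr_00012847",
        "endpoint":   "/v2/inference",
    },
}

def resolve_dispute(billing_event: dict) -> dict:
    """Given a Layer 1 billing event, fetch the Layer 2 evidence record.
    Only an authorized internal tool performs this join; the billing
    pipeline itself never sees the Layer 2 fields."""
    record = source_records.get(billing_event["source_reference"])
    if record is None:
        raise LookupError(f"no source record for {billing_event['source_reference']}")
    return record

billing_event = {
    "event_id":         "sha256:a3f9c2d1e4b7...",
    "customer_id":      "cust_9f2a8e31",
    "metric":           "api_call",
    "source_reference": "req_8b3e9c4d",
}
evidence = resolve_dispute(billing_event)
# evidence["endpoint"] → "/v2/inference"
```

The billing event carries just enough to find the evidence; the evidence itself stays behind the application's own access controls.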

For Healthtech: the source_reference is a de-identified session token mapped to a patient encounter in a separately secured clinical record store. The billing event never contains the patient ID, provider ID, or session timestamp that would constitute PHI.

For IoT: the source_reference is a device-message reference — the platform-assigned device ID and message sequence number as a composite key. Neither the device's hardware address nor, for connected medical devices, the patient's medical record number ever appears in the billing event.


Field-by-Field: The Minimum Required Schema

Every billing event needs exactly these seven fields. Additional fields add complexity without proportional value.

event_id — Stable Idempotency Key

The most important field. Gets almost no attention in billing tutorials.

The event_id must be deterministic — the same physical event must always produce the same ID. If two copies of the same event arrive at the billing pipeline (from a retry, a network duplicate, or a batch replay), both must produce the same event_id so the second write is deduplicated.

A random UUID assigned at ingestion is wrong. Two separate delivery attempts for the same underlying event will produce different UUIDs and both will be stored, creating a double-charge.

The correct approach: derive the event_id from the content of the event itself.

import hashlib
import json

def generate_event_id(customer_id: str, metric: str, source_reference: str) -> str:
    """
    Deterministic event ID derived from the event's identifying fields.
    The same event always produces the same ID regardless of when it is processed.
    """
    canonical = json.dumps({
        "customer_id":      customer_id,
        "metric":           metric,
        "source_reference": source_reference,
    }, sort_keys=True)

    return "sha256:" + hashlib.sha256(canonical.encode()).hexdigest()


# Usage
event_id = generate_event_id(
    customer_id      = "cust_9f2a8e31",
    metric           = "api_call",
    source_reference = "req_8b3e9c4d",
)
# → "sha256:a3f9c2d1e4b7f8a29c1d3e5f6..."
# Calling this again with the same arguments always produces the same result.

What to include in the hash input:

  • customer_id — prevents cross-tenant collision if source references are not globally unique
  • metric — a single source event may generate multiple billing events for different metrics; include metric to distinguish them
  • source_reference — the unique identifier of the originating event in your application

Do not include timestamp or quantity in the hash input if either could vary between delivery attempts (e.g., a quantity that’s calculated at emission time from a fluctuating counter).
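To make the exclusion rule concrete, here is the helper from above exercised under a simulated retry. The retry would carry a later emission time, but because the timestamp is not part of the hash input, both delivery attempts produce the same ID:

```python
import hashlib
import json

def generate_event_id(customer_id: str, metric: str, source_reference: str) -> str:
    canonical = json.dumps({
        "customer_id":      customer_id,
        "metric":           metric,
        "source_reference": source_reference,
    }, sort_keys=True)
    return "sha256:" + hashlib.sha256(canonical.encode()).hexdigest()

# First delivery attempt and a retry of the same underlying event: the
# identifying fields are identical, so both hash to the same event_id and
# the second write is deduplicated downstream.
attempt_1 = generate_event_id("cust_9f2a8e31", "api_call", "req_8b3e9c4d")
attempt_2 = generate_event_id("cust_9f2a8e31", "api_call", "req_8b3e9c4d")  # retry
assert attempt_1 == attempt_2

# A different request hashes to a different ID — no cross-event collision.
other = generate_event_id("cust_9f2a8e31", "api_call", "req_000000ff")
assert other != attempt_1
```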

Vertical-specific idempotency key design:

| Vertical | Source fields for idempotency hash |
|---|---|
| Dev Tools (API) | customer_id + request_id |
| Dev Tools (CI/CD) | customer_id + build_id + step_id |
| Healthtech (telehealth) | customer_id + session_token (de-identified) |
| IoT (MQTT) | customer_id + device_id + message_sequence_number |
| IoT (HTTP) | customer_id + device_id + reading_timestamp |

For IoT with MQTT QoS 1: a retransmitted message carries the same per-message identifier — MQTT's packet identifier (with the DUP flag set), or better, an application-level sequence number in the payload, since packet identifiers are per-session and reused after acknowledgment. Combined with the device ID, that sequence number is the correct deduplication anchor: it is stable across retransmissions of the same original message.


schema_version — Non-Negotiable From Day One

Add this field even if you never intend to change the schema. You will change the schema. Having version 1 events and version 2 events coexisting in the same table is routine — without a version field you cannot tell them apart.

"schema_version": "2"

Use a simple integer string. Do not use semver for event schemas — minor version differences in billing events are not minor. Any field addition, removal, or type change that affects invoice calculation is a major version.

The version field enables the aggregation pipeline to route events to version-specific calculation logic:

from decimal import Decimal

class UnknownSchemaVersionError(Exception):
    """Raised when an event carries a schema version the pipeline cannot handle."""

def calculate_charge(event: dict) -> Decimal:
    version = event.get("schema_version", "1")

    if version == "1":
        return calculate_v1(event)
    elif version == "2":
        return calculate_v2(event)
    else:
        raise UnknownSchemaVersionError(version)

Historical invoices recalculated for dispute resolution will use the schema version that was active when the invoice was originally generated — not the current version. Without the version field, recalculation is impossible.


customer_id — Tenant, Not End-User

The customer_id must identify the billing entity — the company or account that will receive the invoice. It must never be an end-user identifier.

This distinction matters for three reasons:

  1. Compliance: End-user IDs in billing events are personal data under GDPR; customer_id (a B2B tenant identifier) typically is not.
  2. Aggregation: Invoice totals aggregate over all events for a customer_id. End-user-level aggregation is a different pipeline — do not conflate.
  3. Audit: A customer disputing their invoice queries by customer_id. An end-user identifier in this field forces the customer to understand your internal data model to verify their own invoice.

For multi-tenant B2B SaaS: customer_id is the organization/workspace/account ID. For B2B2C platforms where a corporate customer’s end-users generate billing events: customer_id is the corporate customer’s ID, never the consumer’s ID.


metric — The Billable Unit Type

A string enum identifying what is being measured. Not a description — a machine-readable identifier your pricing engine uses to apply the correct rate.

"metric": "api_call"
"metric": "compute_minute"
"metric": "storage_gb_day"
"metric": "consultation_session"
"metric": "device_transmission"

Design principles:

  • Use snake_case, all lowercase, no spaces
  • Make it specific enough to distinguish between different billable units you might charge separately (api_call_v1 vs. api_call_v2 if different rates apply to different API versions)
  • Never use free-text descriptions — “API call to /inference endpoint” is a description, not a metric identifier
  • Document your metric vocabulary as a schema artifact, not just in code comments

When you add a new billable metric, add a new metric identifier — do not reuse existing identifiers with different semantics.
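One lightweight way to treat the vocabulary as a schema artifact is a checked-in allowlist plus a format check. The validator below is an illustrative sketch using the metric names from this article:

```python
import re

# Canonical metric vocabulary — versioned alongside the schema, not buried in
# code comments. Adding a billable metric means adding an entry here.
METRIC_VOCABULARY = frozenset({
    "api_call",
    "compute_minute",
    "storage_gb_day",
    "consultation_session",
    "device_transmission",
})

# snake_case, all lowercase, no spaces
_METRIC_PATTERN = re.compile(r"^[a-z][a-z0-9]*(_[a-z0-9]+)*$")

def validate_metric(metric: str) -> str:
    """Reject free-text descriptions and unknown identifiers at ingestion."""
    if not _METRIC_PATTERN.match(metric):
        raise ValueError(f"not a metric identifier: {metric!r}")
    if metric not in METRIC_VOCABULARY:
        raise ValueError(f"unknown metric: {metric!r}")
    return metric
```

Running this at the ingestion boundary turns a typo'd or free-text metric into a hard error instead of a silently unbillable event.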


quantity — Decimal, Not Float

This field carries more risk than any other. Floating-point arithmetic errors in billing accumulate across millions of events.

# WRONG — float arithmetic
rate     = 0.00247          # per API call
calls    = 80_000_000
charge   = rate * calls     # → 197600.00000000003

# RIGHT — Decimal arithmetic
from decimal import Decimal
rate     = Decimal("0.00247")
calls    = Decimal("80000000")
charge   = rate * calls     # → Decimal("197600.00000")

At 80 million API calls at this rate, float arithmetic produces a total error of about $0.00000000003 on the line item — it rounds away to nothing, harmless. But at higher rates, larger aggregated quantities, or with per-event rounding at scale:

rate  = Decimal("0.0831")      # per GB-hour storage
hours = Decimal("720")          # one month
gbs   = Decimal("48291.7")      # storage consumed

charge = rate * hours * gbs    # → Decimal("2889388.99440")

Float arithmetic on a single calculation like this introduces only a microscopic error. The real damage comes from storing and rounding float quantities per event: the per-event errors accumulate rather than cancel, and across thousands of customers and millions of events they aggregate into material discrepancies that surface in billing disputes.

The rule: store quantity as a string representation of a decimal in the billing event. Deserialize to Decimal (Python) or BigDecimal (Java/Kotlin) in the pricing engine. Never use float or double for any monetary or quantity calculation.
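A sketch of that round trip — string on the wire, Decimal in the engine (values reuse the storage example above):

```python
import json
from decimal import Decimal

# Emission side: quantity is rendered as a string, so no float ever exists
# anywhere in the billing path.
event_json = json.dumps({"metric": "storage_gb_day", "quantity": "48291.7"})

# Pricing-engine side: parse the JSON, then construct the Decimal directly
# from the string — exact, with no float intermediary.
event = json.loads(event_json)
quantity = Decimal(event["quantity"])

rate = Decimal("0.0831")
charge = rate * quantity        # → Decimal("4013.04027"), exact to the last digit
```

Constructing `Decimal` from the string (never from a parsed float) is what preserves exactness end to end.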

SQL schema — use DECIMAL(20, 10) for quantity:

CREATE TABLE billing_events (
    event_id        VARCHAR(71)      NOT NULL,   -- "sha256:" + 64-char hex
    schema_version  SMALLINT         NOT NULL DEFAULT 1,
    customer_id     VARCHAR(64)      NOT NULL,
    metric          VARCHAR(64)      NOT NULL,
    quantity        DECIMAL(20, 10)  NOT NULL,   -- 20 total digits, 10 after the decimal point
    timestamp       TIMESTAMPTZ      NOT NULL,
    source_ref      VARCHAR(256)     NOT NULL,
    ingested_at     TIMESTAMPTZ      NOT NULL DEFAULT NOW(),

    PRIMARY KEY (event_id)                       -- enforces idempotency at DB level
);

-- In PostgreSQL (implied by TIMESTAMPTZ), secondary indexes are created
-- separately rather than inline in CREATE TABLE:
CREATE INDEX idx_customer_period ON billing_events (customer_id, timestamp);
CREATE INDEX idx_metric          ON billing_events (customer_id, metric, timestamp);

The PRIMARY KEY (event_id) constraint enforces idempotency at the database level — a duplicate insert with the same event_id will be rejected without requiring application-level deduplication logic. This is your last line of defense; the application should still deduplicate before attempting the insert, but the database constraint catches anything that slips through.
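The mechanism can be demonstrated end to end with SQLite standing in for the production database (SQLite's `INSERT OR IGNORE` plays the role of Postgres's `ON CONFLICT DO NOTHING`):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE billing_events (
        event_id TEXT PRIMARY KEY,   -- idempotency enforced here
        quantity TEXT NOT NULL
    )
""")

event = ("sha256:a3f9c2d1e4b7", "1")

# First delivery, then a duplicate redelivery of the same event. The conflict
# on the primary key silently drops the duplicate instead of double-charging.
conn.execute("INSERT OR IGNORE INTO billing_events VALUES (?, ?)", event)
conn.execute("INSERT OR IGNORE INTO billing_events VALUES (?, ?)", event)

count = conn.execute("SELECT COUNT(*) FROM billing_events").fetchone()[0]
# count == 1: the duplicate never became a second charge
```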


timestamp — Event Time, Not Ingestion Time

The timestamp field must record when the event occurred in the product, not when it was received by the billing pipeline. The difference matters at billing period boundaries.

A user completes an API call at 23:59:57 on March 31. The event arrives at the billing pipeline at 00:00:03 on April 1 (network latency + queuing). If timestamp is set to ingestion time, this event is billed in April instead of March. At scale, this creates systematic period boundary errors that compound with every late-arriving event.

"timestamp": "2026-03-31T23:59:57.000Z"

Requirements:

  • UTC — never local time
  • ISO 8601 format with millisecond precision
  • Set by the application at the point of event emission, not the billing pipeline
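A minimal sketch of period assignment driven by event time (the calendar-month period key is illustrative — your billing periods may differ):

```python
from datetime import datetime, timezone

def billing_period(event: dict) -> str:
    """Assign an event to a calendar-month billing period using the event's
    own timestamp — never the time the pipeline received it."""
    ts = datetime.fromisoformat(
        event["timestamp"].replace("Z", "+00:00")
    ).astimezone(timezone.utc)
    return f"{ts.year:04d}-{ts.month:02d}"

# The late-arriving example from above: emitted at 23:59:57 on March 31,
# ingested at 00:00:03 on April 1 — it still bills in March.
billing_period({"timestamp": "2026-03-31T23:59:57.000Z"})   # → "2026-03"
```

Had the function keyed off ingestion time instead, the same event would land in "2026-04" — the systematic boundary error described above.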

Late-arriving events (events where timestamp is more than N hours before the billing pipeline’s current processing time) require an explicit policy. See Engineering Metered Billing for IoT for the three policy options and their trade-offs.


source_reference — De-Identified Audit Pointer

A pointer back to the originating record in your application system. Used to resolve billing disputes (“show me the evidence that this event occurred”) without exposing application data in the billing pipeline.

For this field to be useful, the mapping from source_reference to application record must be maintained and queryable indefinitely — or at minimum for the retention period of the billing event that references it.

Design the reference to be:

  • Stable: the reference must remain valid as long as the billing event that uses it exists
  • De-identified: no PII, no PHI, no sensitive operational data
  • Resolvable: given the source_reference, an authorized internal query must be able to retrieve the original event from the application database

What to Explicitly Exclude

The fields that should not be in a billing event:

| Field type | Why to exclude | Risk if included |
|---|---|---|
| User ID / patient ID | Personal data under GDPR; PHI under HIPAA if health-context | Billing pipeline becomes a PHI store; requires a BAA |
| Session timestamps correlated to individuals | Behavioral data; HIPAA-adjacent for healthcare | Session timing can identify provider/patient activity |
| Request/message payload | Operational data; may contain PII or proprietary content | Billing store holds business-sensitive data beyond audit purposes |
| IP addresses | Personal data under GDPR; geolocation inference | Billing pipeline becomes a personal data processor |
| Internal routing metadata | Operational context irrelevant to billing | Schema bloat; version migration complexity |
| Device hardware identifiers (MAC, IMEI) | Can map to an individual or patient in medical contexts | Equivalent to a patient ID in clinical IoT billing events |
| Pricing rates | Rates change; embedding them in events locks historical recalculation | Cannot reprice historical data for corrections; rate changes require event schema migration |

Pricing rates in events deserves special attention. The billing event records what happened — quantity, metric, when. The pricing engine records what it costs — rate per unit, tiers, effective dates. These are separate concerns. Never store the rate that was applied in the billing event; store it in a separately versioned pricing configuration. This lets you correct a pricing error retroactively by reprocessing events through the corrected rate schedule without touching the event store.


Schema Versioning in Practice

Schema version 1 is always the version you built with insufficient forethought. Version 2 is when you add the fields you should have included from the start. Version 3 is when a compliance requirement changes what you’re allowed to store.

The migration pattern that works without downtime:

1. Add the new field to the schema as NULLABLE (for compatibility with v1 writers)
2. Update writers to emit schema_version: "2" and populate the new field
3. Update the aggregation pipeline to handle both versions
4. Backfill v1 events where the new field can be derived (not all cases — accept gaps)
5. Once all active writers have deployed v2, enforce NOT NULL on the new field in the schema

What does not work: changing the type of an existing field in place. If you need to change quantity from FLOAT to DECIMAL (a real migration many teams face), you need a new field (quantity_decimal) and a version bump: run both fields in parallel, then deprecate the old one after all historical data is migrated.
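During the parallel-run window the pipeline has to read both representations; a sketch, using the field names from the migration above (`read_quantity` itself is hypothetical):

```python
from decimal import Decimal

def read_quantity(event: dict) -> Decimal:
    """Read quantity across schema versions during a FLOAT → DECIMAL migration."""
    if "quantity_decimal" in event:
        # v2+ path: quantity_decimal is a string — exact by construction.
        return Decimal(event["quantity_decimal"])
    # Legacy v1 path: the stored value is a float. str() rounds it to its
    # shortest decimal repr — the value the emitter intended — before it
    # reaches Decimal, instead of importing the binary representation error
    # at full precision via Decimal(float).
    return Decimal(str(event["quantity"]))

read_quantity({"quantity_decimal": "48291.7"})   # → Decimal("48291.7")
read_quantity({"quantity": 48291.7})             # → Decimal("48291.7")
```

The `str()` hop on the legacy path is the subtle part: `Decimal(48291.7)` directly would faithfully preserve the float's error, which is exactly what the migration is trying to escape.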

Schema changelog as a first-class artifact:

billing_events schema changelog
-------------------------------
v1 (2025-05-01): Initial schema. customer_id, metric, quantity (FLOAT), timestamp.
v2 (2026-01-15): Added schema_version field. Added source_ref. Changed quantity to
                 DECIMAL(20,10) stored as string in JSON events. Old float events
                 preserved in quantity_v1 column; quantity_decimal added for v2+.
v3 (2026-03-01): Added schema_version to primary key hash for multi-version
                 idempotency safety. No data migration required.

Vertical-Specific Schema Examples

Dev Tools: Token-Based AI API

{
  "event_id":         "sha256:f7e3b1a9...",
  "schema_version":   "2",
  "customer_id":      "ws_4a3f9c21",
  "metric":           "llm_input_token",
  "quantity":         "1847",
  "timestamp":        "2026-03-16T09:14:22.331Z",
  "source_reference": "req_7d2e4f81"
}

Note: metric is llm_input_token, not api_call. If you charge separately for input tokens, output tokens, and fine-tuned model calls, each is a separate metric with a separate billing event. One API request may generate 2–3 billing events (input tokens + output tokens + cache read tokens). All share the same source_reference (req_7d2e4f81), so a dispute can be resolved by querying all events for that request ID.
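A sketch of that fan-out — one request, one event per metric, one shared source_reference (`emit_token_events` is illustrative; the hash helper repeats the earlier definition):

```python
import hashlib
import json

def generate_event_id(customer_id: str, metric: str, source_reference: str) -> str:
    canonical = json.dumps({"customer_id": customer_id, "metric": metric,
                            "source_reference": source_reference}, sort_keys=True)
    return "sha256:" + hashlib.sha256(canonical.encode()).hexdigest()

def emit_token_events(customer_id: str, request_id: str,
                      input_tokens: int, output_tokens: int) -> list:
    """One API request fans out to one billing event per token metric.
    All events share source_reference, so a dispute query on the request
    ID retrieves every charge that request generated."""
    events = []
    for metric, qty in [("llm_input_token", input_tokens),
                        ("llm_output_token", output_tokens)]:
        events.append({
            # metric is part of the hash input, so the two events get
            # distinct event_ids despite the shared source_reference
            "event_id":         generate_event_id(customer_id, metric, request_id),
            "schema_version":   "2",
            "customer_id":      customer_id,
            "metric":           metric,
            "quantity":         str(qty),   # string, never float
            "source_reference": request_id,
        })
    return events

events = emit_token_events("ws_4a3f9c21", "req_7d2e4f81", 1847, 312)
# Two events, distinct event_ids, same source_reference.
```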


Healthtech: Telehealth Consultation Session

{
  "event_id":         "sha256:2c9a7f3e...",
  "schema_version":   "2",
  "customer_id":      "org_88f2b3d1",
  "metric":           "consultation_session",
  "quantity":         "1",
  "timestamp":        "2026-03-16T14:22:00.000Z",
  "source_reference": "tok_5c8d2a1f"
}

The source_reference is tok_5c8d2a1f — a platform-generated session token mapped to the clinical encounter record in the EHR. The billing event contains no patient ID, no provider ID, no session duration, and no session timestamp beyond the billing event’s own timestamp. The clinical details live in the EHR under separate access controls.

For a detailed treatment of the PHI exclusion pattern, see Usage-Based Billing for Healthtech SaaS.


IoT: Industrial Sensor Reading

{
  "event_id":         "sha256:9d1b5f7a...",
  "schema_version":   "2",
  "customer_id":      "ten_f3a9c2d1",
  "metric":           "sensor_reading",
  "quantity":         "1",
  "timestamp":        "2026-03-16T08:04:51.000Z",
  "source_reference": "dev_b7e3:seq_00094712"
}

The source_reference encodes the device ID and MQTT message sequence number as a composite key: dev_b7e3:seq_00094712. This is the idempotency anchor for MQTT QoS 1 retransmissions — the same device message retransmitted after a network drop will generate the same event_id (since the source_reference is identical) and be deduplicated at the database level.

The device hardware identifier (dev_b7e3) is a platform-internal alias, not the device’s hardware MAC address or IMEI — the hardware address never appears in the billing pipeline.


The Pre-Production Schema Checklist

Before your first billing event hits production:

  • event_id is deterministic — the same event always produces the same ID
  • event_id uniqueness is enforced as a PRIMARY KEY constraint at the database level
  • quantity is stored as DECIMAL, not float or double
  • timestamp is event time, not ingestion time; UTC; ISO 8601
  • schema_version is present, even on version 1
  • customer_id identifies the billing entity (tenant), not the end-user
  • source_reference is de-identified — no PII, no PHI, no hardware identifiers
  • No user IDs, patient IDs, session content, IP addresses, or payload data in the event
  • Late-arrival policy is defined and tested (what happens to events timestamped before the current billing period?)
  • Schema changelog document exists before v1 ships
  • Pricing rates are stored in a separate rate schedule, not in the billing event

ABAXUS includes a production-validated billing event schema — deploy inside your own infrastructure with idempotency, decimal precision, and late-arrival handling built in


Self-hosted usage-based billing engine. Your billing data stays in your own database, in your own cloud region, under your own compliance controls. No per-transaction fees. Runs in your Kubernetes cluster.

See Pricing

Common Schema Mistakes That Reach Production

Using random UUIDs as event IDs. Every retry, every network duplicate, every batch replay creates a new event in the billing store. The double-billing is silent — no error is thrown, the charge just appears twice on the invoice. Discovered in production during billing dispute resolution, when the customer’s event count doesn’t match yours.

Storing float for quantity. The error is invisible in development (amounts are small, discrepancies are fractions of cents). In production at scale, float accumulation produces invoice totals that don’t match what the pricing engine calculated. The discrepancy is random and non-reproducible, which makes it nearly impossible to debug.

Not including schema_version. You will change the schema. When you do, you need to process v1 and v2 events differently in the aggregation pipeline. Without the version field, the only way to distinguish them is by the presence or absence of the new field — which is fragile and breaks when you add optional fields.

Patient IDs or user IDs in source_reference. The intent is good — create a direct link back to the originating record. The problem is that a patient ID in the billing event is PHI, regardless of field name. The billing pipeline is now a PHI processor. All the HIPAA obligations that apply to the clinical system now apply to the billing system.

Timestamp set at ingestion, not event time. Events that arrive late (network latency, queue depth, IoT connectivity gaps) get assigned to the wrong billing period. At period boundaries this creates systematic errors: the last few minutes of a billing period are consistently under-counted, and the first few minutes of the next period are over-counted with events that belong to the prior period.

Pricing rates embedded in events. When a pricing error is discovered — the wrong rate was applied to a customer’s events for two weeks — the fix requires either reprocessing the events or correcting the rate in the event records. If rates are stored in the events, you’ve created an audit trail problem: the corrected invoice no longer matches the stored events. Rates belong in versioned pricing configuration, not in the event store.


Book an Architecture Review for Your Billing Event Schema

Getting the schema right before production is the highest-leverage engineering decision in your billing infrastructure. Getting it wrong creates problems that are expensive to fix: double-billing requires customer reconciliation, precision errors require retroactive recalculation, and PHI in billing events requires compliance remediation that goes well beyond a code change.

ABAXUS offers 30-minute architecture reviews for engineering teams designing or auditing their billing event schema. In one session:

  • Schema review — walk through your current or planned event schema field by field; identify compliance risks, idempotency gaps, and precision issues before they hit production
  • Idempotency key design — review your key construction for your specific event sources (API requests, MQTT messages, database CDC events, webhook callbacks)
  • Vertical-specific guidance — PHI exclusion patterns for Healthtech, MQTT deduplication for IoT, high-frequency API billing for Dev Tools
  • Migration path — if your current schema has known issues, a realistic migration plan that doesn’t require a big-bang redeployment

This is a technical conversation, not a product demo. Bring your current event schema or your implementation plan.

Book your 30-minute billing schema architecture review →



ABAXUS is a self-hosted usage-based billing engine for engineering teams that need production-correct billing infrastructure. It ships with idempotent event ingestion, DECIMAL-precision quantity handling, schema versioning, configurable late-arrival policies, and a full audit trail — running inside your own Kubernetes cluster with your data in your own database. See pricing · Book a schema review


Stop debugging billing. Start shipping product.

Your billing layer should be invisible infrastructure. In 30 minutes we map your event sources, identify your data contract gaps, and show you exactly what fixing the architecture looks like. No sales pitch.