Billing Event Schema Design: The Engineering Decision That Determines Your Compliance Posture

Cristian Curteanu



The billing event schema is decided before you write the first line of billing code. Most engineering teams do it the other way around: instrument first, get pricing in front of customers, then discover that the session timestamps in their billing events just became PHI — or that the float precision errors in their quantity field compound across 80 million events into a $2,400 rounding discrepancy on a single invoice.

The schema defines what your billing pipeline can measure. It determines which fields become compliance liabilities. It fixes the granularity ceiling for every audit trail you’ll ever produce. And it is expensive to change in production — customers have integration dependencies on your billing event format, your historical invoice data is stored in the current schema, and a schema migration touches the entire pipeline simultaneously.

This article covers the engineering decisions that go into a billing event schema that works correctly under real operating conditions: idempotency key construction, field-level compliance design, decimal precision, schema versioning, and vertical-specific examples across Dev Tools, Healthtech, and IoT.


Why the Schema Comes Before the Pricing Model

Most teams discover this order problem after the fact. The sequence typically goes:

  1. Build the product, instrument it for observability (logs, traces, metrics)
  2. Decide on a pricing model
  3. Try to build billing on top of existing instrumentation
  4. Discover the instrumentation doesn’t capture what the pricing model requires

A CI/CD platform that instruments at the build-step level cannot produce a per-build-minute invoice without re-instrumenting. A telehealth platform that logs session IDs for debugging can’t remove them from the billing pipeline without a schema migration. An IoT platform that aggregates device readings to daily fleet totals for dashboards cannot reconstruct per-device billing retroactively.

The instrumentation granularity decision is a one-way door. You can aggregate fine-grained data upward; you cannot disaggregate coarse-grained data downward. The correct sequence:

1. Choose your billable metric
2. Design the billing event schema to capture that metric
3. Instrument your product to emit billing events at that granularity
4. Build the billing pipeline on top of the event stream
5. Set your pricing rates

Rates can change without touching the schema. The schema is fixed once it’s in production with real data behind it.


The Two-Layer Pattern

The most important structural decision in billing event schema design is what goes into the billing event and what stays in the originating system. This is particularly critical in regulated industries — but the discipline of separating billable metadata from operational records is good practice in every vertical.

Layer 1: The billing event (what goes to the billing pipeline)

Contains only the fields needed to calculate, aggregate, and audit the invoice. No content, no user-identifiable data, no operational context that isn’t relevant to billing.

{
  "event_id":         "sha256:a3f9c2d1e4b7...",
  "schema_version":   "2",
  "customer_id":      "cust_9f2a8e31",
  "metric":           "api_call",
  "quantity":         "1",
  "timestamp":        "2026-03-16T14:22:00.000Z",
  "source_reference": "req_8b3e9c4d"
}

Layer 2: The source record (stays in your application database)

Contains the operational detail that the billing event references. Never flows to the billing pipeline.

{
  "request_id":    "req_8b3e9c4d",
  "user_id":       "usr_00012847",
  "endpoint":      "/v2/inference",
  "model":         "gpt-4o",
  "input_tokens":  1847,
  "output_tokens": 312,
  "duration_ms":   847,
  "ip_address":    "192.168.1.44",
  "session_id":    "sess_7a1b2c3d"
}

The source_reference in Layer 1 (req_8b3e9c4d) is enough to resolve a billing dispute — trace back to the source record, verify the event occurred, produce the evidence. But the billing pipeline has no visibility into the user identity, request content, or session context.
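A minimal sketch of that resolution path, with in-memory dicts standing in for the two stores (the `resolve_dispute` helper and the store layout are illustrative, not a fixed API):

```python
# Layer 2 store: operational records keyed by request ID — an illustrative
# stand-in for the application database, never exposed to the billing pipeline.
source_records = {
    "req_8b3e9c4d": {
        "request_id": "req_8b3e9c4d",
        "user_id":    "usr_00012847",
        "endpoint":   "/v2/inference",
    },
}

def resolve_dispute(billing_event: dict) -> dict:
    """Given a Layer 1 billing event, fetch the Layer 2 evidence record.
    Only an authorized internal tool performs this join; the billing
    pipeline itself never sees the Layer 2 fields."""
    record = source_records.get(billing_event["source_reference"])
    if record is None:
        raise LookupError(f"no source record for {billing_event['source_reference']}")
    return record

billing_event = {
    "event_id":         "sha256:a3f9c2d1e4b7...",
    "customer_id":      "cust_9f2a8e31",
    "metric":           "api_call",
    "source_reference": "req_8b3e9c4d",
}
evidence = resolve_dispute(billing_event)
# evidence["endpoint"] → "/v2/inference"
```

The billing event carries just enough to find the evidence; the evidence itself stays behind the application's own access controls.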

For Healthtech: the source_reference is a de-identified session token mapped to a patient encounter in a separately secured clinical record store. The billing event never contains the patient ID, provider ID, or session timestamp that would constitute PHI.

For IoT: the source_reference is a device-message reference — the platform-assigned device ID and message sequence number as a composite key. Neither the device's hardware address nor, for connected medical devices, the patient's medical record number ever appears in the billing event.


Field-by-Field: The Minimum Required Schema

Every billing event needs exactly these seven fields. Additional fields add complexity without proportional value.

event_id — Stable Idempotency Key

The most important field. Gets almost no attention in billing tutorials.

The event_id must be deterministic — the same physical event must always produce the same ID. If two copies of the same event arrive at the billing pipeline (from a retry, a network duplicate, or a batch replay), both must produce the same event_id so the second write is deduplicated.

A random UUID assigned at ingestion is wrong. Two separate delivery attempts for the same underlying event will produce different UUIDs and both will be stored, creating a double-charge.

The correct approach: derive the event_id from the content of the event itself.

import hashlib
import json

def generate_event_id(customer_id: str, metric: str, source_reference: str) -> str:
    """
    Deterministic event ID derived from the event's identifying fields.
    The same event always produces the same ID regardless of when it is processed.
    """
    canonical = json.dumps({
        "customer_id":      customer_id,
        "metric":           metric,
        "source_reference": source_reference,
    }, sort_keys=True)

    return "sha256:" + hashlib.sha256(canonical.encode()).hexdigest()


# Usage
event_id = generate_event_id(
    customer_id      = "cust_9f2a8e31",
    metric           = "api_call",
    source_reference = "req_8b3e9c4d",
)
# → "sha256:a3f9c2d1e4b7f8a29c1d3e5f6..."
# Calling this again with the same arguments always produces the same result.

What to include in the hash input:

  • customer_id — prevents cross-tenant collision if source references are not globally unique
  • metric — a single source event may generate multiple billing events for different metrics; include metric to distinguish them
  • source_reference — the unique identifier of the originating event in your application

Do not include timestamp or quantity in the hash input if either could vary between delivery attempts (e.g., a quantity that’s calculated at emission time from a fluctuating counter).
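To make the exclusion rule concrete, here is the helper from above exercised under a simulated retry. The retry would carry a later emission time, but because the timestamp is not part of the hash input, both delivery attempts produce the same ID:

```python
import hashlib
import json

def generate_event_id(customer_id: str, metric: str, source_reference: str) -> str:
    canonical = json.dumps({
        "customer_id":      customer_id,
        "metric":           metric,
        "source_reference": source_reference,
    }, sort_keys=True)
    return "sha256:" + hashlib.sha256(canonical.encode()).hexdigest()

# First delivery attempt and a retry of the same underlying event: the
# identifying fields are identical, so both hash to the same event_id and
# the second write is deduplicated downstream.
attempt_1 = generate_event_id("cust_9f2a8e31", "api_call", "req_8b3e9c4d")
attempt_2 = generate_event_id("cust_9f2a8e31", "api_call", "req_8b3e9c4d")  # retry
assert attempt_1 == attempt_2

# A different request hashes to a different ID — no cross-event collision.
other = generate_event_id("cust_9f2a8e31", "api_call", "req_000000ff")
assert other != attempt_1
```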

Vertical-specific idempotency key design:

| Vertical | Source fields for idempotency hash |
|---|---|
| Dev Tools (API) | customer_id + request_id |
| Dev Tools (CI/CD) | customer_id + build_id + step_id |
| Healthtech (telehealth) | customer_id + session_token (de-identified) |
| IoT (MQTT) | customer_id + device_id + message_sequence_number |
| IoT (HTTP) | customer_id + device_id + reading_timestamp |

For IoT with MQTT QoS 1: a retransmitted message carries the same per-message identifier — MQTT's packet identifier (with the DUP flag set), or better, an application-level sequence number in the payload, since packet identifiers are per-session and reused after acknowledgment. Combined with the device ID, that sequence number is the correct deduplication anchor: it is stable across retransmissions of the same original message.


schema_version — Non-Negotiable From Day One

Add this field even if you never intend to change the schema. You will change the schema. Having version 1 events and version 2 events coexisting in the same table is routine — without a version field you cannot tell them apart.

"schema_version": "2"

Use a simple integer string. Do not use semver for event schemas — minor version differences in billing events are not minor. Any field addition, removal, or type change that affects invoice calculation is a major version.

The version field enables the aggregation pipeline to route events to version-specific calculation logic:

from decimal import Decimal

class UnknownSchemaVersionError(Exception):
    """Raised when an event carries a schema version the pipeline cannot handle."""

def calculate_charge(event: dict) -> Decimal:
    version = event.get("schema_version", "1")

    if version == "1":
        return calculate_v1(event)
    elif version == "2":
        return calculate_v2(event)
    else:
        raise UnknownSchemaVersionError(version)

Historical invoices recalculated for dispute resolution will use the schema version that was active when the invoice was originally generated — not the current version. Without the version field, recalculation is impossible.


customer_id — Tenant, Not End-User

The customer_id must identify the billing entity — the company or account that will receive the invoice. It must never be an end-user identifier.

This distinction matters for three reasons:

  1. Compliance: End-user IDs in billing events are personal data under GDPR; customer_id (a B2B tenant identifier) typically is not.
  2. Aggregation: Invoice totals aggregate over all events for a customer_id. End-user-level aggregation is a different pipeline — do not conflate.
  3. Audit: A customer disputing their invoice queries by customer_id. An end-user identifier in this field forces the customer to understand your internal data model to verify their own invoice.

For multi-tenant B2B SaaS: customer_id is the organization/workspace/account ID. For B2B2C platforms where a corporate customer’s end-users generate billing events: customer_id is the corporate customer’s ID, never the consumer’s ID.


metric — The Billable Unit Type

A string enum identifying what is being measured. Not a description — a machine-readable identifier your pricing engine uses to apply the correct rate.

"metric": "api_call"
"metric": "compute_minute"
"metric": "storage_gb_day"
"metric": "consultation_session"
"metric": "device_transmission"

Design principles:

  • Use snake_case, all lowercase, no spaces
  • Make it specific enough to distinguish between different billable units you might charge separately (api_call_v1 vs. api_call_v2 if different rates apply to different API versions)
  • Never use free-text descriptions — “API call to /inference endpoint” is a description, not a metric identifier
  • Document your metric vocabulary as a schema artifact, not just in code comments

When you add a new billable metric, add a new metric identifier — do not reuse existing identifiers with different semantics.
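One lightweight way to treat the vocabulary as a schema artifact is a checked-in allowlist plus a format check. The validator below is an illustrative sketch using the metric names from this article:

```python
import re

# Canonical metric vocabulary — versioned alongside the schema, not buried in
# code comments. Adding a billable metric means adding an entry here.
METRIC_VOCABULARY = frozenset({
    "api_call",
    "compute_minute",
    "storage_gb_day",
    "consultation_session",
    "device_transmission",
})

# snake_case, all lowercase, no spaces
_METRIC_PATTERN = re.compile(r"^[a-z][a-z0-9]*(_[a-z0-9]+)*$")

def validate_metric(metric: str) -> str:
    """Reject free-text descriptions and unknown identifiers at ingestion."""
    if not _METRIC_PATTERN.match(metric):
        raise ValueError(f"not a metric identifier: {metric!r}")
    if metric not in METRIC_VOCABULARY:
        raise ValueError(f"unknown metric: {metric!r}")
    return metric
```

Running this at the ingestion boundary turns a typo'd or free-text metric into a hard error instead of a silently unbillable event.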


quantity — Decimal, Not Float

This field carries more risk than any other. Floating-point arithmetic errors in billing accumulate across millions of events.

# WRONG — float arithmetic
rate     = 0.00247          # per API call
calls    = 80_000_000
charge   = rate * calls     # → 197600.00000000003

# RIGHT — Decimal arithmetic
from decimal import Decimal
rate     = Decimal("0.00247")
calls    = Decimal("80000000")
charge   = rate * calls     # → Decimal("197600.00000")

At 80 million API calls at this rate, float arithmetic produces a total error of about $0.00000000003 on the line item — it rounds away to nothing, harmless. But at higher rates, larger aggregated quantities, or with per-event rounding at scale:

rate  = Decimal("0.0831")      # per GB-hour storage
hours = Decimal("720")          # one month
gbs   = Decimal("48291.7")      # storage consumed

charge = rate * hours * gbs    # → Decimal("2889388.99440")

Float arithmetic on a single calculation like this introduces only a microscopic error. The real damage comes from storing and rounding float quantities per event: the per-event errors accumulate rather than cancel, and across thousands of customers and millions of events they aggregate into material discrepancies that surface in billing disputes.

The rule: store quantity as a string representation of a decimal in the billing event. Deserialize to Decimal (Python) or BigDecimal (Java/Kotlin) in the pricing engine. Never use float or double for any monetary or quantity calculation.
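A sketch of that round trip — string on the wire, Decimal in the engine (values reuse the storage example above):

```python
import json
from decimal import Decimal

# Emission side: quantity is rendered as a string, so no float ever exists
# anywhere in the billing path.
event_json = json.dumps({"metric": "storage_gb_day", "quantity": "48291.7"})

# Pricing-engine side: parse the JSON, then construct the Decimal directly
# from the string — exact, with no float intermediary.
event = json.loads(event_json)
quantity = Decimal(event["quantity"])

rate = Decimal("0.0831")
charge = rate * quantity        # → Decimal("4013.04027"), exact to the last digit
```

Constructing `Decimal` from the string (never from a parsed float) is what preserves exactness end to end.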

SQL schema — use DECIMAL(20, 10) for quantity:

CREATE TABLE billing_events (
    event_id        VARCHAR(71)      NOT NULL,   -- "sha256:" + 64-char hex
    schema_version  SMALLINT         NOT NULL DEFAULT 1,
    customer_id     VARCHAR(64)      NOT NULL,
    metric          VARCHAR(64)      NOT NULL,
    quantity        DECIMAL(20, 10)  NOT NULL,   -- 20 total digits, 10 after the decimal point
    timestamp       TIMESTAMPTZ      NOT NULL,
    source_ref      VARCHAR(256)     NOT NULL,
    ingested_at     TIMESTAMPTZ      NOT NULL DEFAULT NOW(),

    PRIMARY KEY (event_id)                       -- enforces idempotency at DB level
);

-- In PostgreSQL (implied by TIMESTAMPTZ), secondary indexes are created
-- separately rather than inline in CREATE TABLE:
CREATE INDEX idx_customer_period ON billing_events (customer_id, timestamp);
CREATE INDEX idx_metric          ON billing_events (customer_id, metric, timestamp);

The PRIMARY KEY (event_id) constraint enforces idempotency at the database level — a duplicate insert with the same event_id will be rejected without requiring application-level deduplication logic. This is your last line of defense; the application should still deduplicate before attempting the insert, but the database constraint catches anything that slips through.
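The mechanism can be demonstrated end to end with SQLite standing in for the production database (SQLite's `INSERT OR IGNORE` plays the role of Postgres's `ON CONFLICT DO NOTHING`):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE billing_events (
        event_id TEXT PRIMARY KEY,   -- idempotency enforced here
        quantity TEXT NOT NULL
    )
""")

event = ("sha256:a3f9c2d1e4b7", "1")

# First delivery, then a duplicate redelivery of the same event. The conflict
# on the primary key silently drops the duplicate instead of double-charging.
conn.execute("INSERT OR IGNORE INTO billing_events VALUES (?, ?)", event)
conn.execute("INSERT OR IGNORE INTO billing_events VALUES (?, ?)", event)

count = conn.execute("SELECT COUNT(*) FROM billing_events").fetchone()[0]
# count == 1: the duplicate never became a second charge
```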


timestamp — Event Time, Not Ingestion Time

The timestamp field must record when the event occurred in the product, not when it was received by the billing pipeline. The difference matters at billing period boundaries.

A user completes an API call at 23:59:57 on March 31. The event arrives at the billing pipeline at 00:00:03 on April 1 (network latency + queuing). If timestamp is set to ingestion time, this event is billed in April instead of March. At scale, this creates systematic period boundary errors that compound with every late-arriving event.

"timestamp": "2026-03-31T23:59:57.000Z"

Requirements:

  • UTC — never local time
  • ISO 8601 format with millisecond precision
  • Set by the application at the point of event emission, not the billing pipeline
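A minimal sketch of period assignment driven by event time (the calendar-month period key is illustrative — your billing periods may differ):

```python
from datetime import datetime, timezone

def billing_period(event: dict) -> str:
    """Assign an event to a calendar-month billing period using the event's
    own timestamp — never the time the pipeline received it."""
    ts = datetime.fromisoformat(
        event["timestamp"].replace("Z", "+00:00")
    ).astimezone(timezone.utc)
    return f"{ts.year:04d}-{ts.month:02d}"

# The late-arriving example from above: emitted at 23:59:57 on March 31,
# ingested at 00:00:03 on April 1 — it still bills in March.
billing_period({"timestamp": "2026-03-31T23:59:57.000Z"})   # → "2026-03"
```

Had the function keyed off ingestion time instead, the same event would land in "2026-04" — the systematic boundary error described above.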

Late-arriving events (events where timestamp is more than N hours before the billing pipeline’s current processing time) require an explicit policy. See Engineering Metered Billing for IoT for the three policy options and their trade-offs.


source_reference — De-Identified Audit Pointer

A pointer back to the originating record in your application system. Used to resolve billing disputes (“show me the evidence that this event occurred”) without exposing application data in the billing pipeline.

For this field to be useful, the mapping from source_reference to application record must be maintained and queryable indefinitely — or at minimum for the retention period of the billing event that references it.

Design the reference to be:

  • Stable: the reference must remain valid as long as the billing event that uses it exists
  • De-identified: no PII, no PHI, no sensitive operational data
  • Resolvable: given the source_reference, an authorized internal query must be able to retrieve the original event from the application database

What to Explicitly Exclude

The fields that should not be in a billing event:

| Field type | Why to exclude | Risk if included |
|---|---|---|
| User ID / patient ID | Personal data under GDPR; PHI under HIPAA if health-context | Billing pipeline becomes a PHI store; requires a BAA |
| Session timestamps correlated to individuals | Behavioral data; HIPAA-adjacent for healthcare | Session timing can identify provider/patient activity |
| Request/message payload | Operational data; may contain PII or proprietary content | Billing store holds business-sensitive data beyond audit purposes |
| IP addresses | Personal data under GDPR; geolocation inference | Billing pipeline becomes a personal data processor |
| Internal routing metadata | Operational context irrelevant to billing | Schema bloat; version migration complexity |
| Device hardware identifiers (MAC, IMEI) | Can map to an individual or patient in medical contexts | Equivalent to a patient ID in clinical IoT billing events |
| Pricing rates | Rates change; embedding them in events locks historical recalculation | Cannot reprice historical data for corrections; rate changes require event schema migration |

Pricing rates in events deserves special attention. The billing event records what happened — quantity, metric, when. The pricing engine records what it costs — rate per unit, tiers, effective dates. These are separate concerns. Never store the rate that was applied in the billing event; store it in a separately versioned pricing configuration. This lets you correct a pricing error retroactively by reprocessing events through the corrected rate schedule without touching the event store.


Schema Versioning in Practice

Schema version 1 is always the version you built with insufficient forethought. Version 2 is when you add the fields you should have included from the start. Version 3 is when a compliance requirement changes what you’re allowed to store.

The migration pattern that works without downtime:

1. Add the new field to the schema as NULLABLE (for compatibility with v1 writers)
2. Update writers to emit schema_version: "2" and populate the new field
3. Update the aggregation pipeline to handle both versions
4. Backfill v1 events where the new field can be derived (not all cases — accept gaps)
5. Once all active writers have deployed v2, enforce NOT NULL on the new field in the schema

What does not work: changing the type of an existing field in place. If you need to change quantity from FLOAT to DECIMAL (a real migration many teams face), you need a new field (quantity_decimal) and a version bump: run both fields in parallel, then deprecate the old one after all historical data is migrated.
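During the parallel-run window the pipeline has to read both representations; a sketch, using the field names from the migration above (`read_quantity` itself is hypothetical):

```python
from decimal import Decimal

def read_quantity(event: dict) -> Decimal:
    """Read quantity across schema versions during a FLOAT → DECIMAL migration."""
    if "quantity_decimal" in event:
        # v2+ path: quantity_decimal is a string — exact by construction.
        return Decimal(event["quantity_decimal"])
    # Legacy v1 path: the stored value is a float. str() rounds it to its
    # shortest decimal repr — the value the emitter intended — before it
    # reaches Decimal, instead of importing the binary representation error
    # at full precision via Decimal(float).
    return Decimal(str(event["quantity"]))

read_quantity({"quantity_decimal": "48291.7"})   # → Decimal("48291.7")
read_quantity({"quantity": 48291.7})             # → Decimal("48291.7")
```

The `str()` hop on the legacy path is the subtle part: `Decimal(48291.7)` directly would faithfully preserve the float's error, which is exactly what the migration is trying to escape.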

Schema changelog as a first-class artifact:

billing_events schema changelog
-------------------------------
v1 (2025-05-01): Initial schema. customer_id, metric, quantity (FLOAT), timestamp.
v2 (2026-01-15): Added schema_version field. Added source_ref. Changed quantity to
                 DECIMAL(20,10) stored as string in JSON events. Old float events
                 preserved in quantity_v1 column; quantity_decimal added for v2+.
v3 (2026-03-01): Added schema_version to primary key hash for multi-version
                 idempotency safety. No data migration required.

Vertical-Specific Schema Examples

Dev Tools: Token-Based AI API

{
  "event_id":         "sha256:f7e3b1a9...",
  "schema_version":   "2",
  "customer_id":      "ws_4a3f9c21",
  "metric":           "llm_input_token",
  "quantity":         "1847",
  "timestamp":        "2026-03-16T09:14:22.331Z",
  "source_reference": "req_7d2e4f81"
}

Note: metric is llm_input_token, not api_call. If you charge separately for input tokens, output tokens, and fine-tuned model calls, each is a separate metric with a separate billing event. One API request may generate 2–3 billing events (input tokens + output tokens + cache read tokens). All share the same source_reference (req_7d2e4f81), so a dispute can be resolved by querying all events for that request ID.
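A sketch of that fan-out — one request, one event per metric, one shared source_reference (`emit_token_events` is illustrative; the hash helper repeats the earlier definition):

```python
import hashlib
import json

def generate_event_id(customer_id: str, metric: str, source_reference: str) -> str:
    canonical = json.dumps({"customer_id": customer_id, "metric": metric,
                            "source_reference": source_reference}, sort_keys=True)
    return "sha256:" + hashlib.sha256(canonical.encode()).hexdigest()

def emit_token_events(customer_id: str, request_id: str,
                      input_tokens: int, output_tokens: int) -> list:
    """One API request fans out to one billing event per token metric.
    All events share source_reference, so a dispute query on the request
    ID retrieves every charge that request generated."""
    events = []
    for metric, qty in [("llm_input_token", input_tokens),
                        ("llm_output_token", output_tokens)]:
        events.append({
            # metric is part of the hash input, so the two events get
            # distinct event_ids despite the shared source_reference
            "event_id":         generate_event_id(customer_id, metric, request_id),
            "schema_version":   "2",
            "customer_id":      customer_id,
            "metric":           metric,
            "quantity":         str(qty),   # string, never float
            "source_reference": request_id,
        })
    return events

events = emit_token_events("ws_4a3f9c21", "req_7d2e4f81", 1847, 312)
# Two events, distinct event_ids, same source_reference.
```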


Healthtech: Telehealth Consultation Session

{
  "event_id":         "sha256:2c9a7f3e...",
  "schema_version":   "2",
  "customer_id":      "org_88f2b3d1",
  "metric":           "consultation_session",
  "quantity":         "1",
  "timestamp":        "2026-03-16T14:22:00.000Z",
  "source_reference": "tok_5c8d2a1f"
}

The source_reference is tok_5c8d2a1f — a platform-generated session token mapped to the clinical encounter record in the EHR. The billing event contains no patient ID, no provider ID, no session duration, and no session timestamp beyond the billing event’s own timestamp. The clinical details live in the EHR under separate access controls.

For a detailed treatment of the PHI exclusion pattern, see Usage-Based Billing for Healthtech SaaS.


IoT: Industrial Sensor Reading

{
  "event_id":         "sha256:9d1b5f7a...",
  "schema_version":   "2",
  "customer_id":      "ten_f3a9c2d1",
  "metric":           "sensor_reading",
  "quantity":         "1",
  "timestamp":        "2026-03-16T08:04:51.000Z",
  "source_reference": "dev_b7e3:seq_00094712"
}

The source_reference encodes the device ID and MQTT message sequence number as a composite key: dev_b7e3:seq_00094712. This is the idempotency anchor for MQTT QoS 1 retransmissions — the same device message retransmitted after a network drop will generate the same event_id (since the source_reference is identical) and be deduplicated at the database level.

The device hardware identifier (dev_b7e3) is a platform-internal alias, not the device’s hardware MAC address or IMEI — the hardware address never appears in the billing pipeline.


The Pre-Production Schema Checklist

Before your first billing event hits production:

  • event_id is deterministic — the same event always produces the same ID
  • event_id uniqueness is enforced as a PRIMARY KEY constraint at the database level
  • quantity is stored as DECIMAL, not float or double
  • timestamp is event time, not ingestion time; UTC; ISO 8601
  • schema_version is present, even on version 1
  • customer_id identifies the billing entity (tenant), not the end-user
  • source_reference is de-identified — no PII, no PHI, no hardware identifiers
  • No user IDs, patient IDs, session content, IP addresses, or payload data in the event
  • Late-arrival policy is defined and tested (what happens to events timestamped before the current billing period?)
  • Schema changelog document exists before v1 ships
  • Pricing rates are stored in a separate rate schedule, not in the billing event

ABAXUS includes a production-validated billing event schema — deploy inside your own infrastructure with idempotency, decimal precision, and late-arrival handling built in


Self-hosted usage-based billing engine. Your billing data stays in your own database, in your own cloud region, under your own compliance controls. No per-transaction fees. Runs in your Kubernetes cluster.

See Pricing

Common Schema Mistakes That Reach Production

Using random UUIDs as event IDs. Every retry, every network duplicate, every batch replay creates a new event in the billing store. The double-billing is silent — no error is thrown, the charge just appears twice on the invoice. Discovered in production during billing dispute resolution, when the customer’s event count doesn’t match yours.

Storing float for quantity. The error is invisible in development (amounts are small, discrepancies are fractions of cents). In production at scale, float accumulation produces invoice totals that don’t match what the pricing engine calculated. The discrepancy is random and non-reproducible, which makes it nearly impossible to debug.

Not including schema_version. You will change the schema. When you do, you need to process v1 and v2 events differently in the aggregation pipeline. Without the version field, the only way to distinguish them is by the presence or absence of the new field — which is fragile and breaks when you add optional fields.

Patient IDs or user IDs in source_reference. The intent is good — create a direct link back to the originating record. The problem is that a patient ID in the billing event is PHI, regardless of field name. The billing pipeline is now a PHI processor. All the HIPAA obligations that apply to the clinical system now apply to the billing system.

Timestamp set at ingestion, not event time. Events that arrive late (network latency, queue depth, IoT connectivity gaps) get assigned to the wrong billing period. At period boundaries this creates systematic errors: the last few minutes of a billing period are consistently under-counted, and the first few minutes of the next period are over-counted with events that belong to the prior period.

Pricing rates embedded in events. When a pricing error is discovered — the wrong rate was applied to a customer’s events for two weeks — the fix requires either reprocessing the events or correcting the rate in the event records. If rates are stored in the events, you’ve created an audit trail problem: the corrected invoice no longer matches the stored events. Rates belong in versioned pricing configuration, not in the event store.


Book an Architecture Review for Your Billing Event Schema

Getting the schema right before production is the highest-leverage engineering decision in your billing infrastructure. Getting it wrong creates problems that are expensive to fix: double-billing requires customer reconciliation, precision errors require retroactive recalculation, and PHI in billing events requires compliance remediation that goes well beyond a code change.

ABAXUS offers 30-minute architecture reviews for engineering teams designing or auditing their billing event schema. In one session:

  • Schema review — walk through your current or planned event schema field by field; identify compliance risks, idempotency gaps, and precision issues before they hit production
  • Idempotency key design — review your key construction for your specific event sources (API requests, MQTT messages, database CDC events, webhook callbacks)
  • Vertical-specific guidance — PHI exclusion patterns for Healthtech, MQTT deduplication for IoT, high-frequency API billing for Dev Tools
  • Migration path — if your current schema has known issues, a realistic migration plan that doesn’t require a big-bang redeployment

This is a technical conversation, not a product demo. Bring your current event schema or your implementation plan.

Book your 30-minute billing schema architecture review →



ABAXUS is a self-hosted usage-based billing engine for engineering teams that need production-correct billing infrastructure. It ships with idempotent event ingestion, DECIMAL-precision quantity handling, schema versioning, configurable late-arrival policies, and a full audit trail — running inside your own Kubernetes cluster with your data in your own database. See pricing · Book a schema review


Stop debugging billing. Start shipping product.

Your billing layer should be invisible infrastructure. In 30 minutes we map your event sources, identify your data contract gaps, and show you exactly what fixing the architecture looks like. No sales pitch.