How Lumetry Works

Lumetry turns operational telemetry into incident intelligence. This page describes the product as a black box: the information it accepts, the decisions it makes, the objects it produces, and how those objects affect an operations workflow.

It intentionally does not describe Lumetry's internal services, databases, deployment control systems, or processing technology.

Product boundary

Lumetry owns the path from normalized operational signal to contextualized incident:

metric data
    |
    v
evaluate expected behavior
    |
    v
record threshold breaches
    |
    v
open and recover actionable alerts
    |
    v
correlate related alerts into incidents
    |
    v
attach service / CI context and notify responders

Collection, dashboards, ticketing, CMDB ownership, and long-term telemetry exploration can remain in the tools you already use. Lumetry integrates with those tools through HTTP APIs, configured sources, collectors, topology synchronization, and webhooks.

Inputs

Metric points

Every point has a metric key, timestamp, numeric value, and optional labels. Labels can represent instances, regions, clusters, or other peer dimensions.

Points can arrive by:

direct batch or single-point API ingestion;
scheduled collection from a configured metric source;
a Lumetry Collector running in a customer-controlled environment.

Ingestion and evaluation are asynchronous. A 202 Accepted response confirms only that the data was received for processing, not that it has been evaluated; any resulting violations and alerts can appear shortly afterward.
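As a rough illustration of direct batch ingestion, the sketch below builds a request for `POST /v1/metrics`. Only the endpoint path and the 202 Accepted status come from this page. The JSON field names (`metric`, `timestamp`, `value`, `labels`), the `points` envelope, the bearer-token header, and the host name are assumptions made for the example, so check the published API contract before relying on them.

```python
import json
import urllib.request
from dataclasses import asdict, dataclass, field


@dataclass
class MetricPoint:
    # Field names are illustrative assumptions, not the documented schema.
    metric: str                      # metric key
    timestamp: float                 # Unix seconds
    value: float                     # numeric value
    labels: dict = field(default_factory=dict)  # optional peer dimensions


def build_request(base_url: str, token: str, points: list) -> urllib.request.Request:
    """Build (but do not send) a batch ingestion request."""
    # The {"points": [...]} envelope is an assumption for this sketch.
    body = json.dumps({"points": [asdict(p) for p in points]}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/v1/metrics",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",  # assumed auth scheme
        },
    )


def push(request: urllib.request.Request) -> bool:
    """Send the batch; True means accepted for processing, not yet evaluated."""
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status == 202
```

Because evaluation is asynchronous, a caller in this sketch would treat a `True` result from `push` as confirmation of receipt and read `/api/alerts` later to see any outcome.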

Metric definitions

A metric definition gives a series its operational meaning: display name, unit, source, entity, ownership model, dimensionality, and lifecycle state. Only active, evaluable metrics can be attached to rules.

Rules and baselines

Rules describe what counts as abnormal and when abnormal points become actionable. They can use fixed thresholds or learned seasonal behavior, and can define separate Warning and Critical levels with trigger and recovery conditions.

Topology

Topology models business services, applications, components, hosts, databases, and their relationships. Metric bindings connect each signal to the entity that emits it.

This context lets Lumetry answer:

Which service is affected?
Which configuration item is closest to the signal?
Which alerts likely describe the same operational problem?
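To make the first two questions concrete, here is a minimal sketch of resolving a metric to its closest configuration item and then to the business service above it. The entity names, the single-parent relationship map, and the `locate` helper are all hypothetical; real topologies can have several parents per entity, and this page does not describe how Lumetry stores or walks them.

```python
# Hypothetical topology data, for illustration only.
KINDS = {
    "checkout": "business_service",
    "orders-api": "application",
    "db-primary": "database",
    "host-17": "host",
}
# child -> parent relationships (simplified to one parent per entity)
PARENTS = {
    "orders-api": "checkout",
    "db-primary": "orders-api",
    "host-17": "orders-api",
}
# metric key -> entity that emits it (the metric binding)
BINDINGS = {"db.connections": "db-primary"}


def locate(metric: str):
    """Return (closest CI, affected business service); either may be None."""
    ci = BINDINGS.get(metric)
    if ci is None:
        return None, None
    node, seen = ci, set()
    # Walk upward until a business service is found; guard against cycles.
    while node is not None and node not in seen:
        seen.add(node)
        if KINDS.get(node) == "business_service":
            return ci, node
        node = PARENTS.get(node)
    return ci, None
```

A metric without a binding yields `(None, None)` here, which corresponds to the case where topology context is unavailable.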

Notification routing

Integrations define delivery targets. Alerting profiles decide which alert lifecycle transitions are sent to which targets.

Decisions Lumetry makes

Is the point abnormal?

Lumetry compares each evaluable point with the threshold band in force at that time. A breach becomes a violation containing the actual value, expected range, direction, severity, and confidence context.
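The comparison can be pictured with a small sketch that checks one value against Warning and Critical bands and returns a violation record. The `Band` and `Violation` types and the `evaluate` function are hypothetical names, the bands are fixed numbers rather than learned seasonal ranges, and the confidence context is left out because this page does not say what it contains.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Band:
    lower: float
    upper: float


@dataclass(frozen=True)
class Violation:
    actual: float
    expected: Band   # the range the point was expected to stay within
    direction: str   # "above" or "below"
    severity: str    # "warning" or "critical"


def evaluate(value: float, warning: Band, critical: Band) -> Optional[Violation]:
    """Return a Violation if the value leaves a band, else None."""
    # Check the critical band first so the most severe level wins.
    for severity, band in (("critical", critical), ("warning", warning)):
        if value > band.upper:
            return Violation(value, band, "above", severity)
        if value < band.lower:
            return Violation(value, band, "below", severity)
    return None
```

In this sketch the critical band is assumed to be at least as wide as the warning band, so a value that breaches only the warning band is reported at Warning level.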

Is the condition actionable?

Rules use violation counts and time windows to avoid opening an alert for every isolated spike. Recovery settings likewise require enough normal behavior before an alert closes, which reduces flapping.
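One way to picture trigger and recovery conditions is a small state machine: open an alert once enough violations fall inside a time window, and close it only after a run of normal points. The parameter names (`trigger_count`, `window_seconds`, `recovery_count`) and the specific policy are assumptions for illustration, not Lumetry's documented rule settings.

```python
from collections import deque


class AlertState:
    """Tracks one rule on one series; a sketch, not the product's logic."""

    def __init__(self, trigger_count=3, window_seconds=300, recovery_count=5):
        self.trigger_count = trigger_count
        self.window_seconds = window_seconds
        self.recovery_count = recovery_count
        self.violations = deque()  # timestamps of recent violations
        self.normal_streak = 0     # consecutive normal points seen
        self.is_open = False

    def observe(self, timestamp: float, is_violation: bool):
        """Return "opened", "closed", or None for this point."""
        if is_violation:
            self.normal_streak = 0
            self.violations.append(timestamp)
            # Forget violations that have aged out of the window.
            while timestamp - self.violations[0] > self.window_seconds:
                self.violations.popleft()
            if not self.is_open and len(self.violations) >= self.trigger_count:
                self.is_open = True
                return "opened"
            return None
        self.normal_streak += 1
        if self.is_open and self.normal_streak >= self.recovery_count:
            self.is_open = False
            self.violations.clear()
            return "closed"
        return None
```

An isolated spike never reaches `trigger_count` inside the window, so it stays a violation only. A single normal point during an open alert does not close it, because any new violation resets the streak.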

Which alerts belong together?

When an alert opens, Lumetry correlates it with current incidents using service or CI context, severity, and time proximity. When topology context is unavailable, the metric identity provides the fallback grouping key.
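The grouping idea can be sketched as a key lookup with a time-proximity check. Everything named here is hypothetical: the `Alert` and `Incident` types, the key format, the preference for service over CI, and the 15-minute gap. Severity is carried on the alert but deliberately not used for matching, because this page does not say how it is weighed.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Alert:
    metric: str
    opened_at: float            # Unix seconds
    severity: str
    service: Optional[str] = None
    ci: Optional[str] = None


@dataclass
class Incident:
    key: str
    last_alert_at: float
    alerts: list = field(default_factory=list)


def grouping_key(alert: Alert) -> str:
    """Prefer topology context; fall back to the metric identity."""
    if alert.service:
        return f"service:{alert.service}"
    if alert.ci:
        return f"ci:{alert.ci}"
    return f"metric:{alert.metric}"


def correlate(alert: Alert, incidents: list, max_gap_seconds: float = 900) -> Incident:
    """Attach the alert to a matching recent incident, or start a new one."""
    key = grouping_key(alert)
    for incident in incidents:
        close_in_time = abs(alert.opened_at - incident.last_alert_at) <= max_gap_seconds
        if incident.key == key and close_in_time:
            incident.alerts.append(alert)
            incident.last_alert_at = max(incident.last_alert_at, alert.opened_at)
            return incident
    incident = Incident(key=key, last_alert_at=alert.opened_at, alerts=[alert])
    incidents.append(incident)
    return incident
```

Two alerts on different metrics of the same service that open minutes apart land in one incident, while an alert with no topology context is grouped only with alerts on the same metric.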

Who should be notified?

Alert transitions are routed according to the rule's alerting profile or the configured default delivery policy. Delivery failures are retried so a temporary destination outage delays notification instead of silently discarding it.
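The retry behavior can be illustrated with a generic exponential-backoff loop. This is a common pattern rather than Lumetry's documented retry policy: the attempt count, the delays, and the `deliver_with_retry` helper are assumptions, and `send` stands in for whatever call reaches the delivery target.

```python
import time
from typing import Callable


def deliver_with_retry(
    send: Callable[[dict], None],
    payload: dict,
    attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Try to deliver a notification, backing off between failed attempts."""
    for attempt in range(attempts):
        try:
            send(payload)
            return True
        except OSError:
            # Network-level failures (connection refused, timeout) are retried.
            if attempt == attempts - 1:
                break
            sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...
    return False
```

A `False` result means every attempt failed, so in this sketch the caller would record the failure rather than drop the notification without a trace.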

Outputs

Output	What it means	Operational use
Violation	One point breached its expected range.	Forensics, tuning, and audit.
Alert	A sustained or repeated condition requires attention.	On-call ownership and recovery tracking.
Incident	Related alerts represent one broader problem.	Coordinated incident response.
Timeline	Lifecycle and correlation events in time order.	Investigation and handoff.
Impact context	Related service, CI, metrics, and alert counts.	Prioritization and routing.
Notification	An alert opened or closed.	Downstream chat, webhook, or automation flow.
System alert	The health of collection or another platform-facing input needs attention.	Restore observability coverage.

Effects on the operations process

Without correlation, responders receive individual signals and must reconstruct the service impact manually. With Lumetry:

transient points remain forensic violations rather than immediate pages;
sustained conditions become alerts with explicit trigger reasons;
related alerts join one incident instead of creating separate investigations;
topology adds the affected service and CI;
timelines and underlying evidence stay attached to the incident;
lifecycle transitions can drive external automation.

Integration surfaces

Goal	Surface
Push metric points	`POST /v1/metrics`
Register and operate customer-side collection	`/api/collectors` and `/v1/collectors/*`
Register metric metadata	`/api/metric-definitions`
Configure pull-based metric sources	`/api/metric-sources`
Synchronize service/CI topology	`POST /v1/topology`
Read alerts and incidents	`/api/alerts`, `/api/incidents`
Deliver alert transitions	Alerting integrations and profiles

The public HTTP contracts remain the integration boundary regardless of how Lumetry is deployed.
