How Lumetry Works
Lumetry turns operational telemetry into incident intelligence. This page describes the product as a black box: the information it accepts, the decisions it makes, the objects it produces, and how those objects affect an operations workflow.
It intentionally does not describe Lumetry's internal services, databases, deployment control systems, or processing technology.
Product boundary
Lumetry owns the path from normalized operational signal to contextualized incident:
metric data
|
v
evaluate expected behavior
|
v
record threshold breaches
|
v
open and recover actionable alerts
|
v
correlate related alerts into incidents
|
v
attach service / CI context and notify responders
Collection, dashboards, ticketing, CMDB ownership, and long-term telemetry exploration can remain in the tools you already use. Lumetry integrates with those tools through HTTP APIs, configured sources, collectors, topology synchronization, and webhooks.
Inputs
Metric points
Every point has a metric key, timestamp, numeric value, and optional labels. Labels can represent instances, regions, clusters, or other peer dimensions.
Points can arrive by:
- direct batch or single-point API ingestion;
- scheduled collection from a configured metric source;
- a Lumetry Collector running in a customer-controlled environment.
Ingestion and evaluation are asynchronous. A successful 202 Accepted response means only that the data was accepted for processing, not that it has been evaluated; any resulting violations and alerts can appear shortly afterward.
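As an illustration, a batch push to `POST /v1/metrics` might be prepared as in the sketch below. Only the endpoint path and the four point attributes (metric key, timestamp, numeric value, optional labels) come from this page; the JSON field names, the `points` envelope, the host name, and the absence of authentication headers are assumptions made for the example, so check them against the actual API contract.

```python
import json
from datetime import datetime, timezone
from urllib import request


def build_metric_point(metric_key, value, timestamp, labels=None):
    """Return one metric point as a dict. Field names are assumed, not official."""
    point = {
        "metric_key": metric_key,
        "timestamp": timestamp.astimezone(timezone.utc).isoformat(),
        "value": float(value),
    }
    if labels:
        point["labels"] = dict(labels)  # optional peer dimensions
    return point


def build_ingest_request(base_url, points):
    """Prepare (but do not send) a batch POST to /v1/metrics."""
    body = json.dumps({"points": points}).encode("utf-8")  # envelope is assumed
    return request.Request(
        url=base_url.rstrip("/") + "/v1/metrics",
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


point = build_metric_point(
    "checkout.latency_ms",
    412.5,
    datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc),
    labels={"region": "eu-west-1"},
)
req = build_ingest_request("https://lumetry.example.com", [point])
# Sending is left to the caller, for example request.urlopen(req).
# A 202 status would mean "accepted for processing", not "already evaluated".
```

Because evaluation is asynchronous, a client built this way should not expect violations or alerts to be visible immediately after the call returns.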
Metric definitions
A metric definition gives a series its operational meaning: display name, unit, source, entity, ownership model, dimensionality, and lifecycle state. Only active, evaluable metrics can be attached to rules.
Rules and baselines
Rules describe what counts as abnormal and when abnormal points become actionable. They can use fixed thresholds or learned seasonal behavior, and can define separate Warning and Critical levels with trigger and recovery conditions.
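To make "learned seasonal behavior" concrete, the sketch below derives an expected band per hour of day from historical values, using the mean plus or minus `k` standard deviations. This page does not describe how Lumetry actually learns baselines; the hour-of-day bucketing, the `k` parameter, and the function name are assumptions chosen only to illustrate the idea of a band that changes with the time of day.

```python
from collections import defaultdict
from statistics import mean, pstdev


def learn_hourly_bands(history, k=3.0):
    """Return {hour_of_day: (lower, upper)} from (hour_of_day, value) samples.

    Illustrative only: each band is mean +/- k population standard deviations
    of the values previously seen in that hour.
    """
    buckets = defaultdict(list)
    for hour, value in history:
        buckets[hour].append(value)
    bands = {}
    for hour, values in buckets.items():
        center = mean(values)
        spread = k * pstdev(values)
        bands[hour] = (center - spread, center + spread)
    return bands


history = [(9, 100), (9, 110), (9, 90), (3, 10), (3, 10)]
bands = learn_hourly_bands(history, k=2.0)
# Business hours (9) get a wide band around 100; the quiet hour (3) gets a
# tight band around 10, so the same value can be normal at 09:00 and
# abnormal at 03:00.
```

A fixed-threshold rule corresponds to the degenerate case where every hour shares the same band.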
Topology
Topology models business services, applications, components, hosts, databases, and their relationships. Metric bindings connect each signal to the entity that emits it.
This context lets Lumetry answer:
- Which service is affected?
- Which configuration item is closest to the signal?
- Which alerts likely describe the same operational problem?
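The first two questions above can be sketched as a lookup followed by a graph walk: the metric binding gives the closest configuration item, and following relationships upward reaches the business services it supports. The entity names, the `supports` relationship, and the in-memory dictionaries are hypothetical; they stand in for whatever topology has been synchronized.

```python
from collections import deque

# Hypothetical topology: entity -> kind, and entity -> entities it supports.
KINDS = {
    "db-orders-01": "database",
    "orders-api": "application",
    "checkout": "business_service",
}
SUPPORTS = {
    "db-orders-01": ["orders-api"],
    "orders-api": ["checkout"],
}
# Hypothetical metric bindings: metric key -> emitting entity (the closest CI).
BINDINGS = {"db.orders.replication_lag": "db-orders-01"}


def resolve_context(metric_key):
    """Return (closest_ci, affected_services) for a metric, or (None, [])."""
    ci = BINDINGS.get(metric_key)
    if ci is None:
        return None, []
    services, seen, queue = [], {ci}, deque([ci])
    while queue:  # breadth-first walk up the "supports" relationships
        entity = queue.popleft()
        if KINDS.get(entity) == "business_service":
            services.append(entity)
        for parent in SUPPORTS.get(entity, []):
            if parent not in seen:  # guard against cycles in the graph
                seen.add(parent)
                queue.append(parent)
    return ci, services
```

In this example, a replication-lag signal bound to `db-orders-01` resolves to the `checkout` service, while an unbound metric yields no topology context at all.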
Notification routing
Integrations define delivery targets. Alerting profiles decide which alert lifecycle transitions are sent to which targets.
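One way to picture an alerting profile is as a mapping from lifecycle transition to delivery targets. The sketch below assumes two transitions, `opened` and `closed`, and uses hypothetical profile and target names; the real profile schema may differ.

```python
# Hypothetical profiles: profile name -> transition -> integration targets.
PROFILES = {
    "payments-oncall": {
        "opened": ["pager-webhook", "payments-chat"],
        "closed": ["payments-chat"],
    },
    "default": {"opened": ["ops-chat"], "closed": []},
}


def targets_for(profile_name, transition, profiles=PROFILES):
    """Return the targets for a transition, falling back to the default profile."""
    profile = profiles.get(profile_name) or profiles["default"]
    return list(profile.get(transition, []))
```

Here a rule using `payments-oncall` pages on open but only posts to chat on close, and a rule with no profile falls back to the default policy.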
Decisions Lumetry makes
Is the point abnormal?
Lumetry compares each evaluable point with the threshold band in force at that time. A breach becomes a violation containing the actual value, expected range, direction, severity, and confidence context.
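A minimal sketch of this comparison, assuming each severity level has its own lower and upper bound and that the most severe breached level wins. The class and field names are illustrative, and the confidence context mentioned above is left out because this page does not define it.

```python
from dataclasses import dataclass
from typing import List, Optional

SEVERITY_ORDER = {"critical": 0, "warning": 1}  # lower number = checked first


@dataclass(frozen=True)
class Band:
    lower: float
    upper: float
    severity: str  # "warning" or "critical"


@dataclass(frozen=True)
class Violation:
    actual: float
    expected_lower: float
    expected_upper: float
    direction: str  # "above" or "below"
    severity: str


def evaluate_point(value: float, bands: List[Band]) -> Optional[Violation]:
    """Return a violation for the most severe band breached, else None."""
    for band in sorted(bands, key=lambda b: SEVERITY_ORDER[b.severity]):
        if value > band.upper:
            direction = "above"
        elif value < band.lower:
            direction = "below"
        else:
            continue  # inside this band; try the next, less severe one
        return Violation(value, band.lower, band.upper, direction, band.severity)
    return None


bands = [Band(0, 300, "warning"), Band(0, 500, "critical")]
```

With these bands, a value of 250 is normal, 400 breaches only the Warning level, and 650 is reported as a Critical violation above the expected range.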
Is the condition actionable?
Rules use violation counts and time windows to avoid opening an alert for every isolated spike. Recovery settings likewise require enough normal behavior before an alert closes, which reduces flapping.
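The trigger and recovery behavior can be sketched as a small state machine: an alert opens once enough violations fall inside a time window, and closes only after a run of consecutive normal points. The parameter names and the exact semantics (count within a sliding window, consecutive normal points for recovery) are assumptions for illustration, not the documented rule options.

```python
from collections import deque


class AlertTracker:
    """Track one rule/series pair; timestamps are seconds as numbers."""

    def __init__(self, trigger_count, window_seconds, recovery_count):
        self.trigger_count = trigger_count
        self.window_seconds = window_seconds
        self.recovery_count = recovery_count
        self._violations = deque()  # timestamps of recent violations
        self._normal_streak = 0
        self.is_open = False

    def observe(self, timestamp, is_violation):
        """Feed one evaluated point; return "opened", "recovered", or None."""
        if is_violation:
            self._normal_streak = 0
            self._violations.append(timestamp)
            # Forget violations that fell out of the trigger window.
            while timestamp - self._violations[0] > self.window_seconds:
                self._violations.popleft()
            if not self.is_open and len(self._violations) >= self.trigger_count:
                self.is_open = True
                return "opened"
            return None
        self._normal_streak += 1
        if self.is_open and self._normal_streak >= self.recovery_count:
            self.is_open = False
            self._violations.clear()
            return "recovered"
        return None


tracker = AlertTracker(trigger_count=3, window_seconds=300, recovery_count=2)
points = [(0, True), (60, False), (120, True), (180, True), (240, False), (300, False)]
events = [tracker.observe(ts, bad) for ts, bad in points]
```

In this run the first two violations stay forensic, the third within five minutes opens the alert, and the alert recovers only after two consecutive normal points, so a single normal point in between does not cause flapping.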
Which alerts belong together?
When an alert opens, Lumetry correlates it with current incidents using service or CI context, severity, and time proximity. When topology context is unavailable, the metric identity provides the fallback grouping key.
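The grouping logic can be approximated as follows: derive a key from the service context when it exists, fall back to the metric key otherwise, and join an existing incident only if it is close enough in time. This sketch omits the severity factor mentioned above, and the ten-minute gap, class names, and key format are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Alert:
    alert_id: str
    metric_key: str
    opened_at: float  # epoch seconds
    service: Optional[str] = None  # None when topology context is unavailable


@dataclass
class Incident:
    group_key: str
    last_alert_at: float
    alert_ids: List[str] = field(default_factory=list)


def grouping_key(alert: Alert) -> str:
    """Prefer service context; fall back to the metric identity."""
    if alert.service:
        return f"service:{alert.service}"
    return f"metric:{alert.metric_key}"


def correlate(alert: Alert, incidents: List[Incident], max_gap_seconds: float = 600.0) -> Incident:
    """Attach the alert to a matching recent incident, or start a new one."""
    key = grouping_key(alert)
    for incident in incidents:
        close_in_time = abs(alert.opened_at - incident.last_alert_at) <= max_gap_seconds
        if incident.group_key == key and close_in_time:
            incident.alert_ids.append(alert.alert_id)
            incident.last_alert_at = max(incident.last_alert_at, alert.opened_at)
            return incident
    incident = Incident(key, alert.opened_at, [alert.alert_id])
    incidents.append(incident)
    return incident


incidents: List[Incident] = []
first = correlate(Alert("a1", "orders.latency", 1000.0, service="checkout"), incidents)
second = correlate(Alert("a2", "orders.errors", 1200.0, service="checkout"), incidents)
third = correlate(Alert("a3", "disk.usage", 1300.0), incidents)
```

Here `a1` and `a2` land in the same `checkout` incident even though they come from different metrics, while `a3`, which has no topology context, is grouped by its metric key.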
Who should be notified?
Alert transitions are routed according to the rule's alerting profile or the configured default delivery policy. Delivery failures are retried so a temporary destination outage delays notification instead of silently discarding it.
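Retried delivery can be illustrated with a bounded exponential backoff loop. This is a generic sketch, not Lumetry's actual retry policy: the attempt limit, the delays, and the use of `ConnectionError` to represent a temporary destination outage are all assumptions.

```python
import time


def deliver_with_retry(send, payload, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call send(payload) until it succeeds; return the number of attempts used.

    Re-raises the last ConnectionError if every attempt fails, so the failure
    is surfaced instead of silently dropped.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            send(payload)
            return attempt
        except ConnectionError:
            if attempt == max_attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))  # 1s, 2s, 4s, ...


# Demo with a destination that fails twice, then recovers.
failures_left = [2]
delays = []


def flaky_send(payload):
    if failures_left[0] > 0:
        failures_left[0] -= 1
        raise ConnectionError("destination unavailable")


attempts = deliver_with_retry(flaky_send, {"transition": "opened"}, sleep=delays.append)
```

Passing `sleep` as a parameter keeps the demo instantaneous: the delays are recorded in a list rather than actually waited for.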
Outputs
| Output | What it means | Operational use |
|---|---|---|
| Violation | One point breached its expected range. | Forensics, tuning, and audit. |
| Alert | A sustained or repeated condition requires attention. | On-call ownership and recovery tracking. |
| Incident | Related alerts represent one broader problem. | Coordinated incident response. |
| Timeline | Lifecycle and correlation events in time order. | Investigation and handoff. |
| Impact context | Related service, CI, metrics, and alert counts. | Prioritization and routing. |
| Notification | An alert opened or closed. | Downstream chat, webhook, or automation flow. |
| System alert | Collection or another platform-facing input is unhealthy and needs attention. | Restore observability coverage. |
Effects on the operations process
Without correlation, responders receive individual signals and must reconstruct the service impact manually. With Lumetry:
- transient points remain forensic violations rather than immediate pages;
- sustained conditions become alerts with explicit trigger reasons;
- related alerts join one incident instead of creating separate investigations;
- topology adds the affected service and CI;
- timelines and underlying evidence stay attached to the incident;
- lifecycle transitions can drive external automation.
Integration surfaces
| Goal | Surface |
|---|---|
| Push metric points | POST /v1/metrics |
| Register and operate customer-side collection | /api/collectors and /v1/collectors/* |
| Register metric metadata | /api/metric-definitions |
| Configure pull-based metric sources | /api/metric-sources |
| Synchronize service/CI topology | POST /v1/topology |
| Read alerts and incidents | /api/alerts, /api/incidents |
| Deliver alert transitions | Alerting integrations and profiles |
The public HTTP contracts remain the integration boundary regardless of how Lumetry is deployed.