How Lumetry Works

Lumetry turns operational telemetry into incident intelligence. This page describes the product as a black box: the information it accepts, the decisions it makes, the objects it produces, and how those objects affect an operations workflow.

It intentionally does not describe Lumetry's internal services, databases, deployment control systems, or processing technology.

Product boundary

Lumetry owns the path from normalized operational signal to contextualized incident:

metric data
|
v
evaluate expected behavior
|
v
record threshold breaches
|
v
open and recover actionable alerts
|
v
correlate related alerts into incidents
|
v
attach service / CI context and notify responders

Collection, dashboards, ticketing, CMDB ownership, and long-term telemetry exploration can remain in the tools you already use. Lumetry integrates with those tools through HTTP APIs, configured sources, collectors, topology synchronization, and webhooks.

Inputs

Metric points

Every point has a metric key, timestamp, numeric value, and optional labels. Labels can represent instances, regions, clusters, or other peer dimensions.

Points can arrive by:

  • direct batch or single-point API ingestion;
  • scheduled collection from a configured metric source;
  • a Lumetry Collector running in a customer-controlled environment.

Ingestion and evaluation are asynchronous. A 202 Accepted response means only that the data was accepted for processing, not that it has been evaluated; any resulting violations and alerts can appear shortly afterward.
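
As an illustration of direct ingestion, the sketch below builds a batch request for POST /v1/metrics without sending it. Only the endpoint path, the four point attributes, and the 202 Accepted status come from this page. The JSON field names (`points`, `metric_key`, `timestamp`, `value`, `labels`), the bearer-token header, and the host `lumetry.example` are assumptions made for the example, not the documented wire format.

```python
import json
import urllib.request
from datetime import datetime, timezone


def make_point(metric_key, value, labels=None, timestamp=None):
    """Build one metric point: key, timestamp, numeric value, optional labels.

    The field names are hypothetical; check the API reference for the real ones.
    """
    ts = timestamp or datetime.now(timezone.utc)
    point = {
        "metric_key": metric_key,
        "timestamp": ts.isoformat(),
        "value": float(value),
    }
    if labels:
        point["labels"] = dict(labels)
    return point


def build_metrics_request(base_url, points, token):
    """Build (but do not send) a batch ingestion request."""
    body = json.dumps({"points": points}).encode("utf-8")  # assumed envelope
    return urllib.request.Request(
        url=f"{base_url}/v1/metrics",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",  # assumed auth scheme
        },
    )


def is_accepted(status_code):
    """202 means accepted for processing, not yet evaluated."""
    return status_code == 202
```

A caller would pass the request to `urllib.request.urlopen` and treat a 202 as "queued", then read violations and alerts later rather than expecting them in the response.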

Metric definitions

A metric definition gives a series its operational meaning: display name, unit, source, entity, ownership model, dimensionality, and lifecycle state. Only active, evaluable metrics can be attached to rules.

Rules and baselines

Rules describe what counts as abnormal and when abnormal points become actionable. They can use fixed thresholds or learned seasonal behavior, and can define separate Warning and Critical levels with trigger and recovery conditions.

Topology

Topology models business services, applications, components, hosts, databases, and their relationships. Metric bindings connect each signal to the entity that emits it.

This context lets Lumetry answer:

  • Which service is affected?
  • Which configuration item is closest to the signal?
  • Which alerts likely describe the same operational problem?

Notification routing

Integrations define delivery targets. Alerting profiles decide which alert lifecycle transitions are sent to which targets.
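
The relationship between profiles and integrations can be pictured as a lookup from lifecycle transition to delivery targets. This is a minimal sketch: the class name, the transition names `opened` and `closed`, and the integration names are hypothetical and do not reflect Lumetry's actual configuration schema.

```python
from dataclasses import dataclass, field


@dataclass
class AlertingProfile:
    """Maps alert lifecycle transitions to integration names (all hypothetical)."""
    name: str
    routes: dict = field(default_factory=dict)  # transition -> list of targets


def targets_for(profile, transition):
    """Return the integrations that should receive this transition."""
    return list(profile.routes.get(transition, []))


# Example: page on open, but only post to chat on close.
payments = AlertingProfile(
    name="payments-oncall",
    routes={
        "opened": ["pager-webhook", "ops-chat"],
        "closed": ["ops-chat"],
    },
)
```

A transition with no entry in the profile yields an empty target list, so nothing is sent for it.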

Decisions Lumetry makes

Is the point abnormal?

Lumetry compares each evaluable point with the threshold band in force at that time. A breach becomes a violation containing the actual value, expected range, direction, severity, and confidence context.
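
The comparison can be sketched as a check against Warning and Critical bands. The types and function below are hypothetical and only illustrate the idea under simple assumptions: a wider Critical band is checked before a narrower Warning band, and the violation records the band that was breached. Confidence context is omitted because this page does not define it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Band:
    lower: float
    upper: float
    severity: str  # "warning" or "critical"


@dataclass(frozen=True)
class Violation:
    actual: float
    expected_lower: float
    expected_upper: float
    direction: str  # "above" or "below"
    severity: str


def evaluate(value: float, warning: Band, critical: Optional[Band] = None) -> Optional[Violation]:
    """Return a Violation for the most severe band breached, or None if normal."""
    for band in (critical, warning):  # most severe first
        if band is None:
            continue
        if value > band.upper:
            return Violation(value, band.lower, band.upper, "above", band.severity)
        if value < band.lower:
            return Violation(value, band.lower, band.upper, "below", band.severity)
    return None
```

For a fixed-threshold rule the bands are constants; for a seasonal rule the same check would run against whatever band is in force at the point's timestamp.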

Is the condition actionable?

Rules use violation counts and time windows to avoid opening an alert for every isolated spike. Recovery settings likewise require enough normal behavior before an alert closes, which reduces flapping.
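
One way to picture trigger and recovery conditions is a small state machine. In this sketch an alert opens once a minimum number of violations fall inside a sliding time window and closes after a run of consecutive normal points. The class and parameter names are hypothetical, and the real rule settings may be expressed differently.

```python
from collections import deque


class AlertState:
    """Open on N violations within a window; close after M consecutive normal points."""

    def __init__(self, trigger_count, trigger_window_s, recovery_count):
        self.trigger_count = trigger_count
        self.trigger_window_s = trigger_window_s
        self.recovery_count = recovery_count
        self.open = False
        self._violations = deque()  # timestamps of recent violations
        self._normal_streak = 0

    def observe(self, ts, violated):
        """Feed one evaluated point; return "opened", "closed", or None."""
        if violated:
            self._normal_streak = 0
            self._violations.append(ts)
        else:
            self._normal_streak += 1
        # Forget violations that have aged out of the trigger window.
        while self._violations and ts - self._violations[0] > self.trigger_window_s:
            self._violations.popleft()
        if not self.open and len(self._violations) >= self.trigger_count:
            self.open = True
            return "opened"
        if self.open and self._normal_streak >= self.recovery_count:
            self.open = False
            self._violations.clear()
            return "closed"
        return None
```

With `AlertState(3, 300, 2)`, a single spike never opens an alert, and one normal point between violations does not close an open one, which is the anti-flapping behavior described above.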

Which alerts belong together?

When an alert opens, Lumetry correlates it with current incidents using service or CI context, severity, and time proximity. When topology context is unavailable, the metric identity provides the fallback grouping key.
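
The grouping logic can be sketched as a key lookup with a time-proximity check. Everything here is an assumption for illustration: the type names, the preference for service over CI, the 15-minute default window, and the choice to raise an incident's severity to that of its most severe alert. The page states only that service or CI context, severity, and time proximity are used, with metric identity as the fallback.

```python
from dataclasses import dataclass, field
from typing import Optional

SEVERITY_RANK = {"warning": 1, "critical": 2}


@dataclass
class Alert:
    metric_key: str
    severity: str
    opened_at: float  # seconds since epoch
    service: Optional[str] = None
    ci: Optional[str] = None


@dataclass
class Incident:
    key: tuple
    severity: str
    last_alert_at: float
    alerts: list = field(default_factory=list)


def grouping_key(alert):
    """Prefer topology context; fall back to the metric identity."""
    if alert.service:
        return ("service", alert.service)
    if alert.ci:
        return ("ci", alert.ci)
    return ("metric", alert.metric_key)


def correlate(alert, open_incidents, window_s=900.0):
    """Attach the alert to a matching open incident, or start a new one."""
    key = grouping_key(alert)
    for inc in open_incidents:
        if inc.key == key and abs(alert.opened_at - inc.last_alert_at) <= window_s:
            inc.alerts.append(alert)
            inc.last_alert_at = max(inc.last_alert_at, alert.opened_at)
            if SEVERITY_RANK[alert.severity] > SEVERITY_RANK[inc.severity]:
                inc.severity = alert.severity  # assumed escalation behavior
            return inc
    inc = Incident(key=key, severity=alert.severity,
                   last_alert_at=alert.opened_at, alerts=[alert])
    open_incidents.append(inc)
    return inc
```

Two alerts on different metrics bound to the same service land in one incident, while an alert with no topology binding groups only with other alerts on the same metric.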

Who should be notified?

Alert transitions are routed according to the rule's alerting profile or the configured default delivery policy. Delivery failures are retried so a temporary destination outage delays notification instead of silently discarding it.
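
Retried delivery is commonly implemented with exponential backoff. The sketch below shows that general pattern, not Lumetry's actual retry policy: the function name, the attempt limit, and the delays are assumptions, and `send` stands for any callable that raises when the destination is unavailable.

```python
import time


def deliver_with_retry(send, payload, max_attempts=5, base_delay_s=1.0, sleep=time.sleep):
    """Call send(payload) until it succeeds; return the attempt number that worked.

    Waits base_delay_s, then 2x, 4x, ... between attempts. Re-raises the last
    error once max_attempts is exhausted, so a failure is never silent.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            send(payload)
            return attempt
        except Exception:
            if attempt == max_attempts:
                raise
            sleep(base_delay_s * 2 ** (attempt - 1))
```

Passing `sleep` as a parameter keeps the function testable without real waiting. A production sender would typically also add jitter and persist undelivered notifications across restarts.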

Outputs

Output         | What it means                                                | Operational use
---------------+--------------------------------------------------------------+--------------------------------------------
Violation      | One point breached its expected range.                       | Forensics, tuning, and audit.
Alert          | A sustained or repeated condition requires attention.        | On-call ownership and recovery tracking.
Incident       | Related alerts represent one broader problem.                | Coordinated incident response.
Timeline       | Lifecycle and correlation events in time order.              | Investigation and handoff.
Impact context | Related service, CI, metrics, and alert counts.              | Prioritization and routing.
Notification   | An alert opened or closed.                                   | Downstream chat, webhook, or automation flow.
System alert   | Collection or platform-facing input health needs attention. | Restore observability coverage.

Effects on the operations process

Without correlation, responders receive individual signals and must reconstruct the service impact manually. With Lumetry:

  1. transient points remain forensic violations rather than immediate pages;
  2. sustained conditions become alerts with explicit trigger reasons;
  3. related alerts join one incident instead of creating separate investigations;
  4. topology adds the affected service and CI;
  5. timelines and underlying evidence stay attached to the incident;
  6. lifecycle transitions can drive external automation.

Integration surfaces

Goal                                          | Surface
----------------------------------------------+--------------------------------------
Push metric points                            | POST /v1/metrics
Register and operate customer-side collection | /api/collectors and /v1/collectors/*
Register metric metadata                      | /api/metric-definitions
Configure pull-based metric sources           | /api/metric-sources
Synchronize service/CI topology               | POST /v1/topology
Read alerts and incidents                     | /api/alerts, /api/incidents
Deliver alert transitions                     | Alerting integrations and profiles

The public HTTP contracts remain the integration boundary regardless of how Lumetry is deployed.