Self-Observability for On-Prem Deployments

Lumetry exposes operational telemetry so on-prem operators can monitor the Lumetry deployment with tools they already own. It does not require a Lumetry-hosted telemetry backend, and by default it does not send self-observability data to one.

Current coverage

The API services and background worker share the same self-observability behavior:

OpenTelemetry traces, metrics, and logs can be exported over OTLP;
Prometheus metrics can be scraped from each service;
application logs can be emitted as one ECS-compatible JSON document per stdout line;
separate liveness, readiness, and startup endpoints support process supervision.

OTLP export

OTLP export is disabled by default. When enabled, each signal can be controlled independently, so operators can export traces, metrics, logs, or any combination of the three.

The destination is an operator-configured OTLP endpoint using gRPC or HTTP/protobuf. Optional headers can carry authentication material such as a bearer token or API key. Header values should be supplied through the deployment's secret-management mechanism and treated as credentials.
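As an illustration of the settings described above, the following Python sketch models an operator-supplied OTLP configuration and checks it before use. The class name `OtlpSettings`, its field names, and the rule that the endpoint is written as an http(s) URL are assumptions made for this example; they are not Lumetry's actual configuration keys. The sketch also shows one way to log the configuration without leaking header values, since those are credentials.

```python
from dataclasses import dataclass, field
from urllib.parse import urlparse

SIGNALS = ("traces", "metrics", "logs")
PROTOCOLS = ("grpc", "http/protobuf")


@dataclass
class OtlpSettings:
    """Hypothetical container for operator-supplied OTLP export settings."""
    endpoint: str
    protocol: str = "grpc"
    signals: frozenset = frozenset()  # empty set models "export disabled"
    headers: dict = field(default_factory=dict)  # credentials; never log values

    def validate(self) -> None:
        """Raise ValueError if the settings are not internally consistent."""
        if self.protocol not in PROTOCOLS:
            raise ValueError(f"unsupported protocol: {self.protocol}")
        unknown = set(self.signals) - set(SIGNALS)
        if unknown:
            raise ValueError(f"unknown signals: {sorted(unknown)}")
        parsed = urlparse(self.endpoint)
        # Assumption for this sketch: endpoints are given as http(s) URLs.
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            raise ValueError("endpoint must be an http(s) URL")

    def describe(self) -> dict:
        """Return a log-safe summary: header names kept, values masked."""
        return {
            "endpoint": self.endpoint,
            "protocol": self.protocol,
            "signals": sorted(self.signals),
            "headers": {name: "***" for name in self.headers},
        }
```

Keeping the enabled signals as a set mirrors the independent per-signal control: exporting only traces and logs is simply `frozenset({"traces", "logs"})`.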

An HTTPS OTLP endpoint provides server-authenticated TLS using the deployment host's trust configuration. Lumetry does not currently provide a first-class OTLP client-certificate setting for mutual TLS. Deployments that require mTLS should enforce it through a customer-managed local collector or proxy and keep the Lumetry-to-collector connection inside the trusted deployment boundary.

Prometheus endpoints

Each covered API service exposes a Prometheus scrape endpoint at /metrics by default. The endpoint can be disabled or assigned a different path. It includes request rates and durations, response status information, outbound request and runtime measurements, and Lumetry operational counters.

The scrape endpoint is intended for the internal management network. It should not be published through an internet-facing gateway. Because each service exposes its own endpoint, the monitoring system must scrape each service instance that should be observed.

In addition to the measurements listed above, the background worker's scrape endpoint includes processing-cycle, durable-queue, retry, failure, delivery, and ingestion-latency measurements.
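Because every instance has its own scrape endpoint, a quick manual check has to visit each one. The sketch below reads the Prometheus text exposition format from a list of instance URLs and keeps per-instance results. It is an ad hoc helper rather than a replacement for a Prometheus server, and the function names, instance URLs, and any metric names used with it are hypothetical.

```python
import urllib.request


def parse_samples(text: str) -> dict:
    """Parse Prometheus text exposition into {series: value}.

    Comment lines (# HELP, # TYPE) are skipped; a trailing timestamp is ignored.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        if "}" in line:
            # Labels may contain spaces, so split after the closing brace.
            end = line.rfind("}") + 1
            series, rest = line[:end], line[end:].split()
        else:
            parts = line.split()
            series, rest = parts[0], parts[1:]
        samples[series] = float(rest[0])
    return samples


def scrape(base_url: str, path: str = "/metrics", timeout: float = 5.0) -> dict:
    """Fetch and parse one instance's scrape endpoint."""
    url = base_url.rstrip("/") + path
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return parse_samples(resp.read().decode("utf-8"))


def scrape_all(instances, fetch=scrape) -> dict:
    """Scrape every instance; a failing instance maps to its exception."""
    results = {}
    for base_url in instances:
        try:
            results[base_url] = fetch(base_url)
        except OSError as exc:  # includes urllib.error.URLError and timeouts
            results[base_url] = exc
    return results
```

Recording the exception instead of aborting keeps one unreachable instance from hiding the state of the others.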

ECS JSON logs

In JSON mode, the covered API services write ECS-compatible events to stdout. A container or host log agent can forward those events to a customer-owned log platform.

Request events include the method, URL path, response status, duration, client address, user agent, body sizes when known, and user or tenant identifiers when available. Unhandled failures produce a separate error event with exception details. Trace and span identifiers are included when a trace is active, allowing logs and traces to be correlated.
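The correlation mentioned above can be sketched as follows: read one JSON document per line and group events by trace identifier. The field names `trace.id`, `http.response.status_code`, and `error.type` come from the Elastic Common Schema; whether Lumetry emits them as nested objects or as dotted keys is not stated here, so the helper accepts both. The function names are hypothetical.

```python
import json


def get_field(event: dict, dotted: str):
    """Read an ECS field from either a dotted key or nested objects."""
    if dotted in event:
        return event[dotted]
    node = event
    for part in dotted.split("."):
        if not isinstance(node, dict) or part not in node:
            return None
        node = node[part]
    return node


def group_by_trace(lines) -> dict:
    """Group parsed log events by trace.id, skipping lines without one."""
    groups = {}
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue
        try:
            event = json.loads(raw)
        except json.JSONDecodeError:
            continue  # not a JSON log line; ignore it
        if not isinstance(event, dict):
            continue
        trace_id = get_field(event, "trace.id")
        if trace_id:
            groups.setdefault(trace_id, []).append(event)
    return groups
```

With events grouped this way, a request event and the separate error event for the same unhandled failure end up under the same key, which is also the identifier to look up in the trace backend.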

Lumetry does not connect directly to a specific log-storage product. Log collection, transport security, indexing, retention, access control, and deletion remain under the operator's control.

Health endpoints

Services expose:

GET /livez for process liveness;
GET /startupz for host initialization;
GET /readyz for readiness;
GET /healthz as a liveness-compatible alias.

Worker readiness checks the core storage and messaging dependencies required for background processing. Liveness remains dependency-independent so a temporary downstream outage does not trigger an unnecessary process restart.
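A supervisor can treat the three endpoints differently, as the sketch below illustrates. It assumes that a 2xx response means the check passed and anything else means it failed; the status codes are not specified here, so verify that against the deployment. The action names returned by `recommended_action` are hypothetical labels, not Lumetry terminology.

```python
import urllib.error
import urllib.request

ENDPOINTS = {"live": "/livez", "started": "/startupz", "ready": "/readyz"}


def http_status(url: str, timeout: float = 2.0):
    """Return the HTTP status code, or None if the service is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        return exc.code  # non-2xx responses still carry a status code
    except OSError:
        return None


def probe(base_url: str, get_status=http_status) -> dict:
    """Check each health endpoint; assumes 2xx means the check passed."""
    base = base_url.rstrip("/")
    result = {}
    for name, path in ENDPOINTS.items():
        status = get_status(base + path)
        result[name] = status is not None and 200 <= status < 300
    return result


def recommended_action(result: dict) -> str:
    """Map probe results to a supervisor action (labels are illustrative)."""
    if not result["live"]:
        return "restart"        # only a liveness failure justifies a restart
    if not result["started"]:
        return "wait"           # initialization still in progress
    if not result["ready"]:
        return "stop-routing"   # dependency outage: keep the process running
    return "serve"
```

The ordering reflects the dependency-independent liveness check: a failed readiness probe alone never leads to a restart.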

System Health page

The tenant console includes a System Health page for a concise operational check. It shows API and worker status plus the depth and oldest-item age of the main durable processing queues. This page is intended for rapid diagnosis; detailed telemetry, retention, alerting, and cross-service analysis remain in the operator's observability backend.

Customer ownership and data residency

Self-observability data remains in the customer-controlled environment unless the operator configures an export destination outside it:

stdout logs remain on the host or container platform until a customer-managed shipper forwards them;
Prometheus data leaves the service only when a customer-managed scraper reads it;
OTLP data is sent only to the endpoint configured by the operator.

The location, retention, replication, and access policy of the selected telemetry backend determine the resulting data-residency posture. Lumetry does not silently redirect these signals to a vendor-operated service.

Redaction and sensitive data

The standardized request event does not include request or response bodies, authorization headers, or OTLP authentication headers. It records body sizes rather than body content.

Operational logs and exception stack traces can still contain identifiers, URLs, provider error messages, or values written by application and integration code. The bundled collector profile filters common credential and sensitive attribute forms; operators should extend those rules for organization-specific data, restrict access to telemetry, and review diagnostic material before sharing it outside the organization.
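Extending redaction for organization-specific data might look like the sketch below. It is not the bundled collector profile and does not reproduce its rules; the key pattern, the bearer-token pattern, and the `EMP-` identifier format in the usage example are all assumptions chosen for illustration.

```python
import re

MASK = "[REDACTED]"

# Illustrative patterns only; not the rules shipped in the collector profile.
SENSITIVE_KEY = re.compile(
    r"(authorization|password|secret|token|api[-_]?key|cookie)", re.IGNORECASE
)
BEARER = re.compile(r"(?i)\bBearer\s+[A-Za-z0-9._~+/=-]+")


def redact(attributes: dict, extra_patterns=()) -> dict:
    """Return a copy of flat telemetry attributes with sensitive data masked.

    Values under sensitive-looking keys are replaced entirely. Other string
    values have bearer tokens and any operator-supplied patterns masked.
    """
    cleaned = {}
    for key, value in attributes.items():
        if SENSITIVE_KEY.search(key):
            cleaned[key] = MASK
            continue
        if isinstance(value, str):
            value = BEARER.sub("Bearer " + MASK, value)
            for pattern in extra_patterns:
                value = pattern.sub(MASK, value)
        cleaned[key] = value
    return cleaned
```

An operator whose employee identifiers look like `EMP-004217` could pass `extra_patterns=[re.compile(r"\bEMP-\d{6}\b")]` to mask them wherever they appear in provider error messages.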

Diagnostics and support bundles

Lumetry does not automatically generate or upload diagnostics. An operator-run support bundle command can collect a bounded snapshot of service state, health responses, metrics, and recent logs. It applies common secret and token redaction and excludes database contents.

The archive remains local until the operator chooses to share it. Operators must inspect it and remove organization-specific identifiers or sensitive details before transfer.
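The inspection step can be partly automated. The sketch below scans text files in an archive for organization-specific patterns and reports where they occur. The archive format is not specified here, so the sketch assumes a tar archive (optionally compressed); `scan_bundle` and the patterns passed to it are hypothetical and supplied by the operator.

```python
import re
import tarfile


def scan_bundle(fileobj, patterns) -> list:
    """List (member name, line number, pattern) for every match in the archive.

    Assumes a tar archive opened from a seekable binary file object.
    Members are decoded as UTF-8 with undecodable bytes replaced.
    """
    findings = []
    with tarfile.open(fileobj=fileobj, mode="r:*") as archive:
        for member in archive:
            if not member.isfile():
                continue  # skip directories, links, and special entries
            text = archive.extractfile(member).read().decode(
                "utf-8", errors="replace"
            )
            for lineno, line in enumerate(text.splitlines(), start=1):
                for pattern in patterns:
                    if pattern.search(line):
                        findings.append((member.name, lineno, pattern.pattern))
    return findings
```

An empty result only means that the supplied patterns did not match; it does not replace a manual review before the archive leaves the organization.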
