Skip to content

LCORE-2655: OpenTelemetry feature proposal#1981

Draft
asimurka wants to merge 1 commit into
lightspeed-core:mainfrom
asimurka:opentelemetry_spike
Draft

LCORE-2655: OpenTelemetry feature proposal#1981
asimurka wants to merge 1 commit into
lightspeed-core:mainfrom
asimurka:opentelemetry_spike

Conversation

@asimurka

@asimurka asimurka commented Jun 23, 2026

Copy link
Copy Markdown
Contributor

Description

Refined OpenTelemetry feature to reflect recent request updates. Split the OpenTelemetry tracing design into a spike doc and a feature design doc (chosen approach: OTEL_* + opentelemetry-instrument, facade spans, OTLP export), replacing the single monolithic proposal file.

Type of change

  • Refactor
  • New feature
  • Bug fix
  • CVE fix
  • Optimization
  • Documentation Update
  • Configuration Update
  • Bump-up service version
  • Bump-up dependent library
  • Bump-up library or tool used for development (does not change the final image)
  • CI configuration change
  • Konflux configuration change
  • Unit tests improvement
  • Integration tests improvement
  • End to end tests improvement
  • Benchmarks improvement

Tools used to create PR

Identify any AI code assistants used in this PR (for transparency and review context)

  • Assisted-by: Cursor

Related Tickets & Documents

Checklist before requesting a review

  • I have performed a self-review of my code.
  • PR has passed all pre-merge test jobs.
  • If it is a core feature, I have added thorough tests.

Testing

  • Please provide detailed steps to perform tests related to this code change.
  • How were the fix/results from this change verified? Please provide relevant screenshots or results.

Summary by CodeRabbit

  • Documentation
    • Updated OpenTelemetry tracing design documentation with comprehensive guidance on implementation, configuration via environment variables, trace propagation behavior, span coverage, and operational deployment expectations.

@coderabbitai

coderabbitai Bot commented Jun 23, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

Walkthrough

Removes the existing observability-opentelemetry.md design document and replaces it with two new files: a spike document recording six architectural decisions (configuration, SDK init, inbound/outbound context, export topology, span filtering), and a formal design document covering requirements, span boundaries, /config enrichment, environment variables, and implementation guidance.

Changes

OpenTelemetry Design Documentation

Layer / File(s) Summary
Spike: terminology, background, and architectural decisions
docs/design/observability-opentelemetry/observability-opentelemetry-spike.md
New spike document for LCORE-1591 defining OpenTelemetry terminology, documenting current metrics-only LCORE observability, and recording six decisions: OTEL_* env-only configuration, opentelemetry-instrument SDK init with OTEL_SDK_DISABLED=true kill switch, inbound W3C traceparent acceptance via OTEL_PROPAGATORS, outbound backend facade spans without trace context injection, OTLP-only export with collector-delegated persistence, and collector-side span filtering.
Design doc: scope, requirements, use cases, and architecture
docs/design/observability-opentelemetry/observability-opentelemetry-design.md
Adds the formal design document introduction, end-to-end requirements (env-only config, resilience, multi-worker, data constraints), role-based use cases (SRE/platform/developer/admin), and architecture decisions specifying one coherent trace per inbound request, no outbound propagation to external backends, and SDK initialization via opentelemetry-instrument.
Design doc: span semantics, coverage, env vars, config API, and implementation guidance
docs/design/observability-opentelemetry/observability-opentelemetry-design.md
Specifies facade span semantics, span coverage candidates with naming/attribute/event conventions, export topology, failure/data-handling requirements, OTEL_* environment variable enumeration, /config response enrichment under observability.otel with secret redaction, migration/backwards-compatibility expectations, new OpenTelemetry dependencies, implementation insertion points, open questions, and appendices.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related PRs

  • lightspeed-core/lightspeed-stack#1482: Directly overlaps this PR — both touch the same observability-opentelemetry design area and the previous document being deleted here was likely introduced or modified in that PR.

Suggested reviewers

  • anik120
  • tisnik
🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly describes the main change: reorganizing OpenTelemetry feature documentation and refining the proposal, which directly matches the changeset content.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
✨ Simplify code
  • Create PR with simplified code

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In
`@docs/design/observability-opentelemetry/observability-opentelemetry-design.md`:
- Around line 362-364: The "Open Questions" section in the design document
currently contains only a single high-level question about OTEL_* variable
inclusion in the /config scrape, which lacks specificity and leaves several
implementation concerns unaddressed. Expand this section to include more
detailed questions such as: specific OTEL_* variable names to whitelist (e.g.,
OTEL_EXPORTER_OTLP_HEADERS), the mechanism for redacting or handling sensitive
values like bearer tokens and API keys, privacy and security considerations for
the scrape endpoint, and any operational or testing concerns that must be
resolved before implementation begins. These additions will provide clearer
direction for the implementation phase.
- Around line 50-51: Expand and clarify the "New dependencies" section to
reconcile it with the implementation section (line 356). Explicitly state
whether the listed packages are only direct dependencies or the complete set
required. If using the distro approach (opentelemetry-distro), document that
this package automatically includes the SDK and default propagators via
environment variables, making separate propagator packages unnecessary, and
explicitly note that opentelemetry-instrument (the CLI used in the Containerfile
ENTRYPOINT) is provided by the distro. For OTLP export, clarify that
opentelemetry-exporter-otlp is a meta-package and explicitly document which
protocol (gRPC or HTTP) is expected to be used. Update the dependencies list to
match the actual packages needed and add explanatory notes about transitive
dependencies and the distro approach to eliminate discrepancies with the
implementation section.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 2510269d-a4d9-44bf-b4d4-185811e5a7a9

📥 Commits

Reviewing files that changed from the base of the PR and between 2c52cc0 and 9c0221e.

📒 Files selected for processing (3)
  • docs/design/observability-opentelemetry/observability-opentelemetry-design.md
  • docs/design/observability-opentelemetry/observability-opentelemetry-spike.md
  • docs/design/observability-opentelemetry/observability-opentelemetry.md
💤 Files with no reviewable changes (1)
  • docs/design/observability-opentelemetry/observability-opentelemetry.md
📜 Review details
⏰ Context from checks skipped due to timeout. (14)
  • GitHub Check: unit_tests (3.12)
  • GitHub Check: Pylinter
  • GitHub Check: integration_tests (3.12)
  • GitHub Check: integration_tests (3.13)
  • GitHub Check: E2E Tests for Lightspeed Evaluation job
  • GitHub Check: E2E: library mode / ci / group 3
  • GitHub Check: E2E: library mode / ci / group 1
  • GitHub Check: E2E: server mode / ci / group 2
  • GitHub Check: E2E: server mode / ci / group 3
  • GitHub Check: E2E: server mode / ci / group 1
  • GitHub Check: E2E: library mode / ci / group 2
  • GitHub Check: build-pr
  • GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-on-pull-request
  • GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-0-6-on-pull-request
🧰 Additional context used
🪛 LanguageTool
docs/design/observability-opentelemetry/observability-opentelemetry-spike.md

[style] ~52-~52: To elevate your writing, try using a synonym here.
Context: ...ving option set; modeling it in YAML is hard to maintain. Incompatible with `opentel...

(HARD_TO)

🪛 markdownlint-cli2 (0.22.1)
docs/design/observability-opentelemetry/observability-opentelemetry-design.md

[warning] 123-123: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🔇 Additional comments (1)
docs/design/observability-opentelemetry/observability-opentelemetry-spike.md (1)

1-185: LGTM!

The spike document is well-structured and technically sound. The six architectural decisions are clearly presented with pros/cons, recommendations align with OpenTelemetry best practices, and the rationale is clear. Cross-document references and appendix links are appropriate.

@asimurka asimurka marked this pull request as draft June 26, 2026 06:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant