OpenTelemetry signals from first principles
Posted by KodrAus@reddit | programming | 6 comments
There's a lot of high-noise, low-value content around OpenTelemetry out there, so I've tried to put together the simplest description I could by incrementally building up from needs that arise in your systems. I hope it might help cut through some of the less obvious concepts like context propagation and exponential histograms.
The format is very loosely pinched from "The Little .." series :)
Lucidendinq@reddit
I like the way this is written. Well done.
jpfed@reddit
Before the days of OTel, I was trying on my own to figure out how to make my code observable. My own solutions happened to combine what OTel calls “traces” and “logs”. I thought at the time that anything that can emit an event for logging also occurs in a context that can be characterized by spans. However, it seems as though the rest of the world has a greater mental separation between logs and traces.
modernkennnern@reddit
I've never understood the distinction between OTel logs and traces; why would you ever use logs?
phillipcarter2@reddit
Two main reasons:
OTel logs are a compatibility play. Take your existing app logs from a major framework, OTel injects the current trace and span IDs into those logs, and now all your app logs are trace-correlated.
Some observability backends meaningfully distinguish between app log storage and trace storage. Usually they want traces to be “skinny”, serving only to stitch together calls rather than also containing all the debugging info.
In the case of a newer codebase and a more modern observability backend, just use traces.
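To make the trace-correlation point concrete, here is a minimal sketch using only Python's standard `logging` module. It is not the OTel SDK: the `current_span` context variable, `TraceContextFilter`, and `build_logger` are hypothetical names standing in for what an OTel logging integration would do for you, and the IDs are made-up example values.

```python
import contextvars
import io
import logging

# Hypothetical stand-in for the active span context; a real OTel SDK tracks this itself.
current_span = contextvars.ContextVar("current_span", default=None)


class TraceContextFilter(logging.Filter):
    """Copy the active trace and span IDs onto every log record."""

    def filter(self, record):
        span = current_span.get()
        record.trace_id = span["trace_id"] if span else "-"
        record.span_id = span["span_id"] if span else "-"
        return True  # never drop the record, only enrich it


def build_logger(stream):
    """Build an app logger whose output lines carry trace_id and span_id."""
    logger = logging.getLogger("app")
    logger.handlers.clear()
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(
        logging.Formatter(
            "%(levelname)s trace_id=%(trace_id)s span_id=%(span_id)s %(message)s"
        )
    )
    handler.addFilter(TraceContextFilter())
    logger.addHandler(handler)
    return logger


out = io.StringIO()
log = build_logger(out)

log.info("outside any span")

# Pretend a span is active while the request is handled (example IDs only).
token = current_span.set(
    {"trace_id": "4bf92f3577b34da6a3ce929d0e0e4736", "span_id": "00f067aa0ba902b7"}
)
log.info("handling request")
current_span.reset(token)

lines = out.getvalue().splitlines()
```

The existing `log.info(...)` calls don't change at all; only the handler setup does, which is roughly why this works as a compatibility story for existing codebases.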
KodrAus@reddit (OP)
I would say because they’re a different thing. Log events are independent point-in-time observations, which makes them cheap to work with, and they can be emitted independently of trace sampling or span completion. Logs are essentially span events that aren't bound up in the span data model.
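A toy model of that difference, as a sketch only: the `Span` class and `emit_log` function below are hypothetical and much simpler than any real SDK (which may, for example, discard events on unsampled spans as soon as they are added). The point is just that a span event rides along with its span, while a log event goes out on its own.

```python
class Span:
    """Toy span: events are buffered and only exported when the span ends."""

    def __init__(self, name, sampled, exporter):
        self.name = name
        self.sampled = sampled
        self.events = []
        self.exporter = exporter

    def add_event(self, message):
        # Span events live inside the span until it completes.
        self.events.append(message)

    def end(self):
        # An unsampled span is dropped, and its events go with it.
        if self.sampled:
            self.exporter.append(("span", self.name, list(self.events)))


def emit_log(exporter, message):
    """Toy log event: exported immediately, with no span lifecycle involved."""
    exporter.append(("log", message))


exported = []
span = Span("checkout", sampled=False, exporter=exported)

span.add_event("cart loaded")      # held inside the span
emit_log(exported, "cart loaded")  # visible right away

before_end = list(exported)  # snapshot taken before the span completes
span.end()                   # unsampled, so nothing more is exported
```

In this model the log is available before the span ends and survives the sampling decision, while the span event is lost along with the unsampled span.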
KodrAus@reddit (OP)
Same. At the time I was already logging request timings and correlation IDs; what OpenTracing added on top of that was the parent/child hierarchy and the propagation across services.
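For anyone who hasn't seen those two pieces spelled out, here is a rough sketch of a parent/child hierarchy plus cross-service propagation. It borrows the W3C Trace Context `traceparent` header layout (`version-traceid-parentid-flags`) that OTel uses today rather than anything OpenTracing-specific, and `new_span`, `inject`, and `extract` are hypothetical helper names, not a real library API.

```python
import secrets


def new_span(parent=None):
    """Create a span; children share the parent's trace ID and record its span ID."""
    return {
        "trace_id": parent["trace_id"] if parent else secrets.token_hex(16),
        "span_id": secrets.token_hex(8),
        "parent_id": parent["span_id"] if parent else None,
    }


def inject(span, headers):
    """Write the span's identity into outgoing request headers."""
    # 00 = version, 01 = sampled flag, per the W3C traceparent layout.
    headers["traceparent"] = f"00-{span['trace_id']}-{span['span_id']}-01"


def extract(headers):
    """Rebuild the caller's span identity from incoming request headers."""
    _version, trace_id, span_id, _flags = headers["traceparent"].split("-")
    return {"trace_id": trace_id, "span_id": span_id, "parent_id": None}


# Service A starts a trace and calls service B.
root = new_span()
headers = {}
inject(root, headers)

# Service B receives the request and continues the same trace.
remote_parent = extract(headers)
child = new_span(remote_parent)
```

A plain correlation ID only gets you the shared `trace_id`; the `parent_id` link is what lets a backend reassemble the calls into a tree.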