Wednesday, May 22, 2024

OpenTelemetry — what is it and why does it matter? [Q&A]


When OpenTelemetry was first released in 2019, there was a good deal of excitement about the prospect of a single standard set of telemetry data for the entire modern software stack.

OpenTelemetry set out to make robust, portable telemetry a built-in feature of cloud-native software, and give developers and platform engineers a common mental model for all the telemetry types.

We talked to Juraci Paixão Kröhling, governance committee member for the OpenTelemetry project and principal engineer at Grafana Labs, to learn more.

BN: What was the original purpose of OpenTelemetry, and how is it performing against that mission?

JPK: To quote our mission, vision, and values page, our mission is, “To enable effective observability by making high-quality, portable telemetry ubiquitous.” Our goal is to make telemetry easy, portable, and vendor-neutral by creating tools and standards that can be reused across the industry, such as a standard instrumentation API that framework developers can depend on, and a standard protocol (OTLP) that different vendors and solutions can implement to send and receive telemetry data.
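To make that API/exporter separation concrete, here is a minimal sketch in plain Python. It does not use the real OpenTelemetry packages, and every class and method name in it is an illustrative assumption; it only shows the design idea described above: application code is instrumented once against a stable API, while the vendor-specific part (the exporter, e.g. an OTLP sender) is chosen at setup time and can be swapped without touching the instrumentation.

```python
# Hypothetical sketch of a vendor-neutral telemetry pipeline.
# None of these names come from the real OpenTelemetry SDK.
from __future__ import annotations

from typing import Protocol


class SpanExporter(Protocol):
    """Anything that can receive finished spans (e.g. an OTLP sender)."""

    def export(self, span: dict) -> None: ...


class InMemoryExporter:
    """A debug backend: simply collects finished spans in memory."""

    def __init__(self) -> None:
        self.spans: list[dict] = []

    def export(self, span: dict) -> None:
        self.spans.append(span)


class _Span:
    """A span records a name and attributes, and is exported when it ends."""

    def __init__(self, name: str, exporter: SpanExporter) -> None:
        self._data: dict = {"name": name, "attributes": {}}
        self._exporter = exporter

    def set_attribute(self, key: str, value: str) -> None:
        self._data["attributes"][key] = value

    def __enter__(self) -> "_Span":
        return self

    def __exit__(self, *exc) -> None:
        # On exit, hand the finished span to whichever backend was configured.
        self._exporter.export(self._data)


class Tracer:
    """The 'API' half: application code only ever talks to this."""

    def __init__(self, exporter: SpanExporter) -> None:
        self._exporter = exporter

    def span(self, name: str) -> _Span:
        return _Span(name, self._exporter)


# Instrument once; switching vendors later means swapping only the
# exporter passed to the Tracer, never the instrumentation itself.
exporter = InMemoryExporter()
tracer = Tracer(exporter)

with tracer.span("handle-request") as span:
    span.set_attribute("http.method", "GET")

print(exporter.spans[0]["name"])  # the finished span reached the backend
```

The point of the sketch is the seam: `Tracer` and `_Span` play the role of the stable instrumentation API, `SpanExporter` the role of the wire format and vendor boundary that OTLP standardizes in practice.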

I think we are on track to accomplish this goal. From where I’m standing, OpenTelemetry isn’t the future of observability; it’s the present. Every relevant vendor out there already supports ingesting data in OTLP format. More and more frameworks are getting natively instrumented with the OpenTelemetry API, as it’s clear that this is what their users want from them. And finally, users are not only convinced, but already reaping the benefits of one of the key promises of OpenTelemetry: freedom from vendor lock-in on the instrumentation and collection side. I’m hearing more and more success stories about companies switching vendors without losing telemetry data, thanks to the adoption of OpenTelemetry.

But we are not at our end goal yet. We are reaching a state where we have pretty much all of the features we need to be successful, but they are not yet as stable as we want them to be. ‘Stability’ is going to be our motto for the next year: stability of the semantic conventions, of the APIs and SDKs, and of key components of the Collector.

BN: OpenTelemetry appears to be the second most active CNCF project, behind only Kubernetes. What sorts of open source contributions is it getting?

JPK: The majority of our regular, consistent contributions come from passionate people employed by vendors to work on the project, which indirectly reflects the needs of their customers, who are ultimately users of OpenTelemetry as well. That said, I’d welcome more direct contributions from end users, which is why we recently organized our first OpenTelemetry ContribFest during KubeCon+CloudNativeCon in Chicago, resulting in more than 10 pull requests from new contributors in just under 90 minutes. To me, that’s a huge success, and we should repeat it often.

But contributions aren’t only things we can merge into a git repository somewhere. Contributions are also about openly sharing experiences, and we are seeing a lot of traction here, from conference talks by end users describing how they use OpenTelemetry to people coming to our End User Working Group to tell us what works and what doesn’t.

All in all, there’s always something exciting going on in our project!

BN: How has OpenTelemetry adoption affected the observability discipline overall?

JPK: It’s impossible to talk about observability nowadays and not think about OpenTelemetry. Even when it’s only a small detail in the overall architecture, OpenTelemetry will be there. But typically, OpenTelemetry is already at the center of the observability strategy for most companies with a mature culture around the topic. It’s common to hear users say that they adopt OpenTelemetry to remain vendor-agnostic, so that they can select the actual solutions (databases, visualization, analysis) only later in the process.

This is not only about creating the standard, though; it’s also about making it easier for people to build innovative solutions without having to invest time thinking about the basic building blocks.

BN: What do you see as the future growth areas for OpenTelemetry?

JPK: The most obvious one to me at this moment is around profiling. It’s a new signal, and it’s totally different from everything we have so far. We’ll need innovation in different parts of the system to accommodate this new signal, especially around the Collector.

Given the current attention people have been giving to topics like LLMs and AI, I’d say that there’s a huge area to be explored here as well. What does ‘observable AI’ mean to OpenTelemetry? What can we do in terms of semantic conventions here? Do we need new instrumentation solutions? And how can we use LLMs to improve instrumentation for existing code bases? Can observability make AI safer?

One area where I hope we’ll see a lot of progress in the future is green computing, and, as such, what we can do as part of OpenTelemetry to play our part here. Would it make sense to establish new semantic conventions for proxy metrics for environmental sustainability, such as energy and carbon metrics? Should we be more aggressive in measuring our own resource consumption and optimizing our software?

Image credit: everythingposs/
