This PR changes to handle null currentSpan() in the LazyTracingSpanContextSupplier consistently.
Comment From: wilkinsona
Thanks, @izeye. I wondered about this when I reviewed the contribution. Neither getTraceId() nor getSpanId() should be called when isSampled() returns false. If that assumption is faulty, I thought it better to fail than to return null when getTraceId() or getSpanId() is called unexpectedly.
Comment From: izeye
@wilkinsona Ah, sorry. I didn't know that isSampled() is checked before getTraceId() or getSpanId(). Feel free to close this if that's the case.
Comment From: wilkinsona
Well, I am doubting my reasoning a little now. Reading the javadoc of isSamples() again, I'm not sure that I understand it well enough to be certain that it's an error to call getTraceId() or getSpanId() when isSampled() has returned null. The one place where it's called at the moment (io.prometheus.client.exemplars.DefaultExemplarSampler) won't call the supplier if isSampled() returns false, but we can't stop other code from calling it…
WDYT, @jonatan-ivanov?
Comment From: mshima
I am getting the following stack trace:
java.lang.NullPointerException: Cannot invoke "io.micrometer.tracing.Span.context()" because the return value of "org.springframework.boot.actuate.autoconfigure.tracing.exemplars.ExemplarsAutoConfiguration$LazyTracingSpanContextSupplier.currentSpan()" is null
at org.springframework.boot.actuate.autoconfigure.tracing.exemplars.ExemplarsAutoConfiguration$LazyTracingSpanContextSupplier.getSpanId(ExemplarsAutoConfiguration.java:75)
at io.prometheus.client.exemplars.DefaultExemplarSampler.doSample(DefaultExemplarSampler.java:46)
at io.prometheus.client.exemplars.DefaultExemplarSampler.sample(DefaultExemplarSampler.java:34)
at io.micrometer.prometheus.PrometheusCounter.updateExemplar(PrometheusCounter.java:79)
at io.micrometer.prometheus.PrometheusCounter.increment(PrometheusCounter.java:58)
at io.micrometer.core.instrument.binder.jvm.JvmGcMetrics$GcMetricsNotificationListener.countPoolSizeDelta(JvmGcMetrics.java:223)
at io.micrometer.core.instrument.binder.jvm.JvmGcMetrics$GcMetricsNotificationListener.handleNotification(JvmGcMetrics.java:191)
at java.management/sun.management.NotificationEmitterSupport.sendNotification(NotificationEmitterSupport.java:155)
at jdk.management/com.sun.management.internal.GarbageCollectorExtImpl.createGCNotification(GarbageCollectorExtImpl.java:115)
Exception in thread "restartedMain" java.lang.reflect.InvocationTargetException
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:568)
at org.springframework.boot.devtools.restart.RestartLauncher.run(RestartLauncher.java:49)
Caused by: java.lang.NullPointerException: Cannot invoke "io.micrometer.tracing.Span.context()" because the return value of "org.springframework.boot.actuate.autoconfigure.tracing.exemplars.ExemplarsAutoConfiguration$LazyTracingSpanContextSupplier.currentSpan()" is null
at org.springframework.boot.actuate.autoconfigure.tracing.exemplars.ExemplarsAutoConfiguration$LazyTracingSpanContextSupplier.getSpanId(ExemplarsAutoConfiguration.java:75)
at io.prometheus.client.exemplars.DefaultExemplarSampler.doSample(DefaultExemplarSampler.java:46)
at io.prometheus.client.exemplars.DefaultExemplarSampler.sample(DefaultExemplarSampler.java:34)
at io.micrometer.prometheus.PrometheusCounter.updateExemplar(PrometheusCounter.java:79)
at io.micrometer.prometheus.PrometheusCounter.increment(PrometheusCounter.java:58)
at io.micrometer.core.instrument.Counter.increment(Counter.java:38)
at io.micrometer.core.instrument.binder.logging.MetricsTurboFilter.decide(LogbackMetrics.java:196)
at ch.qos.logback.classic.spi.TurboFilterList.getTurboFilterChainDecision(TurboFilterList.java:49)
at ch.qos.logback.classic.LoggerContext.getTurboFilterChainDecision_0_3OrMore(LoggerContext.java:251)
at ch.qos.logback.classic.Logger.filterAndLog_0_Or3Plus(Logger.java:375)
at ch.qos.logback.classic.Logger.log(Logger.java:780)
at org.apache.commons.logging.LogAdapter$Slf4jLocationAwareLog.warn(LogAdapter.java:444)
at org.springframework.boot.SpringApplication.handleRunFailure(SpringApplication.java:792)
at org.springframework.boot.SpringApplication.run(SpringApplication.java:321)
I had to manually disable prometheus:
management:
prometheus:
metrics:
export:
enabled: false
Comment From: wilkinsona
Thanks for trying things out @mshima. It looks like you're using an old version of simpleclient. As per Boot's dependency management, you should be using 0.16.0.
Comment From: mshima
Thanks @wilkinsona, that resolves the problem.
Comment From: jonatan-ivanov
@wilkinsona @izeye That should be the contract so the current behavior (no null-check) of the SpanContextSupplier is intentional.
If the span is not sampled, there is no reason to include it in the Exemplar because won't be in the span store. Also, isSampled is a newer addition; before that, we needed to do the same checks in the getters (hence the NPE above).
I think it is fine doing the checks since:
1. We can't stop anyone calling it
2. Older versions of the client don't have isSampled so they call the getters
I can also file a PR in Prometheus Client and call this out in the javadocs so that it will be explicit.
Comment From: wilkinsona
Alright. Thanks, @jonatan-ivanov. Let's add the checks just in case.
Comment From: wilkinsona
Thanks very much, @izeye.