This PR changes to handle null currentSpan() in the LazyTracingSpanContextSupplier consistently.

Comment From: wilkinsona

Thanks, @izeye. I wondered about this when I reviewed the contribution. Neither getTraceId() nor getSpanId() should be called when isSampled() returns false. If that assumption is faulty, I thought it better to fail than to return null when getTraceId() or getSpanId() is called unexpectedly.

Comment From: izeye

@wilkinsona Ah, sorry. I didn't know that isSampled() is checked before getTraceId() or getSpanId(). Feel free to close this if that's the case.

Comment From: wilkinsona

Well, I am doubting my reasoning a little now. Reading the javadoc of isSamples() again, I'm not sure that I understand it well enough to be certain that it's an error to call getTraceId() or getSpanId() when isSampled() has returned null. The one place where it's called at the moment (io.prometheus.client.exemplars.DefaultExemplarSampler) won't call the supplier if isSampled() returns false, but we can't stop other code from calling it…

WDYT, @jonatan-ivanov?

Comment From: mshima

I am getting the following stack trace:

java.lang.NullPointerException: Cannot invoke "io.micrometer.tracing.Span.context()" because the return value of "org.springframework.boot.actuate.autoconfigure.tracing.exemplars.ExemplarsAutoConfiguration$LazyTracingSpanContextSupplier.currentSpan()" is null
    at org.springframework.boot.actuate.autoconfigure.tracing.exemplars.ExemplarsAutoConfiguration$LazyTracingSpanContextSupplier.getSpanId(ExemplarsAutoConfiguration.java:75)
    at io.prometheus.client.exemplars.DefaultExemplarSampler.doSample(DefaultExemplarSampler.java:46)
    at io.prometheus.client.exemplars.DefaultExemplarSampler.sample(DefaultExemplarSampler.java:34)
    at io.micrometer.prometheus.PrometheusCounter.updateExemplar(PrometheusCounter.java:79)
    at io.micrometer.prometheus.PrometheusCounter.increment(PrometheusCounter.java:58)
    at io.micrometer.core.instrument.binder.jvm.JvmGcMetrics$GcMetricsNotificationListener.countPoolSizeDelta(JvmGcMetrics.java:223)
    at io.micrometer.core.instrument.binder.jvm.JvmGcMetrics$GcMetricsNotificationListener.handleNotification(JvmGcMetrics.java:191)
    at java.management/sun.management.NotificationEmitterSupport.sendNotification(NotificationEmitterSupport.java:155)
    at jdk.management/com.sun.management.internal.GarbageCollectorExtImpl.createGCNotification(GarbageCollectorExtImpl.java:115)
Exception in thread "restartedMain" java.lang.reflect.InvocationTargetException
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:568)
    at org.springframework.boot.devtools.restart.RestartLauncher.run(RestartLauncher.java:49)
Caused by: java.lang.NullPointerException: Cannot invoke "io.micrometer.tracing.Span.context()" because the return value of "org.springframework.boot.actuate.autoconfigure.tracing.exemplars.ExemplarsAutoConfiguration$LazyTracingSpanContextSupplier.currentSpan()" is null
    at org.springframework.boot.actuate.autoconfigure.tracing.exemplars.ExemplarsAutoConfiguration$LazyTracingSpanContextSupplier.getSpanId(ExemplarsAutoConfiguration.java:75)
    at io.prometheus.client.exemplars.DefaultExemplarSampler.doSample(DefaultExemplarSampler.java:46)
    at io.prometheus.client.exemplars.DefaultExemplarSampler.sample(DefaultExemplarSampler.java:34)
    at io.micrometer.prometheus.PrometheusCounter.updateExemplar(PrometheusCounter.java:79)
    at io.micrometer.prometheus.PrometheusCounter.increment(PrometheusCounter.java:58)
    at io.micrometer.core.instrument.Counter.increment(Counter.java:38)
    at io.micrometer.core.instrument.binder.logging.MetricsTurboFilter.decide(LogbackMetrics.java:196)
    at ch.qos.logback.classic.spi.TurboFilterList.getTurboFilterChainDecision(TurboFilterList.java:49)
    at ch.qos.logback.classic.LoggerContext.getTurboFilterChainDecision_0_3OrMore(LoggerContext.java:251)
    at ch.qos.logback.classic.Logger.filterAndLog_0_Or3Plus(Logger.java:375)
    at ch.qos.logback.classic.Logger.log(Logger.java:780)
    at org.apache.commons.logging.LogAdapter$Slf4jLocationAwareLog.warn(LogAdapter.java:444)
    at org.springframework.boot.SpringApplication.handleRunFailure(SpringApplication.java:792)
    at org.springframework.boot.SpringApplication.run(SpringApplication.java:321)

I had to manually disable prometheus:

management:
  prometheus:
    metrics:
      export:
        enabled: false

Comment From: wilkinsona

Thanks for trying things out @mshima. It looks like you're using an old version of simpleclient. As per Boot's dependency management, you should be using 0.16.0.

Comment From: mshima

Thanks @wilkinsona, that resolves the problem.

Comment From: jonatan-ivanov

@wilkinsona @izeye That should be the contract so the current behavior (no null-check) of the SpanContextSupplier is intentional.

If the span is not sampled, there is no reason to include it in the Exemplar because won't be in the span store. Also, isSampled is a newer addition; before that, we needed to do the same checks in the getters (hence the NPE above).

I think it is fine doing the checks since: 1. We can't stop anyone calling it 2. Older versions of the client don't have isSampled so they call the getters

I can also file a PR in Prometheus Client and call this out in the javadocs so that it will be explicit.

Comment From: wilkinsona

Alright. Thanks, @jonatan-ivanov. Let's add the checks just in case.

Comment From: wilkinsona

Thanks very much, @izeye.