Spring AI Unclosed gRPC Channels in VertexAiTextEmbeddingModel: Channel Orphan Warnings

When using the Spring AI VertexAiTextEmbeddingModel class to create text embeddings, I see repeated warnings like the following in my logs:

Previous channel ManagedChannelImpl{...} was garbage collected without being shut down! Make sure to call shutdown()/shutdownNow()

These warnings indicate that a gRPC ManagedChannel created by the underlying Google Cloud client (PredictionServiceClient) is being garbage-collected without a proper call to close() or shutdown().

Inside VertexAiTextEmbeddingModel, the call(EmbeddingRequest request) method creates a new PredictionServiceClient instance on each call, but it never closes it. As a result, every ephemeral client spawns a gRPC channel that never gets shut down. Eventually, the channel is garbage-collected, triggering the “orphan channel” warnings in logs.

Other classes in Spring AI—such as VertexAiMultimodalEmbeddingModel—use a try-finally or a try-with-resources approach to ensure that each ephemeral PredictionServiceClient is closed after use, so they do not exhibit the same problem.
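
To make the difference concrete, here is a minimal, library-free sketch of the two patterns. `FakeClient` and `OPEN_CLIENTS` are hypothetical stand-ins invented for illustration. They are not part of Spring AI or the Google Cloud client, and the only property they share with `PredictionServiceClient` is being `AutoCloseable`.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class ClientLifecycleSketch {

    // Number of stand-in clients that have been opened but not yet closed.
    static final AtomicInteger OPEN_CLIENTS = new AtomicInteger();

    // Hypothetical stand-in for PredictionServiceClient (assumption: only
    // the AutoCloseable contract matters for this illustration).
    static final class FakeClient implements AutoCloseable {
        FakeClient() {
            OPEN_CLIENTS.incrementAndGet();
        }

        String predict(String input) {
            return "embedding-for:" + input;
        }

        @Override
        public void close() {
            OPEN_CLIENTS.decrementAndGet();
        }
    }

    // Pattern described in the issue: a client per call, never closed.
    static String callLeaky(String input) {
        FakeClient client = new FakeClient();
        return client.predict(input);
    }

    // Pattern used elsewhere: try-with-resources closes the client on exit.
    static String callClosing(String input) {
        try (FakeClient client = new FakeClient()) {
            return client.predict(input);
        }
    }

    public static void main(String[] args) {
        for (int i = 0; i < 3; i++) {
            callLeaky("a");
        }
        System.out.println("open after leaky calls: " + OPEN_CLIENTS.get());   // 3

        OPEN_CLIENTS.set(0);
        for (int i = 0; i < 3; i++) {
            callClosing("a");
        }
        System.out.println("open after closing calls: " + OPEN_CLIENTS.get()); // 0
    }
}
```

With the real client, each unclosed instance would correspond to a gRPC channel left for the garbage collector, which is what produces the warning.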

### This causes:

- Logs are flooded with warnings about unclosed channels.
- Possible resource leaks, as each PredictionServiceClient holds onto gRPC channels.
- Performance overhead and potential memory usage issues from many channels staying alive longer than needed.

### Steps to Reproduce

1. Configure VertexAiTextEmbeddingModel as a Spring bean (for example, in a @Configuration class).
2. Inject the bean and call textEmbeddingModel.embed(...) or textEmbeddingModel.call(...) repeatedly across multiple requests.
3. Monitor the application logs. Over time, you will see repeated warnings about an orphaned ManagedChannel: “Previous channel ... was garbage collected without being shut down!”

Technologies:

- Spring AI version: 1.0.0-M5
- Google Cloud AI libraries version: google-cloud-aiplatform: 3.40.0, gax-grpc: 2.46.1
- Java version: 21
- Running on local environment

Comment From: RyanHowell30

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

import org.springframework.ai.embedding.EmbeddingRequest;
import org.springframework.ai.embedding.EmbeddingResponse;
import org.springframework.ai.model.ModelOptionsUtils;
import org.springframework.ai.vertexai.embedding.VertexAiEmbeddingConnectionDetails;
import org.springframework.ai.vertexai.embedding.VertexAiEmbeddingUtils;
import org.springframework.ai.vertexai.embedding.text.VertexAiTextEmbeddingModel;
import org.springframework.ai.vertexai.embedding.text.VertexAiTextEmbeddingOptions;

import com.google.cloud.aiplatform.v1.EndpointName;
import com.google.cloud.aiplatform.v1.PredictRequest;
import com.google.cloud.aiplatform.v1.PredictResponse;
import com.google.cloud.aiplatform.v1.PredictionServiceClient;
import lombok.extern.slf4j.Slf4j;

// Class name matches the constructor declared below.
@Slf4j
public class ClientClosingTextEmbeddingModel extends VertexAiTextEmbeddingModel {

// Hold our own local copy of connectionDetails
private final VertexAiEmbeddingConnectionDetails localConnectionDetails;

public ClientClosingTextEmbeddingModel(
        VertexAiEmbeddingConnectionDetails connectionDetails,
        VertexAiTextEmbeddingOptions defaultOptions
) {
    // Call the parent constructor
    super(connectionDetails, defaultOptions);

    // Store a reference locally
    this.localConnectionDetails = connectionDetails;
}

@Override
public EmbeddingResponse call(EmbeddingRequest request) {
    // 1) Merge the options (like the parent does)
    VertexAiTextEmbeddingOptions finalOptions = mergedOptions(request);

    // 2) Actually open and close the PredictionServiceClient via try-with-resources
    try (PredictionServiceClient client = createClientSafely()) {
        // 3) Build the PredictRequest
        EndpointName endpointName = localConnectionDetails.getEndpointName(finalOptions.getModel());
        PredictRequest.Builder predictRequestBuilder =
                getPredictRequestBuilder(request, endpointName, finalOptions);

        // 4) Call predict()
        PredictResponse rawResponse = client.predict(predictRequestBuilder.build());

        // 5) Convert the rawResponse to EmbeddingResponse
        return buildEmbeddingResponse(rawResponse, finalOptions);
    } catch (IOException e) {
        // If createClientSafely() fails
        throw new RuntimeException("Failed to create or close the PredictionServiceClient", e);
    }
}

/**
 * Create the client from connection details in a safe way for try-with-resources.
 */
private PredictionServiceClient createClientSafely() throws IOException {
    // Use localConnectionDetails, not super (which doesn't provide a getter).
    return PredictionServiceClient.create(localConnectionDetails.getPredictionServiceSettings());
}

/**
 * This is basically the parent's "mergedOptions(request)" logic.
 */
private VertexAiTextEmbeddingOptions mergedOptions(EmbeddingRequest request) {
    VertexAiTextEmbeddingOptions defaultOptions = getDefaultOptions();
    VertexAiTextEmbeddingOptions defaultOptionsCopy = VertexAiTextEmbeddingOptions.builder()
            .from(defaultOptions)
            .build();

    // The parent's code uses ModelOptionsUtils.merge(...) if request.getOptions() is non-null
    return ModelOptionsUtils.merge(request.getOptions(), defaultOptionsCopy, VertexAiTextEmbeddingOptions.class);
}

/**
 * Replicates the parent's logic for building the PredictRequest.
 */
public PredictRequest.Builder getPredictRequestBuilder(
        EmbeddingRequest request,
        EndpointName endpointName,
        VertexAiTextEmbeddingOptions finalOptions
) {
    PredictRequest.Builder predictRequestBuilder =
            PredictRequest.newBuilder().setEndpoint(endpointName.toString());

    // The parent uses VertexAiEmbeddingUtils to build parameters, e.g. dimensions, autoTruncate, etc.
    VertexAiEmbeddingUtils.TextParametersBuilder parametersBuilder =
            VertexAiEmbeddingUtils.TextParametersBuilder.of();

    if (finalOptions.getAutoTruncate() != null) {
        parametersBuilder.autoTruncate(finalOptions.getAutoTruncate());
    }
    if (finalOptions.getDimensions() != null) {
        parametersBuilder.outputDimensionality(finalOptions.getDimensions());
    }
    predictRequestBuilder.setParameters(VertexAiEmbeddingUtils.valueOf(parametersBuilder.build()));

    // For each input text
    for (int i = 0; i < request.getInstructions().size(); i++) {
        String text = request.getInstructions().get(i);
        VertexAiEmbeddingUtils.TextInstanceBuilder instanceBuilder =
                VertexAiEmbeddingUtils.TextInstanceBuilder.of(text)
                        .taskType(finalOptions.getTaskType().name());

        if (finalOptions.getTitle() != null && !finalOptions.getTitle().isBlank()) {
            instanceBuilder.title(finalOptions.getTitle());
        }
        predictRequestBuilder.addInstances(VertexAiEmbeddingUtils.valueOf(instanceBuilder.build()));
    }

    return predictRequestBuilder;
}

/**
 * Convert PredictResponse => EmbeddingResponse (replicating parent's logic).
 */
private EmbeddingResponse buildEmbeddingResponse(PredictResponse rawResponse,
        VertexAiTextEmbeddingOptions finalOptions) {
    List<org.springframework.ai.embedding.Embedding> embeddingsList = new ArrayList<>();
    int index = 0;
    // Summed below but never read in this workaround; it is not attached to the response metadata.
    int totalTokenCount = 0;

    for (com.google.protobuf.Value predictionValue : rawResponse.getPredictionsList()) {
        com.google.protobuf.Value embeddingsStruct =
                predictionValue.getStructValue().getFieldsOrThrow("embeddings");
        com.google.protobuf.Value statistics =
                embeddingsStruct.getStructValue().getFieldsOrThrow("statistics");
        com.google.protobuf.Value tokenCountVal =
                statistics.getStructValue().getFieldsOrThrow("token_count");
        totalTokenCount += (int) tokenCountVal.getNumberValue();

        com.google.protobuf.Value valuesVal =
                embeddingsStruct.getStructValue().getFieldsOrThrow("values");
        float[] vector = VertexAiEmbeddingUtils.toVector(valuesVal);
        embeddingsList.add(new org.springframework.ai.embedding.Embedding(vector, index++));
    }

    org.springframework.ai.embedding.EmbeddingResponseMetadata metadata =
            new org.springframework.ai.embedding.EmbeddingResponseMetadata();
    metadata.setModel(Objects.requireNonNull(finalOptions.getModel()));

    return new org.springframework.ai.embedding.EmbeddingResponse(embeddingsList, metadata);
}

private VertexAiTextEmbeddingOptions getDefaultOptions() {
    return super.defaultOptions;
}

} // end of class

After writing this, the warnings have disappeared.

Comment From: markpollack

Wow, thanks for the deep investigation. I'm getting back to triage after a break. Would you be able to create a PR to address this, please?

Comment From: RyanHowell30

@markpollack Yeah, sure, no problem.

Comment From: RyanHowell30

@markpollack I am getting permission denied when trying to push my changes. Do you have any idea of how to get around this?

Change:

Modified call(...) to use try-with-resources: `try (PredictionServiceClient client = createPredictionServiceClient()) { ... }`

When the block finishes, client.close() is automatically called, ensuring the gRPC channel is cleanly shut down.
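
As a rough illustration of that guarantee, the sketch below records the order of events when the body succeeds and when it throws. `FakeClient` and `EVENTS` are hypothetical names made up for this example; the real `PredictionServiceClient` is not involved. The point is only that `close()` runs in both cases, and runs before any `catch` block.

```java
import java.util.ArrayList;
import java.util.List;

public class CloseOrderSketch {

    // Records what happened, in order.
    static final List<String> EVENTS = new ArrayList<>();

    // Hypothetical stand-in for an AutoCloseable client.
    static final class FakeClient implements AutoCloseable {
        String predict(String input) {
            EVENTS.add("predict");
            if (input.isEmpty()) {
                throw new IllegalArgumentException("empty input");
            }
            return "ok";
        }

        @Override
        public void close() {
            EVENTS.add("close");
        }
    }

    static String call(String input) {
        try (FakeClient client = new FakeClient()) {
            return client.predict(input);
        } catch (IllegalArgumentException e) {
            // The resource has already been closed by the time we get here.
            EVENTS.add("catch");
            return "failed";
        }
    }

    public static void main(String[] args) {
        call("hello"); // normal path
        call("");      // exceptional path
        System.out.println(EVENTS); // [predict, close, predict, close, catch]
    }
}
```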

Comment From: charlie-ang-collibra

Thanks for your work here, @RyanHowell30! I ran into this issue today and am glad you had already identified it. I hope you don't mind if I lift your implementation in the meantime. With the private/package-private methods and the lambda call, there's just no easy way to make only that change.

🙏