I've also created a proposal for how to fix it: https://github.com/spring-projects/spring-ai/pull/1572

Bug description If the Resource being processed has no file name (getFilename() returns null, as it does for a byte array resource), org.springframework.ai.transformer.splitter.TextSplitter#createDocuments throws a NullPointerException.

Environment springAiVersion = "1.0.0-M3", Java 21, PgVectorStore

PgVectorStore is started from a compose file; I also have an init SQL script.

Steps to reproduce I have an endpoint, CronController, which publishes an event to PgEventListener. The listener asynchronously calls PGVectorStoreService, where I wrote a wrapper method that enriches each document with its file name. If I pass documents to org.springframework.ai.transformer.splitter.TextSplitter#createDocuments without a file name in their metadata, the collector gets null from e.getValue() and throws an NPE with no message:

Map<String, Object> metadataCopy = metadata.entrySet()
        .stream()
        .collect(Collectors.toMap(e -> e.getKey(), e -> e.getValue()));
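The failure can be reproduced outside Spring AI. The following is a minimal sketch, not code from the library: the class name, the method name, and the metadata keys are made up for illustration. It only relies on the fact that Collectors.toMap rejects null values, which matches the Objects.requireNonNull frame in the stack trace below.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.stream.Collectors;

public class ToMapNullValueDemo {

    // Same copy pattern as the snippet above (hypothetical wrapper method).
    static Map<String, Object> copyWithToMap(Map<String, Object> metadata) {
        return metadata.entrySet()
            .stream()
            .collect(Collectors.toMap(e -> e.getKey(), e -> e.getValue()));
    }

    public static void main(String[] args) {
        Map<String, Object> metadata = new HashMap<>();
        metadata.put("page_number", 1);
        // Illustrative key: stands in for a file name that resolved to null.
        metadata.put("file_name", null);

        try {
            copyWithToMap(metadata);
            System.out.println("copied without error");
        } catch (NullPointerException e) {
            // Collectors.toMap does not accept null values.
            System.out.println("NPE, message: " + e.getMessage());
        }
    }
}
```

HashMap itself accepts null values, so the map can be built without error; only the toMap collector fails.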

Expected behavior Collectors.toMap should be called safely: a null metadata value should not cause a NullPointerException.
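Two possible null-safe ways to copy the metadata are sketched here. This is an assumption about what "safe" could look like, not necessarily what the linked pull request does; the class and method names are hypothetical. The first variant drops entries whose value is null before collecting, the second copies through the HashMap constructor, which permits null values.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.stream.Collectors;

public class NullSafeMetadataCopy {

    // Variant 1: skip entries with null values, then collect as before.
    static Map<String, Object> copyDroppingNulls(Map<String, Object> metadata) {
        return metadata.entrySet()
            .stream()
            .filter(e -> e.getValue() != null)
            .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue));
    }

    // Variant 2: plain copy; HashMap allows null values, so nothing is lost.
    static Map<String, Object> copyKeepingNulls(Map<String, Object> metadata) {
        return new HashMap<>(metadata);
    }

    public static void main(String[] args) {
        Map<String, Object> metadata = new HashMap<>();
        metadata.put("page_number", 1);
        metadata.put("file_name", null); // illustrative key with a null value

        System.out.println(copyDroppingNulls(metadata).containsKey("file_name"));
        System.out.println(copyKeepingNulls(metadata).containsKey("file_name"));
    }
}
```

Which variant fits depends on whether downstream code needs to see the key at all: dropping nulls keeps the copy free of null values, while the plain copy preserves the original map exactly.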

Minimal Complete Reproducible example

20-10-2024 23:06:50.018  -  INFO 52440 [Async-1]  r.o.cron.service.pg.PgEventListener:16  : Processing event: PgEvent(resource=Byte array resource [resource loaded from byte array], type=PDF, fileName=Cloud_Architecture_Demystified_Understand_how_to_design_sustainable.pdf)
20-10-2024 23:06:50.019  -  INFO 52440 [Async-1]  r.o.cron.service.pg.PgEventListener:19  : PDF processing event: PgEvent(resource=Byte array resource [resource loaded from byte array], type=PDF, fileName=Cloud_Architecture_Demystified_Understand_how_to_design_sustainable.pdf)
20-10-2024 23:06:50.099  -  INFO 52440 [Async-1]  r.o.c.service.pg.PGVectorStoreService:83  : Loading Cloud_Architecture_Demystified_Understand_how_to_design_sustainable.pdf Reference PDF into Vector Store
20-10-2024 23:06:50.232  -  INFO 52440 [Async-1]  o.s.ai.reader.pdf.PagePdfDocumentReader:114  : Processing PDF page: 1
20-10-2024 23:06:50.429  -  INFO 52440 [Async-1]  o.s.ai.reader.pdf.PagePdfDocumentReader:114  : Processing PDF page: 23
20-10-2024 23:06:50.520  -  INFO 52440 [Async-1]  o.s.ai.reader.pdf.PagePdfDocumentReader:114  : Processing PDF page: 45
20-10-2024 23:06:50.597  -  INFO 52440 [Async-1]  o.s.ai.reader.pdf.PagePdfDocumentReader:114  : Processing PDF page: 67
20-10-2024 23:07:16.583  -  INFO 52440 [Async-1]  o.s.ai.reader.pdf.PagePdfDocumentReader:114  : Processing PDF page: 89
20-10-2024 23:07:16.657  -  INFO 52440 [Async-1]  o.s.ai.reader.pdf.PagePdfDocumentReader:114  : Processing PDF page: 111
20-10-2024 23:07:16.749  -  INFO 52440 [Async-1]  o.s.ai.reader.pdf.PagePdfDocumentReader:114  : Processing PDF page: 133
20-10-2024 23:07:16.825  -  INFO 52440 [Async-1]  o.s.ai.reader.pdf.PagePdfDocumentReader:114  : Processing PDF page: 155
20-10-2024 23:07:16.899  -  INFO 52440 [Async-1]  o.s.ai.reader.pdf.PagePdfDocumentReader:114  : Processing PDF page: 177
20-10-2024 23:07:16.981  -  INFO 52440 [Async-1]  o.s.ai.reader.pdf.PagePdfDocumentReader:114  : Processing PDF page: 199
20-10-2024 23:07:17.087  -  INFO 52440 [Async-1]  o.s.ai.reader.pdf.PagePdfDocumentReader:156  : Processing 228 pages
20-10-2024 23:07:17.097  - ERROR 52440 [Async-1]  r.o.c.service.pg.PGVectorStoreService:95  : Error while loading PDF Cloud_Architecture_Demystified_Understand_how_to_design_sustainable.pdf into Vector Store. Exception: NullPointerException - Message: null
20-10-2024 23:07:17.098  - ERROR 52440 [Async-1]  o.s.a.i.SimpleAsyncUncaughtExceptionHandler:39  : Unexpected exception occurred invoking async method: public void ru.ogbozoyan.cron.service.pg.PgEventListener.process(ru.ogbozoyan.cron.service.pg.PgEvent)

java.lang.NullPointerException: null
    at java.base/java.util.Objects.requireNonNull(Objects.java:233) ~[na:na]
    at java.base/java.util.stream.Collectors.lambda$uniqKeysMapAccumulator$1(Collectors.java:180) ~[na:na]
    at java.base/java.util.stream.ReduceOps$3ReducingSink.accept(ReduceOps.java:169) ~[na:na]
    at java.base/java.util.HashMap$EntrySpliterator.forEachRemaining(HashMap.java:1858) ~[na:na]
    at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:509) ~[na:na]
    at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:499) ~[na:na]
    at java.base/java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:921) ~[na:na]
    at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) ~[na:na]
    at java.base/java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:682) ~[na:na]
    at org.springframework.ai.transformer.splitter.TextSplitter.createDocuments(TextSplitter.java:91) ~[spring-ai-core-1.0.0-M3.jar:1.0.0-M3]
    at org.springframework.ai.transformer.splitter.TextSplitter.doSplitDocuments(TextSplitter.java:71) ~[spring-ai-core-1.0.0-M3.jar:1.0.0-M3]
    at org.springframework.ai.transformer.splitter.TextSplitter.apply(TextSplitter.java:41) ~[spring-ai-core-1.0.0-M3.jar:1.0.0-M3]
    at ru.ogbozoyan.cron.service.pg.PGVectorStoreService.saveNewPDFAsync(PGVectorStoreService.kt:91) ~[main/:na]
    at ru.ogbozoyan.cron.service.pg.PgEventListener.process(PgEventListener.kt:20) ~[main/:na]
    at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103) ~[na:na]
    at java.base/java.lang.reflect.Method.invoke(Method.java:580) ~[na:na]

Comment From: ogbozoyan

Merged in https://github.com/spring-projects/spring-ai/commit/8b1882b2244953ef0735227d4a5525b1b5479eba