Hi,
I am wondering why the class Embedding internally stores the vector as List<Double>. I think double[] or float[] would be more memory-efficient.
Comment From: aseovic
@youngmoneee I believe that's exactly the point @agoerler is trying to make.
While the list may be more flexible, it is significantly less efficient than a simple float[] which is why pretty much all LLMs and most frameworks that work with them, including Langchain and Langchain4j, use primitive float[] to represent embeddings.
For one, the additional flexibility offered by a list is completely unnecessary: the operations you need to perform on embeddings, such as the calculations required for similarity search, are just as easy, if not easier, with arrays than with lists, and you can leverage SIMD support in modern CPUs to do them more efficiently. Not to mention that you can use classes like FloatBuffer to directly access a float[] allocated in native memory by ONNX, for example, but you can't do that with a list and have to box every single value into a Double on the way in (and likely unbox it into a float on the way out).
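To illustrate the point about similarity calculations, here is a minimal, hypothetical sketch (not the actual Spring AI code) of cosine similarity over primitive float[] embeddings; the plain loop is trivially JIT-friendly, whereas a List<Double> version would unbox every element:

```java
// Hypothetical sketch: cosine similarity over primitive float[] embeddings.
// A simple loop over float[] avoids boxing entirely and is amenable to
// auto-vectorization; a List<Double> version would unbox each element.
public class CosineSimilarity {

    static double cosine(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Dimension mismatch");
        }
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        float[] v = {1.0f, 2.0f, 3.0f};
        // A vector compared with itself has similarity 1.0 (within rounding)
        System.out.println(cosine(v, v));
    }
}
```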
But more importantly, storing the 1,536-dimension vector returned by OpenAI, for example, requires 6,144 bytes as a float[] and 36,864 bytes as a List<Double>, for the exact same payload (ignoring the space used by the array or list instance itself, which is approximately the same and irrelevant here). That's 6x the memory cost, for no good reason, as it is neither faster nor easier to work with. Quite the opposite, actually.
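The 6x figure can be reconstructed under one common accounting, sketched below: 4 bytes per primitive float, versus an assumed 24 bytes per list element (a 16-byte boxed Double object plus an 8-byte reference). Exact boxed-object sizes vary with JVM version and flags such as compressed oops, so treat these numbers as an estimate:

```java
// Back-of-envelope memory estimate for a 1,536-dimension embedding.
// Assumptions (may vary by JVM and flags): 16 bytes per boxed Double
// object plus an 8-byte reference per list slot; array/list object
// headers ignored, matching the comparison in the comment above.
public class EmbeddingMemory {

    static long floatArrayBytes(int dims) {
        return dims * 4L;           // 4 bytes per primitive float
    }

    static long doubleListBytes(int dims) {
        return dims * (16L + 8L);   // boxed Double + reference = 24 bytes
    }

    public static void main(String[] args) {
        System.out.println(floatArrayBytes(1536)); // prints 6144
        System.out.println(doubleListBytes(1536)); // prints 36864
    }
}
```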
Ultimately, nobody will ever store only a handful of vectors in a vector store, so these size differences add up in a hurry, and you'll need 6x the memory to store the exact same data using List<Double>. There is also a trend towards models that produce quantized vectors using a single byte per dimension (int8/uint8) or even a single bit per dimension (binary), in order to reduce the space required for vector storage by 4x or 32x without significant accuracy loss (take a look at Cohere, for example), so going in the other direction and making vectors 6x bigger than necessary seems like a bad idea.
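For intuition, scalar int8 quantization can be sketched as below. This is a simplified, hypothetical example (symmetric quantization with a per-vector scale); the schemes actually shipped by providers such as Cohere differ in details, but the 4x size reduction from float to byte is the same:

```java
// Hypothetical sketch of scalar (int8) quantization: map each float
// dimension into [-127, 127] using a per-vector scale factor.
// Real production schemes differ in details; this only illustrates
// the 4x size reduction from 4-byte floats to 1-byte values.
public class Int8Quantizer {

    static byte[] quantize(float[] v) {
        float maxAbs = 0f;
        for (float x : v) {
            maxAbs = Math.max(maxAbs, Math.abs(x));
        }
        float scale = (maxAbs == 0f) ? 1f : 127f / maxAbs;
        byte[] q = new byte[v.length];
        for (int i = 0; i < v.length; i++) {
            q[i] = (byte) Math.round(v[i] * scale);
        }
        return q;
    }

    public static void main(String[] args) {
        byte[] q = quantize(new float[] {0.5f, -1.0f, 0.25f});
        // prints [64, -127, 32]: one byte per dimension instead of four
        System.out.println(java.util.Arrays.toString(q));
    }
}
```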
The bottom line is that for some of us implementing vector stores, especially in-memory vector stores, the difference between using primitive arrays/buffers and collections containing boxed wrapper types is so significant that the latter is a non-starter. The only way for us to support Spring AI at the moment would be to convert from List<Double> to a float[] on the way in, and the other way around on the way out. That is certainly doable, but is not free, and most importantly, it shouldn't be necessary.
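The conversion described above would look roughly like the following sketch (hypothetical helper names, not Spring AI API): every element has to be individually unboxed and narrowed on the way in, and widened and boxed on the way out.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the per-element conversion a vector store
// would need today: List<Double> -> float[] on write, and back on read.
public class EmbeddingConversion {

    static float[] toFloatArray(List<Double> list) {
        float[] result = new float[list.size()];
        for (int i = 0; i < result.length; i++) {
            result[i] = list.get(i).floatValue(); // unbox + narrow
        }
        return result;
    }

    static List<Double> toDoubleList(float[] array) {
        List<Double> result = new ArrayList<>(array.length);
        for (float x : array) {
            result.add((double) x); // widen + box
        }
        return result;
    }
}
```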
Comment From: aseovic
@youngmoneee As for "the ability to store null values" in a list, but not in the array, that argument makes no sense in this context. Embeddings are by and large dense vectors, so there shouldn't be any missing values, and you can't do calculations with null anyway.
For cases where sparse vectors are used, there are better data structures than either a primitive array or a list of objects, but embeddings are not such a use case.
Comment From: agoerler
@youngmoneee I believe that's exactly the point @agoerler is trying to make.
Yes.
I think float[] would be much more memory-efficient than List<Double>. Moreover, I doubt that the option to store null would ever be needed. As pointed out by @aseovic, float[] would also be more compatible with, e.g., Langchain4j and would hence require fewer costly conversions.
Also, I think vector embeddings are typically not manipulated but rather used in semantic similarity search. I don't see how the List interface is useful when dealing with vector embeddings.
Comment From: youngmoneee
Hi @aseovic, @agoerler
I apologize for not considering the specific context of embeddings and making an incorrect judgment based solely on general cases. Even though there is overhead in memory and computation, I thought it would not pose a significant issue on modern computing servers, especially compared to the bottlenecks caused by I/O-bound tasks.
However, after considering your comments, I agree that regardless of the impact's magnitude, performing unnecessary wrapping/unwrapping and occupying additional memory and processing time is indeed unwarranted.
Comment From: ThomasVitale
Solved in https://github.com/spring-projects/spring-ai/pull/1002