I am currently trying to implement a dense passage retrieval system, and I am evaluating whether I can use Hibernate Search for that purpose. Essentially, I want to perform a kNN search on entities that have multiple embeddings associated with them.
Consider something like this:
@Entity
@Indexed
public class Book {
@Id
private Integer id;
@OneToMany(mappedBy = "book")
@IndexedEmbedded
private List<Embedding> bookEmbeddings;
// Other properties ...
}
@Entity
public class Embedding {
@Id
private Integer id;
@ManyToOne
private Book book;
@VectorField(dimension = 768, vectorSimilarity = VectorSimilarity.COSINE, searchable = Searchable.YES)
private float[] embedding;
// Other properties ...
}
This is currently not possible as stated in the documentation:
“It is not allowed to index multiple vectors within the same field, i.e. vector fields cannot be multivalued.”
Is this a temporary limitation because the vector search functionality is still being developed, or are there underlying limitations that make this impossible to implement?
Thanks for reaching out with this question. At the moment, this is a limitation of the underlying Lucene implementation of vector search: Lucene allows indexing exactly one vector into a vector field, so the field cannot be multi-valued (see, e.g., "Multi-value Support for KnnVectorField", Issue #12313 in apache/lucene on GitHub).
But when we say that a vector field cannot be multivalued, we mean that a mapping like the following is not allowed:
@Indexed
public class Book {
@Id
private Integer id;
@OneToMany(mappedBy = "book")
@VectorField(....)
private List<float[]> bookEmbeddings;
// Other properties ...
}
In your example, though, you’ve wrapped the vector field in another embedded object. That should work fine since it means that the vector field will be located in a nested object containing a single vector.
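To make the "one vector per nested object" idea concrete, here is a minimal plain-Java sketch. It does not use any Hibernate Search or Lucene API; the class `MultiVectorSketch`, the method names, and the choice to score a book by the best similarity among its embeddings are all assumptions made for illustration only. It just shows the shape of the data: each book has several embedding objects, and each of those holds exactly one vector.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Conceptual illustration only: plain Java, no Hibernate Search or Lucene calls.
// All names here are hypothetical.
public class MultiVectorSketch {

    // Cosine similarity between two vectors of equal length.
    static double cosine(float[] a, float[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Each book id maps to several embeddings; every embedding stands for one
    // nested object holding exactly one vector. A book's score is the best
    // similarity among its embeddings (an assumption of this sketch).
    static List<Integer> topBooks(Map<Integer, List<float[]>> embeddingsByBook,
                                  float[] query, int k) {
        Map<Integer, Double> bestScore = new LinkedHashMap<>();
        for (Map.Entry<Integer, List<float[]>> e : embeddingsByBook.entrySet()) {
            for (float[] v : e.getValue()) {
                bestScore.merge(e.getKey(), cosine(v, query), Math::max);
            }
        }
        List<Integer> ids = new ArrayList<>(bestScore.keySet());
        // Sort book ids by descending best score.
        ids.sort(Comparator.comparingDouble((Integer id) -> bestScore.get(id)).reversed());
        return ids.subList(0, Math.min(k, ids.size()));
    }

    public static void main(String[] args) {
        Map<Integer, List<float[]>> data = new LinkedHashMap<>();
        data.put(1, List.of(new float[] {1f, 0f}, new float[] {0f, 1f}));
        data.put(2, List.of(new float[] {-1f, 0f}));
        System.out.println(topBooks(data, new float[] {0f, 1f}, 1)); // prints [1]
    }
}
```

In the real mapping, the index backend performs the vector search; the brute-force loop above only stands in for it so that the data layout is easy to see.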
Note that with such a mapping, the query should be something like this: