I have a problem with hibernate search. It’s a little complicated. I have an entity named Book, it has OneToMany relations to another entity named SaleLog. SaleLog has attributes: saleDate(DateTime), amount(int), customer(string), address(string). Code like this:
@Indexed
class Book {
@OneToMany(fetch = FetchType.LAZY, cascade = CascadeType.ALL)
@JoinColumn(name = "book_id")
@IndexedEmbedded(includePaths = {"customer", "address"})
private List<SaleLog> saleLogs;
...
}
class SaleLog {
int bookId;
DateTime saleDate;
int amount;
@Field
string customer;
@Field
string address;
}
Here is salelog:
Book
Customer
Address
Think in Java
Jack
Paris
Think in Java
Tom
London
Think in C++
Jack
London
Once I want to search book which bought by Jack From London, I use Hibernate search like this:
What you are trying to do is retrieve all the book that contain a SaleLog which satisfies two conditions: the customer is “Jack” AND the address is “London”.
This is not possible with traditional full-text queries, because those queries work on a “flattened” view of the document. Inside the index, you don’t have any structure.
For example let’s take the book “Think in Java”: you expect the index to contain one document for this book, and the document two contain two “sub-documents”, on for each sales log. Something like this:
Book “Think in Java”
Sale log 1
customer = Jack
address = Paris
Sale log 2
customer = Tom
address = London
But that’s not how the data is structure in the index. Instead it is flattened, like this:
Book “Think in Java”
salelogs.customer = Jack
salelogs.address = Paris
salelogs.customer = Tom
salelogs.address = London
There is no way to tell that the address “London” is related to “Tom” anymore; that information was lost when indexing.
Now, there are solutions, but they require to take a different path. The most obvious solution in your case would be to reverse the index embedding: instead of embedding the sale logs in the books, you could replace the bookId attribute in SaleLog with a @ManyToOne private Book book; attribute, add an @IndexedEmbedded annotation on it, add @Indexed on SaleLog, and execute your search queries on the SaleLog entity rather than on the Book entity. After all, you’re actually searching for the book linked to a SaleLog, so it makes sense to return a list of SaleLog to your user.
In Search 6, we will add support for “nested” queries, which provides a much easier solution to your problem. The relevant ticket in our JIRA tracker is HSEARCH-2263; we will update it as the feature makes its way to the master branch.
However, there is no planned release date for Search 6 yet (it’s still in its early days).
But in this case, if I search from SaleLog,I will get two SaleLogs that contains a same book, so directly, I will get duplicate books, could I get only one book by Hibernate Search?
Thanks for your reply, I have been confused by this problem for a long time. I have tried inverse the index embedding, it worked. But the index was more complicated, There were four grade of nested index. And I think this would impact the performance.
In my question, I took the book and salelog as an example to make it easy to understand, but my question is more complicated. I have to search an entity with some condition, some are entity’s attributes, and other are related entity’s attributes, some of the ralationship is nested. Like this: a.attr1 = xx and a.b.attr2 = yy and a.b.c.attr3 = zz
In my app, it also requires fuzzy search, which means the condition is xx,yy,zz and a keyword. There are some fields to check the keyword. I used sql and index to solve this, and I found sql is faster. SQL: a like keyword or b like keyword or c like keyword .Do you have any advices?
If all you need is predicates such as column LIKE 'searchTerm% or lower(column) LIKE lower('searchTerm%'), in other word prefix predicates, then yes, SQL can be faster, especially if you put an index on the column or on lower(column).
However, that’s about as far as you can get.
If you need to handle multiple search terms (multiple words), you need “contains” predicates (with a wildcard at the beginning of the search term, such as column LIKE '%searchTerm%), and then SQL will not be fast at all.
SQL won’t offer you the features Lucene/Elasticsearch do, either. You won’t have synonyms, you won’t have stemming, and so on.
To sum up, if you don’t need full-text search, then yes, simply using SQL is a good idea
I don’t know how this API works but something bother me. It seems that createQuery create a new instance of MustJunction. You just need to put the result of your builder in your variable. mobdro
like this :
MustJunction term = qb.bool().must(qb.keyword().onField("saleLogs.customer").matching("Jack").createQuery());
term = term.must(qb.keyword().onField("saleLogs.address").matching("London").createQuery())
List<Book> books = fullTextSession.createFullTextQuery(term.createQuery()).list();
No, must() returns an instance of MustJunction. createQuery() returns an instance of Query. It’s more obvious if you spread the code over multiple lines:
MustJunction term = qb.bool().must(
qb.keyword().onField("saleLogs.customer").matching("Jack").createQuery()
);
To get back to the code from the original question, here is an equivalent piece of code that is probably easier to understand