Debugging my code with the Luke index toolbox, I saw that I can get the expected result when toggling the default query operator of ‘org.apache.lucene.analysis.standard.StandardAnalyzer’ from AND to OR…
Can anybody explain to me how I can get the expected (OR-like) result set?
The queries generated by Luke are simply not the same queries as what you’re using here.
First, consider whether you really need the wildcards (*). Believe me, they will make your life harder, for the simple reason that wildcard terms are not analyzed: they are not broken down into words, they are not lowercased, …
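As a minimal sketch of the pitfall, assuming the Hibernate Search 5 query DSL and a hypothetical title field (qb is a QueryBuilder, obtained as in the next snippet):

```java
// Wildcard terms bypass analysis: "Hiber*" is neither tokenized nor
// lowercased, so it will NOT match the indexed (lowercased) token "hibernate".
Query wildcardQuery = qb.keyword()
        .wildcard()
        .onField("title")
        .matching("Hiber*")
        .createQuery();
```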
Have a look at simpleQueryString, in particular .withAndAsDefaultOperator().
This query does a bit more than you need, since it accepts a simple query syntax with various operators, but it is the only one that allows making AND the default operator.
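A minimal sketch, assuming the Hibernate Search 5 API, a hypothetical Book entity, and hypothetical title/summary fields:

```java
import java.util.List;

import org.apache.lucene.search.Query;
import org.hibernate.Session;
import org.hibernate.search.FullTextQuery;
import org.hibernate.search.FullTextSession;
import org.hibernate.search.Search;
import org.hibernate.search.query.dsl.QueryBuilder;

public List<?> searchBooks(Session session, String terms) {
    FullTextSession fullTextSession = Search.getFullTextSession(session);
    QueryBuilder qb = fullTextSession.getSearchFactory()
            .buildQueryBuilder().forEntity(Book.class).get();

    // The input is analyzed like any other DSL query; with AND as the
    // default operator, "war peace" behaves like "war AND peace".
    Query luceneQuery = qb.simpleQueryString()
            .onFields("title", "summary")
            .withAndAsDefaultOperator()
            .matching(terms)
            .createQuery();

    FullTextQuery fullTextQuery =
            fullTextSession.createFullTextQuery(luceneQuery, Book.class);
    return fullTextQuery.list();
}
```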
If the order of results is not satisfactory, here are a few things you can do to change it:
Assign a boost to some fields: .onField("title").boostedTo(5f). A boosted field has a greater influence on the score when it matches; for example, you might want to boost a “title” field because a document whose title matches is more likely to be relevant.
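For instance, reusing the qb builder from the sketch above (field names are again hypothetical):

```java
// A match on "title" now weighs five times more than a match on "summary".
Query boostedQuery = qb.keyword()
        .onField("title").boostedTo(5f)
        .andField("summary")
        .matching("cat")
        .createQuery();
```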
If you have custom analyzers, tune them. For example, if an analyzer removes diacritics, turning “résumé” into “resume”, you might want to preserve the original token so that when someone is really looking for a “résumé”, their query gives a better score to documents containing “résumé” (meaning “list of previous jobs”) than to those containing only “resume” (meaning “continue”). Token filters generally offer options to preserve the original tokens.
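As a sketch of the “preserve original” idea, assuming Hibernate Search 5 analyzer definitions and Lucene’s ASCII-folding filter (the entity and analyzer names are illustrative):

```java
import org.apache.lucene.analysis.core.LowerCaseFilterFactory;
import org.apache.lucene.analysis.miscellaneous.ASCIIFoldingFilterFactory;
import org.apache.lucene.analysis.standard.StandardTokenizerFactory;
import org.hibernate.search.annotations.AnalyzerDef;
import org.hibernate.search.annotations.Parameter;
import org.hibernate.search.annotations.TokenFilterDef;
import org.hibernate.search.annotations.TokenizerDef;

@AnalyzerDef(name = "foldingAnalyzer",
        tokenizer = @TokenizerDef(factory = StandardTokenizerFactory.class),
        filters = {
                @TokenFilterDef(factory = LowerCaseFilterFactory.class),
                // preserveOriginal indexes "résumé" alongside the folded
                // "resume", so exact accented matches can score higher.
                @TokenFilterDef(factory = ASCIIFoldingFilterFactory.class,
                        params = @Parameter(name = "preserveOriginal", value = "true"))
        })
public class Book {
    // ... fields using @Analyzer(definition = "foldingAnalyzer") ...
}
```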
I don’t recommend displaying the score to your users, unless they are technical users familiar with Lucene, because they will likely not like what they see:
The score is not affected solely by the content of a document. It is affected by the query (obviously), but also by the content of other documents: if you add more documents containing the term “cat”, then “cat” becomes less significant overall, and matching the term “cat” will have a lower impact on the score, even for pre-existing documents. This behavior often surprises end users.
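For a rough illustration, assuming Lucene’s classic TF-IDF similarity (the default BM25 similarity behaves similarly in spirit), the inverse document frequency of a term t across N documents is:

```latex
\mathrm{idf}(t) = 1 + \ln\left(\frac{N}{\mathrm{docFreq}(t) + 1}\right)
```

With N = 1000 and docFreq(“cat”) = 10, idf ≈ 5.5; after adding 90 more documents containing “cat” (N = 1090, docFreq = 100), idf drops to ≈ 3.4, so every “cat” match contributes less to the score, including in documents indexed long before.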
If you want to see the score for debugging purposes, though, you can still use the score projection.
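A minimal sketch of the score projection, reusing the fullTextQuery from the earlier snippet and the hypothetical Book entity:

```java
// Each result row becomes an Object[]: the score at index 0, the entity at index 1.
fullTextQuery.setProjection(FullTextQuery.SCORE, FullTextQuery.THIS);
@SuppressWarnings("unchecked")
List<Object[]> rows = fullTextQuery.list();
for (Object[] row : rows) {
    float score = (Float) row[0];
    Book book = (Book) row[1];
    System.out.println(score + " -> " + book);
}
```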
If you want more details about scoring, still for debugging purposes, you can also ask for an explanation of the score computation; see here.
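Similarly, still assuming Hibernate Search 5, the explanation can be fetched through a projection; note that building explanations is expensive and should be reserved for debugging:

```java
import org.apache.lucene.search.Explanation;

fullTextQuery.setProjection(FullTextQuery.EXPLANATION, FullTextQuery.THIS);
@SuppressWarnings("unchecked")
List<Object[]> explained = fullTextQuery.list();
for (Object[] row : explained) {
    // Prints the full score breakdown (term weights, norms, boosts, ...).
    Explanation explanation = (Explanation) row[0];
    System.out.println(explanation);
}
```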
I understand that it does not make sense to display scoring results to end users.
For debugging purposes, though, I would like to take a quick look at the scores of my results. You pointed me to the score projection page, but sorry, I don’t understand how to apply the projection to my code: