Implement search on 2 terms

Hi, I am a newbie and have a problem using the search.

In my index I have a term matching “notebook” and a term matching “lenovo”.

If I search for “notebook” I get a result list as expected, and the same for “lenovo”.

But when my search term is “lenovo notebook” I get an empty result list, instead of a list with all notebooks and all lenovo matches…

Currently I am using this:

            luceneQuery.must(qb.bool()
                .should(qb.keyword().wildcard().onField("articleNumber")
                    .matching("*$escapedQuery*").createQuery())
                .should(qb.keyword().wildcard()
                    .onFields("articleLabelTranslations", "articleDescriptionTranslations")
                    .matching("*${escapedQuery.toLowerCase()}*").createQuery())
                .createQuery()
            )

Debugging my code using the Luke index toolbox, I saw that I can get the expected result when toggling the default query operator of the ‘org.apache.lucene.analysis.standard.StandardAnalyzer’ from AND to OR…

Can anybody explain to me how I can get the expected (OR-like) result set?

The queries generated by Luke are simply not the same as the ones you’re using here.

First, consider whether you really need the wildcards (*). Believe me, they will make your life harder, for the simple reason that wildcard queries do not analyze the terms you pass: they are not broken down into words, they are not lowercased, …
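To illustrate why the lack of analysis matters, here is a minimal plain-Java sketch. It does not use Lucene or Hibernate Search; the class `WildcardVsAnalyzed`, its methods, and the sample text "Lenovo ThinkPad Notebook" are all made up for illustration, and the "analyzer" is only a rough approximation (lowercase, split on non-alphanumerics). The point is that a wildcard pattern is compared as-is against each indexed token, so a pattern containing a space or an uppercase letter cannot match tokens that were split and lowercased at indexing time.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.regex.Pattern;

class WildcardVsAnalyzed {

    // Rough stand-in for an analyzer: lowercase, then split on anything
    // that is not a letter or a digit.
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty()) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    // Wildcard-style match: the pattern is used as-is (no splitting, no
    // lowercasing) and must match one whole indexed token.
    static boolean wildcardMatches(String pattern, List<String> indexedTokens) {
        StringBuilder regex = new StringBuilder();
        String[] parts = pattern.split("\\*", -1);
        for (int i = 0; i < parts.length; i++) {
            if (i > 0) {
                regex.append(".*"); // each '*' stands for "any characters"
            }
            regex.append(Pattern.quote(parts[i]));
        }
        Pattern compiled = Pattern.compile(regex.toString());
        for (String token : indexedTokens) {
            if (compiled.matcher(token).matches()) {
                return true;
            }
        }
        return false;
    }

    // Analyzed match: the query goes through the same analysis as the
    // indexed text, and any shared token counts as a match (OR semantics).
    static boolean analyzedMatches(String query, List<String> indexedTokens) {
        for (String token : analyze(query)) {
            if (indexedTokens.contains(token)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Indexed tokens: [lenovo, thinkpad, notebook]
        List<String> indexed = analyze("Lenovo ThinkPad Notebook");
        System.out.println(wildcardMatches("*lenovo*", indexed));          // true
        System.out.println(wildcardMatches("*Lenovo*", indexed));          // false: not lowercased
        System.out.println(wildcardMatches("*lenovo notebook*", indexed)); // false: no token contains a space
        System.out.println(analyzedMatches("Lenovo Notebook", indexed));   // true
    }
}
```

In this simplified model, the third call mirrors the empty result for “lenovo notebook”: no single indexed token contains both words.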

Try the same query without wildcards:

            luceneQuery.must(qb.bool()
                .should(qb.keyword().onField("articleNumber")
                    .matching(escapedQuery).createQuery())
                .should(qb.keyword()
                    .onFields("articleLabelTranslations", "articleDescriptionTranslations")
                    .matching(escapedQuery).createQuery())
                .createQuery()
            )

Are you getting the results you want? If not, what doesn’t match that you would expect to match?

You saved my day :sunny:

Hm, one thing:

How can I implement the search using AND for the terms, so that my search returns only the one match in the end?

Have a look at simpleQueryString, in particular .withAndAsDefaultOperator().

This query does a bit more than what you want, since it accepts a simple query syntax with various operators, but it’s the only one that allows using the AND operator by default.
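To make the difference between the two default operators concrete, here is a small plain-Java sketch. It is only a model of the semantics, not how Lucene evaluates queries; the class `DefaultOperatorSketch`, the `Operator` enum, and the three sample documents are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

class DefaultOperatorSketch {

    enum Operator { OR, AND }

    // Hypothetical sample data, loosely based on the terms in this thread.
    static final List<String> DOCS =
            List.of("Lenovo notebook", "Dell notebook", "Lenovo monitor");

    // Simplified analysis: lowercase and split on non-alphanumerics.
    static Set<String> tokens(String text) {
        Set<String> result = new LinkedHashSet<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty()) {
                result.add(token);
            }
        }
        return result;
    }

    // OR: a document matches if it contains at least one query token.
    // AND: a document matches only if it contains every query token.
    static List<String> search(List<String> docs, String query, Operator operator) {
        Set<String> queryTokens = tokens(query);
        List<String> hits = new ArrayList<>();
        if (queryTokens.isEmpty()) {
            return hits; // nothing to search for
        }
        for (String doc : docs) {
            Set<String> docTokens = tokens(doc);
            boolean match = operator == Operator.AND
                    ? docTokens.containsAll(queryTokens)
                    : queryTokens.stream().anyMatch(docTokens::contains);
            if (match) {
                hits.add(doc);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        System.out.println(search(DOCS, "lenovo notebook", Operator.OR));  // all three documents
        System.out.println(search(DOCS, "lenovo notebook", Operator.AND)); // only "Lenovo notebook"
    }
}
```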

I will check that.

Could you point me to some more information on how to introduce some kind of ranking for those results?

Lucene already assigns a score to each match, and by default Hibernate Search sorts results according to this score.

You can find an explanation of the fundamentals here: org.apache.lucene.search (Lucene 8.7.0 API)

To investigate how your documents are being scored, see here: Hibernate Search 5.11.12.Final: Reference Guide

If the order of results is not satisfying, here are a few things you can do to change it:

  • Assign a boost to some fields: .onField("title").boostedTo(5f). A boosted field will have a greater influence on the score when it matches, so e.g. you might want to boost a “title” field because a document whose title matches is more likely to be relevant.
  • If you have custom analyzers, tune them. For example if an analyzer removes diacritics, such as “résumé” => “resume”, you might want to preserve the original token so that when someone is really looking for a “résumé”, their query will give a better score to documents containing “résumé” (meaning “list of previous jobs”) than those just containing “resume” (meaning “continue”). Generally, there are options to preserve original tokens in token filters.
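As a rough intuition for what a boost does, here is a deliberately simplified plain-Java model in which each matching field contributes its raw score multiplied by its boost. This is an assumption made for illustration only: the real Lucene scoring formula has more factors, and the class `BoostSketch`, the field names, and the raw scores are invented.

```java
import java.util.Map;

class BoostSketch {

    // Simplified model (assumption): total score is the sum, over matching
    // fields, of rawFieldScore * boost. Fields without a boost use 1.0.
    static double score(Map<String, Double> rawFieldScores, Map<String, Double> boosts) {
        double total = 0.0;
        for (Map.Entry<String, Double> entry : rawFieldScores.entrySet()) {
            total += entry.getValue() * boosts.getOrDefault(entry.getKey(), 1.0);
        }
        return total;
    }

    public static void main(String[] args) {
        Map<String, Double> boosts = Map.of("title", 5.0);

        // Document A matches in the title, document B in the description.
        // B has the slightly higher raw score, but A wins once the boost applies.
        double docA = score(Map.of("title", 1.0), boosts);
        double docB = score(Map.of("description", 1.2), boosts);

        System.out.println("A=" + docA + " B=" + docB);
    }
}
```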

Alternatively, you can throw scoring out of the window and sort according to a particular field (by title, …): Hibernate Search 5.11.12.Final: Reference Guide

Hello, your explanations helped a lot.

Now I am using

                .should(qb.simpleQueryString()
                    .onFields("articleLabelTranslations", "articleDescriptionTranslations", "articleNumber")
                    .matching(query.query)
                    .createQuery()
                )

Is there a way to find out the score of each item in my result set? I would like to display the score to the user…

I don’t recommend displaying the score to your users, unless they are technical users familiar with Lucene, because they will likely not like what they see:

  • Most users want the score to be a percentage, but it’s an absolute value that cannot reliably be converted to a percentage.
  • The score is not just affected by the content of a document. It’s affected by the query (obviously), but also by the content of other documents: if you add more documents with the term “cat”, then “cat” becomes less significant overall, and matching the term “cat” will have a lower impact on the score… even for pre-existing documents. This behavior is often surprising to end users.
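The second point can be sketched with a textbook-style inverse document frequency (IDF). The formula below, 1 + ln(numDocs / (docFreq + 1)), is in the spirit of classic TF-IDF scoring; the similarity actually used by your Lucene version may differ, and the class `IdfSketch` and the document counts are made up for illustration.

```java
class IdfSketch {

    // Textbook-style IDF (assumption, see above): the more documents contain
    // a term, the smaller the value, so the term weighs less in the score.
    static double idf(long docFreq, long numDocs) {
        return 1.0 + Math.log((double) numDocs / (docFreq + 1));
    }

    public static void main(String[] args) {
        // Before: 1 of 10 documents contains "cat".
        double before = idf(1, 10);
        // After adding 5 more documents that all contain "cat": 6 of 15.
        double after = idf(6, 15);

        System.out.println("idf before: " + before);
        System.out.println("idf after:  " + after);
        System.out.println("cat is less significant now: " + (after < before));
    }
}
```

In this model, the pre-existing “cat” document has not changed at all, yet the weight of its match dropped because other documents were added.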

If you want to see the score for debugging purposes, though, you can still use the score projection.

If you want more details about scoring, still for debugging purposes, you can also ask for an explanation of the score computation; see here.

I understand that it does not make sense to display scores to end users.
For debugging purposes, though, I would like to take a quick look at the scores of my results. You pointed me to the score projection page, but sorry, I don’t understand how to apply the projection to my code:

                .should(qb.simpleQueryString()
                    .onField("articleNumber")
                        .boostedTo(5f)
                    .andFields("articleLabelTranslations", "articleDescriptionTranslations")
                        .boostedTo(2f)
                    .withAndAsDefaultOperator()
                    .matching(query.query)
                    .createQuery()
                )

Sorry, I forgot you were on Hibernate Search 5, not 6.

See here for the score projections: https://docs.jboss.org/hibernate/search/5.11/reference/en-US/html_single/#_projections

See here for explanations: https://docs.jboss.org/hibernate/search/5.11/reference/en-US/html_single/#_understanding_results

Thank you for your help! :beers:

Useful and very well explained topic, thanks! I think it’s a very good solution for everyone.