Not getting exact match in Hibernate Search

Hi Members,

I have a question about exact matching in Hibernate Search.
I have tried searching with two types of tokenizers:

  1. White space tokenizer
  2. Standard tokenizer

My query is as below :
org.apache.lucene.search.Query exact_query = qb.phrase().withSlop(0).onField("fullRecord").sentence(input.toLowerCase()).createQuery();
My analyzer is as below :
@AnalyzerDef(name = "textanalyzer",
    tokenizer = @TokenizerDef(factory = StandardTokenizerFactory.class, params = {
        @Parameter(name = "maxTokenLength", value = "8000") }),
    filters = {
        @TokenFilterDef(factory = LowerCaseFilterFactory.class),
        @TokenFilterDef(factory = DoubleMetaphoneFilterFactory.class, params = {
            // @Parameter(name = "encoder", value = "DoubleMetaphone"),
            @Parameter(name = "maxCodeLength", value = "10"),
            @Parameter(name = "inject", value = "true") }) })
@AnalyzerDef(name = "WithWhitespaceTokenizerFactory",
    tokenizer = @TokenizerDef(factory = WhitespaceTokenizerFactory.class),
    filters = {
        @TokenFilterDef(factory = LowerCaseFilterFactory.class),
        @TokenFilterDef(factory = PhoneticFilterFactory.class, params = {
            @Parameter(name = "encoder", value = "Metaphone"),
            @Parameter(name = "maxCodeLength", value = "20"),
            @Parameter(name = "inject", value = "true") }) })

@Field(termVector = TermVector.WITH_POSITION_OFFSETS)
@Analyzer(definition = "textanalyzer")
@Field(name = "fullRecord_forWildcards", analyzer = @Analyzer(definition = "WithWhitespaceTokenizerFactory"))

@Column(name = "FullRecord")
private String fullRecord;

Search String : ** ROW HIGHER

Expected Result : Only rows that contain the exact phrase ROW HIGHER should be returned, i.e.:
ROW HIGHER, 2, ROW HIGHER, 2, STR, LONDON

Getting Output :
All of these three rows mentioned below :
ROW HIGHER, 2, ROW HIGHER, 2, STR, LONDON
HIGHER ROW, HIGHER ROW, FORE STREET, KINGSAND, CORNWALL, PL10 1NL
HIGHER ROW, HIGHER ROW, FORE STREET, KINGSAND, CORNWALL

Note : I have tried both the Standard and the White space tokenizers. In both scenarios I get the same results.
I have also tried using another field with no analyzer on it, like this:
@Field(name = "exact_fullRecord", analyze = Analyze.NO)

Please suggest a solution to achieve this.
Your help would be highly appreciated!

For an exact match you do not need to tokenize the field (otherwise the analyzer splits the value into multiple tokens at punctuation marks or white space):
@Field(name = "exact_fullRecord", analyze = Analyze.NO)
This should work for an exact match against the whole field value.
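
To illustrate the point about tokenization, here is a rough sketch in plain Java, without Hibernate Search or Lucene. The class PhraseMatchSketch and its methods are made up for this illustration. The tokenize method is only a simplified stand-in for a tokenizer plus lowercase filter (it splits on anything that is not a letter or digit) and it ignores the phonetic filters entirely, so treat it as an approximation of what the analyzers in the question do, not as their actual behaviour. It suggests why a slop-0 phrase query can still hit the HIGHER ROW rows: once the comma is dropped, "HIGHER ROW, HIGHER ROW" contains the tokens "row" and "higher" next to each other. It also shows that an un-tokenized field is compared as one whole value.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

public class PhraseMatchSketch {

    // Simplified stand-in for tokenizer + lowercase filter (an assumption,
    // not Lucene's StandardTokenizer): split on any non-letter, non-digit run.
    public static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String part : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!part.isEmpty()) {
                tokens.add(part);
            }
        }
        return tokens;
    }

    // Slop-0 phrase match on a tokenized field: the phrase tokens must
    // appear consecutively, in order, somewhere in the field's token list.
    public static boolean containsPhrase(String fieldValue, String phrase) {
        List<String> wanted = tokenize(phrase);
        if (wanted.isEmpty()) {
            return false;
        }
        return Collections.indexOfSubList(tokenize(fieldValue), wanted) >= 0;
    }

    // Un-tokenized field: the whole value is a single term, so only an
    // input identical to the complete value matches.
    public static boolean matchesWholeValue(String fieldValue, String input) {
        return fieldValue.equals(input);
    }

    public static void main(String[] args) {
        String row1 = "ROW HIGHER, 2, ROW HIGHER, 2, STR, LONDON";
        String row2 = "HIGHER ROW, HIGHER ROW, FORE STREET, KINGSAND, CORNWALL";

        System.out.println(containsPhrase(row1, "ROW HIGHER"));      // true
        System.out.println(containsPhrase(row2, "ROW HIGHER"));      // true: "row" and "higher" are adjacent across the comma
        System.out.println(matchesWholeValue(row1, "ROW HIGHER"));   // false: the value is longer than the input
        System.out.println(matchesWholeValue("ROW HIGHER", "ROW HIGHER")); // true
    }
}
```

Under these assumptions, the un-tokenized field only returns rows whose complete value equals the search string, so a row such as "ROW HIGHER, 2, ROW HIGHER, 2, STR, LONDON" would not be returned for the input "ROW HIGHER".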
