I think I’ve arrived at a set of requirements that can’t be solved with Hibernate Search 5.11. That’s OK; I’m not expecting a solution, but a short statement that it’s not possible would be nice.
I have Cases (“Kauffälle”), each with a list of addresses (“KatasterAdressen”). Now I want to search for all Cases with the following input (plus other inputs of case data):
Problem
The @IndexedEmbedded flattens the addresses in the search index, so a search for street: ABC && housenumber: 2 yields the following results:
- { Street: “ABC-Street”, From: 1, To: 2 }
- { Street: “ABC-Street”, From: 100, To: 101} (match only on street name, because of the index flattening)
- { Street: “XYC”, From: 2, To: 2} (match only on house number, because of the index flattening)
Of course I only want the first result; that is, both the street and the house number should match within the same address.
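To make the flattening effect concrete, here is a minimal plain-Java sketch. It does not use Hibernate Search or Lucene at all; the class `FlatteningDemo`, the `Address` holder and both method names are made up for illustration, and it assumes a Case can hold several addresses. It only contrasts "each condition may hit a different address" (flattened) with "all conditions must hit the same address" (what I actually want).

```java
import java.util.List;

// Plain-Java illustration only: no Hibernate Search or Lucene involved.
class FlatteningDemo {

    // Hypothetical stand-in for one address: a street plus a house-number range.
    static final class Address {
        final String street;
        final long from;
        final long to;

        Address(String street, long from, long to) {
            this.street = street;
            this.from = from;
            this.to = to;
        }
    }

    // Flattened semantics: every condition may be satisfied by a different address.
    static boolean flattenedMatch(List<Address> addresses, String streetPrefix, long houseNumber) {
        boolean streetHit = false;
        boolean fromHit = false;
        boolean toHit = false;
        for (Address a : addresses) {
            streetHit |= a.street.startsWith(streetPrefix);
            fromHit |= a.from <= houseNumber;
            toHit |= a.to >= houseNumber;
        }
        return streetHit && fromHit && toHit;
    }

    // Desired semantics: all conditions must be satisfied by the same address.
    static boolean nestedMatch(List<Address> addresses, String streetPrefix, long houseNumber) {
        for (Address a : addresses) {
            if (a.street.startsWith(streetPrefix) && a.from <= houseNumber && a.to >= houseNumber) {
                return true;
            }
        }
        return false;
    }
}
```

For a Case holding both { "ABC-Street", 100, 101 } and { "XYC", 2, 2 }, `flattenedMatch(..., "ABC", 2)` is true while `nestedMatch(..., "ABC", 2)` is false, which is the kind of false positive described above.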
Questions
(1) The recommended hack I found somewhere on Stack Overflow is to put all address data in one field (so that the information about which data belongs together is not lost). Unfortunately, I can’t figure out a way to keep both the ngram analysis on the street name and the range search on the house number in a single field. Do you know another possible hack, or is it simply not possible with Hibernate Search 5.11?
(2) I’ve seen that Hibernate Search 6 supports nested fields, which seems to solve my problem. Is there a way to hack this single feature myself for use with Hibernate Search 5.11? Maybe by accessing the Lucene API directly, or something similar?
(3) I’ve read that you don’t publish dates, but is there a long-term ETA for a first production release of Hibernate Search 6? “Beta 6” suggests it might be somewhat stable. Would you recommend using it in production?
The Case is simple (irrelevant annotations and fields left out):
@Entity
@Indexed
@...
public class Kauffall {
    @ElementCollection
    @IndexedEmbedded
    @Field
    @...
    private List<KatasterAdresse> adressen = new ArrayList<>();
    ...
}
The address is more complicated, because it expresses a street combined with a range of house numbers. I can’t find a way to reduce these five fields to a single field that would support both completion of street names and a search for house numbers that fall inside the range.
@Embeddable
@...
public class KatasterAdresse implements IPrimeentity {
    @Field(name = "strasse_ngram", analyzer = @Analyzer(definition = "edgeNGram"))
    @Field
    private String strasse;
    @Field(analyze = Analyze.NO, indexNullAs = "0000000000")
    private Long hausnummerVon;
    @Field(indexNullAs = "")
    private String hausnummerVonZusatz;
    @Field(analyze = Analyze.NO, indexNullAs = "9999999999")
    private Long hausnummerBis;
    @Field(indexNullAs = "zzzzzzzzzz")
    private String hausnummerBisZusatz;
}
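To spell out what "inside the range" is meant to be, here is a small plain-Java sketch of the containment check. It is only my reading of the semantics, not anything Hibernate Search provides: the class `HouseNumberRangeDemo` and its methods are hypothetical, the null defaults mirror the indexNullAs values above, and it assumes that number and suffix (“Zusatz”) are compared as a pair, number first.

```java
// Plain-Java illustration only: hypothetical helper, not part of the mapping.
class HouseNumberRangeDemo {

    // Compare (number, suffix) pairs: number first, suffix as tie-breaker.
    static int compare(long n1, String z1, long n2, String z2) {
        int c = Long.compare(n1, n2);
        return c != 0 ? c : z1.compareTo(z2);
    }

    // True if (nummer, zusatz) lies within [(von, vonZusatz), (bis, bisZusatz)].
    // Null bounds fall back to the same defaults as the indexNullAs values.
    static boolean contains(Long von, String vonZusatz, Long bis, String bisZusatz,
                            long nummer, String zusatz) {
        long lo = von != null ? von : 0L;
        String loZusatz = vonZusatz != null ? vonZusatz : "";
        long hi = bis != null ? bis : 9_999_999_999L;
        String hiZusatz = bisZusatz != null ? bisZusatz : "zzzzzzzzzz";
        return compare(lo, loZusatz, nummer, zusatz) <= 0
                && compare(nummer, zusatz, hi, hiZusatz) <= 0;
    }
}
```

With this reading, house number 2 is inside the range 1 to 2 but outside 100 to 101, and an address without any bounds contains every house number.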
Appendix:
Analyzer Configuration:
@AnalyzerDef(
    name = "edgeNGram",
    tokenizer = @TokenizerDef(factory = StandardTokenizerFactory.class),
    filters = {
        @TokenFilterDef(factory = ASCIIFoldingFilterFactory.class), // Replace accented characters by their simpler counterpart (è => e, etc.)
        @TokenFilterDef(factory = LowerCaseFilterFactory.class),
        @TokenFilterDef(factory = EdgeNGramFilterFactory.class, params = {
            @Parameter(name = "maxGramSize", value = "20"),
            @Parameter(name = "minGramSize", value = "3"),
        }),
    })
@Analyzer(definition = "default")
@AnalyzerDef(name = "default",
    tokenizer = @TokenizerDef(factory = StandardTokenizerFactory.class),
    filters = {
        @TokenFilterDef(factory = ASCIIFoldingFilterFactory.class), // Replace accented characters by their simpler counterpart (è => e, etc.)
        @TokenFilterDef(factory = LowerCaseFilterFactory.class) // Lowercase all characters
    })
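For readers unfamiliar with edge n-grams, the following plain-Java sketch roughly imitates what the "edgeNGram" analyzer above does to a street name. It is an approximation and does not call Lucene: the class `EdgeNGramDemo` is hypothetical, tokenization is simplified to splitting on non-alphanumeric characters, accent folding is done with `java.text.Normalizer` (which, unlike the ASCII folding filter, does not rewrite characters such as ß), and I assume tokens shorter than the minimum gram size produce no grams.

```java
import java.text.Normalizer;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Plain-Java approximation of the analyzer chain; not the Lucene implementation.
class EdgeNGramDemo {

    static List<String> edgeNGrams(String text, int minGram, int maxGram) {
        // Approximate accent folding: decompose, then strip combining marks (è => e).
        String folded = Normalizer.normalize(text, Normalizer.Form.NFD).replaceAll("\\p{M}+", "");
        List<String> grams = new ArrayList<>();
        // Lowercase, then split into tokens on anything that is not a letter or digit.
        for (String token : folded.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            int longest = Math.min(maxGram, token.length());
            // Emit every prefix whose length is between minGram and maxGram.
            for (int len = minGram; len <= longest; len++) {
                grams.add(token.substring(0, len));
            }
        }
        return grams;
    }
}
```

Calling `edgeNGrams("ABC-Street", 3, 20)` gives the prefixes abc, str, stre, stree and street, which is why a partial input such as "stre" can match the strasse_ngram field.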