Changing the analyzer used by @FullTextField by default

Dimitri · September 22, 2021, 5:57pm

[Moderator note: topic moved from Hibernate Search 6 doesn't support BigDecimal primary keys?]

Thanks. It works now when I just use @DocumentId.

Is there also a way to give @FullTextField the same global treatment?

I.e., can I configure Hibernate Search 6 to always use a specific analyzer (we use only one analyzer in our application) rather than being forced to duplicate the same analyzer name in every annotation?

E.g., I’d like to do this:

...
@FullTextField private String foo;
@FullTextField private String bar;
...

rather than this:

...
@FullTextField(analyzer = "myAnalyzer") private String foo;
@FullTextField(analyzer = "myAnalyzer") private String bar;
...

yrodiere · September 23, 2021, 7:09am

You can override the default analyzer, which is the one used by @FullTextField when you don’t specify the analyzer.

To override the default analyzer, just define a custom analyzer whose name is "default" (you can use the constant AnalyzerNames.DEFAULT).

Something like this:

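import org.apache.lucene.analysis.charfilter.HTMLStripCharFilterFactory;
import org.apache.lucene.analysis.core.LowerCaseFilterFactory;
import org.apache.lucene.analysis.miscellaneous.ASCIIFoldingFilterFactory;
import org.apache.lucene.analysis.snowball.SnowballPorterFilterFactory;
import org.apache.lucene.analysis.standard.StandardTokenizerFactory;
import org.hibernate.search.backend.lucene.analysis.LuceneAnalysisConfigurationContext;
import org.hibernate.search.backend.lucene.analysis.LuceneAnalysisConfigurer;
import org.hibernate.search.engine.backend.analysis.AnalyzerNames;

public class MyLuceneAnalysisConfigurer implements LuceneAnalysisConfigurer {
    @Override
    public void configure(LuceneAnalysisConfigurationContext context) {
        // Naming the analyzer AnalyzerNames.DEFAULT replaces the built-in default.
        context.analyzer( AnalyzerNames.DEFAULT ).custom()
                .tokenizer( StandardTokenizerFactory.class )
                .charFilter( HTMLStripCharFilterFactory.class )
                .tokenFilter( LowerCaseFilterFactory.class )
                .tokenFilter( SnowballPorterFilterFactory.class )
                        .param( "language", "English" )
                .tokenFilter( ASCIIFoldingFilterFactory.class );
    }
}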

Or like this for Elasticsearch:

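import org.hibernate.search.backend.elasticsearch.analysis.ElasticsearchAnalysisConfigurationContext;
import org.hibernate.search.backend.elasticsearch.analysis.ElasticsearchAnalysisConfigurer;
import org.hibernate.search.engine.backend.analysis.AnalyzerNames;

public class MyElasticsearchAnalysisConfigurer implements ElasticsearchAnalysisConfigurer {
    @Override
    public void configure(ElasticsearchAnalysisConfigurationContext context) {
        // Naming the analyzer AnalyzerNames.DEFAULT replaces the built-in default.
        context.analyzer( AnalyzerNames.DEFAULT ).custom()
                .tokenizer( "standard" )
                .charFilters( "html_strip" )
                .tokenFilters( "lowercase", "snowball_english", "asciifolding" );

        // "snowball_english" is a custom token filter, so it must be defined as well.
        context.tokenFilter( "snowball_english" )
                .type( "snowball" )
                .param( "language", "English" );
    }
}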

For more information about built-in analyzers, including the default one, see this section of the documentation for Lucene, or this one for Elasticsearch.

For more information about defining custom analyzers, see this section of the documentation for Lucene, or this one for Elasticsearch.
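
Neither configurer above takes effect until it is registered with the backend. Here is a minimal sketch of that step, under two assumptions to check against the documentation for your version: that the configurer is referenced through the hibernate.search.backend.analysis.configurer property using a "class:" bean reference, and that the configurer lives in a package called com.example (a made-up name for illustration). The helper class AnalysisConfigurerSettings is also hypothetical and only builds the settings.

```java
import java.util.Properties;

// Hypothetical helper, not part of Hibernate Search: it only builds the settings.
class AnalysisConfigurerSettings {

    // Assumed property key for the analysis configurer of the default backend.
    static final String CONFIGURER_KEY = "hibernate.search.backend.analysis.configurer";

    static Properties searchSettings() {
        Properties settings = new Properties();
        // "com.example" is a placeholder package; use the real package of your configurer.
        settings.setProperty( CONFIGURER_KEY, "class:com.example.MyLuceneAnalysisConfigurer" );
        return settings;
    }
}
```

The same key/value pair could equally go in persistence.xml or in your framework's configuration file. Once the configurer is picked up, a plain @FullTextField with no analyzer attribute should use the overridden default.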
