Changing the analyzer used by @FullTextField by default

[Moderator note: topic moved from Hibernate Search 6 doesn't support BigDecimal primary keys?]

Thanks. It works now when I just use @DocumentId.

Is there also a way to give @FullTextField the same global treatment?

I.e., can I configure Hibernate Search 6 to always use a specific analyzer (we use only one analyzer in our application), rather than being forced to duplicate the same analyzer name in every annotation?

E.g., I’d like to do this:

...
@FullTextField private String foo;
@FullTextField private String bar;
...

rather than this:

...
@FullTextField(analyzer = "myAnalyzer") private String foo;
@FullTextField(analyzer = "myAnalyzer") private String bar;
...

You can override the default analyzer, which is the one used by @FullTextField when you don’t specify the analyzer.

To override the default analyzer, just define a custom analyzer named "default" (you can use the constant AnalyzerNames.DEFAULT instead of the string literal).

Something like this:

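import org.apache.lucene.analysis.charfilter.HTMLStripCharFilterFactory;
import org.apache.lucene.analysis.core.LowerCaseFilterFactory;
import org.apache.lucene.analysis.miscellaneous.ASCIIFoldingFilterFactory;
import org.apache.lucene.analysis.snowball.SnowballPorterFilterFactory;
import org.apache.lucene.analysis.standard.StandardTokenizerFactory;
import org.hibernate.search.backend.lucene.analysis.LuceneAnalysisConfigurationContext;
import org.hibernate.search.backend.lucene.analysis.LuceneAnalysisConfigurer;
import org.hibernate.search.engine.backend.analysis.AnalyzerNames;

public class MyLuceneAnalysisConfigurer implements LuceneAnalysisConfigurer {
    @Override
    public void configure(LuceneAnalysisConfigurationContext context) {
        // Redefine the analyzer named "default", used by @FullTextField when no analyzer is given
        context.analyzer( AnalyzerNames.DEFAULT ).custom()
                .tokenizer( StandardTokenizerFactory.class )
                .charFilter( HTMLStripCharFilterFactory.class )
                .tokenFilter( LowerCaseFilterFactory.class )
                .tokenFilter( SnowballPorterFilterFactory.class )
                        .param( "language", "English" )
                .tokenFilter( ASCIIFoldingFilterFactory.class );
    }
}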

Or like this for Elasticsearch:

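import org.hibernate.search.backend.elasticsearch.analysis.ElasticsearchAnalysisConfigurationContext;
import org.hibernate.search.backend.elasticsearch.analysis.ElasticsearchAnalysisConfigurer;
import org.hibernate.search.engine.backend.analysis.AnalyzerNames;

public class MyElasticsearchAnalysisConfigurer implements ElasticsearchAnalysisConfigurer {
    @Override
    public void configure(ElasticsearchAnalysisConfigurationContext context) {
        // Redefine the analyzer named "default", used by @FullTextField when no analyzer is given
        context.analyzer( AnalyzerNames.DEFAULT ).custom()
                .tokenizer( "standard" )
                .charFilters( "html_strip" )
                .tokenFilters( "lowercase", "snowball_english", "asciifolding" );

        // "snowball_english" is not built in: define it as a snowball filter for English
        context.tokenFilter( "snowball_english" )
                .type( "snowball" )
                .param( "language", "English" );
    }
}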
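
Note that the configurer only takes effect once it is registered with the backend. Here is a minimal sketch of how that might look when you pass settings programmatically, assuming the property key `hibernate.search.backend.analysis.configurer` (the key used for the default backend in Hibernate Search 6.0 final; earlier betas used a per-backend key, so check your version) and the `class:` bean reference prefix. The `AnalysisSettings` helper and the `com.example` package are hypothetical names used only for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: builds the settings that point Hibernate Search at an analysis configurer.
final class AnalysisSettings {

    // Assumed property key for the default backend; verify it against your Hibernate Search version.
    static final String CONFIGURER_KEY = "hibernate.search.backend.analysis.configurer";

    private AnalysisSettings() {
    }

    // Returns a map that can be merged into the properties passed to your persistence unit.
    static Map<String, Object> forConfigurer(String configurerClassName) {
        Map<String, Object> settings = new HashMap<>();
        // "class:" asks Hibernate Search to resolve the value as a fully-qualified class name.
        settings.put( CONFIGURER_KEY, "class:" + configurerClassName );
        return settings;
    }
}
```

For example, `AnalysisSettings.forConfigurer( "com.example.MyLuceneAnalysisConfigurer" )` produces a single entry that you could add to the properties map given to `Persistence.createEntityManagerFactory`. Setting the same key and value in `persistence.xml` or `hibernate.properties` should work as well.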

For more information about built-in analyzers, including the default one, see this section of the documentation for Lucene, or this one for Elasticsearch.

For more information about defining custom analyzers, see this section of the documentation for Lucene, or this one for Elasticsearch.