Can I prevent Hibernate Search and Lucene from indexing fields with @Id?

Hey there,
I am using the following dependencies:

'org.hibernate:hibernate-search-orm:5.11.5.Final'
'org.apache.lucene:lucene-sandbox:5.5.5'
'org.springframework:spring-context-indexer:5.2.8.RELEASE'

I have the following superclass for my entities, so I do not have to add an id field for each class manually:

import javax.persistence.Column;
import javax.persistence.GeneratedValue;
import javax.persistence.Id;
import javax.persistence.MappedSuperclass;
import lombok.AllArgsConstructor;
import lombok.Getter;
import lombok.NoArgsConstructor;
import org.hibernate.annotations.GenericGenerator;
import org.hibernate.annotations.Type;

import java.util.UUID;

@Getter
@MappedSuperclass
@NoArgsConstructor
@AllArgsConstructor
public abstract class TableModelAutoId extends TableModel {

  @Id
  @Type(type = "uuid-char")
  @GeneratedValue(generator = "UUID")
  @Column(name = "id", updatable = false, nullable = false, unique = true)
  @GenericGenerator(name = "UUID", strategy = "org.hibernate.id.UUIDGenerator")
  private UUID id;
}

I am using @Indexed on a few of my entities. My question is the following:
Can I prevent Hibernate Search / Lucene from indexing the id field? Normally indexing it is a good idea, but in this case it does not make much sense, because it is a UUID…

I hope you can help me; I have not found any solution so far…

TNT2k

Hibernate Search needs a unique identifier for each document. Without it:

  • Hibernate Search wouldn’t be able to retrieve existing documents when they need to be updated.
  • Hibernate Search wouldn’t be able to map documents back to their “source” entity, so that when you search, you get managed entities as a result.
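To illustrate the two points above, here is a minimal, self-contained sketch in plain Java. It is not Hibernate Search or Lucene code: `ToyIndex` and its methods are hypothetical names, and the "index" is just a map. It only shows why updating, deleting, and mapping hits back to entities all rely on a stored document identifier.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.UUID;

// Toy stand-in for a full-text index (NOT Hibernate Search or Lucene code).
final class ToyIndex {

    // Document identifier -> indexed text. A real index stores analyzed fields.
    private final Map<UUID, String> documents = new LinkedHashMap<>();

    // Adds a document, or replaces the one already stored under the same id.
    void addOrUpdate(UUID entityId, String text) {
        documents.put(entityId, text);
    }

    // Removes the document of a deleted entity; only possible because of the id.
    boolean delete(UUID entityId) {
        return documents.remove(entityId) != null;
    }

    // Returns the ids of matching documents so the caller can load the entities.
    List<UUID> search(String term) {
        String needle = term.toLowerCase(Locale.ROOT);
        List<UUID> hits = new ArrayList<>();
        for (Map.Entry<UUID, String> entry : documents.entrySet()) {
            if (entry.getValue().toLowerCase(Locale.ROOT).contains(needle)) {
                hits.add(entry.getKey());
            }
        }
        return hits;
    }

    int size() {
        return documents.size();
    }
}
```

Without the identifier as the key, a second `addOrUpdate` for the same entity could only append a duplicate document, and `search` would have nothing to hand back for loading the entity.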

Why exactly does indexing a UUID bother you?

If you want to use another unique property as the document identifier, you can annotate that property with @DocumentId.
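As a rough mapping sketch of that option for Hibernate Search 5, assuming a hypothetical `Article` entity with a unique, immutable `isbn` property (the entity and its field names are illustrative and not taken from your model):

```java
import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.Id;

import org.hibernate.search.annotations.DocumentId;
import org.hibernate.search.annotations.Field;
import org.hibernate.search.annotations.Indexed;

// Hypothetical entity, for illustration only.
@Entity
@Indexed
public class Article {

  // Database identifier; not used as the document identifier here.
  @Id
  @GeneratedValue
  private Long id;

  // Assumed to be unique and never updated, so it can identify the document.
  @DocumentId
  @Column(unique = true, nullable = false, updatable = false)
  private String isbn;

  // Regular full-text field.
  @Field
  private String title;
}
```

This only changes which property identifies the document; one identifier is still written to the index for every entity.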

If you’re using Elasticsearch and what’s bothering you is the id property that Hibernate Search adds automatically to every document, I’m afraid you cannot solve that in Hibernate Search 5. On the bright side, Hibernate Search 6 no longer does that, so you might want to upgrade.

If there is another reason, please explain so that I can help.

It fills up my index with stuff I do not need (and past versions of Hibernate did not do that, though I do not know exactly which versions).

Using @DocumentId on another property does not work either, because the only things I want to have in my index need special analyzers to work (and a @DocumentId property cannot have an @Field annotation as well).

I am using Lucene; here is a section of my class annotations (I use different analyzers for indexing and searching):

@Indexed
@AnalyzerDef(
        name = "analyzer",
        tokenizer = @TokenizerDef(factory = WhitespaceTokenizerFactory.class),
        filters = {
                @TokenFilterDef(factory = LowerCaseFilterFactory.class),
                @TokenFilterDef(
                        factory = NGramFilterFactory.class,
                        params = {@Parameter(name = "maxGramSize", value = "31")}
                )
        }
)
@AnalyzerDef(
        name = "analyzer-query",
        tokenizer = @TokenizerDef(factory = WhitespaceTokenizerFactory.class),
        filters = {
                @TokenFilterDef(factory = LowerCaseFilterFactory.class)
        }
)

As I explained above, if you need to update/delete documents, or if you need to retrieve the entity corresponding to a document, then you need to index the document identifier. This is not “stuff you do not need”.
Without an indexed document identifier, it’s simply not possible to update/delete documents.

Hibernate Search has been indexing document identifiers for as long as I can remember. If you feel there’s been a change, please give more details about the state of the index in previous versions, and the state of the index in recent versions (field names, content, …).

With “stuff I do not need” I meant “No one will ever query an entity by its ID with Hibernate Search”. But OK, then I will just keep the ID.

I do not know which versions did that, so I would have to go back through a lot of history to find out.

If it is important to you, I will do it; otherwise not, because it will take a bit of time…

Best
TNT2k

No, that’s alright: if you can live with keeping the ID in the index, there’s no need for more investigation as far as I’m concerned. Thank you.

I have to thank you!
Well, then I will keep the ID for you, although I do not use it in my queries myself, but that is OK. It would have been nice if I could have removed it from my index, but so be it.