Hibernate Search 6: manually create / drop mapping

Hello,

Actually, when you use the lifecycle strategy with the “none” argument, you can’t use the massIndexer() to create new indexes, because by default the mass indexer tries to purge everything first.
With purgeAllOnStart(false) it works, but documents can be duplicated (as specified in the documentation :+1:).
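
For reference, here is a rough sketch of the kind of call I mean, assuming the Hibernate Search 6 ORM mapper (the Book entity is just an example, and entry points may differ slightly between alpha releases):

    import javax.persistence.EntityManager;

    import org.hibernate.search.mapper.orm.Search;
    import org.hibernate.search.mapper.orm.session.SearchSession;

    public class Reindexer {

        // Rebuilds the index for Book without purging it first:
        // this avoids the purge failing under the "none" lifecycle strategy,
        // but may leave duplicate documents behind.
        public void reindexBooks(EntityManager entityManager) throws InterruptedException {
            SearchSession searchSession = Search.session( entityManager );
            searchSession.massIndexer( Book.class )
                    .purgeAllOnStart( false )
                    .startAndWait();
        }
    }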

I know the documentation of the alpha is currently not complete, but I would like to manage the creation and deletion of my mapping programmatically.

Can you give me an example, please? :slight_smile:


With purgeAllOnStart(false) it works

Be careful. This only works because Elasticsearch, by default, will create indexes and their mapping dynamically when it receives indexing requests for indexes that don’t exist yet. It will try to guess the type of fields, and very often, it will guess at least something wrong.

You really should not rely on that feature, and should instead create each Elasticsearch index in advance.

However, it seems that, for some reason, you don’t want to use the create lifecycle strategy. This means you will have to create indexes manually.
Unfortunately, there currently isn’t any way to create the indexes manually through Hibernate Search.

Currently, your options are:

  1. Use the create or update lifecycle strategy in your Elasticsearch backend, so that indexes are created on startup (see the configuration sketch after this list).
  2. Or bypass Hibernate Search and manually send HTTP requests to Elasticsearch to create the indexes.
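
For the first option, the lifecycle strategy is just a configuration property of the Elasticsearch backend. A sketch of what it could look like in hibernate.properties is below; the exact property keys have changed between 6.0 pre-releases, so double-check them against the reference documentation of the version you are using (the backend name "myBackend" is just an example):

    # Create missing indexes (and their mappings) on startup
    hibernate.search.backends.myBackend.index_defaults.lifecycle.strategy = create
    # Or, to also attempt incremental mapping updates on startup:
    # hibernate.search.backends.myBackend.index_defaults.lifecycle.strategy = update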

The second solution is obviously much harder, because you have to know what the mappings will be.

Unless you have very deep knowledge of what your Elasticsearch mapping will look like, I would recommend starting Hibernate Search with the create lifecycle strategy in a development environment, then getting the mappings from the Elasticsearch clusters and storing them in JSON files.
Then, in your application, you can send HTTP requests to the Elasticsearch cluster to create indexes before you do anything else, using the create index REST API.

To send requests to Elasticsearch, for now the best solution would be to create your own client. Hibernate Search does have APIs to access its internal REST client, but they are not currently accessible in the Hibernate ORM integration. I opened HSEARCH-3640 to fix that. In the meantime, I’d recommend the official, low-level REST client or a community client such as Jest. There is also an official, high-level REST client, but I’ve never used it.
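
For instance, with the official low-level REST client, creating an index from a mapping you previously exported to a JSON file could look roughly like the sketch below (the "book" index name, the file name and the connection details are only placeholders):

    import java.nio.charset.StandardCharsets;
    import java.nio.file.Files;
    import java.nio.file.Paths;

    import org.apache.http.HttpHost;
    import org.elasticsearch.client.Request;
    import org.elasticsearch.client.RestClient;

    public class IndexCreator {

        // Creates the "book" index using the settings/mappings stored in book-index.json.
        public void createBookIndex() throws Exception {
            try ( RestClient client = RestClient.builder( new HttpHost( "localhost", 9200, "http" ) ).build() ) {
                String indexDefinition = new String(
                        Files.readAllBytes( Paths.get( "book-index.json" ) ),
                        StandardCharsets.UTF_8 );

                // PUT /book with the index definition as the request body (create index API)
                Request request = new Request( "PUT", "/book" );
                request.setJsonEntity( indexDefinition );
                client.performRequest( request );
            }
        }
    }

Run something like this before Hibernate Search boots, so that the index and its mapping already exist when indexing starts.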

In the next releases, we could consider exposing APIs so that you can manually trigger the creation of mappings through Hibernate Search. However, we’ll need more information on your use case:

  1. Why is the create lifecycle strategy not appropriate in your case?
  2. When do you want to create the indexes/mappings? On startup? On another event? Which one?
  3. Do you also need to delete the indexes? When?

Thanks for all the tips. I was considering using the high-level REST client, since I’m already using it to get my search results with highlight and aggregation support.

In fact, the update strategy works nicely when you just add fields to the mapping; if you try to update an existing one, it fails with an error like this:

"error": {
    "root_cause": [
      {
        "type": "illegal_argument_exception",
        "reason": "Mapper for [keywords] conflicts with existing mapping:\n[mapper [keywords] has different [store] values, mapper [keywords] has different [doc_values] values]"
      }
    ],
    "type": "illegal_argument_exception",
    "reason": "Mapper for [keywords] conflicts with existing mapping:\n[mapper [keywords] has different [store] values, mapper [keywords] has different [doc_values] values]"
  },

To get that error, I just changed the annotation of my String field from

@GenericField

to

@GenericField(projectable = Projectable.YES, sortable = Sortable.YES)

I don’t really know if that specific case can happen in real life, but I’m sure I will have to update the mapping of one of my documents one day.
I want to be able to manage that as part of the installation process of a specific version.

My best option for now is to drop the index beforehand with the high-level REST client when I know the update lifecycle will probably fail, and then, after the bootstrap is complete, trigger my fullIndexer.

Yes, it will. And it’s not a bug, it’s expected behavior: you asked Elasticsearch to store more data in the index for an existing field, so you need to reindex that field.

The update strategy can only work if you add new fields, and even then it won’t magically add the data to the index: it will just update the schema, and leave the new fields empty.

Exactly this. Your best option if your mapping changes is to drop the indexes completely before you even start your application, let Hibernate Search create the new mapping on startup, and then reindex everything.
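
Since you already use the high-level REST client, the “drop before startup” step could be as simple as the sketch below (index name and connection details are placeholders); run it before bootstrapping Hibernate Search, then trigger the mass indexer once startup is complete:

    import org.apache.http.HttpHost;
    import org.elasticsearch.action.admin.indices.delete.DeleteIndexRequest;
    import org.elasticsearch.client.RequestOptions;
    import org.elasticsearch.client.RestClient;
    import org.elasticsearch.client.RestHighLevelClient;

    public class IndexDropper {

        // Drops the "book" index so that Hibernate Search can recreate it
        // with the new mapping on the next startup.
        public void dropBookIndex() throws Exception {
            try ( RestHighLevelClient client = new RestHighLevelClient(
                    RestClient.builder( new HttpHost( "localhost", 9200, "http" ) ) ) ) {
                client.indices().delete( new DeleteIndexRequest( "book" ), RequestOptions.DEFAULT );
            }
        }
    }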
