Help! Dynamic Fields with Integer values

Hello,

I’m currently running Hibernate 5.2.15.Final, Hibernate Search 5.10.5.Final, and Elasticsearch 6.4. I have an issue similar to the one described in “Sorting on dynamic fields”.

This has nothing to do with sorting, but rather with the data type created in Elasticsearch/Lucene. I’m creating an index field with a dynamic name based on the productId, as follows:

@IndexedEmbedded
  @Field( index = Index.YES, store = Store.NO, bridge = @FieldBridge( impl = IntegerAttributeDescriptionFieldBridge.class ), analyze = Analyze.NO )
  public IntegerAttribute getIntegerAttribute()
  {
    if ( getIntegerAttributeValue() != null )
    {
      IntegerAttribute value = new IntegerAttribute();
      value.setId( getAttributeId() ); // UUID
      value.setValue( getIntegerAttributeValue() ); // Integer
      return value;
    }
    return null;
  }

In the field bridge, I have the following code:

@Override
  public void set( String name, Object value, Document document, LuceneOptions luceneOptions )
  {
    if ( value != null )
    {
      IntegerAttribute integerIndex = (IntegerAttribute)value;
      String dynamicFieldName = buildName( name, integerIndex.getId() );
      Integer fieldValue = integerIndex.getValue();

      log.info( "indexing attribute '{}' with value {} and trying to add with dynamic field name '{}'", name, fieldValue, dynamicFieldName );
      // The call below stores the value as an integer, but I want the field name to be the ID
      luceneOptions.addNumericFieldToDocument( name, fieldValue, document );
      // This call creates the desired field name, but the value ends up stored as a String
      luceneOptions.addNumericFieldToDocument( dynamicFieldName, fieldValue, document );
    }
  }

  private String buildName( String name, UUID id )
  {
    return name.substring( 0, name.lastIndexOf( "." ) + 1 ).concat( id.toString() );
  }
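
For reference, here is a standalone copy of buildName with two sample inputs. The class name and the "attributes." prefix are made up for illustration; only the method body comes from the bridge above. It shows that a name without a dot collapses to the bare UUID, which is why the document below ends up with a top-level field named after the ID.

```java
import java.util.UUID;

/** Standalone copy of the bridge's name-building logic (class name is hypothetical). */
final class DynamicFieldNameDemo {

    /** Keeps everything up to and including the last '.', then appends the ID. */
    static String buildName(String name, UUID id) {
        return name.substring(0, name.lastIndexOf('.') + 1).concat(id.toString());
    }

    public static void main(String[] args) {
        UUID id = UUID.fromString("04976542-0c7a-4796-ba30-a755af916002");
        // No dot in the name: the prefix is empty, so the result is just the UUID.
        System.out.println(buildName("integerAttribute", id));
        // Hypothetical embedded path: the prefix "attributes." is kept.
        System.out.println(buildName("attributes.integerAttribute", id));
    }
}
```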

The problem is that when I create the dynamic field, no matter how I define the field type, it ends up being created as text rather than as an int. If I use the default name passed in, the field type is correctly created as an int. Unfortunately, I have to perform range queries against collections where the field name is the ID of the product, so the default name doesn’t work for me. The code above produces the following JSON in Elasticsearch:

{
   "id": "54976542-0c7a-4796-ba30-a755af916123",
   "integerAttribute": 211,
   "04976542-0c7a-4796-ba30-a755af916002": "211"
}

What I’m looking for is:

{
   "id": "54976542-0c7a-4796-ba30-a755af916123",
   "04976542-0c7a-4796-ba30-a755af916002": 211
}

The latter allows me to do range queries. What gives? Is this something that’s even possible? It looks like it should be? Please advise!

Thanks!

Yes, that’s expected. The experimental Elasticsearch integration in Search 5.x has some limitations, and this is one of them: dynamic fields cannot be assigned a type and are always interpreted as strings.

If you really need dynamic integer fields, the only solution I can think of would be to force Elasticsearch to interpret the value as an integer. This can be done by manually configuring a dynamic template in your index, after the mapping is created but before you try to index anything: https://www.elastic.co/guide/en/elasticsearch/reference/6.4/dynamic-templates.html
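
A minimal sketch of what such a template could look like, written as a Java helper that builds the request body. The class name, the template name "uuid_fields_as_integer" and the UUID regex are assumptions for illustration; dynamic_templates, match_mapping_type, match_pattern, match and mapping are the keys described on the page linked above. The idea is that any new string-valued field whose name looks like a UUID gets mapped as an integer instead of text.

```java
import java.util.regex.Pattern;

/** Sketch: a dynamic template that maps UUID-named string fields to integers. */
final class UuidIntegerTemplate {

    // Assumed naming scheme: field names are lowercase UUIDs, like the product IDs above.
    static final String UUID_FIELD_REGEX =
            "^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$";

    /** JSON body for a put-mapping request; the template name is made up. */
    static String mappingBody() {
        return "{"
                + "\"dynamic_templates\":[{"
                + "\"uuid_fields_as_integer\":{"
                + "\"match_mapping_type\":\"string\","
                + "\"match_pattern\":\"regex\","
                + "\"match\":\"" + UUID_FIELD_REGEX + "\","
                + "\"mapping\":{\"type\":\"integer\"}"
                + "}}]}";
    }

    /** Local check of which field names the regex would select. */
    static boolean matchesTemplate(String fieldName) {
        return Pattern.matches(UUID_FIELD_REGEX, fieldName);
    }

    public static void main(String[] args) {
        System.out.println(mappingBody());
    }
}
```

On 6.4 the body would typically be sent with a PUT to the index's _mapping endpoint for the relevant type; the index and type names depend on your setup, so they are left out here.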

Wow - fast reply. I appreciate that! I’ll take a look at the dynamic mappings you suggest. Thanks!

Well, that solution technically worked; however, it appears it won’t fix my scenario, as there could be hundreds of thousands of products (each one adding a distinct field mapping based on the product ID). That is not scalable, as per the docs:

https://www.elastic.co/guide/en/elasticsearch/reference/6.4/mapping.html#mapping-limit-settings

Is there a different way of searching for these docs (the attributes are represented as a collection) that doesn’t run into the problem below? With String/text values I’m actually able to concatenate the ID and the value into a combined field to get a match on an exact attribute, but with a number - not so much.

Using a BooleanQuery, searching for a productId of 00000000-0000-0000-0000-000000000000 and an attribute value of 5000 would return ‘true’ in the following INVALID scenario:

[
	{
		"productId": "00000000-0000-0000-0000-000000000000",
		"attributeValue": 405
	},
	{
		"productId": "11111111-1111-1111-1111-111111111111",
		"attributeValue": 5000
	}
]

I need the search to find all docs where the productId of 00000000-0000-0000-0000-000000000000 has an attribute value of 5000.
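
To make the false positive concrete, here is a small plain-Java model of the two matching behaviours (no Elasticsearch involved; all class and method names are made up). As far as I understand, an array of plain objects is flattened into one multi-valued field per property, so each condition of the boolean query can be satisfied by a different entry. What I want is the second method, where both conditions must hold on the same entry.

```java
import java.util.Arrays;
import java.util.List;

/** Sketch: flattened matching versus per-entry matching on the sample data. */
final class CrossMatchDemo {

    /** Hypothetical stand-in for one entry of the attribute collection. */
    static final class Attribute {
        final String productId;
        final int attributeValue;

        Attribute(String productId, int attributeValue) {
            this.productId = productId;
            this.attributeValue = attributeValue;
        }
    }

    /** Flattened semantics: each condition may be satisfied by a different entry. */
    static boolean matchesFlattened(List<Attribute> attributes, String productId, int value) {
        boolean idFound = attributes.stream().anyMatch(a -> a.productId.equals(productId));
        boolean valueFound = attributes.stream().anyMatch(a -> a.attributeValue == value);
        return idFound && valueFound;
    }

    /** Per-entry semantics: both conditions must hold on the same entry. */
    static boolean matchesPerEntry(List<Attribute> attributes, String productId, int value) {
        return attributes.stream()
                .anyMatch(a -> a.productId.equals(productId) && a.attributeValue == value);
    }

    /** The two entries from the INVALID scenario above. */
    static List<Attribute> sample() {
        return Arrays.asList(
                new Attribute("00000000-0000-0000-0000-000000000000", 405),
                new Attribute("11111111-1111-1111-1111-111111111111", 5000));
    }

    public static void main(String[] args) {
        String id = "00000000-0000-0000-0000-000000000000";
        System.out.println(matchesFlattened(sample(), id, 5000)); // the unwanted match
        System.out.println(matchesPerEntry(sample(), id, 5000));  // the behaviour I need
    }
}
```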

Any help would be greatly appreciated - I’m sure I’m not the first to run into this scenario - but I’m a bit raw on the Elasticsearch side of things. Perhaps this should be a new thread - sorry about that.

…and…change this from an Object to a nested data type (yeah, just did some extra googling)?

If the class declaring the “productId” and “attributeValue” fields is an entity, you could consider putting an @Indexed annotation on that class and an @IndexedEmbedded on the association from that class to the one you’re currently indexing. Then you would just query that class.

If that’s not possible, I’m afraid I’m out of solutions with Hibernate Search 5. Yes, you could hack your way into a solution by using the nested datatype, but Search 5 doesn’t support it. So that would require you to override the schema defined by Hibernate Search, and thus to manage the schema by hand from now on. Then you would have to use a nested query when querying, which is not supported by the Hibernate Search DSL, which basically means you would have to use ElasticsearchQueries.fromJson and define your whole query as JSON (not just the part about the nested field).
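
To give an idea of what that hand-written JSON could look like, here is a rough sketch that builds a nested query combining an exact match on the ID with a range on the value. The class name, the path "attributes" and the sub-field names are assumptions taken from the example above; it also assumes you have mapped that path as a nested type yourself and that productId is not analyzed.

```java
import java.util.Locale;

/** Sketch: builds the JSON for a nested query on one (productId, value range) pair. */
final class NestedAttributeQuery {

    /**
     * @param path      hypothetical name of the nested field holding the collection
     * @param productId exact product ID to match inside a single nested entry
     * @param min       lower bound of the value range, inclusive
     * @param max       upper bound of the value range, inclusive
     */
    static String json(String path, String productId, int min, int max) {
        return String.format(Locale.ROOT,
                "{\"query\":{\"nested\":{\"path\":\"%1$s\",\"query\":{\"bool\":{\"filter\":["
                        + "{\"term\":{\"%1$s.productId\":\"%2$s\"}},"
                        + "{\"range\":{\"%1$s.attributeValue\":{\"gte\":%3$d,\"lte\":%4$d}}}"
                        + "]}}}}}",
                path, productId, min, max);
    }

    public static void main(String[] args) {
        System.out.println(json("attributes", "00000000-0000-0000-0000-000000000000", 5000, 5000));
    }
}
```

The resulting string is the kind of thing you would pass to ElasticsearchQueries.fromJson, and then hand the returned descriptor to FullTextSession.createFullTextQuery along with your entity class.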

With Hibernate Search 6, support for the nested datatype is built-in, so you could definitely do what you want (not with dynamic fields at the moment, but with the nested datatype).
Search 6 is reasonably well tested and the code is very clean, but technically it’s still an Alpha, so it’s only a good fit at the moment if you’re fine with backward-incompatible API changes in every release, and if you’re fine with being the first to discover problems in Search 6, knowing they may not be solved in a timely manner.
We’d love you to discover and report problems, but your customers might not :slight_smile:

Unfortunately the attribute mappings are a much smaller part of a larger object graph - so I don’t see that working out for me. My timeline, however, is not completely defined. So it might be possible to wait until Hibernate Search 6 is in a more stable place. I know this is a completely unfair question to pose, and any answer you might provide would likely be a large SWAG, but do you have any idea on an approximate date for a GA release on version 6? Sorry to ask, but you know :slight_smile:

Otherwise - you’ve been extremely helpful and quick with your responses. I really appreciate that!

Cheers!

No ETA for a GA release at the moment, sorry. There’s just too much to do to give even an approximate ETA.

We may start releasing Betas by September, if everything goes as planned. Meaning the implementation code will still be in flux, but most core features will be implemented and most APIs will be considered satisfying enough that we won’t actively try to change them, unless we discover new problems… which may still happen.

Cheers.