Hibernate Search 6 suggesters + completion type

Hello,

Is there any way to map a field with a custom type, such as “completion”?

I would like to implement this feature:
https://www.elastic.co/guide/en/elasticsearch/reference/current/search-suggesters-completion.html

Note that you can also reproduce most of the behavior of a suggester by simply declaring a text field with an appropriate analyzer (with an edge-ngram filter, in particular), then running search queries with another analyzer (one that doesn’t use the edge-ngram filter).
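To make the edge-ngram idea concrete, here is a minimal, library-free sketch of what the two analyzers do. It is not Hibernate Search or Elasticsearch code: the class name, the whitespace tokenization, and the gram sizes (1 to 10) are all assumptions chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Toy model of "edge-ngram at index time, plain analyzer at query time".
final class EdgeNGramSketch {

    // Index-time analysis: lowercase, split on whitespace, then emit every
    // prefix of each token whose length is between minGram and maxGram.
    static List<String> indexTerms(String text, int minGram, int maxGram) {
        List<String> terms = new ArrayList<>();
        for (String token : queryTerms(text)) {
            int limit = Math.min(maxGram, token.length());
            for (int length = minGram; length <= limit; length++) {
                terms.add(token.substring(0, length));
            }
        }
        return terms;
    }

    // Query-time analysis: same tokenization, but without the edge-ngram step.
    static List<String> queryTerms(String text) {
        List<String> tokens = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).trim().split("\\s+")) {
            if (!token.isEmpty()) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    // A document matches when every query term is one of its indexed terms.
    static boolean matches(String documentText, String userInput) {
        return indexTerms(documentText, 1, 10).containsAll(queryTerms(userInput));
    }

    public static void main(String[] args) {
        System.out.println(indexTerms("Hibernate Search", 1, 10));
        System.out.println(matches("Hibernate Search", "hib sea"));
        System.out.println(matches("Hibernate Search", "ibern"));
    }
}
```

Because only the indexed text is expanded into prefixes, a partially typed word matches the start of a token but not its middle, which is the behavior most autocomplete boxes need.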

But if you really want the suggester API…

You can map the type with a custom bridge, and make that bridge declare a native type for the field: that way, you can precisely define the field mapping using JSON.

Suggesters are not supported yet in the query DSL, though, so you will have to use an external client for search queries.

First, implement a custom bridge:

public class MySuggesterBridge implements ValueBridge<String, String> {
	@Override
	public StandardIndexFieldTypeContext<?, String> bind(ValueBridgeBindingContext<String> context) {
		return context.getTypeFactory().extension( ElasticsearchExtension.get() )
				.asNative( "{\"type\": \"completion\"}" );
	}

	@Override
	public String toIndexedValue(String value, ValueBridgeToIndexedValueContext context) {
		// The "native" field type requires you to perform the JSON conversion yourself. Ideally you should use a dedicated library to properly escape the strings.
		return value == null ? null : "['" + value + "']";
	}

	@Override
	public String cast(Object value) {
		return (String) value;
	}
}

Then reference that bridge in your mapping:

@GenericField(name = "myTextProperty_suggest", valueBridge = @ValueBridgeRef(type = MySuggesterBridge.class))
String myTextProperty;
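Since the search side has to go through an external client, here is a rough sketch of what a completion suggester request could look like using only the JDK (Java 11+ for `java.net.http`). The base URL, the index name `myindex`, and the suggestion name `title-suggest` are hypothetical; only the field name comes from the mapping above. The request is built but not sent.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Sketch: builds (but does not send) a completion suggester request.
final class SuggestRequestSketch {

    // Minimal JSON string escaping; a real JSON library is preferable.
    static String quote(String value) {
        StringBuilder out = new StringBuilder("\"");
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            if (c == '"' || c == '\\') {
                out.append('\\').append(c);
            } else if (c < 0x20) {
                out.append(String.format("\\u%04x", (int) c));
            } else {
                out.append(c);
            }
        }
        return out.append('"').toString();
    }

    // Body of a _search request containing a single completion suggester.
    static String suggestBody(String suggestionName, String field, String prefix, int size) {
        return "{\"suggest\":{" + quote(suggestionName) + ":{"
                + "\"prefix\":" + quote(prefix) + ","
                + "\"completion\":{\"field\":" + quote(field) + ",\"size\":" + size + "}"
                + "}}}";
    }

    // POST <baseUrl>/<index>/_search with a JSON body.
    static HttpRequest suggestRequest(String baseUrl, String index, String body) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/" + index + "/_search"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
    }

    public static void main(String[] args) {
        String body = suggestBody("title-suggest", "myTextProperty_suggest", "hib", 5);
        System.out.println(suggestRequest("http://localhost:9200", "myindex", body).uri());
        System.out.println(body);
    }
}
```

The request can then be sent with `java.net.http.HttpClient` or any Elasticsearch client; the suggestions come back under the `suggest` key of the response.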

Yes, I had looked into that too, but after reading the article “A detailed comparison between autocompletion strategies in ElasticSearch” by Mourjo Sen (Medium), I found that a suggester may be faster and more flexible.

I suppose it may. It all depends on your requirements :slight_smile:

I probably missed something, but when I try to use the extension with asNative, the return type of asNative doesn’t match the one expected by bind.

Sorry, I forgot about this, but Hibernate Search cannot apply standard parameters (searchable, projectable, …) to native fields, so they actually cannot be used in a value bridge.

You can always use a property bridge, but it will be a bit more complex. It’s not documented at the moment, I really need to find some time to document this…

Define an annotation:

package com.acme;

// ... imports ...

@PropertyBridgeMapping(bridge = @PropertyBridgeRef(builderType = com.acme.MyCompletionFieldBridge.Builder.class))
@Retention(RetentionPolicy.RUNTIME)
@Target({ ElementType.METHOD, ElementType.FIELD })
@Documented
@Repeatable(CompletionField.List.class)
public @interface CompletionField {

	String name();

	@Retention(RetentionPolicy.RUNTIME)
	@Target({ ElementType.METHOD, ElementType.FIELD })
	@Documented
	@interface List {
		CompletionField[] value();
	}
}

Then the bridge:

package com.acme;

// ... imports ...

public class MyCompletionFieldBridge implements PropertyBridge {
	private final String fieldName;

	private IndexFieldReference<String> valueFieldReference;

	private MyCompletionFieldBridge(Builder builder) {
		this.fieldName = builder.fieldName;
	}

	@Override
	public void bind(PropertyBridgeBindingContext context) {
		context.getDependencies().useRootOnly();
		valueFieldReference = context.getIndexSchemaElement().field(
				fieldName, f -> f.extension( ElasticsearchExtension.get() ).asNative( "{\"type\": \"completion\"}" )
		)
				.toReference();
	}

	@Override
	public void write(DocumentElement target, Object bridgedElement, PropertyBridgeWriteContext context) {
		String sourceValue = (String) bridgedElement;
		if ( sourceValue != null ) {
			// The "native" field type requires you to perform the JSON conversion yourself.
			// Ideally you should use a JSON library to properly escape the strings.
			target.addValue( valueFieldReference, "['" + sourceValue + "']" );
		}
	}

	public static class Builder implements AnnotationBridgeBuilder<PropertyBridge, CompletionField> {
		private String fieldName;

		@Override
		public void initialize(CompletionField annotation) {
			fieldName = annotation.name();
		}

		@Override
		public BeanHolder<PropertyBridge> build(BridgeBuildContext buildContext) {
			if ( fieldName == null || fieldName.isEmpty() ) {
				throw new IllegalArgumentException( "fieldName is a mandatory parameter" );
			}
			return BeanHolder.of( new MyCompletionFieldBridge( this ) );
		}
	}

}

Then use the annotation in your mapping:

@CompletionField(name = "myTextProperty_suggest")
String myTextProperty;
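The comment in the bridge recommends proper escaping rather than raw string concatenation. As a stopgap where no JSON library is available, the conversion could look roughly like the sketch below. `JsonArraySketch` and `toJsonArray` are hypothetical names, and the escaping only covers quotes, backslashes, and control characters.

```java
// Sketch: encodes strings as a JSON array, escaping what concatenation would not.
final class JsonArraySketch {

    // Returns e.g. ["first","second"] for the given inputs.
    static String toJsonArray(String... values) {
        StringBuilder out = new StringBuilder("[");
        for (int i = 0; i < values.length; i++) {
            if (i > 0) {
                out.append(',');
            }
            appendQuoted(out, values[i]);
        }
        return out.append(']').toString();
    }

    // Appends one JSON string literal, escaping quotes, backslashes,
    // and control characters below U+0020.
    static void appendQuoted(StringBuilder out, String value) {
        out.append('"');
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            if (c == '"' || c == '\\') {
                out.append('\\').append(c);
            } else if (c < 0x20) {
                out.append(String.format("\\u%04x", (int) c));
            } else {
                out.append(c);
            }
        }
        out.append('"');
    }

    public static void main(String[] args) {
        System.out.println(toJsonArray("Rock 'n' \"Roll\""));
    }
}
```

In the bridge's write method, the concatenated string would then be replaced with something like `JsonArraySketch.toJsonArray( sourceValue )`, so that values containing quotes no longer produce malformed JSON.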

I think we are close :stuck_out_tongue:

Request: PUT /myIndex with parameters {}
Response: 400 'Bad Request' with body 
{
  "error": {
    "root_cause": [
      {
        "type": "mapper_parsing_exception",
        "reason": "No type specified for field [title_suggest]"
      }
    ],
    "type": "mapper_parsing_exception",
    "reason": "Failed to parse mapping [_doc]: No type specified for field [title_suggest]",
    "caused_by": {
      "type": "mapper_parsing_exception",
      "reason": "No type specified for field [title_suggest]"
    }
  },
  "status": 400
}

It seems the type specified in the JSON is not taken into account when the PUT request is sent to create the index.

Here is the JSON mapping held by the index manager implementation just before the PUT:

{
  "properties": {
    "libelle": {
      "type": "keyword",
      "index": true,
      "norms": false,
      "doc_values": false,
      "store": false
    },
    "modifieLe": {
      "type": "date",
      "index": true,
      "doc_values": true,
      "store": true,
      "format": "uuuu-MM-dd\u0027T\u0027HH:mm:ss.SSSSSSSSSZZZZZ"
    },
    "motCle": {
      "type": "keyword",
      "index": true,
      "norms": false,
      "doc_values": true,
      "store": true
    },
    "numero": {
      "type": "keyword",
      "index": true,
      "norms": false,
      "doc_values": false,
      "store": false
    },
    "references": {
      "type": "text",
      "index": true,
      "norms": true,
      "store": false,
      "analyzer": "standard",
      "term_vector": "no"
    },
    "title_suggest": {}
  },
  "dynamic": "strict"
}

This looks like a bug, unfortunately. I created HSEARCH-3641 and am investigating. Hopefully I will be able to release a fix in the next few days.

I think I found why this is happening.

The JSON deserializer uses a PropertyMapping class, and unfortunately the DataType enum associated with the type field doesn’t list the specialized data types:
https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-types.html#_core_datatypes

Edit: lol, I hadn’t seen the content of your bug report :smiley:
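To illustrate the failure mode described here, below is a toy model of it. This is not the actual Hibernate Search code: `KnownType`, `parseType`, and `serialize` are made-up names, and the only assumptions are that unknown enum names are read as null and that null properties are skipped on output.

```java
import java.util.Locale;

// Toy model: an enum-backed "type" property silently loses unknown values.
final class UnknownTypeSketch {

    // Hypothetical closed list of types; "completion" is deliberately missing.
    enum KnownType { KEYWORD, TEXT, DATE }

    // Returns null when the name is not one of the enum constants.
    static KnownType parseType(String name) {
        for (KnownType type : KnownType.values()) {
            if (type.name().equalsIgnoreCase(name)) {
                return type;
            }
        }
        return null;
    }

    // Parses the type, then writes the field mapping back, skipping null properties.
    static String serialize(String typeName) {
        KnownType type = parseType(typeName);
        if (type == null) {
            return "{}";
        }
        return "{\"type\":\"" + type.name().toLowerCase(Locale.ROOT) + "\"}";
    }

    public static void main(String[] args) {
        System.out.println(serialize("keyword"));
        System.out.println(serialize("completion"));
    }
}
```

In this model, a round trip through the enum turns a "completion" mapping into an empty object, which is the same shape as the `"title_suggest": {}` entry in the mapping dump earlier in the thread.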

OK! I monkey-patched your enum :stuck_out_tongue: and it’s working fine!

Here is the generated mapping, as shown in Kibana :wink:

{
  "mapping": {
    "dynamic": "strict",
    "properties": {
      [...],
      "title_suggest": {
        "type": "completion",
        "analyzer": "simple",
        "preserve_separators": true,
        "preserve_position_increments": true,
        "max_input_length": 50
      }
    }
  }
}

I will wait for the fix for HSEARCH-3641.

We released Alpha8, which fixes this bug but also updates the bridge APIs. An official announcement will follow later today as I’m able (I’m traveling).

I’ll drop an updated version of my earlier comment here:

Define an annotation:

package com.acme;

// ... imports ...

@PropertyBinding(binder = @PropertyBinderRef(type = com.acme.MyCompletionFieldBridge.Binder.class))
@Retention(RetentionPolicy.RUNTIME)
@Target({ ElementType.METHOD, ElementType.FIELD })
@Documented
@Repeatable(CompletionField.List.class)
public @interface CompletionField {

	String name();

	@Retention(RetentionPolicy.RUNTIME)
	@Target({ ElementType.METHOD, ElementType.FIELD })
	@Documented
	@interface List {
		CompletionField[] value();
	}
}

Then the bridge and its binder:

package com.acme;

// ... imports ...

public class MyCompletionFieldBridge implements PropertyBridge {
	private final IndexFieldReference<String> valueFieldReference;

	private MyCompletionFieldBridge(IndexFieldReference<String> valueFieldReference) {
		this.valueFieldReference = valueFieldReference;
	}

	@Override
	public void write(DocumentElement target, Object bridgedElement, PropertyBridgeWriteContext context) {
		String sourceValue = (String) bridgedElement;
		if ( sourceValue != null ) {
			// The "native" field type requires you to perform the JSON conversion yourself.
			// Ideally you should use a JSON library to properly escape the strings.
			target.addValue( valueFieldReference, "['" + sourceValue + "']" );
		}
	}

	public static class Binder implements PropertyBinder<CompletionField> {
		private String fieldName;

		@Override
		public void initialize(CompletionField annotation) {
			fieldName = annotation.name();
		}

		@Override
		public void bind(PropertyBindingContext context) {
			if ( fieldName == null || fieldName.isEmpty() ) {
				throw new IllegalArgumentException( "fieldName is a mandatory parameter" );
			}
			context.getDependencies().useRootOnly();
			IndexFieldReference<String> valueFieldReference = context.getIndexSchemaElement().field(
					fieldName,
					f -> f.extension( ElasticsearchExtension.get() )
							.asNative( "{\"type\": \"completion\"}" )
			).toReference();
			context.setBridge( new MyCompletionFieldBridge( valueFieldReference ) );
		}
	}
}

Then use the annotation in your mapping:

@CompletionField(name = "myTextProperty_suggest")
String myTextProperty;

Thank you, I will try it as soon as it’s released!

Hello @yrodiere!
I would like to build the CompletionField with the new Beta6, but I can’t figure out how to make your last example work.
Could you update it, or point me to documentation that can help?

Many thanks!
See ya

I finally found a way :slight_smile:

public class CompletionBinder implements ValueBinder {

	@Override
	public void bind(ValueBindingContext<?> context) {
		context.setBridge(
				String.class,
				new CompletionBridge(),
				context.getTypeFactory()
						.extension(ElasticsearchExtension.get())
						.asNative()
						.mapping("{\"type\": \"completion\"}")
		);
	}

	private static class CompletionBridge implements ValueBridge<String, JsonElement> {
		@Override
		public JsonElement toIndexedValue(String value, ValueBridgeToIndexedValueContext context) {
			return value == null ? null : new JsonPrimitive(value);
		}

		@Override
		public String fromIndexedValue(JsonElement value, ValueBridgeFromIndexedValueContext context) {
			return value == null ? null : value.getAsString();
		}
	}
}

@NonStandardField(name = "myField_completion", valueBinder = @ValueBinderRef(type = CompletionBinder.class))

Glad you found a solution. For the record, bridges are documented in this section and Elasticsearch-specific extensions in this section.

For the record:
Suggesters work really well, but they are not compatible with multi-tenancy strategies, since the tenant ID is enforced through a filter in the search query, which suggesters do not take into account. :sweat_smile:
So I will go back to a standard edge-ngram analyzer for my completion :smiley: