Bridge not working or indexNullAs missing? Search advice needed

I have an entity Product which has a property Manufacturer.

Manufacturer has its own bridge to get the data for index.

When indexing with Hibernate Search, I would like to add a token “manufacturer_not_set” to the Product when its Manufacturer property is null. Later I could find Products that have no Manufacturer just by querying for this token.

Is there a way to get this behavior? Thanks!

P.S. The code I provided is not working as expected: Manufacturer is processed the right way when not null, but nothing happens when it’s null. I put a log.trace() into the bridge method, and the indexer does call the bridge when Manufacturer is null, but the token is not stored.

@Entity @Indexed
public class Product {

    @Id @GeneratedValue(strategy = GenerationType.SEQUENCE)
    private Integer id;

    @Field(index=Index.YES, analyze=Analyze.YES, store=Store.NO,
            bridge = @FieldBridge(impl = ManufacturerBridge.class))
    private Manufacturer manufacturer;

}

public class ManufacturerBridge implements StringBridge {

    @Override
    public String objectToString(final Object object) {
        if (object == null) { return "manufacturer_not_set"; }
        if (object instanceof Manufacturer) {
            return "manufacturer_" + ((Manufacturer) object).getId();
        }
        return "";
    }

}

My questions are:

  1. Should I add “indexNullAs” to the field definition? Can I use my own string such as “manufacturer_not_set”, or is it reserved for predefined values?
  2. This bridge is called during search query execution - why? Isn’t the bridge only meant to extract properties of complex objects and add them to the Lucene index?
  3. For Manufacturer status I have another bridge that returns “manufacturer_enabled” or “manufacturer_disabled”. Later, when I use a boolean junction to build a query, only “must” works with it. If I add a single “should” with the same condition (i.e. match “manufacturer_enabled”), it has no effect and I get all manufacturers. How should I use “should” here?
  4. If I do not set “indexNullAs”, can the bridge handle null itself, i.e. return “manufacturer_not_set” when it gets null as input?

P.S. I lost a few days on this before writing this post, so the only other thing I can think of is to get the source code and step through the indexing process and then the search process, which could take a lot of time.

Thanks a lot!!!

Well… It looks like the default value for “minimumShouldMatchNumber” is zero, which is not so logical if you think about it. But after setting this parameter to 1, the query seems to work.
I need to test more to be sure.
And so far no “indexNullAs” is needed, since I have a bridge that handles null values.

Either solution (indexNullAs, or a bridge that handles null itself) should work, as long as you’re only dealing with text. For numeric fields, things get more complicated, and I’d recommend migrating to Search 6 (in Beta) rather than dealing with the complexity of indexNullAs in Search 5.

The bridge is called during search query execution so that you can pass an integer, which is the type of your ID, and have it translated automatically to the indexed value; in this case 42 (an integer) will be translated to "manufacturer_42" (a string). If Hibernate Search didn’t do this, there simply wouldn’t be any match.
If you want to pass "manufacturer_42" to the query builder directly, you can call .ignoreFieldBridge().
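
To illustrate why the bridge has to run at query time too, here is a minimal sketch in plain Java. It is a toy model, not Hibernate Search code: the `Manufacturer` stand-in, the in-memory index, and the `search` / `searchIgnoringBridge` methods are hypothetical names made up for illustration. Only the bridge logic is copied from the question, so the values passed in are whatever that bridge accepts.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy model of "bridge at index time and at query time"; not Hibernate Search code.
final class BridgeAtQueryTimeSketch {

    // Hypothetical stand-in for the Manufacturer entity from the question.
    static final class Manufacturer {
        final int id;
        Manufacturer(int id) { this.id = id; }
    }

    // Same logic as ManufacturerBridge.objectToString in the question.
    static String bridge(Object value) {
        if (value == null) { return "manufacturer_not_set"; }
        if (value instanceof Manufacturer) {
            return "manufacturer_" + ((Manufacturer) value).id;
        }
        return "";
    }

    // Toy inverted index: indexed term -> ids of matching products.
    private final Map<String, Set<Integer>> index = new HashMap<>();

    // Index time: the bridge turns the property value into the stored term.
    void indexProduct(int productId, Manufacturer manufacturer) {
        index.computeIfAbsent(bridge(manufacturer), t -> new HashSet<>()).add(productId);
    }

    // Query time, default behavior: the value goes through the same bridge.
    Set<Integer> search(Object value) {
        return index.getOrDefault(bridge(value), Collections.emptySet());
    }

    // Query time with the bridge skipped: the caller supplies the raw term.
    Set<Integer> searchIgnoringBridge(String rawTerm) {
        return index.getOrDefault(rawTerm, Collections.emptySet());
    }

    public static void main(String[] args) {
        BridgeAtQueryTimeSketch sketch = new BridgeAtQueryTimeSketch();
        sketch.indexProduct(1, new Manufacturer(42));
        sketch.indexProduct(2, null);

        System.out.println(sketch.search(new Manufacturer(42)));                  // [1]
        System.out.println(sketch.search(null));                                  // [2]
        System.out.println(sketch.searchIgnoringBridge("manufacturer_not_set")); // [2]
    }
}
```

In this model, skipping the bridge only works if the caller already knows the exact indexed term, which is the situation .ignoreFieldBridge() is meant for.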

Regarding question 3: looks like you found the answer.

This is the default behavior of Lucene: minimumShouldMatch defaults to 1 when there are only should clauses, but to 0 when there are also must clauses in the same boolean query.

This is indeed confusing, but it kind of makes sense: if you write “A and B or C”, and only documents where A and B “must” be true are returned, then C doesn’t matter for matching, because the expression is already true for all of those documents. In Lucene, C is then only used to raise the score: documents where C is also true will have a slightly higher score.

Anyway, that’s how the must/should clauses work. I personally try not to mix them in the same boolean query, and to nest boolean queries instead, unless I have a very specific behavior in mind.
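
The matching rule described above can be sketched in plain Java. This is a simplified model under stated assumptions, not Lucene's implementation: it ignores scoring entirely, documents are just sets of tokens, and `hasToken`, `defaultMinimumShouldMatch` and `matches` are hypothetical helper names.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

// Toy model of must/should matching; it ignores scoring and is not Lucene code.
final class BoolQuerySketch {

    // A clause that matches documents containing the given token.
    static Predicate<Set<String>> hasToken(String token) {
        return doc -> doc.contains(token);
    }

    // Effective default described above: 1 for should-only queries, otherwise 0.
    static int defaultMinimumShouldMatch(int mustCount, int shouldCount) {
        return (mustCount == 0 && shouldCount > 0) ? 1 : 0;
    }

    // A document matches if every must clause holds and at least
    // minimumShouldMatch of the should clauses hold.
    static boolean matches(Set<String> doc,
                           List<Predicate<Set<String>>> must,
                           List<Predicate<Set<String>>> should,
                           int minimumShouldMatch) {
        for (Predicate<Set<String>> clause : must) {
            if (!clause.test(doc)) { return false; }
        }
        int satisfied = 0;
        for (Predicate<Set<String>> clause : should) {
            if (clause.test(doc)) { satisfied++; }
        }
        return satisfied >= minimumShouldMatch;
    }

    public static void main(String[] args) {
        Set<String> enabled = new HashSet<>(Arrays.asList("product", "manufacturer_enabled"));
        Set<String> disabled = new HashSet<>(Arrays.asList("product", "manufacturer_disabled"));
        List<Predicate<Set<String>>> must = Collections.singletonList(hasToken("product"));
        List<Predicate<Set<String>>> should = Collections.singletonList(hasToken("manufacturer_enabled"));

        // With a must clause present, the default minimum is 0 in this model.
        int byDefault = defaultMinimumShouldMatch(must.size(), should.size());
        System.out.println(matches(disabled, must, should, byDefault)); // true
        System.out.println(matches(disabled, must, should, 1));         // false
        System.out.println(matches(enabled, must, should, 1));          // true
    }
}
```

In this model, wrapping the should clause in its own nested boolean query and adding that query as a must clause has the same effect as setting the minimum to 1, because a should-only query defaults to 1.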

Regarding question 4: as you noticed, yes, it is possible for the bridge to handle null itself.

Thanks a million for such a detailed answer!