Numeric enum mapping

This migration is quite a challenge for me because every day there are new issues to solve. :rofl:
For example, the enum Importance is not getting its numeric field created in Lucene, even though I wrote 2 separate annotations: one for the String version and one for the numeric version. When I check the created index with Luke, the String version is handled properly while the numeric one is empty.

(P.S. I need numeric importance to sort articles by real int importance and not alphabetically by enum name.)
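To illustrate the difference between the two orderings, here is a minimal self-contained sketch in plain Java (no Hibernate Search involved). The enum constants and scores are copied from the post; the class name ImportanceSortDemo and its two helper methods are hypothetical and exist only for this illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class ImportanceSortDemo {

    // Same constants and scores as the Importance enum in the post,
    // written without Lombok so the sketch compiles on its own.
    enum Importance {
        FIRST(3), TOP(2), HIGH(1), NORMAL(0), LOW(-1), BOTTOM(-2), LAST(-3);

        private final int score;

        Importance(int score) {
            this.score = score;
        }

        int getScore() {
            return score;
        }
    }

    // Alphabetical order by enum name: what sorting on the String field gives.
    static List<Importance> sortedByName() {
        List<Importance> values = new ArrayList<>(List.of(Importance.values()));
        values.sort(Comparator.comparing(Importance::name));
        return values;
    }

    // Descending order by numeric score: what the numeric field is meant for.
    static List<Importance> sortedByScoreDesc() {
        List<Importance> values = new ArrayList<>(List.of(Importance.values()));
        Comparator<Importance> byScore = Comparator.comparingInt(Importance::getScore);
        values.sort(byScore.reversed());
        return values;
    }

    public static void main(String[] args) {
        System.out.println(sortedByName());      // [BOTTOM, FIRST, HIGH, LAST, LOW, NORMAL, TOP]
        System.out.println(sortedByScoreDesc()); // [FIRST, TOP, HIGH, NORMAL, LOW, BOTTOM, LAST]
    }
}
```

Sorting by name puts BOTTOM first and TOP last, which is why a separate numeric field is needed for a meaningful sort.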

(P.P.S. I didn’t create a new topic because so far I’m not sure whether it is a bug or I went crazy going through all those migrations. :rofl: )


Article.java

@GenericField(name = "importance", projectable = Projectable.NO,
	searchable = Searchable.YES, sortable = Sortable.NO)
@GenericField(name = "importance_numeric", projectable = Projectable.NO,
	searchable = Searchable.YES, sortable = Sortable.YES,
	valueBridge = @ValueBridgeRef(type = ImportanceToNumericBridge.class))
@Enumerated(EnumType.STRING)
private Importance importance = Importance.NORMAL;

Importance.java

@AllArgsConstructor @Getter
enum Importance {
	FIRST(3), TOP(2), HIGH(1), NORMAL(0), LOW(-1), BOTTOM(-2), LAST(-3);
	private final int score;
}

ImportanceToNumericBridge.java

class ImportanceToNumericBridge implements ValueBridge<Importance, Integer> {
	@Override
	public Integer toIndexedValue(
			Importance value, ValueBridgeToIndexedValueContext context
	) {
		if( value == null)  {
			throw new IllegalStateException("Importance should not be NULL");
		}
		Integer score = Integer.valueOf(value.getScore());
		if( score.intValue() != 0 ) {
			// This message shows up 14 times in console which
			// means the bridge returns expected values.
			System.out.println("Here I am being not zero!");
		}
		return score;
	}
}

Hey,

Please do, for the sake of others looking for answers.

Two things:

  1. Did you reindex?
  2. Are you sure that Luke even displays a count of terms for numeric fields? I’m not, since the concept of “term” only makes sense in the context of text fields. Luke is great but sometimes confusing that way. I would recommend testing whether fields are getting indexed by running a query instead of using Luke.

Hi!

I’m using Spring Profiles and have separate profiles for development, test and production.

application-dev.properties:

com.thevegcat.app.config.switches.reindex-lucene-at-startup = true

This switch is set to true during any work on search or on anything else that could interfere with indexed data.
Later in a service that I run after application boot I have:

if (Boolean.TRUE.equals( this.customProperties.getSwitches().getReindexLuceneAtStartup() ) ) {
	this.reindexingService.reindex();
}

And my reindexing is here:

final MassIndexer massIndexer =
	Search
		.session(this.entityManager)
		.massIndexer()
		.monitor(new CustomMassIndexingMonitor(this))
		.purgeAllOnStart(true)
		.mergeSegmentsAfterPurge(true)
		.mergeSegmentsOnFinish(true);
massIndexer.startAndWait();

I guess this should be enough, but if you think I should delete the index files before the application starts, I can do that too.


A few minutes ago I downloaded the index from the production environment (Hibernate Search 5 and Lucene 5).

I’m not sure I understand the Freq value in this case, but those 14 items are surely the articles from my friend’s company that I want to put at the front; they have HIGH importance, which has the numeric value (int) 1. Also, the next value above, added to 14, gives the other Freq values, and there are 5 entries. This tells me something got indexed, in contrast to Lucene 8, which has 0 entries for the same field.

mysql> SELECT DISTINCT importance, COUNT(*) FROM article GROUP BY importance;
+------------+----------+
| importance | COUNT(*) |
+------------+----------+
| NORMAL     |     2852 |
| HIGH       |       14 |
+------------+----------+
2 rows in set (0.02 sec)

Ok, that one is, I’m pretty sure, a false positive.
After reindexing with Hibernate Search 6 and opening the index with Luke, the truth is that there are no “terms” stored for “importance_numeric” as there were with Hibernate Search 5. But when I go to the sorting tab of Luke and sort documents by importance_numeric DESC, there is a visible “disorder” in the document IDs, which tells me the top 14 documents are those marked with enum HIGH (int 1), because their IDs are not in ascending order.

On the first screen there are 10 Doc IDs (16, 21, 26, 30, 34, 38, 42, 44, 48, 49) and on the second screen there are 4 Doc IDs (55, 61, 66, 340), which are out of the natural order, and the count of 14 is the same as the count of my documents marked as HIGH (int 1). After those 14 documents, the Doc IDs are sorted in natural order, which tells me all other documents have the same value, NORMAL (int 0).
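This reasoning can be reproduced with a small plain-Java simulation. It is only a sketch under stated assumptions: it assumes that documents with equal sort values keep their ascending Doc ID order (modelled here with Java's stable List.sort, not verified against Luke itself), and the class name DocIdOrderDemo and the total of 400 documents are hypothetical. Only the 14 Doc IDs come from the post.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Set;

class DocIdOrderDemo {

    // The 14 Doc IDs reported in the post as appearing first in the sort.
    static final Set<Integer> HIGH_DOC_IDS =
            Set.of(16, 21, 26, 30, 34, 38, 42, 44, 48, 49, 55, 61, 66, 340);

    // HIGH maps to 1, everything else is assumed to be NORMAL (0).
    static int score(int docId) {
        return HIGH_DOC_IDS.contains(docId) ? 1 : 0;
    }

    // Sorts Doc IDs 0..docCount-1 by score, descending.
    // List.sort is stable, so documents with equal scores stay in Doc ID order.
    static List<Integer> sortByScoreDesc(int docCount) {
        List<Integer> docIds = new ArrayList<>();
        for (int id = 0; id < docCount; id++) {
            docIds.add(id);
        }
        Comparator<Integer> byScore = Comparator.comparingInt(DocIdOrderDemo::score);
        docIds.sort(byScore.reversed());
        return docIds;
    }

    public static void main(String[] args) {
        List<Integer> sorted = sortByScoreDesc(400); // 400 is a hypothetical index size
        System.out.println(sorted.subList(0, 14));   // the 14 HIGH documents
        System.out.println(sorted.subList(14, 20));  // then Doc IDs from 0 upwards
    }
}
```

If the numeric field were really empty, every document would have the same sort value and the result would simply start at Doc ID 0, so the 14 out-of-order IDs at the top are consistent with the values being indexed.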

Right now I cannot prove it as my application is not in a runnable state, but I’d bet $1 that this sort is working even though the field has no “terms” as it used to have before.



Right, I suspect there’s nothing to see here. This is a simple use case that’s tested again and again in our test suite, so I’d be surprised if there was such a blatant bug.

On the other hand, I wouldn’t be surprised if your Hibernate Search 5 “numeric” bridge actually indexed your property both as text and as a number. That’s a very, very common thing in examples of Hibernate Search 5 bridges I saw on the web.
