Filtering search results

Okay, so I have successfully managed to create a global search where the user gets many hits across different entities. However, we have a very rigid security policy in our organization. I would like to have everything indexed, but I then need to filter out all hits that the user is not authorized to access. A user should only find entities that correspond to his/her associations (@ManyToMany) where this is required. That is, some entities may be accessed by anyone, while other entities require an existing relation for every user.

Is this possible with Hibernate Search, or do I have to filter the search result afterwards?

Regardless of whether this can be done in Hibernate Search, you should always implement an additional security filter that does not rely on indexes.

Indexes are not meant for security; they can lag behind updates to your authorizations because index updates are not reflected immediately, or in extreme cases can become out-of-sync with your database. You cannot rely exclusively on them for security.

On the other hand, doing the filtering exclusively after loading search hits can lead to problems when using paginated results (e.g. some pages can be completely empty). So it’s not a good idea either.
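To make the pagination problem concrete, here is a minimal sketch in plain Java, with no Hibernate Search involved. Everything in it is an assumption made for illustration: the index is simulated by a list of integer IDs, and the names `PostFilterPagingDemo`, `fetchPage` and `isAuthorized` are hypothetical, as is the rule that the user may only see IDs 8 and above.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class PostFilterPagingDemo {

    // Simulated index: 12 hits with IDs 0..11, in relevance order.
    static final List<Integer> INDEX_HITS =
            IntStream.range(0, 12).boxed().collect(Collectors.toList());

    // Hypothetical authorization rule: the user may only see IDs >= 8.
    static boolean isAuthorized(int id) {
        return id >= 8;
    }

    // Fetch one page from the "index", then post-filter the loaded hits.
    static List<Integer> fetchPage(int pageIndex, int pageSize) {
        return INDEX_HITS.stream()
                .skip((long) pageIndex * pageSize)
                .limit(pageSize)
                .filter(PostFilterPagingDemo::isAuthorized)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        for (int page = 0; page < 3; page++) {
            System.out.println("page " + page + ": " + fetchPage(page, 4));
        }
        System.out.println("total reported by index: " + INDEX_HITS.size());
    }
}
```

With a page size of 4, the first two pages come back empty and only the third contains hits, while the index still reports 12 hits in total even though the user can see only 4 of them.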

What you can do is:

  • do some pre-filtering using indexes, e.g. by indexing the persons/groups allowed to access an object, and adding the corresponding predicate in your search query.
  • and do some stricter post-filtering based on database data after having loaded the hits; generally this will allow all hits to be returned, but it will occasionally reject a hit because the indexes are lagging behind the database updates. Your framework probably offers ways of doing that simply by annotating a method [EDIT: and implementing a handler with the logic to determine authorizations for a given entity]; I know Spring Security does.
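
The two steps above can be sketched in plain Java. This is not the Hibernate Search API: the index is simulated by a map from entity ID to the user IDs stored at indexing time, and the database by a second map holding the current authorizations. All names (`TwoStepFilterDemo`, `preFilter`, `postFilter`, the users and the IDs) are hypothetical.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

class TwoStepFilterDemo {

    // Simulated index: entity ID -> users allowed at indexing time (may be stale).
    static final Map<Integer, Set<String>> INDEXED_ALLOWED = Map.of(
            1, Set.of("alice"),
            2, Set.of("alice", "bob"),
            3, Set.of("bob"));

    // Simulated database, the source of truth: alice has since lost access to entity 2.
    static final Map<Integer, Set<String>> DB_ALLOWED = Map.of(
            1, Set.of("alice"),
            2, Set.of("bob"),
            3, Set.of("bob"));

    // Step 1: pre-filter, standing in for a predicate on an indexed "allowed users" field.
    static List<Integer> preFilter(String user) {
        return INDEXED_ALLOWED.entrySet().stream()
                .filter(e -> e.getValue().contains(user))
                .map(Map.Entry::getKey)
                .sorted() // Map.of has no defined iteration order
                .collect(Collectors.toList());
    }

    // Step 2: stricter post-filter against the current database state.
    static List<Integer> postFilter(String user, List<Integer> hits) {
        return hits.stream()
                .filter(id -> DB_ALLOWED.getOrDefault(id, Set.of()).contains(user))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> candidates = preFilter("alice");
        System.out.println("after pre-filter:  " + candidates);
        System.out.println("after post-filter: " + postFilter("alice", candidates));
    }
}
```

For alice, the pre-filter keeps entities 1 and 2 because the index is stale, and the post-filter then rejects entity 2. Since the pre-filter already removes most unauthorized hits, the post-filter only rarely drops anything, which keeps pages and hit counts close to accurate.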

Note Hibernate ORM’s filters are probably not an option here, because IIRC they don’t apply to the per-ID loading that Hibernate Search uses.

Thanks for your answer. I chose post-filtering.

void filter(List<Object> hits) {
  try {
    var it = hits.iterator();
    while (it.hasNext()) {
      Object o = it.next();
      // Drop Information hits that are not associated with the current employee
      if (o instanceof Information info && !employee.getInformations().contains(info)) {
        it.remove(); // user not authorized
      }
      // do the same for other entities that require authorization
    }
  } catch (SomeException e) {
    log.error("this thing happened", e);
    hits.clear(); // fail closed: remove everything so we do not display stuff the user is not authorized for
  }
}

Sorry if that wasn’t clear, but I was suggesting doing both pre-filtering and post-filtering.

What I was trying to explain was that, with post-filtering only, you could end up with, for example, the second page of results appearing empty on the web interface, even if there is a page 3. The total hit count will also be off.

In any case… doing post-filtering only is secure, at least :)

I understood what you meant, no worries. However, I concluded that post-filtering would be enough for our needs. Security is our main concern. I do not think we need any pre-filtering (there is no noticeable impact to speak of), but if we do in the future we can always incorporate it (for example, if we produce large amounts of data and experience performance issues). For now I think this will suffice.
