HS 6.0.7 InheritanceType.JOINED Issues?

Hi, are there any known issues with InheritanceType.JOINED in HS 6.0.7? Here is what I am seeing: entities are added and modified with no issues, but when I delete entities (whether through the base class repository or through the subclass repository), the index entries are sometimes (not always) left in place. I have verified that Hibernate itself is working fine: the entities are indeed deleted from the database, but the index is stale.

Below is a view of the class hierarchy. Any thoughts would be greatly appreciated.

BTW, by "sometimes" I mean the following. Let’s say I create a set of folders and searches that look like this:

RootFolder
   Folder1
      Search1-1
      Search1-2
   Folder2 
      Search2-1
      Search2-2
   etc. (all exactly the same)

Then, if I attempt to delete all folders under the root folder one by one, random folders remain in the folder index (examined with Luke) while others are removed. All of the elements are indeed deleted in the database and Hibernate reports no errors. Hibernate 5.6.3; Lucene 8.7.0.

Thanks,

Keith

@MappedSuperclass
@EntityListeners(AuditingEntityListener.class)
public abstract class AuditablePE<U> {


   //The user that created the entity
   @KeywordField(name=CREATED_BY_FIELD, sortable = Sortable.YES)
   @CreatedBy
   @Column(
      name = "created_by",
      columnDefinition = "TEXT",
      nullable = false
   )
   private U createdBy;

   //The timestamp the entity was created
   @GenericField(name=CREATED_FIELD, sortable = Sortable.YES)
   @CreatedDate
   @Column(
      name = "created",
      nullable = false
   )
   private Instant created;

   //The user that last modified the entity
   @KeywordField(name=MODIFIED_BY_FIELD, sortable = Sortable.YES)
   @LastModifiedBy
   @Column(
      name = "modified_by",
      columnDefinition = "TEXT",
      nullable = false
   )
   private U lastModifiedBy;

   //The timestamp the entity was last modified
   @GenericField(name=MODIFIED_FIELD, sortable = Sortable.YES)
   @LastModifiedDate
   @Column(
      name = "modified",
      nullable = false
   )
   private Instant modified;
   
   :
   :
}
   
@Entity
@Inheritance(strategy = InheritanceType.JOINED)
@DynamicUpdate
@Table(name="pfs_element",
   uniqueConstraints = @UniqueConstraint(
   columnNames = {"name", "parent_id"},
   name = "pfs_element_name_parent_id_uq"
))
@Indexed
public abstract class FileSystemElementPE extends AuditablePE<String> {

   public static final Long NULL_ID = Long.valueOf(-1);

   @Id
   @GenericField(name = FileSystemIndex.FS_ELEMENT_ID)
   @GeneratedValue(strategy = GenerationType.IDENTITY)
   @Column(name="id", updatable = false, nullable = false)
   private Long id;

   @FullTextField(name = FileSystemIndex.FS_ELEMENT_NAME)
   @Column(
      name = "name",
      columnDefinition = "TEXT",
      nullable = false
   )
   private String name;

   @FullTextField(name = FileSystemIndex.FS_ELEMENT_SUMMARY)
   @Column(
      name = "summary",
      columnDefinition = "TEXT",
      nullable = true
   )
   private String summary;

   @Enumerated(EnumType.STRING)
   @GenericField(name = FileSystemIndex.FS_ELEMENT_TYPE, sortable = Sortable.YES)
   @Column(
      name = "elem_type",
      nullable = false
   )
   private FileSystemElementType elementType;

   @ManyToOne(fetch = FetchType.LAZY)
   @JoinColumn(
      name = "parent_id",
      foreignKey = @ForeignKey(name = "pfs_element_parent_pfs_folder_id_fk"),
      nullable = true
   )
   private FolderPE parent;

   @KeywordField(name = FileSystemIndex.FS_ELEMENT_ANCESTRY)
   @Column(
      name = "ancestry",
      columnDefinition = "TEXT",
      nullable = true)
   private String ancestry;

   @ManyToOne(fetch = FetchType.LAZY)
   @JoinColumn(
      name = "fs_id",
      foreignKey = @ForeignKey(name = "pfs_element_pfs_file_system_fk"),
      nullable = true
   )
   private FileSystemPE fileSystem;

   @GenericField(name = FileSystemIndex.FS_ELEMENT_DELETED)
   private Boolean deleted;
   :
   :
}
@Entity
@Table(name="pfs_folder")
@PrimaryKeyJoinColumn(foreignKey=@ForeignKey(name = "fk_pfs_folder_pfs_element_id"))
@DynamicUpdate
public class FolderPE extends FileSystemElementPE {
   @Enumerated(EnumType.STRING)
   @GenericField(name = FileSystemIndex.FS_FOLDER_TYPE, sortable = Sortable.YES)
   @Column(
      name = "folder_type",
      nullable = false
   )
   private FolderType folderType;

   @OneToMany(
      cascade = {CascadeType.PERSIST, CascadeType.MERGE},
      fetch = FetchType.LAZY,
      mappedBy = "parent"
   )
   private List<FileSystemElementPE> elements;

   :
   :
}

@Entity
@Table(name="pfs_patent_search")
@PrimaryKeyJoinColumn(foreignKey=@ForeignKey(name = "fk_pfs_patent_search_pfs_element_id"))
@DynamicUpdate
@Indexed
public class PatentSearchPE extends FileSystemElementPE {

   public PatentSearchPE() {
      super();
   }

   @Embedded
   @IndexedEmbedded(name=PatentSearchIndex.PATENT_SEARCH_EMBED)
   public PatentSearchEmbed search; 
   
   :
   :
}

Hey Keith,
Thanks for reaching out. I don’t remember any similar problems reported before.

Could you update the example you’ve shared and show the part where you are deleting the folders?

Some possible ideas I could think of off the top of my head just to remove the obvious:

  • When do you inspect the index? Maybe you are using one of the async synchronization strategies, and the changes haven’t been reflected in the index yet.
  • Are all entities removed by ORM, without any cascading involved on the database side? If some rows are removed by the database without ORM knowing about it, no events are fired and the required index changes are never triggered.

If it’s none of the above (which I suspect is the case), would you be able to come up with a reproducer and share it with us? A template for the reproducer can be found in the search directory of the hibernate/hibernate-test-case-templates repository on GitHub.

Hi,

Thank you for the quick response. In this case, we are using a local Lucene index and are staying synchronous. Below are the relevant properties that are set. Also, I have verified that none of the entity tables have ON DELETE CASCADE enabled. As you can see from the class annotations, I am not relying on Hibernate’s CascadeType.REMOVE either, so I have to delete folder child elements explicitly. I will try to narrow this down as much as possible and submit a reproducer. Thank you for the link.

Regards,

Keith

spring.jpa.hibernate.ddl-auto=validate

#Query cache is off by default but explicitly setting it
spring.jpa.properties.hibernate.cache.use_query_cache=false
#Second level cache is on by default. Turning it off due to stale caching behavior
spring.jpa.properties.hibernate.cache.use_second_level_cache=false

spring.jpa.properties.hibernate.order_inserts=true
spring.jpa.properties.hibernate.order_updates=true
spring.jpa.properties.hibernate.format_sql=true
spring.jpa.properties.hibernate.jdbc.batch_size=25
spring.jpa.properties.hibernate.order_by.default_null_ordering=first

#Hibernate search settings
#Only validate schema in production - schema upgrade requires rebuilding the index to be safe
#Can set to create-or-validate when desktop testing if desired
spring.jpa.properties.hibernate.search.schema_management.strategy=validate
spring.jpa.properties.hibernate.search.backend.analysis.configurer=class:com.propat.pas.base.index.PasLuceneAnalysisConfigurer
spring.jpa.properties.hibernate.search.backend.directory.root=${propat.basePath}index/
#alternative is local-heap
spring.jpa.properties.hibernate.search.backend.directory.type = local-filesystem
spring.jpa.properties.hibernate.search.backend.lucene_version=LUCENE_8_7_0
#write-sync is the default (waits for the commit). Plain sync additionally waits for an index refresh, so changes are visible to searches right away
spring.jpa.properties.hibernate.search.automatic_indexing.synchronization.strategy=write-sync

Update… I haven’t had time to put this into your reproducer format yet but I have narrowed this down. To make things simple, I took the whole FolderPE bit completely out of it. I just have the PatentSearchPEs and have their parent fields set to null.

This is the interesting part. If I persist a handful of entities and then individually delete them (separate MVC calls to my Controller, which calls the @Transactional delete method in the Service - see below), it works fine. However, if I persist 20 of these entities and then issue the same individual delete operation, it fails to update the index 90% of the time. Entity updates do not work properly either. Let’s say I rename one of these search entities. The number of __HSEARCH_id entries in the index remains the same (as it should), but my name field has more terms added to it (these are unique one-word names BTW, so the analyzer doesn’t break them up). That is, the old name stays as a term in the index and the new name is also added.

Regards,

Keith

@Service
public class FileSystemService {

:
:

   @Transactional
   public ServiceResponse hardDeleteElement(Long id) {
      try {
         FileSystemElementPE fsePE = fseRepo.findById(id).orElse(null);
         if (fsePE != null) {
            if (fsePE.isFolder()) {
               //It is more efficient to find all the children and delete them explicitly than to rely on
               //Hibernate CascadeType.REMOVE, which results in additional queries (one per sub-element)
               fseRepo.deleteByAncestryLike(fsePE.getAbsoluteIdPath() + "%");
            }
            fseRepo.delete(fsePE);
            return ServiceResponse.ok(StringUtil.sub("Deleted {} {} successfully.", fsePE.getTypeViewName(), fsePE.getName()));
         } else {
            return ServiceResponse.error(StringUtil.sub("Delete failed. File system element {} not found.", id));
         }
      }
      catch (Exception e) {
         String msg = StringUtil.sub("Failed to delete filesystem element {} due to server error.",id);
         logger.error(msg, e);
         throw new ServiceException(msg , e);

      }
   }

   @Transactional
   public ServiceResponse renameFileSystemElement(EditableFileSystemElementDTO efseDto) {
      try {
         FileSystemElementPE fsePE = findElementById(efseDto.getId());
         String prevName;
         if (fsePE != null) {
            prevName = fsePE.getName();
            fsePE.setName(efseDto.getName());
            return ServiceResponse.ok(StringUtil.sub("Renamed {} from {} to {}", fsePE.getTypeViewName(),prevName, fsePE.getName()));
         } else {
            return ServiceResponse.error(StringUtil.sub("Rename failed. Filesystem element {} not found.", efseDto.getId()));
         }

      } catch (Exception e) {
         String msg = StringUtil.sub("Failed to rename filesystem element {} due to server error.",efseDto.getId());
         logger.error(msg, e);
         throw new ServiceException(msg , e);
      }
   }
   
}

You can close this topic. This appears to be a misunderstanding on my part about what Lucene does behind the scenes and what Luke displays. My HS6 queries work as expected regardless of what Luke lists as top terms, etc. for the field. If you select one of those values and ask Luke to show the matching docs, it comes up with none, and it logs (if you look under the log tab) that the document in question was deleted. It is very interesting to me that for the first few entities the field term counts, top terms, etc. seem to update immediately. Once you have more than a handful, though, those changes are not reflected even though they have taken effect. So that others don’t get burned by this: pay attention to Number of Documents near the top of the overview. That appears to stay up to date.
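
For anyone who runs into the same thing, the behaviour described above can be pictured with a toy model. This is only a sketch under assumptions, not Lucene's code: it assumes that a delete merely marks a document as deleted inside its segment, that per-term statistics keep counting that document until the segment is rewritten by a merge (Lucene's IndexReader.docFreq Javadoc does note that it ignores deletions), and that the live document count excludes deleted documents immediately. SegmentModel and every method on it are hypothetical names invented for this illustration.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Toy model of one index segment. Hypothetical names; this is NOT Lucene's API.
class SegmentModel {
    private final List<String> docs = new ArrayList<>();          // one single-term "name" per document
    private final Map<String, Integer> docFreq = new HashMap<>(); // term statistics, written at index time
    private final BitSet deleted = new BitSet();                   // deletion markers

    // Adds a document and returns its id within this segment.
    int add(String name) {
        docs.add(name);
        docFreq.merge(name, 1, Integer::sum);
        return docs.size() - 1;
    }

    // A delete only sets a marker; the term statistics are left untouched.
    void delete(int docId) {
        deleted.set(Objects.checkIndex(docId, docs.size()));
    }

    int maxDoc() { return docs.size(); }                          // includes deleted documents
    int numDocs() { return docs.size() - deleted.cardinality(); } // live documents only
    int docFreq(String name) { return docFreq.getOrDefault(name, 0); }

    // Searches skip documents that are marked as deleted.
    List<Integer> search(String name) {
        List<Integer> hits = new ArrayList<>();
        for (int id = 0; id < docs.size(); id++) {
            if (!deleted.get(id) && docs.get(id).equals(name)) {
                hits.add(id);
            }
        }
        return hits;
    }

    // A merge rewrites the segment with live documents only, which rebuilds the statistics.
    SegmentModel merge() {
        SegmentModel merged = new SegmentModel();
        for (int id = 0; id < docs.size(); id++) {
            if (!deleted.get(id)) {
                merged.add(docs.get(id));
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        SegmentModel segment = new SegmentModel();
        int oldId = segment.add("alpha");
        // A rename is modelled here as a delete followed by an add.
        segment.delete(oldId);
        segment.add("beta");

        System.out.println("numDocs = " + segment.numDocs());                          // 1
        System.out.println("maxDoc = " + segment.maxDoc());                            // 2
        System.out.println("docFreq(alpha) = " + segment.docFreq("alpha"));            // 1 (stale)
        System.out.println("search(alpha) = " + segment.search("alpha"));              // []
        System.out.println("after merge = " + segment.merge().docFreq("alpha"));       // 0
    }
}
```

In this model, numDocs() plays roughly the role of Luke's Number of Documents, while docFreq() plays the role of the per-field term counts and top terms: the old name still shows up as a term after a rename, yet searching for it returns nothing.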
