How to fetch one side of a many-to-many association while on the other side all records match a given filtering criteria

Hi,
I have a many to many relationship between a Cluster and Tag table using an another intermediate table ClusterTag. Below, I have the bidirectional mapping.

I need help with HQL for the scenario where users can specify multiple (TagName and TagValue) combinations, and I have to retrieve only those Clusters which have ALL those TagName and TagValues associated with them.

like WHERE (tagName=? AND tagValue=?) AND (tagName=? AND tagValue=?)

  1. Cluster table
@Entity(name = "Cluster")
@Table(name = "cluster")
public class ClusterDto implements Serializable {

    private static final long serialVersionUID = -3501772243949297059L;

    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    @Column(name = "id", updatable = false, nullable = false)
    private Integer id;

    @Column(name = "uuid", nullable = false, unique = true)
    @Convert(converter = UUIDConverter.class)
    private UUID uuid;

    @Column(name = "mapped_id", nullable = false)
    private String mappedId;

    @Column(name = "create_time", nullable = false, updatable = false)
    @CreationTimestamp
    private Timestamp createTimestamp;

    @Column(name = "last_update", nullable = false)
    @UpdateTimestamp
    private Timestamp updateTimestamp;

    @OneToMany(
        mappedBy = "cluster",
        cascade = CascadeType.ALL,
        orphanRemoval = true
    )
    private List<ClusterTagDto> tags = new ArrayList<>();


    public void addTag(TagDto tag) {
        ClusterTagDto clusterTag = new ClusterTagDto(this, tag);
        tags.add(clusterTag);
        tag.getClusters().add(clusterTag);
    }


    public void removeTag(TagDto tag) {
        for (Iterator<ClusterTagDto> iterator = tags.iterator();
             iterator.hasNext(); ) {
            ClusterTagDto clusterTag = iterator.next();

            if (clusterTag.getCluster().equals(this) &&
                clusterTag.getTag().equals(tag)) {
                iterator.remove();
                clusterTag.getTag().getClusters().remove(clusterTag);
                clusterTag.setCluster(null);
                clusterTag.setTag(null);
            }
        }

    }

    @Override public boolean equals(Object o) {
        if (this == o)
            return true;
        if (!(o instanceof ClusterDto))
            return false;
        ClusterDto that = (ClusterDto) o;
        return Objects.equals(uuid, that.uuid);
    }

    @Override public int hashCode() {

        return Objects.hash(uuid);
    }
}
  1. Entity for the Intermediate table ClusterTag
@Entity(name = "ClusterTag")
@Table(name = "cluster_tag")
public class ClusterTagDto {

    @EmbeddedId
    private ClusterTagId id;

    @ManyToOne(fetch = FetchType.LAZY)
    @MapsId("clusterId")
    private ClusterDto cluster;

    @ManyToOne(fetch = FetchType.LAZY)
    @MapsId("tagId")
    private TagDto tag;

    @Column(name="create_time", nullable = false, updatable = false)
    @CreationTimestamp
    private Timestamp createTimestamp;

    @Column(name="last_update", nullable = false)
    @UpdateTimestamp
    private Timestamp updateTimestamp;

    private ClusterTagDto() {}

    public ClusterTagDto(ClusterDto cluster, TagDto tag) {
        this.cluster = cluster;
        this.tag = tag;
        this.id = new ClusterTagId(cluster.getId(), tag.getId());
    }

    @Override public boolean equals(Object o) {
        if (this == o)
            return true;
        if (!(o instanceof ClusterTagDto))
            return false;
        ClusterTagDto that = (ClusterTagDto) o;
        return Objects.equals(cluster, that.cluster) &&
            Objects.equals(tag, that.tag);
    }

    @Override public int hashCode() {

        return Objects.hash(cluster, tag);
    }

}
  1. Tag table
@Entity(name = "ClusterTag")
@Table(name = "tag")
public class TagDto implements Serializable {

    private static final long serialVersionUID = -3501772243949297059L;

    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    @Column(name = "id", updatable = false, nullable = false)
    private Integer id;

    @Column(name = "tag_name", nullable = false)
    private String tagName;

    @Column(name = "tag_value", nullable = false)
    private String tagValue;

    @OneToMany(
        mappedBy = "tag",
        cascade = CascadeType.ALL,
        orphanRemoval = true
    )
    private List<ClusterTagDto> clusters = new ArrayList<>();


    @Override public boolean equals(Object o) {
        if (this == o)
            return true;
        if (!(o instanceof TagDto))
            return false;
        TagDto tagDto = (TagDto) o;
        return Objects.equals(uuid, tagDto.uuid);
    }

    @Override public int hashCode() {

        return Objects.hash(uuid);
    }
}

regards
aravias

select c from Cluster c JOIN ClusterTag ct ON c.id = ct.id.clusterId JOIN Tag t ON ct.id.tagId = t.id WHERE (t.tagName=x AND t.tagValue=xv) OR (t.tagName=y AND t.tagValue=yv)

above query possibly also returns clusters with either of the combinations present in them, need some suggestions how to rewrite the query to obtain clusters having only both the combinations.

Like this:

select c
from TagDto t
join t.clusters tc
join tc.cluster c
where 
    (t.tagName=x AND t.tagValue=xv) or 
	(t.tagName=y AND t.tagValue=yv)

For more details, check out the User Guide.

Hi Vlad,
thanks for your response. But, I think this will still return me clusters matching either of below conditions and not both, I can give an example, please see the below sample data for the 3 tables involved ,

Cluster table

ID        NAME
1         CLUSTER1

2         CLUSTER2

3         CLUSTER3

ClusterTag table

ClusterId        TagId
1                 1

1                 2

2                 1

2                 4

3                 3

3                 4

Tag table

ID   TAGNAME   TAGVALUE
1    spark      2.2

2    hadoop     2.7

3    spark      2.3

4    hadoop     2.6

With above example data, and with the below condition in the where clause
(t.tag_name=‘spark’ and t.tag_value=‘2.2’) OR (t.tag_name=‘hadoop’ and t.tag_value=‘2.7’) ;

I think it will return me the clusters cluster1 and cluster2 (with ID 1 and 2) , but I need a query that will return only cluster1 since only it is mapped to both spark 2.2 and hadoop 2.7 in the relationship table, whereas ClusterId 2 is mapped only to spark 2.2 and not hadoop 2.7, but a different version 2.6.

Also, (t.tag_name=‘spark’ and t.tag_value=‘2.2’) OR (t.tag_name=‘hadoop’ and t.tag_value=‘2.7’) its not fixed to be 2 , it is dynamic there could multiple of them like
(t.tag_name=‘spark’ and t.tag_value=‘2.2’) OR (t.tag_name=‘hadoop’ and t.tag_value=‘2.7’) OR (…) OR (…)

If you need both conditions use AND instead of OR.

okay, let me try with AND , i was thinking
(t.tag_name=‘spark’ and t.tag_value=‘2.2’) AND (t.tag_name=‘hadoop’ and t.tag_value=‘2.7’) AND (…) AND (…) might not work since a tag’s name and value can be equal to only one of them at any point of time.

When changed to AND as below and tested it does not return any results.

select c from Tag t join t.clusters tc join tc.cluster c where (t.tagName=:val1 AND t.tagValue=:val2) AND (t.tagName=:val3 AND t.tagValue=:val4)

If you want to filter all tags matching all conditions, this is your JPQL query:

List<Cluster> clusters = entityManager.createQuery(
	"select c " +
	"from Cluster c " +
	"where exists (" +
	"   select ctc.id " +
	"   from ClusterTag ct " +
	"   join ct.cluster ctc " +
	"   join ct.tag ctt " +
	"   where c.id = ctc.id and ( " +
	"             (ctt.name = :tagName1 and ctt.value = :tagValue1) or " +
	"             (ctt.name = :tagName2 and ctt.value = :tagValue2) " +
	"         )" +
	"   group by ctc.id " +
	"   having count(*) = 2" +
	") ", Cluster.class)
.setParameter("tagName1", "Spark")
.setParameter("tagValue1", "2.2")
.setParameter("tagName2", "Hadoop")
.setParameter("tagValue2", "2.7")
.getResultList();

For more details, check out this article where I explained how I came up with this query.

thanks, this should be fine.