I use Java 20 with Quarkus 3.0.2 and Hibernate 6.2.
I have customers, each customer can have their own customers, “end_customers”.
A customer usually has 1000-300k customers, I would say 3000-5000 on average.
Each customer can define a series of self-defined, custom fields.
Each field has a field type, like text, number, date, or a constrained text value.
The data is stored in a mariadb database.
Here’s the current SQL DDL:
CREATE TABLE end_customer
(
id varchar(36) NOT NULL,
external_id varchar(36) NOT NULL,
email varchar(255) NOT NULL,
first_name varchar(255) DEFAULT NULL,
end_customer_status varchar(12) NOT NULL,
version int(11) NOT NULL DEFAULT 0,
created_at timestamp NOT NULL DEFAULT current_timestamp(),
updated_at datetime DEFAULT current_timestamp() ON UPDATE current_timestamp(),
PRIMARY KEY (id),
UNIQUE KEY end_customer_email_unique (external_id, email),
INDEX end_customer_idx1 (external_id),
INDEX end_customer_idx2 (email),
INDEX end_customer_idx3 (end_customer_status),
INDEX end_customer_idx4 (created_at)
) ENGINE = InnoDB
DEFAULT CHARSET = utf8mb4
COLLATE utf8mb4_general_ci;
CREATE TABLE end_customer_field
(
id varchar(36) NOT NULL,
end_customer_id varchar(36) NOT NULL,
field_type_id varchar(36) NOT NULL,
value varchar(255) DEFAULT NULL,
version int(11) NOT NULL DEFAULT 0,
created_at timestamp NOT NULL DEFAULT current_timestamp(),
updated_at datetime DEFAULT current_timestamp() ON UPDATE current_timestamp(),
PRIMARY KEY (id),
CONSTRAINT end_customer_field_fk1 FOREIGN KEY (end_customer_id) REFERENCES end_customer (id) ON DELETE CASCADE,
CONSTRAINT end_customer_field_fk2 FOREIGN KEY (field_type_id) REFERENCES field_type (id) ON DELETE CASCADE,
UNIQUE KEY end_customer_field_unique (end_customer_id, field_type_id),
INDEX end_customer_field_fk1 (end_customer_id),
INDEX end_customer_field_fk2 (field_type_id)
) ENGINE = InnoDB
DEFAULT CHARSET = utf8mb4
COLLATE utf8mb4_general_ci;
Since end_customer_field is basically just a stupid key value lookup, there are no indexes on the table,
and the data type of value is simply varchar(255).
Now I want to allow my customers to execute dynamic filter queries on their end customers.
Each customer has 2 default fields which are part of the end_customer table - first name and email - and up to another 28 custom fields,
30 fields in total.
I allow up to 30 search criterias to be defined -
that means I need to join the customer table up to 30 times with the end_customer_field table -
That means that for each search criteria, we need to join the customer table with the end_customer_field table,
and for each data type other than text, we need to convert the text value in the database to a number or date field.
Further, I do allow up to 24 different logical comparison operators to be used, depending on the related data type.
I implemented all this… but for more complex filters, even on my local system, the query runs for several minutes
and then at some point just times out, so I urgently need to adjust and fix this to make it work.
Here’s the pattern-matching switch I use to create the Predicates:
private static Predicate toFieldMatcherPredicate(CriteriaBuilder criteriaBuilder, FieldMatcher matcher, String value, Join<CustomerEntity, CustomerFieldEntity> end_customerFields) {
return switch (matcher) {
case TEXT_EQUALS -> criteriaBuilder.equal(end_customerFields.get("value"), value);
case TEXT_EQUALS_IGNORE_CASE ->
criteriaBuilder.equal(criteriaBuilder.lower(end_customerFields.get("value")), value.toLowerCase());
case TEXT_NOT_EQUALS -> criteriaBuilder.notEqual(end_customerFields.get("value"), value);
case TEXT_STARTS_WITH -> criteriaBuilder.like(end_customerFields.get("value"), value + "%");
case TEXT_ENDS_WITH -> criteriaBuilder.like(end_customerFields.get("value"), "%" + value);
case TEXT_CONTAINS -> criteriaBuilder.like(end_customerFields.get("value"), "%" + value + "%");
case NUMBER_EQUALS ->
criteriaBuilder.equal(toNumberFieldValue(end_customerFields), toNumericValue(criteriaBuilder, value));
case NUMBER_NOT_EQUALS ->
criteriaBuilder.notEqual(toNumberFieldValue(end_customerFields), toNumericValue(criteriaBuilder, value));
case NUMBER_GREATER_THAN ->
criteriaBuilder.greaterThan(toNumberFieldValue(end_customerFields), toNumericValue(criteriaBuilder, value));
case NUMBER_LESS_THAN ->
criteriaBuilder.lessThan(toNumberFieldValue(end_customerFields), toNumericValue(criteriaBuilder, value));
case NUMBER_GREATER_OR_EQUAL ->
criteriaBuilder.greaterThanOrEqualTo(toNumberFieldValue(end_customerFields), toNumericValue(criteriaBuilder, value));
case NUMBER_LESS_OR_EQUAL ->
criteriaBuilder.lessThanOrEqualTo(toNumberFieldValue(end_customerFields), toNumericValue(criteriaBuilder, value));
case DATE_EQUALS ->
criteriaBuilder.equal(toDateFieldValue(criteriaBuilder, end_customerFields), toDateValue(value));
case DATE_NOT_EQUALS ->
criteriaBuilder.notEqual(toDateFieldValue(criteriaBuilder, end_customerFields), toDateValue(value));
case DATE_BEFORE ->
criteriaBuilder.lessThan(toDateFieldValue(criteriaBuilder, end_customerFields), toDateValue(value));
case DATE_AFTER ->
criteriaBuilder.greaterThan(toDateFieldValue(criteriaBuilder, end_customerFields), toDateValue(value));
case DATE_BEFORE_OR_EQUAL ->
criteriaBuilder.lessThanOrEqualTo(toDateFieldValue(criteriaBuilder, end_customerFields), toDateValue(value));
case DATE_EQUAL_OR_AFTER ->
criteriaBuilder.greaterThanOrEqualTo(toDateFieldValue(criteriaBuilder, end_customerFields), toDateValue(value));
case CONSTRAINED_VALUE_EQUALS -> criteriaBuilder.equal(end_customerFields.get("value"), value);
case CONSTRAINED_VALUE_NOT_EQUALS -> criteriaBuilder.notEqual(end_customerFields.get("value"), value);
};
}
private static Expression<Integer> toNumberFieldValue(Join<EndCustomerEntity, EndCustomerFieldEntity> end_customerFields) {
return end_customerFields.get("value").as(Integer.class);
}
private static Expression<Integer> toNumericValue(CriteriaBuilder criteriaBuilder, String value) {
return criteriaBuilder.literal(Integer.parseInt(value));
}
private static Expression<LocalDate> toDateFieldValue(CriteriaBuilder criteriaBuilder, Join<EndCustomerEntity, EndCustomerFieldEntity> endCustomerFields) {
/* FIXME - STR_TO_DATE is mysql/mariadb specific */
return criteriaBuilder.function("STR_TO_DATE", LocalDate.class, endCustomerFields.get("value"), criteriaBuilder.literal("%Y-%m-%d"));
}
private static LocalDate toDateValue( String value) {
return LocalDate.parse(value, DateTimeFormatter.ISO_LOCAL_DATE);
}
}
Do you have any suggestions on how I can make the performance fast enough so this could actually be used?
Some ideas from my side:
Remove some boolean operators. GPT suggested removing the text contains, for instance
Limit the number of criterias from 30 to 15 or whatever
Store the data in a different way.
One idea would be one field table with 30 optional varchars. That would mean just one join
One idea would be one field table/data type - that would make the joining/lookup logic more complex,
but would avoid having to cast from varchar to number or date.
One idea would be to use a different storage medium… like a nosql database… at least GPT suggested that -
I don’t really have experience with nosql yet, to be honest.
Ideas and suggestions are highly welcome