[BugFix] MV partitioned by non-SlotRef Expr can not be decomposed in query cache #56871

Merged
merged 1 commit into from
Mar 14, 2025

satanson
Contributor

Why I'm doing:

When the query cache is enabled, some queries report errors like the following:

java.lang.ClassCastException: class com.starrocks.analysis.StringLiteral cannot be cast to class com.starrocks.analysis.DateLiteral (com.starrocks.analysis.StringLiteral and com.starrocks.analysis.DateLiteral are in unnamed module of loader 'app')
        at com.starrocks.catalog.PartitionKey.successor(PartitionKey.java:363) ~[starrocks-fe.jar:?]
        at com.starrocks.planner.FragmentNormalizer.toClosedOpenRange(FragmentNormalizer.java:124) ~[starrocks-fe.jar:?]
        at com.starrocks.planner.FragmentNormalizer.getPartitionRangePredicates(FragmentNormalizer.java:576) ~[starrocks-fe.jar:?]
        at com.starrocks.planner.OlapScanNode.decomposeRangePredicates(OlapScanNode.java:1238) ~[starrocks-fe.jar:?]
        at com.starrocks.planner.OlapScanNode.normalizeConjuncts(OlapScanNode.java:1266) ~[starrocks-fe.jar:?]
        at com.starrocks.planner.OlapScanNode.toNormalForm(OlapScanNode.java:1360) ~[starrocks-fe.jar:?]
        at com.starrocks.planner.PlanNode.normalize(PlanNode.java:998) ~[starrocks-fe.jar:?]
        at com.starrocks.planner.FragmentNormalizer.normalizeSubTree(FragmentNormalizer.java:383) ~[starrocks-fe.jar:?]
        at com.starrocks.planner.FragmentNormalizer.normalizeSubTree(FragmentNormalizer.java:372) ~[starrocks-fe.jar:?]
        at com.starrocks.planner.FragmentNormalizer.normalizeSubTree(FragmentNormalizer.java:372) ~[starrocks-fe.jar:?]
        at com.starrocks.planner.FragmentNormalizer.normalize(FragmentNormalizer.java:811) ~[starrocks-fe.jar:?]
        at com.starrocks.sql.plan.PlanFragmentBuilder.finalizeFragments(PlanFragmentBuilder.java:398) ~[starrocks-fe.jar:?]
        at com.starrocks.sql.plan.PlanFragmentBuilder.createPhysicalPlan(PlanFragmentBuilder.java:243) ~[starrocks-fe.jar:?]
        at com.starrocks.sql.StatementPlanner.createQueryPlanWithReTry(StatementPlanner.java:336) ~[starrocks-fe.jar:?]
        at com.starrocks.sql.StatementPlanner.plan(StatementPlanner.java:133) ~[starrocks-fe.jar:?]
        at com.starrocks.sql.StatementPlanner.plan(StatementPlanner.java:92) ~[starrocks-fe.jar:?]
        at com.starrocks.qe.StmtExecutor.execute(StmtExecutor.java:557) ~[starrocks-fe.jar:?]
        at com.starrocks.qe.ConnectProcessor.handleQuery(ConnectProcessor.java:356) ~[starrocks-fe.jar:?]
        at com.starrocks.qe.ConnectProcessor.dispatch(ConnectProcessor.java:551) ~[starrocks-fe.jar:?]
        at com.starrocks.qe.ConnectProcessor.processOnce(ConnectProcessor.java:885) ~[starrocks-fe.jar:?]
        at com.starrocks.mysql.nio.ReadListener.lambda$handleEvent$0(ReadListener.java:69) ~[starrocks-fe.jar:?]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[?:?]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[?:?]
        at java.lang.Thread.run(Thread.java:829) ~[?:?]

The error can be reproduced with the following steps:

  1. Create a Hive or Iceberg table.
CREATE TABLE `store_sales2_flat` (
  `ss_sold_date_sk` bigint(20) DEFAULT NULL,
  `ss_sold_time_sk` bigint(20) DEFAULT NULL,
  `ss_item_sk` bigint(20) DEFAULT NULL,
  `ss_customer_sk` bigint(20) DEFAULT NULL,
  `ss_sold_date` varchar(1073741824) DEFAULT NULL
)
PARTITION BY (ss_sold_date);
  2. Ingest some rows that contain ss_sold_date = '20001231'. Note that the format of ss_sold_date is yyyyMMdd.
  3. Create an MV on the table, using str2date(ss_sold_date, '%Y%m%d') as the PARTITION BY expression.
CREATE MATERIALIZED VIEW `mv2` (`ss_sold_date`, `ss_customer_sk`, `_ca0002`)
COMMENT "MV recommended by AutoMV"
PARTITION BY (str2date(`ss_sold_date`, '%Y%m%d'))
DISTRIBUTED BY HASH(`ss_sold_date`) BUCKETS 64 
ORDER BY (ss_sold_date)
REFRESH ASYNC START("2023-12-01 10:00:00") EVERY(INTERVAL 1 DAY)
PROPERTIES (
"replicated_storage" = "true",
"replication_num" = "3",
"force_external_table_query_rewrite" = "CHECKED",
"session.enable_spill" = "true",
"storage_medium" = "HDD"
)
AS SELECT `store_sales2_flat`.`ss_sold_date`, `store_sales2_flat`.`ss_customer_sk`, count(1) AS `_ca0002`
FROM `flat_tpcds_db`.`store_sales2_flat`
GROUP BY `store_sales2_flat`.`ss_sold_date`, `store_sales2_flat`.`ss_customer_sk`; 
  4. Issue the query and get the error.
mysql> explain costs select count(*) from emr_iceberg_test.flat_tpcds_db.store_sales2_flat where ss_sold_date = '20001231';
ERROR 1064 (HY000): class com.starrocks.analysis.StringLiteral cannot be cast to class com.starrocks.analysis.DateLiteral (com.starrocks.analysis.StringLiteral and com.starrocks.analysis.DateLiteral are in unnamed module of loader 'app')

The reason is that when the query cache is enabled, we try to decompose the filter ss_sold_date = '20001231'. The filter is first converted to the range ['20001231', '20001231'], which happens to overlap the MV's partition ['2000-12-31', '2001-01-01'). The range is then converted to a closed-open range, and this conversion fails because '20001231' cannot be recognized as a legal date string (a legal date string must be in the format %Y-%m-%d).
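The failure mode described above can be illustrated with a small standalone sketch. It is not the StarRocks implementation: the class `ClosedOpenRangeSketch` and its methods are hypothetical names, and `java.time` stands in for the FE's literal types. It assumes, as the explanation states, that only yyyy-MM-dd strings count as legal dates, and it returns an empty result instead of throwing when a bound cannot be parsed.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.Optional;

public class ClosedOpenRangeSketch {
    // Assumption: only the canonical yyyy-MM-dd form is a legal date string.
    private static final DateTimeFormatter CANONICAL = DateTimeFormatter.ISO_LOCAL_DATE;

    // Returns the parsed date, or empty if the literal is not in yyyy-MM-dd form.
    static Optional<LocalDate> parseCanonical(String literal) {
        try {
            return Optional.of(LocalDate.parse(literal, CANONICAL));
        } catch (DateTimeParseException e) {
            return Optional.empty();
        }
    }

    // Converts the closed range [lower, upper] into the closed-open range
    // [lower, successor(upper)). Returns empty when either bound is not a
    // legal date, so the caller can skip decomposition instead of failing.
    static Optional<String> toClosedOpen(String lower, String upper) {
        Optional<LocalDate> lo = parseCanonical(lower);
        Optional<LocalDate> hi = parseCanonical(upper);
        if (!lo.isPresent() || !hi.isPresent()) {
            return Optional.empty();
        }
        return Optional.of("[" + lo.get() + ", " + hi.get().plusDays(1) + ")");
    }

    public static void main(String[] args) {
        // A canonical literal can be turned into a closed-open range.
        System.out.println(toClosedOpen("2000-12-31", "2000-12-31"));
        // The yyyyMMdd literal from the reproduction steps cannot.
        System.out.println(toClosedOpen("20001231", "20001231"));
    }
}
```

In this sketch the yyyyMMdd literal yields an empty result, which corresponds to the point where the real code path hit the ClassCastException in PartitionKey.successor.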

What I'm doing:

An MV can be decomposed when its partition expression is one of the following:

  1. SlotRef
  2. date_trunc
  3. str2date(dt, '%Y-%m-%d')
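The rule in the list above can be sketched as a simple predicate. This is a hedged illustration only: `PartitionExpr` and `isDecomposable` are hypothetical names, not StarRocks classes, and the sketch assumes that the str2date format meant is '%Y-%m-%d' (the legal date format named in the explanation above) and that every other expression is treated as non-decomposable.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class DecomposablePartitionExprSketch {
    // Hypothetical, simplified stand-in for an MV partition expression.
    // A plain column reference has an empty function name; a function call
    // carries its name and its literal arguments (column arguments omitted).
    static final class PartitionExpr {
        final String functionName;
        final List<String> literalArgs;

        PartitionExpr(String functionName, String... literalArgs) {
            this.functionName = functionName;
            this.literalArgs = Collections.unmodifiableList(Arrays.asList(literalArgs));
        }
    }

    // Returns true for the three shapes listed in the PR description.
    static boolean isDecomposable(PartitionExpr expr) {
        switch (expr.functionName) {
            case "":
                // Plain column reference (SlotRef).
                return true;
            case "date_trunc":
                return true;
            case "str2date":
                // Assumption: only the canonical format keeps string and
                // date partition keys comparable.
                return expr.literalArgs.size() == 1
                        && "%Y-%m-%d".equals(expr.literalArgs.get(0));
            default:
                return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isDecomposable(new PartitionExpr("")));
        System.out.println(isDecomposable(new PartitionExpr("date_trunc", "day")));
        System.out.println(isDecomposable(new PartitionExpr("str2date", "%Y-%m-%d")));
        // The MV from the reproduction steps uses '%Y%m%d'.
        System.out.println(isDecomposable(new PartitionExpr("str2date", "%Y%m%d")));
    }
}
```

Under these assumptions, the MV from the reproduction steps is rejected by the predicate, so its partition ranges would never be compared against the string literal from the query filter.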

Fixes #issue

What type of PR is this:

  • BugFix
  • Feature
  • Enhancement
  • Refactor
  • UT
  • Doc
  • Tool

Does this PR entail a change in behavior?

  • Yes, this PR will result in a change in behavior.
  • No, this PR will not result in a change in behavior.

If yes, please specify the type of change:

  • Interface/UI changes: syntax, type conversion, expression evaluation, display information
  • Parameter changes: default values, similar parameters but with different default values
  • Policy changes: use new policy to replace old one, functionality automatically enabled
  • Feature removed
  • Miscellaneous: upgrade & downgrade compatibility, etc.

Checklist:

  • I have added test cases for my bug fix or my new feature
  • This pr needs user documentation (for new or modified features or behaviors)
    • I have added documentation for my new feature or new function
  • This is a backport pr

Bugfix cherry-pick branch check:

  • I have checked the version labels which the pr will be auto-backported to the target branch
    • 3.4
    • 3.3
    • 3.2
    • 3.1
    • 3.0

@satanson satanson requested a review from a team as a code owner March 13, 2025 08:11
Pair<Long, Range<PartitionKey>> partitionKeyRange = rangeMap.get(i);
// when the range is to total cover this partition, we also cache it
if (!range.isEmpty()) {
selectedRangeMap.put(partitionKeyRange.first, range.toString());
if (optRange.isPresent() && !optRange.get().isEmpty()) {
Contributor

Is it possible for two different partitions to have the same selectedRange?

Contributor Author

No, that is impossible. At present, only single-column range partitioning is taken into consideration, so partitions do not intersect with each other. A selected range is just a sub-range of a single partition.
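The argument in this reply can be checked with a small sketch. It is an illustration under assumptions, not the FragmentNormalizer code: `SelectedRangeSketch` and `selectRanges` are hypothetical names, partitions are modeled as closed-open date ranges keyed by partition id, and the query range is intersected with each partition in turn.

```java
import java.time.LocalDate;
import java.util.LinkedHashMap;
import java.util.Map;

public class SelectedRangeSketch {
    // Intersects the closed-open query range [qLo, qHi) with each partition's
    // closed-open range {lower, upper} and keeps the non-empty intersections,
    // keyed by partition id.
    static Map<Long, String> selectRanges(Map<Long, LocalDate[]> partitions,
                                          LocalDate qLo, LocalDate qHi) {
        Map<Long, String> selected = new LinkedHashMap<>();
        for (Map.Entry<Long, LocalDate[]> e : partitions.entrySet()) {
            LocalDate pLo = e.getValue()[0];
            LocalDate pHi = e.getValue()[1];
            LocalDate lo = pLo.isAfter(qLo) ? pLo : qLo;   // max of lower bounds
            LocalDate hi = pHi.isBefore(qHi) ? pHi : qHi;  // min of upper bounds
            if (lo.isBefore(hi)) {                          // non-empty only
                selected.put(e.getKey(), "[" + lo + ", " + hi + ")");
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        // Three adjacent, non-intersecting daily partitions.
        Map<Long, LocalDate[]> partitions = new LinkedHashMap<>();
        partitions.put(1L, new LocalDate[] {LocalDate.of(2000, 12, 30), LocalDate.of(2000, 12, 31)});
        partitions.put(2L, new LocalDate[] {LocalDate.of(2000, 12, 31), LocalDate.of(2001, 1, 1)});
        partitions.put(3L, new LocalDate[] {LocalDate.of(2001, 1, 1), LocalDate.of(2001, 1, 2)});

        // The query range spans partitions 2 and 3 but not partition 1.
        System.out.println(selectRanges(partitions,
                LocalDate.of(2000, 12, 31), LocalDate.of(2001, 1, 2)));
    }
}
```

Because each non-empty intersection lies inside its own partition and the partitions are disjoint, the selected ranges are disjoint as well, so no two partition ids can map to the same range string.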

[Java-Extensions Incremental Coverage Report]

pass : 0 / 0 (0%)

[FE Incremental Coverage Report]

fail : 29 / 38 (76.32%)

file detail

| path | covered_line | new_line | coverage | not_covered_line_detail |
|---|---|---|---|---|
| 🔵 com/starrocks/catalog/ExpressionRangePartitionInfoV2.java | 0 | 1 | 00.00% | [247] |
| 🔵 com/starrocks/planner/OlapScanNode.java | 23 | 30 | 76.67% | [1341, 1344, 1354, 1383, 1384, 1385, 1387] |
| 🔵 com/starrocks/planner/FragmentNormalizer.java | 5 | 6 | 83.33% | [585] |
| 🔵 com/starrocks/catalog/ExpressionRangePartitionInfo.java | 1 | 1 | 100.00% | [] |

[BE Incremental Coverage Report]

pass : 0 / 0 (0%)

@satanson satanson enabled auto-merge (squash) March 14, 2025 02:10
@kangkaisen kangkaisen disabled auto-merge March 14, 2025 02:17
@kangkaisen kangkaisen merged commit 7b82d5f into main Mar 14, 2025
69 of 70 checks passed
@kangkaisen kangkaisen deleted the query_cache_mv_part_by_non_slot_ref branch March 14, 2025 02:17
@Mergifyio backport branch-3.2

@Mergifyio backport branch-3.3

@Mergifyio backport branch-3.4

Contributor

mergify bot commented Mar 14, 2025

backport branch-3.2

✅ Backports have been created

Contributor

mergify bot commented Mar 14, 2025

backport branch-3.3

✅ Backports have been created

Contributor

mergify bot commented Mar 14, 2025

backport branch-3.4

✅ Backports have been created

mergify bot pushed a commit that referenced this pull request Mar 14, 2025
…query cache (#56871)

Signed-off-by: satanson <[email protected]>
(cherry picked from commit 7b82d5f)

# Conflicts:
#	fe/fe-core/src/main/java/com/starrocks/catalog/ExpressionRangePartitionInfo.java
#	fe/fe-core/src/main/java/com/starrocks/catalog/ExpressionRangePartitionInfoV2.java
#	fe/fe-core/src/main/java/com/starrocks/planner/FragmentNormalizer.java
#	fe/fe-core/src/main/java/com/starrocks/planner/OlapScanNode.java
mergify bot pushed a commit that referenced this pull request Mar 14, 2025
…query cache (#56871)

Signed-off-by: satanson <[email protected]>
(cherry picked from commit 7b82d5f)

# Conflicts:
#	fe/fe-core/src/main/java/com/starrocks/planner/FragmentNormalizer.java
mergify bot pushed a commit that referenced this pull request Mar 14, 2025
…query cache (#56871)

Signed-off-by: satanson <[email protected]>
(cherry picked from commit 7b82d5f)
wanpengfei-git pushed a commit that referenced this pull request Mar 14, 2025
satanson added a commit that referenced this pull request Mar 14, 2025
…query cache (#56871)

Signed-off-by: satanson <[email protected]>
(cherry picked from commit 7b82d5f)
Signed-off-by: satanson <[email protected]>
satanson added a commit that referenced this pull request Mar 14, 2025
…query cache (backport #56871)

Signed-off-by: satanson <[email protected]>
(cherry picked from commit 7b82d5f)
Signed-off-by: satanson <[email protected]>
kangkaisen pushed a commit that referenced this pull request Mar 14, 2025
kangkaisen pushed a commit that referenced this pull request Mar 14, 2025