[opt](iceberg)Optimize count* in batch mode #49025

Open: 1 commit to merge into base branch master

wuwenchi
Contributor

What problem does this PR solve?

Problem Summary:

In batch mode the number of splits is not known in advance, so a precomputed count(*) result cannot be divided among the splits.
With this change, when the count(*) optimization is available, the scan uses the traditional (non-batch) mode instead of batch mode.
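
The reasoning above can be illustrated with a minimal sketch. It is not the Doris implementation: `assign_counts` and `choose_scan_mode` are hypothetical names, and the even distribution of the count is an assumption made only for illustration. It shows why the split count has to be known before a total row count can be assigned to splits, and how the mode choice follows from that.

```python
from typing import List, Optional


def assign_counts(total_rows: int, num_splits: int) -> List[int]:
    """Spread a precomputed row count across a known number of splits.

    Hypothetical helper: this only works when num_splits is known upfront,
    which is not the case in batch mode.
    """
    if num_splits <= 0:
        raise ValueError("num_splits must be positive")
    base, remainder = divmod(total_rows, num_splits)
    # Every split gets the base share; the first `remainder` splits get one extra row.
    return [base + (1 if i < remainder else 0) for i in range(num_splits)]


def choose_scan_mode(batch_mode_requested: bool, pushed_down_count: Optional[int]) -> str:
    """Pick the scan mode (hypothetical decision function).

    If a count(*) result is available, use the traditional mode so that the
    count can be divided among a known set of splits.
    """
    if pushed_down_count is not None:
        return "traditional"
    return "batch" if batch_mode_requested else "traditional"
```

For example, `assign_counts(10, 3)` yields per-split counts that sum back to 10, while `choose_scan_mode(True, 10)` selects the traditional mode even though batch mode was requested.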

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@Thearas
Contributor

Thearas commented Mar 13, 2025

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@wuwenchi
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 32473 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
TPC-H sf100 test result on commit faec138ca21328ccf301c65e3a9deccbd047c4a3, data reload: false

------ Round 1 ----------------------------------
query	run 1, cold (ms)	run 2, hot (ms)	run 3, hot (ms)	best hot (ms)
q1	17605	5220	5078	5078
q2	2043	285	164	164
q3	10426	1293	746	746
q4	10210	1008	531	531
q5	7549	2426	2338	2338
q6	190	168	133	133
q7	960	745	610	610
q8	9313	1298	1065	1065
q9	4906	4895	4721	4721
q10	6868	2322	1933	1933
q11	464	276	269	269
q12	350	360	213	213
q13	17760	3668	3108	3108
q14	223	240	225	225
q15	541	481	475	475
q16	626	616	584	584
q17	573	861	348	348
q18	6826	6468	6298	6298
q19	1698	960	569	569
q20	326	314	199	199
q21	2691	2133	1890	1890
q22	1060	1016	976	976
Total cold run time: 103208 ms
Total hot run time: 32473 ms

----- Round 2, with runtime_filter_mode=off -----
query	run 1, cold (ms)	run 2, hot (ms)	run 3, hot (ms)	best hot (ms)
q1	5185	5183	5540	5183
q2	233	321	226	226
q3	2157	2665	2298	2298
q4	1419	1812	1347	1347
q5	4258	4171	4159	4159
q6	206	162	120	120
q7	1893	1941	1819	1819
q8	2617	2620	2602	2602
q9	7281	7264	7268	7264
q10	3010	3247	2777	2777
q11	573	512	492	492
q12	720	776	590	590
q13	3314	4014	3288	3288
q14	288	288	276	276
q15	517	468	458	458
q16	647	716	652	652
q17	1131	1520	1438	1438
q18	7817	7724	7626	7626
q19	818	779	835	779
q20	1949	2094	1899	1899
q21	5469	4929	4671	4671
q22	1090	1053	1042	1042
Total cold run time: 52592 ms
Total hot run time: 51006 ms

@doris-robot

TPC-DS: Total hot run time: 193579 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit faec138ca21328ccf301c65e3a9deccbd047c4a3, data reload: false

query	run 1, cold (ms)	run 2, hot (ms)	run 3, hot (ms)	best hot (ms)
query1	1392	995	969	969
query2	6184	1920	1888	1888
query3	11012	4619	4486	4486
query4	54390	24852	23509	23509
query5	5303	542	488	488
query6	360	210	193	193
query7	5003	494	297	297
query8	328	252	269	252
query9	6275	2611	2648	2611
query10	432	337	263	263
query11	15221	15532	15012	15012
query12	154	112	111	111
query13	1162	526	414	414
query14	10726	6946	7149	6946
query15	212	204	180	180
query16	7055	670	483	483
query17	1070	743	575	575
query18	1548	418	340	340
query19	203	211	174	174
query20	135	124	123	123
query21	210	127	104	104
query22	4714	4770	4643	4643
query23	34213	33387	33545	33387
query24	5742	2466	2495	2466
query25	492	455	399	399
query26	727	274	162	162
query27	1790	489	337	337
query28	2954	2479	2486	2479
query29	565	550	417	417
query30	278	220	184	184
query31	888	856	763	763
query32	69	63	62	62
query33	468	362	304	304
query34	765	873	511	511
query35	800	847	776	776
query36	958	1017	907	907
query37	120	100	77	77
query38	4114	4261	4129	4129
query39	1480	1464	1495	1464
query40	210	122	108	108
query41	53	57	50	50
query42	131	107	103	103
query43	496	521	481	481
query44	1383	818	807	807
query45	181	177	177	177
query46	914	1038	677	677
query47	1852	1874	1792	1792
query48	397	428	329	329
query49	694	517	430	430
query50	752	807	454	454
query51	4322	4354	4271	4271
query52	107	108	94	94
query53	243	272	196	196
query54	494	518	443	443
query55	85	81	82	81
query56	270	300	269	269
query57	1154	1183	1116	1116
query58	244	246	240	240
query59	2797	2822	2672	2672
query60	308	287	258	258
query61	127	123	122	122
query62	727	767	709	709
query63	248	205	204	204
query64	1946	1049	670	670
query65	4654	4498	4443	4443
query66	739	376	291	291
query67	15960	15731	15453	15453
query68	7091	843	491	491
query69	534	301	266	266
query70	1232	1062	1120	1062
query71	505	315	265	265
query72	5734	3683	3897	3683
query73	1332	755	354	354
query74	8994	9177	8810	8810
query75	3653	3168	2689	2689
query76	4246	1195	750	750
query77	606	374	283	283
query78	10235	10043	9386	9386
query79	3356	833	582	582
query80	716	528	441	441
query81	485	257	215	215
query82	716	125	97	97
query83	304	171	148	148
query84	289	92	78	78
query85	797	358	311	311
query86	414	302	289	289
query87	4463	4416	4472	4416
query88	3479	2279	2272	2272
query89	427	318	284	284
query90	1933	214	225	214
query91	138	137	107	107
query92	71	57	59	57
query93	1878	1092	575	575
query94	664	409	312	312
query95	350	288	264	264
query96	497	560	283	283
query97	3388	3426	3331	3331
query98	227	209	198	198
query99	1445	1382	1241	1241
Total cold run time: 301199 ms
Total hot run time: 193579 ms

@doris-robot

ClickBench: Total hot run time: 31.27 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit faec138ca21328ccf301c65e3a9deccbd047c4a3, data reload: false

query	run 1, cold (s)	run 2, hot (s)	run 3, hot (s)
query1	0.04	0.04	0.04
query2	0.07	0.03	0.04
query3	0.24	0.06	0.06
query4	1.62	0.11	0.10
query5	0.57	0.55	0.56
query6	1.20	0.72	0.71
query7	0.02	0.02	0.02
query8	0.04	0.03	0.04
query9	0.59	0.51	0.51
query10	0.60	0.61	0.59
query11	0.16	0.10	0.10
query12	0.15	0.11	0.11
query13	0.62	0.61	0.59
query14	2.83	2.67	2.83
query15	0.92	0.85	0.85
query16	0.38	0.36	0.37
query17	1.00	1.02	1.04
query18	0.21	0.20	0.20
query19	1.87	1.82	1.90
query20	0.01	0.01	0.01
query21	15.36	0.93	0.54
query22	0.75	1.23	0.65
query23	14.92	1.41	0.66
query24	6.87	1.32	1.54
query25	0.56	0.29	0.07
query26	0.56	0.16	0.13
query27	0.05	0.05	0.04
query28	9.85	0.84	0.42
query29	12.55	4.00	3.32
query30	0.27	0.09	0.06
query31	2.82	0.58	0.39
query32	3.22	0.54	0.46
query33	3.01	2.99	3.03
query34	15.86	5.12	4.51
query35	4.49	4.58	4.54
query36	0.68	0.49	0.50
query37	0.09	0.06	0.06
query38	0.05	0.03	0.04
query39	0.02	0.02	0.02
query40	0.17	0.13	0.13
query41	0.08	0.03	0.02
query42	0.03	0.02	0.02
query43	0.03	0.03	0.03
Total cold run time: 105.43 s
Total hot run time: 31.27 s
