Draft: Parse literal to different types #15202

Draft
wants to merge 2 commits into
base: main

Conversation

jayzhan211
Contributor

Which issue does this PR close?

  • Closes #.

Rationale for this change

What changes are included in this PR?

Are these changes tested?

Are there any user-facing changes?

@github-actions github-actions bot added the sql SQL Planner label Mar 13, 2025
@jayzhan211
Contributor Author

jayzhan211 commented Mar 13, 2025

I analyzed some of the errors, and I think that if we fix the way we parse SQL, we can easily find an optimal type for each literal.

For example, SELECT -'100' is currently parsed as Unary(Minus, "100"), but we could produce -100 directly instead.
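A minimal sketch of that folding idea, assuming a simplified expression tree. The `Expr` enum and `fold_negated_string` below are hypothetical names for illustration only; they are not DataFusion's or sqlparser's actual types, and the real change in this PR may look quite different.

```rust
/// Hypothetical, simplified expression tree (not DataFusion's real `Expr`).
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    /// A quoted literal such as '100'.
    StringLiteral(String),
    /// An integer literal such as 100 or -100.
    Int64Literal(i64),
    /// Unary minus applied to an inner expression.
    Negate(Box<Expr>),
}

/// Fold `-'<digits>'` into a negative integer literal when the quoted text
/// parses as an i64. Anything else is returned unchanged.
#[allow(dead_code)]
fn fold_negated_string(expr: Expr) -> Expr {
    match expr {
        Expr::Negate(inner) => match *inner {
            Expr::StringLiteral(s) => {
                // Parse the text, then negate with an overflow check.
                let folded = s.parse::<i64>().ok().and_then(i64::checked_neg);
                match folded {
                    Some(n) => Expr::Int64Literal(n),
                    // Not numeric: keep the original unary expression.
                    None => Expr::Negate(Box::new(Expr::StringLiteral(s))),
                }
            }
            other => Expr::Negate(Box::new(other)),
        },
        other => other,
    }
}
```

With this sketch, the tree for -'100' folds to an integer literal, while a non-numeric string such as -'abc' keeps its unary form and can still fail later during type coercion.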

External error: statement failed: DataFusion error: Arrow error: Cast error: Cannot cast string 'FF01' to value of Int64 type
[SQL] CREATE TABLE t
AS VALUES
  ('FF01', X'FF01'),
  ('ABC', X'ABC'),
  ('000', X'000');
at test_files/binary.slt:33

External error: query is expected to fail, but actually succeed:
[SQL] select abs('-1.2');
at test_files/math.slt:134

External error: query failed: DataFusion error: Error during planning: Internal error: Expect TypeSignatureClass::Native(LogicalType(Native(String), String)) but received NativeType::Int64, DataType: Int64.
This was likely caused by a bug in DataFusion's code and we would welcome that you file an bug report in our issue tracker No function matches the given name and argument types 'ascii(Int64)'. You might need to add explicit type casts.
	Candidate functions:
	ascii(Coercion(TypeSignatureClass::Native(LogicalType(Native(String), String))))
[SQL] SELECT ascii('222')
at test_files/expr.slt:327

External error: query columns mismatch:
[SQL] select column1, column2 from validate_partitioned_parquet4 order by column1,column2;
[Expected] [T]T
[Actual  ] [I]T
at test_files/copy.slt:203

External error: query is expected to fail, but actually succeed:
[SQL] SELECT -'100'
at test_files/scalar.slt:1557

External error: query columns mismatch:
[SQL] SELECT '0' as c UNION ALL BY NAME SELECT 0 as c;
[Expected] [T]
[Actual  ] [I]
at test_files/union_by_name.slt:285

External error: query result mismatch:
[SQL] EXPLAIN VALUES ('1'::float)
[Diff] (-expected|+actual)
-   logical_plan Values: (Float32(1) AS Utf8("1"))
+   logical_plan Values: (Float32(1) AS Int64(1))
    physical_plan DataSourceExec: partitions=1, partition_sizes=[1]
at test_files/select.slt:429

External error: query is expected to fail, but actually succeed:
[SQL] select to_timestamp('-1');
at test_files/timestamps.slt:3386

External error: query is expected to fail with error:
	(regex) DataFusion error: Error during planning: Cannot coerce arithmetic expression Interval\(MonthDayNano\) \+ Utf8 to valid types
but got error:
	DataFusion error: Error during planning: Cannot coerce arithmetic expression Interval(MonthDayNano) + Int64 to valid types
[SQL] select interval '1' + '1' month
at test_files/interval_mysql.slt:21

External error: query result mismatch:
[SQL] SELECT to_date('21311111');
[Diff] (-expected|+actual)
-   2131-11-11
+   +60317-11-04
at test_files/dates.slt:142

External error: query is expected to fail, but actually succeed:
[SQL] select regr_slope(1, '2');
at test_files/errors.slt:111

External error: query result mismatch:
[SQL] SELECT arrow_typeof('1')
[Diff] (-expected|+actual)
-   Utf8
+   Int64
at test_files/arrow_typeof.slt:79

External error: query columns mismatch:
[SQL] select nullif('2', '3');
[Expected] [T]
[Actual  ] [I]
at test_files/nullif.slt:118

External error: query columns mismatch:
[SQL] select 
    unnest(column1), unnest(column2) + 2, 
    column3 * 10, unnest(array_remove(column1, '4')) 
from unnest_table;
[Expected] III[T]
[Actual  ] III[I]
at test_files/unnest.slt:266

External error: query failed: DataFusion error: Error during planning: Projections require unique expression names but the expression "map_extract(map(make_array(Int64(1), Int64(2), Int64(3)), make_array(Int64(1), Int64(2), Int64(3))), Float64(1))" at position 1 and "map_extract(map(make_array(Int64(1), Int64(2), Int64(3)), make_array(Int64(1), Int64(2), Int64(3))), Float64(1))" at position 3 have the same name. Consider aliasing ("AS") one of them.
[SQL] select map_extract(MAP {1: 1, 2: 2, 3:3}, '1'), map_extract(MAP {1: 1, 2: 2, 3:3}, 1.0),
       map_extract(MAP {1.0: 1, 2: 2, 3:3}, '1'), map_extract(MAP {'1': 1, '2': 2, '3':3}, 1.0),
       map_extract(MAP {arrow_cast('1', 'Utf8View'): 1, arrow_cast('2', 'Utf8View'): 2, arrow_cast('3', 'Utf8View'):3}, '1');
at test_files/map.slt:579

External error: query failed: DataFusion error: Arrow error: Cast error: Cannot cast string 'default' to value of Int64 type
[SQL] select lead(a, 1, 'default') over (order by a) from (select '1' a union all select '2' a)
at test_files/window.slt:4043

External error: query failed: DataFusion error: Execution error: unsupported type for second argument to array_to_string function as Int64
[SQL] select array_to_string([1, 1, 1], '1'), array_to_string([[1, 2], [3, 4], [5, 6]], '+'), array_to_string(array_repeat(array_repeat(array_repeat(3, 2), 2), 3), '/\');
at test_files/array.slt:4240

@jayzhan211
Contributor Author

This query is tricky.

SELECT to_date('21311111');

If we treat '21311111' as an integer, we need to allow integers for to_date, but a real integer passed to to_date should be rejected. We don't have a good way to differentiate a numeric string literal from a numeric literal.
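One possible way to keep that distinction, sketched under the assumption that the planner could carry a literal's original spelling alongside its value. The `Literal` enum and both helper functions are hypothetical and do not exist in DataFusion; this only illustrates the idea.

```rust
/// Hypothetical literal that remembers how it was written in the SQL text.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum Literal {
    /// Written without quotes, e.g. 21311111.
    Integer(i64),
    /// Written with quotes, e.g. '21311111'; the original text is kept.
    Quoted(String),
}

/// Hypothetical check for a function that should accept only string input:
/// a quoted literal passes, a bare integer is rejected.
#[allow(dead_code)]
fn accepts_as_date_string(arg: &Literal) -> Result<&str, String> {
    match arg {
        Literal::Quoted(text) => Ok(text.as_str()),
        Literal::Integer(v) => Err(format!(
            "integer literal {} is not accepted; expected a string",
            v
        )),
    }
}

/// A numeric context may still reinterpret a quoted literal as an integer
/// when its text parses as one.
#[allow(dead_code)]
fn as_integer(arg: &Literal) -> Option<i64> {
    match arg {
        Literal::Integer(v) => Some(*v),
        Literal::Quoted(text) => text.parse().ok(),
    }
}
```

Under this sketch, to_date('21311111') would see the quoted form and keep its string semantics, while a bare 21311111 would be rejected; numeric contexts could still call `as_integer` on the quoted form.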
