Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issue while casting to BIGINT #17388

Closed
Shanmugavel-J opened this issue Jan 29, 2025 · 10 comments · Fixed by #17412
Closed

Issue while casting to BIGINT #17388

Shanmugavel-J opened this issue Jan 29, 2025 · 10 comments · Fixed by #17412
Assignees

Comments

@Shanmugavel-J
Copy link

Shanmugavel-J commented Jan 29, 2025

Summary

Issue: When casting null value from payload(variant datatype) it throws error.

Execute the below queries,

Create table: CREATE TABLE TEST(payload variant);
Insert records: INSERT INTO TEST(payload) VALUES ('{"test":"value","test1":null, "test2": "null", "obj": {"test":"tes"}}');

When executing the below select query it should ideally cast it to NULL,
Select Query: SELECT payload:test1::BIGINT FROM TEST;

Error i get is

SQL Error: Failover Retry Error executing query after 0 failover retry: SQL: SELECT payload:test1::BIGINT FROM TEST Query failed: QueryErrors{code=1006, message=unable to cast to type Int64 while evaluating function `to_int64('null')` in expr `to_int64()`, during run expr: `CAST(get_by_keypath(test.payload (#0), '{"test1"}') AS Int64 NULL)`} cause: null

But the below select query get succeded,
Select query: SELECT null::BIGINT;

The only difference is that when null value comes from payload it fails but when coming from column it passes.

@sundy-li
Copy link
Member

you can use try_cast instead.

cast will throw errors if it fails to cast.
try_cast will return null.

null::bigint is a special example that we know it's always null in SQL, so it returns null.

@Shanmugavel-J
Copy link
Author

Shanmugavel-J commented Jan 30, 2025

@sundy-li

  • "null" (string) is just a string containing the word “null”. If you try to cast it to BIGINT, it should fail unless you use TRY_CAST, which would return NULL instead of an error.
  • NULL (SQL null) is a special marker meaning “unknown” or “missing”. Casting NULL::BIGINT should work as expected and remain NULL.

If the payload contains "null" (as a string), CAST("null" AS BIGINT) will fail. But if the payload contains a true NULL value, CAST(NULL AS BIGINT) should work without issue.

So yes, TRY_CAST is a workaround for handling string "null" gracefully, but NULL::BIGINT should already work correctly if the value is truly NULL.

@b41sh
Copy link
Member

b41sh commented Jan 30, 2025

We can modify this cast function to support casting a json null to a NULL value.

@Shanmugavel-J
Copy link
Author

Thanks @b41sh

@Shanmugavel-J
Copy link
Author

Shanmugavel-J commented Jan 30, 2025

@BohuTANG @b41sh One more issue to add ON. Please find the summary of it,

summary

Execute the below queries,

Create table: CREATE TABLE TEST(payload variant);
Insert records: INSERT INTO TEST(payload) VALUES ('{"test":"value","test1":null, "test2": "null", "obj": {"test":"tes"}}');

Expected Behavior:

When running the below query, the record should be returned because test1 is SQL NULL, not the string "null":
Select Query: SELECT * FROM TEST WHERE payload:test1 is null;

Another one thing was that when i execute the below query, i get the record but actually when inserting it was SQL null not string null.
Select Query: SELECT * FROM TEST WHERE payload:test1 = 'null';

However, the expected result is not being achieved. Please look into this issue.

@b41sh
Copy link
Member

b41sh commented Feb 5, 2025

Hi @Shanmugavel-J thanks for your report.

SELECT * FROM TEST WHERE payload:test1 is null;
I think it's correct that this SQL doesn't return the record because payload:test1 is a JSON null, not a SQL NULL, and it should only be returned when querying for a field that doesn't exist, such as this SQL.
SELECT * FROM TEST WHERE payload:test10 is null;
To query for a field that's a JSON null, we can use the is_null_value function like this:
SELECT * FROM TEST WHERE is_null_value(payload:test1);

I'm not sure I fully understand what you mean, please let me know if I'm wrong.

@Shanmugavel-J
Copy link
Author

Shanmugavel-J commented Feb 5, 2025

@b41sh Thanks for you response. as you had mentioned that is correct. My bad..

Please find another issue mentioned below,

Create query: CREATE OR REPLACE TEMP TABLE TEST(payload variant, ts int);
Execute the below queries:

INSERT INTO TEST (payload) SELECT PARSE_JSON('{"key3": "value3","key5":"null","key6": {"test": "test"}}');
INSERT INTO TEST (payload) SELECT PARSE_JSON('{"key3": "value3", "key4": null,"key5":"null","key6": {"test": "test"}}');
INSERT INTO TEST (payload) SELECT PARSE_JSON('{"key3": "value3", "key4": "null","key5":"null","key6": {"test": "test"}}');

Execute the select query: SELECT TYPEOF(payload:key4), payload:key4 FROM TEST;

Output:

VARIANT NULL	"null"
VARIANT NULL	[NULL]
VARIANT NULL	null

Ideally the first one should be varchar since the value is "null".

The another one which i had posted was

Another one thing was that when i execute the below query, i get the record but actually when inserting it was SQL null (VARIANT) not string null.

Create query: CREATE OR REPLACE TEMP TABLE TEST(payload variant, ts int);
Execute the below queries:

INSERT INTO TEST (payload) SELECT PARSE_JSON('{"key3": "value3","key5":"null","key6": {"test": "test"}}');
INSERT INTO TEST (payload) SELECT PARSE_JSON('{"key3": "value3", "key4": null,"key5":"null","key6": {"test": "test"}}');
INSERT INTO TEST (payload) SELECT PARSE_JSON('{"key3": "value3", "key4": "null","key5":"null","key6": {"test": "test"}}');

Select Query: SELECT * FROM TEST WHERE payload:test1 = 'null';

Output:

{"key3":"value3","key4":"null","key5":"null","key6":{"test":"test"}}	
{"key3":"value3","key4":null,"key5":"null","key6":{"test":"test"}}

Ideally in the above result, the Second record shouldn't come because it is variant NULL. the first one should come because it meets the condition string null.

@b41sh
Copy link
Member

b41sh commented Feb 5, 2025

@Shanmugavel-J Thanks for the reply.

For the first issue, the typeof function in Databend is similar to PostgreSQL's pg_typeof function, which returns the data type of the SQL column, not the internal type of the variant data. If you want to determine the internal type of variant data, you can use json_typeof function, for example:

root@0.0.0.0:48000/default> SELECT JSON_TYPEOF(payload:key4), payload:key4 FROM TEST;
┌───────────────────────────────────────────────┐
│ json_typeof(payload:key4) │    payload:key4   │
│      Nullable(String)     │ Nullable(Variant) │
├───────────────────────────┼───────────────────┤
│ NULLNULL              │
│ nullnull              │
│ string                    │ "null"            │
└───────────────────────────────────────────────┘
3 rows read in 0.032 sec. Processed 3 rows, 289 B (93.75 rows/s, 8.82 KiB/s)

Note that snowflake's typeof function is specific to variant data, so there is a difference in usage.

For the second issue, it looks like that the example SQL is not correct, because the test1 field does not exist in the inserted data. If we use payload::key4 as a filter, we can get the following result, which should be what you expect.

root@0.0.0.0:48000/default> SELECT * FROM TEST WHERE payload:key4 = 'null';
┌────────────────────────────────────────────────────────────────────────────────────────┐
│                                payload                               │        ts       │
│                           Nullable(Variant)                          │ Nullable(Int32) │
├──────────────────────────────────────────────────────────────────────┼─────────────────┤
│ {"key3":"value3","key4":"null","key5":"null","key6":{"test":"test"}} │            NULL │
└────────────────────────────────────────────────────────────────────────────────────────┘
1 row read in 0.037 sec. Processed 3 row, 304 B (81.08 rows/s, 8.02 KiB/s)

@Shanmugavel-J
Copy link
Author

@b41sh Sorry my mistake while pasting. it is test4. Okay for the second issue was it fixed as part of this?

I could see two records in my case with the below databend version,
Databend Query v1.2.692-nightly-8dfab90769(rust-1.85.0-nightly-2025-02-01T11:03:45.424925532Z)

@b41sh
Copy link
Member

b41sh commented Feb 5, 2025

@b41sh Sorry my mistake while pasting. it is test4. Okay for the second issue was it fixed as part of this?

I could see two records in my case with the below databend version, Databend Query v1.2.692-nightly-8dfab90769(rust-1.85.0-nightly-2025-02-01T11:03:45.424925532Z)

yes, it will be fixed after PR #17412 merged.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants