Skip to content

BigQuery TIMESTAMP_MILLIS transform conflict when model substitutes #1160

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
goldmedal opened this issue Apr 24, 2025 · 0 comments
Open

BigQuery TIMESTAMP_MILLIS transform conflict when model substitutes #1160

goldmedal opened this issue Apr 24, 2025 · 0 comments
Labels
bug Something isn't working

Comments

@goldmedal
Copy link
Contributor

Describe the bug
When the model substitutes, we use sqlglot to transpile the dialect SQL to our dialect

ast = parse_one(sql, dialect=self.data_source.value)

However, after transforming to the sqlgot IR, the function name could be mapped to its function name.
In this case, TIEMSTAMP_MILLIS will be transformed to UNIX_TO_TIME when the output dialect is None (used by our v3 API). If the output dialect is trino (used by our v2 API), it will be transformed to FROM_UNIXTIME(CAST(timestamp AS DOUBLE) / POW(10, 3)). After the v2 wren core planning, the function name will be transformed to TIEMSTAMP_SECONDS(CAST(timestamp AS DOUBLE) / POW(10, 3)), which function signature is wrong. TIMESTAMP_SECONDS only accepts INT64, but the result SQL tries to input a divided number (float64).

To Reproduce
Steps to reproduce the behavior:
For v2 case:

import sqlglot

sql = """
SELECT timestamp_millis(timestamp) from ask_question_traces
"""
ast = sqlglot.parse_one(sql, read="bigquery")
bq = ast.sql(dialect="trino")
result = sqlglot.transpile(bq, read="trino", write="bigquery")[0]
print(result)

The result is

SELECT TIMESTAMP_SECONDS(CAST(timestamp AS FLOAT64) / POWER(10, 3)) FROM ask_question_traces

For v3 case:

import sqlglot

sql = """
SELECT timestamp_millis(timestamp) from ask_question_traces
"""
ast = sqlglot.parse_one(sql, read="bigquery")
ir_sql = ast.sql(dialect=None)
print(ir_sql)
result = sqlglot.transpile(ir_sql, read=None, write="bigquery")[0]
print(result)

The result is

SELECT UNIX_TO_TIME(timestamp, 3) FROM ask_question_traces
SELECT TIMESTAMP_MILLIS(timestamp) FROM ask_question_traces

The v3 one is better. There is a only problem for UNIX_TO_TIME isn't included in our function lists.

@goldmedal goldmedal added the bug Something isn't working label Apr 24, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

1 participant