No items found.

BigQuery SQL Syntax: Differences with Standard SQL

Dan LeBlanc

December 28, 2023

Insight

This article will outline some key differences for writing the dialect of SQL used in BigQuery, called GoogleSQL, as it differs from standard SQL syntax used in Redshift and Snowflake.

Previously, BigQuery supported a more divergent SQL dialect for BigQuery (now referred to as Legacy SQL).

Data Types

BigQuery does not support a number of different data types that are used in other platforms. For instance:

STRING is used instead of CHAR or VARCHAR.
There is no TIMESTAMP without a timezone in BigQuery. Bigquery uses UTC.
BIGDECIMAL is used instead of NUMBER or DECIMAL.
You cannot set DEFAULT for data types or for constraints such as NULL or NOT NULL.

Notes on Functions

BigQuery requires certain differences around functions due to differences in its architecture. Three notable ones that we’ve encountered:

MD5 is stored as a "byte" in BigQuery. Therefore, you must use TO_HEX(MD5) as a function to convert the MD5 to a string.
Truncating timestamps with ::TIMESTAMP or ::DATE does not work. Instead, you must use DATE() or DATETIME_TRUNC() to get just the date.
ILIKE (i.e., the case insensitive LIKE) does not exist! You should change your query to use LOWER on the comparisons.

CASCADE Support

The CASCADE argument, which eliminates a dataset and all resources within that dataset, is only available for DROP SCHEMA, not DROP TABLE.

In order to drop all the resources for a table you need to explicitly call out those resources as CASCADE does not work at the table level.

BigQuery SQL Gotchas

Commenting in BigQuery

When running SQL via ODBC avoid using multi-line comments like this:

Instead, you need to use:

‍‍

Or:

Hot Tip: SQL execution via ODBC stalls when using the first mode of multi-line comment.

Using CLUSTER BY in BigQuery

When using CLUSTER BY with more than one field, do not use (), as BigQuery interprets fields inside parenthesis as a STRUCT.

Instead, you need to use:

Other SQL Gotchas That We’ve Found

Don't name a column and a CTE the same, as it will interpret the column as an array.
::DATE does not work. Use DATE() instead.
NVL does not work. Use COALESCE instead.
DELETE for upsert (i.e., update or insert) needs to have a JOIN.
SPLIT_PART does not work. Must use SPLIT, but that will turn what’s returned into an array, so we recommend something specific: e.g., SPLIT(email, '@')[OFFSET(0)] and SPLIT(email, '@')[OFFSET(1)].
UNION needs to be UNION ALL.

Keep Reading

Looking to choose a data warehouse for your brand?

We wrote an article comparing Snowflake, Redshift, and BigQuery: Choosing a Cloud Data Warehouse

Looking to build a custom data stack?

Daasity supports the varied data architecture, analytics, and reporting needs of consumer brands selling via eCommerce, Amazon, retail, and/or wholesale. Using Daasity, teams across an organization get a centralized and normalized view of all their data, regardless of the tools in their tech stack and how their future data needs may change.

For more information about Daasity, our 60+ integrations, and how the platform drives more profitable growth for 1600+ brands, learn more here.

Ready to take your data to the next level?

May interest you

Best Ecommerce Analytics Software for Fast-Growing DTC Brands

We've gathered real user ratings, key features, and pricing details to help you find the best fit for your DTC brand. Here's an overview of the best options on the market today.

How to use Subscriptions to Increase Customer LTV

CLTV is the predicted net profit a business can generate from a customer from the beginning to the end of their relationship. It helps DTC businesses understand the long-term value of acquiring and retaining customers, rather than focusing solely on individual transaction values.