Using BigQuery and Looker Studio with Google Analytics 4
Google Analytics 4 is a fantastic tool for collecting data, but pulling reports from the user interface is challenging, buggy, and fraught with confusing limitations. Fortunately there is a better way!
The integration between BigQuery and GA4 makes it much easier to analyze data and create robust dashboards without any limitations. The goal of this post is to explain why BigQuery is a uniquely powerful solution to overcome the challenges with reporting in Google Analytics, and share some tips and resources to help get started.
-
Migrated from ken-williams.com, updated samples, and added references to Dataform.
-
Updated all references to “data studio” to “looker studio”
Why the GA4 Reporting Interface is So Difficult to Use
As I write this in 2024 it is estimated that 56% of websites have Google Analytics installed, which means that billions of events are being recorded every second. Collecting and running queries on this massive amount of data costs money, and there has been a constant battle at Google about whether or not the benefits of offering this tool for free are worth the cost.
The latest version of Google Analytics was released in October 2020, and it was designed to be less expensive for Google to operate than the prior version (known as GA3 or Universal Analytics). Some of these cost control measures include:
- Removing views
- Expanding aggregated reporting tables
- Creating data retention limits ( read more )
- Limiting the date range for user-scoped explorations
- Tokens for the Data API ( read more )
- HyperLogLog++ ( read more )
- Query sampling ( read more )
These cost controls have produced many negative consequences for power users. For example, you may not be able to create a report with the metrics and dimensions that you need, or generate a calculated metric. Sometimes you might create a report that returns unassigned or null values. And even when you are able to build reports, you might discover that date restrictions prevent you from comparing metrics to a prior period.
These challenges do not indicate an error in the underlying data! They are the result of a reporting interface that is restrictive and difficult to understand. The good news is that you can overcome these issues by bypassing the reporting interface and working directly with the underlying data. This is possible with BigQuery.
Why BigQuery is an Ideal Solution
When you enable the BigQuery integration your data is transferred from the Google Analytics servers to BigQuery (Google's managed data warehouse product). If you are new to BigQuery, there are four things that you should know about it:
- BigQuery allows you to store and analyze massive amounts of data (we're talking petabytes) for a very low cost. Most of our clients at DiveTeam spend < $200 per month.
- BigQuery is designed to run super fast queries on large datasets. This is ideal for analysis, and makes it possible to slice data any way you want.
- BigQuery is fully managed by Google Cloud. You just pull it up and start working. Google takes care of the software and hardware running in the background.
- BigQuery integrates with everything. Imagine visualizing your spend in Google Ads with the conversions generated in Google Analytics, and the leads generated in Salesforce! This article is about Google Analytics, but the point is that any data you have can be imported and used with BigQuery.
After the data arrives in BigQuery you can run super fast SQL queries against it using the user interface, or plug it directly into your BI tool (such as Looker Studio). Using this approach eliminates ALL of the reporting limitations in Google Analytics! You can run as many complex queries as you want, because you're paying for them!
So how much does it cost?
Don't worry, BigQuery is very affordable. We've created a spreadsheet that you can use to estimate your Google Analytics storage costs in BigQuery (see screenshot below), and we also have a video tutorial that explains how to use it. Like I stated above, it is rare for a client to spend over $200 per month on BigQuery for Google Analytics alone, and We've helped over 100 companies setup the integration.
How the GA Integration with BigQuery Works
To configure the link between Google Analytics and BigQuery you first need to create a project in the Google Cloud Platform (watch this video to do it in < 4 minutes ), and then configure the integration in the Google Analytics settings ( see instructions ). This allows Google to start transferring your data to BigQuery.
I do not recommend enable the "streaming" export type unless you have a good reason to. This will significantly increase your cost, and I find that there is almost never any value in having real-time data.
What Does The Data Look Like?
The GA4 data stored in BigQuery is not completely raw, but it’s helpful to think of it as a raw and unsampled list of events. Google makes three key modifications to the data:
- The IP address is removed, but geographic information has been added
- The user agent is removed, but device information is added
- The traffic source attribution information will continue to be updated for the following 72 hours
Each day Google generates a table with data from the prior day, and each row in the table represents an event. Now here's the part that confuses beginners: each event (or row) in your table is in a JSON format. Here is a simplified example of a single page_view event in BigQuery:
event_date | event_name | event_params.key | event_params.value.string_value |
---|---|---|---|
20200101 | page_view | page_location | https://example.com |
page_title | Sample Page |
You’ll notice that there is no column for the URL, but there is a parameter called “page_location” that stores the URL where the event was collected. There’s also another parameter called “page_title” that stores the title of the page. These parameters and their values are stored in the columns “event_params.key” and “event_params.value.string_value”. Here’s what the data actually looks like when you export it from BigQuery.
{ "event_date": "20200101", "event_name": "page_view", "event_params": [{ "key": "page_location", "value": { "string_value": "https://example.com" } }, { "key": "page_title", "value": { "string_value": "Sample Page" } }] }
This format can be difficult to work with when you are getting started, which is why we have created a series of courses where you can interact with a live instructor, ask questions, and make sure you are confident that you are following best practices.
BigQuery Courses from DiveTeam
What Possibilities are Unlocked with the BigQuery Integration?
About 90% of DiveTeam's clients who use Google Analytics have enabled the BigQuery integration. We discuss the ways that companies are using the BigQuery integration to get more value from Google Analytics in detail if you take the course, Get Started with Google Analytics in BigQuery, but our clients chose to use this feature for at least one of the following three benefits:
- Improved and unlimited Google Analytics reporting
Take a look at the ecommerce dashboard that we built on sample data from the Google Store in the screenshot below, or click here to make a copy for yourself. - Dashboards or analysis across data from multiple sources
For example, you can build a chart to show how much you spent across Google Ads, Meta and Bing last month along side the revenue that was recorded in Shopify! - Advanced analysis or machine learning
Create a forecast that shows how seasonality impacts your sales, calculate lifetime value, or create audiences of users likely to purchase your products for remarketing!
Need Help from a Partner?
We would love to schedule a call to learn about your business and discuss the ways that we might be able to help!
You can CLICK HERE TO CONTACT US.