Updates to Google BigQuery following Cloud Platform Live

Felipe Hoffa, Cloud Platform team

Overview

This article covers the Google BigQuery updates announced at the Cloud Platform Live event. The main changes are higher streaming capacity, expanded SQL support, and lower prices for storage, querying, and streaming.

What You'll Learn

1. How to stream data into BigQuery at a rate of 100,000 rows per second
2. Why using table wildcard functions can simplify querying in BigQuery
3. How to create and use views in BigQuery for complex queries
4. How to annotate datasets with user-defined metadata in BigQuery
5. How to utilize JSON parsing functions for flexible data handling in BigQuery

Key Questions Answered

What is the new streaming capacity for BigQuery?
BigQuery now accepts streamed data at up to 100,000 rows per second per table, a 1,000-fold increase over the previous limit of 100 rows per second. The higher limit makes it practical to analyze high-volume data in real time as it arrives.
How do table wildcard functions improve querying in BigQuery?
Table wildcard functions enable users to query multiple tables that match specific patterns, such as date ranges or naming conventions. This feature simplifies the process of accessing partitioned data without needing to write complex queries for each individual table.
What are the benefits of the new SQL features in BigQuery?
The updated SQL support in BigQuery includes multi-joins, CROSS JOIN capabilities, and improved alias support, making it easier to write complex queries. These enhancements allow for more efficient data analysis and reduce the need for multiple sub-queries.
What significant price reductions were announced for BigQuery?
Storage costs drop by 68%, from 8 cents to 2.6 cents per gigabyte per month, and querying costs drop by 85%, from 3.5 cents to 0.5 cents per gigabyte. Streaming costs are also reduced by 90%.
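
To make the table wildcard answer more concrete, here is a minimal Python sketch. It assumes daily tables named with a hypothetical prefix `events_` followed by a `YYYYMMDD` suffix in a hypothetical dataset `mydataset`; neither name comes from the article. The first helper lists the table names a date range would cover, and the second builds a query string using `TABLE_DATE_RANGE` from BigQuery's legacy SQL dialect. Nothing here is sent to BigQuery.

```python
from datetime import date, timedelta


def daily_table_names(prefix, start, end):
    """List the <prefix>YYYYMMDD table names for each day from start to end, inclusive."""
    days = (end - start).days
    return [prefix + (start + timedelta(days=i)).strftime("%Y%m%d") for i in range(days + 1)]


def date_range_query(dataset, prefix, start, end):
    """Build a legacy SQL query string that covers the same daily tables with one wildcard."""
    return (
        "SELECT COUNT(*) AS row_count "
        f"FROM TABLE_DATE_RANGE([{dataset}.{prefix}], "
        f"TIMESTAMP('{start.isoformat()}'), TIMESTAMP('{end.isoformat()}'))"
    )


# Hypothetical dataset and table prefix, chosen only for illustration.
tables = daily_table_names("events_", date(2014, 3, 25), date(2014, 3, 27))
query = date_range_query("mydataset", "events_", date(2014, 3, 25), date(2014, 3, 27))
print(tables)
print(query)
```

Without the wildcard function, the same query would need each of the listed tables spelled out by hand.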

Key Statistics & Figures

Streaming capacity increase: 1,000x. BigQuery now supports streaming up to 100,000 rows per second per table.
Storage cost reduction: 68%. Storage costs decreased from 8 cents to 2.6 cents per gigabyte per month.
Querying cost reduction: 85%. Querying costs were reduced from 3.5 cents to 0.5 cents per gigabyte.
Streaming cost reduction: 90%. Previously announced streaming costs have been reduced significantly.
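
As a quick sanity check on the figures above, the following Python sketch recomputes the percentages from the quoted prices. The helper name `percent_reduction` is made up for illustration. The storage figure works out to about 67.5%, which is quoted as 68%, and the querying figure to about 85.7%, which is quoted as 85%.

```python
def percent_reduction(old, new):
    """Return the percentage drop from old to new."""
    return (old - new) / old * 100


# Prices in US cents, as quoted in the figures above.
storage = percent_reduction(8.0, 2.6)    # per gigabyte per month
querying = percent_reduction(3.5, 0.5)   # per gigabyte
streaming_factor = 100_000 / 100         # new rows-per-second limit over the old one

print(f"storage: {storage:.1f}%")
print(f"querying: {querying:.1f}%")
print(f"streaming capacity: {streaming_factor:.0f}x")
```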

Key Actionable Insights

1. Leverage the new streaming capacity to enhance real-time analytics capabilities in your applications.
   With the ability to stream 100,000 rows per second, applications requiring instant data processing can significantly improve performance and responsiveness.
2. Utilize table wildcard functions to streamline your data querying process.
   This feature allows for easier management of partitioned data, reducing the complexity of queries and improving overall efficiency.
3. Take advantage of the new SQL features to simplify complex queries.
   By using multi-joins and CROSS JOIN, you can write more straightforward and efficient SQL queries, which can save time and reduce errors in data analysis.
4. Annotate datasets with user-defined metadata to improve data sharing and collaboration.
   Providing descriptions for datasets and tables helps users understand the data better, facilitating easier collaboration and usage across teams.
5. Implement JSON parsing functions to handle flexible data structures effectively.
   This capability allows you to work with JSON data more seamlessly, making it easier to integrate various data sources into BigQuery.
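
The streaming insight above can be sketched in Python as follows. This is a rough illustration, not the article's method: it only builds request bodies in the shape used by BigQuery's `tabledata.insertAll` REST method (a `rows` list whose entries carry an `insertId` and a `json` payload) and sends nothing over the network. The batch size of 500 and the event field names are assumptions chosen for the example.

```python
import uuid


def insert_all_bodies(rows, batch_size=500):
    """Yield request bodies shaped like BigQuery's tabledata.insertAll payload."""
    for start in range(0, len(rows), batch_size):
        batch = rows[start:start + batch_size]
        yield {
            "rows": [
                # insertId lets the service de-duplicate retried rows on a best-effort basis.
                {"insertId": str(uuid.uuid4()), "json": row}
                for row in batch
            ]
        }


# Hypothetical events; the field names are made up for illustration.
events = [{"user_id": i, "action": "click"} for i in range(1200)]
bodies = list(insert_all_bodies(events))
print(len(bodies), [len(b["rows"]) for b in bodies])  # prints: 3 [500, 500, 200]
```

Batching rows this way keeps each request small while many requests together approach the per-table rate limit.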

Common Pitfalls

1. Failing to utilize the new features effectively can lead to inefficient queries and higher costs.
   Many users may not be aware of the new capabilities, such as table wildcard functions and enhanced SQL support, which can simplify their workflows and reduce expenses.

Related Concepts

Real-time Data Processing
SQL Optimization Techniques
Data Partitioning Strategies