Updated production-ready Gemini models, reduced 1.5 Pro pricing, increased rate limits, and more

Logan Kilpatrick, Shrestha Basu Mallick

Learn about the latest updates to Google's Gemini models, including reduced pricing for Gemini 1.5 Pro, increased rate limits, faster performance, enhanced quality, and more.

Google

Overview

The article discusses the release of updated production-ready Gemini models, specifically Gemini-1.5-Pro-002 and Gemini-1.5-Flash-002, highlighting significant improvements in pricing, performance, and usability for developers. Key enhancements include reduced pricing, increased rate limits, faster output, and updated filter settings aimed at improving overall model quality and helpfulness.

What You'll Learn

1. How to access and use the updated Gemini models for various applications

2. Why the pricing changes for Gemini 1.5 Pro benefit developers

3. How to implement context caching to reduce costs when using Gemini models

Key Questions Answered

What improvements have been made in the Gemini 1.5 models?

The Gemini 1.5 models gain a price reduction of more than 50% on 1.5 Pro, 2x higher rate limits on 1.5 Flash, and approximately 3x higher rate limits on 1.5 Pro. They also offer 2x faster output and 3x lower latency, along with improved overall quality in math, long context, and vision tasks.

How do the new Gemini models enhance developer experience?

The updated models respond more concisely and more helpfully, with output that is 5-20% shorter than before. Developers can also apply safety filters suited to their specific use cases, which allows more customization.

What are the new rate limits for Gemini 1.5 models?

The paid tier rate limits for Gemini 1.5 Flash have been increased to 2,000 RPM, while Gemini 1.5 Pro has been raised to 1,000 RPM, up from the previous limits of 1,000 and 360 RPM respectively.
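One way to stay within a requests-per-minute quota on the client side is a sliding-window throttle. The sketch below is illustrative only: `RpmLimiter` is a hypothetical helper, not part of any Gemini SDK, and the 2,000 RPM value is the paid-tier 1.5 Flash limit quoted above. The clock and sleep functions are injectable so the logic can be exercised without actually waiting.

```python
import time
from collections import deque


class RpmLimiter:
    """Hypothetical client-side throttle: at most `rpm` calls per 60-second window."""

    WINDOW_SECONDS = 60.0

    def __init__(self, rpm, clock=time.monotonic, sleep=time.sleep):
        if rpm < 1:
            raise ValueError("rpm must be at least 1")
        self.rpm = rpm
        self._clock = clock
        self._sleep = sleep
        self._sent = deque()  # timestamps of calls inside the current window

    def acquire(self):
        """Block until one more request fits in the window, then record it."""
        while True:
            now = self._clock()
            # Drop timestamps that have aged out of the window.
            while self._sent and now - self._sent[0] >= self.WINDOW_SECONDS:
                self._sent.popleft()
            if len(self._sent) < self.rpm:
                self._sent.append(now)
                return
            # Window is full: wait until the oldest call expires, then re-check.
            self._sleep(self.WINDOW_SECONDS - (now - self._sent[0]))


flash_limiter = RpmLimiter(rpm=2000)  # paid-tier 1.5 Flash limit from the article
flash_limiter.acquire()               # call once before each API request
```

A server-side quota can still reject requests (for example when several clients share a key), so a throttle like this would normally be paired with retry handling.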

What are the pricing changes for Gemini 1.5 Pro?

Effective October 1st, 2024, Gemini 1.5 Pro pricing drops by 64% on input tokens, 52% on output tokens, and 64% on incremental cached tokens, for prompts shorter than 128K tokens.
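To see what these percentages mean for a given price, the arithmetic can be sketched as follows. The old prices used here are placeholders chosen for illustration, not Google's published rates; only the 64% and 52% figures come from the article.

```python
def reduced_price(old_price, reduction_pct):
    """Return the price after cutting `old_price` by `reduction_pct` percent."""
    if not 0 <= reduction_pct <= 100:
        raise ValueError("reduction_pct must be between 0 and 100")
    return old_price * (100 - reduction_pct) / 100


# Reductions quoted in the article for Gemini 1.5 Pro (prompts under 128K tokens).
REDUCTIONS_PCT = {"input": 64, "output": 52, "cached": 64}

# Placeholder prices per 1M tokens, for illustration only.
old_prices = {"input": 10.0, "output": 20.0, "cached": 5.0}

new_prices = {
    kind: reduced_price(old_prices[kind], pct)
    for kind, pct in REDUCTIONS_PCT.items()
}
print(new_prices)
```

Substituting the actual previous rates from the pricing page gives the new per-token prices directly.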

Key Statistics & Figures

Price reduction on input tokens for Gemini 1.5 Pro: 64%. Effective October 1st, 2024, for prompts less than 128K tokens.

Price reduction on output tokens for Gemini 1.5 Pro: 52%. Effective October 1st, 2024, for prompts less than 128K tokens.

Rate limit for Gemini 1.5 Flash: 2,000 RPM. Increased from the previous limit of 1,000 RPM.

Rate limit for Gemini 1.5 Pro: 1,000 RPM. Increased from the previous limit of 360 RPM.

Improvement in MMLU-Pro benchmark: ~7%. Indicates overall quality enhancement of the models.

Improvement in math benchmarks: ~20%. Significant improvement in performance on MATH and HiddenMath benchmarks.

Technologies & Tools

AI/ML: Gemini. Used for various text, code, and multimodal tasks.

Platform: Google AI Studio. Platform for accessing the latest Gemini models.

API: Gemini API. API for integrating Gemini models into applications.

Cloud Service: Vertex AI. Available for larger organizations and Google Cloud customers.

Key Actionable Insights

1. Take advantage of the reduced pricing for Gemini 1.5 Pro to optimize your AI/ML projects. With lower token costs, developers can experiment more freely and scale applications at lower cost.

2. Use the increased rate limits to improve the throughput of applications built on the Gemini models. Higher rate limits allow more requests per minute, so developers can handle larger workloads and keep applications responsive.

3. Implement context caching to further reduce costs when using the Gemini models. By caching context that is shared across requests, developers avoid paying the full input price for the same tokens on every call.
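Whether caching pays off depends on how often the shared context is reused. The sketch below compares input cost with and without caching under a deliberately simplified model: the shared context is billed once at the normal input rate when it is cached, and at a lower cached rate on each later request. Storage fees are ignored, and every price and token count is a hypothetical placeholder rather than a published rate, so check the current pricing page before relying on any numbers.

```python
def input_cost_without_cache(requests, shared_tokens, unique_tokens, input_price):
    """Every request pays the full input price (per 1M tokens) for all tokens."""
    return requests * (shared_tokens + unique_tokens) * input_price / 1_000_000


def input_cost_with_cache(requests, shared_tokens, unique_tokens,
                          input_price, cached_price):
    """Shared tokens are paid in full once, then at the cached rate per request."""
    create = shared_tokens * input_price / 1_000_000
    per_request = (shared_tokens * cached_price
                   + unique_tokens * input_price) / 1_000_000
    return create + requests * per_request


# Hypothetical workload: 100 requests that share one 50K-token document.
workload = dict(requests=100, shared_tokens=50_000, unique_tokens=500,
                input_price=1.0)
baseline = input_cost_without_cache(**workload)
cached = input_cost_with_cache(**workload, cached_price=0.25)
print(f"without cache: {baseline:.2f}, with cache: {cached:.2f}")
```

Under this model the saving grows with the number of requests that reuse the same context; with only one or two requests, the one-time cost of creating the cache can outweigh the benefit.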

Common Pitfalls

1. Failing to utilize the updated filter settings may lead to less tailored responses from the models. Developers should apply the appropriate safety filters based on their specific use cases to ensure the models provide the most relevant and safe outputs.
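As an illustration of choosing filters per use case, the sketch below builds a request body with explicit safety settings. It assumes the Gemini API's REST `generateContent` body shape (`contents`, `safetySettings`) and the category and threshold names listed in the public API reference; treat both as assumptions and verify them against the current documentation. `build_request` is a hypothetical helper, and nothing is sent over the network.

```python
import json

# Category names assumed from the Gemini API reference; verify before use.
HARM_CATEGORIES = [
    "HARM_CATEGORY_HARASSMENT",
    "HARM_CATEGORY_HATE_SPEECH",
    "HARM_CATEGORY_SEXUALLY_EXPLICIT",
    "HARM_CATEGORY_DANGEROUS_CONTENT",
]


def build_request(prompt, threshold="BLOCK_MEDIUM_AND_ABOVE"):
    """Hypothetical helper: build a request body using one threshold for all categories."""
    return {
        "contents": [{"parts": [{"text": prompt}]}],
        "safetySettings": [
            {"category": category, "threshold": threshold}
            for category in HARM_CATEGORIES
        ],
    }


# A stricter setting for a hypothetical consumer-facing app.
body = build_request("Summarize this support ticket.",
                     threshold="BLOCK_LOW_AND_ABOVE")
print(json.dumps(body, indent=2))
```

Setting the filters explicitly in each request keeps the behavior the same regardless of what a given model version applies by default.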
