Google AI Edge Gallery: Now with audio and on Google Play

Google AI Edge has expanded the Gemma 3n preview to include audio support. Users can try it on their own mobile phones using the Google AI Edge Gallery, which is now available in open beta on the Google Play Store.

Alice Zheng, Na Li
3 min read, beginner

Overview

The article announces that the Google AI Edge Gallery app now supports audio and is available in open beta on Google Play. It highlights the app's new features, including high-quality speech-to-text and speech translation, both powered by the Gemma 3n model.

What You'll Learn

1. How to use the Google AI Edge Gallery app to transcribe audio clips
2. Why integrating audio capabilities enhances on-device AI applications
3. When to utilize the MediaPipe LLM Inference API for audio processing

Key Questions Answered

What new features have been added to the Google AI Edge Gallery?
The Google AI Edge Gallery now includes audio capabilities, allowing users to transcribe audio clips and translate spoken audio into text in another language. This is facilitated by the Gemma 3n model through the MediaPipe LLM Inference API.
How can developers access the Google AI Edge Gallery app?
Developers can download the Google AI Edge Gallery app from the Google Play Store, where it is available in open beta. The app is also open source on GitHub, so developers can explore the complete source code.
What is the significance of the 500,000 APK downloads mentioned in the article?
The 500,000 APK downloads within two months indicate strong community interest in the Google AI Edge Gallery app and point to demand for powerful, private, on-device generative AI.

Key Statistics & Figures

APK downloads
500,000
Reached within two months of the app's launch.

Technologies & Tools

AI Model
Gemma 3n
Used for audio transcription and translation functionalities.
API
MediaPipe LLM Inference API
Facilitates audio processing for both Android and Web platforms.

Key Actionable Insights

1. Developers should experiment with the new Audio Scribe feature in the Google AI Edge Gallery to understand its capabilities.
This hands-on experience will help developers judge how audio transcription and translation could fit into their own applications.
2. The MediaPipe LLM Inference API lets mobile applications run audio tasks such as transcription and translation on-device with Gemma 3n.
By integrating this API, developers can add audio transcription and translation features to their apps on Android and Web.
3. Engaging with the open-source community on GitHub can provide valuable insights and collaborative opportunities.
Exploring or contributing to the Google AI Edge Gallery's GitHub repository gives developers access to the complete source code and a working reference for on-device AI.