Get started with Gemma on Ray on Vertex AI

Ju-yeong Ji, Ivan Nardini

Use supervised tuning of Gemma with Ray on Vertex AI to train and serve machine learning models efficiently.

Google

Overview

This article is a guide to using Gemma with Ray on Vertex AI. It walks through setting up a Ray cluster, fine-tuning the model, and deploying it, and it covers prerequisites, costs, and practical coding examples.

What You'll Learn

1. How to create a Ray cluster on Vertex AI
2. How to tune Gemma with Ray Train on Vertex AI
3. How to validate Gemma training on Vertex AI
4. How to serve Gemma with Ray Data for offline predictions
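
Serving with Ray Data for offline predictions can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the tutorial's code: `PromptPredictor`, its echo-style `generate` stand-in, and `run_offline_predictions` are hypothetical names. A real predictor would load the tuned Gemma checkpoint in `__init__` and would usually request GPUs.

```python
from typing import Any, Dict, List


class PromptPredictor:
    """Callable class in the shape Ray Data's map_batches accepts."""

    def __init__(self) -> None:
        # Assumption: a real predictor loads the tuned Gemma checkpoint here,
        # once per worker. This stand-in echoes the prompt so the sketch can
        # run without a model or a GPU.
        self.generate = lambda prompt: f"[generated for] {prompt}"

    def __call__(self, batch: Dict[str, Any]) -> Dict[str, List[str]]:
        # Ray Data passes each batch as a mapping from column name to values.
        prompts = [str(p) for p in batch["prompt"]]
        outputs = [self.generate(p) for p in prompts]
        return {"prompt": prompts, "output": outputs}


def run_offline_predictions(prompts: List[str], concurrency: int = 2):
    """Map the predictor over a small in-memory dataset of prompts."""
    # Imported lazily so the predictor above can be tested without Ray.
    import ray

    ds = ray.data.from_items([{"prompt": p} for p in prompts])
    predictions = ds.map_batches(
        PromptPredictor,
        concurrency=concurrency,  # number of predictor workers (illustrative)
        batch_size=8,             # illustrative value
    )
    return predictions.take_all()
```

Because the predictor is a plain callable, it can be checked locally with a hand-built batch such as `PromptPredictor()({"prompt": ["Hello"]})` before it is run on a cluster.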

Prerequisites & Requirements

Basic understanding of machine learning concepts
Familiarity with Google Cloud services and CLI

Key Questions Answered

What is Gemma and what can it do?

Gemma is a family of open models built from the same research and technology used to create the Gemini models. Gemma models support tasks such as text generation and code completion, can be fine-tuned for specific tasks, and can run on a variety of devices.

How do you fine-tune Gemma with Ray on Vertex AI?

To fine-tune Gemma with Ray on Vertex AI, you can use Ray Train to distribute Hugging Face Transformers training on PyTorch. You define a training function, configure the scaling, and submit the fine-tuning job by calling the TorchTrainer's fit method.
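
A minimal sketch of that flow is shown here, assuming Ray 2.x is installed on the cluster. The helper names (`make_scaling_kwargs`, `train_func`, `submit_tuning_job`), the worker count, and the storage path are all hypothetical, and the body of the training function is left as a placeholder rather than a reconstruction of the tutorial's code.

```python
from typing import Any, Dict


def make_scaling_kwargs(num_workers: int, use_gpu: bool = True) -> Dict[str, Any]:
    """Build keyword arguments for ray.train.ScalingConfig (values illustrative)."""
    if num_workers < 1:
        raise ValueError("num_workers must be at least 1")
    return {"num_workers": num_workers, "use_gpu": use_gpu}


def train_func(config: Dict[str, Any]) -> None:
    """Runs on every Ray Train worker."""
    # Placeholder: a real job would build the Gemma model, tokenizer, and a
    # Hugging Face Transformers training loop here from values in `config`.
    return None


def submit_tuning_job(train_loop_config: Dict[str, Any], num_workers: int, storage_path: str):
    """Define the trainer, configure scaling, and submit the job with fit()."""
    # Imported lazily so the helpers above can be used without Ray installed.
    from ray.train import RunConfig, ScalingConfig
    from ray.train.torch import TorchTrainer

    trainer = TorchTrainer(
        train_loop_per_worker=train_func,
        train_loop_config=train_loop_config,
        scaling_config=ScalingConfig(**make_scaling_kwargs(num_workers)),
        run_config=RunConfig(storage_path=storage_path),  # e.g. a Cloud Storage path
    )
    return trainer.fit()
```

Keeping the scaling values in a small helper makes it easy to validate them before a cluster is involved; `submit_tuning_job` is the only part that needs a running Ray cluster.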

What are the costs associated with using Vertex AI?

The tutorial uses billable components of Google Cloud, including Vertex AI, Cloud Build, Artifact Registry, and Cloud Storage. Refer to the Pricing Calculator for a cost estimate based on your projected usage.

How can you monitor training jobs in Vertex AI?

You can monitor tuning jobs by creating a TensorBoard instance from the Vertex AI Experiments section. This allows you to track and visualize metrics during the training process.

Technologies & Tools

Gemma (AI/ML): Used for text generation and model fine-tuning.
Ray (AI/ML): Provides infrastructure for distributed computing and parallel processing.
Vertex AI (Cloud Platform): Used for training and deploying machine learning models.
Docker (Containerization): Used for creating custom images for Ray clusters.

Key Actionable Insights

1. Utilizing Ray on Vertex AI can significantly enhance your machine learning workflows by enabling distributed computing and parallel processing.
This is particularly beneficial for large-scale ML tasks where performance and efficiency are critical. Implementing these technologies can lead to faster model training and deployment.

2. Creating a Docker image for your Ray cluster allows for customization and optimization tailored to your specific ML tasks.
By leveraging pre-built Ray base images or creating your own, you can ensure that your environment is optimized for the libraries and dependencies your application requires.

3. Regularly monitoring your training jobs with TensorBoard can help identify issues early and improve model performance.
Using TensorBoard allows for real-time visualization of metrics, which is crucial for understanding how changes in hyperparameters affect model training.
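
Creating a Ray cluster on Vertex AI with an optional custom Docker image might look roughly like the sketch below. It assumes the `google-cloud-aiplatform[ray]` package and its `vertex_ray` module. The helper names, machine types, accelerator type, and node counts are hypothetical choices for illustration, not values taken from the tutorial.

```python
from typing import Any, Dict, Optional


def node_spec(
    machine_type: str,
    node_count: int,
    accelerator_type: Optional[str] = None,
    accelerator_count: int = 0,
) -> Dict[str, Any]:
    """Collect the fields for one vertex_ray.Resources node pool."""
    spec: Dict[str, Any] = {"machine_type": machine_type, "node_count": node_count}
    if accelerator_type is not None:
        spec["accelerator_type"] = accelerator_type
        spec["accelerator_count"] = accelerator_count
    return spec


def create_cluster(
    project: str,
    location: str,
    cluster_name: str,
    custom_image: Optional[str] = None,
) -> str:
    """Create a Ray cluster on Vertex AI and return its resource name."""
    # Imported lazily so node_spec can be used without the SDK installed.
    from google.cloud import aiplatform
    from google.cloud.aiplatform import vertex_ray
    from google.cloud.aiplatform.vertex_ray import Resources

    aiplatform.init(project=project, location=location)

    # Illustrative shapes: one CPU head node and one GPU worker node.
    head = Resources(**node_spec("n1-standard-16", 1), custom_image=custom_image)
    worker = Resources(
        **node_spec("n1-standard-16", 1, "NVIDIA_TESLA_T4", 1),
        custom_image=custom_image,
    )
    return vertex_ray.create_ray_cluster(
        head_node_type=head,
        worker_node_types=[worker],
        cluster_name=cluster_name,
    )
```

Passing `custom_image=None` leaves the default image in place; supplying an Artifact Registry image URI is where a custom Docker image would plug in.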

Common Pitfalls

1. Failing to enable necessary APIs can halt your progress when setting up Google Cloud resources.
Always ensure that all required APIs are enabled for your project before proceeding with resource creation. This can save time and prevent frustration during setup.

2. Not monitoring training jobs effectively can lead to wasted resources and suboptimal model performance.
Utilizing tools like TensorBoard for monitoring can help catch issues early and allow for adjustments to be made in real-time, enhancing the overall training process.
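
One way to guard against missing APIs is to enable the services up front with the gcloud CLI. The sketch below builds that command in Python. The service list is an assumption based on the billable components the tutorial names (Vertex AI, Cloud Build, Artifact Registry, Cloud Storage), so check it against the tutorial's own setup steps. The function names are hypothetical.

```python
import subprocess
from typing import List, Sequence

# Assumed service names for the components the tutorial lists as billable.
REQUIRED_SERVICES = (
    "aiplatform.googleapis.com",
    "cloudbuild.googleapis.com",
    "artifactregistry.googleapis.com",
    "storage.googleapis.com",
)


def build_enable_command(
    project_id: str, services: Sequence[str] = REQUIRED_SERVICES
) -> List[str]:
    """Return the `gcloud services enable` command as an argument list."""
    if not project_id:
        raise ValueError("project_id must not be empty")
    return ["gcloud", "services", "enable", *services, "--project", project_id]


def enable_services(project_id: str) -> None:
    """Run the command; requires an installed and authenticated gcloud CLI."""
    subprocess.run(build_enable_command(project_id), check=True)
```

Building the argument list separately from running it makes the command easy to inspect or log before anything is changed in the project.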

Related Concepts

Ray On Vertex AI

Gemma Model Fine-tuning

Distributed Computing In Machine Learning
