NVIDIA Unveils New Reinforcement Learning Research at ICRA 2019

This week, NVIDIA researchers from the newly opened robotics research lab in Seattle, Washington, are presenting a new proof-of-concept reinforcement learning…

Nefi Alarcon

Overview

NVIDIA researchers presented a new reinforcement learning approach at ICRA 2019, aimed at improving how robots trained in simulation perform on real-world tasks. The research focuses on closing the 'reality gap', the mismatch between simulated and real environments, by adjusting simulation parameters to match real-world data so that policies learned in simulation transfer to the physical robot.

What You'll Learn

1. How to enhance robot training using simulation techniques
2. Why closing the reality gap is crucial for robotic applications
3. How to utilize NVIDIA Tesla V100 GPUs for reinforcement learning tasks
4. When to apply the SimOpt algorithm for better policy transfer

Prerequisites & Requirements

  • Understanding of reinforcement learning concepts
  • Familiarity with TensorFlow and NVIDIA FleX (optional)

Key Questions Answered

What is the purpose of NVIDIA's new reinforcement learning approach?
The new reinforcement learning approach aims to enhance how robots trained in simulation perform in the real world by optimizing simulation parameters based on real-world data, thus closing the reality gap.
How does the SimOpt algorithm improve policy transfer?
The SimOpt algorithm adapts simulation parameter distributions using real-world roll-outs interleaved with policy training, allowing for better policy transfer without needing exact replication of real-world environments.
What tasks were the robots trained to perform in the study?
The robots were trained to perform two tasks: placing a peg in a hole and opening a drawer. Training for each task used over 9600 simulations.
What challenges exist in reinforcement learning for robotics?
One major challenge is the discrepancy between simulated environments and real-world scenarios, known as the reality gap, which affects the applicability of learned policies.
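
The interleaving described above (train a policy in simulation, roll it out on the real system, then adapt the simulation parameter distribution) can be sketched in a few lines. This is a minimal toy sketch, not the researchers' implementation: the one-dimensional "simulator", the single `friction` parameter, the closed-form `train_policy`, and every function name are hypothetical stand-ins, and the distribution update uses a simple cross-entropy-style refit chosen for brevity rather than whatever update rule the original work uses.

```python
import random
import statistics


def simulate(friction, push, steps=5):
    """Toy stand-in for a physics engine: a block pushed against linear drag.

    Returns the block position after each step.
    """
    pos, vel, trajectory = 0.0, 0.0, []
    for _ in range(steps):
        vel += push - friction * vel
        pos += vel
        trajectory.append(pos)
    return trajectory


def train_policy(frictions, target):
    """'Train' a one-parameter policy (a constant push) on sampled simulations.

    Position is linear in push in this toy model, so the push that reaches
    `target` on average has a closed form; a real setup would run RL here.
    """
    mean_reach = statistics.mean([simulate(f, 1.0)[-1] for f in frictions])
    return target / mean_reach


def discrepancy(traj_a, traj_b):
    """Sum of squared differences between two equal-length trajectories."""
    return sum((a - b) ** 2 for a, b in zip(traj_a, traj_b))


def sample_frictions(mean, std, n, rng):
    """Draw parameters from a Gaussian, clipped to keep the toy model stable."""
    return [min(0.99, max(0.01, rng.gauss(mean, std))) for _ in range(n)]


def adapt_simulation(real_friction=0.3, mean=0.7, std=0.2, target=10.0,
                     iterations=6, n_samples=64, n_elite=8, seed=0):
    """Interleave policy training with simulation-parameter adaptation."""
    rng = random.Random(seed)
    history = []  # (distribution mean, real-world task error) per iteration
    for _ in range(iterations):
        # 1. Train the policy on simulations drawn from the current distribution.
        push = train_policy(sample_frictions(mean, std, n_samples, rng), target)
        # 2. Roll the policy out on the "real" system (assumption: the same toy
        #    model with a friction value hidden from the learner).
        real_traj = simulate(real_friction, push)
        history.append((mean, abs(real_traj[-1] - target)))
        # 3. Refit the distribution to the sampled parameters whose simulated
        #    trajectories best match the real roll-out.
        candidates = sample_frictions(mean, std, n_samples, rng)
        candidates.sort(key=lambda f: discrepancy(simulate(f, push), real_traj))
        elite = candidates[:n_elite]
        mean = statistics.mean(elite)
        std = max(statistics.stdev(elite), 0.02)  # floor avoids total collapse
    return mean, std, history


if __name__ == "__main__":
    final_mean, final_std, history = adapt_simulation()
    for step, (m, err) in enumerate(history):
        print(f"iteration {step}: friction mean={m:.3f}, task error={err:.3f}")
```

In this toy setting the learner never sees the real friction value directly; it only compares trajectories, which mirrors the point that exact replication of the real environment is not required.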

Key Statistics & Figures

Number of simulations conducted
Over 9600
Each of the two tasks, placing a peg in a hole and opening a drawer, involved over 9600 simulations.
Training duration per simulation
1.5-2 hours
Each simulation lasted approximately 1.5 to 2 hours, allowing the robot to learn effectively.

Technologies & Tools

Hardware
NVIDIA Tesla V100 GPUs
Used to accelerate the training of robots in reinforcement learning tasks.
Software
TensorFlow
Utilized as the deep learning framework for training the robots.
Software
NVIDIA FleX
Employed as the physics engine for simulation in the research.

Key Actionable Insights

1. Implement simulation training for robotic applications to minimize risks and maximize learning opportunities.
   Simulation allows for extensive training without the risks associated with real-world testing, making it an essential strategy in robotics development.
2. Utilize the SimOpt algorithm to enhance the transfer of learned policies from simulation to real-world tasks.
   By adapting simulation parameters based on real-world data, developers can significantly improve the performance of robots in practical applications.
3. Leverage NVIDIA Tesla V100 GPUs to accelerate training processes in reinforcement learning.
   These GPUs provide the computational power necessary for handling complex simulations and deep learning tasks efficiently.

Common Pitfalls

1. Failing to account for the reality gap can lead to ineffective policy transfer from simulation to real-world applications.
   This occurs when the simulated environment does not accurately reflect real-world conditions, making it crucial to adapt simulations based on real-world data.
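
One way to make this pitfall visible is to put a number on the reality gap by comparing an observed real-world trajectory with the simulated one produced by the same actions. The sketch below is an illustration under stated assumptions, not a metric taken from the article: the function name `trajectory_discrepancy`, the weighted mix of absolute and squared differences, and the default weights are all hypothetical choices.

```python
def trajectory_discrepancy(sim_traj, real_traj, w_l1=1.0, w_l2=1.0):
    """Weighted sum of absolute and squared per-step differences.

    Both trajectories are sequences of scalar observations recorded at the
    same time steps. The weights are illustrative defaults (assumption).
    """
    if len(sim_traj) != len(real_traj):
        raise ValueError("trajectories must have the same length")
    diffs = [s - r for s, r in zip(sim_traj, real_traj)]
    l1_term = sum(abs(d) for d in diffs)
    l2_term = sum(d * d for d in diffs)
    return w_l1 * l1_term + w_l2 * l2_term


if __name__ == "__main__":
    simulated = [1.0, 2.0, 3.0]
    observed = [1.0, 2.5, 2.0]
    print(trajectory_discrepancy(simulated, observed))
```

A value near zero suggests the simulator reproduces the observed behaviour for those actions; a large value is a cue to adjust simulation parameters before trusting the policy on hardware.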

Related Concepts

Reinforcement Learning
Simulation Training
Robotics
Policy Transfer Techniques