In this work, we study how to build a robotic system that can solve multiple 3D manipulation tasks given task instructions. To be useful in domains such as manufacturing and the home, such a system must be capable of solving tasks that require high precision and of learning new tasks from few demonstrations.
Prior works such as RVT and PerAct have studied this problem; however, they often struggle with tasks requiring high precision. We build upon these works to make them more performant, precise, and fast. We propose RVT-2, which is 6X faster in training and 2X faster in inference than its predecessor, RVT. RVT-2 achieves a new state of the art on the multi-task RLBench benchmark, improving the success rate from 65% to 82%. RVT-2 is also effective in the real world, where it can learn tasks requiring high precision, such as inserting a peg, from just 10 demonstrations.
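As a rough illustration of what the reported RLBench numbers imply, the short sketch below converts the 65% and 82% success rates into an absolute gain, a relative gain, and a reduction in failure rate. The function name `success_rate_gain` and its output keys are hypothetical and chosen only for this example; the only inputs taken from the text are the two percentages.

```python
def success_rate_gain(old_pct: float, new_pct: float) -> dict:
    """Summarize a change in success rate, with both rates given in percent (0-100)."""
    absolute_points = new_pct - old_pct              # gain in percentage points
    relative_pct = absolute_points / old_pct * 100   # gain relative to the old success rate
    old_fail = 100 - old_pct                         # old failure rate in percent
    new_fail = 100 - new_pct                         # new failure rate in percent
    failure_reduction_pct = (old_fail - new_fail) / old_fail * 100
    return {
        "absolute_points": absolute_points,
        "relative_pct": relative_pct,
        "failure_reduction_pct": failure_reduction_pct,
    }


# Success rates quoted in the text: 65% (previous best) and 82% (RVT-2).
gain = success_rate_gain(65, 82)
for name, value in gain.items():
    print(f"{name}: {value:.1f}")
```

Under this reading, the improvement is 17 percentage points, which corresponds to roughly a 26% relative gain in success rate and to cutting the failure rate approximately in half.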
A single RVT-2 model can perform the following tasks in the real world.
Success: Pick and insert plug
Success: Pick and insert plug
Success: Pick and insert plug
Failure: Pick and insert plug
Success: Pick and insert 8mm peg
Success: Pick and insert 8mm peg
Success: Pick and insert 8mm peg
Failure: Pick and insert 8mm peg
Success: Pick and insert 16mm peg
Success: Pick and insert 16mm peg
Success: Pick and insert 16mm peg
Failure: Pick and insert 16mm peg
Success: Put green marker in drawer
Success: Put green marker in drawer
Success: Put green marker in drawer
Failure: Put blue marker in drawer
Success: Put green block in bottom shelf
Success: Put green block in bottom shelf
Success: Put green block in top shelf
Success: Put green block in top shelf
Success: Put green marker in bowl
Success: Put blue marker in bowl
Success: Put green marker in mug
Failure: Put blue marker in bowl
Success: Press sanitizer
Success: Press sanitizer
Success: Press sanitizer
Failure: Press sanitizer
Success: Put blue block on red block
Success: Put green block on blue block
Success: Put red block on green block
Failure: Put red block on green block
RVT-2 tested on various unseen scenarios of the stack blocks task.
RVT-2 demonstrates the ability to recover from failures.
@article{goyal2024rvt,
author = {Goyal, Ankit and Blukis, Valts and Xu, Jie and Guo, Yijie and Chao, Yu-Wei and Fox, Dieter},
title = {RVT-2: Learning Precise Manipulation from Few Demonstrations},
journal = {RSS},
year = {2024},
}