Authors: Cheng Qian, Julen Urain, Kevin Zakka, Jan Peters
Abstract: In this work, we introduce PianoMime, a framework for training a
piano-playing agent using internet demonstrations. The internet is a promising
source of large-scale demonstrations for training robot agents. For piano
playing in particular, YouTube is full of videos of professional pianists
playing a wide variety of songs. In our work, we leverage these demonstrations
to learn a generalist piano-playing agent capable of playing arbitrary songs.
Our framework is divided into three parts: a data
preparation phase to extract the informative features from the YouTube videos,
a policy learning phase to train song-specific expert policies from the
demonstrations, and a policy distillation phase to distil the policies into a
single generalist agent. We explore different policy designs to represent the
agent and evaluate how the amount of training data affects its ability to
generalize to novel songs not present in the dataset. We show that we can learn
a policy that reaches an F1 score of up to 56% on unseen songs.
Source: http://arxiv.org/abs/2407.18178v1