[Submitted on 7 May 2020]
Abstract: The MineRL 2019 competition challenged participants to train sample-efficient
agents to play Minecraft, using a dataset of human gameplay and a limited
number of environment interaction steps. We approached this task with behavioural
cloning, predicting which actions human players would take, and reached fifth
place in the final ranking. Despite being a simple algorithm, we observed that the
performance of such an approach can vary significantly depending on when
training is stopped. In this paper, we detail our submission to the
competition, run further experiments to study how performance varied over
training, and examine how different engineering decisions affected these results.
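At its core, behavioural cloning treats imitation as supervised learning over recorded (observation, action) pairs. The following toy sketch illustrates that idea in the simplest possible form: a lookup policy that, for each observation, imitates the most common human action. All names and data here are illustrative assumptions, not from the submission itself, which trained a neural network on Minecraft gameplay.

```python
from collections import Counter, defaultdict

def train_bc_policy(demonstrations):
    """Fit a lookup policy: for each observation, the most frequent human action.

    A hypothetical, minimal stand-in for behavioural cloning; real systems
    replace the lookup table with a trained function approximator.
    """
    actions_per_obs = defaultdict(Counter)
    for obs, action in demonstrations:
        actions_per_obs[obs][action] += 1
    return {obs: counts.most_common(1)[0][0]
            for obs, counts in actions_per_obs.items()}

# Toy demonstrations: in front of a tree, humans usually attack (chop).
demos = [("tree_ahead", "attack"), ("tree_ahead", "attack"),
         ("tree_ahead", "forward"), ("open_field", "forward")]
policy = train_bc_policy(demos)
print(policy["tree_ahead"])  # -> attack
```

In the actual competition setting the observation space is high-dimensional (pixels), so the lookup table is replaced by a classifier trained to minimise the prediction error on human actions; the abstract's point about stopping time means such a classifier's downstream game score can fluctuate across training checkpoints even as its loss decreases.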
Submission history
From: Anssi Kanervisto
[v1]
Thu, 7 May 2020 10:48:51 UTC (465 KB)