Identifying the Best Arm in the Presence of Global Environment Shifts

Authors: Phurinut Srisawad, Juergen Branke, Long Tran-Thanh

Abstract: This paper formulates a new Best-Arm Identification problem in the
non-stationary stochastic bandits setting, where the means of all arms are
shifted in the same way due to a global influence of the environment. The aim
is to identify the unique best arm across environmental changes given a fixed
total budget. While this setting can be regarded as a special case of
Adversarial Bandits or Corrupted Bandits, we demonstrate that existing
solutions tailored to those settings do not fully utilise the nature of this
global influence, and thus, do not work well in practice (despite their
theoretical guarantees). To overcome this issue, in this paper we develop a
novel selection policy that is consistent and robust in dealing with global
environmental shifts. We then propose an allocation policy, LinLUCB, which
exploits information about global shifts across all arms in each environment.
Empirical tests show that our policies significantly outperform existing
methods.

Source: http://arxiv.org/abs/2408.12581v1
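
To make the setting described in the abstract concrete, the following is a minimal, noise-free sketch. It is not the paper's selection policy or LinLUCB. It assumes the global shift is additive (the abstract only says the means are "shifted in the same way"), and the arm means, shifts, pull counts, and function names are all hypothetical values chosen for illustration. It shows why pooling rewards across environments can mislead when arms are sampled unevenly, and how comparing arms within each environment cancels a common shift.

```python
# Hypothetical illustration only: additive global shift, noise-free rewards.
# None of these names or numbers come from the paper.
BASE_MEANS = [0.5, 0.6]   # assumed arm means; arm 1 is truly best
ENV_SHIFTS = [0.0, 1.0]   # assumed global shift applied to every arm, per environment

# PULLS[env][arm]: how often each arm was pulled in each environment.
# Arm 0 is pulled mostly in the high-shift environment, arm 1 mostly in the low one.
PULLS = [[1, 9],
         [9, 1]]


def reward(arm, env):
    """Noise-free reward: the arm's base mean plus the environment's global shift."""
    return BASE_MEANS[arm] + ENV_SHIFTS[env]


def pooled_means(pulls):
    """Average each arm's rewards over all environments, ignoring the shifts."""
    n_arms = len(BASE_MEANS)
    totals = [0.0] * n_arms
    counts = [0] * n_arms
    for env, row in enumerate(pulls):
        for arm, n in enumerate(row):
            totals[arm] += n * reward(arm, env)
            counts[arm] += n
    return [t / c for t, c in zip(totals, counts)]


def centered_means(pulls):
    """Subtract each environment's cross-arm average before averaging over environments.

    Assumes every arm is pulled at least once in every environment.
    """
    n_arms = len(BASE_MEANS)
    scores = [0.0] * n_arms
    for env in range(len(pulls)):
        arm_means = [reward(arm, env) for arm in range(n_arms)]
        env_avg = sum(arm_means) / n_arms
        for arm in range(n_arms):
            scores[arm] += (arm_means[arm] - env_avg) / len(pulls)
    return scores


def argmax(values):
    return max(range(len(values)), key=lambda i: values[i])


pooled_best = argmax(pooled_means(PULLS))
centered_best = argmax(centered_means(PULLS))
print("pooled estimate picks arm", pooled_best)
print("shift-centered estimate picks arm", centered_best)
```

In this toy example the pooled estimate favours arm 0 only because it was sampled mostly under the larger shift, whereas the within-environment comparison recovers arm 1. With noisy rewards and a fixed budget, how to allocate pulls across arms in each environment becomes the central question, which is what the paper's policies address.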
