One Arrow, Two Kills: An Unified Framework for Achieving Optimal Regret Guarantees in Sleeping Bandits
Conference paper, Year: 2022

Abstract

We address the problem of internal regret in sleeping bandits in the fully adversarial setup, draw connections between the existing notions of sleeping regret in the multi-armed bandit (MAB) literature, and analyze their implications. Our first contribution is a new notion of internal regret for sleeping MAB. We then propose an algorithm that achieves sublinear regret in that measure, even for a completely adversarial sequence of losses and availabilities. We further show that low sleeping internal regret always implies low external regret, as well as low policy regret for i.i.d. sequences of losses. The main contribution of this work lies precisely in unifying the existing notions of regret in sleeping bandits and understanding the implications of one for another. Finally, we extend our results to dueling bandits (DB), a preference-feedback variant of MAB, and propose a reduction to MAB to design a low-regret algorithm for sleeping dueling bandits with stochastic preferences and adversarial availabilities. The efficacy of our algorithms is supported by empirical evaluations.
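As a rough illustration of why low internal regret can imply low external regret, the sketch below computes both quantities from a logged run. The definitions used here are assumptions made for illustration, not taken from this record: the internal regret of a swap i -> j sums loss(i) - loss(j) over rounds where arm i was played and arm j was available, and the external regret against arm j sums loss(played) - loss(j) over rounds where j was available. The function name `sleeping_regrets`, the uniform-random player, and the random losses and availabilities are all hypothetical; the paper's exact definitions may differ.

```python
import random


def sleeping_regrets(losses, avail, plays):
    """Assumed (illustrative) regret definitions for a logged sleeping-bandit run.

    losses[t][k]: loss of arm k at round t
    avail[t]:     set of arms available (awake) at round t
    plays[t]:     arm pulled at round t, assumed to be in avail[t]
    """
    K = len(losses[0])
    # internal[i][j]: total gain from playing j instead of i, counted only
    # on rounds where i was played and j was available.
    internal = [[0.0] * K for _ in range(K)]
    # external[j]: regret against arm j on the rounds where j was available.
    external = [0.0] * K
    for loss, awake, played in zip(losses, avail, plays):
        for j in awake:
            gap = loss[played] - loss[j]
            internal[played][j] += gap
            external[j] += gap
    return internal, external


# Hypothetical data: random losses, random availability sets, and a player
# that picks uniformly among the available arms.
rng = random.Random(0)
K, T = 4, 200
losses = [[rng.random() for _ in range(K)] for _ in range(T)]
avail = [set(rng.sample(range(K), rng.randint(1, K))) for _ in range(T)]
plays = [rng.choice(sorted(awake)) for awake in avail]

internal, external = sleeping_regrets(losses, avail, plays)

# Exactly one arm is played per round, so summing the swap regrets i -> j
# over i recovers the external regret against j.
for j in range(K):
    swapped = sum(internal[i][j] for i in range(K))
    assert abs(external[j] - swapped) < 1e-9
```

Under these assumed definitions, the external regret against any arm is at most K times the largest internal regret, which is one way to read the claim that controlling the internal notion also controls the external one.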
Main file
arxiv_int_sldb.pdf (1.27 MB)
Origin: Files produced by the author(s)

Dates and versions

hal-03922350, version 1 (06-01-2023)

License

Attribution

Cite

Pierre Gaillard, Aadirupa Saha, Soham Dan. One Arrow, Two Kills: An Unified Framework for Achieving Optimal Regret Guarantees in Sleeping Bandits. AISTATS 2023 - 26th International Conference on Artificial Intelligence and Statistics, Apr 2023, Valencia, Spain. pp. 7755-7773. ⟨hal-03922350⟩