Online Pandora's Boxes and Bandits

Hossein Esfandiari, MohammadTaghi Hajiaghayi, Brendan Lucier, Michael Mitzenmacher

We consider online variations of the Pandora's box problem (Weitzman. 1979), a standard model for understanding issues related to the cost of acquiring information for decision-making. Our problem generalizes both the classic Pandora's box problem and the prophet inequality framework. Boxes are presented online, each with a random value and cost drew jointly from some known distribution. Pandora chooses online whether to open each box given its cost, and then chooses irrevocably whether to keep the revealed prize or pass on it. We aim for approximation algorithms against adversaries that can choose the largest prize over any opened box, and use optimal offline policies to decide which boxes to open (without knowledge of the value inside). We consider variations where Pandora can collect multiple prizes subject to feasibility constraints, such as cardinality, matroid, or knapsack constraints. We also consider variations related to classic multi-armed bandit problems from reinforcement learning. Our results use a reduction-based framework where we separate the issues of the cost of acquiring information from the online decision process of which prizes to keep. Our work shows that in many scenarios, Pandora can achieve a good approximation to the best possible performance.

Knowledge Graph

arrow_drop_up

Comments

Sign up or login to leave a comment