Abstract
Noisy binary search (NBS) aims to find a closest element to a target value within a sorted array through erroneous queries. In an ideal NBS environment where both the error rate and the cost to each query are fixed, the maximum likelihood estimation procedure has been proven to be the optimal decision strategy. However, in some nonideal NBS problems, both the error rates and the costs are dependent on the queries, and in some cases, finding the optimal decision strategy can be intractable. In this dissertation, we propose to use deep reinforcement learning (DRL) to approximate the optimal decision strategy in NBS problem, in which an intelligent agent is used to interact with the NBS environment. Through training, the agent learns how to take action at each step, either to generate a query or to stop the search and predict the target value. We study different NBS environments, both ideal or non-ideal, with a single target or multiple targets, and propose DRL-based solutions and optimization techniques for each environment. For our proposed algorithms, we study their performance and robustness and the difference between their learned polices and the baseline algorithms.