|Coded Computing in Unknown Environment via Online Learning
|Chien-Sheng Yang, University of Southern California, United States; Ramtin Pedarsani, University of California, Santa Barbara, United States; A. Salman Avestimehr, University of Southern California, United States
|D.1: Coded Computation
|Coded and Distributed Computation
|Recently, there has been a significant increase in the use of cloud networks for event-driven and time-sensitive computations. However, large-scale distributed computing networks can suffer substantially from unpredictable and unreliable computing resources, which can result in high variability of service quality. Thus, it is crucial to design efficient task scheduling policies that guarantee quality of service and the timeliness of computation queries. In this paper, we study the problem of computation offloading over unknown cloud networks with a sequence of timely computation jobs. We model the service quality of each worker (the probability that it returns its result to the user within the deadline) as a function of context (the collection of factors that affect the worker). The user decides which computations to offload to each worker, with the goal of receiving a recoverable set of computation results within the given deadline. Our goal is to design an efficient computing policy in the dark, that is, without knowledge of the context or the computation capabilities of each worker. By leveraging the coded computing framework to tackle failures and stragglers in computation, we formulate this problem as a contextual-combinatorial multi-armed bandit (CC-MAB) problem and aim to maximize the cumulative expected reward. We propose an online learning policy, called the online coded computing policy, which provably achieves asymptotically optimal performance in terms of regret compared with the optimal offline policy.
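The CC-MAB formulation in the abstract can be illustrated with a small simulation. The sketch below is not the paper's policy. It is a minimal illustration under several assumptions of ours: contexts are scalars in [0, 1] partitioned into equal-width bins, the success probability of a worker equals its context (a made-up environment), and the code is threshold-type, so any `k` of the `n` offloaded results form a recoverable set. All names (`OnlineSelector`, `bin_of`, `recoverable`, `simulate`) and parameter values are hypothetical.

```python
import random
from collections import defaultdict


def bin_of(context, num_bins):
    """Map a context in [0, 1] to one of num_bins equal-width bins."""
    return min(int(context * num_bins), num_bins - 1)


def recoverable(outcomes, k):
    """Assumed threshold-type code: any k on-time results suffice."""
    return sum(outcomes) >= k


class OnlineSelector:
    """Hypothetical explore-then-exploit selector over context bins."""

    def __init__(self, num_bins, explore_threshold):
        self.num_bins = num_bins
        self.explore_threshold = explore_threshold
        self.counts = defaultdict(int)      # observations per bin
        self.successes = defaultdict(int)   # on-time results per bin

    def estimate(self, context):
        """Sample-mean success probability for the bin of this context."""
        b = bin_of(context, self.num_bins)
        n = self.counts[b]
        return self.successes[b] / n if n else 0.0

    def select(self, contexts, n):
        """Pick n workers: under-explored bins first, then best estimates."""
        def key(w):
            b = bin_of(contexts[w], self.num_bins)
            under = self.counts[b] < self.explore_threshold
            return (0 if under else 1, -self.estimate(contexts[w]))
        return sorted(range(len(contexts)), key=key)[:n]

    def update(self, context, success):
        """Record whether a worker with this context met the deadline."""
        b = bin_of(context, self.num_bins)
        self.counts[b] += 1
        self.successes[b] += int(success)


def simulate(rounds=2000, num_workers=10, n=5, k=3, seed=0):
    """Return the fraction of rounds in which the results were recoverable."""
    rng = random.Random(seed)
    policy = OnlineSelector(num_bins=5, explore_threshold=20)
    reward = 0
    for _ in range(rounds):
        contexts = [rng.random() for _ in range(num_workers)]
        chosen = policy.select(contexts, n)
        outcomes = []
        for w in chosen:
            # Assumed environment: success probability equals the context.
            ok = rng.random() < contexts[w]
            policy.update(contexts[w], ok)
            outcomes.append(ok)
        reward += recoverable(outcomes, k)
    return reward / rounds
```

Calling `simulate()` returns the empirical fraction of jobs whose results were recoverable by the deadline. Comparing that value against a selector that knows the true success probabilities would give an empirical analogue of the regret discussed in the abstract.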