|Crowdsourced Classification with XOR Queries: An Algorithm with Optimal Sample Complexity
|Daesung Kim, Hye Won Chung, KAIST, Korea (South)
|L.3: Distributed Learning Performance Analysis
|Statistics and Learning Theory
|Click here to download the manuscript
|Click here to watch in the Virtual Symposium
|We consider the crowdsourced classification of $m$ binary labels with XOR queries that ask whether the number of objects having a given attribute in the chosen subset of size $d$ is even or odd. The subset size $d$, which we call query degree, can be varying over queries. Since a worker needs to make more efforts to answer a query of a higher degree, we consider a noise model where the accuracy of worker's answer changes depending both on the worker reliability and query degree $d$. For this general model, we characterize the information-theoretic limit on the optimal number of queries to reliably recover $m$ labels in terms of a given combination of degree-$d$ queries and noise parameters. Further, we propose an efficient inference algorithm that achieves this limit even when the noise parameters are unknown.