We address the task of fusing ranked lists of documents that are retrieved in response to a query. Past work on this task of rank aggregation often assumes that documents in the lists being fused are independent and that only the documents that are ranked high in many lists are likely to be relevant to a given topic. We propose manifold learning aggregation approaches, ManX and v-ManX, that build on the cluster hypothesis and exploit inter-document similarity information. ManX regularizes document fusion scores, so that documents that appear to be similar within a manifold, receive similar scores, whereas v-ManX first generates virtual adversarial documents and then regularizes the fusion scores of both original and virtual adversarial documents. Since aggregation methods built on the cluster hypothesis are computationally expensive, we adopt an optimization method that uses the top-k documents as anchors and considerably reduces the computational complexity of manifold-based methods, resulting in two efficient aggregation approaches, a-ManX and a-v-ManX. We assess the proposed approaches experimentally and show that they significantly outperform the state-of-the-art aggregation approaches, while a-ManX and a-v-ManX run faster than ManX, v-ManX, respectively.
Bibliographical noteKAUST Repository Item: Exported on 2021-03-31
Acknowledgements: This research was supported by Ahold Delhaize, Amsterdam Data Science, the Bloomberg Research Grant program, the Criteo Faculty Research Award program, Elsevier, the European Community's Seventh Framework Programme (FP7/2007-2013) under grant agreement nr 312827 (VOX-Pol), the Google Faculty Research Awards program, the Microsoft Research Ph.D. program, the Netherlands Institute for Sound and Vision, the Netherlands Organisation for Scientific Research (NWO) under project nrs 612.001.116, CI-14-25, 652.002.001, 612.001.551, 652.001.003, and Yandex. All content represents the opinion of the authors, which is not necessarily shared or endorsed by their respective employers and/or sponsors.