Investigating Comparison-based Evaluation for Sparse Data
Evaluation is ubiquitous. We often need to evaluate a set of target entities (movies, restaurants, products, courses, paper submissions) and obtain their true ratings (the average ratings over the whole population) or true rankings (the rankings induced by the true ratings). By the law of large numbers, average ratings from large samples serve this purpose well. In practice, however, evaluation data are typically extremely sparse, and each entity receives only a very small number of ratings. In this case, the average ratings can differ significantly from the true ratings, because the evaluators assigned to an entity form a biased sample and hold different standards or preferences. Based on the observation that comparative evaluations (e.g., paper 1 is better than paper 2) are more trustworthy than isolated ratings (e.g., paper 1 has a score of 4.5), in this study we investigate comparison-based evaluation. The principal idea is to first extract a partial ranking of the entities evaluated by each evaluator, and then aggregate all the partial rankings into a total ranking that closely approximates the true ranking. The aggregated total ranking can in turn be used to estimate the true ratings. We also investigate the associated topic of evaluation assignment (assigning target entities to evaluators). Many applications (e.g., academic conferences) have such an assignment phase before evaluation is conducted, but currently the assignment is not designed to maximize evaluation quality. We propose a layered assignment approach that maximizes the quality of comparison-based evaluation for a given amount of evaluation resources (evaluation is generally labor-intensive). All the proposed algorithms have been implemented and validated on benchmark datasets in comparison with state-of-the-art methods. In addition, to demonstrate the utility of our approach, a prototype system has been deployed and made publicly accessible.
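The two-step idea of extracting a partial ranking per evaluator and then aggregating can be illustrated with a minimal sketch. The abstract does not say which aggregation algorithm the study uses, so the net pairwise-win count below is only a simple stand-in, and all function names and the toy data are hypothetical. The example shows a strict and a lenient evaluator whose raw scores mislead the average but whose within-evaluator comparisons remain consistent.

```python
from collections import defaultdict
from itertools import combinations


def pairwise_wins(ratings_by_evaluator):
    """Extract each evaluator's partial ranking as pairwise comparisons.

    Returns a dict mapping (winner, loser) to the number of evaluators
    who rated `winner` strictly higher than `loser`.
    """
    wins = defaultdict(int)
    for ratings in ratings_by_evaluator.values():
        for a, b in combinations(sorted(ratings), 2):
            if ratings[a] > ratings[b]:
                wins[(a, b)] += 1
            elif ratings[b] > ratings[a]:
                wins[(b, a)] += 1
            # Ties carry no comparative information and are skipped.
    return wins


def aggregate_ranking(ratings_by_evaluator):
    """Aggregate partial rankings by net pairwise wins (an assumed stand-in,
    not the method of the study). Ties are broken by entity name."""
    score = defaultdict(int)
    for ratings in ratings_by_evaluator.values():
        for entity in ratings:
            score[entity] += 0  # make sure every entity appears
    for (winner, loser), count in pairwise_wins(ratings_by_evaluator).items():
        score[winner] += count
        score[loser] -= count
    return sorted(score, key=lambda e: (-score[e], e))


def average_ranking(ratings_by_evaluator):
    """Baseline: rank entities by their mean raw rating."""
    totals, counts = defaultdict(float), defaultdict(int)
    for ratings in ratings_by_evaluator.values():
        for entity, value in ratings.items():
            totals[entity] += value
            counts[entity] += 1
    return sorted(totals, key=lambda e: (-totals[e] / counts[e], e))


# Hypothetical sparse data: each evaluator sees only two entities.
example = {
    "strict": {"A": 3.0, "B": 2.0},    # prefers A over B, scores low
    "lenient": {"B": 4.5, "C": 4.0},   # prefers B over C, scores high
}

print(average_ranking(example))    # ['C', 'B', 'A']
print(aggregate_ranking(example))  # ['A', 'B', 'C']
```

In this toy case the averaged scores place C first only because C happened to be seen by the lenient evaluator, whereas the comparison-based aggregate respects both evaluators' stated preferences (A over B, B over C).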