Interesting papers from SIGIR '21
SIGIR, one of the best information retrieval and recommender systems conferences, takes place this week. We skimmed through all the accepted papers to find the ones most relevant to recommender alignment:
Standing in Your Shoes: External Assessments for Personalized Recommender Systems. Interesting approach to use external (non-user) human assessors to directly estimate user preferences; presumably humans are better at predicting other humans’ preferences than ML systems using biased historical data.
On Interpretation and Measurement of Soft Attributes for Recommendation. Using natural language is one of the most promising approaches to increase communication bandwidth between users and recommender algorithms; this paper takes a step in that direction by providing a new dataset and set of algorithms to interpret “soft” or ambiguous natural language attributes.
FaiRecSys: Mitigating Algorithmic Bias in Recommender Systems. Highlights how feedback loops can amplify biases in real recommender systems, provides a formal definition of fairness, and proposes an algorithm to redress those biases.
Societal Biases in Retrieved Contents: Measurement Framework and Adversarial Mitigation of BERT Rankers adapts adversarial bias mitigation techniques from NLP to information retrieval.
Simulating User Satisfaction for the Evaluation of Task-oriented Dialogue Systems has an annotated dataset of user satisfaction from chatbot conversations.
Clicks can be Cheating: Counterfactual Recommendation for Mitigating Clickbait Issue has an interesting and seemingly very general approach to counter clickbait and other forms of user manipulation: eliminate the effect of “exposure features” (features that the user sees prior to clicking) so that the model is forced to predict user behavior based on the remaining features.
Fight Fire with Fire: Towards Robust Recommender Systems via Adversarial Poisoning Training provides a way to inject “antidote data” into the recommender training data to optimize some social welfare function like minimizing polarization or maximizing various notions of fairness.
Fairness among New Items in Cold Start Recommender Systems is interesting in that it optimizes for an extremely demanding notion of fairness: Rawlsian Max-Min, i.e. maximize the true positive rate for the worst-off group.
HOOPS: Human-in-the-Loop Graph Reasoning for Conversational Recommendation designs a much-needed dataset and benchmark to evaluate human-in-the-loop and learning-from-human-preferences approaches to recommendation.
An Exploration of Tester-based Evaluation of User Simulators for Comparing Interactive Retrieval Systems proposes evaluating user simulations with batteries of information retrieval systems whose order of user preference should already be known (e.g. two IR systems A and B, where B has a strict improvement, say to page rank, that A lacks), and then checking whether the user simulations order those systems correctly.