Learning Human Preferences: From Clicks to Conversations

By Suryanarayana Sankagiri, EPFL, Switzerland

Seminar Hall 51, 4th floor, Main Building

Abstract

People routinely reveal their preferences online, e.g., when choosing
search results, videos, or products. Such data is used by algorithms to learn human
tastes. Recently, curated datasets of human preferences have been used to fine-tune
language models, substantially improving their alignment with human intent.
These successes raise a natural question: can recommender systems learn more
effectively from comparisons than from ratings? The talk will trace a path from basic
models of choice behaviour to new frameworks for recommender systems. The main
focus will be on our theoretical result showing that personalised recommendations
can be learned efficiently from comparison data, despite the underlying optimisation
problem being nonconvex. I will then describe a bandit formulation that addresses
the classical exploration-exploitation trade-off in a novel way. Finally, I’ll share
empirical insights motivating richer models of human choice. I will conclude by
arguing that learning from human preferences is key to building interactive AI
systems that reliably serve human needs.
