Novelty and diversity in search results
Author(s) | Ravichandran, Praveen | |
Date Accessioned | 2015-05-26T13:29:58Z | |
Date Available | 2015-05-26T13:29:58Z | |
Publication Date | 2014 | |
Abstract | Information retrieval (IR) is the process of obtaining relevant information for a given information need. The concept of relevance and its relation to information needs is of central concern to IR researchers. Until recently, much work in IR settled on a notion of relevance that is topical -- that is, containing information "about" a specified topic -- and in which the relevance of a document in a ranking is independent of the relevance of other documents in the ranking. But such an approach is likely to produce a ranking with a high degree of redundancy; the amount of novel information available to users may be minimal as they traverse down a ranked list. In this work, we focus on the novelty and diversity problem, which models the relevance of a document by taking into account the inter-document effects in a ranked list and the diverse information needs behind a given query. Existing approaches to this problem mostly rely on identifying subtopics (disambiguation, facets, or other component parts) of an information need, then estimating a document's relevance independently with respect to each subtopic. Users are treated as being satisfied by a ranking of documents that covers the space of subtopics as well as covering each individual subtopic sufficiently. We propose a novel approach that models novelty implicitly while retaining the ability to capture other important factors affecting user satisfaction. We formulate a set of hypotheses based on the existing subtopic approach and test them with actual users using a simple conditional preference design: users express a preference for document A or document B given document C. Following this, we introduce a novel triplet framework for collecting such preference judgments and using them to estimate the total utility of a document while taking inter-document effects into account.
Finally, a set of utility-based metrics is proposed and validated to measure the effectiveness of a system for the novelty and diversity task. | en_US |
Advisor | Carterette, Benjamin A. | |
Degree | Ph.D. | |
Department | University of Delaware, Department of Computer and Information Sciences | |
Unique Identifier | 910103948 | |
URL | http://udspace.udel.edu/handle/19716/16777 | |
Publisher | University of Delaware | en_US |
URI | http://search.proquest.com/docview/1622910938?accountid=10457 | |
dc.subject.lcsh | Information retrieval. | |
dc.subject.lcsh | Database searching. | |
Title | Novelty and diversity in search results | en_US |
Type | Thesis | en_US |