Offer related to the Action/Network: -/-
Laboratory/Company: Criteo AI Lab Paris / Sorbonne Université
Duration: 36 months
Contact: patrick.gallinari@sorbonne-universite.fr
Publication deadline: 2025-01-15
Context:
New paradigms such as Generative Information Retrieval (GenIR) and Generative Recommendation (GenREC), built on foundation models, aim to transform how information is accessed. GenIR folds all components of a traditional IR system into a single model that generates responses directly from user queries; GenREC applies the same idea to recommendation. The goal of this PhD project is to explore the convergence of generative models for search, recommendation, and related downstream tasks.
Subject:
The first step will be to develop a unified generative engine for both search and recommendation, so that a single engine can alternate seamlessly between the two modes during an interactive session. This is also a step toward foundation models that offer a variety of functions to enhance user interactions. The second step will be to adapt this model to the large-scale, dynamic corpora characteristic of recommendation systems in the adtech industry, which raises additional research challenges. The two directions are briefly described below.
Task 1: Unifying Generative IR and Recommendation
This task aims to develop a unified engine for search and recommendation that can alternate between the two modes within an interactive session. The goal is to improve performance in both domains through a multi-task framework, which also enriches the training data for both. Search and recommendation share similarities but differ in key respects, notably intent: search is driven by user queries, while recommendation relies on past user behavior. We aim to address these differences by defining a joint architecture and a multi-task training strategy that captures the semantic distinction between search (similarity-based) and recommendation (collaborative).
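As an illustration only, one way to frame both tasks for a single sequence-to-sequence model is to tag each training example with a task marker and have the model generate an item or document ID as its target. The sketch below assumes this framing; the markers `[search]` and `[recommend]`, the `Example` type, and the helper names are hypothetical and not taken from the project description.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Example:
    source: str  # text fed to the generative model
    target: str  # ID the model is trained to generate


# Hypothetical task markers (an assumption of this sketch).
SEARCH_PREFIX = "[search]"
RECO_PREFIX = "[recommend]"


def search_example(query: str, doc_id: str) -> Example:
    """Search: the input is a user query, the target is a relevant document ID."""
    return Example(f"{SEARCH_PREFIX} {query}", doc_id)


def reco_example(history: List[str], next_item_id: str) -> Example:
    """Recommendation: the input is past behavior, the target is the next item ID."""
    return Example(f"{RECO_PREFIX} {' '.join(history)}", next_item_id)


def interleave(search: List[Example], reco: List[Example]) -> List[Example]:
    """Alternate examples from the two tasks, then append any leftovers."""
    mixed: List[Example] = []
    for s, r in zip(search, reco):
        mixed.extend([s, r])
    shorter = min(len(search), len(reco))
    mixed.extend(search[shorter:] + reco[shorter:])
    return mixed


search = [search_example("running shoes", "item_17")]
reco = [
    reco_example(["item_03", "item_17"], "item_42"),
    reco_example(["item_42"], "item_08"),
]
batch = interleave(search, reco)
```

Because both tasks share one target vocabulary of IDs, a mixed batch like `batch` could be fed to the same model; how the architecture should separate similarity-based from collaborative signals is precisely the open question of this task.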
Task 2: Enhancing ID Associations for Large and Dynamic Collections
In this task, the goal is to improve document and item ID representations in large-scale, dynamic collections for a joint search/recommendation system. We will explore methods such as hierarchical structures and prior knowledge (e.g., product taxonomies) to optimize ID design. By leveraging additional information like brands or categorizations, we aim to improve the retrieval and recommendation process, particularly for large and evolving datasets.
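To make the idea of taxonomy-informed IDs concrete, the following minimal sketch builds each item ID from its taxonomy path (category, subcategory, brand) plus a local counter. It is an assumption-laden illustration, not the method the project will adopt: the class `HierarchicalIdAssigner`, the helper `shared_prefix_len`, and the sample taxonomy are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


class HierarchicalIdAssigner:
    """Assigns IDs of the form (taxonomy path..., local index)."""

    def __init__(self) -> None:
        # Next free local index for each taxonomy path.
        self._next_index: Dict[Tuple[str, ...], int] = defaultdict(int)
        self.ids: Dict[str, Tuple[str, ...]] = {}

    def assign(self, item: str, taxonomy_path: List[str]) -> Tuple[str, ...]:
        """Return the item's ID, creating it on first sight.

        New items receive a fresh local index, so existing IDs never change
        when the collection grows.
        """
        if item in self.ids:
            return self.ids[item]
        key = tuple(taxonomy_path)
        index = self._next_index[key]
        self._next_index[key] += 1
        item_id = key + (str(index),)
        self.ids[item] = item_id
        return item_id


def shared_prefix_len(a: Sequence[str], b: Sequence[str]) -> int:
    """Number of leading ID tokens two items have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


assigner = HierarchicalIdAssigner()
shoe_a = assigner.assign("trail shoe A", ["sports", "shoes", "BrandX"])
shoe_b = assigner.assign("trail shoe B", ["sports", "shoes", "BrandX"])
tent_c = assigner.assign("tent C", ["sports", "camping", "BrandY"])
```

In this sketch, items from the same brand and category share a long ID prefix, while unrelated items diverge early, and adding an item to a dynamic collection does not require reassigning existing IDs.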
Candidate profile:
Computer science or applied mathematics. Good programming skills.
Required education and skills:
Master's degree in computer science or applied mathematics, or an engineering school degree. Good background and experience in machine learning.
Work location:
Criteo AI Lab Paris
Attached document: 202410031512_2024-09-PhD position-description-Generative-IR-Criteo.pdf