Sofian Chaybouti

I'm a Research Scientist in the vision team at the Technology Innovation Institute (TII), while finishing my PhD at the Tübingen AI Center, advised by Prof. Hilde Kuehne. I'm also participating in the MIT-IBM Watson Sight and Sound Project.

Prior to this, I finished my Master of Engineering in Applied Mathematics at ENSTA Paris in France and my MVA Master (Mathematics, Vision and Learning) at ENS Paris-Saclay.

Before starting my PhD, I worked for three years in industry research teams: at Credit Agricole's Datalab in Montrouge, France, where I mostly focused on NLP and search engines, and at Huawei's Noah's Ark Lab in Paris, where I worked on optimization and reinforcement learning.

Research Papers

My current research focuses on multimodal models, especially video understanding, video-language modeling, efficient video representations, and efficient multimodal models.

Falcon Perception

Aviraj Bevli, Sofian Chaybouti, Yasser Dahou, Hakim Hacid, Ngoc Dung Huynh, Phúc H. Lê Khac, Sanath Narayan, Wamiq Reyaz Para, Ankit Singh

arXiv, 2026

SigLino: Efficient Multi-Teacher Distillation for Agglomerative Vision Foundation Models

Sofian Chaybouti, Sanath Narayan, Yasser Dahou, Phúc H. Lê Khac, Ankit Singh, Ngoc Dung Huynh, Wamiq Reyaz Para, Hilde Kuehne, Hakim Hacid

CVPR 2026 (Highlight); Best Paper Award, A2A Multimodal Workshop

VisRes Bench: On Evaluating the Visual Reasoning Capabilities of VLMs

Brigitta Malagurski Törtei, Yasser Dahou, Ngoc Dung Huynh, Wamiq Reyaz Para, Phúc H. Lê Khac, Ankit Singh, Sofian Chaybouti, Sanath Narayan

CVPR 2026

REVEAL: Relation-based Video Representation Learning for Video-Question-Answering

Sofian Chaybouti, Walid Bousselham, Moritz Wolter, Hilde Kuehne

arXiv, 2025

MaskInversion: Localized Embeddings via Optimization of Explainability Maps

Walid Bousselham, Sofian Chaybouti, Christian Rupprecht, Vittorio Ferrari, Hilde Kuehne

ICLR 2026

LeGrad: An Explainability Method for Vision Transformers via Feature Formation Sensitivity

Walid Bousselham, Angie Boggust, Sofian Chaybouti, Hendrik Strobelt, Hilde Kuehne

ICCV 2025

Meta-learning of black-box solvers using deep reinforcement learning

Sofian Chaybouti, Ludovic Dos Santos, Cedric Malherbe, Aladin Virmaux

NeurIPS 2022, MetaLearn Workshop

EfficientQA: A RoBERTa Based Phrase-Indexed Question-Answering System

Sofian Chaybouti, Achraf Saghe, Aymen Shabou

arXiv, 2021

MIX: a Multi-task Learning Approach to Solve Open-Domain Question Answering

Sofian Chaybouti, Achraf Saghe, Aymen Shabou

arXiv, 2020