ANR CALME
The increasing availability of massive datasets presents a number of challenges for machine learning (ML), with two key issues standing out: the high dimensionality of the data and the difficulty of obtaining high-quality labeled data.
This project aims to develop ML methods that require only moderate computational and labeling resources and that rest on robust theoretical foundations, in line with the principles of frugal ML.
While many methods attempt to address these challenges, this project is distinctive in its focus on two key concepts: compression—an algorithmic process that simplifies data into a “lower-complexity” space—and alignment, which connects parts of different objects by identifying their correspondences, often using optimal transport techniques.
The main objectives of this project, called Compression and Alignment for Efficient Machine Learning (CALME), are to: 1) explore the theoretical foundations of modern data summarization methods through the lens of optimal transport and develop new, efficient compression techniques that are mathematically sound, and 2) use optimal transport and alignment to create ML methods suited to scenarios with limited or noisy labeled data.
