Professional Trajectory

From clinical data pipelines to production AI systems.

A retrospective of tenures at the frontier of machine learning, focusing on autonomous systems, model training and inference, and the architecture of intelligence.

2024 - Present

Data Scientist / Applied ML Engineer

Health Data Hub

PARIS, FRANCE

Key Systems

  • Large-scale clinical data integration, linking and cleaning 100M+ rows from SNDS and Paris Hospitals EDWs, with scalable ETL workflows and ~150 derived analysis-ready variables.

  • Synthetic data generation, designing ParaBios, a multi-tabular stochastic generator enabling multi-constraint data synthesis under formal schema specifications at 500 MB/min throughput.

  • Clinical NLP anonymization, implementing ASR/NER pipelines (Whisper + GLiNER) for emergency call data, achieving ~92% anonymization recall.

Ownership & Impact

Full-stack ownership across a live healthcare data platform, from ingestion pipelines to deployed predictive models. Contributed to several internal tooling initiatives alongside the core modelling work, including documentation automation and structured data generation.

Clinical NLP, Synthetic Data, Healthcare AI, MLOps

2022 - 2023

Data Scientist

SogetiLabs (part of Capgemini)

ISSY-LES-MOULINEAUX, FRANCE

Technical Impact

Worked across several applied ML problems: clinical NLP extraction, EEG-based biomarker classification, and LLM-assisted test automation. A broad mandate, handled with a consistent focus on business impact and interpretability.

NLP, EHR Extraction, EEG, Few-shot LLMs, Clinical Data

Academic Formation

Education

2021 - 2023

M.Sc. in Machine Learning

Université Paris Cité

2017 - 2022

M.Eng. in Computer Science & Software Engineering

École Nationale Supérieure d'Informatique (ESI)