Arian Amani

Machine Learning Scientist

AI VIVO

Wellcome Sanger Institute

Biography

I build AI systems that model how cells respond to drugs and perturbations, bridging deep generative models, single-cell biology, and production ML to accelerate therapeutic discovery.

At AI VIVO and the Wellcome Sanger Institute, I work on virtual cells, causal representation learning, flow matching, single-cell perturbation modeling, and drug discovery AI, building production ML systems with PyTorch, GCP, and HuggingFace.

Interests
  • Deep Generative Models
  • Causal Representation Learning
  • Flow Matching
  • Single-Cell Perturbation
  • Virtual Cells
  • Drug Discovery
Education
  • Applied Computer Science & Artificial Intelligence, 2023

    Sapienza University of Rome

  • BSc in Computer Science, 2020

    Amirkabir University of Technology

Current Work

  • CellDISECT (bioRxiv, 2025) / Code: A causal generative model for disentangling cell state variation and predicting counterfactual responses in single-cell data.
  • SP-FM (arXiv, 2026): A flow-matching method that improves generalization to unseen conditions by learning condition-aware transport dynamics.
  • AI VIVO: Machine Learning Scientist building multimodal generative models and production ML systems for therapeutic discovery.

Work Experience

AI VIVO
Machine Learning Scientist
December 2024 – Present 1 yr 3 mos Cambridge, United Kingdom
  • Develop deep learning and generative models for drug discovery using transformer and flow matching architectures
  • Deploy and scale ML pipelines on GCP using PyTorch Lightning and Docker
  • Design multimodal ML pipelines integrating molecular structure and biological assay data
  • Maintain scalable pipelines using PyTorch, Lightning, RDKit, and HuggingFace
  • Work with computational chemistry tools such as BioSolveIT, AutoDock Vina, and Boltz-2

Wellcome Sanger Institute
Data Scientist
November 2022 – Present 3 yrs 4 mos Hinxton, United Kingdom

Responsibilities include:

  • Research assistant in Dr. Mo Lotfollahi’s lab
  • Co-first author of CellDISECT, a deep generative model for disentangled cellular representations and in silico perturbation analysis, developed to study perturbation effects across single-cell populations
  • Collaborated on several ML projects in single-cell genomics and drug discovery, mainly using PyTorch
  • Developed task-specific fine-tuning pipelines for generative and transformer models
  • Contributed to ongoing projects such as CPA: Compositional Perturbation Autoencoder (GitHub)

Virasad
Computer Vision Engineer
January 2022 – May 2022 5 mos Tehran, Iran

Responsibilities include:

  • Delivered solutions with >95% accuracy on tasks with limited data (15 images per class)
  • Spearheaded development on 5 diverse projects meeting client requirements
  • Led individual projects, enhancing development pipelines

Projects

CellDISECT
CellDISECT (Cell DISentangled Experts for Covariate counTerfactuals) is a causal generative model for single-cell analysis that disentangles sources of variation, makes counterfactual predictions, and supports flexible fairness.
CPA (Compositional Perturbation Autoencoder)
CPA is a deep generative framework for learning the effects of perturbations at the single-cell level. It makes out-of-distribution (OOD) predictions for unseen combinations of drugs, learns interpretable embeddings, estimates dose-response curves, and provides uncertainty estimates.

Recent Publications

(2026). Shortest-Path Flow Matching with Mixture-Conditioned Bases for OOD Generalization to Unseen Conditions. arXiv.

(2025). Integrating multi-covariate disentanglement with counterfactual analysis on synthetic data enables cell type discovery and counterfactual predictions. bioRxiv.

Teaching Experience

Teaching Assistant
Sharif University of Technology
September 2022 – August 2023 1 yr Tehran, Iran
  • Machine Learning for Bioinformatics (Graduate Course) | Spring 2023
    • Prepared teaching material on CNNs & AutoEncoders, designed assignments, and coordinated class contests.
  • Introduction to Machine Learning | Fall 2022
    • Designed and graded assignments for a class of 150 students and conducted a workshop on variational autoencoders.

Teaching Assistant
Amirkabir University of Technology
September 2021 – March 2022 7 mos Tehran, Iran
  • Introduction to Image Processing and Neural Networks | Fall 2021
    • Conducted workshops and lectures on OpenCV and Deep Learning for a class of 80 students.
  • Advanced Programming with C++ | Spring 2022
    • Designed assignments and projects for a class of 90 students and evaluated student submissions.

Accomplishments

Coursera
Deep Learning Specialization
  • Neural Networks and Deep Learning
  • Improving Deep Neural Networks: Hyperparameter Tuning, Regularization and Optimization
  • Structuring Machine Learning Projects
  • Convolutional Neural Networks
  • Sequence Models