12 sub-projects, 5 work packages, 2 aims. Aim 1 (WP1–WP3) develops the methodological backbone; Aim 2 (WP4–WP5) puts those methods to work across HT through consultancy and graduating collaborations.
WP1
Combining Biological Scales from Molecules to Populations
Bridge the full continuum of biological organization — molecules, cells, tissues, organs, populations — with scalable AI methods that integrate and reason across these levels. The ambition is to turn fragmented data streams into coherent, predictive representations of biological systems.
P1.1
Bridging structure and -omics at subcellular resolution
PIs: Pigino (lead), Funke, Jug
Integrate cryo-EM, expansion microscopy, volumetric CLEM, and spatial omics to correlate ciliary structural features with molecular states at subcellular resolution. Connects to the Ciliopathies RFP and produces a general toolkit for nanoscale structure-to-molecule analysis.
P1.2
From FIB-SEM to live-cell imaging
PIs: Zerial (lead), Funke, Jug
A correlative pipeline linking volumetric FIB-SEM with fluorescence imaging of fixed and live cells. Deep-learning segmentation, efficient annotation, and high-performance computing (HPC) yield multi-scale maps from organelles to tissue. Initial focus: liver tissue; eventual delivery via HT’s National Facilities.
P1.3
AI for molecular knowledge at population scale
PIs: Soranzo (lead), Ieva, new HDS GL
Integrate population-scale single-cell genomics with electronic health records to infer disease trajectories with greater precision. Informed by Soranzo’s ERC IMPACT grant and the UK Biobank / Genes and Health cohorts. Connects to the Cardiometabolic RFP.
WP2
Combining Data Modalities into Multimodal Solutions
Coordinators: Glastonbury, Jug
Develop AI strategies that integrate heterogeneous data streams into unified, interpretable representations — revealing relationships invisible within any single modality and supporting predictive, mechanistic understanding of disease.
P2.1
Multimodal, explainable breast cancer risk prediction
PIs: Jug (lead), Di Angelantonio
A device-agnostic AI framework integrating 2D mammography, 3D digital breast tomosynthesis, and longitudinal EHR data for individualized, time-specific risk scores. Focus on out-of-distribution robustness, demographic fairness, and human-centred explainability co-designed with radiologists.
P2.2
Exposome intelligence and digital twins
PIs: Ieva (lead), new HDS GL
AI methods for clinical complexity that go beyond medical records to include the exposome: environmental, socio-demographic, economic, and behavioural factors. Combined with Medical Digital Twins, these methods aim to support health policies tailored to individual patients.
P2.3
A multimodal spatial pathology atlas of Alzheimer’s disease
PIs: Glastonbury (lead)
With King’s College London Neurodegenerative Disease Biobank: ~30,000 whole-slide images from 640 donors plus Visium HD spatial transcriptomics, WGS, and pathology reports. Goal: the most detailed multimodal map of Alzheimer’s pathology to date.
WP3
Synthetic Data Generation at Scale
Coordinators: Ieva
Generate realistic synthetic biological and clinical data that protects privacy while expanding analytical reach — supporting research that would otherwise be blocked by data scarcity or sensitivity.
P3.1
Virtual Patient
PIs: Ieva (lead), new HDS GL
Holistic patient representation through deep learning and LLMs. Conditional GANs and language models for structured EHRs; LLM pipelines with RAG and knowledge-graph integration for clinical narratives; anatomically guided medical-image synthesis. Connects to the Cardiometabolic RFP.
P3.2
Failure-aware agentic AI critics from ELN data
Train critic models to evaluate plans proposed by agentic AI in the lab, learning from the tacit knowledge buried in failed experiments and protocol deviations recorded in electronic lab notebook (ELN) systems. The critics flag risky actions and propose safer revisions.
WP4
Consultancy and Proof-of-concept Solutions
Coordinators: Funke (co-lead Jug)
The dynamic side of the Hub: short, well-scoped proof-of-concept work for HT colleagues. The four projects below are a snapshot — new ideas can come in throughout the Flagship’s lifetime.
P4.3
3D segmentation of stem cells in tissue and organoids
Robust deep-learning pipeline for segmenting individual cells in 3D brain organoids and tumor organoids — cells with intricate, overlapping morphologies that defeat conventional algorithms. Builds on Funke’s connectomics work.
P4.4
Omics2EM: bridging -omics and cryo-EM
PIs: Calviello (lead), Erdmann, Funke, Jug
Use molecular heterogeneity from transcriptomics and proteomics as a predictor of alternative complexes in cryo-EM — and conversely, use cryo-EM latent representations to identify heterogeneous molecular complexes. Initial focus: the ribosome.
WP5
Collaborative Science Facilitation Projects
Coordinators: Jug (co-lead Funke)
Where successful proof-of-concept projects from WP4 graduate into sustained, co-developed research. Two projects have already reached this level of maturity.
P5.1
Codon- and structure-aware RNA language model
An RNA language model that integrates codon semantics, regulatory motifs, and secondary structure to predict mRNA half-life. Hybrid tokenization (codons for coding regions, sub-words plus structural annotations for UTRs) and physics-informed regularization.
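The hybrid tokenization idea can be illustrated with a minimal sketch. It is not the project's implementation: the function name `hybrid_tokenize`, the region tags, the token format, and the `utr_chunk` parameter are all hypothetical. Fixed-length chunks stand in for a learned sub-word vocabulary, the secondary structure is assumed to arrive as a dot-bracket string from an external folding tool, and the physics-informed regularization is not shown.

```python
from typing import List, Optional


def hybrid_tokenize(
    seq: str,
    cds_start: int,
    cds_end: int,
    structure: Optional[str] = None,
    utr_chunk: int = 4,
) -> List[str]:
    """Tokenize an mRNA: codons in the coding region, annotated chunks in UTRs.

    Assumptions (illustrative only): the CDS spans seq[cds_start:cds_end],
    `structure` is an optional dot-bracket string of the same length as `seq`,
    and fixed-length chunks approximate sub-word units.
    """
    seq = seq.upper().replace("T", "U")  # accept DNA or RNA alphabets
    if not 0 <= cds_start <= cds_end <= len(seq):
        raise ValueError("CDS coordinates fall outside the sequence")
    if (cds_end - cds_start) % 3 != 0:
        raise ValueError("CDS length must be a multiple of 3")
    if structure is not None and len(structure) != len(seq):
        raise ValueError("structure must have the same length as seq")
    if utr_chunk < 1:
        raise ValueError("utr_chunk must be positive")

    tokens: List[str] = []

    def add_utr(lo: int, hi: int, tag: str) -> None:
        # Split an untranslated region into chunks; attach structure if given.
        for i in range(lo, hi, utr_chunk):
            j = min(i + utr_chunk, hi)
            token = f"{tag}:{seq[i:j]}"
            if structure is not None:
                token += f"|{structure[i:j]}"
            tokens.append(token)

    add_utr(0, cds_start, "5UTR")
    for i in range(cds_start, cds_end, 3):
        tokens.append(f"CDS:{seq[i:i + 3]}")  # one token per codon
    add_utr(cds_end, len(seq), "3UTR")
    return tokens
```

For a toy transcript `GGACCATGGCTTAATTTGC` with the coding region at positions 5 to 14, the coding part becomes `CDS:AUG`, `CDS:GCU`, `CDS:UAA`, while each UTR chunk carries its slice of the dot-bracket string. Keeping codons intact preserves the reading frame, which a purely sub-word tokenizer would not guarantee.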
P5.2
AI-enabled cryo-FIB and lift-out pipeline
PIs: Erdmann (lead), Pigino, Jug
An automated pipeline that integrates light microscopy, SEM, and FIB to guide cryo-FIB lift-out and lamella preparation. Deep-learning feature detection enables automated, reproducible sample production from organoids and tissues.