Integrating Biological Knowledge for Robust Microscopy Image Profiling on De Novo Cell Lines (ICCV2025)

This repo contains code for our ICCV 2025 paper Integrating Biological Knowledge for Robust Microscopy Image Profiling on De Novo Cell Lines, a novel framework for phenotypic screening involving unseen cell lines.

📖 arXiv | 💻 GitHub | 🤗 Model

Install

Please install the required packages before running the code:

timm==0.9.16
scanpy==1.10.3
torch==2.1.2 (CUDA 12.1)
transformers==4.26.1
wandb
anndata
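
To confirm that an environment matches the pins above, the installed versions can be compared against them. The following is a minimal sketch and is not part of this repository: the `check_versions` helper and the `REQUIRED` table are hypothetical names, and only the standard library's `importlib.metadata` is used.

```python
from importlib.metadata import PackageNotFoundError, version

# Pins copied from the list above; wandb and anndata are unpinned (None).
REQUIRED = {
    "timm": "0.9.16",
    "scanpy": "1.10.3",
    "torch": "2.1.2",
    "transformers": "4.26.1",
    "wandb": None,
    "anndata": None,
}


def check_versions(required):
    """Return {package: status}: 'ok', 'missing', or 'mismatch (found X)'."""
    report = {}
    for name, pinned in required.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            report[name] = "missing"
            continue
        # Drop local build tags such as '+cu121' before comparing.
        base = installed.split("+")[0]
        if pinned is None or base == pinned:
            report[name] = "ok"
        else:
            report[name] = f"mismatch (found {installed})"
    return report


if __name__ == "__main__":
    for package, status in check_versions(REQUIRED).items():
        print(f"{package}: {status}")
```

Any package reported as `missing` or `mismatch` should be reinstalled at the pinned version before running the scripts.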

Data

For the RxRx datasets, please download the raw image data and the metadata file from https://www.rxrx.ai/datasets. The processed metadata for pretraining and evaluation can be found under the data folder.
The raw RNA-seq data are from GSE288929 and GSM7745109. We also provide the preprocessed h5ad file on Hugging Face.

The src/data/ folder contains the following key files:

- rxrxmeta.csv: Metadata for the pre-training data
- u2os_data.csv: Metadata for the evaluation dataset used in the U2OS cell line experiments
- scvi.pkl: Cell line representations obtained using scVI
- u2os.pkl: Perturbation graph edge weights for the U2OS cell line
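
Before training, it can help to check that these files are present and readable. The sketch below is an illustration under stated assumptions, not code from this repository: `summarize_data_dir` is a hypothetical helper, and it makes no assumption about the column names or the object types stored in the pickle files. Unpickling may require the libraries that created the objects to be installed, and pickle files should only be loaded from a trusted source.

```python
import csv
import pickle
from pathlib import Path


def summarize_data_dir(data_dir):
    """Return {filename: description} for the CSV and pickle files in data_dir."""
    summary = {}
    for path in sorted(Path(data_dir).iterdir()):
        if path.suffix == ".csv":
            with path.open(newline="") as handle:
                reader = csv.reader(handle)
                header = next(reader, [])
                n_rows = sum(1 for _ in reader)  # data rows, header excluded
            summary[path.name] = f"{n_rows} rows, {len(header)} columns"
        elif path.suffix == ".pkl":
            # Only unpickle files from a source you trust.
            with path.open("rb") as handle:
                obj = pickle.load(handle)
            summary[path.name] = f"pickled {type(obj).__name__}"
    return summary
```

Calling `summarize_data_dir("src/data")` returns one entry per CSV or pickle file, which makes a missing or unreadable file easy to spot.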

Pretraining

Download Data: First, download the metadata file and the raw images of the RxRx1 dataset.
Configure Paths: Update the data paths in your configuration:
- Modify csv_path to point to your metadata file location
- Update root_dir to point to your image dataset directory
Run Pre-training:
```
bash src/pretrain/run_pretrain.sh
```
Hyperparameter Tuning:
- You can sweep various hyperparameters according to your needs
- Alternatively, customize using the hyperparameters provided in src/eval/run_eval.sh
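
A hyperparameter sweep can be laid out as a plain grid of configurations before wiring it into any launcher. The sketch below is an assumption-laden illustration: the hyperparameter names `lr` and `batch_size`, their values, and the `expand_grid` helper are all hypothetical and are not taken from the training scripts.

```python
from itertools import product

# Hypothetical search space; replace the names and values with the options
# that the training script actually exposes.
SEARCH_SPACE = {
    "lr": [1e-4, 3e-4],
    "batch_size": [64, 128],
}


def expand_grid(space):
    """Return one dict per combination of values in the search space."""
    names = list(space)
    combos = product(*(space[name] for name in names))
    return [dict(zip(names, values)) for values in combos]


if __name__ == "__main__":
    for config in expand_grid(SEARCH_SPACE):
        print(config)
```

Each resulting dictionary describes one run; how the values are passed to the training script depends on the arguments that script accepts.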

Evaluation

Configure Evaluation Paths: Update the following paths in src/eval/main.py:
- pkl_path: Path to gene expression feature file
- Dataset address: Path to your evaluation dataset
- Pre-trained model weights path: Path to your saved model weights
- Alternative: You can download our pre-trained model weights from Hugging Face
Run Evaluation:
```
bash src/eval/run_eval.sh
```
Graph Regularization:
- In our implementation, we found that adding graph regularization during evaluation is more efficient than during pre-training
- You can directly adjust the graph loss weight in run_eval.sh for optimal performance
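
To show what a graph loss weight controls, the sketch below implements a generic weighted smoothness penalty over graph edges and adds it to a task loss. This is a common form of graph regularization and is an assumption made for illustration; it is not necessarily the formulation used in the paper or in src/eval/main.py. The function names, the `graph_weight` parameter, and the toy data are hypothetical.

```python
def graph_smoothness_penalty(embeddings, edges):
    """Sum of w_ij * ||z_i - z_j||^2 over all weighted edges (i, j)."""
    total = 0.0
    for (i, j), weight in edges.items():
        squared_distance = sum(
            (a - b) ** 2 for a, b in zip(embeddings[i], embeddings[j])
        )
        total += weight * squared_distance
    return total


def regularized_loss(task_loss, embeddings, edges, graph_weight):
    """Task loss plus the graph penalty scaled by graph_weight."""
    return task_loss + graph_weight * graph_smoothness_penalty(embeddings, edges)


if __name__ == "__main__":
    # Toy example: two perturbations connected by one edge of weight 0.5.
    toy_embeddings = {"pert_a": [0.0, 0.0], "pert_b": [3.0, 4.0]}
    toy_edges = {("pert_a", "pert_b"): 0.5}
    print(regularized_loss(1.0, toy_embeddings, toy_edges, graph_weight=0.1))
```

A larger weight pulls the embeddings of connected perturbations closer together, while a weight of zero recovers the unregularized task loss.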

Citation

If you find our paper useful, please cite it as:

@misc{chen2025integratingbiologicalknowledgerobust,
      title={Integrating Biological Knowledge for Robust Microscopy Image Profiling on De Novo Cell Lines}, 
      author={Jiayuan Chen and Thai-Hoang Pham and Yuanlong Wang and Ping Zhang},
      year={2025},
      eprint={2507.10737},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2507.10737}, 
}
