
yuandou168/PriCE


PriCE: Privacy-Preserving and Cost-Effective Scheduling for Parallelizing the Large Medical Image Processing Workflow over Hybrid Clouds

Running deep neural networks on large medical images is a resource-hungry and time-consuming task with centralized computing. Outsourcing such medical image processing tasks to hybrid clouds can significantly reduce execution time and monetary cost, but privacy concerns still make it challenging to process sensitive medical images over clouds, which hinders deployment in many real-world applications. To overcome this, we first formulate the overall optimization objectives of the privacy-preserving distributed system model: minimizing the amount of information about the private data learned by adversaries throughout the process, and reducing the maximum execution time and the cost under the user's budget constraint. We propose a novel privacy-preserving and cost-effective solution called PriCE to solve this multi-objective optimization problem. We performed extensive simulation experiments for artifact detection tasks on medical images, using an ensemble of five deep convolutional neural network inferences as the workflow task. Experimental results show that PriCE successfully splits a wide range of input gigapixel medical images with graph-coloring-based strategies, yielding the desired output utility while lowering the privacy risk, maximum completion time, and monetary cost under the maximum total cost.
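As a rough illustration of the three objectives described above, the sketch below filters candidate resource plans by a budget and keeps the non-dominated ones. It is not the PriCE algorithm: the `Plan` type, the numeric values, and the function names are hypothetical and only show what "multi-objective under a budget constraint" means in practice.

```python
from typing import List, NamedTuple


class Plan(NamedTuple):
    """A hypothetical candidate resource plan; all three objectives are minimized."""
    name: str
    privacy_risk: float  # assumed score in [0, 1], lower is better
    makespan: float      # maximum completion time, e.g. in seconds
    cost: float          # monetary cost, in the same unit as the budget


def dominates(a: Plan, b: Plan) -> bool:
    """True if a is no worse than b on every objective and differs on at least one."""
    a_obj, b_obj = a[1:], b[1:]
    return all(x <= y for x, y in zip(a_obj, b_obj)) and a_obj != b_obj


def pareto_front(plans: List[Plan], budget: float) -> List[Plan]:
    """Drop plans that exceed the budget, then keep only non-dominated plans."""
    feasible = [p for p in plans if p.cost <= budget]
    return [p for p in feasible
            if not any(dominates(q, p) for q in feasible)]


# Made-up example values, purely for illustration.
plans = [
    Plan("A", 0.10, 120.0, 9.0),
    Plan("B", 0.30, 60.0, 5.0),
    Plan("C", 0.30, 90.0, 6.0),   # no better than B on any objective
    Plan("D", 0.05, 40.0, 20.0),  # exceeds the budget below
]
front = pareto_front(plans, budget=10.0)
print([p.name for p in front])
```

Plans that survive the filter are trade-offs: none of them can be improved on one objective without getting worse on another.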

Folder structure

dataset: dataset of the experiments

inference: well-trained CNN models for artifact detection

pipeline-example: an example of using deep neural networks to detect air bubbles, blood, and blurred, damaged, and folded tissues in large medical images (e.g., WSIs).

PriCE-exps: experiments and/or simulations for validating the feasibility of the proposed PriCE method.

environment.yml: an exported environment file for creating the virtual environment instance needed to reproduce the project code.

Where to download gigapixel medical data?

It is worth noting that the results shown here are in whole or part based upon data generated by the TCGA Research Network: https://www.cancer.gov/tcga. The original Whole Slide Images (WSIs) are protected by TCGA copyright; for more details, please check the website. Therefore, the authors are not able to share the original WSI, but instead share patches or give links to the original TCGA repository with the names of the files. In our case, we used TCGA-E9-A1N3-01Z-00-DX1 to present this research.

Download the Whole Slide Image (WSI) named `TCGA-E9-A1N3-01Z-00-DX1` from the TCGA research network. The original WSI is available on the TCGA SLIDE IMAGE VIEWER. Place it in the unzipped folder `/PriCE/dataset/1WSI/data/`.
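To confirm that the slide landed in the expected folder, a small check along these lines may help. It is a minimal sketch: the helper name `find_slide` is hypothetical, the relative path assumes the script runs from the directory that contains the `PriCE` folder, and files are matched by slide-ID prefix because the exact file name and extension of the download are not specified here.

```python
from pathlib import Path
from typing import List

SLIDE_ID = "TCGA-E9-A1N3-01Z-00-DX1"


def find_slide(data_dir: str, slide_id: str = SLIDE_ID) -> List[Path]:
    """Return the files in data_dir whose names start with slide_id."""
    root = Path(data_dir)
    if not root.is_dir():
        return []
    return sorted(p for p in root.iterdir()
                  if p.is_file() and p.name.startswith(slide_id))


# Assumed location relative to the current working directory.
matches = find_slide("PriCE/dataset/1WSI/data")
if matches:
    print("Found:", ", ".join(p.name for p in matches))
else:
    print("No file starting with", SLIDE_ID, "was found in the data folder.")
```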

Environment setup

Use the terminal for the following steps:

  1. Create a virtual environment instance from the environment.yml file with the conda command:

    conda env create -f environment.yml

  2. Activate the new environment:

    conda activate mytorch

  3. Verify that the new environment was installed correctly:

    conda env list
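Once the environment is activated, a quick import check from Python can complement `conda env list`. The sketch below uses only the standard library. The package names passed in are assumptions (the environment name `mytorch` suggests PyTorch, but the actual contents of environment.yml are not listed here), so adjust them to match the file.

```python
import importlib.util
from typing import Dict, Iterable


def check_packages(names: Iterable[str]) -> Dict[str, bool]:
    """Map each top-level package name to whether it can be found for import."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


# Assumed package names; replace with the ones listed in environment.yml.
for name, available in check_packages(["torch", "numpy"]).items():
    print(f"{name}: {'ok' if available else 'missing'}")
```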

Folder explanation

  1. dataset: storing the datasets, e.g., the original WSI example and its intermediate data files, etc.
  2. inference: storing the CNN inference models.
  3. pipeline-example for artifact detection: storing the application using CNN inference models for artifact detection in WSI.
  4. PriCE-exps: storing the experimental workflows and/or Jupyter notebooks of the PriCE experiments and simulations.

Questions about this research?

How to split a gigapixel medical image?
    * `PriCE/PriCE-exps/graph_coloring_based_image_splitting.ipynb`
    * `PriCE/PriCE-exps/evenly_split_w_wo_shuffle.ipynb`

How to encrypt/decrypt sensitive information of medical images? How to quantify the privacy-preserving goals?
    * `PriCE/PriCE-exps/pertubedata_privacy_risk_evaluation.ipynb` (data perturbation and its privacy-preserving algorithm evaluation)

How to seek the 3D Pareto-optimal resource planning?
    * `PriCE/PriCE-exps/Pareto_3D_evaluation.ipynb`
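For readers who want an intuition for graph-coloring-based splitting before opening the notebooks, here is a minimal, self-contained sketch. It is not the notebook's implementation: it assumes the image is tiled into a regular grid of patches, treats 8-connected patches as adjacent, and uses a plain greedy coloring so that no two adjacent patches share a color. Each color class could then be sent to a different worker, so no single worker receives neighbouring patches. All function names are hypothetical.

```python
from typing import Dict, List, Tuple

Patch = Tuple[int, int]  # (row, col) index of a patch in the tiling grid


def neighbors(p: Patch, rows: int, cols: int) -> List[Patch]:
    """8-connected neighbours of a patch that lie inside the grid."""
    r, c = p
    return [(r + dr, c + dc)
            for dr in (-1, 0, 1) for dc in (-1, 0, 1)
            if (dr, dc) != (0, 0)
            and 0 <= r + dr < rows and 0 <= c + dc < cols]


def greedy_coloring(rows: int, cols: int) -> Dict[Patch, int]:
    """Visit patches row by row; give each the smallest color its colored neighbours do not use."""
    colors: Dict[Patch, int] = {}
    for r in range(rows):
        for c in range(cols):
            used = {colors[n] for n in neighbors((r, c), rows, cols)
                    if n in colors}
            color = 0
            while color in used:
                color += 1
            colors[(r, c)] = color
    return colors


def split_by_color(colors: Dict[Patch, int]) -> Dict[int, List[Patch]]:
    """Group patches by color; each group contains no two adjacent patches."""
    groups: Dict[int, List[Patch]] = {}
    for patch, color in colors.items():
        groups.setdefault(color, []).append(patch)
    return groups


rows, cols = 4, 6  # a small illustrative grid, not a real WSI tiling
colors = greedy_coloring(rows, cols)
groups = split_by_color(colors)
for color, patches in sorted(groups.items()):
    print(color, patches)
```

The adjacency rule is the main design choice here. Switching to 4-connectivity or adding a larger exclusion radius changes how many groups are needed and how much spatial context any one group reveals.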

Cite our work

Wang, Yuandou, Neel Kanwal, Kjersti Engan, Chunming Rong, Paola Grosso, and Zhiming Zhao. "PriCE: Privacy-Preserving and Cost-Effective Scheduling for Parallelizing the Large Medical Image Processing Workflow over Hybrid Clouds." arXiv preprint arXiv:2405.15398 (2024).

Note that this work has been accepted to Euro-Par 2024. We will update the citation entry later.

@article{wang2024price,
  title={PriCE: Privacy-Preserving and Cost-Effective Scheduling for Parallelizing the Large Medical Image Processing Workflow over Hybrid Clouds},
  author={Wang, Yuandou and Kanwal, Neel and Engan, Kjersti and Rong, Chunming and Grosso, Paola and Zhao, Zhiming},
  journal={arXiv preprint arXiv:2405.15398},
  year={2024}
}
