Dataset Open Access

HeLaCytoNuc: fluorescence microscopy dataset with segmentation masks for cell nuclei and cytoplasm

De, Trina; Urbanski, Adrian; Thangamani, Subasini; Wyrzykowska, Maria; Yakimovich, Artur

Data Description:

This dataset comprises fluorescence micrographs of HeLa cells, labelled to identify nuclei and cell cytoplasm. The images were acquired as a technical calibration for the high-content screening study published in [1].

The HeLa cell line (ATCC-CCL-2), a widely used immortalised cell line in laboratory research, was cultured under standard conditions. After cultivation, the cells were fixed and stained with fluorescent dyes to visualise the nuclei and cytoplasm. The nuclei were stained with DAPI (4',6-diamidino-2-phenylindole), a blue-fluorescent DNA stain, while fluorescently labelled phalloidin was used to detect actin filaments and delineate the cytoplasm. Cell culture, fixation, staining, and imaging all followed the protocols described in [1].

The preprocessed dataset includes 2,676 8-bit RGB images, each 520 x 696 pixels in size. Only two of the RGB channels are used: the red channel represents the cytoplasm, and the blue channel represents the nuclei. The dataset is divided into training, validation, and test subsets in a 70:20:10 ratio. The entire dataset is accompanied by instance segmentation masks for nuclei and cytoplasm objects, obtained with the CellProfiler software [2]. Notably, the test subset was annotated manually by a specialist, ensuring high-quality annotations. The original raw images have a higher resolution (1040 x 1392 pixels) and a bit depth of 16 bits, providing more detailed information for advanced analyses.
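As an illustration of the channel layout described above, the following minimal sketch separates the cytoplasm and nuclei signals from one RGB image array and counts the objects in a mask. It rests on several assumptions that the description does not confirm: the image has already been loaded into a NumPy array of shape height x width x 3 (for example with a TIFF reader of your choice), 520 is the number of rows and 696 the number of columns, and the instance masks are integer label images in which 0 marks background. The function names `split_channels` and `count_instances` are hypothetical helpers, not part of the dataset.

```python
import numpy as np


def split_channels(rgb):
    """Return (cytoplasm, nuclei) planes from an H x W x 3 RGB array."""
    rgb = np.asarray(rgb)
    if rgb.ndim != 3 or rgb.shape[-1] != 3:
        raise ValueError(f"expected an H x W x 3 array, got shape {rgb.shape}")
    cytoplasm = rgb[..., 0]  # red channel: phalloidin signal (cytoplasm)
    nuclei = rgb[..., 2]     # blue channel: DAPI signal (nuclei)
    return cytoplasm, nuclei


def count_instances(mask):
    """Count objects in a label mask.

    ASSUMPTION: 0 is background and every object carries its own
    positive integer label. Check this against the actual mask files.
    """
    labels = np.unique(np.asarray(mask))
    return int(np.count_nonzero(labels))


# Synthetic stand-ins with the documented image size (not real data).
demo_image = np.zeros((520, 696, 3), dtype=np.uint8)
demo_image[..., 0] = 10   # fake cytoplasm intensity
demo_image[..., 2] = 200  # fake nuclei intensity
cyto, nuc = split_channels(demo_image)

demo_mask = np.zeros((520, 696), dtype=np.uint16)
demo_mask[10:20, 10:20] = 1        # first fake object
demo_mask[100:150, 300:340] = 2    # second fake object
n_objects = count_instances(demo_mask)
```

The green channel is simply ignored here, since the description states that only the red and blue channels carry signal.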

File Description:

The file structure of the zip files is as follows:

HeLaCytoNuc_{train/validation/test}.zip ->

- images -> {filename}.tif

- nuclei_masks  -> {filename}.tif

- cytoplasm_masks -> {filename}.tif
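Given the folder layout above, one way to match each image with its two masks is sketched below. It assumes that an archive has already been extracted, that the three folders sit directly under the extraction directory, and that a mask shares its file name with the corresponding image (as the common `{filename}` placeholder suggests). `list_samples` is a hypothetical helper name.

```python
from pathlib import Path


def list_samples(root):
    """Return (image, nuclei_mask, cytoplasm_mask) path triples.

    ASSUMPTION: masks share the image's file name and the three
    folders sit directly under `root`. Images that lack either
    mask are skipped.
    """
    root = Path(root)
    samples = []
    for image_path in sorted((root / "images").glob("*.tif")):
        nuclei_path = root / "nuclei_masks" / image_path.name
        cytoplasm_path = root / "cytoplasm_masks" / image_path.name
        if nuclei_path.is_file() and cytoplasm_path.is_file():
            samples.append((image_path, nuclei_path, cytoplasm_path))
    return samples
```

Calling `list_samples` on the directory extracted from, for example, the training archive would then yield one triple per image, ready to be passed to whichever TIFF reader is in use.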


1. Rämö, Pauli, Anna Drewek, Cécile Arrieumerlou, Niko Beerenwinkel, Houchaima Ben-Tekaya, Bettina Cardel, Alain Casanova et al. "Simultaneous analysis of large-scale RNAi screens for pathogen entry." BMC Genomics 15 (2014): 1-18.

2. Carpenter, Anne E., Thouis R. Jones, Michael R. Lamprecht, Colin Clarke, In Han Kang, Ola Friman, David A. Guertin et al. "CellProfiler: image analysis software for identifying and quantifying cell phenotypes." Genome Biology 7 (2006): 1-11.

Files (9.3 GB)
Size
8.6 GB
76.2 MB
3.7 MB
434.9 MB
139.9 MB