Data Description:

This dataset comprises fluorescence micrographs of HeLa cells, labelled to identify nuclei and cell cytoplasm. The images were acquired as a technical calibration for the high-content screening study published in [1].

The HeLa cell line (ATCC-CCL-2), a widely used immortalised cell line in laboratory research, was cultured under standard conditions. After cultivation, the cells were fixed and stained with fluorescent dyes to visualise the nuclei and cytoplasm. The nuclei were stained with DAPI (4',6-diamidino-2-phenylindole), a blue-fluorescent DNA stain, while fluorescently labelled phalloidin was used to detect actin filaments and delineate the cytoplasm. Cell culture, fixation, staining, and imaging all followed the protocols described in [1].

The preprocessed dataset includes 2,676 8-bit RGB images, each 520 x 696 pixels in size. Only two of the RGB channels are used: the red channel represents the cytoplasm, and the blue channel represents the nuclei. The dataset is divided into training, validation, and test subsets in a 70:20:10 ratio. The entire dataset is accompanied by instance segmentation masks for nuclei and cytoplasm objects, obtained with the CellProfiler software [2]. Notably, the test subset was annotated manually by a specialist, ensuring high-quality annotations. The original raw images have a higher resolution of 1040 x 1392 pixels and a bit depth of 16 bits, providing more detailed information for advanced analyses.
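The channel layout described above can be illustrated with a minimal Python sketch. It assumes the image has already been loaded into a NumPy array of shape height x width x 3 in RGB order, and that 520 x 696 means 520 rows by 696 columns; neither the loading library nor the axis order is specified in this description. The function name split_channels is hypothetical, and the array below is synthetic stand-in data, not an image from the dataset.

```python
import numpy as np


def split_channels(rgb):
    """Return (cytoplasm, nuclei) planes from an H x W x 3 RGB image array."""
    rgb = np.asarray(rgb)
    if rgb.ndim != 3 or rgb.shape[-1] != 3:
        raise ValueError(f"expected an H x W x 3 array, got shape {rgb.shape}")
    cytoplasm = rgb[..., 0]  # red channel: cytoplasm (phalloidin/actin stain)
    nuclei = rgb[..., 2]     # blue channel: nuclei (DAPI stain)
    return cytoplasm, nuclei


# Synthetic stand-in for one preprocessed image (assumed 520 rows x 696 columns).
demo = np.zeros((520, 696, 3), dtype=np.uint8)
demo[..., 0] = 10   # fake cytoplasm signal
demo[..., 2] = 200  # fake nuclei signal

cyto, nuc = split_channels(demo)
print(cyto.shape, nuc.shape, cyto.dtype)
```

The green channel is simply ignored here, since the description states that only red and blue carry signal.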

File Description:

The file structure of the zip files is as follows:

HeLaCytoNuc_{train/validation/test}.zip ->

- images -> {filename}.tif

- nuclei_masks -> {filename}.tif

- cytoplasm_masks -> {filename}.tif
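As a rough sketch of how this layout might be traversed after extraction, the Python snippet below pairs each image with its two masks. It assumes each archive unpacks into a directory containing the three folders listed above and that an image and its masks share the same file name, which the repeated {filename} placeholder suggests but this description does not state outright. The helper name list_samples and the file example.tif are hypothetical; the demo runs on empty placeholder files in a temporary directory.

```python
import tempfile
from pathlib import Path


def list_samples(subset_dir):
    """Return (image, nuclei_mask, cytoplasm_mask) path triples for one subset."""
    subset_dir = Path(subset_dir)
    samples = []
    for image_path in sorted((subset_dir / "images").glob("*.tif")):
        # Assumption: masks carry the same file name as the image.
        nuclei_mask = subset_dir / "nuclei_masks" / image_path.name
        cytoplasm_mask = subset_dir / "cytoplasm_masks" / image_path.name
        if nuclei_mask.is_file() and cytoplasm_mask.is_file():
            samples.append((image_path, nuclei_mask, cytoplasm_mask))
    return samples


# Demo on a throwaway directory that mimics the layout with empty files.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "HeLaCytoNuc_test"
    for folder in ("images", "nuclei_masks", "cytoplasm_masks"):
        (root / folder).mkdir(parents=True)
        (root / folder / "example.tif").touch()
    demo_samples = list_samples(root)
    demo_names = [path.name for path in demo_samples[0]]

print(len(demo_samples), demo_names)
```

Images without both masks are skipped rather than raising an error, which is one possible design choice; a stricter loader could raise instead.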


1. Rämö, Pauli, Anna Drewek, Cécile Arrieumerlou, Niko Beerenwinkel, Houchaima Ben-Tekaya, Bettina Cardel, Alain Casanova et al. "Simultaneous analysis of large-scale RNAi screens for pathogen entry." BMC Genomics 15 (2014): 1-18.

2. Carpenter, Anne E., Thouis R. Jones, Michael R. Lamprecht, Colin Clarke, In Han Kang, Ola Friman, David A. Guertin et al. "CellProfiler: image analysis software for identifying and quantifying cell phenotypes." Genome Biology 7 (2006): 1-11.

