There is a newer version of this record available.

Dataset Open Access

Clinical urine microscopy for urinary tract infections

Liou, Natasha; De, Trina; Urbanski, Adrian; Khasriya, Rajvinder; Yakimovich, Artur; Horsley, Harry


MARC21 XML Export

<?xml version='1.0' encoding='UTF-8'?>
<record xmlns="http://www.loc.gov/MARC21/slim">
  <leader>00000nmm##2200000uu#4500</leader>
  <datafield tag="856" ind1="4" ind2=" ">
    <subfield code="s">2266088766</subfield>
    <subfield code="u">https://rodare.hzdr.de/record/2562/files/ds1.zip</subfield>
    <subfield code="z">md5:39d442175f1c40ae91b3c18f050a4fa3</subfield>
  </datafield>
  <datafield tag="100" ind1=" " ind2=" ">
    <subfield code="a">Liou, Natasha</subfield>
    <subfield code="u">Bladder Infection and Immunity Group (BIIG), UCL Centre for Kidney &amp; Bladder Health, Division of Medicine, University College London, Royal Free Hospital Campus, London, United Kingdom</subfield>
    <subfield code="0">(orcid)0000-0002-5644-2604</subfield>
  </datafield>
  <datafield tag="245" ind1=" " ind2=" ">
    <subfield code="a">Clinical urine microscopy for urinary tract infections</subfield>
  </datafield>
  <datafield tag="650" ind1="1" ind2="7">
    <subfield code="a">cc-by</subfield>
    <subfield code="2">opendefinition.org</subfield>
  </datafield>
  <datafield tag="041" ind1=" " ind2=" ">
    <subfield code="a">eng</subfield>
  </datafield>
  <datafield tag="540" ind1=" " ind2=" ">
    <subfield code="u">https://creativecommons.org/licenses/by/4.0/legalcode</subfield>
    <subfield code="a">Creative Commons Attribution 4.0 International</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="a">De, Trina</subfield>
    <subfield code="u">Center for Advanced Systems Understanding (CASUS), Görlitz, Germany; Helmholtz-Zentrum Dresden-Rossendorf e. V. (HZDR), Dresden, Germany</subfield>
    <subfield code="0">(orcid)0000-0003-1111-9851</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="a">Urbanski, Adrian</subfield>
    <subfield code="u">Helmholtz-Zentrum Dresden-Rossendorf e. V. (HZDR), Dresden, Germany; Center for Advanced Systems Understanding (CASUS), Görlitz, Germany; Institute of Computer Science, University of Wrocław, Wrocław, Poland</subfield>
    <subfield code="0">(orcid)0009-0008-6619-3665</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="a">Khasriya, Rajvinder</subfield>
    <subfield code="u">Bladder Infection and Immunity Group (BIIG), UCL Centre for Kidney &amp; Bladder Health, Division of Medicine, University College London, Royal Free Hospital Campus, London, United Kingdom</subfield>
    <subfield code="0">(orcid)0000-0002-1696-1442</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="a">Yakimovich, Artur</subfield>
    <subfield code="u">Center for Advanced Systems Understanding (CASUS), Görlitz, Germany; Helmholtz-Zentrum Dresden-Rossendorf e. V. (HZDR), Dresden, Germany; Bladder Infection and Immunity Group (BIIG), UCL Centre for Kidney &amp; Bladder Health, Division of Medicine, University College London, Royal Free Hospital Campus, London, United Kingdom; Institute of Computer Science, University of Wrocław, Wrocław, Poland</subfield>
    <subfield code="0">(orcid)0000-0003-2458-4904</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="a">Horsley, Harry</subfield>
    <subfield code="u">Bladder Infection and Immunity Group (BIIG), UCL Centre for Kidney &amp; Bladder Health, Division of Medicine, University College London, Royal Free Hospital Campus, London, United Kingdom</subfield>
    <subfield code="0">(orcid)0000-0002-6967-3321</subfield>
  </datafield>
  <datafield tag="980" ind1=" " ind2=" ">
    <subfield code="a">user-health</subfield>
  </datafield>
  <datafield tag="980" ind1=" " ind2=" ">
    <subfield code="a">user-rodare</subfield>
  </datafield>
  <datafield tag="260" ind1=" " ind2=" ">
    <subfield code="c">2023-09-12</subfield>
  </datafield>
  <datafield tag="980" ind1=" " ind2=" ">
    <subfield code="a">dataset</subfield>
  </datafield>
  <controlfield tag="001">2562</controlfield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">clinical microscopy</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">urine microscopy</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">widefield</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">transmission light</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">image segmentation</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">binary segmentation</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">multiclass segmentation</subfield>
  </datafield>
  <datafield tag="520" ind1=" " ind2=" ">
    <subfield code="a">&lt;p&gt;Urinary tract infections (UTI) are a common disorder whose diagnosis can be made by microscopic examination of voided urine for cellular markers of infection. We present a dataset containing 300 images and 3,562 manually annotated urinary cells labelled into seven classes of clinically significant urinary content. It is an enriched dataset with samples acquired from the unstained and untreated urine of patients with symptomatic UTI. The aim of the dataset is to facilitate UTI diagnosis in nearly all clinical settings by using a simple imaging system which leverages advanced machine learning techniques.&amp;nbsp;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Data acquisition&amp;nbsp;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Between April and August 2022, 300 urine samples were obtained from patients with symptomatic UTI attending a specialist lower urinary tract symptoms (LUTS) outpatient clinic in central London. Urine samples were collected as natural voids and processed on-site within one hour to mitigate cellular degradation. Brightfield microscopic examination (Olympus BX41F microscope frame, U-5RE quintuple nosepiece, U-LS30 LED illuminator, U-AC Abbe condenser) was performed with a 20x objective (Olympus PLCN20x Plan C N Achromat 20x/0.4). Two experienced microscopists used a disposable haemocytometer (C Chip&amp;trade;) to enumerate red cells (RBC), white cells (WBC), and epithelial cells (EPC), and to record the presence of other cellular content, per 1 &amp;micro;l of urine.&lt;/p&gt;

&lt;p&gt;Images were acquired on the aforementioned brightfield microscope through a 0.5X C-mount adapter connected to a digital colour camera (Infinity 3S-1UR, Teledyne Lumenera). Images were taken in 16-bit colour in 1392 x 1040 .tif format using Capture and Analyse software. An enriched dataset approach was taken to maximise urinary cellular content in the acquired images. Such data curation was also necessary to overcome class imbalance. Köhler illumination and global white balance were performed daily to ensure consistency in image acquisition.&amp;nbsp;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Dataset annotation&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;300 images were acquired and manually annotated by first identifying cells of interest as a binary semantic segmentation task. Individual pixels were dichotomously labelled as either informative cells (foreground) or non-informative background. Non-informative background was further constrained by including unidentifiable cells, such as debris or grossly out-of-focus particles. Binary annotation was initially performed using ilastik, open-source software that uses a Random Forest classifier for pixel classification, then manually refined at the pixel level to ensure accurate semantic segmentation. This produced a binary mask in 1392 x 1040 .tif format for each corresponding raw colour image.&amp;nbsp;&lt;/p&gt;

&lt;p&gt;Objects of interest were then manually labelled by two expert microscopists into one of seven clinically significant multi-class categories: rods, RBC/WBC, yeast, miscellaneous, single EPC, small EPC sheet, and large EPC sheet. This produced a multi-class mask in 1392 x 1040 .tif format in which each pixel value from 0 to 7 is a class label, where 0 is background (Table 1).&amp;nbsp;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Data structure&amp;nbsp;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The dataset is organised into three root folders: img (image), bin_mask (binary mask), and mult_mask (multi-class mask). Each folder contains 300 files in .tif format, labelled with an incremental number.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Table 1&lt;/strong&gt;&lt;/p&gt;

&lt;pre&gt;&lt;code class="language-markdown"&gt;Folder         Files        Objects               Count       Pixel Values

img              300        Raw data                                 0-255
bin_mask         300        Background/Foreground                      0/1
mult_mask        300        Background/Class                             0
                            Rod                    1697                  1
                            RBC/WBC                1056                  2
                            Yeast                    41                  3
                            Miscellaneous           550                  4
                            Single EPC              182                  5
                            Small EPC sheet          26                  6
                            Large EPC sheet          10                  7
                                
                            Total                  3562         &lt;/code&gt;&lt;/pre&gt;</subfield>
  </datafield>
  <datafield tag="024" ind1=" " ind2=" ">
    <subfield code="a">10.14278/rodare.2562</subfield>
    <subfield code="2">doi</subfield>
  </datafield>
  <datafield tag="773" ind1=" " ind2=" ">
    <subfield code="a">https://www.hzdr.de/publications/Publ-37531</subfield>
    <subfield code="i">isIdenticalTo</subfield>
    <subfield code="n">url</subfield>
  </datafield>
  <datafield tag="773" ind1=" " ind2=" ">
    <subfield code="a">10.14278/rodare.2472</subfield>
    <subfield code="i">isVersionOf</subfield>
    <subfield code="n">doi</subfield>
  </datafield>
  <controlfield tag="005">20241128175150.0</controlfield>
  <datafield tag="542" ind1=" " ind2=" ">
    <subfield code="l">open</subfield>
  </datafield>
  <datafield tag="909" ind1="C" ind2="O">
    <subfield code="o">oai:rodare.hzdr.de:2562</subfield>
    <subfield code="p">openaire_data</subfield>
    <subfield code="p">user-health</subfield>
    <subfield code="p">user-rodare</subfield>
  </datafield>
</record>
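
The label scheme described in Table 1 of the record above can be sketched in Python. This is only an illustrative sketch: the pixel values, class names, and object counts are copied from Table 1, while the names `MULT_MASK_CLASSES`, `OBJECT_COUNTS`, `class_pixel_counts`, and the toy mask are hypothetical and not part of the dataset. Reading the actual .tif masks would need an image library and is deliberately left out.

```python
from collections import Counter

# Pixel value -> class name, as listed in Table 1 of the record.
MULT_MASK_CLASSES = {
    0: "Background",
    1: "Rod",
    2: "RBC/WBC",
    3: "Yeast",
    4: "Miscellaneous",
    5: "Single EPC",
    6: "Small EPC sheet",
    7: "Large EPC sheet",
}

# Annotated object counts per class, as listed in Table 1.
OBJECT_COUNTS = {
    "Rod": 1697,
    "RBC/WBC": 1056,
    "Yeast": 41,
    "Miscellaneous": 550,
    "Single EPC": 182,
    "Small EPC sheet": 26,
    "Large EPC sheet": 10,
}


def class_pixel_counts(mask):
    """Count pixels per class name in a 2D mask given as rows of ints."""
    counts = Counter(value for row in mask for value in row)
    unknown = set(counts) - set(MULT_MASK_CLASSES)
    if unknown:
        raise ValueError(f"unexpected pixel values: {sorted(unknown)}")
    return {MULT_MASK_CLASSES[v]: n for v, n in sorted(counts.items())}


# Hypothetical 3 x 4 toy mask standing in for a real 1392 x 1040 mask.
toy_mask = [
    [0, 0, 1, 1],
    [0, 2, 2, 0],
    [0, 0, 0, 5],
]

pixel_counts = class_pixel_counts(toy_mask)
total_objects = sum(OBJECT_COUNTS.values())

print(pixel_counts)
print(total_objects)
```

Summing the per-class counts gives a quick consistency check against the total of 3,562 annotated objects stated in the abstract, and the `ValueError` guards against masks that contain values outside the 0 to 7 range.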