Software Closed Access
Starke, Sebastian;
Smid, Michal
<?xml version='1.0' encoding='UTF-8'?>
<record xmlns="http://www.loc.gov/MARC21/slim">
<leader>00000nmm##2200000uu#4500</leader>
<datafield tag="542" ind1=" " ind2=" ">
<subfield code="l">closed</subfield>
</datafield>
<datafield tag="653" ind1=" " ind2=" ">
<subfield code="a">SAXS</subfield>
</datafield>
<datafield tag="653" ind1=" " ind2=" ">
<subfield code="a">XFEL</subfield>
</datafield>
<datafield tag="653" ind1=" " ind2=" ">
<subfield code="a">equivariant neural networks</subfield>
</datafield>
<datafield tag="653" ind1=" " ind2=" ">
<subfield code="a">noise removal</subfield>
</datafield>
<datafield tag="980" ind1=" " ind2=" ">
<subfield code="a">user-rodare</subfield>
</datafield>
<datafield tag="700" ind1=" " ind2=" ">
<subfield code="a">Smid, Michal</subfield>
<subfield code="u">HZDR</subfield>
<subfield code="0">(orcid)0000-0002-7162-7500</subfield>
</datafield>
<datafield tag="520" ind1=" " ind2=" ">
<subfield code="a"><p>Software for training and inference of neural network models to remove bremsstrahlung background from SAXS imaging data obtained at the European XFEL laboratory.</p>
<p>We thank Peter Steinbach for providing the codebase for the equivariant UNet, which we integrated into our repository.</p>
<p>Below we share a brief description of our method:</p>
<ol>
<li><strong>Introduction</strong>
<p>Experimental data from cameras in ultra-high intensity laser interaction experiments very often con-<br>
tains not only the desired signal, but also a large amount of traces of high-energy photons created<br>
via the bremsstrahlung process during the interaction. For example, the Jungfrau camera detecting<br>
small angle x-ray scattering (SAXS) signal in a combined XFEL + optical laser (OL) experiment at<br>
the European XFEL laboratory still contains lot of bremsstrahlung background, even though strong<br>
experimental effort (adding a mirror to reflect the signal, and a massive lead wall to block direct view)<br>
was taken to reduce those (&Scaron;mı́d et al., 2020). Especially in the SAXS case, the signal is gradually<br>
becoming weaker with increasing scattering angle. Therefore, the experimentally observed signal-to-<br>
noise ratio determines the limit of the scattering angles for which the signal can be extracted, limiting<br>
the physics that can be observed.<br>
As the noise is produced by the high-energy photons, whose origin is very different from the signal<br>
photons, the signal and noise are additive. The currently used Jungfrau camera has a resolution of<br>
1024 &times; 512 pixels, pixel size of 75 &mu;m, and the read values are calibrated to deposited keV per pixel.</p>
</li>
<li><strong>Methods</strong><br>
The process of removing the noise from the data was split into three steps. First, the training dataset was curated and cut into patches of 128 &times; 128 pixels. Second, a neural network was created and trained on those data. Splitting the data into patches is what enables the whole approach, because no &lsquo;noise-only&rsquo; data are measured in the detector areas where the signal typically lies. In the third step, an image with actual data is split into patches, which are processed by the neural network and merged back together to produce the final signal and noise predictions.<br>
<br>
<strong>Data preparation</strong><br>
The experimental data used for training the neural network came from two sets:
<ul>
<li>X-ray-only shots: collected when only the XFEL beam was used, i.e. they contain an example of the useful signal but no bremsstrahlung background at all.</li>
<li>Full shots: taken from the real physics shots with both the XFEL and OL beams, and therefore containing a mixture of signal and noise.</li>
</ul>
In order to train the neural network in a supervised manner, we need to provide two sets of data: signal patches and noise patches. The signal patches are created from the x-ray-only data as follows: from each image, a set of randomly positioned and randomly oriented patches is extracted. The randomness in rotation is important, as the training x-ray data have significant dominant directions, which are expected to change in the real full-shot data. Next, the patches are checked and only those with an integrated intensity above a given threshold are kept, to prevent close-to-empty patches from being used for training. In the last step, the amplitude of the patches is randomized to keep the algorithm more general. Note that the dynamic range of both the detector and the signal is large, spanning approximately four orders of magnitude.<br>
The noise patches are created from the full-shot data. To prevent regions containing signal from being used, those regions are masked out; the masking is performed automatically using a corresponding x-ray-only image. Patches of the given size are then randomly selected from the remaining data. Note that neither rotation nor amplitude changes are applied here, as orientation and amplitude may carry signatures of the bremsstrahlung structure, which can simplify the task for the neural network.<br>
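As an illustration, the signal-patch curation described above can be sketched as follows. The function name, the use of scipy.ndimage.rotate for the random orientation, and all default values are our assumptions for this sketch, not the released code:

```python
import numpy as np
from scipy.ndimage import rotate  # assumed here for the random patch orientation

def extract_signal_patches(image, n_patches=32, size=128,
                           min_intensity=1.0, rng=None):
    """Sketch of the signal-patch curation: random position, random
    rotation, intensity threshold, and amplitude randomization.
    Names and defaults are illustrative, not the released code."""
    rng = np.random.default_rng(rng)
    h, w = image.shape
    patches = []
    for _ in range(n_patches):
        # Rotating the whole image by a random angle before cropping
        # approximates randomly oriented patches (the x-ray-only data
        # have significant dominant directions).
        angle = rng.uniform(0.0, 360.0)
        rotated = rotate(image, angle, reshape=False, order=1)
        y = rng.integers(0, h - size + 1)
        x = rng.integers(0, w - size + 1)
        patch = rotated[y:y + size, x:x + size]
        # Keep only patches with enough integrated intensity.
        if patch.sum() >= min_intensity:
            # Randomize the amplitude; detector and signal span
            # roughly four orders of magnitude.
            patch = patch * 10.0 ** rng.uniform(-2.0, 2.0)
            patches.append(patch)
    return patches
```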
<br>
<strong>Neural network</strong><br>
In the modelling approach we followed, the noise was assumed to be additive, i.e. a noisy input signal x<sub>in</sub> can be decomposed into noise and clean signal components n and s, respectively, via the relationship x<sub>in</sub> = n + s.<br>
The removal of the bremsstrahlung background n was achieved with a convolutional neural network that estimated both the noise n̂ to be subtracted from the input and the denoised image ŝ itself. More specifically, a UNet architecture (Ronneberger et al., 2015) was adopted with four encoder blocks using 32, 64, 128 and 256 feature maps. Each encoder block consisted of two separate convolutional layers with ReLU nonlinearities; no batch normalization was employed. The corresponding decoder network matched the number of filters, and the decoder output produced latent feature maps l with 16 channels.<br>
In preliminary experiments, we found an equivariant version of the UNet, implemented using the &lsquo;escnn&rsquo; library (https://github.com/QUVA-Lab/escnn) (Cesa et al., 2022), to show favorable performance compared to the original version. It consisted of 5.88 million trainable parameters and implemented operations making the network equivariant to rotations of its input by multiples of 90 degrees.<br>
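The notion of equivariance under quarter turns can be illustrated with a toy example: any filter whose kernel is symmetric under 90-degree rotations commutes with np.rot90 (exactly so with periodic boundaries). The uniform filter below is only a stand-in for illustration; the escnn-based UNet enforces the same property for its learned filters:

```python
import numpy as np
from scipy.ndimage import uniform_filter

rng = np.random.default_rng(0)
x = rng.random((64, 64))

# A 3x3 uniform average is symmetric under 90-degree rotations, so
# filtering commutes with np.rot90 (exactly, with periodic boundaries).
def f(img):
    return uniform_filter(img, size=3, mode="wrap")

lhs = f(np.rot90(x))   # rotate the input, then filter
rhs = np.rot90(f(x))   # filter, then rotate the output
assert np.allclose(lhs, rhs)  # equivariance under quarter turns
```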
The input to the neural network consisted of image patches of shape 128 &times; 128. The training data comprised 1754 signal patches and a separate set of 4711 noise patches.<br>
During network training, we randomly sampled a new noise patch each time a clean signal patch was accessed, as a means of data augmentation and to avoid overfitting. The pixelwise addition of both patches produced a synthetic noisy patch, which was used as the model input, while both summands were treated as labels during model training. Intensity normalization of the raw pixel values was performed as follows: lower and upper bounds were computed as the 1st and 99.95th percentiles of the noisy patch. The lower bound was subtracted from the noisy patch, and the result was divided by the difference between the upper and lower bounds. Subsequently, the result was clipped to the unit range, i.e. values below zero were set to zero and values above one were set to one. The same normalization and clipping, using the bounds obtained from the noisy patch, were then applied to the signal and noise patches, respectively.<br>
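The normalization step can be written down directly from the description above; the function name is our choice for this sketch:

```python
import numpy as np

def normalize_pair(noisy, signal, noise):
    """Percentile-based normalization as described: bounds come from the
    noisy patch and are reused for the signal and noise labels."""
    lo, hi = np.percentile(noisy, [1.0, 99.95])
    scale = hi - lo

    def norm(p):
        # Shift by the lower bound, rescale, and clip to the unit range.
        return np.clip((p - lo) / scale, 0.0, 1.0)

    return norm(noisy), norm(signal), norm(noise)
```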
From the latent representation of the equivariant UNet, the pixelwise noise was estimated by applying a further convolutional layer to the latent feature map, with a kernel size of three and a stride and padding of one to retain the spatial dimensions. A ReLU activation was applied, as the noise contribution was known to be non-negative. The estimated noise n̂ was then subtracted from the input, and to enforce non-negativity of the estimated signal as well, a ReLU nonlinearity was applied once more. In total, the procedure worked as follows:<br>
l = eqUNet(x<sub>in</sub>),<br>
n̂ = ReLU(conv(l)),<br>
ŝ = ReLU(x<sub>in</sub> &minus; n̂).<br>
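The three equations above amount to a residual-style head. With NumPy standing in for the learned components (the real latent map comes from the equivariant UNet, and the head is a learned 3&times;3 convolution, not the fake channel mixing used here), the composition looks like:

```python
import numpy as np

def relu(a):
    return np.maximum(a, 0.0)

# Stand-ins for the learned parts: the equivariant UNet producing a
# 16-channel latent map, and the noise-head convolution. Both are
# illustrative placeholders, not the trained model.
rng = np.random.default_rng(0)
x_in = rng.random((128, 128))            # synthetic noisy input patch
latent = rng.random((16, 128, 128))      # l = eqUNet(x_in), faked here
head_weights = rng.random(16) / 16.0     # fake channel-mixing "conv"

n_hat = relu(np.tensordot(head_weights, latent, axes=1))  # noise estimate
s_hat = relu(x_in - n_hat)               # denoised signal estimate

# Both outputs are non-negative by construction, and the signal estimate
# never exceeds the (non-negative) input once n_hat is non-negative.
assert n_hat.min() >= 0.0 and s_hat.min() >= 0.0
assert np.all(x_in - s_hat >= 0.0)
```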
The network was implemented using the &lsquo;PyTorch&rsquo; library (version 1.12.1) for the Python programming language (version 3.10.4). It was trained for 400 epochs with a batch size of 16 on a single NVIDIA A100 GPU using the AdamW optimizer with a learning rate of 10<sup>&minus;4</sup> and no weight decay. A mean absolute error loss was applied to each of the estimated components n̂ and ŝ, and the two loss terms were summed to obtain the loss function the model was trained on.<br>
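A single training step following this recipe can be sketched in PyTorch. The toy module below is only a stand-in with the same interface as the equivariant UNet plus noise head; what follows the text is the pairing of signal and noise patches, the two L1 losses and their sum, and the AdamW settings:

```python
import torch

class TinyHead(torch.nn.Module):
    """Toy stand-in for the equivariant UNet plus noise head: it returns
    (n_hat, s_hat) for a batch of noisy patches. Illustrative only."""
    def __init__(self):
        super().__init__()
        self.conv = torch.nn.Conv2d(1, 1, kernel_size=3, padding=1)

    def forward(self, x):
        n_hat = torch.relu(self.conv(x))   # non-negative noise estimate
        s_hat = torch.relu(x - n_hat)      # non-negative signal estimate
        return n_hat, s_hat

def training_step(model, optimizer, signal, noise):
    """One optimization step as described: synthesize the noisy input,
    predict both components, and sum the two mean-absolute-error losses."""
    x_in = signal + noise                  # additive noise model
    n_hat, s_hat = model(x_in)
    loss = (torch.nn.functional.l1_loss(n_hat, noise)
            + torch.nn.functional.l1_loss(s_hat, signal))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return float(loss)

model = TinyHead()
# Optimizer settings as stated: AdamW, learning rate 1e-4, no weight decay.
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4, weight_decay=0.0)
```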
<br>
<strong>Application</strong><br>
Once the model was trained, the bremsstrahlung background of full-sized experimental images was removed by applying the model to image patches and recombining the patch predictions into full-sized predictions. A simple sliding-window approach, i.e. a regular split of the image into non-overlapping patches followed by their recombination, would produce unwanted artifacts at the borders between patches, so a more elaborate method was developed.<br>
Each image is split into a grid of patches four times, with the initial pixel offsets [0,0], [96,32], [32,96] and [64,64]. The patches are normalized in the same way as described for the training procedure before being processed by the network. The obtained predictions for each patch are then rescaled to the original data range by undoing the normalization (i.e. by multiplying the output by the difference between the upper and lower bounds and adding the lower bound).<br>
In the last step, the four predictions produced for the four offsets are combined into a final result. Each pixel of the final image is calculated as a weighted mean of the four predictions, with the weights given by<br>
w<sub>i</sub> = 1 / (|p<sub>i</sub> &minus; m| / 2 + 2),<br>
where w<sub>i</sub> is the weight of the i-th prediction p<sub>i</sub> and m is the mean of all four predictions for the given pixel. This approach effectively suppresses the outliers that are sometimes produced close to the edges of the patches.<br>
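The outlier-suppressing weighted mean can be written directly from the formula above; the function name is ours. Per pixel, predictions far from the mean of the four receive smaller weights:

```python
import numpy as np

def merge_predictions(preds):
    """Combine the four offset predictions per pixel with weights
    w_i = 1 / (|p_i - m| / 2 + 2), where m is the per-pixel mean."""
    preds = np.asarray(preds)                    # shape (4, H, W)
    m = preds.mean(axis=0)
    w = 1.0 / (np.abs(preds - m) / 2.0 + 2.0)    # outliers get small weight
    return (w * preds).sum(axis=0) / w.sum(axis=0)
```

With four identical predictions the result is unchanged, while a single outlying prediction is pulled far less than in a plain average.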
&nbsp;</li>
<li><strong>References</strong></li>
</ol>
<p>[1] Cesa, G., Lang, L., &amp; Weiler, M. (2022). A program to build E(n)-equivariant steerable CNNs. International Conference on Learning Representations. https://openreview.net/forum?id=WE4qe9xlnQw</p>
<p>[2] Ronneberger, O., Fischer, P., &amp; Brox, T. (2015). U-Net: Convolutional networks for biomedical image segmentation. Lecture Notes in Computer Science, 9351, 234&ndash;241. https://doi.org/10.1007/978-3-319-24574-4_28</p>
<p>[3] &Scaron;m&iacute;d, M., Baehtz, C., Pelka, A., Laso Garc&iacute;a, A., G&ouml;de, S., Grenzer, J., Kluge, T., Konopkova, Z., Makita, M., Prencipe, I., Preston, T. R., R&ouml;del, M., &amp; Cowan, T. E. (2020). Mirror to measure small angle x-ray scattering signal in high energy density experiments. Review of Scientific Instruments, 91(12), 123501. https://doi.org/10.1063/5.0021691</p></subfield>
</datafield>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Starke, Sebastian</subfield>
<subfield code="u">HZDR</subfield>
<subfield code="0">(orcid)0000-0001-5007-1868</subfield>
</datafield>
<datafield tag="980" ind1=" " ind2=" ">
<subfield code="a">software</subfield>
</datafield>
<datafield tag="041" ind1=" " ind2=" ">
<subfield code="a">eng</subfield>
</datafield>
<controlfield tag="001">2586</controlfield>
<controlfield tag="005">20250403094950.0</controlfield>
<datafield tag="260" ind1=" " ind2=" ">
<subfield code="c">2023-11-29</subfield>
</datafield>
<datafield tag="245" ind1=" " ind2=" ">
<subfield code="a">Software: removal of bremsstrahlung background from SAXS signals with deep neural networks</subfield>
</datafield>
<datafield tag="773" ind1=" " ind2=" ">
<subfield code="a">https://www.hzdr.de/publications/Publ-37977</subfield>
<subfield code="i">isIdenticalTo</subfield>
<subfield code="n">url</subfield>
</datafield>
<datafield tag="773" ind1=" " ind2=" ">
<subfield code="a">10.14278/rodare.2585</subfield>
<subfield code="i">isVersionOf</subfield>
<subfield code="n">doi</subfield>
</datafield>
<datafield tag="024" ind1=" " ind2=" ">
<subfield code="a">10.14278/rodare.2586</subfield>
<subfield code="2">doi</subfield>
</datafield>
<datafield tag="909" ind1="C" ind2="O">
<subfield code="o">oai:rodare.hzdr.de:2586</subfield>
<subfield code="p">software</subfield>
<subfield code="p">user-rodare</subfield>
</datafield>
</record>