Software Closed Access

Software: removal of bremsstrahlung background from SAXS signals with deep neural networks

Starke, Sebastian; Smid, Michal

Software for training and inference of neural network models to remove bremsstrahlung background from SAXS imaging data obtained at the European XFEL laboratory.

We thank Peter Steinbach for providing the codebase for the equivariant UNet, which we integrated into our repository.

Below we share a brief description of our method:

  1. Introduction

    Experimental data from cameras in ultra-high intensity laser interaction experiments very often
    contain not only the desired signal, but also a large number of traces of high-energy photons created
    via the bremsstrahlung process during the interaction. For example, the Jungfrau camera detecting
    the small angle x-ray scattering (SAXS) signal in a combined XFEL + optical laser (OL) experiment at
    the European XFEL laboratory still records a lot of bremsstrahlung background, even though considerable
    experimental effort (adding a mirror to reflect the signal, and a massive lead wall to block the direct view)
    was made to reduce it (Šmíd et al., 2020). Especially in the SAXS case, the signal becomes gradually
    weaker with increasing scattering angle. Therefore, the experimentally observed signal-to-noise
    ratio determines the limit of the scattering angles for which the signal can be extracted, limiting
    the physics that can be observed.
    As the noise is produced by the high-energy photons, whose origin is very different from the signal
    photons, the signal and noise are additive. The currently used Jungfrau camera has a resolution of
    1024 × 512 pixels, pixel size of 75 μm, and the read values are calibrated to deposited keV per pixel.

  2. Methods
    The process of removing the noise from the data was split into three steps. First, the learning dataset
    was curated and cut into patches of 128 × 128 pixels. Second, a neural network was created and trained
    on those data. Splitting the data into patches is what makes the whole process possible, because no
    ‘noise-only’ data are measured in the detector areas where the signal typically is. In the third step, an
    image with actual data is split into patches, these are processed by the neural network, and the results
    are merged together to produce the final signal and noise prediction.

    Data preparation
    The experimental data used for training the neural network came from two sets:

    • X-ray only shots: These data were collected when only the XFEL beam was used, i.e. they
    contain an example of the useful signal, but no bremsstrahlung background at all.
    • Full shots: These data are from the real physics shots, which use both the XFEL and OL beams,
    and therefore contain a mixture of signal and noise.

    In order to train the neural network in a supervised manner, we need to provide two sets of data: the
    signal patches and the noise patches. The signal patches are created from the x-ray only data as follows. From
    each image, a set of randomly positioned and randomly oriented patches is extracted. The randomness
    in rotation is important, as the x-ray only training data have pronounced dominant directions, which
    are expected to change in the real full shots data. Next, the patches are checked and only those
    with an integrated intensity above a given threshold are used, to prevent close-to-empty patches
    from being used for training. In the last step, the amplitude of the patches is randomized, to keep the
    algorithm more general. Note that the dynamic range of the detector as well as of the signal is large,
    i.e. above approximately four orders of magnitude.
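
    The signal-patch steps above (random position, random orientation, intensity threshold, amplitude
    randomization) can be sketched as follows. This is only an illustrative sketch, not the repository
    code: the function name, the nearest-neighbour resampling, the zero fill outside the image, the
    threshold value and the log-uniform amplitude range are all assumptions.

```python
import numpy as np

def extract_signal_patch(image, rng, size=128, min_intensity=1.0,
                         amp_range=(0.1, 10.0)):
    """Return one random patch from an x-ray only image, or None if it is too empty.

    All parameter values are hypothetical; the text only states that a
    threshold and an amplitude randomization are used.
    """
    h, w = image.shape
    # Random patch centre and random orientation.
    cy, cx = rng.uniform(0, h), rng.uniform(0, w)
    theta = rng.uniform(0.0, 2.0 * np.pi)

    # Patch pixel grid relative to the patch centre.
    offs = np.arange(size) - (size - 1) / 2.0
    dy, dx = np.meshgrid(offs, offs, indexing="ij")

    # Rotate the grid and look up source pixels (nearest neighbour, assumed).
    iy = np.rint(cy + dy * np.cos(theta) - dx * np.sin(theta)).astype(int)
    ix = np.rint(cx + dy * np.sin(theta) + dx * np.cos(theta)).astype(int)
    inside = (iy >= 0) & (iy < h) & (ix >= 0) & (ix < w)

    patch = np.zeros((size, size), dtype=float)  # zero fill outside the image (assumed)
    patch[inside] = image[iy[inside], ix[inside]]

    # Reject close-to-empty patches based on the integrated intensity.
    if patch.sum() < min_intensity:
        return None

    # Randomize the amplitude; a log-uniform factor is an assumption.
    log_lo, log_hi = np.log(amp_range[0]), np.log(amp_range[1])
    return patch * np.exp(rng.uniform(log_lo, log_hi))

rng = np.random.default_rng(0)
xray_only = rng.exponential(1.0, size=(512, 1024))  # synthetic stand-in image
patch = extract_signal_patch(xray_only, rng)
empty = extract_signal_patch(np.zeros((512, 1024)), rng)
```

    In practice the function would be called repeatedly per image, discarding the None results.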
    The noise patches are created from the full shots data. To avoid using regions that contain signal,
    those regions are masked out. The masking is performed automatically by using a corresponding x-ray
    only image. Then, patches of the given size are randomly selected from the remaining data. Note that
    neither rotation nor changes of amplitude are applied, as both orientation and amplitude can carry
    signatures of the structure of bremsstrahlung, which could simplify the task for the neural network.
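
    A minimal sketch of the noise-patch selection is given below. It assumes that the mask is obtained
    by thresholding the corresponding x-ray only image and that a candidate patch is rejected as soon as
    it overlaps the mask; both choices, the threshold value and the function name are assumptions and are
    not taken from the repository.

```python
import numpy as np

def extract_noise_patches(full_shot, xray_only, rng, n_patches, size=128,
                          mask_threshold=0.5, max_tries=10000):
    """Randomly pick axis-aligned patches of a full shot that avoid the signal region."""
    # Pixels where the x-ray only image shows signal are masked out (assumed rule).
    signal_mask = xray_only > mask_threshold
    h, w = full_shot.shape
    patches = []
    for _ in range(max_tries):
        if len(patches) == n_patches:
            break
        y = rng.integers(0, h - size + 1)
        x = rng.integers(0, w - size + 1)
        # Skip candidates that touch the masked signal region.
        if signal_mask[y:y + size, x:x + size].any():
            continue
        # No rotation and no amplitude change is applied to noise patches.
        patches.append(full_shot[y:y + size, x:x + size].copy())
    return patches

rng = np.random.default_rng(0)
full_shot = rng.exponential(1.0, size=(512, 1024))   # synthetic stand-in data
xray_only = np.zeros((512, 1024))
xray_only[200:300, 400:600] = 5.0                     # synthetic signal region
noise_patches = extract_noise_patches(full_shot, xray_only, rng, n_patches=5)
```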

    Neural network
    In the modelling approach we followed, noise was assumed to be additive, i.e. a noisy input signal x_in
    can be decomposed into noise and clean signal components n and s, respectively, via the relationship
    x_in = n + s.
    The removal of the bremsstrahlung background n was achieved with the help of a convolutional
    neural network, which estimated both the noise n̂ to be subtracted from the input and the denoised
    image ŝ itself. More specifically, a UNet architecture (Ronneberger et al., 2015) was adopted with
    four encoder blocks using 32, 64, 128 and 256 feature maps. Each encoder block consisted of two
    separate convolutional layers and ReLU nonlinearities. No batch normalization was employed. The
    corresponding decoder network matched the number of filters. The decoder output produced latent
    feature maps l with 16 channels.
    In preliminary experiments, we found an equivariant version of the UNet, implemented using
    the ‘escnn’ library (https://github.com/QUVA-Lab/escnn) (Cesa et al., 2022), to show favorable
    performance compared to the original version. It consisted of 5.88 million trainable parameters and
    implemented operations that make the network equivariant to rotations of the input by
    multiples of 90 degrees.
    The input to the neural network consisted of image patches of shape 128 × 128. The training data
    comprised 1754 signal patches and another set of 4711 noise patches.
    During network training, we randomly sampled a new noise patch each time a clean signal patch
    was accessed, as a means of data augmentation and to avoid overfitting. The pixelwise addition of
    both patches resulted in a synthetic noisy patch which was used as model input. Both summands
    were treated as labels during model training. Image intensity normalization on the raw pixel values
    was performed as follows: lower and upper bounds for a percentile-based min-max normalization were
    computed as the 1st and 99.95th percentiles of the noisy patch. The lower bound was subtracted from the
    noisy patch and the result was divided by the difference between upper and lower bound. Subsequently,
    the result was clipped to the unit range, i.e. values below zero were set to zero and values above one
    were reduced to one. The same normalization and clipping strategy, using the bounds obtained from the
    noisy patch, was subsequently applied to the signal and the noise patch, respectively.
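
    The normalization described above can be written compactly with NumPy. The sketch below follows the
    text; the function names and the small guard against a zero range are assumptions added for
    illustration.

```python
import numpy as np

def percentile_bounds(noisy_patch, lower_pct=1.0, upper_pct=99.95):
    """Lower and upper bounds taken from the noisy patch only."""
    lo, hi = np.percentile(noisy_patch, [lower_pct, upper_pct])
    return float(lo), float(hi)

def normalize(patch, lo, hi, eps=1e-12):
    """Shift by the lower bound, scale by the range, clip to [0, 1].

    The eps guard against hi == lo is an assumption, not part of the text.
    """
    return np.clip((patch - lo) / max(hi - lo, eps), 0.0, 1.0)

rng = np.random.default_rng(0)
signal = rng.exponential(5.0, size=(128, 128))   # synthetic stand-in patches
noise = rng.exponential(1.0, size=(128, 128))
noisy = signal + noise                            # pixelwise addition

# Bounds come from the noisy patch and are reused for both labels.
lo, hi = percentile_bounds(noisy)
noisy_n = normalize(noisy, lo, hi)
signal_n = normalize(signal, lo, hi)
noise_n = normalize(noise, lo, hi)
```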
    From the latent representation of the equivariant UNet, pixelwise noise was estimated by further
    applying a convolutional layer on the latent feature map, using a kernel size of three, with stride and
    padding of one to retain the spatial dimensionality. A ReLU activation was applied, as the noise
    contribution was known to be non-negative. The estimated noise n̂ was then subtracted from the
    input. To also enforce non-negativity of the estimated signal, a ReLU nonlinearity was applied again.
    In total, the procedure worked as follows:
    l = eqUNet(x_in),
    n̂ = ReLU(conv(l)),
    ŝ = ReLU(x_in − n̂).
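
    The three equations above map directly onto a small PyTorch module. The sketch below is a hedged
    illustration: the class name is hypothetical, and the backbone is a tiny stand-in with 16 output
    channels rather than the equivariant UNet built with ‘escnn’.

```python
import torch
import torch.nn as nn

class DenoisingHead(nn.Module):
    """Wraps a backbone producing a 16-channel latent map l (hypothetical class)."""

    def __init__(self, backbone, latent_channels=16):
        super().__init__()
        self.backbone = backbone
        # Kernel size 3 with stride 1 and padding 1 keeps the spatial size.
        self.noise_conv = nn.Conv2d(latent_channels, 1, kernel_size=3,
                                    stride=1, padding=1)

    def forward(self, x_in):
        latent = self.backbone(x_in)                      # l = eqUNet(x_in)
        noise_hat = torch.relu(self.noise_conv(latent))   # n_hat = ReLU(conv(l))
        signal_hat = torch.relu(x_in - noise_hat)         # s_hat = ReLU(x_in - n_hat)
        return noise_hat, signal_hat

# Stand-in backbone (assumption): NOT the equivariant UNet from the text.
backbone = nn.Sequential(nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU())
model = DenoisingHead(backbone)

x_in = torch.rand(2, 1, 128, 128)   # batch of normalized noisy patches
with torch.no_grad():
    noise_hat, signal_hat = model(x_in)
```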
    The network was implemented using the ‘PyTorch’ library (version 1.12.1) for the Python
    programming language (version 3.10.4). It was trained for 400 epochs with a batch size of 16 on a single
    NVIDIA A100 GPU using the AdamW optimizer with a learning rate of 10⁻⁴ and no weight decay. For
    both estimated components n̂ and ŝ, the mean absolute error loss was applied. Both loss components
    were added to obtain the loss function the model was trained on.
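
    One optimization step with the summed mean absolute error loss and the stated optimizer settings
    might look as follows. This is a sketch under assumptions: the model is a tiny hypothetical stand-in,
    the data are random tensors, and the intensity normalization is omitted for brevity.

```python
import torch
import torch.nn as nn

class TinyDenoiser(nn.Module):
    """Hypothetical stand-in for the equivariant UNet plus noise head."""

    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(nn.Conv2d(1, 16, 3, padding=1), nn.ReLU())
        self.noise_conv = nn.Conv2d(16, 1, kernel_size=3, stride=1, padding=1)

    def forward(self, x_in):
        noise_hat = torch.relu(self.noise_conv(self.features(x_in)))
        signal_hat = torch.relu(x_in - noise_hat)
        return noise_hat, signal_hat

def training_step(model, optimizer, signal, noise_pool):
    # Draw a fresh random noise patch for every signal patch in the batch.
    idx = torch.randint(0, noise_pool.shape[0], (signal.shape[0],))
    noise = noise_pool[idx]
    x_in = signal + noise                      # synthetic noisy input
    noise_hat, signal_hat = model(x_in)
    # Sum of the two mean absolute error terms; both summands act as labels.
    l1 = nn.functional.l1_loss
    loss = l1(noise_hat, noise) + l1(signal_hat, signal)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

torch.manual_seed(0)
model = TinyDenoiser()
# AdamW with learning rate 1e-4 and no weight decay, as stated in the text.
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4, weight_decay=0.0)

signal = torch.rand(16, 1, 128, 128)       # one batch of 16 signal patches
noise_pool = torch.rand(64, 1, 128, 128)   # pool of noise patches to sample from
loss_value = training_step(model, optimizer, signal, noise_pool)
```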

    Application
    Once the model was trained, the bremsstrahlung background was removed from full-sized experimental
    imaging data by applying the model to image patches and then recombining the patch predictions
    into full-sized model predictions. A simple sliding-window approach, i.e. a regular splitting of
    the image data into non-overlapping patches and subsequent recombination, would
    produce unwanted effects at the borders between patches; therefore a more complex method was
    developed.
    Each image is split into a grid of patches four times, with the following initial pixel offsets: [0,0],
    [96,32], [32,96], [64,64]. Before being processed by the network, the patches are normalized in the
    same way as described for the training procedure. The predictions obtained for each
    patch are then rescaled to the original data range by undoing the normalization (i.e. by multiplying
    the output by the difference between upper and lower bound, followed by an addition of the lower
    bound).
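
    The shifted-grid prediction can be sketched as below. The text does not say how patches that extend
    past the image border are handled or in which axis order the offsets are given; zero padding and a
    (row, column) order are assumptions here, as are all function names. The network is replaced by an
    identity function so that the sketch runs on its own.

```python
import numpy as np

OFFSETS = [(0, 0), (96, 32), (32, 96), (64, 64)]   # (row, column) order assumed

def denoise_patch(patch, model_fn, eps=1e-12):
    """Normalize a patch, apply the model, then undo the normalization."""
    lo, hi = np.percentile(patch, [1.0, 99.95])
    scale = max(hi - lo, eps)
    normed = np.clip((patch - lo) / scale, 0.0, 1.0)
    pred = model_fn(normed)
    # Undo the normalization: multiply by the range, then add the lower bound.
    return pred * scale + lo

def predict_on_shifted_grid(image, model_fn, offset, size=128):
    """Tile the image with a patch grid shifted by `offset` and predict per patch."""
    h, w = image.shape
    oy, ox = offset
    # Zero-pad (assumption) so that patch borders fall at offset + k * size.
    pad_top, pad_left = (size - oy) % size, (size - ox) % size
    ph = int(np.ceil((h + pad_top) / size)) * size
    pw = int(np.ceil((w + pad_left) / size)) * size
    padded = np.zeros((ph, pw), dtype=float)
    padded[pad_top:pad_top + h, pad_left:pad_left + w] = image

    out = np.zeros_like(padded)
    for y in range(0, ph, size):
        for x in range(0, pw, size):
            out[y:y + size, x:x + size] = denoise_patch(
                padded[y:y + size, x:x + size], model_fn)
    # Crop back to the original image area.
    return out[pad_top:pad_top + h, pad_left:pad_left + w]

identity = lambda p: p                       # stand-in for the trained network
rng = np.random.default_rng(0)
image = rng.exponential(1.0, size=(512, 1024))
predictions = [predict_on_shifted_grid(image, identity, off) for off in OFFSETS]
constant = predict_on_shifted_grid(np.full((256, 384), 3.0), identity, (0, 0))
```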
    In the last step, the four predictions produced for the four offsets are combined into a final result.
    Each pixel of the final image is calculated as a weighted mean of those four predictions. The weights
    for the mean are calculated as
    w_i = 1 / ((|p_i − m| / 2) + 2),
    where w_i is the weight of the i-th prediction p_i, and m is the mean of all predictions for a given pixel.
    This approach effectively eliminates the outliers that are sometimes produced close to the edges of
    the patches.
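
    The weighted mean can be computed for all pixels at once. The sketch below uses the weight formula
    exactly as it is written in the text; the function name and the example values are illustrative
    assumptions.

```python
import numpy as np

def combine_predictions(predictions):
    """Pixelwise weighted mean of the predictions from the shifted grids."""
    p = np.stack(predictions, axis=0)          # shape: (n_predictions, H, W)
    m = p.mean(axis=0)                         # plain mean per pixel
    w = 1.0 / (np.abs(p - m) / 2.0 + 2.0)      # w_i = 1 / ((|p_i - m| / 2) + 2)
    # Predictions far from the mean receive a smaller weight.
    return (w * p).sum(axis=0) / w.sum(axis=0)

# Three predictions agree and one deviates (illustrative values only).
preds = [np.full((4, 4), 1.0), np.full((4, 4), 1.0),
         np.full((4, 4), 1.0), np.full((4, 4), 9.0)]
combined = combine_predictions(preds)
plain_mean = np.mean(np.stack(preds), axis=0)
```

    In this example the deviating prediction is down-weighted, so the combined value lies below the
    plain mean of the four predictions.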
     
  3. References

          [1] Cesa, G., Lang, L., & Weiler, M. (2022). A program to build E(N)-equivariant steerable CNNs. International Conference on Learning Representations. https://openreview.net/forum?id=WE4qe9xlnQw

          [2] Ronneberger, O., Fischer, P., & Brox, T. (2015). U-net: Convolutional networks for biomedical image segmentation. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 9351, 234–241. https://doi.org/10.1007/978-3-319-24574-4_28

          [3] Šmíd, M., Baehtz, C., Pelka, A., Laso García, A., Göde, S., Grenzer, J., Kluge, T., Konopkova, Z., Makita, M., Prencipe, I., Preston, T. R., Rödel, M., & Cowan, T. E. (2020). Mirror to measure small angle x-ray scattering signal in high energy density experiments. Review of Scientific Instruments, 91 (12), 123501. https://doi.org/10.1063/5.0021691

Closed Access

Files are not publicly accessible.
