Software Closed Access

Software: removal of bremsstrahlung background from SAXS signals with deep neural networks

Starke, Sebastian; Smid, Michal


Dublin Core Export

<?xml version='1.0' encoding='utf-8'?>
<oai_dc:dc xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
  <dc:creator>Starke, Sebastian</dc:creator>
  <dc:creator>Smid, Michal</dc:creator>
  <dc:date>2023-11-29</dc:date>
  <dc:description>Software for training and inference of neural network models to remove bremsstrahlung background from SAXS imaging data obtained at the European XFEL laboratory.

We thank Peter Steinbach for providing the codebase for the equivariant UNet, which we integrated into our repository.

Below we share a brief description of our method:


	Introduction

	Experimental data from cameras in ultra-high intensity laser interaction experiments very often contain
	not only the desired signal, but also a large number of traces of high-energy photons created
	via the bremsstrahlung process during the interaction. For example, the Jungfrau camera detecting
	the small angle x-ray scattering (SAXS) signal in a combined XFEL + optical laser (OL) experiment at
	the European XFEL laboratory still records a lot of bremsstrahlung background, even though considerable
	experimental effort (adding a mirror to reflect the signal, and a massive lead wall to block the direct view)
	was made to reduce it (Šmíd et al., 2020). Especially in the SAXS case, the signal gradually
	becomes weaker with increasing scattering angle. Therefore, the experimentally observed signal-to-noise
	ratio sets the limit on the scattering angles for which the signal can be extracted, limiting
	the physics that can be observed.
	As the noise is produced by high-energy photons, whose origin is very different from that of the signal
	photons, the signal and noise are additive. The currently used Jungfrau camera has a resolution of
	1024 × 512 pixels and a pixel size of 75 μm, and the read-out values are calibrated to deposited keV per pixel.
	
	Methods
	The process of removing the noise from the data was split into three steps. First, the training dataset
	was curated and cut into patches of 128 × 128 pixels. Second, a neural network was created and trained
	on those data. Splitting the data into patches is what makes the whole process possible, because no
	‘noise-only’ data are measured in the detector areas where the signal typically lies. In the third step, an
	image with actual data is split into patches, which are processed by the neural network and merged
	back together to produce the final signal and noise predictions.
	
	Data preparation
	The experimental data used for training the neural network came from two sets:
	
	• X-ray only shots: These data were collected when only the XFEL beam was used, i.e. they do
	contain an example of the useful signal, but no bremsstrahlung background at all.
	• Full shots: These data come from the real physics shots with both the XFEL and OL beams,
	and therefore contain a mixture of signal and noise.
	
	In order to train the neural network in a supervised manner, two sets of data must be provided: the
	signal patches and the noise patches. The signal patches are created from the x-ray-only data as follows: from
	each image, a set of randomly positioned and randomly oriented patches is extracted. The randomness
	in rotation is important, as these training x-ray data have significant dominant directions, which
	are expected to change in the real full-shot data. Next, the patches are checked, and only those
	whose integrated intensity lies above a given threshold are used, to prevent close-to-empty patches
	from entering the training. In the last step, the amplitude of the patches is randomized to keep the
	algorithm more general. Note that the dynamic range of the detector as well as of the signal is large,
	spanning approximately four orders of magnitude.
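The signal-patch extraction described above can be sketched as follows; the rotation helper, the sampling counts, and the amplitude-randomization range are illustrative assumptions, not the repository's actual implementation:

```python
import numpy as np

def rotated_crop(image, cy, cx, angle_deg, size):
    """Nearest-neighbour sampling of a size x size patch rotated by
    angle_deg around the point (cy, cx)."""
    a = np.deg2rad(angle_deg)
    ys, xs = np.mgrid[0:size, 0:size] - size / 2.0
    src_y = np.clip(np.round(cy + ys * np.cos(a) - xs * np.sin(a)).astype(int),
                    0, image.shape[0] - 1)
    src_x = np.clip(np.round(cx + ys * np.sin(a) + xs * np.cos(a)).astype(int),
                    0, image.shape[1] - 1)
    return image[src_y, src_x]

def extract_signal_patches(image, n_patches, size=128,
                           min_intensity=0.0, rng=None):
    """Random position and random orientation, integrated-intensity
    thresholding, and amplitude randomization over a wide dynamic range."""
    rng = rng or np.random.default_rng()
    h, w = image.shape
    patches = []
    while len(patches) < n_patches:
        cy = rng.uniform(size / 2, h - size / 2)
        cx = rng.uniform(size / 2, w - size / 2)
        patch = rotated_crop(image, cy, cx, rng.uniform(0.0, 360.0), size)
        if patch.sum() <= min_intensity:   # skip close-to-empty patches
            continue
        # randomize the amplitude (range here is an assumption)
        patches.append(patch * 10.0 ** rng.uniform(-2.0, 2.0))
    return np.stack(patches)
```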
	The noise patches are created from the full-shot data. To prevent regions containing signal from being
	used, those regions are masked out. The masking is performed automatically using a corresponding x-ray-only
	image. Then, patches of a given size are randomly selected from the remaining data. Note that
	neither rotation nor amplitude changes are applied, as both the orientation and the amplitude can carry
	signatures of the bremsstrahlung structure, which could simplify the task for the neural network.
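A sketch of this masking-and-cropping step, assuming a simple intensity threshold on the x-ray-only image defines the signal mask (the threshold value and the all-or-nothing rejection criterion are assumptions):

```python
import numpy as np

def extract_noise_patches(full_shot, xray_only, n_patches, size=128,
                          signal_threshold=0.1, rng=None):
    """Mask out signal regions using the corresponding x-ray-only image,
    then randomly crop noise patches from the remaining area."""
    rng = rng or np.random.default_rng()
    signal_mask = xray_only > signal_threshold   # where the signal lives
    h, w = full_shot.shape
    patches = []
    while len(patches) < n_patches:
        y = int(rng.integers(0, h - size + 1))
        x = int(rng.integers(0, w - size + 1))
        # reject patches overlapping the masked signal regions; no rotation
        # or amplitude change, to preserve the bremsstrahlung structure
        if signal_mask[y:y + size, x:x + size].any():
            continue
        patches.append(full_shot[y:y + size, x:x + size])
    return np.stack(patches)
```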
	
	Neural network
	In the modelling approach we followed, the noise was assumed to be additive, i.e. a noisy input signal xin
	can be decomposed into noise and clean-signal components n and s, respectively, via the relationship
	xin = n + s.
	The removal of the bremsstrahlung background n was achieved with the help of a convolutional
	neural network, which estimated both the noise n̂ to be subtracted from the input and the denoised
	image ŝ itself. More specifically, a UNet architecture (Ronneberger et al., 2015) was adopted with
	four encoder blocks using 32, 64, 128 and 256 feature maps. Each encoder block consisted of two
	separate convolutional layers and ReLU nonlinearities. No batch normalization was employed. The
	corresponding decoder network matched the number of filters. The decoder output produced latent
	feature maps l with 16 channels.
	In preliminary experiments, we found an equivariant version of the UNet, implemented using
	the ‘escnn’ library (https://github.com/QUVA-Lab/escnn) (Cesa et al., 2022), to show favorable
	performance compared to the original version. It consisted of 5.88 million trainable parameters and
	implemented operations making the network equivariant to input transformations under discrete rotations
	by multiples of 90 degrees.
	The input to the neural network consisted of image patches of shape 128 × 128. The training data
	comprised 1754 signal patches and another set of 4711 noise patches.
	During network training, we randomly sampled a new noise patch each time a clean signal patch
	was accessed, as a means of data augmentation and to avoid overfitting. The pixelwise addition of
	both patches resulted in a synthetic noisy patch which was used as model input. Both summands
	were treated as labels during model training. Image intensity normalization on the raw pixel values
	was performed as follows: lower and upper bounds for the normalization were computed as the 1
	and 99.95 percentiles of the noisy patch. The lower bound was subtracted from the noisy patch, and
	the result was divided by the difference between the upper and lower bounds. Subsequently, the result was
	clipped to the unit range, i.e. values below zero were set to zero and values above one were reduced to
	one. The same normalization and clipping strategy, using the bounds obtained from the noisy patch,
	was subsequently applied to the signal and the noise patches.
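This normalization scheme can be written compactly; a minimal sketch following the description above:

```python
import numpy as np

def normalize_triplet(noisy, signal, noise):
    """Compute bounds from the 1 and 99.95 percentiles of the noisy patch,
    rescale all three patches with those bounds, and clip to [0, 1]."""
    lo, hi = np.percentile(noisy, [1.0, 99.95])
    return [np.clip((p - lo) / (hi - lo), 0.0, 1.0)
            for p in (noisy, signal, noise)]
```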
	From the latent representation of the equivariant UNet, pixelwise noise was estimated by further
	applying a convolutional layer on the latent feature map, using a kernel size of three, with stride and
	padding of one to retain the spatial dimensionality. A ReLU activation was applied, as the noise
	contribution was known to be non-negative. The estimated noise n̂ was then subtracted from the
	input. To enforce non-negativity also of the estimated signal, a ReLU nonlinearity was again applied.
	In total, the procedure worked as follows:
	l = eqUNet(xin ),
	n̂ = ReLU (conv(l)) ,
	ŝ = ReLU (xin − n̂) .
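The decomposition head defined by these equations can be sketched as a PyTorch module; only the 16-channel latent map and the 3 × 3 convolution with stride and padding of one are taken from the text, while the class and argument names are assumptions (the equivariant UNet producing the latent map is not shown):

```python
import torch
import torch.nn as nn

class DecompositionHead(nn.Module):
    """Maps the 16-channel latent map l to a non-negative noise estimate
    via a 3x3 convolution (stride 1, padding 1), then subtracts it from
    the input; ReLU enforces non-negativity of both components."""
    def __init__(self, latent_channels=16):
        super().__init__()
        self.conv = nn.Conv2d(latent_channels, 1, kernel_size=3,
                              stride=1, padding=1)

    def forward(self, x_in, latent):
        n_hat = torch.relu(self.conv(latent))   # noise is non-negative
        s_hat = torch.relu(x_in - n_hat)        # so is the denoised signal
        return n_hat, s_hat
```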
	The network was implemented using the ‘PyTorch’ library (version 1.12.1) for the Python programming
	language (version 3.10.4). It was trained for 400 epochs with a batch size of 16 on a single
	NVIDIA A100 GPU using the AdamW optimizer with a learning rate of 10⁻⁴ and no weight decay. For
	both estimated components n̂ and ŝ, the mean absolute error loss was applied; the two loss components
	were added to obtain the loss function the model was trained on.
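A minimal sketch of one training step under these choices (the model interface and function name are assumptions; the actual repository wraps this in a full epoch loop with batching):

```python
import numpy as np
import torch

def training_step(model, optimizer, signal_patch, noise_patches, rng):
    """One step: pair the signal patch with a freshly sampled noise patch
    (data augmentation), form the synthetic noisy input by pixelwise
    addition, and apply the mean absolute error to both components."""
    noise = noise_patches[int(rng.integers(len(noise_patches)))]
    noisy = signal_patch + noise                 # additive noise model
    n_hat, s_hat = model(noisy)
    loss = ((n_hat - noise).abs().mean()
            + (s_hat - signal_patch).abs().mean())
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```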
	
	Application
	Once the model was trained, the removal of the bremsstrahlung background of full-sized experimental
	imaging data was performed by applying the model on image patches, followed by a recombination
	of the patch predictions to obtain full-sized model predictions. A simple sliding-window approach,
	i.e. a regular splitting of the image data into non-overlapping patches followed by their direct
	combination, would produce unwanted artifacts at the borders between patches; therefore, a more
	elaborate method was developed.
	Each image is split into a grid of patches four times, with the following initial pixel offsets: [0,0],
	[96,32], [32,96], [64,64]. Normalization of the patches is performed in the same way as described for
	the training procedure before they are processed by the network. The obtained predictions for each
	patch are then rescaled to the original data range by undoing the normalization (i.e. by multiplying
	the output by the difference between the upper and lower bounds, followed by an addition of the lower
	bound).
	In the last step, the four predictions produced for the four offsets are combined into a final result.
	Each pixel of the final image is calculated as a weighted mean of those four predictions. The weights
	for the mean are calculated as
	wi = 1 / (|pi − m| / 2 + 2)
	where wi is the weight of the i-th prediction pi, and m is the mean of all four predictions for a given pixel.
	This approach effectively eliminates the outliers, which are sometimes produced close to the edges of
	the patches.
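The weighted recombination can be sketched directly from the formula above (the function name is an assumption):

```python
import numpy as np

def combine_offset_predictions(preds):
    """Per-pixel weighted mean of the four offset predictions, with
    weights w_i = 1 / (|p_i - m| / 2 + 2) down-weighting outliers."""
    preds = np.stack(preds)            # shape (4, H, W)
    m = preds.mean(axis=0)             # per-pixel mean of all predictions
    w = 1.0 / (np.abs(preds - m) / 2.0 + 2.0)
    return (w * preds).sum(axis=0) / w.sum(axis=0)
```

A prediction far from the per-pixel mean gets a smaller weight, so a single outlier near a patch edge is pulled back toward the other three predictions.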
	 
	References


          [1] Cesa, G., Lang, L., &amp; Weiler, M. (2022). A program to build E(n)-equivariant steerable CNNs. International Conference on Learning Representations. https://openreview.net/forum?id=WE4qe9xlnQw

          [2] Ronneberger, O., Fischer, P., &amp; Brox, T. (2015). U-Net: Convolutional networks for biomedical image segmentation. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 9351, 234–241. https://doi.org/10.1007/978-3-319-24574-4_28

          [3] Šmíd, M., Baehtz, C., Pelka, A., Laso García, A., Göde, S., Grenzer, J., Kluge, T., Konopkova, Z., Makita, M., Prencipe, I., Preston, T. R., Rödel, M., &amp; Cowan, T. E. (2020). Mirror to measure small angle x-ray scattering signal in high energy density experiments. Review of Scientific Instruments, 91(12), 123501. https://doi.org/10.1063/5.0021691</dc:description>
  <dc:identifier>https://rodare.hzdr.de/record/2586</dc:identifier>
  <dc:identifier>10.14278/rodare.2586</dc:identifier>
  <dc:identifier>oai:rodare.hzdr.de:2586</dc:identifier>
  <dc:language>eng</dc:language>
  <dc:relation>url:https://www.hzdr.de/publications/Publ-37977</dc:relation>
  <dc:relation>doi:10.14278/rodare.2585</dc:relation>
  <dc:relation>url:https://rodare.hzdr.de/communities/rodare</dc:relation>
  <dc:rights>info:eu-repo/semantics/closedAccess</dc:rights>
  <dc:subject>SAXS</dc:subject>
  <dc:subject>XFEL</dc:subject>
  <dc:subject>equivariant neural networks</dc:subject>
  <dc:subject>noise removal</dc:subject>
  <dc:title>Software: removal of bremsstrahlung background from SAXS signals with deep neural networks</dc:title>
  <dc:type>info:eu-repo/semantics/other</dc:type>
  <dc:type>software</dc:type>
</oai_dc:dc>