Proton and Neutron reduced phase space for surrogate modeling of Proton Therapy from PHITS simulations

Blangiardi, Francesco; Ratliff, Hunter; Kögler, Toni

doi:10.14278/rodare.4128

November 14, 2025 Dataset Open Access

Introduction

This dataset contains the simulation data used by the AI methods in _"Fast proton transport and neutron production in proton therapy using Fourier neural operators"_ [CITE]. It was extracted from the corresponding PHITS dataset [1] of the same work and is used by the codebase in [3], which implements all the main AI methods of the paper.

The purpose of this entry is to provide a more easily accessible version of the data in [1], ready for use in AI applications. The dataset has been greatly reduced in size and stored as discretized histograms, giving access to the phase space density of both protons and neutrons at each individual depth in the phantom.

A concise description of the simulation setup is provided in [2]; please refer to the paper for detailed discussion, description, analysis, and further results derived from this dataset.

General information

The phase space density data is stored as discretized histograms, as defined in the related paper. Following the approximation made there, only four dimensions are kept: depth, radial distance (R), energy (E) and azimuthal divergence (θ) of the particles. Depth is treated as a pseudo-time dimension, so time is not provided in the data.

To cover different beams propagating through different materials, a total of 47 phantoms have been simulated, each with a unique starting energy. Each phantom is divided along the depth dimension into slabs. Every slab is assumed to be of homogeneous material in the dimensions perpendicular to the beam axis, while the material varies from slab to slab.

The proton density is obtained by binning the Monte Carlo simulated protons into the defined discretization whenever they cross one of the surfaces of a slab. The neutron phase space density is instead provided as the angle, energy and radius distributions of the secondary neutrons produced within each slab. Both densities are integrated with respect to time. For each slab, the energy deposited by the protons is also provided, as an energy deposition probability distribution over E and R.

Each of the 47 phantoms has been irradiated with three different sets of treatment head parameters, yielding three datasets: ES8, ES9 and NES8. For the sake of reproducibility, the weights of each of the models discussed in [2] are also provided.

Parametrization

The densities are observed through the discretizations defined in the paper. The resolution along the beam depth is fixed to 0.5 mm, and the energy resolution is 1 MeV for the proton fluence and 2 MeV for the neutron fluence. Radial distance and angle are handled differently for the two particles. For protons, both are discretized in logarithmically spaced bins, with the first bin also including 0, ranging up to 95.9 mm and 58.76° respectively. For neutrons, both are discretized uniformly, ranging from 0 up to 60 mm and 180° respectively. The R, E and θ dimensions are divided into 30 × 250 × 30 bins for the proton data and into 30 × 125 × 30 bins for the neutron data, and these histograms are provided at each discretized depth. The energy deposition data uses the same radial binning as the proton density, but its energy binning is logarithmic, ranging from 1.0e-3 up to 97.7 MeV.
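
As an illustration of the binning described above, the following sketch builds candidate bin edges with NumPy. Several values are assumptions rather than facts from the dataset: the energy axes are assumed to start at 0 MeV, the 800 depth layers are taken from the simulation folder names, and the first non-zero edge of the logarithmic proton bins (`HYPOTHETICAL_FIRST_EDGE_MM`) is a placeholder. The authoritative edges are the ones stored inside each nc file.

```python
import numpy as np

# Depth axis: 0.5 mm resolution. The layer count is an assumption taken
# from the "..._05mm_800layers" folder names.
DEPTH_STEP_MM = 0.5
N_LAYERS = 800
depth_edges_mm = np.arange(N_LAYERS + 1) * DEPTH_STEP_MM

# Energy axes (assumption: both start at 0 MeV).
proton_energy_edges_mev = np.arange(250 + 1) * 1.0   # 250 bins of 1 MeV
neutron_energy_edges_mev = np.arange(125 + 1) * 2.0  # 125 bins of 2 MeV

# Neutron radius and angle: 30 uniform bins each.
neutron_radius_edges_mm = np.linspace(0.0, 60.0, 30 + 1)
neutron_angle_edges_deg = np.linspace(0.0, 180.0, 30 + 1)


def log_edges_with_zero(first_edge, last_edge, n_bins):
    """Return n_bins + 1 edges: 0, then n_bins log-spaced edges.

    The first bin [0, first_edge) absorbs zero, which a purely
    logarithmic axis cannot represent.
    """
    return np.concatenate(([0.0], np.geomspace(first_edge, last_edge, n_bins)))


# Placeholder value, NOT taken from the dataset.
HYPOTHETICAL_FIRST_EDGE_MM = 0.1
proton_radius_edges_mm = log_edges_with_zero(HYPOTHETICAL_FIRST_EDGE_MM, 95.9, 30)
```

Under these assumptions both energy axes end at 250 MeV and the depth axis spans 400 mm.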

As already mentioned, the ES8, ES9 and NES8 datasets differ in their treatment head parameters; the specifics of each dataset can be found in [1]. Because ES8 and ES9 share the same treatment head parameters except for the intensity, the proton density is not provided for ES9, to limit storage size.

Model weights for the surrogates trained on each of the provided datasets (called MES8, MES9 and MNES8) are also provided, following the surrogate structure defined in [2]. Each surrogate comprises four components: a proton model and a neutron model for density prediction, and a proton model and a neutron model for intensity prediction. The models can be used as detailed in the GitHub repository [3] related to [2].

File description

The two density discretizations are named internally "phits_logfull" (protons) and "hn_phits" (neutrons); the energy deposition discretization follows the same convention as the protons. All files in this dataset are therefore named according to their discretization, as either "phits_logfull_cube_protons_\<depth in millimeters\>_data.nc", "phits_logfull_cube_dose_\<depth in millimeters\>_data.nc" or "hn_phits_cube_neutrons_\<depth in millimeters\>_data.nc". Each nc file contains `xarray` variables holding the MC-approximated histogram, the details of the discretization, and important parameters such as the CT number of the slab, its density and the material's ID within the PHITS environment.
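
The naming convention above can be parsed with a regular expression, as in the minimal sketch below. The helper names (`parse_slab_filename`, `open_slab`, `SlabFile`) are hypothetical and not part of the codebase in [3]. The depth is kept as a raw string because its exact formatting in the file names is not specified here, and the variable names inside each file are not assumed either: they are best inspected after opening.

```python
import re
from pathlib import Path
from typing import NamedTuple

# Pattern for the three file name templates; the depth token is matched loosely.
FILENAME_PATTERN = re.compile(
    r"^(?P<discretization>phits_logfull|hn_phits)_cube_"
    r"(?P<quantity>protons|dose|neutrons)_(?P<depth>.+)_data\.nc$"
)


class SlabFile(NamedTuple):
    discretization: str  # "phits_logfull" or "hn_phits"
    quantity: str        # "protons", "dose" or "neutrons"
    depth_token: str     # depth in millimeters, exactly as written in the name


def parse_slab_filename(path):
    """Split a slab file name into discretization, quantity and depth token."""
    match = FILENAME_PATTERN.match(Path(path).name)
    if match is None:
        raise ValueError(f"unexpected file name: {path}")
    return SlabFile(match["discretization"], match["quantity"], match["depth"])


def open_slab(path):
    """Open one nc file with xarray (imported lazily, as it is only needed here)."""
    import xarray as xr

    return xr.open_dataset(path)
```

After `ds = open_slab(path)`, listing `ds.data_vars` and `ds.attrs` shows which variables and parameters a given file actually stores.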

Surrogates are provided in separate .zip files. Each surrogate contains 4 subfolders, one for each surrogate component. The PDF components come as PyTorch checkpoints encapsulating Fourier Neural Operator models defined with the `neuraloperator` package [4] [5], version 0.3.0. The intensity components are instead .pickle files containing XGBoostRegressor objects defined with the `XGBoost` package [6]. Each component also comes with a pickled dictionary containing important metadata on the model hyperparameters.

Folder Structure

The provided data consists of three .zip files, one for each of the ES8, ES9 and NES8 datasets. Each .zip file is already divided into train, validation and test splits on the basis of the starting energy. Within each split folder, every simulation has its own folder named in the format "\<Starting Energy\>MeV_05mm_800layers", which contains the related proton and neutron fluences in files following the naming convention specified above.
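
A minimal sketch of how the extracted folder structure could be indexed is given below. The function name `index_simulations` is hypothetical, the split folder names are read from disk rather than assumed, and the starting energy is assumed to be written as a plain number before "MeV".

```python
import re
from pathlib import Path

# Assumption: the starting energy is an integer or decimal number.
SIM_DIR_PATTERN = re.compile(r"^(?P<energy>\d+(?:\.\d+)?)MeV_05mm_800layers$")


def index_simulations(root):
    """Map each split folder name to a sorted list of (energy in MeV, folder)."""
    index = {}
    for split_dir in sorted(p for p in Path(root).iterdir() if p.is_dir()):
        simulations = []
        for sim_dir in split_dir.iterdir():
            match = SIM_DIR_PATTERN.match(sim_dir.name)
            # Skip anything that is not a simulation folder.
            if match and sim_dir.is_dir():
                simulations.append((float(match["energy"]), sim_dir))
        index[split_dir.name] = sorted(simulations)
    return index
```

Calling `index_simulations` on an extracted dataset folder returns one entry per split, each ordered by starting energy.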

Note that, although the total size of the dataset is around 7 GB, uncompressing the files requires a total of 180.2 GB.
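
Given the gap between compressed and uncompressed size, it may be worth checking the available disk space before extraction. The sketch below uses only the Python standard library; the function names are hypothetical.

```python
import shutil
import zipfile


def uncompressed_size_bytes(zip_path):
    """Sum the uncompressed sizes recorded in the archive's central directory."""
    with zipfile.ZipFile(zip_path) as archive:
        return sum(info.file_size for info in archive.infolist())


def has_room_for(zip_path, target_dir):
    """Return True if target_dir has enough free space for the extracted files."""
    free_bytes = shutil.disk_usage(target_dir).free
    return uncompressed_size_bytes(zip_path) <= free_bytes
```

Both functions read only the archive metadata, so they run quickly even on the larger .zip files.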

References

[1] H. N. Ratliff, F. Blangiardi, PHITS simulations of neutron and gamma-ray production from and transport of 70–250 MeV protons in heterogeneous 1D tissue phantoms, Rodare (in preparation for submission) (2025).

[2] "Fast proton transport and neutron production in proton therapy using Fourier neural operators" (to be filled)

[3] Blangiardi, F. (2025). AI_phase_space_PT [Computer software]. GitHub. [https://github.com/f-blan/AI_phase_space_PT](https://github.com/f-blan/AI_phase_space_PT)

[4] J. Kossaifi, N. Kovachki, Z. Li, D. Pitt, M. Liu-Schiaffini, R. J. George, B. Bonev, K. Azizzadenesheli, J. Berner, A. Anandkumar, A library for learning neural operators (2024). arXiv:2412.10354.

[5] N. B. Kovachki, Z. Li, B. Liu, K. Azizzadenesheli, K. Bhattacharya, A. M. Stuart, A. Anandkumar, Neural operator: Learning maps between function spaces, CoRR abs/2108.08481 (2021).

[6] T. Chen, C. Guestrin, Xgboost: A scalable tree boosting system, in: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’16, ACM, 2016, p. 785–794. doi:10.1145/2939672.2939785. URL http://dx.doi.org/10.1145/2939672.2939785

Acknowledgements

The NOVO project has received funding from the European Innovation Council (EIC) under grant agreement No. 101130979. The EIC receives support from the European Union's Horizon Europe research and innovation programme. Partners from The University of Manchester have received funding from UK Research and Innovation under grant agreement No. 10102118.

Files (9.0 GB)

Name	MD5	Size
ES8.zip	346d0698efd23fa1fb549c435529b80c	4.6 GB
ES9.zip	407589760a67b72dd42a8397a68428e4	431.6 MB
MES8.zip	09c023d96f75da63c2cc7b806d6d7982	52.4 MB
MES9.zip	6d3a2392e94e8dfad1425fb0f2e36fa6	53.1 MB
MNES8.zip	df16efcaabe875815ba0893a78d9d404	52.4 MB
NES8.zip	d18243d164c1c4e5bc9624593d1a9a54	3.8 GB
README.md	0cc28a25ac5438454ce7f19a841f6469	8.6 kB
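
The MD5 checksums listed above can be used to verify a download. A minimal sketch with the Python standard library follows; the function names are hypothetical, and the checksums are copied from the file list.

```python
import hashlib

# Checksums as listed for this record.
EXPECTED_MD5 = {
    "ES8.zip": "346d0698efd23fa1fb549c435529b80c",
    "ES9.zip": "407589760a67b72dd42a8397a68428e4",
    "MES8.zip": "09c023d96f75da63c2cc7b806d6d7982",
    "MES9.zip": "6d3a2392e94e8dfad1425fb0f2e36fa6",
    "MNES8.zip": "df16efcaabe875815ba0893a78d9d404",
    "NES8.zip": "d18243d164c1c4e5bc9624593d1a9a54",
    "README.md": "0cc28a25ac5438454ce7f19a841f6469",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_intact(path, name):
    """Compare a downloaded file against the checksum listed under `name`."""
    return md5_of_file(path) == EXPECTED_MD5[name]
```

For example, `is_intact("ES8.zip", "ES8.zip")` is expected to return `True` for an uncorrupted download placed in the working directory.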

Keyword(s):

Proton Therapy; Surrogate Modelling; Proton Transport; Neutron Production; Deep Learning; Neural Operators; Monte Carlo

Related identifiers:

Identical to:
https://www.hzdr.de/publications/Publ-42226

License (for files):

Creative Commons Attribution 4.0 International

Versions

Version 1.0.0-beta 10.14278/rodare.4128

Nov 14, 2025

To cite all versions, use the DOI 10.14278/rodare.4127. This DOI represents all versions and will always resolve to the latest one.
