<?xml version='1.0' encoding='UTF-8'?>
<record xmlns="http://www.loc.gov/MARC21/slim">
<leader>00000nmm##2200000uu#4500</leader>
<controlfield tag="005">20250924111949.0</controlfield>
<datafield tag="653" ind1=" " ind2=" ">
<subfield code="a">Physics-informed machine learning</subfield>
</datafield>
<datafield tag="653" ind1=" " ind2=" ">
<subfield code="a">TDDFT</subfield>
</datafield>
<datafield tag="653" ind1=" " ind2=" ">
<subfield code="a">RT-TDDFT</subfield>
</datafield>
<datafield tag="653" ind1=" " ind2=" ">
<subfield code="a">Fourier Neural Operators</subfield>
</datafield>
<datafield tag="856" ind1="4" ind2=" ">
<subfield code="s">872366268</subfield>
<subfield code="u">https://rodare.hzdr.de/record/3995/files/rttddft_fno.tar.gz</subfield>
<subfield code="z">md5:53c0add6764430afa5253ae514d63ab7</subfield>
</datafield>
<datafield tag="245" ind1=" " ind2=" ">
<subfield code="a">Dataset for Machine Learning Time Propagators for Time-Dependent Density Functional Theory Simulations</subfield>
</datafield>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Shah, Karan</subfield>
<subfield code="u">CASUS, HZDR</subfield>
<subfield code="0">(orcid)0000-0002-5480-2880</subfield>
</datafield>
<datafield tag="540" ind1=" " ind2=" ">
<subfield code="u">https://creativecommons.org/licenses/by/4.0/legalcode</subfield>
<subfield code="a">Creative Commons Attribution 4.0 International</subfield>
</datafield>
<datafield tag="980" ind1=" " ind2=" ">
<subfield code="a">user-rodare</subfield>
</datafield>
<datafield tag="260" ind1=" " ind2=" ">
<subfield code="c">2025-09-24</subfield>
</datafield>
<datafield tag="773" ind1=" " ind2=" ">
<subfield code="a">10.48550/arXiv.2508.16554</subfield>
<subfield code="i">isSupplementTo</subfield>
<subfield code="n">doi</subfield>
</datafield>
<datafield tag="773" ind1=" " ind2=" ">
<subfield code="a">10.48550/arXiv.2508.16554</subfield>
<subfield code="i">isReferencedBy</subfield>
<subfield code="n">doi</subfield>
</datafield>
<datafield tag="773" ind1=" " ind2=" ">
<subfield code="a">https://www.hzdr.de/publications/Publ-41882</subfield>
<subfield code="i">isIdenticalTo</subfield>
<subfield code="n">url</subfield>
</datafield>
<datafield tag="773" ind1=" " ind2=" ">
<subfield code="a">https://www.hzdr.de/publications/Publ-41750</subfield>
<subfield code="i">isReferencedBy</subfield>
<subfield code="n">url</subfield>
</datafield>
<datafield tag="773" ind1=" " ind2=" ">
<subfield code="a">10.14278/rodare.3993</subfield>
<subfield code="i">isVersionOf</subfield>
<subfield code="n">doi</subfield>
</datafield>
<datafield tag="520" ind1=" " ind2=" ">
<subfield code="a"><p># Dataset for &quot;Machine Learning Time Propagators for Time-Dependent Density Functional Theory Simulations&quot;</p>
<p>This repository contains the dataset supporting the paper &quot;Machine Learning Time Propagators for Time-Dependent Density Functional Theory Simulations&quot; by Karan Shah and Attila Cangi.</p>
<p>## Overview</p>
<p>This dataset comprises time-dependent density functional theory (TDDFT) simulations of one-dimensional diatomic molecules under laser excitation. The data is used to train and evaluate autoregressive Fourier Neural Operator (FNO) models that serve as machine learning time propagators for electron density evolution.</p>
<p>## Physical System</p>
<p>The simulations model diatomic molecules with:</p>
<p>- Soft-Coulomb ionic potential:</p>
<p>$$</p>
<p>v_{\text{ion}}(x) = -\frac{Z_{1}}{\sqrt{(x - d/2)^{2} + a^{2}}} \;-\; \frac{Z_{2}}{\sqrt{(x + d/2)^{2} + a^{2}}}</p>
<p>$$</p>
<p>- Sinusoidal laser excitation:</p>
<p>$v_{\text{las}}(t) = A \sin(\omega t)$ in dipole approximation</p>
<p>- Two-electron systems under adiabatic local density approximation (ALDA)</p>
<p>- Fixed boundary conditions at domain edges</p>
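<p>As a minimal sketch, the two potentials above can be evaluated with NumPy on the spatial grid listed under Dataset Specifications. The function names `soft_coulomb` and `laser` and the sample parameter values are illustrative assumptions, not code shipped with the dataset. All quantities are in atomic units.</p>

```python
import numpy as np


def soft_coulomb(x, z1, z2, d, a=1.0):
    """Soft-Coulomb potential with one term centred at +d/2 and one at -d/2."""
    term1 = -z1 / np.sqrt((x - d / 2.0) ** 2 + a ** 2)
    term2 = -z2 / np.sqrt((x + d / 2.0) ** 2 + a ** 2)
    return term1 + term2


def laser(t, amplitude, omega):
    """Sinusoidal laser term A * sin(omega * t)."""
    return amplitude * np.sin(omega * t)


# Grid from the dataset description: [-9, 9] a.u. with 361 points (dx = 0.05).
x = np.linspace(-9.0, 9.0, 361)

# Hypothetical sample parameters chosen inside the documented ranges.
v_ion = soft_coulomb(x, z1=1.0, z2=1.0, d=2.0)
v_las = laser(np.linspace(0.0, 10.0, 5), amplitude=0.05, omega=0.1)
```

<p>For equal charges the ionic potential is symmetric about the origin, which gives a quick sanity check when comparing against `x` from the archive.</p>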
<p>## Dataset Specifications</p>
<p>- **Spatial domain**: $[-9.0, 9.0]$ atomic units with spacing $\Delta x = 0.05$ a.u. (361 grid points)</p>
<p>- **Temporal domain**: $[0, 5.0]$ femtoseconds with ML time step $\Delta t = 0.1$ fs (51 time steps)</p>
<p>- **Reference resolution**: $\Delta t = 0.01$ fs for high-accuracy ground truth</p>
<p>- **Total systems**: 2048 independent simulations</p>
<p>- **System parameters**:</p>
<p>- Nuclear charges $(Z_{1}, Z_{2})$: $1.0$&ndash;$4.0$ a.u.</p>
<p>- Internuclear distances $(d)$: $1.0$&ndash;$4.0$ a.u.</p>
<p>- Laser wavelengths: $400$&ndash;$750$ nm (optical range)</p>
<p>- Laser intensities: $10^{12}$&ndash;$10^{14}$ W/cm$^{2}$</p>
<p>- Softening parameter $(a)$: $1.0$ a.u.</p>
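<p>The specifications mix femtoseconds, nanometres, W/cm$^{2}$, and atomic units. The following sketch shows the standard conversions and rebuilds the grids from the numbers above. The helper names are hypothetical, the constants are rounded textbook values, and two points are assumptions rather than statements from this record: that the 0.1 fs grid is a strided subsample of the 0.01 fs reference, and that laser intensity maps to peak field strength in the usual way.</p>

```python
import numpy as np

# Standard conversion constants (rounded).
FS_PER_AU_TIME = 0.024188843      # 1 a.u. of time in femtoseconds
EV_PER_HARTREE = 27.211386        # 1 Hartree in electronvolts
EV_NM = 1239.841984               # h*c in eV*nm
AU_INTENSITY_W_CM2 = 3.50945e16   # intensity whose peak field equals 1 a.u.


def fs_to_au(t_fs):
    """Convert a time in femtoseconds to atomic units."""
    return t_fs / FS_PER_AU_TIME


def wavelength_nm_to_omega_au(wavelength_nm):
    """Photon energy in Hartree, numerically equal to angular frequency in a.u."""
    return EV_NM / wavelength_nm / EV_PER_HARTREE


def intensity_to_field_au(intensity_w_cm2):
    """Peak electric field in a.u. for a given intensity in W/cm^2."""
    return np.sqrt(intensity_w_cm2 / AU_INTENSITY_W_CM2)


x = np.linspace(-9.0, 9.0, 361)      # dx = 0.05 a.u.
t_ml = np.linspace(0.0, 5.0, 51)     # dt = 0.1 fs
t_ref = np.linspace(0.0, 5.0, 501)   # dt = 0.01 fs
t_down = t_ref[::10]                 # assumed: keep every 10th reference step

print(fs_to_au(5.0))                      # about 206.7 a.u.
print(wavelength_nm_to_omega_au(400.0))   # about 0.114 a.u.
print(intensity_to_field_au(1e14))        # about 0.053 a.u.
```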
<p>## Data Format</p>
<p>Each `combined_data.npz` file contains:</p>
<p>- `densities`: Electron densities [systems, spatial_points, time_steps]</p>
<p>- `lasers_sliced`: Laser field values during simulation period</p>
<p>- `lasers_val`: Full laser field temporal profile</p>
<p>- `lasers_t`: Time grid for laser fields</p>
<p>- `x`: Spatial coordinate grid</p>
<p>- `t`: Temporal coordinate grid</p>
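<p>A quick way to confirm the keys and array layouts listed above is to inspect the archive with `numpy.load`. The sketch below is only an illustration: `summarize_npz` is a hypothetical helper, and the path assumes the tarball was unpacked into the current directory.</p>

```python
import os
import numpy as np


def summarize_npz(path):
    """Return {key: (shape, dtype)} for every array stored in an NPZ archive."""
    summary = {}
    with np.load(path) as archive:
        for key in archive.files:
            arr = archive[key]
            summary[key] = (arr.shape, str(arr.dtype))
    return summary


# Assumed location after unpacking; adjust to the local directory layout.
path = os.path.join("baseline", "combined_data.npz")
if os.path.exists(path):
    for key, (shape, dtype) in summarize_npz(path).items():
        print(key, shape, dtype)
```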
<p>## Baseline Directory</p>
<p>```</p>
<p>baseline/</p>
<p>├── combined_data.npz</p>
<p>├── combined_static_energy.npz</p>
<p>├── data_exclude.yaml</p>
<p>├── data_indices.npy</p>
<p>├── data_static_exclude.yaml</p>
<p>├── data_static_exclude_40_percentile.yaml</p>
<p>├── data_static_exclude_60_percentile.yaml</p>
<p>├── inp_gs.yaml</p>
<p>├── inp_td.yaml</p>
<p>├── param_set.yaml</p>
<p>├── params.csv</p>
<p>└── summary_statistics.md</p>
<p>```</p>
<p>- `combined_data.npz` &mdash; consolidated float32 arrays for 2,048 TDDFT trajectories, including spatial grid (`x`, 361 points), temporal grid (`t`, 51 steps across 0&ndash;5 fs &asymp; 0&ndash;206.7 a.u.), density snapshots, and laser waveforms.</p>
<p>- `combined_static_energy.npz` &mdash; derived observables such as particle-number integrals, dipole moments, and Thomas&ndash;Fermi energies corresponding to each trajectory and time slice.</p>
<p>- `data_exclude.yaml` &mdash; explicit indices removed from the dataset because of boundary reflections or low temporal variation.</p>
<p>- `data_indices.npy` &mdash; NumPy array of retained example indices matching the rows in `params.csv`; use it to align parameter metadata with data tensors without reparsing the NPZ archive.</p>
<p>- `data_static_exclude*.yaml` &mdash; helper masks listing systems to omit when screening for near-static densities; percentile-specific files (`40`, `60`) filter systems with low temporal variation.</p>
<p>- `inp_gs.yaml`, `inp_td.yaml` &mdash; Octopus ground-state and real-time input templates used for the simulations (Crank&ndash;Nicolson propagator, 0.01 fs internal time step, custom diatomic potential). Parameters are overridden according to `params.csv`.</p>
<p>- `param_set.yaml` &mdash; YAML description of the parameter sweep ranges (nuclear charges, internuclear distance, driving-field amplitudes, carrier frequencies).</p>
<p>- `params.csv` &mdash; resolved parameter combinations with units, suitable for quick inspection or tabular joins.</p>
<p>- `summary_statistics.md` &mdash; autogenerated report validating float32 conversion, spatial/temporal ranges, and observable metrics for every stored quantity.</p>
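<p>The index and exclusion files above are meant to be combined when selecting training examples. The sketch below illustrates one way to do that with toy stand-ins for the real files. It rests on assumptions this record does not confirm: that entry i of `data_indices.npy` names the `params.csv` row for data row i, and that the exclude lists refer to positions in the data tensors. The YAML layout is not documented here, so parsing is left out and the excluded indices are passed as a plain list. `keep_mask` and `select_params` are hypothetical helpers.</p>

```python
import numpy as np


def keep_mask(n_systems, excluded):
    """Boolean mask that is False for every excluded system index."""
    return ~np.isin(np.arange(n_systems), np.asarray(excluded, dtype=int))


def select_params(param_rows, data_indices):
    """Pick the parameter rows that correspond to the retained data rows."""
    return [param_rows[i] for i in data_indices]


# Toy stand-ins for the real files (hypothetical values).
param_rows = [{"Z1": 1.0}, {"Z1": 2.0}, {"Z1": 3.0}, {"Z1": 4.0}]
data_indices = np.array([0, 2, 3])   # plays the role of data_indices.npy
excluded = [1]                       # plays the role of a YAML exclude list

retained = select_params(param_rows, data_indices)
mask = keep_mask(len(retained), excluded)
filtered = [row for row, keep in zip(retained, mask) if keep]
```

<p>The same boolean mask can index the first axis of `densities` so that parameters and trajectories stay aligned after filtering.</p>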
<p>## Dataset Variants</p>
<p>- `baseline/` &mdash; fine-grid reference generated at 0.01 fs Crank&ndash;Nicolson resolution and downsampled to 0.1 fs outputs; used to train the model and to obtain baseline results.</p>
<p>- `octopus_coarse/` &mdash; numerical TDDFT rollouts computed directly on the coarser 0.1 fs grid for solver-versus-model comparisons.</p>
<p>- `spatial_superresolution/` &mdash; trajectories evaluated on a doubled spatial resolution (&Delta;x = 0.025 a.u., 721 grid points) to test Fourier Neural Operator generalization without retraining.</p>
<p>- `time_extension/` &mdash; long-horizon rollouts propagated to 10 fs for assessing error accumulation beyond the 5 fs time span covered by the training data.</p></subfield>
</datafield>
<controlfield tag="001">3995</controlfield>
<datafield tag="542" ind1=" " ind2=" ">
<subfield code="l">open</subfield>
</datafield>
<datafield tag="700" ind1=" " ind2=" ">
<subfield code="a">Cangi, Attila</subfield>
<subfield code="u">CASUS, HZDR</subfield>
<subfield code="0">(orcid)0000-0001-9162-262X</subfield>
</datafield>
<datafield tag="024" ind1=" " ind2=" ">
<subfield code="a">10.14278/rodare.3995</subfield>
<subfield code="2">doi</subfield>
</datafield>
<datafield tag="041" ind1=" " ind2=" ">
<subfield code="a">eng</subfield>
</datafield>
<datafield tag="980" ind1=" " ind2=" ">
<subfield code="a">dataset</subfield>
</datafield>
<datafield tag="909" ind1="C" ind2="O">
<subfield code="o">oai:rodare.hzdr.de:3995</subfield>
<subfield code="p">openaire_data</subfield>
<subfield code="p">user-rodare</subfield>
</datafield>
<datafield tag="650" ind1="1" ind2="7">
<subfield code="a">cc-by</subfield>
<subfield code="2">opendefinition.org</subfield>
</datafield>
</record>
| | All versions | This version |
|---|---|---|
| Views | 370 | 257 |
| Downloads | 13 | 8 |
| Data volume | 11.3 GB | 7.0 GB |
| Unique views | 288 | 227 |
| Unique downloads | 11 | 8 |