<?xml version='1.0' encoding='UTF-8'?>
<record xmlns="http://www.loc.gov/MARC21/slim">
<leader>00000nmm##2200000uu#4500</leader>
<datafield tag="773" ind1=" " ind2=" ">
<subfield code="a">https://www.hzdr.de/publications/Publ-35016</subfield>
<subfield code="i">isIdenticalTo</subfield>
<subfield code="n">url</subfield>
</datafield>
<datafield tag="773" ind1=" " ind2=" ">
<subfield code="a">https://www.hzdr.de/publications/Publ-39797</subfield>
<subfield code="i">isReferencedBy</subfield>
<subfield code="n">url</subfield>
</datafield>
<datafield tag="773" ind1=" " ind2=" ">
<subfield code="a">10.14278/rodare.1833</subfield>
<subfield code="i">isVersionOf</subfield>
<subfield code="n">doi</subfield>
</datafield>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Fiedler, Lenz</subfield>
<subfield code="u">HZDR / CASUS</subfield>
<subfield code="0">(orcid)0000-0002-8311-0613</subfield>
</datafield>
<datafield tag="909" ind1="C" ind2="O">
<subfield code="o">oai:rodare.hzdr.de:1834</subfield>
<subfield code="p">openaire_data</subfield>
<subfield code="p">user-rodare</subfield>
</datafield>
<datafield tag="245" ind1=" " ind2=" ">
<subfield code="a">LDOS/SNAP data for MALA: Beryllium at 298K</subfield>
</datafield>
<datafield tag="260" ind1=" " ind2=" ">
<subfield code="c">2022-02-18</subfield>
</datafield>
<datafield tag="520" ind1=" " ind2=" ">
<subfield code="a"><pre><em><strong>Beryllium data set for Machine Learning applications</strong></em>
</pre>
<p>This dataset contains DFT inputs, outputs, LDOS data and fingerprint vectors for beryllium cells of varying sizes at ambient conditions. Three levels of k-grid convergence were employed:<br>
-&nbsp; Gamma point (gamma_point)<br>
-&nbsp; total energy convergence (k-grid converged to a total energy difference of 1 meV/atom, total_energy_convergence)<br>
-&nbsp; LDOS convergence (k-grid converged until the LDOS shows no unphysical oscillations, ldos_convergence)</p>
<p>The data set contains a .zip file for each system size (see below), as well as one .zip file containing sample scripts for recalculation and preprocessing of data.<br>
The cutoff energy was converged with respect to the total energy and held fixed at 40 Ry for all three k-grid levels. Note that data for all three k-grid levels were not generated for every unit cell size.</p>
<pre><strong>Authors:</strong>
<em>- </em>Fiedler, Lenz (HZDR / CASUS)
<em>- </em>Cangi, Attila (HZDR / CASUS)
<em>Affiliations</em><strong>:</strong>
HZDR - Helmholtz-Zentrum Dresden-Rossendorf
CASUS - Center for Advanced Systems Understanding
<strong>Dataset description</strong>
<em>- </em>Total size: 143 GB
<em>- </em>System: Be128, Be256, Be512, Be1024, Be2048
<em>- </em>Temperature(s): 298K
<em>- </em>Mass density(ies): 1.896 g/cc
<em>- </em>Crystal Structure: hcp (material mp-87 in the Materials Project)
<em>- </em>Number of atomic snapshots: 145
  <em>- </em>40 (Be128)
  <em>- </em>35 (Be256)
  <em>- </em>30 (Be512)
  <em>- </em>20 (Be1024)
  <em>- </em>10 (Be2048)
<em>- </em>Contents:
  <em>- </em>ideal crystal structure: yes
  <em>- </em>MD trajectory: yes
  <em>- </em>Atomic positions: yes
  <em>- </em>DFT inputs: yes
  <em>- </em>DFT outputs (energies): yes
  <em>- </em>SNAP vectors: yes (partially, see below)
    <em>- </em>dimensions: XxYxZx94 (last dimension: first three entries are x,y,z coordinates, data size is 91), where X, Y, Z are:
      <em>- </em>Be128: 72x72x120 (size per file: 447MB)
      <em>- </em>Be256: 144x72x120 (size per file: 893MB)
      <em>- </em>Be512: 144x144x120 (size per file: 1.8GB)
    <em>- </em>units: a.u./Bohr
  <em>- </em>LDOS vectors: yes (partially, see below)
    <em>- </em>dimensions: XxYxZx250, where X, Y, Z are:
      <em>- </em>Be128: 72x72x120 (size per file: 1.2GB)
      <em>- </em>Be256: 144x72x120 (size per file: 2.4GB)
      <em>- </em>Be512: 144x144x120 (size per file: 4.7GB)
    <em>- </em>units: 1/eV
    <em>- </em>note: LDOS parameters are the same for all sizes of the unit cell
  <em>- </em>trained networks: no
<strong>Data generation</strong>
Ideal crystal structures were obtained from the Materials Project (https://materialsproject.org/materials/mp-87/).
DFT-MD calculations were performed using either QuantumESPRESSO (https://www.quantum-espresso.org/, QE, for Be128, Be256 and Be512) or the Vienna Ab initio Simulation Package (https://www.vasp.at/, VASP, for Be1024, Be2048). DFT calculations were performed using QuantumESPRESSO.
For the VASP calculations, the standard VASP pseudopotentials were used. For QuantumESPRESSO, pseudopotentials from pslibrary were used (https://dalcorso.github.io/pslibrary/).
SNAP vectors were calculated using MALA (https://github.com/mala-project/mala) and its LAMMPS (https://github.com/mala-project/mala) interface. The LDOS was preprocessed using MALA as well.
<strong>Dataset structure</strong>
The folder called &quot;sample_inputs&quot; is provided to show how MALA preprocessing and LDOS calculation have been performed.
For each temperature/mass density/number of atoms, the following subfolders exist:
<em>- </em>md_inputs: Input files for the MD simulations, either as QE or VASP file(s)
<em>- </em>md_outputs: The MD trajectory plus a numpy array containing the temperatures at the individual time steps
<em>- </em>gamma_point
<em>- </em>total_energy_convergence
<em>- </em>ldos_convergence
Each gamma_point/total_energy_convergence/ldos_convergence contains the following folders:
<em>- </em>ldos: holds the LDOS vectors
<em>- </em>fingerprints: holds the SNAP fingerprint vectors
<em>- </em>snapshots: holds the atomic positions of the atomic snapshots for which DFT and LDOS calculations were performed (as .xyz files)
<em>- </em>dft_outputs: holds the outputs from the DFT calculations, i.e. energies in the form of a QE output file
<em>- </em>dft_inputs: holds the inputs for the DFT calculations, in the form of a QE input file
Please note that the numbering of the snapshots is contiguous per temperature/mass density/number of atoms, NOT within the k-grids themselves.
Also, LDOS and fingerprint files have only been calculated for snapshots in the ldos_convergence folders. Therefore, no LDOS and fingerprint files have been calculated for the 1024 and 2048 atom systems.
</pre></subfield>
</datafield>
<datafield tag="540" ind1=" " ind2=" ">
<subfield code="u">https://creativecommons.org/licenses/by/4.0/legalcode</subfield>
<subfield code="a">Creative Commons Attribution 4.0 International</subfield>
</datafield>
<datafield tag="856" ind1="4" ind2=" ">
<subfield code="s">214946397</subfield>
<subfield code="u">https://rodare.hzdr.de/record/1834/files/N1024.zip</subfield>
<subfield code="z">md5:cfb934835556ca8d77cf18821b468971</subfield>
</datafield>
<datafield tag="856" ind1="4" ind2=" ">
<subfield code="s">29186428750</subfield>
<subfield code="u">https://rodare.hzdr.de/record/1834/files/N128.zip</subfield>
<subfield code="z">md5:54e49d566ef88d7897e74aaf1f7138b6</subfield>
</datafield>
<datafield tag="856" ind1="4" ind2=" ">
<subfield code="s">112066694</subfield>
<subfield code="u">https://rodare.hzdr.de/record/1834/files/N2048.zip</subfield>
<subfield code="z">md5:276663b3392d2b4e035af8e6c9264ed7</subfield>
</datafield>
<datafield tag="856" ind1="4" ind2=" ">
<subfield code="s">43680626268</subfield>
<subfield code="u">https://rodare.hzdr.de/record/1834/files/N256.zip</subfield>
<subfield code="z">md5:e8698b13beb3dbfbbe99f3dc35a4dd3d</subfield>
</datafield>
<datafield tag="856" ind1="4" ind2=" ">
<subfield code="s">46934764110</subfield>
<subfield code="u">https://rodare.hzdr.de/record/1834/files/N512.zip</subfield>
<subfield code="z">md5:5848cbc3a3ccc52ac07a00aab525c851</subfield>
</datafield>
<datafield tag="856" ind1="4" ind2=" ">
<subfield code="s">4310</subfield>
<subfield code="u">https://rodare.hzdr.de/record/1834/files/README.md</subfield>
<subfield code="z">md5:5eb3423b53f6e223deda06eacbf13cfd</subfield>
</datafield>
<datafield tag="856" ind1="4" ind2=" ">
<subfield code="s">2067</subfield>
<subfield code="u">https://rodare.hzdr.de/record/1834/files/sample_inputs.zip</subfield>
<subfield code="z">md5:7882af49ea82d1c715f954e8eb9e42a1</subfield>
</datafield>
<datafield tag="980" ind1=" " ind2=" ">
<subfield code="a">dataset</subfield>
</datafield>
<datafield tag="542" ind1=" " ind2=" ">
<subfield code="l">open</subfield>
</datafield>
<datafield tag="650" ind1="1" ind2="7">
<subfield code="a">cc-by</subfield>
<subfield code="2">opendefinition.org</subfield>
</datafield>
<datafield tag="980" ind1=" " ind2=" ">
<subfield code="a">user-rodare</subfield>
</datafield>
<controlfield tag="001">1834</controlfield>
<controlfield tag="005">20241024145903.0</controlfield>
<datafield tag="024" ind1=" " ind2=" ">
<subfield code="a">10.14278/rodare.1834</subfield>
<subfield code="2">doi</subfield>
</datafield>
<datafield tag="700" ind1=" " ind2=" ">
<subfield code="a">Cangi, Attila</subfield>
<subfield code="u">HZDR / CASUS</subfield>
<subfield code="0">(orcid)0000-0001-9162-262X</subfield>
</datafield>
</record>
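
The array dimensions listed in the dataset description can be turned into a rough per-file size estimate, which helps when planning disk space or memory before loading a snapshot. The sketch below multiplies the grid dimensions by the length of the last axis (94 for SNAP, 250 for LDOS). It assumes 8-byte floating-point values, which the record does not state, and it ignores any file header; the names `GRIDS`, `FEATURES`, `BYTES_PER_VALUE` and `payload_bytes` are illustrative only. Under that assumption the results come out close to the per-file sizes quoted in the description.

```python
from math import prod

# Grid dimensions (X, Y, Z) per system size, as listed in the dataset description.
GRIDS = {
    "Be128": (72, 72, 120),
    "Be256": (144, 72, 120),
    "Be512": (144, 144, 120),
}

# Length of the last array axis, as listed in the dataset description.
FEATURES = {"SNAP": 94, "LDOS": 250}

# ASSUMPTION: values are stored as 8-byte floats; the record does not state the dtype.
BYTES_PER_VALUE = 8


def payload_bytes(system: str, kind: str) -> int:
    """Return the raw array payload in bytes, ignoring any file header."""
    return prod(GRIDS[system]) * FEATURES[kind] * BYTES_PER_VALUE


if __name__ == "__main__":
    for kind in FEATURES:
        for system in GRIDS:
            size = payload_bytes(system, kind)
            print(f"{kind:4s} {system:6s} {size / 2**20:9.1f} MiB {size / 1e9:7.2f} GB")
```

If the estimate for a file differs strongly from its size on disk, the dtype assumption is the first thing to revisit.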
| | All versions | This version |
|---|---|---|
| Views | 2,243 | 2,243 |
| Downloads | 5,640 | 5,640 |
| Data volume | 137.6 TB | 137.6 TB |
| Unique views | 1,192 | 1,192 |
| Unique downloads | 464 | 464 |
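
Because the archives are large, it may be worth checking each download against the size and MD5 digest published in the record's 856 fields. The following is a minimal sketch using only the Python standard library. The file names, sizes and digests are copied from the record; the `downloads` directory and the helper names `md5_of` and `verify` are hypothetical.

```python
import hashlib
from pathlib import Path

# Expected size in bytes and MD5 digest, copied from the 856 fields of the record.
EXPECTED = {
    "N1024.zip": (214946397, "cfb934835556ca8d77cf18821b468971"),
    "N128.zip": (29186428750, "54e49d566ef88d7897e74aaf1f7138b6"),
    "N2048.zip": (112066694, "276663b3392d2b4e035af8e6c9264ed7"),
    "N256.zip": (43680626268, "e8698b13beb3dbfbbe99f3dc35a4dd3d"),
    "N512.zip": (46934764110, "5848cbc3a3ccc52ac07a00aab525c851"),
    "README.md": (4310, "5eb3423b53f6e223deda06eacbf13cfd"),
    "sample_inputs.zip": (2067, "7882af49ea82d1c715f954e8eb9e42a1"),
}


def md5_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so multi-gigabyte archives never sit in memory."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path: Path, size: int, md5: str) -> bool:
    """Compare the cheap file size first, then the MD5 digest."""
    return path.stat().st_size == size and md5_of(path) == md5


if __name__ == "__main__":
    download_dir = Path("downloads")  # hypothetical local directory
    for name, (size, md5) in EXPECTED.items():
        target = download_dir / name
        if not target.exists():
            status = "missing"
        else:
            status = "ok" if verify(target, size, md5) else "MISMATCH"
        print(f"{name}: {status}")
```

Checking the size before hashing lets a truncated download fail fast without reading tens of gigabytes.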