Molecular response properties in implicit and explicit solvent environments
---------------------------------------------------------------------------

These datasets contain structures, energies, forces and various response properties (dipole moment, polarizability tensor, nuclear shielding tensors) for ethanol, methanol and allyl-p-tolyl ether that were used to train the FieldSchNet model. Two ethanol datasets are computed in implicit solvent environments using polarizable continuum models. Datasets of molecular properties computed in the presence of an electric field generated by explicit point charges (for QM/MM) are available for ethanol and allyl-p-tolyl ether. Atomic units are used for all quantities.

All calculations have been performed at the PBE0/def2-TZVP level of theory [1,2] using the ORCA quantum chemistry package (version 4.0.1.2) [3]. For details on how individual datasets were generated and on specific computational settings, please refer to the publication.

Format
------

The datasets are stored as tar archives compressed via gzip and can be unpacked on a Unix machine using the "tar -xzf <name>.tgz" command, where <name> is one of the dataset names listed below. The data itself is stored in the form of an atomic simulation environment (ASE) sqlite database [4]. The database can be accessed using the standard tools provided by ASE. In addition, a utility class (schnetpack.data.AtomsData) for loading the datasets can be found in the SchNetPack code package for atomistic machine learning [5].

The following example Python code can be used to load the data associated with a single entry with the index "idx" from a database file:

"""
from schnetpack.data import AtomsData

database = AtomsData("path/to/db")  # ASE database file of the unpacked dataset
idx = 0  # index of the entry to load
atoms, properties = database.get_properties(idx)
"""

"atoms" is an ASE Atoms object containing the positions and atom types, while "properties" is a dictionary containing all stored properties.
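
Because all quantities are stored in atomic units, values usually need to be converted before they can be compared with data in other unit systems. The following is a minimal sketch of such a conversion, not part of the datasets or of SchNetPack. The constant names, the "scale" helper and the example values are assumptions made for illustration; the conversion factors are rounded CODATA 2018 values.

```python
# Conversion factors from Hartree atomic units (rounded CODATA 2018 values).
HARTREE_TO_EV = 27.211386246        # energy: Hartree -> eV
BOHR_TO_ANGSTROM = 0.529177210903   # length: Bohr -> Angstrom
AU_TO_DEBYE = 2.541746473           # dipole moment: e*Bohr -> Debye

# Derived factors for forces (Hartree/Bohr) and polarizabilities (Bohr^3).
FORCE_AU_TO_EV_PER_ANGSTROM = HARTREE_TO_EV / BOHR_TO_ANGSTROM
POLARIZABILITY_AU_TO_ANGSTROM3 = BOHR_TO_ANGSTROM ** 3


def scale(values, factor):
    """Multiply a number or an arbitrarily nested list of numbers by factor."""
    if isinstance(values, (list, tuple)):
        return [scale(v, factor) for v in values]
    return values * factor


# Hypothetical example values in atomic units (not taken from the datasets).
energy_au = -155.0
dipole_au = [0.1, -0.2, 0.6]

energy_ev = scale(energy_au, HARTREE_TO_EV)
dipole_debye = scale(dipole_au, AU_TO_DEBYE)
```

The helper accepts plain numbers and nested lists. Array types that support multiplication by a scalar pass through the final branch unchanged, so the same factors can be applied to arrays directly.
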
Properties include energy (“energy”), forces (“forces”), dipole moments (“dipole moment”), polarizability tensors (“polarizability”) and, in most cases, nuclear shielding tensors (“shielding”). All quantities use Hartree atomic units.

The following datasets are available:

- ethanol_vacuum.tgz: ethanol in vacuum, all properties available.
- ethanol_continuum_evw.tgz: ethanol in vacuum (ε=1.0), in ethanol (ε=24.3) and in water (ε=80.4), computed with a polarizable continuum model. The dielectric constant is stored in the entry “dielectric_constant”. All properties are available.
- ethanol_continuum_mt.tgz: ethanol in methanol (ε=32.63) and in toluene (ε=10.3), computed with a polarizable continuum model. The dielectric constant is stored in the entry “dielectric_constant”. All properties are available.
- ethanol_qmmm.tgz: ethanol structures in fields of external point charges used for QM/MM models. The electric field computed for the environment is stored in the entry “electric_field”. All properties are available.
- ethanol_methanol_qmmm.tgz: ethanol and methanol structures identified with adaptive sampling. All structures are embedded in fields of point charges. The ethanol structures are a subset selected from the ethanol_qmmm.db dataset. Only energy, forces, dipole moment and polarizabilities are available. Free atom energies of the individual elements are provided in the metadata.
- ate_vacuum.tgz: allyl-p-tolyl ether in vacuum, all properties available.
- ate_qmmm.tgz: allyl-p-tolyl ether structures in fields of external point charges used for QM/MM models. The electric field computed for the environment is stored in the entry “electric_field”. All properties are available.

How to cite
-----------

M. Gastegger, K. T. Schütt, K. R. Müller, Machine learning of solvent effects on molecular spectra and reactions, Chem. Sci., 10.1039/D1SC02742E (2021).

References and links
--------------------

[1] Perdew, J. P.; Burke, K.; Ernzerhof, M. Phys. Rev. Lett. 77 (18), 3865–3868 (1996).
[2] Weigend, F.; Ahlrichs, R. Phys. Chem. Chem. Phys. 7, 3297–3305 (2005).
[3] Neese, F. Wiley Interdiscip. Rev. Comput. Mol. Sci. 2, 73–78 (2012).
[4] https://wiki.fysik.dtu.dk/ase/index.html
[5] https://github.com/atomistic-machine-learning/schnetpack