Crossref journal-article
AIP Publishing
The Journal of Chemical Physics (317)
Abstract

Neural network model chemistries (NNMCs) promise to facilitate the accurate exploration of chemical space and simulation of large reactive systems. One important path to improving these models is to add layers of physical detail, especially long-range forces. At short range, however, these models are data driven and data limited. Little is systematically known about how data should be sampled, and “test data” chosen randomly from some sampling techniques can provide poor information about generality. If the sampling method is narrow, “test error” can appear encouragingly tiny while the model fails catastrophically elsewhere. In this manuscript, we competitively evaluate two common sampling methods, molecular dynamics (MD) and normal-mode sampling, and one uncommon alternative, metadynamics (MetaMD), for preparing training geometries. We show that MD is an inefficient sampling method in the sense that additional samples do not improve generality. We also show that MetaMD is easily implemented in any NNMC software package, with a cost that scales linearly with the number of atoms in a sample molecule. MetaMD is a black-box way to ensure samples always reach out to new regions of chemical space, while remaining relevant to chemistry near k_BT. It is a cheap tool to address the issue of generalization.

Bibliography

Herr, J. E., Yao, K., McIntyre, R., Toth, D. W., & Parkhill, J. (2018). Metadynamics for training neural network model chemistries: A competitive assessment. The Journal of Chemical Physics, 148(24).

Authors 5
  1. John E. Herr (first)
  2. Kun Yao (additional)
  3. Ryker McIntyre (additional)
  4. David W. Toth (additional)
  5. John Parkhill (additional)
References 93 Referenced 63
  1. 10.1103/physrevlett.108.058301 / Phys. Rev. Lett. / Fast and accurate modeling of molecular atomization energies with machine learning (2012)
  2. 10.1021/acs.jpclett.5b00831 / J. Phys. Chem. Lett. / Machine learning predictions of molecular properties: Accurate many-body potentials and nonlocality in chemical space (2015)
  3. 10.1103/physrevb.89.235411 / Phys. Rev. B / Modeling electronic quantum transport with machine learning (2014)
  4. 10.1038/srep02810 / Sci. Rep. / Accelerating materials property predictions using machine learning (2013)
  5. 10.1103/physrevb.89.205118 / Phys. Rev. B / How to represent crystal structures for machine learning: Towards fast prediction of electronic properties (2014)
  6. 10.1021/acs.jpclett.5b01660 / J. Phys. Chem. Lett. / Machine-learning-augmented chemisorption model for CO2 electroreduction catalyst screening (2015)
  7. 10.1021/acs.jpca.7b08750 / J. Phys. Chem. A / Resolving transition metal chemical space: Feature selection for machine learning and structure-property relationships (2017)
  8. 10.1039/c7sc01247k / Chem. Sci. / Predicting electronic structure properties of transition metal complexes with neural networks (2017)
  9. 10.1103/physrevlett.98.146401 / Phys. Rev. Lett. / Generalized neural-network representation of high-dimensional potential-energy surfaces (2007)
  10. 10.1063/1.3553717 / J. Chem. Phys. / Atom-centered symmetry functions for constructing high-dimensional neural network potentials (2011)
  11. 10.1039/c7sc02267k / Chem. Sci. / Machine learning molecular dynamics for the simulation of infrared spectra (2017)
  12. 10.1039/c6sc05720a / Chem. Sci. / ANI-1: An extensible neural network potential with DFT accuracy at force field computational cost (2017)
  13. 10.1021/acs.jpclett.7b01072 / J. Phys. Chem. Lett. / Intrinsic bond energies from a bonds-in-molecules neural network (2017)
  14. 10.1063/1.4973380 / J. Chem. Phys. / The many-body expansion combined with neural networks (2017)
  15. N. Lubbers, J. S. Smith, and K. Barros, “Hierarchical modeling of molecular energies using a deep neural network,” preprint arXiv:1710.00017 (2017). (10.1063/1.5011181)
  16. B. Huang and O. A. von Lilienfeld, “Chemical space exploration with molecular genes and machine learning,” preprint arXiv:1707.04146 (2017).
  17. 10.1039/C7SC04934J / Chem. Sci. / The TensorMol-0.1 model chemistry: A neural network augmented with long-range physics (2018)
  18. 10.1073/pnas.1602375113 / Proc. Natl. Acad. Sci. U. S. A. / How van der Waals interactions determine the unique properties of water (2016)
  19. 10.1103/physrevb.83.153101 / Phys. Rev. B / High-dimensional neural-network potentials for multicomponent systems: Applications to zinc oxide (2011)
  20. 10.1021/acs.jpca.5b09497 / J. Phys. Chem. A / Ab initio investigation of O–H dissociation from the Al–OH2 complex using molecular dynamics and neural network fitting (2016)
  21. 10.1021/acs.jctc.5b01011 / J. Chem. Theory Comput. / Kinetic energy of hydrocarbons as a function of electron density and convolutional neural networks (2016)
  22. 10.1038/s41467-017-00839-3 / Nat. Commun. / Bypassing the Kohn-Sham equations with machine learning (2017)
  23. 10.1063/1.4834075 / J. Chem. Phys. / Orbital-free bond breaking via machine learning (2013)
  24. 10.1103/physrevlett.108.253002 / Phys. Rev. Lett. / Finding density functionals with machine learning (2012)
  25. 10.1002/qua.25040 / Int. J. Quantum Chem. / Understanding machine-learned density functionals (2016)
  26. 10.1002/qua.24939 / Int. J. Quantum Chem. / Understanding kernel ridge regression: Common behaviors from simple functions to density functionals (2015)
  27. J. Li, D. Cai, and X. He, “Learning graph-level representation for drug discovery,” preprint arXiv:1709.03741 (2017).
  28. 10.1021/acs.jcim.7b00146 / J. Chem. Inf. Model. / Is multitask deep learning practical for pharma? (2017)
  29. J. Gomes, B. Ramsundar, E. N. Feinberg, and V. S. Pande, “Atomic convolutional networks for predicting protein-ligand binding affinity,” preprint arXiv:1703.10603 (2017).
  30. B. Ramsundar, S. Kearnes, P. Riley, D. Webster, D. Konerding, and V. Pande, “Massively multitask networks for drug discovery,” preprint arXiv:1502.02072 (2015).
  31. 10.1039/c1ee02056k / Energy Environ. Sci. / Accelerated computational discovery of high-performance materials for organic photovoltaics by means of cheminformatics (2011)
  32. 10.1021/jz200866s / J. Phys. Chem. Lett. / The harvard clean energy project: Large-scale computational screening and design of organic photovoltaics on the world community grid (2011)
  33. 10.1039/c3ee42756k / Energy Environ. Sci. / Lead candidates for high-performance organic photovoltaics from high-throughput quantum chemistry–the Harvard clean energy project (2014)
  34. 10.1021/cm503507h / Chem. Mater. / Materials cartography: Representing and mining materials space using structural and electronic fingerprints (2015)
  35. H. Huo and M. Rupp, “Unified representation for machine learning of molecules and crystals,” preprint arXiv:1704.06439 (2017).
  36. T. Bereau, R. A. DiStasio, Jr., A. Tkatchenko, and O. A. von Lilienfeld, “Non-covalent interactions across organic and biological subsets of chemical space: Physics-based potentials parametrized from machine learning,” preprint arXiv:1710.05871 (2017). (10.1063/1.5009502)
  37. 10.1021/acs.jpclett.5b01456 / J. Phys. Chem. Lett. / Machine learning for quantum mechanical properties of atoms in molecules (2015)
  38. 10.1039/c1cp21668f / Phys. Chem. Chem. Phys. / Neural network potential-energy surfaces in chemistry: A tool for large-scale simulations (2011)
  39. 10.1021/acs.jpclett.7b00784 / J. Phys. Chem. Lett. / Accurate neural network description of surface phonons in reactive gas-surface dynamics: N2+ Ru(0001) (2017)
  40. 10.1002/anie.201703114 / Angew. Chem., Int. Ed. / First principles neural network potentials for reactive simulations of large molecular and condensed systems (2017)
  41. J. Han, L. Zhang, R. Car et al., “Deep potential: A general representation of a many-body potential energy surface,” preprint arXiv:1707.01478 (2017). (10.4208/cicp.OA-2017-0213)
  42. 10.1038/nmat3078 / Nat. Mater. / Nucleation mechanism for the direct graphite-to-diamond phase transition (2011)
  43. 10.1021/ct300913g / J. Chem. Theory Comput. / A critical assessment of two-body and three-body interactions in water (2013)
  44. 10.1063/1.4930194 / J. Chem. Phys. / On the representation of many-body interactions in water (2015)
  45. 10.1021/acs.jpclett.7b01106 / J. Phys. Chem. Lett. / Molecular origin of the vibrational structure of ice Ih (2017)
  46. 10.1063/1.4993213 / J. Chem. Phys. / Toward chemical accuracy in the description of ion–water interactions through many-body representations. Alkali-water dimer potential energy surfaces (2017)
  47. 10.1063/1.4967719 / J. Chem. Phys. / On the accuracy of the MB-pol many-body potential for water: Interaction energies, vibrational frequencies, and classical thermodynamic and dynamical properties from clusters to liquid water and ice (2016)
  48. 10.1063/1.2336223 / J. Chem. Phys. / A random-sampling high dimensional model representation neural network for building potential energy surfaces (2006)
  49. 10.1016/j.cpc.2009.05.022 / Comput. Phys. Commun. / Fitting sparse multidimensional data with low-dimensional terms (2009)
  50. 10.1002/qua.24795 / Int. J. Quantum Chem. / Neural network-based approaches for building high dimensional and quantum dynamics-friendly potential energy surfaces (2015)
  51. 10.1103/physrevb.95.094203 / Phys. Rev. B / Machine learning based interatomic potential for amorphous carbon (2017)
  52. 10.1126/sciadv.1603015 / Sci. Adv. / Machine learning of accurate energy-conserving molecular force fields (2017)
  53. 10.1038/ncomms13890 / Nat. Commun. / Quantum-chemical insights from deep tensor neural networks (2017)
  54. 10.1103/PhysRevLett.120.036002 / Phys. Rev. Lett. / Symmetry-adapted machine-learning for tensorial properties of atomistic systems (2018)
  55. 10.1073/pnas.202427399 / Proc. Natl. Acad. Sci. U. S. A. / Escaping free-energy minima (2002)
  56. 10.1021/acs.jpclett.7b00038 / J. Phys. Chem. Lett. / Genetic optimization of training sets for improved machine learning models of molecular properties (2017)
  57. 10.1146/annurev-physchem-040215-112229 / Annu. Rev. Phys. Chem. / Enhancing important fluctuations: Rare events and metadynamics from a conceptual viewpoint (2016)
  58. 10.1002/wcms.31 / Wiley Interdiscip. Rev.: Comput. Mol. Sci. / Metadynamics (2011)
  59. 10.1021/jp045424k / J. Phys. Chem. B / Assessing the accuracy of metadynamics (2005)
  60. 10.1103/physrevlett.96.090601 / Phys. Rev. Lett. / Equilibrium free energies from nonequilibrium metadynamics (2006)
  61. 10.1021/jp054359r / J. Phys. Chem. B / Efficient reconstruction of complex free energy landscapes by multiple walkers metadynamics (2006)
  62. 10.1021/ct301010b / J. Chem. Theory Comput. / Stochastic surface walking method for structure prediction and pathway searching (2013)
  63. 10.1039/c4cp01485e / Phys. Chem. Chem. Phys. / Stochastic surface walking method for crystal structure and phase transition pathway prediction (2014)
  64. 10.1039/c7sc01459g / Chem. Sci. / Material discovery by combining stochastic surface walking global optimization with a neural network (2017)
  65. 10.1002/qua.24890 / Int. J. Quantum Chem. / Constructing high-dimensional neural network potentials: A tutorial review (2015)
  66. TensorMol: A statistical model of molecular structure (2017)
  67. TensorFlow: Large-scale machine learning on heterogeneous systems (2015)
  68. 10.1021/ct800511q / J. Chem. Theory Comput. / ‘mindless’ DFT benchmarking (2009)
  69. 10.1021/ci960175l / J. Chem. Inf. Comput. Sci. / Prediction of autoignition temperatures of organic compounds from molecular structure (1997)
  70. 10.1063/1.439486 / J. Chem. Phys. / Molecular dynamics simulations at constant pressure and/or temperature (1980)
  71. 10.1103/physrev.159.98 / Phys. Rev. / Computer ‘experiments’ on classical fluids. I. Thermodynamical properties of Lennard-Jones molecules (1967)
  72. 10.1080/00268976.2014.952696 / Mol. Phys. / Advances in molecular quantum chemistry contained in the Q-chem 4 program package (2015)
  73. 10.1039/b810189b / Phys. Chem. Chem. Phys. / Long-range corrected hybrid density functionals with damped atom–atom dispersion corrections (2008)
  74. D. Kingma and J. Ba, “Adam: A method for stochastic optimization,” preprint arXiv:1412.6980 (2014).
  75. D.-A. Clevert, T. Unterthiner, and S. Hochreiter, “Fast and accurate deep network learning by exponential linear units (ELUs),” preprint arXiv:1511.07289 (2015).
  76. 10.1021/ar040198i / Acc. Chem. Res. / Metadynamics as a tool for exploring free energy landscapes of chemical reactions (2006)
  77. 10.1103/physrevlett.90.075503 / Phys. Rev. Lett. / Predicting crystal structures: The Parrinello-Rahman method revisited (2003)
  78. 10.1038/nmat1696 / Nat. Mater. / Crystal structure transformations in SiO2 from classical and ab initio metadynamics (2006)
  79. 10.1021/ct9006585 / J. Chem. Theory Comput. / Tautomerism in reduced pyrazinacenes (2010)
  80. 10.1039/b822765a / Phys. Chem. Chem. Phys. / First-principles simulations of hydrogen peroxide formation catalyzed by small neutral gold clusters (2009)
  81. 10.1002/chem.200700254 / Chem. - Eur. J. / Towards a rational design of ruthenium CO2 hydrogenation catalysts by ab initio metadynamics (2007)
  82. 10.1021/jp803185j / J. Phys. Chem. A / Conformational behavior of cinchonidine revisited: A combined theoretical and experimental study (2008)
  83. 10.1021/ct900398a / J. Chem. Theory Comput. / Free energy barriers for the N-terminal asparagine to succinimide conversion: Quantum molecular dynamics simulations for the fully solvated model (2009)
  84. 10.1002/ejic.200900714 / Eur. J. Inorg. Chem. / Theoretical analysis of the possible intermediates in the formation of [W6O19]2−
  85. 10.1021/ja8050525 / J. Am. Chem. Soc. / Molecular dynamics prediction of the mechanism of ester hydrolysis in water (2008)
  86. 10.1103/physrevb.79.165437 / Phys. Rev. B / Ab initio study of the diffusion and decomposition pathways of SiHx species on Si(100) (2009)
  87. 10.1021/jp807787s / J. Phys. Chem. C / First-principles molecular dynamics study of the heterogeneous reduction of NO2 on soot surfaces (2008)
  88. 10.1039/b600027d / Phys. Chem. Chem. Phys. / Benchmark database of accurate (MP2 and CCSD(T) complete basis set limit) interaction energies of small model complexes, DNA base pairs, and amino acid pairs (2006)
  89. 10.1063/1.460205 / J. Chem. Phys. / Gaussian-2 theory for molecular energies of first- and second-row compounds (1991)
  90. 10.1063/1.478385 / J. Chem. Phys. / Gaussian-3 theory using reduced Møller-Plesset order (1999)
  91. 10.1063/1.2436888 / J. Chem. Phys. / Gaussian-4 theory (2007)
  92. 10.1038/sdata.2014.22 / Sci. Data / Quantum chemistry structures and properties of 134 kilo molecules (2014)
  93. 10.1038/sdata.2017.193 / Sci. Data / ANI-1: A data set of 20M off-equilibrium DFT calculations for organic molecules (2017)
Dates
Type When
Created March 15, 2018, 11:34 a.m.
Deposited Aug. 2, 2023, 11:20 p.m.
Indexed July 30, 2025, 7:10 a.m.
Issued March 15, 2018
Published March 15, 2018
Published Online March 15, 2018
Published Print June 28, 2018
Funders 0

None

@article{Herr_2018,
  title={Metadynamics for training neural network model chemistries: A competitive assessment},
  volume={148},
  ISSN={1089-7690},
  url={http://dx.doi.org/10.1063/1.5020067},
  DOI={10.1063/1.5020067},
  number={24},
  journal={The Journal of Chemical Physics},
  publisher={AIP Publishing},
  author={Herr, John E. and Yao, Kun and McIntyre, Ryker and Toth, David W. and Parkhill, John},
  year={2018},
  month=mar
}