Crossref journal-article
AIP Publishing
The Journal of Chemical Physics (317)
Abstract

In recent years, machine learning techniques have shown great potential in various problems from a multitude of disciplines, including materials design and drug discovery. Their high computational speed on the one hand and accuracy comparable to that of density functional theory on the other make machine learning algorithms efficient for high-throughput screening through chemical and configurational space. However, the machine learning algorithms available in the literature require large training datasets to reach chemical accuracy and also show large errors for the so-called outliers: out-of-sample molecules that are not well represented in the training set. In the present paper, we propose a new machine learning algorithm for predicting molecular properties that addresses these two issues: it is based on a local model of interatomic interactions, which provides high accuracy when trained on relatively small training sets, and an active learning algorithm for optimally choosing the training set, which significantly reduces the errors for the outliers. We compare our model to other state-of-the-art algorithms from the literature on widely used benchmark tests.

Bibliography

Gubaev, K., Podryabinkin, E. V., & Shapeev, A. V. (2018). Machine learning of molecular properties: Locality and active learning. The Journal of Chemical Physics, 148(24).

Authors 3
  1. Konstantin Gubaev (first)
  2. Evgeny V. Podryabinkin (additional)
  3. Alexander V. Shapeev (additional)
References 22 Referenced 143
  1. 10.1021/acs.jpclett.7b00038 / J. Phys. Chem. Lett. / Genetic optimization of training sets for improved machine learning models of molecular properties (2017)
  2. 10.1021/acs.jpclett.5b00831 / J. Phys. Chem. Lett. / Machine learning predictions of molecular properties: Accurate many-body potentials and nonlocality in chemical space (2015)
  3. 10.1038/sdata.2014.22 / Sci. Data / Quantum chemistry structures and properties of 134 kilo molecules (2014)
  4. 10.1021/acs.jctc.5b00099 / J. Chem. Theory Comput. / Big data meets quantum chemistry approximations: The δ-machine learning approach (2015)
  5. Machine learning, quantum mechanics, and chemical compound space (2015)
  6. 10.1063/1.4964627 / J. Chem. Phys. / Communication: Understanding molecular representations in machine learning: The role of uniqueness and target similarity (2016)
  7. 10.1038/ncomms13890 / Nat. Commun. / Quantum-chemical insights from deep tensor neural networks (2017)
  8. 10.1021/acs.jctc.7b00577 / J. Chem. Theory Comput. / Prediction errors of molecular machine learning models lower than hybrid DFT error (2017)
  9. Neural message passing for quantum chemistry (2017)
  10. Unified representation for machine learning of molecules and crystals (2017)
  11. 10.1103/physrevlett.108.058301 / Phys. Rev. Lett. / Fast and accurate modeling of molecular atomization energies with machine learning (2012)
  12. 10.1021/acs.jpclett.5b01456 / J. Phys. Chem. Lett. / Machine learning for quantum mechanical properties of atoms in molecules (2015)
  13. 10.1039/c6cp00415f / Phys. Chem. Chem. Phys. / Comparing molecules and solids across structural and alchemical space (2016)
  14. 10.1063/1.5011181 / J. Chem. Phys. / Hierarchical modeling of molecular energies using a deep neural network (2018)
  15. Advances in Neural Information Processing Systems / Moleculenet: A continuous-filter convolutional neural network for modeling quantum interactions (2017), p. 992
  16. 10.1137/15m1054183 / Multiscale Model. Simul. / Moment tensor potentials: A class of systematically improvable interatomic potentials (2016)
  17. The ‘DNA’ of chemistry: Scalable quantum machine learning with ‘amons’ (2017)
  18. 10.1016/j.commatsci.2017.08.031 / Comput. Mater. Sci. / Active learning of linearly parametrized interatomic potentials (2017)
  19. Matrix Methods: Theory, Algorithms, Applications / How to find a good submatrix (2010), p. 247
  20. 10.1021/ci300415d / J. Chem. Inf. Model. / Enumeration of 166 billion organic small molecules in the chemical universe database GDB-17 (2012)
  21. 10.1016/j.actamat.2016.09.017 / Acta Mater. / A computational high-throughput search for new ternary superalloys (2017)
  22. 10.1088/1367-2630/15/9/095003 / New J. Phys. / Machine learning of molecular electronic properties in chemical compound space (2013)
Dates
Type When
Created April 19, 2018, 7:05 a.m.
Deposited Aug. 2, 2023, 9:10 p.m.
Indexed Aug. 6, 2025, 8:39 a.m.
Issued April 19, 2018
Published April 19, 2018
Published Online April 19, 2018
Published Print June 28, 2018
Funders 2
  1. Skolkovo Foundation, Russia
    Awards 1
    1. 2016-7/NGP
  2. Los Alamos National Laboratory 10.13039/100008902

    Region: Americas

    gov (Research institutes and centers)

    Labels 3
    1. Los Alamos National Lab
    2. Los Alamos Lab
    3. LANL
    Awards 1
    1. 1150-06_2015

@article{Gubaev_2018, title={Machine learning of molecular properties: Locality and active learning}, volume={148}, ISSN={1089-7690}, url={http://dx.doi.org/10.1063/1.5005095}, DOI={10.1063/1.5005095}, number={24}, journal={The Journal of Chemical Physics}, publisher={AIP Publishing}, author={Gubaev, Konstantin and Podryabinkin, Evgeny V. and Shapeev, Alexander V.}, year={2018}, month=apr }