Abstract
The use of advanced machine learning algorithms in experimental materials science is limited by the lack of sufficiently large and diverse datasets amenable to data mining. If publicly open, such data resources would also enable materials research by scientists without access to expensive experimental equipment. Here, we report on our progress towards a publicly open High Throughput Experimental Materials (HTEM) Database (htem.nrel.gov). This database currently contains 140,000 sample entries, characterized by structural (100,000), synthetic (80,000), chemical (70,000), and optoelectronic (50,000) properties of inorganic thin film materials, grouped in >4,000 sample entries across >100 materials systems; more than half of these data are publicly available. This article shows how the HTEM database may enable scientists to explore materials through a web-based user interface and an application programming interface. This paper also describes an HTE approach to generating materials data, and discusses the laboratory information management system (LIMS) that underpins the HTEM database. Finally, this manuscript illustrates how advanced machine learning algorithms can be applied to materials science problems using this open data resource.
Dates
Type | When |
---|---|
Created | April 3, 2018, 9:04 a.m. |
Deposited | Dec. 20, 2022, 7:48 p.m. |
Indexed | Aug. 22, 2025, 12:58 a.m. |
Issued | April 3, 2018 |
Published | April 3, 2018 |
Published Online | April 3, 2018 |
@article{Zakutayev_2018, title={An open experimental database for exploring inorganic materials}, volume={5}, ISSN={2052-4463}, url={http://dx.doi.org/10.1038/sdata.2018.53}, DOI={10.1038/sdata.2018.53}, number={1}, journal={Scientific Data}, publisher={Springer Science and Business Media LLC}, author={Zakutayev, Andriy and Wunder, Nick and Schwarting, Marcus and Perkins, John D. and White, Robert and Munch, Kristin and Tumas, William and Phillips, Caleb}, year={2018}, month=apr }