Abstract
Array programming provides a powerful, compact and expressive syntax for accessing, manipulating and operating on data in vectors, matrices and higher-dimensional arrays. NumPy is the primary array programming library for the Python language. It has an essential role in research analysis pipelines in fields as diverse as physics, chemistry, astronomy, geoscience, biology, psychology, materials science, engineering, finance and economics. For example, in astronomy, NumPy was an important part of the software stack used in the discovery of gravitational waves¹ and in the first imaging of a black hole². Here we review how a few fundamental array concepts lead to a simple and powerful programming paradigm for organizing, exploring and analysing scientific data. NumPy is the foundation upon which the scientific Python ecosystem is constructed. It is so pervasive that several projects, targeting audiences with specialized needs, have developed their own NumPy-like interfaces and array objects. Owing to its central position in the ecosystem, NumPy increasingly acts as an interoperability layer between such array computation libraries and, together with its application programming interface (API), provides a flexible framework to support the next decade of scientific and industrial analysis.
Bibliography
Harris, C. R., Millman, K. J., van der Walt, S. J., Gommers, R., Virtanen, P., Cournapeau, D., Wieser, E., Taylor, J., Berg, S., Smith, N. J., Kern, R., Picus, M., Hoyer, S., van Kerkwijk, M. H., Brett, M., Haldane, A., del Río, J. F., Wiebe, M., Peterson, P., … Oliphant, T. E. (2020). Array programming with NumPy. Nature, 585(7825), 357–362.
Authors
26
- Charles R. Harris (first)
- K. Jarrod Millman (additional)
- Stéfan J. van der Walt (additional)
- Ralf Gommers (additional)
- Pauli Virtanen (additional)
- David Cournapeau (additional)
- Eric Wieser (additional)
- Julian Taylor (additional)
- Sebastian Berg (additional)
- Nathaniel J. Smith (additional)
- Robert Kern (additional)
- Matti Picus (additional)
- Stephan Hoyer (additional)
- Marten H. van Kerkwijk (additional)
- Matthew Brett (additional)
- Allan Haldane (additional)
- Jaime Fernández del Río (additional)
- Mark Wiebe (additional)
- Pearu Peterson (additional)
- Pierre Gérard-Marchant (additional)
- Kevin Sheppard (additional)
- Tyler Reddy (additional)
- Warren Weckesser (additional)
- Hameer Abbasi (additional)
- Christoph Gohlke (additional)
- Travis E. Oliphant (additional)
References
57
Referenced
16,931
- Abbott, B. P. et al. Observation of gravitational waves from a binary black hole merger. Phys. Rev. Lett. 116, 061102 (2016). (DOI: 10.1103/PhysRevLett.116.061102)
- Chael, A. et al. High-resolution linear polarimetric imaging for the Event Horizon Telescope. Astrophys. J. 829, 11 (2016). (DOI: 10.3847/0004-637X/829/1/11)
- Dubois, P. F., Hinsen, K. & Hugunin, J. Numerical Python. Comput. Phys. 10, 262–267 (1996). (DOI: 10.1063/1.4822400)
- Ascher, D., Dubois, P. F., Hinsen, K., Hugunin, J. & Oliphant, T. E. An Open Source Project: Numerical Python (Lawrence Livermore National Laboratory, 2001).
- Yang, T.-Y., Furnish, G. & Dubois, P. F. Steering object-oriented scientific computations. In Proc. TOOLS USA 97. Intl Conf. Technology of Object Oriented Systems and Languages (eds Ege, R., Singh, M. & Meyer, B.) 112–119 (IEEE, 1997).
- Greenfield, P., Miller, J. T., Hsu, J. & White, R. L. numarray: a new scientific array package for Python. In PyCon DC 2003 http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.112.9899 (2003).
- Oliphant, T. E. Guide to NumPy 1st edn (Trelgol Publishing, 2006).
- Dubois, P. F. Python: batteries included. Comput. Sci. Eng. 9, 7–9 (2007). (DOI: 10.1109/MCSE.2007.51)
- Oliphant, T. E. Python for scientific computing. Comput. Sci. Eng. 9, 10–20 (2007). (DOI: 10.1109/MCSE.2007.58)
- Millman, K. J. & Aivazis, M. Python for scientists and engineers. Comput. Sci. Eng. 13, 9–12 (2011). (DOI: 10.1109/MCSE.2011.36)
- Pérez, F., Granger, B. E. & Hunter, J. D. Python: an ecosystem for scientific computing. Comput. Sci. Eng. 13, 13–21 (2011). Explains why the scientific Python ecosystem is a highly productive environment for research. (DOI: 10.1109/MCSE.2010.119)
- Virtanen, P. et al. SciPy 1.0—fundamental algorithms for scientific computing in Python. Nat. Methods 17, 261–272 (2020); correction 17, 352 (2020). Introduces the SciPy library and includes a more detailed history of NumPy and SciPy. (DOI: 10.1038/s41592-019-0686-2)
- Hunter, J. D. Matplotlib: a 2D graphics environment. Comput. Sci. Eng. 9, 90–95 (2007). (DOI: 10.1109/MCSE.2007.55)
- McKinney, W. Data structures for statistical computing in Python. In Proc. 9th Python in Science Conf. (eds van der Walt, S. & Millman, K. J.) 56–61 (2010). (DOI: 10.25080/Majora-92bf1922-00a)
- Pedregosa, F. et al. Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011).
- van der Walt, S. et al. scikit-image: image processing in Python. PeerJ 2, e453 (2014). (DOI: 10.7717/peerj.453)
- van der Walt, S., Colbert, S. C. & Varoquaux, G. The NumPy array: a structure for efficient numerical computation. Comput. Sci. Eng. 13, 22–30 (2011). Discusses the NumPy array data structure with a focus on how it enables efficient computation. (DOI: 10.1109/MCSE.2011.37)
- Wang, Q., Zhang, X., Zhang, Y. & Yi, Q. AUGEM: automatically generate high performance dense linear algebra kernels on x86 CPUs. In SC’13: Proc. Intl Conf. High Performance Computing, Networking, Storage and Analysis 25 (IEEE, 2013). (DOI: 10.1145/2503210.2503219)
- Xianyi, Z., Qian, W. & Yunquan, Z. Model-driven level 3 BLAS performance optimization on Loongson 3A processor. In 2012 IEEE 18th Intl Conf. Parallel and Distributed Systems 684–691 (IEEE, 2012). (DOI: 10.1109/ICPADS.2012.97)
- Pérez, F. & Granger, B. E. IPython: a system for interactive scientific computing. Comput. Sci. Eng. 9, 21–29 (2007). (DOI: 10.1109/MCSE.2007.53)
- Kluyver, T. et al. Jupyter Notebooks—a publishing format for reproducible computational workflows. In Positioning and Power in Academic Publishing: Players, Agents and Agendas (eds Loizides, F. & Schmidt, B.) 87–90 (IOS Press, 2016).
- Hagberg, A. A., Schult, D. A. & Swart, P. J. Exploring network structure, dynamics, and function using NetworkX. In Proc. 7th Python in Science Conf. (eds Varoquaux, G., Vaught, T. & Millman, K. J.) 11–15 (2008).
- Astropy Collaboration et al. Astropy: a community Python package for astronomy. Astron. Astrophys. 558, A33 (2013). (DOI: 10.1051/0004-6361/201322068)
- Price-Whelan, A. M. et al. The Astropy Project: building an open-science project and status of the v2.0 core package. Astron. J. 156, 123 (2018). (DOI: 10.3847/1538-3881/aac387)
- Cock, P. J. et al. Biopython: freely available Python tools for computational molecular biology and bioinformatics. Bioinformatics 25, 1422–1423 (2009). (DOI: 10.1093/bioinformatics/btp163)
- Millman, K. J. & Brett, M. Analysis of functional magnetic resonance imaging in Python. Comput. Sci. Eng. 9, 52–55 (2007). (DOI: 10.1109/MCSE.2007.46)
- The SunPy Community et al. SunPy—Python for solar physics. Comput. Sci. Discov. 8, 014009 (2015). (DOI: 10.1088/1749-4699/8/1/014009)
- Hamman, J., Rocklin, M. & Abernathy, R. Pangeo: a big-data ecosystem for scalable Earth system science. In EGU General Assembly Conf. Abstracts 12146 (2018).
- Chael, A. A. et al. ehtim: imaging, analysis, and simulation software for radio interferometry. Astrophysics Source Code Library https://ascl.net/1904.004 (2019).
- Millman, K. J. & Pérez, F. Developing open source scientific practice. In Implementing Reproducible Research (eds Stodden, V., Leisch, F. & Peng, R. D.) 149–183 (CRC Press, 2014). Describes the software engineering practices embraced by the NumPy and SciPy communities with a focus on how these practices improve research. (DOI: 10.1201/9781315373461-6)
- van der Walt, S. The SciPy Documentation Project (technical overview). In Proc. 7th Python in Science Conf. (SciPy 2008) (eds Varoquaux, G., Vaught, T. & Millman, K. J.) 27–28 (2008).
- Harrington, J. The SciPy Documentation Project. In Proc. 7th Python in Science Conference (SciPy 2008) (eds Varoquaux, G., Vaught, T. & Millman, K. J.) 33–35 (2008).
- Harrington, J. & Goldsmith, D. Progress report: NumPy and SciPy documentation in 2009. In Proc. 8th Python in Science Conf. (SciPy 2009) (eds Varoquaux, G., van der Walt, S. & Millman, K. J.) 84–87 (2009).
- Royal Astronomical Society Report of the RAS ‘A’ Awards Committee 2020: Astropy Project: 2020 Group Achievement Award (A) https://ras.ac.uk/sites/default/files/2020-01/Group%20Award%20-%20Astropy.pdf (2020).
- Wilson, G. Software carpentry: getting scientists to write better code by making them more productive. Comput. Sci. Eng. 8, 66–69 (2006). (DOI: 10.1109/MCSE.2006.122)
- Hannay, J. E. et al. How do scientists develop and use scientific software? In Proc. 2009 ICSE Workshop on Software Engineering for Computational Science and Engineering 1–8 (IEEE, 2009). (DOI: 10.1109/SECSE.2009.5069155)
- Millman, K. J., Brett, M., Barnowski, R. & Poline, J.-B. Teaching computational reproducibility for neuroimaging. Front. Neurosci. 12, 727 (2018). (DOI: 10.3389/fnins.2018.00727)
- Paszke, A. et al. Pytorch: an imperative style, high-performance deep learning library. In Advances in Neural Information Processing Systems 32 (eds Wallach, H. et al.) 8024–8035 (Neural Information Processing Systems, 2019).
- Abadi, M. et al. TensorFlow: a system for large-scale machine learning. In OSDI’16: Proc. 12th USENIX Conf. Operating Systems Design and Implementation (chairs Keeton, K. & Roscoe, T.) 265–283 (USENIX Association, 2016).
- Chen, T. et al. MXNet: a flexible and efficient machine learning library for heterogeneous distributed systems. Preprint at http://www.arxiv.org/abs/1512.01274 (2015).
- Hoyer, S. & Hamman, J. xarray: N–D labeled arrays and datasets in Python. J. Open Res. Softw. 5, 10 (2017). (DOI: 10.5334/jors.148)
- Entschev, P. Distributed multi-GPU computing with Dask, CuPy and RAPIDS. In EuroPython 2019 https://ep2019.europython.eu/media/conference/slides/fX8dJsD-distributed-multi-gpu-computing-with-dask-cupy-and-rapids.pdf (2019).
- Behnel, S. et al. Cython: the best of both worlds. Comput. Sci. Eng. 13, 31–39 (2011). (DOI: 10.1109/MCSE.2010.118)
- Lam, S. K., Pitrou, A. & Seibert, S. Numba: a LLVM-based Python JIT compiler. In Proc. Second Workshop on the LLVM Compiler Infrastructure in HPC, LLVM ’15 7:1–7:6 (ACM, 2015). (DOI: 10.1145/2833157.2833162)
- Guelton, S. et al. Pythran: enabling static optimization of scientific Python programs. Comput. Sci. Discov. 8, 014001 (2015). (DOI: 10.1088/1749-4680/8/1/014001)
- Dongarra, J., Golub, G. H., Grosse, E., Moler, C. & Moore, K. Netlib and NA-Net: building a scientific computing community. IEEE Ann. Hist. Comput. 30, 30–41 (2008). (DOI: 10.1109/MAHC.2008.29)
- Barrett, K. A., Chiu, Y. H., Painter, J. F., Motteler, Z. C. & Dubois, P. F. Basis System, Part I: Running a Basis Program—A Tutorial for Beginners UCRL-MA-118543, Vol. 1 (Lawrence Livermore National Laboratory, 1995).
- Dubois, P. F. & Motteler, Z. Basis System, Part II: Basis Language Reference Manual UCRL-MA-118543, Vol. 2 (Lawrence Livermore National Laboratory, 1995).
- Chiu, Y. H. & Dubois, P. F. Basis System, Part III: EZN User Manual UCRL-MA-118543, Vol. 3 (Lawrence Livermore National Laboratory, 1995).
- Chiu, Y. H. & Dubois, P. F. Basis System, Part IV: EZD User Manual UCRL-MA-118543, Vol. 4 (Lawrence Livermore National Laboratory, 1995).
- Munro, D. H. & Dubois, P. F. Using the Yorick interpreted language. Comput. Phys. 9, 609–615 (1995). (DOI: 10.1063/1.4823451)
- Ihaka, R. & Gentleman, R. R: a language for data analysis and graphics. J. Comput. Graph. Stat. 5, 299–314 (1996). (DOI: 10.1080/10618600.1996.10474713)
- Iverson, K. E. A programming language. In Proc. 1962 Spring Joint Computer Conf. 345–351 (1962). (DOI: 10.1145/1460833.1460872)
- Jenness, T. et al. LSST data management software development practices and tools. In Proc. SPIE 10707, Software and Cyberinfrastructure for Astronomy V 1070709 (SPIE and International Society for Optics and Photonics, 2018). (DOI: 10.1117/12.2312157)
- Matsakis, N. D. & Klock, F. S. The Rust language. Ada Letters 34, 103–104 (2014). (DOI: 10.1145/2692956.2663188)
- Bezanson, J., Edelman, A., Karpinski, S. & Shah, V. B. Julia: a fresh approach to numerical computing. SIAM Rev. 59, 65–98 (2017). (DOI: 10.1137/141000671)
- Lattner, C. & Adve, V. LLVM: a compilation framework for lifelong program analysis and transformation. In Proc. 2004 Intl Symp. Code Generation and Optimization (CGO’04) 75–88 (IEEE, 2004).
Dates
Type | When |
---|---|
Created | Sept. 17, 2020, 4:48 p.m. |
Deposited | Aug. 13, 2024, 10:15 p.m. |
Indexed | Aug. 21, 2025, 1:20 a.m. |
Issued | Sept. 16, 2020 |
Published | Sept. 16, 2020 |
Published Online | Sept. 16, 2020 |
Published Print | Sept. 17, 2020 |
@article{Harris_2020, title={Array programming with NumPy}, volume={585}, ISSN={1476-4687}, url={http://dx.doi.org/10.1038/s41586-020-2649-2}, DOI={10.1038/s41586-020-2649-2}, number={7825}, journal={Nature}, publisher={Springer Science and Business Media LLC}, author={Harris, Charles R. and Millman, K. Jarrod and van der Walt, Stéfan J. and Gommers, Ralf and Virtanen, Pauli and Cournapeau, David and Wieser, Eric and Taylor, Julian and Berg, Sebastian and Smith, Nathaniel J. and Kern, Robert and Picus, Matti and Hoyer, Stephan and van Kerkwijk, Marten H. and Brett, Matthew and Haldane, Allan and del Río, Jaime Fernández and Wiebe, Mark and Peterson, Pearu and Gérard-Marchant, Pierre and Sheppard, Kevin and Reddy, Tyler and Weckesser, Warren and Abbasi, Hameer and Gohlke, Christoph and Oliphant, Travis E.}, year={2020}, month=sep, pages={357–362} }