Introduction to NumPy
# CHAPTER 10
Introduction to NumPy
1. Chapter Introduction
Standard Python lists are highly flexible—they can hold integers, strings, and other lists all at the same time. However, this flexibility makes them incredibly slow for mathematics. When processing millions of data points or training machine learning models, Python lists are simply too slow. Enter NumPy (Numerical Python). NumPy provides the high-performance array structures that the rest of the data science ecosystem (Pandas, Scikit-Learn) is built upon.2. What is NumPy?
NumPy is an open-source Python library used for working with arrays. It also contains functions for linear algebra, fourier transform, and matrices.
The core feature of NumPy is the ndarray (N-dimensional array). It looks like a list, but under the hood, it is implemented in optimized C code.
3. Installing and Importing
If you are using Anaconda, NumPy is already installed. Otherwise, run !pip install numpy.
4. Creating NumPy Arrays
You can create an array by passing a standard Python list into np.array().
The Data Type Restriction: Unlike Python lists, NumPy arrays are *homogeneous*. This means every item in the array MUST be the exact same data type.
5. Array Creation Functions
NumPy has built-in functions to quickly generate large arrays without typing them out.
6. Array Properties
You must understand the shape of your data. NumPy provides attributes to inspect arrays.
7. Why NumPy is Faster (A Preview of Vectorization)
To multiply every number in a Python list by 2, you need a for loop. In NumPy, you apply the math directly to the array.
8. Common Mistakes
-
Forgetting the brackets for 2D arrays: When making a matrix, people often write
np.array([1, 2], [3, 4]). This is an error. It must be a list of lists:np.array([ [1, 2], [3, 4] ]). Note the extra set of outer brackets.
-
Using Python
mathmodule on arrays: If you want the square root of an array,math.sqrt(my_array)will crash. You must use NumPy's equivalent:np.sqrt(my_array).
9. MCQs
What is the standard industry alias for importing NumPy?
What is the core data structure in NumPy?
What is a key requirement of a NumPy array regarding data types?
How do you create an array of 10 zeros?
Which function creates an array of numbers from 0 to 10 with a step of 2?
What does matrix.shape return for a 2D array?
How do you multiply every item in a NumPy array named arr by 5?
What attribute tells you the data type of the elements inside the array?
Which function returns 5 evenly spaced numbers between 0 and 1?
Why is NumPy so much faster than standard Python lists?
10. Interview Questions
- Q: Explain the difference between a Python List and a NumPy Array. Why is the NumPy array preferred for data science?
- Q: What does it mean that NumPy arrays are "homogeneous"?
11. Summary
NumPy is the foundational mathematics library for Python. By sacrificing the flexibility of Python lists (enforcing homogeneous data types), NumPy gains massive performance advantages using thendarray. You can easily generate arrays using np.zeros(), np.arange(), and np.linspace(), and inspect their structure using .shape and .dtype.