Python for Data Science
CHAPTER 10 Beginner

Introduction to NumPy

Updated: May 18, 2026

1. Chapter Introduction

Standard Python lists are highly flexible: they can hold integers, strings, and other lists all at the same time. That flexibility comes at a cost. Each element is a separate Python object, so arithmetic over a list has to run element by element in the interpreter, which is too slow when you are processing millions of data points or training machine learning models. Enter NumPy (Numerical Python). NumPy provides the high-performance array structures that much of the data science ecosystem (pandas, scikit-learn) is built upon.

2. What is NumPy?

NumPy is an open-source Python library for working with arrays. It also provides functions for linear algebra, Fourier transforms, and matrix operations.

The core feature of NumPy is the ndarray (N-dimensional array). It looks like a list, but under the hood, it is implemented in optimized C code.
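
As a small illustration of "looks like a list, but is not one", the sketch below compares a list and an ndarray holding the same values. The variable names are illustrative only.

```python
import numpy as np

py_list = [10, 20, 30]
nd_array = np.array([10, 20, 30])

# Indexing and len() behave the same way for both.
print(py_list[0], nd_array[0])      # 10 10
print(len(py_list), len(nd_array))  # 3 3

# The + operator differs: lists concatenate, arrays add element by element.
list_sum = py_list + py_list
array_sum = nd_array + nd_array
print(list_sum)   # [10, 20, 30, 10, 20, 30]
print(array_sum)  # [20 40 60]
```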

3. Installing and Importing

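If you are using Anaconda, NumPy is already installed. Otherwise, run pip install numpy in a terminal, or !pip install numpy in a Jupyter notebook cell (the leading ! tells the notebook to run the line as a shell command).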

python
# 'np' is the universal industry standard alias
import numpy as np
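
One simple way to confirm the import worked is to print the installed version. This is only a sanity check, and the version string you see depends on your environment.

```python
import numpy as np

# If this line runs without an ImportError, NumPy is installed.
version = np.__version__
print("NumPy version:", version)
```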

4. Creating NumPy Arrays

You can create an array by passing a standard Python list into np.array().

python
import numpy as np

# Create from a list
my_list = [1, 2, 3, 4, 5]
my_array = np.array(my_list)

print(my_array)         # [1 2 3 4 5]
print(type(my_array))   # <class 'numpy.ndarray'>

The Data Type Restriction: Unlike Python lists, NumPy arrays are *homogeneous*. This means every item in the array MUST be the exact same data type.

python
# If you mix numbers and strings, NumPy converts every element to a string.
mixed_list = [1, 2, "Apple"]
mixed_array = np.array(mixed_list)

print(mixed_array) # ['1' '2' 'Apple']
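
Strings are not the only case. When the inputs differ in type, NumPy picks one common type that can represent all of them. The sketch below shows two further cases, assuming default settings: integers mixed with a float, and an explicitly requested dtype.

```python
import numpy as np

# Integers mixed with a float are all stored as floating point.
int_float = np.array([1, 2, 3.5])
print(int_float)        # [1.  2.  3.5]
print(int_float.dtype)  # float64

# The dtype argument requests a specific element type up front.
as_float = np.array([1, 2, 3], dtype=np.float64)
print(as_float)         # [1. 2. 3.]

# The number/string mix from above ends up with a Unicode string dtype.
mixed = np.array([1, 2, "Apple"])
print(mixed.dtype.kind)  # U
```

Checking .dtype right after creating an array is a cheap way to catch an accidental conversion before doing any math on it.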

5. Array Creation Functions

NumPy has built-in functions to quickly generate large arrays without typing them out.

python
# 1. Array of zeros (useful for initializing empty matrices)
zeros = np.zeros(5)
print(zeros) # [0. 0. 0. 0. 0.]

# 2. Array of ones
ones = np.ones(3)
print(ones) # [1. 1. 1.]

# 3. Arange (like Python's range(), but returns an array)
# np.arange(start, stop, step); the stop value itself is excluded
seq = np.arange(0, 10, 2)
print(seq) # [0 2 4 6 8]

# 4. Linspace (evenly spaced numbers over an interval; both endpoints are included)
lin = np.linspace(0, 1, 5)
print(lin) # [0.   0.25 0.5  0.75 1.  ]
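
Two details of these functions are easy to miss: the shape argument can be a tuple to build a 2D array, and np.arange and np.linspace treat the end of the interval differently. A minimal sketch, with illustrative variable names:

```python
import numpy as np

# A tuple as the shape gives a 2D array: 2 rows, 3 columns.
grid = np.zeros((2, 3))
print(grid.shape)  # (2, 3)

# np.arange stops before the stop value.
stepped = np.arange(0, 10, 2)
print(stepped)     # [0 2 4 6 8]

# np.linspace includes the stop value by default.
spaced = np.linspace(0, 10, 6)
print(spaced)      # [ 0.  2.  4.  6.  8. 10.]
```

As a rule of thumb, np.arange fits when you know the step size, and np.linspace fits when you know how many points you want.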

6. Array Properties

You must understand the shape of your data. NumPy provides attributes to inspect arrays.

python
# Let's create a 2D Array (A Matrix)
matrix = np.array([
    [1, 2, 3],
    [4, 5, 6]
])

print("Dimensions:", matrix.ndim)  # Output: 2
print("Shape:", matrix.shape)      # Output: (2, 3) -> 2 rows, 3 columns
print("Total Items:", matrix.size) # Output: 6
print("Data Type:", matrix.dtype)  # Output: int64 on most platforms (int32 on Windows with NumPy older than 2.0)
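
The same attributes work on a 1D array. The sketch below shows that the shape of a 1D array is a one-element tuple, and that size always equals the product of the shape entries.

```python
import numpy as np

vector = np.array([1.5, 2.5, 3.5])
print(vector.ndim)   # 1
print(vector.shape)  # (3,)
print(vector.size)   # 3
print(vector.dtype)  # float64

# For a 2D array, size is rows * columns.
matrix = np.array([[1, 2, 3], [4, 5, 6]])
rows, cols = matrix.shape
print(rows * cols == matrix.size)  # True
```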

7. Why NumPy is Faster (A Preview of Vectorization)

To multiply every number in a Python list by 2, you need a for loop (or a list comprehension). In NumPy, you apply the math directly to the array.

python
prices = np.array([10, 20, 30])

# Multiplies every element by 2 in a single operation that runs in compiled C code.
doubled = prices * 2 

print(doubled) # [20 40 60]
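
To see the difference yourself, the sketch below runs the same doubling on a list and on an array and times both with the standard library's timeit module. The array length and repeat count are arbitrary choices, and the timings depend on your machine, so no expected numbers are shown for them.

```python
import timeit

import numpy as np

values = list(range(100_000))
array = np.arange(100_000)

# Same result, two approaches.
doubled_list = [v * 2 for v in values]  # Python-level loop
doubled_array = array * 2               # vectorized operation

print(doubled_list[:3])   # [0, 2, 4]
print(doubled_array[:3])  # [0 2 4]

# Time 20 repetitions of each approach.
list_time = timeit.timeit(lambda: [v * 2 for v in values], number=20)
array_time = timeit.timeit(lambda: array * 2, number=20)
print(f"list: {list_time:.4f}s, array: {array_time:.4f}s")
```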

8. Common Mistakes

  • Forgetting the brackets for 2D arrays: When making a matrix, people often write np.array([1, 2], [3, 4]). This raises a TypeError, because NumPy treats the second list as the dtype argument. It must be a list of lists: np.array([ [1, 2], [3, 4] ]). Note the extra set of outer brackets.
  • Using Python math module on arrays: If you want the square root of an array, math.sqrt(my_array) raises a TypeError, because math.sqrt only accepts a single number. You must use NumPy's equivalent: np.sqrt(my_array).
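
Both mistakes can be reproduced safely by catching the exception. The sketch below does that and then shows the corrected call for each case. The exact error messages vary between NumPy versions, so only the exception type is relied on here.

```python
import math

import numpy as np

errors = []

# Mistake 1: the second positional argument of np.array is dtype, not more data.
try:
    np.array([1, 2], [3, 4])
except TypeError as exc:
    errors.append(type(exc).__name__)
    print("np.array([1, 2], [3, 4]) failed:", exc)

# Fix: wrap the rows in one outer list.
matrix = np.array([[1, 2], [3, 4]])
print(matrix.shape)  # (2, 2)

# Mistake 2: math.sqrt expects a single number, not a whole array.
values = np.array([4.0, 9.0, 16.0])
try:
    math.sqrt(values)
except TypeError as exc:
    errors.append(type(exc).__name__)
    print("math.sqrt(values) failed:", exc)

# Fix: np.sqrt works element by element.
roots = np.sqrt(values)
print(roots)  # [2. 3. 4.]
```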

9. MCQs

Question 1

What is the standard industry alias for importing NumPy?

Question 2

What is the core data structure in NumPy?

Question 3

What is a key requirement of a NumPy array regarding data types?

Question 4

How do you create an array of 10 zeros?

Question 5

Which function creates an array of numbers from 0 up to (but not including) 10 with a step of 2?

Question 6

What does matrix.shape return for a 2D array?

Question 7

How do you multiply every item in a NumPy array named arr by 5?

Question 8

What attribute tells you the data type of the elements inside the array?

Question 9

Which function returns 5 evenly spaced numbers between 0 and 1?

Question 10

Why is NumPy so much faster than standard Python lists?

10. Interview Questions

  • Q: Explain the difference between a Python List and a NumPy Array. Why is the NumPy array preferred for data science?
  • Q: What does it mean that NumPy arrays are "homogeneous"?

11. Summary

NumPy is the foundational numerical computing library for Python. By giving up the flexibility of Python lists (every element must share one data type), the ndarray can run its operations in compiled C code and gains a large performance advantage. You can generate arrays with np.zeros(), np.ones(), np.arange(), and np.linspace(), and inspect their structure with .ndim, .shape, .size, and .dtype.

12. Next Chapter Recommendation

In Chapter 11: NumPy Arrays and Operations, we will dive deeper into multidimensional indexing, slicing, and utilizing NumPy's powerful built-in mathematical aggregation functions.
