CHAPTER 15
Beginner
Data Transformation and Manipulation
Updated: May 18, 2026
1. Chapter Introduction
Raw data rarely comes in the exact form needed for analysis. Transformation (sorting, applying custom logic, mapping values, creating derived features) converts raw data into analysis-ready datasets.
2. Sorting
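A minimal sketch of sorting with sort_values(), sort_index(), and rank(). The DataFrame and its column names (name, dept, salary) are hypothetical sample data assumed for illustration, not taken from the chapter's dataset.

```python
import pandas as pd

# Hypothetical sample data; the column names are illustrative assumptions.
df = pd.DataFrame({
    "name": ["Asha", "Ben", "Chen", "Dina"],
    "dept": ["HR", "IT", "IT", "HR"],
    "salary": [52000, 61000, 52000, 48000],
})

# Sort by a single column, highest salary first.
by_salary = df.sort_values("salary", ascending=False)

# Sort by several columns: dept ascending, then salary descending within each dept.
by_dept_salary = df.sort_values(["dept", "salary"], ascending=[True, False])

# Sorting on the index brings the rows back to their original order.
restored = by_salary.sort_index()

# Rank salaries from highest to lowest; with method="min",
# tied values all receive the lowest rank of their group.
df["salary_rank"] = df["salary"].rank(method="min", ascending=False)

print(by_dept_salary)
print(df)
```

sort_values() returns a new DataFrame and leaves df untouched, which is why the results are assigned to new names here.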
3. apply() — Custom Functions
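A short sketch of the three common ways to call apply(): on a Series (one value at a time), on a DataFrame with axis=1 (one row at a time), and on a DataFrame with the default axis=0 (one column at a time). The sample data and the salary_band helper are hypothetical, introduced only for illustration.

```python
import pandas as pd

# Hypothetical sample data; the column names are illustrative assumptions.
df = pd.DataFrame({
    "name": ["Asha", "Ben", "Chen"],
    "salary": [52000, 61000, 48000],
    "bonus": [2000, 0, 1500],
})

def salary_band(salary):
    """Hypothetical helper: classify a salary into a coarse band."""
    if salary >= 60000:
        return "high"
    if salary >= 50000:
        return "mid"
    return "low"

# Series.apply: the function receives one value at a time.
df["band"] = df["salary"].apply(salary_band)

# DataFrame.apply with axis=1: the function receives one row (as a Series) at a time.
df["total"] = df.apply(lambda row: row["salary"] + row["bonus"], axis=1)

# DataFrame.apply with the default axis=0: the function receives one column at a time.
col_ranges = df[["salary", "bonus"]].apply(lambda col: col.max() - col.min())

print(df)
print(col_ranges)
```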
4. map() — Value Mapping
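A small sketch of Series.map() with a dictionary, contrasted with replace(). The grade codes and labels are hypothetical values chosen for illustration; the point is how each method treats a value that has no entry in the dictionary.

```python
import pandas as pd

# Hypothetical grade codes; "C" is deliberately missing from the mapping.
grades = pd.Series(["A", "B", "C", "A"])
labels = {"A": "Excellent", "B": "Good"}

# map(): values without a dictionary entry become NaN.
mapped = grades.map(labels)

# replace(): values without a dictionary entry are left unchanged.
replaced = grades.replace(labels)

# A common follow-up: fill the gaps left by map() with a default label.
mapped_filled = grades.map(labels).fillna("Unknown")

# map() also accepts a function, applied to each element.
lowered = grades.map(str.lower)

print(mapped)
print(replaced)
```

Choosing between the two usually comes down to whether unmapped values should be flagged as missing (map) or passed through (replace).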
5. String Transformations
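A brief sketch of the str accessor on a text column: trimming, splitting into columns, changing case, extracting digits, and testing for a substring. The raw strings and their "name;state;id" layout are an assumed format made up for this example.

```python
import pandas as pd

# Hypothetical raw records in an assumed "name;state;id N" layout.
df = pd.DataFrame({
    "raw": ["  alice SMITH;NY;id 101 ", "Bob Jones;CA;id 27", "carol white;TX;id 9"],
})

# Remove leading and trailing whitespace.
clean = df["raw"].str.strip()

# split(..., expand=True) returns a DataFrame with one column per piece.
parts = clean.str.split(";", expand=True)
df["name"] = parts[0].str.title()
df["state"] = parts[1]

# extract() with one capture group returns a one-column DataFrame of strings;
# it captures the first run of digits in each value.
ids = clean.str.extract(r"(\d+)")
df["id"] = ids[0].astype(int)

# contains() returns a boolean Series.
df["is_ny"] = df["state"].str.contains("NY")

print(df[["name", "state", "id", "is_ny"]])
```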
6. Applying Multiple Transformations (Pipeline)
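One possible way to chain several transformations into a single pipeline, using assign() for derived columns and pipe() for a custom step. The data, the column names, and the add_band helper are hypothetical and exist only to illustrate the pattern.

```python
import pandas as pd

# Hypothetical raw data; the column names are illustrative assumptions.
raw = pd.DataFrame({
    "name": [" asha ", "BEN", "chen", "dina"],
    "dept": ["hr", "it", "it", "hr"],
    "salary": [52000, 61000, 48000, 55000],
})

def add_band(frame, threshold=50000):
    """Hypothetical pipeline step: flag salaries at or above a threshold."""
    return frame.assign(high_paid=frame["salary"] >= threshold)

result = (
    raw
    # assign() returns a new DataFrame; lambdas see the frame at that step.
    .assign(
        name=lambda d: d["name"].str.strip().str.title(),
        dept=lambda d: d["dept"].str.upper(),
        salary_k=lambda d: d["salary"] / 1000,
    )
    # pipe() slots a custom function into the chain.
    .pipe(add_band, threshold=50000)
    # Keep only the rows flagged by the previous step.
    .loc[lambda d: d["high_paid"]]
    .sort_values(["dept", "salary"], ascending=[True, False])
    .reset_index(drop=True)
)

print(result)
```

Each step returns a new DataFrame, so raw is left unmodified and the chain reads top to bottom in the order the transformations happen.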
7. Common Mistakes
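- map() vs apply() on a DataFrame: Series.map() works element by element on a single Series. For row-wise or column-wise operations on a DataFrame, use apply(). (pandas 2.1 and later also provide DataFrame.map() for element-wise operations, replacing the deprecated applymap().)
- apply() with axis=1 is slow: it calls a Python function once per row, so for large DataFrames vectorized operations are much faster. Use apply(axis=1) only when vectorization isn't possible.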
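To make the second point concrete, here is a sketch that computes the same results twice: once with apply(axis=1) and once with vectorized column operations. The price and qty columns are hypothetical, and numpy.where is used as one common way to vectorize a simple condition.

```python
import numpy as np
import pandas as pd

# Hypothetical order data; the column names are illustrative assumptions.
df = pd.DataFrame({"price": [10.0, 20.0, 5.0], "qty": [3, 1, 10]})

# Row-wise apply: a Python function is called once per row.
slow_total = df.apply(lambda row: row["price"] * row["qty"], axis=1)

# Vectorized: a single operation over whole columns.
fast_total = df["price"] * df["qty"]

# Simple conditional logic can often be vectorized with numpy.where.
slow_label = df.apply(lambda row: "bulk" if row["qty"] >= 5 else "single", axis=1)
fast_label = pd.Series(np.where(df["qty"] >= 5, "bulk", "single"), index=df.index)

# Both approaches give the same values.
print(slow_total.equals(fast_total))
print(slow_label.tolist() == fast_label.tolist())
```

On a three-row frame the difference is invisible; the gap grows with the number of rows because the row-wise version pays Python function-call overhead for every row.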
8. MCQs
Question 1
df['col'].apply(lambda x: x*2) applies?
Question 2
df.apply(func, axis=1) applies func to?
Question 3
map({'A': 1, 'B': 2}) for unmapped values returns?
Question 4
replace({'A': 1}) for unmapped values returns?
Question 5
sort_values(['A','B'], ascending=[True, False])?
Question 6
str.split(';', expand=True) returns?
Question 7
str.extract(r'(\d+)') returns?
Question 8
assign() in method chaining is for?
Question 9
df['col'].rank(method='min') for ties?
Question 10
Method chaining advantage?
9. Interview Questions
- Q: What is the difference between map(), apply(), and applymap() in Pandas?
- Q: How do you chain multiple transformations in Pandas?
10. Summary
Transformation toolkit: sort_values() for ordering, apply() for custom functions applied row-wise or column-wise, map() for value replacement, the string accessor str.* for text operations, and assign() for clean method chaining. Always prefer vectorized operations over apply(axis=1) for performance.