Python for Data Science
CHAPTER 07 Beginner

Working with Strings and Lists

Updated: May 18, 2026
5 min read

1. Chapter Introduction

In data science, you rarely work with a single number. You work with collections of data (Lists) and textual information (Strings). Cleaning text and extracting specific items from a large list are foundational skills. This chapter teaches you how to slice and manipulate Strings and Lists, and finishes with List Comprehensions, a compact syntax for building one list from another.

2. String Operations and Methods

A String is a sequence of characters. Python has powerful built-in methods to clean and manipulate text.

python
text = "   Data Science is AWESOME!   "

# 1. Stripping whitespace (Crucial for data cleaning!)
clean_text = text.strip()
print(clean_text) # "Data Science is AWESOME!"

# 2. Case formatting
print(clean_text.lower()) # "data science is awesome!"
print(clean_text.upper()) # "DATA SCIENCE IS AWESOME!"
print(clean_text.title()) # "Data Science Is Awesome!"

# 3. Replacing characters
no_spaces = clean_text.replace(" ", "_")
print(no_spaces) # "Data_Science_is_AWESOME!"

# 4. Splitting strings into a List
words = clean_text.split(" ")
print(words) # ['Data', 'Science', 'is', 'AWESOME!']
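
Real-world text often has more than one space between words, and in that case split(" ") leaves empty strings in the result. As a small sketch (the messy string below is made up for illustration), calling split() with no argument splits on any run of whitespace, and " ".join(...) glues the words back together with single spaces:

```python
# Hypothetical input with uneven spacing between words
messy = "  Data   Science  is   AWESOME!  "

# split(" ") produces an empty string for every extra space
print(messy.strip().split(" "))

# split() with no argument splits on any run of whitespace
# and ignores leading and trailing whitespace
words = messy.split()
print(words)  # ['Data', 'Science', 'is', 'AWESOME!']

# join() is the reverse of split(): it builds one string from a list
normalized = " ".join(words)
print(normalized)  # "Data Science is AWESOME!"
```

The split-then-join pattern is a common way to collapse repeated spaces without reaching for regular expressions.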

3. List Operations

A List is an ordered, mutable (changeable) collection of items, defined by square brackets [].

python
fruits = ["Apple", "Banana", "Cherry"]

# Adding items
fruits.append("Orange") # Adds to the end
fruits.insert(1, "Mango") # Inserts at index 1

# Removing items
fruits.remove("Banana") # Removes by value
popped = fruits.pop() # Removes and returns the LAST item

print(fruits) # ['Apple', 'Mango', 'Cherry']
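
Two details of the removal methods are easy to miss: remove() raises a ValueError when the value is not in the list, and pop() accepts an optional index. The sketch below continues from the final state of the list above and is only an illustration of those two behaviors:

```python
fruits = ["Apple", "Mango", "Cherry"]

# remove() raises ValueError if the value is missing,
# so check with `in` before removing
if "Banana" in fruits:
    fruits.remove("Banana")

# pop() also accepts an index: pop(0) removes and returns the FIRST item
first = fruits.pop(0)
print(first)   # "Apple"
print(fruits)  # ['Mango', 'Cherry']

# len() reports how many items are left
remaining = len(fruits)
print(remaining)  # 2
```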

4. Indexing and Slicing (The Zero-Index Rule)

Python is Zero-Indexed. The first item is at index 0. You can extract subsets of lists and strings using Slicing: [start : stop : step]. *Note: The stop index is NOT included.*

python
# Indexes:  0    1    2    3    4
numbers = [10, 20, 30, 40, 50]

# Indexing
print(numbers[0])  # First item: 10
print(numbers[-1]) # Last item: 50 (Negative indexing is very useful!)

# Slicing
print(numbers[0:3]) # Indexes 0, 1, 2 (Output: [10, 20, 30])
print(numbers[2:])  # Index 2 to the end (Output: [30, 40, 50])
print(numbers[:3])  # Start to index 2 (Output: [10, 20, 30])

# Step (Get every 2nd item)
print(numbers[0:5:2]) # Output: [10, 30, 50]

*Fun Fact: Slicing works the exact same way on Strings! "Python"[0:2] returns "Py".*
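
A few more slicing patterns may be useful here. This is a short sketch that combines negative indexes and the step value, on both a list and a string; the variable names are chosen only for illustration:

```python
numbers = [10, 20, 30, 40, 50]

# Negative start: the last two items
last_two = numbers[-2:]
print(last_two)  # [40, 50]

# Negative step: a reversed copy (the original list is not modified)
reversed_numbers = numbers[::-1]
print(reversed_numbers)  # [50, 40, 30, 20, 10]
print(numbers)           # [10, 20, 30, 40, 50]

# The same rules apply to strings
language = "Python"
prefix = language[0:2]
suffix = language[-3:]
every_other = language[::2]
print(prefix)       # "Py"
print(suffix)       # "hon"
print(every_other)  # "Pto"
```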

5. List Comprehensions (The Pythonic Way)

If you have a list of prices and want to double them, the standard way uses a for loop. A List Comprehension does it in a single, elegant line.

python
prices = [10, 20, 30, 40]

# The Old Way (for loop)
doubled_old = []
for p in prices:
    doubled_old.append(p * 2)

# The Pythonic Way (List Comprehension)
# Syntax: [expression for item in list]
doubled_new = [p * 2 for p in prices]

print(doubled_new) # [20, 40, 60, 80]

# You can even add IF statements! (Only double if price > 20)
filtered = [p * 2 for p in prices if p > 20]
print(filtered) # [60, 80]
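
To make the filtered version less mysterious, it may help to see the loop it replaces. The sketch below writes that loop out, and then shows that the expression part of a comprehension can be any expression, such as a string method call. The names list is hypothetical sample data:

```python
prices = [10, 20, 30, 40]

# Loop equivalent of: [p * 2 for p in prices if p > 20]
filtered_loop = []
for p in prices:
    if p > 20:
        filtered_loop.append(p * 2)
print(filtered_loop)  # [60, 80]

# The expression can call a method on each item
names = ["alice", "bob", "charlie"]  # hypothetical sample data
capitalized = [name.title() for name in names]
print(capitalized)  # ['Alice', 'Bob', 'Charlie']
```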

6. Mini Project: Text Cleaner

Let's write a script that takes a list of messy email addresses, cleans them, and extracts the domain names using list comprehensions and string methods.

python
messy_emails = ["  Alice@Gmail.com", "BOB@yahoo.com  ", "  charlie@GMAIL.com"]

# 1. Clean the emails (lowercase and strip spaces)
clean_emails = [email.strip().lower() for email in messy_emails]
print("Cleaned:", clean_emails)

# 2. Extract domains (split by '@' and take the second part [1])
domains = [email.split('@')[1] for email in clean_emails]
print("Domains:", domains)

*Output:*
Cleaned: ['alice@gmail.com', 'bob@yahoo.com', 'charlie@gmail.com']
Domains: ['gmail.com', 'yahoo.com', 'gmail.com']
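
The script above assumes every entry contains an "@". If one does not, split('@')[1] raises an IndexError. One possible way to guard against that, sketched here with made-up input, is to filter the list before extracting domains. Checking for "@" is a deliberately simple test for this exercise, not real email validation:

```python
# Hypothetical input: the last two entries are not usable email addresses
messy_emails = ["  Alice@Gmail.com", "BOB@yahoo.com  ", "   ", "not-an-email"]

# Same cleaning step as before
clean_emails = [email.strip().lower() for email in messy_emails]

# Keep only entries that contain "@", so split("@")[1] cannot fail
valid_emails = [email for email in clean_emails if "@" in email]

domains = [email.split("@")[1] for email in valid_emails]
print("Domains:", domains)  # Domains: ['gmail.com', 'yahoo.com']
```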

7. Common Mistakes

  • Zero-Index Confusion: Trying to get the 3rd item using list[3]. The 3rd item is list[2]; list[3] gets the 4th item.
  • Slicing Stop Index: Expecting list[0:2] to return three items. It returns TWO (index 0 and 1), because the stop index is exclusive.
  • String Immutability: You cannot change a string in place. text.replace("A", "B") returns a new string and leaves text unchanged, so the call has no visible effect unless you reassign it: text = text.replace("A", "B").
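
The three mistakes above can be reproduced in a few lines. This is a minimal sketch with illustrative variable names (items and text are not from any earlier example):

```python
items = ["a", "b", "c", "d"]

# Zero-index: the 3rd item lives at index 2
third = items[2]
print(third)  # "c"

# Exclusive stop: [0:2] returns two items, not three
first_two = items[0:2]
print(first_two)  # ['a', 'b']

# Immutability: replace() returns a new string
text = "BANANA"
text.replace("A", "B")         # result is discarded
unchanged = text
print(unchanged)               # "BANANA"
text = text.replace("A", "B")  # reassign to keep the result
print(text)                    # "BBNBNB"

# Item assignment on a string raises TypeError
try:
    text[0] = "X"
except TypeError as error:
    print("Cannot assign:", error)
```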

8. MCQs

Question 1

Which string method removes leading and trailing spaces?

Question 2

Which method converts a string into a list of words?

Question 3

If data = [10, 20, 30, 40], what is data[1]?

Question 4

What is the syntax to get the LAST item in a Python list?

Question 5

If text = "Python", what does text[0:3] return?

Question 6

Which list method adds an item to the very end of the list?

Question 7

What is the output of [x * 2 for x in [1, 2, 3]]?

Question 8

This one-liner syntax ([expression for item in list]) is known as a?

Question 9

Can you change a string after it is created (e.g., text[0] = "A")?

a) Yes
b) No, strings are immutable

Answer: b

Question 10

In slicing [start:stop:step], is the stop index included in the output?

a) Yes
b) No, it stops right before that index

Answer: b

9. Interview Questions

  • Q: Explain how List Comprehensions work and why they are preferred over standard for loops in Python.
  • Q: You have a string "2023-10-15". Write a one-liner to extract just the year "2023" using slicing.
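
For the second question, one possible answer is sketched below. It assumes the date always has a four-digit year at the start; the split-based alternative is shown only for comparison:

```python
date_string = "2023-10-15"

# Slice from the start up to (not including) index 4
year = date_string[:4]
print(year)  # "2023"

# Alternative: split on "-" and take the first part
year_alt = date_string.split("-")[0]
print(year_alt)  # "2023"
```

Note that both versions return the year as a string; wrap it in int() if a number is needed.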

10. Summary

Strings and Lists are foundational. String methods like .strip(), .lower(), and .replace() are your primary tools for cleaning messy text data, and each returns a new string because strings are immutable. Because Python is zero-indexed, you access list and string elements starting at 0, and can slice them using [start:stop:step], where the stop index is excluded. Finally, List Comprehensions provide a compact, "Pythonic" way to generate and filter lists in a single line of code.

11. Next Chapter Recommendation

In Chapter 8: Tuples, Sets, and Dictionaries, we expand our data structure knowledge, learning how to store fixed data (Tuples), find unique items (Sets), and map relationships using Key-Value pairs (Dictionaries).
