# CHAPTER 7
Working with Strings and Lists
1. Chapter Introduction
In data science, you rarely work with a single number. You work with collections of data (Lists) and textual information (Strings). Cleaning text and extracting specific items from a massive list are foundational skills. This chapter teaches you how to slice, dice, and manipulate Strings and Lists, culminating in Python's most elegant feature: List Comprehensions.
2. String Operations and Methods
A String is a sequence of characters. Python has powerful built-in methods to clean and manipulate text.
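As a minimal sketch of the most common cleaning methods, assuming a hypothetical messy string named raw (the variable names are illustrative only):

```python
# Hypothetical messy input; the names here are chosen for illustration.
raw = "  Hello, Data Science!  "

stripped = raw.strip()                           # drop leading/trailing whitespace
lowered = stripped.lower()                       # convert to lowercase
replaced = stripped.replace("Data", "Computer")  # returns a NEW string
words = stripped.split()                         # split on whitespace into a list
joined = "-".join(words)                         # glue the list back into one string

print(stripped)   # Hello, Data Science!
print(lowered)    # hello, data science!
print(replaced)   # Hello, Computer Science!
print(words)      # ['Hello,', 'Data', 'Science!']
print(joined)     # Hello,-Data-Science!
```

Each method returns a new value and leaves raw untouched, which is why every result is stored in its own variable.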
3. List Operations
A List is an ordered, mutable (changeable) collection of items, defined by square brackets [].
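A short sketch of the usual list operations, assuming a hypothetical list named prices. Unlike string methods, most of these change the list in place:

```python
# Hypothetical list of prices; the name is illustrative only.
prices = [10, 20, 30]

prices.append(40)     # add one item to the end       -> [10, 20, 30, 40]
prices.insert(0, 5)   # insert 5 at index 0           -> [5, 10, 20, 30, 40]
prices.remove(20)     # remove the first value 20     -> [5, 10, 30, 40]
last = prices.pop()   # remove and return the last    -> 40, list is [5, 10, 30]
prices[0] = 7         # lists are mutable: reassign   -> [7, 10, 30]

print(prices)         # [7, 10, 30]
print(last)           # 40
print(len(prices))    # 3
```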
4. Indexing and Slicing (The Zero-Index Rule)
Python is Zero-Indexed. The first item is at index 0. You can extract subsets of lists and strings using Slicing: [start : stop : step]. *Note: The stop index is NOT included.*
*Fun Fact: Slicing works the exact same way on Strings! "Python"[0:2] returns "Py".*
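The indexing and slicing rules can be sketched as follows, assuming a hypothetical list named data and the string "Python":

```python
data = [10, 20, 30, 40, 50]

first = data[0]             # 10  (index 0 is the first item)
last = data[-1]             # 50  (negative indexes count from the end)
head = data[0:2]            # [10, 20]  (stop index 2 is excluded)
tail = data[2:]             # [30, 40, 50]  (omitted stop means "to the end")
every_other = data[::2]     # [10, 30, 50]  (step of 2)
reversed_copy = data[::-1]  # [50, 40, 30, 20, 10]  (step of -1 walks backwards)

text = "Python"
prefix = text[0:2]          # 'Py'  (strings slice the same way)

print(first, last, head, tail, every_other, reversed_copy, prefix)
```

Slicing always produces a new list or string; data and text are left unchanged.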
5. List Comprehensions (The Pythonic Way)
If you have a list of prices and want to double them, the standard way uses a for loop. A List Comprehension does it in a single, elegant line.
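A side-by-side sketch of the two approaches, assuming a hypothetical prices list. The optional if clause at the end shows how a comprehension can also filter:

```python
prices = [10, 20, 30]

# Standard way: build the result step by step with a for loop.
doubled_loop = []
for price in prices:
    doubled_loop.append(price * 2)

# List comprehension: the same result in one line.
doubled = [price * 2 for price in prices]

# A comprehension can also filter with a trailing if clause.
expensive = [price for price in prices if price > 15]

print(doubled_loop)  # [20, 40, 60]
print(doubled)       # [20, 40, 60]
print(expensive)     # [20, 30]
```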
6. Mini Project: Text Cleaner
Let's write a script that takes a list of messy email addresses, cleans them, and extracts the domain names using list comprehensions and string methods.
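A minimal sketch of such a script. The input list raw_emails is an assumption: its values are hypothetical and were chosen so that the results match the output shown next.

```python
# Hypothetical messy input (assumed for illustration).
raw_emails = ["  Alice@Gmail.com ", "BOB@YAHOO.COM", " charlie@gmail.com  "]

# Step 1: strip surrounding whitespace and lowercase every address.
cleaned = [email.strip().lower() for email in raw_emails]

# Step 2: keep only the part after the "@" sign.
domains = [email.split("@")[1] for email in cleaned]

print("Cleaned:", cleaned)
print("Domains:", domains)
```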
*Output:*
Cleaned: ['alice@gmail.com', 'bob@yahoo.com', 'charlie@gmail.com']
Domains: ['gmail.com', 'yahoo.com', 'gmail.com']
7. Common Mistakes
- Zero-Index Confusion: Trying to get the 3rd item using list[3]. The 3rd item is list[2]; list[3] gets the 4th item.
- Slicing Stop Index: Forgetting that list[0:2] returns TWO items (index 0 and 1), not three. The stop index is exclusive.
- String Immutability: You cannot change a string in place. text.replace("A", "B") returns a new string and leaves text unchanged, so the result is lost unless you reassign it: text = text.replace("A", "B").
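The three mistakes can be reproduced in a short sketch. The list is named items here (an illustrative choice) to avoid shadowing Python's built-in list:

```python
items = ["a", "b", "c", "d"]

# Zero-index confusion: the 3rd item lives at index 2, not 3.
third = items[2]       # 'c'
fourth = items[3]      # 'd'

# Slicing stop index: index 2 is excluded, so only two items come back.
first_two = items[0:2]  # ['a', 'b']

# String immutability: replace() returns a new string.
text = "BANANA"
text.replace("A", "B")         # result is discarded, text is unchanged
unchanged = text               # still 'BANANA'
text = text.replace("A", "B")  # reassign to keep the result: 'BBNBNB'

# Assigning to a single character is not allowed at all.
try:
    text[0] = "X"
    raised = False
except TypeError:
    raised = True

print(third, fourth, first_two, unchanged, text, raised)
```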
8. MCQs
Q1. Which string method removes leading and trailing spaces?
Q2. Which method converts a string into a list of words?
Q3. If data = [10, 20, 30, 40], what is data[1]?
Q4. What is the syntax to get the LAST item in a Python list?
Q5. If text = "Python", what does text[0:3] return?
Q6. Which list method adds an item to the very end of the list?
Q7. What is the output of [x * 2 for x in [1, 2, 3]]?
Q8. This one-liner syntax ([expression for item in list]) is known as what?
Q9. Can you change a character of a string in place (e.g., text[0] = "A")? a) Yes b) No, strings are immutable. Answer: b
Q10. In slicing [start:stop:step], is the stop index included in the output? a) Yes b) No, it stops right before that index. Answer: b
9. Interview Questions
- Q: Explain how List Comprehensions work and why they are preferred over standard for loops in Python.
- Q: You have a string "2023-10-15". Write a one-liner to extract just the year "2023" using slicing.
10. Summary
Strings and Lists are foundational. String methods like .strip(), .lower(), and .replace() are your primary tools for cleaning messy text data. Because Python is zero-indexed, you access list and string elements starting at 0, and can slice them using [start:stop]. Finally, List Comprehensions provide a compact, "Pythonic" way to generate and filter lists in a single line of code.