CHAPTER 25 Beginner

Pandas with SQL Databases

Updated: May 18, 2026

5 min read

# CHAPTER 25

Pandas with SQL Databases

1. Chapter Introduction

Most production data lives in SQL databases — MySQL, PostgreSQL, SQLite. Pandas bridges the gap between databases and Python analytics: read SQL results into DataFrames, transform data in Python, write results back to the database.

2. SQLite Setup (Zero-Config)

python

123456789101112131415161718192021222324252627282930313233343536

import pandas as pd
import sqlite3
import numpy as np

# Create in-memory SQLite database (no server needed)
conn = sqlite3.connect(&#039;sales.db')   # Or ':memory:' for in-memory

# Create tables from DataFrames
products = pd.DataFrame({
    &#039;product_id': [1, 2, 3, 4, 5],
    &#039;name': ['Laptop', 'Phone', 'Monitor', 'Desk', 'Chair'],
    &#039;category': ['Electronics', 'Electronics', 'Electronics', 'Furniture', 'Furniture'],
    &#039;price': [1200, 500, 300, 450, 200]
})

customers = pd.DataFrame({
    &#039;customer_id': [1, 2, 3, 4],
    &#039;name': ['Alice', 'Bob', 'Carol', 'David'],
    &#039;city': ['NYC', 'LA', 'Chicago', 'Houston']
})

np.random.seed(42)
n = 100
orders = pd.DataFrame({
    &#039;order_id': range(1, n+1),
    &#039;customer_id': np.random.randint(1, 5, n),
    &#039;product_id': np.random.randint(1, 6, n),
    &#039;quantity': np.random.randint(1, 10, n),
    &#039;order_date': pd.date_range('2024-01-01', periods=n, freq='2D').astype(str)
})

# Write to SQLite
products.to_sql(&#039;products', conn, if_exists='replace', index=False)
customers.to_sql(&#039;customers', conn, if_exists='replace', index=False)
orders.to_sql(&#039;orders', conn, if_exists='replace', index=False)
print("Tables created successfully!")

3. Reading SQL Data

python

123456789101112131415161718192021222324252627282930

# Simple query
df = pd.read_sql("SELECT * FROM products", conn)
print(df)

# Filtered query
df_electronics = pd.read_sql("""
    SELECT name, price
    FROM products
    WHERE category = &#039;Electronics'
    ORDER BY price DESC
""", conn)
print(df_electronics)

# Join query
df_orders = pd.read_sql("""
    SELECT
        o.order_id,
        c.name as customer_name,
        p.name as product_name,
        p.price,
        o.quantity,
        (p.price * o.quantity) as revenue,
        o.order_date
    FROM orders o
    JOIN customers c ON o.customer_id = c.customer_id
    JOIN products p ON o.product_id = p.product_id
    ORDER BY o.order_date
""", conn)
print(f"\nOrders with details ({len(df_orders)} rows):")
print(df_orders.head(10))

4. MySQL Integration (with SQLAlchemy)

python

12345678910111213141516171819202122

# MySQL connection (requires: pip install sqlalchemy mysql-connector-python)
from sqlalchemy import create_engine

# Connection string format:
# mysql+mysqlconnector://user:password@host:port/database
engine = create_engine(&#039;mysql+mysqlconnector://root:password@localhost:3306/mydb')

# Read from MySQL
df = pd.read_sql("SELECT * FROM employees WHERE department = &#039;Engineering'", engine)

# Read with params (safe parameterized queries)
dept = &#039;Engineering'
df = pd.read_sql("SELECT * FROM employees WHERE department = %(dept)s",
                 engine, params={&#039;dept': dept})

# Write DataFrame to MySQL
df_results = pd.DataFrame({
    &#039;report_date': [pd.Timestamp.now().date()],
    &#039;total_revenue': [df_orders['revenue'].sum()],
    &#039;order_count': [len(df_orders)]
})
df_results.to_sql(&#039;daily_reports', engine, if_exists='append', index=False)

5. Mini Project: Sales Database Analyzer

python

12345678910111213141516171819202122232425262728293031323334353637383940414243444546

import pandas as pd
import sqlite3
import numpy as np

# Using our existing connection
print("=" * 55)
print("SALES DATABASE ANALYSIS")
print("=" * 55)

# 1. Revenue by product
revenue_by_product = pd.read_sql("""
    SELECT p.name as Product, p.category as Category,
           SUM(o.quantity * p.price) as Total_Revenue,
           SUM(o.quantity) as Units_Sold,
           COUNT(o.order_id) as Order_Count
    FROM orders o
    JOIN products p ON o.product_id = p.product_id
    GROUP BY p.product_id, p.name, p.category
    ORDER BY Total_Revenue DESC
""", conn)
print("\n1. Revenue by Product:")
print(revenue_by_product.to_string(index=False))

# 2. Customer analysis
customer_analysis = pd.read_sql("""
    SELECT c.name as Customer, c.city as City,
           COUNT(o.order_id) as Orders,
           SUM(o.quantity * p.price) as Total_Spent
    FROM orders o
    JOIN customers c ON o.customer_id = c.customer_id
    JOIN products p ON o.product_id = p.product_id
    GROUP BY c.customer_id, c.name, c.city
    ORDER BY Total_Spent DESC
""", conn)
print("\n2. Customer Spending:")
print(customer_analysis.to_string(index=False))

# 3. Category breakdown (SQL + Pandas)
category_rev = revenue_by_product.groupby(&#039;Category').agg(
    Total=(&#039;Total_Revenue', 'sum'),
    Products=(&#039;Product', 'count')
)
print("\n3. Revenue by Category:")
print(category_rev)

conn.close()

6. Common Mistakes

SQL injection risk: Never use f-strings for SQL queries with user input. Always use parameterized queries (%(param)s or ?).

if_exists='replace' drops entire table: Use if_exists='append' to add rows, or fail to prevent accidents.

7. MCQs

Question 1

`pd.read_sql(query, conn)` returns?

Question 2

`to_sql(if_exists='append')` does?

Question 3

SQLAlchemy is needed for?

Question 4

Parameterized queries prevent?

Question 5

`to_sql(index=False)` prevents?

Question 6

`if_exists='replace'` in to_sql?

Question 7

SQLite advantage for development?

Question 8

`pd.read_sql_table('products', engine)` reads?

Question 9

Connection string for MySQL (SQLAlchemy)?

Q10. Best approach for writing analysis results to DB? a) Manual INSERT c) DataFrame.to_sql() with if_exists='append' c) Print and copy — Answer: b

8. Interview Questions

Q: How do you read data from a MySQL database into a Pandas DataFrame?

Q: What is the risk of using f-strings to build SQL queries?

9. Summary

Pandas-SQL integration: read_sql() converts query results to DataFrames, to_sql() writes DataFrames to tables. Always use parameterized queries for security. SQLite needs no server — perfect for prototyping. SQLAlchemy engines connect to all production databases (MySQL, PostgreSQL, MSSQL).

10. Next Chapter Recommendation

In Chapter 26: Preparing Data for Machine Learning, we engineer features, encode categoricals, normalize, and split data for scikit-learn models.

Browse All 21+ Subject Areas

Quick Links

Visual Algorithm Labs

Frontend Dev

The Future of Web Architecture in 2026

Practice Quizzes

Pandas with SQL Databases #

1. Chapter Introduction #

2. SQLite Setup (Zero-Config) #

3. Reading SQL Data #

4. MySQL Integration (with SQLAlchemy) #

5. Mini Project: Sales Database Analyzer #

6. Common Mistakes #

7. MCQs #

pd.read_sql(query, conn) returns?

to_sql(if_exists='append') does?

SQLAlchemy is needed for?

Parameterized queries prevent?

to_sql(index=False) prevents?

if_exists='replace' in to_sql?

SQLite advantage for development?

pd.read_sql_table('products', engine) reads?

Connection string for MySQL (SQLAlchemy)?

8. Interview Questions #

9. Summary #

10. Next Chapter Recommendation #

Finish this Chapter

Discussion

Explore More

📖 Related Tutorials 5

Send Feedback / Bug

Feedback Submitted!

Pandas with SQL Databases

1. Chapter Introduction

2. SQLite Setup (Zero-Config)

3. Reading SQL Data

4. MySQL Integration (with SQLAlchemy)

5. Mini Project: Sales Database Analyzer

6. Common Mistakes

7. MCQs

`pd.read_sql(query, conn)` returns?

`to_sql(if_exists='append')` does?

`to_sql(index=False)` prevents?

`if_exists='replace'` in to_sql?

`pd.read_sql_table('products', engine)` reads?

8. Interview Questions

9. Summary

10. Next Chapter Recommendation