CHAPTER 29 Beginner

Performance Optimization in R

Updated: May 18, 2026



1. Chapter Introduction

Production R code often has to process millions of rows efficiently. This chapter covers profiling to identify bottlenecks, vectorization, data.table for fast data handling, Rcpp for critical paths, and parallelism for CPU-intensive workloads.

2. Profiling: Finding Bottlenecks


library(profvis)

# Profile your code to see where time is spent
profvis({
  # Simulate typical analysis workflow
  n <- 100000
  df <- data.frame(
    x = rnorm(n),
    g = sample(letters[1:5], n, replace=TRUE)
  )

  # Which of these is faster?
  # Method 1: Loop (slow)
  result1 <- c()
  for (val in df$x) {
    result1 <- c(result1, val^2)  # Growing vector — SLOW
  }

  # Method 2: Vectorized
  result2 <- df$x^2

  # Method 3: aggregate
  agg <- aggregate(x ~ g, df, mean)
})
# Opens an interactive flame graph (in the RStudio viewer, or in a browser outside RStudio)

# system.time() — quick timing
system.time({
  sapply(1:100000, function(x) x^2)
})
#    user  system elapsed
#   0.432   0.001   0.436

system.time({
  (1:100000)^2  # Vectorized — instant
})
#    user  system elapsed
#   0.001   0.000   0.001
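A single system.time() call can be noisy, so one option is to repeat the measurement and report the median. The following is a minimal base R sketch of that idea; `median_elapsed` is a hypothetical helper name chosen for illustration, not a function from any package, and the number of repetitions is an arbitrary assumption.

```r
# Hypothetical helper: call f() several times and return the median
# elapsed wall-clock time in seconds.
median_elapsed <- function(f, reps = 5) {
  times <- vapply(
    seq_len(reps),
    function(i) system.time(f())[["elapsed"]],
    numeric(1)
  )
  median(times)
}

x <- rnorm(1e5)

# Element-by-element sapply versus a single vectorized call
t_sapply <- median_elapsed(function() sapply(x, function(v) v^2))
t_vector <- median_elapsed(function() x^2)

cat("sapply:    ", t_sapply, "s\n")
cat("vectorized:", t_vector, "s\n")
```

The exact numbers depend on the machine; the point is only that the comparison rests on several runs rather than one.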

3. Vectorization vs Loops


# ─── COMMON PERFORMANCE ANTI-PATTERNS ────────────────

# SLOW: Growing vector in loop (memory reallocated each time)
# seq_len(n) is used instead of 1:n so that n = 0 yields an empty loop
slow_squares <- function(n) {
  result <- c()
  for (i in seq_len(n)) result <- c(result, i^2)  # O(n²) memory ops!
  result
}

# FAST: Pre-allocate
fast_squares_prealloc <- function(n) {
  result <- numeric(n)  # Pre-allocate!
  for (i in seq_len(n)) result[i] <- i^2
  result
}

# FASTEST: Vectorized (no R loop at all)
fastest_squares <- function(n) seq_len(n)^2

n <- 10000
# Extract the elapsed time first; print(...)["elapsed"] would print the full timing twice
cat("Slow:       ", system.time(slow_squares(n))[["elapsed"]], "s\n")
cat("Pre-alloc:  ", system.time(fast_squares_prealloc(n))[["elapsed"]], "s\n")
cat("Vectorized: ", system.time(fastest_squares(n))[["elapsed"]], "s\n")

# Rule-of-thumb speedups (rough; they vary with n, hardware, and R version):
# Loop (growing) → Loop (pre-alloc) → ~10x faster, and the gap widens as n grows
# Loop (pre-alloc) → Vectorized     → ~5-50x faster
# Base R aggregation → data.table   → ~5-20x faster on large grouped data

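# Apply family vs loop: speed is similar once the loop is pre-allocated;
# lapply is mainly more concise and avoids index bookkeeping
data_list <- replicate(1000, rnorm(100), simplify=FALSE)
system.time(lapply(data_list, mean))
system.time({ result <- numeric(1000); for (i in seq_along(data_list)) result[i] <- mean(data_list[[i]]) })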
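When every list element has the same length, the per-element means can also be computed without any R-level iteration by stacking the vectors into a matrix. The sketch below assumes that equal-length condition holds (as it does for the simulated `data_list` above) and compares `vapply()` with `colMeans()`; the seed is arbitrary.

```r
set.seed(42)
data_list <- replicate(1000, rnorm(100), simplify = FALSE)

# vapply: like sapply, but the return type is declared and checked
means_vapply <- vapply(data_list, mean, numeric(1))

# Fully vectorized: bind the 1000 vectors into a 100 x 1000 matrix,
# then take all column means in one compiled call
mat <- do.call(cbind, data_list)
means_matrix <- colMeans(mat)

# Both approaches should agree up to floating-point tolerance
print(all.equal(means_vapply, means_matrix))
```

The matrix route only applies to rectangular data; for ragged lists, `vapply()` remains the safer choice.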

4. Parallel Computation


library(parallel)
library(doParallel)
library(foreach)

# Check available cores (counts logical cores; can return NA on some platforms)
num_cores <- detectCores()
cat("Available cores:", num_cores, "\n")
n_workers <- max(1, num_cores - 1, na.rm = TRUE)  # Leave 1 core for the OS, never drop below 1

# ─── parallel package (built-in) ─────────────────────
# mclapply: parallel lapply via forking (Unix/Mac only; on Windows mc.cores must be 1)
result <- mclapply(1:8, function(x) x^2, mc.cores = n_workers)

# parLapply: uses a socket cluster, so it also works on Windows
cl <- makeCluster(n_workers)
result <- parLapply(cl, 1:8, function(x) x^2)
stopCluster(cl)

# ─── foreach + doParallel (cleaner syntax) ──────────
cl <- makeCluster(n_workers)
registerDoParallel(cl)

# Parallel for loop
result <- foreach(i = 1:1000, .combine = c) %dopar% {
  # Iterations are distributed across the workers, not one core per iteration
  simulate_heavy_computation <- function(n) sum(rnorm(n)^2)
  simulate_heavy_computation(1000)
}

stopCluster(cl)
cat("Parallel result mean:", round(mean(result), 3), "\n")

# When to parallelize:
# ✅ Independent iterations (no shared state)
# ✅ Long-running simulations (Monte Carlo, bootstrapping)
# ✅ Large ML model training loops
# ❌ Fast vectorized operations (overhead kills benefit)
# ❌ I/O bound tasks (file reading — not CPU limited)
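Bootstrapping is one of the independent-iteration workloads listed above. The following is a minimal sketch using only the built-in parallel package; `boot_mean` is a hypothetical helper, and the worker count (2), the seeds, and the number of replicates are arbitrary assumptions to adjust for your machine. The data argument is deliberately named `values` because `parLapply()` and friends already use `x` and `fun` internally.

```r
library(parallel)

set.seed(1)
sample_data <- rnorm(500)

# Hypothetical worker function: one bootstrap replicate of the mean.
# The first argument receives the replicate index and is otherwise unused.
boot_mean <- function(i, values) mean(sample(values, replace = TRUE))

cl <- makeCluster(2)               # assumption: 2 workers
clusterSetRNGStream(cl, 123)       # independent, reproducible RNG streams per worker

boot <- tryCatch(
  parSapply(cl, 1:200, boot_mean, values = sample_data),
  finally = stopCluster(cl)        # always release the workers, even on error
)

cat("Bootstrap SE of the mean:", round(sd(boot), 4), "\n")
```

Wrapping the call in `tryCatch(..., finally = stopCluster(cl))` is a design choice that avoids leaking worker processes if the computation fails part-way.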

5. Memory Optimization


# Check object sizes
object.size(mtcars)          # Bytes
format(object.size(mtcars), units="MB")

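# Efficient data types
# stringsAsFactors=FALSE keeps g as character (this is the default since R 4.0)
df <- data.frame(x=runif(1e6), g=sample(letters, 1e6, TRUE), stringsAsFactors=FALSE)
cat("Before:", format(object.size(df), units="MB"), "\n")
df$g <- factor(df$g)  # Factor stores 4-byte integer codes instead of 8-byte string pointers
cat("After factor:", format(object.size(df), units="MB"), "\n")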

# Use integer when possible (4 bytes vs 8 bytes for numeric)
x_num <- rnorm(1e6)      # ~8 MB (double)
x_int <- seq_len(1e6)    # ~4 MB (already integer, no as.integer() needed)

# data.table reads large CSV files more efficiently
library(data.table)
large_df <- fread("large_file.csv",  # Multi-threaded parser, typically much faster than read.csv
                   select=c("col1","col2"),  # Only needed cols
                   nrows=100000)             # Limit rows if testing

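# Remove objects from memory
rm(large_df)
mem <- gc()  # Triggers a garbage collection and returns a memory usage matrix
# Column 2 of the gc() matrix is already in MB (one row for Ncells, one for Vcells)
cat("After gc():", format(sum(mem[, 2]), digits=3), "MB used\n")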

# Use chunked reading for files larger than RAM
# (already covered in Chapter 13)
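The type choices above can be checked directly by measuring the same number of elements stored four different ways. This is a small illustrative sketch in base R; the vector length and the use of `letters` as categories are assumptions made for the example.

```r
n <- 1e6
g_chr <- sample(letters, n, replace = TRUE)

# Same length, four storage types
objects <- list(
  double    = numeric(n),
  integer   = integer(n),
  character = g_chr,
  factor    = factor(g_chr)
)

# object.size() reports bytes; convert to MB for readability
size_mb <- vapply(
  objects,
  function(o) as.numeric(object.size(o)) / 1024^2,
  numeric(1)
)
print(round(size_mb, 2))
```

On a typical 64-bit build, the integer and factor versions should come out at roughly half the size of the double and character versions.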

6. Common Mistakes

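Growing vectors inside loops is O(n²): Every c(result, x) allocates a new vector and copies all existing elements into it. For n = 100,000 this adds up to about 5 billion element copies (roughly n²/2), which is around 40 GB of memory traffic for doubles. Always pre-allocate.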
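The quadratic cost can be worked out with a few lines of arithmetic. The sketch below assumes that iteration k writes a fresh vector of length k, so the total number of elements written is 1 + 2 + ... + n, and that each element is an 8-byte double.

```r
n <- 1e5                               # number of loop iterations

# Iteration k writes a new vector of k elements: 1 + 2 + ... + n
elements_copied <- n * (n + 1) / 2

# Each double occupies 8 bytes
bytes_copied <- elements_copied * 8

cat(format(elements_copied, big.mark = ",", scientific = FALSE), "elements written\n")
cat(round(bytes_copied / 1e9, 1), "GB of memory traffic\n")

# A pre-allocated loop writes each element exactly once
cat(format(n, big.mark = ",", scientific = FALSE), "elements written when pre-allocated\n")
```

Note that `n` is kept as a double here: summing the integer sequence `1:n` directly would overflow R's integer range for this value of n.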

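Parallelizing fast operations: Starting and feeding workers (process startup or forking, serialization, inter-process communication) adds a fixed cost per job, often on the order of milliseconds. If that overhead is ~10 ms and the task itself takes 1 ms, the parallel version can be around 10x slower than a plain loop. As a rule of thumb, parallelize only when a single iteration takes more than ~100 ms, or batch many small tasks into each job.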
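The trade-off can be made concrete with a deliberately simple cost model. The function below is hypothetical and rests on two stated assumptions: each job pays a fixed dispatch overhead that is not shared across workers, and the task time itself divides evenly among the workers. Real overhead depends on the backend, the data shipped to workers, and the platform.

```r
# Hypothetical cost model, per job:
#   serial time   = task_ms
#   parallel time = overhead_ms + task_ms / workers
parallel_speedup <- function(task_ms, overhead_ms = 10, workers = 4) {
  serial_ms   <- task_ms
  parallel_ms <- overhead_ms + task_ms / workers
  serial_ms / parallel_ms
}

task_ms <- c(1, 10, 100, 1000)
speedup <- parallel_speedup(task_ms)

# Values below 1 mean the parallel version is slower than the serial one
print(data.frame(task_ms = task_ms, speedup = round(speedup, 2)))
```

Under these assumptions a 1 ms task ends up about ten times slower in parallel, while the speedup only approaches the worker count once the task time dwarfs the overhead.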

7. MCQs

Question 1

What does `profvis({code})` visualize?

Question 2

Why is pre-allocating `result <- numeric(n)` before a loop faster than growing the vector?

Question 3

What does `detectCores()` return?

Question 4

What is `fread()` in data.table used for?

Question 5

What does `gc()` do in R?

Question 6

When is parallelization NOT beneficial?

Question 7

How does `foreach(i=1:n) %dopar% {code}` run its iterations?

Question 8

When is a factor a better choice than a character vector?

Question 9

What does `object.size(x)` measure?

Question 10

How do `integer(n)` and `numeric(n)` differ for pre-allocation?

8. Interview Questions

Q: How do you identify and fix performance bottlenecks in R?

Q: When does parallelization NOT help in R?

9. Summary

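Profiling: profvis() produces a flame graph; system.time() is a quick timer. Vectorization: growing a vector in a loop is slowest, a pre-allocated loop is faster, and vectorized operations are fastest. CSV reading: data.table::fread() is typically much faster than read.csv (5-10x as a rule of thumb). Parallelism: use foreach with %dopar% for independent, long-running tasks (roughly >100 ms per task). Memory: use factors for categorical data and integers for counts, and drop large objects with rm(); gc() triggers a collection and reports usage. Never grow vectors in loops.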

10. Next Chapter Recommendation

In Chapter 30: Final Projects, we build 6 complete, production-grade R data science projects from scratch.



