TensorFlow Introduction
CHAPTER 12 Intermediate

Transfer Learning in TensorFlow

Updated: May 16, 2026
6 min read

1. Introduction

Companies such as Google, Microsoft, and OpenAI spend millions of dollars training massive neural networks on large compute clusters for weeks at a time. These networks learn to detect a huge variety of features in the world. As a beginner on a laptop, you cannot compete with that compute power. But you don't have to! Transfer Learning is the process of downloading one of these massive, pre-trained "brains" and replacing just its final layer so that it solves your specific problem. It is one of the most widely used techniques in modern AI.

2. Learning Objectives

By the end of this chapter, you will be able to:
  • Explain the concept of Transfer Learning.
  • Understand the ImageNet dataset.
  • Import pre-trained models like MobileNetV2 from tf.keras.applications.
  • Freeze base layers to prevent destroying pre-trained weights.
  • Add a custom Dense "head" to classify a custom dataset.

3. The Concept of Transfer Learning

Imagine a master chef who spent 10 years learning how to expertly chop vegetables, balance spices, and manage a kitchen. If you want them to bake a specific type of pie, you don't need to teach them how to hold a knife from scratch. You just give them the pie recipe. Similarly, a CNN trained on millions of images already knows how to detect edges, fur, eyes, and metal. We simply chop off the final "prediction" layer of the CNN and attach our own layer (e.g., "Is this a Hotdog or Not Hotdog?"). The network already knows what food looks like; it just needs a few minutes to learn what a hotdog looks like!

4. ImageNet and Famous Architectures

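Most pre-trained models in Computer Vision were trained on ImageNet. The full dataset contains about 14 million images; the pre-trained weights you download come from its ILSVRC subset of roughly 1.28 million training images categorized into 1,000 different classes. TensorFlow provides many of these famous architectures in tf.keras.applications: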
  • ResNet50: Extremely deep, highly accurate, but heavy.
  • VGG16: Older, simple architecture, great for learning.
  • MobileNetV2: Highly optimized to be fast and lightweight, perfect for running on mobile phones. (We will use this one!)
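To make the "1,000 classes" concrete, here is a minimal sketch that builds MobileNetV2 with and without its classification layer and compares the output shapes. It uses the include_top argument that appears in the mini project, and it assumes weights=None (random weights) purely so that nothing is downloaded; the shapes do not depend on the weight values.

```python
from tensorflow.keras.applications import MobileNetV2

# Assumption: weights=None (random initialisation) so no download is needed.
# Output shapes are the same as with weights='imagenet'.
full_model = MobileNetV2(weights=None, include_top=True)
headless_model = MobileNetV2(weights=None, include_top=False, input_shape=(160, 160, 3))

# With the top: one score per ImageNet class.
full_shape = tuple(full_model.output.shape)
print(full_shape)      # (None, 1000)

# Without the top: a 5x5 grid of 1280-channel feature maps for 160x160 inputs.
headless_shape = tuple(headless_model.output.shape)
print(headless_shape)  # (None, 5, 5, 1280)
```

The headless output is what a custom head gets to work with: generic image features rather than ImageNet class scores.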

5. Mini Project: Custom Image Classifier

Let's use Transfer Learning to build a world-class image classifier in minutes.
python
import tensorflow as tf
from tensorflow.keras.applications import MobileNetV2
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, GlobalAveragePooling2D

# 1. Download the Pre-trained Base Model (MobileNetV2)
# include_top=False means we are chopping off the final 1000-class prediction layer!
base_model = MobileNetV2(weights='imagenet', include_top=False, input_shape=(160, 160, 3))

# 2. CRITICAL: Freeze the Base Model
# We do not want Backpropagation to destroy the weights Google spent millions training!
base_model.trainable = False

# 3. Build our Custom Model on top of the Base
model = Sequential([
    base_model,                                # The pre-trained brain
    GlobalAveragePooling2D(),                  # Averages each feature map to one value (a compact alternative to Flatten)
    Dense(128, activation='relu'),             # Our custom learning layer
    Dense(1, activation='sigmoid')             # Our final Binary Prediction (e.g., Cat vs Dog)
])

# 4. Compile
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

# 5. Train!
# Because the base_model is frozen, training is much faster than training the whole network.
# Only the new Dense layers we stacked on top are updated.
print("Training custom head...")
# train_dataset is a placeholder: supply your own tf.data.Dataset of (image, label) batches.
# model.fit(train_dataset, epochs=5)
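To see why training the head is so cheap, the following sketch counts how many parameters are actually trainable once the base is frozen. It rebuilds the same model as above, but assumes weights=None so the check runs without a download (parameter counts are unaffected); count_params is a small helper written for this illustration, not a Keras function.

```python
import math

import tensorflow as tf
from tensorflow.keras.applications import MobileNetV2
from tensorflow.keras.layers import Dense, GlobalAveragePooling2D
from tensorflow.keras.models import Sequential

# Assumption: weights=None only so this check runs without a download.
# Parameter counts are identical with weights='imagenet'.
base_model = MobileNetV2(weights=None, include_top=False, input_shape=(160, 160, 3))
base_model.trainable = False

model = Sequential([
    base_model,
    GlobalAveragePooling2D(),
    Dense(128, activation='relu'),
    Dense(1, activation='sigmoid'),
])

# Run one dummy batch so every layer has created its weights.
_ = model(tf.zeros((1, 160, 160, 3)))


def count_params(weights):
    """Illustrative helper: total number of scalar values in a list of weights."""
    return sum(math.prod(tuple(w.shape)) for w in weights)


trainable_params = count_params(model.trainable_weights)
frozen_params = count_params(model.non_trainable_weights)

print("trainable (custom head):", trainable_params)
print("frozen (base model):    ", frozen_params)
```

The trainable count is just the two Dense layers: (1280 × 128 + 128) + (128 × 1 + 1) = 164,097 parameters, a small fraction of the frozen base.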

6. Fine-Tuning (Advanced)

Once you have trained your custom Dense layers and reached a decent accuracy, you can often gain a few more percentage points of accuracy using Fine-Tuning (the exact gain depends on your dataset).
  1. Unfreeze the top layers of the base_model.
  2. Re-compile the model with a *very small* learning rate (e.g., 1e-5).
  3. Train for a few more epochs.
This allows the pre-trained weights to make micro-adjustments for your dataset without destroying their foundational knowledge.
python
# Unfreeze the base model
base_model.trainable = True

# Freeze the bottom 100 layers, only fine-tune the top layers
for layer in base_model.layers[:100]:
    layer.trainable = False

# Recompile with a VERY LOW learning rate (changes to `trainable` only take effect after compile)
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-5), loss='binary_crossentropy', metrics=['accuracy'])
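Before training again, it can be worth checking that the freeze/unfreeze split is what you intended. The sketch below repeats the unfreezing steps on a fresh base model and lists which layers ended up trainable. As an assumption for illustration, it uses weights=None so nothing is downloaded; the layer structure is the same either way.

```python
from tensorflow.keras.applications import MobileNetV2

# Assumption: weights=None only to avoid a download; layer names and order are unchanged.
base_model = MobileNetV2(weights=None, include_top=False, input_shape=(160, 160, 3))

# Same steps as above: unfreeze everything, then re-freeze the first 100 layers.
base_model.trainable = True
for layer in base_model.layers[:100]:
    layer.trainable = False

total_layers = len(base_model.layers)
trainable_layers = [layer.name for layer in base_model.layers if layer.trainable]

print(f"{len(trainable_layers)} of {total_layers} layers are trainable")
print("first unfrozen layer:", trainable_layers[0])
```

If the count is not what you expect, adjust the slice index (100 here) before recompiling.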

7. Common Mistakes

  • Forgetting to Freeze the Base Model: If you initialize the model and immediately run fit() without base_model.trainable = False, the large gradients produced by your randomly initialized Dense layers will backpropagate into the base model and can quickly wreck the pre-trained weights.
  • Wrong Preprocessing: Each pre-trained model expects its input to be preprocessed exactly as it was during training. MobileNetV2, for example, expects pixels scaled from -1 to 1, not 0 to 1 or 0 to 255. Always use the model's built-in preprocessing function, here tf.keras.applications.mobilenet_v2.preprocess_input.
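As a quick check of what the preprocessing function does, this small sketch passes a one-pixel "image" with illustrative channel values 0, 127.5 and 255 through preprocess_input and compares the result with the equivalent manual formula.

```python
import numpy as np
from tensorflow.keras.applications.mobilenet_v2 import preprocess_input

# A 1x1 RGB "image" whose channels hold illustrative values on the 0-255 scale.
raw_pixels = np.array([[[0.0, 127.5, 255.0]]], dtype=np.float32)

# preprocess_input may modify NumPy arrays in place, so pass a copy.
scaled = preprocess_input(raw_pixels.copy())
print(scaled.ravel().tolist())  # [-1.0, 0.0, 1.0]

# The same mapping written out by hand: [0, 255] -> [-1, 1].
manual = raw_pixels / 127.5 - 1.0
print(np.allclose(scaled, manual))  # True
```

In a tf.data pipeline, the same function can be applied to every batch with something like train_dataset.map(lambda x, y: (preprocess_input(x), y)), assuming the dataset yields (image, label) pairs with pixel values in the 0 to 255 range.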

8. Best Practices

  • Always start with Transfer Learning: For any Computer Vision task in the real world, you should *never* build a CNN from scratch unless you are researching new architectures. Transfer Learning will save you weeks of time and require significantly less data.

9. Exercises

  1. What does the parameter include_top=False do when importing a pre-trained model?
  2. Why is it critical to set a very low learning rate when performing the "Fine-Tuning" step?

10. MCQ Quiz with Answers

Question 1

In Transfer Learning, what does "Freezing" a layer mean?

Question 2

Which of the following is a highly optimized, lightweight pre-trained model designed for mobile and edge devices?

11. Interview Questions

  • Q: Explain the two-step process of Transfer Learning (Feature Extraction followed by Fine-Tuning).
  • Q: Why does Transfer Learning allow you to train highly accurate models even if you only have a very small dataset (e.g., 500 images)?

12. FAQs

Q: Does Transfer Learning work for text (NLP) too? A: Absolutely! In fact, modern NLP is largely built on Transfer Learning. Models like BERT and GPT are massive pre-trained language models that developers fine-tune for specific tasks.

13. Summary

Transfer Learning is the ultimate cheat code in Machine Learning. By downloading state-of-the-art architectures trained on supercomputers, freezing their feature-extracting brains, and attaching our own prediction heads, we can build world-class AI applications on a standard laptop in a matter of minutes.

14. Next Chapter Recommendation

We have conquered Computer Vision. Now, we must tackle human language. How does a neural network, which only understands numbers, learn to read English sentences? In Chapter 13: Natural Language Processing Basics, we will learn how to turn words into math.
