1 Introduction & Environment Setup
1.1 What is Deep Learning?
Deep Learning (DL) is a subset of machine learning that uses artificial neural networks with multiple layers (hence “deep”) to learn hierarchical representations of data. While traditional machine learning requires manual feature engineering, deep learning automatically discovers the features needed for detection or classification.
Example: To classify images of cats vs dogs:

- Traditional ML: You manually define features (edges, colors, textures), then train a classifier
- Deep Learning: The neural network automatically learns which features matter (whiskers, ears, fur patterns)
1.2 Deep Learning vs Machine Learning
When should you use deep learning instead of classical ML?
Use Deep Learning when:

- You have large datasets (thousands to millions of examples)
- You’re working with unstructured data (images, text, audio, video)
- Complex patterns exist that are hard to hand-engineer
- You have GPU resources for training
- High accuracy is more important than interpretability
Use Classical ML when:

- You have small datasets (hundreds to thousands of examples)
- You’re working with structured/tabular data
- You need interpretability (why did the model decide this?)
- You have limited compute resources
- Faster training and simpler deployment matter
Real-world examples:

- DL: Image recognition, language translation, speech recognition, autonomous driving
- Classical ML: Fraud detection on structured data, customer churn prediction, house price estimation
1.3 Neural Network Intuition
A neural network is inspired by how the brain processes information:
```
Input   →   Hidden Layers   →   Output
  ↓              ↓                ↓
 [x₁]        [neurons]       [prediction]
 [x₂]   →    [weights]   →   [0.87 = cat]
 [x₃]        [activation]
```
Each layer transforms the data, extracting increasingly abstract features (see the sketch after this list):

- Layer 1: Detects edges and simple patterns
- Layer 2: Combines edges into shapes
- Layer 3: Combines shapes into parts (eyes, ears)
- Output: Makes final decision (cat or dog)
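To make this concrete, here is a minimal PyTorch sketch of the diagram above. The layer sizes and the 0.87 output are illustrative only; an untrained network will print some other number:

```python
import torch
import torch.nn as nn

# Three inputs → a small hidden layer → one prediction, as in the diagram.
net = nn.Sequential(
    nn.Linear(3, 8),   # [x₁, x₂, x₃] → 8 hidden neurons (weights)
    nn.ReLU(),         # activation
    nn.Linear(8, 1),   # hidden neurons → single output
    nn.Sigmoid()       # squash to a probability, e.g. 0.87 = "cat"
)

x = torch.tensor([[0.5, -1.2, 3.0]])  # one example with features x₁, x₂, x₃
print(net(x))  # an untrained prediction; training adjusts the weights
```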
1.4 Environment Setup
1.4.1 System Requirements
- Operating System: Linux, macOS, or Windows
- Python: Version 3.8 or higher (python.org)
- RAM: 8GB minimum (16GB+ recommended)
- Disk Space: 5-10GB for libraries and datasets
1.4.2 GPU vs CPU
Do you need a GPU?

- Not required for learning! All examples work on CPU
- A GPU accelerates training 10-50x
- Free options are available (Google Colab, Kaggle)
GPU recommendations:

- For learning: Free Colab/Kaggle GPUs (sufficient for this book)
- For serious work: NVIDIA GPU with 6GB+ VRAM (RTX 3060+, RTX 4060+)
- For professionals: Visit tensorrigs.com for detailed GPU recommendations
1.4.3 Choose Your Framework
This book teaches both PyTorch and TensorFlow. All code examples have two tabs—pick one and stick with it!
PyTorch:

- More Pythonic, easier to debug
- Popular in research and academia
- Growing industry adoption
- Developed by Meta (Facebook)
TensorFlow/Keras:

- Industry standard, production-ready
- Easier deployment to mobile/web
- Massive ecosystem
- Developed by Google
Can’t decide? Start with PyTorch if you prefer intuitive code, or TensorFlow if you prioritize industry adoption. You can always learn the other later!
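If you end up installing both frameworks (Path C in Section 1.5), a quick side-by-side gives a feel for the two styles. A small illustrative sketch, defining the same 784 → 128 fully connected layer in each API:

```python
import torch.nn as nn
from tensorflow.keras import layers

# The same fully connected layer, once per framework.
pt_layer = nn.Linear(784, 128)  # PyTorch: input and output sizes are explicit
tf_layer = layers.Dense(128)    # Keras: input size is inferred on first call
```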
1.5 Installation
1.5.1 Option 1: Local Setup (Recommended)
Choose your path based on your preferred framework:
1.5.1.1 Path A: PyTorch Installation
Linux/macOS:

```bash
# Create virtual environment
python3 -m venv dlbook-env

# Activate it
source dlbook-env/bin/activate

# Install PyTorch (CPU version)
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu

# Install common libraries
pip install numpy pandas matplotlib jupyter

# For GPU (NVIDIA with CUDA 11.8):
# pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
```

Windows:

```bash
# Create virtual environment
python -m venv dlbook-env

# Activate it
dlbook-env\Scripts\activate

# Install PyTorch (CPU version)
pip install torch torchvision torchaudio

# Install common libraries
pip install numpy pandas matplotlib jupyter

# For GPU (NVIDIA with CUDA 11.8):
# pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
```

1.5.1.2 Path B: TensorFlow Installation
Linux/macOS:

```bash
# Create virtual environment
python3 -m venv dlbook-env

# Activate it
source dlbook-env/bin/activate

# Install TensorFlow (works for both CPU and GPU)
pip install tensorflow

# Install common libraries
pip install numpy pandas matplotlib jupyter

# Verify GPU support (optional)
# python -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"
```

Windows:

```bash
# Create virtual environment
python -m venv dlbook-env

# Activate it
dlbook-env\Scripts\activate

# Install TensorFlow (works for both CPU and GPU)
pip install tensorflow

# Install common libraries
pip install numpy pandas matplotlib jupyter

# Verify GPU support (optional)
# python -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"
```

1.5.1.3 Path C: Both Frameworks
If you want to compare both frameworks:
```bash
# Create virtual environment
python3 -m venv dlbook-env
source dlbook-env/bin/activate  # or dlbook-env\Scripts\activate on Windows

# Install both frameworks
pip install torch torchvision torchaudio
pip install tensorflow

# Install common libraries
pip install numpy pandas matplotlib jupyter
```

1.5.2 Option 2: Cloud-Based (Quick Start)
Perfect for getting started without any installation!
Google Colab (colab.research.google.com)

- ✅ Free GPU access (T4 with 15GB VRAM)
- ✅ Pre-installed PyTorch and TensorFlow
- ✅ Works in browser
- ✅ Saves to Google Drive
- ⏱️ ~12 hours/day of GPU time
Kaggle Notebooks (kaggle.com/code)

- ✅ Free GPU access (P100/T4)
- ✅ Pre-installed libraries
- ✅ 30+ hours/week of GPU time
- ✅ Large dataset library
Setup in Colab/Kaggle (a quick GPU check follows this list):

1. Create a new notebook
2. No installation needed! Both frameworks come pre-installed
3. Start coding immediately
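Once the notebook is open, you can confirm whether your session actually received a GPU. A minimal check, run in a fresh cell (uses PyTorch, which comes pre-installed on both platforms):

```python
# Check whether this notebook session has a GPU attached.
import torch
print(torch.cuda.get_device_name(0) if torch.cuda.is_available() else "No GPU runtime")
```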
Use cloud (Colab/Kaggle) if:

- You’re just starting out
- You don’t have a GPU
- You want zero setup time
Use local if:

- You have a GPU
- You prefer working offline
- You want full control
1.6 Verification
Let’s verify your installation works:
PyTorch:

```python
import torch
import torchvision
import numpy as np
import matplotlib.pyplot as plt

print(f"PyTorch version: {torch.__version__}")
print(f"TorchVision version: {torchvision.__version__}")
print(f"NumPy version: {np.__version__}")

# Check for GPU
if torch.cuda.is_available():
    print(f"\n✅ GPU available: {torch.cuda.get_device_name(0)}")
    print(f"   GPU memory: {torch.cuda.get_device_properties(0).total_memory / 1e9:.1f} GB")
else:
    print("\n💻 Running on CPU (this is fine for learning!)")

print("\n✅ All libraries loaded successfully!")
```

TensorFlow:

```python
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt

print(f"TensorFlow version: {tf.__version__}")
print(f"NumPy version: {np.__version__}")

# Check for GPU
gpus = tf.config.list_physical_devices('GPU')
if gpus:
    print(f"\n✅ GPU available: {len(gpus)} GPU(s) detected")
    for gpu in gpus:
        print(f"   {gpu}")
else:
    print("\n💻 Running on CPU (this is fine for learning!)")

print("\n✅ All libraries loaded successfully!")
```

1.7 Your First Neural Network (5-Minute Example)
Let’s build a simple neural network that learns to classify handwritten digits (MNIST dataset). Don’t worry about understanding every detail—we’ll cover everything in upcoming chapters!
PyTorch:

```python
import torch
import torch.nn as nn
from torchvision import datasets, transforms
from torch.utils.data import DataLoader

# 1. Load data (MNIST: 28x28 grayscale images of digits 0-9)
transform = transforms.Compose([transforms.ToTensor()])
train_data = datasets.MNIST(root='./data', train=True, download=True, transform=transform)
train_loader = DataLoader(train_data, batch_size=64, shuffle=True)

# 2. Define a simple neural network
class SimpleNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.flatten = nn.Flatten()
        self.network = nn.Sequential(
            nn.Linear(28*28, 128),  # Input: 784 pixels → 128 neurons
            nn.ReLU(),              # Activation function
            nn.Linear(128, 10)      # 128 neurons → 10 outputs (digits 0-9)
        )

    def forward(self, x):
        x = self.flatten(x)
        return self.network(x)

model = SimpleNet()
print(f"Model created with {sum(p.numel() for p in model.parameters())} parameters")

# 3. Train the model (1 epoch for quick demo)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = model.to(device)
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
criterion = nn.CrossEntropyLoss()

model.train()
for batch_idx, (data, target) in enumerate(train_loader):
    if batch_idx >= 100:  # Train on 100 batches for demo
        break
    data, target = data.to(device), target.to(device)
    optimizer.zero_grad()
    output = model(data)
    loss = criterion(output, target)
    loss.backward()
    optimizer.step()
    if batch_idx % 50 == 0:
        print(f"Batch {batch_idx}, Loss: {loss.item():.4f}")

print("\n✅ Model trained! You just built your first neural network!")
```

TensorFlow:

```python
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

# 1. Load data (MNIST: 28x28 grayscale images of digits 0-9)
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()

# Normalize pixel values to 0-1 range
x_train = x_train.astype('float32') / 255.0
x_test = x_test.astype('float32') / 255.0

# Flatten images from 28x28 to 784
x_train = x_train.reshape(-1, 28*28)
x_test = x_test.reshape(-1, 28*28)

print(f"Training data shape: {x_train.shape}")
print(f"Training labels shape: {y_train.shape}")

# 2. Define a simple neural network
model = keras.Sequential([
    layers.Dense(128, activation='relu', input_shape=(784,)),  # 784 inputs → 128 neurons
    layers.Dense(10, activation='softmax')                     # 128 neurons → 10 outputs (digits 0-9)
])

# 3. Compile the model
model.compile(
    optimizer='adam',
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy']
)

print(f"\nModel created with {model.count_params()} parameters")
model.summary()

# 4. Train the model (partial data for quick demo)
history = model.fit(
    x_train[:6400],  # Use subset for demo
    y_train[:6400],
    batch_size=64,
    epochs=1,
    verbose=1
)

print("\n✅ Model trained! You just built your first neural network!")
```

What just happened?
- Loaded data: MNIST dataset with 28×28 pixel images of handwritten digits
- Built network: 784 inputs → 128 hidden neurons → 10 outputs (one per digit)
- Trained: The network learned to recognize digits by adjusting weights
- Result: In just a few seconds (or minutes on CPU), the model learned to classify digits!
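Curious how well it actually learned? Here is a minimal evaluation sketch for the PyTorch version; it assumes the example above has already run in this session and reuses its `model`, `device`, `transform`, and imports:

```python
# Evaluate on the held-out MNIST test set (sketch; reuses names
# from the PyTorch example above).
test_data = datasets.MNIST(root='./data', train=False, download=True, transform=transform)
test_loader = DataLoader(test_data, batch_size=256)

model.eval()
correct = 0
with torch.no_grad():
    for data, target in test_loader:
        data, target = data.to(device), target.to(device)
        pred = model(data).argmax(dim=1)          # most likely digit per image
        correct += (pred == target).sum().item()

print(f"Test accuracy: {correct / len(test_data):.1%}")
```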
1.8 Understanding Hardware Performance
Here’s what you can expect on different hardware:
| Hardware | Training Speed | Suitable For |
|---|---|---|
| CPU (Laptop) | 1-2 min/epoch | Learning, small experiments |
| Google Colab (T4 GPU) | 5-10 sec/epoch | All examples in this book |
| RTX 3060 (12GB) | 3-5 sec/epoch | Personal projects, learning |
| RTX 4090 (24GB) | 1-2 sec/epoch | Professional work, research |
All examples in this book work on CPU. Training will be slower but entirely viable for learning: we keep batch sizes CPU-friendly and provide clear guidance for CPU users.
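To see where your own machine lands in this table, you can time a handful of training batches and extrapolate. A rough benchmarking sketch, assuming the `model`, `optimizer`, `criterion`, `train_loader`, and `device` from the PyTorch example in Section 1.7 are already defined:

```python
import time

# Time 20 training batches, then extrapolate to a full epoch.
model.train()
start = time.time()
for batch_idx, (data, target) in enumerate(train_loader):
    if batch_idx >= 20:
        break
    data, target = data.to(device), target.to(device)
    optimizer.zero_grad()
    loss = criterion(model(data), target)
    loss.backward()
    optimizer.step()

per_batch = (time.time() - start) / 20
print(f"Estimated epoch time: {per_batch * len(train_loader):.0f} s")
```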
1.9 What’s Next?
In Chapter 2, we’ll dive deep into how neural networks actually work—neurons, activation functions, forward/backward propagation—with clear explanations and no heavy math!
1.10 Summary
- Deep learning automatically learns features from data using neural networks
- Use DL for unstructured data (images, text) and large datasets
- PyTorch (Pythonic) and TensorFlow (production-ready) are both excellent choices
- Free GPU options (Colab, Kaggle) are sufficient for this entire book
- You just built your first neural network in under 20 lines of code!