# Federated Learning

## Core Concept
Federated learning trains models without centralizing data. Each participant trains locally and shares only model updates (gradients), never raw data.
```mermaid
graph TD
    S[Central Server] -->|"Send global model"| A[Device A]
    S -->|"Send global model"| B[Device B]
    S -->|"Send global model"| C[Device C]
    A -->|"Send gradients only"| S
    B -->|"Send gradients only"| S
    C -->|"Send gradients only"| S
    S --> D[Aggregate & Update]
    D --> S
```
## How It Works

1. Server sends the current global model to all participants
2. Each participant trains on its local data for a few epochs
3. Each participant sends model updates (gradients) back to the server
4. Server aggregates the updates (e.g., a weighted average) into a new global model
5. Repeat until convergence
## Federated Averaging (FedAvg)
The most common aggregation strategy:
\[w_{t+1} = \sum_{k=1}^{K} \frac{n_k}{n} w_{t+1}^k\]
Where \(w_{t+1}^k\) is participant \(k\)'s updated model, \(n_k\) is their data size, and \(n\) is total data.
```python
import numpy as np

def federated_average(model_updates, data_sizes):
    """Aggregate model updates using weighted average."""
    total_data = sum(data_sizes)
    weights = [n / total_data for n in data_sizes]
    aggregated = {}
    for key in model_updates[0].keys():
        aggregated[key] = sum(
            w * update[key] for w, update in zip(weights, model_updates)
        )
    return aggregated
```
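As a quick sanity check, averaging two single-parameter updates with data sizes 1 and 3 should weight the larger client three times as heavily. The function body is repeated so the snippet runs on its own:

```python
def federated_average(model_updates, data_sizes):
    """Aggregate model updates using weighted average (as defined above)."""
    total_data = sum(data_sizes)
    weights = [n / total_data for n in data_sizes]
    aggregated = {}
    for key in model_updates[0].keys():
        aggregated[key] = sum(
            w * update[key] for w, update in zip(weights, model_updates)
        )
    return aggregated

# Client A (1 sample) proposes w=0.0, client B (3 samples) proposes w=4.0
result = federated_average([{"w": 0.0}, {"w": 4.0}], [1, 3])
# Weighted average: 0.25 * 0.0 + 0.75 * 4.0 = 3.0
```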
## Types of Federated Learning
| Type | Participants | Data Split | Example |
|---|---|---|---|
| Horizontal | Same features, different samples | By rows | Multiple hospitals with same patient attributes |
| Vertical | Same samples, different features | By columns | Bank and e-commerce platform with overlapping customers but different features |
| Transfer | Different features and samples | By domain | Cross-industry knowledge sharing |
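The row/column distinction in the table can be made concrete with a toy array. The party names are hypothetical and the full table exists here only to illustrate the split; in practice it is never assembled in one place:

```python
import numpy as np

# Conceptual full table: 4 samples x 3 features
data = np.arange(12).reshape(4, 3)

# Horizontal FL: each party holds different rows (samples), same columns
hospital_a, hospital_b = data[:2, :], data[2:, :]

# Vertical FL: each party holds different columns (features), same rows
bank_features, shop_features = data[:, :2], data[:, 2:]
```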
## Privacy Considerations

**Gradients Can Leak Information**

Even without sharing raw data, model gradients can leak information about the training data: gradient-inversion attacks can partially reconstruct individual training examples from shared updates. Combine federated learning with differential privacy for stronger guarantees:
- Add noise to gradients before sharing
- Use secure aggregation so server only sees the sum
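The secure-aggregation idea can be sketched with pairwise masks. Real protocols (e.g., the Bonawitz et al. secure aggregation protocol) add key agreement and dropout recovery; this toy version only shows why the masks cancel in the sum:

```python
import numpy as np

rng = np.random.default_rng(42)
true_updates = [np.array([1.0, 2.0]), np.array([3.0, 4.0]), np.array([5.0, 6.0])]

# Each pair of clients (i, j) with i < j agrees on a random mask;
# client i adds it, client j subtracts it, so masks cancel in the sum.
n = len(true_updates)
masked = [u.copy() for u in true_updates]
for i in range(n):
    for j in range(i + 1, n):
        mask = rng.normal(size=2)
        masked[i] += mask
        masked[j] -= mask

# The server sees only the masked updates, yet their sum is exact
aggregate = sum(masked)
```

Each individual `masked[i]` looks like noise to the server, but `aggregate` equals the true sum of updates.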
## Federated Learning + Differential Privacy
```python
import numpy as np

def private_federated_update(local_model, local_data, epsilon, clip_norm=1.0):
    """Train locally with differential-privacy guarantees."""
    # Train on local data (compute_gradients is an assumed helper
    # that returns the local gradient as a NumPy array)
    gradients = compute_gradients(local_model, local_data)

    # Clip gradients to bound each participant's contribution (sensitivity)
    grad_norm = np.linalg.norm(gradients)
    if grad_norm > clip_norm:
        gradients = gradients * (clip_norm / grad_norm)

    # Add Laplace noise calibrated to sensitivity / epsilon
    sensitivity = clip_norm
    noise = np.random.laplace(0, sensitivity / epsilon, gradients.shape)
    private_gradients = gradients + noise
    return private_gradients
```
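The clip-then-noise step can be checked in isolation. The gradient values here are toy numbers; the noise scale matches the `clip_norm / epsilon` calibration used above:

```python
import numpy as np

rng = np.random.default_rng(7)
gradients = np.array([3.0, 4.0])  # L2 norm 5.0, exceeds clip_norm
clip_norm, epsilon = 1.0, 0.5

# Clip: rescale so the L2 norm is at most clip_norm
grad_norm = np.linalg.norm(gradients)
if grad_norm > clip_norm:
    gradients = gradients * (clip_norm / grad_norm)

# After clipping the direction is preserved but the norm is exactly 1.0
# Add Laplace noise calibrated to sensitivity / epsilon
noise = rng.laplace(0, clip_norm / epsilon, gradients.shape)
private_gradients = gradients + noise
```

Clipping bounds any single participant's influence on the aggregate, which is what makes the noise scale sufficient for the privacy guarantee.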
## Real-World Applications
| Application | Why Federated? |
|---|---|
| Google Keyboard | Improve predictions without collecting typing data |
| Healthcare | Train on patient data across hospitals without sharing records |
| Finance | Fraud detection across banks without exposing transactions |
| IoT | Edge devices train locally and share only model updates |