"It works on my machine" - câu nói kinh điển của developers khi code chạy tốt trên laptop nhưng fail trên server production. Containerization giải quyết vấn đề này bằng cách đóng gói ứng dụng cùng toàn bộ dependencies vào một package nhất quán.
Trong thời đại AI/ML, containerization còn quan trọng hơn - models cần specific Python versions, CUDA versions, library versions. Một mismatch nhỏ có thể làm model fail hoàn toàn.
┌──────────────────────────────────────┐
│          Physical Hardware           │
├──────────────────────────────────────┤
│          Hypervisor (ESXi)           │
├────────────┬────────────┬────────────┤
│    VM 1    │    VM 2    │    VM 3    │
│ ┌────────┐ │ ┌────────┐ │ ┌────────┐ │
│ │   OS   │ │ │   OS   │ │ │   OS   │ │
│ │ Ubuntu │ │ │ CentOS │ │ │Windows │ │
│ └────────┘ │ └────────┘ │ └────────┘ │
│ ┌────────┐ │ ┌────────┐ │ ┌────────┐ │
│ │  App   │ │ │  App   │ │ │  App   │ │
│ └────────┘ │ └────────┘ │ └────────┘ │
└────────────┴────────────┴────────────┘
Each VM: Full OS + Apps (GB per VM)
Boot time: Minutes
Overhead: High (multiple OS kernels)
┌──────────────────────────────────────┐
│          Physical Hardware           │
├──────────────────────────────────────┤
│           Host OS (Linux)            │
├──────────────────────────────────────┤
│      Container Engine (Docker)       │
├────────────┬────────────┬────────────┤
│ Container1 │ Container2 │ Container3 │
│ ┌────────┐ │ ┌────────┐ │ ┌────────┐ │
│ │  App   │ │ │  App   │ │ │  App   │ │
│ │ +Libs  │ │ │ +Libs  │ │ │ +Libs  │ │
│ └────────┘ │ └────────┘ │ └────────┘ │
└────────────┴────────────┴────────────┘
Each Container: Just Apps + Libs (MB)
Boot time: Seconds
Overhead: Low (shared OS kernel)
Key Differences:
| Aspect | VM | Container |
|---|---|---|
| Size | GBs | MBs |
| Startup | Minutes | Seconds |
| Isolation | Full (OS-level) | Process-level |
| Performance | Lower | Near-native |
| Density | 10s per host | 100s per host |
When to use VMs:
- You need full OS-level isolation or a different kernel/OS per workload (e.g. Windows next to Linux)
- Strong security boundaries between untrusted tenants
- Legacy applications that expect a whole machine
When to use Containers:
- Microservices and APIs that need fast startup and high density
- Reproducible ML environments (pinned Python/CUDA/library versions)
- CI/CD pipelines and horizontal scaling
Docker is the most popular container platform.
┌─────────────────────────────────────┐
│            Docker Client            │
│        (docker CLI commands)        │
└──────────────┬──────────────────────┘
               │ REST API
┌──────────────▼──────────────────────┐
│            Docker Daemon            │
│        (dockerd - manages:)         │
│  ┌───────────────────────────────┐  │
│  │            Images             │  │
│  │     (Read-only templates)     │  │
│  └───────────────────────────────┘  │
│  ┌───────────────────────────────┐  │
│  │          Containers           │  │
│  │      (Running instances)      │  │
│  └───────────────────────────────┘  │
│  ┌───────────────────────────────┐  │
│  │           Networks            │  │
│  │    (Container networking)     │  │
│  └───────────────────────────────┘  │
│  ┌───────────────────────────────┐  │
│  │            Volumes            │  │
│  │     (Persistent storage)      │  │
│  └───────────────────────────────┘  │
└─────────────────────────────────────┘
1. Docker Image: a read-only template containing the application, its dependencies, and filesystem layers. Built from a Dockerfile.
2. Docker Container: a running instance of an image, with its own isolated process space, network, and writable layer.
3. Docker Registry: a service that stores and distributes images (e.g. Docker Hub or a private registry). `docker pull` and `docker push` move images between your machine and a registry.
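The relationship between the three in practice: a registry stores images, `docker pull` copies an image locally, and `docker run` creates a container from it. A quick demo using an official image:
# Registry -> image: download from Docker Hub
docker pull python:3.10-slim
# Image -> container: start a throwaway container from the image
docker run --rm python:3.10-slim python -c "print('hello from a container')"
# Inspect local images and the containers created from them
docker images
docker ps -a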
A Dockerfile is the blueprint for building an image.
# Base image - start from Python 3.10
FROM python:3.10-slim
# Set working directory
WORKDIR /app
# Copy requirements first (for caching)
COPY requirements.txt .
# Install dependencies
RUN pip install --no-cache-dir -r requirements.txt
# Copy application code
COPY . .
# Expose port
EXPOSE 8000
# Command to run
CMD ["python", "app.py"]
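For completeness, here is a minimal hypothetical `app.py` this Dockerfile could run (the real application isn't shown in this post) - a stdlib-only HTTP server listening on the exposed port:
# app.py - hypothetical minimal app for the Dockerfile above (stdlib only)
from http.server import BaseHTTPRequestHandler, HTTPServer

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.end_headers()
        self.wfile.write(b"Hello from inside a container!\n")

if __name__ == "__main__":
    # Bind to 0.0.0.0 so Docker's port mapping (-p 8000:8000) can reach it
    HTTPServer(("0.0.0.0", 8000), Handler).serve_forever()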
FROM python:3.10-slim
WORKDIR /app
# System dependencies for ML libraries
RUN apt-get update && apt-get install -y \
    build-essential \
    curl \
    && rm -rf /var/lib/apt/lists/*
# Python dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
# Copy model files
COPY models/ /app/models/
COPY src/ /app/src/
# Environment variables
ENV MODEL_PATH=/app/models/model.pkl
ENV PORT=8000
EXPOSE 8000
# Run FastAPI server
CMD ["uvicorn", "src.main:app", "--host", "0.0.0.0", "--port", "8000"]
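A sketch of the `src/main.py` this Dockerfile expects (hypothetical - the actual code isn't shown): it reads the MODEL_PATH variable set above and serves predictions with FastAPI:
# src/main.py - hypothetical FastAPI app matching the Dockerfile above
import os
import pickle

from fastapi import FastAPI
from pydantic import BaseModel

# MODEL_PATH comes from the ENV instruction in the Dockerfile
MODEL_PATH = os.environ.get("MODEL_PATH", "/app/models/model.pkl")

with open(MODEL_PATH, "rb") as f:
    model = pickle.load(f)  # assumes a pickled model exposing predict()

app = FastAPI()

class PredictRequest(BaseModel):
    text: str

@app.get("/health")
def health():
    return {"status": "ok"}

@app.post("/predict")
def predict(req: PredictRequest):
    return {"prediction": str(model.predict([req.text])[0])}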
FROM: Base image
FROM python:3.10-slim # Official Python slim
FROM nvidia/cuda:11.8.0-base-ubuntu22.04 # CUDA support
FROM ubuntu:22.04 # Ubuntu base
WORKDIR: Set working directory
WORKDIR /app
# All subsequent commands run in /app
COPY: Copy files from host to container
COPY requirements.txt . # Copy file
COPY src/ /app/src/ # Copy directory
COPY . . # Copy everything
RUN: Execute commands during build
RUN pip install torch # Install package
# Multi-line: end each continued line with \
RUN apt-get update && \
    apt-get install -y curl
ENV: Set environment variables
ENV PYTHONUNBUFFERED=1
ENV MODEL_PATH=/app/models
EXPOSE: Document which ports are used
EXPOSE 8000
# Doesn't actually publish port, just documentation
CMD: Default command when container starts
CMD ["python", "app.py"] # Exec form (preferred)
CMD python app.py # Shell form
ENTRYPOINT: Configure container as executable
ENTRYPOINT ["python"]
CMD ["app.py"]
# Result: python app.py
# Can override CMD: docker run image script.py → python script.py
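To see how the two interact at run time, assuming an image built with exactly the ENTRYPOINT/CMD pair above (the image name `myimage` is illustrative):
docker run myimage                              # -> python app.py (ENTRYPOINT + default CMD)
docker run myimage script.py                    # -> python script.py (CMD replaced)
docker run --rm -it --entrypoint bash myimage   # replace ENTRYPOINT itself, e.g. for debugging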
Reduce final image size by using multiple stages.
# Stage 1: Build
FROM python:3.10 as builder
WORKDIR /app
# Install dependencies
COPY requirements.txt .
RUN pip install --user --no-cache-dir -r requirements.txt
# Copy source
COPY src/ /app/src/
# Stage 2: Runtime
FROM python:3.10-slim
WORKDIR /app
# Copy only necessary files from builder
COPY --from=builder /root/.local /root/.local
COPY --from=builder /app/src /app/src
# Make sure scripts in .local are usable
ENV PATH=/root/.local/bin:$PATH
CMD ["python", "src/main.py"]
Benefits:
- Smaller final image: compilers and build tools stay in the builder stage
- Smaller attack surface: the runtime image ships only what it needs
- Faster pulls and deploys thanks to the reduced size
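To verify the effect, build both variants and compare sizes (the tag names and a hypothetical single-stage `Dockerfile.single` are assumptions for illustration):
docker build -t ml-api:multistage .
docker build -t ml-api:single -f Dockerfile.single .
docker images ml-api   # the multistage tag should be hundreds of MB smaller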
Docker caches each layer. Order matters for efficiency!
# ❌ BAD - App code changes invalidate all layers
FROM python:3.10
COPY . . # Everything copied
RUN pip install -r requirements.txt # Re-runs every time code changes
# ✅ GOOD - Dependencies cached separately
FROM python:3.10
COPY requirements.txt . # Copy requirements first
RUN pip install -r requirements.txt # Cached unless requirements change
COPY . . # Copy code last
Principle: order instructions from least to most frequently changed - base image first, then dependencies, then application code - so a code edit only invalidates the final COPY layer.
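You can watch the cache at work: rebuild after touching only application code, and the dependency layer is reused (the file name is illustrative):
docker build -t myapp:v1 .     # first build: every layer executes
echo "# comment" >> app.py     # change application code only
docker build -t myapp:v1 .     # pip install layer shows as CACHED; only COPY . . reruns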
# Build image
docker build -t myapp:v1 .
docker build -t myapp:latest --no-cache . # Force rebuild
# List images
docker images
# Remove image
docker rmi myapp:v1
# Pull from registry
docker pull python:3.10
# Push to registry
docker tag myapp:v1 username/myapp:v1
docker push username/myapp:v1
# Inspect image
docker inspect myapp:v1
docker history myapp:v1 # Show layers
# Run container
docker run myapp:v1 # Run in foreground (exits when the app exits)
docker run -d myapp:v1 # Detached (background)
docker run -p 8000:8000 myapp:v1 # Port mapping
docker run -v /data:/app/data myapp:v1 # Volume mount
docker run --name mycontainer myapp:v1 # Named container
docker run --rm myapp:v1 # Auto-remove after exit
# List containers
docker ps # Running containers
docker ps -a # All containers
# Start/stop containers
docker start container_id
docker stop container_id
docker restart container_id
# Execute command in running container
docker exec -it container_id bash # Interactive shell
docker exec container_id ls /app # Run command
# View logs
docker logs container_id
docker logs -f container_id # Follow logs
# Remove container
docker rm container_id
docker rm -f container_id # Force remove running container
# Container stats
docker stats container_id
# Build ML model serving image
docker build -t ml-api:v1 .
# Run with GPU support (NVIDIA)
docker run --gpus all \
-p 8000:8000 \
-v $(pwd)/models:/app/models \
-e MODEL_NAME=bert-base \
--name ml-server \
ml-api:v1
# Check logs
docker logs -f ml-server
# Test API
curl -X POST http://localhost:8000/predict \
  -H "Content-Type: application/json" \
  -d '{"text": "Hello world"}'
# Stop and remove
docker stop ml-server
docker rm ml-server
Docker Compose manages multiple containers together as a single application stack.
version: '3.8'

services:
  # FastAPI backend
  api:
    build: ./api
    ports:
      - "8000:8000"
    environment:
      - DATABASE_URL=postgresql://user:pass@db:5432/mydb
      - REDIS_URL=redis://cache:6379
    depends_on:
      - db
      - cache
    volumes:
      - ./api:/app
      - models:/app/models
    restart: unless-stopped

  # PostgreSQL database
  db:
    image: postgres:15
    environment:
      - POSTGRES_USER=user
      - POSTGRES_PASSWORD=pass
      - POSTGRES_DB=mydb
    volumes:
      - postgres_data:/var/lib/postgresql/data
    ports:
      - "5432:5432"

  # Redis cache
  cache:
    image: redis:7-alpine
    ports:
      - "6379:6379"
    command: redis-server --appendonly yes
    volumes:
      - redis_data:/data

  # Nginx reverse proxy
  nginx:
    image: nginx:alpine
    ports:
      - "80:80"
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf
    depends_on:
      - api

volumes:
  postgres_data:
  redis_data:
  models:
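The compose file mounts `./nginx.conf`, which isn't shown in this post; a minimal sketch that proxies port 80 to the api service might look like this (an assumption, not the actual config):
# nginx.conf - hypothetical minimal reverse proxy for the stack above
events {}

http {
    server {
        listen 80;

        location / {
            # "api" resolves via Docker's embedded DNS to the compose service
            proxy_pass http://api:8000;
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
        }
    }
}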
# Start all services
docker-compose up
docker-compose up -d # Detached
# Stop all services
docker-compose down
docker-compose down -v # Also remove volumes
# View logs
docker-compose logs
docker-compose logs -f api # Follow specific service
# Scale services
docker-compose up -d --scale api=3 # Run 3 API instances (drop the fixed "8000:8000" host port first - replicas can't share it)
# Rebuild and restart
docker-compose up -d --build
# Execute in service
docker-compose exec api bash
# ❌ Generic
FROM python:3
# ✅ Specific version
FROM python:3.10.12-slim
# ✅ ML-optimized
FROM nvidia/cuda:11.8.0-cudnn8-runtime-ubuntu22.04
# Use slim/alpine variants
FROM python:3.10-slim # ~120MB vs 900MB for full
# Clean up in same layer
RUN apt-get update && \
apt-get install -y --no-install-recommends curl && \
apt-get clean && \
rm -rf /var/lib/apt/lists/*
# Use .dockerignore
# .dockerignore file:
__pycache__
*.pyc
.git
.venv
*.log
# Don't run as root
FROM python:3.10-slim
RUN useradd -m -u 1000 appuser
USER appuser
WORKDIR /home/appuser/app
COPY --chown=appuser:appuser . .
# Scan for vulnerabilities (docker scan is deprecated; use Docker Scout or Trivy)
# docker scout cves myimage:v1
# trivy image myimage:v1
HEALTHCHECK --interval=30s --timeout=3s --start-period=5s --retries=3 \
CMD curl -f http://localhost:8000/health || exit 1
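Note that curl must be installed in the image for this check to work. Docker then tracks the result, which you can query (the container name is illustrative):
docker inspect --format '{{.State.Health.Status}}' mycontainer   # starting | healthy | unhealthy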
# Option 1: Bake into image (small models)
COPY models/model.pkl /app/models/
# Option 2: Mount volume (large models)
# docker run -v /path/to/models:/app/models myapp
# Option 3: Download at runtime
RUN pip install huggingface-hub
CMD ["python", "-c", "from huggingface_hub import snapshot_download; \
snapshot_download('bert-base-uncased', cache_dir='/app/models')"]
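In practice, Option 3 is usually an entrypoint script that downloads only when the model is missing, then starts the real server. A hedged sketch (`entrypoint.sh` is an assumption, wired in with ENTRYPOINT ["/app/entrypoint.sh"]):
#!/bin/sh
# entrypoint.sh - hypothetical: fetch the model on first start, then serve
set -e

# Download only if the models directory is missing or empty
if [ ! -d /app/models ] || [ -z "$(ls -A /app/models)" ]; then
    python -c "from huggingface_hub import snapshot_download; \
snapshot_download('bert-base-uncased', cache_dir='/app/models')"
fi

# exec replaces the shell so signals (SIGTERM on docker stop) reach uvicorn
exec uvicorn src.main:app --host 0.0.0.0 --port "${PORT:-8000}"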
# Use build args
ARG ENV=production
ENV APP_ENV=${ENV}
# Build for different envs
# docker build --build-arg ENV=development -t myapp:dev .
# docker build --build-arg ENV=production -t myapp:prod .
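A quick way to confirm what was baked in (tags from the build commands above):
docker run --rm myapp:dev printenv APP_ENV    # -> development
docker run --rm myapp:prod printenv APP_ENV   # -> production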
Containers alone aren't enough for production - you also need orchestration:
Kubernetes (next topic):
- Schedules containers across a cluster of machines
- Restarts failed containers automatically (self-healing)
- Scales replicas up and down based on load
- Provides service discovery, load balancing, and rolling updates
Example scenario:
Single Container:
- Manual start/stop
- Manual scaling
- No automatic recovery
- Manual load balancing
Kubernetes Cluster:
- Auto-start on failure
- Auto-scale based on load
- Built-in load balancer
- Zero-downtime deployments
In the next post, we'll explore Model Serving Architecture - batch vs. online inference, model formats (ONNX, TensorRT), and deployment strategies.
This post is part of the series "From Zero to AI Engineer" - Module 9: Deployment Strategy.