Code Quality & Refactoring: The Art of Writing Readable, Maintainable Code

"Any fool can write code that a computer can understand. Good programmers write code that humans can understand." - Martin Fowler

Code is written not only for computers to run but also for people to read and maintain. In practice you spend far more time reading code than writing it; a ratio of roughly 10 to 1 is often cited. A high-quality codebase helps the team work efficiently, reduces bugs, and makes new features easier to add.

Clean Code Principles

1. Meaningful Names

Names of variables, functions, and classes should reveal intent: the reader understands their purpose immediately, without needing a comment.

❌ BAD - Vague names:

def calc(d):
    return d * 0.1

x = calc(100)  # What is x? What does 0.1 mean?

✅ GOOD - Clear names:

def calculate_discount(price):
    DISCOUNT_RATE = 0.1
    return price * DISCOUNT_RATE

discount_amount = calculate_discount(100)

Naming conventions:

# Variables & functions: snake_case (Python), camelCase (JavaScript)
user_count = 10
def get_user_by_id(user_id): pass

# Classes: PascalCase
class UserRepository: pass

# Constants: UPPER_SNAKE_CASE
MAX_RETRY_ATTEMPTS = 3
DATABASE_URL = "postgresql://..."

# Private members: leading underscore
class User:
    def __init__(self):
        self._password = None  # Private by convention; Python does not enforce it

Naming tips:

Use pronounceable names

❌ genymdhms = datetime.now()  # generation year-month-day-hour-minute-second
✅ generation_timestamp = datetime.now()

Avoid mental mapping

❌ for i in users:  # What is i? An index or a user object?
✅ for user in users:

Use searchable names

❌ if status == 7:  # What does 7 mean? Magic number!
✅
STATUS_APPROVED = 7
if status == STATUS_APPROVED:
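
One way to take the named-constant idea a step further is to group related codes in an enumeration. The sketch below uses Python's standard `enum.IntEnum`. The `OrderStatus` and `is_approved` names, and every member except the value 7 for approved, are hypothetical and chosen only for illustration.

```python
from enum import IntEnum


class OrderStatus(IntEnum):
    # Only APPROVED = 7 comes from the example above; the rest are made up.
    PENDING = 1
    REJECTED = 3
    APPROVED = 7


def is_approved(status: int) -> bool:
    # The name is searchable and documents what 7 means.
    return status == OrderStatus.APPROVED


print(is_approved(7))       # True
print(OrderStatus(7).name)  # APPROVED
```

Because `IntEnum` members compare equal to plain integers, existing code that stores the raw number keeps working while new code reads the name.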

Class names = nouns, Function names = verbs

# Classes (nouns)
class OrderProcessor: pass
class PaymentGateway: pass

# Functions (verbs)
def process_order(): pass
def send_email(): pass
def is_valid(): pass  # Boolean functions: is/has/can

2. Function Structure

Single Responsibility: Each function does ONE thing.

❌ BAD - One function doing too much:

def process_order(order_data):
    # Validate
    if not order_data.get('email'):
        raise ValueError("Email required")
    
    # Calculate total
    total = sum(item['price'] * item['qty'] for item in order_data['items'])
    
    # Apply discount
    if order_data.get('coupon'):
        total *= 0.9
    
    # Save to database
    db.orders.insert({
        'user_email': order_data['email'],
        'total': total,
        'items': order_data['items']
    })
    
    # Send email
    send_email(order_data['email'], f"Order confirmed: ${total}")
    
    # Update inventory
    for item in order_data['items']:
        db.inventory.update({'id': item['id']}, {'$inc': {'stock': -item['qty']}})
    
    return total

✅ GOOD - Split into smaller functions:

def process_order(order_data):
    validate_order(order_data)
    total = calculate_order_total(order_data)
    order_id = save_order(order_data, total)
    send_confirmation_email(order_data['email'], total)
    update_inventory(order_data['items'])
    return order_id

def validate_order(order_data):
    if not order_data.get('email'):
        raise ValueError("Email required")
    if not order_data.get('items'):
        raise ValueError("Items required")

def calculate_order_total(order_data):
    subtotal = sum(item['price'] * item['qty'] for item in order_data['items'])
    return apply_discount(subtotal, order_data.get('coupon'))

def apply_discount(amount, coupon):
    if not coupon:
        return amount
    # Coupon logic here
    return amount * 0.9

# ... the other functions (save_order, send_confirmation_email, update_inventory)

Benefits:

  • Easier to read (function names document the code)
  • Easier to test (each function can be tested on its own)
  • Reusable (calculate_order_total can be used in several places)
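
The testability benefit can be sketched as follows. This is a minimal example that repeats `validate_order` and `apply_discount` from the GOOD version so the block runs on its own. The test function names and the `"SAVE10"` coupon value are hypothetical, and plain `assert` statements stand in for a test runner such as pytest.

```python
def validate_order(order_data):
    if not order_data.get('email'):
        raise ValueError("Email required")
    if not order_data.get('items'):
        raise ValueError("Items required")


def apply_discount(amount, coupon):
    if not coupon:
        return amount
    return amount * 0.9  # flat 10% off, as in the example above


# Each small function gets its own focused test.
def test_apply_discount_without_coupon():
    assert apply_discount(100, None) == 100


def test_apply_discount_with_coupon():
    # Compare with a tolerance because the result is a float.
    assert abs(apply_discount(100, "SAVE10") - 90) < 1e-9


def test_validate_order_rejects_missing_email():
    order_without_email = {'items': [{'id': 1, 'price': 5, 'qty': 1}]}
    try:
        validate_order(order_without_email)
    except ValueError as error:
        assert str(error) == "Email required"
    else:
        raise AssertionError("expected ValueError")
```

None of these tests needs a database or an email server, which would not be true for a test of the original all-in-one `process_order`.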

Function size:

  • Ideal: 5-10 lines
  • Maximum: 20-30 lines
  • If it grows longer → extract sub-functions

Function parameters:

  • Ideal: 0-2 parameters
  • Acceptable: 3 parameters
  • Avoid: 4+ parameters (use an object/dataclass instead)

# ❌ Too many parameters
def create_user(name, email, password, address, phone, age, country):
    pass

# ✅ Use object
from dataclasses import dataclass

@dataclass
class UserData:
    name: str
    email: str
    password: str
    address: str
    phone: str
    age: int
    country: str

def create_user(user_data: UserData):
    pass

3. Comments - When Do You Need Them?

Good code is self-documenting: prefer clear names over comments.

❌ BAD - A comment that explains bad code:

# Check if user is adult
if u.a >= 18:  # u.a is really user.age -> use a clear name instead!
    grant_access()

✅ GOOD - Self-explanatory code:

if user.age >= LEGAL_AGE:
    grant_access()

WHEN TO COMMENT:

Explain WHY, not WHAT

# We use exponential backoff because the third-party API 
# rate-limits aggressively during peak hours
for attempt in range(MAX_RETRIES):
    try:
        response = api.call()
        break
    except RateLimitError:
        sleep(2 ** attempt)
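
A self-contained version of this retry loop could look like the sketch below. `RateLimitError`, `call_with_backoff`, `flaky_call`, and the injectable `sleep` parameter are assumptions made for illustration, since the snippet above does not define `api` or `RateLimitError`. Injecting `sleep` keeps the example runnable without real delays.

```python
import time

MAX_RETRIES = 4


class RateLimitError(Exception):
    """Hypothetical error raised when the API rate-limits a request."""


def call_with_backoff(call, sleep=time.sleep, max_retries=MAX_RETRIES):
    # Exponential backoff: wait 1s, 2s, 4s, ... between attempts because
    # the (assumed) third-party API rate-limits aggressively at peak hours.
    for attempt in range(max_retries):
        try:
            return call()
        except RateLimitError:
            if attempt == max_retries - 1:
                raise  # out of retries: surface the error to the caller
            sleep(2 ** attempt)


# Usage with a fake API call that fails twice, then succeeds.
attempts = []
delays = []


def flaky_call():
    attempts.append(1)
    if len(attempts) < 3:
        raise RateLimitError()
    return "ok"


result = call_with_backoff(flaky_call, sleep=delays.append)
print(result, delays)  # ok [1, 2]
```

The WHY comment stays next to the loop it justifies, while the code itself says WHAT happens.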

Warn about consequences

# WARNING: Changing this will break backward compatibility
# with mobile app versions < 2.0
LEGACY_API_FORMAT = True

Legal requirements

# Copyright (c) 2024 Company Name
# Licensed under MIT License

TODO/FIXME

# TODO: Implement caching for performance
# FIXME: Edge case when user has no orders
# HACK: Temporary workaround for library bug #123

❌ WHEN NOT TO COMMENT:

  • Redundant comments (they just repeat the code)
  • Commented-out code (use version control!)
  • Journal comments (version control already records this)

4. Formatting

Consistency is key: the whole team should follow the same style guide.

Python: PEP 8
JavaScript: Airbnb Style Guide
Java: Google Java Style

Vertical formatting:

# Keep related concepts close together
class User:
    def __init__(self, name, email):
        self.name = name
        self.email = email
    
    # A blank line separates concepts
    def send_welcome_email(self):
        subject = "Welcome!"
        body = f"Hello {self.name}"
        send_email(self.email, subject, body)

Horizontal formatting:

  • Line length: 80-120 characters
  • Indentation: 4 spaces (Python), 2 spaces (JavaScript)

Use linters and formatters:

# Python
pip install black pylint flake8
black .                    # Auto-format
pylint myapp.py            # Check quality

# JavaScript (npx runs the locally installed binaries)
npm install --save-dev eslint prettier
npx prettier --write .     # Auto-format
npx eslint src/            # Check quality

Code Smells - Identifying Problems

Code smells are signs that code has a deeper problem and likely needs refactoring.

1. Duplicated Code

The same logic repeated in several places.

❌ SMELL:

def get_user_full_name(user):
    return f"{user.first_name} {user.last_name}"

def display_user(user):
    name = f"{user.first_name} {user.last_name}"  # Duplicate!
    print(f"User: {name}")

def export_user(user):
    name = f"{user.first_name} {user.last_name}"  # Duplicate!
    return {"name": name}

✅ FIX - Extract method:

def get_full_name(user):
    return f"{user.first_name} {user.last_name}"

def display_user(user):
    print(f"User: {get_full_name(user)}")

def export_user(user):
    return {"name": get_full_name(user)}

2. Long Method

A function that is too long and does too many things.

FIX: Extract smaller methods (see the process_order example above).

3. Large Class (God Object)

A class that knows too much and does too much.

❌ SMELL:

class User:
    def __init__(self):
        pass
    
    def save_to_database(self): pass
    def send_email(self): pass
    def generate_report(self): pass
    def process_payment(self): pass
    def calculate_taxes(self): pass
    # ... 50 more methods

✅ FIX - Split responsibilities:

class User:
    def __init__(self): pass

class UserRepository:
    def save(self, user): pass

class EmailService:
    def send(self, to, subject, body): pass

class ReportGenerator:
    def generate(self, user): pass

# ... other classes

4. Long Parameter List

A function with too many parameters.

FIX: Group them into an object or dict (see the UserData example above).

5. Primitive Obsession

Using primitives (int, string) instead of meaningful objects.

❌ SMELL:

def send_money(from_account: str, to_account: str, amount: float, currency: str):
    # from_account and to_account are just strings - easy to mix up!
    pass

send_money("123456", "789012", 100.0, "USD")
send_money("789012", "123456", 100.0, "USD")  # Easy to swap by mistake!

✅ FIX - Create value objects:

from dataclasses import dataclass
from decimal import Decimal

@dataclass
class AccountNumber:
    value: str
    
    def __post_init__(self):
        if not self.is_valid():
            raise ValueError(f"Invalid account: {self.value}")
    
    def is_valid(self):
        return len(self.value) == 6 and self.value.isdigit()

@dataclass
class Money:
    amount: Decimal
    currency: str

def send_money(from_account: AccountNumber, to_account: AccountNumber, money: Money):
    pass

# Usage - type safety!
sender = AccountNumber("123456")
receiver = AccountNumber("789012")
amount = Money(Decimal("100.00"), "USD")
send_money(sender, receiver, amount)
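
To see what a value object buys you, the sketch below repeats the `AccountNumber` class from the fix and shows validation failing fast at construction time. The `frozen=True` option is an addition that is not in the original; it is one way to make the value object immutable.

```python
from dataclasses import dataclass


@dataclass(frozen=True)  # assumption: immutability added for illustration
class AccountNumber:
    value: str

    def __post_init__(self):
        if not self.is_valid():
            raise ValueError(f"Invalid account: {self.value}")

    def is_valid(self):
        return len(self.value) == 6 and self.value.isdigit()


valid = AccountNumber("123456")

try:
    AccountNumber("12AB")  # too short and not numeric
except ValueError as error:
    message = str(error)

print(message)  # Invalid account: 12AB
```

An invalid account number can no longer travel through the system as an innocent-looking string; it is rejected where it is created.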

6. Dead Code

Code that is no longer used.

FIX: Delete it! Version control keeps the history.

# ❌ Commented code
# def old_implementation():
#     pass

# ✅ Delete it
# If you need it later, restore it from Git

7. Inappropriate Intimacy

Classes that know too much about each other's internal details.

❌ SMELL:

class Order:
    def __init__(self):
        self.items = []

class OrderProcessor:
    def calculate_total(self, order):
        # Reaches directly into Order's internal structure
        total = 0
        for item in order.items:
            total += item.price * item.quantity
        return total

✅ FIX - Encapsulation:

class Order:
    def __init__(self):
        self._items = []
    
    def calculate_total(self):  # Logic moved into Order
        return sum(item.price * item.quantity for item in self._items)

class OrderProcessor:
    def process(self, order):
        total = order.calculate_total()  # Call the public method
        # ...
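
A runnable sketch of the encapsulated version is shown below. The `Item` dataclass and the `add_item` method are assumptions added so the example is complete; the original only shows `_items` and `calculate_total`.

```python
from dataclasses import dataclass


@dataclass
class Item:  # hypothetical line item, not defined in the original
    price: float
    quantity: int


class Order:
    def __init__(self):
        self._items = []  # internal detail, not accessed from outside

    def add_item(self, item):  # assumed public way to add items
        self._items.append(item)

    def calculate_total(self):
        return sum(item.price * item.quantity for item in self._items)


class OrderProcessor:
    def process(self, order):
        # Relies only on Order's public interface.
        return order.calculate_total()


order = Order()
order.add_item(Item(price=10.0, quantity=2))
order.add_item(Item(price=5.0, quantity=1))
total = OrderProcessor().process(order)
print(total)  # 25.0
```

If `Order` later stores its items differently, only `Order` has to change; `OrderProcessor` is unaffected.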

Refactoring Techniques

1. Extract Method

Move a block of code into its own function.

Before:

def print_invoice(invoice):
    print("***************")
    print("*** INVOICE ***")
    print("***************")
    
    # Print details
    print(f"Customer: {invoice.customer}")
    print(f"Amount: ${invoice.amount}")

After:

def print_invoice(invoice):
    print_banner()
    print_details(invoice)

def print_banner():
    print("***************")
    print("*** INVOICE ***")
    print("***************")

def print_details(invoice):
    print(f"Customer: {invoice.customer}")
    print(f"Amount: ${invoice.amount}")

2. Rename Variable/Method

Rename for clarity.

# Before
def calc(d):
    return d * 0.1

# After
def calculate_discount(price):
    DISCOUNT_RATE = 0.1
    return price * DISCOUNT_RATE

3. Replace Magic Number with Constant

# Before
if user.age >= 18:
    pass

# After
LEGAL_AGE = 18
if user.age >= LEGAL_AGE:
    pass

4. Replace Conditional with Polymorphism

# Before - a chain of if/else
class Animal:
    def __init__(self, type):
        self.type = type
    
    def make_sound(self):
        if self.type == "dog":
            return "Woof!"
        elif self.type == "cat":
            return "Meow!"
        elif self.type == "cow":
            return "Moo!"

# After - polymorphism
from abc import ABC, abstractmethod

class Animal(ABC):
    @abstractmethod
    def make_sound(self): pass

class Dog(Animal):
    def make_sound(self):
        return "Woof!"

class Cat(Animal):
    def make_sound(self):
        return "Meow!"

class Cow(Animal):
    def make_sound(self):
        return "Moo!"
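
The payoff of the polymorphic version shows up at the call site, where client code no longer branches on a type string. A minimal sketch, repeating two of the subclasses above so it runs on its own (the `describe` helper is hypothetical):

```python
from abc import ABC, abstractmethod


class Animal(ABC):
    @abstractmethod
    def make_sound(self):
        """Return the sound this animal makes."""


class Dog(Animal):
    def make_sound(self):
        return "Woof!"


class Cat(Animal):
    def make_sound(self):
        return "Meow!"


def describe(animals):
    # No if/elif on a type string: each object supplies its own behavior.
    return [animal.make_sound() for animal in animals]


sounds = describe([Dog(), Cat()])
print(sounds)  # ['Woof!', 'Meow!']
```

Adding a new animal means adding a new subclass; `describe` does not need to change.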

5. Introduce Parameter Object

# Before - many parameters
def create_address(street, city, state, zip_code, country):
    pass

# After - parameter object
@dataclass
class Address:
    street: str
    city: str
    state: str
    zip_code: str
    country: str

def create_address(address: Address):
    pass

6. Replace Nested Conditional with Guard Clauses

# Before - nested ifs
def process_payment(payment):
    if payment is not None:
        if payment.amount > 0:
            if payment.is_valid():
                # Process payment
                return True
            else:
                return False
        else:
            return False
    else:
        return False

# After - guard clauses (early return)
def process_payment(payment):
    if payment is None:
        return False
    
    if payment.amount <= 0:
        return False
    
    if not payment.is_valid():
        return False
    
    # Happy path at the end
    # Process payment
    return True

Static Code Analysis & Linting

These tools detect issues automatically.

Python Tools

Pylint - Comprehensive checker:

pip install pylint
pylint myapp.py

# Output:
# ************* Module myapp
# myapp.py:1:0: C0114: Missing module docstring (missing-module-docstring)
# myapp.py:5:0: C0103: Variable name "X" doesn't conform to snake_case naming style (invalid-name)

Flake8 - Style guide enforcement:

pip install flake8
flake8 myapp.py

# Output:
# myapp.py:1:1: E302 expected 2 blank lines, found 1
# myapp.py:10:80: E501 line too long (82 > 79 characters)

Black - Auto-formatter:

pip install black
black myapp.py  # Formats automatically

MyPy - Type checking:

# myapp.py
def add(a: int, b: int) -> int:
    return a + b

result: str = add(1, 2)  # Type error!

# Run mypy
mypy myapp.py
# Output: error: Incompatible types in assignment

JavaScript Tools

ESLint:

npm install --save-dev eslint
npx eslint src/

# .eslintrc.json
{
  "extends": "airbnb",
  "rules": {
    "semi": ["error", "always"],
    "quotes": ["error", "single"]
  }
}

Prettier:

npm install --save-dev prettier
npx prettier --write src/

Pre-commit Hooks

Run linters automatically before each commit.

# Install pre-commit
pip install pre-commit

# .pre-commit-config.yaml
repos:
  - repo: https://github.com/psf/black
    rev: 23.1.0
    hooks:
      - id: black
  
  - repo: https://github.com/PyCQA/flake8
    rev: 6.0.0
    hooks:
      - id: flake8

# Install hooks
pre-commit install

# Now linters run automatically on git commit!

Type Systems: Static vs Dynamic Typing

Dynamic Typing (Python, JavaScript)

# Python - runtime type checking
def add(a, b):
    return a + b

add(1, 2)        # OK
add("a", "b")    # Also OK (concatenation)
add(1, "b")      # Runtime error!

Pros: Flexible, less boilerplate
Cons: Errors only at runtime
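
The "errors only at runtime" point can be seen directly: the faulty call below is detected only when that line actually executes. A small sketch that reuses the `add` function from above:

```python
def add(a, b):
    return a + b


print(add(1, 2))      # 3
print(add("a", "b"))  # ab

try:
    add(1, "b")  # nothing flags this until the line runs
except TypeError as error:
    error_type = type(error).__name__

print(error_type)  # TypeError
```

If the bad call sits on a rarely executed branch, the error can stay hidden until production traffic reaches it.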

Static Typing (C#, Java, TypeScript)

// C# - compile-time type checking
public int Add(int a, int b) {
    return a + b;
}

Add(1, 2);       // OK
Add("a", "b");   // Compile error!

Pros: Catch errors early, better IDE support
Cons: More verbose

Type Hints in Python (Best of both worlds)

from typing import List, Optional, Dict

def process_users(users: List[str]) -> Dict[str, int]:
    result: Dict[str, int] = {}
    for user in users:
        result[user] = len(user)
    return result

# Optional type
def get_user(user_id: int) -> Optional[User]:
    # May return User or None
    return db.get(user_id)

# Run mypy to check types (it has no effect at runtime)

Benefits:

  • Documentation
  • IDE autocomplete
  • Catch bugs with mypy
  • Still flexible (type hints are not enforced at runtime)
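
The last point is easy to verify. In the sketch below the hints say `int`, yet Python runs the call with strings without complaint; only a static checker such as mypy would report it.

```python
def add(a: int, b: int) -> int:
    return a + b


# The hints are stored as metadata but not enforced at runtime,
# so this call still works even though it contradicts them.
result = add("a", "b")
print(result)  # ab

# The annotations remain available for tools and IDEs to inspect.
print(add.__annotations__)
```

This is why type hints pay off mainly when a checker runs regularly, for example in CI or a pre-commit hook.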

Key Takeaways

  • Clean code is easier to read than "clever" code - prioritize clarity over cleverness
  • Meaningful names reveal intent - avoid mental mapping
  • Functions should do ONE thing, be small (ideally 5-10 lines, at most 20-30), and have few parameters (0-3)
  • Comments explain WHY, not WHAT - good code is self-documenting
  • Code smells: duplicated code, long methods, large classes, primitive obsession
  • Refactoring techniques: extract method, rename, replace magic numbers, guard clauses
  • Static analysis tools catch issues early - use linters and formatters
  • Type hints in dynamic languages provide documentation and error checking

In the next article, we will explore Testing Strategy: the test pyramid, TDD, BDD, and how to measure test quality.


This article is part of the series "From Zero to AI Engineer" - Module 3: Implementation & Quality Assurance