reveal.js

# Introduction to Machine Learning

**Prepared by:** อรรถพล คงหวาน

---

## Outline

- History and Evolution of Machine Learning

1. Supervised Learning
2. Unsupervised Learning
3. Reinforcement Learning Basics
4. Feature Engineering & Selection
5. Overfitting & Regularization

---

# History and Evolution of
# Machine Learning

---

## The Four Eras of ML

```mermaid
%%{init: {'theme': 'base', 'themeVariables': {
  'primaryColor': '#282828', 'primaryTextColor': '#ebdbb2',
  'primaryBorderColor': '#504945', 'lineColor': '#d79921',
  'secondaryColor': '#3c3836', 'tertiaryColor': '#1d2021',
  'background': '#282828', 'mainBkg': '#282828',
  'nodeBorder': '#504945', 'clusterBkg': '#32302f',
  'titleColor': '#ebdbb2', 'edgeLabelBackground': '#3c3836'
}}}%%
flowchart TB
  subgraph ERA1["Era 1: Foundations (1940–1960)"]
    A1["1943 - McCulloch-Pitts Neuron"]
    A2["1950 - Alan Turing - Turing Test"]
    A3["1957 - Perceptron (Rosenblatt)"]
  end
  subgraph ERA2["Era 2: AI Winter & Revival (1960–1990)"]
    B1["1969 - Minsky & Papert - XOR Problem"]
    B2["1986 - Backpropagation (Rumelhart)"]
  end
  subgraph ERA3["Era 3: The Golden Age of ML (1990–2010)"]
    C1["1995 - SVM (Cortes & Vapnik)"]
    C2["2001 - Random Forest (Breiman)"]
    C3["2006 - Deep Learning (Hinton)"]
  end
  subgraph ERA4["Era 4: Modern ML (2010–present)"]
    D1["2012 - AlexNet - ImageNet Winner"]
    D2["2017 - Transformer - Attention Is All You Need"]
    D3["2022– - LLMs: GPT, Claude"]
  end
  ERA1 --> ERA2 --> ERA3 --> ERA4
  style ERA1 fill:#3c3836,stroke:#d79921,color:#ebdbb2
  style ERA2 fill:#3c3836,stroke:#458588,color:#ebdbb2
  style ERA3 fill:#3c3836,stroke:#b8bb26,color:#ebdbb2
  style ERA4 fill:#3c3836,stroke:#fb4934,color:#ebdbb2
```

---

# 1. Supervised Learning

---

## Outline: Supervised Learning

- 1.1 Classification
  - Decision Tree · KNN · Naive Bayes
- 1.2 Regression
  - Linear Regression · MSE
- 1.3 Model Evaluation
  - Cross-Validation · Confusion Matrix · F1 · AUC-ROC

---

## The Supervised Learning Workflow

```mermaid
%%{init: {'theme': 'base', 'themeVariables': {
  'primaryColor': '#282828', 'primaryTextColor': '#ebdbb2',
  'primaryBorderColor': '#504945', 'lineColor': '#d79921',
  'background': '#282828', 'mainBkg': '#3c3836',
  'nodeBorder': '#504945', 'clusterBkg': '#32302f',
  'titleColor': '#ebdbb2', 'edgeLabelBackground': '#3c3836'
}}}%%
flowchart TD
  A["📊 Training data X, y"]
  B["🧠 ML algorithm"]
  C["📦 Model"]
  D["📋 New data X_new"]
  E["🎯 Prediction ŷ"]
  F["📈 Evaluation: Accuracy / MSE"]
  A --> B --> C
  D --> C --> E
  E --> F
  style A fill:#3c3836,stroke:#b8bb26,color:#ebdbb2
  style B fill:#3c3836,stroke:#d79921,color:#ebdbb2
  style C fill:#3c3836,stroke:#458588,color:#ebdbb2
  style D fill:#3c3836,stroke:#8ec07c,color:#ebdbb2
  style E fill:#3c3836,stroke:#fb4934,color:#ebdbb2
  style F fill:#3c3836,stroke:#d3869b,color:#ebdbb2
```

---

## 1.1.1 Decision Tree — Entropy

A **Decision Tree** makes decisions through a hierarchy of if-else rules. At each node it splits on the feature that yields the highest Information Gain.

<math xmlns="http://www.w3.org/1998/Math/MathML" display="block">
  <mrow>
    <mi>Entropy</mi><mo>(</mo><mi>S</mi><mo>)</mo>
    <mo>=</mo>
    <mo>-</mo>
    <munderover>
      <mo>∑</mo>
      <mrow><mi>i</mi><mo>=</mo><mn>1</mn></mrow>
      <mi>c</mi>
    </munderover>
    <msub><mi>p</mi><mi>i</mi></msub>
    <msub><mo>log</mo><mn>2</mn></msub>
    <msub><mi>p</mi><mi>i</mi></msub>
  </mrow>
</math>

**Example:** 10 patients, of whom 7 are sick and 3 are not

<math xmlns="http://www.w3.org/1998/Math/MathML" display="block">
  <mrow>
    <mi>Entropy</mi><mo>(</mo><mi>S</mi><mo>)</mo>
    <mo>=</mo>
    <mo>-</mo>
    <mfrac><mn>7</mn><mn>10</mn></mfrac>
    <msub><mo>log</mo><mn>2</mn></msub>
    <mfrac><mn>7</mn><mn>10</mn></mfrac>
    <mo>-</mo>
    <mfrac><mn>3</mn><mn>10</mn></mfrac>
    <msub><mo>log</mo><mn>2</mn></msub>
    <mfrac><mn>3</mn><mn>10</mn></mfrac>
    <mo>=</mo>
    <mn>0.881</mn>
    <mtext> bits</mtext>
  </mrow>
</math>
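
The arithmetic above can be checked in a few lines of Python. This is a minimal sketch; the helper name `entropy` is illustrative and does not come from any library.

```python
import math

def entropy(counts):
    """Shannon entropy in bits for a list of class counts."""
    total = sum(counts)
    # Empty classes are skipped: p * log2(p) tends to 0 as p tends to 0
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

# 10 patients: 7 sick, 3 not sick
print(round(entropy([7, 3]), 3))  # 0.881
```

A perfectly balanced split such as `entropy([5, 5])` gives 1 bit, the maximum for two classes.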

---

## 1.1.1 Decision Tree — Information Gain

<math xmlns="http://www.w3.org/1998/Math/MathML" display="block">
  <mrow>
    <mi>IG</mi><mo>(</mo><mi>S</mi><mo>,</mo><mi>A</mi><mo>)</mo>
    <mo>=</mo>
    <mi>Entropy</mi><mo>(</mo><mi>S</mi><mo>)</mo>
    <mo>-</mo>
    <munder>
      <mo>∑</mo>
      <mrow><mi>v</mi><mo>∈</mo><mi>Values</mi><mo>(</mo><mi>A</mi><mo>)</mo></mrow>
    </munder>
    <mfrac>
      <mrow><mo>|</mo><msub><mi>S</mi><mi>v</mi></msub><mo>|</mo></mrow>
      <mrow><mo>|</mo><mi>S</mi><mo>|</mo></mrow>
    </mfrac>
    <mi>Entropy</mi><mo>(</mo><msub><mi>S</mi><mi>v</mi></msub><mo>)</mo>
  </mrow>
</math>

- **S** = the dataset, **pᵢ** = proportion of samples in class i
- **c** = total number of classes, **A** = the attribute under consideration
- **Sᵥ** = the subset of S in which attribute A takes the value v
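
As a rough illustration of the Information Gain formula, the sketch below scores one candidate split. The attribute and the per-branch counts are hypothetical, invented only for this example; `entropy` and `information_gain` are illustrative helper names.

```python
import math

def entropy(counts):
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

def information_gain(parent_counts, subsets):
    """IG(S, A): parent entropy minus the size-weighted entropy of each subset S_v."""
    total = sum(parent_counts)
    weighted = sum(sum(s) / total * entropy(s) for s in subsets)
    return entropy(parent_counts) - weighted

# Hypothetical split of 10 patients (7 sick, 3 not sick) on a made-up attribute:
# value "yes" -> 6 sick, 1 not sick; value "no" -> 1 sick, 2 not sick
ig = information_gain([7, 3], [[6, 1], [1, 2]])
print(f"IG = {ig:.3f} bits")
```

A tree builder would compute this value for every candidate attribute and split on the largest one.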

```python
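from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Load Iris and hold out 20% of the samples for testing
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

model = DecisionTreeClassifier(max_depth=3, criterion='entropy')
model.fit(X_train, y_train)
print(f"Accuracy: {model.score(X_test, y_test):.4f}")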
```

---

## 1.1.2 K-Nearest Neighbors (KNN)

**KNN** classifies a point by finding its k nearest training examples and taking a majority vote over their labels.

| Point | x₁ | x₂ | Label | Distance from (3,4) |
|-----|----|----|-------|-------------------|
| B   | 2  | 3  | 🔴 Red | **1.41** |
| C   | 4  | 5  | 🔵 Blue | **1.41** |
| E   | 3.5 | 4.5 | 🔵 Blue | **0.71** |

k=3 → Red=1, Blue=2 → **Prediction: Blue** ✅
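
The vote can be reproduced with the standard library alone. This is a minimal sketch that uses only the three points listed in the table; `knn_predict` is an illustrative helper, not a library function.

```python
import math
from collections import Counter

# The three labelled points from the table, keyed by name
points = {"B": ((2, 3), "Red"), "C": ((4, 5), "Blue"), "E": ((3.5, 4.5), "Blue")}
query = (3, 4)

def knn_predict(query, points, k=3):
    # Rank neighbours by Euclidean distance to the query point
    ranked = sorted(points.values(), key=lambda p: math.dist(query, p[0]))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

for name, (xy, label) in points.items():
    print(name, label, round(math.dist(query, xy), 2))
print("Prediction:", knn_predict(query, points))
```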

---

## 1.1.2 KNN — Choosing k with Cross-Validation

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# Assumes X_train_scaled (standardized features) and y_train already exist
k_values = range(1, 21)
# Mean 5-fold cross-validated accuracy for each candidate k
cv_scores = [
    cross_val_score(KNeighborsClassifier(n_neighbors=k),
                    X_train_scaled, y_train, cv=5).mean()
    for k in k_values
]
best_k = k_values[np.argmax(cv_scores)]
print(f"Best k={best_k}, Accuracy={max(cv_scores):.4f}")
```

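- **Important:** KNN is distance-based, so standardize the features before fitting; otherwise features with large numeric ranges dominate the distance.
- Fit the scaler on the training data only with `StandardScaler().fit_transform(X_train)`, then apply the same fitted scaler with `transform(X_test)`.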

---

## 1.1.3 Naive Bayes — Bayes' Theorem

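**Naive Bayes** applies Bayes' theorem under the assumption that the features are conditionally independent given the class.

<math xmlns="http://www.w3.org/1998/Math/MathML" display="block">
  <mrow>
    <mi>P</mi><mo>(</mo><mi>C</mi><mo>|</mo><mi>X</mi><mo>)</mo>
    <mo>=</mo>
    <mfrac>
      <mrow><mi>P</mi><mo>(</mo><mi>X</mi><mo>|</mo><mi>C</mi><mo>)</mo><mi>P</mi><mo>(</mo><mi>C</mi><mo>)</mo></mrow>
      <mrow><mi>P</mi><mo>(</mo><mi>X</mi><mo>)</mo></mrow>
    </mfrac>
  </mrow>
</math>

- **P(C)** = Prior probability, **P(X|C)** = Likelihood
- **P(C|X)** = Posterior probability (the quantity we want)
- Independence lets the likelihood factorize: P(X|C) = ∏ᵢ P(xᵢ|C)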

---

## 1.1.3 Naive Bayes — Spam Filter Example (Laplace Smoothing)

| Email | "free" | "win" | "meeting" | Label |
|-------|--------|-------|-----------|-------|
| 1 | ✅ | ✅ | ❌ | 🔴 Spam |
| 2 | ✅ | ❌ | ❌ | 🔴 Spam |
| 3 | ❌ | ❌ | ✅ | 🟢 Ham |
| 4 | ❌ | ✅ | ✅ | 🟢 Ham |

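**Test email:** "free"=✅, "win"=✅, "meeting"=❌
- P(free|Spam) = (2+1)/(2+2) = **0.75**, P(free|Ham) = (0+1)/(2+2) = **0.25**
- P(win|Spam) = P(win|Ham) = (1+1)/(2+2) = **0.50**
- P(no meeting|Spam) = (2+1)/(2+2) = **0.75**, P(no meeting|Ham) = (0+1)/(2+2) = **0.25**
- Score Spam ∝ 0.5 × 0.75 × 0.50 × 0.75 = **0.141**
- Score Ham  ∝ 0.5 × 0.25 × 0.50 × 0.25 = **0.016**
- **Prediction: Spam** ✅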
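
The scores can be recomputed from the four training emails. This is a sketch of a Bernoulli-style Naive Bayes with add-one (Laplace) smoothing, in which an absent word contributes 1 − P(word|class); the function name `bernoulli_nb_scores` and the 0/1 encoding are assumptions made for illustration.

```python
# Each row: (free, win, meeting) encoded as 0/1, plus the label
emails = [((1, 1, 0), "spam"), ((1, 0, 0), "spam"),
          ((0, 0, 1), "ham"), ((0, 1, 1), "ham")]

def bernoulli_nb_scores(x, emails, alpha=1):
    scores = {}
    for label in {lab for _, lab in emails}:
        rows = [f for f, lab in emails if lab == label]
        score = len(rows) / len(emails)              # prior P(C)
        for j, xj in enumerate(x):
            # Smoothed P(word_j present | C); the 2 counts the two possible values
            p1 = (sum(r[j] for r in rows) + alpha) / (len(rows) + 2 * alpha)
            score *= p1 if xj else 1 - p1            # absent words use 1 - p
        scores[label] = score
    return scores

scores = bernoulli_nb_scores((1, 1, 0), emails)
posterior_spam = scores["spam"] / sum(scores.values())
print(scores, round(posterior_spam, 2))
```

Normalizing the two scores turns them into posterior probabilities that sum to 1.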

---

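## 1.2.1 Linear Regression — Model and Loss Function

<math xmlns="http://www.w3.org/1998/Math/MathML" display="block">
  <mrow>
    <mi>y</mi><mo>=</mo>
    <msub><mi>β</mi><mn>0</mn></msub>
    <mo>+</mo>
    <munderover>
      <mo>∑</mo>
      <mrow><mi>i</mi><mo>=</mo><mn>1</mn></mrow>
      <mi>p</mi>
    </munderover>
    <msub><mi>β</mi><mi>i</mi></msub><msub><mi>x</mi><mi>i</mi></msub>
    <mo>+</mo><mi>ε</mi>
  </mrow>
</math>

**Mean Squared Error (MSE):**

<math xmlns="http://www.w3.org/1998/Math/MathML" display="block">
  <mrow>
    <mi>MSE</mi><mo>=</mo>
    <mfrac><mn>1</mn><mi>n</mi></mfrac>
    <munderover>
      <mo>∑</mo>
      <mrow><mi>j</mi><mo>=</mo><mn>1</mn></mrow>
      <mi>n</mi>
    </munderover>
    <msup>
      <mrow><mo>(</mo><msub><mi>y</mi><mi>j</mi></msub><mo>-</mo><msub><mover><mi>y</mi><mo>^</mo></mover><mi>j</mi></msub><mo>)</mo></mrow>
      <mn>2</mn>
    </msup>
  </mrow>
</math>

- **β₀** = intercept, **βᵢ** = coefficients, **ε** = error term
- **ŷ** = predicted value, **y** = actual value
- **p** = number of features, **n** = number of samples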

---

## 1.2.1 Linear Regression — Worked Example by Hand

| House | Area x (m²) | Price y (million baht) |
|------|-------------------|---------------|
| 1 | 50 | 2.5 |
| 2 | 80 | 4.0 |
| 3 | 100 | 5.5 |
| 4 | 120 | 6.0 |
| 5 | 150 | 7.5 |

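x̄ = **100**, ȳ = **5.1**

β₁ = Σ(x − x̄)(y − ȳ) / Σ(x − x̄)² = 290 / 5800 = **0.05** · β₀ = ȳ − β₁x̄ = 5.1 − 0.05 × 100 = **0.1**

**Fitted line:** ŷ = 0.1 + 0.05 × x
**Prediction for 130 m²:** ŷ = 0.1 + 0.05 × 130 = **6.6 million baht**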
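
The least-squares fit for the five houses can be verified with plain Python. This is a minimal sketch of the single-feature closed form; the variable names are illustrative.

```python
xs = [50, 80, 100, 120, 150]     # area in square metres
ys = [2.5, 4.0, 5.5, 6.0, 7.5]   # price in million baht

x_bar = sum(xs) / len(xs)
y_bar = sum(ys) / len(ys)

# Least-squares slope and intercept for one feature
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
s_xx = sum((x - x_bar) ** 2 for x in xs)
beta1 = s_xy / s_xx
beta0 = y_bar - beta1 * x_bar

# Mean squared error of the fitted line on the training points
mse = sum((y - (beta0 + beta1 * x)) ** 2 for x, y in zip(xs, ys)) / len(xs)
print(f"y_hat = {beta0:.2f} + {beta1:.3f} * x, MSE = {mse:.3f}")
print(f"130 sq m -> {beta0 + beta1 * 130:.2f} million baht")
```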

---

## 1.3 Model Evaluation — Cross-Validation

```mermaid
%%{init: {'theme': 'base', 'themeVariables': {
  'primaryColor': '#282828', 'primaryTextColor': '#ebdbb2',
  'primaryBorderColor': '#504945', 'lineColor': '#d79921',
  'background': '#282828', 'mainBkg': '#3c3836',
  'nodeBorder': '#504945', 'clusterBkg': '#32302f',
  'titleColor': '#ebdbb2', 'edgeLabelBackground': '#3c3836'
}}}%%
flowchart TB
  subgraph FOLD1["Fold 1"]
    T1["Test | Train | Train | Train | Train"]
  end
  subgraph FOLD2["Fold 2"]
    T2["Train | Test | Train | Train | Train"]
  end
  subgraph FOLD3["Fold 3"]
    T3["Train | Train | Test | Train | Train"]
  end
  FOLD1 --> S1["Score 1"]
  FOLD2 --> S2["Score 2"]
  FOLD3 --> S3["Score 3"]
  S1 & S2 & S3 --> AVG["📊 Mean ± Std"]
  style FOLD1 fill:#3c3836,stroke:#d79921,color:#ebdbb2
  style FOLD2 fill:#3c3836,stroke:#b8bb26,color:#ebdbb2
  style FOLD3 fill:#3c3836,stroke:#458588,color:#ebdbb2
  style AVG fill:#3c3836,stroke:#fb4934,color:#ebdbb2
```
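
How the folds are formed can be sketched without any library. The generator below is an illustrative simplification: it splits indices in order without shuffling or stratification, and the name `k_fold_indices` is an assumption.

```python
def k_fold_indices(n_samples, k=5):
    """Yield (train_idx, test_idx) pairs; every sample is tested exactly once."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most 1
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size

folds = list(k_fold_indices(10, k=5))
for i, (train_idx, test_idx) in enumerate(folds, start=1):
    print(f"Fold {i}: test={test_idx} train={train_idx}")
```

A model would be trained on each `train_idx`, scored on the matching `test_idx`, and the k scores summarized as mean ± standard deviation.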

---

## 1.3 Confusion Matrix and Metrics

| Metric | Formula | Meaning |
|--------|------|-----------|
| **Accuracy** | (TP+TN)/(TP+TN+FP+FN) | Share of all predictions that are correct |
| **Precision** | TP/(TP+FP) | Share of predicted positives that are truly positive |
| **Recall** | TP/(TP+FN) | Share of actual positives that are predicted positive |
| **F1-Score** | 2×(P×R)/(P+R) | Harmonic mean of Precision (P) and Recall (R) |
| **AUC-ROC** | Area under the ROC curve | Overall ability to separate the classes across thresholds |

- TP=True Positive, TN=True Negative
- FP=False Positive, FN=False Negative
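
The formulas in the table translate directly into code. The counts below are hypothetical and chosen only to make the arithmetic easy to follow; `classification_metrics` is an illustrative helper.

```python
def classification_metrics(tp, tn, fp, fn):
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Hypothetical confusion-matrix counts, for illustration only
m = classification_metrics(tp=40, tn=45, fp=5, fn=10)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
```

The sketch does not guard against division by zero, which occurs when no positives are predicted or none are present.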

---

# 2. Unsupervised Learning

---

## Outline: Unsupervised Learning

- 2.1 Clustering
  - K-Means · Hierarchical · DBSCAN
- 2.2 Dimensionality Reduction
  - PCA (Step-by-Step) · t-SNE

---

## Types of Unsupervised Learning

```mermaid
%%{init: {'theme': 'base', 'themeVariables': {
  'primaryColor': '#282828', 'primaryTextColor': '#ebdbb2',
  'primaryBorderColor': '#504945', 'lineColor': '#d79921',
  'background': '#282828', 'mainBkg': '#3c3836',
  'nodeBorder': '#504945', 'clusterBkg': '#32302f',
  'titleColor': '#ebdbb2', 'edgeLabelBackground': '#3c3836'
}}}%%
mindmap
  root((Unsupervised Learning))
    Clustering
      K-Means
      Hierarchical
      DBSCAN
    Dimensionality Reduction
      PCA
      t-SNE
      UMAP
    Generative Models
      Autoencoders
      GMM
      VAE
    Association Rules
      Apriori
      FP-Growth
```

---

## 2.1.1 K-Means Clustering — Objective Function

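**K-Means** partitions the data into k clusters by minimizing the within-cluster sum of squares (WCSS), the total squared distance from each point to its cluster centroid.

<math xmlns="http://www.w3.org/1998/Math/MathML" display="block">
  <mrow>
    <mi>WCSS</mi><mo>=</mo>
    <munderover>
      <mo>∑</mo>
      <mrow><mi>j</mi><mo>=</mo><mn>1</mn></mrow>
      <mi>k</mi>
    </munderover>
    <munder>
      <mo>∑</mo>
      <mrow><mi>x</mi><mo>∈</mo><msub><mi>C</mi><mi>j</mi></msub></mrow>
    </munder>
    <msup>
      <mrow><mo>‖</mo><mi>x</mi><mo>-</mo><msub><mi>μ</mi><mi>j</mi></msub><mo>‖</mo></mrow>
      <mn>2</mn>
    </msup>
  </mrow>
</math>

**K-Means steps:**
1. Choose k and pick the initial centroids at random
2. **Assignment:** assign each point to its nearest centroid
3. **Update:** recompute each centroid as the mean of the points assigned to it
4. Repeat steps 2-3 until the centroids stop changing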

---

## 2.1.1 K-Means — Worked Example (k=2)

| Point | x | y | Distance to C1=(1,1), C2=(5,5) | Cluster |
|-----|---|---|-----------------------------|---------|
| P1 | 1 | 2 | d(C1)=1.0, d(C2)=5.0 | **1** |
| P2 | 2 | 1 | d(C1)=1.0, d(C2)=5.0 | **1** |
| P3 | 4 | 5 | d(C1)=5.0, d(C2)=1.0 | **2** |
| P4 | 5 | 4 | d(C1)=5.0, d(C2)=1.0 | **2** |
| P5 | 5 | 6 | d(C1)=6.4, d(C2)=1.0 | **2** |

**New centroids:** C1 = (1.5, 1.5) · C2 = (4.67, 5.0)
**Choosing k:** Elbow Method + Silhouette Score
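
One assignment step followed by one update step can be traced in code. This is a minimal sketch for the five points and two starting centroids of this example; it does not handle empty clusters and runs a single iteration only.

```python
import math

points = {"P1": (1, 2), "P2": (2, 1), "P3": (4, 5), "P4": (5, 4), "P5": (5, 6)}
centroids = [(1, 1), (5, 5)]

# Assignment step: index of the nearest centroid for every point
assignment = {
    name: min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
    for name, p in points.items()
}

# Update step: each centroid moves to the mean of its assigned points
new_centroids = []
for j in range(len(centroids)):
    members = [points[n] for n, c in assignment.items() if c == j]
    new_centroids.append(tuple(sum(axis) / len(members) for axis in zip(*members)))

# Within-cluster sum of squares after the update
wcss = sum(math.dist(p, new_centroids[assignment[n]]) ** 2 for n, p in points.items())
print(assignment, new_centroids, round(wcss, 3))
```

Cluster indices are 0-based in the code, so index 0 corresponds to cluster 1 in the table.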

---

## 2.1.2 Hierarchical Clustering — Linkage Methods

**Hierarchical Clustering** builds a dendrogram, so k does not have to be fixed in advance.

| Method | Formula | Strength | Weakness |
|--------|------|-------|---------|
| **Single** | min d(a,b) | Handles elongated clusters | Sensitive to outliers |
| **Complete** | max d(a,b) | Clusters of similar size | Poor on elongated clusters |
| **Average** | mean d(a,b) | A compromise between the two | Slower |
| **Ward** | minimize variance | Compact, round clusters | Poor on non-globular clusters |

- **Agglomerative (bottom-up):** start with every point as its own cluster and merge the closest pair until a single cluster remains
- **Divisive (top-down):** start with one cluster and keep splitting until every point is its own cluster

---

## 2.1.3 DBSCAN — Density-Based Clustering

**DBSCAN** groups points by density and flags outliers automatically.

**Main parameters:**
- **ε (epsilon)** = radius of the neighborhood
- **MinPts** = minimum number of points required in an ε-neighborhood

**Point types:**
- **Core Point:** has at least MinPts points in its ε-neighborhood
- **Border Point:** is not a core point itself but lies within the ε-neighborhood of one
- **Noise Point (Outlier):** is neither of the above

```python
from sklearn.cluster import DBSCAN

# Assumes X_scaled holds the standardized feature matrix
db = DBSCAN(eps=0.3, min_samples=10)   # eps = ε, min_samples = MinPts
labels = db.fit_predict(X_scaled)      # label -1 marks noise points
```
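
The three point types can be illustrated with a small brute-force classifier. This is a sketch under stated assumptions: the six points are hypothetical, a point counts toward its own neighborhood, and the function only labels points without forming clusters.

```python
import math

def classify_points(points, eps, min_pts):
    """Label each point 'core', 'border' or 'noise'."""
    # Indices of all points within eps of each point (the point itself included)
    neighbours = [
        [j for j, q in enumerate(points) if math.dist(p, q) <= eps]
        for p in points
    ]
    core = {i for i, nb in enumerate(neighbours) if len(nb) >= min_pts}
    labels = []
    for i, nb in enumerate(neighbours):
        if i in core:
            labels.append("core")
        elif any(j in core for j in nb):
            labels.append("border")   # not dense itself, but next to a core point
        else:
            labels.append("noise")
    return labels

# Hypothetical 2-D points: a dense square, one point on its edge, one far away
pts = [(0, 0), (0, 1), (1, 0), (1, 1), (2, 1), (8, 8)]
print(classify_points(pts, eps=1.0, min_pts=3))
```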

---

## 2.2.1 PCA — Concept and Goals

**PCA** finds the directions of maximum variance and projects the data onto these new axes, the **Principal Components (PC)**.

**Main goals:**
- Reduce the number of features while losing as little information as possible
- Mitigate the Curse of Dimensionality
- Visualize high-dimensional data in 2D/3D
- Reduce multicollinearity before regression

**Example dataset: scores of 6 students**

| Student | Math (x₁) | Physics (x₂) |
|----------|-----------|--------------|
| A | 2 | 2.4 |
| B | 4 | 4.0 |
| C | 5 | 5.5 |
| D | 6 | 5.5 |
| E | 8 | 7.6 |
| F | 9 | 8.5 |

---

## PCA Steps 1-2: Mean and Centering

**Step 1: Compute the means**

x̄₁ = 34/6 = **5.667** · x̄₂ = 33.5/6 = **5.583**

**Step 2: Centering** (x̃ᵢ = xᵢ − x̄)

| Student | x̃₁ = x₁−5.667 | x̃₂ = x₂−5.583 |
|----------|----------------|----------------|
| A | −3.667 | −3.183 |
| B | −1.667 | −1.583 |
| C | −0.667 | −0.083 |
| D | +0.333 | −0.083 |
| E | +2.333 | +2.017 |
| F | +3.333 | +2.917 |

---

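## PCA Step 3: Covariance Matrix

Var(x̃₁) = 33.333/5 = **6.667** · Var(x̃₂) = 25.228/5 = **5.046** · Cov(x̃₁, x̃₂) = 28.767/5 = **5.753** (sample estimates, dividing by n − 1 = 5)

Σ = [[6.667, 5.753], [5.753, 5.046]]

**Covariance = 5.753 (large and positive, correlation ≈ 0.992)** → confirms that x₁ and x₂ are strongly related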
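
The covariance entries can be reproduced from the six score pairs. This is a minimal standard-library sketch using the sample estimator (division by n − 1); the variable names are illustrative.

```python
x1 = [2, 4, 5, 6, 8, 9]              # math scores
x2 = [2.4, 4.0, 5.5, 5.5, 7.6, 8.5]  # physics scores
n = len(x1)

m1, m2 = sum(x1) / n, sum(x2) / n
c1 = [v - m1 for v in x1]            # centred math scores
c2 = [v - m2 for v in x2]            # centred physics scores

# Sample variances and covariance divide by n - 1
var1 = sum(a * a for a in c1) / (n - 1)
var2 = sum(b * b for b in c2) / (n - 1)
cov12 = sum(a * b for a, b in zip(c1, c2)) / (n - 1)

cov_matrix = [[var1, cov12], [cov12, var2]]
print([[round(v, 3) for v in row] for row in cov_matrix])
```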

---

## PCA Step 4: Eigenvalues

Solve det(Σ − λI) = 0 → λ² − 11.712λ + 0.537 = 0 → λ₁ ≈ **11.666**, λ₂ ≈ **0.046**

**Eigenvectors (normalized):** e₁ ≈ (0.755, 0.656), e₂ ≈ (−0.656, 0.755)

---

## PCA Step 5: Explained Variance Ratio

| PC | Eigenvalue | Explained | Cumulative |
|----|-----------|-----------|------------|
| **PC1** | 11.666 | **99.61%** | 99.61% |
| PC2 | 0.046 | 0.39% | 100.00% |

**Conclusion:** PC1 alone explains 99.61% of the variance → the data can be reduced from 2D to 1D.
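
For a symmetric 2×2 matrix, the eigenvalues follow from the quadratic formula, so the explained-variance split can be checked without a linear-algebra library. In this sketch the three inputs are the sample variances and covariance of the six score pairs, rounded to six decimals.

```python
import math

# Sample covariance matrix entries for the two score columns
var1, var2, cov12 = 6.666667, 5.045667, 5.753333

# Roots of lambda^2 - trace * lambda + det = 0
trace = var1 + var2
det = var1 * var2 - cov12 ** 2
root = math.sqrt(trace ** 2 - 4 * det)
lam1, lam2 = (trace + root) / 2, (trace - root) / 2

# Share of the total variance captured by the first principal component
ratio1 = lam1 / (lam1 + lam2)
print(f"lambda1={lam1:.3f} lambda2={lam2:.3f} PC1 share={ratio1:.2%}")
```

The two eigenvalues always sum to the trace, which is the total variance of the centred data.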

---

## PCA Step 6: Projection (PC Scores)

**Projection formula:** Z = X̃E, where the columns of E are the eigenvectors e₁ and e₂

| Student | PC1 Score (z₁) | PC2 Score (z₂) |
|----------|----------------|----------------|
| A | **−4.856** | +0.002 |
| B | **−2.297** | −0.102 |
| C | **−0.558** | +0.374 |
| D | **+0.197** | −0.282 |
| E | **+3.084** | −0.008 |
| F | **+4.429** | +0.015 |

**Note:** every PC2 score is close to 0 (|z₂| < 0.4) → confirms that PC2 carries almost no additional information
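
The projection can be reproduced end to end from the raw scores. This is a sketch for the two-feature case only: it uses the closed-form eigenvector of a symmetric 2×2 matrix, and because the sign of each eigenvector is arbitrary, a library may return scores with flipped signs.

```python
import math

x1 = [2, 4, 5, 6, 8, 9]
x2 = [2.4, 4.0, 5.5, 5.5, 7.6, 8.5]
n = len(x1)
m1, m2 = sum(x1) / n, sum(x2) / n
c1 = [v - m1 for v in x1]
c2 = [v - m2 for v in x2]

var1 = sum(a * a for a in c1) / (n - 1)
var2 = sum(b * b for b in c2) / (n - 1)
cov12 = sum(a * b for a, b in zip(c1, c2)) / (n - 1)

# Largest eigenvalue and its unit eigenvector (cov12, lam1 - var1)
trace, det = var1 + var2, var1 * var2 - cov12 ** 2
lam1 = (trace + math.sqrt(trace ** 2 - 4 * det)) / 2
norm = math.hypot(cov12, lam1 - var1)
e1 = (cov12 / norm, (lam1 - var1) / norm)
e2 = (-e1[1], e1[0])             # orthogonal direction; overall sign is arbitrary

# Scores: dot product of each centred row with each eigenvector
z1 = [a * e1[0] + b * e1[1] for a, b in zip(c1, c2)]
z2 = [a * e2[0] + b * e2[1] for a, b in zip(c1, c2)]
for name, s1, s2 in zip("ABCDEF", z1, z2):
    print(f"{name}: z1={s1:+.3f} z2={s2:+.3f}")
```

Keeping only `z1` is the 1D representation; the sample variance of `z2` equals the small second eigenvalue.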

---

## PCA — Summary of All Steps

```mermaid
%%{init: {'theme': 'base', 'themeVariables': {
  'primaryColor': '#282828', 'primaryTextColor': '#ebdbb2',
  'primaryBorderColor': '#504945', 'lineColor': '#d79921',
  'background': '#282828', 'mainBkg': '#3c3836',
  'nodeBorder': '#504945', 'clusterBkg': '#32302f',
  'titleColor': '#ebdbb2', 'edgeLabelBackground': '#3c3836'
}}}%%
flowchart TD
  S1["📊 Step 1: Raw data X"]
  S2["➖ Step 2: Centering X̃ = X - mean"]
  S3["🔢 Step 3: Covariance Matrix Σ"]
  S4["🔍 Step 4: Eigendecomposition Σv=λv"]
  S5["📈 Step 5: Explained Variance λ₁/(λ₁+λ₂)=99.61%"]
  S6["🎯 Step 6: Projection Z = X̃ × E"]
  S7["✅ Result: 2D → 1D, retaining 99.61% of the variance"]
  S1-->S2-->S3-->S4-->S5-->S6-->S7
  style S1 fill:#3c3836,stroke:#d79921,color:#ebdbb2
  style S2 fill:#3c3836,stroke:#458588,color:#ebdbb2
  style S3 fill:#3c3836,stroke:#b8bb26,color:#ebdbb2
  style S4 fill:#3c3836,stroke:#d3869b,color:#ebdbb2
  style S5 fill:#3c3836,stroke:#83a598,color:#ebdbb2
  style S6 fill:#3c3836,stroke:#fe8019,color:#ebdbb2
  style S7 fill:#3c3836,stroke:#fb4934,color:#ebdbb2
```

---

## PCA vs t-SNE

| Property | PCA | t-SNE |
|-----------|-----|-------|
| **Type** | Linear | Non-linear |
| **Objective** | Maximize global variance | Preserve local structure |
| **Speed** | Very fast, O(nd²) | Slow, O(n²) |
| **Interpretability** | High | Low |
| **Transforming new data** | Works directly | Requires refitting |
| **Used for** | Preprocessing + Visualization | Visualization only |
| **Deterministic** | Yes | No (a seed must be fixed) |

---

## PCA — Caveats

| Issue | Details |
|---------|------------|
| **Scaling before PCA** | If features are on different scales, apply StandardScaler first |
| **Interpretability** | PC axes have no direct physical meaning |
| **Linear Method** | Captures linear structure only |
| **Information Loss** | Dropping components discards the variance they carried |
| **Sign of Eigenvectors** | The +/− sign is not unique |
| **Number of Components** | Common rule of thumb: keep enough for Cumulative Explained Variance ≥ 95% |

---

# 3. Reinforcement Learning Basics

---

## Outline: Reinforcement Learning

- 3.1 Core Components of RL
- 3.2 Q-Learning and the Bellman Equation
- 3.3 Policy and Value Functions

---

## 3.1 The Agent-Environment Loop

```mermaid
%%{init: {'theme': 'base', 'themeVariables': {
  'primaryColor': '#282828', 'primaryTextColor': '#ebdbb2',
  'primaryBorderColor': '#504945', 'lineColor': '#d79921',
  'background': '#282828', 'mainBkg': '#3c3836',
  'nodeBorder': '#504945', 'clusterBkg': '#32302f',
  'titleColor': '#ebdbb2', 'edgeLabelBackground': '#3c3836'
}}}%%
flowchart LR
  AG["🤖 Agent"]
  ENV["🌍 Environment"]
  AG -->|"Action (a)"| ENV
  ENV -->|"Next state (s')"| AG
  ENV -->|"Reward (r)"| AG
  style AG fill:#3c3836,stroke:#d79921,color:#ebdbb2
  style ENV fill:#3c3836,stroke:#83a598,color:#ebdbb2
```

**Core components:**
- **Agent:** the learner and decision maker
- **State (s):** the current situation of the environment
- **Action (a):** the move the agent chooses
- **Reward (r):** the feedback signal returned by the environment
- **Policy (π):** the strategy for choosing actions

---

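## 3.2 Q-Learning — Bellman Equation

<math xmlns="http://www.w3.org/1998/Math/MathML" display="block">
  <mrow>
    <mi>Q</mi><mo>(</mo><mi>s</mi><mo>,</mo><mi>a</mi><mo>)</mo>
    <mo>←</mo>
    <mi>Q</mi><mo>(</mo><mi>s</mi><mo>,</mo><mi>a</mi><mo>)</mo>
    <mo>+</mo>
    <mi>α</mi>
    <mo>[</mo>
    <mi>r</mi><mo>+</mo><mi>γ</mi>
    <munder><mo>max</mo><msup><mi>a</mi><mo>′</mo></msup></munder>
    <mi>Q</mi><mo>(</mo><msup><mi>s</mi><mo>′</mo></msup><mo>,</mo><msup><mi>a</mi><mo>′</mo></msup><mo>)</mo>
    <mo>-</mo>
    <mi>Q</mi><mo>(</mo><mi>s</mi><mo>,</mo><mi>a</mi><mo>)</mo>
    <mo>]</mo>
  </mrow>
</math>

- **α (alpha)** = learning rate: how far each update moves the estimate
- **γ (gamma)** = discount factor: how much future rewards count
- **r** = immediate reward
- **max Q(s', a')** = highest Q value available in the next state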

---

## 3.2 Q-Learning — Worked Example

**Given:** α=0.1, γ=0.9, Q(s₁,a₁)=0.0, r=+10, Q(s₂,best)=5.0

**Update:** Q(s₁,a₁) ← 0.0 + 0.1 × (10 + 0.9 × 5.0 − 0.0) = 0.1 × 14.5 = **1.45**

**Epsilon-Greedy Policy:**
- High ε → **Explore** (try a random action)
- Low ε → **Exploit** (use the learned Q-table)
- ε decays gradually, governed by `EPSILON_DECAY = 0.995`
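
A single update with the given numbers, plus the effect of the decay factor, can be sketched as follows. The starting value ε = 1.0 and the choice of 100 decay steps are assumptions made for illustration.

```python
alpha, gamma = 0.1, 0.9
q_s1_a1, reward, max_q_s2 = 0.0, 10.0, 5.0

# One temporal-difference update of Q(s1, a1)
td_target = reward + gamma * max_q_s2            # 10 + 0.9 * 5 = 14.5
q_new = q_s1_a1 + alpha * (td_target - q_s1_a1)
print(round(q_new, 2))  # 1.45

# Multiplicative epsilon decay (starting value assumed to be 1.0)
epsilon, EPSILON_DECAY = 1.0, 0.995
for _ in range(100):
    epsilon *= EPSILON_DECAY
print(round(epsilon, 3))
```

After 100 decay steps ε is still above 0.6, so with this factor the agent keeps exploring for several hundred steps.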

---

## 3.3 Policy and Value Functions

| Function | Symbol | Meaning |
|---------|-----------|-----------|
| **Policy** | π(a\|s) | Probability of choosing action a in state s |
| **State Value** | V(s) | Expected cumulative reward starting from state s |
| **Action Value** | Q(s,a) | Expected cumulative reward after taking a in s, then following π |
| **Advantage** | A(s,a) | Q(s,a) - V(s): how much better action a is than the policy's average |

```python
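import numpy as np

# Q-Learning update loop (core). Assumes Q is an (n_states, n_actions) array and
# that step(), N_EPISODES, N_ACTIONS, alpha, gamma and epsilon are defined elsewhere.
rng = np.random.default_rng()
for episode in range(N_EPISODES):
    state, done = 0, False                         # reset at the start of each episode
    while not done:
        if rng.random() > epsilon:
            action = int(np.argmax(Q[state]))      # exploit the current estimates
        else:
            action = int(rng.integers(N_ACTIONS))  # explore a random action
        next_state, reward, done = step(state, action)
        Q[state, action] += alpha * (reward + gamma * np.max(Q[next_state]) - Q[state, action])
        state = next_state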
```
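
For a version that runs on its own, the sketch below trains a Q-table on a tiny corridor. The environment, its `step` function, the reward of +10 at the right end and all constants are hypothetical choices made for illustration, not part of any library.

```python
import random

N_STATES, N_ACTIONS = 5, 2          # corridor of 5 cells; actions: 0 = left, 1 = right
ALPHA, GAMMA = 0.1, 0.9
EPSILON_DECAY = 0.995

def step(state, action):
    """Move along the corridor; reaching the last cell ends the episode with +10."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (10.0 if done else 0.0), done

random.seed(0)
Q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]
epsilon = 1.0
for episode in range(500):
    state, done = 0, False
    while not done:
        if random.random() > epsilon:
            action = max(range(N_ACTIONS), key=lambda a: Q[state][a])   # exploit
        else:
            action = random.randrange(N_ACTIONS)                        # explore
        next_state, reward, done = step(state, action)
        Q[state][action] += ALPHA * (reward + GAMMA * max(Q[next_state]) - Q[state][action])
        state = next_state
    epsilon *= EPSILON_DECAY        # explore less as the estimates improve

# Greedy action in every non-terminal state
policy = [max(range(N_ACTIONS), key=lambda a: Q[s][a]) for s in range(N_STATES - 1)]
print(policy)  # [1, 1, 1, 1]
```

The learned greedy policy moves right in every cell, which is the shortest path to the reward.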

---

# 4. Feature Engineering & Selection

---

## Outline: Feature Engineering

- Feature Creation
- Feature Transformation
- Feature Selection
  - Filter · Wrapper · Embedded Methods

---

## The Feature Engineering Workflow

```mermaid
%%{init: {'theme': 'base', 'themeVariables': {
  'primaryColor': '#282828', 'primaryTextColor': '#ebdbb2',
  'primaryBorderColor': '#504945', 'lineColor': '#d79921',
  'background': '#282828', 'mainBkg': '#3c3836',
  'nodeBorder': '#504945', 'clusterBkg': '#32302f',
  'titleColor': '#ebdbb2', 'edgeLabelBackground': '#3c3836'
}}}%%
flowchart TD
  RAW["📥 Raw Data"]
  FE1["🔧 Feature Creation"]
  FE2["🔄 Feature Transformation"]
  FE3["🔍 Feature Selection"]
  T1["Scaling: StandardScaler, MinMax"]
  T2["Encoding: One-Hot, Label"]
  T3["Log Transform: skewed data"]
  S1["Filter: Correlation, Chi-Square"]
  S2["Wrapper: RFE, Forward Selection"]
  S3["Embedded: LASSO, Tree Importance"]
  RAW --> FE1 --> FE2 --> FE3
  FE2 --- T1
  FE2 --- T2
  FE2 --- T3
  FE3 --- S1
  FE3 --- S2
  FE3 --- S3
  style RAW fill:#3c3836,stroke:#d79921,color:#ebdbb2
  style FE1 fill:#3c3836,stroke:#fb4934,color:#ebdbb2
  style FE2 fill:#3c3836,stroke:#458588,color:#ebdbb2
  style FE3 fill:#3c3836,stroke:#b8bb26,color:#ebdbb2
```

---

## Feature Creation and Transformation

**Feature Creation examples:**
```python
import numpy as np
import pandas as pd

# Assumes df is a DataFrame with siblings, fare, pclass and age columns
df['family_size']    = df['siblings'] + 1             # passenger plus siblings
df['is_alone']       = (df['family_size'] == 1).astype(int)
df['fare_log']       = np.log1p(df['fare'])           # log transform for skewed fares
df['fare_per_class'] = df['fare'] / df['pclass']      # ratio (interaction) feature
df['age_group']      = pd.cut(df['age'], bins=[0, 12, 18, 35, 60, 100])
```

**Encoding:**
```python
df['gender_enc']   = (df['gender'] == 'Female').astype(int)
embarked_dummies   = pd.get_dummies(df['embarked'], prefix='emb')
```

---

## Feature Selection — Three Main Approaches

**Filter Method — SelectKBest:**
```python
from sklearn.feature_selection import SelectKBest, f_classif

# Assumes X (feature matrix with at least 8 columns) and y (class labels) exist
selector = SelectKBest(score_func=f_classif, k=8)   # keep the 8 highest ANOVA F-scores
X_selected = selector.fit_transform(X, y)
```

**Embedded Method — Random Forest Importance:**
```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rf = RandomForestClassifier(n_estimators=100)
rf.fit(X, y)
importances = rf.feature_importances_             # impurity-based importance per feature
sorted_idx  = np.argsort(importances)[::-1][:10]  # indices of the 10 most important
```

- **Wrapper (RFE):** recursive feature elimination refits the model repeatedly, so it is slower but often more accurate

---

# 5. Overfitting & Regularization

---

## Outline: Overfitting & Regularization

- What Overfitting and Underfitting mean
- Regularization: L1 (LASSO) · L2 (Ridge) · ElasticNet
- Other techniques for preventing Overfitting

---

## Underfitting vs Good Fit vs Overfitting

```mermaid
%%{init: {'theme': 'base', 'themeVariables': {
  'primaryColor': '#282828', 'primaryTextColor': '#ebdbb2',
  'primaryBorderColor': '#504945', 'lineColor': '#d79921',
  'background': '#282828', 'mainBkg': '#3c3836',
  'nodeBorder': '#504945', 'clusterBkg': '#32302f',
  'titleColor': '#ebdbb2', 'edgeLabelBackground': '#3c3836'
}}}%%
flowchart LR
  UF["🔵 Underfitting - High Bias, Low Variance - Train Error: high, Test Error: high"]
  GF["🟢 Good Fit - Low Bias, Low Variance - Train Error: low, Test Error: low"]
  OF["🔴 Overfitting - Low Bias, High Variance - Train Error: very low, Test Error: high"]
  UF -->|"Increase complexity"| GF
  GF -->|"Too complex"| OF
  OF -->|"Regularization"| GF
  style UF fill:#3c3836,stroke:#458588,color:#ebdbb2
  style GF fill:#3c3836,stroke:#b8bb26,color:#ebdbb2
  style OF fill:#3c3836,stroke:#fb4934,color:#ebdbb2
```

---

## Regularization — L1, L2, ElasticNet

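**L1 Regularization (LASSO):**

<math xmlns="http://www.w3.org/1998/Math/MathML" display="block">
  <mrow>
    <mi>Loss</mi><mo>=</mo><mi>MSE</mi><mo>+</mo><mi>λ</mi>
    <munderover><mo>∑</mo><mrow><mi>i</mi><mo>=</mo><mn>1</mn></mrow><mi>p</mi></munderover>
    <mo>|</mo><msub><mi>β</mi><mi>i</mi></msub><mo>|</mo>
  </mrow>
</math>

**L2 Regularization (Ridge):**

<math xmlns="http://www.w3.org/1998/Math/MathML" display="block">
  <mrow>
    <mi>Loss</mi><mo>=</mo><mi>MSE</mi><mo>+</mo><mi>λ</mi>
    <munderover><mo>∑</mo><mrow><mi>i</mi><mo>=</mo><mn>1</mn></mrow><mi>p</mi></munderover>
    <msubsup><mi>β</mi><mi>i</mi><mn>2</mn></msubsup>
  </mrow>
</math>

**ElasticNet (L1 + L2):**

<math xmlns="http://www.w3.org/1998/Math/MathML" display="block">
  <mrow>
    <mi>Loss</mi><mo>=</mo><mi>MSE</mi>
    <mo>+</mo><msub><mi>λ</mi><mn>1</mn></msub>
    <munderover><mo>∑</mo><mrow><mi>i</mi><mo>=</mo><mn>1</mn></mrow><mi>p</mi></munderover>
    <mo>|</mo><msub><mi>β</mi><mi>i</mi></msub><mo>|</mo>
    <mo>+</mo><msub><mi>λ</mi><mn>2</mn></msub>
    <munderover><mo>∑</mo><mrow><mi>i</mi><mo>=</mo><mn>1</mn></mrow><mi>p</mi></munderover>
    <msubsup><mi>β</mi><mi>i</mi><mn>2</mn></msubsup>
  </mrow>
</math>

- **βᵢ** = model coefficients (i = 1…p), **λ**, **λ₁**, **λ₂** = penalty strengths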

---

## L1 vs L2 — Key Differences

| Property | L1 (LASSO) | L2 (Ridge) |
|-----------|------------|------------|
| **Effect on coefficients** | Some become exactly 0 (sparse) | All shrink toward 0 |
| **Feature Selection** | ✅ Automatic | ❌ No |
| **With correlated features** | Tends to keep one of them | Spreads weight roughly evenly |
| **Best suited to** | Many irrelevant features | Multicollinearity |

**λ** = regularization strength: the larger it is, the more heavily large coefficients are penalized

```python
from sklearn.linear_model import Ridge, Lasso, ElasticNet

# scikit-learn's `alpha` argument plays the role of λ; l1_ratio mixes L1 and L2
models = [Ridge(alpha=1.0), Lasso(alpha=1.0), ElasticNet(alpha=1.0, l1_ratio=0.5)]
```
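
Why L1 produces exact zeros while L2 only shrinks can be seen in the one-feature case, where both penalized problems have closed-form solutions. This is a sketch under stated assumptions: a single centred feature, no intercept, the objective written as MSE plus penalty, and illustrative sums `s_xy` and `s_xx`.

```python
def ridge_coef(s_xy, s_xx, n, lam):
    # Minimises (1/n) * sum((y - b*x)^2) + lam * b^2 for one centred feature
    return s_xy / (s_xx + n * lam)

def lasso_coef(s_xy, s_xx, n, lam):
    # Minimises (1/n) * sum((y - b*x)^2) + lam * |b| via soft-thresholding
    shrunk = max(abs(s_xy) - n * lam / 2, 0.0)
    return (1 if s_xy >= 0 else -1) * shrunk / s_xx

# Illustrative centred sums for n = 5 samples; the unpenalised slope is 0.05
s_xy, s_xx, n = 290.0, 5800.0, 5
for lam in (0, 20, 200):
    ridge = ridge_coef(s_xy, s_xx, n, lam)
    lasso = lasso_coef(s_xy, s_xx, n, lam)
    print(f"lambda={lam}: ridge={ridge:.4f} lasso={lasso:.4f}")
```

As λ grows, the ridge coefficient gets smaller but stays positive, whereas the lasso coefficient hits exactly zero once the penalty outweighs the correlation with the target.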

---

## Other Techniques for Preventing Overfitting

| Technique | How it works | Best suited to |
|--------|---------|----------|
| **Cross-Validation** | Evaluate on several folds | All models |
| **Early Stopping** | Stop training when validation loss stops improving | Neural Networks |
| **Dropout** | Randomly deactivate neurons during training | Neural Networks |
| **Data Augmentation** | Enlarge the training set with additional samples | Vision, NLP |
| **Ensemble Methods** | Combine several models | All models |
| **Pruning** | Remove unimportant branches | Decision Trees |
| **Regularization** | Add a penalty term to the loss | Linear Models |

---

## Summary: Machine Learning Foundations at a Glance

```mermaid
%%{init: {'theme': 'base', 'themeVariables': {
  'primaryColor': '#282828', 'primaryTextColor': '#ebdbb2',
  'primaryBorderColor': '#504945', 'lineColor': '#d79921',
  'background': '#282828', 'mainBkg': '#3c3836',
  'nodeBorder': '#504945', 'clusterBkg': '#32302f',
  'titleColor': '#ebdbb2', 'edgeLabelBackground': '#3c3836'
}}}%%
mindmap
  root((Machine Learning Foundations))
    Supervised Learning
      Classification
        Decision Tree
        KNN
        Naive Bayes
      Regression
        Linear Regression
        MSE Loss
      Model Evaluation
        Cross-Validation
        F1, AUC-ROC
    Unsupervised Learning
      Clustering
        K-Means
        Hierarchical
        DBSCAN
      Dim Reduction
        PCA
        t-SNE
    Reinforcement Learning
      Q-Learning
      Bellman Equation
      Policy Functions
    Feature Engineering
      Scaling & Encoding
      Feature Creation
      Filter/Wrapper/Embedded
    Regularization
      L1 LASSO
      L2 Ridge
      ElasticNet
```

---

## References

1. **Mitchell, T. M.** (1997). *Machine Learning*. McGraw-Hill.
2. **Géron, A.** (2022). *Hands-On Machine Learning* (3rd ed.). O'Reilly.
3. **Bishop, C. M.** (2006). *Pattern Recognition and Machine Learning*. Springer.
4. **Sutton & Barto** (2018). *Reinforcement Learning: An Introduction* (2nd ed.). MIT Press.
5. **Pedregosa et al.** (2011). Scikit-learn: ML in Python. *JMLR*, 12, 2825–2830.
6. **scikit-learn Docs** (2024). https://scikit-learn.org/stable/documentation.html
7. **Goodfellow et al.** (2016). *Deep Learning*. MIT Press.
8. **Hastie et al.** (2009). *The Elements of Statistical Learning* (2nd ed.). Springer.

---

# Questions and Discussion