Fall 2025

Artificial Intelligence (2 Div.)

Reinforcement learning (RL) is a popular machine learning paradigm for solving sequential decision-making problems. In this paradigm, agents learn optimal policies by repeatedly interacting with an environment to maximize cumulative reward. This course covers the foundational concepts of RL, including states, actions, and rewards, the Markov decision process, and the trade-off between exploration and exploitation. We will also study key RL methods, such as Monte Carlo methods, temporal difference learning, function approximation, and policy gradients. In addition, you will work in a small team to implement RL agents that solve problems of increasing difficulty, from simple to complex.
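
As a rough illustration of the interaction loop described above, the following minimal sketch trains a tabular Q-learning agent (a temporal difference method) with epsilon-greedy exploration. The Corridor environment, its reward values, and all hyperparameters are hypothetical choices made for this example only; they are not taken from the course materials or the competition environments.

```python
import random


class Corridor:
    """Hypothetical toy environment: cells 0..length-1, start at 0, goal at the last cell."""

    def __init__(self, length=5):
        self.length = length
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        # Action 0 moves left, action 1 moves right; the walls clamp the position.
        move = 1 if action == 1 else -1
        self.state = min(max(self.state + move, 0), self.length - 1)
        done = self.state == self.length - 1
        # Assumed reward scheme: small step cost, +1 on reaching the goal.
        reward = 1.0 if done else -0.01
        return self.state, reward, done


def train(episodes=200, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Run tabular Q-learning and return the learned action-value table."""
    rng = random.Random(seed)
    env = Corridor()
    q = [[0.0, 0.0] for _ in range(env.length)]
    for _ in range(episodes):
        state = env.reset()
        done = False
        while not done:
            if rng.random() < epsilon:
                # Exploration: try a random action.
                action = rng.randrange(2)
            else:
                # Exploitation: pick a best-known action, breaking ties at random.
                best = max(q[state])
                action = rng.choice([a for a in (0, 1) if q[state][a] == best])
            next_state, reward, done = env.step(action)
            # One-step TD target; no bootstrapping from a terminal state.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Return the greedy action for every state of the table."""
    return [max((0, 1), key=lambda a: row[a]) for row in q]


if __name__ == "__main__":
    print(greedy_policy(train()))
```

With these assumed settings, the greedy policy should select action 1 (move right) in every non-terminal cell. The entry for the goal cell is never updated and carries no meaning.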


Instruction

Course Staff
Time & Location
  • Mon./Thu. 09:00 - 10:15, #609, College of Engineering #6
Office Hours
  • Tue. 13:00 - 15:00
Textbook
  • [Ri 20] Reinforcement Learning: An Introduction, 2nd Ed., Richard S. Sutton and Andrew G. Barto, The MIT Press.
Prerequisite
  • Python Programming
Grading Policy
Reinforcement Learning Competitions (90%)
  • Round 1: Grid Crossing! (10%)
  • Round 2: Grid Adventure! (20%)
  • Round 3: Avoid Blurp! (20%)
  • Round 4: Al-kka-gi! (40%)
Attendance (10%)
  • Each absence deducts 1% of the total credit
  • Three late arrivals count as one absence
  • 11 or more absences result in an F grade

Schedule

Week 01
September 01 — Overview & Logistics
September 04 — Basic Math

Week 02
September 08 — Introduction to Reinforcement Learning
  • Lecture
  • Reference
    • [Ri 20] Chap. 1
September 11 — Multi-Armed Bandits
  • Lecture
  • Reference
    • [Ri 20] Chap. 2

Week 03
September 15 — Markov Process
  • Lecture
  • Reference
    • [Ri 20] Chap. 3
September 18 — Dynamic Programming
  • Lecture
  • Reference
    • [Ri 20] Chap. 4

Week 04
September 22 — Tutorial on Gymnasium
September 25 — Monte-Carlo Methods: On-Policy Methods
  • Lecture
  • Reference
    • [Ri 20] Chap. 5

Week 05
September 29 — Monte-Carlo Methods: Off-Policy Methods
  • Lecture
  • Reference
    • [Ri 20] Chap. 5
October 02 — Temporal Difference Learning
  • Lecture
  • Reference
    • [Ri 20] Chap. 6

Week 06
October 06 — Chuseok Holiday
  • No Class
October 09 — Hangul Day
  • No Class

Week 07
October 13 — Competition Round 1: Grid Crossing!
October 16 — n-Step Bootstrapping
  • Lecture
  • Reference
    • [Ri 20] Chap. 7

Week 08
October 20 — Focus on Midterm Exam
  • No Class
October 23 — Planning & Learning: Model-based Methods & Experience Sampling
  • Lecture
  • Reference
    • [Ri 20] Chap. 8

Week 09
October 27 — Planning & Learning: Trajectory Sampling & Decision-Time Planning
  • Lecture
  • Reference
    • [Ri 20] Chap. 8
October 30 — Competition Round 2: Grid Adventure!

Week 10
November 03 — Function Approximation: Basics & Non-Parametric FA
  • Lecture
  • Reference
    • [Ri 20] Chap. 9 - 10
November 06 — Function Approximation: Linear Function Approximation
  • Lecture
  • Reference
    • [Ri 20] Chap. 9 - 10

Week 11
November 10 — Function Approximation: Artificial Neural Network
  • Lecture
  • Reference
    • [Ri 20] Chap. 9 - 10
November 13 — Function Approximation: Convolutional Neural Network
  • Lecture
  • Reference
    • [Ri 20] Chap. 9 - 10

Week 12
November 17 — Function Approximation: Deep Q-Network
November 20 — Competition Round 3: Avoid Blurp!

Week 13
November 24 — Policy Gradient: Basics
  • Lecture
  • Reference
    • [Ri 20] Chap. 13
November 27 — Policy Gradient: REINFORCE & Actor-Critic Methods
  • Lecture
  • Reference
    • [Ri 20] Chap. 13

Week 14
December 01 — Policy Gradient: Actor-Critic with Eligibility Traces & Asynchronous Advantage Actor-Critic
  • Lecture
  • Reference
    • [Ri 20] Chap. 13
    • Mnih, V., Badia, A.P., Mirza, M., Graves, A., Lillicrap, T., Harley, T., Silver, D. & Kavukcuoglu, K. "Asynchronous Methods for Deep Reinforcement Learning". Proceedings of The 33rd International Conference on Machine Learning. 48:1928-1937 (2016). https://proceedings.mlr.press/v48/mniha16.html.
December 04 — Self-Play
  • Lecture
  • Reference
    • Zhang, Ruize, Zelai Xu, Chengdong Ma, Chao Yu, Wei-Wei Tu, Wenhao Tang, Shiyu Huang et al. "A survey on self-play methods in reinforcement learning." arXiv preprint arXiv:2408.01072 (2024).

Week 15
December 08 — Focus on Final Exam
  • No Class
December 11 — Competition Round 4: League Stage

Week 16
December 15 — Competition Round 4: Playoff Stage
December 18 — Final Remarks