Fall 2025

Artificial Intelligence (2 Div.)

Reinforcement learning (RL) is a popular machine learning paradigm for solving sequential decision-making problems. In this paradigm, an agent learns an optimal policy by repeatedly interacting with an environment to maximize its cumulative reward. This course covers the foundational concepts of RL, including states, actions, and rewards, the Markov decision process, and the trade-off between exploration and exploitation. We will also study key RL methods, such as Monte Carlo methods, temporal difference learning, function approximation, and policy gradients. In addition, you will work on a small team project to implement RL agents that solve problems of increasing difficulty, from simple to complex.
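As a rough illustration of the interaction loop and the exploration-versus-exploitation trade-off described above, here is a minimal sketch in plain Python. It is not course material: the `TwoArmedBandit` environment, its win probabilities, and the `run_epsilon_greedy` helper are hypothetical names and values chosen only for this example, and epsilon-greedy is just one simple way to balance exploring and exploiting.

```python
import random


class TwoArmedBandit:
    """Hypothetical toy environment: two actions with fixed win probabilities."""

    def __init__(self, win_probs=(0.3, 0.7), seed=0):
        self.win_probs = win_probs
        self.rng = random.Random(seed)

    def step(self, action):
        # Reward is 1.0 with the chosen arm's win probability, otherwise 0.0.
        return 1.0 if self.rng.random() < self.win_probs[action] else 0.0


def run_epsilon_greedy(env, n_actions=2, steps=2000, epsilon=0.1, seed=1):
    """Interact with env for `steps` steps and return value estimates,
    per-action pull counts, and the cumulative reward."""
    rng = random.Random(seed)
    q = [0.0] * n_actions   # running estimate of each action's value
    counts = [0] * n_actions
    total_reward = 0.0
    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: try a random action.
            action = rng.randrange(n_actions)
        else:
            # Exploit: pick the action with the highest current estimate.
            action = max(range(n_actions), key=lambda a: q[a])
        reward = env.step(action)
        counts[action] += 1
        # Incremental sample-average update of the chosen action's estimate.
        q[action] += (reward - q[action]) / counts[action]
        total_reward += reward
    return q, counts, total_reward


q_values, counts, total = run_epsilon_greedy(TwoArmedBandit())
print("estimated values:", q_values)
print("pull counts:", counts)
print("cumulative reward:", total)
```

Under these assumptions the agent should end up pulling the second arm far more often than the first, because its estimated value settles near the higher win probability. Setting `epsilon` to 0 removes exploration and can leave the agent stuck on whichever arm paid off first.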


Instruction

Course Staff
Time & Location
  • Mon./Thu. 09:00 - 10:15, #609, College of Engineering #6
Office Hours
  • Tue. 13:00 - 15:00
Textbook
  • [Ri20] Reinforcement Learning: An Introduction, 2nd Ed., Richard S. Sutton and Andrew G. Barto, The MIT Press.
Prerequisite
  • Python Programming
Grading Policy
Reinforcement Learning Competitions (90%)
  • Round 1: Grid Crossing! (10%)
  • Round 2: Grid Adventure! (20%)
  • Round 3: Avoid Blurp! (20%)
  • Round 4: Al-Kka-Gi! (40%)
Attendance (10%)
  • 1% of credit is deducted for each absence
  • 3 late arrivals count as 1 absence
  • 11 or more absences result in an F grade

Schedule

Week 01
September 01 — Overview & Logistics
September 04 — Basic Math

Week 02
September 08 — Introduction to Reinforcement Learning
September 11 — Multi-Armed Bandits

Week 03
September 15 — Markov Process
September 18 — Dynamic Programming

Week 04
September 22 — Tutorial on Gymnasium
September 25 — Monte-Carlo Methods: On-Policy Methods

Week 05
September 29 — Monte-Carlo Methods: Off-Policy Methods
October 02 — Temporal Difference Learning

Week 06
October 06 — Chuseok Holiday
  • No Class
October 09 — Hangul Day
  • No Class

Week 07
October 13 — Competition Round 1: Grid Crossing!
October 16 — n-Step Bootstrapping

Week 08
October 20 — Planning & Learning
October 23 — Focus on Midterm Exam
  • No Class

Week 09
October 27 — Linear Function Approximation
  • Lecture
  • Reference
    • [Ri20] Chap. 9 - 10
October 30 — Competition Round 2: Grid Adventure!

Week 10
November 03 — Nonlinear Function Approximation: Deep Neural Network
  • Lecture
  • Reference
    • [Ge23] Chap. 10, 11
    • [Ri20] Chap. 9 - 10
November 06 — Nonlinear Function Approximation: Convolutional Neural Network
  • Lecture
  • Reference
    • [Ge23] Chap. 14
    • [Ri20] Chap. 9 - 10

Week 11
November 10 — Nonlinear Function Approximation: Practice
  • Lecture
  • Reference
    • [Ge23] Chap. 10, 11, 14
    • [Ri20] Chap. 9 - 10
November 13 — Deep Q-Network: Basics

Week 12
November 17 — Deep Q-Network: Variants
November 20 — Competition Round 3: Avoid Blurp!

Week 13
November 24 — Self-Play / Policy Gradient: Basics
  • Lecture
  • Reference
    • [Ri20] Chap. 13
November 27 — Policy Gradient: REINFORCE
  • Lecture
  • Reference
    • [Ri20] Chap. 13

Week 14
December 01 — Policy Gradient: Actor-Critic Methods
  • Lecture
  • Reference
    • [Ri20] Chap. 13
    • Mnih, V., Badia, A. P., Mirza, M., Graves, A., Lillicrap, T., Harley, T., Silver, D., & Kavukcuoglu, K. "Asynchronous Methods for Deep Reinforcement Learning". Proceedings of the 33rd International Conference on Machine Learning. 48:1928-1937 (2016). https://proceedings.mlr.press/v48/mniha16.html
December 04 — Policy Gradient: Proximal Policy Optimization

Week 15
December 08 — Preparing for Final Competition
  • No Class
December 11 — Competition Round 4: Al-Kka-Gi! - League Stage (Rise Group)

Week 16
December 15 — Competition Round 4: Al-Kka-Gi! - League Stage (Legend Group)
December 18 — Competition Round 4: Al-Kka-Gi! - Playoff