Fall 2025

Artificial Intelligence (2 Div.)

Reinforcement learning (RL) is a popular machine learning paradigm for solving sequential decision-making problems. In this paradigm, an agent learns an optimal policy by repeatedly interacting with an environment to maximize its cumulative reward. This course covers the foundational concepts of RL, including states, actions, and rewards, the Markov decision process, and the trade-off between exploration and exploitation. We will also study key RL algorithms and techniques, such as Monte Carlo methods, temporal difference learning, function approximation, and policy gradients. In addition, you will work on a small team project to implement RL agents that solve problems of increasing difficulty, from simple to complex.
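To make the interaction loop and the exploration-versus-exploitation trade-off concrete, here is a minimal sketch of an epsilon-greedy agent on a multi-armed bandit. It is an illustration only, not course-provided code: the function name run_bandit, the Gaussian reward model, and all parameter values are assumptions chosen for the example.

```python
import random


def run_bandit(true_means, steps=2000, epsilon=0.1, seed=0):
    """Epsilon-greedy agent on a hypothetical Gaussian multi-armed bandit.

    true_means holds the (hidden) mean reward of each arm; the agent only
    sees sampled rewards. All names and values here are illustrative.
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    estimates = [0.0] * n_arms  # running estimate of each arm's mean reward
    counts = [0] * n_arms       # number of times each arm has been pulled
    total_reward = 0.0          # cumulative reward the agent tries to maximize

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: try a uniformly random arm.
            action = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the highest current estimate.
            action = max(range(n_arms), key=lambda a: estimates[a])

        # The environment responds with a noisy reward (unit variance assumed).
        reward = rng.gauss(true_means[action], 1.0)

        # Incremental sample-average update of the chosen arm's estimate.
        counts[action] += 1
        estimates[action] += (reward - estimates[action]) / counts[action]
        total_reward += reward

    return estimates, counts, total_reward


estimates, counts, total_reward = run_bandit([0.2, 0.5, 1.0])
print("pulls per arm:", counts)
print("estimated means:", [round(e, 2) for e in estimates])
print("cumulative reward:", round(total_reward, 1))
```

Setting epsilon to 0 makes the agent purely greedy, while larger values spend more steps exploring; comparing the cumulative reward across a few settings is a quick way to see the trade-off.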


Instruction

Course Staff
Time & Location
  • Mon./Thu. 09:00 - 10:15, #609, College of Engineering #6
Office Hours
  • Tue. 13:00 - 15:00
Textbook
  • [Ri 20] Reinforcement Learning: An Introduction, 2nd Ed., Richard S. Sutton and Andrew G. Barto, The MIT Press.
Prerequisite
  • Python Programming
Grading Policy
Reinforcement Learning Competitions (90%)
  • Round 1 (10%)
  • Round 2 (14%)
  • Round 3 (18%)
  • Round 4 (22%)
  • Round 5 (26%)
Attendance (10%)
  • 1% of credit is deducted for each absence
  • Three late arrivals count as one absence
  • 11 or more absences result in an F grade

Schedule

Week 01
September 01 — Overview & Logistics
September 04 — Basic Math

Week 02
September 08 — Introduction to Reinforcement Learning
  • Lecture
  • Reference
    • [Ri 20] Chap. 1
September 11 — Multi-Armed Bandits
  • Lecture
  • Reference
    • [Ri 20] Chap. 2

Week 03
September 15 — Markov Process
  • Lecture
  • Reference
    • [Ri 20] Chap. 3
September 18 — Dynamic Programming: Value Iteration
  • Lecture
  • Reference
    • [Ri 20] Chap. 4

Week 04
September 22 — Dynamic Programming: Policy Iteration
September 25 — Monte-Carlo Methods: On-Policy Methods
  • Lecture
  • Reference
    • [Ri 20] Chap. 5

Week 05
September 29 — Monte-Carlo Methods: Off-Policy Methods
  • Lecture
  • Reference
    • [Ri 20] Chap. 5
October 02 — Competition Round 1
  • Leaderboard
  • (Announce) Competition Round 2
    • Due: October 20

Week 06
October 06 — Chuseok Holiday
  • No Class
October 09 — Hangul Day
  • No Class

Week 07
October 13 — Temporal Difference Learning
  • Lecture
  • Reference
    • [Ri 20] Chap. 6
October 16 — n-Step Bootstrapping
  • Lecture
  • Reference
    • [Ri 20] Chap. 7

Week 08
October 20 — Competition Round 2
  • Leaderboard
  • (Announce) Competition Round 3
    • Due: November 06
October 23 — Focus on Midterm Exam
  • No Class

Week 09
October 27 — Planning & Learning: Model-based Methods & Experience Sampling
  • Lecture
  • Reference
    • [Ri 20] Chap. 8
October 30 — Planning & Learning: Trajectory Sampling & Decision-Time Planning
  • Lecture
  • Reference
    • [Ri 20] Chap. 8

Week 10
November 03 — Function Approximation: Basics & Non-Parametric FA
  • Lecture
  • Reference
    • [Ri 20] Chap. 9
November 06 — Competition Round 3
  • Leaderboard
  • (Announce) Competition Round 4
    • Due: November 24

Week 11
November 10 — Function Approximation: Linear FA
  • Lecture
  • Reference
    • [Ri 20] Chap. 9 - 10
November 13 — Function Approximation: Artificial Neural Network
  • Lecture
  • Reference
    • [Ri 20] Chap. 9 - 10

Week 12
November 17 — Function Approximation: Convolutional Neural Network
  • Lecture
  • Reference
    • [Ri 20] Chap. 9 - 10
November 20 — Function Approximation: Deep Q-Network

Week 13
November 24 — Competition Round 4
  • Leaderboard
  • (Announce) Competition Round 5
    • Due: December 11
November 27 — Policy Gradient: Basics
  • Lecture
  • Reference
    • [Ri 20] Chap. 13

Week 14
December 01 — Policy Gradient: REINFORCE & Actor-Critic Methods
  • Lecture
  • Reference
    • [Ri 20] Chap. 13
December 04 — Policy Gradient: Actor-Critic with Eligibility Traces & Asynchronous Advantage Actor-Critic
  • Lecture
  • Reference
    • [Ri 20] Chap. 13
    • Mnih, V., Badia, A. P., Mirza, M., Graves, A., Lillicrap, T., Harley, T., Silver, D., & Kavukcuoglu, K. "Asynchronous Methods for Deep Reinforcement Learning." Proceedings of the 33rd International Conference on Machine Learning, 48:1928-1937 (2016). https://proceedings.mlr.press/v48/mniha16.html

Week 15
December 08 — Focus on Final Exam
  • No Class
December 11 — Competition Round 5: League Stage

Week 16
December 15 — Competition Round 5: Playoff Stage
December 18 — Final Remarks