Instruction
Course Staff
- Lecturer: Woohyeok Choi
- Office: #407, College of Engineering #6
- Mail: woohyeok.choi@kangwon.ac.kr
- Teaching Assistant: Geonwoo Choi
- Office: #416, College of Engineering #6
- Mail: geonwoo.choi@kangwon.ac.kr
Time & Location
- Mon./Thu. 09:00 - 10:15, #609, College of Engineering #6
Office Hours
- Tue. 13:00 - 15:00
Textbook
- [Ri 20] Reinforcement Learning: An Introduction, 2nd Ed., Richard S. Sutton and Andrew G. Barto, The MIT Press.
Prerequisite
- Python Programming
Grading Policy
Reinforcement Learning Competitions (90%)
- Round 1: Grid Crossing! (10%)
- Round 2: Grid Adventure! (20%)
- Round 3: Avoid Blurp! (20%)
- Round 4: Al-kka-gi! (40%)
Attendance (10%)
- 1% of credit is deducted for each absence
- Three late arrivals count as one absence
- 11 or more absences result in an F grade
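The attendance rules above can be read as a small calculation. The following is only a sketch of one possible interpretation: the function name `attendance_credit` is hypothetical, and it assumes that only complete groups of three late arrivals convert into an absence, that converted absences also count toward the F threshold, and that attendance credit cannot fall below zero. The course staff's actual bookkeeping may differ.

```python
from typing import Optional


def attendance_credit(absences: int, late_arrivals: int = 0) -> Optional[float]:
    """Return attendance credit in percentage points, or None for an F grade.

    Hypothetical helper illustrating the syllabus rules; not an official tool.
    """
    # Assumption: only complete groups of three late arrivals become an absence.
    effective_absences = absences + late_arrivals // 3

    # Assumption: converted absences also count toward the 11-absence threshold.
    if effective_absences >= 11:
        return None  # F grade regardless of competition scores

    # Attendance is worth 10%; each absence deducts 1%.
    # Assumption: the credit is floored at zero.
    return max(0.0, 10.0 - 1.0 * effective_absences)


if __name__ == "__main__":
    print(attendance_credit(absences=2, late_arrivals=3))
```

Under these assumptions, a student with two absences and three late arrivals would be treated as having three absences.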
Schedule
Week 01
September 01 — Overview & Logistics
September 04 — Basic Math
Week 02
September 08 — Introduction to Reinforcement Learning
- Lecture
- Reference
- [Ri 20] Chap. 1
September 11 — Multi-Armed Bandits
- Lecture
- Reference
- [Ri 20] Chap. 2
Week 03
September 15 — Markov Process
- Lecture
- Reference
- [Ri 20] Chap. 3
September 18 — Dynamic Programming
- Lecture
- Reference
- [Ri 20] Chap. 4
Week 04
September 22 — Tutorial on Gymnasium
- Practice
- (Announce) Competition Round 1: Grid Crossing!
- Due: Oct. 13
- Readings
September 25 — Monte-Carlo Methods: On-Policy Methods
- Lecture
- Reference
- [Ri 20] Chap. 5
Week 05
September 29 — Monte-Carlo Methods: Off-Policy Methods
- Lecture
- Reference
- [Ri 20] Chap. 5
October 02 — Temporal Difference Learning
- Lecture
- Reference
- [Ri 20] Chap. 6
Week 06
October 06 — Chuseok Holiday
- No Class
October 09 — Hangul Day
- No Class
Week 07
October 13 — Competition Round 1: Grid Crossing!
October 16 — n-Step Bootstrapping
- Lecture
- Reference
- [Ri 20] Chap. 7
Week 08
October 20 — Focus on Midterm Exam
- No Class
October 23 — Planning & Learning: Model-Based Methods & Experience Sampling
- Lecture
- Reference
- [Ri 20] Chap. 8
Week 09
October 27 — Planning & Learning: Trajectory Sampling & Decision-Time Planning
- Lecture
- Reference
- [Ri 20] Chap. 8
October 30 — Competition Round 2: Grid Adventure!
- Leaderboard
- (Announce) Competition Round 3: Avoid Blurp!
- Due: Nov. 20
Week 10
November 03 — Function Approximation: Basics & Non-Parametric FA
- Lecture
- Reference
- [Ri 20] Chap. 9 - 10
November 06 — Function Approximation: Linear Function Approximation
- Lecture
- Reference
- [Ri 20] Chap. 9 - 10
Week 11
November 10 — Function Approximation: Artificial Neural Network
- Lecture
- Reference
- [Ri 20] Chap. 9 - 10
November 13 — Function Approximation: Convolutional Neural Network
- Lecture
- Reference
- [Ri 20] Chap. 9 - 10
Week 12
November 17 — Function Approximation: Deep Q-Network
- Lecture
- Reference
- Mnih, V., Kavukcuoglu, K., Silver, D. et al. "Human-level control through deep reinforcement learning". Nature 518, 529–533 (2015). https://doi.org/10.1038/nature14236
- van Hasselt, H., Guez, A., & Silver, D. "Deep Reinforcement Learning with Double Q-Learning". Proceedings of the AAAI Conference on Artificial Intelligence, 30 (1) (2016). https://doi.org/10.1609/aaai.v30i1.10295
- Schaul, T., Quan, J., Antonoglou, I., & Silver, D. "Prioritized Experience Replay". arXiv preprint arXiv:1511.05952 (2015). https://arxiv.org/abs/1511.05952
November 20 — Competition Round 3: Avoid Blurp!
- Leaderboard
- (Announce) Competition Round 4: Al-kka-gi!
- Due: Dec. 11
Week 13
November 24 — Policy Gradient: Basics
- Lecture
- Reference
- [Ri 20] Chap. 13
November 27 — Policy Gradient: REINFORCE & Actor-Critic Methods
- Lecture
- Reference
- [Ri 20] Chap. 13
Week 14
December 01 — Policy Gradient: Actor-Critic with Eligibility Traces & Asynchronous Advantage Actor-Critic
- Lecture
- Reference
- [Ri 20] Chap. 13
- Mnih, V., Badia, A.P., Mirza, M., Graves, A., Lillicrap, T., Harley, T., Silver, D. & Kavukcuoglu, K. "Asynchronous Methods for Deep Reinforcement Learning". Proceedings of The 33rd International Conference on Machine Learning. 48:1928-1937 (2016). https://proceedings.mlr.press/v48/mniha16.html
December 04 — Self-Play
- Lecture
- Reference
- Zhang, R., Xu, Z., Ma, C., Yu, C., Tu, W.-W., Tang, W., Huang, S. et al. "A Survey on Self-Play Methods in Reinforcement Learning". arXiv preprint arXiv:2408.01072 (2024). https://arxiv.org/abs/2408.01072
Week 15
December 08 — Focus on Final Exam
- No Class