Date | Assignment | Extra material |
1/20 | Overview | |
1/22 | Chap. 3 of Reinforcement Learning: An Introduction by Sutton and Barto | No response needed! |
1/27 | No class! (send in answers to Exercises 17.1-17.4 and 17.9) | Sections 17.1-17.3 of AI: A Modern Approach, 3rd edition |
1/29 | No class! (send in answers to Exercises 17.13-17.15) | Section 17.4 of AI: A Modern Approach, 3rd edition |
2/3 | No class! | |
2/5 | High-level Reinforcement Learning in Strategy Games by Amato and Shani | Robot skill learning video. We will also go over answers to the exercises. |
2/10 | Planning and Acting in Partially Observable Stochastic Domains by Kaelbling, Littman and Cassandra (only up to Section 4.4!) | POMDPs for Dummies |
2/12 | (1) Point-based value iteration: An anytime algorithm for POMDPs by Pineau, Gordon and Thrun; (2) SARSOP: Efficient Point-Based POMDP Planning by Approximating Optimally Reachable Belief Spaces by Kurniawati, Hsu and Lee | |
2/17 | (1) Collision Avoidance for Unmanned Aircraft using Markov Decision Processes by Temizer, Kochenderfer, Lozano-Perez and Kaelbling; (2) Unmanned Aircraft Collision Avoidance using Continuous-State POMDPs by Bai, Hsu, Kochenderfer and Lee | |
2/19 | Monte-Carlo Planning in Large POMDPs by Silver and Veness | POMCP Pac-Man video |
2/24 | (1) Experiences with a Mobile Robotic Guide for the Elderly by Montemerlo, Pineau, Roy, Thrun and Verma; (2) Spoken Dialogue Management Using Probabilistic Reasoning by Roy, Pineau and Thrun | |
2/26 | Relatively Robust Grasping by Hsiao, Lozano-Perez and Kaelbling | |
3/3 | The Belief Roadmap: Efficient Planning in Linear POMDPs by Factoring the Covariance by Prentice and Roy | |
3/5 | Planning How to Learn by Bai, Hsu and Lee | |
3/10 | Monte Carlo Value Iteration with Macro-Actions by Lim, Hsu and Lee | Choose papers to present |
3/12 | DESPOT: Online POMDP Planning with Regularization by Somani, Ye, Hsu and Lee | |
3/17 | Spring Break! | |
3/19 | Spring Break! | |
3/24 | Efficient planning in non-Gaussian belief spaces and its application to robot grasping by Platt, Kaelbling, Lozano-Perez and Tedrake | Project topics due |
3/26 | Monte Carlo Bayesian Reinforcement Learning by Wang, Won, Hsu and Lee (Luke Jablonski) | |
3/31 | Using POMDPs to Control an Accuracy-Processing Time Tradeoff in Video Surveillance by Kapoor, Amato, Srivastava and Schrater (Mark Kelley) | |
4/2 | A POMDP Approach to Optimizing P300 Speller BCI Paradigm by Park and Kim (Dan Shea) | |
4/7 | Supporting Search and Rescue Operations with UAVs by Waharte and Trigoni (Eliza Hunt-Hawkins) | |
4/9 | Inverse Reinforcement Learning in Partially Observable Environments by Choi and Kim | |
4/14 | Planning for Decentralized Control of Multiple Robots Under Uncertainty by Amato, Konidaris, Cruz, Maynor, How and Kaelbling | Project status reports |
4/16 | Point-Based POMDP Solving with Factored Value Function Approximation by Veiga, Spaan and Lima | |
4/21 | Decentralized cognitive MAC for opportunistic spectrum access in ad hoc networks: A POMDP framework by Zhao, Tong, Swami and Chen | |
4/23 | An MDP-Based Recommender System by Shani, Brafman and Heckerman | |
4/28 | Project presentations (Mark and Dan) | |
4/30 | Project presentations (Luke and Eliza) | |
5/4 | | Final paper due (AAAI format) |