Our group members have exciting topics available that are suited for research projects and master's thesis research.
Please contact us directly by email!
The success of AlphaStar for StarCraft has shown that competitive agents learning “together” can enhance the generality of a solution immensely. Can we also use this scheme to improve strategy learning in much simpler settings, for example with MCTS or similar approaches on the much smaller game microRTS (https://github.com/santiontanon/microrts)?
contact: Mike Preuss
- Single-agent Curriculum Learning
AlphaZero has taught itself to play world-class Go from scratch using a form of reinforcement learning called curriculum learning. Can we apply curriculum learning in single-agent puzzles such as Sokoban, and thus, in principle, to some of the world’s most important optimization problems?
contact: Aske Plaat
- Beware what you wish for: specification gaming and value alignment in AI and RL
AI techniques such as reinforcement learning can be very powerful methods for optimizing given objectives, but unfortunately humans are notably bad at stating their objectives and constraints well enough, or at foreseeing the side effects of maximizing these objectives. In reinforcement learning this problem is known as specification gaming, which is actually a misnomer, as the AI is simply blindly trying to optimize the given objective. A more general term is value alignment: how can we make sure that the AI aligns its values with ours? In this project we want to provide a compelling example of specification gaming, to make the public more aware of this key existential risk of AI. Optionally, we can look into finding solutions, for example by accepting that objective specifications are initially flawed, but that humans can adapt them along the way.
contact: Peter van der Putten (or Mike Preuss, Aske Plaat)
- Transportation Scheduling
- Flatland -> multi-agent reinforcement learning for train system planning
- competence-based intrinsic motivation
- find the knowledge/features that humans understand in the network
- write a GUI system for RL
- plan towards hard parts of the state space
- cooperative learning
- evolution of RL teams
- Benchmarking Quantum Models
- Diplomacy, Team AI