
Bachelor, Diploma or Master Thesis
Learning a System-Model for Robot Learning
Pole balancing, e.g. keeping a long stick or a broom upright on your fingertip, is among the standard evaluation tasks for machine learning algorithms. Learning controllers for balancing is straightforward in theory, challenging in simulation, and really hard in real-world systems. The main reason for these difficulties is that machine learning algorithms typically need several hundreds of thousands of trials before they balance a pole even briefly, which makes implementations on real robots infeasible. Many real robotic projects thus use simulators to learn controllers and, once learned, transfer these controllers to real systems; but hand-designed simulators are often not sufficiently accurate to allow proper operation in real-world scenarios.
This Diploma/Master Thesis addresses the problem of inaccurate simulations for real-world robots. The student will investigate a challenging balancing task in which an existing robot (see picture and project description) shall be trained to balance ordinary pencils upright on their tip, a task too difficult even for humans. A hand-designed controller exists that keeps pencils balanced, but its performance is limited to a small set of rather similar pencils. Here we want to learn a controller online that handles a variety of different pencils. Such a “learning system” requires an accurate simulator, which is difficult to program because the physical parameters of the real robot are unknown. In the thesis, the student shall implement a system that iteratively performs two steps: (a) learning/adapting the parameters of a simulator from a small number of real-world trials, and (b) learning/adapting a balancing controller in the learned simulator. We expect that after very few real-world trials the overall system will be able to balance various pencils upright and will continually improve its balancing performance.
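The two-step loop can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and not the project's actual code: the pencil is reduced to a linearised one-parameter model theta'' = a*theta + u, the "real robot" is a stand-in function with a hidden value of a, the controller is a simple PD law, and the names (run_trial, fit_simulator, tune_controller, learn) are hypothetical. The real system would use the ODE-based 3D simulation and many more unknown parameters.

```python
DT = 0.01          # integration step in seconds (assumed)
MAX_STEPS = 300    # length of one balancing trial (assumed)
LIMIT = 0.5        # tilt angle in rad at which the pencil counts as fallen (assumed)


def run_trial(a, kp, kd, theta0=0.05):
    """Roll out a PD controller on the toy model theta'' = a*theta + u.

    Returns the logged transitions and the number of steps survived.
    """
    theta, omega, log = theta0, 0.0, []
    for t in range(MAX_STEPS):
        u = -kp * theta - kd * omega
        # Explicit Euler step of the linearised dynamics.
        theta_next = theta + DT * omega
        omega_next = omega + DT * (a * theta + u)
        log.append((theta, omega, u, omega_next))
        theta, omega = theta_next, omega_next
        if abs(theta) > LIMIT:
            return log, t + 1
    return log, MAX_STEPS


def fit_simulator(log):
    """Step (a): least-squares estimate of a from logged real transitions."""
    num = den = 0.0
    for theta, omega, u, omega_next in log:
        y = (omega_next - omega) / DT - u   # equals a * theta in this model
        num += theta * y
        den += theta * theta
    return num / den


def tune_controller(a_hat, kp_grid=(50, 100, 150, 200), kd_grid=(5, 10, 20)):
    """Step (b): grid-search PD gains using only the learned simulator."""
    best_cost, best_gains = None, None
    for kp in kp_grid:
        for kd in kd_grid:
            log, survived = run_trial(a_hat, kp, kd)
            cost = sum(theta * theta for theta, _, _, _ in log)
            if survived < MAX_STEPS:
                cost += 1000.0              # heavy penalty for dropping the pencil
            if best_cost is None or cost < best_cost:
                best_cost, best_gains = cost, (kp, kd)
    return best_gains


def learn(a_real, iterations=3):
    """Alternate steps (a) and (b); a_real is hidden inside the 'real' trial."""
    kp, kd = 0.0, 0.0                       # start without any control
    a_hat = None
    for _ in range(iterations):
        log, _ = run_trial(a_real, kp, kd)  # stand-in for a real-world trial
        a_hat = fit_simulator(log)          # (a) adapt the simulator
        kp, kd = tune_controller(a_hat)     # (b) adapt the controller
    return a_hat, kp, kd


if __name__ == "__main__":
    a_hat, kp, kd = learn(a_real=65.0)      # 65.0 is an arbitrary example value
    print(a_hat, kp, kd, run_trial(65.0, kp, kd)[1])
```

In this noise-free toy setting a single failed trial already identifies the parameter; on the real robot, sensor noise and unmodelled effects are what make repeated alternation between the two steps necessary.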
Beyond the toy problem of balancing pencils, a system that, once generalized, learns its own mechanical configuration is highly relevant for all sorts of robots that are manufactured at low cost and with low precision, or that wear out or get damaged during operation and need to adapt to such intrinsic changes.
The current system is programmed in C and Python and uses ODE (Open Dynamics Engine) for 3D world simulations. Experience in some of the above is a clear advantage.
Advisor:
Prof. Dr. Jörg Conradt
If you are interested in this project, please send an email of about five to ten lines outlining why you are interested, stating your skills and knowledge, and explaining why we should choose you among those who apply. Thank you!