Status: Draft

Scenario Description

Convolutional Neural Networks (CNNs) are commonly used for autonomous driving [LEC98]. A research team has shown that a complex architecture is capable of driving a car using an end-to-end concept [NVI16]. For this purpose, a paradigm called Imitation Learning was applied [HUS17]: the model is trained in a supervised manner to mimic the demonstrated actions.
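The supervised nature of Imitation Learning can be illustrated with a minimal behavior-cloning sketch. This is not the end-to-end CNN from [NVI16]: it swaps in a linear model on a single scalar feature, and the names `make_demonstrations` and `train_behavior_cloning`, the lateral-offset feature, and the "expert" rule are all assumptions made for illustration.

```python
import random

def make_demonstrations(n, seed=0):
    """Hypothetical expert data: (lateral offset, steering command) pairs."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        offset = rng.uniform(-1.0, 1.0)   # assumed feature: position relative to track center
        steering = -0.8 * offset          # assumed expert rule: steer back toward the center
        data.append((offset, steering))
    return data

def train_behavior_cloning(data, epochs=200, lr=0.1):
    """Fit steering = w * offset + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(data)
    for _ in range(epochs):
        grad_w = 0.0
        grad_b = 0.0
        for x, y in data:
            err = (w * x + b) - y         # prediction error against the expert label
            grad_w += 2.0 * err * x / n
            grad_b += 2.0 * err / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

demos = make_demonstrations(200)
w, b = train_behavior_cloning(demos)
print(f"learned policy: steering = {w:.3f} * offset + {b:.3f}")
```

The learned parameters approach the expert rule because every training sample carries a label (the expert's steering command). In the camera-based setting, the scalar offset would be replaced by an image and the linear model by a CNN.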
Another eligible paradigm is Reinforcement Learning [SB98]. Reinforcement Learning requires a reward system for training instead of labeled data. Based on this approach, researchers have successfully trained models that are capable of solving various complex tasks [KOR13], including tasks with a continuous action space [SIL14][LIL15].

Robotic / Autonomous Driving

Prototype v2 rev2
(Images: CC-BY-SA 4.0 Bruno Schilling / HTW Berlin, 2018)

Based on the Supervised Learning approach, a real-world prototype race car at a scale of 1:10 has been built and equipped with a camera. Data can be recorded to train a model that is capable of driving autonomously on a random track bordered by pylons (also at a scale of 1:10). Further sensors and different cameras can be mounted on the real-world prototype, either replacing or complementing the current camera. With the Reinforcement Learning paradigm in mind, a simulator [KH04] is used in combination with the Robot Operating System (ROS) [QUI09] and OpenAI Gym [BRO16], which makes it possible to train networks with different reward functions. The MIT-Racecar project serves as the basis for the simulated version of the real-world prototype. The implementation is split into standalone modules. This architecture allows a model to be trained by Imitation Learning in the real world or in a simulated environment and then improved further with Reinforcement Learning. The underlying system is Linux in combination with ROS.
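The idea of training against exchangeable reward functions can be sketched with a toy environment that mirrors the classic OpenAI Gym `reset`/`step` convention. This is a sketch under stated assumptions, not the project's simulator: `ToyTrackEnv`, both reward functions, the two policies, and the one-dimensional track model are hypothetical, and neither ROS nor Gym is imported.

```python
def center_reward(offset, half_width):
    """Hypothetical reward: 1 at the track center, falling to 0 at the pylons."""
    return 1.0 - abs(offset) / half_width

def survival_reward(offset, half_width):
    """Hypothetical alternative: constant reward for every step on the track."""
    return 1.0

class ToyTrackEnv:
    """Hypothetical 1-D stand-in for the simulated track with a pluggable reward."""

    def __init__(self, reward_fn, half_width=1.0, max_steps=50):
        self.reward_fn = reward_fn
        self.half_width = half_width      # distance from the center to the pylons
        self.max_steps = max_steps
        self.offset = 0.0
        self.steps = 0

    def reset(self, start_offset=0.5):
        self.offset = start_offset
        self.steps = 0
        return self.offset

    def step(self, steering):
        # Continuous action, clipped to [-1, 1]; it shifts the car sideways.
        steering = max(-1.0, min(1.0, steering))
        self.offset += 0.2 * steering
        self.steps += 1
        off_track = abs(self.offset) >= self.half_width
        done = off_track or self.steps >= self.max_steps
        reward = -1.0 if off_track else self.reward_fn(self.offset, self.half_width)
        return self.offset, reward, done, {}

def run_episode(env, policy):
    """Roll out one episode and return the accumulated reward."""
    obs = env.reset()
    total, done = 0.0, False
    while not done:
        obs, reward, done, _ = env.step(policy(obs))
        total += reward
    return total

def steer_to_center(offset):
    return -offset                        # pushes the car back toward the center

def steer_away(offset):
    return 1.0                            # always steers toward one border

good = run_episode(ToyTrackEnv(center_reward), steer_to_center)
bad = run_episode(ToyTrackEnv(center_reward), steer_away)
print(f"return when centering: {good:.2f}, return when leaving the track: {bad:.2f}")
```

Because the reward function is passed in as an argument, `center_reward` can be swapped for `survival_reward` without touching the environment. A policy obtained by Imitation Learning could be passed as `policy` and then refined against the chosen reward.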

For further details see


[BRO16] Brockman, Greg, et al. "OpenAI Gym." CoRR abs/1606.01540.
[HUS17] Hussein, Ahmed, et al. "Imitation Learning: A Survey of Learning Methods." ACM Comput. Surv. 50.2 (2017): 21:1-21:35.
[KH04] Koenig, Nathan and Howard, Andrew. "Design and use paradigms for Gazebo, an open-source multi-robot simulator." Proceedings of 2004 IEEE/RSJ IROS.
[KOR13] Mnih, Volodymyr and Kavukcuoglu, Koray, et al. "Playing Atari with Deep Reinforcement Learning." CoRR abs/1312.5602.
[LEC98] LeCun, Yann, et al. "Gradient-based learning applied to document recognition." Proceedings of the IEEE 86.11 (1998): 2278-2324.
[LIL15] Lillicrap, Timothy, et al. "Continuous control with deep reinforcement learning." CoRR abs/1509.02971.
[NVI16] Bojarski, Mariusz, et al. "End to End Learning for Self-Driving Cars." CoRR abs/1604.07316.
[QUI09] Quigley, Morgan, et al. "ROS: an open-source Robot Operating System." ICRA Workshop on Open Source Software. Vol. 3. No. 3.2. 2009.
[SB98] Sutton, Richard and Barto, Andrew. "Introduction to Reinforcement Learning." 1st ed. Cambridge, MA, USA: MIT Press, 1998.
[SIL14] Silver, David, et al. "Deterministic Policy Gradient Algorithms." Proceedings of the 31st International Conference on Machine Learning (ICML-14): 387-395.