In this talk we will describe the recent advances and application of the branch UPPAAL Stratego. This tool allows for synthesis of safe and near-optimal strategies to be synthesized for (1 player) stochastic hybrid games, by combining symbolic methods for timed games, with reinforcement learning methods, as well as abstraction techniques for hybrid games. Aiming at using the synthesized strategies as control programs to be executed on small embedded platforms an important issue is how to encode compactly the synthesized strategies. For this purpose algorithmic methods have been devised to take into account both compositionality as well as partial observability. Also on-line methods for strategy synthesis/learning has been successfully applied in diverse domains such as heating systems and intelligent traffic control. The on-line method has the distinct advantages of not needing to store any strategy (as it is constantly computed during operation) but may be too slow to meet response-frequency of a given domain (e.g. in the order of milli-seconds for switched controllers for power electronics). Thus, we are investigating ways of making the on-line computations more efficient. Finally we will address the problem concerned with the verification of safety properties of strategies obtained from pure learning.