Introduction to Reinforcement Learning

The term reinforcement learning has gained popularity since AlphaGo won four of its five games against Lee Sedol in March 2016. In 2017, AlphaGo's close relative AlphaGo Zero reached AlphaGo's level in a matter of days, starting with no external knowledge save for the rules of Go. Needless to say, reinforcement learning became a term of magic and wonder. Rumor has it that reinforcement learning can solve virtually any complicated problem.

This article reviews the fundamental elements of reinforcement learning and gives examples in which it has proved successful. According to Google's dictionary result, reinforcement means "the process of encouraging or establishing a belief or pattern of behavior, especially by encouragement or reward." Hence, in the world of machine learning, reinforcement learning simply means learning through trial and error, guided by rewards.

Two key elements of reinforcement learning are a learning agent and an environment. The agent is trained to learn the actions that yield the maximum reward in the long run. The environment, often a simulation, responds to the agent's actions with rewards, which lets the agent learn what different actions are worth.
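To make the two elements concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the `TwoArmedBandit` environment, its reward probabilities, the `AveragingAgent` class, and the exploration rate `epsilon` are hypothetical names and values, not part of any library or of AlphaGo. The environment has only a single state, so it shows trial-and-error learning from rewards but not the long-run trade-offs of a full reinforcement learning problem.

```python
import random


class TwoArmedBandit:
    """Hypothetical environment: action 1 pays off more often than action 0."""

    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        self.win_prob = [0.2, 0.8]  # assumed reward probability for each action

    def step(self, action):
        # Reward is 1.0 with the chosen action's win probability, else 0.0.
        return 1.0 if self.rng.random() < self.win_prob[action] else 0.0


class AveragingAgent:
    """Hypothetical agent: keeps a running average reward for each action."""

    def __init__(self, n_actions, epsilon=0.1, seed=1):
        self.rng = random.Random(seed)
        self.epsilon = epsilon            # assumed exploration rate
        self.counts = [0] * n_actions     # how often each action was tried
        self.values = [0.0] * n_actions   # average reward seen per action

    def act(self):
        # Try a random action with probability epsilon (trial and error),
        # otherwise pick the action with the highest estimated reward.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.values))
        return max(range(len(self.values)), key=lambda a: self.values[a])

    def learn(self, action, reward):
        # Incrementally update the running average for the chosen action.
        self.counts[action] += 1
        self.values[action] += (reward - self.values[action]) / self.counts[action]


def train(steps=2000):
    env = TwoArmedBandit()
    agent = AveragingAgent(n_actions=2)
    total = 0.0
    for _ in range(steps):
        action = agent.act()          # the agent chooses an action
        reward = env.step(action)     # the environment answers with a reward
        agent.learn(action, reward)   # the agent adjusts its estimates
        total += reward
    return agent, total


if __name__ == "__main__":
    agent, total = train()
    print("estimated values:", agent.values)
    print("total reward:", total)
```

After enough steps the agent's estimate for action 1 should end up higher than its estimate for action 0, so it chooses action 1 most of the time. The occasional random action is what lets it discover the better action in the first place.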

[Figure: RL_intro (the interaction loop between the agent and the environment)]

As shown in the figure above, at each time step the agent selects an action based on the current state. The environment then responds with a reward for that action and the next state the agent will be in after performing it. After each time step, the agent adjusts itself according to the reward it receives: