Getting stuck in local optima is a common problem for reinforcement learning agents, and there are a few ways to overcome it. One is a technique called last-mile optimization: the agent tries to find the global optimum by starting from the local optimum and then moving…