Organizations frequently face Stochastic Sequential Decision Problems (SSDPs) and must devise decision policies to address them. The decisions involved can be divided into static decisions (made once, e.g., capacity, delivery schedule) and dynamic decisions (made repeatedly, e.g., replenishment, routing). The static decision is often made in isolation, i.e., without considering its future impact on the dynamic policy.
The research community has developed two distinct approaches for these two classes of decisions. Mathematical Programming is commonly used to find an (optimal) solution to the static problem, but is inefficient at assessing the future impact of that decision on the dynamic decisions. Conversely, methods such as Approximate Dynamic Programming and Reinforcement Learning are often used to derive policies for the dynamic decisions in an SSDP, but assume one fixed configuration of the system (the environment), which is precisely what the static decision defines.
This project will revolve around combining methods from both fields, developing new solution approaches that integrate static and dynamic problem-solving, and applying them to practical use cases from retail operations and/or transportation.
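To make the interaction concrete, the coupling between the two decision levels can be sketched on a toy retail example. The sketch below is purely illustrative and not part of the project description: it assumes a hypothetical setting where the static decision is a warehouse capacity, the dynamic decision is a base-stock replenishment policy under random demand, and the static decision is scored by simulating the best dynamic policy it admits, rather than in isolation. All function names, cost parameters, and demand distributions here are invented for illustration.

```python
import random

def simulate_policy(capacity, base_stock, horizon=200, seed=0):
    """Simulate a dynamic base-stock replenishment policy under a
    fixed (static) warehouse capacity; return average profit per period."""
    rng = random.Random(seed)
    inventory, profit = 0, 0.0
    for _ in range(horizon):
        # Dynamic decision: order up to the base-stock level,
        # truncated by the static capacity choice.
        order = max(0, min(base_stock, capacity) - inventory)
        inventory += order
        demand = rng.randint(0, 10)  # assumed uniform demand, illustrative only
        sales = min(inventory, demand)
        inventory -= sales
        # Illustrative economics: revenue minus ordering and holding costs.
        profit += 5.0 * sales - 1.0 * order - 0.2 * inventory
    return profit / horizon

def best_joint_decision(capacities, base_stocks, capacity_cost=0.3):
    """Naive joint search: score each static capacity by the value of the
    best dynamic policy it admits, minus a per-period capacity cost."""
    return max(
        ((c, b, simulate_policy(c, b) - capacity_cost * c)
         for c in capacities for b in base_stocks),
        key=lambda t: t[2],
    )

cap, bs, value = best_joint_decision(range(5, 30, 5), range(5, 30, 5))
print(cap, bs, round(value, 2))
```

Even in this simplified form, the example shows why solving the static problem in isolation can mislead: the value of a capacity level depends on the replenishment policy operated within it. The project's methods would replace the brute-force enumeration here with Mathematical Programming for the static choice and ADP/RL for the dynamic policy.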