Scenario generation

 

Stochastic programs need to be solved with discrete distributions. Usually, we are faced with either continuous distributions or data. Hence, we need to pass from the continuous distributions or the data to a discrete distribution suitable for calculations. The process of creating this discrete distribution is called scenario generation, and the result is a scenario tree. There are at least two major issues in this process:

 

 

In my view, for most reasonable cases, pure sampling will not be good enough. Certainly, with enough sample points, the second item above will be taken well care of, but most likely the first will not. If the sampling is stopped so that the corresponding stochastic program can be solved in reasonable time, its statistical properties are most likely not very good, and the problem we solve may not represent the real problem very well.

 

I have therefore chosen to leave behind sampling based methods. We are then left with two major approaches, namely bounding and construction. In both of these cases, the question of measuring quality becomes important. How can we tell that a scenario tree is good? I have tried to discuss that in:

 

Michal Kaut and Stein W. Wallace, Evaluation of scenario-generation methods for stochastic programming, Paper 14-2003 in SPEPS.

 

Our major point is that to measure the distance between two distributions (typically the underlying distribution / data and the scenario tree) we must use a metric based on the stochastic program. As the objective function of a stochastic program typically is very flat, we prefer to use the difference optimal objective function values as a metric. We do not want to declare it a problem that we move around on a flat part of the objective function.

 

Earlier in my career I worked quite a bit with bounding procedures. Although it was not made a major point at the time, bounding was always guided by the objective function. I wrote several papers in this direction, I guess the most important one is

 

Stein W. Wallace, A piecewise linear upper bound on the network recourse function, Mathematical Programming 38 (1987) 133-146.

 

Lately my interests have gone more in the direction of scenario construction. It started with my former student Kjetil Høyland, and the article

 

Kjetil Høyland and Stein W. Wallace, Generating Scenario Trees for Multistage Decision Problems, Management Science 47 (2001) 295-307.

 

This article sets the scene for using the objective function values to measure distance. It also defines the idea of making scenario trees by requiring that the tree has certain properties, hence we call the methodology property matching. In a later paper we describe a much more efficient method for matching four marginal moments and correlations. It is described in

 

Michal Kaut, Kjetil Høyland and Stein W. Wallace, A heuristic for moment-matching scenario generation, Computational Optimization and Applications 24 (2003) 169-185.

 

I am presently involved in further research on scenario construction with Michal Kaut, Pavel Popela and Pavla Zemankova, the two latter from Brno University of Technology, Czech Replublic.

 

 

The above was mostly meant to reflect my own research interests. The informed reader will see that I have not discussed methods where sampling takes place during the execution of the optimization algorithm, such as Stochastic Decomposition by Higle and Sen. These are methods that finish with statistical statements about the quality of the solution. These methods seem to behave extremely well, and my intention has not been to be critical to these methods. I do not know enough to make very clear statements.