11. Risk measures
11.1. Introduction
One key premise of stochastic programming models is that we assume the performance of the recourse decisions to be assessed against expected values. That means that the decisions are made from a risk-neutral stance.
Being risk-neutral means that the only factor taken into account is the product of likelihood (or probability) and outcome. However, this does not reflect how decision-makers' preferences may steer them away from solutions that expose them to worst-outcome scenarios, even when those solutions yield lower expected costs.
11.2. Measuring risk
In standard two-stage stochastic programming models, we seek to optimise
\[
\mini_{x} \ \mathbb{E}_\xi\left[F(x,\xi)\right], \tag{11.1}
\]
where, as before, \(F(x,\xi) = \braces{c^\top x + Q(x,\xi) : x \in X}\), \(Q(x, \xi) = \mini_y \braces{q(\xi)^\top y : W(\xi)y = h(\xi) - T(\xi)x, y \ge 0}\) and \(X=\braces{x \in \reals^n : Ax = b, x \ge 0}\). In this setting, we choose the optimal \(x^\star = \argmin_x \mathbb{E}_\xi\left[F(x,\xi)\right]\) that minimises the expected value of \(F(x,\xi)\).
One important aspect to notice is that in (11.1), each first-stage solution has an associated second-stage cost probability distribution. In other words, each \(x \in X\) has an associated probability distribution \(f_x(\xi)\) which maps the cost \(F(x, \xi)\) to the probability of scenario \(\xi\). The optimal first-stage decision is chosen based on the following principle: a solution \(x'\) is preferred over \(x''\) if \(\mathbb{E}_\xi\left[F(x', \xi)\right] < \mathbb{E}_\xi\left[F(x'', \xi)\right]\).
Fig. 11.1 illustrates this notion. The figure shows the costs associated with two alternative solutions \(x\) of a newsvendor problem, with 50 scenarios of different probabilities. The top plots represent the probability distribution function \(f_x(\xi)\), while the bottom plots represent the corresponding cumulative probability distributions.
Fig. 11.1 Comparing two solutions: the solution generating the cost distribution on the top is preferred, as it has a lower expected value
Notice that by comparing the distributions using their expected values, we neglect any information regarding how the costs are dispersed along the x-axis. Consequently, we ignore information regarding higher-order statistical moments and, in particular, we do not consider what the tails of these distributions tell us about how much a certain first-stage solution would expose us to highly undesirable scenarios.
To countervail this effect, we include in our model a measure of the risk that a given solution exposes us to. That is, we define a risk measure \(r : X \to \reals\) that associates the random variable \(F(x, \xi)\) generated by the solution \(x\) with a real-valued risk \(r_\xi\left[F(x, \xi)\right]\).
By doing so, we can quantify the information in the distribution tails and, alternatively, prefer \(x'\) over \(x''\) if \(r_\xi\left[F(x', \xi)\right] < r_\xi\left[F(x'', \xi)\right]\).
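As a small numerical illustration of how the two criteria can rank solutions differently, the sketch below compares two made-up cost distributions. The scenario data and the use of the worst-case cost as a stand-in for \(r_\xi\) are assumptions made only for this example; they are not the newsvendor data behind the figure.

```python
# Hypothetical scenario data: four equiprobable scenarios, two candidate solutions.
probs = [0.25, 0.25, 0.25, 0.25]
cost_a = [10.0, 12.0, 14.0, 60.0]  # lower mean, but one very expensive scenario
cost_b = [22.0, 24.0, 26.0, 28.0]  # higher mean, tightly concentrated


def expected_value(costs, probs):
    """Probability-weighted average cost."""
    return sum(p * c for p, c in zip(probs, costs))


def worst_case(costs, probs):
    """Stand-in risk measure: largest cost among scenarios with positive probability."""
    return max(c for p, c in zip(probs, costs) if p > 0)


mean_a, mean_b = expected_value(cost_a, probs), expected_value(cost_b, probs)
risk_a, risk_b = worst_case(cost_a, probs), worst_case(cost_b, probs)

# The expected-value criterion and the risk criterion disagree here.
prefers_a_by_mean = mean_a < mean_b
prefers_b_by_risk = risk_b < risk_a
print(mean_a, mean_b, risk_a, risk_b)
```

With these numbers, solution a wins on expected cost while solution b wins on the tail-based criterion, which is the kind of conflict a risk measure is meant to expose.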
11.3. Trading off risk and expected return
As one may anticipate, considering risk as a sole objective to be minimised may lead to overconservative decisions, just as only minimising expected costs may completely overlook their exposure to risk.
Thus, most settings in which risk is considered take both measures into account at once. Since the two are conflicting in nature, they must be treated from a bi-objective standpoint. In other words, most applications rely on a typical bi-objective optimisation technique, such as using weighted terms in the objective function
\[
\mini_{x} \ \mathbb{E}_\xi\left[F(x,\xi)\right] + \beta \, r_\xi\left[F(x,\xi)\right],
\]
where \(\beta = 0\) represents a risk-neutral stance, while risk aversion increases as \(\beta \to \infty\).
An alternative approach is to treat either of the measures as a budgeted constraint. For example, one could restrict the risk exposure to a risk budget \(\delta > 0\), as
\[
\mini_{x} \ \braces{\mathbb{E}_\xi\left[F(x,\xi)\right] : r_\xi\left[F(x,\xi)\right] \le \delta}.
\]
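The two ways of combining expected cost and risk can be sketched on a finite set of candidate solutions. The candidate names and their (expected cost, risk) pairs below are hypothetical, and the selection is done by plain enumeration rather than by solving an optimisation model.

```python
# Hypothetical candidates: name -> (expected cost, risk value).
candidates = {"x1": (24.0, 60.0), "x2": (25.0, 28.0), "x3": (30.0, 20.0)}


def pick_weighted(candidates, beta):
    """Weighted method: minimise expected cost + beta * risk."""
    return min(candidates, key=lambda k: candidates[k][0] + beta * candidates[k][1])


def pick_budget(candidates, delta):
    """Budget method: minimise expected cost subject to risk <= delta."""
    feasible = {k: v for k, v in candidates.items() if v[1] <= delta}
    if not feasible:
        return None  # no candidate satisfies the risk budget
    return min(feasible, key=lambda k: feasible[k][0])


for beta in (0.0, 0.1, 1.0):
    print("beta =", beta, "->", pick_weighted(candidates, beta))
for delta in (100.0, 30.0, 25.0, 10.0):
    print("delta =", delta, "->", pick_budget(candidates, delta))
```

In this toy data, increasing \(\beta\) moves the choice from the lowest-cost candidate towards the lowest-risk one, and tightening \(\delta\) has a similar effect until no candidate remains feasible.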
11.4. Suitable risk measures for optimisation
For a function to be a suitable risk measure, it has to satisfy some axiomatic properties. [ADEH99] define risk measures to be coherent if they satisfy the following properties:
Translation invariance: \(r_\xi\left[F(x, \xi) + a\right] = r_\xi\left[F(x, \xi)\right] + a\) for \(a \in \reals\).
Subadditivity: \(r_\xi\left[F(x', \xi) + F(x'', \xi)\right] \le r_\xi\left[F(x', \xi)\right] + r_\xi\left[F(x'', \xi)\right]\).
Positive homogeneity: \(r_\xi\left[F(x, \xi) \times a\right] = r_\xi\left[F(x, \xi)\right] \times a\) for \(a \ge 0\).
Monotonicity: if for every \(\xi\), we have that \(F(x', \xi) \le F(x'', \xi)\), then \(r_\xi\left[F(x', \xi)\right] \le r_\xi\left[F(x'', \xi)\right]\).
The importance of these properties lies in their role in the context of optimisation. [Roc07] shows that a coherent risk measure (i) preserves convexity, (ii) preserves certainty, and (iii) is insensitive to scaling.
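The four properties can be checked numerically on scenario data. The sketch below does so for the worst-case cost, used here only as a simple example of a risk measure; the scenario costs and the constant `a` are made up, and passing these checks on one data set illustrates the axioms rather than proving coherence.

```python
def worst_case(costs):
    """Example risk measure: the largest scenario cost."""
    return max(costs)


# Hypothetical scenario costs of two solutions over the same four scenarios.
f1 = [10.0, 12.0, 14.0, 60.0]
f2 = [22.0, 24.0, 26.0, 28.0]
a = 5.0  # nonnegative constant used for shifting and scaling

translation_ok = worst_case([c + a for c in f1]) == worst_case(f1) + a
subadditive_ok = worst_case([u + v for u, v in zip(f1, f2)]) <= worst_case(f1) + worst_case(f2)
homogeneous_ok = worst_case([a * c for c in f1]) == a * worst_case(f1)

f3 = [c + 1.0 for c in f1]  # f3 is at least as costly as f1 in every scenario
monotone_ok = worst_case(f1) <= worst_case(f3)

# Scaling by a negative constant flips the ordering of outcomes,
# so the homogeneity identity is not expected to hold in that case.
negative_scaling_ok = worst_case([-1.0 * c for c in f1]) == -1.0 * worst_case(f1)

print(translation_ok, subadditive_ok, homogeneous_ok, monotone_ok, negative_scaling_ok)
```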
11.5. Conditional value-at-risk
Conditional value-at-risk (CVaR) is likely the most widespread risk measure used in the context of optimisation. Being a coherent risk measure, it does not compromise convexity properties of risk-neutral counterparts.
Note
Risk measures are widely employed, with a great measure of success, in settings other than traditional asset management. See, for example, [AOP20].
Let \(X\) be a random variable and \(F_X\) its cumulative distribution function. Then, for a confidence level \(\alpha\), the Value-at-Risk (\(VaR_\alpha\)) is defined as
\[
VaR_\alpha(X) = \mini_{\eta} \braces{\eta : F_X(\eta) \ge \alpha}.
\]
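In other words, \(VaR_\alpha\) represents the \(\alpha\)-quantile of \(X\). With that, we can then define the conditional value-at-risk, which represents the expectation of \(X\) conditional on its upper tail beyond \(VaR_\alpha\) (the worst \(1-\alpha\) fraction of outcomes when minimising), i.e.,
\[
CVaR_\alpha(X) = \mini_{\eta} \braces{\eta + \frac{1}{1-\alpha}\,\mathbb{E}\left[[X - \eta]^+\right]},
\]
where \([\, \cdot \,]^+ = \max\braces{0,\,\cdot\,}\).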
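For a discrete (scenario-based) cost distribution, both quantities can be computed directly. The sketch below assumes a minimisation setting, takes VaR as the smallest cost whose cumulative probability reaches \(\alpha\), and evaluates CVaR with the \(\eta + \frac{1}{1-\alpha}\mathbb{E}\left[[X-\eta]^+\right]\) expression at \(\eta = VaR_\alpha\). The function names, the scenario data and the small numerical tolerance are illustrative choices, not part of the original notes.

```python
def var_cvar(costs, probs, alpha):
    """VaR and CVaR of a discrete cost distribution (upper tail, minimisation setting)."""
    pairs = sorted(zip(costs, probs))  # scenarios ordered by increasing cost
    cumulative, var = 0.0, pairs[-1][0]
    for cost, prob in pairs:
        cumulative += prob
        if cumulative >= alpha - 1e-12:  # tolerance guards against float rounding
            var = cost
            break
    expected_excess = sum(p * max(c - var, 0.0) for c, p in pairs)
    return var, var + expected_excess / (1.0 - alpha)


def cvar_by_minimisation(costs, probs, alpha):
    """Same CVaR, obtained by minimising over eta.

    The objective is piecewise linear and convex in eta with breakpoints at the
    scenario costs, so it suffices to try each scenario cost as a candidate eta.
    """
    def objective(eta):
        return eta + sum(p * max(c - eta, 0.0) for c, p in zip(costs, probs)) / (1.0 - alpha)
    return min(objective(eta) for eta in costs)


# Hypothetical data: five equiprobable scenarios, one of them very costly.
costs = [10.0, 20.0, 30.0, 40.0, 100.0]
probs = [0.2] * 5
var, cvar = var_cvar(costs, probs, alpha=0.8)
print(var, cvar, cvar_by_minimisation(costs, probs, alpha=0.8))
```

Here the worst 20% of the distribution consists of the single scenario with cost 100, so the CVaR coincides with that cost, while the VaR is the fourth-largest cost.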
Warning
Notice that, when maximising, the tail of interest for calculating the conditional value-at-risk is instead the lower \(1-\alpha\) tail.
Fig. 11.2 illustrates the calculation of \(CVaR_{90\%}\) for the two solutions from Fig. 11.1. Notice that, if CVaR were our sole criterion for choosing between solutions, we would prefer the latter solution over the former.
Fig. 11.2 Comparing two solutions: the solution on the bottom has better (smaller) \(CVaR_{90\%}\)
Notice that CVaR, having an expected value as part of its calculation, is typically used in settings where the uncertainty is discretised, i.e., represented as scenarios. Moreover, an implication of its coherence in the context of optimisation is convexity preservation, which means that it can be represented without additional binary variables. This contrasts with VaR (or chance constraints), which needs extra binary variables for its representation.
11.6. Conditional value-at-risk formulation for two-stage stochastic programming models
Recall our risk-neutral scenario-based deterministic equivalent two-stage stochastic programming (2SSP) model
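In scenario form, this model can be written along the following lines. The symbol \(P_s\) for the probability of scenario \(s \in S\) is an assumption of this sketch; the remaining symbols follow the definitions of \(F\), \(Q\) and \(X\) given earlier, indexed by scenario.

```latex
\begin{aligned}
\min_{x,\,y} \quad & c^\top x + \sum_{s \in S} P_s \, q_s^\top y_s \\
\text{s.t.} \quad & Ax = b, \quad x \ge 0, \\
& T_s x + W_s y_s = h_s, \quad \forall s \in S, \\
& y_s \ge 0, \quad \forall s \in S.
\end{aligned}
```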
In order to model CVaR, we need to define some additional terms. Let \(\beta \in [0,1]\) be the weight parameter for the risk term, and \(\alpha\) the confidence level.
We also define some auxiliary decision variables. Let \(\eta \ge 0\) be a continuous variable that represents the value-at-risk (VaR), and let \(\pi_s \ge 0\), \(\forall s \in S\), account for \([X - \eta]^+\) in each scenario. Notice that, in the CVaR definition above, \(X \equiv c^\top x + q_s^\top y_s\).
Then, the risk-averse scenario-based deterministic equivalent 2SSP is
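One way to write such a model from the definitions above is sketched below. Two things are assumptions here rather than statements from the text: the symbol \(P_s\) for scenario probabilities, and the convex-combination weighting \((1-\beta)\) and \(\beta\), which is suggested by \(\beta \in [0,1]\) but could equally be the additive form of Section 11.3 with the expected cost left unweighted.

```latex
\begin{aligned}
\min_{x,\,y,\,\eta,\,\pi} \quad & (1-\beta)\Big(c^\top x + \sum_{s \in S} P_s \, q_s^\top y_s\Big)
  + \beta \Big(\eta + \frac{1}{1-\alpha} \sum_{s \in S} P_s \, \pi_s\Big) \\
\text{s.t.} \quad & Ax = b, \quad x \ge 0, \\
& T_s x + W_s y_s = h_s, \quad \forall s \in S, \\
& \pi_s \ge c^\top x + q_s^\top y_s - \eta, \quad \forall s \in S, \\
& y_s \ge 0, \quad \pi_s \ge 0, \quad \forall s \in S, \\
& \eta \ge 0.
\end{aligned}
```

Since \(\pi_s\) is minimised and bounded below by both \(0\) and \(c^\top x + q_s^\top y_s - \eta\), it takes the value \([c^\top x + q_s^\top y_s - \eta]^+\) at optimality whenever \(\beta > 0\), which is how the positive-part operator is linearised without binary variables.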
11.7. Final remarks on the use of CVaR
Notice how the formulation is posed using the weighted method discussed in Section 11.3 (Trading off risk and expected return). One point to be aware of is that, due to scaling, \(\beta = 0.5\) is not necessarily a midpoint between the risk-neutral and risk-averse solutions.
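The remark about \(\beta = 0.5\) can be seen on a toy instance. The sketch below uses a hypothetical newsvendor-style problem (order \(x\) at unit cost 1, cover any shortfall at unit cost 3) and assumes the objective \((1-\beta)\,\mathbb{E}[F] + \beta\,CVaR_\alpha[F]\). For simplicity, it enumerates integer order quantities instead of solving the linear program, so it evaluates the same kind of objective but is not the deterministic equivalent itself.

```python
def cvar(costs, probs, alpha):
    """CVaR of a discrete cost distribution (upper tail, minimisation setting)."""
    pairs = sorted(zip(costs, probs))
    cumulative, var = 0.0, pairs[-1][0]
    for cost, prob in pairs:
        cumulative += prob
        if cumulative >= alpha - 1e-12:  # tolerance guards against float rounding
            var = cost
            break
    return var + sum(p * max(c - var, 0.0) for c, p in pairs) / (1.0 - alpha)


# Hypothetical instance data.
demands = [50, 60, 70, 80, 150]
probs = [0.2] * 5
unit_cost, recourse_cost, alpha = 1.0, 3.0, 0.8


def scenario_costs(x):
    """First-stage purchase cost plus second-stage cost of covering the shortfall."""
    return [unit_cost * x + recourse_cost * max(d - x, 0) for d in demands]


def objective(x, beta):
    costs = scenario_costs(x)
    mean = sum(p * c for p, c in zip(probs, costs))
    return (1.0 - beta) * mean + beta * cvar(costs, probs, alpha)


def best_order(beta):
    """Best integer order quantity in [0, 150] for a given risk weight."""
    return min(range(0, 151), key=lambda x: objective(x, beta))


for beta in (0.0, 0.1, 0.5, 1.0):
    print("beta =", beta, "-> order", best_order(beta))
```

In this instance the solution already coincides with the fully risk-averse one at \(\beta = 0.5\), and the switch away from the risk-neutral order happens at a much smaller weight, because the CVaR term varies more steeply with \(x\) than the expected cost does.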
Another relevant remark is that CVaR can be used in multi-stage settings (see [Sha11]). There are also more recent alternative risk measures that not only work in multi-stage settings but can also perform better from a computational standpoint, while serving as a proxy for other risk aversion paradigms, such as worst-case minimisation or distributional robustness [DMP22].