Constructing a subgradient from directional derivatives for functions of two variables

For any scalar-valued bivariate function that is locally Lipschitz continuous and directionally differentiable, it is shown that a subgradient may always be constructed from the function's directional derivatives in the four compass directions, arranged in a so-called "compass difference". When the original function is nonconvex, the obtained subgradient is an element of Clarke's generalized gradient, but the result appears to be novel even for convex functions. The function is not required to be represented in any particular form, and no further assumptions are required, though the result is strengthened when the function is additionally L-smooth in the sense of Nesterov. For certain optimal-value functions and certain parametric solutions of differential equation systems, these new results appear to provide the only known way to compute a subgradient. These results also imply that centered finite differences will converge to a subgradient for bivariate nonsmooth functions. As a dual result, we find that any compact convex set in two dimensions contains the midpoint of its interval hull. Examples are included for illustration, and it is demonstrated that these results do not extend directly to functions of more than two variables or sets in higher dimensions.

Subgradient methods [ , ] and bundle methods [ , , , ] for nonsmooth optimization typically use a subgradient at each iteration to provide local sensitivity information that is ultimately useful enough to infer descent. For convex problems, these subgradients are elements of the convex subdifferential; for nonconvex problems, the subgradients must typically be elements of either Clarke's generalized gradient [ ] or other established generalized subdifferentials [ , , ]. Evaluating a subgradient directly, however, may be a challenging task; this difficulty has motivated the development of numerous subdifferential approximations [ , , ].
Nevertheless, there are several settings in which evaluating directional derivatives is much simpler than evaluating a subgradient using established methods. For finite compositions of simple smooth and nonsmooth functions, directional derivatives may be evaluated efficiently [ ] by extending the standard forward/tangent mode of algorithmic differentiation [ ], while extensions to efficient subgradient evaluation methods require more care [ , ]. Directional derivatives of implicit functions and inverse functions may be obtained by solving auxiliary equation systems [ ], whereas subgradient results in this setting assume either special structure [ , ] or a series of recursive equation-solves [ ].
For solutions of parametric ordinary differential equations (ODEs) with nonsmooth right-hand sides, directional derivatives may be evaluated by solving an auxiliary ODE system [ , Theorem ] using a standard ODE solver, whereas the only general method for subgradient evaluation involves solving a series of ODEs punctuated by discrete jumps that must be handled carefully [ , ]. In parametric optimization, Danskin's classical result [ , ] describes directional derivatives of optimal-value functions as optimal values of related optimization problems in a general setting, while subgradient results such as [ , Theorem . ] tend to additionally require unique solutions for the embedded optimization problem. Moreover, directional derivatives and subdifferentials of convex functions are essentially duals [ ]. Hence, this article examines the question of whether, given a directional-derivative evaluation oracle for a function and little else, this oracle may be used to compute a subgradient at each iteration of a typical nonsmooth optimization method. This is clearly true for univariate functions, for example; in that case, the entire subdifferential may be constructed from the directional derivatives in the positive and negative directions.
To address this question, this article defines a function's compass differences to be vectors obtained by arranging directional derivatives in the coordinate directions and negative coordinate directions in a certain way. Thus, for a bivariate function, a compass difference involves directional derivatives in the four compass directions. For a bivariate function that is locally Lipschitz continuous and directionally differentiable, it is shown that the compass difference at any domain point is a subgradient, with this subgradient understood to be an element of Clarke's generalized gradient in the nonconvex case. Surprisingly, while this result is simple to state, it appears to be previously unknown even for convex functions, and does not require any additional assumptions. It is also shown that this result does not extend directly to functions of more than two variables. As a related result, this article shows that a compact convex set in R² must always contain the midpoint of its interval hull, though this does not extend directly to sets in R^n for n > 2. Hence, four calls to a directional-derivative evaluation oracle are sufficient to compute a subgradient for a nonsmooth bivariate function, and centered finite differences for these functions are useful approximations of a subgradient. In several cases, the approach of this article appears to be the only way known thus far to evaluate a subgradient correctly.
Audet and Hare [ ] studied a similar problem with a similar setup, in the field of geometric probing [ ]. Unlike our work, Audet and Hare additionally assume that: (a) their oracle D is convex with respect to direction (as a set's support function is), (b) the bivariate function's regular subdifferential is polyhedral, and (c) the oracle D evaluates the function's directional derivatives. These assumptions are evidently satisfied, for example, by any function that is both convex and piecewise-differentiable in the sense of Scholtes [ ]. Under these assumptions, Audet and Hare present a method that uses finitely many directional derivative evaluations to construct the whole regular subdifferential at a given domain point. This method proceeds by deducing each vertex of the subdifferential, and depends heavily on the assumption of subdifferential polyhedrality; its complexity scales linearly with the number of subdifferential vertices. It is readily verified, for example, that their algorithm will run forever without locating any subgradients when applied to the convex Euclidean norm function; indeed, their algorithm is not intended to work in this case. Unlike the work of [ ], we do not assume that subdifferentials are polyhedral, do not require the subdifferential's support function to be available in the nonconvex case, and do not assume directional derivatives to be convex with respect to direction. Our goal is only to identify one subgradient rather than a whole subdifferential; characterizing a whole subdifferential in closed form may be difficult or impossible when we do not know a priori that it is polyhedral. As mentioned above, in the nonconvex case, we evaluate an element of Clarke's generalized gradient [ ] instead of a subgradient. The remainder of this article is structured as follows. Section 2 summarizes relevant established constructions in nonsmooth analysis, Section 3 defines compass differences in terms of directional derivatives and shows that they are valid subgradients, and Section 4 presents several examples for illustration.
The Euclidean norm ‖·‖ and inner product ⟨·, ·⟩ are used throughout this article. The i-th unit coordinate vector in R^n is denoted e^{(i)}, and components of vectors are indicated using subscripts, e.g. x_i = ⟨e^{(i)}, x⟩. The convex hull and the closure of a set S ⊂ R^n are denoted conv S and cl S, respectively.

Definition . . Consider an open set X ⊂ R^n and a function f : X → R. The following limit, if it exists, is the (one-sided) directional derivative of f at x ∈ X in the direction d ∈ R^n:

    f'(x; d) := lim_{t→0⁺} (f(x + t d) − f(x)) / t.

This article primarily considers situations where directional derivatives are available via a black-box oracle. For example, this oracle could represent symbolic calculation, the situation-specific directional derivatives described in Section 4, algorithmic differentiation [ ], or even finite difference approximation if some error is tolerable.
The primary goal of this article is to use directional derivatives to evaluate a subgradient, defined for convex functions as follows, and generalized to nonconvex functions as in Section . below. Individual subgradients are used at each iteration of subgradient methods for convex minimization [ ] and bundle methods for nonconvex minimization [ ]. They are also used to build useful affine outer approximations for nonconvex sets [ , ]. In each of these applications, only a single subgradient is needed at each visited domain point.

Definition . . Given a convex set X ⊂ R^n and a convex function f : X → R, a vector s ∈ R^n is a subgradient of f at x ∈ X if:

    f(y) ≥ f(x) + ⟨s, y − x⟩ for each y ∈ X.

The set of all subgradients of f at x is the (convex) subdifferential ∂f(x).
Thus, the subdifferential characterizes the local behavior of convex functions via ( . ), and characterizes the global behavior of convex functions via ( . ). In particular, for convex f, directional derivatives exist and satisfy:

    f'(x; d) = max{⟨s, d⟩ : s ∈ ∂f(x)} for each d ∈ R^n.

Moreover, ( . ) shows that directional derivatives and subgradients of convex functions are essentially duals of each other.

Definition . . Consider an open set X ⊂ R^n and a locally Lipschitz continuous function f : X → R. The (Clarke-)generalized directional derivative of f at x ∈ X in the direction d ∈ R^n is:

    f°(x; d) := limsup_{y→x, t→0⁺} (f(y + t d) − f(y)) / t.

Clarke's generalized gradient of f at x is then:

    ∂f(x) := {s ∈ R^n : ⟨s, d⟩ ≤ f°(x; d) for each d ∈ R^n}.

Elements of Clarke's generalized gradient will be called Clarke subgradients.
With f as in the above definition, and for any x ∈ X, ∂f(x) is guaranteed to be nonempty, convex, and compact in R^n. As suggested by its notation, Clarke's generalized gradient does indeed coincide with the convex subdifferential when f is convex [ ]. When f is nonconvex, ( . ) is no longer guaranteed to hold with Clarke's generalized gradient in place of the convex subdifferential. The following result for univariate functions is easily demonstrated, and is summarized in [ ].

Proposition . . Consider an open set X ⊂ R and a univariate function f : X → R that is locally Lipschitz continuous and directionally differentiable. For each x ∈ X,

    conv {f'(x; 1), −f'(x; −1)} ⊂ ∂f(x).

Hence, one call to an oracle that evaluates directional derivatives is sufficient to obtain a single Clarke subgradient for such a univariate function f. It will be shown in this article that, for bivariate functions f : R² → R that are locally Lipschitz continuous and directionally differentiable, four directional derivative evaluations are sufficient to evaluate a single Clarke subgradient.
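As a minimal illustration in Python (a sketch of ours, not part of the development above), the two one-sided directional derivatives in this proposition may be approximated by forward difference quotients and combined as the midpoint of conv{f'(x; 1), −f'(x; −1)}; the step size t and the test function abs are arbitrary illustrative choices:

    # Minimal sketch: a Clarke subgradient of a univariate nonsmooth function,
    # assembled from approximate one-sided directional derivatives.
    def directional_derivative(f, x, d, t=1e-7):
        # Forward-difference approximation of f'(x; d).
        return (f(x + t * d) - f(x)) / t

    def univariate_clarke_subgradient(f, x):
        # Midpoint of conv{f'(x; 1), -f'(x; -1)}, an element of the generalized
        # gradient for locally Lipschitz, directionally differentiable f.
        return 0.5 * (directional_derivative(f, x, 1.0)
                      - directional_derivative(f, x, -1.0))

    # For f = abs at x = 0: f'(0; 1) = f'(0; -1) = 1, so the result is 0,
    # the midpoint of the generalized gradient [-1, 1].
    print(univariate_clarke_subgradient(abs, 0.0))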
The following definition by Nesterov [ ] will be used to specialize this result in a useful way. Nesterov's definition is based on repeated directional differentiation, and permits certain extensions of calculus rules for smooth functions to nonsmooth functions.

Definition . . Consider an open set X ⊂ R^n and a locally Lipschitz continuous function f : X → R. The function f is lexicographically (L-)smooth at x ∈ X if f is directionally differentiable at x, and if, for any collection of vectors m^{(1)}, . . . , m^{(n)} ∈ R^n, the following inductive sequence of higher-order directional derivatives is well-defined:

    f_x^{(0)} : R^n → R : d ↦ f'(x; d),
    f_x^{(j)} : R^n → R : d ↦ [f_x^{(j−1)}]'(m^{(j)}; d), for each j ∈ {1, . . . , n}.

If these vectors m^{(i)} are linearly independent, then f_x^{(n)} is linear, and its constant gradient is called a lexicographic subgradient of f at x. The lexicographic subdifferential ∂_L f(x) is the set of all lexicographic subgradients of f at x.

Constructing a subgradient from directional derivatives

This section defines compass differences for functions in terms of directional derivatives, and shows that a compass difference of a bivariate function is a subgradient. As a corollary, it is also shown that any compact convex set in R² contains the midpoint of its interval hull. As there is nothing particularly special about the compass directions in this context, other choices of directions are also considered.
Definition . . Consider an open set X ⊂ R^n and a function f : X → R that is directionally differentiable at some x ∈ X. The compass difference of f at x is the vector:

    ∆⊕f(x) := ½ Σ_{i=1}^{n} (f'(x; e^{(i)}) − f'(x; −e^{(i)})) e^{(i)}.

The compass difference is so named because it considers how f behaves when its argument is varied in each of the compass directions. This metaphor works best when n = 2; this case is also the focus of this article.
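Given a directional-derivative oracle, a compass difference is immediate to assemble. The following Python sketch (ours; the oracle dd is a hypothetical black box assumed to return f'(x; d)) performs the 2n required oracle calls:

    import numpy as np

    def compass_difference(dd, x, n=2):
        # Assemble the compass difference of f at x from an oracle
        # dd(x, d) ~= f'(x; d), using 2n oracle calls (four when n = 2).
        s = np.zeros(n)
        for i in range(n):
            e = np.zeros(n)
            e[i] = 1.0
            s[i] = 0.5 * (dd(x, e) - dd(x, -e))
        return s

    # Exact oracle for the Euclidean norm at the origin: f'(0; d) = ||d||.
    dd_norm = lambda x, d: np.linalg.norm(d)
    print(compass_difference(dd_norm, np.zeros(2)))  # -> [0., 0.]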
Evaluating ∆⊕f(x) ostensibly requires 2n directional derivative evaluations. However, if directional derivative values are not available, compass differences may instead be approximated using finite differences. Observe that the compass difference of a function is a centered finite difference of the directional derivative mapping f'(x; ·) at 0. From the definition of the directional derivative, we have:

    ∆⊕f(x) = lim_{δ→0⁺} (1/(2δ)) Σ_{i=1}^{n} (f(x + δ e^{(i)}) − f(x − δ e^{(i)})) e^{(i)}.

So, if numerical evaluations of f : R^n → R are viable but evaluations of f'(x; ·) are not, then 2n evaluations of f may be used to approximate ∆⊕f(x) using the argument of the above limit. That is, for sufficiently small δ > 0,

    ∆⊕f(x) ≈ (1/(2δ)) Σ_{i=1}^{n} (f(x + δ e^{(i)}) − f(x − δ e^{(i)})) e^{(i)},
which is incidentally the centered simplex gradient of f at x with a sampling set comprising the coordinate vectors (cf. [ ]). However, if f is evaluated here using a numerical method, and if δ is too small, then the subtraction operations in this approximation may introduce unacceptable numerical error. This drawback is typical of finite difference approximations.
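In Python, this centered finite-difference approximation might be sketched as follows (the test function and the step size δ are arbitrary illustrative choices of ours, subject to the numerical caveat just mentioned):

    import numpy as np

    def approx_compass_difference(f, x, delta=1e-6):
        # Centered finite-difference approximation of the compass difference:
        # the centered simplex gradient with coordinate-direction sampling.
        n = x.size
        s = np.zeros(n)
        for i in range(n):
            e = np.zeros(n)
            e[i] = delta
            s[i] = (f(x + e) - f(x - e)) / (2.0 * delta)
        return s

    f = lambda x: abs(x[0]) + max(x[0], x[1])  # nonsmooth at the origin
    print(approx_compass_difference(f, np.zeros(2)))  # -> [0.5, 0.5]

Here (0.5, 0.5) is indeed a Clarke subgradient of the chosen test function at the origin.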
As in [ ], let us say that a function R^n → R is B-differentiable if it is both directionally differentiable and locally Lipschitz continuous. This section presents the main result of this article: that any compass difference of a B-differentiable function of two variables is a Clarke subgradient. This result is strengthened somewhat when the considered function is L-smooth, and is also specialized to convex functions and convex sets in the subsequent sections. To our knowledge, the main result in this section is the first general closed-form description of a Clarke subgradient for a nonconvex bivariate function in terms of that function's directional derivatives (in the sense of Definition . ). Moreover, the result shows that four calls to a directional derivative oracle are sufficient to evaluate a Clarke subgradient for a bivariate B-differentiable function, without any further structural knowledge of the function at all. Unlike established characterizations of generalized derivatives [ , Theorem . ], this result does not require f to be represented in any particular format.
The following mean-value theorem will be useful in this development.

Lemma . . Consider a function ψ : R^n → R that is positively homogeneous and locally Lipschitz continuous. For any x, y ∈ R^n, there exists s ∈ ∂ψ(0) for which:

    ψ(x) − ψ(y) = ⟨s, x − y⟩.     ( . )

If ψ is also L-smooth, then there exists s ∈ conv ∂_L ψ(0) ⊂ ∂ψ(0) satisfying ( . ).
The following theorem is the main result of this article, and rests heavily on Lemma . . It shows that any compass difference of a B-differentiable function is a Clarke subgradient, and specializes this result to L-smooth functions.

Theorem . . Consider an open set X ⊂ R² and a locally Lipschitz continuous function f : X → R. If f is directionally differentiable at some x ∈ X, then ∆⊕f(x) ∈ ∂f(x). If f is additionally L-smooth at x, then ∆⊕f(x) ∈ cl conv ∂_L f(x).

Proof. Suppose that f is directionally differentiable at x ∈ X. Consider the auxiliary mapping:

    ψ : R² → R : y ↦ f'(x; y) − ⟨∆⊕f(x), y⟩,

and observe that ψ is Lipschitz continuous [ ], and that f'(x; y) = ψ(y) + ⟨∆⊕f(x), y⟩ for each y ∈ R². Thus, Clarke's calculus rule for addition [ , Corollary to Proposition . . ] implies:

    ∂[f'(x; ·)](0) = ∂ψ(0) + {∆⊕f(x)}.     ( . )

Since ∂[f'(x; ·)](0) ⊂ ∂f(x) [ ], it therefore suffices to show that 0 ∈ ∂ψ(0). Now, observe that ψ is positively homogeneous, and so ψ is equivalent to ψ'(0; ·). Thus, for each i ∈ {1, 2},

    ψ'(0; e^{(i)}) − ψ'(0; −e^{(i)}) = (f'(x; e^{(i)}) − f'(x; −e^{(i)})) − 2 ⟨∆⊕f(x), e^{(i)}⟩ = 0.

Hence ∆⊕ψ(0) = 0.
To obtain a contradiction, suppose that 0 ∉ ∂ψ(0). Then, since ∂ψ(0) is convex and closed, there must exist a strictly separating hyperplane between 0 and ∂ψ(0). That is, there exist a nonzero vector p = (p₁, p₂) ∈ R² and a scalar a > 0 for which ⟨p, s⟩ ≥ a for each s ∈ ∂ψ(0). Applying Lemma . to ψ and invoking ∆⊕ψ(0) = 0 then yields an element of ∂ψ(0) that violates this inequality, which is the desired contradiction. Hence 0 ∈ ∂ψ(0), and so ( . ) shows that ∆⊕f(x) ∈ ∂f(x).
Next, suppose that f is L-smooth at x ∈ X. The inclusion ∂_L f(x) ⊂ ∂f(x) was shown by Nesterov [ , Theorem ]; since ∂f(x) is closed and convex, it follows that cl conv ∂_L f(x) ⊂ ∂f(x). Consider the auxiliary mapping ψ as above, and note that ( . ) still holds. The calculus rules of the lexicographic subdifferential [ , Theorem and Definitions and ] imply that both f'(x; ·) and ψ are L-smooth at 0, and that the analogue of ( . ) holds with lexicographic subdifferentials in place of generalized gradients. From here, a similar argument to the previous case shows that ∆⊕f(x) ∈ cl conv ∂_L f(x).

Intuitively, there is nothing special about the coordinate directions used to construct a compass difference, and a change of basis in Theorem . may be carried out as follows.

Corollary . . Consider an open set X ⊂ R², a locally Lipschitz continuous function f : X → R, and a nonsingular matrix V ∈ R^{2×2}. If f is directionally differentiable at some x ∈ X, and if v^{(i)} denotes the i-th column of V, then

    (V^T)^{−1} z ∈ ∂f(x), where z := ½ Σ_{i=1}^{2} (f'(x; v^{(i)}) − f'(x; −v^{(i)})) e^{(i)}.

Proof. Consider the auxiliary mapping:

    h : V^{−1}X → R : w ↦ f(V w).

The chain rule for directional derivatives [ , Theorem . . ] implies that ∆⊕h(V^{−1}x) = z, and so Theorem . shows that z ∈ ∂h(V^{−1}x). Since V is nonsingular, the mapping w ↦ V w is surjective, in which case [ , Theorem . . ] implies that:

    ∂h(V^{−1}x) = V^T ∂f(x).

Thus, z = V^T s for some s ∈ ∂f(x), and so (V^T)^{−1} z ∈ ∂f(x) as claimed.
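Numerically, this change of basis amounts to one 2×2 linear solve. In the following Python sketch (ours), the oracle dd is again a hypothetical black box assumed to return f'(x; d):

    import numpy as np

    def rotated_compass_subgradient(dd, x, V):
        # Probe along the columns of a nonsingular V (and their negatives),
        # then solve V^T s = z to recover a Clarke subgradient.
        z = np.array([0.5 * (dd(x, V[:, i]) - dd(x, -V[:, i]))
                      for i in range(2)])
        return np.linalg.solve(V.T, z)

    dd_norm = lambda x, d: np.linalg.norm(d)   # Euclidean norm at the origin
    V = np.array([[1.0, 1.0],
                  [1.0, -1.0]])                # a rotated and scaled basis
    print(rotated_compass_subgradient(dd_norm, np.zeros(2), V))  # -> [0., 0.]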
The particular Clarke subgradients identified by Theorem . and Corollary . do not necessarily coincide.
We may remove the directional differentiability requirement of Theorem . as follows, by employing Clarke's generalized directional derivative f° from Definition . . We note, however, that the generalized directional derivative is typically inaccessible in practice.

Corollary . . Given an open set X ⊂ R², a locally Lipschitz continuous function f : X → R, and some x ∈ X,

    ½ Σ_{i=1}^{2} (f°(x; e^{(i)}) − f°(x; −e^{(i)})) e^{(i)} ∈ ∂f(x).

This section specializes Theorem . to convex functions; this specialization appears to be a novel result in convex analysis and is simpler to state. Namely, any compass difference of a bivariate convex function is in fact a subgradient in the traditional sense. Hence, four directional derivative evaluations are sufficient to construct a subgradient of a bivariate convex function.

Corollary . . Consider an open convex set X ⊂ R² and a convex function f : X → R. For each x ∈ X,

    ∆⊕f(x) ∈ ∂f(x).

Proof. Since f is convex and X is open, f is locally Lipschitz continuous and directionally differentiable on X. The claimed result then follows immediately from Theorem . .
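For instance (a small Python check of ours), for the convex function f(x) = max(x₁, x₂), whose subdifferential at the origin is the segment conv{(1, 0), (0, 1)}, an exact directional-derivative oracle gives:

    import numpy as np

    # Exact oracle: by positive homogeneity, f'(0; d) = max(d1, d2).
    dd = lambda x, d: max(d[0], d[1])

    e1, e2 = np.array([1.0, 0.0]), np.array([0.0, 1.0])
    s = 0.5 * np.array([dd(0, e1) - dd(0, -e1),
                        dd(0, e2) - dd(0, -e2)])
    print(s)  # -> [0.5, 0.5], the midpoint of the segment conv{(1,0), (0,1)}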

Corollary . . Consider an open convex set X ⊂ R², a convex function f : X → R, and a nonsingular matrix V ∈ R^{2×2}. For any x ∈ X, with v^{(i)} denoting the i-th column of V,

    (V^T)^{−1} (½ Σ_{i=1}^{2} (f'(x; v^{(i)}) − f'(x; −v^{(i)})) e^{(i)}) ∈ ∂f(x).

Proof. Again, since f is locally Lipschitz continuous and directionally differentiable, the claimed corollary is a special case of Corollary . .

Locating an element of a compact convex set in R²
This section applies Corollary . to show that any nonempty compact convex set in R² contains the center of its smallest enclosing box (or interval). These notions are formalized in the following classical definitions (summarized in [ ]), followed by the claimed result.

Definition . . An interval in R^n is a nonempty set of the form {x ∈ R^n : a ≤ x ≤ b}, where a, b ∈ R^n, and where each inequality is to be interpreted componentwise. The midpoint of an interval {x ∈ R^n : a ≤ x ≤ b} is ½(a + b). Given a bounded set B ⊂ R^n, the interval hull of B is the intersection in R^n of all interval supersets of B.
The interval hull of a bounded set B ⊂ R^n is itself an interval, and is, intuitively, the smallest interval superset of B. Support functions of convex sets, defined as follows and discussed at length in [ ], are useful when relating convex sets to properties of subdifferentials of convex functions.

Definition . . Given a set C ⊂ R^n, the support function of C is the mapping:

    σ_C : R^n → R ∪ {+∞} : d ↦ sup{⟨d, x⟩ : x ∈ C}.

The following corollary uses support functions to extend Corollary . to the problem of locating an element of a closed convex set in R².

Corollary . . Any nonempty compact convex set C ⊂ R² contains the midpoint of its interval hull.

Proof. The interval hull of C may be expressed in terms of the support function σ_C as:

    {x ∈ R² : −σ_C(−e^{(i)}) ≤ x_i ≤ σ_C(e^{(i)}) for each i ∈ {1, 2}};

the midpoint of this interval hull is then:

    z := ½ Σ_{i=1}^{2} (σ_C(e^{(i)}) − σ_C(−e^{(i)})) e^{(i)}.

As shown in [ , Section VI, Example . ], σ_C is directionally differentiable at 0, with σ_C'(0; d) = σ_C(d) for each d ∈ R². Thus, ∆⊕σ_C(0) = z. Next, [ , Section VI, Example . ] also shows that σ_C is convex, with ∂σ_C(0) = C. Combining these observations with Corollary . yields z = ∆⊕σ_C(0) ∈ ∂σ_C(0) = C, as claimed.
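Computationally, this corollary says that four support-function evaluations locate a point of any compact convex C ⊂ R². The following Python sketch illustrates this for a polygon (the triangle is our own example data), whose support function is a maximum of finitely many inner products:

    import numpy as np

    def interval_hull_midpoint(support, n=2):
        # Midpoint of the interval hull, from 2n support function evaluations;
        # for n = 2 this point lies in the set itself.
        mid = np.zeros(n)
        for i in range(n):
            e = np.zeros(n)
            e[i] = 1.0
            mid[i] = 0.5 * (support(e) - support(-e))
        return mid

    # A triangle in R^2, described by its vertices.
    vertices = np.array([[0.0, 0.0], [2.0, 1.0], [1.0, 3.0]])
    support = lambda d: (vertices @ d).max()
    print(interval_hull_midpoint(support))  # -> [1., 1.5], a point of the triangle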
Examples

This section illustrates the main results of this article. Section . motivates the assumptions of Corollary . and Corollary . by showing how these results could fail if their assumptions were weakened. Section . uses compass differences to compute individual subgradients in cases where this was previously difficult or impossible. The following example shows that Theorem . , Corollary . , and Corollary . are minimal in the sense that, under the respective assumptions of these results, three support function evaluations are generally not sufficient to infer a set element, and three directional derivative evaluations are generally not sufficient to infer a function's subgradient.

Example . . Suppose that C ⊂ R² is the unit ball {x ∈ R² : ‖x‖ ≤ 1}, which has the support function σ_C : d ↦ ‖d‖. Consider three unit vectors u, v, w ∈ R² in general position. From the support function's definition, if we did not know the set C but did know that σ_C(u) = σ_C(v) = σ_C(w) = 1, then we could infer that C is a subset of the triangle:

    T := {x ∈ R² : ⟨u, x⟩ ≤ 1, ⟨v, x⟩ ≤ 1, ⟨w, x⟩ ≤ 1}.

Denote the three vertices of T as a, b, c ∈ R², and denote the three edges of T as T₁ := conv{a, b}, T₂ := conv{b, c}, and T₃ := conv{a, c}.
Since {a, b} ⊂ T₁ ⊂ T, observe that σ_{T₁}(u) ≤ σ_T(u) = 1.

[Figure : The disjoint convex compact sets C₁ (red) and C₂ (blue) in R³ described in Example . , and the common midpoint (black dot) of their interval hulls.]
But, since a, b, c are the vertices of the triangle T, and since one edge of T lies on the line ⟨u, x⟩ = 1, it cannot be that ⟨u, a⟩ and ⟨u, b⟩ are both less than 1. Hence σ_{T₁}(u) ≥ 1, and so σ_C(u) = σ_{T₁}(u).

Similar logic shows that σ_C and each σ_{T_i} agree at all three probing directions; that is, σ_{T_i}(d) = σ_C(d) = 1 for each i ∈ {1, 2, 3} and each d ∈ {u, v, w}.
Each T_i is compact and convex, and the intersection T₁ ∩ T₂ ∩ T₃ is empty. Hence, there is no way to infer an element of C from the support function evaluations σ_C(u), σ_C(v), and σ_C(w) and the knowledge that C is compact and convex; these support function evaluations are consistent with the incorrect hypotheses C = T₁, C = T₂, and C = T₃, yet these guesses have no point in common.
Similarly, considering the convex Euclidean norm function f : x ↦ ‖x‖, it is readily verified that ∂f(0) = C. Suppose we know nothing about f other than its convexity and the fact that f'(0; u) = f'(0; v) = f'(0; w) = 1. In this case, there is no way to infer an element of ∂f(0) from these three directional derivatives alone, since for each i ∈ {1, 2, 3}, the support functions σ_{T_i} all have the same directional derivatives as f at 0 in the directions u, v, and w. However, their subdifferentials at 0 are the sets T_i, which have no point in common.
The following example shows that the results of this article do not extend directly to functions of more than two variables or sets in more than two dimensions.

Example . . Consider the following convex compact sets in R³:

    C₁ := conv{(1, 1, −1), (1, −1, 1), (−1, 1, 1)} and C₂ := −C₁.

These sets are illustrated in Figure . They are disjoint; for any x ∈ C₁ and y ∈ C₂, and with e := (1, 1, 1) ∈ R³, observe that ⟨e, x⟩ ≥ 1 > −1 ≥ ⟨e, y⟩.

However, it is readily verified that both C₁ and C₂ have the interval hull [−1, 1]³, whose midpoint is (0, 0, 0), which is in neither C₁ nor C₂. Thus, Corollary . does not extend immediately to R³.
Similarly, consider the following two convex piecewise-linear functions:

    f : R³ → R : x ↦ max{x₁ + x₂ − x₃, x₁ − x₂ + x₃, −x₁ + x₂ + x₃},
    ϕ : R³ → R : x ↦ max{−x₁ − x₂ + x₃, −x₁ + x₂ − x₃, x₁ − x₂ − x₃},

for which ∂f(0) = C₁ and ∂ϕ(0) = C₂. Moreover, it is readily verified that:

    f'(0; ±e^{(i)}) = ϕ'(0; ±e^{(i)}) = 1 for each i ∈ {1, 2, 3}.

Thus, the functions f and ϕ cannot be distinguished based on their directional derivatives at 0 in any coordinate direction or negative coordinate direction, and ∆⊕f(0) = ∆⊕ϕ(0) = 0, but the two functions' subdifferentials at 0 are disjoint. This shows that Theorem . and Corollary . do not extend immediately to functions of three variables.
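A quick numerical check of this example in Python (ours, using f and ϕ exactly as written above) confirms that all six signed coordinate-direction derivatives agree:

    import numpy as np

    f   = lambda x: max(x[0] + x[1] - x[2], x[0] - x[1] + x[2],
                        -x[0] + x[1] + x[2])
    phi = lambda x: max(-x[0] - x[1] + x[2], -x[0] + x[1] - x[2],
                        x[0] - x[1] - x[2])

    # Both functions are positively homogeneous, so f'(0; d) = f(d); compare
    # directional derivatives in all six signed coordinate directions.
    for i in range(3):
        for sign in (1.0, -1.0):
            d = np.zeros(3)
            d[i] = sign
            assert f(d) == phi(d) == 1.0

    print("equal compass differences at 0, yet disjoint subdifferentials")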
The following example illustrates that the assumption in Corollary . that C is closed is crucial.

Example . . Consider the convex set:

    C := {x ∈ [−1, 1]² : x₁ + x₂ > 0}.

Observe that C is not closed, and that the interval hull of C is [−1, 1]². The midpoint of this hull is (0, 0), which is not an element of C.
This section applies Theorem . to describe correct single subgradients for solutions of parametric ordinary differential equations (ODEs) with parameters in R². This approach reduces to the classical ODE sensitivity approach of [ , Section V, Theorem . ] when the original ODE is defined in terms of smooth functions. Unlike existing methods [ ] for generalized derivative evaluation for these systems, the approach of this article describes a subgradient in terms of auxiliary ODE systems that can be integrated numerically using off-the-shelf ODE solvers, but is of course restricted to systems with two parameters. We consider the following setup, which is readily adapted to other ODE representations.

Assumption . . Consider functions f : R^n → R^n, x₀ : R² → R^n, and g : R² × R^n → R that are locally Lipschitz continuous and directionally differentiable. For some scalar t_f > 0, let x : [0, t_f] × R² → R^n be defined so that, for each p ∈ R², x(·, p) solves the following ODE system uniquely:

    ∂x/∂t (t, p) = f(x(t, p)),  x(0, p) = x₀(p).

Define ϕ : R² → R to be the cost function:

    ϕ : p ↦ g(p, x(t_f, p)).

Under this assumption, a subgradient for ϕ may be computed by combining the results of this article with directional derivatives described by [ , Theorem ] as follows. If it is desired for the ODE right-hand side to depend explicitly on t, then an alternative directional derivative result [ , Theorem . ] may be used instead.

Proposition . . Suppose that Assumption . holds, and consider some particular p ∈ R². For each d ∈ R², let y(·, d) denote a solution on [0, t_f] of the following ODE:

    ∂y/∂t (t, d) = f'(x(t, p); y(t, d)),  y(0, d) = x₀'(p; d).     ( . )

Then y(·, d) is in fact the unique solution of this ODE for each d ∈ R². Moreover, if we define:

    s := ½ Σ_{i=1}^{2} (g'((p, x(t_f, p)); (e^{(i)}, y(t_f, e^{(i)}))) − g'((p, x(t_f, p)); (−e^{(i)}, y(t_f, −e^{(i)})))) e^{(i)},

then s is an element of ∂ϕ(p).
Proof. According to [ , Theorem ], y(t, d) is the directional derivative x'((t, p); (0, d)) for each t ∈ [0, t_f] and d ∈ R². The result then follows immediately from Corollary . and the chain rule [ , Theorem . . ].

If lexicographic derivatives are unavailable for the functions in Assumption . or do not exist, then Proposition . is, to our knowledge, the first method for describing a subgradient of ϕ. The following numerical example illustrates this proposition.

Example . . Consider the function x₀ : R² → R³ : p ↦ (p₁, p₂, p₁). For each p ∈ R², let x(·, p) denote the unique solution on [0, 1] of the following parametric ODE system, in which dotted variables denote derivatives with respect to t.
Consider the cost function ϕ : p ↦ x₃(1, p). In this case, for each d ∈ R², the ODE ( . ) becomes:

[Figure : Plot of the function ϕ (top) described in Example . , which appears to dominate the approximation p ↦ ϕ(0) + ⟨s, p⟩ (bottom) based on the computed compass difference s of ϕ at 0.]
with y(0, d) ≡ (d₁, d₂, d₁). In this case ϕ is convex. To evaluate a compass difference of ϕ, the numerical variable-step variable-order ODE solver ode15s was used in MATLAB to evaluate y numerically, using MATLAB's default double precision (on the order of 16 significant digits) for arithmetic, and using tight local absolute and relative tolerances for each integration step. Thus, to within the corresponding computational error, we obtained a numerical value s ≈ ∆⊕ϕ(0), and Proposition . yields ∆⊕ϕ(0) ∈ ∂ϕ(0). Figure shows that s does indeed appear to satisfy ( . ), and does thereby appear to be a subgradient of ϕ at 0 to within numerical precision.
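The same recipe is easy to reproduce with an off-the-shelf solver in other environments. The following Python sketch (our own illustration, using scipy and a hypothetical one-parameter-pair nonsmooth system rather than the system of this example) solves the sensitivity ODE ( . ) for the four compass directions and assembles the compass difference:

    import numpy as np
    from scipy.integrate import solve_ivp

    # Hypothetical instance of the assumed setup:
    #   x1' = -|x1|, x2' = |x1| - x2, x(0) = (p1, p2), phi(p) = x2(1, p).
    TF = 1.0
    P = np.array([0.0, 1.0])  # x1 stays at 0, so the sensitivities are nonsmooth

    def dabs(x, y):
        # Directional derivative of |.| at x in the direction y.
        return abs(y) if x == 0.0 else np.sign(x) * y

    def augmented_rhs(t, xy):
        # Original states (x1, x2) augmented with sensitivities (y1, y2).
        x1, x2, y1, y2 = xy
        return [-abs(x1), abs(x1) - x2, -dabs(x1, y1), dabs(x1, y1) - y2]

    def phi_directional(d):
        # phi'(p; d) = y2(TF, d), from the forward sensitivity ODE.
        sol = solve_ivp(augmented_rhs, (0.0, TF), [P[0], P[1], d[0], d[1]],
                        rtol=1e-10, atol=1e-10)
        return sol.y[3, -1]

    # Compass difference: four sensitivity solves, one per compass direction.
    e1, e2 = np.array([1.0, 0.0]), np.array([0.0, 1.0])
    s = 0.5 * np.array([phi_directional(e1) - phi_directional(-e1),
                        phi_directional(e2) - phi_directional(-e2)])
    print(s)  # roughly [-0.4037, 0.3679], a Clarke subgradient of phi at P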

Optimal-value functions
A well-known result by Danskin [ , Theorem ] describes directional derivatives for certain optimal-value functions, and has been extended to a variety of settings (e.g. [ , ]). The following proposition and its proof are intended to show how any of these results may be combined with Theorem . or Corollary . to describe a subgradient in each case.

Proposition . . Consider a compact set C ⊂ R^n, some open superset Z of C, and a continuously differentiable function f : R² × Z → R. Define an optimal-value function ϕ : R² → R for which ϕ : x ↦ min{f(x, y) : y ∈ C}.
For some particular x̄ ∈ R², define the following:

    Y(x̄) := {y ∈ C : f(x̄, y) = ϕ(x̄)},
    ψ : R² → R : d ↦ min{⟨∇_x f(x̄, y), d⟩ : y ∈ Y(x̄)},
    s := ½ Σ_{i=1}^{2} (ψ(e^{(i)}) − ψ(−e^{(i)})) e^{(i)}.

Then ϕ is locally Lipschitz continuous and directionally differentiable, and s is an element of ∂ϕ(x̄).

Proof. The optimal-value function ϕ has already been established to be locally Lipschitz continuous [ , Theorem . ] and directionally differentiable [ ], with directional derivatives given by ϕ'(x̄; d) = ψ(d) for each d ∈ R². The claimed result then follows immediately from Theorem . .
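As a minimal Python illustration of this proposition (with a hypothetical instance of our own in which the inner problem has two minimizers), Danskin's directional derivative may be evaluated over the active set and then assembled into a compass difference:

    import numpy as np

    # Hypothetical instance: f(x, y) = y*x1 + y^2*x2 over the compact set
    # C = {-1, 1}, so phi(x) = min(x1 + x2, -x1 + x2) = x2 - |x1|.
    C = [-1.0, 1.0]
    f      = lambda x, y: y * x[0] + y**2 * x[1]
    grad_x = lambda x, y: np.array([y, y**2])

    def psi(x_bar, d, tol=1e-12):
        # Danskin: phi'(x_bar; d) = min of <grad_x f(x_bar, y), d> over the
        # minimizers y of the inner problem.
        phi_val = min(f(x_bar, y) for y in C)
        active = [y for y in C if f(x_bar, y) <= phi_val + tol]
        return min(grad_x(x_bar, y) @ d for y in active)

    x_bar = np.zeros(2)
    e1, e2 = np.array([1.0, 0.0]), np.array([0.0, 1.0])
    s = 0.5 * np.array([psi(x_bar, e1) - psi(x_bar, -e1),
                        psi(x_bar, e2) - psi(x_bar, -e2)])
    print(s)  # -> [0., 1.], a subgradient of phi(x) = x2 - |x1| at the origin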
Observe that, unlike several established sensitivity results for optimal-value functions [ , ], the above result does not require second-order sufficient optimality conditions to hold, and does not require unique solutions of the optimization problems defining ϕ. An analogous approach describes subgradients of the Tsoukalas-Mitsos convex relaxations [ ] of composite functions of two variables; the Tsoukalas-Mitsos approach is based entirely on analogous optimal-value functions.
Conclusion

For a bivariate nonsmooth function under minimal assumptions, the compass difference introduced in this article is guaranteed to be a subgradient and may be computed using four calls to a directional-derivative evaluation oracle. This remains true for nonconvex functions, with the "subgradient" understood in this case to be an element of Clarke's generalized gradient. Thus, for such functions, centered finite differences will necessarily converge to a subgradient as the perturbation width tends to zero. The presented examples show that this new relationship between directional derivatives and subgradients may be useful for functions of two variables, and may in some cases provide the only known way to evaluate a subgradient, but does not extend directly to functions of three or more variables. Such a nontrivial extension represents a possible avenue for future work.
References

[ ] C. Audet and W. Hare, Algorithmic construction of the subdifferential from directional derivatives.
[ ] K. A. Khan and P. I. Barton, Generalized derivatives for solutions of parametric ordinary differential equations with non-differentiable right-hand sides, J. Optimiz. Theory App.