Draft:Normal Cone (variational analysis)

In variational analysis, set-valued analysis and optimization, the normal cone to a subset of a space generalizes several classical notions to possibly non-smooth subsets of vector spaces: the orthogonal complement (or annihilator) of a linear subspace, (outward) normal vector fields to surfaces and, more generally, the normal bundle of an embedded submanifold.

Normal cones provide, among other things, the geometrical foundation for generalizing the convex subdifferential to non-convex functions. Of particular note is their role in generalizing Fermat's rule to give necessary (and sometimes also sufficient) first-order optimality conditions for constrained and non-smooth optimization problems. Moreover, they are used to define coderivatives of set-valued maps. ^[1]

In the non-convex case there are several inequivalent definitions of a normal cone, each of which is useful for different problems, whereas in the convex case they all coincide, which greatly simplifies matters. For clarity and approachability this article first discusses the convex case over Hilbert spaces before going into the non-convex case over more general spaces.

Conventions

All vector spaces in this article are assumed to be real. The set ${\bar {\mathbb {R} }}$ denotes the extended real numbers and a function $f:X\to {\bar {\mathbb {R} }}$ is called proper if it is nowhere equal to $-\infty$ and also somewhere finite. We also assume that all cones contain zero. Sums of sets throughout the article are to be interpreted as Minkowski sums.

Convex Case

Definition

The convex normal cone $N_{C}^{\operatorname {conv} }$ to a non-empty convex subset $C\subseteq H$ of a (pre-)Hilbert space is defined by ${\begin{alignedat}{2}N_{C}^{\operatorname {conv} }({\bar {x}}):=&\{v\in H:\langle v,x-{\bar {x}}\rangle \leq 0~{\text{for all}}~x\in C\}\\=&\{v\in H:\sup _{x\in C}\langle v,x-{\bar {x}}\rangle \leq 0\}\end{alignedat}}$ for all ${\bar {x}}\in C$ , and $N_{C}^{\operatorname {conv} }({\bar {x}})=\emptyset$ for all ${\bar {x}}\not \in C$ . ^[2]^[3] Here $\langle \cdot ,\cdot \rangle :H\times H\to \mathbb {R}$ denotes the inner product of $H$ .
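
As an illustration of the definition in the finite-dimensional case $H=\mathbb {R} ^{n}$ , the following minimal Python sketch tests membership in the convex normal cone of a polytope. It assumes (this is an illustrative choice, not taken from the cited sources) that $C$ is the convex hull of finitely many points, so that the supremum in the definition is attained at a vertex and only finitely many inequalities have to be checked. The function names and the tolerance `tol` are hypothetical.

```python
def dot(u, w):
    """Euclidean inner product of two equally long sequences."""
    return sum(a * b for a, b in zip(u, w))


def in_convex_normal_cone(v, x_bar, vertices, tol=1e-12):
    """Return True if <v, p - x_bar> <= 0 for every vertex p.

    Assumes x_bar lies in the convex hull of `vertices` (not checked here).
    """
    return all(
        dot(v, [p_i - x_i for p_i, x_i in zip(p, x_bar)]) <= tol
        for p in vertices
    )


# Unit square [0, 1]^2 as the convex hull of its four corners.
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]

print(in_convex_normal_cone((2.0, 3.0), (1.0, 1.0), square))  # corner
print(in_convex_normal_cone((1.0, 0.0), (1.0, 0.5), square))  # edge point
print(in_convex_normal_cone((0.1, 0.0), (0.5, 0.5), square))  # interior point
```

In this sketch the first two tests succeed, while the third fails because only the zero vector is normal at an interior point.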

Geometric Interpretation

This definition can intuitively be understood as follows: suppose ${\bar {x}},x\in C$ and $v\in H$ , then $x-{\bar {x}}$ is the vector from ${\bar {x}}$ to $x$ . If $\langle x-{\bar {x}},v\rangle =0$ then $x-{\bar {x}}$ is orthogonal to $v$ , while the condition $\langle x-{\bar {x}},v\rangle \leq 0$ additionally allows $v$ to form an angle of more than 90° with $x-{\bar {x}}$ . So the (convex) normal cone at ${\bar {x}}$ is the set of all vectors $v$ that form an angle of at least 90° with every vector pointing from ${\bar {x}}$ to a point of $C$ , translated to the origin.

Examples

When $C$ is a subspace of $H$ , then $N_{C}^{\operatorname {conv} }({\bar {x}})=C^{\bot }$ is the orthogonal complement of $C$ (the linear-algebraic normal space) for every ${\bar {x}}\in C$ . If $C$ is an affine subspace, then $N_{C}^{\operatorname {conv} }({\bar {x}})$ is the orthogonal complement of the vector subspace parallel to $C$ .
Let $C\subseteq H$ be a non-empty, closed, convex set and consider its convex indicator function $\delta _{C}(x)={\begin{cases}0&x\in C\\+\infty &x\in H\setminus C\end{cases}}.$ Then $\partial \delta _{C}=N_{C}^{\operatorname {conv} }.$
For a proper, lower-semicontinuous, convex function $f:H\to {\bar {\mathbb {R} }}$ we have $s\in \partial f({\bar {x}})\iff (s,-1)\in N_{\operatorname {epi} (f)}^{\mathrm {conv} }({\bar {x}},f({\bar {x}}))$ , where $\operatorname {epi} (f)\subseteq H\times \mathbb {R}$ denotes the epigraph of $f$ and $\partial f$ the convex subdifferential of $f$ . This relationship serves to define further subdifferentials based on normal cones in the non-convex case.

Properties

$N_{C}^{\operatorname {conv} }({\bar {x}})$ is a non-empty, convex cone for all ${\bar {x}}\in C$ .
If $\dim(H)<\infty ,{\bar {x}}\in C$ , then ${\bar {x}}\in \operatorname {int} (C)\iff N_{C}^{\operatorname {conv} }({\bar {x}})=\{0\}$ .^[3] The forward direction $(\Rightarrow )$ holds even if $H$ is infinite dimensional, while the backward direction $(\Leftarrow )$ may fail unless $C$ has nonempty interior.^[4]
Let $C\subseteq H$ be non-empty, closed, convex. Then $c=\operatorname {proj} _{C}(x^{*})\iff x^{*}-c\in N_{C}^{\operatorname {conv} }(c)$ , where $\operatorname {proj} _{C}(x^{*})$ is the metric projection of $x^{*}$ onto $C$ . ^[3]
Intersection Rule: $N_{C_{1}\cap C_{2}}^{\operatorname {conv} }({\bar {x}})\supseteq N_{C_{1}}^{\operatorname {conv} }({\bar {x}})+N_{C_{2}}^{\operatorname {conv} }({\bar {x}})$ for any two nonempty, convex subsets $C_{1},C_{2}\subseteq X$ of a topological vector space $X$ and ${\bar {x}}\in C_{1}\cap C_{2}$ . Provided that the qualification condition $\operatorname {int} (C_{1})\cap C_{2}\neq \emptyset$ holds, the reverse inclusion is also true, so that $N_{C_{1}\cap C_{2}}^{\operatorname {conv} }({\bar {x}})=N_{C_{1}}^{\operatorname {conv} }({\bar {x}})+N_{C_{2}}^{\operatorname {conv} }({\bar {x}}).$ The qualification condition may be weakened further, especially if $X$ has more structure. ^[4] For example, if $X$ is Banach and the two convex sets are closed, it is sufficient that $0\in \operatorname {int} (C_{1}-C_{2})$ , while in finite dimensions one can show that $N_{\bigcap _{i}C_{i}}^{\operatorname {conv} }({\bar {x}})=\sum _{i}N_{C_{i}}^{\operatorname {conv} }({\bar {x}})$ for any finite family of convex sets $(C_{i})_{i}$ whose relative interiors intersect.
For further calculus rules cf. the properties of the non-convex generalizations below.
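
The projection characterization listed above can be checked numerically in a simple case. The following sketch assumes $H=\mathbb {R} ^{3}$ and takes $C$ to be a box, for which the metric projection is a componentwise clamp and the normal cone can be described coordinate by coordinate; the box, the point `x_star` and all function names are illustrative assumptions.

```python
def project_onto_box(x, lo, hi):
    """Metric projection onto the box [lo_1, hi_1] x ... x [lo_n, hi_n]."""
    return [min(max(x_i, l_i), h_i) for x_i, l_i, h_i in zip(x, lo, hi)]


def in_box_normal_cone(v, c, lo, hi, tol=1e-12):
    """Membership of v in the convex normal cone of the box at a point c of the box."""
    for v_i, c_i, l_i, h_i in zip(v, c, lo, hi):
        if v_i > tol and abs(c_i - h_i) > tol:
            return False  # positive components are only allowed on an upper face
        if v_i < -tol and abs(c_i - l_i) > tol:
            return False  # negative components are only allowed on a lower face
    return True


lo, hi = [0.0, 0.0, 0.0], [1.0, 1.0, 1.0]
x_star = [2.0, 0.5, -3.0]

# The projection c satisfies x_star - c in N_C(c).
c = project_onto_box(x_star, lo, hi)
residual = [s - c_i for s, c_i in zip(x_star, c)]
print(c, in_box_normal_cone(residual, c, lo, hi))

# A different point of the box fails the normal cone test.
other = [1.0, 0.5, 0.5]
other_residual = [s - o for s, o in zip(x_star, other)]
print(in_box_normal_cone(other_residual, other, lo, hi))
```

The second test fails because `other` is not the projection of `x_star`, in line with the equivalence stated above.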

Another very important property is what is sometimes called Fermat's rule: let $f:H\to \mathbb {R}$ be convex, $C\subseteq H$ nonempty, closed and convex. Then, assuming some very mild constraint qualifications (cf. for example Proposition 27.8 in ^[3]), ${\bar {x}}\in H$ solves the constrained minimization problem $\min _{x\in C}f(x)$ if and only if $0\in \partial f({\bar {x}})+N_{C}^{\operatorname {conv} }({\bar {x}})$ . If $f$ is differentiable this reduces to $-\nabla f({\bar {x}})\in N_{C}^{\operatorname {conv} }({\bar {x}})$ (here $\nabla f$ is the Fréchet gradient of $f$ , i.e. the pointwise Riesz representative of the Fréchet derivative of $f$ ).
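
To illustrate how the differentiable form of Fermat's rule can serve as an optimality test, the sketch below minimizes a convex quadratic over the unit square in $\mathbb {R} ^{2}$ . The objective, the step size and the use of projected gradient iterations are assumptions made for this example only; the stationarity test checks $\langle \nabla f({\bar {x}}),p-{\bar {x}}\rangle \geq 0$ at the four corners $p$ , which for a polytope is equivalent to $-\nabla f({\bar {x}})\in N_{C}^{\operatorname {conv} }({\bar {x}})$ .

```python
def grad_f(x):
    """Gradient of the convex function f(x, y) = (x - 2)^2 + (y - 0.25)^2 + x*y."""
    return (2.0 * (x[0] - 2.0) + x[1], 2.0 * (x[1] - 0.25) + x[0])


def project_unit_square(x):
    """Metric projection onto C = [0, 1]^2."""
    return tuple(min(max(x_i, 0.0), 1.0) for x_i in x)


def is_stationary(x_bar, tol=1e-9):
    """Check -grad_f(x_bar) in N_C(x_bar) via the inequalities at the corners of C."""
    g = grad_f(x_bar)
    corners = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
    return all(
        g[0] * (p[0] - x_bar[0]) + g[1] * (p[1] - x_bar[1]) >= -tol
        for p in corners
    )


# Projected gradient iterations with a fixed step size (an assumed, simple solver).
x = (0.5, 0.5)
for _ in range(100):
    g = grad_f(x)
    x = project_unit_square((x[0] - 0.25 * g[0], x[1] - 0.25 * g[1]))

print(x, is_stationary(x))
print(is_stationary((0.5, 0.5)))
```

In this example the iterates settle at the corner $(1,0)$ , where $-\nabla f=(2,-0.5)$ lies in the normal cone $[0,\infty )\times (-\infty ,0]$ , while the starting point fails the test.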

Generalization

More generally, one may define $N_{C}^{\operatorname {conv} }({\bar {x}}):=\{\varphi \in X^{*}:\langle \varphi ,x-{\bar {x}}\rangle \leq 0~{\text{for all}}~x\in C\}=\{\varphi \in X^{*}:\sup _{x\in C}\langle \varphi ,x-{\bar {x}}\rangle \leq 0\}$ for any topological vector space $X$ , with topological dual $X^{*}$ . In this case $\langle \cdot ,\cdot \rangle :X^{*}\times X\to \mathbb {R} ,(\varphi ,x)\mapsto \varphi (x)$ is the duality pairing of $X$ and $X^{*}.$ In this more general setting the normal cone can be recognized to be the set of all $\varphi \in X^{*}$ that attain their maximum on $C$ at ${\bar {x}}$ . ^[1] This connects normal cones to another important class of objects, the so-called support function $\sigma _{C}$ of a set. One has $\varphi \in N_{C}^{\operatorname {conv} }({\bar {x}})\iff \varphi ({\bar {x}})=\sigma _{C}(\varphi )$ .
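
The link with the support function can be made concrete in finite dimensions. The following sketch assumes $X=\mathbb {R} ^{2}$ (identified with its dual) and a polytope $C$ given by its vertices, so that $\sigma _{C}(\varphi )$ is a maximum over finitely many points; the triangle and the function names are hypothetical choices for illustration.

```python
def pairing(phi, x):
    """Duality pairing <phi, x>, here the Euclidean inner product."""
    return sum(a * b for a, b in zip(phi, x))


def support_function(phi, vertices):
    """sigma_C(phi) for C = conv(vertices); the maximum is attained at a vertex."""
    return max(pairing(phi, p) for p in vertices)


def is_normal_via_support(phi, x_bar, vertices, tol=1e-12):
    """Test phi(x_bar) = sigma_C(phi) for a point x_bar assumed to lie in C."""
    return abs(pairing(phi, x_bar) - support_function(phi, vertices)) <= tol


triangle = [(0.0, 0.0), (2.0, 0.0), (0.0, 1.0)]
midpoint = (1.0, 0.5)  # lies on the edge joining (2, 0) and (0, 1)

print(is_normal_via_support((1.0, 2.0), midpoint, triangle))  # normal to that edge
print(is_normal_via_support((1.0, 1.0), midpoint, triangle))  # not normal there
```

The functional $(1,2)$ attains its maximum over the triangle along the whole edge through `midpoint`, whereas $(1,1)$ attains its maximum only at the vertex $(2,0)$ .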

Non-Convex Case

In the remainder of this article the underlying space is assumed to be Asplund for ease of exposition. While some of the following definitions and properties remain valid in more general spaces (e.g. general normed spaces), this is not true for all of them. The class of Asplund spaces contains many important Banach spaces, for example all $L^{p}$ -spaces for $p\in (1,\infty )$ , but not the cases $p\in \{1,\infty \}.$

Definition

Let $X$ be an Asplund space with topological dual $X^{*}$ and $A\subseteq X$ a (potentially non-convex) nonempty subset of $X$ . There are three main normal cones defined for this case:

The Fréchet (or firm or regular) normal cone $N_{A}^{F}$ is defined by $N_{A}^{F}({\bar {x}}):=\{\varphi \in X^{*}:\limsup _{x\to {\bar {x}},\,x\in A\setminus \{{\bar {x}}\}}{\frac {\langle \varphi ,x-{\bar {x}}\rangle }{\lVert x-{\bar {x}}\rVert }}\leq 0\}$ for any ${\bar {x}}\in \operatorname {cl} A$ , and empty outside of that. The limit superior is taken only over points $x$ of $A$ . Alternatively it may be defined as the polar of the Bouligand tangent cone $T_{A}({\bar {x}})=\{x\in X:\exists (x^{k})\subseteq X,\exists (t_{k})\searrow 0:x^{k}\to x{\text{ and }}{\bar {x}}+t_{k}x^{k}\in A\}.$ ^[1] Note that these definitions generally only coincide if $A$ is closed.
The Mordukhovich (or limiting) normal cone $N_{A}^{M}$ is defined by $N_{A}^{M}({\bar {x}}):=\operatorname {Lim\,sup} _{x\to _{A}{\bar {x}}}N_{A}^{F}(x)=\{\varphi \in X^{*}:\exists (x^{k})\subseteq A,\exists (\varphi ^{k})\subseteq X^{*}:x^{k}\to {\bar {x}},\varphi ^{k}\to \varphi ,\varphi ^{k}\in N_{A}^{F}(x^{k})\}.$ Here $\operatorname {Lim\,sup}$ is an outer limit of sets, where $X$ carries its usual norm topology while $X^{*}$ is endowed with its weak* topology (so $\varphi ^{k}\to \varphi$ is understood in the weak* sense), and the notation $x\to _{A}{\bar {x}}$ means that " $x$ converges to ${\bar {x}}$ along $A$ ", i.e. the sequence of points in the definition of the outer limit is a sequence of points in $A$ .
The Clarke normal cone $N_{A}^{C}$ is defined by $N_{A}^{C}({\bar {x}}):=\operatorname {cl} (\operatorname {conv} (N_{A}^{M}({\bar {x}}))),$ where $\operatorname {cl}$ is the topological closure (taken with respect to the weak* topology on $X^{*}$ ) and $\operatorname {conv}$ the convex hull of a set.

These normal cones form a nested hierarchy $N_{A}^{F}({\bar {x}})\subseteq N_{A}^{M}({\bar {x}})\subseteq N_{A}^{C}({\bar {x}})$ .

Comparison of Fréchet, Mordukhovich and Clarke normal cones for applications to optimization

Suppose, for simplicity, that we have some Fréchet differentiable function $f:X\to \mathbb {R}$ and consider the stationarity condition $-\nabla f({\bar {x}})\in N_{A}({\bar {x}})$ corresponding to each of the three normal cones. If $N_{A}({\bar {x}})$ is "too large" (in the extreme case we may even find that $N_{A}({\bar {x}})=X^{*}$ ), we cannot expect these conditions to tell us anything about the (non-)optimality of ${\bar {x}}$ . Conversely, if $N_{A}({\bar {x}})$ is "too small" (in the extreme case we might find it to be empty or to contain only zero), we will find it difficult or even impossible to verify that $-\nabla f({\bar {x}})\in N_{A}({\bar {x}})$ , or to computationally determine ${\bar {x}}$ such that this holds. Because of this (and differences in their associated calculi) all three cones (as well as further ones) have their place in the theory and applications.

Examples

Let $A=\{(x,y)\in \mathbb {R} ^{2}:xy=0,y\geq 0\}$ , i.e. the union of the $x$ -axis and the non-negative $y$ -axis. Then, identifying $(\mathbb {R} ^{2})^{*}\simeq \mathbb {R} ^{2}$ , we have $N_{A}^{F}(0)=\{0\}\times (-\infty ,0]$ , $N_{A}^{M}(0)=\{(x,y)\in \mathbb {R} ^{2}:xy=0\}$ and $N_{A}^{C}(0)=\mathbb {R} ^{2}.$
Suppose $X\subseteq \mathbb {R} ^{n}$ is a smooth embedded submanifold. Let $p\in X$ and denote by $N_{p}X\subseteq T_{p}\mathbb {R} ^{n}$ the normal space of $X$ at $p$ . Then $N_{p}X\simeq _{\mathbf {Vect} }N_{X}^{F}(p)=N_{X}^{C}(p)=N_{X}^{M}(p)$ .^[5]
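
The first example above can be explored numerically. The sketch below samples points of $A=\{(x,y):xy=0,y\geq 0\}$ near a base point and evaluates the difference quotient from the definition of the Fréchet normal cone. A finite sample can only give a heuristic test of a limit superior; it happens to be reliable here because $A$ is locally a union of straight pieces. The sampling scheme, the radius and the function names are assumptions made for illustration.

```python
import math


def sample_A_near(x_bar, radius, n=50):
    """Points of A = {(x, y): x*y = 0, y >= 0} within `radius` of x_bar, excluding x_bar."""
    points = []
    for k in range(-n, n + 1):
        d = radius * k / n
        # One candidate on the x-axis and one on the y-axis.
        for p in ((x_bar[0] + d, 0.0), (0.0, x_bar[1] + d)):
            dist = math.hypot(p[0] - x_bar[0], p[1] - x_bar[1])
            if p[1] >= 0.0 and 0.0 < dist <= radius:
                points.append(p)
    return points


def looks_frechet_normal(phi, x_bar, radius, tol=1e-9):
    """Heuristic test: all sampled quotients <phi, x - x_bar> / |x - x_bar| are <= tol."""
    quotients = [
        (phi[0] * (p[0] - x_bar[0]) + phi[1] * (p[1] - x_bar[1]))
        / math.hypot(p[0] - x_bar[0], p[1] - x_bar[1])
        for p in sample_A_near(x_bar, radius)
    ]
    return max(quotients) <= tol


t = 1e-3
print(looks_frechet_normal((0.0, -1.0), (0.0, 0.0), t))    # Fréchet normal at the origin
print(looks_frechet_normal((0.0, 1.0), (0.0, 0.0), t))     # not Fréchet normal at the origin
print(looks_frechet_normal((0.0, 1.0), (t, 0.0), t / 2))   # but Fréchet normal at a nearby point
print(looks_frechet_normal((1.0, 0.0), (0.0, t), t / 2))   # likewise for (1, 0)
```

The last two tests indicate why $(0,1)$ and $(1,0)$ belong to the limiting cone $N_{A}^{M}(0)$ although they are not Fréchet normals at the origin. A vector such as $(1,1)$ fails the test at all of these base points and enters only through the convex hull in the Clarke cone.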

Properties

The inclusions $N_{A}^{F}({\bar {x}})\subseteq N_{A}^{M}({\bar {x}})\subseteq N_{A}^{C}({\bar {x}})$ always hold, and each of them may be strict.
$N_{A}^{F}({\bar {x}})$ may be trivial (i.e. equal to $\{0\}$ ), while $N_{A}^{M}({\bar {x}})$ is always nontrivial provided that $\dim(X)<\infty$ , $A$ is closed and ${\bar {x}}\in \operatorname {bd} (A)$ .
Since the Clarke normal cone is always convex it can often end up being "too large" in practice and may degenerate to $N_{A}^{C}({\bar {x}})=X^{*}$ . In contrast to this, the Mordukhovich normal cone can be nonconvex which allows it to remain "small enough to be useful" in some cases where Clarke is too large.
It holds that $N_{A}^{C}({\bar {x}})=((N_{A}^{M}({\bar {x}}))_{\circ })^{\circ }$ where $S_{\circ }\subseteq X$ is the prepolar of a set $S\subseteq X^{*}$ and $T^{\circ }\subseteq X^{*}$ is the polar of a set $T\subseteq X$ so that $N_{A}^{C}({\bar {x}})$ is the bipolar of $N_{A}^{M}({\bar {x}})$ .

If $C\subseteq X$ is a convex set, then $N_{C}^{\operatorname {conv} }=N_{C}^{F}=N_{C}^{M}=N_{C}^{C}$ .
$N_{A}^{M}$ is stable, that is $N_{A}^{M}({\bar {x}})=\operatorname {Lim\,sup} _{x\to _{A}{\bar {x}}}N_{A}^{M}(x).$
Let $X$ be Hilbert, $C\subseteq X$ non-empty and closed, and suppose $c\in \operatorname {proj} _{C}(x^{*})$ . Then $x^{*}-c\in N_{C}^{F}(c)$ . (Note that, in contrast to the convex case, this fails to be an equivalence in general. Moreover, if $X$ is infinite dimensional, there may be no such $c$ .)
Product Rule: $N_{A_{1}\times A_{2}}^{F}({\bar {x}}_{1},{\bar {x}}_{2})=N_{A_{1}}^{F}({\bar {x}}_{1})\times N_{A_{2}}^{F}({\bar {x}}_{2})$ for any two nonempty, closed subsets $A_{1},A_{2}\subseteq X$ , ${\bar {x}}_{1}\in A_{1},{\bar {x}}_{2}\in A_{2}$ of a normed space $X$ . The analogous statement holds for $N_{A_{1}\times A_{2}}^{M}$ . ^[1]
Intersection rule: Suppose $(A_{i})_{i}$ is a finite family of closed subsets of $X$ that is allied at ${\bar {x}}\in \bigcap _{i}A_{i}$ . Then $N_{\bigcap _{i}A_{i}}^{M}({\bar {x}})\subseteq \sum _{i}N_{A_{i}}^{M}({\bar {x}})$ holds. Alliedness is a somewhat lengthy, technical condition that may be found in Penot's book. Alternatively one may assume the so-called fuzzy qualification condition. ^[1]
Chain Rule: Let $F:\mathbb {R} ^{n}\to \mathbb {R} ^{m}$ be a $C^{1}$ -function, and $C=\{x\in A:F(x)\in D\}$ where $A,D$ are closed in $\mathbb {R} ^{n},\mathbb {R} ^{m}$ respectively. Then $N_{C}^{M}({\bar {x}})=N_{A}^{M}({\bar {x}})+F'({\bar {x}})^{*}N_{D}^{M}(F({\bar {x}})),$ subject to some constraint qualifications and regularity conditions. ^[5] Here $F'$ is the Jacobian of $F$ .

References

[1] Penot, Jean-Paul (2013). Calculus Without Derivatives. Graduate Texts in Mathematics. Vol. 266. New York, NY: Springer New York. doi:10.1007/978-1-4614-4538-8. ISBN 978-1-4614-4537-1.
[2] Rockafellar, Ralph Tyrrell (2015). Convex Analysis. Princeton Landmarks in Mathematics and Physics. Princeton: Princeton University Press. ISBN 978-0-691-01586-6.
[3] Bauschke, Heinz H.; Combettes, Patrick L. (2017). Convex Analysis and Monotone Operator Theory in Hilbert Spaces. CMS Books in Mathematics. Cham: Springer International Publishing. doi:10.1007/978-3-319-48311-5. ISBN 978-3-319-48310-8.
[4] Mordukhovich, Boris S.; Mau Nam, Nguyen (2022). Convex Analysis and Beyond: Volume I: Basic Theory. Springer Series in Operations Research and Financial Engineering. Cham: Springer International Publishing. doi:10.1007/978-3-030-94785-9. ISBN 978-3-030-94784-2.
[5] Rockafellar, R. Tyrrell; Wets, Roger J. B. (1998). Variational Analysis. Grundlehren der mathematischen Wissenschaften. Vol. 317. Berlin, Heidelberg: Springer Berlin Heidelberg. doi:10.1007/978-3-642-02431-3. ISBN 978-3-540-62772-2.
