<h1>Continuous submodularity: Non-convex structure with guaranteed optimization</h1>
<p>Yatao’s blog, 2016-12-28, http://blog.yataobian.com/Cont-Submodularity</p>
<p>Recently, we have been working on optimizing a class of non-convex objectives with a celebrated and general structure, called <em>continuous submodularity</em>. Submodularity is well known as a classical structure in combinatorial optimization; it turns out that continuous submodularity is also a common non-convex structure for many continuous objectives, with strong guarantees for both minimization and maximization.</p>
<p>For two very recent papers in this area, one can refer to <a href="https://arxiv.org/abs/1511.00394">the one</a> from Francis Bach for minimization, and <a href="https://arxiv.org/pdf/1606.05615.pdf">the one</a> from us for maximization. This post aims to: i) explain how to recognize submodular continuous-functions; ii) summarize the current results on optimizing submodular continuous-functions; iii) discuss open problems in this new area.</p>
<h2 id="generic-submodular-functions">Generic submodular functions</h2>
<p>To better understand the submodularity of both set-functions and continuous-functions, let us first give a <em>generic</em> view of submodular functions.</p>
<p>The domain of a “generic” submodular function $f: \mathcal{X}\rightarrow \mathbb R$ is the Cartesian product of subsets of $\mathbb{R}$: $\mathcal{X} = \prod_{i=1}^n \mathcal{X}_i$, where each $\mathcal{X}_i$ is a compact subset of $\mathbb R$.
One can define a <em>lattice</em> over $\mathcal{X}$ by taking the “join” operation $\vee$ as the coordinate-wise maximum and the “meet” operation $\wedge$ as the coordinate-wise minimum.</p>
<p>By considering different realizations of $\mathcal{X}_i$, we can recover different submodular functions:</p>
<ul> <li>$\mathcal{X}_i = \{0, 1\}$: submodular set-function;</li> <li>$\mathcal{X}_i = \{0, 1, \ldots, k_i - 1\}$, $k_i&gt;2$, $k_i\in \mathbb Z$: submodular integer-lattice-function;</li> <li>$\mathcal{X}_i = [a, b]$ is an interval: submodular continuous-function.</li> </ul>
<p>The submodularity of all of them can be defined as:</p>
<blockquote> <p><strong>Submodularity and submodular functions:</strong> A function $f$ is submodular if, for all $x, y$ in the domain, it holds that <script type="math/tex">f(x) + f(y) \geq f(x\vee y) + f(x\wedge y)</script>.</p> </blockquote>
<p>It is well known that for set-functions, submodularity is equivalent to the diminishing returns (<strong>DR</strong>) property. However, this equivalence does not hold for generic functions defined over $\mathcal{X}$:</p>
<blockquote> <p><strong>DR property &amp; DR-submodular functions</strong>: Let $\chi_i$ be the $i^\text{th}$ characteristic vector. $f$ satisfies the DR property if $\forall a\leq b\in \mathcal{X}$, for any coordinate $i$, $\forall k\in \mathbb{R}_+$ s.t. $k\chi_i+a$ and $k\chi_i+b$ are still in $\mathcal{X}$, it holds that <script type="math/tex">f(k\chi_i+a) - f(a) \geq f(k\chi_i+b) - f(b)</script>. <br /> Such a function $f$ is called a DR-submodular function.</p> </blockquote>
<p>One immediate observation is that, if $f$ is differentiable, then $\nabla f(a)\geq \nabla f(b)$ for all $a\leq b$, so the gradient of a differentiable DR-submodular function is an <em>antitone</em> mapping.</p>
<p>Both submodular and DR-submodular functions are prevalent in real-world applications. This naturally raises <em>three questions</em>:</p>
<p>Q1.
For generic functions defined over $\mathcal{X}$, submodularity $\neq$ DR; what is the connection between them?</p>
<p>Q2. For the submodularity of generic functions defined over $\mathcal{X}$, is there an equivalent diminishing-returns-style property that characterizes it?</p>
<p>Q3. What can we say about optimizing submodular and DR-submodular continuous-functions?</p>
<p>These questions are answered below.</p>
<h2 id="characterization-of-generic--submodular-functions">Characterization of generic submodular functions</h2>
<p>First of all, we give a positive answer to question Q2 by proposing the <em>weak DR</em> property:</p>
<blockquote> <p><strong>weak DR:</strong> $f$ satisfies the weak DR property if $\forall a\leq b\in \mathcal{X}$, for any coordinate $i\in \{i' \mid a_{i'} = b_{i'} \}$, $\forall k\in \mathbb{R}_+$ s.t. $k\chi_i+a$ and $k\chi_i+b$ are still in $\mathcal{X}$, it holds that <script type="math/tex">f(k\chi_i+a) - f(a) \geq f(k\chi_i+b) - f(b)</script>.</p> </blockquote>
<p>and show that</p>
<blockquote> <p><strong>Lemma</strong>: For a generic function $f$, weak DR $\Leftrightarrow$ submodularity.</p> </blockquote>
<p>For question Q1, it is now clear that DR-submodular functions are a subclass of submodular functions, since the DR property imposes the weak DR inequality on every coordinate rather than only on those where $a$ and $b$ agree. Furthermore, it can be shown that</p>
<blockquote> <p><strong>Lemma</strong>: submodularity + coordinate-wise concavity $\Leftrightarrow$ DR.</p> </blockquote>
<p><img src="/images/cont-submodularity/submodular.png" style="float:left;width:35%" /> The class of submodular continuous-functions contains subsets of both convex and concave functions; see the figure on the left for an illustration. For detailed examples, one can refer to the corresponding sections in the two papers above.</p>
<p>The characterizations of submodular and DR-submodular continuous-functions can be compared with those of convex functions, as summarized in the following tables.
These properties make it very easy to recognize the submodularity of a continuous-function.</p>
<p><img src="/images/cont-submodularity/table1.png" style="size:120%" /> <img src="/images/cont-submodularity/table2.png" alt="Table 2" /></p>
<p>Question Q3 is addressed in the next section.</p>
<h2 id="what-we-can-say-about-optimizing-submodular-continuous-functions-so-far">What can we say about optimizing submodular continuous-functions so far?</h2>
<p>Here I summarize the current results on minimizing and maximizing submodular continuous-functions from the above two papers. It is noteworthy that there are plenty of open problems in this new area.</p>
<ul> <li> <p>Submodular continuous-functions over “box” constraints can be minimized to arbitrary precision in polynomial time using the discretization + continuous extension method in <a href="https://arxiv.org/abs/1511.00394">Bach 2015</a>.</p> </li> <li> <p>Maximizing a monotone DR-submodular continuous-function over general down-closed convex constraints is NP-hard. The submodular Frank-Wolfe algorithm gives a $(1-1/e)$-approximation with a sublinear “convergence” rate (<a href="https://arxiv.org/pdf/1606.05615.pdf">Bian et al. 2016</a>).</p> </li> <li> <p>Maximizing a non-monotone submodular continuous-function over “box” constraints is NP-hard. The generalized DoubleGreedy algorithm gives a $1/3$-approximation (<a href="https://arxiv.org/pdf/1606.05615.pdf">Bian et al. 2016</a>).</p> </li> </ul>
<h2 id="open-problems">Open problems</h2>
<p>Continuous submodularity is a very general structure in the non-convex realm. The characterizations, especially the second-order properties, give a very convenient way to recognize a submodular/DR-submodular objective in real-world applications.
So in terms of <em>new applications</em>, I think there are many more non-convex objectives waiting to be discovered, as happened for submodular set-functions.</p>
<p>In terms of <em>theory</em>, there are lots of interesting open problems. To name a few:</p>
<ul> <li> <p>For minimization, how can the algorithm be made faster and more scalable? How can the gradient information be properly utilized?</p> </li> <li> <p>What can one say about constrained minimization?</p> </li> <li> <p>For maximization, the projected gradient method works well in the experiments; is it possible to prove approximation guarantees for it?</p> </li> <li> <p>For maximizing a non-monotone submodular continuous-function over “box” constraints, can the worst-case guarantee or the hardness results be improved?</p> </li> </ul>
<hr />
<p>Hopefully you will find that the non-convex problem you are working on turns out to be a submodular/DR-submodular one!</p>
<hr />
<h1>I have started using Jekyll!</h1>
<p>Yatao’s blog, 2016-12-18, http://blog.yataobian.com/Hello-World</p>
<p>I have decided to use the fantastic Jekyll.<br /> This website is based on an open source Jekyll theme called <a href="https://github.com/barryclark/jekyll-now">jekyll-now</a>.</p>