The L-Tangent Norm

In this section, we present one of our contributions to range surface fitting. This work was first published in (31). As previously discussed, range surface fitting is commonly achieved by minimizing a compound cost function with two terms: a goodness-of-fit term and a regularization term. The trade-off between these two terms is controlled by the so-called regularization parameter. Many approaches can be used to automatically determine a proper value for this hyperparameter. Some, such as cross-validation, are very generic; we reviewed them in section 3.2.2. Others are more specific in the sense that they were designed for the type of cost function considered in this section, i.e. a mix of a data term and a regularization term. This is the case of the L-curve approach (detailed later in this section). However, none of these methods is fully satisfactory. The methods based on cross-validation generally suffer from their computational complexity. The L-curve is more efficient in terms of computational burden, but the resulting criterion is hard to minimize in the sense that it usually has many local minima. We therefore propose a new criterion to tune the regularization parameter in the context of range surface fitting: the L-Tangent Norm (LTN). Even though it is empirical, the LTN gives sensible results at a much lower computational cost.

This section is organized as follows. Firstly, we give supplemental details on range surface fitting. In particular, we give the details on the bending energy term for the B-spline model. Secondly, we review another criterion for hyperparameter selection: the L-curve criterion. Thirdly, we present our new criterion: the L-Tangent Norm. Finally, we conclude with some experimental results on both synthetic and real data.

Supplementary Details on Range Surface Fitting


The main building blocks of range surface fitting have been presented in the introductory example of section 4.1.3. In this section, we also use the B-spline model to represent the surface. We use the same principle as in the introductory example but write it in a slightly different way:

$\displaystyle \min_{\mathbf{p}} \mathcal{E}_d(\mathbf{p}) + \frac{\lambda}{1 - \lambda} \mathcal{E}_r(\mathbf{p}),$ (4.9)

where $ \mathcal{E}_d$ is the data term that measures the discrepancy between the fitted surface and the data, and $ \mathcal{E}_r$ is the bending energy term that measures how `smooth' the surface is. Besides, the regularization parameter is reparametrized so that it lies within $ [0,1[$ instead of $ [0, \infty[$. If we denote by $ f : \mathbb{R}^2 \times \mathbb{R}^p \rightarrow \mathbb{R}$ the parametric model of surface, then we have that:

$\displaystyle \mathcal{E}_d(\mathbf{p}) =$ $\displaystyle \frac{1}{n} \sum_{i=1}^n \left ( f(\mathbf{q}_i ; \mathbf{p}) - z_i \right )^2,$ (4.10)
$\displaystyle \textrm{and } \mathcal{E}_r(\mathbf{p}) =$ $\displaystyle \iint_\Omega \sum_{d=0}^2 \binom{2}{d} \left ( \frac{\partial^2 f(\mathbf{q} ; \mathbf{p})}{\partial x^{2-d} \partial y^d}\right )^2 \mathrm dx \mathrm dy,$ (4.11)

where  $ \Omega \subset \mathbb{R}^2$ is the definition domain of the range surface. It can be chosen as, for instance, the bounding box of the data points. Since $ f$ is a tensor-product B-spline, it can be written as a linear combination of the control points, i.e. :

$\displaystyle f(\mathbf{q} ; \mathbf{p}) = \mathbf{n}_{\mathbf{q}}^\mathsf{T}\mathbf{p},$ (4.12)

where  $ \mathbf{n}_{\mathbf{q}}$ is the vector defined by (with $ \mathbf{q}^\mathsf{T}= (x \; y)$):

$\displaystyle \mathbf{n}_{\mathbf{q}}^\mathsf{T}= \begin{pmatrix}N_{-3}(x)N_{-3}(y) & \ldots & N_{-3}(x)N_{p}(y) & \ldots & N_{p}(x)N_{p}(y) \end{pmatrix}.$ (4.13)
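As an illustrative sketch (not the thesis' implementation), the row vector $ \mathbf{n}_{\mathbf{q}}$ of equation (4.13) can be assembled from the four active 1D basis values on each axis. The unit knot spacing starting at 0, and the $ y$-major ordering of the Kronecker product, are simplifying assumptions made here:

```python
# Sketch: assemble the tensor-product basis row n_q of equation (4.13) for a
# uniform cubic B-spline with unit knot spacing (a simplifying assumption).
def basis_row_1d(x, n_intervals):
    """Length n_intervals + 3 vector of the cubic B-spline basis at x."""
    j = min(int(x), n_intervals - 1)     # knot interval index, iota(x)
    t = x - j                            # local coordinate, o(x)
    pieces = [(1 - t)**3 / 6.0,
              (3*t**3 - 6*t**2 + 4) / 6.0,
              (-3*t**3 + 3*t**2 + 3*t + 1) / 6.0,
              t**3 / 6.0]
    row = [0.0] * (n_intervals + 3)
    for i, v in enumerate(pieces):       # only four entries are non-zero
        row[j + i] = v
    return row

def basis_row_2d(x, y, mx, my):
    """Kronecker product of the two 1D rows: one weight per control point."""
    rx, ry = basis_row_1d(x, mx), basis_row_1d(y, my)
    return [wy * wx for wy in ry for wx in rx]

row = basis_row_2d(1.7, 0.4, 4, 3)
# The cubic B-spline basis is a partition of unity, so the row sums to 1
assert abs(sum(row) - 1.0) < 1e-12
```

Only 16 of the entries of the row are non-zero, which is what makes the collocation matrix $ \mathsf{N}$ sparse.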

Consequently the data term  $ \mathcal{E}_d$ can be written in matrix form:

$\displaystyle \mathcal{E}_d(\mathbf{p}) = \frac{1}{n} \left \Vert \mathsf{N} \mathbf{p} - \mathbf{z} \right \Vert^2,$ (4.14)

where $ \mathsf{N} \in \mathbb{R}^{n \times p}$ is the matrix such that $ \mathsf{N}^\mathsf{T}= \begin{bmatrix}
\mathbf{n}_{\mathbf{q}_1} & \ldots & \mathbf{n}_{\mathbf{q}_n}
\end{bmatrix}$ and $ \mathbf{z} \in \mathbb{R}^n$ is obtained by stacking the depth of the data points, i.e. $ \mathbf{z}^\mathsf{T}= \begin{pmatrix}
z_1 & \ldots & z_n
\end{pmatrix}$. The bending term can also be written as the squared norm of some vector. One trivial way of doing it would be to approximate the bending energy term of equation (4.11) by discretizing the integral over a regular grid:

$\displaystyle \mathcal{E}_r(\mathbf{p}) \approx \frac{1}{ab} \sum_{i=0}^{a-1} \sum_{j=0}^{b-1} \sum_{d=0}^2 \binom{2}{d} \left ( \frac{\partial^2 f(\frac{i}{a},\frac{j}{b} ; \mathbf{p}) }{\partial x^{2-d} \partial y^d}\right )^2.$ (4.15)

Since the second derivatives of a B-spline are also linear with respect to the control points, the approximation of equation (4.15) is a sum of squared terms linear with respect to the parameters. Even though this approach is effective in practice, there is a better way to write the bending energy term as a squared norm of a vector. Indeed, we showed that it was possible to get an exact formula for the bending energy of the form:

$\displaystyle \mathcal{E}_r(\mathbf{p}) = \left \Vert \mathsf{B} \mathbf{p} \right \Vert^2,$ (4.16)

where $ \mathsf{B} \in \mathbb{R}^{p \times p}$ is a matrix that we call the bending matrix. The details of the computation of the bending matrix are given in the next section. For now, let us just assume that the matrix  $ \mathsf{B}$ exists. Given all the previous elements, the initial minimization problem of equation (4.9) is equivalent to the following linear least-squares minimization problem in matrix form:

$\displaystyle \min_{\mathbf{p}} \left \Vert \begin{bmatrix}\mathsf{N} \\ \sqrt{\frac{n \lambda}{1 - \lambda}} \, \mathsf{B} \end{bmatrix} \mathbf{p} - \begin{bmatrix}\mathbf{z} \\ \mathbf{0} \end{bmatrix} \right \Vert^2.$ (4.17)

The solution to this problem is given by using, for instance, the normal equations. For a given regularization parameter  $ \lambda \in [0,1[$, we get that:

$\displaystyle \mathbf{p}^\star_\lambda = \left ( \mathsf{N}^\mathsf{T}\mathsf{N} + \frac{n \lambda}{1 - \lambda} \mathsf{B}^\mathsf{T}\mathsf{B} \right )^{-1} \mathsf{N}^\mathsf{T}\mathbf{z}.$ (4.18)

The Bending Matrix

In this section, we show how the bending matrix $ \mathsf{B}$ can be derived and practically computed. To the best of our knowledge, the practical computation of the bending matrix cannot be found in the literature. For the sake of simplicity, we first consider the case of a univariate, scalar-valued B-spline. For the same reason, we also restrict this section to uniform cubic B-splines, which is the flavour of B-spline most used in this thesis. Let $ f$ be the B-spline with knot sequence $ k_{-3} < \ldots < k_{n+3}$. We consider the bending energy over the natural definition domain of the B-spline, i.e. :

$\displaystyle b(\mathbf{p}) = \int_{k_0}^{k_n} \left ( \frac{\partial^2 f(x ; \mathbf{p})}{\partial x^2}\right )^2 \mathrm dx.$ (4.19)

The bending energy integral can be split over the knot intervals and, therefore, equation (4.19) can be rewritten as:

$\displaystyle b(\mathbf{p}) = \sum_{j=0}^{n-1} \int_{k_j}^{k_{j+1}} \left ( \frac{\partial^2 f(x ; \mathbf{p})}{\partial x^2}\right )^2 \mathrm dx.$ (4.20)

As we have seen in a previous section, the B-spline $ f$ may be written as:

$\displaystyle f(x;\mathbf{p}) = \sum_{i=0}^3 p_{\iota(x) + i} b_i(o(x)),$ (4.21)

where the four functions $ b_i$ (for $ i \in \llbracket 0,3 \rrbracket $) are the pieces of the B-spline basis functions. The functions $ \iota$ and $ o$ were defined in a previous section. A similar notation is possible for the derivatives of the B-spline:

$\displaystyle \frac{\partial^d f}{\partial x^d} (x ; \mathbf{p}) = \frac{1}{s^d} \sum_{i=0}^3 p_{\iota(x) + i} b_{i,d}(o(x)),$ (4.22)

where $ s$ is the width of a knot interval (i.e. $ s = k_{i+1}-k_i$ for all $ i \in \llbracket -3, n+2 \rrbracket $). The function $ b_{i,d}$ is the $ d$-th order derivative of the function $ b_i$ (for $ i \in \llbracket 0,3 \rrbracket $). In particular, for the second derivatives ($ d=2$), we have that:

$\displaystyle b_{0,2}(x)$ $\displaystyle = -x + 1,$ (4.23)
$\displaystyle b_{1,2}(x)$ $\displaystyle = 3x - 2,$ (4.24)
$\displaystyle b_{2,2}(x)$ $\displaystyle = -3x + 1,$ (4.25)
$\displaystyle b_{3,2}(x)$ $\displaystyle = x.$ (4.26)
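The identities (4.23)-(4.26) can be checked numerically. The sketch below assumes the standard uniform cubic B-spline pieces (with their $ \frac{1}{6}$ factor, as defined in an earlier chapter), differentiates them by finite differences, and also verifies that the four second derivatives sum to zero:

```python
# The four polynomial pieces of the uniform cubic B-spline basis -- assumed
# here to be the standard ones (the thesis defines them in an earlier chapter)
b = [
    lambda t: (1 - t) ** 3 / 6.0,
    lambda t: (3 * t**3 - 6 * t**2 + 4) / 6.0,
    lambda t: (-3 * t**3 + 3 * t**2 + 3 * t + 1) / 6.0,
    lambda t: t**3 / 6.0,
]
# Their second derivatives, equations (4.23)-(4.26)
b2 = [lambda t: -t + 1, lambda t: 3 * t - 2, lambda t: -3 * t + 1, lambda t: t]

def check_pieces(t, h=1e-4):
    for bi, bi2 in zip(b, b2):
        # central second difference; exact for cubics up to rounding error
        fd = (bi(t + h) - 2 * bi(t) + bi(t - h)) / h**2
        assert abs(fd - bi2(t)) < 1e-5
    # the b_{i,2} sum to zero: the bending energy vanishes on affine functions
    assert abs(sum(f(t) for f in b2)) < 1e-10

for t in (0.1, 0.5, 0.9):
    check_pieces(t)
```

The last assertion reflects a useful property of the regularizer: affine surfaces have zero bending energy and are therefore never penalized.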


A formula that will be useful in the computation of the bending matrix is the integration by substitution formula:

$\displaystyle \int_{\phi(a)}^{\phi(b)} f(x) \mathrm d x = \int_a^b f(\phi(t)) \phi'(t) \mathrm dt.$ (4.27)
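In the derivation below, equation (4.27) is used with $ \phi(t) = k_j + s\,t$, which maps each knot interval onto the unit interval. A quick numerical sanity check of this step, with hypothetical values $ k_j = 2$ and $ s = \frac{1}{2}$:

```python
# Numerical sanity check of the substitution used in the derivation below:
# with o(x) = (x - k_j) / s mapping [k_j, k_{j+1}] onto [0, 1],
#   int_{k_j}^{k_{j+1}} g(o(x)) dx  =  s * int_0^1 g(t) dt.
def midpoint_integral(g, a, b, m=100000):
    h = (b - a) / m
    return sum(g(a + (k + 0.5) * h) for k in range(m)) * h

k_j, s = 2.0, 0.5                       # hypothetical knot and interval width
g = lambda t: (3 * t - 2) ** 2          # e.g. a squared basis derivative
lhs = midpoint_integral(lambda x: g((x - k_j) / s), k_j, k_j + s)
rhs = s * midpoint_integral(g, 0.0, 1.0)
assert abs(lhs - rhs) < 1e-9
```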

Actual computation.

We now compute $ e_j(\mathbf{p})$, i.e. the bending energy of the B-spline over the knot interval $ [k_j,k_{j+1}]$:

$\displaystyle e_j(\mathbf{p})$ $\displaystyle = \int_{k_j}^{k_{j+1}} \left( \frac{\partial^2 f}{\partial x^2}(x ; \mathbf{p}) \right)^2 \mathrm dx$ (4.28)
  $\displaystyle = \frac{1}{s^4} \int_{k_j}^{k_{j+1}} \left( \sum_{i=0}^3 p_{j+i} b_{i,2}(o(x)) \right)^2 \mathrm dx$ (4.29)
  $\displaystyle = \frac{1}{s^3} \int_{k_j}^{k_{j+1}} \left( \sum_{i=0}^3 p_{j+i} b_{i,2}(o(x)) \right)^2 \frac{1}{s}\mathrm dx$ (4.30)
  $\displaystyle = \frac{1}{s^3} \int_{o(k_j)}^{o(k_{j+1})} \left( \sum_{i=0}^3 p_{j+i} b_{i,2}(x) \right)^2 \mathrm dx$ (4.31)
  $\displaystyle = \frac{1}{s^3} \int_0^1 \left( \sum_{i=0}^3 p_{j+i} b_{i,2}(x) \right)^2 \mathrm dx$ (4.32)
  $\displaystyle = \frac{1}{s^3} \int_0^1 \left( \begin{pmatrix}b_{0,2}(x) & b_{1,2}(x) & b_{2,2}(x) & b_{3,2}(x) \end{pmatrix} \underbrace{\begin{pmatrix}p_j \\ p_{j+1} \\ p_{j+2} \\ p_{j+3} \end{pmatrix}}_{\bar{\mathbf{p}}} \right)^2 \mathrm dx$ (4.33)
  $\displaystyle = \frac{1}{s^3} \int_0^1 \left( \begin{pmatrix}x & \frac{1}{3} \end{pmatrix} \underbrace{\begin{pmatrix}-1 & 3 & -3 & 1 \\ 3 & -6 & 3 & 0 \end{pmatrix}}_{\mathsf{M}_2} \bar{\mathbf{p}} \right)^2 \mathrm dx$ (4.34)
  $\displaystyle = \frac{1}{s^3} \int_0^1 \left( \begin{pmatrix}x & \frac{1}{3} \end{pmatrix} \mathsf{M}_2 \bar{\mathbf{p}} \right)^2 \mathrm dx$ (4.35)
  $\displaystyle = \frac{1}{s^3} \int_0^1 \bar{\mathbf{p}}^\mathsf{T} \mathsf{M}_2^\mathsf{T} \begin{pmatrix}x \\ \frac{1}{3} \end{pmatrix} \begin{pmatrix}x & \frac{1}{3} \end{pmatrix} \mathsf{M}_2 \bar{\mathbf{p}} \, \mathrm dx$ (4.36)
  $\displaystyle = \frac{1}{s^3} \int_0^1 \bar{\mathbf{p}}^\mathsf{T} \mathsf{M}_2^\mathsf{T} \begin{pmatrix}x^2 & \frac{x}{3} \\ \frac{x}{3} & \frac{1}{9} \end{pmatrix} \mathsf{M}_2 \bar{\mathbf{p}} \, \mathrm dx$ (4.37)
  $\displaystyle = \frac{1}{s^3} \bar{\mathbf{p}}^\mathsf{T} \mathsf{M}_2^\mathsf{T} \left( \int_0^1 \begin{pmatrix}x^2 & \frac{x}{3} \\ \frac{x}{3} & \frac{1}{9} \end{pmatrix} \mathrm dx \right) \mathsf{M}_2 \bar{\mathbf{p}}$ (4.38)
  $\displaystyle = \frac{1}{s^3} \bar{\mathbf{p}}^\mathsf{T} \mathsf{M}_2^\mathsf{T} \left[ \begin{pmatrix}x^3/3 & x^2/6 \\ x^2/6 & x/9 \end{pmatrix} \right]_0^1 \mathsf{M}_2 \bar{\mathbf{p}}$ (4.39)
  $\displaystyle = \frac{1}{s^3} \bar{\mathbf{p}}^\mathsf{T} \mathsf{M}_2^\mathsf{T} \begin{pmatrix}1/3 & 1/6 \\ 1/6 & 1/9 \end{pmatrix} \mathsf{M}_2 \bar{\mathbf{p}}$ (4.40)
  $\displaystyle = \frac{1}{s^3} \bar{\mathbf{p}}^\mathsf{T} \frac{1}{6} \begin{pmatrix}2 & -3 & 0 & 1 \\ -3 & 6 & -3 & 0 \\ 0 & -3 & 6 & -3 \\ 1 & 0 & -3 & 2 \end{pmatrix} \bar{\mathbf{p}}$ (4.41)
  $\displaystyle = \frac{1}{s^3} \bar{\mathbf{p}}^\mathsf{T} \bar{\mathsf{B}} \bar{\mathbf{p}}, \qquad \textrm{with } \bar{\mathsf{B}} = \frac{1}{6} \begin{pmatrix}2 & -3 & 0 & 1 \\ -3 & 6 & -3 & 0 \\ 0 & -3 & 6 & -3 \\ 1 & 0 & -3 & 2 \end{pmatrix}$ (4.42)
  $\displaystyle = \frac{1}{s^3} \mathbf{p}^\mathsf T \mathsf{B}_j \mathbf{p},$ (4.43)

where $ \mathsf{B}_j$ is the matrix defined as:

$\displaystyle \mathsf{B}_j= \begin{pmatrix}\mathbf 0_{j\times j} & \mathbf 0_{j\times 4} & \mathbf 0_{j\times (n-j-1)} \\ \mathbf 0_{4\times j} & \bar{\mathsf{B}} & \mathbf 0_{4\times (n-j-1)} \\ \mathbf 0_{(n-j-1)\times j} & \mathbf 0_{(n-j-1)\times 4} & \mathbf 0_{(n-j-1)\times (n-j-1)} \end{pmatrix},$ (4.44)

i.e. the matrix of $ \mathbb{R}^{(n+3) \times (n+3)}$ obtained by embedding the block $ \bar{\mathsf{B}}$ at rows and columns $ j+1$ to $ j+4$ of a zero matrix.

The bending matrix for the bending energy over the entire natural definition domain is obtained by summing the bending matrices of the single knot intervals. Therefore, if we define the bending matrix as:

$\displaystyle \mathsf{B} = \frac{1}{s^3} \sum_{i=0}^{n-1} \mathsf{B}_i,$ (4.45)

then the bending energy is given by:

$\displaystyle b(\mathbf{p}) = \mathbf{p}^\mathsf{T}\mathsf{B} \mathbf{p} = \left \Vert \mathsf{B}^{1/2} \mathbf{p} \right \Vert^2.$ (4.46)

Note that in practice, one generally does not need to explicitly compute a square root of the matrix $ \mathsf{B}$. This stems from the fact that it is usually the form $ \mathbf{p}^\mathsf{T}\mathsf{B} \mathbf{p}$ which is used in practical computations.

The bending matrix is interesting for several reasons. Of course, it is nice to have an exact formula for the bending energy that can easily be integrated into a linear least-squares optimization problem. Besides, the bending matrix as defined in equation (4.45) is sparse: $ \mathsf{B}$ is a heptadiagonal symmetric matrix. This sparsity structure allows one to use efficient optimization algorithms. The last interesting property of this bending matrix is that it is usually much smaller than the matrix that would arise from the approximation of equation (4.15). Indeed, $ \mathsf{B}$ is a matrix of $ \mathbb{R}^{(n+3) \times (n+3)}$. With the approximate formula, the resulting matrix usually has many more rows (it depends on the fineness of the grid used for the discretization of the integral).
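The univariate construction can be summarized in a short sketch (pure Python, with unit knot spacing as a simplifying assumption): the elementary block $ \bar{\mathsf{B}}$ of equation (4.42) is accumulated along the diagonal, and the result is validated against a direct numerical integration of the bending energy (4.19):

```python
# Assemble the univariate bending matrix B of equation (4.45) from the
# elementary 4x4 block of equation (4.42), then verify it against a direct
# numerical integration of the bending energy (4.19).  Illustrative sketch;
# knots are 0, 1, ..., n (unit spacing, s = 1, by assumption).

B_BAR = [[ 2, -3,  0,  1],
         [-3,  6, -3,  0],
         [ 0, -3,  6, -3],
         [ 1,  0, -3,  2]]   # to be scaled by 1/6

def bending_matrix(n, s=1.0):
    """(n+3)x(n+3) matrix B with p^T B p = bending energy over [k_0, k_n]."""
    size = n + 3
    B = [[0.0] * size for _ in range(size)]
    for j in range(n):                    # one elementary block per interval
        for a in range(4):
            for c in range(4):
                B[j + a][j + c] += B_BAR[a][c] / (6.0 * s**3)
    return B

def energy_exact(p, n, s=1.0):
    B = bending_matrix(n, s)
    return sum(p[a] * B[a][c] * p[c]
               for a in range(n + 3) for c in range(n + 3))

def energy_numeric(p, n, s=1.0, m=8000):
    # f''(x) = (1/s^2) sum_i p_{j+i} b_{i,2}(o(x)) on interval j, eq. (4.22)
    b2 = [lambda t: -t + 1, lambda t: 3*t - 2, lambda t: -3*t + 1, lambda t: t]
    total = 0.0
    for j in range(n):
        for k in range(m):                # midpoint rule on [k_j, k_{j+1}]
            t = (k + 0.5) / m
            d2 = sum(p[j + i] * b2[i](t) for i in range(4)) / s**2
            total += d2 * d2 * (s / m)
    return total

n = 5
p = [0.3, -1.2, 2.0, 0.7, -0.5, 1.1, 0.0, 0.9]   # n + 3 control points
assert abs(energy_exact(p, n) - energy_numeric(p, n)) < 1e-5
```

The assembled matrix is symmetric and heptadiagonal, as stated above, which is easy to confirm on the output of `bending_matrix`.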

The bivariate case.

We now derive an exact formula for the bending energy term of a bivariate B-spline. This derivation follows the same principle as the one used in the univariate case. Let $ f$ be the uniform cubic tensor-product bivariate B-spline with knots $ k_{-3}^x < \ldots < k_{m+3}^x$ along the $ x$ direction and $ k_{-3}^y < \ldots < k_{n+3}^y$ along the $ y$ direction. The bending energy, denoted $ e$, is defined by:

$\displaystyle e(\mathbf{p}) = \int_{k_0^y}^{k_n^y} \int_{k_0^x}^{k_m^x} \left ( \frac{\partial^2 f}{\partial x^2}(x,y ; \mathbf{p}) \right )^2 + 2 \left ( \frac{\partial^2 f}{\partial x \partial y}(x,y ; \mathbf{p}) \right )^2 + \left ( \frac{\partial^2 f}{\partial y^2} (x,y ; \mathbf{p}) \right )^2 \mathrm dx \mathrm dy.$ (4.47)

As in the univariate case, the bending energy term over the entire natural definition domain can be split on each knot domain:

$\displaystyle e(\mathbf{p}) = \sum_{b=0}^{n-1} \sum_{a=0}^{m-1} e_{a,b}(\mathbf{p}),$ (4.48)

where $ e_{a,b}(\mathbf{p})$ is the bending energy of the B-spline over the knot domain $ [k_a^x,k_{a+1}^x] \times [k_b^y,k_{b+1}^y]$. The bending energy $ e_{a,b}(\mathbf{p})$ can be further split into three terms:

$\displaystyle e_{a,b}(\mathbf{p}) = \int_{k_b^y}^{k_{b+1}^y} \! \int_{k_a^x}^{k_{a+1}^x} \left( \frac{\partial^2 f}{\partial x^2}(x,y ; \mathbf{p}) \right)^2 \mathrm dx \mathrm dy + 2 \int_{k_b^y}^{k_{b+1}^y} \! \int_{k_a^x}^{k_{a+1}^x} \left( \frac{\partial^2 f}{\partial x \partial y}(x,y ; \mathbf{p}) \right)^2 \mathrm dx \mathrm dy + \int_{k_b^y}^{k_{b+1}^y} \! \int_{k_a^x}^{k_{a+1}^x} \left( \frac{\partial^2 f}{\partial y^2}(x,y ; \mathbf{p}) \right)^2 \mathrm dx \mathrm dy.$

We give the details of the computation only for the first of these three terms since the other two are computed in an almost identical way. But before that, we define a few additional notations. As explained in a previous section, the bivariate B-spline can be written as:

$\displaystyle f(x,y ; \mathbf{p}) = \left ( \mathbf{r}_{o(y)}^\mathsf{T}\otimes \mathbf{r}_{o(x)}^\mathsf{T}\right ) (\mathsf{M}_G \otimes \mathsf{M}_G) \mathrm{vect}(\bar{\mathsf{P}}),$ (4.49)

where $ \bar{\mathsf{P}} = \mathsf{P}_{\iota(x):\iota(x)+3, \iota(y):\iota(y)+3}$, $ \mathsf{M}_G$ is the geometric matrix (defined in equation (2.99)), and $ \mathbf{r}_{o(x)}$ is the vector of $ \mathbb{R}^4$ defined as a function of the free variable $ x$ by:

$\displaystyle \mathbf{r}_{o(x)}^\mathsf{T}= \begin{pmatrix}o(x)^3 & o(x)^2 & o(x) & 1 \end{pmatrix}.$ (4.50)

The partial derivatives of the bi-variate spline can be written in a similar way:

$\displaystyle \frac{\partial^{c+d} f}{\partial x^c \partial y^d}(x,y ; \mathbf{p}) = \left ( \mathbf{r}_{o(y),d}^\mathsf{T}\otimes \mathbf{r}_{o(x),c}^\mathsf{T}\right ) \left ( \mathsf{M}_G \otimes \mathsf{M}_G \right ) \mathrm{vect}(\bar{\mathsf{P}}),$ (4.51)

where $ \mathbf{r}_{o(x),c}$ is the $ c$-th order derivative of the vector $ \mathbf{r}_{o(x)}$ with respect to $ x$. Along the $ x$ direction, these vectors are given by:

$\displaystyle \mathbf{r}_{o(x),0}^\mathsf{T}$ $\displaystyle = \mathbf{r}_{o(x)}^\mathsf{T},$ (4.52)
$\displaystyle \mathbf{r}_{o(x),1}^\mathsf{T}$ $\displaystyle = \frac{1}{s_x} \begin{pmatrix}3 o(x)^2 & 2 o(x) & 1 & 0 \end{pmatrix},$ (4.53)
$\displaystyle \mathbf{r}_{o(x),2}^\mathsf{T}$ $\displaystyle = \frac{1}{s_x^2} \begin{pmatrix}6 o(x) & 2 & 0 & 0 \end{pmatrix},$ (4.54)
$\displaystyle \mathbf{r}_{o(x),3}^\mathsf{T}$ $\displaystyle = \frac{1}{s_x^3} \begin{pmatrix}6 & 0 & 0 & 0 \end{pmatrix},$ (4.55)

where $ s_x$ is the width of a knot interval along the $ x$ direction (i.e. $ s_x = k_{i+1}^x - k_i^x$ for all $ i \in \llbracket -3, m+2 \rrbracket $). These formulas are identical for the $ y$ direction except that $ s_x$ is replaced by $ s_y$, the width of a knot interval along the $ y$ direction.
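Equation (4.49) can be checked numerically by comparing the Kronecker form against the direct double sum over the basis pieces. The sketch below works on one knot domain with unit spacing, and assumes that $ \mathsf{M}_G$ is the standard cubic geometric matrix (an assumption about equation (2.99)):

```python
# Sketch of equation (4.49) on one knot domain with unit spacing: the value
# obtained via the Kronecker form equals the direct tensor-product sum.
# M_G is assumed to be the standard cubic geometric matrix.
M_G = [[-1/6,  1/2, -1/2, 1/6],
       [ 1/2, -1.0,  1/2, 0.0],
       [-1/2,  0.0,  1/2, 0.0],
       [ 1/6,  2/3,  1/6, 0.0]]

b = [lambda t: (1 - t)**3 / 6.0,
     lambda t: (3*t**3 - 6*t**2 + 4) / 6.0,
     lambda t: (-3*t**3 + 3*t**2 + 3*t + 1) / 6.0,
     lambda t: t**3 / 6.0]

def kron_eval(P, x, y):
    """(r_y^T kron r_x^T)(M_G kron M_G) vect(P), with the Kronecker product
    collapsed algebraically to (r_x^T M_G) P (M_G^T r_y)."""
    rMx = [sum((x**3, x**2, x, 1.0)[k] * M_G[k][i] for k in range(4))
           for i in range(4)]       # r_x^T M_G: the four basis values b_i(x)
    rMy = [sum((y**3, y**2, y, 1.0)[k] * M_G[k][j] for k in range(4))
           for j in range(4)]
    return sum(rMx[i] * P[i][j] * rMy[j] for i in range(4) for j in range(4))

def direct_eval(P, x, y):
    return sum(P[i][j] * b[i](x) * b[j](y) for i in range(4) for j in range(4))

P = [[0.5, -1.0, 2.0, 0.0],
     [1.5,  0.3, -0.7, 1.0],
     [-0.2, 0.8,  0.1, 0.4],
     [0.9, -0.5,  0.6, -1.1]]
assert abs(kron_eval(P, 0.3, 0.7) - direct_eval(P, 0.3, 0.7)) < 1e-12
```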

Actual computation.

First term of $ e_{a,b}(\mathbf p)$.

  $\displaystyle \int_{k_b^y}^{k_{b+1}^y} \! \int_{k_a^x}^{k_{a+1}^x} \left( \frac{\partial^2 f}{\partial x^2}(x,y ; \mathbf{p}) \right)^2 \mathrm dx \mathrm dy$    
  $\displaystyle = \int_{k_b^y}^{k_{b+1}^y} \! \int_{k_a^x}^{k_{a+1}^x} \left( \sum_{j=0}^3 \sum_{i=0}^3 p_{a+i,b+j} \frac{1}{s_x^2} b_{i,2}(o(x)) b_j(o(y)) \right)^2 \mathrm dx \mathrm dy$    
  $\displaystyle = \frac{1}{s_x^3} \int_{k_b^y}^{k_{b+1}^y} \! \int_{k_a^x}^{k_{a+1}^x} \left( \sum_{j=0}^3 \sum_{i=0}^3 p_{a+i,b+j} b_{i,2}(o(x)) b_j(o(y)) \right)^2 \frac{1}{s_x} \mathrm dx \mathrm dy$    
  $\displaystyle = \frac{1}{s_x^3} \int_{k_b^y}^{k_{b+1}^y} \int_0^1 \left( \sum_{j=0}^3 \sum_{i=0}^3 p_{a+i,b+j} b_{i,2}(x) b_j(o(y)) \right)^2 \mathrm dx \, \frac{s_y}{s_y} \mathrm dy$    
  $\displaystyle = \frac{s_y}{s_x^3} \int_0^1 \! \int_0^1 \left( \sum_{j=0}^3 \sum_{i=0}^3 p_{a+i,b+j} b_{i,2}(x) b_j(y) \right)^2 \mathrm dx \mathrm dy$    
  $\displaystyle = \frac{s_y}{s_x^3} \int_0^1 \! \int_0^1 \left( \left( \mathbf{r}_y^\mathsf{T} \otimes \mathbf{r}_{x,2}^\mathsf{T} \right) \mathsf{M} \mathbf{p} \right)^2 \mathrm dx \mathrm dy$    
  $\displaystyle = \frac{s_y}{s_x^3} \mathbf{p}^\mathsf{T}\mathsf{M}^\mathsf{T} \left( \int_0^1 \! \int_0^1 \left( \mathbf{r}_y \otimes \mathbf{r}_{x,2} \right) \left( \mathbf{r}_y^\mathsf{T} \otimes \mathbf{r}_{x,2}^\mathsf{T} \right) \mathrm dx \mathrm dy \right) \mathsf{M} \mathbf{p}$    
  $\displaystyle = \frac{s_y}{s_x^3} \mathbf{p}^\mathsf{T}\mathsf{M}^\mathsf{T} \left( \left( \int_0^1 \mathbf{r}_y \mathbf{r}_y^\mathsf{T} \mathrm dy \right) \otimes \left( \int_0^1 \mathbf{r}_{x,2} \mathbf{r}_{x,2}^\mathsf{T} \mathrm dx \right) \right) \mathsf{M} \mathbf{p}$    
  $\displaystyle = \frac{s_y}{s_x^3} \mathbf{p}^\mathsf{T}\mathsf{M}^\mathsf{T} \left( \left( \int_0^1 \mathbf{r}_y \mathbf{r}_y^\mathsf{T} \mathrm dy \right) \otimes \begin{pmatrix}12 & 6 & 0 & 0 \\ 6 & 4 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} \right) \mathsf{M} \mathbf{p}$    
  $\displaystyle = \frac{s_y}{s_x^3} \mathbf{p}^\mathsf{T}\mathsf{M}^\mathsf{T}\mathsf{B}_{xx} \mathsf{M} \mathbf{p},$    

where, to lighten the notation, we write $ \mathsf{M} = \mathsf{M}_G \otimes \mathsf{M}_G$, $ \mathbf{p} = \mathrm{vect}(\bar{\mathsf{P}})$, $ \mathbf{r}_y^\mathsf{T}= \begin{pmatrix}y^3 & y^2 & y & 1 \end{pmatrix}$ and $ \mathbf{r}_{x,2}^\mathsf{T}= \begin{pmatrix}6x & 2 & 0 & 0 \end{pmatrix}$, i.e. the vectors of equations (4.50) and (4.54) restricted to the unit interval, the factors $ \frac{1}{s_x^2}$ having been pulled out of the integral. The Kronecker products are separated with the identity $ (\mathsf{A} \otimes \mathsf{B})(\mathsf{C} \otimes \mathsf{D}) = (\mathsf{A}\mathsf{C}) \otimes (\mathsf{B}\mathsf{D})$, and $ \mathsf{B}_{xx}$ denotes the resulting $ 16 \times 16$ matrix $ \left( \int_0^1 \mathbf{r}_y \mathbf{r}_y^\mathsf{T} \mathrm dy \right) \otimes \left( \int_0^1 \mathbf{r}_{x,2} \mathbf{r}_{x,2}^\mathsf{T} \mathrm dx \right)$.

Second and third terms of $ e_{a,b}(\mathbf p)$.
The second and the third terms of $ e_{a,b}(\mathbf{p})$ are computed in a similar way as the first term. We obtain the following formulae:

$\displaystyle \textrm{Second term:}\quad$ $\displaystyle \frac{1}{s_x s_y} \mathbf{p}^\mathsf{T}\mathsf{M}^\mathsf{T}\mathsf{B}_{xy} \mathsf{M} \mathbf{p}$ (4.56)
$\displaystyle \textrm{Third term:}\quad$ $\displaystyle \frac{s_x}{s_y^3} \mathbf{p}^\mathsf{T}\mathsf{M}^\mathsf{T}\mathsf{B}_{yy} \mathsf{M} \mathbf{p}$ (4.57)

Bending energy over a single knot domain.

Consequently, the exact formula for the bending energy over the knot domain $ [k_a^x,k_{a+1}^x] \times [k_b^y,k_{b+1}^y]$ is as follows, where the $ 16 \times 16$ matrix sandwiched between $ \mathbf{p}^\mathsf{T}$ and $ \mathbf{p}$ is implicitly embedded into a full-size matrix $ \mathsf{B}_{a,b}$, in the same way as $ \bar{\mathsf{B}}$ was embedded into $ \mathsf{B}_j$ in equation (4.44):

$\displaystyle e_{a,b}(\mathbf{p})$ $\displaystyle = \mathbf{p}^\mathsf{T}\mathsf{M}^\mathsf{T}\left ( \frac{s_y}{s_x^3} \mathsf{B}_{xx} + \frac{2}{s_x s_y} \mathsf{B}_{xy} + \frac{s_x}{s_y^3} \mathsf{B}_{yy} \right ) \mathsf{M} \mathbf{p}$ (4.58)
  $\displaystyle = \mathbf{p}^\mathsf{T}\mathsf{B}_{a,b} \mathbf{p}.$ (4.59)

Bending energy over the entire natural definition domain.

Finally, the bending energy for the entire natural definition domain is:

$\displaystyle e(\mathbf{p})$ $\displaystyle = \sum_{b=0}^{n-1} \sum_{a=0}^{m-1} e_{a,b}(\mathbf{p})$ (4.60)
  $\displaystyle = \mathbf{p}^\mathsf{T}\left ( \sum_{b=0}^{n-1} \sum_{a=0}^{m-1} \mathsf{B}_{a,b} \right ) \mathbf{p}$ (4.61)
  $\displaystyle = \mathbf{p}^\mathsf{T}\mathsf{B} \mathbf{p}$ (4.62)

In the bivariate case, the bending matrix  $ \mathsf{B}$ has properties similar to those of the univariate bending matrix. In particular, the matrix  $ \mathsf{B}$ is still sparse, although its sparsity pattern is a bit more complicated: it is made of 7 diagonals of blocks, each of which is a heptadiagonal matrix. This is illustrated in figure 4.12.

Figure 4.12: Sparsity structure of the bending matrix of bivariate B-splines. Note that all the blocks are heptadiagonal matrices but they are not all identical to each other. The entire bending matrix is symmetric, as are the individual heptadiagonal blocks.
Image bendmat-structure

The L-Curve Criterion

The L-curve criterion is a criterion used to automatically determine a regularization parameter in an inverse problem. It was first introduced in (109). This criterion is not as generic as the criteria reviewed in section 3.2.2. Indeed, it is specifically designed to determine only one type of hyperparameter (a regularization parameter) in the special context of a least-squares cost function. The interested reader may find an extensive review of the L-curve criterion in, for instance, (93,124,92,157).

The underlying idea of the L-curve criterion is to find a compromise between underfitting and overfitting. We remind the reader that $ \mathbf{p}^\star_\lambda$ is the vector of control points solution to problem (4.17) for a given regularization parameter $ \lambda$. Let $ \rho(\lambda) = \Vert \mathsf{N} \mathbf{p}_\lambda^\star - \mathbf{z} \Vert^2$ be the residual norm and let $ \eta(\lambda) = \Vert \mathsf{B}^{1/2} \mathbf{p}_\lambda^\star \Vert^2$ be the solution norm. In an underfitting situation, the solution norm is expected to be small while the residual norm is likely to be large. On the contrary, in an overfitting situation, the solution norm is probably large and the residual norm is expected to be small. To find a compromise between these two pathological situations, the principle of the L-curve criterion is to plot the residual norm and the solution norm against each other. More precisely, it is the logarithms of the residual norm and of the solution norm that are plotted against each other, which makes the criterion invariant to scale. The resulting curve is called the L-curve. If we denote by $ \hat{\rho}$ and $ \hat{\eta}$ the two functions such that $ \hat{\rho}(\lambda) = \log(\rho(\lambda))$ and $ \hat{\eta}(\lambda) = \log(\eta(\lambda))$, then the L-curve is the set:

$\displaystyle \left \{ ( \hat{\rho}(\lambda), \hat{\eta}(\lambda) ) \in \mathbb{R}^2 \; \vert \; \lambda \in [0,1[ \right \}.$ (4.63)

The name L-curve comes from the usual shape of this curve: it often resembles the letter L. The extremities of the L correspond to the underfitting and overfitting situations. The compromise selected with the L-curve criterion corresponds to the corner of the L. This corner has often been selected manually by actually plotting the L-curve. The approach can be made automatic by defining the corner of the L as the point of the curve which has the highest curvature. Let $ \kappa$ be the curvature of the L-curve:

$\displaystyle \kappa(\lambda) = 2 \frac{\hat{\rho}' \hat{\eta}''- \hat{\rho}''\hat{\eta}'}{\left( \hat{\rho}'^2 + \hat{\eta}'^2 \right)^{3/2}},$ (4.64)

where the symbols $ '$ and $ ''$ denote respectively the first and second derivatives with respect to $ \lambda$ of the functions to which they are applied. The L-curve criterion is thus defined as:

$\displaystyle \lambda^* = \arg \max_{\lambda \in [0,1[} \kappa(\lambda).$ (4.65)

An example of L-curve is shown in figure 4.13, along with the typical aspect of the L-curve criterion. In this example, the L-curve actually has the shape of the letter L. In this case, the L-curve criterion is well defined in the sense that the maximum of the curvature indeed corresponds to a good regularization parameter.

Figure 4.13: (a) An example of L-curve that has the typical shape of the letter L. The extremities of the L corresponds to the worst cases: underfitting and overfitting. (b) Curvature of the L-curve shown in (a). The highest curvature corresponds to the regularization parameter determined with the L-curve criterion.
Image nice-lcurve Image nice-lcurvature
(a) (b)
However, selecting a regularization parameter with the L-curve criterion is not always that simple. Indeed, the L-curve criterion is often not well defined in the sense that the curvature has many local maxima. This is a problem if we want a fully automatic approach to choosing the regularization parameter. Besides, when the values of the maxima are close to each other, it is not always clear which local maximum is the best (it is not necessarily the highest one). This pathological situation is illustrated in figure 4.14.
Figure 4.14: An example of pathological case for the L-curve criterion.
Image patho-lcurve Image patho-lcurvature
(a) (b)
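To make the criterion concrete, the following sketch evaluates (4.64)-(4.65) on a hypothetical diagonal problem ($ \mathsf{N} = \mathrm{diag}(\sigma)$, $ \mathsf{B} = \mathsf{I}$), for which $ \rho(\lambda)$ and $ \eta(\lambda)$ have closed forms; the values of $ \sigma$ and $ \mathbf{z}$ are made up for illustration and the derivatives are approximated by finite differences:

```python
import math

# Toy diagonal problem: N = diag(sigma), B = I, so with mu = lam/(1-lam)
# the regularized solution has components p_i = sigma_i z_i / (sigma_i^2 + mu).
# sigma and z are hypothetical values, chosen only for illustration.
sigma = [10.0, 5.0, 1.0, 0.5, 0.1, 0.05, 0.01]
z     = [ 4.0, 3.0, 2.0, 1.0, 0.01, 0.005, 0.002]

def rho_eta(lam):
    mu = lam / (1.0 - lam)
    rho = sum((mu * zi / (si**2 + mu))**2 for si, zi in zip(sigma, z))
    eta = sum((si * zi / (si**2 + mu))**2 for si, zi in zip(sigma, z))
    return rho, eta

def curvature(lam, h=1e-5):
    # finite-difference derivatives of log(rho) and log(eta) w.r.t. lambda
    def logs(l):
        r, e = rho_eta(l)
        return math.log(r), math.log(e)
    (rm, em), (r0, e0), (rp, ep) = logs(lam - h), logs(lam), logs(lam + h)
    dr, de   = (rp - rm) / (2*h), (ep - em) / (2*h)
    d2r, d2e = (rp - 2*r0 + rm) / h**2, (ep - 2*e0 + em) / h**2
    return 2 * (dr * d2e - d2r * de) / (dr**2 + de**2)**1.5   # eq. (4.64)

grid = [i / 1000.0 for i in range(10, 991)]
lam_star = max(grid, key=curvature)       # eq. (4.65), on a discrete grid
```

A grid search is used here for simplicity; as discussed above, the many local maxima of the curvature make a naive local optimization unreliable.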

The L-Tangent Norm Criterion

The L-Tangent Norm (LTN) is a new criterion that we have designed to choose the regularization parameter in range surface fitting. It was partly inspired by the idea of `stability' developed in (151). It also relies on the L-curve.

The Proposed Criterion

One thing that can easily be noticed when dealing with L-curves is that their parametrization is not uniform. In particular, one can observe that there exists a range of values for $ \lambda$ where the tangent vector norm is significantly smaller than elsewhere. Our new criterion is based on this observation. The regularization parameter is chosen as the one for which the L-curve tangent norm is minimal. Intuitively, such a regularization parameter is the one for which a small variation of the regularization parameter has the lowest impact in the trade-off between the goodness of fit and the surface smoothness.

The L-Tangent Norm (LTN) criterion can be written as:

  $\displaystyle \lambda^* = \arg \min_{\lambda \in \left ] 0,1 \right [} L(\lambda)$ (4.66)
$\displaystyle \textrm{with }$ $\displaystyle L(\lambda) = \left\Vert \left( \overline{\eta}'_\lambda, \overline{\rho}'_\lambda \right) \right\Vert_2^2.$ (4.67)

$ \overline{\rho}'_\lambda$ and $ \overline{\eta}'_\lambda$ are the derivatives with respect to $ \lambda$ of the normalized residual and solution norms:

$\displaystyle \overline{\rho}_\lambda = \frac{\rho_\lambda-\rho_\varepsilon}{\rho_{1-\varepsilon}-\rho_\varepsilon} \qquad \textrm{and} \qquad \overline{\eta}_\lambda = \frac{\eta_\lambda-\eta_{1-\varepsilon}}{\eta_\varepsilon-\eta_{1-\varepsilon}},$ (4.68)

for $ \varepsilon$ a small positive constant ($ 10^{-6}$, for instance).
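The criterion can be sketched on the same kind of hypothetical diagonal problem as before ($ \mathsf{N} = \mathrm{diag}(\sigma)$, $ \mathsf{B} = \mathsf{I}$, made-up values), with the derivatives of the normalized norms approximated by finite differences:

```python
import math

# Sketch of the LTN criterion (4.66)-(4.68) on a toy diagonal problem:
# N = diag(sigma), B = I, so p_i = sigma_i z_i / (sigma_i^2 + mu) with
# mu = lam/(1-lam).  sigma and z are hypothetical illustration values.
sigma = [10.0, 5.0, 1.0, 0.5, 0.1, 0.05]
z     = [ 4.0, 3.0, 2.0, 1.0, 0.02, 0.01]
EPS = 1e-6

def rho_eta(lam):
    mu = lam / (1.0 - lam)
    rho = sum((mu * zi / (si**2 + mu))**2 for si, zi in zip(sigma, z))
    eta = sum((si * zi / (si**2 + mu))**2 for si, zi in zip(sigma, z))
    return rho, eta

r_eps, e_eps = rho_eta(EPS)          # rho_eps  and eta_eps
r_one, e_one = rho_eta(1 - EPS)      # rho_{1-eps} and eta_{1-eps}

def normalized(lam):
    """Equation (4.68): both norms rescaled to [0, 1]."""
    r, e = rho_eta(lam)
    return (r - r_eps) / (r_one - r_eps), (e - e_one) / (e_eps - e_one)

def ltn(lam, h=1e-5):
    """Equation (4.67): squared norm of the normalized tangent vector."""
    rp, ep = normalized(lam + h)
    rm, em = normalized(lam - h)
    dr, de = (rp - rm) / (2*h), (ep - em) / (2*h)
    return dr*dr + de*de

grid = [i / 1000.0 for i in range(10, 991)]
lam_star = min(grid, key=ltn)        # eq. (4.66), on a discrete grid
```

Each evaluation of `ltn` only requires solving the fitting problem at nearby values of $ \lambda$, which is what makes the criterion cheap compared with cross-validation.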

Properties of the L-Tangent Norm

A typical example of the L-tangent norm criterion is shown in figure 4.15. Even though our criterion is not convex, it is continuous and smooth enough to be interesting from an optimization point of view. Moreover, neglecting the values of $ \lambda$ very close to 1, our criterion often has a unique minimum, which is not the case of the L-curve criterion.

Figure 4.15: Example of the L-Tangent Norm criterion. (a) An initial surface. (b) The initial surface sampled on a set of 500 points with additive normally-distributed noise. (c) The L-Tangent Norm criterion. (d) The reconstructed surface using the optimal regularization parameter found with the LTN.
Image orig-surf Image orig-cloud
(a) (b)
Image ltn-criterion Image fitted-surf
(c) (d)

It sometimes happens that there are two minima. In such cases, these two local minima both seem to be meaningful. The first one (i.e. the global minimum) corresponds to the regularization parameter giving the best of the two `explanations' of the data. The second one seems to appear when the data contain, for instance, a lot of small oscillations. In this case, it is not clear (even for a human being) whether the surface must interpolate the data or approximate them, considering the oscillations as some kind of noise. This situation is illustrated in figure 4.16.

Figure 4.16: An example of the LTN criterion presenting two meaningful minima. (a) An initial surface containing a lot of small oscillations. (b) The L-tangent norm criterion presents two minima (excluding the one reached for $ \lambda$ close to 1). (c) The reconstructed surface using the first minimum ( $ \lambda^*_1=0.0189$). (d) The reconstructed surface using the second minimum ( $ \lambda^*_2=0.8073$).
Image 2m-orig-surf Image 2m-ltn
(a) (b)
Image 2m-fitted-surf1 Image 2m-fitted-surf2
(c) (d)

The evaluation of the LTN criterion requires only the computation of the derivatives of the residual and solution norms. This makes our new criterion faster to compute than, for instance, cross-validation. In particular, our criterion allows one to reduce the computation time when the surface model leads to sparse collocation and regularization matrices (as is the case with the B-spline model). This is not possible with cross-validation because the influence matrix is generally not sparse.

Another advantage of the LTN criterion is that it would still be efficient with a non-linear surface model. While cross-validation needs a non-iterative formula to achieve acceptable computation times (and such a formula does not necessarily exist for such surface models), our criterion just needs the computation of the residual and solution norms.

Experimental Results


Synthetic data.

The first type of data used in these experiments is generated by taking noisy sample points of surfaces defined by:

  $\displaystyle g(x,y) = \sum_{i=1}^8 \frac{2(1-d)c_1}{5} g_1(x,y) + \frac{dc_2}{5} g_2(x,y)$ (4.69)
  $\displaystyle g_1(x,y) = \exp\left( -\frac{20 \left( a_1(x-a_2)^2 + a_3(y-a_4)^2\right)}{a_5} \right)$ (4.70)
  $\displaystyle g_2(x,y) = \sin 4 \pi \left(b_1 (x+2b_2)^{\frac{1}{2}+b_3} + b_4(y+2b_5)^{\frac{1}{2}+b_6} \right)$ (4.71)

where $ a_1,\ldots,a_5,b_1,\ldots,b_6,c_1,c_2$ are randomly chosen in $ [0,1]$ and where $ d$ is randomly chosen in $ \{0,1\}$. Examples of generated surfaces are given in figure 4.17.
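A generator in the spirit of equations (4.69)-(4.71) can be sketched as follows. We assume fresh random parameters $ a$, $ b$, $ c$, $ d$ for each of the 8 terms and a unit domain, which the text leaves implicit:

```python
import math, random

# Sketch of the random surface generator of equations (4.69)-(4.71).
# Assumption: the parameters a, b, c, d are redrawn for each of the 8 terms.
def random_surface(rng=random):
    terms = []
    for _ in range(8):
        a = [rng.random() for _ in range(5)]      # a_1, ..., a_5 in [0, 1]
        b = [rng.random() for _ in range(6)]      # b_1, ..., b_6 in [0, 1]
        c1, c2 = rng.random(), rng.random()
        d = rng.choice([0, 1])                    # selects bump vs ripple
        terms.append((a, b, c1, c2, d))

    def g(x, y):
        total = 0.0
        for a, b, c1, c2, d in terms:
            g1 = math.exp(-20 * (a[0]*(x - a[1])**2 + a[2]*(y - a[3])**2) / a[4])
            g2 = math.sin(4 * math.pi * (b[0] * (x + 2*b[1])**(0.5 + b[2])
                                         + b[3] * (y + 2*b[4])**(0.5 + b[5])))
            total += 2 * (1 - d) * c1 / 5 * g1 + d * c2 / 5 * g2
        return total
    return g

g = random_surface(random.Random(0))
val = g(0.5, 0.5)
```

Each term contributes either a Gaussian-like bump ($ d=0$) or a warped sinusoid ($ d=1$), which reproduces the mix of smooth and oscillating surfaces visible in figure 4.17.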

Figure 4.17: Examples of randomly generated surfaces for the experiments.
Image rdfn01 Image rdfn02 Image rdfn03
(a) (b) (c)
Image rdfn04 Image rdfn05 Image rdfn06
(d) (e) (f)

Real data.

The second type of data we have used in these experiments are real depth maps obtained by stereo imaging means. Figure 4.18 shows the data. The range images we have used in these experiments are large: their size is approximately $ 400 \times 600$ pixels. Therefore, with the approach presented in this section, it is difficult (even impossible) to reconstruct a surface from the original datasets. This is the reason why the range images have been subsampled over a regular grid of size $ 30 \times 45$. However, the full resolution image is used when comparing a reconstructed surface to the initial dataset.

Figure 4.18: Real range data used in the experiments of this section (Courtesy of Toby Collins). Top row: data represented as a textured 3D surface. Bottom row: same data as in the top row but represented with depth maps.
Image sc28-surf Image s39-surf Image w89-surf
(a) (b) (c)
Image sc28-depth Image s39-depth Image w89-depth
(d) (e) (f)

Computation Timings

Single point evaluation.

We compare the computation time of evaluating, for a single value of the regularization parameter, the cross-validation score and the L-tangent norm. To do so, we take a surface and sample it for several numbers of points. The timings reported in figure 4.19 have been obtained with the cputime function of Matlab and for the (arbitrary) regularization parameter $ \lambda = \frac{1}{2}$. Note that the timing for each distinct number of points has been repeated multiple times in order to get stable results. Not surprisingly, figure 4.19 tells us that evaluating the L-tangent norm at a single point is much faster than evaluating cross-validation. This comes from the fact that a matrix inversion is needed in the computation of the cross-validation score while only multiplications between sparse matrices and vectors are involved in the L-tangent norm computation.

Figure 4.19: Comparison of the cross-validation versus the L-tangent norm computation time for the evaluation of the criteria at a single value. (a) Plot for both cross-validation and the L-Tangent Norm criteria. (b) Ratio of the timings (cross-validation timings divided by the L-Tangent Norm timings).
Image timings-single Image timings-ratio
(a) (b)

Optimization of the criterion.

In this experiment, we are interested in the computation time of the whole optimization process for both the L-tangent norm and cross-validation. We have taken 300 randomly generated surfaces, each known only through a noisy sampling. The cross-validation score is minimized using a golden section search (implemented in the fminbnd function of Matlab). The results are shown in figure 4.20. As in the previous experiment, optimizing the L-tangent norm is faster than optimizing cross-validation.
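The bounded scalar search can be sketched as follows. This is not the thesis code: the criterion below is a synthetic unimodal stand-in for the cross-validation score, and SciPy's `method="bounded"` (golden section with parabolic interpolation, like Matlab's fminbnd) replaces the Matlab call.

```python
from scipy.optimize import minimize_scalar

def criterion(log_lambda):
    # Toy unimodal criterion in log10(lambda), minimized at log10(lambda) = -1.
    return (log_lambda + 1.0) ** 2 + 0.5

# Search over a bounded range of log-regularization parameters.
res = minimize_scalar(criterion, bounds=(-6.0, 3.0), method="bounded")
best_lambda = 10.0 ** res.x
```

Each call to the real cross-validation criterion involves the matrix inversion discussed above, which is why the whole search is slower than the equivalent search over the L-tangent norm.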

Figure 4.20: Computation time needed to optimize the L-tangent norm and the cross-validation.
Image timings

Reconstruction of whole surfaces.

Figure 4.21 shows the computation times needed to solve the whole surface reconstruction problem for the three range images presented in figure 4.18. Timings for both the L-tangent norm criterion and the cross-validation score are reported. As expected, using the L-tangent norm is faster than using cross-validation.

Figure 4.21: Computation time needed to reconstruct the whole surfaces from the range data of figure 4.18 using the L-tangent norm and cross-validation.
Image whole-timings

Is the L-Tangent Norm an Approximation of Cross-Validation?

This experiment compares the regularization parameters obtained with our L-tangent norm and with the cross-validation criterion. To do so, we have taken noisy samples of randomly generated surfaces. Then, for each dataset, the regularization parameter obtained with cross-validation ( $ \lambda_c^\star$) is plotted against the one determined with the L-tangent norm ( $ \lambda_l^\star$). The results are reported in figure 4.22. We see that the regularization parameters obtained with the L-tangent norm are often close to the ones obtained with cross-validation. One can observe that the L-tangent norm tends to slightly under-estimate large regularization parameters. However, large regularization parameters are usually obtained for datasets that are very noisy or poorly constrained. In such cases, the accuracy of the regularization parameter does not matter much.

Figure 4.22: Comparison of the regularization parameters obtained with the L-tangent norm ( $ \lambda_l^\star$) and with the ones obtained with cross-validation ( $ \lambda_c^\star$). (a) Data with normally-distributed noise. (b) Data with uniformly-distributed noise.
Image new-against-gcv Image new-against-gcv-uniform
(a) (b)

Reconstructed Surfaces

Synthetic data.

In this experiment, we compare the surfaces reconstructed from data obtained as noisy discretizations of randomly generated surfaces. Let $ f$ denote the original randomly generated surface, and $ f_c$, $ f_l$ and $ f_n$ the surfaces reconstructed using respectively cross-validation, the L-curve criterion and our L-tangent norm. The difference between the original surface and the reconstructed ones is measured with the Integral Relative Error (IRE). If the functions $ f$, $ f_c$, $ f_l$ and $ f_n$ are all defined over the domain $ \Omega$, then the IRE is given by:

$\displaystyle e(f,g) = \frac{\displaystyle \iint_\Omega \left\vert g(x,y)-f(x,y) \right\vert \,\mathrm{d}x\,\mathrm{d}y}{\displaystyle \raisebox{-1mm}{\Big\vert} \max_{(x,y)\in\Omega} f(x,y) - \min_{(x,y)\in\Omega} f(x,y) \raisebox{-1mm}{\Big\vert}},$ (4.72)

where the function $ g$ is either $ f_c$, $ f_l$ or $ f_n$. The results of this experiment are reported in figure 4.23. This figure tells us that the reconstruction errors are small and similar for cross-validation and the L-tangent norm. The IREs for surfaces reconstructed using the L-curve criterion are much larger than with the two other criteria. Moreover, only IREs lower than 1 are reported for the L-curve criterion: the IRE was greater than 1 for 48 test surfaces. These large IREs are mainly due to failures in the maximization of the L-curve criterion.
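A discrete approximation of the IRE of equation (4.72) can be sketched as follows. Assumptions: the surfaces are sampled as 2D arrays over a regular grid, the double integral is approximated by a sum weighted by the grid cell area, and the function name `ire` is ours, not the thesis code.

```python
import numpy as np

def ire(f, g, cell_area=1.0):
    """Discrete Integral Relative Error between sampled surfaces f and g."""
    amplitude = abs(f.max() - f.min())  # |max f - min f| over the domain
    return np.abs(g - f).sum() * cell_area / amplitude

# Toy ground-truth surface sampled on a 30 x 45 grid.
f = np.fromfunction(lambda i, j: np.sin(i / 10.0) + j / 20.0, (30, 45))
```

A perfect reconstruction gives an IRE of zero, and any deviation from $ f$ increases the score.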

Figure 4.23: Integral relative errors for 200 randomly generated surfaces sampled over 500 points with additive normally-distributed noise.
Image ire

Range images.

In this last experiment, we compare the surfaces reconstructed from real range images. To do so, we again take the three range images of figure 4.18. Let $ f_i$ be the original range image (before subsampling). Let $ f_l$ and $ f_c$ be the surfaces reconstructed using respectively the L-tangent norm and the cross-validation criterion to choose the regularization parameter. The results of this experiment are presented in the form of Relative Error Maps (REM). The REM for the surface reconstructed using the L-tangent norm is a picture in which each pixel $ (x,y)$ is associated with a colour $ C_l(x,y)$ proportional to the difference in depth between the reconstructed surface and the original one. This is written as:

$\displaystyle C_l(x,y) = \frac{\displaystyle \left\vert f_i(x,y) - f_l(x,y) \right\vert}{\displaystyle \raisebox{-1mm}{\Big\vert} \max_{(u,v)}f_i(u,v)-\min_{(u,v)}f_i(u,v) \raisebox{-1mm}{\Big\vert} }.$ (4.73)

The REM for the surface reconstructed using cross-validation, $ C_c$, is defined similarly to equation (4.73) except that $ f_l$ is replaced by $ f_c$. We also define the Difference Error Map (DEM) by $ C_{l,c}(x,y)=\left\vert C_c(x,y)-C_l(x,y) \right\vert$. The results of the comparison between surfaces reconstructed from range images using the L-tangent norm and cross-validation are reported in figure 4.24. In this figure, only the error maps for the L-tangent norm are shown. Indeed, as shown in figure 4.24 (d-f), the two reconstructed surfaces are very similar (which is the main point of interest in this experiment). Even though the reconstruction errors are not negligible (figure 4.24 (a-c)), they remain small. The main reason for these errors is the subsampling of the initial datasets.
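The REM of equation (4.73) and the DEM can be sketched on arrays as follows. This is an illustrative computation on synthetic depth maps, not the thesis code; the array names mirror the symbols in the text.

```python
import numpy as np

def rem(f_i, f_rec):
    """Relative Error Map between an original depth map and a reconstruction."""
    amplitude = abs(f_i.max() - f_i.min())  # |max f_i - min f_i|
    return np.abs(f_i - f_rec) / amplitude

f_i = np.linspace(0.0, 1.0, 12).reshape(3, 4)  # toy "original" depth map
f_l = f_i + 0.05                               # toy LTN reconstruction
f_c = f_i - 0.05                               # toy CV reconstruction

C_l = rem(f_i, f_l)          # REM for the L-tangent norm
C_c = rem(f_i, f_c)          # REM for cross-validation
C_lc = np.abs(C_c - C_l)     # Difference Error Map (DEM)
```

When the two reconstructions are close, as in figure 4.24 (d-f), the DEM is near zero even where the individual REMs are not.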

Figure 4.24: (a-c) Relative error maps (REM) for the surfaces reconstructed using the L-tangent norm criterion for the three range images of figure 4.18. (d-f) Difference error maps (DEM) between the surfaces reconstructed using the L-tangent norm criterion and the cross-validation criterion.
Image errmap-ltn1 Image errmap-ltn2 Image errmap-ltn3
(a) (b) (c)
Image diffmap1 Image diffmap2 Image diffmap3
(d) (e) (f)

Contributions to Parametric Image Registration and 3D Surface Reconstruction (Ph.D. dissertation, November 2010) - Florent Brunet