CP decomposition


Some facts and properties of the CP decomposition are collected here. Once the order of a data array grows beyond two, the situation changes fundamentally from the matrix case, so these facts and the progress on tensors are worth keeping in mind.


Algorithm

Notation

A \(K\)th-order rank-1 tensor \(\mathcal{X} \in \mathbb{C}^{I_1 \times \cdots \times I_K}\) is defined as the tensor product of nonzero vectors \(\mathbf{a}^{(k)} \in \mathbb{C}^{I_k}\), \(1 \le k \le K\), such that \(\mathcal{X}_{i_1 \cdots i_K} = \prod_{k=1}^{K} \mathbf{a}_{i_k}^{(k)}\). We write \(\mathcal{X} = \mathbf{a}^{(1)} \circ \mathbf{a}^{(2)} \circ \cdots \circ \mathbf{a}^{(K)}\). The rank of a tensor \(\mathcal{X}\) is the minimal number of rank-1 tensors whose linear combination yields \(\mathcal{X}\). Assume that the rank of \(\mathcal{X}\) is \(R\); then it can be written as \begin{equation}\label{cp} \mathcal{X} = \sum_{r=1}^{R} \mathbf{a}_r^{(1)} \circ \cdots \circ \mathbf{a}_r^{(K)}, \end{equation} where \(\mathbf{a}_r^{(k)} \in \mathbb{C}^{I_k}\). Let us stack the vectors \(\mathbf{a}_{r}^{(k)}\) into the matrices \begin{equation}\label{factor} \mathbf{A}^{(k)} = \Big[\mathbf{a}_1^{(k)}, \cdots, \mathbf{a}_R^{(k)} \Big] \in \mathbb{C}^{I_k \times R}, \quad 1 \le k \le K. \end{equation} The matrices \(\mathbf{A}^{(k)}\) are called factor matrices.
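As a quick illustration of (\ref{cp}), the following minimal sketch (not from the references; the shapes \((4, 5, 6)\) and \(R = 3\) are arbitrary choices) assembles a third-order rank-\(R\) tensor from its factor matrices with NumPy:

```python
# A minimal sketch of equation (cp): build a Kth-order (here K = 3)
# rank-R tensor from its factor matrices A^(1), A^(2), A^(3).
import numpy as np

rng = np.random.default_rng(0)
I1, I2, I3, R = 4, 5, 6, 3  # arbitrary illustrative sizes

# Factor matrices A^(k), whose r-th columns are the vectors a_r^(k).
A1 = rng.standard_normal((I1, R))
A2 = rng.standard_normal((I2, R))
A3 = rng.standard_normal((I3, R))

# X_{ijk} = sum_r A1[i, r] * A2[j, r] * A3[k, r]: a sum of R outer products.
X = np.einsum('ir,jr,kr->ijk', A1, A2, A3)
print(X.shape)  # (4, 5, 6)
```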

Definition of Uniqueness

The CP decomposition of a tensor is unique if all sets of factor matrices \(\bar{\mathbf{A}}^{(1)},\cdots, \bar{\mathbf{A}}^{(K)}\) satisfying (\ref{cp}) are related to \(\mathbf{A}^{(1)},\cdots, \mathbf{A}^{(K)}\) via \begin{equation} \bar{\mathbf{A}}^{(k)} = \mathbf{A}^{(k)} \mathbf{P} \Delta^{(k)}, \end{equation} where the \(\Delta^{(k)}\) are diagonal matrices satisfying \(\prod_{k=1}^{K} \Delta^{(k)} = \mathbf{I}_{R}\) and \(\mathbf{P}\) is a permutation matrix. That is, uniqueness holds up to a common reordering of the rank-1 terms and column scalings that cancel across modes.
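To make this concrete, the following continuation of the sketch above (reusing `A1`, `A2`, `A3`, `X`, `R`) applies the same column permutation \(\mathbf{P}\) to every factor matrix, together with diagonal scalings whose product is \(\mathbf{I}_R\); the reconstructed tensor is unchanged:

```python
# Continues the previous sketch: the permutation/scaling ambiguity.
# B^(k) = A^(k) P D^(k) with D1 @ D2 @ D3 = I_R reproduces the same tensor.
P = np.eye(R)[:, ::-1]          # a permutation matrix (reverses the columns)
D1 = np.diag([2.0, -1.0, 0.5])
D2 = np.diag([0.5, -1.0, 4.0])
D3 = np.diag([1.0, 1.0, 0.5])   # note D1 @ D2 @ D3 = I_R

B1, B2, B3 = A1 @ P @ D1, A2 @ P @ D2, A3 @ P @ D3
X_bar = np.einsum('ir,jr,kr->ijk', B1, B2, B3)
print(np.allclose(X, X_bar))    # True: same tensor, different factor matrices
```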

Summary

The uniqueness of the CP decomposition is not yet fully understood, and it is fundamentally different from the matrix case. Note that, contrary to the singular value decomposition (SVD) in the matrix case, no orthogonality constraints are imposed to ensure uniqueness; imposing orthogonality constraints yields a different decomposition with different properties. Unlike for matrices, orthogonality does not by itself guarantee identifiability of a tensor decomposition.

  • Notions of orthogonality [2]: if for every $1 \le k \le K$ and all $1 \le i < j \le R$ we have $\mathbf{a}_{i}^{(k)} \perp \mathbf{a}_{j}^{(k)}$, the decomposition is called a completely orthogonal rank decomposition. There are several other notions of orthogonality for tensors.
  • A completely orthogonal rank decomposition may not even exist; for example, the rank can be greater than the largest dimension (see the example after this list). Matrices (i.e., tensors of order two) are the special case that always admits a completely orthogonal decomposition.
  • Even if it exists, a decomposition under orthogonality constraints is not necessarily minimal in the number of terms $R$ [2]. For matrices, all three rank decompositions are equivalent to the SVD.
  • Even if it exists and attains the minimal rank $R$, an orthogonal rank decomposition is NOT unique. See the examples in [2].
  • Key to many applications are the uniqueness properties of the CPD; the CPD can be unique without imposing constraints such as orthogonality.
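As a concrete instance of the existence issue above, a well-known example is the $2 \times 2 \times 2$ tensor \[ \mathcal{X} = \mathbf{e}_1 \circ \mathbf{e}_1 \circ \mathbf{e}_2 + \mathbf{e}_1 \circ \mathbf{e}_2 \circ \mathbf{e}_1 + \mathbf{e}_2 \circ \mathbf{e}_1 \circ \mathbf{e}_1, \] which has rank 3 even though every dimension equals 2. Since $\mathbb{C}^2$ cannot contain three mutually orthogonal nonzero vectors, no completely orthogonal rank decomposition of $\mathcal{X}$ exists.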

    If $\sum_{i=1}^{K} k_{\mathbf{A}^{(i)}} \ge 2R + K - 1$, where $k_{\mathbf{A}}$ denotes the Kruskal rank (k-rank) of $\mathbf{A}$, i.e., the largest $k$ such that every set of $k$ columns of $\mathbf{A}$ is linearly independent, then the CP decomposition is unique (Kruskal's condition; a numerical check is sketched after this list).
    Remarks:

  • $k_{\mathbf{A}} = 0$ if and only if the matrix $\mathbf{A}$ has an all-zero column.
  • $k_{\mathbf{A}} = 1$ if and only if $\mathbf{A}$ has no all-zero column but contains proportional (linearly dependent) columns.
  • It is reasonable to assume $\min_{i} k_{\mathbf{A}^{(i)}} \ge 2$. Under this assumption, each extra mode increases the left-hand side of Kruskal's condition by at least 2 while the right-hand side grows by only 1, so the condition becomes easier to satisfy as the order $K$ increases.
  • If all factor matrices involved have full k-rank, then the decomposition is unique [1].
  • Kruskal's rank condition is known to be best possible [16]; however, other assumptions can also imply uniqueness, for example:

  • When at least one factor matrix $\mathbf{A}^{(i)}$ is full rank [6];
  • When $I_K \ge R$ [9];
  • When at least one of the matrix factors is constrained to be columnwise orthonormal [13], etc.
  • In random (generic) settings, the decomposition is unique with probability one [8].
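As promised above, here is a minimal sketch (not taken from the references) that computes the k-rank by brute force and tests Kruskal's condition; it enumerates all column subsets, so it is only practical for small $R$:

```python
# Brute-force k-rank and Kruskal's condition sum_i k_rank(A^(i)) >= 2R + K - 1.
from itertools import combinations
import numpy as np

def k_rank(A, tol=1e-10):
    """Largest k such that EVERY set of k columns of A is linearly independent."""
    R = A.shape[1]
    for k in range(R, 0, -1):
        if all(np.linalg.matrix_rank(A[:, list(cols)], tol=tol) == k
               for cols in combinations(range(R), k)):
            return k
    return 0  # A has an all-zero column

def kruskal_condition(factors):
    """Check Kruskal's sufficient condition for a list of factor matrices."""
    R = factors[0].shape[1]
    K = len(factors)
    return sum(k_rank(A) for A in factors) >= 2 * R + K - 1

rng = np.random.default_rng(0)
factors = [rng.standard_normal((I, 3)) for I in (4, 5, 6)]
print(kruskal_condition(factors))  # True: random factors are full k-rank w.p. 1
```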
References


    [1] Nicholas D. Sidiropoulos, Rasmus Bro. On the Uniqueness of Multilinear Decomposition of N-way Arrays. 2000.
    Some facts discussed in Remark 3: (1) if all matrices involved are full k-rank (a matrix whose columns are drawn independently from absolutely continuous distributions is full k-rank with probability one), then the decomposition is unique; (2) F-component multilinear models are generically unique in higher dimensions.
    [2] Tamara G. Kolda. Orthogonal Tensor Decompositions. 2001.
    Defines several notions of orthogonality for tensors: rank decomposition, orthogonal rank decomposition, strong orthogonal rank decomposition. In all of these senses the orthogonal rank decomposition is NOT unique; a definition is given under which the decomposition is unique in the sense that the weights are unique.
    [3] Xiangqian Liu, Nicholas D. Sidiropoulos. Cramér–Rao Lower Bounds for Low-Rank Decomposition of Multidimensional Arrays. 2001.
    Simple-to-check necessary conditions for a unique low-rank decomposition are provided.
    [4] Jos M. F. Ten Berge, Nicholas D. Sidiropoulos. On Uniqueness in CANDECOMP/PARAFAC. 2007.
    Kruskal's sufficient condition is also necessary for tensors of rank $R = 2$ and $R = 3$, but not for $R > 3$. It has been shown that, in cases of small k-rank, the particular pattern of zeros, after pre-transformations to obtain identity submatrices in A, B, and C, may have a decisive impact on uniqueness. This implies that attempts to derive necessary and sufficient conditions for uniqueness are doomed to fail unless they take that very pattern into account.
    [5] Tamara G. Kolda. A Counterexample to the Possibility of an Extension of the Eckart–Young Low-Rank Approximation Theorem for the Orthogonal Rank Tensor Decomposition. 2003.
    The first term in orthogonal rank tensor decomposition is NOT the best rank-1 approximation.
    [6] Tao Jiang, Nicholas D. Sidiropoulos. Kruskal’s Permutation Lemma and the Identification of CANDECOMP/PARAFAC. 2004.
    Two equivalent necessary and sufficient conditions for unique decomposition of restricted CP models where at least one of the component matrices is full column rank.
    [7] Alwin Stegeman, Nicholas D. Sidiropoulos. On Kruskal’s uniqueness condition for the Candecomp/Parafac decomposition. 2006.
    Provides an accessible and intuitive proof of Kruskal's condition.
    [8] A. Stegeman, J.M.F. Ten Berge, L. De Lathauwer. Sufficient conditions for uniqueness in Candecomp/Parafac and Indscal with random component matrices. 2006.
    A link between the deterministic approach of Jiang and Sidiropoulos and the random setting of De Lathauwer is provided.
    [9] Lieven De Lathauwer. A Link Between the Canonical Decomposition in Multilinear Algebra and Simultaneous Matrix Diagonalization. 2006.
    This paper derives a new sufficient condition for uniqueness in the case that $I_K \ge R$.
    [10] Evrim Acar, Bulent Yener. Unsupervised Multiway Data Analysis: A Literature Survey. 2009.
    The motivation behind PARAFAC is to obtain a unique solution, i.e., component matrices that are determined uniquely up to a permutation (the rank-one tensors can be arbitrarily reordered) and up to scaling of the columns. It is this uniqueness property that makes PARAFAC a popular technique in various fields.
    [11] T. G. Kolda and B. W. Bader, Tensor decompositions and applications. 2009.
    A classic review paper. Some sufficient and necessary conditions for uniqueness are mentioned.
    [12] Evrim Acar, Daniel M. Dunlavy, Tamara G. Kolda. A Scalable Optimization Approach for Fitting Canonical Tensor Decompositions. 2011.
    CP factors are not orthonormal in each mode; in fact, the rank can be greater than the largest dimension. CP is known to be unique when it satisfies, e.g., the Kruskal condition. Proposes regularizing the optimization formulation of CP.
    [13] Mikael Sorensen, Lieven De Lathauwer, Pierre Comon, et al. Canonical Polyadic Decomposition with a Columnwise Orthonormal Factor Matrix. 2012.
    In many applications at least one of the matrix factors is constrained to be columnwise orthonormal. We first derive a relaxed condition that guarantees uniqueness of the CPD under this constraint. Second, we give a simple proof of the existence of the optimal low-rank approximation of a tensor in the case that a factor matrix is columnwise orthonormal.
    [14] Cristiano Bocci, Luca Chiantini, et al. Refined Methods for the Identifiability of Tensors. 2013.
    Develops refined methods for establishing the identifiability (uniqueness of the decomposition) of tensors.
    [15] Ignat Domanov, Lieven De Lathauwer. On the Uniqueness of the Canonical Polyadic Decomposition of Third-Order Tensors — Part I. 2013.
    Derives sufficient conditions on the factor matrices that guarantee partial uniqueness, i.e., uniqueness of one factor matrix.
    [16] Harm Derksen. Kruskal's Uniqueness Inequality Is Sharp. 2013.
    Shows that uniqueness can fail as soon as Kruskal's inequality is weakened, i.e., the bound $2R + K - 1$ cannot be improved.
    [17] Ignat Domanov, Lieven De Lathauwer. On the Uniqueness of the Canonical Polyadic Decomposition of Third-Order Tensors — Part II. 2014.
    Establishes overall CPD uniqueness in cases where none of the factor matrices has full column rank.
    [18] Andrzej Cichocki, Danilo P. Mandic, et al. Tensor Decompositions for Signal Processing Applications. 2015.
    CP decomposition is unique under more natural and relaxed conditions, which only require the components to be sufficiently different and their number not unreasonably large.
    [19] Yang Qi, Pierre Comon, et al. Uniqueness of Nonnegative Tensor Approximations. 2016.
    Shows that for a nonnegative tensor, a best nonnegative rank-$R$ approximation is almost always unique. Some good reviews of uniqueness results are also given in this paper.
    [20] Jérémy Cohen, Konstantin Usevich, Pierre Comon. A Tour of Constrained Tensor Canonical Polyadic Decomposition. 2016.
    This paper surveys the use of constraints in tensor decomposition models.
    [21] Nicholas D. Sidiropoulos, et al. Tensor Decomposition for Signal Processing and Machine Learning. 2017.
    A good review paper.