The Dynamic Target: A New Approach to Outlier-Resistant Non-Negative Matrix and Tensor Factorization
Paul Fogel, Christophe Geissler
Data Services, ForvisMazars, Courbevoie, France
[paul.fogel@forvismazars.com] [christophe.geissler@forvismazars.com]
George Luta
Department of Biostatistics, Bioinformatics and Biomathematics, Georgetown University Medical Center, Washington, DC, USA
[George.Luta@georgetown.edu]
Keywords: Robust NMF, weighted least squares
1 Introduction
By enforcing specific constraints, non-negative matrix and tensor factorization (NMTF) methods decompose data into sparse and interpretable factors (Lee and Seung [1999]). This distinctive property contributed to their early popularity in the broad area of dimension reduction techniques. Among the numerous related algorithms developed so far, Fast-HALS (Hierarchical Alternating Least Squares) is considered one of the most powerful in terms of computational performance (Cichocki and Phan [2009]), and its excellent convergence properties have recently been studied (Hou et al. [2024]). However, like most algorithms using Frobenius norm-based objectives, Fast-HALS is sensitive to outliers. Incorporating resistance to outliers into the design of algorithms commonly implies that "the least squares criterion is replaced by a weighted least squares criterion, where the weight function is chosen in order to give less weight to 'discrepant' observations (in the sense only of fitting the model less well)" (Green [1984]). Unfortunately, the Fast-HALS update rules rely on the associativity of matrix multiplication, which no longer applies when a weight matrix is inserted. For this reason, outlier-resistant NMTF algorithms use multiplicative update rules, which converge sub-linearly to asymptotic stability (Badeau et al. [2011]) and are typically slower than Fast-HALS. In this work, we introduce a novel approach, the "Dynamic Target", which allows us to apply the Fast-HALS rules and take full advantage of their computational power while being resistant to outliers.
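To make the obstacle concrete, the two objectives can be written as follows for the matrix case (the notation is ours: V is the non-negative data matrix, W and H the non-negative factors, and Ω an entry-wise weight matrix):

```latex
% Frobenius objective minimized by Fast-HALS:
\min_{W,H \ge 0} \; \lVert V - WH \rVert_F^2
  = \sum_{i,j} \bigl( V_{ij} - (WH)_{ij} \bigr)^2

% Weighted least squares objective: \Omega_{ij} down-weights discrepant entries.
\min_{W,H \ge 0} \; \sum_{i,j} \Omega_{ij} \, \bigl( V_{ij} - (WH)_{ij} \bigr)^2
```

In the unweighted case, each HALS sweep reuses the precomputed products VHᵀ and HHᵀ; once Ω is inserted, the residual must be reweighted entry-wise and these shared products are no longer available, which is why weighted algorithms fall back on slower multiplicative updates.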
2 Materials and methods
2.1 Materials
We compared our new approach with state-of-the-art weighted least squares NMF approaches: CIM-NMF and Huber-NMF (Du et al. [2012]), and L1-NMF and L21-NMF (Lam [2008]), using the ORL (Olivetti Research Laboratory) and CroppedYaleB datasets, which contain face images from 40 and 28 subjects, respectively, with variations in lighting and facial expressions. "Block" noise (a randomly placed white rectangle) and "salt" noise (randomly distributed white pixels) were added to the images to generate outlier images, as sketched below.
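The following is a minimal sketch of how such outlier images can be generated; the block size and salt fraction are illustrative defaults of ours, not the exact settings used in the experiments.

```python
import numpy as np

def add_block_noise(img, block_size=10, rng=None):
    """Overlay a randomly placed white rectangle ("block" noise)."""
    rng = rng or np.random.default_rng()
    out = img.copy()
    h, w = out.shape
    top = rng.integers(0, max(1, h - block_size))
    left = rng.integers(0, max(1, w - block_size))
    out[top:top + block_size, left:left + block_size] = out.max()
    return out

def add_salt_noise(img, fraction=0.1, rng=None):
    """Set a random fraction of pixels to white ("salt" noise)."""
    rng = rng or np.random.default_rng()
    out = img.copy()
    mask = rng.random(out.shape) < fraction
    out[mask] = out.max()
    return out

# Usage on a placeholder array the size of an ORL face image (112x92 pixels):
img = np.ones((112, 92))
noisy = add_salt_noise(add_block_noise(img), fraction=0.1)
```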
2.2 Methods
Before we discuss our novel approach, it is illuminating to consider this brief parable: a fisherman throws a rope to the pier and pulls the ship closer. A child, observing this, remarks to his father, "Look at how strong the fisherman is; he is pulling the pier closer to the ship!" We took the child's remark seriously and gradually adjusted the data to make it more amenable to smooth factorization. How does an iteration of the "Dynamic Target" approach work? As with weighted least squares, we start by computing the squared differences between the data points and the values corresponding to the current factorization. From these, we derive weights using a set of functions already proposed in the literature. Next, we calculate the Dynamic Target by taking, for each data point, a weighted average of the global mean of the data and the observed value, so that ill-fitted points are pushed towards the global mean while well-fitted points are left almost unchanged. The factors are then updated against the Dynamic Target using the Fast-HALS rules. Once this iterative process has completed, a few iterations of weighted least squares are performed to bring the factorization closer to the real data. To save time (and energy), the Dynamic Target is only recomputed at an interval that depends on the relative distance between the previous and current targets. Finally, solutions obtained with different random initializations are integrated using the Integrated Sources Model (ISM) (Fogel et al. [2024]). A minimal sketch of the core loop is given below.
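The following Python sketch illustrates the core loop for the matrix case. The function names (cim_weights, hals_step, dynamic_target_nmf) and all hyperparameters are ours for illustration and do not reflect the enAInem API; the sketch also omits the final weighted least squares refinement, the adaptive target-update schedule, and the ISM integration step described above.

```python
import numpy as np

def cim_weights(residuals, sigma=None):
    """Entry-wise weights from the correntropy-induced metric (CIM):
    large residuals get weights near 0, small residuals near 1.
    The Gaussian kernel width sigma defaults to a heuristic choice."""
    r2 = residuals ** 2
    sigma2 = sigma ** 2 if sigma else r2.mean() + 1e-12
    return np.exp(-r2 / (2 * sigma2))

def hals_step(X, W, H, eps=1e-12):
    """One sweep of HALS column/row updates for X ~= W @ H."""
    HHt, XHt = H @ H.T, X @ H.T
    for k in range(W.shape[1]):
        W[:, k] = np.maximum(eps, W[:, k] + (XHt[:, k] - W @ HHt[:, k]) / (HHt[k, k] + eps))
    WtW, WtX = W.T @ W, W.T @ X
    for k in range(H.shape[0]):
        H[k, :] = np.maximum(eps, H[k, :] + (WtX[k, :] - WtW[k, :] @ H) / (WtW[k, k] + eps))
    return W, H

def dynamic_target_nmf(X, rank, n_iter=200, rng=None):
    """Sketch of one Dynamic Target run (hyperparameters are illustrative)."""
    rng = rng or np.random.default_rng()
    m, n = X.shape
    W, H = rng.random((m, rank)), rng.random((rank, n))
    mu = X.mean()  # global mean of the data
    for _ in range(n_iter):
        Omega = cim_weights(X - W @ H)
        # Ill-fitted entries (low weight) are pulled toward the global mean;
        # well-fitted entries stay close to their observed values.
        T = Omega * X + (1.0 - Omega) * mu
        W, H = hals_step(T, W, H)  # plain (unweighted) HALS on the target
    return W, H
```

The CIM weights shown here follow the half-quadratic formulation of Du et al. [2012]; any of the other weighting schemes mentioned above (Huber, L1, L21) could be substituted for cim_weights without changing the structure of the loop.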
2.3 Data and Code availability
The weighted least squares NMF baselines were run using the Python code from the jasoncoding13 repository, where the image databases used can also be found. NMTF with the Dynamic Target was run using the Python enAInem code.
3 Results
Performance was assessed in terms of relative reconstruction error (RRE), accuracy score (ACC), and normalized mutual information (NMI). For simplicity, only the results for the ORL data with block noise are presented. We first checked that standard NMTF performance was indeed degraded by the added noise, with low ACC and NMI (0.20 and 0.35, respectively). When using the CIM weighting scheme, Dynamic Target NMTF achieved ACC and NMI values comparable to those obtained with the original images without added noise (ACC=0.70 and NMI=0.80). In contrast, weighted least squares NMF lagged far behind when using the same weighting scheme (ACC=0.50 and NMI=0.70). We also ran simulations to confirm that the Dynamic Target approach is resistant to outliers; notably, it did not degrade performance when applied to the original data. Due to the iterative calculation of the Dynamic Target, the computational time increased compared to standard Fast-HALS, but it remained five times lower than that of the weighted least squares NMF algorithms.
4 Discussion
We have introduced the Dynamic Target approach to achieve an NMTF decomposition that is resistant to outliers, and we have provided empirical evidence that this approach, disruptive as it may seem, is indeed resistant to outliers: its computational performance remains close to that of the original Fast-HALS algorithm, while it outperforms state-of-the-art weighted least squares NMF approaches. Further research is needed to investigate its convergence properties. Potential extensions of the approach to other algorithms, such as generalized NMTF using iteratively reweighted least squares (IRLS), are also of interest.
References
- Badeau et al. [2011] Roland Badeau, Nancy Bertin, and Emmanuel Vincent. Stability analysis of multiplicative update algorithms for non-negative matrix factorization. 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 2148–2151, 2011. URL https://api.semanticscholar.org/CorpusID:295409.
- Cichocki and Phan [2009] Andrzej Cichocki and Anh-Huy Phan. Fast local algorithms for large scale nonnegative matrix and tensor factorizations. IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences, E92-A:708–721, 2009. URL https://api.semanticscholar.org/CorpusID:16858053.
- Du et al. [2012] Liang Du, Xuan Li, and Yi-Dong Shen. Robust nonnegative matrix factorization via half-quadratic minimization. 2012 IEEE 12th International Conference on Data Mining, pages 201–210, 2012. URL https://api.semanticscholar.org/CorpusID:14250058.
- Fogel et al. [2024] Paul Fogel, Christophe Geissler, Franck Augé, Galina Boldina, and George Luta. Integrated sources model: A new space-learning model for heterogeneous multi-view data reduction, visualization, and clustering. Artificial Intelligence in Health, 2024. URL https://api.semanticscholar.org/CorpusID:271634070.
- Green [1984] Peter J. Green. Iteratively reweighted least squares for maximum likelihood estimation, and some robust and resistant alternatives. Journal of the Royal Statistical Society: Series B (Methodological), 46(2):149–192, 1984. URL https://api.semanticscholar.org/CorpusID:16336111.
- Hou et al. [2024] Liangshao Hou, Delin Chu, and Li-Zhi Liao. Convergence of a fast hierarchical alternating least squares algorithm for nonnegative matrix factorization. IEEE Transactions on Knowledge and Data Engineering, 36:77–89, 2024. URL https://api.semanticscholar.org/CorpusID:258903573.
- Lam [2008] Edmund Y. Lam. Non-negative matrix factorization for images with Laplacian noise. APCCAS 2008 - 2008 IEEE Asia Pacific Conference on Circuits and Systems, pages 798–801, 2008. URL https://api.semanticscholar.org/CorpusID:13111402.
- Lee and Seung [1999] Daniel D. Lee and H. Sebastian Seung. Learning the parts of objects by non-negative matrix factorization. Nature, 401:788–791, 1999. URL https://api.semanticscholar.org/CorpusID:4428232.