Non-negative Matrix Factorization (NMF) is a widely used technique for dimensionality reduction and latent structure discovery in count data. However, classical NMF approaches do not adequately model sparse count data with excess zeros. This research proposes an extension of Matrix Multiplicative NMF (MM-NMF) that integrates zero-inflated and hurdle model formulations into the probabilistic matrix factorization framework, with the aim of improving dimensionality reduction, imputation, and downstream analysis for sparse count data.
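For orientation, the classical baseline that the summary contrasts against can be sketched as follows. This is the standard Lee-Seung multiplicative update for NMF under a Poisson (generalized KL) objective, not the proposed MM-NMF extension. The function names, the rank, and the simulated data are illustrative assumptions that do not come from the source.

```python
import numpy as np

def poisson_nll(V, WH):
    # Poisson negative log-likelihood up to a constant that does not depend on W, H.
    return float(np.sum(WH - V * np.log(WH)))

def kl_nmf(V, rank, n_iter=200, eps=1e-10, seed=0):
    """Classical NMF with Lee-Seung multiplicative updates (KL / Poisson objective)."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, rank)) + 0.1   # strictly positive initialization
    H = rng.random((rank, m)) + 0.1
    losses = []
    for _ in range(n_iter):
        WH = W @ H + eps
        W *= ((V / WH) @ H.T) / (H.sum(axis=1) + eps)
        WH = W @ H + eps
        H *= (W.T @ (V / WH)) / (W.sum(axis=0)[:, None] + eps)
        losses.append(poisson_nll(V, W @ H + eps))
    return W, H, losses

# Simulated sparse counts (hypothetical data, for illustration only).
rng = np.random.default_rng(1)
rate = rng.gamma(1.0, 1.0, (30, 3)) @ rng.gamma(1.0, 1.0, (3, 20))
V = rng.poisson(rate).astype(float)
W, H, losses = kl_nmf(V, rank=3)
```

Nothing in this objective treats zeros specially: every zero entry is explained only through a small rate in W @ H, which is the gap the zero-inflated and hurdle formulations are meant to address.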
Key findings
Proposes ZI-MM-NMF and Hurdle-MM-NMF, extensions of MM-NMF that explicitly model the data-generating process behind excess zeros.
Derives multiplicative update rules with guaranteed convergence properties for both formulations.
Develops efficient variational inference algorithms for Bayesian posterior estimation.
Captures both technical zeros (dropouts) and structural zeros within a unified latent factor model.
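The difference between the two zero models named above can be made concrete with a minimal sketch. It assumes a Poisson count distribution, which the summary does not specify, and uses hypothetical names: `lam` stands for the rate an entry would receive from the factorization, and `pi` for the zero-inflation or hurdle probability. It is an illustration of the textbook zero-inflated and hurdle likelihoods, not the paper's exact formulation.

```python
import math

def poisson_logpmf(k, lam):
    # log of Poisson(k; lam)
    return k * math.log(lam) - lam - math.lgamma(k + 1)

def zip_logpmf(k, lam, pi):
    """Zero-inflated Poisson: a zero is either an extra zero (prob. pi)
    or an ordinary Poisson zero (prob. (1 - pi) * exp(-lam))."""
    if k == 0:
        return math.log(pi + (1.0 - pi) * math.exp(-lam))
    return math.log1p(-pi) + poisson_logpmf(k, lam)

def hurdle_logpmf(k, lam, pi):
    """Hurdle Poisson: all zeros come from the hurdle (prob. pi);
    positive counts follow a zero-truncated Poisson."""
    if k == 0:
        return math.log(pi)
    log_trunc = math.log(-math.expm1(-lam))  # log(1 - exp(-lam))
    return math.log1p(-pi) + poisson_logpmf(k, lam) - log_trunc

# Assumed illustrative parameter values.
lam, pi = 2.0, 0.3
zip_total = sum(math.exp(zip_logpmf(k, lam, pi)) for k in range(60))
hurdle_total = sum(math.exp(hurdle_logpmf(k, lam, pi)) for k in range(60))
zip_p0 = math.exp(zip_logpmf(0, lam, pi))
hurdle_p0 = math.exp(hurdle_logpmf(0, lam, pi))
```

In the zero-inflated form the probability of a zero exceeds `pi` because the count component can also produce zeros, whereas in the hurdle form it equals `pi` exactly. That is one way to read the distinction between zeros arising from two sources and zeros arising from a single gating process.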
Limitations & open questions
The proposed methods require complex optimization procedures and may be less simple and less interpretable than some existing methods.