Introduction - Problem Statements and Models

Matrix factorization is an important and unifying topic in signal processing and linear algebra, which has found numerous applications in many other areas. This chapter introduces basic linear and multi-linear¹ models for matrix and tensor factorizations and decompositions, and formulates the analysis framework for the solution of problems posed in this book. The workhorse in this book is Nonnegative Matrix Factorization (NMF) for sparse representation of data and its extensions, including multi-layer NMF, semi-NMF, sparse NMF, tri-NMF, symmetric NMF, orthogonal NMF, non-smooth NMF (nsNMF), overlapping NMF, convolutive NMF (CNMF), and large-scale NMF. Our particular emphasis is on NMF and semi-NMF models and their extensions to multi-way models (i.e., multi-linear models which perform multi-way array (tensor) decompositions) with nonnegativity and sparsity constraints, including Nonnegative Tucker Decompositions (NTD), Constrained Tucker Decompositions, and Nonnegative and semi-nonnegative Tensor Factorizations (NTF), which are mostly based on a family of TUCKER, PARAFAC and PARATUCK models.

As the theory and applications of NMF, NTF and NTD are still being developed, our aim is to produce a unified, state-of-the-art framework for the analysis and development of efficient and robust algorithms. In doing so, our main goals are to:

1. Develop various working tools and algorithms for data decomposition and feature extraction based on nonnegative matrix factorization (NMF) and sparse component analysis (SCA) approaches. We thus integrate several emerging techniques in order to estimate physically, physiologically, and neuroanatomically meaningful sources or latent (hidden) components with morphological constraints. These constraints include nonnegativity, sparsity, orthogonality, smoothness, and semi-orthogonality.
2. Extend NMF models to multi-way array (tensor) decompositions, factorizations, and filtering, and derive efficient learning algorithms for these models.
3. Develop a class of advanced blind source separation (BSS), unsupervised feature extraction and clustering algorithms, and evaluate their performance using a priori knowledge and morphological constraints.
4. Develop computational methods to efficiently solve the bi-linear system Y = AX + E for noisy data, where Y is an input data matrix, A and X represent unknown matrix factors to be estimated, and the matrix E represents error or noise (which should be minimized using a suitably designed cost function).

¹ A function in two or more variables is said to be multi-linear if it is linear in each variable separately.
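
To make the bi-linear model Y = AX + E in goal 4 concrete, the following is a minimal sketch of one possible way to estimate nonnegative factors A and X from a noisy data matrix Y. It assumes the squared Frobenius norm of E as the cost function and uses the well-known multiplicative update rules of Lee and Seung; neither choice is prescribed by the text above, which only asks for a suitably designed cost. The function name `nmf_multiplicative`, its parameters, and the synthetic data are hypothetical and chosen purely for illustration.

```python
import numpy as np


def nmf_multiplicative(Y, rank, n_iter=200, eps=1e-9, seed=0):
    """Illustrative NMF sketch: approximate Y by A @ X with A, X >= 0.

    Assumption: the cost is 0.5 * ||Y - A X||_F^2 and the factors are
    updated with Lee-Seung multiplicative rules. This is one common
    choice, not the only cost function or algorithm for the model.
    """
    rng = np.random.default_rng(seed)
    n_rows, n_cols = Y.shape
    # Strictly positive random initialization keeps the updates well defined.
    A = rng.random((n_rows, rank)) + eps
    X = rng.random((rank, n_cols)) + eps
    costs = []
    for _ in range(n_iter):
        # Each update multiplies by a nonnegative ratio, so the factors
        # stay nonnegative without any explicit projection step.
        X *= (A.T @ Y) / (A.T @ A @ X + eps)
        A *= (Y @ X.T) / (A @ X @ X.T + eps)
        # Record the cost to monitor how the residual E = Y - A X shrinks.
        costs.append(0.5 * np.linalg.norm(Y - A @ X, "fro") ** 2)
    return A, X, costs


# Synthetic example (hypothetical data): nonnegative factors plus small noise.
rng = np.random.default_rng(42)
A_true = rng.random((20, 3))
X_true = rng.random((3, 30))
Y = A_true @ X_true + 0.01 * rng.random((20, 30))

A_hat, X_hat, costs = nmf_multiplicative(Y, rank=3)
E_hat = Y - A_hat @ X_hat  # estimated error (residual) matrix
```

Note that the recovered factors are in general not unique: any rescaling or permutation of the columns of A, with the inverse applied to the rows of X, gives the same product AX. In this sketch only the nonnegativity constraint is imposed; constraints such as sparsity or orthogonality would need additional terms or steps.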