We present an analysis of the optical spectra of narrow emission-line galaxies, based on mean field independent component analysis (MFICA), a blind source separation technique. Samples of galaxies were drawn from the Sloan Digital Sky Survey (SDSS) and used to generate compact sets of ‘continuum’ and ‘emission-line’ component spectra. These components can be linearly combined to reconstruct the observed spectra of a wider sample of galaxies. Only 10 components (five continuum and five emission-line) are required to reconstruct essentially all narrow emission-line galaxies to a very high degree of accuracy; the median absolute deviations of the reconstructed emission-line fluxes, given the signal-to-noise ratio (S/N) of the observed spectra, are 1.2–1.8\sigma for the strong lines. After applying the MFICA components to a large sample of SDSS galaxies, we identify the regions of parameter space that correspond to pure star formation and pure active galactic nucleus (AGN) emission-line spectra, and produce high-S/N reconstructions of these spectra. The physical properties of the pure star formation and pure AGN spectra are investigated by means of a series of photoionization models, exploiting the faint emission lines that can be measured in the reconstructions. We are able to recreate the emission-line strengths of the most extreme AGN case by assuming that the central engine illuminates a large number of individual clouds with radial distance and density distributions f(r) \propto r^{\gamma} and g(n) \propto n^{\beta}, respectively. The best fit is obtained with \gamma = -0.75 and \beta = -1.4. From the reconstructed star formation spectra we are able to estimate the starburst ages.
These preliminary investigations demonstrate the success of the MFICA-based technique in identifying distinct emission sources, and its potential as a tool for the detailed analysis of the physical properties of galaxies in large-scale surveys.
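
As an illustration of the reconstruction step only, the following minimal sketch fits a spectrum as a linear combination of a fixed set of component spectra and expresses the residuals in units of the noise. It rests on several assumptions: ordinary least squares stands in for whatever fitting procedure is actually used, the derivation of the components by MFICA is not shown, and the toy components, weights, noise level, and the function name `reconstruct` are all hypothetical.

```python
import numpy as np


def reconstruct(spectrum, components):
    """Fit a spectrum as a linear combination of fixed component spectra.

    spectrum:   array of shape (n_pixels,)
    components: array of shape (n_components, n_pixels)
    Returns the least-squares weights and the reconstructed spectrum.
    """
    # Solve components.T @ weights ~= spectrum in the least-squares sense.
    weights, *_ = np.linalg.lstsq(components.T, spectrum, rcond=None)
    return weights, weights @ components


# Toy example (hypothetical data): two smooth "continuum" shapes
# and two narrow Gaussian "emission lines" on an arbitrary wavelength grid.
rng = np.random.default_rng(0)
wave = np.linspace(0.0, 1.0, 200)
components = np.vstack([
    np.ones_like(wave),
    wave,
    np.exp(-0.5 * ((wave - 0.3) / 0.01) ** 2),
    np.exp(-0.5 * ((wave - 0.7) / 0.01) ** 2),
])
true_weights = np.array([1.0, 0.5, 3.0, 1.5])
sigma = 0.05  # assumed per-pixel noise level
observed = true_weights @ components + rng.normal(0.0, sigma, wave.size)

weights, model = reconstruct(observed, components)

# Absolute residuals in units of the noise, summarised by their median.
median_abs_dev = np.median(np.abs(observed - model) / sigma)
print("fitted weights:", weights)
print("median |residual| / sigma:", median_abs_dev)
```

The fitted weights place each spectrum in a low-dimensional parameter space, which is the kind of space in which regions such as pure star formation and pure AGN could then be identified.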