--fit_reducer
Switch
--fit_reducer
Description
Reduces a feature space to clusters.
Argument and Default Value
If --n_components is not specified, the default number of clusters is 24 (when applicable).
Details
Use --model to select one of the following reduction or clustering algorithms:
NMF - Non-Negative Matrix Factorization by Projected Gradient
PCA - (Principal Component Analysis) Linear dimensionality reduction using Singular Value Decomposition of the data and keeping only the most significant singular vectors to project the data to a lower dimensional space.
SPARSEPCA - (Sparse Principal Components Analysis) Finds the set of sparse components that can optimally reconstruct the data. The amount of sparseness is controllable by the coefficient of the L1 penalty.
LDA - (Linear Discriminant Analysis) A classifier with a linear decision boundary, generated by fitting class conditional densities to the data and using Bayes’ rule.
KMEANS - K-Means clustering
DBSCAN - (Density-Based Spatial Clustering of Applications with Noise) Finds core samples of high density and expands clusters from them. Good for data which contains clusters of similar density.
SPECTRAL - Applies clustering to a projection of the normalized Laplacian. In practice, spectral clustering is very useful when the structure of the individual clusters is highly non-convex or, more generally, when a measure of the center and spread of a cluster is not a suitable description of the complete cluster, for instance when the clusters are nested circles on the 2D plane.
GMM - (Gaussian Mixture Model)
Other Switches
Required Switches:
Optional Switches:
--group_freq_thresh GROUP_THRESH
--save_models
--load_models
--picklefile FILE_NAME
Example Commands
# General syntax
dlatkInterface.py -d <DATABASE> -t <TABLE> -c <> -f <FEATURE_TABLE> --fit_reducer --model <MODEL_NAME>
# Example command
dlatkInterface.py -d primals -t primals_new -c dp_id -f 'feat$1to3gram$primals_new$dp_id$16to1$0_0001' --fit_reducer --model spectral --group_freq_thresh 100
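
The default of 24 can be overridden with --n_components. The sketch below is a variation on the example command, not a tested recipe: it reuses the database, table, and feature table names from that example, and the value 50 and the lowercase model name nmf are assumptions chosen for illustration. It only assembles and prints the command, so it can be checked before being run against a real database.

```shell
# Hypothetical number of components; when omitted, 24 is used (when applicable).
N_COMPONENTS=50

# Single quotes keep the '$' characters in the feature table name literal.
FEAT_TABLE='feat$1to3gram$primals_new$dp_id$16to1$0_0001'

# Assemble the command as a string so it can be inspected first.
# The model name "nmf" is an assumption based on the lowercase "spectral" in the example above.
CMD="dlatkInterface.py -d primals -t primals_new -c dp_id -f '$FEAT_TABLE' --fit_reducer --model nmf --n_components $N_COMPONENTS"

# Dry run: print the command. Paste the printed line into a shell to execute it.
echo "$CMD"
```

Printing the command first makes it easy to confirm that the feature table name survived quoting, since an unquoted `$` in the name would be expanded by the shell.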