Model to use when predicting or clustering

Argument and Default Value
==========================

There is no default model when using --fit_reducer; a model must be given explicitly. For everything else, the default model is ridgecv.
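The default rule above can be sketched in a few lines of Python. This is only an illustration of the stated rule, not the tool's actual code: the helper name ``resolve_model`` and the choice to raise an error when --fit_reducer is used without --model are assumptions.

```python
from typing import Optional

# Default stated in this page for everything except --fit_reducer.
DEFAULT_PREDICTION_MODEL = "ridgecv"


def resolve_model(model: Optional[str], fit_reducer: bool) -> str:
    """Hypothetical helper: pick the model name given the --model argument."""
    if model is not None:
        # An explicit --model always wins.
        return model
    if fit_reducer:
        # Assumption: with no default available, we simply refuse to guess.
        raise ValueError("--fit_reducer has no default model; pass --model")
    return DEFAULT_PREDICTION_MODEL


print(resolve_model(None, fit_reducer=False))
print(resolve_model("spectral", fit_reducer=True))
```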


This switch is used with the following:

* --train_regression
* --test_regression
* --nfold_test_regression
* --predict_regression
* --predict_regression_to_feats
* :doc:`fwflag_predict_cv_to_feats`
* :doc:`fwflag_predict_combo_to_feats`
* :doc:`fwflag_predict_regression_all_to_feats`
* --regression_to_lexicon
* :doc:`fwflag_test_combined_regression`
* --train_classifiers
* --test_classifiers
* --nfold_test_classifiers
* --predict_classifiers
* :doc:`fwflag_predict_class`
* --predict_classifiers_to_feats
* --classification_to_lexicon
* --roc
* :doc:`fwflag_train_c2`
* :doc:`fwflag_test_c2r`
* :doc:`fwflag_predict_c2r`

Using --fit_reducer one can specify the following clustering algorithms:

* NMF - Non-Negative matrix factorization by Projected Gradient (NMF)
* PCA - (Principal component analysis) Linear dimensionality reduction using Singular Value Decomposition of the data and keeping only the most significant singular vectors to project the data to a lower dimensional space.
* SPARSEPCA - (Sparse Principal Components Analysis) Finds the set of sparse components that can optimally reconstruct the data. The amount of sparseness is controllable by the coefficient of the L1 penalty.
* LDA - (Linear Discriminant Analysis) A classifier with a linear decision boundary, generated by fitting class conditional densities to the data and using Bayes’ rule.
* KMEANS - K-Means clustering
* DBSCAN - (Density-Based Spatial Clustering of Applications with Noise) Finds core samples of high density and expands clusters from them. Good for data which contains clusters of similar density.
* SPECTRAL - Applies clustering to a projection of the normalized Laplacian. In practice Spectral Clustering is very useful when the structure of the individual clusters is highly non-convex, or more generally when a measure of the center and spread of the cluster is not a suitable description of the complete cluster, for instance when clusters are nested circles on the 2D plane.
* GMM - (Gaussian Mixture Model)
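The algorithm descriptions above closely follow the wording of scikit-learn's estimator documentation. As a rough orientation only, the sketch below maps each model name to the scikit-learn class that matches its description. The mapping, the case-insensitive lookup, and the helper name ``reducer_class_path`` are assumptions made for illustration; the classes the tool actually instantiates may differ.

```python
# Assumed correspondence between --model names and scikit-learn classes.
# Stored as dotted paths so the sketch runs without scikit-learn installed.
ASSUMED_REDUCER_CLASSES = {
    "nmf": "sklearn.decomposition.NMF",
    "pca": "sklearn.decomposition.PCA",
    "sparsepca": "sklearn.decomposition.SparsePCA",
    "lda": "sklearn.discriminant_analysis.LinearDiscriminantAnalysis",
    "kmeans": "sklearn.cluster.KMeans",
    "dbscan": "sklearn.cluster.DBSCAN",
    "spectral": "sklearn.cluster.SpectralClustering",
    "gmm": "sklearn.mixture.GaussianMixture",
}


def reducer_class_path(model_name: str) -> str:
    """Hypothetical helper: look up a model name, ignoring case."""
    key = model_name.strip().lower()
    if key not in ASSUMED_REDUCER_CLASSES:
        known = ", ".join(sorted(ASSUMED_REDUCER_CLASSES))
        raise ValueError(f"unknown reducer model {model_name!r}; known: {known}")
    return ASSUMED_REDUCER_CLASSES[key]


print(reducer_class_path("SPECTRAL"))
```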

Other Switches
==============

Optional Switches:

* --n_components N
* --group_freq_thresh GROUP_THRESH
* :doc:`fwflag_save_models`
* :doc:`fwflag_load_models`
* --picklefile FILE_NAME
* --sparse
* --no_standardize

Example Commands
================

.. code-block:: bash

    # General syntax
    ./fwInterface.py -d <DATABASE> -t <TABLE> -g <> -f <FEATURE_TABLE> --fit_reducer --model <MODEL_NAME>

    # Example command
    ./fwInterface.py -d primals -t primals_new -g dp_id -f 'feat$1to3gram$primals_new$dp_id$16to1$0_0001' --fit_reducer --model spectral --group_freq_thresh 100
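The feature table name in the example command contains ``$`` characters, which is why it is wrapped in single quotes on the command line. When launching the same command from a script, one option is to build the argument list in Python so that no shell expansion happens at all. The sketch below is an illustration under that assumption; the helper name ``build_fit_reducer_command`` is hypothetical, and only switches shown on this page are used.

```python
import shlex
from typing import List, Optional


def build_fit_reducer_command(
    database: str,
    table: str,
    group_field: str,
    feature_table: str,
    model: str,
    group_freq_thresh: Optional[int] = None,
) -> List[str]:
    """Hypothetical helper: assemble the argument list for a --fit_reducer run."""
    argv = [
        "./fwInterface.py",
        "-d", database,
        "-t", table,
        "-g", group_field,
        "-f", feature_table,
        "--fit_reducer",
        "--model", model,
    ]
    if group_freq_thresh is not None:
        argv += ["--group_freq_thresh", str(group_freq_thresh)]
    return argv


argv = build_fit_reducer_command(
    "primals",
    "primals_new",
    "dp_id",
    "feat$1to3gram$primals_new$dp_id$16to1$0_0001",
    "spectral",
    group_freq_thresh=100,
)

# shlex.join quotes only the arguments that need it (here, the feature table).
print(shlex.join(argv))
```

The resulting list can be passed to ``subprocess.run(argv)`` as-is; because no shell is involved, the ``$`` characters in the feature table name are passed through literally.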