--outliers_to_mean
Switch
--outliers_to_mean [OUTLIER_THRESHOLD]
Description
Set an outlier threshold. After standardization, any feature whose absolute value is greater than the threshold is set to the mean value.
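The behavior described above can be illustrated with a small, self-contained sketch. This is not DLATK's actual implementation: the function name `outliers_to_mean`, the use of the population standard deviation, and the handling of constant features are all assumptions made for illustration. It relies on the fact that, after standardization, the mean of a feature is 0.

```python
from statistics import mean, pstdev


def outliers_to_mean(values, threshold=2.5):
    """Hypothetical sketch: standardize one feature's values, then replace
    any standardized value whose magnitude exceeds `threshold` with the
    mean, which is 0 after standardization."""
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population standard deviation
    if sigma == 0:
        # A constant feature has no outliers; every standardized value is 0.
        return [0.0 for _ in values]
    z_scores = [(v - mu) / sigma for v in values]
    return [0.0 if abs(z) > threshold else z for z in z_scores]


# Ten ordinary values and one extreme value.
values = [1.0] * 10 + [100.0]
default_result = outliers_to_mean(values)       # threshold 2.5
relaxed_result = outliers_to_mean(values, 3.5)  # threshold 3.5
```

In this toy data the extreme value standardizes to roughly 3.16, so it is reset to the mean under the default threshold of 2.5 but left unchanged under a threshold of 3.5.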
Argument and Default Value
Default threshold is 2.5
Other Switches
Required Switches:
Optional Switches:
A regression command, such as --nfold_test_regression, --predict_regression, --test_regression, etc.
A classification command, such as --nfold_test_classifiers, --predict_classifiers, --test_classifiers, etc.
Example Commands
# Runs 10-fold cross-validation to predict users' ages from 1grams.
# Uses the default outlier threshold of 2.5
dlatkInterface.py -d dla_tutorial -t msgs -c user_id -f 'feat$1gram$msgs$user_id$16to16' --outcome_table blog_outcomes \
--outcomes age --combo_test_regression --model ridgecv --folds 10 --outliers_to_mean
# Set the threshold to 3.5
dlatkInterface.py -d dla_tutorial -t msgs -c user_id -f 'feat$1gram$msgs$user_id$16to16' --outcome_table blog_outcomes \
--outcomes age --combo_test_regression --model ridgecv --folds 10 --outliers_to_mean 3.5