=====================
--outcome_interaction
=====================

Switch
======
--outcome_interaction <interaction variable>

Description
===========
Generates correlations for --outcomes, like the default --correlate, but with an additional term: the product of the outcome_interaction variable and the feature group norm.

Argument and Default Value
==========================
interaction variable - The column name of the variable that you would like to use in your interaction term. This column must exist in the outcome table.

Details
=======
These values are generated using least squares linear regression. For each feature/outcome pair, we normalize all variables, including feature group norms, control variables, and outcome variables, by subtracting the mean and dividing by the standard deviation, so that each has a mean of 0 and a standard deviation of 1. We then fit a linear model that predicts the outcome value from the feature group norm, the control variables, the interaction variable, and the product of the feature group norm and the interaction variable:

``B0 + B1*F + B2*C1 + B3*C2 + B4*F*I + B5*I = O_pred``

Here F is the feature group norm, C1 and C2 are control variables, I is the interaction variable, and O_pred is the predicted outcome. From this model, three coefficients per feature are output as rows in the rmatrix:

* xxxxx - corresponds to B1 (the feature group norm)
* xxxx with yyyyy - corresponds to B5 (the interaction variable)
* group_norm * xxxxx from yyyy - corresponds to B4 (the product of the feature group norm and the interaction variable)
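
The procedure described above can be sketched in a few lines of NumPy. This is only an illustration of standardizing the variables and fitting a least squares model with a product term, not the tool's actual implementation. The function names ``zscore`` and ``interaction_coefficients``, the returned dictionary keys, and the synthetic data are all hypothetical. The sketch also assumes that the interaction variable is standardized like the others and that the product term is built from the already standardized variables, which the text above does not specify.

```python
import numpy as np


def zscore(x):
    """Standardize to mean 0 and standard deviation 1 (column-wise for 2-D input)."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean(axis=0)) / x.std(axis=0)


def interaction_coefficients(feature, interaction, controls, outcome):
    """Fit outcome ~ feature + controls + feature*interaction + interaction.

    `controls` has shape (n_groups, n_controls). Returns the three
    coefficients of interest, keyed by hypothetical names.
    """
    f = zscore(feature)
    i = zscore(interaction)
    c = zscore(controls)
    o = zscore(outcome)

    # Design matrix columns: intercept, feature, controls, product term, interaction.
    # Assumption: the product is formed from the standardized variables.
    X = np.column_stack([np.ones_like(f), f, c, f * i, i])
    beta, *_ = np.linalg.lstsq(X, o, rcond=None)

    k = c.shape[1]  # number of control variables
    return {
        "feature": beta[1],
        "feature_x_interaction": beta[2 + k],
        "interaction": beta[3 + k],
    }


# Synthetic data, for illustration only.
rng = np.random.default_rng(0)
n = 500
feature = rng.normal(size=n)
interaction = rng.normal(size=n)
controls = rng.normal(size=(n, 2))
outcome = 0.5 * feature + 0.8 * feature * interaction + rng.normal(scale=0.1, size=n)

coefs = interaction_coefficients(feature, interaction, controls, outcome)
for name, value in coefs.items():
    print(f"{name}: {value:.3f}")
```

Because every variable is standardized before fitting, the coefficients are on a common scale and can be compared across features, which is what makes them usable in place of a plain correlation.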

Other Switches
==============
Required Switches:

* --outcomes
* --outcome_table

Optional Switches:

* --group_freq_thresh
* --outcome_controls

Example Commands
================

.. code-block:: bash

   # Correlates the whitelisted features with LSmean_wavg for every msa_id,
   # including an interaction with Rpercent_2008
   ~/fwInterface.py -d twitterGH -t msgs_en -g msa_id --group_freq_thresh 20000 \
       -f 'feat$cat_met_a30_2000_cp_w$msgs_en$msa_id$16to16' \
       --outcome_table jordan_msa --outcomes LSmean_wavg \
       --interactions --outcome_interaction Rpercent_2008 --output_interaction_terms \
       --outcome_controls median_age_wavg percent_white_wavg mean_income_wavg_log percent_bachelors_wavg \
       --output_name fb2000_intr --rmatrix --sort \
       --whitelist --feat_whitelist 1543 1087 418 943 1309 18 1826 904 1145 1843 648 277 404 785 752 285 1283 91 230 1529 445 1447 1149 1023 800