--clean_messages
Switch
--clean_messages
Description
When used alone it replaces URLs with <URL> and @mentions with <USER>.
When used with:
--deduplicate: it replaces URLs with <URL> and @mentions with <USER> and also removes duplicate tweets.
--language_filter: it removes URLs and @mentions before applying the language filter, but they are not removed from the resulting message table.
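The replacement described above can be illustrated with a minimal Python sketch. This is not DLATK's own code: the regular expressions, the function names clean_message and clean_and_deduplicate, and the choice to compare tweets after cleaning are all assumptions made for illustration.

```python
import re

# Assumed patterns for illustration only; DLATK's real patterns may differ.
URL_PATTERN = re.compile(r"https?://\S+|www\.\S+")
# Require that "@" is not preceded by a word character, so e-mail
# addresses such as name@example.com are left alone.
MENTION_PATTERN = re.compile(r"(?<!\w)@\w+")


def clean_message(text):
    """Replace URLs with <URL> and @mentions with <USER>."""
    text = URL_PATTERN.sub("<URL>", text)
    return MENTION_PATTERN.sub("<USER>", text)


def clean_and_deduplicate(messages):
    """Clean each message, then keep only the first copy of each cleaned text.

    Assumption: duplicates are judged on the cleaned text.
    """
    seen = set()
    kept = []
    for message in messages:
        cleaned = clean_message(message)
        if cleaned not in seen:
            seen.add(cleaned)
            kept.append(cleaned)
    return kept


if __name__ == "__main__":
    print(clean_message("check http://t.co/abc @bob"))
    # prints: check <URL> <USER>
```

Under these assumptions, two tweets that differ only in their link or in the user they mention collapse to the same cleaned text, which is why combining cleaning with deduplication can remove more rows than deduplication alone.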
Argument and Default Value
None
Details
When used alone it creates a new table whose name is the table name given by the -t flag with "_an" appended.
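As a rough sketch of the naming rule, the helper below derives the output table name from the -t value. The "_an" suffix comes from the sentence above; the language-code and "_dedup" suffixes are inferred from the table names in the comments of the example commands on this page. The function cleaned_table_name is hypothetical and is not part of DLATK.

```python
def cleaned_table_name(message_table, language=None, deduplicate=False):
    """Guess the output table name for a --clean_messages run.

    Hypothetical helper, based only on the table names shown on this page.
    """
    if language and deduplicate:
        # This page does not say what name is used when both are combined.
        raise ValueError("combination not documented on this page")
    if language:
        # e.g. --language_filter en  ->  msgs_en
        return f"{message_table}_{language}"
    if deduplicate:
        # --deduplicate  ->  msgs_dedup
        return f"{message_table}_dedup"
    # --clean_messages alone  ->  msgs_an
    return f"{message_table}_an"


if __name__ == "__main__":
    print(cleaned_table_name("msgs"))  # prints: msgs_an
```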
Other Switches
Required Switches:
Optional Switches:
Example Commands
Clean URLs and @mentions:
# creates the table msgs_an
./dlatkInterface.py -d dla_tutorial -t msgs -c user_id --clean_messages
Clean URLs and @mentions while language filtering:
# creates the table msgs_en
./dlatkInterface.py -d dla_tutorial -t msgs -c user_id --language_filter en --clean_messages
Clean URLs and @mentions while deduplicating:
# creates the table msgs_dedup
./dlatkInterface.py -d dla_tutorial -t msgs -c user_id --deduplicate --clean_messages