.. DLATK documentation master file, created by
   sphinx-quickstart on Wed Sep 7 15:59:11 2016.
   You can adapt this file completely to your liking, but it should at least
   contain the root `toctree` directive.

Differential Language Analysis ToolKit
--------------------------------------

DLATK is an end-to-end human text analysis package, specifically suited for social media and social scientific applications. It is written in Python 3 and developed by the World Well-Being Project at the University of Pennsylvania and Stony Brook University.

It contains:

* feature extraction
* part-of-speech tagging
* correlation
* prediction and classification
* mediation
* dimensionality reduction and clustering
* wordcloud visualization

DLATK can utilize:

- `HuggingFace <https://huggingface.co/>`_ for transformer language models
- `Mallet <http://mallet.cs.umass.edu/>`_ for creating LDA topics
- `Stanford Parser <http://nlp.stanford.edu/software/lex-parser.shtml>`_
- `CMU's TweetNLP <http://www.cs.cmu.edu/~ark/TweetNLP/>`_
- `pandas <http://pandas.pydata.org/>`_ dataframe output

Getting Started
---------------

.. toctree::
   :maxdepth: 1

   install
   GitHub Repo <https://github.com/dlatk/dlatk/>
   Getting started in Colab <https://colab.research.google.com/drive/10WMCmnKzwywZR7s2et5xx9CcoWBNmhLY?usp=sharing>
   tutorials
   datasets
   dlatkinterface_ordered
   papers

Citations
---------

If you use DLATK in your work, please cite the following `paper <https://wwbp.org/papers/DLATK_Differential_Language_Analysis_ToolKit.pdf>`_:

.. code-block:: bash

   @InProceedings{DLATKemnlp2017,
     author =    "Schwartz, H. Andrew
                  and Giorgi, Salvatore
                  and Sap, Maarten
                  and Crutchley, Patrick
                  and Eichstaedt, Johannes
                  and Ungar, Lyle",
     title =     "DLATK: Differential Language Analysis ToolKit",
     booktitle = "Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing: System Demonstrations",
     year =      "2017",
     publisher = "Association for Computational Linguistics",
     pages =     "55--60",
     location =  "Copenhagen, Denmark",
     url =       "http://aclweb.org/anthology/D17-2010"
   }

More Information
----------------

* `DLATK GitHub page <http://www.github.com/dlatk/dlatk>`_
* `DLATK at DockerHub <https://hub.docker.com/r/dlatk/dlatk/>`_
* `World Well-Being Project <http://www.wwbp.org>`_
* `Human Language Analysis Beings (HLAB) at Stony Brook <http://hlab.cs.stonybrook.edu/>`_
* `Computational Psychology & Well-Being Lab (CPWB) at Stanford University <https://cpwb.stanford.edu/>`_
* :doc:`modules`
* :doc:`changelog`

DLATK is licensed under a `GNU General Public License v3 (GPLv3) <https://www.gnu.org/licenses/gpl-3.0.en.html>`_.