TDWI Upside - Where Data Means Business

Data Stories: Learn Text Analysis from Pop Lyrics, The Federalist Papers, and Jane Austen

Learn how to analyze text with compression, statistics, and n-grams using examples from pop culture, history, and literature.

Language Compression and Pop Lyrics

The Pudding 

Did you know you can use compression to analyze language? These fun charts and interactive visualizations from The Pudding demonstrate how by measuring repetition in song lyrics.
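To get a feel for the idea, here is a minimal sketch in Python. It assumes that the share of bytes a general-purpose compressor saves is a reasonable stand-in for repetitiveness. It uses zlib from the standard library, which is not necessarily the algorithm behind the linked charts, and both sample texts are invented for illustration.

```python
import zlib


def compression_ratio(text: str) -> float:
    """Return the fraction of bytes saved by compression (higher = more repetitive)."""
    raw = text.encode("utf-8")
    if not raw:
        return 0.0
    compressed = zlib.compress(raw, 9)  # 9 = strongest compression level
    # Can be negative for very short texts, where zlib's overhead dominates.
    return 1 - len(compressed) / len(raw)


# Invented samples: a chorus-like loop versus a sentence with little repetition.
repetitive = "na na na na hey hey hey goodbye\n" * 20
varied = (
    "The quick brown fox jumps over the lazy dog while seven wizards "
    "quietly judge a boxing match beyond the frozen harbor."
)

print(f"repetitive: {compression_ratio(repetitive):.2f}")
print(f"varied:     {compression_ratio(varied):.2f}")
```

The repeated chorus shrinks far more than the varied sentence, which is the signal a repetition analysis would rank songs by.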

 

Bayesian Statistics and the Federalist Papers

Priceonomics 

The Priceonomics blog presents this in-depth article on using statistics to determine the authors of anonymous works, from counting words by hand on paper slips to today’s automatic systems.
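As a rough illustration of the statistical approach, the Python sketch below scores a disputed text against writing samples from two candidate authors. It is a simplified naive Bayes model with add-one smoothing and a uniform prior, not the method described in the linked article. The marker-word list, the author names, and all sample texts are hypothetical.

```python
import math
from collections import Counter

# Hypothetical marker words; a real study would select these from the data.
MARKERS = ("upon", "while", "whilst", "on")


def marker_counts(text):
    """Count only the marker words in a text."""
    return Counter(w for w in text.lower().split() if w in MARKERS)


def posterior(disputed, known_texts):
    """Return P(author | disputed text), assuming a uniform prior over authors."""
    doc = marker_counts(disputed)
    log_scores = {}
    for author, text in known_texts.items():
        seen = marker_counts(text)
        total = sum(seen.values()) + len(MARKERS)  # add-one smoothing
        log_prior = math.log(1 / len(known_texts))
        log_like = sum(n * math.log((seen[w] + 1) / total) for w, n in doc.items())
        log_scores[author] = log_prior + log_like
    # Normalize in log space to avoid underflow on longer texts.
    peak = max(log_scores.values())
    weights = {a: math.exp(s - peak) for a, s in log_scores.items()}
    norm = sum(weights.values())
    return {a: w / norm for a, w in weights.items()}


# Invented writing samples for two hypothetical authors.
known = {
    "author_a": "upon reflection we act upon this while others wait upon events",
    "author_b": "whilst we rest on this point others rely on reason whilst they wait",
}
result = posterior("they insist upon it while we look on", known)
print(result)
```

The design choice worth noting is that only small, common words are counted: authors tend to use them habitually, whatever the topic.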

 

N-grams and Jane Austen

juliasilge.com 

How would you identify patterns of words that occur together? Data scientist Julia Silge uses n-grams to track differences in how male and female characters are described in Austen’s novels, and she explains every step of her process in this blog post.
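For readers who want to try the basic idea, the following is an independent Python sketch and not the code from the linked post. It builds bigrams (two-word n-grams) and tallies which word follows "he" and "she". The sample sentences are invented, not taken from Austen, and the tokenizer is a deliberately simple assumption.

```python
import re
from collections import Counter


def ngrams(tokens, n=2):
    """Return all runs of n consecutive tokens as tuples."""
    return list(zip(*(tokens[i:] for i in range(n))))


def words_after(text, pronouns=("he", "she")):
    """Count the word that immediately follows each pronoun."""
    # Simple tokenizer: lowercase runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = {p: Counter() for p in pronouns}
    for first, second in ngrams(tokens, 2):
        if first in counts:
            counts[first][second] += 1
    return counts


# Invented sample text for illustration only.
sample = (
    "She remembered the letter. He replied at once. "
    "She felt relieved, and he replied again."
)
result = words_after(sample)
for pronoun, counter in result.items():
    print(pronoun, counter.most_common(3))
```

On a full novel, comparing the two tallies shows which verbs are paired more often with one pronoun than the other.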

About the Author

Lindsay Stares is a production editor at TDWI. You can contact her at lstares@tdwi.org.

