Text Analysis with R for Students of Literature – Book Review | Tunguz Review

July 14, 2015 by Bojan Tunguz

Our ability to access, process, and analyze large quantities of data has been increasing at a dizzying pace over the last few years. This data-driven revolution is fundamentally changing many professional and academic fields. Many people, especially long-term practitioners in the humanities and similar disciplines, find this change worrying, and in many ways exactly contrary to the spirit of these disciplines. Poring over long and demanding texts, while internalizing them and becoming personally immersed in them, seems to be at the very core of what these disciplines are all about. And yet, as both a lover of the humanities and a die-hard techie, I find this latest development incredibly exciting.

The title of this short book makes it eminently clear who the intended audience is: students of literature who are interested in using R for textual analysis. R is a very powerful programming language used for statistical analysis. Textual analysis is a very prominent aspect of modern data science, so there are many well-known and established tools and techniques that can help one with this task. However, the aim of this book is not to teach R or programming, but to give literature students just the most basic tools needed to do some relatively straightforward textual analysis. The book jumps straight into the examples almost from the very first page. The obvious virtue of this approach is that you can start doing some interesting work rather quickly, and as long as your own research doesn’t depart dramatically from the examples given in the book, you should be able to use the book as a reference and a primer for your own work. However, if you are working on somewhat more demanding problems, then after finishing this book you might want to turn to a specialized book on R programming that will give you enough of a foundation to tackle a larger variety of problems.

The book takes the freely available text file of “Moby Dick” and runs a variety of textual analyses on it: simple word counts and word frequencies, correlations between various “special” words, context analysis, etc. In the later chapters it moves from a single book to a corpus of books for a more interesting look at themes across many texts. I found the last chapter, on topic modeling, especially fascinating, but way too brief. I guess I will now have to take a look at other sources to learn more about this line of analysis.
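To give a flavor of the simplest of these analyses, here is a minimal sketch of a word count and relative-frequency calculation. It is written in Python rather than R and is not taken from the book; the function name `word_frequencies` and the short sample sentence are made up purely for illustration, and a real analysis would read in the full text of the novel instead.

```python
import re
from collections import Counter


def word_frequencies(text):
    """Return (raw counts, relative frequencies) for the words in text."""
    # Lowercase the text and pull out runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    total = sum(counts.values())
    # Relative frequency: the share of all word tokens taken up by each word.
    relative = {word: n / total for word, n in counts.items()}
    return counts, relative


# Made-up sample sentence standing in for a full novel (an assumption).
sample = "The whale, the white whale! Ahab chased the whale."
counts, relative = word_frequencies(sample)

# Print the three most common words with their counts and percentages.
for word, n in counts.most_common(3):
    print(word, n, round(100 * relative[word], 1))
```

Relative frequencies, rather than raw counts, are what make it possible to compare word usage between texts of different lengths, which matters once the analysis moves from a single novel to a corpus.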

This book is very pedagogical in its style. The author often presents two different solutions to a particular problem: one using a single compact yet hard-to-understand R command, and another broken down into several self-contained chunks. I find this approach very educational and helpful.

Even though this is primarily a book intended for literature students, I would actually strongly recommend it to anyone interested in text mining, text analysis and natural language processing. It is a very gentle and approachable introduction to the whole world of textual analysis.

**** Electronic version of the book provided for review purposes. ****

About the Author

Bojan Tunguz

Bojan Tunguz was born in Bosnia and Herzegovina, which he and his family fled during the civil war for neighboring Croatia. Over the past two decades he has studied, lived and worked in the United States. He is a theoretical physicist with degrees from Stanford and the University of Illinois. Tunguz has taught physics at several prominent liberal arts colleges and has been writing about physics, science and technology for more than a decade. He also has a wide spectrum of interests, and reads and writes about current events, society, culture, religion and politics. Over the years he has reviewed many of the books that he has read, and posted his reviews on various online outlets. In 2011 he became a top 10 reviewer on Amazon.com, where he continues to be very active. Aside from reading and writing, Tunguz enjoys traveling, digital photography, hiking, and fitness. He resides with his wife in Indiana. You can follow his review updates on the following pages as well: Facebook: http://www.facebook.com/tunguzreview Twitter: http://www.twitter.com/tunguzreviews Google+: https://plus.google.com/u/0/104312842297641697463/posts


