Week 6 Practicum – Dazed and Confused



This week’s practicum exercises made the left side of my brain hurt. (OK, admittedly, my whole head hurt; I began to feel the room spin, and as you are reading this, I am probably still holding my head in my hands!) But since I identify more with “right-brained” thinking, I suspect exercising the left side made it hurt more.

Nevertheless, I persevered through the exercises! I know that I am not getting the full capability out of the text mining resources, so I will be very excited to see what we do in class on Monday. I began with the introductory video to Voyant and read through some of the instruction guides to get a better sense of how the website worked. Then I uploaded a dissertation that I have been working with for our first project (“Evaluation of Noisy Transcripts for Spoken Document Retrieval”). I thought I would have some fun and see what I could gain from mining the document for key phrases or words.

Unfortunately, I do not think I accomplished much. The Wordle picture was interesting, surfacing mostly research-oriented words like “query”, “segmentation” and “analysis”. Voyant also helpfully let me know where those words appeared in the document, so I could get to them quickly.

I played a bit with the frequency of words appearing in the text. Again, I found it interesting to see how frequently, and where, in the text such words appeared, but I did not gain anything from this additional information. Of course, I did not have anything specific in mind when I went searching. I was just having fun. Perhaps this colored my perspective on the tool. Like anything in research (or life, for that matter), it appears that having a direction and goal in mind helps shape the final product significantly.

Next, I decided to play with Google Ngram for a bit. The articles made that sound more fun, and easier to use. So I played. Full disclosure: I am not particularly creative, so the most interesting thing I did was enter single words. I realize that Ted Underwood said the most interesting things would come from 4- or 5-grams. However, I am just not that good at identifying those phrases (I tried a few times and failed miserably). So I reverted to single words, like “dog”, “cat”, “war” (peaking shortly before 1920 and shortly after 1940, go figure) and “peace” (also peaking shortly before 1920 and shortly after 1940). This was far more interesting than I thought it would be!

In case you were wondering, “happy” peaked at the beginning of the period covered (1800) and has been on the decline ever since. “Sad” has gone up and down, peaking around 1860 (curious) and then dropping again to about where it was in 1800. “Dog” has risen and fallen over the same time period, whereas “cat” has simply been on the rise. “Equanimity” peaked around 1860 and then fell to about level with its 1800 frequency. Here, I was checking our vocabulary, to see if it really HAS been on the decline. “Obfuscate” appears hardly at all until 1960, then rises sharply. So interesting to see how words become popular! Also of note, our professionalism has experienced a significant decline. “Sir” has nearly dropped from use altogether.

So herein lie my forays into the world of text mining.

Again, I am looking forward to class. I tried to get a better understanding of the technology involved by going into Google’s explanations a bit more and reading additional material available on Voyant, but all of that just made my head hurt more. I felt like I was in high school physics class again. So I can see why Professor Leon said this is the point in the semester where people begin to get overwhelmed. I’m not quite overwhelmed, because I know this is just one more tool in our ever-expanding toolbox. However, I could use a bit of professional expertise in applying the tool. See you all in class!


  1. I’m reading a book for my seminar on the American Revolution called “Passion is the Gale,” which analyzes the subtle changes in contested emotions during the 18th century. The author actually performed a text mining exercise using the “Pennsylvania Gazette” and searched for the frequency of words such as “mercy,” “pity,” “compassion,” etc., which yielded very interesting results. So don’t feel bad about your “lack of creativity”: you never know what you can gauge from seemingly random data!

  2. Although I found the Voyant tool to be very useful for my research, I completely understand what you mean by having a goal in mind. But in the spirit of Trevor Owens and Michael Edson, I began questioning this “goal” philosophy. Time constraints and practical application of a search are important factors when constructing a paper. Yet I wonder where a simple search could lead when I am brainstorming.

    This is not intended to be a suggestion, just a thought from my reading reflections this semester!

  3. Don’t worry, you’re not the only one who is confused at this point. I had to walk away from my computer after failing miserably with my data mining attempts. But I like your creative use of Google Ngram!
