jr8237, Jonathan Rosenberg
Overall, I found this lab to be quite interesting. I especially enjoyed testing out word trends with Google. The one thing that I think could be improved a bit is the directions for Voyant. It seems that there are really only a few different tools, so I was a tad confused when asked which tools I found the most useful. Other than that, everything went fine when doing this lab.
Part 1
The first graph displays the usage of the word "call," with the red line showing how often it is used as a verb and the blue line showing how often it is used as a noun. I used a part-of-speech tag to create this grammatical distinction in the graph. I correctly predicted that the verb would be used more than the noun, since this matches my own speech patterns. For instance, I am much more likely to say "Someone is calling me" than I am to say "I am getting a call." However, it is interesting that the usage of "call" as a verb decreased from the late 19th century into the mid-20th century before beginning to rebound around 1980. If I had to guess the cause behind the rebound, I would conjecture that it is due to the increasing popularity of cell phones, although I have no evidence to back this up.
The second graph displays the most popular word combinations beginning with the word "south." I used the wildcard search feature to create this trend. I found this graph really interesting, as it produced some unexpected results. For much of the 19th century, South Carolina was the most popular term, but South Africa took the lead around the turn of the 20th century. I would wager that South Wales refers to the Australian state of New South Wales rather than the southern region of Wales. The most fascinating detail to me was how South Dakota experienced a brief surge in the 20th century, reaching its peak around 1940.
Part 2
These graphs display data about Nathaniel Hawthorne's The Scarlet Letter. I chose this specific book because it is one of my favorites. Many people consider it to be dull or slow-paced, but I am quite fond of it. The first image is a word cloud; it depicts the 45 most frequently used words in the book, excluding words that we might deem boring, such as articles and prepositions. The larger a word appears in the word cloud, the more often it is used in the book. Unsurprisingly, the largest word that appears is "Hester," the name of the main character. I suspect that some words also come from the notes added by Project Gutenberg, such as "illustration" and even "gutenberg." I found the most useful tool to be the word trends. They show the relative frequencies of words across segments of the book. The second and third images above are made using this tool. They show the same data but using different display methods. I personally find the line method to be more transparent, but I think the area method looks better.
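The two ideas behind the word cloud and the word trends can be sketched in a few lines of Python. This is only an illustration, not how Voyant itself is implemented: the stop-word list, the sample sentence, and the function names `top_words` and `relative_frequencies` are all made up for the example.

```python
from collections import Counter
import re

# Tiny illustrative stop-word list (an assumption; a real list is much longer).
STOP_WORDS = {"the", "a", "an", "of", "in", "to", "and", "was", "her", "on"}

def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z']+", text.lower())

def top_words(text, n=45):
    """Return the n most frequent words that are not stop words."""
    words = [w for w in tokenize(text) if w not in STOP_WORDS]
    return Counter(words).most_common(n)

def relative_frequencies(text, word, segments=10):
    """Split the text into equal segments and return the word's share of each."""
    tokens = tokenize(text)
    size = max(1, -(-len(tokens) // segments))  # ceiling division
    chunks = [tokens[i:i + size] for i in range(0, len(tokens), size)]
    return [chunk.count(word) / len(chunk) for chunk in chunks]

# Hypothetical sample text, not a quote from the book.
sample = ("Hester stood on the scaffold. Pearl looked at Hester. "
          "The scarlet letter was on her bosom.")
print(top_words(sample, n=3))
print(relative_frequencies(sample, "hester", segments=2))
```

The first function corresponds to the word cloud (bigger count, bigger word), and the second to the word trends (one value per segment of the text).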
Part 3
Two words that could have very different interpretations, depending on context, are "sick" and "killed." In their most common usages, these words typically carry negative connotations. Sentimood reflects this, giving a score of -2 to "sick" and -3 to "killed." Yet, both of these terms can also be used positively, such as "That was pretty sick" or "You totally killed that!" Two words that I believe Sentimood got wrong are "lenient" and "skeptical." The website labels the former as positive and the latter as negative, but I believe that it should be the other way around. After all, people are often told to be a bit skeptical in life and are also reprimanded for being too lenient.
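A minimal sketch of why slang trips up a word-by-word scorer is shown below. It assumes a tool that looks up each word in a fixed list and adds up the scores; I do not know whether Sentimood works exactly this way. The only scores in the toy lexicon are the two reported above, and the function name `score` is made up for the example.

```python
import re

# Toy lexicon: the two scores below are the ones reported above for Sentimood.
# A real lexicon would hold far more entries.
LEXICON = {"sick": -2, "killed": -3}

def score(sentence, lexicon=LEXICON):
    """Sum the lexicon score of every word; unknown words count as 0."""
    words = re.findall(r"[a-z']+", sentence.lower())
    return sum(lexicon.get(w, 0) for w in words)

print(score("That was pretty sick"))      # -2
print(score("You totally killed that!"))  # -3
```

Because each word is scored without its context, both compliments come out negative.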
There were a few times when MeaningCloud and Sentimood disagreed. For example, Sentimood considered "Break a leg out there!" to be neutral, while MeaningCloud considered it to be negative. Another example is "I totally am not sad right now." Sentimood found this sentence to be negative, and MeaningCloud found it to be positive. However, there were also several instances where the websites agreed but were very off in their assessments. Both Sentimood and MeaningCloud considered "You are pretty terrible" and "He is positively bad at being kind" to be positive statements. Yet, these both carry quite negative connotations. This is likely because I used words that usually read as positive ("pretty" and "positively") as adverbs that intensify a negative description.
Part 4
To test the Google and Bing translators, I pulled quotes from Nathaniel Hawthorne's The Scarlet Letter and then converted them from English to German and back to English again. I noticed that they have a tendency to mix up some adjectives, especially ones that have similar meanings to other adjectives. For example, Google converted "dusky and dark" into "murky and dark," while Bing converted it into "dark and dark." Murky and dusky are not really similar, so I would consider this a bit of a failure. An arguably worse translation comes with the phrase "strange and wicked eccentricities." While Bing got the phrase basically right (changing "wicked" to "bad"), Google translated the phrase to "strangers and bad guys." However, there were many instances where both translators produced exactly the same results. I used longer sentences with somewhat uncommon words to test the translators. The following two sentences turned out exactly (or almost exactly) the same:
1) "The minister knew well that he was himself enshrined within the stainless sanctity of her heart, which hung its snowy curtains about his image, imparting to religion the warmth of love, and to love a religious purity."
2) "All the time that he gazed upward to the zenith, he was, nevertheless, perfectly aware that little Pearl was pointing her finger towards old Roger Chillingworth, who stood at no great distance from the scaffold."
Overall, I would say that the Google and Bing translators are quite effective. Based on multiple trials, Bing tends to do a better job at translation than Google. Although both services make errors regularly, the translated sentences still typically convey a similar meaning. If you are trying to get your point across to someone in a foreign language, then these translators are great. However, if you are trying to create a grammatically perfect sentence, then these translators probably will not do the trick.
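One rough way to put a number on how much a round-trip translation changed a phrase is to compare the words before and after. The sketch below uses Python's standard `difflib` module on the "dusky and dark" example from above; the translated strings are typed in by hand, since no translation service is called, and the function name `word_similarity` is made up for the example.

```python
from difflib import SequenceMatcher

def word_similarity(original, round_trip):
    """Return a 0-1 similarity ratio computed over words, not characters."""
    a = original.lower().split()
    b = round_trip.lower().split()
    return SequenceMatcher(None, a, b).ratio()

original = "dusky and dark"
# Round-trip results copied from the report above.
results = {"Google": "murky and dark", "Bing": "dark and dark"}
for name, text in results.items():
    print(name, round(word_similarity(original, text), 2))
```

This only measures word overlap, so a close synonym and a plainly wrong word are penalized equally. Judging whether the meaning survived still takes a human reader.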
Part 5
In the first image, the two classes of photos I used were from my graduation day and from the day of my senior photos, respectively. I chose these two groups because I had a decent number of photos from each one. Although the first class appears to contain the same image over and over, they are actually all different images. The images from my senior pictures feature me in different poses, but I am wearing the same blue shirt in all of them. The machine did a pretty decent job of differentiating the images, and adding more photos made only a slight difference. For the second pair of classes, I used pictures of my dog (Laney) against pictures of elephants I found online. The machine was able to accurately differentiate the images, and thus, there was no need for me to add more pictures. I chose these two types because I have many pictures of Laney, and nothing seemed less like Laney than an elephant.