When playing with Sinclair and Rockwell’s Voyant, I decided to look at the entire Sherlock Holmes corpus in order to view trends that span the collection. I experimented with the different data visualization formats (the most interesting of which was the bubbles, which made a shrill, high-pitched noise as it went through the entire text word by word), but the most useful one from my perspective was the default Cirrus (word cloud) format.
With Cirrus, you see a giant word cloud of the most frequently used words, with each word’s size corresponding to its number of occurrences in the Sherlock Holmes corpus. After filtering out the common and irrelevant words (by selecting the Taporware option in the “Stop Words List”), the user is left with a more relevant view of the text. One of the first things I noticed was the relatively large word “man.” Thinking back to our class discussion of Arthur Conan Doyle’s perspective on women (a la “A Scandal in Bohemia”), I found it interesting that the word “man” was so much more prevalent than any word pertaining to women. The largest I could find was “lady,” which at 176 instances was dwarfed by both “sir” (at 323 instances) and “man” (weighing in at a whopping 902 instances).
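To make the counting behind a word cloud concrete, here is a minimal Python sketch of the same idea: split a text into words, drop stop words, and tally what remains. This is not how Voyant itself is implemented. The `STOP_WORDS` set is a tiny made-up list (not the Taporware list), and `word_counts` and the sample sentence are hypothetical names and data chosen only for illustration.

```python
import re
from collections import Counter

# Tiny illustrative stop list. This is an assumption for the example,
# NOT the Taporware list that Voyant offers.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "at", "was", "is", "it"}

def word_counts(text, stop_words=STOP_WORDS):
    """Count lowercase word tokens in text, skipping stop words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(t for t in tokens if t not in stop_words)

# Made-up sample text standing in for the real corpus.
sample = "The man looked at the lady, and the lady looked at the man. A man of few words."
counts = word_counts(sample)

# Compare the words discussed above; a Counter returns 0 for absent words.
for word in ("man", "lady", "sir"):
    print(word, counts[word])
```

In a word cloud, each of these counts would then be mapped to a font size, so “man” would be drawn larger than “lady” in this toy example as well.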
The Cirrus view also keeps the Words in Documents window, which lets the user see exactly how many times a word occurs in each document. I found it interesting that, out of 36 stories, the word “lady” did not come up in 12. I feel this reveals a little more about Arthur Conan Doyle’s perspective on women: although he may respect them (by making one a protagonist who outsmarts Holmes in “A Scandal in Bohemia”), he still grew up in a patriarchal society that was mainly concerned with reading about men rather than women.
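The per-document breakdown can be sketched the same way: count one word in each text separately, then list the texts where it never appears. This is only a rough illustration under stated assumptions. The `documents` dictionary holds three invented snippets with placeholder titles (not the actual stories), and `occurrences` is a hypothetical helper.

```python
import re

def occurrences(word, text):
    """Count whole-word, case-insensitive matches of word in text."""
    pattern = r"\b" + re.escape(word.lower()) + r"\b"
    return len(re.findall(pattern, text.lower()))

# Invented stand-in texts with placeholder titles, not the real stories.
documents = {
    "story_a": "The lady spoke first. The man said nothing.",
    "story_b": "A man came to the door with another man.",
    "story_c": "The lady and the old lady left together.",
}

# How many times "lady" occurs in each document.
per_doc = {title: occurrences("lady", text) for title, text in documents.items()}

# Documents in which the word never appears.
missing = [title for title, n in per_doc.items() if n == 0]

print(per_doc)
print(missing)
```

Run over the full corpus, the length of `missing` would correspond to the number of stories in which the word does not come up.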
There is one thing I would have liked to see, though: more than one word on the Word Trends graph. I’m sure Voyant has this capability, and I’m fairly sure I’ve gotten it to work before, but for some reason tinkering with it this time did not yield the same results as last time.