A note to former and future students, as well as visitors to this site: this site documents “Studies in Communication and Culture: Data,” a course conducted in the School of Literature, Media, and Communication at Georgia Tech in Spring 2013. The content will remain up as a resource for students and other inquiring minds. Please contact me, Lauren Klein, at email@example.com, if you have any questions about the material that appears on the site.
On a final note, there is a video game coming out soon called Watch Dogs that is all about data. I don’t know much, but the main character has an entire city’s data stored in his phone. He can pull up anyone’s information, control people’s phones, and do all sorts of other crazy stuff. Here is a sample of the gameplay. It’s 9 minutes long, but worth it. If nothing else, you need to see the part with the traffic light at 6:30.
LCC 3206 Final Project
For my project I continued from my previous assignments, using the data set of annual family income in the state of California. In my second midterm assignment I tried to show pure neutrality and created a visualization that is absolutely unbiased, whereas for this project I created graphs that may appear unbiased but can easily be misinterpreted. In the first visualization, the different wages of each race are shown on a radar graph. This form was inappropriate for this data set because many parts of the graph are hidden behind the graphs for other races. In the second visualization, some races seem inferior to the larger categories just because they have a much smaller population. For example, in the higher annual income ranges the smaller categories are practically nonexistent, which makes it look like no one of that race or ethnicity falls in that annual income range. This becomes misleading, since in reality each race has about the same percent of its population in each annual income range. I then discussed how people can create these kinds of visualizations on purpose for their own benefit.
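The point about small populations can be made concrete with a minimal sketch. It assumes the data is available as raw head counts per income range for each group. The group names, range labels, and numbers below are invented for illustration and are not taken from the California data set. Converting counts to a percentage of each group's own total puts a small group on the same scale as a large one.

```python
def percent_within_group(counts):
    """Convert raw counts per income range into percentages of each group's total."""
    shares = {}
    for group, ranges in counts.items():
        total = sum(ranges.values())
        if total == 0:
            raise ValueError(f"group {group!r} has no recorded population")
        shares[group] = {label: 100.0 * n / total for label, n in ranges.items()}
    return shares


# Hypothetical counts, invented for illustration only.
counts = {
    "Group A": {"under $50k": 6000, "$50k-$100k": 3000, "over $100k": 1000},
    "Group B": {"under $50k": 60, "$50k-$100k": 30, "over $100k": 10},
}

shares = percent_within_group(counts)
for group, ranges in shares.items():
    print(group, ranges)
```

Plotted as raw counts, the hypothetical Group B would nearly vanish next to Group A, even though both have the same share of their population in every range.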
For my final project, I focused on the number of industries within Atlanta and the strength of their presences, using data available from the United States Census Bureau’s County Business and Demographics interactive map, found here: http://www.census.gov/cbdmap/. I provided eight different visualizations, some of which I collected from the website and some of which I created with Microsoft Excel, Adobe InDesign, or ManyEyes. These visualizations were chosen based upon their displayed information hierarchy, their usefulness to a particular audience, and the voice they bring to the data via their representation of it.
I wanted to understand how the data might be represented differently with different approaches to visualizing it. Some of the concepts, a few of which were inspired by the “New York City Regional Foodshed Initiative” or “Tweets from Hillary,” used color and size to illustrate the data aesthetically and make it easy to compare the strength of one industry presence with another, sometimes in a rather abstract or artistic way. Others were traditional representations, such as a bar chart and a matrix chart, which focus on the numeric aspect of the data.
While compiling the visualizations for the data, I came to two conclusions. First, the audience for a data visualization must be kept at the forefront of the designer’s mind, as different visualizations may evoke very different emotions or reactions within a particular audience. For instance, if the tree map of the data were presented to a board room of executives trying to understand the industry needs of the city, they would be entirely confused and probably frustrated. However, an artist might find this representation much more appealing, as it focuses upon the visual appeal of the data and how it might be represented conceptually. Second, though data visualizations may begin with different goals in mind, organized and useful data visualizations must contain some sort of hierarchy in order to make sense, as Alex Wright posits. Hierarchy commands the order of the data and brings significance to the viewer and is, therefore, necessary to meaningful representations.
I have an interest in art, especially in exploring what kind of specific visual aesthetics can grow out of technology-based spaces.
I’m one of those people … I take pictures of my feet. Pictures like these, which make up my Final Project data set:
It started last summer as a way to prove that I had actually gotten dressed up for work each day. But I have since continued taking pictures of my feet, adopting it as my way of documenting what I have worn, where I have been, who has been with me, etc. To my friends and family who might see my “shoe shots”, they are just pictures of my feet, but these photographs are incredibly successful at jogging my memory. In creating a visual representation for my data set as a whole, I wanted to arrange the pictures of my feet in a way that could make them mean something to other people.
For this project, I chose to arrange my data set using Google Maps, pinning only locations for which I have a corresponding “shoe shot”. Below are a few screenshots, but you can also click here to see the whole map.
Currently, the data set and the visual representation are narrow; as a student, I frequent many of the same places, so the spatial storytelling is limited. It is my intention, however, to continue this project and to focus on the “where I went” aspect of the photographs as I study abroad this summer and travel throughout Europe. While I do not explore my email archives like Stephen Wolfram or keep track of the number of coffees I order each year like Nicholas Felton, I believe that my “shoe shots” contain a large amount of personal information. The locations pinned on the map that I visit most often (and which have the highest number of corresponding photographs) are clustered and clearly indicate my status as a student who travels primarily on foot. In addition, it is possible to make assumptions about the environment and the responsibilities I have in certain places, based on the type of shoes I wear when I am there. It would also be possible to make judgments about my fashion preferences by examining the included photographs. A viewer could form any number of theories about me from the “shoe shots” on the map.
Though not much of a narrative or “data diary” account now, I believe that the map will gain strength in those areas as time progresses. The map will become an interesting way to trace my life developments (such as my trip to Europe, my graduation from college, or my potential relocation for the start of my career), visually (through my “shoe shots”) and spatially with the help of pinned photo locations.
With graduation right around the corner, I wanted to make a visualization that compared multiple southeast cities against one another in multiple categories. I chose six categories that I felt were important to many young graduates when deciding where to pursue their career and move into the next stage in life: walkability, mass transit, parks, average income, average housing costs, and how much of a sports town the city is.
It can be stressful to make huge, life-altering decisions, and I noticed that there wasn’t a place where an individual could go to decide on the best location to continue their career. I wanted to make something where individuals could come and interact with the data.
For my final project, I wanted to take a historic data visualization and twist it to fit more meaningful data. I did this by using the form of John Snow’s 1854 map of the Broad Street cholera outbreak, a simple map with hash marks representing the data, but with a map of Georgia Tech and with my expenses for the last semester as the data points. I then created two maps: one which depicted the number of trips I took to different restaurants and stores over the semester and another that depicted the average amount of money that I spent at these establishments per visit.
In order to create these maps, I took screenshots of Google Maps around the Georgia Tech area. Once I had captured Georgia Tech and the surrounding area, I pieced the multiple screenshots together and began tracing the roads and buildings. In order to get the spending data, I downloaded my bank statements for the past semester and created a spreadsheet of each transaction that took place in a brick-and-mortar store, recording the store (with its location) and the amount spent. I then put the map and data together, using hashes to represent either one visit to a store or one average dollar spent at a store. At each location, I stacked the relevant hashes into columns, almost like the bars of a bar chart. I kept the style of the map and hashes as close to the John Snow maps as possible, using simple black lines to create roads and smaller, thicker black lines for the hash marks. This allowed me not only to create a more accurate reproduction of Snow’s map but also to represent the data in a simple, effective form.
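The step from the transaction spreadsheet to the two sets of hash marks can be sketched as follows. This is a minimal illustration under stated assumptions, not the process actually used for the maps: it assumes each spreadsheet row reduces to a (store, amount) pair, it rounds the average to whole dollars so that one hash can stand for one dollar, and the store names and amounts are invented.

```python
from collections import defaultdict


def summarize_spending(transactions):
    """Return visit count and rounded average dollars per visit for each store."""
    totals = defaultdict(float)
    visits = defaultdict(int)
    for store, amount in transactions:
        totals[store] += amount
        visits[store] += 1
    return {
        store: {
            "visits": visits[store],
            # Whole dollars, so one hash mark can stand for one average dollar.
            "avg_dollars": round(totals[store] / visits[store]),
        }
        for store in visits
    }


def hash_column(count):
    """Render a stack of hash marks as text, one mark per unit."""
    return "|" * count


# Hypothetical transactions, invented for illustration only.
transactions = [
    ("Campus coffee shop", 4.00),
    ("Campus coffee shop", 6.00),
    ("Off-campus grocery", 30.00),
]

summary = summarize_spending(transactions)
for store, stats in summary.items():
    print(store, "visits:", hash_column(stats["visits"]))
    print(store, "average:", hash_column(stats["avg_dollars"]))
```

The first map would use the visit columns and the second the average columns, placed at each store's location.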
By using John Snow’s map as inspiration and form, I was able not only to plot where I spend my money, but also to see trends in location in relation to where I spend most of my time: on campus, around Skiles and my apartment. I tried to go into the project open minded, but already having a sense of what I was trying to find, and knowing that I was probably going to see a location-based trend, kept me biased towards finding some structure within the data. The size of the maps and hashes and the inclusion and exclusion of certain roads all biased how the final map appeared and how the data was represented. However, these biases still allowed me to find correlations between location and spending habits.
In general, in areas closer to where I live, I spend less per visit but visit more often. For example, I had more transactions at the Student Center and CULC Starbucks than at all of the other stores put together. However, the average amount of money I spent at these locations was less than at many of the stores outside of Tech’s campus. Following this logic, the farther away a store or restaurant is, the more likely I was to spend more money per trip over fewer trips.
For this final project, I chose to analyze a dataset of all the Top 20 routines from the first four seasons of the reality television show So You Think You Can Dance. This dataset provided the style of dance, which couple performed the routine, the song choice, who choreographed it, which judges were on the panel at that time, and how the couple fared in the result show. With all this information, I chose to visualize the correlation between style and result using a series of pie charts. I created four separate HTML pages, one for each season, with a pie chart for each style that was performed that season. Each pie chart is broken down by how the couple who performed that routine fared in the result show: whether they were safe, put into the bottom three, eliminated, and so on. When you hover over each slice of the pie, a box pops up to tell you what kind of result it was and what percentage it makes up. Also, when you hover over the title in the center, it tells you how many times the style was performed. However, I was unable to make the sizes of the pie charts proportional to this total to create a more visually clear message, which is why I created a home page that features a static image of pie charts for all the styles, with sizes based on their totals across all four seasons. What I had originally intended to analyze with this data and my visualization was some correlation between the styles the dancers performed and how they fared in the result show that same week. However, because I was unable to build my visualization to the extent that I wanted, I found myself analyzing the structure of my visualization instead.
I drew on Tufte’s statement that “The representation of numbers, as physically measured on the surface of the graphic itself, should be directly proportional to the numerical quantities represented” (Tufte 56) to analyze how not being able to visually include this piece of data may affect how the data is interpreted by others. I also considered, however, how the factors that I chose to leave out, such as which dancers and which choreographers were involved in the performance, and not just the style, could have affected the result. In the end there is no way to visualize every contributing factor; at some point you need to draw the line and choose a stance to interpret.
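One way to size pie charts in line with the Tufte quotation is sketched below. It rests on an assumption that is a design choice rather than anything stated in the project: that a viewer reads a pie's area, not its radius, as its size. Under that assumption the radius has to grow with the square root of the total. The function name, the 100-pixel maximum, and the example totals are all hypothetical.

```python
import math


def pie_radius(total, max_total, max_radius=100.0):
    """Radius that makes a pie's area proportional to its total.

    Area grows with the square of the radius, so the radius is scaled by
    the square root of total / max_total.
    """
    if max_total <= 0:
        raise ValueError("max_total must be positive")
    if total < 0:
        raise ValueError("total must not be negative")
    return max_radius * math.sqrt(total / max_total)


# Hypothetical number of performances per style, invented for illustration.
style_totals = {"Style A": 100, "Style B": 25}

largest = max(style_totals.values())
radii = {style: pie_radius(n, largest) for style, n in style_totals.items()}
print(radii)
```

Scaling the radius directly by the total would instead make a style performed four times as often look sixteen times as large by area.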
For my final project I created a project proposal for an interactive GitHub visualization. GitHub is an open source code repository site that allows people to host their code online to easily manage revisions and collaborate with others. My project proposal is for a network visualization of the people that a specific user has collaborated with and the projects they have worked on. Given a GitHub ID, the program would generate a network of all of the projects that user has collaborated on and who each collaboration was with. The user would be able to easily expand the network to show more connections and projects based on degrees of separation from the original ID. The user would also be able to select a specific project or person to obtain more detailed information.
This program could be used in a variety of ways. It could be used as a simple exploration tool to see what the people around you are working on. It could also be used to determine who you might wish to work with in the future. If you know that you had a good experience working with a specific person, you might use the program to find other people that that person has worked with in the hopes of finding similar personalities or work ethics. Additionally, this can be used to determine how much a particular person contributes to projects and just generally how active they are in the open source community. For students, this would be helpful to attempt to gauge how effective a future group member might be. For employers, this would provide an easy way to see what interview candidates have been working on, and how often they work on personal and collaborative projects.
Through this proposal, I considered the visualization of networks and how they are typically portrayed. In the end, I went with a design that initially seems very similar to traditional network visualizations. However, when examined more closely, the visualization is actually an interesting spin on traditional networks. In my case, there are two categories of nodes: one for the people and one for the projects they worked on. The projects become part of the links between people, but I chose to make them their own unique pieces in the network because they contain very useful information in and of themselves. Without information on the projects, the links between people are much less informative and interesting.
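The two-category network and the expansion by degrees of separation can be sketched as a breadth-first search. This is only an illustration of the idea in the proposal, not its implementation: it assumes the project memberships have already been collected into a plain dictionary, it makes no calls to GitHub, and the user and project names are invented.

```python
from collections import deque


def collaborators_within(projects, start_user, max_degrees):
    """Map each reachable user to their degree of separation from start_user.

    projects maps a project name to the set of users who worked on it.
    Two users are one degree apart when they share at least one project.
    """
    # Index the person nodes: user -> projects they belong to.
    user_projects = {}
    for project, members in projects.items():
        for user in members:
            user_projects.setdefault(user, set()).add(project)

    degrees = {start_user: 0}
    queue = deque([start_user])
    while queue:
        user = queue.popleft()
        if degrees[user] == max_degrees:
            continue  # Do not expand past the requested degree.
        # Step person -> project -> person across the two node categories.
        for project in user_projects.get(user, ()):
            for other in projects[project]:
                if other not in degrees:
                    degrees[other] = degrees[user] + 1
                    queue.append(other)
    return degrees


# Hypothetical memberships, invented for illustration only.
projects = {
    "project-x": {"alice", "bob"},
    "project-y": {"bob", "carol"},
    "project-z": {"carol", "dave"},
}

print(collaborators_within(projects, "alice", 2))
```

Keeping projects as their own nodes means each hop between two people passes through a named project, which is the extra information the design is meant to surface.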