Ananda Rutherford has organised a workshop for the Documenting Homes project, which is researching visualisation models for presenting the archive and other collections information across digital platforms. The workshop is a chance to explore the role of visualisations in organising, interrogating and interpreting collections in context and to develop critical and planning skills for designing visualisations. It will include guided exercises for turning data in a spreadsheet into simple visualisations and an optional hour for trying out visualisation tools with your own data.
Contact me for the workshop slides and datasets.
Exercises for Data Visualisation
Exercise 1: exploring network visualisations
Time: approx 5 minutes
- In your browser, go to http://bit.ly/11qqXuj
- Scroll down the page to the network graph.
- Take a few minutes to explore the visualisation: try holding the cursor over items, clicking, dragging, etc.
- Discuss with your neighbour: does interacting with the network graph give you more or less information than the other representations of the data on the same page? Is it clear what it’s for and how to get started? Does it open up new questions?
If you want to try others, try ‘iconic figures in British culture’ at http://kindred.stanford.edu/, Les Misérables http://hci.stanford.edu/jheer/files/zoo/ex/networks/force.html or the wine industry https://www.msu.edu/~howardp/wineindustry.html.
Exercise 2: N-grams
Time: approx 5 minutes
NB: in both tools, copyright affects the availability of 20th century books. Transcription errors may also affect results, particularly for older books (e.g. the ‘long s’ being read as ‘f’).
- Think of two words or phrases you’d like to compare over time (e.g. World War One, Great War)
- Go to http://books.google.com/ngrams
- Enter your words or phrases and compare the results
- Discuss with your neighbour: are the results what you expected to see?
Google Ngram tips: http://books.google.com/ngrams/info
Tip: if you’re more interested in newspapers, try the Library of Congress’s Chronicling America collection at http://arxiv.culturomics.org/ChronAm/ or Australia and New Zealand newspapers at http://dhistory.org/querypic/create/.
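If you’re curious about what sits behind a chart like this, the sketch below counts how often a phrase appears in each year, relative to all the phrases of the same length in that year. It is only an illustration: the four-line corpus is invented, the function name is hypothetical, and real tools work with millions of scanned books and add smoothing and other refinements.

```python
from collections import defaultdict

# Invented mini-corpus: (year, text) pairs standing in for digitised books.
CORPUS = [
    (1916, "the great war goes on and the great war is terrible"),
    (1925, "memories of the great war remain"),
    (1950, "world war one and world war two changed europe"),
    (1950, "the great war is now called world war one"),
]


def phrase_frequency(corpus, phrase):
    """Return {year: share of same-length n-grams that match the phrase}."""
    target = phrase.lower().split()
    n = len(target)
    hits = defaultdict(int)
    totals = defaultdict(int)
    for year, text in corpus:
        words = text.lower().split()
        # Slide a window of n words across the text.
        for i in range(len(words) - n + 1):
            totals[year] += 1
            if words[i:i + n] == target:
                hits[year] += 1
    return {year: hits[year] / totals[year] for year in sorted(totals)}


for phrase in ("great war", "world war one"):
    print(phrase, phrase_frequency(CORPUS, phrase))
```

Dividing by the number of phrases in each year matters: without it, years with more surviving text would always appear to use a phrase more often.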
Exercise 3: trying entity recognition
Time: approx 5 minutes
- In your browser, go to http://nlp.stanford.edu:8080/ner/
- Find a short paragraph of text (e.g. from a news site or collection records) to paste into the box
- How many of the things (concepts, people, places, events, references to time or dates, etc) you recognise did it pick up? Is any of the other information presented useful? Did it label anything incorrectly? What if you change classifiers?
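To get a feel for why entity recognition is hard, here is a deliberately naive sketch that uses two simple patterns instead of a trained classifier. It is not how the tool in this exercise works; the patterns, labels and sample sentence are all invented for illustration.

```python
import re

# Naive patterns: a run of capitalised words, or a four-digit year.
NAME = re.compile(r"\b[A-Z][a-z]+(?:\s+[A-Z][a-z]+)*\b")
YEAR = re.compile(r"\b(?:1[0-9]{3}|20[0-9]{2})\b")


def naive_entities(text):
    """Return (label, matched text) pairs found by the two patterns."""
    found = [("DATE", m.group()) for m in YEAR.finditer(text)]
    found += [("NAME", m.group()) for m in NAME.finditer(text)]
    return found


# Invented sample sentence.
sample = "Yesterday Mary Smith visited Leeds, where her family had lived since 1891."
for label, value in naive_entities(sample):
    print(label, value)
```

Notice that the sketch cannot tell a person from a place, and it sweeps the capitalised word at the start of the sentence into the name. Mistakes like these are worth looking out for when you test the real tool.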
Exercise 4: geocode data and create charts using Google Fusion Tables
Time: approx 10 minutes
NB: If your screen options don’t match the instructions, ask for help. Google roll out some changes incrementally and treat educational accounts differently, so accounts sometimes see different screens.
- Go to https://drive.google.com/ and log into Google (if you aren’t already)
- Go to http://bit.ly/FTables (or https://www.google.com/fusiontables/DataSource?dsrcid=implicit&redirectPath=data&usp=apps_start&hl=en) to access Fusion from your account.
Creating a map
- You should see a screen ‘Import new table’ with the option called ‘From this computer’ highlighted below. (If you don’t see this screen, go to step 2 above)
- Click ‘Choose file’ and find the file ‘Geffrye_places.csv’ in the handouts folder. Click ‘Next’.
- Click ‘Next’ on the next screen, then click ‘Finish’ on the following screen.
- The screen should load in ‘Row’ view that looks something like a spreadsheet with one column of data in it.
- Hover over the top of the column ‘Term plus Broader Term plus UK’ to reveal an arrow. Click the arrow then click ‘Change’.
- On the ‘Change column’ screen, change the ‘Type’ from ‘text’ to ‘location’ then ‘Save changes’.
- The geocoding process may start automatically. If it doesn’t, look for a red box with a plus sign in it at the end of the row of menu options. Click the plus sign, then select ‘Add map’.
- Geocoding may take quite some time, so go on with the next exercise in a new browser window in the meantime.
- Congratulations, you’ve made a map!
If you finish early, you can explore heatmaps and the other options on the Fusion interface.
This data has been prepared so it contains enough information for the geocoding process. If you finish early you can try uploading the other ‘Geffrye_places’ files and see whether providing more or less information significantly changes the results. Look out for rows that could not be located, or that ended up in other countries.
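As a rough idea of what that preparation might involve, the sketch below joins a place term, its broader term and a country into a single string for a geocoder to look up. The column names ‘Term’ and ‘Broader Term’, the sample rows and the function name are assumptions made for illustration, not a description of how the handout files were actually produced.

```python
import csv
import io

# Invented rows; the column names 'Term' and 'Broader Term' are assumed.
RAW = """Term,Broader Term
Shoreditch,London
Hoxton,London
Whitby,
"""


def location_string(term, broader, country="UK"):
    """Join the non-empty parts into one comma-separated place string."""
    parts = [p.strip() for p in (term, broader, country) if p and p.strip()]
    return ", ".join(parts)


for row in csv.DictReader(io.StringIO(RAW)):
    print(location_string(row["Term"], row["Broader Term"]))
```

Leaving out the broader term or the country gives the geocoder less to go on, which is one reason a place name can end up located in the wrong country.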
Creating a pie chart
- Go to http://bit.ly/FTables
- You should see three tabs on the left-hand side of the screen. Click on ‘Google Spreadsheets’
- On the ‘Select a spreadsheet’ screen, look for ‘Or paste a web address here:’ towards the bottom. Copy the link to your data into the box.
- You should be on the ‘Import new table’ screen. Change the value of ‘Column names are in row’ to ‘None’. Click ‘next’.
- You can add information to the next screen as desired or just click ‘Finish’.
- The screen should load in ‘Row’ view that looks something like a spreadsheet.
- Go to the Help menu on the Google page and click ‘Back to Classic look’.
- When the page reloads, it should have a grey bar that says ‘Showing all rows’. Click ‘options’ next to that.
- This opens an area with options to ‘Filter’, ‘Aggregate’ and ‘Create view’.
- Click on ‘Aggregate’.
- Tick ‘wholeObjectName’ in the ‘Aggregated by’ section, then ‘Apply’
- This will create a view that shows only the values for ‘wholeObjectName’ with a count of the number of objects with that name.
- Click ‘Visualize’ in the menu above, and select ‘Pie’.
- You should have a pie chart of your data!
- You might need to click ‘many’ or ‘next’ in the top right-hand corner to make it process more of the data.
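The ‘Aggregate’ step is essentially counting how many rows share each value. A minimal sketch of the same idea, assuming invented records (only the field names ‘wholeObjectName’ and ‘materials’ come from the exercise, and the function names are hypothetical):

```python
from collections import Counter

# Invented records; only the field names come from the exercise data.
RECORDS = [
    {"wholeObjectName": "chair", "materials": "oak"},
    {"wholeObjectName": "chair", "materials": "beech"},
    {"wholeObjectName": "teapot", "materials": "ceramic"},
    {"wholeObjectName": "chair", "materials": "oak"},
]


def aggregate(records, field):
    """Count how many records share each value of the given field."""
    return Counter(r[field] for r in records)


def shares(counts):
    """Convert counts to the percentage each slice of a pie would take."""
    total = sum(counts.values())
    return {name: 100 * n / total for name, n in counts.items()}


counts = aggregate(RECORDS, "wholeObjectName")
print(counts.most_common())
print(shares(counts))
```

A pie chart only reads well when there are a handful of distinct values, so it is worth checking how many slices a field would produce before choosing this format.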
If you finish early, try:
- Changing the field that is being aggregated. For example, try ‘collectionCategory’, ‘techniques’ or ‘materials’.
- Changing options on the ‘Configure chart’ screen
- Making other charts. Which formats best suit the data?
- What happens if you filter the underlying data to reduce the number of rows shown?
- If you get stuck, you can reset by choosing ‘clear aggregation’ and changing the ‘Visualize’ option to ‘Table’.
Don’t forget to check back and see how your geocoded data looks!
Exercise 5: analysing data visualisations
Time: approx 30 minutes
Pair up with your neighbour and explore and discuss one of the visualisations below.
- In your browser, go to one of the sites below
- Take a few minutes to explore the visualisation
- What do you think is being presented here?
- Can you easily see where to start and how to use it?
- What stories or trends can you start to see?
- Does it work better at one scale over another?
- Do you find it more effective at aggregate or detail level?
- Does it present an argument or provide a space to develop and explore one?
- If it was designed to present an argument or investigate a particular question, what do you think that was?
- What have you learned from the visualisation that you might not have learned from looking at the data or reading text about it?
- Are the sources they have used clear? Do they explain how they’ve prepared them?
- What effect have their choices of visualisation formats and tools had?
- Which data or queries are prioritised, and which are more difficult or impossible?
Report back to the group: summarise the site’s purpose, visualisation formats and data types in a sentence, then share the most interesting parts of your discussion.
If you have a particular type of data, process, format or audience in mind, ask for suggestions for sites to review.
Further information: http://www.shardcore.org/shardpress/index.php/2013/11/06/tate-data-explorer/
Further information: http://mtchl.net/nolan-explorer/
University of Richmond, “Visualizing Emancipation”
Further information: http://dirt.terrypbrock.com/2012/04/visualizing-emancipation-examining-its-process-through-digital-tools/
Stanford “Mapping the Republic of Letters”
Further information: http://openglam.org/2012/03/21/mapping-the-republic-of-letters/, http://danbri.org/words/2010/11/22/603, http://republicofletters.stanford.edu/tools/
Further information: http://googleancientplaces.wordpress.com/
Digital Harlem :: Everyday Life 1915-1930
Further information: http://digitalharlemblog.wordpress.com/, http://writinghistory.trincoll.edu/evidence/robertson-2012-spring/
Further information: http://hestia.open.ac.uk/updating-orbis/
Digital Public Library of America’s timeline, map, bookshelf
Further information: http://dp.la/info/ and http://dp.la/info/news/blog/
Further information: http://blog.britishmuseum.org/2014/02/19/lost-change-mapping-coins-from-the-portable-antiquities-scheme/
‘Humanity’s cultural history captured in 5-minute film’
http://www.nature.com/news/humanity-s-cultural-history-captured-in-5-minute-film-1.15650 (article and video)
Exercise 6: Choose your own adventure
Choose the option that suits your interests and skills:
- explore and analyse more visualisations
- try making different visualisations with provided data
- more data cleaning and linking (reconciling) to other data
- try creating visualisations with your own data
You can try this data in ViewShare (http://viewshare.org/), Palladio (http://palladio.designhumanities.org/) or ImagePlot (http://lab.softwarestudies.com/p/imageplot.html), or keep exploring Google Drive/Fusion Tables.
Making more visualisations
ManyEyes is a useful tool for learning, but as it uses Java it can be tricky in classroom situations. You can create visualisations from data other people have uploaded (without signing up) at ManyEyes: http://www-958.ibm.com/software/analytics/manyeyes/datasets or try Palladio http://palladio.designhumanities.org/. You can also try visualising the data from the British Library Pin-a-tale project, available in Google Docs at http://bit.ly/WT1Ai5. Many other open cultural datasets are listed at http://museum-api.pbworks.com/Museum%C2%A0APIs
Try a visualisation and evaluate the results. Is more cleaning or transformation needed? You may need to iterate with different versions of your data after cleaning or enhancing it.
If you have your own dataset, review the ‘planning’ slides. What do you want to learn or express about your data? The ManyEyes site provides some guidance on the best visualisation types for different types of data at http://www-958.ibm.com/software/analytics/manyeyes/page/Visualization_Options.html. This guidance can also be useful for planning visualisations in Google Fusion.
You may want to try re-arranging columns to meet the requirements of different visualisation tools. You can try updating the values in the spreadsheet and re-visualising the results.
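If your data is in a CSV file and you would rather not drag columns around by hand, re-arranging can be scripted. This is only a sketch: the sample data, column names and function name are invented, and you would need to check what order each tool actually expects.

```python
import csv
import io


def reorder_columns(csv_text, order):
    """Return csv_text with its columns rearranged into the given order."""
    reader = csv.DictReader(io.StringIO(csv_text))
    out = io.StringIO()
    # Columns not named in 'order' are dropped rather than raising an error.
    writer = csv.DictWriter(
        out, fieldnames=order, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        writer.writerow(row)
    return out.getvalue()


# Invented data, for a tool assumed to expect the place column first.
SAMPLE = "name,year,place\nchair,1780,London\nteapot,1820,Leeds\n"
print(reorder_columns(SAMPLE, ["place", "name", "year"]))
```

Working on a copy like this leaves your original spreadsheet untouched, so you can go back to it if a visualisation doesn’t turn out as expected.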
Exploring and analysing more visualisations
There are links to visualisation blogs and other specialist sites on the Resources post at http://bit.ly/UJwgEz (i.e. http://www.miaridge.com/resources-for-data-visualisation-for-analysis-in-scholarly-research/)
This workshop is based on one I give at the British Library. Further background reading is available at Resources for ‘Data Visualisation for Analysis in Scholarly Research’.