PP434      Half Unit
Automated Data Visualisation for Policymaking

This information is for the 2024/25 session.

Teacher responsible

Professor Richard Davies

Availability

This course is available on the Double Master of Public Administration (LSE-Columbia), Double Master of Public Administration (LSE-Sciences Po), Double Master of Public Administration (LSE-University of Toronto), MPA Dual Degree (LSE and Hertie), MPA Dual Degree (LSE and NUS), MPA Dual Degree (LSE and Tokyo), MPA in Data Science for Public Policy, Master of Public Administration and Master of Public Policy. This course is available with permission as an outside option to students on other programmes where regulations permit.

Course content

This course explores ways of accessing large data sets to better understand the societies in which we live and ultimately to help guide policy decisions. The data we will encounter ranges from real-time measures of economic activity to micro data on local prices, to voting patterns and measures of pollution.

We will use methods from programming and economics to work on real-world problems. Students will learn the theory and policy history that lie behind data types, visualisation methods, data mapping and machine learning. With these tools in place, we will use Python to access data programmatically through APIs and to build scrapers and batch downloaders. Cleaned and verified data are stored on GitHub, and students' work is visualised using live and interactive web pages.
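As a rough illustration of what a batch downloader can look like, the sketch below loops over a list of series identifiers, builds one request URL per series and saves each response to disk. It is not taken from the course materials: the base URL, the query parameter names and the series identifiers are hypothetical, and the `fetch` argument is an assumption added so that the loop can be tried without a network connection.

```python
import json
from pathlib import Path
from urllib.parse import urlencode
from urllib.request import urlopen

# Hypothetical endpoint, used only for illustration.
BASE_URL = "https://example.org/api/series"


def build_url(series_id, start_year):
    """Return the request URL for one series (parameter names are assumed)."""
    query = urlencode({"id": series_id, "start": start_year})
    return f"{BASE_URL}?{query}"


def fetch_json(url):
    """Download a URL and parse the response body as JSON."""
    with urlopen(url, timeout=30) as response:
        return json.load(response)


def batch_download(series_ids, start_year, out_dir, fetch=fetch_json):
    """Fetch each series in turn and write it to out_dir as <id>.json."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    saved = []
    for series_id in series_ids:
        data = fetch(build_url(series_id, start_year))
        path = out_dir / f"{series_id}.json"
        path.write_text(json.dumps(data, indent=2), encoding="utf-8")
        saved.append(path)
    return saved
```

Passing a stand-in function as `fetch` makes it possible to check the file-writing loop before pointing it at a real API.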

Topics include empirical strategy design, fetching and scraping data, data cleaning and storage, visualisation and interactivity. There is a focus on clear, replicable code that allows the automation of all these tasks in a policy setting. Students apply concepts of descriptive data analysis and may also use econometric techniques learned in parallel compulsory econometrics courses.
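The cleaning and visualisation steps listed above can be sketched in a few lines of Python. The example below is a minimal illustration under stated assumptions, not the course's own code: the column names `date` and `price` are hypothetical, and the chart is expressed as a Vega-Lite-style specification, which is one possible format for interactive web charts and is not named in the course description.

```python
import csv
import io
import json


def clean_rows(csv_text):
    """Parse CSV text, keeping only rows with a date and a parseable price."""
    cleaned = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        date = (row.get("date") or "").strip()
        try:
            price = float(row.get("price"))
        except (TypeError, ValueError):
            # Missing value or text that cannot be parsed as a number.
            continue
        if date:
            cleaned.append({"date": date, "price": price})
    return cleaned


def line_chart_spec(rows, title):
    """Build a minimal Vega-Lite-style line chart specification as a dict."""
    return {
        "title": title,
        "data": {"values": rows},
        "mark": "line",
        "encoding": {
            "x": {"field": "date", "type": "temporal"},
            "y": {"field": "price", "type": "quantitative"},
        },
    }


# Made-up sample data: the second row has an unusable price.
RAW = "date,price\n2024-01-01,1.20\n2024-01-02,n/a\n2024-01-03, 1.35 \n"
rows = clean_rows(RAW)
spec_json = json.dumps(line_chart_spec(rows, "Local price"), indent=2)
```

Because every step is a function, the same script can be re-run whenever the underlying data changes, which is the kind of replicability the course emphasises.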

Teaching

20 hours of lectures and 15 hours of seminars in the AT.

Formative coursework

Each week students have an opportunity to present workbooks and visualisations from their portfolio to their peers. This allows students to gain feedback from their peers, to demonstrate their skills for formative assessment and feedback from the course instructors, and to iron out bugs and learn coding best practices. This work is not graded.

Indicative reading

  • Friendly, M., Wainer, H., 2021; A History of Data Visualization and Graphic Communication, Chapter 5, pp. 95-120, Harvard University Press
  • Tufte, E., 2007; The Visual Display of Quantitative Information, 2nd ed., Chapter 1: Graphical Excellence and Chapter 5: Chartjunk, Graphics Press LLC
  • Mattmann, C., 2013; A vision for data science, Nature 493, pp. 473–475, https://doi.org/10.1038/493473a
  • Heer, J., Bostock, M., Ogievetsky, V., 2010; A tour through the visualization zoo, Communications of the ACM 53(6), pp. 59-67
  • Ferguson, A., 2000; A History of Computer Programming Languages, Brown University, https://cs.brown.edu/~adf/programming_languages.html

Assessment

Project (80%) and portfolio (20%) in the AT Week 10.

The course is assessed through the production of a professional-grade data science website. The website may consist of as many pages as students choose, but grades are based on two pages: a portfolio and a project. All graded work must be embedded in the website, which is hosted on GitHub Pages. Students are given detailed lessons on how to do this.

  • Portfolio (20%). This page demonstrates the tools that students learn in a practical setting, by using them to embed charts and diagrams of various types. There are 10 challenges, each demonstrating 1-2 skills and resulting in 1-2 embedded charts. The portfolio is worth 20% of the final mark, split evenly across the 10 challenges.

Portfolio skills include: Building a web site, Embedding a live visualisation in a web site, Hosting data in the cloud, Editing and cleaning data, API-driven charts, Loops and APIs, Scrapers, Critical commentary on data, Advanced analytics, Interactivity.

  • Project (80%). This page sets out the student’s data science project. There are weekly on-line sessions in which students can discuss ideas with the teaching team. The project consists of between 5 and 8 charts, tables or visualisations. Students briefly discuss four topics: the aims, the data, analytical challenges, conclusions. Key marking criteria include: accessibility, empirical design, data approach, automation, interactivity, clarity of writing.

The group sizes for seminars will be a maximum of 30 students. The course portfolio and project are handed in together, at the end of week 10.

Key facts

Department: School of Public Policy

Total students 2023/24: Unavailable

Average class size 2023/24: Unavailable

Controlled access 2023/24: No

Value: Half Unit

Personal development skills

  • Self-management
  • Team working
  • Problem solving
  • Application of information skills
  • Communication
  • Application of numeracy skills
  • Commercial awareness
  • Specialist skills