Data Processing

Edition: February 2017, under construction

The amount and complexity of information produced in science, engineering, business, and everyday human activity is increasing at staggering rates. The goal of this course is to expose you to visual representation methods and techniques that increase the understanding of complex data. Good visualizations not only present a visual interpretation of data, but do so in a way that improves comprehension, communication, and decision making.


In this course you will learn how the human visual system processes and perceives images, good design practices for visualization, tools for visualization of data from a variety of fields, collecting data from websites with Python, and programming of interactive web-based visualizations using D3.

Staff

The course’s staff consists of your instructor and student assistants. Due to the size of the course, we usually can’t respond to email inquiries about your assignments or organizational matters. You are encouraged to speak to a student assistant or the instructor at the lab.

Gosia Migut
instructor help@mprog.nl

Bas Châtel
teaching assistant

Tim Meijer
teaching assistant

Goals for this course

After successful completion of this course, you will be able to…

Prerequisites

You’ll need programming experience in a language like C, Ruby or Python, an understanding of object-oriented basics, and experience with reading and writing data files.

Expectations

We require you to attend the weekly workshops. We expect you to watch all lectures in preparation for the meetings and to submit all homework assignments.

Grades

Participation in workshops, design assignments, programming homework, and the final written assignment will all be part of your final grade. A lack of effort in any of these will lead to a failing grade for the course, no matter how well you do on the other components. But if you do the work and interact regularly with the instructor and fellow students, you should be able to make it!

Grading specification:

A student must ordinarily meet all deadlines in order to be eligible for a passing grade unless granted an exception in writing by the course’s instructor.

Course components

Homework

The path to a good visualization design in your projects is likely to involve mistakes and wrong turns. It is therefore important to recognize that mistakes are valuable in finding the path to a solution, to broadly explore the design space, and to iterate on designs to improve possible solutions. Weekly homework provides an opportunity to learn these design skills and to test your understanding of the material. The homework is designed to support you in developing later projects (example of the data processing programming project).

Homework formats

When handing in homework make sure to always do the following:

Design part

In the design part of the course we’ll dive into what makes a visualization good, which theoretical principles should be applied, and which practices help in designing one.

Colin Ware

There are reading assignments each week: scientific articles and book chapters. The content of each week’s reading will be reflected in the workshops and design assignment of that week.

To complete this part of the course, you will need to attend the weekly workshop sessions on Wednesdays at 11:00. Some of the work will be done in class, and work is regularly shared through informal presentations. All design assignments are released on Wednesdays and are due the following Wednesday at 9:59 a.m. You will be assigned to a group during the first workshop session; the assignments are done in that group and presented by the group during the weekly workshops. In the last two weeks you will work on a design assignment that will be graded (example video assignment).

Topics by week are:

week topic
1 intro + process
2 visual variables + design principles
3 visualization taxonomy
4 color
5 dashboard + interaction + linked views
6 storytelling
7 presentation

Technical part

In the technical part of the course you will learn how to implement interactive web-based visualizations in JavaScript and D3.

This part consists first and foremost of a lot of programming work, done by you!

Topics by week are:

week topic
1 Scraping IMDB
2 Line graph in JavaScript
3 JSON + interactive bar chart in D3
4 D3 scatterplot
5 Multiseries interactive line
6 Linked views
7 Linked views

All assignments are released on Mondays and are due by Friday, at 23:59. All of these assignments have to be handed in individually, but you are supposed to help each other out generously during the course.

To assist you with programming, the lab will be open several hours per week. Check DataNose for the weekly schedule.

Course Policies

Collaboration policy

Because teamwork is stressed in this class, collaboration, consulting information sources, and working with others are permitted. Please note the following restrictions, however.

On the homework solutions you must list any help or hints you have received from others. Extensive collaboration (that is, solving the lab assignments with others) is not permitted.

You may not submit the same or similar work to this course that you have submitted or will submit to another. Nor may you provide or make available solutions to homeworks to individuals who take or may take this course in the future.

Quoting sources

You must acknowledge any source code that was not written by you by mentioning the original author(s) directly in your source code (comment or header). You can also acknowledge sources in a README.txt file if you used whole classes or libraries. Do not remove any original copyright notices and headers. However, you are encouraged to use libraries, unless explicitly stated otherwise!

You may use examples you find on the web as a starting point, provided their license allows you to re-use them. You must quote the source using a proper citation (author, year, title, time accessed, URL) both in the source code and in any publicly visible material. You may not use existing complex combinations or large examples. For example, you may not use a ready-to-use visualization with multiple linked views, though you may use parts of such examples.
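As an illustration only, the citation fields listed above could be collected into a source comment like the one this Python sketch builds. The helper name format_citation, the field order, and all example values (author, title, URL) are placeholders made up for this sketch; the course does not prescribe this exact format.

```python
"""Hypothetical sketch: one way to credit code borrowed from the web."""


def format_citation(author, year, title, accessed, url):
    """Return a one-line source comment with the fields the policy lists."""
    return f"# Source: {author} ({year}). {title}. Accessed {accessed}. {url}"


# Placeholder values only; replace them with the real details of your source.
EXAMPLE = format_citation(
    author="Jane Doe",
    year=2016,
    title="Simple bar chart",
    accessed="2017-02-06",
    url="https://example.com/bar-chart",
)

if __name__ == "__main__":
    print(EXAMPLE)
```

A comment like this would sit directly above the borrowed code, and the same details would be repeated in any publicly visible material.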

Missed activities and assignment deadlines

The data processing course consists of several parts: the workshops, the design assignments and the lab assignments. Attendance is required at the workshops and the lab sessions. You are allowed to miss one workshop; any further absence will result in a personal penalty on your grade for the group assignment (10% per absence). Homework is to be handed in according to the schedule; you are allowed one slipped deadline.
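One possible reading of the absence rule, sketched in Python. It assumes that the first missed workshop is free and that each further absence adds 10 percent of penalty. The function name and the cap at 100 percent are assumptions of this sketch. How the penalty is applied to the group-assignment grade is not specified here, so the sketch only computes the size of the penalty.

```python
FREE_ABSENCES = 1          # "You are allowed to miss one workshop"
PENALTY_PER_ABSENCE = 10   # percent, for each absence beyond the free one


def workshop_penalty_percent(absences: int) -> int:
    """Penalty in percent under the assumed reading of the rule."""
    if absences < 0:
        raise ValueError("absences cannot be negative")
    extra = max(0, absences - FREE_ABSENCES)
    # Cap at 100 so the penalty never exceeds the whole grade (an assumption).
    return min(100, extra * PENALTY_PER_ABSENCE)


if __name__ == "__main__":
    for missed in range(4):
        print(missed, "missed ->", workshop_penalty_percent(missed), "% penalty")
```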

If circumstances force you to miss a workshop, a lab session or a homework deadline, be sure to tell the instructors ahead of time by email. Feel free to discuss any issues that come up (you may be referred to our coordinator).

Regulations

All forms of academic dishonesty are dealt with harshly. If the course refers some matter to the Examination Board, the course reserves the right to impose local sanctions on that student on top of the Board’s outcome. These may include, but are not limited to, a failing grade for the work submitted or for the course itself.

In all cases we follow the directives regarding fraud and plagiarism of the University of Amsterdam and of the examination board of the Computer Science BSc programme. Find them here in English and Dutch.