Visualising the Sensemaking Space


This project closely relates to the Visual Analytics for Sensemaking project. Please see that project's webpage for an introduction to sensemaking.

A published paper describing the idea and a link to an online demo are available on this page. The source code of the system is available on GitHub.

The goal of this project is to understand how users make sense of data by visualising and analysing the sensemaking process, represented as a sequence of high-dimensional vectors.

In this project, we describe each step in sensemaking with a ‘provenance vector’ that includes all the information necessary to reconstruct the visualisation state, such as what data are displayed and how they are visualised.

Using Gapminder as an example (see the screenshot below), the vector includes information such as what the x and y axes represent, the colour and size of each circle, and the year of the data.
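As a rough illustration (the actual encoding used in the paper may differ, and all field names here are hypothetical), such a provenance vector could be captured as a small record:

```python
from dataclasses import dataclass, asdict

@dataclass(frozen=True)
class ProvenanceVector:
    """One visualisation state in a Gapminder-style tool.

    Field names are illustrative, not the encoding used in the paper.
    """
    x_attr: str       # attribute on the x axis, e.g. 'income'
    y_attr: str       # attribute on the y axis, e.g. 'life_expectancy'
    colour_attr: str  # attribute mapped to circle colour, e.g. 'region'
    size_attr: str    # attribute mapped to circle size, e.g. 'population'
    year: int         # year of the data shown

# The state at one step of a session:
state = ProvenanceVector('income', 'life_expectancy', 'region', 'population', 1960)
print(asdict(state))
```

Each user interaction would then produce a new record, and the session becomes a list of these records.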


As the user explores the data, they may change the year, select a different attribute for an axis (e.g., showing ‘population’ instead of ‘income’ on the x axis), and so on. Each such change updates the ‘provenance vector’, and a sensemaking process can be described as a temporal sequence of such vectors (the left of the top figure). Such sequences (called ‘provectories’ in the paper) can be projected into lower dimensions with methods such as t-SNE or UMAP (the middle of the top figure). It is also possible to show the traces of many user sessions together to see if there are any similarities or differences (the right of the top figure).
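A minimal sketch of projecting one provectory with t-SNE, assuming the provenance vectors have already been numerically encoded (the data below is a random stand-in; UMAP would work similarly via the umap-learn package):

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)

# Stand-in for an encoded provectory: 30 sensemaking steps, each a
# 10-dimensional provenance vector (real data would come from the
# logged visualisation states).
provectory = rng.normal(size=(30, 10))

# Project the sequence to 2-D; perplexity must be smaller than the
# number of steps in the sequence.
xy = TSNE(n_components=2, perplexity=5, random_state=0).fit_transform(provectory)
print(xy.shape)  # (30, 2)
```

Plotting the 2-D points in temporal order (e.g., connected by a line) gives the trace shown in the middle of the top figure.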

Research Questions

Once the sensemaking process is captured, it can be analysed by either visualisation or machine learning to identify any interesting patterns:

For a single user, such patterns can be:

  • Are there any frequent patterns, i.e., a sequence of actions that appears multiple times? If so, what does this mean, i.e., why was the user doing this?
  • Is it possible to infer the analysis actions, such as comparing two similar states or analysing the clustering in the data?
  • Is it possible to infer what strategy the user is using, such as a depth- or breadth-first search?
  • Is it possible to show what data have been explored and what haven't?
  • Is it possible to infer when a user is stuck and could use some help with further analysis?
  • Did the user find the answer, and what is it?
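As a starting point for the first question, frequent action patterns could be found by counting n-grams of consecutive actions in a logged session; a minimal sketch (the action names are made up):

```python
from collections import Counter

def frequent_ngrams(actions, n=2, min_count=2):
    """Count length-n subsequences of consecutive actions and keep
    those occurring at least min_count times."""
    grams = Counter(tuple(actions[i:i + n]) for i in range(len(actions) - n + 1))
    return {g: c for g, c in grams.items() if c >= min_count}

# A hypothetical logged session:
session = ['set_year', 'set_x_axis', 'select_circle', 'set_year',
           'set_x_axis', 'select_circle', 'zoom']
print(frequent_ngrams(session, n=2))
```

More sophisticated sequential pattern mining (e.g., allowing gaps between actions) would build on the same idea.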

There are also many interesting questions about a group of users, such as:

  • What are the differences and similarities among the user sequences?
  • Is it possible to tell who are the experts and who are the novices?
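For the first of these, the similarity between two users' sessions could, for example, be measured with an edit distance over their action sequences; a simple sketch (action names again made up):

```python
def edit_distance(a, b):
    """Levenshtein distance between two action sequences: the minimum
    number of insertions, deletions, and substitutions turning a into b."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        curr = [i]
        for j, y in enumerate(b, 1):
            curr.append(min(prev[j] + 1,              # delete x
                            curr[j - 1] + 1,          # insert y
                            prev[j - 1] + (x != y)))  # substitute x with y
        prev = curr
    return prev[-1]

user_a = ['set_year', 'set_x_axis', 'select_circle']
user_b = ['set_year', 'select_circle', 'zoom']
print(edit_distance(user_a, user_b))  # 2
```

Sequence-alignment or dynamic-time-warping variants would allow the comparison to tolerate differences in session length and pacing.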

As mentioned, visualisation and/or machine learning can be used to answer these questions.

Required knowledge and skills

  • Data Visualisation: A User Interface (UI) is needed to present the provectories to the users in a visual and easier-to-understand way. This can be done:
    • For the users, so they can use the information to improve their sensemaking process, or
    • For the researchers who want to understand how users conduct sensemaking and how the tool can be designed to support this.
  • Machine Learning can be used together with the visualisation to answer the research questions listed earlier:
    • Machine learning can be used to identify frequent patterns, infer user tasks and strategies, and so on;
    • Machine Learning can also be used to provide proactive support, for example, detecting when a user is stuck and providing personalised support. This is related to the Human-AI Teaming project.
  • Programming
    • The existing UI is programmed in JavaScript, mostly using the d3.js library.
    • Python is the recommended language for the Machine Learning and NLP, with libraries such as Scikit-Learn and spaCy recommended for each respectively.