Visual Analytics for Sensemaking

Background

The goal of this project is to understand how users (the general public or domain experts) make sense of data using data visualisation and/or machine learning, and to build tools to support them. ‘Sensemaking’ is broader than what is usually called ‘data analysis’. As shown in the diagram below (the ‘Pirolli-Card model’), it covers:

  1. Searching for relevant data
  2. Extracting useful information
  3. Forming an understanding of the important factors and their relationships
  4. Forming and testing hypotheses
  5. Comparing different options and making decisions
  6. Presenting the results and decision process to others

For example, if the sensemaking task is ‘finding the best camera under £500 for baby photos’, the steps above become:

  1. Search for information about cameras, such as reviews and recommendations, prices, and pixel counts.
  2. Find the information that is relevant to you: matching the use case (such as baby photos) and the price range (no more than £500).
  3. Understand which factors are important to you (such as sharp photos) and how they are affected by other factors (such as a large aperture).
  4. Use this understanding to form and then test hypotheses about which types of camera may meet your requirements, such as ‘will a phone camera be good enough, or do I need a mirrorless camera or a DSLR?’
  5. Research and compare different camera models and make a decision.
  6. Finally, communicate the results to others (such as your parents or partner) and convince them that this is the right choice.

There are many other examples of sensemaking in everyday life, such as planning a holiday or choosing a university or degree. There are also many applications in business, medicine, defence, and other industries. Academic research is itself a form of sensemaking.

The fundamental issue is that most of the sensemaking steps are done manually, without support from specialised tools. This makes sensemaking a bottleneck in Big Data analysis (for example, the dozens of Chrome tabs shown below).

The project aims to design and develop visual analytics tools to support all the sensemaking stages. Some progress has been made, such as SenseMap (screenshot below) and SensePath, but there is still a lot of work to do:

  • Cover all the stages of sensemaking: for example, SenseMap only targets the ‘Information foraging and triage’ part of sensemaking.
  • Support different use cases or application domains: a tool designed for camera shopping will be quite different from one designed for machine learning research.

Required knowledge and skills

  • Data Visualisation: A User Interface (UI) is needed to:
    • Present the data to users in a visual and easier-to-understand way;
    • Provide tools that support the different tasks and domains of sensemaking.
  • Machine Learning and Natural Language Processing (NLP) can be used together with the visualisation to make the support even more effective:
    • Machine learning is needed to infer what the user is doing (such as buying a camera or comparing prices) from the user log (such as button clicks and mouse drags);
    • Machine Learning and NLP can then be used to provide proactive support. For example, the tool can automatically search different shopping websites for price information if a user is looking for the best price for a camera model. This is related to the Human-AI Teaming project.
  • Programming
    • JavaScript is the recommended language for the UI and data visualisation;
    • JavaScript libraries such as Vue and React are recommended for building the UI, and libraries such as d3.js are recommended for building the data visualisation;
    • Extension development may be needed, depending on the software used for sensemaking. For online shopping, Chrome extension development is needed to build such a tool (SenseMap is a Chrome extension);
    • Python is the recommended language for the Machine Learning and NLP;
    • Python libraries such as Scikit-Learn and spaCy are recommended for the machine learning and NLP respectively.
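
To give a flavour of the machine learning part, the sketch below shows one possible way to infer what a user is doing from a log of interface events. It is only an illustration under stated assumptions: the event names (such as `view_price`), the task labels, and the tiny training set are all hypothetical, and the classifier is a hand-written naive Bayes that uses only the Python standard library so that it runs without extra installs. In a real project, a library such as Scikit-Learn would likely replace the hand-written part, and the logs would come from a tool such as a browser extension.

```python
import math
from collections import Counter, defaultdict

# Hypothetical labelled sessions: (logged UI events, task the user was doing).
# Event names and task labels are invented for illustration only.
TRAINING_SESSIONS = [
    (["search", "open_review", "scroll", "open_review"], "researching_camera"),
    (["search", "open_review", "bookmark", "scroll"], "researching_camera"),
    (["open_shop", "view_price", "open_shop", "view_price"], "comparing_price"),
    (["view_price", "add_to_basket", "open_shop", "view_price"], "comparing_price"),
]


def train(sessions):
    """Count events per task so that P(event | task) can be estimated."""
    event_counts = defaultdict(Counter)  # task -> event -> count
    task_counts = Counter()              # task -> number of sessions
    vocabulary = set()                   # all event names seen in training
    for events, task in sessions:
        task_counts[task] += 1
        event_counts[task].update(events)
        vocabulary.update(events)
    return event_counts, task_counts, vocabulary


def predict(events, model):
    """Return the task with the highest naive Bayes log-probability."""
    event_counts, task_counts, vocabulary = model
    total_sessions = sum(task_counts.values())
    best_task, best_score = None, -math.inf
    for task, n_sessions in task_counts.items():
        score = math.log(n_sessions / total_sessions)  # log prior of the task
        total_events = sum(event_counts[task].values())
        for event in events:
            # Add-one (Laplace) smoothing, so an event never seen for this
            # task lowers the score instead of making it impossible.
            count = event_counts[task][event] + 1
            score += math.log(count / (total_events + len(vocabulary)))
        if score > best_score:
            best_task, best_score = task, score
    return best_task


model = train(TRAINING_SESSIONS)
new_session = ["open_shop", "view_price", "scroll", "view_price"]
predicted_task = predict(new_session, model)
print(predicted_task)  # comparing_price
```

Once a task such as `comparing_price` has been inferred, the tool could trigger the kind of proactive support described above, for example by collecting prices for the camera model being viewed. Treating a session as an unordered bag of events is a deliberate simplification; the order and timing of events may also carry useful signal.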