Human-AI Teaming

Background

Recent breakthroughs in Large Language Models (LLMs), such as ChatGPT, and generative models, such as Stable Diffusion, can be tremendously valuable in supporting analysis and creative tasks. While powerful, such models can be difficult to use, especially for domain experts, such as qualitative researchers and visual artists, who are not experts in machine learning. This project will investigate using these models to support qualitative analysis and artistic creation. The goal is first to understand user needs, then to design an interactive visual interface, and finally to make recommendations.

Research Questions (Project Ideas)

RQ1. LLM for Qualitative Analysis

Extracting required information from documents is a common task in many research areas such as Human-Computer Interaction (HCI), Psychology, Sociology, Law, and Business. For example, in a study of the general public's attitudes towards LLMs, researchers may interview many participants and build up a large collection of interview transcripts. The researchers then need to extract information such as the type of user, the application domain, and the user's attitude.

Currently, such information has to be extracted manually, often following a ‘coding scheme’ that defines the types of user, application, and attitude. It can take several hours to ‘code’ a one-hour interview transcript. This makes such research very time-consuming (weeks or months just to code the transcripts) and significantly limits the power of the analysis: a larger number of interviews allows the discovery of patterns that are more general and makes it less likely that rare but important cases are missed.

The project idea is to support such analysis with LLMs, such as ChatGPT or open-source alternatives (such as Llama 2). Can we just give an LLM the coding scheme and the transcript? How can we help the LLM better understand the user’s analysis goal? How can we make this process easier for users who are not familiar with LLMs?

The result could be a web interface similar to that of ChatGPT but with better support for qualitative analysis.
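As a starting point for the question of whether an LLM can simply be given the coding scheme and the transcript, the following is a minimal sketch of how the two might be combined into a single prompt and how a reply might be checked against the scheme. The scheme contents, the function names `build_coding_prompt` and `validate_codes`, and the JSON reply format are all illustrative assumptions rather than part of the project; no particular LLM API is called.

```python
import json

# Hypothetical coding scheme: category -> allowed codes (illustrative values only).
CODING_SCHEME = {
    "user_type": ["student", "professional", "hobbyist"],
    "application_domain": ["education", "healthcare", "software"],
    "attitude": ["positive", "neutral", "negative"],
}


def build_coding_prompt(scheme, transcript):
    """Combine the coding scheme and a transcript excerpt into one prompt string."""
    lines = [
        "You are assisting with qualitative coding of an interview transcript.",
        "Assign exactly one code per category, using only the codes listed.",
        "",
        "Coding scheme:",
    ]
    for category, codes in scheme.items():
        lines.append(f"- {category}: {', '.join(codes)}")
    lines += [
        "",
        "Transcript:",
        transcript.strip(),
        "",
        "Reply with a JSON object mapping each category to one code.",
    ]
    return "\n".join(lines)


def validate_codes(scheme, reply_text):
    """Parse a JSON reply and keep only the codes that the scheme allows."""
    reply = json.loads(reply_text)
    return {
        category: reply[category]
        for category, codes in scheme.items()
        if reply.get(category) in codes
    }
```

The prompt string would be sent to whichever model is chosen, and `validate_codes` would then flag categories the model left out or answered with a code outside the scheme, which the researcher could review by hand.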

RQ2. Generative Image Models for Artistic Creation

Generative image models allow the creation of images based on a text prompt. However, the generated images can be unpredictable, and it can be difficult to find the right prompt to produce the desired image (‘prompt engineering’). Essentially, users need to learn to describe their vision and ideas in a way that a generative model can understand, even though machine learning experts themselves do not fully understand how such a model ‘thinks’. The result is a time-consuming trial-and-error process that can take hours or days with no guarantee of a satisfying result.

Besides using a web interface (such as DALL·E 2), users may choose to run the model in a notebook (such as Disco Diffusion). The goals are similar to those of the LLM for qualitative analysis: How can we help the generative model better understand the user’s artistic intention? How can we make this process easier for users who are not familiar with a generative model that often has many options and parameters?

In the case of using a notebook, this project idea overlaps with that of the Human-Centred Data Science project, i.e., both need to capture and present the history of user input (prompts and parameters) and results (images). The image below shows a Jupyter extension that captures the user input history but not the output history. There is a previous student project on this.
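To make the idea of capturing both input and output history more concrete, here is a minimal sketch of a record structure that stores each prompt, its parameters, and the resulting image paths together. The class names `GenerationRecord` and `GenerationHistory`, the parameter names, and the file paths are hypothetical and chosen only for illustration; this is not the design of the extension or of the previous student project.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class GenerationRecord:
    """One generation run: the user input and the resulting images."""
    prompt: str
    parameters: dict
    image_paths: list
    timestamp: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


class GenerationHistory:
    """Keeps an ordered list of runs so inputs and outputs can be compared."""

    def __init__(self):
        self.records = []

    def log(self, prompt, parameters, image_paths):
        # Copy the inputs so later edits in the notebook do not alter the history.
        record = GenerationRecord(prompt, dict(parameters), list(image_paths))
        self.records.append(record)
        return record

    def changed_parameters(self, i, j):
        """Return parameters whose values differ between runs i and j."""
        a = self.records[i].parameters
        b = self.records[j].parameters
        return {
            key: (a.get(key), b.get(key))
            for key in sorted(set(a) | set(b))
            if a.get(key) != b.get(key)
        }
```

A notebook cell could call `log` after each generation, and an interface could then use `changed_parameters` to show which settings changed between two images.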

Readings

Promptify: Text-to-Image Generation through Interactive Prompt Exploration with Large Language Models (UIST 2023) Video
PromptMagician: Interactive Prompt Engineering for Text-to-Image Creation (TVCG 2023)
Design Guidelines for Prompt Engineering Text-to-Image Generative Models (CHI 2022) Video
RePrompt: Automatic Prompt Editing to Refine AI-Generative Art Towards Precise Expressions (CHI 2023) Video
PromptMaker: Prompt-based Prototyping with Large Language Models (CHI 2022)
Dispensing with Humans in Human-Computer Interaction Research (CHI 2023)
Designing Intelligent Tools for Creative People (IEEE CG&A 2023)