Recent breakthroughs in Large Language Models (LLMs), such as ChatGPT, and generative models, such as Stable Diffusion, can be tremendously valuable in supporting analysis and creative tasks. While powerful, such models can be difficult to use, especially for domain experts, such as qualitative researchers and visual artists, who are not experts in machine learning. This project will investigate using LLMs to support qualitative analysis. The goal is to
- First, understand the user needs,
- Then, design an interactive visual interface, and
- Finally, make recommendations.
What is Qualitative Analysis?
Extracting required information from documents is a common task in many research areas such as Human-Computer Interaction (HCI), Health, Psychology, Sociology, Law, and Business. For example, in a study to understand the attitude of the general public towards LLMs, the researchers may interview many participants and build up a large collection of interview transcripts. The researchers then need to extract information such as the type of user, the application domain, and the user's attitude.
Why is qualitative analysis time-consuming?
Currently such information has to be extracted manually, often following a ‘coding scheme’ that defines the types of user, application, and attitude. It can take several hours to ‘code’ a one-hour interview transcript. This makes such research very time-consuming (weeks or months just to code the transcripts) and significantly limits the power of the analysis (a larger number of interviews allows the discovery of patterns that are more general, and is less likely to miss rare but important cases).
How can LLMs help?
The project idea is to support such analysis with LLMs, such as ChatGPT or open-source alternatives such as Llama.
- Can we simply give an LLM the coding scheme and the transcript and ask it to do the ‘coding’?
- How can we help the LLM better understand the user’s analysis goal, i.e., what information the user wants to extract?
- How can we make this process easier for users unfamiliar with LLMs? A psychologist probably can’t write code to use an LLM, and a web interface such as ChatGPT’s is not ideal for data analysis.
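The first question above can be sketched in code. Below is a minimal, hypothetical example of assembling a prompt that gives an LLM a coding scheme and a transcript; the scheme labels and transcript text are made-up illustrations, and the actual call to a model (cloud or local) is omitted so the sketch stays self-contained.

```python
# Sketch: asking an LLM to 'code' an interview transcript against a coding
# scheme. The scheme and transcript below are invented examples; a real
# scheme would come from the researchers.

def build_coding_prompt(coding_scheme: dict, transcript: str) -> str:
    """Assemble a single prompt giving the LLM the scheme and the text."""
    scheme_lines = "\n".join(
        f"- {label}: {description}" for label, description in coding_scheme.items()
    )
    return (
        "You are assisting with qualitative coding of interview transcripts.\n"
        "Apply the following coding scheme. For each passage, list the codes "
        "that apply and quote the supporting text.\n\n"
        f"Coding scheme:\n{scheme_lines}\n\n"
        f"Transcript:\n{transcript}\n"
    )

scheme = {
    "user-type": "Who the speaker is (e.g. student, professional).",
    "application": "What they use the LLM for.",
    "attitude": "Positive, negative, or mixed feelings towards LLMs.",
}
prompt = build_coding_prompt(
    scheme, "I'm a teacher and I love using it for lesson plans."
)
# The prompt would then be sent to an LLM; that call is omitted here.
print(prompt)
```

Keeping the prompt construction separate from the model call also makes it easy to swap between a cloud API and a locally hosted model later.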
The goal is not to fully automate the analysis, as LLMs are still not as good as (domain) experts for many tasks. Instead, the approach is to have the LLM produce a first-draft result, which the users/experts can then refine into the final results. This can hopefully reduce the overall effort to a half, a quarter, or even a tenth of what manual analysis requires. This is known as the human-AI teaming approach.

Project 1: Evaluation of psychological counselling
I am currently working with psychologists from the School of Medicine and the NHS to assess the effectiveness of psychological counselling.
Background
Mental ill health affects 1 in 6 adults and can profoundly influence a person’s life, work, and relationships. It is the largest cause of disability in the UK, with a direct cost to the UK Government of £24–27bn per year and wider cost to the economy of £74–99bn per year.
Psychological therapy is an important treatment for common mental health problems, receiving 1.81 million NHS referrals in 2021–22. However, there is substantial variability in effectiveness between therapists, and this variability can contribute more to patient outcomes than therapy type or dose. In one of the largest studies examining between-therapist differences in effectiveness, the most efficient and effective therapists were able to successfully treat 10 patients in the time an average therapist would take to treat 1 patient.
The most promising evidence suggests that therapists can maximise effectiveness through detailed feedback on their interaction skills in routine practice. However, current methods for this (i.e., monitoring and rating therapist interaction skills) require costly and time-consuming assessments. Because of this, they are almost never used in routine practice.
Project
In this project we will partner with the creator of the Consultation Interactions Coding Scheme (CICS), which is designed for categorising and rating therapist-patient interactions. CICS-based classifications have been shown to predict a range of patient outcomes, outperforming patients’ and therapists’ own predictions of prognosis. However, CICS analysis can be time-consuming, which prevents its wide adoption among therapy services.
This project aims to investigate the feasibility and effectiveness of reducing the time and effort required for CICS analysis by utilising the latest Large Language Models (LLMs). LLMs such as ChatGPT exhibit a level of intelligence on natural language tasks never seen before. However, due to their proprietary nature and high computational cost, often the only way to access an LLM is by sending data to a cloud service, which makes them unsuitable for the sensitive patient data a therapy session may include. There have been ongoing efforts to provide open-source alternatives, but their performance lagged behind their closed-source counterparts until the very recent release of open-source LLMs such as Llama 2 by Meta (July 2023). For the first time, this makes it possible to run an LLM locally, without sending any data to external parties and without a significant performance penalty. This project will be the first to apply this technology to therapy analysis.
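To give a flavour of local use, below is a minimal sketch of wrapping an analysis request in the chat prompt template Llama 2 chat models were trained with. The system and user strings are invented examples, and the actual model invocation (e.g. via llama.cpp or a similar local runtime) is deliberately omitted, so nothing in this sketch leaves the machine either.

```python
# Sketch: the [INST] / <<SYS>> markers form Llama 2's chat prompt template.
# The local model call itself is omitted; only prompt formatting is shown.

def llama2_chat_prompt(system: str, user: str) -> str:
    """Format a system instruction and user message for a Llama 2 chat model."""
    return f"<s>[INST] <<SYS>>\n{system}\n<</SYS>>\n\n{user} [/INST]"

prompt = llama2_chat_prompt(
    "You rate therapist-patient interactions using the given coding scheme.",
    "Rate the following session excerpt: ...",
)
print(prompt)
```

Running the model behind this prompt on local hardware is what keeps the patient data on-site, which is the key requirement identified above.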
Resources
- Developing an Automated Assessment of In-session Patient Activation for Psychological Therapy: Co-development Approach
- Involving psychological therapy stakeholders in responsible research to develop an automated feedback tool: Learnings from the ExTRAPPOLATE project

Project 2: Medical data integration
Again, this is a project in collaboration with the School of Medicine and the NHS.
Background
There are many medical datasets in the NHS (thousands or even tens of thousands). This is a small collection of them, as the majority of the data contains personal information and is therefore not publicly available. The value of these datasets largely depends on the integration of relevant information. For example, while useful individually, patient demographic information, hospital treatment history, and medical test results are much more valuable if they are all linked together (i.e., integrated).
However, in practice the interoperability among medical datasets is quite poor. This is often the result of a lack of interoperability planning during data modelling, which leads to disparate data formats and semantics. The format differences include not only differences in file format, but also differences in the types of information recorded or their level of detail. The semantic differences mean that the same word can mean different things in different datasets, or that the same thing is referred to by different terms across datasets.
As a result, (medical) data integration is a very time-consuming manual process. There is a very large team within the NHS (with thousands of staff) whose full-time job is to map data from one source to another. This involves matching each column in one data table to the columns in another, if such a match exists. Keyword matching is not enough: the staff need to check whether the same term means the same thing in the two data collections, which requires medical knowledge of the issues the data describes.
One common practice in such mapping is to annotate the data with a standard vocabulary, which precisely defines what each term means. This makes the mapping easier once both collections are annotated with the same standard vocabulary. However, mapping data columns to a standard vocabulary is not easy either. SNOMED is a popular choice for this purpose, and it contains over 350,000 concepts. Finding a semantically matching concept or concepts in such a large collection is time-consuming and requires considerable domain knowledge.
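The core retrieval step can be illustrated with a toy sketch: rank candidate concepts by similarity to a column name and keep the best match. A real system would use semantic embeddings over the full SNOMED concept list; the three "concepts" below are made-up stand-ins for illustration, not actual SNOMED entries, and simple word overlap stands in for the similarity measure.

```python
# Toy sketch: match a data column name to candidate vocabulary concepts by
# lexical overlap. Real systems would use embeddings over ~350,000 SNOMED
# concepts; the concepts below are invented stand-ins.

def token_overlap(a: str, b: str) -> float:
    """Jaccard similarity between the word sets of two strings."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb)

concepts = [
    "blood pressure measurement",
    "body weight",
    "heart rate",
]

column = "systolic blood pressure"
best = max(concepts, key=lambda c: token_overlap(column, c))
print(best)  # → blood pressure measurement
```

Even this crude measure shows why automation helps: it narrows 350,000 candidates down to a shortlist for the expert to verify, which is exactly the division of labour the project proposes.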
Project
This project will also follow the human-AI teaming approach, i.e., we will use AI (probably an LLM) to do a first pass on the mapping between a dataset and a controlled vocabulary such as SNOMED. Once this is completed, the user can check and further refine the results until they are satisfactory. The goal is not to fully automate this process, but to make it (much) faster. The development of tools for such mapping has been ongoing for a long time, and this project is likely to build on existing efforts such as Carrot, which is developed by the project collaborators.
Resources
- ReMatch: Retrieval Enhanced Schema Matching with LLMs (2024, arXiv by Microsoft)
- Schema Matching with Large Language Models: an Experimental Study (2024, VLDB workshop)
- GRAM: Generative Retrieval Augmented Matching of Data Schemas in the Context of Data Security (2024, KDD)
- Matchmaker: Self-Improving Compositional LLM Programs for Table Schema Matching (2024, NeurIPS poster)
- Valentine in action: matching tabular data at scale (2021, VLDB)
- https://carrot.ac.uk/
- https://co-connect.ac.uk/co-connect-home/
- https://www.nottingham.ac.uk/dts/news/2021/co-connect.aspx
- https://www.hdruk.ac.uk/projects/co-connect/
