
designing explainable, actionable human-ai collaborative systems

my role

UX Designer; this included qualitative research, modeling user and AI flows, UI and interaction design, systems modeling, and some front end development

key collaborators

Alexandra Hopping, Amy Zhuang, Janelle Wen, Kylon Chiang

timeframe

7 months

overview

Misinformation on social platforms is one of the most complex and pervasive issues of our time. People don't trust social media platforms, and they don't trust AI.

As part of my MHCI capstone at Carnegie Mellon University, our team of five master's students paired with two senior AI research scientists at the Pacific Northwest National Laboratory (PNNL). Our team studied how people experience and interact with AI systems for misinformation detection on social platforms. We created guidelines for designers, data scientists, and machine learning engineers to build better human-AI collaborative systems for slowing the spread of misinformation on social media platforms.

outcome

Since our research revealed the current norms aren't working, our human-AI collaboration guidelines aim to foster new norms for social media platform design by bringing forward users' pain points with these AI systems.

We consolidated these pain points into guidelines built around three principles: Actionability, Customization, and Explainability (ACE). By applying these principles, designers, data scientists, and ML engineers can find ways to increase trust and acceptance of AI systems for misinformation detection on social media platforms.

I worked with Kylon Chiang to design and develop the website that makes these guidelines, our prototype, and our research report publicly accessible.

Visit our website

design process

It’s a political problem, not an information one

On the surface, it’s easy to say misinformation is an information problem caused by social media platforms and bad actors. We thought that at first too. But our research (expert interviews, think-alouds, literature review, surveys) revealed it’s much more complex than that.

people are unaware of their own behavior

"During the election, there was stuff that was fake and misleading... people said later, 'Oh, this isn't true.' So I’m thinking, we just sent this to a whole bunch of people. Why are we so stupid? Why don't we know it's fake?"
— Conservative, Retired Accountant, 62

We surveyed hundreds of people across the political spectrum and every age range — and not a single participant said they wanted to engage with misinformation.

But in a smaller batch of interviews with a simulated feed, every participant used political viewpoints to evaluate if a source was trustworthy.

“He's a political analyst for Fox News. And he's followed by Laura Ingraham. So not really keeping great company. I assume I don't agree with this guy on basically anything..."
— Liberal, Business Analyst, 28

These self-made heuristics amplify the psychological factors that make people more vulnerable to misinformation. Information that conflicts with our own identity (especially our political identity) is even harder to evaluate critically.

avoiding our own echo chamber

To gather this data, I built a user research recruitment pipeline via Facebook ads connected to an Airtable database. This enabled us to build a large, diverse pool of research participants from every state, ages 18 to 80+, across political affiliations, education levels, and professions. This diversity is vital because all of these factors affect online engagement and susceptibility to misinformation.
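As a concrete illustration, the sketch below shows the core step of that pipeline: writing a screener response into Airtable through its REST records API. The table name, field names, and the ScreenerResponse shape are hypothetical stand-ins for our actual screener, and the surrounding plumbing (the form endpoint, deduplication, consent tracking) is omitted.

```ts
// Minimal sketch: store one screener response in an Airtable base.
// Field and table names are hypothetical; credentials come from the environment.

interface ScreenerResponse {
  name: string;
  email: string;
  age: number;
  state: string;
  politicalAffiliation: string;
  educationLevel: string;
  profession: string;
}

async function addParticipant(response: ScreenerResponse): Promise<void> {
  const baseId = process.env.AIRTABLE_BASE_ID;
  const apiKey = process.env.AIRTABLE_API_KEY;

  // Airtable's records API accepts a list of { fields: { ... } } objects.
  const res = await fetch(`https://api.airtable.com/v0/${baseId}/Participants`, {
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      records: [
        {
          fields: {
            Name: response.name,
            Email: response.email,
            Age: response.age,
            State: response.state,
            "Political Affiliation": response.politicalAffiliation,
            "Education Level": response.educationLevel,
            Profession: response.profession,
          },
        },
      ],
    }),
  });

  if (!res.ok) {
    throw new Error(`Airtable request failed: ${res.status}`);
  }
}
```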

people don’t trust platforms or ai...

Social media platforms optimize for engagement. Their personalization and recommendation algorithms surface relevant content that, in turn, creates echo chambers and reinforces polarization. Echo chambers create competing narratives about what to trust or believe, leaving people uncertain and mistrustful.


Modeling the current norms in trust, identity, & online engagement

Our research revealed that people are concerned about misinformation, but they lack the tools to evaluate their own engagement with it. As we began synthesizing the various parts of our research, I took the lead on building models that bridged information across research efforts and pulled key insights out of the data.

everyone is tired and probably not paying attention

I led a brainstorming session with our team to capture how fatigue, bias, and identity contribute to how people engage with information on social platforms. This early whiteboarding session focused on surfacing our collective research knowledge of online behaviors and platform factors.

This model is based on our data from simulated feeds, feedback from our clients, and a collective 60-paper literature review.


people define truth based on shifting individual beliefs and attitudes

Next, I attempted to build a model of how trust is tied to identity, to uncover parallels for building trust in our interventions. I wanted to look at where trust breaks down and what influences polarization in the current system.

This model was based on a literature review of research on institutional trust, group identity, and interviews with David Danks and Kathleen Carley.

After a critique with our clients, I realized I needed to either go broader and more complex or narrow my efforts to more specific questions. The second iteration of this model focuses on why certain people are more vulnerable to misinformation than others.

We used this model to define ways our guidelines could address the breakdown of trust and increased vulnerability at an interpersonal level.

Evaluating human-AI collaboration strategies

The modeling efforts helped us understand the underlying behavioral patterns connecting people's trust, identity, and online engagement. We defined when, why, and how people were most vulnerable. But we also needed to research the best way for humans and machines to work together to create a system with new, better norms for users and platforms.

defining our impact area

At this stage, we took a step back to find an impact area — our design opportunity. I adapted a scoping framework to narrow our testing. We focused on people who unintentionally spread and consume misinformation, since they make up the largest group and have the most willingness to adopt interventions.

testing the edges of people's comfort with ai

With this as our target group, we simulated information evaluation techniques with participants, ranging from no AI assistance to fully AI-assisted.


The test was intentionally vague to prompt responses about what users thought was there.

Next, we explored a spectrum of solutions with people ranging from no AI to full-AI intervention. This testing used a method called Speed Dating and focused on evaluating the boundaries of trust and human-AI collaboration.


This sketch explores how people would feel about high levels of customization in their feed. Low AI.

After conducting our speed dating sessions, we had a lot of data. I imported our insights into an Airtable base and led the team in a thematic analysis of the research data. We tagged each insight with themes to categorize it.


I then collaborated with another team member to create data visualizations to identify patterns and relationships between themes, storyboards, user needs, and participants.
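As a rough sketch of that analysis step, the snippet below shows how tagged insights (exported from Airtable) could be reduced to theme co-occurrence counts, the kind of summary we visualized. The record shape and field names are illustrative, not our actual schema.

```ts
// Minimal sketch: count how often pairs of themes are tagged on the same insight.

interface Insight {
  id: string;
  participant: string;
  storyboard: string;
  themes: string[]; // e.g. ["explainability", "loss of control"]
}

function themeCooccurrence(insights: Insight[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const insight of insights) {
    const themes = [...new Set(insight.themes)].sort();
    for (let i = 0; i < themes.length; i++) {
      for (let j = i + 1; j < themes.length; j++) {
        const key = `${themes[i]} + ${themes[j]}`;
        counts.set(key, (counts.get(key) ?? 0) + 1);
      }
    }
  }
  return counts;
}
```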

prototyping human-ai collaboration

Combined, this testing revealed that people accept AI systems that help them meet their goals (even if they say they don't). But to trust an AI system, people need to feel that the AI is helping them meet existing goals rather than creating goals for them. As a team, we created five low-fidelity prototypes of possible AI solutions built around these principles.

My prototype focused on how a customization tool could help people explore outside their “bubble.” This was an evolution of the DIY AI idea from our Speed Dating testing.


However, we folded this idea into the guidelines around customization because testing revealed people were more likely to abuse this feature by filtering only for information that confirms their existing bias.

Designing for actionability, customization, explainability

This research and modeling scoped the intent and focus of the guidelines. The guidelines center on three best practices for the design and implementation of human-AI collaborative intervention systems for social platforms. We refer to them by the acronym ACE: Actionability, Customization, and Explainability.

A

Actionability

Support users in taking action against misinformation

C

Customization

Allow users to personalize interventions to best help them reach their specific goals

E

Explainability

Help users understand how features work to establish trust


full intervention details

Get an in-depth view of the methods and research used to establish these principles.

creating goals and taking action together

Full transparency or full customization can undermine the success of human-AI collaborative systems, because people will game the system for their own goals. However, most people's distrust of AI stems from a lack of understanding and from feeling unable to influence the algorithm’s parameters. Our guidelines surface the ability to understand the inputs and outputs of these systems at the right level of action (see the sketch after the list below):

Low

Quick, sweeping actions
Minimal time and effort

Medium

Targeted, account-level actions
Some time and effort

High

Granular control with specific actions
Maximum time and effort
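To make these levels concrete, here is a minimal sketch of how they might be expressed in an intervention configuration. The type names, fields, and example actions are illustrative assumptions, not part of the published guidelines.

```ts
// Illustrative model of the three action levels and what an action at each level could look like.

type ActionLevel = "low" | "medium" | "high";

interface InterventionAction {
  level: ActionLevel;
  label: string;
  scope: "feed" | "account" | "post"; // how targeted the action is
  effort: "minimal" | "some" | "maximum"; // time and effort asked of the user
}

const exampleActions: InterventionAction[] = [
  { level: "low", label: "Reduce flagged content in my feed", scope: "feed", effort: "minimal" },
  { level: "medium", label: "Review the accounts I engage with most", scope: "account", effort: "some" },
  { level: "high", label: "Adjust which signals the fact-checker uses for a post", scope: "post", effort: "maximum" },
];
```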

Implementing our guidelines on social platforms

These guidelines can manifest in many different ways. We defined specific user scenarios using them, including scenarios around information evaluation and reflection.


interactive prototype

You can explore an annotated version of the prototype design with notes on functionality and user interaction needs.

Customization


Feed Settings
Gives users broad control over the visibility of misinformation in their feeds, with an option for more targeted account-level control


Preferred Sources
Takes user preferences into account when compiling facts and resources for fact-check details

Evaluation & Reflection


Accounts to Review
Surfaces accounts that post the highest proportion of misinformation in content users have engaged with


Replacing Misinformers
Recommends more credible accounts with content similar to the accounts that were unfollowed

Actionability


Notifications
Tailors notifications to show content with the right level of detail around key interests and concerns


Engagement Hub
Surfaces personalized actions users can take via a centralized hub

Explainability

Content strategy and interface feedback modules expose the inner workings of the algorithms. This is aimed at helping people understand how the AI works on the back end and give feedback to that system throughout the process.
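As a rough sketch of what such a feedback module could surface, the example below pairs an explanation payload for a flagged post with the feedback a user might send back to the system. All names and fields are hypothetical.

```ts
// Illustrative explanation shown to the user, and the feedback loop back to the model.

interface FlagExplanation {
  postId: string;
  verdict: "likely-misinformation" | "disputed" | "credible";
  signalsUsed: string[];       // e.g. ["source history", "fact-check match"]
  confidence: number;          // 0..1, rendered in plain language in the UI
  factCheckSources: string[];  // compiled from the user's preferred sources
}

interface UserFeedback {
  postId: string;
  agreesWithVerdict: boolean;
  comment?: string;
}

// The UI renders the explanation, then passes the user's response back to the model pipeline.
function collectFeedback(explanation: FlagExplanation, agrees: boolean, comment?: string): UserFeedback {
  return { postId: explanation.postId, agreesWithVerdict: agrees, comment };
}
```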


mapping ai capabilities to the design

These design scenarios are built on three AI models. We worked closely with our clients to outline the training data, inputs, logic, and outputs needed to support the implementation; a sketch of these interfaces follows the list below. Each model traces back to the guidelines and scenarios outlined above.

Information Evaluation

Powers a fact-checker in evaluating content credibility

Accounts to Review

Surfaces top misinformers, or high-risk accounts within the user's engagement

Accounts to Replace

Recommends credible accounts to replace misinformers that the user has unfollowed
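The sketch below expresses the three models as interfaces to show how their inputs and outputs map back to the scenarios above. The shapes are illustrative assumptions on my part; the actual models and schemas were defined with our clients at PNNL.

```ts
// Illustrative interfaces for the three AI components behind the design scenarios.

interface InformationEvaluation {
  // Powers the fact-checker: given post content and its source, return a credibility assessment.
  evaluate(postText: string, sourceAccount: string): Promise<{
    verdict: "likely-misinformation" | "disputed" | "credible";
    supportingSources: string[];
  }>;
}

interface AccountsToReview {
  // Surfaces high-risk accounts within the user's own engagement history.
  rank(userId: string): Promise<Array<{ accountId: string; misinformationShare: number }>>;
}

interface AccountsToReplace {
  // Recommends credible accounts with content similar to an account the user unfollowed.
  recommend(unfollowedAccountId: string): Promise<string[]>;
}
```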


ai component diagram

These diagrams, modeled by Janelle Wen, visualize how the interactions combine to form the full model.

Reflection

The current state of misinformation is a web of complex interactions intensified by rapid societal and technological changes. I believe human-AI collaborative systems will succeed in combating misinformation, but how this will happen isn’t simple or clear.

AI is a powerful tool for addressing the sheer volume of misinformation. But it isn't some magic solution in waiting that will eventually figure out how to get it right. We have to guide that path. The complexity of the problem requires interventions at different scales and in different mediums: tools for deradicalization, programs for increasing social cohesion, education on media literacy, and increased regulation.

Our guidelines focus on defining best practices for matching human and AI goals. The scenarios outline how platforms, and those who design them, can use the guidelines to improve people's engagement habits. As this kind of work enters the public discourse, it brings us a step closer to building successful, trusted human-AI collaborative systems.

what would i do differently

While I believe in the research and design process, and in our guidelines, I've since come to question our choice to focus on building tools for evaluating credibility. At the time, I believed that if we improved people's ability and willingness to evaluate credibility, we could reduce the spread of misinformation by reducing vulnerabilities.

I now believe our work should have focused on how human-AI collaboration can assist community moderation or support preventative methods such as prebunking. In this way, our work could have been more targeted to how people interact on social media platforms — and to those who design and develop for them. There are many people doing amazing work in this space. Some of my "heroes" are Erin Malone, Renée DiResta, and Ethan Zuckerman.

acknowledgements

This research would not have been possible without the expertise and knowledge of the following people.

Clients at PNNL

Dustin Arendt
Maria Glenski

Capstone Advisors

Anna Abovyan
Raelin Musuraca

Expert Interviewees

Haiyi Zhu
Ethan Zuckerman
Kathleen Carley
David Danks
Jonathan Brown
Ken Holstein
Carol Smith
Motahhare Eslami
Erin Malone
Geoff Kaufman
