Can AI (ChatGPT etc.) help people out of the rabbit hole?

FatPhil

Senior Member.
56 pages, and I've not had a chance to properly dive in yet, but this does sound like an interesting approach - at least with people who are willing to engage in this kind of experiment.

Durably reducing conspiracy beliefs through dialogues with AI

Authors: Thomas H. Costello1*, Gordon Pennycook2, David G. Rand1
1Sloan School of Management, Massachusetts Institute of Technology; Cambridge, MA, USA
2Department of Psychology, Cornell University; Ithaca, NY, USA
*Corresponding Author. Email: thcost@mit.edu

Abstract: Conspiracy theories are a paradigmatic example of beliefs that, once adopted, are extremely
difficult to dispel. Influential psychological theories propose that conspiracy beliefs are uniquely resistant
to counterevidence because they satisfy important needs and motivations. Here, we raise the possibility
that previous attempts to correct conspiracy beliefs have been unsuccessful merely because they failed to
deliver counterevidence that was sufficiently compelling and tailored to each believer’s specific conspiracy
theory (which vary dramatically from believer to believer). To evaluate this possibility, we leverage recent
developments in generative artificial intelligence (AI) to deliver well-argued, person-specific debunks to a
total of N = 2,190 conspiracy theory believers. Participants in our experiments provided detailed,
open-ended explanations of a conspiracy theory they believed, and then engaged in a 3 round dialogue
with a frontier generative AI model (GPT-4 Turbo) which was instructed to reduce each participant’s belief
in their conspiracy theory (or discuss a banal topic in a control condition). Across two experiments, we
find robust evidence that the debunking conversation with the AI reduced belief in conspiracy theories by
roughly 20%. This effect did not decay over 2 months time, was consistently observed across a wide
range of different conspiracy theories, and occurred even for participants whose conspiracy beliefs were
deeply entrenched and of great importance to their identities. Furthermore, although the dialogues were
focused on a single conspiracy theory, the intervention spilled over to reduce beliefs in unrelated
conspiracies, indicating a general decrease in conspiratorial worldview, as well as increasing intentions to
challenge others who espouse their chosen conspiracy. These findings highlight that even many people
who strongly believe in seemingly fact-resistant conspiratorial beliefs can change their minds in the face
of sufficient evidence.

Note: This is a working paper, a preliminary version of research that is shared with the community for feedback and
discussion. It has not yet been peer reviewed. Readers should keep this in mind when interpreting our findings and
conclusions. We will make all the code, data, and materials associated with this research publicly available.

Last update: Apr 3, 2024
Content from External Source
(emphasis mine)
I found the PDF at https://files.osf.io/v1/resources/x.../660d8a1f219e711d48f6a8ae?direct=&mode=render , but I'm not sure that's how they intend you to access it.
 
My limited understanding is that LLM "AIs" can be prompted to confirm any bias inherent in the phrasing of the question you ask, given that the training data would contain text from both perspectives, and even text containing anti-conspiracy viewpoints might quote conspiracy claims and vice versa.
 

You're overlooking the system prompt, which was specifically set up to avoid such mistakes:
"a 3 round dialogue with a frontier generative AI model (GPT-4 Turbo) which was instructed to reduce each participant’s belief in their conspiracy theory"
 
So is it still effective once the participant is told the AI was instructed to be biased?
 
That depends on exactly how they explain that. Given some of the boilerplate they prepared, I don't have the greatest confidence that they did a good job at that. We see this occasionally from some of the academics we deal with; you can ask the same question two different ways, and get two different answers. (Which can be a useful feature at times, of course.)
 
This sounds as if it is intended to be used in a clinical setting. But I suggest AI itself has been responsible for the opposite effect, that of persuading people into conspiracy beliefs, when people do their own questioning without that instruction to reduce it.

A second question is the sampling technique. Would voluntary participation in this trial mean that they have chosen a subset of conspiracists who are willing to have their minds changed? That alone would skew the results considerably.
 
To evaluate this possibility, we leverage recent developments in generative artificial intelligence (AI) to deliver well-argued, person-specific debunks to a total of N = 2,190 conspiracy theory believers.
Content from External Source
If these preliminary results hold up, I could see this as being a possible explanation. Like a chess-playing program, one thing AI can do that we can't is sort through massive amounts of information in a very short time. If prompted correctly, could an AI home in on the exact argument and counter-argument in a way that is difficult for real people?

I had a discussion with my brother about the Skinwalker Ranch TV show. He's not a CT guy, but the entertainment of the show sucked him in, and he now thinks a lot of what is portrayed is real. Now, I'm well versed in the history of SWR, Bigelow, Taylor and so on, and could make a good argument in general about why the show is bogus. But when he brought up specific examples, I was at a loss. I can't remember each and every episode, even if I had watched them (which I didn't), well enough to formulate an argument against a specific example off the top of my head.

The other thing is, 20% of what:

we find robust evidence that the debunking conversation with the AI reduced belief in conspiracy theories by
roughly 20%. This effect did not decay over 2 months time, was consistently observed across a wide
range of different conspiracy theories, and occurred even for participants whose conspiracy beliefs were
deeply entrenched and of great importance to their identities.
Content from External Source
Did 20% of the participants drop their CT beliefs, while 80% held on to them? Or did the participants as a whole lessen their CT beliefs by 20%? If so, what does that mean? "I was 100% sure 9/11 was an inside job, but after specific counter-arguments from an AI, I'm now 80% sure 9/11 was an inside job"?
 
I wonder if they tried it the other way. How effective is it in convincing people that a conspiracy theory is real? That is a worrisome thought.
 
Or did the participants as a whole lessen their CT beliefs by 20%? If so, what does that mean?
Article:
Participants then rated their belief in the summary statement, yielding our pre-treatment measure (0-100 scale, with 0 being “definitely false”, 50 being “uncertain” and 100 being “definitely true”). All respondents then entered into a conversation with the AI model (treatment argued against the conspiracy theory’s veracity, control discussed relevant topics). Following three rounds of dialogue, respondents once again rated their belief in the summarized conspiracy statement, serving as our post-treatment measure. Shown is an example treatment dialogue which led the participant to substantially reduce their belief.

...
Indeed, the treatment reduced participants’ belief in their stated conspiracy theory by 16.5 percentage points more than the control (linear regression with robust standard errors controlling for pre-treatment belief, 95% CI [13.6, 19.4], p < .001, d = 1.13; Figure 2a). This translates into a 21.43% decrease in belief among those in treatment (vs. 1.04% in the control). Furthermore, over a quarter (27.4%) of participants in the treatment became uncertain in their conspiracy belief (i.e. belief below the scale midpoint) following the conversation, compared to only 2.4% in the control.
Source: file:///C:/Users/merri/AppData/Local/Temp/MicrosoftEdgeDownloads/5a003226-901a-421e-8599-204fda28a384/CostelloPennycookRand_ConspiracyReduction%20withAI%20(1).pdf
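
To make the two readings concrete, here's a quick back-of-the-envelope check using only the numbers quoted above (the mean pre-treatment belief isn't given in this excerpt, so it's backed out from the other two figures):

# The excerpt reports a 16.5 percentage-point drop on the 0-100 belief scale,
# described as a 21.43% relative decrease among those in the treatment.
drop_points = 16.5
relative_decrease = 0.2143

# Mean pre-treatment belief implied by those two figures:
implied_pre_belief = drop_points / relative_decrease
print(round(implied_pre_belief, 1))  # ~77.0 on the 0-100 scale

# So the "20%" is a within-person drop in the belief rating, roughly
# "I was ~77/100 sure, now I'm ~60/100 sure" on average, rather than 20% of
# participants abandoning their belief (although 27.4% did end up below the
# scale's uncertainty midpoint of 50).
print(round(implied_pre_belief - drop_points, 1))  # ~60.5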



[Attached image 1713289760970.png: screenshot of the example treatment dialogue from the paper]


* the conspiracy theorist in their example is an AI bot obviously. (she is also a pretty lame conspiracy theorist).
 
This is a good study, but one thing about a lot of academic studies relating to the belief in or spread of conspiracies, misleading or false information, etc., is that they tend to happen within silos. The issue is that everything outside of the silo is largely what drives the phenomenon in the first place.
This study works, as is shown, but if you were to try to operationalize this outside of how the study conducted it, you'd see diminishing returns.


For example, if you're intentionally putting this sort of content out, you "time for effect". This means you release your product when you assess it will have peak impact with the target audience. That is something I have seen none of these academic studies cover properly, and, conversely, research equally shows it is just as important, as it can be the actual decisive factor in a lot of cases (decisive factor not meaning the person(s) won't change, but rather that they hold the belief, emotion, attitude, or behavior at the time it's relevant for the actor's objective).
You can absolutely find studies that approach it from that frame, but they heavily orient towards sociocultural studies using reams of qual-quant data collected from open sources and through direct research, bolstered by very intensive population simulations.

https://nsiteam.com/social/wp-conte...tion-Whitepaper-Jun2016-Final5.compressed.pdf
Here's a whitepaper from a simulation conducted in 2016 using this frame in support of Counter-Daesh/IS messaging efforts.

Another issue with the paper, and you see this with polling sometimes too, is that the people responding in this case are already self-filtering into a specific audience. This would be a target audience less susceptible to belief perseverance, and with an apt level of Openness and Agreeableness for using direct refutation as a counter. This would also be an audience more susceptible to belief change. This would be an audience more susceptible to holding false beliefs, but for short periods and likely to change. This would be an audience to keep as a strategic asset and you would target to mobilize for short-term results rather than target for long term belief and support.
 
The part about it working better for people who a) are OK with AI and b) trust the institutions... could you decipher the numbers for the people who don't trust the institutions/AI? I got that it was way less, but I don't know what those numbers mean in terms of determining how much less.

My guess is that Snopes and the old Metabunk format were just as effective for those types of people who trust what they are being told.
 