6 Comments

At my publication (The Washington Gazette) we found a way to work around LLM hallucinations. We use Factiverse to fact-check the content. It searches for credible sources, such as legacy news outlets and academic websites from all ideological persuasions, and then categorizes which sources agree and which disagree. When there is no clear consensus, we use Findsight to scour books, academic publications, and other vetted texts. We plan to add I Doubt News for bias analysis once we finish working with the developer. We use the same programs to fact-check human-written content too.

Of course, we also go out of our way to link to the primary source whenever possible, but not everything has a primary source (a breaking story from AP or an exclusive article on undisclosed documents, for instance). In those cases we still fact-check using the programs above and try to find the best sources possible.

In short, Factiverse.ai, Findsight.ai, and Idoubt.News are ways of fact-checking LLM content quickly and easily.
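In rough code terms, the flow looks something like the sketch below. The helper functions and the consensus rule are hypothetical placeholders, not the actual Factiverse.ai or Findsight.ai APIs; the sketch only illustrates the decision logic described above: check source consensus first, fall back to a deeper literature search when sources disagree, and flag anything still unsettled for a human editor.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FactCheckResult:
    claim: str
    supporting: List[str] = field(default_factory=list)  # sources that agree
    opposing: List[str] = field(default_factory=list)    # sources that disagree
    needs_human_review: bool = False


def consensus_check(claim: str) -> FactCheckResult:
    """Placeholder for a source-consensus service; replace with a real integration."""
    return FactCheckResult(claim=claim)


def literature_search(claim: str) -> List[str]:
    """Placeholder for a books/academic-publication search; replace with a real integration."""
    return []


def fact_check(claim: str) -> FactCheckResult:
    result = consensus_check(claim)

    # No clear consensus among credible sources: consult vetted texts as a fallback.
    if len(result.supporting) <= len(result.opposing):
        result.supporting.extend(literature_search(claim))

    # Still unsupported: flag for a human editor rather than publish as verified.
    if not result.supporting:
        result.needs_human_review = True
    return result


if __name__ == "__main__":
    print(fact_check("Example claim produced by an LLM draft."))
```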

author

Hi John. Thanks for this information. Services like these provide a valuable corrective. I suspect there will still be lots of edge cases where automated verification will need a further human touch. One service I've found helpful, and that tries to combine generative AI with reference sourcing, is Perplexity.ai, a startup backed by Jeff Bezos. It answers queries and provides references to the web to back up its answers. But it seems safer to have entirely separate entities doing the LLM text generation on the one hand and the fact-checking on the other. --Bill


This is one of the clearest and most useful articles about LLM chatbots I have ever read. I wonder, though, about the advice to "Verify first and only then trust." Judging from Dr. Dembski's examples, the more appropriate maxim would be simply "Verify first." There is no trusting, because the very next query might return bogus results.

I fear for the messed-up world that is coming due to the proliferation of unverified AI-generated stories.

author

Hi Dennis. Thanks for this insightful comment. I could see querying a given LLM in a given area of expertise, for instance, basic facts about chemistry. One could compile a long list of chemistry questions, along with verified answers, and then submit the questions to the LLM. What if the LLM is 99.9% accurate -- far more accurate than the average human being? In that event an LLM might earn trust, simply as a matter of statistical accuracy. But note, the trust would then be confined to a given topic area. The problem with LLMs in general, and the one I was trying to address in the paper, is how to assess their truthfulness across topics, especially when no reliability study of the sort I just outlined exists. It seems that a proper skepticism is needed in such cases.
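For concreteness, here is a rough sketch of what such a topic-confined reliability check could look like. The benchmark data, the ask_llm function, and the exact-match scoring rule are hypothetical stand-ins; a real study would need a much larger question set and a more careful grading scheme. The point is only that trust, if earned, is earned per topic by measuring accuracy against independently verified answers.

```python
from typing import Callable, Dict, List, Tuple


def topic_accuracy(
    benchmark: Dict[str, List[Tuple[str, str]]],  # topic -> [(question, verified answer)]
    ask_llm: Callable[[str], str],                # hypothetical function that queries the model
) -> Dict[str, float]:
    """Fraction of verified answers the model reproduces, computed per topic."""
    scores = {}
    for topic, qa_pairs in benchmark.items():
        correct = sum(
            1
            for question, answer in qa_pairs
            if ask_llm(question).strip().lower() == answer.strip().lower()
        )
        scores[topic] = correct / len(qa_pairs)
    return scores


if __name__ == "__main__":
    # Toy example only: two chemistry questions, one answered correctly.
    benchmark = {"chemistry": [("Chemical symbol for gold?", "Au"),
                               ("Chemical symbol for iron?", "Fe")]}
    canned_llm = lambda q: {"Chemical symbol for gold?": "Au",
                            "Chemical symbol for iron?": "Pb"}[q]
    print(topic_accuracy(benchmark, canned_llm))  # {'chemistry': 0.5}
```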


Great article! Thanks for your insights!

" But intelligence can be had without truth. And without truth, there can be no trust."

Indeed. The problem with LLMs appears to be a lack of 'judgment', or the virtue of 'discernment'. As you say, this requires context and memory, a memory honed by previous experience and self-learning, AND independence.

If we look at the fitness of words as scaled to truth, LLMs 'naturally select' the most popular (fit) current ideas with survival value, not because these are inherently better ('true') but because these ideas have been 'artificially selected' by the programmers and trainers as the word-traits and idea-traits the latter ideologically desire.

On the basis of 'selection', one could swap 'memes' for 'genes' here.

Marc Mullie MD

Montreal

author

Thanks Marc! Yes, it does seem that something like selection over a fitness landscape is in play with LLMs, but with no premium placed on truth.
