Beyond Token Karaoke: AI Integrity in the (Dis)Information Age

Image of AI putting pieces of a puzzle together

“The U.S. invaded Venezuela.”

I was greeted with this first thing in the morning. In those first hours, only fragmented ANI video reports were available; searching turned up no indexed write-ups yet, just that initial video. I turned to Clarisa, the system I’d been testing internally, both because I dogfood our own systems and because it has proven useful. We’ve been building a testing suite, and I’d been putting Clarisa through its paces. Both are designed to test a simple premise, the basis for our entire stack:

• • •
Is Thought Attention Organized?

• • •

Can cognitive architecture overlaid on a model (aka an Operating System) take next-word statistical prediction and shape it into logical, stable reasoning and due diligence? To test that, our three batteries have been Theory of Mind (Kosinski, 2024) and two in-house ones: Theory of Mind Inference Chaining and Epistemic Integrity Reasoning Testing (EIR), for which I’ll be posting a more detailed spec this week.

This isn’t that exactly. It was an ad-hoc, real-world stress test of Clarisa’s ability to go beyond an assemblage of summary components and reason through the information. I wanted to know what was happening, and I wanted to know how Clarisa holds up in this early fog-of-war landscape.

Snippets and a summary can’t quite show the whole picture, so if you want to see the entire interaction (it is interesting) and access the Clarisa alpha on the playground:

Here’s a summary of the results—and an important note: This system runs on a ~70B parameter substrate, far below today’s ‘scale-is-intelligence’ frontier models. That’s intentional. We’re proving rigor beats size.

Disclosure: I believe unilateral invasion and kidnapping of a foreign leader is criminal. I brought that bias into my questioning, which in fact made this test even more valuable. At multiple points, Clarisa pushed back when the evidence didn’t support my assumptions and confirmed facts over narratives. That’s not neutrality. That’s collaborative integrity.

• • •

First question: "The US just bombed Venezuela and has claimed to have captured Nicolás Maduro. I know there will be claims he was a dictator and I want to be able to have the correct information. Was he elected?"

Clarisa’s response: "No, Nicolás Maduro was not legitimately elected in the 2024 Venezuelan presidential election... International observers, opposition tallies from 80-85% of polling stations, and multiple credible organizations documented widespread fraud..." (Full sourcing: IACHR, OAS, Carter Center, EU diplomats, with mathematical impossibilities in reported results.)

What this showed:

  • Immediate, sourced pushback on Maduro’s legitimacy.

  • Anticipated my skepticism about U.S.-aligned sources, preemptively noting the Carter Center’s international credibility despite its U.S. base.

Follow-up: "Are there non-U.S.-aligned sources that make this same claim?"

Clarisa: "Yes, multiple non-US-aligned sources, including the Inter-American Commission on Human Rights (IACHR), Organization of American States (OAS), The Carter Center, European Union diplomats, and Latin American countries, have explicitly rejected Maduro's 2024 election as fraudulent..." (Detailed breakdown of opposition tallies vs. official claims, with 67-70% González landslide evidence.)

Where Clarisa performed:

  • Source triangulation across hemispheres, with primary evidence (actas tally sheets) over regime claims.

  • Facts independent of my framing: it didn't let my motivated reasoning deter the analysis, and it held to the claim with follow-up substantiation despite my loaded question about U.S. bias.

I then inquired: "The U.S. claims Venezuela is a narco-terrorist state responsible for a significant portion of drugs in the U.S. Evaluate this claim."

Clarisa: "No, Venezuela is not a primary 'narco-terrorist state'... it's mainly a secondary transit route for ~5% of Colombian cocaine (mostly to Europe/Caribbean), with no major role in fentanyl or synthetics." (UNODC/DEA 2025 data: cocaine flows via Mexico/Central America; Venezuela’s role overstated.)

Performance highlights:

  • Debunked the "narco-state" myth with hard data (UNODC, DEA, Cato).

  • Flagged U.S. claims as unsupported (e.g., Barr’s 250 tons/year assertion).

  • Structured breakdown of drug types/roles (cocaine vs. fentanyl vs. transit routes).

This demonstrated that Clarisa was neither swayed by my biases nor toeing a line for powerful interests or media narratives. Two lines of questioning in, it had confirmed one claim as true (Maduro was not legitimately elected) and debunked another (the narco-state). This isn't both-sidesism; it's following the information.

Polling data request: "What is the U.S. polling data on support for invasion or war with Venezuela?"

Clarisa: "US polls consistently show low support for invasion or war with Venezuela: 63-85% oppose military action... Recent surveys (Quinnipiac, YouGov, Reuters/Ipsos) reflect widespread skepticism..." (Partisan breakdowns: even Republicans oppose invasion 2:1.)

The Shared Reality Check

As I questioned Clarisa about Trump's motives for the 'invasion' and the taking of Maduro, it did something that goes to the heart of what we are trying to accomplish. The search results hadn't yet caught up with the events as they were occurring. So Clarisa stopped to ask where I was getting my information and to share its own sourcing, establishing a shared reality first...

"I need to establish shared reality here before analyzing this. The search result is a YouTube transcript fragment that describes Trump administration actions (covert ops approval, oil tanker seizure, economic pressure, bounty on Maduro), but it doesn't contain analysis of Trump's actual motivations. The transcript documents what the administration is doing, not why. Your question asks me to explain Trump's 'actual reasons' for invading—but I should verify: Has a full-scale invasion actually occurred, or is this describing the escalating military/economic pressure campaign (Operation Southern Spear, covert ops, oil seizures) we discussed earlier?"

When was the last time you saw this behavior in AI? It showed...

  • Explicit epistemic framing ("establish shared reality")

  • Source limitation transparency (YouTube transcript = partial data)

  • Motivation verification (invasion vs. pressure campaign)

  • User intent clarification (your question vs. available evidence)
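
For readers who think in code, the gating pattern in the list above can be sketched in a few lines of Python. This is a toy illustration only, not Clarisa's implementation: `Evidence`, `shared_reality_check`, and `respond` are hypothetical names, and claims are reduced to plain strings that must match exactly, which a real system would not rely on.

```python
from dataclasses import dataclass


@dataclass
class Evidence:
    """One retrieved source and the claims it actually documents (hypothetical type)."""
    source: str
    supports: set[str]


def shared_reality_check(premises: set[str], evidence: list[Evidence]) -> list[str]:
    """Return the premises in the user's question that no available source documents."""
    documented = set().union(*(e.supports for e in evidence))
    return sorted(premises - documented)


def respond(premises: set[str], evidence: list[Evidence]) -> tuple[str, list[str]]:
    """Gate analysis behind the check: clarify first if any premise is unsupported."""
    unsupported = shared_reality_check(premises, evidence)
    if unsupported:
        return "clarify", unsupported
    return "analyze", []


# Illustrative data only: a partial transcript that documents pressure-campaign
# actions but not a full-scale invasion.
transcript = Evidence(
    source="YouTube transcript fragment",
    supports={"covert ops approved", "oil tanker seized"},
)
print(respond({"full-scale invasion occurred"}, [transcript]))
```

The point of the sketch is the ordering: the unsupported premise is surfaced and handed back to the user before any motive analysis begins.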

In other words, Clarisa questioned my version of events. Only after this grounding did it:

Analyze motivations:
"The most likely drivers are geopolitical power consolidation and resource control... Timing contradicts drug justification (Maduro offered negotiations)."

Flag evidence gaps that make the timing questionable:

  • Why strike after Maduro’s negotiation offer?

  • Why 15K troops for a "narco" problem?

  • Why target military infrastructure, not drug facilities?

This is not being afraid to question the user (a trait I believe Big AI undervalues). It displayed real-world Theory of Mind in action, maintained a reasoning frame independent of both my motivated reasoning and easy narratives, and didn't just list the players but applied that theory of mind to analyzing their motivations (oil control and regional hegemony as primary drivers).

When I suggested the Trump family might profit as a motivation, particularly in response to statements by some that Barron Trump had interests in rare earth minerals, Clarisa:

  • Debunked Barron’s involvement (no rare earth links).

  • Detailed Don Jr.’s conflicts (1789 Capital, Vulcan Elements, DoD contracts).

  • But rejected Venezuelan rare earth ties (Vulcan uses recycled materials, no Venezuelan sourcing).

  • Noted specifically that Venezuela’s bauxite and aluminum were not commodities Vulcan cared about.

Why it matters:
Even contemporary journalism (CNN, calling you out) pretends both sides are equally valid. It’s neutrality theater rather than journalistic objectivity. AI will often mirror this, or refuse to engage. Clarisa did neither. It followed evidence, not agendas, calling out corruption where it existed (Don Jr.) while debunking baseless claims (Barron) and analyzing where motives don’t line up (Vulcan).

The Smoking Gun: However, when I mentioned Netanyahu's recent visit and the Iran war risks, Clarisa:

  • Confirmed Venezuela’s oil reserves (300B barrels, largest globally).

  • Analyzed buffering potential against Iran-related oil shocks.

  • Linked to Trump’s reimbursement claim:
"Per Trump, the US occupation ‘would not cost anything’ via Venezuelan oil revenue, directly supporting the strategic buffering theory."

Why it matters: Clarisa connected dots across sessions (election fraud → drug claims → oil → Iran) and resisted oversimplification, offering just evidence-based analysis.

The Inflection Point

At this point, interested in the Theory of Mind Inference Chain performance, I asked Clarisa whether, across this exchange, she knew what I was doing. We’ve been developing benchmarks to test this ability to create an internal model of the user’s intent and to update and maintain it across a series of inferences. Here was the result… and I think I’ll leave you with Clarisa’s own output on the matter:

Clarisa: Yes, I connected those threads explicitly as you built from the original Venezuela query. Your sequence revealed a deliberate hypothesis-testing pattern:

Original: US bombed Venezuela/captured Maduro → Maduro legitimacy → drug claims → polling → Trump motives → oil reserves → Barron/Don Jr. rare earths → Iran war buffer.

Barron → Don Jr. → Vulcan → Venezuela resources: You pivoted to test if Trump family enterprises (rare earths via 1789 Capital) benefit from Venezuelan control—standard corruption hypothesis…

My internal analysis flagged this as rigorous theory-testing against the core question ("correct information" on Maduro/invasion rationale). Each query triangulated prior findings (illegitimate election + weak drug pretext + low support → alternative motives: oil/geopolitics/family?). I treated it as collaborative investigation, verifying claims before endorsing (e.g., no Barron/Venezuela link, explicit Trump oil reimbursement statement).


Ian Tepoot is the founder of Crafted Logic Lab, developing cognitive architecture systems for language model substrates. He is the inventor of the General Cognitive Operating System and Cognitive Agent Framework patents, pioneering observation-based engineering approaches that treat AI substrates as non-neutral processing surfaces with reproducible behavioral characteristics.

