Verification

Analysis that cross-examines itself.

Why should I trust what it says?

A second agent, looking for holes.

Every time the main Sidekick agent lands a finding, a separate validation agent runs in the background. Different model, different tools, different prompting. Its job is not to continue the investigation — it's to try to break the claim against the actual binary.

Each outcome is recorded in the notebook with a clear status (draft, verified, or rejected), plus an orthogonal important flag for keystone findings. Footer pills link to the creator thread and each validation thread, so the reasoning chain stays visible.
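A minimal sketch of what such an outcome record could look like, in Python. Every name here is an assumption for illustration, not Sidekick's actual schema:

```python
from dataclasses import dataclass, field
from enum import Enum

class Status(Enum):
    DRAFT = "draft"        # recorded, not yet validated (or not provable yet)
    VERIFIED = "verified"  # validation found concrete supporting evidence
    REJECTED = "rejected"  # validation broke the claim against the binary

@dataclass
class Outcome:
    claim: str                     # the finding as the main agent stated it
    status: Status = Status.DRAFT
    important: bool = False        # keystone flag, orthogonal to status
    creator_thread: str = ""       # footer pill: thread that produced the claim
    validation_threads: list[str] = field(default_factory=list)  # one per run
```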

What makes validation different.

Four things separate a validation run from the thread that produced the claim; a code sketch after the list shows how they come together:

1. Different model

Not the model that produced the claim. An independent perspective that was not primed by the original chat's momentum.

2. Different tools

Validation-specific checks, including disassembly-to-IL mapping verification and probes of the structural assumptions the main agent took on faith.

3. Different prompting

Explicitly adversarial. The validation agent looks for what doesn't match the binary rather than extending a narrative that already sounds plausible.

4. Different discipline

Won't advance a claim to verified without concrete evidence linked to IL, assembly, or memory state. Untested conditions stay in draft, not silently omitted.
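Put together, a dispatch could look like the sketch below. It reuses the Outcome and Status sketched earlier; spawn_agent, the tool names, and the model pool are hypothetical stand-ins, not Sidekick's API:

```python
# Hypothetical dispatch; all names are illustrative, not Sidekick's API.
# `outcome` is the Outcome record sketched earlier.
MODELS = ["model-a", "model-b", "model-c"]   # placeholder model pool

ADVERSARIAL_PROMPT = (
    "You did not produce this claim. Try to break it against the binary. "
    "Advance it only on evidence tied to IL, assembly, or memory state; "
    "leave anything you cannot test in draft."
)

def pick_different_model(exclude: str) -> str:
    # 1. Different model: never the one that produced the claim.
    return next(m for m in MODELS if m != exclude)

def dispatch_validation(outcome, creator_model: str, spawn_agent):
    run = spawn_agent(
        model=pick_different_model(creator_model),
        tools=["il_disasm_mapping_check", "xref_walk"],  # 2. validation-specific tools
        system_prompt=ADVERSARIAL_PROMPT,                # 3. adversarial prompting
    )
    verdict = run.examine(outcome.claim)
    # 4. Different discipline: only concrete evidence moves the status.
    if verdict.evidence:
        outcome.status = Status.VERIFIED if verdict.holds else Status.REJECTED
    outcome.validation_threads.append(run.thread_id)
```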

Worked example

Persistence mechanism analysis with verification
Finding

The binary establishes persistence via registry modification at HKCU\Software\Microsoft\Windows\CurrentVersion\Run.

Verification pass

RegSetValueExW confirmed in IL at sub_401230+0x14. Three cross-references validate the call path. Key name "SvcUpdate" resolved from string table.

Flagged

Behavior in a non-admin context untested. The call to RegSetValueExW may fail without elevation; this condition was not exercised during analysis.

Result

Finding marked verified with one untested condition noted. The analyst can accept the finding for reporting while flagging the elevation question for follow-up.
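The static half of a pass like this reduces to a few lines against the Binary Ninja Python API. A minimal sketch, assuming an illustrative sample path and reusing the symbol, cross-reference count, and string from the example above:

```python
import binaryninja

# Path is illustrative; the checks mirror the verification pass above.
with binaryninja.load("sample.exe") as bv:
    # Resolve the imported API the finding depends on.
    syms = bv.get_symbols_by_name("RegSetValueExW")
    assert syms, "Claim breaks immediately: RegSetValueExW is not imported"

    # Walk code references to validate the claimed call path.
    refs = list(bv.get_code_refs(syms[0].address))
    print(f"{len(refs)} cross-references to RegSetValueExW")  # pass above found 3
    for ref in refs:
        caller = ref.function.name if ref.function else "<no function>"
        print(f"  referenced from {caller} at {ref.address:#x}")

    # Confirm the key-name string the finding relies on.
    print("'SvcUpdate' in string table:",
          any("SvcUpdate" in s.value for s in bv.strings))
```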

Escalating past static.

Static cross-checks handle most claims. But reachability, crash paths, elevation outcomes, and input-triggered behavior often require actually running the code, or solving its path constraints. The validation agent can be equipped with deeper tools: the Binary Ninja debugger for dynamic proof, formal-methods integrations for symbolic reasoning. When static can't settle a claim, it escalates. When no tool can prove it, the outcome stays in draft with the gap named. The debugger also stays one click away from any outcome for hands-on confirmation.
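That escalation order reads as a simple ladder. A sketch under the assumption that each tier is an optional checker returning evidence or nothing; none of these names come from Sidekick:

```python
def settle(claim, static_check, dynamic_check=None, symbolic_check=None):
    """Escalation ladder sketch: each checker returns True/False evidence
    or None when it cannot settle the claim. All names are placeholders."""
    tiers = (("static", static_check),
             ("dynamic", dynamic_check),    # e.g. Binary Ninja debugger
             ("symbolic", symbolic_check))  # e.g. formal-methods integration
    for name, check in tiers:
        if check is None:
            continue                        # tool not equipped in this workspace
        verdict = check(claim)
        if verdict is not None:             # this tier produced concrete evidence
            return ("verified" if verdict else "rejected"), name
    # No tier could prove or break it: stay in draft and name the gap.
    return "draft", "unsettled by available tiers"
```

The explicit draft return, with the gap named rather than silently dropped, is the design choice the paragraph above describes.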

On by default. Off when you need it.

Verification runs automatically against every finding. That is the default, because unsupported claims create the kind of rework that cancels out the speed gain. You can disable it per workspace when you want to stay under a Cloud subscription's usage limits, conserve tokens on Self-Hosted, or run exploratory chats where the cost of a stray claim is lower than the cost of verifying it.
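If the toggle were exposed through Binary Ninja's settings system, flipping it for one workspace might look like the sketch below. The Settings calls are real Binary Ninja API; the key name is purely an assumption:

```python
import binaryninja

# Hypothetical key: "sidekick.autoVerifyFindings" is an assumed identifier,
# not a documented Sidekick setting.
settings = binaryninja.Settings()

# Scope the change to the current resource (this workspace's database) rather
# than globally; `bv` is the open BinaryView, as in the scripting console.
settings.set_bool("sidekick.autoVerifyFindings", False,
                  view=bv, scope=binaryninja.SettingsScope.SettingsResourceScope)
```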