1 part · reading order ↓
One model reviewing its own plan tends to agree with itself. So I built a cleanup workflow shaped like a jury: separate critics from a deliberate mix of model families weigh the evidence, with a simple forcing function that keeps neighbouring reviewers off the same family. The orchestrator presides as judge, refuting rather than merging, votes promote shared findings, and a post-approval diff check can overrule a verdict that was unanimous and still wrong.