Sub-Agent on Chuanxilu for Skilled Homo sapiens

https://blog.chuanxilu.net/en/tags/sub-agent/
Recent content in Sub-Agent on Chuanxilu for Skilled Homo sapiens
Generator: Hugo
Language: en-US
Last updated: Sun, 31 May 2026 10:00:00 +0800

The Experiment Design Was Fine. The LLM Still Failed.
https://blog.chuanxilu.net/en/posts/2026/05/execution-context-design/
Sun, 31 May 2026 10:00:00 +0800

A double-blind experiment with flawless design still produced unusable results. The culprit wasn't the protocol; it was the execution context. ANSI-polluted output fed into a scorer that diligently scored garbage, and a single sub-agent aggregated across scenarios it shouldn't have. I show how reconstructing the execution context flipped the conclusion from 'insufficient evidence' to 'adopt B'.
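
The ANSI pollution mentioned in the summary can be illustrated with a minimal sketch. It assumes the captured output is a plain string and that the pollution consists of CSI escape sequences such as color codes; the post itself may handle this differently. The names `strip_ansi` and `has_ansi`, and the sample string, are hypothetical and not taken from the post.

```python
import re

# Assumption: the pollution is made of CSI sequences (ESC [ ... final byte),
# which covers the usual color and cursor-movement codes.
ANSI_CSI = re.compile(r"\x1b\[[0-?]*[ -/]*[@-~]")


def strip_ansi(text: str) -> str:
    """Return text with CSI escape sequences removed."""
    return ANSI_CSI.sub("", text)


def has_ansi(text: str) -> bool:
    """Return True if the text still contains a raw ESC byte."""
    return "\x1b" in text


# Hypothetical captured output from a colored test runner.
raw = "\x1b[32mPASS\x1b[0m scenario-1"
clean = strip_ansi(raw)

print(repr(raw))    # escape bytes are visible in the repr
print(repr(clean))  # plain text that a scorer can read
```

Running a check like `has_ansi` on the input before it reaches the scorer turns silent garbage into a visible failure, which is the kind of execution-context guard the summary points at.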