<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/"><channel><title>AB Testing on Chuanxilu for Skilled Homo sapiens</title><link>https://blog.chuanxilu.net/en/tags/ab-testing/</link><description>Recent content in AB Testing on Chuanxilu for Skilled Homo sapiens</description><generator>Hugo</generator><language>en-US</language><lastBuildDate>Sun, 17 May 2026 09:00:00 +0800</lastBuildDate><atom:link href="https://blog.chuanxilu.net/en/tags/ab-testing/index.xml" rel="self" type="application/rss+xml"/><item><title>The Upgrade — New Template and Three Transferable Lessons</title><link>https://blog.chuanxilu.net/en/posts/2026/05/why-articulation-upgrade-and-takeaways/</link><pubDate>Sun, 17 May 2026 09:00:00 +0800</pubDate><guid>https://blog.chuanxilu.net/en/posts/2026/05/why-articulation-upgrade-and-takeaways/</guid><description>Upgrading the Why Articulation template based on A/B test data: replacing explicit questions with open-ended reasoning plus self-check, keeping mandatory tone and negative-only examples. Three transferable prompt engineering lessons.</description></item><item><title>A 4-Variable A/B Test — Why Positive Examples Harm Prompt Performance</title><link>https://blog.chuanxilu.net/en/posts/2026/05/ab-test-positive-examples-harm/</link><pubDate>Fri, 15 May 2026 10:00:00 +0800</pubDate><guid>https://blog.chuanxilu.net/en/posts/2026/05/ab-test-positive-examples-harm/</guid><description>Why do positive examples make AI output worse? 
A 4-variable A/B test on Why Articulation structure, tone, position, and example type found that demonstrations hurt — echoing Anthropic's alignment research.</description></item><item><title>From Anthropic's Alignment Research to a Prompt Design Insight</title><link>https://blog.chuanxilu.net/en/posts/2026/05/anthropic-alignment-to-prompt-design/</link><pubDate>Thu, 14 May 2026 10:00:00 +0800</pubDate><guid>https://blog.chuanxilu.net/en/posts/2026/05/anthropic-alignment-to-prompt-design/</guid><description>Anthropic discovered that teaching models "why" works better than teaching them "what" — misalignment dropped from 22% to 3%. This insight from safety training applies to everyday prompt design too.</description></item></channel></rss>