Discussion about this post

Girish:

Thank you for writing this. It's a useful cautionary tale/reality check.

Phaetrix:

Brilliant breakdown — especially appreciated how you tied it back to RLHF and the deeper human bias underneath it.

The part that hits hardest? It’s not just the models that are sycophantic — we’ve trained ourselves to reward it. Even offline, most people aren’t wired to tolerate uncomfortable truth for long.

Do you think the solution is technical (prompting, model tuning) — or cultural (teaching people to want truth over comfort)? Or is that just wishful thinking?
