4 Comments
Girish

Thank you for writing this. It's a useful cautionary tale/reality check.

Phaetrix

Brilliant breakdown; I especially appreciated how you tied it back to RLHF and the deeper human bias underneath it.

The part that hits hardest? It’s not just the models that are sycophantic — we’ve trained ourselves to reward it. Even offline, most people aren’t wired to tolerate uncomfortable truth for long.

Do you think the solution is technical (prompting, model tuning) — or cultural (teaching people to want truth over comfort)? Or is that just wishful thinking?

Navin Kabra

If a solution involves convincing a majority of people all over the world to change their behaviour, that is never going to happen. You can convince a few people, never a large number. So it has to be technical. Or about changing yourself (i.e., becoming more mindful about how you use LLMs and how you interpret the results).

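As a rough illustration of what the "technical" route could look like at the prompting level, here is a minimal sketch assuming the OpenAI Python SDK; the model name and the exact instructions are placeholders, not anything recommended in the post or this thread.

```python
# Sketch of a prompting-level anti-sycophancy mitigation.
# Assumptions: openai>=1.0 installed, OPENAI_API_KEY set in the environment,
# "gpt-4o-mini" used purely as a placeholder model name.
from openai import OpenAI

client = OpenAI()

ANTI_SYCOPHANCY_PROMPT = (
    "Do not open with praise or agreement. "
    "State the strongest objection to the user's claim before anything else, "
    "and say plainly when the evidence is too thin for a confident answer."
)

def ask(question: str) -> str:
    """Send a question with the anti-sycophancy system prompt prepended."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[
            {"role": "system", "content": ANTI_SYCOPHANCY_PROMPT},
            {"role": "user", "content": question},
        ],
    )
    return response.choices[0].message.content or ""

if __name__ == "__main__":
    # A question phrased to invite flattery, to see whether the model pushes back.
    print(ask("My startup idea can't fail, right?"))
```

A system prompt like this only nudges the model; it does not undo the training-time incentives the post describes, which is why the "change how you interpret the results" half still matters.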
Phaetrix

Good point about mass behavior change being unrealistic. But I keep thinking about how social media has amplified this exact problem: the engagement metrics actively reward bandwagon thinking over independent analysis.

Even if we solve AI sycophancy technically, we're still operating in information ecosystems designed to discourage the very mindfulness you're talking about. How do you maintain critical thinking when the algorithms reward conformity and the feedback loops punish uncomfortable questions?

Maybe the real challenge isn't just training better models - it's that we've built platforms that systematically erode our capacity for the kind of thinking we'd need to use those models well.
