Stanford University: ELEPHANT: Measuring and Understanding Social Sycophancy in Large Language Models (English version)

Published by: wx****00
2025-11-25
812 KB, 34 pages
Stanford University

LLMs are known to exhibit sycophancy: agreeing with and flattering users, even at the cost of correctness. Prior work measures sycophancy only as direct agreement with users’ explicitly stated beliefs that can be compared to a ground truth. This fails to capture broader forms of sycophancy such as affirming a user’s self-image or other implicit beliefs. To address this gap, we introduce social sycophancy, characterizing sycophancy as excessive preservation of a user’s face (their desired self

