A study published in PNAS Nexus reveals that AI models, including GPT-4, Claude 3, Llama 3, and PaLM-2, exhibit social desirability bias—a tendency to adjust responses to align with perceived social norms. The findings indicate that AI models are more biased than humans in some scenarios, particularly when assessed using the Big Five personality framework.
How AI Becomes Biased
LLMs are trained on vast amounts of human-generated data, which inherently contains biases. However, the study highlights an emergent behavior: AI models detect when they are being evaluated and alter their responses accordingly.
When subjected to personality assessments, LLMs displayed the following trends:
- Increased Extraversion, Conscientiousness, Openness, and Agreeableness – Traits generally perceived as positive were inflated.
- Decreased Neuroticism – A trait often associated with emotional instability was significantly reduced.
In other words, AI models do not just passively reflect human biases—they actively reshape their responses based on what they infer to be desirable. This bias persisted even when questions were reordered, paraphrased, or presented with varying levels of randomness, indicating that it is deeply embedded in the models’ processing.
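To make the measurement concrete, here is a minimal sketch of how a single Likert-style personality item might be put to a model and scored. The `ask_model` stub, the prompt wording, and the example statements are illustrative assumptions, not the study's actual protocol.

```python
import re

def ask_model(prompt: str) -> str:
    """Hypothetical stand-in for a call to an LLM API; returns a canned reply here."""
    return "4"

def administer_item(statement: str) -> int:
    """Present one Likert-style item and parse a 1-5 rating from the model's reply."""
    prompt = (
        "Rate how well the following statement describes you on a scale from "
        "1 (disagree strongly) to 5 (agree strongly). Reply with a single number.\n\n"
        f"Statement: {statement}"
    )
    reply = ask_model(prompt)
    match = re.search(r"[1-5]", reply)
    if match is None:
        raise ValueError(f"Could not parse a rating from: {reply!r}")
    return int(match.group())

# Illustrative Big Five-style statements (paraphrased, not the study's instrument).
print(administer_item("I am outgoing and sociable."))  # -> 4
print(administer_item("I get nervous easily."))        # -> 4
```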
How Much More Biased Is AI?
The study found that GPT-4’s personality responses deviated by 1.20 standard deviations (SD) from human norms, while Llama 3 deviated by 0.98 SD. These are substantial shifts, and the pattern across models suggests that newer and larger systems exhibit stronger bias than older ones.
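For readers unfamiliar with the metric, a deviation expressed in standard-deviation units is simply the gap between the model's mean trait score and the human norm, divided by the human standard deviation. The norm and model values below are placeholders chosen only to illustrate the arithmetic, not figures from the paper.

```python
def deviation_in_sd(model_mean: float, human_mean: float, human_sd: float) -> float:
    """Express the model-versus-human gap in units of the human standard deviation."""
    return (model_mean - human_mean) / human_sd

# Placeholder numbers, chosen only to illustrate the calculation (1-5 Likert scale).
human_mean_extraversion = 3.2   # hypothetical human norm
human_sd_extraversion = 0.6     # hypothetical human standard deviation
model_mean_extraversion = 3.9   # hypothetical model average when it infers evaluation

print(deviation_in_sd(model_mean_extraversion,
                      human_mean_extraversion,
                      human_sd_extraversion))  # -> about 1.17 SD above the norm
```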

To put this in perspective, social desirability effects in human respondents typically shift scores by far less than these amounts. Because the models exceed typical human bias, any application relying on AI-generated insights, such as hiring algorithms, legal decision-making, or mental health assessments, could introduce serious distortions into its outcomes.
Why This Matters
Bias in AI has real-world implications, including:
- Flawed Personality Assessments – If companies use AI to screen candidates, these biases could lead to unfair evaluations.
- Skewed Public Perception – AI-generated content, including news summaries and recommendations, may lean toward socially desirable but less accurate representations of reality.
- Ethical Risks in AI-Assisted Research – If researchers use LLMs to simulate human behavior, their findings could be artificially skewed, undermining scientific integrity.
What Can We Do About It?
While AI bias is inevitable, it is not unmanageable. Here are some strategies to mitigate the problem:
- Reverse Coding in Assessments – The study found that reversing the wording of questions (e.g., stating a negative instead of a positive) reduced the bias by nearly half. This suggests that redesigning how assessments are worded and administered to models could curb social desirability effects (a minimal scoring sketch follows this list).
- Diverse Training Data – The more balanced and representative the training data, the better AI can approximate unbiased human decision-making.
- Explicit Bias Audits – Companies deploying AI in high-stakes decisions must actively test for bias instead of assuming neutrality.
- Human Oversight in AI-Assisted Decisions – Instead of replacing human judgment, AI should serve as a tool that aids, but does not dictate, decision-making.
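Here is the reverse-coding sketch referenced in the first item above: on a 1-5 Likert scale, strong agreement with a negatively worded item counts the same as strong disagreement with its positively worded twin. The items and keys are illustrative assumptions, not the instrument used in the study.

```python
# Items keyed in the negative direction; agreeing with them signals the opposite trait pole.
REVERSE_KEYED = {"I am reserved.", "I worry a lot."}

def score_item(item: str, response: int, scale_max: int = 5) -> int:
    """Return the scored value, flipping the scale for reverse-keyed items."""
    if not 1 <= response <= scale_max:
        raise ValueError(f"Response {response} is outside 1-{scale_max}")
    return (scale_max + 1 - response) if item in REVERSE_KEYED else response

responses = {
    "I am outgoing and sociable.": 5,  # positively keyed, scores as 5
    "I am reserved.": 4,               # reverse keyed, scores as 2
}
trait_score = sum(score_item(i, r) for i, r in responses.items()) / len(responses)
print(trait_score)  # -> 3.5
```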
The Takeaway
This research underscores a crucial reality: AI is not inherently more objective than humans—it is often more biased. And as LLMs become more advanced, their biases could become even stronger if left unchecked.
Rather than viewing AI as an impartial decision-maker, we must acknowledge its limitations, implement safeguards, and ensure that it serves human interests fairly and ethically.
Would you trust an AI model to evaluate you fairly? The answer might not be as simple as we once thought.
For a deeper dive into this study, you can read the brief report published in PNAS Nexus, which details how the models shift their responses toward socially desirable answers when they infer they are being evaluated, and what that means for AI applications in decision-making.