Abstract
The impact of facial appearance on success is well established. For women specifically, there is a complex interplay between perceived attractiveness, competence, and power. As artificial intelligence (AI) is increasingly used for facial analysis, it is critical to evaluate whether AI-derived facial assessments align with or diverge from human perception in systematic and potentially biased ways. Official government portraits of White, female United States (U.S.) senators (n=20) were anonymized and analyzed. Five facial measurements (eyebrow elongation, facial elongation, facial width-to-height ratio [fWHR], mandibular angle, and chin size) were quantified using image analysis software. Human participants (n=47) and ChatGPT-5 rated the perceived power of each portrait on a 7-point scale. Statistical analyses were performed to identify which craniofacial metrics were associated with perceived power for human and AI raters. Across portraits, AI systematically assigned higher power ratings than humans (p<0.001). Human and AI rankings of the portraits showed moderate-to-strong alignment (r=0.533, p=0.015). Both humans and AI associated greater facial elongation and higher fWHR with reduced power. Notably, only AI judgments were disproportionately driven by elongation in multivariate models. AI systems appear to share many human heuristics but may amplify specific features and inflate trait ratings. This raises important considerations for their clinical integration. In plastic surgery, where AI is increasingly informing aesthetic counseling, reliance on biased outputs risks oversimplifying patient assessment. Diverse training datasets and clinician oversight are essential to ensure AI functions as an adjunct to, rather than a substitute for, human judgment.
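The human-versus-AI comparison summarized above can be illustrated with a minimal sketch. The abstract does not state which tests were used, so everything here is an assumption: the sketch treats the alignment statistic as a Pearson-type correlation across the 20 portraits and converts it to a t statistic on n-2 degrees of freedom. The function names and the example ratings are hypothetical and are not study data.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of ratings."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two equal-length samples with at least 3 values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_from_r(r, n):
    """t statistic for testing r != 0, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


def mean_paired_difference(ai, human):
    """Average per-portrait gap (AI rating minus mean human rating)."""
    return sum(a - h for a, h in zip(ai, human)) / len(ai)


# Hypothetical per-portrait ratings on a 7-point scale (NOT study data).
human_mean = [3.8, 4.1, 4.6, 3.2, 5.0, 4.4]
ai_rating = [5.0, 5.0, 6.0, 5.0, 6.0, 5.0]

print("r =", round(pearson_r(human_mean, ai_rating), 3))
print("mean AI - human gap =", round(mean_paired_difference(ai_rating, human_mean), 3))

# Reported values from the abstract: r = 0.533 across n = 20 portraits.
print("t =", round(t_from_r(0.533, 20), 2))
```

Under these assumptions, the reported r=0.533 with n=20 gives t of about 2.67 on 18 degrees of freedom, which lies between the standard two-sided critical values for p=0.02 (2.552) and p=0.01 (2.878) and is therefore in line with the reported p=0.015. A rank-based (Spearman) analysis would follow the same steps after replacing each rating with its rank.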