Modern Machiavelli? The illusion of ChatGPT-generated patient reviews in plastic and aesthetic surgery based on 9000 review classifications.
Document type:
Journal Article
Author(s):
Knoedler, Samuel; Sofo, Giuseppe; Kern, Barbara; Frank, Konstantin; Cotofana, Sebastian; von Isenburg, Sarah; Könneker, Sören; Mazzarone, Francesco; Dorafshar, Amir H; Knoedler, Leonard; Alfertshofer, Michael
Abstract:
BACKGROUND: Online patient reviews are crucial in guiding individuals who seek plastic surgery, but artificial chatbots pose the threat of disseminating fake reviews. This study aimed to compare real patient feedback with ChatGPT-generated reviews for the top five US plastic surgery procedures.
METHODS: Thirty real patient reviews each on rhinoplasty, blepharoplasty, facelift, liposuction, and breast augmentation were collected from RealSelf and used as templates for ChatGPT to generate matching patient reviews. Prolific users (n = 30) assessed 150 pairs of reviews to identify the human-written and artificial intelligence (AI)-generated review in each pair. Patient reviews were further assessed using AI content detection software (Copyleaks AI).
RESULTS: Across the 9000 classification tasks, 64.3% of reviews were classified as authentic and 35.7% as fake. On average, the author (human versus machine) was correctly identified in 59.6% of cases, and this poor classification performance was consistent across all procedures. Participants with prior aesthetic treatment showed poorer classification performance than those without (p < 0.05). The mean character count of human-written reviews was significantly higher (p < 0.001) than that of AI-generated reviews, with a significant correlation between character count and participants' accuracy rate (p < 0.001). The emotional timbre of reviews also differed significantly, with "happiness" more prevalent in human-written reviews (p < 0.001) and "disappointment" more prevalent in AI-generated reviews (p = 0.005). Copyleaks AI correctly classified 96.7% of human-written and 69.3% of ChatGPT-generated reviews.
CONCLUSION: ChatGPT convincingly replicates authentic patient reviews, even deceiving commercial AI detection software. Analyzing emotional tone and review length can help differentiate real from fake reviews, underscoring the need to educate both patients and physicians to prevent misinformation and mistrust.