
Accuracy, readability, and understandability of European Association of Urology guidelines bot for Sexual and Reproductive Health Guidelines
The Journal of Sexual Medicine, Volume 23, Issue 4, April 2026

[Abstract]
Background: The European Association of Urology (EAU) Guidelines recently introduced an official Bot to assist urologists in navigating the Guidelines. However, to date no external validation data are available for this tool.
Aim: To assess the accuracy, completeness, and clarity of the answers given by the Guidelines Bot for Sexual and Reproductive Health.
Methods: A total of 228 questions were developed from the recommendations in the EAU Sexual and Reproductive Health Guidelines. Each question was entered into the EAU Guidelines Bot, and the response was reviewed by two expert uro-andrologists. Discrepancies were resolved by discussion with a third expert. Results were further stratified by grade of recommendation.
Outcomes: The rate of accurate, complete, and clear answers to guideline-related questions, rated on a 5-point Likert scale, and the impact of the grade of recommendation on answer quality.
Results: Overall, 228 questions were developed. In terms of accuracy, 224/228 (98.3%) answers were rated accurate (score 4-5), 2/228 (0.9%) showed fair accuracy (score 3), and 2/228 (0.9%) were deemed inaccurate (score 1-2). In terms of completeness, 223/228 (97.8%) were rated complete (score 4-5), 2/228 (0.9%) showed fair completeness (score 3), and 3/228 (1.3%) were deemed incomplete. Finally, in terms of clarity, 225/228 (98.7%) were rated clear (score 4-5), 2/228 (0.9%) showed fair clarity (score 3), and 0/228 were deemed unclear. No differences were recorded between answers on strong and weak recommendations.
Clinical Implications: The EAU Guidelines Bot may serve as a reliable clinical decision support tool for urologists seeking rapid, evidence-based guidance on sexual and reproductive health management.
Strengths & Limitations: This is the first external evaluation of the EAU Guidelines Bot. Our results suggest a significant improvement in reliability compared with general-purpose AI tools. However, our queries were straightforward and developed directly from guideline recommendations, so the results might not apply to complex real-world clinical scenarios.
Conclusions: The EAU Guidelines Bot is an accurate and reliable tool for navigating the Sexual and Reproductive Health Guidelines, but further validation is required to evaluate its applicability in clinical practice.
[Keywords] AI, chatbot, EAU guidelines, sexual and reproductive health

Original article: Valerio Santarelli, Riccardo Lombardo, Matteo Romagnoli, et al. (2026). Accuracy, readability, and understandability of European Association of Urology guidelines bot for Sexual and Reproductive Health Guidelines. The Journal of Sexual Medicine, Volume 23, Issue 4, April 2026.
https://doi.org/10.1093/jsxmed/qdag041
(Translator and managing editor: MARY) (Readers who would like the original English text may contact WeChat: millerdeng95 or iacmsp)
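
The percentages in the Results paragraph of the abstract are simple proportions out of 228 questions. As a minimal sketch for readers who want to check them, the Python snippet below recomputes each share from the counts printed in the abstract. The dictionary layout, the function names, and rounding to one decimal place are assumptions made for illustration. They are not taken from the paper.

```python
# Counts copied from the abstract; the score bands (4-5, 3, 1-2) are the abstract's own.
# The data layout and helper names are illustrative assumptions.
TOTAL_QUESTIONS = 228

reported_counts = {
    "accuracy": {"good (4-5)": 224, "fair (3)": 2, "poor (1-2)": 2},
    "completeness": {"good (4-5)": 223, "fair (3)": 2, "poor (1-2)": 3},
    "clarity": {"good (4-5)": 225, "fair (3)": 2, "poor (1-2)": 0},
}


def percentage(count, total=TOTAL_QUESTIONS):
    """Share of `count` in `total`, as a percentage rounded to one decimal place."""
    return round(100 * count / total, 1)


def unaccounted(bands, total=TOTAL_QUESTIONS):
    """Number of questions not covered by any of the reported score bands."""
    return total - sum(bands.values())


for domain, bands in reported_counts.items():
    shares = ", ".join(
        f"{band}: {count}/{TOTAL_QUESTIONS} = {percentage(count)}%"
        for band, count in bands.items()
    )
    print(f"{domain}: {shares}; unaccounted: {unaccounted(bands)}")
```

With these inputs, 224/228 rounds to 98.2% rather than the printed 98.3%, and the three clarity counts add up to 227 rather than 228. Both may reflect rounding or reporting conventions in the source, which this sketch cannot resolve.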
