Performance Evaluation of the Generative Pre-trained Transformer (GPT-4) on the Family Medicine In-Training Examination

2024

Author(s): Wang, Ting; Mainous, Arch G III; Stelter, Keith L; O’Neill, Thomas R; and Newton, Warren P
Topic(s): Education & Training
Keyword(s): In-Training Examination
Source: Journal of the American Board of Family Medicine

Objective: In this study, we sought to comprehensively evaluate the performance of GPT-4 (Generative Pre-trained Transformer) on the 2022 American Board of Family Medicine (ABFM) In-Training Examination (ITE), compared with its predecessor, GPT-3.5, and with the performance of family medicine residents nationally on the same examination.

Methods: We used both quantitative and qualitative analyses. First, a quantitative analysis evaluated the model’s performance metrics using zero-shot prompts (where only the examination questions were provided, without any additional information). A qualitative analysis then examined the nature of the model’s responses, the depth of its medical knowledge, and its ability to comprehend contextual or new information through chain-of-thought prompts (interactive conversation) with the model.

Results: GPT-4 improved significantly in accuracy compared with GPT-3.5, with only a 4-month interval between their respective release dates. The percentage of correct answers with zero-shot prompts increased from 56% to 84%, which translates to a scaled score growth from 280 to 690, a 410-point increase. Most notably, further chain-of-thought investigation revealed GPT-4’s ability to integrate new information and to self-correct when needed.

Conclusions: In this study, GPT-4 demonstrated notably high accuracy, as well as rapid reading and learning capabilities. These results are consistent with previous research indicating GPT-4’s significant potential to assist in clinical decision making. Furthermore, the study highlights the essential role of physicians’ critical thinking and lifelong learning skills, particularly evident through the analysis of GPT-4’s incorrect responses. This emphasizes the indispensable human element in effectively implementing and using AI technologies in medical settings.