ABSTRACT

Background: The Medical College Admission Test (MCAT) is a high-stakes test required for entry to most U.S. medical schools; admissions committees use this test to predict future accomplishment. Although there is evidence that the MCAT predicts success on multiple choice–based assessments, there is little information on whether the MCAT predicts clinically based assessments of undergraduate and graduate medical education performance. This study examined associations between the MCAT and medical school grade point average (GPA), U.S. Medical Licensing Examination (USMLE) scores, observed patient care encounters, and residency performance assessments. Methods: This study used data collected as part of the Long-Term Career Outcome Study to determine associations between MCAT scores; USMLE Step 1, Step 2 clinical knowledge, Step 2 clinical skills, and Step 3 scores; Objective Structured Clinical Examination performance; medical school GPA; and PGY-1 program director (PD) assessments of physician performance for students graduating in 2010 and 2011. Results: MCAT data were available for all 340 students, and the PGY-1 PD evaluation response rate was 86.2%. All permutations of the MCAT score (first, last, highest, average) were weakly associated with GPA, Step 2 clinical knowledge scores, and Step 3 scores. MCAT scores were weakly to moderately associated with Step 1 scores. MCAT scores were not significantly associated with the Step 2 clinical skills Integrated Clinical Encounter and Communication and Interpersonal Skills subscores, Objective Structured Clinical Examination performance, or PGY-1 PD evaluations. Discussion: MCAT scores were weakly to moderately associated with assessments that rely on multiple choice testing, and the association was somewhat stronger for assessments occurring earlier in medical school, such as USMLE Step 1. The MCAT did not predict assessments relying on direct clinical observation, nor did it predict PD assessments of PGY-1 performance.

The Medical College Admission Test (MCAT) is a high-stakes test that is required for entry to most U.S. medical schools. Many studies have shown the summed MCAT score to be a reliable, albeit moderate, predictor of medical school basic and clinical science grade point average (GPA) and U.S. Medical Licensing Examination (USMLE) Step 1 and Step 2 scores.1–5 In addition, the MCAT has been linked to overall successful progression through undergraduate medical education6 and has been found to be moderately correlated with performance beyond medical school, such as scores on the USMLE Step 3 examination.1,3,4

Consequently, the MCAT has become an essential tool for admissions committees to use in predicting the future performance of prospective students during the medical school application cycle. Admissions officers rank MCAT score and undergraduate GPA as the most important factors in deciding whom to interview and rank them among the top considerations in deciding whom to accept.7 However, despite this heavy reliance on the MCAT, there is less clarity on whether the test predicts clinical performance: those portions of undergraduate and graduate medical education not assessed by multiple choice examinations. Even when a link between the MCAT and "clinical" (i.e., clerkship) medical school performance is demonstrated, it is difficult to assess how much of the association reflects a true relationship to direct patient care ability as opposed to performance on the National Board of Medical Examiners (NBME) Clinical Subject Examinations that often comprise a large portion of clerkship grades. The idea that such a link may not in fact exist was supported by a recent, large study of Jefferson Medical College students demonstrating that MCAT performance is not predictive of either clinical skills (CS) or professionalism in residency.1 Older studies on this topic, using previous MCAT incarnations, have provided mixed findings.8–11

Given this uncertainty in the relation between MCAT scores and the more clinical, later-occurring outcomes and assessments of medical education, this study set out to answer a very practical question: how well does MCAT performance predict medical school metrics (GPA, USMLE scores), observed patient care evaluations as adjudicated by undergraduate medical educators, and residency assessments by graduate medical education (GME) program directors (PDs)? First, we hypothesize that MCAT scores will be more highly correlated with first- and second-year GPA and USMLE Step 1 scores than with later GPA and Step 2 clinical knowledge (CK) scores. The literature on medical expertise indicates that the time lag between the MCAT and the later assessments should attenuate the association between them; further, the medical school instruction that takes place between sitting for the MCAT and receiving clerkship grades and Step 2 scores should weaken the association further. Additionally, because such measures are typically range restricted within a single medical school, the correlations may be weaker still. Second, we hypothesize that the knowledge required for good performance on the MCAT may be necessary, but not sufficient, for competent clinical performance. This hypothesis is also based on the expertise literature, which has repeatedly found that expert performance is tightly coupled with context-specific knowledge; the MCAT is taken in a testing center, whereas clinical performance is assessed in a patient care setting, and expert performance in one setting may not translate to another.12 As a result, we hypothesize that a trainee's MCAT score (whether using the first, last, or highest attempt, or an average of all attempts) will have only a weak to moderate association with undergraduate direct patient care assessments (as measured by institutional objective structured clinical examinations [OSCEs] and Step 2 CS subscores) and GME PD evaluations.

METHODS

Study Context and Participants

This study was part of the larger Long-Term Career Outcome Study conducted at the F. Edward Hébert School of Medicine, Uniformed Services University of the Health Sciences (USU). As the United States' only federal medical school, USU matriculates approximately 170 medical students annually and, at the time of this study, offered a traditional 4-year curriculum: 2 years of basic science courses followed by 2 years of clinical rotations (clerkships). In making acceptance decisions, the School of Medicine Admissions Committee balances performance on the MCAT against the attributes, undergraduate performance, and experiences that potential students report to the American Medical College Application Service and demonstrate during in-person interviews. The MCAT is typically taken in the summer of the year in which students apply to medical school, although some take the test earlier, such as those who choose or need to take one or more gap years before matriculating. At the time of the study, the MCAT consisted of sections testing the biological sciences, the physical sciences, and verbal reasoning, as well as a writing sample. The participants in the present study were students graduating in 2010 and 2011 (N = 340); during that time period, the average MCAT score was approximately 29.

Measures and Statistical Analysis
MCAT

All student attempts at the MCAT were retained. As there has been some debate as to which score best reflects future performance, we derived four MCAT measures to test our hypotheses: the highest single MCAT score, the average of all MCAT scores, the first attempt MCAT score, and the last (most recent) attempt MCAT score.
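For illustration, the minimal sketch below shows one way these four score variants can be derived from a table of repeated attempts; the data frame, column names, and values are hypothetical and are not the study data.

```python
import pandas as pd

# Hypothetical long-format records of repeated MCAT attempts: one row per sitting,
# with the summed total score and the test date (all values illustrative).
attempts = pd.DataFrame({
    "student_id": [1, 1, 2, 3, 3, 3],
    "test_date": pd.to_datetime(
        ["2005-08-01", "2006-04-15", "2006-08-01", "2004-08-01", "2005-04-15", "2005-08-01"]
    ),
    "total_score": [27, 30, 29, 25, 27, 28],
})

# Sort chronologically, then derive the four predictor variants used in the study:
# first attempt, last (most recent) attempt, highest score, and average of all attempts.
attempts = attempts.sort_values(["student_id", "test_date"])
mcat_measures = attempts.groupby("student_id")["total_score"].agg(
    first_mcat="first",
    last_mcat="last",
    highest_mcat="max",
    average_mcat="mean",
)
print(mcat_measures)
```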

OSCE Performance

The second-year medical school OSCE consisted of six stations and featured scenarios involving chest pain, polyuria, hemiparesis, anemia, a geriatric assessment, and an oral presentation. The third-year OSCE featured five stations: abdominal pain, fatigue, foot pain, memory loss, and chronic cough. OSCE cases, with history and physical examination checklists and postencounter quizzes, were developed by a panel of local experts employing a modified Delphi method. The OSCE scores were a composite of standardized patient ratings of history and physical examination components, preceptor observations using standardized assessment tools, postencounter quizzes, and a review of the written documentation for select cases. According to a previously conducted but unpublished generalizability study at USU, the second-year OSCE stations demonstrated a moderate generalizability coefficient (r = 0.52), with 40.8% of the overall variance explained by student ability. These values are slightly lower than the published reliabilities for the Step 2 CS components of communication and interpersonal skills, data gathering, and the patient note, but are in line with other school-level OSCE reliabilities for examinations with a similar number of stations or similar length.13
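As a point of reference for how such figures are typically obtained, the sketch below estimates a relative generalizability coefficient and the proportion of variance attributable to student ability for a fully crossed person-by-station design, using ANOVA mean squares. The data are simulated and purely illustrative; the unpublished USU G study may have used a different design.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

# Simulated OSCE data (not the study data): one composite score per student-station pair.
rng = np.random.default_rng(0)
n_students, n_stations = 30, 6
df = pd.DataFrame({
    "student": np.repeat(np.arange(n_students), n_stations),
    "station": np.tile(np.arange(n_stations), n_students),
})
df["score"] = (
    70
    + rng.normal(0, 5, n_students)[df["student"]]   # student ability
    + rng.normal(0, 3, n_stations)[df["station"]]   # station difficulty
    + rng.normal(0, 6, len(df))                     # residual (interaction + error)
)

# Two-way ANOVA without replication supplies the mean squares needed for the G study.
aov = sm.stats.anova_lm(smf.ols("score ~ C(student) + C(station)", data=df).fit(), typ=2)
ms_person = aov.loc["C(student)", "sum_sq"] / aov.loc["C(student)", "df"]
ms_station = aov.loc["C(station)", "sum_sq"] / aov.loc["C(station)", "df"]
ms_residual = aov.loc["Residual", "sum_sq"] / aov.loc["Residual", "df"]

# Variance components for persons and stations, the relative G coefficient for a score
# averaged over n_stations stations, and the share of total variance due to student ability.
var_person = (ms_person - ms_residual) / n_stations
var_station = (ms_station - ms_residual) / n_students
g_coefficient = var_person / (var_person + ms_residual / n_stations)
share_person = var_person / (var_person + var_station + ms_residual)
print(round(g_coefficient, 2), round(share_person, 2))
```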

Medical School GPAs

We included preclinical GPA, clerkship year GPA, and medical school cumulative GPA in the study. Preclinical GPA was calculated using course grades from the first 2 years of medical school, where the curriculum focuses on basic sciences. Clerkship year GPA represents the GPA for all the core or major clinical clerkships during the third year of medical school; it is a composite of clerkship clinical points, OSCE scores, and NBME Subject Examination scores. Medical school cumulative GPA was the overall GPA of all 4 years.

USMLE Step Examinations

The USMLE is a single program consisting of four separate examinations designed to assess an examinee's understanding of, and ability to apply, concepts and principles that are important in health, disease, and effective patient care. We obtained students' first-attempt USMLE Step 1 and Step 2 CK scores from USU's Registrar's Office, and Step 2 CS and Step 3 scores from the NBME. Students in this sample completed Step 1, which focuses on understanding of the basic sciences, after their first 2 years of medical school. They completed the more clinically oriented Step 2 examinations (CK and CS) during their fourth year of medical school.

Step 2 CS is a standardized patient (SP)–based CS assessment that scores students on three components: Integrated Clinical Encounter (ICE), Communication and Interpersonal Skills (CIS), and Spoken English Proficiency (SEP). The ICE component assesses examinee performance on history-taking, physical examination tasks, and postencounter written summaries; performance information is provided by both SP and physician raters. The CIS component assesses examinee ability in questioning skills, information-sharing skills, and professional manner and rapport; this component is rated by the SP following each encounter. The present study focused only on the ICE and CIS components because native English speakers demonstrate little variability on the SEP scale. Step 3 assesses whether examinees can apply their medical knowledge and understanding of the biomedical and clinical sciences in a manner essential for the unsupervised practice of medicine.

PGY-1 PD Evaluation

We collect PGY-1 data annually from the PDs who oversee the training of military medical trainees (as shown in the Appendix). Each spring we identify the military treatment facilities (and some nonmilitary training programs) where our interns and residents are educated and mail evaluation forms to each trainee's respective PGY-1 PD. This evaluation form was examined in a previous study and showed good feasibility, validity, and reliability evidence.14 The form was designed with six sections largely paralleling the six Accreditation Council for Graduate Medical Education competencies. Exploratory factor analysis revealed that the items loaded on five factors: expertise, military-unique practice, professionalism, system-based practice, and CIS. The students' scores on these five factors were used as separate variables in this study.

Statistical Analysis

USU's Institutional Review Board approved the present study. We calculated the means and standard deviations of all the measures described above. We then examined the Pearson correlation coefficients between the measures and reported the variance explained (i.e., the square of each correlation coefficient) as well.
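The analysis itself is straightforward to reproduce. The minimal sketch below shows the general approach; the file name and column names are illustrative placeholders, since the actual study data set is not public.

```python
import pandas as pd
from scipy.stats import pearsonr

# Hypothetical analysis file: one row per student, one column per measure.
df = pd.read_csv("ltcos_measures.csv")

# Descriptive statistics (Table I analog): mean, SD, minimum, and maximum of each measure.
print(df.describe().loc[["mean", "std", "min", "max"]])

# Pairwise Pearson correlations (Table II analog) and the variance explained (r squared).
correlations = df.corr(method="pearson")
print(correlations.round(2))
print((correlations ** 2).round(2))

# Significance test for a single pair, e.g., highest MCAT vs. Step 1 (column names assumed).
pair = df[["highest_mcat", "step1"]].dropna()
r, p_value = pearsonr(pair["highest_mcat"], pair["step1"])
print(round(r, 2), round(p_value, 3))
```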

RESULTS

Table I shows descriptive statistics for the measures included in the study. MCAT data were available for all 340 students in the data set, and the response rate for the PGY-1 PD evaluations was 86.2%. MCAT scores (first, last, average, highest) were moderately to strongly correlated with one another (Pearson r; Table II).

TABLE I. Descriptive Statistics of All the Measures

Measures Mean SD Minimum Maximum
Highest MCAT 29.43 2.71 24 39
Most Recent MCAT 28.83 3.21 18 39
First MCAT 28.44 3.36 18 39
Average MCAT 28.65 2.93 22 39
Second Year OSCE 70.90 6.08 53 87
Third Year OSCE 67.94 5.58 52 83
Preclinical GPA 3.04 0.45 2.16 4
Initial Clerkship Year GPA 3.21 0.42 1.97 4
Cumulative Medical School GPA 3.16 0.37 2.29 4
Step 1 215.97 17.61 185 265
Step 2 CK 221.87 18.15 184 278
CIS Score of Step 2 CS 20.06 1.01 14.73 22.44
ICE Score of Step 2 CS 0.31 0.70 −2.02 2.76
Step 3 Score 213.93 14.41 177 267
Patient Care and Medical Expertise (PGY-1 PC) 3.72 0.75 1.73 5
Military Unique Practice (PGY-1 MUP) 3.67 0.77 1.36 5
Professionalism (PGY-1 PRO) 3.96 0.82 1.00 5
System-Based Practice (PGY-1 SBP) 3.54 0.71 1.33 5
Communication and Interpersonal Skills (PGY-1 CIS) 3.76 0.74 2 5

The gender distribution of the classes was 70% male and 30% female. The average age at matriculation was 23.9 years.

TABLE II. Bivariate Pearson Correlations Between MCAT Measures and Other Measures

  Most Recent MCAT First MCAT Average MCAT Second Year OSCE Third Year OSCE Preclinical GPA Initial Clerkship Year GPA Cumulative Medical School GPA Step 1 Step 2 CK CIS Score of Step 2 CS ICE Score of Step 2 CS Step 3 Score PGY-1 PC PGY-1 MUP PGY-1 PRO PGY-1 SBP PGY-1 CIS
Highest MCAT 0.83** 0.75** 0.89** 0.06 0.06 0.25** 0.15** 0.23** 0.34** 0.20** −0.06 −0.02 0.22* 0.02 0.03 −0.05 0.07 −0.07
Most Recent MCAT   0.62** 0.88** 0.08 0.09 0.19** 0.14* 0.18** 0.28** 0.20** 0.03 0.06 0.17* 0.03 −0.01 −0.05 0.05 −0.04
First MCAT     0.91** 0.09 0.13* 0.22** 0.15* 0.20** 0.28** 0.20** −0.08 0.01 0.24* 0.08 0.09 0.02 0.09 0.01
Average MCAT       0.10 0.12* 0.23** 0.16** 0.22** 0.32** 0.22** −0.03 0.04 0.23* 0.05 0.04 −0.03 0.07 −0.03
Second Year OSCE         0.33** 0.22** 0.32** 0.30** 0.13* 0.16** 0.24** 0.25** 0.11 0.20** 0.21** 0.23** 0.19** 0.17**
Third Year OSCE           0.19** 0.28** 0.25** 0.04 0.15** 0.20** 0.28** 0.06 0.24** 0.28** 0.25** 0.14* 0.22**
Preclinical GPA             0.66** 0.95** 0.73** 0.65** 0.16** 0.32** 0.54** 0.28** 0.17** 0.18** 0.21** 0.15*
Initial Clerkship Year GPA               0.85** 0.58** 0.64** 0.27** 0.35** 0.57** 0.41** 0.29** 0.30** 0.30** 0.29**
Cumulative Medical School GPA                 0.73** 0.70** 0.23** 0.37** 0.61** 0.38** 0.25** 0.26** 0.28** 0.24**
Step 1                   0.74** 0.09 0.21** 0.64** 0.23** 0.13* 0.06 0.15* 0.10
Step 2 CK                     0.09 0.31** 0.70** 0.28** 0.14** 0.11 0.13* 0.11
CIS Score of Step 2 CS                       0.28** 0.09 0.13* 0.05 0.09 0.08 0.17**
ICE Score of Step 2 CS                         0.27** 0.15* 0.04 0.11 0.08 0.07
Step 3 Score                           0.31** 0.21** 0.16* 0.23** 0.14*
Patient Care and Medical Expertise (PGY-1 PC)                             0.81** 0.81** 0.79** 0.84**
Military Unique Practice (PGY-1 MUP)                               0.80** 0.82** 0.80**
Professionalism (PGY-1 PRO)                                 0.71** 0.78**
System-Based Practice (PGY-1 SBP)                                   0.77**
Communication and Interpersonal Skills (PGY-1 CIS)                                    

*p < 0.05.

**p ≤ 0.01.

Each permutation of the MCAT score was weakly associated with preclinical (second-year cumulative) GPA, clerkship-year (third-year) GPA, and cumulative medical school GPA, as well as with the Step 2 CK and Step 3 scores. All MCAT scores were weakly to moderately associated with the Step 1 score. No permutation of the MCAT score was significantly associated with the Step 2 CS ICE and CIS subscores.

MCAT scores were not consistently associated with scores on the OSCEs administered to second- and third-year students. Both the second-year and third-year OSCE scores were weakly associated with the Step 2 CS CIS and ICE subscores.

No permutation of the MCAT score was significantly associated with the five dimensions of the PGY-1 PD evaluation (expertise, military-unique practice, professionalism, system-based practice, and CIS).

DISCUSSION

As expected, MCAT scores were weakly associated with GPA. Correlations with Step 1 ranged from weakly to moderately positive, and correlations with Step 2 CK and Step 3 scores were weakly positive. These findings are consistent with the previous literature on the topic.1–5 Because Step 1 emphasizes basic science knowledge and is taken closest in time to the MCAT (a test composed of the building blocks of the basic sciences covered in medical school), it is not surprising that scores on this examination correlate more highly with MCAT performance than do scores on the more clinically oriented and later administered Step 2 CK and Step 3 examinations. That these associations persist through all stages of the examination sequence perhaps reflects the common testing methodology, multiple choice questions, shared among the examinations.

On the other hand, MCAT scores were not consistently associated with OSCE scores and were not correlated with Step 2 CS subscores. This is not surprising, for several reasons. These clinical encounters measure more than knowledge; they also measure expertise in communication and the ability to establish rapport and an effective dyad with a patient, as assessed by both standardized patients and faculty. In addition, clinical encounters are context specific, whereas multiple choice tests are taken under similar, standardized conditions; clinical encounters bring in the unique context of a dedicated clinical space, with office or inpatient tools and the details and findings of a unique patient scenario. That a multiple choice test is less able to predict performance in these settings is therefore not unexpected.

Further, MCAT scores were not associated with any of the five factors of PGY-1 performance (expertise, military-unique practice, professionalism, system-based practice, and CIS) as adjudicated by each individual's program director. These results are also consistent with the mixed findings in the research literature, some of which reports positive associations between MCAT scores and clinical measures, and some of which does not.1,8–10 In contrast to prior reports, this study had the benefit of being able to reliably follow students through all stages of medical education in a single, unified medical education and health system.

That MCAT scores were not reliable predictors of OSCE or PGY-1 performance reinforces the point that the MCAT was not designed to be a test of clinical skill or professionalism. It was nonetheless worth exploring whether these associations might exist, since content knowledge has been shown to be coupled with expert performance.12 Given our findings, it is perhaps the case that the knowledge represented on the MCAT is necessary for good medical practice but not sufficient. The sufficiency comes from the 4 years spent in undergraduate medical training, the multiple years of residency, and the lifelong learning on which one embarks in unrestricted medical practice. These intervening experiences (undergraduate medical education, residency, and practice) no doubt help explain the attenuation of the MCAT's association with later-occurring outcomes of medical education. It may be that good MCAT scores are simply a surrogate marker for one's ability to navigate our educational system as it is currently constructed.

Finally, this study examined how several different measures of the MCAT predicted performance. Of the measures studied, the highest single score and the average score across all MCAT administrations showed the greatest predictive validity. This finding is similar to that of Zhao and colleagues2 and may help admissions committees decide how best to use scores when an applicant has sat for the examination multiple times.

The introduction of the new MCAT may alter these correlations. The hope for the new examination is that it will test integration and application of knowledge, skills that may better correlate with a student's ability to integrate historical and physical findings while devising diagnostic and treatment plans. In addition, the debut of a section addressing the psychological, social, and biological foundations of behavior may be associated with the skills required to understand patients in their context. Future research will be needed to determine how the new MCAT relates to medical education outcomes.

This study has several limitations. It is a single-school study conducted over the course of two graduating classes. It may be that the sample size (340 students), limited by the number of classes that received identical postgraduate surveys, was too small to detect some associations that would otherwise be present. However, the demonstrated associations were, for the most part, weak; any additional associations not seen in this study would be expected to be weak as well. On the other hand, a strength of the study is that the military health system uniquely allowed for the continuous assessment of a population of medical learners from initial admission (MCAT) through the PGY-1 year (the PD's assessment), an advantage not shared by other medical schools.

This study is also limited by a common weakness shared with other studies attempting to link MCAT scores with student performance: range restriction. Because admissions committees tend to admit only students at the upper end of the MCAT range, it is difficult to know how premedical students scoring in the lower range of the test would perform. It is possible that more pronounced effects would be seen if students with a wider range of scores were matriculated.
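To give a sense of how much range restriction can depress an observed correlation, the sketch below applies Thorndike's Case II correction for direct range restriction on the predictor. The observed correlation and the applicant-pool standard deviation used here are illustrative assumptions, not estimates from this study.

```python
import math

def correct_for_range_restriction(r_restricted: float, sd_unrestricted: float, sd_restricted: float) -> float:
    """Thorndike Case II correction for direct range restriction on the predictor."""
    u = sd_unrestricted / sd_restricted
    return (r_restricted * u) / math.sqrt(1 + (u ** 2 - 1) * r_restricted ** 2)

# Illustrative values only: an observed r of 0.30 among matriculants, whose MCAT SD
# (about 2.9 in Table I) is assumed to be narrower than a hypothetical applicant-pool SD of 5.0.
print(round(correct_for_range_restriction(0.30, 5.0, 2.9), 2))  # about 0.48
```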

Another limitation is the inherent difficulty in accounting for all the intervening steps between an initial admissions examination, the MCAT, and later occurring educational assessments, like the program director's evaluation. One would expect a significant amount of the variance in residency performance to be attributable to both the undergraduate medical preparation that follows the MCAT and the training that occurs in residency—time spent in the anatomy lab, mastery in small group case discussions, teaching at patient bedsides, learning through the semi-independent assessment of patients. The MCAT measures preparation before medical school, but the larger predictor of future success would be expected to be all the training that happens from the time students enter school to the time they complete their internships. A study taking both pre-matriculation factors such as the MCAT and medical education factors (such as curricula, teaching modalities, assessment methodologies, support services) into account would give a stronger sense of what the MCAT and other educational variables can and cannot reliably predict.

In summary, the MCAT demonstrated weak predictive validity for performance on knowledge-based (especially multiple choice question–based) assessments of medical learner progress, but it was a poor predictor of assessments relying on observation of patient interactions, whether during a discrete event (the OSCE) or during a yearlong training program (the PD assessment). As expected, the MCAT's association with future performance wanes as one progresses in time from its administration. The expertise literature suggests that some association should be present but that it should be attenuated by the lag in time between the MCAT and later assessments, as well as by the use of differing assessment modalities (direct clinical observation as opposed to multiple choice questions). The MCAT remains one tool in an admissions committee's toolbox for selecting students with the greatest potential to become future physicians of a kind consistent with institutional mission. Given the low correlations, we believe that additional investigation into other selection measures that may predict future physician performance is warranted.

1. Callahan CA, Hojat M, Veloski J, Erdmann JB, Gonnella JS: The predictive validity of three versions of the MCAT in relation to performance in medical school, residency, and licensing examinations: a longitudinal study of 36 classes of Jefferson Medical College. Acad Med 2010; 85: 980–7.
2. Zhao X, Oppler S, Dunleavy D, Kroopnick M: Validity of four approaches of using repeaters' MCAT scores in medical school admissions to predict USMLE Step 1 total scores. Acad Med 2010; 85: S64–7.
3. Donnon T, Paolucci EO, Violato C: The predictive validity of the MCAT for medical school performance and medical board licensing examinations: a meta-analysis of the published research. Acad Med 2007; 82: 100–6.
4. Julian ER: Validity of the Medical College Admission Test for predicting medical school performance. Acad Med 2005; 80: 910–7.
5. Huff KL, Koenig JA, Treptau MM, Sireci SG: Validity of MCAT scores for predicting clerkship performance of medical students grouped by sex and ethnicity. Acad Med 1999; 74: S41–4.
6. Dunleavy DM, Kroopnick MH, Dowd KW, Searcy CA, Zhao X: The predictive validity of the MCAT exam in relation to academic performance through medical school: a national cohort study of 2001–2004 matriculants. Acad Med 2013; 88: 666–71.
7. Monroe A, Quinn E, Samuelson W, Dunleavy DM, Dowd KW: An overview of the medical school admission process and use of applicant data in decision making: what has changed since the 1980s? Acad Med 2013; 88: 672–81.
8. Markert RJ: Predicting residency performance with the new Medical College Admission Test. Med Educ 1986; 20: 512–5.
9. Colliver JA, Verhulst SJ, Williams RG: Using a standardized-patient examination to establish the predictive validity of the MCAT and undergraduate GPA as admissions criteria. Acad Med 1989; 64: 482–4.
10. Kreiter CD, Kreiter Y: A validity generalization perspective on the ability of undergraduate GPA and the medical college admission test to predict important outcomes. Teach Learn Med 2007; 19: 95–100.
11. Hamdy H, Prasad K, Anderson MB, et al: BEME systematic review: predictive values of measurements obtained in medical schools and future performance in medical practice. Med Teach 2006; 28: 103–16.
12. Ericsson KA, Charness N, Feltovich PJ, Hoffman RR (editors): The Cambridge Handbook of Expertise and Expert Performance. Cambridge University Press, 2006.
13. Dong T, Swygert KA, Durning SJ, et al: Validity evidence for medical school OSCEs: associations with USMLE® step assessments. Teach Learn Med 2014; 26: 379–86.
14. Dong T, Durning SJ, Gilliland W, Swygert K, Artino AR Jr: Development and initial validation of a program director's evaluation form for medical school graduates. Mil Med 2015; 180(4 Suppl): 97–103.