Dr. Fisher,
A few notes about this physician's observations.
I contacted ABIM today asking many questions about the initial certification exam and the scoring.
As I've seen you mention, they claim to use the (modified) Angoff Method to set the cut score for passing. It involves a bunch of "experts" reading questions and estimating what percentage of "minimally competent" test-takers would answer correctly. They then average all experts' percentages for each question, and average the percentage for all questions, to achieve the minimum passing score. Apparently. No info on who the experts are. Also no definition of what a "minimally competent" test taker is.
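To make the averaging described above concrete, here is a minimal sketch of that two-step calculation. The three questions, three experts, and all of the ratings are invented for illustration; they are not ABIM data, and ABIM's "modified" procedure may differ in ways the description above does not cover.

```python
from statistics import mean


def angoff_cut_score(ratings):
    """Return the cut score as a fraction between 0 and 1.

    ratings[q][e] is expert e's estimate of the fraction of "minimally
    competent" test-takers who would answer question q correctly.
    """
    # Step 1: average the experts' estimates for each question.
    per_question = [mean(question_ratings) for question_ratings in ratings]
    # Step 2: average those per-question values across all questions.
    return mean(per_question)


# Hypothetical ratings: 3 questions, 3 experts each (invented numbers).
ratings = [
    [0.70, 0.60, 0.65],
    [0.50, 0.55, 0.60],
    [0.80, 0.75, 0.70],
]

cut = angoff_cut_score(ratings)
print(f"Minimum passing score: {cut:.1%}")  # prints "Minimum passing score: 65.0%"
```

Note that the result depends entirely on the experts' judgments, which is why who the experts are and how "minimally competent" is defined matter so much.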
That's a weird system, but it wasn't what I found most egregious. Without going into too much detail, there are 240 questions on the test. By my answer report, based on number of missed questions, I got 74% correct. They have interviews with their president and examples online stating you need to get about 65% correct to pass. I blew that away and still failed. So, I called them. To start, they told me that at least 35 questions are "test questions" that end up being thrown out and count neither for nor against you. So we are paying to be research subjects. And paying handsomely. Even if you throw out 35 questions, I still got 70% correct, so I asked how I could've failed if I was still well above the 65% threshold. They told me that not all tests are equivalent and essentially I may have had an easier version and needed to get more correct. So apparently someone could've missed more than I did but still passed if their test was arbitrarily deemed more difficult?
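The arithmetic behind the 70% figure can be checked with a short sketch. It assumes that 74% of 240 rounds to 178 correct answers and that exactly 35 questions were unscored (the letter says "at least 35"); which questions were thrown out is unknown, so the sketch brackets the result between the worst case and the best case.

```python
TOTAL_QUESTIONS = 240
UNSCORED = 35            # assumption: exactly 35 ("at least 35" per the letter)
PERCENT_CORRECT = 0.74   # from the answer report
CUT = 0.65               # approximate passing threshold cited in the letter

# Assumption: 74% of 240 is 177.6, rounded here to 178 correct answers.
correct = round(TOTAL_QUESTIONS * PERCENT_CORRECT)
missed = TOTAL_QUESTIONS - correct
scored = TOTAL_QUESTIONS - UNSCORED

# Worst case: every unscored question was one the test-taker got right.
worst_case = (correct - UNSCORED) / scored
# Best case: every unscored question was one the test-taker missed.
best_case = (correct - max(0, UNSCORED - missed)) / scored

print(f"Scored questions: {scored}")
print(f"Worst case: {worst_case:.1%}, best case: {best_case:.1%}")
print(f"Above {CUT:.0%} even in the worst case: {worst_case > CUT}")
```

Under these assumptions the worst case is 143 of 205, or roughly 70%, which matches the figure in the letter and stays above the 65% threshold.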
I asked many questions about how this could even be considered an equitable way to grade a high-stakes test, but got little response. They did offer to have a psychometric statistician call me. In 7-14 days.
Thanks for your time and for reading. What an arbitrary process. Also, they will not let me see my test or answers, nor can they show me what questions were thrown out. This being clouded in such secrecy just adds to the mistrust from physicians. How can we trust a test that's so arbitrary? Also, if I continue to fail (or am failed, depending on how you look at it), is the ABIM going to pay off my student loans? I've proven competence in residency and now fellowship training. I would argue I've proven competence on this very test too.
ABIM has published several "abstracts" on their webpage about methods of psychometric testing. The first abstract, published in February 2011 and co-authored with a representative from the computerized testing firm PearsonVue, discusses "transitioning the board from linear computer-based test to an adaptive, multistage testlet-based examination" and their "experience 'selling' this change to leadership." In other words, it appears ABIM changed its method of scoring examinations to IRT scoring before February 2011. Secondly, it appears a member of PearsonVue and an ABIM non-physician helped "convince" the ABIM leadership to change methods. (On a side note for those interested, here's a not-so-"simple guide" to Item Response Theory.) The second abstract, published in April 2011, discusses a "transition from classical test theory (CTT) to item response theory (IRT) scoring." What prompted the change in scoring technique is uncertain, but the change appears to have occurred before early 2011. Importantly, with this new technique, the pass rate cut-off appears to be determined AFTER the test is taken and may vary from individual to individual, as the doctor suggests in the email above.
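For readers unfamiliar with IRT, the sketch below shows how a single ability threshold can translate into different percent-correct requirements on different versions of a test. It uses the one-parameter (Rasch) logistic model, which is an assumption made here for simplicity; the abstracts do not say which IRT model ABIM uses, and the item difficulties and the ability cut-off are invented.

```python
import math


def p_correct(theta, b):
    """Rasch model: chance that a test-taker of ability theta answers
    an item of difficulty b correctly."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def expected_percent_correct(theta, difficulties):
    """Expected fraction correct on a form for a test-taker of ability theta."""
    return sum(p_correct(theta, b) for b in difficulties) / len(difficulties)


# Hypothetical item difficulties for two versions of a test (invented).
easier_form = [-1.5, -1.0, -0.5, 0.0, 0.5]
harder_form = [-0.5, 0.0, 0.5, 1.0, 1.5]

THETA_CUT = 0.0  # hypothetical ability level required to pass

easy_cut = expected_percent_correct(THETA_CUT, easier_form)
hard_cut = expected_percent_correct(THETA_CUT, harder_form)

print(f"Percent correct expected at the cut-off, easier form: {easy_cut:.1%}")
print(f"Percent correct expected at the cut-off, harder form: {hard_cut:.1%}")
```

In this toy setup the same ability cut-off corresponds to a higher percent correct on the easier form than on the harder one, which is consistent with what the physician was told by phone. Whether ABIM's forms are actually equated this way cannot be verified from the published abstracts.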
For those interested, the 2005-2015 pass rates for initial ABIM board certification were as follows: 92% (2005), 91% (2006), 94% (2007), 91% (2008), 88% (2009), 87% (2010), 84% (2011), 85% (2012), 86% (2013), 87% (2014), 89% (2015). (Source: the Internet Archive and ABIM websites.)
Was this change in examination methods responsible for the declining pass rates of physicians taking the ABIM certification examination noticed by program directors across the country in 2013? Or was this the way PearsonVue required the test be scored if PearsonVue's computerized test centers were used for scoring (making it a "win-win" for both organizations)? We are left to wonder. The hypotheses considered for the falling pass rates even included the possibility that study methods of millennial physicians were less rigorous. The New England Journal of Medicine's summary of the controversy tried to quell the outcry from millennials who were quick to respond. Even the American Board of Family Medicine's leadership felt compelled to explain their falling pass rates at about the same time.
Isn't it interesting that no one ever entertained the possibility that a change in scoring method at the ABIM had occurred that might have caused the drop in pass rates? If true, the process of passing a physician's initial board certification is inconsistent between test takers, since tests contain questions that are "weighted" differently for each diplomate and the process of determining pass rate cut-offs remains shrouded in secrecy.
It's time for the ABIM to stop hiding their "secret sauce" of test development, changes in test scoring, and test "security" processes and address this young physician's questions directly and honestly. Test scoring and the setting of pass rate cut-offs should be transparent and reproducible for any high-stakes examination administered to US physicians, especially when this self-proclaimed "voluntary" testing and re-testing now affects all US physicians' ability to gain (and retain) employment.
Or is that too much to ask of folks that use physicians as research subjects without their consent, engage in illegal lobbying of Congress and "stakeholders" for their benefit, and delight in moving our testing fees offshore to the Caymans for their own benefit?
* Portions of the letter were edited to protect the physician's anonymity.