Have you seen the results from the field trials for the fifth edition of the Diagnostic and Statistical Manual? The purpose of the research was to test the reliability of the diagnoses contained in the new edition. Reliable (ri-lahy–uh-buhl), meaning “trustworthy, dependable, consistent.”
Before looking at the data, consider the following question: what are the two most common mental health problems in the United States (and, for that matter, most of the Western world)? If you answered depression and anxiety, you are right. The problem is that the degree of agreement between experts trained to use the criteria is unacceptably low.
Briefly, reliability is estimated using what statisticians call the Kappa (k) coefficient, a measure of inter-rater agreement. Kappa is thought to be a more robust measure than simple percent agreement as it takes into account the likelihood of raters agreeing by chance.
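To make the chance correction concrete, here is a minimal sketch of the textbook two-rater version of the statistic (Cohen's kappa). The function name `cohens_kappa` and the ten made-up ratings are assumptions for illustration only; they are not data from the field trials, and the trials' own estimator may differ in detail.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who each label the same cases."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of cases")
    n = len(rater_a)

    # Observed agreement: share of cases where the two labels match.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: for each label, the product of the two raters'
    # marginal rates of using it, summed over all labels.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)

    if expected == 1.0:
        raise ValueError("kappa is undefined when both raters use a single identical label")
    return (observed - expected) / (1 - expected)


# Hypothetical ratings of ten people by two clinicians (illustrative only).
clinician_1 = ["dep"] * 5 + ["none"] * 5
clinician_2 = ["dep", "dep", "dep", "none", "none", "dep", "none", "none", "none", "none"]

# The clinicians agree on 7 of 10 cases, yet kappa comes out at roughly 0.40.
print(round(cohens_kappa(clinician_1, clinician_2), 2))
```

In this toy example, half of the 70 percent raw agreement would be expected from chance alone, which is why kappa lands so much lower than simple percent agreement.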
The results? The likelihood of two clinicians agreeing on a diagnosis when applying the same criteria to assess the same person was poor for both depression and anxiety. Although there is no set standard, experts generally agree that kappa coefficients below .40 can be considered poor; .41-.60, fair; .61-.75, good; and .76 and above, excellent. Look at the numbers below and judge for yourself:
| Major Depressive Disorder | .32 | .59 | .53 | .80 |
| Generalized Anxiety Disorder | .20 | .65 | .30 | .72 |
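For readers who would rather not do the lookup by eye, the rough bands quoted above can be applied to the two rows mechanically. This is only a sketch: the helper name `kappa_band` is hypothetical, and because the quoted ranges leave the boundaries open, it assumes each upper limit (.40, .60, .75) belongs to the lower band.

```python
def kappa_band(kappa):
    """Map a kappa value onto the rough bands quoted in the text.

    Assumption: boundary values (.40, .60, .75) fall in the lower band.
    """
    if kappa <= 0.40:
        return "poor"
    if kappa <= 0.60:
        return "fair"
    if kappa <= 0.75:
        return "good"
    return "excellent"


# Values copied from the two rows above, in the order they appear.
rows = {
    "Major Depressive Disorder": [0.32, 0.59, 0.53, 0.80],
    "Generalized Anxiety Disorder": [0.20, 0.65, 0.30, 0.72],
}

for name, values in rows.items():
    print(name, [kappa_band(v) for v in values])
```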
Now, is it me or do you notice a trend? The reliability for the two most commonly diagnosed and treated “mental health disorders” has actually worsened over time! The same was found for a number of other disorders, including schizophrenia (.46, .76, .81), alcohol use disorder (.40, .71, .80), and oppositional defiant disorder (.46, .51, .66). Antisocial and Obsessive Personality Disorders were so variable as to be deemed unreliable.
Creating a manual of “all known mental health problems” is a monumental (and difficult) task, to be sure. Plus, not all the news was bad. A number of diagnoses demonstrated good reliability: autism spectrum disorder, posttraumatic stress disorder (PTSD), and attention-deficit/hyperactivity disorder (ADHD) in children (.69, .67, and .61, respectively). Still, the overall picture is more than a bit disconcerting, especially when one considers that the question of the manual’s validity has never been addressed. Validity (vuh–lid-i-tee), meaning literally, “having some foundation; based on truth.” Given the lack of any understanding of or agreement on the pathogenesis or etiology of the 350+ diagnoses contained in the manual, the volume ends up being, at best, a list of symptom clusters, not unlike categorizing people according to the four humours (phlegmatic, choleric, melancholic, sanguine).
Personally, I’ve always been puzzled by the emphasis placed on psychiatric diagnoses, given the lack of evidence of diagnosis-specific treatment effects in psychotherapy outcome research. Additionally, an increasing number of randomized clinical trials have provided solid evidence that simply monitoring alliance and progress during care significantly improves both the quality and outcome of the services delivered. Here’s the latest summary of feedback-related research.