Cancer Screenings and Over-diagnosis: Pick The Outcome That Matters

What are the principles of a good screening test?

The test should identify a clinically important disease that, if left untreated, will cause significant morbidity/mortality.
It should be inexpensive and noninvasive.
The disease being screened for should have a preclinical, pre-symptom phase for which there is a good intervention available.
The test should be highly sensitive and highly specific.
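
To illustrate why sensitivity and specificity matter so much when a disease is uncommon, here is a minimal sketch of the positive predictive value calculation using Bayes' rule. The sensitivity, specificity, and prevalence values are hypothetical numbers chosen for illustration; they do not describe any real screening test.

```python
def positive_predictive_value(sensitivity: float,
                              specificity: float,
                              prevalence: float) -> float:
    """Probability that a positive screen reflects true disease (Bayes' rule)."""
    true_positives = sensitivity * prevalence
    false_positives = (1.0 - specificity) * (1.0 - prevalence)
    return true_positives / (true_positives + false_positives)


# Hypothetical inputs, for illustration only (not data from any real test):
# a test that is 90% sensitive and 90% specific, for a disease that 1% of
# the screened population actually has.
ppv = positive_predictive_value(sensitivity=0.90, specificity=0.90, prevalence=0.01)
print(f"Chance a positive result is a true positive: {ppv:.1%}")
```

Under these assumed numbers, roughly 11 of every 12 positive results would be false positives, which is why the fourth principle carries so much weight when healthy people are screened.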

A good cancer screening test (and the frequency with which the test is performed) will minimize risks to healthy patients and hopefully, through early detection, save the lives of some who would have developed the cancer. When looking at data to determine the efficacy of a cancer screening protocol, we must be careful to look at all of the relevant outcomes and avoid focusing only on surrogate outcomes. By a surrogate outcome, I mean an endpoint that doesn’t necessarily reflect the goal of the intervention. For example, if we do a study on preterm labor, a surrogate outcome might be how many women stay pregnant for 48 hours or one week, or how many deliver preterm rather than at term. But those outcomes are not what we are really interested in. What we really want to know is, Does the intervention change neonatal outcomes? Are babies healthier as a result of the intervention? Similarly, in studying a cholesterol medication, the drug’s effect on cholesterol is a surrogate outcome. What we really want to know is, Does the drug reduce death from heart disease? Or prolong life in general?

With regard to cancer screening, surrogate outcomes include whether the screening leads to more diagnoses of cancer and whether there are fewer deaths caused by the cancer. The additional diagnoses may not reflect true cancer (misdiagnosis) or might represent cancers that needn’t be treated (over-diagnosis). Similarly, the reduction in deaths from the cancer might come at the cost of more deaths due to the screening itself, usually from treatment of mis- and over-diagnosis.

The less deadly and the more responsive to treatment a given cancer is, the more likely it is that a screening test for that cancer will lead to harm. We have screening interventions that are commonly used for breast, cervical, and colon cancer. We often deal with incidental findings on imaging that might suggest cancer, such as an ovarian cyst or thickened endometrium in a postmenopausal woman. How we interpret and how we react to these tests determines how much good or how much harm we do. The more we focus on preventing any possible case of cancer that we might be able to identify, the more harm we do to patients who do not have cancer.

How many women who do not have cancer deserve to be harmed in order to diagnose or prevent one case of cancer in another woman? This is not a hypothetical question. We over-diagnose cancer. We over-treat many cancerous and precancerous lesions. Because of this, we have to consider how many healthy women are harmed in the process of preventing or diagnosing cancer. We have to look at the correct outcomes in our studies, which means looking at total mortality in the population of women who receive an intervention compared to a non-intervention population. It is not enough to say that x number of cases of cancer were diagnosed or treated or prevented. The only outcome that really matters is, Did we extend the life expectancy (or the quality of life) for the patients being screened?

A recent paper in the British Medical Journal entitled Why Cancer Screening Has Never Been Shown to “Save Lives” explores some of the relevant issues. Some data:

In the 30-year Minnesota Colon Cancer Control Study, there were 128 colon cancer deaths / 10,000 participants in the screening group compared to 192 colon cancer deaths / 10,000 in the control group. So colon cancer screening saves lives! Well, not exactly. Total mortality was 7,111/10,000 in the screened group compared to 7,109/10,000 in the control group. Overall mortality was unaffected.

This means that while colon cancer deaths were decreased, deaths from other causes were increased. The authors refer to this as “off-target deaths” and state that they are particularly likely with screening tests associated with “false-positive rates, over-diagnosis of non-harmful cancers, and detection of incidental findings.” This was one of the major problems with PSA screenings for prostate cancer, which led to millions of unnecessary biopsies and deaths from other causes, including an increased risk of suicide in the first year after biopsy. We have to consider all outcomes, not just the one we are interested in. This is why we no longer recommend PSA screening for prostate cancer, chest x-rays for lung cancer screening, urine screening for neuroblastoma, etc. We were doing more harm than good.
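
The arithmetic behind the Minnesota figures quoted above can be sketched as follows. The four death counts are the ones given in this post; the variable and function names are my own, and subtracting colon cancer deaths from total deaths to get "other-cause" deaths is a simplification for illustration.

```python
# Deaths per 10,000 participants, as quoted above from the
# Minnesota Colon Cancer Control Study.
screened = {"colon_cancer": 128, "all_causes": 7111}
control = {"colon_cancer": 192, "all_causes": 7109}


def other_cause_deaths(arm: dict) -> int:
    """Deaths from anything other than colon cancer, per 10,000."""
    return arm["all_causes"] - arm["colon_cancer"]


# Negative numbers mean fewer deaths in the screened group.
colon_diff = screened["colon_cancer"] - control["colon_cancer"]
other_diff = other_cause_deaths(screened) - other_cause_deaths(control)
total_diff = screened["all_causes"] - control["all_causes"]

print(f"Colon cancer deaths: {colon_diff:+d} per 10,000")
print(f"Other-cause deaths:  {other_diff:+d} per 10,000")
print(f"All-cause deaths:    {total_diff:+d} per 10,000")
```

The 64 fewer colon cancer deaths per 10,000 are offset by 66 more deaths from other causes, leaving all-cause mortality essentially unchanged (2 more per 10,000).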

Similarly, there is a balance for all screening tests that tends to maximize benefit while minimizing harms. The debate over onset and frequency of mammograms and pap smears isn’t about saving money; it’s about saving lives. An excellent summary of the merits of the three current US guidelines for mammography is here. But just to be clear, the reason to reduce the number of screenings is to reduce the rate of over-diagnosis, over-treatment, and harm caused to otherwise healthy women. In fact, opting out of mammography altogether will reduce your risk of being diagnosed with breast cancer by 1/3! That’s right. Why? Because as many as 1 in 3 women who have been treated for breast cancer didn’t have breast cancer! What other things can you think of that will reduce your risk of breast cancer by 1/3? The graph at the beginning of this post demonstrates the outcomes of mammography in a way that is helpful for communicating with patients. Note that the data in the graphic is for women over age 50; the numbers are worse when we include women aged 40 and up.

Just because a cancer or a premalignant lesion is detected doesn’t mean that it was worth detecting. Mammography has over-diagnosed more than a million women with early malignancies while not reducing the risk of metastatic lesions. No evidence to date exists to show that mammography reduces total mortality. Not all lesions progress. Early detection is not necessarily a good thing. More screenings will always lead to more diagnoses of cancer, but not necessarily fewer deaths because not all lesions matter and not all diagnoses are accurate. Yet every time a “diagnosis” is made, it adds fuel to the anecdotal flames. “If I had followed the current guidelines, we wouldn’t have found that cancer.” It’s hard to convince doctors or patients that that might have been a good thing.

Take cervical cancer screening guidelines, for instance. Current recommendations are that the first pap smear happen at age 21. Yet almost anyone who has been in practice for a number of years can produce many stories of a woman younger than 21 who had a pap smear and was discovered to have a premalignant condition, say severe dysplasia (CIN 3). There are two ways of looking at that finding:

Thank God we did a pap on her; she was only 17. Luckily for her, we caught it early and intervened with a cold knife cone, curing her. If we had waited until she was 21, she might have presented with an invasive cancer. Pap smears for everyone!
I can’t believe someone did a pap on her. With a woman her age, it’s not surprising she had that initial response to HPV. But by the time she was 21, it’s very likely she would have had a normal pap smear following her immune response to the virus. The conization could have sterilized her or led to an unintentional hysterectomy! HPV vaccines for everyone!

What’s the difference between the two responses? Partly the difference is values. If you are a Gyn-Oncologist who is sick and tired of seeing women die of cervical cancer, you might be more likely to have the first response. But sometimes we lose sight of the forest for the trees. We have to stop looking at surrogate outcomes and consider the total picture. The 17-year-old might have progressed to cancer, but this is exceedingly rare. Is it worth sterilizing 5 women accidentally to prevent one early-stage cervical cancer? No, it isn’t; but that’s also a value statement.

What about an incidental finding? For example, a thickened endometrium in a postmenopausal patient who has no vaginal bleeding. While the CT scan or ultrasound that found the incidental finding was not, per se, a screening test, the same rules apply. Should we biopsy the thickened endometrium just as we would in a woman who was bleeding? According to ACOG, the answer is probably not. Yet, if the endometrium is thickened, there is a significant chance of discovering hyperplasia or even cancer. In fact, if it is ≥ 11 mm, the risk of cancer is 6.7%. So why not biopsy? Well, we have to consider all the outcomes.

For each woman identified with cancer, many more will be identified with hyperplasia.
- All of these women are at risk of undergoing a hysterectomy, many for premalignant lesions that would not have progressed.
- Hysterectomy is associated with a risk of mortality, which increases with age (and perhaps obesity).
- Nonsurgical treatments for hyperplasia, such as progesterone therapy, are also associated with harms (such as blood clots).
- Many women are exposed to harms from over-treatment for every one woman who actually had cancer.
There is no evidence that women who have cancer diagnosed before the onset of bleeding have different outcomes compared to those who are diagnosed after the onset of bleeding.
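
To put the 6.7% figure in perspective, here is a rough sketch of what it implies if every asymptomatic woman with an endometrium ≥ 11 mm were biopsied. The cohort of 1,000 women is hypothetical, and the sketch assumes the 6.7% risk applies uniformly and that biopsy is perfectly accurate. It does not estimate how many of the women without cancer would be found to have hyperplasia, because no figure for that is given here.

```python
# Cancer risk quoted above for an asymptomatic postmenopausal woman
# with an endometrial thickness of 11 mm or more.
cancer_risk = 0.067

# Hypothetical cohort size, chosen only to make the counts easy to read.
cohort = 1000

with_cancer = round(cohort * cancer_risk)
without_cancer = cohort - with_cancer

# Average number of women biopsied to find one cancer.
biopsies_per_cancer = 1 / cancer_risk

print(f"Of {cohort} women biopsied: {with_cancer} with cancer, "
      f"{without_cancer} without")
print(f"Biopsies per cancer found: {biopsies_per_cancer:.1f}")
```

Under these assumptions, about 15 women are biopsied for each cancer found, and roughly 14 of them are exposed to the procedure and whatever follows from it without having cancer.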

So if we think back to our principles of a good screening test, the answer follows: since intervening earlier on a woman with endometrial cancer does not change her outcome, but does expose dozens of women to unnecessary harms (even death), don’t biopsy the asymptomatic woman. Ever? Well, no; not even the ACOG bulletin says never. Since we are Bayesian creatures, we are allowed to use judgment. If a 60-year-old, 400 lb nulliparous woman with a history of PCOS and cervical stenosis presents with an endometrial lining that measures 22 mm, I will probably biopsy her. And my Bayesian Spidey-sense tells me that I will probably find cancer. But even if I do, I have to appreciate that I likely will not change her outcome. Did I help her by diagnosing it earlier? Almost certainly not. If she gets surgery, she may very well die of a pulmonary embolism. So we have to be honest about the good we do patients and the harm we do patients.

Ultimately, our patients’ values do matter. Depending on her values, a 45-year-old woman can be right to have a mammogram and equally right to decline one. But unless we truly understand the risks and benefits, we cannot properly communicate the information she needs to make that value-based decision.