“Induced labor at 39 weeks may reduce likelihood of C-section, NIH study suggests”.
Wait, what? But aren't there years of research showing that women who are induced have a higher C-section rate than women who go into labor spontaneously? And what about those studies, including Cochrane Reviews, that indicate an increase in maternal postpartum hemorrhage and as much as a fivefold increase in the rate of postpartum hysterectomy in women whose labor was induced? (Guerra et al., Dunne et al., Vardo et al.)
But that headline was the widely reported and apparently well-received conclusion of a study published in the August 9, 2018 edition of the New England Journal of Medicine. The study, a randomized trial, does indeed appear to suggest that elective induction of labor at 39 weeks gestation reduces the likelihood of a C-section. Curiously, this finding, which does contradict much of the published literature, was not one of the primary outcomes the study was designed to examine. The principal goal of the study was to address an issue that is becoming a significant point of contention among obstetricians: does inducing labor result in better perinatal outcomes than awaiting the spontaneous onset of labor, and if so, at what gestational age? In all the hype over the apparent benefit of fewer C-sections, it seems to have been missed that the primary finding of the study was that induction at 39 weeks did not improve perinatal outcomes. So what is going on here?
In recent years a coterie of obstetricians has been promoting the position that “nothing good happens after 39 weeks”, and currently that position appears to be gaining favor, among academics if not yet among the rank and file. Famously, at the 2016 annual ACOG meeting, two obstetricians were asked to debate the case for inducing at 39 weeks. To everyone's surprise, both men elected to support the case for induction. Polls of the audience taken before and after the debate showed that support for induction, which began at 20 percent of those attending, rose to 72 percent.
Still, among the rank and file obstetricians who did not happen to attend the “debate”, there has remained resistance to the idea of inducing everyone with an uncomplicated pregnancy at 39 weeks. There is apparently something very unsatisfying about the conclusion that human females are no longer capable of conceiving, nurturing, and birthing healthy offspring without medical intervention. But to one of the debate participants, Dr. Errol Norwitz, that proposition is not only not unsatisfying but actually self-evident. “Nature is a terrible obstetrician”, crowed Norwitz, in the sadly misguided belief that obstetricians, who have made dubious calls at almost every juncture in their tenure (see puerperal fever, DES, “twilight sleep”, etc.), are somehow more capable of safely delivering human offspring than human women are.
This new headline-making study was intended to address that big question: are perinatal outcomes better when pregnant women are induced at 39 weeks than when they are not? As formulated in the study: “The ARRIVE trial (A Randomized Trial of Induction Versus Expectant Management) was designed to test the hypothesis that elective induction of labor at 39 weeks would result in a lower risk of a composite outcome of perinatal death or severe neonatal complications than expectant management among low-risk nulliparous women.”
The study, published as “Labor Induction versus Expectant Management in Low Risk Nulliparous Women” (N Engl J Med 2018;379:513-523), did indeed provide an answer to that question: No! The study found there was no statistically significant difference in the primary outcome (the thing the experiment was designed to measure) of “composite perinatal morbidity”. But that was not the outcome they expected to find. There was, however, a secondary outcome that wasn't expected but could be made to serve the same purpose of promoting induction at 39 weeks: women who were induced at 39 weeks had fewer C-sections than women who weren't. That would seem to make a strong argument for offering an induction to every pregnant woman at 39 weeks. After all, all parties to this debate agree that lowering the cesarean rate is a laudable goal. And if that is what they found, it certainly should be published as such. But is that actually what they found?
When researchers seem to be heavily invested in supporting their hypothesis, that induction at 39 weeks was “better” even if it was for reasons other than those they expected, and especially when that hypothesis flies in the face of the consensus of existing literature, it is wise to look deeper into the methodology for biases, intentional or otherwise, and to examine carefully the data that is being presented and the conclusions that are being drawn from it.
There are in fact a number of serious methodological critiques of this research, which cast significant doubt on the validity of even the outcome they chose not to feature as well as the outcome they did. I will address some of those in greater detail below. But the major issue, which severely distorts the results as presented and the meaning that is being widely broadcast as supporting induction at 39 weeks, is a little sleight of hand hiding in plain sight.
Take a close look at the title of the study:
“Labor Induction versus Expectant Management in Low Risk Nulliparous Women.”
“Low Risk Nulliparous Women” is valid on its face. “Women who have not previously delivered a child and have no identified risk factors” is a reasonable interpretation. Certainly the methods section should contain a detailed explanation of what is considered “low risk” for the purposes of this study (and it does), but most obstetricians, midwives, and expectant mothers would not differ in picking out pregnancies that qualified as low risk and pregnancies that didn't. The term identifies pregnancies that readers would agree with reasonable consistency are low risk.
“Labor Induction”: some variation on “intervening in order to bring about the onset of labor where it has not already begun, with the ultimate goal of delivering the child” should do. It doesn't specify how labor induction is to be accomplished, but that is left up to the inducing physician/midwife and their properly informed patients to define and is presumably within the range of options routinely employed to induce labor. In the context of this study labor induction was defined thusly:
“Women in the induction group were assigned to undergo induction of labor at 39 weeks 0 days to 39 weeks 4 days.”
Straightforward enough.
“Expectant Management.” If you chose some variation on “not intervening to bring about labor before it spontaneously begins, unless there were to be a medical indication requiring labor to be induced”, you would be in very good company. Also known as “watchful waiting”, the idea is to let the pregnancy proceed without intervention unless there is a medical reason to intervene. Indeed, the textbook definition reads: “Expectant management of the pregnancy involves nonintervention at any particular point in time and allowing the pregnancy to progress to a future gestational age. Women undergoing expectant management may go into spontaneous labor or may require indicated induction of labor at a future gestational age.”
So most reasonable observers, and definitely the press who picked up and publicized this research, would interpret this as a comparison of inducing (at 39 weeks) versus not inducing (waiting for spontaneous labor unless there were medical reasons to intervene).
Now let's see what the term “expectant management” was operationally defined as in the context of this study:
“Women in the expectant management group were asked to forego elective delivery before 40 weeks 5 days and to have delivery initiated no later than 42 weeks 2 days”
Forego elective delivery before 40 weeks 5 days? What elective delivery? Elective delivery is not expectant management. How many of these women chose to have elective deliveries (and what types of deliveries did they have?). That information was not even collected, or if it was it was neither presented nor acknowledged in the paper.
So what we actually have here is a study comparing elective induction at 39 weeks with elective induction at 40 weeks 5 days or later, or elective C-section, or the spontaneous onset of labor, or indicated induction for medical reasons, or indicated induction for dates beyond 42 weeks 2 days. That is very different from induced labor versus natural labor, which is how most reasonable readers, including most obstetricians and midwives, are interpreting this study.
Further, and strangely without explanation, the characteristics of the vaginal deliveries in the expectant management arm were not published, and may not even have been collected. What percentage of women in that group had induced labors? And at what gestational ages? We do know from the data that the average length of gestation in the expectant management group was only 5 days longer than the average length of gestation in the induction group. Some of those were certainly spontaneous labors, but how many? For that matter how many labors in the induction group were spontaneous? Surely some of them delivered before their scheduled induction.
We really do not know and cannot glean from this data exactly what management was being compared with what other management. What we do know for certain is that this study does not compare induction at 39 weeks with expectant management, and it does not compare induction with spontaneous labor.
Let's unpack the actual group comparison. From the discussion section:
“These findings contradict the conclusions of multiple observational studies that have suggested that labor induction is associated with an increased risk of adverse maternal and perinatal outcomes [4-6]. These studies, however, compared women who underwent labor induction with those who had spontaneous labor, which is not a comparison that is useful to guide clinical decision making. Conversely, our findings are consistent with observational studies [7-11, 20-23], as well as the randomized trial conducted by Walker et al. [12], in which women undergoing labor induction were compared with women undergoing the actual clinical alternative of expectant management.”
The references cited in the study indicate that induction of labor increases adverse maternal and perinatal outcomes and increases the rate of cesarean delivery as compared to normal, spontaneous labor. It is highly misleading to then inform the population that induction at 39 weeks decreases the chance of a cesarean delivery and has no impact on maternal or perinatal outcomes compared with expectant management or spontaneous labor.
The excuse for doing so is that previous studies “compared women who underwent labor induction with those who had spontaneous labor, which is not a comparison that is useful to guide clinical decision making”. Why ever not? Certainly there will be instances where women in the expectant management group may have to be induced. Hypertensive disorders of pregnancy, which are more common as the duration of gestation increases, are a perfect example (to which we shall return shortly). But the vast majority of women can safely wait for the onset of spontaneous labor. The number of women who did go into spontaneous labor can be recorded and broken out for subgroup analysis, and should, based on the population prevalence of disorders of pregnancy, represent the vast majority of women assigned to the control group. To say that measuring the outcomes of spontaneous labor is not useful to guide clinical decision making, apparently because it would have excluded the choice to induce labor electively, is just absurd. Is spontaneous labor in low risk pregnancies now so rare that it isn't a useful comparison group? Have we really moved so far away from the process of normal, spontaneous labor that it is no longer relevant, even as a control group, to clinical decision making?
Other methodological concerns.
There are other, more technical questions that arise from even an admittedly superficial review of the methods and results.
One, which is somewhat technical but no less important, is the use of “frequentist” statistics, which lend significance, or lack thereof, to data that in all honesty are not terribly different from each other. In this case the distribution of the primary composite outcome, which was not statistically significantly different between groups, is nevertheless very similar to the distribution of the secondary C-section data, which was statistically significantly different. It is clinically questionable to call one distribution significant and the other insignificant when they are actually quite similar. A Bayesian analysis would have better characterized the results without the burden of declaring that one comparison detects a true difference and the other detects nothing.
Another problem, which the authors do point out in the discussion, is that the assignment to groups was not blinded. This can lead to ascertainment bias, more specifically case ascertainment bias (also known as surveillance bias). The doctors, who were tasked with making the call for induction or cesarean delivery, knew the gestational age of their client and what group she had been randomized to. They would be likely to more closely surveil (monitor) the women known to be at more advanced gestational age. And they would be likely to move to cesarean delivery based on that monitoring more often (confirmation bias).
I wouldn't have been terribly surprised to find that maternal and perinatal morbidity were higher in the expectant management group. After all, they were pregnant for a longer period of time, which creates more opportunity for a pregnancy loss. We know that certain pregnancy-related disorders become more likely the longer a pregnancy lasts. Hypertensive disorders of pregnancy (HDP) are a prime example. Indeed, while this study found no difference in the primary outcome of composite morbidity, it did detect an increase in HDP, even though the average length of gestation was only 5 days longer.
Which brings up another limitation of the study. The incidence of HDP in the US is commonly quoted at between 5.2% and 8.2% of the population (Umesawa and Kobashi, 2017). In this study the rate of HDP was 9.1% in the induction group, higher than the prevalent rate in the US, and 14.1% in the expectant management group, nearly twice as high as the population prevalence and very much higher than in the experimental group. That alone could easily have been responsible for the difference in C-section rates, but clearly a population with an HDP rate of 14.1% cannot be considered a representative sample of the US population, so the results cannot be generalized to the US population.
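The arithmetic behind that comparison can be sketched in a few lines of Python. The four percentages are the ones quoted above; the constant and function names are my own and purely illustrative, and the ratios are simple divisions of the published rates, not a reanalysis of the trial data.

```python
# HDP rates quoted in the text, as percentages.
HDP_INDUCTION = 9.1        # induction group in the study
HDP_EXPECTANT = 14.1       # expectant management group in the study
US_PREVALENCE_LOW = 5.2    # low end of the commonly quoted US range
US_PREVALENCE_HIGH = 8.2   # high end of the commonly quoted US range


def hdp_comparison():
    """Return simple ratios and the absolute gap between the quoted rates."""
    return {
        # How much higher the expectant arm is than the induction arm.
        "expectant_vs_induction": round(HDP_EXPECTANT / HDP_INDUCTION, 2),
        # Expectant arm against each end of the quoted US range.
        "expectant_vs_us_high": round(HDP_EXPECTANT / US_PREVALENCE_HIGH, 2),
        "expectant_vs_us_low": round(HDP_EXPECTANT / US_PREVALENCE_LOW, 2),
        # Induction arm against the high end of the quoted US range.
        "induction_vs_us_high": round(HDP_INDUCTION / US_PREVALENCE_HIGH, 2),
        # Absolute difference between the two arms, in percentage points.
        "gap_percentage_points": round(HDP_EXPECTANT - HDP_INDUCTION, 1),
    }


if __name__ == "__main__":
    for name, value in hdp_comparison().items():
        print(f"{name}: {value}")
```

On these figures the expectant management arm sits about 1.7 times above even the high end of the quoted US range, and both arms exceed that range, which is the point about representativeness.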
I, of course, have no access to the raw data and in truth have only skimmed the surface of the study. There may be other methodological considerations that I haven't yet uncovered but it is already quite clear that these results do not comport with the message being broadcast by the media and that birth professionals are being encouraged to incorporate into the counseling they offer to their clients.
Expectant management is the sticking point here. Its use in this study is not consistent with the accepted definition in the medical literature. It leads the reader to believe the study compares inducing someone with not inducing someone and suggests that inducing is preferable. In fact that was not the case at all. This was not a comparison between inducing and not inducing. It was a comparison between inducing at 39 weeks and an unknown and uncollected amalgam of individual methods of bringing pregnancies to delivery before 42 weeks 2 days. We can't tease out the impact of induction in the expectant management group because the study authors did not collect (or if they did, did not publish) the number of “expectant management” patients who were induced, or at what gestational age. That would be a subgroup analysis worth examining.
The fact is, the literature has been and remains very clear that induction at any gestational age is more likely to lead to cesarean delivery than awaiting spontaneous labor. Indeed, spontaneous labor remains associated with lower C-section rates and lower rates of perinatal and maternal morbidity.
This study compares induction at 39 weeks to what the authors presume represents the current state of expectant management in the US obstetric community. If anything, that suggests the current standard of care isn't very good. However bad nature is as an obstetrician, Dr. Norwitz, in an otherwise normal pregnancy she appears to have much better outcomes than we do.
Guerra, G. Cecatti, J. Souza, J: Elective induction versus spontaneous labor in Latin America. Bulletin of the World Health Organization, 2011, 89:9, 657-665.
Dunne, C. Silva, O. Schmidt, G. et al: Outcomes of Elective Labour Induction and Elective Caesarean Section in Low-risk Pregnancies Between 37 and 41 Weeks Gestation. Journal of Obstetrics and Gynaecology Canada, 2009, 31:12, 1124-1130.
Vardo, J. Thornburg, L. Glantz, J: Maternal and neonatal morbidity among nulliparous women undergoing elective induction of labor. Journal of Reproductive Medicine, 2011, 56, 25-30.
Umesawa, M. Kobashi, G: Epidemiology of hypertensive disorders in pregnancy: prevalence, risk factors, predictors and prognosis. Hypertension Research, 2017, 40:3, 213-220.