Why a 3,572-pregnancy study can’t tell you Ozempic is safe in pregnancy
Birth defects happen in 3% of births. Telling 3% from 4% takes far more pregnancies than this study had, which is why “no increased risk” isn’t the same as “safe.”
The clever fix: comparing unhealthy women to unhealthy women.
Why you can’t just run a randomized, placebo-controlled clinical trial among pregnant women.
What the study actually found.
Somewhere this week, a headline will tell a worried woman that Ozempic is safe in pregnancy. It will be based on a careful, honest study that did not find that, could not have found that, and whose authors were at pains to say so.
A study published in the Annals of Internal Medicine addressed a question more women face every year: what happens if you keep taking a GLP-1 drug, semaglutide (Ozempic, Wegovy) or tirzepatide (Mounjaro, Zepbound), into early pregnancy, before you know you’re pregnant? Using insurance records from 3,572 pregnancies between 2011 and 2024, the researchers compared women who kept taking the drug after conceiving to women who stopped. Both groups had been taking it for diabetes or obesity. The researchers looked at pregnancy loss, abnormal fetal growth, and major birth defects, and found no clear increase in any of them among the women who continued.
For the many women who take one of these drugs and then discover, a few weeks in, that they’re pregnant, that is genuinely reassuring. The trouble starts when reassurance gets promoted to proof.
Why “no increased risk” isn’t the same as “safe.”
The study did not find that GLP-1 drugs are safe in pregnancy. It failed to find harm. Those sound alike. They are not. Finding nothing is not the same as proving nothing is there. Search a dark room with a weak flashlight and see no cat, and you haven’t shown the room is empty. You’ve shown the flashlight is weak.
This study had a weak flashlight for birth defects, not because the researchers did anything wrong, but because of how numbers work.
How many pregnancies it would take to spot a 1-point rise.
Major birth defects already happen in about 3% of births in the U.S., roughly 1 in 33 babies, with no drug involved. That’s the normal background rate, and it helps explain why birth defects are the leading cause of infant death. Now suppose a drug pushes that rate from 3% to 4%. That’s a real increase: a third higher in relative terms. To prove it, you have to reliably tell “3 in 100” apart from “4 in 100,” against a background that already drifts on its own from chance. To hear that one extra case per hundred over the noise, you need a lot of exposed pregnancies.
To reliably catch a rise from 3% to 4%, you’d need somewhere around 10,000 to 14,000 pregnancies split evenly between the two groups, depending on whether you want an 80% or a 90% chance of catching it. This study had 3,572 pregnancies in total, and only a fraction of those women continued the drug. At a 3% background rate, that’s about 100 expected birth defects across the whole sample, and far fewer among the continuers. There weren’t enough cases to spot a small increase if one existed.
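For readers who want to check the arithmetic, here is a minimal sketch of where a figure in the 10,000 to 14,000 range can come from. It uses the textbook normal-approximation formula for comparing two proportions. The equal group sizes, the two-sided 5% significance level, and the 80% and 90% power targets are conventional assumptions chosen for illustration, not details taken from the study, and the function name is made up for this example.

```python
from math import ceil
from statistics import NormalDist


def pregnancies_per_group(p_baseline, p_exposed, alpha=0.05, power=0.80):
    """Approximate per-group sample size for comparing two proportions.

    Normal-approximation formula, equal group sizes, two-sided test.
    alpha and power are conventional defaults, not values from the study.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # about 1.96 for a two-sided 5% test
    z_power = z.inv_cdf(power)          # about 0.84 for 80% power
    variance = p_baseline * (1 - p_baseline) + p_exposed * (1 - p_exposed)
    effect = p_exposed - p_baseline
    return ceil((z_alpha + z_power) ** 2 * variance / effect ** 2)


# Background rate of 3% versus a hypothetical drug-raised rate of 4%.
for power in (0.80, 0.90):
    n = pregnancies_per_group(0.03, 0.04, power=power)
    print(f"power {power:.0%}: about {n:,} per group, {2 * n:,} in total")
```

Under these assumptions the totals land near the low and high ends of that range. If the groups are unequal, as they are when only a fraction of women continue the drug, the same total sample has less power, so the real requirement would likely be higher.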
A limit is not a flaw.
So “no increased risk of birth defects” has an honest translation: we couldn’t detect a difference against a 3% baseline with a sample this size. Reassuring, yes. Proof of safety, no. The researchers said as much. They called their estimates imprecise and called for more research. A good study is honest about what it can’t see. The distortion gets added later, by a headline that rounds “couldn’t detect harm” up to “safe.” When a study reports “no increased risk,” ask how many cases it actually had. A handful means the study couldn’t see a small risk, not that there isn’t one.
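To see what “imprecise” can look like in numbers, here is a rough sketch using invented counts. The split of the 3,572 pregnancies into 500 continuers and 3,072 stoppers, and the case counts in each group, are hypothetical. They are not the study’s data. The interval uses the standard log-scale approximation for a risk ratio.

```python
from math import exp, sqrt


def risk_ratio_ci(cases_a, n_a, cases_b, n_b, z=1.96):
    """Risk ratio (group A vs. group B) with an approximate 95% interval,
    using the usual normal approximation on the log scale."""
    rr = (cases_a / n_a) / (cases_b / n_b)
    se_log_rr = sqrt(1 / cases_a - 1 / n_a + 1 / cases_b - 1 / n_b)
    return rr, rr * exp(-z * se_log_rr), rr * exp(z * se_log_rr)


# HYPOTHETICAL counts: both groups sit at roughly the 3% background rate.
# These numbers are for illustration only and are not from the study.
rr, low, high = risk_ratio_ci(cases_a=15, n_a=500, cases_b=92, n_b=3072)
print(f"risk ratio {rr:.2f}, 95% interval {low:.2f} to {high:.2f}")
```

With these made-up counts the two groups look identical, yet the interval stretches from well below 1 to well above the 1.33 that a rise from 3% to 4% would produce. A result like that cannot tell “no effect” from “a third more birth defects,” which is the practical meaning of a handful of cases.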
The clever fix: comparing unhealthy women to unhealthy women.
The researchers compared women who continued the drug to women who stopped — both groups living with diabetes or obesity. The lazy version of this study would compare women on a GLP-1 drug to women not on it. But those groups differ in an obvious way: one is sicker. Find worse outcomes, and you can’t tell whether the drug did it or the underlying disease did. You might be measuring the diabetes and blaming the drug. Researchers call this “confounding by indication,” and comparing continuers to stoppers mostly factors that out, because the main thing separating the two groups is whether they kept taking the medication.
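A toy calculation makes the logic concrete. Every rate below is hypothetical and chosen only for illustration: the drug is assumed to do nothing at all, while the underlying condition is assumed to raise the defect risk from 3% to 5%.

```python
# All rates below are HYPOTHETICAL, chosen only to illustrate the logic.
BACKGROUND_RISK = 0.03  # assumed defect risk without diabetes or obesity
CONDITION_RISK = 0.05   # assumed defect risk with the underlying condition
DRUG_EFFECT = 0.0       # assume the drug itself adds nothing


def defect_risk(has_condition, on_drug):
    """Risk for one pregnancy under the assumptions above."""
    base = CONDITION_RISK if has_condition else BACKGROUND_RISK
    return base + (DRUG_EFFECT if on_drug else 0.0)


# Lazy comparison: drug users (all have the condition) vs. non-users,
# of whom we assume only 10% have the condition.
users = defect_risk(True, True)
non_users = 0.10 * defect_risk(True, False) + 0.90 * defect_risk(False, False)

# Study-style comparison: continuers vs. stoppers, all with the condition.
continuers = defect_risk(True, True)
stoppers = defect_risk(True, False)

print(f"users vs. non-users:     {users:.3f} vs. {non_users:.3f}")
print(f"continuers vs. stoppers: {continuers:.3f} vs. {stoppers:.3f}")
```

In the lazy comparison a harmless drug looks harmful, because the users carry the condition’s extra risk. In the continuer-versus-stopper comparison the gap disappears. Real continuers and stoppers can still differ in other ways, for example in how severe their disease is, which is why the design only mostly removes the problem.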
Why you can’t just run a randomized, placebo-controlled clinical trial among pregnant women.
You cannot ethically assign pregnant women at random to a drug you suspect might harm a fetus, and no one wants that safeguard loosened. So the researchers did the next best thing: they used insurance records to rebuild what a trial would have looked like, defining the groups, the starting point, and the comparison the way a real trial would. The method is called target trial emulation. It’s rigorous, but it’s still a reconstruction. Insurance records show a drug was dispensed, not that it was swallowed. Early miscarriages and some defects go uncounted.
All of this sits inside a larger systemic failure. Pregnant women are routinely left out of drug trials, so when a medication becomes popular, we often have almost no pregnancy safety data for it. We end up rebuilding answers from claims data years later, one drug at a time. The thin evidence here isn’t a quirk of one analysis. It’s the predictable output of a system that treats pregnancy safety as something to reconstruct after the fact rather than study up front. The people asked to absorb that uncertainty are the patients, who get told to “ask their doctor” about safety questions that haven’t been answered.
Fans hear “safe,” skeptics hear “danger.”
People who like these drugs will read “no harm found” as “safe.” People who distrust them will resurface the old animal studies that prompted the caution in the first place and call the same data alarming. Same numbers, two wrong stories. The data are consistent with no large increase in risk, but they cannot rule out a smaller, still meaningful one. Anyone who tells you it’s settled in either direction is reading their own preference into the gaps.
The question it answers, and the one it doesn’t.
Match the evidence to the question, because a study usually answers one specific question, though not necessarily the one you most want answered. This study answers, “I took a GLP-1 drug early, before I knew I was pregnant. Should I panic?” The honest answer is reassuring. This study does not answer, “Should I start, or stay on, one of these drugs while pregnant?” That answer is the same as before: these drugs are not recommended in pregnancy, and this study doesn’t change that.
If you had an accidental early exposure, this is reason for calm, not alarm. If you’re planning a pregnancy or could become pregnant, the standing guidance holds. Talk with your clinician about stopping these drugs and about contraception while you’re on them.

