Chief Justice Roberts's bad math

Supreme Court Chief Justice John Roberts has a math problem. In his recently issued Year-End Report on the Federal Judiciary, the chief made a basic mistake that would be obvious to anyone who uses simple math concepts – ratios, proportions, averages – in their work or pastimes: sports fans, teachers, fulfillment center operators, bakers, mechanics, and even lawyers, among many others. Roberts’s miscalculation matters because it downplays an ethics problem among federal judges, as I will explain below.

Roberts typically uses his annual report to address an issue confronting the federal judiciary. This year, it was a series of articles in the Wall Street Journal about the many judges who had failed to comply with the federal recusal statute. According to the U.S. Code, judges must disqualify themselves in any case where they own even a single share of stock in one of the parties to the litigation. The Journal’s comprehensive review of dockets discovered that 131 U.S. judges participated in at least 685 cases where they were clearly disqualified under the law during the period 2010-2018 (the last year for which data were available). To his credit, Roberts acknowledged the ethics violations, but he proceeded to minimize them with a math error. (Or maybe it was oblique advocacy; more on that later.) 

According to Roberts, “the 685 instances identified amount to a very small fraction—less than three hundredths of one percent—of the 2.5 million civil cases filed in the district courts in the nine years included in the study. That’s a 99.97% compliance rate.”


That was bad math, resulting in an overstated compliance figure, because it used the wrong denominator. The question presented by the WSJ series was how many times judges had failed to disqualify themselves when their financial disclosure forms revealed that they (or their spouses) actually owned stock in one of the parties in the case. The total number of civil cases over nine years is completely irrelevant to the calculation of a “compliance rate” because there was never any reason for recusal in the overwhelming majority of those matters (either because the judge owned no stocks or because there were no corporate parties involved).

Roberts could easily have reported the actual rate of non-recusal using the right denominator. As the WSJ articles clearly explained, the cases were selected for review by comparing judges’ financial disclosure forms with the companies named in pleadings, which identified a universe of “tens of thousands” of cases in which recusal might have been required. Substituting that number for the exaggerated “2.5 million” figure would have revealed a rate of noncompliance higher by more than an order of magnitude.

My understanding is that the researchers looked at 80,000–100,000 cases from which they identified the 685 recusal violations. The true rate of noncompliance is thus somewhere between about 0.7 and 0.86 percent. That may not seem like a lot, but it is roughly triple the mortality rate from COVID-19 in the U.S., and at least 25 times greater than Roberts’s own estimate of recusal failures.
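
For readers who want to check the arithmetic, here is a minimal sketch in Python. It uses only the figures quoted above; the variable and function names are illustrative, and the 80,000–100,000 range is an estimate of the number of cases reviewed, not an official count.

```python
# Figures quoted in the article. The reviewed-case range is an estimate
# (an assumption here), not a number published by the courts.
VIOLATIONS = 685
ALL_CIVIL_CASES = 2_500_000    # denominator used in the chief justice's report
REVIEWED_CASES_LOW = 80_000    # assumed lower bound on cases actually reviewed
REVIEWED_CASES_HIGH = 100_000  # assumed upper bound on cases actually reviewed


def noncompliance_pct(violations: int, denominator: int) -> float:
    """Return violations as a percentage of the chosen denominator."""
    return 100 * violations / denominator


# Rate implied by dividing by every civil case filed in nine years.
roberts_pct = noncompliance_pct(VIOLATIONS, ALL_CIVIL_CASES)

# Rates implied by dividing only by cases where recusal was at issue.
best_case_pct = noncompliance_pct(VIOLATIONS, REVIEWED_CASES_HIGH)
worst_case_pct = noncompliance_pct(VIOLATIONS, REVIEWED_CASES_LOW)

print(f"All civil cases as denominator: {roberts_pct:.4f}%")
print(f"Reviewed cases as denominator:  {best_case_pct:.3f}% to {worst_case_pct:.3f}%")
print(f"Understatement factor:          {best_case_pct / roberts_pct:.1f}x "
      f"to {worst_case_pct / roberts_pct:.1f}x")
```

The only thing that changes between the two calculations is the denominator; the 685 violations stay fixed. Shrinking the denominator from 2.5 million to the cases where recusal was actually in question is what moves the failure rate from a few hundredths of a percent to most of a full percent.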

It may be that Roberts is just uncomfortable with numbers. At the 2017 oral argument of Gill v. Whitford, the Wisconsin partisan gerrymandering case, he derided the plaintiffs’ reliance on a straightforward statistical measurement as something “I can only describe as sociological gobbledygook.” 

Roberts did allow that there might have been a shortcoming in his “educational background” – he has two degrees from Harvard – but perhaps his comment was more strategic than innumerate. Before taking the bench, Roberts was renowned as one of the finest appellate advocates of his generation, and the latest report was not the first time he selectively stressed statistics that just happened to reinforce his point. 


Roberts devoted his 2006 annual report to a pitch for increasing judges’ salaries, to resolve what he called a “constitutional crisis.” In what was probably a first for a chief justice’s report, Roberts included three color-coded graphs, one of which purported to show how badly judges’ pay had lost ground in comparison to law school deans. While federal district judges had earned slightly more than the Harvard dean in 1969, they had fallen far behind by 2006. 

But the devil was in the baseline. Federal judges had just gotten a huge pay raise in 1969, which only temporarily made them better off than deans. If Roberts had chosen 1968 for the starting point, a year when judges earned less than deans, the comparison would have lost its punch. Economists call this “data dredging.” It is not something that happens by accident. 

As a famous baseball fan who once committed himself to objectively calling balls and strikes, Roberts probably understands statistics better than he lets on. A player’s batting average is computed by dividing his hits by his own times at bat, not by the total number of at-bats by all the players in an entire game. And any player may be made to look better (or worse) by selectively focusing on a hitting streak (or slump) rather than a full season or career.

In baseball, of course, a difference of 0.7 percent may not always be significant, but things can be very different beyond the world of sports. It is a tired but still meaningful observation that nobody would get on an airplane that was likely to crash on seven out of every thousand landings, even if limited to a particular airport and spread over nine years. And nobody should have to appear before a judge who fails to recuse in seven or eight out of every thousand relevant cases.

Steven Lubet is Williams Memorial Professor at the Northwestern University School of Law. His new book is “The Trials of Rasmea Odeh: How a Palestinian Guerrilla Gained and Lost U.S. Citizenship.”