Not All VO2 Max Tests are Created Equal

In early 2024, we began a concerted effort to improve our physical fitness, partly with a goal of maintaining our active lifestyle as we aged. We wanted to keep participating in recreations such as hiking and scuba diving well into the next decade or two. This would require reasonable levels of strength, balance, and cardiovascular fitness, attributes that would naturally decline as we aged if we didn’t consciously maintain them. In fact, the first thing we discovered on starting to exercise is how much decline had already taken place. And we’d also need to avoid a chronic illness or accidental injury, as either would accelerate the decline or put us out of action altogether.

Scuba diving in the Maldives, left, and hiking in North Cascades National Park.

A key measure of cardiorespiratory fitness is VO2 max, the maximum rate at which a person can process oxygen during physical exertion. In Outlive, author Peter Attia expounded the importance of this metric: people with high VO2 max scores relative to their age-group peers live much longer and healthier lives. Attia also discusses VO2 max in detail in the podcast episode Exercise for aging people.

VO2 max essentially measures what we are capable of physically (the lower it is, the less we can do), and it naturally falls over time. Studies indicate VO2 max declines about 10% a decade until age 50, and then it can drop by up to 15% a decade after that. Attia shows a table with VO2 max score ranges by performance group, age, and sex, shown below. Women have a lower VO2 max than their age-equivalent male peers, and the range for each performance group falls with each decade. For example, an elite score for a woman in her 20s is 53, but elite for a woman in her 60s is only 40.

Attia encourages his patients to be in the elite top 2% of their age and sex performance group. If they achieve that, then he tasks them to aim for elite, but for two decades younger. The rationale is to get a high VO2 max now to counteract the inevitable decline. He gives an example of hiking in the mountains, an activity that requires a VO2 max of about 30, roughly the 50th percentile for a 50-year-old woman. If a woman that age wants to continue hiking into her nineties, she needs to have a VO2 max now of 45 to 49 so that she will decline to around 30 as a nonagenarian. Unfortunately, we gave up a bunch by starting so old, but better late than never.
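The arithmetic behind that hiking example can be sketched in a few lines. This is a minimal illustration, not Attia's method: it assumes a constant, compounding decline per decade, and the function name `project_vo2max` is our own invention for the sketch. The 10% and 15% rates are the per-decade figures quoted above.

```python
def project_vo2max(current: float, decades: float, decline_per_decade: float) -> float:
    """Project a VO2 max score forward, assuming (hypothetically) that it
    shrinks by the same fraction every decade and that the loss compounds."""
    return current * (1.0 - decline_per_decade) ** decades


# A 50-year-old looking ahead to age 90 has four decades of decline to absorb.
decades_to_90 = (90 - 50) / 10

for start in (45, 49):
    at_10 = project_vo2max(start, decades_to_90, 0.10)
    at_15 = project_vo2max(start, decades_to_90, 0.15)
    print(f"start {start}: {at_10:.1f} at 10%/decade, {at_15:.1f} at 15%/decade")
```

Under the 10% assumption, a starting score of 45 to 49 lands close to 30 at age 90, which matches the example. At the 15% upper bound the same starting scores fall to the mid-20s, so the real margin depends on which rate applies.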

With this in mind, one of our fitness goals was to improve our VO2 max scores. Fitness watches and VO2 max calculators can produce an estimate, but the gold standard is an exercise stress test in a lab measuring ventilation and the concentration of oxygen and carbon dioxide in the inhaled and exhaled air. VO2 max is reached when the oxygen consumption stabilizes despite an increase in workload.

In researching options in our home of Seattle, we found that the national chain DexaFit had a good reputation and offered both VO2 max testing and another health metric we were interested in (a DEXA body composition scan). We had our first appointment in May of 2024 to obtain a baseline. The VO2 max test takes about 10-15 minutes and typically is done on a treadmill or exercise bike. We went with the treadmill because that is how we were planning to train.

VO2 max test at DexaFit Seattle

Over the following year, we trained to improve our VO2 max once a week using Attia’s recommended 4×4 protocol. This consists of four minutes of steady-state exercise at maximum exertion, followed by four minutes of passive recovery to bring our heart rate back down to 100, repeated six times. We initially used a treadmill, later switched to an elliptical machine, and occasionally used an exercise bike.

We returned to DexaFit every three months to gauge our progress. In this period, James’ score increased steadily, with an overall gain of 25%, a nice improvement. But Jennifer’s decreased with each test, dropping 14% overall. She fell from ‘high’ relative to her age-equivalent peers, to ‘above average’.

We initially assumed she must not be training hard enough. But her heart rate was reaching into the 170s. And near the end of a 4-minute set she was feeling the burning sensation that can occur during intense exercise, indicating she was working near her anaerobic threshold. Also, she wears a CGM (continuous glucose monitor), and her blood sugar often was spiking as the readily available glucose was depleted and her body released more, another indicator that the intensity was fairly high.

Training VO2 max on an elliptical machine.

Given that VO2 max is only expected to decline 15% in a decade without any training, that much of a drop in one year indicated a possible cardiovascular health issue. We researched online for an explanation, but couldn’t find anything beyond cardiovascular problems. And we raised this concern with DexaFit, who didn’t see a problem with the dropping score and had no credible theories to explain it.
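To put the size of that drop in perspective, here is a rough sketch that converts the 15%-per-decade figure into a per-year rate and compares it with Jennifer's 14% one-year drop. It assumes the decline compounds evenly across the decade, which is a simplification, and `annualized_decline` is a hypothetical helper name.

```python
def annualized_decline(decline_per_decade: float) -> float:
    """Convert a per-decade fractional decline into the equivalent per-year
    decline, assuming the loss compounds evenly over ten years."""
    return 1.0 - (1.0 - decline_per_decade) ** (1.0 / 10.0)


expected_per_year = annualized_decline(0.15)   # untrained upper-bound rate
observed_in_one_year = 0.14                    # Jennifer's measured drop
ratio = observed_in_one_year / expected_per_year

print(f"expected: {expected_per_year:.1%} per year")
print(f"observed: {observed_in_one_year:.0%} in one year, about {ratio:.0f}x the expected rate")
```

At roughly 1.6% a year expected, a 14% drop in a single year is more than eight times the untrained rate, which is why it looked like either a health problem or a measurement problem.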

To determine if we were dealing with a health or a test issue, we scheduled a cardiopulmonary stress test at the University of Washington (UW) Lung Function Testing Clinic, the absolute gold standard in VO2 max testing in our area. As expected from a medical facility, the test gathered much more data than those at DexaFit. After spirometry tests to measure lung function and other respiratory tests, Jennifer was hooked up to a breathing hose, similar to DexaFit, but also to a blood pressure cuff and multiple body probes to gather EKG (electrocardiogram) data.

Cardiopulmonary stress test at UW.

The medical test protocol was quite different from DexaFit’s, where the treadmill was immediately ramped to 5 mph after the warmup, and Jennifer ran the entire time. The UW test protocol began at 1.7 mph with a maximum speed of 4.2 mph, and she walked for all but the final 3 minutes.

Since the exertion wasn’t as difficult, Jennifer wasn’t very hopeful that the VO2 max score would be any different than DexaFit’s. But the maximum recorded score was 50.5, putting her at elite for the age group two decades younger. This result was fundamentally different from the DexaFit score of 32, or above average for her age group. And, most importantly, it meant she didn’t have a cardiovascular health issue.

Since the DexaFit numbers were off by such a large margin, the next problem was finding a lab where we could get an accurate test. The UW test was only available with a doctor’s referral, so not really an option. The other choices in the Seattle area were surprisingly limited, but we did find a new lab just opening up in Bellevue, Belmar. Their goal was to bring true medical-grade testing to the general public, beyond the available consumer tests. We got an appointment a couple of weeks after Jennifer’s UW test, allowing us to make a good comparison between the two.

We were impressed with the Belmar testing—the equipment and testing seemed identical to that of UW, including detailed spirometry and other airway tests. They also have a cardiopulmonary medical doctor on staff to really dig into the details when interpreting the results, something we really appreciate. And they understand their equipment and the test protocols well, and carefully calibrate the equipment before each test.

Spirometry and cardiovascular stress testing at Belmar

But the most important factor was the test results. And they were nearly identical to UW’s, giving Jennifer a score of 50.6. We’d found a lab whose results we could trust.

We’re not really sure why Jennifer’s DexaFit VO2 max scores were so low. But that matters not. What is important is that we both have good scores and a reliable test source to ensure we are maintaining our fitness levels so we can hopefully stay active for many years to come.

Hiking to Camp Muir at 10,090 feet (3,075 m).



4 comments on “Not All VO2 Max Tests are Created Equal”
  1. Raffaele says:

    James,

    Thanks for the thoughtful response. My concern may be a bit different: our ability to detect signals in the body seems to be advancing faster than medicine’s ability to understand which of those signals actually matter.

    I also suspect I might not have the confidence to look at certain findings and simply treat them as neutral data without worrying about them.

    You’re right, though — this will likely be one of the real challenges in the years ahead.

    Another aspect that occasionally crosses my mind is the need to trust that medicine remains primarily guided by physicians acting as doctors, rather than drifting toward a more “medical industry” mindset where diagnostics and treatments can also become products to sell.

    To be clear, I’m not someone who rejects diagnostics — quite the opposite. I value them greatly. But that thought about the balance between detection and understanding tends to stay somewhere in the back of my mind.

    Thanks again for the exchange — it’s a fascinating topic.

    • You are definitely right that diagnostics and detection lead treatment in capability for some ailments, which is to say that some problems are detectable but not treatable, so there is room to argue that these diagnostics are non-actionable bad news. For example, brain cancer in its most serious forms is easy to detect and very close to untreatable. You were also concerned with the possibility that control in medical situations is, in some cases, drifting from the doctor to a larger group that is commercially motivated. Arguably, doctors are also paid and therefore a commercial group, but they have sworn to act in the best interests of the patient so, for the most part, their commercial interest shouldn’t be a factor. You’re right that, in many cases, a larger commercially driven diagnostics industry has emerged and is gaining influence, but this trend remains a personal choice. You can choose to work exclusively with a family physician and their referrals or you can choose to acquire additional diagnostic data points.

  2. Raffaele says:

    Jennifer & James

    Really admire the effort you put into understanding your health level with all these tests — VO₂ max, DEXA, and the rest. It shows a real commitment to taking care of yourself.

    At the same time, I have to admit I’ve always been a bit terrified of over-diagnosis.

    My impression is that our diagnostic capability — driven by computing power, imaging, data analysis, and now AI — is advancing incredibly fast. In some ways it’s moving much faster than medicine itself, meaning our ability to “detect” things may outpace our ability to really understand or treat them.

    So we end up seeing anomalies, markers, or “signals” that maybe five or ten years ago we would never have discovered — and that people lived with perfectly well. Now that we can see them, there’s a temptation to interpret them as problems.

    And since medicine is also, inevitably, an industry, sometimes I wonder whether that can lead to treatments or prescriptions that might not actually be necessary.

    Maybe I’m overly cautious, but I do sometimes worry about that dynamic.

    Do you ever have that kind of concern about over-diagnosis, or do you see it differently?

    Good Health !

    • I understand your concerns and it’s perhaps one of the largest and most debated questions in modern medicine. There is concern in some parts of the medical community that when a patient learns of a potential issue they worry greatly about it. A large percentage of these “potential issues” are non-issues, so pointing them out leads to lower quality of life with excess worry on the part of the patient. Another related concern is that each potential issue will drive the patient to request unnecessary diagnostics, some of which also bring patient risk. Each X-ray brings some potential negatives. Exploratory heart operations bring risk. On this concern, patients could end up going through procedures not medically required, and the outcomes of the sum of the procedures could be medically worse than simply leaving the issue alone and not worrying about it. Bringing these two sets of concerns together, we have knowledge of minor medical issues yielding results that are worse than not knowing about them.

      Another perspective on these diagnostics is that more data is always good as long as the patient is able to take a mature, risk-calculated, medically informed approach to the issues found. This theory can be summarized by the belief that “more data is good.” I’m firmly in this camp. I really believe I’m better off having all the data even if it’s inconclusive and may not even be actionable. In the non-actionable case, it’s something to keep in mind in case you come across a diagnostic approach that can answer the question or find some way to mitigate the problem or find some way to detect early if it worsens. In the latter case, you basically don’t worry about it but check periodically for worsening.

      In looking at both camps above, the ethical question is price/performance. Which approach produces the least cost on the medical system for the quality of life delivered. The second one has more diagnostics. The former has more emergency interventions. I’ve met doctors in both camps but my belief is the diagnostics and preventative action is cheaper on average than emergency ward visits and it’s likely to lead to a higher quality of life. But I admit it’s far from clear and there are good doctors on both sides of this debate. Thanks for raising a super important question. Perhaps one of the most important questions in modern health care.
