Dietary fat underreporting and risk estimation
Sir,
We would like to express concerns about Drs Heitmann and Lissner's conclusion that associations between fat intake and disease risk in observational studies may be overestimated, rather than attenuated, due to underreporting [1]. Unfortunately, their claim rests on treating underreporting as the only source of measurement error in dietary assessment, a situation that seems unlikely in practice, at least for macronutrients. Yet it could affect the interpretation of numerous epidemiological studies, notably in the context of the fat and breast cancer controversy (Prentice et al. [2]; Freedman et al. [3]).
Weaknesses in the demonstration include the objective measurements used to assess underreporting and the lack of consideration of other sources of measurement error. Underreporting of fat intake was evaluated indirectly, using reference measurements for protein and energy intakes. Because underreporting was greater for energy than for protein intake, overreporting in percentage energy from protein was assumed to be exactly compensated for by underreporting in percentage energy from fat (fat density). When the relationship between fat density and low-density lipoprotein cholesterol (LDL-C) was then analysed, the correction for underreporting decreased rather than increased the magnitude of the association. We believe that such a result could have been anticipated and does not provide sufficient proof that associations between fat and disease are overestimated.
Let $Q_E$ and $Q_P$ denote reported energy and protein intake, respectively (as estimated from a diet history questionnaire describing food intake in the previous month), and $R_E$ and $R_P$ the corresponding reference measurements (‘measured energy and protein intakes’ in the authors' words). The reference protein intake was assessed using nitrogen excretion in a single 24-hour urine sample. However, urinary nitrogen measurements are subject to substantial within-subject variability, with a coefficient of variation as large as 13 to 24% (Bingham [4]). In lieu of a direct ‘recovery’ biomarker (i.e. doubly labelled water) (Kaaks et al. [5]), 24-hour energy expenditure derived from self-reported physical activity level and basal energy expenditure served as the reference energy intake. In addition to the potential misclassification of physical activity levels, the estimated 24-hour energy expenditure encompasses uncertainties related to the coefficients for body fat and fat-free mass in the equation of basal energy expenditure (respective standard errors 3.9% and 24.3% of the corresponding estimated parameters) (Garby et al. [6]), as well as uncertainties related to the equation of body fat as a function of sex, age, measured impedance, height and weight ($R^2 = 0.90$) (Heitmann [7]). Even limited error in the reference energy intake may lead to substantial error in protein density $R_P/R_E$, which therefore needs to be taken into account.
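To give a sense of how error in the two reference measurements might combine in the ratio, here is a minimal sketch using the first-order (delta-method) approximation for the coefficient of variation of a ratio. The function name `ratio_cv`, the 10% figure for the energy reference and the assumption of uncorrelated errors are illustrative assumptions, not values from the analysis under discussion; only the 13 to 24% range for urinary nitrogen comes from the text above.

```python
import math


def ratio_cv(cv_numerator, cv_denominator, corr=0.0):
    """Approximate CV of a ratio via a first-order (delta-method) expansion.

    Reasonable only when both CVs are small. `corr` is the ASSUMED
    correlation between the errors in numerator and denominator.
    """
    variance_term = (cv_numerator ** 2 + cv_denominator ** 2
                     - 2.0 * corr * cv_numerator * cv_denominator)
    # Guard against tiny negative values caused by floating-point rounding.
    return math.sqrt(max(variance_term, 0.0))


# Hypothetical illustration: protein CV of 13% or 24% (range quoted in the
# text), combined with an ASSUMED 10% CV for the energy reference.
for cv_protein in (0.13, 0.24):
    print(cv_protein, round(ratio_cv(cv_protein, 0.10), 3))
```

Under these assumed inputs, the approximate coefficient of variation of the protein density comes out at about 16% and 26%, that is, somewhat larger than that of the protein measurement alone.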
Systematic error (underreporting) in fat density $Q_D$ was adjusted for by assuming the following equation for the reference fat density: $R_D = Q_D + \delta$, where $\delta = (Q_P/Q_E) - (R_P/R_E) > 0$ denotes the overreporting in protein density. However, even if systematic bias in $Q_D$ is perfectly corrected, $R_D$ should include at least the within-person random variation still remaining in $Q_D$, plus additional random error due to the fact that the correction relies on single-day measurements and estimated components. Thus, at best, $R_D$ contains classical measurement error and can be represented as $R_D = T_D + \xi$, where $T_D$ denotes true (unmeasured) fat density with variance $\sigma_T^2$, and $\xi$ denotes random error independent of $T_D$ with mean 0 and variance $\sigma_\xi^2$. Under this model, the slope of the linear regression of LDL-C on $R_D$ will be attenuated by a factor
$$\lambda_R = \frac{\sigma_T^2}{\sigma_T^2 + \sigma_\xi^2} = \rho^2(R_D, T_D),$$
where $\rho(\phi, \varphi)$ denotes the correlation coefficient between random variables $\phi$ and $\varphi$. As for the reported fat density $Q_D$, a reasonable measurement error model would be $Q_D = \beta_0 + \beta_1 T_D + \varepsilon$, where $\varepsilon$ is the sum of within-person random error and person-specific bias (Kipnis et al. [8]) and has mean 0 and variance $\sigma_\varepsilon^2$. Compared again with the truth, the slope of the regression of LDL-C on $Q_D$ will be attenuated by a factor
$$\lambda_Q = \frac{\beta_1 \sigma_T^2}{\beta_1^2 \sigma_T^2 + \sigma_\varepsilon^2} = \frac{\rho^2(Q_D, T_D)}{\beta_1}.$$
The observed findings are consistent with $\lambda_Q > \lambda_R$ rather than $\lambda_Q > 1$ as concluded in the article [1]. The former inequality suggests that within-person variation in $R_D$ is large compared with the between-person variance of true fat density $T_D$. This should not be surprising, given that part of the variation is due to substantial random errors in the components of $R_D$, namely $Q_D$ and $\delta$, as mentioned above. But this fact does not mean that $\lambda_Q > 1$ or, equivalently, that $\rho^2(Q_D, T_D) > \beta_1$. On the contrary, results of the OPEN (Observing Protein and Energy Nutrition) study, which used repeated recovery biomarker measurements for protein (24-hour urinary nitrogen) and total energy (doubly labelled water) intakes, suggest that this is not the case for either protein or non-protein density reported on the questionnaire (Kipnis et al. [9]). In OPEN, the attenuation factors were estimated as 0.404 and 0.316 for protein density in men and women, respectively, and as 0.370 and 0.290 for non-protein density (Kipnis et al. [9]).
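The distinction between "less attenuated than the corrected measure" and "overestimated" can be illustrated with a small simulation. The sketch below assumes a linear outcome model with true slope `alpha1`, a classical error model for the corrected measure and a linear error model for the questionnaire, with the questionnaire error taken to be independent of true intake. All parameter values and names (`alpha1`, `sd_xi`, `sd_eps`, `simulate_attenuation`) are hypothetical choices made for illustration; none is estimated from the data discussed in this letter.

```python
import random


def ols_slope(x, y):
    """Least-squares slope of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def simulate_attenuation(n=100_000, alpha1=1.0, sd_t=1.0, sd_xi=1.5,
                         beta0=0.5, beta1=0.6, sd_eps=1.0, seed=0):
    """Return (empirical, theoretical) attenuation factors for R_D and Q_D.

    All defaults are ASSUMED values for illustration only.
    """
    rng = random.Random(seed)
    t = [rng.gauss(0.0, sd_t) for _ in range(n)]             # true fat density
    y = [alpha1 * ti + rng.gauss(0.0, 1.0) for ti in t]      # outcome (e.g. LDL-C)
    r = [ti + rng.gauss(0.0, sd_xi) for ti in t]             # R_D = T_D + xi
    q = [beta0 + beta1 * ti + rng.gauss(0.0, sd_eps) for ti in t]  # Q_D

    # Empirical attenuation: observed slope divided by the true slope.
    lam_r_emp = ols_slope(r, y) / alpha1
    lam_q_emp = ols_slope(q, y) / alpha1

    # Theoretical attenuation under the two error models.
    lam_r_theory = sd_t ** 2 / (sd_t ** 2 + sd_xi ** 2)
    lam_q_theory = beta1 * sd_t ** 2 / (beta1 ** 2 * sd_t ** 2 + sd_eps ** 2)
    return lam_r_emp, lam_r_theory, lam_q_emp, lam_q_theory


print(simulate_attenuation())
```

With these assumed values the theoretical factors are 1/3.25 (about 0.31) for the corrected measure and 0.6/1.36 (about 0.44) for the questionnaire. Correcting for underreporting would then reduce the observed slope even though both slopes remain attenuated relative to the truth.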
The case in which both the predicted variable (LDL-C) and the predictor variable (fat density) are categorical was also considered and yielded similar, although not statistically significant, results [1]. The use of the dichotomised version of LDL-C instead of its continuous version would not change the conclusion because, to an excellent approximation, attenuation factors under logistic regression (for a dichotomous predicted variable) are the same as under linear regression (Rosner et al. [10]). As for a predictor variable categorised into quantiles, the observed relationship can only be attenuated (Kipnis and Izmirlian [11]).
According to the intuitive explanation provided in the paper, underreporting was considered to be at least the major, if not the only, part of error in dietary questionnaire measurements. If that were so, then $\lambda_Q$ would be close to $1/\beta_1$ and, because usually $\beta_1 < 1$ due to the flattened-slope phenomenon (high consumers tend to underreport whereas low consumers tend to overreport), $\lambda_Q$ would indeed be greater than 1. However, empirical data suggest that one cannot neglect random variation in dietary self-report, as it seems in practice to compensate for, and even overwhelm, the overestimating impact of systematic error (Kipnis et al. [9]).
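How much random error is needed to offset the flattened slope can be made explicit. For the linear error model $Q_D = \beta_0 + \beta_1 T_D + \varepsilon$, with $\varepsilon$ assumed independent of $T_D$, the attenuation factor is $\beta_1 \mathrm{var}(T_D) / (\beta_1^2 \mathrm{var}(T_D) + \mathrm{var}(\varepsilon))$. For $0 < \beta_1 < 1$ it exceeds 1 only when $\mathrm{var}(\varepsilon) < \beta_1 (1 - \beta_1) \mathrm{var}(T_D)$. The sketch below encodes this. The function names and the numerical inputs are hypothetical and serve only to illustrate the algebra.

```python
def lambda_q(beta1, var_t, var_eps):
    """Attenuation factor for Q_D = beta0 + beta1*T_D + eps (eps independent of T_D)."""
    return beta1 * var_t / (beta1 ** 2 * var_t + var_eps)


def max_error_variance_for_overestimation(beta1, var_t):
    """Error variance at which lambda_q equals 1 (assumes 0 < beta1 < 1).

    Any larger random-error variance gives lambda_q < 1, i.e. attenuation.
    """
    return beta1 * (1.0 - beta1) * var_t


# Hypothetical inputs: beta1 = 0.8 and unit variance of true fat density.
beta1, var_t = 0.8, 1.0
threshold = max_error_variance_for_overestimation(beta1, var_t)
for var_eps in (0.0, threshold, 1.0):
    print(round(var_eps, 3), round(lambda_q(beta1, var_t, var_eps), 3))
```

With these assumed inputs, the factor equals $1/\beta_1 = 1.25$ when there is no random error, falls to 1 once the error variance reaches 0.16 and is about 0.49 when the error variance equals the variance of true intake.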