Base weight: as with ordinary sample survey data, this is the reciprocal of the probability that the sampled person was selected.
Non-response adjustments:
Post-stratification adjustments:

  1. Base weight

NHANES, however, uses a more elaborate complex, multistage design, so the base weight is calculated as follows:
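
In a multistage design the overall selection probability is the product of the per-stage probabilities, and the base weight is its reciprocal. A sketch of that form, assuming the four stages usually described for NHANES (PSU, segment within PSU, household within segment, person within household); the stage labels are an assumption on my part:

```latex
\text{base weight}
  = \frac{1}{P(\text{PSU}) \times P(\text{segment} \mid \text{PSU})
      \times P(\text{household} \mid \text{segment})
      \times P(\text{person} \mid \text{household})}
```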

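Note: PSU stands for primary sampling unit; the relevant discipline is survey sampling.
Reference: Survey Sampling, Lecture 07 (summary of cluster sampling; two-stage sampling with equal and unequal cluster sizes) - 知乎 (zhihu.com)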

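  2. Non-response adjustment
    NHANES data collection can be thought of as a multi-step filtering process. At the start, nearly all sampled persons provide complete demographic data. Next, some take part in the interview while a small share drops out; after that, some take part in the MEC (mobile examination center) examination while another small share drops out. So if the analysis includes MEC examination data, the MEC weight must be used. Some special measurements have stricter collection conditions, such as the morning fasting glucose test in the 2001-2002 cycle, which required participants to fast for at least 9 hours before the appointment; this led to substantial sample loss.

  3. Post-stratification adjustment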
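
The two adjustments listed above can be illustrated with the textbook versions of each technique. This is a minimal sketch, not the procedure NCHS actually runs: the adjustment cells, the toy records, the control totals, and the function names `nonresponse_adjust` and `poststratify` are all hypothetical.

```python
from collections import defaultdict


def nonresponse_adjust(records):
    """Weighting-class adjustment: within each cell, respondents absorb
    the base weight of the non-respondents, who are then dropped."""
    total = defaultdict(float)       # base weight of everyone sampled
    responding = defaultdict(float)  # base weight of respondents only
    for r in records:
        total[r["cell"]] += r["base_weight"]
        if r["responded"]:
            responding[r["cell"]] += r["base_weight"]
    adjusted = []
    for r in records:
        if r["responded"]:
            factor = total[r["cell"]] / responding[r["cell"]]
            adjusted.append({**r, "weight": r["base_weight"] * factor})
    return adjusted


def poststratify(records, control_totals):
    """Scale weights so each cell sums to its known population total."""
    cell_sum = defaultdict(float)
    for r in records:
        cell_sum[r["cell"]] += r["weight"]
    return [
        {**r, "weight": r["weight"] * control_totals[r["cell"]] / cell_sum[r["cell"]]}
        for r in records
    ]


# Hypothetical toy sample: two cells, one non-respondent in cell "A".
sample = [
    {"id": 1, "cell": "A", "base_weight": 100.0, "responded": True},
    {"id": 2, "cell": "A", "base_weight": 100.0, "responded": False},
    {"id": 3, "cell": "B", "base_weight": 200.0, "responded": True},
]
respondents = nonresponse_adjust(sample)
final = poststratify(respondents, {"A": 250.0, "B": 180.0})
for r in final:
    print(r["id"], r["weight"])
```

In the toy data, person 1 first takes over the weight of the non-responding person 2, and both cells are then rescaled to the assumed control totals.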

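Working out the weight for an analysis has two parts: 1. choose the appropriate weight based on the smallest subsample used; 2. adjust it for the number of survey cycles combined.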

  1. Choosing the correct weight in NHANES
    The data release files provide a range of sample weights, such as the interview weight (wtint2yr), the MEC exam weight (wtmec2yr), and several subsample weights.

  2. Choosing the correct weight: examples
    2.1 example 1
    Example 1: All of the variables were collected in the in-home interview
    You are performing an NHANES 2013-2016 analysis to look at the association of race and Hispanic origin, and poverty on previous diagnosis of diabetes among adults aged 20 and over. All of these variables were collected in the in-home interview (N=11,488 adults aged 20 and over).
    Answer: You would use the interview weights for your analysis (wtint4yr). Because this analysis combines multiple survey cycles, you would need to use or create the appropriate multi-year interview weight as described in the following section, “Constructing Weights for Combined NHANES Survey Cycles.”
    2.2 example 2
    Example 2: Some of the variables were collected in the MEC
    You are performing an NHANES 2013-2016 analysis looking at the association of race and Hispanic origin, age, poverty and the prevalence of high blood pressure among adults aged 20 and over. All three demographic variables were collected during the in-home interview (N=11,488). But blood pressure was collected during the MEC exam and MEC questionnaire portion of the survey (N=11,062). MEC-examined sample persons are a subset of those interviewed in the survey.
    Answer: You would use the MEC exam weight for your analysis (wtmec4yr). Because this analysis combines multiple survey cycles, you would need to use or create the appropriate multi-year MEC exam weight as described in the following section, “Constructing Weights for Combined NHANES Survey Cycles.”
    2.3 example 3
    You are performing an NHANES 2013-2016 analysis looking at the association of race and Hispanic origin, age, blood pressure and fasting triglycerides among adults age 20 and over.
    Race and Hispanic origin and age were available from the in-home interview (N=11,488)
    Blood pressure came from the MEC exam. MEC-examined sample persons are a subset of those interviewed in the survey.
    Fasting triglycerides are collected from those sample persons who were subsampled to fast before attending a morning MEC exam session and who actually fasted for at least 8.5 hours before the blood draw… This group is approximately half the sample of those who were MEC examined (N=4,660).
    The MEC sample is a subset of the interview sample, and the fasting triglycerides sample is a subset of the MEC sample, so the fasting subsample weight is the one to choose.
    Answer: You would use the fasting subsample weights. Because this analysis combines multiple survey cycles, you would need to use or create the appropriate multi-year fasting subsample weight as described in the following section, “Constructing Weights for Combined NHANES Survey Cycles.”
    2.4 example 4
    Example 4: Some of the variables were from the 24-hour dietary recall
    Although the 24-hour dietary recall is not considered a subsample, participants who completed this component also have special weights that adjust for non-response to the dietary component and incorporate the day of the week of recall. This adjustment is needed because food intake often varies between weekdays and weekends.
    Answer: You would use the dietary day one sample weight (wtdrd1) for an analysis that uses data from the first dietary recall. In addition, the dietary two-day sample weight (wtdr2d) was constructed for participants who completed two days of dietary recall, and this weight should be used if an analysis uses both days of dietary intake. See the documentation for the dietary component for more information about the sample weights for the dietary intake data.
    If your analysis combines multiple survey cycles, you would need to use or create the appropriate multi-year dietary day one sample weight as described in following section, “Constructing Weights for Combined NHANES Survey Cycles.”
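
The rule running through these examples is to take the weight of the narrowest component that contributes a variable. Below is a minimal sketch of that rule for the nested interview, MEC, and fasting components. The function name `choose_weight` and the component labels are hypothetical, and the dietary weights (wtdrd1, wtdr2d) are left out because Example 4 treats them as a separate case.

```python
# Nesting order taken from the examples: every MEC-examined person was
# interviewed, and every fasting-subsample person was MEC-examined.
NESTING = ["interview", "mec", "fasting"]  # widest -> narrowest

# 2-year weight variable for each component (names as used in these notes).
WEIGHT_FOR = {
    "interview": "wtint2yr",
    "mec": "wtmec2yr",
    "fasting": "wtsaf2yr",
}


def choose_weight(components):
    """Return the 2-year weight variable of the narrowest component used."""
    if not components:
        raise ValueError("at least one component is required")
    unknown = set(components) - set(NESTING)
    if unknown:
        raise ValueError(f"unknown component(s): {sorted(unknown)}")
    narrowest = max(components, key=NESTING.index)
    return WEIGHT_FOR[narrowest]


print(choose_weight(["interview"]))                    # Example 1
print(choose_weight(["interview", "mec"]))             # Example 2
print(choose_weight(["interview", "mec", "fasting"]))  # Example 3
```

The returned name is the 2-year variable; for a multi-cycle analysis it still has to be rescaled as described in the next section.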

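  3. Constructing weights for combined NHANES survey cycles
    3.1 Points to watch when combining cycles:
    3.1.4 Check the inherent assumption that the estimate shows no trend over the combined period (I do not fully understand this point myself).
    3.1.5 Starting in 2003, survey content was kept as stable as possible within each two-year period, in line with the data release cycle. This was not always the case in the first four years of the continuous survey (1999-2002).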
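    3.2 How the weights are constructed
    3.2.1 Combining sample weights for the NHANES 1999-2000 and 2001-2002 cycles
    Because different population bases were used, the two-year weights for 1999-2000 and 2001-2002 are not directly comparable. Therefore, when an analysis combines survey years 1999-2000 with 2001-2002, the 4-year sample weights provided by NCHS must be used.
    3.2.2 Combining sample weights for NHANES 2001-2002 and later cycles
    The two-year sample weights for NHANES 2001-2002 and all subsequent two-year cycles are based on population estimates that incorporate the 2000 Census counts. NCHS does not construct or include every possible weight for combinations of two-year cycles in the public release files, because doing so is impractical. Instead, NCHS gives analysts information on how to combine these cycles and construct appropriate weights. When combining two or more two-year cycles from 2001-2002 onward, a new multi-year sample weight can be computed by simply dividing the two-year sample weight by the number of two-year cycles in the analysis. The table below gives the formulas. (They are written as SAS code, but they are not hard to follow.)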
    1 = 1999-2000
    2 = 2001-2002
    3 = 2003-2004
    4 = 2005-2006
    5 = 2007-2008
    6 = 2009-2010
    7 = 2011-2012
    8 = 2013-2014
    9 = 2015-2016
    10 = 2017-2018

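The table above shows the formulas for combining the MEC weights (wtmec2yr). The same structure applies to combining the interview weights (wtint2yr) or any subsample weight (for example, the fasting subsample weight, wtsaf2yr).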
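
One way to see why the 2-year weight is divided by the number of cycles: assume the 2-year weights of each cycle sum to roughly the same population total N (an assumption for illustration; the real totals drift as the population changes). Pooling k cycles without rescaling would then count the population k times, and dividing by k brings the total back to N.

```latex
w_i^{\text{multi}} = \frac{1}{k}\, w_i^{\text{2yr}},
\qquad
\sum_{c=1}^{k} \sum_{i \in c} w_i^{\text{multi}}
  = \frac{1}{k} \sum_{c=1}^{k} \sum_{i \in c} w_i^{\text{2yr}}
  \approx \frac{1}{k} \cdot k N = N
```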


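Example 1: How to combine four years of data from 1999-2000 and 2001-2002

Answer: You must use the 4-year weight (WTMEC4YR) provided in the SAS demographics file (see the explanation above).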

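Example 2: How to combine four years of data from 2001-2002 and 2003-2004

Answer: When combining survey cycles from 2001-2002 onward, create a four-year weight variable by dividing the 2-year weight (WTMEC2YR) by the number of two-year cycles in the analysis (2).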
if sddsrvyr in (2,3) then MEC4YR = 1/2 * WTMEC2YR;

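Example 3: How to combine six years of data from 1999-2004

Answer: Because you are using both the 1999-2000 and the 2001-2002 surveys, you must start from the 4-year weight (WTMEC4YR) provided for 1999-2002 and the 2-year weight (WTMEC2YR) for 2003-2004 provided in the demographics file. This lets you create a 6-year weight variable (MEC6YR) with the code below.

For six years of data from 1999-2004, the weight should be constructed as: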

if sddsrvyr in (1,2) then MEC6YR = 2/3 * WTMEC4YR; /* for 1999-2002 */
else if sddsrvyr = 3 then MEC6YR = 1/3 * WTMEC2YR; /* for 2003-2004 */

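Example 4: How to combine six years of data from 2001-2006

Answer: Because you are not combining any data from 1999-2000, you can create a 6-year weight variable by dividing the 2-year weight (WTMEC2YR) by the number of two-year cycles in the analysis (3).
if sddsrvyr in (2,3,4) then MEC6YR = 1/3 * WTMEC2YR;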
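
The SAS one-liners above follow a single pattern, sketched here in Python for one person at a time. The function name `multi_year_weight` and its parameters are hypothetical. Generalizing the 2/3 and 1/3 factors to 2/k and 1/k for any number of pooled cycles k is an assumption that extends the examples, and pooling 1999-2000 without 2001-2002 is deliberately rejected because these notes do not cover it.

```python
def multi_year_weight(sddsrvyr, cycles, wt2yr, wt4yr=None):
    """Multi-year weight for one person.

    sddsrvyr : cycle code of this person (1 = 1999-2000, 2 = 2001-2002, ...)
    cycles   : cycle codes pooled in the analysis
    wt2yr    : the person's 2-year weight (e.g. WTMEC2YR)
    wt4yr    : the person's NCHS 4-year weight (e.g. WTMEC4YR), cycles 1-2 only
    """
    cycles = set(cycles)
    if sddsrvyr not in cycles:
        raise ValueError("person does not belong to a pooled cycle")
    k = len(cycles)
    if k == 1:
        return wt2yr  # single cycle: the 2-year weight is used as released
    if 1 in cycles:
        if 2 not in cycles:
            raise NotImplementedError(
                "pooling 1999-2000 without 2001-2002 is not covered here"
            )
        if sddsrvyr in (1, 2):
            if wt4yr is None:
                raise ValueError("cycles 1 and 2 need the 4-year weight")
            # The 4-year weight already spans two cycles, hence 2/k.
            return 2 / k * wt4yr
    # Cycles from 2001-2002 onward: divide by the number of cycles.
    return wt2yr / k


# Pooling 1999-2004 (codes 1, 2, 3) with made-up weights.
print(multi_year_weight(2, {1, 2, 3}, wt2yr=9000.0, wt4yr=6000.0))
print(multi_year_weight(3, {1, 2, 3}, wt2yr=9000.0))
```

Applied row by row to a pooled data set, this would yield one weight column to pass to the survey-analysis procedure in place of the 2-year variable.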
