BACKGROUND: Studies of ecologic or aggregate data suffer from a broad range of biases when scientific interest lies with individual-level associations. To overcome these biases, epidemiologists can choose from a range of designs that combine these group-level data with individual-level data. The individual-level data provide information to identify, evaluate, and control bias, whereas the group-level data are often readily accessible and provide gains in efficiency and power. Within this context, the literature on developing models, particularly multilevel models, is well-established, but little work has been published to help researchers choose among competing designs and plan additional data collection. METHODS: We review recently proposed "combined" group- and individual-level designs and methods that collect and analyze data at 2 levels of aggregation. These include aggregate data designs, hierarchical related regression, two-phase designs, and hybrid designs for ecologic inference. RESULTS: The various methods differ in (i) the data elements available at the group and individual levels and (ii) the statistical techniques used to combine the 2 data sources. Implementing these techniques requires care, and it may often be simpler to ignore the group-level data once the individual-level data are collected. A simulation study, based on birth-weight data from North Carolina, is used to illustrate the benefit of incorporating group-level information. CONCLUSIONS: Our focus is on settings where there are individual-level data to supplement readily accessible group-level data. In this context, no single design is ideal. Choosing which design to adopt depends primarily on the model of interest and the nature of the available group-level data.
From the (a)Department of Biostatistics, Harvard School of Public Health, Boston, MA; and (b)Department of Epidemiology and Program in Public Health, University of California at Irvine, Irvine, CA.
%0 Journal Article
%1 Haneuse2011
%A Haneuse, Sebastien
%A Bartell, Scott
%D 2011
%J Epidemiology (Cambridge, Mass.)
%K Bias(Epidemiology) BirthWeight ConfoundingFactors(Epidemiology) DataCollection DataInterpretation EffectModifier Epidemiologic EpidemiologicResearchDesign EthnicGroups EthnicGroups:statistics&numericaldata Female Humans Infant LowBirthWeight Male Models Newborn NorthCarolina RiskFactors SexFactors Statistical
%N 3
%P 382-389
%R 10.1097/EDE.0b013e3182125cff
%T Designs for the combination of group- and individual-level data
%U http://ovidsp.ovid.com/ovidweb.cgi?T=JS&CSC=Y&NEWS=N&PAGE=fulltext&D=ovftl&AN=00001648-201105000-00020 http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=3347777&tool=pmcentrez&rendertype=abstract
%V 22
%X BACKGROUND: Studies of ecologic or aggregate data suffer from a broad range of biases when scientific interest lies with individual-level associations. To overcome these biases, epidemiologists can choose from a range of designs that combine these group-level data with individual-level data. The individual-level data provide information to identify, evaluate, and control bias, whereas the group-level data are often readily accessible and provide gains in efficiency and power. Within this context, the literature on developing models, particularly multilevel models, is well-established, but little work has been published to help researchers choose among competing designs and plan additional data collection. METHODS: We review recently proposed "combined" group- and individual-level designs and methods that collect and analyze data at 2 levels of aggregation. These include aggregate data designs, hierarchical related regression, two-phase designs, and hybrid designs for ecologic inference. RESULTS: The various methods differ in (i) the data elements available at the group and individual levels and (ii) the statistical techniques used to combine the 2 data sources. Implementing these techniques requires care, and it may often be simpler to ignore the group-level data once the individual-level data are collected. A simulation study, based on birth-weight data from North Carolina, is used to illustrate the benefit of incorporating group-level information. CONCLUSIONS: Our focus is on settings where there are individual-level data to supplement readily accessible group-level data. In this context, no single design is ideal. Choosing which design to adopt depends primarily on the model of interest and the nature of the available group-level data.
%@ 1044-3983
@article{Haneuse2011,
abstract = {BACKGROUND: Studies of ecologic or aggregate data suffer from a broad range of biases when scientific interest lies with individual-level associations. To overcome these biases, epidemiologists can choose from a range of designs that combine these group-level data with individual-level data. The individual-level data provide information to identify, evaluate, and control bias, whereas the group-level data are often readily accessible and provide gains in efficiency and power. Within this context, the literature on developing models, particularly multilevel models, is well-established, but little work has been published to help researchers choose among competing designs and plan additional data collection. METHODS: We review recently proposed "combined" group- and individual-level designs and methods that collect and analyze data at 2 levels of aggregation. These include aggregate data designs, hierarchical related regression, two-phase designs, and hybrid designs for ecologic inference. RESULTS: The various methods differ in (i) the data elements available at the group and individual levels and (ii) the statistical techniques used to combine the 2 data sources. Implementing these techniques requires care, and it may often be simpler to ignore the group-level data once the individual-level data are collected. A simulation study, based on birth-weight data from North Carolina, is used to illustrate the benefit of incorporating group-level information. CONCLUSIONS: Our focus is on settings where there are individual-level data to supplement readily accessible group-level data. In this context, no single design is ideal. Choosing which design to adopt depends primarily on the model of interest and the nature of the available group-level data.},
added-at = {2023-02-03T11:44:35.000+0100},
author = {Haneuse, Sebastien and Bartell, Scott},
biburl = {https://www.bibsonomy.org/bibtex/25640a20f820b9ba24825e90d33f02539/jepcastel},
affiliation = {Department of Biostatistics, Harvard School of Public Health, Boston, MA; Department of Epidemiology and Program in Public Health, University of California at Irvine, Irvine, CA},
doi = {10.1097/EDE.0b013e3182125cff},
interhash = {131fe71c08511c2b9270ca92a39b9011},
intrahash = {5640a20f820b9ba24825e90d33f02539},
issn = {1044-3983},
eissn = {1531-5487},
journal = {Epidemiology (Cambridge, Mass.)},
keywords = {Bias(Epidemiology) BirthWeight ConfoundingFactors(Epidemiology) DataCollection DataInterpretation EffectModifier Epidemiologic EpidemiologicResearchDesign EthnicGroups EthnicGroups:statistics&numericaldata Female Humans Infant LowBirthWeight Male Models Newborn NorthCarolina RiskFactors SexFactors Statistical},
month = may,
note = {6098; [Review]; NLM Journal Code: a2t, 9009644},
number = 3,
pages = {382--389},
pmid = {21490533},
timestamp = {2023-02-03T11:44:35.000+0100},
title = {Designs for the combination of group- and individual-level data},
url = {http://ovidsp.ovid.com/ovidweb.cgi?T=JS&CSC=Y&NEWS=N&PAGE=fulltext&D=ovftl&AN=00001648-201105000-00020 http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=3347777&tool=pmcentrez&rendertype=abstract},
volume = 22,
year = 2011
}