Respondent-driven sampling (RDS) employs a variant of a link-tracing network sampling strategy to collect data from hard-to-reach populations. By tracing the links in the underlying social network, the process exploits the social structure to expand the sample and reduce its dependence on the initial (convenience) sample. The current estimators of population averages make strong assumptions in order to treat the data as a probability sample. We evaluate three critical sensitivities of the estimators: (1) to bias induced by the initial sample, (2) to uncontrollable features of respondent behavior, and (3) to the without-replacement structure of sampling. Our analysis indicates: (1) that the convenience sample of seeds can induce bias, and the number of sample waves typically used in RDS is likely insufficient for the type of nodal mixing required to obtain the reputed asymptotic unbiasedness; (2) that preferential referral behavior by respondents leads to bias; (3) that when a substantial fraction of the target population is sampled the current estimators can have substantial bias. This paper sounds a cautionary note for the users of RDS. While current RDS methodology is powerful and clever, the favorable statistical properties claimed for the current estimates are shown to be heavily dependent on often unrealistic assumptions. We recommend ways to improve the methodology.
Description
RESPONDENT-DRIVEN SAMPLING: AN ASSESSMENT OF CURRENT METHODOLOGY - Gile - 2010 - Sociological Methodology - Wiley Online Library
%0 Journal Article
%1 SOME:SOME1223
%A Gile, Krista J.
%A Handcock, Mark S.
%D 2010
%I Blackwell Publishing Inc
%J Sociological Methodology
%K driven respondent sampling sociology
%N 1
%P 285--327
%R 10.1111/j.1467-9531.2010.01223.x
%T RESPONDENT-DRIVEN SAMPLING: AN ASSESSMENT OF CURRENT METHODOLOGY
%U http://dx.doi.org/10.1111/j.1467-9531.2010.01223.x
%V 40
%X Respondent-driven sampling (RDS) employs a variant of a link-tracing network sampling strategy to collect data from hard-to-reach populations. By tracing the links in the underlying social network, the process exploits the social structure to expand the sample and reduce its dependence on the initial (convenience) sample. The current estimators of population averages make strong assumptions in order to treat the data as a probability sample. We evaluate three critical sensitivities of the estimators: (1) to bias induced by the initial sample, (2) to uncontrollable features of respondent behavior, and (3) to the without-replacement structure of sampling. Our analysis indicates: (1) that the convenience sample of seeds can induce bias, and the number of sample waves typically used in RDS is likely insufficient for the type of nodal mixing required to obtain the reputed asymptotic unbiasedness; (2) that preferential referral behavior by respondents leads to bias; (3) that when a substantial fraction of the target population is sampled the current estimators can have substantial bias. This paper sounds a cautionary note for the users of RDS. While current RDS methodology is powerful and clever, the favorable statistical properties claimed for the current estimates are shown to be heavily dependent on often unrealistic assumptions. We recommend ways to improve the methodology.
@article{SOME:SOME1223,
abstract = {Respondent-driven sampling (RDS) employs a variant of a link-tracing network sampling strategy to collect data from hard-to-reach populations. By tracing the links in the underlying social network, the process exploits the social structure to expand the sample and reduce its dependence on the initial (convenience) sample. The current estimators of population averages make strong assumptions in order to treat the data as a probability sample. We evaluate three critical sensitivities of the estimators: (1) to bias induced by the initial sample, (2) to uncontrollable features of respondent behavior, and (3) to the without-replacement structure of sampling. Our analysis indicates: (1) that the convenience sample of seeds can induce bias, and the number of sample waves typically used in RDS is likely insufficient for the type of nodal mixing required to obtain the reputed asymptotic unbiasedness; (2) that preferential referral behavior by respondents leads to bias; (3) that when a substantial fraction of the target population is sampled the current estimators can have substantial bias. This paper sounds a cautionary note for the users of RDS. While current RDS methodology is powerful and clever, the favorable statistical properties claimed for the current estimates are shown to be heavily dependent on often unrealistic assumptions. We recommend ways to improve the methodology.},
added-at = {2012-07-09T17:50:29.000+0200},
author = {Gile, Krista J. and Handcock, Mark S.},
biburl = {https://www.bibsonomy.org/bibtex/2bb02e7731939bea0e8095e2a5ec1a4d7/emrahcem},
description = {RESPONDENT-DRIVEN SAMPLING: AN ASSESSMENT OF CURRENT METHODOLOGY - Gile - 2010 - Sociological Methodology - Wiley Online Library},
doi = {10.1111/j.1467-9531.2010.01223.x},
interhash = {13961bfa1391def0453075e34d51f83e},
intrahash = {bb02e7731939bea0e8095e2a5ec1a4d7},
issn = {1467-9531},
journal = {Sociological Methodology},
keywords = {driven respondent sampling sociology},
number = 1,
pages = {285--327},
publisher = {Blackwell Publishing Inc},
timestamp = {2012-07-09T18:16:09.000+0200},
title = {RESPONDENT-DRIVEN SAMPLING: AN ASSESSMENT OF CURRENT METHODOLOGY},
url = {http://dx.doi.org/10.1111/j.1467-9531.2010.01223.x},
volume = 40,
year = 2010
}