{"id":11959,"date":"2019-04-02T13:30:00","date_gmt":"2019-04-02T12:30:00","guid":{"rendered":"https:\/\/surveyinsights.org\/?p=11959"},"modified":"2023-07-13T09:05:51","modified_gmt":"2023-07-13T08:05:51","slug":"needles-in-haystacks-and-diamonds-in-the-rough-using-probability-and-nonprobability-methods-to-survey-low-incidence-populations","status":"publish","type":"post","link":"https:\/\/surveyinsights.org\/?p=11959","title":{"rendered":"Needles in Haystacks and Diamonds in the Rough: Using Probability and Nonprobability Methods to Survey Low-incidence Populations"},"content":{"rendered":"<h1>Introduction<\/h1>\n<p>While the strong consensus is to use probability samples as standard practice in survey research, interest in nonprobability samples has been around for decades (Brick, 2014). Both interest and use have grown in recent years as Internet-based sampling, especially using panels, has emerged as a quick, inexpensive method for collecting data, particularly to inform market research (Boyle et al., 2017; Mooi and Sarstedt, 2014) but also to assess public opinion (Ansolabehere and Schaffner, 2014), forecast election outcomes (Wang et al., 2015), and reach \u201chidden\u201d populations, such as those that are stigmatized or not well represented in the general population (Barratt et al., 2015). To date, findings comparing the two types of samples are consistent: probability samples produce estimates that are better than those from nonprobability samples as determined by comparisons to valid, reliable benchmarks (see, for example, Yeager et al., 2011; Pennay et al., 2018).<\/p>\n<p>Regardless, nonprobability samples are embedded in survey research, and practitioners are continuing to examine their properties and suitability. Agreement seems to be emerging that nonprobability surveys may be acceptable when researchers do not intend to generalize results to populations and when they are appropriate for the research questions being posed (Brick, 2014). 
We suggest that researchers also consider another situation where a nonprobability sample may be acceptable: when the target population is so small or hard to survey that investing in a probability sample would be hugely expensive and would produce so many ineligible respondents that the very principles of random sampling would be called into question. This matter becomes particularly salient when techniques such as weighting and propensity matching cannot be used because the populations of interest are so specific that adjustment variables simply do not exist.<\/p>\n<p>This article reports on two studies whose target populations were very small or hard to survey, conditions that make probability sampling challenging. The two studies and their target populations have Tourangeau\u2019s (2014) five characteristics that could make a population hard to survey, namely that individuals may:<\/p>\n<ul>\n<li>have a low incidence in the general population.<\/li>\n<li>be reluctant to identify as part of the population of interest.<\/li>\n<li>not be reachable due to factors including geography, lack of technology such as a computer or phone, or mobility.<\/li>\n<li>not want to answer surveys generally or perceive the topic as sensitive.<\/li>\n<li>have language or cognitive abilities that make interviewing difficult.<\/li>\n<\/ul>\n<p>To study very small or hard-to-survey segments of the population, researchers have typically favored two nonprobability sampling techniques: snowball sampling and respondent-driven sampling (Tourangeau et al., 2014). We suggest that other types of nonprobability samples could be added to these approaches, especially when at least some data are available to assess the comparability of probability and nonprobability samples. 
This article has three objectives: (1) expand the consideration of nonprobability samples beyond the current emphasis on Internet-based panel samples and respondent-driven sampling; (2) describe methods we used in two studies that began with probability samples and augmented them with nonprobability samples to increase the number of responses from low-incidence populations; and (3) contribute to the discussion about the possibility of combining probability and nonprobability samples to answer particular research questions.<\/p>\n<p>Below, we first present background information about the two studies that are the focus of this article, including details about each study\u2019s sample, data collection methods, and response rates. Next, within each study we compare the values of key metrics for data from probability and nonprobability samples. This is followed by regression analyses to ascertain whether the type of sample is associated with the measured values and an examination of the external validity of the measured results. We conclude that the probability and nonprobability samples could be combined within each study to increase the survey sample size for analytical purposes.<\/p>\n<p>We note here one important point. We refer to the initial samples in both studies as probability based, but we used additional information (as discussed below) to increase the likelihood of reaching the studies\u2019 target populations. We do not have indicators about the accuracy and coverage of that additional information. 
In the strictest sense of the term, then, the initial samples are not \u201cprobability based,\u201d but we are comfortable referring to them as such for purposes of the comparisons and conclusions presented in this article.<\/p>\n<h1>Data and Methods<\/h1>\n<p>Between 2015 and 2017, RTI International conducted data collection in two US metropolitan areas, each for a component of a research initiative known as the National Asset Scorecard for Communities of Color. This initiative documents\u00a0wealth disparities among racial and ethnic groups in the United States. The first component was an in-person survey of specific racial and ethnic groups in Los Angeles County (the LA Wealth Inequality study). The second component was a telephone survey in Baltimore City examining the impact of incarceration on household finances (the Baltimore Incarceration study).<\/p>\n<ul>\n<li>The LA Wealth Inequality study asked: <em>What is the financial situation of families from particular racial and ethnic groups, especially in terms of assets and debts?<\/em> Los Angeles County was selected because of its diverse population. The study completed 512 in-person interviews with residents from six racial and ethnic groups: Africans, African Americans, Cambodians, Hispanics, Koreans, and whites. Details about the study design are in Marks et al., 2015.<\/li>\n<li>The Baltimore Incarceration study asked: <em>What is the financial status, in terms of assets and debts, of African American and white households with individual(s) who have been incarcerated, compared to households without an incarceration history?<\/em> The study addresses gaps in research knowledge and was initiated soon after the arrest and death of Freddie Gray in Baltimore, Maryland, and the subsequent unrest there. RTI completed 254 telephone interviews with respondents in Baltimore City. 
Marks and Rhodes (2017) discuss the study design.<\/li>\n<\/ul>\n<p>The survey questions were similar for both studies (Marks et al., 2015; Marks and Rhodes, 2017). The questionnaire began with a screener to determine eligibility for the study, then listed all members of the household. The person with the most knowledge of household financial matters was selected as the respondent. Subsequent sections of the questionnaire addressed labor market participation and income, family assets (interest-earning accounts, stocks and mutual funds, pensions, gifts, real estate, vehicles, businesses, and other financial assets), and family debt (credit cards; personal, business, and student loans; medical bills; real estate; and other debt). Surveys about financial matters are well known to be challenging, and these were no exception (Riphahn and Serfling, 2005; Davern et al., 2005; Kennickell et al., 2000). The interviews took, on average, 45 minutes to complete and probed into personal matters many people typically choose to keep private. Those who completed the interview in either study received a $25 cash incentive.<\/p>\n<h1>The Sample and Data Collection: Los Angeles<\/h1>\n<p>The LA Wealth Inequality study centered on six racial and ethnic groups in Los Angeles County. These groups varied considerably in their proportion of the population, with the Cambodian population making up less than half of one percent of the overall population. Table 1 provides the population data for the groups of interest. These groups have several of the characteristics of a hard-to-survey population described by Tourangeau: low incidence in the population, high mobility, and large proportions of immigrants, who are often reluctant to respond to surveys, making them hard to persuade (Massey, 2014).<\/p>\n<p><strong>Table 1. 
Population of Racial\/Ethnic Groups of Interest in Los Angeles County<\/strong><\/p>\n<p><strong>\u00a0<a href=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/2019\/03\/Table_1.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-12121\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/2019\/03\/Table_1.png\" alt=\"\" width=\"652\" height=\"160\" srcset=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/2019\/03\/Table_1.png 652w, https:\/\/surveyinsights.org\/wp-content\/uploads\/2019\/03\/Table_1-300x74.png 300w\" sizes=\"auto, (max-width: 652px) 100vw, 652px\" \/><\/a><\/strong><\/p>\n<p><em>Source: U.S. Census Bureau, 2011-2015 American Community Survey 5-Year Estimates<br \/>\n<\/em><em><sup>a<\/sup> <\/em><em>Our study sought to interview individuals who were born in Africa and those with a parent or grandparent born in Africa. This population figure refers only to Africa-born residents, so it underestimates the study\u2019s population of interest.<\/em><\/p>\n<p>For this study we used an address-based sample (ABS). ABS offered the most statistically robust approach while containing the cost of conducting an in-person survey, a mode that can collect detailed financial information better than other data collection modes. We drew the sample from the United States Postal Service\u2019s Computerized Delivery Sequence file, which is the best current frame for household surveys in the United States (Harter et al., 2016). Commercial vendors attach flags to the USPS file to indicate household characteristics. One flag indicates the race\/ethnicity of the household. Not all households are flagged, the information does not include the date on which the flag was determined, and the accuracy of the information is unknown. 
Our analysis of the flags for household race\/ethnicity found that the accuracy ranged from 6% to 55% depending on the race\/ethnicity of interest (Rhodes and Marks, 2018b).<\/p>\n<p>To draw the sample, we randomly selected 2,218 households from the USPS list, stratified by major race and ethnic categories: Korean, Cambodian, Hispanic, non-Hispanic black,<a href=\"#_edn1\" name=\"_ednref1\">[1]<\/a> and other (including unknown). Interviewers visited each address and administered a screener to determine eligibility.<\/p>\n<p>We achieved a response rate of 37.4% for the ABS portion of the LA Wealth Inequality study, using the formula for AAPOR response rate 3 (the response rate is 26.1% using AAPOR response rate 1) (AAPOR, 2015).<a href=\"#_edn2\" name=\"_ednref2\">[2]<\/a> Table 2 shows that more than half of the ABS cases are in households where we were unable to determine eligibility because no one was ever home, no one ever responded to letters and notes asking them to call us, or no one ever opened the door. Almost 20% are classified as ineligible, meaning they did not match the racial\/ethnic categories for this study or they fell into a category whose quota had already been reached.<\/p>\n<p><strong>Table 2. Disposition of Address-Based Sample Cases: LA Wealth Inequality Study<\/strong><\/p>\n<p><a href=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/2019\/03\/Table_2.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-12123\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/2019\/03\/Table_2.png\" alt=\"\" width=\"620\" height=\"290\" srcset=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/2019\/03\/Table_2.png 620w, https:\/\/surveyinsights.org\/wp-content\/uploads\/2019\/03\/Table_2-300x140.png 300w\" sizes=\"auto, (max-width: 620px) 100vw, 620px\" \/><\/a><\/p>\n<p>We monitored the sample\u2019s performance to determine whether sampling quotas were met. 
After completing 451 interviews, the study had successfully reached targets for the African, African American, Hispanic, and white racial\/ethnic groups, but would not achieve sufficient numbers of Cambodian and Korean interviews within the available budget. For these two groups, we then transitioned from an address-based sample to a convenience sample.<\/p>\n<p>To locate potential respondents for the convenience sample, we worked with our field interviewers who were from these communities to identify the best ways to contact Cambodian and Korean potential respondents. The field interviewers identified religious institutions, restaurants, and cultural fairs likely to attract Cambodians or Koreans, then visited them and approached adults, explained the purpose of the study, and asked screener questions to see if they were eligible. If yes, interviews were conducted on the spot or scheduled for a convenient time. We completed 25 additional interviews with Cambodian respondents, and 31 additional interviews with Korean respondents.<\/p>\n<h1>The Sample and Data Collection: Baltimore<\/h1>\n<p>For the Baltimore Incarceration study, the mode of data collection and the target sample size were driven by the amount of available funding, informed by power analyses (available from the authors upon request) and loose estimates about the size of likely financial differences between households that did and did not have a history of incarceration. We found no prior research that could even suggest the magnitude of differences in assets and debts or financial status between households with and without an incarceration history. We targeted completed interviews with approximately 140 nonincarcerated and 140 incarcerated households, with each of those evenly divided between African Americans and whites. 
Characteristics of the Baltimore City population (Table 3) showed that reaching the white, incarcerated population would be challenging because of their relatively low prevalence in the city.<a href=\"#_edn3\" name=\"_ednref3\"><sup>[3]<\/sup><\/a><\/p>\n<p><strong>Table 3. Estimated Population of Baltimore City Ever Incarcerated, by Race<\/strong><strong>\u00a0<\/strong><\/p>\n<p><a href=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/2019\/03\/Table_3.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-12125\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/2019\/03\/Table_3.png\" alt=\"\" width=\"701\" height=\"91\" srcset=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/2019\/03\/Table_3.png 701w, https:\/\/surveyinsights.org\/wp-content\/uploads\/2019\/03\/Table_3-300x39.png 300w\" sizes=\"auto, (max-width: 701px) 100vw, 701px\" \/><\/a><\/p>\n<p><em>Sources: <strong>Population data<\/strong>: U.S. Census Bureau, 2011-2015 American Community Survey 5-Year Estimates. <strong>Incarceration data<\/strong>: Bucknor, C. and Barber, A., 2016. &#8220;The price we pay: Economic costs of barriers to employment for former prisoners and people convicted of felonies.&#8221;\u00a0CEPR Reports and Issue Briefs\u00a02016-07, Center for Economic and Policy Research, Washington, DC.<\/em><\/p>\n<p>Households with a history of incarceration are a hard-to-survey population according to Tourangeau\u2019s criteria. As Table 3 indicates, the incidence of ex-offenders in the general population is low, particularly among the white population of Baltimore City. Ex-offenders tend to be low-income and mobile, making them hard to reach. 
If an interviewer is able to reach them, they may not want to declare their ex-offender status, making them hard to identify.<\/p>\n<p>Because we expected difficulties locating the target populations through strict random digit dialing (RDD) methods, we took four steps to increase our chances of reaching the population of interest.<\/p>\n<ol>\n<li>The sampling frame consisted of only cell phone numbers. A cell-only frame offers nearly full population coverage for the low-income population of interest (Mobile Fact Sheet, 2017). Furthermore, a cell-only frame leads to lower total survey error, eliminates adjustments associated with dual-frame designs, and reduces respondent burden (Peytchev and Neely, 2013).<\/li>\n<li>We drew only from cell numbers that were associated with a billing address in Baltimore City or whose area code and first three digits were associated with a Baltimore City rate center. While neither is a perfect indicator of sample member location, the restriction substantially reduced the number of calls (and therefore costs) needed to reach Baltimore City residents.<\/li>\n<li>We removed inactive numbers from the sample frame.<\/li>\n<li>We were able to obtain an indicator of household income for some sample frame numbers from commercial vendors and used that to oversample low-income households, who are more likely to have had contact with law enforcement (Rabury and Kopf, 2015).<\/li>\n<\/ol>\n<p>We drew a final random sample of 43,707 telephone numbers and made 135,163 attempts to call these numbers, administer a screener, and complete an interview with eligible sample members. All working, residential numbers were attempted up to 12 times.<\/p>\n<p>Again using AAPOR response rate 3, we achieved a response rate of 6.7% for the RDD sample portion of the Baltimore study (the AAPOR response rate 1 is 6.5%) (see Table 4). This low rate is consistent with typical RDD surveys of the general population. 
The Pew Research Center (2016) reports that in 2012, the response rate for public opinion polls (a type of telephone survey not known to be especially difficult to conduct) had fallen to 9%.<\/p>\n<p><strong>Table 4. Disposition of RDD Sample Cases: Baltimore Incarceration Study<\/strong><\/p>\n<p><a href=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/2019\/03\/Table_4.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-12131\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/2019\/03\/Table_4.png\" alt=\"\" width=\"623\" height=\"258\" srcset=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/2019\/03\/Table_4.png 623w, https:\/\/surveyinsights.org\/wp-content\/uploads\/2019\/03\/Table_4-300x124.png 300w\" sizes=\"auto, (max-width: 623px) 100vw, 623px\" \/><\/a><\/p>\n<p>We monitored production rates and used case disposition data to determine how well the sample was performing. Monitoring indicated we would not be able to meet the target number of completed interviews with households that had a history of incarceration within the available resources.<\/p>\n<p>We considered nonprobability options for the sample segment that was falling short and discarded two:<\/p>\n<ul>\n<li>Snowball sampling would not work because the target population is unlikely to provide information about other ex-offenders. Moreover, the terms of their probation may prohibit them from contacting other ex-offenders (Administrative Office of the U.S. Courts, 2016).<\/li>\n<li>An intercept survey at or near a prison or jail would require complex and expensive logistical arrangements. 
We would need to apply to the prisons and receive approval; the sample would consist only of those households with current incarceration, which was not the intent of the study; and it would be costly to visit sufficient numbers of institutions and screen family members for residence in Baltimore City.<\/li>\n<\/ul>\n<p>A third option was more promising: We devised a low-cost way to increase the sample size by placing a targeted ad to recruit individuals through Facebook and Instagram. The ability to target specific groups is an advantage of social media advertising over other online recruitment methods. Our ad asked users to click on a link to complete a survey if they or someone in their household had been to jail or prison. Once we had developed the ad and Facebook had approved it, Facebook targeted the ad to individuals in Baltimore using data on the user\u2019s reported current residence and the geolocation of the user\u2019s device; we also attempted to target the ad to those with interests that might correlate with our target populations, such as users who had shown an interest in African American history. Facebook does not allow advertisers to target based on certain user characteristics, including race or criminal history.<\/p>\n<p>People who were interested clicked on the link in the ad, answered eligibility questions, and provided contact information. RTI interviewers telephoned those whose answers suggested they met the eligibility criteria and administered a screener. If they were, in fact, eligible and willing to participate, a telephone interview was conducted.<\/p>\n<p>The ad campaign ultimately reached 181,754 social media users, of whom 696 clicked on the ad\u2019s link, completed a few questions on eligibility, and provided a telephone number where they could be reached. 
We completed 34 interviews with individuals recruited through social media and stopped only after (1) the study\u2019s target for African Americans with an incarceration history had been reached and (2) none of the remaining eligible respondents were whites with an incarceration history, the group that still needed more respondents (Rhodes and Marks, 2018a).<\/p>\n<h1>Results: Los Angeles<\/h1>\n<p>Table 5 compares the key demographic characteristics and financial information for Cambodian and Korean respondents from the probability and convenience samples in Los Angeles. Cambodians are more comparable than Koreans across the two sampling methods for sex and household income; Koreans are more comparable than Cambodians in terms of their average education level.<\/p>\n<p><strong>Table 5. Demographic Characteristics of Probability and Convenience Samples (Cambodian and Korean Respondents)<\/strong><\/p>\n<p><a href=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/2019\/03\/Table_5.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-12132\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/2019\/03\/Table_5.png\" alt=\"\" width=\"719\" height=\"491\" srcset=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/2019\/03\/Table_5.png 719w, https:\/\/surveyinsights.org\/wp-content\/uploads\/2019\/03\/Table_5-300x205.png 300w\" sizes=\"auto, (max-width: 719px) 100vw, 719px\" \/><\/a><\/p>\n<h1>Results: Baltimore<\/h1>\n<p>Characteristics of RDD and social media respondents with a household history of incarceration are presented in Table 6. Across the full sample, RDD and social media sample households share similar demographic characteristics (sex, age, education level). 
Cell sizes for the social media sample are small, but the data show that findings are somewhat inconsistent: for the African American group, social media recruitment resulted in more female respondents and higher-income households; for the white group, social media recruitment resulted in respondents with a lower household income yet a slightly higher level of education.<\/p>\n<p><strong>Table 6. Demographic Characteristics of Probability and Nonprobability Samples for Respondents with Household Incarceration History<\/strong><strong>\u00a0<\/strong><\/p>\n<h1><a href=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/2019\/03\/Table_6.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-12134\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/2019\/03\/Table_6.png\" alt=\"\" width=\"722\" height=\"439\" srcset=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/2019\/03\/Table_6.png 722w, https:\/\/surveyinsights.org\/wp-content\/uploads\/2019\/03\/Table_6-300x182.png 300w\" sizes=\"auto, (max-width: 722px) 100vw, 722px\" \/><\/a><\/h1>\n<h1>Does the Sampling Method Affect the Values of Key Variables?<\/h1>\n<p>We used convenience and social media samples to increase the number of respondents from low-incidence groups, so the primary methodological question is whether the two types of samples (probability and nonprobability) can be combined for analytical purposes. Because the sample sizes are small, we cannot determine the answer with certainty, but the richness of the data enables us to compare household assets and debts (the primary focus of data collection) across the two types of samples.<\/p>\n<p>We used a negative binomial regression model because it suits the nature of the data, particularly the fact that 0 is a valid response to the financial questions that were the focus of the two studies. 
We also estimated Poisson models, but because the data were overdispersed, the negative binomial models were more appropriate (Hilbe, 2011). We ran two models (one with assets as the dependent variable and one with debts as the dependent variable) for the data from the Los Angeles Wealth Inequality study and for the data from the Baltimore Incarceration study. In each model, we included the type of sample (probability or nonprobability) as an independent variable, along with other variables known to be associated with household assets and debts, namely level of education, race, and household income.<\/p>\n<p>Results are provided in Table 7. In sum, the type of sample (probability or nonprobability) is not a statistically significant predictor of the dollar amount of household assets or debts. The same results are obtained when computing likelihood ratio statistics for Type 3 analysis, which examines the effect of a variable after all other factors in the model have been accounted for (Table 8).<\/p>\n<p><strong>Table 7. Regression Results to Determine the Effect of Type of Sample on Key Variables of Household Assets and Debts<\/strong><\/p>\n<p><a href=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/2019\/03\/Table_7.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-12135\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/2019\/03\/Table_7.png\" alt=\"\" width=\"626\" height=\"912\" srcset=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/2019\/03\/Table_7.png 626w, https:\/\/surveyinsights.org\/wp-content\/uploads\/2019\/03\/Table_7-206x300.png 206w\" sizes=\"auto, (max-width: 626px) 100vw, 626px\" \/><\/a><\/p>\n<p><strong>Table 8. 
Likelihood Ratio Statistics for Type 3 Analysis<\/strong><\/p>\n<p><a href=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/2019\/03\/Table_8.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-12136\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/2019\/03\/Table_8.png\" alt=\"\" width=\"737\" height=\"201\" srcset=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/2019\/03\/Table_8.png 737w, https:\/\/surveyinsights.org\/wp-content\/uploads\/2019\/03\/Table_8-300x82.png 300w\" sizes=\"auto, (max-width: 737px) 100vw, 737px\" \/><\/a><\/p>\n<p>We ran a few other models to see if their specification affected the results (data are not presented in this article, but are available from the authors upon request):<\/p>\n<ul>\n<li>One model added age as an independent variable. It reduced the number of observations in the Baltimore Incarceration study by about one-fourth due to nonresponse. The type of sample remained nonsignificant. We did not include age in the regressions presented in Table 7 to avoid the reduction in sample size.<\/li>\n<li>Because of the association between the type of sample and respondent race\/ethnicity, we ran regressions without the race\/ethnicity variable. In the model with assets as the dependent variable, the type of sample became significant. While this result warrants attention, it could be meaningful or it could be due merely to chance given the number of tests we ran.<\/li>\n<li>Another model added an interaction term for race\/ethnicity and type of sample because the nonprobability samples focused on specific racial and ethnic groups. The type of sample remained nonsignificant in these models.<\/li>\n<li>A third set of models substituted household income for assets and debts as the dependent variable. 
The type of sample remained nonsignificant in these models.<\/li>\n<\/ul>\n<p>The preponderance of evidence suggests internal validity when probability and nonprobability samples of low-incidence populations similar to those studied here are combined for analytical purposes.<\/p>\n<p>We have limited options for examining external validity because only limited information is available for the low-incidence populations in our two studies. After considering multiple datasets, we chose to use the U.S. Census Bureau\u2019s 2011-2015 five-year estimates from the American Community Survey (ACS) for Los Angeles County. Although the ACS does not collect detailed assets and debts information, ACS data do allow us to look at the racial\/ethnic groups of interest in our study. The metrics of interest are not absolute dollar comparisons because the groups are not equivalent. Instead, the focus is on how the <em>difference<\/em> from Census data compares between (1) the probability sample and (2) the combined probability and nonprobability samples. If differences are relatively small, it seems reasonable to combine the two types of samples for analysis.<\/p>\n<p>We wanted to perform a similar analysis for the Baltimore Incarceration study but were unable to locate any data on household income or similar metrics for households with and without a history of incarceration. Thus, we cannot check the external validity of the two types of samples in the Baltimore study.<\/p>\n<p>Results for Los Angeles are presented in Table 9. We compared the median income for the two groups in the convenience sample, Cambodians and Koreans. Differences are small: the probability sample versus the combined sample shows a 3.5 percentage point difference for Cambodians and 4.0 percentage points for Koreans. 
Because the differences are small for these categories, combining the probability and convenience samples for analysis seems reasonable.<\/p>\n<p><strong>Table 9. Median Income, Comparisons of Census Data Against Sample Data<\/strong><\/p>\n<p><a href=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/2019\/03\/Table_9.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-12138\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/2019\/03\/Table_9.png\" alt=\"\" width=\"700\" height=\"239\" srcset=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/2019\/03\/Table_9.png 700w, https:\/\/surveyinsights.org\/wp-content\/uploads\/2019\/03\/Table_9-300x102.png 300w\" sizes=\"auto, (max-width: 700px) 100vw, 700px\" \/><\/a><\/p>\n<p><strong>\u00a0<\/strong><em><sup>a<\/sup> 2011-2015 American Community Survey, five-year estimate, Table B19013.<\/em><\/p>\n<h1>Conclusion \/ Discussion<\/h1>\n<p>In this article, we have expanded the current discourse about nonprobability samples to include those obtained through convenience sampling and through social media recruitment. While Internet panels and respondent-driven methods remain a focus of attention in current survey research literature, consideration of other nonprobability samples is important, particularly when studying groups that are rare in the population. We suggest it may be appropriate to purposefully design studies that rely on a nonprobability sample when locating targeted sample members is so challenging that basic principles underlying probability sampling may be violated.<\/p>\n<p>For two distinct studies, we compared key measures for the two types of samples, focusing on respondent demographic characteristics and household financial status. Although our sample sizes are small, the analyses we conducted show that the type of sample was not a significant predictor for the two key variables of interest, namely household assets and debts. 
Comparing the Los Angeles samples against Census data seems to indicate external validity. Thus, we conclude that the probability and nonprobability samples could be combined within each study to increase the sample size for analytical purposes. We suggest that researchers working with probability and nonprobability samples for rare populations conduct similar analyses when determining whether combining cases from the two types of samples is appropriate.<\/p>\n<p><a href=\"#_ednref1\" name=\"_edn1\">[1]<\/a> In the rest of this document, \u201cblack\u201d refers to the African\/African American\/non-Hispanic category.<\/p>\n<p><a href=\"#_ednref2\" name=\"_edn2\">[2]<\/a> AAPOR Response Rate 3 estimates the proportion of eligible cases among those with unknown eligibility. We used the proportion of eligible cases from all cases of known eligibility for this estimate. AAPOR Response Rate 1, or the minimum response rate, does not include an estimate of the proportion of eligible cases among those with unknown eligibility.<\/p>\n<p><a href=\"#_ednref3\" name=\"_edn3\">[3]<\/a> To determine the percent ever incarcerated, we began with national estimates of the proportion of the US population that had been formerly incarcerated, by race, using US Bureau of Justice Statistics data (Bonczar, 2003). We then applied those proportions to the population of Baltimore City to estimate the number of residents who had ever been incarcerated. Next, we added counts of individuals currently in a Maryland state prison to estimate the number of households with someone currently in state prison. 
While these estimates are imperfect, they served our goal of getting a general sense of the size of the population of interest to inform planning for data collection.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Introduction While the strong consensus is to use probability samples as standard practice in survey research, interest in nonprobability samples has been around for decades (Brick, 2014). Both interest and use have grown in recent years as Internet-based sampling, especially using panels, has emerged as a quick, inexpensive method for collecting data, particularly to inform [&hellip;]<\/p>\n","protected":false},"author":972,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[444],"tags":[212,486,485,460,406,487],"class_list":["post-11959","post","type-post","status-publish","format-standard","hentry","category-probability-and-nonprobability-sampling","tag-convenience-sampling","tag-empirical-comparisons","tag-nonprobability-samples","tag-probability-samples","tag-rare-populations","tag-social-media"],"acf":[],"_links":{"self":[{"href":"https:\/\/surveyinsights.org\/index.php?rest_route=\/wp\/v2\/posts\/11959","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/surveyinsights.org\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/surveyinsights.org\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/surveyinsights.org\/index.php?rest_route=\/wp\/v2\/users\/972"}],"replies":[{"embeddable":true,"href":"https:\/\/surveyinsights.org\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=11959"}],"version-history":[{"count":26,"href":"https:\/\/surveyinsights.org\/index.php?rest_route=\/wp\/v2\/posts\/11959\/revisions"}],"predecessor-version":[{"id":18838,"href":"https:\/\/surveyinsights.org\/index.php?rest_route=\/wp\/v2\/posts\/11959\/revisions\/18838"}],"wp:attac
hment":[{"href":"https:\/\/surveyinsights.org\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=11959"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/surveyinsights.org\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=11959"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/surveyinsights.org\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=11959"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}