{"id":11203,"date":"2019-04-02T13:28:32","date_gmt":"2019-04-02T12:28:32","guid":{"rendered":"https:\/\/surveyinsights.org\/?p=11203"},"modified":"2023-07-13T08:58:48","modified_gmt":"2023-07-13T07:58:48","slug":"inferences-based-on-probability-sampling-or-nonprobability-sampling-are-they-nothing-but-a-question-of-models","status":"publish","type":"post","link":"https:\/\/surveyinsights.org\/?p=11203","title":{"rendered":"Inferences based on Probability Sampling or Nonprobability Sampling \u2013 Are They Nothing but a Question of Models?"},"content":{"rendered":"<h1><strong> Introduction<\/strong><\/h1>\n<p>There is a constantly increasing demand for objective information about some characteristics of finite populations of interest based on data. Regarding the sources of such data, in this paper, we will distinguish between probability samples and nonprobability samples.<\/p>\n<p>The <em>probability sampling techniques<\/em> can be described under a unique theoretical framework because they all share the fact that they assign a known, nonzero sample selection probability to each unit of the target population (cf., for instance, the textbooks by S\u00e4rndal et al. 1992, or Lohr 2010). Examples of such sampling schemes include simple, stratified, cluster, multistage, or probability proportional to size random sampling. The essential aspect of these procedures is that the known selection probabilities of the sample members allow a design-unbiased point estimation of population characteristics, such as totals, means, or proportions. However, for example, non-negligible nonresponse rates may require the formulation of models regarding the mechanism that generates such a behavior.<\/p>\n<p>In contrast to the probability sampling schemes, the different <em>nonprobability sampling techniques<\/em> have only little more in common than the lack of knowledge of the associated selection probabilities. 
Therefore, unlike the probability sampling schemes, these methods also require model assumptions about the selection process itself before inferential statistics can be conducted. Examples of such techniques are the purposive methods of quota or expert choice sampling; the link-tracing designs, such as snowball or respondent-driven sampling, which are used particularly with hard-to-reach populations (cf., for instance, Tourangeau et al. 2014); and the arbitrary sampling methods, such as volunteer or river sampling.<\/p>\n<p>In the context of this paper, the term &#8220;big data&#8221; refers to large data sets that are process-generated rather than survey-generated, hence non-probabilistic, and that were not primarily collected with the intention of drawing conclusions about population characteristics. Examples of such process-generated data collections are those collected by mobile phone network providers, which are used to estimate temporal variations in population density, social media data used to estimate flows in the labor market, or crime-related data, which are analyzed in the field of crime prediction.<\/p>\n<p>For the purpose of setting the standard regarding the quality of sample results, the term &#8220;representativeness&#8221;, which has been used with many different meanings (cf. Kruskal and Mosteller 1980), is defined as the indicator of the inference quality of survey outcomes (cf. Quatember 2019):<\/p>\n<p><em>A sample is called &#8220;representative&#8221; with respect to a certain population characteristic (such as a whole distribution of a study variable or a parameter of this distribution) if this characteristic can be (at least approximately) unbiasedly estimated from the available data with a predefined accuracy.<\/em><\/p>\n<p>In this definition, the goal of representativeness of a sample is described by the statistical similarity concept of the unbiasedness of estimators (cf. S\u00e4rndal et al. 
1992, 40) and by a requirement regarding the efficiency of the estimates. Hence, it implicitly includes the consideration of the total survey error, a concept that addresses both the sampling error, which describes the sample-to-sample variation of estimators, and the systematic (or nonsampling) error that can also occur in population surveys (cf. Weisberg 2005). In the context of statistical surveys, the term &#8220;error&#8221; refers to the difference between an estimate of a population characteristic and its true value. Several types of errors, in particular the frame error, the nonresponse error, the measurement error, and the processing error, contribute to the nonsampling error.<\/p>\n<p>In the subsequent section, the implicit assumptions that are made when a specific estimator of a population characteristic is applied to data from any sampling method are discussed, using the Horvitz-Thompson estimator of a population total under simple random sampling without replacement as the example. In particular, the remarkable effect of a deviation from the assumption concerning the selection method is presented. Furthermore, the statistical repair methods that may reduce the increase of the total survey error caused by deviations from these implicitly made assumptions are considered. Complementing the definition of representative samples given above, a definition of informative samples with regard to the declared survey purpose, which may prove useful in practice, is presented in Section 3. 
In the concluding section, the question asked in the title of the paper is discussed again and answered.<\/p>\n<p><strong>\u00a0<\/strong><\/p>\n<h1><strong> Implicit assumptions and explicit models<\/strong><\/h1>\n<p>Let the task be the estimation of a certain population characteristic from the finite target population <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-310ff55814df4f7f49f3b8e8fb604d00_l3.png\" class=\"ql-img-inline-formula \" alt=\"&#85;\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"13\" style=\"vertical-align: 0px;\"\/> of size <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-1a732b251e4708d7f9f9ab565fd49ceb_l3.png\" class=\"ql-img-inline-formula \" alt=\"&#78;\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"16\" style=\"vertical-align: 0px;\"\/>. Throughout the paper, as it is quite common in textbooks in the field of sampling theory (cf., for instance, S\u00e4rndal et al. 
1992), let the population total<\/p>\n<p><a name=\"id1892987736\"><\/a><\/p>\n<p class=\"ql-center-displayed-equation\" style=\"line-height: 27px;\"><span class=\"ql-right-eqno\"> (1) <\/span><span class=\"ql-left-eqno\"> &nbsp; <\/span><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-a0cef17bc0a27f9fbfd7350a368c5ec2_l3.png\" height=\"27\" width=\"86\" class=\"ql-img-displayed-equation \" alt=\"&#92;&#98;&#101;&#103;&#105;&#110;&#123;&#101;&#113;&#117;&#97;&#116;&#105;&#111;&#110;&#42;&#125; &#116;&#61;&#92;&#115;&#117;&#109;&#92;&#110;&#111;&#108;&#105;&#109;&#105;&#116;&#115;&#95;&#123;&#85;&#125;&#32;&#121;&#95;&#123;&#107;&#125; &#92;&#101;&#110;&#100;&#123;&#101;&#113;&#117;&#97;&#116;&#105;&#111;&#110;&#42;&#125;\" title=\"Rendered by QuickLaTeX.com\"\/><\/p>\n<p>(<img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-67c6a4178fa479bf7142f8aaf73863b9_l3.png\" class=\"ql-img-inline-formula \" alt=\"&#92;&#115;&#117;&#109;&#92;&#110;&#111;&#108;&#105;&#109;&#105;&#116;&#115;&#95;&#123;&#85;&#125;\" title=\"Rendered by QuickLaTeX.com\" height=\"18\" width=\"30\" style=\"vertical-align: -5px;\"\/> denotes the sum over all units of <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-310ff55814df4f7f49f3b8e8fb604d00_l3.png\" class=\"ql-img-inline-formula \" alt=\"&#85;\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"13\" style=\"vertical-align: 0px;\"\/>) of a variable <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-f25756709afaa64c21973962eb2ab191_l3.png\" class=\"ql-img-inline-formula \" alt=\"&#121;\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"9\" style=\"vertical-align: -4px;\"\/> under study serve as the example of a population characteristic of 
interest in <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-310ff55814df4f7f49f3b8e8fb604d00_l3.png\" class=\"ql-img-inline-formula \" alt=\"&#85;\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"13\" style=\"vertical-align: 0px;\"\/>. Therein,\u00a0<img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-a6c97ca60be4d556cdc0da62a7e4598e_l3.png\" class=\"ql-img-inline-formula \" alt=\"&#121;&#95;&#123;&#107;&#125;\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"16\" style=\"vertical-align: -4px;\"\/> denotes the fixed <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-f25756709afaa64c21973962eb2ab191_l3.png\" class=\"ql-img-inline-formula \" alt=\"&#121;\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"9\" style=\"vertical-align: -4px;\"\/>-value of population unit <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-eef62da36a17b47825f4728b455d3922_l3.png\" class=\"ql-img-inline-formula \" alt=\"&#107;\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"9\" style=\"vertical-align: 0px;\"\/>.<\/p>\n<p>Under the laboratory conditions of the urn model from probability theory, in a probability sample <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-10e819663dea22b9885975167b455297_l3.png\" class=\"ql-img-inline-formula \" alt=\"&#112;\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"10\" style=\"vertical-align: -4px;\"\/> of size <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-7b7417db9747afd47c7d2b674cce1895_l3.png\" class=\"ql-img-inline-formula \" alt=\"&#110;\" title=\"Rendered by 
QuickLaTeX.com\" height=\"8\" width=\"11\" style=\"vertical-align: 0px;\"\/> (<img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-e5ec730a37b77a8b4356a155c0582bcd_l3.png\" class=\"ql-img-inline-formula \" alt=\"&#112;&#92;&#115;&#117;&#98;&#115;&#101;&#116;&#101;&#113;&#32;&#85;\" title=\"Rendered by QuickLaTeX.com\" height=\"16\" width=\"48\" style=\"vertical-align: -4px;\"\/>, <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-a9159ddc1ed89fa1346fef4b981592a9_l3.png\" class=\"ql-img-inline-formula \" alt=\"&#110;&#92;&#108;&#101;&#113;&#32;&#78;\" title=\"Rendered by QuickLaTeX.com\" height=\"15\" width=\"50\" style=\"vertical-align: -3px;\"\/>), drawn according to some probability sampling scheme\u00a0<img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-457e30339f9877ba4e8c0bf99860d73d_l3.png\" class=\"ql-img-inline-formula \" alt=\"&#92;&#109;&#97;&#116;&#104;&#99;&#97;&#108;&#123;&#80;&#125;\" title=\"Rendered by QuickLaTeX.com\" height=\"13\" width=\"13\" style=\"vertical-align: -1px;\"\/> with known sample selection probabilities, the Horvitz-Thompson estimator<\/p>\n<p><a name=\"id3848132416\"><\/a><\/p>\n<p class=\"ql-center-displayed-equation\" style=\"line-height: 30px;\"><span class=\"ql-right-eqno\"> (2) <\/span><span class=\"ql-left-eqno\"> &nbsp; <\/span><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-cabff39038db511115feb1b81ced33d3_l3.png\" height=\"30\" width=\"145\" class=\"ql-img-displayed-equation \" alt=\"&#92;&#98;&#101;&#103;&#105;&#110;&#123;&#101;&#113;&#117;&#97;&#116;&#105;&#111;&#110;&#42;&#125; 
&#116;&#95;&#123;&#92;&#109;&#97;&#116;&#104;&#99;&#97;&#108;&#123;&#80;&#125;&#125;&#61;&#92;&#115;&#117;&#109;&#92;&#110;&#111;&#108;&#105;&#109;&#105;&#116;&#115;&#95;&#123;&#112;&#92;&#115;&#117;&#98;&#115;&#101;&#116;&#101;&#113;&#32;&#85;&#125;&#32;&#121;&#95;&#123;&#107;&#125;&#92;&#99;&#100;&#111;&#116;&#32;&#100;&#95;&#123;&#107;&#125; &#92;&#101;&#110;&#100;&#123;&#101;&#113;&#117;&#97;&#116;&#105;&#111;&#110;&#42;&#125;\" title=\"Rendered by QuickLaTeX.com\"\/><\/p>\n<p>is design-unbiased for <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-741e9770700e79d3c4f7c6f929f31eb6_l3.png\" class=\"ql-img-inline-formula \" alt=\"&#116;\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"6\" style=\"vertical-align: 0px;\"\/> with variance\u00a0<img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-4e5c4184c2693b9e49fae4cd52873640_l3.png\" class=\"ql-img-inline-formula \" alt=\"&#86;&#95;&#123;&#92;&#109;&#97;&#116;&#104;&#99;&#97;&#108;&#123;&#80;&#125;&#125;&#40;&#116;&#95;&#123;&#92;&#109;&#97;&#116;&#104;&#99;&#97;&#108;&#123;&#80;&#125;&#125;&#41;\" title=\"Rendered by QuickLaTeX.com\" height=\"19\" width=\"53\" style=\"vertical-align: -5px;\"\/> (cf., for instance, S\u00e4rndal et al. 1992, 42-48). 
In (2), the\u00a0<img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-d22443f36678ba2a23b22597dd11b6d5_l3.png\" class=\"ql-img-inline-formula \" alt=\"&#100;&#95;&#123;&#107;&#125;\" title=\"Rendered by QuickLaTeX.com\" height=\"15\" width=\"16\" style=\"vertical-align: -3px;\"\/> denotes the design weights of the sample elements that are defined as the reciprocals of the sample selection probabilities.<\/p>\n<p>Therefore, for simple random sampling without replacement (SI) with the probabilities <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-49ad7e0823e3f17cad723e8f6e9eb212_l3.png\" class=\"ql-img-inline-formula \" alt=\"&#110;&#47;&#78;\" title=\"Rendered by QuickLaTeX.com\" height=\"19\" width=\"35\" style=\"vertical-align: -5px;\"\/> of being selected for the SI sample <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-58e714fd617a8059f0ad6fa2f611df26_l3.png\" class=\"ql-img-inline-formula \" alt=\"&#112;&#95;&#123;&#83;&#73;&#125;\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"26\" style=\"vertical-align: -4px;\"\/>, the Horvitz-Thompson estimator (2) is given by<\/p>\n<p><a name=\"id1537483857\"><\/a><\/p>\n<p class=\"ql-center-displayed-equation\" style=\"line-height: 38px;\"><span class=\"ql-right-eqno\"> (3) <\/span><span class=\"ql-left-eqno\"> &nbsp; <\/span><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-d91a81e05d49f16331e35e223cf4c6c9_l3.png\" height=\"38\" width=\"173\" class=\"ql-img-displayed-equation \" alt=\"&#92;&#98;&#101;&#103;&#105;&#110;&#123;&#101;&#113;&#117;&#97;&#116;&#105;&#111;&#110;&#42;&#125; 
&#116;&#95;&#123;&#83;&#73;&#125;&#61;&#92;&#102;&#114;&#97;&#99;&#123;&#78;&#125;&#123;&#110;&#125;&#92;&#99;&#100;&#111;&#116;&#32;&#92;&#115;&#117;&#109;&#92;&#110;&#111;&#108;&#105;&#109;&#105;&#116;&#115;&#95;&#123;&#112;&#95;&#123;&#83;&#73;&#125;&#92;&#115;&#117;&#98;&#115;&#101;&#116;&#101;&#113;&#32;&#85;&#125;&#32;&#121;&#95;&#123;&#107;&#125;&#44; &#92;&#101;&#110;&#100;&#123;&#101;&#113;&#117;&#97;&#116;&#105;&#111;&#110;&#42;&#125;\" title=\"Rendered by QuickLaTeX.com\"\/><\/p>\n<p>which results in <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-1a732b251e4708d7f9f9ab565fd49ceb_l3.png\" class=\"ql-img-inline-formula \" alt=\"&#78;\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"16\" style=\"vertical-align: 0px;\"\/> times the sample mean of <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-f25756709afaa64c21973962eb2ab191_l3.png\" class=\"ql-img-inline-formula \" alt=\"&#121;\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"9\" style=\"vertical-align: -4px;\"\/>, with variance <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-fdda9c0a32583b8e62ea20f941c1b989_l3.png\" class=\"ql-img-inline-formula \" alt=\"&#86;&#95;&#123;&#83;&#73;&#125;&#40;&#116;&#95;&#123;&#83;&#73;&#125;&#41;\" title=\"Rendered by QuickLaTeX.com\" height=\"19\" width=\"63\" style=\"vertical-align: -5px;\"\/>.<\/p>\n<p>However, what about using, for instance, this estimator under real conditions? 
Another question is, what about using expression (3) for nonprobability samples, for which an estimator also has to be calculated although the selection probabilities are unknowable?<\/p>\n<p>Formally, the application of (3) to an available data set <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-8219fa5b592e50fdabe8f10744386940_l3.png\" class=\"ql-img-inline-formula \" alt=\"&#115;\" title=\"Rendered by QuickLaTeX.com\" height=\"8\" width=\"8\" style=\"vertical-align: 0px;\"\/>, be it a probability or a nonprobability sample drawn by a sampling method <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-4bf5ba7bb77556ff3d3dcd76c2c62546_l3.png\" class=\"ql-img-inline-formula \" alt=\"&#92;&#109;&#97;&#116;&#104;&#99;&#97;&#108;&#123;&#83;&#125;\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"12\" style=\"vertical-align: 0px;\"\/>, results in<\/p>\n<p><a name=\"id3962280880\"><\/a><\/p>\n<p class=\"ql-center-displayed-equation\" style=\"line-height: 37px;\"><span class=\"ql-right-eqno\"> (4) <\/span><span class=\"ql-left-eqno\"> &nbsp; <\/span><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-1c5343489cfcc2667c8bc580164fc0ee_l3.png\" height=\"37\" width=\"151\" class=\"ql-img-displayed-equation \" alt=\"&#92;&#98;&#101;&#103;&#105;&#110;&#123;&#101;&#113;&#117;&#97;&#116;&#105;&#111;&#110;&#42;&#125; &#116;&#95;&#123;&#92;&#109;&#97;&#116;&#104;&#99;&#97;&#108;&#123;&#83;&#125;&#125;&#61;&#92;&#102;&#114;&#97;&#99;&#123;&#78;&#125;&#123;&#110;&#125;&#92;&#99;&#100;&#111;&#116;&#32;&#92;&#115;&#117;&#109;&#92;&#110;&#111;&#108;&#105;&#109;&#105;&#116;&#115;&#95;&#123;&#115;&#92;&#115;&#117;&#98;&#115;&#101;&#116;&#101;&#113;&#32;&#85;&#125;&#32;&#121;&#95;&#123;&#107;&#125;&#44; 
&#92;&#101;&#110;&#100;&#123;&#101;&#113;&#117;&#97;&#116;&#105;&#111;&#110;&#42;&#125;\" title=\"Rendered by QuickLaTeX.com\"\/><\/p>\n<p>which only for an SI sample (with <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-5d52104007897a3f62771ec8af340e01_l3.png\" class=\"ql-img-inline-formula \" alt=\"&#115;&#32;&#61;&#32;&#112;&#95;&#123;&#83;&#73;&#125;\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"57\" style=\"vertical-align: -4px;\"\/>) provides the estimator (3) with its known statistical properties. The usage of this estimator is based on several assumptions that are discussed in the following together with the models that have to be applied in the case of deviations from these assumptions:<\/p>\n<p>&nbsp;<\/p>\n<p><strong>The operationalization assumption: <\/strong>The first implicit assumption when an estimator such as (4) is applied to an available data set <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-8219fa5b592e50fdabe8f10744386940_l3.png\" class=\"ql-img-inline-formula \" alt=\"&#115;\" title=\"Rendered by QuickLaTeX.com\" height=\"8\" width=\"8\" style=\"vertical-align: 0px;\"\/> collected by a probability or a nonprobability sampling method <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-4bf5ba7bb77556ff3d3dcd76c2c62546_l3.png\" class=\"ql-img-inline-formula \" alt=\"&#92;&#109;&#97;&#116;&#104;&#99;&#97;&#108;&#123;&#83;&#125;\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"12\" style=\"vertical-align: 0px;\"\/>, is that variable <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-f25756709afaa64c21973962eb2ab191_l3.png\" class=\"ql-img-inline-formula \" alt=\"&#121;\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" 
width=\"9\" style=\"vertical-align: -4px;\"\/> actually measures what is intended to be measured. In other words, it is assumed that the research questions are correctly operationalized. In the big data context of nonprobability sampling, this assumption plays a special role because there the research topics usually have to be adapted to the available data sets, and not the other way around, as is usual in empirical research. An example is the Google project on flu trends, in which records of search entries were analyzed to find those flu-related terms that can be used for the estimation of flu prevalence. However, after an initial success, together with a media-stoked increase in relevant searches, Google\u2019s constantly tested and improved auto-suggest feature and other changes in the search algorithms led to a persistent overestimation of the flu prevalence because these search terms lost their predictive value (cf., for instance, Lazer et al. 2014, 1203).<\/p>\n<p><strong>The frame assumption:<\/strong> The next implicit assumption, when the estimator (4) is applied in the practice of survey sampling, is that the available sampling frame <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-717cd9b7650f3df012d390f202463f2e_l3.png\" class=\"ql-img-inline-formula \" alt=\"&#85;&#95;&#123;&#70;&#125;\" title=\"Rendered by QuickLaTeX.com\" height=\"15\" width=\"22\" style=\"vertical-align: -3px;\"\/>, from which the members of the sample <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-8219fa5b592e50fdabe8f10744386940_l3.png\" class=\"ql-img-inline-formula \" alt=\"&#115;\" title=\"Rendered by QuickLaTeX.com\" height=\"8\" width=\"8\" style=\"vertical-align: 0px;\"\/> are actually recruited by the sampling method <img loading=\"lazy\" decoding=\"async\" 
src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-4bf5ba7bb77556ff3d3dcd76c2c62546_l3.png\" class=\"ql-img-inline-formula \" alt=\"&#92;&#109;&#97;&#116;&#104;&#99;&#97;&#108;&#123;&#83;&#125;\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"12\" style=\"vertical-align: 0px;\"\/>, corresponds to the real study population <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-310ff55814df4f7f49f3b8e8fb604d00_l3.png\" class=\"ql-img-inline-formula \" alt=\"&#85;\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"13\" style=\"vertical-align: 0px;\"\/>, or that\u00a0<img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-717cd9b7650f3df012d390f202463f2e_l3.png\" class=\"ql-img-inline-formula \" alt=\"&#85;&#95;&#123;&#70;&#125;\" title=\"Rendered by QuickLaTeX.com\" height=\"15\" width=\"22\" style=\"vertical-align: -3px;\"\/> and <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-310ff55814df4f7f49f3b8e8fb604d00_l3.png\" class=\"ql-img-inline-formula \" alt=\"&#85;\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"13\" style=\"vertical-align: 0px;\"\/> only differ negligibly with respect to the interesting characteristic. 
In other words, it is either assumed that there is no frame error or that there is an ignorable coverage bias, which is defined as the difference of the expected value of the estimator <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-71d4ded9ffc5cac571d2d88cf99350a7_l3.png\" class=\"ql-img-inline-formula \" alt=\"&#116;&#95;&#123;&#92;&#109;&#97;&#116;&#104;&#99;&#97;&#108;&#123;&#83;&#125;&#125;\" title=\"Rendered by QuickLaTeX.com\" height=\"15\" width=\"15\" style=\"vertical-align: -3px;\"\/> of the total of <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-f25756709afaa64c21973962eb2ab191_l3.png\" class=\"ql-img-inline-formula \" alt=\"&#121;\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"9\" style=\"vertical-align: -4px;\"\/> in\u00a0<img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-717cd9b7650f3df012d390f202463f2e_l3.png\" class=\"ql-img-inline-formula \" alt=\"&#85;&#95;&#123;&#70;&#125;\" title=\"Rendered by QuickLaTeX.com\" height=\"15\" width=\"22\" style=\"vertical-align: -3px;\"\/> and the total <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-741e9770700e79d3c4f7c6f929f31eb6_l3.png\" class=\"ql-img-inline-formula \" alt=\"&#116;\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"6\" style=\"vertical-align: 0px;\"\/> in <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-310ff55814df4f7f49f3b8e8fb604d00_l3.png\" class=\"ql-img-inline-formula \" alt=\"&#85;\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"13\" style=\"vertical-align: 0px;\"\/>, so that <img loading=\"lazy\" decoding=\"async\" 
src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-8219fa5b592e50fdabe8f10744386940_l3.png\" class=\"ql-img-inline-formula \" alt=\"&#115;\" title=\"Rendered by QuickLaTeX.com\" height=\"8\" width=\"8\" style=\"vertical-align: 0px;\"\/> is representative with respect to the population total, when no other nonsampling errors apply. For nonprobability sampling schemes, the avoidance of a non-ignorable coverage bias is <em>the<\/em> big challenge because the frame population of potential sample members almost always excludes very large parts of the target population from the possible sample membership.<\/p>\n<p>With covariates available in\u00a0<img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-717cd9b7650f3df012d390f202463f2e_l3.png\" class=\"ql-img-inline-formula \" alt=\"&#85;&#95;&#123;&#70;&#125;\" title=\"Rendered by QuickLaTeX.com\" height=\"15\" width=\"22\" style=\"vertical-align: -3px;\"\/> and <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-310ff55814df4f7f49f3b8e8fb604d00_l3.png\" class=\"ql-img-inline-formula \" alt=\"&#85;\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"13\" style=\"vertical-align: 0px;\"\/>, this assumption can be tested. After that, an expected non-ignorable coverage bias can be reduced by an explicitly formulated model concerning the distributions of the interesting variable <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-f25756709afaa64c21973962eb2ab191_l3.png\" class=\"ql-img-inline-formula \" alt=\"&#121;\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"9\" style=\"vertical-align: -4px;\"\/> and these auxiliary variables, for instance, in a ratio estimation approach (cf., for instance, S\u00e4rndal et al. 
1992, 540-546).<\/p>\n<p><strong>The sample selection assumption:<\/strong> A third assumption that is implicitly made when the estimator<\/p>\n<p class=\"ql-center-displayed-equation\" style=\"line-height: 37px;\"><span class=\"ql-right-eqno\"> &nbsp; <\/span><span class=\"ql-left-eqno\"> &nbsp; <\/span><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-1061532b3578b5cf7c89f849dd91ea32_l3.png\" height=\"37\" width=\"146\" class=\"ql-img-displayed-equation \" alt=\"&#92;&#98;&#101;&#103;&#105;&#110;&#123;&#101;&#113;&#117;&#97;&#116;&#105;&#111;&#110;&#42;&#125; &#116;&#95;&#123;&#92;&#109;&#97;&#116;&#104;&#99;&#97;&#108;&#123;&#83;&#125;&#125;&#61;&#92;&#102;&#114;&#97;&#99;&#123;&#78;&#125;&#123;&#110;&#125;&#92;&#99;&#100;&#111;&#116;&#32;&#92;&#115;&#117;&#109;&#92;&#110;&#111;&#108;&#105;&#109;&#105;&#116;&#115;&#95;&#123;&#115;&#92;&#115;&#117;&#98;&#115;&#101;&#116;&#101;&#113;&#32;&#85;&#125;&#32;&#121;&#95;&#123;&#107;&#125; &#92;&#101;&#110;&#100;&#123;&#101;&#113;&#117;&#97;&#116;&#105;&#111;&#110;&#42;&#125;\" title=\"Rendered by QuickLaTeX.com\"\/><\/p>\n<p>is applied to an available sample <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-8219fa5b592e50fdabe8f10744386940_l3.png\" class=\"ql-img-inline-formula \" alt=\"&#115;\" title=\"Rendered by QuickLaTeX.com\" height=\"8\" width=\"8\" style=\"vertical-align: 0px;\"\/>, is that the used sampling technique <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-4bf5ba7bb77556ff3d3dcd76c2c62546_l3.png\" class=\"ql-img-inline-formula \" alt=\"&#92;&#109;&#97;&#116;&#104;&#99;&#97;&#108;&#123;&#83;&#125;\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"12\" style=\"vertical-align: 0px;\"\/> actually provides the SI selection probabilities that are used for the calculation of the 
design weights <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-0be8c612f6a831bec310ae2cc2d9372c_l3.png\" class=\"ql-img-inline-formula \" alt=\"&#78;&#47;&#110;\" title=\"Rendered by QuickLaTeX.com\" height=\"19\" width=\"35\" style=\"vertical-align: -5px;\"\/> in (4). In other words, it is assumed either that there is no selection error with regard to the presumed selection probabilities or that there is no selection bias resulting from that error.<\/p>\n<p>For an insight into the impact of such a bias, estimator (4) is rewritten as<\/p>\n<p class=\"ql-center-displayed-equation\" style=\"line-height: 37px;\"><span class=\"ql-right-eqno\"> &nbsp; <\/span><span class=\"ql-left-eqno\"> &nbsp; <\/span><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-e0b0a773c49d535cf7971dfa80463ca4_l3.png\" height=\"37\" width=\"489\" class=\"ql-img-displayed-equation \" alt=\"&#92;&#98;&#101;&#103;&#105;&#110;&#123;&#101;&#113;&#117;&#97;&#116;&#105;&#111;&#110;&#42;&#125; 
&#116;&#95;&#123;&#92;&#109;&#97;&#116;&#104;&#99;&#97;&#108;&#123;&#83;&#125;&#125;&#61;&#92;&#102;&#114;&#97;&#99;&#123;&#78;&#125;&#123;&#110;&#125;&#92;&#99;&#100;&#111;&#116;&#32;&#92;&#115;&#117;&#109;&#92;&#110;&#111;&#108;&#105;&#109;&#105;&#116;&#115;&#95;&#123;&#115;&#92;&#115;&#117;&#98;&#115;&#101;&#116;&#101;&#113;&#32;&#85;&#125;&#32;&#121;&#95;&#123;&#107;&#125;&#61;&#92;&#102;&#114;&#97;&#99;&#123;&#78;&#125;&#123;&#110;&#125;&#92;&#99;&#100;&#111;&#116;&#32;&#92;&#115;&#117;&#109;&#92;&#110;&#111;&#108;&#105;&#109;&#105;&#116;&#115;&#95;&#123;&#115;&#92;&#115;&#117;&#98;&#115;&#101;&#116;&#101;&#113;&#32;&#85;&#125;&#32;&#40;&#92;&#118;&#97;&#114;&#101;&#112;&#115;&#105;&#108;&#111;&#110;&#95;&#123;&#107;&#125;&#43;&#92;&#111;&#118;&#101;&#114;&#108;&#105;&#110;&#101;&#123;&#121;&#125;&#41;&#61;&#92;&#102;&#114;&#97;&#99;&#123;&#78;&#125;&#123;&#110;&#125;&#92;&#99;&#100;&#111;&#116;&#32;&#92;&#115;&#117;&#109;&#92;&#110;&#111;&#108;&#105;&#109;&#105;&#116;&#115;&#95;&#123;&#85;&#125;&#32;&#73;&#95;&#123;&#107;&#125;&#92;&#99;&#100;&#111;&#116;&#32;&#92;&#118;&#97;&#114;&#101;&#112;&#115;&#105;&#108;&#111;&#110;&#95;&#123;&#107;&#125;&#43;&#116; &#92;&#101;&#110;&#100;&#123;&#101;&#113;&#117;&#97;&#116;&#105;&#111;&#110;&#42;&#125;\" title=\"Rendered by QuickLaTeX.com\"\/><\/p>\n<p>with the sample membership indicator\u00a0<img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-97e3946042b5fefade52271608bff46f_l3.png\" class=\"ql-img-inline-formula \" alt=\"&#73;&#95;&#123;&#107;&#125;&#61;&#92;&#116;&#101;&#120;&#116;&#98;&#102;&#123;&#49;&#125;\" title=\"Rendered by QuickLaTeX.com\" height=\"15\" width=\"49\" style=\"vertical-align: -3px;\"\/> of population unit <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-eef62da36a17b47825f4728b455d3922_l3.png\" class=\"ql-img-inline-formula \" alt=\"&#107;\" 
title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"9\" style=\"vertical-align: 0px;\"\/> and the deviation\u00a0<img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-9fbd880a61bf9925d4f828816543ca7c_l3.png\" class=\"ql-img-inline-formula \" alt=\"&#92;&#118;&#97;&#114;&#101;&#112;&#115;&#105;&#108;&#111;&#110;&#95;&#123;&#107;&#125;&#61;&#121;&#95;&#123;&#107;&#125;&#45;&#92;&#111;&#118;&#101;&#114;&#108;&#105;&#110;&#101;&#123;&#121;&#125;\" title=\"Rendered by QuickLaTeX.com\" height=\"15\" width=\"89\" style=\"vertical-align: -4px;\"\/> of the unit&#8217;s <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-f25756709afaa64c21973962eb2ab191_l3.png\" class=\"ql-img-inline-formula \" alt=\"&#121;\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"9\" style=\"vertical-align: -4px;\"\/>-value from the population mean <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-573b75194b25fbbe6a7b6b79f30ba8da_l3.png\" class=\"ql-img-inline-formula \" alt=\"&#92;&#111;&#118;&#101;&#114;&#108;&#105;&#110;&#101;&#123;&#121;&#125;\" title=\"Rendered by QuickLaTeX.com\" height=\"15\" width=\"10\" style=\"vertical-align: -4px;\"\/> (cf. Ardilly and Till\u00e9 2006, 111-114, Meng 2018, 689-700). 
With <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-9aefc6bc4ff0bb9713a06256eecdc0c3_l3.png\" class=\"ql-img-inline-formula \" alt=\"&#92;&#115;&#117;&#109;&#92;&#110;&#111;&#108;&#105;&#109;&#105;&#116;&#115;&#95;&#123;&#85;&#125;&#32;&#92;&#118;&#97;&#114;&#101;&#112;&#115;&#105;&#108;&#111;&#110;&#95;&#123;&#107;&#125;&#61;&#48;\" title=\"Rendered by QuickLaTeX.com\" height=\"18\" width=\"82\" style=\"vertical-align: -5px;\"\/>, the population sum <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-b7567f41db1526ec5b42fb7bc611a099_l3.png\" class=\"ql-img-inline-formula \" alt=\"&#92;&#115;&#117;&#109;&#92;&#110;&#111;&#108;&#105;&#109;&#105;&#116;&#115;&#95;&#123;&#85;&#125;&#32;&#73;&#95;&#123;&#107;&#125;&#92;&#99;&#100;&#111;&#116;&#32;&#92;&#118;&#97;&#114;&#101;&#112;&#115;&#105;&#108;&#111;&#110;&#95;&#123;&#107;&#125;\" title=\"Rendered by QuickLaTeX.com\" height=\"18\" width=\"77\" style=\"vertical-align: -5px;\"\/> of the products of the sample membership indicators <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-13e75d744ebe8d94e0aada0b4600b6e8_l3.png\" class=\"ql-img-inline-formula \" alt=\"&#73;\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"9\" style=\"vertical-align: 0px;\"\/> (with population mean <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-b2e5aa5a3ed211905688d664f1b41735_l3.png\" class=\"ql-img-inline-formula \" alt=\"&#92;&#111;&#118;&#101;&#114;&#108;&#105;&#110;&#101;&#123;&#73;&#125;\" title=\"Rendered by QuickLaTeX.com\" height=\"15\" width=\"10\" style=\"vertical-align: 0px;\"\/>) and the <img loading=\"lazy\" decoding=\"async\" 
src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-17922929d07c285244a6f54d53decbf7_l3.png\" class=\"ql-img-inline-formula \" alt=\"&#92;&#118;&#97;&#114;&#101;&#112;&#115;&#105;&#108;&#111;&#110;\" title=\"Rendered by QuickLaTeX.com\" height=\"8\" width=\"8\" style=\"vertical-align: 0px;\"\/>-values can be written as<\/p>\n<p class=\"ql-center-displayed-equation\" style=\"line-height: 27px;\"><span class=\"ql-right-eqno\"> &nbsp; <\/span><span class=\"ql-left-eqno\"> &nbsp; <\/span><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-9f66a5b6e4d39c5d59fd3c4c172184ee_l3.png\" height=\"27\" width=\"403\" class=\"ql-img-displayed-equation \" alt=\"&#92;&#98;&#101;&#103;&#105;&#110;&#123;&#101;&#113;&#117;&#97;&#116;&#105;&#111;&#110;&#42;&#125; &#92;&#115;&#117;&#109;&#92;&#110;&#111;&#108;&#105;&#109;&#105;&#116;&#115;&#95;&#123;&#85;&#125;&#32;&#73;&#95;&#123;&#107;&#125;&#92;&#99;&#100;&#111;&#116;&#32;&#92;&#118;&#97;&#114;&#101;&#112;&#115;&#105;&#108;&#111;&#110;&#95;&#123;&#107;&#125;&#61;&#92;&#115;&#117;&#109;&#92;&#110;&#111;&#108;&#105;&#109;&#105;&#116;&#115;&#95;&#123;&#85;&#125;&#32;&#40;&#73;&#95;&#123;&#107;&#125;&#45;&#92;&#111;&#118;&#101;&#114;&#108;&#105;&#110;&#101;&#123;&#73;&#125;&#41;&#92;&#99;&#100;&#111;&#116;&#32;&#40;&#121;&#95;&#123;&#107;&#125;&#45;&#92;&#111;&#118;&#101;&#114;&#108;&#105;&#110;&#101;&#123;&#121;&#125;&#41;&#61;&#40;&#78;&#45;&#49;&#41;&#92;&#99;&#100;&#111;&#116;&#32;&#83;&#95;&#123;&#73;&#121;&#125; &#92;&#101;&#110;&#100;&#123;&#101;&#113;&#117;&#97;&#116;&#105;&#111;&#110;&#42;&#125;\" title=\"Rendered by QuickLaTeX.com\"\/><\/p>\n<p>with the &#8220;(<img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-0304db7116a7b12247b92e63e8cfafad_l3.png\" class=\"ql-img-inline-formula \" alt=\"&#78;&#45;&#49;\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" 
width=\"46\" style=\"vertical-align: 0px;\"\/>)-population covariance&#8221;<\/p>\n<p class=\"ql-center-displayed-equation\" style=\"line-height: 36px;\"><span class=\"ql-right-eqno\"> &nbsp; <\/span><span class=\"ql-left-eqno\"> &nbsp; <\/span><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-299f173641ca3310e62c158edb97b669_l3.png\" height=\"36\" width=\"285\" class=\"ql-img-displayed-equation \" alt=\"&#92;&#98;&#101;&#103;&#105;&#110;&#123;&#101;&#113;&#117;&#97;&#116;&#105;&#111;&#110;&#42;&#125; &#83;&#95;&#123;&#73;&#121;&#125;&#61;&#92;&#102;&#114;&#97;&#99;&#123;&#49;&#125;&#123;&#78;&#45;&#49;&#125;&#92;&#99;&#100;&#111;&#116;&#32;&#92;&#115;&#117;&#109;&#92;&#110;&#111;&#108;&#105;&#109;&#105;&#116;&#115;&#95;&#123;&#85;&#125;&#32;&#40;&#73;&#95;&#123;&#107;&#125;&#45;&#92;&#111;&#118;&#101;&#114;&#108;&#105;&#110;&#101;&#123;&#73;&#125;&#41;&#92;&#99;&#100;&#111;&#116;&#32;&#40;&#121;&#95;&#123;&#107;&#125;&#45;&#92;&#111;&#118;&#101;&#114;&#108;&#105;&#110;&#101;&#123;&#121;&#125;&#41; &#92;&#101;&#110;&#100;&#123;&#101;&#113;&#117;&#97;&#116;&#105;&#111;&#110;&#42;&#125;\" title=\"Rendered by QuickLaTeX.com\"\/><\/p>\n<p>of <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-13e75d744ebe8d94e0aada0b4600b6e8_l3.png\" class=\"ql-img-inline-formula \" alt=\"&#73;\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"9\" style=\"vertical-align: 0px;\"\/> and <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-f25756709afaa64c21973962eb2ab191_l3.png\" class=\"ql-img-inline-formula \" alt=\"&#121;\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"9\" style=\"vertical-align: -4px;\"\/> (cf., for instance,\u00a0S\u00e4rndal et al. 1992, 186). 
The population correlation <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-a04204509129d05078ac56882e05ebdb_l3.png\" class=\"ql-img-inline-formula \" alt=\"&#114;&#95;&#123;&#73;&#121;&#125;\" title=\"Rendered by QuickLaTeX.com\" height=\"14\" width=\"22\" style=\"vertical-align: -6px;\"\/> of these variables under sampling technique <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-4bf5ba7bb77556ff3d3dcd76c2c62546_l3.png\" class=\"ql-img-inline-formula \" alt=\"&#92;&#109;&#97;&#116;&#104;&#99;&#97;&#108;&#123;&#83;&#125;\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"12\" style=\"vertical-align: 0px;\"\/> is given by<\/p>\n<p class=\"ql-center-displayed-equation\" style=\"line-height: 42px;\"><span class=\"ql-right-eqno\"> &nbsp; <\/span><span class=\"ql-left-eqno\"> &nbsp; <\/span><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-e0abd97f3f96047522bec787f4397108_l3.png\" height=\"42\" width=\"100\" class=\"ql-img-displayed-equation \" alt=\"&#92;&#98;&#101;&#103;&#105;&#110;&#123;&#101;&#113;&#117;&#97;&#116;&#105;&#111;&#110;&#42;&#125; &#114;&#95;&#123;&#73;&#121;&#125;&#61;&#92;&#102;&#114;&#97;&#99;&#123;&#83;&#95;&#123;&#73;&#121;&#125;&#125;&#123;&#83;&#95;&#123;&#73;&#125;&#92;&#99;&#100;&#111;&#116;&#32;&#83;&#95;&#123;&#121;&#125;&#125; &#92;&#101;&#110;&#100;&#123;&#101;&#113;&#117;&#97;&#116;&#105;&#111;&#110;&#42;&#125;\" title=\"Rendered by QuickLaTeX.com\"\/><\/p>\n<p>with <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-dc04be17a4f38ba77a96dacc16cc6af5_l3.png\" class=\"ql-img-inline-formula \" 
alt=\"&#83;&#95;&#123;&#73;&#125;&#94;&#50;&#61;&#92;&#102;&#114;&#97;&#99;&#123;&#49;&#125;&#123;&#78;&#45;&#49;&#125;&#92;&#99;&#100;&#111;&#116;&#32;&#92;&#115;&#117;&#109;&#92;&#110;&#111;&#108;&#105;&#109;&#105;&#116;&#115;&#95;&#123;&#85;&#125;&#32;&#40;&#73;&#95;&#123;&#107;&#125;&#45;&#92;&#111;&#118;&#101;&#114;&#108;&#105;&#110;&#101;&#123;&#73;&#125;&#41;&#94;&#50;\" title=\"Rendered by QuickLaTeX.com\" height=\"22\" width=\"188\" style=\"vertical-align: -6px;\"\/> and <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-878e978ebf7e54d48982f7a7a20ff88f_l3.png\" class=\"ql-img-inline-formula \" alt=\"&#83;&#95;&#123;&#121;&#125;&#94;&#50;&#61;&#92;&#102;&#114;&#97;&#99;&#123;&#49;&#125;&#123;&#78;&#45;&#49;&#125;&#92;&#99;&#100;&#111;&#116;&#32;&#92;&#115;&#117;&#109;&#92;&#110;&#111;&#108;&#105;&#109;&#105;&#116;&#115;&#95;&#123;&#85;&#125;&#32;&#40;&#121;&#95;&#123;&#107;&#125;&#45;&#92;&#111;&#118;&#101;&#114;&#108;&#105;&#110;&#101;&#123;&#121;&#125;&#41;&#94;&#50;\" title=\"Rendered by QuickLaTeX.com\" height=\"23\" width=\"189\" style=\"vertical-align: -7px;\"\/>, the &#8220;(<img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-0304db7116a7b12247b92e63e8cfafad_l3.png\" class=\"ql-img-inline-formula \" alt=\"&#78;&#45;&#49;\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"46\" style=\"vertical-align: 0px;\"\/>)-population variances&#8221; of <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-13e75d744ebe8d94e0aada0b4600b6e8_l3.png\" class=\"ql-img-inline-formula \" alt=\"&#73;\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"9\" style=\"vertical-align: 0px;\"\/> and <img loading=\"lazy\" decoding=\"async\" 
src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-f25756709afaa64c21973962eb2ab191_l3.png\" class=\"ql-img-inline-formula \" alt=\"&#121;\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"9\" style=\"vertical-align: -4px;\"\/>, respectively. Moreover,\u00a0<img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-7e09339dd9abd85887fe5d4c76c79072_l3.png\" class=\"ql-img-inline-formula \" alt=\"&#92;&#102;&#114;&#97;&#99;&#123;&#78;&#45;&#49;&#125;&#123;&#78;&#125;&#92;&#99;&#100;&#111;&#116;&#32;&#83;&#95;&#123;&#73;&#125;&#94;&#50;&#61;&#92;&#102;&#114;&#97;&#99;&#123;&#110;&#125;&#123;&#78;&#125;&#92;&#99;&#100;&#111;&#116;&#32;&#40;&#49;&#45;&#92;&#102;&#114;&#97;&#99;&#123;&#110;&#125;&#123;&#78;&#125;&#41;\" title=\"Rendered by QuickLaTeX.com\" height=\"22\" width=\"176\" style=\"vertical-align: -6px;\"\/> \u00a0applies (cf., for instance,\u00a0S\u00e4rndal et al. 1992, 36). Hence, the actual estimation error <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-1cd4846df7d989975b927b50577018a9_l3.png\" class=\"ql-img-inline-formula \" alt=\"&#116;&#95;&#123;&#92;&#109;&#97;&#116;&#104;&#99;&#97;&#108;&#123;&#83;&#125;&#125;&#45;&#116;\" title=\"Rendered by QuickLaTeX.com\" height=\"15\" width=\"44\" style=\"vertical-align: -3px;\"\/> of the estimate <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-71d4ded9ffc5cac571d2d88cf99350a7_l3.png\" class=\"ql-img-inline-formula \" alt=\"&#116;&#95;&#123;&#92;&#109;&#97;&#116;&#104;&#99;&#97;&#108;&#123;&#83;&#125;&#125;\" title=\"Rendered by QuickLaTeX.com\" height=\"15\" width=\"15\" style=\"vertical-align: -3px;\"\/> is given by<\/p>\n<p class=\"ql-center-displayed-equation\" style=\"line-height: 103px;\"><span class=\"ql-right-eqno\"> &nbsp; <\/span><span 
class=\"ql-left-eqno\"> &nbsp; <\/span><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-a81fb4dded6614ed5104f7735c04cdfb_l3.png\" height=\"103\" width=\"598\" class=\"ql-img-displayed-equation \" alt=\"&#92;&#98;&#101;&#103;&#105;&#110;&#123;&#101;&#113;&#110;&#97;&#114;&#114;&#97;&#121;&#42;&#125; &#116;&#95;&#123;&#92;&#109;&#97;&#116;&#104;&#99;&#97;&#108;&#123;&#83;&#125;&#125;&#45;&#116;&#38;&#61;&#38;&#92;&#102;&#114;&#97;&#99;&#123;&#78;&#125;&#123;&#110;&#125;&#92;&#99;&#100;&#111;&#116;&#32;&#40;&#78;&#45;&#49;&#41;&#92;&#99;&#100;&#111;&#116;&#32;&#83;&#95;&#123;&#73;&#121;&#125;&#61;&#92;&#102;&#114;&#97;&#99;&#123;&#78;&#125;&#123;&#110;&#125;&#92;&#99;&#100;&#111;&#116;&#32;&#40;&#78;&#45;&#49;&#41;&#92;&#99;&#100;&#111;&#116;&#32;&#83;&#95;&#123;&#121;&#125;&#92;&#99;&#100;&#111;&#116;&#32;&#92;&#115;&#113;&#114;&#116;&#123;&#92;&#102;&#114;&#97;&#99;&#123;&#110;&#125;&#123;&#78;&#125;&#92;&#99;&#100;&#111;&#116;&#32;&#40;&#49;&#45;&#92;&#102;&#114;&#97;&#99;&#123;&#110;&#125;&#123;&#78;&#125;&#41;&#92;&#99;&#100;&#111;&#116;&#32;&#92;&#102;&#114;&#97;&#99;&#123;&#78;&#125;&#123;&#78;&#45;&#49;&#125;&#125;&#92;&#99;&#100;&#111;&#116;&#32;&#114;&#95;&#123;&#73;&#121;&#125;&#92;&#92; 
&#38;&#61;&#38;&#92;&#115;&#113;&#114;&#116;&#123;&#78;&#94;&#50;&#92;&#99;&#100;&#111;&#116;&#32;&#40;&#49;&#45;&#92;&#102;&#114;&#97;&#99;&#123;&#110;&#125;&#123;&#78;&#125;&#41;&#92;&#99;&#100;&#111;&#116;&#32;&#92;&#102;&#114;&#97;&#99;&#123;&#83;&#95;&#123;&#121;&#125;&#94;&#50;&#125;&#123;&#110;&#125;&#125;&#92;&#99;&#100;&#111;&#116;&#32;&#92;&#115;&#113;&#114;&#116;&#123;&#40;&#78;&#45;&#49;&#41;&#125;&#92;&#99;&#100;&#111;&#116;&#32;&#114;&#95;&#123;&#73;&#121;&#125;&#61;&#92;&#115;&#113;&#114;&#116;&#123;&#86;&#95;&#123;&#83;&#73;&#125;&#40;&#116;&#95;&#123;&#83;&#73;&#125;&#41;&#125;&#92;&#99;&#100;&#111;&#116;&#32;&#92;&#115;&#113;&#114;&#116;&#123;&#78;&#45;&#49;&#125;&#92;&#99;&#100;&#111;&#116;&#32;&#114;&#95;&#123;&#73;&#121;&#125; &#92;&#101;&#110;&#100;&#123;&#101;&#113;&#110;&#97;&#114;&#114;&#97;&#121;&#42;&#125;\" title=\"Rendered by QuickLaTeX.com\"\/><\/p>\n<p>with <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-fdda9c0a32583b8e62ea20f941c1b989_l3.png\" class=\"ql-img-inline-formula \" alt=\"&#86;&#95;&#123;&#83;&#73;&#125;&#40;&#116;&#95;&#123;&#83;&#73;&#125;&#41;\" title=\"Rendered by QuickLaTeX.com\" height=\"19\" width=\"63\" style=\"vertical-align: -5px;\"\/>, the variance of <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-c782880b5fe2ca78048ba019d75a61bd_l3.png\" class=\"ql-img-inline-formula \" alt=\"&#116;&#95;&#123;&#92;&#109;&#97;&#116;&#104;&#99;&#97;&#108;&#123;&#83;&#125;&#125;&#61;&#116;&#95;&#123;&#83;&#73;&#125;\" title=\"Rendered by QuickLaTeX.com\" height=\"15\" width=\"63\" style=\"vertical-align: -3px;\"\/> under SI sampling. 
Since the biased estimation shall be addressed, Meng (2018) defines the design effect <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-7890990f86cd755be978b2c649111ea3_l3.png\" class=\"ql-img-inline-formula \" alt=\"&#68;&#95;&#123;&#92;&#109;&#97;&#116;&#104;&#99;&#97;&#108;&#123;&#83;&#125;\" title=\"Rendered by QuickLaTeX.com\" height=\"15\" width=\"24\" style=\"vertical-align: -3px;\"\/> as the ratio of the mean square errors <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-a60bcffd50007d69730310d42637f59a_l3.png\" class=\"ql-img-inline-formula \" alt=\"&#77;&#83;&#69;&#95;&#123;&#92;&#109;&#97;&#116;&#104;&#99;&#97;&#108;&#123;&#83;&#125;&#125;&#40;&#116;&#95;&#123;&#92;&#109;&#97;&#116;&#104;&#99;&#97;&#108;&#123;&#83;&#125;&#125;&#41;\" title=\"Rendered by QuickLaTeX.com\" height=\"19\" width=\"84\" style=\"vertical-align: -5px;\"\/> of the applied estimator <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-b4935686e99df2078aaf6a73239f1455_l3.png\" class=\"ql-img-inline-formula \" alt=\"&#116;&#95;&#123;&#92;&#109;&#97;&#116;&#104;&#99;&#97;&#108;&#123;&#83;&#125;\" title=\"Rendered by QuickLaTeX.com\" height=\"15\" width=\"15\" style=\"vertical-align: -3px;\"\/> under the sampling method\u00a0<img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-4bf5ba7bb77556ff3d3dcd76c2c62546_l3.png\" class=\"ql-img-inline-formula \" alt=\"&#92;&#109;&#97;&#116;&#104;&#99;&#97;&#108;&#123;&#83;&#125;\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"12\" style=\"vertical-align: 0px;\"\/> that was actually used and <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-fdda9c0a32583b8e62ea20f941c1b989_l3.png\" 
class=\"ql-img-inline-formula \" alt=\"&#86;&#95;&#123;&#83;&#73;&#125;&#40;&#116;&#95;&#123;&#83;&#73;&#125;&#41;\" title=\"Rendered by QuickLaTeX.com\" height=\"19\" width=\"63\" style=\"vertical-align: -5px;\"\/> under SI sampling (cf., ibid., 696). This is derived from<\/p>\n<p><a name=\"id1168846043\"><\/a><\/p>\n<p class=\"ql-center-displayed-equation\" style=\"line-height: 78px;\"><span class=\"ql-right-eqno\"> (5) <\/span><span class=\"ql-left-eqno\"> &nbsp; <\/span><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-abf90d28d6b8fd7a5ad93ad23365110d_l3.png\" height=\"78\" width=\"458\" class=\"ql-img-displayed-equation \" alt=\"&#92;&#98;&#101;&#103;&#105;&#110;&#123;&#101;&#113;&#110;&#97;&#114;&#114;&#97;&#121;&#42;&#125; &#77;&#83;&#69;&#95;&#123;&#92;&#109;&#97;&#116;&#104;&#99;&#97;&#108;&#123;&#83;&#125;&#125;&#40;&#116;&#95;&#123;&#92;&#109;&#97;&#116;&#104;&#99;&#97;&#108;&#123;&#83;&#125;&#125;&#41;&#38;&#61;&#38;&#69;&#95;&#123;&#92;&#109;&#97;&#116;&#104;&#99;&#97;&#108;&#123;&#83;&#125;&#125;&#91;&#40;&#116;&#95;&#123;&#92;&#109;&#97;&#116;&#104;&#99;&#97;&#108;&#123;&#83;&#125;&#125;&#45;&#116;&#41;&#94;&#50;&#93;&#61;&#69;&#95;&#123;&#92;&#109;&#97;&#116;&#104;&#99;&#97;&#108;&#123;&#83;&#125;&#125;&#91;&#86;&#95;&#123;&#83;&#73;&#125;&#40;&#116;&#95;&#123;&#83;&#73;&#125;&#41;&#92;&#99;&#100;&#111;&#116;&#32;&#40;&#78;&#45;&#49;&#41;&#92;&#99;&#100;&#111;&#116;&#32;&#114;&#95;&#123;&#73;&#121;&#125;&#94;&#50;&#93;&#92;&#110;&#111;&#116;&#97;&#103;&#32;&#92;&#92;&#38;&#61;&#38;&#86;&#95;&#123;&#83;&#73;&#125;&#40;&#116;&#95;&#123;&#83;&#73;&#125;&#41;&#92;&#99;&#100;&#111;&#116;&#32;&#92;&#117;&#110;&#100;&#101;&#114;&#98;&#114;&#97;&#99;&#101;&#123;&#40;&#78;&#45;&#49;&#41;&#92;&#99;&#100;&#111;&#116;&#32;&#69;&#95;&#123;&#92;&#109;&#97;&#116;&#104;&#99;&#97;&#108;&#123;&#83;&#125;&#125;&#40;&#114;&#95;&#123;&#73;&#121;&#125;&#94;&#50;&#41;&#125;&#95;&#123;&#92;&#101;&#113;
&#117;&#105;&#118;&#32;&#68;&#95;&#123;&#92;&#109;&#97;&#116;&#104;&#99;&#97;&#108;&#123;&#83;&#125;&#125;&#125; &#92;&#101;&#110;&#100;&#123;&#101;&#113;&#110;&#97;&#114;&#114;&#97;&#121;&#42;&#125;\" title=\"Rendered by QuickLaTeX.com\"\/><\/p>\n<p>In the design effect <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-53186ac3521341c07b7e65628b86c16a_l3.png\" class=\"ql-img-inline-formula \" alt=\"&#68;&#95;&#123;&#92;&#109;&#97;&#116;&#104;&#99;&#97;&#108;&#123;&#83;&#125;&#125;&#61;&#40;&#78;&#45;&#49;&#41;&#92;&#99;&#100;&#111;&#116;&#32;&#69;&#95;&#123;&#92;&#109;&#97;&#116;&#104;&#99;&#97;&#108;&#123;&#83;&#125;&#125;&#40;&#114;&#95;&#123;&#73;&#121;&#125;&#94;&#50;&#41;\" title=\"Rendered by QuickLaTeX.com\" height=\"23\" width=\"181\" style=\"vertical-align: -8px;\"\/>, the second term on the right-hand side is a measure of the selection bias when using the estimator <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-71d4ded9ffc5cac571d2d88cf99350a7_l3.png\" class=\"ql-img-inline-formula \" alt=\"&#116;&#95;&#123;&#92;&#109;&#97;&#116;&#104;&#99;&#97;&#108;&#123;&#83;&#125;&#125;\" title=\"Rendered by QuickLaTeX.com\" height=\"15\" width=\"15\" style=\"vertical-align: -3px;\"\/> for the data collected by a technique <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-4bf5ba7bb77556ff3d3dcd76c2c62546_l3.png\" class=\"ql-img-inline-formula \" alt=\"&#92;&#109;&#97;&#116;&#104;&#99;&#97;&#108;&#123;&#83;&#125;\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"12\" style=\"vertical-align: 0px;\"\/>. 
Obviously, for SI sampling, <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-37036191ce2b75739d621db717c6a54f_l3.png\" class=\"ql-img-inline-formula \" alt=\"&#68;&#95;&#123;&#92;&#109;&#97;&#116;&#104;&#99;&#97;&#108;&#123;&#83;&#125;&#125;&#61;&#49;\" title=\"Rendered by QuickLaTeX.com\" height=\"15\" width=\"56\" style=\"vertical-align: -3px;\"\/> applies. For any other method <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-4bf5ba7bb77556ff3d3dcd76c2c62546_l3.png\" class=\"ql-img-inline-formula \" alt=\"&#92;&#109;&#97;&#116;&#104;&#99;&#97;&#108;&#123;&#83;&#125;\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"12\" style=\"vertical-align: 0px;\"\/>, for which <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-6ab8aefe8e8c37a0cce59d15d2303837_l3.png\" class=\"ql-img-inline-formula \" alt=\"&#68;&#95;&#123;&#92;&#109;&#97;&#116;&#104;&#99;&#97;&#108;&#123;&#83;&#125;&#125;&#62;&#49;\" title=\"Rendered by QuickLaTeX.com\" height=\"15\" width=\"56\" style=\"vertical-align: -3px;\"\/> applies, the usage of the usual SI variance estimation formula under the\u00a0sampling method <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-4bf5ba7bb77556ff3d3dcd76c2c62546_l3.png\" class=\"ql-img-inline-formula \" alt=\"&#92;&#109;&#97;&#116;&#104;&#99;&#97;&#108;&#123;&#83;&#125;\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"12\" style=\"vertical-align: 0px;\"\/> that was actually applied leads to the following two negative effects (cf. 
Meng 2018, 700-701):<\/p>\n<ol>\n<li>The actual coverage rates of the common approximate confidence intervals are too small;<\/li>\n<li>The true significance levels of hypothesis tests are too large, thus resulting in too many significant results under the null hypothesis.<\/li>\n<\/ol>\n<p>The essence of (5) is that for a given population size <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-1a732b251e4708d7f9f9ab565fd49ceb_l3.png\" class=\"ql-img-inline-formula \" alt=\"&#78;\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"16\" style=\"vertical-align: 0px;\"\/> the design effect <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-49ae1feb7e817ed712066a9171d5dfba_l3.png\" class=\"ql-img-inline-formula \" alt=\"&#68;&#95;&#123;&#92;&#109;&#97;&#116;&#104;&#99;&#97;&#108;&#123;&#83;&#125;&#125;\" title=\"Rendered by QuickLaTeX.com\" height=\"15\" width=\"24\" style=\"vertical-align: -3px;\"\/> does not depend on the size <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-7b7417db9747afd47c7d2b674cce1895_l3.png\" class=\"ql-img-inline-formula \" alt=\"&#110;\" title=\"Rendered by QuickLaTeX.com\" height=\"8\" width=\"11\" style=\"vertical-align: 0px;\"\/> of the sample at all because <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-7b7417db9747afd47c7d2b674cce1895_l3.png\" class=\"ql-img-inline-formula \" alt=\"&#110;\" title=\"Rendered by QuickLaTeX.com\" height=\"8\" width=\"11\" style=\"vertical-align: 0px;\"\/> only influences the term <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-fdda9c0a32583b8e62ea20f941c1b989_l3.png\" class=\"ql-img-inline-formula \" 
alt=\"&#86;&#95;&#123;&#83;&#73;&#125;&#40;&#116;&#95;&#123;&#83;&#73;&#125;&#41;\" title=\"Rendered by QuickLaTeX.com\" height=\"19\" width=\"63\" style=\"vertical-align: -5px;\"\/>. In other words,\u00a0<img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-49ae1feb7e817ed712066a9171d5dfba_l3.png\" class=\"ql-img-inline-formula \" alt=\"&#68;&#95;&#123;&#92;&#109;&#97;&#116;&#104;&#99;&#97;&#108;&#123;&#83;&#125;&#125;\" title=\"Rendered by QuickLaTeX.com\" height=\"15\" width=\"24\" style=\"vertical-align: -3px;\"\/> does not depend on how &#8220;big&#8221; the data is, but only on the deviation of the true sample selection probabilities of the sampling technique\u00a0<img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-4bf5ba7bb77556ff3d3dcd76c2c62546_l3.png\" class=\"ql-img-inline-formula \" alt=\"&#92;&#109;&#97;&#116;&#104;&#99;&#97;&#108;&#123;&#83;&#125;\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"12\" style=\"vertical-align: 0px;\"\/> that was actually applied from the SI selection probabilities assumed in (4). 
For <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-2e0127cacf6e0ab3625343be7b6e01d2_l3.png\" class=\"ql-img-inline-formula \" alt=\"&#69;&#95;&#123;&#92;&#109;&#97;&#116;&#104;&#99;&#97;&#108;&#123;&#83;&#125;&#125;&#40;&#114;&#95;&#123;&#73;&#121;&#125;&#94;&#50;&#41;&#62;&#92;&#102;&#114;&#97;&#99;&#123;&#49;&#125;&#123;&#78;&#45;&#49;&#125;\" title=\"Rendered by QuickLaTeX.com\" height=\"24\" width=\"116\" style=\"vertical-align: -8px;\"\/>, the bias of <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-71d4ded9ffc5cac571d2d88cf99350a7_l3.png\" class=\"ql-img-inline-formula \" alt=\"&#116;&#95;&#123;&#92;&#109;&#97;&#116;&#104;&#99;&#97;&#108;&#123;&#83;&#125;&#125;\" title=\"Rendered by QuickLaTeX.com\" height=\"15\" width=\"15\" style=\"vertical-align: -3px;\"\/> takes over the leading role in the mean square error <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-a60bcffd50007d69730310d42637f59a_l3.png\" class=\"ql-img-inline-formula \" alt=\"&#77;&#83;&#69;&#95;&#123;&#92;&#109;&#97;&#116;&#104;&#99;&#97;&#108;&#123;&#83;&#125;&#125;&#40;&#116;&#95;&#123;&#92;&#109;&#97;&#116;&#104;&#99;&#97;&#108;&#123;&#83;&#125;&#125;&#41;\" title=\"Rendered by QuickLaTeX.com\" height=\"19\" width=\"84\" style=\"vertical-align: -5px;\"\/>. 
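If <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-1a732b251e4708d7f9f9ab565fd49ceb_l3.png\" class=\"ql-img-inline-formula \" alt=\"&#78;\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"16\" style=\"vertical-align: 0px;\"\/> is large, a tiny deviation of the true selection mechanism from the implicit SI assumption already results in a large design effect <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-49ae1feb7e817ed712066a9171d5dfba_l3.png\" class=\"ql-img-inline-formula \" alt=\"&#68;&#95;&#123;&#92;&#109;&#97;&#116;&#104;&#99;&#97;&#108;&#123;&#83;&#125;&#125;\" title=\"Rendered by QuickLaTeX.com\" height=\"15\" width=\"24\" style=\"vertical-align: -3px;\"\/> with a devastating impact on the estimator&#8217;s inferential quality. For example, with <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-1a732b251e4708d7f9f9ab565fd49ceb_l3.png\" class=\"ql-img-inline-formula \" alt=\"&#78;\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"16\" style=\"vertical-align: 0px;\"\/> = 1,000,001 and <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-e9b6900d0e2e6039af59e1d8c4eefaf4_l3.png\" class=\"ql-img-inline-formula \" alt=\"&#69;&#95;&#123;&#92;&#109;&#97;&#116;&#104;&#99;&#97;&#108;&#123;&#83;&#125;&#125;&#40;&#114;&#95;&#123;&#73;&#121;&#125;&#94;&#50;&#41;\" title=\"Rendered by QuickLaTeX.com\" height=\"23\" width=\"59\" style=\"vertical-align: -8px;\"\/> = 0.0001, that is, a root mean square correlation of only 0.01, (5) yields <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-49ae1feb7e817ed712066a9171d5dfba_l3.png\" class=\"ql-img-inline-formula \" alt=\"&#68;&#95;&#123;&#92;&#109;&#97;&#116;&#104;&#99;&#97;&#108;&#123;&#83;&#125;&#125;\" title=\"Rendered by QuickLaTeX.com\" height=\"15\" width=\"24\" style=\"vertical-align: -3px;\"\/> = 1,000,000 &times; 0.0001 = 100, so that the mean square error is 100 times the SI variance.<\/p>\n<p>This may apply to complex probability sampling when, for the sake of simplicity, the SI selection model is used in the statistical analysis although the true design weights are knowable (cf. Bacher 2009). For nonprobability sampling, the validity of this sample selection model, which is applied in many settings, will almost always be in doubt, yielding the described consequences. As an alternative to such a na\u00efve explicit modeling of the unknown sample selection probabilities of nonprobability sampling, estimates of these probabilities can be used to calculate the design weights needed in (2). This estimation relies on auxiliary variables (such as demographic characteristics) that, on the one hand, should explain the unknown nonprobability sample selection probabilities and, on the other hand, are available for the given nonprobability sample as well as for a probability sample or the population (cf., for instance, Elliot 2009).<\/p>\n<p>Statistical methods such as poststratification or iterative proportional fitting can be applied. 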
Such methods match the sample to given population distributions of available auxiliary variables with the aim of reducing selection bias by adjusting the modeled design weights (cf., for instance, Lohr 2010, 340-346).<\/p>\n<p>&nbsp;<\/p>\n<p><strong>The nonresponse assumption: <\/strong>Another implicit assumption of the application of the estimator (4) is that all elements in the drawn sample <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-8219fa5b592e50fdabe8f10744386940_l3.png\" class=\"ql-img-inline-formula \" alt=\"&#115;\" title=\"Rendered by QuickLaTeX.com\" height=\"8\" width=\"8\" style=\"vertical-align: 0px;\"\/> are available and willing to respond. In other words, it is assumed that there is no nonresponse (even in surveys on sensitive topics or in hard-to-reach and hard-to-ask populations) or, if this is not the case, that at least only a negligible nonresponse bias exists.<\/p>\n<p>When nonresponse occurs despite all efforts to prevent high nonresponse rates under the applied survey mode, then, according to (5), the design effect <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-45f7c1247c258e17623de332bbadf7d7_l3.png\" class=\"ql-img-inline-formula \" alt=\"&#68;&#95;&#92;&#109;&#97;&#116;&#104;&#99;&#97;&#108;&#123;&#83;&#125;\" title=\"Rendered by QuickLaTeX.com\" height=\"15\" width=\"24\" style=\"vertical-align: -3px;\"\/> of the sampling technique <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-4bf5ba7bb77556ff3d3dcd76c2c62546_l3.png\" class=\"ql-img-inline-formula \" alt=\"&#92;&#109;&#97;&#116;&#104;&#99;&#97;&#108;&#123;&#83;&#125;\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"12\" style=\"vertical-align: 0px;\"\/> that was actually applied will be affected by an increase of the expected value\u00a0<img 
loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-e9b6900d0e2e6039af59e1d8c4eefaf4_l3.png\" class=\"ql-img-inline-formula \" alt=\"&#69;&#95;&#123;&#92;&#109;&#97;&#116;&#104;&#99;&#97;&#108;&#123;&#83;&#125;&#125;&#40;&#114;&#95;&#123;&#73;&#121;&#125;&#94;&#50;&#41;\" title=\"Rendered by QuickLaTeX.com\" height=\"23\" width=\"59\" style=\"vertical-align: -8px;\"\/>, where variable <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-13e75d744ebe8d94e0aada0b4600b6e8_l3.png\" class=\"ql-img-inline-formula \" alt=\"&#73;\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"9\" style=\"vertical-align: 0px;\"\/> now indicates the sample membership of the responding units. A measure of this impact of nonresponse on the inference quality is given by the representativeness-indicator (Schouten et al. 2009). This measure is a function of the variance of response probabilities in <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-310ff55814df4f7f49f3b8e8fb604d00_l3.png\" class=\"ql-img-inline-formula \" alt=\"&#85;\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"13\" style=\"vertical-align: 0px;\"\/>. The larger this variance, the lower the representativeness of the given responses. In this way, the representativeness-indicator estimates the deviation of the actual nonresponse mechanism from being completely at random and thus the potential for a non-ignorable nonresponse bias.<\/p>\n<p>Completely ignoring nonresponse in the estimation process is common practice. Doing so means that the nonresponse that occurred is modeled as being completely at random (cf., for instance, Little and Rubin 2002, 12). 
In particular, in the application of the nonprobability sampling methods, a nonrespondent is usually simply replaced by the next suitable person who is willing to cooperate and nonresponse rates are usually not reported for the resulting data sets.<\/p>\n<p>However, in the presence of non-ignorable nonresponse, it is impossible to calculate reliable estimates of population characteristics of interest, such as the total <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-741e9770700e79d3c4f7c6f929f31eb6_l3.png\" class=\"ql-img-inline-formula \" alt=\"&#116;\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"6\" style=\"vertical-align: 0px;\"\/> by a formula like (4) without any intervention in the estimation process. For this purpose, for example, the statistical repair methods of weighting adjustment to compensate for the unit-nonresponse that occurred (by procedures such as poststratification or iterative proportional fitting) and data imputation for the item-nonresponse (by techniques, such as mean or regression imputation) can be applied to reduce the amount of the nonresponse bias under adequate and explicitly formulated models regarding the nonresponse mechanism (cf., for instance, Bethlehem et al. 2011, Chaps. 
8 and 14).<\/p>\n<p>&nbsp;<\/p>\n<p><strong>The measurement and data processing assumption:<\/strong> With the application of an estimator such as<\/p>\n<p class=\"ql-center-displayed-equation\" style=\"line-height: 37px;\"><span class=\"ql-right-eqno\"> &nbsp; <\/span><span class=\"ql-left-eqno\"> &nbsp; <\/span><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/surveyinsights.org\/wp-content\/uploads\/ql-cache\/quicklatex.com-5e82b2fcf2e51ea179bf51fda0f57e1a_l3.png\" height=\"37\" width=\"151\" class=\"ql-img-displayed-equation \" alt=\"&#92;&#98;&#101;&#103;&#105;&#110;&#123;&#101;&#113;&#117;&#97;&#116;&#105;&#111;&#110;&#42;&#125; &#116;&#95;&#123;&#92;&#109;&#97;&#116;&#104;&#99;&#97;&#108;&#123;&#83;&#125;&#125;&#61;&#92;&#102;&#114;&#97;&#99;&#123;&#78;&#125;&#123;&#110;&#125;&#92;&#99;&#100;&#111;&#116;&#32;&#92;&#115;&#117;&#109;&#92;&#110;&#111;&#108;&#105;&#109;&#105;&#116;&#115;&#95;&#123;&#115;&#92;&#115;&#117;&#98;&#115;&#101;&#116;&#101;&#113;&#32;&#85;&#125;&#32;&#121;&#95;&#123;&#107;&#125;&#44; &#92;&#101;&#110;&#100;&#123;&#101;&#113;&#117;&#97;&#116;&#105;&#111;&#110;&#42;&#125;\" title=\"Rendered by QuickLaTeX.com\"\/><\/p>\n<p>it is further assumed that no untruthful answers are given, no wrong measurements are taken, and no processing errors, such as data encoding errors, occur. If this does not apply, it is at least assumed that any measurement or data processing bias is negligible.<\/p>\n<p>To reduce the effect of a measurement or data processing error that has occurred, an explicitly formulated, plausible stochastic model describing the mechanisms that led to the wrong observations can be applied to calculate a reliable estimate (cf., for instance, S\u00e4rndal et al. 
1992, 601-634).<\/p>\n<p>The task force of the Executive Council of the American Association for Public Opinion Research (AAPOR) was charged &#8220;to examine the condition under which various survey designs that do not use probability samples might still be useful for making inferences to a larger population&#8221; (cf. Baker et al. 2013, 6). It was noted that the different nonprobability sampling techniques can be thought of &#8220;as falling on a continuum of expected accuracy of the estimates&#8221; (ibid., 105). At one end of the quality scale are the completely uncontrolled arbitrary samples, whereas at the other end are the methods based on less risky selection procedures in which the results are adjusted as described above, using auxiliary variables that are correlated with the variables of interest (cf. Baker et al. 2013, 105-106).<\/p>\n<p>&nbsp;<\/p>\n<h1><strong> A complementary concept on the inferential quality of surveys<\/strong><\/h1>\n<p>Despite the definition of representativeness suggested in Section 1, it cannot be ignored that in sampling practice it is often sufficient to get only a very rough idea of a population characteristic of interest. Examples from the empirical sciences include pretests and pilot studies, but public surveys, for instance those intended to identify some of the causes of possible dissatisfaction among community residents, also fall into this category. For cases in which nothing or very little is known about the characteristics of interest, for instance in a hard-to-reach population, the following supplementary definition takes account of this fact (Quatember 2001, 20):<\/p>\n<p><em>A sample is called &#8220;informative&#8221; for a certain population characteristic if it provides sufficient information on that characteristic with respect to the declared survey purpose. 
<\/em><\/p>\n<p>Herein, the acceptable degree of inaccuracy is mainly determined by the usefulness of the resulting outcomes with respect to the purpose of the survey, which does not always have to be a high-quality inference from a representative sample to the target population.<\/p>\n<p>&nbsp;<\/p>\n<h1><strong> Conclusions<\/strong><\/h1>\n<p>The question was this: Inferences based on probability sampling or nonprobability sampling \u2013 are they nothing but a question of models? The answer is this: Yes, they are, but only under certain implicit assumptions and under explicit models that react to deviations from them, as discussed in Section 2 using the example of the Horvitz-Thompson estimator (3) of SI sampling for a total (1) of a variable under study. It is implicitly assumed that there is no operationalization, coverage, selection, nonresponse, measurement, or processing bias. In the presence of deviations from these basic assumptions, and hence facing the risk of a substantially biased estimator, a model-based estimation has to be established instead. For this purpose, complementary explicit models have to be formulated concerning these deviations between theory and practice. Then, even the representativeness of a probability sample holds only under these models, as is always the case for nonprobability samples.<\/p>\n<p>However, is there a difference between probability samples and nonprobability samples regarding these models? Again, the answer is: Yes! Far more unverifiable, disputable models addressing the different implicit assumptions are needed in the nonprobability approach to sampling, including big data. 
Nevertheless, the application of a nonprobability sampling technique instead of a probability sampling method might be justified for specific research objectives concerning, for example, special populations, such as hard-to-reach ones, for which informative rather than representative samples, in the sense of the additional definition presented in Section 3, are sufficient. However, if high-quality inference is the survey purpose, it is still the theory of probability sampling that sets the standard and serves as a landmark.<\/p>\n<p>Because the strength of the model dependence differs and the intended research purposes vary, sufficient details about the applied sampling design and the survey purpose should be reported as standard, along with all the implicit assumptions and explicit models applied. Only such a report can enable data users to assess the real quality of the survey results produced.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Introduction There is a constantly increasing demand for objective information about some characteristics of finite populations of interest based on data. Regarding the sources of such data, in this paper, we will distinguish between probability samples and nonprobability samples. 
The probability sampling techniques can be described under a unique theoretical framework because they all share [&hellip;]<\/p>\n","protected":false},"author":965,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[444],"tags":[538,539,471,472,91],"class_list":["post-11203","post","type-post","status-publish","format-standard","hentry","category-probability-and-nonprobability-sampling","tag-representativeness","tag-sample-surveys","tag-sampling-techniques","tag-survey-methodology","tag-total-survey-error"],"acf":[],"_links":{"self":[{"href":"https:\/\/surveyinsights.org\/index.php?rest_route=\/wp\/v2\/posts\/11203","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/surveyinsights.org\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/surveyinsights.org\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/surveyinsights.org\/index.php?rest_route=\/wp\/v2\/users\/965"}],"replies":[{"embeddable":true,"href":"https:\/\/surveyinsights.org\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=11203"}],"version-history":[{"count":52,"href":"https:\/\/surveyinsights.org\/index.php?rest_route=\/wp\/v2\/posts\/11203\/revisions"}],"predecessor-version":[{"id":18832,"href":"https:\/\/surveyinsights.org\/index.php?rest_route=\/wp\/v2\/posts\/11203\/revisions\/18832"}],"wp:attachment":[{"href":"https:\/\/surveyinsights.org\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=11203"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/surveyinsights.org\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=11203"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/surveyinsights.org\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=11203"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}