*
* Three-Step Strategy Implementation in Stata
*
*-------------------------------
*
* Please cite when using this syntax:
* Gummer, T., Roßmann, J. (2013). Good Questions, Bad Questions? A Post-Survey Evaluation Strategy Based on Item Nonresponse.
* Survey Methods: Insights from the Field. Retrieved from http://surveyinsights.org/?p=2330
*
*-------------------------------
*
*
*
version 12.1                /* Enter Stata version */
capture log close
clear
clear matrix
set more off
*
*
*
cd "C:/"                    /* Enter path of working directory */
use dataset.dta, clear      /* Enter name of data set */
*
*
*-------------------------------
* Data set should only contain
* numerical variables.
*-------------------------------
*
*
*-------------------------------
* Recoding Item Nonresponse (INR) and Don't Know (DK)
*
* Codes in example:
* 98 == "don't know"
* 99 == "no answer"
* 100 == "not applicable" (R did not receive a question)
*-------------------------------
*
* Defining missing values
*
mvdecode _all, mv(98=.b \ 99=.a \ 100=.c)
*
*
* For each variable: build INR and DK indicators (set to missing if
* the question was not applicable) and store the lower bound of the
* confidence interval of each indicator's mean
*
foreach var of varlist _all {    /* Enter varlist (only numeric variables) */
    gen inr_`var' = `var'==.a
    replace inr_`var' = . if `var'== .c
    lab var inr_`var' "`: var l `var''"
    gen dk_`var' = `var'==.b
    replace dk_`var' = . if `var'== .c
    lab var dk_`var' "`: var l `var''"
    quietly: ci inr_`var'
    gen ci_inr_`var'=r(lb)
    quietly: ci dk_`var'
    gen ci_dk_`var'=r(lb)
}
*
*
* Collapse data set on variable level
*
collapse (mean) inr_* ci_inr_* dk_* ci_dk_*
gen id=.
gen inr=.
gen ci_inr=.
gen dk=.
gen ci_dk=.
gen str name=""
*
* Reshape: one observation per original variable (INR values)
*
local obs_n 1
foreach var of varlist inr_* {
    set obs `obs_n'
    replace id=`=_N' in `=_N'
    quietly: sum `var'
    replace inr=r(mean) in `=_N'
    quietly: sum ci_`var'
    replace ci_inr=r(mean) in `=_N'
    replace name="`: var l `var''" in `=_N'
    local obs_n `=_N+1'
}
*
* Add DK values to the same observations
*
local count 1
foreach var of varlist dk_* {
    quietly: sum `var'
    replace dk=r(mean) in `count'
    quietly: sum ci_`var'
    replace ci_dk=r(mean) in `count'
    local count `=`count' + 1'
}
*
* Correct labels
*
replace name=substr(name,12,.)
*
* Drop unnecessary variables
*
keep id name inr ci_inr dk ci_dk
*
*
*
*-------------------------------
* Prepare list of variables
*-------------------------------
*
* Calculate threshold value
quietly: sum inr, d
gen threshold=r(p75)+(1.5*(r(p75)-r(p25)))    /* Factor k = 1.5 */
*
* Provide ID ratio and IDI ratio
gen id_ratio=inr/dk
gen idi_ratio=inr/(dk+inr)
*
*
* Create list of all critical variables with additional information
list name inr dk id_ratio idi_ratio if ci_inr > threshold & inr!=. & ci_inr!=.
*
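*
*-------------------------------
* Optional sketch (not part of the original syntax)
*
* The steps above flag a variable as critical when the lower
* bound of its INR confidence interval exceeds
* p75 + k*(p75 - p25) of the INR distribution, with k = 1.5.
* The lines below are one possible way to inspect that rule:
* they display the threshold and store the flag in an indicator.
* The variable name "critical" is an assumption introduced
* here for illustration only.
*-------------------------------
*
* Show the threshold value used for the list above
display "Threshold (p75 + 1.5*IQR): " threshold[1]
*
* Flag critical variables (1 = critical, 0 = not critical)
gen critical = ci_inr > threshold & inr!=. & ci_inr!=.
*
* Sort by INR rate (highest first) and list the flagged variables
gsort -inr
list name inr ci_inr threshold if critical==1
*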