{"id":7031,"date":"2016-05-09T22:00:52","date_gmt":"2016-05-09T21:00:52","guid":{"rendered":"http:\/\/surveyinsights.org\/?p=7031"},"modified":"2016-05-11T10:47:24","modified_gmt":"2016-05-11T09:47:24","slug":"comparing-smartphones-to-tablets-for-face-to-face-interviewing-in-kenya","status":"publish","type":"post","link":"https:\/\/surveyinsights.org\/?p=7031","title":{"rendered":"Comparing smartphones to tablets for face-to-face interviewing in Kenya"},"content":{"rendered":"<p><strong>Background<\/strong><\/p>\n<p>Research conducted over the past 30 years has demonstrated a reduction in errors and improvement in data quality when face-to-face social surveys are carried out using computers instead of paper and pencil (Banks &amp; Laurie, 2000; de Leeuw, 2008; Schrapler, Schupp, &amp; Wagner, 2010). \u00a0Most studies of data quality by survey mode and by device type have been conducted in developed countries, where the vast number of surveys conducted for policy, marketing, and other purposes provide opportunities for methodological research. \u00a0Caeyers, Chalmers, and De Weedt\u2019s (2010) study comparing paper and pencil interviews (PAPI) to computer-assisted personal interviews (CAPI) in Tanzania represents a rare example of an experimental study on survey methodology in a developing country, and the results confirmed that the internal validation checks that are programmed into CAPI questionnaires to detect skip errors, implausible answers, and impossible answers led to a substantial reduction in errors compared to surveys conducted using PAPI. \u00a0An experimental study in Fiji by Yu et al. (2009) found that none of the errors observed in 20.8% of the paper questionnaires were found in the CAPI versions and that the PDA-programmed version led to cost and time savings compared to paper forms. 
\u00a0Aside from these rare examples, most research on mode effects in developing countries consists of non-experimental studies carried out as pilots within ongoing surveys or conducted retrospectively when a new mode is adopted on a longitudinal survey. \u00a0Most results have paralleled those found in earlier developed-country studies comparing PAPI to CAPI; researchers have found that interviewer error on CAPI surveys is lower than would be expected with paper surveys, a reduction likely attributable to enforced skips (Trott &amp; Simpson, 2005; Siekmans, Ngni\u00e9-Teta, Ndiaye, &amp; Berti, 2012; for an alternative view, see Escobal &amp; Benites, 2013).<\/p>\n<p>International funders and organizations carrying out data collection projects are eager to adopt computerized methods for data collection. \u00a0Indeed, the United Nations Economic and Social Affairs Statistics Division explicitly recommends that, \u201cin all cases, data should be collected in electronic format wherever possible, as this facilitates data capture and editing\u201d (United Nations Statistics Division, 2014). \u00a0Yet, adoption of CAPI surveys for data collection in developing countries has been slow through the mid-2010s. 
\u00a0Based on the authors\u2019 collective experience and anecdotal information from survey managers conducting surveys in developing countries, this is likely due to obstacles such as the cost and availability of hardware and software for surveys, the relatively short battery life of laptops, the need for frequent access to electrical power, the relative fragility of the hardware, the lack of reliable mobile networks for data transmission, and limited experience with questionnaire programming and CAPI management on the part of in-country survey organizations.<\/p>\n<p>Recent advances in lower-cost, lighter-weight mobile devices such as smartphones and tablets with longer-lived batteries, user-friendly interfaces, and easy programming coincide with the rapid expansion of mobile networks to make this an opportune time for adopting CAPI instruments for surveys in developing countries. \u00a0Tomlinson et al. (2009), among others, suggest that the ease of use and familiarity of mobile phones could make them more useful for data collection than other CAPI hardware.\u00a0 The World Bank has taken a leading role in expanding mobile-platform surveys by developing a mobile questionnaire and survey management tool for use on the global Living Standards Measurement Survey (Carletto, 2015) and other World Bank-sponsored surveys. \u00a0The United States Census Bureau has also developed a mobile version of its free CSPro survey questionnaire software.\u00a0 However, little is known about the impact smaller devices have on the quality of data when used for face-to-face interviews. 
\u00a0Instead, research on device-mode effects on data quality has been carried out primarily on self-administered questionnaires (SAQ).<\/p>\n<p>On SAQs, whether computerized or paper and pencil, respondents must process information that generally appears before them in a static form: textual, numeric, symbolic, and graphic (Redline &amp; Dillman, 1999).\u00a0 In contrast to SAQs, CAPIs include the intervening presence of an interviewer, who delivers the question orally and provides, in the gold standard method, only pre-defined interpretations of the question.\u00a0 But while the contextual differences between SAQs and CAPIs are substantial, the existing research on SAQs is nonetheless instructive for implementers of any type of CAPI data collection, since human-computer interaction (HCI) is necessary for an interviewer to complete the survey. \u00a0Studying the smaller size of mobile devices, Bruijne and Wijnant (2013) found that self-administered web surveys carried out on mobile devices took longer to complete than the same survey on desktop computers, perhaps due to formatting differences. \u00a0Mavletova (2013) also found that durations were longer when respondents used mobile devices to complete a survey compared to a PC or laptop, although only a portion of the longer duration was due to respondents finding it more difficult to complete questions. \u00a0Rather, slow question loading explained most of the difference. Lugtig and Toepoel (2016) found larger measurement error when smaller devices were used for an SAQ, although they surmised that this error might be due to respondent characteristics rather than the device per se, in that respondents who choose to use smaller devices might differ in substantive ways from those who choose to use larger devices such as desktops or tablets. 
\u00a0Other studies of HCI suggest that mobile phones may not be an optimal replacement for the larger screens of laptops and larger mobile devices such as tablets for completing questionnaires. \u00a0Peytchev and Hill (2010) found that small keyboard size led to avoidance of open-ended questions in an experimental mobile self-administered survey. \u00a0Peytchev and Hill point to a broader literature on HCI, which shows that task success rates, such as correct selections, are lower on smaller screens. Applying this growing body of research on the effect of screen size on the quality of survey data, we suspect that device size could affect the quality of survey data entered by interviewers. \u00a0Even if a respondent provides a lengthy response to an open-ended question or a well-considered response to a closed question, the interviewer may short-cut or mis-select responses at a higher rate on a smaller device, thus altering responses and curtailing quality.<\/p>\n<p>Experimental research on CAPIs under field conditions in developing countries is rare and to date we can find no experimental comparisons of device size impact on interviewer data quality in such settings. \u00a0As a first effort, using data from a small pilot study conducted during a large-scale CAPI survey in Kenya, we compare the influence of device size on the quality of survey data collected by interviewers using tablets or smartphones. \u00a0By assessing interviewer data quality in terms of thoroughness (low number of missing responses and high rate of GPS coordinate capture), accuracy (correct data entry), and consistency (mean duration), we explore the influence of device size on interviewers\u2019 administration behaviour. \u00a0In our analysis we assume equality in experience across the interviewers, \u201cJohn\u201d and \u201cJane,\u201d but we also collected information on the interviewers\u2019 perceptions of the two devices to better understand the individual user experience. 
\u00a0We hypothesize that data collected on smartphones will be of lower quality than data collected using tablets. \u00a0We expect that lower quality will be seen through a higher number of missing responses, lower rates of GPS coordinate capture, more errors in numeric or text entry, and shorter or implausible durations, and that these indicators of lower quality are linked to the use of the smaller screens and keyboards on smartphones.<\/p>\n<p><em>Thoroughness (low missing data)<\/em>: Item nonresponse is one of two main types of nonresponse error (the other being sample unit nonresponse). \u00a0Rates of item missingness, including \u201cDon\u2019t Know\u201d (DK), \u201cRefuse\u201d (REF), and \u201cNot applicable\u201d (NA), are routinely used as markers of interviewer data quality in surveys under the expectation that \u201cgood\u201d interviewer behaviour will lead to high cooperation and willingness from respondents to provide responses other than DK\/REF\/NA (Groves, 1989; de Leeuw, 2001; de Leeuw &amp; Huisman, 2003; Jans, Sirkis, &amp; Morgan, 2013). \u00a0Recent research on questionnaire design suggests that item nonresponse differs by device type (Mavletova &amp; Couper, 2014).<\/p>\n<p><em>Accuracy (correct data entry):<\/em> Training interviewers to correctly enter numeric and text strings is a strategy for reducing other interviewer-related measurement error, such as out-of-range responses or mis-recorded responses (Biemer &amp; Lyberg, 2003; Fowler, 2004).\u00a0 Whether entering case ID codes, monetary values, or telephone numbers, correct and complete numerical data entry is a key interviewer skill for ensuring data quality.<\/p>\n<p><em>Consistency (mean duration)<\/em>: Survey managers track the average duration of survey interviews as part of process management and as a useful indicator of interview quality (Olson &amp; Peytchev, 2007). 
\u00a0For process management and budget control, the expected duration of the interview is determined during pretesting of the instrument and re-estimated in the early field period. \u00a0These benchmarks are used during the field period to identify outlier cases for further scrutiny or to identify interviewers whose average duration is outside the expected range. \u00a0Duration in a personal interview can correlate with cooperation and rapport (Holbrook, Green, &amp; Krosnick, 2003); it is simple to measure, and acceptable ranges are relatively easy to set and monitor. \u00a0Differences in duration can be understood in a variety of ways.\u00a0 Shorter times may indicate a high degree of rapport and cooperation between respondent and interviewer or suggest efficiency on the part of the interviewer. \u00a0On the other hand, shorter duration might suggest shortcutting or speeding on the part of the interviewer. \u00a0In their 2013 study of response times, Couper and Kreuter found that questionnaire items with interviewer instructions took less time to administer than items without instructions, leading the authors to surmise that interviewers might not be reading the instructions. \u00a0Unobtrusive computer audio recorded interviewing (CARI) studies support this finding; in a study of interviewer effects on data quality, Kosyakova, Skopek, and Eckman (2014) found that CAPI interviewers manipulate the triggering rate of filter questions and that this undesirable behaviour increased over the field period. 
\u00a0When interviewer pay structure is per-completed-case, speeding might be a logical approach to maximizing wages.<\/p>\n<p><strong>\u00a0<\/strong><\/p>\n<p><strong>Methodology<\/strong><\/p>\n<p>Data for the World Bank\u2019s Kenya State of the Cities Baseline Survey were collected from July 2012 to March 2013.\u00a0 The survey supports the Kenya Municipal Program (KMP), a long-term effort to improve living conditions through infrastructure investment and service delivery in 15 municipalities in Kenya. The survey portion of the State of the Cities project included two main tasks: 1) creating a sample frame based on listing a projected 194,000 households in 2,087 enumeration areas (EAs) in 15 of Kenya\u2019s largest cities, and 2) carrying out interviews of 30-45 minutes\u2019 duration with approximately 14,600 households randomly selected from the sample frame. Listing and interviewing were carried out concurrently using tablet computers. \u00a0Teams of data collectors used tablet-programmed listing forms to enumerate all households contained within each EA. 
\u00a0Next, interviewers uploaded the listing data to a server via the mobile network using their SIM card-enabled tablets.\u00a0 The data were captured in a server accessible via a web interface.\u00a0 The data collection team sampled households from each fully listed EA using the web interface, and then transmitted the selected case data, including household identifier, location, and descriptive data, to interviewer tablets.\u00a0 Finally, interviewers contacted the selected households for interviewing.\u00a0 At the end of each day, all completed survey response data were transmitted to the server via the mobile network, and all data were accessible for review and processing through a web interface.<\/p>\n<p>As part of a grant from the Center for Excellence in Survey Research at NORC at the University of Chicago, the research team selected two KMP survey interviewers, \u201cJohn\u201d and \u201cJane,\u201d to carry out 200 of their assigned household interviews using smartphones instead of tablets.\u00a0 The selected interviewers had several years\u2019 experience working with the data collection company, demonstrated high production on social surveys, and were considered to collect high-quality data, according to the data collection manager (<em>n.b.<\/em> the criteria for this determination were not clear, and no specific data supporting the rating were provided to the authors). \u00a0Midway through the tablet data collection period, the two interviewers conducted approximately 50 interviews each using smartphones in two cities, Nairobi and Thika, to reach a total of 200 interviews. \u00a0To complete these interviews, interviewers simply switched devices until they had completed 50 interviews in each city. \u00a0The application and interface were identical on both the phone and the tablet, with no differences in functionality; the sole difference between the devices was the screen and keyboard size. 
\u00a0The cases completed on phones were all \u201cfresh\u201d; in other words, respondents had not been previously contacted by the interviewers and were not pre-screened in any way. \u00a0By performing the pilot study in the middle of the field period, we were able to ensure that the interviewers were already familiar with the software and that any differences in quality would be attributable to device effects.<\/p>\n<p>The purpose of the research was to permit comparison of the quality of data collected using smartphones to the quality of data collected using tablets. \u00a0We compared the data in terms of missing responses (thoroughness), mistyped phone numbers (accuracy), and mean duration of the interview (consistency).\u00a0 We also carried out qualitative interviews with the interviewers and their supervisor to gain a more textured understanding of their experiences with the smartphone and tablet, and their preferences in using the two different devices.<\/p>\n<p>We performed two-sample t-tests to determine whether each indicator of survey quality differed significantly between interviews conducted with phones and interviews conducted with tablets, using several different comparison groups.\u00a0 Below, we discuss the results for each indicator.<\/p>\n<p><strong>\u00a0<\/strong><\/p>\n<p><strong>Results<\/strong><\/p>\n<p>In our initial research design, we planned to compare the data collected using phones to the data collected using tablets by combining our two phone interviewers\u2019 results and comparing to tablet data collected by all interviewers in all 15 cities.\u00a0 However, we found significant differences in outcomes between the two interviewers participating in the experiment. \u00a0This made the analysis more challenging as the two distinct interviewer profiles reduced our ability to make generalizations to other interviewers. 
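The two-sample comparisons reported in the tables below can be sketched as follows. This is a minimal illustration only: the paper does not state which t-test variant was used, so Welch's unequal-variance form is shown here, and the sample values are invented, not the study's data.

```python
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Welch's two-sample t statistic and approximate degrees of freedom."""
    na, nb = len(sample_a), len(sample_b)
    va, vb = variance(sample_a), variance(sample_b)  # sample variances
    se2 = va / na + vb / nb                          # squared standard error of the difference
    t_stat = (mean(sample_a) - mean(sample_b)) / se2 ** 0.5
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = se2 ** 2 / ((va / na) ** 2 / (na - 1) + (vb / nb) ** 2 / (nb - 1))
    return t_stat, df


# Invented tablet vs. phone interview durations (minutes), for illustration only
t_stat, df = welch_t([25, 30, 28, 32, 27], [22, 24, 21, 25, 23])
```

The resulting statistic would then be compared against the t distribution with `df` degrees of freedom at the significance thresholds used in the tables (p<0.1, p<0.05, p<0.01).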
\u00a0Because combining their results would have distorted the findings, we analysed the two interviewers\u2019 results separately, which also reduced our effective sample sizes. \u00a0However, the interviewer-specific differences provided an opportunity to explore individual interviewing styles and experiences, and how these interacted with the two mobile devices. \u00a0Below, we present the results of our quantitative analysis of the data and include illustrative or explanatory qualitative data where appropriate.\u00a0 For each dimension, we present each interviewer&#8217;s tablet results compared to his or her own phone results.\u00a0 The two interviewers\u2019 results are presented side-by-side for easy comparison.<\/p>\n<p><strong><em>Thoroughness (missing data: DK\/REF\/NA)<\/em><\/strong><\/p>\n<p>For this analysis, we compared our two interviewers\u2019 rates of missing items and found that only one of the two interviewers demonstrated a significant difference in item missingness by device mode.\u00a0 As shown in Table 1, interviewer Jane showed a significantly lower proportion of missing data for tablet interviews compared to phone interviews, by 0.84 to 1.53 percentage points, across all variations of comparison groups, while John\u2019s rate of item missingness was similar on both devices under all comparison scenarios.<\/p>\n<p><strong>Table 1: Difference in mean percent of missing values (Refused, Don&#8217;t Know, Not Applicable)<\/strong><\/p>\n<p>&nbsp;<\/p>\n<table width=\"743\" border=\"1\" cellspacing=\"0\" cellpadding=\"0\">\n<tbody>\n<tr>\n<td valign=\"top\" width=\"407\">Comparison Groups, by device used<\/td>\n<td colspan=\"2\" valign=\"top\" width=\"336\">\n<p align=\"center\">Difference in mean % of observations with missing values, by interviewer<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"top\" width=\"407\"><\/td>\n<td valign=\"top\" width=\"171\">\n<p align=\"center\">Tablets compared to phones 
(John)<\/p>\n<\/td>\n<td valign=\"top\" width=\"165\">\n<p align=\"center\">Tablets compared to phones (Jane)<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"top\" width=\"407\">Thika tablets compared to Thika phones<\/td>\n<td valign=\"top\" width=\"171\">\n<p align=\"center\">-0.15<\/p>\n<\/td>\n<td valign=\"top\" width=\"165\">\n<p align=\"center\">-1.32***<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"top\" width=\"407\">Nairobi tablets compared to Nairobi phones<\/td>\n<td valign=\"top\" width=\"171\">\n<p align=\"center\">-0.01<\/p>\n<\/td>\n<td valign=\"top\" width=\"165\">\n<p align=\"center\">-0.99*<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"top\" width=\"407\">Thika &amp; Nairobi tablets compared to Thika &amp; Nairobi phones<\/td>\n<td valign=\"top\" width=\"171\">\n<p align=\"center\">-0.11<\/p>\n<\/td>\n<td valign=\"top\" width=\"165\">\n<p align=\"center\">-1.14***<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"top\" width=\"407\">All cities compared to Thika phones<\/td>\n<td valign=\"top\" width=\"171\">\n<p align=\"center\">0.01<\/p>\n<\/td>\n<td valign=\"top\" width=\"165\">\n<p align=\"center\">-0.84**<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"top\" width=\"407\">All cities compared to Nairobi phones<\/td>\n<td valign=\"top\" width=\"171\">\n<p align=\"center\">-0.26<\/p>\n<\/td>\n<td valign=\"top\" width=\"165\">\n<p align=\"center\">-1.53***<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"top\" width=\"407\">All cities compared to Thika &amp; Nairobi phones<\/td>\n<td valign=\"top\" width=\"171\">\n<p align=\"center\">-0.12<\/p>\n<\/td>\n<td valign=\"top\" width=\"165\">\n<p align=\"center\">-1.17***<\/p>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p><em><span style=\"font-size: small;\">\u00a0\u00a0\u00a0 <\/span><\/em><em>Statistical significance indicated as follows: *=p&lt;0.1, **=p&lt;0.05, ***=p&lt;0.01<\/em><\/p>\n<p>When asked to compare her use of the phone to the tablet, Jane indicated that she was able to type faster on tablets because of the size of the 
keys.\u00a0 Jane and John both indicated that they were more likely to accidentally mis-select responses on the phones than on tablets.\u00a0 While both of these statements suggest potential drawbacks of phones, we cannot draw a clear link to the higher rate of missing items for Jane\u2019s phone.<\/p>\n<p><strong><em>Accuracy (typing errors)<\/em><\/strong><\/p>\n<p>In this survey, both interviewers collected significantly more valid phone numbers on tablets than on phones by nearly every measure of comparison, as shown in Table 2.\u00a0 (Phone numbers were deemed \u201cvalid\u201d if they contained the correct number of digits and started with Kenyan prefixes.)<\/p>\n<p><strong>Table 2: Difference in the mean number of valid phone numbers listed<\/strong><\/p>\n<p>&nbsp;<\/p>\n<table width=\"743\" border=\"1\" cellspacing=\"0\" cellpadding=\"0\">\n<tbody>\n<tr>\n<td valign=\"top\" width=\"405\">Comparison Groups, by device used<\/td>\n<td colspan=\"2\" valign=\"top\" width=\"338\">\n<p align=\"center\">Difference in mean number of valid phone numbers listed, by interviewer<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"top\" width=\"405\"><\/td>\n<td valign=\"top\" width=\"173\">\n<p align=\"center\">Tablets compared to phones (John)<\/p>\n<\/td>\n<td valign=\"top\" width=\"165\">\n<p align=\"center\">Tablets compared to phones (Jane)<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"top\" width=\"405\">Thika tablets compared to Thika phones<\/td>\n<td valign=\"top\" width=\"173\">\n<p align=\"center\">0.10<\/p>\n<\/td>\n<td valign=\"top\" width=\"165\">\n<p align=\"center\">0.63*<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"top\" width=\"405\">Nairobi tablets compared to Nairobi phones<\/td>\n<td valign=\"top\" width=\"173\">\n<p align=\"center\">0.21**<\/p>\n<\/td>\n<td valign=\"top\" width=\"165\">\n<p align=\"center\">0.18**<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"top\" width=\"405\">Thika &amp; Nairobi tablets compared to Thika &amp; Nairobi phones<\/td>\n<td valign=\"top\" 
width=\"173\">\n<p align=\"center\">0.15**<\/p>\n<\/td>\n<td valign=\"top\" width=\"165\">\n<p align=\"center\">0.17***<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"top\" width=\"405\">All cities compared to Thika phones<\/td>\n<td valign=\"top\" width=\"173\">\n<p align=\"center\">0.10<\/p>\n<\/td>\n<td valign=\"top\" width=\"165\">\n<p align=\"center\">0.20**<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"top\" width=\"405\">All cities compared to Nairobi phones<\/td>\n<td valign=\"top\" width=\"173\">\n<p align=\"center\">0.19**<\/p>\n<\/td>\n<td valign=\"top\" width=\"165\">\n<p align=\"center\">0.16**<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"top\" width=\"405\">All cities compared to Thika &amp; Nairobi phones<\/td>\n<td valign=\"top\" width=\"173\">\n<p align=\"center\">0.14**<\/p>\n<\/td>\n<td valign=\"top\" width=\"165\">\n<p align=\"center\">0.18***<\/p>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p><em><span style=\"font-size: small;\">\u00a0\u00a0 \u00a0<\/span><\/em><em>Statistical significance indicated as follows: *=p&lt;0.1, **=p&lt;0.05, ***=p&lt;0.01<\/em><\/p>\n<p>The difference between tablets and phones for interviewers accurately collecting phone numbers may have been due to differences in the interaction between the interviewer and the device. First, the keyboard size is smaller on the phone, which might lead to accidental \u201ctypos,\u201d or errors in numbers when interviewers\u2019 fingers touch more than one key. Second, thumb typing with either or both thumbs is typical for keyboard data entry on the smartphone, while interviewers could more easily use all fingers on one hand or, possibly, both hands, to enter data on the tablet keyboard. 
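The validity rule used for Table 2 (the correct number of digits and a Kenyan prefix) can be sketched as a simple check. The exact prefixes the study accepted are not specified; the 10-digit \u201c07\u201d mobile pattern below is an illustrative assumption.

```python
import re

# Assumption for illustration: a valid entry is 10 digits starting with "07",
# a common Kenyan mobile format; the study does not list its accepted prefixes.
KENYAN_MOBILE = re.compile(r"^07\d{8}$")


def is_valid_phone(entry: str) -> bool:
    """True if the typed entry looks like a valid Kenyan mobile number."""
    return bool(KENYAN_MOBILE.match(entry.strip()))
```

A mis-keyed digit, a dropped digit from shortcutting, or a stray character from a fat-finger touch would all fail this check, which is how typing errors surface as \u201cinvalid\u201d numbers in the analysis.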
\u00a0This study, which was carried out under normal field conditions, did not include capture of typing method by device, although the participating interviewers indicated thumb-typing was most typical on the phones and both thumb-typing and one-handed typing were used on tablets.<\/p>\n<p>Although the analysis showed significantly poorer results for accurately capturing numbers on phones, the two interviewers in our experiment described different experiences typing with the phones.\u00a0 Jane said that she tended to type less (fewer words in text strings) with the phone than when using the tablet and she found typing easier on the tablet because of the larger size of the keys. This could mean that she skipped some typing tasks on the phone, including entering phone numbers. John did not find that one device was easier for typing than the other.<\/p>\n<p>Alternatively, respondents may have felt uncomfortable giving out their phone number when it was being entered into what may have looked like the interviewers\u2019 personal cell phone; when the interviewer used a tablet, confidence may have been higher that the phone numbers were being collected for legitimate purposes.\u00a0 Therefore, we cannot rule out respondent reluctance as a source of error in phone number collection.<\/p>\n<p><strong><em>Consistency (mean duration)<\/em><\/strong><\/p>\n<p>By using the start time and the end time captured in the programmed questionnaire, we calculated the length of each interview on the KMP survey.\u00a0 The overall mean duration (all interviewers) was 24.3 minutes. 
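The duration measure above is derived from the start and end timestamps captured by the programmed questionnaire. A minimal sketch of that computation follows; the timestamp format shown is an assumption for illustration.

```python
from datetime import datetime

TS_FORMAT = "%Y-%m-%d %H:%M:%S"  # assumed timestamp format, for illustration


def interview_minutes(start: str, end: str) -> float:
    """Interview length in minutes from captured start/end timestamps."""
    delta = datetime.strptime(end, TS_FORMAT) - datetime.strptime(start, TS_FORMAT)
    return delta.total_seconds() / 60.0


# e.g. an interview running 24 minutes 18 seconds
minutes = interview_minutes("2012-08-01 10:15:00", "2012-08-01 10:39:18")
```

Per-interview durations computed this way can then be averaged by device and interviewer, and flagged against the pretest benchmark to spot implausibly short (or long) cases.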
We performed two-sample t-tests to determine whether there was a significant difference in the mean durations of interviews conducted using phones as compared to interviews conducted using tablets.<\/p>\n<p>As shown in Table 3, John had significantly longer survey durations on tablets than phones in Thika and Nairobi.\u00a0 In Thika, his tablet interviews were, on average, five minutes longer than his phone interviews, and in Nairobi, they were over 11 minutes longer.\u00a0 His mean duration on tablets in all 15 cities was also significantly longer than his mean duration on phones in Nairobi by an average of over eight minutes.<\/p>\n<p><strong>Table 3: Difference in mean survey durations (in minutes) <\/strong><\/p>\n<p>&nbsp;<\/p>\n<table width=\"743\" border=\"1\" cellspacing=\"0\" cellpadding=\"0\">\n<tbody>\n<tr>\n<td valign=\"top\" width=\"407\">Comparison Groups, by device used<\/td>\n<td colspan=\"2\" valign=\"top\" width=\"336\">\n<p align=\"center\">Difference in mean survey durations (minutes), by interviewer<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"top\" width=\"407\"><\/td>\n<td valign=\"top\" width=\"171\">\n<p align=\"center\">Tablets compared to phones (John)<\/p>\n<\/td>\n<td valign=\"top\" width=\"165\">\n<p align=\"center\">Tablets compared to phones (Jane)<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"top\" width=\"407\">Thika tablets compared to Thika phones<\/td>\n<td valign=\"top\" width=\"171\">\n<p align=\"center\">5.03*<\/p>\n<\/td>\n<td valign=\"top\" width=\"165\">\n<p align=\"center\">0.75<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"top\" width=\"407\">Nairobi tablets compared to Nairobi phones<\/td>\n<td valign=\"top\" width=\"171\">\n<p align=\"center\">11.66***<\/p>\n<\/td>\n<td valign=\"top\" width=\"165\">\n<p align=\"center\">1.22<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"top\" width=\"407\">Thika &amp; Nairobi tablets compared to Thika &amp; Nairobi phones<\/td>\n<td valign=\"top\" width=\"171\">\n<p 
align=\"center\">0.59<\/p>\n<\/td>\n<td valign=\"top\" width=\"165\">\n<p align=\"center\">1.06<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"top\" width=\"407\">All cities compared to Thika phones<\/td>\n<td valign=\"top\" width=\"171\">\n<p align=\"center\">-6.10<\/p>\n<\/td>\n<td valign=\"top\" width=\"165\">\n<p align=\"center\">3.28*<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"top\" width=\"407\">All cities compared to Nairobi phones<\/td>\n<td valign=\"top\" width=\"171\">\n<p align=\"center\">8.30**<\/p>\n<\/td>\n<td valign=\"top\" width=\"165\">\n<p align=\"center\">-2.39<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"top\" width=\"407\">All cities compared to Thika &amp; Nairobi phones<\/td>\n<td valign=\"top\" width=\"171\">\n<p align=\"center\">0.51<\/p>\n<\/td>\n<td valign=\"top\" width=\"165\">\n<p align=\"center\">0.51<\/p>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p><em><span style=\"font-size: small;\">\u00a0\u00a0\u00a0 <\/span><\/em><em>Statistical significance indicated as follows: *=p&lt;0.1, **=p&lt;0.05, ***=p&lt;0.01<\/em><\/p>\n<p>In contrast, Jane showed no significant difference in the mean duration of interviews on phones compared to tablets in Thika or Nairobi (separately or combined). 
\u00a0However, when comparing all tablet interviews in all 15 cities to Jane\u2019s Thika phone interviews, tablet interviews were significantly longer than Jane\u2019s phone interviews by an average of three minutes.\u00a0 In discussions about the phone pilot, Jane indicated that she owned a smartphone and used it for texting.\u00a0 John indicated that he did not have a smartphone.\u00a0 It is possible that Jane\u2019s shorter durations on phones were a result of more familiarity with a similar device and that John\u2019s longer durations on phones represent his longer learning curve, but Jane\u2019s higher rate of missing and mistyped values on the phone muddies this supposition.<\/p>\n<p><strong><em>Interviewer perceptions and preferences<\/em><\/strong><\/p>\n<p>As described above, our research plan included gathering impressions from our interviewers on differences using tablets and phones for the data collection. \u00a0The primary differences, as the interviewers experienced them, can be summarized as follows:<\/p>\n<ul>\n<li>The tablets attracted more attention than the phones in most interviewing areas or neighbourhoods.\u00a0 Both respondents and other observers wanted to know more about the tablets, such as how much they cost and how they work. \u00a0As a result of people\u2019s curiosity, some of the interviewers\u2019 activities were stalled, as they felt obliged to \u201copen ourselves up and answer the questions\u2026it can be a problem, and we might not get to the respondents (on time).\u201d\u00a0 Phones did not attract this kind of attention. 
\u00a0\u201cWhen you have the phone, people assume you are a visitor coming to see the neighbours.\u00a0 When you have the tablet, they ask questions about you, assuming you are coming for other reasons.\u201d \u00a0The interviewers also had different perceptions of the smartphones and the tablets depending on their location and the sample to be interviewed:<\/li>\n<\/ul>\n<ol>\n<li>The interviewers preferred tablets in higher income neighbourhoods in Nairobi, as the interviewers felt the tablets helped them appear more professional. They saw this as an advantage for gaining cooperation among white collar and other employed respondents.<\/li>\n<li>The interviewers preferred phones in \u201cslums,\u201d as they do not stand out like tablets, and are easier to hide in insecure locations.<\/li>\n<\/ol>\n<ul>\n<li>There was an adjustment period for the interviewers as they learned to use the smartphones, which could account for some differences in data quality.\u00a0 Smartphones were introduced three months into the field period, and the interviewers indicated that it took a little time to become familiar with the phones. \u00a0\u201cOur thumbs are used to the tablets and have been using them a longer time. \u00a0As we continue using the phones, we\u2019ll get more used to the phones so it will be more or less the same.\u201d<\/li>\n<li>Jane typed faster and more on tablets than on phones, according to her own review of the experience.\u00a0 The reason, she stated, had to do with the size of the keys. 
\u00a0John did not indicate any difference in typing on the two devices.<\/li>\n<\/ul>\n<p>While discussing the smartphones and tablets, the interviewers described two differences in their interactions with the devices that were particularly revealing for data collection planning:<\/p>\n<ul>\n<li>Phones require more scrolling to read questions and select response options, which the interviewers admitted led them to skip scrolling rather than reading questions in full as written. \u00a0Instead, the interviewers stated that after having spent months doing many interviews, they no longer needed to scroll to read the questions and\/or response options. \u00a0These comments suggest a significant departure from the standard data-quality practice of reading each question exactly as it is written.<\/li>\n<li>Interviewers also indicated that the act of scrolling to read response options can lead to accidentally selecting a response with the touch-screen interface. \u00a0The interviewers indicated that mis-selecting responses occurred more frequently on the phone because it required more scrolling than the tablet to view each screen. \u00a0Our analysis is unable to detect mis-selected responses.<\/li>\n<\/ul>\n<p><strong>Discussion<\/strong><\/p>\n<p>Despite our hypothesis that smaller screen size would lead to poorer-quality data on smartphones than on tablets, our quantitative analysis was not conclusive regarding differences in data quality between the two devices. \u00a0A lower proportion of valid phone numbers on phones compared to tablets (Table 2) was the only measure on which both interviewers showed significant differences between devices. 
\u00a0This result should be taken into consideration when researchers adopt smartphones for this type of household survey data collection.\u00a0 Most social scientific surveys require gathering numeric data, not limited to phone numbers but also including income, expenditure, quantities, and other numeric values.\u00a0 Accurately recording numbers is challenging for interviewers even on laptops with a full keyboard and, consequently, repeated practice forms an important module in interviewer training for many social scientific surveys.\u00a0 Even simple differences such as the layout of the numeric keypad can affect accuracy and speed of data entry for numbers (Armand, Redick, &amp; Poulsen, 2013), as can the size of the numeric keys (Park &amp; Han, 2010).\u00a0 The smaller keypads on phones may prove to be a source of error for this device type, but both tablets and phones require practice for interviewers to acquire accuracy.<\/p>\n<p>Returning to the surprising result of very different outcomes for the two interviewers, who were selected using the same criteria, we believe that an \u201cinterviewer effect\u201d has muddied some of our results.\u00a0 The data collected showed significant differences between our two interviewers in nearly all dimensions of data quality (not shown here); John had longer durations and lower GPS capture on tablets than Jane, and Jane had fewer missing values on tablets and more valid phone numbers than John. \u00a0From the literature we know that differences in missing values may arise from respondent characteristics, such as an unwillingness to provide information on one device due to privacy concerns or a systematic difference in the sample assigned to one interviewer, or from differences in interviewer behaviour, such as lower rates of probing or other causes (see de Leeuw, 2001). 
\u00a0Our research is unable to uncover respondent reluctance associated with the device, but a thorough examination of the pilot interviewers\u2019 cases revealed that differences in sample characteristics did not explain between-interviewer differences on quality measures (not shown). \u00a0Instead, it is possible that we are detecting differences in quality that are associated with the capabilities or experience of the two interviewers in the pilot rather than differences attributable to their interactions with the two different devices. \u00a0Of particular note in this regard is the much shorter duration of interviews by Jane on both devices, despite there being no differences between her sample and John\u2019s that would explain shorter interviews.<\/p>\n<p>The unexpected admission of poor adherence to survey administration protocols (using memory instead of scrolling for long questions or response options) and the problem of mis-selecting responses while scrolling suggest several recommendations for programming and interviewer monitoring when using tablets or smartphones for data collection.\u00a0 First, surveys must be optimized for the screen size of the data collection device, including breaking long lists into segments that fit on the screen. \u00a0Second, programmers must take care to place selection buttons in the centre of the screen, away from the edges of the form, where users place their fingers for scrolling or paging. \u00a0Third, programmers should weigh the benefits of programming confirmation screens against the cost of lengthier surveys. \u00a0Inconsistencies between the initial response and the confirmation screen should produce a flag immediately visible to the interviewer to allow for correction during the interview. \u00a0Fourth, interviewer training should include demonstration and practice on correct use of the touchscreen to avoid mis-selections. 
\u00a0Finally, interviewer monitoring should include, if feasible, recording portions of the interviewers\u2019 survey administration.<\/p>\n<p>While implementing these recommendations could reduce interviewer-produced errors, ultimately the quality of survey data largely depends on the technical skills of the interviewers and the investment in training, data review, and continuous interviewer feedback made by the research team.<\/p>\n<p><strong>Limitations of the analysis<\/strong><\/p>\n<p>This research has a number of limitations, briefly listed below. \u00a0Budget constraints were the major driver of our choice of a non-experimental method, while client reluctance to extend the pilot to a larger portion of the total survey sample was another design consideration.\u00a0 Thus, readers should keep in mind that the research is limited by:<\/p>\n<ul>\n<li>Non-experimental method<\/li>\n<li>Small sample (2 interviewers, 2 cities, 100 respondents per city, 50 in each arm)<\/li>\n<li>No independent verification of the response data (call-back data verification with respondents was carried out by the interviewer team supervisor, not by independent researchers, and did not include full re-interviews)<\/li>\n<li>Unknown influence of interviewer effects and interviewer interaction with the devices<\/li>\n<li>May not be generalizable to other contexts or survey content<\/li>\n<li>Survey was used \u201cout of the box\u201d and not optimized for use on the phone<\/li>\n<\/ul>\n<p><strong>Conclusion<\/strong><\/p>\n<p>Adoption of new, faster, and cheaper devices for data collection is tempting on any survey project, perhaps particularly so in developing countries where alternatives are few and data collection budgets are low.\u00a0 However, researchers should incorporate methods for identifying in advance the ideal screen size and functionality of the data collection device depending on the content and length of questionnaires, as well 
as other relevant requirements of the survey project.\u00a0 In software and systems engineering, analysts define \u201cuse-cases\u201d appropriate for different purposes. \u00a0Our research suggests that there could be some use-cases for which tablets are most appropriate, others in which phones are best, and still others in which phones and tablets are interchangeable. \u00a0In addition, further study is needed to better understand how human-computer interaction affects data quality in CAPI studies that adopt mobile devices.\u00a0 When selecting a device, researchers must focus their efforts on reducing errors that could be tied to device size and screen layout, and must modify hiring, training, and monitoring of interviewers to take into account differences in interviewer experience and interviewing style.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Background Research conducted over the past 30 years has demonstrated a reduction in errors and improvement in data quality when face-to-face social surveys are carried out using computers instead of paper and pencil (Banks &amp; Laurie, 2000; de Leeuw, 2008; Schrapler, Schupp, &amp; Wagner, 2010). 
\u00a0Most studies of data quality by survey mode and by [&hellip;]<\/p>\n","protected":false},"author":420,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[1],"tags":[94,90,95,330,327,329],"class_list":["post-7031","post","type-post","status-publish","format-standard","hentry","category-uncategorized","tag-capi","tag-data-quality","tag-face-to-face","tag-mobile","tag-smartphones","tag-tablets"],"acf":[],"_links":{"self":[{"href":"https:\/\/surveyinsights.org\/index.php?rest_route=\/wp\/v2\/posts\/7031","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/surveyinsights.org\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/surveyinsights.org\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/surveyinsights.org\/index.php?rest_route=\/wp\/v2\/users\/420"}],"replies":[{"embeddable":true,"href":"https:\/\/surveyinsights.org\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=7031"}],"version-history":[{"count":94,"href":"https:\/\/surveyinsights.org\/index.php?rest_route=\/wp\/v2\/posts\/7031\/revisions"}],"predecessor-version":[{"id":7041,"href":"https:\/\/surveyinsights.org\/index.php?rest_route=\/wp\/v2\/posts\/7031\/revisions\/7041"}],"wp:attachment":[{"href":"https:\/\/surveyinsights.org\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=7031"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/surveyinsights.org\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=7031"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/surveyinsights.org\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=7031"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}