Evaluation of Survey Data Quality Based on Interviewers’ Assessments: An Example from Taiwan’s Election and Democratization Study
Chi-lin Tsai, Tsung-Wei Liu, Yi-ju Chen
Asian Journal for Public Opinion Research. 2019. February, 7(1): 57-74
Researchers usually examine the quality of survey data using several conventional measures of reliability and validity. However, those measures are mainly designed to examine the quality of an individual measurement, rather than the quality of a data set as a whole. Methods for evaluating overall data quality are relatively scarce. This paper attempts to fill this gap. We propose using interviewers’ assessments as one of the criteria for evaluating overall data quality, because interviewers are the ones who actually conduct, and thus directly observe, interviews. Taiwan’s Election and Democratization Study (TEDS) has required interviewers to assess how trustworthy the responses of each of their interviewees are, and to provide several descriptions of the process and environment of the interviews. We use this information to evaluate the data quality of TEDS surveys and compare it with the results from the conventional test-retest method. We find that the interviewer assessment is a fair indicator of the overall reliability of attitudinal questions but not a good indicator when factual questions are examined. Regarding the evaluation of data validity, more data are required to see whether or not interviewers’ assessments are informative in terms of data quality.
Keywords: data quality; interviewers’ assessment; reliability; validity; Taiwan’s Election and Democratization Study
Social science research relies heavily on opinion polls to collect individual-level information for hypothesis testing and theory building. To evaluate the quality of survey data, researchers have developed various measures of data validity and reliability (de Vaus, 2001: 20-31; Maxim, 1999: 208-250; Judd et al., 1991: 49-61). There are three conventional measures of validity: content, criterion-related, and construct validity. With regard to reliability, the commonly used measures are test-retest, alternative form, split-halves, and internal consistency. These measures, however, are not always feasible in practice, because of the limited time and budget of polls. Moreover, those measures are mainly used to test individual survey items, rather than to evaluate the overall quality of a data set. A cost-effective method for evaluating the quality of the entire data set is still absent. This paper attempts to fill these research gaps.
Liu and Chen (2004) attempted to use the “interviewer assessment” (also known as “interviewer perception”) as a possible complement to the conventional measures of data quality. Their idea is noteworthy, but their analysis is preliminary (based on only three surveys from the 2002 and 2003 Taiwan’s Election and Democratization Study, TEDS), and consequently their research finding is inconclusive. In this paper, we use 27 surveys to re-examine Liu and Chen’s idea and offer more empirical evidence for the usefulness of the interviewer assessment as a measure of data quality. In short, this paper examines whether interviewers’ assessments of their completed interviews can be a useful indicator of the overall data quality.
In the following sections, we first discuss why data quality is an important issue for opinion polls. Then, we discuss, from both theoretical and practical perspectives, why interviewer assessments can be a useful supplementary indicator of data quality. An empirical examination of our argument is carried out based on the TEDS surveys from 2002-2017. This paper concludes with some remarks about our findings and suggestions for future studies.
Why is Data Quality an Important Issue for Opinion Polls?
Despite state-of-the-art methodology and technology, scientific opinion polls seldom claim to be error-free; on the contrary, an opinion poll is scientific mainly because it acknowledges errors and endeavors to control and correct them (Lavrakas, 2013). Evaluation of data quality is therefore a fundamental basis of modern survey methodology and research.
Survey data are potentially subject to errors due to both the nature of public opinion and the design of opinion polls. Public opinion is unstable in nature and hence difficult to measure without error. Over fifty years ago, Converse (1964) noticed that survey respondents did not answer related questions consistently within an interview, and that their answers changed, apparently at random, from interview to interview. Converse took these findings as evidence that the mass public has no genuine attitudes toward most issues in society. On this view, opinion polls are inevitably subject to errors, because respondents do not admit to their non-attitudes but tend to make up “doorstep opinions” at random at the moment of the interview.
Achen (1975), in contrast, emphasizes the imperfect design of opinion polls as a source of errors in survey data. He argues that the mass public does have genuine attitudes toward issues in society, though these attitudes tend to be vague. Consequently, an individual’s opinion is not fixed at a point but is a distribution of points around some central position. Better survey designs, particularly questionnaire designs that take the vagueness of attitudes into account, are therefore essential to capturing public opinion and reducing errors in opinion polls. Overall, although Achen and Converse disagree about the existence of genuine attitudes, both implicitly concur that public opinion is difficult to measure accurately, hence the importance of evaluating survey data quality.
Furthermore, studies on the formation of public opinion also provide theoretical accounts of why public opinion is unstable and difficult to measure. Sniderman, Brody, and Tetlock (1991, pp. 5-7) argue that an individual takes into account the “evaluatively distinct dimensions of judgments” in interpreting events or in making decisions. As the number of distinct dimensions involved increases, so does the number of considerations needed, which complicates judgment and results in opinion instability. Zaller (1992) also maintains that an individual possesses numerous inconsistent considerations relating to a particular issue. However, he argues that, rather than taking all considerations into account, the individual forms his or her survey response to that issue based on only the few considerations that are at the top of his or her mind at the moment of the interview. Given that the considerations are inconsistent and their relative salience varies with time, public opinion (more specifically, survey-measured opinion) is unstable by nature. Similarly, Alvarez and Brehm (2002) argue that public opinion is structured by a set of diverse predispositions. If an individual has consistent predispositions related to a particular issue, his or her opinion toward that issue will be stable. In contrast, if the individual holds multiple predispositions, his or her opinion will become ambivalent, equivocal, and uncertain. Taken together, although these classic works do not entirely agree with each other about the mechanism of opinion instability or about the extent to which public opinion is unstable, there appears to be a consensus that public opinion is indeed unstable, which implies that it is difficult to measure without error. It is therefore important to evaluate the quality of survey data.
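The "top of the mind" account attributed to Zaller above can be illustrated with a small simulation. This is only a sketch under assumptions that are not in the original text: each consideration is coded +1 (favoring one side) or -1 (favoring the other), a respondent recalls `k` considerations at random in each interview, and the answer follows the majority of those recalled. The function names and the pool sizes are hypothetical.

```python
import random


def survey_response(considerations, k, rng):
    """Answer from the k considerations that happen to be salient right now."""
    recalled = rng.sample(considerations, k)
    # k is odd, so the sum of +1/-1 values is never zero.
    return 1 if sum(recalled) > 0 else -1


def response_stability(considerations, k=3, n_pairs=2000, seed=0):
    """Share of simulated interview pairs in which both answers agree."""
    rng = random.Random(seed)
    agree = 0
    for _ in range(n_pairs):
        first = survey_response(considerations, k, rng)
        second = survey_response(considerations, k, rng)
        agree += first == second
    return agree / n_pairs


# Hypothetical respondents: one with uniformly one-sided considerations,
# one whose considerations are evenly split.
consistent = [+1] * 10
ambivalent = [+1] * 5 + [-1] * 5

stable_share = response_stability(consistent)
unstable_share = response_stability(ambivalent)
print(stable_share, unstable_share)
```

In this toy setup the respondent with one-sided considerations gives the same answer in every interview, while the evenly split respondent agrees with himself or herself only about half the time, which mirrors the argument that inconsistent considerations produce unstable survey responses.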
Assessment of Overall Data Quality
Public opinion is variable and dynamic. Conventional measures of data quality, especially those based on response consistency (e.g., test-retest reliability), are thus not always adequate to provide a clear evaluation of data quality (Johnson, Joslyn, & Reynolds, 2001). Moreover, those measures are mainly designed to evaluate individual survey items rather than the entire data set. Admittedly, if such an evaluation is carried out for a substantial proportion of the items in a questionnaire, the aggregated results might serve as an indicator of the overall quality of a data set. The problem is that most opinion polls can only afford to evaluate a limited number of items, and those items are often chosen subjectively. The evaluation results are thus not always representative of the quality of the entire data set. In some cases, there is no evaluation of any individual item to use as an indicator of the overall data quality. For example, opinion polls that evaluate the test-retest reliability of individual items, such as TEDS, are now under greater pressure to abandon such evaluation, as survey interviews are becoming increasingly costly. Taken together, all these considerations stress the need for a more cost-effective method for evaluating the overall quality of survey data.
It is important to clarify that we are not arguing against using conventional measures of reliability and validity as indicators of the overall data quality. Instead, we are arguing for making use of supplementary information in the evaluation of survey data. One potential source of such information is interviewers’ personal assessments of their completed interviews. If interviewers’ assessments are highly correlated with the traditional indicators of reliability and validity, interviewers’ evaluations of respondents would be a cheap and efficient way to provide information about the overall data quality.
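A minimal sketch of the kind of check described above, at the respondent level: compute each respondent's test-retest consistency and correlate it with the interviewer's rating. Everything specific here is an assumption for illustration, not the procedure or data of this study: the 1-5 rating scale, the four retested items, the made-up answers, and the helper names `retest_consistency` and `pearson`.

```python
from math import sqrt


def retest_consistency(first, second):
    """Share of items answered identically in the main and retest interviews."""
    pairs = list(zip(first, second))
    return sum(a == b for a, b in pairs) / len(pairs)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


# Hypothetical data: (interviewer trust rating, main answers, retest answers)
respondents = [
    (5, [1, 2, 3, 4], [1, 2, 3, 4]),
    (4, [2, 2, 1, 3], [2, 2, 1, 4]),
    (3, [1, 3, 3, 2], [1, 2, 3, 1]),
    (2, [4, 1, 2, 2], [3, 1, 4, 1]),
    (1, [2, 4, 1, 3], [1, 3, 2, 3]),
]

ratings = [rating for rating, _, _ in respondents]
consistency = [retest_consistency(a, b) for _, a, b in respondents]
r = pearson(ratings, consistency)
print(consistency, round(r, 3))
```

A strongly positive correlation in real data would support using the rating as a stand-in for retest consistency; a correlation near zero would not. The same comparison could also be run with one data point per survey rather than per respondent.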
- Interviewer Assessment as a Measure of Data Quality
Table 1 summarizes all TEDS face-to-face surveys to date. Since 2002, TEDS has required every interviewer to complete a short questionnaire right after each completed interview. The questionnaire consists of two parts (except in TEDS 2017). The first part is designed to record special events that occurred in the interview (e.g., the respondent’s comments about the survey), and the second part comprises several Likert items that assess the interview. Three items are of particular interest to our analysis: (1) how cooperative the respondent was, (2) how well the respondent understood the questions, and (3) how trustworthy the respondent’s answers are.
Table 1. A Summary of TEDS Face-to-Face Surveys with Interviewer Assessment
- Rationale
The literature on survey non-response and measurement error provides some support for the use of the interviewer assessment as an indicator of data quality. It has been established that survey participation and response accuracy are connected to some extent (Olson, 2006; Peytchev, Peytcheva, & Groves, 2010; Tourangeau, Groves, & Redline, 2010). People with a low willingness to participate in surveys tend to decline the interview when contacted. If they do participate, these so-called “reluctant respondents” tend to behave poorly in the interview and, most crucially, tend to give poor responses that compromise data quality. From this theoretical perspective, we argue that interviewers’ assessments of respondents’ interview behavior (e.g., cooperativeness, comprehension, and trustworthiness) should be informative for the evaluation of data quality.
In addition to this theoretical consideration, the interviewer assessment in TEDS has three features that have practical value for evaluating survey data. First, the interviewer assessment is aimed at providing an overall evaluation of the interview rather than of individual survey items. Second, whereas the conventional measures focus on the preparatory work for interviews (e.g., questionnaire design) or the end result of interviews (i.e., survey responses), interviewers’ assessments take the real context of interviews into account, through interviewers’ observations of and interactions with respondents. Third, compared to some commonly used measures that require repeated interviews or measurements, the interviewer assessment is a more affordable, convenient, and hence practical approach to the evaluation of data quality. These three features make the interviewer assessment a good complement to the conventional measures of reliability and validity.
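To make the "overall evaluation of the interview" concrete, the three ratings could be stored per interview and averaged up to the survey level. The sketch below is illustrative only and rests on assumptions not stated in the text: a 1-5 scale on which higher is better, an unweighted mean of the three items, and the hypothetical names `InterviewAssessment` and `survey_quality`.

```python
from dataclasses import dataclass


@dataclass
class InterviewAssessment:
    """One interviewer's ratings of one completed interview (assumed 1-5 scale)."""
    cooperative: int   # how cooperative the respondent was
    understood: int    # how well the respondent understood the questions
    trustworthy: int   # how trustworthy the respondent's answers are

    def score(self) -> float:
        # Unweighted mean of the three items; the weighting is an assumption.
        return (self.cooperative + self.understood + self.trustworthy) / 3


def survey_quality(assessments):
    """Mean interview score across all interviews in one survey."""
    scores = [a.score() for a in assessments]
    return sum(scores) / len(scores)


# Hypothetical survey with two completed interviews.
interviews = [InterviewAssessment(5, 4, 3), InterviewAssessment(2, 2, 2)]
overall = survey_quality(interviews)
print(overall)
```

A single survey-level number of this kind is what would then be compared with conventional indicators such as test-retest reliability.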
- Empirical Analysis
- Q1. Do you usually think of yourself as close to any particular party?
- Q2. Do you feel yourself a little closer to one of the political parties than the others?
- Q3. Which party do you feel closest to?
- Q4. Do you feel very close to this party, somewhat close, or not very close?
Conclusions
This paper examines whether interviewers’ assessments of their completed interviews serve as a useful indicator of the overall data quality. To answer this research question, we compare the interviewer assessment with some commonly used measures of data quality based on the TEDS. We find that the interviewer assessment is a fair indicator of the overall reliability of attitudinal questions in TEDS surveys. However, the interviewer assessment is uninformative about the reliability of factual questions. Regarding the evaluation of data validity, the interviewer assessment fails to give a correct indication of survey error and nonsensical responses to important items in TEDS surveys, though it is sensitive to the non-response problem. Taken together, our findings suggest that the interviewer assessment provides some useful information about data quality, but that this information is better suited to supplementing the evaluation of data reliability than that of validity.
These research findings have a substantive implication. Survey interviews are becoming increasingly difficult and costly to conduct. Given a limited project budget and fieldwork time, opinion polls often have no choice but to abandon the retest interview and hence the test-retest reliability evaluation. (We suspect that this is one of the reasons why TEDS 2013 and 2017 did not conduct the retest interview.) According to our findings, the use of interviewers’ assessments as a cost-effective alternative (or supplement) to the test-retest reliability evaluation may be a solution to this difficult situation.
Despite these findings and implications, this study is inarguably still preliminary. Several issues need further investigation. Why does the interviewer assessment fail to serve as a good indicator of the reliability of factual questions and of data validity? Can other kinds of interviewer assessments provide information for data evaluation (e.g., interviewers’ assessments of respondents’ knowledge of and interest in survey questions)? How can we evaluate and improve the quality of the interviewer assessment itself? In future work, we will attempt to address these issues.
References
Achen, C. H. (1975). Mass political attitudes and the survey response. American Political Science Review, 69(4), 1218-1231. DOI: 10.2307/1955282
Alvarez, R. M., & Brehm, J. (2002). Hard choices, easy answers. Princeton, NJ: Princeton University Press.
Carmines, E. G., & Zeller, R. A. (1979). Reliability and validity assessment. London, England: Sage Publications.
Chu, Y. H. (2004). Tai Wan Xuan Ju Yu Min Zhu Hua Diao Zha, 2003 [Taiwan's Election and Democratization Study, 2003] (TEDS2003) (NSC92-2420-H-001-004). Guo Ke Hui Zhuan Ti Yan Jiu Ji Hua Bao Gao Shu [National Science Council Research Project]. Taipei, Taiwan.
Converse, P. E. (1964). The nature of belief systems in mass publics. In D. Apter (Ed.), Ideology and discontent. New York, NY: Free Press.
de Vaus, D. (2001). Research design in social research. London: Sage Publications.
Huang, C. (2002). Tai Wan Xuan Ju Yu Min Zhu Hua Diao Zha, 2001 [Taiwan's Election and Democratization Study, 2001] (TEDS2001) (NSC90-2420-H-194-001). Guo Ke Hui Zhuan Ti Yan Jiu Ji Hua Bao Gao Shu [National Science Council Research Project]. Taipei, Taiwan.
Huang, C. (2003). Tai Wan Xuan Ju Yu Min Zhu Hua Diao Zha, 2002: Bei Gao Liang Shi Xuan Ju Fang Wen An [Taiwan's Election and Democratization Study, 2002: The Survey of Taipei and Kaohsiung Cities Mayoral Elections] (TEDS2002) (NSC91-2420-H-194-001-SSS). Guo Ke Hui Zhuan Ti Yan Jiu Ji Hua Bao Gao Shu [National Science Council Research Project]. Taipei, Taiwan.
Johnson, J. B., Joslyn, R. A., & Reynolds, H. T. (2001). Political science research methods. Washington, DC: CQ Press.
Judd, C. M., Smith, E. R., & Kidder, L. H. (1991). Research methods in social relations. Orlando, FL: Harcourt Brace Jovanovich.
Lavrakas, P. J. (2013). Presidential address: Applying a total error perspective for improving research quality in the social, behavioral, and marketing sciences. Public Opinion Quarterly, 77(3), 831-850. DOI: 10.1093/poq/nft033
Liu, T. W., & Chen, K. H. (2004, September). Data quality of the Taiwan’s election and democratization study: Examination of retest and interviewer assessment. Presented at the International Conference of the 2003 Taiwan’s Election and Democratization, Taipei, Taiwan.
Maxim, P. S. (1999). Quantitative research methods in the social sciences. Oxford, England: Oxford University Press.
Olson, K. (2006). Survey participation, nonresponse bias, measurement error bias, and total bias. Public Opinion Quarterly, 70(5), 737-758. DOI: 10.1093/poq/nfl038
Peytchev, A., Peytcheva, E., & Groves, R. M. (2010). Measurement error, unit nonresponse, and self-reports of abortion experiences. Public Opinion Quarterly, 74(4), 319-327. DOI: 10.1093/poq/nfq002
Shiao, Y. C. (2006). “Tai Wan Xuan Ju Yu Min Zhu Hua Diao Zha” Zai Ce Xin Du Zhi Fen Xi [Analysis of test-retest reliability in Taiwan's Election and Democratization Study]. Xuan Ju Yan Jiu [Journal of Electoral Studies], 13(2), 117-144.
Sniderman, P. M., Brody, R. A., & Tetlock, P. E. (1991). Reasoning and choice. Cambridge, England: Cambridge University Press.
Tourangeau, R., Groves, R. M., & Redline, C. D. (2010). Sensitive topics and reluctant respondents: Demonstrating a link between nonresponse bias and measurement error. Public Opinion Quarterly, 74(3), 413-432. DOI: 10.1093/poq/nfq004
Zaller, J. (1992). The nature and origins of mass opinion. Cambridge, England: Cambridge University Press.