On the use and misuse of online surveys in social science research and policymaking
Online polls have become a convenient and popular way to gather data. Online survey forms are easy to create with free applications such as Google Forms and SurveyMonkey. They do not require a sampling frame, which is almost impossible to obtain anyway. The internet’s blanket of anonymity and the absence of a human interviewer, as in grievance or feedback forms, may also encourage respondents to be more vocal about their opinions. Online surveys can reach unique and hesitant populations on more sensitive topics such as health, sexuality, and religion (Wright, 2006). For these and other reasons, online surveys are more widespread in market research, where technical assumptions, such as the representativeness of the sample, can be relaxed at times.
But self-administered online methods remain unpopular, and are often advised against, in larger-scale social science research. This is arguably right, as prudence should be exercised when gathering, analyzing, and interpreting data collected online.
There is no doubt that online surveys, however cheap, are no match for face-to-face household interviews in terms of data quality. Unlike online surveys, personal interviews can ensure sample representativeness when one applies established sample selection methods, such as random, stratified, or cluster sampling. Experienced enumerators can assess whether interviewees are ambivalent or already experiencing fatigue. They can also gauge a household’s social standing through quick visual inspection of the house’s construction materials. Moreover, researchers can measure the non-response rate and follow up with the respondents.
However, there may be instances, including a lockdown, when personal interviews are not possible, so that social researchers and policy workers have no better choice than to resort to online data-gathering. The National Economic and Development Authority (NEDA), for instance, conducted an online survey to gauge consumer and business confidence, in order to aid in developing policies for the so-called new normal. The Department of Education (DepEd) disseminated an online poll primarily to assess the possibility of opening classes in August. There are also a number of online polls by private firms that measure Filipinos’ “satisfaction” with the government’s responses to the pandemic. The results are unfortunately cited in the media despite obvious lapses in the methodology and survey instrument.
Anyone can do a social survey. A Facebook page or Twitter “influencer” can gain 10,000 responses to a social media poll. A private consultancy firm can be paid millions to gather data from a closed circle of like-minded individuals and report that most Filipinos support the government’s policies.
More often than not, organizations design surveys to fit their beliefs, to confirm their biases, and to shape public opinion. But it takes serious research and contemplation to formulate the right questions under a careful framework, to ensure that the questions are objective (or close to it) and not leading, to set protocols for proper online data collection, and ultimately, to acknowledge and be ready to report the biases and severe limitations of the data at hand, which include:
High selection bias and exclusion error. There are no timely official statistics on the state of internet coverage in the Philippines, but a survey by the Social Weather Stations in the first quarter of 2019 found that less than half (46%) of adult Filipinos are internet users, with Manila topping the list (64%), followed by Balance Luzon (48%), Mindanao (39%), and Visayas (34%). As expected, internet use is skewed toward younger people and those with higher educational attainment.
This is the primary limitation of any online survey, especially in social science research, where social class is closely related to so many other social variables. Most responses in an online poll can be expected to come from urban, young, and wealthier demographics. The poll totally excludes those without access to the internet, who comprise the majority of the population.
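The size of the resulting exclusion error can be illustrated with simple arithmetic. The following is a minimal sketch, not a method taken from any of the surveys discussed here: the function name `coverage_bias` and the two approval rates are hypothetical, and only the 46% coverage figure comes from the Social Weather Stations survey cited above. It assumes the population splits cleanly into a group that can be reached online and a group that cannot.

```python
def coverage_bias(coverage, mean_covered, mean_excluded):
    """Bias of an estimate that only reaches the covered group.

    coverage: share of the population that can be reached (0 to 1).
    Returns (population mean, bias of the covered-only estimate).
    """
    if not 0.0 <= coverage <= 1.0:
        raise ValueError("coverage must be between 0 and 1")
    population_mean = coverage * mean_covered + (1 - coverage) * mean_excluded
    # Algebraically equal to (1 - coverage) * (mean_covered - mean_excluded).
    bias = mean_covered - population_mean
    return population_mean, bias


# 0.46 is the internet-use share cited in the text. The two approval
# rates are HYPOTHETICAL values chosen only for illustration.
population_mean, bias = coverage_bias(0.46, mean_covered=0.70, mean_excluded=0.40)
print(f"online-only estimate: 0.700, population value: {population_mean:.3f}, "
      f"bias: {bias:+.3f}")
```

Under these made-up numbers, an online poll would report 70% approval when the population value is about 54%. The gap grows with both the excluded share and the difference between the two groups, and collecting more online responses does nothing to shrink it.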
If the NEDA conducted a survey among businessmen, in no way could it draw sufficient inferences about small- and micro-entrepreneurs. If the DepEd circulated an online survey to teachers, in no way could it claim that most teachers have access to the internet and can hence hold online classes. If some survey firms found that most of their online respondents favor the government’s actions, in no way could this be claimed to be true among the poor, who have lost access to their livelihoods, public transportation, and healthcare.
When it comes to policymaking, one can still conduct an online survey, but strictly for internal use such as rapid assessment. Where exclusion error is high, its results should never be cited as the only reference for policymaking. I have come across a survey by a youth advocacy organization for good governance which, instead of releasing the actual results, only listed the common grievances of its respondents. This is one responsible use of online data.
At any rate, the Philippine Statistical Act of 2013 requires agencies to have their survey forms examined and approved by the Statistical Survey Review and Clearance System (SSRCS) before official dissemination and publication.
Self-selection bias. Online polls generally belong to a class of non-probability sampling methods called convenience sampling. True to the term, convenience sampling does not require a sampling frame, and the primary selection criterion is the ease of getting a sample (Lavrakas, 2008). For this reason, convenience samples such as online respondents are not “scientific samples” (ibid.).
An important concern in online convenience sampling is that responses tend to come from individuals with an interest in or strong opinion about the research topic. This is referred to as response propensity, the “correlation between the survey variables of interest and people’s likelihood of participating in the survey” (Groves in ibid.). One way to lessen self-selection bias toward a certain demographic or interest group is to introduce quotas or “controls”, in which case the researcher sets a minimum number or an ideal proportion of respondents by characteristics such as sex, age, and education (see Moser, 1952). But even quota sampling does not address the problem of systematic exclusion, nor does it ensure representativeness within quotas.
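Quota controls of this kind can be sketched in a few lines. This is an illustrative sketch only: the helper names `quota_targets` and `quota_shortfalls`, the age groups, and all the numbers are hypothetical, and a real study would take its population shares from census or official statistics.

```python
from collections import Counter


def quota_targets(sample_size, population_shares):
    """Turn population shares into per-group respondent targets.

    Targets are rounded, so they may not sum exactly to sample_size.
    """
    if abs(sum(population_shares.values()) - 1.0) > 1e-9:
        raise ValueError("population shares must sum to 1")
    return {group: round(sample_size * share)
            for group, share in population_shares.items()}


def quota_shortfalls(respondent_groups, targets):
    """Respondents still needed per group (0 once a quota is met)."""
    counts = Counter(respondent_groups)
    return {group: max(0, target - counts[group])
            for group, target in targets.items()}


# HYPOTHETICAL age shares and responses, for illustration only.
shares = {"18-34": 0.40, "35-54": 0.35, "55+": 0.25}
targets = quota_targets(200, shares)
responses = ["18-34"] * 120 + ["35-54"] * 50 + ["55+"] * 10
shortfalls = quota_shortfalls(responses, targets)
print(targets)
print(shortfalls)
```

In this made-up case the youngest group is oversubscribed while the oldest is 40 respondents short, which is the typical pattern for an online poll. Filling the quota balances the age groups on paper, but the older respondents who do answer online may still differ from those who are offline.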
Immeasurable non-response rate. Non-response is an important element in surveys, but it is not widely understood and is hence often not reported. Lavrakas (2008) defines non-response as the case in which people or households are sampled but no data are gathered from them. A common instance of this is the respondent’s flat-out refusal to be interviewed or to answer the survey. Sometimes, a sampled respondent cannot be reached. Another instance is when limitations beyond the control of the researchers hinder the conduct of the interview, as with people with physical or mental disabilities or, in connection with the point above, those who do not have access to the internet.
In any case, there is always a degree of non-response in a survey, but non-response can sometimes be systematic. Analyzing 600,000 opinion surveys, Reyes (2016) found that non-response rates are higher among younger, less educated, single, and poorer populations. The problem with online polls, and with convenience sampling in general, is that the non-response rate cannot be measured at all, even though response propensity may be high for some groups, as the preceding point explained.
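Why the rate cannot be measured becomes clear once it is written as a calculation. The sketch below uses a deliberately simplified definition (one minus completed responses over sampled units), and the function name and counts are hypothetical. Established standards distinguish several response-rate formulas that this sketch does not attempt to reproduce.

```python
from typing import Optional


def nonresponse_rate(completed: int, sampled: Optional[int]) -> Optional[float]:
    """Share of sampled units that yielded no data.

    Returns None when the number of sampled units is unknown, as in an
    open online poll where nobody was drawn from a sampling frame.
    """
    if sampled is None:
        return None
    if sampled <= 0 or not 0 <= completed <= sampled:
        raise ValueError("need sampled > 0 and 0 <= completed <= sampled")
    return 1 - completed / sampled


# HYPOTHETICAL counts, for illustration only.
household_survey = nonresponse_rate(completed=1020, sampled=1200)
online_poll = nonresponse_rate(completed=10000, sampled=None)
print(household_survey)  # roughly 0.15
print(online_poll)       # None: there is no denominator to divide by
```

The online poll has far more responses, yet it is the household survey that can say something about who did not answer.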
Response validity. Because the internet can provide a blanket of anonymity, online surveys are more common for perception surveys. This, however, introduces several issues, primarily concerning the validity of the responses. Respondents may answer an online survey haphazardly, perhaps out of respondent fatigue, or submit repeated responses using a different e-mail address or social media account. If a government agency conducts or sponsors a survey, responses may also vary according to the respondents’ perception of the office: some may use the survey as a means to express their grievances, while others may give socially desirable answers in the belief that they could be selected as program beneficiaries. There is no commonly accessible way of ascertaining the validity of online responses, so mechanisms must be in place to detect discrepancies in the data (person-fit analysis in item response theory is one possible way).
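A much simpler screen than person-fit analysis can catch the crudest problems. The sketch below is not person-fit analysis. It only flags two patterns, the same answer given to every item ("straightlining") and answer patterns shared by more than one respondent. The function name, the respondent ids, and the data are all hypothetical.

```python
from collections import Counter


def flag_suspect_responses(responses):
    """Flag straight-lined and duplicated answer patterns.

    responses: dict mapping a respondent id to a tuple of item answers.
    Returns a dict mapping flagged respondent ids to a list of reasons.
    """
    pattern_counts = Counter(responses.values())
    flags = {}
    for respondent_id, answers in responses.items():
        reasons = []
        # Same answer on every item of a multi-item instrument.
        if len(answers) > 1 and len(set(answers)) == 1:
            reasons.append("straightlining")
        # Identical answer pattern submitted more than once.
        if pattern_counts[answers] > 1:
            reasons.append("duplicate pattern")
        if reasons:
            flags[respondent_id] = reasons
    return flags


# HYPOTHETICAL answers to five items on a 1-5 scale.
responses = {
    "r1": (4, 2, 5, 3, 4),
    "r2": (3, 3, 3, 3, 3),
    "r3": (1, 5, 2, 4, 2),
    "r4": (1, 5, 2, 4, 2),
}
flags = flag_suspect_responses(responses)
print(flags)
```

A flag is a reason to look closer, not proof of an invalid response. Identical patterns can occur honestly, especially on short instruments, and a careful respondent may genuinely hold the same view on every item.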
Moreover, perception surveys are very sensitive to the phrasing of the questions and the response categories. In its pandemic-related survey, the Gallup International Association (not to be confused with the US-based survey firm Gallup) asked respondents whether they are “willing to sacrifice their human rights until the threat from COVID-19 has gone” and whether “democracy is effective in the crisis.” Surely, loaded abstract concepts such as human rights and democracy cannot be adequately captured by a question or two.
How to conduct and properly use online surveys is an evolving research subject. The point is not to “gatekeep” the practice of surveying and asking questions, but to ensure that the data, from collection to presentation, adhere to the fundamental principles of survey science. Ultimately, good research is transparent about its methodology, reflexive about its own assumptions, and candid about the limitations of its data.
As our professors in the UP Population Institute always tell us, our analysis is only as good as the data we have.
Creswell, J. (2003). Research Design: Qualitative, Quantitative, and Mixed Methods Approaches (2nd ed.). SAGE.
Lavrakas, P. (2008). Encyclopedia of Survey Research Methods. SAGE.
Moser, C. (1952). Quota Sampling. Journal of the Royal Statistical Society, Series A (General), 115(3), 411–423. Wiley.
Reyes, G. (2016). Understanding non response rates: insights from 600,000 opinion surveys. World Bank Paper. http://pubdocs.worldbank.org/en/708511466183857404/paper-reyes.pdf
Wright, K. (2006). Researching Internet‐Based Populations: Advantages and Disadvantages of Online Survey Research, Online Questionnaire Authoring Software Packages, and Web Survey Services. Journal of Computer-Mediated Communication, 10(3). Wiley.