Four challenges working with ordinal data
Tips for using & interpreting subjective wellbeing & similar survey data
Economists and other social scientists often work with ordinal data, that is, categorical data with natural, ordered categories where the distances between the categories are not known.[1]
Macroeconomists have long used data obtained from business surveys (e.g. NZIER's Quarterly Survey of Business Opinion). Other researchers use qualitative data from surveys such as Stats NZ's General Social Survey (NZGSS). Data from NZGSS enable us to examine many interesting questions that relate to an economist's notion of utility (or welfare or wellbeing).
Questions in such surveys are often based on a Likert (or an end-coded) scale with multiple response options. An example is the NZGSS "life satisfaction" question:
First of all, I am going to ask you a very general question about your life as a whole these days. This includes all areas of your life. Looking at the showcard below, where zero is completely dissatisfied, and ten is completely satisfied, how do you feel about your life as a whole?
Answer: 0 (completely dissatisfied), 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 (completely satisfied).
I'll use this example throughout this post.
Analysts may wish to provide descriptive statistics of the data (potentially for differing groups and/or across time) and may wish to use the responses either as an explanatory variable or as a dependent variable in a regression. However, there are some challenges in doing so. Stephen Jenkins (2020) discussed several of these in his New Zealand Economic Papers article; this post is a brief primer for those who may be embarking on such analyses.
Four challenges
Several challenges arise when we wish to use ordinal data, even as descriptive statistics. These are illustrated below using the life satisfaction question.
1. When using cross-sectional ordinal data, we must assume that survey respondents (both within and across countries, if relevant) have a common interpretation of the life satisfaction scale, i.e. that they have a "common reporting function". In addition, if we use ordinal data as a time series (even for the same person), we must assume that respondents have a consistent interpretation of life satisfaction across time.
2. We must decide whether to treat the data purely as ordered data (i.e. 7>6, 6>5, etc.), or as cardinal data, which adds the assumption that the effect on life satisfaction of an increase from a survey response of, say, 5 to 6 is equal to the effect of a shift from 6 to 7, etc.
3. Related to the previous challenge, we must consider how to treat people who answer at either end of the scale. A person who answers 10 can go no higher, while someone who answers 0 can go no lower.
4. If we are interested in inequality of wellbeing, and if we treat the data as ordinal, how can we measure inequality?
Some solutions
1. Common reporting functions
Whether a common reporting function exists across people is difficult to verify, or to refute. Cantril (1965) showed that, both within and across cultures, people have some shared notion of what a good life should entail and how their life ranks against this notion. A wide body of studies shows that people tend to report higher life satisfaction if they have: high incomes, good health, a good family environment and good friends. People who are unemployed, recently divorced, have mental health problems, or who live in countries with poor human rights report lower levels of life satisfaction.
Consistent with the importance of income and human rights for wellbeing, the Gallup World Poll ranks mean life satisfaction (across 146 countries) as being highest in Finland, Denmark and Iceland, and lowest in Afghanistan, Lebanon and Zimbabwe. These data are reassuringly consistent with intuitive notions of where a typical person might and might not lead a good life, consistent with the assumption of a common reporting function.
When we use ordinal data from a longitudinal panel, we can relax the assumption of a common base level in the reporting function of different people by looking at changes in responses for the same individual across time (or through inclusion of individual fixed effects in a regression). However, to do so requires the assumption of consistent reporting over time. This assumption is in keeping with the common assumption in economics that individuals have stable preferences, so might be considered reasonably innocuous.
2. Ordinal or cardinal?
More contentious is whether we should treat the data as ordinal or as cardinal. Ordinal data can be presented using a histogram or cumulative distribution function (cdf), each of which respects the purely ordered nature of responses. One advantage of presenting data using the cdf is that we can compare the cdfs of multiple groups. If we have no information about the individuals other than their group, a comparison of the cdfs can, in certain cases, allow us to observe that one set of outcomes is superior to the other.
For instance, in the accompanying figure, I have used (unweighted) data from the Stats NZ 2018 NZGSS SURF (synthetic unit record file, commissioned by Kate Prickett) to compare the life satisfaction of adults who are partnered (c.d.f. of 1) versus the life satisfaction of adults who are not partnered (c.d.f. of 0).[2]
The cdf for non-partnered people is everywhere above that of partnered people; in other words, there is a higher proportion of unhappy non-partnered people than of partnered people, no matter how we define "unhappy" (e.g. 0, or 0-1, 0-2, ... or even 0-9). Using the terminology of Allison & Foster (2004), the partnered distribution F-dominates (first order dominates) the non-partnered distribution, so exhibits superior outcomes. (NB: I am not attributing any causality to this result!) The conclusion here is reached without assuming cardinality (provided we assume a common reporting function). If, however, the two cdfs had crossed, we could not unequivocally say that one group is happier than the other.
The observations above carry through to the use of other percentile-based descriptives of the data (e.g. median, upper & lower quartiles, etc.), which are appropriate to use even when cardinality is not assumed. Note, however, that in the absence of a cardinality assumption, we cannot use the mean as a descriptive statistic. Hence the common practice of reporting the mean of Likert scale data (as in the cross-country comparison above) implicitly assumes that the series is cardinal.
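The dominance comparison just described can be sketched in a few lines. This is a minimal illustration only: the response counts are hypothetical (they are not the NZGSS SURF figures), and the function names `cdf` and `f_dominates` are my own labels rather than anything from Allison & Foster.

```python
from fractions import Fraction
from itertools import accumulate

def cdf(counts):
    """Cumulative shares for response counts ordered from category 0 upward."""
    total = sum(counts)
    return [Fraction(c, total) for c in accumulate(counts)]

def f_dominates(counts_a, counts_b):
    """True if A's cdf is never above B's and is strictly below it somewhere."""
    cdf_a, cdf_b = cdf(counts_a), cdf(counts_b)
    never_above = all(a <= b for a, b in zip(cdf_a, cdf_b))
    somewhere_below = any(a < b for a, b in zip(cdf_a, cdf_b))
    return never_above and somewhere_below

# Hypothetical counts of responses 0..10 (illustrative, not survey figures).
partnered     = [1, 1, 2, 3, 5, 10, 12, 25, 40, 25, 26]
non_partnered = [3, 2, 4, 6, 8, 15, 14, 25, 30, 15, 13]
# A polarised group whose cdf crosses that of the partnered group.
polarised     = [10, 0, 0, 0, 0, 0, 0, 0, 0, 0, 125]

print(f_dominates(partnered, non_partnered))   # dominance in one direction
print(f_dominates(partnered, polarised))       # cdfs cross: no dominance
print(f_dominates(polarised, partnered))       # ... in either direction
```

Exact fractions are used so that ties between cumulative shares are not disturbed by floating-point rounding. When neither direction returns `True`, the cdfs cross and no unambiguous ranking is available without further assumptions.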
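To see why the mean needs cardinality while the median does not, consider relabelling the categories with any order-preserving transformation. The sketch below uses two hypothetical groups of ten respondents and an arbitrary relabelling (2 to the power of the response); both are assumptions chosen purely for illustration.

```python
from statistics import mean, median_low

# Hypothetical responses on the 0-10 scale (illustrative only).
group_a = [7] * 10               # everyone answers 7
group_b = [5] * 6 + [9] * 4      # six answer 5, four answer 9

def relabel(responses):
    """An arbitrary order-preserving relabelling of the categories."""
    return [2 ** r for r in responses]

# Ranking by the mean depends on the labels attached to the categories.
print(mean(group_a) > mean(group_b))                      # A ranks higher on the raw labels
print(mean(relabel(group_a)) > mean(relabel(group_b)))    # B ranks higher after relabelling

# Ranking by the median is unchanged by the relabelling.
# median_low returns an observed category rather than averaging two categories.
print(median_low(group_a) > median_low(group_b))
print(median_low(relabel(group_a)) > median_low(relabel(group_b)))
```

Because the relabelling keeps the order of the categories intact, any statistic whose ranking changes under it is relying on more than the ordering.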
Take care with regression analyses
In regression analysis, treating the data as ordered (and not cardinal) is straightforward when life satisfaction is used as an explanatory variable (e.g. does productivity increase if workers are "happy"?). We simply enter each category of life satisfaction as a separate dummy variable (with one category omitted as a base). Usefully, we can then test whether the series can be treated as cardinal using a Wald test on the coefficients (i.e. test whether the coefficients increase linearly, as they would if the series were cardinal and if the relationship in question were linear).
More complications arise when the ordinal series is the dependent variable. An OLS (or IV) equation is normally predicated on the basis that the dependent variable is cardinal. In practice, life satisfaction (like many other ordinal variables) is often treated as if it were cardinal and consequently used as the dependent variable in an OLS regression. Many studies (the most famous being Ferrer-i-Carbonell & Frijters, 2004) indicate that use of ordered logit estimation (which treats the data as ordered, but not cardinal) generally produces very similar results to the use of OLS.
In an influential article, Bond & Lang (2019) show, however, that even ordered logit estimation imposes assumptions on the distribution of the data, which may not necessarily hold. They demonstrate the possibility that ordered probit findings can be reversed by lognormal transformations of the underlying data. The extreme results that may potentially arise from the Bond & Lang critique are far-fetched, mirroring the White Queen's declaration in Through the Looking-Glass that "sometimes I've believed as many as six impossible things before breakfast". Nevertheless, the critique has led to the development of methods to test the robustness of regressions involving an ordinal dependent variable. One (valid) approach is to reinterpret ordered logit results as being applicable to the median rather than to the mean (Chen et al., 2022).
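A minimal sketch of the dummy-variable regression and the Wald test of linearity follows. It assumes a hypothetical five-category satisfaction variable, a made-up outcome, and classical (homoskedastic) OLS standard errors; the function name `wald_linearity` and the data are illustrative and not taken from any study. The noise is deliberately balanced within each category so that the example is fully deterministic.

```python
import numpy as np

def wald_linearity(s, y, n_categories):
    """Wald statistic for H0: dummy coefficients rise linearly with the category."""
    s = np.asarray(s)
    y = np.asarray(y, dtype=float)
    # Intercept plus one dummy per category, omitting category 0 as the base.
    dummies = (s[:, None] == np.arange(1, n_categories)).astype(float)
    X = np.column_stack([np.ones(len(s)), dummies])
    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ y
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    cov = sigma2 * xtx_inv                      # classical OLS covariance matrix
    # Restrictions: d_k - k * d_1 = 0 for k = 2, ..., K-1 (K-2 restrictions).
    R = np.zeros((n_categories - 2, X.shape[1]))
    for row, k in enumerate(range(2, n_categories)):
        R[row, 1] = -k                          # column 1 holds the dummy for category 1
        R[row, k] = 1.0                         # column k holds the dummy for category k
    rb = R @ beta
    return float(rb @ np.linalg.solve(R @ cov @ R.T, rb))

K = 5
s = np.repeat(np.arange(K), 8)                  # 8 hypothetical workers per category
noise = np.tile([-1.0, 1.0], 4 * K)             # balanced within each category
y_linear = 2.0 + 0.5 * s + noise                # outcome rises linearly in the category
y_step = np.where(s == 4, 6.0, 2.0) + noise     # outcome jumps only in the top category

CHI2_CRIT_3DF_5PCT = 7.815                      # 5% critical value, chi-square with 3 df
w_linear = wald_linearity(s, y_linear, K)
w_step = wald_linearity(s, y_step, K)
print(w_linear < CHI2_CRIT_3DF_5PCT)            # linearity not rejected
print(w_step > CHI2_CRIT_3DF_5PCT)              # linearity rejected
```

In applied work one would normally use a regression package with robust standard errors rather than hand-rolled matrix algebra; the point here is only to make the restrictions being tested explicit.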
Another approach, due to Bloem & Oswald (2022), is to split the sample at the median of life satisfaction to give a binary variable, and then to estimate the equation of interest as a simple probit or logit. The researcher can then check whether the relationship of interest survives this simplification of the data which imposes no distributional assumptions other than ordering. A more detailed approach, which is akin to using generalised ordered logit estimation, is to break the sample into two groups separately at each step of the scale and estimate the corresponding regressions for each of the binary variables.
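The splitting idea can be sketched for the simplest possible case, in which the only regressor is a binary group indicator. In that special case the logit slope equals the log odds ratio of the resulting 2x2 table, so no optimiser is needed; with continuous or multiple regressors one would instead fit a logit or probit at each split. The counts are hypothetical, and defining the binary outcome as "response at or above the cut" is my assumption about how to code the split.

```python
import math
from itertools import accumulate

# Hypothetical counts of responses 0..10 (illustrative, not survey figures).
partnered     = [1, 1, 2, 3, 5, 10, 12, 25, 40, 25, 26]
non_partnered = [3, 2, 4, 6, 8, 15, 14, 25, 30, 15, 13]

def median_category(counts):
    """Lowest category at which the cumulative share reaches one half."""
    total = sum(counts)
    for category, cum in enumerate(accumulate(counts)):
        if 2 * cum >= total:
            return category

def log_odds_ratio(counts_1, counts_0, cut):
    """Logit slope on a group dummy when the outcome is 'response >= cut'."""
    hi_1, lo_1 = sum(counts_1[cut:]), sum(counts_1[:cut])
    hi_0, lo_0 = sum(counts_0[cut:]), sum(counts_0[:cut])
    return math.log((hi_1 * lo_0) / (lo_1 * hi_0))

# Single split at the median of the pooled sample.
pooled = [a + b for a, b in zip(partnered, non_partnered)]
median_cut = median_category(pooled)
print(median_cut, round(log_odds_ratio(partnered, non_partnered, median_cut), 3))

# More detailed version: one binary split at every step of the scale.
slopes = {cut: log_odds_ratio(partnered, non_partnered, cut) for cut in range(1, 11)}
for cut, slope in slopes.items():
    print(f"response >= {cut:>2}: log odds ratio = {slope:.3f}")
```

If the sign of the relationship is the same at every split, the qualitative finding does not hinge on how the ordered categories are spaced.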
As an example of the above, I have used the same SURF as before to estimate regressions in which life satisfaction is regressed on a measure of material wellbeing (MWB) which reflects the household's access to a range of material necessities, measured on a 0-20 scale. The estimated OLS coefficient on MWB in this simple regression is 0.126 (p<0.001) while the odds ratio from an ordered logit regression is 1.144 (p<0.001). However, a Brant test rejects the assumption of parallel odds (or parallel lines) for the ordered logit. When estimating a generalised ordered logit specification, the odds ratios vary between 1.068 (between steps 9 & 10) and 1.236 (between steps 0 & 1), all significant at p<0.05.[3]
Thus, in this example, while the assumptions underpinning the ordered logit (and, implicitly, the OLS) regression are rejected, the qualitative finding of a positive relationship between material wellbeing and life satisfaction is retained in the generalised ordered logit, and we learn something extra through inspecting the pattern of the generalised ordered logit estimates. The key lesson from this discussion is that we have techniques to deal with the ordinal nature of the data, even when an ordinal variable is adopted as the dependent variable in a regression analysis.
3. Top-coded and bottom-coded responses
The treatment of top-coded (e.g. life satisfaction=10) and bottom-coded (e.g. life satisfaction=0) responses, i.e. censored responses, depends on the matter at hand. When presenting or comparing cross-sectional data for descriptive purposes, there is little need to be concerned with people recording their wellbeing at these end points.
More challenging is when we use the data in regression analysis. It is (always!) useful to plot a histogram of the observed distribution before estimation to check whether many of the observations fall in the top or bottom categories. If "few" observations are in these categories, one might choose to ignore the censoring. Even in these cases, however, one might check robustness of results to a sub-sample which drops the end-coded responses. If there are more than a "few" end-coded responses, one may choose to use an estimation approach suitable for a censored regression model such as Tobit estimation.
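A quick pre-estimation check along these lines might look like the following sketch. The counts are hypothetical, and what counts as "few" end-coded responses remains a judgement call that the code does not make for you.

```python
# Hypothetical counts of responses 0..10 (illustrative, not survey figures).
counts = [3, 2, 4, 6, 8, 15, 14, 25, 30, 15, 13]

def end_coded_shares(counts):
    """Return the shares of responses in the bottom and top categories."""
    total = sum(counts)
    return counts[0] / total, counts[-1] / total

# Text histogram: one '#' per respondent in each category.
for category, n in enumerate(counts):
    print(f"{category:>2} | {'#' * n} ({n})")

bottom_share, top_share = end_coded_shares(counts)
print(f"bottom-coded: {bottom_share:.1%}, top-coded: {top_share:.1%}")

# Robustness sub-sample: drop the end-coded categories and keep responses 1..9.
interior_counts = counts[1:-1]
print(f"observations retained: {sum(interior_counts)} of {sum(counts)}")
```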
Potentially more problematic is the use of end-coded data in longitudinal analysis since the respondent who initially entered 0 or 10 can only move in one direction in the following survey wave. Again, use of robustness testing (e.g. dropping individuals who have an end-coded response in anything but the final wave) or applying censored regression techniques to panel data can be adopted. In practice, while high prevalence of end-coding poses a potential challenge, having only a small proportion of observations in the end-categories means that it does not normally pose an insurmountable problem.
4. Measures of inequality
Perhaps the greatest challenge when using ordinal data arises when we wish to derive a measure of inequality (e.g. inequality of life satisfaction).
Many common measures of inequality (e.g. standard deviation, Gini coefficient, Theil index) assume cardinality of the variable under consideration. Allison & Foster suggest that we can compare inequality in two separate distributions by looking at how closely responses are bunched around the median, but their claim also implicitly rests on an assumption of cardinality. At least one measure suited to ordinal data has been developed (Cowell & Flachaire, 2017), but has yet to be widely adopted.
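The dependence of a conventional inequality measure on cardinality can be illustrated with the standard deviation. In this sketch the two groups and the order-preserving relabelling (2 to the power of the response) are hypothetical choices made only to show that the inequality ranking can reverse.

```python
from statistics import pstdev

# Hypothetical responses on the 0-10 scale (illustrative only).
group_a = [2] * 5 + [6] * 5      # split between 2 and 6
group_b = [8] * 5 + [10] * 5     # split between 8 and 10

def relabel(responses):
    """An arbitrary order-preserving relabelling of the categories."""
    return [2 ** r for r in responses]

# On the raw labels, group A looks more unequal than group B.
print(pstdev(group_a) > pstdev(group_b))

# After relabelling, group B looks more unequal than group A.
print(pstdev(relabel(group_a)) > pstdev(relabel(group_b)))
```

Since both labellings respect the same ordering of responses, the ordinal information alone cannot tell us which group is more unequal by this measure.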
In my recent work with two co-authors (Grimes et al., 2023), we showed that estimates of the relationship between individual life satisfaction and inequality of life satisfaction within the person's country are highly sensitive to the inequality measure that is chosen. (It turns out that skewness also appears to be important, but that is another story!) Development of suitable measures of inequality for ordinal variables is an active area of research internationally, and practitioners should be extremely cautious in how they employ different inequality measures relating to ordinal variables. In this case, robustness testing is perhaps more important than for any of the other challenges outlined above.
Some rewards
I have written about some of the challenges in using ordinal data and some possible solutions. But the existence of challenges does not mean that the data cannot or should not be used. The rewards of using survey-based ordinal data can be huge. We can address important issues that are unlikely to be addressed using more "conventional" economic data.
For instance, conventional approaches enable us to explain unemployment and inflation, but how do we know how the relative outcomes for these variables affect the wellbeing of the populace? This question was addressed in one of the most famous papers in the field by Robert MacCulloch (now at University of Auckland) and his co-authors (Di Tella et al., 2001).
Or we can analyse whether there are persistent effects of certain shocks on people's wellbeing (Clark et al., 2008). Here we find that, unlike many other shocks that hit people, becoming and staying unemployed has a long-term deleterious effect on the person's wellbeing. We can analyse the effects of Covid lockdowns on wellbeing and see that while an initial lockdown may have a positive effect, a second lockdown has an unambiguously negative impact on life satisfaction (Grimes, 2022). We can analyse how human rights and corruption affect wellbeing (Layard et al., 2012) and examine who benefits most from free speech (Voerman-Tam et al., forthcoming).
Each of these analyses uses life satisfaction as a proxy for people's evaluative subjective wellbeing. The richness of the topics that can be explored with data such as these more than compensates for the challenges faced in ensuring that the analyses are both meaningful and rigorous.
Professor of Wellbeing and Public Policy, School of Government, Victoria University of Wellington & Senior Fellow, Motu Research.
References
Allison R, Foster J. 2004. Measuring health inequality using qualitative data. Journal of Health Economics, 23, 505-524.
Bloem R, Oswald A. 2022. The analysis of human feelings: a practical suggestion for a robustness test. Review of Income and Wealth, 68(3), 689-710.
Bond T, Lang K. 2019. The Sad Truth about Happiness Scales. Journal of Political Economy, 127(4), 1629-40.
Cantril H. 1965. The Pattern of Human Concern. New Brunswick: Rutgers University Press.
Chen L-Y, Oparina E, Powdthavee N, Srisuma S. 2022. Robust ranking of happiness outcomes: A median regression perspective. Journal of Economic Behavior & Organization, 200, 672-686.
Clark A, Diener E, Georgellis Y, Lucas R. 2008. Lags and leads in life satisfaction: A test of the baseline hypothesis. Economic Journal, 118, F222-F243.
Cowell F, Flachaire E. 2017. Inequality with ordinal data. Economica, 84, 290-321.
Di Tella R, MacCulloch R, Oswald A. 2001. Preferences over inflation and unemployment: Evidence from surveys of happiness. American Economic Review, 91(1), 335-341.
Ferrer-i-Carbonell A, Frijters P. 2004. How important is methodology for the estimates of the determinants of happiness? Economic Journal, 114, 641-659.
Grimes A. 2022. Measuring pandemic and lockdown impacts on wellbeing. Review of Income and Wealth, 68(2), 409-427.
Grimes A, Jenkins S, Tranquilli F. 2023. The relationship between subjective wellbeing and subjective wellbeing inequality: Taking ordinality and skewness seriously. Journal of Happiness Studies, 24, 309-330.
Helliwell J, Layard R, Sachs J, De Neve J-E, Aknin L, Wang S. 2022. World Happiness Report. Sustainable Development Solutions Network, New York.
Jenkins S. 2020. Better off? Distributional comparisons for ordinal data about personal well-being. New Zealand Economic Papers, 54(3), 211-238.
Layard R, Clark A, Senik C. 2012. The causes of happiness and misery. In: Helliwell J, Layard R, Sachs J (Eds.), World Happiness Report 2012, 58-89, Columbia Earth Institute, New York.
Voerman-Tam D, Grimes A, Watson N. Forthcoming. The economics of free speech: Subjective wellbeing and empowerment of marginalized citizens. Journal of Economic Behavior and Organization, https://doi.org/10.1016/j.jebo.2023.05.047.
[1] Ordinal data can be distinguished from cardinal and nominal data. Cardinal data categories are ordered, like ordinal data, but the distances between categories are fixed (e.g. the number of children in a family). Nominal data categories are unordered (e.g. postcodes).
[2] The SURF is publicly available from http://www.kateprickett.com/teaching.
[3] Details are available from the author.