Psychometrics as a solution to many traditional problems of applied welfare economics, a thesis precis
Unlike most of the posts I make on this blog, I’m not sending this one out via email because it is likely to be very dense to people without a background in philosophy, economics and/or the psychology of subjective wellbeing. I’m putting it up here to get feedback, so please go ahead!
This is a thesis about what might broadly be called applied welfare economics - a phrase some might contend is, or ought to be, a tautology, but which in this thesis refers to welfare economics intended to be deployed, in a very direct way, to shape policy. The paradigm case of applied welfare economics is probably cost-benefit analysis, but other cases (some of which may wholly or partly overlap) include designing tax systems, designing institutions and the economic analysis of law.
Many criticisms have been made of the various approaches that make up new welfare economics, but these criticisms have sometimes languished for want of an alternative paradigm. I aim to address that gap in the area of applied welfare economics. In this thesis, I claim to locate an alternative paradigm that already exists: the study of subjective wellbeing applied in the context of economic policy. Such research already exists, but my purpose is to defend the idea that this study of the relationship between subjective wellbeing and varieties of economic policy can address exactly those needs which welfare economics was created to address, while at the same time offering a resolution to many of the traditional antinomies of welfare economics. I call this approach the psychometric approach. For clarity, my proposition is not that this is the only possible approach to welfare economics that addresses these problems, but simply that it can be applied well to both the traditional uses and the traditional difficulties of welfare economics, and is thus a promising paradigm. Key authors upon which this argument is built include Ng, Alexandrova and Haybron.
I begin with two expository chapters:
An initial chapter on traditional criticisms of welfare economics
In this chapter, I will consider traditional criticisms of new welfare economics- especially in the area of applied welfare economics. These criticisms include its formalistic character, often weak conclusions, and tendency to adopt implausible ethical standpoints in an attempt to avoid taking on any controversial ethical assumptions at all (e.g., the Kaldor-Hicks criterion).
The best way to understand many of the problems of welfare economics is to understand them as constituents of a broader problem: how best to weigh and balance different interests when a policy affects different groups and their interests in different ways. In this thesis, we call this the problem of interpersonal aggregation. It is important not to confuse the problem of interpersonal aggregation with the problem of interpersonal comparison. Although a solution to the interpersonal aggregation problem may require a solution to the interpersonal comparison problem, the two are distinguished in our usage as follows. The interpersonal aggregation problem is the problem of how to compare either overall goodness, or perhaps just the welfare-related portion of goodness, when a menu of policies has costs and benefits accruing to different people. The interpersonal comparison problem is the problem of how to compare the magnitude of a mental state of some sort - perhaps a desire, emotion or even personality trait - between two or more persons.
We argue that three separate problems, viz: the interpersonal comparison problem, the problem of measuring mental states in a cardinal way, especially welfare, and the problem of the proper role of values in welfare economics can all be seen as components of the interpersonal aggregation problem.
A chapter on traditional approaches to the interpersonal aggregation problem
In this chapter, I present a taxonomy of many different approaches to welfare economics, organised by their relation to the interpersonal aggregation problem. I list a variety of different approaches which offer different solutions, but they all tend to either suffer from limited scope, or a presumptuousness about ethics that is unlikely to be persuasive in a democratic and ethically pluralistic society, or both.
Having considered the problems with traditional approaches to applied welfare economics, we begin to lay out our alternative approach.
A chapter laying out our thesis and introducing the basic ideas of the psychometrics of wellbeing
We begin the chapter by explaining the methodology in applied welfare economics that we are going to defend: an approach that views applied welfare economics as continuous with the psychometrics of subjective wellbeing. Such a view is not novel. Ng (---) has made very similar arguments, and some projects (e.g. the UK’s Green Book) can be seen as practical implementations of what we propose. Our concern, though, is to provide a sustained, thesis-length argument for the methodological and practical benefits of this approach.
[For clarity, our argument diverges from Ng’s insomuch as Ng views his approach as tied to happiness, whereas we are agnostic about which form of subjective-wellbeing is used].
We then provide an explanation of standard psychometric techniques and concepts that will be relevant to this work. This includes an introduction to the concepts of validity and reliability.
We briefly review the state of the art, both philosophical and scientific, on psychometric measures of happiness. My general conclusion is that the science of subjective wellbeing is reasonably robust. As regards the philosophical foundations of psychometrics and subjective wellbeing, while there is always room for improvement, the study of these conceptual foundations is relatively advanced, at least in comparison to other areas in the human sciences. I express a general skepticism about the idea that science should “wait” for clarity in conceptual foundations. Science and philosophy develop jointly, not sequentially, and the standard that sometimes seems to be applied to psychometrics by its enemies in economics - that it should not proceed until we have a full philosophical account of methodology - would not be persuasive in any other area of science.
I then proceed to address, one per chapter, four problems which traditional welfare economics has struggled with, but which I believe a psychometric approach can overcome. Three of the four problems I look at can be seen as tributaries to a larger problem: the interpersonal aggregation problem, or how to quantify gains and losses when different people or groups gain or lose. The remaining problem is not connected to the interpersonal aggregation problem, at least not directly, but is interesting in its own right: the question of the best way to infer what people’s interests are.
We’ll now go through each of these chapters:
The problem of interpersonal comparison
The first problem we consider is the problem of interpersonal comparison of mental states. Choose some mental attribute, say happiness, anger or desire. How can we compare X’s degree of this attribute to Y’s degree of this attribute? Clearly, if we define a person’s good in a way that is tied, at least to some degree, to their mental states, resolving this problem will be necessary to resolving the interpersonal aggregation problem. This problem has traditionally bedevilled economics, in the form of worries about the interpersonal comparison of utility (roughly, this can be seen as the problem of the interpersonal comparison of degrees of desires). Thus if the psychometric approach can resolve this, so much the better for psychometrics and so much the worse for the traditional methodology of welfare economics.
I argue that psychometrics, when combined with even moderate functionalism in the philosophy of mind, gives us a meaningful way to empirically compare degrees of states such as life satisfaction and happiness between people. Functionalism says that like functional states are associated with like mental states, and psychometrics shows us, through the process of construct validation, that like scores tend to be associated with like functional states. We can make this result appealing to even more people by weakening functionalism from a metaphysical to an epistemic principle: that is, by moving from saying that like mental states are associated with like functional states to saying that it is reasonable to believe that mental states are like when functional states are like.
If this argument is right then the comparison of subjective wellbeing type states in multiple people is no more complex (in very broad principle!!) than the comparison of the temperatures of different objects.
We can further supplement this combination of functionalism and construct validation in the psychometrics of subjective wellbeing with Lerner’s principle of equal ignorance. Our argument then becomes not so much that like functional states must be associated with like mental states, or even that it is reasonable to assume such, as that assuming this has the lowest expected error of any view.
The problem of cardinality
Even if we could compare X and Y on their degree of happiness, or desire, establishing whether one is greater or both are equal, could we quantify the difference between them? At least for many common ways of approaching social welfare, this is going to be of key importance to resolving the interpersonal aggregation problem.
Welfare economists have traditionally been troubled by the issue of the cardinality of welfare, with the advent of ordinalism being seen as a major innovation in consumer theory, but a major difficulty for welfare economics. The problem also arises in the context of the relationship between underlying wellbeing, and wellbeing scores. Personally, I think it is quite possible to defend cardinalism even within a non-psychometric approach to welfare and/or utility, however, my purpose in this chapter will be to argue that the problem can also be solved in the context of the psychometric approach to welfare economics.
To see the problem in a psychological context, imagine a questionnaire like Cantril’s ladder. Participants are asked to rate their happiness on a scale from 0 to 10. Now suppose that we have three responses: 6, 8 and 10. We might therefore reason that the average happiness is an 8, but unless we have established cardinality this is impermissible. It might be, for example, that the gap between a 6 and an 8 is much larger than the gap between an 8 and a 10, so taking the arithmetical average of the responses won’t really tell us much about the average of happiness as such.
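The worry can be made concrete with a small sketch. The three mappings from score to underlying happiness below are purely hypothetical assumptions chosen for illustration; nothing here claims to describe the true relationship. The point is only that the happiness of "the average respondent" (a score of 8) and the average of happiness coincide under a linear mapping and come apart otherwise.

```python
from statistics import mean

responses = [6, 8, 10]  # the three ladder responses from the example above

# Hypothetical mappings from a reported score to underlying happiness.
# These are illustrative assumptions, not empirical claims.
mappings = {
    "linear": lambda s: float(s),
    "convex": lambda s: float(s) ** 2,
    "concave": lambda s: float(s) ** 0.5,
}

def mean_underlying(scores, mapping):
    """Average of underlying happiness implied by a given mapping."""
    return mean(mapping(s) for s in scores)

mean_score = mean(responses)  # the naive arithmetic average of the scores

for name, mapping in mappings.items():
    avg_happiness = mean_underlying(responses, mapping)
    happiness_at_mean_score = mapping(mean_score)
    print(f"{name}: average happiness = {avg_happiness:.3f}, "
          f"happiness at average score = {happiness_at_mean_score:.3f}")
```

Under the convex mapping the average of happiness exceeds the happiness of a respondent scoring 8, and under the concave mapping it falls short, which is why the arithmetic mean of scores is only safe once cardinality has been established.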
I argue that while there have been some difficulties in establishing that psychometric measures are truly cardinal measures of wellbeing, the following considerations give us a reason for confidence:
1. There is some preliminary evidence that subjective wellbeing is linear in subjective wellbeing scores, including, but not limited to, the homoscedasticity of errors all across the scale in test-retest studies, and evidence given in studies of how respondents interpret these scales. (Our indebtedness to Plant for collating this evidence must be acknowledged here). Also notable in this regard is Kristoffersen’s point (? Query reference) that in essentially all cases the statistical results tend to come out the same way, whether we treat the scales as cardinal or ordinal.
2. If measured subjective wellbeing scores are not already linear in subjective wellbeing, there are many promising directions for future research that could uncover the real relationship between happiness scales and underlying quantities of happiness. Some of these ideas (A & B) are my own or worked out between me and colleagues; others already exist in the literature. These include:
Exploring the decision utilities associated with different levels on happiness scales in risk choice experiments
Correlating the scale with different levels of Kahneman’s unbounded experience sampling-based utilities (integrated over a period of time to give the participant’s average experience utility, then compared with their response to, say, Cantril’s ladder),
Investigation based on the assumption of just noticeable differences,
Just asking participants what the distance between various points on the scale is,
and various others.
3. Finally, sensitivity testing conducted by myself & Kieran Latty suggests that the controversy is unlikely to matter very much. That is to say, it is very likely that if we found the true relationship between happiness and happiness scales, and recalculated the average happiness of different groups or countries accordingly, their order would not change by much.
This can be shown through sensitivity tests: Kieran and I have adjusted the distance between scores on the basis of various assumptions, ranked countries by the resulting “averages” of happiness, and found that the order of the list of countries by average happiness changes little or not at all. Some of these sensitivity tests assumed very great departures from linearity (e.g. treating each happiness score as corresponding to a degree of happiness equal to score^2, where 1 becomes 1, 2 becomes 4, 5 becomes 25, 10 becomes 100 and so on).
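The shape of such a sensitivity test can be sketched in a few lines. This is not the analysis Kieran and I ran, and the country names and responses below are made-up toy data with no empirical standing. The sketch only shows the procedure: transform each score, average within each country, rank, and compare the rankings across transformations.

```python
from statistics import mean

# Toy data: hypothetical ladder responses, invented purely for illustration.
samples = {
    "Country A": [7, 8, 8, 9, 6, 7],
    "Country B": [5, 6, 7, 5, 6, 8],
    "Country C": [3, 4, 6, 5, 4, 2],
}

# Candidate score-to-happiness transformations. "squared" is the score^2
# case mentioned above; "sqrt" is an extra assumption added for contrast.
transforms = {
    "linear": lambda s: s,
    "squared": lambda s: s ** 2,
    "sqrt": lambda s: s ** 0.5,
}

def ranking(samples, transform):
    """Countries ordered from highest to lowest average transformed score."""
    averages = {
        country: mean(transform(s) for s in scores)
        for country, scores in samples.items()
    }
    return sorted(averages, key=averages.get, reverse=True)

rankings = {name: ranking(samples, f) for name, f in transforms.items()}

for name, order in rankings.items():
    print(f"{name}: {order}")
```

With this toy data the three rankings agree, but that is a feature of how the data were constructed; whether the order is stable for real survey data is an empirical question that has to be checked on the data themselves.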
The problem of the proper role of values in science
There has long been a problem about the proper role of values in science, and this problem has often troubled welfare economists in particular. This is linked to the overarching problem of interpersonal aggregation because, to make any headway in cases where what is good for one person diverges from what is good for another, it will be necessary to make a judgement of value. For example, we might establish that a policy improves Susan’s subjective wellbeing by more than it reduces Bob’s, but a judgement of value will be needed to decide whether that matters more. Thus welfare economics seems committed to making such value judgements, but one might worry that this is not the proper place of the scientist.
I suggest that the problem is most profitably divided in two. There is a concern that values may contaminate science, invalidating its empirical method and positive focus, and there is a concern that, in a pluralistic and democratic society, proceeding on the basis of any one set of values may be inappropriately sectarian. In this chapter we will be concerned solely with the problem of sectarianism rather than the problem of contamination, as we regard the contamination problem as largely solved.
There are many senses in which values may be involved in science, so it is worth being explicit that in this thesis, we are talking about the role values play when scientists make explicit policy recommendations on the basis of their work, not subtler and more complex deployments of such values, e.g. using values as tiebreakers in cases of empirical underdetermination. It’s in this very narrow sense of the values in science that I feel comfortable saying the contamination problem is solved.
Moreover, even our focus on the problem of sectarianism is fairly restricted. We conceive of the problem of sectarianism in a way that is more practical than normative (although perhaps what we argue may ultimately cast light on the normative issues as well, at least in this particular science). Our problem is: given that welfare economics is often paralyzed, not just in theory but in practice, by a lack of normative agreement and fear of ethical sectarianism, what is a practical way it might continue?
I argue that the best approach in the context of welfare economics for solving the problem of sectarianism is for welfare economics to see itself not as prescribing solutions but instead providing ethically relevant information to inform decision-makers or the democratic public. An approach to welfare economics grounded in the study of subjective wellbeing is capable of performing this role because it can give us information that almost everyone cares about knowing, even if it will not form the whole basis for their decision making.
Such an approach does not remove the necessity of appealing to values in welfare economics, but it gets us out of ethical sectarianism by making the ethical appeals much less controversial. All one must assent to now is that it is worth having information about how different policies will affect subjective wellbeing before we make a decision, because welfare economics is conceptualised not as making policy recommendations, but as providing information for decision makers and the democratic public.
I argue that this kind of move is not so readily available to approaches in welfare economics outside the psychometric approach. Conceptualising the results of investigations such as cost-benefit analysis as mere information is difficult because it is unclear what that information is, stripped of a normative recommendation, and what basis there is for caring about it. Approaches in new welfare economics, e.g. unweighted cost-benefit analysis, have attempted to avoid controversial value judgements through appeal to democracy. I find that these attempts fail, largely because unweighted cost-benefit analysis is not, on examination, a democratic method. We give a lot of different reasons for thinking this, but at base our main contention is that these methods are more oligarchic than democratic.
Weighted CBA based approaches might be more promising, but these tend to either turn out to be a special case of the psychometric approach, weighting by estimates of the effect on subjective wellbeing, or beg the question “weighted on the basis of what?”. If we weight simply on the basis of one set of ideas about what is distributively appropriate, we have run aground on ethical sectarianism. [Nyborg and Spangen contribute greatly to these arguments]
One argument that our psychometric approach will nonetheless turn out to be ethically sectarian comes from the fact that there are multiple theories of what the good life is, and no approach to subjective wellbeing captures what all of those theories say matters (and indeed some theories may not be especially connected to subjective wellbeing). Thus even if we aim to just present information on the subjective wellbeing effects of policy, we have necessarily made an ethically sectarian choice in choosing to focus on certain measures of subjective wellbeing.
Our reply is threefold. Firstly, there is empirical evidence that most people value the kinds of things measured by measures of subjective wellbeing. Secondly, all measures of subjective wellbeing, and probably all popular theories about what the good life is, are massively intercorrelated, as Hausman points out, meaning that to provide information on how policy affects one measure of subjective wellbeing is to provide information on how policy affects all forms of subjective wellbeing. Thirdly, if desired, it is possible to survey the public on what they would like in measures of subjective wellbeing, as Alexandrova points out, thus securing a kind of democratic assent.
The problem of inferring well being- revealed preferences or subjective wellbeing
We come to the final problem: how should we measure welfare? The two options we consider are psychometric subjective wellbeing measures and revealed preference measures of welfare. In some sense, this debate is analogous to the long-running debate about behavioural economics and the proper role, if any, of psychology in economics. Yet in another way it is quite different: we are not debating whether psychological variables are causes or play an explanatory role in economic phenomena, but something closer to the reverse, namely what is the appropriate way of understanding the influence of economic variables on a psychological variable, wellbeing.
Although this debate over the right method to measure welfare is not directly connected to the interpersonal aggregation problem like the other three issues we have considered, it does share a conceptual link: when we are contemplating changes that make things better for everyone, it is often less necessary to know the details of what people want or feel. However, when a change may divide the population, knowing exactly what people want and/or how they will feel becomes more important.
I argue that psychometric measures are both empirically and morally superior to revealed preference measures for inferring welfare. Revealed preferences are indeterminate, as there are multiple plausible ways of assigning cardinal utilities that contradict each other. For example, there is no reason why the elasticity of the marginal utility of income implied by risk preferences and the same elasticity implied by time preferences need be identical. Given the indeterminacy of behaviour in setting a unique utility function, it is ethically proper for an individual to adjudicate their own wellbeing, and verbal measures allow this. Thus the very argument most often used for revealed preferences, that they respect the autonomy of the agent, is turned against them. [This isn’t a very good argument though, is it? There are multiple contradictory methods of deriving psychometric welfare as well, so tu quoque. The bit about the ethical propriety of people setting their own welfare levels or utilities does seem to make a point, however - this will need to be drawn out]
Another problem: generally the revealed preferences account is formulated in terms of the representational theory of measurement, where measurement of utility is really a set of rules for assigning numbers to behaviours themselves. However, arguably the metaphysics of the representational theory of measurement makes interpersonal comparison especially troublesome. This is because if our measurement is not the inference of an underlying quantity influencing behaviour, but rather simply an assignment of numbers to behaviour itself, an interpersonal comparison will necessarily be a matter of a convention for comparing the behaviour of two people. Any such extension of revealed preferences to allow interpersonal comparison using the representational theory of measurement will be, in a sense, arbitrary and not answerable to any deep facts in the world, just the formulator’s preferences. This raises the question of whether the revealed preferences approach can be separated from the representational theory of measurement.
But what about the weaknesses of the psychometric approach? Economists have often distrusted the assumptions necessary to make self-report and the so-called psychometric approach to measurement work. It can be hard to argue with incredulity, but I give it a try. I argue that in deciding what methodological assumptions to place our (provisional) faith in, we should consider what sorts of questions different assumptions will allow us to tackle, and their social significance. To put it bluntly and a little crudely, psychometrics and orthodox welfare economics both make big assumptions (not for nothing are the assumptions of economics sometimes called “heroic assumptions”), yet at least the heroic assumptions of psychometrics allow us to do many socially useful things that the heroic assumptions of standard welfare economics do not permit.
It might also be nice to explore whether revealed preference fulfilment or degree of subjective wellbeing is a better correlate of objective measures of wellbeing (health, suicide, etc.), although this may be out of scope for the thesis.
One argument that might be worth making is that participants seem to care a great deal about the subjective states measured by measures of subjective wellbeing. How to compare this to how much they care about behavioural measures though? I note that the kind of preferences revealed preferences measures track are most often instrumental in some way- even if only in the trivial sense that people do not care about the objects they purchase in themselves, but rather what they do with them. On the other hand, people care non-instrumentally about many of the things- like happiness and life satisfaction- measured by psychometrics.
One final point that I will address is the claim, made by some economists on the basis of a desire satisfaction account of welfare, that revealed preferences are ethically superior to any possible psychometric measure of welfare. This raises larger questions too: what theories of welfare and the good life are psychometric measures capable of tracking? Arguments have been made that psychometric measures can track preference fulfilment (e.g. Plant argues this), eudaimonia and objective-list theories, so there is literature to explore here. There is even a small chance of breaking this off into its own chapter.
If Neuroticism is a good metric of wellbeing, and it is 60% genetic... Would the evaluation of treatment effectiveness be weighted and controlled against changing the other 40%?
Hi, I found this interesting and in response to your request on FB, wondered if I might make a couple of brief suggestions for further reading? I was taught some time ago by a nursing lecturer about patients’ individual complexity and the application of virtue ethics in care - here’s one of her papers https://www.researchgate.net/publication/23274349_Truth-telling_honesty_and_compassion_A_virtue-based_exploration_of_a_dilemma_in_practice
Secondly, I based my own social anthropology thesis on ethnographic research among mental health day care clients - using semi structured interviews and recovery narrative case studies. Methodology was a central point of debate and I make the case for mixed methods in evidence based health policy in a resultant book available at Palgrave and Amazon (Roberta McDonnell 2014 Creativity and Social Support in Mental Health: Service Users’ Perspectives). Best of luck with your thesis 🙂