You should get the impression from 1000 words that most policy changes are small or not radically different from the past: Lindblom identifies incrementalism; punctuated equilibrium highlights a huge number of small changes and small number of huge changes; the ACF compares routine learning by a dominant coalition to a ‘shock’ which prompts new subsystem and policy dynamics; and multiple streams identifies the conditions (rarely met) for major change.
Yet, I just gave you the impression that we don’t know how to define policy. If we can’t define it well, how can we measure it well enough to come to this conclusion so consistently?
Why is the measurement of policy change important?
We miss a lot if we equate policy with statements rather than outputs/ outcomes. We also miss a lot if we equate policy change with the most visible outputs such as legislation. I list 16 different policy instruments, although they tend to be grouped into smaller categories: focusing on regulation (including legislation) and resources (money and staffing) to accentuate the power at policymakers’ disposal; or regulatory/ distributive/ redistributive to suggest that some policy measures are more difficult to ‘sell’ than others.
So, as in defining policy change, we need to make choices about what counts as policy in this instance to measure how much it has changed. For example, I have (a) written on one output as a key exemplar of policy change – legislation to ban smoking in public places for Scotland, England/ Wales, the UK, and (almost) EU – to show that a government is signalling major changes to come, but also (b) situated that policy instrument within a much broader discussion – of many tobacco policies in the UK and across the globe – to examine the extent to which it is already consistent with a well-established direction of travel.
Breadth (to give the ‘big picture’) versus depth (to note important details forensically)
How much we expect policy to change, given the size of the problem (a big feature in public health studies, which criticise government inaction)
How radical policy change looks from the ‘top’ (at the point of central government choice) or the ‘bottom’ (longer-term delivery of policy by other bodies)
What policies mean (what problem were policymakers trying to solve?)
How consistent ‘policy’ seems when made of often-contradictory instruments
How do we solve the problem?
The problem is that we can produce very different accounts of policy change from the same pool of evidence, by accentuating some measures and ignoring others, or putting more faith in some data than others (e.g. during interviews).
Sometimes, my preferred solution is to compare more than one narrative of policy change. Another is simply to ‘show your work’.
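To make this concrete, here is a minimal sketch in Python of how the same pool of evidence can support two different narratives. Everything in it is an assumption for illustration: the instruments, the change scores, the `summarise` function, and both sets of weights are hypothetical, not findings or methods from my research.

```python
# Hypothetical evidence: each instrument is scored from -1 (reversal) to +1 (radical change).
# The scores are invented for illustration; they are not results from any study.
EVIDENCE = {
    "legislation": 0.9,
    "spending": 0.2,
    "staffing": 0.1,
    "enforcement": 0.3,
}


def summarise(evidence, weights):
    """Weighted average of change scores over the instruments named in `weights`."""
    total = sum(weights.values())
    if total == 0:
        raise ValueError("at least one instrument needs a non-zero weight")
    return sum(evidence[name] * w for name, w in weights.items()) / total


# Narrative A: accentuate the most visible output (legislation) and ignore the rest.
NARRATIVE_A = {"legislation": 1.0}
# Narrative B: treat every instrument as equally important.
NARRATIVE_B = {name: 1.0 for name in EVIDENCE}

score_a = summarise(EVIDENCE, NARRATIVE_A)
score_b = summarise(EVIDENCE, NARRATIVE_B)

if __name__ == "__main__":
    print(f"Narrative A (legislation only): {score_a:.2f}")
    print(f"Narrative B (all instruments):  {score_b:.2f}")
```

The point is not the numbers, but that the weights are written down. A reader can see which measures you accentuated, disagree with them, and rerun the comparison with their own.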
Take home message for students: ‘show your work’ means explaining your logical process and step-by-step choices. Don’t just write that it is difficult to define policy and measure change. Instead, explain how you assess policy change in one important way, why you chose this way, and shine a light on the payoffs to your approach. Read up on how other scholars do it, to learn good practice and how to make your results comparable to theirs. Indeed, part of the benefit of using an established theory, to guide our analysis, is that we can engage in research systematically as a group.
My aim is to tell you about the use and value of comparative research by combining (a) a summary of your POLU9RM course reading, and (b) some examples from my research. Of course, I wouldn’t normally blow my own trumpet, but Dr Margulis insisted that I do so.
The reading for this session is Halperin and Heath’s chapter 9, which makes the following initial points about comparative research:
‘Comparative’ describes methods to identify and explain differences/ similarities between cases.
These cases are often – but need not be – countries (for example, comparisons over time, or by policy area, can be just as illuminating).
There are three main approaches: large-N (many cases), small-N (several cases), and single-N (or ‘case studies’).
Comparison is not an end in itself. Rather, we need to justify, or clarify the value of, each comparison. This involves identifying a clear theoretical reason for a particular comparison (and, as in all studies, producing a clear research question that can be answered).
It helps us avoid what Rose calls ‘false uniqueness’ (assuming rather than demonstrating that a case is exceptional) and ‘false universalism’ (assuming that a finding from one time/ place applies to all times or places).
Comparative methods are most frequently used to determine the extent to which a theory generated from one set of cases applies to other cases.
Key issues when you apply theories to new cases
There are two interesting implications of the final bullet point.
First, note the tendency of a small number of countries to dominate academic publication. For example, a lot of the policy theories to which I refer in this ‘1000 words’ series derive from studies of the US. With many theories, you can see interesting developments when scholars apply them outside of the US:
The ‘universal’ v ‘territorial’ issue. In my article with Michael Jones (summarised here) on Kingdon’s ‘multiple streams analysis’ we note that it began as a study of health/ transport policy in the US in the 1980s. It then morphed into a theory applied to many regions at many times. A large part of its success comes from the abstract nature of its key concepts, which can be applied at any place and time: for example, a lot of its discussion of agenda setting relates to ‘bounded rationality’ which applies to all people. Yet, there are also ‘territorial’ issues which apply to particular regions or types of government. For example, some studies adapt MSA’s concepts to explain the influence of supranational authorities or to describe the differences between EU and US policymaking.
The ‘beyond the West’ issue. In one interesting application of MSA to China, Zhu argues that a key concept is not applicable: policy change in the US requires policy solutions to be technically feasible; in China, they only prompt change if ‘infeasible’. The example highlights the limits to the explanatory power of theories derived from studies of a small number of ‘Western’ countries (I know this description is loaded – what other terms could we use to describe countries like the US and UK?). In such cases, scholars are now exploring the implications in some depth: how well do these policy theories travel?
Second, consider the extent to which we are accumulating knowledge when applying the same theory to many cases. There are now some major reviews or debates of key policy theories, in which the authors highlight how difficult it is for systematic research to produce a large number of comparable case studies.
For me, the important common factor in all of these reviews is that many scholars pay insufficient attention to the theory when applying its insights to new cases. Consequently, when you try to review the body of work, you find it difficult to generate an overall sense of developments by comparing many case studies. So, in my humble opinion, we’d be a lot better off if people did a proper review of the literature – and made sure that their concepts were clear and able to be ‘operationalised’ – before jumping into case study analysis. I wrote these blog posts for established scholars and new PhD students, but the argument should also apply to you as an undergraduate: get the basics right (which includes understanding the theory you are applying) to get the comparative research right. This is just as important as your case selection.
From case study to large-N research: choosing between depth and breadth?
Case studies. It is in this context that we might understand Halperin and Heath’s point that case study research (single-N) is comparative. You might be going into much depth to identify key aspects of a single case, but also considering the extent to which it compares meaningfully with other case studies using the same theory. All that we require in such examples is that you justify your case selection. In some examples, you are trying to see if a theory drawn from one country applies to another. Or, you might be interested in how far theories travel, or how applicable they are to cases which seem unusual or ‘deviant’. In some examples, we seek the ‘crucial case study’ that is central to the ‘confirmation or disconfirmation of a theory’ (p207), but don’t make the mistake of concluding that you need a new theory because the old one can only explain so much. Further, although single case studies should not be dismissed, you can only conclude so much from them. So, be careful to spell out and justify any conclusions that you find to be ‘generalisable’ to other cases.
Example 1. In my research, I often use US-origin theories to help explain policymaking by the UK and devolved governments. For example, I used the same approach as Kingdon (documentary analysis and semi-structured interviews) to identify, in great detail, the circumstances under which 4 governments in the UK introduced the same ban on smoking in public places (the take home message: there was more to it than you might think!).
Small-N studies. The systematic comparison of several cases allows you to extend analysis often without compromising on depth. However, there is great potential to bias your outcomes by cherry-picking cases to suit particular theories. So, we look for ways to justify case selection. Halperin and Heath go to some length to identify the problems of ‘selection on the dependent variable’, which can be summed up as: don’t compare cases just because they seem to have the same outcome in common (focus instead on what causes such outcomes). Two well-established approaches are ‘most similar systems design’ (MSSD) and ‘most different systems design’ (MDSD) (p209). With MSSD, you choose cases which share key explanatory factors/ characteristics (e.g. they have the same socio-economic/ political context) so that you can track the effects of one or more differences (perhaps in the spirit of a randomised control trial, but without the substance). With MDSD you choose cases which do not share characteristics so that you can track the effect of a key similarity.
Aside from the problems of case study selection bias, it’s worth noting how difficult it is to produce a clear MSSD or MDSD research design, since you are making value judgements about which shared/ not shared characteristics are crucial (see their discussion of necessary/ sufficient factors).
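The logic of MSSD and MDSD can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not a procedure from Halperin and Heath: the cases, the characteristics, the `max_odd` threshold, and the function names are all hypothetical. Note that the value judgement described above is built into the data, since you decide which characteristics go into the table in the first place.

```python
from itertools import combinations

# Hypothetical cases: the names, characteristics, and values are invented for illustration.
CASES = {
    "A": {"electoral_system": "majoritarian", "federal": False, "income": "high", "eu_member": True},
    "B": {"electoral_system": "majoritarian", "federal": False, "income": "high", "eu_member": False},
    "C": {"electoral_system": "proportional", "federal": True, "income": "middle", "eu_member": True},
}


def compare(case_x, case_y):
    """Return (shared, different) lists of characteristic names for two cases."""
    shared = [k for k in case_x if case_x[k] == case_y[k]]
    different = [k for k in case_x if case_x[k] != case_y[k]]
    return shared, different


def classify_pairs(cases, max_odd=1):
    """Label each pair of cases.

    'MSSD' if the pair differs on at most `max_odd` characteristics,
    'MDSD' if the pair shares at most `max_odd` characteristics,
    'neither' otherwise. The second item lists the characteristics of interest.
    """
    labels = {}
    for (name_x, x), (name_y, y) in combinations(cases.items(), 2):
        shared, different = compare(x, y)
        if 0 < len(different) <= max_odd:
            labels[(name_x, name_y)] = ("MSSD", different)
        elif 0 < len(shared) <= max_odd:
            labels[(name_x, name_y)] = ("MDSD", shared)
        else:
            labels[(name_x, name_y)] = ("neither", [])
    return labels


if __name__ == "__main__":
    for pair, (design, factors) in classify_pairs(CASES).items():
        print(pair, design, factors)
```

In this toy table, A and B are ‘most similar’ (they differ only on `eu_member`), A and C are ‘most different’ (they share only `eu_member`), and B and C fit neither design. Add or remove one characteristic and the labels can change, which is exactly why you need to justify your choices.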
Example 2. Take the example of a study I did with Karin Ingold and Manuel Fischer recently, comparing ‘fracking’ policy and policymaking in the UK and Switzerland. We use the same theory (the ACF) and same method (documentary analysis and a survey of key actors) in both countries. We used a survey to allow us to quantify key relationships between actors: to what extent do they share the same beliefs with other actors, and how likely are they to share information frequently with their allies and competitors?
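To give a flavour of what ‘quantifying shared beliefs’ can look like, here is a small Python sketch. It is not the measure we used in the study: the actors, the 1-4 answer scale, the scores, and the `belief_similarity` function are all assumptions made up for illustration.

```python
# Hypothetical survey answers: agreement with three belief statements on a 1-4 scale.
# Actor names and scores are invented; this is not the study's data or its measure.
BELIEFS = {
    "actor_1": [4, 4, 1],
    "actor_2": [4, 3, 1],
    "actor_3": [1, 1, 4],
}
SCALE_RANGE = 3  # largest possible gap between two answers on a 1-4 scale


def belief_similarity(a, b):
    """Return 1.0 for identical answers and 0.0 for maximal disagreement on every statement."""
    gaps = [abs(x - y) for x, y in zip(a, b)]
    return 1 - sum(gaps) / (SCALE_RANGE * len(gaps))


sim_allies = belief_similarity(BELIEFS["actor_1"], BELIEFS["actor_2"])
sim_opponents = belief_similarity(BELIEFS["actor_1"], BELIEFS["actor_3"])

if __name__ == "__main__":
    print(f"actor_1 vs actor_2: {sim_allies:.2f}")
    print(f"actor_1 vs actor_3: {sim_opponents:.2f}")
```

A score like this lets you ask the same question of every pair of actors in both countries, which is the main payoff of using one method across cases.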
We describe the research design as MDSD because the two countries represent contrasting political system archetypes – the ‘majoritarian’ UK and ‘consensus’ Switzerland – which Lijphart describes as key factors in their contrasting policymaking processes. Yet, central to our argument is that there are policymaking processes common to policy subsystems despite their system differences. In effect, we try to measure the effect of political system design on subsystem dynamics and find a subtle but important impact.
We also find that, although these differences exist, their policy outcomes are remarkably similar. So, ‘most different’ systems often produce very similar policymaking processes and policy outcomes. Then, we note that if we had our time again we would have extended the analysis to subnational governments in the UK. The ‘most different’ design prompted us to focus on the UK central government and Swiss Cantons (the alleged locus of power in both cases), but maybe we could have started from an assumption that they are not as different as they look. Have a look and you can see the dilemmas that still play out in (what I think is) a well-designed study.
Example 3. I face the same difficulties when comparing policy and policymaking by the UK and Scottish Governments. In some respects, they are ‘most different’: ‘new Scottish politics’ was designed to contrast with ‘old Westminster’ (particularly when it came to elections; the Scottish Parliament is also unicameral). In others, they are similar: the ‘architects of devolution’ introduced a system that seems to be of the Westminster family (particularly the executive-legislative relationships). Further, Scotland remains part of the UK, and the UK Government retains responsibility for many policies affecting Scotland. Overall, it is difficult to say for sure how similar/ different their systems are (which I discuss in a series of lectures). So, the comparison is fraught with difficulty. In such examples, the key solution is to ‘show your working’: describe these problems and state how you work within them (for example, in my case, I try to examine the extent to which policymaking reflects ‘territorial’ context or ‘universal’ drivers, partly by interviewing policymakers in each government).
Large-N (quantitative) studies. Halperin and Heath lay out some of the potential benefits of large-N research. For example, it is easier to come to general conclusions about many countries by studying many countries (rather than trying to generalise from a few, which might not be representative). They also highlight the pitfalls, including the problem of meaning: when you ‘operationalise’ a concept such as democracy or populism, can you provide a simple enough definition to allow you to give each system a number (to denote democratic/ undemocratic or X% democratic) that means the same thing in each case? To this problem, I would add the general issue of breadth and depth. With large-N studies you can examine the effects of a small number of variables, to explain a small part of the politics of many systems. With small-N you can study a large number of variables in a few systems. The classic trade-off is between breadth and depth. Of course, if you are doing an undergraduate dissertation the big Q is: what can you reasonably be expected to do? Maybe your highest aim should be to make sense of the studies which already exist.
These posts introduce you to key concepts in the study of public policy. They are all designed to turn a complex policymaking world into something simple enough to understand. Some of them focus on small parts of the system. Others present ambitious ways to explain the system as a whole. The wide range of concepts should give you a sense of a variety of studies out there, but my aim is to show you that these studies have common themes.