You can find more on this topic here: https://paulcairney.wordpress.com/ebpm/
We can generate new insights on policymaking by connecting the dots between many separate concepts. However, don’t underestimate the obstacles, or how hard these dot-connecting exercises are to follow. They may seem clear in your head, but describing them (and getting people to go along with your description) is another matter. You need to set out these links clearly and in a set of logical steps. I give one example – of the links between evidence and policy transfer – which I have been struggling with for some time.
In this post, I combine three concepts – policy transfer, bounded rationality, and ‘evidence-based policymaking’ – to identify the major dilemmas faced by central government policymakers when they use evidence to identify a successful policy solution and consider how to import it and ‘scale it up’ within their jurisdiction. For example, do they use randomised control trials (RCTs) to establish the effectiveness of interventions and require uniform national delivery (to ensure the correct ‘dosage’), or tell stories of good practice and invite people to learn and adapt to local circumstances? I use these examples to demonstrate that our judgement of good evidence influences our judgement on the mode of policy transfer.
Insights from each concept
From studies of policy transfer, we know that central governments (a) import policies from other countries and/or (b) encourage the spread (‘diffusion’) of successful policies which originated in regions within their country: but how do they use evidence to identify success and decide how to deliver programs?
From studies of ‘evidence-based policymaking’ (EBPM), we know that providers of scientific evidence identify an ‘evidence-policy gap’ in which policymakers ignore the evidence of a problem and/or do not select the best evidence-based solution: but can policymakers simply identify the ‘best’ evidence and ‘roll out’ the ‘best’ evidence-based solutions?
From studies of bounded rationality and the policy cycle (compared with alternative theories, such as multiple streams analysis or the advocacy coalition framework), we know that it is unrealistic to think that a policymaker at the heart of government can simply identify then select a perfect solution, click their fingers, and see it carried out. This limitation is more pronounced when we identify multi-level governance, or the diffusion of policymaking power across many levels and types of government. Even if they were not limited by bounded rationality, they would still face: (a) practical limits to their control of the policy process, and (b) a normative dilemma about how far you should seek to control subnational policymaking to ensure the delivery of policy solutions.
The evidence-based policy transfer dilemma
If we combine these insights, we can identify a major policy transfer dilemma for central government policymakers:
Note how closely connected these concerns are: our judgement of the ‘best evidence’ can produce a judgement on how to ‘scale up’ success.
Here are three ideal-type approaches to using evidence to transfer or ‘scale up’ successful interventions. In at least two cases, the choice of ‘best evidence’ seems linked inextricably to the choice of transfer strategy:
With approach 1, you gather evidence of effectiveness with reference to a hierarchy of evidence, with systematic reviews and RCTs at the top (see pages 4, 15, 33). This has a knock-on effect for ‘scaling up’: you introduce the same model in each area, requiring ‘fidelity’ to the model to ensure you administer the correct ‘dosage’ and measure its effectiveness with RCTs.
With approach 2, you reject this hierarchy and place greater value on practitioner and service user testimony. You do not necessarily ‘scale up’. Instead, you identify good practice (or good governance principles) by telling stories based on your experience and inviting other people to learn from them.
With approach 3, you gather evidence of effectiveness based on a mix of evidence. You seek to ‘scale up’ best practice through local experimentation and continuous data gathering (by practitioners trained in ‘improvement methods’).
The comparisons between approaches 1 and 2 (in particular) show us the strong link between a judgement on evidence and transfer. Approach 1 requires particular methods to gather evidence and high policy uniformity when you transfer solutions, while approach 2 places more faith in the knowledge and judgement of practitioners.
Therefore, our choice of what counts as EBPM can determine our policy transfer strategy. Or, a different transfer strategy may – if you adhere to an evidential hierarchy – preclude EBPM.
I describe these issues, with concrete examples of each approach here, and in far more depth here:
Evidence-based best practice is more political than it looks: ‘National governments use evidence selectively to argue that a successful policy intervention in one local area should be emulated in others (‘evidence-based best practice’). However, the value of such evidence is always limited because there is: disagreement on the best way to gather evidence of policy success, uncertainty regarding the extent to which we can draw general conclusions from specific evidence, and local policymaker opposition to interventions not developed in local areas. How do governments respond to this dilemma? This article identifies the Scottish Government response: it supports three potentially contradictory ways to gather evidence and encourage emulation’.
Both articles relate to ‘prevention policy’ and the examples (so far) are from my research in Scotland, but in a future paper I’ll try to convince you that the issues are ‘universal’.
This post accompanies a 40 minute lecture (download) which considers ‘evidence-based policymaking’ (EBPM) through the lens of policy theory. The theory is important, to give us a language with which to understand EBPM as part of a wider discussion of the policy process, while the lens of EBPM allows us to think through the ‘real world’ application of concepts and theories.
To that end, I’ll make three key points:
Definitions and clarity are important, so what is ‘evidence-based policymaking’?
What is Policy? It is incredibly difficult to say what policy is and measure how much it has changed. I use the working definition, ‘the sum total of government action, from signals of intent to the final outcomes’ to raise important qualifications: (a) it is problematic to conflate what people say they will do and what they actually do; (b) a policy outcome can be very different from the intention; (c) policy is made routinely through cooperation between elected and unelected policymakers and actors with no formal role in the process; (d) policymaking is also about the power not to do something. It is also important to identify the many components or policy instruments that make up policies, including: the level of spending; the use of economic incentives/penalties; regulations and laws; the use of voluntary agreements and codes of conduct; the provision of public services; education campaigns; funding for scientific studies or advocacy; organisational change; and, the levels of resources/methods dedicated to policy implementation (2012a: 26).
In that context, we are trying to capture a process in which actors make and deliver ‘policy’ continuously, not identify a set-piece event which provides a single opportunity to use a piece of scientific evidence to prompt a policymaker response.
Who are the policymakers? The intuitive definition is ‘people who make policy’, but there are two important distinctions: (1) between elected and unelected participants, since people such as civil servants also make important decisions; (2) between people and organisations, with the latter used as a shorthand to refer to a group of people making decisions collectively. There are blurry dividing lines between the people who make and influence policy. Terms such as ‘policy community’ suggest that policy decisions are made by a collection of people with formal responsibility and informal influence. Consequently, we need to make clear what we mean by ‘policymakers’ when we identify how they use evidence.
What is evidence? We can define evidence as an argument backed by information. Scientific evidence describes information produced in a particular way. Some define ‘scientific’ broadly, to refer to information gathered systematically using recognised methods, while others refer to a specific hierarchy of scientific methods, with randomised control trials (RCTs) and the systematic review of RCTs at the top. This is a crucial point:
policymakers will seek many kinds of information that many scientists would not consider to be ‘the evidence’.
This discussion helps identify two key points of potential confusion when people discuss EBPM:
Realistic models are important, so what is wrong with the policy cycle?
One traditional way to understand policymaking in the ‘real world’ is to compare it to an ideal-type: what happens when the conditions of the ideal-type are not met? We do this in particular with the ‘policy cycle’ and ‘comprehensive rationality’.
So, consider this modified ideal-type of EBPM:
So far, so good (although you might stop to consider who is best placed to provide evidence, and who – or which methods of evidence gathering – should be privileged or excluded), but what happens when we move away from the ideal-type? Here are two insights from a forthcoming paper (Cairney Oliver Wellstead 26.1.16).
Lessons from policy theory: 1. Identify multi-level policymaking environments
First, policymaking takes place in a less ordered and predictable policy environment, exhibiting:
A focus on this bigger picture shifts our attention from the use of scientific evidence by an elite group of elected policymakers at the ‘top’ to its use by a wide range of influential actors in a multi-level policy process. It shows scientists and practitioners that they are competing with many actors to present evidence in a particular way to secure a policymaker audience. Support for particular solutions varies according to which organisation takes the lead and how it understands the problem. Some networks are close-knit and difficult to access because bureaucracies have operating procedures that favour particular sources of evidence and some participants over others, and there is a language – indicating what ways of thinking are in good ‘currency’ (such as ‘value for money’) – that takes time to learn. Well-established beliefs provide the context for policymaking: new evidence on the effectiveness of a policy solution has to be accompanied by a shift of attention and successful persuasion. In some cases, social or economic ‘crises’ can prompt lurches of attention from one issue to another, and some forms of evidence can be used to encourage that shift. In this context, too many practitioner studies analyse, for example, a single central government decision rather than the longer-term process. Overcoming barriers to influence in that small part of the process will not provide an overall solution.
Lessons from policy theory: 2. Policymakers use two ‘shortcuts’ to make decisions
How do policymakers deal with their ‘bounded rationality’? They employ two kinds of shortcut: ‘rational’, by pursuing clear goals and prioritizing certain kinds and sources of information, and ‘irrational’, by drawing on emotions, gut feelings, deeply held beliefs, habits, and the familiar to make decisions quickly. Consequently, the focus of policy theories is on the links between evidence, persuasion, and framing (in the wider context of a tendency for certain beliefs to dominate discussion).
Framing refers to the ways in which we understand, portray, and categorise issues. Problems are multi-faceted, but bounded rationality limits the attention of policymakers, and actors compete to highlight one image at the expense of others. The outcome of this process determines who is involved (for example, portraying an issue as technical limits involvement to experts), who is responsible for policy, how much attention they pay, and what kind of solution they favour. For example, tobacco control is more likely when policymakers view it primarily as a public health epidemic rather than an economic good, while ‘fracking’ policy depends on its primary image as a new oil boom or environmental disaster (I discuss both examples in depth here).
Scientific evidence plays a part in this process, but we should not exaggerate the ability of scientists to win the day with reference to evidence. Rather, policy theories signal the strategies that practitioners may have to adopt to increase demand for their evidence:
Further, the impact of a framing strategy may not be immediate, even if it appears to be successful. Scientific evidence may prompt a lurch of attention to a policy problem, prompting a shift of views in one venue or the new involvement of actors from other venues. However, for example, it can take years to produce support for an ‘evidence-based’ policy solution, built on its technical and political feasibility (will it work as intended, and do policymakers have the motive and opportunity to select it?).
This discussion helps identify two key points of potential confusion when people discuss the policy cycle and comprehensive rationality:
Realistic strategies are important, so how far should you go to overcome ‘barriers’ between evidence and policy?
You can’t take the politics out of EBPM. Even the selection of ‘the evidence’ is political (should evidence be scientific, and what counts as scientific evidence?).
Further, providers of scientific evidence face major dilemmas when they seek to maximise the ‘impact’ of their research. Armed with this knowledge of the policy process, how should you seek to engage and influence decisions made within it?
If you are interested in this final discussion, please see the short video here and the follow up blog post: Political science improves our understanding of evidence-based policymaking, but does it produce better advice?
My aim is to tell you about the use and value of comparative research by combining (a) a summary of your POLU9RM course reading, and (b) some examples from my research. Of course, I wouldn’t normally blow my own trumpet, but Dr Margulis insisted that I do so. Here is the podcast (25 minutes) which you can download or stream:
The reading for this session is Halperin and Heath’s chapter 9, which makes the following initial points about comparative research:
Key issues when you apply theories to new cases
There are two interesting implications of the final bullet point.
First, note the tendency of a small number of countries to dominate academic publication. For example, a lot of the policy theories to which I refer in this ‘1000 words’ series derive from studies of the US. With many theories, you can see interesting developments when scholars apply them outside of the US:
Second, consider the extent to which we are accumulating knowledge when applying the same theory to many cases. There are now some major reviews or debates of key policy theories, in which the authors highlight the difficulties of systematic research to produce a large number of comparable case studies:
For me, the important common factor in all of these reviews is that many scholars pay insufficient attention to the theory when applying its insights to new cases. Consequently, when you try to review the body of work, you find it difficult to generate an overall sense of developments by comparing many case studies. So, in my humble opinion, we’d be a lot better off if people did a proper review of the literature – and made sure that their concepts were clear and able to be ‘operationalised’ – before jumping into case study analysis. I wrote these blog posts for established scholars and new PhD students, but the argument should also apply to you as an undergraduate: get the basics right (which includes understanding the theory you are applying) to get the comparative research right. This is just as important as your case selection.
From case study to large-N research: choosing between depth and breadth?
Case studies. It is in this context that we might understand Halperin and Heath’s point that case study research (single-N) is comparative. You might be going into much depth to identify key aspects of a single case, but also considering the extent to which it compares meaningfully with other case studies using the same theory. All that we require in such examples is that you justify your case selection. In some examples, you are trying to see if a theory drawn from one country applies to another. Or, you might be interested in how far theories travel, or how applicable they are to cases which seem unusual or ‘deviant’. In some examples, we seek the ‘crucial case study’ that is central to the ‘confirmation or disconfirmation of a theory’ (p207), but don’t make the mistake of concluding that you need a new theory because the old one can only explain so much. Further, although single case studies should not be dismissed, you can only conclude so much from them. So, be careful to spell out and justify any conclusions that you find to be ‘generalisable’ to other cases.
Example 1. In my research, I often use US-origin theories to help explain policymaking by the UK and devolved governments. For example, I used the same approach as Kingdon (documentary analysis and semi-structured interviews) to identify, in great detail, the circumstances under which 4 governments in the UK introduced the same ban on smoking in public places (the take home message: there was more to it than you might think!).
Small-N studies. The systematic comparison of several cases allows you to extend analysis, often without compromising on depth. However, there is great potential to bias your outcomes by cherry-picking cases to suit particular theories. So, we look for ways to justify case selection. Halperin and Heath go to some length to identify the problems of ‘selection on the dependent variable’, which can be summed up as: don’t compare cases just because they seem to have the same outcome in common (focus instead on what causes such outcomes). Two well-established approaches are ‘most similar systems design’ (MSSD) and ‘most different systems design’ (MDSD) (p209). With MSSD, you choose cases which share key explanatory factors/characteristics (e.g. they have the same socio-economic/political context) so that you can track the effects of one or more differences (perhaps in the spirit of a randomised control trial, but without the substance). With MDSD, you choose cases which do not share characteristics so that you can track the effect of a key similarity.
Aside from the problems of case selection bias, it’s worth noting how difficult it is to produce a clear MSSD or MDSD research design, since you are making value judgements about which shared or unshared characteristics are crucial (see their discussion of necessary/sufficient factors).
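The MSSD/MDSD logic can be sketched as a simple filter over case characteristics. This is a toy illustration only: the country labels, the ‘context’ characteristics, and the factor of interest below are all invented, and real case selection involves far messier judgements than boolean matching.

```python
# Minimal sketch of MSSD/MDSD case-selection logic.
# All cases and codings below are hypothetical, for illustration only.
from itertools import combinations

# Hypothetical cases: shared 'context' characteristics plus one factor of interest.
cases = {
    "A": {"wealthy": True,  "federal": True,  "majoritarian": True},
    "B": {"wealthy": True,  "federal": True,  "majoritarian": False},
    "C": {"wealthy": False, "federal": False, "majoritarian": True},
}

context = ["wealthy", "federal"]  # the 'system' characteristics
factor = "majoritarian"           # the explanatory factor we want to track

def shared(x, y, keys):
    """Count how many of the given characteristics two cases share."""
    return sum(cases[x][k] == cases[y][k] for k in keys)

# MSSD: same context, different value on the factor of interest.
mssd = [(x, y) for x, y in combinations(cases, 2)
        if shared(x, y, context) == len(context)
        and cases[x][factor] != cases[y][factor]]

# MDSD: different context, same value on the factor of interest.
mdsd = [(x, y) for x, y in combinations(cases, 2)
        if shared(x, y, context) == 0
        and cases[x][factor] == cases[y][factor]]

print("MSSD pairs:", mssd)  # → [('A', 'B')]
print("MDSD pairs:", mdsd)  # → [('A', 'C')]
```

The value-judgement problem shows up immediately in code form: the result depends entirely on which keys you put in `context` and which you call the `factor`.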
Example 2. Take the example of a study I did with Karin Ingold and Manuel Fischer recently, comparing ‘fracking’ policy and policymaking in the UK and Switzerland. We use the same theory (the ACF) and same method (documentary analysis and a survey of key actors) in both countries. We used a survey to allow us to quantify key relationships between actors: to what extent do they share the same beliefs with other actors, and how likely are they to share information frequently with their allies and competitors?
We describe the research design as MDSD because their political systems represent two contrasting political system archetypes – the ‘majoritarian’ UK and ‘consensus’ Switzerland – which Lijphart describes as key factors in their contrasting policymaking processes. Yet, central to our argument is that there are policymaking processes common to policy subsystems despite their system differences. In effect, we try to measure the effect of political system design on subsystem dynamics and find a subtle but important impact.
We also find that, although these differences exist, their policy outcomes are remarkably similar. So, ‘most different’ systems often produce very similar policymaking processes and policy outcomes. Then, we note that if we had our time again we would have extended the analysis to subnational governments in the UK. The ‘most different’ design prompted us to focus on the UK central government and Swiss Cantons (the alleged locus of power in both cases), but maybe we could have started from an assumption that they are not as different as they look. Have a look and you can see the dilemmas that still play out in (what I think is) a well-designed study.
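The kind of quantification described in Example 2 (shared beliefs between actors) can be illustrated with a toy agreement measure. To be clear, the actor names and Likert-style scores below are invented, and this is not the actual coding or data from the fracking study – just a sketch of how survey answers become pairwise similarity numbers.

```python
# Illustrative sketch: pairwise belief similarity between actors,
# using invented 1-5 survey ratings on three belief items.
# Not the actual data or exact method from the UK/Swiss fracking study.

actors = {
    "regulator": [5, 4, 2],
    "industry":  [4, 4, 1],
    "ngo":       [1, 2, 5],
}

def agreement(a, b):
    """Mean absolute distance between ratings, rescaled to 0-1 (1 = identical)."""
    diffs = [abs(x - y) for x, y in zip(actors[a], actors[b])]
    return round(1 - (sum(diffs) / len(diffs)) / 4, 2)  # 4 = max distance on a 1-5 scale

for a, b in [("regulator", "industry"), ("regulator", "ngo"), ("industry", "ngo")]:
    print(a, b, agreement(a, b))
```

Even this crude measure reproduces the intuition in the text: hypothetical allies (regulator/industry, at 0.83) score far higher than opponents (either paired with the NGO, at 0.25), which is the pattern you would then relate to information-sharing.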
Example 3. I face the same difficulties when comparing policy and policymaking by the UK and Scottish Governments. In some respects, they are ‘most different’: ‘new Scottish politics’ was designed to contrast with ‘old Westminster’ (particularly when it came to elections; the Scottish Parliament is also unicameral). In others, they are similar: the ‘architects of devolution’ introduced a system that seems to be of the Westminster family (particularly the executive-legislative relationships). Further, Scotland remains part of the UK, and the UK Government retains responsibility for many policies affecting Scotland. Overall, it is difficult to say for sure how similar or different their systems are (which I discuss in a series of lectures). So, the comparison is fraught with difficulty. In such examples, the key solution is to ‘show your working’: describe these problems and state how you work within them (for example, in my case, I try to examine the extent to which policymaking reflects ‘territorial’ context or ‘universal’ drivers, partly by interviewing policymakers in each government).
Large-N (quantitative) studies. Halperin and Heath lay out some of the potential benefits of large-N research. For example, it is easier to come to general conclusions about many countries by studying many countries (rather than trying to generalise from a few, which might not be representative). They also highlight the pitfalls, including the problem of meaning: when you ‘operationalise’ a concept such as democracy or populism, can you provide a simple enough definition to allow you to give each system a number (to denote democratic/ undemocratic or X% democratic) that means the same thing in each case? To this problem, I would add the general issue of breadth and depth. With large-N studies you can examine the effects of a small number of variables, to explain a small part of the politics of many systems. With small-N you can study a large number of variables in a few systems. The classic trade-off is between breadth and depth. Of course, if you are doing an undergraduate dissertation the big Q is: what can you reasonably be expected to do? Maybe your highest aim should be to make sense of the studies which already exist.
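The operationalisation problem above can be made concrete with a toy coding scheme. The indicators and codings here are entirely invented: the point is only that once you reduce a concept like ‘democracy’ to a handful of indicators and a score, systems that differ in qualitatively important ways can receive the same number.

```python
# Toy sketch of the operationalisation problem in large-N research.
# Indicators and case codings are invented for illustration only.

indicators = ["free_elections", "free_press", "independent_courts"]

cases = {
    "X": {"free_elections": 1, "free_press": 1, "independent_courts": 0},
    "Y": {"free_elections": 1, "free_press": 0, "independent_courts": 1},
}

def democracy_score(case):
    """Percentage of indicators met - a crude '% democratic' measure."""
    return 100 * sum(case[k] for k in indicators) // len(indicators)

# Both systems get the same score despite meeting *different* indicators:
# the single number hides the qualitative difference between them.
print({name: democracy_score(c) for name, c in cases.items()})  # {'X': 66, 'Y': 66}
```

This is the breadth/depth trade-off in miniature: the score travels easily across many cases, but the information about *which* indicator each case fails – exactly what a small-N study would dwell on – is lost.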