There are seven possible design types for high-level outcome/impact attribution evaluation [1]. This set of design types is used within the Outcomes Theory framework and the applied Easy Outcomes system to help users assess whether it is appropriate, feasible and/or affordable to undertake an outcome/impact evaluation of a program or intervention. The power of this approach lies in its claim that this is an exhaustive set of design types. If a user has considered the appropriateness, feasibility and affordability of each of these design types for a particular program, they can then say definitively whether or not an outcome/impact evaluation of that program is possible on those three criteria. High-level outcome/impact design types are one of the five building-blocks of all outcomes systems (evaluation systems, performance management systems, results-based systems etc.) which are detailed within Outcomes Theory. (See: Selecting impact/outcome evaluation designs: a decision-making table and checklist approach for an article on how to select amongst these different types of designs when designing an impact evaluation, and Impact/outcome evaluation designs and techniques illustrated with a simple example to see how each of the designs could be used in regard to the same simple example.)
The seven possible designs are:
1. True experimental design
2. Regression-discontinuity design
3. Time-series design
4. Constructed matched comparison group design
5. Exhaustive alternative causal identification and elimination design
6. Expert opinion summary judgement design [4]
7. Key informants summary judgement design [4].
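The screening step described in the introduction (assessing each of the seven design types for appropriateness, feasibility and affordability) can be sketched in a few lines of Python. This is only an illustrative sketch: the names `Assessment` and `screen_designs` are hypothetical, and treating a design as usable only when it passes all three criteria is an assumption, not something the article prescribes.

```python
from dataclasses import dataclass
from typing import Dict, List

# The seven design types listed in the article.
DESIGNS = [
    "True experimental design",
    "Regression-discontinuity design",
    "Time-series design",
    "Constructed matched comparison group design",
    "Exhaustive alternative causal identification and elimination design",
    "Expert opinion summary judgement design",
    "Key informants summary judgement design",
]


@dataclass
class Assessment:
    """Hypothetical record of one design's assessment for one program."""
    appropriate: bool
    feasible: bool
    affordable: bool

    def usable(self) -> bool:
        # Assumption: a design must pass all three criteria to be usable.
        return self.appropriate and self.feasible and self.affordable


def screen_designs(assessments: Dict[str, Assessment]) -> List[str]:
    """Return the usable designs, insisting that all seven were assessed."""
    missing = [d for d in DESIGNS if d not in assessments]
    if missing:
        # The conclusion is only valid if the whole list has been considered.
        raise ValueError(f"not yet assessed: {missing}")
    return [d for d in DESIGNS if assessments[d].usable()]
```

An empty result corresponds to the conclusion that outcome/impact evaluation is not possible for the program; the check for missing designs reflects the point that this conclusion depends on having considered every design in the list.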
Figure 1: Conventions used in high-level outcome/impact evaluation design diagrams
True experimental design
In the simplest typical case of this design, a group of units (people, schools, hospitals) which is the focus of the intervention being studied is identified. A sample is taken from this group (if there are large numbers of the particular unit on which the intervention could be used). The sample is randomly divided. One half of the units have the intervention applied to them (the intervention group) and the other half do not (the control group). Changes in measurements of the high-level outcomes from before to after the intervention are compared between the two groups. It is presumed that any significant difference (beyond what is estimated as likely to have occurred by chance) is a result of the intervention. This is because there is no reason to believe that the units in the intervention and the control group differed in any systematic way which could have created the difference, apart from receiving, or not receiving, the intervention.
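The logic of this design can be sketched in Python as random assignment followed by a comparison of before/after changes between the two groups. This is a minimal sketch, not a definitive analysis: the function names are hypothetical, and the permutation test is just one assumed way of estimating whether a difference is larger than would be likely by chance.

```python
import random
from statistics import mean


def randomly_assign(units, seed=0):
    """Randomly split the sampled units into intervention and control halves."""
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def mean_change(before, after, group):
    """Average before-to-after change in the outcome for one group."""
    return mean(after[u] - before[u] for u in group)


def estimated_effect(before, after, intervention, control):
    """Difference between the two groups' average changes."""
    return (mean_change(before, after, intervention)
            - mean_change(before, after, control))


def permutation_p_value(before, after, intervention, control,
                        n_permutations=2000, seed=0):
    """Share of random relabellings giving a difference at least as large
    as the observed one (an assumed, simple chance estimate)."""
    rng = random.Random(seed)
    observed = abs(estimated_effect(before, after, intervention, control))
    pooled = list(intervention) + list(control)
    n_int = len(intervention)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        effect = estimated_effect(before, after, pooled[:n_int], pooled[n_int:])
        if abs(effect) >= observed:
            extreme += 1
    return extreme / n_permutations
```

Here `before` and `after` are assumed to be dictionaries mapping each unit to its outcome measurement. A small p-value suggests the difference is unlikely to be due to chance alone, which is the basis for attributing it to the intervention.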
A variation on the true experimental design is what is called a 'waiting list' or 'pipeline' design. This design uses the same approach as the true experimental design; however, the intervention is only withheld from those in the control group for a limited period of time (the time they spend on the 'waiting list'). This is in contrast to a true experiment, where the control group would never get the intervention. This design is often regarded as more appropriate (because it is more ethical in that the control group do not miss out on the intervention) and more feasible (because participants and stakeholders are more likely to accept it) than true experiments. The problem with the design is that the effect of the intervention has to be measurable in the time between the intervention group getting the intervention and the control group getting the intervention. In the case of interventions which take a relatively long time to improve outcomes, the waiting-list/pipeline experimental design is not appropriate.
Figure 2 illustrates the true experimental design:

Figure 2: True experimental design
Regression discontinuity design
A regression discontinuity design can be used where units can be ranked on a measurement of a high-level outcome taken before any intervention takes place, for instance reading level for students or crime clearance rate for a police district. The sub-set of units below a cut-off point on the outcome measurement is then given the intervention. After the intervention has taken place, if it is successful, there should be a clear improvement in those subject to the intervention but no similar amount of improvement amongst those units above the cut-off point (which did not receive the intervention). This design is more ethically acceptable in a case where there are limited resources for piloting an intervention because (in contrast to a true experiment) the intervention resources are being allocated to those units with the greatest need.
Figure 3 illustrates this design:
Figure 3: Regression discontinuity design
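The regression discontinuity logic can be illustrated with a short Python sketch. It assumes a simple linear relationship on each side of the cut-off and that units scoring below the cut-off received the intervention; the function names are hypothetical and a real analysis would need to check these assumptions.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares straight line; returns (slope, intercept)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def discontinuity_at_cutoff(pre_scores, post_scores, cutoff):
    """Estimate the jump in the post-intervention outcome at the cut-off.

    Units with a pre-intervention score below the cut-off are assumed to
    have received the intervention; the rest are assumed not to have.
    """
    pairs = list(zip(pre_scores, post_scores))
    below = [(x, y) for x, y in pairs if x < cutoff]
    above = [(x, y) for x, y in pairs if x >= cutoff]
    slope_b, intercept_b = fit_line(*zip(*below))
    slope_a, intercept_a = fit_line(*zip(*above))
    # Predicted outcome at the cut-off from each side's line.
    treated_at_cutoff = slope_b * cutoff + intercept_b
    untreated_at_cutoff = slope_a * cutoff + intercept_a
    return treated_at_cutoff - untreated_at_cutoff
```

A jump close to zero is what would be expected if the intervention had no effect, since units just below and just above the cut-off should otherwise be very similar.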
Time-series design
A time series design relies on a sufficiently long series of measurements having been taken on a high-level outcome. An intervention is then introduced (or, in a retrospective analysis, has already been introduced) and, if the intervention has had an effect, a clear shift in the level of the high-level outcome measurements should be observable at the point in time when the intervention occurred.
Figure 4 illustrates this design:
Figure 4: Interrupted time-series design
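One way to look for such a shift is to project the pre-intervention trend forward and compare it with what was actually observed afterwards. The Python sketch below does this under the assumption that the pre-intervention trend is linear and would have continued unchanged; the function names are hypothetical.

```python
from statistics import mean


def fit_trend(values):
    """Least-squares linear trend over time steps 0, 1, 2, ...

    Returns (slope, intercept).
    """
    times = range(len(values))
    t_bar, v_bar = mean(times), mean(values)
    stt = sum((t - t_bar) ** 2 for t in times)
    stv = sum((t - t_bar) * (v - v_bar) for t, v in zip(times, values))
    slope = stv / stt
    return slope, v_bar - slope * t_bar


def estimated_level_shift(series, intervention_index):
    """Average gap between observed post-intervention values and the
    values projected from the pre-intervention trend."""
    pre = series[:intervention_index]
    post = series[intervention_index:]
    slope, intercept = fit_trend(pre)
    projected = [slope * t + intercept
                 for t in range(intervention_index, len(series))]
    return mean(obs - proj for obs, proj in zip(post, projected))
```

The longer the pre-intervention series, the more confidence can be placed in the projected trend, which is why the design depends on a sufficiently long run of measurements.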
Constructed matched comparison group design
This design is where a naturally occurring group is located which is similar in as many ways as possible to the group receiving the intervention, apart from the fact that it is not receiving the intervention. For instance, this could be different administrative units, different towns or different countries which do not receive the intervention. In a somewhat different version of this design (which employs the same underlying logic), estimates are made of what happens on average to people with a certain set of characteristics (e.g. who have been on an unemployment benefit for four weeks). An intervention is then given to a group and what happens to them (how long they remain on the unemployment benefit) is compared to the predicted amount of time they would have remained on the unemployment benefit if they had not received the intervention. This type of design is called propensity matching. Problems arise for constructed matched comparison group designs because (in contrast to true experiments) the comparison group is more likely to differ systematically from the intervention group. There is a set of techniques for attempting to deal with this problem. They are set out in Techniques for improving constructed matched comparison group impact/outcome evaluation designs.
Figure 5 illustrates the general case of constructed matched comparison group designs.
Figure 5: Constructed matched comparison group
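The matching idea can be sketched in Python as follows. Note that this sketch uses plain nearest-neighbour matching on a few measured characteristics, which is a simpler stand-in and not necessarily the propensity matching approach named above. The field names `traits` and `outcome` and the function names are hypothetical, and the traits are assumed to be on comparable numeric scales.

```python
from math import dist
from statistics import mean


def nearest_match(unit_traits, comparison_pool):
    """Pick the comparison unit whose traits are closest (Euclidean distance)."""
    return min(comparison_pool, key=lambda c: dist(unit_traits, c["traits"]))


def matched_effect(intervention_units, comparison_pool):
    """Average outcome difference between each intervention unit and its
    nearest comparison unit (comparison units may be reused)."""
    differences = []
    for unit in intervention_units:
        match = nearest_match(unit["traits"], comparison_pool)
        differences.append(unit["outcome"] - match["outcome"])
    return mean(differences)
```

For example, if `outcome` is weeks spent on an unemployment benefit, a negative result would suggest that the intervention group left the benefit sooner than similar people who did not receive the intervention. Any unmeasured difference between the groups remains a possible alternative explanation, which is the weakness of this design noted above.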
Exhaustive alternative causal explanation elimination design
The exhaustive alternative causal explanation elimination design proceeds by examining all of the possible alternative hypothetical outcomes hierarchies that may lie behind the changes observed in high-level outcome measurement. This can use a range of techniques, all directed at identifying and excluding alternative explanations to the intervention. Sometimes this is described as a more "forensic-type" method, in contrast to the experimental approaches described above. Figure 6 illustrates this design.

Figure 6: Exhaustive alternative causal explanation elimination design
Expert opinion summary judgment design [3]
In this design, an expert is asked to give their summary judgment opinion regarding whether high-level outcomes are attributable to an intervention. They are expected to use whatever data gathering and analysis methods they normally use in their work in the area and to draw on their previous knowledge in dealing with similar instances.
Figure 7 illustrates this design.

Figure 7: Expert opinion judgment design
Key informants summary judgment design
In this design, key informants (people who have experience of the program or significant parts of the program) are asked to give their summary judgment opinion as to whether changes in high-level outcomes are attributable to the intervention. They are expected to use whatever data gathering and analysis methods they normally use in their day-to-day work and to draw on their previous knowledge in dealing with similar instances. These judgments are then summarized, analyzed and brought together as a set of findings about the outcomes of the program. [3][4]
Figure 8 illustrates this design.
Figure 8: Key informants' judgment design
Please comment on this article
This article is based on the developing area of outcomes theory, which is still at a relatively early stage of development. Please critique any of the arguments laid out in this article so they can be improved through critical examination and reflection.
Citing this article
Duignan, P. (2005-2009). Seven possible impact/outcome evaluation design types. Outcomes Theory Knowledge Base article No. 209. (http://knol.google.com/k/paul-duignan-phd/seven-possible-outcomeimpact-evaluation/2m7zd68aaz774/10).
[If you are reading this as a PDF or printed copy, the web page version may have been updated.]
[1] This set of outcome/impact design types is an interim set. It may be subject to change in the future if it is established that there is a design type which has not been included within the list. Comments on whether this list is exhaustive can be posted below.
[2] Regression discontinuity design could be regarded as being a type of Constructed Matched Comparison Group Design. However, it has been separately listed here because there are some stakeholders who separate true experiments and regression discontinuity out from other designs as providing a more robust estimate of effect. This framework does not take a position on this issue; it simply allows those wishing to make that claim to make it.
[3] The first five of these designs are based on the thinking of the international evaluation expert Michael Scriven on possible ways of establishing causality in evaluation. The author has added the final two as some stakeholders in some situations regard these as providing sufficient evidence of causality for them to act upon. Whether or not these designs are accepted by a particular community of users of an outcomes system is up to that community of users. In theory it would be possible for a community of users to reject the notion that there is a particular set of whole-intervention outcomes attribution evaluation designs which provide more robust outcome attribution than other types of evaluation (often known as formative or process evaluation). Some of those who adopt a post-modern, relativist, interpretivist, constructivist or some other theory of science may want to do so. Outcomes theory only asks that such communities of users make an explicit decision about their rejection so that they can be clear about what is known, what is not known and what is feasible and affordable to know about a particular outcomes system.
[4] A number of stakeholders (called communities of users in outcomes theory) believe that the last two designs would not usually be expected to establish causality as robustly as the other listed designs. However these designs are frequently used by some communities of users and therefore deserve a place in a full typology of whole-intervention outcome attribution evaluation designs; in particular circumstances they are feasible, timely, affordable and accepted by stakeholders as better than having no whole-intervention high-level outcome attribution information. Even though they are often more feasible, timely and affordable than the other five designs, decision-makers have to consider on a case by case basis whether these designs can actually provide any coherent information about attribution or whether they will just end up being examples of pseudo-outcomes studies. Pseudo-outcomes studies are ones which do not contribute any sound information about attribution to a particular intervention but merely record that outcomes improved over the time period that the intervention was running.
V1-2 2005-2009
[Outcomes Theory Article #209]
References
- Some of this work was developed when the author was the 2005 New Zealand Fulbright Senior Scholar working at the Urban Institute in Washington, D.C.
Anonymous
evaluation battles for supremacy
I think it would be a mistake to attempt to rank order these outcome attribution techniques. I have big doubts about the blanket claims of 'robustness' of either quantitative or qualitative methodologies for determining causal attribution - particularly since causality is an unobservable. I think both (or all seven) methodologies are far too dependent on what has to be taken for granted for them to make sense and to seem credible. Unfortunately, in their advocate forms any method can be conducted in ways that are conceited and unquestioning about their taken-for-granteds. And to me, the most important or valuable aspect of 'scientificity' is being unconceited and genuinely seeking the truth* by being open to just this sort of questioning.
*even constructivists make ontological claims about the truth.
To discuss your points in order:
1. Blanket claims for the superiority of any one kind of impact/outcome attribution design should not be made. This is a possible position for someone to hold and I am trying to communicate that fact in Footnote 3 to the article: "In theory it would be possible for a community of users to reject the notion that there is a particular set of whole-intervention outcomes attribution evaluation designs which provide more robust outcome attribution than other types of evaluation".
2. It is not just relativists, interpretivists, constructivists or post-modernists who may hold this type of view. I have amended Footnote 3 to include 'some other theory of science', which is intended to cover anyone else with any other theory of science position (e.g. post-Popperian) who, because of their theory of science, does not believe that it is possible to make blanket claims for the superiority of any one kind of impact/outcome attribution design.
3. All the designs are far too dependent on what has to be taken for granted for them to make sense and to seem credible. I'm not entirely sure what you are saying here. But if it is that there are many assumptions behind the impact/outcome designs people try to use, I would agree with you, and also that in some cases it could be argued that these assumptions are wrong. However, many people believe that in particular cases, despite there having to be assumptions, some (or potentially all) of the designs do seem credible, and some people believe that in some cases (e.g. pharmaceuticals) some of these designs (e.g. experiments) are usually better at working out what is causing what than other designs. One of the things I am trying to do with this work is to encourage people to make such claims in an explicit, rather than implicit, way. So this framework of impact/outcome designs lies within a wider framework called the Building-blocks of outcomes systems (http://knol.google.
4. That at least sometimes people advocate for the superiority of particular designs in ways that are conceited and unquestioningly take things for granted. I would agree with you on this point.
5. Being scientific is about being unconceited and genuinely seeking the truth by being open to questioning. I would agree with you on this point. What I am trying to do with the list of impact/outcome designs is to not just include a narrow set of traditionally quantitatively focused designs, but to also include qualitative designs so that if people want to, they can attempt to make the claim that these say something about impact attribution. So in that sense I have tried to be as inclusive as possible in this framework, because I am trying to provide a generic framework, and then let people fight it out as to whether or not a particular design is appropriate, feasible and/or affordable in a particular case. If you look at the following article you will see how such assessment of whether an impact evaluation is appropriate, feasible and/or affordable works (http://knol.google.
In order to allow people to do this type of analysis and to come to a conclusion as to whether impact evaluation is appropriate in a particular case, then within outcomes theory there needs to be a concept of 'impact' evaluation as making a separate type of claim from other types of evaluation (non-impact evaluation - formative and process evaluation - see http://knol.google.c
So I see my role in this area as primarily providing a framework in which people can make claims, other people (like yourself) can challenge those claims, they can argue it out, and others can see what they think of the arguments.
Thanks for your comments.
Paul Duignan, PhD (Follow on: http://www.OutcomesB
Anonymous
fair comparisons
I am no expert but I do know that one of the most commonly known techniques is triangulation, usually understood as assessing validity through convergence (but there is at least one constructivist alternative use of triangulation that rejects the singular truth presumption). This was a concept/practice first used by social and behavioural scientists back in the 1960s in the field of measurement (e.g. it's the principle of validation behind inter-rater reliability and behind multiple items being used in one survey to measure a concept/practice). But since the 1970s the concept has been adopted in qualitative research and expanded by Denzin to include multiple methods, multiple evaluators, multiple theories, multiple subjects/perspective
You might like to have a quick look at my comment on the previous posting from another reader above which tries to explain more about what I am trying to do with the list set out in this article. I am trying to get evaluators to be explicit when they are, and when they are not, trying to make a claim about attribution of changes in high-level outcomes to a particular organization, project or intervention.
I may have to elaborate what I am saying in the last two designs (expert judgment design and key informant design) along the lines you suggest when I think about it more. However, at the moment I am trying to keep them really generic and simply say that they can consist of whatever methods the people making the judgments want to make (of course this means that they can include all that you are saying, triangulation etc.) But what I am trying to capture here is the essence of the method which is being used in the design and in the case of the last two designs, that can be put in the simplest fashion by just saying that the method relies on asking people to make a judgment.
I think that I am trying to do this (thanks for stimulating my thinking on this) because I want the different 'designs' to be sort of mutually exclusive, in the sense that they actually set out different ways of going about trying to attribute changes in high-level outcomes to specific interventions.
So I could talk about triangulation and all sort of things under the last two designs. But I think that the reason I don't want to do this is that I will then turn them into descriptions of research in general. As I say, I am trying in this article to tease out all of the possible types of claim which stakeholders may be prepared to accept as having established (for the pragmatic purpose of deciding on future action) that an intervention causes high-level outcomes to change.
So, as I said in my comment on the previous reader's post, I want to distinguish this type of claim-making from other types of evaluation and research (often called formative and process evaluation) because I want us to be clear when people are, and are not, making causal claims.
I don't know if what I am saying here makes any sense to you, I will think about it further.
On a more technical note, are you talking about construct validity at the end of your post? These are very interesting comments and I will think on them further; I also am not entirely happy with the way I have described the last two designs.
Paul Duignan, PhD. (Follow on http://www.OutcomesB
I believe I understand your intention with this list. But I think you are not treating qualitative and quantitative techniques in the same way. In the first five quantitative techniques you set out five very specific techniques in which causality theories/attribution
I think you can do this in simple terms, as you have done with the quant research, though it might take more digging through good qual evaluation books to identify distinct techniques relevant to outcomes attribution validation. I know that some qual researchers encourage the lack of attention to qual validation techniques by referring to the researcher as the 'instrument'.
Another technique that might be worth including as its own category is mixed methods (qual/quant). There appear to be some interesting techniques for quality assurance developing in American evaluation that depend on quant and qual researchers working together and checking the validity of each other's methods.
My dilemma is that I want the list ultimately to be an exhaustive list of 'designs' which at least some people accept can be used to make a claim that an intervention has caused changes in high-level outcomes. I don't want it to turn into a general list of evaluation-related research methods. The textbooks are full of such lists and they are great for learning about evaluation research methods.
What I am trying to do here is very specific and different from a general list of evaluation research methods. If we can develop an exhaustive list of 'designs' which constitute claims about interventions having caused changes in high-level outcomes, such a list can be used in two ways. First, for the evaluation planner, if they have gone through the list and assessed each design for its appropriateness, feasibility and/or affordability, they and their stakeholders can be assured that they have done 'due diligence' in evaluation planning in regard to impact evaluation. The result of such an exercise will either be that some of the designs are possible, or that none of them are possible. In the case where some are possible, the evaluation designer can put the following question to their funder (or other stakeholders): it is possible to do designs X, Y and Z... if I did these designs, would you accept that the results establish (to whatever level of satisfaction you hold) that changes in high-level outcomes have been demonstrated as attributable to the program?
This then leads into the second, and related, use of the list. This use is to help evaluation planners and stakeholders to be absolutely clear when we are deciding not to do impact evaluation. I think that there is a lot of confusion in the evaluation area. Many stakeholders think that impact evaluation is being done and many evaluators go along with the idea that some 'sort' of impact evaluation can be done in all cases. I don't think that this is the case. I think that often impact evaluation is not appropriate, feasible and/or affordable (this is not to say that it is not great when you can do it properly).
What happens if we all go along with the 'we will do impact evaluation regardless' approach is that we get a lot of what I call pseudo-impact evaluations - evaluations which do not actually establish attribution but just pretend to do so. If we are brutally frank about when we cannot do impact evaluation (and it takes a lot of professional confidence on the part of the evaluator to claim this - the purpose of designing the list in the way I am trying to design it is to buttress and support this professional confidence) then this opens the way to think about other types of evaluation. (Impact evaluation - when it should and should not be done http://knol.google.c
Paul Duignan, PhD. (Follow on http://www.OutcomesB
Wanting to use the list in this way - to clearly identify when we can, and when we can not do impact evaluation, has implications for what goes into the list. So in constructing the list, so far in my thinking, I don't think that there is any need to differentiate between quantitative and qualitative techniques. So while you perceived the first five designs to be quantitative, I don't think that the fifth design 'Exhaustive alternative causal identification and elimination design' should be viewed as a solely quantitative design, I think that it could be done in a qualitative manner. So I am not trying to differentiate the last two designs 'Expert judgment design' and 'Key informant judgment design' by virtue of them being qualitative designs - regardless of whether people see them in this way or not.
What I am trying to get at is, what is the discrete mechanism by which an audience would accept that a design had actually made a claim attempting to establish attribution? So while under design six, for instance, 'Expert judgment design', I could put a whole lot about triangulation and qualitative techniques etc. (which I may do), I don't want to distract from what the essential mechanism really is which an audience is pinning its hopes on in regard to establishing attribution. This is because, if someone does a lot of triangulation and establishing 'meaning' and all sorts of things that I could list from the qualitative research canon under design six (or whatever you wanted to call the design), then (at the moment, and I am open to correction on this) I think that such activity, in terms of actually attempting to make a claim for establishing causality, may come across, from the perspective of an evaluation audience, as either: 1) attempting to do the Exhaustive alternative causal identification and elimination design (Design 5), or 2) basically a 'trust me, I'm an expert' approach (Design 6). Is there some other mechanism between these two which I have not captured?
The 'trust me, I'm an expert' approaches are known in evaluation circles as connoisseurship evaluation (often compared to wine tasting). Such an expert may use quantitative as well as qualitative methods (again making the point that I am not trying to drive a clear quantitative/qualita
Paul Duignan, PhD. (Follow on http://www.OutcomesB
Nevertheless, I have another suggestion which seems the most clear cut qualitative 'design' alternative to include here. This is the Most Significant Change technique. http://www.mande.co.
It is specifically designed to address impact/outcomes attribution particularly in places where for various reasons quantitative information is unreliable or unavailable.
audiences will be very similar in terms of what sort of information it would take to convince them (you may not disagree with this);
that to persuade people about causality attribution it will necessarily (or most of the time) be enough to rely on one discrete mechanism in isolation;
that each causality-outcome attribution technique can be strictly compared when they provide different types of information relevant to high level outcomes attribution; and
that it makes sense to provide funders with a list of discrete mechanisms without also discussing the conditions of their plausibility (not just their feasibility).
As you know, these issues are particularly important when considering impact/attribution evaluations for complex, differentiated and/or unpredictable interventions. I think these days, with rising skepticism about and politicisation of social research (both quant and qual), the diversity of audiences for evaluations, the trend toward learning, and the democratisation of program management, it has become increasingly expected that a variety of methods will be necessary to convince audiences of outcome/impact attributions. That's why I suggested that quality assurance techniques for combining mixed methods in outcome attribution evaluation designs could be a useful and realistic addition if your main focus is identifying what is necessary to persuade audiences.
I also think it is not at all accurate or fair to reduce all qualitative research techniques to fitting somewhere between trust in expertise and exhaustive alternative causal identification and elimination design. If we were to apply the same level of generalization to the first four designs then these could also be similarly caricatured as 'trust me, I'm an expert at interpreting statistical models, designing prophetic causal attribution/impact models and inferencing powers about unknown populations'. It's important to remember that people often don't trust statistics precisely because there is a bitter history of 'experts' stuffing them up (with poor premises, model implementation and interpretation) or of statistics being subject to politicised manipulation. I think statistics are particularly vulnerable to mistrust to the extent that people feel that they cannot themselves check or be sure that they have been conducted with integrity and sufficient critical thinking. I think the mystification effect of statistics (like connoisseurship) means that it's important that the designs are transparent and readily understandable by non-experts and/or that there is some robust technique through which diverse audiences can be confident that the premises, assumptions, implementation and interpretations are sound.
Critical and analytical traditions in the social sciences - big topic, I do have comments on it but would need to think more for them to add any value. Re the Most Significant Change (MSC) approach - when I get a moment I want to go through and carefully analyze approaches like that (another one is Contribution Analysis) to try to work out exactly what the method wants to claim it is doing. MSC, in, for instance, the handbook you linked to, does not seem to put a lot of emphasis on looking for evidence of a causal process and eliminating other possible causal processes which could have produced changes, but it may be that this is so obvious in the process that it does not have to be mentioned separately.
What I would be attempting to do in my analysis of MSC and Contribution Analysis and others is to try to see what sort of information and analysis they are trying to provide. In doing this I would use the five building-blocks diagram - http://knol.google.c
It may be that within MSC, as you say, there is another discrete mechanism being offered as a way of establishing attribution - and maybe it is one which claims that if you get close enough to the details of the program - e.g. by going to all the effort they do in MSC to collect stories about the program, and put them through a quality control selection process - you can just 'see' in some way the causal mechanisms operating. I presume that this is like Thick Description in qualitative research and I know that Carol Weiss (an evaluator) talks in at least one place about almost being able to 'see' certain program mechanisms. And I know that in at least one evaluation I've been involved in designing - one for a national academic research assessment system - it seemed to me that one line of argument was that in terms of the impact of the program it was so obvious that you could 'see' the mechanism and the fact that you could not do most of the other seven designs did not mean that you should say that it was impossible to establish impact attribution (http://www.systemat
I have detected a little of that, without having had the opportunity to look in any detail - but in Contribution Analysis there is, I think, talk of looking at all of the alternative explanations, and as I skimmed through the MSC handbook just now I saw mention of it being a good idea to also collect stories of changes that were not brought about by the program being evaluated - which to me hints at the Exhaustive alternative causal identification and elimination design.
Of course, whether or not MSC or Contribution Analysis fits within this typology says nothing about its value as an evaluation technique (which I have a lot of respect for). I am just trying to tease out the exact nature of the claims for establishing impact attribution which lie behind it.
Paul Duignan, PhD. (Follow on http://www.OutcomesB
1. It may not be enough to rely on one discrete mechanism of impact evaluation in isolation to persuade people about causality attribution
2. That each causality-outcome attribution technique cannot be strictly compared because they provide different types of information relevant to high level outcomes attribution
3. That if funders are provided with a list of discrete mechanisms, they also need the conditions of their plausibility (not just their feasibility) discussed.
1. I totally agree with the first point, and so have no problem with the concept of triangulation (of course, as a working evaluator I spend my time pointing out to clients that it depends on how much money you want to spend). Having made the list, I am leaving it up to stakeholders to figure out what they want to 'buy' to be sufficiently convinced in a particular case (for the pragmatic purpose of determining further action, not establishing perfect truth - sorry, dropping into theory of science issues here). In this article I just want to provide a menu of technical options. What they have in the meal is up to them. At some stage I might try to formulate something that helps with thinking about combining them. But I would often recommend to clients that they combine an impact design with some formative or process evaluation rather than two types of impact evaluation (just because of resource constraints).
2. I think that the seven impact designs can be compared in that they are all attempting to claim that they are producing credible information establishing attribution - that is the common thing which I want them to have by being in the list of seven. However, different ones are more or less suitable in different circumstances and this is what I am getting at in the decision-making table approach to selecting impact evaluation designs here (http://knol.google.
3. I suppose that I could attempt to provide some guidance on plausibility; however, I think at the moment I want to leave the ball in their court as to what they will regard as plausible. It seems to me it would be quite a minefield to deal with that issue (I see lots of theory of science discussions emerging). At the moment I think that the safest thing to do is to leave it up to the evaluation stakeholders to use whatever criteria they like.
The second comment three posts above, about the complexity of programs and stakeholder expectations, could lead to talking about quality assurance methods for combining mixed methods - but I only want to do this if it is a discrete design which makes a claim that it is establishing attribution and is separate from the others which I have listed above, and at the moment I am not certain that it is. This is not to say that on further reflection I can't come to the point of seeing that it is a discrete design.
Paul Duignan, PhD. (Follow on http://www.OutcomesB
From a technical point of view for the typology here, it is not a problem that few people accept a particular design - only that it is a discrete type of design. There is a lot of skepticism about experts around these days, but I think that we do accept 'trust me I'm an expert' a lot in daily life and that if stakeholders wish to, they can adopt this approach.
I have considerable sympathy with the comments in the second part of the post about trust in experts and statistics. The way I often put it is that in some situations what is being proposed is that we put an economist in a room for a week, close the door and let them do their multivariate regression thing and they come out with a determination of how much of the variance can be explained by the intervention. Now this makes sense if you trust the average economist and presumably many stakeholders do.
Regarding the comment:
'I think statistics are particularly vulnerable to mistrust to the extent that people feel that they cannot themselves check or be sure that they have been conducted with integrity and sufficient critical thinking. I think the mystification effect of statistics (like connoisseurship) means that its important that the designs are transparent and readily understandable by non-experts'
Making 'designs transparent and readily understandable by non-experts' is what my mission is here. The truth of the matter is that the statistics used in some of these designs are complex, and we have to watch that we don't abandon complex statistical methods just because stakeholders can more readily understand other 'simpler' methods. I think that there is also an issue about stakeholders understanding what is happening in more qualitative designs with all their talk of triangulation and thick description, let alone if they head off into territory such as neo-Foucauldian discourse analysis.
Anyway, what I am trying to do along these lines is to:
1. Help stakeholders be very clear about when an evaluator is, and is not, making a claim that they believe they have established attribution of high-level outcomes to an intervention. On this one, I suppose you could see what I am doing as trying to draw evaluators out of the position I think they sometimes occupy of implying (even just by using the name 'impact evaluation') that they are establishing attribution when, if you actually examine what they are saying and press them on whether they are or are not claiming that they believe they are establishing attribution, they back off and say that what they are trying to do is something else - e.g. mounting a case that, because the lower levels of an outcomes model (logic model) have been established in the case of a particular program, and there is reason to believe that the connection between these and higher levels in the model occurs in other instances, it is reasonable to presume that high-level outcomes have been changed in the case of a particular program. I have no problem with this line of reasoning; I just don't want to characterize it as the same as making a claim that, in the case of a particular program, you believe you have actually established attribution at a high level. (See here for the types of claims you can make in regard to outcomes models (logic models) http://knol.google.c
2. Make it easier for stakeholders to understand exactly what is being done when a particular impact evaluation design is being used. So for instance, my attempt to spell out all of the impact evaluation designs by way of using a single very simple illustrative example in this article here http://knol.google.c
Thanks for all these comments. I do think there are likely to be other qualitatively oriented mechanisms for making a claim, and I do want to identify these better than they have been identified so far.
Paul Duignan, PhD. (Follow on http://www.OutcomesB