An Introduction to Evaluation. Chris Fox

and not-for-profit sectors, a greater emphasis on the choice available to service users, and the creation of new public–private vehicles to deliver services (Hill and Hupe 2014). The adoption of business practices, a greater focus on managing by outputs and the increased ‘marketisation’ associated with NPM contributed to the proliferation of performance management measures.

      Palfrey et al. (2012: 19, citing Carter et al. 1992) suggest that PIs are very useful as ‘tin openers’ because they help us clarify questions about performance. In this sense, they are a valuable starting point for evaluation, but are not a substitute for evaluation that incorporates the use of research methods (Palfrey et al. 2012).

      Audit has also proliferated in the UK and US. Power (1997) charts a move from traditional audits that focus on financial probity to audits that ask broader questions about organisational performance and ‘Value for Money’ (VFM) (Palfrey et al. 2012). However, deciding on what matters in VFM involves value judgements (ibid.) and, by imposing values, audits can have unintended and dysfunctional consequences for the audited organisation. Evaluation does not avoid such value judgements, but social scientists recognise their importance and have developed a number of strategies to avoid or incorporate them, depending upon the social science tradition they come from.

      Accreditation has been used widely in the UK public sector as a strategy for setting standards for the performance of organisations and often starts with self-evaluation (Palfrey et al. 2012). Well-known examples in the UK include the use of ‘Trust’ status in sectors such as health and ‘Investors in People’ – a government agency that accredits organisations that demonstrate good practice in workforce management (ibid.).

      If the ‘research’ component is what distinguishes evaluation from practices such as PIs, audits and accreditation, what is it then that differentiates ‘evaluation’ from ‘research’?

      Distinguishing evaluation from research

      The distinction between evaluation and research is discussed by Lincoln and Guba (1986), who note that both are forms of ‘disciplined inquiry’ and use many of the same tools or methods. However, sharing methods does not make them one and the same thing. Lincoln and Guba argue that ‘to assert identity or similarity on the basis of common methods would be analogous to saying that carpenters, electricians, and plumbers do the same thing because their tool kits all contain hammers, saws, wrenches, and screwdrivers’ (1986: 547). The key distinction between evaluation and other types of research is the importance of judgements of value in evaluation.

      Defining evaluation according to judgements of value

      This brings us to the third group of definitions identified by Mark et al. (2006), which concentrate on the function evaluation serves and assume that evaluation involves judgements of value. As an example of a definition of evaluation based on judgements of value, Mark et al. (2006) cite Scriven’s definition:

      Evaluation refers to the process of determining the merit, worth, or value of something, or the product of that process ... The evaluation process normally involves some identification of relevant standards of merit, worth, or value; some investigation of the performance of the evaluands on these standards; and some integration or synthesis of the results to achieve an overall evaluation or set of associated evaluations. (Scriven 1991: 139; emphasis added)

      Lincoln and Guba (1986), when considering what makes evaluation different from research, argue that the latter is undertaken to resolve a problem while evaluation is undertaken to establish value. They define research as ‘a type of disciplined inquiry undertaken to resolve some problem in order to achieve understanding or to facilitate action’ (1986: 549), whereas evaluation is defined as:

      a type of disciplined inquiry undertaken to determine the value (merit and/or worth) of some entity – the evaluand – such as a treatment, program, facility, performance, and the like – in order to improve or refine the evaluand (formative evaluation) or to assess its impact (summative evaluation). (1986: 550)

      This difference, which Lincoln and Guba describe as ‘monumental’, also leads to what they see as a key distinction in the products that result. Whereas research is typically adequately served by a technical report, this by itself is rarely sufficient for an evaluation if it has to meet the needs of, and communicate with, its various audiences (Lincoln and Guba 1986).1

      Our preferred definition of evaluation

      The many definitions of evaluation suggest that it is not easy to pin down the concept. Indeed, some observers have argued that this is a pointless task. For example, some evaluators reject objective ‘scientific’ approaches to evaluation, arguing instead that because the human world is socially constructed, evaluation is itself a social construct. There are multiple social constructs and therefore, from this relativist point of view, there is no right way to define evaluation. Thus, by the end of the decade, Guba and Lincoln were arguing: ‘There is no answer to the question, “But what is evaluation really?” and there is no point in asking it’ (1989: 21).

      We recognise the importance of purpose and methods in defining evaluation, but also take the view that what is crucial for defining evaluation is the emphasis on a process of determining the merit, worth or value of something, along the lines suggested by Scriven.

      Distinguishing evaluation from research as a practice designed to establish the value of an entity has notable implications that will resurface throughout this book. If we accept that the purpose of evaluation is to determine the value of the entity being evaluated, and that the products of an evaluation are designed to improve the thing being evaluated or to assess its impact, this has important repercussions for evaluation and for evaluators. If we return to the very first definition of evaluation that we considered, i.e. Mark et al.’s (2006) view of evaluation as a ‘politicized practice that nonetheless aspires to some position of impartiality or fairness’, we can start to see the potential tensions in a practice that is at once politicised but also aspires to impartiality or fairness.

      Some would go further, seeing in the literature the suggestion that evaluation is either a political activity in itself or serves a political purpose (Palfrey et al. 2012). In their review of the relevant literature, Palfrey and colleagues distinguish between these two possibilities. Citing the work of Patton (1988), they suggest that, at a minimum, if evaluation is intended to improve services then, in the public sector, the decision makers who will act on evaluation findings are either local or national politicians. In this case evaluation serves a political purpose. However, many commentators on evaluation go further and, as Palfrey et al. (2012) note with particular reference to the collected work of Guba and Lincoln, some see evaluators as ideologically committed, with political sympathies that should influence the design of their evaluations. Whichever view is taken, Palfrey et al. argue that:

      evaluation has come to be associated for the past few decades as a potent ally of politicians in that it is a means of assessing the ‘value’ of projects, programmes and policies. (Palfrey et al. 2012: 29)

      The implication of this is that:

      In the real world of politics, despite the mass of literature supporting and promoting evaluation as a subject worthy of intense bookish activity, it has to be acknowledged that for all its intellectualising credentials it is a servant and not an equal of politicians. (2012: 29)

      Different types of evaluation

      A number of distinctions are made when describing different types of evaluation and we look at some of the more common ones here.

      Formative and summative evaluation

      Scriven (1967) makes a distinction between formative and summative evaluation, which Lincoln and Guba (1986) suggest are, broadly speaking, aims of evaluation:

      The aim of formative evaluation is to provide descriptive
