experimental designs in which researchers deliberately manipulate factors that are seen as potentially causal (often referred to as ‘treatments’ or ‘interventions’) and carefully observe the effects of such actions while controlling as far as possible for extraneous factors. Data analysis will focus on the nature of those designs and how results can be tested using statistical inference. Non-experimental datasets, by contrast, will be generated largely from various kinds of survey, but may also be a result of using electronic or observational data capture techniques. The kinds of manipulations and controls in experimental designs are seldom possible with survey and other non-experimental data, while samples tend to be much larger, making the role of statistical inference far less important. Statistical inference is still covered in this text, but as part of variable-based analyses, taking variables one, two and three or more at a time.
Within the context of non-experimental datasets, this text is designed as a stand-alone, complete introduction to both variable-based and case-based approaches to data analysis. To date, books tend to focus on one or the other, while comparisons between approaches are few. The book assumes no prior knowledge of statistics. However, details of statistical calculations or how to use particular pieces of software are put into boxes so that students or researchers can skip these if they are already familiar with the procedures involved. The approaches, methods and techniques are all illustrated with one main dataset based on research carried out at the Institute of Social Marketing at the University of Stirling, which investigates the role of alcohol marketing in the drinking behaviour of young people. The dataset is far from perfect; in fact there are many problems with it. This, however, turns out to be an advantage since many typical problems in the analysis of real-life data come to light and remedies or ways to handle them can be considered.
This book should be of interest to final-year undergraduates and postgraduates who are undertaking modules or courses in research methodology in the areas of sociology, business studies, marketing, health and education. In particular, it should be of interest to students who are doing projects, dissertations or theses and who are wondering what approaches to data analysis are possible and what data analysis strategies they should adopt. It should also be of interest to researchers who usually go to great lengths to construct their data, but often find that traditional, variable-based statistics produce disappointing or inconclusive results.
Part One of this book provides an overview of the nature of quantitative data, their structure, preparation and analysis. Chapter 1 introduces the alcohol marketing dataset and considers what data are, how they are constructed, how they are structured and how errors can arise in the process of construction. Chapter 2 looks in some detail at how quantitative data need to be prepared ready for analysis, for example by performing a number of transformations on them. Chapter 3 explains the notions of datasets and data matrices, outlines the various elements that go into the data analysis process and explores some of the ethical considerations when constructing and analysing data.
Part Two then turns to an approach to data analysis that reviews data on a variable-by-variable basis, looking at the distributions of values and their frequencies across a set of cases. Chapter 4 shows how variables can be displayed and summarized one at a time and how inferences can be drawn should the data be based on a random sample. Chapter 5 then takes variables two at a time, showing the variety of relationships that is possible and how such relationships can be displayed, summarized and have inferences drawn from them. Finally, Chapter 6 introduces multivariate analysis which takes three or more variables at a time. Part Two will be familiar to those readers who have already studied traditional statistics. However, I have tried very hard to show what these procedures accomplish (and what they do not) without being, at this stage, overly evaluative or critical of the approach. All the techniques are illustrated using the survey analysis package IBM® SPSS® Statistics software. Explanations of how to use SPSS are put into boxes, so while this text is by no means a manual on how to use the software, focusing on the boxes will get you started on this program. SPSS has been around a long time (it was developed in the 1960s) and is available in most university and college labs. However, there are many other packages and these are reviewed briefly at the end of Chapter 4. The version of SPSS used in this text is version 19.0. At the time of writing, the latest version is 22.0. However, most users are likely to have earlier versions. The guidelines on the use of SPSS in the boxes in this book are unlikely to be affected by the version being used.
An alternative approach to variable-based data analysis, which is the focus of Part Three, is to review a dataset on a case-by-case basis, looking at the configurations of values for each case across a set of case characteristics. Chapter 7, on set-theoretic methods and configurational data analysis, uses fuzzy set analysis and is not likely to be familiar to readers. While the logic of this approach can be daunting, I have tried to explain and illustrate each step of the way. To those who are uneasy about ‘numbers’, the good news is that there is very little by way of calculation in this chapter. Furthermore, if the research is based (or likely to be based) on a relatively small sample or population (of between about 30 and 100 or so) then this approach is well worth looking at. This chapter introduces a particular freeware program for fuzzy set analysis called fsQCA, which stands for fuzzy set Qualitative Comparative Analysis.
Part Four compares the strengths and weaknesses of variable-based and case-based approaches, considers how they can be mixed or combined, and discusses how both can be used to evaluate hypotheses, establish causal relationships, explain the findings and communicate results to an audience.
I have used several learning tools to assist the reader. Each chapter begins with a list of learning objectives that outline what the reader can expect to learn in that chapter. The list will allow readers to monitor their understanding and progress during the chapter. An introduction then explains how the chapter is organized, provides some background to its content and links it to other chapters in the book. At the end of most major sections in each chapter there are key points and wider issues. These provide a quick summary of the key points and go on to consider any issues that might arise from the section by looking at the wider context. Boxed areas provide more detailed information on either the calculation of particular statistics or the operation of the software that is being used for illustration. Each chapter makes frequent reference to the alcohol marketing dataset to illustrate procedures and points being made and, at the end of each chapter, the implications of the chapter content for the alcohol marketing dataset are discussed. The chapter is then summarized, there are exercises and questions for discussion, and suggestions are made for further reading. The reading suggestions are annotated to help readers decide what is worth following up from their point of view. At the end of the book there is a glossary, all the references are collected together and there is a full index. In the text, a selection of key terms to be found in the glossary is highlighted in bold the first time they appear, and in selected places where it is felt this would be useful to the reader.
On matters of style, the word ‘data’ is treated as plural throughout. Numbers or words in the text that appear, or might appear, in software, or that are to be entered into it, are set in Courier New font. I have tried to develop a terminology that is consistent throughout the text so that words like ‘case’, ‘variable’, ‘value’, ‘score’, ‘measure’, and so on have the same meaning throughout. I have, of necessity, introduced in many places terms that are explained only later in the text. I have indicated where this is so, and in addition the terms will be in the glossary. The material in the boxes can be skipped without detracting from the meaning of the text.
I would like to thank the people who have helped me to complete this text. First, my editor, Jai Seaman, for her encouragement, suggestions and critical comments. Second, the eight anonymous reviewers of my book proposal whose comments