Analyzing Qualitative Data. Graham R Gibbs
Nevertheless, in most cases you will want to record exactly what the respondent said, even if it is ungrammatical. Other, often more insidious, errors arise because the transcriber has misheard what was said on the recording. Sometimes this is because the recording was made in a noisy place or it has picked up the sound of the recorder mechanism and it is hard to make out what is said. In face-to-face speech humans are very good at filtering out such noises, but recordings do not filter them, so it is harder to hear what is said over the background. But even where the sound is good there are many cases where the transcriber has heard one thing whereas the respondent said something else. Hearing exactly what is said involves understanding and interpretation. Sometimes the right sound is heard but the interpretation is wrong, as in the UK comedian Ronnie Barker’s classic comedy sketch on the confusion between ‘four candles’ and ‘fork handles’. More often than not, though, it is in the process of interpretation that something different is heard from what was actually said. Table 2.1 lists some of the errors of interpretation found by a Canadian researcher using audio typists to transcribe interviews on trade union activities.
Table 2.1
From email from Carl Cuneo, 16 June 1994, QUALRS-L Listserv
Various things can be done to minimize these errors. It helps to have the best sound quality possible, so use good equipment. But no matter how good the sound, there is always going to be a need for interpretation and understanding of what is heard. The best way to reduce errors here is to make sure that the transcriber understands the context and subject matter he or she is transcribing and is used to the accent, cadence and rhythm of the speakers. Transcribers may therefore need training to help them become familiar with the subject matter. This is one of the biggest advantages of doing your own transcription. You will know the context of the interview, and, I hope, be familiar with the subject matter.
You should also use your word processor to check the spellings in your text. Not only should common words be spelled correctly, but proper names and dialect and jargon terms should also be spelled consistently. This means that if you are using software to assist your analysis you can use the search facility in it without having to worry about alternative spellings.
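A word processor's spelling checker will not usually tell you that the same name has been typed in two different ways. The following is a minimal sketch, in Python, of one way to flag likely variant spellings in a transcript before you start searching it. The function name, the similarity cutoff of 0.85 and the sample sentence are all assumptions made for illustration. It compares every pair of distinct words, so it will also flag harmless pairs such as singulars and plurals, and you still have to decide which spelling is the right one.

```python
import re
from difflib import SequenceMatcher
from itertools import combinations


def find_spelling_variants(text, cutoff=0.85, min_length=5):
    """Return pairs of distinct words that look like variant spellings.

    cutoff and min_length are illustrative defaults, not recommended values.
    """
    # Collect each distinct word once, ignoring case and very short words.
    words = sorted({w.lower() for w in re.findall(r"[A-Za-z']+", text)
                    if len(w) >= min_length})
    pairs = []
    for first, second in combinations(words, 2):
        # ratio() is 1.0 for identical strings and falls toward 0.0
        # as the two spellings diverge.
        if SequenceMatcher(None, first, second).ratio() >= cutoff:
            pairs.append((first, second))
    return pairs


# Invented sample text: the same surname typed in two ways.
transcript = (
    "Mr MacDonald chaired the meeting. "
    "Later McDonald spoke about the overtime ban."
)
print(find_spelling_variants(transcript))
```

Once a pair has been flagged, you can use your word processor's find-and-replace to settle on one spelling throughout.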
Printing the Transcript
Even if you intend to use CAQDAS for managing all your analysis you may still want to print out your transcripts because it is easier to check them, because you can show them to respondents for checking and because you want to do some analysis using the printed copy. One thing to decide at this stage is whether you are going to use CAQDAS for your main analysis or for keeping the definitive record of your analysis – especially your coding. If you are doing either, then you should ensure that your printouts are the same as the text that appears on screen when you have imported transcripts into your CAQDAS program. That way you make it easier to transfer into the software any notes you have written on the transcripts. In this case it is best to import your transcripts into the CAQDAS program and use that program to print them out.
If you do not intend to use CAQDAS then you can print directly from your word processor. There are three things to consider.
1. Line numbers
If you want your transcripts to show line numbers (some approaches recommend this, e.g. for cross-referencing) then use your word processor to set this up. Most have an option to do this automatically – you don’t have to do it manually (e.g. in Microsoft Word use the ‘Page Layout’ ribbon). N.B. If you are using CAQDAS then use that software to insert line numbers. Do not do it in your word processor before you import the files to your project.
2. Margins
Leave wide margins on the sheets for you to annotate and indicate coding ideas. Most people leave a wide margin on the right. Use the margin setting in your word processor (e.g. in Microsoft Word select all the text then move the margin tabs in the ruler).
3. Line spacing
Double-space the text (or use line-and-a-half spacing). Again this leaves room for underlining, comments and circling the text. (In Microsoft Word, use the ‘Home’ ribbon.)
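If you keep your transcripts as plain text files and are not using CAQDAS, the three settings above can be approximated without a word processor. The sketch below is only an illustration in Python: the function name and the width of 60 characters are assumptions, and wrapping the text to a narrow width is simply a stand-in for a wide right-hand margin. As noted above, do not number the lines in this way if you intend to import the file into a CAQDAS project.

```python
import textwrap


def format_for_printing(text, width=60, double_space=True):
    """Wrap a transcript, number every line and optionally double-space it."""
    lines = []
    for paragraph in text.splitlines():
        # A narrow width leaves a wide right-hand margin on the page.
        # wrap() returns [] for an empty paragraph, so keep a blank line.
        lines.extend(textwrap.wrap(paragraph, width=width) or [""])
    numbered = [f"{number:4d}  {line}"
                for number, line in enumerate(lines, start=1)]
    # A blank line between numbered lines gives room for annotation.
    separator = "\n\n" if double_space else "\n"
    return separator.join(numbered)


print(format_for_printing("INT: Can you tell me how you first got involved?"))
```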
Internet Data
One way to avoid most of the problems associated with transcription is to collect your data via the Internet. All textual data that can be gathered from the Internet – email messages, web pages, chat room dialogues, commercial news archives, etc. – come already in electronic form. No transcription is required. Most email is still plain text, so there is no problem just saving the messages as that. However, it is important to keep the header information too, so that you know who the message was from, who it was to, when it was sent and what topic it was about. Some emails are threaded. That is, messages on the same topic are linked together chronologically. You may want to preserve the threading in your files for analysis; for example, by putting all messages in the same thread in the same file, in chronological order.
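If you have a large number of saved messages, keeping the header information and sorting the messages into threads can be done with a short script. What follows is a minimal sketch using Python's standard email module. It assumes that each message has been saved as plain text with its headers intact, and it treats messages as belonging to the same thread when their subject lines match once any leading ‘Re:’ or ‘Fwd:’ has been removed. That rule is a simplification, and the function names, addresses and sample messages are invented for the example.

```python
import re
from collections import defaultdict
from email import message_from_string
from email.utils import parsedate_to_datetime


def thread_key(subject):
    """Normalize a subject line so replies share a key with the original."""
    # Strip any run of leading "Re:" / "Fw:" / "Fwd:" prefixes.
    stripped = re.sub(r"^\s*(?:(?:re|fwd?)\s*:\s*)+", "", subject or "",
                      flags=re.IGNORECASE)
    return stripped.strip().lower()


def group_into_threads(raw_messages):
    """Group saved messages by thread, each thread in chronological order."""
    threads = defaultdict(list)
    for raw in raw_messages:
        msg = message_from_string(raw)
        record = {
            # Header information: who from, who to, when and what topic.
            "from": msg["From"],
            "to": msg["To"],
            "date": parsedate_to_datetime(msg["Date"]),
            "subject": msg["Subject"],
            "body": msg.get_payload(),
        }
        threads[thread_key(msg["Subject"])].append(record)
    for records in threads.values():
        # Assumes every Date header carries a time zone offset.
        records.sort(key=lambda r: r["date"])
    return dict(threads)


# Invented sample messages, deliberately given out of order.
raw_messages = [
    "From: b@example.org\nTo: list@example.org\n"
    "Date: Tue, 14 Jun 1994 10:30:00 +0000\n"
    "Subject: Re: Transcription errors\n\nI have seen the same thing.\n",
    "From: a@example.org\nTo: list@example.org\n"
    "Date: Mon, 13 Jun 1994 09:00:00 +0000\n"
    "Subject: Transcription errors\n\nHas anyone kept a list of these?\n",
]

for key, records in group_into_threads(raw_messages).items():
    print(key, [r["from"] for r in records])
```

Each thread can then be written to its own file, with the header fields printed above each message body.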
Web pages present a different problem. You can of course just save the URL, the address of the page, and go to that page in your browser when you want to analyze it. But the page may change during your analysis (e.g. if it is a discussion group then more discussion might be added) or it may disappear. So, you might want a snapshot copy at the time you visit the page. Web pages are written not in plain text but in a mark-up language, e.g. HTML, so that they can be displayed in a formatted form in web browsers. They may also include various multimedia elements such as images, sounds and movies. You need to decide if you just need the text – in which case save the pages as plain text (an option in the ‘File: Save As …’ menu of the web browser) – or whether you want to save them as web pages (or web archives) including the multimedia elements. If you save them as web pages or as a web archive then you will need your web browser to open them again when you want to analyze them.
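If you only need the text, a snapshot can also be made with a short script rather than through the browser's menus. The sketch below, in Python, takes the HTML of a page you have already saved, removes the mark-up and puts the address and the time of the snapshot at the top of the text. The class name, the function name and the sample page are assumptions made for illustration. Real pages are often messier than this, so check the output against what you see in the browser.

```python
from datetime import datetime, timezone
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect the visible text of a page, skipping scripts and styles."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep non-empty text that is not inside <script> or <style>.
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def snapshot_as_text(html, url, retrieved=None):
    """Return the page text with its address and snapshot time on top."""
    retrieved = retrieved or datetime.now(timezone.utc)
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    header = f"Source: {url}\nRetrieved: {retrieved.isoformat()}\n\n"
    return header + "\n".join(parser.chunks)


# Invented sample page standing in for a saved discussion group page.
page = (
    "<html><head><title>Forum</title><style>p{color:red}</style></head>"
    "<body><p>First post.</p><script>var x = 1;</script>"
    "<p>Second post.</p></body></html>"
)
print(snapshot_as_text(page, "http://example.org/forum"))
```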
Most CAQDAS programs can import and code plain text files. But they cannot display the HTML files that your browser can display. If you want to include data from the web in your CAQDAS project then you will need to save the page as a pdf file (this is often an option in the print dialog). This preserves most of the visual elements in the web page, but will not keep any audio or video elements (though it might preserve the links to the live web version). Most CAQDAS programs can read pdf files and allow you to code them. The CAQDAS program NVivo has a plug-in for the web browsers Internet Explorer and Chrome that enables you to capture web pages as pdf files and Twitter messages in a database format.
Even if you import all the web pages you want to analyze as pdfs into your CAQDAS project, you will have lost the hyperlinks in them. Web pages typically contain hyperlinks to other web pages. They are therefore an excellent example of intertextuality, the linkage between and interdependence of documents. Thus it is a moot point whether the meaning of a web page is indicated just by the content of the page itself or whether you need to include some or all of the hyperlinked pages. Saving a site as a web archive may be one option, though this may not be able to deal with all the relevant hyperlinks, such as those to external websites, and it makes it hard to use CAQDAS.
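One way to keep a record of the hyperlinks, whatever format you save the page in, is to list them separately at the time you take your copy. The following Python sketch reads the HTML of a saved page and divides its links into those that point to the same site and those that point elsewhere. The class and function names and the sample addresses are invented for illustration, and the test for ‘same site’ (an identical host name) is an assumption you may want to relax.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the target of every <a href="..."> on a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href:
            # Resolve relative addresses against the page's own address.
            self.links.append(urljoin(self.base_url, href))


def classify_links(html, base_url):
    """Return (internal, external) lists of the links found in html."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    collector.close()
    site = urlparse(base_url).netloc
    internal = [u for u in collector.links if urlparse(u).netloc == site]
    external = [u for u in collector.links if urlparse(u).netloc != site]
    return internal, external


# Invented sample page with one link within the site and one outside it.
page = (
    '<p><a href="/about.html">About us</a> and '
    '<a href="http://other.example.com/report">a report</a></p>'
)
internal, external = classify_links(page, "http://example.org/forum/index.html")
print(internal)
print(external)
```

The two lists give you a starting point for deciding which of the linked pages, if any, you also need to collect.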
In some cases, such as when selecting material from commercial news archives, even if you convert the files to plain text, you may need to undertake some processing and filtering to eliminate superfluous and irrelevant material. The process of selection may not be selective enough, as Seale found when he searched a commercial news archive for articles on cancer (Seale, 2002). A lot of the articles he received were about astrology and the star sign Cancer and not about the illness that he was interested in.
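A first pass at this kind of filtering can be automated. The sketch below is not the procedure Seale used; it is simply an illustration, in Python, of one crude approach: count the words that suggest the illness and the words that suggest astrology, and keep an article only if the first count is higher. The two word lists and the sample articles are invented for the example, and you would need to check a sample of both the kept and the rejected articles by hand.

```python
# Hypothetical word stems chosen for illustration only.
ILLNESS_TERMS = ("tumour", "tumor", "chemotherapy", "oncolog", "diagnos", "patient")
ASTROLOGY_TERMS = ("horoscope", "star sign", "zodiac", "astrolog")


def count_terms(text, terms):
    """Count how often any of the given stems occurs in the text."""
    lowered = text.lower()
    return sum(lowered.count(term) for term in terms)


def is_about_illness(article):
    """Keep an article only if illness terms outnumber astrology terms."""
    return count_terms(article, ILLNESS_TERMS) > count_terms(article, ASTROLOGY_TERMS)


# Invented sample articles, both of which match a search for "cancer".
articles = [
    "Cancer: your horoscope for the week. A good week for this star sign.",
    "Cancer patients wait longer for chemotherapy, report finds.",
]
kept = [a for a in articles if is_about_illness(a)]
print(kept)
```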
Meta-data
Put simply, meta-data is data about data. In the context of data preparation there are two important forms of meta-data to consider. First, there is information about your interviews, notes, etc., that records their provenance, an outline of their content and who they involve. Second, there is information about the details of your data that you need for archiving, such as details of how the study was carried