Paper writing IV: analysing data

One of the trickiest areas for researchers working with data – either primary or secondary (data you have generated in ‘the field’, or that gleaned from texts etc) – is the analysis of that data. It can be a significant challenge to move from redescribing findings, observations or results, to showing the reader what these mean in the context of the argument that is being made, and the field into which the research fits. There are a few moves that need to be made in constructing an analysis, and these will be unpacked in this post.

Often, in empirical research, we make our contribution to knowledge in our field through the data we generate, and analyse. Especially in the social sciences, we take well-known theories and established methodologies and use these to look at new cases – adding incrementally to the body of knowledge in our field. Thus, analysis is a really important thing to get right: if all we do is describe our data, without indicating how it adds to knowledge in useful ways, what kind of contribution will we be making? How will our research really benefit peers and fellow researchers? After all, we don’t write papers just to get published. We conduct research and publish it so that our work can influence and shape the work of others, even in small ways. We write and publish to join a productive conversation about the research we are doing, and to connect our research with other research, and knowledge.

How to make a contribution to knowledge that really counts, though?

First things first, you can’t use all your data in one paper (or even in one thesis). You will need to choose the most relevant data and use it to further illustrate and consolidate your argument. But how do you make this choice – what data should you use, and why? The key tool used to make all the choices in a paper (or thesis) – from relevant literature, to methodology and methods, to data for analysis – is the argument you are making. You need to have, in one or two sentences, a very clear argument (sometimes referred to as a problem statement, or a main claim). In essence, whatever you call it, this is the central point of your paper. To make this point, succinctly and persuasively, you need to craft, section by section, support for this argument, so that your reader believes it to be valid and worth engaging with.

So, you have worked out your argument in succinct form, and have chosen the relevant sections of data that you feel best make or illustrate that argument. Now what? In the analysis section, you are making your data mean something quite specific: you are not just telling us what the data says (we can probably work that out from reading the quotes or excerpts you are including in the paper). To make meaning through analysis, you need to connect the specific with the general. By this I mean that your data is specific – to your research problem and your consequent choice of case study, or experiment, or archival search and so on. It tells us something about a small slice of the world. But, if all we did in our papers was describe small slices of the world, we would all be doing rather isolated or disconnected research. This would defeat the aim of research to build knowledge, and forge connections between fields, countries, studies and so on. Thus, we have to use our specific data to speak back to a more general or broader phenomenon or conversation.

The best, and most accepted, way of making meaning of your data is through theorising. To begin theorising your data, you need to start by asking yourself: What does this data mean? Are these meanings valid, and why? There are different kinds of theory, of course, and too many to go into here, but the main thing to consider in ‘theorising’ your data is that you need a point of reference against which to critically think about and discuss your data: you need to be able to connect the specifics of your data with a relevant general phenomenon, explanation, frame of reference, etc. You don’t necessarily need a big theory, like constructivism or social realism; you could simply have a few connected concepts, like ‘reflection’, ‘learning’ and ‘practice’ for example; but you do need a way of lifting your discussion out of the common sense, descriptive realm into the critical, analytical realm that shows the reader why and how the data support your argument, and add knowledge to your field.

Analysing and theorising data is an iterative process, whether you are working qualitatively or quantitatively. It can be difficult, confusing, and take time. This is par for the course: a strong, well-supported analysis should take time. Don’t worry if you can’t make the chosen data make sense on the first go: you may well need to read, and re-read your data, and write several drafts of this section of the paper (preferably with critical feedback) before you can be confident of your analysis. But don’t settle for the quick-fix, thin analysis that draft one might produce. Keep at it, and strive for a stronger, more influential contribution to your field. In the long run, it’ll be worth more to you, to your peers, and to your field.

Researching your own ‘backyard’: on bias and ethical dilemmas

This is a post particularly for those in the social sciences and humanities who may be doing a form of ethnographic research within the context in which they work or study – in other words, doing ‘insider research’ to use Paul Trowler’s term. Researching a context with which one is intimately familiar and in which one has a vested interest can create possible bias and ethical dilemmas which need to be considered by researchers in these situations. The last thing you want, in presenting your completed research, is for your findings to be called into question or invalidated because you have not accounted clearly enough for issues of insider bias, and your own vested interests.

Insider bias and vested interests

In the article cited in this post, Trowler considers issues of bias in data generation. Bias in research can be defined as having only part of the ‘truth’ in your data but treating that part as a whole, ignoring other possibilities or answers because you are prejudiced towards the ones that best represent your interests or investment. If you are working in a context with which you are familiar, especially your own department or faculty, or an organisation in which you have worked or do work, you will have a vested interest in that context. Either you want everyone and everything to look amazing, or perhaps you are unhappy about certain aspects of the ways in which they work and you want your research to show problems and struggles so you have a basis for your unhappiness. Either way, you have to acknowledge going in that you cannot be anything but biased about this research.

However, acknowledging that you are biased, and detailing what that bias might entail for readers and examiners, does not undermine your position as researcher. By making yourself aware of potential blindspots in your research design – for example the participants you have chosen, or the cases you are including and excluding from your dataset (and why) – you can better head off possible challenges to the validity of your data later on, and you can strengthen your research design choices. Be honest with yourself: there is a balance to strike here between being pragmatic and strategic in choosing research participants, sites, or cases that will be accessible and that will yield the data you need to make your argument, and choosing too neatly and risking one-sided or myopic data generation. Why these participants, these cases, these sites? Are there others that you know less well that you could include to balance out the familiarity, and increase the validity of your eventual findings? If not, how might you maintain awareness of your ‘insiderness’ and account for this in analysis and discussion later on?

You need to account for these decisions and questions in your methodology, and discuss what it means for your study that you are doing insider research, and that this does imply particular forms of bias. I don’t think you can get away from being biased in these cases, but you can think through how this may affect your data generation processes, and your analysis as well, and share this thinking with your readers frankly and reflexively.

Insider bias and ‘intuitive analysis’

Another point Trowler makes concerns insider ‘intuition’ when analysing the data you have generated and selected for your study. You may be analysing a policy process you were part of, or meetings you sat in on, or projects you were involved in. You have insider knowledge of what was said, the tone of the conversations, background knowledge (and perhaps even gossip) about participants – in other words, you have a kind of cultivated ‘intuition’ about your data set that your reader will not be privy to. Accounting for bias here is crucial, because if you cannot see it, you may rely too much on this insider intuition in analysing your data, and too much of the language of description you are using to convey your theorised findings will be tacit and hidden from the reader. They will then struggle to understand fully on what basis you are claiming that X is an example of poor management, or that Y means that the department is doing well in these particular areas.

It is thus vital that you get feedback here on whether it is clear to your reader why you are making particular claims, and whether they can see and understand the basis on which you are making such claims. Do they understand your ‘external language of description’ or ‘translation device’ to use Bernstein’s and Maton’s terms respectively? If they do not, you may be relying too much on your insider view of your case or participants, and may need to find a way to step back, and try to see the data you are looking at as more strange and less familiar. Getting help from a supervisor or critical friend who can ask you questions, and expose and critique possible points of bias is a useful way to re-interrogate your data with fresher eyes.

Ethical dilemmas

An ethical dilemma is defined as ‘a choice between two options, both of which will bring a negative result based on society and personal guidelines’. In research, this definition could be nuanced to suggest that an ethical dilemma presents itself when you have to make a decision to protect the interests of your research or the interests of your participants or study site. For example, in an interview with a senior manager you learn information that may be better off staying private and confidential, yet would also add an important and insightful dimension to your findings. What do you do? A participant in your study asks you for help, but to help might be to prejudice that participant’s responses in a later survey or interview, possibly skewing your data. Yet it is your job to help them. Study first, or job first? These are the kinds of dilemmas that can arise when you do research in the same spaces in which you work, and with people you work with and have other responsibilities to outside of your research.

As researchers we have a duty to be as truthful and ethical in our research as possible. We are working to create and add to knowledge, not to simply maintain the status quo. In your study this may mean being carefully but resolutely critical, reflective and challenging, rather than only saying the palatable or easy things to say. This work is always going to present difficulties and dilemmas, but accounting as far as possible for your own bias and vested interests, and for your own relevant insider knowledge, can create space in your study for the development of your own reflexivity as a researcher, and can bolster rather than undermine the validity and veracity of your findings.

Trowler, P. (2011) Researching your own institution: Higher Education, British Educational Research Association online resource. Available online at [http://www.bera.ac.uk/files/2011/06/researching_your_own_institution_higher_education.pdf]

‘Retrofitting’ your PhD: when you get your data before your theory

I gave a workshop recently to two different groups of students at the same university on building a theoretical framework for a PhD. The two groups of students comprised scholars at very different points in their PhDs, some just starting to think about theory, some sitting with data and trying to get the theory to talk to the data, and others trying to rethink the theory after having analysed their data. One interesting question emerged: what if you have your data before you really have a theoretical framework in place? How do you build a theoretical framework in that case?

I started my PhD with theory, and spent a year working out what my ‘gaze’ was. I believed, and was told, that this was the best way to go about it: to get my gaze and then get my data. In my field, and with my study, this really seemed like the only way to progress. All I had starting out was my own anecdotal issues, problems and questions I wanted answers to, and I needed to try and understand not just what the rest of my field had already done to try and find answers, but what I could do to find my own answers. I needed to have a sense of what kinds of research were possible and what these might entail. I had no idea what data to generate or what to do with it, and could not have started there with my PhD. So I moved from reading the field, to reading the theory, to building an internal language of description, to generating data, to organising and analysing it using the theory to guide me, to reaching conclusions that spoke back to the theory and the field – a closed circle if you will. This seems, to me certainly, the most logical way to do a PhD.

But, I have colleagues and friends who haven’t necessarily followed this path. In their line of work, they have had opportunities to amass small mountains of data: interview transcripts, documents, observation field notes, student essays, exam transcripts and so forth. They have gathered and collected all of these data, and have then tried to find a PhD in the midst of all of it. They are, in other words, trying to ‘retrofit’ a PhD by looking to the data to suggest a question or questions and through these, a path towards a theoryology.

Many people start their doctoral study in my field – education studies – to find answers to very practical or practice-based questions. Like: ‘What kinds of teaching practice would better enable students to learn cumulatively?’ (a version of my own research question) Or: ‘What kinds of feedback practices better enable students to grow as writers in the Sciences?’ And so on. If you are working as a lecturer, facilitator, tutor, writing-respondent, staff advisor or similar, you may have many opportunities to generate or gather data: workshop inputs, feedback questionnaires, your own field notes and reports, student essays and exam submissions, and so on. After a while, you may look at this mountain of data and wonder: ‘Could there be a thesis in all of this? Maybe I need to start thinking about making some order and sense out of all of this’. You may then register for a PhD, searching for and finding a research question in your data, and then begin the process of retrofitting your PhD with substantive theory and a theoryology to help you work back again towards the data so as to tell its story in a coherent way that adds something to your field’s understanding or knowledge of the issues you are concerned with.

The question that emerged in these workshops was: ‘Can you create a theoretical framework if you have worked so far like this, and if so, how?’ I think the answer must be ‘yes’, but the how is the challenging thing. How do you ask your data the right kinds of questions? A good starting point might be to map out your data in some kind of order. Create mind-maps or visual pictures of what data you have and what interests you in that data. Do a basic thematic analysis – what keeps coming up or emerging for you that is a ‘conceptual itch’ or something you really feel you want or need to answer or explore further? Follow this ‘itch’ – can you formulate a question that could be honed into a research question? Once you have a basic research question, you can then move towards reading: what research is being or has been done on this one issue that you have pulled from your data? What methodologies and what theory are the authors doing this research using? What tools have they found helpful? Then, much as you would in a more ‘traditional’ way, you can begin to move from more substantive research and theory towards an ontological or more meta-theoretical level that will enable you to build a holding structure and fit lenses to your theory glasses, such that you have a way of looking at your data and questions that will enable you to see possible answers.

Then you can go back to your data, with a fresh pair of eyes using their theory glasses and re-look at your data, finding perhaps things you expect to see, but also hopefully being surprised and seeing new things that you missed or overlooked before you had the additional dimension or gaze offered by your theoretical or conceptual framing. But working in this ‘retrofitted’ way is potentially tricky: if you have been looking and looking at this data without a firm(ish) theoretically-informed or shaped gaze, can you be surprised by it? Can you approach your research with the curious, tentative ‘I don’t know the answers, but let’s explore this issue to find out’ kind of attitude that a PhD requires? I think, if you do decide to do or are doing a PhD in what I would regard as a middle-to-front sort of way, with data at the middle, then you need to be aware of your own already-established ideas of what is or isn’t ‘real’ or ‘true’, and your own biases informed by your own experience and immersion in your field and your data. You may need to work harder at pulling yourself back, so that you can look at your data afresh, and consider things you may have been blind to, or overlooked before; so that you can create a useful and illuminating conversation between your data and your theory that contributes something to your field.

Retrofitting a PhD is not impossible – there is usually more than one path to take in reaching a goal (especially if you are a social scientist!) – but I would posit that this way has challenges that need to be carefully considered, not least in terms of the extra time the PhD may take, and the additional need to create critical distance from data and ‘findings’ you may already be very attached to.

Iterativity in data analysis: part 2

This post follows on from last week’s post on the iterative process of doing qualitative data analysis. Last week I wrote a more general musing on the challenges inherent in doing qualitative analysis; this week’s post is focused more on the ‘tools’ or processes I used to think and work my way through my iterative process. I drew quite a lot on Rainbow Chen’s own PhD tools as well as others, and adapted these to suit my research aims and my study (reference at the end).

The first tool was a kind of ‘emergent’ or ‘ground up’ form of organisation and it really helps you to get to know your data quite well. It’s really just a form of thematic organisation – before you begin to analyse anything, you have to sort, organise and ‘manage’ your mountain of data so that you can see the wood for the trees, as it were. I didn’t want to be overly prescriptive. I knew what I was looking for, broadly, as I had generated specific kinds of data and my methodology and theoryology were very clearly aligned. But I didn’t really know what exactly all my data was trying to tell me and I really wanted it to tell its story rather than me telling it what it was supposed to be saying. I wanted, in other words, for my data to surprise me as well as to show me what I had broadly hoped to find in terms of my methodology and my theoretical framework. So, the ‘tool’ I used organised the data ‘organically’ I suppose – creating very descriptive categories for what I was seeing and not trying to overthink this too much. As I read through my field notes, interview transcripts, video transcripts, documents, I created categories like ‘focusing on correct terminology’ and ‘teacher direction of classroom space’ and ‘focus on specific skills’. The theory is always informing the researcher’s gaze, as Chen notes in her paper (written with Karl Maton), but to rush too soon to theory can be a mistake and can narrow your findings. So my theory was here, underpinning my reading of the data, but I did not want to rush to organise my data into theoretical and analytical ‘codes’ just yet. There was a fair bit of repetition as I did this over a couple of weeks, reading through all my data at least twice for each of my two case studies. I put the same chunks of text into different categories (a big plus of using data software) and I made time to scribble in my research journal at the end of each day during this process, noting emerging patterns or interesting insights that I wanted to come back to in more depth in the analysis.

An example of my first tool in action
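
For readers who find it easier to think in code, here is a minimal Python sketch of this kind of descriptive sorting, where one chunk of text can sit in several categories at once. It is only an illustration, not a picture of how any analysis package works internally. The category names are the ones mentioned above; the excerpt ids, the placeholder texts and the helper names `file_under` and `categories_for` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical placeholder excerpts. Real data would be chunks of
# transcripts, field notes or documents.
excerpts = {
    "ex1": "placeholder text of a field-note chunk",
    "ex2": "placeholder text of an interview chunk",
    "ex3": "placeholder text of a document chunk",
}

# Each descriptive category holds the ids of the chunks filed under it.
categories = defaultdict(set)


def file_under(excerpt_id, *category_names):
    """File one chunk under one or more descriptive categories."""
    if excerpt_id not in excerpts:
        raise KeyError(f"unknown excerpt: {excerpt_id}")
    for name in category_names:
        categories[name].add(excerpt_id)


def categories_for(excerpt_id):
    """List every category a chunk has been filed under."""
    return sorted(name for name, ids in categories.items() if excerpt_id in ids)


# The same chunk can go into more than one category.
file_under("ex1", "focusing on correct terminology", "focus on specific skills")
file_under("ex2", "teacher direction of classroom space")
file_under("ex3", "focus on specific skills")
```

Keeping ids rather than copies of the text in each category means a chunk can be filed as many times as needed without the data itself being duplicated.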

The second process was what a quantitative researcher might call ‘cleaning’ the data. There was, as I have noted, repetition in my emergent categories. I needed to sort that out and also begin to move closer to my theory by doing what I called ‘super-coding’ – beginning to code my data more clearly in terms of my analytical tools. There were two stages here: the first was to go carefully through all my categories and merge very similar ones, delete unnecessary categories left over after the merging, and make sure that there were no unnecessary or confusing repetitions. I felt like the data was indeed ‘cleaner’ after this first stage. The second stage was to then super-code by creating six overarching categories, named after the analytical tools I developed from the theory. For example, using LCT gave me ‘Knowers’, ‘Knowledge’, ‘Gravity’ and ‘Density’. I was still not that close to the theory here so I used looser terms than the theory asks researchers to use (for example we always write ‘semantic gravity’ rather than just ‘gravity’). I then organised my ‘emergent’ categories under these headings, ending up with two levels of coded data, and coming a step closer to analysis using the theoretical and analytical tools I had developed to guide the study.
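
The two stages described above can be sketched roughly in Python as follows. This is a hedged illustration under stated assumptions, not a record of the actual coding: the excerpt ids, the near-duplicate category ‘focus on correct terms’, the helper names `merge` and `excerpts_under`, and the placing of particular emergent categories under particular headings are all invented for the example. Only the heading names and the three emergent category names come from the post.

```python
# Emergent categories mapped to excerpt ids (hypothetical placeholders).
emergent = {
    "focusing on correct terminology": {"ex1", "ex4"},
    "focus on correct terms": {"ex4", "ex5"},  # hypothetical near-duplicate
    "teacher direction of classroom space": {"ex2"},
    "focus on specific skills": {"ex1", "ex3"},
}


def merge(categories, keep, absorb):
    """Stage one: fold category `absorb` into `keep` and drop the leftover.

    Returns a new dict so the original categories stay untouched.
    """
    merged = {name: set(ids) for name, ids in categories.items()}
    merged[keep] |= merged.pop(absorb)
    return merged


cleaned = merge(emergent, "focusing on correct terminology", "focus on correct terms")

# Stage two: looser, theory-informed headings sit above the emergent ones.
# Which category goes under which heading is an assumption for illustration.
super_codes = {
    "Knowledge": ["focusing on correct terminology", "focus on specific skills"],
    "Knowers": ["teacher direction of classroom space"],
}


def excerpts_under(super_code):
    """Collect every excerpt filed anywhere beneath one super-code."""
    ids = set()
    for name in super_codes[super_code]:
        ids |= cleaned[name]
    return ids
```

The result is the two levels of coded data mentioned above: descriptive categories underneath, theory-informed headings on top, with every excerpt still traceable back through both.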

By this stage, you really do know your data quite well, and clearer themes, patterns and even answers to your questions begin to bubble up and show themselves. However, it was too much of a leap for me to go from this coding process straight into writing the chapter; I needed a bridge. So I went back to my research journal for the third ‘tool’ and started drawing webs, maps, plans for parts of my chapters. I planned to write chunks, and then connect these together later into a more coherent whole. This felt easier than sitting myself down to write Chapter Four or Chapter Five all in one go. I could just write the bit about the classroom environment, or the bit about the specialisation code, and that felt a lot less overwhelming. I spent a couple of days thinking through these maps, drawing and redrawing them until I felt I could begin to write with a clearer sense of where I was trying to end up. I did then start writing, and working on the chapters, and found myself (to my surprise, actually) doing what looked and felt like and was analysis. It was exciting, and so interesting – after being in the salt mines of data generation, and enduring what was often quite a tedious process of sitting in classrooms and making endless notes and transcribing everything, to see in the pile of salt beautiful and relevant shapes, answers and insights emerging was very gratifying. I really enjoyed this part of the PhD journey – it made me feel like a real researcher, and not a pretender to the title.

Another ‘map’ for chapter writing

A ‘map’ for writing

This part of the PhD is often where we can make a more noticeable contribution to the development, critique and generation of new knowledge in our fields of study. We can tell a different or new part of a story others are also busy telling and join a scholarly conversation and community. It’s important to really connect your data and the analysis of it with the theoretical framework and the analytical tools that have emerged from that. If too disconnected, your dissertation can become a tale of two halves, and can risk not making a contribution to your field, but rather becoming an isolated and less relevant piece of research. One way to be more conscious of making these connections clear to yourself and your readers is to think carefully about and develop a series of connected steps in your data analysis process that bring you from your data towards your theory in an iterative and rich rather than linear and overly simplistic way. Following and trying to trust a conscious process is tough, but should take you forward towards your goal. Good luck!

Reference: Chen, T-S. and Maton, K. (2014) ‘LCT and Qualitative Research: Creating a language of description to study constructivist pedagogy’. Draft chapter (forthcoming).

Iterativity in data analysis: part 1

This post is a 2-parter and follows on from last week’s post about generating data.

The one thing I did not know, at all, during my PhD was that qualitative data analysis is a lot more complex, messy and difficult than it looks. I had never done a study of this magnitude or duration before, so I had never worked with this much data before. I had written papers, and done some analysis of much smaller and less messy data sets, so I was not a complete novice, but I must say I was quite taken aback by the mountain of data I found I had once the data generation was complete. What to do now? Where to start? Help!

The first thing I did, on my supervisor’s advice, was get a license for NVivo 10 and upload all my documents, interview and video recordings and field notes into its clever little software brain so that I could organise the data into folders, and so that I could start reading and coding it. This was invaluable. Software that enables you to store, organise and code your data is a must, I think, for a study as large and long as a PhD. This is not an advert for NVivo so I won’t get into all its features, and I am sure that other free and paid-for qualitative data analysis packages like ATLAS.ti or the Coding Analysis Toolkit from UMass would do the job just as well. However, I will say that being able to keep everything in one place, and being able to put similar chunks of text into different folders without mixing koki colours or scribbling all over paper to the point of confusion was really useful. I felt organised, and that made a big difference to my mental ability to cope with the data analysis and sense-making process.

The second thing I did was keep very detailed notes in my research journal on my process as it unfolded. This was essential as I needed to narrate my analysis process to my readers in as much detail as possible in my methodology chapter. If a researcher cannot tell you how they ended up with the insights and conclusions they did, it is much harder to trust their research or believe what they are asking you to. I wanted to be believable and convincing – I think all researchers do. Bernstein (2000) wrote about needing two ‘languages of description (LoD)’ in research: the internal (InLoD) which is essentially where you create a theoretical framework for your study that coheres and explains how you are going to understand your problem in a more abstract way; and the external (ExLoD) where you analyse and explain the data using that framework, outlining clearly the process of bringing theory to data and discovering answers to your questions. The stronger and clearer the InLoD and ExLoD, the greater chance other researchers then have of using, adapting, learning from your study, and building on it in their own work. When too much of your process of organising, coding, recoding, reading, analysing, connecting the data is hidden from the reader, or tacit in your writing about it, there is a real risk that your research can become isolated. By this I mean that no one will be able to replicate your study, or adapt your tools or framework to their own study while referencing yours, and therefore your research cannot readily be built on or incorporated into a greater understanding of the problems you are interested in solving (and the possible solutions).

This was the first reason for keeping detailed notes. The second was to trace what I was doing, and what worked and what did not so that I could learn from mistakes and refine my process for future research projects. As I had never worked with a data set this large or varied before, I really didn’t know what to do, and the couple of qualitative research ‘textbooks’ I looked at were quite mechanical or overly instrumental in their approach, which didn’t make complete sense to me. I wanted a more ‘ground-up’ process, which I felt would increase the validity and reliability of my eventual claims. I also wanted to be surprised by my data, as much as I wanted to find what I thought I was looking for. The theory I was using further required that I not just ‘apply’ theory to data (which really can limit your analysis and even lead to erroneous conclusions), but rather engage in an open, multiple and iterative reading of the data in successive stages. Detailed notes were key in keeping track of what I was doing, what confused me, what made sense and so on. Doing this consciously has made me feel more confident in taking on similarly sized research projects in future, and I feel I can keep building and learning from this foundation.

This post is a more conceptual musing about the nature of qualitative data analysis and lays the groundwork for next week’s post, where I’ll get into some of the ‘tools’ or approaches I took in actually doing my analysis. Stay tuned… 🙂