What is content analysis?

“a research method for the subjective interpretation of the content of text data through the systematic classification process of coding and identifying themes or patterns”

(Hsieh & Shannon, 2005, p.1278)

“any qualitative data reduction and sense-making effort that takes a volume of qualitative material and attempts to identify core consistencies and meanings”

(Patton, 2002, p.453).

Content analysis is a way of analysing text-based, qualitative data, for example newspaper articles, children’s books, interview transcripts, and advert or film scripts. Content analysis can be quantitative or qualitative. Quantitative researchers may simply search for specific words, phrases or ideas in the data and count them up, whereas qualitative researchers attempt to extract “meaning” by searching for themes in the data. Qualitative researchers do not add these up or carry out any form of statistical analysis. This is the more sensible approach given that the sampling method is unlikely to be random; it is far more likely to be some sort of purposive sampling, whereby the texts have been chosen specifically because they are known to be examples of the particular topic under investigation.

Describing the data: Thematic analysis

Thematic analysis refers to a process by which a series of codes, categories and ultimately themes (underlying and recurring ideas) are derived from qualitative data. Researchers use a process of selective reduction to turn the full text into manageable units.

The themes may be arrived at in two ways, identified by Hsieh and Shannon (2005).

Conventional (formative analysis)

The researcher uses a technique called close reading to turn every phrase from the text into a coding unit. They have no starting point other than the data itself, so this can be described as a bottom-up approach. Researchers read the text over and over again until they reach a point known as data saturation, where they feel there are no further coding units to be found.

This form of analysis is called inductive content analysis because no theory is being tested; any theory emerges from the data itself.

Directed analysis

This is similar to conventional analysis; however, the researchers start with some ideas in mind, from previous studies or theories, and this helps them to create the coding units in advance. They then search for examples of these codes in the text. This is very similar to the a priori coding which we learnt about when we talked about creating closed questions in questionnaires and creating a quantitative observation schedule. This approach can be viewed as top-down, as we are seeing the data through a filter: the pre-existing theory or research that we have used to guide the creation of the a priori codes. It is a way of revealing which themes, from a set chosen before the analysis on the basis of previous research, are apparent in the data we are analysing.
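The search for a priori codes described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the codebook, its indicator words and the transcript are all invented for the example, and matching single indicator words is a much cruder test than a researcher's own judgement about whether a passage fits a code.

```python
import re

# Hypothetical a priori codebook: each code maps to indicator words
# chosen in advance from previous research (invented for illustration).
CODEBOOK = {
    "stigma": ["ashamed", "judged", "embarrassed"],
    "support": ["helped", "supported", "listened"],
}

def apply_codebook(text, codebook):
    """Return, for each a priori code, the sentences containing any of its indicator words."""
    # Split the text into sentences at ., ! or ? followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    coded = {code: [] for code in codebook}
    for sentence in sentences:
        words = set(re.findall(r"[a-z']+", sentence.lower()))
        for code, indicators in codebook.items():
            if words & set(indicators):
                coded[code].append(sentence)
    return coded

# Invented transcript fragment.
transcript = "I felt judged by the neighbours. My sister listened to me. We went home."
for code, examples in apply_codebook(transcript, CODEBOOK).items():
    print(code, examples)
```

Sentences that match no code (here, the last one) are simply left uncoded; in a real directed analysis the researcher would still read them in case a new code is needed.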

Interpreting the data; a step on from thematic analysis…

Summative analysis 

Once the thematic analysis is complete, whether conventional or directed, the researchers may choose to compare and interpret themes across different texts. This can be done in a quantitative manner, by counting or tallying the frequency with which certain words, themes or ideas arise and making comparisons. This might result in a statistical analysis, but this should only be carried out if the sources of the data were chosen at random, which is unlikely. This is known as manifest content analysis and is quantitative in nature. The analysis may not stop here, however: researchers may go on to examine the latent content, or underlying meanings, in the data (also known as relational analysis). Here they look at the context in which the words or phrases are used and search for meaning, so this stage is more interpretative.

Linked concepts: Grounded Theory is very similar to qualitative summative content analysis; it uses the conventional approach and takes a truly inductive, bottom-up approach to the collection and description of the data, resulting in a qualitative interpretation which is usually communicated visually through some sort of diagram.
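As a rough sketch of the manifest (counting) stage, the Python below tallies chosen words in two texts and also reports a rate per 100 words, so that texts of different lengths can be compared. The two "stories" and the search terms are invented for illustration, and the per-100-words rate is one possible way of making comparisons, not a required part of the method.

```python
import re
from collections import Counter

def term_rates(texts, terms):
    """For each named text, give (raw count, count per 100 words) for each term."""
    results = {}
    for name, text in texts.items():
        words = re.findall(r"[a-z']+", text.lower())
        counts = Counter(words)
        total = len(words)
        results[name] = {
            term: (counts[term], round(100 * counts[term] / total, 1) if total else 0.0)
            for term in terms
        }
    return results

# Invented example texts.
texts = {
    "story_a": "The brave boy climbed the tree while the girl watched.",
    "story_b": "The brave girl climbed the tree and the brave girl won.",
}

for name, rates in term_rates(texts, ["brave", "girl"]).items():
    print(name, rates)
```

Counts like these only describe the manifest content; they say nothing about the context in which "brave" is used, which is what the latent stage examines.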

Evaluating Content Analysis

Qualitative content analysis is an interpretative technique, and as such the meaning that is extracted may be considered subjective to the researcher. In qualitative psychology, researchers tend to use a language of their own: instead of asking whether the findings are valid, we consider the trustworthiness and credibility of the findings. There are various ways of tackling credibility in qualitative psychology, for example:

  • researcher triangulation: using more than one researcher to analyse the transcripts and compare the codes, categories and themes generated (the qualitative term for inter-rater reliability).
  • checking the findings with the participants from whom the data were drawn; if the data relate to texts in newspapers, films etc., clearly there are no participants as such. However, one could organise a focus group and discuss the findings with a group of people who are consumers of these media products, to see what they have to say about the trustworthiness of the outcomes the researchers have highlighted.
  • reflexivity: this is an important term in qualitative psychology. Follow the links and you will find some resources on this from the IB course. There are two types of reflexivity to consider, personal and epistemological; mentioning either in your evaluation of the clinical practical would look most impressive.
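If two researchers have each assigned one code to the same set of text fragments, the comparison mentioned under researcher triangulation can be given a number. The sketch below computes simple percentage agreement and Cohen's kappa, a widely used statistic that corrects agreement for chance. Kappa is not mentioned in the notes above and is introduced here as an assumption about how agreement might be summarised; the codes and the two coders' decisions are invented.

```python
from collections import Counter

def percent_agreement(coder_a, coder_b):
    """Proportion of fragments given the same code by both coders."""
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return matches / len(coder_a)

def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa: agreement between two coders, corrected for chance agreement."""
    n = len(coder_a)
    observed = percent_agreement(coder_a, coder_b)
    freq_a = Counter(coder_a)
    freq_b = Counter(coder_b)
    # Chance agreement: probability both coders pick the same code independently.
    expected = sum(freq_a[c] * freq_b[c] for c in set(coder_a) | set(coder_b)) / (n * n)
    if expected == 1:
        return 1.0  # both coders used a single identical code throughout
    return (observed - expected) / (1 - expected)

# Invented codes assigned by two researchers to the same six fragments.
coder_a = ["stigma", "support", "support", "stigma", "support", "stigma"]
coder_b = ["stigma", "support", "stigma", "stigma", "support", "support"]

print(round(percent_agreement(coder_a, coder_b), 2))
print(round(cohens_kappa(coder_a, coder_b), 2))
```

In a qualitative study the disagreements themselves are usually more useful than the number: the fragments on which the coders differ are the ones to discuss when refining codes and categories.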

Example of a content analysis

A good example of a content analysis from a case study of a lady whose brother has schizophrenia:

Further reading:



This page has firsthand accounts from witnesses of the Rwandan genocide. Choose a small sample of transcripts to conduct your own inductive content analysis, following the stages from the grounded theory handout. Read a few extracts first and decide what your research question (RQ) will be about. Then start the process in earnest. Your sample may be a purposive sample once you have decided on an RQ, i.e. choose transcripts from people whose experiences fit with your research interests. Present your findings as a poster, including the RQ, sampling method, step-by-step procedure used to analyse the data, findings (including memos, codes, fragments of speech as examples, categories, core categories and a diagram to show how the categories link to each other), and conclusions. You can’t use member checking to check the credibility of your analysis, but you could use some form of inter-rater reliability/researcher triangulation.

Practice Questions:

  1. Two students have been asked to undertake a content analysis of gender issues in stories that have been written for children between the ages of two and twelve years. Explain how they could design and undertake their content analysis, including how they might analyse the data collected. (6)

You may wish to include some of the following in your answer:

  • sampling method
  • sources of information
  • categories
  • inter-rater reliability
  • data analysis

