Claims have been made that the media has treated different legal cases differently. We look at the numbers.
Recently, many pundits have claimed that the mainstream media has covered three cases differently: those of Ahmaud Arbery, Darrell Brooks, and Kyle Rittenhouse. The specific claims are that race is highlighted in some cases but not in others, and that the general sentiment of the coverage differs from case to case.
We reviewed more than 2,000 news articles for this analysis, each containing one of the keywords ‘Ahmaud Arbery’, ‘Darrell Brooks’, or ‘Kyle Rittenhouse’. These results are representative of what you would find if you searched for any of these terms in a search engine and filtered to news articles only.
After collecting the data we moved forward with our analysis. Our analysis was broken down into two parts.
First, we reviewed the frequency of words. To do this, we evaluated both the title and the body of each article. We felt this was important in today’s social media world, where many people are exposed only to the headline. For both the headlines and the bodies, we split the articles by the case they covered: Arbery, Brooks, or Rittenhouse. Next, within each case, we combined the titles with one another and the article texts with one another. We then removed the common words, or stop words, found in each. These words, which include determiners, coordinating conjunctions, and prepositions, are used so often in a given language that removing them lets us focus on the important words instead. Finally, we reviewed the frequency of the words that remained and created word clouds and graphics summarizing the top 20 most frequent words.
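The frequency steps above can be sketched in a few lines of Python. This is a minimal illustration only, not the code used for the analysis: the stop-word list is a tiny hypothetical one, the function name top_terms is invented for the example, and the sample headlines are made-up placeholders rather than articles from the dataset.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; real lists are much longer).
STOP_WORDS = {"the", "a", "an", "and", "or", "but", "in", "on", "of", "to", "for", "is", "was"}

def top_terms(texts, n=20):
    """Return the n most frequent non-stop-word tokens across all texts."""
    counts = Counter()
    for text in texts:
        # Lowercase the text and keep runs of letters and apostrophes as tokens.
        tokens = re.findall(r"[a-z']+", text.lower())
        counts.update(t for t in tokens if t not in STOP_WORDS)
    return counts.most_common(n)

# Made-up placeholder headlines for one case, not taken from the dataset.
sample_headlines = [
    "Jury hears closing arguments in the trial",
    "The jury returns a verdict in the trial",
    "Verdict expected as jury deliberates",
]

print(top_terms(sample_headlines, n=3))
```

The same function could be run once on the combined titles and once on the combined article texts of each case, which mirrors the split between headline and body described above.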
Second, we reviewed the sentiment of the articles. Again, we evaluated both the title and the body of each article. For both the headlines and the bodies, we split the articles by the case they covered: Arbery, Brooks, or Rittenhouse. Next, within each case, we combined the titles with one another and the article texts with one another. We then removed the common words, or stop words, found in each, so that we could focus on the important words instead. Finally, we processed each article using a natural language processor. This processor creates a sentiment score between -1 and 1 indicating how positive or negative a given article is, with -1 being a very negative sentiment and 1 being a very positive sentiment.
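To make the -1 to 1 scale concrete, here is a toy lexicon-based polarity score. It is a stand-in only: we do not name the natural language processor used for the analysis, and real processors use far larger lexicons or trained models. The word lists and the function names polarity and mean_polarity are hypothetical.

```python
import re

# Hypothetical mini-lexicons for illustration only; real tools ship far larger ones.
POSITIVE = {"good", "fair", "peaceful", "praised"}
NEGATIVE = {"bad", "violent", "tragic", "criticized"}

def polarity(text):
    """Score text in [-1, 1] as (positive - negative) / (positive + negative) matched words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    total = pos + neg
    # Text with no lexicon words is treated as neutral.
    return 0.0 if total == 0 else (pos - neg) / total

def mean_polarity(texts):
    """Average the per-article scores, for example across all articles on one case."""
    scores = [polarity(t) for t in texts]
    return sum(scores) / len(scores) if scores else 0.0
```

A score of -1 here means every matched word was negative, 1 means every matched word was positive, and 0 means the two balanced out or nothing matched.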
Results – Frequency
Our first step was to create a word cloud for each case. Word clouds are fairly straightforward: the more often a word appears in the text, the larger it is drawn. We created word clouds for both the headlines and the article texts.
Title Word Clouds
Analyzing the word clouds, we can see which words are dominant in the headlines of each case.
In the Arbery case, the words ‘murder’, ‘killing’, and ‘justice’ appeared prominently in the headlines. In the Brooks headlines, the case was described in less harsh terms such as ‘crash’ and ‘suspect’, and Christmas and parade terms were prominent. In the Rittenhouse case, ‘justice’ also appeared. Surprisingly, ‘Trump’ was a prominent word. Trump was president when the incident occurred, but we can compare this to the Arbery case, where ‘Biden’ does appear but is much smaller.
We also generated word clouds for the text of the articles. In the Arbery case, racial terms appear very prominently: ‘black man’, ‘white men’, ‘three white’. Here the actors are frequently described by their race. In the Brooks case, race is not a prominent term. Instead, those involved were described as ‘suspects’ and ‘victims’. In the Rittenhouse case, the word ‘white’ did appear prominently, but the words ‘right’, ‘jury’, and ‘time’ were more prominent still. The words ‘Trump’ and ‘Fox News’ also appeared frequently.
Quantified Word Frequency – Titles
Often, what most news consumers see is the title as they review their social media feed or newspaper. For each case, we created a top 20 list of terms for the text of articles and the top 6 words found in the headlines.
In the headlines, we saw some variability. For the Arbery case, the top term was ‘murder’. In the Brooks case, the term used to describe the offender was ‘suspect’, which was the third most common term. The Rittenhouse headlines did not reference the offender at all; instead, the top term was ‘Trump’.
When reviewing the text of the articles, we found that both the Arbery and Rittenhouse coverage predominantly mentioned race. In the Arbery case, ‘black’ was the most used word and ‘white’ was the 4th most used word. The name of the offender, ‘McMichael’, was the 2nd most mentioned word. In the Brooks articles, we did not find any racial terms among the top 10 most mentioned words. Here we found words relating to the location (‘Waukesha’, ‘Milwaukee’) and the event (‘parade’, ‘Christmas’). Outside of these, the most frequent words were ‘bail’, ‘police’, and ‘SUV’. The Rittenhouse coverage also focused heavily on race: the words ‘black’ and ‘white’ were the 2nd and 3rd most frequent words, even though the case involved a white man shooting three other white men.
Results – Sentiment
The natural language processor we utilized also allows us to review the sentiment of each article, from -1 to 1 with 0 being neutral. We were able to quantify the average sentiment across all media outlets and compare that to some of the most common and largest media outlets.
Here we found that the articles on all three cases had a negative average sentiment. The coverage did vary, however, in how negative it was, both on average and from article to article. The Brooks case was covered the most negatively and the Rittenhouse case the least negatively, with the Arbery case falling in between.
We also calculated how far each outlet deviated from its peers by taking each outlet’s difference from the mean sentiment and noting whether it was positive or negative. The results were interesting. Relative to the mean, Fox News and NPR covered all three stories with more positive sentiment and CNN covered all three with more negative sentiment; in this way they were consistent. At the other end of the spectrum, we saw some wide swings in how different stories were covered. CNBC covered the Brooks case with negative sentiment but the Rittenhouse and Arbery cases with positive sentiment. Axios and The Washington Post covered the Arbery case with positive sentiment but the Brooks and Rittenhouse cases with negative sentiment.
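As a rough sketch of this comparison, the snippet below subtracts the mean sentiment across outlets from each outlet’s own average for a single case. The outlet names and scores are hypothetical placeholders, not figures from our data, and deviations_from_mean is a name invented for the example.

```python
from statistics import mean

# Hypothetical average sentiment per outlet for one case (not the article's data).
scores = {"Outlet A": -0.125, "Outlet B": -0.375, "Outlet C": -0.25}

def deviations_from_mean(outlet_scores):
    """Return each outlet's score minus the mean score across all outlets."""
    overall = mean(list(outlet_scores.values()))
    return {name: score - overall for name, score in outlet_scores.items()}

def describe(deviation):
    """Label the sign of a deviation in plain words."""
    if deviation > 0:
        return "more positive than average"
    if deviation < 0:
        return "more negative than average"
    return "at the average"

devs = deviations_from_mean(scores)
for outlet, d in devs.items():
    print(f"{outlet}: {d:+.3f} ({describe(d)})")
```

Note that an outlet can have a negative raw score and still sit above the mean, which is why the sign of the deviation and the sign of the sentiment score are not the same thing.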
Outlets That Varied the Most
We took the absolute value of each outlet’s deviation from the mean for each case and combined them to see which outlets’ coverage departed most from the norm. CNBC, NPR, and Fox News varied the most of all outlets in how they covered these stories.
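One way to read this step is as a sum of absolute deviations per outlet, ranked from largest to smallest. The sketch below assumes that "combined" means summed, which is our assumption for the illustration; the outlet names, case labels, and numbers are hypothetical, as is the function name rank_by_total_abs_deviation.

```python
# Hypothetical deviations from the cross-outlet mean, per case (not the article's data).
deviations = {
    "Outlet A": {"case_1": 0.125, "case_2": -0.25, "case_3": 0.5},
    "Outlet B": {"case_1": -0.125, "case_2": 0.0, "case_3": 0.125},
}

def rank_by_total_abs_deviation(devs):
    """Sum |deviation| over cases for each outlet and sort from most to least variable."""
    totals = {
        outlet: sum(abs(d) for d in per_case.values())
        for outlet, per_case in devs.items()
    }
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)

ranking = rank_by_total_abs_deviation(deviations)
for outlet, total in ranking:
    print(f"{outlet}: {total:.3f}")
```

Taking absolute values first keeps a positive deviation on one case from cancelling out a negative deviation on another.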
We found that the anecdotal observations about differences in the use of racial terms are indeed true. Coverage of both the Arbery and Rittenhouse cases focused heavily on race, while coverage of the Brooks case did not.
We also found that media outlets do cover the cases very differently, some more so than others. CNBC, Fox News, and NPR had the three largest deviations from the rest of the media in how they covered these stories. Yahoo and USA Today covered these stories with the least variability.