You can tell any story you want to explain any of those count changes. You could theorize that the words reflect the prevalence of concern about issues those words are often associated with (which is my theory). But so what? Is it going to help you?
Right now, this is like arguing over the meaning of an abstract painting. If you wanted to do something serious, you’d want to compare the trends for one word against a set of other words, either related or unrelated. Or better yet, you’d run a factor analysis to try to figure out which words changed most closely in lock step with each other.
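To make the "lock step" idea concrete, here is a rough sketch of the simpler of the two approaches: correlating the year-to-year changes of each word's count against every other word's. It is not a factor analysis, just pairwise Pearson correlation, and everything in it is an assumption for illustration: the function names are mine, and the numbers in `toy` are made up, not real counts for any word.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists of numbers.

    Raises ZeroDivisionError if either list is constant (zero variance).
    """
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def changes(series):
    """Year-to-year differences, so a shared long-term trend doesn't dominate."""
    return [later - earlier for earlier, later in zip(series, series[1:])]


def lockstep_pairs(freqs):
    """Return (correlation, word_a, word_b) tuples, most similar pair first."""
    deltas = {word: changes(series) for word, series in freqs.items()}
    pairs = [
        (pearson(deltas[a], deltas[b]), a, b)
        for a, b in combinations(sorted(deltas), 2)
    ]
    return sorted(pairs, reverse=True)


# Made-up frequencies for illustration only; NOT real data.
toy = {
    "traffic jam": [1.0, 2.0, 1.5, 3.0, 2.5],
    "gasoline": [2.0, 4.0, 3.0, 6.0, 5.0],
    "umbrella": [5.0, 4.0, 5.0, 3.0, 4.0],
}

for r, a, b in lockstep_pairs(toy):
    print(f"{a} / {b}: {r:+.2f}")
```

Even if two words come out highly correlated, that only tells you they rise and fall together. It says nothing about why.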
But then what do you have? I am very skeptical of word counting. I think people make much more of it than is there. I think it can give you some guidance about themes that might be contained in the documents you are looking at, but the only way to test whether those themes are there (whether you are interpreting the counted words in any sensible way) is to actually read the material the words were counted from.
Who’s going to do that?
This is qualitative data analysis. This is not something that can really be quantified beyond counting words. Is anyone surprised that the term “traffic jam” first starts appearing around 1900 and has increased in a roughly linear fashion since then, except for a few downticks? The first is around World War II; the second around the oil crisis in the mid-70s; and the third, which is still continuing, started somewhere around 2005, when the wars in Afghanistan and Iraq were under way.
In other words, the use of the term “traffic jam” seems to decline when the oil supply for consumers seems to be declining and/or when the economy is declining. Except that during the Depression, the use of the phrase kept increasing, although at a slightly slower rate.
So, how do we explain this? My guess is that with less access to gasoline, there are fewer cars on the road and that, with fewer cars, there are fewer traffic jams and people write about traffic jams less often. It could also be that people care less about traffic jams at these times when they are worried about other things in the economy.
So what does it mean that the words “and” and “the” have been slowly declining over the last century?