I was recently asked what types of digital corpora are available for tracking changes in word frequency over time. In addition to Google’s N-gram Viewer, I would recommend their Insights project, which gives a more recent and detailed picture. Though its time span is considerably shorter (’04-’12), Insights is a remarkable tool, since search queries have a more democratic tinge to them than publications do. It reveals what populations are curious about and willing to seek out.
Then just this morning I discovered Capitol Words, a project by the Sunlight Foundation. As they describe it,
Capitol Words scrapes the bulk data of the Congressional Record from the Government Printing Office, does some computer magic to clean-up and organize the data, then presents an easy-to-use front-end website where you can quickly search the favorite keywords of legislators, states or dates.
The new version now allows users to search, index and graph up to five-word phrases that give greater context and meaning to the turns-of-phrase zinging across the aisle. Where we once could only track individual terms like ‘health’ or ‘energy,’ now we can break down the issue further into ‘health care reform,’ ‘renewable energy,’ ‘high energy prices’ or however you wish.
Such a site promises to be a playground for rhetoricians.
Now go play.
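For anyone who would rather play offline, the kind of phrase tracking described above can be roughly sketched in a few lines of Python. This is only a toy illustration, not how Capitol Words itself works: the sample `records`, the function names `ngrams` and `phrase_counts_by_year`, and the crude word splitting are all my own assumptions.

```python
import re
from collections import Counter, defaultdict


def ngrams(text, max_n=5):
    """Yield every phrase of 1 to max_n consecutive words in text.

    Word splitting is deliberately crude (lowercase letters and
    apostrophes only), and phrases may span sentence boundaries.
    """
    words = re.findall(r"[a-z']+", text.lower())
    for n in range(1, max_n + 1):
        for i in range(len(words) - n + 1):
            yield " ".join(words[i:i + n])


def phrase_counts_by_year(records, max_n=5):
    """Map each year to a Counter of the phrases seen in that year."""
    counts = defaultdict(Counter)
    for year, text in records:
        counts[year].update(ngrams(text, max_n))
    return counts


# Hypothetical sample data, invented for illustration only.
records = [
    (2008, "High energy prices hurt families. Renewable energy can help."),
    (2009, "Health care reform is urgent. Health care reform cannot wait."),
    (2010, "Renewable energy creates jobs, and renewable energy cuts costs."),
]

counts = phrase_counts_by_year(records)
for year in sorted(counts):
    print(year,
          counts[year]["renewable energy"],
          counts[year]["health care reform"])
```

Swapping in real text keyed by year (or by speaker, or by state) gives a rough picture of how a phrase rises and falls; a serious version would also need to normalize by the total number of words per year.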