Software

R Packages

Events

Events is an R package for manipulating event data of the kind generated by KEDS/Tabari. Go to the events homepage.

Austin

Austin is an R package for doing things with words. Right now it allows you to scale texts in the style of Wordscores and Wordfish. Go to the austin homepage.

Resha

Resha is an R package re-implementing Harun Reşit Zafer’s stemming tool for Turkish. He says it’s “less aggressive than Snowball”. You can find it here: https://github.com/conjugateprior/Resha.

Java Applications

Yoshikoder

Yoshikoder is a cross-platform multilingual content analysis program. Go to the Yoshikoder homepage.

JFreq

JFreq counts words, quickly. If you have a lot of documents that need to be preprocessed and turned into a word frequency matrix in a hurry without filling up your disk, this might be the software for you. Go to the JFreq homepage.
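To make "word frequency matrix" concrete, here is a minimal Python sketch of the idea: one row per document, one column per word. It is not JFreq's code and makes no attempt at its speed or disk-friendliness; the tokenizing rule and the names `tokenize` and `word_frequency_matrix` are assumptions made up for the illustration.

```python
import re
from collections import Counter


def tokenize(text):
    # Crude assumption: lowercase, then keep runs of letters, digits and apostrophes.
    return re.findall(r"[a-z0-9']+", text.lower())


def word_frequency_matrix(docs):
    """Turn {document name: text} into (vocabulary, {document name: row of counts})."""
    counts = {name: Counter(tokenize(text)) for name, text in docs.items()}
    # Columns are the sorted union of every word seen in any document.
    vocab = sorted(set().union(*counts.values()))
    # A Counter returns 0 for words a document never uses.
    rows = {name: [c[word] for word in vocab] for name, c in counts.items()}
    return vocab, rows


docs = {"a.txt": "The cat sat.", "b.txt": "The cat and the dog."}
vocab, rows = word_frequency_matrix(docs)
print(vocab)
for name, row in rows.items():
    print(name, row)
```

Everything is held in memory here, which is fine for a handful of documents and is exactly what you would want to avoid for a large collection.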

YKConverter

The YKConverter is a utility that tries to extract the text from documents in various formats (HTML, Word, PDF, Powerpoint, Excel, encoded text) and save it as UTF-8 encoded plain text. Go to the YKConverter homepage.

Re-encoder

The Re-encoder takes a folder full of text files in one file encoding and switches them into another one. It needs a better name. Go to the Re-encoder homepage.
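For readers who would rather script this step, the same idea can be sketched in a few lines of Python. This is an illustration only, not the Re-encoder's implementation; the function name `reencode_folder`, the `*.txt` filter and the UTF-8 default are assumptions.

```python
from pathlib import Path


def reencode_folder(src_dir, dst_dir, from_encoding, to_encoding="utf-8"):
    """Copy every .txt file in src_dir to dst_dir, switching the file encoding."""
    src, dst = Path(src_dir), Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    written = []
    for path in sorted(src.glob("*.txt")):
        # Decode from the old encoding; this raises UnicodeDecodeError if it is wrong.
        text = path.read_bytes().decode(from_encoding)
        out = dst / path.name
        # Working with bytes leaves line endings exactly as they were.
        out.write_bytes(text.encode(to_encoding))
        written.append(out)
    return written
```

Something like `reencode_folder("latin1_texts", "utf8_texts", "latin-1")` would then write UTF-8 copies into a separate folder and leave the originals untouched (both folder names are hypothetical).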

Python

Content Analysis in Python

A brief demonstration of how easy it is to do basic content analysis in Python. Not really software but not really a tutorial either. Perhaps it will be useful or inspiring to someone.
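To give a flavour of what "basic content analysis" can mean, here is a small dictionary-based counting sketch. It is not the demonstration itself; the toy categories, the word lists and the function name `category_counts` are all invented for this example.

```python
import re
from collections import Counter

# Hypothetical toy dictionary: each category maps to the words that count towards it.
DICTIONARY = {
    "economy": {"tax", "taxes", "jobs", "budget"},
    "security": {"army", "police", "border"},
}


def category_counts(text, dictionary):
    """Count how many tokens in text fall under each dictionary category."""
    tokens = re.findall(r"[a-z']+", text.lower())
    freq = Counter(tokens)
    return {cat: sum(freq[word] for word in words) for cat, words in dictionary.items()}


print(category_counts("Taxes and jobs: the budget cuts jobs, not the army.", DICTIONARY))
```

Real dictionaries usually need wildcards or stemming as well, which this sketch leaves out.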

There are more bits of Python code on the blog.

Third Party Software

VBPro

VBPro is Mark Miller’s classic free content analysis software. I am simply hosting the latest version and cannot answer questions about it. Please address questions to Mark: mmarkmiller [at] mac.com.

Download vbpro.zip (for Windows only).