Constants in Logit scales

A little while ago I got a query about the calculation of the logit policy scales from Lowe et al. (2011). I thought it might be useful to repeat the answer slightly more publicly, in case anybody else was wondering. The pesky constants in that paper confuse people. Anyway, here’s the question:

In the article you give the formula as log(R+.5)-log(L+.5). I had assumed that in the formula ‘R’ and ‘L’ were the total number of sentences on each ‘side’ of a policy scale, and so .5 is added to the total number of sentences in all the manifesto categories assigned to each side of a policy scale. However, I was reading an article [… where they] seem to add .5 to each of the categories assigned to a policy scale and then also divide by the number of items used in the scale (their approximate formula [without proper subscripting] is: p = [(log(p_1+.5)-log(p_2+.5)) + \ldots + (log(p_3+.5)-log(p_4+.5))]/3, where p is a manifesto category). Consequently I’m slightly worried that I’ve misinterpreted how you calculate your scale.
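To make the two readings in the question concrete, here is a minimal Python sketch that computes both. The function names, the pairing of right and left categories, and the sentence counts are all hypothetical, chosen only for illustration; the sketch shows that the two readings give different numbers and does not say which one is intended.

```python
import math

def logit_scale_totals(right_counts, left_counts, k=0.5):
    """First reading: sum the sentence counts on each side, then add k once per side."""
    R = sum(right_counts)
    L = sum(left_counts)
    return math.log(R + k) - math.log(L + k)

def logit_scale_per_category(right_counts, left_counts, k=0.5):
    """Second reading: add k to every category, take the log difference for each
    right/left pair, and divide by the number of pairs."""
    pairs = list(zip(right_counts, left_counts))
    return sum(math.log(r + k) - math.log(l + k) for r, l in pairs) / len(pairs)

# Hypothetical sentence counts for three right and three left categories.
right = [12, 0, 3]
left = [4, 5, 0]

print(logit_scale_totals(right, left))        # log(15.5) - log(9.5)
print(logit_scale_per_category(right, left))  # mean of three per-pair log ratios
```

With sparse categories the per-category version is much more sensitive to the constant, because .5 is added to every zero count rather than once per side.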

OK, so the way to think about this scale is as follows…

Unicode in R packages (not)

Perhaps you are trying to add your nice new object as data for an R package. But wait. It has [gasp] foreign letters in its dimnames, so ‘R CMD check’ will certainly complain.

What you need is something to turn R’s natural Unicode-processing goodness into a relic from the early days of computing without inadvertently aliasing any words that differ only in their non-ASCII characters. Here’s a handy iconv-invoking function to do that…

A conversion to Yoshikoder format

A couple of months ago somebody asked how to convert a new dictionary file so that it would run in the Yoshikoder. The format of the original file had two parts: the first half looked like:

1	funct							
2	pronoun							
3	ppron							
4	i							
5	we							

which assigns identifiers to category labels, and a second half that looked like:

a	1	10
abandon*	125	127	130	131	137
abdomen*	146	147			
abilit*	355

in which words or wildcarded patterns were assigned to categories via their identifiers. How to get it into Yoshikoder-readable format?
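As a first step towards any conversion, the two halves can be read into one mapping from category label to patterns. The following is a minimal Python sketch of that parsing step only, not of the Yoshikoder file format itself. It assumes the fields are tab-separated, that the identifier lines come before the pattern lines, and that no pattern consists purely of digits; the function name `parse_dictionary` is made up for this example.

```python
def parse_dictionary(lines):
    """Return ({id: label}, {label: [patterns]}) from the two-part dictionary file."""
    labels = {}
    categories = {}
    for line in lines:
        # Drop the empty fields left behind by trailing tabs.
        fields = [f for f in line.rstrip("\n").split("\t") if f.strip()]
        if len(fields) < 2:
            continue  # blank or separator line
        if fields[0].isdigit():
            # First half: identifier followed by category label.
            labels[fields[0]] = fields[1]
            categories.setdefault(fields[1], [])
        else:
            # Second half: pattern followed by one or more category identifiers.
            for cat_id in fields[1:]:
                # Assumption: fall back to the raw identifier if it was never defined.
                label = labels.get(cat_id, cat_id)
                categories.setdefault(label, []).append(fields[0])
    return labels, categories

sample = [
    "1\tfunct\t\t\t",
    "2\tpronoun",
    "a\t1\t10",
    "abilit*\t2",
]
labels, categories = parse_dictionary(sample)
for label, patterns in categories.items():
    print(label, patterns)
```

From a mapping like `categories`, writing out whatever structure the target program expects becomes a separate serialisation step.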