On the use and abuse of weasels in science journalism

Dean Burnett writes a column in the Guardian, sometimes about science but more entertainingly about pseudo-, wannabe-, and not-actually-science. Most of the time this is good BS-shovelling fun and I recommend it. Unfortunately, today we get some ill-considered overreach under the guise of shovelling.

The subject is a silly equation purporting to define how depressing any day of the year is, and thereby to identify the most depressing one. It is silly enough that it doesn’t deserve a link, it has no redeeming features, and if you’ve not read it yet you’re just lucky. Burnett is right: it is nonsense.

His arguments are more interesting, if a bit alarming. To put it bluntly: if they were sound, they would sink all regression models about any social issue of any interest to anybody ever.

Constants in Logit scales

A little while ago I got a query about the calculation of the logit policy scales from Lowe et al. (2011). I thought it might be useful to repeat the answer slightly more publicly, in case anybody else was wondering. The pesky constants in that paper confuse people. Anyway, here’s the question:

In the article you give the formula as log(R+.5)-log(L+.5). I had assumed that in the formula ‘R’ and ‘L’ were the total number of sentences on each ‘side’ of a policy scale, and so consequently .5 is added to the total number of sentences in all the manifesto categories assigned to each side of a policy scale. However, I was reading an article [… where they] seem to add .5 to each of the categories assigned to a policy scale and then also divide by the number of items used in the scale (their approximate formula [without proper subscripting] is: p = [(log(p_1+.5)-log(p_2+.5)) + \ldots + (log(p_3+.5)-log(p_4+.5))]/3, where p is a manifesto category). Consequently I’m slightly worried that I’ve misinterpreted how you calculate your scale.
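To make the two readings in the question concrete, here is a minimal Python sketch that computes both. The function names and the sentence counts are hypothetical, chosen only for illustration, and the second function is a guess at what the quoted approximate formula intends (pairing right and left categories and averaging the per-pair differences). It is not taken from either paper.

```python
import math

def scale_from_totals(right_counts, left_counts):
    """Reading 1: sum the sentences on each side, then add 0.5 once per side."""
    R = sum(right_counts)
    L = sum(left_counts)
    return math.log(R + 0.5) - math.log(L + 0.5)

def scale_per_category(right_counts, left_counts):
    """Reading 2 (assumed): add 0.5 to every category, take the log
    difference for each right/left pair, and average over the pairs."""
    pairs = list(zip(right_counts, left_counts))
    diffs = [math.log(r + 0.5) - math.log(l + 0.5) for r, l in pairs]
    return sum(diffs) / len(pairs)

# Hypothetical sentence counts for three paired manifesto categories.
right = [10, 0, 5]
left = [2, 3, 1]

print(scale_from_totals(right, left))
print(scale_per_category(right, left))
```

With these made-up counts the two readings give noticeably different values, which is why the choice of where the .5 goes matters.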

OK, so the way to think about this scale is as follows…

Unicode in R packages (not)

Perhaps you are trying to add your nice new object as data for an R package. But wait. It has [gasp] foreign letters in its dimnames, so ‘R CMD check’ will certainly complain.

What you need is something to turn R’s natural Unicode-processing goodness into a relic from the early days of computing, without inadvertently aliasing any words that differ only by a non-ASCII element. Here’s a handy iconv-invoking function to do that…
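The R function itself is not reproduced in this excerpt. As a rough illustration of the same idea in Python rather than R, the sketch below strips names down to ASCII and refuses to continue if two different names would collapse into the same one. It uses Unicode decomposition from the standard library instead of iconv, so its transliterations may differ from what iconv produces, and the function name `to_ascii` is hypothetical.

```python
import unicodedata

def to_ascii(names):
    """Return ASCII-only versions of names, raising if two distinct
    names would become identical (i.e. would be aliased)."""
    seen = {}      # ASCII form -> the original name that produced it
    result = []
    for name in names:
        # Split accented letters into base letter + combining mark,
        # then drop everything that is not ASCII. Characters with no
        # decomposition (e.g. 'ø') are dropped entirely.
        decomposed = unicodedata.normalize("NFKD", name)
        ascii_name = decomposed.encode("ascii", "ignore").decode("ascii")
        previous = seen.setdefault(ascii_name, name)
        if previous != name:
            raise ValueError(
                f"{previous!r} and {name!r} both become {ascii_name!r}"
            )
        result.append(ascii_name)
    return result

print(to_ascii(["Müller", "Škoda"]))
```

The collision check is the part that matters for the aliasing concern: a plain conversion would silently merge, say, "Muller" and "Müller".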

A conversion to Yoshikoder format

A couple of months ago somebody asked how to convert a new dictionary file so that it would run in the Yoshikoder. The format of the original file had two parts: the first half looked like:

1	funct							
2	pronoun							
3	ppron							
4	i							
5	we							

which assigns identifiers to category labels, and a second half that looked like:

a	1	10
abandon*	125	127	130	131	137
abdomen*	146	147			
abilit*	355

in which words or wildcarded patterns were assigned to categories via their identifiers. How to get it into Yoshikoder-readable format?
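As a first step, the two halves can be read into a mapping from category label to patterns. The Python sketch below assumes the halves have already been split into two lists of tab-separated lines, since the excerpt does not show how they are delimited in the file. The function names are hypothetical, and writing the result out as a Yoshikoder dictionary file is not shown here.

```python
from collections import defaultdict

def parse_categories(lines):
    """Map category identifier -> label from lines like '1<TAB>funct'."""
    categories = {}
    for line in lines:
        # Drop empty fields left behind by trailing tabs.
        fields = [f.strip() for f in line.split("\t") if f.strip()]
        if len(fields) >= 2:
            categories[fields[0]] = fields[1]
    return categories

def parse_patterns(lines, categories):
    """Map category label -> list of patterns from lines like
    'abandon*<TAB>125<TAB>127'. Unknown ids are kept under the raw id."""
    by_category = defaultdict(list)
    for line in lines:
        fields = [f.strip() for f in line.split("\t") if f.strip()]
        if len(fields) < 2:
            continue
        pattern, ids = fields[0], fields[1:]
        for cat_id in ids:
            label = categories.get(cat_id, cat_id)
            by_category[label].append(pattern)
    return dict(by_category)

# Lines modelled on the excerpt; the label for id 10 is made up.
cats = parse_categories(["1\tfunct\t\t\t", "10\tarticle"])
print(parse_patterns(["a\t1\t10", "abilit*\t355"], cats))
```

Keeping unknown identifiers under their raw id, rather than discarding them, makes it easier to spot categories that are missing from the first half.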