Dean Burnett writes a column in the Guardian, sometimes about science but more entertainingly about pseudo-, wannabe-, and not-actually-science. Most of the time this is good BS-shovelling fun and I recommend it. Unfortunately, today we get some ill-considered overreach under the guise of shovelling.
The subject is a silly equation purporting to define how depressing any day of the year is, and thereby to identify the most depressing one. It is silly enough that it doesn’t deserve a link: it has no redeeming features, and if you’ve not read it yet you’re just lucky. He’s right. It is nonsense.
His arguments are more interesting, if a bit alarming. To put it bluntly: if they were sound, they would sink every regression model of any social issue of any interest to anybody, ever.
Continue reading “On the use and abuse of weasels in science journalism”
R’s formula interface is sweet but sometimes confusing. ANOVA is seldom sweet and almost always confusing. And random (a.k.a. mixed) versus fixed effects decisions seem to hurt people’s heads too. So, let’s dive into the intersection of these three.
Continue reading “Formulae in R: ANOVA and other models, mixed and fixed”
A little while ago I got a query about the calculation of the logit policy scales from Lowe et al. (2011). I thought it might be useful to repeat the answer slightly more publicly, in case anybody else was wondering. The pesky constants in that paper confuse people. Anyway, here’s the question:
In the article you give the formula as log(R+.5)-log(L+.5). I had assumed that in the formula ‘R’ and ‘L’ were the total number of sentences on each ‘side’ of a policy scale and so consequently .5 is added to the total number of sentences in all the manifesto categories assigned to each side of a policy scale. However I was reading an article [… where they] seem to add .5 to each of the categories assigned to a policy scale and then also divide by the number of items used in the scale (their approximate formula [without proper subscripting] is: p = [(log(p_1+.5)-log(p_2+.5)) + \ldots + (log(p_3+.5)-log(p_4+.5))]/3, where p is a manifesto category). Consequently I’m slightly worried that I’ve misinterpreted how you calculate your scale.
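For reference, the two readings contrasted in the question can be written out side by side. The notation here is introduced purely for illustration and is not taken from the paper: $r_i$ and $l_i$ stand for the sentence counts in the $i$-th pair of right and left manifesto categories, $K$ is the number of such pairs, and $R = \sum_i r_i$, $L = \sum_i l_i$ are the side totals.

$$
\theta_{\text{totals}} = \log(R + 0.5) - \log(L + 0.5)
$$

$$
\theta_{\text{per-category}} = \frac{1}{K} \sum_{i=1}^{K} \bigl[ \log(r_i + 0.5) - \log(l_i + 0.5) \bigr]
$$

As a made-up example, take $K = 2$ with $r = (8, 2)$ and $l = (1, 1)$. The first reading gives $\log(10.5) - \log(2.5) \approx 1.44$, while the second gives $\tfrac{1}{2}\bigl[\log(8.5) - \log(1.5) + \log(2.5) - \log(1.5)\bigr] \approx 1.12$. The two readings generally give different numbers, which is why the question matters.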
OK, so the way to think about this scale is as follows…
Continue reading “Constants in Logit scales”
You have an SQLite database, perhaps as part of some replication materials, and you want to query it from R. You might want to be able to say:
results <- runsql("select * from mytable order by date")
and get the results back as an R object. Here's a function to do it.
Continue reading "Querying an SQLite database from R"
Perhaps you are trying to add your nice new object as data for an R package. But wait. It has [gasp] foreign letters in its dimnames, so ‘R CMD check’ will certainly complain.
What you need is something to turn R’s natural Unicode-processing goodness into a relic from the early days of computing, without inadvertently aliasing any words that differ only in a non-ASCII character. Here’s a handy iconv-invoking function to do that…
Continue reading “Unicode in R packages (not)”
A couple of months ago somebody asked how to convert a new dictionary file so that it would run in the Yoshikoder. The format of the original file had two parts: the first half looked like:
which assigns identifiers to category labels, and a second half that looked like:
a 1 10
abandon* 125 127 130 131 137
abdomen* 146 147
in which words or wildcarded patterns were assigned to categories via their identifiers. How to get it into Yoshikoder-readable format?
Continue reading “A conversion to Yoshikoder format”
Perhaps you have a file written in Markdown with embedded R, of the kind that RStudio makes so nice and easy, but you’d like a range of output formats to keep your collaborators happy. Say LaTeX, PDF, HTML and MS Word. Here’s what you might do.
Continue reading “R Markdown to other document formats”