I've already had occasion to deal with an online claim that Patsy "must have" written the note, based on something called "statement analysis." The findings presented on that website were easy to debunk, since the author was focused on Patsy and only Patsy, cherry picking certain words or phrases of hers that were, supposedly, sure "signs of deception," and ignoring the need to query other statements by other suspects, whose utterances might have been similarly questionable. Nor was it ever clear exactly why certain phrases or modes of expression were necessarily deceptive, outside the author's own personal notion of how a guilty person would express himself. This sort of thing is confirmation bias writ large and can easily be dismissed.
But this new finding, posted by a very interesting, intelligent and well-informed fellow named Tom Berger, is something else entirely. Berger's finding was based on a computer program called Jstylo, developed at Drexel University specifically for the purpose of identifying the authors of anonymous texts, using algorithmic methods based strictly on statistical analysis, with no room for the sort of subjective judgments we've seen so often coming from people claiming to have solved this case. Berger entered statements and documents produced by both Patsy and John Ramsey into the Jstylo program, along with a large set of documents from the "Enron Email Corpus," a body of texts dating from the Enron scandal, now archived at Carnegie Mellon University for use as a reference corpus in research. The Enron texts, along with a few others selected by Berger, function as what he called "placebos," i.e., controls to provide a sufficiently large body of documents for the software to sift through. The Ramsey ransom note was entered as the unknown text, to be compared with all the others in the search for the closest match, stylistically.
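For readers wondering what a search for "the closest match, stylistically" might look like in practice, here is a minimal Python sketch. To be clear, this is not Jstylo's method and not what Berger ran: it is my own toy illustration, assuming a tiny hand-picked list of function words as the features and cosine similarity as the measure of closeness. The names `FUNCTION_WORDS`, `style_vector` and `closest_author`, and the sample texts, are all made up for the example.

```python
import math
import re
from collections import Counter

# Assumption: a tiny, hand-picked feature set. Real stylometry tools use
# far more features; this list is purely illustrative.
FUNCTION_WORDS = ["the", "and", "of", "to", "a", "in", "that",
                  "is", "you", "i", "it", "will", "not", "be"]


def style_vector(text):
    """Relative frequency of each function word in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    total = len(words) or 1  # avoid dividing by zero on empty text
    return [counts[w] / total for w in FUNCTION_WORDS]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def closest_author(unknown_text, known_texts):
    """Return (best_author, scores) for the unknown text.

    known_texts maps an author label to one sample of that author's writing.
    """
    target = style_vector(unknown_text)
    scores = {author: cosine(target, style_vector(sample))
              for author, sample in known_texts.items()}
    return max(scores, key=scores.get), scores


if __name__ == "__main__":
    # Placeholder texts, invented for the example; not real exemplars.
    known = {
        "author_a": "The meeting is in the morning and the agenda is in the mail.",
        "author_b": "You will not be late. I will not wait, and you know it.",
        "control": "Of all the options, that is a fine one to consider.",
    }
    best, scores = closest_author("You will be there, and I will not be.", known)
    print(best, scores)
```

Note that a scheme like this always names somebody as the closest match, however poor the fit, which is one reason the choice of exemplars and controls matters so much.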
According to Berger, the software "took the ransom note and asked--which text does this ransom note look most like? Result--Patsy Ramsey at 75%. I ran it again with different emails and text, and then different data mining algorithms, same result." Berger then goes on to quote from one of Patsy's Christmas messages included in his sample: "Had there been no birth of Christ, there would be no hope of eternal life, and, hence, no hope of ever being with our loved ones again," observing that "the ransom note also has the same unusual and grammatically incorrect 'and hence' which looks highly suspicious, but once the software runs, it creates much more than this--Wordprint creates 800 variables per text, creating sliding window analysis and a broad range of things that are tested."
Naturally my eyebrows went up when I read that conclusion. Unlike so many dubious efforts to "prove" that Patsy wrote the ransom note, based largely on confirmation bias, this result looked solid. The computer can't be accused of bias, and the method employed by this software is supported by years of serious linguistic research.
Now anyone familiar with this blog realizes that, for me, the notion that Patsy wrote the ransom note makes no sense, and anyone who insists she wrote it has to be mistaken. Not a popular view, admittedly. My analysis is not based on a close examination of the note itself, either from a handwriting or a content perspective, because as I see it, such efforts can never be conclusive, but on a very different approach, based primarily on a careful consideration of both the facts and the logic of the case, too often ignored during the course of the investigation.
So! In all the years I've followed this case, I've never encountered what I would regard as a definitive challenge to my interpretation -- until now. If in fact this software is truly accurate in matching an unknown text to its author, then it looks very much as though Patsy has, for the first time, been conclusively identified as the author of the notorious ransom note -- and the case, finally, has been solved.
Or has it? Obviously, the next thing to do was get hold of Jstylo myself, and do some testing of my own, to learn more about what Berger's results actually meant and make sure they could be replicated. I went to the Jstylo website, downloaded the software, figured out, after a bit of a struggle, how to get it running (it's a Java-based program, which makes things a bit tricky) and entered some samples of my own from both Patsy and John -- only mine were different from the ones Berger used, and I was wondering if different exemplars might produce a different result.
One cause of skepticism was his use of the Christmas message with the notorious "and hence," usually assumed to have been written by Patsy, but actually composed by both Patsy and John. As reported in my blog post Johnisms, John actually uttered that phrase during an interview originally posted at the "Newseum" website. I've never been able to find a single instance of Patsy using "and hence" in any interview, letter, or other document known to have been produced by her. In fact the phrase is much more John's style, which tends to be rather formal, than Patsy's, which is often rather breezy, mostly informal and generally much more colloquial than John's rather stilted language.
If we can assume, and I think we can, that the "and hence" in the Christmas message emanated from John, then we can also assume that many other things in that message might also have originated with him. Yet Berger included this text in his Patsy sample. Would the results look different if it were removed? I compiled samples of my own from Patsy and John's police interviews, as well as some other texts, both spoken and written, and entered them into the appropriate slots in Jstylo. I also included some other items, such as the Leopold-Loeb ransom note, a letter by John's lawyer, Hal Haddon, and Steve Thomas's rather verbose introduction to his 1997 interview with Patsy. I then proceeded to the next page in Jstylo, where I accepted certain default features, and then to the next page, where I was prompted to select a "Classifier." And at that point, the program stalled. No matter what I selected, I got the same error message. The program refused to let me continue, and nothing I tried worked, including beefing up the Java memory heap, re-installing the software, and reverting to an earlier version. And nothing has worked since. Very frustrating.
Fortunately, Mr. Berger has turned out to be a remarkably sympathetic and cooperative fellow, and when I emailed him with a request to help me out, he more than complied. I sent him copies of my samples and asked him to enter them into Jstylo himself, which he willingly did. Shortly afterward, I received an email message reflecting his latest finding:
"Don't give up your theory just yet. I couldn't get a result. The output has enron at 100%. I tried various classifiers until I got to the one with the highest hit rate and least error, and the highest Kappa value. Using that classifier, the bayesian one I have used a lot, the results were enron."
There's a lot more to the story, as you might imagine, which I'll try to summarize in my next post.
To be continued . . .