I don't think Mr. Berger will mind my calling him Tom. Over the last week or so we've exchanged several emails and by this time I see him as a friend. Not that we always agree. But I find him to be affable, open-minded, reasonable and extremely helpful. Also very knowledgeable. As I mentioned last time, when Tom did me the favor of double-checking his previous result with a different set of exemplars, he came up with a completely different "suspect," drawn from the large pool of Enron texts used as a control. Did someone who once worked at Enron write the Ramsey ransom note? I seriously doubt it. The Enron "hit" can be regarded as an artifact of the methodology, designed to control for confirmation bias by including a large sampling of texts from a presumably innocent source.
At that point, puzzled by the new result, and after consulting a respected text on the subject, Tom decided that his samples were not long enough to satisfy a statistically oriented system:
1--You need 6500+ words to TRAIN ie your suspects and corpora

2--You need 500+ words to TEST ie ransom note.

(Tom has also alluded to problems stemming from an attempt to compare verbal with written texts. Ideally they should be treated separately -- but that would cut down on the sample size, as the pool of written texts by Patsy and John is limited.) He decided to toss out the relatively short samples from Haddon, Thomas, and Leopold-Loeb, and possibly he managed to beef up the size of the samples from Patsy and John -- though I doubt he was able to find anything close to 6500 words from each. (The sample I sent from Patsy was less than 3000.) Then, trying one more time, using his favorite Bayesian classifier, he once again got Patsy:

As I said, I added more enron to try to get over 6000 words per author. Ran it again--This time Patsy at 100%.

At this point I see a potential problem. If it's simply a matter of feeding texts into a black box and getting a result, based strictly on probabilities derived from prior research, that's one thing. But where we have a situation with all sorts of knobs and dials that can be adjusted -- sample size, number of sample texts, feature sets, classifiers -- we open the door to confirmation bias. If you give the software a spin and it doesn't give you what you want, you can decide there was a problem with your inputs, adjust them, and try again, until you get your desired result. I'm not saying anyone does that deliberately, and I'm not accusing Tom of doing anything like that either; I honestly believe he didn't. Nevertheless, it is a concern.
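Those knobs and dials can be made concrete with a toy sketch. This is nothing like JStylo's actual pipeline -- the texts, names, and feature choice below are all invented for illustration -- but it shows how a single tunable setting (here, the character n-gram size) sits between the inputs and the "suspect" the software names:

```python
# Toy stylometric "black box" -- an invented illustration, not
# JStylo's real method. The n-gram size is one of several dials.
from collections import Counter
import math

def char_ngrams(text, n):
    """Count overlapping character n-grams in a lowercased text."""
    text = text.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))

def cosine(a, b):
    """Cosine similarity between two n-gram count profiles."""
    dot = sum(a[k] * b[k] for k in set(a) | set(b))
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def attribute(questioned, exemplars, n=3):
    """Return (best_author, scores): the exemplar author whose text
    is most similar to the questioned text at this n-gram size."""
    q = char_ngrams(questioned, n)
    scores = {author: cosine(q, char_ngrams(text, n))
              for author, text in exemplars.items()}
    return max(scores, key=scores.get), scores
```

Every argument here -- which n, which similarity measure, which exemplars go into the pool -- is a dial, and turning one can change who the winner is. That is the door to confirmation bias the paragraph above describes.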
Tom added some reassuring comments, based on his considerable experience with this sort of methodology:
I dont see it like you---the automatic software is not proof positive, its a strong indication ie red flag. . . . These are not tools to convict but to point the investigation, get more info, direct resources etc.. . .
What this means is that you dont have to give up your theory, because even with my result, the probability is realistically maybe 70-75% because the training sample of patsy and john were lower than recommended. We both look at the results differently--You are disappointed because you want certainty, I am delighted with 70% (a guess) under the circumstances, so we are both happy....your theory still stands! But I have a small window of opportunity too. Thats how I see it anyway.

Very gracious. But I am nevertheless troubled. Why would this software settle on Patsy out of all the many Enron samples?
There's more, however: shortly after this report, I received the following email:
Hi. . . Things get weirder... I tested your patsy exemplars against 4 patsy texts I have, and it is not similar, according to Jstylo. In fact, john exemplars is a close match to patsy exemplars...In other words, 4 patsy texts I have do not match your patsy exemplars----your john exemplars is a close match to patsy exemplars. Make sense?? me neither.

In other words, the Jstylo software, designed to pick the author of an anonymous text out of a wide array of known exemplars, was unable to match two different texts from the same author. I suspect that, once again, the problem is sample size. Algorithms of this kind are based on probabilities, and probabilities rest on "the law of large numbers": the smaller the sample, the less amenable it is to this sort of test. Which brings me to the ransom note itself, weighing in at roughly 370 words (according to an online word counter). According to the text Tom consulted, 500 words is the absolute minimum for a test document such as the ransom note. We must also take into consideration that some of the ransom note text is drawn from external sources:
"At this time, we have your daughter in our possession. She is safe and unharmed. . ." "You will withdraw $118,000 from your account. $100,000 will be in $100 bills and the remaining $18,000 in $20 bills." Standard ransom note language, obviously no help in identifying anyone.
"If we catch you talking to a stray dog, she dies. If you alert bank authorities, she dies." "Don't try to grow a brain . . . " Obviously lifted from movie scripts.
"Use that good, Southern common sense of yours." Generally thought to be a deliberate reference to a pet phrase of Patsy's, which should be eliminated from consideration to avoid obvious bias.
That's 66 extraneous words that should be eliminated from the note before attempting a match with Patsy, John, or anyone else, leaving a text of only 304 usable words -- while a length of 500 or more is considered necessary for a meaningful test.
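The subtraction above, spelled out as a quick sanity check (the counts are this post's own rough figures, not an official tally):

```python
# Word-count arithmetic for the ransom note (my rough figures).
NOTE_WORDS = 370          # ransom note, per an online word counter
BORROWED_WORDS = 66       # boilerplate, movie lines, "Southern common sense"
MINIMUM_TEST_WORDS = 500  # floor recommended by the text Tom consulted

usable = NOTE_WORDS - BORROWED_WORDS
print(usable)                        # 304
print(usable >= MINIMUM_TEST_WORDS)  # False -- well short of the minimum
```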
As should now be clear:

1. the Jstylo software has some limitations, possibly due to sample size issues, possibly due to more fundamental design flaws;

2. regardless, the Ramsey ransom note is much too brief to be meaningfully evaluated by statistically oriented methods;

3. the brevity of the note suggests that we need not rely on statistics at all, but can evaluate its content directly.
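The second point is just the law of large numbers at work, and a toy simulation makes the size of the effect visible. The "author" and the 6% rate below are invented, not real Ramsey data; the point is only how much noisier a stylistic frequency estimate is at ransom-note length than at the recommended training length:

```python
# Toy simulation: how noisy is a word-frequency "fingerprint"
# at different text lengths? (Invented numbers, not real data.)
import random
import statistics

random.seed(42)

TRUE_RATE = 0.06  # suppose our author uses the word "the" 6% of the time

def observed_rate(n_words):
    """Observed 'the' frequency in one simulated text of n_words."""
    hits = sum(1 for _ in range(n_words) if random.random() < TRUE_RATE)
    return hits / n_words

def spread(n_words, trials=200):
    """Standard deviation of the observed rate across many texts."""
    return statistics.pstdev(observed_rate(n_words) for _ in range(trials))

short = spread(300)   # roughly ransom-note length
long = spread(6500)   # roughly the recommended training length
# The 300-word estimate is several times noisier than the
# 6500-word one -- the shorter the text, the less the observed
# frequencies tell you about the author's true habits.
```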
I recall a saying I heard recently: "If you have a talking dog, you don't need statistics."
In a blog post titled Johnisms, I've already identified several key words or phrases drawn from various utterances of John Ramsey that match similar words and phrases from the note in a manner that, if nothing else, makes one wonder. If anyone thinks he or she can do something similar with Patsy, I invite them to give it a try. Of course, for Tom anything other than a statistical comparison based on distinctive linguistic features is far too crude to be meaningful. That's one point on which we can agree to disagree. When dealing with a text of several thousand words, then obviously a statistical analysis is the only meaningful option -- but when the text is only 300 words or so, it seems to me that statistics are both irrelevant and unnecessary.
Despite the many problems with the Jstylo methodology, whether due to sample size or some other issue, I find it puzzling that Patsy would turn up as a match to the ransom note at all. To me, there is nothing whatever in the style of that note that comes close to Patsy's style, either verbal or written. So what is it in the sample Tom used that led to Patsy as author of the note? One thing I can think of is his inclusion of the Christmas message authored jointly by Patsy and John. Was it the "and hence" that triggered the match? Or some other feature of that message, possibly authored by John rather than Patsy? Or the phrase "good southern common sense," probably a deliberate reference to Patsy, intended as sarcasm? Was the match just a coincidence? Or maybe she wrote it after all, and I've been wrong. Hard to say. And food for thought.
As for Jstylo and other methods of that type, I find them extremely interesting and very promising, especially where sizeable texts are involved, such as long essays or books. I looked for references to Jstylo via Google, by the way, and could not find any dating from later than 2013. And my efforts to contact some of the people involved in that project have drawn a blank. I have a sneaking suspicion that further testing by independent researchers failed to replicate their results. But I have no doubt that any such problems will be overcome in time, and an exciting new forensic tool, comparable to fingerprinting or DNA identification, will emerge.