IA on AI


When Formulas Go Horribly Wrong

I came upon an article that wasn’t necessarily AI related, but it teaches a heckuva lesson. Carl Bialik of the Wall Street Journal writes a regular column under the banner “The Numbers Guy“. In his March 14th installment, entitled “Can You Read as Well As a Fifth-Grader? Check the Formula”, he talks about how Microsoft Word’s grammar checker uses an antiquated formula to decide if your sentences are in line with the typical reader’s skill level. He points to numerous such formulas that can differ wildly about the grade levels they ascribe to your writing.

Things that are taken into consideration include words per sentence, the length of words, and the number of syllables per word. For example, he explains that Word’s formula is as follows:

It multiplies 0.39 by the average number of words per sentence, adds that to 11.8 times the average number of syllables per word, and subtracts 15.59 from the total. The result is the supposed minimum grade level of readers who can handle the text in question.

Even as an AI designer and programmer, especially as one who specializes in algorithmic construction for simulations that purportedly mimic reality, I found that to be not only a little esoteric but rather simplistic. (That sentence scored a grade level of 20.9 – email me if you need help understanding it.)

Note that they don’t care about what the text is saying or the sentence structure at all. (Grade 5.6) Therefore, as Carl pointed out:

The formulas treat writing as a mere collection of words and spaces. Word meaning and sentence structure don’t figure. George Weir, a philosopher and computer scientist at the University of Strathclyde in Glasgow, says Word’s readability test thinks grade-schoolers could handle the nonsense passage, “Acuity of eagles whistle truck kidney. Head for the treacle sump catch and but. What figgle faddle scratch dog and whistle?” Similarly, “Had lamb little a Mary” and “Mary had a little lamb” score identically.

I will let you peruse the rest of the article yourself in order to get the full, head-shaking magnitude of the stupidity of it all. (11.6) My point, as it relates to artificial intelligence, is that constructing formulas to score something is sometimes a complex issue bordering on the quixotic. (15.8 – although I admit spelling out “artificial intelligence” rather than “AI” just for the syllable count to get an extra 3 grade levels. On the other hand I should have gotten extra credit for “quixotic”. Incidentally, for lots of extra credit, play it on a “triple word score” in Scrabble!)
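For the curious, the formula quoted above is the Flesch-Kincaid grade level, and it is simple enough to sketch in a few lines of Python. (The syllable counter below is a naive vowel-group heuristic, not whatever Word actually uses internally, so expect the numbers to be approximate.)

```python
import re

def count_syllables(word):
    """Crude syllable estimate: count runs of vowels. A real checker
    would use a dictionary lookup; this is just a stand-in."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def grade_level(text):
    """Flesch-Kincaid grade level, exactly as quoted above:
    0.39 * words-per-sentence + 11.8 * syllables-per-word - 15.59."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    wps = len(words) / len(sentences)
    spw = sum(count_syllables(w) for w in words) / len(words)
    return 0.39 * wps + 11.8 * spw - 15.59
```

Notice that nothing in there ever looks at word order, so “Mary had a little lamb.” and “Had lamb little a Mary.” really do score identically, just as the testers found.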

Where’s the AI, Dave?

One of the most common errors in constructing weighted sum formulas to score a concept or recreate a behavior is simply leaving out an important factor. You can play with the weights of the factors you did include all you want – and that is great for fine tuning – but if you left out an input that the “real world” takes into consideration, you will never be able to completely mimic the result set you are trying to match. In the case of the grammar checker, there was no accounting for the actual content of the text. I will give them a little slack given that natural language processing is a large mountain to climb (can I use “quixotic” again?) even as a stand-alone application, much less a simple convenience tool added onto an existing program. However, because it wasn’t taken into consideration, the results not only suffered greatly, but when tested to the corners of the envelope, failed in rather spectacular fashion. (Grade 14.6 for the whole paragraph.)
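To make the missing-factor point concrete, here is a deliberately toy sketch (the names, weights, and inputs are invented for illustration, not taken from any shipped game): a “real-world” threat score that depends on cover, and a weighted-sum model that omits it. No amount of tuning the remaining weights can separate two situations that differ only in the input you left out.

```python
def real_threat(distance, health, in_cover):
    # Hypothetical "ground truth" behavior we are trying to mimic.
    return 0.5 * (1.0 - distance) + 0.3 * health + (0.0 if in_cover else 0.2)

def model_threat(distance, health, w_distance, w_health):
    # Our weighted-sum model -- cover was never added as an input,
    # so it cannot influence the score no matter how we tune the weights.
    return w_distance * (1.0 - distance) + w_health * health

# Same distance and health, different cover: the real scores differ,
# but the model produces a single score for both situations.
exposed = real_threat(0.4, 0.9, in_cover=False)
covered = real_threat(0.4, 0.9, in_cover=True)
modeled = model_threat(0.4, 0.9, w_distance=0.5, w_health=0.3)
```

Tweaking `w_distance` and `w_health` moves `modeled` up and down, but it can never match `exposed` and `covered` at the same time.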

Therein lies another lesson. If I were to type up a school paper, a blog post, or even my next contribution to the “AI Game Programming Wisdom” series, I would likely be falling into the comfort zone of the algorithm. I would be giving it input that it expects and, therefore, can deal with in a relatively safe manner. The results may not be accurate to a fine degree, but they don’t look ridiculous. However, if I were to hammer it with the nonsense words and phrases that the testers did, I expose the alleged “grammar checker” for what it is – a mathematical algorithm that falls flat on its face once you move beyond some vague threshold of comprehensibility. (11.3… welcome back to high school!)

But isn’t this the way at least some portion of the gaming public approaches our games? In a brief blurb over on Post-Play’em (Driving Crisis in Crysis), I pointed out an issue that Crysis had with exactly this problem. If people played the game within certain bounds of behavior, the AI looked fantastic. However, once people started realizing that certain factors were not taken into account by the AI agents, the soft underbelly of the engine was not only exposed but widely publicized on YouTube. They had “broken” the AI. (Not as in “broken a code” but as in “my car is broken… again.”) (8.7?!? I’m regressing!)

So, the challenge to us as AI designers is… “what else have I not considered?” It may be an input from the game mechanics that we haven’t added, a constraint on a variable that may get out of control, or a scaling of a continuous input that becomes more or less relevant as it moves up and down the scale. The point is, we are constantly wrestling with trying to mimic the grandiosity and magnitude of chaos theory as it manifests itself in the “real world”. As much as the game AI world celebrates the increase in clock cycles that we are being granted by the Powers That Be (the graphics programmers), we will never have enough to completely do deterministic modeling on the scale of reality. So, we will always need to pick and choose what inputs we are using in the first place. Then, and only then, should we allow ourselves to be free to adjust… and tweak… and massage… and tinker… and tune… (10.7 – although one sentence in there had an 18.2!)
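As one small illustration of the last two items on that list (constraining a variable and rescaling a continuous input), here is the kind of helper I mean. The curve shape and the names are mine, invented for the example, not lifted from any particular engine:

```python
def clamp(x, lo=0.0, hi=1.0):
    """Constrain a variable so it cannot run out of control."""
    return max(lo, min(hi, x))

def response(x, exponent=2.0):
    """Rescale a normalized input along a curve: with exponent > 1 the
    input matters little at the low end of the scale and much more
    near the top."""
    return clamp(x) ** exponent
```

Drop the exponent below 1.0 and the emphasis flips to the low end of the scale instead.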

So, in closing I would like to suggest the following:

We shall heretofore proceed with unambiguous certitude upon a magnificent, glorious expedition into sesquipedalian practices with our ultimate ambition being that of maximizing our alleged numerical educational equivalence as ostensibly adjudged by the ill-conceived but utilitarian application of algorithmic approaches used by the multinational, multidisciplinary, conglomerate known colloquially, industrially and professionally as Microsoft.

(How does a grade level of 36.6 for that sentence hit you? I rule!! As my high school chemistry teacher was apt to tell my class “Your loquaciousness is exceeded only by your verbosity.” We were Juniors and that sentence scores a 12.6… no wonder no one understood him. Translated: “The only thing you do more than talk too much is… talk too much.”)

(I only get 11.1 for the whole article? Are you kidding me?!?)




One Response to “When Formulas Go Horribly Wrong”

  1. Andrew Armstrong says:

    Hahaha, funny post Dave 🙂 I see Word’s language parser sucks. I wonder how many points I get for this too 🙂

    Back to English school for you! (or are you too cool for school now? 🙂 Microsoft thinks you aren’t!).






Content 2002-2018 by Intrinsic Algorithm L.L.C.
