
AI Architectures: A Culinary Guide (GDMag Article)

Thursday, November 1st, 2012

(This is a slightly edited copy of the feature article I wrote for Game Developer Magazine that appeared in their August, 2012 issue.)

A question that seems to come up regularly from students, designers, and even from veteran programmers is “what AI architecture should I use?” It’s a query that is sprinkled across internet message boards, comes up at GDC dinners, and I’m sure is a frequent topic in pre-production meetings at game studios – indie and AAA alike. As a game AI consultant, I certainly field it from my clients on a regular basis. More often than not, the less-than-satisfying answer to this question is a resounding, “It depends.” What follows resembles something more along the lines of an interrogation than an answer.

It puts me in mind of the days long ago when I was a waiter. (I suppose they call them “servers” now.) People would often ask me, “what do you recommend?” or the even vaguer, “what’s good here?” For those that haven’t had the opportunity to work in the restaurant business, you may not realize that this can be a horribly uncomfortable question to be presented with. After all, most people are well aware that tastes differ… so why would your server be able to ascertain what it is you have a proverbial hankerin’ for? The way out of this situation would be to ask questions in return. “Well, how hungry are you? Are you in a hurry? What are you in the mood for? Steak? Chicken? Allergic to peanuts? Oh… you’re vegan? And you need gluten-free, eh? Well that certainly redirects things.” The result of this Q&A is not that I tell them what they want, but rather I help them discover for themselves what it is they really want.

Much the same falls out of the conversations surrounding AI architectures. After all, there isn’t necessarily one best way of doing things. Often, as I mentioned before, it simply depends. What is it you are trying to do? What are your technical limitations? What is your experience? What is your time frame? How much authorial control do your designers need? Really, it is something that you as the developer need to ascertain for yourself. As a waiter, I can only help you ask the right questions of yourself and point you in the right direction once we have the answers.

Unfortunately, many of the available books, articles, and web sites only tell you how the different architectures work. They often don’t tell you the pros and cons of each. That often leads to misguided enthusiasts proclaiming with exuberant confidence, “I will create my [incredible AI] with a [not even remotely appropriate technology]!!” Don’t get me wrong… those resources are often very good at telling you how to work on a technical level with the different methods. What is missing is why you would do so.

At the 2010 GDC AI Summit, we presented a panel-based session entitled “Deciding on an AI Architecture: Which Tool for the Job?” The premise of the panel pitted four different technologies against each other to see which of them was most appropriate to use in four different game AI situations. Each architecture was represented by a person whose job it was to argue for their own and against the others. (Spoiler alert: the four highly-contrived game situations were specifically chosen so that each architecture was shown to have a strong point.) Even with the pre-written outcomes designed to steer toward a “best” answer, the three “losing” panelists managed to make the point that their method could work but simply was not the best tool for that job.

But that’s part of the problem, isn’t it? Certainly most AI architectures could manage to muddle through most any AI situation. However, that often leads people to a false sense of security. AI is often complicated enough that some amount of hair-pulling is expected. This often obscures the fact that an inappropriate—or, shall we say, “less than optimal”—method is making things more difficult than it should be. All of this brings us back to our initial question—“what AI architecture should I use?” And, once again, I respond, “it depends.”

Same Stuff—Different Shapes

It’s the same stuff… just different shapes.

If I may tap once again into my food metaphor, selecting an AI architecture is often like selecting Mexican food—for most intents and purposes, it’s the same stuff… just different shapes. Allow me to expound. The contents of most Mexican food can be reduced to some combination of a fairly succinct list of possibilities: tomato, cheese, beans, lettuce, onions… you know the drill. Additionally, the non-vegan among us often elect to include some form of meat (real or faux). Think of these ingredients as the behavior content of our AI. This is the stuff that gives all the flavor and “substance” to the dish. But what about the outside? Often we order our Mexican food in terms of its shape or form—a taco… a burrito… etc. Why do we think in terms of the outside form when it is the stuff that is on the inside that is pretty much the point of the order in the first place?

The answer lies in the fact that the outside—that is, the shell or wrapper of some sort—is merely a content delivery mechanism. For the most part, it exists only as a way of keeping those internals together long enough for consumption. In this way, these “delivery mechanisms” compare to AI architectures. They only serve to package and deliver the tasty behavioral content that we want our players to experience. But why are there so many different forms for such a utilitarian function? And which form do I use? Well, it depends.

The same holds true when we talk about our AI systems. We often speak in terms of the mechanism rather than the content. We are writing a finite state machine, a behavior tree, or a planner, just like we are filling a taco, a burrito, or an enchilada. Once we have decided on that packaging, we can put any (or all!) of the tasty stuff into it that we want to. However… there are pros and cons to each.

The Tostada—Just a Pile of Stuff

Starting simple, let’s look at the tostada. It is, quite literally, about as simple a delivery platform as you can use for our Mexican food. Everything just sits on top of it right where you can see it. You can add and remove ingredients with ease. Of course, you are somewhat limited on how much you can put on since eventually it will start to fall off the sides. Although it is a little difficult to grab initially, if you pick it up correctly, everything stays put. However, it’s not a terribly stable platform. If you tip it in the slightest, you run the risk of sending things tumbling. What’s more, as soon as you start biting into it, you run the risk of having the whole thing break in unpredictable ways, at which point your entire pile of content falls apart. For all intents and purposes, the tostada really isn’t a container at all once you start using it—it’s just a hard flat object that you pile stuff on top of and hope it stays put.

You never know when the entire platform is going to simply fall apart.

From an AI standpoint, the equivalent isn’t even really an architecture. This would be similar to simply adding rules here or there around our code that change the direction of things in a fairly haphazard manner. Obviously, the problem with that is, like the tostada, you can only get so much content before things become unstable. It’s a bit unwieldy as well. Most importantly, every time you take a bite of your content, you never know when the entire platform is going to simply fall apart.
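
To make that concrete, here is a minimal sketch (in Python, with entirely hypothetical names) of what those scattered, architecture-free rules tend to look like. Each branch was presumably bolted on at a different time for a different reason, and the ordering of the checks quietly decides the behavior:

    # A minimal sketch of the "tostada" non-architecture: behavior rules
    # sprinkled straight into the update code. All names are hypothetical.

    class Agent:
        def __init__(self, health=100, pos=0.0):
            self.health = health
            self.pos = pos
            self.action = "idle"

    def update_guard(guard, player_pos, player_has_taco):
        # one rule added early in development...
        if abs(player_pos - guard.pos) < 5:
            guard.action = "attack"
        # ...another bolted on later to fix a bug...
        elif guard.health < 20:
            guard.action = "flee"
        # ...and a special case added the week before ship.
        elif player_has_taco:
            guard.action = "hide"
        else:
            guard.action = "idle"

    guard = Agent(health=15, pos=0.0)
    update_guard(guard, player_pos=3.0, player_has_taco=False)
    print(guard.action)  # "attack" -- rule ordering wins, even at 15 health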

Adding a Little Structure—The State Machine Taco

Our tostada suffered from not having enough structure to hold the content stable before it started to fall off—or fall apart entirely. However, by just being a little more organized about how we arrange things, we can make sure our content is a lot more self-contained. In the Mexican food world, we can simply bend our tostada shell into a curve and, behold, it becomes a taco! The result is that we can not only hold a lot more content, but we can also pick it up, manipulate it, and move it around to where we can get some good use out of it.

Adding a little bit of structure to a bunch of otherwise disjointed rules maps over somewhat to the most basic of AI architectures—the finite state machine (FSM). The most basic part of a FSM is a state. That is, an AI agent is doing or being something at a given point in time. It is said to be “in” a state. Theoretically, an agent can only be in one state at a time. (This is only partially correct because more advanced agents can run multiple FSMs in parallel… never mind that for now. Suffice to say, each FSM can only be in one state.)

The reason this organizes the agent behavior better is because everything the agent needs to know about what it is doing is contained in the code for the state that it is in. The animations it needs to play to act out a certain state, for example, are listed in the body of that state. The other part of the state machine is the logic for what to do next. This may involve switching to another state or even simply continuing to stay in the current one.

The transition logic in any given state may be as simple or as complex as needed. For example, it may simply involve a countdown timer that says to switch to a new state after a designated amount of time. It may be a simple random chance that a new state should be entered. For example, State A might say that, every time we check, there is a 10% chance of transitioning to State B. We could even elect to make the new state that we will transition to a result of a random selection as well—say a 1/3 chance of State B and a 2/3 chance of State C.

More commonly, state machines employ elaborate trigger mechanisms that involve the game logic and situation. For instance our “guard” state may have the logic, “if [the player enters the room] and [is holding the Taco of Power] and [I have the Salsa of Smiting], then attack the player” at which point my state changes from “guard” to “attack”. Note the three individual criteria in the statement. We could certainly have a different statement that says, “if [the player enters the room] and [is holding the Taco of Power] and [I DO NOT have the Salsa of Smiting], then flee.” Obviously, the result of this is that I would transition from “guard” to “flee” instead.

So each state has the code for what to do while in that state and, more notably, when, if, and what to do next. While some of the criteria can access some of the same external checks, in the end each state has its own set of transition logic that is used solely for that state. Unfortunately, this comes with some drawbacks.
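
As a rough illustration, here is a minimal sketch of that guard in Python. The state names and conditions are hypothetical; the thing to notice is that every state carries its own transition checks, which is exactly what becomes painful as the machine grows:

    # A minimal finite state machine sketch; each state owns its transition logic.
    import random

    class GuardFSM:
        def __init__(self):
            self.state = "guard"

        def update(self, player_in_room, player_has_taco, i_have_salsa):
            if self.state == "guard":
                if player_in_room and player_has_taco and i_have_salsa:
                    self.state = "attack"
                elif player_in_room and player_has_taco and not i_have_salsa:
                    self.state = "flee"
            elif self.state == "attack":
                if not player_in_room:
                    self.state = "guard"
            elif self.state == "flee":
                # e.g. a simple 10% chance per update of calming back down
                if random.random() < 0.10:
                    self.state = "guard"

    fsm = GuardFSM()
    fsm.update(player_in_room=True, player_has_taco=True, i_have_salsa=False)
    print(fsm.state)  # "flee"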

Figure 1 – As we add states to a finite state machine, the number of transitions increases rapidly.

First, as the number of states increases, the number of potential transitions increases as well—at an alarming rate. If you assume for the moment that any given state could potentially transition to any of the other states, the number of transitions increases fairly quickly. Specifically, the number of transitions would be the [number of states] × ([number of states] – 1). In Figure 1, there are 4 states each of which can transition to 3 others for a total of 12 transitions. If we were to add a 5th state, this would increase to 20 transitions. 6 states would merit 30, etc. When you consider that games could potentially have dozens of states transitioning back and forth, you begin to appreciate the complexity.

What really drives that issue home, however, is the realization of the workload that is involved in adding a new state to the mix. In order to have that state accessible, you have to go and touch every single other state that could potentially transition to it. Looking back at Figure 1, if we were to add a State E, we would have to edit states A-D to add the transitions to E. Editing a state’s logic invokes the same problem. You have to remember what other states may be involved and revisit each one.

And the bigger it gets, the more opportunity for disaster.

Additionally, any logic that would be involved in that transition must also be interworked into the other state-specific logic that may already be there. With the sheer numbers of states in which to put the transition logic and the possible complexity of the integration into each one, we realize that our FSM taco suffers from some of the same fragility of the ad hoc tostada we mentioned earlier. Sure, because of its shape, we can pile a little more on and even handle it a little better. One bite, however, could shatter the shell and drop everything into our lap. And the bigger it gets, the more opportunity for disaster.

A Softer Approach—The Behavior Tree

So the problem with the taco is that it can hold a fair bit of content (more than the tostada anyway), but is a bit brittle. It is not the shape of the taco that is at fault as much as it is a result of using the hard shell. If only we could hold the same content delivered in much the same manner, but have our container be less prone to shattering when we put some pressure on it. The answer, of course, is the soft taco. And the analogue in the AI architecture could very well be the behavior tree.

At this point, it is useful to point out the difference between an action and a decision. In the FSM above, our agents were in one state at a time—that is, they were “doing something” at any given moment (even if that something was “doing nothing”). Inside each state was decision logic that told them if they should change to something else and, in fact, what they should change to. That logic often has very little to do with the state that it is contained in and more to do with what is going on outside the state or even outside the agent itself. For example, if I hear a gunshot, it really doesn’t matter what I’m doing at the time—I’m going to flinch, duck for cover, wet myself, or any number of other appropriate responses. Therefore, why would I need to have the decision logic for “React to Gunshot” in each and every other state I could have been in at the time?

Figure 2 – In a behavior tree, the decision logic is separate from the actual state code.

This is the strength of the behavior tree. It separates the states from the decision logic. Both still exist in the AI code, but they are not arranged so that the decision logic is in the actual state code. Instead, the decision logic is removed to a stand-alone architecture (Figure 2). This allows it to run by itself—either continuously or as needed—where it selects what state the agent should be in. All the state code is responsible for is doing things that are specific to that state such as animating, changing values in the world, etc.

The main advantage to this is that all the decision logic is in a single place. We can make it as complicated as we need to without worrying about how to keep it all synchronized between different states. If we add a new behavior, we add the code to call it in one place rather than having to revisit all of the existing states. If we need to edit the transition logic for a particular behavior, we can edit it in one place rather than many.

Figure 3 — A simple behavior tree. At the moment the agent has decided to do a ranged attack.

Another advantage of behavior trees is that there is a far more formal method of building behaviors. Through a collection of tools, templates, and structures, very expressive behaviors can be written—even sequencing behaviors together that are meant to go together (Figure 3). This is one of the reasons that behavior trees have become one of the more “go-to” AI architectures in games, having been notably used in titles ranging from Halo 2 and 3 to Spore.

A detailed explanation of what makes behavior trees work, how they are organized, and how the code is written is beyond the scope of this article. Suffice to say, however, that they are far less prone to breaking their shell and spilling their contents all over your lap. Because the risk of breaking is far less, and the structure is so much more organized, you can also pack in a lot more behavioral content.
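
Still, to give a taste of the structure, here is a minimal sketch of the two workhorse composite nodes, a selector and a sequence, with all the decision logic living in the tree and the leaf actions kept separate. The node names are hypothetical, and the “a node returns success or failure” convention is an assumption; real implementations add running states, decorators, and more:

    # A minimal behavior tree sketch: composites return True (success) or
    # False (failure); leaf names and the blackboard dict are hypothetical.

    def selector(*children):
        # succeed on the first child that succeeds
        def tick(agent):
            return any(child(agent) for child in children)
        return tick

    def sequence(*children):
        # succeed only if every child succeeds, in order
        def tick(agent):
            return all(child(agent) for child in children)
        return tick

    # Leaf nodes: conditions and actions
    def enemy_visible(agent):     return agent.get("enemy_visible", False)
    def has_ranged_weapon(agent): return agent.get("has_ranged_weapon", False)
    def ranged_attack(agent):     print("ranged attack"); return True
    def melee_attack(agent):      print("melee attack");  return True
    def idle(agent):              print("idle");          return True

    # All of the decision logic lives here, separate from the actions.
    root = selector(
        sequence(enemy_visible, has_ranged_weapon, ranged_attack),
        sequence(enemy_visible, melee_attack),
        idle,
    )

    root({"enemy_visible": True, "has_ranged_weapon": True})  # prints "ranged attack"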

For an excellent primer on behavior trees, see Bjoern Knafla’s Introduction to Behavior Trees on #AltDevBlogADay

A Hybrid Taco—The Hierarchical Finite State Machine

Figure 4 – In a hierarchical finite state machine, some states contain other related states making the organization more manageable.

A brief note before we leave the land of tacos behind. One of the advantages of the behavior tree—namely the tree-like structure—is sometimes applied to the finite state machine. In the hierarchical finite state machine (HFSM), there are multiple levels of states. Higher-level states are only concerned with transitioning to other states on the same level. Meanwhile, lower-level states inside a parent state can only transition to each other. This tiered separation of responsibility helps to provide a little structural organization to a flat FSM and helps to keep some of the complexity under control.

If we were to place the HFSM into our Mexican metaphor, it would be similar to one of those nifty hard tacos wrapped in a soft taco shell. There’s still only so much you can pile into it before it gets unwieldy, but at least it doesn’t tend to shatter and make as big of a mess.
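
To sketch the hierarchical idea in the same style (state names hypothetical): the top level only decides among top-level states, and each parent owns its own small set of sub-states:

    # A minimal hierarchical FSM sketch: two tiers of states.
    class GuardHFSM:
        def __init__(self):
            self.top = "patrol"   # top level: patrol <-> combat
            self.sub = {"patrol": "walk_route", "combat": "take_cover"}

        def update(self, enemy_visible, under_fire):
            # Top level only cares about other top-level states.
            self.top = "combat" if enemy_visible else "patrol"
            # Sub-states only transition within their own parent.
            if self.top == "combat":
                self.sub["combat"] = "take_cover" if under_fire else "shoot"
            else:
                self.sub["patrol"] = "walk_route"
            return self.top, self.sub[self.top]

    hfsm = GuardHFSM()
    print(hfsm.update(enemy_visible=True, under_fire=False))  # ('combat', 'shoot')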

Building it from Scratch—Planning your Fajita

If you are anything like my wife, you want to be able to choose exactly what is in your Mexican dish—not just overall, but in each bite. That’s why she orders fajitas. While the fajita looks and acts a lot like a taco (specifically, a soft shell one), the method of construction is a little different. Rather than arriving already assembled, the typical method of serving it is to bring you the tortillas with the content in a couple of separate piles. You then construct your own on the spot like a personal mini buffet. You can choose what you want to put in the first one and then even change it up for the subsequent ones. It all depends on what you deem appropriate for your tastes at that moment.

The AI equivalent of this is the planner. While the end result of a planner is a state (just like the FSM and behavior tree above), how it gets to that state is significantly different.

Like a behavior tree, the reasoning architecture behind a planner is separate from the code that “does stuff”. A planner takes its situation—the state of the world at the moment—and compares it to a collection of individual atomic actions that it could do. It then assembles one or more of these tasks into a sequence (the “plan”) so that its current goal is met.

A planner actually works backwards from its goal.

Unlike other architectures that start from the current state and look forward, a planner actually works backwards from its goal (Figure 5). For example, if the goal is “kill player”, a planner might discover that one method of satisfying that goal is to “shoot player”. Of course, this requires having a gun. If the agent doesn’t have a gun, it would have to pick one up. If one is not nearby, it would have to move to one it knows exists. If it doesn’t know where one is, it may have to search for one. The result of searching backwards is a plan that can be executed forwards. Of course, if another method of satisfying the “kill player” goal is to throw a Taco of Power at it, and the agent already has one in hand, it would likely elect to take the shorter plan and just hurl said taco.
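
Here is a minimal backward-chaining sketch of that “kill player” example in Python. The actions, preconditions, and effects are invented for illustration, and a real planner (GOAP, HTN, or otherwise) handles costs, world-state updates, and cycles far more carefully:

    # A minimal backward-chaining planner sketch. Hypothetical action set;
    # assumes no cyclic preconditions.

    ACTIONS = {
        "shoot player":   {"pre": {"has gun"},            "effect": "player dead"},
        "throw taco":     {"pre": {"has taco"},           "effect": "player dead"},
        "pick up gun":    {"pre": {"near gun"},           "effect": "has gun"},
        "go to gun":      {"pre": {"knows gun location"}, "effect": "near gun"},
        "search for gun": {"pre": set(),                  "effect": "knows gun location"},
    }

    def plan(goal, world_state):
        """Work backwards from the goal; return the shortest plan found (to run forwards)."""
        best = None
        for name, action in ACTIONS.items():
            if action["effect"] != goal:
                continue
            steps, ok = [name], True
            for pre in action["pre"]:
                if pre in world_state:
                    continue
                sub = plan(pre, world_state)   # satisfy the precondition, backwards
                if sub is None:
                    ok = False
                    break
                steps = sub + steps
            if ok and (best is None or len(steps) < len(best)):
                best = steps
        return best

    # Already holding a Taco of Power? The one-step plan wins:
    print(plan("player dead", {"has taco"}))   # ['throw taco']
    # Otherwise the plan chains: search -> go to -> pick up -> shoot.
    print(plan("player dead", set()))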

Figure 5 – The planner has found two different methods of achieving “kill player” and selected the shorter one.

The planner diverges from the FSM and BT in that it isn’t specifically hand-authored. Therein lies the difference in planners—they actually solve situations based on what is available to do and how those available actions can be chained together. One of the benefits of this sort of structure is that it can often come up with solutions to novel situations that the designer or programmer didn’t necessarily account for and handle directly in code.

From an implementation standpoint, a major plus of the planner is that a new action can be dropped into the game and the planner architecture will know how to use it. This speeds up development time markedly. All the author says is, “here are the potential things you could do… go forth and do things.”

Of course, a drawback of this is that authorial control is diminished. In a FSM or BT, creative, “outside the box” solutions were the exception from the predictable, hand-authored systems. In a planner, the scripted, predictable moments are the exception; you must specifically override or trick the planning system to say, “no… I really want you to do this exact thing at this moment.”

While planner-based architectures are less common than behavior trees, there are notable titles that used some form of planners. Most famously, Jeff Orkin used them in Monolith’s creepy shooter, F.E.A.R. His variant was referred to as Goal-Oriented Action Planning or GOAP. For more information on GOAP, see Jeff’s page, http://web.media.mit.edu/~jorkin/goap.html

A more recent flavor of planner is the hierarchical task network (or HTN) planner, such as was used to great effect in Guerrilla’s Killzone 2. For more information on HTN planning in Killzone 2, visit http://aigamedev.com/open/coverage/htn-planning-discussion/

Putting It All in a Bowl—A Utility-Based Salad

Another architecture that is less structured than the FSM or behavior tree is what has, in recent years, been called simply the “utility-based” method. Much like the planner, a utility-based system doesn’t have a pre-determined arrangement of what to do when. Instead, potential actions are considered by weighing a variety of factors—what is good and bad about this?—and selecting the most appropriate thing to do. As you can see, this is similar to the planner in that the AI gets to choose what’s best at the time.

The action with the highest score wins.

Instead of assembling a plan like the fajita-like planner, however, the utility-based system simply selects the single next bite. This is why it is more comparable to a taco salad in a huge bowl. All the ingredients are in the mix and available at all times. However, you simply select what it is that you would like to poke at and eat. Do you want that piece of chicken in there? A tomato, perhaps? An olive? A big wad of lettuce? You can select it based on what you have a taste for or what is most accessible at the moment.

Figure 6 – A typical utility-based system rates all the potential actions on a variety of criteria and selects the best.

One of the more apparent examples of a utility-based AI system is in The Sims. In fact, the considerations are largely shown in the interface itself. The progression of AI architectures throughout The Sims franchise is well documented and I recommend reading up on it. The short version is that each potential action in the game is scored based on a combination of an agent’s current needs and the ability of that action or item to satisfy that need. The agent then uses an approach common in utility-based methods and constructs a weighted sum of the considerations to determine which action is “the best” at that moment. The action with the highest score wins (Figure 6).
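
A minimal sketch of that weighted-sum scoring follows, with hypothetical needs and actions; the numbers are invented purely to show the mechanics:

    # A minimal utility-scoring sketch: score each action as a weighted sum
    # of how urgent each need is and how well the action satisfies it.
    # Needs, actions, and weights are hypothetical.

    needs = {"hunger": 0.8, "energy": 0.3, "fun": 0.5}   # 0..1, higher = more urgent

    actions = {
        "eat taco":  {"hunger": 0.9, "energy": 0.1, "fun": 0.2},
        "take nap":  {"energy": 0.8},
        "play game": {"energy": -0.2, "fun": 0.9},
    }

    def utility(effects):
        # weighted sum: urgency of each need times how well this action fills it
        return sum(needs[n] * effects.get(n, 0.0) for n in needs)

    scores = {name: utility(effects) for name, effects in actions.items()}
    print(scores)
    print("chosen:", max(scores, key=scores.get))   # "eat taco" with these numbers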

While utility-based systems can be used in many types of games, they are more appropriate in situations where there are a large number of potentially competing actions the AI can take—often with no obvious “right answer.” In those times, the mathematical approach that utility-based systems employ is necessary to ferret out what the most reasonable action to take is. Aside from The Sims, other common areas where utility-based systems are appropriate are in RPGs, RTS, and simulations.

Like behavior trees and planners, the utility-based AI code is a reasoner. Once an action is decided upon, the agent still must transition to a state. The utility system is simply selecting what state to go to next. In the same way, then, just like those other systems, the reasoning code is all in a single place. This makes building, editing, tuning and tweaking the system much more compartmentalized. Also, like a planner, adding actions to the system is fairly straightforward. By simply adding the action with the appropriate weights, the AI will automatically take it into account and begin using it in relevant situations. This is one of the reasons that games such as The Sims were as expandable as they were—the agents simply included any new object into their decision system without any changes to the underlying code.

The system is providing suggestions as to what might be a good idea.

On the other hand, one drawback of a utility system is that there isn’t always a good way to intuit what will happen in a given situation. In a behavior tree, for example, it is a relatively simple exercise to traverse the tree and find the branches and nodes that would be active in a particular situation. Because a utility system is inherently more fuzzy than binary, determining how the actions stack up is often more opaque. That’s not to say that a utility-based AI is not controllable or configurable. In fact, utility systems offer a deep level of control. The difference is that rather than telling the system exactly what to do in a situation, the system is providing suggestions as to what might be a good idea. In that respect, a utility system shares some of the adaptable aspects of planners—the AI simply looks at its available options and then decides what is most appropriate.

For more reading on utility-based systems, please check out my book, Behavioral Mathematics for Game AI or, if you have GDC Vault access, you can view my AI Summit lectures (with Kevin Dill) from 2010 and 2012 entitled, “Improving AI Decision Modeling through Utility Theory” and “Embracing the Dark Art of Mathematical Modeling” respectively.

Wrap it Up—A Neural Network Burrito

The last entry in my (extremely strained) metaphorical cornucopia is the burrito. In the other examples, all the content that was being delivered was open and available for inspection. In the case of the fajita, you (the AI user) were able to assemble what you wanted in each iteration. In the taco salad, the hard and soft tacos, and even the tostada, you were able to, as the cliché says, “season to taste”. Perhaps a little extra cheese over the top? Maybe a few extra tomatoes? Even if you didn’t edit the content of your dish, you were at least able to see what you were getting into before you took a bite. There was no mystery about what made up your dinner; everything was open and available for your inspection.

The burrito is different in this respect. You are often told, in general terms, what the burrito is supposed to be packing but the details are often hidden from view. To paraphrase Winston Churchill, “it is a riddle, wrapped in a mystery, inside a soft flour shell.” While the burrito (and for that matter, the neural network) is extremely flexible, you have absolutely no idea what is inside or what you are going to get in the next bite. Don’t like sour cream? Olives? Tough. If it’s in there, you won’t know until you take that bite. At that point, it is too late. There is no editing without completely unwrapping the package and, for all intents and purposes, starting from scratch.

This is the caveat emptor of the NN-based AI solution. As a type of “learning” AI, neural nets need to be trained with test or live performance data. At some point you have to wrap up the training and say, “This is what I have”. If a designer wanders in, looks over your shoulder and says, “It looks pretty cool, but in [this situation] I would like it to do [this action] a little more often,” there’s really nothing you can do to change it. You’ve already closed your burrito up. About all you can do is try to retrain the NN and hope for the best.
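
To illustrate why that is, here is a minimal sketch of a tiny “trained” network with invented weights. The decision lives entirely in those opaque numbers; there is no single value you can nudge to get one specific behavior a little more often:

    # A minimal sketch of a tiny feed-forward net deciding whether to attack.
    # The weights below are invented "post-training" numbers -- the burrito is
    # already wrapped, and none of them maps to a designer-friendly knob.

    import math

    W1 = [[2.1, -1.3], [-0.7, 1.8]]   # input -> hidden weights (hypothetical)
    b1 = [0.4, -0.2]
    W2 = [1.5, -2.2]                  # hidden -> output weights
    b2 = 0.1

    def sigmoid(x):
        return 1.0 / (1.0 + math.exp(-x))

    def should_attack(player_distance, my_health):
        hidden = [sigmoid(W1[i][0] * player_distance + W1[i][1] * my_health + b1[i])
                  for i in range(2)]
        out = sigmoid(W2[0] * hidden[0] + W2[1] * hidden[1] + b2)
        return out > 0.5

    print(should_attack(player_distance=0.2, my_health=0.9))  # True or False -- good luck explaining why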

So while the NN offers some advantages in being able to pile a lot of things into a huge concoction of possibilities, there are huge disadvantages in a lack of designer control after the fact. Unfortunately, this tends to disqualify NNs and other machine-learning solutions from consideration in the game AI environment where that level of control is not only valuable but often a requirement.

That said, there have been a few successful implementations of NNs in games—for example, Michael Robbins used NNs to improve the tactical AI of Supreme Commander 2 from Gas Powered Games. (For those with GDC Vault access, you can see more about his implementation of this in the AI Summit session, “Off the Beaten Path: Non-Traditional Uses of AI”.)

Browsing the Buffet

So we’ve covered a variety of architectures and, for what it’s worth, equated them with our Mexican delights. Going back to our premise, as far as the content goes, it’s all the same stuff. The difference is simply the delivery mechanism—that is, the shape of the wrapping—and the associated pros and cons of each. This has by no means been an exhaustive treatment of AI architectures. The purpose was simply to expose you to the options that are out there and why you may or may not want to select each for the particular tastes and needs of your project. To sum up, though, let’s go through the dishes once again…

You can certainly just throw the occasional rule into your code that controls behavior—but that’s not really an “architecture”. By organizing your AI into logical chunks, you can create a finite state machine. A FSM is relatively easy to construct and for non-programmers to understand. While this is good for simple agents with a limited number of behaviors, it gets brittle quickly as the number of states increases. By organizing the states into the logical tiers of a hierarchical finite state machine (HFSM), you can mitigate some of this complexity.

By removing the reasoning code from the states themselves, you can gain a lot more flexibility. By organizing behaviors into logically similar branches, you can construct a behavior tree. BTs are also fairly easy for designers and other non-programmers to understand. The main advantages, however, are not only the ease with which they can be constructed but how well they scale without the need for a lot of extra programming. Once the BT structure is in place, new branches and nodes can be added easily. However, despite how robust BT implementations can get, it is still a form of hand-authored scripting—“when X, do Y”.

Figure 7 — The type of architecture you select needs to be based on YOUR needs.

Like the behavior tree, a planner allows for very extensible creation. However, where the BT is more hand-authored by designers, a planner simply “solves” situations using whatever it determines is best at the moment. This can be powerful, but also leads to a very scary lack of control for designers.

Similarly, utility-based systems depart from the specific script approach and allow the characters to decide freely what to do and, as above, might unsettle some designers. They are incredibly expandable to large numbers of complex factors and possible decisions. However, they are slightly more difficult to intuit at times although tools can be easily built that aid that process.

The ultimate hands-off black box is the neural network. Even the programmers don’t know what’s going on inside their little neurons at times. The good news is that they can often be trained to “do what human players would do.” That aspect itself holds a lot of appeal in some genres. They are also a little easier to build since you only have to construct the training mechanism and then… well… train it.

There is no “one size fits all” solution to AI architectures.

The point is, there is no “one size fits all” solution to AI architectures. (You also don’t have to limit yourself to a single architecture in a game.) Depending on your needs, your team, and your prior experience, any of the above may be The Right Way for you to organize your AI. As with any technical decision, the secret is to research what you can, even try a few things out, and decide what you like. If I am your waiter (er… AI consultant), I can help you out… but the decision is ultimately going to be what you have a hankerin’ for.

Now let’s eat!

(The author would like to dedicate this article to all the GDC, Gamasutra, and GDMag staff who are still lamenting the departure of the beloved Mexican restaurant that used to be located in their building. I mourn with you, my friends.)

Boston All-Stars Weigh in on AI

Monday, February 15th, 2010

Back in November, there was a get-together of Boston Post Mortem (billed as “games and grog, once a month”) that had a panel of local AI folks. The panelists’ names are familiar to many of us… Damián Isla, Jeff Orkin, and John Abercrombie. It was moderated by Christian Baekkelund whom I had the opportunity to have dinner with in Philly when I was in town for the GameX Industry Summit. Thankfully, this panel was recorded by Darius Kazemi and posted on his Vimeo site and on the Boston Post Mortem page. I embed it here for simplicity’s sake.

Anyway, a few comments on the video:

You’re Doing it Wrong

The first question to the panel was “what do new AI developers do wrong” or something to that effect. Damián set up the idea of two competing mentalities… gameplay vs. realistic behavior. He and John both supported the notion that the game is key and that creating a system just for the sake of creating it can range anywhere from waste of time to downright wrong.
…create autonomous characters and then let the designers create worlds…
The thing that caught me was Jeff’s response, though (5:48). His comment was that AI teams can’t force designers to be programmers through scripts, etc. That’s not their strength and not their job. While that’s all well and good, it was his next comment that got me cheering. He posited that it is the AI programmer’s job to create autonomous characters and then let the designers create worlds in which those characters can do their thing.
Obviously, it isn’t a one-way street there… the designer’s job isn’t to show off the AI. However, I like the idea of the designers not having to worry about implementing behavior at all — just asking for it from the AI programmer and putting the result into their world. John’s echo was that it’s nice to build autonomous characters but with overrides for the designers. It isn’t totally autonomous or totally scripted. This sounds like what he told me about his BioShock experience when I talked to him about it a few years ago.
I happen to agree that the focus needs to be on autonomy first and then specific game cases later. The reason for this is that, too often, the part of the AI that looks “dumb” or “wrong” is the part where the AI isn’t being told to do anything specific. For example, how often would you see a monster or soldier just standing there? Some of the great breakthroughs in this were from places like Crytek in Far Cry, Crysis, etc. The idea of purposeful-looking idle behaviors was a great boon to believable AI.

The other advantage to creating autonomy first was really fleshed out by Jeff Orkin’s work on F.E.A.R. (Post-Play’em) No more do designers (or even AI programmers) have to worry about telling an agent exactly what it should do in a situation. By creating an autonomous character, you can simply drop them in a situation and let them figure it out on their own. This can be done with a planner like Jeff did, a utility-based reasoner, or even a very detailed behavior tree. Like John said above, all you need to remember is to provide the override hooks in the system so that a designer can break an AI out of its autonomy and specifically say “do this” as an exception rather than hand-specifying each and every action.
What’s in Our Way?
The next question was about “the biggest obstacle” for game AI moving forward. Jeff’s first answer was about authoring tools. This has been rehashed many times over. John expressed his frustration at having to start from scratch all the time (and his jealousy that Damián didn’t have to between Halo 2 and 3).
…to get your AI reviewed well, you need to invest in great animators.
Damián’s comment was amusing, however. He suggested that to get your AI reviewed well and have players say “that was good AI”, you need to invest in great animators. This somewhat reflects my comment in the 2009 AI Summit where I pointed out that AI programmers are in the middle of a pipeline with knowledge representation on one side and animation on the other. It doesn’t do any good to be able to generate 300 subtle behaviors if the art staff can only represent 30 of them.
On the other hand, he reiterated what the other guys said about authoring tools and not having to re-invent the wheel. He supports middleware for the basic tasks like A*, navmesh generation, etc. If we don’t have to spend time duplicating the simple tasks over and over, we can do far more innovation with what we have.
That’s similar to my thought process when I wrote my book, “Behavioral Mathematics for Game AI“. You won’t see a description of FSMs, A*, or most of the other common AI fare in the book. How many times has that been done already? What I wanted to focus on was how we can make AI work better through things other authors haven’t necessarily covered yet. (Ironically, it was Jeff Orkin who told me “I don’t think anyone has written a book like that. Many people need to read a book about that. Heck… I’d read a book about that!” — Thanks Jeff!)
What Makes the Shooter Shot?
The next question (11:45) was about what FPS-specific stuff they have to deal with.
When Halo 3 came out, they could afford fewer raycasts than Halo 2.
Damián talked about how their challenge was still perception models. They really tried to do very accurate stuff with that in Halo. He pointed out that raycasting is still the “bane” of his existence because it is so expensive still. Despite the processors being faster, geometry is far more complex. Alarming note: When Halo 3 came out, they could actually afford fewer raycasts than on Halo 2. Now that sucks! Incidentally, the struggle for efficiency in this area very much relates to Damián’s “blackboard” interview that I wrote about last week.
Interestingly, Jeff argued the point and suggested that cheating is perfectly OK if it supports the drama of the game. I found this to possibly be in conflict with his approach to making characters autonomous rather than scripted. Autonomy is on the “realistic” end of the spectrum and “scripted” on the other. The same can be said for realistic LOS checks compared to triggered events where the enemy automatically detects the player regardless.
John split the difference with the profound statement, “as long as the player doesn’t think you’re cheating, it’s totally OK.” Brilliant.
AI as the Focus or as a Focusing Tool
Supporting the overall design of how the game is to be experienced is just as important as the naked math and logic.
In response to a question about what Damián meant about “AI as a game mechanic,” he offered an interesting summation. He said that, from a design standpoint, the AI deciding when to take cover and when to charge is as important as how much a bullet took off your vitality. That is, supporting the overall design of how the game is to be experienced is just as important as the naked math and logic.
He also pointed out that the design of a game character often started out with discussions and examples of how that AI would react to situations. “The player walks into a room and does x and the enemy will do y.” Through those conversations, the “feel” of a type of character would be created and, hopefully, that is what the player’s experience of that type of character would end up being.
In my opinion, a lot of this is accomplished by being able to not only craft behaviors that are specific to a type of enemy (the easy way of differentiation) but also parameterizing personality into those agents so that they pick from common behaviors in different ways. That is, something that all enemies may do at some point or another but different types of enemies do at different times and with different sets of inputs. I went into this idea quite a bit in my lecture from the 2009 AI Summit (Breaking the Cookie-Cutter: Modeling Individual Personality, Mood, and Emotion in Characters) where I talked about incorporating personality into game agents.
The Golden Rules of AI (20:30)
Christian started things off by citing the adage, “it’s more important to not look stupid than to look smart.” No big surprise there.
The player feels good about killing someone if the kill is challenging.
John said that the AI must be “entertaining”. My only problem with this is that different people find different things entertaining. It’s kinda vague. Better to say that the AI must support the design. Both John and Jeff extended this idea by talking about providing a challenge… the player feels good about killing someone if the kill is challenging.
Damián sucked up to my buddy Kevin Dill a little bit by citing a quote that he made in our joint lecture at the GameX Industry Summit, The Art of Game AI: Sculpting Behavior with Data, Formulas, and Finesse. Kevin said that AI programmers must be an observer of life. I certainly agree with this notion… in fact, for years, my little tag at the end of my industry bios has said, “[Dave] continues to further his education by attending the University of Life. He has no plans to graduate any time soon.” In fact, Chapter 2 of my book is titled “Observing the World”… and Kevin was my tech editor. It should be intuitively obvious, even to the most casual observer, that Kevin stole that idea from me! Damián should have cited me in front of a bunch of game developers! So there!

Anyway, Damián’s point was not only how Kevin and I meant it — observing how people and animals do their thing, but also in being a very detailed and critical observer of your AI. There must be a discipline that scrubs out any little hiccup of animation or behavior before they pile up into a disjointed mess of little hiccups.
Jeff agreed to some extent but related something interesting from the development of F.E.A.R. — he said that most development schedules start by laying out the behavior for each type of character and then, if there is time, you go back and maybe try to get them to work together or with the environment, etc. With F.E.A.R., they started from the beginning with trying to work on the coordinated behaviors. With all the shouting and chaos going on with these guys working against you, you don’t notice the little glitches of the individuals quite as much.
Damián backtracked and qualified his comments… not just hunting down everything that is wrong… but rather everything that is wrong that matters.

Look but Don’t Touch
If you are fighting for your life, you don’t notice the details as much.
John brought up an interesting helpful hint. He talked about how, by turning on God mode (invulnerability), you can dispense with the fight for survival and really concentrate on how the AI is behaving. If you are fighting for your life, you don’t notice the details as much.
I have to agree. That’s why when I go to places like E3, I would rather stand back and watch other people play so I can observe what’s going on with the AI. (I always fear that demo people at E3 are going to be put off by my refusal to join in.) This is one of the problems I have when I’m playing for my Post-Play’em articles. I get so caught up in playing that I don’t look for things anymore. One solution is to have a house full of teenagers… that gives you ample time to just sit and watch someone else play games. I recommend it to anyone. (I believe John and Damián have started their respective processes of having teenagers… eventually.)
Emergence… for Good or Evil
If the AI can have the freedom to accidentally do something cool and fun, it also has the freedom to do something dumb or uninteresting.
In response to a question asking if AI has ever done anything unexpected, Damián spoke about how the sacred quest of emergence isn’t always a good thing. He said that emergent behavior must fall out of the AI having the freedom to do things. If the AI can have the freedom to accidentally do something cool and fun, it also has the freedom to do something dumb or uninteresting. Because of that, emergence has a really high cost in that it can be more of a drag on gameplay than the occasional gem it might produce.
Christian qualified the question a little more by asking if there was a time when the emergent behavior was found well before ship and the team thought it was something to encourage. He cited examples from his own experience. John talked about emergent gameplay that resulted from simple rules-based behavior in Bioshock. You could set a guy on fire, he would run to water, then you could shock the water and kill him. Was it necessary? No. Was it fun for the player to be creative this way? Sure.
To Learn or Not to Learn?
One question that was asked was about using machine learning algos. Christian started off with a gem about how you can do a lot of learning just by keeping track of some stats and not resorting to the “fancier stuff.” For example, keeping track of pitches the player throws in a baseball game doesn’t need a neural network. He then offered that machine learning can, indeed, be used for nifty things like gesture recognition.
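Something like this minimal sketch (pitch names hypothetical) is all the “learning” that example needs:

    # Track pitch frequencies and predict the player's favorite -- no NN required.
    from collections import Counter

    pitch_history = Counter()

    def observe(pitch):
        pitch_history[pitch] += 1

    def predict_next_pitch():
        return pitch_history.most_common(1)[0][0] if pitch_history else None

    for p in ["fastball", "curveball", "fastball", "slider", "fastball"]:
        observe(p)

    print(predict_next_pitch())  # "fastball"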
Oh… are you using neural networks?
Jeff admitted that he hates the question that comes up when people find out he does AI in games. They often ask him, “Oh… are you using neural networks?” I think this really hits home with a lot of game AI folks. Unfortunately, even a cursory examination of the AI forums at places like GameDev.net will show that people still are in love with neural networks. (My quip is that it is because The Terminator claims to be one so it’s real sexy.) Thankfully, the AIGameDev forum is a little more sane about them although they do come up from time to time. Anyway, Jeff said he has never used them — and that he’s not even sure he understands them. While he thinks that NNs that are used in-game like in Creatures or Black & White are cool, they are more gimmicky and not as useful with the ever-increasing possibility space in today’s games.
I Have a Plan
It is hard to accommodate the ever-changing goals of the human players.
Jeff acknowledged that the planner revolution that he started has graduated to hierarchical planners to do more complex things like squad interaction. However, one big caveat that he identified was when you incorporate humans into the mix alongside the squad. It is hard to accommodate the ever-changing goals of the human players.
This, of course, brought us back to the idea of conveying intent — one of the nasty little problem spaces of game AI. Damián explained it as a function of interpretation rather than simply one of planning. I agree that this is going to be a major issue for a long time until we can crack the communication barrier such as voice recognition and natural language processing. Until we can tell our AI squad-mates the same thing we can tell our human squad-mates and expect a similar level of understanding, we are possibly at an impasse.
Moving Forward by Standing Still
As the worlds become more complicated, we have to do so much more work just to do the same thing we did before.
Someone asked the panel what sort of things that they would like to see out of smaller, indie developers that might not be able to be made by the bigger teams. To set the stage, Damián responded with a Doug Church quote from the first AIIDE conference. Doug said that we have to “move forward by standing still.” As the worlds become more complicated, we have to do so much more work just to do the same thing we did before. (See Damián’s note about the LOS checks in Halo 2 and 3 above.) Damián suggested that the indie space has more opportunities to move forward with this because they aren’t expected to do massive worlds. Instead, they can focus on doing AI-based games.

This is what Matt Shaer’s interview with me in Kill Screen magazine was going after. With my Airline Traffic Manager as one of his examples, he spoke specifically about how it is the smaller, dedicated developer who is going after the deeper gameplay rather than the bigger world. I hope this is the case. We have seen a few examples so far… there are likely more to come.
Jeff cited interactive storytelling as a possible space for development in this area as well. This is what we are after with one of our sessions at the 2010 AI Summit when Dan Kline, Michael Mateas, and Emily Short deliver AI and Interactive Storytelling: How We Can Help Each Other.
AI on the GPU?
Someone mentioned seeing a demo of improved LOS checks by using the GPU and asked if AI programmers should be pushing that more. John somewhat wittily suggested that AI programmers won’t be using the graphics chips until someone can find a system where a graphics chip isn’t being used at all. Damián was a little more direct in saying that the last people he wanted to work in conjunction with were the graphics programmers. This reminds me of a column I wrote on AIGameDev a few years back, Thou Shalt Not Covet Thy Neighbor’s Silicon. The graphics guys finally got “their space.” They aren’t letting us have any of it any time soon!
Damián pointed out that most advanced AI needs a lot of memory and access to the game state, etc. Those are things that the GPU isn’t really good at. About the only thing that you could do without those things is perhaps flocking. I agree that there really isn’t a lot we can do with the GPU. He did concede that if ATI wanted to do a raycasting chip (rather than borrowing time from the hated graphics guys), that would be beautiful… but that’s about it.
Director as Designer?

Someone asked about the possibility of seeing the technology that was behind Left 4 Dead’s “AI Director” being used as a sort of game designer, instead. Damián pointed out that the idea of a “meta-AI” has been around for years in academic AI and that it is now really starting to get traction in the game world. I agree that the customized gameplay experience is a major turning point. I really like the idea of this as it really comes down to a lot of underlying simulation engine stuff. That’s my wheelhouse, really.
Where to Now?
They closed with some commentary on the future of game AI which I will leave to you to listen to. Suffice to say that for years, everyone has been expecting more than we have delivered. I’m not sure if that is because we are slacking off or because they have vastly underestimated what we have to do as AI designers and programmers. At least, with all the attention that is being brought to it through the Game AI Conference that AIGameDev puts on, the AI Summit now being a regular feature at GDC, and other such events, we are going to be moving forward at some speed.
Hang in there…