Entity Component Systems

Entity Systems are the future of MMOG development – Part 1

A few years ago, entity systems (or component systems) were a hot topic. In particular, Scott Bilas gave a great GDC talk (http://scottbilas.com/files/2002/gdc_san_jose/game_objects_slides.pdf – updated link thanks to @junkdogAP) on using them in the development of Dungeon Siege. The main advantages to entity systems were:

  • No programmer required for designers to modify game logic
  • Circumvents the “impossible” problem of hard-coding all entity relationships at start of project
  • Allows for easy implementation of game-design ideas that cross-cut traditional OOP objects
  • Much faster compile/test/debug cycles
  • Much more agile way to develop code

I first started using entity systems in anger back in 2001-2003, when I was working on MMOG server middleware. We were targeting a few of the most painful problems in MMOG development, one of which was the difficulty of constantly changing your game logic after launch, which led us to entity systems. I learnt then that entity systems were an almost perfect solution to massively speeding up the development time for most MMOG’s, and for also allowing almost unrestrained re-writing of fundamental game features post-launch with very little effort.

That last issue is critical to the success of an MMOG: once an MMOG is launched successfully, its long term success or failure depends more upon the ability of the dev team to evolve that game into a better game month after month than upon anything else.

I learned a lot from that first run-in with them, more about the things that go wrong and what makes developing with entity systems particularly hard than about the things that went right. We also discovered that performance could easily become a major issue – although they are very flexible and dynamic, the lack of pre-compiled lookups and optimizations can make runtime performance disappointingly (unacceptably) poor. Why? Mainly because of the amount of indirection and checks needed to run even a single method call (but I’ll go into detail on that problem, and how to fix it, a bit later).

I moved on, and didn’t think about them again, until last year. In 2006 I joined the Operation Flashpoint 2 team as the lead network programmer, where we were trying to make an MMO-FPS on an unprecedented scale, and I discovered that the programming team was considering an entity-system to drive the whole game. The attractions for the OFP2 team were different – mainly around the improvements to memory management they could get from it, and the stream-oriented coding (which is essential for PS3 development) – but it turned out to be something of a silver bullet for the “MM” and “O” parts of the MMOFPS. As the network programmer, discovering that an entity system was going to be the interconnect for all other subsystems was a huge relief: the entity system meant I could implement the complex latency hiding and prediction techniques with minimal interference with the coding of all the rest of the game systems. When you’ve got 20+ programmers on a team, you really don’t want to be placing yourself in a position where you’re going to have to redesign code for all of them just to make it “network friendly”.

Whilst I was working on OFP2, I got in touch with Scott, and exchanged ideas on what the future might hold for entity systems, including additional uses for them, and on how to share this knowledge more widely. He encouraged me to start a blog (and, actually, a year later that was one of the main reasons I even started this blog, T=Machine), so I figured now was a good time to start (finally) writing about those entities :).

And to blame Scott for the existence of this blog…

Since then, I’ve been thinking a lot off and on about entity systems and their appropriateness for use in MMOG development (again!). Only this time around – thanks to OFP2 and some of the issues it threw up – I had a much better idea how to use them to increase performance rather than reduce it, and had come across some other major problems that they conveniently solve. Most obviously, they work wonders for PS3 development. Knowing how PS3’s fundamental architecture works, I can’t immediately see how you’d want to use anything other than a full entity system for game development. There’s so much horsepower in that beast that you can certainly write games many different ways, but it’s particularly well-suited to this approach.

As it stands, I’m beginning to think that next-generation MMOG’s are going to be practically impossible to develop unless based heavily around a core entity-system. “practically” being the key word – unless you’re willing to spend $100 million for your development…

“Anything is possible”, of course, so there’ll be many exceptions to that – but I think any team that doesn’t go this route is going to suffer a lot because of it.

I think most people would agree. But … recent discussions on game-industry mailing lists – and the experiences I had within games companies with programmers who were much better at programming than I was – made me realise that there’s a lot of ignorance over these systems, and many people aren’t getting anywhere near the full potential of them. So, if you’re interested, read on…

I’ll do this in a series of posts (it’s going to take some time to write it all up!), but it’ll go approximately like this:

ADDENDUM:

EDIT, December 2007 – You may also want to take a look at Mick West’s introduction to entity systems for game development, it’s a shorter read than my posts, but narrower in scope.

NB: as I post the other parts, I’ll update the list above to link to them.

Entity Systems are the future of MMOG development – Part 2

Part 2 – What is an Entity System?

(Part 1 is here)

Sadly, there’s some disagreement about what exactly an Entity System (ES) is. For some people, it’s the same as a Component System (CS) or even Component-Oriented Programming (COP). For others, it means something substantially different. To keep things clear, I’m only going to talk about Entity Systems, which IMHO are a particular subset of COP. Personally, I think Component System is probably a more accurate name given the term COP, but COP itself is so poorly named and so confusing that I find people miss the point entirely if you call these things Component Systems. The best name would be Aspect Systems, but AOP has already taken ownership of the word Aspect.

An entity system is simply a part of your program that breaks up the logic and data of your program in a particular way.

For the most part, it does this using the Component Oriented Programming paradigm instead of the OOP paradigm – but there are subtleties that we’ll go into later.

We refer to an entity “system” instead of entity “programming” because ES’s are usually found embedded in a larger, OOP, program. They are used to solve a subset of problems that OOP is poor at dealing with.

So, it is generally a replacement for Object-Oriented Programming: the two are mutually exclusive, and at any given layer you have to choose one or the other. (Any complex program may have multiple levels of abstraction, and you could mix and match OOP and COP/ES at different layers, but any given layer is going to be one or the other, or, of course, some other alternative way of structuring your program.)

Common misconception #1 – Entity Systems are part of OOP
Common misconception #2 – Entity == Class, Component == Object

Where OOP has Classes and Objects, an ES has Entities and Components – but it is very important to note that an Entity is NOT equivalent to a class, and a Component is NOT equivalent to an Object, although superficially they seem to share some characteristics. If you think of Entities/Components in OOP terms, you will never understand the ES, and you will screw up any attempt to program with it.

The whole point of ES’s using a different programming *paradigm* is that this means there CAN NEVER BE a direct equivalent between Classes and anything, or between Objects and anything – if there were, it would actually be the same paradigm, and just be a variant of OOP. It’s not. It’s fundamentally different and incompatible.

What’s an Entity?

An Entity System is so named because Entities are the fundamental conceptual building block of your system, with each entity representing a different concrete in-game object. For every discernible “thing” in your game-world, you have one Entity. Entities have no data and no methods.

In terms of number of instances at runtime, they are like Objects from OOP (if you have 100 identical tanks, you have 100 Entities, not 1).

However, in terms of behaviour, they are like classes (Entities indirectly define all the behaviour of the in-game object – we’ll see how in a second).

So, entities on their own are pretty much entirely useless – they are coarse-grained, and they serve to do little more than tag every gameobject as a separate item. This is where Components come in.

What’s a Component?

Every in-game item has multiple facets, or aspects, that explain what it is and how it interacts with the world. A bicycle, for instance, is:

  • Made of metal
  • Can be used by a human
  • A means of transportation
  • Something that can be bought and sold
  • A popular birthday present
  • Man-made

At this point, please note: OOP does not have this concept of multiple aspects; it hard-codes all the aspects into one glob, and declares that set of behaviours and data values to be the single-aspected Class of an Object. C++’s multiple inheritance and Java’s Interfaces can get you a small part of the way towards the aspects of an ES, but they quickly collapse under the strain in fundamental ways that cannot be fixed within OOP.

The way that an Entity for an item represents the different aspects of the item is to have one Component for each aspect. The Component does one really important thing:

  • Labels the Entity as possessing this particular aspect

It may do several other things, depending upon your implementation and the design of your ES. However, with an ES, this labelling is the most important thing a Component does, because it is the basis of all method execution and processing.

What’s a System? (or subsystem)

An ES goes further than OOP in how it splits up the universe of code and data. In OOP, you make everything into Objects. With an ES, you have two different things that you split code and data into: the first is “Entity + Components”, and the second is “Systems”.

This is the source of a lot of the value of an ES. OOP is very good at implementing the part of any program that has lots of data floating around and lots of methods floating around which need to be executed only on a small, instanced, subset of the data in the program. OOP is very poor at implementing the “global” parts of a program, which have to operate “on everything”, or have to be invoked “from everywhere”. An ES solves this by explicitly dealing with all the global stuff using Systems, which are outside the realm of Entity/Component.

Each System runs continuously (as though each System had its own private thread) and performs global actions on every Entity that possesses a Component of the same aspect as that System.

There is a one-to-one relationship between Systems and AVAILABLE aspects (i.e. types of Component). A System essentially provides the method-implementation for Components of a given aspect, but it does it back-to-front compared to OOP. OOP style would be for each Component to have zero or more methods that some external thing has to invoke at some point. ES style is for each Component to have no methods, and instead for the continuously running System to run its own internal methods against different Components one at a time.

Typical Systems in a game would be: Rendering System, Animation System, Input System, etc.

The Rendering system wakes up every 16 milliseconds, renders every Entity that has the Renderable Component, and then goes back to sleep.

The Animation system constantly looks out for anything that would trigger a new animation, or cause an existing animation to change (e.g. player changes direction mid-step), and updates the data of each affected Entity (of course, only those that actually have the Animatable Component, i.e. are animatable).

The Input system polls the gamepad, mouse, etc, and changes the state of whichever Entities with the Inputable Component are currently marked as being controlled by the player.

In traditional OOP, if you have 100 units on a battlefield, and each is represented by an object, then you theoretically have 100 copies of each method that can be invoked on a unit. In practice, most OOP languages use runtime and compiler optimizations to share those methods behind the scenes and avoid wasting memory – although scripting languages, for instance, often don’t do any sharing, allowing all the methods to be independently changed on each and every instance.

By contrast, in an ES, if you have 100 units on a battlefield, each represented by an Entity, then you have zero copies of each method that can be invoked on a unit – because Entities do not contain methods. Nor do Components contain methods. Instead, you have an external System for each aspect, and that external System contains all the methods that can be invoked on any Entity that possesses the Component that marks it as being compatible with this System.
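
To make that split concrete, here is a minimal sketch in Java-ish code. All the names (Position, Velocity, MovementSystem) are invented for illustration; treat it as one possible shape, not a canonical implementation.

import java.util.Map;

// Components are pure data: no methods at all.
class Position { float x, y, z; }
class Velocity { float dx, dy, dz; }

// A System owns ALL the behaviour for one aspect, and runs over every
// Entity that has the relevant Component(s). The 100 tanks share this
// single copy of the movement code.
class MovementSystem {
    void process(Iterable<Integer> entitiesWithBoth,
                 Map<Integer, Position> positions,
                 Map<Integer, Velocity> velocities,
                 float dt) {
        for (int entityId : entitiesWithBoth) {
            Position p = positions.get(entityId);
            Velocity v = velocities.get(entityId);
            p.x += v.dx * dt;
            p.y += v.dy * dt;
            p.z += v.dz * dt;
        }
    }
}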

That’s not an Entity System!

Well … at least, that’s what *most* people I’ve worked with consider to be an Entity System. But even within a single project, I’ve found multiple different ideas of what an ES is, some of which are compatible, others of which are incompatible. Here are some of the other ideas I’ve come across that people thought were ES’s.

Alternative definitions of entity systems:

1. An ES provides a different model of class-inheritance to OOP, especially w.r.t inheritance of fields. For any given entity-class you declare which aspects that class possesses, and the actual class is looked-up at runtime by piecing together all the different aspects.

2. The same as 1., but taken literally for methods as well as fields, i.e. this provides an interesting way to mix-n-match inheritance of functions/methods. So long as methods are attached to the fields upon which they operate, and/or a “pre-requisite” system is implemented such that “having aspect X automatically causes you to also have aspect Y”, then this works quite smoothly. NB: this is implementing “modular objects”: the “aspects” are complete objects from the standard theory of OOP.

3. A variant on 1. or 2. where the aspect data is pre-compiled into a large number of OOP classes at compile time, so that at runtime you are just running “normal” classes, without any lookup cost of the dynamic “I’ve got an entity E which claims to have aspect X; just hold on a sec while I build a function table to find out what the heck to do now someone just tried to invoke function Z…”

4. A way of compiling standard OOP classes into runtime-data (note: disassociating them from their OOP-ness) specifically so that you can treat them as streamed data instead of chunked binary + data. The main value of this is to play nicely with hardware that has tiny tiny amounts of memory available in the code cache and/or very very slow access to fetch extra code, but has vast bandwidth to stream SEQUENTIAL data through the CPU. PlayStation consoles especially like this approach, as opposed to the typical PC approaches. Note that in this case, the idea is that the game-programmers and game-designers are UNAWARE that there’s an entity system – it’s a runtime/compile time performance optimization, not a design-feature.

5. A system design that revolves around excessive use of the Observer pattern, but does so using standard OOP techniques. I mention this because it’s a common (ab)use of the concepts of ES’s, but lacks much of the advantages. For what it’s worth … that’s the situation I was in when I first discovered ES’s. So, personally, I really don’t count these as ES’s. These are not an ES, but an: “OMG, I need an ES!”, but many people persist in thinking (and claiming) they’re a true ES.

NB: IMHO this is one of those “you really shouldn’t do this” classes of Entity System usage. Why? Because you end up just creating MASSIVE OOP classes with tonnes of independent implemented Observer methods tacked onto them. That’s fine as a general approach, it buys you something whilst remaining ENTIRELY OOP, but it also loses most of what ES’s have to offer, especially in the realm of flexibility and compound inheritance.

Who cares about ES’s and why? (where did they invent these from?)

1. Coders who are used to writing a graphics/rendering engine and trying to keep it isolated / separate from the rest of the computer game. They learnt that they can write the entire graphics engine using only PARTIAL information (a couple of aspects – no more than 2 or 3) of every OOP object that exists in the game-universe. They need to use: Renderable (do I have to paint it to the screen every frame? [it will have all the methods necessary to do that painting generically]), Updateable (does it potentially change in response to one or more game-loop-ticks? [animations are a great example of this – their internal state changes independently of how many times they’re rendered]), and little else.

2. Multi-threaded coders who are looking at the concept of aspects per se (i.e. Renderable as opposed to Updateable) to allow for coarse-grained multi-threading: you are allowed one thread per aspect, and this is guaranteed safe simply because the aspects are – by definition! – independent from one another. Since a typical ES will have a dozen or more aspects, you’ve just bought yourself reasonably effective performance scaling of up to 12 separate CPU cores, for free.

Also … going to the next level of difficulty and performance improvement, they’re looking for the largest possible indivisible units (literally: things that “cannot be divided”) that can be made the atomic level of multi-threading. The default position is that only manually-hard-coded atomic operations count, and OOP objects are certainly full of all sorts of independently-changing pieces so are far too coarse, but maybe Components within Entities are small enough to be the unit of atomicity.

3. Game designers who want to enable “everything to become anything”.

Do Cameras shoot people?

Do Bullets accept input from the player?

No?

Well … have you played Unreal Tournament? If you have but didn’t answer yes to the earlier questions then you need to find someone to show you a Redeemer being used in Alternate Fire mode (hint: you get to manually “fly by wire” the rocket from a first-person view inside the warhead itself).

ES’s allow everything and anything to be used as anything and everything, interchangeably, with no re-coding. That’s the beauty for designers – all the artificial constraints on design that were previously enforced by coders having to, you know, place SOME constraints just to write their own damn code and have it be reasonably performant and bug-free are suddenly lifted.

It’s true – you really CAN have strongly-typed safe code and yet have total freedom to pass any object to any method; at least, you can with Entity Systems.

4. Coders who go a bit overboard with the Observer pattern.

If you’re unsure about this one, and have used Observers a lot, ask yourself this: have you ever had that problem with Observers where you found that “just one base class isn’t enough”? i.e. one base class that implemented a handful of different Observers worked great, but as your system / game / application expanded you found you needed more variety in that base class, you needed multiple different versions, and you needed *every possible combination* of those different versions, because base-classes cannot/will not “stack” nicely on top of each other.

Thought for the day

Programming *well* with Entity Systems is very close to programming with a Relational Database. It would not be unreasonable to call ES’s a form of “Relation Oriented Programming”.

This needs a followup post, but you might want to think about that 🙂 both in terms of how the above statement could be true (hint: think about what most of the code you write for each System is going to look like, and what exactly it is doing), and also in terms of what the knock-on effects of this are if it is true.

Bear in mind that no-one’s yet managed to do OOP fast with Relations; the excessively high conversion cost between RDBMS and OOP – both at coding-time and at runtime – is still one of the biggest single development problems with writing an MMO.

Entity Systems are the future of MMOG development – part 3

Also known as: Nobody expects the Spanish Inquisition!

(because I’m now deviating from the original schedule I outlined in Part 1; what the heck, it was only a rough agenda anyway…)

Questions, questions…

First of all, there’s a bunch of good questions that have been raised in response to the first two posts:

  • what data and methods are stored in the OOP implementation of an entity?
  • where does the data “live”?
  • how do you do object initialization?
  • what does the ES bring that cannot be accomplished with an AOP framework?
  • what’s the link between entity systems and SQL/Relational Databases? (OK, so that one’s my own question from last time)
  • what, exactly, is an entity?

Let’s start with that last one first.

What, exactly, is an entity?

Obviously, I’ve already covered this from the conceptual perspective in the last post, where I defined them as:

For every discernible thing in your game-world, you have one Entity. Entities have no data and no methods

Great. But a couple of people were wondering what ACTUALLY is it, when you start implementing the thing? Is it a class? Is it an array of data? Is it a list of Components?

The last statement is particularly important: actually, entities have NO data and NO methods – an entity is not a stub for a class, it is not a struct of data, and – ideally – it’s not a list of Components. An entity is really just a name. Last post I also wrote that entities “do little more than to tag every gameobject as a separate item” – and that’s the key thing here, an entity is just a unique label for an in-game object.

I was a bit non-specific, because I didn’t want to say categorically that an entity cannot or should not be a list of components – sure, you CAN implement it that way – because many people do implement entities like that, and who’s to say what’s right or wrong here?

Although…I actually think that *is* the wrong way, and think that by the time you get to the last of these Entity Systems posts, you’ll probably agree with me ;). The problem is, the right way is usually too slow, so you’ll probably end up *to some extent* implementing entities as lists anyway, but if you do so as a performance optimization (e.g. by hiding that detail from the rest of the system) rather than as a logical implementation, it’ll cause fewer problems with your code in the long term.

So, to answer the question directly:

What is an entity?

An Entity is a number (a globally unique number)

Remembering, of course, that a String – or any other kind of unique label – is merely a fancy kind of number, not just in terms of how computers implement them, but in terms of how Mathematicians think of them. People tend to think that strings have extra features, for instance you can encode trees within strings easily (“root.node1.leaf”, for instance – as used in most OOP languages derived from C to navigate the namespace of classes, methods, etc) – although of course you can encode things like that in numbers, too, for instance “123045607” (using 0 to stand for the dot… i.e. “123.456.7”). But … you *really* don’t want to do this!

Gotcha number 1: only OOP programmers want to give entities hierarchical names. It can turn out to be a pretty dumb idea, if you think about it: it’s usually an implicit refusal to let go of those OOP ways we’re so used to thinking in, when those OOP ways are exactly what caused most of the problems we’re trying to fix by using ES’s.

So, don’t be tempted into hierarchical encoding, and definitely don’t do ANY encoding in the entity names – there’s a much BETTER place to do metadata for entities (trust me).

And I’ll be calling the implementation of an entity a GUID from now on (“Globally Unique IDentifier”).
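
For illustration only (the names here are invented), that really can be the whole implementation of an entity:

import java.util.concurrent.atomic.AtomicLong;

// An Entity is just a globally unique number: no data, no methods.
class EntityManager {
    private final AtomicLong nextId = new AtomicLong(1);

    // "Creating" an entity is nothing more than handing out the next GUID.
    long createEntity() {
        return nextId.getAndIncrement();
    }
}

(If you need uniqueness across multiple server processes, swap the simple counter for a UUID; see the note on entity IDs near the end of this series.)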

What data and methods are stored in the OOP implementation of an entity?

The answer for the previous question makes this one easy to answer: none. The entity is merely a label, and whilst you DO want to store and manipulate some metadata around that label (e.g. it’s kind of handy to have a list of all the entity names you’ve currently got in memory, and be able to rename them without breaking everything), you should be thinking of them as nothing more or less than a GUID (String. Number. Whatever!).

Where does the data “live”?

The first time I made an ES, I implemented the individual Entities as classes, and although I didn’t put any methods in those classes (that would just be bog-standard OOP!), I did put the data into them. It didn’t occur to me that I’d need to worry about where the data was living, so I just stuck it somewhere natural – in the Entity.

Of course, I’d made a cardinal error, because whatever feels “natural” to an OOP programmer is usually “completely wrong” in ES programming.

Anyway, the last time I did an ES I was a lot wiser, and so we put the data inside the Components. All of it.

Gotcha 2: ALL the data goes into the Components. ALL of it. Think you can take some “really common” data, e.g. the x/y/z co-ords of the in-game object, and put it into the Entity itself? Nope. Don’t go there. As soon as you start migrating data into the Entity, you’ve lost. BY DEFINITION the only valid place for the data is inside the Component.

How does that work?

Well, you need to start thinking about your Components a bit more carefully here. I previously said that:

A Component labels the Entity as possessing this particular aspect

So, again, a Component is merely another “label”, just like the Entity itself. Right? Wrong. This is my fault – in the last post, I was trying to Keep It Simple for you, and glossed over quite a few things. Because in the very first ES I made I treated the components as mere labels, that description came out most easily as “the simple version” of what they are. In more detail:

A Component is the chunk of stuff that provides a particular aspect to any Entity

…where “provides” means “does all the things that are necessary to make this work”. Which means that all your Components certainly *could* be implemented as standard OOP objects. To define an Entity, you just provide a name (GUID) and a list of Component classes that it needs, and to instantiate that Entity, you just instantiate one object from each class (and then you need to somehow attach those in-memory objects to your GUID, probably by implementing the Entity as an empty class that just contains a GUID and a list-of-component-instances (which I said to avoid, but we’ll come back to that later)).

But, as you may have noticed by now, I’m vigorously against using OOP anywhere inside an ES implementation. This isn’t some strange dislike of OOP itself, it’s just that in my experience with ES development it’s all too easy for experienced OOP programmers to keep trying to sneak some OOP back in where it’s inappropriate, and it’s all too easy to delude yourself into thinking this is a Good Idea. And, worse, from the teams I’ve worked on in the past, it has consistently seemed that the better you are at OOP programming, the harder this is to resist…

You COULD implement your Components as OOP objects. But, really, if you go back to the definition of a “System” from the last post, you’ll see that the methods for acting upon Components should all live inside the Systems, not the Components. In fact, re-reading that last post, I notice that I explicitly described Systems as doing exactly this: “a System essentially provides the method-implementation for Components”.

One tiny exception to this rule: getters and setters are really useful. If you’re implementing your ES within an OOP language, you might allow yourself to implement Components as objects just to get the benefits of get/set and e.g. multiple different versions of each get/set that do basic type conversion, or integrity checking on data, etc. Be warned, though, that if you do you MUST implement them as flyweight objects (go google the Flyweight Pattern if you don’t know it) or else you’ll lose lots of the memory-management and efficiency advantages of a full ES.
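
For what that might look like, here is a sketch with invented names (ComponentStore, GunAccessor): the accessor is a flyweight that owns no data of its own; it is just a typed window onto the shared component-data storage.

interface ComponentStore {
    float getFloat(long entityId, int field);
    void  setFloat(long entityId, int field, float value);
}

// Flyweight accessor: one instance can be re-pointed at any entity,
// so you never allocate per-entity component objects.
class GunAccessor {
    static final int FIRE_RATE = 0;   // named field/column indices
    static final int DAMAGE    = 1;

    private final ComponentStore store;   // shared storage for ALL gun data
    private long entityId;                // which entity we are currently viewing

    GunAccessor(ComponentStore store) { this.store = store; }

    GunAccessor of(long entityId) { this.entityId = entityId; return this; }

    float getFireRate()           { return store.getFloat(entityId, FIRE_RATE); }
    void  setFireRate(float rate) { store.setFloat(entityId, FIRE_RATE, rate); }
    float getDamage()             { return store.getFloat(entityId, DAMAGE); }
    void  setDamage(float dmg)    { store.setFloat(entityId, DAMAGE, dmg); }
}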

Where does the data “live”?

Each Component is merely a classification label (e.g. “Renderable”) and a struct/array/list/whatever of the data relating to its purpose; all data lives in Components; ALL data.

How do you do gameobject initialization?

Here’s a fun one – I haven’t mentioned ANYTHING about object/entity initialization. It was really only from trying to use entity systems that I finally realized just how bizarre OOP is when it comes to initialization: you have some “magic methods” (constructor, destructor, finalizer, etc) that have nothing to do with the main description of OOP, but instead are part of a meta-programming layer that allows you to get your OOP system up and running, and to keep it running, and then to tear it all down again at the end. Until then, I’d not noticed how out-of-place those methods are…

Hats-off to the OOP folks: they integrated their meta-programming so neatly into OOP that in most languages it’s unnoticeable. But I’ve not yet seen the same trick done for an ES – so, get used to the fact that initialization is going to be a bit … “different” …

There’s really a bunch of issues here:

  • what’s the technical term for an entity’s archetype?
  • how do you define the archetypes for your entities?
  • how do you instantiate multiple new entities from a single archetype?
  • how do you STORE in-memory entities so that they can be re-instantiated later on?

What’s the technical term for an entity’s archetype?

There doesn’t seem to be one. Really. There’s a couple floating around, influenced by people’s own backgrounds but I’ve heard quite a few used interchangeably.

My favourites are:

  • template
  • model
  • assemblage

(I actually picked the last one from a thesaurus – looking for equivalents of “class” ;)).

The problem with both “template” and “model” is that they’re both very frequently used generic terms in programming. Entity is pretty good as a term, because it’s hardly used for anything else (and the main obvious competitor is now widely called a “gameobject” by game programmers). I’ve used both, in the past. Interchangeably. (cackle!)

So. “Assemblage”, anyone?

How do you define the archetypes for your entities?
How do you instantiate multiple new entities from a single archetype?
How do you STORE in-memory entities so that they can be re-instantiated later on?

Big topics. Really big. Too big to go into here (ah. Now I know what the next post is going to be about…).

What does the ES bring that cannot be accomplished with an AOP framework?

The simple, facetious, answer is: almost everything.

AOP (Aspect-Oriented Programming) doesn’t really help much to solve the problems that an Entity System solves (although it does solve similar ones that ALSO tend to be present when the ES-related ones are present), and doesn’t provide the new features that you get from an ES (e.g. the improved memory/data-performance).

An ES takes a system where there are many things going on that are all independent, which OOP tangles up together, and pulls them apart and lets them live on their own. It also highly efficiently manages simple interoperation between those disparate aspects of the program, and allows them to be disconnected, reconnected, at will.

AOP takes a system where there are things going on that are fundamentally DEPENDENT, which OOP tangles up together, and allows them to be reasoned about “separately, but together” – you can view the code for logging independent of the code which is being logged, but you still have to reason about them at the same time; you cannot merely ignore one or the other. By definition, different aspects (vested as Components) of an Entity are INdependent, and this whole “together” thing is an irrelevance.

However … AOP certainly helps if you have implemented with OOP something you ideally should have done with an ES, and you want to add new features to it, because the tangled OOP system is going to be a pain to modify without breaking stuff accidentally, and AOP will help you more precisely focus the impact of your changes. But it’s a poor replacement for just implementing the ES you wanted in the first place.

There is also a fairly large area of crossover between “some things you can do with entity systems” and “some things you can do with aspect oriented programming”, mainly because they are both fundamentally data-driven at the function-despatch level (which I’ll come back to later). However, they each use this data-drivenness in very different ways – AOP uses it to react to, and wrap itself around, the code of the program, whereas an ES uses it to react to the data of the program.

Using both together could be really exciting – but I’ve not been brave enough to try that yet myself.

What’s the link between entity systems and SQL/Relational Databases?

You might have guessed this by now – it’s been the theme of most of the answers above, starting from the very beginning of this post.

Mathematically-speaking, an Entity is a database Key, just as you’d see in any RDBMS.

Likewise, from a purely abstracted point of view, the “set of component-instances that comprise Entity #7742” is, literally, a database Query.

THIS is why I said at the start that, ideally, you do NOT want to store your component-instances as a list inside a struct/class representation of each Entity. Fundamentally, that set is NOT defined as a list (a list is a static thing, its members don’t change without you changing them), it’s defined as a query (whose members are always changing, whether you want them to or not), and using one where really you wanted the other tends to cause problems in the long term.

Obviously, you can use Lists as a simple form of query (that query being “what things are currently in this list”), but that’s a rather poor and inflexible kind of query. Life is much more interesting if you embrace the dynamicity of the query, and fetch the components (and, hence, the VALUES of the data…) by running a query every time you want one or more of them.
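
To make the list-versus-query distinction concrete, here is a sketch (with assumed names) where the set of components for an entity is recomputed by a query every time it is asked for, rather than stored as a list inside the entity:

import java.util.HashSet;
import java.util.List;
import java.util.Set;

class ComponentQueries {
    // Roughly: "SELECT component_id FROM entity_components WHERE entity_id = ?"
    // entityComponents holds rows of { entity_id, component_id }.
    static Set<Long> componentsOf(long entityId, List<long[]> entityComponents) {
        Set<Long> result = new HashSet<>();
        for (long[] row : entityComponents) {
            if (row[0] == entityId) result.add(row[1]);
        }
        return result;   // freshly computed on every call, never a stale cached list
    }
}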

This is a fairly star-gazing view of how to make games: the game is just one giant database, running dynamic queries all the time. Current performance, even of fast RDBMS’s, is just way too slow to be running thousands of queries per frame on a home PC.

However, these articles are about ES’s for massively multiplayer online game development, not just standard game development, and that changes two things in particular:

1. Most of the game is NOT running on home PC’s, it’s running on arbitrarily beefy server-farms that are already running bajillions of QL queries (nearly always SQL these days, whether on MS SQL Server, Oracle, or MySQL).

2. The data is much much more important than rendering speed; we have almost no problem making incredibly beautiful 3D rendered landscapes, but managing the data to make those landscapes act and react as if they were real (i.e. world-logic, game-logic, etc) is something we still find rather hard.

A new thought for the day…

Think what you can do with an ES if you deploy it on an MMO server farm, and go the full monty by making all your entity/component aggregations stored merely as SQL queries. Here’s some to start you off…

1. No more pain from “Relational vs OOP”

One of the slowest parts of MMO server performance is translating OOP objects into RDBMS tables so that the database can save them, and then translating them back again when the game needs to read them. This also wastes large amounts of programmer time on boring, annoying, tricky to maintain code to handle the serialization processes.

2. Massive increase in parallelization for free

If you make all processing data-driven, and you have no data-structures that have to remain synchronized, then it becomes a lot easier to “just add more CPU’s” to increase runtime performance…

3. Where else could you use SELECT instead of data structures?

When each System has to do its processing, it’s probably going to iterate over “all Entities that have my Component”. Is that something you can store easily in a standard data-structure? Or is it better off as a QL query? And what else can you achieve if you start doing your processing this way? (hint: there are some nice benefits for the low-level network programming here…)

Entity Systems are the Future of MMOs Part 4

Massively Multiplayer Entity Systems: Introduction

So, what’s the connection between ES and MMO, that I’ve so tantalisingly been dangling in the title of the last three posts? (start here if you haven’t read them yet).

The short answer is: it’s all about data. And data is a lot harder than most people think, whenever you have to deal with an entire system (like an MMO) instead of just one consumer of a system (like the game-client, which is the only part of an MMO that traditional games have).

Obviously, we need to look at it in a lot more detail than that. First, some background…

Massively Multiplayer Game Development 101

There’s a few key things you need to be aware of in MMO development. Many professional game developers know some or all of these already, but seeing as I’ve even met MMO developers who didn’t know them all, and much of what I’m going to say later on won’t make sense without it, I’m going to do a quick recap. Bear with me; I’ll have to do some gross generalization to keep this short (so it won’t be 100% true or accurate).

1. The vast majority of the cost of MMO development is content

Content is:
– quests (missions, storylines, scripted events)
– 3D areas (meshes, textures and logic for: zones, dungeons, towns, landscapes)
– loot (item graphics, drop rates, item stats, item abilities)

Content is NOT:
– Fancy 3D graphics
– Physics engines
– AI
– Core game rules

…which tend to make up the bulk of non-MMO games.

As Raph Koster (amongst others, but IIRC Raph was one of the first to come up with a clear concise description) pointed out close to 10 years ago, the rate at which players consume content vastly exceeds the rate at which developers generate it – you have perhaps 50 people working purely on content generation on a modern MMO, and you have perhaps 500,000 people consuming it. The problem is, those 500,000 use Thottbot, so they *share* their discoveries, and consume at a rate proportional to the square of their number, whereas content is generally produced at a rate more directly proportional to the number of developers.

There are many ways to work around this issue, but most of them end up producing or managing vast amounts of content (they just get more cunning in exactly how that content is generated).

2. The vast majority of the development of an MMO takes place AFTER launch

The ten-year-old MMO’s have been releasing expansions and content updates every 6 to 18 months; modern MMO’s release new content every 3 to 12 months. “New content” in this context generally means entirely new 3D areas with entirely new quests and entirely new items, not to mention entirely new special abilities and new plotlines: i.e. complete “miniature MMOs”.

The only bits you don’t need to keep rewriting are the technology, although nearly all MMOs have done minor updates throughout their lifetime, usually when a new expansion pack is released as a retail boxed product (if you buy and install the expansion, not only do you get the bonus content, but all the “old” content becomes higher resolution / more pretty / added sparkly bits). Although … I long wished there was more updating of tech, and three major MMOs are currently doing huge updates: Ultima Online (1997) recently massively renovated their 10-year-old graphics as Kingdom Reborn, Eve Online (2001) just completed their “shininess and detailed spaceships with added humanoids” update known as Trinity, and Anarchy Online (2001) recently previewed massive sweeping changes to their client, which had very poor detail and draw distance (they say “the current engine was designed before there really were GPUs to utilize”, but that is blatantly false, as the GeForce had been on sale for almost 2 years when they launched, and the TNT, Voodoo, et al for years before that. Yes, I was bitter that when it launched the most recent hardware it supported was 5 years old :)). All represent major changes to the look and feel of the games – it’s hard to overestimate the visual impact of these. Runescape did their mega update (“Runescape 2”) a couple of years ago.

However … all are bringing games that were NOT cutting edge at launch up to a standard that is still well short of what is cutting edge now. This is important to understand: graphical quality is (according to what the successful MMO’s deploy) definitely not the main selling point for these games, unlike almost all other successful games you can possibly buy.

So … what is that “development” that takes place after launch? It’s content, not engine. And even when it’s engine (better graphics, better physics) it’s always a minimal improvement by the standards of standalone PC and console titles of the time.

For reference, many MMO’s after launch end up with LARGER development teams than they had in the 3-5 years of development leading up to the launch. All that extra content requires a lot of people (not to mention keeping all the old content working, updating it where it conflicts with new content, etc).

On top of all that, though, there is the issue of customer support. When you launch a traditional game, that’s it. Done. Over the last 15 years there’s been a trend towards launching occasional “patches” and even minor content updates – but mostly bugfixes – which, incidentally, I suspect has been largely driven by publishers learning from the MMOs: releasing a content patch for an old game is a good way to get extra marketing / publicity and increase some sales. Bethesda’s TES4/Oblivion even went as far as to charge for the patches.

MMO’s require a lot more work than that: since MMO’s are all monetized off continuous play (i.e. the more each player plays, the more money the publisher makes), either through monthly subscriptions or through in-game commerce, it is imperative to keep all the players playing as long as possible. If a player plays your MMO for 12 months instead of 6, you’ll make twice as much profit; if they play your console title for 12 months instead of 6, you’ll make no more money and will probably cannibalize your sales of the sequel.

So. Total lifetime development cost is a lot more important, financially, to an MMO than the pre-launch development cost.

3. The way the core logic works now is not how it will work forever

Sooner or later, you nerf.

Usually because you(r players) discover a disproportionately powerful game tactic in PvP that you didn’t think of which is unbalancing and ruining the game, and so you “have” to fix it.

Sooner or later, you choose to break every absolute rule of your game logic, because a new expansion wants to slightly redefine and “freshen” the core game.

And that means you have to be able to break every rule you ever hardcoded into the game-logic. Worse, it means you have to be able to handle the fallout (all the bugs that the change introduces, plus all the invisible bugs that were originally there but had no noticeable effect and now suddenly have visible effect, the new security holes, etc).

Since the ongoing development of the game (post launch) is greater than the initial development in scope and cost, over the lifetime of the game it is more important to make it easy to change the core logic – and to react to side effects of that change – than to be able to make the core logic easily in the first place.

4. Even a moderately successful MMO generates gigabytes of data every 24 hours

The progress of every player of the MMO has to be uniquely independently tracked, so that the game can remember for each player which quests they’ve completed, which equipment they have, etc. With 100,000 players, that data racks up quickly. A lot of it can be thrown away every day (for instance, you don’t generally need the history of how many hitpoints you had at the end of every day), but a lot of it has to be kept forever (whether you completed a given quest, and for some quests how exactly you completed it).

However, the trend at the moment is towards keeping ALL this data, in the form of historical records of your personal game experience – e.g. the automatic “blog” that is created whilst you play Vanguard, which e.g. updates every time you level up, or kill a monster, or die, etc.

This latter kind of data alone is generated at the rate of around 30Gb every 24 hours.

And any of that data that gets used in-game (e.g. even if just referenced in a conversation with an NPC) has to be programmatically accessible.

Ultimately, MMOs have to store, retrieve, and update vast databases. There is a huge decades-old industry that specializes in providing software to manage databases of this size and larger, but MMO developers don’t like SQL and Relational programming. This is a real problem – relational databases can handle the load, but relational programming is fundamentally incompatible with Object-Oriented programming. The net effect is that programmers who have to write their data to disk find it boring, difficult, and irritating to do – and that leads to mistakes and greatly increased bugs, as well as reduced functionality (because no-one wants to write or add to the code to add the new features more than they absolutely have to).

MMO Development Priorities

The net effect of these aspects of MMO development is that the long term profitability of an MMO can be greatly affected by a bunch of simple issues:

  • How much effort is required to change the data used to describe an in-game item
  • How much effort is required to add (or remove) new (old) types of in-game item
  • How re-usable is the content
  • How easy it is in practice (i.e. how much coding is needed) to re-use re-usable content
  • How much specialist knowledge of the game code, overall game design, and details of the game systems is needed to modify the content and logic (can new team members be as effective as the team who originally wrote the game pre-launch?)
  • How exportable is the data generated by playing the game (progress, achievements, player-history, etc)
  • How analysable is the game-data generated by playing (to make decisions about what to change, and to check that game improvements have improved things)
  • How modifiable the core-content is
  • How easy it is to change individual rules and check the side-effects of the changes
  • How many of all the above changes can be done WITHOUT requiring a programmer (can designers change code themselves? can artists?)
  • How much can third-party specialist systems be utilised for server-side service provision

Leading to a few rules of thumb that are often adopted by commercial MMOs:

  1. Use commercial databases for all persistent data-storage
  2. Use Relational Databases and allow arbitrary general data requests
  3. Implement as much as possible of the game-logic as raw data rather than as compiled data (and preferably as data rather than source code)
  4. Use standard but simple to learn scripting languages where data isn’t expressive enough
  5. Do lots and lots of testing during development
  6. Do even more testing post-launch EVERY SINGLE TIME you consider changing ANYTHING on the live servers

Some of those aren’t very effective (too vague), others are horrifically expensive (it takes a lot of people to “test EVERYTHING” every time you make a change to a live game). None of them are great for performance in and of themselves (although they can be made highly performant, that requires extra work), and none of them are “programmer friendly” – in fact, they’re all deliberately programmer UNfriendly, being in there for the benefit of non-programmers.

NB: this is in no way a comprehensive or exhaustive list; I’m trying to give merely a flavour of the thing here for people who aren’t familiar with typical MMO dev processes. I don’t know any real MMO that faces only those challenges and uses only those rules of thumb, but they are indicative of what is used, and very very roughly explain why those things are used.

And so on to Entity Systems, and how they can affect the typical MMO development process by helping with problems or making the current solutions more programmer-friendly… (which is going to be Part 5. This is getting long enough already).

Entity Systems are the Future of MMOs Part 5

(Start by reading Entity Systems are the Future of MMOs Part 1)

It’s been a long time since my last post on this topic. Last year, I stopped working for a big MMO publisher, and since then I’ve been having fun doing MMO Consultancy (helping other teams write their games), and iPhone development (learning how to design and write great iPhone apps).

Previously, I posed some questions and said I’d answer them later:

  • how do you define the archetypes for your entities?
  • how do you instantiate multiple new entities from a single archetype?
  • how do you STORE in-memory entities so that they can be re-instantiated later on?

Let’s answer those first.

A quick warning…

I’m going to write this post using Relational terminology. This is deliberate, for several reasons:

  • It’s the most-correct practical way of describing runtime Entities
  • It’s fairly trivial to see how to implement this using static and dynamic arrays – the reverse is not so obvious
  • If you’re working on MMO’s, you should be using SQL for your persistence / back-end – which means you should already be thinking in Relations.

…and a quick introduction to Relational

If you know literally nothing about Relational data, RDBMS’s, and/or SQL (which is true of most game programmers, sadly), then here’s the idiot’s guide:

  1. Everything is stored either in arrays, or in 2-dimensional arrays (“arrays-of-arrays”)
  2. The index into the array is explicitly given a name, some text ending in “_id”; but it’s still just an array-index: an increasing list of integers starting with 0, 1, 2, 3 … etc
  3. Since you can’t have Dictionaries / HashMaps, you have to use 3 arrays-of-arrays to simulate one Dictionary. This is very very typical, and so obvious you should be able to understand it easily when you see it below. I only do it twice in this whole blog post.
  4. Where I say “table, with N columns”, I mean “a variable-length array, with each element containing another array: a fixed-size array of N items”
  5. Where I say “row”, I mean “one of the fixed-size arrays of N items”
  6. Rather than index the fixed-size arrays by integer from 0…N, we give a unique name (“column name”) to each index. It makes writing code much much clearer. Since the arrays are fixed-size, and we know all these column names before we write the program, this is no problem.
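
If it helps, here is a literal Java rendering of points 4–6 above (names invented): a “table” is just a growable list of fixed-size rows, and “column names” are just named indices into each row.

import java.util.ArrayList;
import java.util.List;

class ExampleTable {
    // Column names: named indices into each fixed-size row.
    static final int ENTITY_ID         = 0;
    static final int COMPONENT_ID      = 1;
    static final int COMPONENT_DATA_ID = 2;
    static final int NUM_COLUMNS       = 3;

    // The table itself: a variable-length array of fixed-size arrays.
    final List<int[]> rows = new ArrayList<>();

    void insertRow(int entityId, int componentId, int componentDataId) {
        int[] row = new int[NUM_COLUMNS];
        row[ENTITY_ID]         = entityId;
        row[COMPONENT_ID]      = componentId;
        row[COMPONENT_DATA_ID] = componentDataId;
        rows.add(row);
    }
}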

Beyond that … well, go google “SQL Tutorial” – most of them are just 1 page long, and take no more than 5 minutes to read through.

How do you store all your data? Part 2: Runtime Entities + Components (“Objects”)

We’re doing part 2 first, because it’s the bit most of us think of first. When I go onto part 1 later, you’ll see why it’s “theoretically” the first part (and I called it “1”), even though when you write your game, you’ll probably write it second.

Table 3: all components

(yes, I’m starting at 3. You’ll see why later ;))

Table 3: components
Columns: component_id | official name | human-readable description | table-name

There are N additional tables, one for each row in the Components table. Each row has a unique value of “table-name”, telling you which table to look at for this component. This is optional: you could instead use an algorithmic name based on some criteria like the official_name, or the component_id – but if you ever change the name of a component, or delete one and re-use the id, you’ll get problems.

Table 4: all entities / entity names
Table 4: entities
Columns: entity_id | human-readable label (FOR DEBUGGING ONLY)

(really, you should only have one column in this table – but the second column is really useful when debugging your own ES implementation itself!)

…which combines with:…

Table 5: entity/component mapping
Table 5: entity_components
Columns: entity_id | component_id | component_data_id

…to tell you which components are in which entity.

Technically, you could decide not to bother with Table 4; just look up the “unique values of entity_id from Table 5” whenever you want to deal with Table 4. But there are performance advantages to having it – and you get to avoid some multi-threading issues (e.g. when creating a new entity, just create a blank entity in the entity table first, and that fast atomic action “reserves” an entity_id; without Table 4, you have to create ALL the components inside a synchronized block of code, which is not good practice for MT code).

Tables 6, 7, 8 … N+5: data for each component for each Entity
Table N+5: component_data_table_N
Columns: component_data_id | [1..M columns, one for each piece of data in your component]

These N tables store all the live, runtime, data for all the entity/component pairs.
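
To make the layout above concrete, here is one way (invented names, plain Java collections) these tables might look in memory; the SQL versions have the same shape:

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class EntityStore {
    // Table 3: components -- component_id -> { official name, description, data-table name }
    final Map<Integer, String[]> components = new HashMap<>();

    // Table 4: entities -- entity_id -> human-readable debug label
    final Map<Integer, String> entities = new HashMap<>();

    // Table 5: entity_components -- rows of { entity_id, component_id, component_data_id }
    final List<int[]> entityComponents = new ArrayList<>();

    // Tables 6..N+5: one data table per component type
    // component_id -> ( component_data_id -> fixed-size row of data )
    final Map<Integer, Map<Integer, float[]>> componentData = new HashMap<>();

    private int nextEntityId = 0;

    // Creating an entity is just atomically reserving a new row in Table 4,
    // which is why Table 4 earns its keep for multi-threaded code.
    synchronized int createNewEntity(String debugLabel) {
        int id = nextEntityId++;
        entities.put(id, debugLabel);
        return id;
    }
}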

How do you store all your data? Part 1: Assemblages (“Classes”)

So … you want to instantiate 10 new tanks into your game.

How?

Well, you could write code that says:


int newTank()
{
    // Create a new, empty entity (i.e. just reserve a new entity_id)
    int new_id = createNewEntity();

    // Attach components to the entity; they will have DEFAULT values
    createComponentAndAddTo( TRACKED_COMPONENT, new_id );
    createComponentAndAddTo( RENDERABLE_COMPONENT, new_id );
    createComponentAndAddTo( PHYSICS_COMPONENT, new_id );
    createComponentAndAddTo( GUN_COMPONENT, new_id );

    // Setup code that EDITS the data in each component, e.g.:
    float[] gunData = getComponentDataForEntity( GUN_COMPONENT, new_id );
    gunData[ GUN_SIZE ] = 500;
    gunData[ GUN_DAMAGE ] = 10000;
    gunData[ GUN_FIRE_RATE ] = 0.001f;
    setComponentDataForEntity( GUN_COMPONENT, new_id, gunData );

    return new_id;
}

…and this is absolutely fine, so long as you remember ONE important thing: the above code is NOT inside a method “because you wanted it in an OOP class”. It’s inside a method “because you didn’t want to type it out every time you have a place in your code where you instantiate tanks”.

i.e. IT IS NOT OOP CODE! (the use of “methods” or “functions” is an idea that predates OOP by decades – it is coincidence that OOP *also* uses methods).

Or, in other words, if you do the above:

NEVER put the above code into a Class on its own; especially NEVER NEVER split the above code into multiple methods, and use OOP inheritance to nest the calls to “createComponentAndAddTo” etc.

But … it means that when you decide to split one Component into 2 Components, you’ll have to go through the source code for EVERY kind of game-object in your game, and change the source, then re-compile.

A neater way to handle this is to extend the ES to not only define “the components in each entity” but also “templates for creating new entities of a given human-readable type”. I previously referred to these templates as “assemblages” to avoid using the confusing term “template” which means many things already in OOP programming…

An assemblage needs:

Table 1: all Assemblages
Table 1: assemblages
Columns: assemblage_id | default human-readable label (if you’re using that label column in Table 4 above) | official name | human-readable description
Table 2: assemblage/component mapping
Table 2: assemblage_components
Columns: assemblage_id | component_id

This table is a cut-down version of Table 5 (entity/component mapping). This table provides the “template” for instantiating a new Entity: you pick an assemblage_id, find out all the component_id’s that exist for it, and then create a new Entity and instantiate one of each of those components and add it to the entity.
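
For illustration, instantiating from an assemblage then looks like the hand-written newTank() above, except that the list of components comes from data (Table 2) instead of from code. The helper getComponentIdsForAssemblage() is an assumption here, as are the helpers reused from newTank():

int createEntityFromAssemblage( int assemblage_id )
{
    int new_id = createNewEntity();

    // Table 2 (assemblage_components) tells us which component types to attach
    for ( int component_id : getComponentIdsForAssemblage( assemblage_id ) )
    {
        // Each component starts with its DEFAULT values, exactly as in newTank()
        createComponentAndAddTo( component_id, new_id );
    }

    return new_id;
}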

Table 3: all components
Table 3: components
Columns: component_id | official name | human-readable description | table-name

This is the same table from earlier (hence the silly numbering, just to make sure you noticed ;)) – it MUST be the same data, for obvious reasons.

Things to note

DataForEntity( entity-id ) – fast lookup

If you know the entity-id, you may only need one table lookup to get the data for an entire component (Table 5 is highly cacheable – it’s small, doesn’t change, and has fixed-size rows).

Splitting Table 5 for performance or parallelization

When your SQL DB is too slow and you want to split to multiple DB servers, OR you’re not using SQL (doing it all in RAM) and want to fit inside your CPU cache, then you’ll split table 5 usually into N sub-tables, where N = number of unique component_id’s.

Why?

Because you run one System at a time, and each System needs all the components with the same component_id – but none of the components without that id.

Isolation

The entire data for any given system is fully isolated into its own table. It’s easy to print to screen (for debugging), serialize (for saving / bug reports), or parallelize (different components on different physical DB servers).

Metadata for editing your Assemblages and Entities

a.k.a. “Programmer/Designers: take note…”

It can be tempting to add extra columns to the Entity and Assemblage tables. Really, you shouldn’t be doing this. If you feel tempted to do that, add the extra data as more COMPONENTS – even if the data has NOTHING to do with your game (e.g. “name_of_designer_who_wrote_this_assemblage”).

Here’s a great feature of Entity Systems: it is (literally) trivial for the game to “remove” un-needed information at startup. If, for instance, you have vast amounts of metadata on each entity (e.g. “name of author”, “time of creation”, “what I had for lunch on the day when I wrote this entity”) – then it can all be included and AUTOMATICALLY be stripped-out at runtime. It can even be included in the application, but “not loaded” at startup – so you get the benefits of keeping all the debug data hanging around, with no performance overhead.

You can’t do that with OOP: you can get some *similar* benefits by doing C-Header-File Voodoo, and writing lots of proprietary code … but … so much is dependent upon your header files that unless you really know what you’re doing you probably shouldn’t go there.

Another great example is Subversion / Git / CVS / etc metadata: you can attach to each Entity the full Subversion metadata for that Entity, by creating a “SubversionInformation” System / Component. Then at runtime, if something crashes, load up the SubversionInformation system, and include it in the crash log. Of course, the Components for the SubversionInformation system aren’t actually loaded yet – because the system wasn’t used inside the main game. No problem – now you’ve started the system (in your crash-handler code), it’ll pull in its own data from disk, attach it to whatever entities are in-memory, and it all works beautifully.

Wrapping up…

I wanted to cover other things – like transmitting all this stuff over the network (and maybe cover how to do so both fast and efficiently) – but I realise now that this post is going to be long enough as it is.

Next time, I hope to talk about that (binary serialization / loading), and editors (how do you make it easy to edit / design your own game?).

Entity ID’s: how big, using UUIDs or not, why, etc?

This has come up a few times, and I ended up replying on Twitter:

But that’s a crappy way to find things later, so I made a quick-and-dirty infographic with a few key points:

[Infographic: key points on entity ID size and UUID choices]

I’m working on my own Entity System; want to follow it?

Refactoring Game Entities with Components

Up until fairly recent years, game programmers have consistently used a deep class hierarchy to represent game entities. The tide is beginning to shift from this use of deep hierarchies to a variety of methods that compose a game entity object as an aggregation of components. This article explains what this means, and explores some of the benefits and practical considerations of such an approach. I will describe my personal experience in implementing this system on a large code base, including how to sell the idea to other programmers and management.

GAME ENTITIES

Different games have different requirements as to what is needed in a game entity, but in most games the concept of a game entity is quite similar. A game entity is some object that exists in the game world, usually the object is visible to the player, and usually it can move around.

Some example entities:

  • Missile
  • Car
  • Tank
  • Grenade
  • Gun
  • Hero
  • Pedestrian
  • Alien
  • Jetpack
  • Med-kit
  • Rock

Entities can usually do various things. Here are some of the things you might want the entities to do:

  • Run a script
  • Move
  • React as a rigid body
  • Emit Particles
  • Play located audio
  • Be picked up by the player
  • Be worn by the player
  • Explode
  • React to magnets
  • Be targeted by the player
  • Follow a path
  • Animate

TRADITIONAL DEEP HIERARCHIES


The traditional way of representing a set of game entities like this is to perform an object-oriented decomposition of the set of entities we want to represent. This usually starts out with good intentions, but is frequently modified as the game development progresses – particularly if a game engine is re-used for a different game. We usually end up with something like figure 1, but with a far greater number of nodes in the class hierarchy.

As development progresses, we usually need to add various points of functionality to the entities. The objects must either encapsulate the functionality themselves, or be derived from an object that includes that functionality. Often, the functionality is added to the class hierarchy at some level near the root, such as the CEntity class. This has the benefit of the functionality being available to all derived classes, but has the downside of the associated overhead also being carried by those classes.

Even fairly simple objects such as rocks or grenades can end up with a large amount of additional functionality (and associated member variables, and possibly unnecessary execution of member functions). Often, the traditional game object hierarchy ends up creating the type of object known as “the blob”. The blob is a classic “anti-pattern” which manifests as a huge single class (or a specific branch of a class hierarchy) with a large amount of complex interwoven functionality.

While the blob anti-pattern often shows up near the root of the object hierarchy, it will also show up in leaf nodes. The most likely candidate for this is the class representing the player character. Since the game is usually programmed around a single character, then the object representing that character often has a very large amount of functionality. Frequently this is implemented as a large number of member functions in a class such as CPlayer.

The result of implementing functionality near the root of the hierarchy is an overburdening of the leaf objects with unneeded functionality. However, the opposite method of implementing the functionality in the leaf nodes can also have unfortunate consequences. Functionality now becomes compartmentalized, so that only the objects specifically programmed for that particular functionality can use it. Programmers often duplicate code to mirror functionality already implemented in a different object. Eventually messy re-factoring is required, re-structuring the class hierarchy to move and combine functionality.

Take for example the functionality of having an object react under physics as a rigid body. Not every object needs to be able to do this. As you can see in figure 1, we just have the CRock and the CGrenade classes derived from CRigid. What happens when we want to apply this functionality to the vehicles? You have to move the CRigid class further up the hierarchy, making it more and more like the root-heavy blob pattern we saw before, with all the functionality bunched in a narrow chain of classes from which most other entity classes are derived.
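
In code, the situation the article describes looks roughly like this (CEntity, CRigid, CRock and CGrenade are named in the text; the vehicle branch is added here purely for illustration):

// Rigid-body behaviour lives in CRigid, so only classes derived from it can
// react as rigid bodies. Giving the vehicles the same behaviour means pushing
// CRigid further up the hierarchy, towards the root-heavy blob.
class CEntity                    { /* transform, lifetime, ... */ };
class CVehicle : public CEntity  { /* wheels, engine ... wants physics too */ };
class CCar     : public CVehicle {};
class CTank    : public CVehicle {};
class CRigid   : public CEntity  { /* rigid-body state and update */ };
class CRock    : public CRigid   {};
class CGrenade : public CRigid   { /* fuse, explosion */ };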

AN AGGREGATION OF COMPONENTS

The component approach, which is gaining more acceptance in current game development, is one of separating the functionality into individual components that are mostly independent of one another. The traditional object hierarchy is dispensed with, and an object is now created as an aggregation (a collection) of independent components.

Each object now only has the functionality that it needs. Any distinct new functionality is implemented by adding a component.

A system of forming an object from aggregating components can be implemented in one of three ways, which may be viewed as separate stages in moving from a blob object hierarchy to a composite object.

OBJECT AS ORGANIZED BLOB

A common way of re-factoring a blob object is to break out the functionality of that object into sub-objects, which are then referenced by the first object. Eventually the parent blob object can mostly be replaced by a series of pointers to other objects, and the blob object’s member functions become interface functions for the functions of those sub-objects.

This may actually be a reasonable solution if the amount of functionality in your game objects is reasonably small, or if time is limited. You can implement arbitrary object aggregation simply by allowing some of the sub-objects to be absent (by having a NULL pointer to them). Assuming there are not too many sub-objects, then this still allows you the advantage of having lightweight pseudo-composite objects without having to implement a framework for managing the components of that object.

The downside is that this is still essentially a blob. All the functionality is still encapsulated within one large object. It is unlikely you will fully factor the blob into purely sub-objects, so you will still be left with some significant overhead, which will weigh down your lightweight objects. You still have the overhead of constantly checking all the NULL pointers to see if they need updating.

OBJECT AS COMPONENT CONTAINER

The next stage is to factor out each of the components (the “sub-objects” in the previous example) into objects that share a common base class, so we can store a list of components inside of an object.

This is an intermediate solution, as we still have the root “object” that represents the game entity. However, it may be a reasonable solution, or indeed the only practical solution, if a large part of the code base requires this notion of a game object as a concrete object.

Your game object then becomes an interface object that acts as a bridge between the legacy code in your game, and the new system of composite objects. As time permits, you will eventually remove the notion of game entity as being a monolithic object, and instead address the object more and more directly via its components. Eventually you may be able to transition to a pure aggregation.
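
A minimal sketch of this intermediate stage might look like the following (class and member names are illustrative, not taken from any specific engine):

#include <memory>
#include <vector>

// Common interface shared by every component.
class Component
{
public:
    virtual ~Component() = default;
    virtual void Update( float dt ) = 0;
};

// Intermediate stage: the game entity still exists as a concrete object,
// but all of its behaviour lives in the components it aggregates.
class GameObject
{
    std::vector<std::unique_ptr<Component>> m_Components;

public:
    void AddComponent( std::unique_ptr<Component> component )
    {
        m_Components.push_back( std::move( component ) );
    }

    void Update( float dt )
    {
        for ( auto& component : m_Components )   // note: list order can matter (see later)
            component->Update( dt );
    }
};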

OBJECT AS A PURE AGGREGATION

In this final arrangement, an object is simply the sum of its parts. Figure 2 shows a scheme where each game entity is comprised of a collection of components. There is no “game entity object” as such. Each column in the diagram represents a list of identical components, and each row can be thought of as representing an object. The components themselves can be treated as being independent of the objects they make up.

PRACTICAL EXPERIENCE

I first implemented a system of object composition from components when working at Neversoft, on the Tony Hawk series of games. Our game object system had developed over the course of three successive games until we had a game object hierarchy that resembled the blob anti-pattern I described earlier. It suffered from all the same problems: the objects tended to be heavyweight. Objects had unnecessary data and functionality. Sometimes the unnecessary functionality slowed down the game. Functionality was sometimes duplicated in different branches of the tree.

I had heard about this new-fangled “component based objects” system on the sweng-gamedev mailing list, and decided it sounded like a very good idea. I set to re-organizing the code-base and two years later, it was done.

Why so long? Well, firstly we were churning out Tony Hawk games at the rate of one per year, so there was little time between games to devote to re-factoring. Secondly, I miscalculated the scale of the problem. A three-year old code-base contains a lot of code. A lot of that code became somewhat inflexible over the years. Since the code relied on the game objects being game objects, and very particular game objects at that, it proved to be a lot of work to make everything work as components.

EXPECT RESISTANCE

The first problem I encountered was in trying to explain the system to other programmers. If you are not particularly familiar with the idea of object composition and aggregation, then it can strike you as pointless, needlessly complex, and unnecessary extra work. Programmers who have worked with the traditional system of object hierarchies for many years become very used to working that way. They even become very good at working that way, and manage to work around the problems as they arise.

Selling the idea to management is also difficult. You need to be able to explain in plain words exactly how this is going to help get the game done faster. Something along the lines of:

“Whenever we add new stuff to the game now, it takes a long time to do, and there are lots of bugs. If we do this new component object thing, it will let us add new stuff a lot quicker, and have fewer bugs.”

My approach was to introduce it in a stealth manner. I first discussed the idea with a couple of programmers individually, and eventually convinced them it was a good idea. I then implemented the basic framework for generic components, and implemented one small aspect of game object functionality as a component.

I then presented this to the rest of the programmers. There was some confusion and resistance, but since it was implemented and working there was not much argument.

SLOW PROGRESS

Once the framework was established, the conversion from static hierarchy to object composition happened slowly. It is thankless work, since you spend hours and days re-factoring code into something that seems functionally no different to the code it replaces. In addition, we were doing this while still implementing new features for the next iteration of the game.

At an early point, we hit the problem of re-factoring our largest class, the skater class. Since it contained a vast amount of functionality, it was almost impossible to re-factor a piece at a time. In addition, it could not really be re-factored until the other object systems in the game conformed to the component way of doing things. These in turn could not be cleanly refactored as components unless the skater was also a component.

The solution here was to create a “blob component.” This was a single huge component, which encapsulated much of the functionality of the skater class. A few other blob components were required in other places, and we eventually shoehorned the entire object system into a collection of components. Once this was in place, the blob components could gradually be refactored into more atomic components.

RESULTS

The first results of this re-factoring were barely tangible. But over time the code became cleaner and easier to maintain as functionality was encapsulated in discrete components. Programmers began to create new types of object in less time simply by combining a few components and adding a new one.

We created a system of data-driven object creation, so that entirely new types of object could be created by the designers. This proved invaluable in the speedy creation and configuration of new types of objects.

Eventually the programmers came (at different rates) to embrace the component system, and became very adept at adding new functionality via components. The common interface and the strict encapsulation led to a reduction in bugs, and code that was easier to read, maintain and re-use.

IMPLEMENTATION DETAILS

Giving each component a common interface means deriving from a base class with virtual functions. This introduces some additional overhead. Do not let this turn you against the idea, as the additional overhead is small, compared to the savings due to simplification of objects.

Since each component has a common interface, it is very easy to add additional debug member functions to each component. That made it a relatively simple matter to add an object inspector that could dump the contents of the components of a composite object in a human readable manner. Later this would evolve into a sophisticated remote debugging tool that was always up to date with all possible types of game object. This is something that would have been very tiresome to implement and maintain with the traditional hierarchy.

Ideally, components should not know about each other. However, in a practical world, there are always going to be dependencies between specific components. Performance issues also dictate that components should be able to quickly access other components. Initially we had all component references going through the component manager, however when this started using up over 5% of our CPU time, we allowed the components to store pointers to one another, and call member functions in other components directly.

The order of composition of the components in an object can be important. In our initial system, we stored the components as a list inside a container object. Each component had an update function, which was called as we iterated over the list of components for each object.

Since the object creation was data driven, it could create problems if the list of components is in an unexpected order. If one object updates physics before animation, and the other updates animation before physics, then they might get out of sync with each other. Dependencies such as this need to be identified, and then enforced in code.
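
One simple way to enforce such an ordering (a sketch of my own, not necessarily how the Tony Hawk code did it) is to give each component type an explicit update priority and keep each object's component list sorted by it:

#include <algorithm>
#include <memory>
#include <vector>

// Illustrative priorities: physics must run before animation, and so on.
enum UpdatePriority { PRIORITY_PHYSICS = 0, PRIORITY_ANIMATION = 1, PRIORITY_RENDER = 2 };

class Component
{
public:
    virtual ~Component() = default;
    virtual int  Priority() const = 0;     // where this component belongs in the update order
    virtual void Update( float dt ) = 0;
};

// Sort once (e.g. right after data-driven creation) so every object updates
// its components in the same, dependency-respecting order.
void SortByUpdateOrder( std::vector<std::unique_ptr<Component>>& components )
{
    std::stable_sort( components.begin(), components.end(),
        []( const std::unique_ptr<Component>& a, const std::unique_ptr<Component>& b )
        { return a->Priority() < b->Priority(); } );
}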

CONCLUSIONS

Moving from blob style object hierarchies to composite objects made from a collection of components was one of the best decisions I made. The initial results were disappointing as it took a long time to re-factor existing code. However, the end results were well worth it, with lightweight, flexible, robust and re-usable code.

Game Object Structure: Inheritance vs. Aggregation

By Kyle Wilson
Wednesday, July 03, 2002

Every game engine I’ve encountered has had some notion of a game object, or a scene object, or an entity.  A game object may be an animated creature, or an invisible box that sets off a trigger when entered, or an invisible hardpoint to which a missile attaches.  In general, though, there will be a game object class which exposes an interface, or multiple interfaces, to the systems that handle collision, rendering, triggering, sound, etc.

At Interactive Magic, and later at HeadSpin/Cyan, I worked with inheritance-based game objects.  There was a base game object class, from which were derived other object types.  For example, dynamic objects derived from game objects, and animated objects derived from dynamic objects, and so forth.  When I got to iROCK, I found a similar scheme, but with some differences that I’ll discuss in a minute.

There are several problems with a game object design based on inheritance.  Some problems can be worked around (with varying degrees of difficulty).  Some problems are inherent in an inheritance-based design.

  • The Diamond of Death.  What do you do if trigger objects are derived from game objects, and dynamic objects are derived from game objects, but you want a dynamic trigger object?  In C++, you have to make your trigger objects and dynamic objects inherit from game objects virtually.  Virtual inheritance is poorly understood by most programmers, makes maintenance more difficult, and adds subtle inefficiencies to your code.  (Read [7] for details.  Also see [6], Item 43, for other problems with multiple inheritance.)  Inheritance handles overlapping categories poorly.
  • Pass-Thru Enforcement.  In a game object hierarchy — more than most class hierarchies, I think — it’s useful to have functions perform class specific actions, then pass the function call down to a parent class.  For example, at HeadSpin, our game object base class declared a virtual function called Update.  Update was intended to do everything a class required before being drawn.  In the root game object class, it refreshed the current world space transform for the game object.  In derived classes, Update might change other state settings, then call Update in the parent class.  But C++ doesn’t offer any convenient mechanism for requiring that a virtual function recurse down through base class implementations of itself.  Every time we added a new class to the engine, at least one function pass-thru got left out, and had to be caught in debugging.
    This problem can be ameliorated somewhat if you don’t publicly expose virtual functions, but instead use the Template Method pattern[3], or what Herb Sutter calls the “Nonvirtual Interface idiom”[8].  That is, have a base class consisting only of public non-virtual functions which call private virtual functions.  The non-virtual functions can perform whatever base-class-specific actions they require before calling the private virtuals (a minimal sketch of this idiom follows the list).  This still won’t help, however, with deep class hierarchies where leaf classes need to pass through function calls to intermediate parent classes, not just the root class.
  • Unintended Consequences.  A corollary to the Pass-Thru Enforcement problem is that when a virtual function call recurses down the class hierarchy, it’s easy to lose track of which actions are being performed for which classes.  In [5], Herb Marselas writes that since Ensemble used an inheritance-based game object hierarchy for Age of Empires II, “functionality can be added or changed in a single place in the code to affect many different game units. One such change inadvertently added line-of-sight checking for trees,” which was a performance problem, since an AOE2 level generally contained a large number of trees.
  • Dependencies.  Inheritance is one of the tightest couplings there is in C++.  As such, it affects not just program logic, but also the physical design of a program.  Inheritance always creates compile- and link-time dependencies in your code.  To compile a file using any game object, the parser must also load the header files containing all its ancestors.  The linker must resolve all dependencies to the same.  If any game object needs to know about its descendents, then cyclic dependencies arise, and proper levelization becomes impossible.  (See [4].)
  • Difficulty in Comprehension.  To understand the behavior of any class, you have to open other files and learn the behavior of its ancestors.
  • Interface Bloat.  Monolithic classes tend to develop large interfaces to cover the host of purposes they serve.  By the time I left Cyan, our scene object base class had a class definition alone that was over four hundred lines of flag enums and function declarations for physics, graphics, sound, animation and stream I/O.  And the interface would have been bloated further by null virtual functions used by derived classes if we hadn’t instead resorted sometimes to just checking type-ids and doing static_casts to derived types.
    We partly solved this problem at iROCK by having multiple class hierarchies, instead of just one.  What would normally be one game object becomes three, a game object, a scene object, and a sound object, all in separate modules.  This is an improvement in principle, but in practice, independence of the different modules was never well enforced.  In the end, game objects included scene object headers, sound objects included game object headers, and the code was rife with the cyclic dependencies that John Lakos so deplores.
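
For reference, here is a minimal sketch of the Nonvirtual Interface idiom mentioned in the Pass-Thru Enforcement item above (names are illustrative; this is not code from any of the engines discussed):

// The public Update is non-virtual and always runs the base-class work,
// then calls a private virtual hook that derived classes override.
class GameObject
{
public:
    void Update()                    // clients call this; it cannot be overridden
    {
        RefreshWorldTransform();     // base-class work that must always happen
        DoUpdate();                  // class-specific work supplied by derived classes
    }

private:
    void RefreshWorldTransform() {}
    virtual void DoUpdate() {}       // derived classes override this hook only
};

class Creature : public GameObject
{
private:
    void DoUpdate() override { /* animation, AI, ... */ }
};

As noted above, this guarantees the base-class work always runs, but it still does not solve pass-through across several intermediate levels of a deep hierarchy.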

So if an inheritance-based game object design is riddled with problems, what is a better approach?  In redesigning the Plasma engine used by HeadSpin/Cyan, the software engineering team opted to flatten the game object hierarchy down to a single game object class.  This class aggregated some number of components which supported functionality previously supported by derived game objects.

In a post to Sweng-Gamedev back in early 2001[1] — around the time Cyan started rearchitecting Plasma — Scott Bilas of Gas Powered Games described changing the Dungeon Siege engine from a “static class hierarchy” to a “component based” design.  In the same thread[2], Charles Bloom of Oddworld described the Munch’s Oddysee class design as being similar to that of Dungeon Siege.

From these data points, I’m willing to interpolate a trend.  If moving from class hierarchies to containers of components for game objects is a trend in game development, it mirrors a broader shift in the C++ development community at large.  The traditional game object hierarchy sounds like what Herb Sutter characterizes as “mid-1990s-style inheritance-heavy design” [8].  (Sutter goes on to warn against overuse of inheritance.)  In Design Patterns, the gang of four recommends only two principles for object oriented design:  (1) program to an interface and not to an implementation and (2) favor object composition over class inheritance [3].

Component-based game objects are cleaner and more powerful than game objects based on inheritance.  Components allow for better encapsulation of functionality.  And components are inherently dynamic — they give you the power to change at runtime state which could only be changed at compile time under an inheritance-based design.  The use of inheritance in implementing game object functionality is attractive, but eventually limiting.  A component-based design is to be preferred.

The Entity-Component-System – An awesome game-design pattern in C++ (Part 1)

by Tobias Stein on 11/22/17


Hi Folks.

This is actually my first post on Gamasutra 🙂 I am here pretty much every day and check out cool posts; today is going to be the day I add one by myself 🙂 You will find the original post here.

In this article I want to talk about the Entity-Component-System (ECS). You can find a lot of information about the matter on the internet, so I am not going to go deep into explanation here, but will talk more about my own implementation.

First things first. You will find the full source code of my ECS in my github repository.

An Entity-Component-System – mostly encountered in video games – is a design pattern which allows you great flexibility in designing your overall software architecture[1]. Big companies like Unity, Epic or Crytek incorporate this pattern into their frameworks to provide a very rich tool for developers to build their software with. You can check out these posts to follow a broad discussion about the matter[2,3,4,5].

If you have read the articles I mentioned above you will notice they all share the same goal: distributing different concerns and tasks between Entities, Components and Systems. These are the three big players in this pattern and are fairly loosely coupled. Entities are mainly used to provide a unique identifier, make the environment aware of the existence of a single individual, and function as a sort of root object that bundles a set of components. Components are nothing more than container objects that do not possess any complex logic. Ideally they are simple plain old data objects (PODs). Each type of component can be attached to an entity to provide some sort of property. Let’s say for example a “Health-Component” can be attached to an entity to make it mortal by giving it health, which is nothing more than an integer or floating point value in memory.

Up to this point most of the articles I came across agree about the purpose and use of entity and component objects, but for systems opinions differ. Some people suggest that systems are only aware of components. Furthermore, some say that for each type of component there should be a system, e.g. for “Collision-Components” there is a “Collision-System”, for “Health-Components” there is a “Health-System” etc. This approach is kind of rigid and does not consider the interplay of different components. A less restrictive approach is to let different systems deal with all the components they should be concerned with. For instance a “Physics-System” should be aware of “Collision-Components” and “Rigidbody-Components”, as both probably contain necessary information regarding physics simulation. In my humble opinion, systems are “closed environments”. That is, they do not take ownership of entities nor components. They access them through independent manager objects, which in turn will take care of the entities’ and components’ life-cycles.

This raises an interesting question: how do entities, components and systems communicate with each other, if they are more or less independent of each other? Depending on the implementation the answer differs. As for the implementation I am going to show you, the answer is event sourcing[6]. Events are distributed through an “Event-Manager” and everyone who is interested in events can listen to what the manager has to say. If an entity or system or even a component has an important state change to communicate, e.g. “position changed” or “player died”, it can tell the “Event-Manager”. It will broadcast the event and all subscribers for this event will get notified. This way everything can be interconnected.

Well, I guess the introduction above got longer than I was actually planning, but here we are 🙂 Before we dive deeper into the code, which is C++11 by the way, I will outline the main features of my architecture:

  • memory efficiency – to allow a quick creation and removal of entity, component and system objects as well as events I could not rely on standard new/delete managed heap-memory. The solution for this was of course a custom memory allocator.
  • logging – to see what is going on I used log4cplus[7] for logging.
  • scalable – it is easy to implement new types of entities, components, systems and events without any preset upper limit except your system’s memory
  • flexible – no dependencies exist between entities, components and systems (entities and components sure do have a sort of dependency, but do not contain any pointer logic of each other)
  • simple object lookup/access – easy retrieval of entity objects and their components through an EntityId, or a component-iterator to iterate over all components of a certain type
  • flow control – systems have priorities and can depend on each other, therefore a topological order for their execution can be established
  • easy to use – the library can easily be incorporated into other software; only one include.

The following figure depicts the overall architecture of my Entity-Component-System:

Figure-01: ECS Architecture Overview (ECS.dll).

As you can see there are four different colored areas in this picture. Each area defines a modular piece of the architecture. At the very bottom – actually at the very top in the picture above; it should be upside down – we have the memory management and the logging stuff (yellow area). These first-tier modules deal with very low-level tasks. They are used by the second-tier modules, the Entity-Component-System (blue area) and the event sourcing (red area). These mainly deal with object management tasks. Sitting on top is the third-tier module, the ECS_Engine (green area). This high-level global engine object orchestrates all second-tier modules and takes care of initialization and destruction. All right, this was a short and very abstract overview; now let’s get more into the details.

Memory Manager

Let’s start with the Memory-Manager. Its implementation is based on an article[8] I found on gamedev.net. The idea is to keep heap-memory allocations and releases to an absolute minimum. Therefore, only at application start, a big chunk of system-memory is allocated with malloc. This memory is then managed by one or more custom allocators. There are many types of allocators[9] (linear, stack, free list…) and each one of them has its pros and cons (which I am not going to discuss here). But even if they internally work in a different way, they all share a common public interface:

class Allocator
{
     public:
          // Virtual destructor so concrete allocators can be destroyed via this interface
          virtual ~Allocator() = default;

          // Hand out 'size' bytes from the memory this allocator manages
          virtual void* allocate(size_t size) = 0;

          // Return a previously allocated chunk to this allocator
          virtual void free(void* p) = 0;
};

The code snippet above is not complete, but outlines the two major public methods each concrete allocator must provide:

  1. allocate – allocates a certain number of bytes and returns the memory address of this chunk, and
  2. free – de-allocates a previously allocated chunk of memory, given its address.

Now with that said, we can do cool stuff like chaining up multiple allocators, like this:

Figure-02: Custom allocator managed memory.

As you can see, one allocator can get its chunk of memory – the chunk it is going to manage – from another (parent) allocator, which in turn could get its memory from yet another allocator, and so on. That way you can establish different memory management strategies. For the implementation of my ECS I provide a root stack-allocator that gets an initially allocated chunk of 1GB of system-memory. Second-tier modules will allocate as much memory as they need from this root allocator and will only free it when the application gets terminated.
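
To make the chaining idea concrete, here is a minimal sketch of a linear allocator built against the Allocator interface from the snippet above; it takes its block from a parent allocator, as in Figure-02. This is my own simplified illustration (no alignment handling or diagnostics), not the code from the repository:

#include <cstddef>
#include <cstdint>

// Minimal sketch only: a linear allocator that gets its block from a parent
// Allocator, then hands out memory by bumping an offset.
class LinearAllocator : public Allocator
{
    uint8_t*   m_Begin;
    size_t     m_Capacity;
    size_t     m_Offset = 0;
    Allocator* m_Parent;

public:
    LinearAllocator( Allocator* parent, size_t capacity )
        : m_Begin( static_cast<uint8_t*>( parent->allocate( capacity ) ) )
        , m_Capacity( capacity )
        , m_Parent( parent )
    {}

    ~LinearAllocator() { m_Parent->free( m_Begin ); }   // give the block back to the parent

    void* allocate( size_t size ) override
    {
        if ( m_Offset + size > m_Capacity )
            return nullptr;                             // this region is exhausted
        void* p = m_Begin + m_Offset;
        m_Offset += size;
        return p;
    }

    void free( void* ) override {}                      // linear allocators release everything at once
    void reset() { m_Offset = 0; }                      // release the whole region in one go
};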

Figure-03: Possible distribution of global memory.

Figure-03 shows how the global memory could be distributed among the second-tier modules: “Global-Memory-User A” could be the Entity-Manager, “Global-Memory-User B” the Component-Manager and “Global-Memory-User C” the System-Manager.

Logging

I am not going to talk too much about logging as I simply used log4cplus[7] to do this job for me. All I did was define a Logger base class hosting a log4cplus::Logger object and a few wrapper methods forwarding simple log calls like “LogInfo()”, “LogWarning()”, etc.

Entity-Manager, IEntity, Entity and Co.

Okay, now let’s talk about the real meat of my architecture; the blue area in Figure-01. You may have noticed the similar setup between all manager objects and the classes they manage. Have a look at the EntityManager, IEntity and Entity classes for example. The EntityManager class is supposed to manage all entity objects during application run-time. This includes tasks like creating, deleting and accessing existing entity objects. IEntity is an interface class and provides the very basic traits of an entity object, such as an object-identifier and a (static) type-identifier. It’s static because it won’t change after program initialization. This type-identifier is also consistent over multiple application runs and may only change if source code was modified.

class IEntity
{
    // code not complete!
    EntityId m_Id;
 
    public:
        IEntity();
        virtual ~IEntity();
 
        virtual const EntityTypeId GetStaticEntityTypeID() const = 0;
 
        inline const EntityId GetEntityID() const { return this->m_Id; }
};

The type-identifier is an integer value and varies for each concrete entity class. This allows us to check the type of an IEntity object at run-time. Last but not least comes the Entity template class.

template<class T>
class Entity : public IEntity
{
    // code not complete!
 
    void operator delete(void*) = delete;
    void operator delete[](void*) = delete;

public:
 
    static const EntityTypeId STATIC_ENTITY_TYPE_ID;
    
    Entity() {}
    virtual ~Entity() {}
 
    virtual const EntityTypeId GetStaticEntityTypeID() const override { return STATIC_ENTITY_TYPE_ID; }
};
 
// constant initialization of entity type identifier
template<class T>
const EntityTypeId Entity<T>::STATIC_ENTITY_TYPE_ID = util::Internal::FamilyTypeID::Get<T>();

This class’s sole purpose is the initialization of the unique type-identifier of a concrete entity class. I made use of two facts here: first, constant initialization[10] of static variables, and second, the nature of how template classes work. Each version of the template class Entity will have its own static variable STATIC_ENTITY_TYPE_ID, which in turn is guaranteed to be initialized before any dynamic initialization happens. The term “util::Internal::FamilyTypeID::Get<T>()” is used to implement a sort of type-counter mechanism: it internally increments a counter every time it gets called with a different T, but always returns the same value when called with the same T again. I am not sure if that pattern has a special name, but it is pretty cool 🙂 At this point I also got rid of the delete and delete[] operators. This way I made sure nobody would accidentally call these guys. This also – as long as your compiler is smart enough – would give you a warning when trying to use the new or new[] operator on entity objects, as their counterparts are gone. These operators are not intended to be used, since the EntityManager class will take care of all this.

Alright, let’s summarize what we just learned. The manager class provides basic functionality such as creating, deleting and accessing objects. The interface class functions as the very root base class and provides a unique object-identifier and type-identifier. The template class ensures the correct initialization of the type-identifier and removes the delete/delete[] operators. This very same pattern of a manager, an interface and a template class is used for components, systems and events as well. The only, but important, way these groups differ is in how the manager classes store and access their objects.
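
I do not know the exact implementation behind util::Internal::FamilyTypeID, but the counter mechanism described above can be approximated with a sketch like this (simplified; the real code presumably keeps a separate counter per base family such as IEntity or IComponent):

#include <cstddef>

using TypeID = std::size_t;

// Approximation of the type-counter idea: the first call of Get<T>() for a
// new T grabs the next counter value; every later call with the same T
// returns that same value (the function-local static is initialized once).
class FamilyTypeID
{
    static TypeID s_Count;

public:
    template<class T>
    static TypeID Get()
    {
        static const TypeID id = s_Count++;   // runs exactly once per distinct T
        return id;
    }
};

TypeID FamilyTypeID::s_Count = 0;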

Let’s have a look at the EntityManager class first. Figure-04 shows the overall structure of how things are stored. 

Figure-04: Abstract view of the EntityManager class and its object storage.

When creating a new entity object one would use the EntityManager::CreateEntity<T>(args…) method. This public method first takes a template parameter, which is the type of the concrete entity to be created. Secondly, this method takes an optional number of parameters (can be empty), which are forwarded to the constructor of T. Forwarding these parameters happens through a variadic template[11]. During creation the following things happen internally …

  1. The ObjectPool[12] for entity objects of type T will be acquired; if this pool does not exist, a new one will be created
  2. New memory will be allocated from this pool; just enough to store the T object
  3. Before actually calling the constructor of T, a new EntityId is acquired from the manager. This id is stored along with the just-allocated memory in a look-up table, so we can look up the entity instance later with that id
  4. Next, the C++ placement new operator[13] is called with the forwarded args… as input to create a new instance of T
  5. Finally, the method returns the entity’s identifier.

After a new instance of an entity object has been created you can access it via its unique object identifier (EntityId) and EntityManager::GetEntity(EntityId id). To destroy an instance of an entity object one must call the EntityManager::DestroyEntity(EntityId id) method.
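
Put together, the creation and lookup path can be sketched roughly like this. This is NOT the library’s actual code; the ObjectPool handling is replaced by a plain heap allocation and all member names are illustrative:

#include <cstdint>
#include <new>
#include <unordered_map>
#include <utility>

using EntityId = std::uint32_t;

class EntityManagerSketch
{
    std::unordered_map<EntityId, void*> m_EntityLUT;   // EntityId -> instance memory (step 3)
    EntityId m_NextId = 0;

    template<class T>
    void* AllocateFromPool()                           // steps 1-2: the real code uses a per-type ObjectPool
    { return ::operator new( sizeof( T ) ); }          // plain heap here, for brevity

public:
    template<class T, class... Args>
    EntityId CreateEntity( Args&&... args )
    {
        void*    memory = AllocateFromPool<T>();               // steps 1-2
        EntityId id     = m_NextId++;                          // step 3: reserve a fresh id
        m_EntityLUT[id] = memory;                              // step 3: remember where it lives
        new ( memory ) T( std::forward<Args>( args )... );     // step 4: placement new, forwarded args
        return id;                                             // step 5
    }

    template<class T>
    T* GetEntity( EntityId id )                                // later access via the id
    {
        auto it = m_EntityLUT.find( id );
        return it != m_EntityLUT.end() ? static_cast<T*>( it->second ) : nullptr;
    }
};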

The ComponentManager class works in the same way plus one extension. Besides the object pools for storing all sorts of components it must provide an additional mechanism for linking components to their owning entity objects. This constraint results in a second look-up step: first we check if there is an entry for a given EntityId, if there is one we will check if this entity has a certain type of component attached by looking it up in a component-list.

Figure-05: Component-Manager object storage overview.

Using the ComponentManager::CreateComponent<T>(EntityId id, args…) method allows us to add a certain component to an entity. With ComponentManager::GetComponent<T>(EntityId id) we can access the entity’s components, where T specifies what type of component we want to access. If the component is not present, nullptr is returned. To remove a component from an entity one would use the ComponentManager::RemoveComponent<T>(EntityId id) method. But wait, there is more. Another way of accessing components is the ComponentIterator. This way you can iterate over all existing components of a certain type T. This might be handy if a system like the “Physics-System” wants to apply gravity to all “Rigidbody-Components”.
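
A short usage sketch based on the calls just described (HealthComponent, its members and the exact signatures are illustrative assumptions, not the library’s verbatim API):

// Attach a component; extra arguments are forwarded to its constructor.
ECS::ECS_Engine->GetComponentManager()->CreateComponent<HealthComponent>( entityId, /*maxHealth=*/100.0f );

// Look it up later; nullptr means the entity has no such component.
HealthComponent* hp = ECS::ECS_Engine->GetComponentManager()->GetComponent<HealthComponent>( entityId );
if ( hp != nullptr )
    hp->m_CurrentHealth -= 10.0f;

// Iterate over ALL components of one type, regardless of owning entity.
for ( auto it  = ECS::ECS_Engine->GetComponentManager()->begin<HealthComponent>();
           it != ECS::ECS_Engine->GetComponentManager()->end<HealthComponent>(); ++it )
{
    if ( it->m_CurrentHealth <= 0.0f )
        ECS::ECS_Engine->SendEvent<GameObjectKilled>( it->GetOwner() );   // owner id, as in the PhysicsSystem example later
}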

The SystemManager class does not have any fancy extras for storing and accessing systems. A simple map is used to store a system along with its type-identifier as the key.

The EventManager class uses a linear-allocator that manages a chunk of memory. This memory is used as an event buffer. Events are stored into that buffer and dispatched later. Dispatching the event will clear the buffer so new events can be stored. This happens at least once every frame.
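
The buffer-and-dispatch idea can be sketched like this (again my own simplified illustration: the real EventManager stores POD events in memory taken from the linear allocator rather than in a std::vector):

#include <functional>
#include <vector>

// Events are queued during the frame, delivered to the subscribed listeners
// in one batch, then the buffer is cleared for the next frame.
template<class EventT>
class EventChannelSketch
{
    std::vector<EventT>                             m_Buffer;     // pending events
    std::vector<std::function<void(const EventT&)>> m_Listeners;  // subscribed callbacks

public:
    void Subscribe( std::function<void(const EventT&)> listener )
    { m_Listeners.push_back( std::move( listener ) ); }

    void Send( const EventT& e ) { m_Buffer.push_back( e ); }     // just buffer it for now

    void DispatchAll()                                            // called at least once per frame
    {
        for ( const EventT& e : m_Buffer )
            for ( auto& listener : m_Listeners )
                listener( e );
        m_Buffer.clear();                                         // ready for new events
    }
};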

Figure-06: Recap of the ECS architecture overview.

I hope at this point you have got a rough idea of how things work in my ECS. If not, no worries, have a look at Figure-06 and let’s recap. You can see the EntityId is quite important, as you will use it to access a concrete entity object instance and all its components. All components know their owner, that is, having a component object at hand you can easily get the entity by asking the EntityManager class with the given owner-id of that component. To pass an entity around you would never use its pointer directly, but you can use events in combination with the EntityId. You could create a concrete event, let’s say “EntityDied” for example, and this event (which must be a plain old data object) has a member of type EntityId. Now to notify all event listeners (IEventListener) – which could be Entities, Components or Systems – we use EventManager::SendEvent(entityId). The event receiver on the other side can now use the provided EntityId and ask the EntityManager class to get the entity object, or the ComponentManager class to get a certain component of that entity. The reason for that detour is simple: at any point while running the application, an entity or one of its components could be deleted by some logic. Because you don’t want to clutter your code with extra clean-up stuff, you rely on this EntityId. If the manager returns nullptr for that EntityId, you will know that the entity or component no longer exists. The red square, by the way, corresponds to the one in Figure-01 and marks the boundaries of the ECS.

The Engine object

To make things a little bit more comfortable I created an engine object. The engine object ensures easy integration and usage in client software. On the client side one only has to include the “ECS/ECS.h” header and call the ECS::Initialize() method. Now a static global engine object will be initialized (ECS::ECS_Engine) and can be used on the client side to get access to all the manager classes. Furthermore, it provides a SendEvent method for broadcasting events and an Update method, which will automatically dispatch all events and update all systems. ECS::Terminate() should be called before exiting the main program; this ensures that all acquired resources are freed. The code snippet below demonstrates the very basic usage of the ECS’s global engine object.

#include <ECS/ECS.h>
 
int main(int argc,char* argv[])
{
    // initialize global 'ECS_Engine' object
    ECS::Initialize();
 
    const float DELTA_TIME_STEP = 1.0f / 60.0f; // 60hz
 
    bool bQuit = false;
 
    // run main loop until quit
   while(bQuit == false)
   {
       // Update all Systems, dispatch all buffered events,
       // remove destroyed components and entities ...
       ECS::ECS_Engine->Update(DELTA_TIME_STEP);
       /*
           ECS::ECS_Engine->GetEntityManager()->...;
           ECS::ECS_Engine->GetComponentManager()->...;
           ECS::ECS_Engine->GetSystemManager()->...;
 
           ECS::ECS_Engine->SendEvent<T>(...);
       */
       // more logic ...
   }
 
   // destroy global 'ECS_Engine' object
   ECS::Terminate();
   return 0;
}

Conclusion

The Entity-Component-System described in this article is fully functional and ready to use. But as usual there are certainly a few things to improve. The following list outlines just a few ideas that I came up with:

  • Make it thread-safe,
  • Run each system or a group of systems in threads with respect to their topological order,
  • Refactor event-sourcing and memory management and include them as modules,
  • serialization,
  • profiling

I hope this article was helpful and you enjoyed reading it as much as I did writing it 🙂 If you want to see my ECS in action check out this demo:

The BountyHunter demo makes heavy use of the ECS and demonstrates the strengths of this pattern. If you want to know how, have a look at this post.

So far …

The Entity-Component-System – BountyHunter game (Part 2)

by Tobias Stein on 11/22/17


Hey Folks 🙂

Welcome to part two of the series “The Entity-Component-System”. As always, you can check out the original post.

Continuing from my last post, where I talked about the Entity-Component-System (ECS) design pattern, I now want to show you how to actually use it to build a game. If you have not already seen it, check out what kind of game I built with the help of my ECS.

I will admit this does not look like much, but if you have ever built your own game without the help of a big and fancy game engine like Unity or Unreal, you might give me some credit here 😉 For the purpose of demonstrating my ECS, this is all I need. If you still have not figured out what this game (BountyHunter) is about, let me help you out with the following picture:

[Figure: the BountyHunter game world (left) and the game objective and rules (right)]

The picture on the left may look familiar as it is a more abstract view of the game you saw in the video clip. The focus is on game entities. On the right-hand side you will find the game objective and rules. This should be pretty much self-explanatory. As you can see we have a bunch of entity types living in this game world, and now you may wonder what they are actually made of. Well, components of course. While some types of components are common to all these entities, a few are unique to others. Check out the next picture.

Figure-02: Entities and their components.

By looking at this picture you can easily see the relation between entities and their components (this is not a complete depiction!). All game entities have the Transform-Component in common. Because game entities must be located somewhere in the world they have a transform, which describes the entity’s position, rotation and scale. This might be the one and only component attached to an entity. The camera object, for instance, does not require many more components – especially not a Material-Component, as it will never be visible to the player (this might not be true if you used it for post-effects). The Bounty and Collector entity objects on the other hand do have a visual appearance and therefore need a Material-Component to get displayed. They also can collide with other objects in the game world and therefore have a Collision-Component attached, which describes their physical form. The Bounty entity has one more component attached to it: the Lifetime-Component. This component states the remaining life-time of a Bounty object; when its life-time has elapsed, the bounty will fade away.

So what’s next? Having all these different entities with their individual gathering of components does not complete the game. We also need someone who knows how to drive each one of them. I am talking about the systems of course. Systems are great. You can use systems to split up your entire game-logic into much smaller pieces. Each piece dealing with a different aspect of the game. There could or actually should be an Input-System, which is handling all the player input. Or a Render-System that brings all the shapes and color onto screen. A Respawn-System to respawn dead game objects. I guess you got the idea. The following picture shows a complete class-diagram of all the concrete entity, component and system types in BountyHunter.

Figure-03: BountyHunter ECS class-diagram.

Now we have entities, components and systems (ECS), but wait, there is more… events! To let systems and entities communicate with each other I provided a collection of 38 different events:

GameInitializedEvent GameRestartedEvent GameStartedEvent 
GamePausedEvent GameResumedEvent GameoverEvent GameQuitEvent 
PauseGameEvent ResumeGameEvent RestartGameEvent QuitGameEvent 
LeftButtonDownEvent LeftButtonUpEvent LeftButtonPressedEvent 
RightButtonDownEvent RightButtonUpEvent RightButtonPressedEvent
KeyDownEvent KeyUpEvent KeyPressedEvent ToggleFullscreenEvent
EnterFullscreenModeEvent StashFull EnterWindowModeEvent 
GameObjectCreated GameObjectDestroyed PlayerLeft GameObjectSpawned
GameObjectKilled CameraCreated, CameraDestroyed ToggleDebugDrawEvent
WindowMinimizedEvent WindowRestoredEvent WindowResizedEvent 
PlayerJoined CollisionBeginEvent CollisionEndEvent 

And there is still more. What else did I need to make BountyHunter?

  • general application framework – SDL2 for getting the player input and setting up the basic application window.
  • graphics – I used a custom OpenGL renderer to make rendering into that application window possible.
  • math – for solid linear algebra I used glm.
  • collision detection – for collision detection I used box2d physics.
  • Finite-State-Machine – used for simple AI and game states.

Obviously I am not going to talk about all these mechanics here, as they are worth their own post, which I might do at a later point 😉 But if you are enthusiastic to get to know them anyway, I won’t stop you and will leave you with this link. Looking at all the features I mentioned above, you may realize that they are a good start for your own small game engine. Here are a few more things I have on my todo-list, but did not actually implement, just because I wanted to get things done.

  • Editor – an editor managing entities, components, systems and more
  • Savegame – persist entities and their components into a database using some ORM library (e.g. codesynthesis)
  • Replays – recording events at run-time and replaying them at a later point
  • GUI – using a GUI framework (e.g. librocket) to build an interactive game-menu
  • Resource-Manager – synchronous and asynchronous loading of assets (textures, fonts, models etc.) through a custom resource manager
  • Networking – send events across the network and setup a multiplayer mode

I will leave these todos up to you as a challenge, to prove that you are an awesome programmer 😉

Finally, let me provide you with some code which demonstrates the usage of my ECS. Remember the Bounty game entity? Bounties are the small yellow, big red and all-in-between squares spawning randomly somewhere in the center of the world. The following snippet shows the code of the class declaration of the Bounty entity.

// Bounty.h
 
class Bounty : public GameObject<Bounty>
{
private:
 
    // cache components
    TransformComponent*   m_ThisTransform;
    RigidbodyComponent*   m_ThisRigidbody;
    CollisionComponent2D* m_ThisCollision;
    MaterialComponent*    m_ThisMaterial;
    LifetimeComponent*    m_ThisLifetime;
 
   // bounty class property
   float                 m_Value;
 
public:
 
    Bounty(GameObjectId spawnId);
    virtual ~Bounty();
 
    virtual void OnEnable() override;
    virtual void OnDisable() override;
 
    inline float GetBounty() const { return this->m_Value; }
 
    // called OnEnable, sets new randomly sampled bounty value
    void ShuffleBounty();
};

The code is pretty much straightforward. I’ve created a new game entity by deriving from GameObject<T> (which is derived from ECS::Entity<T>), with the class Bounty itself as T. Now the ECS is aware of that concrete entity type and a unique (static) type-identifier will be created. We also get access to the convenient methods AddComponent<U>, GetComponent<U> and RemoveComponent<U>. Besides the components, which I will show you in a second, there is another property: the bounty value. I am not sure why I did not put that property into a separate component, for instance a BountyComponent, because that would be the right way. Instead I just put the bounty value property as a member into the Bounty class, shame on me. But hey, this only shows you the great flexibility of this pattern, right? 😉 Right, the components …

// Bounty.cpp
Bounty::Bounty(GameObjectId spawnId)
{
    Shape shape = ShapeGenerator::CreateShape<QuadShape>();
    AddComponent<ShapeComponent>(shape);
    AddComponent<RespawnComponent>(BOUNTY_RESPAWNTIME, spawnId, true);
 
    // cache this components
    this->m_ThisTransform = GetComponent<TransformComponent>();
    this->m_ThisMaterial  = AddComponent<MaterialComponent>(MaterialGenerator::CreateMaterial<defaultmaterial>());
    this->m_ThisRigidbody = AddComponent<RigidbodyComponent>(0.0f, 0.0f, 0.0f, 0.0f, 0.0001f);
    this->m_ThisCollision = AddComponent<CollisionComponent2D>(shape, this->m_ThisTransform->AsTransform()->GetScale(), CollisionCategory::Bounty_Category, CollisionMask::Bounty_Collision);
    this->m_ThisLifetime  = AddComponent<LifetimeComponent>(BOUNTY_MIN_LIFETIME, BOUNTY_MAX_LIFETIME);
}
// other implementations ...

I’ve used the constructor to attach all the components required by the Bounty entity. Note that this approach creates a prefabricated object and is not flexible, that is, you will always get a Bounty object with the same components attached to it. While this is a good enough solution for this game, it might not be in a more complex one. In such a case you would provide a factory that produces custom-tailored entity objects. As you can see in the code above there are quite a few components attached to the Bounty entity. We have a ShapeComponent and MaterialComponent for the visual appearance, a RigidbodyComponent and CollisionComponent2D for physical behavior and collision response, and a RespawnComponent for giving Bounty the ability to get respawned after death. Last but not least there is a LifetimeComponent that binds the existence of the entity to a certain amount of time. The TransformComponent is automatically attached to any entity that is derived from GameObject<T>. That’s it. We’ve just added a new entity to the game.

Now you probably want to see how to make use of all these components. Let me give you two examples. First the RigidbodyComponent. This component contains information about some physical traits, e.g. friction, density or linear damping. Furthermore, it functions as an adapter class which is used to incorporate the box2d physics into the game. The RigidbodyComponent is rather important as it is used to synchronize the physics-simulated body’s transform (owned by box2d) and the entity’s TransformComponent (owned by the game). The PhysicsSystem is responsible for this synchronization process.

// PhysicsEngine.h
 
class PhysicsSystem : public ECS::System<PhysicsSystem>, public b2ContactListener
{
public:
    PhysicsSystem();
    virtual ~PhysicsSystem();
 
    virtual void PreUpdate(float dt) override;
    virtual void Update(float dt) override;
    virtual void PostUpdate(float dt) override;
 
    // Hook-in callbacks provided by box2d physics to inform about collisions
    virtual void BeginContact(b2Contact* contact) override;
    virtual void EndContact(b2Contact* contact) override;
}; // class PhysicsSystem
// PhysicsEngine.cpp
 
void PhysicsSystem::PreUpdate(float dt)
{
    // Sync physics rigidbody transformation and TransformComponent
    for (auto RB = ECS::ECS_Engine->GetComponentManager()->begin<RigidbodyComponent>(); RB != ECS::ECS_Engine->GetComponentManager()->end<RigidbodyComponent>(); ++RB)
    {
        if ((RB->m_Box2DBody->IsAwake() == true) && (RB->m_Box2DBody->IsActive() == true))
        {
            TransformComponent* TFC = ECS::ECS_Engine->GetComponentManager()->GetComponent<TransformComponent>(RB->GetOwner());
            const b2Vec2& pos = RB->m_Box2DBody->GetPosition();
            const float rot = RB->m_Box2DBody->GetAngle();
 
            TFC->SetTransform(glm::translate(glm::mat4(1.0f), Position(pos.x, pos.y, 0.0f)) * glm::yawPitchRoll(0.0f, 0.0f, rot) * glm::scale(TFC->AsTransform()->GetScale()));
        }
    }
}
 
// other implementations ...

From the implementation above you may have noticed the three different update functions. When the systems get updated, first all PreUpdate methods of all systems are called, then all Update methods, and last the PostUpdate methods. Since the PhysicsSystem is updated before any other system concerned with the TransformComponent, the code above ensures a synchronized transform. Here you can also see the ComponentIterator in action. Rather than asking every entity in the world whether it has a RigidbodyComponent, we ask the ComponentManager to give us a ComponentIterator for the type RigidbodyComponent. Having the RigidbodyComponent, we can easily retrieve the owning entity’s id and ask the ComponentManager once more to give us the TransformComponent for that id as well.

Let’s check out the second example I promised. The RespawnComponent is used for entities which are intended to be respawned after they die. This component provides five properties which can be used to configure the entity’s respawn behavior: you can decide whether an entity should automatically respawn when it dies, how much time must pass until it gets respawned, and the spawn location and orientation. The actual respawn logic is implemented in the RespawnSystem.

// RespawnSystem.h
class RespawnSystem : public ECS::System<RespawnSystem>, protected ECS::Event::IEventListener
{
private:
 
    // ... other stuff
    Spawns       m_Spawns;
    RespawnQueue m_RespawnQueue;
 
    // Event callbacks
    void OnGameObjectKilled(const GameObjectKilled* event);
 
public:
 
    RespawnSystem();
    virtual ~RespawnSystem();
 
    virtual void Update(float dt) override;
 
    // more ...
}; // class RespawnSystem
// RespawnSystem.cpp
// note: the following is only pseudo code!
 
void RespawnSystem::OnGameObjectKilled(const GameObjectKilled* event)
{
    // check if entity has respawn ability
    RespawnComponent* entityRespawnComponent = ECS::ECS_Engine->GetComponentManager()->GetComponent<RespawnComponent>(event->m_EntityID);
 
    if(entityRespawnComponent == nullptr || (entityRespawnComponent->IsActive() == false) || (entityRespawnComponent->m_AutoRespawn == false))
        return;
 
    AddToRespawnQueue(event->m_EntityID, entityRespawnComponent);
}
 
void RespawnSystem::Update(float dt)
{
    foreach(spawnable in this->m_RespawnQueue)
    {
        spawnable.m_RemainingDeathTime -= dt;
        if(spawnable.m_RemainingDeathTime <= 0.0f)
        {
            DoSpawn(spawnable);
            RemoveFromSpawnQueue(spawnable);
        }
    }
}

The code above is not complete, but it captures the important lines of code. The RespawnSystem holds and updates a queue of entity ids along with their RespawnComponents. New entries are enqueued when the system receives a GameObjectKilled event. The system will check if the killed entity has the respawn ability, that is, if there is a RespawnComponent attached. If so, the entity gets enqueued for respawning; otherwise it is ignored. In the RespawnSystem’s update method, which is called each frame, the system decreases the remaining respawn time of the queued entities’ RespawnComponents. If a respawn time drops below zero, the entity is respawned and removed from the respawn queue.

I know this was a quick tour, but I hope I could give you a rough idea of how things work in the ECS world. Before ending this post I want to share some more of my own experiences with you. Working with my ECS was very much a pleasure. It is surprisingly easy to add new stuff to the game, even third-party libraries: I simply added new components and systems, which would link the new feature into my game. I never got the feeling of being at a dead end. Having the entire game logic split up into multiple systems is intuitive and comes for free when using an ECS. The code looks much cleaner and becomes more maintainable as all this pointer-spaghetti-dependency-confusion is gone. Event sourcing is very powerful and helpful for inter-system/entity communication, but it is also a double-edged sword and can eventually cause you some trouble; I am speaking of event race conditions. If you have ever worked with Unity’s or Unreal Engine’s editor you will be glad to have them. Such editors definitely boost your productivity, as you are able to create new ECS objects in much less time than hacking all these lines of code by hand. But once you have set up a rich foundation of entity, component, system and event objects it is almost child’s play to plug them together and build something cool out of them. I guess I could go on and talk a while longer about how cool ECS’s are, but I will stop here.

Thanks for swinging by and making it this far 🙂

Cheers, Tobs.

Understanding Component-Entity-Systems
General and Gameplay Programming

Klutzershy

The traditional way to implement game entities was to use object-oriented programming. Each entity was an object, which intuitively allowed for an instantiation system based on classes and enabled entities to extend others through polymorphism. This led to large, rigid class hierarchies. As the number of entities grew, it became increasingly difficult to place a new entity in the hierarchy, especially if the entity needed a lot of different types of functionality.

[classdiagram.png – a simple class hierarchy; a static enemy does not fit well into the tree]

To solve this, game programmers started to build entities through composition instead of inheritance. An entity is simply an aggregation (technically a composition) of components. This has some major benefits over the object-oriented architecture described above:

  1. It’s easy to add new, complex entities
  2. It’s easy to define new entities in data
  3. It’s more efficient

Here’s how a few of the entities above would be implemented. Notice that the components are all pure data – no methods. This will be explained in detail below.

[compositiondiagram.png – the same entities built from components]

The Component

A component can be likened to a C struct. It has no methods and is only capable of storing data, not acting upon it. In a typical implementation, each different component type will derive from an abstract Component class, which provides facilities for getting a component’s type and containing entity at runtime. Each component describes a certain aspect of an entity and its parameters. By themselves, components are practically meaningless, but when used in conjunction with entities and systems, they become extremely powerful. Empty components are useful for tagging entities.
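As a minimal sketch of that idea (the names and the integer type id are assumptions; real implementations vary), such a base class might look like this:

// Illustrative sketch of an abstract component base class.
class Component
{
public:
    virtual ~Component() {}

    // Runtime identification of the component's type (enum, hash, etc.).
    virtual int GetType() const = 0;

    // The entity this component is attached to.
    unsigned int GetOwner() const { return m_Owner; }
    void SetOwner(unsigned int owner) { m_Owner = owner; }

private:
    unsigned int m_Owner = 0;
};

// A concrete component is pure data plus the required type id.
class HealthComponent : public Component
{
public:
    int value = 100;

    int GetType() const override { return 1; } // hypothetical type id for Health
};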

Examples

  • Position (x, y)
  • Velocity (x, y)
  • Physics (body)
  • Sprite (images, animations)
  • Health (value)
  • Character (name, level)
  • Player (empty)

The Entity

An entity is something that exists in your game world. Again, an entity is little more than a list of components. Because they are so simple, most implementations won’t define an entity as a concrete piece of data. Instead, an entity is a unique ID, and all components that make up an entity will be tagged with that ID. The entity is an implicit aggregation of the components tagged with its ID. If you want, you can allow components to be dynamically added to and removed from entities. This allows you to “mutate” entities on the fly. For example, you could have a spell that makes its target freeze. To do this, you could simply remove the Velocity component.
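A minimal sketch of the “entity is just an ID” idea, with components tagged by that ID (the containers and names are illustrative only):

#include <unordered_map>

struct Position { float x; float y; };
struct Velocity { float x; float y; };

typedef unsigned int EntityId;

// Illustrative only: each component type lives in its own container,
// keyed by the id of the entity that owns it.
std::unordered_map<EntityId, Position> positions;
std::unordered_map<EntityId, Velocity> velocities;

// "Freezing" an entity at runtime by removing its Velocity component:
void Freeze(EntityId entity)
{
    velocities.erase(entity); // a Movement system will now simply skip this entity
}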

Examples

  • Rock (Position, Sprite)
  • Crate (Position, Sprite, Health)
  • Sign (Position, Sprite, Text)
  • Ball (Position, Velocity, Physics, Sprite)
  • Enemy (Position, Velocity, Sprite, Character, Input, AI)
  • Player (Position, Velocity, Sprite, Character, Input, Player)

The System

Notice that I’ve neglected to mention any form of game logic. This is the job of the systems. A system operates on related groups of components, i.e. components that belong to the same entity. For example, the character movement system might operate on a Position, a Velocity, a Collider, and an Input. Each system will be updated once per frame in a logical order. To make a character jump, first the keyJump field of the Input data is checked. If it is true, the system will look through the contacts contained in the Collider data and check if there is one with the ground. If so, it will set the Velocity‘s y field to make the character jump.

Because a system only operates on components if the whole group is present, components implicitly define the behaviour an entity will have. For example, an entity with a Position component but not a Velocity component will be static. Since the Movement system uses a Position and a Velocity, it won’t operate on the Position contained within that entity. Adding a Velocity component will make the Movement system work on that entity, thus making the entity dynamic and affected by gravity. This behaviour can be exploited with “tag components” (explained above) to reuse components in different contexts. For example, the Input component defines generic flags for jumping, moving, and shooting. Adding an empty Player component will tag the entity for the PlayerControl system so that the Input data will be populated based on controller inputs.
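As a rough sketch of the jump check described above (every name here is illustrative, not from any particular engine):

struct Input    { bool keyJump; };
struct Collider { bool touchingGround; }; // simplified stand-in for the contact list
struct Velocity { float x; float y; };

// Runs only for entities that have all three components.
void CharacterJump(const Input& input, const Collider& collider, Velocity& velocity)
{
    if (input.keyJump && collider.touchingGround)
    {
        velocity.y = 10.0f; // arbitrary jump impulse; the sign depends on the coordinate convention
    }
}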

Examples

  • Movement (Position, Velocity) – Adds velocity to position
  • Gravity (Velocity) – Accelerates velocity due to gravity
  • Render (Position, Sprite) – Draws sprites
  • PlayerControl (Input, Player) – Sets the player-controlled entity’s input according to a controller
  • BotControl (Input, AI) – Sets a bot-controlled entity’s input according to an AI agent

Conclusion

To wrap up, OOP-based entity hierarchies need to be left behind in favour of Component-Entity-Systems. Entities are your game objects, which are implicitly defined by a collection of components. These components are pure data and are operated on in functional groups by the systems. I hope I’ve managed to help you to understand how Component-Entity-Systems work, and to convince you that they are better than traditional OOP. If you have any questions about the article, I’d appreciate a comment or message. A follow-up article has been posted, which provides a sample C implementation and solves some design problems. Implementing Component-Entity-Systems

Article Update Log

1 April 2013 – Initial submission
2 April 2013 – Initial publication; cleaned up formatting
29 September 2013 – Added notice of follow-up article; changed some formatting

Implementing Component-Entity-Systems
General and Gameplay Programming

Klutzershy

This is the follow-up to my article from April, Understanding Component-Entity-Systems. If you haven’t read that article yet, I suggest looking it over because it explains the theory behind what I am about to show you. To summarize what was written:

  • Components represent the data a game object can have
  • Entities represent a game object as an aggregation of components
  • Systems provide the logic to operate on components and simulate the game

The purpose of this article is to show how to implement the architecture that I described in an efficient way, and to provide solutions for some sample problems. All the code samples that I provide will be written in C.

Implementation

Components

I wrote in the last article that a component is essentially a C struct: plain old data, so that’s what I used to implement them. They’re pretty self-explanatory. I’ll implement three types of component here:

  1. Displacement (x, y)
  2. Velocity (x, y)
  3. Appearance (name)

Here’s the sample of code defining the Displacement component. It is a simple struct with two members that define its vector components.
typedef struct
{
    float x;
    float y;
} Displacement;
The Velocity component is defined the same way, and the Appearance has a single string member. In addition to the concrete data types, the implementation makes use of an enum for creating “component masks”, or bit fields, that identify a set of components. Each entity and system has a component mask, the use of which will be explained shortly.
typedef enum
{
    COMPONENT_NONE = 0,
    COMPONENT_DISPLACEMENT = 1 << 0,
    COMPONENT_VELOCITY = 1 << 1,
    COMPONENT_APPEARANCE = 1 << 2
} Component;
Defining a component mask is easy. In the context of an entity, a component mask describes which components the entity has. If the entity has a Displacement and an Appearance, the value of its component mask will be COMPONENT_DISPLACEMENT | COMPONENT_APPEARANCE.

Entities

The entity itself will not be defined as a concrete data type. In accordance with data-oriented-design (DOD) principles, having each entity be a structure containing each of its components, creating an “array of structs”, is a no-no. Therefore, each component type will be laid out contiguously in memory, creating a “struct of arrays”. This will improve cache coherency and facilitate iteration. In order to do this, the entity will be represented by an index into each component array. The component found at that index is considered as part of that entity. I call this “struct of arrays” the World. Along with the components themselves, it stores a component mask for each entity.
typedef struct
{
    int mask[ENTITY_COUNT];

    Displacement displacement[ENTITY_COUNT];
    Velocity velocity[ENTITY_COUNT];
    Appearance appearance[ENTITY_COUNT];
} World;
ENTITY_COUNT is defined in my test program to be 100, but in a real game it will likely be much higher. In this implementation, the maximum number of entities is constrained to this value. I prefer to use stack-allocated memory over dynamic memory, but the world could also be implemented as a number of C++-style vectors, one per component. Along with this structure, I have defined a couple of functions that are able to create and destroy specific entities.
unsigned int createEntity(World *world)
{
    unsigned int entity;
    for(entity = 0; entity < ENTITY_COUNT; ++entity)
    {
        if(world->mask[entity] == COMPONENT_NONE)
        {
            return(entity);
        }
    }

    printf("Error! No more entities left!\n");
    return(ENTITY_COUNT);
}

void destroyEntity(World *world, unsigned int entity)
{
    world->mask[entity] = COMPONENT_NONE;
}
The first does not “create” an entity per se, but instead returns the first “empty” entity index, i.e. for the first entity with no components. The second simply sets an entity’s component mask to nothing. Treating an entity with an empty component mask as “non-existent” is very intuitive, because no systems will run on it. I’ve also created a few helper functions to create a fully-formed entity from initial parameters such as displacement and velocity. Here is the one that creates a tree, which has a Displacement and an Appearance.
unsigned int createTree(World *world, float x, float y)
{
    unsigned int entity = createEntity(world);

    world->mask[entity] = COMPONENT_DISPLACEMENT | COMPONENT_APPEARANCE;

    world->displacement[entity].x = x;
    world->displacement[entity].y = y;

    world->appearance[entity].name = "Tree";

    return(entity);
}
In a real-world engine, your entities would likely be defined using external data files, but that is beyond the scope of my test program. Even so, it is easy to see how flexible the entity creation system is.

Systems

The systems are easily the most complex part of the implementation. Each system is a generic function which is mapped to a certain component mask. This is the second use of a component mask: to define which components a certain system operates on.
#define MOVEMENT_MASK (COMPONENT_DISPLACEMENT | COMPONENT_VELOCITY)

void movementFunction(World *world)
{
    unsigned int entity;
    Displacement *d;
    Velocity *v;

    for(entity = 0; entity < ENTITY_COUNT; ++entity)
    {
        if((world->mask[entity] & MOVEMENT_MASK) == MOVEMENT_MASK)
        {
            d = &(world->displacement[entity]);
            v = &(world->velocity[entity]);

            v->y -= 0.98f;

            d->x += v->x;
            d->y += v->y;
        }
    }
}
Here is where the component mask becomes really powerful. It makes it trivial to select an entity based on whether or not it has certain components, and it does it quickly. If each entity were a concrete structure with a dictionary or set to show which components it has, it would be a much slower operation. The system itself adds the effect of gravity and then moves any entity with both a Displacement and a Velocity. If all entities are initialized properly, every entity processed by this function is guaranteed to have a valid Displacement and Velocity. The one downside of the component mask is that the number of possible components is finite. In this implementation it is 32 because the default integer type is 32 bits long. C++ provides the std::bitset class, which is N bits long, and I’m sure other languages provide similar facilities. In C, the number of bits can be extended by using multiple component masks in an array and checking each one independently, like this:
(EntityMask[0] & SystemMask[0]) == SystemMask[0] && (EntityMask[1] & SystemMask[1]) == SystemMask[1] // && ...
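Wrapped up as a small helper, that check could look like this (a sketch only; MASK_COUNT and the parameter types are assumptions):

#define MASK_COUNT 2 /* assumed number of 32-bit mask words */

/* Returns 1 if the entity has every component the system requires. */
int matchesSystem(const unsigned int entityMask[MASK_COUNT],
                  const unsigned int systemMask[MASK_COUNT])
{
    unsigned int i;
    for(i = 0; i < MASK_COUNT; ++i)
    {
        if((entityMask[i] & systemMask[i]) != systemMask[i])
        {
            return 0;
        }
    }
    return 1;
}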

Source Files

I’ve zipped up the source code here. Main.c runs a sample program that creates three entities and runs each system once. CES.zip

Real-World Problems

This implementation works very well in the small scope of my program and can easily be extended to use more components and systems. It can also be extended to run in a main loop and, with some work, to read entities from data files. This section will tackle some problems with transferring gameplay mechanics and advanced features over to my implementation of CES.

Power-ups and collision filtering

This problem was pitched to me by Krohm in a comment on the original article. He was asking about gameplay-specific behaviours in general, but provided the specific example of a power-up that stopped collisions with a certain entity type. Dynamic components to the rescue! Let’s create a component, say GhostBehaviour, that has a list of qualifiers for determining which entities an object can pass through – for example, a list of component masks, or possibly material indices. Any component can be added or removed (technically, enabled or disabled) from any entity at any time, simply by changing that entity’s component mask. When the player grabs the power-up, the GhostBehaviour component will be added. It could also have a built-in timer to automatically remove itself after a few seconds.

To actually disable the necessary collisions, the typical collision response in a physics engine can be exploited. In most physics engines, there is first a step to detect collisions and produce contacts, and then a step to actually apply the contact forces to each body. Let’s say that each of those steps is implemented in a system, and that there is a component that keeps track of each entity’s collision contacts (Collidable). To permit the desired behaviour, each contact should store the index of the other entity. By injecting a system that operates on a GhostBehaviour and a Collidable in between the two physics steps, the contacts between the entities that should pass through each other can be deleted before they are acted upon by the physics engine. This will have the effect of a disabled collision. The same system can also disable the GhostBehaviour after a few seconds (see the sketch below).

A similar approach can be used to perform a certain action upon collision. There could be a unique system for every action, or the same system could govern all the actions. In any case, the system would read the collision contacts to determine whether the entity collided with some other entity and then act accordingly. In fact, this would be needed to give the player the power-up in the first place.
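Here is a rough sketch of that injected ghost system in the style of the code above. Note that COMPONENT_GHOSTBEHAVIOUR, COMPONENT_COLLIDABLE, the ghostBehaviour and collidable arrays in the World, and the filterContacts helper are all assumptions, not part of the sample program.

#define GHOST_MASK (COMPONENT_GHOSTBEHAVIOUR | COMPONENT_COLLIDABLE)

/* Runs between contact generation and contact resolution. */
void ghostFunction(World *world, float dt)
{
    unsigned int entity;
    for(entity = 0; entity < ENTITY_COUNT; ++entity)
    {
        if((world->mask[entity] & GHOST_MASK) != GHOST_MASK)
        {
            continue;
        }

        /* Assumed helper: delete every contact with an entity this one may pass through. */
        filterContacts(&(world->collidable[entity]), &(world->ghostBehaviour[entity]));

        /* The power-up removes itself once its timer runs out. */
        world->ghostBehaviour[entity].remainingTime -= dt;
        if(world->ghostBehaviour[entity].remainingTime <= 0.0f)
        {
            world->mask[entity] &= ~COMPONENT_GHOSTBEHAVIOUR;
        }
    }
}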

The Big F***ing Spell – Destroying all monsters

Another problem I received was how to kill all monsters with a spell. Thanks to smorgasbord for this one! The key to solving this one is that a system can be run somewhere outside of the top-level main loop, and that any entity that is a monster, according to CES rules, satisfies the same component mask. For example, every entity with both Health and AI is a monster, and this can be described with a component mask.

Remember how a system is just a function and a component mask? Let’s define the “kill all monsters” spell as a system. The function, at its core, is destroyEntity, but it could also create particle effects or play a sound. The component mask of the system can be COMPONENT_HEALTH | COMPONENT_AI (see the sketch below).

In terms of actually casting the spell, I mentioned in the previous article that each entity can have one or more input components, which store boolean or real values that map to various inputs, including AI and networked players. Let’s create a MagicInputComponent that has a boolean value that says when the entity should cast the spell, and an ID corresponding to the spell that should be cast. Each spell has a unique ID, which is actually a key in a lookup table. This lookup table maps a spell ID to the function that “casts” that spell. In this case, the function would run the system.
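As a sketch, the spell-as-a-system could look like this (COMPONENT_HEALTH and COMPONENT_AI are assumed additions to the component enum; destroyEntity is the function shown earlier):

#define MONSTER_MASK (COMPONENT_HEALTH | COMPONENT_AI)

/* The "spell" is just a system run on demand rather than every frame. */
void killAllMonstersFunction(World *world)
{
    unsigned int entity;
    for(entity = 0; entity < ENTITY_COUNT; ++entity)
    {
        if((world->mask[entity] & MONSTER_MASK) == MONSTER_MASK)
        {
            /* Particle effects or sounds could be triggered here as well. */
            destroyEntity(world, entity);
        }
    }
}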

Conclusion

Remember that this is just a sample implementation. It works for the test program, but is probably not ready for a full game. However, I hope that the design principles have been made clear and that it is easy to understand and implement in your own way in a different language. If you have any more problems you’d like me to solve you can write a comment or send me a message. I’ll update this article periodically with my solutions. Keep in touch for the next article!

Article Update Log

29 September 2013 – Initial submission
2 October 2013 – Initial publication; simplified system code (thanks TiagoCosta!); clarified 2nd example (thanks BeerNutts!)
21 October 2013 – Reflected code changes in the source archive

Nomad Game Engine: Part 2 — ECS

Down with inheritance!

This post is part of a series where I’m documenting my experience building an ECS game engine from scratch. Check out the homepage for this project for more posts, information, and source code.

Entity Component System

As mentioned in my last post, the game engine I’m starting to make is going to follow ECS (Entity Component System) methodology. In this blog post I’m going to do my best to explain my implementation of ECS as simply as possible. There are many great resources that have been created by people much smarter than me explaining ECS, so you might be wondering why I’m even bothering to make a post about this.

  1. If I’m going to make a blog series about this engine, I think it’s better to have a holistic approach than to just link to a bunch of other people’s posts about a topic.
  2. There are actually many different ways to implement an ECS engine that vary quite dramatically. By making this post I’m setting ground work for the posts to come.

Entities and Components

ECS follows the principle of composition over inheritance. The following examples should be able to illustrate this concept, but if you’re curious about it, I’d highly recommend checking out this video, as it does a great job of explaining why composition over inheritance is important.

In Nomad, we have Entities, Components, and Systems (ECS). To explain these concepts, I’m going to use this example:

Three entities

In this example, we have three entities, or game objects — the player, the log, and the orb. Here are the game’s requirements:

  1. The player is controlled by the arrow keys
  2. The player and the orb both have a health value (can take damage)
  3. The player can’t walk through the log (but the orb can float over it)

In an ECS architecture, entities are assigned components based on what attributes they have. We can drill down into the requirements above to find that we have 7 basic components:

Our 7 components

This might seem like a lot of components that make the game unnecessarily complex, but we can see that each of the components is actually a very small piece of functionality, which makes it much easier to conceptualize. With these components, let’s take a look at our entities:

Entities and their components

This is the essence of ECS: an entity is simply a collection of components which provide functionality. When done properly, components can be added and removed to add or remove functionality. For example, if I wanted the orb to collide with the log and player as well, I could simply add a “Collision” component to it. If the player had an invisibility cloak, I could simply remove its “Sprite” component. Intuitive, right?

Changing how the game works is as easy as adding or removing a component!

ECS with a data-driven approach

Okay, so we’ve got a bunch of entities that have components assigned to them. How does this actually work behind the scenes? This is where Nomad gets slightly more complex. Let’s take a look at what Nomad thinks an entity is:

struct Entity {
    unsigned int id;
};

Yup, that’s right. It’s just an id. That means no functionality, no implementation. Just data. How about taking a look at one of our components:

struct HealthComponent {
    int currentHealth;
    int maxHealth;
};

Once again, no functionality, just data. With this new knowledge, I should clarify our components:

Data oriented components

You’ll notice that now no functionality is assumed by any of these components; they’re simply bags of data.

So at this point you probably have two main questions:

  1. How are Entities and Components tied together?
  2. Where is the actual code (functionality)?

Component Managers

The answer to the first question is actually very simple. Component managers manage all components of one type and keep references to which entities own them. Here’s how the data is actually organized in Nomad:

Component Managers hold all the components (the number below each component is the entity associated with it)
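A minimal sketch of what such a manager might look like (purely illustrative; this is not Nomad’s actual implementation, and the names are assumptions):

#include <vector>

// One manager per component type: components are stored contiguously,
// and owners[i] is the entity that owns components[i].
template <typename ComponentType>
struct ComponentManager
{
    std::vector<ComponentType> components;
    std::vector<unsigned int>  owners;

    ComponentType& add(unsigned int entityId, const ComponentType& component)
    {
        components.push_back(component);
        owners.push_back(entityId);
        return components.back();
    }
};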

Giving components their own managers as opposed to letting entities own components may seem to be an arbitrary decision, but doing so actually gives a serious performance increase. For a moment, let’s dive into the memory layout of both of these options:

Memory layout for our two possibilities

The most important information to know here is that processors love to iterate over arrays of contiguous data. The less we jump around the computer’s memory, the better.

Let’s use an example to show why the right side is much better performance-wise. Our player (and his trusty companion, the orb) is fighting a boss who decides to throw a bomb that reduces the maximum health of everyone in the area by 20% for the duration of the fight.

The bomb reduces maximum health by 20%

The pseudocode would look like this:

foreach(entity hit by bomb):
    HealthComponent hp = entity.getHealth();
    hp.maxHealth = hp.maxHealth * 0.8;
Memory accesses for updating a single component

If our entities held their own components, we would be jumping in memory from the “player” entity’s memory to the “orb” entity’s memory. In this loop, we are sequentially accessing scattered memory locations, which is not ideal. However, if we hold components contiguously in memory, we’re accessing an array of data in order, which processors love. Obviously, this example is only two components, but for the sake of argument let’s consider that a game might have hundreds of components of a given type. The performance difference between jumping around in memory to update their maximum health and simply running through an array is sizable.
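Using the hypothetical ComponentManager sketched above together with the HealthComponent shown earlier, the same bomb update becomes a straight walk over one packed array (area filtering omitted for brevity):

// Sketch: the bomb walks straight through the contiguous HealthComponent array.
void applyBomb(ComponentManager<HealthComponent>& healthManager)
{
    for (HealthComponent& hp : healthManager.components)
    {
        hp.maxHealth = static_cast<int>(hp.maxHealth * 0.8f);
    }
}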

Systems

Alright, so we’ve covered Entities and Components in reasonable depth. How do we actually add functionality? Where does the game code go?

The answer to that is “systems”. Entities and components are just data containers, and systems are what actually modify that data. In Nomad, a system can specify a set of component types that it wishes to pay attention to. Any entity that has the necessary components will be updated by the system. This might sound confusing, but it should make more sense after an example.

Movement System

The movement system is one of the most basic and necessary systems. If you take a look at the components we listed up above, you’ll notice that we had both a Transform and a Motion component. Here’s what they look like (note that in the game code they look a bit different but this should serve to illustrate the concept):

struct Transform {
    int x;
    int y;
};
struct Motion {
    Vec2 velocity;
    Vec2 acceleration;
};

The movement system is in charge of updating all entities’ positions and velocities every game tick. Therefore, the movement system states that it wants to pay attention to any entities that have both a “Transform” and a “Motion” component. As components are added and removed, the list of entities that the movement system pays attention to will change. Every update, the movement system will run something like this:

void update(int dt){
    for(auto& entity : m_entities){
        // Take references so the components are actually modified in place.
        TransformComponent& position = entity.getTransform();
        MotionComponent& motion = entity.getMotion();
        position.x += motion.velocity.x;
        position.y += motion.velocity.y;
        motion.velocity.x += motion.acceleration.x;
        motion.velocity.y += motion.acceleration.y;
    }
}

Once again let’s think of how memory is traversed in this update() function.

Movement System memory accesses

Notice a couple important things about this chart:

  1. We’re not actually accessing every “transform” component. This is because Entity #2 (The log) doesn’t have a motion component, so the Movement system doesn’t pay attention to it.
  2. Even though we’re skipping the 2nd “transform” component, our memory accesses are still using an array of data, which gives us great performance increases. As we add more entities that move, the performance gains continue to increase.

Back to our example

Let’s take a look at the systems we would need to bring our original example to life (once again, remember that a system will only pay attention to an entity that has *all* of the required components):

The systems we would need to create our game.

Based on these systems, we can see that the player entity would be part of Movement, Player Input, Collision, and Render. The log entity would be part of the Collision system and the Render system, and the orb entity would be part of the Movement, Follow, and Render systems. Note for the astute among you that the collision system would normally need to take motion into account as well, but we’re leaving it out of this example.

Adding new features or functionality is easy: simply add components and systems as needed. Because systems are independent and only deal with a specific subset of components, the game engine has very low coupling, which makes it a lot easier to debug and plan. In addition, the majority of systems don’t actually need to be run in a certain order, so we can have different systems execute on different threads concurrently, significantly boosting our performance.

Here are a couple other implementations that might differ slightly from mine but do a good job of explaining ECS:

Current Progress

Current progress of Nomad.

Most of the boxes you see are for debugging (bounding boxes for collisions, etc.). A couple changes since my last post:

  • Collision detection now uses spatial hashing (the black squares)
  • Sprites are now drawn in z-order (that’s why the player can run both behind and in front of the tree)
  • Added the ability to sword slash
  • Added rotation to the Transform component (both fireball and sword slash use it)

Keep your eye out in the next couple weeks for my next post in the series!

Component Based Engine Design

What is Component Based Design?

Component based engine design was originally pioneered in order to avoid the annoying class hierarchies that inheritance introduces. The idea is to package all functionality of game objects into separate objects. A single game object is just a collection of components, and so the components of a game object define its behavior, appearance and functionality. This is perfectly fine, and there are plenty of resources out there that talk about this topic. However, I’d like to take a step back and start from the top.

It should be noted that the implementation presented here is just one way of going about things, and comes directly from my highly subjective opinion. Perhaps you as a reader can come up with or know of better solutions or designs.

Here is an example game object; note that the game object is generic and simply contains some components:


The Actual Engine

The engine of a game can be thought of as a manager of systems. As to what a system is, we’ll get to that later; for now, think of a system as either Physics, Graphics or AI. The engine ought to have a main loop function, as well as an update function. The update function calls update on all contained systems in a specific order. The main loop function is just a small infinite loop that calls update.

Often times the main loop will deal with timestepping itself. Have a look at the linked article to learn more about proper timestepping.

It is important to have your engine expose the update function, as sometimes your engine will need to be compiled as a static library and linked to externally. In this case the main loop of your simulation may reside outside of the engine library altogether. A common usage I’ve seen for this sort of design choice is when creating an editor of some sort, perhaps a level or content editor. Oftentimes these editors will have a viewport to preview the game, and in order to do so access to some sort of engine update function is necessary.

Beyond containing and calling update, the Engine also forwards global messages to all the systems. More on messaging later.


Singletons?

Creating more than one instance of an engine should never happen. The question of “should I make this a singleton” will sometimes arise. In my experience the answer is often no. Unless you’re working on a large team of programmers where the chances of someone carelessly making an instance of some Engine or System class are high, making things a singleton is just a waste of time. Especially if retrieving data from that singleton incurs a little bit of overhead.


Systems

Each system in the engine corresponds to one type of functionality. This idea is best shown by example, so here are the various systems I usually have in engines I work on:

  • Graphics
  • Physics
  • GameLogic
  • Windowing/Input
  • UI
  • Audio
  • ObjectFactory – Creates objects and components from string or integral ID

Each system’s primary functionality is to operate upon game objects. You can think of a system as a transform function: data is input, modified somehow, and then data is output. The data passed to each system should be a list of game objects (or of components). However a system only updates components on game objects, and the components to be updated are the ones related to that system. For example the graphics system would only update sprite or graphics related components.

Here’s an example header file for a system:
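(The original code listing is not included in this text; the following is an illustrative sketch of what such a system header could look like, with hypothetical names.)

// GraphicsSystem.h -- illustrative sketch only.
#pragma once
#include <vector>

class GameObject;     // forward declarations
class ObjectFactory;

class GraphicsSystem
{
public:
    bool Initialize();
    void Update(float dt);   // called once per frame by the engine
    void Shutdown();

private:
    ObjectFactory* m_Factory;   // grants access to all game objects
};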

The naive approach to a system update would be to pass a list of game objects like so:
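(Again a sketch rather than the original listing; GetGameObjects and m_Factory are assumed names.)

// Naive update: ask the factory for every game object, every frame.
void GraphicsSystem::Update(float dt)
{
    std::vector<GameObject*>& objects = m_Factory->GetGameObjects();

    for (GameObject* object : objects)
    {
        // ... update the graphics-related parts of this object ...
    }
}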

The above code is assuming the ObjectFactory contains all game objects, and can be accessed somehow (perhaps by pointer). This code will work, and it’s exactly what I started with when I wrote my first engine. However you’ll soon realize the folly involved here.


Cache is King

That’s right, those who have the gold make the rules; the golden rule. More seriously, cache is king because of how fast processors are relative to memory access speed. The bottleneck in all engines I have ever seen or touched in the past couple of years has been due to poor memory access patterns. Not a single serious bottleneck was due to computation.

Modern hardware computes very, very fast. Reaching for data in memory (RAM, not just the hard disk) is orders of magnitude slower. So caches come to the rescue. A cache can be thought of, in a simplified sense, as a small chunk of memory right next to the CPU (or GPU). Accessing the cache memory is way faster than going all the way out to main RAM. Whenever memory is fetched from RAM, memory around that RAM location is also placed into the CPU cache. The idea here is that when you retrieve something from RAM, the likelihood of needing to fetch something very nearby is high, so all the data in that area is grabbed at once.

Long story short, if you place things that need to be accessed at around the same time next to each other in memory, huge performance benefits will be reaped. The best performance comes from traversing memory linearly, as if iterating over an array. This means that if we can stick things into arrays and traverse these arrays linearly, there will be no faster form of memory access.

Fetching memory that does not exist in the cache is called a cache miss.


Cache and Components

Let’s revisit the naive approach to updating systems. Assuming a system is handed a generic game object, that system must then retrieve its corresponding component(s) to update, like so:
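(Sketch only; GetComponent and SpriteComponent are hypothetical stand-ins for whatever lookup the naive design uses.)

// Naive update: chase a pointer per object to reach the related component.
void GraphicsSystem::Update(float dt)
{
    for (GameObject* object : m_Factory->GetGameObjects())
    {
        // This lookup likely lands somewhere far away in memory.
        SpriteComponent* sprite = object->GetComponent<SpriteComponent>();
        if (sprite != nullptr)
        {
            sprite->Draw();
        }
    }
}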

As this loop runs, the memory of every game object and every component type that corresponds to the system is touched. A cache miss will likely be incurred over and over as the loop bounces around in memory. This is even worse if the ObjectFactory is just allocating objects with raw new calls, as every access to every object and every component will likely incur a cache miss.

What do? The solution to all these memory access problems is to simplify the data into arrays.


Arrays of GameObjects + Components

I suggest having every game object exist within a single giant array. This array should probably be contained within the ObjectFactory. Usage of std::vector for such a task is recommended. This keeps game objects together in memory, and even though deletion of a game object is of O(n) complexity, that O(n) operation traverses an array, and usually will turn out to be unnoticeable. A custom vector or array class can be created that avoids the O(n) operation entirely by taking the element at the end, and placing it into the deleted slot. This can only be done if references into the array are translated handles (more on handles momentarily).

Every component type should be in a giant array too. Each component array should be stored within their respective systems (but can be “created” from the Factory). Again, an array like data structure would be ideal.

This simple setup allows for linear traversal of most memory in the entire engine, so long as the update function of each system is redesigned slightly. Instead of handing a list of game objects to each system, the system can just iterate over its related components directly, since the components are stored within the systems already.


So, how are Game Objects “handled” now?

Since components have been moved into large arrays, and the game objects themselves are in a big array, what exactly should the relationship between a game object and a component be? In the naive implementation some sort of map would have worked perfectly fine, as the memory location of each component could be anywhere due to the use of new calls. However the relation isn’t so carefree.

Since things are stored in arrays, it’s time to switch from pointer-centric relationships to handle-based relationships. A handle, in its simplest form, can be thought of as an index into an array. Since game objects and components are stored in arrays, it is only natural to access a game object by index. This allows these giant arrays to grow and shrink as necessary without obliterating dangling pointers in the rest of the program.

Here’s a code example:
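(The original listing is not included here; this sketch mirrors the description that follows, with illustrative names.)

// One handle slot per component type; at most one component of each type.
enum ComponentType
{
    COMPONENT_TRANSFORM,
    COMPONENT_SPRITE,
    COMPONENT_RIGIDBODY,
    // ...
    COMPONENT_TYPE_COUNT
};

typedef int Handle;                   // index into a component array
const Handle INVALID_HANDLE = -1;     // "no component of this type"

struct GameObject
{
    Handle components[COMPONENT_TYPE_COUNT];
};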

As you can see, an array of handles is stored to represent the containment of components. There is one slot in the array for each type of component. By design this limits each component to be of unique type within a game object. Each handle is an index into a large array of components. This index is used to retrieve components that correspond to a game object. A special value (perhaps -1, or by some other mechanism) can be used to denote “no component”.

Handles can get quite a bit more versatile than just a plain ol’ integer. I myself created a HandleManager for translating an integer into a pointer. Here’s a great resource for creating your own handle manager.

The idea of translating a handle into a pointer is that once the pointer is used it is not kept around; just let it be reclaimed back into the stack. This means that every time a pointer is required there is a translation from the handle to a single pointer somewhere in memory. This constant translation allows the pointer that the handle translates to to be swapped for another pointer at any time, without fear of leaving dangling pointers behind.


Where does the Code go?

Code for update routines can be put into either systems or components. The choice is entirely yours. A more data-oriented approach would put as much code into systems as possible, and just use components as buckets of data. I myself prefer this approach. However, once you hit game-logic components, virtual functionality is likely to be desired, and so code will likely be attached directly to such components.

The last engine I built used the naive approach to component based design, and it worked wonderfully. I used a block allocator so cache misses weren’t as high as with raw new calls.

The point is, do what makes most sense and keep things simple. If you want to store routines within your components and use virtual function calls, then you’ll probably have trouble storing things in an array unless you place all memory in the base class. If you can externalize as much code from your components as possible, it may be simpler to keep all your components in an array.

There is a tradeoff between flexibility and efficiency. My personal preference is to store performance sensitive components in huge arrays, and keep AI and game logic related things together in memory as much as possible, but not really stress too much about it. Game logic and AI should probably just be rather flexible, and so details about memory locations aren’t so important. One might just allocate these types of components with a block allocator and call it good.


Messaging

The last major devil in an engine is messaging. Messaging is transferring data from one location to another. In this sense the most basic form of messaging is a simple function call.

Taking this a step further, we’d like to be able to send messages over a connection in a generic manner. It should not matter what type of message we send; all messages should be sent the same way to reduce code duplication and complexity. The most basic form of this is dynamic dispatch, or virtual function calls. An ID is introduced to the message so the internal data can be typecasted to the correct type.

Still, we can do better. Let’s imagine we have a sort of GameLogic component or system. We need a way to send a message to this object that contains some data. Let’s not focus much on memory access patterns, as simplicity and flexibility are key here. Take a look at this code:
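(Sketch only; the original listing is not included here. TreasureChest, MSG, the message type ids and the helper methods are all hypothetical names chosen to match the description below.)

struct Vec2 { float x; float y; };

enum MsgType { ePlayerPosition, eActivate /* , ... */ };

struct MSG
{
    MsgType type;
};

struct PlayerPositionMsg : MSG
{
    Vec2 position;
};

// A treasure chest's game-logic object: it reacts to a couple of message
// types and silently ignores everything else.
class TreasureChest
{
public:
    void SendMessage(const MSG* msg)
    {
        switch (msg->type)
        {
        case ePlayerPosition:
        {
            // The type id tells us which concrete message it is safe to cast to.
            const PlayerPositionMsg* posMsg = static_cast<const PlayerPositionMsg*>(msg);
            GlimmerTowards(posMsg->position); // glimmer brighter as the player gets closer
            break;
        }
        case eActivate:
            Open(); // the player pressed "e" nearby
            break;

        default:
            break; // safely ignore all other messages
        }
    }

private:
    void GlimmerTowards(const Vec2& /*playerPosition*/) { /* ... */ }
    void Open() { /* ... */ }
};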

This code highlights the usage of messaging quite well. Say the player emits some messages as it walks around, perhaps something like “I’m here” to all nearby things in a level. The player can blindly send these messages through the SendMessage function without caring about whether or not the receiver will respond or do anything. As you can see, the implementation of the SendMessage function ignores most message types and responds to a few.

In this example when the player nears the treasure chest it will glimmer a bit. Perhaps when the player gets closer it glimmers brighter and brighter. In order to do so, the MSG object sent to the treasure chest ought to contain the player coordinates, and so it can be typecasted to the appropriate message type.

The eActivate message may be emitted by the player whenever they hit the “e” button. Anything that could possibly respond to an eActivate message will do so, and the rest of the objects receiving the message will safely ignore it.

This type of messaging is simple and easy to implement, quite efficient (if a block allocator or stack memory is used for the messages), and rather powerful.

A more advanced version of messaging makes heavy use of C++ introspection. A future article will likely be devoted to this topic, as it’s a hefty topic altogether. Edit: Here’s a link to a slideshow I presented at my university.


Conclusion and Resources

This was a rather whirlwind tour through a lot of different information – I hope it all came out coherent. Please don’t hesitate to ask any questions or add comments; I may even update this post with more information or clarifications!

What is an Entity Component System architecture for game development?

Posted on

Last week I released Ash, an entity component system framework for Actionscript game development, and a number of people have asked me the question “What is an entity component system framework?”. This is my rather long answer.

Entity systems are growing in popularity, with well-known examples like Unity, and lesser-known Actionscript frameworks like Ember2, Xember and my own Ash. There’s a very good reason for this: they simplify game architecture, encourage clean separation of responsibilities in your code, and are fun to use.

In this post I will walk you through how an entity based architecture evolves from the old fashioned game loop. This may take a while. The examples will be in Actionscript because that happens to be what I’m using at the moment, but the architecture applies to any programming language.

Note that the naming of things within this post is based on

  • How they were named as I discovered these architectures over the past twenty years of my game development life
  • How they are usually named in modern entity component system architectures

This is different, for example, to how they are named in Unity, which is an entity architecture but is not an entity component system architecture.

This is based on a presentation I gave at try{harder} in 2011.

The examples

Throughout this post, I’ll be using a simple Asteroids game as an example. I like to use Asteroids as an example because it involves simplified versions of many of the systems required in larger games – rendering, physics, ai, user control of a character, non-player characters.

The game loop

To understand why we use entity systems, you really need to understand the old-fashioned game loop. A game loop for Asteroids might look something like this

function update( time:Number ):void
{
  game.update( time );
  spaceship.updateInputs( time );
  for each( var flyingSaucer:FlyingSaucer in flyingSaucers )
  {
    flyingSaucer.updateAI( time );
  }
  spaceship.update( time );
  for each( var flyingSaucer:FlyingSaucer in flyingSaucers )
  {
    flyingSaucer.update( time );
  }
  for each( var asteroid:Asteroid in asteroids )
  {
    asteroid.update( time );
  }
  for each( var bullet:Bullet in bullets )
  {
    bullet.update( time );
  }
  collisionManager.update( time );
  spaceship.render();
  for each( var flyingSaucer:FlyingSaucer in flyingSaucers )
  {
    flyingSaucer.render();
  }
  for each( var asteroid:Asteroid in asteroids )
  {
    asteroid.render();
  }
  for each( var bullet:Bullet in bullets )
  {
    bullet.render();
  }
}

This game loop is called on a regular interval, usually every 60th of a second or every 30th of a second, to update the game. The order of operations in the loop is important as we update various game objects, check for collisions between them, and then draw them all. Every frame.

This is a very simple game loop. It’s simple because

  1. The game is simple
  2. The game has only one state

In the past, I have worked on console games where the game loop, a single function, was over 3,000 lines of code. It wasn’t pretty, and it wasn’t clever. That’s the way games were built and we had to live with it.

Entity system architecture derives from an attempt to resolve the problems with the game loop. It addresses the game loop as the core of the game, and pre-supposes that simplifying the game loop is more important than anything else in modern game architecture. More important than separation of the view from the controller, for example.

Processes

The first step in this evolution is to think about objects called processes. These are objects that can be initialised, updated on a regular basis, and destroyed. The interface for a process looks something like this.

interface IProcess
{
  function start():Boolean;
  function update( time:Number ):void;
  function end():void;
}

We can simplify our game loop if we break it into a number of processes to handle, for example, rendering, movement, collision resolution. To manage those processes we create a process manager.

class ProcessManager
{
  private var processes:PrioritisedList;

  public function addProcess( process:IProcess, priority:int ):Boolean
  {
    if( process.start() )
    {
      processes.add( process, priority );
      return true;
    }
    return false;
  }

  public function update( time:Number ):void
  {
    for each( var process:IProcess in processes )
    {
      process.update( time );
    }
  }

  public function removeProcess( process:IProcess ):void
  {
    process.end();
    processes.remove( process );
  }
}

This is a somewhat simplified version of a process manager. In particular, we should ensure we update the processes in the correct order (identified by the priority parameter in the add method) and we should handle the situation where a process is removed during the update loop. But you get the idea. If our game loop is broken into multiple processes, then the update method of our process manager is our new game loop and the processes become the core of the game.

The render process

Let’s look at the render process as an example. We could just pull the render code out of the original game loop and place it in a process, giving us something like this

class RenderProcess implements IProcess
{
  public function start() : Boolean
  {
    // initialise render system
    return true;
  }

  public function update( time:Number ):void
  {
    spaceship.render();
    for each( var flyingSaucer:FlyingSaucer in flyingSaucers )
    {
      flyingSaucer.render();
    }
    for each( var asteroid:Asteroid in asteroids )
    {
      asteroid.render();
    }
    for each( var bullet:Bullet in bullets )
    {
      bullet.render();
    }
  }
  
  public function end() : void
  {
    // clean-up render system
  }
}

Using an interface

But this isn’t very efficient. We still have to manually render all the different types of game object. If we have a common interface for all renderable objects, we can simplify matters a lot.

interface IRenderable
{
  function render();
}
class RenderProcess implements IProcess
{
  private var targets:Vector.<IRenderable>;

  public function start() : Boolean
  {
    // initialise render system
    return true;
  }

  public function update( time:Number ):void
  {
    for each( var target:IRenderable in targets )
    {
      target.render();
    }
  }
  
  public function end() : void
  {
    // clean-up render system
  }
}

Then our spaceship class might contain some code like this

class Spaceship implements IRenderable
{
  public var view:DisplayObject;
  public var position:Point;
  public var rotation:Number;

  public function render():void
  {
    view.x = position.x;
    view.y = position.y;
    view.rotation = rotation;
  }
}

This code is based on the flash display list. If we were blitting, or using stage3d, it would be different, but the principles would be the same. We need the image to be rendered, and the position and rotation for rendering it. And the render function does the rendering.

Using a base class and inheritance

In fact, there’s nothing in this code that makes it unique to a spaceship. All the code could be shared by all renderable objects. The only thing that makes them different is which display object is assigned to the view property, and what the position and rotation are. So let’s wrap this in a base class and use inheritance.

class Renderable implements IRenderable
{
  public var view:DisplayObject;
  public var position:Point;
  public var rotation:Number;

  public function render():void
  {
    view.x = position.x;
    view.y = position.y;
    view.rotation = rotation;
  }
}
class Spaceship extends Renderable
{
}

Of course, all renderable items will extend the renderable class, so we get a simple class hierarchy like this

The move process

To understand the next step, we first need to look at another process and the class it works on. So let’s try the move process, which updates the position of the objects.

interface IMoveable
{
  function move( time:Number );
}
class MoveProcess implements IProcess
{
  private var targets:Vector.<IMoveable>;
  
  public function start():Boolean
  {
    return true;
  }

  public function update( time:Number ):void
  {
    for each( var target:IMoveable in targets )
    {
      target.move( time );
    }
  }
  
  public function end():void
  {
  }
}
class Moveable implements IMoveable
{
  public var position:Point;
  public var rotation:Number;
  public var velocity:Point;
  public var angularVelocity:Number;

  public function move( time:Number ):void
  {
    position.x += velocity.x * time;
    position.y += velocity.y * time;
    rotation += angularVelocity * time;
  }
}
class Spaceship extends Moveable
{
}

Multiple inheritance

That’s almost good, but unfortunately we want our spaceship to be both moveable and renderable, and many modern programming languages don’t allow multiple inheritance.

Even in those languages that do permit multiple inheritance, we have the problem that the position and rotation in the Moveable class should be the same as the position and rotation in the Renderable class.

One common solution is to use an inheritance chain, so that Moveable extends Renderable.

class Moveable extends Renderable implements IMoveable
{
  public var velocity:Point;
  public var angularVelocity:Number;

  public function move( time:Number ):void
  {
    position.x += velocity.x * time;
    position.y += velocity.y * time;
    rotation += angularVelocity * time;
  }
}
class Spaceship extends Moveable
{
}

Now the spaceship is both moveable and renderable. We can apply the same principles to the other game objects to get this class hierarchy.

We can even have static objects that just extend Renderable.

Moveable but not Renderable

But what if we want a Moveable object that isn’t Renderable? An invisible game object, for example? Now our class hierarchy breaks down and we need an alternative implementation of the Moveable interface that doesn’t extend Renderable.

class InvisibleMoveable implements IMoveable
{
  public var position:Point;
  public var rotation:Number;
  public var velocity:Point;
  public var angularVelocity:Number;

  public function move( time:Number ):void
  {
    position.x += velocity.x * time;
    position.y += velocity.y * time;
    rotation += angularVelocity * time;
  }
}

In a simple game, this is clumsy but manageable, but in a complex game using inheritance to apply the processes to objects rapidly becomes unmanageable as you’ll soon discover items in your game that don’t fit into a simple linear inheritance tree, as with the force-field above.

Favour composition over inheritance

It’s long been a sound principle of object-oriented programming to favour composition over inheritance. Applying that principle here can rescue us from this potential inheritance mess.

We’ll still need Renderable and Moveable classes, but rather than extending these classes to create the spaceship class, we will create a spaceship class that contains an instance of each of these classes.

class Renderable implements IRenderable
{
  public var view:DisplayObject;
  public var position:Point;
  public var rotation:Number;

  public function render():void
  {
    view.x = position.x;
    view.y = position.y;
    view.rotation = rotation;
  }
}
class Moveable implements IMoveable
{
  public var position:Point;
  public var rotation:Number;
  public var velocity:Point;
  public var angularVelocity:Number;

  public function move( time:Number ):void
  {
    position.x += velocity.x * time;
    position.y += velocity.y * time;
    rotation += angularVelocity * time;
  }
}
class Spaceship
{
  public var renderData:IRenderable;
  public var moveData:IMoveable;
}

This way, we can combine the various behaviours in any way we like without running into inheritance problems.

The objects made by this composition, the Static Object, Spaceship, Flying Saucer, Asteroid, Bullet and Force Field, are collectively called entities.

Our processes remain unchanged.

interface IRenderable
{
  function render();
}
class RenderProcess implements IProcess
{
  private var targets:Vector.<IRenderable>;

  public function update(time:Number):void
  {
    for each(var target:IRenderable in targets)
    {
      target.render();
    }
  }
}
interface IMoveable
{
  function move( time:Number );
}
class MoveProcess implements IProcess
{
  private var targets:Vector.<IMoveable>;

  public function update(time:Number):void
  {
    for each(var target:IMoveable in targets)
    {
      target.move( time );
    }
  }
}

But we don’t add the spaceship entity to each process, we add its components. So when we create the spaceship we do something like this

public function createSpaceship():Spaceship
{
  var spaceship:Spaceship = new Spaceship();
  ...
  renderProcess.addItem( spaceship.renderData );
  moveProcess.addItem( spaceship.moveData );
  ...
  return spaceship;
}

This approach looks good. It gives us the freedom to mix and match process support between different game objects without getting into spaghetti inheritance chains or repeating ourselves. But there’s one problem.

What about the shared data?

The position and rotation properties in the Renderable class instance need to have the same values as the position and rotation properties in the Moveable class instance, since the Move process will change the values in the Moveable instance and the Render process will use the values in the Renderable instance.

class Renderable implements IRenderable
{
  public var view:DisplayObject;
  public var position:Point;
  public var rotation:Number;

  public function render():void
  {
    view.x = position.x;
    view.y = position.y;
    view.rotation = rotation;
  }
}
class Moveable implements IMoveable
{
  public var position:Point;
  public var rotation:Number;
  public var velocity:Point;
  public var angularVelocity:Number;

  public function move( time:Number ):void
  {
    position.x += velocity.x * time;
    position.y += velocity.y * time;
    rotation += angularVelocity * time;
  }
}
class Spaceship
{
  public var renderData:IRenderable;
  public var moveData:IMoveable;
}

To solve this, we need to ensure that both class instances reference the same instances of these properties. In Actionscript that means these properties must be objects, because objects can be passed by reference while primitives are passed by value.

So we introduce another set of classes, which we’ll call components. These components are just value objects that wrap properties into objects for sharing between processes.

class PositionComponent
{
  public var x:Number;
  public var y:Number;
  public var rotation:Number;
}
class VelocityComponent
{
  public var velocityX:Number;
  public var velocityY:Number;
  public var angularVelocity:Number;
}
class DisplayComponent
{
  public var view:DisplayObject;
}
class Renderable implements IRenderable
{
  public var display:DisplayComponent;
  public var position:PositionComponent;

  public function render():void
  {
    display.view.x = position.x;
    display.view.y = position.y;
    display.view.rotation = position.rotation;
  }
}
class Moveable implements IMoveable
{
  public var position:PositionComponent;
  public var velocity:VelocityComponent;

  public function move( time:Number ):void
  {
    position.x += velocity.velocityX * time;
    position.y += velocity.velocityY * time;
    position.rotation += velocity.angularVelocity * time;
  }
}

When we create the spaceship we ensure the Moveable and Renderable instances share the same instance of the PositionComponent.

class Spaceship
{
  public var moveData:Moveable;
  public var renderData:Renderable;

  public function Spaceship()
  {
    moveData = new Moveable();
    renderData = new Renderable();
    moveData.position = new PositionComponent();
    moveData.velocity = new VelocityComponent();
    renderData.position = moveData.position;
    renderData.display = new DisplayComponent();
  }
}

The processes remain unaffected by this change.

A good place to pause

At this point we have a neat separation of tasks. The game loop cycles through the processes, calling the update method on each one. Each process contains a collection of objects that implement the interface it operates on, and will call the appropriate method of those objects. Those objects each do a single important task on their data. Through the system of components, those objects are able to share data and thus the combination of multiple processes can produce complex updates in the game entities, while keeping each process relatively simple.

This architecture is similar to a number of entity systems in game development. The architecture follows good object-oriented principles and it works. But there’s more to come, starting with a moment of madness.

Abandoning good object-oriented practice

The current architecture uses good object-oriented practices like encapsulation and single responsibility – the IRenderable and IMoveable implementations encapsulate the data and logic for single responsibilities in the updating of game entities every frame – and composition – the Spaceship entity is created by combining implementations of the IRenderable and IMoveable interfaces. Through the system of components we ensured that, where appropriate, data is shared between the different data classes of the entities.

The next step in this evolution of entity systems is somewhat counter-intuitive, breaking one of the core tenets of object-oriented programming. We break the encapsulation of the data and logic in the Renderable and Moveable implementations. Specifically, we remove the logic from these classes and place it in the processes instead.

So this

interface IRenderable
{
  function render();
}
class Renderable implements IRenderable
{
  public var display:DisplayComponent;
  public var position:PositionComponent;

  public function render():void
  {
    display.view.x = position.x;
    display.view.y = position.y;
    display.view.rotation = position.rotation;
  }
}
class RenderProcess implements IProcess
{
  private var targets:Vector.<IRenderable>;

  public function update( time:Number ):void
  {
    for each( var target:IRenderable in targets )
    {
      target.render();
    }
  }
}

Becomes this

class RenderData
{
  public var display:DisplayComponent;
  public var position:PositionComponent;
}
class RenderProcess implements IProcess
{
  private var targets:Vector.<RenderData>;

  public function update( time:Number ):void
  {
    for each( var target:RenderData in targets )
    {
      target.display.view.x = target.position.x;
      target.display.view.y = target.position.y;
      target.display.view.rotation = target.position.rotation;
    }
  }
}

And this

interface IMoveable
{
  function move( time:Number );
}
class Moveable implements IMoveable
{
  public var position:PositionComponent;
  public var velocity:VelocityComponent;

  public function move( time:Number ):void
  {
    position.x += velocity.velocityX * time;
    position.y += velocity.velocityY * time;
    position.rotation += velocity.angularVelocity * time;
  }
}
class MoveProcess implements IProcess
{
  private var targets:Vector.<IMoveable>;

  public function update( time:Number ):void
  {
    for each( var target:IMoveable in targets )
    {
      target.move( time );
    }
  }
}

Becomes this

class MoveData
{
  public var position:PositionComponent;
  public var velocity:VelocityComponent;
}
class MoveProcess implements IProcess
{
  private var targets:Vector.<MoveData>;

  public function update( time:Number ):void
  {
    for each( var target:MoveData in targets )
    {
      target.position.x += target.velocity.velocityX * time;
      target.position.y += target.velocity.velocityY * time;
      target.position.rotation += target.velocity.angularVelocity * time;
    }
  }
}

It’s not immediately clear why we’d do this, but bear with me. On the surface, we’ve removed the need for the interface, and we’ve given the process something more important to do – rather than simply delegate its work to the IRenderable or IMoveable implementations, it does the work itself.

The first apparent consequence of this is that all entities must use the same rendering method, since the render code is now in the RenderProcess. But that’s not actually the case. We could, for example, have two processes, RenderMovieClip and RenderBitmap, and they could operate on different sets of entities. So we haven’t lost any flexibility.
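
As a rough illustration (the BitmapRenderData class and the canvas it blits to are invented for this sketch, not part of the example game), a blitting-based render process might look something like this:

class BitmapRenderData
{
  public var bitmap:BitmapData; // the pixels to blit for this entity
  public var position:PositionComponent;
}
class RenderBitmapProcess implements IProcess
{
  private var canvas:BitmapData; // the shared surface everything is blitted onto
  private var targets:Vector.<BitmapRenderData>;

  public function start():Boolean
  {
    return true;
  }

  public function update( time:Number ):void
  {
    for each( var target:BitmapRenderData in targets )
    {
      canvas.copyPixels( target.bitmap, target.bitmap.rect,
        new Point( target.position.x, target.position.y ) );
    }
  }

  public function end():void
  {
  }
}

Entities whose display data is a DisplayObject go to one process, and entities whose display data is a bitmap go to the other; each process only ever sees the data class it understands.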

What we gain is the ability to refactor our entities significantly to produce an architecture with clearer separation and simpler configuration. The refactoring starts with a question.

Do we need the data classes?

Currently, our entity

class Spaceship
{
  public var moveData:MoveData;
  public var renderData:RenderData;
}

Contains two data classes

class MoveData
{
  public var position:PositionComponent;
  public var velocity:VelocityComponent;
}
class RenderData
{
  public var display:DisplayComponent;
  public var position:PositionComponent;
}

These data classes in turn contain three components

class PositionComponent
{
  public var x:Number;
  public var y:Number;
  public var rotation:Number;
}
class VelocityComponent
{
  public var velocityX:Number;
  public var velocityY:Number;
  public var angularVelocity:Number;
}
class DisplayComponent
{
  public var view:DisplayObject;
}

And the data classes are used by the two processes

class MoveProcess implements IProcess
{
  private var targets:Vector.<MoveData>;

  public function update( time:Number ):void
  {
    for each( var target:MoveData in targets )
    {
      target.position.x += target.velocity.velocityX * time;
      target.position.y += target.velocity.velocityY * time;
      target.position.rotation += target.velocity.angularVelocity * time;
    }
  }
}
class RenderProcess implements IProcess
{
  private var targets:Vector.<RenderData>;

  public function update( time:Number ):void
  {
    for each( var target:RenderData in targets )
    {
      target.display.view.x = target.position.x;
      target.display.view.y = target.position.y;
      target.display.view.rotation = target.position.rotation;
    }
  }
}

But the entity shouldn’t care about the data classes. The components collectively contain the state of the entity. The data classes exist for the convenience of the processes. So we refactor the code so the spaceship entity contains the components rather than the data classes.

class Spaceship
{
  public var position:PositionComponent;
  public var velocity:VelocityComponent;
  public var display:DisplayComponent;
}
class PositionComponent
{
  public var x:Number;
  public var y:Number;
  public var rotation:Number;
}
class VelocityComponent
{
  public var velocityX:Number;
  public var velocityY:Number;
  public var angularVelocity:Number;
}
class DisplayComponent
{
  public var view:DisplayObject;
}

By removing the data classes, and using the constituent components instead to define the spaceship, we have removed any need for the spaceship entity to know what processes may act on it. The spaceship now contains the components that define its state. Any requirement to combine these components into other data classes for the processes is some other class’s responsibility.

Systems and Nodes

Some core code within the entity system framework (which we’ll get to in a minute) will dynamically create these data objects as they are required by the processes. In this reduced context, the data classes will be mere nodes in the collections (arrays, linked-lists, or otherwise, depending on the implementation) used by the processes. So to clarify this we’ll rename them as nodes.

class MoveNode
{
  public var position:PositionComponent;
  public var velocity:VelocityComponent;
}
class RenderNode
{
  public var display:DisplayComponent;
  public var position:PositionComponent;
}

The processes are unchanged, but in keeping with the more common naming I’ll also change their name and call them systems.

class MoveSystem implements ISystem
{
  private var targets:Vector.<MoveNode>;

  public function start():void
  {
  }

  public function update( time:Number ):void
  {
    for each( var target:MoveNode in targets )
    {
      target.position.x += target.velocity.velocityX * time;
      target.position.y += target.velocity.velocityY * time;
      target.position.rotation += target.velocity.angularVelocity * time;
    }
  }

  public function end():void
  {
  }
}
class RenderSystem implements ISystem
{
  private var targets:Vector.<RenderNode>;

  public function start():void
  {
  }

  public function update( time:Number ):void
  {
    for each( var target:RenderNode in targets )
    {
      target.display.view.x = target.position.x;
      target.display.view.y = target.position.y;
      target.display.view.rotation = target.position.rotation;
    }
  }

  public function end():void
  {
  }
}
interface ISystem
{
  function start():void;
  function update( time:Number ):void;
  function end():void;
}

And what is an entity?

One last change – there’s nothing special about the Spaceship class. It’s just a container for components. So we’ll just call it Entity and give it a collection of components. We’ll access those components based on their class type.

class Entity
{
  private var components : Dictionary;

  public function Entity()
  {
    components = new Dictionary();
  }

  public function add( component:Object ):void
  {
    var componentClass : Class = component.constructor;
    components[ componentClass ] = component;
  }

  public function remove( componentClass:Class ):void
  {
    delete components[ componentClass ];
  }

  public function get( componentClass:Class ):Object
  {
    return components[ componentClass ];
  }
}

So we’ll create our spaceship like this

public function createSpaceship():void
{
  var spaceship:Entity = new Entity();
  var position:PositionComponent = new PositionComponent();
  position.x = Stage.stageWidth / 2;
  position.y = Stage.stageHeight / 2;
  position.rotation = 0;
  spaceship.add( position );
  var display:DisplayComponent = new DisplayComponent();
  display.view = new SpaceshipImage();
  spaceship.add( display );
  engine.addEntity( spaceship );
}
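
For comparison, here is a similar factory for an asteroid (a sketch only: AsteroidImage is an invented asset class, like SpaceshipImage above, and the velocity values are arbitrary). Because this entity also gets a VelocityComponent, the MoveSystem will pick it up as well as the RenderSystem.

public function createAsteroid( x:Number, y:Number ):void
{
  var asteroid:Entity = new Entity();
  var position:PositionComponent = new PositionComponent();
  position.x = x;
  position.y = y;
  position.rotation = 0;
  asteroid.add( position );
  var velocity:VelocityComponent = new VelocityComponent();
  velocity.velocityX = Math.random() * 100 - 50;
  velocity.velocityY = Math.random() * 100 - 50;
  velocity.angularVelocity = Math.random() * 2 - 1;
  asteroid.add( velocity );
  var display:DisplayComponent = new DisplayComponent();
  display.view = new AsteroidImage(); // hypothetical display asset
  asteroid.add( display );
  engine.addEntity( asteroid );
}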

The core Engine class

We mustn’t forget the system manager, formerly called the process manager.

class SystemManager
{
  private var systems:PrioritisedList;

  public function addSystem( system:ISystem, priority:int ):void
  {
    systems.add( system, priority );
    system.start();
  }

  public function update( time:Number ):void
  {
    for each( var system:ISystem in systems )
    {
      system.update( time );
    }
  }

  public function removeSystem( system:ISystem ):void
  {
    system.end();
    systems.remove( system );
  }
}

This will be enhanced and will sit at the heart of our entity component system framework. We’ll add to it the functionality mentioned above to dynamically create nodes for the systems.

The entities only care about components, and the systems only care about nodes. So to complete the entity component system framework, we need code to watch the entities and, as they change, add and remove their components to the node collections used by the systems. Because this is the one bit of code that knows about both entities and systems, we might consider it central to the game. In Ash, I call this the Engine class, and it is an enhanced version of the system manager.

Every entity and every system is added to and removed from the Engine class when you start using it and stop using it. The Engine class keeps track of the components on the entities and creates and destroys nodes as necessary, adding those nodes to the node collections. The Engine class also provides a way for the systems to get the collections they require.

public class Engine
{
  private var entities:EntityList;
  private var systems:SystemList;
  private var nodeLists:Dictionary;

  public function addEntity( entity:Entity ):void
  {
    entities.add( entity );
    // create nodes from this entity's components and add them to node lists
    // also watch for later addition and removal of components from the entity so
    // you can adjust its derived nodes accordingly
  }

  public function removeEntity( entity:Entity ):void
  {
    // destroy nodes containing this entity's components
    // and remove them from the node lists
    entities.remove( entity );
  }

  public function addSystem( system:ISystem, priority:int ):void
  {
    systems.add( system, priority );
    system.start();
  }

  public function removeSystem( system:ISystem ):void
  {
    system.end();
    systems.remove( system );
  }

  public function getNodeList( nodeClass:Class ):NodeList
  {
    var nodes:NodeList = new NodeList();
    nodeLists[ nodeClass ] = nodes;
    // create the nodes from the current set of entities
    // and populate the node list
    return nodes;
  }

  public function update( time:Number ):void
  {
    for each( var system:ISystem in systems )
    {
      system.update( time );
    }
  }
}
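
To make those commented steps a little more concrete, here is a hand-rolled sketch of the node-creation step for a single node class, written as if it were a private helper inside the Engine. A real framework would generalise this (for example by inspecting the node class's properties) rather than writing one such method per node type; this is illustration only, not Ash's actual code.

private function createMoveNode( entity:Entity ):MoveNode
{
  var position:PositionComponent = entity.get( PositionComponent ) as PositionComponent;
  var velocity:VelocityComponent = entity.get( VelocityComponent ) as VelocityComponent;
  if( !position || !velocity )
  {
    return null; // the entity lacks a component the MoveSystem needs, so no node
  }
  var node:MoveNode = new MoveNode();
  node.position = position;
  node.velocity = velocity;
  return node;
}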

To see one implementation of this architecture, check out the Ash entity system framework, and see the example Asteroids implementation there too.

A step further

In Actionscript, the Node and Entity classes are necessary for efficiently managing the Components and passing them to the Systems. But note that these classes are just glue, the game is defined in the Systems and the Components. The Entity class provides a means to find and manage the components for each entity and the Node classes provide a means to group components into collections for use in the Systems. In other languages and runtime environments it may be more efficient to manage this glue differently.

For example, in a large server-based game we might store the components in a database – they are just data after all – with each record (i.e. each component) having a field for the unique id of the entity it belongs to along with fields for the other component data. Then we pull the components for an entity directly from the database when needed, using the entity id to find them, and we create collections of data for the systems to operate on by doing joined queries across the appropriate tables. For example, for the move system we would pull records from the position components table and the movement components table where entity ids match and a record exists in both tables (i.e. the entity has both a position and a movement component). In this instance the Entity and Node classes are not required and the only presence for the entity is the unique id that is used in the data tables.

Similarly, if you have control over the memory allocation for your game, it is often more efficient to take a similar approach for local game code too, creating components in native arrays of data and looking up the components for an entity based on an id. Some aspects of the game code become more complex and slower (e.g. finding the components for a specific entity) but others become much faster (e.g. iterating through the component data collections inside a system) because the data is efficiently laid out in memory to minimise cache misses and maximise speed.
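
As a rough sketch of that idea (illustrative only; this is not how Ash is implemented), imagine integer entity ids used as indices into parallel arrays of component data:

class PositionData
{
  // one slot per entity id, laid out contiguously in memory
  public var x:Vector.<Number> = new Vector.<Number>();
  public var y:Vector.<Number> = new Vector.<Number>();
  public var rotation:Vector.<Number> = new Vector.<Number>();
}
class VelocityData
{
  public var velocityX:Vector.<Number> = new Vector.<Number>();
  public var velocityY:Vector.<Number> = new Vector.<Number>();
  public var angularVelocity:Vector.<Number> = new Vector.<Number>();
}
class MoveSystemOverArrays
{
  public var position:PositionData;
  public var velocity:VelocityData;
  public var entityIds:Vector.<int>; // ids of the entities that have both components

  public function update( time:Number ):void
  {
    for each( var id:int in entityIds )
    {
      position.x[ id ] += velocity.velocityX[ id ] * time;
      position.y[ id ] += velocity.velocityY[ id ] * time;
      position.rotation[ id ] += velocity.angularVelocity[ id ] * time;
    }
  }
}

Finding all the data for one entity now means indexing into several arrays, but each system streams through tightly packed data.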

The important elements of this architecture are the components and the systems. Everything else is configuration and glue. And note that components are data and systems are functions, so we don’t even need object oriented code to do this.

Conclusion

So, to summarise, entity component systems originate from a desire to simplify the game loop. From that comes an architecture of components, which represent the state of the game, and systems, which operate on the state of the game. Systems are updated every frame – this is the game loop. Components are combined into entities, and systems operate on the entities that have all the components they are interested in. The engine monitors the systems and the components and ensures each system has access to a collection of all the components it needs.

An entity component system framework like Ash provides the basic scaffolding and core management for this architecture, without providing any actual component or system classes. You create your game by creating the appropriate components and systems.

An entity component system game engine will provide many standard systems and components on top of the basic framework.

Three entity component system frameworks for Actionscript are my own Ash, Ember2 by Tom Davies and Xember by Alec McEachran. Artemis is an entity system framework for Java, that has also been ported to C#.

My next post covers some of the reasons why I like using an entity system framework for my game development projects.


Why use an Entity Component System architecture for game development?


Following my previous post on entity systems for game development I received a number of good questions from developers. I answered many of them directly to the questioners, but one question stands out. It’s all very well explaining what an entity component system framework is, and even building one, but why would you want to use one? In particular, why use the later component/system architecture I described, and that I implement in Ash, rather than the earlier object-oriented entity architecture used, for example, in PushButton Engine?

So, in this post I will look more at what the final architecture is, what it gives us that other architectures don’t, and why I personally prefer this architecture.

The core of the architecture

First, it’s worth noting that the core of this architecture is the components and the systems. Components are value-objects that contain the state of the game, and systems are the logic that operates on that state, changing it as the game progresses. The other elements of the architecture are purely incidental, designed to make life easier.

The entity object is present to collect related components together. The relation between components is encapsulated by the concept of an entity and that is vital to the architecture, but it isn’t necessary to use an explicit Entity object for this. There are some frameworks that simply use an id to represent the entity, adding the entity id to every component to indicate which entity it belongs to.

As a concept, the entity is vital to the architecture. But as a code construct it is entirely optional. I include it because, when using an object-oriented language, the entity object makes life easier. In particular, it enables us to create the methods that operate on the entity as methods of an entity object and to track and manage the entity through this object, removing the need to track ids throughout the code.

While the concept of an entity is vital to the architecture, the concept of node objects is entirely incidental. The nodes are used as the objects in the collections of components supplied to the systems. We could instead provide each system with a collection of the relevant entities and let the systems pull the components they want to operate on out of the entities, using the get() method of the entity.

In Ash, the nodes serve two purposes. First they enable us to use a more efficient data structure in which the node objects that the systems receive are nodes in a linked list. This improves the execution speed of the framework.

Second, using the node objects enables us to apply strong, static typing throughout our code. The method to fetch a component from an entity necessarily returns an untyped object, which must then be cast to the correct component type for use in the game code. The properties on the node are already statically typed to the components’ data types, so no casting is necessary.
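
To make the difference concrete, here is a small illustrative sketch contrasting the two approaches, reusing the Entity, component and node classes from the previous post:

// Pulling components straight out of entities: get() returns an untyped
// Object, so every component must be cast before use.
function moveViaEntities( entities:Vector.<Entity>, time:Number ):void
{
  for each( var entity:Entity in entities )
  {
    var position:PositionComponent = entity.get( PositionComponent ) as PositionComponent;
    var velocity:VelocityComponent = entity.get( VelocityComponent ) as VelocityComponent;
    if( !position || !velocity )
    {
      continue; // this entity doesn't have the data we need
    }
    position.x += velocity.velocityX * time;
    position.y += velocity.velocityY * time;
  }
}

// Working through nodes: the properties are already statically typed,
// so there is nothing to cast and the compiler can check the code.
function moveViaNodes( nodes:Vector.<MoveNode>, time:Number ):void
{
  for each( var node:MoveNode in nodes )
  {
    node.position.x += node.velocity.velocityX * time;
    node.position.y += node.velocity.velocityY * time;
  }
}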

So, fundamentally, the entity architecture is about components and systems.

This is not object-oriented programming

We can build our entity architecture using an object-oriented language but, on a fundamental level, this is not object-oriented programming. The architecture is not about objects, it’s about data (components) and sub-routines that operate on that data (systems).

For many object-oriented programmers this is the hardest part of working with an entity system framework. Our tendency is to fall back to what we know and as an object-oriented programmer using an object-oriented language that means encapsulating data and operations together into objects. If you do this with a framework like Ash you will fail.

Data-Oriented Programming

Games tend to be about lots of fast changing state, with players, non-player characters, game objects like bullets and lasers, bats and balls, tables and chairs, and levels, scores, lives and more all having state that might include position, rotation, speed, acceleration, weight, colour, intention, goals, desires, friendships, enemies and more.

The state of the game can be encapsulated in this large mass of constantly changing data, and on a technical level the game is entirely about what this data is and how this data changes.

In a game a single little piece of this data may have many operations acting on it. Take, for example, a player character that has a position property that represents the character’s position in the game world. This single piece of data may be used by

  • The render system, to draw the player in the world.
  • The camera system, to position the camera relative to the player.
  • The AI systems of all non-player characters, to decide how they should react to the player.
  • The input system, which alters the player’s position based on user input.
  • The physics system, which alters the player’s position based on the physics of the game world.
  • The collision system, which tests whether the player is colliding with other objects and resolves those collisions.

and probably many more systems besides. If we try to build our game using objects that encapsulate data with the operations that act on that data, then we will build dependencies between all these different systems as they all want to be encapsulated with the player’s position data. This can’t be done unless we code the game as one single, massive class, so inevitably we break some parts of the game into separate systems and provide data to those systems – the physics system, the graphics system – while including other elements of the game logic within the game objects themselves.

An entity architecture based on components and systems takes the idea of discrete systems to its logical conclusion. All operations are programmed as independent systems, and all game state is stored separately in a set of data components, which are provided to the systems according to their need.

The systems are decoupled from each other. Each system knows only about itself and the data it operates on. It knows and cares nothing at all about the other systems and how they may be affected by, or use, the data before or after this system gets to work with it.

Also, by embracing the system as the core logic of the architecture, we are encouraged to make many smaller and simpler systems rather than a few large complex ones, which again leads to simpler code and looser coupling.

This decoupling makes building your game much easier. It is why I enjoy working with this form of entity system so much, and why I built Ash.

Storing the game state

Another benefit of the component/system architecture is apparent when you want to save and restore the game state. Because the game state is contained entirely in the components, and because these are simple value objects, saving the game state is a relatively simple matter of serialising out the components, and restoring the game state involves just deserialising the data back in again.

In most cases, serialising a value-object is straightforward, and one could simply json-encode each component, with additional data to indicate its entity owner (an id) and its component type (a string), to save the game state.
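
As a minimal sketch of that idea (it assumes we have given each entity an integer id and each component class a string name; neither is part of Ash):

function saveComponent( entityId:int, componentType:String, component:Object ):String
{
  // the component is a plain value object, so JSON-encoding it captures its whole state
  return JSON.stringify( { entity: entityId, type: componentType, data: component } );
}

// e.g. saveComponent( 7, "position", spaceshipPosition ) might produce
// {"entity":7,"type":"position","data":{"x":120,"y":80,"rotation":0}}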

Adam Martin has written about comparing components in an entity system framework to data in a relational database (there’s lots of interesting entity-related material on Adam’s blog), emphasising that conversion between a relational database for long-term storage and components for gameplay doesn’t require any object/relational mapping, because components are simple copies of the relational database’s data structure rather than complex objects.

This leads further to the conclusion that a component/system architecture is ideal for an MMO game, since state will be stored in a relational database on the game servers, and much of the processing of that state will occur on the servers, where using a set of discrete, independent systems to process the data as the game unfolds is an excellent fit to both the data storage requirements of the state and the parallelism available on the servers.

Concurrency

Indeed, a component/system architecture is well suited to applying concurrency to a game. In most games, some of the systems are entirely independent of each other, including being independent of the order in which they are applied. This makes it easy to run these systems in parallel.

Further, most systems consist of a loop in which all nodes are processed sequentially. In many cases, the loop can be parallelised since the nodes can be updated independently of each other.

This gives us two places in the code where concurrency can be applied without altering either the core logic of the game (which lives inside the loops in the systems) or the core state of the game (which lives in the components).

This makes adding concurrency to the game relatively simple.

We don’t need object-orientation

Finally, because the component/system architecture is not object-oriented, it lends itself to other programming languages that implement different programming paradigms like functional programming and procedural programming. While I created Ash as an Actionscript framework, this architecture would be well suited to Javascript for client side development or any of the many functional languages used for highly concurrent server side development.

Update: In-game editors

Tom Davies has pointed out that a very valuable benefit to him is how easy it is to create an in-game level editor when developing with an entity system framework like Ember or Ash. You can see his example here.

The complete separation of the game state and the game logic in an entity system framework makes it easy to create an editor that lets you alter the state (configuration, level design, AI, etc.) while playing the game. Add to this the easier saving and loading of state and you have a framework that is very well suited to in-game editing.

How to Build an Entity Component System Game in Javascript

Game Development
Posted on Aug 3rd, 2014

Rectangle Eater

View the source code or play Rectangle Eater

Creating and manipulating abstractions is the essence of programming. There is no “correct” abstraction to solve a problem, but some abstractions are better suited to certain problems than others. Class-based object-oriented programming (OOP) is the most widely used paradigm for organizing programs. There are others. Prototype-based languages, like Javascript, provide a different way of thinking about how to solve problems. Functional programming provides yet another completely different way to think about and solve problems. Programming languages are just one area where a different mindset can help solve problems better.

Even within a class based or prototype based language, many methods exist for structuring code. One approach I’ve grown to love is a more data driven approach to code. One such technique is Entity-Component-System (ECS). While it is a general architecture pattern that could be applied to many domains, the predominant uses of it are in game development. In this post, I’ll cover the basic topics of ECS and we’ll build a basic HTML5 game about eating rectangles – oh-so creatively called “Rectangle Eater”.

Entity-Component-System

Discovering Entity Component System (ECS) was an “ah-hah” moment for me. With ECS, entities are just collections of components; just a collection of data.

  • Entity: An entity is just an ID
  • Component: Components are just data.
  • System: Logic that runs on every entity that has the components the system requires. For example, a “Renderer” system would draw all entities that have an “appearance” component.

This approach avoids gnarly inheritance chains. A half-orc, for example, isn’t some amalgamation of a human class and an orc class (which might inherit from a Monster class); it’s just a grouping of data.

An entity is just like a record in a database. The components are the actual data. Here’s a high level example of what the data might look like for entities, shown by ID and components. The beauty of this system is that you can dynamically build entities – an entity can have whatever components (data) you want.


|         | component-health  | component-position |  component-appearance |
|---------|-------------------|--------------------|-----------------------|
|entity1  | 100               | {x: 0, y: 0}       | {color: green}        |
|entity2  |                   | {x: 0, y: 0}       |                       |
|entity3  |                   |                    | {color: blue}         |

Dynamic Data Driven Design

Everything is tagged as an entity. A bullet, for instance, might just have a “physics” and “appearance” component. Entity Component System is data driven. This approach allows greater flexibility and more expression. One benefit is the ability to dynamically add and remove components, even at run time. You could dynamically remove the appearance component to make invisible bullets, or add a “playerControllable” component to allow the bullet to be controlled by the player. No new classes required.

This can potentially be a problem as systems have to iterate through all entities. Of course, it’s not terribly difficult to optimize and structure code so not all entities are hit each iteration if you have too many, but it’s helpful to keep this constraint in mind, especially for browser based games.

Assemblages

One benefit of a Class based approach is the ability to easily create multiple objects of the same type. If I want one hundred orcs, I can just create a hundred orc objects and know what properties they’ll all have. This can be accomplished with ECS through an abstraction called an assemblage, which is just a way to easily create entities that have some grouping of components. For instance, a Human assemblage might contain “position”, “name”, “health”, and “appearance” components. A Sword assemblage might just have “appearance” and “name”.

One benefit this provides over normal Class inheritance is the ability to easily add on (or remove) components from assemblages. Since it’s data driven, you can manipulate and change them programmatically based on whatever parameters you desire. Maybe you want to create a ton of humans but have some of them be invisible – no need for a new class, just remove the “appearance” component from that entity.

Code

This is not an attempt to build out a robust ECS library. This is designed to be an overview of Entity Component System implemented in Javascript. It’s not the best or most optimized way to do it; but it can provide a foundation for a concrete understanding of how everything fits together. All code can be found on github.

Entity

The abstraction is that an entity is just an ID; a container of components. Let’s start by creating a function from which we can create entities. Each entity will have just an id and a components property. (Note: the following code expects a global ECS object to exist, which looks like var ECS = {};)

ECS.Entity = function Entity(){
    // Generate a pseudo random ID
    this.id = (+new Date()).toString(16) + 
        (Math.random() * 100000000 | 0).toString(16) +
        ECS.Entity.prototype._count;

    // increment counter
    ECS.Entity.prototype._count++;

    // The component data will live in this object
    this.components = {};

    return this;
};
// keep track of entities created
ECS.Entity.prototype._count = 0;

ECS.Entity.prototype.addComponent = function addComponent ( component ){
    // Add component data to the entity
    // NOTE: The component must have a name property (which is defined on
    // the prototype of the component function)
    this.components[component.name] = component;
    return this;
};
ECS.Entity.prototype.removeComponent = function removeComponent ( componentName ){
    // Remove component data by removing the reference to it.
    // Allows either a component function or a string of a component name to be
    // passed in
    var name = componentName; // assume a string was passed in

    if(typeof componentName === 'function'){ 
        // get the name from the prototype of the passed component function
        name = componentName.prototype.name;
    }

    // Remove component data by removing the reference to it
    delete this.components[name];
    return this;
};

ECS.Entity.prototype.print = function print () {
    // Function to print / log information about the entity
    console.log(JSON.stringify(this, null, 4));
    return this;
};

View Source

To create a new entity, we’d simply call it like: var entity = new ECS.Entity();.

There’s not a lot going on in the code here. First, in the function itself, we generate an ID based on the current time, a call to Math.random(), and a counter based on the total number of created entities. This ensures we get a unique ID for each entity. We increment the counter (prototype properties are shared across all object instances; sort of similar to a class variable). Then, we create an empty object to stick components (the data) in.

We expose an addComponent and removeComponent function on the prototype (again, single functions in memory shared across all object instances). addComponent takes in a component object and adds it to the entity, while removeComponent accepts either a component function or a component name string and removes that component. Lastly, the print method simply JSON-ifies the entity, providing all the data. We could use this to dump out and reload data later (e.g., saving).

So, at the core, an Entity is little more than an object with some data properties. We’ll cover how to create multiple Entities soon, and where assemblages fit in.

Component

Here’s where the data part of data driven programming kicks in. I’ve structured components similarly to entities; for example:

ECS.Components.Health = function ComponentHealth ( value ){
    value = value || 20;
    this.value = value;

    return this;
};
ECS.Components.Health.prototype.name = 'health';

To get a Health component, you’d simply create it with new ECS.Components.Health( VALUE ) where VALUE is an optional starting value (20 if nothing is passed in). Importantly, there is a name property on the prototype which tells the Entity what to call the component. For example, to create an entity then give it a health component:

var entity = new ECS.Entity();
entity.addComponent( new ECS.Components.Health() );

That’s all that is required to add a component to an entity. If we printed the entity out now (entity.print();), we’d see something like:

{
    "id": "1479f3d15bd4bf98f938300430178",
    "components": {
        "health": {
            "value": 20
        }
    }
} 

That’s it – it’s just data! We could change the entity’s health by modifying it directly, e.g., entity.components.health.value = 40; We can have any kind of data nesting we want; for example, if we created a position component with x and y data values, we’d get as output:


{
    "id": "1479f3d15bd4bf98f938300430178",
    "components": {
        "health": {
            "value": 20
        },
        "position": {
            "x": 426,
            "y": 98
        }
    }
} 

To keep this post manageable, here’s all the code for all the components used in the game.

Since components are just data, they don’t have any logic. (Depending on what works for you, you could add some prototype functions to components to aid in data calculations, but it’s helpful to view components as just data.) So we have a bunch of data now, but to do anything interesting we need to run operations on it. That’s where Systems come in.

System

Systems run your game’s logic. They take in entities and run operations on entities that have specific components the system requires. This way of thinking is a bit inverted from typical Class based programming.

In Class based programming, to model a cat, a Cat Class would exist. You’d create cat objects and to get the cat to meow, you’d call the speak() method. The functionality lives inside of the object. The object is not just data, but also functionality.

With ECS, to model a Cat you’d first create an entity. Then, you’d add some components that cats have (size, name, etc.). If you wanted the entity to be able to meow, maybe you’d give it a speak component with a value of “meow”. The distinction here though is that this is just data – maybe it looks like:

{
    "id": "f279f3d85bd4bf98f938300430178",
    "components": {
        "speak": {
            "sound": "meeeooowww"
        }
    }
}

The entity can do nothing by itself. So, to get a “cat” entity to speak, you’d use a speak System. (Note: the component name and system name do not have to be 1:1, this is just an example. Most systems use multiple different components). The system would look for all entities that have a speak component, then run some logic – plugging in the entity’s data.

The functionality happens in Systems, not on the objects themselves. You’ll have many different systems that are tailored for your game. Systems are where your main game logic lives. For our rectangle eating game, we only need a few systems: collision, decay, render, and userInput.

The way I’ve structured systems is to take in all entities (here, the entities are an object of key:value pairs of entityId: entityObject). Let’s take a look at a snippet of the render system. Note that the systems are just functions that take in entities.

ECS.systems.render = function systemRender ( entities ) {
    // Here, we've implemented systems as functions which take in all the
    // entities (an object keyed by entity id). An optimization would be to
    // have some layer which only feeds in relevant entities to the system,
    // but for demo purposes we'll assume all entities are passed in and
    // iterate over them.

    // This happens each tick, so we need to clear out the previous rendered
    // state
    clearCanvas();

    var curEntity, fillStyle; 

    // iterate over all entities
    for( var entityId in entities ){
        curEntity = entities[entityId];

        // Only run logic if entity has relevant components
        //
        // For rendering, we need appearance and position. Your own render 
        // system would use whatever other components specific for your game
        if( curEntity.components.appearance && curEntity.components.position ){

            // Build up the fill style based on the entity's color data
            fillStyle = 'rgba(' + [
                curEntity.components.appearance.colors.r,
                curEntity.components.appearance.colors.g,
                curEntity.components.appearance.colors.b
            ];

            if(!curEntity.components.collision){
                // If the entity does not have a collision component, give it 
                // some transparency
                fillStyle += ',0.1)';
            } else {
                // Has a collision component
                fillStyle += ',1)';
            }

            ECS.context.fillStyle = fillStyle;

            // Color big squares differently
            if(!curEntity.components.playerControlled &&
            curEntity.components.appearance.size > 12){
                ECS.context.fillStyle = 'rgba(0,0,0,0.8)';
            }

            // draw a little black line around every rect
            ECS.context.strokeStyle = 'rgba(0,0,0,1)';

            // draw the rect
            ECS.context.fillRect( 
                curEntity.components.position.x - curEntity.components.appearance.size,
                curEntity.components.position.y - curEntity.components.appearance.size,
                curEntity.components.appearance.size * 2,
                curEntity.components.appearance.size * 2
            );
            // stroke it
            ECS.context.strokeRect(
                curEntity.components.position.x - curEntity.components.appearance.size,
                curEntity.components.position.y - curEntity.components.appearance.size,
                curEntity.components.appearance.size * 2,
                curEntity.components.appearance.size * 2
            );
        }
    }
};

The logic here is simple. First, we clear the canvas before doing anything. Then, we iterate over all entities. This system renders entities, but we only care about entities that have an appearance and position. In our game, all entities have these components – but if we wanted to create invisible rectangles that the player could interact with, all we’d have to do is remove the appearance component. So, after we’ve found the entities which contain the relevant data for the system, we can do operations on them.

In this system, we just render the entity based on the colors properties in the appearance component. One benefit too is that we don’t have to set all the appearance properties here – we might set some in the collision system, or in the health system, or in the decay system; we have complete flexibility over what roles we want to assign to each system. Because the systems are driven by data, we don’t have to limit our thinking to just “methods on classes and objects.” We can have as many systems as we want, as complex or simple as we want, that target whatever kinds of entities we want.

Overview of Rectangle Eater’s systems:

  • collision: Handles collision, updating the data for health on collision, and removing / adding new entities (rectangles). Bulk of the game’s logic.
  • decay: Handles rectangles getting small and losing health. Any entity with a health component (e.g., most rectangles and the player controlled rectangle) will be affected. This is where a lot of the “fun” configuration happens. If the rectangle decays and goes away too fast then the game is too hard – if it’s too slow, it’s not fun.
  • render: Handles rendering entities based on their appearance component.
  • userInput: Allows the player to move around entities that have a PlayerControlled component.

In the collision system of our game, we check for collisions between the user controlled entity (specified by a PlayerControlled component and handled via the userInput system) and all other entities with a collision component. If a collision occurs, we update the entity’s health (via the health component) and remove the collided entity immediately (the systems are data driven – there’s no problem dynamically adding or removing entities on a per-system level). Finally, we randomly add some new rectangles; most of these will decay over time, and when they get smaller they give you more health when you collide with them – something else we check for in this collision system.

As in a normal game loop, the order in which the systems get called is also important. If you render before the decay system is called, for example, you’ll always be a tick behind.

Gluing it all together

The final step involves connecting all the pieces together. For our game, we just need to do a few things:

  1. Setup the initial entities
  2. Setup the order of the systems which we want to use
  3. Setup a game loop, which calls each system and passed in all the entities
  4. Setup a lose condition

1. Let’s take a look at the code for setting up the initial entities and the player entity:

var self = this;
var entities = {}; // object containing { id: entity  }
var entity;

// Create a bunch of random entities
for(var i=0; i < 20; i++){
    entity = new ECS.Entity();
    entity.addComponent( new ECS.Components.Appearance());
    entity.addComponent( new ECS.Components.Position());

    // % chance for decaying rects
    if(Math.random() < 0.8){
        entity.addComponent( new ECS.Components.Health() );
    }

    // NOTE: If we wanted some rects to not have collision, we could set it
    // here. Could provide other gameplay mechanics perhaps?
    entity.addComponent( new ECS.Components.Collision());

    entities[entity.id] = entity;
}

// PLAYER entity
// Make the last entity the "PC" entity - it must be player controlled,
// have health and collision components
entity = new ECS.Entity();
entity.addComponent( new ECS.Components.Appearance());
entity.addComponent( new ECS.Components.Position());
entity.addComponent( new ECS.Components.Collision() );
entity.addComponent( new ECS.Components.PlayerControlled() );
entity.addComponent( new ECS.Components.Health() );

// we can also edit any component, as it's just data
entity.components.appearance.colors.g = 255;
entities[entity.id] = entity;

// store reference to entities
ECS.entities = entities;

Note how we can modify any of the component data directly. It’s all data that can be manipulated however and whenever you want! The player entity step could be even further simplified by using assemblages, which are basically entity templates. For instance (using our assemblages):

entity = new ECS.Assemblages.CollisionRect();
entity.addComponent( new ECS.Components.Health());
entity.addComponent( new ECS.Components.PlayerControlled() );

2. Next, we setup the order of the systems:

// Setup systems
// Setup the array of systems. The order of the systems is likely critical, 
// so ensure the systems are iterated in the right order
var systems = [
    ECS.systems.userInput,
    ECS.systems.collision,
    ECS.systems.decay, 
    ECS.systems.render
];

3. Then, a simple game loop

// Game loop
function gameLoop (){
    // Simple game loop
    for(var i=0,len=systems.length; i < len; i++){
        // Call the system and pass in entities
        // NOTE: One optimal solution would be to only pass in entities
        // that have the relevant components for the system, instead of 
        // forcing the system to iterate over all entities
        systems[i](ECS.entities);
    }

    // Run through the systems. 
    // continue the loop
    if(self._running !== false){
        requestAnimationFrame(gameLoop);
    }
}
// Kick off the game loop
requestAnimationFrame(gameLoop);

4. Finally, a lose condition

// Lose condition
this._running = true; // is the game going?
this.endGame = function endGame(){ 
    self._running = false;
    document.getElementById('final-score').innerHTML = +(ECS.$score.innerHTML);
    document.getElementById('game-over').className = '';

    // set a small timeout to make sure we set the background
    setTimeout(function(){
        document.getElementById('game-canvas').className = 'game-over';
    }, 100);
};

View on github

Now, we can kick off the game (provided the html has been set up)!

Play Rectangle Eater

Conclusion

Programming is a complex endeavor by nature. We program in abstractions, and different frameworks for thinking can make certain problems easier to solve. Data driven programming is one framework for thinking of how to write programs. It’s not the best fit for all problems, but can make some problems much easier to solve and understand. Thinking of code as data is a powerful concept. Entity Component System is a pattern that fits game development well. Try taking the passenger seat. Try letting data drive your code.


Last week I released Ash, an entity component system framework for Actionscript game development, and a number of people have asked me the question “What is an entity component system framework?”. This is my rather long answer.

Entity systems are growing in popularity, with well-known examples like Unity, and lesser known frameworks like the Actionscript frameworks Ember2 and Xember, and my own Ash. There’s a very good reason for this; they simplify game architecture, encourage clean separation of responsibilities in your code, and are fun to use.

In this post I will walk you through how an entity based architecture evolves from the old fashioned game loop. This may take a while. The examples will be in Actionscript because that happens to be what I’m using at the moment, but the architecture applies to all programming languages.

Note that the naming of things within this post is based on

  • How they were named as I discovered these architectures over the past twenty years of my game development life
  • How they are usually named in modern entity component system architectures

This is different, for example, to how they are named in Unity, which is an entity architecture but is not an entity component system architecture.

This is based on a presentation I gave at try{harder} in 2011.

The examples

Throughout this post, I’ll be using a simple Asteroids game as an example. I like to use Asteroids as an example because it involves simplified versions of many of the systems required in larger games – rendering, physics, AI, user control of a character, non-player characters.

The game loop

To understand why we use entity systems, you really need to understand the old-fashioned game loop. A game loop for Asteroids might look something like this

function update( time:Number ):void
{
  game.update( time );
  spaceship.updateInputs( time );
  for each( var flyingSaucer:FlyingSaucer in flyingSaucers )
  {
    flyingSaucer.updateAI( time );
  }
  spaceship.update( time );
  for each( var flyingSaucer:FlyingSaucer in flyingSaucers )
  {
    flyingSaucer.update( time );
  }
  for each( var asteroid:Asteroid in asteroids )
  {
    asteroid.update( time );
  }
  for each( var bullet:Bullet in bullets )
  {
    bullet.update( time );
  }
  collisionManager.update( time );
  spaceship.render();
  for each( var flyingSaucer:FlyingSaucer in flyingSaucers )
  {
    flyingSaucer.render();
  }
  for each( var asteroid:Asteroid in asteroids )
  {
    asteroid.render();
  }
  for each( var bullet:Bullet in bullets )
  {
    bullet.render();
  }
}

This game loop is called on a regular interval, usually every 60th of a second or every 30th of a second, to update the game. The order of operations in the loop is important as we update various game objects, check for collisions between them, and then draw them all. Every frame.

This is a very simple game loop. It’s simple because

  1. The game is simple
  2. The game has only one state

In the past, I have worked on console games where the game loop, a single function, was over 3,000 lines of code. It wasn’t pretty, and it wasn’t clever. That’s the way games were built and we had to live with it.

Entity system architecture derives from an attempt to resolve the problems with the game loop. It addresses the game loop as the core of the game, and pre-supposes that simplifying the game loop is more important than anything else in modern game architecture. More important than separation of the view from the controller, for example.

Processes

The first step in this evolution is to think about objects called processes. These are objects that can be initialised, updated on a regular basis, and destroyed. The interface for a process looks something like this.

interface IProcess
{
  function start():Boolean;
  function update( time:Number ):void;
  function end():void;
}

We can simplify our game loop if we break it into a number of processes to handle, for example, rendering, movement, collision resolution. To manage those processes we create a process manager.

class ProcessManager
{
  private var processes:PrioritisedList;
  public function addProcess( process:IProcess, priority:int ):Boolean
  {
    if( process.start() )
    {
      processes.add( process, priority );
      return true;
    }
    return false;
  }
  public function update( time:Number ):void
  {
    for each( var process:IProcess in processes )
    {
      process.update( time );
    }
  }
  public function removeProcess( process:IProcess ):void
  {
    process.end();
    processes.remove( process );
  }
}

This is a somewhat simplified version of a process manager. In particular, we should ensure we update the processes in the correct order (identified by the priority parameter in the add method) and we should handle the situation where a process is removed during the update loop. But you get the idea. If our game loop is broken into multiple processes, then the update method of our process manager is our new game loop and the processes become the core of the game.
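
To illustrate those two caveats (this is a sketch only, not the ProcessManager above), one simple approach is to keep the list sorted by priority as processes are added, and to defer any removals requested during an update until the loop has finished:

class ProcessManagerSketch
{
  private var entries:Array = [];   // of { process:IProcess, priority:int }
  private var toRemove:Array = [];  // processes whose removal was requested mid-update
  private var updating:Boolean = false;

  public function addProcess( process:IProcess, priority:int ):Boolean
  {
    if( !process.start() )
    {
      return false;
    }
    entries.push( { process: process, priority: priority } );
    entries.sortOn( "priority", Array.NUMERIC ); // lower priority values update first
    return true;
  }

  public function removeProcess( process:IProcess ):void
  {
    if( updating )
    {
      toRemove.push( process ); // don't mutate the list while we're iterating it
      return;
    }
    doRemove( process );
  }

  public function update( time:Number ):void
  {
    updating = true;
    for each( var entry:Object in entries )
    {
      entry.process.update( time );
    }
    updating = false;
    while( toRemove.length > 0 )
    {
      doRemove( toRemove.pop() );
    }
  }

  private function doRemove( process:IProcess ):void
  {
    process.end();
    for( var i:int = 0; i < entries.length; ++i )
    {
      if( entries[ i ].process == process )
      {
        entries.splice( i, 1 );
        return;
      }
    }
  }
}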

The render process

Let’s look at the render process as an example. We could just pull the render code out of the original game loop and place it in a process, giving us something like this

class RenderProcess implements IProcess
{
  public function start() : Boolean
  {
    // initialise render system
    return true;
  }
  public function update( time:Number ):void
  {
    spaceship.render();
    for each( var flyingSaucer:FlyingSaucer in flyingSaucers )
    {
      flyingSaucer.render();
    }
    for each( var asteroid:Asteroid in asteroids )
    {
      asteroid.render();
    }
    for each( var bullet:Bullet in bullets )
    {
      bullet.render();
    }
  }
  public function end() : void
  {
    // clean-up render system
  }
}

Using an interface

But this isn’t very efficient. We still have to manually render all the different types of game object. If we have a common interface for all renderable objects, we can simplify matters a lot.

interface IRenderable
{
  function render():void;
}
class RenderProcess implements IProcess
{
  private var targets:Vector.<IRenderable>;
  public function start() : Boolean
  {
    // initialise render system
    return true;
  }
  public function update( time:Number ):void
  {
    for each( var target:IRenderable in targets )
    {
      target.render();
    }
  }
  public function end() : void
  {
    // clean-up render system
  }
}

Then our spaceship class might contain some code like this

class Spaceship implements IRenderable
{
  public var view:DisplayObject;
  public var position:Point;
  public var rotation:Number;
  public function render():void
  {
    view.x = position.x;
    view.y = position.y;
    view.rotation = rotation;
  }
}

This code is based on the flash display list. If we were blitting, or using stage3d, it would be different, but the principles would be the same. We need the image to be rendered, and the position and rotation for rendering it. And the render function does the rendering.

Using a base class and inheritance

In fact, there’s nothing in this code that makes it unique to a spaceship. All the code could be shared by all renderable objects. The only thing that makes them different is which display object is assigned to the view property, and what the position and rotation are. So let’s wrap this in a base class and use inheritance.

class Renderable implements IRenderable
{
  public var view:DisplayObject;
  public var position:Point;
  public var rotation:Number;
  public function render():void
  {
    view.x = position.x;
    view.y = position.y;
    view.rotation = rotation;
  }
}
class Spaceship extends Renderable
{
}

Of course, all renderable items will extend the Renderable class, so we get a simple class hierarchy like this

The move process

To understand the next step, we first need to look at another process and the class it works on. So let’s try the move process, which updates the position of the objects.

interface IMoveable
{
  function move( time:Number ):void;
}
class MoveProcess implements IProcess
{
  private var targets:Vector.<IMoveable>;
  public function start():Boolean
  {
    return true;
  }
  public function update( time:Number ):void
  {
    for each( var target:IMoveable in targets )
    {
      target.move( time );
    }
  }
  public function end():void
  {
  }
}
class Moveable implements IMoveable
{
  public var position:Point;
  public var rotation:Number;
  public var velocity:Point;
  public var angularVelocity:Number;
  public function move( time:Number ):void
  {
    position.x += velocity.x * time;
    position.y += velocity.y * time;
    rotation += angularVelocity * time;
  }
}
class Spaceship extends Moveable
{
}

Multiple inheritance

That’s almost good, but unfortunately we want our spaceship to be both moveable and renderable, and many modern programming languages don’t allow multiple inheritance.

Even in those languages that do permit multiple inheritance, we have the problem that the position and rotation in the Moveable class should be the same as the position and rotation in the Renderable class.

One common solution is to use an inheritance chain, so that Moveable extends Renderable.

class Moveable extends Renderable implements IMoveable
{
  public var velocity:Point;
  public var angularVelocity:Number;
  public function move( time:Number ):void
  {
    position.x += velocity.x * time;
    position.y += velocity.y * time;
    rotation += angularVelocity * time;
  }
}
class Spaceship extends Moveable
{
}

Now the spaceship is both moveable and renderable. We can apply the same principles to the other game objects to get this class hierarchy.

We can even have static objects that just extend Renderable.

Moveable but not Renderable

But what if we want a Moveable object that isn’t Renderable? An invisible game object, for example? Now our class hierarchy breaks down and we need an alternative implementation of the Moveable interface that doesn’t extend Renderable.

class InvisibleMoveable implements IMoveable
{
  public var position:Point;
  public var rotation:Number;
  public var velocity:Point;
  public var angularVelocity:Number;
  public function move( time:Number ):void
  {
    position.x += velocity.x * time;
    position.y += velocity.y * time;
    rotation += angularVelocity * time;
  }
}

In a simple game, this is clumsy but manageable, but in a complex game, using inheritance to apply the processes to objects rapidly becomes unmanageable: you’ll soon discover items in your game that don’t fit into a simple linear inheritance tree, as with the force field above.

Favour composition over inheritance

It’s long been a sound principle of object-oriented programming to favour composition over inheritance. Applying that principle here can rescue us from this potential inheritance mess.

We’ll still need Renderable and Moveable classes, but rather than extending these classes to create the spaceship class, we will create a spaceship class that contains an instance of each of these classes.

class Renderable implements IRenderable
{
  public var view:DisplayObject;
  public var position:Point;
  public var rotation:Number;
  public function render():void
  {
    view.x = position.x;
    view.y = position.y;
    view.rotation = rotation;
  }
}
class Moveable implements IMoveable
{
  public var position:Point;
  public var rotation:Number;
  public var velocity:Point;
  public var angularVelocity:Number;
  public function move( time:Number ):void
  {
    position.x += velocity.x * time;
    position.y += velocity.y * time;
    rotation += angularVelocity * time;
  }
}
class Spaceship
{
  public var renderData:IRenderable;
  public var moveData:IMoveable;
}

This way, we can combine the various behaviours in any way we like without running into inheritance problems.

The objects made by this composition, the Static Object, Spaceship, Flying Saucer, Asteroid, Bullet and Force Field, are collectively called entities.

Our processes remain unchanged.

interface IRenderable
{
  function render():void;
}
class RenderProcess implements IProcess
{
  private var targets:Vector.<IRenderable>;
  public function update(time:Number):void
  {
    for each(var target:IRenderable in targets)
    {
      target.render();
    }
  }
}
interface IMoveable
{
  function move( time:Number ):void;
}
class MoveProcess implements IProcess
{
  private var targets:Vector.<IMoveable>;
  public function update(time:Number):void
  {
    for each(var target:IMoveable in targets)
    {
      target.move( time );
    }
  }
}

But we don’t add the spaceship entity to each process, we add its components. So when we create the spaceship we do something like this

public function createSpaceship():Spaceship
{
  var spaceship:Spaceship = new Spaceship();
  ...
  renderProcess.addItem( spaceship.renderData );
  moveProcess.addItem( spaceship.moveData );
  ...
  return spaceship;
}

This approach looks good. It gives us the freedom to mix and match process support between different game objects without getting into spaghetti inheritance chains or repeating ourselves. But there’s one problem.

What about the shared data?

The position and rotation properties in the Renderable class instance need to have the same values as the position and rotation properties in the Moveable class instance, since the Move process will change the values in the Moveable instance and the Render process will use the values in the Renderable instance.

class Renderable implements IRenderable
{
  public var view:DisplayObject;
  public var position:Point;
  public var rotation:Number;
  public function render():void
  {
    view.x = position.x;
    view.y = position.y;
    view.rotation = rotation;
  }
}
class Moveable implements IMoveable
{
  public var position:Point;
  public var rotation:Number;
  public var velocity:Point;
  public var angularVelocity:Number;
  public function move( time:Number ):void
  {
    position.x += velocity.x * time;
    position.y += velocity.y * time;
    rotation += angularVelocity * time;
  }
}
class Spaceship
{
  public var renderData:IRenderable;
  public var moveData:IMoveable;
}

To solve this, we need to ensure that both class instances reference the same instances of these properties. In Actionscript that means these properties must be objects, because objects can be passed by reference while primitives are passed by value.

So we introduce another set of classes, which we’ll call components. These components are just value objects that wrap properties into objects for sharing between processes.

class PositionComponent
{
  public var x:Number;
  public var y:Number;
  public var rotation:Number;
}
class VelocityComponent
{
  public var velocityX:Number;
  public var velocityY:Number;
  public var angularVelocity:Number;
}
class DisplayComponent
{
  public var view:DisplayObject;
}
class Renderable implements IRenderable
{
  public var display:DisplayComponent;
  public var position:PositionComponent;
  public function render():void
  {
    display.view.x = position.x;
    display.view.y = position.y;
    display.view.rotation = position.rotation;
  }
}
class Moveable implements IMoveable
{
  public var position:PositionComponent;
  public var velocity:VelocityComponent;
  public function move( time:Number ):void
  {
    position.x += velocity.velocityX * time;
    position.y += velocity.velocityY * time;
    position.rotation += velocity.angularVelocity * time;
  }
}

When we create the spaceship we ensure the Moveable and Renderable instances share the same instance of the PositionComponent.

class Spaceship
{
  public var moveData:Moveable;
  public var renderData:Renderable;
  public function Spaceship()
  {
    moveData = new Moveable();
    renderData = new Renderable();
    moveData.position = new PositionComponent();
    moveData.velocity = new VelocityComponent();
    renderData.position = moveData.position;
    renderData.display = new DisplayComponent();
  }
}

The processes remain unaffected by this change.

A good place to pause

At this point we have a neat separation of tasks. The game loop cycles through the processes, calling the update method on each one. Each process contains a collection of objects that implement the interface it operates on, and will call the appropriate method of those objects. Those objects each do a single important task on their data. Through the system of components, those objects are able to share data and thus the combination of multiple processes can produce complex updates in the game entities, while keeping each process relatively simple.

This architecture is similar to a number of entity systems in game development. The architecture follows good object-oriented principles and it works. But there’s more to come, starting with a moment of madness.

Abandoning good object-oriented practice

The current architecture uses good object-oriented practices like encapsulation and single responsibility – the IRenderable and IMoveable implementations encapsulate the data and logic for single responsibilities in the updating of game entities every frame – and composition – the Spaceship entity is created by combining implementations of the IRenderable and IMoveable interfaces. Through the system of components we ensured that, where appropriate, data is shared between the different data classes of the entities.

The next step in this evolution of entity systems is somewhat counter-intuitive, breaking one of the core tenets of object-oriented programming. We break the encapsulation of the data and logic in the Renderable and Moveable implementations. Specifically, we remove the logic from these classes and place it in the processes instead.

So this

interface IRenderable
{
  function render():void;
}
class Renderable implements IRenderable
{
  public var display:DisplayComponent;
  public var position:PositionComponent;
  public function render():void
  {
    display.view.x = position.x;
    display.view.y = position.y;
    display.view.rotation = position.rotation;
  }
}
class RenderProcess implements IProcess
{
  private var targets:Vector.<IRenderable>;
  public function update( time:Number ):void
  {
    for each( var target:IRenderable in targets )
    {
      target.render();
    }
  }
}

Becomes this

class RenderData
{
  public var display:DisplayComponent;
  public var position:PositionComponent;
}
class RenderProcess implements IProcess
{
  private var targets:Vector.<RenderData>;
  public function update( time:Number ):void
  {
    for each( var target:RenderData in targets )
    {
      target.display.view.x = target.position.x;
      target.display.view.y = target.position.y;
      target.display.view.rotation = target.position.rotation;
    }
  }
}

And this

interface IMoveable
{
  function move( time:Number ):void;
}
class Moveable implements IMoveable
{
  public var position:PositionComponent;
  public var velocity:VelocityComponent;
  public function move( time:Number ):void
  {
    position.x += velocity.velocityX * time;
    position.y += velocity.velocityY * time;
    position.rotation += velocity.angularVelocity * time;
  }
}
class MoveProcess implements IProcess
{
  private var targets:Vector.<IMoveable>;
  public function update( time:Number ):void
  {
    for each( var target:IMoveable in targets )
    {
      target.move( time );
    }
  }
}

Becomes this

class MoveData
{
  public var position:PositionComponent;
  public var velocity:VelocityComponent;
}
class MoveProcess implements IProcess
{
  private var targets:Vector.<MoveData>;
  public function update( time:Number ):void
  {
    for each( var target:MoveData in targets )
    {
      target.position.x += target.velocity.velocityX * time;
      target.position.y += target.velocity.velocityY * time;
      target.position.rotation += target.velocity.angularVelocity * time;
    }
  }
}

It’s not immediately clear why we’d do this, but bear with me. On the surface, we’ve removed the need for the interface, and we’ve given the process something more important to do – rather than simply delegate its work to the IRenderable or IMoveable implementations, it does the work itself.

The first apparent consequence of this is that all entities must use the same rendering method, since the render code is now in the RenderProcess. But that’s not actually the case. We could, for example, have two processes, RenderMovieClip and RenderBitmap, and they could operate on different sets of entities. So we haven’t lost any flexibility.
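As a quick sketch of that idea (RenderMovieClip, RenderBitmap, their data classes and BitmapDisplayComponent are names invented here for illustration), the two processes would simply work from different data classes, and each entity would be registered with whichever one matches its components:

class RenderMovieClipData
{
  public var display:DisplayComponent;   // wraps a DisplayObject, as before
  public var position:PositionComponent;
}
class RenderMovieClip implements IProcess
{
  private var targets:Vector.<RenderMovieClipData>;
  public function start():Boolean { return true; }
  public function update( time:Number ):void
  {
    for each( var target:RenderMovieClipData in targets )
    {
      target.display.view.x = target.position.x;
      target.display.view.y = target.position.y;
      target.display.view.rotation = target.position.rotation;
    }
  }
  public function end():void { }
}
class BitmapDisplayComponent
{
  public var image:BitmapData; // the pre-rendered pixels for this entity
}
class RenderBitmapData
{
  public var display:BitmapDisplayComponent;
  public var position:PositionComponent;
}
class RenderBitmap implements IProcess
{
  private var canvas:BitmapData; // the shared surface everything is blitted onto
  private var targets:Vector.<RenderBitmapData>;
  public function start():Boolean { return true; }
  public function update( time:Number ):void
  {
    for each( var target:RenderBitmapData in targets )
    {
      canvas.copyPixels( target.display.image, target.display.image.rect,
                         new Point( target.position.x, target.position.y ) );
    }
  }
  public function end():void { }
}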

What we gain is the ability to refactor our entities significantly to produce an architecture with clearer separation and simpler configuration. The refactoring starts with a question.

Do we need the data classes?

Currently, our entity

class Spaceship
{
  public var moveData:MoveData;
  public var renderData:RenderData;
}

Contains two data classes

class MoveData
{
  public var position:PositionComponent;
  public var velocity:VelocityComponent;
}
class RenderData
{
  public var display:DisplayComponent;
  public var position:PositionComponent;
}

These data classes in turn contain three components

class PositionComponent
{
  public var x:Number;
  public var y:Number;
  public var rotation:Number;
}
class VelocityComponent
{
  public var velocityX:Number;
  public var velocityY:Number;
  public var angularVelocity:Number;
}
class DisplayComponent
{
  public var view:DisplayObject;
}

And the data classes are used by the two processes

class MoveProcess implements IProcess
{
  private var targets:Vector.<MoveData>;
  public function update( time:Number ):void
  {
    for each( var target:MoveData in targets )
    {
      target.position.x += target.velocity.velocityX * time;
      target.position.y += target.velocity.velocityY * time;
      target.position.rotation += target.velocity.angularVelocity * time;
    }
  }
}
class RenderProcess implements IProcess
{
  private var targets:Vector.<RenderData>;
  public function update( time:Number ):void
  {
    for each( var target:RenderData in targets )
    {
      target.display.view.x = target.position.x;
      target.display.view.y = target.position.y;
      target.display.view.rotation = target.position.rotation;
    }
  }
}

But the entity shouldn’t care about the data classes. The components collectively contain the state of the entity. The data classes exist for the convenience of the processes. So we refactor the code so the spaceship entity contains the components rather than the data classes.

class Spaceship
{
  public var position:PositionComponent;
  public var velocity:VelocityComponent;
  public var display:DisplayComponent;
}
class PositionComponent
{
  public var x:Number;
  public var y:Number;
  public var rotation:Number;
}
class VelocityComponent
{
  public var velocityX:Number;
  public var velocityY:Number;
  public var angularVelocity:Number;
}
class DisplayComponent
{
  public var view:DisplayObject;
}

By removing the data classes, and using the constituent components instead to define the spaceship, we have removed any need for the spaceship entity to know what processes may act on it. The spaceship now contains the components that define its state. Any requirement to combine these components into other data classes for the processes is some other class’s responsibility.

Systems and Nodes

Some core code within the entity system framework (which we’ll get to in a minute) will dynamically create these data objects as they are required by the processes. In this reduced context, the data classes will be mere nodes in the collections (arrays, linked-lists, or otherwise, depending on the implementation) used by the processes. So to clarify this we’ll rename them as nodes.

class MoveNode
{
  public var position:PositionComponent;
  public var velocity:VelocityComponent;
}
class RenderNode
{
  public var display:DisplayComponent;
  public var position:PositionComponent;
}

The processes are unchanged, but in keeping with the more common naming I’ll also change their name and call them systems.

class MoveSystem implements ISystem
{
  private var targets:Vector.<MoveNode>;
  public function update( time:Number ):void
  {
    for each( var target:MoveNode in targets )
    {
      target.position.x += target.velocity.velocityX * time;
      target.position.y += target.velocity.velocityY * time;
      target.position.rotation += target.velocity.angularVelocity * time;
    }
  }
}
class RenderSystem implements ISystem
{
  private var targets:Vector.<RenderNode>;
  public function update( time:Number ):void
  {
    for each( var target:RenderNode in targets )
    {
      target.display.view.x = target.position.x;
      target.display.view.y = target.position.y;
      target.display.view.rotation = target.position.rotation;
    }
  }
}
interface ISystem
{
  function update( time:Number ):void;
}

And what is an entity?

One last change – there’s nothing special about the Spaceship class. It’s just a container for components. So we’ll just call it Entity and give it a collection of components. We’ll access those components based on their class type.

class Entity
{
  private var components : Dictionary;
  public function add( component:Object ):void
  {
    var componentClass : Class = component.constructor;
    components[ componentClass ] = component;
  }
  public function remove( componentClass:Class ):void
  {
    delete components[ componentClass ];
  }
  public function get( componentClass:Class ):Object
  {
    return components[ componentClass ];
  }
}

So we’ll create our spaceship like this

public function createSpaceship():void
{
  var spaceship:Entity = new Entity();
  var position:PositionComponent = new PositionComponent();
  position.x = Stage.stageWidth / 2;
  position.y = Stage.stageHeight / 2;
  position.rotation = 0;
  spaceship.add( position );
  var display:DisplayComponent = new DisplayComponent();
  display.view = new SpaceshipImage();
  spaceship.add( display );
  engine.add( spaceship );
}

The core Engine class

We mustn’t forget the system manager, formerly called the process manager.

class SystemManager
{
  private var systems:PrioritisedList;
  public function addSystem( system:ISystem, priority:int ):void
  {
    systems.add( system, priority );
    system.start();
  }
  public function update( time:Number ):void
  {
    for each( var system:ISystem in systems )
    {
      system.update( time );
    }
  }
  public function removeSystem( system:ISystem ):void
  {
    system.end();
    systems.remove( system );
  }
}

This will be enhanced and will sit at the heart of our entity component system framework. We’ll add to it the functionality mentioned above to dynamically create nodes for the systems.

The entities only care about components, and the systems only care about nodes. So to complete the entity component system framework, we need code to watch the entities and, as they change, add and remove their components to the node collections used by the systems. Because this is the one bit of code that knows about both entities and systems, we might consider it central to the game. In Ash, I call this the Engine class, and it is an enhanced version of the system manager.

Every entity and every system is added to and removed from the Engine class when you start using it and stop using it. The Engine class keeps track of the components on the entities and creates and destroys nodes as necessary, adding those nodes to the node collections. The Engine class also provides a way for the systems to get the collections they require.

public class Engine
{
  private var entities:EntityList;
  private var systems:SystemList;
  private var nodeLists:Dictionary;
  public function addEntity( entity:Entity ):void
  {
    entities.add( entity );
    // create nodes from this entity's components and add them to node lists
    // also watch for later addition and removal of components from the entity so
    // you can adjust its derived nodes accordingly
  }
  public function removeEntity( entity:Entity ):void
  {
    // destroy nodes containing this entity's components
    // and remove them from the node lists
    entities.remove( entity );
  }
  public function addSystem( system:System, priority:int ):void
  {
    systems.add( system, priority );
    system.start();
  }
  public function removeSystem( system:System ):void
  {
    system.end();
    systems.remove( system );
  }
  public function getNodeList( nodeClass:Class ):NodeList
  {
    var nodes:NodeList = new NodeList();
    nodeLists[ nodeClass ] = nodes;
    // create the nodes from the current set of entities
    // and populate the node list
    return nodes;
  }
  public function update( time:Number ):void
  {
    for each( var system:ISystem in systems )
    {
      system.update( time );
    }
  }
}
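To make the comments in addEntity a little more concrete, here is a minimal sketch of the node-creation step, hard-coded for MoveNode to keep it readable. A real framework (Ash included) does this generically for any node class, usually by inspecting the node’s fields; this sketch also assumes NodeList has a simple add method.

private function createMoveNodeIfPossible( entity:Entity ):void
{
  // a MoveNode can only exist while the entity has both of these components
  var position:PositionComponent = entity.get( PositionComponent ) as PositionComponent;
  var velocity:VelocityComponent = entity.get( VelocityComponent ) as VelocityComponent;
  var list:NodeList = nodeLists[ MoveNode ];
  if( position && velocity && list )
  {
    var node:MoveNode = new MoveNode();
    node.position = position; // the node just references the entity's own components
    node.velocity = velocity;
    list.add( node );
  }
}

A system then fetches its collection once, with something like targets = engine.getNodeList( MoveNode ), rather than building or owning the list itself.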

To see one implementation of this architecture, check out the Ash entity system framework, and see the example Asteroids implementation there too.

A step further

In Actionscript, the Node and Entity classes are necessary for efficiently managing the Components and passing them to the Systems. But note that these classes are just glue, the game is defined in the Systems and the Components. The Entity class provides a means to find and manage the components for each entity and the Node classes provide a means to group components into collections for use in the Systems. In other languages and runtime environments it may be more efficient to manage this glue differently.

For example, in a large server-based game we might store the components in a database – they are just data after all – with each record (i.e. each component) having a field for the unique id of the entity it belongs to along with fields for the other component data. Then we pull the components for an entity directly from the database when needed, using the entity id to find them, and we create collections of data for the systems to operate on by doing joined queries across the appropriate tables. For example, for the move system we would pull records from the position components table and the movement components table where the entity ids match and a record exists in both tables (i.e. the entity has both a position and a movement component). In this instance the Entity and Node classes are not required and the only presence for the entity is the unique id that is used in the data tables.

Similarly, if you have control over the memory allocation for your game it is often more efficient to take a similar approach for local game code too, creating components in native arrays of data and looking up the components for an entity based on an id. Some aspects of the game code become more complex and slower (e.g. finding the components for a specific entity) but others become much faster (e.g. iterating through the component data collections inside a system) because the data is efficiently laid out in memory to minimise cache misses and maximise speed.

The important elements of this architecture are the components and the systems. Everything else is configuration and glue. And note that components are data and systems are functions, so we don’t even need object oriented code to do this.
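As a rough illustration of those last two points, here is what the move system could look like with components stored in flat, id-indexed arrays instead of objects. All of the names are invented for this sketch, and it assumes every id has an entry in both sets of arrays.

class PositionColumns
{
  public var x:Vector.<Number> = new Vector.<Number>();
  public var y:Vector.<Number> = new Vector.<Number>();
  public var rotation:Vector.<Number> = new Vector.<Number>();
}
class VelocityColumns
{
  public var x:Vector.<Number> = new Vector.<Number>();
  public var y:Vector.<Number> = new Vector.<Number>();
  public var angular:Vector.<Number> = new Vector.<Number>();
}
// the "system" is just a function over the data - no per-entity objects at all
function moveAll( position:PositionColumns, velocity:VelocityColumns, time:Number ):void
{
  for( var id:int = 0; id < position.x.length; ++id )
  {
    position.x[ id ] += velocity.x[ id ] * time;
    position.y[ id ] += velocity.y[ id ] * time;
    position.rotation[ id ] += velocity.angular[ id ] * time;
  }
}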

Conclusion

So, to summarise, entity component systems originate from a desire to simplify the game loop. From that comes an architecture of components, which represent the state of the game, and systems, which operate on the state of the game. Systems are updated every frame – this is the game loop. Components are combined into entities, and systems operate on the entities that have all the components they are interested in. The engine monitors the systems and the components and ensures each system has access to a collection of all the components it needs.

An entity component system framework like Ash provides the basic scaffolding and core management for this architecture, without providing any actual component or system classes. You create your game by creating the appropriate components and systems.

An entity component system game engine will provide many standard systems and components on top of the basic framework.

Three entity component system frameworks for Actionscript are my own Ash, Ember2 by Tom Davies and Xember by Alec McEachran. Artemis is an entity system framework for Java that has also been ported to C#.

My next post covers some of the reasons why I like using an entity system framework for my game development projects.

Finite state machines are one of the staple constructs in game development. During the course of a game, game objects may pass through many states and managing those states effectively is important.

The difficulty with finite state machines in an entity system framework like Ash can be summed up in one sentence – the state pattern doesn’t work with an entity system framework. Entity system frameworks use a data-oriented paradigm in which game objects are not self-contained OOP objects. So you can’t use the state pattern, or any variation of it. All the data is in the components, all the logic is in the systems.

If your states are few and simple it is possible to use a good old fashioned switch statement inside a system, with the data for all the states in one or more components that are used by that system, but I wouldn’t usually recommend that.
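For illustration only, a switch-based version might look something like this, borrowing the guard example from the next section. Guard, GuardNode and GuardSystem are names invented here, and moveAlongPath, lookForEnemies, attack and enemyIsDead stand in for whatever behaviour code you already have; ListIteratingSystem and Node are the Ash classes used later in this post.

public class Guard
{
    public var state : String = "patrol";
    public var path : Vector.<Point>;
    public var enemy : Entity;
}
public class GuardNode extends Node
{
    public var guard : Guard;
}
public class GuardSystem extends ListIteratingSystem
{
    public function GuardSystem()
    {
        super( GuardNode, updateNode );
    }
    private function updateNode( node : GuardNode, time : Number ) : void
    {
        switch( node.guard.state )
        {
            case "patrol":
                moveAlongPath( node, time );
                var enemy : Entity = lookForEnemies( node );
                if( enemy )
                {
                    node.guard.enemy = enemy;
                    node.guard.state = "attack";
                }
                break;
            case "attack":
                attack( node.entity, node.guard.enemy );
                if( enemyIsDead( node.guard.enemy ) )
                {
                    node.guard.state = "patrol";
                }
                break;
        }
    }
}

It works, but every new state grows the switch and the component, which is why the component-swapping approach described below usually scales better.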

When creating Stick Tennis I was faced with the problem of how to manage states as the two main entities in the game are the two players, and they go through a number of states as they…

  • prepare to serve
  • swing arm to toss the ball
  • release the ball
  • swing the racquet
  • hit the ball
  • follow through
  • run to a good position
  • react to the opponent hitting the ball
  • run to intercept the ball
  • swing the racquet
  • hit the ball
  • follow through
  • run to a good position
  • react to winning the point
  • …etc

Stick Tennis is a complex example, and I can’t show you the source code, so instead I’ll use something a little simpler, with source code.

An example

Let’s consider a guard character in a game. This character patrols along a path, keeping watch. If they spot an enemy, they attack him/her.

In a traditional object-oriented state machine we might have a class for each state

public class PatrolState
{
    private var guard : Character;
    private var path : Vector.<Point>;
    public function PatrolState( guard : Character, path : Vector.<Point> )
    {
        this.guard = guard;
        this.path = path;
    }
    public function update( time : Number ) : void
    {
        moveAlongPath( time );
        var enemy : Character = lookForEnemies();
        if( enemy )
        {
            guard.changeState( new AttackState( guard, enemy ) );
        }
    }
}
public class AttackState
{
    private var guard : Character;
    private var enemy : Character;
    public function AttackState( guard : Character, enemy : Character )
    {
        this.guard = guard;
        this.enemy = enemy;
    }
    public function update( time : Number ) : void
    {
        guard.attack( enemy );
        if( enemy.isDead )
        {
            guard.changeState( new PatrolState( guard, PatrolPathFactory.getPath( guard.id ) ) );
        }
    }
}

In an entity system architecture we have to take a slightly different approach, but the core principle of the state pattern, to split the state machine across multiple classes, one for each state, can still be applied. To implement the state machine in an entity framework we will use one System per state.

public class PatrolSystem extends ListIteratingSystem
{
    public function PatrolSystem()
    {
        super( PatrolNode, updateNode );
    }
    private function updateNode( node : PatrolNode, time : Number ) : void
    {
        moveAlongPath( node );
        var enemy : Enemy = lookForEnemies( node );
        if( enemy )
        {
            node.entity.remove( Patrol );
            var attack : Attack = new Attack();
            attack.enemy = enemy;
            node.entity.add( attack );
        }
    }
}
public class AttackSystem extends ListIteratingSystem
{
    public function AttackSystem()
    {
        super( AttackNode, updateNode );
    }
    private function updateNode( node : AttackNode, time : Number ) : void
    {
        attack( node.entity, node.attack.enemy );
        if( node.attack.enemy.get( Health ).energy == 0 )
        {
            node.entity.remove( Attack );
            var patrol : Patrol = new Patrol();
            patrol.path = PatrolPathFactory.getPath( node.entity.name );
            node.entity.add( patrol );
        }
    }
}

The guard will be processed by the PatrolSystem if he has a Patrol component, and he will be processed by the AttackSystem if he has an Attack component. By adding/removing these components from the guard we change his state.

The components and nodes look like this…

public class Patrol
{
    public var path : Vector.<Point>;
}
public class Attack
{
    public var enemy : Entity;
}
public class Position
{
    public var point : Point;
}
public class Health
{
    public var energy : Number;
}
public class PatrolNode extends Node
{
    public var patrol : Patrol;
    public var position : Position;
}
public class AttackNode extends Node
{
    public var attack : Attack;
}

So, by changing the components of the entity, we change the entity’s state and thus change the systems that process the entity.

Another example

Here’s another, more complex example using the Asteroids example game that I use to illustrate how Ash works. I’ve added an additional state to the spaceship for when it’s shot. Rather than simply removing the spaceship when it is shot, I show a short animation of it breaking up. While doing this, the user won’t be able to move it and the spaceship won’t react to collisions with other objects.

The two states require the following

While the ship is alive –

  • It looks like a spaceship
  • The user can move it
  • The user can fire its gun
  • It collides with asteroids

When the ship is dead –

  • It looks like bits of a spaceship floating in space
  • The user cannot move it
  • The user cannot fire its gun
  • It doesn’t collide with asteroids
  • After a fixed time it is removed from the game

The relevant piece of code, where the spaceship dies, is in the CollisionSystem. Without the second state it would look like this

for ( spaceship = spaceships.head; spaceship; spaceship = spaceship.next )
{
    for ( asteroid = asteroids.head; asteroid; asteroid = asteroid.next )
    {
        if ( Point.distance( asteroid.position.position, spaceship.position.position )
            <= asteroid.position.collisionRadius + spaceship.position.collisionRadius )
        {
            creator.destroyEntity( spaceship.entity );
            break;
        }
    }
}

The code tests whether the ship is colliding with an asteroid and, if it is, removes the ship. Elsewhere, the GameManager system handles the situation where there is no spaceship and creates another one, if any are left, or ends the game. Instead of destroying the spaceship, we need to change its state. So, let’s try this…

We can prevent the user controlling the spaceship by simply removing the MotionControls and GunControls components. We might as well remove the Motion and Gun components while we’re at it since they’re of no use without the controls. So we replace the code above with

for ( spaceship = spaceships.head; spaceship; spaceship = spaceship.next )
{
    for ( asteroid = asteroids.head; asteroid; asteroid = asteroid.next )
    {
        if ( Point.distance( asteroid.position.position, spaceship.position.position )
            <= asteroid.position.collisionRadius + spaceship.position.collisionRadius )
        {
            spaceship.entity.remove( MotionControls );
            spaceship.entity.remove( Motion );
            spaceship.entity.remove( GunControls );
            spaceship.entity.remove( Gun );
            break;
        }
    }
}

Next, we need to change how the ship looks and remove the collision behaviour

for ( spaceship = spaceships.head; spaceship; spaceship = spaceship.next )
{
    for ( asteroid = asteroids.head; asteroid; asteroid = asteroid.next )
    {
        if ( Point.distance( asteroid.position.position, spaceship.position.position )
            <= asteroid.position.collisionRadius + spaceship.position.collisionRadius )
        {
            spaceship.entity.remove( MotionControls );
            spaceship.entity.remove( Motion );
            spaceship.entity.remove( GunControls );
            spaceship.entity.remove( Gun );
            spaceship.entity.remove( Collision );
            spaceship.entity.remove( Display );
            spaceship.entity.add( new Display( new SpaceshipDeathView() ) );
            break;
        }
    }
}

And finally, we need to ensure that the spaceship is removed after a short period of time. To do this, we’ll need a new system and component like this

public class DeathThroes
{
    public var countdown : Number;
    public function DeathThroes( duration : Number )
    {
        countdown = duration;
    }
}
public class DeathThroesNode extends Node
{
    public var death : DeathThroes;
}
public class DeathThroesSystem extends ListIteratingSystem
{
    private var creator : EntityCreator;
    public function DeathThroesSystem( creator : EntityCreator )
    {
        super( DeathThroesNode, updateNode );
        this.creator = creator;
    }
    private function updateNode( node : DeathThroesNode, time : Number ) : void
    {
        node.death.countdown -= time;
        if ( node.death.countdown <= 0 )
        {
            creator.destroyEntity( node.entity );
        }
    }
}

We add the DeathThroesSystem to the game at the start, so it will handle the drawn-out death of any entity. Then we add the DeathThroes component to the spaceship when it dies.

for ( spaceship = spaceships.head; spaceship; spaceship = spaceship.next )
{
    for ( asteroid = asteroids.head; asteroid; asteroid = asteroid.next )
    {
        if ( Point.distance( asteroid.position.position, spaceship.position.position )
            <= asteroid.position.collisionRadius + spaceship.position.collisionRadius )
        {
            spaceship.entity.remove( MotionControls );
            spaceship.entity.remove( Motion );
            spaceship.entity.remove( GunControls );
            spaceship.entity.remove( Gun );
            spaceship.entity.remove( Collision );
            spaceship.entity.remove( Display );
            spaceship.entity.add( new Display( new SpaceshipDeathView() ) );
            spaceship.entity.add( new DeathThroes( 5 ) );
            break;
        }
    }
}

And that is our state transition. The transition is achieved by altering which components the entity has.

The state is encapsulated in its components

This is the general rule of the entity system architecture – the state of an entity is encapsulated in its components. If you want to change how an entity is processed, you should change its components. That will alter which systems operate on it and that changes how the entity is processed.

Standardised state machine code

To help with state machines I’ve added some standard state machine classes to Ash. These classes help you manage states by defining states based on the components they contain, and then changing state simply by specifying the new state you want.

A finite state machine is an instance of the EntityStateMachine class. You pass it a reference to the entity it will manage when constructing it. You will usually store the state machine in a component on the entity so it can be recovered from within any system that is operating on the entity.

var stateMachine : EntityStateMachine = new EntityStateMachine( guard );

A state machine is configured with states, and the state can be changed by calling the state machine’s changeState() method. States are identified by a string, which is assigned when the state is created and used to identify the state when calling the changeState() method.

States are instances of the EntityState class. They may be added to the EntityStateMachine using the EntityStateMachine.addState() method, or they may be created and added in one call using the EntityStateMachine.createState() method.

var patrolState : EntityState = stateMachine.createState( "patrol" );
var attackState : EntityState = stateMachine.createState( "attack" );

A state is a set of components that should be added to the entity when that state is entered, and removed when that state exits (unless they are also required for the next state). The add method of the EntityState specifies the type of component required for the state and is followed by a rule specifying how to create that component.

var patrol : Patrol = new Patrol();
patrol.path = PatrolPathFactory.getPath( node.entity.name );
patrolState.add( Patrol ).withInstance( patrol );
attackState.add( Attack );

The four standard rules for components are

entityState.add( type : Class );

Without a rule, the state machine will create a new instance of the given type to provide the component every time the state is entered.

entityState.add( type : Class ).withType( otherType : Class );

This rule will create a new instance of the otherType every time the state is entered. otherType should be the same as or extend the specified component type. You only need this rule if you create component classes that extend other component classes and should be treated as the base class by the engine, which is rare.

entityState.add( type : Class ).withInstance( instance : * );

This method will use the provided instance for the component every time the state is entered.

Finally

entityState.add( type : Class ).withSingleton();

or

entityState.add( type : Class ).withSingleton( otherType : Class );

will create a single instance and use that one instance every time the state is entered. This is similar to using the withInstance method, but the withSingleton method will not create the instance until it is needed. If otherType is omitted, then the singleton will be an instance of type; if it is included, the singleton will be an instance of otherType, and otherType must be the same as or extend type.

Finally, you can use custom code to provide the component by implementing the IComponentProvider interface and then using your custom provider with

entityState.add( type : Class ).withProvider( provider : IComponentProvider );

The IComponentProvider interface is defined as

public interface IComponentProvider
{
    function getComponent() : *;
    function get identifier() : *;
}

The getComponent method returns a component instance. The identifier in the IComponentProvider is used to compare two component providers to see if they will effectively return the same component. This is used to avoid replacing a component unnecessarily if two successive states use the same component.
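As an illustration (PatrolPathProvider is a name invented here, not part of Ash; Patrol and PatrolPathFactory are from the guard example earlier), a custom provider might look like this:

public class PatrolPathProvider implements IComponentProvider
{
    private var path : Vector.<Point>;
    private var instance : Patrol;
    public function PatrolPathProvider( path : Vector.<Point> )
    {
        this.path = path;
    }
    public function getComponent() : *
    {
        // lazily create one Patrol component and reuse it on every state entry
        if( !instance )
        {
            instance = new Patrol();
            instance.path = path;
        }
        return instance;
    }
    public function get identifier() : *
    {
        // providers built around the same path are treated as equivalent
        return path;
    }
}

It would then be attached with something like patrolState.add( Patrol ).withProvider( new PatrolPathProvider( PatrolPathFactory.getPath( guardId ) ) ), where guardId is whatever identifies the guard.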

The methods are designed to be chained together, to create a fluid interface, as you’ll see in the next example.

Back to the examples

If we apply these new tools to the spaceship example, the states are set-up when the spaceship entity is created, as follows

var fsm : EntityStateMachine = new EntityStateMachine( spaceshipEntity );
fsm.createState( "playing" )
   .add( Motion ).withInstance( new Motion( 0, 0, 0, 15 ) )
   .add( MotionControls )
       .withInstance( new MotionControls( Keyboard.LEFT, Keyboard.RIGHT, Keyboard.UP, 100, 3 ) )
   .add( Gun ).withInstance( new Gun( 8, 0, 0.3, 2 ) )
   .add( GunControls ).withInstance( new GunControls( Keyboard.SPACE ) )
   .add( Collision ).withInstance( new Collision( 9 ) )
   .add( Display ).withInstance( new Display( new SpaceshipView() ) );
fsm.createState( "destroyed" )
   .add( DeathThroes ).withInstance( new DeathThroes( 5 ) )
   .add( Display ).withInstance( new Display( new SpaceshipDeathView() ) );
var spaceshipComponent : Spaceship = new Spaceship();
spaceshipComponent.fsm = fsm;
spaceshipEntity.add( spaceshipComponent );
fsm.changeState( "playing" );

and the state change is simplified to

for ( spaceship = spaceships.head; spaceship; spaceship = spaceship.next )
{
    for ( asteroid = asteroids.head; asteroid; asteroid = asteroid.next )
    {
        if ( Point.distance( asteroid.position.position, spaceship.position.position )
            <= asteroid.position.collisionRadius + spaceship.position.collisionRadius )
        {
            spaceship.spaceship.fsm.changeState( "destroyed" );
            break;
        }
    }
}

To do

There will be further refinement and additions to the state machine tools based on feedback so please do let me know how you get on with them. Use the mailing list for Ash to get in touch.

Do You Really Know CORS?

Cross-Origin Resource Sharing

No ‘Access-Control-Allow-Origin’ header is present on the requested resource. Origin http://www.sesamestreet.com  is therefore not allowed access.

If you work with a frontend sometimes, the chances are that you’ve seen the error above before. When it happened to you for the first time, like any proper developer, you googled it. As a result, you probably saw some advice on StackOverflow to include Access-Control-Allow-Origin in your server’s response, and could then happily return to your code.

Surprisingly, this is the end of the experience with Cross-Origin Resource Sharing (CORS) for many developers. They know how to fix the problem, but they don’t always understand why the problem exists in the first place. In this article, we will dive deeper into this topic, trying to understand what problem CORS really solves. However, we will start with the Same-Origin Policy (SOP) concept.

What Is the Same-Origin Policy?

SOP is a security mechanism implemented in almost all of the modern browsers. It does not allow documents or scripts loaded from one origin to access resources from other origins. To understand why it’s so critical, it’s important to realize that for any HTTP request to a particular domain, browsers automatically attach any cookies bound to that domain. Let’s imagine a cookie my_cookie=oreo which is stored for the domain CookieMonster.com.

CookieMonster.com is a single-page application that uses a REST API exposed at CookieMonster.com/api. Thus, for every HTTP call to the API, my_cookie=oreo will be attached to the request.

 

Without Same-Origin Policy, the following scenario would be possible:

SesameStreet.com could reuse the user’s cookie which was previously stored for CookieMonster.com. Sounds scary? Not yet? Let’s look at the more thought-provoking scenario:

Without SOP, MaliciousWebsite.com would be able to send a request to MyBank.com/api using the session cookie stored for MyBank.com. Why is that? Because, as previously mentioned, browsers automatically attach cookies bound to the destination domain. It doesn’t matter if a request originated from MyBank.com or MaliciousWebsite.com. As long as a request goes to MyBank.com, the cookies stored for MyBank.com will be used. As you can see, without the Same-Origin Policy, a Cross-Site Request Forgery (CSRF) attack can be relatively simple – assuming that authentication is based solely on a session cookie, as opposed to token-based authentication. That’s one of the reasons the SOP was introduced.

Having said that, it has always been possible for the browser to make cross-origin requests by specifying a resource from a foreign domain in an <img>, <script> or <iframe> tag – and, if applicable, cookies might be attached. The crucial difference is that an AJAX call is fired from JavaScript code, which has total control and can be potentially dangerous. On the other hand, tags are under the control of the browser, and no JS code can intercept the HTTP requests that they trigger.

What is Origin?

In the previous paragraph, I used domains as an example. But the Same-Origin Policy applies to origins. How is an origin defined? Two origins are considered to be equal if they have the same protocol, host, and port – sometimes referred to as the “scheme/host/port tuple”. This explains why we can see this error even when both our backend and frontend run locally: they use different ports, and thus they have different origins.
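For example, relative to a page served from http://www.example.com (a hypothetical address used purely for illustration):

  • http://www.example.com/api/monsters – same origin (same scheme, host and port)
  • https://www.example.com – different origin (the scheme differs)
  • http://api.example.com – different origin (the host differs)
  • http://www.example.com:8080 – different origin (the port differs)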

What Is CORS?

Even though some people call CORS a security mechanism, it’s actually the opposite. It’s a way to relax security, make it less restrictive. SOP is implemented in almost all modern browsers and because of that, a website from one origin is not allowed to access resources from foreign origins. CORS is a mechanism to make it possible.

How Does CORS Work?

CORS defines two types of requests: simple requests and preflighted requests.

Simple Requests

To put it simply, a simple request is one that doesn’t trigger a preflight request. A request doesn’t trigger a preflight request if it meets all of the following conditions:

  • uses GET, HEAD or POST method
  • doesn’t have headers other than the small subset defined in the specification (any custom or Authorization header breaks this condition)
  • the only allowed values for the Content-Type header are application/x-www-form-urlencoded, multipart/form-data and text/plain (application/json breaks this condition)

A typical scenario would be:

1. SesameStreet.com is opened in a browser tab. It initiates an AJAX request (using XMLHttpRequest or the Fetch API) to GET CookieMonster.com/api/monsters

2. The browser notices that this is a cross-origin request, and attaches the Origin request header.

3. The CORS-configured server checks the Origin header and, if the origin is allowed, sets the Access-Control-Allow-Origin response header to the Origin value.

4. When the response reaches the browser, the browser verifies that the value of the Access-Control-Allow-Origin header matches the origin of the tab the request originated from.
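Put together, the header exchange for this flow might look roughly like the sketch below, reusing the CookieMonster.com and SesameStreet.com examples (the exact set of headers will vary from server to server). The request:

GET /api/monsters HTTP/1.1
Host: cookiemonster.com
Origin: http://www.sesamestreet.com

and the response:

HTTP/1.1 200 OK
Access-Control-Allow-Origin: http://www.sesamestreet.com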

Preflighted Requests

As described earlier, even adding a header such as Authorization causes a request to be preflighted. For a preflighted request, the browser first sends a preliminary OPTIONS request (the “preflight request”) in order to determine whether the actual request (the “preflighted request”) is safe to send. Let’s look at the typical flow where we want to create a new monster resource:

1. A browser tab with SesameStreet.com makes an AJAX POST request to CookieMonster.com/api/monsters with a JSON payload. The browser knows that sending a POST with a Content-Type other than application/x-www-form-urlencoded, multipart/form-data or text/plain has to be preflighted, so it sends an OPTIONS request with three additional parameters:

  • Origin – this one we already know
  • Access-Control-Request-Method – HTTP method of the main (preflighted) request
  • Access-Control-Request-Headers – HTTP headers of the main (preflighted) request

2. The server responds with the allowed origin, methods and headers.

3. If the origin is allowed and if the HTTP method and headers of the main request are on the list returned by the server, the main request can be sent. This will be a regular cross-origin request, so it will include Origin header and the response will contain Access-Control-Allow-Origin once again.

Performance note: sending a preflight request every time can be a performance overhead. This can be mitigated by caching preflight requests using Access-Control-Max-Age response header.
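For reference, a preflight exchange for the POST example above might look roughly like this (again only a sketch – the values depend entirely on the server’s CORS configuration). The preflight request:

OPTIONS /api/monsters HTTP/1.1
Host: cookiemonster.com
Origin: http://www.sesamestreet.com
Access-Control-Request-Method: POST
Access-Control-Request-Headers: Content-Type

and the preflight response:

HTTP/1.1 204 No Content
Access-Control-Allow-Origin: http://www.sesamestreet.com
Access-Control-Allow-Methods: POST, GET, OPTIONS
Access-Control-Allow-Headers: Content-Type
Access-Control-Max-Age: 86400

Only after this does the browser send the actual POST, which carries the Origin header and receives Access-Control-Allow-Origin in its response, as described in step 3.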

Common Misconception About CORS

At first glance, CORS configuration on the server side looks like some sort of ACL (Access Control List) – a server returns the origin that it accepts requests from. The only way to access a resource is to send a request from the origin whitelisted by the server, right? Not really. Remember that HTTP isn’t used only by browsers and you can send an HTTP request from any client like curl, Postman, and so on. If you prepare a custom HTTP request in those tools, you can put any Origin header you want. You can also skip it and a server usually returns a correct result anyway. Why is that? Because, as I mentioned earlier, the Same-Origin Policy is a concept implemented in browsers. Other tools or software components don’t care about it that much.

There is a simple implication based on what you’ve just read. If there is a third-party API which you want your webpage to consume, but the API returns Access-Control-Allow-Origin header set to an origin other than your own, then you can circumvent that problem by setting up a proxy server. Why? Because as described above, SOP doesn’t apply to server-to-server communication, only to browser-to-server one.

Is CORS Safe?

The most important question – is the CSRF scenario from the beginning of this article possible using CORS? The answer is that it depends. By default, CORS does not include cookies in cross-origin requests. This is different from older cross-origin techniques, such as JSON-P (JSON with Padding). This behavior greatly reduces CSRF vulnerabilities. However, if you really want to send cookies in your request, you can explicitly permit that. This requires coordination between both the client and the server side. Your website must set the withCredentials property on the XMLHttpRequest to true (or the credentials property in the Fetch API to include). Additionally, the server must respond with the Access-Control-Allow-Credentials header set to true. With this combination, both parties agree to use credentials when sending a cross-origin request. Credentials are cookies, authorization headers or TLS client certificates. Thankfully, there is one security measure that prevents excessive exhibitionism – the following combination won’t work:

If you allow credentials to be sent, then Access-Control-Allow-Origin cannot be set to the wildcard. The server must return an explicit origin that is allowed to access the resource. The bad news is that many servers blindly generate the Access-Control-Allow-Origin header based on the Origin value from the user’s request. If that’s the case, using Access-Control-Allow-Credentials set to true can be a serious security hole.

The Total Beginner’s Guide to Better 2D Game Art

This article will introduce you to basic art concepts to give you a head start in making your own 2D game art. This is not a Tutorial! This article is for those with some vague familiarity with 2D art for games, primarily people who are programming games but would like to create quality assets, or those just getting started with creating art for games. By 2D assets, I’m referring to 2-dimensional images used for games – anything from character sprites to large backdrops. This article focuses on giving a brief introduction to good old-fashioned art skills and the ways they can make your game better. It’s meant to give you a brief introduction to some principles and ideas so you don’t have to waste your time discovering them the hard way or develop any bad habits you will then have to break.

I won’t be covering things like file formats, raster art vs vector art or what software to use in this article.

What I will be covering:

  • Form and Shape
  • Anatomy and Proportions
  • Perspective
  • Breaking Down Color
  • Lighting and Shading
  • Practice Makes Perfect

If those bullet points don’t grip your heart and tear at your soul, here’s this handy before-and-after demonstrating what you will learn:

beforeafter.png

An internet fact!

Okay, that’s actually what my programmer friend made and my, uh, vast improvement, but I think it’s a pretty good example of what happens when you apply a little artistic know-how to a design. We’re all used to looking at 2D images in everyday life, but knowing what things look good isn’t the same as understanding exactly why they look good. Any 2D image can be broken down into basic elements, and you can think about creating 2D art as combining those elements in a way that 1) Looks like what you meant it to be, and 2) Is not super ugly. For example, we all know what a square and a sphere look like, but how do they fit into the process of making an identifiable character? To answer that, we’re going to start with our first section:

Form and Shape

Knowing that shapes matter, you can apply them to make environments seem more or less friendly, or match (or intentionally mismatch) characters and objects to those environments. Start designs with only very basic shapes – I’m talking about circles, squares, and rectangles. Try a character made of squares, or one made out of just triangles, and then see who looks more like the hero and who looks more like the villain. Keeping your initial design thoughts to shapes only also lets you generate a lot of ideas without getting carried away trying to figure out the detail right away (more on that later with the “Practice Makes Perfect” section). Generally, sharp edges imply artificiality or evil while curves and roundness imply organic and good. Traditionally it’s thought of as a spectrum, with roundness on one end and jaggedness on the other, and squares somewhere in the middle. For a great example, think about the landscape of Mordor in the Lord of the Rings films, versus the rolling hills of the Shire. A round, friendly looking character wandering through an angular environment seems more unsettling than the same character in a predominantly round environment. In the same vein, you can easily make stylistic choices to influence how a player thinks of an area. Let’s take a look at another particularly good example… Let’s break down two characters that have a lot, but also pretty much nothing, in common: Godzilla and Barney the Dinosaur. What kinds of shapes make one look like a fearless killing machine and the other look like a friendly hugging, uh… machine?

barneygodzilla.jpg

Also, Godzilla has three fingers. (Barney image source; Godzilla image source.)

Think about it, both characters are T-rex-like monsters designed around the fact that a guy had to fit inside… but they’re on the opposite sides of the appeal spectrum. Why? It has a lot to do with one having smooth curves while the other is more angular with parts that are downright sharp (there are other reasons, which we’ll get into later). At a fundamental level, this has to do with our general comfortableness with round organic shapes versus our discomfort with sharp angularity. It’s no coincidence that “bad guys” tend to have spikes coming out of every conceivable surface (Bowser in Super Mario Bros being the archetype), while “good guys” like Mario himself tend to be, well, soft around the edges. When Sonic the Hedgehog was conceived as a cooler, more mature version of Mario, it’s no coincidence he was designed to be significantly sleeker and spikier than Mario. Let’s take a look at Barney and Godzilla again, this time in silhouette:

barneygodzillashape.png

Evilness of a character is correlated with how painful the action figure is to step on.

Silhouettes are very closely tied to the shape of an object, and are a great way to break down the shape of a character. Apart from any connotations of the shapes used, if a character does not have a distinct silhouette compared to other characters, it’s not a very good design. Some artists even go so far as to start with the silhouette and move inward to flesh out their subject. Reducing an object to just its silhouette can also be a great double-check once a design is underway, to make sure it’s looking right. In summary, when thinking up designs for your games, make sure you account for shape and form and the connotations those shapes often carry, so you get a design that conveys what you want it to. Also keep in mind that things are largely recognized by shape, so objects in your game should have distinct shapes in order for the player to identify them easily. Spikey the Sea Urchin as a protagonist, outside ironic appreciation, probably wouldn’t have a lot of appeal among Facebook gamers. TL;DR: Everything is made of shapes and forms, and different kinds have different subconscious connotations.

Anatomy and Proportions

Figure drawing is often considered to be the most difficult field of drawing, since people are structurally complicated masses of interconnected cartilage, muscles, bone, and skin. I’m not going to go into super detail since I don’t personally have a ton of experience, but there are hundreds of books and websites dedicated to figure drawing. The essential idea is that there are certain rules and relationships in terms of length, size, and position of anatomical features, which is important because anatomical errors stick out. The more stylized a figure, like Mickey Mouse, the less important strict adherence to anatomy becomes, but it’s a good idea to study realistic figures since by knowing the rules, you’ll be able to bend them better. You can think of human proportions as essentially shortcuts to get close to ideal anatomy by comparing the size of different parts of the body to each other. There are specific proportions to measure pretty much every part of the human figure, but the usual starting point is the head. Humans in real-life are around 7.5 heads tall, though often this is rounded to 8 to make a slightly more idealized figure:

eightheadman.png

There are many, many examples of this kind of diagram available. Google is your friend!

Changing the size of the head of a character compared to his/her/its body can have a pretty big impact on how that character feels. Big heads are more child-like, and so are more associated with friendly characters, while small headed characters feel more adult or even grandiose. Yet again, Godzilla and Barney help us out:

barneygodzillaproportions.jpg

Godzilla might seem more mature, but Barney is waaaaaay creepier like this.

TL;DR: For people to look right, they have to follow certain rules regarding proportions, and messing with different proportions can change the “feel” of a character. Pages to get you started: Proportions Guide by FOERVRAENGD; idrawdigital Tutorial: Anatomy and Proportion

Perspective

Perspective is all about creating the illusion of depth on a 2D surface by altering the appearance of shapes and forms, and is a pretty big subject so forgive me if I split it into sub-headings.

Geometric Perspective

Most 2D games simply avoid dealing with what I like to call “geometric” perspective, for the simple reason that implementing true-to-life perspective would be insanely time-consuming when creating 2D art for games. Games like to cheat their way out of that problem by adopting unrealistic viewpoints, such as assuming everything is seen perfectly from the side (like a 2D platformer), or from an isometric viewpoint, which is no more realistic, just subtler in its unreality. I want to go over it because it’s probably the hardest overall principle to truly understand, and even a very basic understanding will get you vastly better results. The Vanishing Point forms the basis for most formal perspective and is based around the idea that parallel lines appear to converge onto a single point as they recede from the viewer. LOLwut, you ask? Like this:

perspectiverail.jpg

This would be a more dramatic example if there was an oncoming train. (Image from Wikimedia Commons.)

Notice how the parallel lines (real and implied) converge? Maybe this will help:

perspectiveraillines.jpg

So I could have added more red lines, what of it?

Red lines converge on the vanishing point. Got it. What you also see dividing the earth and sky is the Horizon Line, which is where an infinite (from the viewer’s perspective) plane appears to end. Vanishing points and horizon lines at their core enforce a really simple idea: things that are far away appear smaller than things close up, and the closer side of an object will appear bigger than the farther side. The above example uses one vanishing point, but there are really as many vanishing points in a scene as there are sets of parallel lines, with each set having its own vanishing point. Sound complicated? It definitely is, which is why scenes are generally simplified down to one-, two-, or three-point perspective. What normally happens with one-point and two-point perspective is that one or more sets of parallel lines are assumed to stay parallel forever and never converge. Here’s an example of a cube and a cuboid in one-point perspective:

onepointperspective.jpg

That’s right… pencil and paper, sucka.

Note how the horizontal and vertical sets of edges are perfectly parallel. Now, here’s two-point perspective:

Twopointperspective.jpg

It’s traditional when you’re starting out with perspective to lightly draw the other side of the objects as well to get a better feel for the 3D-ness.

Here, the set of edges that were previously perfectly horizontal now get their own vanishing point. The vertical edges stay perfectly parallel. Finally, here’s three-point perspective:

twopointperspective.jpg

Three-point perspective pretty much entails epic-ness, at least in terms of height.

Now all edges get their own vanishing point. Good for them, right? It should be noted that vanishing points deal with parallel lines best, but by drawing guide lines or even full boxes you can get a better idea of how to approximate depth for complicated shapes. One-, two-, and three-point perspectives are by far the most common and useful, but there’s at least one artist who has used six-point perspective to create crazy spherical scenes. There’s an important trick to drawing tubes or any circular object in proper perspective, since circles in perspective deform in special ways. Circles look like ellipses when they’re viewed obliquely; the more oblique the view, the more squashed the ellipse:

obliquestraighton.jpg

I cannot tell you how many people don’t get this, so seriously and for real, circles turn into ellipses.

A simple rule is that when you’re looking up at a cylinder edge (like the roof of a round building), the curve bulges up. When you’re looking down, like at the base of a telephone pole, the circle bulges down. The line through the middle of this image is the horizon line:

lookuplookdown.jpg

This would have been a perfect candidate for shading to add depth, but we’ll get there.

However, you should remember that many 2D games avoid geometric perspective problems by picking viewpoints (from the side, perfectly top-down) that minimize the need for it.

Foreshortening

A common perspective concept in figure drawing is foreshortening, which describes how parts of the body appear in perspective. A fist held out at the viewer will not only appear bigger than it would at the character’s side, but it eclipses a huge part of the arm, too. I’m terrible at figure drawing so this won’t be the most professional-looking example ever:

foreshortening.jpg

Seriously, I suck.

Often, artists eyeball foreshortening for characters simply because laying out all the vanishing points would take too long. But for your edification, here’s foreshortening with vanishing points and cylinders, which are often used as proxy forms for limbs:

cylinderperspective.jpg

At least I can draw cylinders good…er, I mean “well”.

Keep in mind that characters, human characters especially, can be thought of as a series of simpler objects which are easier to comprehend. Sketching out characters as a series of tubes connected by joints before filling in the detail isn’t uncommon. Page to get you started: idrawdigital Tutorial: Foreshortening Tricks

Overlap and Parallax

Overlap is very simple: closer objects will overlap and mask farther objects. It’s very important for 2D games since it’s a very simple way to show the player their relationship to objects. Let’s take a quick look at an extremely simple example:

overlapparallax.png

Also known as the weird hills in the background of every Super Mario game.

From this set of lines you perceive that the circular thing (a bush?) on the right is in front of the others, while the tallest one is behind. This effect is sometimes called the “T-rule”, since the line of the object in front forms the top of a T compared to the object behind. It’s simple, but pretty powerful. In this example, all the T’s are upside-down:

trule.png

More like ASCII Code 193 rule, amirite?

Parallax is another important perspective effect having to do with the relationships of overlapping objects. Parallax is essentially the fact that objects that are far away appear to move less than closer objects when the viewer moves. Parallax is great for 2D games because it’s pretty easy to implement, and you have no doubt come across it. Wikipedia actually has a pretty decent article on using parallax in games, and I’d hate to waste your time regurgitating it.
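
If you want to see roughly what “easy to implement” means, here’s a minimal Python sketch; the layer names and scroll factors are made up purely for illustration, and a real game would apply the resulting offsets when drawing each layer:

# Minimal parallax sketch: each background layer scrolls by a fraction of the
# camera's movement. A factor of 0.0 would never move (infinitely far away),
# while 1.0 moves in lockstep with the foreground. Names/factors are illustrative.
layers = [
    {"name": "far_hills", "factor": 0.2},
    {"name": "near_trees", "factor": 0.6},
    {"name": "foreground", "factor": 1.0},
]
def layer_offsets(camera_x):
    """Return the horizontal draw offset for each layer."""
    return {layer["name"]: -camera_x * layer["factor"] for layer in layers}
print(layer_offsets(camera_x=100))
# -> {'far_hills': -20.0, 'near_trees': -60.0, 'foreground': -100.0}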

Atmospheric Perspective

Since 2D games often intentionally violate regular perspective rules for the simple reason that it’s easier to draw stuff for them, they have to rely on other means to get the idea of depth across. By making objects that are supposed to be far away from the viewer appear more washed out and less detailed, you can easily make the brighter, sharper looking things in the foreground appear more distinct. Here’s an example from real life, in a picture I took while visiting the gloriously smoggy People’s Republic of China:

chinasmog.jpg

For real, it’s pretty smoggy over there.

You can also see the parallel line perspective effect, although in this case the main vanishing point would be off to the left of the frame. The game applications are pretty staggering. Almost every 2D platformer ever made uses atmospheric perspective. Take this screenshot from Super Mario World:

supermarioworld.jpg

Also, overlap and parallax! Booya! Super Mario World Image Source

Notice that the farther in the background an object is, the more washed out it looks. In particular, how dark the outlines are tells you how close things are to the player’s plane. This also folds directly into the idea of contrast. Contrast can tell the player what’s important and what’s not. Take a look at that Super Mario World screenshot again. Blue hills that are lightly shaded? Not important. Pipe with a white highlight fading to total black? Important. The only bright red thing on the screen? Super important. Remember, interactive parts of a game should always stand out from non-interactive parts unless there’s a specific reason to obscure something from the player.
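
If you’d rather apply that wash-out programmatically than bake it into the art, one common approach is to blend each layer’s colours toward a horizon colour in proportion to its depth. A rough Python sketch, with made-up colour and depth values:

# Atmospheric perspective sketch: blend a colour toward a "horizon" colour in
# proportion to depth (0.0 = foreground, 1.0 = at the horizon). Values are
# illustrative 0-255 RGB tuples.
def wash_out(color, horizon_color, depth):
    return tuple(round(c + (h - c) * depth) for c, h in zip(color, horizon_color))
horizon = (190, 215, 235)                            # pale blue-grey sky
print(wash_out((34, 120, 40), horizon, depth=0.0))   # foreground: unchanged
print(wash_out((34, 120, 40), horizon, depth=0.7))   # far hills: washed out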

Pages to get you started: ArtyFactory.com Linear and Aerial Perspective; perspective-book.com Tutorial

Breaking Down Color

Color is a tricky thing, and one of the more subjective parts of art in general. Some colors look better to some people than others, and color combinations and connotations do not transcend cultures. White might be the color of purity in the west, but in Japan white often stands for death. However, there are a few basic ideas regarding color that will help you out in understanding what’s going on in your art. Let’s start with thinking about what makes up a particular color.

Hue, Saturation, Brightness

There are many ways to break down color, but this one I think is the most helpful for beginning digital artists.

Let’s start by comparing two colors:

redvsblue.png

Red vs Blue, get it? It seemed pretty clever at the time.

Red and Blue. They aren’t the same color. Pretty simple, right? Well, there’s actually a more precise term called Hue. The left square has a red hue and the right one has a blue hue. Other hues include green, orange, purple, and so on. While hue may seem like just a redundant term for color, it’s not, because the amount of any given hue in a color can change:

redvslessred.png

Red vs Less Red.

So they are both red, but how are they different? The one on the right is just kinda… washed out. The term we’re looking for is saturation. Saturation is basically the term for how colorful, um, a color is – how much hue it contains. I like to think of saturation as a measure of how much gray there is in a color. No gray = saturated. Lots of gray = unsaturated. So in this case, the square on the left is fully saturated while the one on the right is desaturated. Pure gray is simply a color with no saturation. Saturation is, in my opinion, the most devious of the color attributes for beginners to get the hang of. Just be aware that saturation has a big impact on the “tone” of your art. Highly saturated colors tend to look more, well, friendly when used in large amounts, whereas desaturated colors are associated with a grittier style. The last attribute is Brightness. It’s much more straightforward – it’s just how bright the color is (sometimes the term “value” is used instead, no biggie). Here’s the same red as above, but with a darker version:

redvsdarkerred.png

Same red, but darker (not desaturated).

The relationship between brightness and saturation takes a little getting used to, since they can appear to overlap:

saturationbrightness.png

I like drawing spaceships and explosions, but I also secretly like magenta.

Here’s an example of how color can affect the tone of a game, with Castlevania: Lords of Shadow set against New Super Mario.

supermariobros.jpg

Also note the lack of gibs and bloodsplatter you’d expect a Mario that size to generate when stepping on a goomba. Image Source

castlevania.jpg

Nothing clever, just wanted to point out how well those bright status bars stand out from the background. Image Source

You know what also relates back to color… Barney and Godzilla! Whooo! So anyway, think about the ways color makes them so different in terms of hue, brightness, and saturation, and what would happen if one or more of those attributes changed. What would happen if you left one attribute alone but traded them between the characters? Would a gray Barney still seem huggable?

barneygodzilla.jpg

There is no escape from the Godzilla-Barney comparison!

RGB in Brief

Congratulations! You now understand HSB (Hue, Saturation, Brightness) color (sometimes the “B” is swapped out for a “V” for value, but the meaning is the same). Pretty much any image software can use that definition alongside Red, Green, and Blue (RGB), and Cyan, Magenta, Yellow, and Key/Black (CMYK). I think HSB is a much more straightforward way of understanding what’s happening with colors, especially regarding how bright and how saturated a color is, which is what you need when you’re working on shading. You will have to work with RGB color in different applications, however, so let’s review that briefly. RGB simply describes colors in terms of Red, Green, and Blue, since all colors can be described as a combination of those three, which has to do with how your eyes process color information. Take some time and monkey around with color values to see how both the HSB and RGB values change, and how they relate to each other. Here’s the standard RGB overlap diagram (notice what happens when the colors overlap):

rgb.png

Also known as an additive color model, since colors are created by adding light, rather than absorbing light (subtractive model)

Also note how when all three are combined at full strength, they make white. You can think of the colors as playing tug-of-war: when they all have the same value, the hues cancel each other out and you get gray. But the colors that different combinations yield can be kind of confusing, so for working on artwork, I would lean towards HSB.
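
If you want to poke at the two models side by side in code rather than in an art program, Python’s standard colorsys module converts between RGB and HSV (colorsys calls brightness “value”); this little sketch just desaturates and darkens a red:

import colorsys
# colorsys works with components in the 0.0-1.0 range; HSV's "V" (value) is the
# same idea as the "B" (brightness) discussed above.
r, g, b = 1.0, 0.25, 0.25                      # a strongly saturated red
h, s, v = colorsys.rgb_to_hsv(r, g, b)
print(h, s, v)                                 # hue 0.0 (red), s=0.75, v=1.0
washed = colorsys.hsv_to_rgb(h, s * 0.3, v)    # same hue/brightness, less colourful
darker = colorsys.hsv_to_rgb(h, s, v * 0.5)    # same hue/saturation, half as bright
print(washed, darker)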

The Color Wheel

Now that we have defined what a color is, let’s start looking at color combinations. Color theory is complicated and pretty subjective, so what follows isn’t meant to be an ironclad explanation, but a guide of what to think about. The color wheel forms the basis of most color theory. It’s basically an arrangement of hues by their perceived relationship, with Red, Yellow, and Blue at the thirds of the wheel (the so-called primary colors), with Green, Orange, and Purple (secondary colors) between them.

wheelofcolor.png

Wheel of Color would be a really stupid game show.

Hues are also commonly split into Warm and Cool categories, termed color temperature, with red-yellow colors being described as warm and blue colors being described as cool, as below:

coolwarm.png

Spike Lee’s film Do the Right Thing was oranger than normal to make it seem “hotter.” I learned that in a film class, and it seemed relevant to bring up.

I added a zone of iffiness, since those colors are kind of borderline – I’ve seen the yellow-green treated as cool and the magenta as warm. The important thing to remember is that cooler colors are associated more with darker shades, so a cool shadow will be perceived as darker than a warmer one that is technically the same value (brightness). Other relationships between colors can be explained using the wheel, too. Analogous colors are simply hues next to each other on the color wheel, like green, yellow, and the colors between them. Complementary colors are hues 180 degrees apart from each other that appear more vibrant when used together. You’ve probably seen them in action, even if you didn’t know why; blue and orange has even become a trope.

complementary.png

If you’re using Firefox, look at the icon. Complementary colors strike again!

When working on game art, think about associating colors with specific factions or enemies, and with environments or levels. Color-coding isn’t mandatory, but you can use it as a way to bend player perception. Think of a set of colors for bad guys, but use unique shades of those colors for specific enemies, for example. And don’t be afraid to experiment and try lesser-used colors. In any reasonably advanced art program, like GIMP, color is actually easier to change than almost any other attribute – it’s one of the few things you can change after completion relatively easily. TL;DR: Colors can be divided and related to one another in different ways, and different combinations of colors can make their individual component colors look different, for better or worse.

Page to get you started: Color Theory for Designers

Lighting and Shading

I’m going to be using lots of pixel art examples, but the basic concepts are just as applicable to any other type of 2D art.

Light Sources

The most common issue I see with beginning artists is that they don’t understand lighting. Shading a drawing generally means applying different shades to create the illusion of light, just like perspective is the illusion of depth. And just as with perspective, you have to create some 2D stand-ins to mimic real-world effects. There’s really only one rule: light has to come from somewhere. It doesn’t just appear, which means that laying down shades without a light source in mind will always look wrong. What happens pretty often with beginners is that they try to “shade” their art without understanding lighting, which results in objects that look like this:

pillowshading.png

Seriously, don’t do this.

Compared to the unshaded version:

noshading.png

You might even be better off with this version than the one above.

It’s called pillow shading, and it can be easy to do without thinking about it. It can seem natural to just color from the outside in with darker shades… but it looks completely fake. In order for lighting to look right, it has to be directional, with light and dark shades chosen depending on whether or not a side of the subject faces the light source(s). A light source could be the sun, a lamp, a boiling hot lake of molten lava, etc., but it doesn’t have to be something that specific. For example, you can just assume almost all light is coming in at 45 degrees from infinity, and your subject will be shaded well enough for most applications. For animated sprites that are going to be used in a variety of environments, a little vagueness helps keep the sprite from looking too out of place on any background. Here’s an example using a light source coming from the top left somewhere:

anchor.png

This also requires you to think about if there’s a part casting a shadow over another.

Parts facing the light source are lighter and parts facing away are darker, couldn’t be simpler, right? Of course, it’s not always that simple…

Flat vs Round

Keep in mind that flat surfaces generally have almost the same shade across their entire surface, whereas curved surfaces will show a gradient of shades. Here’s a neat real-world example (with fighter jets!) of how this works:

f117.jpg

An F-117. I actually grew up with these flying over my house.

Notice how all the panels have the same shade unless they’re actually in the shadow of a different part of the airplane? Now, let’s look at a more normal jet (an F-15):

f15.jpg

Whooo America! Except this one is actually Saudi Arabian – Gotcha!

Relate back to the Shape and Form section. Which one of these bad boys would you cast as the good guy and which as the bad guy going on looks alone?

Here, you can see an actual gradient transition between light and dark. Check out the left wing, it’s an almost perfect gradient. Now let’s go back to that pillow shaded mess from earlier:

pillowshading.png

simpleshading.png

The light sources for the cube and the sphere aren’t quite the same. What’s different?

Note how the cube only really needs one value per side at this scale, while the sphere requires many more values to mimic the gradated nature of shadows on curved surfaces.

Bounced Light

The kind of shading above is simplified, since light can also bounce off surfaces and light up shadows. This often means that the part of the shadow farthest from the main light source is actually lighter than the rest of the shadow. This is most noticeable when an object is extremely close to a reflective surface, or is just plain big. Here’s the classic example of how this looks:

spherereflectedlightpencil.png

Also note how the shadow it casts also gives you a better sense of depth.

Here’s a couple digital examples of the same thing.

spherereflectedlightcomparison.png

If these spheres were sitting on a blue surface, the reflected light on them would also be blue.

The left one is an example of bounced light right at the edge, which happens with highly reflective surfaces. The shinier the object, the more obvious and distinct this bounced light appears. Speaking of shininess, lighting and shading can reveal not only an object’s form, but also its texture.

Hue Shifting

Hue shifting relates somewhat to bounced light, and comes up with pixel art a lot. It basically refers to how the hue of a shadow or highlight isn’t necessarily just a darker or lighter version of the base color. The most common usage is for objects that are supposed to be in the sun. The direct sunlight tends to be a little yellow, but the blue sky reflects blue light into the shadows, so you have yellowish highlights and blue-ish shadows.

hueshifting.png

This also relates back to warm colors and cool colors, with cool shadows and warm highlights.

This idea becomes more important when you have multiple light sources (like underlighting from lava or whatever) that are a different color from the other light sources. Remember, colored light affects the color of the object. But hue shifting can also be simply a stylistic choice, and by exaggerating the effect or substituting complementary colors you can get some pretty neat effects:

stylizedshading.png

Doing this too much will make your game look like it’s trapped inside Instagram.

Keep in mind that shadows can also appear to be less saturated, and that less-saturated colors can appear darker than they really are. There’s not a total consensus on hue shifting in the art world, so find a way that you like, but keep in mind the more you shift, the more surreal your art will seem.
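
As a rough sketch of what hue shifting can look like in code (the base colour and shift amounts below are arbitrary stylistic choices, not a rule), you can nudge a colour’s hue one way around the wheel while darkening it for a shadow, and the other way while lightening it for a highlight:

import colorsys
def shifted_shade(rgb, hue_shift, value_scale, sat_scale=1.0):
    # rgb components are 0.0-1.0; hue wraps around the wheel, hence the modulo.
    h, s, v = colorsys.rgb_to_hsv(*rgb)
    return colorsys.hsv_to_rgb((h + hue_shift) % 1.0,
                               min(s * sat_scale, 1.0),
                               min(v * value_scale, 1.0))
base = (0.3, 0.7, 0.35)                                            # a mid green
shadow = shifted_shade(base, hue_shift=0.08, value_scale=0.6)      # nudged toward blue, darker
highlight = shifted_shade(base, hue_shift=-0.05, value_scale=1.3)  # nudged toward yellow, lighter
print(shadow, highlight)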

Shading and Texture

The texture of an object affects how light bounces off of it, so naturally changing how you shade something can also change what its texture looks like. There are specific terms for certain types of textures that will help you in thinking about different types of texture, too:

flatmattegloss.png

Knowing this will let you buy paint without having to ask the forlorn-looking old guy working in the home improvement department for help.

A gloss surface is just a shiny surface, where the light bounces off a particular angle of the surface almost all in the same direction, with very little scattering. That means the brightest part is very bright (since you’re getting lots of light from that one place at once) and the darkest part is very dark (since the light is sticking together and going somewhere else). A good example of a gloss surface is the body of a freshly waxed car. A flat texture is the opposite, where light scatters off the surface at many different angles. That means the brightness is more even, since no part of the object is directing all of its light in a particular direction. Old car tires are a pretty good example of a flat surface, as is modeling clay. A matte surface is somewhere in between: it reflects light in chunks, but scatters a lot too. A lot of plastic has a matte finish, like most keyboards. So when you’re drawing, think about what kind of material the thing you’re shading is supposed to be. Is it shiny metal, or flat cloth? You don’t want your medieval characters wearing plastic-looking garb, and you probably don’t want your sleek sci-fi armor to look silky soft. TL;DR: Light has to come from somewhere, or look like it does, for 2D images to look right.

Page to get you started: Android Arts Tutorial by Niklas Jansson

Practice Makes Perfect

So now that all those concepts are laid out, what are you supposed to do? Well, start trying stuff out! Don’t be afraid to jump right in. It really is true that anyone can draw. Sure, some people have a particular knack, but the biggest separator between a bad artist and a good one is how much they’ve practiced. You’ve got to do it a lot to get good at it. Practice, but practice smart. Game projects provide plenty of opportunity for practice, so if you have a project in mind, start creating art for it if you haven’t already (after reading this article all the way through, though). If you don’t have one in mind, find one! Even the smallest game project involves enough art that you’ll get the practical experience to be noticeably better by the next one. And fortunately for non-artists, game art doesn’t have to be Italian Renaissance quality to be functional.

Three P’s: Pencil, Paper, Practice

The only way to get better at drawing is to practice, and the cheapest and easiest way is with good ol’ fashioned pencil and paper. It might be tempting to simply stick with digital-only for all steps in the design process, since that’s where your final product will end up, but resist! Drawing by hand gets you more involved in the process and will help you avoid some of the more dangerous habits that relying on software tools can get you into. Those tools can be great, and it might seem easier at first to make sprites using the square tool, but trust me when I say that you’ll end up doing ridiculous and ugly things that you would never do in a pencil sketch. There will be plenty of time later to mercilessly exploit tools, tricks, and shortcuts once you’re conscious of the basic principles. It might seem awkward at first if you’ve gotten used to doing things digitally only, but pencil and paper are the starting point for artists the world over for a reason. With that in mind, I recommend buying a sketchbook, some pencils (I like mechanical pencils, but it doesn’t really matter at this level), and a separate eraser like a Magic Rub, since you’ll be erasing way more often than the #2 pencil designer gods intended. You don’t absolutely need a sketchbook – the real key is that you need to practice, and to that end the margins of your class notebook or a sticky note at work aren’t worlds apart. A sketchbook does let you keep all your work in one place, though, so you don’t have to hunt down that one really sweet enemy design that you put on the corner of your homework or a memo at work.

Sketching

The key to pencil sketching is to think of all the lines as temporary suggestions rather than permanent representations. What? Don’t get attached to your lines! Sketch over them, erase them, make new ones without regard for old ones. Of course, make all your lines fairly light when doing this. Start with the basic shape of your subject, and add detail incrementally. Most things can be approximated by basic forms, namely the sphere, the cylinder, and the box, which is especially useful for drawing objects in perspective. For example, don’t draw a more or less finished head, then move to the chest, then arms, then legs, etc. because focusing on detail will make you lose sight of how those parts fit together. Start with a big rough sketch of everything together and add detail on top of that. Don’t get attached to any lines in the beginning – don’t be afraid to ignore lines and draw other lines on top until you get an overall shape that looks good, and don’t be afraid to simply restart if things aren’t going your way.

This video illustrates the process perfectly, as the artist builds the basic framework of the character, puts some rough shapes on top, then proceeds to add more and more detail – erasing and re-doing parts that look bad along the way. Here’s a little image out of one of my old sketchbooks, complete with funky-looking man:

anotherfigure.jpg

Another figure – Don’t look! Um, I guess he has a huge zit on his face? What was I thinking?

Draft, Draft, Draft

It might sound crazy at first, but you should sketch at least three versions of any character/object/setting before committing to a digital version to use in a game. Major studios often create literally dozens of concepts for a single character before even thinking about picking one. Even for background assets like trees or bushes that aren’t interactive, sketching three versions to get one final asset is not an unusual ratio. Just like turning in the first draft of a term paper, making the first version the only and therefore final version is a huge risk, and not one worth taking. By trying three different ways, you can also take the best parts of each version and combine them into a stronger final version. Here’s a simple example of a couple of cool space helmets, both of which are different from the final design below (and based on even rougher earlier sketches):

drafthelmet.jpg

Shout out to Anatomy and Proportions section since it’s hard to make a helmet without knowing how skulls are shaped.

shadedhelmet.png

The top part should really be casting a shadow on the visor… oh well.

This might seem burdensome, so it’s important to keep in mind that these sketches are rough, rough, rr-rrr-rooouuugh drafts. Don’t spend much time on them. In fact, less time is often better, since the longer you dwell on a piece, the less flexible you become about revising it or making the next version different. Leave the detail out at first; just get the general idea down and move on. If you feel like it, you can then go back and add more detail to your first contender. Be prepared to draw a lot, and be prepared to get a little frustrated sometimes. If your art looks iffy at best to you, congratulations, you are a human being. Your next drawings will probably be better, and the ones after that better still. Remember, getting frustrated is standard – if drawing were as easy as it looks, this article wouldn’t need to exist. In fact, if you aren’t getting at least a little frustrated drawing for a while, you either aren’t pushing yourself, or your contacts fell out and you’ve convinced yourself that blurry mess was totally your intention all along. TL;DR: Draft all your game art by sketching out several versions first with pencil and paper, without worrying about being perfect.

Related Page: Sketching: The Visual Thinking Power Tool

Conclusion

Hopefully now that you are familiar with these concepts, you can go forth and create with the knowledge you need to not suck. I mean, be incredible! Seriously though, creating art isn’t easy and it takes a lot of practice, but just having some idea of these concepts is fantastic. As I said in the introduction, most of the information here is in the context of creating 2D art for games, and doesn’t necessarily reflect what you would get in an Art 101 class.

Further Reading

I’ve included links within the text, usually at the end of relevant sections, but if you’re interested in further information about game art, particularly character creation, I have to highly recommend Chris Solarski’s book Drawing Basics and Video Game Art. This article owes a lot to him and his book, and you can read some of his writing on Gamasutra.

The Total Beginner’s Guide to Game AI

Introduction

This article will introduce you to a range of introductory concepts used in artificial intelligence for games (or ‘Game AI’ for short) so that you can understand what tools are available for approaching your AI problems, how they work together, and how you might start to implement them in the language or engine of your choice.

We’re going to assume you have a basic knowledge of video games, and some grasp on mathematical concepts like geometry, trigonometry, etc. Most code examples will be in pseudo-code, so no specific programming language knowledge should be required.

What is Game AI?

Game AI is mostly focused on which actions an entity should take, based on the current conditions. This is what the traditional AI literature refers to as controlling ‘intelligent agents’ where the agent is usually a character in the game – but could also be a vehicle, a robot, or occasionally something more abstract such as a whole group of entities, or even a country or civilization. In each case it is a thing that needs to observe its surroundings, make decisions based on that, and act upon them. This is sometimes thought of as the Sense/Think/Act cycle:

  • Sense: The agent detects – or is told about – things in their environment that may influence their behaviour (e.g. threats nearby, items to collect, points of interest to investigate)
  • Think: The agent makes a decision about what to do in response (e.g. considers whether it is safe enough to collect items, or whether it should focus on fighting or hiding first)
  • Act: The agent performs actions to put the previous decision into motion (e.g. starts moving along a path towards the enemy, or towards the item, etc)
  • …the situation has now changed, due to the actions of the characters, so the cycle must repeat with the new data.

Real-world AI problems, especially the ones making the news at the moment, are typically heavily focused on the ‘sense’ part of this cycle. For example, autonomous cars must take images of the road ahead, combine them with other data such as radar and LIDAR, and attempt to interpret what they see. This is usually done by some sort of machine learning, which is especially good at taking a lot of noisy, real-world data (like a photo of the road in front of a car, or a few frames of video) and making some sense of it, extracting semantic information such as “there is another car 20 yards ahead of you”. These are referred to as ‘classification problems’.

Games are unusual in that they don’t tend to need a complex system to extract this information, as much of it is intrinsic to the simulation. There’s no need to run image recognition algorithms to spot if there’s an enemy ahead; the game knows there is an enemy there and can feed that information directly into the decision-making process. So the ‘sense’ part of the cycle is often much simpler, and the complexity arises in the ‘think’ and ‘act’ implementations.

Constraints of Game AI development

AI for games usually has a few constraints it has to respect:

  • It isn’t usually ‘pre-trained’ the way a machine learning system would be; it’s not practical to train a neural network during development by observing tens of thousands of players to learn the best way to play against them, because the game isn’t released yet and there are no players!
  • The game is usually supposed to provide entertainment and challenge rather than be ‘optimal’ – so even if the agents could be trained to take the best approach against the humans, this is often not what the designers actually want.
  • There is often a requirement for agents to appear ‘realistic’, so that players can feel that they’re competing against human-like opponents. The AlphaGo program was able to become far better than humans but the moves chosen were so far from the traditional understanding of the game that experienced opponents would say that it “almost felt like I was playing against an alien”. If a game is simulating a human opponent, this is typically undesirable, so the algorithm would have to be tweaked to make believable decisions rather than the ideal ones.
  • It needs to run in ‘real-time’, which in this context means that the algorithm can’t monopolise CPU usage for a long time in order to make the decision. Even taking just 10 milliseconds to make a decision is far too long because most games only have somewhere between 16 and 33 milliseconds to perform all the processing for the next frame of graphics.
  • It’s ideal if at least some of the system is data-driven rather than hard-coded, so that non-coders can make adjustments, and so that adjustments can be made more quickly.

With that in mind, we can start to look at some extremely simple AI approaches that cover the whole Sense/Think/Act cycle in ways that are efficient and allow the game designers to choose challenging and human-like behaviours.

Basic Decision Making

Let’s start with a very simple game like Pong. The aim is to move the ‘paddle’ so that the ball bounces off it instead of going past it, the rules being much like tennis in that you lose when you fail to return the ball. The AI has the relatively simple task of deciding which direction to move the paddle.

Hardcoded conditional statements

If we wanted to write AI to control the paddle, there is an intuitive and easy solution – simply try and position the paddle below the ball at all times. By the time the ball reaches the paddle, the paddle is ideally already in place and can return the ball.

A simple algorithm for this, expressed in ‘pseudocode’ below, might be:

every frame/update while the game is running:
    if the ball is to the left of the paddle:
        move the paddle left
    else if the ball is to the right of the paddle:
        move the paddle right

Providing that the paddle can move at least as fast as the ball does, this should be a perfect algorithm for an AI player of Pong. In cases where there isn’t a lot of sensory data to work with and there aren’t many different actions the agent can perform, you don’t need anything much more complex than this.

This approach is so simple that the whole Sense/Think/Act cycle is barely visible. But it is there.

  • The ‘sense’ part is in the 2 “if” statements. The game knows where the ball is, and where the paddle is. So the AI asks the game for those positions, and thereby ‘senses’ whether the ball is to the left or to the right.
  • The ‘think’ part is also built in to the two “if” statements. These embody 2 decisions, which in this case are mutually exclusive, and result in one of three actions being chosen – either to move the paddle left, to move it right, or to do nothing if the paddle is already correctly positioned.
  • The ‘act’ part is the “move the paddle left” and “move the paddle right” statements. Depending on the way the game is implemented, this might take the form of immediately moving the paddle position, or it might involve setting the paddle’s speed and direction so that it gets moved properly elsewhere in the game code.

Approaches like this are often termed “reactive” because there is a simple set of rules – in this case, ‘if’ statements in the code – which react to the current state of the world and immediately decide how to act.
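
Stepping outside the article’s pseudocode for a moment, a minimal runnable sketch of that reactive rule might look like this in Python (the Ball and Paddle classes and the speed value are hypothetical stand-ins for whatever your game actually uses):

# Minimal reactive Pong AI sketch; Ball, Paddle, and the loop below are
# hypothetical stand-ins for real game objects and the real game loop.
class Ball:
    def __init__(self, x):
        self.x = x
class Paddle:
    def __init__(self, x, speed=5):
        self.x, self.speed = x, speed
def update_ai(ball, paddle):
    # Sense: read positions. Think: compare them. Act: move the paddle.
    if ball.x < paddle.x:
        paddle.x -= paddle.speed
    elif ball.x > paddle.x:
        paddle.x += paddle.speed
    # else: already lined up with the ball, so do nothing
ball, paddle = Ball(x=40), Paddle(x=100)
for _ in range(20):              # stand-in for "every frame/update"
    update_ai(ball, paddle)
print(paddle.x)                  # -> 40: the paddle has moved under the ball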

Decision Trees

This Pong example is actually equivalent to a formal AI concept called a ‘decision tree’. This is a system where the decisions are arranged into a tree shape and the algorithm must traverse it in order to reach a ‘leaf’ which contains the final decision on which action to take. Let’s draw a visual representation of a decision tree for the Pong paddle algorithm, using a flowchart:

DecisionTree1.png

You can see that it resembles a tree, although an upside-down one!

Each part of the decision tree is typically called a ‘node’ because AI uses graph theory to describe structures like this. Each node is one of two types:

  1. Decision Nodes: a choice between 2 alternatives based on checking some condition, each alternative represented as its own node;
  2. End Nodes: an action to take, which represents the final decision made by the tree.

The algorithm starts at the first node, designated the ‘root’ of the tree, and either takes a decision on which child node to move to based on the condition in the node, or executes the action stored in the node and stops.

At first glance, it might not be obvious what the benefit is here, because the decision tree is obviously doing the same job as the if-statements in the previous section. But there is a very generic system here, where each decision has precisely 1 condition and 2 possible outcomes, which allows a developer to build up the AI from data that represents the decisions in the tree, avoiding hardcoding it. It’s easy to imagine a simple data format to describe the tree like this:

Node number  Decision (or ‘End’)  Action  Action
1  Is Ball Left Of Paddle?  Yes? Check Node 2  No? Check Node 3
2  End  Move Paddle Left
3  Is Ball Right Of Paddle?  Yes? Check Node 4  No? Check Node 5
4  End  Move Paddle Right
5  End  Do Nothing

On the code side, you’d have a system to read in each of these lines, create a node for each one, hook up the decision logic based on the 2nd column, and hook up the child nodes based on the 3rd and 4th columns. You still need to hard-code the conditions and the actions, but now you can imagine a more complex game where you add extra decisions and actions and can tweak the whole AI just by editing the text file with the tree definition in it. You could hand the file over to a game designer who can tweak the behaviour without needing to recompile the game and change the code – provided you have already supplied useful conditions and actions in the code.
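
As a sketch of what that code side might look like (the dictionaries, condition names, and action names below are just illustrative choices, and in a real game the node table would be loaded from the text file rather than inlined):

# Data-driven decision tree sketch, mirroring the table above.
nodes = {
    1: {"condition": "ball_left_of_paddle", "yes": 2, "no": 3},
    2: {"action": "move_paddle_left"},
    3: {"condition": "ball_right_of_paddle", "yes": 4, "no": 5},
    4: {"action": "move_paddle_right"},
    5: {"action": "do_nothing"},
}
# The conditions themselves are still hard-coded by the programmer:
conditions = {
    "ball_left_of_paddle": lambda game: game["ball_x"] < game["paddle_x"],
    "ball_right_of_paddle": lambda game: game["ball_x"] > game["paddle_x"],
}
def run_tree(node_id, game):
    node = nodes[node_id]
    if "action" in node:                     # end node: this is the final decision
        return node["action"]
    branch = "yes" if conditions[node["condition"]](game) else "no"
    return run_tree(node[branch], game)      # decision node: descend into a child
print(run_tree(1, {"ball_x": 40, "paddle_x": 100}))    # -> move_paddle_left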

Where decision trees can be really powerful is when they can be constructed automatically based on a large set of examples (e.g. using the ID3 algorithm). This makes them an effective and highly-performant tool to classify situations based on the input data, but this is beyond the scope of a simple designer-authored system for having agents choose actions.

Scripting

Earlier we had a decision tree system that made use of pre-authored conditions and actions. The person designing the AI could arrange the tree however they wanted, but they had to rely on the programmer having already provided all the necessary conditions and actions they needed. What if we could give the designer better tools which allowed them to create some of their own conditions, and maybe even some of their own actions?

For example, instead of the coder having to write code for the conditions “Is Ball Left Of Paddle” and “Is Ball Right Of Paddle”, he or she could just provide a system in which the designer writes the conditions that check those values themselves. The decision tree data might end up looking more like this:

Node number  Decision (or ‘End’)  Action  Action
1  ball.position.x < paddle.position.x  Yes? Check Node 2  No? Check Node 3
2  End  Move Paddle Left
3  ball.position.x > paddle.position.x  Yes? Check Node 4  No? Check Node 5
4  End  Move Paddle Right
5  End  Do Nothing

This is the same as above, but the decisions have their own code in them, looking a bit like the conditional part of an if-statement. On the code side, this would read in the 2nd column for the Decision nodes, and instead of looking up a specific hard-coded condition (like “Is Ball Left Of Paddle”), it would evaluate the conditional expression and return true or false accordingly. This can be done by embedding a scripting language, like Lua or AngelScript, which allows the developer to take objects in their game (e.g. the ball and the paddle) and expose variables which can be accessed in script (e.g. “ball.position”). The scripting language is usually easier to write than C++ and doesn’t require a full compilation stage, so it’s very suitable for making quick adjustments to game logic and for allowing less technical members of the team to shape features without a coder’s intervention.

In the above example the scripting language is only being used to evaluate the conditional expression, but there’s no reason the output actions couldn’t be scripted too. For example, the action data like “Move Paddle Right” could become a script statement like “paddle.position.x += 10”, so that the action is also defined in the script, without the programmer needing to hard-code a MovePaddleRight function.

Going one step further, it’s possible (and common) to take this to the logical conclusion of writing the whole decision tree in the scripting language instead of as a list of lines of data. This would be code that looks much like the original hardcoded conditional statements we introduced above, except now they wouldn’t be ‘hardcoded’ – they’d exist in external script files, meaning they can be changed without recompiling the whole program. It is often even possible to change the script file while the game is running, allowing developers to rapidly test different AI approaches.
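
Here’s a rough sketch of the scripted version, again in Python rather than an embedded language like Lua; the eval/exec calls against a small namespace stand in for a real script binding layer (and are for illustration only, not something to run on untrusted data):

# Sketch of scripted conditions and actions: the tree data now contains
# expressions evaluated against game objects exposed to the "script" environment.
class Vec:
    def __init__(self, x):
        self.x = x
class Entity:
    def __init__(self, x):
        self.position = Vec(x)
ball, paddle = Entity(40), Entity(100)
script_env = {"ball": ball, "paddle": paddle}      # objects visible to the scripts
nodes = {
    1: {"condition": "ball.position.x < paddle.position.x", "yes": 2, "no": 3},
    2: {"action": "paddle.position.x -= 10"},
    3: {"condition": "ball.position.x > paddle.position.x", "yes": 4, "no": 5},
    4: {"action": "paddle.position.x += 10"},
    5: {"action": "pass"},
}
def run_tree(node_id):
    node = nodes[node_id]
    if "action" in node:
        exec(node["action"], {}, script_env)       # scripted action
        return
    branch = "yes" if eval(node["condition"], {}, script_env) else "no"
    run_tree(node[branch])
run_tree(1)
print(paddle.position.x)                           # -> 90: moved toward the ball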

Responding to events

The examples above were designed to run every frame in a simple game like Pong. The idea is that they can continuously run the Sense/Think/Act loop and keep acting based on the latest world state. But with more complex games, rather than evaluating everything, it often makes more sense to respond to ‘events’, which are notable changes in the game environment.

This doesn’t apply so much to Pong, so let’s pick a different example. Imagine a shooter game where the enemies are stationary until they detect the player, and then take different actions based on who they are – the brawlers might charge towards the player, while the snipers will stay back and aim a shot. This is still essentially a basic reactive system – “if player is seen, then do something” – but it can logically be divided up into the event (“Player Seen”), and the reaction (choose a response and carry it out).

This brings us right back to our Sense/Think/Act cycle. We might have a bit of code which is the ‘Sense’ code, and that checks, every frame, whether the enemy can see the player. If not, nothing happens. But if so, it creates the ‘Player Seen’ event. The code would have a separate section which says “When ‘Player Seen’ event occurs, do <xyz>” and the <xyz> is whichever response you need to handle the Thinking and Acting. On your Brawler character, you might hook up your “ChargeAndAttack” response function to the Player Seen event – and on the Sniper character, you would hook up your “HideAndSnipe” response function to that event. As with the previous examples, you can make these associations in a data file so that they can be quickly changed without rebuilding the engine. And it’s also possible (and common) to write these response functions in a scripting language, so that the designers can make complex decisions when these events occur.
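
A minimal sketch of that wiring (the event name, character labels, and responses below are all illustrative):

# Event-driven responses sketch: the "Sense" code emits events, and each
# character type subscribes the handler it wants for a given event.
handlers = {}                                      # event name -> list of callbacks
def subscribe(event_name, callback):
    handlers.setdefault(event_name, []).append(callback)
def emit(event_name, **data):
    for callback in handlers.get(event_name, []):
        callback(**data)
def charge_and_attack(enemy, player):              # a Brawler's response
    print(f"{enemy} charges at {player}")
def hide_and_snipe(enemy, player):                 # a Sniper's response
    print(f"{enemy} takes cover and aims at {player}")
subscribe("player_seen", lambda player: charge_and_attack("Brawler", player))
subscribe("player_seen", lambda player: hide_and_snipe("Sniper", player))
# Somewhere in the per-frame Sense code, when the visibility check passes:
emit("player_seen", player="Player 1")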

Advanced Decision Making

Although simple reactive systems are very powerful, there are many situations where they are not really enough. Sometimes we want to make different decisions based on what the agent is currently doing, and representing that as a condition is unwieldy. Sometimes there are just too many conditions to effectively represent them in a decision tree or a script. Sometimes we need to think ahead and estimate how the situation will change before deciding our next move. For these problems, we need more complex solutions.

Finite state machines

A finite state machine (or FSM for short) is a fancy way of saying that some object – for example, one of our AI agents – is currently in one of several possible states, and that it can move from one state to another. There are a finite number of those states, hence the name. A real-life example is a set of traffic lights, which will go from red, to yellow, to green, and back again. Different places have different sequences of lights, but the principle is the same – each state represents something (such as “Stop”, “Go”, “Stop if possible”, etc.), it is only in one state at any one time, and it transitions from one to the next based on simple rules.

This applies quite well to NPCs in games. A guard might have the following distinct states:

  • Patrolling
  • Attacking
  • Fleeing

And you might come up with the following rules for when they change states:

  • If a guard sees an opponent, they attack
  • If a guard is attacking but can no longer see the opponent, go back to patrolling
  • If a guard is attacking but is badly hurt, start fleeing

This is simple enough that you can write it as simple hard-coded if-statements, with a variable storing which state the guard is in and various checks to see if there’s an enemy nearby, what the guard’s health level is like, etc. But imagine we want to add a few more states:

  • Idling (between patrols)
  • Searching (when a previously spotted enemy has hidden)
  • Running for help (when an enemy is spotted but is too strong to fight alone)

And the choices available in each state are typically limited – for example, the guard probably won’t want to search for a lost enemy if their health is too low.

Eventually a long list of “if <x and y but not z> then <p>” gets a bit too unwieldy, and it helps to have a formalised way to think about the states and the transitions between them. To do this, we consider all the states, and under each state, we list all the transitions to other states, along with the conditions necessary for each. We also need to designate an initial state so that we know where to start, before any other conditions apply.

State Transition Condition New State
Idling have been idle for 10 seconds Patrolling
enemy visible and enemy is too strong Finding Help
enemy visible and health high Attacking
enemy visible and health low Fleeing
Patrolling finished patrol route Idling
enemy visible and enemy is too strong Finding Help
enemy visible and health high Attacking
enemy visible and health low Fleeing
Attacking no enemy visible Idling
health low Fleeing
Fleeing no enemy visible Idling
Searching have been searching for 10 seconds Idling
enemy visible and enemy is too strong Finding Help
enemy visible and health high Attacking
enemy visible and health low Fleeing
Finding Help friend visible Attacking
Start state: Idling

This is known as a state transition table and is a comprehensive (if unattractive) way of representing the FSM.  From this data, it’s also possible to draw a diagram and get a comprehensive visual oversight of how the NPC’s behaviour might play out over time.

StateMachine1v2.png

This captures the essence of the decision making for that agent based on the situation it finds itself in, with each arrow showing a transition between states, if the condition alongside the arrow is true.

Every update or ‘tick’ we check the agent’s current state, look through the list of transitions, and if the conditions are met for a transition, change to the new state. The Idling state will, every frame or tick, check whether the 10 second timer has expired, and if it has, trigger the transition to the Patrolling state. Similarly, the Attacking state will check if the agent’s health is low, and if so, transition to the Fleeing state.

That handles the transitions between states – but what about the behaviours associated with the states themselves?  In terms of carrying out the actual behaviours for a given state, there are usually 2 types of ‘hook’ where we attach actions to the finite state machine:

  1. Actions we take periodically, for example every frame or every ‘tick’, for the current state.
  2. Actions we take when we transition from one state to another.

For an example of the first type, the Patrolling state will, every frame or tick, continue to move the agent along the patrol route. The Attacking state will, every frame or tick, attempt to launch an attack or move into a position where that is possible. And so on.

For the second type, consider the ‘If enemy visible and enemy is too strong → Finding Help’ transition. The agent must pick where to go to find help, and store that information so that the Finding Help state knows where to go. Similarly, within the Finding Help state, once help has been found the agent transitions back to the Attacking state, but at that point it will want to tell the friend about the threat, so there might be a “NotifyFriendOfThreat” action that occurs on that transition.

Again, we can see this system through the lens of Sense/Think/Act. The senses are embodied in the data used by the transition logic. The thinking is embodied by the transitions available in each state. And the acting is carried out by the actions taken periodically within a state or on the transitions between states.
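
Here’s a rough sketch of a table-driven state machine along those lines; the condition functions and the agent dictionary are stand-ins for real game queries, and only a few of the guard’s transitions are filled in to keep it short:

# Table-driven finite state machine sketch. The first matching transition wins.
transitions = {
    "Idling": [("idle_timer_expired", "Patrolling")],
    "Patrolling": [("enemy_visible_and_healthy", "Attacking"),
                   ("enemy_visible_and_hurt", "Fleeing")],
    "Attacking": [("no_enemy_visible", "Idling"),
                  ("health_low", "Fleeing")],
    "Fleeing": [("no_enemy_visible", "Idling")],
}
conditions = {
    "idle_timer_expired": lambda a: a["idle_time"] > 10,
    "enemy_visible_and_healthy": lambda a: a["enemy_visible"] and a["health"] > 30,
    "enemy_visible_and_hurt": lambda a: a["enemy_visible"] and a["health"] <= 30,
    "no_enemy_visible": lambda a: not a["enemy_visible"],
    "health_low": lambda a: a["health"] <= 30,
}
def tick(agent):
    # Think: check the current state's transitions, take the first that applies.
    for condition_name, new_state in transitions[agent["state"]]:
        if conditions[condition_name](agent):
            agent["state"] = new_state     # on-transition actions would hook in here
            break
    # Act: the per-tick behaviour for the current state (patrol movement,
    # launching attacks, and so on) would run here.
guard = {"state": "Idling", "idle_time": 12, "enemy_visible": False, "health": 80}
tick(guard)
print(guard["state"])                      # -> Patrolling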

This basic system works well, although sometimes the continual polling of transition conditions can be expensive. For example, if every agent has to do complex calculations every frame to determine whether it can see any enemies in order to decide whether to transition from Patrolling to Attacking, that might waste a lot of CPU time. As we saw earlier, we can think of important changes in the world state as ‘events’ which can be processed as they happen. So instead of the state machine explicitly checking a “can my agent see the player?” transition condition every frame, you might have a separate visibility system perform those checks a little less frequently (e.g. 5 times a second) and emit a “Player Seen” event when a check passes. This event gets passed to the state machine, which would now have a “Player Seen event received” transition condition and would respond to the event accordingly. The end behaviour is identical, apart from an almost unnoticeable (and yet more realistic) delay in responding, but the performance is better as a result of moving the Sense part of the system into a separate part of the program.

Hierarchical State Machines

This is all well and good, but large state machines can get awkward to work with. If we wanted to broaden the Attacking state by replacing it with separate MeleeAttacking and RangedAttacking states, we would have to alter the inbound transitions in every state – now and in the future – that needs to be able to transition to an Attacking state.

You probably also noticed that there are a lot of duplicated transitions in our example. Most of the transitions in the Idling state are identical to the ones in the Patrolling state, and it would be good to not have to duplicate that work, especially if we add more states similar to them. It might make sense to group Idling and Patrolling under some sort of shared label of ‘Non-Combat’ where there is just one shared set of transitions to combat states. If we represent this as a state in itself, we could consider Idling and Patrolling as ‘sub-states’ of it, which allows us to represent the whole system more effectively. For example, using a separate transition table for the new Non-Combat sub-state:

Main states:

State        | Transition Condition                  | New State
Non-Combat   | enemy visible and enemy is too strong | Finding Help
Non-Combat   | enemy visible and health high         | Attacking
Non-Combat   | enemy visible and health low          | Fleeing
Attacking    | no enemy visible                      | Non-Combat
Attacking    | health low                            | Fleeing
Fleeing      | no enemy visible                      | Non-Combat
Searching    | have been searching for 10 seconds    | Non-Combat
Searching    | enemy visible and enemy is too strong | Finding Help
Searching    | enemy visible and health high         | Attacking
Searching    | enemy visible and health low          | Fleeing
Finding Help | friend visible                        | Attacking

Start state: Non-Combat

 

Non-Combat state:

State      | Transition Condition          | New State
Idling     | have been idle for 10 seconds | Patrolling
Patrolling | finished patrol route         | Idling

Start state: Idling

And in diagram form:

HierarchicalStateMachine1.png

This is essentially the same system, except now there is a Non-Combat state which replaces Patrolling and Idling, and it is a state machine in itself, with 2 sub-states of Patrolling and Idling. With each state potentially containing a state machine of sub-states (and those sub-states perhaps containing their own state machine, for as far down as you need to go), we have a Hierarchical Finite State Machine (HFSM for short). By grouping the non-combat behaviours we cut out a bunch of redundant transitions, and we could do the same for any new states we chose to add that might share transitions. For example, if we expanded the Attacking state into MeleeAttacking and RangedAttacking states in future, they could be sub-states, transitioning between each other based on distance to enemy and ammunition availability, but sharing the exit transitions based on health levels and so on. Complex behaviours and sub-behaviours can easily be represented this way with a minimum of duplicated transitions.

Behavior Trees

With HFSMs we get the ability to build relatively complex behaviour sets in a relatively intuitive manner. However, one slight wrinkle in the design is that the decision making, in the form of the transition rules, is tightly-bound to the current state. In many games, that is exactly what you want. And careful use of a hierarchy of states can reduce the amount of transition duplication here. But sometimes you want rules that apply no matter which state you’re in, or which apply in almost all states. For example, if an agent’s health is down to 25%, you might want it to flee no matter whether it was currently in combat, or standing idle, or talking, or any other state, and you don’t want to have to remember to add that condition to every state you might ever add to a character in future. And if your designer later says that they want to change the threshold from 25% to 10%, you would then have to go through every single state’s relevant transition and change it.

Ideally for this situation you’d like a system where the decisions about which state to be in live outside the states themselves, so that you can make the change in just one place and still have it able to transition correctly. This is where a behaviour tree comes in.

There are several different ways to implement behaviour trees, but the essence is the same across most of them, and closely resembles the decision tree mentioned earlier: the algorithm starts at a ‘root node’ and there are nodes in the tree that represent either decisions or actions. However, there are a few key differences:

  • Nodes now return one of 3 values: Succeeded (if their work is done), Failed (if they cannot run), or Running (if they are still running and have not fully succeeded or failed yet).
  • We no longer have decision nodes that choose between 2 alternatives, but instead have ‘Decorator’ nodes, each with a single child node. If a decorator ‘succeeds’, it executes that child. Decorator nodes often contain conditions which decide whether they succeed (and execute their subtree) or fail (and do nothing). They can also return Running if appropriate.
  • Nodes that take actions return the Running value to represent ongoing activities.

This small set of nodes can be combined to produce a large number of complex behaviours, and is often a very concise representation. For example, we can rewrite the guard’s hierarchical state machine from the previous example as a behaviour tree:

BehaviorTree1.png

With this structure, there doesn’t need to be an explicit transition from the Idling or Patrolling states to the Attacking states or any others – if the tree is traversed top-down, left-to-right, the correct decision gets made based on the current situation. If an enemy is visible and the character’s health is low, the tree will stop execution at the ‘Fleeing’ node, regardless of what node it was previously executing – Patrolling, Idling, Attacking, etc.

You might notice that there’s currently no transition to return to Idling from the Patrolling state – this is where non-conditional decorators come in. A common decorator node is Repeat – this has no condition, but simply intercepts a child node returning ‘Succeeded’ and runs that child node again, returning ‘Running’ instead. The new tree looks like this:

BehaviorTree2.png

Behaviour trees are quite complex as there are often many different ways to draw up the tree, and finding the right combination of decorator and composite nodes can be tricky. There are also the issues of how often to check the tree (do we want to traverse it every frame, or only when something happens that means one of the conditions has changed?) and how to store state relating to the nodes (how do we know when we’ve been idling for 10 seconds? How do we know which nodes were executing last time, so we can handle a sequence correctly?). For this reason there are many different implementations. For example, some systems like the Unreal Engine 4 behaviour tree system have replaced decorator nodes with inline decorators, only re-evaluate the tree when the decorator conditions change, and provide ‘services’ to attach to nodes and provide periodic updates even when the tree isn’t being re-evaluated. Behaviour trees are powerful tools but learning to use them effectively, especially in the face of multiple varying implementations, can be daunting.
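
As a rough sketch of the node types described above (and not a reproduction of any particular engine’s implementation), the three return values, a condition decorator, and one common composite – a sequence that runs its children in order – might look like this in Python; the class names and the agent fields are illustrative assumptions:

    SUCCESS, FAILURE, RUNNING = "success", "failure", "running"

    class Action:
        # Leaf node wrapping a function that returns SUCCESS, FAILURE or RUNNING.
        def __init__(self, fn):
            self.fn = fn
        def tick(self, agent):
            return self.fn(agent)

    class ConditionDecorator:
        # Runs its single child only if the condition passes; otherwise it fails.
        def __init__(self, condition, child):
            self.condition = condition
            self.child = child
        def tick(self, agent):
            return self.child.tick(agent) if self.condition(agent) else FAILURE

    class Sequence:
        # Composite: runs children in order, stopping at the first FAILURE or RUNNING.
        def __init__(self, children):
            self.children = children
        def tick(self, agent):
            for child in self.children:
                result = child.tick(agent)
                if result != SUCCESS:
                    return result
            return SUCCESS

    # e.g. 'if an enemy is visible and health is low, flee':
    flee_branch = ConditionDecorator(
        lambda a: a.enemy_visible and a.health < 25,
        Action(lambda a: a.flee()))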

Utility-based systems

Some games would like to have a lot of different actions available, and therefore want the benefits of simpler, centralised transition rules, but perhaps don’t need the expressive power of a full behaviour tree implementation. Instead of having an explicit set of choices to make in turn, or a tree of potential actions with implicit fallback positions set by the tree structure, perhaps it’s possible to simply examine all the actions and pick out the one that seems most appropriate right now?

A utility-based system is exactly that – a system where an agent has a variety of actions at their disposal, and they choose one to execute based on the relative utility of each action, where utility is an arbitrary measure of how important or desirable performing that action is to the agent. By writing utility functions to calculate the utility for an action based on the current state of the agent and its environment, the agent is able to check those utility values and thereby select the most relevant action at any time.

Again, this closely resembles a finite state machine except one where the transitions are determined by the score for each potential state, including the current one. Note that we generally choose the highest-scoring action to transition to (or remain in, if already performing that action), but for more variety it could be a weighted random selection (favouring the highest scoring action but allowing some others to be chosen), picking a random action from the top 5 (or any other quantity), etc.

The typical utility system will assign some arbitrary range of utility values – say, 0 (completely undesirable) to 100 (completely desirable) – and each action might have a set of considerations that influence how that value is calculated. Returning to our guard example, we might imagine something like this:

Action      | Utility Calculation
FindingHelp | If enemy visible and enemy is strong and health low, return 100, else return 0
Fleeing     | If enemy visible and health low, return 90, else return 0
Attacking   | If enemy visible, return 80, else return 0
Idling      | If currently idling and have done it for 10 seconds already, return 0, else return 50
Patrolling  | If at end of patrol route, return 0, else return 50

 

One of the most important things to notice about this setup is that transitions between actions are completely implicit – any state can legally follow any other state. Also, action priorities are implicit in the utility values returned. If the enemy is visible and that enemy is strong and the character’s health is low then both Fleeing and FindingHelp will return high non-zero values, but FindingHelp always scores higher. Similarly, the non-combat actions never return more than 50, so they will always be trumped by a combat action. Actions and their utility calculations need designing with this in mind.

Our example has actions return either a fixed constant utility value, or one of two fixed utility values. A more realistic system will typically involve returning a score from a continuous range of values. For example, the Fleeing action might return higher utility values if the agent’s health is lower, and the Attacking action might return lower utility values if the foe is too tough to beat. This would allow the Fleeing action to take precedence over Attacking in any situation where the agent feels that it doesn’t have enough health to take on the opponent. This allows relative action priorities to change based on any number of criteria, which can make this type of approach more flexible than a behaviour tree or finite state machine.
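
A sketch of that kind of continuous scoring, where each action exposes a utility function and the agent simply picks the highest scorer each time it re-evaluates; the specific formulas and agent fields here are invented purely to illustrate the idea:

    # Each action gets a utility function; the agent picks the highest scorer.
    def utility_fleeing(agent):
        if not agent.enemy_visible:
            return 0
        return 100 - agent.health            # more attractive the lower our health gets

    def utility_attacking(agent):
        if not agent.enemy_visible:
            return 0
        return 80 - agent.enemy_strength     # less attractive against tougher foes

    def utility_patrolling(agent):
        return 0 if agent.at_end_of_route else 50

    ACTIONS = {
        "Fleeing": utility_fleeing,
        "Attacking": utility_attacking,
        "Patrolling": utility_patrolling,
    }

    def choose_action(agent):
        # Score every action and take the best; a weighted random pick over the
        # top few scores would add variety, as mentioned above.
        scores = {name: fn(agent) for name, fn in ACTIONS.items()}
        return max(scores, key=scores.get)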

Each action usually has a bunch of conditions involved in calculating the utility. To avoid hard-coding everything, this may need to be written in a scripting language, or as a series of mathematical formulae aggregated together in an understandable way. See Dave Mark’s (@IADaveMark) lectures and presentations for a lot more information about this.

Some games that attempt to model a character’s daily routine, like The Sims, add an extra layer of calculation where the agent has a set of ‘drives’ or ‘motivations’ that influence the utility scores. For example, if a character has a Hunger motivation, this might periodically go up over time, and the utility score for the EatFood action will return higher and higher values as time goes on, until the character is able to execute that action, decrease their Hunger level, and the EatFood action then goes back to returning a zero or near-zero utility value.

The idea of choosing actions based on a scoring system is quite straightforward, so it is obviously possible – and common – to use utility-based decision making as a part of other AI decision-making processes, rather than as a full replacement for them. A decision tree could query the utility score of its two child nodes and pick the higher-scoring one. Similarly a behaviour tree could have a Utility composite node that uses utility scores to decide which child to execute.

Movement and Navigation

In our previous examples, we either had a simple paddle which we told to move left or right, or a guard character who was told to patrol or attack. But how exactly do we handle movement of an agent over a period of time? How do we set the speed, how do we avoid obstacles, and how do we plan a route when getting to the destination is more complex than moving directly? We’ll take a look at this now.

Steering

At a very basic level, it often makes sense to treat each agent as having a velocity value, which encompasses both how fast it is moving and in which direction. This velocity might be measured in metres per second, miles per hour, pixels per second, etc. Recalling our Sense/Think/Act cycle, we can imagine that the Think part might choose a velocity, and then the Act part applies that velocity to the agent, moving it through the world. It’s common for games to have a physics system that performs this task for you, examining the velocity value of every entity and adjusting the position accordingly, so you can often delegate this work to that system, leaving the AI just with the job of deciding what velocity the agent should have.

If we know where an agent wishes to be, then we will want to use our velocity to move the agent in that direction. Very trivially, we have an equation like this:

desired_travel = destination_position - agent_position

So, imagining a 2D world where the agent is at (-2,-2) and the destination is somewhere roughly to the north-east at (30, 20) the desired travel for the agent to get there is (32, 22). Let’s pretend these positions are in metres – if we decide that our agent can move 5 metres per second then we would scale our travel vector down to that size and see that we want to set a velocity of roughly (4.12, 2.83). With that set, and movement being based on that value, the agent would arrive at the destination a little under 8 seconds later, as we would expect.

The calculations can be re-run whenever you want. For example, if the agent above was half-way to the target, the desired travel would be half the length, but once scaled to the agent’s maximum speed of 5 m/s the velocity comes out the same. This also works for moving targets (within reason), allowing the agent to make small corrections as they move.
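
In code, that calculation is just a subtraction, a normalisation, and a cap at the agent’s maximum speed – something like this sketch (the function name and the 2D tuples are arbitrary choices):

    import math

    def seek_velocity(agent_pos, destination, max_speed):
        # Velocity pointing at the destination, capped at the agent's top speed.
        dx = destination[0] - agent_pos[0]
        dy = destination[1] - agent_pos[1]
        distance = math.hypot(dx, dy)
        if distance < 1e-6:
            return (0.0, 0.0)                        # already there
        scale = min(max_speed, distance) / distance  # don't overshoot when very close
        return (dx * scale, dy * scale)

    # The worked example above: agent at (-2,-2), destination (30,20), 5 m/s.
    print(seek_velocity((-2, -2), (30, 20), 5.0))    # roughly (4.12, 2.83)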

Often we want a bit more control than this – for example, we might want to ramp up the velocity slowly at the start to represent a person moving from a standstill, through to walking, and then to running. We might want to do the same at the other end to have them slow down to a stop as they approach the destination. This is often handled by what are known as “steering behaviours”, each with specific names like Seek, Flee, Arrival, and so on. The idea of these is that acceleration forces can be applied to the agent’s velocity, based on comparing the agent’s position and current velocity to the destination position, to produce different ways to move towards a target.

Each behaviour has a slightly different purpose. Seek and Arrive are ways of moving an agent towards a destination point. Obstacle Avoidance and Separation help an agent take small corrective movements to steer around small obstacles between the agent and its destination. Alignment and Cohesion keep agents moving together to simulate flocking animals. Any number of these different steering behaviours can be added together, often as a weighted sum, to produce an aggregate value that takes these different concerns into account and produces a single output vector. For example, you might see a typical character agent use an Arrival steering behaviour alongside a Separation behaviour and an Obstacle Avoidance behaviour to keep away from walls and other agents. This approach works well in fairly open environments that aren’t too complex or crowded.

However, in more challenging environments, simply adding together the outputs from the behaviours doesn’t work well – perhaps they result in moving too slowly past an object, or an agent gets stuck when the Arrival behaviour wants to go through an obstacle but the Obstacle Avoidance behaviour is pushing the agent back the way it came. Therefore it sometimes makes sense to consider variations on steering behaviours which are more sophisticated than just adding together all the values. One family of approaches is to think of steering the other way around – instead of having each of the behaviours give us a direction and then combine them to reach a consensus (which may itself not be adequate), we could instead consider steering in several different directions – such as 8 compass points, or 5 or 6 directions in front of the agent – and see which one of those is best.

Still, in a complex environment with dead-ends and choices over which way to turn, we’re going to need something more advanced, which we’ll come to shortly.

Pathfinding

That’s great for simple movement in a fairly open area, like a soccer pitch or an arena, where getting from A to B is mostly about travelling in a straight line with small corrections to avoid obstacles. But what about when the route to the destination is more complex? This is where we need ‘pathfinding’, which is the act of examining the world and deciding on a route through it to get the agent to the destination.

The simplest approach is to overlay a grid on the world and, starting from the square the agent is in, look at the neighbouring squares it is allowed to move into. If any of those is the destination, follow the route back from each square to its predecessor until you reach the start, and that’s the route. Otherwise, repeat the process with the reachable neighbours of those neighbours, and so on, until you either find the destination or run out of squares (which means there is no possible route). This is what is formally known as a Breadth-First Search algorithm (often abbreviated to BFS) because at each step it looks at all directions (hence ‘breadth’) before moving the search outwards. The search space is like a wavefront that moves out until it hits the place that is being searched for.

This is a simple example of the search in action. The search area expands at each step until it has included the destination point – then the path back to the start can be traced.

pathfinding1.gif

The result here is that you get a list of gridsquares which make up the route you need to take. This is commonly called the ‘path’ (hence pathfinding) but you can also think of it as a plan as it represents a list of places to be, one after the other, to achieve the final goal of being at the destination.
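
A sketch of that breadth-first search over grid squares, returning the list of squares as the path; walkable_neighbours is assumed to be a function supplied by the game that lists the squares you are allowed to step into from a given square:

    from collections import deque

    def bfs_path(start, goal, walkable_neighbours):
        # Breadth-first search: expand outwards one 'ring' of squares at a time.
        frontier = deque([start])
        came_from = {start: None}              # also doubles as the visited set
        while frontier:
            current = frontier.popleft()
            if current == goal:
                path = []
                while current is not None:     # follow the route back to the start
                    path.append(current)
                    current = came_from[current]
                return list(reversed(path))
            for neighbour in walkable_neighbours(current):
                if neighbour not in came_from:
                    came_from[neighbour] = current
                    frontier.append(neighbour)
        return None                            # ran out of squares: no route exists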

Now, given that we will know the position of each gridsquare in the world, it’s possible to use the steering behaviours mentioned previously to move along the path – first from the start node to node 2, then from node 2 to node 3, and so on. The simplest approach is to steer towards the centre of the next gridsquare, but a popular alternative is to steer for the middle of the edge between the current square and the next one. This allows the agent to cut corners on sharp turns which can make for more realistic-looking movement.

It’s easy to see that this algorithm can be a bit wasteful, as it explores just as many squares in the ‘wrong’ direction as the ‘right’ direction. It also doesn’t make any allowances for movement costs, where some squares are more expensive than others. This is where a more sophisticated algorithm called A* (pronounced ‘A star’) comes in. It works much the same way as breadth-first search, except that instead of blindly exploring neighbors, then neighbors of neighbors, then neighbors of neighbors of neighbors and so on, it puts all these nodes into a list and sorts them so that the next node it explores is always the one it thinks is most likely to lead to the shortest route. The nodes are sorted based on a heuristic – basically an educated guess – that takes into account 2 things – the cost of the hypothetical route to that gridsquare (thus incorporating any movement costs you need) and an estimate of how far that gridsquare is from the destination (thus biasing the search in the right direction).

pathfinding2.gif

In this example we show it examining one square at a time, each time picking a neighbouring square that is the best (or joint best) prospect. The resulting path is the same as with breadth-first search, but fewer squares were examined in the process – and this makes a big difference to the game’s performance on complex levels.
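
In code, A* differs from the breadth-first sketch above only in keeping a cost for each square and using a priority queue ordered by cost-so-far plus the heuristic estimate. As before, walkable_neighbours, move_cost, and heuristic are assumed to be supplied by the game, and squares are assumed to be simple comparable values such as (x, y) tuples:

    import heapq

    def a_star(start, goal, walkable_neighbours, move_cost, heuristic):
        # Always expand the square whose cost-so-far + heuristic looks most promising.
        open_set = [(heuristic(start, goal), start)]
        came_from = {start: None}
        cost_so_far = {start: 0}
        while open_set:
            _, current = heapq.heappop(open_set)
            if current == goal:
                path = []
                while current is not None:
                    path.append(current)
                    current = came_from[current]
                return list(reversed(path))
            for neighbour in walkable_neighbours(current):
                new_cost = cost_so_far[current] + move_cost(current, neighbour)
                if neighbour not in cost_so_far or new_cost < cost_so_far[neighbour]:
                    cost_so_far[neighbour] = new_cost
                    came_from[neighbour] = current
                    priority = new_cost + heuristic(neighbour, goal)
                    heapq.heappush(open_set, (priority, neighbour))
        return None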

Movement without a grid

The previous examples used a grid overlaid on the world and plotted a route across the world in terms of the gridsquares. But most games are not laid out on a grid, and overlaying a grid might therefore not make for realistic movement patterns. It might also require compromises over how large or small to make each gridsquare – too large, and it doesn’t adequately represent small corridors or turnings, too small and there could be thousands of gridsquares to search which will take too long. So, what are the alternatives?

The first thing to realise is that, in mathematical terms, the grid gives us a ‘graph’ of connected nodes. The A* (and BFS) algorithms actually operate on graphs, and don’t care about our grid at all. So we could decide to place the nodes at arbitrary positions in the world, and providing there is a walkable straight line between any two connected nodes, and between the start and end positions and at least one of the nodes, our algorithm will work just as well as before – usually better, in fact, because we have fewer nodes to search. This is often called a ‘waypoints’ system as each node represents a significant position in the world that can form part of any number of hypothetical paths.

pathfinding3a.png

Example 1: a node in every grid square. The search starts from the node the agent is in, and ends at the node in the target gridsquare.

 pathfinding3b.png

Example 2: a much smaller set of nodes, or waypoints. The search begins at the agent, passes through as many waypoints as necessary, and then proceeds to the end point. Note that moving to the first waypoint, south-west of the player, is an inefficient route, so some degree of post-processing of a path generated in this way is usually necessary (for example, to spot that the path can go directly to the waypoint to the north-east).

 

This is quite a flexible and powerful system. But it usually requires some care in deciding where and how to place the waypoints, otherwise agents might not be able to see their nearest waypoint and start a path. It would be great if we could generate the waypoints automatically based on the world geometry somehow.

This is where a ‘navmesh’ comes in. Short for ‘navigation mesh’, it is a (typically) 2D mesh of triangles that roughly overlays the world geometry, anywhere that the game allows an agent to walk. Each of the triangles in the mesh becomes a node in the graph and has up to 3 adjacent triangles which become neighbouring nodes in the graph.

This picture is an example from the Unity engine – it has analysed the geometry in the world and produced a navmesh (light blue) that is an approximation of it. Each polygon in the navmesh is an area that an agent can stand on, and the agent is able to move from one polygon to any polygon adjacent to it. (In this example the polygons are narrower than the floors they rest upon, to take account of the agent’s radius that would extend out beyond the agent’s nominal position.)

unity-navmesh.png

We can search for a route through this mesh, again using A*, and this gives us a near-perfect route through the world that can take all the geometry into account and yet not require an excess of redundant nodes (like the grid) or require a human to generate waypoints.

Pathfinding is a wide subject and there are many different approaches to it, especially if you have to code the low level details yourself. One of the best sources for further reading is Amit Patel’s site.

Planning

We saw with pathfinding that sometimes it’s not sufficient to just pick a direction and move straight there – we have to pick a route and make several turns in order to reach the destination that we want. We can generalise this idea to a wide range of concepts where achieving your goal is not just about the next step, but about a series of steps that are necessary to get there, and where you might need to look ahead several steps in order to know what the first one should be. This is what we call planning. Pathfinding can be thought of as one specific application of planning, but there are many more applications for the concept. In terms of our Sense/Think/Act cycle, this is where the Think phase tries to plan out multiple Act phases for the future.

Let’s look at the game Magic: The Gathering. It’s your first turn, you have a hand of cards, and the hand includes a Swamp which provides 1 point of Black Mana, a Forest which provides 1 point of Green Mana, a Fugitive Wizard which requires 1 Blue Mana to summon, and an Elvish Mystic which requires 1 Green Mana to summon. (We’ll ignore the other 3 cards for now to keep this simple.) The rules say (roughly) that a player is allowed to play 1 land card per turn, can ‘tap’ their land cards in play to extract the mana from them, and can cast as many spells (including creature summoning) as they have available mana. In this situation a human player would probably know to play the Forest, tap it for 1 point of Green Mana, and then summon the Elvish Mystic. But how would a game AI know to make that decision?

A Simple ‘Planner’

The naive approach might be to just keep trying each action in turn until no appropriate ones are left. Looking at the hand of cards, it might see that it can play the Swamp, so it does so. After that, does it have any other actions left this turn? It can’t summon either the Elvish Mystic or the Fugitive Wizard as they require green and blue mana respectively, and our Swamp in play can only provide black mana. And we can’t play the Forest because we’ve already played the Swamp. So, the AI player has produced a valid turn, but far from an optimal one. Luckily, we can do better.

In much the same way that pathfinding finds a list of positions to move through the world to reach a desired position, our planner can find a list of actions that get the game into a desired state. Just like each position along a path had a set of neighbors which were potential choices for the next step along the path, each action in a plan has neighbors, or ‘successors’, which are candidates for the next step in the plan. We can search through these actions and successor actions until we reach the state that we want.

In our example, let’s assume the desired outcome is “to summon a creature, if possible”. The start of the turn sees us with only 2 potential actions allowed by the rules of the game:

 

1. Play the Swamp (result: Swamp leaves the Hand, enters play)
2. Play the Forest (result: Forest leaves the Hand, enters play)

 

Each action taken may enable further actions and close off others, again depending on the game rules. Imagine we choose to play the Swamp – this removes that action as a potential successor (as the Swamp has already been played), it removes Play The Forest as a successor (because the game rules only allow you to play one Land card per turn), and it adds Tap The Swamp for 1 point of Black Mana as a successor – the only successor, in fact. If we follow it one step further and choose ‘Tap The Swamp’, we get 1 point of black mana that we can’t do anything with, which is pointless.

 

1. Play the Swamp (result: Swamp leaves the Hand, enters play)
            1.1 Tap the Swamp (result: Swamp is tapped, +1 Black mana available)
                        No actions left – END
2. Play the Forest (result: Forest leaves the Hand, enters play)

 

This short list of actions didn’t achieve much, leading us down the equivalent of a dead-end if we use the pathfinding analogy. So, we repeat the process for the next action. We choose ‘Play The Forest’ – again, this removes ‘Play The Forest’ and ‘Play The Swamp’ from consideration, and opens up ‘Tap The Forest’ as the potential (and only) next step. That gives us 1 green mana, which in this case opens up a third step, that of Summon Elvish Mystic.

 

1. Play the Swamp (result: Swamp leaves the Hand, enters play)
           1.1. Tap the Swamp (result: Swamp is tapped, +1 Black mana available)
                       No actions left – END
2. Play the Forest (result: Forest leaves the Hand, enters play)
            2.1 Tap the Forest  (result: Forest is tapped, +1 Green mana available)
                        2.1.1 Summon Elvish Mystic (result: Elvish Mystic in play, -1 Green mana available)
                                    No actions left – END

 

We’ve now explored all the possible actions, and the actions that follow on from those actions, and we found a plan that allowed us to summon a creature:  Play the Forest, Tap the Forest, Summon the Elvish Mystic.

Obviously this is a very simplified example, and usually you would want to pick the best plan rather than just any plan that meets some sort of criteria (such as ‘summon a creature’). Typically you might score potential plans based on the final outcome or the cumulative benefit of following the plan. For example, you might award yourself 1 point for playing a Land card and 3 points for summoning a creature. “Play The Swamp” would be a short plan yielding 1 point, but “Play The Forest → Tap The Forest → Summon Elvish Mystic” is a plan yielding 4 points, 1 for the land and 3 for the creature. This would be the top scoring plan on offer and therefore would be chosen, if that was how we were scoring them.
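
A sketch of that kind of exhaustive, scored search over a single turn; it assumes the game rules are exposed through two hypothetical functions, legal_actions(state) and apply_action(state, action), plus a per-action scoring function like the land/creature points above:

    def best_plan(state, legal_actions, apply_action, score_action):
        # Try every legal action, recurse into the state it produces, and keep
        # whichever sequence of actions scores highest overall.
        best_score, best_actions = 0, []
        for action in legal_actions(state):
            next_state = apply_action(state, action)
            sub_score, sub_actions = best_plan(next_state, legal_actions,
                                               apply_action, score_action)
            total = score_action(action) + sub_score
            if total > best_score:
                best_score, best_actions = total, [action] + sub_actions
        return best_score, best_actions

    # With 1 point per land played and 3 per creature summoned, the plan
    # 'Play the Forest, Tap the Forest, Summon Elvish Mystic' scores 4 and wins.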

We’ve shown how planning works within a single turn of Magic: The Gathering, but it can apply just as well to actions in consecutive turns (e.g. moving a pawn to make room to develop a bishop in Chess, or dashing to cover in XCOM so that the unit can shoot from safety next turn) or to an overall strategy over time (e.g. choosing to construct pylons before other Protoss buildings in Starcraft, or quaffing a Fortify Health potion in Skyrim before attacking an enemy).

Improved Planning

Sometimes there are just too many possible actions at each step for it to be reasonable for us to consider every permutation. Returning to the Magic: The Gathering example – imagine if we had several creatures in our hand, plenty of land already in play so we could summon any of them, creatures already in play with some abilities available, and a couple more land cards in our hand – the number of permutations of playing lands, tapping lands, summoning creatures, and using creature abilities, could number in the thousands or even tens of thousands. Luckily we have a couple of ways to attempt to manage this.

The first way is ‘backwards chaining’. Instead of trying all the actions and seeing where they lead, we might start with each of the final results that we want and see if we can find a direct route there. An analogy is trying to reach a specific leaf from the trunk of a tree – it makes more sense to start from that leaf and work backwards, tracing the route easily to the trunk (which we could then follow in reverse), rather than starting from the trunk and trying to guess which branch to take at each step. By starting at the end and going in reverse, forming the plan can be a lot quicker and easier.

For example, if the opponent is on 1 health point, it might be useful to try and find a plan for “Deal 1 or more points of direct damage to the opponent”. To achieve this, our system knows it needs to cast a direct damage spell, which in turn means that it needs to have one in the hand and have enough mana to cast it, which in turn means it must be able to tap sufficient land to get that mana, which might require it to play an additional land card.

Another way is ‘best-first search’. Instead of iterating through all permutations of action exhaustively, we measure how ‘good’ each partial plan is (similarly to how we chose between plans above) and we evaluate the best-looking one each time. This often allows us to form a plan that is optimal, or at least good enough, without needing to consider every possible permutation of plans. A* is a form of best-first search – by exploring the most promising routes first it can usually find a path to the destination without needing to explore too far in other directions.

An interesting and increasingly popular variant on best-first search is Monte Carlo Tree Search. Instead of attempting to guess which plans are better than others while selecting each successive action, it picks random successors at each step until it reaches the end when no more actions are available – perhaps because the hypothetical plan led to a win or lose condition – and uses that outcome to weight the previous choices higher or lower. By repeating this process many times in succession it can produce a good estimate of which next step is best, even if the situation changes (such as the opponent taking preventative action to thwart us).

Finally, no discussion of planning in games is complete without mentioning Goal-Oriented Action Planning, or GOAP for short. This is a widely-used and widely-discussed technique, but apart from a few specific implementation details it is essentially a backwards-chaining planner that starts with a goal and attempts to pick an action that will meet that goal, or, more likely, a list of actions that will lead to the goal being met. For example, if the goal is “Kill The Player” and the player is in cover, the plan might be “FlushOutWithGrenade” → “Draw Weapon” → “Attack”.

There are typically several goals, each with their own priority, and if the highest priority goal can’t be met – e.g. no set of actions can form a “Kill the player” plan because the player is not visible – it will fall back to lower priority goals, such as “Patrol” or “Stand guard”.

Learning and Adapting

We mentioned at the start that game AI does not generally use ‘machine learning’ because it is not generally suited to real-time control of intelligent agents in a game world. However, that doesn’t mean we can’t take some inspiration from that area when it makes sense. We might want a computer opponent in a shooter game to learn the best places to go in order to score the most kills. Or we might want the opponent in a fighting game like Tekken or Street Fighter to spot when we use the same ‘combo’ move over and over and start blocking it, forcing us to try different tactics. So there are times when some degree of machine learning can be useful.

Statistics and Probabilities

Before we look at more complex examples, it’s worth considering how far we can go just by taking some simple measurements and using that data to make decisions. For example, say we had a real-time strategy game and we were trying to guess whether a player is likely to launch a rush attack in the first few minutes or not, so we can decide whether we need to build more defences or not. We might want to extrapolate from the player’s past behaviour to give an indication of what the future behaviour might be like. To begin with we have no data about the player from which to extrapolate – but each time the AI plays against the human opponent, it can record the time of the first attack. After several plays, those times can be averaged and will be a reasonably good approximation of when that player might attack in future.

The problem with simple averages is that they tend towards the centre over time, so if a player employed a rushing strategy the first 20 times and switched to a much slower strategy the next 20 times, the average would be somewhere in the middle, telling us nothing useful. One way to rectify this is a simple windowed average, such as only considering the last 20 data points.
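
A windowed average of this sort only takes a few lines – something like the following sketch, where the class name, the window of 20 matches, and the default of 300 seconds are all invented for illustration:

    from collections import deque

    class RushTimePredictor:
        # Keeps only the most recent first-attack times, so old habits fade out.
        def __init__(self, window_size=20):
            self.samples = deque(maxlen=window_size)   # old entries drop off automatically

        def record_first_attack(self, seconds_into_match):
            self.samples.append(seconds_into_match)

        def predicted_attack_time(self, default=300.0):
            if not self.samples:
                return default                         # no data yet: assume a late attack
            return sum(self.samples) / len(self.samples)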

A similar approach can be used when estimating the probability of certain actions happening, by assuming that past player preferences will carry forward to the future. For example, if a player attacks us with a Fireball 5 times, a Lightning Bolt 2 times, and hand-to-hand combat just once, it’s likely that they prefer the Fireball, having used it 5 times out of 8. Extrapolating from this, we can see the probability of different weapon use is Fireball=62.5%, Lightning Bolt=25%, and Hand-To-Hand=12.5%. Our AI characters would be well advised to find some fire-retardant armor!

Another interesting method is to use a Naive Bayes Classifier to examine large amounts of input data and try to classify the current situation so the AI agent can react appropriately. Bayesian classifiers are perhaps best known for their use in email spam filters, where they examine the words in the email, compare them to whether those words mostly appeared in spam or non-spam in the past, and use that to make a judgement on whether this latest email is likely to be spam. We could do similarly, albeit with less input data. By recording all the useful information we see (such as which enemy units are constructed, or which spells they use, or which technologies they have researched) and then noting the resulting situation (war vs peace, rush strategy vs defensive strategy, etc.), we could pick appropriate behaviour based on that.

With all of these learning approaches, it might be sufficient – and often preferable – to run them on the data gathered during playtesting prior to release. This allows the AI to adapt to the different strategies that your playtesters use, but won’t change after release. By comparison, AI that adapts to the player after release might end up becoming too predictable or even too difficult to beat.

Basic weight-based adaptation

Let’s take things a little further. Rather than just using the input data to choose between discrete pre-programmed strategies, maybe we want to change a set of values that inform our decision making. Given a good understanding of our game world and game rules, we can do the following:

  • Have the AI collect data about the state of the world and key events during play (as above);
  • Change several values or ‘weights’ based on that data as it is being collected;
  • Implement our decisions based on processing or evaluating these weights.

Imagine the computer agent has several major rooms in a FPS map to choose from. Each room has a weight which determines how desirable that room is to visit, all starting at the same value. When choosing where to go, it picks a room at random, but biased based on those weights. Now imagine that when the computer agent is killed, it notes which room it was in and decreases the weight, so it is less likely to come back here in future. Similarly, imagine the computer agent scores a kill – it might increase the weight of the room it’s in, to move it up the list of preferences. So if one room starts being particularly deadly for the AI player, it’ll start to avoid it in future, but if some other room lets the AI score a lot of kills, it’ll keep going back there.
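
A sketch of that weighting scheme, with made-up room names and adjustment amounts:

    import random

    # Every room starts equally desirable; deaths push a room's weight down,
    # kills push it up, and room choice is a weighted random pick.
    room_weights = {"Atrium": 1.0, "Warehouse": 1.0, "Rooftop": 1.0}

    def choose_room():
        rooms = list(room_weights)
        return random.choices(rooms, weights=[room_weights[r] for r in rooms])[0]

    def on_died_in(room):
        room_weights[room] = max(0.1, room_weights[room] - 0.2)   # keep a small floor

    def on_scored_kill_in(room):
        room_weights[room] += 0.2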

Markov Models

What if we wanted to use data collected like this to make predictions? For example, if we record each room we see a player in over a period of time as they play the game, we might reasonably expect to use that to predict which room the player might move to next. By keeping track of both the current room the player is in and the previous room they were seen in, and recording that as a pair of values, we can calculate how often each of the former situations leads to the latter situation and use that for future predictions.

Imagine there are 3 rooms, Red, Green, and Blue, and these are the observations we saw when watching a play session:

First Room Seen | Total Observations | Next Room Seen | Times Observed | Percentage
Red             | 10                 | Red            | 2              | 20%
Red             |                    | Green          | 7              | 70%
Red             |                    | Blue           | 1              | 10%
Green           | 10                 | Red            | 3              | 30%
Green           |                    | Green          | 5              | 50%
Green           |                    | Blue           | 2              | 20%
Blue            | 8                  | Red            | 6              | 75%
Blue            |                    | Green          | 2              | 25%
Blue            |                    | Blue           | 0              | 0%

The number of sightings in each room is fairly even, so it doesn’t tell us much about where might be a good place to place an ambush. It might be skewed by players spawning evenly across the map, equally likely to emerge into any of those three rooms. But the data about the next room they enter might be useful, and could help us predict a player’s movement through the map.

We can see at a glance that the Green room is apparently quite desirable to players – most people in the Red room then proceed to the Green room, and 50% of players seen in the Green room are still there the next time we check. We can also see that the Blue room is quite an undesirable destination – people rarely pass from the Red or Green rooms to the Blue room, and nobody seems to linger in the Blue room at all.

But the data tells us something more specific – it says that when a player is in the Blue room, the next room we see them in is most likely to be the Red room, not the Green room. Despite the Green room being a more popular destination overall than the Red one, that trend is slightly reversed if the player is currently in the Blue room. The next state (i.e. the room they choose to travel to) is apparently dependent on the previous state (i.e. the room they currently find themselves in) so this data lets us make better predictions about their behaviour than if we just counted the observations independently.

This idea that we can use knowledge of the past state to predict a future state is called a Markov model and examples like this where we have accurately measurable events (such as ‘what room is the player in’) are called Markov Chains. Since they represent the chance of changes between successive states they are often visually represented as a finite state machine with the probability shown alongside each transition. Previously we used a state machine to represent some sort of behavioural state that an agent was in, but the concept extends to any sort of state, whether to do with an agent or not. In this case the states represent the room the agent is occupying, and it would look like this:

MarkovChain1.png

This is a simple approach to represent the relative chance of different state transitions, giving the AI some predictive power regarding the next state. But we could go further, by using the system to look 2 or more steps into the future.

If the player is seen in the Green room, we use our data to estimate there is a 50% chance they will still be in the Green room for the next observation. But what’s the chance they will still be there for the observation after that? It’s not just the chance that they stayed in the Green room for 2 observations (50% * 50% = 25%) but also the chance that they left and came back. Here’s a new table with the previous values applied to 3 observations, 1 current and 2 hypothetical ones in the future.

Observation 1 | Hypothetical Observation 2 | Percentage chance | Hypothetical Observation 3 | Percentage chance | Cumulative chance
Green         | Red                        | 30%               | Red                        | 20%               | 6%
Green         | Red                        | 30%               | Green                      | 70%               | 21%
Green         | Red                        | 30%               | Blue                       | 10%               | 3%
Green         | Green                      | 50%               | Red                        | 30%               | 15%
Green         | Green                      | 50%               | Green                      | 50%               | 25%
Green         | Green                      | 50%               | Blue                       | 20%               | 10%
Green         | Blue                       | 20%               | Red                        | 75%               | 15%
Green         | Blue                       | 20%               | Green                      | 25%               | 5%
Green         | Blue                       | 20%               | Blue                       | 0%                | 0%
              |                            |                   |                            | Total:            | 100%

Here we can see that the chance of seeing the player in the Green room 2 observations later is likely to be 51% – 21% of that coming from a journey via the Red room, 5% of it seeing the player visit the Blue room in between, and 25% staying in the Green room throughout.

The table is just a visual aid – the procedure requires only that you multiply out the probabilities at each step. This means you could look a long way into the future, with one significant caveat: we are making an assumption that the chance of entering a room depends entirely on the current room they are in. This is what we call the Markov Property – the idea that a future state depends only on the present state. While it allows us to use powerful tools like this Markov Chain, it is usually only an approximation. Players may change which room they are in based on other factors, such as their health level or how much ammo they have, and since we don’t capture this information as part of our state our predictions will be less accurate as a result.
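
The two-step calculation above is just the transition probabilities multiplied out and summed over every possible room the player could pass through in between – for example:

    # The transition table above as a nested dictionary.
    transitions = {
        "Red":   {"Red": 0.20, "Green": 0.70, "Blue": 0.10},
        "Green": {"Red": 0.30, "Green": 0.50, "Blue": 0.20},
        "Blue":  {"Red": 0.75, "Green": 0.25, "Blue": 0.00},
    }

    def probability_after_two_steps(current_room, target_room):
        # Sum over every room the player could be seen in at the intermediate step.
        return sum(p_mid * transitions[mid_room][target_room]
                   for mid_room, p_mid in transitions[current_room].items())

    print(round(probability_after_two_steps("Green", "Green"), 2))   # 0.51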

N-Grams

What about our fighting game combo-spotting example? This is a similar situation, where we want to predict a future state based on the past state (in order to decide how to block or evade an attack), but rather than looking at a single state or event, we want to look for sequences of events that make up a combo move.

One way to do this is to store each input (such as Kick, Punch, or Block) in a buffer and record the whole buffer as the event. So, imagine a player repeatedly presses Kick, Kick, Punch to use a ‘SuperDeathFist’ attack; the AI system stores all the inputs in a buffer and remembers the last 3 inputs used at each step.

Input | Input sequence so far                                    | New input memory
Kick  | Kick                                                     | none
Punch | Kick, Punch                                              | none
Kick  | Kick, Punch, Kick                                        | Kick, Punch, Kick
Kick  | Kick, Punch, Kick, Kick                                  | Punch, Kick, Kick
Punch | Kick, Punch, Kick, Kick, Punch                           | Kick, Kick, Punch
Block | Kick, Punch, Kick, Kick, Punch, Block                    | Kick, Punch, Block
Kick  | Kick, Punch, Kick, Kick, Punch, Block, Kick              | Punch, Block, Kick
Kick  | Kick, Punch, Kick, Kick, Punch, Block, Kick, Kick        | Block, Kick, Kick
Punch | Kick, Punch, Kick, Kick, Punch, Block, Kick, Kick, Punch | Kick, Kick, Punch

 

(The player launches the SuperDeathFist attack on the rows where the new input memory reads Kick, Kick, Punch: the fifth row and the final row.)

It would be possible to look at all the times that the player chose Kick followed by another Kick in the past, and then notice that the next input is always Punch. This lets the AI agent make a prediction that if the player has just chosen Kick followed by Kick, they are likely to choose Punch next, thereby triggering the SuperDeathFist. This allows the AI to consider picking an action that counteracts that, such as a block or evasive action.

These sequences of events are known as N-grams where N is the number of items stored. In the previous example it was a 3-gram, also known as a trigram, which means the first 2 entries are used to predict the 3rd one. In a 5-gram, the first 4 entries would hopefully predict the 5th, and so on.

The developer needs to choose the size (sometimes called the ‘order’) of the N-grams carefully. Lower numbers require less memory, as there are fewer possible permutations, but they store less history and therefore lose context. For instance, a 2-gram (sometimes called a ‘bigram’) would have entries for Kick, Kick and for Kick, Punch, but has no way of storing Kick, Kick, Punch, so it has no specific awareness of that combo.

On the other hand, higher numbers require more memory and are likely to be harder to train, as you will have many more possible permutations and therefore you might never see the same one twice. For example, if you had the 3 possible inputs of Kick, Punch, or Block and were using 10-grams, then you would have almost 60,000 different permutations (3^10 = 59,049).

A bigram model is basically a trivial Markov Chain – each ‘Past State/Current State’ pair is a bi-gram and you can predict the second state based on the first. Tri-grams and larger N-grams can also be thought of as Markov Chains, where all but the last item in the N-gram together form the first state and the last item is the second state. Our fighting game example is representing the chance of moving from the Kick then Kick state to the Kick then Punch state. By treating multiple entries of input history as a single unit, we are essentially transforming the input sequence into one piece of state, which gives us the Markov Property – allowing us to use Markov Chains to predict the next input, and thus to guess which combo move is coming next.
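
A sketch of a trigram-style predictor along these lines, counting which input follows each pair of previous inputs; the class and method names are illustrative:

    from collections import defaultdict, Counter

    class NGramPredictor:
        def __init__(self, order=3):
            self.order = order                   # 3 = trigram: 2 inputs predict the 3rd
            self.counts = defaultdict(Counter)   # previous inputs -> counts of what came next
            self.history = []

        def record(self, new_input):
            self.history.append(new_input)
            if len(self.history) >= self.order:
                previous = tuple(self.history[-self.order:-1])
                self.counts[previous][new_input] += 1

        def predict_next(self):
            previous = tuple(self.history[-(self.order - 1):])
            if len(previous) < self.order - 1 or not self.counts[previous]:
                return None                      # not enough data yet
            return self.counts[previous].most_common(1)[0][0]

    # After the observations in the table above, predict_next() following a
    # 'Kick, Kick' pair would return 'Punch' - the cue to block the SuperDeathFist.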

Knowledge representation

We’ve discussed several ways of making decisions, making plans, and making predictions, and all of these are based on the agent’s observations of the state of the world. But how do we observe a whole game world effectively? We saw earlier that the way we represent the geography of the world can have a big effect on how we navigate it, so it is easy to imagine that this holds true about other aspects of game AI as well. How do we gather and organise all the information we need in a way that performs well (so it can be updated often and used by many agents) and is practical (so that the information is easy to use with our decision-making)? How do we turn mere data into information or knowledge? This will vary from game to game, but there are a few common approaches that are widely used.

Tags

Sometimes we already have a ton of usable data at our disposal and all we need is a good way to categorise and search through it. For example, maybe there are lots of objects in the game world and some of them make for good cover to avoid getting shot. Or maybe we have a bunch of voice-acted lines, all only appropriate in certain situations, and we want a way to quickly know which is which. The obvious approach is to attach a small piece of extra information that we can use in the searches, and these are called tags.

Take the cover example; a game world may have a ton of props – crates, barrels, clumps of grass, wire fences. Some of them are suitable as cover, such as the crates and barrels, and some of them are not, such as the grass and the wire fence. So when your agent executes the “Move To Cover” action, it needs to search through the objects in the local area and determine which ones are candidates. It can’t usually just search by name – you might have “Crate_01”, “Crate_02”, up to “Crate_27” for every variety of crate that your artists made, and you don’t want to search for all of those names in the code. You certainly don’t want to have to add an extra name in the code every time an artist makes a new crate or barrel variation. Instead, you might think of searching for any name that contains the word “Crate”, but then one day one of your artists adds a “Broken_Crate” that has a massive hole in it and isn’t suitable for cover.

So, what you do instead is you create a “COVER” tag, and you ask artists or designers to attach that tag to any item that is suitable as cover. Once they do this for all your barrels and (intact) crates, your AI routine only has to search for any object with that tag, and it can know that it is suitable. This tag will still work if objects get renamed later, and can be added to future objects without requiring any extra code changes.

In the code, tags are usually just represented as a string, but if you know all the tags you’re using, you can convert the strings to unique numbers to save space and speed up the searches. Some engines provide tag functionality built in, such as Unity and Unreal Engine 4, so all you have to do is decide on your set of tags and use them where necessary.

Smart Objects

Tags are a way of putting extra information into the agent’s environment to help it understand the options available to it, so that queries like “Find me every place nearby that provides cover” or “Find me every enemy in range that is a spellcaster” can be executed efficiently and will work on future game assets with minimal effort required. But sometimes tags don’t contain enough information to be as useful as we need.

Imagine a medieval city simulator, where the adventurers within wander around of their own accord, training, fighting, and resting as necessary. We could place training grounds around the city, and give them a ‘TRAINING’ tag, so that it’s easy for the adventurers to find where to train. But let’s imagine one is an archery range, and the other is a school for wizards – we’d want to show different animations in each case, because they represent quite different activities under the generic banner of ‘training’, and not every adventurer will be interested in both. We might decide to drill down and have ARCHERY-TRAINING and MAGIC-TRAINING tags, and separate training routines for each one, with the specific animations built in to those routines. That works. But imagine your design team then say, “Let’s have a Robin Hood Training School, which offers archery and swordfighting”! And now that swordfighting is in, they ask for “Gandalf’s Academy of Spells and Swordfighting”. You find yourself needing to support multiple tags per location, looking up different animations based on which aspect of the training the adventurer needs, and so on.

Another way would be to store this information directly in the object, along with the effect it has on the player, so that the AI can simply be told what the options are, and select from them accordingly based on the agent’s needs. It can then move to the relevant place, perform the relevant animation (or any other prerequisite activity) as specified by the object, and gain the rewards accordingly.

                    | Animation to Perform | Result for User
Archery Range       | Shoot-Arrow          | +10 Archery Skill
Magic School        | Cast-Spell           | +10 Magic Skill
Robin Hood’s School | Shoot-Arrow          | +15 Archery Skill
Robin Hood’s School | Sword-Duel           | +8 Swords Skill
Gandalf’s Academy   | Sword-Duel           | +5 Swords Skill
Gandalf’s Academy   | Cast-Spell           | +10 Magic Skill

 

An archer character in the vicinity of the above 4 locations would be given these 6 options, of which 4 are irrelevant as the character is neither a sword user nor a magic user. By matching on the outcome (in this case, the improvement in skill) rather than on a name or a tag, we make it easy to extend our world with new behaviour. We can add Inns for rest and food. We could allow adventurers to go to the Library and read about spells, but also about advanced archery techniques.

Object Name | Animation to Perform | End Result
Inn         | Buy                  | -10 Hunger
Inn         | Sleep                | -50 Tiredness
Library     | Read-Book            | +10 Spellcasting Skill
Library     | Read-Book            | +5 Archery Skill

 

If we just had a “Train In Archery” behaviour, even if we tagged our Library as an ARCHERY-TRAINING location, we would presumably still need a special case to handle using the ‘read-book’ animation instead of the usual archery animation. This system gives us more flexibility by moving these associations into data and storing the data in the world.

By having the objects or locations – like the Library, the Inn, or the training schools – tell us what services they offer, and what the character must do to obtain them, we get to use a small number of animations and simple descriptions of the outcome to enable a vast number of interesting behaviours. Instead of objects passively waiting to be queried, those objects can instead give out a lot of information about what they can be used for, and how to use them.
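
A sketch of how such objects might advertise their options as plain data, with the agent choosing by outcome rather than by tag; the numbers follow the tables above, while the function and the skill sets are invented for illustration:

    # Each smart object advertises (animation to perform, skill affected, amount).
    smart_objects = {
        "Archery Range":       [("Shoot-Arrow", "Archery", 10)],
        "Robin Hood's School": [("Shoot-Arrow", "Archery", 15), ("Sword-Duel", "Swords", 8)],
        "Library":             [("Read-Book", "Spellcasting", 10), ("Read-Book", "Archery", 5)],
    }

    def best_training_option(agent_skills, nearby_objects):
        # Pick the option that improves a skill this agent actually uses by the most.
        best = None
        for obj_name in nearby_objects:
            for animation, skill, gain in smart_objects[obj_name]:
                if skill in agent_skills and (best is None or gain > best[3]):
                    best = (obj_name, animation, skill, gain)
        return best

    # An archer near all three heads to Robin Hood's School and plays the
    # Shoot-Arrow animation for +15 Archery.
    print(best_training_option({"Archery"}, smart_objects.keys()))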

Response Curves

Often you have a situation where part of the world state can be measured as a continuous value – for example:

  • “Health Percentage” typically ranges from 0 (dead) to 100 (in perfect health)
  • “Distance To Nearest Enemy” ranges from 0 to some arbitrary positive value

You also might have some aspect of your AI system that requires continuous-valued inputs in some other range. For example, a utility system might want to take both the distance to the nearest enemy and the character’s current health into account when deciding whether to flee or not.

However, the system can’t just add the two world state values to create some sort of ‘safeness’ level because the two units are not comparable – it would consider a near-dead character almost 200 units from an enemy to be just as safe as a character in perfect health who was 100 units from an enemy. Similarly, while the health percentage value is broadly linear, the distance is not – the difference between an enemy being 200m away and 190m away is much less significant than the difference between an enemy 10m away and one right in front of the character.

Ideally, we want an approach that can take these 2 measurements and convert them into similar ranges so that they can be directly compared. And we want designers to be able to choose how these conversions work, so that they can control the relative importance of each value. Response Curves are a tool to do just that.

The simplest explanation of a response curve is that it’s a graph with input along the X axis, the arbitrary values like “distance to nearest enemy”, and output along the Y axis, usually a normalized value from 0.0 to 1.0. The line or curve through the graph determines the mapping from the input to the normalized output, and designers tweak those lines to get the behavior they want.

For our “safeness” level calculation, we might decide to keep the health percentage as a linear value – i.e. 10% more health is equally good whether we’re badly injured or lightly injured – so mapping it to the 0 to 1 range is straightforward:

safetyhealthvalues.png

The distance to the nearest enemy is a bit different, since we presumably don’t care at all about enemies beyond a certain distance – let’s say 50 metres – and we care a lot more about the differences at close range than at long range.

Here we can see that the ‘safety’ output for an enemy at 40 or 50 metres is very similar: 0.96 vs 1.0.

safetydistancevalues.png

However, there’s a bigger safety difference between an enemy 15 metres away – roughly 0.5 – and an enemy 5 metres away – roughly 0.2. This better reflects the urgency that applies as an enemy draws nearer.

With these 2 values both scaled into the 0 to 1 range, we could calculate the overall Safety value as the average of the two input values. A character with 20% health and an enemy 50 metres away could have a Safety score of 0.6. A character with 75% health and an enemy just 5 metres away could have a Safety score of 0.47. And a badly wounded character with 10% health and an enemy just 5 metres away would have a Safety score of just 0.145.

Some things to bear in mind:

  • It’s common to use some sort of weighted average to combine the output of response curves into the final value – this makes it easier to re-use the same curves for calculating different values, by using different weights in each case to reflect the differing importance.
  • When the input value is outside the prescribed range – for example, an enemy more than 50m away in our example above – it’s common to clamp the input value to the maximum so the calculation acts as if they were at that range.
  • Implementation of the response curve will often take the form of a mathematical equation, typically running the (perhaps clamped) input through a linear equation or simple polynomial. But any system that allows a designer to create and evaluate a curve may suffice – for example, the Unity AnimationCurve object allows for the placement of arbitrary values, the choice of whether to smooth the line between the values or not, and the evaluation of any point along the line.
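As an illustration, here is a minimal sketch (in JavaScript) of how the health and distance curves above could be combined. The exact curve behind the distance graph isn’t specified, but a simple quadratic of this shape happens to reproduce the numbers in the worked example; the function names and weights are placeholders.

// Clamp a value into the 0..1 range.
function clamp01 (x) {
  return Math.max(0, Math.min(1, x))
}

// Linear response curve: health percentage maps straight onto 0..1.
function healthCurve (healthPercent) {
  return clamp01(healthPercent / 100)
}

// Non-linear response curve: clamp the distance to 50m, then apply a quadratic
// so that changes at close range matter much more than changes at long range.
function distanceCurve (distanceMetres) {
  const t = clamp01(distanceMetres / 50)
  return 1 - (1 - t) * (1 - t)
}

// Weighted average of the two normalized outputs.
function safety (healthPercent, distanceMetres, healthWeight = 0.5, distanceWeight = 0.5) {
  return (healthWeight * healthCurve(healthPercent) +
          distanceWeight * distanceCurve(distanceMetres)) /
         (healthWeight + distanceWeight)
}

// safety(20, 50) ≈ 0.6, safety(75, 5) ≈ 0.47, safety(10, 5) ≈ 0.145 –
// matching the worked example above.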

Blackboards

Often we find ourselves in a situation where the AI for an agent needs to start keeping track of knowledge and information it is picking up during play so that it can be used in future decision making. Maybe an agent needs to remember who the last character to attack it was, so that it knows that should be the focus of attacks in the short term. Or maybe it wants to note how long it was since it heard a disturbance, so that after some period of time it can stop investigating and go back to whatever it was doing before. Often the system that writes the data is quite separate from the system that reads the data, so it needs to be easily accessible from the agent rather than built in to the various AI systems directly. The reads may happen some time after the writes, so it needs to be stored somewhere that it can be retrieved later (rather than being calculated on demand, which may not be possible).

In a hard-coded AI system the answer here is usually just to add the necessary variables as the need for them arises. These variables go into the character or agent instances, either directly inline, or in the form of a separate structure or class to hold this information. AI routines are adapted to read and write from this data as needed. This works well as a simple approach, but can get unwieldy as more and more pieces of information need adding, and usually requires rebuilding the game each time.

A more advanced approach might be to change this data store into something that allows systems to read and write arbitrary data – something that would allow new variables to be added without needing to change the data structure, and thereby increasing the number of changes that can be made from data files and scripts without needing a rebuild. If each agent simply keeps a list of key/value pairs, one for each discrete piece of knowledge, the various AI systems can collaborate to add in this information and read it when necessary.

These approaches are what are known in AI as ‘blackboards’, because the idea is that each participant – in our case, the various AI routines like perception and pathfinding and decision-making – can write on the blackboard when it needs to record what it knows, and can read anything else on the blackboard written by the others in order to carry out its task. The analogy is that of a team of specialists gathered around a board, writing on it each time they have something useful to share with the group, reading their peers’ previous contributions, until they reach an agreed solution or plan. The hard-coded list of shared variables is sometimes called a ‘static blackboard’ (because the slots in which information is stored are fixed and cannot change at run-time) and the arbitrary list of key/value pairs is often termed a ‘dynamic blackboard’ by comparison, but the way they are used is much the same: as an information mediator between parts of the AI system.

In traditional AI the emphasis is usually on collaboration between numerous systems to jointly solve a problem, but in game AI there are relatively few systems at work. Even so, some degree of collaboration can take place. Imagine the following in an action RPG:

  • A ‘perception’ system scans the area regularly and records entries like the following into the agent’s blackboard:
    • “NearestEnemy”: “Goblin #412”
    • “NearestEnemyDistance”: 35.0
    • “NearestFriend”: “Warrior #43”
    • “NearestFriendDistance”: 55.4
    • “Last Seen Disturbance”: 12:45pm
  • Systems like the combat system can record data in the blackboard when key events happen, for example:
    • “Last Damaged”: 12:34pm

A lot of this data may look redundant – after all, it should be easy to derive the distance to the nearest enemy whenever it is needed simply by knowing who that enemy is and querying for their position. But that is potentially a slow operation if done many times a frame in order to decide whether the agent is threatened or not – especially if we also need to repeat the spatial query to find out which enemy is closest. And timestamps for “last seen disturbance” or “last damaged” can’t be derived instantly anyway – there needs to be a record of when these things took place, and a blackboard is a reasonable place to store that.
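To make the idea concrete, a dynamic blackboard can be as simple as a per-agent key/value store. The sketch below is illustrative JavaScript – the class name and keys follow the example entries above and aren’t taken from any particular engine.

// A minimal dynamic blackboard: a per-agent key/value store.
class Blackboard {
  constructor () {
    this.entries = new Map()
  }
  set (key, value) {
    this.entries.set(key, value)
  }
  get (key, fallback) {
    return this.entries.has(key) ? this.entries.get(key) : fallback
  }
}

// A perception system writes what it observed...
const agentBlackboard = new Blackboard()
agentBlackboard.set('NearestEnemy', 'Goblin #412')
agentBlackboard.set('NearestEnemyDistance', 35.0)
// ...a combat system records when the agent was last hit...
agentBlackboard.set('LastDamaged', Date.now()) // a real game would use the game clock
// ...and a decision-making system reads those entries later.
const threatened = agentBlackboard.get('NearestEnemyDistance', Infinity) < 10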

Unreal Engine 4 uses a dynamic blackboard system for the data provided to its Behaviour Trees. By providing this shared data object it is easy for designers to write new values into the blackboard based on their Blueprints (visual scripts) and for the behaviour tree to read those values later to help choose behaviour, all without requiring any recompilation of the engine.

Influence Maps

A common problem in game AI is deciding exactly where an agent should try and move to. In a shooter game we might have selected an action such as “Move to Cover”, but how do we decide where the cover is, in the face of moving enemies? Similarly, what exactly does it mean to “Flee” – where is the safest place to run to? Or in an RTS game, we might want to have our troops attack a weak point in the opponent’s defences – what is a convenient way to determine where the weakest point is?

These can all be considered geographical queries, because we’re asking a question about the shape and form of the environment and the position of entities within it. Our game is likely to have all that data to hand, but making sense of it is tricky. For example, if we want to find the weak point in an enemy’s defences, simply choosing the position of the weakest building or fortification is not good enough if it is flanked by two powerful weapons systems! We need a way to take the local area into account to give us a better overview of the situation.

The Influence Map is a data structure designed to do exactly this. It represents the ‘influence’ that an entity might have over the area around it, and by combining the influence of multiple entities, presents a more realistic view of the whole landscape. In implementation terms, we approximate the game world by overlaying a 2D grid, and after determining which grid square an entity is in, we can apply their influence score – representing whatever aspect of the gameplay we are trying to model – to that square and some of the surrounding ones. We accumulate these values in the same grid to gain the overall picture. Then we can query the grid in various ways to understand the world and make decisions about positioning and destinations.
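For illustration, a minimal influence map might look like the JavaScript sketch below – the grid dimensions, catapult positions, and radii are placeholders chosen to mirror the example that follows.

// A 2D grid of influence scores.
class InfluenceMap {
  constructor (width, height) {
    this.width = width
    this.height = height
    this.cells = new Float32Array(width * height)
  }

  // Stamp a score onto every cell within `radius` grid squares of (cx, cy).
  addInfluence (cx, cy, radius, score) {
    for (let y = Math.max(0, cy - radius); y <= Math.min(this.height - 1, cy + radius); y++) {
      for (let x = Math.max(0, cx - radius); x <= Math.min(this.width - 1, cx + radius); x++) {
        if (Math.hypot(x - cx, y - cy) <= radius) {
          this.cells[y * this.width + x] += score
        }
      }
    }
  }

  get (x, y) {
    return this.cells[y * this.width + x]
  }
}

// Each catapult stamps +1 onto every square within its firing range;
// squares covered by two catapults accumulate a score of +2.
const danger = new InfluenceMap(20, 10)
danger.addInfluence(4, 6, 5, 1)   // two catapults close together on the left
danger.addInfluence(6, 6, 5, 1)
danger.addInfluence(15, 6, 5, 1)  // one catapult over on the right
// Any square in the attack zone where danger.get(x, y) === 0 is a candidate
// position from which to attack the wall without risking catapult fire.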

Let’s take the example of the ‘weakest point in the opponent’s defence’. We have a defensive wall that we want to send footsoldiers to attack, but there are 3 catapults behind it – 2 close together on the left, and 1 over on the right. How do we choose a good position for the attack?

First, we can assign each catapult a defence score of +1 for all gridsquares within firing range. Plotting these scores on the influence map for one catapult looks like this:

[Image: AI Influence Maps 1.png]

The blue box covers all the squares that we might consider attacking the wall. The red squares represent +1 influence for the catapults, which in this case means somewhere they can attack, and therefore presents danger for an invading unit.

If we now add in the influence from a second catapult:

[Image: AI Influence Maps 2.png]

Now we have a darker area, where the influence of the two catapults overlaps, scoring +2 in those squares. The +2 square inside the blue zone would be an especially dangerous place to choose to attack the wall! Let’s add in the influence from our final catapult:

[Image: AI Influence Maps 3.png]

[Icons: CC-BY: https://game-icons.net/heavenly-dog/originals/defensive-wall.html]

Now we have a full indication of the area covered by the catapults. In our potential attack zone, we have one square which has +2 influence for the catapults, 11 squares that have +1, but we have 2 squares which have 0 influence from the catapults – these are prime candidates for our chosen attack position, where we can attack the wall without risking fire from the catapults.

The benefit of the influence map here is that it transforms a continuous space with an almost endless set of position possibilities into a discrete set of rough positions that we can reason about very quickly.

Still, we can get that benefit just by picking a small number of candidate attack positions – why would we choose to use an influence map here instead of manually checking the distance to each catapult from each of those positions?

Firstly, the influence map can be very cheap to calculate. Once the map has been written with the influence scores, it doesn’t need to change at all unless the entities in it move. That means you don’t have to perform distance calculations all the time or continue iterating over every possible unit – we ‘bake’ that information into the map just once, and can query it as many times as we like.

Secondly, we can overlay and combine multiple influence maps to perform more complex queries. For example, to find a safe place to flee to, we might take the influence map of our enemies, and subtract the map of our friends – gridsquares with a high negative score would therefore be considered safe.

[Image: AI Influence Maps 4.png]

More red means more danger, more green means more safety. Areas where they overlap can fully or partially cancel out, to reflect the conflicting areas of influence.
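In code, combining maps is just a cell-by-cell operation over grids of the same dimensions – a small sketch continuing the earlier one:

// Cell-by-cell subtraction of one influence map from another.
function subtractMaps (enemies, friends) {
  const result = new InfluenceMap(enemies.width, enemies.height)
  for (let i = 0; i < result.cells.length; i++) {
    result.cells[i] = enemies.cells[i] - friends.cells[i]
  }
  return result
}
// The most negative cells are the ones most dominated by friendly influence –
// good candidates for a place to flee to.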

Finally, they are easy to visualise by drawing them in the world. This can be a valuable aid to designers who need to tune the AI based on visible properties, and can be watched in real-time to understand why the AI is making the decisions that it does.

Conclusion

Hopefully this has given you a broad overview of some of the most common tools and approaches used in game AI, and the situations in which they are useful. Many other techniques – less commonly used but potentially just as effective – have not been covered, and these include:

  •  algorithms for optimization tasks, including hill-climbing, gradient descent, and genetic algorithms
  •  adversarial search/planning algorithms such as minimax and alpha-beta pruning
  •  classification techniques such as perceptrons, neural networks, and support vector machines
  •  systems to handle agent perceptions and memories
  •  AI architectural approaches, such as hybrid systems, subsumption architectures, and other ways to layer AI systems
  •  animation tools such as motion planning and motion matching
  •  performance considerations such as level of detail, anytime algorithms, and timeslicing

To read more on these topics, and the topics covered in detail above, the following sources may prove useful.

Of course, on GameDev.net you have the Artificial Intelligence articles and tutorials reference as well as the Artificial Intelligence forum.

Much of the best material can be found in books, including the following:

Beyond that there are several good books on general game AI written by industry pros and it would be hard to single any out above the others – read the reviews and pick one that sounds like it would suit you.

 


Building a Responsive Dashboard with Vue.js

https://www.michael-iriarte.com/articles/responsive-vue-dashboard/index.html#/

1. Project Setup

One of the main paradigms of Vue.js is the single-file component: each component contains its own HTML markup, JavaScript logic and CSS styles inside a single .vue file.

In order to bundle this architecture into production-ready static assets, the most common approach is to use Webpack with the Vue loader (vue-loader), which can be a bit confusing when getting started. Luckily Vue offers a very useful command line interface to bootstrap a project in no time.

• Install Vue cli globally

npm install -g @vue/cli

• Create a new project (from parent directory)

vue create my-cool-dashboard

When initializing the project, make sure to select Babel, Vuex and SCSS in order to be able to run the following code samples. Use the space bar to toggle options and the up/down arrows to navigate. The script should run for a little while, installing the npm dependencies. Once complete, navigate inside the newly created directory and let’s start the webpack dev server:

cd ./my-cool-dashboard/
npm run serve

Now your dev server should be running; if you navigate to the server URL, you should see the demo Vue.js landing page. We’re ready to cook!


2. Responsive Grid

Our dashboard is going to display multiple charts on different platforms and screen sizes. Our goal is to set up a responsive system that optimizes the available surface and displays the charts accordingly.
There is a great article that covers the technique using Less.
The code below is very similar, except that it uses SCSS instead and the grid has 6 columns rather than 12.
Below are the key SCSS mixins:

@mixin flex-size($col: 6, $gutter: 1%) {
  flex-basis: (100% / (6 / $col)) - $gutter * 2;
}

@mixin six-columns-layout($screen-type: desktop, $gutter: 1%) {
  .#{$screen-type}-1-col {
    @include flex-size(1, $gutter);
  }
  .#{$screen-type}-2-col {
    @include flex-size(2, $gutter);
  }
  .#{$screen-type}-3-col {
    @include flex-size(3, $gutter);
  }
  .#{$screen-type}-4-col {
    @include flex-size(4, $gutter);
  }
  .#{$screen-type}-5-col {
    @include flex-size(5, $gutter);
  }
  .#{$screen-type}-6-col {
    @include flex-size(6, $gutter);
  }
}

/** Grid Layout **/
@mixin grid-6($element-selector) {
  @at-root #{$element-selector + &} {
    display: flex;
    // if any margin, we want it spaced evenly
    justify-content: space-evenly;
    // we want all the widgets to have the same height when on the same line
    align-items: stretch; 
    // wrap the list of widgets over multiple lines if needed
    flex-wrap: wrap;
    @media (min-width: $nav-max-width) {
      @include six-columns-layout(desktop);
    }
    @media (max-width: $nav-max-width) {
      @include six-columns-layout(tablet);
    }
    @media (max-width: $mobile-size) {
      @include six-columns-layout(phone);
    }
  }
}

Once the mixin is ready we can use it inside our Grid.vue component:

@import "./../styles/mixins.scss";
.grid {
  @include grid-6(&)
}         

Finally we can add widgets to our grid and use the following CSS classes to set the responsive sizes.

For example for small metrics, we want a 6 column layout on desktop, 3 column layout on tablet and 2 column layout on mobile, so for each metric widget we add the following classes:

phone-3-col tablet-2-col desktop-1-col

If instead we’re displaying a larger chart we would want to use a layout with a single column on all platforms.

phone-6-col tablet-6-col desktop-6-col

Now that we have a responsive layout in place, we’re ready to start adding widgets to the grid. But before that, let’s take a little detour and focus first on loading data into the app.


3. Loading Data

For the sake of this tutorial, we will be loading a JSON file to simulate a GET request – which should be pretty easy to switch over to your own backend API.

First we install axios, a handy XHR client wrapper library:

npm i axios

Then we create a dashboard-data.json file containing our dashboard data (which stands in for the API response). We create this JSON file inside the static directory public/assets/.

{
  "widgets": {
    "transactions": "250K",
    "weather": "☀️",
    "responsiveness": "99%",
    "events": "28,320",
    "hits": "9.12K",
    "convertion": "69%",
    "jsFrameworks": {
      "range": [0, 10000],
      "values": [9892, 8932, 4253, 1990, 1600],
      "labels": ["vue.js", "react", "angular", "backbone", "jQuery"]
    },
    "topWines": {
      "range": [0, 440000],
      "values": [440000, 280953, 144500, 120040],
      "labels": ["Haut Médoc", "Pessac", "Beaujolais", "Rioja"]
    }
  }
}

To load the API data, we create an XHR client class using the axios dependency we installed earlier:

import axios from 'axios'
export default class DashboardAPI {
  static loadDashboardData () {
    return axios.get('./assets/dashboard-data.json')
  }
}

Next we need to set up Vuex. If you’re not familiar with it, below are some useful resources on the Flux architecture:
FLUX presentation
VueX documentation

A lot of this code may look verbose at first, and may feel like over-engineering if you are not familiar with these concepts. But bear with me: it will allow us to scale and maintain the app in the long run, breaking the complexity of the project down into smaller, more manageable chunks.
In this tutorial we create a Vuex store sub-module, widgets, which will manage the state/data of our widgets – in this case 6 metrics, 2 bar charts and 1 3D map.

state.js

Our main state defines the widget states as well as a boolean flag for the loading/ready status. Note that all the widget values are initialized to null.

export default {
  loading: true,
  widgets: {
    transactions: {
      value: null
    },
    weather: {
      value: null,
    },
    responsiveness: {
      value: null,
    },
    events: {
      value: null
    },
    hits: {
      value: null
    },
    convertion: {
      value: null
    },
    jsFrameworks: {
      range: null,
      values: null,
      labels: null
    },
    topWines: {
      range: null,
      values: null,
      labels: null
    },
    map3D: {
      // not needed in this tutorial
    }
  }
}

actions.js

We only need one action here, which is pulling the data from the API. The action will be dispatched by the Grid when mounted. When called,
(1) the loading flag is set to true,
(2) we make a request to the server,
(3) when complete we mutate the state with the fresh data
(4) and set the loading flag to false so that the component can now render the available data.

import * as types from './mutations-types'
import DashboardAPI from '@/api/DashboardAPI'
export const loadDashboardData = ({ commit }) => {
  commit(types.SET_LOADING_STATE, true)
  DashboardAPI.loadDashboardData().then(response => {
    const {data} = response
    commit(types.SET_DASHBOARD_DATA, data)
    commit(types.SET_LOADING_STATE, false)
  })
}

mutations.js

The action above commits two types of mutations. These are pure functions that are never async.

import * as types from './mutations-types'
export default {
  [types.SET_LOADING_STATE] (state, value) {
    state.loading = value
  },
  [types.SET_DASHBOARD_DATA] (state, {widgets}) {
    // Metrics
    state.widgets.transactions.value = widgets.transactions
    state.widgets.weather.value = widgets.weather
    state.widgets.responsiveness.value = widgets.responsiveness
    state.widgets.events.value = widgets.events
    state.widgets.hits.value = widgets.hits
    state.widgets.convertion.value = widgets.convertion
    // SVG Charts
    state.widgets.jsFrameworks.range = widgets.jsFrameworks.range
    state.widgets.jsFrameworks.values = widgets.jsFrameworks.values
    state.widgets.jsFrameworks.labels = widgets.jsFrameworks.labels
    state.widgets.topWines.range = widgets.topWines.range
    state.widgets.topWines.values = widgets.topWines.values
    state.widgets.topWines.labels = widgets.topWines.labels
  }
}
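Both actions.js and mutations.js import their constants from a mutations-types file that isn’t listed in the article; a minimal version would simply export the two constant names used above:

// mutations-types.js
export const SET_LOADING_STATE = 'SET_LOADING_STATE'
export const SET_DASHBOARD_DATA = 'SET_DASHBOARD_DATA'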

getters.js

To read and react to state changes, components rely on the Vuex mapGetters helper. We could technically declare a single widgets getter and use dot notation to retrieve all the widget data within the tree, but with maintainability and portability in mind we split the getters into a detailed list of widgets:

export default {
  isLoading: state => state.loading,
  transactions: state => state.widgets.transactions,
  convertion: state => state.widgets.convertion,
  hits: state => state.widgets.hits,
  events: state => state.widgets.events,
  responsiveness: state => state.widgets.responsiveness,
  weather: state => state.widgets.weather,
  jsFrameworks: state => state.widgets.jsFrameworks,
  topWines: state => state.widgets.topWines,
}
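The article doesn’t show how these four files are wired together, but the ‘widgets/’ prefix used when dispatching the action and mapping the getters implies a namespaced module. A plausible sketch (the file path and layout are assumptions):

// store/modules/widgets/index.js – registered in the root store as the 'widgets' module
import state from './state'
import * as actions from './actions'
import mutations from './mutations'
import getters from './getters'

export default {
  namespaced: true,
  state,
  actions,
  mutations,
  getters
}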

Load the data!

Now that our Vuex store is set up, all that’s left is to trigger the main action from a component. In our example, we load the data every time the Grid component is mounted:

mounted () {
  // load fresh data every time we land on the view
  this.$store.dispatch('widgets/loadDashboardData')
}

After page load, open the vue dev tools and you will be able to inspect the global state at any mutation point.


4. HTML component

Let’s start with a simple text component that displays data using HTML (source file).
This very basic component only takes two attributes: value and label.
Its responsiveness is inherited from the responsive grid, so it is very flexible with no extra setup work.

props: {
  value: {
    type: [String, Number],
    required: true
  },
  label: {
    type: String,
    default: ""
  }
}

Usage in your template:

<metric value="value" label="label" />

[Embedded demo: MetricChart.vue displaying a value and a label]


5. SVG component

One of the main reasons I really enjoy working with SVG is its viewBox and preserveAspectRatio properties. They let you define the size of the viewport and how it should resize. Here is a very good guide on the topic.

<svg
  xmlns="http://www.w3.org/2000/svg"
  preserveAspectRatio="none"
  viewBox="0 0 300 100">

Since our viewBox is now defined (300×100), we can safely base our ratio calculations on those fixed units and let SVG resize the result for us.
For example:

labelLineOffsetY (index) {
  // chart is 100 unit height
  // we divide 100 by the number of items and multiply by index to get offset
  const offset = Math.round(100 / this.values.length)
  return offset * index
}

Go ahead and resize your browser: you will see that the chart remains consistent, no matter what its scale is. One way to extend this tutorial would be to make the SVG font size dynamic and react to the SVG size, but that’s out of scope for now.
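In the same spirit, a bar’s length can be derived from a value and the range prop against the fixed 300-unit viewBox width. The method name below is an assumption, following the pattern of labelLineOffsetY:

barWidth (index) {
  // the viewBox is 300 units wide; scale the value against the range
  const [min, max] = this.range
  const ratio = (this.values[index] - min) / (max - min)
  return Math.round(300 * ratio)
}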

WidgetChartSVG.vue (source file)

[Embedded demo: a ‘skills’ bar chart rendered by WidgetChartSVG.vue]


6. Canvas component

So far we have a small collection of components (HTML metric and SVG chart) and things haven’t been too complex, because most of the responsiveness handling was taken care of natively by the browser.

When dealing with the canvas element, and especially with the 3D context, we need to manually initialize, destroy and resize our component in order to support various screen sizes and browser resizing.

In this example (source file) we will be loading a three.js demo; without focusing on the WebGL side too much, we will cover the key points needed to integrate it into your Vue app.

First we install three.js – which needs no introduction 😍

npm i three

All the three.js demo code is included in the src/gl directory.

The GL.js class has 3 methods that we will use from the Vue component:
• constructor,
• handleResize,
• destroy
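The actual GL.js lives in the sample project’s src/gl directory; purely as an illustration of those three methods, a rough skeleton might look like this (the scene contents are omitted):

import * as THREE from 'three'

export default class GL {
  constructor (canvas, container) {
    this.container = container
    this.renderer = new THREE.WebGLRenderer({ canvas, antialias: true })
    this.scene = new THREE.Scene()
    this.camera = new THREE.PerspectiveCamera(45, 1, 0.1, 1000)
    this.running = true
    this.handleResize()
    this.loop = this.loop.bind(this)
    requestAnimationFrame(this.loop)
  }

  loop () {
    if (!this.running) return
    this.renderer.render(this.scene, this.camera)
    requestAnimationFrame(this.loop)
  }

  handleResize () {
    const { clientWidth, clientHeight } = this.container
    this.camera.aspect = clientWidth / clientHeight
    this.camera.updateProjectionMatrix()
    this.renderer.setSize(clientWidth, clientHeight, false)
  }

  destroy () {
    this.running = false
    this.renderer.dispose()
  }
}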

In our canvas component WidgetMap3D.vue

constructor()

Initialize a new canvas when the component gets mounted:

mounted () {
  this.gl = new GL(this.$refs.canvas, this.$refs.container)
}

destroy()

Destroy the canvas right before the component gets destroyed:

beforeDestroy () {
  this.gl.destroy()
}

handleResize()

Now if you dig into the sample project, you will notice I added a global window.resize event handler that updates the vuex state:
state.ui.window.width and
state.ui.window.height
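That resize handler isn’t shown in the article; a minimal sketch of what it might look like (the ui module and the SET_WINDOW_SIZE mutation name are assumptions):

// main.js (or a small plugin) – keep the vuex ui state in sync with the window
import store from './store'

function updateWindowSize () {
  store.commit('ui/SET_WINDOW_SIZE', {
    width: window.innerWidth,
    height: window.innerHeight
  })
}
window.addEventListener('resize', updateWindowSize)
updateWindowSize() // record the initial size on startup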
Our component just needs to import the windowWidth getter and set up a watcher to update the canvas when the window size changes:

Import the windowWidth value via the Vuex mapGetters helper:

computed: {
  ...mapGetters({
    'windowWidth': 'ui/windowWidth'
  })
}

Set up a watcher and update the canvas on resize:

watch: {
  windowWidth () {
    this.gl.handleResize()
  }
}

Now you can use the component like this:

<map-3d title="three.js map" />

[Embedded demo: WidgetMap3D.vue rendering the three.js map]


7. Put it All Together

Now our grid can import these components and map their attributes to the Vuex store data.

Another task of Grid.vue is to tell Vuex when to load the data (on component mount). This is done by dispatching ‘widgets/loadDashboardData’.

Grid.vue (source file)

import WidgetMetric from '@/components/grid/WidgetMetric'
import WidgetChartSVG from '@/components/grid/WidgetChartSVG'
import WidgetMap3D from '@/components/grid/WidgetMap3D'
import { mapGetters } from 'vuex'

export default {
  name: 'Grid',
  components: {
    'metric': WidgetMetric,
    'chart-svg': WidgetChartSVG,
    'map-3d': WidgetMap3D
  },
  computed: {
    ...mapGetters({
      'isLoading': 'widgets/isLoading',
      'transactions': 'widgets/transactions',
      'convertion': 'widgets/convertion',
      'hits': 'widgets/hits',
      'events': 'widgets/events',
      'responsiveness': 'widgets/responsiveness',
      'weather': 'widgets/weather',
      'jsFrameworks': 'widgets/jsFrameworks',
      'topWines': 'widgets/topWines',
    })
  },
  mounted () {
    // load fresh data every time we land on the view
    this.$store.dispatch('widgets/loadDashboardData')
  }
}

And finally the template, where we bind the Vuex data to component attributes, define the CSS classes for the responsive layout, and set up a quick and cheap loading screen.

<template>
  <div class="grid">

    <div
      v-if="isLoading"
      class="loading">
      Loading...
    </div>

    <metric
      v-if="!isLoading"
      class="widget phone-3-col tablet-2-col desktop-1-col"
      :value="transactions.value"
      label="transactions"
    />

    <metric
      v-if="!isLoading"
      class="widget phone-3-col tablet-2-col desktop-1-col"
      :value="weather.value"
      label="weather"
    />

    <metric
      v-if="!isLoading"
      class="widget phone-3-col tablet-2-col desktop-1-col"
      :value="responsiveness.value"
      label="responsiveness"
    />

    <metric
      v-if="!isLoading"
      class="widget phone-3-col tablet-2-col desktop-1-col"
      :value="events.value"
      label="events"
    />

    <metric
      v-if="!isLoading"
      class="widget phone-3-col tablet-2-col desktop-1-col"
      :value="hits.value"
      label="hits"
    />

    <metric
      v-if="!isLoading"
      class="widget phone-3-col tablet-2-col desktop-1-col"
      :value="convertion.value"
      label="convertion"
    />

    <chart-svg
      v-if="!isLoading"
      class="widget phone-6-col tablet-3-col desktop-3-col"
      title="javascript frameworks"
      :range="jsFrameworks.range"
      :values="jsFrameworks.values"
      :labels="jsFrameworks.labels"
    />

    <chart-svg
      v-if="!isLoading"
      class="widget phone-6-col tablet-3-col desktop-3-col"
      title="top wines"
      :range="topWines.range"
      :values="topWines.values"
      :labels="topWines.labels"
    />

    <map-3d
      v-if="!isLoading"
      class="widget phone-6-col tablet-6-col desktop-6-col"
      title="three.js map"
    />

  </div>
</template>

8. Building for Production

When you’re ready to share your dashboard with the rest of the world, you will want to build your assets in order to deploy them to a CDN.

vue-cli comes already set up, and you can build the assets for production by simply running:

npm run build

Sometimes you’ll want to set specific configuration options; all you need to do is create a vue.config.js file at the root of your project. Below, for example, is a way to set the base URL of your project and the port number of the dev server:

module.exports = {
  baseUrl: process.env.NODE_ENV === 'production'
    ? 'https://www.michael-iriarte.com/articles/responsive-vue-dashboard'
    : '/',
  devServer: {
    port: 47000
  }
}

This is pretty much it for now: a basic wireframe app onto which you can add your own components and visualizations.

Check out the demo or the source on GitHub.

I hope this article helps some of you get started with Vue. Please post in the comments if you have any questions.

Share the web!

[Transcript] Richard Feynman on Why Questions

I thought this video was a really good example of question-dissolving by Richard Feynman. But it’s in 240p! Nobody likes watching 240p videos. So I transcribed it. (Edit: That was in jest. The real reasons are because I thought I could get more exposure this way, and because a lot of people appreciate transcripts. Also, Paul Graham speculates that the written word is universally superior to the spoken word for the purpose of ideas.) I was going to post it as a rationality quote, but the transcript was sufficiently long that I think it warrants a discussion post instead.

Here you go:

Interviewer: If you get hold of two magnets, and you push them, you can feel this pushing between them. Turn them around the other way, and they slam together. Now, what is it, the feeling between those two magnets?

Feynman: What do you mean, “What’s the feeling between the two magnets?”

Interviewer: There’s something there, isn’t there? The sensation is that there’s something there when you push these two magnets together.

Feynman: Listen to my question. What is the meaning when you say that there’s a feeling? Of course you feel it. Now what do you want to know?

Interviewer: What I want to know is what’s going on between these two bits of metal?

Feynman: They repel each other.

Interviewer: What does that mean, or why are they doing that, or how are they doing that? I think that’s a perfectly reasonable question.

Feynman: Of course, it’s an excellent question. But the problem, you see, when you ask why something happens, how does a person answer why something happens? For example, Aunt Minnie is in the hospital. Why? Because she went out, slipped on the ice, and broke her hip. That satisfies people. It satisfies, but it wouldn’t satisfy someone who came from another planet and who knew nothing about why when you break your hip do you go to the hospital. How do you get to the hospital when the hip is broken? Well, because her husband, seeing that her hip was broken, called the hospital up and sent somebody to get her. All that is understood by people. And when you explain a why, you have to be in some framework that you allow something to be true. Otherwise, you’re perpetually asking why. Why did the husband call up the hospital? Because the husband is interested in his wife’s welfare. Not always, some husbands aren’t interested in their wives’ welfare when they’re drunk, and they’re angry.

And you begin to get a very interesting understanding of the world and all its complications. If you try to follow anything up, you go deeper and deeper in various directions. For example, if you go, “Why did she slip on the ice?” Well, ice is slippery. Everybody knows that, no problem. But you ask why is ice slippery? That’s kinda curious. Ice is extremely slippery. It’s very interesting. You say, how does it work? You could either say, “I’m satisfied that you’ve answered me. Ice is slippery; that explains it,” or you could go on and say, “Why is ice slippery?” and then you’re involved with something, because there aren’t many things as slippery as ice. It’s very hard to get greasy stuff, but that’s sort of wet and slimy. But a solid that’s so slippery? Because it is, in the case of ice, when you stand on it (they say) momentarily the pressure melts the ice a little bit so you get a sort of instantaneous water surface on which you’re slipping. Why on ice and not on other things? Because water expands when it freezes, so the pressure tries to undo the expansion and melts it. It’s capable of melting, but other substances get cracked when they’re freezing, and when you push them they’re satisfied to be solid.

Why does water expand when it freezes and other substances don’t? I’m not answering your question, but I’m telling you how difficult the why question is. You have to know what it is that you’re permitted to understand and allow to be understood and known, and what it is you’re not. You’ll notice, in this example, that the more I ask why, the deeper a thing is, the more interesting it gets. We could even go further and say, “Why did she fall down when she slipped?” It has to do with gravity, involves all the planets and everything else. Nevermind! It goes on and on. And when you’re asked, for example, why two magnets repel, there are many different levels. It depends on whether you’re a student of physics, or an ordinary person who doesn’t know anything. If you’re somebody who doesn’t know anything at all about it, all I can say is the magnetic force makes them repel, and that you’re feeling that force.

You say, “That’s very strange, because I don’t feel kind of force like that in other circumstances.” When you turn them the other way, they attract. There’s a very analogous force, electrical force, which is the same kind of a question, that’s also very weird. But you’re not at all disturbed by the fact that when you put your hand on a chair, it pushes you back. But we found out by looking at it that that’s the same force, as a matter of fact (an electrical force, not magnetic exactly, in that case). But it’s the same electric repulsions that are involved in keeping your finger away from the chair because it’s electrical forces in minor and microscopic details. There’s other forces involved, connected to electrical forces. It turns out that the magnetic and electrical force with which I wish to explain this repulsion in the first place is what ultimately is the deeper thing that we have to start with to explain many other things that everybody would just accept. You know you can’t put your hand through the chair; that’s taken for granted. But that you can’t put your hand through the chair, when looked at more closely, why, involves the same repulsive forces that appear in magnets. The situation you then have to explain is why, in magnets, it goes over a bigger distance than ordinarily. There it has to do with the fact that in iron all the electrons are spinning in the same direction, they all get lined up, and they magnify the effect of the force ’til it’s large enough, at a distance, that you can feel it. But it’s a force which is present all the time and very common and is a basic force of almost – I mean, I could go a little further back if I went more technical – but on an early level I’ve just got to tell you that’s going to be one of the things you’ll just have to take as an element of the world: the existence of magnetic repulsion, or electrical attraction, magnetic attraction.

I can’t explain that attraction in terms of anything else that’s familiar to you. For example, if we said the magnets attract like if rubber bands, I would be cheating you. Because they’re not connected by rubber bands. I’d soon be in trouble. And secondly, if you were curious enough, you’d ask me why rubber bands tend to pull back together again, and I would end up explaining that in terms of electrical forces, which are the very things that I’m trying to use the rubber bands to explain. So I have cheated very badly, you see. So I am not going to be able to give you an answer to why magnets attract each other except to tell you that they do. And to tell you that that’s one of the elements in the world – there are electrical forces, magnetic forces, gravitational forces, and others, and those are some of the parts. If you were a student, I could go further. I could tell you that the magnetic forces are related to the electrical forces very intimately, that the relationship between the gravity forces and electrical forces remains unknown, and so on. But I really can’t do a good job, any job, of explaining magnetic force in terms of something else you’re more familiar with, because I don’t understand it in terms of anything else that you’re more familiar with.

Is Listening to an Audio book “Cheating?”

I’ve been asked this question a lot and I hate it. I’ll describe why in a bit, but for now I’ll just change it to “does your mind do more or less the same thing when you listen to an audio book and when you read print?”

The short answer is “mostly.”


An influential model of reading is the simple view (Gough & Tunmer, 1986), which claims that two fundamental processes contribute to reading: decoding and language processing. “Decoding” obviously refers to figuring out words from print. “Language processing” refers to the same mental processes you use for oral language. Reading, as an evolutionary late-comer, must piggy-back on mental processes that already existed, and spoken communication does much of the lending.

So according to the simple model, listening to an audio book is exactly like reading print, except that the latter requires decoding and the former doesn’t.

Is the simple view right?

Some predictions you’d derive from the simple view are supported. For example, you’d expect that a lot of the difference in reading proficiency in the early grades would be due to differences in decoding. In later grades, most children are pretty fluent decoders, so differences in reading proficiency would be due more to the processes that support comprehension. That prediction seems to be true (e.g., Tilstra et al, 2009).

Especially relevant to the question of audiobooks, you’d also predict that for typical adults (who decode fluently) listening comprehension and reading comprehension would be mostly the same thing. And experiments show very high correlations of scores on listening and reading comprehension tests in adults (Bell & Perfetti, 1994; Gernsbacher, Varner, & Faust, 1990).

The simple view is a useful way to think about the mental processes involved in reading, especially for texts that are more similar to spoken language, and that we read for purposes similar to those of listening. The simple view is less applicable when we put reading to other purposes, e.g., when students study a text for a quiz, or when we scan texts looking for a fact as part of a research project.

The simple view is also likely incomplete for certain types of texts. The written word is not always similar to speech. In such cases prosody might be an aid to comprehension. Prosody refers to changes in pacing, pitch, and rhythm in speech. “I really enjoy your blog” can either be a sincere compliment or a sarcastic put-down—both look identical on the page, and prosody would communicate the difference in spoken language.


We do hear voices in our heads as we read…sometimes this effect can be notable, as when we know the sound of the purported author’s voice (e.g., Kosslyn & Matt, 1977). For audio books, the reader doesn’t need to supply the prosody–whoever is reading the book aloud does so.


For difficult-to-understand texts, prosody can be a real aid to understanding. Shakespearean plays provide ready examples. When Juliet says “Wherefore art thou Romeo?” it’s common for students to think that “wherefore” means “where,” and Juliet (who in fact doesn’t know Romeo is nearby at that moment) is wondering where Romeo is. “Wherefore” actually means “why” and she’s wondering why he’s called Romeo, and why names, which are arbitrary, could matter at all. An actress can communicate the intended meaning of “Wherefore art thou Romeo” through prosody, although the movie clip below doesn’t offer a terrific example.

So listening to an audio book may carry more information that will make comprehension a little easier. Prosody might clarify the meaning of ambiguous words or help you to assign syntactic roles to words.

But most of the time it doesn’t, because most of what you listen to is not that complicated. For most books, for most purposes, listening and reading are more or less the same thing.

So listening to an audiobook is not “cheating,” but let me tell you why I objected to phrasing the question that way. “Cheating” implies an unfair advantage, as though you are receiving a benefit while skirting some work. Why talk about reading as though it were work?

Listening to an audio book might be considered cheating if the act of decoding were the point; audio books allow you to seem to have decoded without doing so. But if appreciating the language and the story is the point, it’s not.  ​Comparing audio books to cheating is like meeting a friend at Disneyland and saying “you took a bus here? I drove myself, you big cheater.” The point is getting to and enjoying the destination. The point is not how you traveled.