What are archetypes?
Archetypes are the underlying storage method we will be using to store components in our ECS. In
our ECS we will use something like Vec<Archetype>
to store everything.
An archetype stores *all* of the components on an entity and also only stores components from entities
with the same set of components. E.G. an entity with *only* [T1, T2, T3]
components would
have all components stored in one archetype, and an entity with *only* [T1, T2, T3, T4]
components would have all the components stored in a different archetype.
Let's say we have an archetype, we'll call it A_123
. A_123
is storing
components for entities that only have components [T1, T2, T3]
.
When we spawn an entity like so:
world.spawn((
T1 { .. },
T2 { .. },
T3 { .. },
));
We need to find an archetype to place all these components into. As mentioned previously we want to find if there's
an archetype that only stores components for entities with [T1, T2, T3]
, and, luckily for us, there is! It's A_123
.
Let's take a quick look at what A_123
might look like internally and how we would go about placing these components into it.
In psuedocode we have some archetype struct that looks a bit like:
struct A_123 {
column_1: Vec<T1>,
column_2: Vec<T2>,
column_3: Vec<T3>,
}
You might be wondering why we store each component type in a separate vec rather than something like this:
struct A_123(Vec<(T1, T2, T3)>);
There are a few reasons for this,
- The borrow checker will get angry at us if we have two Query's where one wants to access components
T1
mutably and the other wants to accessT2
immutably (We cant create an iterator over only *parts* of the (T1, T2, T3) tuple afterall) - It's more performant to store the components in separate Vecs as when we iterate only
T1
we dont needlessly load theT2
andT3
components into the cpu's cache.
With that explained lets look at some psuedocode for that spawn
function we used earlier.
fn spawn(&mut self, data: (T1, T2, T3)) -> Entity {
let a_123 = /* magically get the A_123 archetype */;
let entity = self.entity_generator.spawn();
a_123.entities.push(entity);
a_123.column_1.push(data.0);
a_123.column_2.push(data.1);
a_123.column_3.push(data.2);
entity
}
...Huh interesting how all this psuedocode looks awfully like Rust... I'm sure its just a coincedence ;)
There are a few things to talk about the above code:
- What's this
entities
field that suddenly appeared? - Do we need to store any kind of metadata so that we know which entity every element in each of the component vecs corresponds to?
Both of these things are *actually* one and the same.
Because we store entities with *only* components [T1, T2, T3]
in this archetype, the component
vecs will always be the same length. This means that when we push components to the front of the vecs
they'll all end up at the same index.
What we can do with this knowledge is have have a Vec<Entity>
in the archetype and
push the spawned Entity
to the entities vec. We then have an implicit mapping where for every index in
the component columns we can access the entities
vec at the same index to check what entity this component is for.
It's for the same reason that we can also access the other component columns at this index and the component will be for
the same entity.
That last point is why Query's in archetype based ECS' are *so* fast. When we want to iterate all T1
and T2
components
in our world we can just find every archetype that has a column for T1
and also a column for T2
and then blindly iterate
both Vecs at the same time. Some psuedocode to hopefully help demonstrate what I mean:
fn iterate_T1_and_T2(archetype: &A_123) -> impl Iterator<Item = (&T1, &T2)> {
archetype.column_1.iter().zip(archetype.column_2.iter())
}
Now that you (hopefully) have an understanding of how components get stored in archetype based ECS' and the performance advantages we can get from it, it's time to talk about one of the biggest flaws of the archetype model. Adding/Removing components is *really* slow.. like.. **really** slow.
Let's continue with our previous example of spawning our entity with components [T1, T2, T3]
. It's sitting pretty comfortably in
our A_123
archetype and we'll have super fast iteration times if we add more entities to this archetype and query for their components.
Now lets try adding a component to our entity. As previously mentioned our A_123
archetype stores entities which *only* have
[T1, T2, T3]
components as this lets us have amazing iteration speeds. Now let's say we add a component T4
to out entity,
we would no longer fit this criteria which means we cant store our entity's components in A_123
, we'll have to make a second archetype-
A_1234
it will store components for entities who only have [T1, T2, T3, T4]
Lets write some quick psuedocode and see if the performance problem here speaks for itself:
fn add_T4_to_entity_in_A_123(&mut self, entity: Entity, data: T4) {
let a_123 = /* magically get the A_123 archetype */;
let index = /* magically get the index in the
component columns corresponding to the entity */;
let a_1234 = /* magically get the A_1234 archetype */;
a_1234.entities.push(entity);
a_1234.column_4.push(data);
// Oh no this doesnt look cheap at all
a_1234.column_1.push(a_123.column_1.remove(index));
a_1234.column_2.push(a_123.column_2.remove(index));
a_1234.column_3.push(a_123.column_3.remove(index));
self.archetypes.push(a_1234);
}
Aaaaaaaaand yep... we have to move each of the entity's components out of their columns in A_123
.
Then have to push the removed components to the respective columns in the other archetype (A_1234
). That's a lot of
moving around memory which isn't the fastest thing in the world- to say the least
There's not a whole lot we can do about this. The need to move all this memory around when adding/removing components is entirely necessary for the previously mentioned amazing iteration performance. Archetype ECS' are inherently about trading in add/remove performance in exchange for iteration performance.
There are others ways to model your ECS such as sparsesets which have signficantly better add/remove performance in exchange for worse iteration performance relative to archetype based ECS' (still fast though :P). There's no one best way to model an ECS, sparseset and archetype ECS' just make different tradeoffs :)
In the next part we'll actually create our Archetype
struct