The goal of this post is to explore each major version change to the ECS framework of bevy and explore some of its original roots.
V1
The initial version of bevy’s ECS was a fork of the hecs library and heavily inspired by Legion.
Bevy ECS V1, hecs, Legion, Flecs, and Unity DOTS are all archetypal ECS.
hecs is short for “A handy ECS”. Its goals were:
- fast traversals
- a simple interface
- a small dependency closure
- exclusion of externally-implementable functionality
And when we look at some hecs code we can see the influence clearly:
let mut world = World::new();
// Nearly any type can be used as a component with zero boilerplate
let a = world.spawn((123, true, "abc"));
let b = world.spawn((42, false));
// Systems can be simple for loops
for (id, (number, &flag)) in world.query_mut::<(&mut i32, &bool)>() {
if flag { *number *= 2; }
}
// Random access is simple and safe
assert_eq!(*world.get::<&i32>(a).unwrap(), 246);
assert_eq!(*world.get::<&i32>(b).unwrap(), 42);
And here is some modern (0.10) bevy code expressing something similar:
use bevy_ecs::prelude::*;
#[derive(Component)]
struct Position { x: f32, y: f32 }
#[derive(Component)]
struct Velocity { x: f32, y: f32 }
let mut world = World::new();
let entity = world
.spawn((Position { x: 0.0, y: 0.0 }, Velocity { x: 1.0, y: 0.0 }))
.id();
let entity_ref = world.entity(entity);
let position = entity_ref.get::<Position>().unwrap();
let velocity = entity_ref.get::<Velocity>().unwrap();
hecs internally tracks groups of entities which all have the same components. Each group has a dense, contiguous array for each type of component. When a system accesses all entities with a certain set of components, a fast linear traversal can be made through each group having a superset of those components. This is effectively a columnar database, and has the same benefits: the CPU can accurately predict memory accesses, bypassing unneeded data, maximizing cache use and minimizing latency.
hecs uses components, bundles, archetypes, worlds, and commands.
Bevy’s initial fork was focused on a couple of key innovations:
- Adding systems as rust functions
- Adding global and local resources as a new type of components
- Adding a scheduler to run the core game loop (single threaded)
- Removing generational entity ids and replacing them with UUIDs for serialization
- Adding the ability to query entity indices directly
- Exposing more of the core of hecs as public to build higher level apis
- About twice as much performance as hecs
There was also inspiration taken from Legion (boxed system traits + schedules + resources), and Shipyard (system functions + using archetypes to store resources + queryable entities)
At this stage there was actually only two types of functional systems: foreach systems and query systems.
Of particular note is how they
implemented
the original IntoSystem
macros that form the core of bevy systems as rust
functions.
Most of the inspiration comes from the original Legion implementation that was removed
Here is the original system definition:
pub trait System: Send + Sync {
fn name(&self) -> Cow<'static, str>;
fn id(&self) -> SystemId;
fn thread_local_execution(&self) -> ThreadLocalExecution;
fn run(&mut self, world: &World, resources: &Resources);
fn run_thread_local(&mut self, world: &mut World, resources: &mut Resources);
fn initialize(&mut self, _resources: &mut Resources) {}
// Here is what adding systems to an app looked like:
App::build()
.add_system(my_system.system())
.add_system(my_thread_local_system.thread_local_system())
.run()
The original functional system implementation had a few problems:
- system params must follow an arbitrary set of order rules
- hard to understand
- slow to compile
Its not until 0.3.0 when these were worked out. Its around then that we drop the distinction between “foreach” systems and query systems.
There was some early quirks, like Query::iter/entity
being quite awkward:
for (a, b) in &mut query.iter() {}
// Or this:
for (a, b) in query.iter().iter() {}
// This is query.get in modern bevy
if let Ok(mut result) = query.entity(entity) {
if let Some((a, b)) = result.get() { }
}
One of the problems encountered early on was the Transform
was a set of
components that could be queried independently (Translation
, Rotation
and
Scale
as separate components).
This helped with parallelism but ended up being unweildly to have the output be one frame behind constantly.
Taking some more inspiration from Legion the Transform
we know today was born.
See this discussion for more information.
Bevy tasks was also added in place of the original rayon integration for multithreading.
Change detection also comes up.
Generational indexes on entities made a comeback.
App system ergonomics got vastly improved. Oh wait, due to safety it got rolled back temporarily.
The algorithm for parallel execution gets a v2, it looks like:
1. Evaluate run criteria of system sets.
2. If any of the sets have changed, rebuild the cached scheduling data.
3. Run exclusive systems that want to be at the start of stage.
4. If world's archetypes were changed since the last time parallel systems were
ran, update their access and the relevant part of scheduling data.
5. Enter outer compute pool scope.
6. Prepare parallel systems - spawn tasks, reset counters, queue systems with
no dependencies, etc.
7. Try running a queued thread-local system. Queue its dependants if this was
the last system they were waiting on.
8. Enter inner compute pool scope.
9. Try starting some queued thread-agnostic systems.
10. See if any thread-agnostic systems have finished, queue their dependants
if those were the last systems they were waiting on.
11. Exit inner scope.
12. If there are any queued or running systems, continue from step 7.
13. Exit outer scope.
14. Merge in command buffers.
15. Run exclusive systems that want to be at the end of stage.
16. Re-evaluate run criteria, continue from step 3 if needed.
Reflection got added to help with scripting, serialization and improving the ergonomics by not constantly requiring end users to derive too many traits.
Initially, schedules had quite a few limitations:
- Only one Schedule allowed
- Very static: you are limited to using the tools we gave you (stages are lists of systems, you can add stages to schedules)
- Couldn’t switch between schedules at runtime
- Couldn’t easily support “fixed timestep” scenarios
Schedule V2 fixed these. This is where we get some of the modern bevy ideas like nested schedules, system stages, run criteria and app states.
V2
The overall goal was to improve the performance and versatility of Bevy ECS.
The implementation of World
gets an overhaul and breaks free from the original
hecs definition.
Improvements on the ECS are focused around generalizing the storage and retrieval of components to handle more use cases:
“Developers selecting an ECS framework are stuck with a hard choice. Select an “archetypal” framework with “fast iteration everywhere” but without the ability to cheaply add/remove components, or select a “sparse set” framework to cheaply add/remove components but with slower iteration performance.”
Archetypal ECS:
- Stores components in “tables” with static schemas. Each “column” stores components of a given type. Each “row” is an entity.
- Each “archetype” has its own table. Adding/removing an entity’s component changes the archetype.
- Enables super-fast Query iteration due to its cache-friendly data layout
- Comes at the cost of more expensive add/remove operations for an entity’s components, because all components need to be copied to the new archetype’s “table”
Sparse Set ECS:
- Stores components of the same type in densely packed arrays, which are sparsely indexed by densely packed unsigned integers (Entity ids)
- Query iteration is slower than Archetypal ECS because each entity’s component could be at any position in the sparse set. This “random access” pattern isn’t cache friendly. Additionally, there is an extra layer of indirection because you must first map the entity id to an index in the component array.
- Adding/removing components is a cheap, constant time operation
Bevy ECS V2 introduced a hybrid storage model where we use Table
for archetype
based storage and SparseSet
for rapidly updating components.
Archetypes become simply metadata, they store:
The ComponentId
of each of the Archetype
components (and that component’s
storage type)
- Archetypes are uniquely defined by their component layouts
- For example: entities with “table” components [A, B, C] and “sparse set” components [D, E] will always be in the same archetype.
The TableId
associated with the archetype
- For now each archetype has exactly one table (which can have no components),
- There is a 1->Many relationship from Tables->Archetypes. A given table could
have any number of archetype components stored in it:
- Ex: an entity with “table storage” components [A, B, C] and “sparse set” components [D, E] will share the same [A, B, C] table as an entity with [A, B, C] table component and [F] sparse set components.
- This 1->Many relationship is how we preserve fast “cache friendly” iteration performance when possible (more on this later)
A list of entities that are in the archetype and the row id of the table they
are in ArchetypeComponentId
- unique densely packed identifiers for (
ArchetypeId
,ComponentId
) pairs - used by the schedule executor for cheap system access control
- “Archetype Graph Edges”
Archetypes are usually quite expensive to compute. Normally querying becomes slower as the number of archetypes goes up.
To cache the results Bevy starts building a graph of them. Bundles are the preferred way to add components as they reduce the amount this cache needs to be refreshed.
To achieve this Query
and SystemParam
are now stateful, holding information
that helps cache their results and ensure proper initialization.
Components
finally got added which gives world.components()
and stores
mappings from ComponentId
to ComponentInfo
as well as TypeId
to
TypeInfo
.
The only difference between Resource
and Component
in V1 of Bevy was that a
Resource
was required to be unique. Now it gets merged into World
, with a
familiar API and the difference between them becomes even more slight:
world.insert_resource(1);
world.insert_resource(2.0);
let a = world.get_resource::<i32>().unwrap();
let mut b = world.get_resource_mut::<f64>().unwrap();
*b = 3.0;
They have their own ComponentId
(and their own resource TypeId->ComponentId
scope, so they don’t conflict wit components of the same type), and they are
stored in a special “resource archetype”.
We also got our familiar builder pattern for entities:
// spawn now takes no inputs and returns an EntityMut
let entity = world.spawn()
.insert(A) // insert a single component into the entity
.insert_bundle((B, C)) // insert a bundle of components into the entity
.id() // id returns the Entity id
Around now we get the buearacratic machinery that Bevy is famous for up and running.
System sets and run criteria get a V2 update which allows us to:
- Reuse run criteria within a stage.
- Pipe output of one run criteria as input to another.
- Assign labels, dependencies, run criteria, and ambiguity sets to many systems at the same time.
V3
- Scheduler V3 Part 1
- Scheduler V3 Part 2
- Schedules are now stored in a publicly-accessible resource in the world so you could add systems to a schedule (one that isn’t currently running) while the app is running (removal not allowed)
- Current design is that you can change the field from wherever you can access the Schedules resource, and then the next time App::update is called it’ll run the schedule associated with that label
- One Shot Systems Part 1
- One Shot Systems Part 2
- CSS Grid Layout
- Assets V2
- Runtime Query Generation Part 1
- Runtime Query Generation Part 2
- Runtime Query Generation Part 3
Future work
- Relations
- Configurable and more reliable events
- Changes to Plugin to make it more configurable
- WorldQueryData vs WorldQueryFilter split
- One-shot systems
- Entity niching
- Dynamic components
- Runtime insertion / removal of systems
- Scoped EntityMut equivalent for WorldQuery