Bevy Cameras | Tainted Coders

Each Camera is responsible for 3 main things:

The render target, which is the surface (usually a window) the camera draws to
The projection which determines how to transform 3D into 2D (our screen)
The position of the view in our scene to capture and transform

Usually a camera's render target is a Window. That is, each Camera is given a region of the Window that it is entitled to draw on called a Viewport.

Each frame, Bevy will start by drawing the ClearColor over the camera's viewport and then draw everything from scratch on top of it. This color ends up being the background wherever nothing else is rendered.

The coordinate system in Bevy is right handed so:

X increases going to the right
Y increases going up
Z increases coming out of the screen, towards the viewer
The default center of the screen is (0, 0)

Bevy uses a RenderGraph to determine when to run each part of its rendering pipeline to draw everything in the correct order of dependency. Each camera you spawn is given a unique name and added to this graph as a node.

Creating a camera

When we spawn a camera we use Camera2d or Camera3d depending on our game.

A useful starting point is to define a MainCamera with a Camera2d as a required component. It can be useful to separate your primary camera from other cameras you might spawn in your scene.

use bevy::prelude::*;

// Useful for marking the "main" camera if we have many
#[derive(Component)]
#[require(Camera2d)]
pub struct MainCamera;

fn initialize_camera(mut commands: Commands) {
  commands.spawn(MainCamera);
}

Camera behavior is usually quite generic and separate from the rest of the game logic so I prefer creating a camera plugin and adding it to the app:

pub struct CameraPlugin;

impl Plugin for CameraPlugin {
  fn build(&self, app: &mut App) {
    app.add_systems(Startup, initialize_camera);
  }
}

fn main() {
  App::new()
    .add_plugins(DefaultPlugins)
    .add_plugins(CameraPlugin)
    .run();
}

Viewports

It can be confusing to figure out what it means to have multiple cameras in a scene.

Bevy's cameras are really proxies for a Viewport. The Viewport is what defines the area of our Window to draw on.

Each camera can only have a single RenderTarget. The target can be one of 3 things:

Window
Image
TextureView (useful for OpenXR)

So even though we might only have a single rendering target (the Window), we can still have multiple viewports on that window. Each one can take up a region of the screen and be assigned its own separate camera.

This is how you can accomplish something like split screen or a minimap.
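As a rough sketch of the split screen case, the arithmetic for carving one window into two side-by-side regions could look like this. It is plain Rust rather than Bevy API: ViewportRect and split_horizontally are hypothetical names used only for illustration, standing in for the position and size you would give each camera's viewport.

```rust
/// Hypothetical stand-in for a viewport: a rectangle in physical pixels,
/// measured from the top-left corner of the window.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct ViewportRect {
    pub x: u32,
    pub y: u32,
    pub width: u32,
    pub height: u32,
}

/// Splits a window into a left and a right viewport.
/// If the width is odd, the right viewport gets the extra pixel.
pub fn split_horizontally(window_width: u32, window_height: u32) -> (ViewportRect, ViewportRect) {
    let left_width = window_width / 2;
    let left = ViewportRect { x: 0, y: 0, width: left_width, height: window_height };
    let right = ViewportRect {
        x: left_width,
        y: 0,
        width: window_width - left_width,
        height: window_height,
    };
    (left, right)
}
```

Each rectangle would then be handed to its own camera, so both cameras share the Window as a render target but draw to different regions of it.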

Camera projection

Bevy's 2D camera uses an orthographic projection with a symmetric frustum.

Don't panic! Let's break that all down.

Projection refers to the process of transforming a 3D scene into a 2D representation on a screen or viewport.

Since computer screens are 2D, we need to convert the 3D world of objects and their positions into a flat image that can be displayed.

An orthographic projection is a type of projection that preserves the relative sizes of objects regardless of their distance from us.

In other words: if two objects are at different distances from the viewer in the 3D world, their projected sizes on the 2D screen will accurately reflect their sizes in the 3D world.
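To make the difference concrete, here is a small sketch comparing how tall an object appears under each kind of projection. The function names and parameters are assumptions made for this example and are not part of Bevy.

```rust
/// On-screen height of an object under an orthographic projection.
/// `scale` is world units per screen unit; the distance is deliberately unused.
pub fn orthographic_height(object_height: f32, _distance: f32, scale: f32) -> f32 {
    object_height / scale
}

/// On-screen height of an object under a perspective projection, expressed
/// as a fraction of the viewport height. `fov_y` is the vertical field of
/// view in radians.
pub fn perspective_height(object_height: f32, distance: f32, fov_y: f32) -> f32 {
    // Height of the slice of the world that is visible at this distance.
    let visible_height = 2.0 * distance * (fov_y / 2.0).tan();
    object_height / visible_height
}
```

With the orthographic version an object is the same size on screen whether it is 1 or 100 units away, while the perspective version shrinks it as the distance grows.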

A frustum, on the other hand, refers to a truncated pyramid shape that represents the viewing volume or the field of view in computer graphics. It is like a pyramid with the top cut off, leaving a flat face where the point used to be.

A square frustum

In a symmetric frustum, the shape of the truncated pyramid is balanced or symmetrical, which means that the left and right sides, as well as the top and bottom sides, are equal in size and shape.

Directing the camera

To move your camera around the scene we just need to change its translation:

fn move_camera(
  mut camera: Query<&mut Transform, (With<Camera2d>, Without<Player>)>,
  player: Query<&Transform, (With<Player>, Without<Camera2d>)>,
  time: Res<Time>,
) {
  let Ok(mut camera) = camera.single_mut() else {
    return;
  };

  let Ok(player) = player.single() else {
    return;
  };

  let Vec3 { x, y, .. } = player.translation;
  let direction = Vec3::new(x, y, camera.translation.z);

  camera.translation =
    camera.translation.lerp(direction, time.delta_secs() * 2.);
}

We could have snapped the camera to our player's position every frame, but here we are using linear interpolation to smooth the movement, which is common for top down games.
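The interpolation itself is only a line of arithmetic. The sketch below shows it for a single coordinate in plain Rust; follow is a hypothetical helper, and the clamp on t is an extra safeguard that the system above does not have.

```rust
/// Linear interpolation between `from` and `to`.
/// `t = 0.0` returns `from`, `t = 1.0` returns `to`.
pub fn lerp(from: f32, to: f32, t: f32) -> f32 {
    from + (to - from) * t
}

/// Moves a camera coordinate part of the way towards the target, using the
/// same `delta_secs * 2.0` factor as the system above.
pub fn follow(camera: f32, target: f32, delta_secs: f32) -> f32 {
    // Clamp so a very long frame cannot overshoot the target.
    let t = (delta_secs * 2.0).clamp(0.0, 1.0);
    lerp(camera, target, t)
}
```

Because each call closes a fixed fraction of the remaining gap, the camera moves quickly when it is far from the player and eases in as it gets close.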

If we want to rotate the camera instead we would manipulate the transform's pitch and yaw:

use bevy::input::mouse::MouseMotion;

fn rotate_camera(
  time: Res<Time>,
  mut mouse_motion: EventReader<MouseMotion>,
  mut cameras: Query<&mut Transform, With<Camera>>,
) {
  for mut transform in &mut cameras {
    let dt = time.delta_secs();
    // The factors are just arbitrary mouse sensitivity values.
    // It's often nicer to have a faster horizontal sensitivity than vertical.
    let mouse_sensitivity = Vec2::new(0.12, 0.10);

    for motion in mouse_motion.read() {
      let delta_yaw = -motion.delta.x * dt * mouse_sensitivity.x;
      let delta_pitch = -motion.delta.y * dt * mouse_sensitivity.y;

      // Add yaw (global)
      transform.rotate_y(delta_yaw);

      // Add pitch (local)
      const PITCH_LIMIT: f32 = std::f32::consts::FRAC_PI_2 - 0.01;
      let (yaw, pitch, roll) = transform.rotation.to_euler(EulerRot::YXZ);
      let pitch = (pitch + delta_pitch).clamp(-PITCH_LIMIT, PITCH_LIMIT);
      transform.rotation = Quat::from_euler(EulerRot::YXZ, yaw, pitch, roll);
    }
  }
}

A Quat is a quaternion (don't panic!), a 4D mathematical object (don't panic!) that represents a rotation in 3D space. It avoids something called gimbal lock, which can happen when using Euler angles (the intuitive ones), and allows for computationally efficient smooth interpolations.

A quaternion lets you tell a Transform to rotate around a specific axis by some angle.
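To take some of the mystery out of it, here is a minimal axis-angle quaternion written from scratch. It is for illustration only: the Quaternion type below is hypothetical, and in a real project Bevy's Quat does this work for you.

```rust
/// Minimal quaternion, for illustration only.
#[derive(Debug, Clone, Copy)]
pub struct Quaternion {
    pub x: f32,
    pub y: f32,
    pub z: f32,
    pub w: f32,
}

impl Quaternion {
    /// Builds a rotation of `angle` radians around `axis`.
    /// `axis` is assumed to be normalized (length 1).
    pub fn from_axis_angle(axis: [f32; 3], angle: f32) -> Self {
        let (sin, cos) = (angle / 2.0).sin_cos();
        Self { x: axis[0] * sin, y: axis[1] * sin, z: axis[2] * sin, w: cos }
    }

    /// Rotates the vector `v` by this quaternion.
    pub fn rotate(self, v: [f32; 3]) -> [f32; 3] {
        let q = [self.x, self.y, self.z];
        let u = cross(q, v); // q x v
        let uu = cross(q, u); // q x (q x v)
        [
            v[0] + 2.0 * (self.w * u[0] + uu[0]),
            v[1] + 2.0 * (self.w * u[1] + uu[1]),
            v[2] + 2.0 * (self.w * u[2] + uu[2]),
        ]
    }
}

/// Cross product of two 3D vectors.
fn cross(a: [f32; 3], b: [f32; 3]) -> [f32; 3] {
    [
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    ]
}
```

Rotating the +X direction a quarter turn around the +Y axis lands on -Z, which matches the right handed coordinate system described earlier.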

The EulerRot::YXZ specifies the order in which we want the rotations to be applied. We said:

First apply the yaw around a global y value
Then apply the pitch around a local x value
Finally roll around the z axis

Focusing in on just this chunk of code:

let (yaw, pitch, roll) = transform.rotation.to_euler(EulerRot::YXZ);
let pitch = (pitch + delta_pitch).clamp(-PITCH_LIMIT, PITCH_LIMIT);
transform.rotation = Quat::from_euler(EulerRot::YXZ, yaw, pitch, roll);

We decomposed the quaternion into its Euler components.
Then we adjusted the pitch based on the mouse Y motion (clamped so we can't flip upside down).
And finally we rebuild the rotation quaternion with the new pitch, existing yaw and the roll.

rotate_y is applying the global rotation directly. That's fine for yaw because rotating around the Y axis is not sensitive to the current pitch.

For the pitch we want it relative to the camera's current orientation (around the x axis). That's why we need the special handling with the quaternion.

To zoom in with our camera we manipulate the Projection component which holds the information about how to relate the distances in 3D to the 2D viewport.

fn zoom_control_system(
  input: Res<ButtonInput<KeyCode>>,
  mut camera_query: Query<&mut Projection, With<MainCamera>>,
) {
  let mut projection = match camera_query.single_mut() {
    Ok(p) => p,
    Err(_) => return,
  };

  // Only a perspective projection has a field of view to adjust
  let Projection::Perspective(perspective) = projection.as_mut() else {
    return;
  };

  if input.pressed(KeyCode::Minus) {
    perspective.fov += 0.1;
  }
  if input.pressed(KeyCode::Equal) {
    perspective.fov -= 0.1;
  }

  perspective.fov = perspective.fov.clamp(0.1, std::f32::consts::FRAC_PI_2);
}
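The system above adjusts the field of view, which only exists on a perspective projection. With an orthographic projection (the 2D case) the usual approach is to change the projection's scale instead. The sketch below shows just that arithmetic in plain Rust; MIN_SCALE, MAX_SCALE, zoom_scale and visible_size are hypothetical names chosen for this example.

```rust
/// Hypothetical limits for how far the camera may zoom.
pub const MIN_SCALE: f32 = 0.25;
pub const MAX_SCALE: f32 = 4.0;

/// Returns the next orthographic scale after one zoom step.
/// A smaller scale shows less of the world (zoomed in).
pub fn zoom_scale(current: f32, zoom_in: bool) -> f32 {
    // Multiplying keeps each step feeling the same size at any zoom level.
    let factor = if zoom_in { 0.9 } else { 1.1 };
    (current * factor).clamp(MIN_SCALE, MAX_SCALE)
}

/// World-space size visible through a viewport at the given scale,
/// assuming one world unit per pixel at scale 1.0.
pub fn visible_size(viewport: (f32, f32), scale: f32) -> (f32, f32) {
    (viewport.0 * scale, viewport.1 * scale)
}
```

In a system like zoom_control_system, the resulting value would be written back to the orthographic projection's scale rather than to fov.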

Render Layers

When we want a camera to only render certain entities we can use the RenderLayers component.

By default all entities are rendered on layer 0 and there are 32 TOTAL_LAYERS to choose from.

Attaching it to our camera sets which entities it should render.

Attaching it to our other entities sets which camera should do the rendering.

use bevy::render::view::RenderLayers;

// Named constants make it clearer which layer is which
const BACKGROUND: RenderLayers = RenderLayers::layer(1);
const FOREGROUND: RenderLayers = RenderLayers::layer(2);

fn initialize_cameras(mut commands: Commands) {
  commands.spawn((FOREGROUND, MainCamera));

  commands.spawn((Camera2d, BACKGROUND));
}

#[derive(Component)]
struct Player;

fn spawn_player(
  mut commands: Commands
) {
  commands.spawn((Player, FOREGROUND));
}
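One way to picture the matching rule is as a bitmask: a camera draws an entity when the two sets of layers overlap. The sketch below models that idea in plain Rust. LayerMask is a hypothetical type for illustration and says nothing about how Bevy stores its layers internally.

```rust
/// Hypothetical bitmask model of render layers: bit `n` set means "on layer n".
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct LayerMask(pub u32);

impl LayerMask {
    /// A mask containing only layer `n` (valid for n in 0..32).
    pub const fn layer(n: u32) -> Self {
        LayerMask(1 << n)
    }

    /// Adds another layer to this mask.
    pub const fn with(self, n: u32) -> Self {
        LayerMask(self.0 | (1 << n))
    }

    /// A camera renders an entity when the two masks share at least one layer.
    pub const fn intersects(self, other: LayerMask) -> bool {
        self.0 & other.0 != 0
    }
}
```

Under this model the foreground camera on layer 2 sees the player on layer 2, the background camera on layer 1 does not, and a camera on the default layer 0 sees neither.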

Rendering Order

Multiple cameras will all render to the same window. When we want to control the ordering of this rendering we can set each camera's order.

Cameras with a higher order are rendered later, and thus on top of lower order cameras.

We can imagine it a bit like an oil painter. The first layer you apply to the canvas is the background, and the subsequent layers are painted on over top.

use bevy::render::camera::ClearColorConfig;

fn render_order(mut commands: Commands) {
  // This camera defaults to order 0 and is rendered "first" / "at the back"
  commands.spawn(Camera3d::default());

  // This camera renders "after" / "at the front"
  commands.spawn((
    Camera3d::default(),
    Camera {
      // renders after / on top of the main camera
      order: 1,
      // don't clear the color while rendering this camera
      clear_color: ClearColorConfig::None,
      ..default()
    },
  ));
}
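The painter analogy boils down to sorting by order before drawing. Here is a small sketch of that idea in plain Rust, where CameraInfo and draw_sequence are hypothetical names used only for this example.

```rust
/// Hypothetical stand-in for a camera: a label plus its `order` value.
#[derive(Debug, Clone, PartialEq)]
pub struct CameraInfo {
    pub label: &'static str,
    pub order: isize,
}

/// Returns the labels in the sequence the cameras would draw:
/// lowest order first (at the back), highest order last (on top).
pub fn draw_sequence(mut cameras: Vec<CameraInfo>) -> Vec<&'static str> {
    // `sort_by_key` is stable, so cameras with equal order keep their input order.
    cameras.sort_by_key(|camera| camera.order);
    cameras.into_iter().map(|camera| camera.label).collect()
}
```

Whatever is drawn last ends up on top, which is why the second camera above turns off clearing: otherwise it would paint over everything the first camera drew.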

Mouse coordinates

When you place your mouse on the screen it has two positions:

On-screen coordinates (the position of the pixel on a screen)
World coordinates (the position of the mouse projected onto our game)

So when we read our Window::cursor_position we are only getting the on-screen coordinates. We would have to further convert them by projecting them according to our camera:

fn mouse_coordinates(
  window_query: Query<&Window>,
  camera_query: Query<(&Camera, &GlobalTransform), With<MainCamera>>,
) {
  let world_position = window_query
    .single()
    .ok()
    .and_then(|window| window.cursor_position())
    .and_then(|cursor_pos| {
      // Now that we have the cursor position, we can convert it to
      // world coordinates
      camera_query.single().ok().and_then(|(camera, transform)| {
        camera
          .viewport_to_world(transform, cursor_pos)
          .ok()
          .map(|ray| ray.origin.truncate())
      })
    });

  if let Some(pos) = world_position {
    info!("World coords: {}/{}", pos.x, pos.y);
  }
}
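For an unrotated 2D camera the conversion that viewport_to_world performs can be approximated by hand, which helps show what is going on. This is a sketch under a few assumptions: the cursor position is measured in pixels from the top-left corner with Y pointing down, the camera is orthographic, and cursor_to_world is a hypothetical helper rather than a Bevy function.

```rust
/// Converts a cursor position (origin top-left, Y down, in pixels) into 2D
/// world coordinates (Y up) for an unrotated orthographic camera.
/// `scale` is world units per pixel.
pub fn cursor_to_world(
    cursor: (f32, f32),
    window_size: (f32, f32),
    camera_position: (f32, f32),
    scale: f32,
) -> (f32, f32) {
    // Offset from the window center, flipping Y so that up is positive.
    let dx = cursor.0 - window_size.0 / 2.0;
    let dy = window_size.1 / 2.0 - cursor.1;
    (camera_position.0 + dx * scale, camera_position.1 + dy * scale)
}
```

The center of the window maps to the camera's own position, and the top-left pixel maps to negative X and positive Y relative to it. A rotated camera or a perspective projection needs the full matrix math, which is what the camera method handles for us.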