References and lifetimes with owned data: Using owned and borrowed values in the same data structure

I hit a problem similar to the one below during my early days with Rust:

Here are the requirements (this is just one far-fetched example):

1. I want to store all the Asteroids in Space.
2. The Asteroid list is pre-defined and owned somewhere else, so we can just keep a reference to the asteroids.
3. While the program is running, new Asteroids may be spontaneously created inside Space.

Sounds simple, let’s give it a try.

struct Asteroid {
    name: &'static str,
    // More stuff related to the asteroid.
}

struct Space<'a> {
    /// Keep a reference to asteroids.
    asteroids: Vec<&'a Asteroid>,
}

impl<'a> Space<'a> {
    fn new(asteroids: Vec<&'a Asteroid>) -> Self {
        Self { asteroids }
    }
}

fn main() {
    let asteroid_ceres = Asteroid { name: "Ceres" };
    let asteroid_trojan = Asteroid { name: "Trojan" };
    let space = Space::new(vec![&asteroid_ceres, &asteroid_trojan]);
    // Do something with space
}

Fair enough, the first two requirements are implemented. Let’s implement the third. We’ll add a method create_asteroid to Space, starting with a naive (wrong) approach.

impl<'a> Space<'a> {
    // ...
    fn create_asteroid(&mut self) {
        let new_asteroid = Asteroid { name: "NEW" };
        self.asteroids.push(&new_asteroid);
    }
}

Of course this does not compile! new_asteroid will be dropped at the end of the create_asteroid function. The pointer inside self.asteroids would be a dangling pointer and rustc will scream at you!

error[E0597]: `new_asteroid` does not live long enough
  --> src/main.rs:17:30
   |
17 |         self.asteroids.push(&new_asteroid);
   |                              ^^^^^^^^^^^^ borrowed value does not live long enough
18 |     }
   |     - borrowed value only lives until here

Ok, so you see the rationale? The value of new_asteroid must be owned somewhere! A slightly less naive (but still wrong) approach would be to have the Space object itself own new_asteroid! Let’s do it.

Solution 0: Own the `Asteroid` somewhere in `Space`

struct Space<'a> {
    asteroids: Vec<&'a Asteroid>,
    /// NEW! Own the new asteroids.
    new_asteroids: Vec<Asteroid>,
}

impl<'a> Space<'a> {
    fn new(asteroids: Vec<&'a Asteroid>) -> Self {
        Self { asteroids, new_asteroids: vec![] }
    }

    fn create_asteroid(&mut self) {
        let new_asteroid = Asteroid { name: "NEW" };
        self.new_asteroids.push(new_asteroid);
    }
}

This will compile. However, some people might say “I want all the asteroids to be included in the same object!” (for whatever reason, e.g. better integration with lots of legacy code, ouch).
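When a combined view is all that is needed, one option is to keep the two vectors and build the list of references on demand. The following is only a minimal sketch under that assumption; `all_asteroids` is a hypothetical helper name, not part of the original design.

```rust
struct Asteroid {
    name: &'static str,
}

struct Space<'a> {
    asteroids: Vec<&'a Asteroid>,
    new_asteroids: Vec<Asteroid>,
}

impl<'a> Space<'a> {
    fn new(asteroids: Vec<&'a Asteroid>) -> Self {
        Self { asteroids, new_asteroids: vec![] }
    }

    fn create_asteroid(&mut self) {
        self.new_asteroids.push(Asteroid { name: "NEW" });
    }

    /// Hypothetical helper (an assumption, not in the original design):
    /// collects references to every asteroid, borrowed ones first,
    /// then the ones owned by `Space`.
    fn all_asteroids(&self) -> Vec<&Asteroid> {
        let mut all = Vec::with_capacity(self.asteroids.len() + self.new_asteroids.len());
        for asteroid in &self.asteroids {
            all.push(*asteroid);
        }
        for asteroid in &self.new_asteroids {
            all.push(asteroid);
        }
        all
    }
}

fn main() {
    let ceres = Asteroid { name: "Ceres" };
    let trojan = Asteroid { name: "Trojan" };
    let mut space = Space::new(vec![&ceres, &trojan]);
    space.create_asteroid();

    let names: Vec<&str> = space.all_asteroids().iter().map(|a| a.name).collect();
    assert_eq!(names, ["Ceres", "Trojan", "NEW"]);
}
```

The returned references only live as long as the borrow of `space`, so the list has to be rebuilt after each change, and Space still holds two separate vectors internally.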

So, can we also store a reference to the newly owned asteroid in asteroids, as below?

impl<'a> Space<'a> {
    // ...
    fn create_asteroid(&mut self) {
        let new_asteroid = Asteroid { name: "NEW" };
        self.new_asteroids.push(new_asteroid);
        self.asteroids.push(self.new_asteroids.last().unwrap());
    }
}

Let’s see. NO.

error[E0495]: cannot infer an appropriate lifetime for lifetime parameter in function call due to conflicting requirements
  --> src/main.rs:18:48
   |
18 |         self.asteroids.push(self.new_asteroids.last().unwrap());
   |                                                ^^^^
   |
note: first, the lifetime cannot outlive the anonymous lifetime #1 defined on the method body at 15:5...
  --> src/main.rs:15:5
   |
15 | /     fn create_asteroid(&mut self) {
16 | |         let new_asteroid = Asteroid { name: "NEW" };
17 | |         self.new_asteroids.push(new_asteroid);
18 | |         self.asteroids.push(self.new_asteroids.last().unwrap());
19 | |     }
   | |_____^
note: ...so that reference does not outlive borrowed content
  --> src/main.rs:18:29
   |
18 |         self.asteroids.push(self.new_asteroids.last().unwrap());
   |                             ^^^^^^^^^^^^^^^^^^
note: but, the lifetime must be valid for the lifetime 'a as defined on the impl at 10:1...
  --> src/main.rs:10:1
   |
10 | impl<'a> Space<'a> {
   | ^^^^^^^^^^^^^^^^^^
note: ...so that reference does not outlive borrowed content
  --> src/main.rs:18:29
   |
18 |         self.asteroids.push(self.new_asteroids.last().unwrap());
   |                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Well, you see, the references inside self.asteroids all have the lifetime 'a, right? But the borrow of self inside create_asteroid (and therefore of its owned values in self.new_asteroids) only lasts for the duration of the call, which is not the same lifetime! You could force the two lifetimes to be identical by defining create_asteroid as follows:

impl<'a> Space<'a> {
    // ...
    fn create_asteroid(&'a mut self) {
        let new_asteroid = Asteroid { name: "NEW" };
        self.new_asteroids.push(new_asteroid);
        self.asteroids.push(self.new_asteroids.last().unwrap());
    }
}

This will compile. However, calling create_asteroid twice will not:

fn main() {
    let asteroid_ceres = Asteroid { name: "Ceres" };
    let asteroid_trojan = Asteroid { name: "Trojan" };
    let mut space = Space::new(vec![&asteroid_ceres, &asteroid_trojan]);
    space.create_asteroid();
    space.create_asteroid();
}

See for yourself:

error[E0499]: cannot borrow `space` as mutable more than once at a time
  --> src/main.rs:27:5
   |
26 |     space.create_asteroid();
   |     ----- first mutable borrow occurs here
27 |     space.create_asteroid();
   |     ^^^^^ second mutable borrow occurs here
28 | }
   | - first borrow ends here

Why? create_asteroid borrows space mutably for its whole lifetime ('a)! So the borrow only ends when space is dropped. Just as rustc tells you.

This restriction actually protects you from doing a lot of ugly stuff, for example removing an Asteroid from the new_asteroids vector. If you were able to do that, asteroids would then contain dangling references.

Solution 1: Use `unsafe`

Screw lifetimes. Let me do what I want to do.

Using unsafe will appear to solve the issue:

impl<'a> Space<'a> {
    fn create_asteroid(&mut self) {
        let new_asteroid = Asteroid { name: "NEW" };
        self.new_asteroids.push(new_asteroid);
        let new_asteroid = self.new_asteroids.last().unwrap() as *const Asteroid;
        self.asteroids.push(unsafe { &*new_asteroid });
    }
}

Here we are basically coercing the lifetime of a reference to an object inside new_asteroids to 'a, by going through a raw pointer.

YOU MUST MAKE SURE THAT THE INTERFACE PROVIDED WILL BE SAFE!

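If another developer were to destroy any instance of Asteroid inside self.new_asteroids (with something like Vec::clear), this would result in undefined behavior the next time the vector self.asteroids is used. The same hazard exists whenever self.new_asteroids grows: if a push exceeds the vector's capacity, the vector reallocates and moves its elements, so references stored earlier in self.asteroids are left dangling.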

Solution 2: Use `std::borrow::Cow`

Now, we could use Cow, so that asteroids contains both borrowed and owned values. Cow encapsulates a piece of data which may or may not be owned. This is convenient when you do not know statically whether a value will always be owned or always be borrowed.

use std::borrow::Cow;

/// Cow requires the encapsulated value to implement Clone
#[derive(Clone, Copy)]
struct Asteroid {
    name: &'static str,
    // More stuff related to the asteroid.
}

struct Space<'a> {
    asteroids: Vec<Cow<'a, Asteroid>>,
}

impl<'a> Space<'a> {
    /// Create Space from a slice of Asteroids
    fn new(asteroids: &'a [Asteroid]) -> Self {
        Self { asteroids: asteroids.iter().map(Cow::Borrowed).collect() }
    }

    /// Create a new owned asteroid
    fn create_asteroid(&mut self) {
        let new_asteroid = Asteroid { name: "NEW" };
        self.asteroids.push(Cow::Owned(new_asteroid));
    }
}

fn main() {
    let asteroid_ceres = Asteroid { name: "Ceres" };
    let asteroid_trojan = Asteroid { name: "Trojan" };
    let asteroids = vec![asteroid_ceres, asteroid_trojan];
    let mut space = Space::new(&asteroids);
    space.create_asteroid();
}

Now we have all our Asteroids inside the same data structure thanks to Cow!

However, Cow comes at a price: it requires the encapsulated type to be cloneable, that is, to implement Clone (ToOwned, to be exact). Do we need its clone-on-write feature? Maybe, maybe not.
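To make the clone-on-write behaviour concrete, here is a small sketch. The `rename` function is a hypothetical helper introduced only for illustration; `Cow::to_mut` is the standard library method that clones a borrowed value the first time mutable access is requested.

```rust
use std::borrow::Cow;

#[derive(Clone)]
struct Asteroid {
    name: &'static str,
}

/// Hypothetical helper: renames an asteroid held in a Cow.
/// `to_mut` clones the borrowed value on first write and turns
/// the Cow into `Cow::Owned`; the original is left untouched.
fn rename(asteroid: &mut Cow<'_, Asteroid>, name: &'static str) {
    asteroid.to_mut().name = name;
}

fn main() {
    let ceres = Asteroid { name: "Ceres" };
    let mut cow = Cow::Borrowed(&ceres);
    assert!(matches!(cow, Cow::Borrowed(_)));

    rename(&mut cow, "Ceres II");

    // The Cow now owns a modified clone...
    assert!(matches!(cow, Cow::Owned(_)));
    assert_eq!(cow.name, "Ceres II");
    // ...while the value it originally borrowed is unchanged.
    assert_eq!(ceres.name, "Ceres");
}
```

If nothing in the program ever writes through a borrowed entry like this, the Clone requirement buys us nothing.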

If we do not need to modify a borrowed value, we could just as well use another kind of smart pointer with fewer constraints. Let’s call it Bow (Borrow-Or-oWned).

Solution 3: `Bow` (Borrow Or oWned), your own smart pointer

The type Bow can enclose borrowed or owned data. It always provides immutable access, and gives mutable access only when the data is owned.

It implements the Borrow and Deref traits for ease of use.

use std::borrow::Borrow;
use std::ops::Deref;

pub enum Bow<'a, T: 'a> {
    Owned(T),
    Borrowed(&'a T)
}

impl<'a, T: 'a> Borrow<T> for Bow<'a, T> {
    fn borrow(&self) -> &T {
        match self {
            &Bow::Owned(ref t) => t,
            &Bow::Borrowed(t) => t,
        }
    }
}

impl<'a, T: 'a> Deref for Bow<'a, T> {
    type Target = T;
    fn deref(&self) -> &T {
        self.borrow()
    }
}

/// Helper functions to get mutable access to the encapsulated data
impl<'a, T: 'a> Bow<'a, T> {
    pub fn borrow_mut(&mut self) -> Option<&mut T> {
        match self {
            &mut Bow::Owned(ref mut t) => Some(t),
            &mut Bow::Borrowed(_) => None,
        }
    }

    pub fn extract(self) -> Option<T> {
        match self {
            Bow::Owned(t) => Some(t),
            Bow::Borrowed(_) => None,
        }
    }
}

Our code does not change significantly. We just need to replace Cow with Bow, and everything compiles and works!

// ...
struct Space<'a> {
    asteroids: Vec<Bow<'a, Asteroid>>,
}

impl<'a> Space<'a> {
    fn new(asteroids: &'a [Asteroid]) -> Self {
        Self { asteroids: asteroids.iter().map(Bow::Borrowed).collect() }
    }

    fn create_asteroid(&mut self) {
        let new_asteroid = Asteroid { name: "NEW" };
        self.asteroids.push(Bow::Owned(new_asteroid));
    }
}
// ...
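To see the pieces working together, here is a self-contained sketch. It repeats a trimmed-down Bow (only Deref and borrow_mut), uses an Asteroid that deliberately does not implement Clone, and adds a hypothetical `rename` method on Space that is not part of the original code.

```rust
use std::ops::Deref;

/// Trimmed-down version of the Bow type, just enough for this example.
enum Bow<'a, T> {
    Owned(T),
    Borrowed(&'a T),
}

impl<'a, T> Deref for Bow<'a, T> {
    type Target = T;
    fn deref(&self) -> &T {
        match *self {
            Bow::Owned(ref t) => t,
            Bow::Borrowed(t) => t,
        }
    }
}

impl<'a, T> Bow<'a, T> {
    /// Mutable access is only available for owned data.
    fn borrow_mut(&mut self) -> Option<&mut T> {
        match *self {
            Bow::Owned(ref mut t) => Some(t),
            Bow::Borrowed(_) => None,
        }
    }
}

/// Note: no Clone needed, unlike with Cow.
struct Asteroid {
    name: &'static str,
}

struct Space<'a> {
    asteroids: Vec<Bow<'a, Asteroid>>,
}

impl<'a> Space<'a> {
    fn new(asteroids: &'a [Asteroid]) -> Self {
        Self { asteroids: asteroids.iter().map(Bow::Borrowed).collect() }
    }

    fn create_asteroid(&mut self) {
        self.asteroids.push(Bow::Owned(Asteroid { name: "NEW" }));
    }

    /// Hypothetical helper: renames the asteroid at `index` if Space owns it.
    /// Returns false for borrowed asteroids and out-of-range indices.
    fn rename(&mut self, index: usize, name: &'static str) -> bool {
        match self.asteroids.get_mut(index).and_then(|bow| bow.borrow_mut()) {
            Some(asteroid) => {
                asteroid.name = name;
                true
            }
            None => false,
        }
    }
}

fn main() {
    let known = [Asteroid { name: "Ceres" }, Asteroid { name: "Trojan" }];
    let mut space = Space::new(&known);
    space.create_asteroid();

    assert!(!space.rename(0, "Pallas")); // borrowed: read-only
    assert!(space.rename(2, "Vesta")); // owned: can be modified

    // Deref gives uniform read access to both kinds of entries.
    let names: Vec<&str> = space.asteroids.iter().map(|a| a.name).collect();
    assert_eq!(names, ["Ceres", "Trojan", "Vesta"]);
}
```

Borrowed entries stay read-only, so the pre-defined asteroid list can never be modified through Space.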

As always, Rust forces us to make good design decisions about our data structures!

NB: This post inspired a crate: boow.
