Object lifetime and ownership

Before learning Rust, I never thought much about object lifetime and ownership. It turns out they have a lot to do with memory safety and thread safety. Nowadays I think about lifetime and ownership all the time, even when writing programs in C++. Here is a summary of my thoughts, inspired by Rust, but applicable to any programming language.

Each object has a lifetime. An object is alive when we can access it. In this article, "object" is a generic term for a "thing" that represents a piece of memory or another resource. A string, an integer or a struct in C are all "objects".

Scope

The concept of a scope is well known. In C-style languages, a scope is usually a code block. When we start executing a code block between a pair of { and }, a scope is created. When we are done with the code block, the scope is destroyed. If an object is owned by a scope, its lifetime is bounded by the scope. The object goes out of life when the scope ends. Here is a minimal example.

{ // A scope is created.
  int x = 0; // Life of x starts here.
  ...
  x += 2;
} // Life of x ends here, when the scope ends.
// Cannot use x anymore.

Two blocks can own different sets of objects. They should not access or modify variables owned by each other. This is useful when we want isolation between code blocks. Functions, closures, loop bodies, branches all create scopes. Loop bodies are special, since they are essentially many scopes that look just like each other.
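To make the "many scopes" view of a loop body concrete, here is a minimal C++ sketch. Tracer and trace_loop are hypothetical names introduced only for this illustration. Tracer records a positive id when it is created and the negated id when it is destroyed.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: logs its own creation and destruction.
struct Tracer {
  std::vector<int> &events;
  int id;
  Tracer(std::vector<int> &events, int id) : events(events), id(id) {
    events.push_back(id);  // Life of this Tracer starts.
  }
  ~Tracer() {
    events.push_back(-id);  // Life of this Tracer ends.
  }
};

// Runs a loop whose body owns one Tracer per iteration and returns the log.
std::vector<int> trace_loop() {
  std::vector<int> events;
  for (int i = 1; i <= 3; i++) {  // Each iteration is its own scope.
    Tracer t(events, i);          // Owned by this iteration only.
  }                               // t dies before the next iteration starts.
  return events;
}
```

Calling trace_loop() yields 1, -1, 2, -2, 3, -3: every iteration creates and destroys its own object, so no iteration ever sees the Tracer of another one.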

Object hierarchy

Object hierarchy is also a common concept. One object can own another object. When the outer object dies, the inner object dies with it. We say the lifetime of the inner object is bounded by the outer object.

struct ShortString {
  char content[100];
  size_t len;
};

In a string, the bytes in memory are usually owned by the string itself. When the string dies, we no longer need the bytes. Thus we usually choose to destroy those bytes when the string goes out of life.
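The same rule can be observed in C++ with a small sketch. Outer, Inner and trace_hierarchy are hypothetical names made up for this illustration. The inner object is a plain member of the outer one, so it has no way to outlive its owner.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical inner object: logs when it is destroyed.
struct Inner {
  std::vector<std::string> &events;
  explicit Inner(std::vector<std::string> &events) : events(events) {}
  ~Inner() { events.push_back("inner destroyed"); }
};

// Hypothetical outer object: owns an Inner as a member.
struct Outer {
  std::vector<std::string> &events;
  Inner inner;  // Lifetime of inner is bounded by the Outer that holds it.
  explicit Outer(std::vector<std::string> &events)
      : events(events), inner(events) {}
  ~Outer() { events.push_back("outer destructor runs"); }
};

// Creates and destroys one Outer, then returns what was logged.
std::vector<std::string> trace_hierarchy() {
  std::vector<std::string> events;
  {
    Outer outer(events);
  }  // outer dies here and takes inner with it.
  return events;
}
```

The log contains "outer destructor runs" followed by "inner destroyed". Nobody had to destroy inner explicitly; it went away because its owner did.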

Both scopes and objects allow nesting. Scopes and objects can own other scopes and objects. Together they form a tree structure of ownership.

Ownership tree. Execution order is from top to bottom. Squares are objects.

Ownership is useful

Let's have a look at a classic example of unsafe memory access.

ShortString *create_empty_short_string() {
  ShortString str;
  return &str;
}

ShortString *ptr = create_empty_short_string();  // <-- a scope is created and destroyed.
// str has gone out of life
// ptr points to something that does not exist.
ptr->len = 10;  // boom!

Using the concept of ownership, we can see that str is owned by the function scope. When that scope is gone, so is str. That is why we cannot safely access *ptr. ptr is known as a "dangling pointer". This is a confusing problem for C/C++ beginners, but it is easy to explain once we introduce the concept of "ownership". Ownership helps us understand memory safety.

Here is another observation. In the scope/object tree, a child scope can safely access any object that belongs to its parent. The child scope lives shorter, while the object lives slightly longer. That is why the code inside an if-else block can always use variables in the enclosing block.

Passing ownership

To exchange information between scopes, ownership of objects can be passed around. For example, when a parameter is passed to a function, ownership of the parameter can be passed along with it. When a value is returned from a function, the ownership is often transferred from the function scope to the calling scope.

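struct RandomParams {
  int size;  // Assumed definition: the example only uses params.size.
};

ShortString random_short_string(
  RandomParams params // Ownership of params is passed to the function.
) {
  ShortString str;
  for (int i = 0; i < params.size; i++) {
    str.content[i] = 'a' + i;
  }
  str.len = params.size;
  return str;
}

// {.size = 10} is a C++20 designated initializer.
ShortString short_string = random_short_string({.size = 10});
// Ownership of the string is passed to the calling scope.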
// Ownership of the string is passed to the calling scope.

Some C++ experts might start to scream "but that value is copied!" and bring up copy elision, move semantics and so on. Please stop thinking about implementation details and focus on the intention. The intent of random_short_string() is clearly to hand over str to the caller. No copy has to be made, because random_short_string() does not want to keep the original for itself. Clear ownership helps avoid copying.
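For readers who do want to see it, the "no copy" claim can be checked with a small sketch. It uses std::vector instead of ShortString, because a vector keeps its elements in a heap buffer whose address we can compare. append_one and buffer_is_reused are hypothetical names introduced for this illustration, and the result is what mainstream standard library implementations do, not something this article's author measured.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical function: takes ownership of `values`, adds one element,
// and hands ownership back to the caller through the return value.
std::vector<int> append_one(std::vector<int> values) {
  values.push_back(1);
  return values;  // Moved out to the caller, not copied.
}

// Returns true if the caller ends up with the very same heap buffer
// it handed over, meaning the elements were never copied.
bool buffer_is_reused() {
  std::vector<int> numbers;
  numbers.reserve(16);  // Enough room so push_back does not reallocate.
  numbers.push_back(7);
  const int *before = numbers.data();

  // std::move says "I give this away"; numbers should not be relied on afterwards.
  std::vector<int> result = append_one(std::move(numbers));
  return result.data() == before && result.size() == 2;
}
```

buffer_is_reused() returns true: the buffer travelled into the function and back out again, and only its owner changed along the way.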

Dynamic lifetime

We often need more flexibility than a tree structure allows, and more than a handoff between two scopes. That flexibility can be achieved by a third type of lifetime: good until deleted.

ShortString *create_short_string() {
  return new ShortString();
}

ShortString *str = create_short_string();  // <-- a scope is created and destroyed.
// But the str lives beyond the scope.
scanf("%99s", str->content);  // Read one word, at most 99 characters plus the terminator.
str->len = strlen(str->content);

...

// This function takes ownership of str.
void print_short_string(ShortString *str) {
  // str is now owned by this function.
  str->content[str->len] = '\0';
  printf("%s\n", str->content);
  delete str;  // str dies here.
}

print_short_string(str); // str passed to the function.
str->len;  // boom! No longer safe to access str.

Unlike "owned" objects, there is little guarantee around dynamic lifetime. A pointer can point to a valid object. It can also point to an object that has been deleted. The programmer must make sure the object is still alive when dereferencing a pointer. That, as it turns out, is a super hard thing to do.

Easier cases

Dynamic lifetime is hard. Over time people discovered two special cases of dynamic lifetime that are easier to reason about: single ownership and shared ownership.

Single ownership: an object is passed around between scopes and objects, but at any point in time, it can only be accessed by one owner. If the current owner decides the object is not needed anymore, the object goes out of life. It is the responsibility of the current owner to clean the object up.

Shared ownership: an object is shared between many scopes and objects. The object is alive as long as one of them still needs the object. The last owner is responsible for cleaning it up. Often it is not clear which scope that would be just by reading the code.

These two cases roughly correspond to std::unique_ptr and std::shared_ptr in C++. Unfortunately the syntax C++ needs for them (std::move, && and so on) is not really the best tool for demonstrations, so I will not walk through it here. The conclusion is clear, though: ownership simplifies dynamic lifetime.
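For readers who want to see the two cases in code anyway, here is a minimal sketch. The function names consume, single_ownership_demo and shared_ownership_counts are hypothetical and exist only for this illustration. ShortString is repeated so the snippet stands on its own.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>

struct ShortString {
  char content[100];
  std::size_t len;
};

// Single ownership: the parameter type forces the caller to hand the object over.
std::size_t consume(std::unique_ptr<ShortString> str) {
  return str->len;
}  // str is the only owner, so the ShortString is deleted here.

bool single_ownership_demo() {
  std::unique_ptr<ShortString> str = std::make_unique<ShortString>();
  str->len = 3;
  std::size_t len = consume(std::move(str));  // Ownership moves into consume().
  return len == 3 && str == nullptr;          // The old owner is left empty.
}

// Shared ownership: returns the owner count while shared and after one owner left.
std::pair<long, long> shared_ownership_counts() {
  std::shared_ptr<ShortString> first = std::make_shared<ShortString>();
  long while_shared = 0;
  {
    std::shared_ptr<ShortString> second = first;  // A second owner appears.
    while_shared = first.use_count();
  }  // second is gone; the object stays alive because first still owns it.
  return {while_shared, first.use_count()};
}
```

In the single ownership case, the moved-from pointer becomes null, so an accidental use after the handoff is at least easy to detect. In the shared case the count goes from 2 back to 1, and the object is deleted only when the last owner disappears.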

Static lifetime

Static is such an overloaded term. Here it is not the opposite of the "dynamic" we talked about above; it means "good until the end of the program". An object is said to have a static lifetime when it is alive throughout the whole program. Such objects are usually not destroyed by user code. Making everything static is a good way to solve the dangling pointer problem, except that it might use too much memory.
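As a sketch of that last point, here is the earlier dangling pointer example with the local variable made static. shared_empty_short_string and static_demo are hypothetical names chosen for this illustration.

```cpp
#include <cassert>
#include <cstddef>

struct ShortString {
  char content[100];
  std::size_t len;
};

// Returns a pointer to an object with static lifetime.
ShortString *shared_empty_short_string() {
  static ShortString str = {};  // Created once, alive until the program ends.
  return &str;                  // Not dangling: str outlives this function scope.
}

bool static_demo() {
  ShortString *ptr = shared_empty_short_string();
  ptr->len = 10;  // Safe this time.
  // Every call returns the same object, so the change is visible here too.
  return shared_empty_short_string()->len == 10;
}
```

The pointer is always valid, but there is a price beyond memory: all callers get the very same object, so this only works when sharing one instance is acceptable.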

Conclusion

We talked about all three types of lifetime, and how "ownership" helps us with memory safety. Let's discuss how they help with thread safety in the next article.
