Destruction that works with multiple threads

In object-oriented programming, the life of an object starts with construction, and ends with destruction. The idea is that before an object can be used, we must allocate memory for it, initialize its members, then initialize itself. Similarly, if an object is no longer needed, we must return its resources and release the memory. It sounds simple enough, but becomes quite complicated in a multi-threaded environment.

Let's take a look at the lifetime of an object that is used by multiple threads.

When the object is first created, it is accessible by just one thread -- the creator thread (T1). T1 initializes the object and its members, then shares the object with other threads. From that point on, each thread executes at its own pace. Eventually the object reaches its end of life and should be destroyed. An object can be destroyed at most once, typically by one thread. No parallelism is needed here. Let's assume T2 ends up being the thread responsible. T2 destroys the members of the object, and then releases the memory.

The symmetry between construction and destruction is natural and obvious: created once, destroyed once; created by one thread, destroyed by one thread.

There is also an asymmetry. The tasks of T1 and T2 are different. When T1 creates the object, it knows that it alone can access the object it just created. However, when T2 is about to destroy the object, more than one thread has access to it. For T2 to do its job, it must somehow make sure the other threads have stopped using the object. The mechanism can be one of the following:

1. A global shutdown signal that is sent to all threads (shutdown_condvar).
2. A shared pointer that assigns the duty of destroying an object to the last bearer of the object (shared_ptr).
3. A state attached to the object that marks it as no longer valid (weak_ptr), so that T2 can destroy the object at any time.
4. A ref count included in the object itself, and a blocking mechanism to wait for that count to reach zero.

My favorite approach is #4 because it is the most universal, and it has deterministic runtime behavior compared with shared_ptr. We know upfront that T2 is the thread that destroys the object, not a latency-sensitive thread, and not a thread that will be stuck in a time-consuming task for the next 10 minutes.
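To make the shared_ptr concern concrete, here is a minimal sketch in C++. The names Tracked and destroyed_by_worker are made up for illustration. The creator hands its only reference to a worker thread, so the worker becomes the last bearer and ends up running the destructor, even though it did not create the object.

```cpp
#include <cassert>
#include <memory>
#include <thread>
#include <utility>

// Hypothetical object that records which thread ran its destructor.
struct Tracked {
    std::thread::id* destroyed_on;
    explicit Tracked(std::thread::id* out) : destroyed_on(out) {}
    ~Tracked() { *destroyed_on = std::this_thread::get_id(); }
};

// Returns true if the destructor ran on the worker thread rather than on
// the calling thread that created the object.
bool destroyed_by_worker() {
    std::thread::id destroyed_on;
    std::thread::id worker_id;

    auto obj = std::make_shared<Tracked>(&destroyed_on);

    // The creator moves its reference into the worker, which now holds
    // the last (and only) reference.
    std::thread worker([&worker_id, held = std::move(obj)]() mutable {
        worker_id = std::this_thread::get_id();
        held.reset();  // last reference dropped: ~Tracked runs right here
    });
    worker.join();

    assert(obj == nullptr);  // a moved-from shared_ptr is empty
    return destroyed_on == worker_id &&
           destroyed_on != std::this_thread::get_id();
}
```

In a real program the last bearer is usually decided by timing rather than by an explicit move, which is exactly why the destroying thread is hard to predict.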

In this approach, T2 would usually need to let other threads know that they must give up the object. It could notify other threads by either changing a boolean that other threads are watching, or sending a message to all holders of the object. Then T2 must wait for the ref count to reach zero (or one if T2 itself is counted). After that it can safely destroy the object and release the memory.
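Here is one way this could look in C++, as a sketch under a few assumptions. The class Counted, its methods, and run_demo are hypothetical names, the notification is a boolean flag, and T2 itself is not counted, so it waits for zero.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical shared object that carries its own ref count.
class Counted {
public:
    // A user thread calls this before touching the object. It fails once
    // shutdown has been requested, so no new users can get in.
    bool try_acquire() {
        std::lock_guard<std::mutex> lock(mu_);
        if (stopping_) return false;
        ++refs_;
        return true;
    }

    // A user thread calls this as its very last access to the object.
    void release() {
        std::lock_guard<std::mutex> lock(mu_);
        assert(refs_ > 0);
        // Notify while holding the lock, so that T2 cannot wake up and
        // delete the object before this call has finished with it.
        if (--refs_ == 0) cv_.notify_all();
    }

    // The boolean that user threads are watching.
    bool stopping() const { return stopping_; }

    // T2 calls this: tell users to give up, then block until they have.
    void shutdown_and_wait() {
        std::unique_lock<std::mutex> lock(mu_);
        stopping_ = true;
        cv_.wait(lock, [this] { return refs_ == 0; });
    }

private:
    std::mutex mu_;
    std::condition_variable cv_;
    int refs_ = 0;
    std::atomic<bool> stopping_{false};
};

// The caller plays T2. Returns how many user threads acquired the object.
int run_demo(int n_users) {
    auto* obj = new Counted;
    std::atomic<int> started{0};
    std::vector<std::thread> users;
    for (int i = 0; i < n_users; ++i) {
        users.emplace_back([obj, &started] {
            if (!obj->try_acquire()) return;
            started.fetch_add(1);
            while (!obj->stopping()) std::this_thread::yield();  // "work"
            obj->release();  // obj must not be touched after this line
        });
    }
    while (started.load() < n_users) std::this_thread::yield();

    obj->shutdown_and_wait();  // ref count is zero when this returns
    delete obj;                // safe: no user can still be inside obj

    for (auto& t : users) t.join();
    return started.load();
}
```

Calling run_demo(4) starts four users, waits for all of them to acquire the object, and then destroys it from the calling thread. Note that the delete happens before the joins: the ref count, not thread exit, is what makes the destruction safe.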

The waiting part is unavoidable. T2 must release the memory, and no more than one thread can release the memory. Even if all operations before releasing the memory are thread safe, we still need to have one thread do the last thing, and that thread has to wait.

There is an alternative. We could just not release the memory at all. The object then stays safe to use all the way to the end of the process. This is the easiest thing to do, but it does put pressure on memory usage.
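In C++, one common way to do this is to allocate the object on the heap and never delete it. The sketch below uses hypothetical names (Config, global_config). Because the object is reached through a raw pointer, its destructor is not run during static destruction at exit either, so a thread that is still running at that point never sees a destroyed object.

```cpp
#include <cassert>
#include <string>

// Hypothetical long-lived object shared by many threads.
struct Config {
    std::string name = "default";
};

// Returns a process-wide object that is deliberately never destroyed.
Config& global_config() {
    // Initialization of a function-local static is thread safe since
    // C++11. The allocation is intentionally leaked.
    static Config* const instance = new Config;
    return *instance;
}
```

Every call returns a reference to the same object, and no thread ever has to wait for the others before shutdown.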