On the heap?

A discussion of techniques to detect whether a particular C++ object is allocated off the heap. Includes comments on Item 27 of Scott Meyers' book More Effective C++.

Introduction

Some time ago, Michael van Rooyen [3] kicked up some dust when he described an alternative to the use of virtual destructors to accomplish the effect of virtual destruction. He argued that his alternative would impose less of a penalty than an (inline) virtual destructor, while still achieving the desired polymorphic effect. Subsequently, Irfan Pyarali [4] and Tim Peierls [5] discussed some related issues. Taken together, there seem to be two points emerging from the discussion.

  1. Is it worthwhile to circumvent virtual destructors (by introducing another mechanism for polymorphic object destruction)?
  2. Can we find a way to reliably detect or mark dynamic objects, even in the presence of multi-threading and complicated initialization schemes such as presented by Tim Peierls?

My short initial answers to these two questions above are "no" and "not without the help of the compiler", respectively. However, I will do the following:

  1. Gather empirical data as to the runtime costs involved in Van Rooyen’s problem and his solution.
  2. Discuss other ways of determining the "heapness" of objects.

This article deals with issue B, while issue A is taken up in its companion "Destroy or destruct?".

In Scott Meyers’ recent book "More effective C++" [1], Item 27 deals with the issue of requiring or prohibiting heap-based objects in C++. The essence of the item is formed by various ways of determining whether or not some specific object lives on the heap as opposed to on the stack or in the static data area. I will address his solutions as well as the issues raised in [4] and [5].

Back to the top

Checking "heapness" with a global variable

One method that Meyers presents to ensure that objects of a given class are indeed allocated off the heap, is to use a global variable, in this case a static data member, as follows ("UPNumber" and other names are borrowed from [1]):

class UPNumber {
  static bool onTheHeap;
public:
  // Nested exception class
  class HeapConstraintViolation {};

  // Overloaded operator new for the class
  static void *operator new(size_t sz)
  {
    onTheHeap = true;
    return ::operator new(sz);
  }

  UPNumber()
  {
    if (!onTheHeap) throw HeapConstraintViolation();
    // ...Normal processing...
    onTheHeap = false;
  }
};

bool UPNumber::onTheHeap=false;

As you can see, this approach cannot prevent objects from being created elsewhere, but it will prevent the successful construction of non-heap based objects. Meyers then goes on to analyze this solution and points out why it isn’t satisfactory: it won’t work for allocated arrays of UPNumbers, and it may present problems if several UPNumber allocations occur with constructor calls interspersed in a platform dependent way. For example, this:

new UPNumber(*new UPNumber)

may or may not work with the onTheHeap flag, depending on the order in which the constructor of the allocated argument object is called relative to the operator new of the ‘outer’ object. The point is that each UPNumber constructor call resets the onTheHeap flag and therefore makes it impossible to remember more than a single operator new call at a time.

Back to the top

Using a counter instead of a flag

There is, of course, a fairly simple way to circumvent this problem: replace the bool flag by a counter. In a private communication with Scott Meyers, he indicated that he had indeed considered putting this alternative in his book as well, but decided against it as it would necessitate him to include alternatives for a number of other items too, and space did not allow that. Anyway, here it is after all.

class UPNumber {
  static int heapPending; // Replaces ‘onTheHeap’
public:
  class HeapConstraintViolation {};

  static void *operator new(size_t sz)
  {
    ++heapPending;
    return ::operator new(sz);
  }

  UPNumber()
  {
    if (heapPending < 1) throw HeapConstraintViolation();
    // ...Normal processing...
    --heapPending;
  }
};

int UPNumber::heapPending=0;

This solves most of the problems Meyers mentions (we’ll get to array allocations in a moment), including the new UPNumber(*new UPNumber) situation. In fact, it could only fail (as far as single objects are concerned) in a situation similar to the following:

new UPNumber(UPNumber());

(This is similar to the example Tim Peierls gave in the June 1996 Letters section [5], but in a different context: we’re not using object addresses here, but simply a count of pending constructions.) This fails if (and only if) the compiler happens to generate code as follows:

  1. Call operator new
  2. Call the constructor of the temporary
  3. Call the constructor of the dynamic object

The order of 2 and 3 will never be reversed, of course, so the question is simply whether or not the operator new call will precede the construction of the temporary. (This scenario would also cause the "onTheHeap" approach to fail.) Any other combination of constructors and/or operator new invocations will be safe with the counter approach. In the example above it remains to be seen if the use of a temporary makes sense, but it’s legal anyway so we have to deal with it (which we can’t).

As to arrays of UPNumbers, this could be solved by also overloading operator new[] for the UPNumber class and implementing it as follows:

void *UPNumber::operator new[](size_t sz)
{
  heapPending += sz / sizeof(UPNumber);
  return ::operator new[](sz);
}

This solves the array allocation problem, assuming that ‘sz’ is indeed the correct multiple of sizeof(UPNumber). When will this not be the case? I can see two possibilities:

1. When the C++ implementation under consideration needs some extra space in the array for bookkeeping, and to this end increases the passed-in ‘sz’. This is a possibility allowed by the ANSI/ISO C++ draft standard (see section 5.3.4), and is used by at least some compilers.

2. When UPNumber::operator new[] is inherited and used by a class derived from UPNumber and sizeof(Derived) > sizeof(UPNumber). There is no foolproof way to check for this (in particular, the size of the derived class may itself be an integral multiple of the size of UPNumber), so this might cause problems too.

To summarize: The counter approach does not completely solve the problems outlined in [1], but it improves on the bool flag approach. In fact, it still is a cheap solution and it may suffice for a number of situations, especially if portability is not an issue or will be considered as the need arises (as frequently happens in practice).

Back to the top

Tracking heap based objects

As an alternative to the global variable approach, Meyers presents a HeapTracked class that uses a list to keep track of the addresses of class instances allocated off the heap, then uses this information to determine if a particular object resides on the heap. The implementation goes like this:

class HeapTracked {
  // Class-global list of allocated addresses
  typedef const void *RawAddress;
  static list<RawAddress> addresses;
public:
  // Nested exception class
  class MissingAddress {};

  // Virtual destructor to allow dynamic_cast<>; pure to make
  // class HeapTracked abstract.
  virtual ~HeapTracked()=0;

  // Overloaded operator new and delete
  static void *operator new(size_t sz)
  {
    void *ptr=::operator new(sz);
    addresses.push_front(ptr);
    return ptr;
  }

  static void operator delete(void *ptr)
  {
    // Remove ‘ptr’ from ‘addresses’
    list<RawAddress>::iterator it=find(addresses.begin(),

    addresses.end(), ptr);
    if (it !=addresses.end()) {
      addresses.erase(it);
      ::operator delete(ptr);
    } else
      throw MissingAddress();
  }

  // Heap check for specific object
  bool isOnHeap() const
  {
    // Use dynamic cast to get start of object block
    RawAddress ptr=dynamic_cast<RawAddress>(this);
    // See if it’s in ‘addresses’
    return find(addresses.begin(), addresses.end(), ptr) !=
      addresses.end();
  }
};

// Meyers omitted first HeapTracked:: qualifier...
list<HeapTracked::RawAddress> HeapTracked::addresses;

This solution is a bit like the one Van Rooyen first presented [3] and the modifications suggested by Peierls [5], with the proviso that Van Rooyen only used a single address pointer instead of a collection of them.

The HeapTracked class does indeed track objects on the heap in a portable way. It is, however, a fairly expensive solution both in space and time. In particular, the list<HeapTracked::RawAddress> template will often (if not always) be implemented as a doubly-linked list and therefore require the allocation of a list element per address stored. This may be a quite considerable overhead, both in space and time. Furthermore, HeapTracked::isOnHeap() operates in linear time, and this too could be a problem.

The latter problem might be solved by switching to, say, a map instead of a linear list; a decent map lookup would probably operate in O(log n) time (or even in amortized constant time, given an underlying hash table implementation) and therefore improve the running time of the lookup operation. On the downside, the map implementation would be likely to take at least as much space per element as the linear list, if not more. Both approaches also introduce quite some code overhead, especially with C++ compilers that instantiate every member function of a template rather than just the ones used by the program. This is contrary to what the standard dictates, but unfortunately, many popular C++ compilers take this approach.

Back to the top

Alternative heap tracking methods

I have two alternatives to offer. Neither is as generally applicable or as foolproof as Meyers’ solution, but they are a lot cheaper, requiring only a single bit per tracked object and checking in constant time, and could be appropriate in many situations. After all, if you really need to track heap based objects, you may also be willing to impose some constraints on the use of the tracked class. (I have done so anyway.) What’s more, the amount of object code added by these alternative methods is likely to be considerably less than that of the methods presented in [1].

Both methods add a signature to the allocated object which is initialized by the overloaded operator new for the class, although the location of the signature differs between the two approaches. The signature is not initialized if the object is constructed otherwise, and this is also the possibly fatal flaw: a correct signature may be accidentally present even in objects that did not come off the heap. I’ll return to that later.

A. Prepend a signature to the allocated block

This first solution would work out as follows: overload operator new for the class and have it allocate some extra space before the actual block, then fill this with a special signature. For example:

class HeapTracked {
  static const int32 signature=0x57647652L;
  static bool checkSignature(const void *);
  ...RawAddress stuff deleted...
public:
  ...as before (operator delete overloading not needed)...
};

void *HeapTracked::operator new(size_t sz)
{
  // Increase requested size by signature size
  sz +=sizeof(int32);
  // Get the somewhat larger block
  void *block=::operator new(sz);

  // Initialize heap signature, then return pointer to
  // subblock beyond signature. (Second static_cast not
  // strictly necessary.)
  int32 *signptr=static_cast<int32 *>(block);
  *signptr=signature;
  return static_cast<void *>(signptr + 1);
}

bool HeapTracked::isOnHeap() const
{
  return checkSignature(dynamic_cast<const void *>(this));
}

bool HeapTracked::checkSignature(const void *ptr)
{
  // Cast back to int32 pointer
  const int32 *signptr=static_cast<const int32 *>(ptr);
  // Check signature in front of the block.
  // ***This is the problem area, see text***
  return signptr[-1]==signature;
}

(By the way, ‘int32’ is a typedef to ensure I have a 4-byte integer regardless of my compiler’s preferences.)

As you’ve probably spotted, the problem with this approach arises when HeapTracked::checkSignature() attempts to retrieve the signature field that precedes the actual object. In general, the address some_array[-1] is not guaranteed to exist. If the block was allocated by HeapTracked::operator new, it will exist, but we’re just about to find that out and can’t be sure yet.

This problem cannot be solved in a portable way, as far as I know. It may just so happen that the address signptr[-1] is not valid; it depends on the platform at hand whether or not this is likely to occur. In my practice (DOS, Win16, Win32, OS/2 and Mac apps), the most threatening situations are in 16-bit segmented DOS/Win16 programs when signptr[-1] would cause segment ‘underwrap’ (this can be checked, albeit in a non-portable way) and in others, when the passed-in block pointer actually addresses something on the stack exactly at the bottom of a stack page, and signptr[-1] causes a page fault on the stack’s guard page. The latter condition is almost impossible to check for, but it has never occurred to me (yet). Anyway, this method is not without its problems, although it is as generally applicable as Meyers’ list<HeapTracked::RawAddress> approach in all other respects (inheritance, mix-in use, etc.).

Another potential problem has to do with block alignment: the block returned by operator new() must be suitably aligned for any object type. Depending on the platform under consideration, this might mean 4- or 8-byte alignment. We may assume that ::operator new() does return a suitably aligned block of memory (although it may not have the optimal alignment, see [2]), but if we add an extra 4-byte offset for our signature before returning the pointer to the HeapTracked::operator new() caller, we may have violated the alignment restrictions. Again, this is a platform dependent matter and you may have to insert some extra padding to ensure that the addition of a signature does not violate alignment restrictions.

Finally, the approach does not work with arrays of heap tracked objects, but Scott Meyers wisely ignored that subject too.

B. Include a signature in the object itself

This second solution is one you’ll probably balk at, but it still has its merits and I’ve used it with success on several occasions (as I did with the previous method). The idea is to place the signature field inside the allocated object, yet still initialize it in HeapTracked::operator new. For example:

class HeapTracked {
  static const int32 signature=0x57647652L;
  int32 signfield; // Added
  ...RawAddress stuff deleted...
public:
  ...as before (operator delete overloading not needed)...
  // Might want copy constructor & assignment
};

void *HeapTracked::operator new(size_t sz)
{
  // Get block of correct size
  void *block=::operator new(sz);

  // Point into the object-to-be and set its signature...
  HeapTracked *htptr=static_cast<HeapTracked *>(block);
  htptr->signfield=signature;
  return block;
}

bool HeapTracked::isOnHeap() const
{
  return signfield==signature;
}

The HeapTracked constructors must leave this field alone of course, as must the assignment operator for the class, so it’s probably a good idea to define at least the copy constructor and assignment operator for this class and implement them to that effect.

As an extra benefit, if you’re really into it you might even want to use this method for allocated arrays of HeapTracked objects and implement HeapTracked::operator new[] to step through the array of objects-to-be (using sizeof(HeapTracked) to determine the step size), setting all their signature fields. In for a penny, in for a pound, don’t you think? Obviously, this can only work under the same circumstances as outlined previously with the UPNumber::operator new[] overloading.

When is this method not applicable? In essence, when class HeapTracked is used as a base class (e.g. as the mix-in mentioned in [1]) and it is either not the first base class, or it is a virtual base class. In both cases, the start of the new object will not coincide with the start of the allocated block, and the htptr->signfield reference will be off. What’s more, HeapTracked::operator new will have no way of spotting this. The same applies if the C++ implementation under consideration adds some bookkeeping space in front of the actual object (again allowed by ANSI/ISO C++). It’s up to the client to decide whether or not these restrictions are acceptable.

Accidental signature match

As I hinted at above, there still remains the problem of accidental signature matches. Both method A and B assume that if the signature value matches the predefined constant, we have a valid heap block. This may not always be true, however, especially not so because in the case of non-heap-based objects, the signature fields aren’t initialized at all and may contain any value (including the signature value). What can we do about it? Nothing absolutely foolproof, but a few things might raise the confidence level enough to make either method usable.

1. Use an "improbable" signature value. Clearly, 0 would be a poor choice, as would be 0xFFFFFFFF. I normally use a 4-byte value that does not match anything systematic. In the above examples the value 0x57647652L was used; I find it nicely conspicuous in dumps, yet fairly unlikely to occur by chance (try interpreting it as 4 ASCII characters). If necessary, you can make the signature value less probable by increasing its size (what about a 128-bit GUID?), but some compromise is probably called for.

2. Clear the signature field in HeapTracked::operator delete (method A and B) or in the HeapTracked destructor (only method B). This might help a bit, but not much, since we’re mainly interested in distinguishing heap based objects from others, and not one heap based object from another. Still, this could reduce the chance of accidents in method B if we do it in the HeapTracked destructor for every HeapTracked object.

3. Use additional information above and beyond the signature field. For example, add a signature field at the end of the object as well (this limits the application of the methods). You could also XOR the signature with a dynamically magical value, say the address of the object itself. (Using the address by itself is probably not a good idea, as it might show up in other places as well.) Or you could prepend an additional counter field and XOR the signature field with that. All these improvements would still allow HeapTracked::isOnHeap() to operate in constant time, and don’t require auxiliary data structures outside the object proper.

Multi-threading issues

One good point about both methods is that they are safe in the presence of multi-threading, an issue raised by Pyarali [4] and Peierls [5]. Neither method relies on global storage (the constant signature value doesn’t count in this respect), but only on a per-object field. Assuming that the operator new implementation under consideration is adequately protected against concurrent access, there is no further need for concurrency control within the object constructors themselves. This alone might be an excellent reason to prefer one of these methods, even considering their less-than-perfect signature matching. Incidentally, this is another trait that sets these methods off from Meyers’ techniques, which do rely on global data structures to hold addresses of allocated objects.

C. Dig into your C++ compiler

One final method to get what you want (i.e., knowledge about heap-basedness) is to consult your compiler’s runtime implementation details. This is extremely nonportable (it might even change from one release to the next), but what gives if you’re desperate?

Many compilers will use some hidden information passed to the constructors and destructors they generate, indicating whether or not the constructor should call operator new before proceeding (effectively combining the constructor call with the operator new call), and conversely, whether the destructor should call operator delete when it’s done. They will also add some information to memory blocks indicating how many objects are in the block (to take care of array construction and destruction). If you can get at these and are prepared to face the maintenance problems it causes, this could also solve your problems.

[As an aside: compiler vendors might consider adding some intrinsic function, say "__dynamic()", that is only valid inside constructors and gives access to this precious bit of information. However, I don’t think that it would be useful enough to warrant inclusion in the ANSI/ISO C++ standard, because the interpretation is ambiguous. For single objects there is no problem: __dynamic() indicates clearly whether or not the object is heap-based. On the other hand, for arrays of objects, should __dynamic() indicate that the objects within the array are heap-based? If you are interested in the general question of heap-basedness, it should, but if you’re trying to come up with some kind of destroy() function such as Van Rooyen, it should not: we don’t want every individual object within the array to be considered dynamic for that purpose. But then again, how do you destroy() dynamically allocated arrays of objects?]

Obviously, this approach is off-limits for a book like "More effective C++" and I have never seen it being used by anybody but the compiler manufacturers themselves, but...

Back to the top

Conclusion

As far as the techniques for "heapness" checking are concerned, the alternatives to Scott Meyers’ methods sacrifice portability for speed and compactness. They are not completely portable, although portability concerns are fairly well localized, nor completely foolproof. On the other hand, they are fast, simple, and safe in the presence of multi-threading, which is not the case with Meyers’ approach, nor with the approaches proposed in [3], [4] and [5]. Therefore, if you do want a textbook solution and multi-threading is not an issue (or you are prepared to add concurrency control), consult [1]. However, the alternatives may have sufficient merit to justify their use in many practical situations. In fact, it will often be just those practical situations that dictate the application of some heap-based check in the first place.

Back to the top

References

[1] Scott Meyers: More Effective C++, Addison-Wesley 1996

[2] Rick Grehan: Bug-free benchmarks, BYTE March 1996, pp. 149-150.

[3] Michael van Rooyen: Alternative C++, C++ Report April 1996, pp. 39-42.

[4] Irfan Pyarali: Comments on "Alternative C++", C++ Report May 1996, Letters section.

[5] Tim Peierls: Comments on "Alternative C++", C++ Report June 1996, Letters section.