Enums, strings and laziness
If you look at the glibc equivalent for converting #defines to strings for purposes of perror, it’s a massive array that, at compile-time, builds all the strings.
The biggest drawback with this approach is that the #define and the corresponding friendly string live in two different places and have to be kept in sync by hand. If someone updates the header file to add a new errno, they have to remember to also update this other place so perror works as expected. I’m using errno as an example, but this is a common problem when writing code in C or C++. The problem is exacerbated when certain enums are conditional (based on the operating system, CPU type and so on), because the same checks then have to be repeated in multiple places. Ugliness.
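To make the drift concrete, here is a minimal sketch of the two-place layout going wrong. All names (my_errno, MY_EINTR, my_errlist, my_strerror) are made up for illustration and are not taken from glibc; the scenario assumes someone added MY_EINTR to the enum and forgot the string table.

```c
#include <stddef.h>

/* Place 1: the error codes. Hypothetical names, prefixed with MY_ so
 * they cannot clash with the macros in the system <errno.h>. */
enum my_errno { MY_EPERM, MY_ENOENT, MY_EINTR, MY_ERRNO_MAX };

/* Place 2: the descriptions, maintained by hand. MY_EINTR was added to
 * the enum above, but nobody added its string here, so that slot is
 * implicitly initialized to NULL. */
static const char *const my_errlist[MY_ERRNO_MAX] = {
    [MY_EPERM]  = "Operation not permitted",
    [MY_ENOENT] = "No such file or directory",
};

/* Returns the description for err, or NULL when the table has a hole. */
const char *my_strerror(int err)
{
    if (err < 0 || err >= MY_ERRNO_MAX)
        return "Unknown error";
    return my_errlist[err];
}
```

The compiler accepts this without complaint, and my_strerror(MY_EINTR) quietly returns NULL. Nothing ties the two lists together except the maintainer's memory.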
So how can we be lazy and define everything in one place so bugs don’t creep in over time when others add new enums to our list?
JDK 1.5 has a nicety where you can define your enums and stringify them too. In other words, if you had a class that’s equivalent to errno.h you can do something like the following:
enum Errno { EPERM("Operation not permitted"), ENOENT("No such file or directory"), .... }
With C, we can accomplish pretty much the same thing with the following preprocessor technique.
First the header file errno.h
    #ifndef _ERRNO_H_
    #define _ERRNO_H_

    /*
     * Common case when someone just includes this file. In this case,
     * they just get the various E* tokens as good old enums.
     */
    #if !defined(ETYPE)
    #define ETYPE(val, desc) E##val,
    #define ETYPE_ENUM  /* flag: we opened the enum block below */
    enum {
    #endif /* ETYPE */

    ETYPE(PERM, "Operation not permitted")
    ETYPE(NOENT, "No such file or directory")

    /*
     * Close up the enum block in the common case of someone including
     * this file.
     */
    #if defined(ETYPE_ENUM)
    #undef ETYPE_ENUM
    #undef ETYPE
    ETYPE_MAX
    };
    #endif /* ETYPE_ENUM */

    #endif /* _ERRNO_H_ */
Without much effort, this behaves much like sys/errno.h in that we get the various EPERM, ENOENT tokens, here as enums. So how do we write the perror function?
Here’s perror.c
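    #include "errno.h"

    static const char *__sys_errlist_internal[ETYPE_MAX] = {
    /* Re-include the list with ETYPE redefined to emit only the strings. */
    #undef _ERRNO_H_
    #define ETYPE(val, desc) desc,
    #include "errno.h"
    #undef ETYPE
    };

    const char *
    perror(int err)
    {
        /* Reject anything outside the table, including negative values. */
        if (err < 0 || err >= ETYPE_MAX)
            return "Unknown error";
        return __sys_errlist_internal[err];
    }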
Wait, how did that work? We included errno.h twice. The first time we got the standard definition of ETYPE, while the second time we redefined ETYPE to pull out the descriptions and build an array of strings automatically. Obviously this contrived example assumes that the error numbers are contiguous and start at zero. The ETYPE macro could carry a whole lot of other interesting metadata with each enumeration, to the point where entire structures and lookup tables can be initialized by merely redefining ETYPE and including errno.h as many times as needed. Now adding a new errno is as simple as touching a single file and recompiling.
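The metadata idea can be sketched in a single file. This is a variant of the technique, not the layout used above: it keeps the entries in a list macro instead of re-including a header. Everything named here (MY_ERRNO_LIST, my_errinfo, my_errinfo_lookup, the INTR entry and the "retryable" column) is hypothetical and added only for illustration.

```c
#include <stddef.h>

/* The single source of truth. The third column (retryable) is invented
 * metadata; PERM and NOENT mirror the entries in the header above. */
#define MY_ERRNO_LIST(X)                              \
    X(PERM,  "Operation not permitted",   0)          \
    X(NOENT, "No such file or directory", 0)          \
    X(INTR,  "Interrupted system call",   1)

/* Expansion 1: the enum, MY_EPERM = 0, MY_ENOENT = 1, MY_EINTR = 2. */
#define AS_ENUM(val, desc, retry) MY_E##val,
enum my_errno { MY_ERRNO_LIST(AS_ENUM) MY_ERRNO_MAX };
#undef AS_ENUM

struct my_errinfo {
    const char *name;      /* symbolic name, e.g. "EPERM" */
    const char *desc;      /* friendly description */
    int         retryable; /* hypothetical flag */
};

/* Expansion 2: one table row per entry, in the same order as the enum. */
#define AS_INFO(val, desc, retry) { "E" #val, desc, retry },
static const struct my_errinfo my_errtab[MY_ERRNO_MAX] = {
    MY_ERRNO_LIST(AS_INFO)
};
#undef AS_INFO

/* Returns the table row for err, or NULL if err is out of range. */
const struct my_errinfo *my_errinfo_lookup(int err)
{
    if (err < 0 || err >= MY_ERRNO_MAX)
        return NULL;
    return &my_errtab[err];
}
```

One trade-off worth noting: a list macro cannot contain #if directives, so conditional entries (per operating system or CPU) are easier to express in the header re-inclusion form, where each ETYPE line can be wrapped in its own #if.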
Laziness at its best.