======================= Important note ===============================

This proposal applies to both the C and the C++ languages. There is a
large overlap for both languages; where a specific constraint or example
applies only to one of the languages, it is stated as such, i.e., the
item contains a clause like "in C" or "for C++". Items without such a
clause should be considered to apply to both languages.

======================= Cover sheet starts here ============

Document Number: WG14 N___/X3J11 __-___

                        C9X Revision Proposal
                        =====================

Title:               Reserved '__null__' identifier
Author:              David R. Tribble
Author Affiliation:  (Self)
Postal Address:      USA
E-mail Address:      dtribble@technologist.com,
                     david.tribble@beasys.com,
                     dtribble@flash.net
Telephone Number:    +1 972 738 6125
Fax Number:          +1 972 738 6111
Sponsor:             ________________________________________
Date:                1997-11-03
Revision:            5

Proposal Category:
    __ Editorial change/non-normative contribution
    __ Correction
    X_ New feature
    __ Addition to obsolescent feature list
    __ Addition to Future Directions
    __ Other (please specify) _____________________________

Area of Standard Affected:
    __ Environment
    X_ Language
    __ Preprocessor
    X_ Library
        X_ Macro/typedef/tag name
        __ Function
        __ Header
    __ Other (please specify) _____________________________

Prior Art:
    The GNU C++ compiler (gcc) supports a '__null' reserved
    identifier, which is used as the default value for the 'NULL'
    macro.

Target Audience:
    Programmers using pointers.

Related Documents (if any):
    (None)

Proposal Attached: X_ Yes __ No, but what's your interest?

Abstract:
    The addition of a reserved identifier '__null__' to be used as a
    generic null pointer constant. A new definition of the standard
    'NULL' macro is also suggested.

======================= Cover sheet ends here ==============

PROPOSAL

A reserved identifier, '__null__', is to be added to the language to
serve as a null pointer constant.
The '__null__' name is a predefined identifier with a type of 'pointer'
and a value of 'null pointer'. The type of '__null__' is
assignment-compatible with all pointer types. Its value compares equal
to all null pointer expressions regardless of their type, and unequal to
any (non-null) pointer to a valid object or function.

A new definition of the 'NULL' macro (in the standard header files) is
also suggested (see the COMPATIBILITY section below).

PROBLEM STATEMENT

C and C++ currently lack an explicit null pointer constant keyword or
identifier. The integer constant zero is deemed a special case that can
be used for such a value. This practice, however, is the source of some
semantic (type checking) problems.

(1) As it currently stands, the 'NULL' macro may be defined in C as
    '((void *)0)', '0', or '0L', and in C++ as '0' or '0L'. The last
    two definitions present a problem. In the example below, a 'NULL'
    defined as '0' might not operate correctly when passed as a pointer
    argument, because it travels through the '...' as an integer rather
    than as a pointer.

        extern int foo(char *s, ...);

        i = foo("%s", NULL);

    In other words, a cast is always required for correct operation and
    portability:

        i = foo("%s", (char *)NULL);

    Using the '__null__' identifier poses no such problem:

        i = foo("%s", __null__);

(2) Consider another example:

        extern Info * bar();

        if (bar() == NULL) ...          /* check for failure */

    Now assume that the definition of bar() is changed to:

        extern enum Status bar();

    If an enum Status value of zero indicates success, but a null
    pointer returned from bar() indicates failure, the old 'if'
    statement is now out of sync with the new function declaration and
    will operate incorrectly. However, the 'if' statement does not
    result in a type mismatch error at compile time, so the error goes
    undetected.

    If, however, the 'if' statement is written using '__null__', the
    compiler will catch the error resulting from the changed prototype:

        if (bar() == __null__) ...      /* bar() must return a pointer */

(3) In C++, the multiple meanings of '0' lead to unintentional errors
    when dealing with overloaded functions:

        extern int fly(int i);
        extern int fly(char *p);

        i = fly(NULL);          // calls fly(int), not fly(char *)

    Since '__null__' is of type pointer, no such ambiguity arises with
    its use:

        i = fly(__null__);      // calls fly(char *)

(4) Another problem arises from the fact that 'sizeof(NULL)' might not
    be equal to 'sizeof(void *)'. By definition, 'sizeof(__null__)' is
    guaranteed to be equal to 'sizeof(void *)'.

CONSTRAINTS

The following paragraphs describe the constraints on the '__null__'
identifier. (See the EXAMPLES section below for illustrations of these
constraints.)

(1) No header file needs to be #included to make the '__null__'
    identifier visible in a translation unit; the identifier is
    predefined (i.e., it is built into the language).

(2) The name '__null__' is a reserved identifier. It shall not be used
    as the name of any library or user-defined object or function, nor
    as the name of any struct or union tag or member, nor as an
    enumeration tag or constant, nor as a typedef name, nor as a
    statement label.

    Whether or not '__null__' may be used as the name of a preprocessor
    macro is implementation-defined. (Note, however, that since it
    begins with two underscores it is reserved for use by the language
    or implementation.)

(3) The '__null__' identifier may be used as an operand in equality
    expressions (involving the '==' and '!=' operators) for comparison
    to pointer expressions.

    In C++, it may be compared to pointer to member expressions.

(4) The '__null__' identifier shall not be used as an operand in a
    relational expression (involving the '<', '<=', '>', or '>='
    operators).

    In C++, '__null__' may be used as an argument to these operators
    provided that they have been overloaded and take a pointer argument.

(5) The '__null__' identifier may be used as an r-value in assignment
    expressions.
    It is assignment-compatible with every pointer type. It may be used
    as the value (right) operand of the simple assignment operator
    ('='), but shall not be the operand of any of the compound
    assignment operators ('*=', '/=', '%=', '+=', '-=', '>>=', '<<=',
    '&=', '^=', and '|=').

    In C++, '__null__' may be used as an argument to these operators
    provided that they have been overloaded and take a pointer argument.

    In C++, '__null__' is also compatible with every pointer to member
    type.

(6) The '__null__' identifier is a constant expression, and may be used
    as an initialization-expression to initialize a variable of pointer
    type.

    In C++, it may also be used to initialize a variable of a pointer
    to member type.

    In C++, '__null__' shall not be used to initialize a reference
    variable.

(7) The '__null__' identifier may be passed as an argument to a
    function.

    In C++, it shall not be passed to a function or a constructor as an
    argument of reference type.

    The '__null__' identifier may be passed as an argument to a function
    taking a variable number of arguments (i.e., the function has a
    '...' in its prototype), and as such is passed as a null pointer
    constant of type 'pointer to void'. Attempting to access the value
    of the argument as any other type (using the 'va_arg()' macro in the
    called function) is undefined behavior. Passing a null pointer value
    as a variable argument of any other pointer type without an explicit
    cast is undefined behavior.

    [This allows a bare '__null__' to be passed as a '...' arg as if it
    was being passed as '(void*)0'. It also leaves the door open for
    implementations to allow a bare '__null__' to be passed as any other
    pointer type (such as 'char*', which has the same alignment
    restrictions as 'void*') without requiring a cast, but note that
    this is 'undefined behavior' and is thus not portable; the portable
    (and safe) thing to do is to explicitly cast '__null__' to the right
    pointer type.]
    [It may be deemed legal to allow '__null__' to be used to initialize
    a reference to a const pointer. This effectively initializes the
    reference to refer to a temporary null pointer constant. However,
    this would be a special case use for '__null__'. On the other hand,
    it would be consistent with the current use of '0' as such an
    argument.]

(8) The '__null__' identifier has a type, which is an incomplete
    pointer type. However, it is compatible with all pointer types.

    In C++, it is also compatible with all pointer to member types.

    (It does not have type 'pointer to void'. Its specific type depends
    upon the context in which it is used. In effect, it "conforms" to
    the type required by the context.)

    Because '__null__' has no specific pointer type, in C++ it shall
    not be used as a class (type) parameter for a template. (In contexts
    where a specific pointer type cannot be deduced, it has no specific
    type.)

(9) The '__null__' identifier is type-compatible with pointer types
    having any combination of 'const' or 'volatile' qualifiers.

(10) Casting '__null__' to a particular pointer type yields a null
    pointer of that type, which compares equal to any other null pointer
    value of that type. Whether or not the bitwise representation of a
    null pointer value of one type is the same as that of null pointers
    of other types is implementation-defined.

    The conversion of '__null__' to a cv-qualified pointer type is a
    single conversion operation, and not a sequence of a pointer
    conversion followed by a qualification conversion.

    In C++, '__null__' may be cast to any pointer to member type,
    yielding a null pointer to member value of that type, which compares
    equal to any other null pointer to member of the same type, and
    unequal to any pointer to member of the same type that was not
    derived from a null pointer value.

(11) The '__null__' identifier is not an l-value. It shall not be the
    target operand of an assignment or increment operator.
(12) The '__null__' identifier does not designate an addressable
    object, and thus it does not have an address. The unary '&' operator
    shall not be applied to it.

(13) The '__null__' value does not point to a valid object and thus
    shall not be dereferenced (by the unary '*' operator or the '[]'
    operator). It shall not be used as an operand in a pointer
    arithmetic expression (involving the binary '+' or '-' operators).

    In C++, it may be passed as an argument to the binary '+' or '-'
    operators provided that they have been overloaded and take a pointer
    argument.

    The '__null__' identifier shall not be dereferenced as a function
    designator in order to call a function.

    In C++, it does not designate any member of any struct or class
    object and shall not be dereferenced as a pointer to member.

(14) The '__null__' identifier has a size. For purposes where the
    complete type is not needed (such as when it is used as an argument
    of the 'sizeof' operator), the size of '__null__' is the same as the
    size of type 'pointer to void'.

(15) The '__null__' identifier is not type-compatible with non-pointer
    types. In particular, it is not type-compatible with integral types.

(16) The '__null__' identifier may be cast to an integral type, but the
    resulting integer value is implementation-defined. In particular,
    the resulting integer value might or might not compare equal to
    integer zero.

    The integer type must be wide enough to contain all of the bits of a
    null pointer to void; casting to smaller types may result in the
    truncation of high-order bits and is implementation-defined
    behavior. (The 'size_t' and 'ptrdiff_t' types might or might not be
    wide enough. Types 'unsigned long int' and 'unsigned long long int'
    probably have sufficient width and are the safest integral types to
    use.)
(17) Casting an integer value, which was the result of previously
    casting '__null__' to an integer type, to type 'pointer to void'
    yields a null pointer value that compares equal to '__null__'
    (provided that the integer type is wide enough). Casting the integer
    value to any other pointer type yields a value that is
    implementation-defined and which might or might not compare equal to
    '__null__'.

EXAMPLES

The following examples illustrate most of the constraints described in
the CONSTRAINTS section above.

(1) Identifier '__null__' may be used in equality expressions:

        int *   p;

        if (p == __null__) ...      /* compare a pointer to null */
        if (p != __null__) ...      /* check for non-null */

(2) Identifier '__null__' is type-compatible with all pointer types:

        float *             fp;
        int *               ip;
        char *              cp;
        const char *        ccp;
        char *const         cpc;
        const char *const   ccpc;

        if (fp   == __null__) ...   /* okay */
        if (ip   == __null__) ...   /* okay */
        if (cp   == __null__) ...   /* okay */
        if (ccp  == __null__) ...   /* okay */
        if (cpc  == __null__) ...   /* okay */
        if (ccpc == __null__) ...   /* okay */

(3) Identifier '__null__' is type-compatible with C++ pointer to member
    types:

        int     Foo::* mvp;         // member var ptr
        void    (Foo::* mfp)();     // member func ptr

        if (mvp == __null__) ...    // okay
        if (mfp == __null__) ...    // okay

(4) Identifier '__null__' is a null pointer value that compares equal
    to all other null pointers:

        float * fp = (float *)0;

        if (fp == __null__) ...                 /* always true */
        if ((char *)(void *)fp == __null__) ... /* always true */

(5) Identifier '__null__' can be assigned to objects of pointer type:

        struct Foo *                p =   __null__;     /* okay */
        const struct Foo *          pc =  __null__;     /* okay */
        struct Foo *const           cp =  __null__;     /* okay */
        const struct Foo *const     cpc = __null__;     /* okay */

(6) Identifier '__null__' can be passed to functions as an argument of
    pointer type:

        extern int  foo(const char *s);
        extern int  bar(double *dp);
        extern void prf(const char *f, ...);
        extern void ff(int x, ...);

        foo(__null__);              /* okay */
        bar(__null__);              /* okay */
        prf("%p", __null__);        /* okay */
        ff(17, (char *)__null__);   /* okay */

(7) In C++, identifier '__null__' has no specific type other than
    'pointer', so it requires explicit casting to disambiguate its type
    when passed as an argument to overloaded functions:

        extern void abc(void *p);
        extern void abc(const char *s);

        abc(__null__);                  // error, ambiguous
        abc((void *)__null__);          // calls abc(void *)
        abc((const char *)__null__);    // calls abc(const char *)
        abc((char *)__null__);          // calls abc(const char *)

(8) In C++, identifier '__null__' has no specific pointer type, so it
    cannot be used as a class parameter for a template:

        template <class T> class Oper { ... };      // template class
        template <class T> void foo(T a) { ... };   // template func

        Oper<__null__>  obj;    // error, not a complete type
        foo(__null__);          // error, not a complete type

(9) Identifier '__null__' is not an l-value. The following statement
    results in a compile-time error:

        __null__ = expr;        /* error, not an l-value */

(10) Identifier '__null__' does not point to a valid object.
    Thus the following statement results in a compile-time error:

        x = *__null__;      /* error, doesn't point to an object */

(11) Identifier '__null__' cannot be used in pointer arithmetic
    expressions:

        i = __null__[i];    /* error, can't do pointer arithmetic */
        i = i[__null__];    /* error, can't do pointer arithmetic */
        p = __null__ + j;   /* error, can't do pointer arithmetic */
        p = __null__ + 0;   /* error, can't do pointer arithmetic */

(12) Identifier '__null__' does not name a proper object, so the
    following is illegal:

        p = &__null__;      /* error, has no address */

(13) Identifier '__null__' has an incomplete type, so these expressions
    also have incomplete types:

        (e ? __null__ : __null__)   /* incomplete type */
        (e1, e2, __null__)          /* incomplete type */

(14) Identifier '__null__' has a size:

        i = sizeof(__null__);       /* okay, same as sizeof(void *) */

        if (sizeof(__null__) == sizeof(void *)) ...
                                    /* always true */
        if (sizeof(__null__) == sizeof(float *)) ...
                                    /* implementation-defined */
        if (sizeof(__null__) == sizeof(void (*)(void))) ...
                                    /* implementation-defined */

(15) Identifier '__null__' is not type-compatible with integral types:

        int     i;
        bool    b;
        enum    { ZERO = 0 };

        if (__null__ == 0) ...      /* error */
        if (__null__ == '\0') ...   /* error */
        if (__null__ == i) ...      /* error */
        if (__null__ == b) ...      /* error */
        if (__null__ == false) ...  /* error */
        if (__null__ == ZERO) ...   /* error */
        if (__null__) ...           /* error */
        if (!__null__) ...          /* error */

        i = __null__;               /* error */
        b = __null__;               /* error */

(16) Identifier '__null__' can be cast to an integer, but the result
    might not be equal to integer zero:

        int             i;
        unsigned long   ul;

        i =  (int)__null__;             /* cast null to int */
        ul = (unsigned long)__null__;   /* cast null to ulong */

        if (i == 0) ...                 /* implementation-defined */
        if (ul == 0) ...                /* implementation-defined */

(17) Casting '__null__' to an integer and then casting the result to
    type 'pointer to void' results in a null pointer value, provided
    that the integer type is wide enough:

        unsigned long   ul;
        void *          p;
        char *          cp;

        ul = (unsigned long)__null__;   /* cast null to ulong */
        p = (void *)ul;                 /* cast back to pointer */

        if (p == __null__) ...
                /* true, if sizeof(__null__) <= sizeof(ulong) */

        cp = (char *)ul;                /* cast back to char ptr */

        if (cp == __null__) ...         /* implementation-defined */

(18) Using '__null__' is more type-safe than using integer zero as a
    null pointer constant:

        int     i = 0;
        char *  p;
        enum    { ZERO = 0 };

        if (p == 0) ...         /* legal, but mixes types */
        if (p == '\0') ...      /* legal, but mixes types */
        if (p == i) ...         /* legal, but mixes types */
        if (p == ZERO) ...      /* legal, but mixes types (C only) */
        if (p == false) ...     /* legal, but mixes types (C only) */
        if ((void *)0 == 0) ... /* legal, but mixes types */

        if (p == __null__) ...          /* safe */
        if (0 == __null__) ...          /* error, incompatible types */
        if ('\0' == __null__) ...       /* error, incompatible types */
        if (i == __null__) ...          /* error, incompatible types */
        if (ZERO == __null__) ...       /* error, incompatible types */
        if (false == __null__) ...      /* error, incompatible types */
        if ((void *)0 == __null__) ...  /* safe */

(19) Using '__null__' is more type-safe than using integer zero as a
    null pointer constant in C++.
    The following statements assume that 'NULL' is defined as '0' or
    '0L':

        char    ch;
        char *  cp;

        memset(p, NULL, sz);        // legal
        ch = NULL;                  // legal
        *cp = NULL;                 // legal

        memset(p, __null__, sz);    // error
        ch = __null__;              // error
        *cp = __null__;             // error

COMPATIBILITY CONSIDERATIONS

In the interests of providing backward compatibility to existing code,
the definition of the 'NULL' macro can be changed in the standard header
files to:

        #define NULL    __null__

Since the name '__null__' begins with two underscores, there will be no
conflicts with user-defined names in conforming programs (since names
beginning with two underscores are reserved for use by the language and
the implementation). ('NULL' is already a reserved name because it is a
macro name defined in a standard library.)

This change has no detrimental effects on conforming programs that use
'NULL' correctly.

ISSUES

(1) On the face of it, there appear to be contradictory semantics
    implied by these statements:

        t1 = ((void *)0 == __null__);               /* true */
        t2 = (0 == (long)__null__);                 /* ? */
        t3 = ((void *)(int)__null__ == __null__);   /* ? */
        t4 = ((void *)__null__ == __null__);        /* true */

    No contradictions exist, however.

    The value of 't1' is always true because casting an integer zero
    value to a pointer type always results in a null pointer, which in
    turn always compares equal to '__null__'.

    The value of 't2' is implementation-defined because the integer
    representation of a null pointer value might not be all bits zero.

    The value of 't3', likewise, is implementation-defined because
    casting '__null__' to an integer value and then casting that integer
    value to a pointer type will result in a pointer value that compares
    equal to '__null__' only if the integer type is wide enough to hold
    all the bits of a null pointer value.

    The value of 't4' is always true because casting '__null__' to a
    specific pointer type always results in a null pointer value, which
    always compares equal to '__null__'.
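    The four cases in item (1) can be sketched in present-day C++. This
    is only an approximation under a stated assumption: the C++11
    keyword 'nullptr' stands in for the proposed '__null__' (the two are
    not identical; 'nullptr' has its own type, 'std::nullptr_t'). The
    helper name 'check_null_round_trip' and the choice of
    'std::uintptr_t' as the "wide enough" integer type are illustrative
    and not part of the proposal.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: C++11 'nullptr' stands in for the proposed '__null__'.
// check_null_round_trip() is a hypothetical helper for illustration only.
inline void check_null_round_trip() {
    void *p = nullptr;

    // Analog of t1 and t4: a null pointer of a specific type always
    // compares equal to the generic null pointer constant.
    assert(p == nullptr);
    assert(static_cast<char *>(p) == nullptr);

    // Analog of t3, but with an integer type that is wide enough by
    // definition: the round trip yields a null pointer again.
    std::uintptr_t bits = reinterpret_cast<std::uintptr_t>(p);
    void *back = reinterpret_cast<void *>(bits);
    assert(back == nullptr);

    // Analog of t2: whether 'bits' equals zero is implementation-defined,
    // so it is deliberately not asserted here.
}
```

    The sketch asserts only the cases that item (1) describes as always
    true, plus the round trip through a sufficiently wide integer; the
    integer value itself is left unchecked because it need not be zero.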
(2) It seems reasonable to allow '__null__' to be used to specify pure
    virtual functions in C++, as in this example:

        class Xyzzy
        {
            virtual int can() =   0;        // virtual member func
            virtual int will() =  NULL;     // virtual member func
            virtual int might() = __null__; // virtual member func
            ...
        };

    The declaration for 'can()' is legal C++ syntax. So is the
    declaration for 'will()' (assuming that 'NULL' is defined as '0' or
    '0L'). Since the syntax of "initializing" a function declarator to
    zero in order to specify that it is a pure virtual function is
    reminiscent of initializing a function pointer, it seems reasonable
    that pure functions could also be "initialized" to '__null__', as
    the declaration for 'might()' illustrates.

    (I expect that this opinion will be controversial. I'm not a strong
    advocate of it, but it would seem to make the language more
    consistent.)

(3) It is tempting to make the constraint that '(int)__null__' is equal
    to integer zero (0), but I'm not convinced that this is a necessary
    or good thing. This might open the semantic door for allowing '0'
    and '__null__' to be used interchangeably, which is not my intent;
    '__null__' should only be legal where pointer values are legal.

LIBRARY CONSIDERATIONS

Many of the standard library functions take pointer arguments. The use
of '__null__' in these cases presents no new semantic or portability
problems.

Some of the standard library functions accept a variable number of
arguments (such as printf()). In all of these cases the use of
'__null__' as an argument is safe. (Safer, in fact, than the existing
practice of allowing 'NULL' to be defined as '0' or '0L'.)

COMMENTS

The new '__null__' identifier is meant to replace the use of '0' as a
null pointer constant. In all situations involving pointer expressions,
the use of '0' can be replaced with '__null__' without affecting the
semantics. '__null__' cannot be used as a replacement for '0' in other
situations, however, particularly within arithmetic expressions.
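The split between pointer contexts and arithmetic contexts can be
approximated with a compiler available today. The sketch below assumes
that the C++11 keyword 'nullptr' is an acceptable stand-in for the
proposed '__null__'; it is not the same construct, and its behavior
differs in some details from the constraints above. The 'fly()' overload
pair is modeled on the PROBLEM STATEMENT example, and the string return
values and the 'call_with_*' helpers are added here only to make the
selected overload observable.

```cpp
#include <cstddef>
#include <string>
#include <type_traits>

// Assumption: C++11 'nullptr' stands in for the proposed '__null__'.
// The overloads report which one was chosen (illustrative only).
inline std::string fly(int)    { return "fly(int)"; }
inline std::string fly(char *) { return "fly(char *)"; }

// An integer zero selects the integer overload ...
inline std::string call_with_zero() { return fly(0); }

// ... while a dedicated null pointer constant selects the pointer one.
inline std::string call_with_null() { return fly(nullptr); }

// The stand-in converts implicitly to pointer types but not to 'int',
// so it cannot take the place of '0' in arithmetic contexts.
static_assert(std::is_convertible<std::nullptr_t, char *>::value,
              "null pointer constant converts to a pointer type");
static_assert(!std::is_convertible<std::nullptr_t, int>::value,
              "null pointer constant does not convert to int");
```

Under these assumptions, call_with_zero() reports "fly(int)" and
call_with_null() reports "fly(char *)".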
Code that currently uses 'NULL' correctly will not be affected by this
proposal.

I do not advocate changing the existing semantics of using integer
constant zero as a null pointer constant in this proposal.

The choice of the name '__null__' follows the naming convention
(apparently) established by the ISO C9X committee for reserved
identifiers (e.g., '__func__').

FUTURE CONSIDERATIONS

With the introduction of the '__null__' identifier, the practice of
using the constant '0' (integer zero) as a null pointer constant could
be deprecated. However, it cannot simply be removed from the language
today because so much existing code relies on it.

Encouraging the use of the 'NULL' macro increases the likelihood that
errors (such as those shown in the EXAMPLES section above) will be
detected at compile time (see the COMPATIBILITY section above).

CONCLUSION

Adding the '__null__' identifier to the language will allow code that
deals with pointers to be more type-safe and less ambiguous.

REFERENCES

ANSI/ISO 9899-1990 Standard for Programming Languages - C, the
following sections:

    6.2.2.3     Pointers
    6.3.2.2     Function calls
    6.3.3.2     Address and indirection operators
    6.3.4       Cast operators
    6.3.6       Additive operators
    6.3.8       Relational operators
    6.3.9       Equality operators
    6.3.16      Assignment operators
    6.3.16.1    Simple assignment
    6.3.16.2    Compound assignment
    6.4         Constant expressions
    7.1.6       Common definitions

ISO/IEC JTC1/SC22 Programming Language C++, committee draft 2
(CD 14882), Dec 1996, the following sections:

    4.10        Pointer conversions
    4.11        Pointer to member conversions

====================== End of Proposal =====================