Date: Thu, 10 Jul 1997 21:12:46 -0500
To: dtribble@flash.net
From: David R Tribble
Subject: Proposal for ANSI/ISO C library

>Date: Thu, 10 Jul 1997 11:56:28 -0500
>To: "Clive D.W. Feather"
>From: David R Tribble
>Subject: Proposal for ANSI/ISO C library
>Cc: "ANSI/ISO C Committee"

[Please pass this proposal on to the committee. Although I realize
that the current C9X standard is almost complete, I am submitting
this proposal for the ANSI/ISO C programming language in the hope
that it will be considered, if not for the current draft, then for
a future edition. Thanks.]

======================= Cover sheet starts here ============

Document Number:  WG14 N___/X3J11 __-___

                        C9X Revision Proposal
                        =====================

Title:               Thread-safe library
Author:              David R. Tribble
Author Affiliation:  (Self)
Postal Address:      6004 Cave River Dr.
                     Plano, TX 75093-6951
                     USA
E-mail Address:      dtribble@technologist.com,
                     david.tribble@beasys.com,
                     dtribble@flash.net
Telephone Number:    +1 972 738 6125
Fax Number:          +1 972 738 6111
Sponsor:             ________________________________________
Date:                1997-07-10

Proposal Category:
    __ Editorial change/non-normative contribution
    __ Correction
    X_ New feature
    __ Addition to obsolescent feature list
    __ Addition to Future Directions
    __ Other (please specify) _____________________________

Area of Standard Affected:
    __ Environment
    __ Language
    __ Preprocessor
    X_ Library
        X_ Macro/typedef/tag name
        X_ Function
        X_ Header
    __ Other (please specify) _____________________________

Prior Art:
    Reentrant/thread-safe versions of the ANSI/ISO (and POSIX)
    functions have been in use on many Unix implementations for
    some time now.

Target Audience:
    Programmers writing multithreaded programs.

Related Documents (if any):
    (None)

Proposal Attached:  X_ Yes  __ No, but what's your interest?

Abstract:
    The addition of standard library functions suitable for use in
    multithreaded environments, which augment the existing
    thread-unsafe functions.

======================= Cover sheet ends here ==============

Caveats:

    (The section numbers used throughout this proposal are taken
    from "The Annotated ANSI C Standard" ANSI/ISO 9899-1990, edited
    by Herbert Schildt. They may not reflect the latest section
    numbers subsequent to the ISO amendments.)

Proposal:

    The following standard library functions maintain state
    information from one function call to the next. This makes
    them unsuitable for use in modern multithreaded operating
    environments, i.e., they are not thread-safe:

        asctime()       7.12.3.1
        ctime()         7.12.3.2
        gmtime()        7.12.3.3
        localtime()     7.12.3.4
        rand()          7.10.2.1
        setlocale()     7.4.1.1
        strerror()      7.11.6.2
        strtok()        7.11.5.8
        wcstok()        7.11.?.?

    This proposal adds, for each of the functions listed, a new
    function that is suitable for use in multithreaded environments.

Problem Statement:

    Many modern operating environments support the concept of
    multiple "threads" of execution within a single program. (This
    is known as "multithreaded programming" or sometimes
    "multiprogramming".) In order to operate properly in such
    environments, library functions must not make use of modifiable
    global data that persists across function calls, nor share
    global data among function invocations. If there are library
    functions that do so, they must make special arrangements to
    operate as expected within multithreaded programs, so that a
    function called in one thread does not interfere with a function
    call from another thread; in a multithreaded program, both calls
    could be active simultaneously. Functions that take such
    precautions are called "thread-safe" because they can be safely
    used within multithreaded programs.

Conceptual Model:

    To support multithreaded programming environments, the standard
    C library should be defined such that no functions rely on the
    state of global data between invocations. The most
    straightforward way to accomplish this is to identify those
    functions that violate this constraint and then define new
    replacement functions that provide the same functionality but
    that are thread-safe.

Library Changes:

    With the exception of the functions listed above, all of the
    existing library functions (as defined in the current standard)
    should be explicitly defined as being thread-safe. (Some of the
    functions are implemented by some vendors in thread-unsafe ways.
    Explicitly prohibiting such implementations would allow
    programmers to rely on more robust behavior from the library.)

    In order to make this goal achievable, the following functions
    need to be changed to return 'const' pointers. Their current
    definitions return pointers to non-const data and merely state
    that programs shall not modify the contents of the data returned
    (but do nothing to enforce this).
    This weakness might force some implementors to allocate
    modifiable global data areas, which are inherently
    thread-unsafe, in order to provide the non-constness of the
    return values.

    The following functions need changing to return pointers to
    const values:

        asctime()       7.12.3.1
        ctime()         7.12.3.2
        getenv()        7.10.4.4
        gmtime()        7.12.3.3
        localeconv()    7.4.2.1
        localtime()     7.12.3.4
        setlocale()     7.4.1.1
        strerror()      7.11.6.2

Library Additions:

    The following functions would be added to the standard library
    and would provide thread-safe operation. The operation of each
    one is explained, as well as the rationale behind it. (In most
    cases, the name and prototype of the function are taken from
    existing Unix practice.)

    The additional functions are grouped according to library
    sections.

Section: Time and Date

        #include <time.h>
        int asctime_r(const struct tm *t, char *buf, size_t len);

    [See asctime(), 7.12.3.1.]

    Converts the broken-down time value `t' into a 26-character
    string of the form:

        "Mon Sep 18 01:03:52 1989\n\0"

    All of the fields of the string are fixed-width. The area
    pointed to by `buf' is filled with the resulting string up to
    `len' characters (including the terminating null character).
    The number of characters stored into `buf' is returned, or -1 is
    returned on error.

    It is recommended that `len' be at least 26, since that is the
    length of the complete string with all of the fields shown
    above.

    Rationale:
    This function is passed an explicit buffer to fill, so it does
    not have to share the same global data string from call to call.
    In a multithreaded program, threads are thus not forced to share
    the same string, which may be filled at different times by each
    thread. This also eliminates the risk of interference when
    asctime_r() is called by two threads at the same time, since
    they are updating different data areas.

        #include <time.h>
        int ctime_r(const time_t *t, char *buf, size_t len);

    [See ctime(), 7.12.3.2.]
    Equivalent in operation to:

        struct tm tmp;
        localtime_r(t, &tmp);
        asctime_r(&tmp, buf, len);

    Rationale:
    (Same as for asctime_r().)

        #include <time.h>
        int gmtime_r(const time_t *t, struct tm *ts);

    [See gmtime(), 7.12.3.3.]

    Converts the time value `t' into its equivalent broken-down
    form, storing the result in the structure pointed to by `ts'.
    The resulting values of the members of `ts' are derived with
    respect to Coordinated Universal Time (UTC) (compare to
    localtime_r()). Returns zero on success or -1 on error (such as
    either pointer being null).

    Rationale:
    (Same as for asctime_r().)

        #include <time.h>
        int localtime_r(const time_t *t, struct tm *ts);

    [See localtime(), 7.12.3.4.]

    Converts the time value `t' into its equivalent broken-down
    form, storing the result in the structure pointed to by `ts'.
    The resulting values of the members of `ts' are derived with
    respect to the program's notion of the local time zone (compare
    to gmtime_r()). Returns zero on success or -1 on error (such as
    either pointer being null).

    Rationale:
    (Same as for asctime_r().)

Section: Locale

    It is possible that an implementation that supports
    multithreading may choose to allow different threads within a
    given program to have different locale settings. The following
    new functions allow for this kind of implementation.

    [If this kind of implementation is considered ill advised, i.e.,
    if it is considered best that a program shall have only one
    locale setting shared by all of its threads, then the new
    functions below still have advantages to offer for multithreaded
    implementations.]

        #include <locale.h>
        #define LC_BUFSIZ

    [See setlocale_r() below.]

    This is a macro that specifies the largest possible number of
    characters in the string returned by setlocale() and
    setlocale_r().

        #include <locale.h>
        int setlocale_r(int category, const char *locale,
                        char *buf, size_t len);

    [See setlocale(), 7.4.1.1.]
    This function selects the appropriate portion of the program's
    (or thread's) locale as specified by the `category' and `locale'
    arguments, storing the result in string `buf'. Up to `len'
    characters are stored in the string (including the terminating
    null ('\0') character). The number of characters stored into
    string `buf' is returned. If any of the arguments are invalid
    (such as null), -1 is returned.

    The maximum number of characters that shall be placed into
    string `buf' (including the terminating null ('\0') character)
    is specified by the LC_BUFSIZ macro, which is defined in the
    <locale.h> header.

    (This is functionally similar to setlocale().)

    Rationale:
    This function supports implementations that allow different
    threads within a given program to have different locale
    settings. If, on the other hand, all of the threads within a
    given program share the same locale, then this function operates
    identically to setlocale(), with the added advantage that one
    thread cannot affect the `buf' string passed to setlocale_r() by
    a second thread, for example when the first thread is changing
    the program's locale by its own (simultaneous) call to
    setlocale_r().

        #include <string.h>
        const char * strerror_r(int err);

    [See strerror(), 7.11.6.2.]

    [Note that existing implementations of strerror_r() have a
    different prototype. Perhaps this function should have a
    different name. Or, the definition of strerror() could be
    modified to behave as described below in the Rationale.]

    This function maps error number `err' into its corresponding
    error message string. The contents of the returned string are
    implementation-defined. If `err' is not a meaningful error
    number in the implementation, a pointer to an empty string
    (i.e., a string containing only a single null ('\0') character)
    is returned. (A null pointer is never returned.)

    Rationale:
    This function supports implementations that allow different
    threads within a given program to have different locale
    settings.
    Threads with different locales may require different error
    message string tables (or catalogs), such as those written in
    different regional languages. If, on the other hand, all the
    threads within a given program share the same locale settings,
    then strerror_r() operates identically to strerror().

    Note also that this function returns a 'const' string pointer,
    unlike strerror() (which only states that programs may not
    modify the contents of the returned string).

Section: String Handling

        #include <string.h>
        char * strsep(char **s, const char *del);

    [See strtok(), 7.11.5.8.]

    [An alternate name for this function could be strtok_r(), except
    that existing implementations of strtok_r() have a different
    prototype.]

    This function locates the first occurrence in the string pointed
    to by `*s' of any character in the string `del' (or its
    terminating null ('\0') character) and replaces it with a null
    ('\0') character. The pointer pointed to by `s' is then set to
    point to the character in string `*s' that immediately follows
    the delimiter character, or to NULL if the end of the string is
    reached before a matching delimiter character is found. The
    original value of `*s' (i.e., the value upon entry to the
    function) is returned.

    Rationale:
    This is a thread-safe version of strtok() that does not require
    state information about the search string to be saved between
    calls.

    [This is almost identical to the strsep() proposal made by Keith
    Bostic in May 1995.]

        #include <wchar.h>
        wchar_t * wcssep(wchar_t **s, const wchar_t *del);

    [See wcstok(), ?; and strtok(), 7.11.5.8.]

    [An alternate name for this function could be wcstok_r(), except
    that existing implementations of wcstok_r() have a different
    prototype.]

    This function is equivalent to strsep() except that it operates
    on wide-character string operands.

Section: Miscellaneous

        #include <stdlib.h>
        int rand_r(unsigned int *seed, int *r);

    [See rand(), 7.10.2.1.]

    Computes a random number in the range [0,RAND_MAX] from the seed
    value pointed to by `seed'.
    The resulting random number is stored into `r' and returned from
    the function. The value pointed to by `seed' is also updated
    appropriately, so that the next call to rand_r() returns the
    next random number in the sequence. If either pointer `r' or
    `seed' is null, -1 is returned.

    Calls to rand_r() that are passed identical seed values (not to
    be confused with identical seed value pointers) shall return
    identical random numbers and update the seed values to the same
    values.

    Rationale:
    Even if an implementation's rand() function is itself
    thread-safe, rand_r() allows multiple threads within a program
    to maintain their own independent sequences of random numbers
    concurrently. Without rand_r(), threads calling rand() will
    affect other threads' sequences of random values, most likely in
    ways that are neither predictable nor replicable.

Conclusion:

    The next release of the ANSI/ISO C standard will take us into
    the next decade. Current operating systems technology has
    embraced multithreading, and it appears that this programming
    paradigm will be with us for some years to come. As this
    technology becomes more widespread, the limitations inherent in
    the C library will become more apparent, and vendors will be
    forced to come up with their own nonstandard solutions.

    It is better that we recognize that such deficiencies exist and
    enhance the C library to accommodate the technology than to
    watch incompatible solutions proliferate. This is best done
    sooner rather than later, so that programmers can rely on
    standard, vendor-independent solutions.

Future Considerations:

    The current thread-unsafe functions could be marked as
    obsolescent so as to discourage their use. On the other hand,
    an argument could be made that existing single-threaded programs
    shouldn't be required to change, since they neither benefit from
    nor require thread-safe library functions.

References:

    The strsep() proposal made by Keith Bostic in May 1995, titled
    "Replacement for strtok".

    Various manual pages for the following Unix systems were also
    consulted: HP-UX, AIX, Digital Unix, SunOS.

====================== End of Proposal =====================