To: Deborah Donovan
From: David R Tribble on Wed, Jan 21, 1998 12:22 PM
Subject: Comments on ISO/IEC 9899 (C9X) draft

Message-Id: <2.2.32.19980121170940.00f01fac@central.beasys.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Date: Wed, 21 Jan 1998 11:09:40 -0600
To: ddonovan@itic.nw.dc.us
From: David R Tribble
Subject: Comments on ISO/IEC 9899 (C9X) draft

Public Comment Number(s)  PC-____ to PC-____

ISO/IEC CD 9899 (SC22N2620) Public Comment
===========================================

Date:                1998-01-21
Author:              David R. Tribble
Author Affiliation:  Self
Postal Address:      6004 Cave River Dr.
                     Plano, TX 75093-6951
                     USA
E-mail Address:      dtribble@technologist.com
                     david.tribble@central.beasys.com
                     dtribble@flash.net
Telephone Number:    +1 972 738 6125, 16:00-00:00 UTC
                     +1 972 964 1729, 01:00-05:00 UTC
Fax Number:          +1 972 738 6111

Number of individual comments:  2

------------------------------------------------------------------------

Comment 2.  Category: Request for clarification
Committee Draft subsection: 5.1.1.2, 5.2.1, 6.1.2
Title: Source characters not allowed as UCNs

Detailed description:

Section 5.1.1.2 states that UCN codes representing characters in the
source character set are not allowed within the source text.  For
example, the following fragment is illegal:

    int func(int i)
    {
        return \u0030;          // \u0030 is '0'
    }

    int bar(int \u006A)         // \u006A is 'j'
    {
        return \u006A + 1;
    }

But this fragment is legal:

    int foo(int \u00E1)         // \u00E1 is 'a'+accent
    {
        return \u00E1 * 2;
    }

There is little difference between these fragments.  What is the
reason for the limitation on valid UCN codes?

Conceivably, a Unicode text editor might store all of the characters
in a file as UCN sequences for maximum portability.  Allowing most
characters to be written as UCNs, but requiring a few characters to be
written strictly as 7-bit ISO-646 characters, seems like an artificial
restriction.

A C compiler implementation could choose to convert all source
characters into 16-bit (or even 32-bit) codes internally, preferring
to convert UCNs into single internal codes as they are read.  Why
should it be prevented from accepting every alphanumeric ISO-10646
character, instead of every alphanumeric character /except/ 'a'-'z'
et al.?
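For illustration only, such an implementation's low-level input
routine might fold UCNs into single codes as it reads them.  The
following is just a sketch; the function name read_source_char and
its error convention are invented here, not taken from the draft:

    #include <stdio.h>

    /* Sketch: read one source character, folding \uXXXX and
     * \UXXXXXXXX UCNs into single 32-bit internal codes.
     */
    static long read_source_char(FILE *src)
    {
        int     ch;
        int     ndigits;
        long    code;

        ch = getc(src);
        if (ch != '\\')
            return ch;              // Ordinary character, or EOF

        ch = getc(src);
        if (ch == 'u')
            ndigits = 4;            // \uXXXX
        else if (ch == 'U')
            ndigits = 8;            // \UXXXXXXXX
        else
        {
            ungetc(ch, src);        // Not a UCN; keep the backslash
            return '\\';
        }

        code = 0;
        while (ndigits-- > 0)
        {
            int     hex;

            hex = getc(src);        // Accumulate one hex digit
            if (hex >= '0' && hex <= '9')
                code = code*16 + (hex - '0');
            else if (hex >= 'a' && hex <= 'f')
                code = code*16 + (hex - 'a' + 10);
            else if (hex >= 'A' && hex <= 'F')
                code = code*16 + (hex - 'A' + 10);
            else
                return -1;          // Malformed UCN
        }
        return code;                // A single ISO-10646 code
    }

Under such a scheme, '\u0030' and '0' decode to the same internal
code, so the two spellings would be indistinguishable to the rest of
the translator.

------------------------------------------------------------------------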