[Info-vax] C limitations, was: Re: VMS process communication

Arne Vajhøj arne at vajhoej.dk
Mon Apr 24 09:40:29 EDT 2023


On 4/24/2023 9:32 AM, Andy Burns wrote:
> Simon Clubley wrote:
>> Arne Vajhøj wrote:
>>> Java has a Character.isDigit method that returns true for 350 chars.
>>
>> And what the hell is the point of _that_ ? :-)
> 
> Depends ... if you have a string where Character.isDigit() returns true
> for every character, does Integer.parseInt() return the expected
> integer?  What about mixed writing systems: should the string "٣3३3" be
> expected to return the integer 3333?

The documentation explains what it does:

<quote>
public static boolean isDigit(char ch)

Determines if the specified character is a digit.

A character is a digit if its general category type, provided by 
Character.getType(ch), is DECIMAL_DIGIT_NUMBER.

Some Unicode character ranges that contain digits:

     '\u0030' through '\u0039', ISO-LATIN-1 digits ('0' through '9')
     '\u0660' through '\u0669', Arabic-Indic digits
     '\u06F0' through '\u06F9', Extended Arabic-Indic digits
     '\u0966' through '\u096F', Devanagari digits
     '\uFF10' through '\uFF19', Fullwidth digits

Many other character ranges contain digits as well.

Note: This method cannot handle supplementary characters. To support all 
Unicode characters, including supplementary characters, use the 
isDigit(int) method.
</quote>
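
To make the question about the mixed string concrete, here is a minimal
sketch. The class and method names (MixedDigits, allDigits) are made up
for illustration; only Character.isDigit(int) and Integer.parseInt are
real API. Integer.parseInt is documented in terms of Character.digit,
so it should accept the same non-ASCII decimal digits that isDigit
accepts.

```java
final class MixedDigits {
    // The string from the question: Arabic-Indic 3, ASCII 3, Devanagari 3, ASCII 3.
    // Written with escapes so the source file encoding does not matter.
    static final String MIXED = "\u0663" + "3" + "\u0969" + "3";

    // True if the string is non-empty and Character.isDigit(int)
    // accepts every code point in it.
    static boolean allDigits(String s) {
        return !s.isEmpty() && s.codePoints().allMatch(Character::isDigit);
    }

    public static void main(String[] args) {
        System.out.println(allDigits(MIXED));        // true
        System.out.println(Integer.parseInt(MIXED)); // 3333
    }
}
```

So a string that mixes writing systems passes both the isDigit check
and parseInt, which is probably not what an input validator intended.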

But my theory is that most Java programs doing input verification
really want to accept only ISO-LATIN-1 digits ('0' through '9') and
should consider anything else an error.
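
A sketch of what such a strict check could look like, assuming the
program wants nothing but '0' through '9'. The class and method names
(AsciiDigits, isAsciiDigits) are hypothetical, not from any library.

```java
final class AsciiDigits {
    // True only if s is non-empty and every char is in '0'..'9'.
    // Deliberately does not use Character.isDigit, which also accepts
    // digits from other writing systems.
    static boolean isAsciiDigits(String s) {
        if (s.isEmpty()) {
            return false;
        }
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c < '0' || c > '9') {
                return false;
            }
        }
        return true;
    }
}
```

With this, isAsciiDigits("3333") is accepted while a string containing
an Arabic-Indic or Devanagari digit is rejected before it ever reaches
Integer.parseInt.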

And it is really symmetric. If a program is expecting Arabic-Indic
digits, then ISO-LATIN-1 digits should probably be considered an
error.
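
The symmetric case can be sketched by parameterizing the check on the
zero digit of the expected range. The names here (ScriptDigits,
allFromRange, ARABIC_INDIC_ZERO) are invented for illustration; the
range '\u0660' through '\u0669' is the one listed in the documentation
quoted above.

```java
final class ScriptDigits {
    // First digit of the Arabic-Indic range listed in the Javadoc.
    static final char ARABIC_INDIC_ZERO = '\u0660';

    // True only if s is non-empty and every char lies in the ten
    // consecutive code units starting at zero.
    static boolean allFromRange(String s, char zero) {
        if (s.isEmpty()) {
            return false;
        }
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c < zero || c > zero + 9) {
                return false;
            }
        }
        return true;
    }
}
```

Called with ARABIC_INDIC_ZERO, this accepts a string of Arabic-Indic
digits and rejects "3333"; called with '0', it does the reverse.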

Arne







