[Info-vax] C limitations, was: Re: VMS process communication
Arne Vajhøj
arne at vajhoej.dk
Mon Apr 24 09:40:29 EDT 2023
On 4/24/2023 9:32 AM, Andy Burns wrote:
> Simon Clubley wrote:
>> Arne Vajhøj wrote:
>>> Java has a Character.isDigit method that return true for 350 chars.
>>
>> And what the hell is the point of _that_ ? :-)
>
> Depends ... if you have a string where Character.isDigit() returns true
> for every character, does Integer.parseInt() return the expected
> integer. What about mixed writing systems, should the string "٣3३3" be
> expected to return the integer 3333?
The documentation explains what it does:
<quote>
public static boolean isDigit(char ch)
Determines if the specified character is a digit.
A character is a digit if its general category type, provided by
Character.getType(ch), is DECIMAL_DIGIT_NUMBER.
Some Unicode character ranges that contain digits:
'\u0030' through '\u0039', ISO-LATIN-1 digits ('0' through '9')
'\u0660' through '\u0669', Arabic-Indic digits
'\u06F0' through '\u06F9', Extended Arabic-Indic digits
'\u0966' through '\u096F', Devanagari digits
'\uFF10' through '\uFF19', Fullwidth digits
Many other character ranges contain digits as well.
Note: This method cannot handle supplementary characters. To support all
Unicode characters, including supplementary characters, use the
isDigit(int) method.
</quote>
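To make the question about mixed writing systems concrete, here is a minimal sketch. The class name UnicodeDigitDemo and the helper allDigits are made up for illustration. It uses the isDigit(int) overload through String.codePoints() so that supplementary characters are covered as well. Integer.parseInt converts each character with Character.digit, so it accepts any of these digits and the mixed string parses as 3333.

```java
final class UnicodeDigitDemo {
    // Hypothetical helper: true if s is non-empty and every code point is a
    // decimal digit according to Character.isDigit(int).
    static boolean allDigits(String s) {
        return !s.isEmpty() && s.codePoints().allMatch(Character::isDigit);
    }

    public static void main(String[] args) {
        // Arabic-Indic three, ASCII 3, Devanagari three, ASCII 3.
        String mixed = "\u0663" + "3" + "\u0969" + "3";
        System.out.println(allDigits(mixed));        // every character is a digit
        System.out.println(Integer.parseInt(mixed)); // parsed via Character.digit
    }
}
```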
But my theory is that most Java programs doing some input verification
really want to check for ISO-LATIN-1 digits ('0' through '9') only, and
anything else should be considered an error.
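A check along those lines could look like the sketch below. AsciiDigitCheck and isAsciiDigits are hypothetical names, and treating null and the empty string as invalid is an assumption, not something stated above.

```java
final class AsciiDigitCheck {
    // Hypothetical helper: accepts only '0' through '9'; any other character,
    // including digits from other scripts, makes the input invalid.
    static boolean isAsciiDigits(String s) {
        if (s == null || s.isEmpty()) {
            return false; // assumption: empty input is not a valid number
        }
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c < '0' || c > '9') {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isAsciiDigits("3333"));
        // Arabic-Indic three, ASCII 3, Devanagari three, ASCII 3.
        System.out.println(isAsciiDigits("\u0663" + "3" + "\u0969" + "3"));
    }
}
```

With this check the mixed string is rejected before it ever reaches Integer.parseInt.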
And it is really symmetric. If a program is expecting Arabic-Indic
digits, then ISO-LATIN-1 digits should probably be considered an
error.
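The symmetric case can be sketched the same way, using the '\u0660' through '\u0669' range from the documentation quoted above. ArabicIndicDigitCheck and isArabicIndicDigits are hypothetical names, and rejecting the Extended Arabic-Indic range is an assumption made to keep the example strict.

```java
final class ArabicIndicDigitCheck {
    // Hypothetical helper: accepts only Arabic-Indic digits (U+0660..U+0669).
    // ASCII digits and digits from other scripts are treated as errors.
    static boolean isArabicIndicDigits(String s) {
        if (s == null || s.isEmpty()) {
            return false; // assumption: empty input is not a valid number
        }
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c < '\u0660' || c > '\u0669') {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isArabicIndicDigits("\u0661\u0662\u0663")); // Arabic-Indic 1, 2, 3
        System.out.println(isArabicIndicDigits("123"));                // ASCII digits
    }
}
```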
Arne