[Info-vax] openvms and xterm

Arne Vajhøj arne at vajhoej.dk
Wed Apr 24 20:08:06 EDT 2024


On 4/23/2024 9:03 AM, Dan Cross wrote:
> In article <v088m8$1juj9$1 at dont-email.me>,
> Simon Clubley  <clubley at remove_me.eisner.decus.org-Earth.UFP> wrote:
>> On 2024-04-22, Dan Cross <cross at spitfire.i.gajendra.net> wrote:
>>>
>>> Eh, JSON has its own problems; since the representation of
>>> numbers is specified to be compatible with floats, it's possible
>>> to lose data by translating it through JSON (I've seen people
>>> put e.g. machine addresses in JSON and then be surprised when
>>> the low bits disappear: floating point representations are not
>>> exact over the range of 64-bit integers!).
>>
>> I would consider that to be strictly a programmer error. That's the
>> kind of thing that should be stored as a string value unless you are
>> using a JSON library that preserves integer values unless decimal data
>> is present in the input data (and hence silently converts it to a float).
>>
>> I don't expect people to write their own JSON library (although I hope
>> they can given how relatively simple JSON is to parse), but I do expect
>> them to know what values they can use in libraries in general without
>> experiencing data loss.
> 
> In modern languages, one can often derive JSON serialization and
> deserialization methods from the source data type, transparent
> to the programmer.  Those may decide to use the JSON numeric
> type for numeric data; this has surprised a few people I know
> (who are extraordinarily competent programmers).  Sure, the fix
> is generally easy (there's often a way to annotate a datum to
> say "serialize this as a string"), but that doesn't mean that
> even very senior people don't get caught out at times.
> 
> But the problem is even more insidious than that; popular tools
> like `jq` can take properly serialized source data and silently
> make lossy conversions.  So you might have properly written,
> value preserving libraries at both ends and still suffer loss
> due to some intermediate tool.
> 
> JSON is dangerous.  Caveat emptor.

This will be a relatively long post. Sorry.

The problem at hand has nothing to do with JSON. It is
a string-to-numeric conversion and data type problem.

JSON:

{ "v": 100000000000000001 }

XML:

<data>
     <v>100000000000000001</v>
</data>

YAML:

v: 100000000000000001

All expose the same problem.

The value cannot be represented as is in some very common
data types like 32 bit integers and 64 bit floating point.
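
A minimal Python sketch of the point, using only the standard library
(YAML is left out because it would need a third-party parser). The
variable names are my own. The digits are identical in JSON and XML;
the loss only appears once a 64 bit float is chosen as the target type:

```python
import json
import xml.etree.ElementTree as ET

TEXT = "100000000000000001"

# The same digits carried by two different formats.
json_doc = '{ "v": ' + TEXT + ' }'
xml_doc = "<data><v>" + TEXT + "</v></data>"

# Python's json module parses integer literals into arbitrary-precision
# ints by default, so nothing is lost here.
json_as_int = json.loads(json_doc)["v"]

# Asking the same parser to produce 64 bit floats drops the low digit.
json_as_float = json.loads(json_doc, parse_int=float)["v"]

# XML only hands back text; the numeric conversion is an explicit choice.
xml_text = ET.fromstring(xml_doc).find("v").text
xml_as_int = int(xml_text)
xml_as_float = float(xml_text)

print(json_as_int, int(json_as_float))
print(xml_as_int, int(xml_as_float))
```

In both cases the format delivered the digits intact; the data type
chosen on the receiving side decided whether they survived.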

But selecting an appropriate data type for a given piece
of data based on its possible values and usage is
a core responsibility of a developer.

The developer should understand possible values before
writing the code and understand the characteristics
of the available data types.

If not then that is a developer error.

But errors happen all the time. Not-so-good developers
make lots of errors. Good developers make errors
occasionally. The bad developers are not aware of the data
requirements and how the data types work. The good developers
are aware, but because they are having a bad day or something
it slips through anyway.

There is every indication that large JSON numbers and the
use of 64 bit floating point have caused problems. There must
be a reason why this was added to the JSON RFC (RFC 8259, section 6):

<quote>
    This specification allows implementations to set limits on the range
    and precision of numbers accepted.  Since software that implements
    IEEE 754 binary64 (double precision) numbers [IEEE754] is generally
    available and widely used, good interoperability can be achieved by
    implementations that expect no more precision or range than these
    provide, in the sense that implementations will approximate JSON
    numbers within the expected precision.  A JSON number such as 1E400
    or 3.141592653589793238462643383279 may indicate potential
    interoperability problems, since it suggests that the software that
    created it expects receiving software to have greater capabilities
    for numeric magnitude and precision than is widely available.

    Note that when such software is used, numbers that are integers and
    are in the range [-(2**53)+1, (2**53)-1] are interoperable in the
    sense that implementations will agree exactly on their numeric
    values.
</quote>
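
To make the quoted range concrete, here is a small Python sketch. The
function names are hypothetical, not from any library. It checks
whether an integer is inside the range the RFC calls interoperable,
and separately whether one particular integer happens to survive a
round trip through a binary64 float:

```python
# Largest magnitude in the RFC's interoperable range: (2**53) - 1.
MAX_INTEROPERABLE = 2**53 - 1


def is_interoperable_integer(n: int) -> bool:
    """True if n is inside [-(2**53)+1, (2**53)-1]."""
    return -MAX_INTEROPERABLE <= n <= MAX_INTEROPERABLE


def survives_binary64(n: int) -> bool:
    """True if converting n to a 64 bit float and back gives n again."""
    return int(float(n)) == n


for candidate in (2**53 - 1, 2**53, 2**53 + 1, 100000000000000001):
    print(candidate,
          is_interoperable_integer(candidate),
          survives_binary64(candidate))
```

Some integers outside the range (2**53 itself, for example) still
convert exactly; the range is simply where every integer is
guaranteed to.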

Blaming the error on limitations in the programming language
or the JSON library is a poor excuse.

The developer (+ architect + other senior technology decision makers)
are responsible for selecting a programming language and a
JSON library that meet business requirements.

Mistakes can be made in all sorts of programming languages
no matter the type system. But my guess is that it happens
more frequently when the type is not static and explicitly declared.
Some do not like to type in that type, but it literally makes the
type visible.

The fact that it ultimately is the developer's responsibility
to select proper data types does not mean that programming languages
and JSON libraries cannot help catch errors.

If it is obvious that an unexpected/useless result is being
produced then it should be flagged (return an error code or throw
an exception depending on technology).

Let us go back to the example with 100000000000000001.

Trying to stuff that into a 32 bit integer by, say, parsing
it as a 64 bit integer and returning the lower 32 bits
is in my opinion an error. Nobody wants to get an int
with the value 1569325057 from retrieving a 32 bit integer
from "100000000000000001". It should give an error.
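
A sketch of the two behaviours in Python. Python has no native 32 bit
integer type, so the truncation is emulated with a bit mask, and both
function names are made up for the illustration:

```python
INT32_MIN = -(2**31)
INT32_MAX = 2**31 - 1


def to_int32_truncating(text: str) -> int:
    """Keep only the lower 32 bits and reinterpret them as signed."""
    low = int(text) & 0xFFFFFFFF
    return low - 2**32 if low > INT32_MAX else low


def to_int32_checked(text: str) -> int:
    """Return the value only if it really fits in 32 bits."""
    value = int(text)
    if not INT32_MIN <= value <= INT32_MAX:
        raise OverflowError(f"{text} does not fit in a 32 bit integer")
    return value


print(to_int32_truncating("100000000000000001"))  # the useless result

try:
    to_int32_checked("100000000000000001")
except OverflowError as err:
    print("rejected:", err)
```

The checked variant costs one comparison and turns a silently wrong
number into an error the caller cannot overlook.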

The case with 64 bit floating point is more tricky. One
could argue that 100000000000000001.0 is the expected
result and that 100000000000000000.0 should be considered
an error. And it probably would be an error in the majority
of cases. But there is actually the possibility that
someone who understands floating point is reading the JSON,
expects what is happening and does not care because
there is some uncertainty in the underlying data. And
creating a false error for people who understand FP data
types to prevent those who do not understand FP data types
from shooting themselves in the foot is not good.
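
One possible compromise, sketched in Python under my own assumptions:
keep the lenient conversion as the default and make the exactness
check opt-in. The function to_float and its strict parameter are
hypothetical and not part of any JSON library:

```python
from decimal import Decimal


def to_float(text: str, strict: bool = False) -> float:
    """Convert text to a 64 bit float.

    With strict=True, raise ValueError when the float is not exactly
    the number written in the text.
    """
    value = float(text)
    # Decimal(value) is the exact value stored in the float, so the
    # comparison detects any rounding done by float().
    if strict and Decimal(text) != Decimal(value):
        raise ValueError(f"{text!r} is not exactly representable "
                         "as a 64 bit float")
    return value


print(to_float("100000000000000001"))  # lenient: rounded silently

try:
    to_float("100000000000000001", strict=True)
except ValueError as err:
    print("rejected:", err)
```

Note that the strict mode would also reject "0.1", which is not
exactly representable either. That shows how many false errors an
always-on check would create for people who know what they are doing.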

Arne