[Info-vax] DCL DIFF algorithm

glen herrmannsfeldt gah at ugcs.caltech.edu
Tue Nov 9 03:06:28 EST 2010


DTL <didier.morandi at gmail.com> wrote:
 
> Who could give me (discretely or not) the algorithm used by the DCL
> Differences verb, please?
> I'd like to code it in VBScript but I have no clue on how to start.
> Compare lines one by one will break the logic as soon as the first
> different lines are encountered. All next lines will obviously be
> because of the shift. This is this shift concern I wish to be
> documented on.

The unix diff command uses the dynamic programming algorithms
that originated for comparing protein sequences in biology.

Having not thought about this for a while, I believe that they
generate a hash value for each line, then run the dynamic
programming algorithm on the hash values.  That finds the
optimal set if insertions, substitutions, and deletions to
go from on to the other.

I don't know why DCL would do it differently.  The algorithms
were well known.  The main one is the Needleman-Wunsch algorithm,
first published in 1970.  (That is, some years before VMS 1.0.)

-- glen



More information about the Info-vax mailing list