[Info-vax] source control and semantics (Re: Why so much Unix envy?)

Craig A. Berry craigberry at nospam.mac.com
Fri Sep 12 21:49:23 EDT 2014


On 9/12/14, 12:29 AM, Shark8 wrote:

> One particular instance that most tooling shows its inferiority is in
> how it handles source-code: plain text. Given what we *know* about
> handling program semantics it's shameful that most source-code
> repositories are whitespace sensitive, recording non-semantic changes at
> the same level of import as semantic changes. (i.e. something trivial
> and unimportant, like some guy with a preference for spaces instead of
> tabs for indentation altering the source to his style is recorded at the
> same level as altering an algorithm to fix/introduce an edge-case.)

Actually, most modern version control tools do not treat the source as
plain text, because at the most fundamental level they don't treat it as
text at all: it's just arbitrary content. This does, necessarily, mean
that any change to any content gets tracked, but that's actually a
well-thought-out separation of concerns. Embedding language-specific
semantics into the tool that tracks changes would be a significant layer
violation and a major step backwards.

Of course, most of these tools have facilities to check and/or massage
the input however you want at the point of recording content changes.
Just set up a pre-commit hook that runs the code through a parser,
normalizes style according to local guidelines, or enforces any other
policy you want. You can decide on a case-by-case basis whether the hook
automatically reduces the content to a standard form before recording
changes, or whether it rejects the commit and requires the programmer
to make corrections before proceeding.
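
To make the two options concrete, here is a minimal sketch in Python of
the logic such a hook might run. Everything in it is an assumption for
illustration: the function names (normalize, check, main), the tab width
of 4, and the idea that the hook is handed a list of file paths. How a
hook actually receives its input depends on the version control tool.

```python
import sys


def normalize(text: str, tab_width: int = 4) -> str:
    """Reduce content to a standard form (hypothetical local policy):
    expand tabs in the indentation and strip trailing whitespace."""
    lines = []
    for line in text.splitlines():
        body = line.lstrip(" \t")
        indent = line[: len(line) - len(body)].expandtabs(tab_width)
        lines.append((indent + body).rstrip())
    return ("\n".join(lines) + "\n") if lines else ""


def check(text: str) -> list[str]:
    """Return a list of policy violations; an empty list means accept."""
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        indent = line[: len(line) - len(line.lstrip(" \t"))]
        if "\t" in indent:
            problems.append(f"line {number}: tab in indentation")
        if line != line.rstrip():
            problems.append(f"line {number}: trailing whitespace")
    return problems


def main(paths) -> int:
    """Reject-style hook body: report violations, return an exit status."""
    failed = False
    for path in paths:
        with open(path, encoding="utf-8") as handle:
            for problem in check(handle.read()):
                print(f"{path}: {problem}", file=sys.stderr)
                failed = True
    return 1 if failed else 0
```

A hook script would pass the result of main() to sys.exit(), since a
non-zero exit status is the usual way for a hook to reject a commit. The
other option is to write normalize()'s output back to the file instead
of complaining.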

See, for example:

<http://hgbook.red-bean.com/read/handling-repository-events-with-hooks.html>

And also note that you can typically ignore whitespace-only changes when
looking through history, with options such as "git diff
--ignore-space-change".
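
As a rough illustration of what that kind of option does, here is a
sketch in Python of a comparison that treats any run of whitespace as
equivalent and ignores whitespace at the end of a line. It is not git's
implementation, just my reading of the documented behavior, and the
names canonical and differs_only_in_spacing are made up for the example.

```python
import re


def canonical(line: str) -> str:
    """Drop whitespace at line end and collapse every other run of
    whitespace to a single space."""
    return re.sub(r"\s+", " ", line.rstrip())


def differs_only_in_spacing(old: str, new: str) -> bool:
    """True when two versions differ, but only in amount of whitespace."""
    old_lines = [canonical(line) for line in old.splitlines()]
    new_lines = [canonical(line) for line in new.splitlines()]
    return old != new and old_lines == new_lines
```

With this, a tab-indented line and its space-indented twin compare
equal, while "a b" versus "ab" still counts as a real change.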

And finally, note that the difference between spaces and tabs can be
semantically significant, as it is in Python, where indentation defines
block structure.
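
A small sketch of the Python case, assuming Python 3: two sources that
look identical in an editor with 8-column tabs, one indented with spaces
only and one mixing a tab with spaces. The helper name compiles is
made up for the example.

```python
def compiles(source: str) -> bool:
    """Return False if Python rejects the source for inconsistent
    use of tabs and spaces in indentation."""
    try:
        compile(source, "<example>", "exec")
    except TabError:
        return False
    return True


# Both bodies line up visually when a tab is shown as 8 columns.
spaces_only = "if True:\n        x = 1\n        y = 2\n"
mixed = "if True:\n\tx = 1\n        y = 2\n"

print(compiles(spaces_only))
print(compiles(mixed))
```

Python 3 accepts the first and refuses the second, so a "whitespace
only" edit can turn a working file into one that does not compile.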


