> Author doesn’t explain what happened or why the proposed flags will solve the problem.
Probably because she/he doesn't know. It could be lots of things; FYI, mtime can be modified by the user. Go `touch` a file.
In all likelihood, it happens because of a package installation, where a package install sets the same mtime on a file which has the same size but different file contents. That's where I usually see it.
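A minimal sketch of that situation (GNU coreutils assumed; the filenames are made up): two files whose size and mtime match exactly while the bytes differ.

```sh
# Create two same-sized files with different contents...
printf 'old contents\n' > a.conf
printf 'new contents\n' > b.conf
# ...and force identical modify times
touch -d '2024-01-01 00:00:00' a.conf b.conf

stat -c '%n  %s bytes  %y' a.conf b.conf   # metadata looks identical
cmp a.conf b.conf                          # ...but the contents differ
```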
`httm` allows one to dedup snapshot versions by size, then hash the contents of identical sized versions for this very reason.
> `--dedup-by[=<DEDUP_BY>]`
>
> comparing file versions solely on the basis of size and modify time (the default "metadata" behavior) may return what appear to be "false positives". This is because metadata, specifically modify time and size, is not a precise measure of whether a file has actually changed. A program might overwrite a file with the same contents, and/or a user can simply update the modify time via 'touch'. When specified with the "contents" option, httm compares the actual file contents of same-sized file versions, overriding the default "metadata" only behavior...
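On the command line that looks roughly like this (the path below is only an illustration, not from the thread):

```sh
# Default "metadata" behavior: snapshot versions deduped by size + modify time only
httm /etc/ssl/openssl.cnf

# "contents": compare the actual bytes of same-sized versions instead of trusting mtime
httm --dedup-by=contents /etc/ssl/openssl.cnf
```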
It is worth noting that rsync doesn't compare just by size and mtime but also by (relative) path - i.e. it normally compares an old copy of a file with the current version of the same file. So the likelihood of "collisions" is much smaller than with a file-deduplicating tool that compares random files.
I think you may misunderstand what httm does. httm prints the size, date and corresponding locations of available unique versions of files residing on snapshots.
And this makes it quite effective at proving just how often this happens.
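A hedged sketch of that (the path is hypothetical, and this assumes httm's default listing prints one line per version, so `wc -l` is only a rough count):

```sh
# Versions as judged by metadata (size + mtime) alone
httm /usr/lib/libexample.so | wc -l

# Versions after comparing the contents of same-sized versions
httm --dedup-by=contents /usr/lib/libexample.so | wc -l

# If the two counts differ, metadata alone misjudged at least that many
# "versions" of this one file
```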