Update docs and usages regarding '-r' recursive option for git-diff-tree.
Remove '-r' from common diff options, mention it only for git-diff-tree.
Remove one extraneous use of '-r' with git-diff-files in get-merge.sh.
Sync the synopsis and usage string for git-diff-tree.
Signed-off-by: Chris Shoemaker <c.shoemaker at cox.net>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This makes the tree diff functionality independent of the "git-diff-tree"
program, by splitting the core functionality up into a library file.
This will be needed for when we teach git-rev-list to only follow a
specified set of pathnames, rather than the global revision history.
Most of it is a fairly straightforward code move, but it also involves
some calling convention cleanup, and moving some of the static variables
from diff-tree.c into the options structure.
The actual tree change callback routines also become paramterized by the
diff_options structure, allowing the library functionality to do something
else than just show the diff on stdout.
Right now the only user of this functionality remains git-diff-tree
itself.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
When we updated the marker for new files from 'N' to 'A', we forgot to
notice that the letter is already taken by the All-Or-None mark.
Change the All-Or-None marker to '*' to resolve this conflict.
git-diff-tree -r --diff-filter='R*' -M
shows all the changes (not just renames) that are contained in commits
that have renames, in comparison with:
git-diff-tree -r --diff-filter='R' -M
shows the same set of changes but the diff output are limited only to
renaming changes.
Signed-off-by: Junio C Hamano <junkio@cox.net>
When many paths are modified, rename detection takes a lot of time.
The new option -l<num> can be used to disable rename detection when
more than <num> paths are possibly created as renames.
Signed-off-by: Junio C Hamano <junkio@cox.net>
This is a long overdue clean-up to the code for parsing and passing
diff options. It also tightens some constness issues.
Signed-off-by: Junio C Hamano <junkio@cox.net>
The textual diff generation with built-in '-p' in diff-* brothers has
proven to be useful enough that git-diff-helper outlived its usefulness.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Both Cogito and StGIT prefer to see 'A' for new files. The
current 'N' is visually harder to distinguish from 'M', which is
used for modified files. Prepare the internals to use symbolic
constants to make the change easier.
Signed-off-by: Junio C Hamano <junkio@cox.net>
This removes the separate "formats" for name and name-with-zero-
termination.
It also removes the difference between HUMAN and MACHINE formats, and
they both become DIFF_FORMAT_RAW, with the difference being just in the
line and inter-filename termination.
It also makes the code easier to understand.
I got tired of maintaining almost duplicated descriptions in
diff-* brothers, both in usage string and documentation.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Porcelain layers often want to find only names of changed files,
and even with diff-raw output format they end up having to pick
out only the filename. Support --name-only (and --name-only-z
for xargs -0 and cpio -0 users that want to treat filenames with
embedded newlines sanely) flag to help them.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This is a halfway between debugging aid and a helper to write an
ultra-smart merge scripts. The new option takes a string that
consists of a list of "status" letters, and limits the diff
output to only those classes of changes, with two exceptions:
- A broken pair (aka "complete rewrite"), does not match D
(deleted) or N (created). Use B to look for them.
- The letter "A" in the diff-filter string does not match
anything itself, but causes the entire diff that contains
selected patches to be output (this behaviour is similar to
that of --pickaxe-all for the -S option).
For example,
$ git-rev-list HEAD |
git-diff-tree --stdin -s -v -B -C --diff-filter=BCR
shows a list of commits that have complete rewrite, copy, or
rename.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch updates diff documentation and usage strings:
- clarify the semantics of -R. It is not "output in reverse";
rather, it is "I will feed diff backwards". Semantically
they are different when -C is involved.
- describe -O in usage strings of diff-* brothers. It was
implemented, documented but not described in usage text.
Also it adds -O to diff-helper. Like -S (and unlike -M/-C/-B),
this option can work on sanitized diff-raw output produced by
the diff-* brothers. While we are at it, the call it makes to
diffcore is cleaned up to use the diffcore_std() like everybody
else, and the declaration for the low level diffcore routines
are moved from diff.h (public) to diffcore.h (private between
diff.c and diffcore backends).
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The core GIT repository has trees that record regular file mode
in 0664 instead of normalized 0644 pattern. Comparing such a
tree with another tree that records the same file in 0644
pattern without content changes with git-diff-tree causes it to
feed otherwise unmodified pairs to the diff_change() routine,
which triggers a sanity check routine and barfs. This patch
fixes the problem, along with the fix to another caller that
uses unnormalized mode bits to call diff_change() routine in a
similar way.
Without this patch, you will see "fatal error" from diff-tree
when you run git-deltafy-script on the core GIT repository
itself.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
A new diffcore filter diffcore-order is introduced. This takes
a text file each of whose line is a shell glob pattern. Patches
that match a glob pattern on an earlier line in the file are
output before patches that match a later line, and patches that
do not match any glob pattern are output last.
A typical orderfile for git project probably should look like
this:
README
Makefile
Documentation
*.h
*.c
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
A new diffcore transformation, diffcore-break.c, is introduced.
When the -B flag is given, a patch that represents a complete
rewrite is broken into a deletion followed by a creation. This
makes it easier to review such a complete rewrite patch.
The -B flag takes the same syntax as the -M and -C flags to
specify the minimum amount of non-source material the resulting
file needs to have to be considered a complete rewrite, and
defaults to 99% if not specified.
As the new test t4008-diff-break-rewrite.sh demonstrates, if a
file is a complete rewrite, it is broken into a delete/create
pair, which can further be subjected to the usual rename
detection if -M or -C is used. For example, if file0 gets
completely rewritten to make it as if it were rather based on
file1 which itself disappeared, the following happens:
The original change looks like this:
file0 --> file0' (quite different from file0)
file1 --> /dev/null
After diffcore-break runs, it would become this:
file0 --> /dev/null
/dev/null --> file0'
file1 --> /dev/null
Then diffcore-rename matches them up:
file1 --> file0'
The internal score values are finer grained now. Earlier
maximum of 10000 has been raised to 60000; there is no user
visible changes but there is no reason to waste available bits.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The three diff-* brothers had a sequence of calls into diffcore
that were almost identical. Introduce a new diffcore_std()
function that takes all the necessary arguments to consolidate
it. This will make later enhancements and changing the order of
diffcore application simpler.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This attempts to optimize "diff-tree -[CM] --stdin", which
compares successible tree pairs. This optimization does not
make much sense for other commands in the diff-* brothers.
When reading from --stdin and using rename/copy detection, the
patch makes diff-tree to read the current index file first.
This is done to reuse the optimization used by diff-cache in the
non-cached case. Similarity estimator can avoid expanding a
blob if the index says what is in the work tree has an exact
copy of that blob already expanded.
Another optimization the patch makes is to check only file sizes
first to terminate similarity estimation early. In order for
this to work, it needs a way to tell the size of the blob
without expanding it. Since an obvious way of doing it, which
is to keep all the blobs previously used in the memory, is too
costly, it does so by keeping the filesize for each object it
has already seen in memory.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
When --pickaxe-all is given in addition to -S, pickaxe shows the
entire diffs contained in the changeset, not just the diffs for
the filepair that touched the sought-after string. This is
useful to see the changes in context.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This changes the argument of diff_setup() from an integer that
says if we are feeding reversed diff to a bitmask, so that later
global options can be added more easily.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
For later stages to reorder patches, pruning logic and rename detection
logic should not decide which delete to discard (because another entry
said it will take over the file as a rename) until the very end.
Also fix some tests that were assuming the earlier "last one is rename
or keep everything else is copy" semantics of diff-raw format, which no
longer is true.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This changes the diff-raw format again, following the mailing
list discussion. The new format explicitly expresses which one
is a rename and which one is a copy.
The documentation and tests are updated to match this change.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This moves the path selection logic from individual programs to a new
diffcore transformer (diff-tree still needs to have its own for
performance reasons). Also the header printing code in diff-tree was
tweaked not to produce anything when pickaxe is in effect and there is
nothing interesting to report. An interesting example is the following
in the GIT archive itself:
$ git-whatchanged -p -C -S'or something in a real script'
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Update the diff-raw format as Linus and I discussed, except that
it does not use sequence of underscore '_' letters to express
nonexistence. All '0' mode is used for that purpose instead.
The new diff-raw format can express rename/copy, and the earlier
restriction that -M and -C _must_ be used with the patch format
output is no longer necessary. The patch makes -M and -C flags
independent of -p flag, so you need to say git-whatchanged -M -p
to get the diff/patch format.
Updated are both documentations and tests.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This does not actually supress the extra headers when pickaxe is
used, but prepares enough support for diff-tree to implement it.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This steals the "pickaxe" feature from JIT and make it available
to the bare Plumbing layer. From the command line, the user
gives a string he is intersted in.
Using the diff-core infrastructure previously introduced, it
filters the differences to limit the output only to the diffs
between <src> and <dst> where the string appears only in one but
not in the other. For example:
$ ./git-rev-list HEAD | ./git-diff-tree -Sdiff-tree-helper --stdin -M
would show the diffs that touch the string "diff-tree-helper".
In real software-archaeologist application, you would typically
look for a few to several lines of code and see where that code
came from.
The "pickaxe" module runs after "rename/copy detection" module,
so it even crosses the file rename boundary, as the above
example demonstrates.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This cleans up the way calls are made into the diff core from diff-tree
family and diff-helper. Earlier, these programs had "if
(generating_patch)" sprinkled all over the place, but those ugliness are
gone and handled uniformly from the diff core, even when not generating
patch format.
This also allowed diff-cache and diff-files to acquire -R
(reverse) option to generate diff in reverse. Users of
diff-tree can swap two trees easily so I did not add -R there.
[ Linus' note: I'll add -R to "diff-tree" too, since a "commit
diff" doesn't have another tree to switch around: the other
tree is always the parent(s) of the commit ]
Also -M<digits-as-mantissa> suggestion made by Linus has been
implemented.
Documentation updates are also included.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This rips out the rename detection engine from diff-helper and moves it
to the diff core, and updates the internal calling convention used by
diff-tree family into the diff core. In order to give the same option
name to diff-tree family as well as to diff-helper, I've changed the
earlier diff-helper '-r' option to '-M' (stands for Move; sorry but the
natural abbreviation 'r' for 'rename' is already taken for 'recursive').
Although I did a fair amount of test with the git-diff-tree with
existing rename commits in the core GIT repository, this should still be
considered beta (preview) release. This patch depends on the diff-delta
infrastructure just committed.
This implements almost everything I wanted to see in this series of
patch, except a few minor cleanups in the calling convention into diff
core, but that will be a separate cleanup patch.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch adds a framework and a stub implementation of rename
detection to diff-helper program.
The current stub code is just enough to detect pure renames in
diff-tree output and not fancier. The plan is perhaps to use
the same delta code when Nico's delta storage patch is merged
for similarity evaluation purposes.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
It used to be that diff-tree needed helper support to parse its
raw output to generate diffs, but these days git-diff-* family
produces the same output and the helper is not tied to diff-tree
anymore. Drop "tree" from its name.
This commit is done separately to record just the rename and no
file content changes. The changes in the renamed files are recorded
in the next commit.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Bundled with the changes in the unrenamed files.
Signed-off-by: Petr Baudis <pasky@ucw.cz>
This patch optimizes "diff-cache -p --cached" by avoiding to
inflate blobs into temporary files when the blob recorded in the
cache matches the corresponding file in the work tree. The file
in the work tree is passed as the comparison source in such a
case instead.
This optimization kicks in only when we have already read the
cache this optimization and this is deliberate. Especially,
diff-tree does not use this code, because changes are contained
in small number of files relative to the project size most of
the time, and reading cache is so expensive for a large project
that the cost of reading it outweighs the savings by not
inflating blobs.
Also this patch cleans up the structure passed from diff clients
by removing one unused structure member.
Signed-off-by: Junio C Hamano <junkio@cox.net>
This introduces three public functions for diff-cache and friends can
use to call out to the GIT_EXTERNAL_DIFF program when they wish to.
A normal "add/remove/change" entry is turned into 7-parameter process
invocation of GIT_EXTERNAL_DIFF program as before. In addition, the
program can now be called with a single parameter when diff-cache and
friends want to report an unmerged path.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This reworks the diff-tree-helper and show-diff to further make external
diff command interface simpler.
These commands now honor GIT_EXTERNAL_DIFF environment variable which
can point at an arbitrary program that takes 7 parameters:
name file1 file1-sha1 file1-mode file2 file2-sha1 file2-mode
The parameters for an external diff command are as follows:
name this invocation of the command is to emit diff
for the named cache/tree entry.
file1 pathname that holds the contents of the first
file. This can be a file inside the working
tree, or a temporary file created from the blob
object, or /dev/null. The command should not
attempt to unlink it -- the temporary is
unlinked by the caller.
file1-sha1 sha1 hash if file1 is a blob object, or "."
otherwise.
file1-mode mode bits for file1, or "." for a deleted file.
If GIT_EXTERNAL_DIFF environment variable is not set, the
default is to invoke diff with the set of parameters old
show-diff used to use. This built-in implementation honors the
GIT_DIFF_CMD and GIT_DIFF_OPTS environment variables as before.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
With this patch, the non-core'ish part of show-diff command that
invokes an external "diff" comand to obtain patches is split
into a separate file. The next patch will introduce a new
command, diff-tree-helper, which uses this common diff interface
to format diff-tree and diff-cache output into a patch form.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>