Add new item text to struct diff_options.
If set then do not try to detect binary files.
Signed-off-by: Stephan Feder <sf@b-i-t.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
* th/diff:
builtin-diff: turn recursive on when defaulting to --patch format.
t4013: note improvements brought by the new output code.
t4013: add format-patch tests.
format-patch: fix diff format option implementation
combine-diff.c: type sanity.
t4013 test updates for new output code.
Fix some more diff options changes.
Fix diff-tree -s
log --raw: Don't descend into subdirectories by default
diff-tree: Use ---\n as a message separator
Print empty line between raw, stat, summary and patch
t4013: add more tests around -c and --cc
whatchanged: Default to DIFF_FORMAT_RAW
Don't xcalloc() struct diffstat_t
Add msg_sep to diff_options
DIFF_FORMAT_RAW is not default anymore
Set default diff output format after parsing command line
Make --raw option available for all diff commands
Merge with_raw, with_stat and summary variables to output_format
t4013: add tests for diff/log family output options.
Add msg_sep variable to struct diff_options. msg_sep is printed after
commit message. Default is "\n", format-patch sets it to "---\n".
This also removes the second argument from show_log() because all
callers derived it from the first argument:
show_log(rev, rev->loginfo, ...
Signed-off-by: Timo Hirvonen <tihirvon@gmail.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
DIFF_FORMAT_* are now bit-flags instead of enumerated values.
Signed-off-by: Timo Hirvonen <tihirvon@gmail.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Call it like this:
unsigned char id[20];
if (diff_flush_patch_id(diff_options, id))
printf("And the patch id is: %s\n", sha1_to_hex(id));
Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This adds -b (--ignore-space-change) and -w (--ignore-all-space) flags to
diff. The main part of the patch is teaching libxdiff about it.
[jc: renamed xdl_line_match() to xdl_recmatch() since the former is used
for different purposes in xpatchi.c which is in the parts of the upstream
source we do not use.]
Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This patch is a slightly adjusted version of Junio's patch:
http://www.gelato.unsw.edu.au/archives/git/0604/19354.html
However, instead of using a config variable, this patch makes it available
as a diff option.
Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This patch touches a couple of files, because it adds options to print a
custom text just after the subject of a commit, and just after the
diffstat.
[jc: made "many dashes" used as the boundary leader into a single
variable, to reduce the possibility of later tweaks to miscount the
number of dashes to break it.]
Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Actually, it is a diff option now, so you can say
git diff --check
to ask if what you are about to commit is a good patch.
[jc: this also would work for fmt-patch, but the point is that
the check is done before making a commit. format-patch is run
from an already created commit, and that is too late to catch
whitespace damaged change.]
Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Remove the need to pipe git diff through git apply to
get the extended headers summary.
Signed-off-by: Sean Estabrooks <seanlkml@sympatico.ca>
Signed-off-by: Junio C Hamano <junkio@cox.net>
We used to parse "-U" and "--unified" as part of the GIT_DIFF_OPTS
environment variable, but strangely enough we would _not_ parse them as
part of the normal diff command line (where we only accepted "-u").
This adds parsing of -U and --unified, both with an optional numeric
argument. So now you can just say
git diff --unified=5
to get a unified diff with a five-line context, instead of having to do
something silly like
GIT_DIFF_OPTS="--unified=5" git diff -u
(that silly format does continue to still work, of course).
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This updates the user interface and generated diff data format.
* "diff --binary" is used to signal that we want an e-mailable
binary patch. It implies --full-index and -p.
* "apply --allow-binary-replacement" acquired a short synonym
"apply --binary".
* After the "GIT binary patch\n" header line there is a token
to record which binary patch mechanism was used, so that we
can extend it later. Currently there are two mechanisms
defined: "literal" and "delta". The former records the
deflated postimage and the latter records the deflated delta
from the preimage to postimage.
For purely implementation convenience, I added the deflated
length after these "literal/delta" tokens (otherwise the
decoding side needs to guess and reallocate the buffer while
inflating). Improvement patches are very welcomed.
Signed-off-by: Junio C Hamano <junkio@cox.net>
"git diff(n)" without --base, --ours, etc. defaults to --cc,
which usually is the same as -p unless you are in the middle of
a conflicted merge, just like the shell script version.
"git diff(n) blobA blobB path" complains and dies.
"git diff(n) tree0 tree1 tree2...treeN" does combined diff that
shows a merge of tree1..treeN to result in tree0.
Giving "-c" option to any command that defaults to "--cc" turns
off dense-combined flag.
Signed-off-by: Junio C Hamano <junkio@cox.net>
"diff-index -m" does not mean "do not ignore merges", but means
"pretend missing files match the index".
The previous round tried to address this, but failed because
setup_revisions() ate "-m" flag before the caller had a chance
to intervene.
Signed-off-by: Junio C Hamano <junkio@cox.net>
The second installment to libify diff brothers. The pathname
arguments are checked more strictly than before because we now
use the revision.c::setup_revisions() infrastructure.
Signed-off-by: Junio C Hamano <junkio@cox.net>
This is the first installment to libify diff brothers.
The updated diff-files uses revision.c::setup_revisions()
infrastructure to parse its command line arguments, which means
the pathname arguments are checked more strictly than before.
The tests are adjusted to separate possibly missing paths from
the rest of arguments with double-dashes, to show the kosher
way.
As Linus pointed out, renaming diff.c to diff-lib.c was simply
stupid, so I am renaming it back. The new diff-lib.c is to
contain pieces extracted from diff brothers.
Signed-off-by: Junio C Hamano <junkio@cox.net>
On Sun, 16 Apr 2006, Junio C Hamano wrote:
>
> In the mid-term, I am hoping we can drop the generate_header()
> callchain _and_ the custom code that formats commit log in-core,
> found in cmd_log_wc().
Ok, this was nastier than expected, just because the dependencies between
the different log-printing stuff were absolutely _everywhere_, but here's
a patch that does exactly that.
The patch is not very easy to read, and the "--patch-with-stat" thing is
still broken (it does not call the "show_log()" thing properly for
merges). That's not a new bug. In the new world order it _should_ do
something like
if (rev->logopt)
show_log(rev, rev->logopt, "---\n");
but it doesn't. I haven't looked at the --with-stat logic, so I left it
alone.
That said, this patch removes more lines than it adds, and in particular,
the "cmd_log_wc()" loop is now a very clean:
while ((commit = get_revision(rev)) != NULL) {
log_tree_commit(rev, commit);
free(commit->buffer);
commit->buffer = NULL;
}
so it doesn't get much prettier than this. All the complexity is entirely
hidden in log-tree.c, and any code that needs to flush the log literally
just needs to do the "if (rev->logopt) show_log(...)" incantation.
I had to make the combined_diff() logic take a "struct rev_info" instead
of just a "struct diff_options", but that part is pretty clean.
This does change "git whatchanged" from using "diff-tree" as the commit
descriptor to "commit", and I changed one of the tests to reflect that new
reality. Otherwise everything still passes, and my other tests look fine
too.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
With this option, git prepends a diffstat in front of the patch.
Since I really, really do not know what a diffstat of a combined diff
("merge diff") should look like, the diffstat is not generated for these.
Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Now, you can say "git diff --stat" (to get an idea how many changes are
uncommitted), or "git log --stat".
Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
The way tree-diff was set up assumed we would use only one set
of pathspec during the entire life of the program. Move the
pathspec related static variables out to diff_options structure
so that we can filter commits with one set of paths while show
the actual diffs using different set of paths.
I suspect this breaks blame.c, and makes "git log paths..." to
default to the --full-diff, the latter of which is dealt with
the next commit.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Nobody except diff-stages used it -- the callers instead filtered
the input to diffcore themselves. Make diff-stages do that as
well and retire diffcore-pathspec.
Signed-off-by: Junio C Hamano <junkio@cox.net>
git-diff-* --pickaxe-regex will change the -S pickaxe to match
POSIX extended regular expressions instead of fixed strings.
The regex.h library is a rather stupid interface and I like pcre too, but
with any luck it will be everywhere we will want to run Git on, it being
POSIX.2 and all. I'm not sure if we can expect platforms like AIX to
conform to POSIX.2 or if win32 has regex.h. We might add a flag to
Makefile if there is a portability trouble potential.
Signed-off-by: Petr Baudis <pasky@suse.cz>
Introduce tree-walk.[ch] and move "struct tree_desc" and
associated functions from various places.
Rename DIFF_FILE_CANON_MODE(mode) macro to canon_mode(mode) and
move it to cache.h. This macro returns the canonicalized
st_mode value in the host byte order for files, symlinks and
directories -- to be compared with a tree_desc entry.
create_ce_mode(mode) in cache.h is similar but is intended to be
used for index entries (so it does not work for directories) and
returns the value in the network byte order.
Signed-off-by: Junio C Hamano <junkio@cox.net>
This shows "new file mode XXXX" and "deleted file mode XXXX"
lines like two-way diff-patch output does, by checking the
status from each parent.
The diff-raw output for combined diff is made a bit uglier by
showing diff status letters with each parent. While most of the
case you would see "MM" in the output, an Evil Merge that
touches a path that was added by inheriting from one parent is
possible and it would be shown like these:
$ git-diff-tree --abbrev -c HEAD
2d7ca89675eb8888b0b88a91102f096d4471f09f
::000000 000000 100644 0000000... 0000000... 31dd686... AA b
::000000 100644 100644 0000000... 6c884ae... c6d4fa8... AM d
::100644 100644 100644 4f7cbe7... f8c295c... 19d5d80... RR e
Signed-off-by: Junio C Hamano <junkio@cox.net>
This way, diff-files can make use of it. Also implement the
full suite of what diff_flush_raw() supports just for
consistency. With this, 'diff-tree -c -r --name-status' would
show what is expected.
There is no way to get the historical output (useful for
debugging and low-level Plumbing work) anymore, so tentatively
it makes '-m' to mean "do not combine and show individual diffs
with parents".
diff-files matches diff-tree to produce raw output for -c. For
textual combined diff, use -p -c.
Signed-off-by: Junio C Hamano <junkio@cox.net>
NOTE! This makes "-c" be the default, which effectively means that merges
are never ignored any more, and "-m" is a no-op. So it changes semantics.
I would also like to make "--cc" the default if you do patches, but didn't
actually do that.
The raw output format is not wonderfully pretty, but it's distinguishable
from a "normal patch" in that a normal patch with just one parent has just
one colon at the beginning, while a multi-parent raw diff has <n> colons
for <n> parents.
So now, in the kernel, when you do
git-diff-tree cce0cac125623f9b68f25dd1350f6d616220a8dd
(to see the manual ARM merge that had a conflict in arch/arm/Kconfig), you
get
cce0cac125623f9b68f25dd1350f6d616220a8dd
::100644 100644 100644 4a63a8e2e45247a11c068c6ed66c6e7aba29ddd9 77eee38762d69d3de95ae45dd9278df9b8225e2c 2f61726d2f4b636f6e66696700dbf71a59dad287 arch/arm/Kconfig
ie you see two colons (two parents), then three modes (parent modes
followed by result mode), then three sha1s (parent sha1s followed by
result sha1).
Which is pretty close to the normal raw diff output.
Cool/stupid exercise:
$ git-whatchanged | grep '^::' | cut -f2- | sort |
uniq -c | sort -n | less -S
will show which files have needed the most file-level merge conflict
resolution. Useful? Probably not. But kind of interesting.
For the kernel, it's
....
10 arch/ia64/Kconfig
11 drivers/scsi/Kconfig
12 drivers/net/Makefile
17 include/linux/libata.h
18 include/linux/pci_ids.h
23 drivers/net/Kconfig
24 drivers/scsi/libata-scsi.c
28 drivers/scsi/libata-core.c
43 MAINTAINERS
Signed-off-by: Junio C Hamano <junkio@cox.net>
We have operations to "extract" and "update" a "struct tree_desc", but we
only used them in tree-diff.c and they were static to that file.
But other tree traversal functions can use them to their advantage
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
The minimum length of abbreviated object name was hardcoded in
different places to be 4, risking inconsistencies in the future.
Also there were three different "default abbreviation
precision". Use two C preprocessor symbols to clean up this
mess.
Signed-off-by: Junio C Hamano <junkio@cox.net>
This ports the "combined diff" to diff-files so that differences
to the working tree files since stage 2 and stage 3 are shown
the same way as combined diff output from diff-tree for the
merge commit would be shown if the current working tree files
are committed.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Building on the previous '-c' (combined) option, '--cc' option
squelches the output further by omitting hunks that consist of
difference with solely one parent.
Signed-off-by: Junio C Hamano <junkio@cox.net>
A new option '-c' to diff-tree changes the way a merge commit is
displayed when generating a patch output. It shows a "combined
diff" (hence the option letter 'c'), which looks like this:
$ git-diff-tree --pretty -c -p fec9ebf1 | head -n 18
diff-tree fec9ebf... (from parents)
Merge: 0620db3... 8a263ae...
Author: Junio C Hamano <junkio@cox.net>
Date: Sun Jan 15 22:25:35 2006 -0800
Merge fixes up to GIT 1.1.3
diff --combined describe.c
@@@ +98,7 @@@
return (a_date > b_date) ? -1 : (a_date == b_date) ? 0 : 1;
}
- static void describe(char *arg)
- static void describe(struct commit *cmit, int last_one)
++ static void describe(char *arg, int last_one)
{
+ unsigned char sha1[20];
+ struct commit *cmit;
There are a few things to note about this feature:
- The '-c' option implies '-p'. It also implies '-m' halfway
in the sense that "interesting" merges are shown, but not all
merges.
- When a blob matches one of the parents, we do not show a diff
for that path at all. For a merge commit, this option shows
paths with real file-level merge (aka "interesting things").
- As a concequence of the above, an "uninteresting" merge is
not shown at all. You can use '-m' in addition to '-c' to
show the commit log for such a merge, but there will be no
combined diff output.
- Unlike "gitk", the output is monochrome.
A '-' character in the nth column means the line is from the nth
parent and does not appear in the merge result (i.e. removed
from that parent's version).
A '+' character in the nth column means the line appears in the
merge result, and the nth parent does not have that line
(i.e. added by the merge itself or inherited from another
parent).
The above example output shows that the function signature was
changed from either parents (hence two "-" lines and a "++"
line), and "unsigned char sha1[20]", prefixed by a " +", was
inherited from the first parent.
The code as sent to the list was buggy in few corner cases,
which I have fixed since then.
It does not bother to keep track of and show the line numbers
from parent commits, which it probably should.
Signed-off-by: Junio C Hamano <junkio@cox.net>
When I show transcripts to explain how something works, I often
find myself hand-editing the diff-raw output to shorten various
object names in the output.
This adds --abbrev option to the diff family, which shortens
diff-raw output and diff-tree commit id headers.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Otherwise we would end up linking all the unneeded stuff into git-daemon
only to link with git_default_config.
Signed-off-by: Junio C Hamano <junkio@cox.net>
A new option, --full-index, is introduced to diff family. This
causes the full object name of pre- and post-images to appear on
the index line of patch formatted output, to be used in
conjunction with --allow-binary-replacement option of git-apply.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Update docs and usages regarding '-r' recursive option for git-diff-tree.
Remove '-r' from common diff options, mention it only for git-diff-tree.
Remove one extraneous use of '-r' with git-diff-files in get-merge.sh.
Sync the synopsis and usage string for git-diff-tree.
Signed-off-by: Chris Shoemaker <c.shoemaker at cox.net>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This makes the tree diff functionality independent of the "git-diff-tree"
program, by splitting the core functionality up into a library file.
This will be needed for when we teach git-rev-list to only follow a
specified set of pathnames, rather than the global revision history.
Most of it is a fairly straightforward code move, but it also involves
some calling convention cleanup, and moving some of the static variables
from diff-tree.c into the options structure.
The actual tree change callback routines also become paramterized by the
diff_options structure, allowing the library functionality to do something
else than just show the diff on stdout.
Right now the only user of this functionality remains git-diff-tree
itself.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
When we updated the marker for new files from 'N' to 'A', we forgot to
notice that the letter is already taken by the All-Or-None mark.
Change the All-Or-None marker to '*' to resolve this conflict.
git-diff-tree -r --diff-filter='R*' -M
shows all the changes (not just renames) that are contained in commits
that have renames, in comparison with:
git-diff-tree -r --diff-filter='R' -M
shows the same set of changes but the diff output are limited only to
renaming changes.
Signed-off-by: Junio C Hamano <junkio@cox.net>
When many paths are modified, rename detection takes a lot of time.
The new option -l<num> can be used to disable rename detection when
more than <num> paths are possibly created as renames.
Signed-off-by: Junio C Hamano <junkio@cox.net>
This is a long overdue clean-up to the code for parsing and passing
diff options. It also tightens some constness issues.
Signed-off-by: Junio C Hamano <junkio@cox.net>
The textual diff generation with built-in '-p' in diff-* brothers has
proven to be useful enough that git-diff-helper outlived its usefulness.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Both Cogito and StGIT prefer to see 'A' for new files. The
current 'N' is visually harder to distinguish from 'M', which is
used for modified files. Prepare the internals to use symbolic
constants to make the change easier.
Signed-off-by: Junio C Hamano <junkio@cox.net>
This removes the separate "formats" for name and name-with-zero-
termination.
It also removes the difference between HUMAN and MACHINE formats, and
they both become DIFF_FORMAT_RAW, with the difference being just in the
line and inter-filename termination.
It also makes the code easier to understand.
I got tired of maintaining almost duplicated descriptions in
diff-* brothers, both in usage string and documentation.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Porcelain layers often want to find only names of changed files,
and even with diff-raw output format they end up having to pick
out only the filename. Support --name-only (and --name-only-z
for xargs -0 and cpio -0 users that want to treat filenames with
embedded newlines sanely) flag to help them.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This is a halfway between debugging aid and a helper to write an
ultra-smart merge scripts. The new option takes a string that
consists of a list of "status" letters, and limits the diff
output to only those classes of changes, with two exceptions:
- A broken pair (aka "complete rewrite"), does not match D
(deleted) or N (created). Use B to look for them.
- The letter "A" in the diff-filter string does not match
anything itself, but causes the entire diff that contains
selected patches to be output (this behaviour is similar to
that of --pickaxe-all for the -S option).
For example,
$ git-rev-list HEAD |
git-diff-tree --stdin -s -v -B -C --diff-filter=BCR
shows a list of commits that have complete rewrite, copy, or
rename.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch updates diff documentation and usage strings:
- clarify the semantics of -R. It is not "output in reverse";
rather, it is "I will feed diff backwards". Semantically
they are different when -C is involved.
- describe -O in usage strings of diff-* brothers. It was
implemented, documented but not described in usage text.
Also it adds -O to diff-helper. Like -S (and unlike -M/-C/-B),
this option can work on sanitized diff-raw output produced by
the diff-* brothers. While we are at it, the call it makes to
diffcore is cleaned up to use the diffcore_std() like everybody
else, and the declaration for the low level diffcore routines
are moved from diff.h (public) to diffcore.h (private between
diff.c and diffcore backends).
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The core GIT repository has trees that record regular file mode
in 0664 instead of normalized 0644 pattern. Comparing such a
tree with another tree that records the same file in 0644
pattern without content changes with git-diff-tree causes it to
feed otherwise unmodified pairs to the diff_change() routine,
which triggers a sanity check routine and barfs. This patch
fixes the problem, along with the fix to another caller that
uses unnormalized mode bits to call diff_change() routine in a
similar way.
Without this patch, you will see "fatal error" from diff-tree
when you run git-deltafy-script on the core GIT repository
itself.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
A new diffcore filter diffcore-order is introduced. This takes
a text file each of whose line is a shell glob pattern. Patches
that match a glob pattern on an earlier line in the file are
output before patches that match a later line, and patches that
do not match any glob pattern are output last.
A typical orderfile for git project probably should look like
this:
README
Makefile
Documentation
*.h
*.c
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
A new diffcore transformation, diffcore-break.c, is introduced.
When the -B flag is given, a patch that represents a complete
rewrite is broken into a deletion followed by a creation. This
makes it easier to review such a complete rewrite patch.
The -B flag takes the same syntax as the -M and -C flags to
specify the minimum amount of non-source material the resulting
file needs to have to be considered a complete rewrite, and
defaults to 99% if not specified.
As the new test t4008-diff-break-rewrite.sh demonstrates, if a
file is a complete rewrite, it is broken into a delete/create
pair, which can further be subjected to the usual rename
detection if -M or -C is used. For example, if file0 gets
completely rewritten to make it as if it were rather based on
file1 which itself disappeared, the following happens:
The original change looks like this:
file0 --> file0' (quite different from file0)
file1 --> /dev/null
After diffcore-break runs, it would become this:
file0 --> /dev/null
/dev/null --> file0'
file1 --> /dev/null
Then diffcore-rename matches them up:
file1 --> file0'
The internal score values are finer grained now. Earlier
maximum of 10000 has been raised to 60000; there is no user
visible changes but there is no reason to waste available bits.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The three diff-* brothers had a sequence of calls into diffcore
that were almost identical. Introduce a new diffcore_std()
function that takes all the necessary arguments to consolidate
it. This will make later enhancements and changing the order of
diffcore application simpler.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This attempts to optimize "diff-tree -[CM] --stdin", which
compares successible tree pairs. This optimization does not
make much sense for other commands in the diff-* brothers.
When reading from --stdin and using rename/copy detection, the
patch makes diff-tree to read the current index file first.
This is done to reuse the optimization used by diff-cache in the
non-cached case. Similarity estimator can avoid expanding a
blob if the index says what is in the work tree has an exact
copy of that blob already expanded.
Another optimization the patch makes is to check only file sizes
first to terminate similarity estimation early. In order for
this to work, it needs a way to tell the size of the blob
without expanding it. Since an obvious way of doing it, which
is to keep all the blobs previously used in the memory, is too
costly, it does so by keeping the filesize for each object it
has already seen in memory.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
When --pickaxe-all is given in addition to -S, pickaxe shows the
entire diffs contained in the changeset, not just the diffs for
the filepair that touched the sought-after string. This is
useful to see the changes in context.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This changes the argument of diff_setup() from an integer that
says if we are feeding reversed diff to a bitmask, so that later
global options can be added more easily.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
For later stages to reorder patches, pruning logic and rename detection
logic should not decide which delete to discard (because another entry
said it will take over the file as a rename) until the very end.
Also fix some tests that were assuming the earlier "last one is rename
or keep everything else is copy" semantics of diff-raw format, which no
longer is true.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This changes the diff-raw format again, following the mailing
list discussion. The new format explicitly expresses which one
is a rename and which one is a copy.
The documentation and tests are updated to match this change.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This moves the path selection logic from individual programs to a new
diffcore transformer (diff-tree still needs to have its own for
performance reasons). Also the header printing code in diff-tree was
tweaked not to produce anything when pickaxe is in effect and there is
nothing interesting to report. An interesting example is the following
in the GIT archive itself:
$ git-whatchanged -p -C -S'or something in a real script'
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Update the diff-raw format as Linus and I discussed, except that
it does not use sequence of underscore '_' letters to express
nonexistence. All '0' mode is used for that purpose instead.
The new diff-raw format can express rename/copy, and the earlier
restriction that -M and -C _must_ be used with the patch format
output is no longer necessary. The patch makes -M and -C flags
independent of -p flag, so you need to say git-whatchanged -M -p
to get the diff/patch format.
Updated are both documentations and tests.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This does not actually supress the extra headers when pickaxe is
used, but prepares enough support for diff-tree to implement it.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This steals the "pickaxe" feature from JIT and make it available
to the bare Plumbing layer. From the command line, the user
gives a string he is intersted in.
Using the diff-core infrastructure previously introduced, it
filters the differences to limit the output only to the diffs
between <src> and <dst> where the string appears only in one but
not in the other. For example:
$ ./git-rev-list HEAD | ./git-diff-tree -Sdiff-tree-helper --stdin -M
would show the diffs that touch the string "diff-tree-helper".
In real software-archaeologist application, you would typically
look for a few to several lines of code and see where that code
came from.
The "pickaxe" module runs after "rename/copy detection" module,
so it even crosses the file rename boundary, as the above
example demonstrates.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This cleans up the way calls are made into the diff core from diff-tree
family and diff-helper. Earlier, these programs had "if
(generating_patch)" sprinkled all over the place, but those ugliness are
gone and handled uniformly from the diff core, even when not generating
patch format.
This also allowed diff-cache and diff-files to acquire -R
(reverse) option to generate diff in reverse. Users of
diff-tree can swap two trees easily so I did not add -R there.
[ Linus' note: I'll add -R to "diff-tree" too, since a "commit
diff" doesn't have another tree to switch around: the other
tree is always the parent(s) of the commit ]
Also -M<digits-as-mantissa> suggestion made by Linus has been
implemented.
Documentation updates are also included.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This rips out the rename detection engine from diff-helper and moves it
to the diff core, and updates the internal calling convention used by
diff-tree family into the diff core. In order to give the same option
name to diff-tree family as well as to diff-helper, I've changed the
earlier diff-helper '-r' option to '-M' (stands for Move; sorry but the
natural abbreviation 'r' for 'rename' is already taken for 'recursive').
Although I did a fair amount of test with the git-diff-tree with
existing rename commits in the core GIT repository, this should still be
considered beta (preview) release. This patch depends on the diff-delta
infrastructure just committed.
This implements almost everything I wanted to see in this series of
patch, except a few minor cleanups in the calling convention into diff
core, but that will be a separate cleanup patch.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch adds a framework and a stub implementation of rename
detection to diff-helper program.
The current stub code is just enough to detect pure renames in
diff-tree output and not fancier. The plan is perhaps to use
the same delta code when Nico's delta storage patch is merged
for similarity evaluation purposes.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
It used to be that diff-tree needed helper support to parse its
raw output to generate diffs, but these days git-diff-* family
produces the same output and the helper is not tied to diff-tree
anymore. Drop "tree" from its name.
This commit is done separately to record just the rename and no
file content changes. The changes in the renamed files are recorded
in the next commit.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Bundled with the changes in the unrenamed files.
Signed-off-by: Petr Baudis <pasky@ucw.cz>
This patch optimizes "diff-cache -p --cached" by avoiding to
inflate blobs into temporary files when the blob recorded in the
cache matches the corresponding file in the work tree. The file
in the work tree is passed as the comparison source in such a
case instead.
This optimization kicks in only when we have already read the
cache this optimization and this is deliberate. Especially,
diff-tree does not use this code, because changes are contained
in small number of files relative to the project size most of
the time, and reading cache is so expensive for a large project
that the cost of reading it outweighs the savings by not
inflating blobs.
Also this patch cleans up the structure passed from diff clients
by removing one unused structure member.
Signed-off-by: Junio C Hamano <junkio@cox.net>
This introduces three public functions for diff-cache and friends can
use to call out to the GIT_EXTERNAL_DIFF program when they wish to.
A normal "add/remove/change" entry is turned into 7-parameter process
invocation of GIT_EXTERNAL_DIFF program as before. In addition, the
program can now be called with a single parameter when diff-cache and
friends want to report an unmerged path.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This reworks the diff-tree-helper and show-diff to further make external
diff command interface simpler.
These commands now honor GIT_EXTERNAL_DIFF environment variable which
can point at an arbitrary program that takes 7 parameters:
name file1 file1-sha1 file1-mode file2 file2-sha1 file2-mode
The parameters for an external diff command are as follows:
name this invocation of the command is to emit diff
for the named cache/tree entry.
file1 pathname that holds the contents of the first
file. This can be a file inside the working
tree, or a temporary file created from the blob
object, or /dev/null. The command should not
attempt to unlink it -- the temporary is
unlinked by the caller.
file1-sha1 sha1 hash if file1 is a blob object, or "."
otherwise.
file1-mode mode bits for file1, or "." for a deleted file.
If GIT_EXTERNAL_DIFF environment variable is not set, the
default is to invoke diff with the set of parameters old
show-diff used to use. This built-in implementation honors the
GIT_DIFF_CMD and GIT_DIFF_OPTS environment variables as before.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
With this patch, the non-core'ish part of show-diff command that
invokes an external "diff" comand to obtain patches is split
into a separate file. The next patch will introduce a new
command, diff-tree-helper, which uses this common diff interface
to format diff-tree and diff-cache output into a patch form.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>