Gitk wants to use git-diff-tree as a filter to tell it which ids from
a given list affect a set of files or directories. We don't want to
fork and exec a new git-diff-tree process for each batch of ids, since
there could be a lot of relatively small batches. For example, a
batch could contain as many ids as fit in gitk's headline display
window, i.e. 20 or so, and we would be processing a new batch every
time the user scrolls that window.
The --stdin flag to git-diff-tree is suitable for this, but the main
difficulty is that the output of git-diff-tree gets buffered and
doesn't get sent until the buffer is full.
This provides a way to get git-diff-tree to flush its output buffers.
If a blank line is supplied on git-diff-tree's standard input, it will
flush its output buffers and then accept further input.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
On Sun, 16 Apr 2006, Junio C Hamano wrote:
>
> In the mid-term, I am hoping we can drop the generate_header()
> callchain _and_ the custom code that formats commit log in-core,
> found in cmd_log_wc().
Ok, this was nastier than expected, just because the dependencies between
the different log-printing stuff were absolutely _everywhere_, but here's
a patch that does exactly that.
The patch is not very easy to read, and the "--patch-with-stat" thing is
still broken (it does not call the "show_log()" thing properly for
merges). That's not a new bug. In the new world order it _should_ do
something like
if (rev->logopt)
show_log(rev, rev->logopt, "---\n");
but it doesn't. I haven't looked at the --with-stat logic, so I left it
alone.
That said, this patch removes more lines than it adds, and in particular,
the "cmd_log_wc()" loop is now a very clean:
while ((commit = get_revision(rev)) != NULL) {
log_tree_commit(rev, commit);
free(commit->buffer);
commit->buffer = NULL;
}
so it doesn't get much prettier than this. All the complexity is entirely
hidden in log-tree.c, and any code that needs to flush the log literally
just needs to do the "if (rev->logopt) show_log(...)" incantation.
I had to make the combined_diff() logic take a "struct rev_info" instead
of just a "struct diff_options", but that part is pretty clean.
This does change "git whatchanged" from using "diff-tree" as the commit
descriptor to "commit", and I changed one of the tests to reflect that new
reality. Otherwise everything still passes, and my other tests look fine
too.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Merging all three option parsers related to whatchanged is
unarguably the right thing, but the fallout was too big to scare
me away. Let's try it once again, but once step at time.
This splits out init_revisions() call from setup_revisions(), so
that the callers can set different defaults to match the
traditional benaviour.
The rev-list command is still broken in a big way, which is the
topic of next step.
Signed-off-by: Junio C Hamano <junkio@cox.net>
This basically does a few things that are sadly somewhat interdependent,
and nontrivial to split out
- get rid of "struct log_tree_opt"
The fields in "log_tree_opt" are moved into "struct rev_info", and all
users of log_tree_opt are changed to use the rev_info struct instead.
- add the parsing for the log_tree_opt arguments to "setup_revision()"
- make setup_revision set a flag (revs->diff) if the diff-related
arguments were used. This allows "git log" to decide whether it wants
to show diffs or not.
- make setup_revision() also initialize the diffopt part of rev_info
(which we had from before, but we just didn't initialize it)
- make setup_revision() do all the "finishing touches" on it all (it will
do the proper flag combination logic, and call "diff_setup_done()")
Now, that was the easy and straightforward part.
The slightly more involved part is that some of the programs that want to
use the new-and-improved rev_info parsing don't actually want _commits_,
they may want tree'ish arguments instead. That meant that I had to change
setup_revision() to parse the arguments not into the "revs->commits" list,
but into the "revs->pending_objects" list.
Then, when we do "prepare_revision_walk()", we walk that list, and create
the sorted commit list from there.
This actually cleaned some stuff up, but it's the less obvious part of the
patch, and re-organized the "revision.c" logic somewhat. It actually paves
the way for splitting argument parsing _entirely_ out of "revision.c",
since now the argument parsing really is totally independent of the commit
walking: that didn't use to be true, since there was lots of overlap with
get_commit_reference() handling etc, now the _only_ overlap is the shared
(and trivial) "add_pending_object()" thing.
However, I didn't do that file split, just because I wanted the diff
itself to be smaller, and show the actual changes more clearly. If this
gets accepted, I'll do further cleanups then - that includes the file
split, but also using the new infrastructure to do a nicer "git diff" etc.
Even in this form, it actually ends up removing more lines than it adds.
It's nice to note how simple and straightforward this makes the built-in
"git log" command, even though it continues to support all the diff flags
too. It doesn't get much simpler that this.
I think this is worth merging soonish, because it does allow for future
cleanup and even more sharing of code. However, it obviously touches
"revision.c", which is subtle. I've tested that it passes all the tests we
have, and it passes my "looks sane" detector, but somebody else should
also give it a good look-over.
[jc: squashed the original and three "oops this too" updates, with
another fix-up.]
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
The way tree-diff was set up assumed we would use only one set
of pathspec during the entire life of the program. Move the
pathspec related static variables out to diff_options structure
so that we can filter commits with one set of paths while show
the actual diffs using different set of paths.
I suspect this breaks blame.c, and makes "git log paths..." to
default to the --full-diff, the latter of which is dealt with
the next commit.
Signed-off-by: Junio C Hamano <junkio@cox.net>
This separates out the part that deals with one-commit diff-tree
(and --stdin form) into a separate log-tree module.
There are two goals with this. The more important one is to be
able to make this part available to "git log --diff", so that we
can have a native "git whatchanged" command. Another is to
simplify the commit log generation part simpler.
Signed-off-by: Junio C Hamano <junkio@cox.net>
This replaces occurences of "blob", "commit", "tag", and "tree",
where they're really used as type specifiers, which we already
have defined global constants for.
Signed-off-by: Peter Eriksen <s022018@student.dtu.dk>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Marco says it breaks qgit. This makes the flags a bit more
orthogonal.
$ git-diff-tree -r --abbrev ca18
No output from this command because you asked to skip merge by
not having -m there.
$ git-diff-tree -r -m --abbrev ca18
ca182053c7
:100644 100644 538d21d... 59042d1... M Makefile
:100644 100644 410b758... 6c47c3a... M entry.c
ca182053c7
:100644 100644 30479b4... 59042d1... M Makefile
The same "independent sets of diff" as before without -c.
$ git-diff-tree -r -m -c --abbrev ca18
ca182053c7
::100644 100644 100644 538d21d... 30479b4... 59042d1... MM Makefile
Combined.
$ git-diff-tree -r -c --abbrev ca18
ca182053c7
::100644 100644 100644 538d21d... 30479b4... 59042d1... MM Makefile
Asking for combined without -m does not make sense, so -c
implies -m.
We need to supply -c as default to whatchanged, which is a
one-liner.
Signed-off-by: Junio C Hamano <junkio@cox.net>
This way, diff-files can make use of it. Also implement the
full suite of what diff_flush_raw() supports just for
consistency. With this, 'diff-tree -c -r --name-status' would
show what is expected.
There is no way to get the historical output (useful for
debugging and low-level Plumbing work) anymore, so tentatively
it makes '-m' to mean "do not combine and show individual diffs
with parents".
diff-files matches diff-tree to produce raw output for -c. For
textual combined diff, use -p -c.
Signed-off-by: Junio C Hamano <junkio@cox.net>
NOTE! This makes "-c" be the default, which effectively means that merges
are never ignored any more, and "-m" is a no-op. So it changes semantics.
I would also like to make "--cc" the default if you do patches, but didn't
actually do that.
The raw output format is not wonderfully pretty, but it's distinguishable
from a "normal patch" in that a normal patch with just one parent has just
one colon at the beginning, while a multi-parent raw diff has <n> colons
for <n> parents.
So now, in the kernel, when you do
git-diff-tree cce0cac125623f9b68f25dd1350f6d616220a8dd
(to see the manual ARM merge that had a conflict in arch/arm/Kconfig), you
get
cce0cac125623f9b68f25dd1350f6d616220a8dd
::100644 100644 100644 4a63a8e2e45247a11c068c6ed66c6e7aba29ddd9 77eee38762d69d3de95ae45dd9278df9b8225e2c 2f61726d2f4b636f6e66696700dbf71a59dad287 arch/arm/Kconfig
ie you see two colons (two parents), then three modes (parent modes
followed by result mode), then three sha1s (parent sha1s followed by
result sha1).
Which is pretty close to the normal raw diff output.
Cool/stupid exercise:
$ git-whatchanged | grep '^::' | cut -f2- | sort |
uniq -c | sort -n | less -S
will show which files have needed the most file-level merge conflict
resolution. Useful? Probably not. But kind of interesting.
For the kernel, it's
....
10 arch/ia64/Kconfig
11 drivers/scsi/Kconfig
12 drivers/net/Makefile
17 include/linux/libata.h
18 include/linux/pci_ids.h
23 drivers/net/Kconfig
24 drivers/scsi/libata-scsi.c
28 drivers/scsi/libata-core.c
43 MAINTAINERS
Signed-off-by: Junio C Hamano <junkio@cox.net>
git-diff-tree --stdin ignored second and subsequent parents when
fed git-rev-list --parents output. Update diff_tree_commit()
function to take a commit object, and pass a fabricated commit
object after grafting the fake parents from diff_tree_stdin().
Signed-off-by: Junio C Hamano <junkio@cox.net>
It _might_ make sense for certain users like gitk and gitview if
we had a single tool that gives --pretty and its diff even if
the diff is empty. Having said that, the flag --cc -m is too
specific. If some uses want to see the commit log even for an
empty diff, that flag should not be something only --cc honors.
Here's an "--always" flag that does that.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Building on the previous '-c' (combined) option, '--cc' option
squelches the output further by omitting hunks that consist of
difference with solely one parent.
Signed-off-by: Junio C Hamano <junkio@cox.net>
A new option '-c' to diff-tree changes the way a merge commit is
displayed when generating a patch output. It shows a "combined
diff" (hence the option letter 'c'), which looks like this:
$ git-diff-tree --pretty -c -p fec9ebf1 | head -n 18
diff-tree fec9ebf... (from parents)
Merge: 0620db3... 8a263ae...
Author: Junio C Hamano <junkio@cox.net>
Date: Sun Jan 15 22:25:35 2006 -0800
Merge fixes up to GIT 1.1.3
diff --combined describe.c
@@@ +98,7 @@@
return (a_date > b_date) ? -1 : (a_date == b_date) ? 0 : 1;
}
- static void describe(char *arg)
- static void describe(struct commit *cmit, int last_one)
++ static void describe(char *arg, int last_one)
{
+ unsigned char sha1[20];
+ struct commit *cmit;
There are a few things to note about this feature:
- The '-c' option implies '-p'. It also implies '-m' halfway
in the sense that "interesting" merges are shown, but not all
merges.
- When a blob matches one of the parents, we do not show a diff
for that path at all. For a merge commit, this option shows
paths with real file-level merge (aka "interesting things").
- As a concequence of the above, an "uninteresting" merge is
not shown at all. You can use '-m' in addition to '-c' to
show the commit log for such a merge, but there will be no
combined diff output.
- Unlike "gitk", the output is monochrome.
A '-' character in the nth column means the line is from the nth
parent and does not appear in the merge result (i.e. removed
from that parent's version).
A '+' character in the nth column means the line appears in the
merge result, and the nth parent does not have that line
(i.e. added by the merge itself or inherited from another
parent).
The above example output shows that the function signature was
changed from either parents (hence two "-" lines and a "++"
line), and "unsigned char sha1[20]", prefixed by a " +", was
inherited from the first parent.
The code as sent to the list was buggy in few corner cases,
which I have fixed since then.
It does not bother to keep track of and show the line numbers
from parent commits, which it probably should.
Signed-off-by: Junio C Hamano <junkio@cox.net>
When I show transcripts to explain how something works, I often
find myself hand-editing the diff-raw output to shorten various
object names in the output.
This adds --abbrev option to the diff family, which shortens
diff-raw output and diff-tree commit id headers.
Signed-off-by: Junio C Hamano <junkio@cox.net>
We used to read the commit objects by hand and ignored the grafts.
Rewrite it using lookup_commit() API, to make it grafts-aware.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Otherwise we would end up linking all the unneeded stuff into git-daemon
only to link with git_default_config.
Signed-off-by: Junio C Hamano <junkio@cox.net>
This patch introduces -no-commit-id option for git-diff-tree, which
suppresses commit ID output.
[jc: dropped gitk part for now.]
Signed-off-by: Pavel Roskin <proski@gnu.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Update docs and usages regarding '-r' recursive option for git-diff-tree.
Remove '-r' from common diff options, mention it only for git-diff-tree.
Remove one extraneous use of '-r' with git-diff-files in get-merge.sh.
Sync the synopsis and usage string for git-diff-tree.
Signed-off-by: Chris Shoemaker <c.shoemaker at cox.net>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This makes the tree diff functionality independent of the "git-diff-tree"
program, by splitting the core functionality up into a library file.
This will be needed for when we teach git-rev-list to only follow a
specified set of pathnames, rather than the global revision history.
Most of it is a fairly straightforward code move, but it also involves
some calling convention cleanup, and moving some of the static variables
from diff-tree.c into the options structure.
The actual tree change callback routines also become paramterized by the
diff_options structure, allowing the library functionality to do something
else than just show the diff on stdout.
Right now the only user of this functionality remains git-diff-tree
itself.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Do our own ctype.h, just to get the sane semantics: we want
locale-independence, _and_ we want the right signed behaviour. Plus we
only use a very small subset of ctype.h anyway (isspace, isalpha,
isdigit and isalnum).
Signed-off-by: Junio C Hamano <junkio@cox.net>
This is a first cut at a very simple parser for a git config file.
The format of the file is a simple ini-file like thing, with simple
variable/value pairs. You can (and should) make the variables have a
simple single-level scope, ie a valid file looks something like this:
#
# This is the config file, and
# a '#' or ';' character indicates
# a comment
#
; core variables
[core]
; Don't trust file modes
filemode = false
; Our diff algorithm
[diff]
external = "/usr/local/bin/gnu-diff -u"
renames = true
which parses into three variables: "core.filemode" is associated with the
string "false", and "diff.external" gets the appropriate quoted value.
Right now we only react to one variable: "core.filemode" is a boolean that
decides if we should care about the 0100 (user-execute) bit of the stat
information. Even that is just a parsing demonstration - this doesn't
actually implement that st_mode compare logic itself.
Different programs can react to different config options, although they
should always fall back to calling "git_default_config()" on any config
option name that they don't recognize.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This is a long overdue clean-up to the code for parsing and passing
diff options. It also tightens some constness issues.
Signed-off-by: Junio C Hamano <junkio@cox.net>
It is a bit embarrassing that it took this long for a fix since the
problem was first reported on Aug 13th.
Message-ID: <87y876gl1r.wl@mail2.atmark-techno.com>
From: Yasushi SHOJI <yashi@atmark-techno.com>
Newsgroups: gmane.comp.version-control.git
Subject: [patch] possible memory leak in diff.c::diff_free_filepair()
Date: Sat, 13 Aug 2005 19:58:56 +0900
This time I used valgrind to make sure that it does not overeagerly
discard memory that is still being used.
Signed-off-by: Junio C Hamano <junkio@cox.net>
We always show the diff as an absolute path, but pathnames to diff are
taken relative to the current working directory (and if no pathnames are
given, the default ends up being all of the current working directory).
Note that "../xyz" also works, so you can do
cd linux/drivers/char
git diff ../block
and it will generate a diff of the linux/drivers/block changes.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
All usage strings are now declared as static const char [].
This is carried over from my old git-pb branch.
Signed-off-by: Petr Baudis <pasky@ucw.cz>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This removes the separate "formats" for name and name-with-zero-
termination.
It also removes the difference between HUMAN and MACHINE formats, and
they both become DIFF_FORMAT_RAW, with the difference being just in the
line and inter-filename termination.
It also makes the code easier to understand.
I got tired of maintaining almost duplicated descriptions in
diff-* brothers, both in usage string and documentation.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Porcelain layers often want to find only names of changed files,
and even with diff-raw output format they end up having to pick
out only the filename. Support --name-only (and --name-only-z
for xargs -0 and cpio -0 users that want to treat filenames with
embedded newlines sanely) flag to help them.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Like diff-tree, this patch makes -C option for diff-* brothers
to use only pre-image of modified files as rename/copy detection
by default. Give --find-copies-harder to use unmodified files
to find copies from as well.
This also fixes "diff-files -C" problem earlier noticed by
Linus. It was feeding the null sha1 even when the file in the
work tree was known to match what is in the index file. This
resulted in diff-files showing everything in the project.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This is a halfway between debugging aid and a helper to write an
ultra-smart merge scripts. The new option takes a string that
consists of a list of "status" letters, and limits the diff
output to only those classes of changes, with two exceptions:
- A broken pair (aka "complete rewrite"), does not match D
(deleted) or N (created). Use B to look for them.
- The letter "A" in the diff-filter string does not match
anything itself, but causes the entire diff that contains
selected patches to be output (this behaviour is similar to
that of --pickaxe-all for the -S option).
For example,
$ git-rev-list HEAD |
git-diff-tree --stdin -s -v -B -C --diff-filter=BCR
shows a list of commits that have complete rewrite, copy, or
rename.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Normally, diff-tree does not feed unchanged filepair to diffcore
for performance reasons, so copies are detected only when the
source file of the copy happens to be modified in the same
changeset. This adds --find-copies-harder flag to tell
diff-tree to sacrifice the performance in order to find copies
the same way as other commands in diff-* family.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This steals --pretty command line option from rev-list and
teaches diff-tree to do the same. With this change,
$ git-whatchanged --pretty
would work as expected.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
You can ask to print out "raw" format (full headers, full body),
"medium" format (author and date, full body) or "short" format
(author only, condensed body).
Use "git-rev-list --pretty=short HEAD | less -S" for an example.
This cleans up diff_scoreopt_parse() function that is used to
parse the fractional notation -B, -C and -M option takes. The
callers are modified to check for errors and complain. Earlier
they silently ignored malformed input and falled back on the
default.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch updates diff documentation and usage strings:
- clarify the semantics of -R. It is not "output in reverse";
rather, it is "I will feed diff backwards". Semantically
they are different when -C is involved.
- describe -O in usage strings of diff-* brothers. It was
implemented, documented but not described in usage text.
Also it adds -O to diff-helper. Like -S (and unlike -M/-C/-B),
this option can work on sanitized diff-raw output produced by
the diff-* brothers. While we are at it, the call it makes to
diffcore is cleaned up to use the diffcore_std() like everybody
else, and the declaration for the low level diffcore routines
are moved from diff.h (public) to diffcore.h (private between
diff.c and diffcore backends).
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The core GIT repository has trees that record regular file mode
in 0664 instead of normalized 0644 pattern. Comparing such a
tree with another tree that records the same file in 0644
pattern without content changes with git-diff-tree causes it to
feed otherwise unmodified pairs to the diff_change() routine,
which triggers a sanity check routine and barfs. This patch
fixes the problem, along with the fix to another caller that
uses unnormalized mode bits to call diff_change() routine in a
similar way.
Without this patch, you will see "fatal error" from diff-tree
when you run git-deltafy-script on the core GIT repository
itself.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
A new diffcore filter diffcore-order is introduced. This takes
a text file each of whose line is a shell glob pattern. Patches
that match a glob pattern on an earlier line in the file are
output before patches that match a later line, and patches that
do not match any glob pattern are output last.
A typical orderfile for git project probably should look like
this:
README
Makefile
Documentation
*.h
*.c
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
A new diffcore transformation, diffcore-break.c, is introduced.
When the -B flag is given, a patch that represents a complete
rewrite is broken into a deletion followed by a creation. This
makes it easier to review such a complete rewrite patch.
The -B flag takes the same syntax as the -M and -C flags to
specify the minimum amount of non-source material the resulting
file needs to have to be considered a complete rewrite, and
defaults to 99% if not specified.
As the new test t4008-diff-break-rewrite.sh demonstrates, if a
file is a complete rewrite, it is broken into a delete/create
pair, which can further be subjected to the usual rename
detection if -M or -C is used. For example, if file0 gets
completely rewritten to make it as if it were rather based on
file1 which itself disappeared, the following happens:
The original change looks like this:
file0 --> file0' (quite different from file0)
file1 --> /dev/null
After diffcore-break runs, it would become this:
file0 --> /dev/null
/dev/null --> file0'
file1 --> /dev/null
Then diffcore-rename matches them up:
file1 --> file0'
The internal score values are finer grained now. Earlier
maximum of 10000 has been raised to 60000; there is no user
visible changes but there is no reason to waste available bits.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>