1
0
Fork 0
mirror of https://github.com/git/git.git synced 2024-11-01 14:57:52 +01:00
Commit graph

924 commits

Author SHA1 Message Date
Junio C Hamano
ef49a7a012 zlib: zlib can only process 4GB at a time
The size of objects we read from the repository and data we try to put
into the repository are represented in "unsigned long", so that on larger
architectures we can handle objects that weigh more than 4GB.

But the interface defined in zlib.h to communicate with inflate/deflate
limits avail_in (how many bytes of input are we calling zlib with) and
avail_out (how many bytes of output from zlib are we ready to accept)
fields effectively to 4GB by defining their type to be uInt.

In many places in our code, we allocate a large buffer (e.g. mmap'ing a
large loose object file) and tell zlib its size by assigning the size to
avail_in field of the stream, but that will truncate the high octets of
the real size. The worst part of this story is that we often pass around
z_stream (the state object used by zlib) to keep track of the number of
used bytes in input/output buffer by inspecting these two fields, which
practically limits our callchain to the same 4GB limit.

Wrap z_stream in another structure git_zstream that can express avail_in
and avail_out in unsigned long. For now, just die() when the caller gives
a size that cannot be given to a single zlib call. In later patches in the
series, we would make git_inflate() and git_deflate() internally loop to
give callers an illusion that our "improved" version of zlib interface can
operate on a buffer larger than 4GB in one go.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-06-10 11:52:15 -07:00
Junio C Hamano
225a6f1068 zlib: wrap deflateBound() too
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-06-10 11:18:17 -07:00
Junio C Hamano
55bb5c9147 zlib: wrap deflate side of the API
Wrap deflateInit, deflate, and deflateEnd for everybody, and the sole use
of deflateInit2 in remote-curl.c to tell the library to use gzip header
and trailer in git_deflate_init_gzip().

There is only one caller that cares about the status from deflateEnd().
Introduce git_deflate_end_gently() to let that sole caller retrieve the
status and act on it (i.e. die) for now, but we would probably want to
make inflate_end/deflate_end die when they ran out of memory and get
rid of the _gently() kind.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-06-10 11:10:29 -07:00
Junio C Hamano
456a4c08b8 Merge branch 'jk/diff-not-so-quick'
* jk/diff-not-so-quick:
  diff: futureproof "stop feeding the backend early" logic
  diff_tree: disable QUICK optimization with diff filter

Conflicts:
	diff.c
2011-06-06 11:40:14 -07:00
Junio C Hamano
b3c89315a3 Merge branch 'jc/rename-degrade-cc-to-c' into maint
* jc/rename-degrade-cc-to-c:
  diffcore-rename: fall back to -C when -C -C busts the rename limit
  diffcore-rename: record filepair for rename src
  diffcore-rename: refactor "too many candidates" logic
  builtin/diff.c: remove duplicated call to diff_result_code()
2011-05-31 12:00:02 -07:00
Junio C Hamano
28b9264dd6 diff: futureproof "stop feeding the backend early" logic
Refactor the "do not stop feeding the backend early" logic into a small
helper function and use it in both run_diff_files() and diff_tree() that
has the stop-early optimization. We may later add other types of diffcore
transformation that require to look at the whole result like diff-filter
does, and having the logic in a single place is essential for longer term
maintainability.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-05-31 09:21:36 -07:00
Junio C Hamano
e5f85df87e diff --stat-count: finishing touches
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-05-27 21:50:39 -07:00
Michael J Gruber
808e1db231 diff: introduce --stat-lines to limit the stat lines
Often one is interested in the full --stat output only for commits which
change a few files, but not others, because larger restructuring gives a
--stat which fills a few screens.

Introduce a new option --stat-count=<count> which limits the --stat output
to the first <count> lines, followed by a "..." line. It can
also be given as the third parameter in
--stat=<width>,<name-width>,<count>.

Also, the unstuck form is supported analogous to the other two stat
parameters.

Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-05-27 10:44:34 -07:00
Michael J Gruber
358e460eeb diff.c: omit hidden entries from namelen calculation with --stat
Currently, --stat calculates the longest name from all items but then
drops some (mode changes) from the output later on.

Instead, drop them from the namelen generation and calculation.

Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-05-27 10:44:02 -07:00
Junio C Hamano
d9ac3e41c3 Merge branch 'jm/maint-diff-words-with-sbe' into maint
* jm/maint-diff-words-with-sbe:
  do not read beyond end of malloc'd buffer
2011-05-26 09:43:00 -07:00
Jeff King
3813e69031 refactor get_textconv to not require diff_filespec
This function actually does two things:

  1. Load the userdiff driver for the filespec.

  2. Decide whether the driver has a textconv component, and
     initialize the textconv cache if applicable.

Only part (1) requires the filespec object, and some callers
may not have a filespec at all. So let's split them it into
two functions, and put part (2) with the userdiff code,
which is a better fit.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-05-23 15:46:02 -07:00
Junio C Hamano
34ad5a52b4 Merge branch 'jm/maint-diff-words-with-sbe'
* jm/maint-diff-words-with-sbe:
  do not read beyond end of malloc'd buffer
2011-05-23 10:27:42 -07:00
Jim Meyering
42536dd9b9 do not read beyond end of malloc'd buffer
With diff.suppress-blank-empty=true, "git diff --word-diff" would
output data that had been read from uninitialized heap memory.
The problem was that fn_out_consume did not account for the
possibility of a line with length 1, i.e., the empty context line
that diff.suppress-blank-empty=true converts from " \n" to "\n".
Since it assumed there would always be a prefix character (the space),
it decremented "len" unconditionally, thus passing len=0 to emit_line,
which would then blindly call emit_line_0 with len=-1 which would
pass that value on to fwrite as SIZE_MAX.  Boom.

Signed-off-by: Jim Meyering <meyering@redhat.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-05-20 11:39:49 -07:00
Junio C Hamano
df54e2bfd6 Merge branch 'jh/dirstat-lines'
* jh/dirstat-lines:
  Mark dirstat error messages for translation
  Improve error handling when parsing dirstat parameters
  New --dirstat=lines mode, doing dirstat analysis based on diffstat
  Allow specifying --dirstat cut-off percentage as a floating point number
  Add config variable for specifying default --dirstat behavior
  Refactor --dirstat parsing; deprecate --cumulative and --dirstat-by-file
  Make --dirstat=0 output directories that contribute < 0.1% of changes
  Add several testcases for --dirstat and friends
2011-05-13 11:01:32 -07:00
Junio C Hamano
a613b534bc Merge branch 'jc/fix-diff-files-unmerged' into maint
* jc/fix-diff-files-unmerged:
  diff-files: show unmerged entries correctly
  diff: remove often unused parameters from diff_unmerge()
  diff.c: return filepair from diff_unmerge()
  test: use $_z40 from test-lib
2011-05-13 10:41:54 -07:00
Junio C Hamano
22dbeee715 Merge branch 'jc/fix-diff-files-unmerged'
* jc/fix-diff-files-unmerged:
  diff-files: show unmerged entries correctly
  diff: remove often unused parameters from diff_unmerge()
  diff.c: return filepair from diff_unmerge()
  test: use $_z40 from test-lib
2011-05-06 10:52:58 -07:00
Junio C Hamano
f5bf1b5f6b Merge branch 'jh/dirstat' into maint
* jh/dirstat:
  --dirstat: In case of renames, use target filename instead of source filename
  Teach --dirstat not to completely ignore rearranged lines within a file
  --dirstat-by-file: Make it faster and more correct
  --dirstat: Describe non-obvious differences relative to --stat or regular diff
2011-05-04 14:59:07 -07:00
Johan Herland
7478ac57c4 Mark dirstat error messages for translation
Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johan Herland <johan@herland.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-04-29 11:22:56 -07:00
Johan Herland
51670fc87e Improve error handling when parsing dirstat parameters
When encountering errors or unknown tokens while parsing parameters to the
--dirstat option, it makes sense to die() with an error message informing
the user of which parameter did not make sense. However, when parsing the
diff.dirstat config variable, we cannot simply die(), but should instead
(after warning the user) ignore the erroneous or unrecognized parameter.
After all, future Git versions might add more dirstat parameters, and
using two different Git versions on the same repo should not cripple the
older Git version just because of a parameter that is only understood by
a more recent Git version.

This patch fixes the issue by refactoring the dirstat parameter parsing
so that parse_dirstat_params() keeps on parsing parameters, even if an
earlier parameter was not recognized. When parsing has finished, it returns
zero if all parameters were successfully parsed, and non-zero if one or
more parameters were not recognized (with appropriate error messages
appended to the 'errmsg' argument).

The parse_dirstat_params() callers then decide (based on the return value
from parse_dirstat_params()) whether to warn and ignore (in case of
diff.dirstat), or to warn and die (in case of --dirstat).

The patch also adds a couple of tests verifying the correct behavior of
--dirstat and diff.dirstat in the face of unknown (possibly future) dirstat
parameters.

Suggested-by: Junio C Hamano <gitster@pobox.com>
Improved-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johan Herland <johan@herland.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-04-29 11:22:56 -07:00
Johan Herland
1c57a627bf New --dirstat=lines mode, doing dirstat analysis based on diffstat
This patch adds an alternative implementation of show_dirstat(), called
show_dirstat_by_line(), which uses the more expensive diffstat analysis
(as opposed to show_dirstat()'s own (relatively inexpensive) analysis)
to derive the numbers from which the --dirstat output is computed.

The alternative implementation is controlled by the new "lines" parameter
to the --dirstat option (or the diff.dirstat config variable).

For binary files, the diffstat analysis counts bytes instead of lines,
so to prevent binary files from dominating the dirstat results, the
byte counts for binary files are divided by 64 before being compared to
their textual/line-based counterparts. This is a stupid and ugly - but
very cheap - heuristic.

In linux-2.6.git, running the three different --dirstat modes:

  time git diff v2.6.20..v2.6.30 --dirstat=changes > /dev/null
vs.
  time git diff v2.6.20..v2.6.30 --dirstat=lines > /dev/null
vs.
  time git diff v2.6.20..v2.6.30 --dirstat=files > /dev/null

yields the following average runtimes on my machine:

 - "changes" (default): ~6.0 s
 - "lines":             ~9.6 s
 - "files":             ~0.1 s

So, as expected, there's a considerable performance hit (~60%) by going
through the full diffstat analysis as compared to the default "changes"
analysis (obviously, "files" is much faster than both). As such, the
"lines" mode is probably only useful if you really need the --dirstat
numbers to be consistent with the numbers returned from the other
--*stat options.

The patch also includes documentation and tests for the new dirstat mode.

Improved-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johan Herland <johan@herland.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-04-29 11:22:55 -07:00
Johan Herland
712d2c7dd8 Allow specifying --dirstat cut-off percentage as a floating point number
Only the first digit after the decimal point is kept, as the dirstat
calculations all happen in permille.

Selftests verifying floating-point percentage input has been added.

Improved-by: Junio C Hamano <gitster@pobox.com>
Improved-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Johan Herland <johan@herland.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-04-29 11:20:11 -07:00
Johan Herland
2d17495196 Add config variable for specifying default --dirstat behavior
The new diff.dirstat config variable takes the same arguments as
'--dirstat=<args>', and specifies the default arguments for --dirstat.
The config is obviously overridden by --dirstat arguments passed on the
command line.

When not specified, the --dirstat defaults are 'changes,noncumulative,3'.

The patch also adds several tests verifying the interaction between the
diff.dirstat config variable, and the --dirstat command line option.

Improved-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johan Herland <johan@herland.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-04-29 11:20:03 -07:00
Johan Herland
333f3fb0c5 Refactor --dirstat parsing; deprecate --cumulative and --dirstat-by-file
Instead of having multiple interconnected dirstat-related options, teach
the --dirstat option itself to accept all behavior modifiers as parameters.

 - Preserve the current --dirstat=<limit> (where <limit> is an integer
   specifying a cut-off percentage)
 - Add --dirstat=cumulative, replacing --cumulative
 - Add --dirstat=files, replacing --dirstat-by-file
 - Also add --dirstat=changes and --dirstat=noncumulative for specifying the
   current default behavior. These allow the user to reset other --dirstat
   parameters (e.g. 'cumulative' and 'files') occuring earlier on the
   command line.

The deprecated options (--cumulative and --dirstat-by-file) are still
functional, although they have been removed from the documentation.

Allow multiple parameters to be separated by commas, e.g.:
  --dirstat=files,10,cumulative

Update the documentation accordingly, and add testcases verifying the
behavior of the new syntax.

Improved-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johan Herland <johan@herland.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-04-29 11:17:36 -07:00
Johan Herland
58a8756a98 Make --dirstat=0 output directories that contribute < 0.1% of changes
The expected output from --dirstat=0, is to include any directory with
changes, even if those changes contribute a minuscule portion of the total
changes. However, currently, directories that contribute less than 0.1% are
not included, since their 'permille' value is 0, and there is an
'if (permille)' check in gather_dirstat() that causes them to be ignored.

This test is obviously intended to exclude directories that contribute no
changes whatsoever, but in this case, it hits too broadly. The correct
check is against 'this_dir' from which the permille is calculated. Only if
this value is 0 does the directory truly contribute no changes, and should
be skipped from the output.

This patches fixes this issue, and updates corresponding testcases to
expect the new behvaior.

Signed-off-by: Johan Herland <johan@herland.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-04-29 11:17:36 -07:00
Junio C Hamano
50d3062ab2 Merge branch 'jc/diff-irreversible-delete'
* jc/diff-irreversible-delete:
  git diff -D: omit the preimage of deletes
2011-04-28 14:11:47 -07:00
Junio C Hamano
76a89d6d82 Merge branch 'jc/rename-degrade-cc-to-c'
* jc/rename-degrade-cc-to-c:
  diffcore-rename: fall back to -C when -C -C busts the rename limit
  diffcore-rename: record filepair for rename src
  diffcore-rename: refactor "too many candidates" logic
  builtin/diff.c: remove duplicated call to diff_result_code()
2011-04-28 14:11:43 -07:00
Junio C Hamano
d98a509ec3 Merge branch 'jh/dirstat'
* jh/dirstat:
  --dirstat: In case of renames, use target filename instead of source filename
  Teach --dirstat not to completely ignore rearranged lines within a file
  --dirstat-by-file: Make it faster and more correct
  --dirstat: Describe non-obvious differences relative to --stat or regular diff
2011-04-28 14:11:19 -07:00
Junio C Hamano
fa7b290895 diff: remove often unused parameters from diff_unmerge()
e9c8409 (diff-index --cached --raw: show tree entry on the LHS for
unmerged entries., 2007-01-05) added a <mode, object name> pair as
parameters to this function, to store them in the pre-image side of an
unmerged file pair.  Now the function is fixed to return the filepair it
queued, we can make the caller on the special case codepath to do so.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-04-23 22:34:43 -07:00
Junio C Hamano
76399c0195 diff.c: return filepair from diff_unmerge()
The underlying diff_queue() returns diff_filepair so that the caller can
further add information to it, and the helper function diff_unmerge()
utilizes the feature itself, but does not expose it to its callers, which
was kind of selfish.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-04-23 22:34:43 -07:00
Johan Herland
2ca8671470 --dirstat: In case of renames, use target filename instead of source filename
This changes --dirstat analysis to count "damage" toward the target filename,
rather than the source filename. For renames within a directory, this won't
matter to the final output, but when moving files between diretories, the
output now lists the target directory rather than the source directory.

Signed-off-by: Johan Herland <johan@herland.net>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-04-12 11:29:34 -07:00
Johan Herland
2ff3a80334 Teach --dirstat not to completely ignore rearranged lines within a file
Currently, the --dirstat analysis ignores when lines within a file are
rearranged, because the "damage" calculated by show_dirstat() is 0.
However, if the object name has changed, we already know that there is
some damage, and it is unintuitive to claim there is _no_ damage.

Teach show_dirstat() to assign a minimum amount of damage (== 1) to
entries for which the analysis otherwise yields zero damage, to still
represent that these files are changed, instead of saying that there
is no change.

Also, skip --dirstat analysis when the object names are the same (e.g. for
a pure file rename).

Signed-off-by: Johan Herland <johan@herland.net>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-04-11 11:16:15 -07:00
Johan Herland
0133dab75d --dirstat-by-file: Make it faster and more correct
Currently, when using --dirstat-by-file, it first does the full --dirstat
analysis (using diffcore_count_changes()), and then resets 'damage' to 1,
if any damage was found by diffcore_count_changes().

But --dirstat-by-file is not interested in the file damage per se. It only
cares if the file changed at all. In that sense it only cares if the blob
object for a file has changed. We therefore only need to compare the
object names of each file pair in the diff queue and we can skip the
entire --dirstat analysis and simply set 'damage' to 1 for each entry
where the object name has changed.

This makes --dirstat-by-file faster, and also bypasses --dirstat's practice
of ignoring rearranged lines within a file.

The patch also contains an added testcase verifying that --dirstat-by-file
now detects changes that only rearrange lines within a file.

Signed-off-by: Johan Herland <johan@herland.net>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-04-11 10:12:24 -07:00
Junio C Hamano
467ddc14fe git diff -D: omit the preimage of deletes
When reviewing a patch while concentrating primarily on the text after
then change, wading through pages of deleted text involves a cognitive
burden.

Introduce the -D option that omits the preimage text from the patch output
for deleted files.  When used with -B (represent total rewrite as a single
wholesale deletion followed by a single wholesale addition), the preimage
text is also omitted.

To prevent such a patch from being applied by mistake, the output is
designed not to be usable by "git apply" (or GNU "patch"); it is strictly
for human consumption.

It of course is possible to "apply" such a patch by hand, as a human can
read the intention out of such a patch.  It however is impossible to apply
such a patch even manually in reverse, as the whole point of this option
is to omit the information necessary to do so from the output.

Initial request by Mart Sõmermaa, documentation and tests helped by
Michael J Gruber.

Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-04-02 23:52:20 -07:00
Junio C Hamano
f31027c99c diffcore-rename: fall back to -C when -C -C busts the rename limit
When there are too many paths in the project, the number of rename source
candidates "git diff -C -C" finds will exceed the rename detection limit,
and no inexact rename detection is performed.  We however could fall back
to "git diff -C" if the number of modified paths is sufficiently small.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-03-22 14:29:07 -07:00
Johannes Schindelin
c0aa335c95 Remove unused variables
Noticed by gcc 4.6.0.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-03-22 11:43:27 -07:00
Stephen Boyd
c2e86addb8 Fix sparse warnings
Fix warnings from 'make check'.

 - These files don't include 'builtin.h' causing sparse to complain that
   cmd_* isn't declared:

   builtin/clone.c:364, builtin/fetch-pack.c:797,
   builtin/fmt-merge-msg.c:34, builtin/hash-object.c:78,
   builtin/merge-index.c:69, builtin/merge-recursive.c:22
   builtin/merge-tree.c:341, builtin/mktag.c:156, builtin/notes.c:426
   builtin/notes.c:822, builtin/pack-redundant.c:596,
   builtin/pack-refs.c:10, builtin/patch-id.c:60, builtin/patch-id.c:149,
   builtin/remote.c:1512, builtin/remote-ext.c:240,
   builtin/remote-fd.c:53, builtin/reset.c:236, builtin/send-pack.c:384,
   builtin/unpack-file.c:25, builtin/var.c:75

 - These files have symbols which should be marked static since they're
   only file scope:

   submodule.c:12, diff.c:631, replace_object.c:92, submodule.c:13,
   submodule.c:14, trace.c:78, transport.c:195, transport-helper.c:79,
   unpack-trees.c:19, url.c:3, url.c:18, url.c:104, url.c:117, url.c:123,
   url.c:129, url.c:136, thread-utils.c:21, thread-utils.c:48

 - These files redeclare symbols to be different types:

   builtin/index-pack.c:210, parse-options.c:564, parse-options.c:571,
   usage.c:49, usage.c:58, usage.c:63, usage.c:72

 - These files use a literal integer 0 when they really should use a NULL
   pointer:

   daemon.c:663, fast-import.c:2942, imap-send.c:1072, notes-merge.c:362

While we're in the area, clean up some unused #includes in builtin files
(mostly exec_cmd.h).

Signed-off-by: Stephen Boyd <bebarino@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-03-22 10:16:54 -07:00
Junio C Hamano
0ce6a51b43 Merge branch 'jk/merge-rename-ux'
* jk/merge-rename-ux:
  pull: propagate --progress to merge
  merge: enable progress reporting for rename detection
  add inexact rename detection progress infrastructure
  commit: stop setting rename limit
  bump rename limit defaults (again)
  merge: improve inexact rename limit warning
2011-03-19 23:23:56 -07:00
Junio C Hamano
4e530c5049 Merge branch 'jk/diffstat-binary' into maint
* jk/diffstat-binary:
  diff: don't retrieve binary blobs for diffstat
  diff: handle diffstat of rewritten binary files
2011-03-16 16:47:26 -07:00
Jonathan Nieder
9cba13ca5d standardize brace placement in struct definitions
In a struct definitions, unlike functions, the prevailing style is for
the opening brace to go on the same line as the struct name, like so:

 struct foo {
	int bar;
	char *baz;
 };

Indeed, grepping for 'struct [a-z_]* {$' yields about 5 times as many
matches as 'struct [a-z_]*$'.

Linus sayeth:

 Heretic people all over the world have claimed that this inconsistency
 is ...  well ...  inconsistent, but all right-thinking people know that
 (a) K&R are _right_ and (b) K&R are right.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-03-16 12:49:02 -07:00
Jeff King
abb371a1ef diff: don't retrieve binary blobs for diffstat
We only need the size, which is much cheaper to get,
especially if it is a big binary file.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-02-22 10:58:18 -08:00
Jeff King
ded0abc73c diff: handle diffstat of rewritten binary files
The logic in builtin_diffstat assumes that a
complete_rewrite pair should have its lines counted. This is
nonsensical for binary files and leads to confusing things
like:

  $ git diff --stat --summary HEAD^ HEAD
   foo.rand |  Bin 4096 -> 4096 bytes
   1 files changed, 0 insertions(+), 0 deletions(-)

  $ git diff --stat --summary -B HEAD^ HEAD
   foo.rand |   34 +++++++++++++++-------------------
   1 files changed, 15 insertions(+), 19 deletions(-)
   rewrite foo.rand (100%)

So let's reorder the function to handle binary files first
(which from diffstat's perspective look like complete
rewrites anyway), then rewrites, then actual diffstats.

There are two bonus prizes to this reorder:

  1. It gets rid of a now-superfluous goto.

  2. The binary case is at the top, which means we can
     further optimize it in the next patch.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-02-22 10:57:58 -08:00
Jeff King
92c57e5c1d bump rename limit defaults (again)
We did this once before in 5070591 (bump rename limit
defaults, 2008-04-30). Back then, we were shooting for about
1 second for a diff/log calculation, and 5 seconds for a
merge.

There are a few new things to consider, though:

  1. Average processors are faster now.

  2. We've seen on the mailing list some ugly merges where
     not using inexact rename detection leads to many more
     conflicts. Merges of this size take a long time
     anyway, so users are probably happy to spend a little
     bit of time computing the renames.

Let's bump the diff/merge default limits from 200/500 to
400/1000. Those are 2 seconds and 10 seconds respectively on
my modern hardware.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-02-21 10:23:36 -08:00
Junio C Hamano
6ae7a51a2e Merge branch 'ks/blame-worktree-textconv-cached'
* ks/blame-worktree-textconv-cached:
  fill_textconv(): Don't get/put cache if sha1 is not valid
  t/t8006: Demonstrate blame is broken when cachetextconv is on
2010-12-21 14:30:52 -08:00
Kirill Smelkov
9ec09b0495 fill_textconv(): Don't get/put cache if sha1 is not valid
When blaming files in the working tree, the filespec is marked with
!sha1_valid, as we have not given the contents an object name yet.  The
function to cache textconv results (keyed on the object name), however,
didn't check this condition, and ended up on storing the cached result
under a random object name.

Cc: Axel Bonnet <axel.bonnet@ensimag.imag.fr>
Cc: Clément Poulain <clement.poulain@ensimag.imag.fr>
Cc: Diane Gasselin <diane.gasselin@ensimag.imag.fr>
Cc: Jeff King <peff@peff.net>
Signed-off-by: Kirill Smelkov <kirr@landau.phys.spbu.ru>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-12-19 18:41:32 -08:00
Junio C Hamano
cf7a64b54a Merge branch 'kb/diff-C-M-synonym'
* kb/diff-C-M-synonym:
  diff: use "find" instead of "detect" as prefix for long forms of -M and -C
  diff: add --detect-copies-harder as a synonym for --find-copies-harder
2010-12-16 12:58:59 -08:00
Yann Dirson
f611ddc774 diff: use "find" instead of "detect" as prefix for long forms of -M and -C
It is more consistent with existing --find-copies-harder; luckily "detect"
variant has not appeared in any officially released version of git.

Signed-off-by: Yann Dirson <ydirson@altern.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-12-10 13:52:05 -08:00
Junio C Hamano
8577def6fc Merge branch 'np/diff-in-corrupt-repository' into maint
* np/diff-in-corrupt-repository:
  diff: don't presume empty file when corresponding object is missing
2010-12-09 10:36:39 -08:00
Junio C Hamano
ae0a37cd6b Merge branch 'cm/diff-check-at-eol' into maint
* cm/diff-check-at-eol:
  diff --check: correct line numbers of new blank lines at EOF
2010-12-09 10:36:10 -08:00
Junio C Hamano
f04aa35eb6 Merge branch 'jk/diff-CBM'
* jk/diff-CBM:
  diff: report bogus input to -C/-M/-B
2010-12-08 11:24:11 -08:00
Junio C Hamano
f5a5531e4e Merge branch 'np/diff-in-corrupt-repository'
* np/diff-in-corrupt-repository:
  diff: don't presume empty file when corresponding object is missing
2010-11-29 17:52:33 -08:00
Junio C Hamano
039e84e30d Merge branch 'cm/diff-check-at-eol'
* cm/diff-check-at-eol:
  diff --check: correct line numbers of new blank lines at EOF
2010-11-29 17:52:31 -08:00
Kevin Ballard
150a5daad0 diff: add --detect-copies-harder as a synonym for --find-copies-harder
Signed-off-by: Kevin Ballard <kevin@sb.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-11-29 16:58:27 -08:00
Junio C Hamano
9cffe2018a Merge branch 'cb/diff-fname-optim' into maint
* cb/diff-fname-optim:
  diff: avoid repeated scanning while looking for funcname
  do not search functions for patch ID
  add rebase patch id tests
2010-11-24 12:46:26 -08:00
Junio C Hamano
78bce6c7e9 Merge branch 'jk/no-textconv-symlink' into maint
* jk/no-textconv-symlink:
  diff: don't use pathname-based diff drivers for symlinks
2010-11-24 12:46:20 -08:00
Junio C Hamano
8cf666c9ee Merge branch 'cb/diff-fname-optim'
* cb/diff-fname-optim:
  diff: avoid repeated scanning while looking for funcname
  do not search functions for patch ID
  add rebase patch id tests
2010-11-17 14:59:16 -08:00
Junio C Hamano
6a2e93f107 Merge branch 'jk/no-textconv-symlink'
* jk/no-textconv-symlink:
  diff: don't use pathname-based diff drivers for symlinks
2010-11-17 14:59:10 -08:00
Junio C Hamano
329351feeb Merge branch 'kb/merge-recursive-rename-threshold'
* kb/merge-recursive-rename-threshold:
  diff: add synonyms for -M, -C, -B
  merge-recursive: option to specify rename threshold

Conflicts:
	Documentation/diff-options.txt
	Documentation/merge-strategies.txt
2010-10-26 21:54:04 -07:00
Junio C Hamano
d7806967bd Merge branch 'maint'
* maint:
  Fix copy-pasted comments related to tree diff handling.
2010-10-26 15:04:05 -07:00
Yann Dirson
c3fced6498 Fix copy-pasted comments related to tree diff handling.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-10-25 00:21:56 -07:00
Nicolas Pitre
c50c4316e1 diff: don't presume empty file when corresponding object is missing
The low-level diff code will happily produce totally bogus diff output
with a broken repository via format-patch and friends by treating missing
objects as empty files.  Let's prevent that from happening any longer.

Reported-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Nicolas Pitre <nico@fluxnic.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-10-21 22:23:34 -07:00
Jeff King
07cd726527 diff: report bogus input to -C/-M/-B
We already detect invalid input to these functions, but we
simply exit with an error code, never saying anything as
simple as "your input was wrong". Let's fix that.

Before:

  $ git diff -CM
  $ echo $?
  128

After:

  $ git diff -CM
  error: invalid argument to -C: M
  $ echo $?
  128

There should be no problems with having diff_opt_parse print
to stderr, as there is already precedent in complaining
about bogus --color and --output arguments.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-10-21 15:44:53 -07:00
Christoph Mallon
8837d33595 diff --check: correct line numbers of new blank lines at EOF
The whitespace check printed the value of the wrong variable, i.e. the
beginning of the block of blank lines at the EOF (possibly absent) in the
old file.

As "git diff --check" is used by users to check their changes before
making a commit, we should point at the line number in the file after
the change.

Signed-off-by: Christoph Mallon <christoph.mallon@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-10-16 18:57:35 -07:00
Junio C Hamano
b3c16ee454 Merge branch 'jc/pickaxe-grep'
* jc/pickaxe-grep:
  diff/log -G<pattern>: tests
  git log/diff: add -G<regexp> that greps in the patch text
  diff: pass the entire diff-options to diffcore_pickaxe()
  gitdiffcore doc: update pickaxe description
2010-09-29 13:49:03 -07:00
Matthieu Moy
9ec26eb7cd diff: trivial fix for --output file error message
The option argument is either after the equal sign in --output=... or in
the next command-line argument. optarg is the reliable way to access it.

Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-09-29 13:25:17 -07:00
Kevin Ballard
37ab5156ae diff: add synonyms for -M, -C, -B
Add new long-form options --detect-renames[=<n>], --detect-copies[=<n>],
and --break-rewrites[=[<n>][/<m>]] as synonyms for the -M, -C, and -B
options (respectively).

Signed-off-by: Kevin Ballard <kevin@sb.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-09-29 13:18:04 -07:00
Kevin Ballard
10ae7526be merge-recursive: option to specify rename threshold
The recursive merge strategy turns on rename detection but leaves the
rename threshold at the default. Add a strategy option to allow the user
to specify a rename threshold to use.

Signed-off-by: Kevin Ballard <kevin@sb.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-09-29 13:15:56 -07:00
Clemens Buchacher
ad14b450c0 do not search functions for patch ID
Visual aids, such as the function name in the hunk
header, are not necessary for the purposes of
computing a patch ID.

This is a performance optimization.

Signed-off-by: Clemens Buchacher <drizzd@aon.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-09-23 18:35:07 -07:00
Jeff King
d391c0ff94 diff: don't use pathname-based diff drivers for symlinks
When we're diffing symlinks, we consider the contents to be
the pathname that the symlink points to. When a user sets up
a userdiff driver like "*.pdf diff=pdf", their "diff.pdf.*"
config generally tells us what to do with the content of
pdf files.

With the current code, we will actually process a symlink
like "link.pdf" using a configured pdf driver, meaning we
are using contents which consist of a pathname with
configuration that is expecting contents that consist of an
actual pdf file.

The most noticeable example of this would have been
textconv; however, it was already protected in its own
textconv-specific code path. We can still see the breakage
with something like "diff.*.binary", though. You could
also see it with diff.*.funcname, though it is a bit harder
to trigger accidentally there.

This patch adds a check for S_ISREG lower in the callstack
than the textconv-specific check, which should block use of
any userdiff config for non-regular files. We can drop the
check in the textconv code, which is now redundant.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-09-23 18:32:32 -07:00
Junio C Hamano
8ac8cf5bc1 Merge branch 'maint'
* maint:
  xdiff-interface.c: always trim trailing space from xfuncname matches
  diff.c: call regfree to free memory allocated by regcomp when necessary
2010-09-09 17:29:40 -07:00
Brandon Casey
ef5644ea6e diff.c: call regfree to free memory allocated by regcomp when necessary
Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-09-09 17:18:04 -07:00
Junio C Hamano
7cc1e385a0 Merge branch 'cb/binary-patch-id'
* cb/binary-patch-id:
  hash binary sha1 into patch id
2010-08-31 16:24:48 -07:00
Junio C Hamano
f506b8e8b5 git log/diff: add -G<regexp> that greps in the patch text
Teach "-G<regexp>" that is similar to "-S<regexp> --pickaxe-regexp" to the
"git diff" family of commands.  This limits the diff queue to filepairs
whose patch text actually has an added or a deleted line that matches the
given regexp.  Unlike "-S<regexp>", changing other parts of the line that
has a substring that matches the given regexp IS counted as a change, as
such a change would appear as one deletion followed by one addition in a
patch text.

Unlike -S (pickaxe) that is intended to be used to quickly detect a commit
that changes the number of occurrences of hits between the preimage and
the postimage to serve as a part of larger toolchain, this is meant to be
used as the top-level Porcelain feature.

The implementation unfortunately has to run "diff" twice if you are
running "log" family of commands to produce patches in the final output
(e.g. "git log -p" or "git format-patch").  I think we _could_ cache the
result in-core if we wanted to, but that would require larger surgery to
the diffcore machinery (i.e. adding an extra pointer in the filepair
structure to keep a pointer to a strbuf around, stuff the textual diff to
the strbuf inside diffgrep_consume(), and make use of it in later stages
when it is available) and it may not be worth it.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-08-31 14:30:29 -07:00
Junio C Hamano
382f013bc4 diff: pass the entire diff-options to diffcore_pickaxe()
That would make it easier to give enhanced feature to the
pickaxe transformation.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-08-31 14:30:28 -07:00
Junio C Hamano
e40b34b1ec Merge branch 'mm/shortopt-detached'
* mm/shortopt-detached:
  log: parse separate option for --glob
  log: parse separate options like git log --grep foo
  diff: parse separate options --stat-width n, --stat-name-width n
  diff: split off a function for --stat-* option parsing
  diff: parse separate options like -S foo

Conflicts:
	revision.c
2010-08-21 23:28:31 -07:00
Junio C Hamano
bd3a97a27a Merge branch 'jc/maint-follow-rename-fix'
* jc/maint-follow-rename-fix:
  log: test for regression introduced in v1.7.2-rc0~103^2~2
  diff --follow: do call diffcore_std() as necessary
  diff --follow: do not waste cycles while recursing
2010-08-18 12:47:18 -07:00
Junio C Hamano
cc34bb0b02 Merge branch 'jl/submodule-ignore-diff'
* jl/submodule-ignore-diff:
  Add tests for the diff.ignoreSubmodules config option
  Add the 'diff.ignoreSubmodules' config setting
  Submodules: Use "ignore" settings from .gitmodules too for diff and status
  Submodules: Add the new "ignore" config option for diff and status

Conflicts:
	diff.c
2010-08-18 12:36:25 -07:00
Clemens Buchacher
34597c1f5a hash binary sha1 into patch id
Since commit 2f82f760 (Take binary diffs into
account for "git rebase"), binary files are
included in patch ID computation. Binary files are
diffed using the text diff algorithm, however,
which has a huge impact on performance. The
following tests performance for a 50000 line file
marked as binary in .gitattributes.

$ git format-patch --stdout --ignore-if-in-upstream master

real    0m0.367s
user    0m0.354s
sys     0m0.010s

Instead of diffing the binary files, hash the pre-
and post-image sha1, which is just as unique. As a
result, performance is much improved.

$ git format-patch --stdout --ignore-if-in-upstream master

real    0m0.016s
user    0m0.015s
sys     0m0.001s

Signed-off-by: Clemens Buchacher <drizzd@aon.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-08-16 18:31:37 -07:00
Junio C Hamano
44c48a909a diff --follow: do call diffcore_std() as necessary
Usually, diff frontends populate the output queue with filepairs without
any rename information and call diffcore_std() to sort the renames out.
When --follow is in effect, however, diff-tree family of frontend has a
hack that looks like this:

    diff-tree frontend
    -> diff_tree_sha1()
       . populate diff_queued_diff
       . if --follow is in effect and there is only one change that
         creates the target path, then
       -> try_to_follow_renames()
	  -> diff_tree_sha1() with no pathspec but with -C
	  -> diffcore_std() to find renames
	  . if rename is found, tweak diff_queued_diff and put a
	    single filepair that records the found rename there
    -> diffcore_std()
       . tweak elements on diff_queued_diff by
       - rename detection
       - path ordering
       - pickaxe filtering

We need to skip parts of the second call to diffcore_std() that is related
to rename detection, and do so only when try_to_follow_renames() did find
a rename.  Earlier 1da6175 (Make diffcore_std only can run once before a
diff_flush, 2010-05-06) tried to deal with this issue incorrectly; it
unconditionally disabled any second call to diffcore_std().

This hopefully fixes the breakage.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-08-13 12:17:45 -07:00
Jakub Narebski
d8faea9d18 diff: strip extra "/" when stripping prefix
There are two ways a user might want to use "diff --relative":

  1. For a file in a directory, like "subdir/file", the user
     can use "--relative=subdir/" to strip the directory.

  2. To strip part of a filename, like "foo-10", they can
     use "--relative=foo-".

We currently handle both of those situations. However, if the user passes
"--relative=subdir" (without the trailing slash), we produce inconsistent
results. For the unified diff format, we collapse the double-slash of
"a//file" correctly into "a/file". But for other formats (raw, stat,
name-status), we end up with "/file".

We can do what the user means here and strip the extra "/" (and only a
slash).  We are not hurting any existing users of (2) above with this
behavior change because the existing output for this case was nonsensical.

Patch by Jakub, tests and commit message by Jeff King.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-08-11 09:46:47 -07:00
Johannes Schindelin
be4f2b408e Add the 'diff.ignoreSubmodules' config setting
When you have a lot of submodules checked out, the time penalty to check
for dirty submodules can easily imply a multiplication of the total time
by the factor 20. This makes the difference between almost instantaneous
(< 2 seconds) and unbearably slow (> 50 seconds) here, since the disk
caches are constantly overloaded.

To this end, the submodule.*.ignore config option was introduced, but it
is per-submodule.

This commit introduces a global config setting to set a default
(porcelain) value for the --ignore-submodules option, keeping the
default at 'none'. It can be overridden by the submodule.*.ignore
setting and by the --ignore-submodules option.

Incidentally, this commit fixes an issue with the overriding logic:
multiple --ignore-submodules options would not clear the previously
set flags.

While at it, fix a typo in the documentation for submodule.*.ignore.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-08-09 09:11:50 -07:00
Jens Lehmann
aee9c7d654 Submodules: Add the new "ignore" config option for diff and status
The new "ignore" config option controls the default behavior for "git
status" and the diff family. It specifies under what circumstances they
consider submodules as modified and can be set separately for each
submodule.

The command line option "--ignore-submodules=" has been extended to accept
the new parameter "none" for both status and diff.

Users that chose submodules to get rid of long work tree scanning times
might want to set the "dirty" option for those submodules. This brings
back the pre 1.7.0 behavior, where submodule work trees were never
scanned for modifications. By using "--ignore-submodules=none" on the
command line the status and diff commands can be told to do a full scan.

This option can be set to the following values (which have the same name
and meaning as for the "--ignore-submodules" option of status and diff):

"all": All changes to the submodule will be ignored.

"dirty": Only differences of the commit recorded in the superproject and
	the submodules HEAD will be considered modifications, all changes
	to the work tree of the submodule will be ignored. When using this
	value, the submodule will not be scanned for work tree changes at
	all, leading to a performance benefit on large submodules.

"untracked": Only untracked files in the submodules work tree are ignored,
	a changed HEAD and/or modified files in the submodule will mark it
	as modified.

"none" (which is the default): Either untracked or modified files in a
	submodules work tree or a difference between the subdmodules HEAD
	and the commit recorded in the superproject will make it show up
	as changed. This value is added as a new parameter for the
	"--ignore-submodules" option of the diff family and "git status"
	so the user can override the settings in the configuration.

Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-08-09 09:01:52 -07:00
Matthieu Moy
1e57208ef0 diff: parse separate options --stat-width n, --stat-name-width n
Part of a campaign for unstuck forms of options.

[jn: with some refactoring]

Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-08-06 09:14:36 -07:00
Jonathan Nieder
4d7f7a4ae7 diff: split off a function for --stat-* option parsing
As an optimization, the diff_opt_parse() switchboard has
a single case for all the --stat-* options.  Split it
off into a separate function so we can enhance it
without bringing code dangerously close to the right
margin.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-08-06 09:14:28 -07:00
Matthieu Moy
dea007fb4c diff: parse separate options like -S foo
Change the option parsing logic in revision.c to accept separate forms
like `-S foo' in addition to `-Sfoo'. The rest of git already accepted
this form, but revision.c still used its own option parsing.

Short options affected are -S<string>, -l<num> and -O<orderfile>, for
which an empty string wouldn't make sense, hence -<option> <arg> isn't
ambiguous.

This patch does not handle --stat-name-width and --stat-width, which are
special-cases where diff_long_opt do not apply. They are handled in a
separate patch to ease review.

Original patch by Matthieu Moy, plus refactoring by Jonathan Nieder.

Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-08-06 09:14:22 -07:00
Junio C Hamano
bb89e84f95 Merge branch 'sv/maint-diff-q-clear-fix' into maint
* sv/maint-diff-q-clear-fix:
  Fix DIFF_QUEUE_CLEAR refactoring
2010-08-03 15:17:34 -07:00
Junio C Hamano
ee38d823f7 Fix DIFF_QUEUE_CLEAR refactoring
It introduced a macro to reduce repeated assignments to three fields,
but an unrelated and incorrect change snuck in by mistake, which broke
commands like "git diff-files -p --submodule".

Noticed by Sven Verdoolaege.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-08-02 08:30:02 -07:00
Bo Yang
e13f38a33e diff.c: fix a graph output bug
When --graph is in effect, the line-prefix typically has colored graph
line segments and ends with reset.  The color sequence "set" given to
this function is for showing the metainfo part of the patch text and
(1) it should not be applied to the graph lines, and (2) it will be
reset at the end of line_prefix so it won't be in effect anyway.

Signed-off-by: Bo Yang <struggleyb.nku@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-07-08 18:09:14 -07:00
Junio C Hamano
a76b2084fb Merge branch 'jl/status-ignore-submodules'
* jl/status-ignore-submodules:
  Add the option "--ignore-submodules" to "git status"
  git submodule: ignore dirty submodules for summary and status

Conflicts:
	builtin/commit.c
	t/t7508-status.sh
	wt-status.c
	wt-status.h
2010-06-30 11:55:39 -07:00
Junio C Hamano
e1165dd144 Merge branch 'jl/maint-diff-ignore-submodules'
* jl/maint-diff-ignore-submodules:
  t4027,4041: Use test -s to test for an empty file
  Add optional parameters to the diff option "--ignore-submodules"
  git diff: rename test that had a conflicting name
2010-06-30 11:55:37 -07:00
Junio C Hamano
4af574dbdc Merge branch 'ab/blame-textconv'
* ab/blame-textconv:
  t/t8006: test textconv support for blame
  textconv: support for blame
  textconv: make the API public

Conflicts:
	diff.h
2010-06-27 12:07:44 -07:00
Jens Lehmann
46a958b3da Add the option "--ignore-submodules" to "git status"
In some use cases it is not desirable that "git status" considers
submodules that only contain untracked content as dirty. This may happen
e.g. when the submodule is not under the developers control and not all
build generated files have been added to .gitignore by the upstream
developers. Using the "untracked" parameter for the "--ignore-submodules"
option disables checking for untracked content and lets git diff report
them as changed only when they have new commits or modified content.

Sometimes it is not wanted to have submodules show up as changed when they
just contain changes to their work tree (this was the behavior before
1.7.0). An example for that are scripts which just want to check for
submodule commits while ignoring any changes to the work tree. Also users
having large submodules known not to change might want to use this option,
as the - sometimes substantial - time it takes to scan the submodule work
tree(s) is saved when using the "dirty" parameter.

And if you want to ignore any changes to submodules, you can now do that
by using this option without parameters or with "all" (when the config
option status.submodulesummary is set, using "all" will also suppress the
output of the submodule summary).

A new function handle_ignore_submodules_arg() is introduced to parse this
option new to "git status" in a single location, as "git diff" already
knew it.

Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-06-25 11:30:25 -07:00
Junio C Hamano
262657dce6 Merge branch 'maint'
* maint:
  Update draft release notes to 1.7.1.1
  tests: remove unnecessary '^' from 'expr' regular expression

Conflicts:
	diff.c
2010-06-22 09:35:36 -07:00
Junio C Hamano
3c656899cd Merge branch 'cc/maint-diff-CC-binary' into maint
* cc/maint-diff-CC-binary:
  diff: fix "git show -C -C" output when renaming a binary file

Conflicts:
	diff.c
2010-06-22 09:27:07 -07:00
Junio C Hamano
cb2af93ac1 Merge branch 'bw/diff-metainfo-color' into maint
* bw/diff-metainfo-color:
  diff: fix coloring of extended diff headers
2010-06-21 05:40:10 -07:00
Junio C Hamano
60335534a6 Merge branch 'rs/diff-no-minimal' into maint
* rs/diff-no-minimal:
  git diff too slow for a file
2010-06-21 05:38:50 -07:00
Junio C Hamano
5977744d04 Merge branch 'cc/maint-diff-CC-binary'
* cc/maint-diff-CC-binary:
  diff: fix "git show -C -C" output when renaming a binary file

Conflicts:
	diff.c
2010-06-18 11:16:57 -07:00
Junio C Hamano
98ad90fbab Merge branch 'by/diff-graph'
* by/diff-graph:
  Make --color-words work well with --graph
  graph.c: register a callback for graph output
  Emit a whole line in one go
  diff.c: Output the text graph padding before each diff line
  Output the graph columns at the end of the commit message
  Add a prefix output callback to diff output

Conflicts:
	diff.c
2010-06-18 11:16:57 -07:00
Junio C Hamano
18fd805583 Merge branch 'jh/diff-index-line-abbrev'
* jh/diff-index-line-abbrev:
  diff.c: Ensure "index $from..$to" line contains unambiguous SHA1s

Conflicts:
	diff.c
2010-06-18 11:16:56 -07:00
Junio C Hamano
2621ac50cc Merge branch 'ec/diff-noprefix-config'
* ec/diff-noprefix-config:
  diff: add configuration option for disabling diff prefixes.
2010-06-18 11:16:55 -07:00
Junio C Hamano
448598b508 Merge branch 'bw/diff-metainfo-color'
* bw/diff-metainfo-color:
  diff: fix coloring of extended diff headers
2010-06-13 11:21:25 -07:00
Junio C Hamano
39b5977b13 Merge branch 'rs/diff-no-minimal'
* rs/diff-no-minimal:
  git diff too slow for a file
2010-06-13 11:20:46 -07:00
Jens Lehmann
dd44d419d3 Add optional parameters to the diff option "--ignore-submodules"
In some use cases it is not desirable that the diff family considers
submodules that only contain untracked content as dirty. This may happen
e.g. when the submodule is not under the developers control and not all
build generated files have been added to .gitignore by the upstream
developers. Using the "untracked" parameter for the "--ignore-submodules"
option disables checking for untracked content and lets git diff report
them as changed only when they have new commits or modified content.

Sometimes it is not wanted to have submodules show up as changed when they
just contain changes to their work tree. An example for that are scripts
which just want to check for submodule commits while ignoring any changes
to the work tree. Also users having large submodules known not to change
might want to use this option, as the - sometimes substantial - time it
takes to scan the submodule work tree(s) is saved.

Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-06-11 13:33:17 -07:00
Axel Bonnet
a788d7d58b textconv: make the API public
The textconv functionality allows one to convert a file into text before
running diff. But this functionality can be useful to other features
such as blame.

Signed-off-by: Axel Bonnet <axel.bonnet@ensimag.imag.fr>
Signed-off-by: Clément Poulain <clement.poulain@ensimag.imag.fr>
Signed-off-by: Diane Gasselin <diane.gasselin@ensimag.imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-06-11 13:17:57 -07:00
Christian Couder
296c6bb21a diff: fix "git show -C -C" output when renaming a binary file
A bug was introduced in 3e97c7c6af
(No diff -b/-w output for all-whitespace changes, Nov 19 2009)
that made the lines:

  diff --git a/bar b/sub/bar
  similarity index 100%
  rename from bar
  rename to sub/bar

disappear from "git show -C -C" output when file bar is a binary
file.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-06-06 15:14:27 -07:00
Bo Yang
4297c0aeb5 Make --color-words work well with --graph
'--color-words' algorithm can be described as:

  1. collect a the minus/plus lines of a diff hunk, divided into
     minus-lines and plus-lines;

  2. break both minus-lines and plus-lines into words and
     place them into two mmfile_t with one word for each line;

  3. use xdiff to run diff on the two mmfile_t to get the words level diff;

And for the common parts of the both file, we output the plus side text.
diff_words->current_plus is used to trace the current position of the plus file
which printed. diff_words->last_minus is used to trace the last minus word
printed.

For '--graph' to work with '--color-words', we need to output the graph prefix
on each line of color words output. Generally, there are two conditions on
which we should output the prefix.

  1. diff_words->last_minus == 0 &&
     diff_words->current_plus == diff_words->plus.text.ptr

     that is: the plus text must start as a new line, and if there is no minus
     word printed, a graph prefix must be printed.

  2. diff_words->current_plus > diff_words->plus.text.ptr &&
     *(diff_words->current_plus - 1) == '\n'

     that is: a graph prefix must be printed following a '\n'

Signed-off-by: Bo Yang <struggleyb.nku@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-05-31 18:02:20 -07:00
Bo Yang
2efcc97764 Emit a whole line in one go
Since the graph prefix will be printed when calling
emit_line, so the functions should be used to emit a
complete line out once a time. No one should call
emit_line to just output some strings instead of a
complete line.
Use a strbuf to compose the whole line, and then
call emit_line to output it once.

Signed-off-by: Bo Yang <struggleyb.nku@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-05-31 18:02:04 -07:00
Bo Yang
7be5761073 diff.c: Output the text graph padding before each diff line
Change output from diff with -p/--dirstat/--binary/--numstat/--stat/
--shortstat/--check/--summary options to align with graph paddings.

Thanks Jeff King <peff@peff.net> for reporting the '--summary' bug and
his initial patch.

Signed-off-by: Bo Yang <struggleyb.nku@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-05-31 18:00:21 -07:00
Bo Yang
a3c158d4a5 Add a prefix output callback to diff output
The callback can be used to add some prefix string to each line of
diff output.

Signed-off-by: Bo Yang <struggleyb.nku@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-05-31 18:00:21 -07:00
Johan Herland
3e5a188f1d diff.c: Ensure "index $from..$to" line contains unambiguous SHA1s
In the metainfo section of git diffs there's an "index" line providing
abbreviated (unless --full-index is used) blob SHA1s from the
pre-/post-images used to generate the diff. These provide hints that
can be used to reconstruct a 3-way merge when applying the patch
(see the --3way option to 'git am' for more details).

In order for this to work, however, the blob SHA1s must not be
abbreviated into ambiguity.

This patch eliminates the possible ambiguity by using find_unique_abbrev()
to produce the abbreviated SHA1s (instead of blind abbreviation by way of
"%.*s").

A testcase verifying the fix is also included.

Signed-off-by: Johan Herland <johan@herland.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-05-31 17:44:01 -07:00
Junio C Hamano
82c531b3b6 Merge branch 'by/log-follow'
* by/log-follow:
  tests: rename duplicate t4205
  Make git log --follow find copies among unmodified files.
  Make diffcore_std only can run once before a diff_flush
  Add a macro DIFF_QUEUE_CLEAR.
2010-05-21 04:02:23 -07:00
Junio C Hamano
1bdd46cd3a Merge branch 'tr/word-diff'
* tr/word-diff:
  diff: add --word-diff option that generalizes --color-words

Conflicts:
	diff.c
2010-05-21 04:02:17 -07:00
Bert Wesarg
374664478f diff: fix coloring of extended diff headers
Coloring the extended headers where done as a whole not per line. less with
option -R (which is the default from git) does not support this coloring
mode because of performance reasons. The -r option would be an alternative
but has problems with lines that are longer than the screen. Therefore
stick to the idiom to color each line separately. The problem is, that the
result of ill_metainfo() will also be used as an parameter to an external
diff driver, so we need to disable coloring in this case.

Because coloring is now done inside fill_metainfo() we can simply add this
string to the diff header and therefore keep the last newline in the
extended header. This results also into the fact that the external diff
driver now gets this last newline too. Which is a change in behavior
but a good one.

Signed-off-by: Bert Wesarg <bert.wesarg@googlemail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-05-19 21:06:40 -07:00
Will Palmer
1c9eecff97 diff-options: make --patch a synonym for -p
Here we simply make --patch a synonym for -p, whose mnemonic was "patch"
all along.

Signed-off-by: Will Palmer <wmpalmer@gmail.com>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-05-18 21:50:03 -07:00
Eli Collins
f89504ddb9 diff: add configuration option for disabling diff prefixes.
With new configuration "diff.noprefix", "git diff" does not show a source or destination prefix ala "git diff --no-prefix".

Signed-off-by: Eli Collins <eli@cloudera.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-05-18 21:31:51 -07:00
Junio C Hamano
dd75d07899 Merge branch 'jk/cached-textconv'
* jk/cached-textconv:
  diff: avoid useless filespec population
  diff: cache textconv output
  textconv: refactor calls to run_textconv
  introduce notes-cache interface
  make commit_tree a library function
2010-05-08 22:33:08 -07:00
Bo Yang
1da6175d43 Make diffcore_std only can run once before a diff_flush
When file renames/copies detection is turned on, the
second diffcore_std will degrade a 'C' pair to a 'R' pair.

And this may happen when we run 'git log --follow' with
hard copies finding. That is, the try_to_follow_renames()
will run diffcore_std to find the copies, and then
'git log' will issue another diffcore_std, which will reduce
'src->rename_used' and recognize this copy as a rename.
This is not what we want.

So, I think we really don't need to run diffcore_std more
than one time.

Signed-off-by: Bo Yang <struggleyb.nku@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-05-07 09:34:28 -07:00
Bo Yang
9ca5df9061 Add a macro DIFF_QUEUE_CLEAR.
Refactor the diff_queue_struct code, this macro help
to reset the structure.

Signed-off-by: Bo Yang <struggleyb.nku@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-05-07 09:34:27 -07:00
Junio C Hamano
6b6f5d4664 Merge branch 'maint-1.7.0' into maint
* maint-1.7.0:
  remove ecb parameter from xdi_diff_outf()
2010-05-04 15:20:47 -07:00
René Scharfe
dfea79004c remove ecb parameter from xdi_diff_outf()
xdi_diff_outf() overrides the structure members of its last parameter,
ignoring any value that callers pass in.  It's no surprise then that all
callers pass a pointer to an uninitialized structure.  They also don't
read it after the call, so the parameter is neither used for input nor
for output.   Turn it into a local variable of xdi_diff_outf().

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-05-04 15:19:14 -07:00
René Scharfe
582aa00bdf git diff too slow for a file
Ever since the xdiff library had been introduced to git, all its callers
have used the flag XDF_NEED_MINIMAL.  It makes sure that the smallest
possible diff is produced, but that takes quite some time if there are
lots of differences that can be expressed in multiple ways.

This flag makes a difference for only 0.1% of the non-merge commits in
the git repo of Linux, both in terms of diff size and execution time.
The patches there are mostly nice and small.

SungHyun Nam however reported a case in a different repo where a diff
took more than 20 times longer to generate with XDF_NEED_MINIMAL than
without.  Rebasing became really slow.

This patch removes this flag from all callers.  The default of xdiff is
saner because it has minimal to no impact in the normal case of small
diffs and doesn't incur that much of a speed penalty for large ones.

A follow-up patch may introduce a command line option to set the flag if
the user needs it, similar to GNU diff's -d/--minimal.

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-05-02 07:59:50 -07:00
Junio C Hamano
4fd8145c0c Merge branch 'jk/maint-diffstat-overflow' into maint
* jk/maint-diffstat-overflow:
  diff: use large integers for diffstat calculations
2010-04-22 22:29:13 -07:00
Junio C Hamano
bc32d342c2 Merge branch 'jk/maint-diffstat-overflow'
* jk/maint-diffstat-overflow:
  diff: use large integers for diffstat calculations
2010-04-18 21:31:50 -07:00
Jeff King
0974c117ff diff: use large integers for diffstat calculations
The diffstat "added" and "changed" fields generally store
line counts; however, for binary files, they store file
sizes. Since we store and print these values as ints, a
diffstat on a file larger than 2G can show a negative size.
Instead, let's use uintmax_t, which should be at least 64
bits on modern platforms.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-04-17 11:30:21 -07:00
Thomas Rast
882749a04f diff: add --word-diff option that generalizes --color-words
This teaches the --color-words engine a more general interface that
supports two new modes:

* --word-diff=plain, inspired by the 'wdiff' utility (most similar to
  'wdiff -n <old> <new>'): uses delimiters [-removed-] and {+added+}

* --word-diff=porcelain, which generates an ad-hoc machine readable
  format:
  - each diff unit is prefixed by [-+ ] and terminated by newline as
    in unified diff
  - newlines in the input are output as a line consisting only of a
    tilde '~'

Both of these formats still support color if it is enabled, using it
to highlight the differences.  --color-words becomes a synonym for
--word-diff=color, which is the color-only format.  Also adds some
compatibility/convenience options.

Thanks to Junio C Hamano and Miles Bader for good ideas.

Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-04-14 10:56:53 -07:00
Junio C Hamano
daaf2e8892 Merge branch 'jc/conflict-marker-size' into maint
* jc/conflict-marker-size:
  diff --check: honor conflict-marker-size attribute
2010-04-09 22:38:34 -07:00
Junio C Hamano
7ec1eb93f7 Merge early parts of jk/cached-textconv 2010-04-08 23:31:51 -07:00
Junio C Hamano
aed6ca52e7 diff.c: work around pointer constness warnings
The textconv leak fix introduced two invocations of free() to release
memory pointed by "const char *", which get annoying compiler warning.
2010-04-08 23:30:49 -07:00
Junio C Hamano
3f3f8d9d09 Merge branch 'jc/conflict-marker-size'
* jc/conflict-marker-size:
  diff --check: honor conflict-marker-size attribute
2010-04-06 14:50:46 -07:00
Jeff King
b337398266 diff: avoid useless filespec population
builtin_diff calls fill_mmfile fairly early, which in turn
calls diff_populate_filespec, which actually retrieves the
file's blob contents into a buffer. Long ago, this was
sensible as we would need to look at the blobs eventually.

These days, however, we may not ever want those blobs if we
end up using a textconv cache, and for large binary files
(exactly the sort for which you might have a textconv
cache), just retrieving the objects can be costly.

This patch just pushes the fill_mmfile call a bit later, so
we can avoid populating the filespec in some cases.  There
is one thing to note that looks like a bug but isn't. We
push the fill_mmfile down into the first branch of a
conditional. It seems like we would need it on the other
branch, too, but we don't; fill_textconv does it for us (in
fact, before this, we were just writing over the results of
the fill_mmfile on that branch).

Here's a timing sample on a commit with 45 changed jpgs and
avis. The result is fully textconv cached, but we still
wasted a lot of time just pulling the blobs from storage.
The total size of the blobs (source and dest) is about
180M.

  [before]
  $ time git show >/dev/null
  real    0m0.352s
  user    0m0.148s
  sys     0m0.200s

  [after]
  $ time git show >/dev/null
  real    0m0.009s
  user    0m0.004s
  sys     0m0.004s

And that's on a warm cache. On a cold cache, the "after"
case is not much worse, but the "before" case has to do an
extra 180M of I/O.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-04-02 00:11:20 -07:00
Jeff King
d9bae1a178 diff: cache textconv output
Running a textconv filter can take a long time. It's
particularly bad for a large file which needs to be spooled
to disk, but even for small files, the fork+exec overhead
can add up for something like "git log -p".

This patch uses the notes-cache mechanism to keep a fast
cache of textconv output. Caches are stored in
refs/notes/textconv/$x, where $x is the userdiff driver
defined in gitattributes.

Caching is enabled only if diff.$x.cachetextconv is true.

In my test repo, on a commit with 45 jpg and avi files
changed and a textconv to show their exif tags:

  [before]
  $ time git show >/dev/null
  real    0m13.724s
  user    0m12.057s
  sys     0m1.624s

  [after, first run]
  $ git config diff.mfo.cachetextconv true
  $ time git show >/dev/null
  real    0m14.252s
  user    0m12.197s
  sys     0m1.800s

  [after, subsequent runs]
  $ time git show >/dev/null
  real    0m0.352s
  user    0m0.148s
  sys     0m0.200s

So for a slight (3.8%) cost on the first run, we achieve an
almost 40x speed up on subsequent runs.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-04-02 00:05:31 -07:00
Jeff King
840383b2c2 textconv: refactor calls to run_textconv
This patch adds a fill_textconv wrapper, which centralizes
some minor logic like error checking and handling the case
of no-textconv.

In addition to dropping the number of lines, this will make
it easier in future patches to handle multiple types of
textconv.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-04-02 00:01:57 -07:00
Jeff King
b76c056b95 fix textconv leak in emit_rewrite_diff
We correctly free() for the normal diff case, but leak for
rewrite diffs.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-04-01 23:49:29 -07:00
Junio C Hamano
890a13a452 Sync with 1.7.0.4
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-03-31 15:14:27 -07:00
Johannes Sixt
da1fbed3ff diff: fix textconv error zombies
To make the code simpler, run_textconv lumps all of its
error checking into one conditional. However, the
short-circuit means that an error in reading will prevent us
from calling finish_command, leaving a zombie child.
Clean up properly after errors.

Based-on-work-by: Jeff King <peff@peff.net>
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-03-30 14:46:33 -07:00
Junio C Hamano
a757c646ee diff --check: honor conflict-marker-size attribute
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-03-24 19:35:34 -07:00
Junio C Hamano
b6a7a06aa6 Merge branch 'jl/submodule-diff-dirtiness'
* jl/submodule-diff-dirtiness:
  git status: ignoring untracked files must apply to submodules too
  git status: Fix false positive "new commits" output for dirty submodules
  Refactor dirty submodule detection in diff-lib.c
  git status: Show detailed dirty status of submodules in long format
  git diff --submodule: Show detailed dirty status of submodules
2010-03-24 16:25:43 -07:00
Jens Lehmann
85adbf2f75 git status: Fix false positive "new commits" output for dirty submodules
Testing if the output "new commits" should appear in the long format of
"git status" is done by comparing the hashes of the diffpair. This always
resulted in printing "new commits" for submodules that contained untracked
or modified content, even if they did not contain new commits. The reason
was that match_stat_with_submodule() did set the "changed" flag for dirty
submodules, resulting in two->sha1 being set to the null_sha1 at the call
sites, which indicates that new commits are present. This is changed so
that when no new commits are present, the same object names are in the
sha1 field for both sides of the filepair, and the working tree side will
have the "dirty_submodule" flag set when appropriate. For a submodule to
be seen as modified even when it just has a dirty work tree, some
conditions had to be extended to also check for the "dirty_submodule"
flag.

Unfortunately the test case that should have found this bug had been
changed incorrectly too. It is fixed and extended to test for other
combinations too.

Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-03-12 22:17:24 -08:00
Jens Lehmann
ae6d5c1b6f Refactor dirty submodule detection in diff-lib.c
Moving duplicated code into the new function match_stat_with_submodule().
Replacing the implicit activation of detailed checks for the dirtiness of
submodules when DIFF_FORMAT_PATCH was selected with explicitly setting
the recently added DIFF_OPT_DIRTY_SUBMODULES option in diff_setup_done().

Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-03-12 22:17:17 -08:00
Junio C Hamano
8cc3709df0 Merge branch 'ld/maint-diff-quiet-w' into maint
* ld/maint-diff-quiet-w:
  git-diff: add a test for git diff --quiet -w
  git diff --quiet -w: check and report the status
2010-03-04 22:26:39 -08:00
Junio C Hamano
36420805a7 Merge branch 'ld/maint-diff-quiet-w'
* ld/maint-diff-quiet-w:
  git-diff: add a test for git diff --quiet -w
  git diff --quiet -w: check and report the status
2010-03-02 12:44:10 -08:00
Junio C Hamano
c06951894a Merge branch 'ml/color-when'
* ml/color-when:
  Add an optional argument for --color options
2010-03-02 12:44:06 -08:00
Mark Lodato
73e9da0196 Add an optional argument for --color options
Make git-branch, git-show-branch, git-grep, and all the diff-based
programs accept an optional argument <when> for --color.  The argument
is a colorbool: "always", "never", or "auto".  If no argument is given,
"always" is used;  --no-color is an alias for --color=never.  This makes
the command-line interface consistent with other GNU tools, such as `ls'
and `grep', and with the git-config color options.  Note that, without
an argument, --color and --no-color work exactly as before.

To implement this, two internal changes were made:

1. Allow the first argument of git_config_colorbool() to be NULL,
   in which case it returns -1 if the argument isn't "always", "never",
   or "auto".

2. Add OPT_COLOR_FLAG(), OPT__COLOR(), and parse_opt_color_flag_cb()
   to the option parsing library.  The callback uses
   git_config_colorbool(), so color.h is now a dependency
   of parse-options.c.

Signed-off-by: Mark Lodato <lodatom@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-02-18 17:21:40 -08:00
Junio C Hamano
07cb9a369e Merge branch 'jc/typo' into maint
* jc/typo:
  Typofixes outside documentation area
2010-02-17 14:55:09 -08:00
Junio C Hamano
6d816301cd Merge branch 'jc/typo'
* jc/typo:
  Typofixes outside documentation area
2010-02-16 22:45:14 -08:00
Junio C Hamano
e7b3cea0f7 Merge branch 'maint-1.6.6' into maint
* maint-1.6.6:
  dwim_ref: fix dangling symref warning
  stash pop: remove 'apply' options during 'drop' invocation
  diff: make sure --output=/bad/path is caught
  Remove hyphen from "git-command" in two error messages
2010-02-16 15:05:02 -08:00
Junio C Hamano
eb0bcd0fbe Merge branch 'maint-1.6.5' into maint-1.6.6
* maint-1.6.5:
  dwim_ref: fix dangling symref warning
  stash pop: remove 'apply' options during 'drop' invocation
  diff: make sure --output=/bad/path is caught
2010-02-16 15:04:55 -08:00
Larry D'Anna
6977c250ac git diff --quiet -w: check and report the status
The option -w tells the diff machinery to inspect the contents to set the
exit status, instead of checking the blob object level difference alone.
However, --quiet tells the diff machinery not to look at the contents, which
means DIFF_FROM_CONTENTS has no chance to inspect the change.

Work it around by calling diff_flush_patch() with output sent to /dev/null.

Signed-off-by: Larry D'Anna <larry@elder-gods.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-02-15 23:04:34 -08:00
Larry D'Anna
8324b977ae diff: make sure --output=/bad/path is caught
The return value from fopen wasn't being checked.

Signed-off-by: Larry D'Anna <larry@elder-gods.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-02-15 21:46:01 -08:00
Junio C Hamano
9517e6b843 Typofixes outside documentation area
begining -> beginning
    canonicalizations -> canonicalization
    comand -> command
    dewrapping -> unwrapping
    dirtyness -> dirtiness
    DISCLAMER -> DISCLAIMER
    explicitely -> explicitly
    feeded -> fed
    impiled -> implied
    madatory -> mandatory
    mimick -> mimic
    preceeding -> preceding
    reqeuest -> request
    substition -> substitution

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-02-03 21:28:17 -08:00
Junio C Hamano
d539de9f25 Merge branch 'jl/diff-submodule-ignore'
* jl/diff-submodule-ignore:
  Teach diff --submodule that modified submodule directory is dirty
  git diff: Don't test submodule dirtiness with --ignore-submodules
  Make ce_uptodate() trustworthy again
2010-01-26 22:53:13 -08:00
Jens Lehmann
721ceec1ad Teach diff --submodule that modified submodule directory is dirty
Since commit 8e08b4 git diff does append "-dirty" to the work tree side
if the working directory of a submodule contains new or modified files.
Lets do the same when the --submodule option is used.

Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-01-24 21:04:31 -08:00
Junio C Hamano
026680f881 Merge branch 'jc/fix-tree-walk'
* jc/fix-tree-walk:
  read-tree --debug-unpack
  unpack-trees.c: look ahead in the index
  unpack-trees.c: prepare for looking ahead in the index
  Aggressive three-way merge: fix D/F case
  traverse_trees(): handle D/F conflict case sanely
  more D/F conflict tests
  tests: move convenience regexp to match object names to test-lib.sh

Conflicts:
	builtin-read-tree.c
	unpack-trees.c
	unpack-trees.h
2010-01-24 17:35:58 -08:00
Junio C Hamano
c6ec7efdd4 Merge branch 'jl/submodule-diff'
* jl/submodule-diff:
  Performance optimization for detection of modified submodules
  git status: Show uncommitted submodule changes too when enabled
  Teach diff that modified submodule directory is dirty
  Show submodules as modified when they contain a dirty work tree
2010-01-22 16:08:10 -08:00
Junio C Hamano
03c6e97f80 Merge branch 'maint-1.6.4' into maint-1.6.5
* maint-1.6.4:
  Fix mis-backport of t7002
  base85: Make the code more obvious instead of explaining the non-obvious
  base85: encode_85() does not use the decode table
  base85 debug code: Fix length byte calculation
  checkout -m: do not try to fall back to --merge from an unborn branch
  branch: die explicitly why when calling "git branch [-a|-r] branchname".
  textconv: stop leaking file descriptors
  commit: --cleanup is a message option
  git count-objects: handle packs bigger than 4G
  t7102: make the test fail if one of its check fails
2010-01-18 21:37:12 -08:00
Junio C Hamano
814035c12a Merge branch 'maint-1.6.3' into maint-1.6.4
* maint-1.6.3:
  base85: Make the code more obvious instead of explaining the non-obvious
  base85: encode_85() does not use the decode table
  base85 debug code: Fix length byte calculation
  checkout -m: do not try to fall back to --merge from an unborn branch
  branch: die explicitly why when calling "git branch [-a|-r] branchname".
  textconv: stop leaking file descriptors
  commit: --cleanup is a message option
  git count-objects: handle packs bigger than 4G
  t7102: make the test fail if one of its check fails

Conflicts:
	builtin-commit.c
2010-01-18 21:37:06 -08:00
Junio C Hamano
011c181cc6 Merge branch 'maint-1.6.2' into maint-1.6.3
* maint-1.6.2:
  base85: Make the code more obvious instead of explaining the non-obvious
  base85: encode_85() does not use the decode table
  base85 debug code: Fix length byte calculation
  checkout -m: do not try to fall back to --merge from an unborn branch
  branch: die explicitly why when calling "git branch [-a|-r] branchname".
  textconv: stop leaking file descriptors
  commit: --cleanup is a message option
  git count-objects: handle packs bigger than 4G
  t7102: make the test fail if one of its check fails

Conflicts:
	diff.c
2010-01-18 21:29:47 -08:00
Jens Lehmann
e3d42c4773 Performance optimization for detection of modified submodules
In the worst case is_submodule_modified() got called three times for
each submodule. The information we got from scanning the whole
submodule tree the first time can be reused instead.

New parameters have been added to diff_change() and diff_addremove(),
the information is stored in a new member of struct diff_filespec. Its
value is then reused instead of calling is_submodule_modified() again.

When no explicit "-dirty" is needed in the output the call to
is_submodule_modified() is not necessary when the submodules HEAD
already disagrees with the ref of the superproject, as this alone
marks it as modified. To achieve that, get_stat_data() got an extra
argument.

Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-01-18 17:28:21 -08:00
Junio C Hamano
4fa088209c Merge branch 'jk/run-command-use-shell'
* jk/run-command-use-shell:
  t4030, t4031: work around bogus MSYS bash path conversion
  diff: run external diff helper with shell
  textconv: use shell to run helper
  editor: use run_command's shell feature
  run-command: optimize out useless shell calls
  run-command: convert simple callsites to use_shell
  t0021: use $SHELL_PATH for the filter script
  run-command: add "use shell" option
2010-01-17 15:58:15 -08:00
Junio C Hamano
8e08b4198c Teach diff that modified submodule directory is dirty
A diff run in superproject only compares the name of the commit object
bound at the submodule paths.  When we compare with a work tree and the
checked out submodule directory is dirty (e.g. has either staged or
unstaged changes, or has new files the user forgot to add to the index),
show the work tree side as "dirty".

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-01-16 16:40:56 -08:00
Junio C Hamano
73d66323ac Merge branch 'nd/sparse'
* nd/sparse: (25 commits)
  t7002: test for not using external grep on skip-worktree paths
  t7002: set test prerequisite "external-grep" if supported
  grep: do not do external grep on skip-worktree entries
  commit: correctly respect skip-worktree bit
  ie_match_stat(): do not ignore skip-worktree bit with CE_MATCH_IGNORE_VALID
  tests: rename duplicate t1009
  sparse checkout: inhibit empty worktree
  Add tests for sparse checkout
  read-tree: add --no-sparse-checkout to disable sparse checkout support
  unpack-trees(): ignore worktree check outside checkout area
  unpack_trees(): apply $GIT_DIR/info/sparse-checkout to the final index
  unpack-trees(): "enable" sparse checkout and load $GIT_DIR/info/sparse-checkout
  unpack-trees.c: generalize verify_* functions
  unpack-trees(): add CE_WT_REMOVE to remove on worktree alone
  Introduce "sparse checkout"
  dir.c: export excluded_1() and add_excludes_from_file_1()
  excluded_1(): support exclude files in index
  unpack-trees(): carry skip-worktree bit over in merged_entry()
  Read .gitignore from index if it is skip-worktree
  Avoid writing to buffer in add_excludes_from_file_1()
  ...

Conflicts:
	.gitignore
	Documentation/config.txt
	Documentation/git-update-index.txt
	Makefile
	entry.c
	t/t7002-grep.sh
2010-01-13 11:58:34 -08:00
Junio C Hamano
c5034673fd Merge branch 'maint-1.6.1' into maint-1.6.2
* maint-1.6.1:
  base85: Make the code more obvious instead of explaining the non-obvious
  base85: encode_85() does not use the decode table
  base85 debug code: Fix length byte calculation
  checkout -m: do not try to fall back to --merge from an unborn branch
  branch: die explicitly why when calling "git branch [-a|-r] branchname".
  textconv: stop leaking file descriptors
  commit: --cleanup is a message option
  git count-objects: handle packs bigger than 4G
  t7102: make the test fail if one of its check fails

Conflicts:
	diff.c
2010-01-10 00:49:47 -08:00
Junio C Hamano
730f72840c unpack-trees.c: look ahead in the index
This makes the traversal of index be in sync with the tree traversal.
When unpack_callback() is fed a set of tree entries from trees, it
inspects the name of the entry and checks if the an index entry with
the same name could be hiding behind the current index entry, and

 (1) if the name appears in the index as a leaf node, it is also
     fed to the n_way_merge() callback function;

 (2) if the name is a directory in the index, i.e. there are entries in
     that are underneath it, then nothing is fed to the n_way_merge()
     callback function;

 (3) otherwise, if the name comes before the first eligible entry in the
     index, the index entry is first unpacked alone.

When traverse_trees_recursive() descends into a subdirectory, the
cache_bottom pointer is moved to walk index entries within that directory.

All of these are omitted for diff-index, which does not even want to be
fed an index entry and a tree entry with D/F conflicts.

This fixes 3-way read-tree and exposes a bug in other parts of the system
in t6035, test #5.  The test prepares these three trees:

 O = HEAD^
    100644 blob e69de29bb2    a/b-2/c/d
    100644 blob e69de29bb2    a/b/c/d
    100644 blob e69de29bb2    a/x

 A = HEAD
    100644 blob e69de29bb2    a/b-2/c/d
    100644 blob e69de29bb2    a/b/c/d
    100644 blob 587be6b4c3f93f93c489c0111bba5596147a26cb    a/x

 B = master
    120000 blob a36b77384451ea1de7bd340ffca868249626bc52    a/b
    100644 blob e69de29bb2    a/b-2/c/d
    100644 blob e69de29bb2    a/x

With a clean index that matches HEAD, running

    git read-tree -m -u --aggressive $O $A $B

now yields

    120000 a36b77384451ea1de7bd340ffca868249626bc52 3       a/b
    100644 e69de29bb2 0       a/b-2/c/d
    100644 e69de29bb2 1       a/b/c/d
    100644 e69de29bb2 2       a/b/c/d
    100644 587be6b4c3f93f93c489c0111bba5596147a26cb 0       a/x

which is correct.  "master" created "a/b" symlink that did not exist,
and removed "a/b/c/d" while HEAD did not do touch either path.

Before this series, read-tree did not notice the situation and resolved
addition of "a/b" and removal of "a/b/c/d" independently.  If A = HEAD had
another path "a/b/c/e" added, this merge should conflict but instead it
silently resolved "a/b" and then immediately overwrote it to add
"a/b/c/e", which was quite bogus.

Tests in t1012 start to work with this.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-01-07 15:00:14 -08:00
Jeff King
7ed7fac45a diff: run external diff helper with shell
This is mostly to make it more consistent with the rest of
git, which uses the shell to exec helpers.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-01-05 23:41:51 -08:00
Jeff King
41a457e4f8 textconv: use shell to run helper
Currently textconv helpers are run directly. Running through
the shell is useful because the user can provide a program
with command line arguments, like "antiword -f".

It also makes textconv more consistent with other parts of
git, most of which run their helpers using the shell.

The downside is that textconv helpers with shell
metacharacters (like space) in the filename will be broken.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-01-05 23:41:51 -08:00
Junio C Hamano
9e7ad090fa Merge branch 'maint'
* maint:
  textconv: stop leaking file descriptors
  commit: --cleanup is a message option
  git count-objects: handle packs bigger than 4G
  t7102: make the test fail if one of its check fails
  Documentation: always respect core.worktree if set
2009-12-30 01:25:21 -08:00
Junio C Hamano
b0b3a241e2 Merge branch 'maint-1.6.1' into maint
* maint-1.6.1:
  textconv: stop leaking file descriptors
  commit: --cleanup is a message option
  git count-objects: handle packs bigger than 4G
  t7102: make the test fail if one of its check fails

Conflicts:
	builtin-commit.c
	diff.c
2009-12-30 01:24:12 -08:00
Jeff King
70d7099916 textconv: stop leaking file descriptors
We read the output from textconv helpers over a pipe, but we
never actually closed our end of the pipe after using it.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-12-30 01:22:27 -08:00
Junio C Hamano
648f407017 Merge branch 'gb/1.7.0-diff-whitespace-only-output'
* gb/1.7.0-diff-whitespace-only-output:
  No diff -b/-w output for all-whitespace changes
2009-12-26 14:03:18 -08:00
Junio C Hamano
3cc3fb7df6 Merge branch 'jc/1.7.0-diff-whitespace-only-status'
* jc/1.7.0-diff-whitespace-only-status:
  diff.c: fix typoes in comments
  Make test case number unique
  diff: Rename QUIET internal option to QUICK
  diff: change semantics of "ignore whitespace" options

Conflicts:
	diff.h
2009-12-26 14:03:18 -08:00
Junio C Hamano
a1bb8f45f1 Merge branch 'maint' to sync with 1.6.5.7
* maint:
  Git 1.6.5.7
  worktree: don't segfault with an absolute pathspec without a work tree
  ignore unknown color configuration
  help.autocorrect: do not run a command if the command given is junk
  Illustrate "filter" attribute with an example
2009-12-16 12:47:15 -08:00
Jeff King
8b8e862490 ignore unknown color configuration
When parsing the config file, if there is a value that is
syntactically correct but unused, we generally ignore it.
This lets non-core porcelains store arbitrary information in
the config file, and it means that configuration files can
be shared between new and old versions of git (the old
versions might simply ignore certain configuration).

The one exception to this is color configuration; if we
encounter a color.{diff,branch,status}.$slot variable, we
die if it is not one of the recognized slots (presumably as
a safety valve for user misconfiguration). This behavior
has existed since 801235c (diff --color: use
$GIT_DIR/config, 2006-06-24), but hasn't yet caused a
problem. No porcelain has wanted to store extra colors, and
we once a color area (like color.diff) has been introduced,
we've never changed the set of color slots.

However, that changed recently with the addition of
color.diff.func. Now a user with color.diff.func in their
config can no longer freely switch between v1.6.6 and older
versions; the old versions will complain about the existence
of the variable.

This patch loosens the check to match the rest of
git-config; unknown color slots are simply ignored. This
doesn't fix this particular problem, as the older version
(without this patch) is the problem, but it at least
prevents it from happening again in the future.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-12-16 12:45:16 -08:00
Bert Wesarg
89cb73a19a Give the hunk comment its own color
Inspired by the coloring of quilt.

Introduce a separate color and paint the hunk comment part, i.e. the name
of the function, in a separate color "diff.func" (defaults to plain).

Whitespace between hunk header and hunk comment is printed in plain color.

Signed-off-by: Bert Wesarg <bert.wesarg@googlemail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-11-28 10:05:44 -08:00
Junio C Hamano
06a4755270 emit_line(): don't emit an empty <SET><RESET> followed by a newline
When emit_line() is called with an empty line (but non-zero length, as we
send line terminating LF or CRLF to the function), it used to emit
<SET><RESET> followed by a newline.  Stop the wastefulness.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-11-27 22:33:53 -08:00
Greg Bacon
3e97c7c6af No diff -b/-w output for all-whitespace changes
Change git-diff's whitespace-ignoring modes to generate
output only if a non-empty patch results, which git-apply
rejects.

Update the tests to look for the new behavior.

Signed-off-by: Greg Bacon <gbacon@dbresearch.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-11-20 22:00:36 -08:00
Junio C Hamano
d404a3e1a5 Merge branch 'js/maint-diff-color-words' into maint
* js/maint-diff-color-words:
  diff --color-words: bit of clean-up
  diff --color-words -U0: fix the location of hunk headers
  t4034-diff-words: add a test for word diff without context

Conflicts:
	diff.c
2009-11-16 00:01:56 -08:00
Junio C Hamano
6dbdba00ea Merge branch 'jc/maint-blank-at-eof' into maint
* jc/maint-blank-at-eof:
  diff -B: colour whitespace errors
  diff.c: emit_add_line() takes only the rest of the line
  diff.c: split emit_line() from the first char and the rest of the line
  diff.c: shuffling code around
  diff --whitespace: fix blank lines at end
  core.whitespace: split trailing-space into blank-at-{eol,eof}
  diff --color: color blank-at-eof
  diff --whitespace=warn/error: fix blank-at-eof check
  diff --whitespace=warn/error: obey blank-at-eof
  diff.c: the builtin_diff() deals with only two-file comparison
  apply --whitespace: warn blank but not necessarily empty lines at EOF
  apply --whitespace=warn/error: diagnose blank at EOF
  apply.c: split check_whitespace() into two
  apply --whitespace=fix: detect new blank lines at eof correctly
  apply --whitespace=fix: fix handling of blank lines at the eof
2009-11-15 23:06:34 -08:00
Junio C Hamano
002a9ec005 Merge branch 'js/maint-diff-color-words'
* js/maint-diff-color-words:
  diff --color-words: bit of clean-up
  diff --color-words -U0: fix the location of hunk headers
  t4034-diff-words: add a test for word diff without context

Conflicts:
	diff.c
2009-11-15 16:41:29 -08:00
Junio C Hamano
76fd28283f diff --color-words: bit of clean-up
When we introduced the "word diff" mode, we could have done one of three
things:

 * change fn_out_consume() to "this is called every time a line worth of
   diff becomes ready from the lower-level diff routine.  This function
   knows two sets of helpers (one for line-oriented diff, another for word
   diff), and each set has various functions to be called at certain
   places (e.g. hunk header, context, ...).  The function's role is to
   inspect the incoming line, and dispatch appropriate helpers to produce
   either line- or word- oriented diff output."

 * introduce fn_out_consume_word_diff() that is "this is called every time
   a line worth of diff becomes ready from the lower-level diff routine,
   and here is what we do to prepare word oriented diff using that line."
   without touching fn_out_consume() at all.

 * Do neither of the above, and keep fn_out_consume() to "this is called
   every time a line worth of diff becomes ready from the lower-level diff
   routine, and here is what we do to output line oriented diff using that
   line."  but sprinkle a handful of 'are we in word-diff mode?  if so do
   this totally different thing' at random places.

This patch is to at least abstract the details of "this totally different
thing" out from the main codepath, in order to improve readability.

We can later refactor it by introducing fn_out_consume_word_diff(), taking
the second route above, but that is a separate topic.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-10-31 14:25:15 -07:00
Junio C Hamano
d39d667169 Merge branch 'js/diff-verbose-submodule'
* js/diff-verbose-submodule:
  add tests for git diff --submodule
  Add the --submodule option to the diff option family
2009-10-30 20:16:26 -07:00
Johannes Schindelin
a4ca1465ec diff --color-words -U0: fix the location of hunk headers
Colored word diff without context lines firstly printed all the hunk
headers among each other and then printed the diff.

This was due to the code relying on getting at least one context line at
the end of each hunk, where the colored words would be flushed (it is
done that way to be able to ignore rewrapped lines).

Noticed by Markus Heidelberg.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-10-30 09:42:56 -07:00
Johannes Schindelin
752c0c2492 Add the --submodule option to the diff option family
When you use the option --submodule=log you can see the submodule
summaries inlined in the diff, instead of not-quite-helpful SHA-1 pairs.

The format imitates what "git submodule summary" shows.

To do that, <path>/.git/objects/ is added to the alternate object
databases (if that directory exists).

This option was requested by Jens Lehmann at the GitTogether in Berlin.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-10-19 22:31:00 -07:00
Junio C Hamano
9981c808b4 Merge branch 'jc/maint-blank-at-eof'
* jc/maint-blank-at-eof:
  diff -B: colour whitespace errors
  diff.c: emit_add_line() takes only the rest of the line
  diff.c: split emit_line() from the first char and the rest of the line
  diff.c: shuffling code around
  diff --whitespace: fix blank lines at end
  core.whitespace: split trailing-space into blank-at-{eol,eof}
  diff --color: color blank-at-eof
  diff --whitespace=warn/error: fix blank-at-eof check
  diff --whitespace=warn/error: obey blank-at-eof
  diff.c: the builtin_diff() deals with only two-file comparison
  apply --whitespace: warn blank but not necessarily empty lines at EOF
  apply --whitespace=warn/error: diagnose blank at EOF
  apply.c: split check_whitespace() into two
  apply --whitespace=fix: detect new blank lines at eof correctly
  apply --whitespace=fix: fix handling of blank lines at the eof
2009-10-17 00:11:03 -07:00
Felipe Contreras
2775d92c53 diff.c: stylefix
Essentially; s/type* /type */ as per the coding guidelines.

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-10-11 21:54:44 -07:00
Junio C Hamano
d91ba8fa88 Merge branch 'jc/maint-1.6.0-blank-at-eof' into jc/maint-blank-at-eof
* jc/maint-1.6.0-blank-at-eof:
  diff -B: colour whitespace errors
2009-09-15 11:21:10 -07:00
Junio C Hamano
61c6457e89 Merge branch 'jc/maint-1.6.0-blank-at-eof' (early part) into jc/maint-blank-at-eof
* 'jc/maint-1.6.0-blank-at-eof' (early part):
  diff.c: emit_add_line() takes only the rest of the line
  diff.c: split emit_line() from the first char and the rest of the line
2009-09-15 11:20:46 -07:00
Junio C Hamano
bb35fefbc9 Merge branch 'jc/maint-1.6.0-blank-at-eof' (early part) into jc/maint-blank-at-eof
* 'jc/maint-1.6.0-blank-at-eof' (early part):
  diff.c: shuffling code around
2009-09-15 03:38:30 -07:00
Junio C Hamano
afd9db4173 Merge branch 'jc/maint-1.6.0-blank-at-eof' (early part) into jc/maint-blank-at-eof
* 'jc/maint-1.6.0-blank-at-eof' (early part):
  diff --whitespace: fix blank lines at end
  core.whitespace: split trailing-space into blank-at-{eol,eof}
  diff --color: color blank-at-eof
  diff --whitespace=warn/error: fix blank-at-eof check
  diff --whitespace=warn/error: obey blank-at-eof
  diff.c: the builtin_diff() deals with only two-file comparison
  apply --whitespace: warn blank but not necessarily empty lines at EOF
  apply --whitespace=warn/error: diagnose blank at EOF
  apply.c: split check_whitespace() into two
  apply --whitespace=fix: detect new blank lines at eof correctly
  apply --whitespace=fix: fix handling of blank lines at the eof
2009-09-15 03:28:08 -07:00
Junio C Hamano
7f7ee2ff2d diff -B: colour whitespace errors
We used to send the old and new contents more or less straight out to the
output with only the original "old is red, new is green" colouring.  Now
all the necessary support routines have been prepared, call them with a
line of data at a time from the output code and have them check and color
whitespace errors in exactly the same way as they are called from the low
level diff callback routines.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-09-15 02:41:03 -07:00
Junio C Hamano
018cff7046 diff.c: emit_add_line() takes only the rest of the line
As the first character on the line that is fed to this function is always
"+", it is pointless to send that along with the rest of the line.

This change will make it easier to reuse the logic when emitting the
rewrite diff, as we do not want to copy a line only to add "+"/"-"/" "
immediately before its first character when we produce rewrite diff
output.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-09-15 02:41:02 -07:00
Junio C Hamano
250f79930d diff.c: split emit_line() from the first char and the rest of the line
A new helper function emit_line_0() takes the first line of diff output
(typically "-", " ", or "+") separately from the remainder of the line.
No other functional changes.

This change will make it easier to reuse the logic when emitting the
rewrite diff, as we do not want to copy a line only to add "+"/"-"/" "
immediately before its first character when we produce rewrite diff
output.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-09-15 02:40:51 -07:00
Junio C Hamano
6957eb9a39 diff.c: shuffling code around
Move function, type, and structure definitions for fill_mmfile(),
count_trailing_blank(), check_blank_at_eof(), emit_line(),
new_blank_line_at_eof(), emit_add_line(), sane_truncate_fn, and
emit_callback up in the file, so that they can be refactored into helper
functions and reused by codepath for emitting rewrite patches.

This only moves the lines around to make the next two patches easier to
read.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-09-14 22:18:49 -07:00
Junio C Hamano
d68fe26f3e diff --whitespace: fix blank lines at end
The earlier logic tried to colour any and all blank lines that were added
beyond the last blank line in the original, but this was very wrong.  If
you added 96 blank lines, a non-blank line, and then 3 blank lines at the
end, only the last 3 lines should trigger the error, not the earlier 96
blank lines.

We need to also make sure that the lines are after the last non-blank line
in the postimage as well before deciding to paint them.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-09-14 22:18:36 -07:00
Junio C Hamano
690ed84363 diff --color: color blank-at-eof
Since the coloring logic processed the patch output one line at a time, we
couldn't easily color code the new blank lines at the end of file.

Reuse the adds_blank_at_eof() function to find where the runs of such
blank lines start, keep track of the line number in the preimage while
processing the patch output one line at a time, and paint the new blank
lines that appear after that line to implement this.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-09-04 11:50:27 -07:00
Junio C Hamano
467babf8d0 diff --whitespace=warn/error: fix blank-at-eof check
The "diff --check" logic used to share the same issue as the one fixed for
"git apply" earlier in this series, in that a patch that adds new blank
lines at end could appear as

    @@ -l,5 +m,7 @@$
    _context$
    _context$
    -deleted$
    +$
    +$
    +$
    _$
    _$

where _ stands for SP and $ shows a end-of-line.  Instead of looking at
each line in the patch in the callback, simply count the blank lines from
the end in two versions, and notice the presence of new ones.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-09-04 11:50:27 -07:00
Junio C Hamano
5b5061efd8 diff --whitespace=warn/error: obey blank-at-eof
The "diff --check" code used to conflate trailing-space whitespace error
class with this, but now we have a proper separate error class, we should
check it under blank-at-eof, not trailing-space.

The whitespace error is not about _having_ blank lines at end, but about
adding _new_ blank lines.  To keep the message consistent with what is
given by "git apply", call whitespace_error_string() to generate it,
instead of using a hardcoded custom message.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-09-04 11:50:26 -07:00
Junio C Hamano
b8d9c1a66b diff.c: the builtin_diff() deals with only two-file comparison
The combined diff is implemented in combine_diff() and fn_out_consume()
codepath never has to deal with anything but two-file comparision.

Drop nparents from the emit_callback structure and simplify the code.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-09-04 11:50:26 -07:00
Brian Gianforcaro
eeefa7c90e Style fixes, add a space after if/for/while.
The majority of code in core git appears to use a single
space after if/for/while. This is an attempt to bring more
code to this standard. These are entirely cosmetic changes.

Signed-off-by: Brian Gianforcaro <b.gianfo@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-08-31 23:26:28 -07:00
Jim Meyering
97bf2a0809 diff.c: fix typoes in comments
Should be squashed when we reroll 'next' into the main commit.
2009-08-30 14:13:01 -07:00
Nguyễn Thái Ngọc Duy
b4d1690df1 Teach Git to respect skip-worktree bit (reading part)
grep: turn on --cached for files that is marked skip-worktree
ls-files: do not check for deleted file that is marked skip-worktree
update-index: ignore update request if it's skip-worktree, while still allows removing
diff*: skip worktree version

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-08-23 17:13:32 -07:00
Junio C Hamano
90b1994170 diff: Rename QUIET internal option to QUICK
The option "QUIET" primarily meant "find if we have _any_ difference as
quick as possible and report", which means we often do not even have to
look at blobs if we know the trees are different by looking at the higher
level (e.g. "diff-tree A B").  As a side effect, because there is no point
showing one change that we happened to have found first, it also enables
NO_OUTPUT and EXIT_WITH_STATUS options, making the end result look quiet.

Rename the internal option to QUICK to reflect this better; it also makes
grepping the source tree much easier, as there are other kinds of QUIET
option everywhere.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-07-29 10:22:39 -07:00
Junio C Hamano
f245194f9a diff: change semantics of "ignore whitespace" options
Traditionally, the --ignore-whitespace* options have merely meant to tell
the diff output routine that some class of differences are not worth
showing in the textual diff output, so that the end user has easier time
to review the remaining (presumably more meaningful) changes.  These
options never affected the outcome of the command, given as the exit
status when the --exit-code option was in effect (either directly or
indirectly).

When you have only whitespace changes, however, you might expect

	git diff -b --exit-code

to report that there is _no_ change with zero exit status.

Change the semantics of --ignore-whitespace* options to mean more than
"omit showing the difference in text".

The exit status, when --exit-code is in effect, is computed by checking if
we found any differences at the path level, while diff frontends feed
filepairs to the diffcore engine.  When "ignore whitespace" options are in
effect, we defer this determination until the very end of diffcore
transformation.  We simply do not know until the textual diff is
generated, which comes very late in the pipeline.

When --quiet is in effect, various diff frontends optimize by breaking out
early from the loop that enumerates the filepairs, when we find the first
path level difference; when --ignore-whitespace* is used the above change
automatically disables this optimization.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-07-29 10:22:39 -07:00
Junio C Hamano
128a9d86da Merge branch 'rs/grep-p'
* rs/grep-p:
  grep: simplify -p output
  grep -p: support user defined regular expressions
  grep: add option -p/--show-function
  grep: handle pre context lines on demand
  grep: print context hunk marks between files
  grep: move context hunk mark handling into show_line()
  userdiff: add xdiff_clear_find_func()
2009-07-09 00:59:58 -07:00
Junio C Hamano
dd787c19c4 Merge branch 'tr/die_errno'
* tr/die_errno:
  Use die_errno() instead of die() when checking syscalls
  Convert existing die(..., strerror(errno)) to die_errno()
  die_errno(): double % in strerror() output just in case
  Introduce die_errno() that appends strerror(errno) to die()
2009-07-06 09:39:46 -07:00
René Scharfe
8cfe5f1cd5 userdiff: add xdiff_clear_find_func()
xdiff_set_find_func() is used to set user defined regular expressions
for finding function signatures.  Add xdiff_clear_find_func(), which
frees the memory allocated by the former, making the API complete.

Also, use the new function in diff.c (the only call site of
xdiff_set_find_func()) to clean up after ourselves.

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-07-01 19:16:37 -07:00
Thomas Rast
0721c314a5 Use die_errno() instead of die() when checking syscalls
Lots of die() calls did not actually report the kind of error, which
can leave the user confused as to the real problem.  Use die_errno()
where we check a system/library call that sets errno on failure, or
one of the following that wrap such calls:

  Function              Passes on error from
  --------              --------------------
  odb_pack_keep         open
  read_ancestry         fopen
  read_in_full          xread
  strbuf_read           xread
  strbuf_read_file      open or strbuf_read_file
  strbuf_readlink       readlink
  write_in_full         xwrite

Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-06-27 11:14:53 -07:00
Thomas Rast
d824cbba02 Convert existing die(..., strerror(errno)) to die_errno()
Change calls to die(..., strerror(errno)) to use the new die_errno().

In the process, also make slight style adjustments: at least state
_something_ about the function that failed (instead of just printing
the pathname), and put paths in single quotes.

Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-06-27 11:14:53 -07:00
Junio C Hamano
f4f78e668d Merge branch 'maint'
* maint:
  diff.c: plug a memory leak in an error path
  fetch-pack: close output channel after sideband demultiplexer terminates
  builtin-remote: Make "remote show" display all urls
2009-06-09 00:29:36 -07:00
Johannes Sixt
802f9c9cb2 diff.c: plug a memory leak in an error path
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-06-08 21:19:26 -07:00
David Aguilar
003b33a8ad diff: generate pretty filenames in prep_temp_blob()
Naturally, prep_temp_blob() did not care about filenames.
As a result, GIT_EXTERNAL_DIFF and textconv generated
filenames such as ".diff_XXXXXX".

This modifies prep_temp_blob() to generate user-friendly
filenames when creating temporary files.

Diffing "name.ext" now generates "XXXXXX_name.ext".

Signed-off-by: David Aguilar <davvid@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-05-31 17:57:59 -07:00
Junio C Hamano
2c5942dbae Merge branch 'ar/unlink-err' into maint
* ar/unlink-err:
  print unlink(2) errno in copy_or_link_directory
  replace direct calls to unlink(2) with unlink_or_warn
  Introduce an unlink(2) wrapper which gives warning if unlink failed
2009-05-25 19:01:50 -07:00
Jeff King
3cd7388d57 convert bare readlink to strbuf_readlink
This particular readlink call never NUL-terminated its
result, making it a potential source of bugs (though there
is no bug now, as it currently always respects the length
field). Let's just switch it to strbuf_readlink which is
shorter and less error-prone.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-05-25 11:34:02 -07:00
Junio C Hamano
36587681b4 Merge branch 'ar/unlink-err'
* ar/unlink-err:
  print unlink(2) errno in copy_or_link_directory
  replace direct calls to unlink(2) with unlink_or_warn
  Introduce an unlink(2) wrapper which gives warning if unlink failed
2009-05-18 09:01:06 -07:00
Junio C Hamano
c16cea7345 Merge branch 'mh/diff-stat-color'
* mh/diff-stat-color:
  diff: do not color --stat output like patch context
2009-05-18 08:59:54 -07:00
Junio C Hamano
e89c6ea998 Merge branch 'jc/maint-1.6.0-keep-pack' into maint-1.6.1
* jc/maint-1.6.0-keep-pack:
  pack-objects: don't loosen objects available in alternate or kept packs
  t7700: demonstrate repack flaw which may loosen objects unnecessarily
  Remove --kept-pack-only option and associated infrastructure
  pack-objects: only repack or loosen objects residing in "local" packs
  git-repack.sh: don't use --kept-pack-only option to pack-objects
  t7700-repack: add two new tests demonstrating repacking flaws
  is_kept_pack(): final clean-up
  Simplify is_kept_pack()
  Consolidate ignore_packed logic more
  has_sha1_kept_pack(): take "struct rev_info"
  has_sha1_pack(): refactor "pretend these packs do not exist" interface
  git-repack: resist stray environment variable
2009-05-03 15:01:31 -07:00
Junio C Hamano
3f3e2c26fa Merge branch 'jc/maint-1.6.0-diff-borrow-carefully' into maint-1.6.1
* jc/maint-1.6.0-diff-borrow-carefully:
  diff --cached: do not borrow from a work tree when a path is marked as assume-unchanged
2009-05-03 15:01:26 -07:00
Felipe Contreras
4b25d091ba Fix a bunch of pointer declarations (codestyle)
Essentially; s/type* /type */ as per the coding guidelines.

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-05-01 15:17:31 -07:00
Alex Riesen
691f1a28bf replace direct calls to unlink(2) with unlink_or_warn
This helps to notice when something's going wrong, especially on
systems which lock open files.

I used the following criteria when selecting the code for replacement:
- it was already printing a warning for the unlink failures
- it is in a function which already printing something or is
  called from such a function
- it is in a static function, returning void and the function is only
  called from a builtin main function (cmd_)
- it is in a function which handles emergency exit (signal handlers)
- it is in a function which is obvously cleaning up the lockfiles

Signed-off-by: Alex Riesen <raa.lkml@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-04-29 18:37:41 -07:00
Markus Heidelberg
a408e0e649 diff: do not color --stat output like patch context
The diffstat used the color.diff.plain slot (context text) for coloring
filenames and the whole summary line. This didn't look nice and the
affected text isn't patch context at all.

Signed-off-by: Markus Heidelberg <markus.heidelberg@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-04-26 01:37:47 -07:00
Linus Torvalds
cced5fbc24 Allow users to un-configure rename detection
I told people on the kernel mailing list to please use "-M" when sending
me rename patches, so that I can see what they do while reading email
rather than having to apply the patch and then look at the end result.

I also told them that if they want to make it the default, they can just
add

	[diff]
		renames

to their ~/.gitconfig file. And while I was thinking about that, I wanted
to also check whether you can then mark individual projects to _not_ have
that default in the per-repository .git/config file.

And you can't. Currently you cannot have a global "enable renames by
default" and then a local ".. but not for _this_ project". Why? Because if
somebody writes

	[diff]
		renames = no

we simply ignore it, rather than resetting "diff_detect_rename_default"
back to zero.

Fixed thusly.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-04-11 22:22:00 -07:00
Junio C Hamano
197cf8d59c Merge branch 'jc/maint-1.6.0-diff-borrow-carefully' into maint
* jc/maint-1.6.0-diff-borrow-carefully:
  diff --cached: do not borrow from a work tree when a path is marked as assume-unchanged
2009-04-08 23:23:17 -07:00
Junio C Hamano
c3067cbfb3 Merge branch 'jc/maint-1.6.0-keep-pack' into maint
* jc/maint-1.6.0-keep-pack:
  pack-objects: don't loosen objects available in alternate or kept packs
  t7700: demonstrate repack flaw which may loosen objects unnecessarily
  Remove --kept-pack-only option and associated infrastructure
  pack-objects: only repack or loosen objects residing in "local" packs
  git-repack.sh: don't use --kept-pack-only option to pack-objects
  t7700-repack: add two new tests demonstrating repacking flaws
  is_kept_pack(): final clean-up
  Simplify is_kept_pack()
  Consolidate ignore_packed logic more
  has_sha1_kept_pack(): take "struct rev_info"
  has_sha1_pack(): refactor "pretend these packs do not exist" interface
  git-repack: resist stray environment variable

Conflicts:
	t/t7700-repack.sh
2009-04-08 23:21:10 -07:00
Junio C Hamano
aa72a14a7f Merge branch 'jc/maint-1.6.0-diff-borrow-carefully'
* jc/maint-1.6.0-diff-borrow-carefully:
  diff --cached: do not borrow from a work tree when a path is marked as assume-unchanged
2009-03-28 00:42:31 -07:00
Junio C Hamano
b71fdc590d Merge branch 'js/maint-diff-temp-smudge'
* js/maint-diff-temp-smudge:
  Smudge the files fed to external diff and textconv
2009-03-26 00:27:30 -07:00
Junio C Hamano
150115aded diff --cached: do not borrow from a work tree when a path is marked as assume-unchanged
When the index says that the file in the work tree that corresponds to the
blob object that is used for comparison is known to be unchanged, "diff"
reads from the file and applies convert_to_git(), instead of inflating the
object, to feed the internal diff engine with, because an earlier
benchnark found that it tends to be faster to use this optimization.

However, the index can lie when the path is marked as assume-unchanged.
Disable the optimization for such paths.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-03-22 15:26:07 -07:00
Johannes Schindelin
4e218f54b3 Smudge the files fed to external diff and textconv
When preparing temporary files for an external diff or textconv, it is
easier on the external tools, especially when they are implemented using
platform tools, if they are fed the input after convert_to_working_tree().

This fixes msysGit issue 177.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-03-22 15:03:59 -07:00
Junio C Hamano
aec813062b Merge branch 'jc/maint-1.6.0-keep-pack'
* jc/maint-1.6.0-keep-pack:
  is_kept_pack(): final clean-up
  Simplify is_kept_pack()
  Consolidate ignore_packed logic more
  has_sha1_kept_pack(): take "struct rev_info"
  has_sha1_pack(): refactor "pretend these packs do not exist" interface
  git-repack: resist stray environment variable
2009-03-11 13:49:56 -07:00
Benjamin Kramer
eb3a9dd327 Remove unused function scope local variables
These variables were unused and can be removed safely:

  builtin-clone.c::cmd_clone(): use_local_hardlinks, use_separate_remote
  builtin-fetch-pack.c::find_common(): len
  builtin-remote.c::mv(): symref
  diff.c::show_stats():show_stats(): total
  diffcore-break.c::should_break(): base_size
  fast-import.c::validate_raw_date(): date, sign
  fsck.c::fsck_tree(): o_sha1, sha1
  xdiff-interface.c::parse_num(): read_some

Signed-off-by: Benjamin Kramer <benny.kra@googlemail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-03-07 20:52:17 -08:00
Junio C Hamano
4a2caf6912 Merge branch 'al/ansi-color'
* al/ansi-color:
  builtin-branch.c: Rename branch category color names
  Clean up use of ANSI color sequences
2009-03-05 15:41:19 -08:00
Keith Cascio
628d5c2b70 Use DIFF_XDL_SET/DIFF_OPT_SET instead of raw bit-masking
Signed-off-by: Keith Cascio <keith@cs.ucla.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-03-04 00:56:51 -08:00
Junio C Hamano
cd673c1f17 has_sha1_pack(): refactor "pretend these packs do not exist" interface
Most of the callers of this function except only one pass NULL to its last
parameter, ignore_packed.

Introduce has_sha1_kept_pack() function that has the function signature
and the semantics of this function, and convert the sole caller that does
not pass NULL to call this new function.

All other callers and has_sha1_pack() lose the ignore_packed parameter.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-02-28 01:06:06 -08:00
Keith Cascio
4b15b4ab5f Remove redundant bit clears from diff_setup()
All bits already clear after memset(0).

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-02-13 18:19:37 -08:00
Arjen Laarhoven
dc6ebd4cc5 Clean up use of ANSI color sequences
Remove the literal ANSI escape sequences and replace them by readable
constants.

Signed-off-by: Arjen Laarhoven <arjen@yaph.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-02-13 17:27:58 -08:00
Nazri Ramliy
a8344abe0f Bugfix: GIT_EXTERNAL_DIFF with more than one changed files
When there is more than one file that are changed, running git diff with
GIT_EXTERNAL_DIFF incorrectly diagnoses an programming error and dies.
The check introduced in 479b0ae (diff: refactor tempfile cleanup handling,
2009-01-22) to detect a temporary file slot that forgot to remove its
temporary file was inconsistent with the way the codepath to remove the
temporary to mark the slot that it is done with it.

This patch fixes this problem and adds a test case for it.

Signed-off-by: Nazri Ramliy <ayiehere@gmail.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-02-12 12:31:52 -08:00
Junio C Hamano
bdf6442b48 Merge branch 'jc/maint-split-diff-metainfo'
* jc/maint-split-diff-metainfo:
  diff.c: output correct index lines for a split diff
2009-01-31 18:07:42 -08:00
Junio C Hamano
fa5bc8abb3 Merge branch 'jk/signal-cleanup'
* jk/signal-cleanup:
  t0005: use SIGTERM for sigchain test
  pager: do wait_for_pager on signal death
  refactor signal handling for cleanup functions
  chain kill signals for cleanup functions
  diff: refactor tempfile cleanup handling
  Windows: Fix signal numbers
2009-01-31 17:43:56 -08:00
Junio C Hamano
90b23e5f21 Merge branch 'jc/maint-1.6.0-split-diff-metainfo' into jc/maint-split-diff-metainfo
This is an evil merge, as a test added since 1.6.0 expects an incorrect
behaviour the merged commit fixes.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-01-27 01:08:02 -08:00
Junio C Hamano
b67b9612e1 diff.c: output correct index lines for a split diff
A patch that changes the filetype (e.g. regular file to symlink) of a path
must be split into a deletion event followed by a creation event, which
means that we need to have two independent metainfo lines for each.
However, the code reused the single set of metainfo lines.

As the blob object names recorded on the index lines are usually not used
nor validated on the receiving end, this is not an issue with normal use
of the resulting patch.  However, when accepting a binary patch to delete
a blob, git-apply verified that the postimage blob object name on the
index line is 0{40}, hence a patch that deletes a regular file blob that
records binary contents to create a blob with different filetype (e.g. a
symbolic link) failed to apply.  "git am -3" also uses the blob object
names recorded on the index line, so it would also misbehave when
synthesizing a preimage tree.

This moves the code to generate metainfo lines around, so that two
independent sets of metainfo lines are used for the split halves.

Additional tests by Jeff King.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-01-27 00:48:00 -08:00
Junio C Hamano
9847a52432 Merge branch 'js/diff-color-words'
* js/diff-color-words:
  Change the spelling of "wordregex".
  color-words: Support diff.wordregex config option
  color-words: make regex configurable via attributes
  color-words: expand docs with precise semantics
  color-words: enable REG_NEWLINE to help user
  color-words: take an optional regular expression describing words
  color-words: change algorithm to allow for 0-character word boundaries
  color-words: refactor word splitting and use ALLOC_GROW()
  Add color_fwrite_lines(), a function coloring each line individually
2009-01-25 17:13:29 -08:00
Junio C Hamano
5dc1308562 Merge branch 'js/patience-diff'
* js/patience-diff:
  bash completions: Add the --patience option
  Introduce the diff option '--patience'
  Implement the patience diff algorithm

Conflicts:
	contrib/completion/git-completion.bash
2009-01-23 21:51:38 -08:00
Jeff King
57b235a4bc refactor signal handling for cleanup functions
The current code is very inconsistent about which signals
are caught for doing cleanup of temporary files and lock
files. Some callsites checked only SIGINT, while others
checked a variety of death-dealing signals.

This patch factors out those signals to a single function,
and then calls it everywhere. For some sites, that means
this is a simple clean up. For others, it is an improvement
in that they will now properly clean themselves up after a
larger variety of signals.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-01-21 22:46:53 -08:00
Jeff King
4a16d07272 chain kill signals for cleanup functions
If a piece of code wanted to do some cleanup before exiting
(e.g., cleaning up a lockfile or a tempfile), our usual
strategy was to install a signal handler that did something
like this:

  do_cleanup(); /* actual work */
  signal(signo, SIG_DFL); /* restore previous behavior */
  raise(signo); /* deliver signal, killing ourselves */

For a single handler, this works fine. However, if we want
to clean up two _different_ things, we run into a problem.
The most recently installed handler will run, but when it
removes itself as a handler, it doesn't put back the first
handler.

This patch introduces sigchain, a tiny library for handling
a stack of signal handlers. You sigchain_push each handler,
and use sigchain_pop to restore whoever was before you in
the stack.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-01-21 22:46:52 -08:00
Jeff King
479b0ae81c diff: refactor tempfile cleanup handling
There are two pieces of code that create tempfiles for diff:
run_external_diff and run_textconv. The former cleans up its
tempfiles in the face of premature death (i.e., by die() or
by signal), but the latter does not. After this patch, they
will both use the same cleanup routines.

To make clear what the change is, let me first explain what
happens now:

  - run_external_diff uses a static global array of 2
    diff_tempfile structs (since it knows it will always
    need exactly 2 tempfiles). It calls prepare_temp_file
    (which doesn't know anything about the global array) on
    each of the structs, creating the tempfiles that need to
    be cleaned up. It then registers atexit and signal
    handlers to look through the global array and remove the
    tempfiles. If it succeeds, it calls the handler manually
    (which marks the tempfile structs as unused).

  - textconv has its own tempfile struct, which it allocates
    using prepare_temp_file and cleans up manually. No
    signal or atexit handlers.

The new code moves the installation of cleanup handlers into
the prepare_temp_file function. Which means that that
function now has to understand that there is static tempfile
storage. So what happens now is:

  - run_external_diff calls prepare_temp_file
  - prepare_temp_file calls claim_diff_tempfile, which
    allocates an unused slot from our global array
  - prepare_temp_file installs (if they have not already
    been installed) atexit and signal handlers for cleanup
  - prepare_temp_file sets up the tempfile as usual
  - prepare_temp_file returns a pointer to the allocated
    tempfile

The advantage being that run_external_diff no longer has to
care about setting up cleanup handlers. Now by virtue of
calling prepare_temp_file, run_textconv gets the same
benefit, as will any future users of prepare_temp_file.

There are also a few side benefits to the specific
implementation:

  - we now install cleanup handlers _before_ allocating the
    tempfile, closing a race which could leave temp cruft

  - when allocating a slot in the global array, we will now
    detect a situation where the old slots were not properly
    vacated (i.e., somebody forgot to call remove upon
    leaving the function). In the old code, such a situation
    would silently overwrite the tempfile names, meaning we
    would forget to clean them up. The new code dies with a
    bug warning.

  - we make sure only to install the signal handler once.
    This isn't a big deal, since we are just overwriting the
    old handler, but will become an issue when a later patch
    converts the code to use sigchain

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-01-21 22:46:52 -08:00
Junio C Hamano
f873dd5ac2 Merge branch 'maint'
* maint:
  Rename diff.suppress-blank-empty to diff.suppressBlankEmpty
2009-01-21 01:08:10 -08:00
Boyd Stephen Smith Jr
98a4d87b87 color-words: Support diff.wordregex config option
When diff is invoked with --color-words (w/o =regex), use the regular
expression the user has configured as diff.wordregex.

diff drivers configured via attributes take precedence over the
diff.wordregex-words setting.  If the user wants to change them, they have
their own configuration variables.

Signed-off-by: Boyd Stephen Smith Jr <bss@iguanasuicide.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-01-21 00:51:12 -08:00
Johannes Schindelin
950db8798d Rename diff.suppress-blank-empty to diff.suppressBlankEmpty
All the other config variables use CamelCase.  This config variable should
not be an exception.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-01-21 00:17:40 -08:00
Thomas Rast
80c49c3de2 color-words: make regex configurable via attributes
Make the --color-words splitting regular expression configurable via
the diff driver's 'wordregex' attribute.  The user can then set the
driver on a file in .gitattributes.  If a regex is given on the
command line, it overrides the driver's setting.

We also provide built-in regexes for the languages that already had
funcname patterns, and add an appropriate diff driver entry for C/++.
(The patterns are designed to run UTF-8 sequences into a single chunk
to make sure they remain readable.)

Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-01-17 10:44:21 -08:00
Thomas Rast
bf82940dbf color-words: enable REG_NEWLINE to help user
We silently truncate a match at the newline, which may lead to
unexpected behaviour, e.g., when matching "<[^>]*>" against

  <foo
  bar>

since then "<foo" becomes a word (and "bar>" doesn't!) even though the
regex said only angle-bracket-delimited things can be words.

To alleviate the problem slightly, use REG_NEWLINE so that negated
classes can't match a newline.  Of course newlines can still be
matched explicitly.

Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-01-17 10:43:24 -08:00
Johannes Schindelin
2b6a5417d7 color-words: take an optional regular expression describing words
In some applications, words are not delimited by white space.  To
allow for that, you can specify a regular expression describing
what makes a word with

	git diff --color-words='[A-Za-z0-9]+'

Note that words cannot contain newline characters.

As suggested by Thomas Rast, the words are the exact matches of the
regular expression.

Note that a regular expression beginning with a '^' will match only
a word at the beginning of the hunk, not a word at the beginning of
a line, and is probably not what you want.

This commit contains a quoting fix by Thomas Rast.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-01-17 10:43:08 -08:00
Johannes Schindelin
2e5d2003b2 color-words: change algorithm to allow for 0-character word boundaries
Up until now, the color-words code assumed that word boundaries are
identical to white space characters.

Therefore, it could get away with a very simple scheme: it copied the
hunks, substituted newlines for each white space character, called
libxdiff with the processed text, and then identified the text to
output by the offsets (which agreed since the original text had the
same length).

This code was ugly, for a number of reasons:

- it was impossible to introduce 0-character word boundaries,

- we had to print everything word by word, and

- the code needed extra special handling of newlines in the removed part.

Fix all of these issues by processing the text such that

- we build word lists, separated by newlines,

- we remember the original offsets for every word, and

- after calling libxdiff on the wordlists, we parse the hunk headers, and
  find the corresponding offsets, and then

- we print the removed/added parts in one go.

The pre and post samples in the test were provided by Santi Béjar.

Note that there is some strange special handling of hunk headers where
one line range is 0 due to POSIX: in this case, the start is one too
low.  In other words a hunk header '@@ -1,0 +2 @@' actually means that
the line must be added after the _second_ line of the pre text, _not_
the first.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-01-17 10:42:41 -08:00
Johannes Schindelin
23c1575f74 color-words: refactor word splitting and use ALLOC_GROW()
Word splitting is now performed by the function diff_words_fill(),
avoiding having the same code twice.

In the same spirit, avoid duplicating the code of ALLOC_GROW().

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-01-17 10:42:19 -08:00