The code remembered that the last diff output it saw was an empty line,
and tried to reset that state whenever it sees a context line, a non-blank
new line, or a new hunk. However, this codepath asks the underlying diff
engine to feed diff without any context, and the "just saw an empty line"
state was not reset if you added a new blank line in the last hunk of your
patch, even if it is not the last line of the file.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* bd/diff-strbuf:
xdiff-interface: hide the whole "xdiff_emit_state" business from the caller
Use strbuf for struct xdiff_emit_state's remainder
Make xdi_diff_outf interface for running xdiff_outf diffs
GNU diff's --suppress-blank-empty option makes it so that diff no
longer outputs trailing white space unless the input data has it.
With this option, empty context lines are now empty also in diff -u output.
Before, they would have a single trailing space.
* diff.c (diff_suppress_blank_empty): New global.
(git_diff_basic_config): Set it.
(fn_out_consume): Honor it.
* t/t4029-diff-trailing-space.sh: New file.
* Documentation/config.txt: Document it.
Signed-off-by: Jim Meyering <meyering@redhat.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This further enhances xdi_diff_outf() interface so that it takes two
common parameters: the callback function that processes one line at a
time, and a pointer to its application specific callback data structure.
xdi_diff_outf() creates its own "xdiff_emit_state" structure and stashes
these two away inside it, which is used by the lowest level output
function in the xdiff_outf() callchain, consume_one(), to call back to the
application layer. With this restructuring, we lift the requirement that
the caller supplied callback data structure embeds xdiff_emit_state
structure as its first member.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To prepare for the need to initialize and release resources for an
xdi_diff with the xdiff_outf output function, make a new function to
wrap this usage.
Old:
ecb.outf = xdiff_outf;
ecb.priv = &state;
...
xdi_diff(file_p, file_o, &xpp, &xecfg, &ecb);
New:
xdi_diff_outf(file_p, file_o, &state.xm, &xpp, &xecfg, &ecb);
Signed-off-by: Brian Downing <bdowning@lavos.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
All BibTeX entries starts with an @ followed by an entry type. Since
there are many entry types and own can be defined, the pattern matches
legal entry type names instead of just the default types (which would
be a long list). The pattern also matches strings and comments since
they will also be useful to position oneself in a bib-file.
Signed-off-by: Gustaf Hendeby <hendeby@isy.liu.se>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Recently "git diff --check" learned to detect new trailing blank lines
just like "git apply --whitespace" does. However this check should not
trigger unconditionally. This patch makes it honor the whitespace
settings from core.whitespace and gitattributes.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The configuration was added as a core option in 3299c6f (diff: make
default rename detection limit configurable., 2005-11-15), but 9ce392f
(Move diff.renamelimit out of default configuration., 2005-11-21)
separated diff-related stuff out of the core.
Up to that point it was Ok.
When we separated the Porcelain options out of the git_diff_config in
83ad63c (diff: do not use configuration magic at the core-level,
2006-07-08), we should have been more careful.
This mistake made diff-tree plumbing and git-show Porcelain to notice
different set of renames when the user explicitly asked for rename
detection.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This patch enhances the tex funcname by adding support for
chapter and part sectioning commands. It also matches
the starred version of the sectioning commands.
Signed-off-by: Giuseppe Bilotta <giuseppe.bilotta@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Finds classes, records, functions, procedures, and sections. Most lines
need to start at the first column, or else there's no way to differentiate
a procedure's definition from its declaration.
Signed-off-by: Avery Pennarun <apenwarr@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Provide a regexp that catches class, module and method definitions in
Ruby scripts, since the built-in default only finds classes.
Signed-off-by: Giuseppe Bilotta <giuseppe.bilotta@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This patch makes two small changes to improve the output of --inline
and --attach.
The first is to write a newline preceding the boundary. This is needed because
MIME defines the encapsulation boundary as including the preceding CRLF (or in
this case, just LF), so we should be writing one. Without this, the last
newline in the pre-diff content is consumed instead.
The second change is to always write the line termination character
(default: newline) even when using --inline or --attach. This is simply to
improve the aesthetics of the resulting message. When using --inline an email
client should render the resulting message identically to the non-inline
version. And when using --attach this adds a blank line preceding the
attachment in the email, which is visually attractive.
Signed-off-by: Kevin Ballard <kevin@sb.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* maint:
Start preparing 1.5.6.4 release notes
git fetch-pack: do not complain about "no common commits" in an empty repo
rebase-i: keep old parents when preserving merges
t7600-merge: Use test_expect_failure to test option parsing
Fix buffer overflow in prepare_attr_stack
Fix buffer overflow in git diff
Fix buffer overflow in git-grep
git-cvsserver: fix call to nonexistant cleanupWorkDir()
Documentation/git-cherry-pick.txt et al.: Fix misleading -n description
Conflicts:
RelNotes
If PATH_MAX on your system is smaller than a path stored, it may cause
buffer overflow and stack corruption in diff_addremove() and diff_change()
functions when running git-diff
Signed-off-by: Dmitry Potapov <dpotapov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* qq/maint:
clone -q: honor "quiet" option over native transports.
attribute documentation: keep EXAMPLE at end
builtin-commit.c: Use 'git_config_string' to get 'commit.template'
http.c: Use 'git_config_string' to clean up SSL config.
diff.c: Use 'git_config_string' to get 'diff.external'
convert.c: Use 'git_config_string' to get 'smudge' and 'clean'
builtin-log.c: Use 'git_config_string' to get 'format.subjectprefix' and 'format.suffix'
Documentation cvs: Clarify when a bare repository is needed
Documentation: be precise about which date --pretty uses
Conflicts:
Documentation/gitattributes.txt
* jc/checkdiff:
Fix t4017-diff-retval for white-space from wc
Update sample pre-commit hook to use "diff --check"
diff --check: detect leftover conflict markers
Teach "diff --check" about new blank lines at end
checkdiff: pass diff_options to the callback
check_and_emit_line(): rename and refactor
diff --check: explain why we do not care whether old side is binary
Before this patch, name_width becomes negative or null for width values
less than 15 and name_width values greater than 25 (default: 50). This
leads to output random data.
This patch checks for minimal width and name_width values.
Signed-off-by: Olivier Marin <dkr@freesurf.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This teaches "diff --check" to detect and complain if the change
adds lines that look like leftover conflict markers.
We should be able to remove the old Perl script used in the sample
pre-commit hook and modernize the script with this facility.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When a patch adds new blank lines at the end, "git apply --whitespace"
warns. This teaches "diff --check" to do the same.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The function name was too bland and not explicit enough as to what it is
checking. Split it into two, and call the one that checks if there is a
whitespace breakage "ws_check()", and call the other one that checks and
emits the line after color coding "ws_check_emit()".
Signed-off-by: Junio C Hamano <gitster@pobox.com>
All other codepaths refrain from running textual diff when either the old
or the new side is binary, but this function only checks the new side. I
was almost going to change it to check both, but that would be a bad
change. Explain why to prevent future mistakes.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git diff --check" should return non-zero when there was any whitespace
error but the code only paid attention to the error status of the last
new line in the patch.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* jk/test:
enable whitespace checking of test scripts
avoid trailing whitespace in zero-change diffstat lines
avoid whitespace on empty line in automatic usage message
mask necessary whitespace policy violations in test scripts
fix whitespace violations in test scripts
It worked that way since commit 50f575fc (Tweak diff colors,
2006-06-22), but commit c1795bb0 (Unify whitespace checking, 2007-12-13)
changed it. This patch restores the old behaviour.
Besides Linus' arguments in the log message of 50f575fc, resetting color
before printing newline is also important to keep 'git add --patch'
happy. If the last line(s) of a file are removed, then that hunk will
end with a colored line. However, if the newline comes before the color
reset, then the diff output will have an additional line at the end
containing only the reset sequence. This causes trouble in
git-add--interactive.perl's parse_diff function, because @colored will
have one more element than @diff, and that last element will contain the
color reset. The elements of these arrays will then be copied to @hunk,
but only as many as the number of elements in @diff. As a result the
last color reset is lost and all subsequent terminal output will be
printed in color.
Signed-off-by: SZEDER Gábor <szeder@ira.uka.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In some cases, we produce a diffstat line even though no
lines have changed (e.g., because of an exact rename). In
this case, there is no +/- "graph" after the number of
changed lines. However, we output the space separator
unconditionally, meaning that these lines contained a
trailing space character.
This isn't a huge problem, but in cleaning up the output we
are able to eliminate some trailing whitespace from a test
vector.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The new option --ignore-submodules can now be used to ignore changes in
submodules.
Why? Sometimes it is not interesting when a submodule changed.
For example, when reordering some commits in the superproject, a dirty
submodule is usually totally uninteresting. So we will use this option
in git-rebase to test for a dirty working tree.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
git_config() only had a function parameter, but no callback data
parameter. This assumes that all callback functions only modify
global variables.
With this patch, every callback gets a void * parameter, and it is hoped
that this will help the libification effort.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The current rename limit default of 100 was arbitrarily
chosen. Testing[1] has shown that on modern hardware, a
limit of 200 adds about a second of computation time, and a
limit of 500 adds about 5 seconds of computation time.
This patch bumps the default limit to 200 for viewing diffs,
and to 500 for performing a merge. The limit for generating
git-status templates is set independently; we bump it up to
200 here, as well, to match the diff limit.
[1]: See <20080211113516.GB6344@coredump.intra.peff.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
These variables were made unnecessary by commit
3969cf7db1.
Signed-off-by: Adam Simpkins <adam@adamsimpkins.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of counting added and removed lines (and mixing the byte size
reported for binary files in the result), summarize the extent of damage
the same way as we count similarity for rename detection.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Ping Yin noticed that "git diff-index --raw" shows 0{40} when work tree
has submodule difference, but "git diff --raw" didn't correctly do so.
There was a mistake in the diffcore_skip_stat_unmatch() that was meant to
clean up the stat-only difference for running diff between the index and
work tree and diff between the tree and the work tree, to cause it re-read
from the submodule repository HEAD. When ce_stat_match() says work tree
is different, we should always say 0{40} on the work tree side.
This patch fixes the issue, and adds tests.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now find_unique_abbrev() never returns NULL, there is no need for callers
to prepare for seeing NULL and fall back to giving the full 40-hexdigits.
While we are at it, drop "..." in the "git reset" output that reports the
location of the new HEAD, between the abbreviated commit object name and
the one line commit summary. Because we are always showing the HEAD
(which cannot be missing!), we never had a case where we show the full 40
hexdigits that is not followed by three dots, and these three dots were
stealing 3 columns from the precious horizontal screen real estate out of
80 that can better be used for the one line commit summary.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* maint:
Documentation/git-am.txt: Pass -r in the example invocation of rm -f .dotest
timezone_names[]: fixed the tz offset for New Zealand.
filter-branch documentation: non-zero exit status in command abort the filter
rev-parse: fix potential bus error with --parseopt option spec handling
Use a single implementation and API for copy_file()
Documentation/git-filter-branch: add a new msg-filter example
Correct fast-export file mode strings to match fast-import standard
Originally by Kristian Hï¿œgsberg; I fixed the conversion of rerere, which
had a different API.
Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We do not account binary nor unmerged files when --shortstat is
asked for (or the summary stat at the end of --stat).
The new option --dirstat should work the same way as it is about
summarizing the changes of multiple files by adding them up.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This change removes all obvious useless if-before-free tests.
E.g., it replaces code like this:
if (some_expression)
free (some_expression);
with the now-equivalent:
free (some_expression);
It is equivalent not just because POSIX has required free(NULL)
to work for a long time, but simply because it has worked for
so long that no reasonable porting target fails the test.
Here's some evidence from nearly 1.5 years ago:
http://www.winehq.org/pipermail/wine-patches/2006-October/031544.html
FYI, the change below was prepared by running the following:
git ls-files -z | xargs -0 \
perl -0x3b -pi -e \
's/\bif\s*\(\s*(\S+?)(?:\s*!=\s*NULL)?\s*\)\s+(free\s*\(\s*\1\s*\))/$2/s'
Note however, that it doesn't handle brace-enclosed blocks like
"if (x) { free (x); }". But that's ok, since there were none like
that in git sources.
Beware: if you do use the above snippet, note that it can
produce syntactically invalid C code. That happens when the
affected "if"-statement has a matching "else".
E.g., it would transform this
if (x)
free (x);
else
foo ();
into this:
free (x);
else
foo ();
There were none of those here, either.
If you're interested in automating detection of the useless
tests, you might like the useless-if-before-free script in gnulib:
[it *does* detect brace-enclosed free statements, and has a --name=S
option to make it detect free-like functions with different names]
http://git.sv.gnu.org/gitweb/?p=gnulib.git;a=blob;f=build-aux/useless-if-before-free
Addendum:
Remove one more (in imap-send.c), spotted by Jean-Luc Herren <jlh@gmx.ch>.
Signed-off-by: Jim Meyering <meyering@redhat.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The Solaris regex library doesn't like having the '$' anchor
inside capture parentheses. It rejects the match, causing
t4018 to fail.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* maint:
commit: discard index after setting up partial commit
filter-branch: handle filenames that need quoting
diff: Fix miscounting of --check output
hg-to-git: fix parent analysis
mailinfo: feed only one line to handle_filter() for QP input
diff.c: add "const" qualifier to "char *cmd" member of "struct ll_diff_driver"
Add "const" qualifier to "char *excludes_file".
Add "const" qualifier to "char *editor_program".
Add "const" qualifier to "char *pager_program".
config: add 'git_config_string' to refactor string config variables.
diff.c: remove useless check for value != NULL
fast-import: check return value from unpack_entry()
Validate nicknames of remote branches to prohibit confusing ones
diff.c: replace a 'strdup' with 'xstrdup'.
diff.c: fixup garding of config parser from value=NULL
c1795bb (Unify whitespace checking) incorrectly made the
checking function return without incrementing the line numbers
when there is no whitespace problem is found on a '+' line.
This resurrects the earlier behaviour.
Noticed and reported by Jay Soffian. The test script was stolen
from Jay's independent fix.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Also use "git_config_string" to simplify code where "cmd" is set.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It is not necessary to check if value != NULL before calling
'parse_lldiff_command' as there is already a check inside this
function.
By the way this patch also improves the existing check inside
'parse_lldiff_command' by using:
return config_error_nonbool(var);
instead of:
return error("%s: lacks value", var);
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Christian Couder noticed that there still were a handcrafted error()
call that we should have converted to config_error_nonbool() where
parse_lldiff_command() parses the configuration file.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This allows the --relative option to say which subdirectory to
pretend to be in, so that in a bare repository, you can say:
$ git log --relative=drivers/ v2.6.20..v2.6.22 -- drivers/scsi/
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This adds --relative option to the diff family. When you start
from a subdirectory:
$ git diff --relative
shows only the diff that is inside your current subdirectory,
and without $prefix part. People who usually live in
subdirectories may like it.
There are a few things I should also mention about the change:
- This works not just with diff but also works with the log
family of commands, but the history pruning is not affected.
In other words, if you go to a subdirectory, you can say:
$ git log --relative -p
but it will show the log message even for commits that do not
touch the current directory. You can limit it by giving
pathspec yourself:
$ git log --relative -p .
This originally was not a conscious design choice, but we
have a way to affect diff pathspec and pruning pathspec
independently. IOW "git log --full-diff -p ." tells it to
prune history to commits that affect the current subdirectory
but show the changes with full context. I think it makes
more sense to leave pruning independent from --relative than
the obvious alternative of always pruning with the current
subdirectory, which would break the symmetry.
- Because this works also with the log family, you could
format-patch a single change, limiting the effect to your
subdirectory, like so:
$ cd gitk-git
$ git format-patch -1 --relative 911f1eb
But because that is a special purpose usage, this option will
never become the default, with or without repository or user
preference configuration. The risk of producing a partial
patch and sending it out by mistake is too great if we did
so.
- This is inherently incompatible with --no-index, which is a
bolted-on hack that does not have much to do with git
itself. I didn't bother checking and erroring out on the
combined use of the options, but probably I should.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This adds a new form of overview diffstat output, doing something that I
have occasionally ended up doing manually (and badly, because it's
actually pretty nasty to do), and that I think is very useful for an
project like the kernel that has a fairly deep and well-separated
directory structure with semantic meaning.
What I mean by that is that it's often interesting to see exactly which
sub-directories are impacted by a patch, and to what degree - even if you
don't perhaps care so much about the individual files themselves.
What makes the concept more interesting is that the "impact" is often
hierarchical: in the kernel, for example, something could either have a
very localized impact to "fs/ext3/" and then it's interesting to see that
such a patch changes mostly that subdirectory, but you could have another
patch that changes some generic VFS-layer issue which affects _many_
subdirectories that are all under "fs/", but none - or perhaps just a
couple of them - of the individual filesystems are interesting in
themselves.
So what commonly happens is that you may have big changes in a specific
sub-subdirectory, but still also significant separate changes to the
subdirectory leading up to that - maybe you have significant VFS-level
changes, but *also* changes under that VFS layer in the NFS-specific
directories, for example. In that case, you do want the low-level parts
that are significant to show up, but then the insignificant ones should
show up as under the more generic top-level directory.
This patch shows all of that with "--dirstat". The output can be either
something simple like
commit 81772fe...
Author: Thomas Gleixner <tglx@linutronix.de>
Date: Sun Feb 10 23:57:36 2008 +0100
x86: remove over noisy debug printk
pageattr-test.c contains a noisy debug printk that people reported.
The condition under which it prints (randomly tapping into a mem_map[]
hole and not being able to c_p_a() there) is valid behavior and not
interesting to report.
Remove it.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
100.0% arch/x86/mm/
or something much more complex like
commit e231c2e...
Author: David Howells <dhowells@redhat.com>
Date: Thu Feb 7 00:15:26 2008 -0800
Convert ERR_PTR(PTR_ERR(p)) instances to ERR_CAST(p)
20.5% crypto/
7.6% fs/afs/
7.6% fs/fuse/
7.6% fs/gfs2/
5.1% fs/jffs2/
5.1% fs/nfs/
5.1% fs/nfsd/
7.6% fs/reiserfs/
15.3% fs/
7.6% net/rxrpc/
10.2% security/keys/
where that latter example is an example of significant work in some
individual fs/*/ subdirectories (like the patches to reiserfs accounting
for 7.6% of the whole), but then discounting those individual filesystems,
there's also 15.3% other "random" things that weren't worth reporting on
their oen left over under fs/ in general (either in that directory itself,
or in subdirectories of fs/ that didn't have enough changes to be reported
individually).
I'd like to stress that the "15.3% fs/" mentioned above is the stuff that
is under fs/ but that was _not_ significant enough to report on its own.
So the above does _not_ mean that 15.3% of the work was under fs/ per se,
because that 15.3% does *not* include the already-reported 7.6% of afs,
7.6% of fuse etc.
If you want to enable "cumulative" directory statistics, you can use the
"--cumulative" flag, which adds up percentages recursively even when
they have been already reported for a sub-directory. That cumulative
output is disabled if *all* of the changes in one subdirectory come from
a deeper subdirectory, to avoid repeating subdirectories all the way to
the root.
For an example of the cumulative reporting, the above commit becomes
commit e231c2e...
Author: David Howells <dhowells@redhat.com>
Date: Thu Feb 7 00:15:26 2008 -0800
Convert ERR_PTR(PTR_ERR(p)) instances to ERR_CAST(p)
20.5% crypto/
7.6% fs/afs/
7.6% fs/fuse/
7.6% fs/gfs2/
5.1% fs/jffs2/
5.1% fs/nfs/
5.1% fs/nfsd/
7.6% fs/reiserfs/
61.5% fs/
7.6% net/rxrpc/
10.2% security/keys/
in which the commit percentages now obviously add up to much more than
100%: now the changes that were already reported for the sub-directories
under fs/ are then cumulatively included in the whole percentage of fs/
(ie now shows 61.5% as opposed to the 15.3% without the cumulative
reporting).
The default reporting limit has been arbitrarily set at 3%, which seems
to be a pretty good cut-off, but you can specify the cut-off manually by
giving it as an option parameter (eg "--dirstat=5" makes the cut-off be
at 5% instead)
NOTE! The percentages are purely about the total lines added and removed,
not anything smarter (or dumber) than that. Also note that you should not
generally expect things to add up to 100%: not only does it round down, we
don't report leftover scraps (they add up to the top-level change count,
but we don't even bother reporting that, it only reports subdirectories).
Quite frankly, as a top-level manager this is really convenient for me,
but it's going to be very boring for git itself since there are few
subdirectories. Also, don't expect things to make tons of sense if you
combine this with "-M" and there are cross-directory renames etc.
But even for git itself, you can get some fun statistics. Try out
git log --dirstat
and see the occasional mentions of things like Documentation/, git-gui/,
gitweb/ and gitk-git/. Or try out something like
git diff --dirstat v1.5.0..v1.5.4
which does kind of git an overview that shows *something*. But in general,
the output is more exciting for big projects with deeper structure, and
doing a
git diff --dirstat v2.6.24..v2.6.25-rc1
on the kernel is what I actually wrote this for!
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* lt/in-core-index:
lazy index hashing
Create pathname-based hash-table lookup into index
read-cache.c: introduce is_racy_timestamp() helper
read-cache.c: fix a couple more CE_REMOVE conversion
Also use unpack_trees() in do_diff_cache()
Make run_diff_index() use unpack_trees(), not read_tree()
Avoid running lstat(2) on the same cache entry.
index: be careful when handling long names
Make on-disk index representation separate from in-core one
diff.external, diff.*.command, diff.color.*, color.diff.* and
diff.*.funcname configuration variables expect a string value.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
CRLF conversion bears a slight chance of corrupting data.
autocrlf=true will convert CRLF to LF during commit and LF to
CRLF during checkout. A file that contains a mixture of LF and
CRLF before the commit cannot be recreated by git. For text
files this is the right thing to do: it corrects line endings
such that we have only LF line endings in the repository.
But for binary files that are accidentally classified as text the
conversion can corrupt data.
If you recognize such corruption early you can easily fix it by
setting the conversion type explicitly in .gitattributes. Right
after committing you still have the original file in your work
tree and this file is not yet corrupted. You can explicitly tell
git that this file is binary and git will handle the file
appropriately.
Unfortunately, the desired effect of cleaning up text files with
mixed line endings and the undesired effect of corrupting binary
files cannot be distinguished. In both cases CRLFs are removed
in an irreversible way. For text files this is the right thing
to do because CRLFs are line endings, while for binary files
converting CRLFs corrupts data.
This patch adds a mechanism that can either warn the user about
an irreversible conversion or can even refuse to convert. The
mechanism is controlled by the variable core.safecrlf, with the
following values:
- false: disable safecrlf mechanism
- warn: warn about irreversible conversions
- true: refuse irreversible conversions
The default is to warn. Users are only affected by this default
if core.autocrlf is set. But the current default of git is to
leave core.autocrlf unset, so users will not see warnings unless
they deliberately chose to activate the autocrlf mechanism.
The safecrlf mechanism's details depend on the git command. The
general principles when safecrlf is active (not false) are:
- we warn/error out if files in the work tree can modified in an
irreversible way without giving the user a chance to backup the
original file.
- for read-only operations that do not modify files in the work tree
we do not not print annoying warnings.
There are exceptions. Even though...
- "git add" itself does not touch the files in the work tree, the
next checkout would, so the safety triggers;
- "git apply" to update a text file with a patch does touch the files
in the work tree, but the operation is about text files and CRLF
conversion is about fixing the line ending inconsistencies, so the
safety does not trigger;
- "git diff" itself does not touch the files in the work tree, it is
often run to inspect the changes you intend to next "git add". To
catch potential problems early, safety triggers.
The concept of a safety check was originally proposed in a similar
way by Linus Torvalds. Thanks to Dimitry Potapov for insisting
on getting the naked LF/autocrlf=true case right.
Signed-off-by: Steffen Prohaska <prohaska@zib.de>
Aside from the lstat(2) done for work tree files, there are
quite many lstat(2) calls in refname dwimming codepath. This
patch is not about reducing them.
* It adds a new ce_flag, CE_UPTODATE, that is meant to mark the
cache entries that record a regular file blob that is up to
date in the work tree. If somebody later walks the index and
wants to see if the work tree has changes, they do not have
to be checked with lstat(2) again.
* fill_stat_cache_info() marks the cache entry it just added
with CE_UPTODATE. This has the effect of marking the paths
we write out of the index and lstat(2) immediately as "no
need to lstat -- we know it is up-to-date", from quite a lot
fo callers:
- git-apply --index
- git-update-index
- git-checkout-index
- git-add (uses add_file_to_index())
- git-commit (ditto)
- git-mv (ditto)
* refresh_cache_ent() also marks the cache entry that are clean
with CE_UPTODATE.
* write_index is changed not to write CE_UPTODATE out to the
index file, because CE_UPTODATE is meant to be transient only
in core. For the same reason, CE_UPDATE is not written to
prevent an accident from happening.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We truncate hunk-header line at 80 bytes, but that 80th byte
could be in the middle of a character, which is bad. This uses
pick_one_utf8_char() function to make sure we do not cut a character
in the middle.
This assumes that the internal representation of the text is
UTF-8. This needs to be extended in the future but the optimal
direction has not been decided yet.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There is no point to this. Either:
1. The program has already loaded git_diff_ui_config, in
which case this is a noop.
2. The program didn't, which means it is plumbing that
does not _want_ git_diff_ui_config to be loaded.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The funcname patterns influence the "comment" on @@ lines of
the diff. They are safe to use with plumbing since they
don't fundamentally change the meaning of the diff in any
way.
Since all diff users call either diff_ui_config or
diff_basic_config, we can get rid of the lazy reading of the
config.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The diff porcelain uses git_diff_ui_config to set
porcelain-ish config options, like automatically turning on
color. The plumbing specifically avoids calling this
function, since it doesn't want things like automatic color
or rename detection.
However, some diff options should be set for both plumbing
and porcelain. For example, one can still turn on color in
git-diff-files using the --color command line option. This
means we want the color config from color.diff.* (so that
once color is on, we use the user's preferred scheme), but
_not_ the color.diff variable.
We split the diff config into "ui" and "basic", where
"basic" is suitable for use by plumbing (so _most_ things
affecting the output should still go into the "ui" part).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This moves the logic to quote two paths (prefix + path) in
C-style introduced in the previous commit from the
dump_quoted_path() in combine-diff.c to quote.c, and uses it to
fix rewrite_diff() that never C-quoted the pathnames correctly.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
With the new options "--src-prefix=<prefix>", "--dst-prefix=<prefix>"
and "--no-prefix", you can now control the path prefixes of the diff
machinery. These used to by hardwired to "a/" for the source prefix
and "b/" for the destination prefix.
Initial patch by Pascal Obry. Sane option names suggested by Linus.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We had the diff.external variable in the documentation of the config
file since its conception, but failed to respect it.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
For consistency, make the two tools report whitespace errors in the
same way (the output of "diff --check" has been tweaked to match
that of "git apply").
Note that although the textual content is basically the same only
"git diff --check" provides a colorized version of the problematic
lines; making "git apply" do colorization will require more extensive
changes (figuring out the diff colorization preferences of the user)
and so that will be a subject for another commit.
Signed-off-by: Wincent Colaiuta <win@wincent.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This commit unifies three separate places where whitespace checking was
performed:
- the whitespace checking previously done in builtin-apply.c is
extracted into a function in ws.c
- the equivalent logic in "git diff" is removed
- the emit_line_with_ws() function is also removed because that also
rechecks the whitespace, and its functionality is rolled into ws.c
The new function is called check_and_emit_line() and it does two things:
checks a line for whitespace errors and optionally emits it. The checking
is based on lines of content rather than patch lines (in other words, the
caller must strip the leading "+" or "-"); this was suggested by Junio on
the mailing list to allow for a future extension to "git show" to display
whitespace errors in blobs.
At the same time we teach it to report all classes of whitespace errors
found for a given line rather than reporting only the first found error.
Signed-off-by: Wincent Colaiuta <win@wincent.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There is no reason --exit-code and --check-diff must be mutually
exclusive, so assign different bits to different results and allow them
to be returned from the command. Introduce diff_result_code() to factor
out the common code to decide final status code based on diffopt
settings and use it everywhere.
Update tests to match the above fix.
Turning pager off when "diff --check" is used is a regression.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git diff" has a --check option that can be used to check for whitespace
problems but it only reported by printing warnings to the
console.
Now when the --check option is used we give a non-zero exit status,
making "git diff --check" nicer to use in scripts and hooks.
Signed-off-by: Wincent Colaiuta <win@wincent.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This inserts a new function xdi_diff() that currently does not
do anything other than calling the underlying xdl_diff() to the
callchain of current callers of xdl_diff() function.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"diff --check" would only detect spaces before tabs if a tab was the
last character in the leading indent. Fix that and add a test case to
make sure the bug doesn't regress in the future.
Signed-off-by: Wincent Colaiuta <win@wincent.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "-z" format is all about machine parsability, but showing renamed
paths as "common/{a => b}/suffix" makes it impossible. The scripts would
never have successfully parsed "--numstat -z -M" in the old format.
This fixes the output format in a (hopefully minimally) backward
incompatible way.
* The output without -z is not changed. This has given a good way for
humans to view added and deleted lines separately, and showing the
path in combined, shorter way would preserve readability.
* The output with -z is unchanged for paths that do not involve renames.
Existing scripts that do not pass -M/-C are not affected at all.
* The output with -z for a renamed path is shown in a format that can
easily be distinguished from an unrenamed path.
This is based on Jakub Narebski's patch. Bugs and documentation typos
are mine.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
For consistency, change "white space" and "whitespaces" to
"whitespace", fixing a couple of adjacent grammar problems in the
docs.
Signed-off-by: Wincent Colaiuta <win@wincent.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* jc/spht:
Use gitattributes to define per-path whitespace rule
core.whitespace: documentation updates.
builtin-apply: teach whitespace_rules
builtin-apply: rename "whitespace" variables and fix styles
core.whitespace: add test for diff whitespace error highlighting
git-diff: complain about >=8 consecutive spaces in initial indent
War on whitespace: first, a bit of retreat.
Conflicts:
cache.h
config.c
diff.c
The `core.whitespace` configuration variable allows you to define what
`diff` and `apply` should consider whitespace errors for all paths in
the project (See gitlink:git-config[1]). This attribute gives you finer
control per path.
For example, if you have these in the .gitattributes:
frotz whitespace
nitfol -whitespace
xyzzy whitespace=-trailing
all types of whitespace problems known to git are noticed in path 'frotz'
(i.e. diff shows them in diff.whitespace color, and apply warns about
them), no whitespace problem is noticed in path 'nitfol', and the
default types of whitespace problems except "trailing whitespace" are
noticed for path 'xyzzy'. A project with mixed Python and C might want
to have:
*.c whitespace
*.py whitespace=-indent-with-non-tab
in its toplevel .gitattributes file.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This adds an option to help scripts find out color settings from
the configuration file.
git config --get-colorbool color.diff
inspects color.diff variable, and exits with status 0 (i.e. success) if
color is to be used. It exits with status 1 otherwise.
If a script wants "true"/"false" answer to the standard output of the
command, it can pass an additional boolean parameter to its command
line, telling if its standard output is a terminal, like this:
git config --get-colorbool color.diff true
When called like this, the command outputs "true" to its standard output
if color is to be used (i.e. "color.diff" says "always", "auto", or
"true"), and "false" otherwise.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
663af3422a (Full rework of
quote_c_style and write_name_quoted.) mistakenly used puts()
when writing out a fixed string when it did not want to add a
terminating LF.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
reverse_diff was a bit-value in disguise, it's merged in the flags now.
Signed-off-by: Pierre Habouzit <madcoder@debian.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This introduces a new whitespace error type, "indent-with-non-tab".
The error is about starting a line with 8 or more SP, instead of
indenting it with a HT.
This is not enabled by default, as some projects employ an
indenting policy to use only SPs and no HTs.
The kernel folks and git contributors may want to enable this
detection with:
[core]
whitespace = indent-with-non-tab
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This introduces core.whitespace configuration variable that lets
you specify the definition of "whitespace error".
Currently there are two kinds of whitespace errors defined:
* trailing-space: trailing whitespaces at the end of the line.
* space-before-tab: a SP appears immediately before HT in the
indent part of the line.
You can specify the desired types of errors to be detected by
listing their names (unique abbreviations are accepted)
separated by comma. By default, these two errors are always
detected, as that is the traditional behaviour. You can disable
detection of a particular type of error by prefixing a '-' in
front of the name of the error, like this:
[core]
whitespace = -trailing-space
This patch teaches the code to output colored diff with
DIFF_WHITESPACE color to highlight the detected whitespace
errors to honor the new configuration.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* js/forkexec:
Use the asyncronous function infrastructure to run the content filter.
Avoid a dup2(2) in apply_filter() - start_command() can do it for us.
t0021-conversion.sh: Test that the clean filter really cleans content.
upload-pack: Run rev-list in an asynchronous function.
upload-pack: Move the revision walker into a separate function.
Use the asyncronous function infrastructure in builtin-fetch-pack.c.
Add infrastructure to run a function asynchronously.
upload-pack: Use start_command() to run pack-objects in create_pack_file().
Have start_command() create a pipe to read the stderr of the child.
Use start_comand() in builtin-fetch-pack.c instead of explicit fork/exec.
Use run_command() to spawn external diff programs instead of fork/exec.
Use start_command() to run content filters instead of explicit fork/exec.
Use start_command() in git_connect() instead of explicit fork/exec.
Change git_connect() to return a struct child_process instead of a pid_t.
Conflicts:
builtin-fetch-pack.c
The core rename detection had some rather stupid code to check if a
pathname was used by a later modification or rename, which basically
walked the whole pathname space for all renames for each rename, in
order to tell whether it was a pure rename (no remaining users) or
should be considered a copy (other users of the source file remaining).
That's really silly, since we can just keep a count of users around, and
replace all those complex and expensive loops with just testing that
simple counter (but this all depends on the previous commit that shared
the diff_filespec data structure by using a separate reference count).
Note that the reference count is not the same as the rename count: they
behave otherwise rather similarly, but the reference count is tied to
the allocation (and decremented at de-allocation, so that when it turns
zero we can get rid of the memory), while the rename count is tied to
the renames and is decremented when we find a rename (so that when it
turns zero we know that it was a rename, not a copy).
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Rather than copy the filespecs when introducing new versions of them
(for rename or copy detection), use a refcount and increment the count
when reusing the diff_filespec.
This avoids unnecessary allocations, but the real reason behind this is
a future enhancement: we will want to track shared data across the
copy/rename detection. In order to efficiently notice when a filespec
is used by a rename, the rename machinery wants to keep track of a
rename usage count which is shared across all different users of the
filespec.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix size_t vs. unsigned long pointer mismatch warnings introduced
with the addition of strbuf_detach().
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
* ph/strbuf: (44 commits)
Make read_patch_file work on a strbuf.
strbuf_read_file enhancement, and use it.
strbuf change: be sure ->buf is never ever NULL.
double free in builtin-update-index.c
Clean up stripspace a bit, use strbuf even more.
Add strbuf_read_file().
rerere: Fix use of an empty strbuf.buf
Small cache_tree_write refactor.
Make builtin-rerere use of strbuf nicer and more efficient.
Add strbuf_cmp.
strbuf_setlen(): do not barf on setting length of an empty buffer to 0
sq_quote_argv and add_to_string rework with strbuf's.
Full rework of quote_c_style and write_name_quoted.
Rework unquote_c_style to work on a strbuf.
strbuf API additions and enhancements.
nfv?asprintf are broken without va_copy, workaround them.
Fix the expansion pattern of the pseudo-static path buffer.
builtin-for-each-ref.c::copy_name() - do not overstep the buffer.
builtin-apply.c: fix a tiny leak introduced during xmemdupz() conversion.
Use xmemdupz() in many places.
...
We find rename candidates by computing a fingerprint hash of
each file, and then comparing those fingerprints. There are
inherently O(n^2) comparisons, so it pays in CPU time to
hoist the (rather expensive) computation of the fingerprint
out of that loop (or to cache it once we have computed it once).
Previously, we didn't keep the filespec information around
because then we had the potential to consume a great deal of
memory. However, instead of keeping all of the filespec
data, we can instead just keep the fingerprint.
This patch implements and uses diff_free_filespec_data_large
to accomplish that goal. We also have to change
estimate_similarity not to needlessly repopulate the
filespec data when we already have the hash.
Practical tests showed 4.5x speedup for a 10% memory usage
increase.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
For that purpose, the ->buf is always initialized with a char * buf living
in the strbuf module. It is made a char * so that we can sloppily accept
things that perform: sb->buf[0] = '\0', and because you can't pass "" as an
initializer for ->buf without making gcc unhappy for very good reasons.
strbuf_init/_detach/_grow have been fixed to trust ->alloc and not ->buf
anymore.
as a consequence strbuf_detach is _mandatory_ to detach a buffer, copying
->buf isn't an option anymore, if ->buf is going to escape from the scope,
and eventually be free'd.
API changes:
* strbuf_setlen now always works, so just make strbuf_reset a convenience
macro.
* strbuf_detatch takes a size_t* optional argument (meaning it can be
NULL) to copy the buffer's len, as it was needed for this refactor to
make the code more readable, and working like the callers.
Signed-off-by: Pierre Habouzit <madcoder@debian.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* quote_c_style works on a strbuf instead of a wild buffer.
* quote_c_style is now clever enough to not add double quotes if not needed.
* write_name_quoted inherits those advantages, but also take a different
set of arguments. Now instead of asking for quotes or not, you pass a
"terminator". If it's \0 then we assume you don't want to escape, else C
escaping is performed. In any case, the terminator is also appended to the
stream. It also no longer takes the prefix/prefix_len arguments, as it's
seldomly used, and makes some optimizations harder.
* write_name_quotedpfx is created to work like write_name_quoted and take
the prefix/prefix_len arguments.
Thanks to those API changes, diff.c has somehow lost weight, thanks to the
removal of functions that were wrappers around the old write_name_quoted
trying to give it a semantics like the new one, but performing a lot of
allocations for this goal. Now we always write directly to the stream, no
intermediate allocation is performed.
As a side effect of the refactor in builtin-apply.c, the length of the bar
graphs in diffstats are not affected anymore by the fact that the path was
clipped.
Signed-off-by: Pierre Habouzit <madcoder@debian.org>