"git apply --whitespace=fix" fixed whitespace errors in the common
context lines but did so without reporting.
* jc/apply-ws-fix-expands-report:
apply: detect and mark whitespace errors in context lines when fixing
"git apply" was not very careful about reading from, removing,
updating and creating paths outside the working tree (under
--index/--cached) or the current directory (when used as a
replacement for GNU patch).
* jc/apply-beyond-symlink:
apply: do not touch a file beyond a symbolic link
apply: do not read from beyond a symbolic link
apply: do not read from the filesystem under --index
apply: reject input that touches outside the working area
"git apply --whitespace=fix" used to under-allocate the memory
when the fix resulted in a longer text than the original patch.
* jc/apply-ws-fix-expands:
apply: count the size of postimage correctly
apply: make update_pre_post_images() sanity check the given postlen
apply.c: typofix
"git blame HEAD -- missing" failed to correctly say "HEAD" when it
tried to say "No such path 'missing' in HEAD".
* jk/blame-commit-label:
blame.c: fix garbled error message
use xstrdup_or_null to replace ternary conditionals
builtin/commit.c: use xstrdup_or_null instead of envdup
builtin/apply.c: use xstrdup_or_null instead of null_strdup
git-compat-util: add xstrdup_or_null helper
Because Git tracks symbolic links as symbolic links, a path that
has a symbolic link in its leading part (e.g. path/to/dir/file,
where path/to/dir is a symbolic link to somewhere else, be it
inside or outside the working tree) can never appear in a patch
that validly applies, unless the same patch first removes the
symbolic link to allow a directory to be created there.
Detect and reject such a patch.
Things to note:
- Unfortunately, we cannot reuse the has_symlink_leading_path()
from dir.c, as that is only about the working tree, but "git
apply" can be told to apply the patch only to the index or to
both the index and to the working tree.
- We cannot directly use has_symlink_leading_path() even when we
are applying only to the working tree, as an early patch of a
valid input may remove a symbolic link path/to/dir and then a
later patch of the input may create a path path/to/dir/file, but
"git apply" first checks the input without touching either the
index or the working tree. The leading symbolic link check must
be done on the interim result we compute in-core (i.e. after the
first patch, there is no path/to/dir symbolic link and it is
perfectly valid to create path/to/dir/file).
Similarly, when an input creates a symbolic link path/to/dir and
then creates a file path/to/dir/file, we need to flag it as an
error without actually creating path/to/dir symbolic link in the
filesystem.
Instead, for any patch in the input that leaves a path (i.e. a non
deletion) in the result, we check all leading paths against the
resulting tree that the patch would create by inspecting all the
patches in the input and then the target of patch application
(either the index or the working tree).
This way, we catch a mischief or a mistake to add a symbolic link
path/to/dir and a file path/to/dir/file at the same time, while
allowing a valid patch that removes a symbolic link path/to/dir and
then adds a file path/to/dir/file.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We should reject a patch, whether it renames/copies dir/file to
elsewhere with or without modificiation, or updates dir/file in
place, if "dir/" part is actually a symbolic link to elsewhere,
by making sure that the code to read the preimage does not read
from a path that is beyond a symbolic link.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We currently read the preimage to apply a patch from the index only
when the --cached option is given. Do so also when the command is
running under the --index option. With --index, the index entry and
the working tree file for a path that is involved in a patch must be
identical, so this should not affect the result, but by reading from
the index, we will get the protection to avoid reading an unintended
path beyond a symbolic link automatically.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
By default, a patch that affects outside the working area (either a
Git controlled working tree, or the current working directory when
"git apply" is used as a replacement of GNU patch) is rejected as a
mistake (or a mischief). Git itself does not create such a patch,
unless the user bends over backwards and specifies a non-standard
prefix to "git diff" and friends.
When `git apply` is used as a "better GNU patch", the user can pass
the `--unsafe-paths` option to override this safety check. This
option has no effect when `--index` or `--cached` is in use.
The new test was stolen from Jeff King with slight enhancements.
Note that a few new tests for touching outside the working area by
following a symbolic link are still expected to fail at this step,
but will be fixed in later steps.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When the incoming patch has whitespace errors in a common context
line (i.e. a line that is expected to be found and is not modified
by the patch), "apply --whitespace=fix" corrects the whitespace
errors the line has, in addition to the whitespace error on a line
that is updated by the patch. However, we did not count and report
that we fixed whitespace errors on such lines.
[jc: This is iffy. What if the whitespace error has been fixed in
the target since the patch was written? A common context line we
see in the patch has errors, and it matches a line in the target
that has the errors already corrected, resulting in no change, which
we may not want to count after all. On the other hand, we are
reporting whitespace errors _in_ the incoming patch, so...]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Under --whitespace=fix option, match_fragment() function examines
the preimage (the common context and the removed lines in the patch)
and the file being patched and checks if they match after correcting
all whitespace errors. When they are found to match, the common
context lines in the preimage is replaced with the fixed copy,
because these lines will then be copied to the corresponding place
in the postimage by a later call to update_pre_post_images(). Lines
that are added in the postimage, under --whitespace=fix, have their
whitespace errors already fixed when apply_one_fragment() prepares
the preimage and the postimage, so in the end, application of the
patch can be done by replacing the block of text in the file being
patched that matched the preimage with what is in the postimage that
was updated by update_pre_post_images().
In the earlier days, fixing whitespace errors always resulted in
reduction of size, either collapsing runs of spaces in the indent to
a tab or removing the trailing whitespaces. These days, however,
some whitespace error fix results in extending the size.
250b3c6c (apply --whitespace=fix: avoid running over the postimage
buffer, 2013-03-22) tried to compute the final postimage size but
its math was flawed. It counted the size of the block of text in
the original being patched after fixing the whitespace errors on its
lines that correspond to the preimage. That number does not have
much to do with how big the final postimage would be.
Instead count (1) the added lines in the postimage, whose size is
the same as in the final patch result because their whitespace
errors have already been corrected, and (2) the fixed size of the
lines that are common.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git apply --whitespace=fix" used to be able to assume that fixing
errors will always reduce the size by e.g. stripping whitespaces at
the end of lines or collapsing runs of spaces into tabs at the
beginning of lines. An update to accomodate fixes that lengthens
the result by e.g. expanding leading tabs into spaces were made long
time ago but the logic miscounted the necessary space after such
whitespace fixes, leading to either under-allocation or over-usage
of already allocated space.
Illustrate this with a runtime sanity-check to protect us from
future breakage. The test was stolen from Kyle McKay who helped
to identify the problem.
Helped-by: "Kyle J. McKay" <mackyle@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This patch puts the usage info strings that were not already in docopt-
like format into docopt-like format, which will be a litle easier for
end users and a lot easier for translators. Changes include:
- Placing angle brackets around fill-in-the-blank parameters
- Putting dashes in multiword parameter names
- Adding spaces to [-f|--foobar] to make [-f | --foobar]
- Replacing <foobar>* with [<foobar>...]
Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
Reviewed-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This file had its own identical helper that predates
xstrdup_or_null. Let's use the global one to avoid
repetition.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git remote update --prune" to drop many refs has been optimized.
* mh/simplify-repack-without-refs:
sort_string_list(): rename to string_list_sort()
prune_remote(): iterate using for_each_string_list_item()
prune_remote(): rename local variable
repack_without_refs(): make the refnames argument a string_list
prune_remote(): sort delete_refs_list references en masse
prune_remote(): initialize both delete_refs lists in a single loop
prune_remote(): exit early if there are no stale references
The new name is more consistent with the names of other
string_list-related functions.
Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Continue where ae021d87 (use skip_prefix to avoid magic numbers) left off
and use skip_prefix() in more places for determining the lengths of prefix
strings to avoid using dependent constants and other indirect methods.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move the interface declaration for the functions in lockfile.c from
cache.h to a new file, lockfile.h. Add #includes where necessary (and
remove some redundant includes of cache.h by files that already
include builtin.h).
Move the documentation of the lock_file state diagram from lockfile.c
to the new header file.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Update git_config() users with callback functions for a very narrow
scope with calls to config-set API that lets us query a single
variable.
* ta/config-set-2:
builtin/apply.c: replace `git_config()` with `git_config_get_string_const()`
merge-recursive.c: replace `git_config()` with `git_config_get_int()`
ll-merge.c: refactor `read_merge_config()` to use `git_config_string()`
fast-import.c: replace `git_config()` with `git_config_get_*()` family
branch.c: replace `git_config()` with `git_config_get_string()
alias.c: replace `git_config()` with `git_config_get_string()`
imap-send.c: replace `git_config()` with `git_config_get_*()` family
pager.c: replace `git_config()` with `git_config_get_value()`
builtin/gc.c: replace `git_config()` with `git_config_get_*()` family
rerere.c: replace `git_config()` with `git_config_get_*()` family
fetchpack.c: replace `git_config()` with `git_config_get_*()` family
archive.c: replace `git_config()` with `git_config_get_bool()` family
read-cache.c: replace `git_config()` with `git_config_get_*()` family
http-backend.c: replace `git_config()` with `git_config_get_bool()` family
daemon.c: replace `git_config()` with `git_config_get_bool()` family
Applying a patch not generated by Git in a subdirectory used to
check the whitespace breakage using the attributes for incorrect
paths. Also whitespace checks were performed even for paths
excluded via "git apply --exclude=<path>" mechanism.
* jc/apply-ws-prefix:
apply: omit ws check for excluded paths
apply: hoist use_patch() helper for path exclusion up
apply: use the right attribute for paths in non-Git patches
Use `git_config_get_string_const()` instead of `git_config()` to take
advantage of the config-set API which provides a cleaner control flow.
Signed-off-by: Tanay Abhra <tanayabh@gmail.com>
Reviewed-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Whitespace breakages are checked while the patch is being parsed.
Disable them at the beginning of parse_chunk(), where each
individual patch is parsed, immediately after we learn the name of
the file the patch applies to and before we start parsing the diff
contained in the patch.
One may naively think that we should be able to not just skip the
whitespace checks but simply fast-forward to the next patch without
doing anything once use_patch() tells us that this patch is not
going to be used. But in reality we cannot really skip much of the
parsing in order to do such a "fast-forward", primarily because
parsing "@@ -k,l +m,n @@" lines and counting the input lines is how
we determine the boundaries of individual patches.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We parse each patchfile and find the name of the path the patch
applies to, and then use that name to consult the attribute system
to find the whitespace rules to be used, and also the target file
(either in the working tree or in the index) to replay the changes
against.
Unlike a Git-generated patch, a non-Git patch is taken to have the
pathnames relative to the current working directory. The names
found in such a patch are modified by prepending the prefix by the
prefix_patches() helper function introduced in 56185f49 (git-apply:
require -p<n> when working in a subdirectory., 2007-02-19).
However, this prefixing is done after the patch is fully parsed and
affects only what target files are patched. Because the attributes
are checked against the names found in the patch during the parsing,
not against the final pathname, the whitespace check that is done
during parsing ends up using attributes for a wrong path for non-Git
patches.
Fix this by doing the prefix much earlier, immediately after the
header part of each patch is parsed and we learn the name of the
path the patch affects.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When parsing "index" lines from a git-diff, we look for a
space followed by the mode. If we don't have a space, then
we set our pointer to the end-of-line. However, we don't
double-check that our end-of-line pointer is valid (e.g., if
we got a truncated diff input), which could lead to some
wrap-around pointer arithmetic.
In most cases this would probably get caught by our "40 <
len" check later in the function, but to be on the safe
side, let's just use strchrnul to treat end-of-string the
same as end-of-line.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use xmemdupz() to allocate the memory, copy the data and make sure to
NUL-terminate the result, all in one step. The resulting code is
shorter, doesn't contain the constants 1 and '\0', and avoids
duplicating function parameters.
For blame, the last copied byte (o->file.ptr[o->file.size]) is always
set to NUL by fake_working_tree_commit() or read_sha1_file(), so no
information is lost by the conversion to using xmemdupz().
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
An experiment to use two files (the base file and incremental
changes relative to it) to represent the index to reduce I/O cost
of rewriting a large index when only small part of the working tree
changes.
* nd/split-index: (32 commits)
t1700: new tests for split-index mode
t2104: make sure split index mode is off for the version test
read-cache: force split index mode with GIT_TEST_SPLIT_INDEX
read-tree: note about dropping split-index mode or index version
read-tree: force split-index mode off on --index-output
rev-parse: add --shared-index-path to get shared index path
update-index --split-index: do not split if $GIT_DIR is read only
update-index: new options to enable/disable split index mode
split-index: strip pathname of on-disk replaced entries
split-index: do not invalidate cache-tree at read time
split-index: the reading part
split-index: the writing part
read-cache: mark updated entries for split index
read-cache: save deleted entries in split index
read-cache: mark new entries for split index
read-cache: split-index mode
read-cache: save index SHA-1 after reading
entry.c: update cache_changed if refresh_cache is set in checkout_entry()
cache-tree: mark istate->cache_changed on prime_cache_tree()
cache-tree: mark istate->cache_changed on cache tree update
...
* jk/xstrfmt:
setup_git_env(): introduce git_path_from_env() helper
unique_path: fix unlikely heap overflow
walker_fetch: fix minor memory leak
merge: use argv_array when spawning merge strategy
sequencer: use argv_array_pushf
setup_git_env: use git_pathdup instead of xmalloc + sprintf
use xstrfmt to replace xmalloc + strcpy/strcat
use xstrfmt to replace xmalloc + sprintf
use xstrdup instead of xmalloc + strcpy
use xstrfmt in favor of manual size calculations
strbuf: add xstrfmt helper
"--ignore-space-change" option of "git apply" ignored the spaces
at the beginning of line too aggressively, which is inconsistent
with the option of the same name "diff" and "git diff" have.
* jc/apply-ignore-whitespace:
apply --ignore-space-change: lines with and without leading whitespaces do not match
A submodule diff generally has content like:
-Subproject commit [0-9a-f]{40}
+Subproject commit [0-9a-f]{40}
When we are using "git apply --index" with a submodule, we
first apply the textual diff, and then parse that result to
figure out the new sha1.
If the diff has bogus input like:
-Subproject commit 1234567890123456789012345678901234567890
+bogus
we will parse the "bogus" portion. Our parser assumes that
the buffer starts with "Subproject commit", and blindly
skips past it using strlen(). This can cause us to read
random memory after the buffer.
This problem was unlikely to have come up in practice (since
it requires a malformed diff), and even when it did, we
likely noticed the problem anyway as the next operation was
to call get_sha1_hex on the random memory.
However, we can easily fix it by using skip_prefix to notice
the parsing error.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It's easy to get manual allocation calculations wrong, and
the use of strcpy/strcat raise red flags for people looking
for buffer overflows (though in this case each site was
fine).
It's also shorter to use xstrfmt, and the printf-format
tends to be easier for a reader to see what the final string
will look like.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Other fill_stat_cache_info() is on new entries, which should set
CE_ENTRY_ADDED in cache_changed, so we're safe.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"--ignore-space-change" option of "git apply" ignored the
spaces at the beginning of line too aggressively, which is
inconsistent with the option of the same name "diff" and "git diff"
have.
* jc/apply-ignore-whitespace:
apply --ignore-space-change: lines with and without leading whitespaces do not match
Eradicate mistaken use of "nor" (that is, essentially "nor" used
not in "neither A nor B" ;-)) from in-code comments, command output
strings, and documentations.
* jl/nor-or-nand-and:
code and test: fix misuses of "nor"
comments: fix misuses of "nor"
contrib: fix misuses of "nor"
Documentation: fix misuses of "nor"
The fuzzy_matchlines() function is used when attempting to resurrect
a patch that is whitespace-damaged, or when applying a patch that
was produced against an old codebase to the codebase after
indentation change.
The patch may want to change a line "a_bc" ("_" is used throught
this description for a whitespace to make it stand out) in the
original into something else, and we may not find "a_bc" in the
current source, but there may be "a__bc" (two spaces instead of one
the whitespace-damaged patch claims to expect). By ignoring the
amount of whitespaces, it forces "git apply" to consider that "a_bc"
in the broken patch meant to refer to "a__bc" in reality.
However, the implementation special cases a run of whitespaces at
the beginning of a line and makes "abc" match "_abc", even though a
whitespace in the middle of string never matches a 0-width gap,
e.g. "a_bc" does not match "abc". A run of whitespace at the end of
one string does not match a 0-width end of line on the other line,
either, e.g. "abc_" does not match "abc".
Fix this inconsistency by making the code skip leading whitespaces
only when both strings begin with a whitespace. This makes the
option mean the same as the option of the same name in "diff" and
"git diff".
Note that I am not sure if anybody sane should use this option in
the first place. The fuzzy match logic may be able to find the
original line that the patch author may have meant to touch because
it does not fully trust what the original lines say (i.e. context
lines prefixed by " " and old lines prefixed by "-" does not have to
exactly match the contents the patch is applied to). There is no
reason for us to trust what the replacement lines (i.e. new lines
prefixed by "+") say, either, but with this option enabled, we end
up copying these new lines with suspicious whitespace distributions
literally into the patched result. But as long as we keep it, we
should make it do its insane thing consistently.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We started using wildmatch() in place of fnmatch(3); complete the
process and stop using fnmatch(3).
* nd/no-more-fnmatch:
actually remove compat fnmatch source code
stop using fnmatch (either native or compat)
Revert "test-wildmatch: add "perf" command to compare wildmatch and fnmatch"
use wildmatch() directly without fnmatch() wrapper
Shrink lifetime of variables by moving their definitions to an
inner scope where appropriate.
* ep/varscope:
builtin/gc.c: reduce scope of variables
builtin/fetch.c: reduce scope of variable
builtin/commit.c: reduce scope of variables
builtin/clean.c: reduce scope of variable
builtin/blame.c: reduce scope of variables
builtin/apply.c: reduce scope of variables
bisect.c: reduce scope of variable
Make it clear that we don't use fnmatch() anymore.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Leaving only the function definitions and declarations so that any
new topic in flight can still make use of the old functions, replace
existing uses of the prefixcmp() and suffixcmp() with new API
functions.
The change can be recreated by mechanically applying this:
$ git grep -l -e prefixcmp -e suffixcmp -- \*.c |
grep -v strbuf\\.c |
xargs perl -pi -e '
s|!prefixcmp\(|starts_with\(|g;
s|prefixcmp\(|!starts_with\(|g;
s|!suffixcmp\(|ends_with\(|g;
s|suffixcmp\(|!ends_with\(|g;
'
on the result of preparatory changes in this series.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This task emerged from b04ba2bb (parse-options: deprecate OPT_BOOLEAN,
2011-09-27). All occurrences of the respective variables have
been reviewed and none of them relied on the counting up mechanism,
but all of them were using the variable as a true boolean.
This patch does not change semantics of any command intentionally.
Signed-off-by: Stefan Beller <stefanbeller@googlemail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>