An experiment to use two files (the base file and incremental
changes relative to it) to represent the index to reduce I/O cost
of rewriting a large index when only small part of the working tree
changes.
* nd/split-index: (32 commits)
t1700: new tests for split-index mode
t2104: make sure split index mode is off for the version test
read-cache: force split index mode with GIT_TEST_SPLIT_INDEX
read-tree: note about dropping split-index mode or index version
read-tree: force split-index mode off on --index-output
rev-parse: add --shared-index-path to get shared index path
update-index --split-index: do not split if $GIT_DIR is read only
update-index: new options to enable/disable split index mode
split-index: strip pathname of on-disk replaced entries
split-index: do not invalidate cache-tree at read time
split-index: the reading part
split-index: the writing part
read-cache: mark updated entries for split index
read-cache: save deleted entries in split index
read-cache: mark new entries for split index
read-cache: split-index mode
read-cache: save index SHA-1 after reading
entry.c: update cache_changed if refresh_cache is set in checkout_entry()
cache-tree: mark istate->cache_changed on prime_cache_tree()
cache-tree: mark istate->cache_changed on cache tree update
...
* jk/xstrfmt:
setup_git_env(): introduce git_path_from_env() helper
unique_path: fix unlikely heap overflow
walker_fetch: fix minor memory leak
merge: use argv_array when spawning merge strategy
sequencer: use argv_array_pushf
setup_git_env: use git_pathdup instead of xmalloc + sprintf
use xstrfmt to replace xmalloc + strcpy/strcat
use xstrfmt to replace xmalloc + sprintf
use xstrdup instead of xmalloc + strcpy
use xstrfmt in favor of manual size calculations
strbuf: add xstrfmt helper
Move "commit->buffer" out of the in-core commit object and keep
track of their lengths. Use this to optimize the code paths to
validate GPG signatures in commit objects.
* jk/commit-buffer-length:
reuse cached commit buffer when parsing signatures
commit: record buffer length in cache
commit: convert commit->buffer to a slab
commit-slab: provide a static initializer
use get_commit_buffer everywhere
convert logmsg_reencode to get_commit_buffer
use get_commit_buffer to avoid duplicate code
use get_cached_commit_buffer where appropriate
provide helpers to access the commit buffer
provide a helper to set the commit buffer
provide a helper to free commit buffer
sequencer: use logmsg_reencode in get_message
logmsg_reencode: return const buffer
do not create "struct commit" with xcalloc
commit: push commit_index update into alloc_commit_node
alloc: include any-object allocations in alloc_report
replace dangerous uses of strbuf_attach
commit_tree: take a pointer/len pair rather than a const strbuf
This avoids a manual allocation calculation, and is shorter
to boot.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Most callsites which use the commit buffer try to use the
cached version attached to the commit, rather than
re-reading from disk. Unfortunately, that interface provides
only a pointer to the NUL-terminated buffer, with no
indication of the original length.
For the most part, this doesn't matter. People do not put
NULs in their commit messages, and the log code is happy to
treat it all as a NUL-terminated string. However, some code
paths do care. For example, when checking signatures, we
want to be very careful that we verify all the bytes to
avoid malicious trickery.
This patch just adds an optional "size" out-pointer to
get_commit_buffer and friends. The existing callers all pass
NULL (there did not seem to be any obvious sites where we
could avoid an immediate strlen() call, though perhaps with
some further refactoring we could).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Each of these sites assumes that commit->buffer is valid.
Since they would segfault if this was not the case, they are
likely to be correct in practice. However, we can
future-proof them by using get_commit_buffer.
And as a side effect, we abstract away the final bare uses
of commit->buffer.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Like the callsites in the previous commit, logmsg_reencode
already falls back to read_sha1_file when necessary.
However, I split its conversion out into its own commit
because it's a bit more complex.
We return either:
1. The original commit->buffer
2. A newly allocated buffer from read_sha1_file
3. A reencoded buffer (based on either 1 or 2 above).
while trying to do as few extra reads/allocations as
possible. Callers currently free the result with
logmsg_free, but we can simplify this by pointing them
straight to unuse_commit_buffer. This is a slight layering
violation, in that we may be passing a buffer from (3).
However, since the end result is to free() anything except
(1), which is unlikely to change, and because this makes the
interface much simpler, it's a reasonable bending of the
rules.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This simplifies the code, as logmsg_reencode handles the
reencoding for us in a single call. It also means we learn
logmsg_reencode's trick of pulling the buffer from disk when
commit->buffer is NULL (we currently just silently return!).
It is doubtful this matters in practice, though, as
sequencer operations would not generally turn off
save_commit_buffer.
Note that we may be fixing a bug here. The existing code
does:
if (same_encoding(to, from))
reencode_string(buf, to, from);
That probably should have been "!same_encoding".
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
`do_pick_commit` handles three situations if it is not fast-forwarding.
In order for `do_pick_commit` to identify the situation, it examines the
return value of the selected merge command.
1. return value 0 stands for a clean merge
2. 1 is passed in case of a failed merge due to conflict
3. any other return value means that the merge did not even start
So far, the sequencer returns 1 in case of a failed fast-forward, which
would mean "failed merge due to conflict". However, a fast-forward
either starts and succeeds or does not start at all. In particular, it
cannot fail in between like a merge with a dirty index due to conflicts.
In order to signal the three possible situations (not only success and
failure to complete) after a pick through porcelain commands such as
`cherry-pick`, exit with a return value that is neither 0 nor 1. 128 was
chosen in line with the other situations in which the sequencer
encounters an error. In such situations, the sequencer returns a
negative value and `cherry-pick` translates this into a call to `die`.
`die` then terminates the process with exit status 128.
Signed-off-by: Fabian Ruch <bafain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change fast_forward_to() to check if locking the ref failed, print a
nice error message and bail out early.
The old code did not check if ref_lock was NULL and relied on the
fact that the write_ref_sha1() would safely detect this condition
and set the return variable ret to indicate an error.
While that is safe, it makes the code harder to read for two reasons:
* Inconsistency. Almost all other places we do check the lock for
NULL explicitly, so the naive reader is confused "why don't we
check here?"
* And relying on write_ref_sha1() to detect and return an error for
when a previous lock_any_ref_for_update() failed feels obfuscated.
This change should not change any functionality or logic aside from
adding an extra error message when this condition is triggered
(write_ref_sha1() returns an error silently for this condition).
Signed-off-by: Ronnie Sahlberg <sahlberg@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Nicolas Vigier <boklm@mars-attacks.org>
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Leaving only the function definitions and declarations so that any
new topic in flight can still make use of the old functions, replace
existing uses of the prefixcmp() and suffixcmp() with new API
functions.
The change can be recreated by mechanically applying this:
$ git grep -l -e prefixcmp -e suffixcmp -- \*.c |
grep -v strbuf\\.c |
xargs perl -pi -e '
s|!prefixcmp\(|starts_with\(|g;
s|prefixcmp\(|!starts_with\(|g;
s|!suffixcmp\(|ends_with\(|g;
s|suffixcmp\(|!ends_with\(|g;
'
on the result of preparatory changes in this series.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Expose lock_ref_sha1_basic's type_p argument to callers of
lock_any_ref_for_update. Update all call sites to ignore it by passing
NULL for now.
Signed-off-by: Brad King <brad.king@kitware.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
I attempted to make index_state->cache[] a "const struct cache_entry **"
to find out how existing entries in index are modified and where. The
question I have is what do we do if we really need to keep track of on-disk
changes in the index. The result is
- diff-lib.c: setting CE_UPTODATE
- name-hash.c: setting CE_HASHED
- preload-index.c, read-cache.c, unpack-trees.c and
builtin/update-index: obvious
- entry.c: write_entry() may refresh the checked out entry via
fill_stat_cache_info(). This causes "non-const struct cache_entry
*" in builtin/apply.c, builtin/checkout-index.c and
builtin/checkout.c
- builtin/ls-files.c: --with-tree changes stagemask and may set
CE_UPDATE
Of these, write_entry() and its call sites are probably most
interesting because it modifies on-disk info. But this is stat info
and can be retrieved via refresh, at least for porcelain
commands. Other just uses ce_flags for local purposes.
So, keeping track of "dirty" entries is just a matter of setting a
flag in index modification functions exposed by read-cache.c. Except
unpack-trees, the rest of the code base does not do anything funny
behind read-cache's back.
The actual patch is less valueable than the summary above. But if
anyone wants to re-identify the above sites. Applying this patch, then
this:
diff --git a/cache.h b/cache.h
index 430d021..1692891 100644
--- a/cache.h
+++ b/cache.h
@@ -267,7 +267,7 @@ static inline unsigned int canon_mode(unsigned int mode)
#define cache_entry_size(len) (offsetof(struct cache_entry,name) + (len) + 1)
struct index_state {
- struct cache_entry **cache;
+ const struct cache_entry **cache;
unsigned int version;
unsigned int cache_nr, cache_alloc, cache_changed;
struct string_list *resolve_undo;
will help quickly identify them without bogus warnings.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The reflog message created when "git cherry-pick" fast-forwarded
did not say anything but "cherry-pick", but it now says
"cherry-pick: fast-forward".
* rr/cherry-pick-fast-forward-reflog-message:
sequencer: write useful reflog message for fast-forward
The following command
$ git cherry-pick --ff b8bb3f
writes the following uninformative message to the reflog
cherry-pick
Improve it to
cherry-pick: fast-forward
Avoid hard-coding "cherry-pick" in fast_forward_to(), so the sequencer
is generic enough to support future actions.
Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix "git cherry-pick $annotated_tag", which was mistakenly rejected.
* mv/sequencer-pick-error-diag:
cherry-pick: picking a tag that resolves to a commit is OK
Earlier, 21246dbb9e (cherry-pick: make sure all input objects are
commits, 2013-04-11) tried to catch an unlikely "git cherry-pick $blob"
as an error, but broke a more important use case to cherry-pick a
tag that points at a commit.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* maint:
Correct common spelling mistakes in comments and tests
kwset: fix spelling in comments
precompose-utf8: fix spelling of "want" in error message
compat/nedmalloc: fix spelling in comments
compat/regex: fix spelling and grammar in comments
obstack: fix spelling of similar
contrib/subtree: fix spelling of accidentally
git-remote-mediawiki: spelling fixes
doc: various spelling fixes
fast-export: fix argument name in error messages
Documentation: distinguish between ref and offset deltas in pack-format
i18n: make the translation of -u advice in one go
Most of these were found using Lucas De Marchi's codespell tool.
Signed-off-by: Stefano Lattarini <stefano.lattarini@gmail.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Acked-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When a single argument was a non-commit, the error message used to be:
fatal: BUG: expected exactly one commit from walk
For multiple arguments, when none of the arguments was a commit, the error was:
fatal: empty commit set passed
Finally, when some of the arguments were non-commits, we ignored those
arguments. Fix this bug and make sure all arguments are commits, and
for the first non-commit, error out with:
fatal: <name>: Can't cherry-pick a <type>
Signed-off-by: Miklos Vajna <vmiklos@suse.cz>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Consolidate codepaths that inspect log-message-to-be and decide to
add a new Signed-off-by line in various commands.
* bc/append-signed-off-by:
git-commit: populate the edit buffer with 2 blank lines before s-o-b
Unify appending signoff in format-patch, commit and sequencer
format-patch: update append_signoff prototype
t4014: more tests about appending s-o-b lines
sequencer.c: teach append_signoff to avoid adding a duplicate newline
sequencer.c: teach append_signoff how to detect duplicate s-o-b
sequencer.c: always separate "(cherry picked from" from commit body
sequencer.c: require a conforming footer to be preceded by a blank line
sequencer.c: recognize "(cherry picked from ..." as part of s-o-b footer
t/t3511: add some tests of 'cherry-pick -s' functionality
t/test-lib-functions.sh: allow to specify the tag name to test_commit
commit, cherry-pick -s: remove broken support for multiline rfc2822 fields
sequencer.c: rework search for start of footer to improve clarity
'commit -s' populates the edit buffer with a blank line before the
Signed-off-by line, to allow the user to immediately start typing
the log message. But commit 33f2f9ab removed this space, forcing
the user to first push the Signed-off-by line down to open a place
to type the log message.
Fix this regression and let's ensure that the Signed-off-by line is
preceded by two blank lines, instead of just one, to hint that
something should be filled in, and that a blank line should separate
it from the body and the Signed-off-by line.
Add a test for this behavior.
Reported-by: John Keeping <john@keeping.me.uk>
Signed-off-by: Brandon Casey <drafnel@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach append_signoff to detect whether a blank line exists at the position
that the signed-off-by line will be added, and refrain from adding an
additional one if one already exists. Or, add an additional line if one
is needed to make sure the new footer is separated from the message body
by a blank line.
Signed-off-by: Brandon Casey <bcasey@nvidia.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach append_signoff how to detect a duplicate s-o-b in the commit footer.
This is in preparation to unify the append_signoff implementations in
log-tree.c and sequencer.c.
Fixes test in t3511.
Signed-off-by: Brandon Casey <bcasey@nvidia.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Start treating the "(cherry picked from" line added by cherry-pick -x
the same way that the s-o-b lines are treated. Namely, separate them
from the main commit message body with an empty line.
Introduce tests to test this functionality.
Signed-off-by: Brandon Casey <bcasey@nvidia.com>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently, append_signoff() performs a search for the last line of the
commit buffer by searching back from the end until it hits a newline. If
it reaches the beginning of the buffer without finding a newline, that
means either the commit message was empty, or there was only one line in it.
In this case, append_signoff will skip the call to has_conforming_footer
since it already knows that it is necessary to append a newline before
appending the sob.
Let's perform this function inside of has_conforming_footer where it
appropriately belongs and generalize it so that we require that the
footer paragraph be an actual distinct paragraph separated by a blank
line.
Signed-off-by: Brandon Casey <drafnel@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When 'cherry-pick -s' is used to append a signed-off-by line to a cherry
picked commit, it does not currently detect the "(cherry picked from..."
that may have been appended by a previous 'cherry-pick -x' as part of the
s-o-b footer and it will insert a blank line before appending a new s-o-b.
Let's detect "(cherry picked from...)" as part of the footer so that we
will produce this:
Signed-off-by: A U Thor <author@example.com>
(cherry picked from da39a3ee5e6b4b0d3255bfef95601890afd80709)
Signed-off-by: C O Mmitter <committer@example.com>
instead of this:
Signed-off-by: A U Thor <author@example.com>
(cherry picked from da39a3ee5e6b4b0d3255bfef95601890afd80709)
Signed-off-by: C O Mmitter <committer@example.com>
[with improvements from Jonathan Nieder]
Signed-off-by: Brandon Casey <bcasey@nvidia.com>
Acked-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Starting with c1e01b0c (commit: More generous accepting of RFC-2822 footer
lines, 2009-10-28), "git commit -s" carefully parses the last paragraph of
each commit message to check if it consists only of RFC2822-style headers,
in which case the signoff will be added as a new line in the same list:
Reported-by: Reporter <reporter@example.com>
Signed-off-by: Author <author@example.com>
Acked-by: Lieutenant <lt@example.com>
It even included support for accepting indented continuation lines for
multiline fields. Unfortunately the multiline field support is broken
because it checks whether buf[k] (the first character of the *next* line)
instead of buf[i] is a whitespace character. The result is that any footer
with a continuation line is not accepted, since the last continuation line
neither starts with an RFC2822 field name nor is followed by a continuation
line.
That this has remained broken for so long is good evidence that nobody
actually needed multiline fields. Rip out the broken continuation support.
There should be no functional change.
[Thanks to Jonathan Nieder for the excellent commit message]
Signed-off-by: Brandon Casey <bcasey@nvidia.com>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This code sequence is somewhat difficult to read. Let's rewrite it and add
some comments to improve clarity.
Signed-off-by: Brandon Casey <drafnel@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
cherry-picking into an unborn branch should work, so make it work,
with or without --ff.
Cherry-picking anything other than a commit that only adds files, will
naturally result in conflicts. Similarly, revert also works, but will
result in conflicts unless the specified revision only deletes files.
Signed-off-by: Martin von Zweigbergk <martinvonz@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Various codepaths checked if two encoding names are the same using
ad-hoc code and some of them ended up asking iconv() to convert
between "utf8" and "UTF-8". The former is not a valid way to spell
the encoding name, but often people use it by mistake, and we
equated them in some but not all codepaths. Introduce a new helper
function to make these codepaths consistent.
* jc/same-encoding:
reencode_string(): introduce and use same_encoding()
Conflicts:
builtin/mailinfo.c
Callers of reencode_string() that re-encodes a string from one
encoding to another all used ad-hoc way to bypass the case where the
input and the output encodings are the same. Some did strcmp(),
some did strcasecmp(), yet some others when converting to UTF-8 used
is_encoding_utf8().
Introduce same_encoding() helper function to make these callers use
the same logic. Notably, is_encoding_utf8() has a work-around for
common misconfiguration to use "utf8" to name UTF-8 encoding, which
does not match "UTF-8" hence strcasecmp() would not consider the
same. Make use of it in this helper function.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
These functions are called in sequencer.c, which is part of
libgit.a. This makes libgit.a potentially require builtin/merge.c for
external git commands.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Turn many file-scope private symbols to static to reduce the
global namespace contamination.
* jc/make-static:
sequencer.c: mark a private file-scope symbol as static
ident.c: mark private file-scope symbols as static
trace.c: mark a private file-scope symbol as static
wt-status.c: mark a private file-scope symbol as static
read-cache.c: mark a private file-scope symbol as static
strbuf.c: mark a private file-scope symbol as static
sha1-array.c: mark a private file-scope symbol as static
symlinks.c: mark private file-scope symbols as static
notes.c: mark a private file-scope symbol as static
rerere.c: mark private file-scope symbols as static
graph.c: mark private file-scope symbols as static
diff.c: mark a private file-scope symbol as static
commit.c: mark a file-scope private symbol as static
builtin/notes.c: mark file-scope private symbols as static
In case 'git cherry-pick -s <commit>' failed, the user had to use 'git
commit -s' (i.e. state the -s option again), which is easy to forget
about. Instead, write the signed-off-by line early, so plain 'git
commit' will have the same result.
Also update 'git commit -s', so that in case there is already a relevant
Signed-off-by line before the Conflicts: line, it won't add one more at
the end of the message. If there is no such line, then add it before the
the Conflicts: line.
Signed-off-by: Miklos Vajna <vmiklos@suse.cz>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git cherry-pick A C B" used to replay changes in A and then B and
then C if these three commits had committer timestamps in that
order, which is not what the user who said "A C B" naturally expects.
* mz/cherry-pick-cmdline-order:
cherry-pick/revert: respect order of revisions to pick
demonstrate broken 'git cherry-pick three one two'
teach log --no-walk=unsorted, which avoids sorting
When giving multiple individual revisions to cherry-pick or revert, as
in 'git cherry-pick A B' or 'git revert B A', one would expect them to
be picked/reverted in the order given on the command line. They are
instead ordered by their commit timestamp -- in chronological order
for "cherry-pick" and in reverse chronological order for
"revert". This matches the order in which one would usually give them
on the command line, making this bug somewhat hard to notice. Still,
it has been reported at least once before [1].
It seems like the chronological sorting happened by accident because
the revision walker has traditionally always sorted commits in reverse
chronological order when rev_info.no_walk was enabled. In the case of
'git revert B A' where B is newer than A, this sorting is a no-op. For
'git cherry-pick A B', the sorting would reverse the arguments, but
because the sequencer also flips the rev_info.reverse flag when
picking (as opposed to reverting), the end result is a chronological
order. The rev_info.reverse flag was probably flipped so that the
revision walker emits B before C in 'git cherry-pick A..C'; that it
happened to effectively undo the unexpected sorting done when not
walking, was probably a coincidence that allowed this bug to happen at
all.
Fix the bug by telling the revision walker not to sort the commits
when not walking. The only case we want to reverse the order is now
when cherry-picking and walking revisions (rev_info.no_walk = 0).
[1] http://thread.gmane.org/gmane.comp.version-control.git/164794
Signed-off-by: Martin von Zweigbergk <martinvonz@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Scripts such as "git rebase -i" cannot currently cherry-pick commits
which have an empty commit message, as git cherry-pick calls git
commit without the --allow-empty-message option.
Add an --allow-empty-message option to git cherry-pick which is passed
through to git commit, so this behaviour can be overridden.
Signed-off-by: Chris Webb <chris@arachsys.com>
Reviewed-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The earlier "--keep-redundant-commit" series broke "cherry-pick"
that is given a commit whose change is already in the current
history. Such a cherry-pick would result in an empty change, and
should stop with an error, telling the user that conflict resolution
may have made the result empty (which is exactly what is happening),
but we silently dropped the change on the floor without any message
nor non-zero exit code.
Signed-off-by: Junio C Hamano <gitster@pobox.com>