We generally want to avoid lookup_unknown_object, because it
results in allocating more memory for the object than may be
strictly necessary.
In this case, it is used to check whether we have an
already-parsed object before calling parse_object, to save
us from reading the object from disk. Using lookup_object
would be fine for that purpose, but we can take it a step
further. Since this code was written, parse_object already
learned the "check lookup_object" optimization, so we can
simply call parse_object directly.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "struct object" type implements basic object
polymorphism. Individual instances are allocated as
concrete types (or as a union type that can store any
object), and a "struct object *" can be cast into its real
type after examining its "type" enum. This means it is
dangerous to have a type field that does not match the
allocation (e.g., setting the type field of a "struct blob"
to "OBJ_COMMIT" would mean that a reader might read past the
allocated memory).
In most of the current code this is not a problem; the first
thing we do after allocating an object is usually to set its
type field by passing it to create_object. However, the
virtual commits we create in merge-recursive.c do not ever
get their type set. This does not seem to have caused
problems in practice, though (presumably because we always
pass around a "struct commit" pointer and never even look at
the type).
We can fix this oversight and also make it harder for future
code to get it wrong by setting the type directly in the
object allocation functions.
This will also make it easier to fix problems with commit
index allocation, as we know that any object allocated by
alloc_commit_node will meet the invariant that an object
with an OBJ_COMMIT type field will have a unique index
number.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Build the commit_list of parents by calling commit_list_append() twice
instead of allocating and linking the items by hand. This makes the
code shorter and simpler. Rename the commit_list from parent to parents
(plural) while at it because there are two of them.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add 'verify-commit' to be used in a way similar to 'verify-tag' is
used. Further work on verifying the mergetags might be needed.
* mg/verify-commit:
t7510: test verify-commit
t7510: exit for loop with test result
verify-commit: scriptable commit signature verification
gpg-interface: provide access to the payload
gpg-interface: provide clear helper for struct signature_check
"git clone -b brefs/tags/bar" would have mistakenly thought we were
following a single tag, even though it was a name of the branch,
because it incorrectly used strstr().
* jc/fix-clone-single-starting-at-a-tag:
builtin/clone.c: detect a clone starting at a tag correctly
* jk/repack-pack-keep-objects:
repack: s/write_bitmap/&s/ in code
repack: respect pack.writebitmaps
repack: do not accidentally pack kept objects by default
We can make the parsing of the --sort parameter a bit more
readable by having skip_prefix keep our pointer up to date.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* jk/xstrfmt:
setup_git_env(): introduce git_path_from_env() helper
unique_path: fix unlikely heap overflow
walker_fetch: fix minor memory leak
merge: use argv_array when spawning merge strategy
sequencer: use argv_array_pushf
setup_git_env: use git_pathdup instead of xmalloc + sprintf
use xstrfmt to replace xmalloc + strcpy/strcat
use xstrfmt to replace xmalloc + sprintf
use xstrdup instead of xmalloc + strcpy
use xstrfmt in favor of manual size calculations
strbuf: add xstrfmt helper
* jk/skip-prefix:
http-push: refactor parsing of remote object names
imap-send: use skip_prefix instead of using magic numbers
use skip_prefix to avoid repeated calculations
git: avoid magic number with skip_prefix
fetch-pack: refactor parsing in get_ack
fast-import: refactor parsing of spaces
stat_opt: check extra strlen call
daemon: use skip_prefix to avoid magic numbers
fast-import: use skip_prefix for parsing input
use skip_prefix to avoid repeating strings
use skip_prefix to avoid magic numbers
transport-helper: avoid reading past end-of-string
fast-import: fix read of uninitialized argv memory
apply: use skip_prefix instead of raw addition
refactor skip_prefix to return a boolean
avoid using skip_prefix as a boolean
daemon: mark some strings as const
parse_diff_color_slot: drop ofs parameter
Hashmap entries are typically looked up by just a key. The hashmap_get()
API expects an initialized entry structure instead, to support compound
keys. This flexibility is currently only needed by find_dir_entry() in
name-hash.c (and compat/win32/fscache.c in the msysgit fork). All other
(currently five) call sites of hashmap_get() have to set up a near emtpy
entry structure, resulting in duplicate code like this:
struct hashmap_entry keyentry;
hashmap_entry_init(&keyentry, hash(key));
return hashmap_get(map, &keyentry, key);
Add a hashmap_get_from_hash() API that allows hashmap lookups by just
specifying the key and its hash code, i.e.:
return hashmap_get_from_hash(map, hash(key), key);
Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Copying the first bytes of a SHA1 is duplicated in six places,
however, the implications (the actual value would depend on the
endianness of the platform) is documented only once.
Add a properly documented API for this.
Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move "commit->buffer" out of the in-core commit object and keep
track of their lengths. Use this to optimize the code paths to
validate GPG signatures in commit objects.
* jk/commit-buffer-length:
reuse cached commit buffer when parsing signatures
commit: record buffer length in cache
commit: convert commit->buffer to a slab
commit-slab: provide a static initializer
use get_commit_buffer everywhere
convert logmsg_reencode to get_commit_buffer
use get_commit_buffer to avoid duplicate code
use get_cached_commit_buffer where appropriate
provide helpers to access the commit buffer
provide a helper to set the commit buffer
provide a helper to free commit buffer
sequencer: use logmsg_reencode in get_message
logmsg_reencode: return const buffer
do not create "struct commit" with xcalloc
commit: push commit_index update into alloc_commit_node
alloc: include any-object allocations in alloc_report
replace dangerous uses of strbuf_attach
commit_tree: take a pointer/len pair rather than a const strbuf
In this code, we try to convert both "foo.idx" and "foo"
into "foo.pack". By stripping the suffix, we can avoid a
confusing use of strbuf_splice, and make it clear that both
cases are adding ".pack" to the end.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We also switch to using strbufs, which lets us avoid the
potentially dangerous combination of a manual malloc
followed by a strcpy.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When stripping a suffix like:
if (ends_with(str, "foo"))
buf = xmemdupz(str, strlen(str) - 3);
we can instead use strip_suffix to avoid the constant 3,
which must match the literal "foo" (we sometimes use
strlen("foo") instead, but that means we are repeating
ourselves). The example above becomes:
if (strip_suffix(str, "foo", &len))
buf = xmemdupz(str, len);
This also saves a strlen(), since we calculate the string
length when detecting the suffix.
Note that in some cases we also switch from xstrndup to
xmemdupz, which saves a further strlen call.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
These two are almost the same function, with the exception
that has_extension only matches if there is content before
the suffix. So ends_with(".exe", ".exe") is true, but
has_extension would not be.
This distinction does not matter to any of the callers,
though, and we can just replace uses of has_extension with
ends_with. We prefer the "ends_with" name because it is more
generic, and there is nothing about the function that
requires it to be used for file extensions.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
One of the purposes of "git replace --edit" is to help a
user repair objects which are malformed or corrupted.
Usually we pretty-print trees with "ls-tree", which is much
easier to work with than the raw binary data. However, some
forms of corruption break the tree-walker, in which case our
pretty-printing fails, rendering "--edit" useless for the
user.
This patch introduces a "--raw" option, which lets you edit
the binary data in these instances.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is a little more verbose, but will make it easier to
make parts of our command-line conditional (without
resorting to magic numbers or lots of NULLs to get an
appropriately sized argv array).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When a file descriptor is given to run_command via the
"in", "out", or "err" parameters, run_command takes
ownership. The descriptor will be closed in the parent
process whether the process is spawned successfully or not,
and closing it again is wrong.
In practice this has not caused problems, because we usually
close() right after start_command returns, meaning no other
code has opened a descriptor in the meantime. So we just get
EBADF and ignore it (rather than accidentally closing
somebody else's descriptor!).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Recent updates to "git repack" started to duplicate objects that
are in packfiles marked with .keep flag into the new packfile by
mistake.
* jk/repack-pack-keep-objects:
repack: s/write_bitmap/&s/ in code
repack: respect pack.writebitmaps
repack: do not accidentally pack kept objects by default
"git status" (and "git commit") behaved as if changes in a modified
submodule are not there if submodule.*.ignore configuration is set,
which was misleading. The configuration is only to unclutter diff
output during the course of development, and should not to hide
changes in the "status" output to cause the users forget to commit
them.
* jl/status-added-submodule-is-never-ignored:
commit -m: commit staged submodules regardless of ignore config
status/commit: show staged submodules regardless of ignore config
"git status", even though it is a read-only operation, tries to
update the index with refreshed lstat(2) info to optimize future
accesses to the working tree opportunistically, but this could
race with a "read-write" operation that modify the index while it
is running. Detect such a race and avoid overwriting the index.
* ym/fix-opportunistic-index-update-race:
read-cache.c: verify index file before we opportunistically update it
wrapper.c: add xpread() similar to xread()
"git remote rm" and "git remote prune" can involve removing many
refs at once, which is not a very efficient thing to do when very
many refs exist in the packed-refs file.
* jl/remote-rm-prune:
remote prune: optimize "dangling symref" check/warning
remote: repack packed-refs once when deleting multiple refs
remote rm: delete remote configuration as the last
"git rerere forget" did not work well when merge.conflictstyle
was set to a non-default value.
* fc/rerere-conflict-style:
rerere: fix for merge.conflictstyle
On a case insensitive filesystem, merge-recursive incorrectly
deleted the file that is to be renamed to a name that is the same
except for case differences.
* dt/merge-recursive-case-insensitive:
mv: allow renaming to fix case on case insensitive filesystems
merge-recursive.c: fix case-changing merge bug
"git mailinfo" used to read beyond the end of header string while
parsing an incoming e-mail message to extract the patch.
* rs/mailinfo-header-cmp:
mailinfo: use strcmp() for string comparison
The error reporting from "git index-pack" has been improved to
distinguish missing objects from type errors.
* jk/index-pack-report-missing:
index-pack: distinguish missing objects from type errors
We used to disable threaded "git index-pack" on platforms without
thread-safe pread(); use a different workaround for such
platforms to allow threaded "git index-pack".
* nd/index-pack-one-fd-per-thread:
index-pack: work around thread-unsafe pread()
"git grep -O" to show the lines that hit in the pager did not work
well with case insensitive search. We now spawn "less" with its
"-I" option when it is used as the pager (which is the default).
* sk/spawn-less-case-insensitively-from-grep-O-i:
git grep -O -i: if the pager is 'less', pass the '-I' option
"git gc --auto" was recently changed to run in the background to
give control back early to the end-user sitting in front of the
terminal, but it forgot that housekeeping involving reflogs should
be done without other processes competing for accesses to the refs.
* nd/daemonize-gc:
gc --auto: do not lock refs in the background
"git format-patch" did not enforce the rule that the "--follow"
option from the log/diff family of commands must be used with
exactly one pathspec.
* jk/diff-follow-must-take-one-pathspec:
move "--follow needs one pathspec" rule to diff_setup_done
"git commit --allow-empty-message -C $commit" did not work when the
commit did not have any log message.
* jk/commit-C-pick-empty:
commit: do not complain of empty messages from -C
"git blame" assigned the blame to the copy in the working-tree if
the repository is set to core.autocrlf=input and the file used CRLF
line endings.
* bc/blame-crlf-test:
blame: correctly handle files regardless of autocrlf
"git blame" miscounted number of columns needed to show localized
timestamps, resulting in jaggy left-side-edge of the source code
lines in its output.
* jx/blame-align-relative-time:
blame: dynamic blame_date_width for different locales
blame: fix broken time_buf paddings in relative timestamp
"--ignore-space-change" option of "git apply" ignored the spaces
at the beginning of line too aggressively, which is inconsistent
with the option of the same name "diff" and "git diff" have.
* jc/apply-ignore-whitespace:
apply --ignore-space-change: lines with and without leading whitespaces do not match
Commit signatures can be verified using "git show -s --show-signature"
or the "%G?" pretty format and parsing the output, which is well suited
for user inspection, but not for scripting.
Provide a command "verify-commit" which is analogous to "verify-tag": It
returns 0 for good signatures and non-zero otherwise, has the gpg output
on stderr and (optionally) the commit object on stdout, sans the
signature, just like "verify-tag" does.
Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The struct has been growing members whose malloced memory needs to be
freed. Do this with one helper function so that no malloced memory shall
be left unfreed.
Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
31b808a0 (clone --single: limit the fetch refspec to fetched branch,
2012-09-20) tried to see if the given "branch" to follow is actually
a tag at the remote repository by checking with "refs/tags/" but it
incorrectly used strstr(3); it is actively wrong to treat a "branch"
"refs/heads/refs/tags/foo" and use the logic for the "refs/tags/"
ref hierarchy. What the code really wanted to do is to see if it
starts with "refs/tags/".
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* jc/fetch-pull-refmap:
docs: Explain the purpose of fetch's and pull's <refspec> parameter.
fetch: allow explicit --refmap to override configuration
fetch doc: add a section on configured remote-tracking branches
fetch doc: remove "short-cut" section
fetch doc: update refspec format description
fetch doc: on pulling multiple refspecs
fetch doc: remove notes on outdated "mixed layout"
fetch doc: update note on '+' in front of the refspec
fetch doc: move FETCH_HEAD material lower and add an example
fetch doc: update introductory part for clarity
It's a common idiom to match a prefix and then skip past it
with strlen, like:
if (starts_with(foo, "bar"))
foo += strlen("bar");
This avoids magic numbers, but means we have to repeat the
string (and there is no compiler check that we didn't make a
typo in one of the strings).
We can use skip_prefix to handle this case without repeating
ourselves.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A submodule diff generally has content like:
-Subproject commit [0-9a-f]{40}
+Subproject commit [0-9a-f]{40}
When we are using "git apply --index" with a submodule, we
first apply the textual diff, and then parse that result to
figure out the new sha1.
If the diff has bogus input like:
-Subproject commit 1234567890123456789012345678901234567890
+bogus
we will parse the "bogus" portion. Our parser assumes that
the buffer starts with "Subproject commit", and blindly
skips past it using strlen(). This can cause us to read
random memory after the buffer.
This problem was unlikely to have come up in practice (since
it requires a malformed diff), and even when it did, we
likely noticed the problem anyway as the next operation was
to call get_sha1_hex on the random memory.
However, we can easily fix it by using skip_prefix to notice
the parsing error.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The skip_prefix() function returns a pointer to the content
past the prefix, or NULL if the prefix was not found. While
this is nice and simple, in practice it makes it hard to use
for two reasons:
1. When you want to conditionally skip or keep the string
as-is, you have to introduce a temporary variable.
For example:
tmp = skip_prefix(buf, "foo");
if (tmp)
buf = tmp;
2. It is verbose to check the outcome in a conditional, as
you need extra parentheses to silence compiler
warnings. For example:
if ((cp = skip_prefix(buf, "foo"))
/* do something with cp */
Both of these make it harder to use for long if-chains, and
we tend to use starts_with() instead. However, the first line
of "do something" is often to then skip forward in buf past
the prefix, either using a magic constant or with an extra
strlen(3) (which is generally computed at compile time, but
means we are repeating ourselves).
This patch refactors skip_prefix() to return a simple boolean,
and to provide the pointer value as an out-parameter. If the
prefix is not found, the out-parameter is untouched. This
lets you write:
if (skip_prefix(arg, "foo ", &arg))
do_foo(arg);
else if (skip_prefix(arg, "bar ", &arg))
do_bar(arg);
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It's easy to get manual allocation calculations wrong, and
the use of strcpy/strcat raise red flags for people looking
for buffer overflows (though in this case each site was
fine).
It's also shorter to use xstrfmt, and the printf-format
tends to be easier for a reader to see what the final string
will look like.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is one line shorter, and makes sure the length in the
malloc and sprintf steps match.
These conversions are very straightforward; we can drop the
malloc entirely, and replace the sprintf with xstrfmt.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is one line shorter, and makes sure the length in the
malloc and copy steps match.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There's no point in using:
if (skip_prefix(buf, "foo"))
over
if (starts_with(buf, "foo"))
as the point of skip_prefix is to return a pointer to the
data after the prefix. Using starts_with is more readable,
and will make refactoring skip_prefix easier.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Allow remote-helper/fast-import based transport to rename the refs
while transferring the history.
* fc/remote-helper-refmap:
transport-helper: remove unnecessary strbuf resets
transport-helper: add support to delete branches
fast-export: add support to delete refs
fast-import: add support to delete refs
transport-helper: add support to push symbolic refs
transport-helper: add support for old:new refspec
fast-export: add new --refspec option
fast-export: improve argument parsing
"git gc --auto" was recently changed to run in the background to
give control back early to the end-user sitting in front of the
terminal, but it forgot that housekeeping involving reflogs should
be done without other processes competing for accesses to the refs.
* nd/daemonize-gc:
gc --auto: do not lock refs in the background
"git remote rm" and "git remote prune" can involve removing many
refs at once, which is not a very efficient thing to do when very
many refs exist in the packed-refs file.
* jl/remote-rm-prune:
remote prune: optimize "dangling symref" check/warning
remote: repack packed-refs once when deleting multiple refs
remote rm: delete remote configuration as the last
submodule.*.ignore and diff.ignoresubmodules are used to ignore all
submodule changes in "diff" output, but it can be confusing to
apply these configuration values to status and commit.
This is a backward-incompatible change, but should be so in a good
way (aka bugfix).
* jl/status-added-submodule-is-never-ignored:
commit -m: commit staged submodules regardless of ignore config
status/commit: show staged submodules regardless of ignore config
"git replace" learns a new "--edit" option.
* cc/replace-edit:
Documentation: replace: describe new --edit option
replace: add --edit to usage string
replace: add tests for --edit
replace: die early if replace ref already exists
replace: refactor checking ref validity
replace: make sure --edit results in a different object
replace: add --edit option
replace: factor object resolution out of replace_object
replace: use OPT_CMDMODE to handle modes
replace: refactor command-mode determination
* 'mt/patch-id-stable' (early part):
patch-id-test: test stable and unstable behaviour
patch-id: make it stable against hunk reordering
test doc: test_write_lines does not split its arguments
test: add test_write_lines helper
Changing get_next_line() to return the end pointer instead of NULL in
case no newline character is found treats allows us to treat complete
and incomplete lines the same, simplifying the code. Switching to
counting lines instead of EOLs allows us to start counting at the
first character, instead of having to call get_next_line() first.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move the code for finding the start of the next line into a helper
function in order to reduce duplication.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Most callsites which use the commit buffer try to use the
cached version attached to the commit, rather than
re-reading from disk. Unfortunately, that interface provides
only a pointer to the NUL-terminated buffer, with no
indication of the original length.
For the most part, this doesn't matter. People do not put
NULs in their commit messages, and the log code is happy to
treat it all as a NUL-terminated string. However, some code
paths do care. For example, when checking signatures, we
want to be very careful that we verify all the bytes to
avoid malicious trickery.
This patch just adds an optional "size" out-pointer to
get_commit_buffer and friends. The existing callers all pass
NULL (there did not seem to be any obvious sites where we
could avoid an immediate strlen() call, though perhaps with
some further refactoring we could).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Each of these sites assumes that commit->buffer is valid.
Since they would segfault if this was not the case, they are
likely to be correct in practice. However, we can
future-proof them by using get_commit_buffer.
And as a side effect, we abstract away the final bare uses
of commit->buffer.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Like the callsites in the previous commit, logmsg_reencode
already falls back to read_sha1_file when necessary.
However, I split its conversion out into its own commit
because it's a bit more complex.
We return either:
1. The original commit->buffer
2. A newly allocated buffer from read_sha1_file
3. A reencoded buffer (based on either 1 or 2 above).
while trying to do as few extra reads/allocations as
possible. Callers currently free the result with
logmsg_free, but we can simplify this by pointing them
straight to unuse_commit_buffer. This is a slight layering
violation, in that we may be passing a buffer from (3).
However, since the end result is to free() anything except
(1), which is unlikely to change, and because this makes the
interface much simpler, it's a reasonable bending of the
rules.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some call sites check commit->buffer to see whether we have
a cached buffer, and if so, do some work with it. In the
long run we may want to switch these code paths to make
their decision on a different boolean flag (because checking
the cache may get a little more expensive in the future).
But for now, we can easily support them by converting the
calls to use get_cached_commit_buffer.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Right now this is just a one-liner, but abstracting it will
make it easier to change later.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This converts two lines into one at each caller. But more
importantly, it abstracts the concept of freeing the buffer,
which will make it easier to change later.
Note that we also need to provide a "detach" mechanism for a
tricky case in index-pack. We are passed a buffer for the
object generated by processing the incoming pack. If we are
not using --strict, we just calculate the sha1 on that
buffer and return, leaving the caller to free it. But if we
are using --strict, we actually attach that buffer to an
object, pass the object to the fsck functions, and then
detach the buffer from the object again (so that the caller
can free it as usual). In this case, we don't want to free
the buffer ourselves, but just make sure it is no longer
associated with the commit.
Note that we are making the assumption here that the
attach/detach process does not impact the buffer at all
(e.g., it is never reallocated or modified). That holds true
now, and we have no plans to change that. However, as we
abstract the commit_buffer code, this dependency becomes
less obvious. So when we detach, let's also make sure that
we get back the same buffer that we gave to the
commit_buffer code.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Normally scripts do not have to be aware about split indexes because
all shared indexes are in $GIT_DIR. A simple "mv $tmp_index
$GIT_DIR/somewhere" is enough. Scripts that generate temporary indexes
and move them across repos must be aware about split index and copy
the shared file as well. This option enables that.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If you have a large work tree but only make changes in a subset, then
$GIT_DIR/index's size should be stable after a while. If you change
branches that touch something else, $GIT_DIR/index's size may grow
large that it becomes as slow as the unified index. Do --split-index
again occasionally to force all changes back to the shared index and
keep $GIT_DIR/index small.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The large part of this patch just follows CE_ENTRY_CHANGED
marks. replace_index_entry() is updated to update
split_index->base->cache[] as well so base->cache[] does not reference
to a freed entry.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Other fill_stat_cache_info() is on new entries, which should set
CE_ENTRY_ADDED in cache_changed, so we're safe.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
cache entry additions, removals and modifications are separated
out. The rest of changes are still in the catch-all flag
SOMETHING_CHANGED, which would be more specific later.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The return value from logmsg_reencode may be either a newly
allocated buffer or a pointer to the existing commit->buffer.
We would not want the caller to accidentally free() or
modify the latter, so let's mark it as const. We can cast
away the constness in logmsg_free, but only once we have
determined that it is a free-able buffer.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In both blame and merge-recursive, we sometimes create a
"fake" commit struct for convenience (e.g., to represent the
HEAD state as if we would commit it). By allocating
ourselves rather than using alloc_commit_node, we do not
properly set the "index" field of the commit. This can
produce subtle bugs if we then use commit-slab on the
resulting commit, as we will share the "0" index with
another commit.
We can fix this by using alloc_commit_node() to allocate.
Note that we cannot free the result, as it is part of our
commit allocator. However, both cases were already leaking
the allocated commit anyway, so there's nothing to fix up.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
While strbufs are pretty common throughout our code, it is
more flexible for functions to take a pointer/len pair than
a strbuf. It's easy to turn a strbuf into such a pair (by
dereferencing its members), but less easy to go the other
way (you can strbuf_attach, but that has implications about
memory ownership).
This patch teaches commit_tree (and its associated callers
and sub-functions) to take such a pair for the commit
message rather than a strbuf. This makes passing the buffer
around slightly more verbose, but means we can get rid of
some dangerous strbuf_attach calls in the next patch.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We currently have pack.writeBitmaps, which originally
operated at the pack-objects level. This should really have
been a repack.* option from day one. Let's give it the more
sensible name, but keep the old version as a deprecated
synonym.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We previously needed to pass --no-write-bitmap-index
explicitly to pack-objects to override its reading of
pack.writebitmaps from the config. Now that it no longer
does so, we can assume that bitmaps are off by default, and
only turn them on when necessary. This also lets us avoid a
confusing tri-state flag for write_bitmaps.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The handling of the pack.writebitmaps config option
originally happened in pack-objects, which is quite
low-level. It would make more sense for drivers of
pack-objects to read the config, and then manipulate
pack-objects with command-line options.
Recently, repack learned to do so, making the low-level read
of pack.writebitmaps redundant here. Other callers, like
upload-pack, would not generally want to write bitmaps
anyway.
This could be considered a regression for somebody who is
driving pack-objects themselves outside of repack and
expects the config option to be used. However, such users
seem rather unlikely given how new the bitmap code is (and
the fact that they would basically be reimplementing repack
in the first place).
Note that we do not do anything with pack.writeBitmapHashCache
here. That option is not about "do we write bimaps", but
rather "when we are writing bitmaps, how do we do it?". You
would want that to kick in anytime you decide to write them,
similar to how pack.indexVersion is used.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The config name is "writeBitmaps", so the internal variable
missing the plural is unnecessarily confusing to write.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The config option to turn on bitmaps is read all the way
down in the plumbing of pack-objects. This makes it hard for
other options in the porcelain of repack to make decisions
based on the bitmap setting. For example,
repack.packKeptObjects tries to kick in by default only when
bitmaps are turned on. But it can't do so reliably because
it doesn't yet know whether we are using bitmaps.
This patch teaches repack to respect pack.writebitmaps. It
means we pass a redundant command-line flag to pack-objects,
but that's OK; it shouldn't affect the outcome.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit ee34a2b (repack: add `repack.packKeptObjects` config
var, 2014-03-03) added a flag which could duplicate kept
objects, but did not mean to turn it on by default. Instead,
the option is tied by default to the decision to write
bitmaps, like:
if (pack_kept_objects < 0)
pack_kept_objects = write_bitmap;
after which we expect pack_kept_objects to be a boolean 0 or
1. However, that assignment neglects that write_bitmap is
_also_ a tri-state with "-1" as the default, and with
neither option given, we accidentally turn the option on.
This patch is the minimal fix to restore the desired
behavior for the default state. Further patches will fix the
more complicated cases.
Note the update to t7700. It failed to turn on bitmaps,
meaning we were actually confirming the wrong behavior!
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Patch id changes if users reorder file diffs that make up a patch.
As the result is functionally equivalent, a different patch id is
surprising to many users.
In particular, reordering files using diff -O is helpful to make patches
more readable (e.g. API header diff before implementation diff).
Add an option to change patch-id behaviour making it stable against
these kinds of patch change:
calculate SHA1 hash for each hunk separately and sum all hashes
(using a symmetrical sum) to get patch id
We use a 20byte sum and not xor - since xor would give 0 output
for patches that have two identical diffs, which isn't all that
unlikely (e.g. append the same line in two places).
The new behaviour is enabled
- when patchid.stable is true
- when --stable flag is present
Using a new flag --unstable or setting patchid.stable to false force
the historical behaviour.
In the documentation, clarify that patch ID can now be a sum of hashes,
not a hash.
Document how command line and config options affect the
behaviour.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert three cases of checking for a constant prefix using memcmp() to
starts_with(). This way there is no need for magic string length
constants and we avoid running over the end of the string should it be
shorter than the prefix.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Avoid running over the end of header string while parsing an
incoming e-mail message to extract the patch.
* rs/mailinfo-header-cmp:
mailinfo: use strcmp() for string comparison
"update-index --cacheinfo" in 2.0 crashes on a malformed command line.
* jc/rev-parse-argh-dashed-multi-words:
update-index: fix segfault with missing --cacheinfo argument