Plugging a handful of memory leaks in the ref-filter codepath.
* ot/ref-filter-plug-leaks:
ref-filter: free item->value and item->value->s
ls-remote: release memory instead of UNLEAK
ref-filter: free memory from used_atom
Various codepaths in the core-ish part learn to work on an
arbitrary in-core index structure, not necessarily the default
instance "the_index".
* nd/the-index: (23 commits)
revision.c: reduce implicit dependency the_repository
revision.c: remove implicit dependency on the_index
ws.c: remove implicit dependency on the_index
tree-diff.c: remove implicit dependency on the_index
submodule.c: remove implicit dependency on the_index
line-range.c: remove implicit dependency on the_index
userdiff.c: remove implicit dependency on the_index
rerere.c: remove implicit dependency on the_index
sha1-file.c: remove implicit dependency on the_index
patch-ids.c: remove implicit dependency on the_index
merge.c: remove implicit dependency on the_index
merge-blobs.c: remove implicit dependency on the_index
ll-merge.c: remove implicit dependency on the_index
diff-lib.c: remove implicit dependency on the_index
read-cache.c: remove implicit dependency on the_index
diff.c: remove implicit dependency on the_index
grep.c: remove implicit dependency on the_index
diff.c: remove the_index dependency in textconv() functions
blame.c: rename "repo" argument to "r"
combine-diff.c: remove implicit dependency on the_index
...
Release item->value.
Initialize item->value->s dynamically and then release its resources.
Release some local variables.
Final goal of this patch is to reduce number of memory leaks.
Signed-off-by: Olga Telezhnaia <olyatelezhnaya@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Release memory from used_atom variable for reducing number of memory
leaks.
Signed-off-by: Olga Telezhnaia <olyatelezhnaya@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git interpret-trailers" and its underlying machinery had a buggy
code that attempted to ignore patch text after commit log message,
which triggered in various codepaths that will always get the log
message alone and never get such an input.
* jk/trailer-fixes:
append_signoff: use size_t for string offsets
sequencer: ignore "---" divider when parsing trailers
pretty, ref-filter: format %(trailers) with no_divider option
interpret-trailers: allow suppressing "---" divider
interpret-trailers: tighten check for "---" patch boundary
trailer: pass process_trailer_opts to trailer_info_get()
trailer: use size_t for iterating trailer list
trailer: use size_t for string offsets
The code for computing history reachability has been shuffled,
obtained a bunch of new tests to cover them, and then being
improved.
* ds/reachable:
commit-reach: correct accidental #include of C file
commit-reach: use can_all_from_reach
commit-reach: make can_all_from_reach... linear
commit-reach: replace ref_newer logic
test-reach: test commit_contains
test-reach: test can_all_from_reach_with_flags
test-reach: test reduce_heads
test-reach: test get_merge_bases_many
test-reach: test is_descendant_of
test-reach: test in_merge_bases
test-reach: create new test tool for ref_newer
commit-reach: move can_all_from_reach_with_flags
upload-pack: generalize commit date cutoff
upload-pack: refactor ok_to_give_up()
upload-pack: make reachable() more generic
commit-reach: move commit_contains from ref-filter
commit-reach: move ref_newer from remote.c
commit.h: remove method declarations
commit-reach: move walk methods from commit.c
In both of these cases we know that we are feeding the
trailer-parsing code a pure commit message. We should tell
it so, which avoids false positives for a commit message
that contains a "---" line.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A few atoms like %(objecttype) and %(objectsize) in the format
specifier of "for-each-ref --format=<format>" can be filled without
getting the full contents of the object, but just with the object
header. These cases have been optimized by calling
oid_object_info() API (instead of reading and inspecting the data).
* ot/ref-filter-object-info:
ref-filter: use oid_object_info() to get object
ref-filter: merge get_obj and get_object
ref-filter: initialize eaten variable
ref-filter: fill empty fields with empty values
ref-filter: add info_source to valid_atom
The "--ignore-case" option of "git for-each-ref" (and its friends)
did not work correctly, which has been fixed.
* jk/for-each-ref-icase:
ref-filter: avoid backend filtering with --ignore-case
for-each-ref: consistently pass WM_IGNORECASE flag
t6300: add a test for --ignore-case
There are several commit walks in the codebase. Group them together into
a new commit-reach.c file and corresponding header. After we group these
walks into one place, we can reduce duplicate logic by calling
equivalent methods.
All methods are direct moves, except we also make the commit_contains()
method public so its consumers in ref-filter.c can still call it. We can
also test this method in a test-tool in a later commit.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
These methods are now declared in commit-reach.h. Remove them from
commit.h and add new include statements in all files that require these
declarations.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The conversion to pass "the_repository" and then "a_repository"
throughout the object access API continues.
* sb/object-store-grafts:
commit: allow lookup_commit_graft to handle arbitrary repositories
commit: allow prepare_commit_graft to handle arbitrary repositories
shallow: migrate shallow information into the object parser
path.c: migrate global git_path_* to take a repository argument
cache: convert get_graft_file to handle arbitrary repositories
commit: convert read_graft_file to handle arbitrary repositories
commit: convert register_commit_graft to handle arbitrary repositories
commit: convert commit_graft_pos() to handle arbitrary repositories
shallow: add repository argument to is_repository_shallow
shallow: add repository argument to check_shallow_file_for_update
shallow: add repository argument to register_shallow
shallow: add repository argument to set_alternate_shallow_file
commit: add repository argument to lookup_commit_graft
commit: add repository argument to prepare_commit_graft
commit: add repository argument to read_graft_file
commit: add repository argument to register_commit_graft
commit: add repository argument to commit_graft_pos
object: move grafts to object parser
object-store: move object access functions to object-store.h
Add a struct repository argument to the functions in commit-graph.h that
read the commit graph. (This commit does not affect functions that write
commit graphs.)
Because the commit graph functions can now read the commit graph of any
repository, the global variable core_commit_graph has been removed.
Instead, the config option core.commitGraph is now read on the first
time in a repository that a commit is attempted to be parsed using its
commit graph.
This commit includes a test that exercises the functionality on an
arbitrary repository that is not the_repository.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use oid_object_info_extended() to get object info instead of
read_object_file().
It will help to handle some requests faster (e.g., we do not need to
parse whole object if we need to know %(objectsize)).
It could also help us to add new atoms such as %(objectsize:disk)
and %(deltabase).
Signed-off-by: Olga Telezhnaia <olyatelezhnaya@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Inline get_obj(): it would be easier to edit the code
without this split.
Signed-off-by: Olga Telezhnaia <olyatelezhnaya@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Initialize variable `eaten` before its using. We may initialize it in
parse_object_buffer(), but there are cases when we do not reach this
invocation.
Signed-off-by: Olga Telezhnaia <olyatelezhnaya@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Atoms like "align" or "end" do not have string representation.
Earlier we had to go and parse whole object with a hope that we
could fill their string representations. It's easier to fill them
with an empty string before we start to work with whole object.
It is important to mention that we fill only these atoms that must
contain nothing. So, if we could not fill the atom because, for example,
the object is missing, we leave it with NULL.
Signed-off-by: Olga Telezhnaia <olyatelezhnaya@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add the source of object data to prevent parsing of unneeded data.
The goal is to improve performance by avoiding calling expensive
functions when we don't need the information they provide
or when we could get it by using a cheaper function.
It is stored in valid_atoms because it depends on the atoms we are
interested in.
Signed-off-by: Olga Telezhnaia <olyatelezhnaya@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When for-each-ref is used with --ignore-case, we expect
match_name_as_path() to do a case-insensitive match. But
there's an extra layer of filtering that happens before we
even get there. Since commit cfe004a5a9 (ref-filter: limit
traversal to prefix, 2017-05-22), we feed the prefix to the
ref backend so that it can optimize the ref iteration.
There's no mechanism for us to tell the backend we're matching
case-insensitively. Nor is there likely to be one anytime soon,
since the packed backend relies on binary-searching the sorted list
of refs. Let's just punt on this case. The extra filtering is an
optimization that we simply can't do. We'll still give the correct
answer via the filtering in match_name_as_path().
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The match_name_as_path() function learned to set
WM_IGNORECASE in the "flags" field when the user passed
--ignore-case. But it forgot to actually pass the flags to
wildmatch()!
As a result, the --ignore-case feature has been broken since
it was added in 3bb16a8bf2 (tag, branch, for-each-ref: add
--ignore-case for sorting and filtering, 2016-12-04). We
didn't notice because we added tests only for git-branch and
git-tag. Whereas git-for-each-ref has slightly different
matching rules, and thus uses a different function (the
related function match_pattern() does it correctly).
Incidentally, this also caused clang's scan-build to
complain about the code; the assignment to "flags" was dead
code.
Note that we can't flip the test in t6300 to expect_success
yet. There's another bug, which will be dealt with in the
next patch.
Commit-message-by: Jeff King <peff@peff.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a repository argument to allow callers of
lookup_commit_reference_gently to be more specific about which
repository to handle. This is a small mechanical change; it doesn't
change the implementation to handle repositories other than
the_repository yet.
As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a repository argument to allow the callers of parse_object_buffer
to be more specific about which repository to act on. This is a small
mechanical change; it doesn't change the implementation to handle
repositories other than the_repository yet.
As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a repository argument to allow the callers of parse_object
to be more specific about which repository to act on. This is a small
mechanical change; it doesn't change the implementation to handle
repositories other than the_repository yet.
As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Update to ds/generation-numbers topic.
* ds/commit-graph-lockfile-fix:
commit-graph: fix UX issue when .lock file exists
commit-graph.txt: update design document
merge: check config before loading commits
commit: use generation number in remove_redundant()
commit: add short-circuit to paint_down_to_common()
commit: use generation numbers for in_merge_bases()
ref-filter: use generation number for --contains
commit-graph: always load commit-graph information
commit: use generations in paint_down_to_common()
commit-graph: compute generation numbers
commit: add generation number to struct commit
ref-filter: fix outdated comment on in_commit_list
Developer support update, by using BUG() macro instead of die() to
mark codepaths that should not happen more clearly.
* js/use-bug-macro:
BUG_exit_code: fix sparse "symbol not declared" warning
Convert remaining die*(BUG) messages
Replace all die("BUG: ...") calls by BUG() ones
run-command: use BUG() to report bugs, not die()
test-tool: help verifying BUG() code paths
The code has been taught to use the duplicated information stored
in the commit-graph file to learn the tree object name for a commit
to avoid opening and parsing the commit object when it makes sense
to do so.
* ds/lazy-load-trees:
coccinelle: avoid wrong transformation suggestions from commit.cocci
commit-graph: lazy-load trees for commits
treewide: replace maybe_tree with accessor methods
commit: create get_commit_tree() method
treewide: rename tree to maybe_tree
A commit A can reach a commit B only if the generation number of A
is strictly larger than the generation number of B. This condition
allows significantly short-circuiting commit-graph walks.
Use generation number for '--contains' type queries.
On a copy of the Linux repository where HEAD is contained in v4.13
but no earlier tag, the command 'git tag --contains HEAD' had the
following peformance improvement:
Before: 0.81s
After: 0.04s
Rel %: -95%
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This should make these functions easier to find and cache.h less
overwhelming to read.
In particular, this moves:
- read_object_file
- oid_object_info
- write_object_file
As a result, most of the codebase needs to #include object-store.h.
In this patch the #include is only added to files that would fail to
compile otherwise. It would be better to #include wherever
identifiers from the header are used. That can happen later
when we have better tooling for it.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Code restructuring, in preparation for further work.
* ot/libify-get-ref-atom-value:
ref-filter: libify get_ref_atom_value()
ref-filter: add return value to parsers
ref-filter: change parsing function error handling
ref-filter: add return value && strbuf to handlers
ref-filter: start adding strbufs with errors
ref-filter: add shortcut to work with strbufs
In d8193743e0 (usage.c: add BUG() function, 2017-05-12), a new macro
was introduced to use for reporting bugs instead of die(). It was then
subsequently used to convert one single caller in 588a538ae5
(setup_git_env: convert die("BUG") to BUG(), 2017-05-12).
The cover letter of the patch series containing this patch
(cf 20170513032414.mfrwabt4hovujde2@sigill.intra.peff.net) is not
terribly clear why only one call site was converted, or what the plan
is for other, similar calls to die() to report bugs.
Let's just convert all remaining ones in one fell swoop.
This trick was performed by this invocation:
sed -i 's/die("BUG: /BUG("/g' $(git grep -l 'die("BUG' \*.c)
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The in_commit_list() method does not check the parents of
the candidate for containment in the list. Fix the comment
that incorrectly states that it does.
Reported-by: Jakub Narebski <jnareb@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
API clean-up aournd ref-filter code.
* jk/ref-array-push:
ref-filter: factor ref_array pushing into its own function
ref-filter: make ref_array_item allocation more consistent
ref-filter: use "struct object_id" consistently
"git branch --list" during an interrupted "rebase -i" now lets
users distinguish the case where a detached HEAD is being rebased
and a normal branch is being rebased.
* ks/branch-list-detached-rebase-i:
t3200: verify "branch --list" sanity when rebasing from detached HEAD
branch --list: print useful info whilst interactive rebasing a detached HEAD
In anticipation of making trees load lazily, create a Coccinelle
script (contrib/coccinelle/commit.cocci) to ensure that all
references to the 'maybe_tree' member of struct commit are either
mutations or accesses through get_commit_tree() or
get_commit_tree_oid().
Apply the Coccinelle script to create the rest of the patch.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Using the commit-graph file to walk commit history removes the large
cost of parsing commits during the walk. This exposes a performance
issue: lookup_tree() takes a large portion of the computation time,
even when Git never uses those trees.
In anticipation of lazy-loading these trees, rename the 'tree' member
of struct commit to 'maybe_tree'. This serves two purposes: it hints
at the future role of possibly being NULL even if the commit has a
valid tree, and it allows for unambiguous transformation from simple
member access (i.e. commit->maybe_tree) to method access.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In preparation for callers constructing their own ref_array
structs, let's move our own internal push operation into its
own function.
While we're at it, we can replace REALLOC_ARRAY() with
ALLOC_GROW(), which should give the growth operation
amortized linear complexity (as opposed to growing by one,
which is potentially quadratic, though in-place realloc
growth often makes this faster in practice).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We have a helper function to allocate ref_array_item
structs, but it only takes a subset of the possible fields
in the struct as initializers. We could have it accept an
argument for _every_ field, but that becomes a pain for the
fields which some callers don't want to set initially.
Instead, let's be explicit that it takes only the minimum
required to create the ref, and that callers should then
fill in the rest themselves.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Internally we store a "struct object_id", and all of our
callers have one to pass us. But we insist that they peel it
to its bare-sha1 hash, which we then hashcpy() into place.
Let's pass it around as an object_id, which future-proofs us
for a post-sha1 world.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When rebasing interactively (rebase -i), "git branch --list" prints
a line indicating the current branch being rebased. This works well
when the interactive rebase is initiated when a local branch is
checked out.
This doesn't play well when the rebase is initiated on a detached
HEAD. When "git branch --list" tries to print information related
to the interactive rebase in this case it tries to print the name
of a branch using an uninitialized variable and thus tries to
print a "null pointer string". As a consequence, it does not provide
useful information while also inducing undefined behaviour.
So, print the point from which the rebase was started when interactive
rebasing a detached HEAD.
Signed-off-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Finish removing die() calls from ref-filter formatting logic,
so that it could be used by other commands.
Change the signature of get_ref_atom_value() and underlying functions
by adding return value and strbuf parameter for error message.
Return value equals 0 upon success and -1 upon failure.
Upon failure, error message is appended to the strbuf.
Signed-off-by: Olga Telezhnaia <olyatelezhnaya@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Continue removing die() calls from ref-filter formatting logic,
so that it could be used by other commands.
Change the signature of parsers by adding return value and
strbuf parameter for error message.
Return value equals 0 upon success and -1 upon failure.
Upon failure, error message is appended to the strbuf.
Signed-off-by: Olga Telezhnaia <olyatelezhnaya@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Continue removing die() calls from ref-filter formatting logic,
so that it could be used by other commands.
Change the signature of parse_ref_filter_atom() by adding
strbuf parameter for error message.
The function returns the position in the used_atom[] array
(as before) for the given atom, or -1 to signal an error.
Upon failure, error message is appended to the strbuf.
Signed-off-by: Olga Telezhnaia <olyatelezhnaya@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Continue removing die() calls from ref-filter formatting logic,
so that it could be used by other commands.
Change the signature of handlers by adding return value
and strbuf parameter for errors.
Return value equals 0 upon success and -1 upon failure.
Upon failure, error message is appended to the strbuf.
Signed-off-by: Olga Telezhnaia <olyatelezhnaya@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is a first step in removing die() calls from ref-filter
formatting logic, so that it could be used by other commands
that do not want to die during formatting process.
die() calls related to bugs in code will not be touched in this patch.
Everything would be the same for show_ref_array_item() users.
But, if you want to deal with errors by your own, you could invoke
format_ref_array_item(). It means that you need to print everything
(the result and errors) on your side.
This commit changes signature of format_ref_array_item() by adding
return value and strbuf parameter for errors, and adjusts
its callers. While at it, reduce the scope of the out-variable.
Signed-off-by: Olga Telezhnaia <olyatelezhnaya@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add function strbuf_addf_ret() that helps to save a few lines of code.
Function expands fmt with placeholders, append resulting message
to strbuf *sb, and return error code ret.
Signed-off-by: Olga Telezhnaia <olyatelezhnaya@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert read_sha1_file to take a pointer to struct object_id and rename
it read_object_file. Do the same for read_sha1_file_extended.
Convert one use in grep.c to use the new function without any other code
change, since the pointer being passed is a void pointer that is already
initialized with a pointer to struct object_id. Update the declaration
and definitions of the modified functions, and apply the following
semantic patch to convert the remaining callers:
@@
expression E1, E2, E3;
@@
- read_sha1_file(E1.hash, E2, E3)
+ read_object_file(&E1, E2, E3)
@@
expression E1, E2, E3;
@@
- read_sha1_file(E1->hash, E2, E3)
+ read_object_file(E1, E2, E3)
@@
expression E1, E2, E3, E4;
@@
- read_sha1_file_extended(E1.hash, E2, E3, E4)
+ read_object_file_extended(&E1, E2, E3, E4)
@@
expression E1, E2, E3, E4;
@@
- read_sha1_file_extended(E1->hash, E2, E3, E4)
+ read_object_file_extended(E1, E2, E3, E4)
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>