Because if there is, such two tree entries would never be compared as
equal - the code in base_name_compare() explicitly compares modes, if
there is a change for dir bit, even for equal paths, entries would
compare as different.
The code I'm removing here is from 2005 April 262e82b4 (Fix diff-tree
recursion), which pre-dates base_name_compare() introduction in 958ba6c9
(Introduce "base_name_compare()" helper function) by a month.
Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move code for finding paths for which diff(commit,parent_i) is not-empty
for all parents to separate function - at present we have generic (and
slow) code for this job, which translates 1 n-parent problem to n
1-parent problems and then intersect results, and will be adding another
limited, but faster, paths scanning implementation in the next patch.
Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Judging from sample outputs and tests nothing changes in diff -c output,
and this change will help later patches, when we'll be refactoring paths
scanning into its own function with several variants - the
show_log_first logic / code will stay common to all of them.
NOTE: only now we have to take care to explicitly not show anything if
parents array is empty, as in fact there are some clients in Git code,
which calls diff_tree_combined() in such a way.
Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
where "correct paths" stands for paths that are different to all
parents.
Up until now, we were testing combined diff only on one file, or on
several files which were all different (t4038-diff-combined.sh).
As recent thinko in "simplify intersect_paths() further" showed, and
also, since we are going to rework code for finding paths different to
all parents, lets write at least basic tests.
Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Linus once said:
I actually wish more people understood the really core low-level
kind of coding. Not big, complex stuff like the lockless name
lookup, but simply good use of pointers-to-pointers etc. For
example, I've seen too many people who delete a singly-linked
list entry by keeping track of the "prev" entry, and then to
delete the entry, doing something like
if (prev)
prev->next = entry->next;
else
list_head = entry->next;
and whenever I see code like that, I just go "This person
doesn't understand pointers". And it's sadly quite common.
People who understand pointers just use a "pointer to the entry
pointer", and initialize that with the address of the
list_head. And then as they traverse the list, they can remove
the entry without using any conditionals, by just doing a "*pp =
entry->next".
Applying that simplification lets us lose 7 lines from this function
even while adding 2 lines of comment.
I was tempted to squash this into the original commit, but because
the benchmarking described in the commit log is without this
simplification, I decided to keep it a separate follow-up patch.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The field was used in order to speed-up name comparison and also to
mark removed paths by setting it to 0.
Because the updated code does significantly less strcmp and also
just removes paths from the list and free right after we know a path
will not be needed, it is not needed anymore.
Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When generating combined diff, for each commit, we intersect diff
paths from diff(parent_0,commit) to diff(parent_i,commit) comparing
all paths pairs, i.e. doing it the quadratic way. That is correct,
but could be optimized.
Paths come from trees in sorted (= tree) order, and so does diff_tree()
emits resulting paths in that order too. Now if we look at diffcore
transformations, all of them, except diffcore_order, preserve resulting
path ordering:
- skip_stat_unmatch, grep, pickaxe, filter
-- just skip elements -> order stays preserved
- break -- just breaks diff for a path, adding path
dup after the path -> order stays preserved
- detect rename/copy -- resulting paths are emitted sorted
(verified empirically)
So only diffcore_order changes diff paths ordering.
But diffcore_order meaning affects only presentation - i.e. only how to
show the diff, so we could do all the internal computations without
paths reordering, and order only resultant paths set. This is faster,
since, if we know two paths sets are all ordered, their intersection
could be done in linear time.
This patch does just that.
Timings for `git log --raw --no-abbrev --no-renames` without `-c` ("git log")
and with `-c` ("git log -c") before and after the patch are as follows:
linux.git v3.10..v3.11
log log -c
before 1.9s 20.4s
after 1.9s 16.6s
navy.git (private repo)
log log -c
before 0.83s 15.6s
after 0.83s 2.1s
P.S.
I think linux.git case is sped up not so much as the second one, since
in navy.git, there are more exotic (subtree, etc) merges.
P.P.S.
My tracing showed that the rest of the time (16.6s vs 1.9s) is usually
spent in computing huge diffs from commit to second parent. Will try to
deal with it, if I'll have time.
P.P.P.S.
For combine_diff_path, ->len is not needed anymore - will remove it in
the next noisy cleanup path, to maintain good signal/noise ratio here.
Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the next patch combine-diff will have special code-path for taking
orderfile into account. Prepare for making changes by introducing
coverage tests for that case.
Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
diffcore_order() interface only accepts a queue of `struct
diff_filepair`.
In the next patches, we'll want to order `struct combine_diff_path`
by path, so let's first rework diffcore-order to also provide
generic low-level interface for ordering arbitrary objects, provided
they have path accessors.
The new interface is:
- `struct obj_order` for describing objects to ordering routine, and
- order_objects() for actually doing the ordering work.
Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This continues 4651ece8 (Switch over tree descriptors to contain a
pre-parsed entry) and moves the only rest computational part
mode = canon_mode(mode)
from tree_entry_extract() to tree entry decode phase - to
decode_tree_entry().
The reason to do it, is that canon_mode() is at least 2 conditional
jumps for regular files, and that could be noticeable should canon_mode()
be invoked several times.
That does not matter for current Git codebase, where typical tree
traversal is
while (t->size) {
sha1 = tree_entry_extract(t, &path, &mode);
...
update_tree_entry(t);
}
i.e. we do t -> sha1,path.mode "extraction" only once per entry. In such
cases, it does not matter performance-wise, where that mode
canonicalization is done - either once in tree_entry_extract(), or once
in decode_tree_entry() called by update_tree_entry() - it is
approximately the same.
But for future code, which could need to work with several tree_desc's
in parallel, it could be handy to operate on tree_desc descriptors, and
do "extracts" only when needed, or at all, access only relevant part of
it through structure fields directly.
And for such situations, having canon_mode() be done once in decode
phase is better - we won't need to pay the performance price of 2 extra
conditional jumps on every t->mode access.
So let's move mode canonicalization to decode_tree_entry(). That was the
final bit. Now after tree entry is decoded, it is fully ready and could
be accessed either directly via field, or through tree_entry_extract()
which this time got really "totally trivial".
Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since diff_tree_sha1() can now accept empty trees via NULL sha1, we
could just call it without manually reading trees into tree_desc and
duplicating code.
Besides, that
if (!tree)
return 0;
looked suspect - we were saying an invalid tree != empty tree, but maybe it is
better to just say the tree is invalid here, which is what diff_tree_sha1()
does for such case.
Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since diff_tree_sha1() can now accept empty trees via NULL sha1, we
could just call it without manually reading trees into tree_desc and
duplicating code.
Cc: Thomas Rast <tr@thomasrast.ch>
Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now since diff_tree_sha1 understands NULL for both old and new, we could
indicate an empty tree for root commit by providing just NULL for old
sha1.
Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
which would mean that corresponding tree - old or new - is empty.
As followup patches will show, that functionality was already needed in
several places of Git codebase, but there, we were preparing empty
tree_desc objects by hand, with some code duplication.
For handling sha1 = NULL case, let's reuse fill_tree_descriptor() which
returns just empty tree_desc in that case.
Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Otherwise there is a race: if 'git log' finishes writing before the
pager terminates and closes the pipe, all is well, and if the pager
finishes quickly enough then 'git log' terminates with SIGPIPE.
died of signal 13 at /build/buildd/git-1.9~rc1/t/test-terminal.perl line 33.
not ok 6 - LESS and LV envvars are set for pagination
Noticed on Ubuntu PPA builders, where the race was lost about half the
time. Compare v1.7.0.2~6^2 (tests: Fix race condition in t7006-pager,
2010-02-22).
Reported-by: Anders Kaseorg <andersk@MIT.EDU>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* ta/doc-http-protocol-in-html:
http-protocol.txt: don't use uppercase for variable names in "The Negotiation Algorithm"
Documentation: make it easier to maintain enumerated documents
create HTML for http-protocol.txt
"git repack --max-pack-size=8g" stopped being parsed correctly when
the command was reimplemented in C.
* sb/repack-in-c:
repack: propagate pack-objects options as strings
repack: make parsed string options const-correct
repack: fix typo in max-pack-size option
"git clone $origin foo\bar\baz" on Windows failed to create the
leading directories (i.e. a moral-equivalent of "mkdir -p").
* ss/safe-create-leading-dir-with-slash:
safe_create_leading_directories(): on Windows, \ can separate path components
Code clean-up and protection against concurrent write access to the
ref namespace.
* mh/safe-create-leading-directories:
rename_tmp_log(): on SCLD_VANISHED, retry
rename_tmp_log(): limit the number of remote_empty_directories() attempts
rename_tmp_log(): handle a possible mkdir/rmdir race
rename_ref(): extract function rename_tmp_log()
remove_dir_recurse(): handle disappearing files and directories
remove_dir_recurse(): tighten condition for removing unreadable dir
lock_ref_sha1_basic(): if locking fails with ENOENT, retry
lock_ref_sha1_basic(): on SCLD_VANISHED, retry
safe_create_leading_directories(): add new error value SCLD_VANISHED
cmd_init_db(): when creating directories, handle errors conservatively
safe_create_leading_directories(): introduce enum for return values
safe_create_leading_directories(): always restore slash at end of loop
safe_create_leading_directories(): split on first of multiple slashes
safe_create_leading_directories(): rename local variable
safe_create_leading_directories(): add explicit "slash" pointer
safe_create_leading_directories(): reduce scope of local variable
safe_create_leading_directories(): fix format of "if" chaining
Fix performance regression in v1.8.4.x and later.
* jk/mark-edges-uninteresting:
list-objects: only look at cmdline trees with edge_hint
t/perf: time rev-list with UNINTERESTING commits
* jk/diff-filespec-cleanup:
diff_filespec: use only 2 bits for is_binary flag
diff_filespec: reorder is_binary field
diff_filespec: drop xfrm_flags field
diff_filespec: drop funcname_pattern_ident field
diff_filespec: reorder dirty_submodule macro definitions
The "if /etc/ssl/certs/ directory exists, explicitly telling the
library to use it as SSL_ca_path" blind-defaulting in "git
send-email" broke platforms where /etc/ssl/certs/ directory exists,
but it cannot used as SSL_ca_path (e.g. Fedora rawhide). Fix it by
not specifying any SSL_ca_path/SSL_ca_file but still asking for peer
verification in such a case.
* rk/send-email-ssl-cert:
send-email: /etc/ssl/certs/ directory may not be usable as ca_path
Explicitly list $HOME/.config/git/ignore as one of the places you
can use to keep ignore patterns that depend on your personal choice
of tools, e.g. *~ for Emacs users.
* jn/ignore-doc:
gitignore doc: add global gitignore to synopsis
Fix a handful of bugs around interpreting $branch@{upstream}
notation and its lookalike, when $branch part has interesting
characters, e.g. "@", and ":".
* jk/interpret-branch-name-fix:
interpret_branch_name: find all possible @-marks
interpret_branch_name: avoid @{upstream} past colon
interpret_branch_name: always respect "namelen" parameter
interpret_branch_name: rename "cp" variable to "at"
interpret_branch_name: factor out upstream handling
"git clone" would fail to clone from a repository that has a ref
directly under "refs/", e.g. "refs/stash", because different
validation paths do different things on such a refname. Loosen the
client side's validation to allow such a ref.
* jk/allow-fetch-onelevel-refname:
fetch-pack: do not filter out one-level refs
"git log --left-right A...B" lost the "leftness" of commits
reachable from A when A is a tag as a side effect of a recent
bugfix. This is a regression in 1.8.4.x series.
* jc/revision-range-unpeel:
revision: propagate flag bits from tags to pointees
revision: mark contents of an uninteresting tree uninteresting
Instead of starting an enumeration of documents with a DOC = doc1
followed by DOC += doc2, DOC += doc3, ..., empty it with "DOC =" at
the beginning and consistently add them with "DOC += ...".
Signed-off-by: Junio C Hamano <gitster@pobox.com>
./Documentation/technical/http-protocol.txt was missing from TECH_DOCS in Makefile.
Add it and also improve HTML formatting while still retaining good readability of the ASCII text:
- Use monospace font instead of italicized or roman font for machine output and source text
- Use roman font for things which should be body text
- Use double quotes consistently for "want" and "have" commands
- Use uppercase "C" / "S" consistently for "client" / "server";
also use "C:" / "S:" instead of "(C)" / "(S)" for consistency and
to avoid having formatted "(C)" as copyright symbol in HTML
- Use only spaces and not a combination of tabs and spaces for whitespace
Signed-off-by: Thomas Ackermann <th.acker@arcor.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The current basedir compare aborts early in order to avoid futile
recursive searches. However, a match may still be found by another
pathspec. This can cause an error while checking out files from a branch
when using multiple pathspecs:
$ git checkout master -- 'a/*.txt' 'b/*.txt'
error: pathspec 'a/*.txt' did not match any file(s) known to git.
Signed-off-by: Andy Spencer <andy753421@gmail.com>
Acked-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>