Instead of counting added and removed lines (and mixing the byte size
reported for binary files in the result), summarize the extent of damage
the same way as we count similarity for rename detection.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently, when looking for a packed object from the pack idx, a
simple binary search is used.
A conventional binary search loop looks like this:
unsigned lo, hi;
do {
unsigned mi = (lo + hi) / 2;
int cmp = "entry pointed at by mi" minus "target";
if (!cmp)
return mi; "mi is the wanted one"
if (cmp > 0)
hi = mi; "mi is larger than target"
else
lo = mi+1; "mi is smaller than target"
} while (lo < hi);
"did not find what we wanted"
The invariants are:
- When entering the loop, 'lo' points at a slot that is never
above the target (it could be at the target), 'hi' points at
a slot that is guaranteed to be above the target (it can
never be at the target).
- We find a point 'mi' between 'lo' and 'hi' ('mi' could be
the same as 'lo', but never can be as high as 'hi'), and
check if 'mi' hits the target. There are three cases:
- if it is a hit, we have found what we are looking for;
- if it is strictly higher than the target, we set it to
'hi', and repeat the search.
- if it is strictly lower than the target, we update 'lo'
to one slot after it, because we allow 'lo' to be at the
target and 'mi' is known to be below the target.
If the loop exits, there is no matching entry.
When choosing 'mi', we do not have to take the "middle" but
anywhere in between 'lo' and 'hi', as long as lo <= mi < hi is
satisfied. When we somehow know that the distance between the
target and 'lo' is much shorter than the target and 'hi', we
could pick 'mi' that is much closer to 'lo' than (hi+lo)/2,
which a conventional binary search would pick.
This patch takes advantage of the fact that the SHA-1 is a good
hash function, and as long as there are enough entries in the
table, we can expect uniform distribution. An entry that begins
with for example "deadbeef..." is much likely to appear much
later than in the midway of a reasonably populated table. In
fact, it can be expected to be near 87% (222/256) from the top
of the table.
This is a work-in-progress and has switches to allow easier
experiments and debugging. Exporting GIT_USE_LOOKUP environment
variable enables this code.
On my admittedly memory starved machine, with a partial KDE
repository (3.0G pack with 95M idx):
$ GIT_USE_LOOKUP=t git log -800 --stat HEAD >/dev/null
3.93user 0.16system 0:04.09elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+55588minor)pagefaults 0swaps
Without the patch, the numbers are:
$ git log -800 --stat HEAD >/dev/null
4.00user 0.15system 0:04.17elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+60258minor)pagefaults 0swaps
In the same repository:
$ GIT_USE_LOOKUP=t git log -2000 HEAD >/dev/null
0.12user 0.00system 0:00.12elapsed 97%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+4241minor)pagefaults 0swaps
Without the patch, the numbers are:
$ git log -2000 HEAD >/dev/null
0.05user 0.01system 0:00.07elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+8506minor)pagefaults 0swaps
There isn't much time difference, but the number of minor faults
seems to show that we are touching much smaller number of pages,
which is expected.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The presence of a .git directory used to be good enough evidence that
GIT-VERSION-GEN could use 'git describe' to get a version number. But
now .git might as well be a file so the test must be extended to cater for
such setups.
Signed-off-by: Lars Hjemli <hjemli@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When git-submodule tries to detect 'active' submodules, it checks for the
existence of a directory named '.git'. This isn't good enough now that .git
can be a file pointing to the real $GIT_DIR so the tests are changed to
reflect this.
Signed-off-by: Lars Hjemli <hjemli@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When .git in a submodule is a file, resolve_gitlink_ref() needs to pick up
the real GIT_DIR of the submodule from that file.
Signed-off-by: Lars Hjemli <hjemli@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This patch allows .git to be a regular textfile containing the path of
the real git directory (prefixed with "gitdir: "), which can be useful on
platforms lacking support for real symlinks.
Signed-off-by: Lars Hjemli <hjemli@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This expands on the previous patch, and allows "git add" to sanely handle
a filename that has changed case, keeping the case in the index constant,
and avoiding aliases.
In particular, if you have an index entry called "File", but the
checked-out tree is case-corrupted and has an entry called "file"
instead, doing a
git add .
(or naming "file" explicitly) will automatically notice that we have an
alias, and will replace the name "file" with the existing index
capitalization (ie "File").
However, if we actually have *both* a file called "File" and one called
"file", and they don't have the same lstat() information (ie we're on a
case-sensitive filesystem but have the "core.ignorecase" flag set), we
will error out if we try to add them both.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This simplifies the matching case of "I already have this file and it is
up-to-date" and makes it do the right thing in the face of
case-insensitive aliases.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is immaterial on sane filesystems, but if you have a broken (aka
case-insensitive) filesystem, and the objective is to remove the file
'abc' and replace it with the file 'Abc', then we must make sure to do
the removal first.
Otherwise, you'd first update the file 'Abc' - which would just
overwrite the file 'abc' due to the broken case-insensitive filesystem -
and then remove file 'abc' - which would now brokenly remove the just
updated file 'Abc' on that broken filesystem.
By doing removals first, this won't happen.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If we find an unexpected file, see if that filename perhaps exists in a
case-insensitive way in the index, and whether the file matches that. If
so, ignore it as a known pre-existing file of a different name.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
..and start using it for directory entry traversal (ie "git status" will
not consider entries that match an existing entry case-insensitively to
be a new file)
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Right now nobody uses it, but "index_name_exists()" gets a flag so
you can enable it on a case-by-case basis.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This allows verify_absent() in unpack_trees() to use the hash chains
rather than looking it up using the binary search.
Perhaps more importantly, it's also going to be useful for the next phase,
where we actually start looking at the cache entry when we do
case-insensitive lookups and checking the result.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It's really totally separate functionality, and if we want to start
doing case-insensitive hash lookups, I'd rather do it when it's
separated out.
It also renames "remove_index_entry()" to "remove_name_hash()", because
that really describes the thing better. It doesn't actually remove the
index entry, that's done by "remove_index_entry_at()", which is something
very different, despite the similarity in names.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of wasting space with whole integers for a single bit.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* mk/unpack-careful:
t5300: add test for "index-pack --strict"
receive-pack: allow using --strict mode for unpacking objects
unpack-objects: fix --strict handling
t5300: add test for "unpack-objects --strict"
unpack-objects: prevent writing of inconsistent objects
* fl/send-email-outside:
send-email: Don't require to be called in a repository
Git.pm: Don't require repository instance for ident
Git.pm: Don't require a repository instance for config
var: Don't require to be in a git repository.
When using git-svn to follow only a single (empty) path per
svn-remote (i.e. not using --stdlayout), following the history
of a renamed path was broken in
c586879cdf.
This reverts the regression for the single (emtpy) path per
svn-remote case.
To avoid breaking the tests in a committed revision, this is an
addendum to a patch originally submitted by
Santhosh Kumar Mani <santhoshmani@gmail.com>:
> git-svn: add test for renamed directory fetch
>
> This test tries to fetch a directory which had renames in the
> history from a SVN repository.
[ew: unneccesary dependency on the starting an HTTP server
removed from Santhosh's original test.]
Signed-off-by: Eric Wong <normalperson@yhbt.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In commit 15387e3 (Test suite: reset TERM to its previous value after
testing., 2007-10-26), I added a workaround to reset TERM to its previous
value before the "test_done" at the end of "t7005-editor.sh" because
otherwise "test_done" would have printed the test result with a bad TERM
env variable (this resulted in output with no color on konsole).
But since commit c2116a1 (test-lib: fix TERM to dumb for test
repeatability, 2008-03-06), colored output is printed in a subshell with
TERM reset to its original value so the earlier workaround is not needed
anymore.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
An earlier commit 4be6096 (apply --unidiff-zero: loosen sanity checks for
--unidiff=0 patches, 2006-09-17) made match_beginning and match_end
computed incorrectly. If a hunk inserts at the beginning, old position
recorded at the hunk is line 0, and if a hunk changes at the beginning, it
is line 1. The new test added to t4104 exposes that the old code did not
insist on matching at the beginning for a patch to add a line to an empty
file.
An even older 65aadb9 (apply: force matching at the beginning.,
2006-05-24) was equally wrong in that it tried to take hints from the
number of leading context lines, to decide if the hunk must match at the
beginning, but we can just look at the line number in the hunk to decide.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Peter Eriksen <s022018@student.dtu.dk>
Acked-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
With tcl/tk8.5 the lset command seems to behave differently. When
changing the background color through Edit->Preferences, the changes
are applied, but new dialogs, such as View->New view... barf with
Error: unknown color name "{#ffffff}"
Additionally when closing gitk, and starting it up again, a bad value
has been saved to ~/.gitk, preventing gitk from running properly; it
fails with
Error in startup script: unknown color name "{#ffffff}"
...
This commit fixes the problem by changing the color dialogs to pass
the empty string {} as the list index to choosecolor. This causes
the lset and lindex commands used by choosecolor to use and set the
whole variable (bgcolor, fgcolor or selectbgcolor) rather than
treating them as a 1-element list. Tested with tcl/tk8.4 and 8.5.
Dmitry Potapov reported this problem through
http://bugs.debian.org/472615
Signed-off-by: Gerrit Pape <pape@smarden.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
It is a bit confusing on first read, that
"The packed archive format (.pack) is designed
to be unpackable..."
Signed-off-by: Peter Eriksen <s022018@student.dtu.dk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When git-fetch encounters the refspec "tag" it assumes that the next
argument will be a tag name. If there is no next argument, it should
die gracefully instead of erroring.
Signed-off-by: Kevin Ballard <kevin@sb.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This reverts commit 6aa6f92fda.
It caused is_deleted() subroutine to output warnings when dealing with
old, legacy gitweb blobdiff URLs without either 'hb' or 'hpb'
parameters.
This fixes http://bugs.debian.org/469083
Signed-off-by: Jakub Narebski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The earlier one did not correctly propagate GITWEB_CONFIG_SYSTEM from
Makefile to generated gitweb.cgi script.
Signed-off-by: Gerrit Pape <pape@smarden.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* git://repo.or.cz/git-gui:
git-gui: use +/- instead of ]/[ to show more/less context in diff
git-gui: Update french translation
git-gui: Switch keybindings for [ and ] to bracketleft and bracketright
On some systems, brackets cannot be used as event details
(they don't have a keysym), so use +/- instead (both on
keyboard and keypad) and add ctrl-= as a synonym of ctrl-+
for convenience.
[sp: Had to change accelerator to show only "$M1T-="; the
original version included "$M1T-+ $M1T-=" but this is
not drawn at all on Mac OS X.]
Signed-off-by: Michele Ballabio <barra_cuda@katamail.com>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
The rate of fixes that trickle in has slowed and we are definitely
getting there. Hopefully one final round and we will have the final
1.5.5 soon.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The interactive mode does not work with files whose names contain
characters that need C-quoting. `core.quotepath` configuration can be
used to work this limitation around to some degree, but backslash,
double-quote and control characters will still have problems.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* dd/cvsserver:
cvsserver: Use the user part of the email in log and annotate results
cvsserver: Add test for update -p
cvsserver: Implement update -p (print to stdout)
cvsserver: Add a few tests for 'status' command
cvsserver: Do not include status output for subdirectories if -l is passed
cvsserver: Only print the file part of the filename in status header
cvsserver: Respond to the 'editors' and 'watchers' commands
This test was already careful enough to skip signed tag tests if gpg
is not available, but it must also skip all verify tests, even those
that are about non-signed tags, because they also invoke gpg.
Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>