The preceding code checks that view->max_off is nonnegative and
(off + width) fits in an off_t, so this code is already safe.
Signed-off-by: David Barr <davidbarr@google.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
These are already safe because both sides of the comparison are
nonnegative.
This would normally not be important because Git is not -Wsign-compare
clean anyway, but we like to hold the vcs-svn/ lib to a higher
standard so it remains convenient to reuse in other projects.
Signed-off-by: David Barr <davidbarr@google.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
memmem is a GNU extension.
Avoiding it makes the code clearer and makes it easier for projects
that don't share git's compat/ code, such as the standalone
svn-dump-fast-export project, to reuse the vcs-svn/ library.
Signed-off-by: David Barr <davidbarr@google.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Since the length of t is already known, we can simplify a little by
using memcmp() instead of strncmp() to carry out a prefix comparison.
All nearby code already does this.
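A minimal sketch of the pattern (the helper name and parameters are
illustrative, not the actual vcs-svn code):

    #include <string.h>

    /*
     * With t's length already known, a prefix test needs only
     * memcmp(); strncmp()'s extra NUL handling buys nothing here.
     */
    static int has_prefix(const char *t, size_t t_len,
                          const char *prefix, size_t prefix_len)
    {
        return t_len >= prefix_len && !memcmp(t, prefix, prefix_len);
    }
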
Noticed in the standalone svn-dump-fast-export project, which has not
needed to implement prefixcmp() yet.
Signed-off-by: David Barr <davidbarr@google.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Currently the cleanup code looks like this:

        free resources
        return 0;
    error_out:
        free resources
        return -1;

Avoid duplicating the "free resources" part by keeping the return
value in a variable and sharing code between the success and
exceptional case:

        ret = 0;
    out:
        free resources
        return ret;

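As a hedged illustration of the resulting shape (the file handling here
is made up for the example, not the actual vcs-svn code):

    #include <stdio.h>

    static int count_lines(const char *path, long *lines)
    {
        int ret = 0;
        int c;
        FILE *f = fopen(path, "r");

        if (!f)
            return -1;
        *lines = 0;
        while ((c = fgetc(f)) != EOF)
            if (c == '\n')
                (*lines)++;
        if (ferror(f)) {
            ret = -1;    /* error path shares the cleanup below */
            goto out;
        }
    out:
        fclose(f);       /* resources are freed exactly once */
        return ret;
    }
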
Noticed in the svn-dump-fast-export project, where using the error()
macro in void context produces a warning.
Signed-off-by: David Barr <davidbarr@google.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Without this change, clang complains:
    vcs-svn/svndiff.c:298:3: warning: Assigned value is garbage or undefined
            off_t pre_off = pre_off; /* stupid GCC... */
                  ^         ~~~~~~~
This code uses an old and common idiom for suppressing an
"uninitialized variable" warning, and clang is wrong to warn about it.
The idiom tells the compiler to leave the variable uninitialized,
which saves a few bytes of code size, and, more importantly, allows
valgrind to check at runtime that the variable is properly initialized
by the time it is used.
But MSVC and clang do not know that idiom, so let's avoid it in
vcs-svn/ code.
Initialize pre_off to -1, a recognizably meaningless value, so that
future code changes that cause pre_off to be used before it is
properly initialized can be caught early.
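In other words (the surrounding delta-window code is omitted; only the
declaration changes):

    /* before: self-initialization, which MSVC and clang flag */
    off_t pre_off = pre_off;    /* stupid GCC... */

    /* after: recognizably meaningless, so misuse is still easy to spot */
    off_t pre_off = -1;
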
Signed-off-by: David Barr <davidbarr@google.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Since v1.7.5~42^2~6 (vcs-svn: remove buffer_read_string),
buffer_reset() does nothing, and hence fast_export_reset() does
nothing either.
Signed-off-by: David Barr <davidbarr@google.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
On 32-bit architectures with 64-bit file offsets, gcc 4.3 and earlier
produce the following warning:
        CC vcs-svn/sliding_window.o
    vcs-svn/sliding_window.c: In function `check_overflow':
    vcs-svn/sliding_window.c:36: warning: comparison is always false \
        due to limited range of data type
The warning appears even when gcc is run without any warning flags
(PR12963). In later versions it can be reproduced with -Wtype-limits,
which is implied by -Wextra.
On 64-bit architectures it really is possible for a size_t not to be
representable as an off_t, so the check being warned about is not
actually redundant. But even false positives are distracting. Avoid
the warning by making the "len" argument to check_overflow a
uintmax_t; no functional change intended.
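A hedged sketch of the resulting shape (this is not the literal
sliding_window.c code; the off_t maximum is computed inline instead of
with git's helper macros, and offset is assumed nonnegative):

    #include <limits.h>
    #include <stdint.h>
    #include <sys/types.h>

    /*
     * Taking len as uintmax_t keeps the "does len fit in an off_t"
     * test meaningful on every platform, so gcc no longer sees a
     * comparison that is always false on 32-bit builds with 64-bit
     * file offsets.
     */
    static int check_overflow(off_t offset, uintmax_t len)
    {
        const uintmax_t off_t_max =
            ((uintmax_t)1 << (sizeof(off_t) * CHAR_BIT - 1)) - 1;

        if (len > off_t_max)
            return -1;    /* len alone does not fit in an off_t */
        if ((uintmax_t)offset > off_t_max - len)
            return -1;    /* offset + len would overflow */
        return 0;
    }
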
Reported-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
There is no reason in principle that an svn-format dump would not
be able to represent a file whose length does not fit in a 32-bit
integer. Use off_t consistently (instead of uint32_t) to represent
file lengths so we can handle that.
Most of our code is ready to do that even without this patch and
already passes values of type off_t around. The type mismatch in the
remaining stragglers was noticed with gcc -Wtype-limits.
Inspired-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
The canonical interpretation of a range a,b is as an interval [a,b),
not [a,a+b), so it feels unnatural for this function to take arguments
named a and b. Use the more explicit names "offset" and "len" to make
the arguments' types and roles clearer.
While at it, rename the function to convey that we are making sure the
sum of this offset and length does not overflow an off_t, not a
size_t.
[jn: split out from a patch from Ramsay Jones, then improved with
advice from Thomas Rast, Dmitry Ivankov, and David Barr]
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Improved-by: Dmitry Ivankov <divanorama@gmail.com>
Curiously, pre_len, given to read_length(), does not trigger the same
warning even though the code structure is the same. Most likely this is
because read_offset() is used only once, so inlining it gives gcc a
chance to do more flow analysis. Alas, the analysis is flawed, so it
does not help X-<.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This simplifies svn-fe a great deal and fulfills a longstanding wish:
support for dumps with deltas in them, and incremental imports.
The cost is that commandline usage of the svn-fe tool becomes a little
more complicated since it no longer keeps state itself but instead reads
blobs back from fast-import in order to copy them between revisions and
apply deltas to them.
This also removes a couple of custom data structures, replacing them
with strbufs as in other parts of Git.
* 'svn-fe' of git://repo.or.cz/git/jrn: (32 commits)
vcs-svn: reset first_commit_done in fast_export_init
vcs-svn: do not initialize report_buffer twice
vcs-svn: avoid hangs from corrupt deltas
vcs-svn: guard against overflow when computing preimage length
vcs-svn: cap number of bytes read from sliding view
test-svn-fe: split off "test-svn-fe -d" into a separate function
vcs-svn: implement text-delta handling
vcs-svn: let deltas use data from preimage
vcs-svn: let deltas use data from postimage
vcs-svn: verify that deltas consume all inline data
vcs-svn: implement copyfrom_data delta instruction
vcs-svn: read instructions from deltas
vcs-svn: read inline data from deltas
vcs-svn: read the preimage when applying deltas
vcs-svn: parse svndiff0 window header
vcs-svn: skeleton of an svn delta parser
vcs-svn: make buffer_read_binary API more convenient
vcs-svn: learn to maintain a sliding view of a file
Makefile: list one vcs-svn/xdiff object or header per line
vcs-svn: avoid using ls command twice
...
Conflicts:
    Makefile
    contrib/svn-fe/svn-fe.txt
The pathspec structure has a few bits of data to drive various operation
modes after we unified the pathspec matching logic in various codepaths.
For example, the max_depth field is there so that "git grep" can limit
its output to files found within a limited depth of tree traversal. Also,
to show just the surface-level differences in "git diff-tree", the
recursive field, when set to false, stops us from descending into deeper
levels of the tree structure, and this also affects pathspec matching
when we have wildcards in the pathspec.
diff-index has always wanted the recursive behaviour and wanted to match
pathspecs without any depth limit, but we forgot to do so when we
updated the tree_entry_interesting() logic to unify the pathspec
matching logic.
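As a sketch of what the description above implies (the struct and field
names here follow the prose and are illustrative, not the actual git
definitions):

    /* illustrative stand-in for the relevant pathspec knobs */
    struct pathspec_opts {
        int recursive;    /* descend into subtrees while matching */
        int max_depth;    /* -1 means "no depth limit" */
    };

    static void setup_for_diff_index(struct pathspec_opts *opts)
    {
        opts->recursive = 1;    /* always recurse, as diff-index expects */
        opts->max_depth = -1;   /* never limit pathspec matching depth */
    }
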
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* jk/credentials:
credential-cache: ignore "connection refused" errors
unix-socket: do not let close() or chdir() clobber errno during cleanup
credential-cache: report more daemon connection errors
unix-socket: handle long socket pathnames
The credential-cache helper will try to connect to its
daemon over a unix socket. Originally, a failure to do so
was silently ignored, and we would either give up (if
performing a "get" or "erase" operation), or spawn a new
daemon (for a "store" operation).
But since 8ec6c8d, we try to report more errors. We detect a
missing daemon by checking for ENOENT on our connection
attempt. If the daemon is missing, we continue as before
(giving up or spawning a new daemon). For any other error,
we die and report the problem.
However, checking for ENOENT is not sufficient to detect a missing
daemon. We might also get ECONNREFUSED if a dead daemon
process left a stale socket. This generally shouldn't
happen, as the daemon cleans up after itself, but the daemon
may not always be given a chance to do so (e.g., power loss,
"kill -9").
The resulting state is annoying not just because the helper
outputs an extra useless message, but because it actually
blocks the helper from spawning a new daemon to replace the
stale socket.
Fix it by checking for ECONNREFUSED.
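A hedged sketch of the connection logic described above (this is not the
helper's actual code; the return-value convention is made up for the
example):

    #include <errno.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <sys/un.h>
    #include <unistd.h>

    /*
     * Treat both ENOENT (no socket at all) and ECONNREFUSED (stale
     * socket left by a dead daemon) as "no daemon running", and let
     * only other errors be reported to the user.
     */
    static int connect_to_cache(const char *path)
    {
        struct sockaddr_un sa;
        int fd = socket(AF_UNIX, SOCK_STREAM, 0);

        if (fd < 0)
            return -2;    /* real error: report it */
        memset(&sa, 0, sizeof(sa));
        sa.sun_family = AF_UNIX;
        strncpy(sa.sun_path, path, sizeof(sa.sun_path) - 1);

        if (connect(fd, (struct sockaddr *)&sa, sizeof(sa)) < 0) {
            int saved = errno;
            close(fd);
            errno = saved;
            if (errno == ENOENT || errno == ECONNREFUSED)
                return -1;    /* no daemon: caller may spawn one */
            return -2;        /* real error: report it */
        }
        return fd;
    }
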
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The pathspec structure has a few bits of data to drive various operation
modes after we unified the pathspec matching logic in various codepaths.
For example, the max_depth field is there so that "git grep" can limit
its output to files found within a limited depth of tree traversal. Also,
to show just the surface-level differences in "git diff-tree", the
recursive field, when set to false, stops us from descending into deeper
levels of the tree structure, and this also affects pathspec matching
when we have wildcards in the pathspec.
diff-index has always wanted the recursive behaviour and wanted to match
pathspecs without any depth limit, but we forgot to do so when we
updated the tree_entry_interesting() logic to unify the pathspec
matching logic.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It's actually unlimited recursion if wildcards are active, regardless
of --max-depth.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Two "^" characters were incorrectly being interpreted as markup for
superscripting. Fix them by writing them as attribute references
"{caret}".
Although a single "^" character in a paragraph cannot be
misinterpreted in this way, also write other "^" characters as
"{caret}" in the interest of good hygiene (unless they are in literal
paragraphs, of course, in which context attribute references are not
recognized).
Spell "{}" consistently, namely *not* quoted as "\{\}". Since the
braces are empty, they cannot be interpreted as an attribute
reference, and either spelling is OK. So arbitrarily choose one
variation and use it consistently.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* maint:
Update draft release notes to 1.7.8.4
Update draft release notes to 1.7.7.6
Update draft release notes to 1.7.6.6
thin-pack: try harder to use preferred base objects as base
When creating a pack using objects that reside in existing packs, we try
to avoid recomputing futile delta between an object (trg) and a candidate
for its base object (src) if they are stored in the same packfile, and trg
is not recorded as a delta already. This heuristic makes sense because it
is likely that we tried to express trg as a delta based on src but it did
not produce a good delta when we created the existing pack.
As the pack heuristics prefer producing deltas that remove data, and
Linus's law dictates that the size of a file grows over time, we tend to
record the newest version of the file in full, and older ones as deltas
against it.
When creating a thin-pack to transfer recent history, it is likely that we
will try to send an object that is recorded in full, as it is newer. But
the heuristic to avoid recomputing futile deltas effectively forbids us
from attempting to express such an object as a delta based on another
object. Sending an object in full is often more expensive than sending a
suboptimal delta based on other objects, and even more so if we could
use an object we know the receiving end already has (i.e. a preferred
base object) as the delta base.
Tweak the recomputation avoidance logic, so that we do not punt on
computing delta against a preferred base object.
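A hedged sketch of the tweaked test (the types are simplified stand-ins,
not the literal pack-objects.c bookkeeping):

    /* minimal stand-in for what pack-objects tracks per object */
    struct entry_info {
        const void *in_pack;    /* pack the object currently lives in */
        int stored_as_delta;    /* already stored as a delta there? */
        int preferred_base;     /* receiver is known to have it */
    };

    /*
     * Skip recomputing a delta of trg against src only when both live
     * in the same pack, trg is stored in full, and src is NOT a
     * preferred base -- that last condition is the tweak.
     */
    static int skip_futile_delta(const struct entry_info *trg,
                                 const struct entry_info *src)
    {
        return trg->in_pack &&
               trg->in_pack == src->in_pack &&
               !trg->stored_as_delta &&
               !src->preferred_base;
    }
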
The effect of this change can be seen on two simulated upload-pack
workloads. The first is based on 44 reflog entries from my git.git
origin/master reflog, and represents the packs that kernel.org has sent
me for git updates over the past month or two. The second workload
represents much
larger fetches, going from git's v1.0.0 tag to v1.1.0, then v1.1.0 to
v1.2.0, and so on.
The table below shows the average generated pack size and the average CPU
time consumed for each dataset, both before and after the patch:

                     dataset
              | reflog |    tags
  --------------------------------
       before |  53358 | 2750977
  size  after |  32398 | 2668479
       change |   -39% |     -3%
  --------------------------------
       before |   0.18 |    1.12
  CPU   after |   0.18 |    1.15
       change |    +0% |     +3%

This patch makes a much bigger difference for packs with a shorter slice
of history (since its effect is seen at the boundaries of the pack) though
it has some benefit even for larger packs.
Signed-off-by: Jeff King <peff@peff.net>
Acked-by: Nicolas Pitre <nico@fluxnic.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The word-diff logic accumulates + and - lines until another line type
appears (normally [ @\]), at which point it generates the word diff.
This is usually correct, but it breaks when the preimage does not have
a newline at EOF:
    $ printf "%s" "a a a" >a
    $ printf "%s\n" "a ab a" >b
    $ git diff --no-index --word-diff a b
    diff --git 1/a 2/b
    index 9f68e94..6a7c02f 100644
    --- 1/a
    +++ 2/b
    @@ -1 +1 @@
    [-a a a-]
    No newline at end of file
    {+a ab a+}
Because of the order of the lines in a unified diff
    @@ -1 +1 @@
    -a a a
    \ No newline at end of file
    +a ab a
the '\' line flushed the buffers, and the - and + lines were never
matched with each other.
A proper fix would defer such markers until the end of the hunk.
However, word-diff is inherently whitespace-ignoring, so as a cheap
fix simply ignore the marker (and hide it from the output).
We use a prefix match for '\ ' to parallel the logic in
apply.c:parse_fragment(). We currently do not localize this string
(just accept other variants of it in git-apply), but this should be
future-proof.
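A hedged sketch of the check (the word-diff buffering code around it is
omitted; the helper name is made up):

    #include <string.h>

    /*
     * Match only the leading "\ ", as apply.c:parse_fragment() does,
     * so localized variants of "No newline at end of file" are also
     * recognized.
     */
    static int is_incomplete_line_marker(const char *line, size_t len)
    {
        return len >= 2 && !memcmp(line, "\\ ", 2);
    }
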
Noticed-by: Ivan Shirokoff <shirokoff@yandex-team.ru>
Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The tightening done in ee27ca4a (archive: don't let remote clients
get unreachable commits, 2011-11-17) went too far and disallowed
"HEAD:Documentation", as it would try to find "HEAD:Documentation" as
a ref.
Only DWIM the "HEAD" part to see if it exists as a ref. Once we're
sure that we've been given a valid ref, we follow the normal code
path. This still disallows attempts to access commits which are not
branch tips.
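A hedged sketch of the DWIM split described above (not the actual
archive.c code; the helper is illustrative):

    #include <stdlib.h>
    #include <string.h>

    /*
     * Take only the part before the colon, e.g. "HEAD" out of
     * "HEAD:Documentation", so that just that part is checked for
     * existence as a ref; the full spec is then parsed as usual.
     */
    static char *ref_part(const char *spec)
    {
        const char *colon = strchr(spec, ':');
        size_t len = colon ? (size_t)(colon - spec) : strlen(spec);

        return strndup(spec, len);
    }
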
Signed-off-by: Carlos Martín Nieto <cmn@elego.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This function frees the individual "struct match_attr"s we
have allocated, but forgets to free the array holding their
pointers, leading to a minor memory leak (but it can add up
after checking attributes for paths in many directories).
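A hedged sketch of the complete cleanup (the container struct here is a
simplified stand-in, not the real attr.c data structure):

    #include <stdlib.h>

    struct match_attr;    /* opaque for this sketch */

    struct attr_list {
        struct match_attr **attrs;    /* array of pointers */
        unsigned num;
    };

    static void free_attr_list(struct attr_list *list)
    {
        unsigned i;

        for (i = 0; i < list->num; i++)
            free(list->attrs[i]);    /* the individual match_attrs */
        free(list->attrs);    /* the array itself -- previously leaked */
        list->attrs = NULL;
        list->num = 0;
    }
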
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Due to MSYS path mangling, GIT_DIR contains a Windows-style path when
checked inside a Perl script, even if GIT_DIR was previously set to an
MSYS-style path in a shell script. So explicitly convert it to an
MSYS-style path before calling Perl's rel2abs() to make it work.
This fix was inspired by a very similar patch in WebKit:
http://trac.webkit.org/changeset/76255/trunk/Tools/Scripts/commit-log-editor
Signed-off-by: Sebastian Schuberth <sschuberth@gmail.com>
Tested-by: Pat Thoyts <patthoyts@users.sourceforge.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>