mirrors/git - Incest Forge: Beyond sex. We incest.

mirrors/git

mirror of https://github.com/git/git.git synced 2024-11-15 21:53:44 +01:00

Author	SHA1	Message	Date
Jeff King	212f2ffbf0	t: add basic bitmap functionality tests Now that we can read and write bitmaps, we can exercise them with some basic functionality tests. These tests aren't particularly useful for seeing the benefit, as the test repo is too small for it to make a difference. However, we can at least check that using bitmaps does not break anything. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-12-30 12:19:23 -08:00
Nguyễn Thái Ngọc Duy	d3d3e4c490	count-objects: recognize .bitmap in garbage-checking Count-objects will report any "garbage" files in the packs directory, including files whose extensions it does not know (case 1), and files whose matching ".pack" file is missing (case 2). Without having learned about ".bitmap" files, the current code reports all such files as garbage (case 1), even if their pack exists. Instead, they should be treated as case 2. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-12-30 12:19:23 -08:00
Vicent Marti	5cf2741c5a	repack: consider bitmaps when performing repacks Since `pack-objects` will write a `.bitmap` file next to the `.pack` and `.idx` files, this commit teaches `git-repack` to consider the new bitmap indexes (if they exist) when performing repack operations. This implies moving old bitmap indexes out of the way if we are repacking a repository that already has them, and moving the newly generated bitmap indexes into the `objects/pack` directory, next to their corresponding packfiles. Since `git repack` is now capable of handling these `.bitmap` files, a normal `git gc` run on a repository that has `pack.writebitmaps` set to true in its config file will generate bitmap indexes as part of the garbage collection process. Alternatively, `git repack` can be called with the `-b` switch to explicitly generate bitmap indexes if you are experimenting and don't want them on all the time. Signed-off-by: Vicent Marti <tanoku@gmail.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-12-30 12:19:23 -08:00
Jeff King	b77fcd1edc	repack: handle optional files created by pack-objects We ask pack-objects to pack to a set of temporary files, and then rename them into place. Some files that pack-objects creates may be optional (like a .bitmap file), in which case we would not want to call rename(). We already call stat() and make the chmod optional if the file cannot be accessed. We could simply skip the rename step in this case, but that would be a minor regression in noticing problems with non-optional files (like the .pack and .idx files). Instead, we can now annotate extensions as optional, and skip them if they don't exist (and otherwise rely on rename() to barf). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-12-30 12:19:23 -08:00
Jeff King	42a02d8529	repack: turn exts array into array-of-struct This is slightly more verbose, but will let us annotate the extensions with further options in future commits. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-12-30 12:19:23 -08:00
Jeff King	b328c2166e	repack: stop using magic number for ARRAY_SIZE(exts) We have a static array of extensions, but hardcode the size of the array in our loops. Let's pull out this magic number, which will make it easier to change. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-12-30 12:19:23 -08:00
Vicent Marti	7cc8f97108	pack-objects: implement bitmap writing This commit extends more the functionality of `pack-objects` by allowing it to write out a `.bitmap` index next to any written packs, together with the `.idx` index that currently gets written. If bitmap writing is enabled for a given repository (either by calling `pack-objects` with the `--write-bitmap-index` flag or by having `pack.writebitmaps` set to `true` in the config) and pack-objects is writing a packfile that would normally be indexed (i.e. not piping to stdout), we will attempt to write the corresponding bitmap index for the packfile. Bitmap index writing happens after the packfile and its index has been successfully written to disk (`finish_tmp_packfile`). The process is performed in several steps: 1. `bitmap_writer_set_checksum`: this call stores the partial checksum for the packfile being written; the checksum will be written in the resulting bitmap index to verify its integrity 2. `bitmap_writer_build_type_index`: this call uses the array of `struct object_entry` that has just been sorted when writing out the actual packfile index to disk to generate 4 type-index bitmaps (one for each object type). These bitmaps have their nth bit set if the given object is of the bitmap's type. E.g. the nth bit of the Commits bitmap will be 1 if the nth object in the packfile index is a commit. This is a very cheap operation because the bitmap writing code has access to the metadata stored in the `struct object_entry` array, and hence the real type for each object in the packfile. 3. `bitmap_writer_reuse_bitmaps`: if there exists an existing bitmap index for one of the packfiles we're trying to repack, this call will efficiently rebuild the existing bitmaps so they can be reused on the new index. All the existing bitmaps will be stored in a `reuse` hash table, and the commit selection phase will prioritize these when selecting, as they can be written directly to the new index without having to perform a revision walk to fill the bitmap. This can greatly speed up the repack of a repository that already has bitmaps. 4. `bitmap_writer_select_commits`: if bitmap writing is enabled for a given `pack-objects` run, the sequence of commits generated during the Counting Objects phase will be stored in an array. We then use that array to build up the list of selected commits. Writing a bitmap in the index for each object in the repository would be cost-prohibitive, so we use a simple heuristic to pick the commits that will be indexed with bitmaps. The current heuristics are a simplified version of JGit's original implementation. We select a higher density of commits depending on their age: the 100 most recent commits are always selected, after that we pick 1 commit of each 100, and the gap increases as the commits grow older. On top of that, we make sure that every single branch that has not been merged (all the tips that would be required from a clone) gets their own bitmap, and when selecting commits between a gap, we tend to prioritize the commit with the most parents. Do note that there is no right/wrong way to perform commit selection; different selection algorithms will result in different commits being selected, but there's no such thing as "missing a commit". The bitmap walker algorithm implemented in `prepare_bitmap_walk` is able to adapt to missing bitmaps by performing manual walks that complete the bitmap: the ideal selection algorithm, however, would select the commits that are more likely to be used as roots for a walk in the future (e.g. the tips of each branch, and so on) to ensure a bitmap for them is always available. 5. `bitmap_writer_build`: this is the computationally expensive part of bitmap generation. Based on the list of commits that were selected in the previous step, we perform several incremental walks to generate the bitmap for each commit. The walks begin from the oldest commit, and are built up incrementally for each branch. E.g. consider this dag where A, B, C, D, E, F are the selected commits, and a, b, c, e are a chunk of simplified history that will not receive bitmaps. A---a---B--b--C--c--D \ E--e--F We start by building the bitmap for A, using A as the root for a revision walk and marking all the objects that are reachable until the walk is over. Once this bitmap is stored, we reuse the bitmap walker to perform the walk for B, assuming that once we reach A again, the walk will be terminated because A has already been SEEN on the previous walk. This process is repeated for C, and D, but when we try to generate the bitmaps for E, we can reuse neither the current walk nor the bitmap we have generated so far. What we do now is resetting both the walk and clearing the bitmap, and performing the walk from scratch using E as the origin. This new walk, however, does not need to be completed. Once we hit B, we can lookup the bitmap we have already stored for that commit and OR it with the existing bitmap we've composed so far, allowing us to limit the walk early. After all the bitmaps have been generated, another iteration through the list of commits is performed to find the best XOR offsets for compression before writing them to disk. Because of the incremental nature of these bitmaps, XORing one of them with its predecesor results in a minimal "bitmap delta" most of the time. We can write this delta to the on-disk bitmap index, and then re-compose the original bitmaps by XORing them again when loaded. This is a phase very similar to pack-object's `find_delta` (using bitmaps instead of objects, of course), except the heuristics have been greatly simplified: we only check the 10 bitmaps before any given one to find best compressing one. This gives good results in practice, because there is locality in the ordering of the objects (and therefore bitmaps) in the packfile. 6. `bitmap_writer_finish`: the last step in the process is serializing to disk all the bitmap data that has been generated in the two previous steps. The bitmap is written to a tmp file and then moved atomically to its final destination, using the same process as `pack-write.c:write_idx_file`. Signed-off-by: Vicent Marti <tanoku@gmail.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-12-30 12:19:22 -08:00
Vicent Marti	aa32939fea	rev-list: add bitmap mode to speed up object lists The bitmap reachability index used to speed up the counting objects phase during `pack-objects` can also be used to optimize a normal rev-list if the only thing required are the SHA1s of the objects during the list (i.e., not the path names at which trees and blobs were found). Calling `git rev-list --objects --use-bitmap-index [committish]` will perform an object iteration based on a bitmap result instead of actually walking the object graph. These are some example timings for `torvalds/linux` (warm cache, best-of-five): $ time git rev-list --objects master > /dev/null real 0m34.191s user 0m33.904s sys 0m0.268s $ time git rev-list --objects --use-bitmap-index master > /dev/null real 0m1.041s user 0m0.976s sys 0m0.064s Likewise, using `git rev-list --count --use-bitmap-index` will speed up the counting operation by building the resulting bitmap and performing a fast popcount (number of bits set on the bitmap) on the result. Here are some sample timings of different ways to count commits in `torvalds/linux`: $ time git rev-list master \| wc -l 399882 real 0m6.524s user 0m6.060s sys 0m3.284s $ time git rev-list --count master 399882 real 0m4.318s user 0m4.236s sys 0m0.076s $ time git rev-list --use-bitmap-index --count master 399882 real 0m0.217s user 0m0.176s sys 0m0.040s This also respects negative refs, so you can use it to count a slice of history: $ time git rev-list --count v3.0..master 144843 real 0m1.971s user 0m1.932s sys 0m0.036s $ time git rev-list --use-bitmap-index --count v3.0..master real 0m0.280s user 0m0.220s sys 0m0.056s Though note that the closer the endpoints, the less it helps. In the traversal case, we have fewer commits to cross, so we take less time. But the bitmap time is dominated by generating the pack revindex, which is constant with respect to the refs given. Note that you cannot yet get a fast --left-right count of a symmetric difference (e.g., "--count --left-right master...topic"). The slow part of that walk actually happens during the merge-base determination when we parse "master...topic". Even though a count does not actually need to know the real merge base (it only needs to take the symmetric difference of the bitmaps), the revision code would require some refactoring to handle this case. Additionally, a `--test-bitmap` flag has been added that will perform the same rev-list manually (i.e. using a normal revwalk) and using bitmaps, and verify that the results are the same. This can be used to exercise the bitmap code, and also to verify that the contents of the .bitmap file are sane. Signed-off-by: Vicent Marti <tanoku@gmail.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-12-30 12:19:22 -08:00
Vicent Marti	6b8fda2db1	pack-objects: use bitmaps when packing objects In this patch, we use the bitmap API to perform the `Counting Objects` phase in pack-objects, rather than a traditional walk through the object graph. For a reasonably-packed large repo, the time to fetch and clone is often dominated by the full-object revision walk during the Counting Objects phase. Using bitmaps can reduce the CPU time required on the server (and therefore start sending the actual pack data with less delay). For bitmaps to be used, the following must be true: 1. We must be packing to stdout (as a normal `pack-objects` from `upload-pack` would do). 2. There must be a .bitmap index containing at least one of the "have" objects that the client is asking for. 3. Bitmaps must be enabled (they are enabled by default, but can be disabled by setting `pack.usebitmaps` to false, or by using `--no-use-bitmap-index` on the command-line). If any of these is not true, we fall back to doing a normal walk of the object graph. Here are some sample timings from a full pack of `torvalds/linux` (i.e. something very similar to what would be generated for a clone of the repository) that show the speedup produced by various methods: [existing graph traversal] $ time git pack-objects --all --stdout --no-use-bitmap-index \ </dev/null >/dev/null Counting objects: 3237103, done. Compressing objects: 100% (508752/508752), done. Total 3237103 (delta 2699584), reused 3237103 (delta 2699584) real 0m44.111s user 0m42.396s sys 0m3.544s [bitmaps only, without partial pack reuse; note that pack reuse is automatic, so timing this required a patch to disable it] $ time git pack-objects --all --stdout </dev/null >/dev/null Counting objects: 3237103, done. Compressing objects: 100% (508752/508752), done. Total 3237103 (delta 2699584), reused 3237103 (delta 2699584) real 0m5.413s user 0m5.604s sys 0m1.804s [bitmaps with pack reuse (what you get with this patch)] $ time git pack-objects --all --stdout </dev/null >/dev/null Reusing existing pack: 3237103, done. Total 3237103 (delta 0), reused 0 (delta 0) real 0m1.636s user 0m1.460s sys 0m0.172s Signed-off-by: Vicent Marti <tanoku@gmail.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-12-30 12:19:22 -08:00
Jeff King	ce2bc42456	pack-objects: split add_object_entry This function actually does three things: 1. Check whether we've already added the object to our packing list. 2. Check whether the object meets our criteria for adding. 3. Actually add the object to our packing list. It's a little hard to see these three phases, because they happen linearly in the rather long function. Instead, this patch breaks them up into three separate helper functions. The result is a little easier to follow, though it unfortunately suffers from some optimization interdependencies between the stages (e.g., during step 3 we use the packing list index from step 1 and the packfile information from step 2). More importantly, though, the various parts can be composed differently, as they will be in the next patch. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-12-30 12:19:22 -08:00
Vicent Marti	fff42755ef	pack-bitmap: add support for bitmap indexes A bitmap index is a `.bitmap` file that can be found inside `$GIT_DIR/objects/pack/`, next to its corresponding packfile, and contains precalculated reachability information for selected commits. The full specification of the format for these bitmap indexes can be found in `Documentation/technical/bitmap-format.txt`. For a given commit SHA1, if it happens to be available in the bitmap index, its bitmap will represent every single object that is reachable from the commit itself. The nth bit in the bitmap is the nth object in the packfile; if it's set to 1, the object is reachable. By using the bitmaps available in the index, this commit implements several new functions: - `prepare_bitmap_git` - `prepare_bitmap_walk` - `traverse_bitmap_commit_list` - `reuse_partial_packfile_from_bitmap` The `prepare_bitmap_walk` function tries to build a bitmap of all the objects that can be reached from the commit roots of a given `rev_info` struct by using the following algorithm: - If all the interesting commits for a revision walk are available in the index, the resulting reachability bitmap is the bitwise OR of all the individual bitmaps. - When the full set of WANTs is not available in the index, we perform a partial revision walk using the commits that don't have bitmaps as roots, and limiting the revision walk as soon as we reach a commit that has a corresponding bitmap. The earlier OR'ed bitmap with all the indexed commits can now be completed as this walk progresses, so the end result is the full reachability list. - For revision walks with a HAVEs set (a set of commits that are deemed uninteresting), first we perform the same method as for the WANTs, but using our HAVEs as roots, in order to obtain a full reachability bitmap of all the uninteresting commits. This bitmap then can be used to: a) limit the subsequent walk when building the WANTs bitmap b) finding the final set of interesting commits by performing an AND-NOT of the WANTs and the HAVEs. If `prepare_bitmap_walk` runs successfully, the resulting bitmap is stored and the equivalent of a `traverse_commit_list` call can be performed by using `traverse_bitmap_commit_list`; the bitmap version of this call yields the objects straight from the packfile index (without having to look them up or parse them) and hence is several orders of magnitude faster. As an extra optimization, when `prepare_bitmap_walk` succeeds, the `reuse_partial_packfile_from_bitmap` call can be attempted: it will find the amount of objects at the beginning of the on-disk packfile that can be reused as-is, and return an offset into the packfile. The source packfile can then be loaded and the bytes up to `offset` can be written directly to the result without having to consider the entires inside the packfile individually. If the `prepare_bitmap_walk` call fails (e.g. because no bitmap files are available), the `rev_info` struct is left untouched, and can be used to perform a manual rev-walk using `traverse_commit_list`. Hence, this new set of functions are a generic API that allows to perform the equivalent of git rev-list --objects [roots...] [^uninteresting...] for any set of commits, even if they don't have specific bitmaps generated for them. In further patches, we'll use this bitmap traversal optimization to speed up the `pack-objects` and `rev-list` commands. Signed-off-by: Vicent Marti <tanoku@gmail.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-12-30 12:19:22 -08:00
Vicent Marti	0d4455a3ab	documentation: add documentation for the bitmap format This is the technical documentation for the JGit-compatible Bitmap v1 on-disk format. Signed-off-by: Vicent Marti <tanoku@gmail.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-12-30 12:19:22 -08:00
Vicent Marti	e1273106f6	ewah: compressed bitmap implementation EWAH is a word-aligned compressed variant of a bitset (i.e. a data structure that acts as a 0-indexed boolean array for many entries). It uses a 64-bit run-length encoding (RLE) compression scheme, trading some compression for better processing speed. The goal of this word-aligned implementation is not to achieve the best compression, but rather to improve query processing time. As it stands right now, this EWAH implementation will always be more efficient storage-wise than its uncompressed alternative. EWAH arrays will be used as the on-disk format to store reachability bitmaps for all objects in a repository while keeping reasonable sizes, in the same way that JGit does. This EWAH implementation is a mostly straightforward port of the original `javaewah` library that JGit currently uses. The library is self-contained and has been embedded whole (4 files) inside the `ewah` folder to ease redistribution. The library is re-licensed under the GPLv2 with the permission of Daniel Lemire, the original author. The source code for the C version can be found on GitHub: https://github.com/vmg/libewok The original Java implementation can also be found on GitHub: https://github.com/lemire/javaewah [jc: stripped debug-only code per Peff's $gmane/239768] Signed-off-by: Vicent Marti <tanoku@gmail.com> Signed-off-by: Jeff King <peff@peff.net> Helped-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-12-30 12:17:20 -08:00
Junio C Hamano	8f29299136	merge-base --octopus: reduce the result from get_octopus_merge_bases() Scripts that use "merge-base --octopus" could do the reducing themselves, but most of them are expected to want to get the reduced results without having to do any work themselves. Tests are taken from a message by Василий Макаров <einmalfel@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> --- We might want to vet the existing callers of the underlying get_octopus_merge_bases() and find out if _all_ of them are doing anything extra (like deduping) because the machinery can return duplicate results. And if that is the case, then we may want to move the dedupling down the callchain instead of having it here.	2013-12-30 11:58:54 -08:00
Junio C Hamano	e2f5df4244	merge-base: separate "--independent" codepath into its own helper It piggybacks on an unrelated handle_octopus() function only because there are some similarities between the way they need to preprocess their input and output their result. There is nothing similar in the true logic between these two operations. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-12-30 11:37:49 -08:00
Johannes Schindelin	e228c1736f	Remove the line length limit for graft files Support for grafts predates Git's strbuf, and hence it is understandable that there was a hard-coded line length limit of 1023 characters (which was chosen a bit awkwardly, given that it is exactly one byte short of aligning with the 41 bytes occupied by a commit name and the following space or new-line character). While regular commit histories hardly win comprehensibility in general if they merge more than twenty-two branches in one go, it is not Git's business to limit grafts in such a way. In this particular developer's case, the use case that requires substantially longer graft lines to be supported is the visualization of the commits' order implied by their changes: commits are considered to have an implicit relationship iff exchanging them in an interactive rebase would result in merge conflicts. Thusly implied branches tend to be very shallow in general, and the resulting thicket of implied branches is usually very wide; It is actually quite common that most of the commits in a topic branch have not even one implied parent, so that a final merge commit has about as many implied parents as there are commits in said branch. [jc: squashed in tests by Jonathan] Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-12-27 16:46:25 -08:00
Junio C Hamano	53f3478d42	Merge git://git.bogomips.org/git-svn * git://git.bogomips.org/git-svn: git-svn: workaround for a bug in svn serf backend	2013-12-27 14:58:35 -08:00
Junio C Hamano	36ec9e2173	Merge branch 'fc/remote-helper-fixes' * fc/remote-helper-fixes: remote-hg: test 'shared_path' in a moved clone remote-hg: add tests for special filenames remote-hg: fix 'shared path' path remote-helpers: add extra safety checks remote-hg: avoid buggy strftime()	2013-12-27 14:58:25 -08:00
Junio C Hamano	97663a1e97	Merge branch 'js/gnome-keyring' Style fix. * js/gnome-keyring: contrib/git-credential-gnome-keyring.c: small stylistic cleanups	2013-12-27 14:58:23 -08:00
Junio C Hamano	53b2a5ff30	Merge branch 'jk/name-pack-after-byte-representation' Two packfiles that contain the same set of objects have traditionally been named identically, but that made repacking a repository that is already fully packed without any cruft with a different packing parameter cumbersome. Update the convention to name the packfile after the bytestream representation of the data, not after the set of objects in it. * jk/name-pack-after-byte-representation: pack-objects doc: treat output filename as opaque pack-objects: name pack files after trailer hash sha1write: make buffer const-correct	2013-12-27 14:58:19 -08:00
Junio C Hamano	73b063130b	Merge branch 'tg/diff-no-index-refactor' "git diff ../else/where/A ../else/where/B" when ../else/where is clearly outside the repository, and "git diff --no-index A B", do not have to look at the index at all, but we used to read the index unconditionally. * tg/diff-no-index-refactor: diff: avoid some nesting diff: add test for --no-index executed outside repo diff: don't read index when --no-index is given diff: move no-index detection to builtin/diff.c	2013-12-27 14:58:17 -08:00
Junio C Hamano	6904f9aa5b	Merge branch 'zk/difftool-counts' Show the total number of paths and the number of paths shown so far when "git difftool" prompts to launch an external diff tool, which would give users some sense of progress. * zk/difftool-counts: diff.c: fix some recent whitespace style violations difftool: display the number of files in the diff queue in the prompt	2013-12-27 14:58:13 -08:00
Junio C Hamano	604ada435b	Merge branch 'jk/cat-file-regression-fix' "git cat-file --batch=", an admittedly useless command, did not behave very well. * jk/cat-file-regression-fix: cat-file: handle --batch format with missing type/size cat-file: pass expand_data to print_object_or_die	2013-12-27 14:58:11 -08:00
Junio C Hamano	2b0a564e02	Merge branch 'jk/pull-rebase-using-fork-point' * jk/pull-rebase-using-fork-point: rebase: use reflog to find common base with upstream pull: use merge-base --fork-point when appropriate	2013-12-27 14:58:08 -08:00
Junio C Hamano	e9ecee0423	Merge branch 'jk/rev-parse-double-dashes' "git rev-parse <revs> -- <paths>" did not implement the usual disambiguation rules the commands in the "git log" family used in the same way. * jk/rev-parse-double-dashes: rev-parse: be more careful with munging arguments rev-parse: correctly diagnose revision errors before "--"	2013-12-27 14:58:01 -08:00
Junio C Hamano	7cdebd8a20	Merge branch 'jc/push-refmap' Make "git push origin master" update the same ref that would be updated by our 'master' when "git push origin" (no refspecs) is run while the 'master' branch is checked out, which makes "git push" more symmetric to "git fetch" and more usable for the triangular workflow. * jc/push-refmap: push: also use "upstream" mapping when pushing a single ref push: use remote.$name.push as a refmap builtin/push.c: use strbuf instead of manual allocation	2013-12-27 14:57:50 -08:00
Roman Kagan	2394e94e83	git-svn: workaround for a bug in svn serf backend Subversion serf backend in versions 1.8.5 and below has a bug() that the function creating the descriptor of a file change -- add_file() -- doesn't make a copy of its third argument when storing it on the returned descriptor. As a result, by the time this field is used (in transactions of file copying or renaming) it may well be released, and the memory reused. One of its possible manifestations is the svn assertion triggering on an invalid path, with a message svn_fspath__skip_ancestor: Assertion `svn_fspath__is_canonical(child_fspath)' failed. This patch works around this bug, by storing the value to be passed as the third argument to add_file() in a local variable with the same scope as the file change descriptor, making sure their lifetime is the same. [ew: fixed in Subversion r1553376 as noted by Jonathan Nieder] Cc: Benjamin Pabst <benjamin.pabst85@gmail.com> Signed-off-by: Eric Wong <normalperson@yhbt.net> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Roman Kagan <rkagan@mail.ru>	2013-12-27 20:22:19 +00:00
Nguyễn Thái Ngọc Duy	8785b7654b	commit.c: make "tree" a const pointer in commit_tree*() Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-12-26 11:55:18 -08:00
Jeff King	65ea9c3c3d	cat-file: provide %(deltabase) batch format It can be useful for debugging or analysis to see which objects are stored as delta bases on top of others. This information is available by running `git verify-pack`, but that is extremely expensive (and is harder than necessary to parse). Instead, let's make it available as a cat-file query format, which makes it fast and simple to get the bases for a subset of the objects. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-12-26 11:54:26 -08:00
Jeff King	5d642e7506	sha1_object_info_extended: provide delta base sha1s A caller of sha1_object_info_extended technically has enough information to determine the base sha1 from the results of the call. It knows the pack, offset, and delta type of the object, which is sufficient to find the base. However, the functions to do so are not publicly available, and the code itself is intimate enough with the pack details that it should be abstracted away. We could add a public helper to allow callers to query the delta base separately, but it is simpler and slightly more efficient to optionally grab it along with the rest of the object_info data. For cases where the object is not stored as a delta, we write the null sha1 into the query field. A careful caller could check "oi.whence == OI_PACKED && oi.u.packed.is_delta" before looking at the base sha1, but using the null sha1 provides a simple alternative (and gives a better sanity check for a non-careful caller than simply returning random bytes). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-12-26 11:53:32 -08:00
Jeff King	9af270e8c2	do not pretend sha1write returns errors The sha1write function returns an int, but it will always be "0". The failure-prone parts of the function happen in the "flush" callback, which cannot pass an error back to us. So we just end up calling die() during the flush. Let's just drop the return value altogether, as it only confuses callers into thinking that it might be useful. Only one call site actually checked the return value. We can drop that check, since it just led to a die() anyway. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-12-26 11:50:20 -08:00
Nguyễn Thái Ngọc Duy	64ed07cee0	add: don't complain when adding empty project root This behavior was added in `07d7bed` (add: don't complain when adding empty project root - 2009-04-28) then broken by `84b8b5d` (remove match_pathspec() in favor of match_pathspec_depth() - 2013-07-14). Reinstate it. Noticed-by: Thomas Ferris Nicolaisen <tfnico@gmail.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-12-26 10:46:26 -08:00
Antoine Pelisse	1f7feb7753	remote-hg: test 'shared_path' in a moved clone Since `e71d1378` (remote-hg: fix 'shared path' path, 2013-12-07), Mercurial 'shared_path' file is correctly updated whenever a clone is moved. Make sure it keeps working, especially as this is depending on a private Mercurial file. Signed-off-by: Antoine Pelisse <apelisse@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-12-26 10:43:56 -08:00
brian m. carlson	5e1361ccdb	log: properly handle decorations with chained tags git log did not correctly handle decorations when a tag object referenced another tag object that was no longer a ref, such as when the second tag was deleted. The commit would not be decorated correctly because parse_object had not been called on the second tag and therefore its tagged field had not been filled in, resulting in none of the tags being associated with the relevant commit. Call parse_object to fill in this field if it is absent so that the chain of tags can be dereferenced and the commit can be properly decorated. Include tests as well to prevent future regressions. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-12-20 14:37:03 -08:00
Nguyễn Thái Ngọc Duy	82246b765b	daemon: be strict at parsing parameters --[no-]informative-errors Use strcmp() instead of starts_with()/!prefixcmp() to stop accepting --informative-errors-just-a-little Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-12-20 14:05:07 -08:00
Samuel Bronson	6d8940b562	diff: add diff.orderfile configuration variable diff.orderfile acts as a default for the -O command line option. [sb: split up aw's original patch; rework tests and docs, treat option as pathname] Signed-off-by: Anders Waldenborg <anders@0x63.nu> Signed-off-by: Samuel Bronson <naesten@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-12-18 16:39:00 -08:00
Samuel Bronson	a21bae33d9	diff: let "git diff -O" read orderfile from any file and fail properly The -O flag really shouldn't silently fail to do anything when given a path that it can't read from. However, it should be able to read from un-mmappable files, such as: * pipes/fifos * /dev/null: It's a character device (at least on Linux) * ANY empty file: Quoting Linux mmap(2), "SUSv3 specifies that mmap() should fail if length is 0. However, in kernels before 2.6.12, mmap() succeeded in this case: no mapping was created and the call returned addr. Since kernel 2.6.12, mmap() fails with the error EINVAL for this case." We especially want "-O/dev/null" to work, since we will be documenting it as the way to cancel "diff.orderfile" when we add that. (Note: "-O/dev/null" did have the right effect, since the existing error handling essentially worked out to "silently ignore the orderfile". But this was probably more coincidence than anything else.) So, lets toss all of that logic to get the file mmapped and just use strbuf_read_file() instead, which gives us decent error handling practically for free. Signed-off-by: Samuel Bronson <naesten@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-12-18 16:29:05 -08:00
Samuel Bronson	b527773092	t4056: add new tests for "git diff -O" Adapted from $gmane/236427 by Anders Waldenborg, "diff: Add diff.orderfile configuration variable". Signed-off-by: Anders Waldenborg <anders@0x63.nu> Signed-off-by: Samuel Bronson <naesten@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-12-18 16:26:12 -08:00
Jeff King	4454e9cb59	builtin/prune.c: use strbuf to avoid having to worry about PATH_MAX While at it, rename prune_tmp_object(), which used to be a helper to remove temporary files that were created to become loose object files, to prune_tmp_file(), as the function is also used to remove any random cruft whose name begins with tmp_ directly in .git/object or .git/object/pack directories these days. Noticed-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-12-18 15:53:56 -08:00
Junio C Hamano	491a8dec44	get_max_fd_limit(): fall back to OPEN_MAX upon getrlimit/sysconf failure On broken systems where RLIMIT_NOFILE is visible by the compliers but underlying getrlimit() system call does not behave, we used to simply die() when we are trying to decide how many file descriptors to allocate for keeping packfiles open. Instead, allow the fallback codepath to take over when we get such a failure from getrlimit(). The same issue exists with _SC_OPEN_MAX and sysconf(); restructure the code in a similar way to prepare for a broken sysconf() as well. Noticed-by: Joey Hess <joey@kitenet.net> Helped-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-12-18 14:59:43 -08:00
Roberto Tyley	615b8f1a8d	docs: add filter-branch notes on The BFG The BFG is a tool specifically designed for the task of removing unwanted data from Git repository history - a common use-case for which git-filter-branch has been the traditional workhorse. It's beneficial to let users know that filter-branch has an alternative here: * speed : The BFG is 10-50x faster http://rtyley.github.io/bfg-repo-cleaner/#speed * complexity of configuration : filter-branch is a very flexible tool, but demands very careful usage in order to get the desired results http://rtyley.github.io/bfg-repo-cleaner/#examples Obviously, filter-branch has it's advantages too - it permits very complex rewrites, and doesn't require a JVM - but for the common use-case of deleting unwanted data, it's helpful to users to be aware that an alternative exists. The BFG was released under the GPL in February 2013, and has since seen widespread production use (The Guardian, RedHat, Google, UK Government Digital Service), been tested against large repos (~300K commits, ~5GB packfiles) and received significant positive feedback from users: http://rtyley.github.io/bfg-repo-cleaner/#feedback Signed-off-by: Roberto Tyley <roberto.tyley@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-12-18 10:41:41 -08:00
Junio C Hamano	7794a680e6	Sync with 1.8.5.2 * maint: Git 1.8.5.2 cmd_repack(): remove redundant local variable "nr_packs"	2013-12-17 14:12:17 -08:00
Junio C Hamano	b10cd577d8	Update draft release notes to 1.9 Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-12-17 14:05:50 -08:00
Junio C Hamano	173473c287	Merge branch 'kn/gitweb-extra-branch-refs' Allow gitweb to be configured to show refs out of refs/heads/ as if they were branches. * kn/gitweb-extra-branch-refs: gitweb: Denote non-heads, non-remotes branches gitweb: Add a feature for adding more branch refs gitweb: Return 1 on validation success instead of passed input gitweb: Move check-ref-format code into separate function	2013-12-17 12:03:33 -08:00
Junio C Hamano	1945e8ac85	Merge branch 'tb/clone-ssh-with-colon-for-port' Be more careful when parsing remote repository URL given in the scp-style host:path notation. * tb/clone-ssh-with-colon-for-port: git_connect(): use common return point connect.c: refactor url parsing git_connect(): refactor the port handling for ssh git fetch: support host:/~repo t5500: add test cases for diag-url git fetch-pack: add --diag-url git_connect: factor out discovery of the protocol and its parts git_connect: remove artificial limit of a remote command t5601: add tests for ssh t5601: remove clear_ssh, refactor setup_ssh_wrapper	2013-12-17 12:03:32 -08:00
Junio C Hamano	88cb2f96ac	Merge branch 'nd/transport-positive-depth-only' "git fetch --depth=0" was a no-op, and was silently ignored. Diagnose it as an error. * nd/transport-positive-depth-only: clone,fetch: catch non positive --depth option value	2013-12-17 12:03:29 -08:00
Junio C Hamano	ad70448576	Merge branch 'cc/starts-n-ends-with' Remove a few duplicate implementations of prefix/suffix comparison functions, and rename them to starts_with and ends_with. * cc/starts-n-ends-with: replace {pre,suf}fixcmp() with {starts,ends}_with() strbuf: introduce starts_with() and ends_with() builtin/remote: remove postfixcmp() and use suffixcmp() instead environment: normalize use of prefixcmp() by removing " != 0"	2013-12-17 12:02:44 -08:00
Junio C Hamano	14a9c5f261	Merge branch 'jl/commit-v-strip-marker' "git commit -v" appends the patch to the log message before editing, and then removes the patch when the editor returned control. However, the patch was not stripped correctly when the first modified path was a submodule. * jl/commit-v-strip-marker: commit -v: strip diffs and submodule shortlogs from the commit message	2013-12-17 11:47:18 -08:00
Junio C Hamano	433a30d0ba	Merge branch 'tr/send-email-ssl' SSL-related options were not passed correctly to underlying socket layer in "git send-email". * tr/send-email-ssl: send-email: set SSL options through IO::Socket::SSL::set_client_defaults send-email: --smtp-ssl-cert-path takes an argument send-email: pass Debug to Net::SMTP::SSL::new	2013-12-17 11:47:12 -08:00
Junio C Hamano	7dc8a65c86	Merge branch 'nd/gettext-vsnprintf' * nd/gettext-vsnprintf: gettext.c: detect the vsnprintf bug at runtime	2013-12-17 11:47:10 -08:00

... 11 12 13 14 15 ...