mirrors/git

mirror of https://github.com/git/git.git synced 2024-10-31 14:27:54 +01:00

Author	SHA1	Message	Date
Stephen P. Smith	038a878810	Add 'human' date format documentation Display date and time information in a format similar to how people write dates in other contexts. If the year isn't specified, then the reader infers that the date given is in the current year. By not displaying the redundant information, the reader concentrates on the information that is different. The patch reports relative dates based on information inferred from the date on the machine running the git command at the time the command is executed. While the format is more useful to humans by dropping inferred information, there is nothing that makes it actually human. If the 'relative' date format wasn't already implemented then using 'relative' would have been appropriate. Signed-off-by: Stephen P. Smith <ischis2@cox.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-01-22 14:16:17 -08:00
Stephen P. Smith	2fd7c22992	Replace the proposed 'auto' mode with 'auto:' In addition to adding the 'human' format, the patch added the auto keyword which could be used in the config file as an alternate way to specify the human format. Removing 'auto' cleans up the 'human' format interface. Added the ability to specify mode 'foo' if the pager is being used by using auto:foo syntax. Therefore, 'auto:human' date mode defaults to human if we're using the pager. So you can do git config --add log.date auto:human and your "git log" commands will show the human-legible format unless you're scripting things. Signed-off-by: Stephen P. Smith <ischis2@cox.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-01-22 14:16:17 -08:00
Jeff King	7b95849be4	attr: do not mark queried macros as unset Since `60a12722ac` (attr: remove maybe-real, maybe-macro from git_attr, 2017-01-27), we will always mark an attribute macro (e.g., "binary") that is specifically queried for as "unspecified", even though listing _all_ attributes would display it as set. E.g.: $ echo "* binary" >.gitattributes $ git check-attr -a file file: binary: set file: diff: unset file: merge: unset file: text: unset $ git check-attr binary file file: binary: unspecified The problem stems from an incorrect conversion of the optimization from `06a604e670` (attr: avoid heavy work when we know the specified attr is not defined, 2014-12-28). There we tried in collect_some_attrs() to avoid even looking at the attr_stack when the user has asked for "foo" and we know that "foo" did not ever appear in any .gitattributes file. It used a flag "maybe_real" in each attribute struct, where "real" meant that the attribute appeared in an actual file (we have to make this distinction because we also create an attribute struct for any names that are being queried). But as explained in that commit message, the meaning of "real" was tangled with some special cases around macros. When `60a12722ac` later refactored the macro code, it dropped maybe_real entirely. This missed the fact that "maybe_real" could be unset for two reasons: because of a macro, or because it was never found during parsing. This had two results: - the optimization in collect_some_attrs() ceased doing anything meaningful, since it no longer kept track of "was it found during parsing" - worse, it actually kicked in when the caller _did_ ask about a macro by name, causing us to mark it as unspecified It should be possible to salvage this optimization, but let's start with just removing the remnants. It hasn't been doing anything (except creating bugs) since `60a12722ac`, and nobody seems to have noticed the performance regression. It's more important to fix the correctness problem clearly first. I've added two tests here. The second one actually shows off the bug. The test of "check-attr -a" is not strictly necessary, but we currently do not test attribute macros much, and the builtin "binary" not at all. So this increases our general test coverage, as well as making sure we didn't mess up this related case. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-01-22 13:48:15 -08:00
Johannes Schindelin	d609615f48	tests: explicitly use `test-tool.exe` on Windows In `8abfdf44c8` (tests: explicitly use `git.exe` on Windows, 2018-11-14), we made sure to use the `.exe` file extension when using an absolute path to `git.exe`, to avoid getting confused with a file or directory in the same place that lacks said file extension. For the same reason, we need to handle test-tool.exe the same way. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-01-22 12:35:59 -08:00
SZEDER Gábor	5af7417bd8	commit-graph: rename "large edges" to "extra edges" The optional 'Large Edge List' chunk of the commit graph file stores parent information for commits with more than two parents, and the names of most of the macros, variables, struct fields, and functions related to this chunk contain the term "large edges", e.g. write_graph_chunk_large_edges(). However, it's not a really great term, as the edges to the second and subsequent parents stored in this chunk are not any larger than the edges to the first and second parents stored in the "main" 'Commit Data' chunk. It's the number of edges, IOW number of parents, that is larger compared to non-merge and "regular" two-parent merge commits. And indeed, two functions in 'commit-graph.c' have a local variable called 'num_extra_edges' that refer to the same thing, and this "extra edges" term is much better at describing these edges. So let's rename all these references to "large edges" in macro, variable, function, etc. names to "extra edges". There is a GRAPH_OCTOPUS_EDGES_NEEDED macro as well; for the sake of consistency rename it to GRAPH_EXTRA_EDGES_NEEDED. We can do so safely without causing any incompatibility issues, because the term "large edges" doesn't come up in the file format itself in any form (the chunk's magic is {'E', 'D', 'G', 'E'}, there is no 'L' in there), but only in the specification text. The string "large edges", however, does come up in the output of 'git commit-graph read' and in tests looking at its input, but that command is explicitly documented as debugging aid, so we can change its output and the affected tests safely. Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-01-22 11:33:46 -08:00
Ævar Arnfjörð Bjarmason	d7574c95bb	commit-graph write: use pack order when finding commits Slightly optimize the "commit-graph write" step by using FOR_EACH_OBJECT_PACK_ORDER with for_each_object_in_pack(). See commit [1] and [2] for the facility and a similar optimization for "cat-file". On Linux it is around 5% slower to run: echo 1 >/proc/sys/vm/drop_caches && cat .git/objects/pack/* >/dev/null && git cat-file --batch-all-objects --batch-check --unordered Than the same thing with the "cat" omitted. This is as expected, since we're iterating in pack order and the "cat" is extra work. Before this change the opposite was true of "commit-graph write". We were 6% faster if we first ran "cat" to efficiently populate the FS cache for our sole big pack on linux.git, than if we had populated it via for_each_object_in_pack(). Now we're 3% faster without the "cat" instead. My tests were done on an unloaded Linux 3.10 system with 10 runs for each. Derrick Stolee did his own tests on Windows[3] showing a 2% improvement with a high degree of accuracy. 1. `736eb88fdc` ("for_each_packed_object: support iterating in pack-order", 2018-08-10) 2. `0750bb5b51` ("cat-file: support "unordered" output for --batch-all-objects", 2018-08-10) 3. https://public-inbox.org/git/f71fa868-25e8-a9c9-46a6-611b987f1a8f@gmail.com/ Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-01-22 11:32:56 -08:00
David Turner	d1dd94b308	Do not print 'dangling' for cat-file in case of ambiguity The return values -1 and -2 from get_oid could mean two different things, depending on whether they were from an enum returned by get_tree_entry_follow_symlinks, or from a different code path. This caused 'dangling' to be printed from a git cat-file in the case of an ambiguous (-2) result. Unify the results of get_oid* and get_tree_entry_follow_symlinks to be one common type, with unambiguous values. Signed-off-by: David Turner <novalis@novalis.org> Reported-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-01-18 15:22:02 -08:00
Junio C Hamano	16a465bc01	Third batch after 2.20 Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-01-18 13:56:54 -08:00
Junio C Hamano	5104f8f1ac	Merge branch 'js/gc-repack-close-before-remove' "git gc" and "git repack" did not close the open packfiles that they found unneeded before removing them, which didn't work on a platform incapable of removing an open file. This has been corrected. * js/gc-repack-close-before-remove: gc/repack: release packs when needed	2019-01-18 13:49:57 -08:00
Junio C Hamano	eab7584e37	Merge branch 'en/show-ref-doc-fix' Doc update. * en/show-ref-doc-fix: git-show-ref.txt: fix order of flags	2019-01-18 13:49:57 -08:00
Junio C Hamano	55574bd04a	Merge branch 'ot/ref-filter-object-info' The "--format=<placeholder>" option of for-each-ref, branch and tag learned to show a few more traits of objects that can be learned by the object_info API. * ot/ref-filter-object-info: ref-filter: give uintmax_t to format with %PRIuMAX ref-filter: add docs for new options ref-filter: add tests for deltabase ref-filter: add deltabase option ref-filter: add tests for objectsize:disk ref-filter: add check for negative file size ref-filter: add objectsize:disk option	2019-01-18 13:49:56 -08:00
Junio C Hamano	3fe47ff444	Merge branch 'sg/stress-test' Flaky tests can now be repeatedly run under load with the "--stress" option. * sg/stress-test: test-lib: add the '--stress' option to run a test repeatedly under load test-lib-functions: introduce the 'test_set_port' helper function test-lib: set $TRASH_DIRECTORY earlier test-lib: consolidate naming of test-results paths test-lib: parse command line options earlier test-lib: parse options in a for loop to keep $@ intact test-lib: extract Bash version check for '-x' tracing test-lib: translate SIGTERM and SIGHUP to an exit	2019-01-18 13:49:56 -08:00
Junio C Hamano	2c0a645d9e	Merge branch 'rs/sha1-file-close-mapped-file-on-error' Code clean-up. * rs/sha1-file-close-mapped-file-on-error: sha1-file: close fd of empty file in map_sha1_file_1()	2019-01-18 13:49:56 -08:00
Junio C Hamano	eb8638abec	Merge branch 'rs/loose-object-cache-perffix' The loose object cache used to optimize existence look-up has been updated. * rs/loose-object-cache-perffix: object-store: retire odb_load_loose_cache() object-store: use one oid_array per subdirectory for loose cache object-store: factor out odb_clear_loose_cache() object-store: factor out odb_loose_cache()	2019-01-18 13:49:56 -08:00
Junio C Hamano	702bbfef3c	Merge branch 'po/git-p4-wo-login' "git p4" update. * po/git-p4-wo-login: git-p4: fix problem when p4 login is not necessary	2019-01-18 13:49:56 -08:00
Junio C Hamano	41db137234	Merge branch 'mm/multimail-1.5' Update "git multimail" from the upstream. * mm/multimail-1.5: git-multimail: update to release 1.5.0	2019-01-18 13:49:55 -08:00
Junio C Hamano	9462ac7211	Merge branch 'tg/t5570-drop-racy-test' An inherently racy test that caused intermittent failures has been removed. * tg/t5570-drop-racy-test: Revert "t/lib-git-daemon: record daemon log" t5570: drop racy test	2019-01-18 13:49:55 -08:00
Junio C Hamano	74ae0652c4	Merge branch 'jk/dev-build-format-security' Earlier we added "-Wformat-security" to developer builds, assuming that "-Wall" (which includes "-Wformat" which in turn is required to use "-Wformat-security") is always in effect. This is not true when config.mak.autogen is in use, unfortunately. This has been fixed by unconditionally adding "-Wall" to developer builds. * jk/dev-build-format-security: config.mak.dev: add -Wall, primarily for -Wformat, to help autoconf users	2019-01-18 13:49:55 -08:00
Junio C Hamano	77fbd96aeb	Merge branch 'so/cherry-pick-always-allow-m1' "git cherry-pick -m1" was forbidden when picking a non-merge commit, even though there _is_ parent number 1 for such a commit. This was done to avoid mistakes back when "cherry-pick" was about picking a single commit, but is no longer useful with "cherry-pick" that can pick a range of commits. Now the "-m$num" option is allowed when picking any commit, as long as $num names an existing parent of the commit. Technically this is a backward incompatible change; hopefully nobody is relying on the error-checking behaviour. * so/cherry-pick-always-allow-m1: t3506: validate '-m 1 -ff' is now accepted for non-merge commits t3502: validate '-m 1' argument is now accepted for non-merge commits cherry-pick: do not error on non-merge commits when '-m 1' is specified t3510: stop using '-m 1' to force failure mid-sequence of cherry-picks	2019-01-18 13:49:54 -08:00
Junio C Hamano	726f89c2dd	Merge branch 'nd/worktree-remove-with-uninitialized-submodules' "git worktree remove" and "git worktree move" refused to work when there is a submodule involved. This has been loosened to ignore uninitialized submodules. * nd/worktree-remove-with-uninitialized-submodules: worktree: allow to (re)move worktrees with uninitialized submodules	2019-01-18 13:49:54 -08:00
Junio C Hamano	bb20dbbc20	Merge branch 'sg/test-bash-version-fix' The test suite tried to see if it is run under bash, but the check itself failed under some other implementations of shell (notably under NetBSD). This has been corrected. * sg/test-bash-version-fix: test-lib: check Bash version for '-x' without using shell arrays	2019-01-18 13:49:54 -08:00
Junio C Hamano	9f2eba2b90	Merge branch 'rb/hpe' Portability updates for the HPE NonStop platform. * rb/hpe: compat/regex/regcomp.c: define intptr_t and uintptr_t on NonStop git-compat-util.h: add FLOSS headers for HPE NonStop config.mak.uname: support for modern HPE NonStop config. transport-helper: drop read/write errno checks transport-helper: use xread instead of read	2019-01-18 13:49:54 -08:00
Junio C Hamano	e805dc1892	Merge branch 'ed/simplify-setup-git-dir' Code simplification. * ed/simplify-setup-git-dir: Simplify handling of setup_git_directory_gently() failure cases.	2019-01-18 13:49:54 -08:00
Junio C Hamano	b84e297753	Merge branch 'cy/zsh-completion-SP-in-path' With zsh, "git cmd path<TAB>" was completed to "git cmd path name" when the completed path has a special character like SP in it, without any attempt to keep "path name" a single filename. This has been fixed to complete it to "git cmd path\ name" just like Bash completion does. * cy/zsh-completion-SP-in-path: completion: treat results of git ls-tree as file paths zsh: complete unquoted paths with spaces correctly	2019-01-18 13:49:54 -08:00
Junio C Hamano	c433857894	Merge branch 'cy/completion-typofix' Typofix. * cy/completion-typofix: completion: fix typo in git-completion.bash	2019-01-18 13:49:54 -08:00
Junio C Hamano	81bf66b760	Merge branch 'ew/ban-strncat' The "strncat()" function is now among the banned functions. * ew/ban-strncat: banned.h: mark strncat() as banned	2019-01-18 13:49:53 -08:00
Junio C Hamano	d01a3faa50	Merge branch 'ds/commit-graph-assert-missing-parents' Tightening error checking in commit-graph writer. * ds/commit-graph-assert-missing-parents: commit-graph: writing missing parents is a BUG	2019-01-18 13:49:53 -08:00
Junio C Hamano	540ee40e11	Merge branch 'es/doc-worktree-guessremote-config' Doc clarification. * es/doc-worktree-guessremote-config: doc/config: do a better job of introducing 'worktree.guessRemote'	2019-01-18 13:49:53 -08:00
Junio C Hamano	3942920966	Merge branch 'sb/submodule-unset-core-worktree-when-worktree-is-lost' The core.worktree setting in a submodule repository should not be pointing at a directory when the submodule loses its working tree (e.g. getting deinit'ed), but the code did not properly maintain this invariant. * sb/submodule-unset-core-worktree-when-worktree-is-lost: submodule deinit: unset core.worktree submodule--helper: fix BUG message in ensure_core_worktree submodule: unset core.worktree if no working tree is present submodule update: add regression test with old style setups	2019-01-18 13:49:53 -08:00
Junio C Hamano	1ed943e9ae	Merge branch 'ma/asciidoctor' Some of the documentation pages formatted incorrectly with Asciidoctor, which have been fixed. * ma/asciidoctor: git-status.txt: render tables correctly under Asciidoctor Documentation: do not nest open blocks git-column.txt: fix section header	2019-01-18 13:49:53 -08:00
Junio C Hamano	ec27a94013	Merge branch 'jn/stripspace-wo-repository' "git stripspace" should be usable outside a git repository, but under the "-s" or "-c" mode, it didn't. * jn/stripspace-wo-repository: stripspace: allow -s/-c outside git repository	2019-01-18 13:49:53 -08:00
Junio C Hamano	4744d03a47	Merge branch 'sb/submodule-fetchjobs-default-to-one' "git submodule update" ought to use a single job unless asked, but by mistake used multiple jobs, which has been fixed. * sb/submodule-fetchjobs-default-to-one: submodule update: run at most one fetch job unless otherwise set	2019-01-18 13:49:52 -08:00
Junio C Hamano	9c51ad5853	Merge branch 'la/quiltimport-keep-non-patch' "git quiltimport" learned "--keep-non-patch" option. * la/quiltimport-keep-non-patch: git-quiltimport: add --keep-non-patch option	2019-01-18 13:49:52 -08:00
Junio C Hamano	3434569fc2	Merge branch 'nd/style-opening-brace' Code clean-up. * nd/style-opening-brace: style: the opening '{' of a function is in a separate line	2019-01-18 13:49:52 -08:00
Junio C Hamano	e07074d3f0	Merge branch 'ds/gc-doc-typofix' Typofix. * ds/gc-doc-typofix: git-gc.txt: fix typo about gc.writeCommitGraph	2019-01-18 13:49:52 -08:00
Johannes Schindelin	9e9da23c27	mingw: special-case arguments to `sh` The MSYS2 runtime does its best to emulate the command-line wildcard expansion and de-quoting which would be performed by the calling Unix shell on Unix systems. Those Unix shell quoting rules differ from the quoting rules applying to Windows' cmd and Powershell, making it a little awkward to quote command-line parameters properly when spawning other processes. In particular, git.exe passes arguments to subprocesses that are not intended to be interpreted as wildcards, and if they contain backslashes, those are not to be interpreted as escape characters, e.g. when passing Windows paths. Note: this is only a problem when calling MSYS2 executables, not when calling MINGW executables such as git.exe. However, we do call MSYS2 executables frequently, most notably when setting the use_shell flag in the child_process structure. There is no elegant way to determine whether the .exe file to be executed is an MSYS2 program or a MINGW one. But since the use case of passing a command line through the shell is so prevalent, we need to work around this issue at least when executing sh.exe. Let's introduce an ugly, hard-coded test whether argv[0] is "sh", and whether it refers to the MSYS2 Bash, to determine whether we need to quote the arguments differently than usual. That still does not fix the issue completely, but at least it is something. Incidentally, this also fixes the problem where `git clone \\server\repo` failed due to incorrect handling of the backslashes when handing the path to the git-upload-pack process. Further, we need to take care to quote not only whitespace and backslashes, but also curly brackets. As aliases frequently go through the MSYS2 Bash, and as aliases frequently get parameters such as HEAD@{yesterday}, this is really important. As an early version of this patch broke this, let's make sure that this does not regress by adding a test case for that. Helped-by: Kim Gybels <kgybels@infogroep.be> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-01-18 13:12:14 -08:00
Johannes Schindelin	5440df44c2	mingw (t5580): document bug when cloning from backslashed UNC paths Due to a quirk in Git's method to spawn git-upload-pack, there is a problem when passing paths with backslashes in them: Git will force the command-line through the shell, which has different quoting semantics in Git for Windows (being an MSYS2 program) than regular Win32 executables such as git.exe itself. The symptom is that the first of the two backslashes in UNC paths of the form \\myserver\folder\repository.git is stripped off. Document this bug by introducing a test case. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-01-18 13:12:14 -08:00
Jonathan Tan	e2f41a0a5a	ls-refs: filter refs using namespace-stripped name If a user fetches refs/heads/master from a repo with namespace "ns", the remote is expected to (1) not send the real refs/heads/master, and (2) send refs/namespaces/ns/refs/heads/master with the name refs/heads/master. (1) indeed happens now, but not (2) - Git only sends refs that have the user-given prefix, but it checks them against the full name of the ref (the one starting with refs/namespaces), and not the namespace-stripped one. This is demonstrated by the patch in the test. Currently, it results in "fatal: couldn't find remote ref refs/heads/master" despite both unnamespaced and namespaced master being present. With the code change, it produces the expected result. Check the ref prefixes against the namespace-stripped name. This bug was discovered through applying patches [1] that override protocol.version to 2 in repositories when running tests, allowing us to notice differences in behavior across different protocol versions. [1] https://public-inbox.org/git/cover.1547677183.git.jonathantanmy@google.com/ Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-01-18 12:48:41 -08:00
Linus Torvalds	acdd37769d	Add 'human' date format This adds --date=human, which skips the timezone if it matches the current time-zone, and doesn't print the whole date if that matches (ie skip printing year for dates that are "this year", but also skip the whole date itself if it's in the last few days and we can just say what weekday it was). For really recent dates (same day), use the relative date stamp, while for old dates (year doesn't match), don't bother with time and timezone. Also add 'auto' date mode, which defaults to human if we're using the pager. So you can do git config --add log.date auto and your "git log" commands will show the human-legible format unless you're scripting things. Note that this time format still shows the timezone for recent enough events (but not so recent that they show up as relative dates). You can combine it with the "-local" suffix to never show timezones for an even more simplified view. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Stephen P. Smith <ischis2@cox.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-01-18 10:31:23 -08:00
Johannes Schindelin	21853626ea	built-in rebase: call `git am` directly While the scripted `git rebase` still has to rely on the `git-rebase--am.sh` script to implement the glue between the `rebase` and the `am` commands, we can go a more direct route in the built-in rebase and avoid using a shell script altogether. This patch represents a straight-forward port of `git-rebase--am.sh` to C, along with the glue code to call it directly from within `builtin/rebase.c`. This reduces the chances of Git for Windows running into trouble due to problems with the POSIX emulation layer (known as "MSYS2 runtime", itself a derivative of the Cygwin runtime): when no shell script is called, the POSIX emulation layer is avoided altogether. Note: we pass an empty action to `reset_head()` here when moving back to the original branch, as no other action is applicable, really. This parameter is used to initialize `unpack_trees()`' messages. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-01-18 10:11:45 -08:00
Johannes Schindelin	414f336069	rebase: teach `reset_head()` to optionally skip the worktree This is what the legacy (scripted) rebase does in `move_to_original_branch`, and we will need this functionality in the next commit. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-01-18 10:11:45 -08:00
Johannes Schindelin	5b2237a876	rebase: avoid double reflog entry when switching branches When switching a branch and updating said branch to a different revision, let's avoid a double entry in HEAD's reflog by first updating the branch and then adjusting the symbolic ref HEAD. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-01-18 10:11:45 -08:00
Johannes Schindelin	c5233708c5	rebase: move `reset_head()` into a better spot Over the next commits, we want to make use of it in `run_am()` (i.e. running the `--am` backend directly, without detouring to Unix shell script code) which in turn will be called from `run_specific_rebase()`. So let's move it before that latter function. This commit is best viewed using --color-moved. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-01-18 10:11:45 -08:00
Johannes Schindelin	d8727b3687	abspath_part_inside_repo: respect core.ignoreCase If the file system is case-insensitive, we really must be careful to ignore differences in case only. This fixes https://github.com/git-for-windows/git/issues/735 Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-01-18 09:53:06 -08:00
Luke Diamand	7a10946ab9	git-p4: handle update of moved/copied files when updating a shelve Perforce requires a complete list of files being operated on. If git is updating an existing shelved changelist, then any files which are moved or copied were not being added to this list. Signed-off-by: Luke Diamand <luke@diamand.org> Acked-by: Andrey Mazo <amazo@checkvideo.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-01-18 09:43:40 -08:00
Luke Diamand	7a10bb3a4c	git-p4: add failing test for shelved CL update involving move/copy Updating a shelved P4 changelist where one or more files have been moved or copied does not work. Add a test for this. The problem is that P4 requires a complete list of the files being changed, and move/copy only includes the _source_ in the case of updating a shelved changelist. This results in errors from Perforce such as: //depot/src - needs tofile //depot/dst Submit aborted -- fix problems then use 'p4 submit -c 1234' Signed-off-by: Luke Diamand <luke@diamand.org> Acked-by: Andrey Mazo <amazo@checkvideo.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-01-18 09:43:34 -08:00
Derrick Stolee	99dbbfa8dd	pack-objects: create GIT_TEST_PACK_SPARSE Create a test variable GIT_TEST_PACK_SPARSE to enable the sparse object walk algorithm by default during the test suite. Enabling this variable ensures coverage in many interesting cases, such as shallow clones, partial clones, and missing objects. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-01-17 13:44:44 -08:00
Derrick Stolee	3d036eb0d2	pack-objects: create pack.useSparse setting The '--sparse' flag in 'git pack-objects' changes the algorithm used to enumerate objects to one that is faster for individual users pushing new objects that change only a small cone of the working directory. The sparse algorithm is not recommended for a server, which likely sends new objects that appear across the entire working directory. Create a 'pack.useSparse' setting that enables this new algorithm. This allows 'git push' to use this algorithm without passing a '--sparse' flag all the way through four levels of run_command() calls. If the '--no-sparse' flag is set, then this config setting is overridden. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-01-17 13:44:43 -08:00
Derrick Stolee	d5d2e93577	revision: implement sparse algorithm When enumerating objects to place in a pack-file during 'git pack-objects --revs', we discover the "frontier" of commits that we care about and the boundary with commits we find uninteresting. From that point, we walk trees to discover which trees and blobs are uninteresting. Finally, we walk trees from the interesting commits to find the interesting objects that are placed in the pack. This commit introduces a new, "sparse" way to discover the uninteresting trees. We use the perspective of a single user trying to push their topic to a large repository. That user likely changed a very small fraction of the paths in their working directory, but we spend a lot of time walking all reachable trees. The way to switch the logic to work in this sparse way is to start caring about which paths introduce new trees. While it is not possible to generate a diff between the frontier boundary and all of the interesting commits, we can simulate that behavior by inspecting all of the root trees as a whole, then recursing down to the set of trees at each path. We already had taken the first step by passing an oidset to mark_trees_uninteresting_sparse(). We now create a dictionary whose keys are paths and values are oidsets. We consider the set of trees that appear at each path. While we inspect a tree, we add its subtrees to the oidsets corresponding to the tree entry's path. We also mark trees as UNINTERESTING if the tree we are parsing is UNINTERESTING. To actually improve the performance, we need to terminate our recursion. If the oidset contains only UNINTERESTING trees, then we do not continue the recursion. This avoids walking trees that are likely to not be reachable from interesting trees. If the oidset contains only interesting trees, then we will walk these trees in the final stage that collects the interesting objects to place in the pack. Thus, we only recurse if the oidset contains both interesting and UNINTERESTING trees. There are a few ways that this is not a universally better option. First, we can pack extra objects. If someone copies a subtree from one tree to another, the first tree will appear UNINTERESTING and we will not recurse to see that the subtree should also be UNINTERESTING. We will walk the new tree and see the subtree as a "new" object and add it to the pack. A test is modified to demonstrate this behavior and to verify that the new logic is being exercised. Second, we can have extra memory pressure. If instead of being a single user pushing a small topic we are a server sending new objects from across the entire working directory, then we will gain very little (the recursion will rarely terminate early) but will spend extra time maintaining the path-oidset dictionaries. Despite these potential drawbacks, the benefits of the algorithm are clear. By adding a counter to 'add_children_by_path' and 'mark_tree_contents_uninteresting', I measured the number of parsed trees for the two algorithms in a variety of repos. For git.git, I used the following input: v2.19.0 ^v2.19.0~10 Objects to pack: 550 Walked (old alg): 282 Walked (new alg): 130 For the Linux repo, I used the following input: v4.18 ^v4.18~10 Objects to pack: 518 Walked (old alg): 4,836 Walked (new alg): 188 The two repos above are rather "wide and flat" compared to other repos that I have used in the past. As a comparison, I tested an old topic branch in the Azure DevOps repo, which has a much deeper folder structure than the Linux repo. Objects to pack: 220 Walked (old alg): 22,804 Walked (new alg): 129 I used the number of walked trees as the main metric above because it is consistent across multiple runs. When I ran my tests, the performance of the pack-objects command with the same options could change the end-to-end time by 10x depending on the file system being warm. However, by running the same test repeatedly I could get more consistent timing results. The git.git and Linux tests were too fast overall (less than 0.5s) to measure an end-to-end difference. The Azure DevOps case was slow enough to see the time improve from 15s to 1s in the warm case. The cold case was 90s to 9s in my testing. These improvements will have even larger benefits in the super-large Windows repository. In our experiments, we see the "Enumerate objects" phase of pack-objects taking 60-80% of the end-to-end time of non-trivial pushes, taking longer than the network time to send the pack and the server time to verify the pack. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-01-17 13:44:42 -08:00
Derrick Stolee	4f6d26b167	list-objects: consume sparse tree walk When creating a pack-file using 'git pack-objects --revs' we provide a list of interesting and uninteresting commits. For example, a push operation would make the local topic branch be interesting and the known remote refs as uninteresting. We want to discover the set of new objects to send to the server as a thin pack. We walk these commits until we discover a frontier of commits such that every commit walk starting at interesting commits ends in a root commit or uninteresting commit. We then need to discover which non-commit objects are reachable from uninteresting commits. This commit walk is not changing during this series. The mark_edges_uninteresting() method in list-objects.c iterates on the commit list and does the following: * If the commit is UNINTERESTING, then mark its root tree and every object it can reach as UNINTERESTING. * If the commit is interesting, then mark the root tree of every UNINTERESTING parent (and all objects that tree can reach) as UNINTERESTING. At the very end, we repeat the process on every commit directly given to the revision walk from stdin. This helps ensure we properly cover shallow commits that otherwise were not included in the frontier. The logic to recursively follow trees is in the mark_tree_uninteresting() method in revision.c. The algorithm avoids duplicate work by not recursing into trees that are already marked UNINTERESTING. Add a new 'sparse' option to the mark_edges_uninteresting() method that performs this logic in a slightly different way. As we iterate over the commits, we add all of the root trees to an oidset. Then, call mark_trees_uninteresting_sparse() on that oidset. Note that we include interesting trees in this process. The current implementation of mark_trees_uninteresting_sparse() will walk the same trees as the old logic, but this will be replaced in a later change. Add a '--sparse' flag in 'git pack-objects' to call this new logic. Add a new test script t/t5322-pack-objects-sparse.sh that tests this option. The tests currently demonstrate that the resulting object list is the same as the old algorithm. This includes a case where both algorithms pack an object that is not needed by a remote due to limits on the explored set of trees. When the sparse algorithm is changed in a later commit, we will add a test that demonstrates a change of behavior in some cases. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-01-17 13:44:42 -08:00
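
The sparse tree walk described in d5d2e93577 ("revision: implement sparse algorithm") can be illustrated with a small Python sketch. This is not Git's C implementation: the in-memory `trees` mapping, the object names, and the `walked` list are assumptions made up for illustration. It only mirrors the steps in the commit message: group subtrees by path into sets, propagate UNINTERESTING from a parent tree to its entries, and recurse only when a set mixes interesting and UNINTERESTING trees.

```python
from collections import defaultdict


def mark_trees_uninteresting_sparse(trees, tree_oids, uninteresting, walked=None):
    """Toy model of the sparse walk (hypothetical data layout, not Git's API).

    trees:         oid -> {entry name: (kind, child oid)}, kind is "tree" or "blob"
    tree_oids:     the set of trees that appear at the current path
    uninteresting: set of oids flagged UNINTERESTING, updated in place
    walked:        list of parsed trees, kept only to show what was visited
    """
    if walked is None:
        walked = []

    has_uninteresting = any(oid in uninteresting for oid in tree_oids)
    has_interesting = any(oid not in uninteresting for oid in tree_oids)
    # Recurse only if the set mixes interesting and UNINTERESTING trees.
    if not (has_uninteresting and has_interesting):
        return walked

    # Dictionary whose keys are paths and whose values are sets of tree oids.
    by_path = defaultdict(set)
    for oid in sorted(tree_oids):
        walked.append(oid)
        parent_uninteresting = oid in uninteresting
        for name, (kind, child) in trees[oid].items():
            if parent_uninteresting:
                uninteresting.add(child)
            if kind == "tree":
                by_path[name].add(child)

    for name in sorted(by_path):
        mark_trees_uninteresting_sparse(trees, by_path[name], uninteresting, walked)
    return walked


# Hypothetical history: only src/main.c changed between the old and new root.
trees = {
    "root_old": {"docs": ("tree", "docs1"), "src": ("tree", "src_old"),
                 "README": ("blob", "readme1")},
    "root_new": {"docs": ("tree", "docs1"), "src": ("tree", "src_new"),
                 "README": ("blob", "readme1")},
    "docs1": {"guide.txt": ("blob", "guide1")},
    "src_old": {"main.c": ("blob", "main_old")},
    "src_new": {"main.c": ("blob", "main_new")},
}
uninteresting = {"root_old"}
walked = mark_trees_uninteresting_sparse(trees, {"root_old", "root_new"}, uninteresting)
print(walked)
print(sorted(uninteresting))
```

In this toy input the unchanged `docs` tree is never parsed, because the set at that path contains only UNINTERESTING trees, while the `src` path is walked because it mixes both kinds. That early termination is the saving the commit message measures as "Walked (new alg)".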
