The infrastructure to rewrite "git submodule" in C is being built
incrementally. Let's polish these early parts well enough and make
them graduate to 'next' and 'master', so that the more involved
follow-up can start cooking on a solid ground.
* sb/submodule-helper:
submodule: rewrite `module_clone` shell function in C
submodule: rewrite `module_name` shell function in C
submodule: rewrite `module_list` shell function in C
Some protocols (like git-remote-ext) can execute arbitrary
code found in the URL. The URLs that submodules use may come
from arbitrary sources (e.g., .gitmodules files in a remote
repository). Let's restrict submodules to fetching from a
known-good subset of protocols.
Note that we apply this restriction to all submodule
commands, whether the URL comes from .gitmodules or not.
This is more restrictive than we need to be; for example, in
the tests we run:
git submodule add ext::...
which should be trusted, as the URL comes directly from the
command line provided by the user. But doing it this way is
simpler, and makes it much less likely that we would miss a
case. And since such protocols should be an exception
(especially because nobody who clones from them will be able
to update the submodules!), it's not likely to inconvenience
anyone in practice.
Reported-by: Blake Burkhart <bburky@bburky.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This reimplements the helper function `module_clone` in shell
in C as `clone`. This functionality is needed for converting
`git submodule update` later on, which we want to add threading
to.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This implements the helper `name` in C instead of shell,
yielding a nice performance boost.
Before this patch, I measured a time (best out of three):
$ time ./t7400-submodule-basic.sh >/dev/null
real 0m11.066s
user 0m3.348s
sys 0m8.534s
With this patch applied I measured (also best out of three)
$ time ./t7400-submodule-basic.sh >/dev/null
real 0m10.063s
user 0m3.044s
sys 0m7.487s
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Most of the submodule operations work on a set of submodules.
Calculating and using this set is usually done via:
module_list "$@" | {
while read mode sha1 stage sm_path
do
# the actual operation
done
}
Currently the function `module_list` is implemented in the
git-submodule.sh as a shell script wrapping a perl script.
The rewrite is in C, such that it is faster and can later be
easily adapted when other functions are rewritten in C.
git-submodule.sh, similar to the builtin commands, will navigate
to the top-most directory of the repository and keep the
subdirectory as a variable. As the helper is called from
within the git-submodule.sh script, we are already navigated
to the root level, but the path arguments are still relative
to the subdirectory we were in when calling git-submodule.sh.
That's why there is a `--prefix` option pointing to an alternative
path which to anchor relative path arguments.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
Acked-by: Chris Packham <judge.packham@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we add a new submodule the path of the submodule is being
normalized. We fail to normalize multiple adjacent '/./', though.
Thus 'path/to/././submodule' will become 'path/to/./submodule' where
it should be 'path/to/submodule' instead.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Acked-by: Jens Lehmann <Jens.Lehmann@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
SysV-derived implementation of "echo" interprets some backslash
sequences as special instruction, e.g. "echo 'ab\c'" shows an
incomplete line with 'a' and 'b' on it. Avoid using it when showing
a path-like values in the script.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The construct is error-prone; "test" being built-in in most modern
shells, the reason to avoid "test <cond> && test <cond>" spawning
one extra process by using a single "test <cond> -a <cond>" no
longer exists.
Signed-off-by: Elia Pinto <gitter.spiros@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This reverts commit 4dce7d9b40,
which was originally done to help Windows but was almost
immediately reverted in msysGit, and the codebase kept this
unnecessary divergence for almost two years.
Signed-off-by: Stepan Kasal <kasal@ucw.cz>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This reverts commit 23d25e48f5, as it is
broken for users who haven't opted into the new feature of checking
out submodule.*.branch with update mode set to checkout.
Commit 322bb6e12f (add update 'none' flag to disable update of submodule
by default) added the '--checkout' option to "git submodule update" but
forgot to explicitly document it in synopsis, usage string and man page
(It is only mentioned implicitly in the man page). In 23d25e48 (submodule:
explicit local branch creation in module_clone) the synopsis of the man
page was updated, but the "OPTIONS" section of the man page and the usage
string of the git-submodule script still do not mention the '--checkout'
option.
Fix that by documenting this option in usage string and the "OPTIONS"
section of man page too. While at it group the update-mode options into
a single set in the usage string.
Reported-by: Matthijs Kooijman <matthijs@stdin.nl>
Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Make sure 'submodule update' modes that do not detach HEADs can
be used more pleasantly by checking out a concrete branch when
cloning them to prime the well.
* wk/submodule-on-branch:
Documentation: describe 'submodule update --remote' use case
submodule: explicit local branch creation in module_clone
submodule: document module_clone arguments in comments
submodule: make 'checkout' update_module mode more explicit
The previous code only checked out branches in cmd_add. This commit
moves the branch-checkout logic into module_clone, where it can be
shared by cmd_add and cmd_update. I also update the initial checkout
command to use 'reset' to preserve branches setup during module_clone.
With this change, folks cloning submodules for the first time via:
$ git submodule update ...
will get a local branch instead of a detached HEAD, unless they are
using the default checkout-mode updates. This is a change from the
previous situation where cmd_update always used checkout-mode logic
(regardless of the requested update mode) for updates that triggered
an initial clone, which always resulted in a detached HEAD.
This commit does not change the logic for updates after the initial
clone, which will continue to create detached HEADs for checkout-mode
updates, and integrate remote work with the local HEAD (detached or
not) in other modes.
The motivation for the change is that developers doing local work
inside the submodule are likely to select a non-checkout-mode for
updates so their local work is integrated with upstream work.
Developers who are not doing local submodule work stick with
checkout-mode updates so any apparently local work is blown away
during updates. For example, if upstream rolls back the remote branch
or gitlinked commit to an earlier version, the checkout-mode developer
wants their old submodule checkout to be rolled back as well, instead
of getting a no-op merge/rebase with the rolled-back reference.
By using the update mode to distinguish submodule developers from
black-box submodule consumers, we can setup local branches for the
developers who will want local branches, and stick with detached HEADs
for the developers that don't care.
Testing
=======
In t7406, just-cloned checkouts now update to the gitlinked hash with
'reset', to preserve the local branch for situations where we're not
on a detached HEAD.
I also added explicit tests to t7406 for HEAD attachement after
cloning updates, showing that it depends on their update mode:
* Checkout-mode updates get detached HEADs
* Everyone else gets a local branch, matching the configured
submodule.<name>.branch and defaulting to master.
The 'initial-setup' tag makes it easy to reset the superproject to a
known state, as several earlier tests commit to submodules and commit
the changed gitlinks to the superproject, but don't push the new
submodule commits to the upstream subprojects. This makes it
impossible to checkout the current super master, because it references
submodule commits that don't exist in the upstream subprojects. For a
specific example, see the tests that currently generate the
'two_new_submodule_commits' commits.
Documentation
=============
I updated the docs to describe the 'submodule update' modes in detail.
The old documentation did not distinguish between cloning and
non-cloning updates and lacked clarity on which operations would lead
to detached HEADs, and which would not. The new documentation
addresses these issues while updating the docs to reflect the changes
introduced by this commit's explicit local branch creation in
module_clone.
I also add '--checkout' to the usage summary and group the update-mode
options into a single set.
Signed-off-by: W. Trevor King <wking@tremily.us>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This avoids the current awkwardness of having either '' or 'checkout'
for checkout-mode updates, which makes testing for checkout-mode
updates (or non-checkout-mode updates) easier.
Signed-off-by: W. Trevor King <wking@tremily.us>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"submodule.*.update=checkout", when propagated from .gitmodules to
.git/config, turned into a "submodule.*.update=none", which did not
make much sense.
* fp/submodule-checkout-mode:
git-submodule.sh: 'checkout' is a valid update mode
'checkout' is documented as one of the valid values for the
'submodule.<name>.update' variable, and in a repository with the
variable set to 'checkout', "git submodule update" command does
update using the 'checkout' mode.
However, it has been an accident that the implementation works this
way; any unknown value would trigger the same codepath and update
using the 'checkout' mode.
Explicitly list 'checkout' as one of the known update modes, and
error out when an unknown update mode is used.
Teach the codepath that initializes the configuration variable from
an in-tree .gitmodules that 'checkout' is one of the valid values.
The code since ac1fbbda (submodule: do not copy unknown update mode
from .gitmodules, 2013-12-02) used to treat the value 'checkout' as
unknown and mapped it to 'none', which made little sense. With this
change, 'checkout' specified in .gitmodules will stay to be 'checkout'.
Signed-off-by: Francesco Pretto <ceztko@gmail.com>
Signed-off-by: Signed-off-by: Junio C Hamano <gitster@pobox.com>
A behavior change, but a worthwhile one: "git submodule foreach"
was treating its arguments as part of a single command to be
concatenated and passed to a shell, making writing buggy
scripts too easy.
This patch preserves the old "just pass it to the shell" behavior
when a single argument is passed to 'git submodule foreach' and
moves to a new "skip the shell and use the arguments passed
unmolested" behavior when more than one argument is passed.
The old behavior (always concatenating and passing to the shell)
was similar to the 'ssh' command, while the new behavior (switching
on the number of arguments) is what 'xterm -e' does.
May need more thought to make sure this change is advertised well
so that scripts that used multiple arguments but added their own
extra layer of quoting are not broken.
* ak/submodule-foreach-quoting:
submodule foreach: skip eval for more than one argument
When submodule.$name.update is given as hint from the upstream in
the .gitmodules file, we used to blindly copy it to .git/config,
unless there already is a value defined for the submodule.
However, there is no reason to expect that the update mode hinted by
the upstream is available in the version of Git the user is using,
and a really custom "!cmd" prepared by an upstream person running on
Linux may not even be available to a user on Windows. It is simply
irresponsible to copy the setting blindly and to attempt to use it
during a later "submodule update" without validating it first.
Just show the suggested value to the diagnostic output, and set the
value to 'none' in the configuration, if it is not one of the ones
that are known to be supported by this version of Git.
Helped-by: Jens Lehmann <Jens.Lehmann@web.de>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
cmd_update() in the submodule script tries to preserve the options given
on the command line in the "orig_flags" variable to pass them on into the
recursion when the '--recursive' option is given. But this isn't necessary
because all the variables set by the options will be seen in the recursion
too as that is achieved by executing "eval cmd_update".
The same has already been done for cmd_status() in e15bec0ec, so let's
clean up cmd_update() likewise. Also add a test to make sure that a
submodule name given on the command line is not passed into the recursion
(which was the goal of adding the orig_flags variable in 98dbe63db).
Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Several of the built shell commands invoke a bare "perl" to
perform some one-liners. This will use the first perl in the
PATH rather than the one specified by the user's SHELL_PATH.
We are not asking these perl invocations to do anything
exotic, so typically any old system perl will do; however,
in some cases the system perl may have unexpected behavior
(e.g., by handling line endings differently). We should err
on the side of using the perl the user pointed us to.
The downside of this is that on systems with a sane perl
setup, we no longer find the perl at runtime, but instead
point to a static perl (like /usr/bin/perl). That means we
will not handle somebody moving perl without rebuilding git,
whereas before we tracked it just fine. This is probably not
a big deal, though, as the built perl scripts already
suffered from this.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
'eval "$@"' creates an extra layer of shell interpretation, which is
probably not expected by a user who passes multiple arguments to git
submodule foreach:
$ git grep "'"
[searches for single quotes]
$ git submodule foreach git grep "'"
Entering '[submodule]'
/usr/lib/git-core/git-submodule: 1: eval: Syntax error: Unterminated quoted string
Stopping at '[submodule]'; script returned non-zero status.
To fix this, if the user passes more than one argument, execute "$@"
directly instead of passing it to eval.
Examples:
* Typical usage when adding an extra level of quoting is to pass a
single argument representing the entire command to be passed to the
shell. This doesn't change that.
* One can imagine someone feeding untrusted input as an argument:
git submodule foreach git grep "$variable"
That currently results in a nonobvious shell code injection
vulnerability. Executing the command named by the arguments
directly, as in this patch, fixes it.
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
Acked-by: Johan Herland <johan@herland.net>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
* bc/submodule-status-ignored:
Improve documentation concerning the status.submodulesummary setting
submodule: don't print status output with ignore=all
submodule: fix confusing variable name
The --for-status option was an undocumented option used only by
wt-status.c, which inserted a header and commented out the output. We can
achieve the same result within wt-status.c, without polluting the
submodule command-line options.
This will make it easier to disable the comments from wt-status.c later.
The --for-status is kept so that another topic in flight
(bc/submodule-status-ignored) can continue relying on it, although it is
currently a no-op.
Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
git status prints information for submodules, but it should ignore the status of
those which have submodule.<name>.ignore set to all. Fix it so that it does
properly ignore those which have that setting either in .git/config or in
.gitmodules.
Not ignored are submodules that are added, deleted, or moved (which is
essentially a combination of the first two) because it is not easily possible to
determine the old path once a move has occurred, nor is it easily possible to
detect which adds and deletions are moves and which are not. This also
preserves the previous behavior of always listing modules which are to be
deleted.
Tests are included which verify that this change has no effect on git submodule
summary without the --for-status option.
Signed-off-by: Brian M. Carlson <sandals@crustytoothpaste.net>
Acked-by: Jens Lehmann <Jens.Lehmann@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
cmd_summary reads the output of git diff, but reads in the submodule path into a
variable called name. Since this variable does not contain the name of the
submodule, but the path, rename it to be clearer what data it actually holds.
Signed-off-by: Brian M. Carlson <sandals@crustytoothpaste.net>
Acked-by: Jens Lehmann <Jens.Lehmann@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add the --depth option to the add and update commands of "git submodule",
which is then passed on to the clone command. This is useful when the
submodule(s) are huge and you're not really interested in anything but
the latest commit.
Tests are added and some indention adjustments were made to conform to the
rest of the testfile on "submodule update can handle symbolic links in pwd".
Signed-off-by: Fredrik Gustafsson <iveqy@iveqy.com>
Acked-by: Jens Lehmann <Jens.Lehmann@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Users can set submodule.$name.update to '!command' which will cause
'command' to be run instead of checkout/merge/rebase. This allows
the user finer-grained control over how the update is done.
The primary motivation for this was interoperability with stgit;
however being able to intercept the submodule update process may
prove useful for integrating with or extending other tools.
Signed-off-by: Chris Packham <judge.packham@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Allow various subcommands of "git submodule" to be run not from the
top of the working tree of the superproject.
* jk/submodule-subdirectory-ok:
submodule: drop the top-level requirement
rev-parse: add --prefix option
submodule: show full path in error message
t7403: add missing && chaining
t7403: modernize style
t7401: make indentation consistent
Many "git submodule" operations do not work on a submodule at a
path whose name is not in ASCII.
* fg/submodule-non-ascii-path:
t7400: test of UTF-8 submodule names pass under Mac OS
handle multibyte characters in name
Use the new rev-parse --prefix option to process all paths given to the
submodule command, dropping the requirement that it be run from the
top-level of the repository.
Since the interpretation of a relative submodule URL depends on whether
or not "remote.origin.url" is configured, explicitly block relative URLs
in "git submodule add" when not at the top level of the working tree.
Signed-off-by: John Keeping <john@keeping.me.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When --recursive was added to "submodule foreach" in commit 15fc56a (git
submodule foreach: Add --recursive to recurse into nested submodules,
2009-08-19), the error message when the script returns a non-zero status
was not updated to contain $prefix to show the full path. Fix this.
Signed-off-by: John Keeping <john@keeping.me.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
set_rev_name is a possiblly expensive operation. If a submodule has
changes in it, set_rev_name was called twice.
Move call to set_rev_name so it's only called once, no matter which
codepath is taken.
Signed-off-by: Fredrik Gustafsson <iveqy@iveqy.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Many "git submodule" operations do not work on a submodule at a path whose
name is not in ASCII.
This is because "git ls-files" is used to find which paths are bound to
submodules to the current working tree, and the output is C-quoted by default
for non ASCII pathnames.
Tell "git ls-files" to not C-quote its output, which is easier than unwrapping
C-quote ourselves.
Signed-off-by: Fredrik Gustafsson <iveqy@iveqy.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"submodule summary --summary-limit" option did not support
"--option=value" form.
* rs/submodule-summary-limit:
submodule summary: support --summary-limit=<n>
The output of "git submodule deinit sub" of a populated submodule prints
rm 'sub'
as the first line unless used with the -f option.
The "rm 'sub'" line is exactly the same output the user gets when using
"git rm sub" (because that command is used with the --dry-run option under
the hood to determine if the submodule is clean), which can easily lead to
the false impression that the submodule would be permanently removed. Also
users might be confused that the "rm 'submodule'" line won't show up when
the -f option is used, as the test is skipped in this case.
Silence the "rm 'submodule'" output by using the --quiet option for "git
rm" and always print
Cleared directory 'submodule'
instead as the first output line. This line is printed as long as the
directory exists, no matter if empty or not.
Also extend the tests in t7400 to make sure the "Cleared directory" line
is printed correctly.
Reported-by: Phil Hord <phil.hord@gmail.com>
Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>