Prepare for Git on-disk repository representation to undergo
backward incompatible changes by introducing a new repository
format version "1", with an extension mechanism.
* jk/repository-extension:
introduce "preciousObjects" repository extension
introduce "extensions" form of core.repositoryformatversion
Again, we do not usually process release notes with AsciiDoc, but it
is better to be consistent.
This incidentally reveals breakages left by an ancient 5e00439f
(Documentation: build html for all files in technical and howto,
2012-10-23). The index-format documentation was originally written
to be read as straight text without formatting and when the commit
forced everything in Documentation/ to go through AsciiDoc, it did
not do any adjustment--hence the double-dashes will be seen in the
resulting text that is rendered as preformatted fixed-width without
converted into em-dashes.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The spec is very inconsistent about which PKT-LINE() parts
of the grammar include a LF. On top of that, the code is not
consistent, either (e.g., send-pack does not put newlines
into the ref-update commands it sends).
Let's make explicit the long-standing expectation that we
generally expect pkt-lines to end in a newline, but that
receivers should be lenient. This makes the spec consistent,
and matches what git already does (though it does not always
fulfill the SHOULD).
We do make an exception for the push-cert, where the
receiving code is currently a bit pickier. This is a
reasonable way to be, as the data needs to be byte-for-byte
compatible with what was signed. We _could_ make up some
rules about signing a canonicalized version including
newlines, but that would require a code change, and is out
of scope for this patch.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The gitmodules API accessed from the C code learned to cache stuff
lazily.
* hv/submodule-config:
submodule: allow erroneous values for the fetchRecurseSubmodules option
submodule: use new config API for worktree configurations
submodule: extract functions for config set and lookup
submodule: implement a config API for lookup of .gitmodules values
The "lockfile" API has been rebuilt on top of a new "tempfile" API.
* mh/tempfile:
credential-cache--daemon: use tempfile module
credential-cache--daemon: delete socket from main()
gc: use tempfile module to handle gc.pid file
lock_repo_for_gc(): compute the path to "gc.pid" only once
diff: use tempfile module
setup_temporary_shallow(): use tempfile module
write_shared_index(): use tempfile module
register_tempfile(): new function to handle an existing temporary file
tempfile: add several functions for creating temporary files
prepare_tempfile_object(): new function, extracted from create_tempfile()
tempfile: a new module for handling temporary files
commit_lock_file(): use get_locked_file_path()
lockfile: add accessor get_lock_file_path()
lockfile: add accessors get_lock_file_fd() and get_lock_file_fp()
create_bundle(): duplicate file descriptor to avoid closing it twice
lockfile: move documentation to lockfile.h and lockfile.c
We remove the extracted functions and directly parse into and read out
of the cache. This allows us to have one unified way of accessing
submodule configuration values specific to single submodules. Regardless
whether we need to access a configuration from history or from the
worktree.
Signed-off-by: Heiko Voigt <hvoigt@hvoigt.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a superproject some commands need to interact with submodules. They
need to query values from the .gitmodules file either from the worktree
of from certain revisions. At the moment this is quite hard since a
caller would need to read the .gitmodules file from the history and then
parse the values. We want to provide an API for this so we have one
place to get values from .gitmodules from any revision (including the
worktree).
The API is realized as a cache which allows us to lazily read
.gitmodules configurations by commit into a runtime cache which can then
be used to easily lookup values from it. Currently only the values for
path or name are stored but it can be extended for any value needed.
It is expected that .gitmodules files do not change often between
commits. Thats why we lookup the .gitmodules sha1 from a commit and then
either lookup an already parsed configuration or parse and cache an
unknown one for each sha1. The cache is lazily build on demand for each
requested commit.
This cache can be used for all purposes which need knowledge about
submodule configurations. Example use cases are:
* Recursive submodule checkout needs to lookup a submodule name from
its path when a submodule first appears. This needs be done before
this configuration exists in the worktree.
* The implementation of submodule support for 'git archive' needs to
lookup the submodule name to generate the archive when given a
revision that is not checked out.
* 'git fetch' when given the --recurse-submodules=on-demand option (or
configuration) needs to lookup submodule names by path from the
database rather than reading from the worktree. For new submodule it
needs to lookup the name from its path to allow cloning new
submodules into the .git folder so they can be checked out without
any network interaction when the user does a checkout of that
revision.
Signed-off-by: Heiko Voigt <hvoigt@hvoigt.net>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Rearrange/rewrite it somewhat to fit its new environment.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Reimplement 'git pull' in C.
* pt/pull-builtin:
pull: remove redirection to git-pull.sh
pull --rebase: error on no merge candidate cases
pull --rebase: exit early when the working directory is dirty
pull: configure --rebase via branch.<name>.rebase or pull.rebase
pull: teach git pull about --rebase
pull: set reflog message
pull: implement pulling into an unborn branch
pull: fast-forward working tree if head is updated
pull: check if in unresolved merge state
pull: support pull.ff config
pull: error on no merge candidates
pull: pass git-fetch's options to git-fetch
pull: pass git-merge's options to git-merge
pull: pass verbosity, --progress flags to fetch and merge
pull: implement fetch + merge
pull: implement skeletal builtin pull
argv-array: implement argv_array_pushv()
parse-options-cb: implement parse_opt_passthru_argv()
parse-options-cb: implement parse_opt_passthru()
Move machinery to parse human-readable scaled numbers like 1k, 4M,
and 2G as an option parameter's value from pack-objects to
parse-options API, to make it available to other codepaths.
* cb/parse-magnitude:
parse-options: move unsigned long option parsing out of pack-objects.c
test-parse-options: update to handle negative ints
If this extension is used in a repository, then no
operations should run which may drop objects from the object
storage. This can be useful if you are sharing that storage
with other repositories whose refs you cannot see.
For instance, if you do:
$ git clone -s parent child
$ git -C parent config extensions.preciousObjects true
$ git -C parent config core.repositoryformatversion 1
you now have additional safety when running git in the
parent repository. Prunes and repacks will bail with an
error, and `git gc` will skip those operations (it will
continue to pack refs and do other non-object operations).
Older versions of git, when run in the repository, will
fail on every operation.
Note that we do not set the preciousObjects extension by
default when doing a "clone -s", as doing so breaks
backwards compatibility. It is a decision the user should
make explicitly.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Normally we try to avoid bumps of the whole-repository
core.repositoryformatversion field. However, it is
unavoidable if we want to safely change certain aspects of
git in a backwards-incompatible way (e.g., modifying the set
of ref tips that we must traverse to generate a list of
unreachable, safe-to-prune objects).
If we were to bump the repository version for every such
change, then any implementation understanding version `X`
would also have to understand `X-1`, `X-2`, and so forth,
even though the incompatibilities may be in orthogonal parts
of the system, and there is otherwise no reason we cannot
implement one without the other (or more importantly, that
the user cannot choose to use one feature without the other,
weighing the tradeoff in compatibility only for that
particular feature).
This patch documents the existing repositoryformatversion
strategy and introduces a new format, "1", which lets a
repository specify that it must run with an arbitrary set of
extensions. This can be used, for example:
- to inform git that the objects should not be pruned based
only on the reachability of the ref tips (e.g, because it
has "clone --shared" children)
- that the refs are stored in a format besides the usual
"refs" and "packed-refs" directories
Because we bump to format "1", and because format "1"
requires that a running git knows about any extensions
mentioned, we know that older versions of the code will not
do something dangerous when confronted with these new
formats.
For example, if the user chooses to use database storage for
refs, they may set the "extensions.refbackend" config to
"db". Older versions of git will not understand format "1"
and bail. Versions of git which understand "1" but do not
know about "refbackend", or which know about "refbackend"
but not about the "db" backend, will refuse to run. This is
annoying, of course, but much better than the alternative of
claiming that there are no refs in the repository, or
writing to a location that other implementations will not
read.
Note that we are only defining the rules for format 1 here.
We do not ever write format 1 ourselves; it is a tool that
is meant to be used by users and future extensions to
provide safety with older implementations.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The unsigned long option parsing (including 'k'/'m'/'g' suffix
parsing) is more widely applicable. Add support for OPT_MAGNITUDE
to parse-options.h and change pack-objects.c use this support.
The error behavior on parse errors follows that of OPT_INTEGER. The
name of the option that failed to parse is reported with a brief
message describing the expect format for the option argument and
then the full usage message for the command invoked.
This differs from the previous behavior for OPT_ULONG used in
pack-objects for --max-pack-size and --window-memory which used to
display the value supplied in the error message and did not display
the full usage message.
Signed-off-by: Charles Bailey <cbailey32@bloomberg.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we have a null-terminated array, it would be useful to convert it
or append it to an argv_array for further manipulation.
Implement argv_array_pushv() which will push a null-terminated array of
strings on to an argv_array.
Signed-off-by: Paul Tan <pyokagan@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Certain git commands, such as git-pull, are simply wrappers around other
git commands like git-fetch, git-merge and git-rebase. As such, these
wrapper commands will typically need to "pass through" command-line
options of the commands they wrap.
Implement the parse_opt_passthru_argv() parse-options callback, which
will reconstruct all the provided command-line options into an
argv_array, such that it can be passed to another git command. This is
useful for passing command-line options that can be specified multiple
times.
Helped-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Paul Tan <pyokagan@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Certain git commands, such as git-pull, are simply wrappers around other
git commands like git-fetch, git-merge and git-rebase. As such, these
wrapper commands will typically need to "pass through" command-line
options of the commands they wrap.
Implement the parse_opt_passthru() parse-options callback, which will
reconstruct the command-line option into an char* string, such that it
can be passed to another git command.
Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Paul Tan <pyokagan@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
for_each_ref() callback functions were taught to name the objects
not with "unsigned char sha1[20]" but with "struct object_id".
* bc/object-id: (56 commits)
struct ref_lock: convert old_sha1 member to object_id
warn_if_dangling_symref(): convert local variable "junk" to object_id
each_ref_fn_adapter(): remove adapter
rev_list_insert_ref(): remove unneeded arguments
rev_list_insert_ref_oid(): new function, taking an object_oid
mark_complete(): remove unneeded arguments
mark_complete_oid(): new function, taking an object_oid
clear_marks(): rewrite to take an object_id argument
mark_complete(): rewrite to take an object_id argument
send_ref(): convert local variable "peeled" to object_id
upload-pack: rewrite functions to take object_id arguments
find_symref(): convert local variable "unused" to object_id
find_symref(): rewrite to take an object_id argument
write_one_ref(): rewrite to take an object_id argument
write_refs_to_temp_dir(): convert local variable sha1 to object_id
submodule: rewrite to take an object_id argument
shallow: rewrite functions to take object_id arguments
handle_one_ref(): rewrite to take an object_id argument
add_info_ref(): rewrite to take an object_id argument
handle_one_reflog(): rewrite to take an object_id argument
...
Introduce <branch>@{push} short-hand to denote the remote-tracking
branch that tracks the branch at the remote the <branch> would be
pushed to.
* jk/at-push-sha1:
for-each-ref: accept "%(push)" format
for-each-ref: use skip_prefix instead of starts_with
sha1_name: implement @{push} shorthand
sha1_name: refactor interpret_upstream_mark
sha1_name: refactor upstream_mark
remote.c: add branch_get_push
remote.c: return upstream name from stat_tracking_info
remote.c: untangle error logic in branch_get_upstream
remote.c: report specific errors from branch_get_upstream
remote.c: introduce branch_get_upstream helper
remote.c: hoist read_config into remote_get_1
remote.c: provide per-branch pushremote name
remote.c: hoist branch.*.remote lookup out of remote_get_1
remote.c: drop "remote" pointer from "struct branch"
remote.c: refactor setup of branch->merge list
remote.c: drop default_remote_name variable
"git upload-pack" that serves "git fetch" can be told to serve
commits that are not at the tip of any ref, as long as they are
reachable from a ref, with uploadpack.allowReachableSHA1InWant
configuration variable.
* fm/fetch-raw-sha1:
upload-pack: optionally allow fetching reachable sha1
upload-pack: prepare to extend allow-tip-sha1-in-want
config.txt: clarify allowTipSHA1InWant with camelCase
Teach the index to optionally remember already seen untracked files
to speed up "git status" in a working tree with tons of cruft.
* nd/untracked-cache: (24 commits)
git-status.txt: advertisement for untracked cache
untracked cache: guard and disable on system changes
mingw32: add uname()
t7063: tests for untracked cache
update-index: test the system before enabling untracked cache
update-index: manually enable or disable untracked cache
status: enable untracked cache
untracked-cache: temporarily disable with $GIT_DISABLE_UNTRACKED_CACHE
untracked cache: mark index dirty if untracked cache is updated
untracked cache: print stats with $GIT_TRACE_UNTRACKED_STATS
untracked cache: avoid racy timestamps
read-cache.c: split racy stat test to a separate function
untracked cache: invalidate at index addition or removal
untracked cache: load from UNTR index extension
untracked cache: save to an index extension
ewah: add convenient wrapper ewah_serialize_strbuf()
untracked cache: don't open non-existent .gitignore
untracked cache: mark what dirs should be recursed/saved
untracked cache: record/validate dir mtime and reuse cached output
untracked cache: make a wrapper around {open,read,close}dir()
...
Change typedef each_ref_fn to take a "const struct object_id *oid"
parameter instead of "const unsigned char *sha1".
To aid this transition, implement an adapter that can be used to wrap
old-style functions matching the old typedef, which is now called
"each_ref_sha1_fn"), and make such functions callable via the new
interface. This requires the old function and its cb_data to be
wrapped in a "struct each_ref_fn_sha1_adapter", and that object to be
used as the cb_data for an adapter function, each_ref_fn_adapter().
This is an enormous diff, but most of it consists of simple,
mechanical changes to the sites that call any of the "for_each_ref"
family of functions. Subsequent to this change, the call sites can be
rewritten one by one to use the new interface.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
With uploadpack.allowReachableSHA1InWant configuration option set on the
server side, "git fetch" can make a request with a "want" line that names
an object that has not been advertised (likely to have been obtained out
of band or from a submodule pointer). Only objects reachable from the
branch tips, i.e. the union of advertised branches and branches hidden by
transfer.hideRefs, will be processed. Note that there is an associated
cost of having to walk back the history to check the reachability.
This feature can be used when obtaining the content of a certain commit,
for which the sha1 is known, without the need of cloning the whole
repository, especially if a shallow fetch is used. Useful cases are e.g.
repositories containing large files in the history, fetching only the
needed data for a submodule checkout, when sharing a sha1 without telling
which exact branch it belongs to and in Gerrit, if you think in terms of
commits instead of change numbers. (The Gerrit case has already been
solved through allowTipSHA1InWant as every Gerrit change has a ref.)
Signed-off-by: Fredrik Medley <fredrik.medley@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix remaining instances where "pack-file" is used instead of
"packfile". Some places remain where we still use "pack-file",
This is the case when we explicitly refer to a file with a
".pack" extension as opposed to a data source providing a pack
data stream.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we create each branch struct, we fill in the
"remote_name" field from the config, and then fill in the
actual "remote" field (with a "struct remote") based on that
name. However, it turns out that nobody really cares about
the latter field. The only two sites that access it at all
are:
1. git-merge, which uses it to notice when the branch does
not have a remote defined. But we can easily replace this
with looking at remote_name instead.
2. remote.c itself, when setting up the @{upstream} merge
config. But we don't need to save the "remote" in the
"struct branch" for that; we can just look it up for
the duration of the operation.
So there is no need to have both fields; they are redundant
with each other (the struct remote contains the name, or you
can look up the struct from the name). It would be nice to
simplify this, especially as we are going to add matching
pushremote config in a future patch (and it would be nice to
keep them consistent).
So which one do we keep and which one do we get rid of?
If we had a lot of callers accessing the struct, it would be
more efficient to keep it (since you have to do a lookup to
go from the name to the struct, but not vice versa). But we
don't have a lot of callers; we have exactly one, so
efficiency doesn't matter. We can decide this based on
simplicity and readability.
And the meaning of the struct value is somewhat unclear. Is
it always the remote matching remote_name? If remote_name is
NULL (i.e., no per-branch config), does the struct fall back
to the "origin" remote, or is it also NULL? These questions
will get even more tricky with pushremotes, whose fallback
behavior is more complicated. So let's just store the name,
which pretty clearly represents the branch.*.remote config.
Any lookup or fallback behavior can then be implemented in
helper functions.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If the user enables untracked cache, then
- move worktree to an unsupported filesystem
- or simply upgrade OS
- or move the whole (portable) disk from one machine to another
- or access a shared fs from another machine
there's no guarantee that untracked cache can still function properly.
Record the worktree location and OS footprint in the cache. If it
changes, err on the safe side and disable the cache. The user can
'update-index --untracked-cache' again to make sure all conditions are
met.
This adds a new requirement that setup_git_directory* must be called
before read_cache() because we need worktree location by then, or the
cache is dropped.
This change does not cover all bases, you can fool it if you try
hard. The point is to stop accidents.
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Helped-by: brian m. carlson <sandals@crustytoothpaste.net>
Helped-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The strbuf API was explained between the API documentation and in
the header file. Move missing bits to strbuf.h so that programmers
can check only one place for all necessary information.
* jk/strbuf-doc-to-header:
strbuf.h: group documentation for trim functions
strbuf.h: drop boilerplate descriptions of strbuf_split_*
strbuf.h: reorganize api function grouping headers
strbuf.h: format asciidoc code blocks as 4-space indent
strbuf.h: drop asciidoc list formatting from API docs
strbuf.h: unify documentation comments beginnings
strbuf.h: integrate api-strbuf.txt documentation
The error handling functions and conventions are now documented in
the API manual.
* jn/doc-api-errors:
doc: document error handling functions and conventions
"git push" has been taught a "--atomic" option that makes push to
update more than one ref an "all-or-none" affair.
* sb/atomic-push:
Document receive.advertiseatomic
t5543-atomic-push.sh: add basic tests for atomic pushes
push.c: add an --atomic argument
send-pack.c: add --atomic command line argument
send-pack: rename ref_update_to_be_sent to check_to_send_update
receive-pack.c: negotiate atomic push support
receive-pack.c: add execute_commands_atomic function
receive-pack.c: move transaction handling in a central place
receive-pack.c: move iterating over all commands outside execute_commands
receive-pack.c: die instead of error in case of possible future bug
receive-pack.c: shorten the execute_commands loop over all commands
Some of strbuf is documented as comments above functions,
and some is separate in Documentation/technical/api-strbuf.txt.
This makes it annoying to find the appropriate documentation.
We'd rather have it all in one place, which means all in the
text document, or all in the header.
Let's choose the header as that place. Even though the
formatting is not quite as pretty, this keeps the
documentation close to the related code. The hope is that
this makes it easier to find what you want (human-readable
comments are right next to the C declarations), and easier
for writers to keep the documentation up to date.
This is more or less a straight import of the text from
api-strbuf.txt into C comments, complete with asciidoc
formatting. The exceptions are:
1. All comments created in this way are started with "/**"
to indicate they are part of the API documentation. This
may help later with extracting the text to pretty-print
it.
2. Function descriptions do not repeat the function name,
as it is available in the context directly below. So:
`strbuf_add`::
Add data of given length to the buffer.
from api-strbuf.txt becomes:
/**
* Add data of given length to the buffer.
*/
void strbuf_add(struct strbuf *sb, const void *, size_t);
As a result, any block-continuation required in asciidoc
for that list item was dropped in favor of straight
blank-line paragraph (since it is not necessary when we
are not in a list item).
3. There is minor re-wording to integrate existing comments
and api-strbuf text. In each case, I took whichever
version was more descriptive, and eliminated any
redundancies. In one case, for strbuf_addstr, the api
documentation gave its inline definition; I eliminated
this as redundant with the actual definition, which can
be seen directly below the comment.
4. The functions in the header file are re-ordered to match
the ordering of the API documentation, under the
assumption that more thought went into the grouping
there.
Helped-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This adds the atomic protocol option to allow
receive-pack to inform the client that it has
atomic push capability.
This commit makes the functionality introduced
in the previous commits go live for the serving
side. The changes in documentation reflect the
protocol capabilities of the server.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Credential helpers are asked in turn until one of them give
positive response, which is cumbersome to turn off when you need to
run Git in an automated setting. The credential helper interface
learned to allow a helper to say "stop, don't ask other helpers."
Also GIT_TERMINAL_PROMPT environment can be set to false to disable
our built-in prompt mechanism for passwords.
* jk/credential-quit:
prompt: respect GIT_TERMINAL_PROMPT to disable terminal prompts
credential: let helpers tell us to quit
When we are trying to fill a credential, we loop over the
set of defined credential-helpers, then fall back to running
askpass, and then finally prompt on the terminal. Helpers
which cannot find a credential are free to tell us nothing,
but they cannot currently ask us to stop prompting.
This patch lets them provide a "quit" attribute, which asks
us to stop the process entirely (avoiding running more
helpers, as well as the askpass/terminal prompt).
This has a few possible uses:
1. A helper which prompts the user itself (e.g., in a
dialog) can provide a "cancel" button to the user to
stop further prompts.
2. Some helpers may know that prompting cannot possibly
work. For example, if their role is to broker a ticket
from an external auth system and that auth system
cannot be contacted, there is no point in continuing
(we need a ticket to authenticate, and the user cannot
provide one by typing it in).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The new name is more consistent with the names of other
string_list-related functions.
Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In addition to fixing trivial and obvious typos, be careful about
the following points:
- Spell ASCII, URL and CRC in ALL CAPS;
- Spell Linux as Capitalized;
- Do not omit periods in "i.e." and "e.g.".
Signed-off-by: Thomas Ackermann <th.acker@arcor.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>