1
0
Fork 0
mirror of https://github.com/git/git.git synced 2024-11-15 21:53:44 +01:00
Commit graph

798 commits

Author SHA1 Message Date
Nguyễn Thái Ngọc Duy
6bf3b81348 diff --stat: mark any file larger than core.bigfilethreshold binary
Too large files may lead to failure to allocate memory. If it happens
here, it could impact quite a few commands that involve
diff. Moreover, too large files are inefficient to compare anyway (and
most likely non-text), so mark them binary and skip looking at their
content.

Noticed-by: Dale R. Worley <worley@alum.mit.edu>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-08-18 10:16:45 -07:00
Junio C Hamano
c3d2bc720c Merge branch 'jk/tag-sort'
* jk/tag-sort:
  tag: support configuring --sort via .gitconfig
  tag: fix --sort tests to use cat<<-\EOF format
2014-07-23 11:35:45 -07:00
Jacob Keller
b150794daf tag: support configuring --sort via .gitconfig
Add support for configuring default sort ordering for git tags. Command
line option will override this configured value, using the exact same
syntax.

Cc: Jeff King <peff@peff.net>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-07-17 09:22:20 -07:00
Junio C Hamano
25f3119000 Merge branch 'jk/repack-pack-writebitmaps-config'
* jk/repack-pack-writebitmaps-config:
  t7700: drop explicit --no-pack-kept-objects from .keep test
  repack: introduce repack.writeBitmaps config option
  repack: simplify handling of --write-bitmap-index
  pack-objects: stop respecting pack.writebitmaps
2014-06-25 12:23:19 -07:00
Junio C Hamano
96b29bde91 Merge branch 'sh/enable-preloadindex'
* sh/enable-preloadindex:
  environment.c: enable core.preloadindex by default
2014-06-16 12:18:49 -07:00
Junio C Hamano
f18871dcd4 Merge branch 'jm/format-patch-mail-sig'
* jm/format-patch-mail-sig:
  format-patch: add "--signature-file=<file>" option
  format-patch: make newline after signature conditional
2014-06-16 12:18:38 -07:00
Junio C Hamano
6d681f0a3e Merge branch 'jl/status-added-submodule-is-never-ignored'
submodule.*.ignore and diff.ignoresubmodules are used to ignore all
submodule changes in "diff" output, but it can be confusing to
apply these configuration values to status and commit.

This is a backward-incompatible change, but should be so in a good
way (aka bugfix).

* jl/status-added-submodule-is-never-ignored:
  commit -m: commit staged submodules regardless of ignore config
  status/commit: show staged submodules regardless of ignore config
2014-06-16 10:07:19 -07:00
Jeff King
71d76cb480 repack: introduce repack.writeBitmaps config option
We currently have pack.writeBitmaps, which originally
operated at the pack-objects level. This should really have
been a repack.* option from day one. Let's give it the more
sensible name, but keep the old version as a deprecated
synonym.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-10 14:05:19 -07:00
Junio C Hamano
ed47bbd1d0 Merge branch 'jj/command-line-adjective'
* jj/command-line-adjective:
  Documentation: use "command-line" when used as a compound adjective, and fix other minor grammatical issues
2014-06-06 11:38:48 -07:00
Junio C Hamano
1e2600dd6a Merge branch 'nd/status-auto-comment-char'
* nd/status-auto-comment-char:
  commit: allow core.commentChar=auto for character auto selection
  config: be strict on core.commentChar
2014-06-06 11:36:10 -07:00
Junio C Hamano
d2a274aa87 Merge branch 'dk/raise-core-deltabasecachelimit'
The `core.deltabasecachelimit` used to default to 16 MiB , but this
proved to be too small, and has been bumped to 96 MiB.

* dk/raise-core-deltabasecachelimit:
  Bump core.deltaBaseCacheLimit to 96m
2014-06-06 11:18:34 -07:00
Junio C Hamano
561d952ed4 Merge branch 'mm/pager-less-sans-S'
Since the very beginning of Git, we gave the LESS environment a
default value "FRSX" when we spawn "less" as the pager.  "S" (chop
long lines instead of wrapping) has been removed from this default
set of options, because it is more or less a personal taste thing,
as opposed to others that have good justifications (i.e. "R" is very
much justified because many kinds of output we produce are colored
and "FX" is justified because output we produce is often shorter
than a page).

Existing users who prefer not to see line-wrapped output may want to
set

  $ git config core.pager "less -S"

to restore the traditional behaviour.  It is expected that people
find output from the most subcommands easier to read with the new
default, except for "blame" which tends to produce really long
lines.  To override the new default only for "git blame", you can do
this:

  $ git config pager.blame "less -S"

* mm/pager-less-sans-S:
  pager: remove 'S' from $LESS by default
2014-06-06 11:02:59 -07:00
Steve Hoelzer
299e29870b environment.c: enable core.preloadindex by default
Many people are on filesystems with horrible stat latency (not
limited to Windows but also NFS), which core.preloadindex was
designed to help.  We discussed enabling it by default early in 2013
but didn't.

Per

  http://thread.gmane.org/gmane.comp.version-control.git/219273/focus=219322

let's enable the setting by default, with the original choice of max
20 threads / min 500 paths per thread parameters.

Signed-off-by: Steve Hoelzer <shoelzer@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-03 10:06:53 -07:00
Jeremiah Mahler
7022650f61 format-patch: add "--signature-file=<file>" option
Add an option to format-patch for reading a signature from a file.

  $ git format-patch -1 --signature-file=$HOME/.signature

The config variable `format.signaturefile` can also be used to make
this the default.

  $ git config format.signaturefile $HOME/.signature

  $ git format-patch -1

Signed-off-by: Jeremiah Mahler <jmmahler@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-05-27 12:38:32 -07:00
Jason St. John
06ab60c066 Documentation: use "command-line" when used as a compound adjective, and fix other minor grammatical issues
Signed-off-by: Jason St. John <jstjohn@purdue.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-05-21 13:57:10 -07:00
Nguyễn Thái Ngọc Duy
84c9dc2c5a commit: allow core.commentChar=auto for character auto selection
When core.commentChar is "auto", the comment char starts with '#' as
in default but if it's already in the prepared message, find another
char in a small subset. This should stop surprises because git strips
some lines unexpectedly.

Note that git is not smart enough to recognize '#' as the comment char
in custom templates and convert it if the final comment char is
different. It thinks '#' lines in custom templates as part of the
commit message. So don't use this with custom templates.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-05-19 13:37:25 -07:00
Matthieu Moy
b3275838d9 pager: remove 'S' from $LESS by default
By default, Git used to set $LESS to -FRSX if $LESS was not set by
the user. The FRX flags actually make sense for Git (F and X because
sometimes the output Git pipes to less is short, and R because Git
pipes colored output). The S flag (chop long lines), on the other
hand, is not related to Git and is a matter of user preference. Git
should not decide for the user to change LESS's default.

More specifically, the S flag harms users who review untrusted code
within a pager, since a patch looking like:

    -old code;
    +new good code; [... lots of tabs ...] malicious code;

would appear identical to:

    -old code;
    +new good code;

Users who prefer the old behavior can still set the $LESS environment
variable to -FRSX explicitly, or set core.pager to 'less -S'.

The documentation in config.txt is made a bit longer to keep both an
example setting the 'S' flag (needed to recover the old behavior)
and an example showing how to unset a flag set by Git.

Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-05-07 13:41:04 -07:00
David Kastrup
4874f544f1 Bump core.deltaBaseCacheLimit to 96m
The default of 16m causes serious thrashing for large delta chains
combined with large files.

Here are some benchmarks (pu variant of git blame):

time git blame -C src/xdisp.c >/dev/null

for a repository of Emacs repacked with git gc --aggressive (v1.9,
resulting in a window size of 250) located on an SSD drive.  The file in
question has about 30000 lines, 1Mb of size, and a history with about
2500 commits.

    16m (previous default):
    real	3m33.936s
    user	2m15.396s
    sys	1m17.352s

    32m:
    real	3m1.319s
    user	2m8.660s
    sys	0m51.904s

    64m:
    real	2m20.636s
    user	1m55.780s
    sys	0m23.964s

    96m:
    real	2m5.668s
    user	1m50.784s
    sys	0m14.288s

    128m:
    real	2m4.337s
    user	1m50.764s
    sys	0m12.832s

    192m:
    real	2m3.567s
    user	1m49.508s
    sys	0m13.312s

Signed-off-by: David Kastrup <dak@gnu.org>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-05-06 15:32:21 -07:00
Max Kirillov
ec9fa62a10 Documentation: git-gui: describe gui.displayuntracked
Signed-off-by: Max Kirillov <max@max630.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-04-21 10:33:20 -07:00
Junio C Hamano
d59c12d7ad Merge branch 'jl/nor-or-nand-and'
Eradicate mistaken use of "nor" (that is, essentially "nor" used
not in "neither A nor B" ;-)) from in-code comments, command output
strings, and documentations.

* jl/nor-or-nand-and:
  code and test: fix misuses of "nor"
  comments: fix misuses of "nor"
  contrib: fix misuses of "nor"
  Documentation: fix misuses of "nor"
2014-04-08 12:00:28 -07:00
Jens Lehmann
1d2f393ac9 status/commit: show staged submodules regardless of ignore config
Currently setting submodule.<name>.ignore and/or diff.ignoreSubmodules to
"all" suppresses all output of submodule changes for the diff family,
status and commit. For status and commit this is really confusing, as it
even when the user chooses to record a new commit for an ignored submodule
by adding it manually this change won't show up under the to-be-committed
changes. To add insult to injury, a later "git commit" will error out with
"nothing to commit" when only ignored submodules are staged.

Fix that by making wt_status always print staged submodule changes, no
matter what ignore settings are configured. The only exception is when the
user explicitly uses the "--ignore-submodules=all" command line option, in
that case the submodule output is still suppressed. This also makes "git
commit" work again when only modifications of ignored submodules are
staged, as that command uses the "commitable" member of the wt_status
struct to determine if staged changes are present. But this only happens
when the commit command uses the wt_status* functions to produce status
output for human consumption (when forking an editor or with --dry-run),
in all other cases (e.g. when run in a script with '-m') another code path
is taken which uses index_differs_from() to determine if any changes are
staged which still ignores submodules according to their configuration.
This will be fixed in a follow-up commit.

Change t7508 to reflect this new behavior and add three new tests to show
that a single staged submodule configured to be ignored will be committed
when the status output is generated and won't be if not. Also update the
documentation of the ignore config options accordingly.

Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-04-07 10:32:20 -07:00
Junio C Hamano
8815d8aa7c Merge branch 'nd/gc-aggressive'
Allow tweaking the maximum length of the delta-chain produced by
"gc --aggressive".

* nd/gc-aggressive:
  environment.c: fix constness for odb_pack_keep()
  gc --aggressive: make --depth configurable
2014-04-03 12:38:47 -07:00
Junio C Hamano
76bc28a3bb Merge branch 'ca/doc-config-third-party'
* ca/doc-config-third-party:
  config.txt: third-party tools may and do use their own variables
2014-03-31 16:30:49 -07:00
Justin Lebar
a58088abe2 Documentation: fix misuses of "nor"
Signed-off-by: Justin Lebar <jlebar@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-03-31 15:16:22 -07:00
Nguyễn Thái Ngọc Duy
125f81461d gc --aggressive: make --depth configurable
When 1c192f3 (gc --aggressive: make it really aggressive - 2007-12-06)
made --depth=250 the default value, it didn't really explain the
reason behind, especially the pros and cons of --depth=250.

An old mail from Linus below explains it at length. Long story short,
--depth=250 is a disk saver and a performance killer. Not everybody
agrees on that aggressiveness. Let the user configure it.

    From: Linus Torvalds <torvalds@linux-foundation.org>
    Subject: Re: [PATCH] gc --aggressive: make it really aggressive
    Date: Thu, 6 Dec 2007 08:19:24 -0800 (PST)
    Message-ID: <alpine.LFD.0.9999.0712060803430.13796@woody.linux-foundation.org>
    Gmane-URL: http://article.gmane.org/gmane.comp.gcc.devel/94637

    On Thu, 6 Dec 2007, Harvey Harrison wrote:
    >
    > 7:41:25elapsed 86%CPU

    Heh. And this is why you want to do it exactly *once*, and then just
    export the end result for others ;)

    > -r--r--r-- 1 hharrison hharrison 324094684 2007-12-06 07:26 pack-1d46...pack

    But yeah, especially if you allow longer delta chains, the end result can
    be much smaller (and what makes the one-time repack more expensive is the
    window size, not the delta chain - you could make the delta chains longer
    with no cost overhead at packing time)

    HOWEVER.

    The longer delta chains do make it potentially much more expensive to then
    use old history. So there's a trade-off. And quite frankly, a delta depth
    of 250 is likely going to cause overflows in the delta cache (which is
    only 256 entries in size *and* it's a hash, so it's going to start having
    hash conflicts long before hitting the 250 depth limit).

    So when I said "--depth=250 --window=250", I chose those numbers more as
    an example of extremely aggressive packing, and I'm not at all sure that
    the end result is necessarily wonderfully usable. It's going to save disk
    space (and network bandwidth - the delta's will be re-used for the network
    protocol too!), but there are definitely downsides too, and using long
    delta chains may simply not be worth it in practice.

    (And some of it might just want to have git tuning, ie if people think
    that long deltas are worth it, we could easily just expand on the delta
    hash, at the cost of some more memory used!)

    That said, the good news is that working with *new* history will not be
    affected negatively, and if you want to be _really_ sneaky, there are ways
    to say "create a pack that contains the history up to a version one year
    ago, and be very aggressive about those old versions that we still want to
    have around, but do a separate pack for newer stuff using less aggressive
    parameters"

    So this is something that can be tweaked, although we don't really have
    any really nice interfaces for stuff like that (ie the git delta cache
    size is hardcoded in the sources and cannot be set in the config file, and
    the "pack old history more aggressively" involves some manual scripting
    and knowing how "git pack-objects" works rather than any nice simple
    command line switch).

    So the thing to take away from this is:
     - git is certainly flexible as hell
     - .. but to get the full power you may need to tweak things
     - .. happily you really only need to have one person to do the tweaking,
       and the tweaked end results will be available to others that do not
       need to know/care.

    And whether the difference between 320MB and 500MB is worth any really
    involved tweaking (considering the potential downsides), I really don't
    know. Only testing will tell.

			    Linus

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-03-31 10:26:24 -07:00
Chris Angelico
93728b23ad config.txt: third-party tools may and do use their own variables
Signed-off-by: Chris Angelico <rosuav@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-03-21 11:55:07 -07:00
Junio C Hamano
249d54b231 Merge branch 'jk/repack-pack-keep-objects'
* jk/repack-pack-keep-objects:
  repack: add `repack.packKeptObjects` config var
2014-03-18 13:50:29 -07:00
Junio C Hamano
e8cb4996ad Merge branch 'sr/add--interactive-term-readkey'
* sr/add--interactive-term-readkey:
  git-add--interactive: warn if module for interactive.singlekey is missing
  git-config: document interactive.singlekey requires Term::ReadKey
2014-03-14 14:27:21 -07:00
Junio C Hamano
d552f8df1b Merge branch 'sg/archive-restrict-remote'
Allow loosening remote "git archive" invocation security check that
refuses to serve tree-ish not at the tip of any ref.

* sg/archive-restrict-remote:
  add uploadarchive.allowUnreachable option
  docs: clarify remote restrictions for git-upload-archive
2014-03-14 14:27:03 -07:00
Junio C Hamano
13b49f1e74 Merge branch 'tg/index-v4-format'
* tg/index-v4-format:
  read-cache: add index.version config variable
  test-lib: allow setting the index format version
  introduce GIT_INDEX_VERSION environment variable
2014-03-14 14:26:50 -07:00
Junio C Hamano
009055f3ec Merge branch 'jc/push-2.0-default-to-simple'
Finally update the "git push" default behaviour to "simple".
2014-03-07 15:13:15 -08:00
Junio C Hamano
4c4ac4db2c Merge branch 'nd/daemonize-gc'
Allow running "gc --auto" in the background.

* nd/daemonize-gc:
  gc: config option for running --auto in background
  daemon: move daemonize() to libgit.a
2014-03-05 15:06:39 -08:00
Simon Ruderich
8358f1acd5 git-config: document interactive.singlekey requires Term::ReadKey
Most distributions don't require Term::ReadKey as dependency, leaving
the user to wonder why the setting doesn't work.

Signed-off-by: Simon Ruderich <simon@ruderich.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-03-03 14:10:55 -08:00
Jeff King
ee34a2bead repack: add repack.packKeptObjects config var
The git-repack command always passes `--honor-pack-keep`
to pack-objects. This has traditionally been a good thing,
as we do not want to duplicate those objects in a new pack,
and we are not going to delete the old pack.

However, when bitmaps are in use, it is important for a full
repack to include all reachable objects, even if they may be
duplicated in a .keep pack. Otherwise, we cannot generate
the bitmaps, as the on-disk format requires the set of
objects in the pack to be fully closed.

Even if the repository does not generally have .keep files,
a simultaneous push could cause a race condition in which a
.keep file exists at the moment of a repack. The repack may
try to include those objects in one of two situations:

  1. The pushed .keep pack contains objects that were
     already in the repository (e.g., blobs due to a revert of
     an old commit).

  2. Receive-pack updates the refs, making the objects
     reachable, but before it removes the .keep file, the
     repack runs.

In either case, we may prefer to duplicate some objects in
the new, full pack, and let the next repack (after the .keep
file is cleaned up) take care of removing them.

This patch introduces both a command-line and config option
to disable the `--honor-pack-keep` option.  By default, it
is triggered when pack.writeBitmaps (or `--write-bitmap-index`
is turned on), but specifying it explicitly can override the
behavior (e.g., in cases where you prefer .keep files to
bitmaps, but only when they are present).

Note that this option just disables the pack-objects
behavior. We still leave packs with a .keep in place, as we
do not necessarily know that we have duplicated all of their
objects.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-03-03 12:21:49 -08:00
Scott J. Goldman
7671b63211 add uploadarchive.allowUnreachable option
In commit ee27ca4, we started restricting remote git-archive
invocations to only accessing reachable commits. This
matches what upload-pack allows, but does restrict some
useful cases (e.g., HEAD:foo). We loosened this in 0f544ee,
which allows `foo:bar` as long as `foo` is a ref tip.
However, that still doesn't allow many useful things, like:

  1. Commits accessible from a ref, like `foo^:bar`, which
     are reachable

  2. Arbitrary sha1s, even if they are reachable.

We can do a full object-reachability check for these cases,
but it can be quite expensive if the client has sent us the
sha1 of a tree; we have to visit every sub-tree of every
commit in the worst case.

Let's instead give site admins an escape hatch, in case they
prefer the more liberal behavior.  For many sites, the full
object database is public anyway (e.g., if you allow dumb
walker access), or the site admin may simply decide the
security/convenience tradeoff is not worth it.

This patch adds a new config option to disable the
restrictions added in ee27ca4. It defaults to off, meaning
there is no change in behavior by default.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-02-28 09:55:37 -08:00
Junio C Hamano
0f9e62e084 Merge branch 'jk/pack-bitmap'
Borrow the bitmap index into packfiles from JGit to speed up
enumeration of objects involved in a commit range without having to
fully traverse the history.

* jk/pack-bitmap: (26 commits)
  ewah: unconditionally ntohll ewah data
  ewah: support platforms that require aligned reads
  read-cache: use get_be32 instead of hand-rolled ntoh_l
  block-sha1: factor out get_be and put_be wrappers
  do not discard revindex when re-preparing packfiles
  pack-bitmap: implement optional name_hash cache
  t/perf: add tests for pack bitmaps
  t: add basic bitmap functionality tests
  count-objects: recognize .bitmap in garbage-checking
  repack: consider bitmaps when performing repacks
  repack: handle optional files created by pack-objects
  repack: turn exts array into array-of-struct
  repack: stop using magic number for ARRAY_SIZE(exts)
  pack-objects: implement bitmap writing
  rev-list: add bitmap mode to speed up object lists
  pack-objects: use bitmaps when packing objects
  pack-objects: split add_object_entry
  pack-bitmap: add support for bitmap indexes
  documentation: add documentation for the bitmap format
  ewah: compressed bitmap implementation
  ...
2014-02-27 14:01:48 -08:00
Junio C Hamano
7da5fd6895 Merge branch 'da/pull-ff-configuration'
"git pull" learned to pay attention to pull.ff configuration
variable.

* da/pull-ff-configuration:
  pull: add --ff-only to the help text
  pull: add pull.ff configuration
2014-02-27 14:01:11 -08:00
Junio C Hamano
810273bc33 Merge branch 'nv/commit-gpgsign-config'
Introduce commit.gpgsign configuration variable to force every
commit to be GPG signed.  The variable cannot be overriden from the
command line of some of the commands that create commits except for
"git commit" and "git commit-tree", but I am not convinced that it
is a good idea to sprinkle support for --no-gpg-sign everywhere,
which in turn means that this configuration variable may not be
such a good idea.

* nv/commit-gpgsign-config:
  test the commit.gpgsign config option
  commit-tree: add and document --no-gpg-sign
  commit-tree: add the commit.gpgsign option to sign all commits
2014-02-27 14:01:03 -08:00
Nicolas Vigier
d95bfb12b8 commit-tree: add the commit.gpgsign option to sign all commits
If you want to GPG sign all your commits, you have to add the -S option
all the time. The commit.gpgsign config option allows to sign all
commits automatically.

Signed-off-by: Nicolas Vigier <boklm@mars-attacks.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-02-24 14:50:56 -08:00
Thomas Gummerer
3c09d6845d read-cache: add index.version config variable
Add a config variable that allows setting the default index version when
initializing a new index file.  Similar to the GIT_INDEX_VERSION
environment variable this only affects new index files.

Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-02-24 13:33:17 -08:00
Nguyễn Thái Ngọc Duy
9f673f9477 gc: config option for running --auto in background
`gc --auto` takes time and can block the user temporarily (but not any
less annoyingly). Make it run in background on systems that support
it. The only thing lost with running in background is printouts. But
gc output is not really interesting. You can keep it in foreground by
changing gc.autodetach.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-02-10 10:46:37 -08:00
Junio C Hamano
92251b1b5b Merge branch 'nd/shallow-clone'
Fetching from a shallow-cloned repository used to be forbidden,
primarily because the codepaths involved were not carefully vetted
and we did not bother supporting such usage. This attempts to allow
object transfer out of a shallow-cloned repository in a controlled
way (i.e. the receiver become a shallow repository with truncated
history).

* nd/shallow-clone: (31 commits)
  t5537: fix incorrect expectation in test case 10
  shallow: remove unused code
  send-pack.c: mark a file-local function static
  git-clone.txt: remove shallow clone limitations
  prune: clean .git/shallow after pruning objects
  clone: use git protocol for cloning shallow repo locally
  send-pack: support pushing from a shallow clone via http
  receive-pack: support pushing to a shallow clone via http
  smart-http: support shallow fetch/clone
  remote-curl: pass ref SHA-1 to fetch-pack as well
  send-pack: support pushing to a shallow clone
  receive-pack: allow pushes that update .git/shallow
  connected.c: add new variant that runs with --shallow-file
  add GIT_SHALLOW_FILE to propagate --shallow-file to subprocesses
  receive/send-pack: support pushing from a shallow clone
  receive-pack: reorder some code in unpack()
  fetch: add --update-shallow to accept refs that update .git/shallow
  upload-pack: make sure deepening preserves shallow roots
  fetch: support fetching from a shallow repository
  clone: support remote shallow repository
  ...
2014-01-17 12:21:20 -08:00
David Aguilar
b814da891e pull: add pull.ff configuration
Add a `pull.ff` configuration option that is analogous
to the `merge.ff` option.

This allows us to control the fast-forward behavior for
pull-initiated merges only.

Signed-off-by: David Aguilar <davvid@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-01-15 16:01:06 -08:00
Junio C Hamano
9fac0777e1 Merge branch 'jn/pager-lv-default-env'
Just like we give a reasonable default for "less" via the LESS
environment variable, specify a reasonable default for "lv" via the
"LV" environment variable when spawning the pager.

* jn/pager-lv-default-env:
  pager: set LV=-c alongside LESS=FRSX
2014-01-13 11:33:35 -08:00
Jonathan Nieder
e54c1f2d25 pager: set LV=-c alongside LESS=FRSX
On systems with lv configured as the preferred pager (i.e.,
DEFAULT_PAGER=lv at build time, or PAGER=lv exported in the
environment) git commands that use color show control codes instead of
color in the pager:

	$ git diff
	^[[1mdiff --git a/.mailfilter b/.mailfilter^[[m
	^[[1mindex aa4f0b2..17e113e 100644^[[m
	^[[1m--- a/.mailfilter^[[m
	^[[1m+++ b/.mailfilter^[[m
	^[[36m@@ -1,11 +1,58 @@^[[m

"less" avoids this problem because git uses the LESS environment
variable to pass the -R option ('output ANSI color escapes in raw
form') by default.  Use the LV environment variable to pass 'lv' the
-c option ('allow ANSI escape sequences for text decoration / color')
to fix it for lv, too.

Noticed when the default value for color.ui flipped to 'auto' in
v1.8.4-rc0~36^2~1 (2013-06-10).

Reported-by: Olaf Meeuwissen <olaf.meeuwissen@avasys.jp>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-01-07 09:23:41 -08:00
Vicent Marti
ae4f07fbcc pack-bitmap: implement optional name_hash cache
When we use pack bitmaps rather than walking the object
graph, we end up with the list of objects to include in the
packfile, but we do not know the path at which any tree or
blob objects would be found.

In a recently packed repository, this is fine. A fetch would
use the paths only as a heuristic in the delta compression
phase, and a fully packed repository should not need to do
much delta compression.

As time passes, though, we may acquire more objects on top
of our large bitmapped pack. If clients fetch frequently,
then they never even look at the bitmapped history, and all
works as usual. However, a client who has not fetched since
the last bitmap repack will have "have" tips in the
bitmapped history, but "want" newer objects.

The bitmaps themselves degrade gracefully in this
circumstance. We manually walk the more recent bits of
history, and then use bitmaps when we hit them.

But we would also like to perform delta compression between
the newer objects and the bitmapped objects (both to delta
against what we know the user already has, but also between
"new" and "old" objects that the user is fetching). The lack
of pathnames makes our delta heuristics much less effective.

This patch adds an optional cache of the 32-bit name_hash
values to the end of the bitmap file. If present, a reader
can use it to match bitmapped and non-bitmapped names during
delta compression.

Here are perf results for p5310:

Test                      origin/master       HEAD^                      HEAD
-------------------------------------------------------------------------------------------------
5310.2: repack to disk    36.81(37.82+1.43)   47.70(48.74+1.41) +29.6%   47.75(48.70+1.51) +29.7%
5310.3: simulated clone   30.78(29.70+2.14)   1.08(0.97+0.10) -96.5%     1.07(0.94+0.12) -96.5%
5310.4: simulated fetch   3.16(6.10+0.08)     3.54(10.65+0.06) +12.0%    1.70(3.07+0.06) -46.2%
5310.6: partial bitmap    36.76(43.19+1.81)   6.71(11.25+0.76) -81.7%    4.08(6.26+0.46) -88.9%

You can see that the time spent on an incremental fetch goes
down, as our delta heuristics are able to do their work.
And we save time on the partial bitmap clone for the same
reason.

Signed-off-by: Vicent Marti <tanoku@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-30 12:19:23 -08:00
Vicent Marti
7cc8f97108 pack-objects: implement bitmap writing
This commit extends more the functionality of `pack-objects` by allowing
it to write out a `.bitmap` index next to any written packs, together
with the `.idx` index that currently gets written.

If bitmap writing is enabled for a given repository (either by calling
`pack-objects` with the `--write-bitmap-index` flag or by having
`pack.writebitmaps` set to `true` in the config) and pack-objects is
writing a packfile that would normally be indexed (i.e. not piping to
stdout), we will attempt to write the corresponding bitmap index for the
packfile.

Bitmap index writing happens after the packfile and its index has been
successfully written to disk (`finish_tmp_packfile`). The process is
performed in several steps:

    1. `bitmap_writer_set_checksum`: this call stores the partial
       checksum for the packfile being written; the checksum will be
       written in the resulting bitmap index to verify its integrity

    2. `bitmap_writer_build_type_index`: this call uses the array of
       `struct object_entry` that has just been sorted when writing out
       the actual packfile index to disk to generate 4 type-index bitmaps
       (one for each object type).

       These bitmaps have their nth bit set if the given object is of
       the bitmap's type. E.g. the nth bit of the Commits bitmap will be
       1 if the nth object in the packfile index is a commit.

       This is a very cheap operation because the bitmap writing code has
       access to the metadata stored in the `struct object_entry` array,
       and hence the real type for each object in the packfile.

    3. `bitmap_writer_reuse_bitmaps`: if there exists an existing bitmap
       index for one of the packfiles we're trying to repack, this call
       will efficiently rebuild the existing bitmaps so they can be
       reused on the new index. All the existing bitmaps will be stored
       in a `reuse` hash table, and the commit selection phase will
       prioritize these when selecting, as they can be written directly
       to the new index without having to perform a revision walk to
       fill the bitmap. This can greatly speed up the repack of a
       repository that already has bitmaps.

    4. `bitmap_writer_select_commits`: if bitmap writing is enabled for
       a given `pack-objects` run, the sequence of commits generated
       during the Counting Objects phase will be stored in an array.

       We then use that array to build up the list of selected commits.
       Writing a bitmap in the index for each object in the repository
       would be cost-prohibitive, so we use a simple heuristic to pick
       the commits that will be indexed with bitmaps.

       The current heuristics are a simplified version of JGit's
       original implementation. We select a higher density of commits
       depending on their age: the 100 most recent commits are always
       selected, after that we pick 1 commit of each 100, and the gap
       increases as the commits grow older. On top of that, we make sure
       that every single branch that has not been merged (all the tips
       that would be required from a clone) gets their own bitmap, and
       when selecting commits between a gap, we tend to prioritize the
       commit with the most parents.

       Do note that there is no right/wrong way to perform commit
       selection; different selection algorithms will result in
       different commits being selected, but there's no such thing as
       "missing a commit". The bitmap walker algorithm implemented in
       `prepare_bitmap_walk` is able to adapt to missing bitmaps by
       performing manual walks that complete the bitmap: the ideal
       selection algorithm, however, would select the commits that are
       more likely to be used as roots for a walk in the future (e.g.
       the tips of each branch, and so on) to ensure a bitmap for them
       is always available.

    5. `bitmap_writer_build`: this is the computationally expensive part
       of bitmap generation. Based on the list of commits that were
       selected in the previous step, we perform several incremental
       walks to generate the bitmap for each commit.

       The walks begin from the oldest commit, and are built up
       incrementally for each branch. E.g. consider this dag where A, B,
       C, D, E, F are the selected commits, and a, b, c, e are a chunk
       of simplified history that will not receive bitmaps.

            A---a---B--b--C--c--D
                     \
                      E--e--F

       We start by building the bitmap for A, using A as the root for a
       revision walk and marking all the objects that are reachable
       until the walk is over. Once this bitmap is stored, we reuse the
       bitmap walker to perform the walk for B, assuming that once we
       reach A again, the walk will be terminated because A has already
       been SEEN on the previous walk.

       This process is repeated for C, and D, but when we try to
       generate the bitmaps for E, we can reuse neither the current walk
       nor the bitmap we have generated so far.

       What we do now is resetting both the walk and clearing the
       bitmap, and performing the walk from scratch using E as the
       origin. This new walk, however, does not need to be completed.
       Once we hit B, we can lookup the bitmap we have already stored
       for that commit and OR it with the existing bitmap we've composed
       so far, allowing us to limit the walk early.

       After all the bitmaps have been generated, another iteration
       through the list of commits is performed to find the best XOR
       offsets for compression before writing them to disk. Because of
       the incremental nature of these bitmaps, XORing one of them with
       its predecesor results in a minimal "bitmap delta" most of the
       time. We can write this delta to the on-disk bitmap index, and
       then re-compose the original bitmaps by XORing them again when
       loaded.

       This is a phase very similar to pack-object's `find_delta` (using
       bitmaps instead of objects, of course), except the heuristics
       have been greatly simplified: we only check the 10 bitmaps before
       any given one to find best compressing one. This gives good
       results in practice, because there is locality in the ordering of
       the objects (and therefore bitmaps) in the packfile.

     6. `bitmap_writer_finish`: the last step in the process is
	serializing to disk all the bitmap data that has been generated
	in the two previous steps.

	The bitmap is written to a tmp file and then moved atomically to
	its final destination, using the same process as
	`pack-write.c:write_idx_file`.

Signed-off-by: Vicent Marti <tanoku@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-30 12:19:22 -08:00
Vicent Marti
6b8fda2db1 pack-objects: use bitmaps when packing objects
In this patch, we use the bitmap API to perform the `Counting Objects`
phase in pack-objects, rather than a traditional walk through the object
graph. For a reasonably-packed large repo, the time to fetch and clone
is often dominated by the full-object revision walk during the Counting
Objects phase. Using bitmaps can reduce the CPU time required on the
server (and therefore start sending the actual pack data with less
delay).

For bitmaps to be used, the following must be true:

  1. We must be packing to stdout (as a normal `pack-objects` from
     `upload-pack` would do).

  2. There must be a .bitmap index containing at least one of the
     "have" objects that the client is asking for.

  3. Bitmaps must be enabled (they are enabled by default, but can be
     disabled by setting `pack.usebitmaps` to false, or by using
     `--no-use-bitmap-index` on the command-line).

If any of these is not true, we fall back to doing a normal walk of the
object graph.

Here are some sample timings from a full pack of `torvalds/linux` (i.e.
something very similar to what would be generated for a clone of the
repository) that show the speedup produced by various
methods:

    [existing graph traversal]
    $ time git pack-objects --all --stdout --no-use-bitmap-index \
			    </dev/null >/dev/null
    Counting objects: 3237103, done.
    Compressing objects: 100% (508752/508752), done.
    Total 3237103 (delta 2699584), reused 3237103 (delta 2699584)

    real    0m44.111s
    user    0m42.396s
    sys     0m3.544s

    [bitmaps only, without partial pack reuse; note that
     pack reuse is automatic, so timing this required a
     patch to disable it]
    $ time git pack-objects --all --stdout </dev/null >/dev/null
    Counting objects: 3237103, done.
    Compressing objects: 100% (508752/508752), done.
    Total 3237103 (delta 2699584), reused 3237103 (delta 2699584)

    real    0m5.413s
    user    0m5.604s
    sys     0m1.804s

    [bitmaps with pack reuse (what you get with this patch)]
    $ time git pack-objects --all --stdout </dev/null >/dev/null
    Reusing existing pack: 3237103, done.
    Total 3237103 (delta 0), reused 0 (delta 0)

    real    0m1.636s
    user    0m1.460s
    sys     0m0.172s

Signed-off-by: Vicent Marti <tanoku@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-30 12:19:22 -08:00
Nguyễn Thái Ngọc Duy
0a1bc12b6e receive-pack: allow pushes that update .git/shallow
The basic 8 steps to update .git/shallow does not fully apply here
because the user may choose to accept just a few refs (while fetch
always accepts all refs). The steps are modified a bit.

1-6. same as before. After calling assign_shallow_commits_to_refs at
   step 6, each shallow commit has a bitmap that marks all refs that
   require it.

7. mark all "ours" shallow commits that are reachable from any
   refs. We will need to do the original step 7 on them later.

8. go over all shallow commit bitmaps, mark refs that require new
   shallow commits.

9. setup a strict temporary shallow file to plug all the holes, even
   if it may cut some of our history short. This file is used by all
   hooks. The hooks could use --shallow-file=$GIT_DIR/shallow to
   overcome this and reach everything in current repo.

10. go over the new refs one by one. For each ref, do the reachability
   test if it needs a shallow commit on the list from step 7. Remove
   it if it's reachable from our refs. Gather all required shallow
   commits, run check_everything_connected() with the new ref, then
   install them to .git/shallow.

This mode is disabled by default and can be turned on with
receive.shallowupdate

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-10 16:14:18 -08:00
Michael Haggerty
0838bf47b3 fetch --prune: prune only based on explicit refspecs
The old behavior of "fetch --prune" was to prune whatever was being
fetched.  In particular, "fetch --prune --tags" caused tags not only
to be fetched, but also to be pruned.  This is inappropriate because
there is only one tags namespace that is shared among the local
repository and all remotes.  Therefore, if the user defines a local
tag and then runs "git fetch --prune --tags", then the local tag is
deleted.  Moreover, "--prune" and "--tags" can also be configured via
fetch.prune / remote.<name>.prune and remote.<name>.tagopt, making it
even less obvious that an invocation of "git fetch" could result in
tag lossage.

Since the command "git remote update" invokes "git fetch", it had the
same problem.

The command "git remote prune", on the other hand, disregarded the
setting of remote.<name>.tagopt, and so its behavior was inconsistent
with that of the other commands.

So the old behavior made it too easy to lose tags.  To fix this
problem, change "fetch --prune" to prune references based only on
refspecs specified explicitly by the user, either on the command line
or via remote.<name>.fetch.  Thus, tags are no longer made subject to
pruning by the --tags option or the remote.<name>.tagopt setting.

However, tags *are* still subject to pruning if they are fetched as
part of a refspec, and that is good.  For example:

* On the command line,

      git fetch --prune 'refs/tags/*:refs/tags/*'

  causes tags, and only tags, to be fetched and pruned, and is
  therefore a simple way for the user to get the equivalent of the old
  behavior of "--prune --tag".

* For a remote that was configured with the "--mirror" option, the
  configuration is set to include

      [remote "name"]
              fetch = +refs/*:refs/*

  , which causes tags to be subject to pruning along with all other
  references.  This is the behavior that will typically be desired for
  a mirror.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-10-30 14:16:37 -07:00