1
0
Fork 0
mirror of https://github.com/git/git.git synced 2024-11-06 01:03:02 +01:00
Commit graph

708 commits

Author SHA1 Message Date
Clemens Buchacher
f7835a251c preserve mtime of local clone
A local clone without hardlinks copies all objects, including dangling
ones, to the new repository. Since the mtimes are renewed, those
dangling objects cannot be pruned by "git gc --prune", even if they
would have been old enough for pruning in the original repository.

Instead, preserve mtime during copy. "git gc --prune" will then work
in the clone just like it did in the original.

Signed-off-by: Clemens Buchacher <drizzd@aon.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-09-13 01:32:26 -07:00
Jim Meyering
2b7ca830c6 use write_str_in_full helper to avoid literal string lengths
In 2d14d65 (Use a clearer style to issue commands to remote helpers,
2009-09-03) I happened to notice two changes like this:

-	write_in_full(helper->in, "list\n", 5);
+
+	strbuf_addstr(&buf, "list\n");
+	write_in_full(helper->in, buf.buf, buf.len);
+	strbuf_reset(&buf);

IMHO, it would be better to define a new function,

    static inline ssize_t write_str_in_full(int fd, const char *str)
    {
           return write_in_full(fd, str, strlen(str));
    }

and then use it like this:

-       strbuf_addstr(&buf, "list\n");
-       write_in_full(helper->in, buf.buf, buf.len);
-       strbuf_reset(&buf);
+       write_str_in_full(helper->in, "list\n");

Thus not requiring the added allocation, and still avoiding
the maintenance risk of literal string lengths.
These days, compilers are good enough that strlen("literal")
imposes no run-time cost.

Transformed via this:

    perl -pi -e \
        's/write_in_full\((.*?), (".*?"), \d+\)/write_str_in_full($1, $2)/'\
      $(git grep -l 'write_in_full.*"')

Signed-off-by: Jim Meyering <meyering@redhat.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-09-13 01:31:10 -07:00
Jeff King
75194438f4 push: make non-fast-forward help message configurable
This message is designed to help new users understand what
has happened when refs fail to push. However, it does not
help experienced users at all, and significantly clutters
the output, frequently dwarfing the regular status table and
making it harder to see.

This patch introduces a general configuration mechanism for
optional messages, with this push message as the first
example.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-09-11 21:33:20 -07:00
Junio C Hamano
aeb84b05ae core.whitespace: split trailing-space into blank-at-{eol,eof}
People who configured trailing-space depended on it to catch both extra
white space at the end of line, and extra blank lines at the end of file.
Earlier attempt to introduce only blank-at-eof gave them an escape hatch
to keep the old behaviour, but it is a regression until they explicitly
specify the new error class.

This introduces a blank-at-eol that only catches extra white space at the
end of line, and makes the traditional trailing-space a convenient synonym
to catch both blank-at-eol and blank-at-eof.  This way, people who used
trailing-space continue to catch both classes of errors.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-09-05 23:14:31 -07:00
Junio C Hamano
77b15bbd88 apply --whitespace=warn/error: diagnose blank at EOF
"git apply" strips new blank lines at EOF under --whitespace=fix option,
but neigher --whitespace=warn nor --whitespace=error paid any attention to
these errors.

Introduce a new whitespace error class, blank-at-eof, to make the
whitespace error handling more consistent.

The patch adds a new "linenr" field to the struct fragment in order to
record which line the hunk started in the input file, but this is needed
solely for reporting purposes.  The detection of this class of whitespace
errors cannot be done while parsing a patch like we do for all the other
classes of whitespace errors.  It instead has to wait until we find where
to apply the hunk, but at that point, we do not have an access to the
original line number in the input file anymore, hence the new field.

Depending on your point of view, this may be a bugfix that makes warn and
error in line with fix.  Or you could call it a new feature.  The line
between them is somewhat fuzzy in this case.

Strictly speaking, triggering more errors than before is a change in
behaviour that is not backward compatible, even though the reason for the
change is because the code was not checking for an error that it should
have.  People who do not want added blank lines at EOF to trigger an error
can disable the new error class.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-09-04 11:50:26 -07:00
Junio C Hamano
554555ac7d Merge branch 'lt/approxidate'
* lt/approxidate:
  fix approxidate parsing of relative months and years
  tests: add date printing and parsing tests
  refactor test-date interface
  Add date formatting and parsing functions relative to a given time
  Further 'approxidate' improvements
  Improve on 'approxidate'

Conflicts:
	date.c
2009-08-31 22:11:36 -07:00
Alex Riesen
33012fc429 Add date formatting and parsing functions relative to a given time
The main purpose is to allow predictable testing of the code.

Signed-off-by: Alex Riesen <raa.lkml@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-08-30 19:59:11 -07:00
Junio C Hamano
235db154f8 Merge branch 'mm/reset-report'
* mm/reset-report:
  reset: make the reminder output consistent with "checkout"
  Rename REFRESH_SAY_CHANGED to REFRESH_IN_PORCELAIN.
2009-08-28 19:39:26 -07:00
Junio C Hamano
3613339eaf Merge branch 'jc/maint-checkout-index-to-prefix'
* jc/maint-checkout-index-to-prefix:
  check_path(): allow symlinked directories to checkout-index --prefix
2009-08-25 14:47:56 -07:00
Nguyễn Thái Ngọc Duy
08aefc9e47 unpack-trees(): "enable" sparse checkout and load $GIT_DIR/info/sparse-checkout
This patch introduces core.sparseCheckout, which will control whether
sparse checkout support is enabled in unpack_trees()

It also loads sparse-checkout file that will be used in the next patch.
I split it out so the next patch will be shorter, easier to read.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-08-23 17:14:41 -07:00
Nguyễn Thái Ngọc Duy
e663db2f44 unpack-trees(): add CE_WT_REMOVE to remove on worktree alone
CE_REMOVE now removes both worktree and index versions. Sparse
checkout must be able to remove worktree version while keep the
index intact when checkout area is narrowed.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-08-23 17:13:33 -07:00
Nguyễn Thái Ngọc Duy
44a3691362 Introduce "skip-worktree" bit in index, teach Git to get/set this bit
Detail about this bit is in Documentation/git-update-index.txt.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-08-23 17:11:28 -07:00
Matthieu Moy
3deffc52d8 reset: make the reminder output consistent with "checkout"
git reset without argument displays a summary of the local modification,
like this:

    $ git reset
    Makefile: locally modified

Some people have problems with this; they look like an error message.

This patch makes its output mimic how "git checkout $another_branch"
reports the paths with local modifications.  "git add --refresh --verbose"
is changed in the same way.

It also adds a header to make it clear that the output is informative,
and not an error.

Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
2009-08-21 21:19:35 -07:00
Matthieu Moy
43673fddd3 Rename REFRESH_SAY_CHANGED to REFRESH_IN_PORCELAIN.
The change in the output is going to become more general than just saying
"changed", so let's make the variable name more general too.

Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-08-21 20:45:40 -07:00
Junio C Hamano
f00ecbe42b Merge branch 'cc/replace'
* cc/replace:
  t6050: check pushing something based on a replaced commit
  Documentation: add documentation for "git replace"
  Add git-replace to .gitignore
  builtin-replace: use "usage_msg_opt" to give better error messages
  parse-options: add new function "usage_msg_opt"
  builtin-replace: teach "git replace" to actually replace
  Add new "git replace" command
  environment: add global variable to disable replacement
  mktag: call "check_sha1_signature" with the replacement sha1
  replace_object: add a test case
  object: call "check_sha1_signature" with the replacement sha1
  sha1_file: add a "read_sha1_file_repl" function
  replace_object: add mechanism to replace objects found in "refs/replace/"
  refs: add a "for_each_replace_ref" function
2009-08-21 18:47:53 -07:00
Junio C Hamano
5e092b5bce Merge branch 'gb/apply-ignore-whitespace'
* gb/apply-ignore-whitespace:
  git apply: option to ignore whitespace differences
2009-08-21 18:47:48 -07:00
Junio C Hamano
da02ca508b check_path(): allow symlinked directories to checkout-index --prefix
Merlyn noticed that Documentation/install-doc-quick.sh no longer correctly
removes old installed documents when the target directory has a leading
path that is a symlink.  It turns out that "checkout-index --prefix" was
broken by recent b6986d8 (git-checkout: be careful about untracked
symlinks, 2009-07-29).

I suspect has_symlink_leading_path() could learn the third parameter
(prefix that is allowed to be symlinked directories) to allow us to retire
a similar function has_dirs_only_path().

Another avenue of fixing this I considered was to get rid of base_dir and
base_dir_len from "struct checkout", and instead make "git checkout-index"
when run with --prefix mkdir the leading path and chdir in there.  It
might be the best longer term solution to this issue, as the base_dir
feature is used only by that rather obscure codepath as far as I know.

But at least this patch should fix this breakage.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-08-18 03:32:45 -07:00
Giuseppe Bilotta
86c91f9179 git apply: option to ignore whitespace differences
Introduce --ignore-whitespace option and corresponding config bool to
ignore whitespace differences while applying patches, akin to the
'patch' program.

'git am', 'git rebase' and the bash git completion are made aware of
this option.

Signed-off-by: Giuseppe Bilotta <giuseppe.bilotta@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-08-05 11:59:56 -07:00
Linus Torvalds
b6986d8a75 git-checkout: be careful about untracked symlinks
This fixes the case where an untracked symlink that points at a directory
with tracked paths confuses the checkout logic, demostrated in t6035.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-07-29 20:24:28 -07:00
Junio C Hamano
130b04ab37 Merge branch 'js/maint-graft-unhide-true-parents'
* js/maint-graft-unhide-true-parents:
  git repack: keep commits hidden by a graft
  Add a test showing that 'git repack' throws away grafted-away parents

Conflicts:
	git-repack.sh
2009-07-25 00:45:03 -07:00
Johannes Schindelin
7f3140cd23 git repack: keep commits hidden by a graft
When you have grafts that pretend that a given commit has different
parents than the ones recorded in the commit object, it is dangerous
to let 'git repack' remove those hidden parents, as you can easily
remove the graft and end up with a broken repository.

So let's play it safe and keep those parent objects and everything
that is reachable by them, in addition to the grafted parents.

As this behavior can only be triggered by git pack-objects, and as that
command handles duplicate parents gracefully, we do not bother to cull
duplicated parents that may result by using both true and grafted
parents.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-07-24 09:10:16 -07:00
Junio C Hamano
bba0fd22ad push: do not give big warning when no preference is configured
If the message said "we will be changing the default in the future, so
this is to warn people who want to keep the current default what to do",
it would have made some sense, but as it stands, the message is merely an
unsolicited advertisement for a new feature, which it is not helpful at
all.  Squelch it.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-07-18 17:20:52 -07:00
Linus Torvalds
b9fd284657 Export thread-safe version of 'has_symlink_leading_path()'
The threaded index preloading will want it, so that it can avoid
locking by simply using a per-thread symlink/directory cache.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-07-09 20:05:19 -07:00
David Aguilar
003b33a8ad diff: generate pretty filenames in prep_temp_blob()
Naturally, prep_temp_blob() did not care about filenames.
As a result, GIT_EXTERNAL_DIFF and textconv generated
filenames such as ".diff_XXXXXX".

This modifies prep_temp_blob() to generate user-friendly
filenames when creating temporary files.

Diffing "name.ext" now generates "XXXXXX_name.ext".

Signed-off-by: David Aguilar <davvid@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-05-31 17:57:59 -07:00
Christian Couder
dae556bdb1 environment: add global variable to disable replacement
This new "read_replace_refs" global variable is set to 1 by
default, so that replace refs are used by default. But
reachability traversal and packing commands ("cmd_fsck",
"cmd_prune", "cmd_pack_objects", "upload_pack",
"cmd_unpack_objects") set it to 0, as they must work with the
original DAG.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-05-31 17:02:59 -07:00
Christian Couder
f5552aee39 sha1_file: add a "read_sha1_file_repl" function
This new function will replace "read_sha1_file". This latter function
becoming just a stub to call the former will a NULL "replacement"
argument.

This new function is needed because sometimes we need to use the
replacement sha1.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-05-31 17:02:59 -07:00
Felipe Contreras
4b25d091ba Fix a bunch of pointer declarations (codestyle)
Essentially; s/type* /type */ as per the coding guidelines.

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-05-01 15:17:31 -07:00
Johannes Schindelin
348df16679 Rename core.unreliableHardlinks to core.createObject
"Unreliable hardlinks" is a misleading description for what is happening.
So rename it to something less misleading.

Suggested by Linus Torvalds.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-04-29 16:50:07 -07:00
Johannes Schindelin
be66a6c43d Add an option not to use link(src, dest) && unlink(src) when that is unreliable
It seems that accessing NTFS partitions with ufsd (at least on my EeePC)
has an unnerving bug: if you link() a file and unlink() it right away,
the target of the link() will have the correct size, but consist of NULs.

It seems as if the calls are simply not serialized correctly, as single-stepping
through the function move_temp_to_file() works flawlessly.

As ufsd is "Commertial software" (sic!), I cannot fix it, and have to work
around it in Git.

At the same time, it seems that this fixes msysGit issues 222 and 229 to
assume that Windows cannot handle link() && unlink().

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Acked-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-04-25 09:49:21 -07:00
Junio C Hamano
fbdc05661d Merge branch 'jc/name-branch'
* jc/name-branch:
  Don't permit ref/branch names to end with ".lock"
  check_ref_format(): tighten refname rules
  strbuf_check_branch_ref(): a helper to check a refname for a branch
  Fix branch -m @{-1} newname
  check-ref-format --branch: give Porcelain a way to grok branch shorthand
  strbuf_branchname(): a wrapper for branch name shorthands
  Rename interpret/substitute nth_last_branch functions

Conflicts:
	Documentation/git-check-ref-format.txt
2009-04-06 00:43:44 -07:00
Junio C Hamano
03a39a9184 Merge branch 'jc/shared-literally'
* jc/shared-literally:
  t1301: loosen test for forced modes
  set_shared_perm(): sometimes we know what the final mode bits should look like
  move_temp_to_file(): do not forget to chmod() in "Coda hack" codepath
  Move chmod(foo, 0444) into move_temp_to_file()
  "core.sharedrepository = 0mode" should set, not loosen
2009-04-06 00:42:52 -07:00
Junio C Hamano
3c91bf6805 Merge branch 'jc/maint-1.6.0-keep-pack'
* jc/maint-1.6.0-keep-pack:
  pack-objects: don't loosen objects available in alternate or kept packs
  t7700: demonstrate repack flaw which may loosen objects unnecessarily
  Remove --kept-pack-only option and associated infrastructure
  pack-objects: only repack or loosen objects residing in "local" packs
  git-repack.sh: don't use --kept-pack-only option to pack-objects
  t7700-repack: add two new tests demonstrating repacking flaws

Conflicts:
	t/t7700-repack.sh
2009-04-01 22:34:19 -07:00
Junio C Hamano
17e61b8288 set_shared_perm(): sometimes we know what the final mode bits should look like
adjust_shared_perm() first obtains the mode bits from lstat(2), expecting
to find what the result of applying user's umask is, and then tweaks it
as necessary.  When the file to be adjusted is created with mkstemp(3),
however, the mode thusly obtained does not have anything to do with user's
umask, and we would need to start from 0444 in such a case and there is no
point running lstat(2) for such a path.

This introduces a new API set_shared_perm() to bypass the lstat(2) and
instead force setting the mode bits to the desired value directly.
adjust_shared_perm() becomes a thin wrapper to the function.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-03-28 08:02:15 -07:00
Junio C Hamano
2545c089e3 Merge branch 'fg/push-default'
* fg/push-default:
  builtin-push.c: Fix typo: "anythig" -> "anything"
  Display warning for default git push with no push.default config
  New config push.default to decide default behavior for push

Conflicts:
	Documentation/config.txt
2009-03-26 00:26:25 -07:00
Junio C Hamano
431b1969fc Rename interpret/substitute nth_last_branch functions
These allow you to say "git checkout @{-2}" to switch to the branch two
"branch switching" ago by pretending as if you typed the name of that
branch.  As it is likely that we will be introducing more short-hands to
write the name of a branch without writing it explicitly, rename the
functions from "nth_last_branch" to more generic "branch_name", to prepare
for different semantics.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-03-22 23:36:47 -07:00
Brandon Casey
4d6acb7041 Remove --kept-pack-only option and associated infrastructure
This option to pack-objects/rev-list was created to improve the -A and -a
options of repack.  It was found to be lacking in that it did not provide
the ability to differentiate between local and non-local kept packs, and
found to be unnecessary since objects residing in local kept packs can be
filtered out by the --honor-pack-keep option.

Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-03-20 13:32:33 -07:00
Junio C Hamano
ca8a36e6e0 Merge branch 'js/remote-improvements'
* js/remote-improvements: (23 commits)
  builtin-remote.c: no "commented out" code, please
  builtin-remote: new show output style for push refspecs
  builtin-remote: new show output style
  remote: make guess_remote_head() use exact HEAD lookup if it is available
  builtin-remote: add set-head subcommand
  builtin-remote: teach show to display remote HEAD
  builtin-remote: fix two inconsistencies in the output of "show <remote>"
  builtin-remote: make get_remote_ref_states() always populate states.tracked
  builtin-remote: rename variables and eliminate redundant function call
  builtin-remote: remove unused code in get_ref_states
  builtin-remote: refactor duplicated cleanup code
  string-list: new for_each_string_list() function
  remote: make match_refs() not short-circuit
  remote: make match_refs() copy src ref before assigning to peer_ref
  remote: let guess_remote_head() optionally return all matches
  remote: make copy_ref() perform a deep copy
  remote: simplify guess_remote_head()
  move locate_head() to remote.c
  move duplicated ref_newer() to remote.c
  move duplicated get_local_heads() to remote.c
  ...

Conflicts:
	builtin-clone.c
2009-03-17 18:55:06 -07:00
Junio C Hamano
a9bfe81309 Merge branch 'kb/checkout-optim'
* kb/checkout-optim:
  Revert "lstat_cache(): print a warning if doing ping-pong between cache types"
  checkout bugfix: use stat.mtime instead of stat.ctime in two places
  Makefile: Set compiler switch for USE_NSEC
  Create USE_ST_TIMESPEC and turn it on for Darwin
  Not all systems use st_[cm]tim field for ns resolution file timestamp
  Record ns-timestamps if possible, but do not use it without USE_NSEC
  write_index(): update index_state->timestamp after flushing to disk
  verify_uptodate(): add ce_uptodate(ce) test
  make USE_NSEC work as expected
  fix compile error when USE_NSEC is defined
  check_updates(): effective removal of cache entries marked CE_REMOVE
  lstat_cache(): print a warning if doing ping-pong between cache types
  show_patch_diff(): remove a call to fstat()
  write_entry(): use fstat() instead of lstat() when file is open
  write_entry(): cleanup of some duplicated code
  create_directories(): remove some memcpy() and strchr() calls
  unlink_entry(): introduce schedule_dir_for_removal()
  lstat_cache(): swap func(length, string) into func(string, length)
  lstat_cache(): generalise longest_match_lstat_cache()
  lstat_cache(): small cleanup and optimisation
2009-03-17 18:54:31 -07:00
Finn Arne Gangstad
521537476f New config push.default to decide default behavior for push
When "git push" is not told what refspecs to push, it pushes all matching
branches to the current remote.  For some workflows this default is not
useful, and surprises new users.  Some have even found that this default
behaviour is too easy to trigger by accident with unwanted consequences.

Introduce a new configuration variable "push.default" that decides what
action git push should take if no refspecs are given or implied by the
command line arguments or the current remote configuration.

Possible values are:

  'nothing'  : Push nothing;
  'matching' : Current default behaviour, push all branches that already
               exist in the current remote;
  'tracking' : Push the current branch to whatever it is tracking;
  'current'  : Push the current branch to a branch of the same name,
               i.e. HEAD.

Signed-off-by: Finn Arne Gangstad <finnag@pvv.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-03-17 14:50:21 -07:00
Junio C Hamano
aec813062b Merge branch 'jc/maint-1.6.0-keep-pack'
* jc/maint-1.6.0-keep-pack:
  is_kept_pack(): final clean-up
  Simplify is_kept_pack()
  Consolidate ignore_packed logic more
  has_sha1_kept_pack(): take "struct rev_info"
  has_sha1_pack(): refactor "pretend these packs do not exist" interface
  git-repack: resist stray environment variable
2009-03-11 13:49:56 -07:00
Junio C Hamano
69e020ae00 is_kept_pack(): final clean-up
Now is_kept_pack() is just a member lookup into a structure, we can write
it as such.

Also rewrite the sole caller of has_sha1_kept_pack() to switch on the
criteria the callee uses (namely, revs->kept_pack_only) between calling
has_sha1_kept_pack() and has_sha1_pack(), so that these two callees do not
have to take a pointer to struct rev_info as an argument.

This removes the header file dependency issue temporarily introduced by
the earlier commit, so we revert changes associated to that as well.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-02-28 01:06:06 -08:00
Junio C Hamano
386cb77210 Consolidate ignore_packed logic more
This refactors three loops that check if a given packfile is on the
ignore_packed list into a function is_kept_pack().  The function returns
false for a pack on the list, and true for a pack not on the list, because
this list is solely used by "git repack" to pass list of packfiles that do
not have corresponding .keep files, i.e. a packfile not on the list is
"kept".

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-02-28 01:06:06 -08:00
Junio C Hamano
b8431b033f has_sha1_kept_pack(): take "struct rev_info"
Its "ignore_packed" parameter always comes from struct rev_info.  This
patch makes the function take a pointer to the surrounding structure, so
that the refactoring in the next patch becomes easier to review.

There is an unfortunate header file dependency and the easiest workaround
is to temporarily move the function declaration from cache.h to
revision.h; this will be moved back to cache.h once the function loses
this "ignore_packed" parameter altogether in the later part of the
series.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-02-28 01:06:06 -08:00
Junio C Hamano
cd673c1f17 has_sha1_pack(): refactor "pretend these packs do not exist" interface
Most of the callers of this function except only one pass NULL to its last
parameter, ignore_packed.

Introduce has_sha1_kept_pack() function that has the function signature
and the semantics of this function, and convert the sole caller that does
not pass NULL to call this new function.

All other callers and has_sha1_pack() lose the ignore_packed parameter.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-02-28 01:06:06 -08:00
Jeff King
5483f79998 refactor find_ref_by_name() to accept const list
Since it doesn't actually touch its argument, this makes
sense.

However, we still want to return a non-const version (which
requires a cast) so that this:

  struct ref *a, *b;
  a = find_ref_by_name(b);

works. Unfortunately, you can also silently strip the const
from a variable:

  struct ref *a;
  const struct ref *b;
  a = find_ref_by_name(b);

This is a classic C const problem because there is no way to
say "return the type with the same constness that was passed
to us"; we provide the same semantics as standard library
functions like strchr.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Jay Soffian <jaysoffian@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-02-26 00:49:44 -08:00
Kjetil Barvik
e1afca4fd3 write_index(): update index_state->timestamp after flushing to disk
Since this timestamp is used to check for racy-clean files, it is
important to keep it uptodate.

For the 'git checkout' command without the '-q' option, this make a
huge difference.  Before, each and every file which was updated, was
racy-clean after the call to unpack_trees() and write_index() but
before the GIT process ended.

And because of the call to show_local_changes() in builtin-checkout.c,
we ended up reading those files back into memory, doing a SHA1 to
check if the files was really different from the index.  And, of
course, no file was different.

With this fix, 'git checkout' without the '-q' option should now be
almost as fast as with the '-q' option, but not quite, as we still do
some few lstat(2) calls more without the '-q' option.

Below is some average numbers for 10 checkout's to v2.6.27 and 10 to
v2.6.25 of the Linux kernel, to show the difference:

before (git version 1.6.2.rc1.256.g58a87):
 7.860 user  2.427 sys  19.465 real  52.8% CPU  faults: 0 major 95331 minor
after:
 6.184 user  2.160 sys  17.619 real  47.4% CPU  faults: 0 major 38994 minor

Signed-off-by: Kjetil Barvik <barvik@broadpark.no>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-02-23 18:04:20 -08:00
Linus Torvalds
7dff9b30ea Support 'raw' date format
Talking about --date, one thing I wanted for the 1234567890 date was to
get things in the raw format. Sure, you get them with --pretty=raw, but it
felt a bit sad that you couldn't just ask for the date in raw format.

So here's a throw-away patch (meaning: I won't be re-sending it, because I
really don't think it's a big deal) to add "--date=raw". It just prints
out the internal raw git format - seconds since epoch plus timezone (put
another way: 'date +"%s %z"' format)

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-02-20 21:45:42 -08:00
Junio C Hamano
8c5b85ce87 Merge branch 'maint'
* maint:
  More friendly message when locking the index fails.
  Document git blame --reverse.
  Documentation: Note file formats send-email accepts
2009-02-19 23:44:07 -08:00
Matthieu Moy
e43a6fd3e9 More friendly message when locking the index fails.
Just saying that index.lock exists doesn't tell the user _what_ to do
to fix the problem. We should give an indication that it's normally
safe to delete index.lock after making sure git isn't running here.

Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-02-19 23:22:57 -08:00
Johannes Schindelin
4fcc86b07d Introduce the function strip_path_suffix()
The function strip_path_suffix() will try to strip a given suffix from
a given path.  The suffix must start at a directory boundary (i.e. "core"
is not a path suffix of "libexec/git-core", but "git-core" is).

Arbitrary runs of directory separators ("slashes") are assumed identical.

Example:

	strip_path_suffix("C:\\msysgit/\\libexec\\git-core",
		"libexec///git-core", &prefix)

will set prefix to "C:\\msysgit" and return 0.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Acked-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-02-19 22:45:48 -08:00
Kjetil Barvik
fba2f38a2c make USE_NSEC work as expected
Since the filesystem ext4 is now defined as stable in Linux v2.6.28,
and ext4 supports nanonsecond resolution timestamps natively, it is
time to make USE_NSEC work as expected.

This will make racy git situations less likely to happen.  For 'git
checkout' this means it will be less likely that we have to open, read
the contents of the file into RAM, and check if file is really
modified or not.  The result sould be a litle less used CPU time, less
pagefaults and a litle faster program, at least for 'git checkout'.

Since the number of possible racy git situations would increase when
disks gets faster, this patch would be more and more helpfull as times
go by.  For a fast Solid State Disk, this patch should be helpfull.

Note that, when file operations starts to take less than 1 nanosecond,
one would again start to get more racy git situations.

For more info on racy git, see Documentation/technical/racy-git.txt
For more info on ext4, see http://kernelnewbies.org/Ext4

Signed-off-by: Kjetil Barvik <barvik@broadpark.no>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-02-19 21:39:48 -08:00
Kjetil Barvik
36419c8ee4 check_updates(): effective removal of cache entries marked CE_REMOVE
Below is oprofile output from GIT command 'git chekcout -q my-v2.6.25'
(move from tag v2.6.27 to tag v2.6.25 of the Linux kernel):

CPU: Core 2, speed 1999.95 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit
                         mask of 0x00 (Unhalted core cycles) count 20000
Counted INST_RETIRED_ANY_P events (number of instructions retired) with a
                           unit mask of 0x00 (No unit mask) count 20000
CPU_CLK_UNHALT...|INST_RETIRED:2...|
  samples|      %|  samples|      %|
------------------------------------
   409247 100.000    342878 100.000 git
        CPU_CLK_UNHALT...|INST_RETIRED:2...|
          samples|      %|  samples|      %|
        ------------------------------------
           260476 63.6476    257843 75.1996 libz.so.1.2.3
           100876 24.6492     64378 18.7758 kernel-2.6.28.4_2.vmlinux
            30850  7.5382      7874  2.2964 libc-2.9.so
            14775  3.6103      8390  2.4469 git
             2020  0.4936      4325  1.2614 libcrypto.so.0.9.8
              191  0.0467        32  0.0093 libpthread-2.9.so
               58  0.0142        36  0.0105 ld-2.9.so
                1 2.4e-04         0       0 libldap-2.3.so.0.2.31

Detail list of the top 20 function entries (libz counted in one blob):

CPU_CLK_UNHALTED  INST_RETIRED_ANY_P
samples  %        samples  %        image name               symbol name
260476   63.6862  257843   75.2725  libz.so.1.2.3            /lib/libz.so.1.2.3
16587     4.0555  3636      1.0615  libc-2.9.so              memcpy
7710      1.8851  277       0.0809  libc-2.9.so              memmove
3679      0.8995  1108      0.3235  kernel-2.6.28.4_2.vmlinux d_validate
3546      0.8670  2607      0.7611  kernel-2.6.28.4_2.vmlinux __getblk
3174      0.7760  1813      0.5293  libc-2.9.so              _int_malloc
2396      0.5858  3681      1.0746  kernel-2.6.28.4_2.vmlinux copy_to_user
2270      0.5550  2528      0.7380  kernel-2.6.28.4_2.vmlinux __link_path_walk
2205      0.5391  1797      0.5246  kernel-2.6.28.4_2.vmlinux ext4_mark_iloc_dirty
2103      0.5142  1203      0.3512  kernel-2.6.28.4_2.vmlinux find_first_zero_bit
2077      0.5078  997       0.2911  kernel-2.6.28.4_2.vmlinux do_get_write_access
2070      0.5061  514       0.1501  git                      cache_name_compare
2043      0.4995  1501      0.4382  kernel-2.6.28.4_2.vmlinux rcu_irq_exit
2022      0.4944  1732      0.5056  kernel-2.6.28.4_2.vmlinux __ext4_get_inode_loc
2020      0.4939  4325      1.2626  libcrypto.so.0.9.8       /usr/lib/libcrypto.so.0.9.8
1965      0.4804  1384      0.4040  git                      patch_delta
1708      0.4176  984       0.2873  kernel-2.6.28.4_2.vmlinux rcu_sched_grace_period
1682      0.4112  727       0.2122  kernel-2.6.28.4_2.vmlinux sysfs_slab_alias
1659      0.4056  290       0.0847  git                      find_pack_entry_one
1480      0.3619  1307      0.3816  kernel-2.6.28.4_2.vmlinux ext4_writepage_trans_blocks

Notice the memmove line, where the CPU did 7710 / 277 = 27.8 cycles
per instruction, and compared to the total cycles spent inside the
source code of GIT for this command, all the memmove() calls
translates to (7710 * 100) / 14775 = 52.2% of this.

Retesting with a GIT program compiled for gcov usage, I found out that
the memmove() calls came from remove_index_entry_at() in read-cache.c,
where we have:

        memmove(istate->cache + pos,
                istate->cache + pos + 1,
                (istate->cache_nr - pos) * sizeof(struct cache_entry *));

remove_index_entry_at() is called 4902 times from check_updates() in
unpack-trees.c, and each time called we move each cache_entry pointers
(from the removed one) one step to the left.

Since we have 28828 entries in the cache this time, and if we on
average move half of them each time, we in total move approximately
4902 * 0.5 * 28828 * 4 = 282 629 712 bytes, or twice this amount if
each pointer is 8 bytes (64 bit).

OK, is seems that the function check_updates() is called 28 times, so
the estimated guess above had been more correct if check_updates() had
been called only once, but the point is: we get lots of bytes moved.

To fix this, and use an O(N) algorithm instead, where N is the number
of cache_entries, we delete/remove all entries in one loop through all
entries.

From a retest, the new remove_marked_cache_entries() from the patch
below, ended up with the following output line from oprofile:

46        0.0105  15        0.0041  git                      remove_marked_cache_entries

If we can trust the numbers from oprofile in this case, we saved
approximately ((7710 - 46) * 20000) / (2 * 1000 * 1000 * 1000) = 0.077
seconds CPU time with this fix for this particular test.  And notice
that now the CPU did only 46 / 15 = 3.1 cycles/instruction.

Signed-off-by: Kjetil Barvik <barvik@broadpark.no>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-02-18 17:11:21 -08:00
Junio C Hamano
160d2bc353 Merge branch 'ms/mailmap'
* ms/mailmap:
  Move mailmap documentation into separate file
  Change current mailmap usage to do matching on both name and email of author/committer.
  Add map_user() and clear_mailmap() to mailmap
  Add find_insert_index, insert_at_index and clear_func functions to string_list
  Add mailmap.file as configurational option for mailmap location
2009-02-15 01:44:15 -08:00
Junio C Hamano
954cfb5cfd Revert "Merge branch 'js/notes'"
This reverts commit 7b75b331f6, reversing
changes made to 5d680a67d7.
2009-02-10 21:32:10 -08:00
Junio C Hamano
6e5d7ddc49 Merge branch 'js/maint-1.6.0-path-normalize'
* js/maint-1.6.0-path-normalize:
  Remove unused normalize_absolute_path()
  Test and fix normalize_path_copy()
  Fix GIT_CEILING_DIRECTORIES on Windows
  Move sanitary_path_copy() to path.c and rename it to normalize_path_copy()
  Make test-path-utils more robust against incorrect use
2009-02-10 21:30:52 -08:00
Junio C Hamano
fd8475d9fb Merge branch 'maint'
* maint:
  Clear the delta base cache during fast-import checkpoint
2009-02-10 21:30:45 -08:00
Junio C Hamano
9b27ea9518 Merge branch 'maint-1.6.0' into maint
* maint-1.6.0:
  Clear the delta base cache during fast-import checkpoint
2009-02-10 15:32:26 -08:00
Shawn O. Pearce
3d20c636af Clear the delta base cache during fast-import checkpoint
Otherwise we may reuse the same memory address for a totally
different "struct packed_git", and a previously cached object from
the prior occupant might be returned when trying to unpack an object
from the new pack.

Found-by: Daniel Barkalow <barkalow@iabervon.org>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-02-10 15:30:59 -08:00
Kjetil Barvik
7847892716 unlink_entry(): introduce schedule_dir_for_removal()
Currently inside unlink_entry() if we get a successful removal of one
file with unlink(), we try to remove the leading directories each and
every time.  So if one directory containing 200 files is moved to an
other location we get 199 failed calls to rmdir() and 1 successful
call.

To fix this and avoid some unnecessary calls to rmdir(), we schedule
each directory for removal and wait much longer before we do the real
call to rmdir().

Since the unlink_entry() function is called with alphabetically sorted
names, this new function end up being very effective to avoid
unnecessary calls to rmdir().  In some cases over 95% of all calls to
rmdir() is removed with this patch.

Signed-off-by: Kjetil Barvik <barvik@broadpark.no>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-02-09 20:59:26 -08:00
Kjetil Barvik
571998921d lstat_cache(): swap func(length, string) into func(string, length)
Swap function argument pair (length, string) into (string, length) to
conform with the commonly used order inside the GIT source code.

Also, add a note about this fact into the coding guidelines.

Signed-off-by: Kjetil Barvik <barvik@broadpark.no>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-02-09 20:59:26 -08:00
Marius Storm-Olsen
d551a48816 Add mailmap.file as configurational option for mailmap location
This allows us to augment the repo mailmap file, and to use
mailmap files elsewhere than the repository root. Meaning
that the entries in mailmap.file will override the entries
in "./.mailmap", should they match.

Signed-off-by: Marius Storm-Olsen <marius@trolltech.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-02-08 12:36:26 -08:00
Johannes Sixt
f2a782b8ba Remove unused normalize_absolute_path()
This function is now superseded by normalize_path_copy().

Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-02-07 12:23:30 -08:00
Johannes Sixt
f3cad0ad82 Move sanitary_path_copy() to path.c and rename it to normalize_path_copy()
This function and normalize_absolute_path() do almost the same thing. The
former already works on Windows, but the latter crashes.

In subsequent changes we will remove normalize_absolute_path(). Here we
make the replacement function reusable. On the way we rename it to reflect
that it does some path normalization. Apart from that this is only moving
around code.

Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-02-07 12:23:29 -08:00
Junio C Hamano
7b75b331f6 Merge branch 'js/notes'
* js/notes:
  git-notes: fix printing of multi-line notes
  notes: fix core.notesRef documentation
  Add an expensive test for git-notes
  Speed up git notes lookup
  Add a script to edit/inspect notes
  Introduce commit notes

Conflicts:
	pretty.c
2009-02-05 19:40:39 -08:00
Junio C Hamano
141b6b83d7 Merge branch 'lt/maint-wrap-zlib' into maint
* lt/maint-wrap-zlib:
  Wrap inflate and other zlib routines for better error reporting

Conflicts:
	http-push.c
	http-walker.c
	sha1_file.c
2009-02-05 18:01:00 -08:00
Junio C Hamano
8712b3cdb0 Merge branch 'tr/previous-branch'
* tr/previous-branch:
  t1505: remove debugging cruft
  Simplify parsing branch switching events in reflog
  Introduce for_each_recent_reflog_ent().
  interpret_nth_last_branch(): plug small memleak
  Fix reflog parsing for a malformed branch switching entry
  Fix parsing of @{-1}@{1}
  interpret_nth_last_branch(): avoid traversing the reflog twice
  checkout: implement "-" abbreviation, add docs and tests
  sha1_name: support @{-N} syntax in get_sha1()
  sha1_name: tweak @{-N} lookup
  checkout: implement "@{-N}" shortcut name for N-th last branch

Conflicts:
	sha1_name.c
2009-01-28 15:00:27 -08:00
Junio C Hamano
0990e7aaaa Merge branch 'kb/lstat-cache'
* kb/lstat-cache:
  lstat_cache(): introduce clear_lstat_cache() function
  lstat_cache(): introduce invalidate_lstat_cache() function
  lstat_cache(): introduce has_dirs_only_path() function
  lstat_cache(): introduce has_symlink_or_noent_leading_path() function
  lstat_cache(): more cache effective symlink/directory detection
2009-01-25 17:13:34 -08:00
Junio C Hamano
d64d4835b8 Merge branch 'cb/add-pathspec'
* cb/add-pathspec:
  remove pathspec_match, use match_pathspec instead
  clean up pathspec matching
2009-01-25 17:13:11 -08:00
Junio C Hamano
36dd939393 Merge branch 'lt/maint-wrap-zlib'
* lt/maint-wrap-zlib:
  Wrap inflate and other zlib routines for better error reporting

Conflicts:
	http-push.c
	http-walker.c
	sha1_file.c
2009-01-21 16:55:17 -08:00
Kjetil Barvik
bda6eb0da9 lstat_cache(): introduce clear_lstat_cache() function
If you want to completely clear the contents of the lstat_cache(), then
call this new function.

Signed-off-by: Kjetil Barvik <barvik@broadpark.no>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-01-18 13:58:34 -08:00
Kjetil Barvik
aeabab5c71 lstat_cache(): introduce invalidate_lstat_cache() function
In some cases it could maybe be necessary to say to the cache that
"Hey, I deleted/changed the type of this pathname and if you currently
have it inside your cache, you should deleted it".

This patch introduce a function which support this.

Signed-off-by: Kjetil Barvik <barvik@broadpark.no>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-01-18 13:58:31 -08:00
Kjetil Barvik
bad4a54fa6 lstat_cache(): introduce has_dirs_only_path() function
The create_directories() function in entry.c currently calls stat()
or lstat() for each path component of the pathname 'path' each and every
time.  For the 'git checkout' command, this function is called on each
file for which we must do an update (ce->ce_flags & CE_UPDATE), so we get
lots and lots of calls.

To fix this, we make a new wrapper to the lstat_cache() function, and
call the wrapper function instead of the calls to the stat() or the
lstat() functions.  Since the paths given to the create_directories()
function, is sorted alphabetically, the new wrapper would be very
cache effective in this situation.

To support it we must update the lstat_cache() function to be able to
say that "please test the complete length of 'name'", and also to give
it the length of a prefix, where the cache should use the stat()
function instead of the lstat() function to test each path component.

Thanks to Junio C Hamano, Linus Torvalds and Rene Scharfe for valuable
comments to this patch!

Signed-off-by: Kjetil Barvik <barvik@broadpark.no>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-01-18 13:54:54 -08:00
Kjetil Barvik
09c9306658 lstat_cache(): introduce has_symlink_or_noent_leading_path() function
In some cases, especially inside the unpack-trees.c file, and inside
the verify_absent() function, we can avoid some unnecessary calls to
lstat(), if the lstat_cache() function can also be told to keep track
of non-existing directories.

So we update the lstat_cache() function to handle this new fact,
introduce a new wrapper function, and the result is that we save lots
of lstat() calls for a removed directory which previously contained
lots of files, when we call this new wrapper of lstat_cache() instead
of the old one.

We do similar changes inside the unlink_entry() function, since if we
can already say that the leading directory component of a pathname
does not exist, it is not necessary to try to remove a pathname below
it!

Thanks to Junio C Hamano, Linus Torvalds and Rene Scharfe for valuable
comments to this patch!

Signed-off-by: Kjetil Barvik <barvik@broadpark.no>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-01-18 13:54:49 -08:00
Junio C Hamano
ae5a6c3684 checkout: implement "@{-N}" shortcut name for N-th last branch
Implement a shortcut @{-N} for the N-th last branch checked out, that
works by parsing the reflog for the message added by previous
git-checkout invocations.  We expand the @{-N} to the branch name, so
that you end up on an attached HEAD on that branch.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-01-17 18:36:49 -08:00
Clemens Buchacher
0b50922abf remove pathspec_match, use match_pathspec instead
Both versions have the same functionality. This removes any
redundancy.

This also adds makes two extensions to match_pathspec:

- If pathspec is NULL, return 1. This reflects the behavior of git
  commands, for which no paths usually means "match all paths".

- If seen is NULL, do not use it.

Signed-off-by: Clemens Buchacher <drizzd@aon.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-01-14 19:18:44 -08:00
Christian Couder
c2c5b27051 sha1_file: make "read_object" static
This function is only used from "sha1_file.c".

And as we want to add a "replace_object" hook in "read_sha1_file",
we must not let people bypass the hook using something other than
"read_sha1_file".

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-01-13 00:14:55 -08:00
Linus Torvalds
39c68542fc Wrap inflate and other zlib routines for better error reporting
R. Tyler Ballance reported a mysterious transient repository corruption;
after much digging, it turns out that we were not catching and reporting
memory allocation errors from some calls we make to zlib.

This one _just_ wraps things; it doesn't do the "retry on low memory
error" part, at least not yet. It is an independent issue from the
reporting.  Some of the errors are expected and passed back to the caller,
but we die when zlib reports it failed to allocate memory for now.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-01-11 02:13:06 -08:00
Johannes Schindelin
879ef2485d Introduce commit notes
Commit notes are blobs which are shown together with the commit
message.  These blobs are taken from the notes ref, which you can
configure by the config variable core.notesRef, which in turn can
be overridden by the environment variable GIT_NOTES_REF.

The notes ref is a branch which contains "files" whose names are
the names of the corresponding commits (i.e. the SHA-1).

The rationale for putting this information into a ref is this: we
want to be able to fetch and possibly union-merge the notes,
maybe even look at the date when a note was introduced, and we
want to store them efficiently together with the other objects.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-12-21 02:47:21 -08:00
Junio C Hamano
de0db42278 Merge branch 'maint'
* maint:
  fsck: reduce stack footprint
  make sure packs to be replaced are closed beforehand
2008-12-11 00:36:31 -08:00
Nicolas Pitre
c74faea19e make sure packs to be replaced are closed beforehand
Especially on Windows where an opened file cannot be replaced, make
sure pack-objects always close packs it is about to replace. Even on
non Windows systems, this could save potential bad results if ever
objects were to be read from the new pack file using offset from the old
index.

This should fix t5303 on Windows.

Signed-off-by: Nicolas Pitre <nico@cam.org>
Tested-by: Johannes Sixt <j6t@kdbg.org> (MinGW)
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-12-10 17:56:05 -08:00
Junio C Hamano
0fd9d7e66d Merge branch 'bc/maint-keep-pack' into maint
* bc/maint-keep-pack:
  repack: only unpack-unreachable if we are deleting redundant packs
  t7700: test that 'repack -a' packs alternate packed objects
  pack-objects: extend --local to mean ignore non-local loose objects too
  sha1_file.c: split has_loose_object() into local and non-local counterparts
  t7700: demonstrate mishandling of loose objects in an alternate ODB
  builtin-gc.c: use new pack_keep bitfield to detect .keep file existence
  repack: do not fall back to incremental repacking with [-a|-A]
  repack: don't repack local objects in packs with .keep file
  pack-objects: new option --honor-pack-keep
  packed_git: convert pack_local flag into a bitfield and add pack_keep
  t7700: demonstrate mishandling of objects in packs with a .keep file
2008-12-02 23:00:04 -08:00
Junio C Hamano
388b2acd6e git add --intent-to-add: fix removal of cached emptiness
This uses the extended index flag mechanism introduced earlier to mark
the entries added to the index via "git add -N" with CE_INTENT_TO_ADD.

The logic to detect an "intent to add" entry for the purpose of allowing
"git rm --cached $path" is tightened to check not just for a staged empty
blob, but with the CE_INTENT_TO_ADD bit.  This protects an empty blob that
was explicitly added and then modified in the work tree from being dropped
with this sequence:

	$ >empty
	$ git add empty
	$ echo "non empty" >empty
	$ git rm --cached empty

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-11-28 19:58:24 -08:00
Junio C Hamano
fe60dff744 Merge branch 'nd/narrow' (early part) into jc/add-i-t-a
* 'nd/narrow' (early part):
  Extend index to save more flags
2008-11-28 17:22:35 -08:00
Junio C Hamano
2af9664776 Merge branch 'lt/preload-lstat'
* lt/preload-lstat:
  Fix index preloading for racy dirty case
  Add cache preload facility
2008-11-27 19:24:13 -08:00
Junio C Hamano
47a792539a Merge branch 'jk/commit-v-strip'
* jk/commit-v-strip:
  status: show "-v" diff even for initial commit
  wt-status: refactor initial commit printing
  define empty tree sha1 as a macro
2008-11-16 00:48:59 -08:00
Linus Torvalds
671c9b7e31 Add cache preload facility
This can do the lstat() storm in parallel, giving potentially much
improved performance for cold-cache cases or things like NFS that have
weak metadata caching.

Just use "read_cache_preload()" instead of "read_cache()" to force an
optimistic preload of the index stat data.  The function takes a
pathspec as its argument, allowing us to preload only the relevant
portion of the index.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-11-14 19:11:49 -08:00
Junio C Hamano
7b51b77dbc Merge branch 'np/pack-safer'
* np/pack-safer:
  t5303: fix printf format string for portability
  t5303: work around printf breakage in dash
  pack-objects: don't leak pack window reference when splitting packs
  extend test coverage for latest pack corruption resilience improvements
  pack-objects: allow "fixing" a corrupted pack without a full repack
  make find_pack_revindex() aware of the nasty world
  make check_object() resilient to pack corruptions
  make packed_object_info() resilient to pack corruptions
  make unpack_object_header() non fatal
  better validation on delta base object offsets
  close another possibility for propagating pack corruption
2008-11-12 22:26:35 -08:00
Junio C Hamano
ecbbfb15a4 Merge branch 'bc/maint-keep-pack'
* bc/maint-keep-pack:
  t7700: test that 'repack -a' packs alternate packed objects
  pack-objects: extend --local to mean ignore non-local loose objects too
  sha1_file.c: split has_loose_object() into local and non-local counterparts
  t7700: demonstrate mishandling of loose objects in an alternate ODB
  builtin-gc.c: use new pack_keep bitfield to detect .keep file existence
  repack: do not fall back to incremental repacking with [-a|-A]
  repack: don't repack local objects in packs with .keep file
  pack-objects: new option --honor-pack-keep
  packed_git: convert pack_local flag into a bitfield and add pack_keep
  t7700: demonstrate mishandling of objects in packs with a .keep file
2008-11-12 22:00:43 -08:00
Junio C Hamano
6cd3729eae Merge branch 'maint'
* maint:
  Start 1.6.0.5 cycle
  Fix pack.packSizeLimit and --max-pack-size handling
  checkout: Fix "initial checkout" detection
  Remove the period after the git-check-attr summary

Conflicts:
	RelNotes
2008-11-12 15:03:57 -08:00
Junio C Hamano
fa7b3c2f75 checkout: Fix "initial checkout" detection
Earlier commit 5521883 (checkout: do not lose staged removal, 2008-09-07)
tightened the rule to prevent switching branches from losing local
changes, so that staged removal of paths can be protected, while
attempting to keep a loophole to still allow a special case of switching
out of an un-checked-out state.

However, the loophole was made a bit too tight, and did not allow
switching from one branch (in an un-checked-out state) to check out
another branch.

The change to builtin-checkout.c in this commit loosens it to allow this,
by not insisting the original commit and the new commit to be the same.

It also introduces a new function, is_index_unborn (and an associated
macro, is_cache_unborn), to check if the repository is truly in an
un-checked-out state more reliably, by making sure that $GIT_INDEX_FILE
did not exist when populating the in-core index structure.  A few places
the earlier commit 5521883 added the check for the initial checkout
condition are updated to use this function.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-11-12 14:16:50 -08:00
Jeff King
14d9c57896 define empty tree sha1 as a macro
This can potentially be used in a few places, so let's make
it available to all parts of the code.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-11-12 12:52:21 -08:00
Brandon Casey
0f4dc14ac4 sha1_file.c: split has_loose_object() into local and non-local counterparts
Signed-off-by: Brandon Casey <drafnel@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-11-12 10:29:22 -08:00
Brandon Casey
8d25931d6f packed_git: convert pack_local flag into a bitfield and add pack_keep
pack_keep will be set when a pack file has an associated .keep file.

Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-11-12 10:28:08 -08:00
Junio C Hamano
8b1981d32b Merge branch 'ar/maint-mksnpath' into maint
* ar/maint-mksnpath:
  Use git_pathdup instead of xstrdup(git_path(...))
  git_pathdup: returns xstrdup-ed copy of the formatted path
  Fix potentially dangerous use of git_path in ref.c
  Add git_snpath: a .git path formatting routine with output buffer
  Fix potentially dangerous uses of mkpath and git_path
  Fix mkpath abuse in dwim_ref and dwim_log of sha1_name.c
  Add mksnpath which allows you to specify the output buffer

Conflicts:
	builtin-revert.c
	rerere.c
2008-11-08 16:13:19 -08:00
Junio C Hamano
3b8572a429 Merge branch 'mv/maint-branch-m-symref' into maint
* mv/maint-branch-m-symref:
  update-ref --no-deref -d: handle the case when the pointed ref is packed
  git branch -m: forbid renaming of a symref
  Fix git update-ref --no-deref -d.
  rename_ref(): handle the case when the reflog of a ref does not exist
  Fix git branch -m for symrefs.
2008-11-08 16:07:37 -08:00
Junio C Hamano
a1a846a19e Merge branch 'ar/mksnpath'
* ar/mksnpath:
  Use git_pathdup instead of xstrdup(git_path(...))
  git_pathdup: returns xstrdup-ed copy of the formatted path
  Fix potentially dangerous use of git_path in ref.c
  Add git_snpath: a .git path formatting routine with output buffer
  Fix potentially dangerous uses of mkpath and git_path
  Fix potentially dangerous uses of mkpath and git_path
  Fix mkpath abuse in dwim_ref and dwim_log of sha1_name.c
  Add mksnpath which allows you to specify the output buffer

Conflicts:
	builtin-revert.c
2008-11-05 11:35:53 -08:00
Junio C Hamano
efcce2e1f0 Merge branch 'mv/maint-branch-m-symref'
* mv/maint-branch-m-symref:
  update-ref --no-deref -d: handle the case when the pointed ref is packed
  git branch -m: forbid renaming of a symref
  Fix git update-ref --no-deref -d.
  rename_ref(): handle the case when the reflog of a ref does not exist
  Fix git branch -m for symrefs.
2008-11-05 11:33:19 -08:00
Nicolas Pitre
09ded04b7e make unpack_object_header() non fatal
It is possible to have pack corruption in the object header.  Currently
unpack_object_header() simply die() on them instead of letting the caller
deal with that gracefully.

So let's have unpack_object_header() return an error instead, and find
a better name for unpack_object_header_gently() in that context.  All
callers of unpack_object_header() are ready for it.

Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-11-02 15:22:34 -08:00
Nicolas Pitre
0e8189e270 close another possibility for propagating pack corruption
Abstract
--------

With index v2 we have a per object CRC to allow quick and safe reuse of
pack data when repacking.  This, however, doesn't currently prevent a
stealth corruption from being propagated into a new pack when _not_
reusing pack data as demonstrated by the modification to t5302 included
here.

The Context
-----------

The Git database is all checksummed with SHA1 hashes.  Any kind of
corruption can be confirmed by verifying this per object hash against
corresponding data.  However this can be costly to perform systematically
and therefore this check is often not performed at run time when
accessing the object database.

First, the loose object format is entirely compressed with zlib which
already provide a CRC verification of its own when inflating data.  Any
disk corruption would be caught already in this case.

Then, packed objects are also compressed with zlib but only for their
actual payload.  The object headers and delta base references are not
deflated for obvious performance reasons, however this leave them
vulnerable to potentially undetected disk corruptions.  Object types
are often validated against the expected type when they're requested,
and deflated size must always match the size recorded in the object header,
so those cases are pretty much covered as well.

Where corruptions could go unnoticed is in the delta base reference.
Of course, in the OBJ_REF_DELTA case,  the odds for a SHA1 reference to
get corrupted so it actually matches the SHA1 of another object with the
same size (the delta header stores the expected size of the base object
to apply against) are virtually zero.  In the OBJ_OFS_DELTA case, the
reference is a pack offset which would have to match the start boundary
of a different base object but still with the same size, and although this
is relatively much more "probable" than in the OBJ_REF_DELTA case, the
probability is also about zero in absolute terms.  Still, the possibility
exists as demonstrated in t5302 and is certainly greater than a SHA1
collision, especially in the OBJ_OFS_DELTA case which is now the default
when repacking.

Again, repacking by reusing existing pack data is OK since the per object
CRC provided by index v2 guards against any such corruptions. What t5302
failed to test is a full repack in such case.

The Solution
------------

As unlikely as this kind of stealth corruption can be in practice, it
certainly isn't acceptable to propagate it into a freshly created pack.
But, because this is so unlikely, we don't want to pay the run time cost
associated with extra validation checks all the time either.  Furthermore,
consequences of such corruption in anything but repacking should be rather
visible, and even if it could be quite unpleasant, it still has far less
severe consequences than actively creating bad packs.

So the best compromize is to check packed object CRC when unpacking
objects, and only during the compression/writing phase of a repack, and
only when not streaming the result.  The cost of this is minimal (less
than 1% CPU time), and visible only with a full repack.

Someone with a stats background could provide an objective evaluation of
this, but I suspect that it's bad RAM that has more potential for data
corruptions at this point, even in those cases where this extra check
is not performed.  Still, it is best to prevent a known hole for
corruption when recreating object data into a new pack.

What about the streamed pack case?  Well, any client receiving a pack
must always consider that pack as untrusty and perform full validation
anyway, hence no such stealth corruption could be propagated to remote
repositoryes already.  It is therefore worthless doing local validation
in that case.

Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-11-02 15:22:15 -08:00
Junio C Hamano
86e67a088c Merge branch 'jk/maint-ls-files-other' into maint
* jk/maint-ls-files-other:
  refactor handling of "other" files in ls-files and status
2008-11-02 13:37:13 -08:00