1
0
Fork 0
mirror of https://github.com/git/git.git synced 2024-11-16 14:04:52 +01:00
Commit graph

1231 commits

Author SHA1 Message Date
Junio C Hamano
f0d8900175 Merge branch 'sp/stream-clean-filter'
When running a required clean filter, we do not have to mmap the
original before feeding the filter.  Instead, stream the file
contents directly to the filter and process its output.

* sp/stream-clean-filter:
  sha1_file: don't convert off_t to size_t too early to avoid potential die()
  convert: stream from fd to required clean filter to reduce used address space
  copy_fd(): do not close the input file descriptor
  mmap_limit: introduce GIT_MMAP_LIMIT to allow testing expected mmap size
  memory_limit: use git_env_ulong() to parse GIT_ALLOC_LIMIT
  config.c: add git_env_ulong() to parse environment variable
  convert: drop arguments other than 'path' from would_convert_to_git()
2014-10-08 13:05:32 -07:00
Junio C Hamano
1c2ea2cdc0 Merge branch 'rs/realloc-array'
Code cleanup.

* rs/realloc-array:
  use REALLOC_ARRAY for changing the allocation size of arrays
  add macro REALLOC_ARRAY
2014-09-26 14:39:45 -07:00
Junio C Hamano
69a5bbbbfa Merge branch 'jk/write-packed-refs-via-stdio'
Optimize the code path to write out the packed-refs file, which
especially matters in a repository with a large number of refs.

* jk/write-packed-refs-via-stdio:
  refs: write packed_refs file using stdio
2014-09-26 14:39:42 -07:00
Junio C Hamano
4daf5c8643 Merge branch 'ta/config-add-to-empty-or-true-fix'
"git config --add section.var val" used to lose existing
section.var whose value was an empty string.

* ta/config-add-to-empty-or-true-fix:
  config: avoid a funny sentinel value "a^"
  make config --add behave correctly for empty and NULL values
2014-09-19 11:38:40 -07:00
Junio C Hamano
9ff700ebac Merge branch 'jk/commit-author-parsing'
Code clean-up.

* jk/commit-author-parsing:
  determine_author_info(): copy getenv output
  determine_author_info(): reuse parsing functions
  date: use strbufs in date-formatting functions
  record_author_date(): use find_commit_header()
  record_author_date(): fix memory leak on malformed commit
  commit: provide a function to find a header in a buffer
2014-09-19 11:38:33 -07:00
Junio C Hamano
ceeacc501b Merge branch 'bb/date-iso-strict'
"log --date=iso" uses a slight variant of ISO 8601 format that is
made more human readable.  A new "--date=iso-strict" option gives
datetime output that is more strictly conformant.

* bb/date-iso-strict:
  pretty: provide a strict ISO 8601 date format
2014-09-19 11:38:32 -07:00
René Scharfe
2756ca4347 use REALLOC_ARRAY for changing the allocation size of arrays
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-09-18 09:13:42 -07:00
Jeff King
c1063be2a3 config: avoid a funny sentinel value "a^"
Introduce CONFIG_REGEX_NONE as a more explicit sentinel value to say
"we do not want to replace any existing entry" and use it in the
implementation of "git config --add".

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-09-11 16:33:54 -07:00
Junio C Hamano
7f346e9d73 Merge branch 'ta/config-set-1'
Use the new caching config-set API in git_config() calls.

* ta/config-set-1:
  add tests for `git_config_get_string_const()`
  add a test for semantic errors in config files
  rewrite git_config() to use the config-set API
  config: add `git_die_config()` to the config-set API
  change `git_config()` return value to void
  add line number and file name info to `config_set`
  config.c: fix accuracy of line number in errors
  config.c: mark error and warnings strings for translation
2014-09-11 10:33:25 -07:00
Jeff King
9540ce5030 refs: write packed_refs file using stdio
We write each line of a new packed-refs file individually
using a write() syscall (and sometimes 2, if the ref is
peeled). Since each line is only about 50-100 bytes long,
this creates a lot of system call overhead.

We can instead open a stdio handle around our descriptor and
use fprintf to write to it. The extra buffering is not a
problem for us, because nobody will read our new packed-refs
file until we call commit_lock_file (by which point we have
flushed everything).

On a pathological repository with 8.5 million refs, this
dropped the time to run `git pack-refs` from 20s to 6s.

Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-09-10 10:58:32 -07:00
Junio C Hamano
56f214e071 Merge branch 'ta/config-set'
Add in-core caching layer to let us avoid reading the same
configuration files number of times.

* ta/config-set:
  test-config: add tests for the config_set API
  add `config_set` API for caching config-like files
2014-09-02 13:24:18 -07:00
Junio C Hamano
1d8a6f6929 Merge branch 'mm/config-edit-global'
Start "git config --edit --global" from a skeletal per-user
configuration file contents, instead of a total blank, when the
user does not already have any.  This immediately reduces the need
for a later "Have you forgotten setting core.user?" and we can add
more to the template as we gain more experience.

* mm/config-edit-global:
  commit: advertise config --global --edit on guessed identity
  home_config_paths(): let the caller ignore xdg path
  config --global --edit: create a template file if needed
2014-09-02 13:23:20 -07:00
Junio C Hamano
c518279c0e Merge branch 'jc/reopen-lock-file'
There are cases where you lock and open to write a file, close it to
show the updated contents to external processes, and then have to
update the file again while still holding the lock, but the lockfile
API lacked support for such an access pattern.

* jc/reopen-lock-file:
  lockfile: allow reopening a closed but still locked file
2014-09-02 13:20:13 -07:00
Beat Bolli
466fb6742d pretty: provide a strict ISO 8601 date format
Git's "ISO" date format does not really conform to the ISO 8601
standard due to small differences, and it cannot be parsed by ISO
8601-only parsers, e.g. those of XML toolchains.

The output from "--date=iso" deviates from ISO 8601 in these ways:

  - a space instead of the `T` date/time delimiter
  - a space between time and time zone
  - no colon between hours and minutes of the time zone

Add a strict ISO 8601 date format for displaying committer and
author dates.  Use the '%aI' and '%cI' format specifiers and add
'--date=iso-strict' or '--date=iso8601-strict' date format names.

See http://thread.gmane.org/gmane.comp.version-control.git/255879 and
http://thread.gmane.org/gmane.comp.version-control.git/52414/focus=52585
for discussion.

Signed-off-by: Beat Bolli <bbolli@ewanet.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-08-29 12:37:02 -07:00
Steffen Prohaska
23b0c4782e config.c: add git_env_ulong() to parse environment variable
The new function parses an integeral value that fits in unsigned
long in human readable form, i.e. possibly with unit suffix, e.g.
10k = 10240, etc., from an environment variable.  Parsing of
GIT_MMAP_LIMIT and GIT_ALLOC_LIMIT will use it in later patches.

Signed-off-by: Steffen Prohaska <prohaska@zib.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-08-28 10:24:54 -07:00
Jeff King
c33ddc2e33 date: use strbufs in date-formatting functions
Many of the date functions write into fixed-size buffers.
This is a minor pain, as we have to take special
precautions, and frequently end up copying the result into a
strbuf or heap-allocated buffer anyway (for which we
sometimes use strcpy!).

Let's instead teach parse_date, datestamp, etc to write to a
strbuf. The obvious downside is that we might need to
perform a heap allocation where we otherwise would not need
to. However, it turns out that the only two new allocations
required are:

  1. In test-date.c, where we don't care about efficiency.

  2. In determine_author_info, which is not performance
     critical (and where the use of a strbuf will help later
     refactoring).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-08-27 10:32:56 -07:00
Tanay Abhra
155ef25f12 rewrite git_config() to use the config-set API
Of all the functions in `git_config*()` family, `git_config()` has the
most invocations in the whole code base. Each `git_config()` invocation
causes config file rereads which can be avoided using the config-set API.

Use the config-set API to rewrite `git_config()` to use the config caching
layer to avoid config file rereads on each invocation during a git process
lifetime. First invocation constructs the cache, and after that for each
successive invocation, `git_config()` feeds values from the config cache
instead of rereading the configuration files.

Signed-off-by: Tanay Abhra <tanayabh@gmail.com>
Reviewed-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-08-07 11:41:10 -07:00
Tanay Abhra
5a80e97c82 config: add git_die_config() to the config-set API
Add `git_die_config` that dies printing the line number and the file name
of the highest priority value for the configuration variable `key`. A custom
error message is also printed before dying, specified by the caller, which can
be skipped if `err` argument is set to NULL.

It has usage in non-callback based config value retrieval where we can
raise an error and die if there is a semantic error.
For example,

	if (!git_config_get_value(key, &value)){
		if (!strcmp(value, "foo"))
			git_config_die(key, "value: `%s` is illegal", value);
		else
			/* do work */
	}

Signed-off-by: Tanay Abhra <tanayabh@gmail.com>
Reviewed-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-08-07 11:40:25 -07:00
Tanay Abhra
aace438502 change git_config() return value to void
Currently `git_config()` returns an integer signifying an error code.
During rewrites of the function most of the code was shifted to
`git_config_with_options()`. `git_config_with_options()` normally
returns positive values if its `config_source` parameter is set as NULL,
as most errors are fatal, and non-fatal potential errors are guarded
by "if" statements that are entered only when no error is possible.

Still a negative value can be returned in case of race condition between
`access_or_die()` & `git_config_from_file()`. Also, all callers of
`git_config()` ignore the return value except for one case in branch.c.

Change `git_config()` return value to void and make it die if it receives
a negative value from `git_config_with_options()`.

Original-patch-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Tanay Abhra <tanayabh@gmail.com>
Reviewed-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-08-07 11:40:17 -07:00
Tanay Abhra
3df8fd625f add line number and file name info to config_set
Store file name and line number for each key-value pair in the cache
during parsing of the configuration files.

Signed-off-by: Tanay Abhra <tanayabh@gmail.com>
Reviewed-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-08-07 11:38:50 -07:00
Tanay Abhra
3c8687a73e add config_set API for caching config-like files
Currently `git_config()` uses a callback mechanism and file rereads for
config values. Due to this approach, it is not uncommon for the config
files to be parsed several times during the run of a git program, with
different callbacks picking out different variables useful to themselves.

Add a `config_set`, that can be used to construct an in-memory cache for
config-like files that the caller specifies (i.e., files like `.gitmodules`,
`~/.gitconfig` etc.). Add two external functions `git_configset_get_value`
and `git_configset_get_value_multi` for querying from the config sets.
`git_configset_get_value` follows `last one wins` semantic (i.e. if there
are multiple matches for the queried key in the files of the configset the
value returned will be the last entry in `value_list`).
`git_configset_get_value_multi` returns a list of values sorted in order of
increasing priority (i.e. last match will be at the end of the list). Add
type specific query functions like `git_configset_get_bool` and similar.

Add a default `config_set`, `the_config_set` to cache all key-value pairs
read from usual config files (repo specific .git/config, user wide
~/.gitconfig, XDG config and the global /etc/gitconfig). `the_config_set`
is populated using `git_config()`.

Add two external functions `git_config_get_value` and
`git_config_get_value_multi` for querying in a non-callback manner from
`the_config_set`. Also, add type specific query functions that are
implemented as a thin wrapper around the `config_set` API.

Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Tanay Abhra <tanayabh@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-07-29 14:29:56 -07:00
Jeff King
5de7f500c1 alloc: factor out commit index
We keep a static counter to set the commit index on newly
allocated objects. However, since we also need to set the
index on any_objects which are converted to commits, let's
make the counter available as a public function.

While we're moving it, let's make sure the counter is
allocated as an unsigned integer to match the index field in
"struct commit".

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-07-28 10:14:33 -07:00
Matthieu Moy
9830534e40 config --global --edit: create a template file if needed
When the user has no ~/.gitconfig file, git config --global --edit used
to launch an editor on an nonexistant file name.

Instead, create a file with a default content before launching the
editor. The template contains only commented-out entries, to save a few
keystrokes for the user. If the values are guessed properly, the user
will only have to uncomment the entries.

Advanced users teaching newbies can create a minimalistic configuration
faster for newbies. Beginners reading a tutorial advising to run "git
config --global --edit" as a first step will be slightly more guided for
their first contact with Git.

Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-07-25 12:23:06 -07:00
Junio C Hamano
10b944b37b Merge branch 'jk/alloc-commit-id'
Make sure all in-core commit objects are assigned a unique number
so that they can be annotated using the commit-slab API.

* jk/alloc-commit-id:
  diff-tree: avoid lookup_unknown_object
  object_as_type: set commit index
  alloc: factor out commit index
  add object_as_type helper for casting objects
  parse_object_buffer: do not set object type
  move setting of object->type to alloc_* functions
  alloc: write out allocator definitions
  alloc.c: remove the alloc_raw_commit_node() function
2014-07-22 10:59:25 -07:00
Junio C Hamano
9f2de9c121 Merge branch 'kb/perf-trace'
* kb/perf-trace:
  api-trace.txt: add trace API documentation
  progress: simplify performance measurement by using getnanotime()
  wt-status: simplify performance measurement by using getnanotime()
  git: add performance tracing for git's main() function to debug scripts
  trace: add trace_performance facility to debug performance issues
  trace: add high resolution timer function to debug performance issues
  trace: add 'file:line' to all trace output
  trace: move code around, in preparation to file:line output
  trace: add current timestamp to all trace output
  trace: disable additional trace output for unit tests
  trace: add infrastructure to augment trace output with additional info
  sha1_file: change GIT_TRACE_PACK_ACCESS logging to use trace API
  Documentation/git.txt: improve documentation of 'GIT_TRACE*' variables
  trace: improve trace performance
  trace: remove redundant printf format attribute
  trace: consistently name the format parameter
  trace: move trace declarations from cache.h to new trace.h
2014-07-22 10:59:19 -07:00
Junio C Hamano
19a249ba83 Merge branch 'rs/ref-transaction-0'
Early part of the "ref transaction" topic.

* rs/ref-transaction-0:
  refs.c: change ref_transaction_update() to do error checking and return status
  refs.c: remove the onerr argument to ref_transaction_commit
  update-ref: use err argument to get error from ref_transaction_commit
  refs.c: make update_ref_write update a strbuf on failure
  refs.c: make ref_update_reject_duplicates take a strbuf argument for errors
  refs.c: log_ref_write should try to return meaningful errno
  refs.c: make resolve_ref_unsafe set errno to something meaningful on error
  refs.c: commit_packed_refs to return a meaningful errno on failure
  refs.c: make remove_empty_directories always set errno to something sane
  refs.c: verify_lock should set errno to something meaningful
  refs.c: make sure log_ref_setup returns a meaningful errno
  refs.c: add an err argument to repack_without_refs
  lockfile.c: make lock_file return a meaningful errno on failurei
  lockfile.c: add a new public function unable_to_lock_message
  refs.c: add a strbuf argument to ref_transaction_commit for error logging
  refs.c: allow passing NULL to ref_transaction_free
  refs.c: constify the sha arguments for ref_transaction_create|delete|update
  refs.c: ref_transaction_commit should not free the transaction
  refs.c: remove ref_transaction_rollback
2014-07-21 11:18:37 -07:00
Junio C Hamano
c9831bb09d Merge branch 'kb/path-max-must-go'
* kb/path-max-must-go:
  cache.h: rename cache_def_free to cache_def_clear
2014-07-16 11:32:33 -07:00
Junio C Hamano
788cef81d4 Merge branch 'nd/split-index'
An experiment to use two files (the base file and incremental
changes relative to it) to represent the index to reduce I/O cost
of rewriting a large index when only small part of the working tree
changes.

* nd/split-index: (32 commits)
  t1700: new tests for split-index mode
  t2104: make sure split index mode is off for the version test
  read-cache: force split index mode with GIT_TEST_SPLIT_INDEX
  read-tree: note about dropping split-index mode or index version
  read-tree: force split-index mode off on --index-output
  rev-parse: add --shared-index-path to get shared index path
  update-index --split-index: do not split if $GIT_DIR is read only
  update-index: new options to enable/disable split index mode
  split-index: strip pathname of on-disk replaced entries
  split-index: do not invalidate cache-tree at read time
  split-index: the reading part
  split-index: the writing part
  read-cache: mark updated entries for split index
  read-cache: save deleted entries in split index
  read-cache: mark new entries for split index
  read-cache: split-index mode
  read-cache: save index SHA-1 after reading
  entry.c: update cache_changed if refresh_cache is set in checkout_entry()
  cache-tree: mark istate->cache_changed on prime_cache_tree()
  cache-tree: mark istate->cache_changed on cache tree update
  ...
2014-07-16 11:25:40 -07:00
Junio C Hamano
93dcaea226 lockfile: allow reopening a closed but still locked file
In some code paths (e.g. giving "add -i" to prepare the contents to
be committed interactively inside "commit -p") where a caller takes
a lock, writes the new content, give chance for others to use it
while still holding the lock, and then releases the lock when all is
done.  As an extension, allow the caller to re-update an already
closed file while still holding the lock (i.e. not yet committed) by
re-opening the file, to be followed by updating the contents and
then by the usual close_lock_file() or commit_lock_file().

This is necessary if we want to add code to rebuild the cache-tree
and write the resulting index out after "add -i" returns the control
to "commit -p", for example.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-07-14 13:05:37 -07:00
Ronnie Sahlberg
76d70dc0c6 refs.c: make resolve_ref_unsafe set errno to something meaningful on error
Making errno when returning from resolve_ref_unsafe() meaningful,
which should fix

 * a bug in lock_ref_sha1_basic, where it assumes EISDIR
   means it failed due to a directory being in the way

Signed-off-by: Ronnie Sahlberg <sahlberg@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Acked-by: Michael Haggerty <mhagger@alum.mit.edu>
2014-07-14 11:54:42 -07:00
Ronnie Sahlberg
6af926e8bc lockfile.c: add a new public function unable_to_lock_message
Introducing a new unable_to_lock_message helper, which has nicer
semantics than unable_to_lock_error and cleans up lockfile.c a little.

Signed-off-by: Ronnie Sahlberg <sahlberg@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Acked-by: Michael Haggerty <mhagger@alum.mit.edu>
2014-07-14 11:54:40 -07:00
Jeff King
94d5a22cf6 alloc: factor out commit index
We keep a static counter to set the commit index on newly
allocated objects. However, since we also need to set the
index on any_objects which are converted to commits, let's
make the counter available as a public function.

While we're moving it, let's make sure the counter is
allocated as an unsigned integer to match the index field in
"struct commit".

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-07-13 18:59:05 -07:00
Karsten Blees
2a60839150 cache.h: rename cache_def_free to cache_def_clear
Rename cache_def_free to cache_def_clear as it doesn't free the struct
cache_def, but just clears its content.

Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-07-13 10:12:37 -07:00
Junio C Hamano
11def366e5 Merge branch 'kb/path-max-must-go'
* kb/path-max-must-go:
  symlinks: remove PATH_MAX limitation
2014-07-10 11:27:47 -07:00
Karsten Blees
e7c7305300 symlinks: remove PATH_MAX limitation
'git checkout' fails if a directory is longer than PATH_MAX, because the
lstat_cache in symlinks.c checks if the leading directory exists using
PATH_MAX-bounded string operations.

Remove the limitation by using strbuf instead.

Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-07-07 11:22:42 -07:00
Junio C Hamano
1881d2b88c Merge branch 'ym/fix-opportunistic-index-update-race' into maint
"git status", even though it is a read-only operation, tries to
update the index with refreshed lstat(2) info to optimize future
accesses to the working tree opportunistically, but this could
race with a "read-write" operation that modify the index while it
is running.  Detect such a race and avoid overwriting the index.

* ym/fix-opportunistic-index-update-race:
  read-cache.c: verify index file before we opportunistically update it
  wrapper.c: add xpread() similar to xread()
2014-06-25 11:49:48 -07:00
Jeremiah Mahler
ccdd4a0f3c cleanup duplicate name_compare() functions
We often represent our strings as a counted string, i.e. a pair of
the pointer to the beginning of the string and its length, and the
string may not be NUL terminated to that length.

To compare a pair of such counted strings, unpack-trees.c and
read-cache.c implement their own name_compare() functions
identically.  In addition, the cache_name_compare() function in
read-cache.c is nearly identical.  The only difference is when one
string is the prefix of the other string, in which case
name_compare() returns -1/+1 to show which one is longer, and
cache_name_compare() returns the difference of the lengths to show
the same information.

Unify these three functions by using the implementation from
cache_name_compare().  This does not make any difference to the
existing and future callers, as they must be paying attention only
to the sign of the returned value (and not the magnitude) because
the original implementations of these two functions return values
returned by memcmp(3) when the one string is not a prefix of the
other string, and the only thing memcmp(3) guarantees its callers is
the sign of the returned value, not the magnitude.

Signed-off-by: Jeremiah Mahler <jmmahler@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-20 10:12:14 -07:00
Karsten Blees
5991a55c54 trace: move trace declarations from cache.h to new trace.h
Also include direct dependencies (strbuf.h and git-compat-util.h for
__attribute__) so that trace.h can be used independently of cache.h, e.g.
in test programs.

Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-17 09:37:47 -07:00
Junio C Hamano
b83163643b Merge branch 'sk/windows-unc-path'
* sk/windows-unc-path:
  Windows: allow using UNC path for git repository
2014-06-16 10:07:03 -07:00
Nguyễn Thái Ngọc Duy
3e52f70b15 t1700: new tests for split-index mode
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-13 11:49:42 -07:00
Nguyễn Thái Ngọc Duy
c18b80a0e8 update-index: new options to enable/disable split index mode
If you have a large work tree but only make changes in a subset, then
$GIT_DIR/index's size should be stable after a while. If you change
branches that touch something else, $GIT_DIR/index's size may grow
large that it becomes as slow as the unified index. Do --split-index
again occasionally to force all changes back to the shared index and
keep $GIT_DIR/index small.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-13 11:49:41 -07:00
Nguyễn Thái Ngọc Duy
b3c96fb158 split-index: strip pathname of on-disk replaced entries
We know the positions of replaced entries via the replace bitmap in
"link" extension, so the "name" path does not have to be stored (it's
still in the shared index). With this, we also have a way to
distinguish additions vs replacements at load time and can catch
broken "link" extensions.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-13 11:49:41 -07:00
Nguyễn Thái Ngọc Duy
ce7c614bce split-index: do not invalidate cache-tree at read time
We are sure that after merge_base_index() is done. cache-tree can
still be used with the final index. So don't destroy cache tree.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-13 11:49:41 -07:00
Nguyễn Thái Ngọc Duy
078a58e825 read-cache: mark updated entries for split index
The large part of this patch just follows CE_ENTRY_CHANGED
marks. replace_index_entry() is updated to update
split_index->base->cache[] as well so base->cache[] does not reference
to a freed entry.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-13 11:49:40 -07:00
Nguyễn Thái Ngọc Duy
5fc2fc8fa2 read-cache: split-index mode
This split-index mode is designed to keep write cost proportional to
the number of changes the user has made, not the size of the work
tree. (Read cost is another matter, to be dealt separately.)

This mode stores index info in a pair of $GIT_DIR/index and
$GIT_DIR/sharedindex.<SHA-1>. sharedindex is large and unchanged over
time while "index" is smaller and updated often. Format details are in
index-format.txt, although not everything is implemented in this
patch.

Shared indexes are not automatically removed, because it's unclear if
the shared index is needed by any (even temporary) indexes by just
looking at it. After a while you'll collect stale shared indexes. The
good news is one shared index is useable for long, until
$GIT_DIR/index becomes too big and sluggish that the new shared index
must be created.

The safest way to clean shared indexes is to turn off split index
mode, so shared files are all garbage, delete them all, then turn on
split index mode again.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-13 11:49:39 -07:00
Nguyễn Thái Ngọc Duy
e93021b20a read-cache: save index SHA-1 after reading
Also update SHA-1 after writing. If we do not do that, the second
read_index() will see "initialized" variable already set and not read
.git/index again, which is fine, except istate->sha1 now has a stale
value.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-13 11:49:39 -07:00
Nguyễn Thái Ngọc Duy
d4a2024aef entry.c: update cache_changed if refresh_cache is set in checkout_entry()
Other fill_stat_cache_info() is on new entries, which should set
CE_ENTRY_ADDED in cache_changed, so we're safe.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-13 11:49:39 -07:00
Nguyễn Thái Ngọc Duy
a5400efe29 cache-tree: mark istate->cache_changed on cache tree invalidation
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-13 11:49:39 -07:00
Nguyễn Thái Ngọc Duy
6c306a34ee resolve-undo: be specific what part of the index has changed
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-13 11:49:38 -07:00
Nguyễn Thái Ngọc Duy
e636a7b4d0 read-cache: be specific what part of the index has changed
cache entry additions, removals and modifications are separated
out. The rest of changes are still in the catch-all flag
SOMETHING_CHANGED, which would be more specific later.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-13 11:49:38 -07:00