A follow-up to the recently fixed bugs in the untracked
invalidation. If opendir() fails it should show a warning, perhaps
this should die, but if this ever happens the error is probably
recoverable for the user, and dying would just make things worse.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Let's start with how create a new directory cache after the last one
becomes invalid (e.g. because its dir mtime has changed...). In
open_cached_dir():
1. We start out with valid_cached_dir() returning false, which should
call invalidate_directory() to put a directory state back to
initial state, no untracked entries (untracked_nr zero), no sub
directory traversal (dirs[].recurse zero).
2. Since the cache cannot be used, we go the slow path opendir() and
go through items one by one via readdir(). All the directories on
disk will be added back to the cache (if not already exist in
dirs[]) and its flag "recurse" gets changed to one to note that
it's part of the cached dir travesal next time.
3. By the time we reach close_cached_dir() we should have a good
subdir list in dirs[]. Those with "recurse" flag set are the ones
present in the on-disk directory. The directory is now marked
"valid".
Next time read_directory() is called, since the directory is marked
valid, it will skip readdir(), go fast path and traverse through
dirs[] array instead.
Steps one and two need some tight cooperation. If a subdir is removed,
readdir() will not find it and of course we cannot examine/invalidate
it. To make sure removed directories on disk are gone from the cache,
step one must make sure recurse flag of all subdirs are zero.
But that's not true. If "valid" flag is already false, there is a
chance we go straight to the end of valid_cached_dir() without calling
invalidate_directory(). Or we fail to meet the "if (untracked-valid)"
condition and skip over the invalidate_directory().
After step 3, we mark the cache valid. Any stale subdir with incorrect
recurse flag becomes a real subdir next time we traverse the directory
using dirs[] array.
We could avoid this by making sure invalidate_directory() is always
called (therefore dirs[].recurse cleared) at the beginning of
open_cached_dir(). Which is what this patch does.
As to how we get into this situation, the key in the test is this
command
git checkout master
where "one/file" is replaced with "one" in the index. This index
update triggers untracked_cache_invalidate_path(), which clears valid
flag of the root directory while keeping "recurse" flag on the subdir
"one" on. On the next git-status, we go through steps 1-3 above and
save an incorrect cache on disk. The second git-status blindly follows
the bad cache data and shows the problem.
This is arguably because of a bad design where "recurse" flag plays
double roles: whether a directory should be saved on disk, and whether
it is part of a directory traversal.
We need to keep recurse flag set at "checkout master" because of the
first role: we need to keep subdir caches (dir "two" for example has
not been touched at all, no reason to throw its cache away).
As long as we make sure to ignore/reset "recurse" flag at the
beginning of a directory traversal, we're good. But maybe eventually
we should separate these two roles.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
stat() may follow a symlink and return stat data of the link's target
instead of the link itself. We are concerned about the link itself.
It's kind of hard to demonstrate the bug. I think when path->buf is a
symlink, we most likely find that its target's stat data does not
match our cached one, which means we ignore the cache and fall back to
slow path.
This is performance issue, not correctness (though we could still
catch it by verifying test-dump-untracked-cache. The less unlikely
case is, link target stat data matches the cached version and we
incorrectly go fast path, ignoring real data on disk. A test for this
may involve manipulating stat data, which may be not portable.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The untracked cache gets confused when a directory is swapped out for
a file. It is easiest to reproduce this by swapping out a directory
with a symlink to another directory, and as the tests show the symlink
case is the only case we've found where "git status" will subsequently
report incorrect information, even though it's possible to otherwise
get the untracked cache into a state where its internal data
structures don't reflect reality.
In the symlink case, whatever files are inside the target of the
symlink will be incorrectly shown as untracked. This issue does not
happen if the symlink links to another file, only if it links to
another directory.
A stand-alone testcase for copying into a terminal:
(
rm -rf /tmp/testrepo &&
git init /tmp/testrepo &&
cd /tmp/testrepo &&
mkdir x y &&
touch x/a y/b &&
git add x/a y/b &&
git commit -msnap &&
git rm -rf y &&
ln -s x y &&
git add y &&
git commit -msnap2 &&
git checkout HEAD~ &&
git status &&
git checkout master &&
sleep 1 &&
git status &&
git status
)
This will incorrectly show y/a as an untracked file. Both the "git
status" call right before "git checkout master" and the "sleep 1"
after the "checkout master" are needed to reproduce this, presumably
due to the untracked cache tracking on the basis of cached whole
seconds from stat(2).
When git gets into this state, a workaround to fix it is to issue a
one-off:
git -c core.untrackedCache=false status
For the non-symlink case, the bug is that the output of
test-dump-untracked-cache should not include:
/one/ 0000000000000000000000000000000000000000 recurse valid
It being in the output implies that cached traversal of root includes
the directory "one" which does not exist on disk anymore.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Git shows a message to tell the user that it is waiting for the
user to finish editing when spawning an editor, in case the editor
opens to a hidden window or somewhere obscure and the user gets
lost.
* ls/editor-waiting-message:
launch_editor(): indicate that Git waits for user input
refactor "dumb" terminal determination
Ancient part of codebase still shows dots after an abbreviated
object name just to show that it is not a full object name, but
these ellipses are confusing to people who newly discovered Git
who are used to seeing abbreviated object names and find them
confusing with the range syntax.
* ar/unconfuse-three-dots:
t2020: test variations that matter
t4013: test new output from diff --abbrev --raw
diff: diff_aligned_abbrev: remove ellipsis after abbreviated SHA-1 value
t4013: prepare for upcoming "diff --raw --abbrev" output format change
checkout: describe_detached_head: remove ellipsis after committish
print_sha1_ellipsis: introduce helper
Documentation: user-manual: limit usage of ellipsis
Documentation: revisions: fix typo: "three dot" ---> "three-dot" (in line with "two-dot").
The way "git worktree add" determines what branch to create from
where and checkout in the new worktree has been updated a bit.
* tg/worktree-create-tracking:
add worktree.guessRemote config option
worktree: add --guess-remote flag to add subcommand
worktree: make add <path> <branch> dwim
worktree: add --[no-]track option to the add subcommand
worktree: add can be created from any commit-ish
checkout: factor out functions to new lib file
The tracing infrastructure has been optimized for cases where no
tracing is requested.
* gk/tracing-optimization:
trace: improve performance while category is disabled
trace: remove trace key normalization
Recent update to the submodule configuration code broke "diff-tree"
by accidentally stopping to read from the index upfront.
* bw/submodule-config-cleanup:
diff-tree: read the index so attribute checks work in bare repositories
Amending commits in git-gui broke the author name that is non-ascii
due to incorrect enconding conversion.
* ls/git-gui-no-double-utf8-author-name:
git-gui: prevent double UTF-8 conversion
An v2.12-era regression in pathspec match logic, which made it look
into submodule tree even when it is not desired, has been fixed.
* bw/pathspec-match-submodule-boundary:
pathspec: only match across submodule boundaries when requested
"git diff" learned a variant of the "--patience" algorithm, to
which the user can specify which 'unique' line to be used as
anchoring points.
* jt/diff-anchored-patience:
diff: support anchoring line(s)
The code internal to the recursive merge strategy was not fully
prepared to see a path that is renamed to try overwriting another
path that is only different in case on case insensitive systems.
This does not matter in the current code, but will start to matter
once the rename detection logic starts taking hints from nearby
paths moving to some directory and moves a new path along with them.
* en/merge-recursive-icase-removal:
merge-recursive: ignore_case shouldn't reject intentional removals
Historically, the diff machinery for rename detection had a
hardcoded limit of 32k paths; this is being lifted to allow users
trade cycles with a (possibly) easier to read result.
* en/rename-progress:
diffcore-rename: make diff-tree -l0 mean -l<large>
sequencer: show rename progress during cherry picks
diff: remove silent clamp of renameLimit
progress: fix progress meters when dealing with lots of work
sequencer: warn when internal merge may be suboptimal due to renameLimit
An internal function that was left for backward compatibility has
been removed, as there is no remaining callers.
* en/remove-stripspace:
strbuf: remove unused stripspace function alias
A regression in the progress eye-candy was fixed.
* jk/progress-delay-fix:
progress: drop delay-threshold code
progress: set default delay threshold to 100%, not 0%
@{-N} in "git checkout @{-N}" may refer to a detached HEAD state,
but the documentation was not clear about it, which has been fixed.
* ks/doc-checkout-previous:
Doc/checkout: checking out using @{-N} can lead to detached state
"git send-email" tries to see if the sendmail program is available
in /usr/lib and /usr/sbin; extend the list of locations to be
checked to also include directories on $PATH.
* fk/sendmail-from-path:
git-send-email: honor $PATH for sendmail binary
"git grep" compiled with libpcre2 sometimes triggered a segfault,
which is being fixed.
* ab/pcre2-grep:
grep: fix segfault under -P + PCRE2 <=10.30 + (*NO_JIT)
test-lib: add LIBPCRE1 & LIBPCRE2 prerequisites
The tagnames "git log --decorate" uses to annotate the commits can
now be limited to subset of available refs with the two additional
options, --decorate-refs[-exclude]=<pattern>.
* ra/decorate-limit-refs:
log: add option to choose which refs to decorate
An infrastructure to define what hash function is used in Git is
introduced, and an effort to plumb that throughout various
codepaths has been started.
* bc/hash-algo:
repository: fix a sparse 'using integer as NULL pointer' warning
Switch empty tree and blob lookups to use hash abstraction
Integrate hash algorithm support with repo setup
Add structure representing hash algorithm
setup: expose enumerated repo info
When a graphical GIT_EDITOR is spawned by a Git command that opens
and waits for user input (e.g. "git rebase -i"), then the editor window
might be obscured by other windows. The user might be left staring at
the original Git terminal window without even realizing that s/he needs
to interact with another window before Git can proceed. To this user Git
appears hanging.
Print a message that Git is waiting for editor input in the original
terminal and get rid of it when the editor returns, if the terminal
supports erasing the last line. Also, make sure that our message is
terminated with a whitespace so that any message the editor may show
upon starting up will be kept separate from our message.
Power users might not want to see this message or their editor might
already print such a message (e.g. emacsclient). Allow these users to
suppress the message by disabling the "advice.waitingForEditor" config.
The standard advise() function is not used here as it would always add
a newline which would make deleting the message harder.
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Lars Schneider <larsxschneider@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since gitfiles were introduced in b44ebb19e (Add platform-independent
.git "symlink", 2008-02-20) the order of checks during .git directory
discovery is: gitfile, gitdir, bare repo. However, that commit did
only partially update the in-code comment describing this order,
missing the last line which still puts gitdir before gitfile.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A regression was introduced in 557a5998d (submodule: remove
gitmodules_config, 2017-08-03) to how attribute processing was handled
in bare repositories when running the diff-tree command.
By default the attribute system will first try to read ".gitattribute"
files from the working tree and then falls back to reading them from the
index if there isn't a copy checked out in the worktree. Prior to
557a5998d the index was read as a side effect of the call to
'gitmodules_config()' which ensured that the index was already populated
before entering the attribute subsystem.
Since the call to 'gitmodules_config()' was removed the index is no
longer being read so when the attribute system tries to read from the
in-memory index it doesn't find any ".gitattribute" entries effectively
ignoring any configured attributes.
Fix this by explicitly reading the index during the setup of diff-tree.
Reported-by: Ben Boeckel <ben.boeckel@kitware.com>
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some users might want to have the --guess-remote option introduced in
the previous commit on by default, so they don't have to type it out
every time they create a new worktree.
Add a config option worktree.guessRemote that allows users to configure
the default behaviour for themselves.
Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently 'git worktree add <path>' creates a new branch named after the
basename of the <path>, that matches the HEAD of whichever worktree we
were on when calling "git worktree add <path>".
It's sometimes useful to have 'git worktree add <path> behave more like
the dwim machinery in 'git checkout <new-branch>', i.e. check if the new
branch name, derived from the basename of the <path>, uniquely matches
the branch name of a remote-tracking branch, and if so check out that
branch and set the upstream to the remote-tracking branch.
Add a new --guess-remote option that enables exactly that behaviour.
Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move just enough code from trace.c into trace.h header so all code
necessary to determine that trace is disabled could be inlined to
calling functions. Then perform the check if the trace key is
enabled sooner in call chain.
Signed-off-by: Gennady Kupava <gkupava@bloomberg.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The ssh-variant 'simple' introduced earlier broke existing
installations by not passing --port/-4/-6 and not diagnosing an
attempt to pass these as an error. Instead, default to
automatically detect how compatible the GIT_SSH/GIT_SSH_COMMAND is
to OpenSSH convention and then error out an invocation to make it
easier to diagnose connection errors.
* jn/ssh-wrappers:
connect: correct style of C-style comment
ssh: 'simple' variant does not support --port
ssh: 'simple' variant does not support -4/-6
ssh: 'auto' variant to select between 'ssh' and 'simple'
connect: split ssh option computation to its own function
connect: split ssh command line options into separate function
connect: split git:// setup into a separate function
connect: move no_fork fallback to git_tcp_connect
ssh test: make copy_ssh_wrapper_as clean up after itself
A new mechanism to upgrade the wire protocol in place is proposed
and demonstrated that it works with the older versions of Git
without harming them.
* bw/protocol-v1:
Documentation: document Extra Parameters
ssh: introduce a 'simple' ssh variant
i5700: add interop test for protocol transition
http: tell server that the client understands v1
connect: tell server that the client understands v1
connect: teach client to recognize v1 server response
upload-pack, receive-pack: introduce protocol version 1
daemon: recognize hidden request arguments
protocol: introduce protocol extension mechanisms
pkt-line: add packet_write function
connect: in ref advertisement, shallows are last
In addition to "git stash -m message", the command learned to
accept "git stash -mmessage" form.
* ph/stash-save-m-option-fix:
stash: learn to parse -m/--message like commit does
Internaly we use 0{40} as a placeholder object name to signal the
codepath that there is no such object (e.g. the fast-forward check
while "git fetch" stores a new remote-tracking ref says "we know
there is no 'old' thing pointed at by the ref, as we are creating
it anew" by passing 0{40} for the 'old' side), and expect that a
codepath to locate an in-core object to return NULL as a sign that
the object does not exist. A look-up for an object that does not
exist however is quite costly with a repository with large number
of packfiles. This access pattern has been optimized.
* jk/fewer-pack-rescan:
sha1_file: fast-path null sha1 as a missing object
everything_local: use "quick" object existence check
p5551: add a script to test fetch pack-dir rescans
t/perf/lib-pack: use fast-import checkpoint to create packs
p5550: factor out nonsense-pack creation