mirror of
https://github.com/git/git.git
synced 2024-10-30 05:47:53 +01:00
f29cd3938d
Linus and other git developers from the early days trained their fingers to type the command, every once in a while even without thinking, to check the consistency of the repository back when the lower core part of the git was still being developed. Developers who wanted to make sure that git correctly dealt with packfiles could deliberately trigger their creation and checked them after they were created carefully, but loose objects are the ones that are written by various commands from random codepaths. It made some technical sense to have a mode that checked only loose objects from the debugging point of view for that reason.

Even for git developers, there no longer is any reason to type "git fsck" every five minutes these days, worried that some newly created objects might be corrupt due to recent change to git.

The reason we did not make "--full" the default is probably we trust our filesystems a bit too much. At least, we trusted filesystems more than we trusted the lower core part of git that was under development. Once a packfile is created and we always use it read-only, there didn't seem to be much point in suspecting that the underlying filesystems or disks may corrupt them in such a way that is not caught by the SHA-1 checksum over the entire packfile and per object checksum.

That trust in the filesystems might have been a good tradeoff between fsck performance and reliability on platforms git was initially developed on and for, but it may not be true anymore as we run on many more platforms these days.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
156 lines
4.8 KiB
Text
git-fsck(1)
===========

NAME
----
git-fsck - Verifies the connectivity and validity of the objects in the database

SYNOPSIS
--------
[verse]
'git fsck' [--tags] [--root] [--unreachable] [--cache] [--no-reflogs]
	 [--[no-]full] [--strict] [--verbose] [--lost-found] [<object>*]

DESCRIPTION
-----------
Verifies the connectivity and validity of the objects in the database.

OPTIONS
-------
<object>::
	An object to treat as the head of an unreachability trace.
+
If no objects are given, 'git-fsck' defaults to using the
index file, all SHA1 references in .git/refs/*, and all reflogs (unless
--no-reflogs is given) as heads.

--unreachable::
	Print out objects that exist but that aren't reachable from any
	of the reference nodes.

--root::
	Report root nodes.

--tags::
	Report tags.

--cache::
	Consider any object recorded in the index also as a head node for
	an unreachability trace.

--no-reflogs::
	Do not consider commits that are referenced only by an
	entry in a reflog to be reachable.  This option is meant
	only to search for commits that used to be in a ref, but
	now aren't, but are still in that corresponding reflog.

--full::
	Check not just objects in GIT_OBJECT_DIRECTORY
	($GIT_DIR/objects), but also the ones found in alternate
	object pools listed in GIT_ALTERNATE_OBJECT_DIRECTORIES
	or $GIT_DIR/objects/info/alternates,
	and in packed git archives found in $GIT_DIR/objects/pack
	and corresponding pack subdirectories in alternate
	object pools.  This is now the default; you can turn it off
	with --no-full.

--strict::
	Enable more strict checking, namely to catch a file mode
	recorded with g+w bit set, which was created by older
	versions of git.  Existing repositories, including the
	Linux kernel, git itself, and the sparse repository, have old
	objects that trigger this check, but it is recommended
	to check new projects with this flag.

--verbose::
	Be chatty.

--lost-found::
	Write dangling objects into .git/lost-found/commit/ or
	.git/lost-found/other/, depending on type.  If the object is
	a blob, the contents are written into the file, rather than
	its object name.
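
As a rough illustration of '--lost-found', the sketch below builds a scratch repository, writes a blob that nothing refers to, and then asks 'git fsck' to save it. The scratch directory, the file names, the demo identity and the blob contents are all made up for the example; only the commands and the lost-found path come from git.

```shell
#!/bin/sh
# Minimal sketch: recover a dangling blob with "git fsck --lost-found".
# The repository, identity and contents below are hypothetical.
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q .

# Make one commit so that HEAD is valid before fsck runs.
echo tracked >tracked.txt
git add tracked.txt
git -c user.name=demo -c user.email=demo@example.com commit -q -m initial

# Write a blob that no tree, index entry or ref points at.
blob=$(echo 'lost content' | git hash-object -w --stdin)

# Save dangling objects under .git/lost-found/.
git fsck --lost-found >/dev/null

# For a blob, the saved file holds the blob's contents.
recovered=$(cat ".git/lost-found/other/$blob")
echo "$recovered"

# Clean up the scratch repository.
cd / && rm -rf "$repo"
```

Dangling commits would be saved under .git/lost-found/commit/ instead, where the file holds the object name rather than the contents.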

It tests SHA1 and general object sanity, and it does full tracking of
the resulting reachability and everything else. It prints out any
corruption it finds (missing or bad objects), and if you use the
'--unreachable' flag it will also print out objects that exist but
that aren't reachable from any of the specified head nodes.

So for example

	git fsck --unreachable HEAD \
		$(git for-each-ref --format="%(objectname)" refs/heads)

will do quite a _lot_ of verification on the tree. There are a few
extra validity tests to be added (make sure that tree objects are
sorted properly etc), but on the whole if 'git-fsck' is happy, you
do have a valid tree.

Any corrupt objects you will have to find in backups or other archives
(i.e., you can just remove them and do an 'rsync' with some other site in
the hopes that somebody else has the object you have corrupted).

Of course, "valid tree" doesn't mean that it wasn't generated by some
evil person, and the end result might be crap. git is a revision
tracking system, not a quality assurance system ;)

Extracted Diagnostics
---------------------
expect dangling commits - potential heads - due to lack of head information::
	You haven't specified any nodes as heads so it won't be
	possible to differentiate between un-parented commits and
	root nodes.

missing sha1 directory '<dir>'::
	The directory holding the sha1 objects is missing.

unreachable <type> <object>::
	The <type> object <object> isn't actually referred to directly
	or indirectly in any of the trees or commits seen. This can
	mean that there's another root node that you're not specifying
	or that the tree is corrupt. If you haven't missed a root node
	then you might as well delete unreachable nodes since they
	can't be used.

missing <type> <object>::
	The <type> object <object> is referred to but isn't present in
	the database.

dangling <type> <object>::
	The <type> object <object> is present in the database but never
	'directly' used. A dangling commit could be a root node.

warning: git-fsck: tree <tree> has full pathnames in it::
	And it shouldn't...

sha1 mismatch <object>::
	The database has an object whose sha1 doesn't match the
	database value.
	This indicates a serious data integrity problem.
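
Because several of the diagnostics above share the shape "<keyword> <type> <object>", they are easy to filter in a script. The following is only a sketch under that assumption: the scratch repository, the demo identity and the blob contents are hypothetical, and the 'awk' filter is one of many ways to pick out the lines.

```shell
#!/bin/sh
# Minimal sketch: list unreachable objects as "<type> <object>" pairs.
# The repository, identity and contents below are hypothetical.
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q .

# Make one commit so that HEAD is valid before fsck runs.
echo tracked >tracked.txt
git add tracked.txt
git -c user.name=demo -c user.email=demo@example.com commit -q -m initial

# Write a blob that nothing refers to.
blob=$(echo 'orphan' | git hash-object -w --stdin)

# Keep only "unreachable <type> <object>" lines and drop the keyword.
unreachable=$(git fsck --unreachable |
	awk '$1 == "unreachable" { print $2, $3 }')
echo "$unreachable"

# Clean up the scratch repository.
cd / && rm -rf "$repo"
```

The same filter works for the "missing" and "dangling" lines by changing the keyword that is compared against the first field.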

Environment Variables
---------------------

GIT_OBJECT_DIRECTORY::
	used to specify the object database root (usually $GIT_DIR/objects)

GIT_INDEX_FILE::
	used to specify the index file

GIT_ALTERNATE_OBJECT_DIRECTORIES::
	used to specify additional object database roots (usually unset)
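
To illustrate how these variables change which objects are examined, the sketch below stores one blob in a separate object directory and runs 'git fsck' with and without GIT_ALTERNATE_OBJECT_DIRECTORIES pointing at it. The scratch paths, the demo identity and the blob contents are hypothetical, and the sketch assumes the default '--full' behaviour described above, in which alternate object pools are checked as well.

```shell
#!/bin/sh
# Minimal sketch: an extra object pool only takes part in the check
# when GIT_ALTERNATE_OBJECT_DIRECTORIES names it.
# The paths, identity and contents below are hypothetical.
set -e

repo=$(mktemp -d)
store=$(mktemp -d)   # stands in for an alternate object pool
cd "$repo"
git init -q .

# Make one commit so that HEAD is valid before fsck runs.
echo tracked >tracked.txt
git add tracked.txt
git -c user.name=demo -c user.email=demo@example.com commit -q -m initial

# Write a blob into the separate object directory, not into the repository.
blob=$(echo 'pooled' | GIT_OBJECT_DIRECTORY="$store" git hash-object -w --stdin)

# Without the variable, fsck never sees the blob.
without_alt=$(git fsck)

# With the variable, the blob is examined and reported as dangling.
with_alt=$(GIT_ALTERNATE_OBJECT_DIRECTORIES="$store" git fsck)
echo "$with_alt"

# Clean up the scratch directories.
cd / && rm -rf "$repo" "$store"
```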

Author
------
Written by Linus Torvalds <torvalds@osdl.org>

Documentation
-------------
Documentation by David Greaves, Junio C Hamano and the git-list <git@vger.kernel.org>.

GIT
---
Part of the linkgit:git[1] suite