mirrors/git - Incest Forge: Beyond sex. We incest.

mirrors/git

mirror of https://github.com/git/git.git synced 2024-11-01 14:57:52 +01:00

802 lines

20 KiB

C

Raw Normal View History

Make every builtin-.c file #include "builtin.h" Make every builtin-.c file #include "builtin.h". Also takes care of some declaration/definition mismatches. Signed-off-by: Peter Hagervall <hager@cs.umu.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-07-15 01:14:45 +02:00			`#include "builtin.h"`
Fix up d_type handling - we need to include <dirent.h> before we play with the d_type compatibility macros. 2005-04-30 18:59:31 +02:00			`#include "cache.h"`
[PATCH] Port fsck-cache to use parsing functions This ports fsck-cache to use parsing functions. Note that performance could be improved here by only reading each object once, but this requires somewhat more complicated flow control. Signed-Off-By: Daniel Barkalow <barkalow@iabervon.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-04-18 20:39:48 +02:00			`#include "commit.h"`
			`#include "tree.h"`
			`#include "blob.h"`
[PATCH] Rework fsck-cache to use parse_object() With support for parse_object() and tags in the core, fsck_cache can just call them, and can be simplified a bit. Signed-Off-By: Daniel Barkalow <barkalow@iabervon.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-04-28 16:46:33 +02:00			`#include "tag.h"`
Fix up "for_each_ref()" to be more usable, and use it in git-fsck-cache It needed to take the GIT_DIR information into account, something that the original receive-pack usage just never cared about. 2005-07-03 19:01:38 +02:00			`#include "refs.h"`
[PATCH] Add git-verify-pack command. Given a list of <pack>.idx files, this command validates the index file and the corresponding .pack file for consistency. This patch also uses the same validation mechanism in fsck-cache when the --full flag is used. During normal operation, sha1_file.c verifies that a given .idx file matches the .pack file by comparing the SHA1 checksum stored in .idx file and .pack file as a minimum sanity check. We may further want to check the pack signature and version when we map the pack, but that would be a separate patch. Earlier, errors to map a pack file was not flagged fatal but led to a random fatal error later. This version explicitly die()s when such an error is detected. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-29 11:51:27 +02:00			`#include "pack.h"`
Teach fsck-objects about cache-tree. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-26 01:37:08 +02:00			`#include "cache-tree.h"`
fsck-objects: avoid unnecessary tree_entry_list usage Prime example of where the raw tree parser is easier for everybody. [jc: "Aieee" one-liner fix from the list applied. ] Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-29 21:19:02 +02:00			`#include "tree-walk.h"`
builtin-fsck: move away from object-refs to fsck_walk Signed-off-by: Martin Koegler <mkoegler@auto.tuwien.ac.at> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-02-25 22:46:05 +01:00			`#include "fsck.h"`
Make builtin-fsck.c use parse_options. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-10-15 22:34:05 +02:00			`#include "parse-options.h"`
add is_dot_or_dotdot inline function A new inline function is_dot_or_dotdot is used to check if the directory name is either "." or "..". It returns a non-zero value if the given string is "." or "..". It's applicable to a lot of Git source code. Signed-off-by: Alexander Potashev <aspotashev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-01-10 13:07:50 +01:00			`#include "dir.h"`
fsck: print progress fsck is usually a long process and it would be nice if it prints progress from time to time. Progress meter is not printed when --verbose is given because --verbose prints a lot, there's no need for "alive" indicator. Progress meter may provide "% complete" information but it would be lost anyway in the flood of text. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-11-07 03:59:26 +01:00			`#include "progress.h"`
fsck: use streaming API for writing lost-found blobs Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-03-07 11:54:20 +01:00			`#include "streaming.h"`
fsck: optionally show more helpful info for broken links When reporting broken links between commits/trees/blobs, it would be quite helpful at times if the user would be told how the object is supposed to be reachable. With the new --name-objects option, git-fsck will try to do exactly that: name the objects in a way that shows how they are reachable. For example, when some reflog got corrupted and a blob is missing that should not be, the user might want to remove the corresponding reflog entry. This option helps them find that entry: `git fsck` will now report something like this: broken link from tree b5eb6ff... (refs/stash@{<date>}~37:) to blob ec5cf80... Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-07-17 13:00:02 +02:00			`#include "decorate.h"`
[PATCH] Port fsck-cache to use parsing functions This ports fsck-cache to use parsing functions. Note that performance could be improved here by only reading each object once, but this requires somewhat more complicated flow control. Signed-Off-By: Daniel Barkalow <barkalow@iabervon.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-04-18 20:39:48 +02:00
			`#define REACHABLE 0x0001`
Remove "tree->entries" tree-entry list from tree parser Instead, just use the tree buffer directly, and use the tree-walk infrastructure to walk the buffers instead of the tree-entry list. The tree-entry list is inefficient, and generates tons of small allocations for no good reason. The tree-walk infrastructure is generally no harder to use than following a linked list, and allows us to do most tree parsing in-place. Some programs still use the old tree-entry lists, and are a bit painful to convert without major surgery. For them we have a helper function that creates a temporary tree-entry list on demand. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-29 21:18:33 +02:00			`#define SEEN 0x0002`
clear parsed flag when we free tree buffers Many code paths will free a tree object's buffer and set it to NULL after finishing with it in order to keep memory usage down during a traversal. However, out of 8 sites that do this, only one actually unsets the "parsed" flag back. Those sites that don't are setting a trap for later users of the tree object; even after calling parse_tree, the buffer will remain NULL, causing potential segfaults. It is not known whether this is triggerable in the current code. Most commands do not do an in-memory traversal followed by actually using the objects again. However, it does not hurt to be safe for future callers. In most cases, we can abstract this out to a "free_tree_buffer" helper. However, there are two exceptions: 1. The fsck code relies on the parsed flag to know that we were able to parse the object at one point. We can switch this to using a flag in the "flags" field. 2. The index-pack code sets the buffer to NULL but does not free it (it is freed by a caller). We should still unset the parsed flag here, but we cannot use our helper, as we do not want to free the buffer. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-06-06 00:37:39 +02:00			`#define HAS_OBJ 0x0004`
Make "fsck-cache" use the same revision tracking structure as "rev-tree". This makes things a lot more efficient, and makes it trivial to do things like reachability analysis. Add command line flags to tell what the head is, and whether to warn about unreachable objects. 2005-04-13 18:57:30 +02:00
remove unnecessary initializations [jc: I needed to hand merge the changes to the updated codebase, so the result needs to be checked.] Signed-off-by: David Rientjes <rientjes@google.com> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-08-15 19:23:48 +02:00			`static int show_root;`
			`static int show_tags;`
			`static int show_unreachable;`
Fix lost-found to show commits only referenced by reflogs Prior to 1.5.0 the git-lost-found utility was useful to locate commits that were not referenced by any ref. These were often amends, or resets, or tips of branches that had been deleted. Being able to locate a 'lost' commit and recover it by creating a new branch was a useful feature in those days. Unfortunately 1.5.0 added the reflogs to the reachability analysis performed by git-fsck, which means that most commits users would consider to be lost are still reachable through a reflog. So most (or all!) commits are reachable, and nothing gets output from git-lost-found. Now git-fsck can be told to ignore reflogs during its reachability analysis, making git-lost-found useful again to locate commits that are no longer referenced by a ref itself, but may still be referenced by a reflog. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-04 16:46:14 +02:00			`static int include_reflogs = 1;`
fsck: default to "git fsck --full" Linus and other git developers from the early days trained their fingers to type the command, every once in a while even without thinking, to check the consistency of the repository back when the lower core part of the git was still being developed. Developers who wanted to make sure that git correctly dealt with packfiles could deliberately trigger their creation and checked them after they were created carefully, but loose objects are the ones that are written by various commands from random codepaths. It made some technical sense to have a mode that checked only loose objects from the debugging point of view for that reason. Even for git developers, there no longer is any reason to type "git fsck" every five minutes these days, worried that some newly created objects might be corrupt due to recent change to git. The reason we did not make "--full" the default is probably we trust our filesystems a bit too much. At least, we trusted filesystems more than we trusted the lower core part of git that was under development. Once a packfile is created and we always use it read-only, there didn't seem to be much point in suspecting that the underlying filesystems or disks may corrupt them in such a way that is not caught by the SHA-1 checksum over the entire packfile and per object checksum. That trust in the filesystems might have been a good tradeoff between fsck performance and reliability on platforms git was initially developed on and for, but it may not be true anymore as we run on many more platforms these days. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-20 20:46:55 +02:00			`static int check_full = 1;`
fsck: introduce `git fsck --connectivity-only` This option avoids unpacking each and all blob objects, and just verifies the connectivity. In particular with large repositories, this speeds up the operation, at the expense of missing corrupt blobs, ignoring unreachable objects and other fsck issues, if any. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-06-22 17:27:12 +02:00			`static int connectivity_only;`
remove unnecessary initializations [jc: I needed to hand merge the changes to the updated codebase, so the result needs to be checked.] Signed-off-by: David Rientjes <rientjes@google.com> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-08-15 19:23:48 +02:00			`static int check_strict;`
			`static int keep_cache_objects;`
fsck: introduce fsck options Just like the diff machinery, we are about to introduce more settings, therefore it makes sense to carry them around as a (pointer to a) struct containing all of them. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-06-22 17:25:00 +02:00			`static struct fsck_options fsck_walk_options = FSCK_OPTIONS_DEFAULT;`
			`static struct fsck_options fsck_obj_options = FSCK_OPTIONS_DEFAULT;`
fsck: change functions to use object_id Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-05-25 20:38:50 +02:00			`static struct object_id head_oid;`
fsck: HEAD is part of refs By default we looked at all refs but not HEAD. The only thing that made fsck not lose sight of commits that are only reachable from a detached HEAD was the reflog for the HEAD. This fixes it, with a new test. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-01-30 09:33:00 +01:00			`static const char *head_points_at;`
fsck: exit with non-zero status upon errors git-fsck always exited with status 0, which was a bit sloppy. This makes it exit with a non-zero status when errors are found. The error code is an OR'ed result of: 1 if corrupted objects are found. 2 if objects that are ought to be reachable are missing or corrupt. For example, it would exit with 1 in a repository with an unreachable corrupt object. If a tree object of the HEAD commit is corrupt, you would get 3. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-05 09:22:06 +01:00			`static int errors_found;`
git-fsck: add --lost-found option With this option, dangling objects are not only reported, but also written to .git/lost-found/commit/ or .git/lost-found/other/. This option implies '--full' and '--no-reflogs'. 'git fsck --lost-found' is meant as a replacement for git-lost-found. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-07-03 02:33:54 +02:00			`static int write_lost_and_found;`
git-fsck: learn about --verbose With --verbose, it gets really chatty now. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-06-05 04:44:00 +02:00			`static int verbose;`
fsck: print progress fsck is usually a long process and it would be nice if it prints progress from time to time. Progress meter is not printed when --verbose is given because --verbose prints a lot, there's no need for "alive" indicator. Progress meter may provide "% complete" information but it would be lost anyway in the flood of text. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-11-07 03:59:26 +01:00			`static int show_progress = -1;`
fsck: --no-dangling omits "dangling object" information The default output from "fsck" is often overwhelmed by informational message on dangling objects, especially if you do not repack often, and a real error can easily be buried. Add "--no-dangling" option to omit them, and update the user manual to demonstrate its use. Based on a patch by Clemens Buchacher, but reverted the part to change the default to --no-dangling, which is unsuitable for the first patch. The usual three-step procedure to break the backward compatibility over time needs to happen on top of this, if we were to go in that direction. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-02-28 23:55:39 +01:00			`static int show_dangling = 1;`
fsck: optionally show more helpful info for broken links When reporting broken links between commits/trees/blobs, it would be quite helpful at times if the user would be told how the object is supposed to be reachable. With the new --name-objects option, git-fsck will try to do exactly that: name the objects in a way that shows how they are reachable. For example, when some reflog got corrupted and a blob is missing that should not be, the user might want to remove the corresponding reflog entry. This option helps them find that entry: `git fsck` will now report something like this: broken link from tree b5eb6ff... (refs/stash@{<date>}~37:) to blob ec5cf80... Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-07-17 13:00:02 +02:00			`static int name_objects;`
fsck: exit with non-zero status upon errors git-fsck always exited with status 0, which was a bit sloppy. This makes it exit with a non-zero status when errors are found. The error code is an OR'ed result of: 1 if corrupted objects are found. 2 if objects that are ought to be reachable are missing or corrupt. For example, it would exit with 1 in a repository with an unreachable corrupt object. If a tree object of the HEAD commit is corrupt, you would get 3. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-05 09:22:06 +01:00			`#define ERROR_OBJECT 01`
			`#define ERROR_REACHABLE 02`
fsck: return error code when verify_pack() goes wrong Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-11-07 03:59:23 +01:00			`#define ERROR_PACK 04`
fsck: exit with non-zero when problems are found After finding some problems (e.g. a ref refs/heads/X points at an object that is not a commit) and issuing an error message, the program failed to signal the fact that it found an error by a non-zero exit status. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-09-23 22:46:39 +02:00			`#define ERROR_REFS 010`
Make "fsck-cache" use the same revision tracking structure as "rev-tree". This makes things a lot more efficient, and makes it trivial to do things like reachability analysis. Add command line flags to tell what the head is, and whether to warn about unreachable objects. 2005-04-13 18:57:30 +02:00
fsck: refactor how to describe objects In many places, we refer to objects via their SHA-1s. Let's abstract that into a function. For the moment, it does nothing else than what we did previously: print out the 40-digit hex string. But that will change over the course of the next patches. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-07-17 12:59:44 +02:00			`static const char describe_object(struct object obj)`
			`{`
fsck: optionally show more helpful info for broken links When reporting broken links between commits/trees/blobs, it would be quite helpful at times if the user would be told how the object is supposed to be reachable. With the new --name-objects option, git-fsck will try to do exactly that: name the objects in a way that shows how they are reachable. For example, when some reflog got corrupted and a blob is missing that should not be, the user might want to remove the corresponding reflog entry. This option helps them find that entry: `git fsck` will now report something like this: broken link from tree b5eb6ff... (refs/stash@{<date>}~37:) to blob ec5cf80... Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-07-17 13:00:02 +02:00			`static struct strbuf buf = STRBUF_INIT;`
			`char *name = name_objects ?`
			`lookup_decoration(fsck_walk_options.object_names, obj) : NULL;`

			`strbuf_reset(&buf);`
			`strbuf_addstr(&buf, oid_to_hex(&obj->oid));`
			`if (name)`
			`strbuf_addf(&buf, " (%s)", name);`

			`return buf.buf;`
fsck: refactor how to describe objects In many places, we refer to objects via their SHA-1s. Let's abstract that into a function. For the moment, it does nothing else than what we did previously: print out the 40-digit hex string. But that will change over the course of the next patches. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-07-17 12:59:44 +02:00			`}`

fsck: move typename() printing to its own function When an object has a problem, we mention its type. But we do so by feeding the result of typename() directly to fprintf(). This is potentially dangerous because typename() can return NULL for some type values (like OBJ_NONE). It's doubtful that this can be triggered in practice with the current code, so this is probably not fixing a bug. But it future-proofs us against modifications that make things like OBJ_NONE more likely (and gives future patches a central point to handle them). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-01-26 05:11:00 +01:00			`static const char printable_type(struct object obj)`
			`{`
			`const char *ret;`

fsck: lazily load types under --connectivity-only The recent fixes to "fsck --connectivity-only" load all of the objects with their correct types. This keeps the connectivity-only code path close to the regular one, but it also introduces some unnecessary inefficiency. While getting the type of an object is cheap compared to actually opening and parsing the object (as the non-connectivity-only case would do), it's still not free. For reachable non-blob objects, we end up having to parse them later anyway (to see what they point to), making our type lookup here redundant. For unreachable objects, we might never hit them at all in the reachability traversal, making the lookup completely wasted. And in some cases, we might have quite a few unreachable objects (e.g., when alternates are used for shared object storage between repositories, it's normal for there to be objects reachable from other repositories but not the one running fsck). The comment in mark_object_for_connectivity() claims two benefits to getting the type up front: 1. We need to know the types during fsck_walk(). (And not explicitly mentioned, but we also need them when printing the types of broken or dangling commits). We can address this by lazy-loading the types as necessary. Most objects never need this lazy-load at all, because they fall into one of these categories: a. Reachable from our tips, and are coerced into the correct type as we traverse (e.g., a parent link will call lookup_commit(), which converts OBJ_NONE to OBJ_COMMIT). b. Unreachable, but not at the tip of a chunk of unreachable history. We only mention the tips as "dangling", so an unreachable commit which links to hundreds of other objects needs only report the type of the tip commit. 2. It serves as a cross-check that the coercion in (1a) is correct (i.e., we'll complain about a parent link that points to a blob). But we get most of this for free already, because right after coercing, we'll parse any non-blob objects. So we'd notice then if we expected a commit and got a blob. The one exception is when we expect a blob, in which case we never actually read the object contents. So this is a slight weakening, but given that the whole point of --connectivity-only is to sacrifice some data integrity checks for speed, this seems like an acceptable tradeoff. Here are before and after timings for an extreme case with ~5M reachable objects and another ~12M unreachable (it's the torvalds/linux repository on GitHub, connected to shared storage for all of the other kernel forks): [before] $ time git fsck --no-dangling --connectivity-only real 3m4.323s user 1m25.121s sys 1m38.710s [after] $ time git fsck --no-dangling --connectivity-only real 0m51.497s user 0m49.575s sys 0m1.776s Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-01-26 05:12:07 +01:00			`if (obj->type == OBJ_NONE) {`
			`enum object_type type = sha1_object_info(obj->oid.hash, NULL);`
			`if (type > 0)`
			`object_as_type(obj, type, 0);`
			`}`

fsck: move typename() printing to its own function When an object has a problem, we mention its type. But we do so by feeding the result of typename() directly to fprintf(). This is potentially dangerous because typename() can return NULL for some type values (like OBJ_NONE). It's doubtful that this can be triggered in practice with the current code, so this is probably not fixing a bug. But it future-proofs us against modifications that make things like OBJ_NONE more likely (and gives future patches a central point to handle them). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-01-26 05:11:00 +01:00			`ret = typename(obj->type);`
			`if (!ret)`
			`ret = "unknown";`

			`return ret;`
			`}`

fsck: support demoting errors to warnings We already have support in `git receive-pack` to deal with some legacy repositories which have non-fatal issues. Let's make `git fsck` itself useful with such repositories, too, by allowing users to ignore known issues, or at least demote those issues to mere warnings. Example: `git -c fsck.missingEmail=ignore fsck` would hide problems with missing emails in author, committer and tagger lines. In the same spirit that `git receive-pack`'s usage of the fsck machinery differs from `git fsck`'s – some of the non-fatal warnings in `git fsck` are fatal with `git receive-pack` when receive.fsckObjects = true, for example – we strictly separate the fsck.<msg-id> from the receive.fsck.<msg-id> settings. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-06-22 17:27:06 +02:00			`static int fsck_config(const char var, const char value, void *cb)`
[PATCH] Make the git-fsck-objects diagnostics more useful Actually report what exactly is wrong with the object, instead of an ambiguous 'bad sha1 file' or such. In places where we already do, unify the format and clean the messages up. Signed-off-by: Petr Baudis <pasky@suse.cz> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-09-20 20:56:05 +02:00			`{`
fsck: support ignoring objects in `git fsck` via fsck.skiplist Identical to support in `git receive-pack for the config option `receive.fsck.skiplist`, we now support ignoring given objects in `git fsck` via `fsck.skiplist` altogether. This is extremely handy in case of legacy repositories where it would cause more pain to change incorrect objects than to live with them (e.g. a duplicate 'author' line in an early commit object). Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-06-22 17:27:23 +02:00			`if (strcmp(var, "fsck.skiplist") == 0) {`
			`const char *path;`
			`struct strbuf sb = STRBUF_INIT;`

			`if (git_config_pathname(&path, var, value))`
			`return 1;`
			`strbuf_addf(&sb, "skiplist=%s", path);`
			`free((char *)path);`
			`fsck_set_msg_types(&fsck_obj_options, sb.buf);`
			`strbuf_release(&sb);`
			`return 0;`
			`}`

fsck: support demoting errors to warnings We already have support in `git receive-pack` to deal with some legacy repositories which have non-fatal issues. Let's make `git fsck` itself useful with such repositories, too, by allowing users to ignore known issues, or at least demote those issues to mere warnings. Example: `git -c fsck.missingEmail=ignore fsck` would hide problems with missing emails in author, committer and tagger lines. In the same spirit that `git receive-pack`'s usage of the fsck machinery differs from `git fsck`'s – some of the non-fatal warnings in `git fsck` are fatal with `git receive-pack` when receive.fsckObjects = true, for example – we strictly separate the fsck.<msg-id> from the receive.fsck.<msg-id> settings. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-06-22 17:27:06 +02:00			`if (skip_prefix(var, "fsck.", &var)) {`
			`fsck_set_msg_type(&fsck_obj_options, var, value);`
			`return 0;`
			`}`

			`return git_default_config(var, value, cb);`
			`}`

fsck: introduce identifiers for fsck messages Instead of specifying whether a message by the fsck machinery constitutes an error or a warning, let's specify an identifier relating to the concrete problem that was encountered. This is necessary for upcoming support to be able to demote certain errors to warnings. In the process, simplify the requirements on the calling code: instead of having to handle full-blown varargs in every callback, we now send a string buffer ready to be used by the callback. We could use a simple enum for the message IDs here, but we want to guarantee that the enum values are associated with the appropriate message types (i.e. error or warning?). Besides, we want to introduce a parser in the next commit that maps the string representation to the enum value, hence we use the slightly ugly preprocessor construct that is extensible for use with said parser. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-06-22 17:25:09 +02:00			`static void objreport(struct object obj, const char msg_type,`
			`const char *err)`
[PATCH] Make the git-fsck-objects diagnostics more useful Actually report what exactly is wrong with the object, instead of an ambiguous 'bad sha1 file' or such. In places where we already do, unify the format and clean the messages up. Signed-off-by: Petr Baudis <pasky@suse.cz> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-09-20 20:56:05 +02:00			`{`
fsck: introduce identifiers for fsck messages Instead of specifying whether a message by the fsck machinery constitutes an error or a warning, let's specify an identifier relating to the concrete problem that was encountered. This is necessary for upcoming support to be able to demote certain errors to warnings. In the process, simplify the requirements on the calling code: instead of having to handle full-blown varargs in every callback, we now send a string buffer ready to be used by the callback. We could use a simple enum for the message IDs here, but we want to guarantee that the enum values are associated with the appropriate message types (i.e. error or warning?). Besides, we want to introduce a parser in the next commit that maps the string representation to the enum value, hence we use the slightly ugly preprocessor construct that is extensible for use with said parser. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-06-22 17:25:09 +02:00			`fprintf(stderr, "%s in %s %s: %s\n",`
fsck: move typename() printing to its own function When an object has a problem, we mention its type. But we do so by feeding the result of typename() directly to fprintf(). This is potentially dangerous because typename() can return NULL for some type values (like OBJ_NONE). It's doubtful that this can be triggered in practice with the current code, so this is probably not fixing a bug. But it future-proofs us against modifications that make things like OBJ_NONE more likely (and gives future patches a central point to handle them). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-01-26 05:11:00 +01:00			`msg_type, printable_type(obj), describe_object(obj), err);`
[PATCH] Make the git-fsck-objects diagnostics more useful Actually report what exactly is wrong with the object, instead of an ambiguous 'bad sha1 file' or such. In places where we already do, unify the format and clean the messages up. Signed-off-by: Petr Baudis <pasky@suse.cz> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-09-20 20:56:05 +02:00			`}`

fsck: introduce identifiers for fsck messages Instead of specifying whether a message by the fsck machinery constitutes an error or a warning, let's specify an identifier relating to the concrete problem that was encountered. This is necessary for upcoming support to be able to demote certain errors to warnings. In the process, simplify the requirements on the calling code: instead of having to handle full-blown varargs in every callback, we now send a string buffer ready to be used by the callback. We could use a simple enum for the message IDs here, but we want to guarantee that the enum values are associated with the appropriate message types (i.e. error or warning?). Besides, we want to introduce a parser in the next commit that maps the string representation to the enum value, hence we use the slightly ugly preprocessor construct that is extensible for use with said parser. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-06-22 17:25:09 +02:00			`static int objerror(struct object obj, const char err)`
[PATCH] Make the git-fsck-objects diagnostics more useful Actually report what exactly is wrong with the object, instead of an ambiguous 'bad sha1 file' or such. In places where we already do, unify the format and clean the messages up. Signed-off-by: Petr Baudis <pasky@suse.cz> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-09-20 20:56:05 +02:00			`{`
fsck: exit with non-zero status upon errors git-fsck always exited with status 0, which was a bit sloppy. This makes it exit with a non-zero status when errors are found. The error code is an OR'ed result of: 1 if corrupted objects are found. 2 if objects that are ought to be reachable are missing or corrupt. For example, it would exit with 1 in a repository with an unreachable corrupt object. If a tree object of the HEAD commit is corrupt, you would get 3. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-05 09:22:06 +01:00			`errors_found \|= ERROR_OBJECT;`
fsck: introduce identifiers for fsck messages Instead of specifying whether a message by the fsck machinery constitutes an error or a warning, let's specify an identifier relating to the concrete problem that was encountered. This is necessary for upcoming support to be able to demote certain errors to warnings. In the process, simplify the requirements on the calling code: instead of having to handle full-blown varargs in every callback, we now send a string buffer ready to be used by the callback. We could use a simple enum for the message IDs here, but we want to guarantee that the enum values are associated with the appropriate message types (i.e. error or warning?). Besides, we want to introduce a parser in the next commit that maps the string representation to the enum value, hence we use the slightly ugly preprocessor construct that is extensible for use with said parser. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-06-22 17:25:09 +02:00			`objreport(obj, "error", err);`
[PATCH] Make the git-fsck-objects diagnostics more useful Actually report what exactly is wrong with the object, instead of an ambiguous 'bad sha1 file' or such. In places where we already do, unify the format and clean the messages up. Signed-off-by: Petr Baudis <pasky@suse.cz> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-09-20 20:56:05 +02:00			`return -1;`
			`}`

fsck: give the error function a chance to see the fsck_options We will need this in the next commit, where fsck will be taught to optionally name the objects when reporting issues about them. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-07-17 12:59:57 +02:00			`static int fsck_error_func(struct fsck_options *o,`
			`struct object obj, int type, const char message)`
[PATCH] Make the git-fsck-objects diagnostics more useful Actually report what exactly is wrong with the object, instead of an ambiguous 'bad sha1 file' or such. In places where we already do, unify the format and clean the messages up. Signed-off-by: Petr Baudis <pasky@suse.cz> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-09-20 20:56:05 +02:00			`{`
fsck: introduce identifiers for fsck messages Instead of specifying whether a message by the fsck machinery constitutes an error or a warning, let's specify an identifier relating to the concrete problem that was encountered. This is necessary for upcoming support to be able to demote certain errors to warnings. In the process, simplify the requirements on the calling code: instead of having to handle full-blown varargs in every callback, we now send a string buffer ready to be used by the callback. We could use a simple enum for the message IDs here, but we want to guarantee that the enum values are associated with the appropriate message types (i.e. error or warning?). Besides, we want to introduce a parser in the next commit that maps the string representation to the enum value, hence we use the slightly ugly preprocessor construct that is extensible for use with said parser. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-06-22 17:25:09 +02:00			`objreport(obj, (type == FSCK_WARN) ? "warning" : "error", message);`
builtin-fsck: move common object checking code to fsck.c Signed-off-by: Martin Koegler <mkoegler@auto.tuwien.ac.at> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-02-25 22:46:08 +01:00			`return (type == FSCK_WARN) ? 0 : 1;`
[PATCH] Make the git-fsck-objects diagnostics more useful Actually report what exactly is wrong with the object, instead of an ambiguous 'bad sha1 file' or such. In places where we already do, unify the format and clean the messages up. Signed-off-by: Petr Baudis <pasky@suse.cz> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-09-20 20:56:05 +02:00			`}`

fsck: reduce stack footprint The logic to mark all objects that are reachable from tips of refs were implemented as a set of recursive functions. In a repository with a deep enough history, this can easily eat up all the available stack space. Restructure the code to require less stackspace by using an object array to keep track of the objects that still need to be processed. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-12-11 04:44:37 +01:00			`static struct object_array pending;`

fsck: introduce fsck options Just like the diff machinery, we are about to introduce more settings, therefore it makes sense to carry them around as a (pointer to a) struct containing all of them. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-06-22 17:25:00 +02:00			`static int mark_object(struct object obj, int type, void data, struct fsck_options *options)`
builtin-fsck: move away from object-refs to fsck_walk Signed-off-by: Martin Koegler <mkoegler@auto.tuwien.ac.at> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-02-25 22:46:05 +01:00			`{`
			`struct object *parent = data;`

fsck: drop unused parameter from traverse_one_object() Also add comments to seemingly unsafe pointer dereferences, that are all safe. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-01-26 21:46:55 +01:00			`/*`
			`* The only case data is NULL or type is OBJ_ANY is when`
			`* mark_object_reachable() calls us. All the callers of`
			`* that function has non-NULL obj hence ...`
			`*/`
builtin-fsck: move away from object-refs to fsck_walk Signed-off-by: Martin Koegler <mkoegler@auto.tuwien.ac.at> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-02-25 22:46:05 +01:00			`if (!obj) {`
fsck: drop unused parameter from traverse_one_object() Also add comments to seemingly unsafe pointer dereferences, that are all safe. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-01-26 21:46:55 +01:00			`/* ... these references to parent->fld are safe here */`
builtin-fsck: move away from object-refs to fsck_walk Signed-off-by: Martin Koegler <mkoegler@auto.tuwien.ac.at> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-02-25 22:46:05 +01:00			`printf("broken link from %7s %s\n",`
fsck: move typename() printing to its own function When an object has a problem, we mention its type. But we do so by feeding the result of typename() directly to fprintf(). This is potentially dangerous because typename() can return NULL for some type values (like OBJ_NONE). It's doubtful that this can be triggered in practice with the current code, so this is probably not fixing a bug. But it future-proofs us against modifications that make things like OBJ_NONE more likely (and gives future patches a central point to handle them). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-01-26 05:11:00 +01:00			`printable_type(parent), describe_object(parent));`
builtin-fsck: move away from object-refs to fsck_walk Signed-off-by: Martin Koegler <mkoegler@auto.tuwien.ac.at> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-02-25 22:46:05 +01:00			`printf("broken link from %7s %s\n",`
			`(type == OBJ_ANY ? "unknown" : typename(type)), "unknown");`
			`errors_found \|= ERROR_REACHABLE;`
			`return 1;`
			`}`

			`if (type != OBJ_ANY && obj->type != type)`
fsck: drop unused parameter from traverse_one_object() Also add comments to seemingly unsafe pointer dereferences, that are all safe. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-01-26 21:46:55 +01:00			`/* ... and the reference to parent is safe here */`
builtin-fsck: move away from object-refs to fsck_walk Signed-off-by: Martin Koegler <mkoegler@auto.tuwien.ac.at> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-02-25 22:46:05 +01:00			`objerror(parent, "wrong object type in link");`

			`if (obj->flags & REACHABLE)`
			`return 0;`
			`obj->flags \|= REACHABLE;`
clear parsed flag when we free tree buffers Many code paths will free a tree object's buffer and set it to NULL after finishing with it in order to keep memory usage down during a traversal. However, out of 8 sites that do this, only one actually unsets the "parsed" flag back. Those sites that don't are setting a trap for later users of the tree object; even after calling parse_tree, the buffer will remain NULL, causing potential segfaults. It is not known whether this is triggerable in the current code. Most commands do not do an in-memory traversal followed by actually using the objects again. However, it does not hurt to be safe for future callers. In most cases, we can abstract this out to a "free_tree_buffer" helper. However, there are two exceptions: 1. The fsck code relies on the parsed flag to know that we were able to parse the object at one point. We can switch this to using a flag in the "flags" field. 2. The index-pack code sets the buffer to NULL but does not free it (it is freed by a caller). We should still unset the parsed flag here, but we cannot use our helper, as we do not want to free the buffer. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-06-06 00:37:39 +02:00			`if (!(obj->flags & HAS_OBJ)) {`
Convert struct object to object_id struct object is one of the major data structures dealing with object IDs. Convert it to use struct object_id instead of an unsigned char array. Convert get_object_hash to refer to the new member as well. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Jeff King <peff@peff.net> 2015-11-10 03:22:28 +01:00			`if (parent && !has_object_file(&obj->oid)) {`
builtin-fsck: move away from object-refs to fsck_walk Signed-off-by: Martin Koegler <mkoegler@auto.tuwien.ac.at> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-02-25 22:46:05 +01:00			`printf("broken link from %7s %s\n",`
fsck: move typename() printing to its own function When an object has a problem, we mention its type. But we do so by feeding the result of typename() directly to fprintf(). This is potentially dangerous because typename() can return NULL for some type values (like OBJ_NONE). It's doubtful that this can be triggered in practice with the current code, so this is probably not fixing a bug. But it future-proofs us against modifications that make things like OBJ_NONE more likely (and gives future patches a central point to handle them). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-01-26 05:11:00 +01:00			`printable_type(parent), describe_object(parent));`
builtin-fsck: move away from object-refs to fsck_walk Signed-off-by: Martin Koegler <mkoegler@auto.tuwien.ac.at> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-02-25 22:46:05 +01:00			`printf(" to %7s %s\n",`
fsck: move typename() printing to its own function When an object has a problem, we mention its type. But we do so by feeding the result of typename() directly to fprintf(). This is potentially dangerous because typename() can return NULL for some type values (like OBJ_NONE). It's doubtful that this can be triggered in practice with the current code, so this is probably not fixing a bug. But it future-proofs us against modifications that make things like OBJ_NONE more likely (and gives future patches a central point to handle them). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-01-26 05:11:00 +01:00			`printable_type(obj), describe_object(obj));`
builtin-fsck: move away from object-refs to fsck_walk Signed-off-by: Martin Koegler <mkoegler@auto.tuwien.ac.at> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-02-25 22:46:05 +01:00			`errors_found \|= ERROR_REACHABLE;`
			`}`
			`return 1;`
			`}`

fsck: don't put a void-shaped peg in a char-shaped hole The source of this nonsense was 04d3975937 fsck: reduce stack footprint , which wedged a pointer to parent into the object_array_entry's name field. The parent pointer was passed to traverse_one_object(), even though that function didn't use it. The useless code has been deleted over time. Commit a1cdc25172 fsck: drop unused parameter from traverse_one_object() removed the parent pointer from traverse_one_object()'s signature. Commit c0aa335c95 Remove unused variables removed the code that read the parent pointer back out of the name field. This commit takes the last step: don't write the parent pointer into the name field in the first place. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-05-25 11:08:11 +02:00			`add_object_array(obj, NULL, &pending);`
fsck: reduce stack footprint The logic to mark all objects that are reachable from tips of refs were implemented as a set of recursive functions. In a repository with a deep enough history, this can easily eat up all the available stack space. Restructure the code to require less stackspace by using an object array to keep track of the objects that still need to be processed. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-12-11 04:44:37 +01:00			`return 0;`
			`}`

			`static void mark_object_reachable(struct object *obj)`
			`{`
fsck: introduce fsck options Just like the diff machinery, we are about to introduce more settings, therefore it makes sense to carry them around as a (pointer to a) struct containing all of them. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-06-22 17:25:00 +02:00			`mark_object(obj, OBJ_ANY, NULL, NULL);`
fsck: reduce stack footprint The logic to mark all objects that are reachable from tips of refs were implemented as a set of recursive functions. In a repository with a deep enough history, this can easily eat up all the available stack space. Restructure the code to require less stackspace by using an object array to keep track of the objects that still need to be processed. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-12-11 04:44:37 +01:00			`}`

fsck: drop unused parameter from traverse_one_object() Also add comments to seemingly unsafe pointer dereferences, that are all safe. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-01-26 21:46:55 +01:00			`static int traverse_one_object(struct object *obj)`
fsck: reduce stack footprint The logic to mark all objects that are reachable from tips of refs were implemented as a set of recursive functions. In a repository with a deep enough history, this can easily eat up all the available stack space. Restructure the code to require less stackspace by using an object array to keep track of the objects that still need to be processed. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-12-11 04:44:37 +01:00			`{`
			`int result;`
			`struct tree *tree = NULL;`

builtin-fsck: move away from object-refs to fsck_walk Signed-off-by: Martin Koegler <mkoegler@auto.tuwien.ac.at> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-02-25 22:46:05 +01:00			`if (obj->type == OBJ_TREE) {`
			`tree = (struct tree *)obj;`
			`if (parse_tree(tree) < 0)`
			`return 1; /* error already displayed */`
			`}`
fsck: introduce fsck options Just like the diff machinery, we are about to introduce more settings, therefore it makes sense to carry them around as a (pointer to a) struct containing all of them. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-06-22 17:25:00 +02:00			`result = fsck_walk(obj, obj, &fsck_walk_options);`
clear parsed flag when we free tree buffers Many code paths will free a tree object's buffer and set it to NULL after finishing with it in order to keep memory usage down during a traversal. However, out of 8 sites that do this, only one actually unsets the "parsed" flag back. Those sites that don't are setting a trap for later users of the tree object; even after calling parse_tree, the buffer will remain NULL, causing potential segfaults. It is not known whether this is triggerable in the current code. Most commands do not do an in-memory traversal followed by actually using the objects again. However, it does not hurt to be safe for future callers. In most cases, we can abstract this out to a "free_tree_buffer" helper. However, there are two exceptions: 1. The fsck code relies on the parsed flag to know that we were able to parse the object at one point. We can switch this to using a flag in the "flags" field. 2. The index-pack code sets the buffer to NULL but does not free it (it is freed by a caller). We should still unset the parsed flag here, but we cannot use our helper, as we do not want to free the buffer. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-06-06 00:37:39 +02:00			`if (tree)`
			`free_tree_buffer(tree);`
builtin-fsck: move away from object-refs to fsck_walk Signed-off-by: Martin Koegler <mkoegler@auto.tuwien.ac.at> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-02-25 22:46:05 +01:00			`return result;`
			`}`

fsck: reduce stack footprint The logic to mark all objects that are reachable from tips of refs were implemented as a set of recursive functions. In a repository with a deep enough history, this can easily eat up all the available stack space. Restructure the code to require less stackspace by using an object array to keep track of the objects that still need to be processed. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-12-11 04:44:37 +01:00			`static int traverse_reachable(void)`
builtin-fsck: move away from object-refs to fsck_walk Signed-off-by: Martin Koegler <mkoegler@auto.tuwien.ac.at> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-02-25 22:46:05 +01:00			`{`
fsck: print progress fsck is usually a long process and it would be nice if it prints progress from time to time. Progress meter is not printed when --verbose is given because --verbose prints a lot, there's no need for "alive" indicator. Progress meter may provide "% complete" information but it would be lost anyway in the flood of text. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-11-07 03:59:26 +01:00			`struct progress *progress = NULL;`
			`unsigned int nr = 0;`
fsck: reduce stack footprint The logic to mark all objects that are reachable from tips of refs were implemented as a set of recursive functions. In a repository with a deep enough history, this can easily eat up all the available stack space. Restructure the code to require less stackspace by using an object array to keep track of the objects that still need to be processed. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-12-11 04:44:37 +01:00			`int result = 0;`
fsck: print progress fsck is usually a long process and it would be nice if it prints progress from time to time. Progress meter is not printed when --verbose is given because --verbose prints a lot, there's no need for "alive" indicator. Progress meter may provide "% complete" information but it would be lost anyway in the flood of text. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-11-07 03:59:26 +01:00			`if (show_progress)`
i18n: mark all progress lines for translation Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-02-21 13:50:18 +01:00			`progress = start_progress_delay(_("Checking connectivity"), 0, 0, 2);`
fsck: reduce stack footprint The logic to mark all objects that are reachable from tips of refs were implemented as a set of recursive functions. In a repository with a deep enough history, this can easily eat up all the available stack space. Restructure the code to require less stackspace by using an object array to keep track of the objects that still need to be processed. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-12-11 04:44:37 +01:00			`while (pending.nr) {`
			`struct object_array_entry *entry;`
Remove unused variables Noticed by gcc 4.6.0. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-03-22 13:50:08 +01:00			`struct object *obj;`
fsck: reduce stack footprint The logic to mark all objects that are reachable from tips of refs were implemented as a set of recursive functions. In a repository with a deep enough history, this can easily eat up all the available stack space. Restructure the code to require less stackspace by using an object array to keep track of the objects that still need to be processed. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-12-11 04:44:37 +01:00
			`entry = pending.objects + --pending.nr;`
			`obj = entry->item;`
fsck: drop unused parameter from traverse_one_object() Also add comments to seemingly unsafe pointer dereferences, that are all safe. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-01-26 21:46:55 +01:00			`result \|= traverse_one_object(obj);`
fsck: print progress fsck is usually a long process and it would be nice if it prints progress from time to time. Progress meter is not printed when --verbose is given because --verbose prints a lot, there's no need for "alive" indicator. Progress meter may provide "% complete" information but it would be lost anyway in the flood of text. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-11-07 03:59:26 +01:00			`display_progress(progress, ++nr);`
fsck: reduce stack footprint The logic to mark all objects that are reachable from tips of refs were implemented as a set of recursive functions. In a repository with a deep enough history, this can easily eat up all the available stack space. Restructure the code to require less stackspace by using an object array to keep track of the objects that still need to be processed. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-12-11 04:44:37 +01:00			`}`
fsck: print progress fsck is usually a long process and it would be nice if it prints progress from time to time. Progress meter is not printed when --verbose is given because --verbose prints a lot, there's no need for "alive" indicator. Progress meter may provide "% complete" information but it would be lost anyway in the flood of text. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-11-07 03:59:26 +01:00			`stop_progress(&progress);`
fsck: reduce stack footprint The logic to mark all objects that are reachable from tips of refs were implemented as a set of recursive functions. In a repository with a deep enough history, this can easily eat up all the available stack space. Restructure the code to require less stackspace by using an object array to keep track of the objects that still need to be processed. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-12-11 04:44:37 +01:00			`return !!result;`
builtin-fsck: move away from object-refs to fsck_walk Signed-off-by: Martin Koegler <mkoegler@auto.tuwien.ac.at> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-02-25 22:46:05 +01:00			`}`

fsck: introduce fsck options Just like the diff machinery, we are about to introduce more settings, therefore it makes sense to carry them around as a (pointer to a) struct containing all of them. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-06-22 17:25:00 +02:00			`static int mark_used(struct object obj, int type, void data, struct fsck_options *options)`
builtin-fsck: move away from object-refs to fsck_walk Signed-off-by: Martin Koegler <mkoegler@auto.tuwien.ac.at> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-02-25 22:46:05 +01:00			`{`
			`if (!obj)`
			`return 1;`
			`obj->used = 1;`
			`return 0;`
[PATCH] Make the git-fsck-objects diagnostics more useful Actually report what exactly is wrong with the object, instead of an ambiguous 'bad sha1 file' or such. In places where we already do, unify the format and clean the messages up. Signed-off-by: Petr Baudis <pasky@suse.cz> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-09-20 20:56:05 +02:00			`}`

fsck-objects: refactor checking for connectivity This separates the connectivity check into separate codepaths, one for reachable objects and the other for unreachable ones, while adding a lot of comments to explain what is going on. When checking an unreachable object, unlike a reachable one, we do not have to complain if it does not exist (we used to complain about a missing blob even when the only thing that references it is a tree that is dangling). Also we do not have to check and complain about objects that are referenced by an unreachable object. This makes the messages from fsck-objects a lot less noisy and more useful. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-22 07:26:41 +01:00			`/*`
			`* Check a single reachable object`
			`*/`
			`static void check_reachable_object(struct object *obj)`
			`{`
			`/*`
			`* We obviously want the object to be parsed,`
			`* except if it was in a pack-file and we didn't`
			`* do a full fsck`
			`*/`
clear parsed flag when we free tree buffers Many code paths will free a tree object's buffer and set it to NULL after finishing with it in order to keep memory usage down during a traversal. However, out of 8 sites that do this, only one actually unsets the "parsed" flag back. Those sites that don't are setting a trap for later users of the tree object; even after calling parse_tree, the buffer will remain NULL, causing potential segfaults. It is not known whether this is triggerable in the current code. Most commands do not do an in-memory traversal followed by actually using the objects again. However, it does not hurt to be safe for future callers. In most cases, we can abstract this out to a "free_tree_buffer" helper. However, there are two exceptions: 1. The fsck code relies on the parsed flag to know that we were able to parse the object at one point. We can switch this to using a flag in the "flags" field. 2. The index-pack code sets the buffer to NULL but does not free it (it is freed by a caller). We should still unset the parsed flag here, but we cannot use our helper, as we do not want to free the buffer. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-06-06 00:37:39 +02:00			`if (!(obj->flags & HAS_OBJ)) {`
Remove get_object_hash. Convert all instances of get_object_hash to use an appropriate reference to the hash member of the oid member of struct object. This provides no functional change, as it is essentially a macro substitution. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Jeff King <peff@peff.net> 2015-11-10 03:22:29 +01:00			`if (has_sha1_pack(obj->oid.hash))`
fsck-objects: refactor checking for connectivity This separates the connectivity check into separate codepaths, one for reachable objects and the other for unreachable ones, while adding a lot of comments to explain what is going on. When checking an unreachable object, unlike a reachable one, we do not have to complain if it does not exist (we used to complain about a missing blob even when the only thing that references it is a tree that is dangling). Also we do not have to check and complain about objects that are referenced by an unreachable object. This makes the messages from fsck-objects a lot less noisy and more useful. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-22 07:26:41 +01:00			`return; /* it is in pack - forget about it */`
fsck: move typename() printing to its own function When an object has a problem, we mention its type. But we do so by feeding the result of typename() directly to fprintf(). This is potentially dangerous because typename() can return NULL for some type values (like OBJ_NONE). It's doubtful that this can be triggered in practice with the current code, so this is probably not fixing a bug. But it future-proofs us against modifications that make things like OBJ_NONE more likely (and gives future patches a central point to handle them). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-01-26 05:11:00 +01:00			`printf("missing %s %s\n", printable_type(obj),`
fsck: refactor how to describe objects In many places, we refer to objects via their SHA-1s. Let's abstract that into a function. For the moment, it does nothing else than what we did previously: print out the 40-digit hex string. But that will change over the course of the next patches. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-07-17 12:59:44 +02:00			`describe_object(obj));`
fsck: exit with non-zero status upon errors git-fsck always exited with status 0, which was a bit sloppy. This makes it exit with a non-zero status when errors are found. The error code is an OR'ed result of: 1 if corrupted objects are found. 2 if objects that are ought to be reachable are missing or corrupt. For example, it would exit with 1 in a repository with an unreachable corrupt object. If a tree object of the HEAD commit is corrupt, you would get 3. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-05 09:22:06 +01:00			`errors_found \|= ERROR_REACHABLE;`
fsck-objects: refactor checking for connectivity This separates the connectivity check into separate codepaths, one for reachable objects and the other for unreachable ones, while adding a lot of comments to explain what is going on. When checking an unreachable object, unlike a reachable one, we do not have to complain if it does not exist (we used to complain about a missing blob even when the only thing that references it is a tree that is dangling). Also we do not have to check and complain about objects that are referenced by an unreachable object. This makes the messages from fsck-objects a lot less noisy and more useful. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-22 07:26:41 +01:00			`return;`
			`}`
			`}`

			`/*`
			`* Check a single unreachable object`
			`*/`
			`static void check_unreachable_object(struct object *obj)`
			`{`
			`/*`
			`* Missing unreachable object? Ignore it. It's not like`
			`* we miss it (since it can't be reached), nor do we want`
			`* to complain about it being unreachable (since it does`
			`* not exist).`
			`*/`
fsck: report trees as dangling After checking connectivity, fsck looks through the list of any objects we've seen mentioned, and reports unreachable and un-"used" ones as dangling. However, it skips any object which is not marked as "parsed", as that is an object that we _don't_ have (but that somebody mentioned). Since 6e454b9a3 (clear parsed flag when we free tree buffers, 2013-06-05), that flag can't be relied on, and the correct method is to check the HAS_OBJ flag. The cleanup in that commit missed this callsite, though. As a result, we would generally fail to report dangling trees. We never noticed because there were no tests in this area (for trees or otherwise). Let's add some. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-01-16 22:25:35 +01:00			`if (!(obj->flags & HAS_OBJ))`
fsck-objects: refactor checking for connectivity This separates the connectivity check into separate codepaths, one for reachable objects and the other for unreachable ones, while adding a lot of comments to explain what is going on. When checking an unreachable object, unlike a reachable one, we do not have to complain if it does not exist (we used to complain about a missing blob even when the only thing that references it is a tree that is dangling). Also we do not have to check and complain about objects that are referenced by an unreachable object. This makes the messages from fsck-objects a lot less noisy and more useful. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-22 07:26:41 +01:00			`return;`

			`/*`
			`* Unreachable object that exists? Show it if asked to,`
			`* since this is something that is prunable.`
			`*/`
			`if (show_unreachable) {`
fsck: move typename() printing to its own function When an object has a problem, we mention its type. But we do so by feeding the result of typename() directly to fprintf(). This is potentially dangerous because typename() can return NULL for some type values (like OBJ_NONE). It's doubtful that this can be triggered in practice with the current code, so this is probably not fixing a bug. But it future-proofs us against modifications that make things like OBJ_NONE more likely (and gives future patches a central point to handle them). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-01-26 05:11:00 +01:00			`printf("unreachable %s %s\n", printable_type(obj),`
fsck: refactor how to describe objects In many places, we refer to objects via their SHA-1s. Let's abstract that into a function. For the moment, it does nothing else than what we did previously: print out the 40-digit hex string. But that will change over the course of the next patches. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-07-17 12:59:44 +02:00			`describe_object(obj));`
fsck-objects: refactor checking for connectivity This separates the connectivity check into separate codepaths, one for reachable objects and the other for unreachable ones, while adding a lot of comments to explain what is going on. When checking an unreachable object, unlike a reachable one, we do not have to complain if it does not exist (we used to complain about a missing blob even when the only thing that references it is a tree that is dangling). Also we do not have to check and complain about objects that are referenced by an unreachable object. This makes the messages from fsck-objects a lot less noisy and more useful. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-22 07:26:41 +01:00			`return;`
			`}`

			`/*`
			`* "!used" means that nothing at all points to it, including`
Assorted typo fixes Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-04 05:49:16 +01:00			`* other unreachable objects. In other words, it's the "tip"`
fsck-objects: refactor checking for connectivity This separates the connectivity check into separate codepaths, one for reachable objects and the other for unreachable ones, while adding a lot of comments to explain what is going on. When checking an unreachable object, unlike a reachable one, we do not have to complain if it does not exist (we used to complain about a missing blob even when the only thing that references it is a tree that is dangling). Also we do not have to check and complain about objects that are referenced by an unreachable object. This makes the messages from fsck-objects a lot less noisy and more useful. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-22 07:26:41 +01:00			`* of some set of unreachable objects, usually a commit that`
			`* got dropped.`
			`*`
			`* Such starting points are more interesting than some random`
			`* set of unreachable objects, so we show them even if the user`
			`* hasn't asked for _all_ unreachable objects. If you have`
			`* deleted a branch by mistake, this is a prime candidate to`
			`* start looking at, for example.`
			`*/`
			`if (!obj->used) {`
fsck: --no-dangling omits "dangling object" information The default output from "fsck" is often overwhelmed by informational message on dangling objects, especially if you do not repack often, and a real error can easily be buried. Add "--no-dangling" option to omit them, and update the user manual to demonstrate its use. Based on a patch by Clemens Buchacher, but reverted the part to change the default to --no-dangling, which is unsuitable for the first patch. The usual three-step procedure to break the backward compatibility over time needs to happen on top of this, if we were to go in that direction. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-02-28 23:55:39 +01:00			`if (show_dangling)`
fsck: move typename() printing to its own function When an object has a problem, we mention its type. But we do so by feeding the result of typename() directly to fprintf(). This is potentially dangerous because typename() can return NULL for some type values (like OBJ_NONE). It's doubtful that this can be triggered in practice with the current code, so this is probably not fixing a bug. But it future-proofs us against modifications that make things like OBJ_NONE more likely (and gives future patches a central point to handle them). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-01-26 05:11:00 +01:00			`printf("dangling %s %s\n", printable_type(obj),`
fsck: refactor how to describe objects In many places, we refer to objects via their SHA-1s. Let's abstract that into a function. For the moment, it does nothing else than what we did previously: print out the 40-digit hex string. But that will change over the course of the next patches. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-07-17 12:59:44 +02:00			`describe_object(obj));`
git-fsck: add --lost-found option With this option, dangling objects are not only reported, but also written to .git/lost-found/commit/ or .git/lost-found/other/. This option implies '--full' and '--no-reflogs'. 'git fsck --lost-found' is meant as a replacement for git-lost-found. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-07-03 02:33:54 +02:00			`if (write_lost_and_found) {`
prefer git_pathdup to git_path in some possibly-dangerous cases Because git_path uses a static buffer that is shared with calls to git_path, mkpath, etc, it can be dangerous to assign the result to a variable or pass it to a non-trivial function. The value may change unexpectedly due to other calls. None of the cases changed here has a known bug, but they're worth converting away from git_path because: 1. It's easy to use git_pathdup in these cases. 2. They use constructs (like assignment) that make it hard to tell whether they're safe or not. The extra malloc overhead should be trivial, as an allocation should be an order of magnitude cheaper than a system call (which we are clearly about to make, since we are constructing a filename). The real cost is that we must remember to free the result. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-08-10 11:35:31 +02:00			`char *filename = git_pathdup("lost-found/%s/%s",`
git-fsck: add --lost-found option With this option, dangling objects are not only reported, but also written to .git/lost-found/commit/ or .git/lost-found/other/. This option implies '--full' and '--no-reflogs'. 'git fsck --lost-found' is meant as a replacement for git-lost-found. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-07-03 02:33:54 +02:00			`obj->type == OBJ_COMMIT ? "commit" : "other",`
fsck: refactor how to describe objects In many places, we refer to objects via their SHA-1s. Let's abstract that into a function. For the moment, it does nothing else than what we did previously: print out the 40-digit hex string. But that will change over the course of the next patches. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-07-17 12:59:44 +02:00			`describe_object(obj));`
git-fsck: add --lost-found option With this option, dangling objects are not only reported, but also written to .git/lost-found/commit/ or .git/lost-found/other/. This option implies '--full' and '--no-reflogs'. 'git fsck --lost-found' is meant as a replacement for git-lost-found. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-07-03 02:33:54 +02:00			`FILE *f;`

path.c: make get_pathname() call sites return const char * Before the previous commit, get_pathname returns an array of PATH_MAX length. Even if git_path() and similar functions does not use the whole array, git_path() caller can, in theory. After the commit, get_pathname() may return a buffer that has just enough room for the returned string and git_path() caller should never write beyond that. Make git_path(), mkpath() and git_path_submodule() return a const buffer to make sure callers do not write in it at all. This could have been part of the previous commit, but the "const" conversion is too much distraction from the core changes in path.c. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-11-30 09:24:27 +01:00			`if (safe_create_leading_directories_const(filename)) {`
git-fsck: add --lost-found option With this option, dangling objects are not only reported, but also written to .git/lost-found/commit/ or .git/lost-found/other/. This option implies '--full' and '--no-reflogs'. 'git fsck --lost-found' is meant as a replacement for git-lost-found. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-07-03 02:33:54 +02:00			`error("Could not create lost-found");`
prefer git_pathdup to git_path in some possibly-dangerous cases Because git_path uses a static buffer that is shared with calls to git_path, mkpath, etc, it can be dangerous to assign the result to a variable or pass it to a non-trivial function. The value may change unexpectedly due to other calls. None of the cases changed here has a known bug, but they're worth converting away from git_path because: 1. It's easy to use git_pathdup in these cases. 2. They use constructs (like assignment) that make it hard to tell whether they're safe or not. The extra malloc overhead should be trivial, as an allocation should be an order of magnitude cheaper than a system call (which we are clearly about to make, since we are constructing a filename). The real cost is that we must remember to free the result. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-08-10 11:35:31 +02:00			`free(filename);`
git-fsck: add --lost-found option With this option, dangling objects are not only reported, but also written to .git/lost-found/commit/ or .git/lost-found/other/. This option implies '--full' and '--no-reflogs'. 'git fsck --lost-found' is meant as a replacement for git-lost-found. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-07-03 02:33:54 +02:00			`return;`
			`}`
			`if (!(f = fopen(filename, "w")))`
Use die_errno() instead of die() when checking syscalls Lots of die() calls did not actually report the kind of error, which can leave the user confused as to the real problem. Use die_errno() where we check a system/library call that sets errno on failure, or one of the following that wrap such calls: Function Passes on error from -------- -------------------- odb_pack_keep open read_ancestry fopen read_in_full xread strbuf_read xread strbuf_read_file open or strbuf_read_file strbuf_readlink readlink write_in_full xwrite Signed-off-by: Thomas Rast <trast@student.ethz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-06-27 17:58:47 +02:00			`die_errno("Could not open '%s'", filename);`
fsck --lost-found: write blob's contents, not their SHA-1 When looking for a lost blob, it is much nicer to be able to grep through .git/lost-found/other/* than to write an inefficient loop over the file names. So write the contents of the dangling blobs, not their object names. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-07-22 22:20:26 +02:00			`if (obj->type == OBJ_BLOB) {`
streaming: make stream_blob_to_fd take struct object_id Since all of its callers have been updated, modify stream_blob_to_fd to take a struct object_id. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-09-05 22:07:59 +02:00			`if (stream_blob_to_fd(fileno(f), &obj->oid, NULL, 1))`
fsck: do not abort upon finding an empty blob Asking fwrite() to write one item of size bytes results in fwrite() reporting "I wrote zero item", when size is zero. Instead, we could ask it to write "size" items of 1 byte and expect it to report that "I wrote size items" when it succeeds, with any value of size, including zero. Noticed and reported by BJ Hargrave. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-09-12 03:03:38 +02:00			`die_errno("Could not write '%s'", filename);`
fsck --lost-found: write blob's contents, not their SHA-1 When looking for a lost blob, it is much nicer to be able to grep through .git/lost-found/other/* than to write an inefficient loop over the file names. So write the contents of the dangling blobs, not their object names. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-07-22 22:20:26 +02:00			`} else`
fsck: refactor how to describe objects In many places, we refer to objects via their SHA-1s. Let's abstract that into a function. For the moment, it does nothing else than what we did previously: print out the 40-digit hex string. But that will change over the course of the next patches. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-07-17 12:59:44 +02:00			`fprintf(f, "%s\n", describe_object(obj));`
Make some of fwrite/fclose/write/close failures visible So that full filesystem conditions or permissions problems won't go unnoticed. Signed-off-by: Alex Riesen <raa.lkml@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-12-05 01:35:48 +01:00			`if (fclose(f))`
Convert existing die(..., strerror(errno)) to die_errno() Change calls to die(..., strerror(errno)) to use the new die_errno(). In the process, also make slight style adjustments: at least state _something_ about the function that failed (instead of just printing the pathname), and put paths in single quotes. Signed-off-by: Thomas Rast <trast@student.ethz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-06-27 17:58:46 +02:00			`die_errno("Could not finish '%s'",`
			`filename);`
prefer git_pathdup to git_path in some possibly-dangerous cases Because git_path uses a static buffer that is shared with calls to git_path, mkpath, etc, it can be dangerous to assign the result to a variable or pass it to a non-trivial function. The value may change unexpectedly due to other calls. None of the cases changed here has a known bug, but they're worth converting away from git_path because: 1. It's easy to use git_pathdup in these cases. 2. They use constructs (like assignment) that make it hard to tell whether they're safe or not. The extra malloc overhead should be trivial, as an allocation should be an order of magnitude cheaper than a system call (which we are clearly about to make, since we are constructing a filename). The real cost is that we must remember to free the result. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-08-10 11:35:31 +02:00			`free(filename);`
git-fsck: add --lost-found option With this option, dangling objects are not only reported, but also written to .git/lost-found/commit/ or .git/lost-found/other/. This option implies '--full' and '--no-reflogs'. 'git fsck --lost-found' is meant as a replacement for git-lost-found. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-07-03 02:33:54 +02:00			`}`
fsck-objects: refactor checking for connectivity This separates the connectivity check into separate codepaths, one for reachable objects and the other for unreachable ones, while adding a lot of comments to explain what is going on. When checking an unreachable object, unlike a reachable one, we do not have to complain if it does not exist (we used to complain about a missing blob even when the only thing that references it is a tree that is dangling). Also we do not have to check and complain about objects that are referenced by an unreachable object. This makes the messages from fsck-objects a lot less noisy and more useful. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-22 07:26:41 +01:00			`return;`
			`}`

			`/*`
			`* Otherwise? It's there, it's unreachable, and some other unreachable`
			`* object points to it. Ignore it - it's not interesting, and we showed`
			`* all the interesting cases above.`
			`*/`
			`}`

			`static void check_object(struct object *obj)`
			`{`
git-fsck: learn about --verbose With --verbose, it gets really chatty now. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-06-05 04:44:00 +02:00			`if (verbose)`
fsck: refactor how to describe objects In many places, we refer to objects via their SHA-1s. Let's abstract that into a function. For the moment, it does nothing else than what we did previously: print out the 40-digit hex string. But that will change over the course of the next patches. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-07-17 12:59:44 +02:00			`fprintf(stderr, "Checking %s\n", describe_object(obj));`
git-fsck: learn about --verbose With --verbose, it gets really chatty now. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-06-05 04:44:00 +02:00
fsck-objects: refactor checking for connectivity This separates the connectivity check into separate codepaths, one for reachable objects and the other for unreachable ones, while adding a lot of comments to explain what is going on. When checking an unreachable object, unlike a reachable one, we do not have to complain if it does not exist (we used to complain about a missing blob even when the only thing that references it is a tree that is dangling). Also we do not have to check and complain about objects that are referenced by an unreachable object. This makes the messages from fsck-objects a lot less noisy and more useful. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-22 07:26:41 +01:00			`if (obj->flags & REACHABLE)`
			`check_reachable_object(obj);`
			`else`
			`check_unreachable_object(obj);`
			`}`
[PATCH] Make the git-fsck-objects diagnostics more useful Actually report what exactly is wrong with the object, instead of an ambiguous 'bad sha1 file' or such. In places where we already do, unify the format and clean the messages up. Signed-off-by: Petr Baudis <pasky@suse.cz> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-09-20 20:56:05 +02:00
Add connectivity tracking to fsck. This shows that I've lost track of one commit already. Most likely because I forgot to update the .dircache/HEAD file when doing a commit, so that the next commit referenced not the top-of-tree, but the one older commit. Having dangling commits is fine (in fact, you should always have at least _one_ dangling commit in the top-of-tree). But it's good to know about them. 2005-04-11 08:13:09 +02:00			`static void check_connectivity(void)`
			`{`
Abstract out accesses to object hash array There are a few special places where some programs accessed the object hash array directly, which bothered me because I wanted to play with some simple re-organizations. So this patch makes the object hash array data structures all entirely local to object.c, and the few users who wanted to look at it now get to use a function to query how many object index entries there can be, and to actually access the array. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-06-30 06:38:55 +02:00			`int i, max;`
Add connectivity tracking to fsck. This shows that I've lost track of one commit already. Most likely because I forgot to update the .dircache/HEAD file when doing a commit, so that the next commit referenced not the top-of-tree, but the one older commit. Having dangling commits is fine (in fact, you should always have at least _one_ dangling commit in the top-of-tree). But it's good to know about them. 2005-04-11 08:13:09 +02:00
fsck: reduce stack footprint The logic to mark all objects that are reachable from tips of refs were implemented as a set of recursive functions. In a repository with a deep enough history, this can easily eat up all the available stack space. Restructure the code to require less stackspace by using an object array to keep track of the objects that still need to be processed. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-12-11 04:44:37 +01:00			`/* Traverse the pending reachable objects */`
			`traverse_reachable();`

Add connectivity tracking to fsck. This shows that I've lost track of one commit already. Most likely because I forgot to update the .dircache/HEAD file when doing a commit, so that the next commit referenced not the top-of-tree, but the one older commit. Having dangling commits is fine (in fact, you should always have at least _one_ dangling commit in the top-of-tree). But it's good to know about them. 2005-04-11 08:13:09 +02:00			`/* Look up all the requirements, warn about missing objects.. */`
Abstract out accesses to object hash array There are a few special places where some programs accessed the object hash array directly, which bothered me because I wanted to play with some simple re-organizations. So this patch makes the object hash array data structures all entirely local to object.c, and the few users who wanted to look at it now get to use a function to query how many object index entries there can be, and to actually access the array. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-06-30 06:38:55 +02:00			`max = get_max_object_index();`
git-fsck: learn about --verbose With --verbose, it gets really chatty now. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-06-05 04:44:00 +02:00			`if (verbose)`
			`fprintf(stderr, "Checking connectivity (%d objects)\n", max);`

Abstract out accesses to object hash array There are a few special places where some programs accessed the object hash array directly, which bothered me because I wanted to play with some simple re-organizations. So this patch makes the object hash array data structures all entirely local to object.c, and the few users who wanted to look at it now get to use a function to query how many object index entries there can be, and to actually access the array. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-06-30 06:38:55 +02:00			`for (i = 0; i < max; i++) {`
			`struct object *obj = get_indexed_object(i);`
Add connectivity tracking to fsck. This shows that I've lost track of one commit already. Most likely because I forgot to update the .dircache/HEAD file when doing a commit, so that the next commit referenced not the top-of-tree, but the one older commit. Having dangling commits is fine (in fact, you should always have at least _one_ dangling commit in the top-of-tree). But it's good to know about them. 2005-04-11 08:13:09 +02:00
fsck-objects: refactor checking for connectivity This separates the connectivity check into separate codepaths, one for reachable objects and the other for unreachable ones, while adding a lot of comments to explain what is going on. When checking an unreachable object, unlike a reachable one, we do not have to complain if it does not exist (we used to complain about a missing blob even when the only thing that references it is a tree that is dangling). Also we do not have to check and complain about objects that are referenced by an unreachable object. This makes the messages from fsck-objects a lot less noisy and more useful. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-22 07:26:41 +01:00			`if (obj)`
			`check_object(obj);`
Add connectivity tracking to fsck. This shows that I've lost track of one commit already. Most likely because I forgot to update the .dircache/HEAD file when doing a commit, so that the next commit referenced not the top-of-tree, but the one older commit. Having dangling commits is fine (in fact, you should always have at least _one_ dangling commit in the top-of-tree). But it's good to know about them. 2005-04-11 08:13:09 +02:00			`}`
			`}`

fsck: avoid reading every object twice During verify_pack() all objects are read for SHA-1 check. Then fsck_sha1() is called on every object, which read the object again (fsck_sha1 -> parse_object -> read_sha1_file). Avoid reading an object twice, do fsck_sha1 while we have an object uncompressed data in verify_pack. On git.git, with this patch I got: $ /usr/bin/time ./git fsck >/dev/null 98.97user 0.90system 1:40.01elapsed 99%CPU (0avgtext+0avgdata 616624maxresident)k 0inputs+0outputs (0major+194186minor)pagefaults 0swaps Without it: $ /usr/bin/time ./git fsck >/dev/null 231.23user 2.35system 3:53.82elapsed 99%CPU (0avgtext+0avgdata 636688maxresident)k 0inputs+0outputs (0major+461629minor)pagefaults 0swaps Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-11-07 03:59:25 +01:00			`static int fsck_obj(struct object *obj)`
Make fsck-cache do better tree checking. We check the ordering of the entries, and we verify that none of the entries has a slash in it (this allows us to remove the hacky "has_full_path" member from the tree structure, since we now just test it by walking the tree entries instead). 2005-05-03 01:13:18 +02:00			`{`
builtin-fsck: move common object checking code to fsck.c Signed-off-by: Martin Koegler <mkoegler@auto.tuwien.ac.at> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-02-25 22:46:08 +01:00			`if (obj->flags & SEEN)`
Make fsck-cache do better tree checking. We check the ordering of the entries, and we verify that none of the entries has a slash in it (this allows us to remove the hacky "has_full_path" member from the tree structure, since we now just test it by walking the tree entries instead). 2005-05-03 01:13:18 +02:00			`return 0;`
builtin-fsck: move common object checking code to fsck.c Signed-off-by: Martin Koegler <mkoegler@auto.tuwien.ac.at> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-02-25 22:46:08 +01:00			`obj->flags \|= SEEN;`
Make fsck-cache do better tree checking. We check the ordering of the entries, and we verify that none of the entries has a slash in it (this allows us to remove the hacky "has_full_path" member from the tree structure, since we now just test it by walking the tree entries instead). 2005-05-03 01:13:18 +02:00
git-fsck: learn about --verbose With --verbose, it gets really chatty now. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-06-05 04:44:00 +02:00			`if (verbose)`
builtin-fsck: move common object checking code to fsck.c Signed-off-by: Martin Koegler <mkoegler@auto.tuwien.ac.at> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-02-25 22:46:08 +01:00			`fprintf(stderr, "Checking %s %s\n",`
fsck: move typename() printing to its own function When an object has a problem, we mention its type. But we do so by feeding the result of typename() directly to fprintf(). This is potentially dangerous because typename() can return NULL for some type values (like OBJ_NONE). It's doubtful that this can be triggered in practice with the current code, so this is probably not fixing a bug. But it future-proofs us against modifications that make things like OBJ_NONE more likely (and gives future patches a central point to handle them). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-01-26 05:11:00 +01:00			`printable_type(obj), describe_object(obj));`
Be more careful about tree entry modes. The tree object parsing used to get the executable bit wrong, and didn't know about symlinks. Also, fsck really wants the full mode value so that it can verify the other bits for sanity, so save it all in struct tree_entry. 2005-05-06 01:18:48 +02:00
fsck: introduce fsck options Just like the diff machinery, we are about to introduce more settings, therefore it makes sense to carry them around as a (pointer to a) struct containing all of them. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-06-22 17:25:00 +02:00			`if (fsck_walk(obj, NULL, &fsck_obj_options))`
builtin-fsck: move common object checking code to fsck.c Signed-off-by: Martin Koegler <mkoegler@auto.tuwien.ac.at> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-02-25 22:46:08 +01:00			`objerror(obj, "broken links");`
fsck: introduce fsck options Just like the diff machinery, we are about to introduce more settings, therefore it makes sense to carry them around as a (pointer to a) struct containing all of them. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-06-22 17:25:00 +02:00			`if (fsck_object(obj, NULL, 0, &fsck_obj_options))`
builtin-fsck: move common object checking code to fsck.c Signed-off-by: Martin Koegler <mkoegler@auto.tuwien.ac.at> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-02-25 22:46:08 +01:00			`return -1;`
Make fsck-cache do better tree checking. We check the ordering of the entries, and we verify that none of the entries has a slash in it (this allows us to remove the hacky "has_full_path" member from the tree structure, since we now just test it by walking the tree entries instead). 2005-05-03 01:13:18 +02:00
builtin-fsck: move common object checking code to fsck.c Signed-off-by: Martin Koegler <mkoegler@auto.tuwien.ac.at> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-02-25 22:46:08 +01:00			`if (obj->type == OBJ_TREE) {`
			`struct tree item = (struct tree ) obj;`
Make fsck-cache do better tree checking. We check the ordering of the entries, and we verify that none of the entries has a slash in it (this allows us to remove the hacky "has_full_path" member from the tree structure, since we now just test it by walking the tree entries instead). 2005-05-03 01:13:18 +02:00
clear parsed flag when we free tree buffers Many code paths will free a tree object's buffer and set it to NULL after finishing with it in order to keep memory usage down during a traversal. However, out of 8 sites that do this, only one actually unsets the "parsed" flag back. Those sites that don't are setting a trap for later users of the tree object; even after calling parse_tree, the buffer will remain NULL, causing potential segfaults. It is not known whether this is triggerable in the current code. Most commands do not do an in-memory traversal followed by actually using the objects again. However, it does not hurt to be safe for future callers. In most cases, we can abstract this out to a "free_tree_buffer" helper. However, there are two exceptions: 1. The fsck code relies on the parsed flag to know that we were able to parse the object at one point. We can switch this to using a flag in the "flags" field. 2. The index-pack code sets the buffer to NULL but does not free it (it is freed by a caller). We should still unset the parsed flag here, but we cannot use our helper, as we do not want to free the buffer. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-06-06 00:37:39 +02:00			`free_tree_buffer(item);`
Make fsck-cache start parsing the object types, and checking their internal format. This doesn't yet check the reachability information, but we're getting there.. Slowly. 2005-04-09 02:11:14 +02:00			`}`

builtin-fsck: move common object checking code to fsck.c Signed-off-by: Martin Koegler <mkoegler@auto.tuwien.ac.at> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-02-25 22:46:08 +01:00			`if (obj->type == OBJ_COMMIT) {`
			`struct commit commit = (struct commit ) obj;`
git-fsck-cache.c: check commit objects more carefully We historically used to be very careful in fsck-cache, but when it was re-written to use "parse_object()" instead of parsing everything by hand, it lost a bit of the checks. This, together with the previous commit, should make it do more proper commit object syntax checks. Also add a "--strict" flag, which warns about the old-style "0664" file mode bits, which shouldn't exist in modern trees, but that happened early on in git trees and that the default git-fsck-cache thus silently accepts. Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-07-28 00:16:03 +02:00
provide a helper to free commit buffer This converts two lines into one at each caller. But more importantly, it abstracts the concept of freeing the buffer, which will make it easier to change later. Note that we also need to provide a "detach" mechanism for a tricky case in index-pack. We are passed a buffer for the object generated by processing the incoming pack. If we are not using --strict, we just calculate the sha1 on that buffer and return, leaving the caller to free it. But if we are using --strict, we actually attach that buffer to an object, pass the object to the fsck functions, and then detach the buffer from the object again (so that the caller can free it as usual). In this case, we don't want to free the buffer ourselves, but just make sure it is no longer associated with the commit. Note that we are making the assumption here that the attach/detach process does not impact the buffer at all (e.g., it is never reallocated or modified). That holds true now, and we have no plans to change that. However, as we abstract the commit_buffer code, this dependency becomes less obvious. So when we detach, let's also make sure that we get back the same buffer that we gave to the commit_buffer code. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-06-13 00:05:37 +02:00			`free_commit_buffer(commit);`
fsck-cache: notice missing "blob" objects. We should _not_ mark a blob object "parsed" just because we looked it up: it gets marked that way only once we've actually seen it. Otherwise we can never notice a missing blob. 2005-04-24 23:10:55 +02:00
builtin-fsck: move common object checking code to fsck.c Signed-off-by: Martin Koegler <mkoegler@auto.tuwien.ac.at> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-02-25 22:46:08 +01:00			`if (!commit->parents && show_root)`
fsck: refactor how to describe objects In many places, we refer to objects via their SHA-1s. Let's abstract that into a function. For the moment, it does nothing else than what we did previously: print out the 40-digit hex string. But that will change over the course of the next patches. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-07-17 12:59:44 +02:00			`printf("root %s\n", describe_object(&commit->object));`
builtin-fsck: move common object checking code to fsck.c Signed-off-by: Martin Koegler <mkoegler@auto.tuwien.ac.at> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-02-25 22:46:08 +01:00			`}`
fsck-cache: fix SIGSEGV on bad tag object fsck_tag() failes to notice that the parsing of the tag may have failed in the parse_object() call on the object that it is tagging. Noticed by Junio. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-03 16:57:56 +02:00
builtin-fsck: move common object checking code to fsck.c Signed-off-by: Martin Koegler <mkoegler@auto.tuwien.ac.at> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-02-25 22:46:08 +01:00			`if (obj->type == OBJ_TAG) {`
			`struct tag tag = (struct tag ) obj;`
git-fsck: learn about --verbose With --verbose, it gets really chatty now. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-06-05 04:44:00 +02:00
builtin-fsck: move common object checking code to fsck.c Signed-off-by: Martin Koegler <mkoegler@auto.tuwien.ac.at> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-02-25 22:46:08 +01:00			`if (show_tags && tag->tagged) {`
fsck: move typename() printing to its own function When an object has a problem, we mention its type. But we do so by feeding the result of typename() directly to fprintf(). This is potentially dangerous because typename() can return NULL for some type values (like OBJ_NONE). It's doubtful that this can be triggered in practice with the current code, so this is probably not fixing a bug. But it future-proofs us against modifications that make things like OBJ_NONE more likely (and gives future patches a central point to handle them). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-01-26 05:11:00 +01:00			`printf("tagged %s %s", printable_type(tag->tagged),`
fsck: refactor how to describe objects In many places, we refer to objects via their SHA-1s. Let's abstract that into a function. For the moment, it does nothing else than what we did previously: print out the 40-digit hex string. But that will change over the course of the next patches. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-07-17 12:59:44 +02:00			`describe_object(tag->tagged));`
			`printf(" (%s) in %s\n", tag->tag,`
			`describe_object(&tag->object));`
builtin-fsck: move common object checking code to fsck.c Signed-off-by: Martin Koegler <mkoegler@auto.tuwien.ac.at> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-02-25 22:46:08 +01:00			`}`
fsck-cache: fix SIGSEGV on bad tag object fsck_tag() failes to notice that the parsing of the tag may have failed in the parse_object() call on the object that it is tagging. Noticed by Junio. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-03 16:57:56 +02:00			`}`
fsck-cache: only show tags if asked to do so with "--tags" Normally we don't care, we just check them for being valid tag objects. 2005-04-26 01:31:13 +02:00
[PATCH] Port fsck-cache to use parsing functions This ports fsck-cache to use parsing functions. Note that performance could be improved here by only reading each object once, but this requires somewhat more complicated flow control. Signed-Off-By: Daniel Barkalow <barkalow@iabervon.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-04-18 20:39:48 +02:00			`return 0;`
Add first cut at "fsck-cache" that validates the SHA1 object store. It doesn't complain about mine. But it also doesn't yet check for inter-object reachability etc. 2005-04-09 00:02:42 +02:00			`}`

Convert the verify_pack callback to struct object_id Make the verify_pack_callback take a pointer to struct object_id. Change the pack checksum to use GIT_MAX_RAWSZ, even though it is not strictly an object ID. Doing so ensures resilience against future hash size changes, and allows us to remove hard-coded assumptions about how big the buffer needs to be. Also, use a union to convert the pointer from nth_packed_object_sha1 to to a pointer to struct object_id. This behavior is compatible with GCC and clang and explicitly sanctioned by C11. The alternatives are to just perform a cast, which would run afoul of strict aliasing rules, but should just work, and changing the pointer into an instance of struct object_id and copying the value. The latter operation could seriously bloat memory usage on fsck, which already uses a lot of memory on some repositories. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-05-07 00:10:20 +02:00			`static int fsck_obj_buffer(const struct object_id *oid, enum object_type type,`
fsck: avoid reading every object twice During verify_pack() all objects are read for SHA-1 check. Then fsck_sha1() is called on every object, which read the object again (fsck_sha1 -> parse_object -> read_sha1_file). Avoid reading an object twice, do fsck_sha1 while we have an object uncompressed data in verify_pack. On git.git, with this patch I got: $ /usr/bin/time ./git fsck >/dev/null 98.97user 0.90system 1:40.01elapsed 99%CPU (0avgtext+0avgdata 616624maxresident)k 0inputs+0outputs (0major+194186minor)pagefaults 0swaps Without it: $ /usr/bin/time ./git fsck >/dev/null 231.23user 2.35system 3:53.82elapsed 99%CPU (0avgtext+0avgdata 636688maxresident)k 0inputs+0outputs (0major+461629minor)pagefaults 0swaps Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-11-07 03:59:25 +01:00			`unsigned long size, void buffer, int eaten)`
			`{`
fsck: use streaming interface for large blobs in pack For blobs, we want to make sure the on-disk data is not corrupted (i.e. can be inflated and produce the expected SHA-1). Blob content is opaque, there's nothing else inside to check for. For really large blobs, we may want to avoid unpacking the entire blob in memory, just to check whether it produces the same SHA-1. On 32-bit systems, we may not have enough virtual address space for such memory allocation. And even on 64-bit where it's not a problem, allocating a lot more memory could result in kicking other parts of systems to swap file, generating lots of I/O and slowing everything down. For this particular operation, not unpacking the blob and letting check_sha1_signature, which supports streaming interface, do the job is sufficient. check_sha1_signature() is not shown in the diff, unfortunately. But if will be called when "data_valid && !data" is false. We will call the callback function "fn" with NULL as "data". The only callback of this function is fsck_obj_buffer(), which does not touch "data" at all if it's a blob. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-07-13 17:44:04 +02:00			`/*`
			`* Note, buffer may be NULL if type is OBJ_BLOB. See`
			`* verify_packfile(), data_valid variable for details.`
			`*/`
fsck: avoid reading every object twice During verify_pack() all objects are read for SHA-1 check. Then fsck_sha1() is called on every object, which read the object again (fsck_sha1 -> parse_object -> read_sha1_file). Avoid reading an object twice, do fsck_sha1 while we have an object uncompressed data in verify_pack. On git.git, with this patch I got: $ /usr/bin/time ./git fsck >/dev/null 98.97user 0.90system 1:40.01elapsed 99%CPU (0avgtext+0avgdata 616624maxresident)k 0inputs+0outputs (0major+194186minor)pagefaults 0swaps Without it: $ /usr/bin/time ./git fsck >/dev/null 231.23user 2.35system 3:53.82elapsed 99%CPU (0avgtext+0avgdata 636688maxresident)k 0inputs+0outputs (0major+461629minor)pagefaults 0swaps Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-11-07 03:59:25 +01:00			`struct object *obj;`
object: convert parse_object* to take struct object_id Make parse_object, parse_object_or_die, and parse_object_buffer take a pointer to struct object_id. Remove the temporary variables inserted earlier, since they are no longer necessary. Transform all of the callers using the following semantic patch: @@ expression E1; @@ - parse_object(E1.hash) + parse_object(&E1) @@ expression E1; @@ - parse_object(E1->hash) + parse_object(E1) @@ expression E1, E2; @@ - parse_object_or_die(E1.hash, E2) + parse_object_or_die(&E1, E2) @@ expression E1, E2; @@ - parse_object_or_die(E1->hash, E2) + parse_object_or_die(E1, E2) @@ expression E1, E2, E3, E4, E5; @@ - parse_object_buffer(E1.hash, E2, E3, E4, E5) + parse_object_buffer(&E1, E2, E3, E4, E5) @@ expression E1, E2, E3, E4, E5; @@ - parse_object_buffer(E1->hash, E2, E3, E4, E5) + parse_object_buffer(E1, E2, E3, E4, E5) Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-05-07 00:10:38 +02:00			`obj = parse_object_buffer(oid, type, size, buffer, eaten);`
fsck: avoid reading every object twice During verify_pack() all objects are read for SHA-1 check. Then fsck_sha1() is called on every object, which read the object again (fsck_sha1 -> parse_object -> read_sha1_file). Avoid reading an object twice, do fsck_sha1 while we have an object uncompressed data in verify_pack. On git.git, with this patch I got: $ /usr/bin/time ./git fsck >/dev/null 98.97user 0.90system 1:40.01elapsed 99%CPU (0avgtext+0avgdata 616624maxresident)k 0inputs+0outputs (0major+194186minor)pagefaults 0swaps Without it: $ /usr/bin/time ./git fsck >/dev/null 231.23user 2.35system 3:53.82elapsed 99%CPU (0avgtext+0avgdata 636688maxresident)k 0inputs+0outputs (0major+461629minor)pagefaults 0swaps Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-11-07 03:59:25 +01:00			`if (!obj) {`
			`errors_found \|= ERROR_OBJECT;`
Convert the verify_pack callback to struct object_id Make the verify_pack_callback take a pointer to struct object_id. Change the pack checksum to use GIT_MAX_RAWSZ, even though it is not strictly an object ID. Doing so ensures resilience against future hash size changes, and allows us to remove hard-coded assumptions about how big the buffer needs to be. Also, use a union to convert the pointer from nth_packed_object_sha1 to to a pointer to struct object_id. This behavior is compatible with GCC and clang and explicitly sanctioned by C11. The alternatives are to just perform a cast, which would run afoul of strict aliasing rules, but should just work, and changing the pointer into an instance of struct object_id and copying the value. The latter operation could seriously bloat memory usage on fsck, which already uses a lot of memory on some repositories. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-05-07 00:10:20 +02:00			`return error("%s: object corrupt or missing", oid_to_hex(oid));`
fsck: avoid reading every object twice During verify_pack() all objects are read for SHA-1 check. Then fsck_sha1() is called on every object, which read the object again (fsck_sha1 -> parse_object -> read_sha1_file). Avoid reading an object twice, do fsck_sha1 while we have an object uncompressed data in verify_pack. On git.git, with this patch I got: $ /usr/bin/time ./git fsck >/dev/null 98.97user 0.90system 1:40.01elapsed 99%CPU (0avgtext+0avgdata 616624maxresident)k 0inputs+0outputs (0major+194186minor)pagefaults 0swaps Without it: $ /usr/bin/time ./git fsck >/dev/null 231.23user 2.35system 3:53.82elapsed 99%CPU (0avgtext+0avgdata 636688maxresident)k 0inputs+0outputs (0major+461629minor)pagefaults 0swaps Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-11-07 03:59:25 +01:00			`}`
clear parsed flag when we free tree buffers Many code paths will free a tree object's buffer and set it to NULL after finishing with it in order to keep memory usage down during a traversal. However, out of 8 sites that do this, only one actually unsets the "parsed" flag back. Those sites that don't are setting a trap for later users of the tree object; even after calling parse_tree, the buffer will remain NULL, causing potential segfaults. It is not known whether this is triggerable in the current code. Most commands do not do an in-memory traversal followed by actually using the objects again. However, it does not hurt to be safe for future callers. In most cases, we can abstract this out to a "free_tree_buffer" helper. However, there are two exceptions: 1. The fsck code relies on the parsed flag to know that we were able to parse the object at one point. We can switch this to using a flag in the "flags" field. 2. The index-pack code sets the buffer to NULL but does not free it (it is freed by a caller). We should still unset the parsed flag here, but we cannot use our helper, as we do not want to free the buffer. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-06-06 00:37:39 +02:00			`obj->flags = HAS_OBJ;`
fsck: avoid reading every object twice During verify_pack() all objects are read for SHA-1 check. Then fsck_sha1() is called on every object, which read the object again (fsck_sha1 -> parse_object -> read_sha1_file). Avoid reading an object twice, do fsck_sha1 while we have an object uncompressed data in verify_pack. On git.git, with this patch I got: $ /usr/bin/time ./git fsck >/dev/null 98.97user 0.90system 1:40.01elapsed 99%CPU (0avgtext+0avgdata 616624maxresident)k 0inputs+0outputs (0major+194186minor)pagefaults 0swaps Without it: $ /usr/bin/time ./git fsck >/dev/null 231.23user 2.35system 3:53.82elapsed 99%CPU (0avgtext+0avgdata 636688maxresident)k 0inputs+0outputs (0major+461629minor)pagefaults 0swaps Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-11-07 03:59:25 +01:00			`return fsck_obj(obj);`
			`}`

remove unnecessary initializations [jc: I needed to hand merge the changes to the updated codebase, so the result needs to be checked.] Signed-off-by: David Rientjes <rientjes@google.com> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-08-15 19:23:48 +02:00			`static int default_refs;`
Fix up "for_each_ref()" to be more usable, and use it in git-fsck-cache It needed to take the GIT_DIR information into account, something that the original receive-pack usage just never cared about. 2005-07-03 19:01:38 +02:00
refs: convert each_reflog_ent_fn to struct object_id Make each_reflog_ent_fn take two struct object_id pointers instead of two pointers to unsigned char. Convert the various callbacks to use struct object_id as well. Also, rename fsck_handle_reflog_sha1 to fsck_handle_reflog_oid. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-02-22 00:47:32 +01:00			`static void fsck_handle_reflog_oid(const char refname, struct object_id oid,`
timestamp_t: a new data type for timestamps Git's source code assumes that unsigned long is at least as precise as time_t. Which is incorrect, and causes a lot of problems, in particular where unsigned long is only 32-bit (notably on Windows, even in 64-bit versions). So let's just use a more appropriate data type instead. In preparation for this, we introduce the new `timestamp_t` data type. By necessity, this is a very, very large patch, as it has to replace all timestamps' data type in one go. As we will use a data type that is not necessarily identical to `time_t`, we need to be very careful to use `time_t` whenever we interact with the system functions, and `timestamp_t` everywhere else. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-04-26 21:29:31 +02:00			`timestamp_t timestamp)`
Protect commits recorded in reflog from pruning. This teaches fsck-objects and prune to protect objects referred to by reflog entries. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-12-18 10:36:16 +01:00			`{`
			`struct object *obj;`

refs: convert each_reflog_ent_fn to struct object_id Make each_reflog_ent_fn take two struct object_id pointers instead of two pointers to unsigned char. Convert the various callbacks to use struct object_id as well. Also, rename fsck_handle_reflog_sha1 to fsck_handle_reflog_oid. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-02-22 00:47:32 +01:00			`if (!is_null_oid(oid)) {`
			`obj = lookup_object(oid->hash);`
fsck: check HAS_OBJ more consistently There are two spots that call lookup_object() and assume that a non-NULL result means we have the object: 1. When we're checking the objects given to us by the user on the command line. 2. When we're checking if a reflog entry is valid. This generally follows fsck's mental model that we will have looked at and loaded a "struct object" for each object in the repository. But it misses one case: if another object _mentioned_ an object, but we didn't actually parse it or verify that it exists, it will still have a struct. It's not clear if this is a triggerable bug or not. Certainly the later parts of the reachability check need to be careful of this, and do so by checking the HAS_OBJ flag. But both of these steps happen before we start traversing, so probably we won't have followed any links yet. Still, it's easy enough to be defensive here. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-01-16 22:34:57 +01:00			`if (obj && (obj->flags & HAS_OBJ)) {`
fsck: optionally show more helpful info for broken links When reporting broken links between commits/trees/blobs, it would be quite helpful at times if the user would be told how the object is supposed to be reachable. With the new --name-objects option, git-fsck will try to do exactly that: name the objects in a way that shows how they are reachable. For example, when some reflog got corrupted and a blob is missing that should not be, the user might want to remove the corresponding reflog entry. This option helps them find that entry: `git fsck` will now report something like this: broken link from tree b5eb6ff... (refs/stash@{<date>}~37:) to blob ec5cf80... Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-07-17 13:00:02 +02:00			`if (timestamp && name_objects)`
			`add_decoration(fsck_walk_options.object_names,`
			`obj,`
PRItime: introduce a new "printf format" for timestamps Currently, Git's source code treats all timestamps as if they were unsigned longs. Therefore, it is okay to write "%lu" when printing them. There is a substantial problem with that, though: at least on Windows, time_t is larger than unsigned long, and hence we will want to switch away from the ill-specified `unsigned long` data type. So let's introduce the pseudo format "PRItime" (currently simply being defined to "lu") to make it easier to change the data type used for timestamps. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-04-21 12:45:48 +02:00			`xstrfmt("%s@{%"PRItime"}", refname, timestamp));`
Protect commits recorded in reflog from pruning. This teaches fsck-objects and prune to protect objects referred to by reflog entries. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-12-18 10:36:16 +01:00			`obj->used = 1;`
builtin-fsck: move away from object-refs to fsck_walk Signed-off-by: Martin Koegler <mkoegler@auto.tuwien.ac.at> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-02-25 22:46:05 +01:00			`mark_object_reachable(obj);`
fsck: report errors if reflog entries point at invalid objects Previously, if a reflog entry's old or new SHA-1 was not resolvable to an object, that SHA-1 was silently ignored. Instead, report such cases as errors. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-06-08 15:40:05 +02:00			`} else {`
refs: convert each_reflog_ent_fn to struct object_id Make each_reflog_ent_fn take two struct object_id pointers instead of two pointers to unsigned char. Convert the various callbacks to use struct object_id as well. Also, rename fsck_handle_reflog_sha1 to fsck_handle_reflog_oid. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-02-22 00:47:32 +01:00			`error("%s: invalid reflog entry %s", refname, oid_to_hex(oid));`
fsck: report errors if reflog entries point at invalid objects Previously, if a reflog entry's old or new SHA-1 was not resolvable to an object, that SHA-1 was silently ignored. Instead, report such cases as errors. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-06-08 15:40:05 +02:00			`errors_found \|= ERROR_REACHABLE;`
Protect commits recorded in reflog from pruning. This teaches fsck-objects and prune to protect objects referred to by reflog entries. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-12-18 10:36:16 +01:00			`}`
			`}`
fsck_handle_reflog_sha1(): new function New function, extracted from fsck_handle_reflog_ent(). The extra is_null_sha1() test for the new reference is currently unnecessary, as reflogs are deleted when the reference itself is deleted. But it doesn't hurt, either. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-06-08 15:40:04 +02:00			`}`

refs: convert each_reflog_ent_fn to struct object_id Make each_reflog_ent_fn take two struct object_id pointers instead of two pointers to unsigned char. Convert the various callbacks to use struct object_id as well. Also, rename fsck_handle_reflog_sha1 to fsck_handle_reflog_oid. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-02-22 00:47:32 +01:00			`static int fsck_handle_reflog_ent(struct object_id ooid, struct object_id noid,`
timestamp_t: a new data type for timestamps Git's source code assumes that unsigned long is at least as precise as time_t. Which is incorrect, and causes a lot of problems, in particular where unsigned long is only 32-bit (notably on Windows, even in 64-bit versions). So let's just use a more appropriate data type instead. In preparation for this, we introduce the new `timestamp_t` data type. By necessity, this is a very, very large patch, as it has to replace all timestamps' data type in one go. As we will use a data type that is not necessarily identical to `time_t`, we need to be very careful to use `time_t` whenever we interact with the system functions, and `timestamp_t` everywhere else. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-04-26 21:29:31 +02:00			`const char *email, timestamp_t timestamp, int tz,`
Sanitize for_each_reflog_ent() It used to ignore the return value of the helper function; now, it expects it to return 0, and stops iteration upon non-zero return values; this value is then passed on as the return value of for_each_reflog_ent(). Further, it makes no sense to force the parsing upon the helper functions; for_each_reflog_ent() now calls the helper function with old and new sha1, the email, the timestamp & timezone, and the message. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-08 01:59:54 +01:00			`const char message, void cb_data)`
Protect commits recorded in reflog from pruning. This teaches fsck-objects and prune to protect objects referred to by reflog entries. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-12-18 10:36:16 +01:00			`{`
fsck: report errors if reflog entries point at invalid objects Previously, if a reflog entry's old or new SHA-1 was not resolvable to an object, that SHA-1 was silently ignored. Instead, report such cases as errors. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-06-08 15:40:05 +02:00			`const char *refname = cb_data;`
Protect commits recorded in reflog from pruning. This teaches fsck-objects and prune to protect objects referred to by reflog entries. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-12-18 10:36:16 +01:00
git-fsck: learn about --verbose With --verbose, it gets really chatty now. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-06-05 04:44:00 +02:00			`if (verbose)`
			`fprintf(stderr, "Checking reflog %s->%s\n",`
refs: convert each_reflog_ent_fn to struct object_id Make each_reflog_ent_fn take two struct object_id pointers instead of two pointers to unsigned char. Convert the various callbacks to use struct object_id as well. Also, rename fsck_handle_reflog_sha1 to fsck_handle_reflog_oid. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-02-22 00:47:32 +01:00			`oid_to_hex(ooid), oid_to_hex(noid));`
git-fsck: learn about --verbose With --verbose, it gets really chatty now. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-06-05 04:44:00 +02:00
refs: convert each_reflog_ent_fn to struct object_id Make each_reflog_ent_fn take two struct object_id pointers instead of two pointers to unsigned char. Convert the various callbacks to use struct object_id as well. Also, rename fsck_handle_reflog_sha1 to fsck_handle_reflog_oid. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-02-22 00:47:32 +01:00			`fsck_handle_reflog_oid(refname, ooid, 0);`
			`fsck_handle_reflog_oid(refname, noid, timestamp);`
Protect commits recorded in reflog from pruning. This teaches fsck-objects and prune to protect objects referred to by reflog entries. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-12-18 10:36:16 +01:00			`return 0;`
			`}`

fsck: change functions to use object_id Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-05-25 20:38:50 +02:00			`static int fsck_handle_reflog(const char logname, const struct object_id oid,`
			`int flag, void *cb_data)`
scan reflogs independently from refs Currently, the search for all reflogs depends on the existence of corresponding refs under the .git/refs/ directory. Let's scan the .git/logs/ directory directly instead. Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-03 19:25:43 +01:00			`{`
fsck: report errors if reflog entries point at invalid objects Previously, if a reflog entry's old or new SHA-1 was not resolvable to an object, that SHA-1 was silently ignored. Instead, report such cases as errors. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-06-08 15:40:05 +02:00			`for_each_reflog_ent(logname, fsck_handle_reflog_ent, (void *)logname);`
scan reflogs independently from refs Currently, the search for all reflogs depends on the existence of corresponding refs under the .git/refs/ directory. Let's scan the .git/logs/ directory directly instead. Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-03 19:25:43 +01:00			`return 0;`
			`}`

fsck: change functions to use object_id Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-05-25 20:38:50 +02:00			`static int fsck_handle_ref(const char refname, const struct object_id oid,`
			`int flag, void *cb_data)`
fsck-cache: walk the 'refs' directory if the user doesn't give any explicit references for reachability analysis. We already had that as separate logic in git-prune-script, so this is not a new special case - it's an old special case moved into fsck, making normal usage be much simpler. 2005-05-18 19:16:14 +02:00			`{`
			`struct object *obj;`

object: convert parse_object* to take struct object_id Make parse_object, parse_object_or_die, and parse_object_buffer take a pointer to struct object_id. Remove the temporary variables inserted earlier, since they are no longer necessary. Transform all of the callers using the following semantic patch: @@ expression E1; @@ - parse_object(E1.hash) + parse_object(&E1) @@ expression E1; @@ - parse_object(E1->hash) + parse_object(E1) @@ expression E1, E2; @@ - parse_object_or_die(E1.hash, E2) + parse_object_or_die(&E1, E2) @@ expression E1, E2; @@ - parse_object_or_die(E1->hash, E2) + parse_object_or_die(E1, E2) @@ expression E1, E2, E3, E4, E5; @@ - parse_object_buffer(E1.hash, E2, E3, E4, E5) + parse_object_buffer(&E1, E2, E3, E4, E5) @@ expression E1, E2, E3, E4, E5; @@ - parse_object_buffer(E1->hash, E2, E3, E4, E5) + parse_object_buffer(E1, E2, E3, E4, E5) Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-05-07 00:10:38 +02:00			`obj = parse_object(oid);`
[PATCH] Update fsck-cache (take 2) The fsck-cache complains if objects referred to by files in .git/refs/ or objects stored in files under .git/objects/??/ are not found as stand-alone SHA1 files (i.e. found in alternate object pools GIT_ALTERNATE_OBJECT_DIRECTORIES or packed archives stored under .git/objects/pack). Although this is a good semantics to maintain consistency of a single .git/objects directory as a self contained set of objects, it sometimes is useful to consider it is OK as long as these "outside" objects are available. This commit introduces a new flag, --standalone, to git-fsck-cache. When it is not specified, connectivity checks and .git/refs pointer checks are taught that it is OK when expected objects do not exist under .git/objects/?? hierarchy but are available from an packed archive or in an alternate object pool. Another new flag, --full, makes git-fsck-cache to check not only the current GIT_OBJECT_DIRECTORY but also objects found in alternate object pools and packed GIT archives.a Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-28 23:58:33 +02:00			`if (!obj) {`
fsck: change functions to use object_id Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-05-25 20:38:50 +02:00			`error("%s: invalid sha1 pointer %s", refname, oid_to_hex(oid));`
fsck: return non-zero status on missing ref tips Fsck tries hard to detect missing objects, and will complain (and exit non-zero) about any inter-object links that are missing. However, it will not exit non-zero for any missing ref tips, meaning that a severely broken repository may still pass "git fsck && echo ok". The problem is that we use for_each_ref to iterate over the ref tips, which hides broken tips. It does at least print an error from the refs.c code, but fsck does not ever see the ref and cannot note the problem in its exit code. We can solve this by using for_each_rawref and noting the error ourselves. In addition to adding tests for this case, we add tests for all types of missing-object links (all of which worked, but which we were not testing). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-09-12 05:38:30 +02:00			`errors_found \|= ERROR_REACHABLE;`
Fix up "for_each_ref()" to be more usable, and use it in git-fsck-cache It needed to take the GIT_DIR information into account, something that the original receive-pack usage just never cared about. 2005-07-03 19:01:38 +02:00			`/* We'll continue with the rest despite the error.. */`
			`return 0;`
[PATCH] Update fsck-cache (take 2) The fsck-cache complains if objects referred to by files in .git/refs/ or objects stored in files under .git/objects/??/ are not found as stand-alone SHA1 files (i.e. found in alternate object pools GIT_ALTERNATE_OBJECT_DIRECTORIES or packed archives stored under .git/objects/pack). Although this is a good semantics to maintain consistency of a single .git/objects directory as a self contained set of objects, it sometimes is useful to consider it is OK as long as these "outside" objects are available. This commit introduces a new flag, --standalone, to git-fsck-cache. When it is not specified, connectivity checks and .git/refs pointer checks are taught that it is OK when expected objects do not exist under .git/objects/?? hierarchy but are available from an packed archive or in an alternate object pool. Another new flag, --full, makes git-fsck-cache to check not only the current GIT_OBJECT_DIRECTORY but also objects found in alternate object pools and packed GIT archives.a Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-28 23:58:33 +02:00			`}`
fsck: exit with non-zero when problems are found After finding some problems (e.g. a ref refs/heads/X points at an object that is not a commit) and issuing an error message, the program failed to signal the fact that it found an error by a non-zero exit status. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-09-23 22:46:39 +02:00			`if (obj->type != OBJ_COMMIT && is_branch(refname)) {`
Make 'git fsck' complain about non-commit branches Since having non-commits in branches is a no-no, and just means you cannot commit on them, let's make fsck tell you when a branch is bad. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-01-16 01:34:17 +01:00			`error("%s: not a commit", refname);`
fsck: exit with non-zero when problems are found After finding some problems (e.g. a ref refs/heads/X points at an object that is not a commit) and issuing an error message, the program failed to signal the fact that it found an error by a non-zero exit status. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-09-23 22:46:39 +02:00			`errors_found \|= ERROR_REFS;`
			`}`
Fix up "for_each_ref()" to be more usable, and use it in git-fsck-cache It needed to take the GIT_DIR information into account, something that the original receive-pack usage just never cared about. 2005-07-03 19:01:38 +02:00			`default_refs++;`
fsck-cache: walk the 'refs' directory if the user doesn't give any explicit references for reachability analysis. We already had that as separate logic in git-prune-script, so this is not a new special case - it's an old special case moved into fsck, making normal usage be much simpler. 2005-05-18 19:16:14 +02:00			`obj->used = 1;`
fsck: optionally show more helpful info for broken links When reporting broken links between commits/trees/blobs, it would be quite helpful at times if the user would be told how the object is supposed to be reachable. With the new --name-objects option, git-fsck will try to do exactly that: name the objects in a way that shows how they are reachable. For example, when some reflog got corrupted and a blob is missing that should not be, the user might want to remove the corresponding reflog entry. This option helps them find that entry: `git fsck` will now report something like this: broken link from tree b5eb6ff... (refs/stash@{<date>}~37:) to blob ec5cf80... Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-07-17 13:00:02 +02:00			`if (name_objects)`
			`add_decoration(fsck_walk_options.object_names,`
			`obj, xstrdup(refname));`
builtin-fsck: move away from object-refs to fsck_walk Signed-off-by: Martin Koegler <mkoegler@auto.tuwien.ac.at> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-02-25 22:46:05 +01:00			`mark_object_reachable(obj);`
Protect commits recorded in reflog from pruning. This teaches fsck-objects and prune to protect objects referred to by reflog entries. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-12-18 10:36:16 +01:00
fsck-cache: fix segfault on nonexistent referenced object Noted by Frank Sorenson and Petr Baudis, patch rewritten by me. 2005-05-20 16:49:17 +02:00			`return 0;`
fsck-cache: walk the 'refs' directory if the user doesn't give any explicit references for reachability analysis. We already had that as separate logic in git-prune-script, so this is not a new special case - it's an old special case moved into fsck, making normal usage be much simpler. 2005-05-18 19:16:14 +02:00			`}`

			`static void get_default_heads(void)`
			`{`
fsck: change functions to use object_id Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-05-25 20:38:50 +02:00			`if (head_points_at && !is_null_oid(&head_oid))`
			`fsck_handle_ref("HEAD", &head_oid, 0, NULL);`
			`for_each_rawref(fsck_handle_ref, NULL);`
Fix lost-found to show commits only referenced by reflogs Prior to 1.5.0 the git-lost-found utility was useful to locate commits that were not referenced by any ref. These were often amends, or resets, or tips of branches that had been deleted. Being able to locate a 'lost' commit and recover it by creating a new branch was a useful feature in those days. Unfortunately 1.5.0 added the reflogs to the reachability analysis performed by git-fsck, which means that most commits users would consider to be lost are still reachable through a reflog. So most (or all!) commits are reachable, and nothing gets output from git-lost-found. Now git-fsck can be told to ignore reflogs during its reachability analysis, making git-lost-found useful again to locate commits that are no longer referenced by a ref itself, but may still be referenced by a reflog. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-04 16:46:14 +02:00			`if (include_reflogs)`
fsck: change functions to use object_id Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-05-25 20:38:50 +02:00			`for_each_reflog(fsck_handle_reflog, NULL);`
git-fsck-objects: lacking default references should not be fatal The comment added says it all: if we have lost all references in a git archive, git-fsck-objects should still work, so instead of dying it should just notify the user about that condition. This change was triggered by me just doing a "git-init-db" and then populating that empty git archive with a pack/index file to look at it. Having git-fsck-objects not work just because I didn't have any references handy was rather irritating, since part of the reason for running git-fsck-objects in the first place was to _find_ the missing references. However, "--unreachable" really doesn't make sense in that situation, and we want to turn it off to protect anybody who uses the old "git prune" shell-script (rather than the modern built-in). The old pruning script used to remove all objects that were reported as unreachable, and without any refs, that obviously means everything - not worth it. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-08-29 20:47:30 +02:00
			`/*`
			`* Not having any default heads isn't really fatal, but`
			`* it does mean that "--unreachable" no longer makes any`
			`* sense (since in this case everything will obviously`
			`* be unreachable by definition.`
			`*`
			`* Showing dangling objects is valid, though (as those`
			`* dangling objects are likely lost heads).`
			`*`
			`* So we just print a warning about it, and clear the`
			`* "show_unreachable" flag.`
			`*/`
			`if (!default_refs) {`
fsck: do not complain on detached HEAD. Detached HEAD is just a normal state of a repository. Do not say anything about it. Do not give worrying "error:" messages when we let the user know that the HEAD points at nothing (i.e. yet to be born branch), nor we do not have any default refs to start following the objects chain. Reword them as "notice:". Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-11 10:28:43 +02:00			`fprintf(stderr, "notice: No default references\n");`
git-fsck-objects: lacking default references should not be fatal The comment added says it all: if we have lost all references in a git archive, git-fsck-objects should still work, so instead of dying it should just notify the user about that condition. This change was triggered by me just doing a "git-init-db" and then populating that empty git archive with a pack/index file to look at it. Having git-fsck-objects not work just because I didn't have any references handy was rather irritating, since part of the reason for running git-fsck-objects in the first place was to _find_ the missing references. However, "--unreachable" really doesn't make sense in that situation, and we want to turn it off to protect anybody who uses the old "git prune" shell-script (rather than the modern built-in). The old pruning script used to remove all objects that were reported as unreachable, and without any refs, that obviously means everything - not worth it. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-08-29 20:47:30 +02:00			`show_unreachable = 0;`
			`}`
fsck-cache: walk the 'refs' directory if the user doesn't give any explicit references for reachability analysis. We already had that as separate logic in git-prune-script, so this is not a new special case - it's an old special case moved into fsck, making normal usage be much simpler. 2005-05-18 19:16:14 +02:00			`}`

Convert object iteration callbacks to struct object_id Convert each_loose_object_fn and each_packed_object_fn to take a pointer to struct object_id. Update the various callbacks. Convert several 40-based constants to use GIT_SHA1_HEXSZ. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-02-22 00:47:35 +01:00			`static struct object parse_loose_object(const struct object_id oid,`
fsck: parse loose object paths directly When we iterate over the list of loose objects to check, we get the actual path of each object. But we then throw it away and pass just the sha1 to fsck_sha1(), which will do a fresh lookup. Usually it would find the same object, but it may not if an object exists both as a loose and a packed object. We may end up checking the packed object twice, and never look at the loose one. In practice this isn't too terrible, because if fsck doesn't complain, it means you have at least one good copy. But since the point of fsck is to look for corruption, we should be thorough. The new read_loose_object() interface can help us get the data from disk, and then we replace parse_object() with parse_object_buffer(). As a bonus, our error messages now mention the path to a corrupted object, which should make it easier to track down errors when they do happen. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-01-13 18:59:44 +01:00			`const char *path)`
			`{`
			`struct object *obj;`
			`void *contents;`
			`enum object_type type;`
			`unsigned long size;`
			`int eaten;`

Convert object iteration callbacks to struct object_id Convert each_loose_object_fn and each_packed_object_fn to take a pointer to struct object_id. Update the various callbacks. Convert several 40-based constants to use GIT_SHA1_HEXSZ. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-02-22 00:47:35 +01:00			`if (read_loose_object(path, oid->hash, &type, &size, &contents) < 0)`
fsck: parse loose object paths directly When we iterate over the list of loose objects to check, we get the actual path of each object. But we then throw it away and pass just the sha1 to fsck_sha1(), which will do a fresh lookup. Usually it would find the same object, but it may not if an object exists both as a loose and a packed object. We may end up checking the packed object twice, and never look at the loose one. In practice this isn't too terrible, because if fsck doesn't complain, it means you have at least one good copy. But since the point of fsck is to look for corruption, we should be thorough. The new read_loose_object() interface can help us get the data from disk, and then we replace parse_object() with parse_object_buffer(). As a bonus, our error messages now mention the path to a corrupted object, which should make it easier to track down errors when they do happen. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-01-13 18:59:44 +01:00			`return NULL;`

			`if (!contents && type != OBJ_BLOB)`
			`die("BUG: read_loose_object streamed a non-blob");`

object: convert parse_object* to take struct object_id Make parse_object, parse_object_or_die, and parse_object_buffer take a pointer to struct object_id. Remove the temporary variables inserted earlier, since they are no longer necessary. Transform all of the callers using the following semantic patch: @@ expression E1; @@ - parse_object(E1.hash) + parse_object(&E1) @@ expression E1; @@ - parse_object(E1->hash) + parse_object(E1) @@ expression E1, E2; @@ - parse_object_or_die(E1.hash, E2) + parse_object_or_die(&E1, E2) @@ expression E1, E2; @@ - parse_object_or_die(E1->hash, E2) + parse_object_or_die(E1, E2) @@ expression E1, E2, E3, E4, E5; @@ - parse_object_buffer(E1.hash, E2, E3, E4, E5) + parse_object_buffer(&E1, E2, E3, E4, E5) @@ expression E1, E2, E3, E4, E5; @@ - parse_object_buffer(E1->hash, E2, E3, E4, E5) + parse_object_buffer(E1, E2, E3, E4, E5) Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-05-07 00:10:38 +02:00			`obj = parse_object_buffer(oid, type, size, contents, &eaten);`
fsck: parse loose object paths directly When we iterate over the list of loose objects to check, we get the actual path of each object. But we then throw it away and pass just the sha1 to fsck_sha1(), which will do a fresh lookup. Usually it would find the same object, but it may not if an object exists both as a loose and a packed object. We may end up checking the packed object twice, and never look at the loose one. In practice this isn't too terrible, because if fsck doesn't complain, it means you have at least one good copy. But since the point of fsck is to look for corruption, we should be thorough. The new read_loose_object() interface can help us get the data from disk, and then we replace parse_object() with parse_object_buffer(). As a bonus, our error messages now mention the path to a corrupted object, which should make it easier to track down errors when they do happen. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-01-13 18:59:44 +01:00
			`if (!eaten)`
			`free(contents);`
			`return obj;`
			`}`

Convert object iteration callbacks to struct object_id Convert each_loose_object_fn and each_packed_object_fn to take a pointer to struct object_id. Update the various callbacks. Convert several 40-based constants to use GIT_SHA1_HEXSZ. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-02-22 00:47:35 +01:00			`static int fsck_loose(const struct object_id oid, const char path, void *data)`
fsck: use for_each_loose_file_in_objdir Since 27e1e22 (prune: factor out loose-object directory traversal, 2014-10-15), we now have a generic callback system for iterating over the loose object directories. This is used by prune, count-objects, etc. We did not convert git-fsck at the time because it implemented an inode-sorting scheme that was not part of the generic code. Now that the inode-sorting code is gone, we can reuse the generic code. The result is shorter, hopefully more readable, and drops some unchecked sprintf calls. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-09-24 23:08:33 +02:00			`{`
Convert object iteration callbacks to struct object_id Convert each_loose_object_fn and each_packed_object_fn to take a pointer to struct object_id. Update the various callbacks. Convert several 40-based constants to use GIT_SHA1_HEXSZ. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-02-22 00:47:35 +01:00			`struct object *obj = parse_loose_object(oid, path);`
fsck: parse loose object paths directly When we iterate over the list of loose objects to check, we get the actual path of each object. But we then throw it away and pass just the sha1 to fsck_sha1(), which will do a fresh lookup. Usually it would find the same object, but it may not if an object exists both as a loose and a packed object. We may end up checking the packed object twice, and never look at the loose one. In practice this isn't too terrible, because if fsck doesn't complain, it means you have at least one good copy. But since the point of fsck is to look for corruption, we should be thorough. The new read_loose_object() interface can help us get the data from disk, and then we replace parse_object() with parse_object_buffer(). As a bonus, our error messages now mention the path to a corrupted object, which should make it easier to track down errors when they do happen. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-01-13 18:59:44 +01:00
			`if (!obj) {`
			`errors_found \|= ERROR_OBJECT;`
			`error("%s: object corrupt or missing: %s",`
Convert object iteration callbacks to struct object_id Convert each_loose_object_fn and each_packed_object_fn to take a pointer to struct object_id. Update the various callbacks. Convert several 40-based constants to use GIT_SHA1_HEXSZ. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-02-22 00:47:35 +01:00			`oid_to_hex(oid), path);`
fsck: parse loose object paths directly When we iterate over the list of loose objects to check, we get the actual path of each object. But we then throw it away and pass just the sha1 to fsck_sha1(), which will do a fresh lookup. Usually it would find the same object, but it may not if an object exists both as a loose and a packed object. We may end up checking the packed object twice, and never look at the loose one. In practice this isn't too terrible, because if fsck doesn't complain, it means you have at least one good copy. But since the point of fsck is to look for corruption, we should be thorough. The new read_loose_object() interface can help us get the data from disk, and then we replace parse_object() with parse_object_buffer(). As a bonus, our error messages now mention the path to a corrupted object, which should make it easier to track down errors when they do happen. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-01-13 18:59:44 +01:00			`return 0; /* keep checking other objects */`
			`}`

			`obj->flags = HAS_OBJ;`
			`if (fsck_obj(obj))`
fsck: use for_each_loose_file_in_objdir Since 27e1e22 (prune: factor out loose-object directory traversal, 2014-10-15), we now have a generic callback system for iterating over the loose object directories. This is used by prune, count-objects, etc. We did not convert git-fsck at the time because it implemented an inode-sorting scheme that was not part of the generic code. Now that the inode-sorting code is gone, we can reuse the generic code. The result is shorter, hopefully more readable, and drops some unchecked sprintf calls. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-09-24 23:08:33 +02:00			`errors_found \|= ERROR_OBJECT;`
			`return 0;`
			`}`

			`static int fsck_cruft(const char basename, const char path, void *data)`
			`{`
			`if (!starts_with(basename, "tmp_obj_"))`
			`fprintf(stderr, "bad sha1 file: %s\n", path);`
			`return 0;`
			`}`

			`static int fsck_subdir(int nr, const char path, void progress)`
			`{`
			`display_progress(progress, nr + 1);`
			`return 0;`
			`}`

[PATCH] Update fsck-cache (take 2) The fsck-cache complains if objects referred to by files in .git/refs/ or objects stored in files under .git/objects/??/ are not found as stand-alone SHA1 files (i.e. found in alternate object pools GIT_ALTERNATE_OBJECT_DIRECTORIES or packed archives stored under .git/objects/pack). Although this is a good semantics to maintain consistency of a single .git/objects directory as a self contained set of objects, it sometimes is useful to consider it is OK as long as these "outside" objects are available. This commit introduces a new flag, --standalone, to git-fsck-cache. When it is not specified, connectivity checks and .git/refs pointer checks are taught that it is OK when expected objects do not exist under .git/objects/?? hierarchy but are available from an packed archive or in an alternate object pool. Another new flag, --full, makes git-fsck-cache to check not only the current GIT_OBJECT_DIRECTORY but also objects found in alternate object pools and packed GIT archives.a Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-28 23:58:33 +02:00			`static void fsck_object_dir(const char *path)`
			`{`
fsck: print progress fsck is usually a long process and it would be nice if it prints progress from time to time. Progress meter is not printed when --verbose is given because --verbose prints a lot, there's no need for "alive" indicator. Progress meter may provide "% complete" information but it would be lost anyway in the flood of text. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-11-07 03:59:26 +01:00			`struct progress *progress = NULL;`
git-fsck: learn about --verbose With --verbose, it gets really chatty now. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-06-05 04:44:00 +02:00
			`if (verbose)`
			`fprintf(stderr, "Checking object directory\n");`

fsck: print progress fsck is usually a long process and it would be nice if it prints progress from time to time. Progress meter is not printed when --verbose is given because --verbose prints a lot, there's no need for "alive" indicator. Progress meter may provide "% complete" information but it would be lost anyway in the flood of text. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-11-07 03:59:26 +01:00			`if (show_progress)`
i18n: mark all progress lines for translation Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-02-21 13:50:18 +01:00			`progress = start_progress(_("Checking object directories"), 256);`
fsck: use for_each_loose_file_in_objdir Since 27e1e22 (prune: factor out loose-object directory traversal, 2014-10-15), we now have a generic callback system for iterating over the loose object directories. This is used by prune, count-objects, etc. We did not convert git-fsck at the time because it implemented an inode-sorting scheme that was not part of the generic code. Now that the inode-sorting code is gone, we can reuse the generic code. The result is shorter, hopefully more readable, and drops some unchecked sprintf calls. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-09-24 23:08:33 +02:00
			`for_each_loose_file_in_objdir(path, fsck_loose, fsck_cruft, fsck_subdir,`
			`progress);`
			`display_progress(progress, 256);`
fsck: print progress fsck is usually a long process and it would be nice if it prints progress from time to time. Progress meter is not printed when --verbose is given because --verbose prints a lot, there's no need for "alive" indicator. Progress meter may provide "% complete" information but it would be lost anyway in the flood of text. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-11-07 03:59:26 +01:00			`stop_progress(&progress);`
[PATCH] Update fsck-cache (take 2) The fsck-cache complains if objects referred to by files in .git/refs/ or objects stored in files under .git/objects/??/ are not found as stand-alone SHA1 files (i.e. found in alternate object pools GIT_ALTERNATE_OBJECT_DIRECTORIES or packed archives stored under .git/objects/pack). Although this is a good semantics to maintain consistency of a single .git/objects directory as a self contained set of objects, it sometimes is useful to consider it is OK as long as these "outside" objects are available. This commit introduces a new flag, --standalone, to git-fsck-cache. When it is not specified, connectivity checks and .git/refs pointer checks are taught that it is OK when expected objects do not exist under .git/objects/?? hierarchy but are available from an packed archive or in an alternate object pool. Another new flag, --full, makes git-fsck-cache to check not only the current GIT_OBJECT_DIRECTORY but also objects found in alternate object pools and packed GIT archives.a Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-28 23:58:33 +02:00			`}`

Make git-fsck-cache check HEAD integrity In particular, check that it's a symlink, and points to refs/heads/. We depend on that these days not only for "git checkout", but also because fsck and others only check for references in the .git/refs/ subdirectory, not things like HEAD itself. 2005-07-03 19:40:38 +02:00			`static int fsck_head_link(void)`
			`{`
fsck: do not complain on detached HEAD. Detached HEAD is just a normal state of a repository. Do not say anything about it. Do not give worrying "error:" messages when we let the user know that the HEAD points at nothing (i.e. yet to be born branch), nor we do not have any default refs to start following the objects chain. Reword them as "notice:". Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-11 10:28:43 +02:00			`int null_is_error = 0;`

git-fsck: learn about --verbose With --verbose, it gets really chatty now. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-06-05 04:44:00 +02:00			`if (verbose)`
			`fprintf(stderr, "Checking HEAD link\n");`

fsck_head_link(): remove unneeded flag variable It is never read, so we can pass NULL to resolve_ref_unsafe(). Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: David Turner <dturner@twopensource.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-04-07 21:03:05 +02:00			`head_points_at = resolve_ref_unsafe("HEAD", 0, head_oid.hash, NULL);`
fsck: exit with non-zero when problems are found After finding some problems (e.g. a ref refs/heads/X points at an object that is not a commit) and issuing an error message, the program failed to signal the fact that it found an error by a non-zero exit status. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-09-23 22:46:39 +02:00			`if (!head_points_at) {`
			`errors_found \|= ERROR_REFS;`
fsck: do not complain on detached HEAD. Detached HEAD is just a normal state of a repository. Do not say anything about it. Do not give worrying "error:" messages when we let the user know that the HEAD points at nothing (i.e. yet to be born branch), nor we do not have any default refs to start following the objects chain. Reword them as "notice:". Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-11 10:28:43 +02:00			`return error("Invalid HEAD");`
fsck: exit with non-zero when problems are found After finding some problems (e.g. a ref refs/heads/X points at an object that is not a commit) and issuing an error message, the program failed to signal the fact that it found an error by a non-zero exit status. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-09-23 22:46:39 +02:00			`}`
fsck: do not complain on detached HEAD. Detached HEAD is just a normal state of a repository. Do not say anything about it. Do not give worrying "error:" messages when we let the user know that the HEAD points at nothing (i.e. yet to be born branch), nor we do not have any default refs to start following the objects chain. Reword them as "notice:". Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-11 10:28:43 +02:00			`if (!strcmp(head_points_at, "HEAD"))`
			`/* detached HEAD */`
			`null_is_error = 1;`
fsck: exit with non-zero when problems are found After finding some problems (e.g. a ref refs/heads/X points at an object that is not a commit) and issuing an error message, the program failed to signal the fact that it found an error by a non-zero exit status. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-09-23 22:46:39 +02:00			`else if (!starts_with(head_points_at, "refs/heads/")) {`
			`errors_found \|= ERROR_REFS;`
Add git-symbolic-ref This adds the counterpart of git-update-ref that lets you read and create "symbolic refs". By default it uses a symbolic link to represent ".git/HEAD -> refs/heads/master", but it can be compiled to use the textfile symbolic ref. The places that did 'readlink .git/HEAD' and 'ln -s refs/heads/blah .git/HEAD' have been converted to use new git-symbolic-ref command, so that they can deal with either implementation. Signed-off-by: Junio C Hamano <junio@twinsun.com> 2005-09-30 23:26:57 +02:00			`return error("HEAD points to something strange (%s)",`
fsck-objects: adjust to resolve_ref() clean-up. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-09-18 10:08:00 +02:00			`head_points_at);`
fsck: exit with non-zero when problems are found After finding some problems (e.g. a ref refs/heads/X points at an object that is not a commit) and issuing an error message, the program failed to signal the fact that it found an error by a non-zero exit status. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-09-23 22:46:39 +02:00			`}`
fsck: change functions to use object_id Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-05-25 20:38:50 +02:00			`if (is_null_oid(&head_oid)) {`
fsck: exit with non-zero when problems are found After finding some problems (e.g. a ref refs/heads/X points at an object that is not a commit) and issuing an error message, the program failed to signal the fact that it found an error by a non-zero exit status. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-09-23 22:46:39 +02:00			`if (null_is_error) {`
			`errors_found \|= ERROR_REFS;`
fsck: do not complain on detached HEAD. Detached HEAD is just a normal state of a repository. Do not say anything about it. Do not give worrying "error:" messages when we let the user know that the HEAD points at nothing (i.e. yet to be born branch), nor we do not have any default refs to start following the objects chain. Reword them as "notice:". Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-11 10:28:43 +02:00			`return error("HEAD: detached HEAD points at nothing");`
fsck: exit with non-zero when problems are found After finding some problems (e.g. a ref refs/heads/X points at an object that is not a commit) and issuing an error message, the program failed to signal the fact that it found an error by a non-zero exit status. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-09-23 22:46:39 +02:00			`}`
fsck: do not complain on detached HEAD. Detached HEAD is just a normal state of a repository. Do not say anything about it. Do not give worrying "error:" messages when we let the user know that the HEAD points at nothing (i.e. yet to be born branch), nor we do not have any default refs to start following the objects chain. Reword them as "notice:". Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-11 10:28:43 +02:00			`fprintf(stderr, "notice: HEAD points to an unborn branch (%s)\n",`
			`head_points_at + 11);`
			`}`
Make git-fsck-cache check HEAD integrity In particular, check that it's a symlink, and points to refs/heads/. We depend on that these days not only for "git checkout", but also because fsck and others only check for references in the .git/refs/ subdirectory, not things like HEAD itself. 2005-07-03 19:40:38 +02:00			`return 0;`
			`}`

Teach fsck-objects about cache-tree. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-26 01:37:08 +02:00			`static int fsck_cache_tree(struct cache_tree *it)`
			`{`
			`int i;`
			`int err = 0;`

git-fsck: learn about --verbose With --verbose, it gets really chatty now. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-06-05 04:44:00 +02:00			`if (verbose)`
			`fprintf(stderr, "Checking cache tree\n");`

Teach fsck-objects about cache-tree. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-26 01:37:08 +02:00			`if (0 <= it->entry_count) {`
object: convert parse_object* to take struct object_id Make parse_object, parse_object_or_die, and parse_object_buffer take a pointer to struct object_id. Remove the temporary variables inserted earlier, since they are no longer necessary. Transform all of the callers using the following semantic patch: @@ expression E1; @@ - parse_object(E1.hash) + parse_object(&E1) @@ expression E1; @@ - parse_object(E1->hash) + parse_object(E1) @@ expression E1, E2; @@ - parse_object_or_die(E1.hash, E2) + parse_object_or_die(&E1, E2) @@ expression E1, E2; @@ - parse_object_or_die(E1->hash, E2) + parse_object_or_die(E1, E2) @@ expression E1, E2, E3, E4, E5; @@ - parse_object_buffer(E1.hash, E2, E3, E4, E5) + parse_object_buffer(&E1, E2, E3, E4, E5) @@ expression E1, E2, E3, E4, E5; @@ - parse_object_buffer(E1->hash, E2, E3, E4, E5) + parse_object_buffer(E1, E2, E3, E4, E5) Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-05-07 00:10:38 +02:00			`struct object *obj = parse_object(&it->oid);`
fsck-objects: do not segfault on missing tree in cache-tree Even if trees are missing in cache-tree, we should continue and check the rest of the object database. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-04 06:17:45 +02:00			`if (!obj) {`
			`error("%s: invalid sha1 pointer in cache-tree",`
Convert struct cache_tree to use struct object_id Convert the sha1 member of struct cache_tree to struct object_id by changing the definition and applying the following semantic patch, plus the standard object_id transforms: @@ struct cache_tree E1; @@ - E1.sha1 + E1.oid.hash @@ struct cache_tree *E1; @@ - E1->sha1 + E1->oid.hash Fix up one reference to active_cache_tree which was not automatically caught by Coccinelle. These changes are prerequisites for converting parse_object. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-05-01 04:28:56 +02:00			`oid_to_hex(&it->oid));`
fsck: exit with non-zero when problems are found After finding some problems (e.g. a ref refs/heads/X points at an object that is not a commit) and issuing an error message, the program failed to signal the fact that it found an error by a non-zero exit status. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-09-23 22:46:39 +02:00			`errors_found \|= ERROR_REFS;`
fsck-objects: do not segfault on missing tree in cache-tree Even if trees are missing in cache-tree, we should continue and check the rest of the object database. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-04 06:17:45 +02:00			`return 1;`
			`}`
fsck-objects: mark objects reachable from cache-tree When fsck-objects scanned cache-tree, it forgot to mark the trees it found reachable and in use. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-02 07:15:54 +02:00			`obj->used = 1;`
fsck: optionally show more helpful info for broken links When reporting broken links between commits/trees/blobs, it would be quite helpful at times if the user would be told how the object is supposed to be reachable. With the new --name-objects option, git-fsck will try to do exactly that: name the objects in a way that shows how they are reachable. For example, when some reflog got corrupted and a blob is missing that should not be, the user might want to remove the corresponding reflog entry. This option helps them find that entry: `git fsck` will now report something like this: broken link from tree b5eb6ff... (refs/stash@{<date>}~37:) to blob ec5cf80... Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-07-17 13:00:02 +02:00			`if (name_objects)`
			`add_decoration(fsck_walk_options.object_names,`
			`obj, xstrdup(":"));`
fsck: drop unused parameter from traverse_one_object() Also add comments to seemingly unsafe pointer dereferences, that are all safe. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-01-26 21:46:55 +01:00			`mark_object_reachable(obj);`
Remove TYPE_* constant macros and use object_type enums consistently. This updates the type-enumeration constants introduced to reduce the memory footprint of "struct object" to match the type bits already used in the packfile format, by removing the former (i.e. TYPE_* constant macros) and using the latter (i.e. enum object_type) throughout the code for consistency. Eventually we can stop passing around the "type strings" entirely, and this will help - no confusion about two different integer enumeration. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-07-12 05:45:31 +02:00			`if (obj->type != OBJ_TREE)`
Teach fsck-objects about cache-tree. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-26 01:37:08 +02:00			`err \|= objerror(obj, "non-tree in cache-tree");`
			`}`
			`for (i = 0; i < it->subtree_nr; i++)`
			`err \|= fsck_cache_tree(it->down[i]->cache_tree);`
			`return err;`
			`}`

Convert object iteration callbacks to struct object_id Convert each_loose_object_fn and each_packed_object_fn to take a pointer to struct object_id. Update the various callbacks. Convert several 40-based constants to use GIT_SHA1_HEXSZ. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-02-22 00:47:35 +01:00			`static void mark_object_for_connectivity(const struct object_id *oid)`
fsck: prepare dummy objects for --connectivity-check Normally fsck makes a pass over all objects to check their integrity, and then follows up with a reachability check to make sure we have all of the referenced objects (and to know which ones are dangling). The latter checks for the HAS_OBJ flag in obj->flags to see if we found the object in the first pass. Commit 02976bf85 (fsck: introduce `git fsck --connectivity-only`, 2015-06-22) taught fsck to skip the initial pass, and to fallback to has_sha1_file() instead of the HAS_OBJ check. However, it converted only one HAS_OBJ check to use has_sha1_file(). But there are many other places in builtin/fsck.c that assume that the flag is set (or that lookup_object() will return an object at all). This leads to several bugs with --connectivity-only: 1. mark_object() will not queue objects for examination, so recursively following links from commits to trees, etc, did nothing. I.e., we were checking the reachability of hardly anything at all. 2. When a set of heads is given on the command-line, we use lookup_object() to see if they exist. But without the initial pass, we assume nothing exists. 3. When loading reflog entries, we do a similar lookup_object() check, and complain that the reflog is broken if the object doesn't exist in our hash. So in short, --connectivity-only is broken pretty badly, and will claim that your repository is fine when it's not. Presumably nobody noticed for a few reasons. One is that the embedded test does not actually test the recursive nature of the reachability check. All of the missing objects are still in the index, and we directly check items from the index. This patch modifies the test to delete the index, which shows off breakage (1). Another is that --connectivity-only just skips the initial pass for loose objects. So on a real repository, the packed objects were still checked correctly. But on the flipside, it means that "git fsck --connectivity-only" still checks the sha1 of all of the packed objects, nullifying its original purpose of being a faster git-fsck. And of course the final problem is that the bug only shows up when there _is_ corruption, which is rare. So anybody running "git fsck --connectivity-only" proactively would assume it was being thorough, when it was not. One possibility for fixing this is to find all of the spots that rely on HAS_OBJ and tweak them for the connectivity-only case. But besides the risk that we might miss a spot (and I found three already, corresponding to the three bugs above), there are other parts of fsck that _can't_ work without a full list of objects. E.g., the list of dangling objects. Instead, let's make the connectivity-only case look more like the normal case. Rather than skip the initial pass completely, we'll do an abbreviated one that sets up the HAS_OBJ flag for each object, without actually loading the object data. That's simple and fast, and we don't have to care about the connectivity_only flag in the rest of the code at all. While we're at it, let's make sure we treat loose and packed objects the same (i.e., setting up dummy objects for both and skipping the actual sha1 check). That makes the connectivity-only check actually fast on a real repo (40 seconds versus 180 seconds on my copy of linux.git). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-01-17 22:32:57 +01:00			`{`
Convert object iteration callbacks to struct object_id Convert each_loose_object_fn and each_packed_object_fn to take a pointer to struct object_id. Update the various callbacks. Convert several 40-based constants to use GIT_SHA1_HEXSZ. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-02-22 00:47:35 +01:00			`struct object *obj = lookup_unknown_object(oid->hash);`
fsck: prepare dummy objects for --connectivity-check Normally fsck makes a pass over all objects to check their integrity, and then follows up with a reachability check to make sure we have all of the referenced objects (and to know which ones are dangling). The latter checks for the HAS_OBJ flag in obj->flags to see if we found the object in the first pass. Commit 02976bf85 (fsck: introduce `git fsck --connectivity-only`, 2015-06-22) taught fsck to skip the initial pass, and to fallback to has_sha1_file() instead of the HAS_OBJ check. However, it converted only one HAS_OBJ check to use has_sha1_file(). But there are many other places in builtin/fsck.c that assume that the flag is set (or that lookup_object() will return an object at all). This leads to several bugs with --connectivity-only: 1. mark_object() will not queue objects for examination, so recursively following links from commits to trees, etc, did nothing. I.e., we were checking the reachability of hardly anything at all. 2. When a set of heads is given on the command-line, we use lookup_object() to see if they exist. But without the initial pass, we assume nothing exists. 3. When loading reflog entries, we do a similar lookup_object() check, and complain that the reflog is broken if the object doesn't exist in our hash. So in short, --connectivity-only is broken pretty badly, and will claim that your repository is fine when it's not. Presumably nobody noticed for a few reasons. One is that the embedded test does not actually test the recursive nature of the reachability check. All of the missing objects are still in the index, and we directly check items from the index. This patch modifies the test to delete the index, which shows off breakage (1). Another is that --connectivity-only just skips the initial pass for loose objects. So on a real repository, the packed objects were still checked correctly. But on the flipside, it means that "git fsck --connectivity-only" still checks the sha1 of all of the packed objects, nullifying its original purpose of being a faster git-fsck. And of course the final problem is that the bug only shows up when there _is_ corruption, which is rare. So anybody running "git fsck --connectivity-only" proactively would assume it was being thorough, when it was not. One possibility for fixing this is to find all of the spots that rely on HAS_OBJ and tweak them for the connectivity-only case. But besides the risk that we might miss a spot (and I found three already, corresponding to the three bugs above), there are other parts of fsck that _can't_ work without a full list of objects. E.g., the list of dangling objects. Instead, let's make the connectivity-only case look more like the normal case. Rather than skip the initial pass completely, we'll do an abbreviated one that sets up the HAS_OBJ flag for each object, without actually loading the object data. That's simple and fast, and we don't have to care about the connectivity_only flag in the rest of the code at all. While we're at it, let's make sure we treat loose and packed objects the same (i.e., setting up dummy objects for both and skipping the actual sha1 check). That makes the connectivity-only check actually fast on a real repo (40 seconds versus 180 seconds on my copy of linux.git). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-01-17 22:32:57 +01:00			`obj->flags \|= HAS_OBJ;`
			`}`

Convert object iteration callbacks to struct object_id Convert each_loose_object_fn and each_packed_object_fn to take a pointer to struct object_id. Update the various callbacks. Convert several 40-based constants to use GIT_SHA1_HEXSZ. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-02-22 00:47:35 +01:00			`static int mark_loose_for_connectivity(const struct object_id *oid,`
fsck: prepare dummy objects for --connectivity-check Normally fsck makes a pass over all objects to check their integrity, and then follows up with a reachability check to make sure we have all of the referenced objects (and to know which ones are dangling). The latter checks for the HAS_OBJ flag in obj->flags to see if we found the object in the first pass. Commit 02976bf85 (fsck: introduce `git fsck --connectivity-only`, 2015-06-22) taught fsck to skip the initial pass, and to fallback to has_sha1_file() instead of the HAS_OBJ check. However, it converted only one HAS_OBJ check to use has_sha1_file(). But there are many other places in builtin/fsck.c that assume that the flag is set (or that lookup_object() will return an object at all). This leads to several bugs with --connectivity-only: 1. mark_object() will not queue objects for examination, so recursively following links from commits to trees, etc, did nothing. I.e., we were checking the reachability of hardly anything at all. 2. When a set of heads is given on the command-line, we use lookup_object() to see if they exist. But without the initial pass, we assume nothing exists. 3. When loading reflog entries, we do a similar lookup_object() check, and complain that the reflog is broken if the object doesn't exist in our hash. So in short, --connectivity-only is broken pretty badly, and will claim that your repository is fine when it's not. Presumably nobody noticed for a few reasons. One is that the embedded test does not actually test the recursive nature of the reachability check. All of the missing objects are still in the index, and we directly check items from the index. This patch modifies the test to delete the index, which shows off breakage (1). Another is that --connectivity-only just skips the initial pass for loose objects. So on a real repository, the packed objects were still checked correctly. But on the flipside, it means that "git fsck --connectivity-only" still checks the sha1 of all of the packed objects, nullifying its original purpose of being a faster git-fsck. And of course the final problem is that the bug only shows up when there _is_ corruption, which is rare. So anybody running "git fsck --connectivity-only" proactively would assume it was being thorough, when it was not. One possibility for fixing this is to find all of the spots that rely on HAS_OBJ and tweak them for the connectivity-only case. But besides the risk that we might miss a spot (and I found three already, corresponding to the three bugs above), there are other parts of fsck that _can't_ work without a full list of objects. E.g., the list of dangling objects. Instead, let's make the connectivity-only case look more like the normal case. Rather than skip the initial pass completely, we'll do an abbreviated one that sets up the HAS_OBJ flag for each object, without actually loading the object data. That's simple and fast, and we don't have to care about the connectivity_only flag in the rest of the code at all. While we're at it, let's make sure we treat loose and packed objects the same (i.e., setting up dummy objects for both and skipping the actual sha1 check). That makes the connectivity-only check actually fast on a real repo (40 seconds versus 180 seconds on my copy of linux.git). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-01-17 22:32:57 +01:00			`const char *path,`
			`void *data)`
			`{`
Convert object iteration callbacks to struct object_id Convert each_loose_object_fn and each_packed_object_fn to take a pointer to struct object_id. Update the various callbacks. Convert several 40-based constants to use GIT_SHA1_HEXSZ. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-02-22 00:47:35 +01:00			`mark_object_for_connectivity(oid);`
fsck: prepare dummy objects for --connectivity-check Normally fsck makes a pass over all objects to check their integrity, and then follows up with a reachability check to make sure we have all of the referenced objects (and to know which ones are dangling). The latter checks for the HAS_OBJ flag in obj->flags to see if we found the object in the first pass. Commit 02976bf85 (fsck: introduce `git fsck --connectivity-only`, 2015-06-22) taught fsck to skip the initial pass, and to fallback to has_sha1_file() instead of the HAS_OBJ check. However, it converted only one HAS_OBJ check to use has_sha1_file(). But there are many other places in builtin/fsck.c that assume that the flag is set (or that lookup_object() will return an object at all). This leads to several bugs with --connectivity-only: 1. mark_object() will not queue objects for examination, so recursively following links from commits to trees, etc, did nothing. I.e., we were checking the reachability of hardly anything at all. 2. When a set of heads is given on the command-line, we use lookup_object() to see if they exist. But without the initial pass, we assume nothing exists. 3. When loading reflog entries, we do a similar lookup_object() check, and complain that the reflog is broken if the object doesn't exist in our hash. So in short, --connectivity-only is broken pretty badly, and will claim that your repository is fine when it's not. Presumably nobody noticed for a few reasons. One is that the embedded test does not actually test the recursive nature of the reachability check. All of the missing objects are still in the index, and we directly check items from the index. This patch modifies the test to delete the index, which shows off breakage (1). Another is that --connectivity-only just skips the initial pass for loose objects. So on a real repository, the packed objects were still checked correctly. But on the flipside, it means that "git fsck --connectivity-only" still checks the sha1 of all of the packed objects, nullifying its original purpose of being a faster git-fsck. And of course the final problem is that the bug only shows up when there _is_ corruption, which is rare. So anybody running "git fsck --connectivity-only" proactively would assume it was being thorough, when it was not. One possibility for fixing this is to find all of the spots that rely on HAS_OBJ and tweak them for the connectivity-only case. But besides the risk that we might miss a spot (and I found three already, corresponding to the three bugs above), there are other parts of fsck that _can't_ work without a full list of objects. E.g., the list of dangling objects. Instead, let's make the connectivity-only case look more like the normal case. Rather than skip the initial pass completely, we'll do an abbreviated one that sets up the HAS_OBJ flag for each object, without actually loading the object data. That's simple and fast, and we don't have to care about the connectivity_only flag in the rest of the code at all. While we're at it, let's make sure we treat loose and packed objects the same (i.e., setting up dummy objects for both and skipping the actual sha1 check). That makes the connectivity-only check actually fast on a real repo (40 seconds versus 180 seconds on my copy of linux.git). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-01-17 22:32:57 +01:00			`return 0;`
			`}`

Convert object iteration callbacks to struct object_id Convert each_loose_object_fn and each_packed_object_fn to take a pointer to struct object_id. Update the various callbacks. Convert several 40-based constants to use GIT_SHA1_HEXSZ. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-02-22 00:47:35 +01:00			`static int mark_packed_for_connectivity(const struct object_id *oid,`
fsck: prepare dummy objects for --connectivity-check Normally fsck makes a pass over all objects to check their integrity, and then follows up with a reachability check to make sure we have all of the referenced objects (and to know which ones are dangling). The latter checks for the HAS_OBJ flag in obj->flags to see if we found the object in the first pass. Commit 02976bf85 (fsck: introduce `git fsck --connectivity-only`, 2015-06-22) taught fsck to skip the initial pass, and to fallback to has_sha1_file() instead of the HAS_OBJ check. However, it converted only one HAS_OBJ check to use has_sha1_file(). But there are many other places in builtin/fsck.c that assume that the flag is set (or that lookup_object() will return an object at all). This leads to several bugs with --connectivity-only: 1. mark_object() will not queue objects for examination, so recursively following links from commits to trees, etc, did nothing. I.e., we were checking the reachability of hardly anything at all. 2. When a set of heads is given on the command-line, we use lookup_object() to see if they exist. But without the initial pass, we assume nothing exists. 3. When loading reflog entries, we do a similar lookup_object() check, and complain that the reflog is broken if the object doesn't exist in our hash. So in short, --connectivity-only is broken pretty badly, and will claim that your repository is fine when it's not. Presumably nobody noticed for a few reasons. One is that the embedded test does not actually test the recursive nature of the reachability check. All of the missing objects are still in the index, and we directly check items from the index. This patch modifies the test to delete the index, which shows off breakage (1). Another is that --connectivity-only just skips the initial pass for loose objects. So on a real repository, the packed objects were still checked correctly. But on the flipside, it means that "git fsck --connectivity-only" still checks the sha1 of all of the packed objects, nullifying its original purpose of being a faster git-fsck. And of course the final problem is that the bug only shows up when there _is_ corruption, which is rare. So anybody running "git fsck --connectivity-only" proactively would assume it was being thorough, when it was not. One possibility for fixing this is to find all of the spots that rely on HAS_OBJ and tweak them for the connectivity-only case. But besides the risk that we might miss a spot (and I found three already, corresponding to the three bugs above), there are other parts of fsck that _can't_ work without a full list of objects. E.g., the list of dangling objects. Instead, let's make the connectivity-only case look more like the normal case. Rather than skip the initial pass completely, we'll do an abbreviated one that sets up the HAS_OBJ flag for each object, without actually loading the object data. That's simple and fast, and we don't have to care about the connectivity_only flag in the rest of the code at all. While we're at it, let's make sure we treat loose and packed objects the same (i.e., setting up dummy objects for both and skipping the actual sha1 check). That makes the connectivity-only check actually fast on a real repo (40 seconds versus 180 seconds on my copy of linux.git). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-01-17 22:32:57 +01:00			`struct packed_git *pack,`
			`uint32_t pos,`
			`void *data)`
			`{`
Convert object iteration callbacks to struct object_id Convert each_loose_object_fn and each_packed_object_fn to take a pointer to struct object_id. Update the various callbacks. Convert several 40-based constants to use GIT_SHA1_HEXSZ. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-02-22 00:47:35 +01:00			`mark_object_for_connectivity(oid);`
fsck: prepare dummy objects for --connectivity-check Normally fsck makes a pass over all objects to check their integrity, and then follows up with a reachability check to make sure we have all of the referenced objects (and to know which ones are dangling). The latter checks for the HAS_OBJ flag in obj->flags to see if we found the object in the first pass. Commit 02976bf85 (fsck: introduce `git fsck --connectivity-only`, 2015-06-22) taught fsck to skip the initial pass, and to fallback to has_sha1_file() instead of the HAS_OBJ check. However, it converted only one HAS_OBJ check to use has_sha1_file(). But there are many other places in builtin/fsck.c that assume that the flag is set (or that lookup_object() will return an object at all). This leads to several bugs with --connectivity-only: 1. mark_object() will not queue objects for examination, so recursively following links from commits to trees, etc, did nothing. I.e., we were checking the reachability of hardly anything at all. 2. When a set of heads is given on the command-line, we use lookup_object() to see if they exist. But without the initial pass, we assume nothing exists. 3. When loading reflog entries, we do a similar lookup_object() check, and complain that the reflog is broken if the object doesn't exist in our hash. So in short, --connectivity-only is broken pretty badly, and will claim that your repository is fine when it's not. Presumably nobody noticed for a few reasons. One is that the embedded test does not actually test the recursive nature of the reachability check. All of the missing objects are still in the index, and we directly check items from the index. This patch modifies the test to delete the index, which shows off breakage (1). Another is that --connectivity-only just skips the initial pass for loose objects. So on a real repository, the packed objects were still checked correctly. But on the flipside, it means that "git fsck --connectivity-only" still checks the sha1 of all of the packed objects, nullifying its original purpose of being a faster git-fsck. And of course the final problem is that the bug only shows up when there _is_ corruption, which is rare. So anybody running "git fsck --connectivity-only" proactively would assume it was being thorough, when it was not. One possibility for fixing this is to find all of the spots that rely on HAS_OBJ and tweak them for the connectivity-only case. But besides the risk that we might miss a spot (and I found three already, corresponding to the three bugs above), there are other parts of fsck that _can't_ work without a full list of objects. E.g., the list of dangling objects. Instead, let's make the connectivity-only case look more like the normal case. Rather than skip the initial pass completely, we'll do an abbreviated one that sets up the HAS_OBJ flag for each object, without actually loading the object data. That's simple and fast, and we don't have to care about the connectivity_only flag in the rest of the code at all. While we're at it, let's make sure we treat loose and packed objects the same (i.e., setting up dummy objects for both and skipping the actual sha1 check). That makes the connectivity-only check actually fast on a real repo (40 seconds versus 180 seconds on my copy of linux.git). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-01-17 22:32:57 +01:00			`return 0;`
			`}`

Make builtin-fsck.c use parse_options. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-10-15 22:34:05 +02:00			`static char const * const fsck_usage[] = {`
standardize usage info string format This patch puts the usage info strings that were not already in docopt- like format into docopt-like format, which will be a litle easier for end users and a lot easier for translators. Changes include: - Placing angle brackets around fill-in-the-blank parameters - Putting dashes in multiword parameter names - Adding spaces to [-f\|--foobar] to make [-f \| --foobar] - Replacing <foobar>* with [<foobar>...] Signed-off-by: Alex Henrie <alexhenrie24@gmail.com> Reviewed-by: Matthieu Moy <Matthieu.Moy@imag.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-01-13 08:44:47 +01:00			`N_("git fsck [<options>] [<object>...]"),`
Make builtin-fsck.c use parse_options. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-10-15 22:34:05 +02:00			`NULL`
			`};`

			`static struct option fsck_opts[] = {`
i18n: fsck: mark parseopt strings for translation Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-08-20 14:32:13 +02:00			`OPT__VERBOSE(&verbose, N_("be verbose")),`
Replace deprecated OPT_BOOLEAN by OPT_BOOL This task emerged from b04ba2bb (parse-options: deprecate OPT_BOOLEAN, 2011-09-27). All occurrences of the respective variables have been reviewed and none of them relied on the counting up mechanism, but all of them were using the variable as a true boolean. This patch does not change semantics of any command intentionally. Signed-off-by: Stefan Beller <stefanbeller@googlemail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-08-03 13:51:19 +02:00			`OPT_BOOL(0, "unreachable", &show_unreachable, N_("show unreachable objects")),`
i18n: fsck: mark parseopt strings for translation Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-08-20 14:32:13 +02:00			`OPT_BOOL(0, "dangling", &show_dangling, N_("show dangling objects")),`
Replace deprecated OPT_BOOLEAN by OPT_BOOL This task emerged from b04ba2bb (parse-options: deprecate OPT_BOOLEAN, 2011-09-27). All occurrences of the respective variables have been reviewed and none of them relied on the counting up mechanism, but all of them were using the variable as a true boolean. This patch does not change semantics of any command intentionally. Signed-off-by: Stefan Beller <stefanbeller@googlemail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-08-03 13:51:19 +02:00			`OPT_BOOL(0, "tags", &show_tags, N_("report tags")),`
			`OPT_BOOL(0, "root", &show_root, N_("report root nodes")),`
			`OPT_BOOL(0, "cache", &keep_cache_objects, N_("make index objects head nodes")),`
			`OPT_BOOL(0, "reflogs", &include_reflogs, N_("make reflogs head nodes (default)")),`
			`OPT_BOOL(0, "full", &check_full, N_("also consider packs and alternate objects")),`
fsck: introduce `git fsck --connectivity-only` This option avoids unpacking each and all blob objects, and just verifies the connectivity. In particular with large repositories, this speeds up the operation, at the expense of missing corrupt blobs, ignoring unreachable objects and other fsck issues, if any. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-06-22 17:27:12 +02:00			`OPT_BOOL(0, "connectivity-only", &connectivity_only, N_("check only connectivity")),`
Replace deprecated OPT_BOOLEAN by OPT_BOOL This task emerged from b04ba2bb (parse-options: deprecate OPT_BOOLEAN, 2011-09-27). All occurrences of the respective variables have been reviewed and none of them relied on the counting up mechanism, but all of them were using the variable as a true boolean. This patch does not change semantics of any command intentionally. Signed-off-by: Stefan Beller <stefanbeller@googlemail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-08-03 13:51:19 +02:00			`OPT_BOOL(0, "strict", &check_strict, N_("enable more strict checking")),`
			`OPT_BOOL(0, "lost-found", &write_lost_and_found,`
i18n: fsck: mark parseopt strings for translation Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-08-20 14:32:13 +02:00			`N_("write dangling objects in .git/lost-found")),`
			`OPT_BOOL(0, "progress", &show_progress, N_("show progress")),`
fsck: optionally show more helpful info for broken links When reporting broken links between commits/trees/blobs, it would be quite helpful at times if the user would be told how the object is supposed to be reachable. With the new --name-objects option, git-fsck will try to do exactly that: name the objects in a way that shows how they are reachable. For example, when some reflog got corrupted and a blob is missing that should not be, the user might want to remove the corresponding reflog entry. This option helps them find that entry: `git fsck` will now report something like this: broken link from tree b5eb6ff... (refs/stash@{<date>}~37:) to blob ec5cf80... Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-07-17 13:00:02 +02:00			`OPT_BOOL(0, "name-objects", &name_objects, N_("show verbose names for reachable objects")),`
Make builtin-fsck.c use parse_options. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-10-15 22:34:05 +02:00			`OPT_END(),`
			`};`
fsck: exit with non-zero status upon errors git-fsck always exited with status 0, which was a bit sloppy. This makes it exit with a non-zero status when errors are found. The error code is an OR'ed result of: 1 if corrupted objects are found. 2 if objects that are ought to be reachable are missing or corrupt. For example, it would exit with 1 in a repository with an unreachable corrupt object. If a tree object of the HEAD commit is corrupt, you would get 3. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-05 09:22:06 +01:00
Make every builtin-.c file #include "builtin.h" Make every builtin-.c file #include "builtin.h". Also takes care of some declaration/definition mismatches. Signed-off-by: Peter Hagervall <hager@cs.umu.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-07-15 01:14:45 +02:00			`int cmd_fsck(int argc, const char *argv, const char prefix)`
Add first cut at "fsck-cache" that validates the SHA1 object store. It doesn't complain about mine. But it also doesn't yet check for inter-object reachability etc. 2005-04-09 00:02:42 +02:00			`{`
Make 'fsck' able to take an arbitrary number of parents on the command line. "arbitrary" is a bit wrong, since it is limited by the argument size limit (128kB or so), but let's see if anybody ever cares. Arguably you should prune your tree before you have a few thousand dangling heads in your archive. We can fix it by passing in a file listing if we ever care. 2005-04-14 01:42:09 +02:00			`int i, heads;`
fsck: check loose objects from alternate object stores by default "git fsck" used to validate only loose objects that are local and nothing else by default. This is not just too little when a repository is borrowing objects from other object stores, but also caused the connectivity check to mistakenly declare loose objects borrowed from them to be missing. The rationale behind the default mode that validates only loose objects is because these objects are still young and more unlikely to have been pushed to other repositories yet. That holds for loose objects borrowed from alternate object stores as well. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-01-30 09:50:54 +01:00			`struct alternate_object_database *alt;`
Add first cut at "fsck-cache" that validates the SHA1 object store. It doesn't complain about mine. But it also doesn't yet check for inter-object reachability etc. 2005-04-09 00:02:42 +02:00
fsck: exit with non-zero status upon errors git-fsck always exited with status 0, which was a bit sloppy. This makes it exit with a non-zero status when errors are found. The error code is an OR'ed result of: 1 if corrupted objects are found. 2 if objects that are ought to be reachable are missing or corrupt. For example, it would exit with 1 in a repository with an unreachable corrupt object. If a tree object of the HEAD commit is corrupt, you would get 3. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-05 09:22:06 +01:00			`errors_found = 0;`
rename read_replace_refs to check_replace_refs The semantics of this flag was changed in commit e1111cef23 inline lookup_replace_object() calls but wasn't renamed at the time to minimize code churn. Rename it now, and add a comment explaining its use. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-02-18 12:24:55 +01:00			`check_replace_refs = 0;`
fsck-objects: work from subdirectory. Not much point making it work from subdirectory, but for a consistency make it so. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-11-26 08:52:04 +01:00
parse-opts: prepare for OPT_FILENAME To give OPT_FILENAME the prefix, we pass the prefix to parse_options() which passes the prefix to parse_options_start() which sets the prefix member of parse_opts_ctx accordingly. If there isn't a prefix in the calling context, passing NULL will suffice. Signed-off-by: Stephen Boyd <bebarino@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-05-23 20:53:12 +02:00			`argc = parse_options(argc, argv, prefix, fsck_opts, fsck_usage, 0);`
fsck: print progress fsck is usually a long process and it would be nice if it prints progress from time to time. Progress meter is not printed when --verbose is given because --verbose prints a lot, there's no need for "alive" indicator. Progress meter may provide "% complete" information but it would be lost anyway in the flood of text. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-11-07 03:59:26 +01:00
fsck: introduce fsck options Just like the diff machinery, we are about to introduce more settings, therefore it makes sense to carry them around as a (pointer to a) struct containing all of them. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-06-22 17:25:00 +02:00			`fsck_walk_options.walk = mark_object;`
			`fsck_obj_options.walk = mark_used;`
			`fsck_obj_options.error_func = fsck_error_func;`
			`if (check_strict)`
			`fsck_obj_options.strict = 1;`

fsck: print progress fsck is usually a long process and it would be nice if it prints progress from time to time. Progress meter is not printed when --verbose is given because --verbose prints a lot, there's no need for "alive" indicator. Progress meter may provide "% complete" information but it would be lost anyway in the flood of text. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-11-07 03:59:26 +01:00			`if (show_progress == -1)`
			`show_progress = isatty(2);`
			`if (verbose)`
			`show_progress = 0;`

Make builtin-fsck.c use parse_options. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-10-15 22:34:05 +02:00			`if (write_lost_and_found) {`
			`check_full = 1;`
			`include_reflogs = 0;`
fsck-cache: only show tags if asked to do so with "--tags" Normally we don't care, we just check them for being valid tag objects. 2005-04-26 01:31:13 +02:00			`}`

fsck: optionally show more helpful info for broken links When reporting broken links between commits/trees/blobs, it would be quite helpful at times if the user would be told how the object is supposed to be reachable. With the new --name-objects option, git-fsck will try to do exactly that: name the objects in a way that shows how they are reachable. For example, when some reflog got corrupted and a blob is missing that should not be, the user might want to remove the corresponding reflog entry. This option helps them find that entry: `git fsck` will now report something like this: broken link from tree b5eb6ff... (refs/stash@{<date>}~37:) to blob ec5cf80... Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-07-17 13:00:02 +02:00			`if (name_objects)`
			`fsck_walk_options.object_names =`
			`xcalloc(1, sizeof(struct decoration));`

fsck: support demoting errors to warnings We already have support in `git receive-pack` to deal with some legacy repositories which have non-fatal issues. Let's make `git fsck` itself useful with such repositories, too, by allowing users to ignore known issues, or at least demote those issues to mere warnings. Example: `git -c fsck.missingEmail=ignore fsck` would hide problems with missing emails in author, committer and tagger lines. In the same spirit that `git receive-pack`'s usage of the fsck machinery differs from `git fsck`'s – some of the non-fatal warnings in `git fsck` are fatal with `git receive-pack` when receive.fsckObjects = true, for example – we strictly separate the fsck.<msg-id> from the receive.fsck.<msg-id> settings. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-06-22 17:27:06 +02:00			`git_config(fsck_config, NULL);`

Make git-fsck-cache check HEAD integrity In particular, check that it's a symlink, and points to refs/heads/. We depend on that these days not only for "git checkout", but also because fsck and others only check for references in the .git/refs/ subdirectory, not things like HEAD itself. 2005-07-03 19:40:38 +02:00			`fsck_head_link();`
fsck: prepare dummy objects for --connectivity-check Normally fsck makes a pass over all objects to check their integrity, and then follows up with a reachability check to make sure we have all of the referenced objects (and to know which ones are dangling). The latter checks for the HAS_OBJ flag in obj->flags to see if we found the object in the first pass. Commit 02976bf85 (fsck: introduce `git fsck --connectivity-only`, 2015-06-22) taught fsck to skip the initial pass, and to fallback to has_sha1_file() instead of the HAS_OBJ check. However, it converted only one HAS_OBJ check to use has_sha1_file(). But there are many other places in builtin/fsck.c that assume that the flag is set (or that lookup_object() will return an object at all). This leads to several bugs with --connectivity-only: 1. mark_object() will not queue objects for examination, so recursively following links from commits to trees, etc, did nothing. I.e., we were checking the reachability of hardly anything at all. 2. When a set of heads is given on the command-line, we use lookup_object() to see if they exist. But without the initial pass, we assume nothing exists. 3. When loading reflog entries, we do a similar lookup_object() check, and complain that the reflog is broken if the object doesn't exist in our hash. So in short, --connectivity-only is broken pretty badly, and will claim that your repository is fine when it's not. Presumably nobody noticed for a few reasons. One is that the embedded test does not actually test the recursive nature of the reachability check. All of the missing objects are still in the index, and we directly check items from the index. This patch modifies the test to delete the index, which shows off breakage (1). Another is that --connectivity-only just skips the initial pass for loose objects. So on a real repository, the packed objects were still checked correctly. But on the flipside, it means that "git fsck --connectivity-only" still checks the sha1 of all of the packed objects, nullifying its original purpose of being a faster git-fsck. And of course the final problem is that the bug only shows up when there _is_ corruption, which is rare. So anybody running "git fsck --connectivity-only" proactively would assume it was being thorough, when it was not. One possibility for fixing this is to find all of the spots that rely on HAS_OBJ and tweak them for the connectivity-only case. But besides the risk that we might miss a spot (and I found three already, corresponding to the three bugs above), there are other parts of fsck that _can't_ work without a full list of objects. E.g., the list of dangling objects. Instead, let's make the connectivity-only case look more like the normal case. Rather than skip the initial pass completely, we'll do an abbreviated one that sets up the HAS_OBJ flag for each object, without actually loading the object data. That's simple and fast, and we don't have to care about the connectivity_only flag in the rest of the code at all. While we're at it, let's make sure we treat loose and packed objects the same (i.e., setting up dummy objects for both and skipping the actual sha1 check). That makes the connectivity-only check actually fast on a real repo (40 seconds versus 180 seconds on my copy of linux.git). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-01-17 22:32:57 +01:00			`if (connectivity_only) {`
			`for_each_loose_object(mark_loose_for_connectivity, NULL, 0);`
			`for_each_packed_object(mark_packed_for_connectivity, NULL, 0);`
			`} else {`
fsck: introduce `git fsck --connectivity-only` This option avoids unpacking each and all blob objects, and just verifies the connectivity. In particular with large repositories, this speeds up the operation, at the expense of missing corrupt blobs, ignoring unreachable objects and other fsck issues, if any. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-06-22 17:27:12 +02:00			`fsck_object_dir(get_object_directory());`
fsck: check loose objects from alternate object stores by default "git fsck" used to validate only loose objects that are local and nothing else by default. This is not just too little when a repository is borrowing objects from other object stores, but also caused the connectivity check to mistakenly declare loose objects borrowed from them to be missing. The rationale behind the default mode that validates only loose objects is because these objects are still young and more unlikely to have been pushed to other repositories yet. That holds for loose objects borrowed from alternate object stores as well. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-01-30 09:50:54 +01:00
fsck: don't fsck alternates for connectivity-only check Commit 02976bf (fsck: introduce `git fsck --connectivity-only`, 2015-06-22) recently gave fsck an option to perform only a subset of the checks, by skipping the fsck_object_dir() call. However, it does so only for the local object directory, and we still do expensive checks on any alternate repos. We should skip them in this case, too. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-09-24 23:05:30 +02:00			`prepare_alt_odb();`
alternates: use a separate scratch space The alternate_object_database struct uses a single buffer both for storing the path to the alternate, and as a scratch buffer for forming object names. This is efficient (since otherwise we'd end up storing the path twice), but it makes life hard for callers who just want to know the path to the alternate. They have to remember to stop reading after "alt->name - alt->base" bytes, and to subtract one for the trailing '/'. It would be much simpler if they could simply access a NUL-terminated path string. We could encapsulate this in a function which puts a NUL in the scratch buffer and returns the string, but that opens up questions about the lifetime of the result. The first time another caller uses the alternate, the scratch buffer may get other data tacked onto it. Let's instead just store the root path separately from the scratch buffer. There aren't enough alternates being stored for the duplicated data to matter for performance, and this keeps things simple and safe for the callers. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-10-03 22:35:51 +02:00			`for (alt = alt_odb_list; alt; alt = alt->next)`
			`fsck_object_dir(alt->path);`
fsck: check loose objects from alternate object stores by default "git fsck" used to validate only loose objects that are local and nothing else by default. This is not just too little when a repository is borrowing objects from other object stores, but also caused the connectivity check to mistakenly declare loose objects borrowed from them to be missing. The rationale behind the default mode that validates only loose objects is because these objects are still young and more unlikely to have been pushed to other repositories yet. That holds for loose objects borrowed from alternate object stores as well. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-01-30 09:50:54 +01:00
fsck: prepare dummy objects for --connectivity-check Normally fsck makes a pass over all objects to check their integrity, and then follows up with a reachability check to make sure we have all of the referenced objects (and to know which ones are dangling). The latter checks for the HAS_OBJ flag in obj->flags to see if we found the object in the first pass. Commit 02976bf85 (fsck: introduce `git fsck --connectivity-only`, 2015-06-22) taught fsck to skip the initial pass, and to fallback to has_sha1_file() instead of the HAS_OBJ check. However, it converted only one HAS_OBJ check to use has_sha1_file(). But there are many other places in builtin/fsck.c that assume that the flag is set (or that lookup_object() will return an object at all). This leads to several bugs with --connectivity-only: 1. mark_object() will not queue objects for examination, so recursively following links from commits to trees, etc, did nothing. I.e., we were checking the reachability of hardly anything at all. 2. When a set of heads is given on the command-line, we use lookup_object() to see if they exist. But without the initial pass, we assume nothing exists. 3. When loading reflog entries, we do a similar lookup_object() check, and complain that the reflog is broken if the object doesn't exist in our hash. So in short, --connectivity-only is broken pretty badly, and will claim that your repository is fine when it's not. Presumably nobody noticed for a few reasons. One is that the embedded test does not actually test the recursive nature of the reachability check. All of the missing objects are still in the index, and we directly check items from the index. This patch modifies the test to delete the index, which shows off breakage (1). Another is that --connectivity-only just skips the initial pass for loose objects. So on a real repository, the packed objects were still checked correctly. But on the flipside, it means that "git fsck --connectivity-only" still checks the sha1 of all of the packed objects, nullifying its original purpose of being a faster git-fsck. And of course the final problem is that the bug only shows up when there _is_ corruption, which is rare. So anybody running "git fsck --connectivity-only" proactively would assume it was being thorough, when it was not. One possibility for fixing this is to find all of the spots that rely on HAS_OBJ and tweak them for the connectivity-only case. But besides the risk that we might miss a spot (and I found three already, corresponding to the three bugs above), there are other parts of fsck that _can't_ work without a full list of objects. E.g., the list of dangling objects. Instead, let's make the connectivity-only case look more like the normal case. Rather than skip the initial pass completely, we'll do an abbreviated one that sets up the HAS_OBJ flag for each object, without actually loading the object data. That's simple and fast, and we don't have to care about the connectivity_only flag in the rest of the code at all. While we're at it, let's make sure we treat loose and packed objects the same (i.e., setting up dummy objects for both and skipping the actual sha1 check). That makes the connectivity-only check actually fast on a real repo (40 seconds versus 180 seconds on my copy of linux.git). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-01-17 22:32:57 +01:00			`if (check_full) {`
			`struct packed_git *p;`
			`uint32_t total = 0, count = 0;`
			`struct progress *progress = NULL;`
fsck: check loose objects from alternate object stores by default "git fsck" used to validate only loose objects that are local and nothing else by default. This is not just too little when a repository is borrowing objects from other object stores, but also caused the connectivity check to mistakenly declare loose objects borrowed from them to be missing. The rationale behind the default mode that validates only loose objects is because these objects are still young and more unlikely to have been pushed to other repositories yet. That holds for loose objects borrowed from alternate object stores as well. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-01-30 09:50:54 +01:00
fsck: prepare dummy objects for --connectivity-check Normally fsck makes a pass over all objects to check their integrity, and then follows up with a reachability check to make sure we have all of the referenced objects (and to know which ones are dangling). The latter checks for the HAS_OBJ flag in obj->flags to see if we found the object in the first pass. Commit 02976bf85 (fsck: introduce `git fsck --connectivity-only`, 2015-06-22) taught fsck to skip the initial pass, and to fallback to has_sha1_file() instead of the HAS_OBJ check. However, it converted only one HAS_OBJ check to use has_sha1_file(). But there are many other places in builtin/fsck.c that assume that the flag is set (or that lookup_object() will return an object at all). This leads to several bugs with --connectivity-only: 1. mark_object() will not queue objects for examination, so recursively following links from commits to trees, etc, did nothing. I.e., we were checking the reachability of hardly anything at all. 2. When a set of heads is given on the command-line, we use lookup_object() to see if they exist. But without the initial pass, we assume nothing exists. 3. When loading reflog entries, we do a similar lookup_object() check, and complain that the reflog is broken if the object doesn't exist in our hash. So in short, --connectivity-only is broken pretty badly, and will claim that your repository is fine when it's not. Presumably nobody noticed for a few reasons. One is that the embedded test does not actually test the recursive nature of the reachability check. All of the missing objects are still in the index, and we directly check items from the index. This patch modifies the test to delete the index, which shows off breakage (1). Another is that --connectivity-only just skips the initial pass for loose objects. So on a real repository, the packed objects were still checked correctly. But on the flipside, it means that "git fsck --connectivity-only" still checks the sha1 of all of the packed objects, nullifying its original purpose of being a faster git-fsck. And of course the final problem is that the bug only shows up when there _is_ corruption, which is rare. So anybody running "git fsck --connectivity-only" proactively would assume it was being thorough, when it was not. One possibility for fixing this is to find all of the spots that rely on HAS_OBJ and tweak them for the connectivity-only case. But besides the risk that we might miss a spot (and I found three already, corresponding to the three bugs above), there are other parts of fsck that _can't_ work without a full list of objects. E.g., the list of dangling objects. Instead, let's make the connectivity-only case look more like the normal case. Rather than skip the initial pass completely, we'll do an abbreviated one that sets up the HAS_OBJ flag for each object, without actually loading the object data. That's simple and fast, and we don't have to care about the connectivity_only flag in the rest of the code at all. While we're at it, let's make sure we treat loose and packed objects the same (i.e., setting up dummy objects for both and skipping the actual sha1 check). That makes the connectivity-only check actually fast on a real repo (40 seconds versus 180 seconds on my copy of linux.git). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-01-17 22:32:57 +01:00			`prepare_packed_git();`
fsck: print progress fsck is usually a long process and it would be nice if it prints progress from time to time. Progress meter is not printed when --verbose is given because --verbose prints a lot, there's no need for "alive" indicator. Progress meter may provide "% complete" information but it would be lost anyway in the flood of text. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-11-07 03:59:26 +01:00
fsck: prepare dummy objects for --connectivity-check Normally fsck makes a pass over all objects to check their integrity, and then follows up with a reachability check to make sure we have all of the referenced objects (and to know which ones are dangling). The latter checks for the HAS_OBJ flag in obj->flags to see if we found the object in the first pass. Commit 02976bf85 (fsck: introduce `git fsck --connectivity-only`, 2015-06-22) taught fsck to skip the initial pass, and to fallback to has_sha1_file() instead of the HAS_OBJ check. However, it converted only one HAS_OBJ check to use has_sha1_file(). But there are many other places in builtin/fsck.c that assume that the flag is set (or that lookup_object() will return an object at all). This leads to several bugs with --connectivity-only: 1. mark_object() will not queue objects for examination, so recursively following links from commits to trees, etc, did nothing. I.e., we were checking the reachability of hardly anything at all. 2. When a set of heads is given on the command-line, we use lookup_object() to see if they exist. But without the initial pass, we assume nothing exists. 3. When loading reflog entries, we do a similar lookup_object() check, and complain that the reflog is broken if the object doesn't exist in our hash. So in short, --connectivity-only is broken pretty badly, and will claim that your repository is fine when it's not. Presumably nobody noticed for a few reasons. One is that the embedded test does not actually test the recursive nature of the reachability check. All of the missing objects are still in the index, and we directly check items from the index. This patch modifies the test to delete the index, which shows off breakage (1). Another is that --connectivity-only just skips the initial pass for loose objects. So on a real repository, the packed objects were still checked correctly. But on the flipside, it means that "git fsck --connectivity-only" still checks the sha1 of all of the packed objects, nullifying its original purpose of being a faster git-fsck. And of course the final problem is that the bug only shows up when there _is_ corruption, which is rare. So anybody running "git fsck --connectivity-only" proactively would assume it was being thorough, when it was not. One possibility for fixing this is to find all of the spots that rely on HAS_OBJ and tweak them for the connectivity-only case. But besides the risk that we might miss a spot (and I found three already, corresponding to the three bugs above), there are other parts of fsck that _can't_ work without a full list of objects. E.g., the list of dangling objects. Instead, let's make the connectivity-only case look more like the normal case. Rather than skip the initial pass completely, we'll do an abbreviated one that sets up the HAS_OBJ flag for each object, without actually loading the object data. That's simple and fast, and we don't have to care about the connectivity_only flag in the rest of the code at all. While we're at it, let's make sure we treat loose and packed objects the same (i.e., setting up dummy objects for both and skipping the actual sha1 check). That makes the connectivity-only check actually fast on a real repo (40 seconds versus 180 seconds on my copy of linux.git). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-01-17 22:32:57 +01:00			`if (show_progress) {`
			`for (p = packed_git; p; p = p->next) {`
			`if (open_pack_index(p))`
			`continue;`
			`total += p->num_objects;`
			`}`
fsck: print progress fsck is usually a long process and it would be nice if it prints progress from time to time. Progress meter is not printed when --verbose is given because --verbose prints a lot, there's no need for "alive" indicator. Progress meter may provide "% complete" information but it would be lost anyway in the flood of text. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-11-07 03:59:26 +01:00
fsck: prepare dummy objects for --connectivity-check Normally fsck makes a pass over all objects to check their integrity, and then follows up with a reachability check to make sure we have all of the referenced objects (and to know which ones are dangling). The latter checks for the HAS_OBJ flag in obj->flags to see if we found the object in the first pass. Commit 02976bf85 (fsck: introduce `git fsck --connectivity-only`, 2015-06-22) taught fsck to skip the initial pass, and to fallback to has_sha1_file() instead of the HAS_OBJ check. However, it converted only one HAS_OBJ check to use has_sha1_file(). But there are many other places in builtin/fsck.c that assume that the flag is set (or that lookup_object() will return an object at all). This leads to several bugs with --connectivity-only: 1. mark_object() will not queue objects for examination, so recursively following links from commits to trees, etc, did nothing. I.e., we were checking the reachability of hardly anything at all. 2. When a set of heads is given on the command-line, we use lookup_object() to see if they exist. But without the initial pass, we assume nothing exists. 3. When loading reflog entries, we do a similar lookup_object() check, and complain that the reflog is broken if the object doesn't exist in our hash. So in short, --connectivity-only is broken pretty badly, and will claim that your repository is fine when it's not. Presumably nobody noticed for a few reasons. One is that the embedded test does not actually test the recursive nature of the reachability check. All of the missing objects are still in the index, and we directly check items from the index. This patch modifies the test to delete the index, which shows off breakage (1). Another is that --connectivity-only just skips the initial pass for loose objects. So on a real repository, the packed objects were still checked correctly. But on the flipside, it means that "git fsck --connectivity-only" still checks the sha1 of all of the packed objects, nullifying its original purpose of being a faster git-fsck. And of course the final problem is that the bug only shows up when there _is_ corruption, which is rare. So anybody running "git fsck --connectivity-only" proactively would assume it was being thorough, when it was not. One possibility for fixing this is to find all of the spots that rely on HAS_OBJ and tweak them for the connectivity-only case. But besides the risk that we might miss a spot (and I found three already, corresponding to the three bugs above), there are other parts of fsck that _can't_ work without a full list of objects. E.g., the list of dangling objects. Instead, let's make the connectivity-only case look more like the normal case. Rather than skip the initial pass completely, we'll do an abbreviated one that sets up the HAS_OBJ flag for each object, without actually loading the object data. That's simple and fast, and we don't have to care about the connectivity_only flag in the rest of the code at all. While we're at it, let's make sure we treat loose and packed objects the same (i.e., setting up dummy objects for both and skipping the actual sha1 check). That makes the connectivity-only check actually fast on a real repo (40 seconds versus 180 seconds on my copy of linux.git). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-01-17 22:32:57 +01:00			`progress = start_progress(_("Checking objects"), total);`
			`}`
fsck: print progress fsck is usually a long process and it would be nice if it prints progress from time to time. Progress meter is not printed when --verbose is given because --verbose prints a lot, there's no need for "alive" indicator. Progress meter may provide "% complete" information but it would be lost anyway in the flood of text. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-11-07 03:59:26 +01:00			`for (p = packed_git; p; p = p->next) {`
fsck: prepare dummy objects for --connectivity-check Normally fsck makes a pass over all objects to check their integrity, and then follows up with a reachability check to make sure we have all of the referenced objects (and to know which ones are dangling). The latter checks for the HAS_OBJ flag in obj->flags to see if we found the object in the first pass. Commit 02976bf85 (fsck: introduce `git fsck --connectivity-only`, 2015-06-22) taught fsck to skip the initial pass, and to fallback to has_sha1_file() instead of the HAS_OBJ check. However, it converted only one HAS_OBJ check to use has_sha1_file(). But there are many other places in builtin/fsck.c that assume that the flag is set (or that lookup_object() will return an object at all). This leads to several bugs with --connectivity-only: 1. mark_object() will not queue objects for examination, so recursively following links from commits to trees, etc, did nothing. I.e., we were checking the reachability of hardly anything at all. 2. When a set of heads is given on the command-line, we use lookup_object() to see if they exist. But without the initial pass, we assume nothing exists. 3. When loading reflog entries, we do a similar lookup_object() check, and complain that the reflog is broken if the object doesn't exist in our hash. So in short, --connectivity-only is broken pretty badly, and will claim that your repository is fine when it's not. Presumably nobody noticed for a few reasons. One is that the embedded test does not actually test the recursive nature of the reachability check. All of the missing objects are still in the index, and we directly check items from the index. This patch modifies the test to delete the index, which shows off breakage (1). Another is that --connectivity-only just skips the initial pass for loose objects. So on a real repository, the packed objects were still checked correctly. But on the flipside, it means that "git fsck --connectivity-only" still checks the sha1 of all of the packed objects, nullifying its original purpose of being a faster git-fsck. And of course the final problem is that the bug only shows up when there _is_ corruption, which is rare. So anybody running "git fsck --connectivity-only" proactively would assume it was being thorough, when it was not. One possibility for fixing this is to find all of the spots that rely on HAS_OBJ and tweak them for the connectivity-only case. But besides the risk that we might miss a spot (and I found three already, corresponding to the three bugs above), there are other parts of fsck that _can't_ work without a full list of objects. E.g., the list of dangling objects. Instead, let's make the connectivity-only case look more like the normal case. Rather than skip the initial pass completely, we'll do an abbreviated one that sets up the HAS_OBJ flag for each object, without actually loading the object data. That's simple and fast, and we don't have to care about the connectivity_only flag in the rest of the code at all. While we're at it, let's make sure we treat loose and packed objects the same (i.e., setting up dummy objects for both and skipping the actual sha1 check). That makes the connectivity-only check actually fast on a real repo (40 seconds versus 180 seconds on my copy of linux.git). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-01-17 22:32:57 +01:00			`/* verify gives error messages itself */`
			`if (verify_pack(p, fsck_obj_buffer,`
			`progress, count))`
			`errors_found \|= ERROR_PACK;`
			`count += p->num_objects;`
fsck: print progress fsck is usually a long process and it would be nice if it prints progress from time to time. Progress meter is not printed when --verbose is given because --verbose prints a lot, there's no need for "alive" indicator. Progress meter may provide "% complete" information but it would be lost anyway in the flood of text. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-11-07 03:59:26 +01:00			`}`
fsck: prepare dummy objects for --connectivity-check Normally fsck makes a pass over all objects to check their integrity, and then follows up with a reachability check to make sure we have all of the referenced objects (and to know which ones are dangling). The latter checks for the HAS_OBJ flag in obj->flags to see if we found the object in the first pass. Commit 02976bf85 (fsck: introduce `git fsck --connectivity-only`, 2015-06-22) taught fsck to skip the initial pass, and to fallback to has_sha1_file() instead of the HAS_OBJ check. However, it converted only one HAS_OBJ check to use has_sha1_file(). But there are many other places in builtin/fsck.c that assume that the flag is set (or that lookup_object() will return an object at all). This leads to several bugs with --connectivity-only: 1. mark_object() will not queue objects for examination, so recursively following links from commits to trees, etc, did nothing. I.e., we were checking the reachability of hardly anything at all. 2. When a set of heads is given on the command-line, we use lookup_object() to see if they exist. But without the initial pass, we assume nothing exists. 3. When loading reflog entries, we do a similar lookup_object() check, and complain that the reflog is broken if the object doesn't exist in our hash. So in short, --connectivity-only is broken pretty badly, and will claim that your repository is fine when it's not. Presumably nobody noticed for a few reasons. One is that the embedded test does not actually test the recursive nature of the reachability check. All of the missing objects are still in the index, and we directly check items from the index. This patch modifies the test to delete the index, which shows off breakage (1). Another is that --connectivity-only just skips the initial pass for loose objects. So on a real repository, the packed objects were still checked correctly. But on the flipside, it means that "git fsck --connectivity-only" still checks the sha1 of all of the packed objects, nullifying its original purpose of being a faster git-fsck. And of course the final problem is that the bug only shows up when there _is_ corruption, which is rare. So anybody running "git fsck --connectivity-only" proactively would assume it was being thorough, when it was not. One possibility for fixing this is to find all of the spots that rely on HAS_OBJ and tweak them for the connectivity-only case. But besides the risk that we might miss a spot (and I found three already, corresponding to the three bugs above), there are other parts of fsck that _can't_ work without a full list of objects. E.g., the list of dangling objects. Instead, let's make the connectivity-only case look more like the normal case. Rather than skip the initial pass completely, we'll do an abbreviated one that sets up the HAS_OBJ flag for each object, without actually loading the object data. That's simple and fast, and we don't have to care about the connectivity_only flag in the rest of the code at all. While we're at it, let's make sure we treat loose and packed objects the same (i.e., setting up dummy objects for both and skipping the actual sha1 check). That makes the connectivity-only check actually fast on a real repo (40 seconds versus 180 seconds on my copy of linux.git). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-01-17 22:32:57 +01:00			`stop_progress(&progress);`
fsck: print progress fsck is usually a long process and it would be nice if it prints progress from time to time. Progress meter is not printed when --verbose is given because --verbose prints a lot, there's no need for "alive" indicator. Progress meter may provide "% complete" information but it would be lost anyway in the flood of text. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-11-07 03:59:26 +01:00			`}`
Make 'fsck' able to take an arbitrary number of parents on the command line. "arbitrary" is a bit wrong, since it is limited by the argument size limit (128kB or so), but let's see if anybody ever cares. Arguably you should prune your tree before you have a few thousand dangling heads in your archive. We can fix it by passing in a file listing if we ever care. 2005-04-14 01:42:09 +02:00			`}`

			`heads = 0;`
builtin-fsck: fix off by one head count According to the man page, if "git fsck" is passed one or more heads, it should verify connectivity and validity of only objects reachable from the heads it is passed. However, since 5ac0a20 (Make builtin-fsck.c use parse_options., 2007-10-15) the command behaved as if no heads were passed, when given only one argument. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-01-18 04:46:09 +01:00			`for (i = 0; i < argc; i++) {`
War on whitespace This uses "git-apply --whitespace=strip" to fix whitespace errors that have crept in to our source files over time. There are a few files that need to have trailing whitespaces (most notably, test vectors). The results still passes the test, and build result in Documentation/ area is unchanged. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-06-07 09:04:01 +02:00			`const char *arg = argv[i];`
fsck: HEAD is part of refs By default we looked at all refs but not HEAD. The only thing that made fsck not lose sight of commits that are only reachable from a detached HEAD was the reflog for the HEAD. This fixes it, with a new test. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-01-30 09:33:00 +01:00			`unsigned char sha1[20];`
			`if (!get_sha1(arg, sha1)) {`
			`struct object *obj = lookup_object(sha1);`
[PATCH] git-fsck-cache: Gracefully handle non-commit IDs Gracefully handle non-commit IDs instead of segfaulting. Signed-off-by: Jonas Fonseca <fonseca@diku.dk> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-04-30 05:00:40 +02:00
fsck: check HAS_OBJ more consistently There are two spots that call lookup_object() and assume that a non-NULL result means we have the object: 1. When we're checking the objects given to us by the user on the command line. 2. When we're checking if a reflog entry is valid. This generally follows fsck's mental model that we will have looked at and loaded a "struct object" for each object in the repository. But it misses one case: if another object _mentioned_ an object, but we didn't actually parse it or verify that it exists, it will still have a struct. It's not clear if this is a triggerable bug or not. Certainly the later parts of the reachability check need to be careful of this, and do so by checking the HAS_OBJ flag. But both of these steps happen before we start traversing, so probably we won't have followed any links yet. Still, it's easy enough to be defensive here. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-01-16 22:34:57 +01:00			`if (!obj \|\| !(obj->flags & HAS_OBJ)) {`
fsck: tighten error-checks of "git fsck <head>" Instead of checking reachability from the refs, you can ask fsck to check from a particular set of heads. However, the error checking here is quite lax. In particular: 1. It claims lookup_object() will report an error, which is not true. It only does a hash lookup, and the user has no clue that their argument was skipped. 2. When either the name or sha1 cannot be resolved, we continue to exit with a successful error code, even though we didn't check what the user asked us to. This patch fixes both of these cases. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-01-16 22:33:29 +01:00			`error("%s: object missing", sha1_to_hex(sha1));`
			`errors_found \|= ERROR_OBJECT;`
[PATCH] git-fsck-cache: Gracefully handle non-commit IDs Gracefully handle non-commit IDs instead of segfaulting. Signed-off-by: Jonas Fonseca <fonseca@diku.dk> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-04-30 05:00:40 +02:00			`continue;`
fsck: tighten error-checks of "git fsck <head>" Instead of checking reachability from the refs, you can ask fsck to check from a particular set of heads. However, the error checking here is quite lax. In particular: 1. It claims lookup_object() will report an error, which is not true. It only does a hash lookup, and the user has no clue that their argument was skipped. 2. When either the name or sha1 cannot be resolved, we continue to exit with a successful error code, even though we didn't check what the user asked us to. This patch fixes both of these cases. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-01-16 22:33:29 +01:00			`}`
[PATCH] git-fsck-cache: Gracefully handle non-commit IDs Gracefully handle non-commit IDs instead of segfaulting. Signed-off-by: Jonas Fonseca <fonseca@diku.dk> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-04-30 05:00:40 +02:00
[PATCH] Port fsck-cache to use parsing functions This ports fsck-cache to use parsing functions. Note that performance could be improved here by only reading each object once, but this requires somewhat more complicated flow control. Signed-Off-By: Daniel Barkalow <barkalow@iabervon.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-04-18 20:39:48 +02:00			`obj->used = 1;`
fsck: optionally show more helpful info for broken links When reporting broken links between commits/trees/blobs, it would be quite helpful at times if the user would be told how the object is supposed to be reachable. With the new --name-objects option, git-fsck will try to do exactly that: name the objects in a way that shows how they are reachable. For example, when some reflog got corrupted and a blob is missing that should not be, the user might want to remove the corresponding reflog entry. This option helps them find that entry: `git fsck` will now report something like this: broken link from tree b5eb6ff... (refs/stash@{<date>}~37:) to blob ec5cf80... Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-07-17 13:00:02 +02:00			`if (name_objects)`
			`add_decoration(fsck_walk_options.object_names,`
			`obj, xstrdup(arg));`
builtin-fsck: move away from object-refs to fsck_walk Signed-off-by: Martin Koegler <mkoegler@auto.tuwien.ac.at> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-02-25 22:46:05 +01:00			`mark_object_reachable(obj);`
Make 'fsck' able to take an arbitrary number of parents on the command line. "arbitrary" is a bit wrong, since it is limited by the argument size limit (128kB or so), but let's see if anybody ever cares. Arguably you should prune your tree before you have a few thousand dangling heads in your archive. We can fix it by passing in a file listing if we ever care. 2005-04-14 01:42:09 +02:00			`heads++;`
Make "fsck-cache" use the same revision tracking structure as "rev-tree". This makes things a lot more efficient, and makes it trivial to do things like reachability analysis. Add command line flags to tell what the head is, and whether to warn about unreachable objects. 2005-04-13 18:57:30 +02:00			`continue;`
			`}`
[PATCH] Make the git-fsck-objects diagnostics more useful Actually report what exactly is wrong with the object, instead of an ambiguous 'bad sha1 file' or such. In places where we already do, unify the format and clean the messages up. Signed-off-by: Petr Baudis <pasky@suse.cz> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-09-20 20:56:05 +02:00			`error("invalid parameter: expected sha1, got '%s'", arg);`
fsck: tighten error-checks of "git fsck <head>" Instead of checking reachability from the refs, you can ask fsck to check from a particular set of heads. However, the error checking here is quite lax. In particular: 1. It claims lookup_object() will report an error, which is not true. It only does a hash lookup, and the user has no clue that their argument was skipped. 2. When either the name or sha1 cannot be resolved, we continue to exit with a successful error code, even though we didn't check what the user asked us to. This patch fixes both of these cases. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-01-16 22:33:29 +01:00			`errors_found \|= ERROR_OBJECT;`
Make "fsck-cache" use the same revision tracking structure as "rev-tree". This makes things a lot more efficient, and makes it trivial to do things like reachability analysis. Add command line flags to tell what the head is, and whether to warn about unreachable objects. 2005-04-13 18:57:30 +02:00			`}`

fsck-cache: walk the 'refs' directory if the user doesn't give any explicit references for reachability analysis. We already had that as separate logic in git-prune-script, so this is not a new special case - it's an old special case moved into fsck, making normal usage be much simpler. 2005-05-18 19:16:14 +02:00			`/*`
[PATCH] delta check This adds knowledge of delta objects to fsck-cache and various object parsing code. A new switch to git-fsck-cache is provided to display the maximum delta depth found in a repository. Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-20 22:59:17 +02:00			`* If we've not been given any explicit head information, do the`
fsck-cache: read the default reference information even when not doing reachability analysis. This avoids the dangling head problem, and means that just a plain "git-fsck-cache" with no parameters will DTRT. 2005-05-18 19:19:59 +02:00			`* default ones from .git/refs. We also consider the index file`
			`* in this case (ie this implies --cache).`
fsck-cache: walk the 'refs' directory if the user doesn't give any explicit references for reachability analysis. We already had that as separate logic in git-prune-script, so this is not a new special case - it's an old special case moved into fsck, making normal usage be much simpler. 2005-05-18 19:16:14 +02:00			`*/`
fsck: do not fallback "git fsck <bogus>" to "git fsck" Since fsck tries to continue as much as it can after seeing an error, we still do the reachability check even if some heads we were given on the command-line are bogus. But if _none_ of the heads is is valid, we fallback to checking all refs and the index, which is not what the user asked for at all. Instead of checking "heads", the number of successful heads we got, check "argc" (which we know only has non-options in it, because parse_options removed the others). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-01-16 22:34:21 +01:00			`if (!argc) {`
fsck-cache: walk the 'refs' directory if the user doesn't give any explicit references for reachability analysis. We already had that as separate logic in git-prune-script, so this is not a new special case - it's an old special case moved into fsck, making normal usage be much simpler. 2005-05-18 19:16:14 +02:00			`get_default_heads();`
			`keep_cache_objects = 1;`
			`}`

Git-prune-script loses blobs referenced from an uncommitted cache. (updated from the version posted to GIT mailing list). When a new blob is registered with update-cache, and before the cache is written as a tree and committed, git-fsck-cache will find the blob unreachable. This patch adds a new flag, "--cache" to git-fsck-cache, with which it keeps such blobs from considered "unreachable". The git-prune-script is updated to use this new flag. At the same time it adds .git/refs// to the set of default locations to look for heads, which should be consistent with expectations from Cogito users. Without this fix, "diff-cache -p --cached" after git-prune-script has pruned the blob object will fail mysteriously and git-write-tree would also fail. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-05-04 10:33:33 +02:00			`if (keep_cache_objects) {`
read-cache: force_verify_index_checksum Teach git to skip verification of the SHA1-1 checksum at the end of the index file in verify_hdr() which is called from read_index() unless the "force_verify_index_checksum" global variable is set. Teach fsck to force this verification. The checksum verification is for detecting disk corruption, and for small projects, the time it takes to compute SHA-1 is not that significant, but for gigantic repositories this calculation adds significant time to every command. These effect can be seen using t/perf/p0002-read-cache.sh: Test HEAD~1 HEAD -------------------------------------------------------------------------------------- 0002.1: read_cache/discard_cache 1000 times 0.66(0.44+0.20) 0.30(0.27+0.02) -54.5% Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-04-14 22:32:21 +02:00			`verify_index_checksum = 1;`
Git-prune-script loses blobs referenced from an uncommitted cache. (updated from the version posted to GIT mailing list). When a new blob is registered with update-cache, and before the cache is written as a tree and committed, git-fsck-cache will find the blob unreachable. This patch adds a new flag, "--cache" to git-fsck-cache, with which it keeps such blobs from considered "unreachable". The git-prune-script is updated to use this new flag. At the same time it adds .git/refs// to the set of default locations to look for heads, which should be consistent with expectations from Cogito users. Without this fix, "diff-cache -p --cached" after git-prune-script has pruned the blob object will fail mysteriously and git-write-tree would also fail. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-05-04 10:33:33 +02:00			`read_cache();`
			`for (i = 0; i < active_nr; i++) {`
Teach "fsck" not to follow subproject links Since the subprojects don't necessarily even exist in the current tree, much less in the current git repository (they are totally independent repositories), we do not want to try to follow the chain from one git repository to another through a gitlink. This involves teaching fsck to ignore references to gitlink objects from a tree and from the current index. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-10 06:15:29 +02:00			`unsigned int mode;`
			`struct blob *blob;`
Git-prune-script loses blobs referenced from an uncommitted cache. (updated from the version posted to GIT mailing list). When a new blob is registered with update-cache, and before the cache is written as a tree and committed, git-fsck-cache will find the blob unreachable. This patch adds a new flag, "--cache" to git-fsck-cache, with which it keeps such blobs from considered "unreachable". The git-prune-script is updated to use this new flag. At the same time it adds .git/refs// to the set of default locations to look for heads, which should be consistent with expectations from Cogito users. Without this fix, "diff-cache -p --cached" after git-prune-script has pruned the blob object will fail mysteriously and git-write-tree would also fail. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-05-04 10:33:33 +02:00			`struct object *obj;`
Teach "fsck" not to follow subproject links Since the subprojects don't necessarily even exist in the current tree, much less in the current git repository (they are totally independent repositories), we do not want to try to follow the chain from one git repository to another through a gitlink. This involves teaching fsck to ignore references to gitlink objects from a tree and from the current index. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-10 06:15:29 +02:00
Make on-disk index representation separate from in-core one This converts the index explicitly on read and write to its on-disk format, allowing the in-core format to contain more flags, and be simpler. In particular, the in-core format is now host-endian (as opposed to the on-disk one that is network endian in order to be able to be shared across machines) and as a result we can dispense with all the htonl/ntohl on accesses to the cache_entry fields. This will make it easier to make use of various temporary flags that do not exist in the on-disk format. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 2008-01-15 01:03:17 +01:00			`mode = active_cache[i]->ce_mode;`
rename dirlink to gitlink. Unify naming of plumbing dirlink/gitlink concept: git ls-files -z '*.[ch]' \| xargs -0 perl -pi -e 's/dirlink/gitlink/g;' -e 's/DIRLNK/GITLINK/g;' Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-05-21 22:08:28 +02:00			`if (S_ISGITLINK(mode))`
Teach "fsck" not to follow subproject links Since the subprojects don't necessarily even exist in the current tree, much less in the current git repository (they are totally independent repositories), we do not want to try to follow the chain from one git repository to another through a gitlink. This involves teaching fsck to ignore references to gitlink objects from a tree and from the current index. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-10 06:15:29 +02:00			`continue;`
Convert lookup_blob to struct object_id Convert lookup_blob to take a pointer to struct object_id. The commit was created with manual changes to blob.c and blob.h, plus the following semantic patch: @@ expression E1; @@ - lookup_blob(E1.hash) + lookup_blob(&E1) @@ expression E1; @@ - lookup_blob(E1->hash) + lookup_blob(E1) Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-05-07 00:10:14 +02:00			`blob = lookup_blob(&active_cache[i]->oid);`
Git-prune-script loses blobs referenced from an uncommitted cache. (updated from the version posted to GIT mailing list). When a new blob is registered with update-cache, and before the cache is written as a tree and committed, git-fsck-cache will find the blob unreachable. This patch adds a new flag, "--cache" to git-fsck-cache, with which it keeps such blobs from considered "unreachable". The git-prune-script is updated to use this new flag. At the same time it adds .git/refs// to the set of default locations to look for heads, which should be consistent with expectations from Cogito users. Without this fix, "diff-cache -p --cached" after git-prune-script has pruned the blob object will fail mysteriously and git-write-tree would also fail. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-05-04 10:33:33 +02:00			`if (!blob)`
			`continue;`
			`obj = &blob->object;`
			`obj->used = 1;`
fsck: optionally show more helpful info for broken links When reporting broken links between commits/trees/blobs, it would be quite helpful at times if the user would be told how the object is supposed to be reachable. With the new --name-objects option, git-fsck will try to do exactly that: name the objects in a way that shows how they are reachable. For example, when some reflog got corrupted and a blob is missing that should not be, the user might want to remove the corresponding reflog entry. This option helps them find that entry: `git fsck` will now report something like this: broken link from tree b5eb6ff... (refs/stash@{<date>}~37:) to blob ec5cf80... Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-07-17 13:00:02 +02:00			`if (name_objects)`
			`add_decoration(fsck_walk_options.object_names,`
			`obj,`
			`xstrfmt(":%s", active_cache[i]->name));`
builtin-fsck: move away from object-refs to fsck_walk Signed-off-by: Martin Koegler <mkoegler@auto.tuwien.ac.at> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-02-25 22:46:05 +01:00			`mark_object_reachable(obj);`
Git-prune-script loses blobs referenced from an uncommitted cache. (updated from the version posted to GIT mailing list). When a new blob is registered with update-cache, and before the cache is written as a tree and committed, git-fsck-cache will find the blob unreachable. This patch adds a new flag, "--cache" to git-fsck-cache, with which it keeps such blobs from considered "unreachable". The git-prune-script is updated to use this new flag. At the same time it adds .git/refs// to the set of default locations to look for heads, which should be consistent with expectations from Cogito users. Without this fix, "diff-cache -p --cached" after git-prune-script has pruned the blob object will fail mysteriously and git-write-tree would also fail. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-05-04 10:33:33 +02:00			`}`
Teach fsck-objects about cache-tree. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-26 01:37:08 +02:00			`if (active_cache_tree)`
			`fsck_cache_tree(active_cache_tree);`
Git-prune-script loses blobs referenced from an uncommitted cache. (updated from the version posted to GIT mailing list). When a new blob is registered with update-cache, and before the cache is written as a tree and committed, git-fsck-cache will find the blob unreachable. This patch adds a new flag, "--cache" to git-fsck-cache, with which it keeps such blobs from considered "unreachable". The git-prune-script is updated to use this new flag. At the same time it adds .git/refs// to the set of default locations to look for heads, which should be consistent with expectations from Cogito users. Without this fix, "diff-cache -p --cached" after git-prune-script has pruned the blob object will fail mysteriously and git-write-tree would also fail. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-05-04 10:33:33 +02:00			`}`

Add connectivity tracking to fsck. This shows that I've lost track of one commit already. Most likely because I forgot to update the .dircache/HEAD file when doing a commit, so that the next commit referenced not the top-of-tree, but the one older commit. Having dangling commits is fine (in fact, you should always have at least _one_ dangling commit in the top-of-tree). But it's good to know about them. 2005-04-11 08:13:09 +02:00			`check_connectivity();`
fsck: exit with non-zero status upon errors git-fsck always exited with status 0, which was a bit sloppy. This makes it exit with a non-zero status when errors are found. The error code is an OR'ed result of: 1 if corrupted objects are found. 2 if objects that are ought to be reachable are missing or corrupt. For example, it would exit with 1 in a repository with an unreachable corrupt object. If a tree object of the HEAD commit is corrupt, you would get 3. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-05 09:22:06 +01:00			`return errors_found;`
Add first cut at "fsck-cache" that validates the SHA1 object store. It doesn't complain about mine. But it also doesn't yet check for inter-object reachability etc. 2005-04-09 00:02:42 +02:00			`}`