howto: add article on recovering a corrupted object

This is an asciidoc-ified version of a corruption post-mortem sent to the git list. It complements the existing howto article, since it covers a case where the object couldn't be easily recreated or copied from elsewhere.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-10-25 09:55:02 +02:00

			`Date: Wed, 16 Oct 2013 04:34:01 -0400`
			`From: Jeff King <peff@peff.net>`
			`Subject: pack corruption post-mortem`
			`Abstract: Recovering a corrupted object when no good copy is available.`
			`Content-type: text/asciidoc`

			`How to recover an object from scratch`
			`=====================================`

			`I was recently presented with a repository with a corrupted packfile,`
			`and was asked if the data was recoverable. This post-mortem describes`
			`the steps I took to investigate and fix the problem. I thought others`
			`might find the process interesting, and it might help somebody in the`
			`same situation.`

			`********************************`
			`Note: In this case, no good copy of the repository was available. For`
			`the much easier case where you can get the corrupted object from`
			`elsewhere, see link:recover-corrupted-blob-object.html[this howto].`
			`********************************`

			`I started with an fsck, which found a problem with exactly one object`
			`(I've used $pack and $obj below to keep the output readable, and also`
			`because I'll refer to them later):`

			`-----------`
			`$ git fsck`
			`error: $pack SHA1 checksum mismatch`
			`error: index CRC mismatch for object $obj from $pack at offset 51653873`
			`error: inflate: data stream error (incorrect data check)`
			`error: cannot unpack $obj from $pack at offset 51653873`
			`-----------`

			`The pack checksum failing means a byte is munged somewhere, and it is`
			`presumably in the object mentioned (since both the index checksum and`
			`zlib were failing).`

			`Reading the zlib source code, I found that "incorrect data check" means`
			`that the adler-32 checksum at the end of the zlib data did not match the`
			`inflated data. So stepping the data through zlib would not help, as it`
			`did not fail until the very end, when we realize the checksum does not match.`
			`The problematic bytes could be anywhere in the object data.`
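
To make the "incorrect data check" error concrete, here is a minimal sketch of the Adler-32 checksum that a zlib stream carries in its last four bytes (most significant byte first, per RFC 1950). The function name `adler32_simple` is made up for this illustration; zlib ships its own `adler32()`, which is what real code should call. Because the checksum covers the inflated data as a whole, a mismatch says that something is wrong but not where.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Plain Adler-32: two running sums modulo 65521, combined as
 * (b << 16) | a. Illustration only, not zlib's optimized version.
 */
uint32_t adler32_simple(const unsigned char *buf, size_t len)
{
	uint32_t a = 1, b = 0;
	size_t i;

	for (i = 0; i < len; i++) {
		a = (a + buf[i]) % 65521;	/* sum of all bytes, plus one */
		b = (b + a) % 65521;		/* sum of the running values of a */
	}
	return (b << 16) | a;
}
```

Running this over the inflated output and comparing against the trailing four bytes of the compressed stream would reproduce the check that zlib performs at the very end of inflation.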

			`The first thing I did was pull the broken data out of the packfile. I`
			`needed to know how big the object was, which I found out with:`

			`------------`
			`$ git show-index <$idx | cut -d' ' -f1 | sort -n | grep -A1 51653873`
			`51653873`
			`51664736`
			`------------`

			`Show-index gives us the list of objects and their offsets. We throw away`
			`everything but the offsets, and then sort them so that our interesting`
			`offset (which we got from the fsck output above) is followed immediately`
			`by the offset of the next object. Now we know that the object data is`
			`10863 bytes long, and we can grab it with:`

			`------------`
			`dd if=$pack of=object bs=1 skip=51653873 count=10863`
			`------------`
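
The size calculation above is just the distance between two neighboring offsets. As a small sketch, assuming the offsets have already been sorted numerically (the helper name `object_span` is hypothetical):

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Given pack offsets sorted in ascending order, return how many bytes
 * the object starting at `target` occupies, i.e. the distance to the
 * next offset. Returns 0 if `target` is not found or is the last entry
 * (the last object runs up to the pack trailer, which this sketch does
 * not handle).
 */
uint64_t object_span(const uint64_t *offsets, size_t n, uint64_t target)
{
	size_t i;

	for (i = 0; i + 1 < n; i++) {
		if (offsets[i] == target)
			return offsets[i + 1] - offsets[i];
	}
	return 0;
}
```

With the two offsets from the output above, 51664736 - 51653873 gives the 10863 bytes passed to dd as `count`.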

			`I inspected a hexdump of the data, looking for any obvious bogosity`
			`(e.g., a 4K run of zeroes would be a good sign of filesystem`
			`corruption). But everything looked pretty reasonable.`

			`Note that the "object" file isn't fit for feeding straight to zlib; it`
			`has the git packed object header, which is variable-length. We want to`
			`strip that off so we can start playing with the zlib data directly. You`
			`can either work your way through it manually (the format is described in`
			`link:../technical/pack-format.html[Documentation/technical/pack-format.txt]),`
			`or you can walk through it in a debugger. I did the latter, creating a`
			`valid pack like:`

			`------------`
			`# pack magic and version`
			`printf 'PACK\0\0\0\2' >tmp.pack`
			`# pack has one object`
			`printf '\0\0\0\1' >>tmp.pack`
			`# now add our object data`
			`cat object >>tmp.pack`
			`# and then append the pack trailer`
			`/path/to/git.git/test-sha1 -b <tmp.pack >trailer`
			`cat trailer >>tmp.pack`
			`------------`

			`and then running "git index-pack tmp.pack" in the debugger (stop at`
			`unpack_raw_entry). Doing this, I found that there were 3 bytes of header`
			`(and the header itself had a sane type and size). So I stripped those`
			`off with:`

			`------------`
			`dd if=object of=zlib bs=1 skip=3`
			`------------`
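
For the "work your way through it manually" route, the header can also be decoded in a few lines. This is a sketch based on the layout described in pack-format.txt, assuming a non-delta object (delta entries carry extra base information after this header, which is not handled here). The function name `parse_pack_header` is invented for the example.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Decode the variable-length header in front of a packed object.
 * First byte: bit 7 = "more bytes follow", bits 6-4 = object type,
 * bits 3-0 = lowest four size bits. Each following byte contributes
 * seven more size bits, with bit 7 again acting as continuation flag.
 * Returns the header length in bytes, or 0 if the buffer ends early
 * or the header is implausibly long.
 */
size_t parse_pack_header(const unsigned char *buf, size_t len,
			 unsigned *type, uint64_t *size)
{
	size_t used = 0;
	unsigned shift = 4;
	unsigned char c;

	if (!len)
		return 0;
	c = buf[used++];
	*type = (c >> 4) & 7;
	*size = c & 15;
	while (c & 0x80) {
		if (used >= len || shift > 57)
			return 0;
		c = buf[used++];
		*size |= (uint64_t)(c & 0x7f) << shift;
		shift += 7;
	}
	return used;
}
```

As a made-up example, the bytes `0xb0 0xc4 0x13` decode to type 3 (a blob) with an inflated size of 40000, and the return value of 3 is the number of bytes to skip before the zlib data begins.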

			`I ran the result through zlib's inflate using a custom C program. And`
			`while it did report the error, I did get the right number of output`
			`bytes (i.e., it matched git's size header that we decoded above). But`
			`feeding the result back to "git hash-object" didn't produce the same`
			`sha1. So there were some wrong bytes, but I didn't know which. The file`
			`happened to be C source code, so I hoped I could notice something`
			`obviously wrong with it, but I didn't. I even got it to compile!`

			`I also tried comparing it to other versions of the same path in the`
			`repository, hoping that there would be some part of the diff that didn't`
			`make sense. Unfortunately, this happened to be the only revision of this`
			`particular file in the repository, so I had nothing to compare against.`

			`So I took a different approach. Working under the guess that the`
			`corruption was limited to a single byte, I wrote a program to munge each`
			`byte individually, and try inflating the result. Since the object was`
			`only 10K compressed, that worked out to about 2.5M attempts, which took`
			`a few minutes.`

			`The program I used is here:`

			`----------------------------------------------`
			`#include <stdio.h>`
			`#include <unistd.h>`
			`#include <string.h>`
			`#include <signal.h>`
			`#include <zlib.h>`

			`static int try_zlib(unsigned char *buf, int len)`
			`{`
			`	/* make this absurdly large so we don't have to loop */`
			`	static unsigned char out[1024*1024];`
			`	z_stream z;`
			`	int ret;`

			`	memset(&z, 0, sizeof(z));`
			`	inflateInit(&z);`

			`	z.next_in = buf;`
			`	z.avail_in = len;`
			`	z.next_out = out;`
			`	z.avail_out = sizeof(out);`

			`	ret = inflate(&z, 0);`
			`	inflateEnd(&z);`
			`	return ret >= 0;`
			`}`

			`/* eye candy */`
			`static int counter = 0;`
			`static void progress(int sig)`
			`{`
			`	fprintf(stderr, "\r%d", counter);`
			`	alarm(1);`
			`}`

			`int main(void)`
			`{`
			`	/* oversized so we can read the whole buffer in */`
			`	unsigned char buf[1024*1024];`
			`	int len;`
			`	unsigned i, j;`

			`	signal(SIGALRM, progress);`
			`	alarm(1);`

			`	len = read(0, buf, sizeof(buf));`
			`	for (i = 0; i < len; i++) {`
			`		unsigned char c = buf[i];`
			`		for (j = 0; j <= 0xff; j++) {`
			`			buf[i] = j;`

			`			counter++;`
			`			if (try_zlib(buf, len))`
			`				printf("i=%d, j=%x\n", i, j);`
			`		}`
			`		buf[i] = c;`
			`	}`

			`	alarm(0);`
			`	fprintf(stderr, "\n");`
			`	return 0;`
			`}`
			`----------------------------------------------`

			`I compiled and ran with:`

			`-------`
			`gcc -Wall -Werror -O3 munge.c -o munge -lz`
			`./munge <zlib`
			`-------`


			`There were a few false positives early on (if you write "no data" in the`
			`zlib header, zlib thinks it's just fine :) ). But I got a hit about`
			`halfway through:`

			`-------`
			`i=5642, j=c7`
			`-------`

			`I let it run to completion, and got a few more hits at the end (where it`
			`was munging the checksum to match our broken data). So there was a good`
			`chance this middle hit was the source of the problem.`

			`I confirmed by tweaking the byte in a hex editor, zlib inflating the`
			`result (no errors!), and then piping the output into "git hash-object",`
			`which reported the sha1 of the broken object. Success!`

			`I fixed the packfile itself with:`

			`-------`
			`chmod +w $pack`
			`printf '\xc7' | dd of=$pack bs=1 seek=51659518 conv=notrunc`
			`chmod -w $pack`
			`-------`

			The `\xc7` comes from the replacement byte our "munge" program found.
			`The offset 51659518 is derived by taking the original object offset`
			`(51653873), adding the replacement offset found by "munge" (5642), and`
			`then adding back in the 3 bytes of git header we stripped.`
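
The offset arithmetic and the one-byte overwrite can also be sketched in C. Both helper names (`pack_byte_offset`, `patch_byte`) are hypothetical, and `patch_byte` assumes a stream opened for update (for example with mode "r+b"), so that the file is modified in place rather than truncated, much like dd with conv=notrunc.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Position of a byte within the pack: where the object starts, plus the
 * git object header that was stripped, plus the index into the zlib data.
 */
uint64_t pack_byte_offset(uint64_t object_offset, uint64_t header_len,
			  uint64_t zlib_index)
{
	return object_offset + header_len + zlib_index;
}

/*
 * Overwrite a single byte at `offset` without touching anything else.
 * Returns 0 on success, -1 on failure.
 */
int patch_byte(FILE *f, long offset, unsigned char value)
{
	if (fseek(f, offset, SEEK_SET) != 0)
		return -1;
	if (fputc(value, f) == EOF)
		return -1;
	return fflush(f) == 0 ? 0 : -1;
}
```

Here `pack_byte_offset(51653873, 3, 5642)` yields 51659518, the value used for `seek` above. Working on a copy of the packfile first is the cautious choice with either approach.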

			`After that, "git fsck" ran clean.`

			`As for the corruption itself, I was lucky that it was indeed a single`
			`byte. In fact, it turned out to be a single bit. The byte 0xc7 was`
			`corrupted to 0xc5. So presumably it was caused by faulty hardware, or a`
			`cosmic ray.`
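
The single-bit claim is easy to check by XORing the two byte values and counting the set bits. A minimal sketch (the helper name `bits_differing` is made up for this example):

```c
/* Number of bit positions in which two bytes differ (Hamming distance). */
int bits_differing(unsigned char a, unsigned char b)
{
	unsigned x = a ^ b;
	int n = 0;

	while (x) {
		n += x & 1;
		x >>= 1;
	}
	return n;
}
```

For 0xc7 and 0xc5 the XOR is 0x02, so exactly one bit differs. Had a single bit flip been the working assumption from the start, a search trying only the 8 one-bit variations of each byte, rather than all 256 values, would have been enough to find this hit.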

			`And the aborted attempt to look at the inflated output to see what was`
			`wrong? I could have looked forever and never found it. Here's the diff`
			`between what the corrupted data inflates to, versus the real data:`

			`--------------`
			`- cp = strtok (arg, "+");`
			`+ cp = strtok (arg, ".");`
			`--------------`

			`It tweaked one byte and still ended up as valid, readable C that just`
			`happened to do something totally different! One takeaway is that on a`
			`less unlucky day, looking at the zlib output might have actually been`
			`helpful, as most random changes would actually break the C code.`

			`But more importantly, git's hashing and checksumming noticed a problem`
			`that easily could have gone undetected in another system. The result`
			`still compiled, but would have caused an interesting bug (that would`
			`have been blamed on some random commit).`