mirrors/git

mirror of https://github.com/git/git.git synced 2024-11-01 06:47:52 +01:00

495 lines

12 KiB

C

git-tar-tree: Move code for git-archive --format=tar to archive-tar.c This patch doesn't change any functionality, it only moves code around. It makes seeing the few remaining lines of git-tar-tree code easier. ;-) Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-09-24 17:31:10 +02:00			`/*`
			`* Copyright (c) 2005, 2006 Rene Scharfe`
			`*/`
			`#include "cache.h"`
config: don't include config.h by default Stop including config.h by default in cache.h. Instead only include config.h in those files which require use of the config system. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-06-14 20:07:36 +02:00			`#include "config.h"`
			`#include "tar.h"`
			`#include "archive.h"`
archive-tar: stream large blobs to tar file t5000 verifies output while t1050 makes sure the command always respects core.bigfilethreshold Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-05-03 03:51:04 +02:00			`#include "streaming.h"`
archive: implement configurable tar filters It's common to pipe the tar output produce by "git archive" through gzip or some other compressor. Locally, this can easily be done by using a shell pipe. When requesting a remote archive, though, it cannot be done through the upload-archive interface. This patch allows configurable tar filters, so that one could define a "tar.gz" format that automatically pipes tar output through gzip. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-06-22 03:26:31 +02:00			`#include "run-command.h"`
			`#define RECORDSIZE (512)`
			`#define BLOCKSIZE (RECORDSIZE * 20)`

			`static char block[BLOCKSIZE];`
			`static unsigned long offset;`

Set default "tar" umask to 002 and owner.group to root.root In order to make the generated tar files more friendly to users who extract them as root using GNU tar and its implied -p option, change the default umask to 002 and change the owner name and group name to root. This ensures that a) the extracted files and directories are not world-writable and b) that they belong to user and group root. Before they would have been assigned to a user and/or group named git if it existed. This also answers the question in the removed comment: uid=0, gid=0, uname=root, gname=root is exactly what we want. Normal users who let tar apply their umask while extracting are only affected if their umask allowed the world to change their files (e.g. a umask of zero). This case is so unlikely and strange that we don't need to support it. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-05 23:30:22 +01:00			`static int tar_umask = 002;`
			`static int write_tar_filter_archive(const struct archiver *ar,`
			`struct archiver_args *args);`

archive-tar: write extended headers for file sizes >= 8GB The ustar format has a fixed-length field for the size of each file entry which is supposed to contain up to 11 bytes of octal-formatted data plus a NUL or space terminator. These means that the largest size we can represent is 077777777777, or 1 byte short of 8GB. The correct solution for a larger file, according to POSIX.1-2001, is to add an extended pax header, similar to how we handle long filenames. This patch does that, and writes zero for the size field in the ustar header (the last bit is not mentioned by POSIX, but it matches how GNU tar behaves with --format=pax). This should be a strict improvement over the current behavior, which is to die in xsnprintf with a "BUG". However, there's some interesting history here. Prior to f2f0267 (archive-tar: use xsnprintf for trivial formatting, 2015-09-24), we silently overflowed the "size" field. The extra bytes ended up in the "mtime" field of the header, which was then immediately written itself, overwriting our extra bytes. What that means depends on how many bytes we wrote. If the size was 64GB or greater, then we actually overflowed digits into the mtime field, meaning our value was effectively right-shifted by those lost octal digits. And this patch is again a strict improvement over that. But if the size was between 8GB and 64GB, then our 12-byte field held all of the actual digits, and only our NUL terminator overflowed. According to POSIX, there should be a NUL or space at the end of the field. However, GNU tar seems to be lenient here, and will correctly parse a size up 64GB (minus one) from the field. So sizes in this range might have just worked, depending on the implementation reading the tarfile. This patch is mostly still an improvement there, as the 8GB limit is specifically mentioned in POSIX as the correct limit. But it's possible that it could be a regression (versus the pre-f2f0267 state) if all of the following are true: 1. 
You have a file between 8GB and 64GB. 2. Your tar implementation _doesn't_ know about pax extended headers. 3. Your tar implementation _does_ parse 12-byte sizes from the ustar header without a delimiter. It's probably not worth worrying about such an obscure set of conditions, but I'm documenting it here just in case. Helped-by: René Scharfe <l.s.r@web.de> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-06-30 11:09:16 +02:00			`/*`
			`* This is the max value that a ustar size header can specify, as it is fixed`
			`* at 11 octal digits. POSIX specifies that we switch to extended headers at`
			`* this size.`
archive-tar: write extended headers for far-future mtime The ustar format represents timestamps as seconds since the epoch, but only has room to store 11 octal digits. To express anything larger, we need to use an extended header. This is exactly the same case we fixed for the size field in the previous commit, and the solution here follows the same pattern. This is even mentioned as an issue in f2f0267 (archive-tar: use xsnprintf for trivial formatting, 2015-09-24), but since it only affected things far in the future, it wasn't deemed worth dealing with. But note that my calculations claiming thousands of years were off there; because our xsnprintf produces a NUL byte, we only have until the year 2242 to fix this. Given that this is just around the corner (geologically speaking, anyway), and because it's easy to fix, let's just make it work. Unlike the previous fix for "size", where we had to write an individual extended header for each file, we can write one global header (since we have only one mtime for the whole archive). There's a slight bit of trickiness there. We may already be writing a global header with a "comment" field for the commit sha1. So we need to write our new field into the same header. To do this, we push the decision of whether to write such a header down into write_global_extended_header(), which will now assemble the header as it sees fit, and will return early if we have nothing to write (in practice, we'll only have a large mtime if it comes from a commit, but this makes it also work if you set your system clock ahead such that time() returns a huge value). Note that we don't (and never did) handle negative timestamps (i.e., before 1970). This would probably not be too hard to support in the same way, but since git does not support negative timestamps at all, I didn't bother here. After writing the extended header, we munge the timestamp in the ustar headers to the maximum-allowable size. 
This is wrong, but it's the least-wrong thing we can provide to a tar implementation that doesn't understand pax headers (it's also what GNU tar does). Helped-by: René Scharfe <l.s.r@web.de> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-06-30 11:09:20 +02:00			`*`
			`* Likewise for the mtime (which happens to use a buffer of the same size).`
			`*/`
archive-tar: huge offset and future timestamps would not work on 32-bit As we are not yet moving everything to size_t but still using ulong internally when talking about the size of object, platforms with 32-bit long will not be able to produce tar archive with 4GB+ file, and cannot grok 077777777777UL as a constant. Disable the extended header feature and do not test it on them. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-07-14 22:04:43 +02:00			`#if ULONG_MAX == 0xFFFFFFFF`
			`#define USTAR_MAX_SIZE ULONG_MAX`
			`#else`
archive-tar: fix a sparse 'constant too large' warning Commit dddbad728c ("timestamp_t: a new data type for timestamps", 26-04-2017) introduced a new typedef 'timestamp_t', as a synonym for an unsigned long, which was used at the time to represent timestamps in git. A later commit 28f4aee3fb ("use uintmax_t for timestamps", 26-04-2017) changed the typedef to use an 'uintmax_t' for the timestamp representation type. When building on a 32-bit Linux system, sparse complains that a constant (USTAR_MAX_MTIME) used to detect a 'far-future mtime' timestamp, is too large; 'warning: constant 077777777777UL is so big it is unsigned long long' on lines 335 and 338 of archive-tar.c. Note that both gcc and clang only issue a warning if this constant is used in a context that requires an 'unsigned long' (rather than an uintmax_t). (Since TIME_MAX is no longer equal to 0xFFFFFFFF, even on a 32-bit system, the macro USTAR_MAX_MTIME is set to 077777777777UL, which cannot be represented as an 'unsigned long' constant). In order to suppress the warning, change the definition of the macro constant USTAR_MAX_MTIME to use an 'ULL' type suffix. In a similar vein, on systems which use a 64-bit representation of the 'unsigned long' type, the USTAR_MAX_SIZE constant macro is defined with the value 077777777777ULL. Although this does not cause any warning messages to be issued, it would be more appropriate for this constant to use an 'UL' type suffix rather than 'ULL'. Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-05-08 22:34:58 +02:00			`#define USTAR_MAX_SIZE 077777777777UL`
timestamp_t: a new data type for timestamps Git's source code assumes that unsigned long is at least as precise as time_t. Which is incorrect, and causes a lot of problems, in particular where unsigned long is only 32-bit (notably on Windows, even in 64-bit versions). So let's just use a more appropriate data type instead. In preparation for this, we introduce the new `timestamp_t` data type. By necessity, this is a very, very large patch, as it has to replace all timestamps' data type in one go. As we will use a data type that is not necessarily identical to `time_t`, we need to be very careful to use `time_t` whenever we interact with the system functions, and `timestamp_t` everywhere else. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-04-26 21:29:31 +02:00			`#endif`
			`#if TIME_MAX == 0xFFFFFFFF`
			`#define USTAR_MAX_MTIME TIME_MAX`
			`#else`
			`#define USTAR_MAX_MTIME 077777777777ULL`
			`#endif`
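To make the 11-octal-digit limit concrete, here is a minimal sketch (not part of archive-tar.c) that checks whether a value fits a 12-byte ustar numeric field and formats it as zero-padded octal. The names `DEMO_USTAR_MAX`, `demo_fits_ustar_field` and `demo_format_ustar_field` are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Largest value that fits in 11 octal digits: 8^11 - 1, one byte short of 8 GiB. */
#define DEMO_USTAR_MAX UINT64_C(077777777777)

/* Returns 1 if value can be stored in a 12-byte field (11 digits plus NUL). */
static int demo_fits_ustar_field(uint64_t value)
{
	return value <= DEMO_USTAR_MAX;
}

/*
 * Formats value as 11 zero-padded octal digits followed by NUL.
 * Returns 0 on success, or -1 if the value needs an extended header instead.
 */
static int demo_format_ustar_field(char field[12], uint64_t value)
{
	assert(field != NULL);
	if (!demo_fits_ustar_field(value))
		return -1;
	snprintf(field, 12, "%011llo", (unsigned long long)value);
	return 0;
}
```

A caller that gets -1 back would, under this sketch's assumptions, fall back to a pax extended header record for the value.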
			`/* writes out the whole block, but only if it is full */`
			`static void write_if_needed(void)`
			`{`
			`if (offset == BLOCKSIZE) {`
			`write_or_die(1, block, BLOCKSIZE);`
			`offset = 0;`
			`}`
			`}`

			`/*`
			`* queues up writes, so that all our write(2) calls write exactly one`
			`* full block; pads writes to RECORDSIZE`
			`*/`
			`static void do_write_blocked(const void *data, unsigned long size)`
			`{`
			`const char *buf = data;`

			`if (offset) {`
			`unsigned long chunk = BLOCKSIZE - offset;`
			`if (size < chunk)`
			`chunk = size;`
			`memcpy(block + offset, buf, chunk);`
			`size -= chunk;`
			`offset += chunk;`
			`buf += chunk;`
			`write_if_needed();`
			`}`
			`while (size >= BLOCKSIZE) {`
			`write_or_die(1, buf, BLOCKSIZE);`
			`size -= BLOCKSIZE;`
			`buf += BLOCKSIZE;`
			`}`
			`if (size) {`
			`memcpy(block + offset, buf, size);`
			`offset += size;`
			`}`
			`}`

			`static void finish_record(void)`
			`{`
			`unsigned long tail;`
			`tail = offset % RECORDSIZE;`
			`if (tail) {`
			`memset(block + offset, 0, RECORDSIZE - tail);`
			`offset += RECORDSIZE - tail;`
			`}`
			`write_if_needed();`
			`}`
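The padding arithmetic in finish_record() can be looked at in isolation. The following is a small sketch under the assumption of a 512-byte record, with the hypothetical names `DEMO_RECORDSIZE` and `demo_record_padding`; it only computes how many NUL bytes would be needed and does not touch any buffer.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_RECORDSIZE 512

/* Number of NUL bytes needed to round offset up to the next record boundary. */
static size_t demo_record_padding(size_t offset)
{
	size_t tail = offset % DEMO_RECORDSIZE;
	size_t pad = tail ? DEMO_RECORDSIZE - tail : 0;

	/* After padding, the offset must sit exactly on a record boundary. */
	assert((offset + pad) % DEMO_RECORDSIZE == 0);
	return pad;
}
```

For example, 1000 bytes of data need 24 bytes of padding to reach 1024, while data that already ends on a boundary needs none.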

			`static void write_blocked(const void *data, unsigned long size)`
			`{`
			`do_write_blocked(data, size);`
			`finish_record();`
			`}`

			`/*`
			`* The end of tar archives is marked by 2*512 nul bytes and after that`
			`* follows the rest of the block (if any).`
			`*/`
			`static void write_trailer(void)`
			`{`
			`int tail = BLOCKSIZE - offset;`
			`memset(block + offset, 0, tail);`
			`write_or_die(1, block, BLOCKSIZE);`
			`if (tail < 2 * RECORDSIZE) {`
			`memset(block, 0, offset);`
			`write_or_die(1, block, BLOCKSIZE);`
			`}`
			`}`
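As a rough illustration of the trailer rule, the sketch below computes how many NUL bytes write_trailer() would append for a given buffer offset. It assumes the offset is already record-aligned and smaller than one block, which is the state finish_record() appears to leave behind; `demo_trailer_nul_bytes` and the `DEMO_` constants are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_RECORDSIZE 512
#define DEMO_BLOCKSIZE (DEMO_RECORDSIZE * 20)

/*
 * Returns the number of NUL bytes appended after the last entry:
 * the rest of the current block, plus one more full block when the
 * rest is too short to hold the two 512-byte end-of-archive records.
 */
static size_t demo_trailer_nul_bytes(size_t offset)
{
	size_t tail;

	assert(offset < DEMO_BLOCKSIZE);
	assert(offset % DEMO_RECORDSIZE == 0);

	tail = DEMO_BLOCKSIZE - offset;
	if (tail < 2 * DEMO_RECORDSIZE)
		tail += DEMO_BLOCKSIZE;
	return tail;
}
```

With 19 of the 20 records in a block already used, only 512 bytes remain, so a second all-zero block follows and 10752 NUL bytes are written in total.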

			`/*`
			`* queues up writes, so that all our write(2) calls write exactly one`
			`* full block; pads writes to RECORDSIZE`
			`*/`
			`static int stream_blocked(const unsigned char *sha1)`
			`{`
			`struct git_istream *st;`
			`enum object_type type;`
			`unsigned long sz;`
			`char buf[BLOCKSIZE];`
			`ssize_t readlen;`

			`st = open_istream(sha1, &type, &sz, NULL);`
			`if (!st)`
			`return error("cannot stream blob %s", sha1_to_hex(sha1));`
			`for (;;) {`
			`readlen = read_istream(st, buf, sizeof(buf));`
			`if (readlen <= 0)`
			`break;`
			`do_write_blocked(buf, readlen);`
			`}`
			`close_istream(st);`
			`if (!readlen)`
			`finish_record();`
			`return readlen;`
			`}`

			`/*`
			`* pax extended header records have the format "%u %s=%s\n". %u contains`
			`* the size of the whole string (including the %u), the first %s is the`
			`* keyword, the second one is the value. This function constructs such a`
			`* string and appends it to a struct strbuf.`
			`*/`
			`static void strbuf_append_ext_header(struct strbuf *sb, const char *keyword,`
			`const char *value, unsigned int valuelen)`
			`{`
Simplify strbuf uses in archive-tar.c using strbuf API This is just cleaner way to deal with strbufs, using its API rather than reinventing it in the module (e.g. strbuf_append_string is just the plain strbuf_addstr function, and it was used to perform what strbuf_addch does anyways). Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-06 13:20:06 +02:00			`int len, tmp;`
			`/* "%u %s=%s\n" */`
			`len = 1 + 1 + strlen(keyword) + 1 + valuelen + 1;`
			`for (tmp = 1; len / 10 >= tmp; tmp *= 10)`
			`len++;`

			`strbuf_grow(sb, len);`
			`strbuf_addf(sb, "%u %s=", len, keyword);`
			`strbuf_add(sb, value, valuelen);`
			`strbuf_addch(sb, '\n');`
			`}`
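The self-referential length in a pax record is easy to get wrong, so here is a standalone sketch that uses plain C buffers instead of struct strbuf. The names `demo_pax_record_len` and `demo_pax_record` are hypothetical. The digit-counting loop re-checks the grown length on each step, so totals that cross a power of ten (for example 99 growing to 101) come out right.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Total length of the record "<len> <keyword>=<value>\n", where <len>
 * is the decimal length of the whole record, including its own digits.
 */
static size_t demo_pax_record_len(const char *keyword, size_t valuelen)
{
	/* one length digit, space, keyword, '=', value, newline */
	size_t len = 1 + 1 + strlen(keyword) + 1 + valuelen + 1;
	size_t tmp;

	/* add one byte for each further decimal digit of the final length */
	for (tmp = 1; len / 10 >= tmp; tmp *= 10)
		len++;
	return len;
}

/*
 * Writes the record into out (capacity outsz, including the NUL).
 * Returns the record length, or 0 if the buffer is too small.
 */
static size_t demo_pax_record(char *out, size_t outsz,
			      const char *keyword, const char *value)
{
	size_t len = demo_pax_record_len(keyword, strlen(value));
	int n;

	assert(out != NULL);
	if (len + 1 > outsz)
		return 0;
	n = snprintf(out, outsz, "%zu %s=%s\n", len, keyword, value);
	return (n > 0 && (size_t)n == len) ? len : 0;
}
```

The final comparison of snprintf's return value against the computed length doubles as a consistency check on the length calculation.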

			`/*`
			`* Like strbuf_append_ext_header, but for numeric values.`
			`*/`
			`static void strbuf_append_ext_header_uint(struct strbuf *sb,`
			`const char *keyword,`
			`uintmax_t value)`
			`{`
			`char buf[40]; /* big enough for 2^128 in decimal, plus NUL */`
			`int len;`

			`len = xsnprintf(buf, sizeof(buf), "%"PRIuMAX, value);`
			`strbuf_append_ext_header(sb, keyword, buf, len);`
			`}`

			`static unsigned int ustar_header_chksum(const struct ustar_header *header)`
			`{`
archive: ustar header checksum is computed unsigned POSIX.1 (pax) is pretty clear on this: The chksum field shall be the ISO/IEC 646:1991 standard IRV representation of the octal value of the simple sum of all octets in the header logical record. Each octet in the header shall be treated as an unsigned value. These values shall be added to an unsigned integer, initialized to zero, the precision of which is not less than 17 bits. When calculating the checksum, the chksum field is treated as if it were all <space> characters. so is GNU: http://www.gnu.org/software/tar/manual/html_node/Checksumming.html Found by 7zip folks and reported by Rafał Mużyło. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-06-13 19:42:25 +02:00			`const unsigned char *p = (const unsigned char *)header;`
			`unsigned int chksum = 0;`
			`while (p < (const unsigned char *)header->chksum)`
			`chksum += *p++;`
			`chksum += sizeof(header->chksum) * ' ';`
			`p += sizeof(header->chksum);`
archive: ustar header checksum is computed unsigned POSIX.1 (pax) is pretty clear on this: The chksum field shall be the ISO/IEC 646:1991 standard IRV representation of the octal value of the simple sum of all octets in the header logical record. Each octet in the header shall be treated as an unsigned value. These values shall be added to an unsigned integer, initialized to zero, the precision of which is not less than 17 bits. When calculating the checksum, the chksum field is treated as if it were all <space> characters. so is GNU: http://www.gnu.org/software/tar/manual/html_node/Checksumming.html Found by 7zip folks and reported by Rafał Mużyło. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-06-13 19:42:25 +02:00			`while (p < (const unsigned char *)header + sizeof(struct ustar_header))`
git-tar-tree: Move code for git-archive --format=tar to archive-tar.c This patch doesn't change any functionality, it only moves code around. It makes seeing the few remaining lines of git-tar-tree code easier. ;-) Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-09-24 17:31:10 +02:00			`chksum += *p++;`
			`return chksum;`
			`}`
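The checksum rule quoted in the commit message above (every octet summed as an unsigned value, with the chksum field counted as spaces) can be restated over a raw 512-byte block. This is only a sketch: the name `ustar_block_chksum` and the three `USTAR_*` constants are hypothetical and not part of archive-tar.c; the offset 148 and width 8 are the standard ustar positions of the chksum field.

```c
#include <assert.h>
#include <stddef.h>

#define USTAR_BLOCK_SIZE 512u    /* one ustar header record (assumed name) */
#define USTAR_CHKSUM_OFFSET 148u /* byte offset of the chksum field */
#define USTAR_CHKSUM_SIZE 8u     /* width of the chksum field */

/*
 * Sum all header octets as unsigned values, counting the chksum
 * field itself as eight ASCII spaces regardless of its contents.
 */
static unsigned int ustar_block_chksum(const unsigned char *block)
{
	unsigned int sum = 0;
	size_t i;

	assert(block != NULL);
	for (i = 0; i < USTAR_BLOCK_SIZE; i++) {
		if (i >= USTAR_CHKSUM_OFFSET &&
		    i < USTAR_CHKSUM_OFFSET + USTAR_CHKSUM_SIZE)
			sum += ' ';      /* field treated as all spaces */
		else
			sum += block[i]; /* unsigned char, so 0xFF adds 255 */
	}
	return sum;
}
```

Reading the block through `unsigned char` is what keeps bytes above 0x7F from being added as negative values, which is the bug the 2012 commit addressed.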

archive: centralize archive entry writing Add the exported function write_archive_entries() to archive.c, which uses the new ability of read_tree_recursive() to pass a context pointer to its callback in order to centralize previously duplicated code. The new callback function write_archive_entry() does the work that every archiver backend needs to do: loading file contents, entering subdirectories, handling file attributes, constructing the full path of the entry. All that done, it calls the backend specific write_archive_entry_fn_t function. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-14 21:22:24 +02:00			`static size_t get_path_prefix(const char *path, size_t pathlen, size_t maxlen)`
git-tar-tree: Move code for git-archive --format=tar to archive-tar.c This patch doesn't change any functionality, it only moves code around. It makes seeing the few remaining lines of git-tar-tree code easier. ;-) Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-09-24 17:31:10 +02:00			`{`
archive: centralize archive entry writing Add the exported function write_archive_entries() to archive.c, which uses the new ability of read_tree_recursive() to pass a context pointer to its callback in order to centralize previously duplicated code. The new callback function write_archive_entry() does the work that every archiver backend needs to do: loading file contents, entering subdirectories, handling file attributes, constructing the full path of the entry. All that done, it calls the backend specific write_archive_entry_fn_t function. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-14 21:22:24 +02:00			`size_t i = pathlen;`
archive-tar: split long paths more carefully The name field of a tar header has a size of 100 characters. This limit was extended long ago in a backward compatible way by providing the additional prefix field, which can hold 155 additional characters. The actual path is constructed at extraction time by concatenating the prefix field, a slash and the name field. get_path_prefix() is used to determine which slash in the path is used as the cutting point and thus which part of it is placed into the field prefix and which into the field name. It tries to cram as much into the prefix field as possible. (And only if we can't fit a path into the provided 255 characters we use a pax extended header to store it.) If a path is longer than 100 but shorter than 156 characters and ends with a slash (i.e. is for a directory) then get_path_prefix() puts the whole path in the prefix field and leaves the name field empty. GNU tar reconstructs the path without complaint, but the tar included with NetBSD 6 does not: It reports the header to be invalid. For compatibility with this version of tar, make sure to never leave the name field empty. In order to do that, trim the trailing slash from the part considered as possible prefix, if it exists -- that way the last path component (or more, but not less) will end up in the name field. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-01-05 23:49:54 +01:00			`if (i > 1 && path[i - 1] == '/')`
			`i--;`
git-tar-tree: Move code for git-archive --format=tar to archive-tar.c This patch doesn't change any functionality, it only moves code around. It makes seeing the few remaining lines of git-tar-tree code easier. ;-) Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-09-24 17:31:10 +02:00			`if (i > maxlen)`
			`i = maxlen;`
			`do {`
			`i--;`
archive: centralize archive entry writing Add the exported function write_archive_entries() to archive.c, which uses the new ability of read_tree_recursive() to pass a context pointer to its callback in order to centralize previously duplicated code. The new callback function write_archive_entry() does the work that every archiver backend needs to do: loading file contents, entering subdirectories, handling file attributes, constructing the full path of the entry. All that done, it calls the backend specific write_archive_entry_fn_t function. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-14 21:22:24 +02:00			`} while (i > 0 && path[i] != '/');`
git-tar-tree: Move code for git-archive --format=tar to archive-tar.c This patch doesn't change any functionality, it only moves code around. It makes seeing the few remaining lines of git-tar-tree code easier. ;-) Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-09-24 17:31:10 +02:00			`return i;`
			`}`
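To see how the cut point chosen by get_path_prefix() turns into prefix and name fields, the following sketch repeats the same search in a standalone form and applies it the way write_tar_entry() does. The names `find_split`, `split_path`, and the `USTAR_*` constants are hypothetical; the caller is assumed to pass a `prefix` buffer of at least 156 bytes and a `name` buffer of at least 101 bytes, and the NUL termination is added only for demonstration (the real header fields are not necessarily terminated).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define USTAR_NAME_SIZE 100u   /* width of the name field */
#define USTAR_PREFIX_SIZE 155u /* width of the prefix field */

/* Same search as get_path_prefix(): index of the slash to cut at, or 0. */
static size_t find_split(const char *path, size_t pathlen, size_t maxlen)
{
	size_t i = pathlen;

	assert(path != NULL && pathlen > 0);
	if (i > 1 && path[i - 1] == '/')
		i--;	/* skip a trailing slash so the name is never empty */
	if (i > maxlen)
		i = maxlen;
	do {
		i--;
	} while (i > 0 && path[i] != '/');
	return i;
}

/*
 * Returns 1 if the path fits into prefix + name, 0 if it would need
 * a pax extended header instead.
 */
static int split_path(const char *path, char *prefix, char *name)
{
	size_t pathlen = strlen(path);
	size_t plen, rest;

	prefix[0] = name[0] = '\0';
	if (pathlen <= USTAR_NAME_SIZE) {
		memcpy(name, path, pathlen + 1); /* short path: name only */
		return 1;
	}
	plen = find_split(path, pathlen, USTAR_PREFIX_SIZE);
	rest = pathlen - plen - 1;               /* bytes after the slash */
	if (plen > 0 && rest <= USTAR_NAME_SIZE) {
		memcpy(prefix, path, plen);
		prefix[plen] = '\0';
		memcpy(name, path + plen + 1, rest);
		name[rest] = '\0';
		return 1;
	}
	return 0;
}
```

For a directory path such as `dir/sub/`, `find_split` returns 3, so the prefix would be `dir` and the name `sub/`, which matches the intent described in the 2013 commit of keeping the last path component in the name field.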

archive-tar: turn write_tar_entry into blob-writing only Before this patch write_tar_entry() can: - write global header by write_global_extended_header() calling write_tar_entry with with both sha1 and path == NULL - write extended header for symlinks, by write_tar_entry() calling itself with sha1 != NULL and path == NULL - write a normal blob. In this case both sha1 and path are valid. After this patch, the first two call sites are modified to write the header without calling write_tar_entry(). The function is now for writing blobs only. This simplifies handling when write_tar_entry() learns about large blobs. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-05-03 03:51:01 +02:00			`static void prepare_header(struct archiver_args *args,`
			`struct ustar_header *header,`
			`unsigned int mode, unsigned long size)`
			`{`
archive-tar: use xsnprintf for trivial formatting When we generate tar headers, we sprintf() values directly into a struct with the fixed-size header values. For the most part this is fine, as we are formatting small values (e.g., the octal format of "mode & 0x7777" is of fixed length). But it's still a good idea to use xsnprintf here. It communicates to readers what our expectation is, and it provides a run-time check that we are not overflowing the buffers. The one exception here is the mtime, which comes from the epoch time of the commit we are archiving. For sane values, this fits into the 12-byte value allocated in the header. But since git can handle 64-bit times, if I claim to be a visitor from the year 10,000 AD, I can overflow the buffer. This turns out to be harmless, as we simply overflow into the chksum field, which is then overwritten. This case is also best as an xsnprintf. It should never come up, short of extremely malformed dates, and in that case we are probably better off dying than silently truncating the date value (and we cannot expand the size of the buffer, since it is dictated by the ustar format). Our friends in the year 5138 (when we legitimately flip to a 12-digit epoch) can deal with that problem then. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-09-24 23:06:24 +02:00			`xsnprintf(header->mode, sizeof(header->mode), "%07o", mode & 07777);`
			`xsnprintf(header->size, sizeof(header->size), "%011lo", S_ISREG(mode) ? size : 0);`
			`xsnprintf(header->mtime, sizeof(header->mtime), "%011lo", (unsigned long) args->time);`
archive-tar: turn write_tar_entry into blob-writing only Before this patch write_tar_entry() can: - write global header by write_global_extended_header() calling write_tar_entry with with both sha1 and path == NULL - write extended header for symlinks, by write_tar_entry() calling itself with sha1 != NULL and path == NULL - write a normal blob. In this case both sha1 and path are valid. After this patch, the first two call sites are modified to write the header without calling write_tar_entry(). The function is now for writing blobs only. This simplifies handling when write_tar_entry() learns about large blobs. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-05-03 03:51:01 +02:00
archive-tar: use xsnprintf for trivial formatting When we generate tar headers, we sprintf() values directly into a struct with the fixed-size header values. For the most part this is fine, as we are formatting small values (e.g., the octal format of "mode & 0x7777" is of fixed length). But it's still a good idea to use xsnprintf here. It communicates to readers what our expectation is, and it provides a run-time check that we are not overflowing the buffers. The one exception here is the mtime, which comes from the epoch time of the commit we are archiving. For sane values, this fits into the 12-byte value allocated in the header. But since git can handle 64-bit times, if I claim to be a visitor from the year 10,000 AD, I can overflow the buffer. This turns out to be harmless, as we simply overflow into the chksum field, which is then overwritten. This case is also best as an xsnprintf. It should never come up, short of extremely malformed dates, and in that case we are probably better off dying than silently truncating the date value (and we cannot expand the size of the buffer, since it is dictated by the ustar format). Our friends in the year 5138 (when we legitimately flip to a 12-digit epoch) can deal with that problem then. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-09-24 23:06:24 +02:00			`xsnprintf(header->uid, sizeof(header->uid), "%07o", 0);`
			`xsnprintf(header->gid, sizeof(header->gid), "%07o", 0);`
archive-tar: turn write_tar_entry into blob-writing only Before this patch write_tar_entry() can: - write global header by write_global_extended_header() calling write_tar_entry with with both sha1 and path == NULL - write extended header for symlinks, by write_tar_entry() calling itself with sha1 != NULL and path == NULL - write a normal blob. In this case both sha1 and path are valid. After this patch, the first two call sites are modified to write the header without calling write_tar_entry(). The function is now for writing blobs only. This simplifies handling when write_tar_entry() learns about large blobs. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-05-03 03:51:01 +02:00			`strlcpy(header->uname, "root", sizeof(header->uname));`
			`strlcpy(header->gname, "root", sizeof(header->gname));`
archive-tar: use xsnprintf for trivial formatting When we generate tar headers, we sprintf() values directly into a struct with the fixed-size header values. For the most part this is fine, as we are formatting small values (e.g., the octal format of "mode & 0x7777" is of fixed length). But it's still a good idea to use xsnprintf here. It communicates to readers what our expectation is, and it provides a run-time check that we are not overflowing the buffers. The one exception here is the mtime, which comes from the epoch time of the commit we are archiving. For sane values, this fits into the 12-byte value allocated in the header. But since git can handle 64-bit times, if I claim to be a visitor from the year 10,000 AD, I can overflow the buffer. This turns out to be harmless, as we simply overflow into the chksum field, which is then overwritten. This case is also best as an xsnprintf. It should never come up, short of extremely malformed dates, and in that case we are probably better off dying than silently truncating the date value (and we cannot expand the size of the buffer, since it is dictated by the ustar format). Our friends in the year 5138 (when we legitimately flip to a 12-digit epoch) can deal with that problem then. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-09-24 23:06:24 +02:00			`xsnprintf(header->devmajor, sizeof(header->devmajor), "%07o", 0);`
			`xsnprintf(header->devminor, sizeof(header->devminor), "%07o", 0);`
archive-tar: turn write_tar_entry into blob-writing only Before this patch write_tar_entry() can: - write global header by write_global_extended_header() calling write_tar_entry with with both sha1 and path == NULL - write extended header for symlinks, by write_tar_entry() calling itself with sha1 != NULL and path == NULL - write a normal blob. In this case both sha1 and path are valid. After this patch, the first two call sites are modified to write the header without calling write_tar_entry(). The function is now for writing blobs only. This simplifies handling when write_tar_entry() learns about large blobs. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-05-03 03:51:01 +02:00
			`memcpy(header->magic, "ustar", 6);`
			`memcpy(header->version, "00", 2);`

archive-tar: convert snprintf to xsnprintf Commit f2f0267 (archive-tar: use xsnprintf for trivial formatting, 2015-09-24) converted cases of "sprintf" to "xsnprintf", but accidentally left one as just "snprintf". This meant that we could silently truncate the resulting buffer instead of flagging an error. In practice, this is impossible to achieve, as we are formatting a ustar checksum, which can be at most 7 characters. But the point of xsnprintf is to document and check for "should be impossible" conditions; this site was just accidentally mis-converted during f2f0267. Noticed-by: Paul Green <Paul.Green@stratus.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-05-26 06:28:08 +02:00			`xsnprintf(header->chksum, sizeof(header->chksum), "%07o", ustar_header_chksum(header));`
archive-tar: turn write_tar_entry into blob-writing only Before this patch write_tar_entry() can: - write global header by write_global_extended_header() calling write_tar_entry with with both sha1 and path == NULL - write extended header for symlinks, by write_tar_entry() calling itself with sha1 != NULL and path == NULL - write a normal blob. In this case both sha1 and path are valid. After this patch, the first two call sites are modified to write the header without calling write_tar_entry(). The function is now for writing blobs only. This simplifies handling when write_tar_entry() learns about large blobs. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-05-03 03:51:01 +02:00			`}`
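The fixed-width octal fields filled in by prepare_header() bound what a plain ustar header can represent. As a worked example for the 12-byte size field (11 octal digits plus a terminator, so at most 077777777777, one byte short of 8GB), here is a minimal sketch. `format_ustar_size` and the two constants are hypothetical names; it uses plain `snprintf` and returns an error where git's `xsnprintf` would die.

```c
#include <assert.h>
#include <stdio.h>

#define USTAR_SIZE_FIELD 12u           /* bytes in the size field */
#define USTAR_MAX_SIZE 077777777777ULL /* 8^11 - 1 = 8589934591 */

/*
 * Format size as 11 zero-padded octal digits plus a NUL.
 * Returns 0 on success, -1 if the value does not fit the field.
 */
static int format_ustar_size(char *field, unsigned long long size)
{
	int n;

	assert(field != NULL);
	if (size > USTAR_MAX_SIZE)
		return -1; /* caller would need a pax "size" record */
	n = snprintf(field, USTAR_SIZE_FIELD, "%011llo", size);
	return (n == 11) ? 0 : -1;
}
```

Checking the range before formatting keeps the failure explicit instead of relying on truncation, which is the same expectation the switch to xsnprintf documents.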

archive-tar: make write_extended_header() void The function write_extended_header() only ever returns 0. Simplify it and its caller by dropping its return value, like we did with write_global_extended_header() earlier. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-08-06 16:35:38 +02:00			`static void write_extended_header(struct archiver_args *args,`
			`const unsigned char *sha1,`
			`const void *buffer, unsigned long size)`
archive-tar: turn write_tar_entry into blob-writing only Before this patch write_tar_entry() can: - write global header by write_global_extended_header() calling write_tar_entry with with both sha1 and path == NULL - write extended header for symlinks, by write_tar_entry() calling itself with sha1 != NULL and path == NULL - write a normal blob. In this case both sha1 and path are valid. After this patch, the first two call sites are modified to write the header without calling write_tar_entry(). The function is now for writing blobs only. This simplifies handling when write_tar_entry() learns about large blobs. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-05-03 03:51:01 +02:00			`{`
			`struct ustar_header header;`
			`unsigned int mode;`
			`memset(&header, 0, sizeof(header));`
			`*header.typeflag = TYPEFLAG_EXT_HEADER;`
Revert "archive: honor tar.umask even for pax headers" This reverts commit 10f343ea814f5c18a0913997904ee11cd9b7da24, whose output is no longer bit-for-bit equivalent from the older versions of Git, which the infrastructure to (pretend to) upload tarballs kernel.org uses depends on. 2014-10-20 21:04:46 +02:00			`mode = 0100666;`
archive-tar: use xsnprintf for trivial formatting When we generate tar headers, we sprintf() values directly into a struct with the fixed-size header values. For the most part this is fine, as we are formatting small values (e.g., the octal format of "mode & 0x7777" is of fixed length). But it's still a good idea to use xsnprintf here. It communicates to readers what our expectation is, and it provides a run-time check that we are not overflowing the buffers. The one exception here is the mtime, which comes from the epoch time of the commit we are archiving. For sane values, this fits into the 12-byte value allocated in the header. But since git can handle 64-bit times, if I claim to be a visitor from the year 10,000 AD, I can overflow the buffer. This turns out to be harmless, as we simply overflow into the chksum field, which is then overwritten. This case is also best as an xsnprintf. It should never come up, short of extremely malformed dates, and in that case we are probably better off dying than silently truncating the date value (and we cannot expand the size of the buffer, since it is dictated by the ustar format). Our friends in the year 5138 (when we legitimately flip to a 12-digit epoch) can deal with that problem then. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-09-24 23:06:24 +02:00			`xsnprintf(header.name, sizeof(header.name), "%s.paxheader", sha1_to_hex(sha1));`
archive-tar: turn write_tar_entry into blob-writing only Before this patch write_tar_entry() can: - write global header by write_global_extended_header() calling write_tar_entry with with both sha1 and path == NULL - write extended header for symlinks, by write_tar_entry() calling itself with sha1 != NULL and path == NULL - write a normal blob. In this case both sha1 and path are valid. After this patch, the first two call sites are modified to write the header without calling write_tar_entry(). The function is now for writing blobs only. This simplifies handling when write_tar_entry() learns about large blobs. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-05-03 03:51:01 +02:00			`prepare_header(args, &header, mode, size);`
			`write_blocked(&header, sizeof(header));`
			`write_blocked(buffer, size);`
			`}`
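The payload that write_extended_header() emits after the header block is, in the pax format, a sequence of records of the form `LEN keyword=value\n`, where LEN is the decimal length of the whole record including its own digits. The sketch below shows one way to compute that self-referential length; `pax_record` and `count_digits` are hypothetical helpers, not functions from archive-tar.c, and the value is assumed to be a NUL-terminated string (git itself passes an explicit length).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Number of decimal digits needed to print n. */
static size_t count_digits(size_t n)
{
	size_t d = 1;

	while (n > 9) {
		n /= 10;
		d++;
	}
	return d;
}

/*
 * Write "LEN keyword=value\n" into out and return LEN,
 * or return 0 if out is too small to hold the record plus a NUL.
 */
static size_t pax_record(char *out, size_t outsize,
			 const char *keyword, const char *value)
{
	/* everything except the length digits: ' ', keyword, '=', value, '\n' */
	size_t rest = 1 + strlen(keyword) + 1 + strlen(value) + 1;
	size_t len = rest + 1; /* first guess: a single digit */

	assert(out != NULL);
	for (;;) {
		/* the length must include its own digits; iterate until stable */
		size_t need = rest + count_digits(len);
		if (need == len)
			break;
		len = need;
	}
	if (len + 1 > outsize)
		return 0;
	snprintf(out, outsize, "%lu %s=%s\n", (unsigned long)len, keyword, value);
	return len;
}
```

Iterating to a fixed point handles lengths near a power of ten, where adding one more digit changes the total: for example, 98 bytes of non-digit content yield a 101-byte record, not 100.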

archive: centralize archive entry writing Add the exported function write_archive_entries() to archive.c, which uses the new ability of read_tree_recursive() to pass a context pointer to its callback in order to centralize previously duplicated code. The new callback function write_archive_entry() does the work that every archiver backend needs to do: loading file contents, entering subdirectories, handling file attributes, constructing the full path of the entry. All that done, it calls the backend specific write_archive_entry_fn_t function. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-14 21:22:24 +02:00			`static int write_tar_entry(struct archiver_args *args,`
archive: delegate blob reading to backend archive-tar.c and archive-zip.c now perform conversion check, with help of sha1_file_to_archive() from archive.c This gives backends more freedom in dealing with (streaming) large blobs. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-05-03 03:51:03 +02:00			`const unsigned char *sha1,`
			`const char *path, size_t pathlen,`
			`unsigned int mode)`
git-tar-tree: Move code for git-archive --format=tar to archive-tar.c This patch doesn't change any functionality, it only moves code around. It makes seeing the few remaining lines of git-tar-tree code easier. ;-) Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-09-24 17:31:10 +02:00			`{`
			`struct ustar_header header;`
Replace calls to strbuf_init(&foo, 0) with STRBUF_INIT initializer Many call sites use strbuf_init(&foo, 0) to initialize local strbuf variable "foo" which has not been accessed since its declaration. These can be replaced with a static initialization using the STRBUF_INIT macro which is just as readable, saves a function call, and takes up fewer lines. Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2008-10-09 21:12:12 +02:00			`struct strbuf ext_header = STRBUF_INIT;`
archive: delegate blob reading to backend archive-tar.c and archive-zip.c now perform conversion check, with help of sha1_file_to_archive() from archive.c This gives backends more freedom in dealing with (streaming) large blobs. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-05-03 03:51:03 +02:00			`unsigned int old_mode = mode;`
archive-tar: write extended headers for file sizes >= 8GB The ustar format has a fixed-length field for the size of each file entry which is supposed to contain up to 11 bytes of octal-formatted data plus a NUL or space terminator. These means that the largest size we can represent is 077777777777, or 1 byte short of 8GB. The correct solution for a larger file, according to POSIX.1-2001, is to add an extended pax header, similar to how we handle long filenames. This patch does that, and writes zero for the size field in the ustar header (the last bit is not mentioned by POSIX, but it matches how GNU tar behaves with --format=pax). This should be a strict improvement over the current behavior, which is to die in xsnprintf with a "BUG". However, there's some interesting history here. Prior to f2f0267 (archive-tar: use xsnprintf for trivial formatting, 2015-09-24), we silently overflowed the "size" field. The extra bytes ended up in the "mtime" field of the header, which was then immediately written itself, overwriting our extra bytes. What that means depends on how many bytes we wrote. If the size was 64GB or greater, then we actually overflowed digits into the mtime field, meaning our value was effectively right-shifted by those lost octal digits. And this patch is again a strict improvement over that. But if the size was between 8GB and 64GB, then our 12-byte field held all of the actual digits, and only our NUL terminator overflowed. According to POSIX, there should be a NUL or space at the end of the field. However, GNU tar seems to be lenient here, and will correctly parse a size up 64GB (minus one) from the field. So sizes in this range might have just worked, depending on the implementation reading the tarfile. This patch is mostly still an improvement there, as the 8GB limit is specifically mentioned in POSIX as the correct limit. But it's possible that it could be a regression (versus the pre-f2f0267 state) if all of the following are true: 1. You have a file between 8GB and 64GB. 2. Your tar implementation _doesn't_ know about pax extended headers. 3. Your tar implementation _does_ parse 12-byte sizes from the ustar header without a delimiter. It's probably not worth worrying about such an obscure set of conditions, but I'm documenting it here just in case. Helped-by: René Scharfe <l.s.r@web.de> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-06-30 11:09:16 +02:00			`unsigned long size, size_in_header;`
archive: delegate blob reading to backend archive-tar.c and archive-zip.c now perform conversion check, with help of sha1_file_to_archive() from archive.c This gives backends more freedom in dealing with (streaming) large blobs. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-05-03 03:51:03 +02:00			`void *buffer;`
archive: centralize archive entry writing Add the exported function write_archive_entries() to archive.c, which uses the new ability of read_tree_recursive() to pass a context pointer to its callback in order to centralize previously duplicated code. The new callback function write_archive_entry() does the work that every archiver backend needs to do: loading file contents, entering subdirectories, handling file attributes, constructing the full path of the entry. All that done, it calls the backend specific write_archive_entry_fn_t function. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-14 21:22:24 +02:00			`int err = 0;`
git-tar-tree: Move code for git-archive --format=tar to archive-tar.c This patch doesn't change any functionality, it only moves code around. It makes seeing the few remaining lines of git-tar-tree code easier. ;-) Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-09-24 17:31:10 +02:00
			`memset(&header, 0, sizeof(header));`

archive-tar: unindent write_tar_entry by one level It's used to be if (!sha1) { ... } else if (!path) { ... } else { ... } Now that the first two blocks are no-op. We can remove the if/else skeleton and put the else block back by one indent level. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-05-03 03:51:02 +02:00			`if (S_ISDIR(mode) || S_ISGITLINK(mode)) {`
			`*header.typeflag = TYPEFLAG_DIR;`
			`mode = (mode | 0777) & ~tar_umask;`
			`} else if (S_ISLNK(mode)) {`
			`*header.typeflag = TYPEFLAG_LNK;`
			`mode |= 0777;`
			`} else if (S_ISREG(mode)) {`
			`*header.typeflag = TYPEFLAG_REG;`
			`mode = (mode | ((mode & 0100) ? 0777 : 0666)) & ~tar_umask;`
git-tar-tree: Move code for git-archive --format=tar to archive-tar.c This patch doesn't change any functionality, it only moves code around. It makes seeing the few remaining lines of git-tar-tree code easier. ;-) Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-09-24 17:31:10 +02:00			`} else {`
archive-tar: unindent write_tar_entry by one level It's used to be if (!sha1) { ... } else if (!path) { ... } else { ... } Now that the first two blocks are no-op. We can remove the if/else skeleton and put the else block back by one indent level. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-05-03 03:51:02 +02:00			`return error("unsupported file mode: 0%o (SHA1: %s)",`
			`mode, sha1_to_hex(sha1));`
			`}`
			`if (pathlen > sizeof(header.name)) {`
			`size_t plen = get_path_prefix(path, pathlen,`
			`sizeof(header.prefix));`
			`size_t rest = pathlen - plen - 1;`
			`if (plen > 0 && rest <= sizeof(header.name)) {`
			`memcpy(header.prefix, path, plen);`
archive-tar: fix minor indentation violation This looks like a simple omission from 8539070 (archive-tar: unindent write_tar_entry by one level, 2012-05-03). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-09-24 23:03:49 +02:00			`memcpy(header.name, path + plen + 1, rest);`
git-tar-tree: Move code for git-archive --format=tar to archive-tar.c This patch doesn't change any functionality, it only moves code around. It makes seeing the few remaining lines of git-tar-tree code easier. ;-) Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-09-24 17:31:10 +02:00			`} else {`
archive-tar: use xsnprintf for trivial formatting When we generate tar headers, we sprintf() values directly into a struct with the fixed-size header values. For the most part this is fine, as we are formatting small values (e.g., the octal format of "mode & 0x7777" is of fixed length). But it's still a good idea to use xsnprintf here. It communicates to readers what our expectation is, and it provides a run-time check that we are not overflowing the buffers. The one exception here is the mtime, which comes from the epoch time of the commit we are archiving. For sane values, this fits into the 12-byte value allocated in the header. But since git can handle 64-bit times, if I claim to be a visitor from the year 10,000 AD, I can overflow the buffer. This turns out to be harmless, as we simply overflow into the chksum field, which is then overwritten. This case is also best as an xsnprintf. It should never come up, short of extremely malformed dates, and in that case we are probably better off dying than silently truncating the date value (and we cannot expand the size of the buffer, since it is dictated by the ustar format). Our friends in the year 5138 (when we legitimately flip to a 12-digit epoch) can deal with that problem then. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-09-24 23:06:24 +02:00			`xsnprintf(header.name, sizeof(header.name), "%s.data",`
			`sha1_to_hex(sha1));`
archive-tar: unindent write_tar_entry by one level It's used to be if (!sha1) { ... } else if (!path) { ... } else { ... } Now that the first two blocks are no-op. We can remove the if/else skeleton and put the else block back by one indent level. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-05-03 03:51:02 +02:00			`strbuf_append_ext_header(&ext_header, "path",`
			`path, pathlen);`
git-tar-tree: Move code for git-archive --format=tar to archive-tar.c This patch doesn't change any functionality, it only moves code around. It makes seeing the few remaining lines of git-tar-tree code easier. ;-) Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-09-24 17:31:10 +02:00			`}`
archive-tar: unindent write_tar_entry by one level It's used to be if (!sha1) { ... } else if (!path) { ... } else { ... } Now that the first two blocks are no-op. We can remove the if/else skeleton and put the else block back by one indent level. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-05-03 03:51:02 +02:00			`} else`
			`memcpy(header.name, path, pathlen);`
git-tar-tree: Move code for git-archive --format=tar to archive-tar.c This patch doesn't change any functionality, it only moves code around. It makes seeing the few remaining lines of git-tar-tree code easier. ;-) Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-09-24 17:31:10 +02:00
archive-tar: stream large blobs to tar file t5000 verifies output while t1050 makes sure the command always respects core.bigfilethreshold Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-05-03 03:51:04 +02:00			`if (S_ISREG(mode) && !args->convert &&`
			`sha1_object_info(sha1, &size) == OBJ_BLOB &&`
			`size > big_file_threshold)`
			`buffer = NULL;`
			`else if (S_ISLNK(mode) || S_ISREG(mode)) {`
archive: delegate blob reading to backend archive-tar.c and archive-zip.c now perform conversion check, with help of sha1_file_to_archive() from archive.c This gives backends more freedom in dealing with (streaming) large blobs. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-05-03 03:51:03 +02:00			`enum object_type type;`
			`buffer = sha1_file_to_archive(args, path, sha1, old_mode, &type, &size);`
			`if (!buffer)`
			`return error("cannot read %s", sha1_to_hex(sha1));`
			`} else {`
			`buffer = NULL;`
			`size = 0;`
			`}`

			`if (S_ISLNK(mode)) {`
git-tar-tree: Move code for git-archive --format=tar to archive-tar.c This patch doesn't change any functionality, it only moves code around. It makes seeing the few remaining lines of git-tar-tree code easier. ;-) Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-09-24 17:31:10 +02:00			`if (size > sizeof(header.linkname)) {`
archive-tar: use xsnprintf for trivial formatting When we generate tar headers, we sprintf() values directly into a struct with the fixed-size header values. For the most part this is fine, as we are formatting small values (e.g., the octal format of "mode & 0x7777" is of fixed length). But it's still a good idea to use xsnprintf here. It communicates to readers what our expectation is, and it provides a run-time check that we are not overflowing the buffers. The one exception here is the mtime, which comes from the epoch time of the commit we are archiving. For sane values, this fits into the 12-byte value allocated in the header. But since git can handle 64-bit times, if I claim to be a visitor from the year 10,000 AD, I can overflow the buffer. This turns out to be harmless, as we simply overflow into the chksum field, which is then overwritten. This case is also best as an xsnprintf. It should never come up, short of extremely malformed dates, and in that case we are probably better off dying than silently truncating the date value (and we cannot expand the size of the buffer, since it is dictated by the ustar format). Our friends in the year 5138 (when we legitimately flip to a 12-digit epoch) can deal with that problem then. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-09-24 23:06:24 +02:00			`xsnprintf(header.linkname, sizeof(header.linkname),`
			`"see %s.paxheader", sha1_to_hex(sha1));`
git-tar-tree: Move code for git-archive --format=tar to archive-tar.c This patch doesn't change any functionality, it only moves code around. It makes seeing the few remaining lines of git-tar-tree code easier. ;-) Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-09-24 17:31:10 +02:00			`strbuf_append_ext_header(&ext_header, "linkpath",`
			`buffer, size);`
			`} else`
			`memcpy(header.linkname, buffer, size);`
			`}`

archive-tar: write extended headers for file sizes >= 8GB The ustar format has a fixed-length field for the size of each file entry which is supposed to contain up to 11 bytes of octal-formatted data plus a NUL or space terminator. These means that the largest size we can represent is 077777777777, or 1 byte short of 8GB. The correct solution for a larger file, according to POSIX.1-2001, is to add an extended pax header, similar to how we handle long filenames. This patch does that, and writes zero for the size field in the ustar header (the last bit is not mentioned by POSIX, but it matches how GNU tar behaves with --format=pax). This should be a strict improvement over the current behavior, which is to die in xsnprintf with a "BUG". However, there's some interesting history here. Prior to f2f0267 (archive-tar: use xsnprintf for trivial formatting, 2015-09-24), we silently overflowed the "size" field. The extra bytes ended up in the "mtime" field of the header, which was then immediately written itself, overwriting our extra bytes. What that means depends on how many bytes we wrote. If the size was 64GB or greater, then we actually overflowed digits into the mtime field, meaning our value was effectively right-shifted by those lost octal digits. And this patch is again a strict improvement over that. But if the size was between 8GB and 64GB, then our 12-byte field held all of the actual digits, and only our NUL terminator overflowed. According to POSIX, there should be a NUL or space at the end of the field. However, GNU tar seems to be lenient here, and will correctly parse a size up 64GB (minus one) from the field. So sizes in this range might have just worked, depending on the implementation reading the tarfile. This patch is mostly still an improvement there, as the 8GB limit is specifically mentioned in POSIX as the correct limit. But it's possible that it could be a regression (versus the pre-f2f0267 state) if all of the following are true: 1. 
You have a file between 8GB and 64GB. 2. Your tar implementation _doesn't_ know about pax extended headers. 3. Your tar implementation _does_ parse 12-byte sizes from the ustar header without a delimiter. It's probably not worth worrying about such an obscure set of conditions, but I'm documenting it here just in case. Helped-by: René Scharfe <l.s.r@web.de> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-06-30 11:09:16 +02:00			`size_in_header = size;`
			`if (S_ISREG(mode) && size > USTAR_MAX_SIZE) {`
			`size_in_header = 0;`
			`strbuf_append_ext_header_uint(&ext_header, "size", size);`
			`}`

			`prepare_header(args, &header, mode, size_in_header);`
git-tar-tree: Move code for git-archive --format=tar to archive-tar.c This patch doesn't change any functionality, it only moves code around. It makes seeing the few remaining lines of git-tar-tree code easier. ;-) Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-09-24 17:31:10 +02:00
			`if (ext_header.len > 0) {`
archive-tar: make write_extended_header() void The function write_extended_header() only ever returns 0. Simplify it and its caller by dropping its return value, like we did with write_global_extended_header() earlier. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-08-06 16:35:38 +02:00			`write_extended_header(args, sha1, ext_header.buf,`
			`ext_header.len);`
git-tar-tree: Move code for git-archive --format=tar to archive-tar.c This patch doesn't change any functionality, it only moves code around. It makes seeing the few remaining lines of git-tar-tree code easier. ;-) Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-09-24 17:31:10 +02:00			`}`
Simplify strbuf uses in archive-tar.c using strbuf API This is just cleaner way to deal with strbufs, using its API rather than reinventing it in the module (e.g. strbuf_append_string is just the plain strbuf_addstr function, and it was used to perform what strbuf_addch does anyways). Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-06 13:20:06 +02:00			`strbuf_release(&ext_header);`
git-tar-tree: Move code for git-archive --format=tar to archive-tar.c This patch doesn't change any functionality, it only moves code around. It makes seeing the few remaining lines of git-tar-tree code easier. ;-) Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-09-24 17:31:10 +02:00			`write_blocked(&header, sizeof(header));`
archive-tar: stream large blobs to tar file t5000 verifies output while t1050 makes sure the command always respects core.bigfilethreshold Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-05-03 03:51:04 +02:00			`if (S_ISREG(mode) && size > 0) {`
			`if (buffer)`
			`write_blocked(buffer, size);`
			`else`
			`err = stream_blocked(sha1);`
			`}`
archive: delegate blob reading to backend archive-tar.c and archive-zip.c now perform conversion check, with help of sha1_file_to_archive() from archive.c This gives backends more freedom in dealing with (streaming) large blobs. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-05-03 03:51:03 +02:00			`free(buffer);`
archive: centralize archive entry writing Add the exported function write_archive_entries() to archive.c, which uses the new ability of read_tree_recursive() to pass a context pointer to its callback in order to centralize previously duplicated code. The new callback function write_archive_entry() does the work that every archiver backend needs to do: loading file contents, entering subdirectories, handling file attributes, constructing the full path of the entry. All that done, it calls the backend specific write_archive_entry_fn_t function. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-14 21:22:24 +02:00			`return err;`
git-tar-tree: Move code for git-archive --format=tar to archive-tar.c This patch doesn't change any functionality, it only moves code around. It makes seeing the few remaining lines of git-tar-tree code easier. ;-) Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-09-24 17:31:10 +02:00			`}`
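
The long-path, long-link and oversized-file branches above all fall back to pax extended header records. As a rough illustration of what such a record looks like, here is a minimal sketch. `pax_record` is a hypothetical helper written for this note, not git's `strbuf_append_ext_header`, and it handles NUL-terminated values only. Each record has the form `<len> <keyword>=<value>\n`, where `<len>` counts the whole record including its own digits.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not git's strbuf_append_ext_header): format one pax
 * extended header record, "<len> <keyword>=<value>\n", into out.
 * <len> is the decimal length of the whole record, including its own digits.
 * Returns the record length, or 0 if out is too small.
 */
static size_t pax_record(char *out, size_t outsz,
			 const char *keyword, const char *value)
{
	/* Bytes besides the length digits: ' ' + keyword + '=' + value + '\n' */
	size_t rest = strlen(keyword) + strlen(value) + 3;
	size_t len = rest + 1; /* start by assuming a single length digit */
	size_t digits, tmp;

	/* Grow len until it accounts for its own decimal digits. */
	for (;;) {
		digits = 0;
		for (tmp = len; tmp; tmp /= 10)
			digits++;
		if (rest + digits == len)
			break;
		len = rest + digits;
	}
	if (len + 1 > outsz)
		return 0;
	snprintf(out, outsz, "%zu %s=%s\n", len, keyword, value);
	return len;
}
```

The self-referential length is the only subtle part: adding the digits can itself push the total across a power of ten, hence the loop.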

archive-tar: drop return value We never do any error checks, and so never return anything but "0". Let's just drop this to simplify the code. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-06-30 11:09:26 +02:00			`static void write_global_extended_header(struct archiver_args *args)`
git-tar-tree: Move code for git-archive --format=tar to archive-tar.c This patch doesn't change any functionality, it only moves code around. It makes seeing the few remaining lines of git-tar-tree code easier. ;-) Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-09-24 17:31:10 +02:00			`{`
archive: centralize archive entry writing Add the exported function write_archive_entries() to archive.c, which uses the new ability of read_tree_recursive() to pass a context pointer to its callback in order to centralize previously duplicated code. The new callback function write_archive_entry() does the work that every archiver backend needs to do: loading file contents, entering subdirectories, handling file attributes, constructing the full path of the entry. All that done, it calls the backend specific write_archive_entry_fn_t function. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-14 21:22:24 +02:00			`const unsigned char *sha1 = args->commit_sha1;`
Replace calls to strbuf_init(&foo, 0) with STRBUF_INIT initializer Many call sites use strbuf_init(&foo, 0) to initialize local strbuf variable "foo" which has not been accessed since its declaration. These can be replaced with a static initialization using the STRBUF_INIT macro which is just as readable, saves a function call, and takes up fewer lines. Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2008-10-09 21:12:12 +02:00			`struct strbuf ext_header = STRBUF_INIT;`
archive-tar: turn write_tar_entry into blob-writing only Before this patch write_tar_entry() can: - write global header by write_global_extended_header() calling write_tar_entry with with both sha1 and path == NULL - write extended header for symlinks, by write_tar_entry() calling itself with sha1 != NULL and path == NULL - write a normal blob. In this case both sha1 and path are valid. After this patch, the first two call sites are modified to write the header without calling write_tar_entry(). The function is now for writing blobs only. This simplifies handling when write_tar_entry() learns about large blobs. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-05-03 03:51:01 +02:00			`struct ustar_header header;`
			`unsigned int mode;`
Simplify strbuf uses in archive-tar.c using strbuf API This is just cleaner way to deal with strbufs, using its API rather than reinventing it in the module (e.g. strbuf_append_string is just the plain strbuf_addstr function, and it was used to perform what strbuf_addch does anyways). Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-06 13:20:06 +02:00
archive-tar: write extended headers for far-future mtime The ustar format represents timestamps as seconds since the epoch, but only has room to store 11 octal digits. To express anything larger, we need to use an extended header. This is exactly the same case we fixed for the size field in the previous commit, and the solution here follows the same pattern. This is even mentioned as an issue in f2f0267 (archive-tar: use xsnprintf for trivial formatting, 2015-09-24), but since it only affected things far in the future, it wasn't deemed worth dealing with. But note that my calculations claiming thousands of years were off there; because our xsnprintf produces a NUL byte, we only have until the year 2242 to fix this. Given that this is just around the corner (geologically speaking, anyway), and because it's easy to fix, let's just make it work. Unlike the previous fix for "size", where we had to write an individual extended header for each file, we can write one global header (since we have only one mtime for the whole archive). There's a slight bit of trickiness there. We may already be writing a global header with a "comment" field for the commit sha1. So we need to write our new field into the same header. To do this, we push the decision of whether to write such a header down into write_global_extended_header(), which will now assemble the header as it sees fit, and will return early if we have nothing to write (in practice, we'll only have a large mtime if it comes from a commit, but this makes it also work if you set your system clock ahead such that time() returns a huge value). Note that we don't (and never did) handle negative timestamps (i.e., before 1970). This would probably not be too hard to support in the same way, but since git does not support negative timestamps at all, I didn't bother here. After writing the extended header, we munge the timestamp in the ustar headers to the maximum-allowable size. 
This is wrong, but it's the least-wrong thing we can provide to a tar implementation that doesn't understand pax headers (it's also what GNU tar does). Helped-by: René Scharfe <l.s.r@web.de> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-06-30 11:09:20 +02:00			`if (sha1)`
			`strbuf_append_ext_header(&ext_header, "comment",`
			`sha1_to_hex(sha1), 40);`
			`if (args->time > USTAR_MAX_MTIME) {`
			`strbuf_append_ext_header_uint(&ext_header, "mtime",`
			`args->time);`
			`args->time = USTAR_MAX_MTIME;`
			`}`

			`if (!ext_header.len)`
archive-tar: drop return value We never do any error checks, and so never return anything but "0". Let's just drop this to simplify the code. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-06-30 11:09:26 +02:00			`return;`
archive-tar: write extended headers for far-future mtime The ustar format represents timestamps as seconds since the epoch, but only has room to store 11 octal digits. To express anything larger, we need to use an extended header. This is exactly the same case we fixed for the size field in the previous commit, and the solution here follows the same pattern. This is even mentioned as an issue in f2f0267 (archive-tar: use xsnprintf for trivial formatting, 2015-09-24), but since it only affected things far in the future, it wasn't deemed worth dealing with. But note that my calculations claiming thousands of years were off there; because our xsnprintf produces a NUL byte, we only have until the year 2242 to fix this. Given that this is just around the corner (geologically speaking, anyway), and because it's easy to fix, let's just make it work. Unlike the previous fix for "size", where we had to write an individual extended header for each file, we can write one global header (since we have only one mtime for the whole archive). There's a slight bit of trickiness there. We may already be writing a global header with a "comment" field for the commit sha1. So we need to write our new field into the same header. To do this, we push the decision of whether to write such a header down into write_global_extended_header(), which will now assemble the header as it sees fit, and will return early if we have nothing to write (in practice, we'll only have a large mtime if it comes from a commit, but this makes it also work if you set your system clock ahead such that time() returns a huge value). Note that we don't (and never did) handle negative timestamps (i.e., before 1970). This would probably not be too hard to support in the same way, but since git does not support negative timestamps at all, I didn't bother here. After writing the extended header, we munge the timestamp in the ustar headers to the maximum-allowable size. 
This is wrong, but it's the least-wrong thing we can provide to a tar implementation that doesn't understand pax headers (it's also what GNU tar does). Helped-by: René Scharfe <l.s.r@web.de> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-06-30 11:09:20 +02:00
archive-tar: turn write_tar_entry into blob-writing only Before this patch write_tar_entry() can: - write global header by write_global_extended_header() calling write_tar_entry with with both sha1 and path == NULL - write extended header for symlinks, by write_tar_entry() calling itself with sha1 != NULL and path == NULL - write a normal blob. In this case both sha1 and path are valid. After this patch, the first two call sites are modified to write the header without calling write_tar_entry(). The function is now for writing blobs only. This simplifies handling when write_tar_entry() learns about large blobs. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-05-03 03:51:01 +02:00			`memset(&header, 0, sizeof(header));`
			`*header.typeflag = TYPEFLAG_GLOBAL_HEADER;`
Revert "archive: honor tar.umask even for pax headers" This reverts commit 10f343ea814f5c18a0913997904ee11cd9b7da24, whose output is no longer bit-for-bit equivalent from the older versions of Git, which the infrastructure to (pretend to) upload tarballs kernel.org uses depends on. 2014-10-20 21:04:46 +02:00			`mode = 0100666;`
convert trivial sprintf / strcpy calls to xsnprintf We sometimes sprintf into fixed-size buffers when we know that the buffer is large enough to fit the input (either because it's a constant, or because it's numeric input that is bounded in size). Likewise with strcpy of constant strings. However, these sites make it hard to audit sprintf and strcpy calls for buffer overflows, as a reader has to cross-reference the size of the array with the input. Let's use xsnprintf instead, which communicates to a reader that we don't expect this to overflow (and catches the mistake in case we do). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-09-24 23:06:08 +02:00			`xsnprintf(header.name, sizeof(header.name), "pax_global_header");`
archive-tar: turn write_tar_entry into blob-writing only Before this patch write_tar_entry() can: - write global header by write_global_extended_header() calling write_tar_entry with with both sha1 and path == NULL - write extended header for symlinks, by write_tar_entry() calling itself with sha1 != NULL and path == NULL - write a normal blob. In this case both sha1 and path are valid. After this patch, the first two call sites are modified to write the header without calling write_tar_entry(). The function is now for writing blobs only. This simplifies handling when write_tar_entry() learns about large blobs. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-05-03 03:51:01 +02:00			`prepare_header(args, &header, mode, ext_header.len);`
			`write_blocked(&header, sizeof(header));`
			`write_blocked(ext_header.buf, ext_header.len);`
Simplify strbuf uses in archive-tar.c using strbuf API This is just cleaner way to deal with strbufs, using its API rather than reinventing it in the module (e.g. strbuf_append_string is just the plain strbuf_addstr function, and it was used to perform what strbuf_addch does anyways). Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-06 13:20:06 +02:00			`strbuf_release(&ext_header);`
git-tar-tree: Move code for git-archive --format=tar to archive-tar.c This patch doesn't change any functionality, it only moves code around. It makes seeing the few remaining lines of git-tar-tree code easier. ;-) Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-09-24 17:31:10 +02:00			`}`
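
The mtime handling above clamps the value stored in the ustar header and carries the real value in the extended header. The following is a small sketch of that idea, not the code used in this file. `EXAMPLE_USTAR_MAX` and `ustar_number` are names made up for illustration, and the limit of 11 octal digits is taken from the commit message rather than from the actual `USTAR_MAX_MTIME` definition, which is not shown here.

```c
#include <stdint.h>
#include <stdio.h>

/* Assumed limit: 11 octal digits, as described in the commit message. */
#define EXAMPLE_USTAR_MAX 077777777777ULL

/*
 * Hypothetical helper: write value into a 12-byte ustar numeric field as
 * 11 zero-padded octal digits plus a terminating NUL. Values above the
 * limit are clamped, and *needs_pax is set so the caller knows to emit a
 * pax record that carries the real value.
 */
static void ustar_number(char field[12], uint64_t value, int *needs_pax)
{
	*needs_pax = value > EXAMPLE_USTAR_MAX;
	if (*needs_pax)
		value = EXAMPLE_USTAR_MAX;
	snprintf(field, 12, "%011llo", (unsigned long long)value);
}
```

Note that the file-size case in `write_tar_entry` stores zero rather than the clamped maximum. The sketch mirrors only the mtime behavior.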

archive: implement configurable tar filters It's common to pipe the tar output produce by "git archive" through gzip or some other compressor. Locally, this can easily be done by using a shell pipe. When requesting a remote archive, though, it cannot be done through the upload-archive interface. This patch allows configurable tar filters, so that one could define a "tar.gz" format that automatically pipes tar output through gzip. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-06-22 03:26:31 +02:00			`static struct archiver **tar_filters;`
			`static int nr_tar_filters;`
			`static int alloc_tar_filters;`

			`static struct archiver *find_tar_filter(const char *name, int len)`
			`{`
			`int i;`
			`for (i = 0; i < nr_tar_filters; i++) {`
			`struct archiver *ar = tar_filters[i];`
			`if (!strncmp(ar->name, name, len) && !ar->name[len])`
			`return ar;`
			`}`
			`return NULL;`
			`}`
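
The lookup above compares a name that is given as pointer plus length, not as a NUL-terminated string. A standalone sketch of the same matching idiom follows. `example_filter` and `example_find` are hypothetical names, and the sketch assumes the first `len` bytes of `name` contain no NUL.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the archiver table; only the name matters here. */
struct example_filter {
	const char *name;
};

/*
 * Find the entry whose name is exactly the first len bytes of name.
 * strncmp() alone would also accept entries that merely start with that
 * prefix, so the terminating NUL at entry->name[len] is checked as well.
 */
static const struct example_filter *
example_find(const struct example_filter *tab, size_t nr,
	     const char *name, size_t len)
{
	size_t i;

	for (i = 0; i < nr; i++)
		if (!strncmp(tab[i].name, name, len) && !tab[i].name[len])
			return &tab[i];
	return NULL;
}
```

Without the second condition, a query for `tar` would wrongly match an entry named `tar.gz`.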

			`static int tar_filter_config(const char *var, const char *value, void *data)`
			`{`
			`struct archiver *ar;`
			`const char *name;`
			`const char *type;`
			`int namelen;`

archive-tar: use parse_config_key when parsing config This is fewer lines of code, but more importantly, fixes a bogus pointer offset. We are looking for "tar." in the section, but later assume that the dot we found is at offset 9, not 3. This is a holdover from an earlier iteration of 767cf45 which called the section "tarfilter". As a result, we could erroneously reject some filters with dots in their name, as well as read uninitialized memory. Reported by (and test by) René Scharfe. Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-01-23 07:23:27 +01:00			`if (parse_config_key(var, "tar", &name, &namelen, &type) < 0 || !name)`
archive: implement configurable tar filters It's common to pipe the tar output produce by "git archive" through gzip or some other compressor. Locally, this can easily be done by using a shell pipe. When requesting a remote archive, though, it cannot be done through the upload-archive interface. This patch allows configurable tar filters, so that one could define a "tar.gz" format that automatically pipes tar output through gzip. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-06-22 03:26:31 +02:00			`return 0;`

			`ar = find_tar_filter(name, namelen);`
			`if (!ar) {`
			`ar = xcalloc(1, sizeof(*ar));`
			`ar->name = xmemdupz(name, namelen);`
			`ar->write_archive = write_tar_filter_archive;`
			`ar->flags = ARCHIVER_WANT_COMPRESSION_LEVELS;`
			`ALLOC_GROW(tar_filters, nr_tar_filters + 1, alloc_tar_filters);`
			`tar_filters[nr_tar_filters++] = ar;`
			`}`

			`if (!strcmp(type, "command")) {`
			`if (!value)`
			`return config_error_nonbool(var);`
			`free(ar->data);`
			`ar->data = xstrdup(value);`
			`return 0;`
			`}`
upload-archive: allow user to turn off filters Some tar filters may be very expensive to run, so sites do not want to expose them via upload-archive. This patch lets users configure tar.<filter>.remote to turn them off. By default, gzip filters are left on, as they are about as expensive as creating zip archives. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-06-22 05:17:35 +02:00			`if (!strcmp(type, "remote")) {`
			`if (git_config_bool(var, value))`
			`ar->flags |= ARCHIVER_REMOTE;`
			`else`
			`ar->flags &= ~ARCHIVER_REMOTE;`
			`return 0;`
			`}`
archive: implement configurable tar filters It's common to pipe the tar output produce by "git archive" through gzip or some other compressor. Locally, this can easily be done by using a shell pipe. When requesting a remote archive, though, it cannot be done through the upload-archive interface. This patch allows configurable tar filters, so that one could define a "tar.gz" format that automatically pipes tar output through gzip. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-06-22 03:26:31 +02:00
			`return 0;`
			`}`
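
The config callback above expects keys of the form `tar.<name>.<type>`, where `<name>` may itself contain dots (for example `tar.tar.gz.command`). The sketch below shows one way such a key can be split. `example_split_key` is a simplified, hypothetical helper written for this note. It is not git's `parse_config_key` and makes no claim to match its exact behavior.

```c
#include <stddef.h>
#include <string.h>

/*
 * Simplified sketch (not git's parse_config_key): split "section.sub.key"
 * or "section.key". Returns -1 if var does not start with "section.".
 * On success *key points at the text after the last dot; *sub and *sublen
 * describe the middle part, or *sub is NULL when there is none.
 */
static int example_split_key(const char *var, const char *section,
			     const char **sub, size_t *sublen,
			     const char **key)
{
	size_t seclen = strlen(section);
	const char *last_dot;

	if (strncmp(var, section, seclen) || var[seclen] != '.')
		return -1;
	/* The key is whatever follows the last dot. */
	last_dot = strrchr(var, '.');
	*key = last_dot + 1;
	if (last_dot == var + seclen) {
		/* Only one dot: "section.key", no middle part. */
		*sub = NULL;
		*sublen = 0;
	} else {
		*sub = var + seclen + 1;
		*sublen = (size_t)(last_dot - *sub);
	}
	return 0;
}
```

Splitting at the last dot rather than the second one is what lets a filter name such as `tar.gz` survive intact. A key like `tar.umask` yields no middle part, which corresponds to the `!name` early return in the callback.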

Provide git_config with a callback-data parameter git_config() only had a function parameter, but no callback data parameter. This assumes that all callback functions only modify global variables. With this patch, every callback gets a void * parameter, and it is hoped that this will help the libification effort. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-05-14 19:46:53 +02:00			`static int git_tar_config(const char *var, const char *value, void *cb)`
git-tar-tree: Move code for git-archive --format=tar to archive-tar.c This patch doesn't change any functionality, it only moves code around. It makes seeing the few remaining lines of git-tar-tree code easier. ;-) Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-09-24 17:31:10 +02:00			`{`
			`if (!strcmp(var, "tar.umask")) {`
archive-tar.c: guard config parser from value=NULL Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-02-09 05:38:22 +01:00			`if (value && !strcmp(value, "user")) {`
git-tar-tree: Move code for git-archive --format=tar to archive-tar.c This patch doesn't change any functionality, it only moves code around. It makes seeing the few remaining lines of git-tar-tree code easier. ;-) Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-09-24 17:31:10 +02:00			`tar_umask = umask(0);`
			`umask(tar_umask);`
			`} else {`
			`tar_umask = git_config_int(var, value);`
			`}`
			`return 0;`
			`}`
archive: implement configurable tar filters It's common to pipe the tar output produce by "git archive" through gzip or some other compressor. Locally, this can easily be done by using a shell pipe. When requesting a remote archive, though, it cannot be done through the upload-archive interface. This patch allows configurable tar filters, so that one could define a "tar.gz" format that automatically pipes tar output through gzip. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-06-22 03:26:31 +02:00
			`return tar_filter_config(var, value, cb);`
git-tar-tree: Move code for git-archive --format=tar to archive-tar.c This patch doesn't change any functionality, it only moves code around. It makes seeing the few remaining lines of git-tar-tree code easier. ;-) Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-09-24 17:31:10 +02:00			`}`
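
The `tar.umask = user` branch above calls `umask()` twice in a row. That pair is the usual POSIX idiom for reading the process umask, shown here in isolation. `example_current_umask` is a hypothetical name used only in this sketch.

```c
#include <sys/types.h>
#include <sys/stat.h>

/*
 * POSIX offers no call that only reads the umask: umask() always installs
 * a new mask and returns the previous one. Set a throwaway mask, then put
 * the old one straight back.
 */
static mode_t example_current_umask(void)
{
	mode_t old = umask(0);

	umask(old);
	return old;
}
```

Between the two calls the process briefly runs with a mask of 0, so this idiom is best kept to single-threaded setup code such as config parsing.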

archive: pass archiver struct to write_archive callback The current archivers are very static; when you are in the write_tar_archive function, you know you are writing a tar. However, to facilitate runtime-configurable archivers that will share a common write function we need to tell the function which archiver was used. As a convenience, we also provide an opaque data pointer in the archiver struct so that individual archivers can put something useful there when they register themselves. Technically they could just use the "name" field to look in an internal map of names to data, but this is much simpler. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-06-22 03:24:07 +02:00			`static int write_tar_archive(const struct archiver *ar,`
			`struct archiver_args *args)`
git-tar-tree: Move code for git-archive --format=tar to archive-tar.c This patch doesn't change any functionality, it only moves code around. It makes seeing the few remaining lines of git-tar-tree code easier. ;-) Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-09-24 17:31:10 +02:00			`{`
archive: centralize archive entry writing Add the exported function write_archive_entries() to archive.c, which uses the new ability of read_tree_recursive() to pass a context pointer to its callback in order to centralize previously duplicated code. The new callback function write_archive_entry() does the work that every archiver backend needs to do: loading file contents, entering subdirectories, handling file attributes, constructing the full path of the entry. All that done, it calls the backend specific write_archive_entry_fn_t function. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-14 21:22:24 +02:00			`int err = 0;`
git-tar-tree: Move code for git-archive --format=tar to archive-tar.c This patch doesn't change any functionality, it only moves code around. It makes seeing the few remaining lines of git-tar-tree code easier. ;-) Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-09-24 17:31:10 +02:00
archive-tar: drop return value We never do any error checks, and so never return anything but "0". Let's just drop this to simplify the code. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-06-30 11:09:26 +02:00			`write_global_extended_header(args);`
			`err = write_archive_entries(args, write_tar_entry);`
archive: centralize archive entry writing Add the exported function write_archive_entries() to archive.c, which uses the new ability of read_tree_recursive() to pass a context pointer to its callback in order to centralize previously duplicated code. The new callback function write_archive_entry() does the work that every archiver backend needs to do: loading file contents, entering subdirectories, handling file attributes, constructing the full path of the entry. All that done, it calls the backend specific write_archive_entry_fn_t function. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-14 21:22:24 +02:00			`if (!err)`
			`write_trailer();`
			`return err;`
git-tar-tree: Move code for git-archive --format=tar to archive-tar.c This patch doesn't change any functionality, it only moves code around. It makes seeing the few remaining lines of git-tar-tree code easier. ;-) Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-09-24 17:31:10 +02:00			`}`
archive: refactor list of archive formats Most of the tar and zip code was nicely split out into two abstracted files which knew only about their specific formats. The entry point to this code was a single "write archive" function. However, as these basic formats grow more complex (e.g., by handling multiple file extensions and format names), a static list of the entry point functions won't be enough. Instead, let's provide a way for the tar and zip code to tell the main archive code what they support by registering archiver names and functions. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-06-22 03:23:33 +02:00
			`static int write_tar_filter_archive(const struct archiver *ar,`
			`struct archiver_args *args)`
			`{`
			`struct strbuf cmd = STRBUF_INIT;`
run-command: introduce CHILD_PROCESS_INIT Most struct child_process variables are cleared using memset first after declaration. Provide a macro, CHILD_PROCESS_INIT, that can be used to initialize them statically instead. That's shorter, doesn't require a function call and is slightly more readable (especially given that we already have STRBUF_INIT, ARGV_ARRAY_INIT etc.). Helped-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-08-19 21:09:35 +02:00			`struct child_process filter = CHILD_PROCESS_INIT;`
			`const char *argv[2];`
			`int r;`

			`if (!ar->data)`
			`die("BUG: tar-filter archiver called with no filter defined");`

			`strbuf_addstr(&cmd, ar->data);`
			`if (args->compression_level >= 0)`
			`strbuf_addf(&cmd, " -%d", args->compression_level);`

			`argv[0] = cmd.buf;`
			`argv[1] = NULL;`
			`filter.argv = argv;`
			`filter.use_shell = 1;`
			`/* in = -1 asks start_command() to create a pipe feeding the filter's stdin */`
			`filter.in = -1;`

			`if (start_command(&filter) < 0)`
			`die_errno("unable to start '%s' filter", argv[0]);`
			`/* point our own stdout at the write end of that pipe */`
			`close(1);`
			`if (dup2(filter.in, 1) < 0)`
			`die_errno("unable to redirect descriptor");`
			`close(filter.in);`

			`r = write_tar_archive(ar, args);`

			`/* closing stdout gives the filter EOF so it can finish */`
			`close(1);`
			`if (finish_command(&filter) != 0)`
			`die("'%s' filter reported error", argv[0]);`

			`strbuf_release(&cmd);`
			`return r;`
			`}`
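
The filter wrapper above appends an optional ` -<level>` suffix to the configured command, runs it through the shell with a pipe on its stdin, and treats a non-zero exit as failure. The following is a minimal standalone sketch of that pattern using plain POSIX calls instead of git's run-command API. `build_filter_command` and `pipe_through_filter` are hypothetical names chosen for illustration, and unlike the code above the sketch writes into the pipe directly rather than redirecting descriptor 1.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: produce "<base>" or "<base> -<level>" when
 * level >= 0. Returns 0 on success, -1 if the buffer is too small.
 */
static int build_filter_command(char *out, size_t outlen,
				const char *base, int level)
{
	int n;

	assert(out && base);
	if (level >= 0)
		n = snprintf(out, outlen, "%s -%d", base, level);
	else
		n = snprintf(out, outlen, "%s", base);
	return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}

/*
 * Hypothetical helper: feed buf to "sh -c cmd" on its stdin.
 * Returns the filter's exit status, or -1 on any error.
 * Assumes the filter reads all of its input before exiting.
 */
static int pipe_through_filter(const char *cmd, const char *buf, size_t len)
{
	int fds[2], status;
	pid_t pid;

	assert(cmd && buf);
	if (pipe(fds) < 0)
		return -1;
	pid = fork();
	if (pid < 0) {
		close(fds[0]);
		close(fds[1]);
		return -1;
	}
	if (pid == 0) {
		/* child: the read end of the pipe becomes stdin */
		close(fds[1]);
		if (fds[0] != 0) {
			if (dup2(fds[0], 0) < 0)
				_exit(127);
			close(fds[0]);
		}
		execl("/bin/sh", "sh", "-c", cmd, (char *)NULL);
		_exit(127);
	}
	/* parent: write the payload, then close to signal EOF */
	close(fds[0]);
	while (len > 0) {
		ssize_t w = write(fds[1], buf, len);
		if (w < 0)
			break;
		buf += w;
		len -= (size_t)w;
	}
	close(fds[1]);
	if (waitpid(pid, &status, 0) < 0)
		return -1;
	if (len > 0 || !WIFEXITED(status))
		return -1;
	return WEXITSTATUS(status);
}
```

A caller would build the command first, for example from `"gzip -cn"` and level 9, and then pass the result to `pipe_through_filter` together with the data to compress. Closing the write end before waiting matters: without it the filter never sees end of input.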

			`static struct archiver tar_archiver = {`
			`"tar",`
			`write_tar_archive,`
upload-archive: allow user to turn off filters Some tar filters may be very expensive to run, so sites do not want to expose them via upload-archive. This patch lets users configure tar.<filter>.remote to turn them off. By default, gzip filters are left on, as they are about as expensive as creating zip archives. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-06-22 05:17:35 +02:00			`ARCHIVER_REMOTE`
			`};`

			`void init_tar_archiver(void)`
			`{`
			`int i;`
			`register_archiver(&tar_archiver);`
archive: provide builtin .tar.gz filter This works exactly as if the user had configured it via: [tar "tgz"] command = gzip -cn [tar "tar.gz"] command = gzip -cn but since it is so common, it's convenient to have it builtin without the user needing to do anything. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-06-22 03:27:35 +02:00			`tar_filter_config("tar.tgz.command", "gzip -cn", NULL);`
			`tar_filter_config("tar.tgz.remote", "true", NULL);`
			`tar_filter_config("tar.tar.gz.command", "gzip -cn", NULL);`
			`tar_filter_config("tar.tar.gz.remote", "true", NULL);`
			`git_config(git_tar_config, NULL);`
			`for (i = 0; i < nr_tar_filters; i++) {`
			`/* omit any filters that never had a command configured */`
			`if (tar_filters[i]->data)`
			`register_archiver(tar_filters[i]);`
			`}`
			`}`
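
`init_tar_archiver()` above registers the plain tar format, seeds the builtin gzip filters, reads the user's configuration, and then registers only those filters that ended up with a command. The following toy sketch covers just that register-if-configured step. `struct toy_archiver`, `toy_register`, `toy_register_filters` and `toy_lookup` are hypothetical stand-ins for illustration, not git's types or functions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_ARCHIVERS 8

struct toy_archiver {
	const char *name;
	const char *command;	/* NULL: no filter command configured */
};

static const struct toy_archiver *toy_registry[TOY_MAX_ARCHIVERS];
static size_t toy_nr;

/* Add one archiver to the registry; -1 when the table is full. */
static int toy_register(const struct toy_archiver *ar)
{
	assert(ar && ar->name);
	if (toy_nr >= TOY_MAX_ARCHIVERS)
		return -1;
	toy_registry[toy_nr++] = ar;
	return 0;
}

/* Register every filter that has a command; return how many were added. */
static size_t toy_register_filters(const struct toy_archiver *filters,
				   size_t nr)
{
	size_t i, added = 0;

	for (i = 0; i < nr; i++) {
		/* omit any filter that never had a command configured */
		if (!filters[i].command)
			continue;
		if (toy_register(&filters[i]) == 0)
			added++;
	}
	return added;
}

/* Find a registered archiver by name, or NULL if it is unknown. */
static const struct toy_archiver *toy_lookup(const char *name)
{
	size_t i;

	for (i = 0; i < toy_nr; i++)
		if (!strcmp(toy_registry[i]->name, name))
			return toy_registry[i];
	return NULL;
}
```

With this shape, a filter entry that only carries a name never becomes selectable as a format, which is the effect of the `data` check in the loop above.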