
mirrors/git

mirror of https://github.com/git/git.git synced 2024-11-05 08:47:56 +01:00

974 lines

21 KiB

C


Start of early patch applicator tools for git. I looked a bit at my old BK tools for the same thing, but they were just so horrid in many ways that I largely rewrote it all and these tools do things a bit differently. Instead of aggressively piping data from one process to another (which was clever but very hard to follow), this first just splits out the mbox into many smaller email files, and then does some scripts on these temporary files. 2005-04-12 08:46:50 +02:00			`/*`
			`* Another stupid program, this one parsing the headers of an`
			`* email to figure out authorship and subject`
			`*/`
mailinfo: Use i18n.commitencoding This uses i18n.commitencoding configuration item to pick up the default commit encoding for the repository when converting form e-mail encoding to commit encoding (the default is utf8). Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-11-28 01:29:38 +01:00			`#include "cache.h"`
Make git-mailinfo a builtin [jc: with a bit of constness tightening] Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-06-13 22:21:50 +02:00			`#include "builtin.h"`
Move encoding conversion routine out of mailinfo to utf8.c This moves the body of convert_to_utf8() routine used in mailinfo to the utf8.c i18n library. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-12-24 08:36:55 +01:00			`#include "utf8.h"`
git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`#include "strbuf.h"`
Make git-mailinfo a builtin [jc: with a bit of constness tightening] Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-06-13 22:21:50 +02:00			`static FILE *cmitmsg, *patchfile, *fin, *fout;`
remove unnecessary initializations [jc: I needed to hand merge the changes to the updated codebase, so the result needs to be checked.] Signed-off-by: David Rientjes <rientjes@google.com> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-08-15 19:23:48 +02:00			`static int keep_subject;`
			`static const char *metainfo_charset;`
git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`static struct strbuf line = STRBUF_INIT;`
			`static struct strbuf name = STRBUF_INIT;`
			`static struct strbuf email = STRBUF_INIT;`
mailinfo and applymbox updates This attempts to minimally cope with a subset of MIME "features" often seen in patches sent to our mailing lists. Namely: - People's name spelled in characters outside ASCII (both on From: header and the signed-off-by line). - Content-transfer-encoding using quoted-printable (both in multipart and non-multipart messages). These MIME features are detected and decoded by "git mailinfo". Optionally, with the '-u' flag, the output to .info and .msg is transliterated from its original chaset to utf-8. This is to encourage people to use utf8 in their commit messages for interoperability. Applymbox accepts additional flag '-u' which is passed to mailinfo. Signed-off-by: Junio C Hamano / 濱野純 <junkio@cox.net> 2005-08-28 21:33:16 +02:00			`static enum {`
			`TE_DONTCARE, TE_QP, TE_BASE64,`
			`} transfer_encoding;`
builtin-mailinfo.c infrastrcture changes I am working on a project that required parsing through regular mboxes that didn't necessarily have patches embedded in them. I started by creating my own modified copy of git-am and working from there. Very quickly, I noticed git-mailinfo wasn't able to handle a big chunk of my email. After hacking up numerous solutions and running into more limitations, I decided it was just easier to rewrite a big chunk of it. The following patch has a bunch of fixes and features that I needed in order for me do what I wanted. Note: I'm didn't follow any email rfc papers but I don't think any of the changes I did required much knowledge (besides the boundary stuff). List of major changes/fixes: - can't create empty patch files fix - empty patch files don't fail, this failure will come inside git-am - multipart boundaries are now handled - only output inbody headers if a patch exists otherwise assume those headers are part of the reply and instead output the original headers - decode and filter base64 patches correctly - various other accidental fixes I believe I didn't break any existing functionality or compatibility (other than what I describe above, which is really only the empty patch file). I tested this through various mailing list archives and everything seemed to parse correctly (a couple thousand emails). [jc: squashed in another patch from Don's five patch series to fix the test case, as this patch exposes the bug in the test.] Signed-off-by: Don Zickus <dzickus@redhat.com> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-12 20:52:04 +01:00			`static enum {`
			`TYPE_TEXT, TYPE_OTHER,`
			`} message_type;`
git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`static struct strbuf charset = STRBUF_INIT;`
remove unnecessary initializations [jc: I needed to hand merge the changes to the updated codebase, so the result needs to be checked.] Signed-off-by: David Rientjes <rientjes@google.com> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-08-15 19:23:48 +02:00			`static int patch_lines;`
git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`static struct strbuf **p_hdr_data, **s_hdr_data;`
			`#define MAX_HDR_PARSED 10`
			`#define MAX_BOUNDARIES 5`
mailinfo: cleanup extra spaces for complex 'From:' currently for cases like From: A U Thor <a.u.thor@example.com> (Comment) mailinfo extracts the following 'Author:' field: Author: A U Thor (Comment) ^^ which has two extra spaces left in there after removed email part. I think this is wrong so here is a fix. Signed-off-by: Kirill Smelkov <kirr@landau.phys.spbu.ru> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-02-01 18:45:05 +01:00			`static void cleanup_space(struct strbuf *sb);`


git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`static void get_sane_name(struct strbuf *out, struct strbuf *name, struct strbuf *email)`
Start of early patch applicator tools for git. I looked a bit at my old BK tools for the same thing, but they were just so horrid in many ways that I largely rewrote it all and these tools do things a bit differently. Instead of aggressively piping data from one process to another (which was clever but very hard to follow), this first just splits out the mbox into many smaller email files, and then does some scripts on these temporary files. 2005-04-12 08:46:50 +02:00			`{`
git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`struct strbuf *src = name;`
			`if (name->len < 3 || 60 < name->len || strchr(name->buf, '@') ||`
			`strchr(name->buf, '<') || strchr(name->buf, '>'))`
			`src = email;`
			`else if (name == out)`
			`return;`
			`strbuf_reset(out);`
			`strbuf_addbuf(out, src);`
Start of early patch applicator tools for git. I looked a bit at my old BK tools for the same thing, but they were just so horrid in many ways that I largely rewrote it all and these tools do things a bit differently. Instead of aggressively piping data from one process to another (which was clever but very hard to follow), this first just splits out the mbox into many smaller email files, and then does some scripts on these temporary files. 2005-04-12 08:46:50 +02:00			`}`
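
The rule that `get_sane_name()` applies above can be restated without the strbuf machinery. The following is a minimal sketch, assuming plain NUL-terminated strings; `pick_sane_name` is a hypothetical helper name and is not part of git. It only shows which of the two inputs would be chosen, not the in-place copy into `out`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical standalone restatement of the check in get_sane_name().
 * Returns the string that should be used as the author name: the name
 * itself, or the email when the name is shorter than 3 bytes, longer
 * than 60 bytes, or contains '@', '<' or '>'.
 */
static const char *pick_sane_name(const char *name, const char *email)
{
	size_t len;

	assert(name && email);
	len = strlen(name);

	if (len < 3 || 60 < len ||
	    strchr(name, '@') || strchr(name, '<') || strchr(name, '>'))
		return email;
	return name;
}
```

Under these assumptions, `pick_sane_name("John Doe", "jd@example.com")` yields the name, while a two-character name or one that itself looks like an address falls back to the email.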

git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`static void parse_bogus_from(const struct strbuf *line)`
mailinfo and git-am: allow "John Doe <johndoe>" An isolated developer could have a local-only e-mail, which will be stripped out by mailinfo because it lacks '@'. Define a fallback parser to accomodate that. At the same time, reject authorless patch in git-am. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-12-15 01:31:06 +01:00			`{`
			`/* John Doe <johndoe> */`

git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`char *bra, *ket;`
mailinfo and git-am: allow "John Doe <johndoe>" An isolated developer could have a local-only e-mail, which will be stripped out by mailinfo because it lacks '@'. Define a fallback parser to accomodate that. At the same time, reject authorless patch in git-am. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-12-15 01:31:06 +01:00			`/* This is fallback, so do not bother if we already have an`
			`* e-mail address.`
Make git-mailinfo a builtin [jc: with a bit of constness tightening] Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-06-13 22:21:50 +02:00			`*/`
git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`if (email.len)`
			`return;`
git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`bra = strchr(line->buf, '<');`
mailinfo and git-am: allow "John Doe <johndoe>" An isolated developer could have a local-only e-mail, which will be stripped out by mailinfo because it lacks '@'. Define a fallback parser to accomodate that. At the same time, reject authorless patch in git-am. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-12-15 01:31:06 +01:00			`if (!bra)`
git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`return;`
mailinfo and git-am: allow "John Doe <johndoe>" An isolated developer could have a local-only e-mail, which will be stripped out by mailinfo because it lacks '@'. Define a fallback parser to accomodate that. At the same time, reject authorless patch in git-am. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-12-15 01:31:06 +01:00			`ket = strchr(bra, '>');`
			`if (!ket)`
git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`return;`
git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`strbuf_reset(&email);`
			`strbuf_add(&email, bra + 1, ket - bra - 1);`

			`strbuf_reset(&name);`
			`strbuf_add(&name, line->buf, bra - line->buf);`
			`strbuf_trim(&name);`
			`get_sane_name(&name, &name, &email);`
mailinfo and git-am: allow "John Doe <johndoe>" An isolated developer could have a local-only e-mail, which will be stripped out by mailinfo because it lacks '@'. Define a fallback parser to accomodate that. At the same time, reject authorless patch in git-am. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-12-15 01:31:06 +01:00			`}`
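
The fallback described in the commit message (accepting `John Doe <johndoe>`, where the address has no `@`) can be sketched on its own. This is an illustration under stated assumptions, not the code git uses: `split_bogus_from` and its fixed-size output buffers are hypothetical, and the sketch skips the final `get_sane_name()` pass and the "already have an email" guard that the original performs.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical standalone sketch of the parse_bogus_from() fallback:
 * split "John Doe <johndoe>" into a name and a local-only address.
 * Returns 1 on success, 0 when there is no <...> pair or when one of
 * the caller's buffers is too small.
 */
static int split_bogus_from(const char *line,
			    char *name, size_t name_sz,
			    char *email, size_t email_sz)
{
	const char *bra, *ket, *end;
	size_t n;

	assert(line && name && email);

	bra = strchr(line, '<');
	if (!bra)
		return 0;
	ket = strchr(bra, '>');
	if (!ket)
		return 0;

	/* The address is whatever sits between '<' and '>'. */
	n = (size_t)(ket - bra - 1);
	if (n >= email_sz)
		return 0;
	memcpy(email, bra + 1, n);
	email[n] = '\0';

	/* The name is the text before '<', trimmed at both ends. */
	while (line < bra && isspace((unsigned char)*line))
		line++;
	end = bra;
	while (end > line && isspace((unsigned char)end[-1]))
		end--;
	n = (size_t)(end - line);
	if (n >= name_sz)
		return 0;
	memcpy(name, line, n);
	name[n] = '\0';
	return 1;
}
```

The original avoids the buffer-size checks entirely by growing strbufs on demand; the fixed buffers here only keep the sketch free of dependencies.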

git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`static void handle_from(const struct strbuf *from)`
Start of early patch applicator tools for git. I looked a bit at my old BK tools for the same thing, but they were just so horrid in many ways that I largely rewrote it all and these tools do things a bit differently. Instead of aggressively piping data from one process to another (which was clever but very hard to follow), this first just splits out the mbox into many smaller email files, and then does some scripts on these temporary files. 2005-04-12 08:46:50 +02:00			`{`
Allow in body headers beyond the in body header prefix. - handle_from is fixed to not mangle it's input line. - Then handle_inbody_header is allowed to look in the body of a commit message for additional headers that we haven't already seen. This allows patches with all of the right information in unfortunate places to be imported. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-23 21:58:36 +02:00			`char *at;`
git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`size_t el;`
			`struct strbuf f;`
git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`strbuf_init(&f, from->len);`
			`strbuf_addbuf(&f, from);`

			`at = strchr(f.buf, '@');`
			`if (!at) {`
			`parse_bogus_from(from);`
			`return;`
			`}`
			`/*`
			`* If we already have one email, don't take any confusing lines`
			`*/`
git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`if (email.len && strchr(at + 1, '@')) {`
			`strbuf_release(&f);`
			`return;`
			`}`
mailinfo and applymbox updates This attempts to minimally cope with a subset of MIME "features" often seen in patches sent to our mailing lists. Namely: - People's name spelled in characters outside ASCII (both on From: header and the signed-off-by line). - Content-transfer-encoding using quoted-printable (both in multipart and non-multipart messages). These MIME features are detected and decoded by "git mailinfo". Optionally, with the '-u' flag, the output to .info and .msg is transliterated from its original chaset to utf-8. This is to encourage people to use utf8 in their commit messages for interoperability. Applymbox accepts additional flag '-u' which is passed to mailinfo. Signed-off-by: Junio C Hamano / 濱野純 <junkio@cox.net> 2005-08-28 21:33:16 +02:00			`/* Pick up the string around '@', possibly delimited with <>`
git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`* pair; that is the email part.`
mailinfo and applymbox updates This attempts to minimally cope with a subset of MIME "features" often seen in patches sent to our mailing lists. Namely: - People's name spelled in characters outside ASCII (both on From: header and the signed-off-by line). - Content-transfer-encoding using quoted-printable (both in multipart and non-multipart messages). These MIME features are detected and decoded by "git mailinfo". Optionally, with the '-u' flag, the output to .info and .msg is transliterated from its original chaset to utf-8. This is to encourage people to use utf8 in their commit messages for interoperability. Applymbox accepts additional flag '-u' which is passed to mailinfo. Signed-off-by: Junio C Hamano / 濱野純 <junkio@cox.net> 2005-08-28 21:33:16 +02:00			`*/`
git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`while (at > f.buf) {`
Start of early patch applicator tools for git. I looked a bit at my old BK tools for the same thing, but they were just so horrid in many ways that I largely rewrote it all and these tools do things a bit differently. Instead of aggressively piping data from one process to another (which was clever but very hard to follow), this first just splits out the mbox into many smaller email files, and then does some scripts on these temporary files. 2005-04-12 08:46:50 +02:00			`char c = at[-1];`
mailinfo and applymbox updates This attempts to minimally cope with a subset of MIME "features" often seen in patches sent to our mailing lists. Namely: - People's name spelled in characters outside ASCII (both on From: header and the signed-off-by line). - Content-transfer-encoding using quoted-printable (both in multipart and non-multipart messages). These MIME features are detected and decoded by "git mailinfo". Optionally, with the '-u' flag, the output to .info and .msg is transliterated from its original chaset to utf-8. This is to encourage people to use utf8 in their commit messages for interoperability. Applymbox accepts additional flag '-u' which is passed to mailinfo. Signed-off-by: Junio C Hamano / 濱野純 <junkio@cox.net> 2005-08-28 21:33:16 +02:00			`if (isspace(c))`
			`break;`
			`if (c == '<') {`
			`at[-1] = ' ';`
Start of early patch applicator tools for git. I looked a bit at my old BK tools for the same thing, but they were just so horrid in many ways that I largely rewrote it all and these tools do things a bit differently. Instead of aggressively piping data from one process to another (which was clever but very hard to follow), this first just splits out the mbox into many smaller email files, and then does some scripts on these temporary files. 2005-04-12 08:46:50 +02:00			`break;`
mailinfo and applymbox updates This attempts to minimally cope with a subset of MIME "features" often seen in patches sent to our mailing lists. Namely: - People's name spelled in characters outside ASCII (both on From: header and the signed-off-by line). - Content-transfer-encoding using quoted-printable (both in multipart and non-multipart messages). These MIME features are detected and decoded by "git mailinfo". Optionally, with the '-u' flag, the output to .info and .msg is transliterated from its original chaset to utf-8. This is to encourage people to use utf8 in their commit messages for interoperability. Applymbox accepts additional flag '-u' which is passed to mailinfo. Signed-off-by: Junio C Hamano / 濱野純 <junkio@cox.net> 2005-08-28 21:33:16 +02:00			`}`
Start of early patch applicator tools for git. I looked a bit at my old BK tools for the same thing, but they were just so horrid in many ways that I largely rewrote it all and these tools do things a bit differently. Instead of aggressively piping data from one process to another (which was clever but very hard to follow), this first just splits out the mbox into many smaller email files, and then does some scripts on these temporary files. 2005-04-12 08:46:50 +02:00			`at--;`
			`}`
git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`el = strcspn(at, " \n\t\r\v\f>");`
			`strbuf_reset(&email);`
			`strbuf_add(&email, at, el);`
mailinfo: avoid violating strbuf assertion In handle_from, we calculate the end boundary of a section to remove from a strbuf using strcspn like this: el = strcspn(buf, set_of_end_boundaries); strbuf_remove(&sb, start, el + 1); This works fine if "el" is the offset of the boundary character, meaning we remove up to and including that character. But if the end boundary didn't match (that is, we hit the end of the string as the boundary instead) then we want just "el". Asking for "el+1" caught an out-of-bounds assertion in the strbuf library. This manifested itself when we got a 'From' header that had just an email address with nothing else in it (the end of the string was the end of the address, rather than, e.g., a trailing '>' character), causing git-mailinfo to barf. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-08-19 19:28:24 +02:00			`strbuf_remove(&f, at - f.buf, el + (at[el] ? 1 : 0));`
mailinfo: cleanup extra spaces for complex 'From:' currently for cases like From: A U Thor <a.u.thor@example.com> (Comment) mailinfo extracts the following 'Author:' field: Author: A U Thor (Comment) ^^ which has two extra spaces left in there after removed email part. I think this is wrong so here is a fix. Signed-off-by: Kirill Smelkov <kirr@landau.phys.spbu.ru> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-02-01 18:45:05 +01:00			`/* The remainder is name. It could be`
			`*`
			`* - "John Doe <john.doe@xz>" (a), or`
			`* - "john.doe@xz (John Doe)" (b), or`
			`* - "John (zzz) Doe <john.doe@xz> (Comment)" (c)`
			`*`
			`* but we have removed the email part, so`
			`*`
			`* - remove extra spaces which could stay after email (case 'c'), and`
			`* - trim from both ends, possibly removing the () pair at the end`
			`* (cases 'a' and 'b').`
mailinfo and applymbox updates This attempts to minimally cope with a subset of MIME "features" often seen in patches sent to our mailing lists. Namely: - People's name spelled in characters outside ASCII (both on From: header and the signed-off-by line). - Content-transfer-encoding using quoted-printable (both in multipart and non-multipart messages). These MIME features are detected and decoded by "git mailinfo". Optionally, with the '-u' flag, the output to .info and .msg is transliterated from its original chaset to utf-8. This is to encourage people to use utf8 in their commit messages for interoperability. Applymbox accepts additional flag '-u' which is passed to mailinfo. Signed-off-by: Junio C Hamano / 濱野純 <junkio@cox.net> 2005-08-28 21:33:16 +02:00			`*/`
mailinfo: cleanup extra spaces for complex 'From:' currently for cases like From: A U Thor <a.u.thor@example.com> (Comment) mailinfo extracts the following 'Author:' field: Author: A U Thor (Comment) ^^ which has two extra spaces left in there after removed email part. I think this is wrong so here is a fix. Signed-off-by: Kirill Smelkov <kirr@landau.phys.spbu.ru> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-02-01 18:45:05 +01:00			`cleanup_space(&f);`
git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`strbuf_trim(&f);`
mailinfo: better parse email adresses containg parentheses When using git-rebase, author fields containing a ')' at the last position had the close-parens character removed; the removal should be done only when it is of this form: user@host (User Name) i.e. the remainder after stripping the e-mail address part is enclosed in a parentheses pair as a whole, not for addresses like this: User Name (me) <user@host> Signed-off-by: Philippe Bruhat (BooK) <book@cpan.org> Acked-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-21 15:34:29 +02:00			`if (f.buf[0] == '(' && f.len && f.buf[f.len - 1] == ')') {`
			`strbuf_remove(&f, 0, 1);`
git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`strbuf_setlen(&f, f.len - 1);`
mailinfo: better parse email adresses containg parentheses When using git-rebase, author fields containing a ')' at the last position had the close-parens character removed; the removal should be done only when it is of this form: user@host (User Name) i.e. the remainder after stripping the e-mail address part is enclosed in a parentheses pair as a whole, not for addresses like this: User Name (me) <user@host> Signed-off-by: Philippe Bruhat (BooK) <book@cpan.org> Acked-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-21 15:34:29 +02:00			`}`
git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`get_sane_name(&name, &f, &email);`
			`strbuf_release(&f);`
Start of early patch applicator tools for git. I looked a bit at my old BK tools for the same thing, but they were just so horrid in many ways that I largely rewrote it all and these tools do things a bit differently. Instead of aggressively piping data from one process to another (which was clever but very hard to follow), this first just splits out the mbox into many smaller email files, and then does some scripts on these temporary files. 2005-04-12 08:46:50 +02:00			`}`
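
The parenthesis rule described in the commit message above can be sketched in isolation. This is a minimal illustration on a plain C string, not the strbuf-based code itself; `strip_enclosing_parens` is a hypothetical name that does not appear in mailinfo.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of mailinfo): strip one enclosing pair of
 * parentheses from buf in place, but only when the string both starts
 * with '(' and ends with ')'. Returns 1 if a pair was removed, else 0.
 */
static int strip_enclosing_parens(char *buf)
{
	size_t len = strlen(buf);

	if (len < 2 || buf[0] != '(' || buf[len - 1] != ')')
		return 0;
	memmove(buf, buf + 1, len - 2);	/* drop the leading '(' */
	buf[len - 2] = '\0';		/* drop the trailing ')' */
	return 1;
}
```

With this rule, the remainder `(User Name)` becomes `User Name`, while `User Name (me)` is left untouched because it does not start with `(`.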

git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`static void handle_header(struct strbuf **out, const struct strbuf *line)`
Get AUTHOR_DATE from the email Date: line Now that git does pretty reliable date parsing, we might as well get the date from the email itself. Of course, it's still questionable whether the date on the email is all that relevant, but it's certainly no worse than taking the commit date. 2005-05-02 06:42:53 +02:00			`{`
git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`if (!*out) {`
			`*out = xmalloc(sizeof(struct strbuf));`
			`strbuf_init(*out, line->len);`
			`} else`
			`strbuf_reset(*out);`
builtin-mailinfo.c infrastrcture changes I am working on a project that required parsing through regular mboxes that didn't necessarily have patches embedded in them. I started by creating my own modified copy of git-am and working from there. Very quickly, I noticed git-mailinfo wasn't able to handle a big chunk of my email. After hacking up numerous solutions and running into more limitations, I decided it was just easier to rewrite a big chunk of it. The following patch has a bunch of fixes and features that I needed in order for me do what I wanted. Note: I'm didn't follow any email rfc papers but I don't think any of the changes I did required much knowledge (besides the boundary stuff). List of major changes/fixes: - can't create empty patch files fix - empty patch files don't fail, this failure will come inside git-am - multipart boundaries are now handled - only output inbody headers if a patch exists otherwise assume those headers are part of the reply and instead output the original headers - decode and filter base64 patches correctly - various other accidental fixes I believe I didn't break any existing functionality or compatibility (other than what I describe above, which is really only the empty patch file). I tested this through various mailing list archives and everything seemed to parse correctly (a couple thousand emails). [jc: squashed in another patch from Don's five patch series to fix the test case, as this patch exposes the bug in the test.] Signed-off-by: Don Zickus <dzickus@redhat.com> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-12 20:52:04 +01:00
git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`strbuf_addbuf(*out, line);`
mailinfo and applymbox updates This attempts to minimally cope with a subset of MIME "features" often seen in patches sent to our mailing lists. Namely: - People's name spelled in characters outside ASCII (both on From: header and the signed-off-by line). - Content-transfer-encoding using quoted-printable (both in multipart and non-multipart messages). These MIME features are detected and decoded by "git mailinfo". Optionally, with the '-u' flag, the output to .info and .msg is transliterated from its original chaset to utf-8. This is to encourage people to use utf8 in their commit messages for interoperability. Applymbox accepts additional flag '-u' which is passed to mailinfo. Signed-off-by: Junio C Hamano / 濱野純 <junkio@cox.net> 2005-08-28 21:33:16 +02:00			`}`

			`/* NOTE NOTE NOTE. We do not claim we do full MIME. We just attempt`
			`* to have enough heuristics to grok MIME encoded patches often found`
			`* on our mailing lists. For example, we do not even treat header lines`
			`* case insensitively.`
			`*/`

git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`static int slurp_attr(const char *line, const char *name, struct strbuf *attr)`
mailinfo and applymbox updates This attempts to minimally cope with a subset of MIME "features" often seen in patches sent to our mailing lists. Namely: - People's name spelled in characters outside ASCII (both on From: header and the signed-off-by line). - Content-transfer-encoding using quoted-printable (both in multipart and non-multipart messages). These MIME features are detected and decoded by "git mailinfo". Optionally, with the '-u' flag, the output to .info and .msg is transliterated from its original chaset to utf-8. This is to encourage people to use utf8 in their commit messages for interoperability. Applymbox accepts additional flag '-u' which is passed to mailinfo. Signed-off-by: Junio C Hamano / 濱野純 <junkio@cox.net> 2005-08-28 21:33:16 +02:00			`{`
Make some strings const Signed-off-by: Timo Hirvonen <tihirvon@gmail.com> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-06-28 11:04:39 +02:00			`const char *ends, *ap = strcasestr(line, name);`
mailinfo and applymbox updates This attempts to minimally cope with a subset of MIME "features" often seen in patches sent to our mailing lists. Namely: - People's name spelled in characters outside ASCII (both on From: header and the signed-off-by line). - Content-transfer-encoding using quoted-printable (both in multipart and non-multipart messages). These MIME features are detected and decoded by "git mailinfo". Optionally, with the '-u' flag, the output to .info and .msg is transliterated from its original chaset to utf-8. This is to encourage people to use utf8 in their commit messages for interoperability. Applymbox accepts additional flag '-u' which is passed to mailinfo. Signed-off-by: Junio C Hamano / 濱野純 <junkio@cox.net> 2005-08-28 21:33:16 +02:00			`size_t sz;`

			`if (!ap) {`
git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`strbuf_setlen(attr, 0);`
mailinfo and applymbox updates This attempts to minimally cope with a subset of MIME "features" often seen in patches sent to our mailing lists. Namely: - People's name spelled in characters outside ASCII (both on From: header and the signed-off-by line). - Content-transfer-encoding using quoted-printable (both in multipart and non-multipart messages). These MIME features are detected and decoded by "git mailinfo". Optionally, with the '-u' flag, the output to .info and .msg is transliterated from its original chaset to utf-8. This is to encourage people to use utf8 in their commit messages for interoperability. Applymbox accepts additional flag '-u' which is passed to mailinfo. Signed-off-by: Junio C Hamano / 濱野純 <junkio@cox.net> 2005-08-28 21:33:16 +02:00			`return 0;`
			`}`
			`ap += strlen(name);`
			`if (*ap == '"') {`
			`ap++;`
			`ends = "\"";`
			`}`
			`else`
			`ends = "; \t";`
			`sz = strcspn(ap, ends);`
git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`strbuf_add(attr, ap, sz);`
mailinfo and applymbox updates This attempts to minimally cope with a subset of MIME "features" often seen in patches sent to our mailing lists. Namely: - People's name spelled in characters outside ASCII (both on From: header and the signed-off-by line). - Content-transfer-encoding using quoted-printable (both in multipart and non-multipart messages). These MIME features are detected and decoded by "git mailinfo". Optionally, with the '-u' flag, the output to .info and .msg is transliterated from its original chaset to utf-8. This is to encourage people to use utf8 in their commit messages for interoperability. Applymbox accepts additional flag '-u' which is passed to mailinfo. Signed-off-by: Junio C Hamano / 濱野純 <junkio@cox.net> 2005-08-28 21:33:16 +02:00			`return 1;`
			`}`
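
The attribute extraction in `slurp_attr()` can be tried outside git with a small stand-in. This is a sketch under stated assumptions: `copy_attr` is a hypothetical name, it writes into a caller-supplied buffer instead of a strbuf, and it uses the standard `strstr()` (case-sensitive) rather than `strcasestr()`, which is not part of ISO C.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for slurp_attr(): copy the value that follows
 * name in line into out. A quoted value ends at the closing quote; an
 * unquoted value ends at ';', space or tab.
 * Returns 1 on success, 0 if name is absent or out is too small.
 */
static int copy_attr(const char *line, const char *name,
		     char *out, size_t outsz)
{
	const char *ends;
	const char *ap = strstr(line, name);
	size_t sz;

	if (outsz == 0)
		return 0;
	out[0] = '\0';
	if (!ap)
		return 0;
	ap += strlen(name);
	if (*ap == '"') {
		ap++;
		ends = "\"";
	} else {
		ends = "; \t";
	}
	sz = strcspn(ap, ends);	/* length of the value */
	if (sz >= outsz)
		return 0;
	memcpy(out, ap, sz);
	out[sz] = '\0';
	return 1;
}
```

For example, calling it with the name `boundary=` on a `Content-Type` line carrying `boundary="=-=-="` yields `=-=-=`, and `charset=` on `text/plain; charset=utf-8; format=flowed` yields `utf-8`.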

git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`static struct strbuf *content[MAX_BOUNDARIES];`
builtin-mailinfo.c infrastrcture changes I am working on a project that required parsing through regular mboxes that didn't necessarily have patches embedded in them. I started by creating my own modified copy of git-am and working from there. Very quickly, I noticed git-mailinfo wasn't able to handle a big chunk of my email. After hacking up numerous solutions and running into more limitations, I decided it was just easier to rewrite a big chunk of it. The following patch has a bunch of fixes and features that I needed in order for me do what I wanted. Note: I'm didn't follow any email rfc papers but I don't think any of the changes I did required much knowledge (besides the boundary stuff). List of major changes/fixes: - can't create empty patch files fix - empty patch files don't fail, this failure will come inside git-am - multipart boundaries are now handled - only output inbody headers if a patch exists otherwise assume those headers are part of the reply and instead output the original headers - decode and filter base64 patches correctly - various other accidental fixes I believe I didn't break any existing functionality or compatibility (other than what I describe above, which is really only the empty patch file). I tested this through various mailing list archives and everything seemed to parse correctly (a couple thousand emails). [jc: squashed in another patch from Don's five patch series to fix the test case, as this patch exposes the bug in the test.] Signed-off-by: Don Zickus <dzickus@redhat.com> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-12 20:52:04 +01:00
git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`static struct strbuf **content_top = content;`
builtin-mailinfo.c infrastrcture changes I am working on a project that required parsing through regular mboxes that didn't necessarily have patches embedded in them. I started by creating my own modified copy of git-am and working from there. Very quickly, I noticed git-mailinfo wasn't able to handle a big chunk of my email. After hacking up numerous solutions and running into more limitations, I decided it was just easier to rewrite a big chunk of it. The following patch has a bunch of fixes and features that I needed in order for me do what I wanted. Note: I'm didn't follow any email rfc papers but I don't think any of the changes I did required much knowledge (besides the boundary stuff). List of major changes/fixes: - can't create empty patch files fix - empty patch files don't fail, this failure will come inside git-am - multipart boundaries are now handled - only output inbody headers if a patch exists otherwise assume those headers are part of the reply and instead output the original headers - decode and filter base64 patches correctly - various other accidental fixes I believe I didn't break any existing functionality or compatibility (other than what I describe above, which is really only the empty patch file). I tested this through various mailing list archives and everything seemed to parse correctly (a couple thousand emails). [jc: squashed in another patch from Don's five patch series to fix the test case, as this patch exposes the bug in the test.] Signed-off-by: Don Zickus <dzickus@redhat.com> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-12 20:52:04 +01:00
git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`static void handle_content_type(struct strbuf *line)`
mailinfo and applymbox updates This attempts to minimally cope with a subset of MIME "features" often seen in patches sent to our mailing lists. Namely: - People's name spelled in characters outside ASCII (both on From: header and the signed-off-by line). - Content-transfer-encoding using quoted-printable (both in multipart and non-multipart messages). These MIME features are detected and decoded by "git mailinfo". Optionally, with the '-u' flag, the output to .info and .msg is transliterated from its original chaset to utf-8. This is to encourage people to use utf8 in their commit messages for interoperability. Applymbox accepts additional flag '-u' which is passed to mailinfo. Signed-off-by: Junio C Hamano / 濱野純 <junkio@cox.net> 2005-08-28 21:33:16 +02:00			`{`
git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`struct strbuf *boundary = xmalloc(sizeof(struct strbuf));`
			`strbuf_init(boundary, line->len);`
builtin-mailinfo.c infrastrcture changes I am working on a project that required parsing through regular mboxes that didn't necessarily have patches embedded in them. I started by creating my own modified copy of git-am and working from there. Very quickly, I noticed git-mailinfo wasn't able to handle a big chunk of my email. After hacking up numerous solutions and running into more limitations, I decided it was just easier to rewrite a big chunk of it. The following patch has a bunch of fixes and features that I needed in order for me do what I wanted. Note: I'm didn't follow any email rfc papers but I don't think any of the changes I did required much knowledge (besides the boundary stuff). List of major changes/fixes: - can't create empty patch files fix - empty patch files don't fail, this failure will come inside git-am - multipart boundaries are now handled - only output inbody headers if a patch exists otherwise assume those headers are part of the reply and instead output the original headers - decode and filter base64 patches correctly - various other accidental fixes I believe I didn't break any existing functionality or compatibility (other than what I describe above, which is really only the empty patch file). I tested this through various mailing list archives and everything seemed to parse correctly (a couple thousand emails). [jc: squashed in another patch from Don's five patch series to fix the test case, as this patch exposes the bug in the test.] Signed-off-by: Don Zickus <dzickus@redhat.com> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-12 20:52:04 +01:00
git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`if (!strcasestr(line->buf, "text/"))`
builtin-mailinfo.c infrastrcture changes I am working on a project that required parsing through regular mboxes that didn't necessarily have patches embedded in them. I started by creating my own modified copy of git-am and working from there. Very quickly, I noticed git-mailinfo wasn't able to handle a big chunk of my email. After hacking up numerous solutions and running into more limitations, I decided it was just easier to rewrite a big chunk of it. The following patch has a bunch of fixes and features that I needed in order for me do what I wanted. Note: I'm didn't follow any email rfc papers but I don't think any of the changes I did required much knowledge (besides the boundary stuff). List of major changes/fixes: - can't create empty patch files fix - empty patch files don't fail, this failure will come inside git-am - multipart boundaries are now handled - only output inbody headers if a patch exists otherwise assume those headers are part of the reply and instead output the original headers - decode and filter base64 patches correctly - various other accidental fixes I believe I didn't break any existing functionality or compatibility (other than what I describe above, which is really only the empty patch file). I tested this through various mailing list archives and everything seemed to parse correctly (a couple thousand emails). [jc: squashed in another patch from Don's five patch series to fix the test case, as this patch exposes the bug in the test.] Signed-off-by: Don Zickus <dzickus@redhat.com> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-12 20:52:04 +01:00			`message_type = TYPE_OTHER;`
git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`if (slurp_attr(line->buf, "boundary=", boundary)) {`
			`strbuf_insert(boundary, 0, "--", 2);`
mailinfo: re-fix MIME multipart boundary parsing Recent changes to is_multipart_boundary() caused git-mailinfo to segfault. The reason was after handling the end of the boundary the code tried to look for another boundary. Because the boundary list was empty, dereferencing the pointer to the top of the boundary caused the program to go boom. The fix is to check to see if the list is empty and if so go on its merry way instead of looking for another boundary. I also fixed a couple of increments and decrements that didn't look correct relating to content_top. The boundary test case was updated to catch future problems like this again. Signed-off-by: Don Zickus <dzickus@redhat.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-08-14 17:35:42 +02:00			`if (++content_top > &content[MAX_BOUNDARIES]) {`
builtin-mailinfo.c infrastrcture changes I am working on a project that required parsing through regular mboxes that didn't necessarily have patches embedded in them. I started by creating my own modified copy of git-am and working from there. Very quickly, I noticed git-mailinfo wasn't able to handle a big chunk of my email. After hacking up numerous solutions and running into more limitations, I decided it was just easier to rewrite a big chunk of it. The following patch has a bunch of fixes and features that I needed in order for me do what I wanted. Note: I'm didn't follow any email rfc papers but I don't think any of the changes I did required much knowledge (besides the boundary stuff). List of major changes/fixes: - can't create empty patch files fix - empty patch files don't fail, this failure will come inside git-am - multipart boundaries are now handled - only output inbody headers if a patch exists otherwise assume those headers are part of the reply and instead output the original headers - decode and filter base64 patches correctly - various other accidental fixes I believe I didn't break any existing functionality or compatibility (other than what I describe above, which is really only the empty patch file). I tested this through various mailing list archives and everything seemed to parse correctly (a couple thousand emails). [jc: squashed in another patch from Don's five patch series to fix the test case, as this patch exposes the bug in the test.] Signed-off-by: Don Zickus <dzickus@redhat.com> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-12 20:52:04 +01:00			`fprintf(stderr, "Too many boundaries to handle\n");`
			`exit(1);`
			`}`
git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`*content_top = boundary;`
			`boundary = NULL;`
mailinfo: barf and exist upon nested multipart. At least we can detect what we do not handle. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-09-07 01:46:34 +02:00			`}`
builtin-mailinfo.c: compare character encodings case insensitively When converting between character encodings, git tests whether the "from" encoding and the "to" encoding have the same name. git should perform this test case insensitively so that e.g. utf-8 is not seen as a different encoding than UTF-8. Additionally, it is not necessary to call tolower() anymore on the encodings extracted from the mail message. Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-05-19 01:44:40 +02:00			`slurp_attr(line->buf, "charset=", &charset);`
git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00
			`if (boundary) {`
			`strbuf_release(boundary);`
			`free(boundary);`
mailinfo and applymbox updates This attempts to minimally cope with a subset of MIME "features" often seen in patches sent to our mailing lists. Namely: - People's name spelled in characters outside ASCII (both on From: header and the signed-off-by line). - Content-transfer-encoding using quoted-printable (both in multipart and non-multipart messages). These MIME features are detected and decoded by "git mailinfo". Optionally, with the '-u' flag, the output to .info and .msg is transliterated from its original chaset to utf-8. This is to encourage people to use utf8 in their commit messages for interoperability. Applymbox accepts additional flag '-u' which is passed to mailinfo. Signed-off-by: Junio C Hamano / 濱野純 <junkio@cox.net> 2005-08-28 21:33:16 +02:00			`}`
			`}`

git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`static void handle_content_transfer_encoding(const struct strbuf *line)`
mailinfo and applymbox updates This attempts to minimally cope with a subset of MIME "features" often seen in patches sent to our mailing lists. Namely: - People's name spelled in characters outside ASCII (both on From: header and the signed-off-by line). - Content-transfer-encoding using quoted-printable (both in multipart and non-multipart messages). These MIME features are detected and decoded by "git mailinfo". Optionally, with the '-u' flag, the output to .info and .msg is transliterated from its original chaset to utf-8. This is to encourage people to use utf8 in their commit messages for interoperability. Applymbox accepts additional flag '-u' which is passed to mailinfo. Signed-off-by: Junio C Hamano / 濱野純 <junkio@cox.net> 2005-08-28 21:33:16 +02:00			`{`
git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`if (strcasestr(line->buf, "base64"))`
mailinfo and applymbox updates This attempts to minimally cope with a subset of MIME "features" often seen in patches sent to our mailing lists. Namely: - People's name spelled in characters outside ASCII (both on From: header and the signed-off-by line). - Content-transfer-encoding using quoted-printable (both in multipart and non-multipart messages). These MIME features are detected and decoded by "git mailinfo". Optionally, with the '-u' flag, the output to .info and .msg is transliterated from its original chaset to utf-8. This is to encourage people to use utf8 in their commit messages for interoperability. Applymbox accepts additional flag '-u' which is passed to mailinfo. Signed-off-by: Junio C Hamano / 濱野純 <junkio@cox.net> 2005-08-28 21:33:16 +02:00			`transfer_encoding = TE_BASE64;`
git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`else if (strcasestr(line->buf, "quoted-printable"))`
mailinfo and applymbox updates This attempts to minimally cope with a subset of MIME "features" often seen in patches sent to our mailing lists. Namely: - People's name spelled in characters outside ASCII (both on From: header and the signed-off-by line). - Content-transfer-encoding using quoted-printable (both in multipart and non-multipart messages). These MIME features are detected and decoded by "git mailinfo". Optionally, with the '-u' flag, the output to .info and .msg is transliterated from its original chaset to utf-8. This is to encourage people to use utf8 in their commit messages for interoperability. Applymbox accepts additional flag '-u' which is passed to mailinfo. Signed-off-by: Junio C Hamano / 濱野純 <junkio@cox.net> 2005-08-28 21:33:16 +02:00			`transfer_encoding = TE_QP;`
			`else`
			`transfer_encoding = TE_DONTCARE;`
Start of early patch applicator tools for git. I looked a bit at my old BK tools for the same thing, but they were just so horrid in many ways that I largely rewrote it all and these tools do things a bit differently. Instead of aggressively piping data from one process to another (which was clever but very hard to follow), this first just splits out the mbox into many smaller email files, and then does some scripts on these temporary files. 2005-04-12 08:46:50 +02:00			`}`

git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`static int is_multipart_boundary(const struct strbuf *line)`
mailinfo and applymbox updates This attempts to minimally cope with a subset of MIME "features" often seen in patches sent to our mailing lists. Namely: - People's name spelled in characters outside ASCII (both on From: header and the signed-off-by line). - Content-transfer-encoding using quoted-printable (both in multipart and non-multipart messages). These MIME features are detected and decoded by "git mailinfo". Optionally, with the '-u' flag, the output to .info and .msg is transliterated from its original chaset to utf-8. This is to encourage people to use utf8 in their commit messages for interoperability. Applymbox accepts additional flag '-u' which is passed to mailinfo. Signed-off-by: Junio C Hamano / 濱野純 <junkio@cox.net> 2005-08-28 21:33:16 +02:00			`{`
mailinfo: fix MIME multi-part message boundary handling After finding a MIME multi-part message boundary line, the handle_body() function is supposed to first flush any accumulated contents from the previous part to the output stream. However, the code mistakenly output the boundary line it found. The old code that used one global, fixed-length buffer line[] used an alternate static buffer newline[] for keeping track of this accumulated contents and flushed newline[] upon seeing the boundary; when 3b6121f (git-mailinfo: use strbuf's instead of fixed buffers, 2008-07-13) converted a fixed-length buffer in this program to use strbuf,these two buffers were converted to "line" and "prev" (the latter of which now has a much more sensible name) strbufs, but the code mistakenly flushed "line" (which contains the boundary we have just found), instead of "prev". This resulted in the first boundary to be output in front of the first line of the message. The rewritten implementation of handle_boundary() lost the terminating newline; this would then result in the second line of the message to be stuck with the first line. The is_multipart_boundary() was designed to catch both the internal boundary and the terminating one (the one with trailing "--"); this also was broken with the rewrite, and the code in the handle_boundary() to handle the terminating boundary was never triggered. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-08-09 10:17:24 +02:00			`return (((*content_top)->len <= line->len) &&`
			`!memcmp(line->buf, (*content_top)->buf, (*content_top)->len));`
mailinfo and applymbox updates This attempts to minimally cope with a subset of MIME "features" often seen in patches sent to our mailing lists. Namely: - People's name spelled in characters outside ASCII (both on From: header and the signed-off-by line). - Content-transfer-encoding using quoted-printable (both in multipart and non-multipart messages). These MIME features are detected and decoded by "git mailinfo". Optionally, with the '-u' flag, the output to .info and .msg is transliterated from its original chaset to utf-8. This is to encourage people to use utf8 in their commit messages for interoperability. Applymbox accepts additional flag '-u' which is passed to mailinfo. Signed-off-by: Junio C Hamano / 濱野純 <junkio@cox.net> 2005-08-28 21:33:16 +02:00			`}`
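
Because `is_multipart_boundary()` is a prefix comparison, it matches both an internal boundary line and the terminating one with trailing `--`. The sketch below shows that idea on plain C strings. Both function names are hypothetical, and the trailing `--` check is an assumption about how a caller might tell the two cases apart, not a copy of `handle_boundary()`.

```c
#include <string.h>

/* Hypothetical helper: does line begin with the boundary string? */
static int starts_with_boundary(const char *line, const char *boundary)
{
	size_t blen = strlen(boundary);

	return strlen(line) >= blen && !memcmp(line, boundary, blen);
}

/* Hypothetical helper: is line the terminating boundary ("<boundary>--")? */
static int is_terminating_boundary(const char *line, const char *boundary)
{
	return starts_with_boundary(line, boundary) &&
	       !strncmp(line + strlen(boundary), "--", 2);
}
```

With the boundary `--abc`, both `--abc` and `--abc--` pass the prefix test, but only `--abc--` is reported as terminating.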

git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`static void cleanup_subject(struct strbuf *subject)`
Start of early patch applicator tools for git. I looked a bit at my old BK tools for the same thing, but they were just so horrid in many ways that I largely rewrote it all and these tools do things a bit differently. Instead of aggressively piping data from one process to another (which was clever but very hard to follow), this first just splits out the mbox into many smaller email files, and then does some scripts on these temporary files. 2005-04-12 08:46:50 +02:00			`{`
git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`char *pos;`
			`size_t remove;`
mailinfo: Remove only one set of square brackets git-format-patch prepends patches with a [PATCH x/n] prefix, but mailinfo used to remove any number of square-bracket pairs and the content between them. This prevents one from using a commit subject like this: [ and ] must be allowed as input Removing the square bracket pair from this rather clumsily constructed subject line loses important information, so we must take care not to. This patch causes the subject stripping to stop after it has encountered one pair of square brackets. One possible downside of this patch is that the patch-handling programs will now fail at removing author-added square-brackets to be removed, such as [RFC][PATCH x/n] However, since format-patch only adds one set of square brackets, this behaviour is quite easily undesrstood and defended while the previous behaviour is not. Signed-off-by: Andreas Ericsson <ae@op5.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-06-29 11:55:51 +02:00			`int brackets_removed = 0;`

git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`while (subject->len) {`
			`switch (*subject->buf) {`
Start of early patch applicator tools for git. I looked a bit at my old BK tools for the same thing, but they were just so horrid in many ways that I largely rewrote it all and these tools do things a bit differently. Instead of aggressively piping data from one process to another (which was clever but very hard to follow), this first just splits out the mbox into many smaller email files, and then does some scripts on these temporary files. 2005-04-12 08:46:50 +02:00			`case 'r': case 'R':`
git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`if (subject->len <= 3)`
			`break;`
			`if (!memcmp(subject->buf + 1, "e:", 2)) {`
			`strbuf_remove(subject, 0, 3);`
Start of early patch applicator tools for git. I looked a bit at my old BK tools for the same thing, but they were just so horrid in many ways that I largely rewrote it all and these tools do things a bit differently. Instead of aggressively piping data from one process to another (which was clever but very hard to follow), this first just splits out the mbox into many smaller email files, and then does some scripts on these temporary files. 2005-04-12 08:46:50 +02:00			`continue;`
			`}`
			`break;`
			`case ' ': case '\t': case ':':`
git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`strbuf_remove(subject, 0, 1);`
Start of early patch applicator tools for git. I looked a bit at my old BK tools for the same thing, but they were just so horrid in many ways that I largely rewrote it all and these tools do things a bit differently. Instead of aggressively piping data from one process to another (which was clever but very hard to follow), this first just splits out the mbox into many smaller email files, and then does some scripts on these temporary files. 2005-04-12 08:46:50 +02:00			`continue;`
			`case '[':`
mailinfo: Remove only one set of square brackets git-format-patch prepends patches with a [PATCH x/n] prefix, but mailinfo used to remove any number of square-bracket pairs and the content between them. This prevents one from using a commit subject like this: [ and ] must be allowed as input Removing the square bracket pair from this rather clumsily constructed subject line loses important information, so we must take care not to. This patch causes the subject stripping to stop after it has encountered one pair of square brackets. One possible downside of this patch is that the patch-handling programs will now fail at removing author-added square-brackets to be removed, such as [RFC][PATCH x/n] However, since format-patch only adds one set of square brackets, this behaviour is quite easily undesrstood and defended while the previous behaviour is not. Signed-off-by: Andreas Ericsson <ae@op5.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-06-29 11:55:51 +02:00			`/* remove only one set of square brackets */`
			`if (brackets_removed)`
			`break;`

git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`if ((pos = strchr(subject->buf, ']'))) {`
mailinfo: off-by-one fix for [PATCH (foobar)] removal from Subject: line A patch title "[PATCH] 1" was sanitized by the original code by stripping the "[PATCH]" from the front, but after the conversion to use strbuf this behaviour was broken due to a counting error. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-17 07:42:04 +02:00			`remove = pos - subject->buf;`
			`if (remove <= (subject->len - remove) * 2) {`
			`strbuf_remove(subject, 0, remove + 1);`
mailinfo: Remove only one set of square brackets git-format-patch prepends patches with a [PATCH x/n] prefix, but mailinfo used to remove any number of square-bracket pairs and the content between them. This prevents one from using a commit subject like this: [ and ] must be allowed as input Removing the square bracket pair from this rather clumsily constructed subject line loses important information, so we must take care not to. This patch causes the subject stripping to stop after it has encountered one pair of square brackets. One possible downside of this patch is that the patch-handling programs will now fail at removing author-added square-brackets to be removed, such as [RFC][PATCH x/n] However, since format-patch only adds one set of square brackets, this behaviour is quite easily undesrstood and defended while the previous behaviour is not. Signed-off-by: Andreas Ericsson <ae@op5.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-06-29 11:55:51 +02:00			`brackets_removed = 1;`
git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`continue;`
			`}`
			`} else`
			`strbuf_remove(subject, 0, 1);`
Start of early patch applicator tools for git. I looked a bit at my old BK tools for the same thing, but they were just so horrid in many ways that I largely rewrote it all and these tools do things a bit differently. Instead of aggressively piping data from one process to another (which was clever but very hard to follow), this first just splits out the mbox into many smaller email files, and then does some scripts on these temporary files. 2005-04-12 08:46:50 +02:00			`break;`
			`}`
git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`strbuf_trim(subject);`
			`return;`
Start of early patch applicator tools for git. I looked a bit at my old BK tools for the same thing, but they were just so horrid in many ways that I largely rewrote it all and these tools do things a bit differently. Instead of aggressively piping data from one process to another (which was clever but very hard to follow), this first just splits out the mbox into many smaller email files, and then does some scripts on these temporary files. 2005-04-12 08:46:50 +02:00			`}`
Make git-mailinfo a builtin [jc: with a bit of constness tightening] Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-06-13 22:21:50 +02:00			`}`
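The subject-stripping loop above can be sketched outside git on a plain NUL-terminated buffer. This is only an illustration under stated assumptions: `drop_front` and `sketch_cleanup_subject` are hypothetical names, and `drop_front` stands in for `strbuf_remove(subject, 0, n)`. The control flow mirrors the listing: leading "Re:", spaces, tabs and colons are dropped repeatedly, but at most one bracketed prefix is removed.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical stand-in for strbuf_remove(sb, 0, n): drop the first n bytes. */
static void drop_front(char *s, size_t n)
{
	size_t len = strlen(s);

	if (n > len)
		n = len;
	memmove(s, s + n, len - n + 1); /* +1 keeps the terminating NUL */
}

/* Sketch of the loop in the listing, operating on a char buffer in place. */
static void sketch_cleanup_subject(char *s)
{
	int brackets_removed = 0;
	size_t len, lead = 0;
	char *pos;

	while ((len = strlen(s)) > 0) {
		switch (*s) {
		case 'r': case 'R':
			if (len <= 3)
				break;
			if (!memcmp(s + 1, "e:", 2)) {
				drop_front(s, 3);
				continue;
			}
			break;
		case ' ': case '\t': case ':':
			drop_front(s, 1);
			continue;
		case '[':
			/* remove only one set of square brackets */
			if (brackets_removed)
				break;
			if ((pos = strchr(s, ']'))) {
				size_t remove = pos - s;

				/* same length heuristic as the listing */
				if (remove <= (len - remove) * 2) {
					drop_front(s, remove + 1);
					brackets_removed = 1;
					continue;
				}
			} else
				drop_front(s, 1);
			break;
		}
		break; /* nothing more to strip */
	}

	/* equivalent of strbuf_trim(): strip trailing, then leading space */
	len = strlen(s);
	while (len && isspace((unsigned char)s[len - 1]))
		s[--len] = '\0';
	while (isspace((unsigned char)s[lead]))
		lead++;
	drop_front(s, lead);
}
```

With this sketch, a subject such as "[RFC][PATCH] subject" keeps its second bracket pair, which is the trade-off the commit message describes.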
Start of early patch applicator tools for git. I looked a bit at my old BK tools for the same thing, but they were just so horrid in many ways that I largely rewrote it all and these tools do things a bit differently. Instead of aggressively piping data from one process to another (which was clever but very hard to follow), this first just splits out the mbox into many smaller email files, and then does some scripts on these temporary files. 2005-04-12 08:46:50 +02:00
git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`static void cleanup_space(struct strbuf *sb)`
Start of early patch applicator tools for git. I looked a bit at my old BK tools for the same thing, but they were just so horrid in many ways that I largely rewrote it all and these tools do things a bit differently. Instead of aggressively piping data from one process to another (which was clever but very hard to follow), this first just splits out the mbox into many smaller email files, and then does some scripts on these temporary files. 2005-04-12 08:46:50 +02:00			`{`
git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`size_t pos, cnt;`
			`for (pos = 0; pos < sb->len; pos++) {`
			`if (isspace(sb->buf[pos])) {`
			`sb->buf[pos] = ' ';`
			`for (cnt = 0; isspace(sb->buf[pos + cnt + 1]); cnt++);`
			`strbuf_remove(sb, pos + 1, cnt);`
Start of early patch applicator tools for git. I looked a bit at my old BK tools for the same thing, but they were just so horrid in many ways that I largely rewrote it all and these tools do things a bit differently. Instead of aggressively piping data from one process to another (which was clever but very hard to follow), this first just splits out the mbox into many smaller email files, and then does some scripts on these temporary files. 2005-04-12 08:46:50 +02:00			`}`
			`}`
			`}`
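The whitespace folding done by `cleanup_space()` can be illustrated without strbuf. The following is a minimal sketch, assuming a writable NUL-terminated buffer; `collapse_space` is a hypothetical name, and it uses a two-pointer copy rather than the `strbuf_remove()` calls in the listing. Like the listing, it turns every run of whitespace into a single space and does not trim the ends.

```c
#include <ctype.h>

/* Collapse each run of whitespace in s into one ' ' character, in place. */
static void collapse_space(char *s)
{
	char *src = s, *dst = s;

	while (*src) {
		if (isspace((unsigned char)*src)) {
			*dst++ = ' ';
			/* skip the rest of the whitespace run */
			while (isspace((unsigned char)*src))
				src++;
		} else {
			*dst++ = *src++;
		}
	}
	*dst = '\0';
}
```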

git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`static void decode_header(struct strbuf *line);`
Improved const correctness for strings Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-10-21 06:12:12 +02:00			`static const char *header[MAX_HDR_PARSED] = {`
builtin-mailinfo.c infrastrcture changes I am working on a project that required parsing through regular mboxes that didn't necessarily have patches embedded in them. I started by creating my own modified copy of git-am and working from there. Very quickly, I noticed git-mailinfo wasn't able to handle a big chunk of my email. After hacking up numerous solutions and running into more limitations, I decided it was just easier to rewrite a big chunk of it. The following patch has a bunch of fixes and features that I needed in order for me do what I wanted. Note: I'm didn't follow any email rfc papers but I don't think any of the changes I did required much knowledge (besides the boundary stuff). List of major changes/fixes: - can't create empty patch files fix - empty patch files don't fail, this failure will come inside git-am - multipart boundaries are now handled - only output inbody headers if a patch exists otherwise assume those headers are part of the reply and instead output the original headers - decode and filter base64 patches correctly - various other accidental fixes I believe I didn't break any existing functionality or compatibility (other than what I describe above, which is really only the empty patch file). I tested this through various mailing list archives and everything seemed to parse correctly (a couple thousand emails). [jc: squashed in another patch from Don's five patch series to fix the test case, as this patch exposes the bug in the test.] Signed-off-by: Don Zickus <dzickus@redhat.com> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-12 20:52:04 +01:00			`"From","Subject","Date",`
mailinfo and applymbox updates This attempts to minimally cope with a subset of MIME "features" often seen in patches sent to our mailing lists. Namely: - People's name spelled in characters outside ASCII (both on From: header and the signed-off-by line). - Content-transfer-encoding using quoted-printable (both in multipart and non-multipart messages). These MIME features are detected and decoded by "git mailinfo". Optionally, with the '-u' flag, the output to .info and .msg is transliterated from its original chaset to utf-8. This is to encourage people to use utf8 in their commit messages for interoperability. Applymbox accepts additional flag '-u' which is passed to mailinfo. Signed-off-by: Junio C Hamano / 濱野純 <junkio@cox.net> 2005-08-28 21:33:16 +02:00			`};`

git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`static inline int cmp_header(const struct strbuf *line, const char *hdr)`
mailinfo and applymbox updates This attempts to minimally cope with a subset of MIME "features" often seen in patches sent to our mailing lists. Namely: - People's name spelled in characters outside ASCII (both on From: header and the signed-off-by line). - Content-transfer-encoding using quoted-printable (both in multipart and non-multipart messages). These MIME features are detected and decoded by "git mailinfo". Optionally, with the '-u' flag, the output to .info and .msg is transliterated from its original chaset to utf-8. This is to encourage people to use utf8 in their commit messages for interoperability. Applymbox accepts additional flag '-u' which is passed to mailinfo. Signed-off-by: Junio C Hamano / 濱野純 <junkio@cox.net> 2005-08-28 21:33:16 +02:00			`{`
git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`int len = strlen(hdr);`
			`return !strncasecmp(line->buf, hdr, len) && line->len > len &&`
			`line->buf[len] == ':' && isspace(line->buf[len + 1]);`
			`}`
mailinfo and applymbox updates This attempts to minimally cope with a subset of MIME "features" often seen in patches sent to our mailing lists. Namely: - People's name spelled in characters outside ASCII (both on From: header and the signed-off-by line). - Content-transfer-encoding using quoted-printable (both in multipart and non-multipart messages). These MIME features are detected and decoded by "git mailinfo". Optionally, with the '-u' flag, the output to .info and .msg is transliterated from its original chaset to utf-8. This is to encourage people to use utf8 in their commit messages for interoperability. Applymbox accepts additional flag '-u' which is passed to mailinfo. Signed-off-by: Junio C Hamano / 濱野純 <junkio@cox.net> 2005-08-28 21:33:16 +02:00
git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`static int check_header(const struct strbuf *line,`
			`struct strbuf *hdr_data[], int overwrite)`
			`{`
			`int i, ret = 0, len;`
			`struct strbuf sb = STRBUF_INIT;`
builtin-mailinfo.c infrastrcture changes I am working on a project that required parsing through regular mboxes that didn't necessarily have patches embedded in them. I started by creating my own modified copy of git-am and working from there. Very quickly, I noticed git-mailinfo wasn't able to handle a big chunk of my email. After hacking up numerous solutions and running into more limitations, I decided it was just easier to rewrite a big chunk of it. The following patch has a bunch of fixes and features that I needed in order for me do what I wanted. Note: I'm didn't follow any email rfc papers but I don't think any of the changes I did required much knowledge (besides the boundary stuff). List of major changes/fixes: - can't create empty patch files fix - empty patch files don't fail, this failure will come inside git-am - multipart boundaries are now handled - only output inbody headers if a patch exists otherwise assume those headers are part of the reply and instead output the original headers - decode and filter base64 patches correctly - various other accidental fixes I believe I didn't break any existing functionality or compatibility (other than what I describe above, which is really only the empty patch file). I tested this through various mailing list archives and everything seemed to parse correctly (a couple thousand emails). [jc: squashed in another patch from Don's five patch series to fix the test case, as this patch exposes the bug in the test.] Signed-off-by: Don Zickus <dzickus@redhat.com> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-12 20:52:04 +01:00			`/* search for the interesting parts */`
			`for (i = 0; header[i]; i++) {`
			`int len = strlen(header[i]);`
git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`if ((!hdr_data[i] || overwrite) && cmp_header(line, header[i])) {`
Move B and Q decoding into check header. B and Q decoding is not appropriate for in body headers, so move it up to where we explicitly know we have a real email header. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-23 21:45:37 +02:00			`/* Unwrap inline B and Q encoding, and optionally`
			`* normalize the meta information to utf8.`
			`*/`
git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`strbuf_add(&sb, line->buf + len + 2, line->len - len - 2);`
			`decode_header(&sb);`
			`handle_header(&hdr_data[i], &sb);`
			`ret = 1;`
			`goto check_header_out;`
mailinfo and applymbox updates This attempts to minimally cope with a subset of MIME "features" often seen in patches sent to our mailing lists. Namely: - People's name spelled in characters outside ASCII (both on From: header and the signed-off-by line). - Content-transfer-encoding using quoted-printable (both in multipart and non-multipart messages). These MIME features are detected and decoded by "git mailinfo". Optionally, with the '-u' flag, the output to .info and .msg is transliterated from its original chaset to utf-8. This is to encourage people to use utf8 in their commit messages for interoperability. Applymbox accepts additional flag '-u' which is passed to mailinfo. Signed-off-by: Junio C Hamano / 濱野純 <junkio@cox.net> 2005-08-28 21:33:16 +02:00			`}`
			`}`

builtin-mailinfo.c infrastrcture changes I am working on a project that required parsing through regular mboxes that didn't necessarily have patches embedded in them. I started by creating my own modified copy of git-am and working from there. Very quickly, I noticed git-mailinfo wasn't able to handle a big chunk of my email. After hacking up numerous solutions and running into more limitations, I decided it was just easier to rewrite a big chunk of it. The following patch has a bunch of fixes and features that I needed in order for me do what I wanted. Note: I'm didn't follow any email rfc papers but I don't think any of the changes I did required much knowledge (besides the boundary stuff). List of major changes/fixes: - can't create empty patch files fix - empty patch files don't fail, this failure will come inside git-am - multipart boundaries are now handled - only output inbody headers if a patch exists otherwise assume those headers are part of the reply and instead output the original headers - decode and filter base64 patches correctly - various other accidental fixes I believe I didn't break any existing functionality or compatibility (other than what I describe above, which is really only the empty patch file). I tested this through various mailing list archives and everything seemed to parse correctly (a couple thousand emails). [jc: squashed in another patch from Don's five patch series to fix the test case, as this patch exposes the bug in the test.] Signed-off-by: Don Zickus <dzickus@redhat.com> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-12 20:52:04 +01:00			`/* Content stuff */`
git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`if (cmp_header(line, "Content-Type")) {`
			`len = strlen("Content-Type: ");`
			`strbuf_add(&sb, line->buf + len, line->len - len);`
			`decode_header(&sb);`
			`strbuf_insert(&sb, 0, "Content-Type: ", len);`
			`handle_content_type(&sb);`
			`ret = 1;`
			`goto check_header_out;`
			`}`
			`if (cmp_header(line, "Content-Transfer-Encoding")) {`
			`len = strlen("Content-Transfer-Encoding: ");`
			`strbuf_add(&sb, line->buf + len, line->len - len);`
			`decode_header(&sb);`
			`handle_content_transfer_encoding(&sb);`
			`ret = 1;`
			`goto check_header_out;`
builtin-mailinfo.c infrastrcture changes I am working on a project that required parsing through regular mboxes that didn't necessarily have patches embedded in them. I started by creating my own modified copy of git-am and working from there. Very quickly, I noticed git-mailinfo wasn't able to handle a big chunk of my email. After hacking up numerous solutions and running into more limitations, I decided it was just easier to rewrite a big chunk of it. The following patch has a bunch of fixes and features that I needed in order for me do what I wanted. Note: I'm didn't follow any email rfc papers but I don't think any of the changes I did required much knowledge (besides the boundary stuff). List of major changes/fixes: - can't create empty patch files fix - empty patch files don't fail, this failure will come inside git-am - multipart boundaries are now handled - only output inbody headers if a patch exists otherwise assume those headers are part of the reply and instead output the original headers - decode and filter base64 patches correctly - various other accidental fixes I believe I didn't break any existing functionality or compatibility (other than what I describe above, which is really only the empty patch file). I tested this through various mailing list archives and everything seemed to parse correctly (a couple thousand emails). [jc: squashed in another patch from Don's five patch series to fix the test case, as this patch exposes the bug in the test.] Signed-off-by: Don Zickus <dzickus@redhat.com> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-12 20:52:04 +01:00			`}`

			`/* for inbody stuff */`
git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`if (!prefixcmp(line->buf, ">From") && isspace(line->buf[5])) {`
			`ret = 1; /* Should this return 0? */`
			`goto check_header_out;`
			`}`
			`if (!prefixcmp(line->buf, "[PATCH]") && isspace(line->buf[7])) {`
builtin-mailinfo.c infrastrcture changes I am working on a project that required parsing through regular mboxes that didn't necessarily have patches embedded in them. I started by creating my own modified copy of git-am and working from there. Very quickly, I noticed git-mailinfo wasn't able to handle a big chunk of my email. After hacking up numerous solutions and running into more limitations, I decided it was just easier to rewrite a big chunk of it. The following patch has a bunch of fixes and features that I needed in order for me do what I wanted. Note: I'm didn't follow any email rfc papers but I don't think any of the changes I did required much knowledge (besides the boundary stuff). List of major changes/fixes: - can't create empty patch files fix - empty patch files don't fail, this failure will come inside git-am - multipart boundaries are now handled - only output inbody headers if a patch exists otherwise assume those headers are part of the reply and instead output the original headers - decode and filter base64 patches correctly - various other accidental fixes I believe I didn't break any existing functionality or compatibility (other than what I describe above, which is really only the empty patch file). I tested this through various mailing list archives and everything seemed to parse correctly (a couple thousand emails). [jc: squashed in another patch from Don's five patch series to fix the test case, as this patch exposes the bug in the test.] Signed-off-by: Don Zickus <dzickus@redhat.com> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-12 20:52:04 +01:00			`for (i = 0; header[i]; i++) {`
git-mailinfo: Fix getting the subject from the in-body [PATCH] line "Subject: " isn't in the static array "header", and thus memcmp("Subject:", header[i], 7) will never match. Even if it did so, hdr_data[] may not have been allocated if there weren't a "Subject: " in-body when we process "[PATCH]" in the affected codepath. Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-10 23:41:33 +02:00			`if (!memcmp("Subject", header[i], 7)) {`
git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`handle_header(&hdr_data[i], line);`
			`ret = 1;`
			`goto check_header_out;`
builtin-mailinfo.c infrastrcture changes I am working on a project that required parsing through regular mboxes that didn't necessarily have patches embedded in them. I started by creating my own modified copy of git-am and working from there. Very quickly, I noticed git-mailinfo wasn't able to handle a big chunk of my email. After hacking up numerous solutions and running into more limitations, I decided it was just easier to rewrite a big chunk of it. The following patch has a bunch of fixes and features that I needed in order for me do what I wanted. Note: I'm didn't follow any email rfc papers but I don't think any of the changes I did required much knowledge (besides the boundary stuff). List of major changes/fixes: - can't create empty patch files fix - empty patch files don't fail, this failure will come inside git-am - multipart boundaries are now handled - only output inbody headers if a patch exists otherwise assume those headers are part of the reply and instead output the original headers - decode and filter base64 patches correctly - various other accidental fixes I believe I didn't break any existing functionality or compatibility (other than what I describe above, which is really only the empty patch file). I tested this through various mailing list archives and everything seemed to parse correctly (a couple thousand emails). [jc: squashed in another patch from Don's five patch series to fix the test case, as this patch exposes the bug in the test.] Signed-off-by: Don Zickus <dzickus@redhat.com> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-12 20:52:04 +01:00			`}`
			`}`
			`}`

git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`check_header_out:`
			`strbuf_release(&sb);`
			`return ret;`
mailinfo and applymbox updates This attempts to minimally cope with a subset of MIME "features" often seen in patches sent to our mailing lists. Namely: - People's name spelled in characters outside ASCII (both on From: header and the signed-off-by line). - Content-transfer-encoding using quoted-printable (both in multipart and non-multipart messages). These MIME features are detected and decoded by "git mailinfo". Optionally, with the '-u' flag, the output to .info and .msg is transliterated from its original chaset to utf-8. This is to encourage people to use utf8 in their commit messages for interoperability. Applymbox accepts additional flag '-u' which is passed to mailinfo. Signed-off-by: Junio C Hamano / 濱野純 <junkio@cox.net> 2005-08-28 21:33:16 +02:00			`}`

git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`static int is_rfc2822_header(const struct strbuf *line)`
mailinfo: More carefully parse header lines in read_one_header_line() We exited prematurely from header parsing loop when the header field did not have a space after the colon but we insisted on it, and we got the check wrong because we forgot that we strip the trailing whitespace before we do the check. The space after the colon is not even required by RFC2822, so stop requiring it. While we are at it, the header line is specified to be more strict than "anything with a colon in it" (there must be one or more characters before the colon, and they must not be controls, SP or non US-ASCII), so implement that check as well, lest we mistakenly think something like: Bogus not a header line: this is not. as a header line. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-26 09:46:58 +02:00			`{`
			`/*`
			`* The section that defines the loosest possible`
			`* field name is "3.6.8 Optional fields".`
			`*`
			`* optional-field = field-name ":" unstructured CRLF`
			`* field-name = 1*ftext`
			`* ftext = %d33-57 / %d59-126`
			`*/`
			`int ch;`
git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`char *cp = line->buf;`
mailinfo: do not get confused with logical lines that are too long. It basically considers all the continuation lines to be lines of their own, and if the total line is bigger than what we can fit in it, we just truncate the result rather than stop in the middle and then get confused when we try to parse the "next" line (which is just the remainder of the first line). [jc: added test, and tightened boundary a bit per list discussion.] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-26 20:10:59 +01:00
			`/* Count mbox From headers as headers */`
git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`if (!prefixcmp(cp, "From ") || !prefixcmp(cp, ">From "))`
mailinfo: do not get confused with logical lines that are too long. It basically considers all the continuation lines to be lines of their own, and if the total line is bigger than what we can fit in it, we just truncate the result rather than stop in the middle and then get confused when we try to parse the "next" line (which is just the remainder of the first line). [jc: added test, and tightened boundary a bit per list discussion.] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-26 20:10:59 +01:00			`return 1;`

mailinfo: More carefully parse header lines in read_one_header_line() We exited prematurely from header parsing loop when the header field did not have a space after the colon but we insisted on it, and we got the check wrong because we forgot that we strip the trailing whitespace before we do the check. The space after the colon is not even required by RFC2822, so stop requiring it. While we are at it, the header line is specified to be more strict than "anything with a colon in it" (there must be one or more characters before the colon, and they must not be controls, SP or non US-ASCII), so implement that check as well, lest we mistakenly think something like: Bogus not a header line: this is not. as a header line. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-26 09:46:58 +02:00			`while ((ch = *cp++)) {`
			`if (ch == ':')`
git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`return 1;`
mailinfo: More carefully parse header lines in read_one_header_line() We exited prematurely from header parsing loop when the header field did not have a space after the colon but we insisted on it, and we got the check wrong because we forgot that we strip the trailing whitespace before we do the check. The space after the colon is not even required by RFC2822, so stop requiring it. While we are at it, the header line is specified to be more strict than "anything with a colon in it" (there must be one or more characters before the colon, and they must not be controls, SP or non US-ASCII), so implement that check as well, lest we mistakenly think something like: Bogus not a header line: this is not. as a header line. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-26 09:46:58 +02:00			`if ((33 <= ch && ch <= 57) ||`
			`(59 <= ch && ch <= 126))`
			`continue;`
			`break;`
			`}`
			`return 0;`
			`}`
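The field-name check in `is_rfc2822_header()` can be tried in isolation. Below is a minimal sketch on a plain C string, assuming the same rules as the listing; `looks_like_header` is a hypothetical name, and `strncmp` replaces git's `prefixcmp`. Every character before the first colon must be printable US-ASCII other than space and colon (33-57 or 59-126), and mbox "From " lines are accepted up front.

```c
#include <string.h>

/* Return 1 if line looks like an RFC 2822 header (or an mbox From line). */
static int looks_like_header(const char *line)
{
	const char *cp = line;
	int ch;

	/* Count mbox From headers as headers */
	if (!strncmp(cp, "From ", 5) || !strncmp(cp, ">From ", 6))
		return 1;

	while ((ch = (unsigned char)*cp++)) {
		if (ch == ':')
			return 1; /* as in the listing, no minimum name length is enforced */
		if ((33 <= ch && ch <= 57) || (59 <= ch && ch <= 126))
			continue; /* valid ftext character */
		break; /* control, space or non-ASCII before any colon */
	}
	return 0;
}
```

The example from the commit message, "Bogus not a header line: this is not.", is rejected because the space after "Bogus" is reached before any colon.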

git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`static int read_one_header_line(struct strbuf *line, FILE *in)`
mailinfo and applymbox updates This attempts to minimally cope with a subset of MIME "features" often seen in patches sent to our mailing lists. Namely: - People's name spelled in characters outside ASCII (both on From: header and the signed-off-by line). - Content-transfer-encoding using quoted-printable (both in multipart and non-multipart messages). These MIME features are detected and decoded by "git mailinfo". Optionally, with the '-u' flag, the output to .info and .msg is transliterated from its original chaset to utf-8. This is to encourage people to use utf8 in their commit messages for interoperability. Applymbox accepts additional flag '-u' which is passed to mailinfo. Signed-off-by: Junio C Hamano / 濱野純 <junkio@cox.net> 2005-08-28 21:33:16 +02:00			`{`
mailinfo: do not get confused with logical lines that are too long. It basically considers all the continuation lines to be lines of their own, and if the total line is bigger than what we can fit in it, we just truncate the result rather than stop in the middle and then get confused when we try to parse the "next" line (which is just the remainder of the first line). [jc: added test, and tightened boundary a bit per list discussion.] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-26 20:10:59 +01:00			`/* Get the first part of the line. */`
git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`if (strbuf_getline(line, in, '\n'))`
mailinfo: do not get confused with logical lines that are too long. It basically considers all the continuation lines to be lines of their own, and if the total line is bigger than what we can fit in it, we just truncate the result rather than stop in the middle and then get confused when we try to parse the "next" line (which is just the remainder of the first line). [jc: added test, and tightened boundary a bit per list discussion.] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-26 20:10:59 +01:00			`return 0;`

			`/*`
			`* Is it an empty line or not a valid rfc2822 header?`
			`* If so, stop here, and return false ("not a header")`
			`*/`
git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`strbuf_rtrim(line);`
			`if (!line->len || !is_rfc2822_header(line)) {`
mailinfo: do not get confused with logical lines that are too long. It basically considers all the continuation lines to be lines of their own, and if the total line is bigger than what we can fit in it, we just truncate the result rather than stop in the middle and then get confused when we try to parse the "next" line (which is just the remainder of the first line). [jc: added test, and tightened boundary a bit per list discussion.] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-26 20:10:59 +01:00			`/* Re-add the newline */`
git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`strbuf_addch(line, '\n');`
mailinfo: do not get confused with logical lines that are too long. It basically considers all the continuation lines to be lines of their own, and if the total line is bigger than what we can fit in it, we just truncate the result rather than stop in the middle and then get confused when we try to parse the "next" line (which is just the remainder of the first line). [jc: added test, and tightened boundary a bit per list discussion.] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-26 20:10:59 +01:00			`return 0;`
			`}`

			`/*`
			`* Now we need to eat all the continuation lines..`
			`* Yuck, 2822 header "folding"`
			`*/`
			`for (;;) {`
git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`int peek;`
			`struct strbuf continuation = STRBUF_INIT;`
mailinfo: do not get confused with logical lines that are too long. It basically considers all the continuation lines to be lines of their own, and if the total line is bigger than what we can fit in it, we just truncate the result rather than stop in the middle and then get confused when we try to parse the "next" line (which is just the remainder of the first line). [jc: added test, and tightened boundary a bit per list discussion.] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-26 20:10:59 +01:00
More accurately detect header lines in read_one_header_line Only count lines of the form '^.*: ' and '^From ' as email header lines. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-23 21:53:20 +02:00			`peek = fgetc(in); ungetc(peek, in);`
			`if (peek != ' ' && peek != '\t')`
			`break;`
git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`if (strbuf_getline(&continuation, in, '\n'))`
mailinfo: do not get confused with logical lines that are too long. It basically considers all the continuation lines to be lines of their own, and if the total line is bigger than what we can fit in it, we just truncate the result rather than stop in the middle and then get confused when we try to parse the "next" line (which is just the remainder of the first line). [jc: added test, and tightened boundary a bit per list discussion.] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-26 20:10:59 +01:00			`break;`
git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`continuation.buf[0] = '\n';`
			`strbuf_rtrim(&continuation);`
			`strbuf_addbuf(line, &continuation);`
mailinfo and applymbox updates This attempts to minimally cope with a subset of MIME "features" often seen in patches sent to our mailing lists. Namely: - People's name spelled in characters outside ASCII (both on From: header and the signed-off-by line). - Content-transfer-encoding using quoted-printable (both in multipart and non-multipart messages). These MIME features are detected and decoded by "git mailinfo". Optionally, with the '-u' flag, the output to .info and .msg is transliterated from its original chaset to utf-8. This is to encourage people to use utf8 in their commit messages for interoperability. Applymbox accepts additional flag '-u' which is passed to mailinfo. Signed-off-by: Junio C Hamano / 濱野純 <junkio@cox.net> 2005-08-28 21:33:16 +02:00			`}`
mailinfo: do not get confused with logical lines that are too long. It basically considers all the continuation lines to be lines of their own, and if the total line is bigger than what we can fit in it, we just truncate the result rather than stop in the middle and then get confused when we try to parse the "next" line (which is just the remainder of the first line). [jc: added test, and tightened boundary a bit per list discussion.] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-26 20:10:59 +01:00
			`return 1;`
mailinfo and applymbox updates This attempts to minimally cope with a subset of MIME "features" often seen in patches sent to our mailing lists. Namely: - People's name spelled in characters outside ASCII (both on From: header and the signed-off-by line). - Content-transfer-encoding using quoted-printable (both in multipart and non-multipart messages). These MIME features are detected and decoded by "git mailinfo". Optionally, with the '-u' flag, the output to .info and .msg is transliterated from its original chaset to utf-8. This is to encourage people to use utf8 in their commit messages for interoperability. Applymbox accepts additional flag '-u' which is passed to mailinfo. Signed-off-by: Junio C Hamano / 濱野純 <junkio@cox.net> 2005-08-28 21:33:16 +02:00			`}`
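The folding loop above peeks at the first byte of the next physical line and treats a leading space or tab as a continuation. A rough standalone sketch of the same idea follows, assuming fixed-size buffers instead of `strbuf`; `read_unfolded_line` and `rtrim` are hypothetical helpers, and over-long input is simply truncated here.

```c
#include <stdio.h>
#include <string.h>

/* Strip trailing space, tab, CR and LF in place; return the new length. */
static size_t rtrim(char *s, size_t len)
{
	while (len > 0 && (s[len - 1] == ' ' || s[len - 1] == '\t' ||
			   s[len - 1] == '\r' || s[len - 1] == '\n'))
		len--;
	s[len] = '\0';
	return len;
}

/*
 * Hypothetical helper: read one logical header line from `in` into
 * `buf` (capacity `cap`), appending folded continuation lines.
 * Returns 1 if a line was read, 0 at end of input.
 */
static int read_unfolded_line(FILE *in, char *buf, size_t cap)
{
	char cont[256];
	size_t len;

	if (!fgets(buf, (int)cap, in))
		return 0;
	len = rtrim(buf, strlen(buf));

	for (;;) {
		size_t clen;
		int peek = fgetc(in);

		if (peek == EOF)
			break;
		ungetc(peek, in);	/* put the byte back for the next reader */
		if (peek != ' ' && peek != '\t')
			break;		/* next line starts a new header */
		if (!fgets(cont, sizeof(cont), in))
			break;
		clen = rtrim(cont, strlen(cont));
		if (clen == 0)
			continue;	/* whitespace-only continuation */
		cont[0] = '\n';		/* leading whitespace becomes a newline */
		if (len + clen < cap) {
			memcpy(buf + len, cont, clen + 1);
			len += clen;
		}
	}
	return 1;
}
```

As in the listing, the first whitespace character of each continuation is replaced by a newline, so `Subject: first` followed by ` second` yields `Subject: first\nsecond`.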

git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`static struct strbuf *decode_q_segment(const struct strbuf *q_seg, int rfc2047)`
mailinfo and applymbox updates This attempts to minimally cope with a subset of MIME "features" often seen in patches sent to our mailing lists. Namely: - People's name spelled in characters outside ASCII (both on From: header and the signed-off-by line). - Content-transfer-encoding using quoted-printable (both in multipart and non-multipart messages). These MIME features are detected and decoded by "git mailinfo". Optionally, with the '-u' flag, the output to .info and .msg is transliterated from its original chaset to utf-8. This is to encourage people to use utf8 in their commit messages for interoperability. Applymbox accepts additional flag '-u' which is passed to mailinfo. Signed-off-by: Junio C Hamano / 濱野純 <junkio@cox.net> 2005-08-28 21:33:16 +02:00			`{`
git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`const char *in = q_seg->buf;`
mailinfo and applymbox updates This attempts to minimally cope with a subset of MIME "features" often seen in patches sent to our mailing lists. Namely: - People's name spelled in characters outside ASCII (both on From: header and the signed-off-by line). - Content-transfer-encoding using quoted-printable (both in multipart and non-multipart messages). These MIME features are detected and decoded by "git mailinfo". Optionally, with the '-u' flag, the output to .info and .msg is transliterated from its original chaset to utf-8. This is to encourage people to use utf8 in their commit messages for interoperability. Applymbox accepts additional flag '-u' which is passed to mailinfo. Signed-off-by: Junio C Hamano / 濱野純 <junkio@cox.net> 2005-08-28 21:33:16 +02:00			`int c;`
git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`struct strbuf *out = xmalloc(sizeof(struct strbuf));`
			`strbuf_init(out, q_seg->len);`

			`while ((c = *in++) != 0) {`
mailinfo and applymbox updates This attempts to minimally cope with a subset of MIME "features" often seen in patches sent to our mailing lists. Namely: - People's name spelled in characters outside ASCII (both on From: header and the signed-off-by line). - Content-transfer-encoding using quoted-printable (both in multipart and non-multipart messages). These MIME features are detected and decoded by "git mailinfo". Optionally, with the '-u' flag, the output to .info and .msg is transliterated from its original chaset to utf-8. This is to encourage people to use utf8 in their commit messages for interoperability. Applymbox accepts additional flag '-u' which is passed to mailinfo. Signed-off-by: Junio C Hamano / 濱野純 <junkio@cox.net> 2005-08-28 21:33:16 +02:00			`if (c == '=') {`
			`int d = *in++;`
			`if (d == '\n' || !d)`
			`break; /* drop trailing newline */`
git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`strbuf_addch(out, (hexval(d) << 4) | hexval(*in++));`
mailinfo: decode underscore used in "Q" encoding properly. Quoted-Printable (RFC 2045) and the "Q" encoding (RFC 2047) are subtly different; the latter is used on the mail header and an underscore needs to be decoded to 0x20. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-21 09:06:58 +02:00			`continue;`
mailinfo and applymbox updates This attempts to minimally cope with a subset of MIME "features" often seen in patches sent to our mailing lists. Namely: - People's name spelled in characters outside ASCII (both on From: header and the signed-off-by line). - Content-transfer-encoding using quoted-printable (both in multipart and non-multipart messages). These MIME features are detected and decoded by "git mailinfo". Optionally, with the '-u' flag, the output to .info and .msg is transliterated from its original chaset to utf-8. This is to encourage people to use utf8 in their commit messages for interoperability. Applymbox accepts additional flag '-u' which is passed to mailinfo. Signed-off-by: Junio C Hamano / 濱野純 <junkio@cox.net> 2005-08-28 21:33:16 +02:00			`}`
mailinfo: decode underscore used in "Q" encoding properly. Quoted-Printable (RFC 2045) and the "Q" encoding (RFC 2047) are subtly different; the latter is used on the mail header and an underscore needs to be decoded to 0x20. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-21 09:06:58 +02:00			`if (rfc2047 && c == '_') /* rfc2047 4.2 (2) */`
			`c = 0x20;`
git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`strbuf_addch(out, c);`
mailinfo and applymbox updates This attempts to minimally cope with a subset of MIME "features" often seen in patches sent to our mailing lists. Namely: - People's name spelled in characters outside ASCII (both on From: header and the signed-off-by line). - Content-transfer-encoding using quoted-printable (both in multipart and non-multipart messages). These MIME features are detected and decoded by "git mailinfo". Optionally, with the '-u' flag, the output to .info and .msg is transliterated from its original chaset to utf-8. This is to encourage people to use utf8 in their commit messages for interoperability. Applymbox accepts additional flag '-u' which is passed to mailinfo. Signed-off-by: Junio C Hamano / 濱野純 <junkio@cox.net> 2005-08-28 21:33:16 +02:00			`}`
git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`return out;`
mailinfo and applymbox updates This attempts to minimally cope with a subset of MIME "features" often seen in patches sent to our mailing lists. Namely: - People's name spelled in characters outside ASCII (both on From: header and the signed-off-by line). - Content-transfer-encoding using quoted-printable (both in multipart and non-multipart messages). These MIME features are detected and decoded by "git mailinfo". Optionally, with the '-u' flag, the output to .info and .msg is transliterated from its original chaset to utf-8. This is to encourage people to use utf8 in their commit messages for interoperability. Applymbox accepts additional flag '-u' which is passed to mailinfo. Signed-off-by: Junio C Hamano / 濱野純 <junkio@cox.net> 2005-08-28 21:33:16 +02:00			`}`
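The difference between quoted-printable and the RFC 2047 "Q" encoding mentioned in the commit message (an underscore decodes to 0x20 only in the latter) can be tried out with a small standalone sketch. `decode_q` and `hex_digit` are hypothetical names, the output goes to a caller-supplied buffer rather than a `strbuf`, and copying a malformed `=` through literally is a choice made for this sketch, not something the listing does.

```c
#include <stddef.h>

/* Value of a hexadecimal digit, or -1 if `c` is not one. */
static int hex_digit(int c)
{
	if ('0' <= c && c <= '9')
		return c - '0';
	if ('a' <= c && c <= 'f')
		return c - 'a' + 10;
	if ('A' <= c && c <= 'F')
		return c - 'A' + 10;
	return -1;
}

/*
 * Hypothetical helper: decode `in` into `out`, which must have room
 * for strlen(in) + 1 bytes. Returns the decoded length.
 * When `rfc2047` is nonzero, '_' decodes to a space.
 */
static size_t decode_q(const char *in, char *out, int rfc2047)
{
	size_t n = 0;
	int c;

	while ((c = (unsigned char)*in++) != 0) {
		if (c == '=') {
			int hi, lo;

			if (*in == '\n' || *in == '\0')
				break;	/* soft line break: drop the rest */
			hi = hex_digit((unsigned char)in[0]);
			lo = hi < 0 ? -1 : hex_digit((unsigned char)in[1]);
			if (lo < 0) {
				out[n++] = '=';	/* malformed escape, keep as-is */
				continue;
			}
			out[n++] = (char)((hi << 4) | lo);
			in += 2;
			continue;
		}
		if (rfc2047 && c == '_')
			c = 0x20;
		out[n++] = (char)c;
	}
	out[n] = '\0';
	return n;
}
```

Calling `decode_q("J=C3=BCrgen_M", out, 1)` produces the UTF-8 bytes for "Jürgen M", while the same call with `rfc2047` set to 0 leaves the underscore untouched.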

git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`static struct strbuf *decode_b_segment(const struct strbuf *b_seg)`
mailinfo and applymbox updates This attempts to minimally cope with a subset of MIME "features" often seen in patches sent to our mailing lists. Namely: - People's name spelled in characters outside ASCII (both on From: header and the signed-off-by line). - Content-transfer-encoding using quoted-printable (both in multipart and non-multipart messages). These MIME features are detected and decoded by "git mailinfo". Optionally, with the '-u' flag, the output to .info and .msg is transliterated from its original chaset to utf-8. This is to encourage people to use utf8 in their commit messages for interoperability. Applymbox accepts additional flag '-u' which is passed to mailinfo. Signed-off-by: Junio C Hamano / 濱野純 <junkio@cox.net> 2005-08-28 21:33:16 +02:00			`{`
			`/* Decode in..ep, possibly in-place to ot */`
			`int c, pos = 0, acc = 0;`
git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`const char *in = b_seg->buf;`
			`struct strbuf *out = xmalloc(sizeof(struct strbuf));`
			`strbuf_init(out, b_seg->len);`
mailinfo and applymbox updates This attempts to minimally cope with a subset of MIME "features" often seen in patches sent to our mailing lists. Namely: - People's name spelled in characters outside ASCII (both on From: header and the signed-off-by line). - Content-transfer-encoding using quoted-printable (both in multipart and non-multipart messages). These MIME features are detected and decoded by "git mailinfo". Optionally, with the '-u' flag, the output to .info and .msg is transliterated from its original chaset to utf-8. This is to encourage people to use utf8 in their commit messages for interoperability. Applymbox accepts additional flag '-u' which is passed to mailinfo. Signed-off-by: Junio C Hamano / 濱野純 <junkio@cox.net> 2005-08-28 21:33:16 +02:00
git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`while ((c = *in++) != 0) {`
mailinfo and applymbox updates This attempts to minimally cope with a subset of MIME "features" often seen in patches sent to our mailing lists. Namely: - People's name spelled in characters outside ASCII (both on From: header and the signed-off-by line). - Content-transfer-encoding using quoted-printable (both in multipart and non-multipart messages). These MIME features are detected and decoded by "git mailinfo". Optionally, with the '-u' flag, the output to .info and .msg is transliterated from its original chaset to utf-8. This is to encourage people to use utf8 in their commit messages for interoperability. Applymbox accepts additional flag '-u' which is passed to mailinfo. Signed-off-by: Junio C Hamano / 濱野純 <junkio@cox.net> 2005-08-28 21:33:16 +02:00			`if (c == '+')`
			`c = 62;`
			`else if (c == '/')`
			`c = 63;`
			`else if ('A' <= c && c <= 'Z')`
			`c -= 'A';`
			`else if ('a' <= c && c <= 'z')`
			`c -= 'a' - 26;`
			`else if ('0' <= c && c <= '9')`
			`c -= '0' - 52;`
			`else`
			`continue; /* garbage */`
			`switch (pos++) {`
			`case 0:`
			`acc = (c << 2);`
			`break;`
			`case 1:`
git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`strbuf_addch(out, (acc | (c >> 4)));`
mailinfo and applymbox updates This attempts to minimally cope with a subset of MIME "features" often seen in patches sent to our mailing lists. Namely: - People's name spelled in characters outside ASCII (both on From: header and the signed-off-by line). - Content-transfer-encoding using quoted-printable (both in multipart and non-multipart messages). These MIME features are detected and decoded by "git mailinfo". Optionally, with the '-u' flag, the output to .info and .msg is transliterated from its original chaset to utf-8. This is to encourage people to use utf8 in their commit messages for interoperability. Applymbox accepts additional flag '-u' which is passed to mailinfo. Signed-off-by: Junio C Hamano / 濱野純 <junkio@cox.net> 2005-08-28 21:33:16 +02:00			`acc = (c & 15) << 4;`
			`break;`
			`case 2:`
git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`strbuf_addch(out, (acc | (c >> 2)));`
mailinfo and applymbox updates This attempts to minimally cope with a subset of MIME "features" often seen in patches sent to our mailing lists. Namely: - People's name spelled in characters outside ASCII (both on From: header and the signed-off-by line). - Content-transfer-encoding using quoted-printable (both in multipart and non-multipart messages). These MIME features are detected and decoded by "git mailinfo". Optionally, with the '-u' flag, the output to .info and .msg is transliterated from its original chaset to utf-8. This is to encourage people to use utf8 in their commit messages for interoperability. Applymbox accepts additional flag '-u' which is passed to mailinfo. Signed-off-by: Junio C Hamano / 濱野純 <junkio@cox.net> 2005-08-28 21:33:16 +02:00			`acc = (c & 3) << 6;`
			`break;`
			`case 3:`
git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`strbuf_addch(out, (acc | c));`
mailinfo and applymbox updates This attempts to minimally cope with a subset of MIME "features" often seen in patches sent to our mailing lists. Namely: - People's name spelled in characters outside ASCII (both on From: header and the signed-off-by line). - Content-transfer-encoding using quoted-printable (both in multipart and non-multipart messages). These MIME features are detected and decoded by "git mailinfo". Optionally, with the '-u' flag, the output to .info and .msg is transliterated from its original chaset to utf-8. This is to encourage people to use utf8 in their commit messages for interoperability. Applymbox accepts additional flag '-u' which is passed to mailinfo. Signed-off-by: Junio C Hamano / 濱野純 <junkio@cox.net> 2005-08-28 21:33:16 +02:00			`acc = pos = 0;`
			`break;`
			`}`
			`}`
git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`return out;`
mailinfo and applymbox updates This attempts to minimally cope with a subset of MIME "features" often seen in patches sent to our mailing lists. Namely: - People's name spelled in characters outside ASCII (both on From: header and the signed-off-by line). - Content-transfer-encoding using quoted-printable (both in multipart and non-multipart messages). These MIME features are detected and decoded by "git mailinfo". Optionally, with the '-u' flag, the output to .info and .msg is transliterated from its original chaset to utf-8. This is to encourage people to use utf8 in their commit messages for interoperability. Applymbox accepts additional flag '-u' which is passed to mailinfo. Signed-off-by: Junio C Hamano / 濱野純 <junkio@cox.net> 2005-08-28 21:33:16 +02:00			`}`
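The base64 loop above keeps a 6-bit-at-a-time accumulator and emits a byte whenever eight bits are available. The following standalone sketch mirrors that state machine with a caller-supplied output buffer; `decode_b64` is a hypothetical name, and as in the listing, any character outside the base64 alphabet (including `=` padding and newlines) is skipped.

```c
#include <stddef.h>

/*
 * Hypothetical helper: decode base64 text `in` into `out`, which must
 * have room for at least strlen(in) bytes. Returns the number of
 * bytes written. The output is not NUL-terminated.
 */
static size_t decode_b64(const char *in, unsigned char *out)
{
	size_t n = 0;
	int c, pos = 0, acc = 0;

	while ((c = (unsigned char)*in++) != 0) {
		if (c == '+')
			c = 62;
		else if (c == '/')
			c = 63;
		else if ('A' <= c && c <= 'Z')
			c -= 'A';
		else if ('a' <= c && c <= 'z')
			c -= 'a' - 26;
		else if ('0' <= c && c <= '9')
			c -= '0' - 52;
		else
			continue;	/* padding or garbage */

		switch (pos++) {
		case 0:			/* top 6 bits of byte 1 */
			acc = c << 2;
			break;
		case 1:			/* finish byte 1, start byte 2 */
			out[n++] = (unsigned char)(acc | (c >> 4));
			acc = (c & 15) << 4;
			break;
		case 2:			/* finish byte 2, start byte 3 */
			out[n++] = (unsigned char)(acc | (c >> 2));
			acc = (c & 3) << 6;
			break;
		case 3:			/* finish byte 3 */
			out[n++] = (unsigned char)(acc | c);
			acc = pos = 0;
			break;
		}
	}
	return n;
}
```

For example, `SGVsbG8=` decodes to the five bytes of `Hello`; the trailing `=` is ignored because it is not in the alphabet.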

Do a better job at guessing unknown character sets At least in the kernel development community, we're generally slowly converting to UTF-8 everywhere, and the old default of Latin1 in emails is being supplanted by UTF-8, and it doesn't necessarily show up as such in the mail headers (because, quite frankly, when people send patches around, they want the email client to do as little as humanly possible about the patch) Despite that, it's often the case that email addresses etc still have Latin1, so I've seen emails where this is a mixed bag, with Signed-off parts being copied from email (and containing Latin1 characters), and the rest of the email being a patch in UTF-8. So this suggests a very natural change: if the target character set is utf-8 (the default), and if the source already looks like utf-8, just assume that it doesn't need any conversion at all. Only assume that it needs conversion if it isn't already valid utf-8, in which case we (for historical reasons) will assume it's Latin1. Basically no really _valid_ latin1 will ever look like utf-8, so while this changes our historical behaviour, it doesn't do so in practice, and makes the default behaviour saner for the case where the input was already in proper format. We could do a more fancy guess, of course, but this correctly handled a series of patches I just got from Andrew that had a mixture of Latin1 and UTF-8 (in different emails, but without any character set indication). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-07-17 19:34:44 +02:00			`/*`
			`* When there is no known charset, guess.`
			`*`
			`* Right now we assume that if the target is UTF-8 (the default),`
			`* and it already looks like UTF-8 (which includes US-ASCII as its`
			`* subset, of course) then that is what it is and there is nothing`
			`* to do.`
			`*`
			`* Otherwise, we default to assuming it is Latin1 for historical`
			`* reasons.`
			`*/`
git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`static const char *guess_charset(const struct strbuf *line, const char *target_charset)`
Do a better job at guessing unknown character sets At least in the kernel development community, we're generally slowly converting to UTF-8 everywhere, and the old default of Latin1 in emails is being supplanted by UTF-8, and it doesn't necessarily show up as such in the mail headers (because, quite frankly, when people send patches around, they want the email client to do as little as humanly possible about the patch) Despite that, it's often the case that email addresses etc still have Latin1, so I've seen emails where this is a mixed bag, with Signed-off parts being copied from email (and containing Latin1 characters), and the rest of the email being a patch in UTF-8. So this suggests a very natural change: if the target character set is utf-8 (the default), and if the source already looks like utf-8, just assume that it doesn't need any conversion at all. Only assume that it needs conversion if it isn't already valid utf-8, in which case we (for historical reasons) will assume it's Latin1. Basically no really _valid_ latin1 will ever look like utf-8, so while this changes our historical behaviour, it doesn't do so in practice, and makes the default behaviour saner for the case where the input was already in proper format. We could do a more fancy guess, of course, but this correctly handled a series of patches I just got from Andrew that had a mixture of Latin1 and UTF-8 (in different emails, but without any character set indication). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-07-17 19:34:44 +02:00			`{`
			`if (is_encoding_utf8(target_charset)) {`
git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`if (is_utf8(line->buf))`
Do a better job at guessing unknown character sets At least in the kernel development community, we're generally slowly converting to UTF-8 everywhere, and the old default of Latin1 in emails is being supplanted by UTF-8, and it doesn't necessarily show up as such in the mail headers (because, quite frankly, when people send patches around, they want the email client to do as little as humanly possible about the patch) Despite that, it's often the case that email addresses etc still have Latin1, so I've seen emails where this is a mixed bag, with Signed-off parts being copied from email (and containing Latin1 characters), and the rest of the email being a patch in UTF-8. So this suggests a very natural change: if the target character set is utf-8 (the default), and if the source already looks like utf-8, just assume that it doesn't need any conversion at all. Only assume that it needs conversion if it isn't already valid utf-8, in which case we (for historical reasons) will assume it's Latin1. Basically no really _valid_ latin1 will ever look like utf-8, so while this changes our historical behaviour, it doesn't do so in practice, and makes the default behaviour saner for the case where the input was already in proper format. We could do a more fancy guess, of course, but this correctly handled a series of patches I just got from Andrew that had a mixture of Latin1 and UTF-8 (in different emails, but without any character set indication). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-07-17 19:34:44 +02:00			`return NULL;`
			`}`
builtin-mailinfo.c: use "ISO8859-1" instead of "latin1" as fallback encoding Some platforms do not understand the character encoding "latin1" which is another name for "ISO8859-1". So use "ISO8859-1" instead which all tested platforms understand. Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-05-19 01:44:41 +02:00			`return "ISO8859-1";`
Do a better job at guessing unknown character sets At least in the kernel development community, we're generally slowly converting to UTF-8 everywhere, and the old default of Latin1 in emails is being supplanted by UTF-8, and it doesn't necessarily show up as such in the mail headers (because, quite frankly, when people send patches around, they want the email client to do as little as humanly possible about the patch) Despite that, it's often the case that email addresses etc still have Latin1, so I've seen emails where this is a mixed bag, with Signed-off parts being copied from email (and containing Latin1 characters), and the rest of the email being a patch in UTF-8. So this suggests a very natural change: if the target character set is utf-8 (the default), and if the source already looks like utf-8, just assume that it doesn't need any conversion at all. Only assume that it needs conversion if it isn't already valid utf-8, in which case we (for historical reasons) will assume it's Latin1. Basically no really _valid_ latin1 will ever look like utf-8, so while this changes our historical behaviour, it doesn't do so in practice, and makes the default behaviour saner for the case where the input was already in proper format. We could do a more fancy guess, of course, but this correctly handled a series of patches I just got from Andrew that had a mixture of Latin1 and UTF-8 (in different emails, but without any character set indication). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-07-17 19:34:44 +02:00			`}`
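The guessing rule in the comment above (if the target is UTF-8 and the text already looks like UTF-8, do nothing; otherwise assume Latin1) can be illustrated without git's helpers. Everything in this sketch is an assumption made for illustration: `names_utf8`, `looks_like_utf8` and `guess_source_charset` are hypothetical names, and the UTF-8 check is a simplified structural one that does not reject every overlong or surrogate encoding.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: case-insensitive match for "utf-8" or "utf8". */
static int names_utf8(const char *name)
{
	static const char *const spellings[] = { "utf-8", "utf8" };
	size_t i, j;

	for (i = 0; i < 2; i++) {
		const char *s = spellings[i];

		for (j = 0; s[j] && tolower((unsigned char)name[j]) == s[j]; j++)
			;
		if (!s[j] && !name[j])
			return 1;
	}
	return 0;
}

/* Simplified check: lead bytes followed by the right number of 10xxxxxx bytes. */
static int looks_like_utf8(const char *text)
{
	const unsigned char *p = (const unsigned char *)text;

	while (*p) {
		int extra;

		if (*p < 0x80)
			extra = 0;
		else if (0xC2 <= *p && *p <= 0xDF)
			extra = 1;
		else if (0xE0 <= *p && *p <= 0xEF)
			extra = 2;
		else if (0xF0 <= *p && *p <= 0xF4)
			extra = 3;
		else
			return 0;	/* stray continuation or invalid lead byte */
		p++;
		while (extra-- > 0) {
			if ((*p & 0xC0) != 0x80)
				return 0;	/* also catches a truncated sequence */
			p++;
		}
	}
	return 1;
}

/* NULL means "no conversion needed"; otherwise the assumed source charset. */
static const char *guess_source_charset(const char *text, const char *target)
{
	if (names_utf8(target) && looks_like_utf8(text))
		return NULL;
	return "ISO8859-1";
}
```

Plain ASCII and well-formed UTF-8 input return NULL when the target is UTF-8, while a Latin1 byte such as 0xFC in `J\xFCrgen` falls through to `ISO8859-1`.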

git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`static void convert_to_utf8(struct strbuf *line, const char *charset)`
mailinfo and applymbox updates This attempts to minimally cope with a subset of MIME "features" often seen in patches sent to our mailing lists. Namely: - People's name spelled in characters outside ASCII (both on From: header and the signed-off-by line). - Content-transfer-encoding using quoted-printable (both in multipart and non-multipart messages). These MIME features are detected and decoded by "git mailinfo". Optionally, with the '-u' flag, the output to .info and .msg is transliterated from its original chaset to utf-8. This is to encourage people to use utf8 in their commit messages for interoperability. Applymbox accepts additional flag '-u' which is passed to mailinfo. Signed-off-by: Junio C Hamano / 濱野純 <junkio@cox.net> 2005-08-28 21:33:16 +02:00			`{`
Do a better job at guessing unknown character sets At least in the kernel development community, we're generally slowly converting to UTF-8 everywhere, and the old default of Latin1 in emails is being supplanted by UTF-8, and it doesn't necessarily show up as such in the mail headers (because, quite frankly, when people send patches around, they want the email client to do as little as humanly possible about the patch) Despite that, it's often the case that email addresses etc still have Latin1, so I've seen emails where this is a mixed bag, with Signed-off parts being copied from email (and containing Latin1 characters), and the rest of the email being a patch in UTF-8. So this suggests a very natural change: if the target character set is utf-8 (the default), and if the source already looks like utf-8, just assume that it doesn't need any conversion at all. Only assume that it needs conversion if it isn't already valid utf-8, in which case we (for historical reasons) will assume it's Latin1. Basically no really _valid_ latin1 will ever look like utf-8, so while this changes our historical behaviour, it doesn't do so in practice, and makes the default behaviour saner for the case where the input was already in proper format. We could do a more fancy guess, of course, but this correctly handled a series of patches I just got from Andrew that had a mixture of Latin1 and UTF-8 (in different emails, but without any character set indication). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-07-17 19:34:44 +02:00			`char *out;`

			`if (!charset || !*charset) {`
			`charset = guess_charset(line, metainfo_charset);`
			`if (!charset)`
			`return;`
			`}`
Move encoding conversion routine out of mailinfo to utf8.c This moves the body of convert_to_utf8() routine used in mailinfo to the utf8.c i18n library. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-12-24 08:36:55 +01:00
builtin-mailinfo.c: compare character encodings case insensitively When converting between character encodings, git tests whether the "from" encoding and the "to" encoding have the same name. git should perform this test case insensitively so that e.g. utf-8 is not seen as a different encoding than UTF-8. Additionally, it is not necessary to call tolower() anymore on the encodings extracted from the mail message. Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-05-19 01:44:40 +02:00			`if (!strcasecmp(metainfo_charset, charset))`
mailinfo: fix 'fatal: cannot convert from utf-8 to utf-8' For some reason, I got this error message. Maybe it does not make sense, but then we should not really try to convert the text when it is not necessary. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-07-24 02:03:26 +02:00			`return;`
git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`out = reencode_string(line->buf, metainfo_charset, charset);`
-u is now default for 'git-mailinfo'. Originally from David Woodhouse, but also adjusts the callers of mailinfo to the new default. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-10 06:31:36 +01:00			`if (!out)`
remove trailing LF in die() messages LF at the end of format strings given to die() is redundant because die already adds one on its own. Signed-off-by: Alexander Potashev <aspotashev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-01-04 19:38:41 +01:00			`die("cannot convert from %s to %s",`
Do a better job at guessing unknown character sets At least in the kernel development community, we're generally slowly converting to UTF-8 everywhere, and the old default of Latin1 in emails is being supplanted by UTF-8, and it doesn't necessarily show up as such in the mail headers (because, quite frankly, when people send patches around, they want the email client to do as little as humanly possible about the patch) Despite that, it's often the case that email addresses etc still have Latin1, so I've seen emails where this is a mixed bag, with Signed-off parts being copied from email (and containing Latin1 characters), and the rest of the email being a patch in UTF-8. So this suggests a very natural change: if the target character set is utf-8 (the default), and if the source already looks like utf-8, just assume that it doesn't need any conversion at all. Only assume that it needs conversion if it isn't already valid utf-8, in which case we (for historical reasons) will assume it's Latin1. Basically no really _valid_ latin1 will ever look like utf-8, so while this changes our historical behaviour, it doesn't do so in practice, and makes the default behaviour saner for the case where the input was already in proper format. We could do a more fancy guess, of course, but this correctly handled a series of patches I just got from Andrew that had a mixture of Latin1 and UTF-8 (in different emails, but without any character set indication). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-07-17 19:34:44 +02:00			`charset, metainfo_charset);`
git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`strbuf_attach(line, out, strlen(out), strlen(out));`
mailinfo and applymbox updates This attempts to minimally cope with a subset of MIME "features" often seen in patches sent to our mailing lists. Namely: - People's name spelled in characters outside ASCII (both on From: header and the signed-off-by line). - Content-transfer-encoding using quoted-printable (both in multipart and non-multipart messages). These MIME features are detected and decoded by "git mailinfo". Optionally, with the '-u' flag, the output to .info and .msg is transliterated from its original chaset to utf-8. This is to encourage people to use utf8 in their commit messages for interoperability. Applymbox accepts additional flag '-u' which is passed to mailinfo. Signed-off-by: Junio C Hamano / 濱野純 <junkio@cox.net> 2005-08-28 21:33:16 +02:00			`}`
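
The charset-guessing rule described in the "Do a better job at guessing unknown character sets" commit message can be sketched roughly as follows. This is a minimal illustration, not the code from this file: `sketch_looks_like_utf8` and `sketch_guess_charset` are hypothetical names, and the UTF-8 check is deliberately simplified (it does not reject overlong forms or surrogates).

```c
#include <stddef.h>
#include <string.h>
#include <strings.h>

/* Hypothetical helper: 1 if buf[0..len) has well-formed UTF-8 lead/continuation bytes. */
static int sketch_looks_like_utf8(const unsigned char *buf, size_t len)
{
	size_t i = 0;

	while (i < len) {
		unsigned char c = buf[i++];
		int follow;

		if (c < 0x80)
			continue;		/* plain ASCII */
		else if ((c & 0xE0) == 0xC0)
			follow = 1;		/* 2-byte sequence */
		else if ((c & 0xF0) == 0xE0)
			follow = 2;		/* 3-byte sequence */
		else if ((c & 0xF8) == 0xF0)
			follow = 3;		/* 4-byte sequence */
		else
			return 0;		/* stray continuation or invalid lead byte */

		while (follow--) {
			if (i >= len || (buf[i] & 0xC0) != 0x80)
				return 0;
			i++;
		}
	}
	return 1;
}

/*
 * Hypothetical stand-in for the guessing step: returns NULL when the text
 * needs no conversion, otherwise the charset to convert from.
 */
static const char *sketch_guess_charset(const char *text, const char *target)
{
	if (!strcasecmp(target, "utf-8") &&
	    sketch_looks_like_utf8((const unsigned char *)text, strlen(text)))
		return NULL;		/* already valid for a UTF-8 target */
	return "ISO8859-1";		/* historical fallback: assume Latin-1 */
}
```

A caller would treat a NULL result as "leave the line alone", mirroring the early `return` in the function above when no charset can be guessed.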

git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`static int decode_header_bq(struct strbuf *it)`
mailinfo and applymbox updates This attempts to minimally cope with a subset of MIME "features" often seen in patches sent to our mailing lists. Namely: - People's name spelled in characters outside ASCII (both on From: header and the signed-off-by line). - Content-transfer-encoding using quoted-printable (both in multipart and non-multipart messages). These MIME features are detected and decoded by "git mailinfo". Optionally, with the '-u' flag, the output to .info and .msg is transliterated from its original chaset to utf-8. This is to encourage people to use utf8 in their commit messages for interoperability. Applymbox accepts additional flag '-u' which is passed to mailinfo. Signed-off-by: Junio C Hamano / 濱野純 <junkio@cox.net> 2005-08-28 21:33:16 +02:00			`{`
git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`char *in, *ep, *cp;`
			`struct strbuf outbuf = STRBUF_INIT, *dec;`
			`struct strbuf charset_q = STRBUF_INIT, piecebuf = STRBUF_INIT;`
mailinfo: assume input is latin-1 on the header as we do for the body When the input mbox does not identify what encoding it is in, and already have RFC2047 stripped away, we cannot tell what encoding the header text is in. For body text, when the message does not say what charset it is in, we fall back to assume latin-1 input when converting to utf8. This should be done consistently to the header as well. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-07-05 23:17:49 +02:00			`int rfc2047 = 0;`
mailinfo and applymbox updates This attempts to minimally cope with a subset of MIME "features" often seen in patches sent to our mailing lists. Namely: - People's name spelled in characters outside ASCII (both on From: header and the signed-off-by line). - Content-transfer-encoding using quoted-printable (both in multipart and non-multipart messages). These MIME features are detected and decoded by "git mailinfo". Optionally, with the '-u' flag, the output to .info and .msg is transliterated from its original chaset to utf-8. This is to encourage people to use utf8 in their commit messages for interoperability. Applymbox accepts additional flag '-u' which is passed to mailinfo. Signed-off-by: Junio C Hamano / 濱野純 <junkio@cox.net> 2005-08-28 21:33:16 +02:00
git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`in = it->buf;`
			`while (in - it->buf <= it->len && (ep = strstr(in, "=?")) != NULL) {`
			`int encoding;`
			`strbuf_reset(&charset_q);`
			`strbuf_reset(&piecebuf);`
mailinfo: assume input is latin-1 on the header as we do for the body When the input mbox does not identify what encoding it is in, and already have RFC2047 stripped away, we cannot tell what encoding the header text is in. For body text, when the message does not say what charset it is in, we fall back to assume latin-1 input when converting to utf8. This should be done consistently to the header as well. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-07-05 23:17:49 +02:00			`rfc2047 = 1;`

mailinfo and applymbox updates This attempts to minimally cope with a subset of MIME "features" often seen in patches sent to our mailing lists. Namely: - People's name spelled in characters outside ASCII (both on From: header and the signed-off-by line). - Content-transfer-encoding using quoted-printable (both in multipart and non-multipart messages). These MIME features are detected and decoded by "git mailinfo". Optionally, with the '-u' flag, the output to .info and .msg is transliterated from its original chaset to utf-8. This is to encourage people to use utf8 in their commit messages for interoperability. Applymbox accepts additional flag '-u' which is passed to mailinfo. Signed-off-by: Junio C Hamano / 濱野純 <junkio@cox.net> 2005-08-28 21:33:16 +02:00			`if (in != ep) {`
mailinfo: correctly handle multiline 'Subject:' header When native language (RU) is in use, subject header usually contains several parts, e.g. Subject: [Navy-patches] [PATCH] =?utf-8?b?0JjQt9C80LXQvdGR0L0g0YHQv9C40YHQvtC6INC/0LA=?= =?utf-8?b?0LrQtdGC0L7QsiDQvdC10L7QsdGF0L7QtNC40LzRi9GFINC00LvRjyA=?= =?utf-8?b?0YHQsdC+0YDQutC4?= This exposes several bugs in builtin-mailinfo.c: 1. decode_b_segment: do not append explicit NUL -- explicit NUL was preventing correct header construction on parts concatenation via strbuf_addbuf in decode_header_bq. Fixes: -Subject: Изменён список пакетов необходимых для сборки +Subject: Изменён список па Then 2. Do not emit '\n' between "encoded-word" where RFC2046 says that linear white space between them are ignored when displaying. Fixes: -Subject: Изменён список пакетов необходимых для сборки +Subject: Изменён список па кетов необходимых для сборки Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-01-07 23:43:42 +01:00			`/*`
			`* We are about to process an encoded-word`
			`* that begins at ep, but there is something`
			`* before the encoded word.`
			`*/`
			`char *scan;`
			`for (scan = in; scan < ep; scan++)`
			`if (!isspace(*scan))`
			`break;`

			`if (scan != ep || in == it->buf) {`
			`/*`
			`* We should not lose that "something",`
			`* unless we have just processed an`
			`* encoded-word, and there is only LWS`
			`* before the one we are about to process.`
			`*/`
			`strbuf_add(&outbuf, in, ep - in);`
			`}`
mailinfo and applymbox updates This attempts to minimally cope with a subset of MIME "features" often seen in patches sent to our mailing lists. Namely: - People's name spelled in characters outside ASCII (both on From: header and the signed-off-by line). - Content-transfer-encoding using quoted-printable (both in multipart and non-multipart messages). These MIME features are detected and decoded by "git mailinfo". Optionally, with the '-u' flag, the output to .info and .msg is transliterated from its original chaset to utf-8. This is to encourage people to use utf8 in their commit messages for interoperability. Applymbox accepts additional flag '-u' which is passed to mailinfo. Signed-off-by: Junio C Hamano / 濱野純 <junkio@cox.net> 2005-08-28 21:33:16 +02:00			`}`
			`/* E.g.`
			`* ep : "=?iso-2022-jp?B?GyR...?= foo"`
			`* ep : "=?ISO-8859-1?Q?Foo=FCbar?= baz"`
			`*/`
			`ep += 2;`
git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00
			`if (ep - it->buf >= it->len || !(cp = strchr(ep, '?')))`
			`goto decode_header_bq_out;`

			`if (cp + 3 - it->buf > it->len)`
			`goto decode_header_bq_out;`
			`strbuf_add(&charset_q, ep, cp - ep);`

mailinfo and applymbox updates This attempts to minimally cope with a subset of MIME "features" often seen in patches sent to our mailing lists. Namely: - People's name spelled in characters outside ASCII (both on From: header and the signed-off-by line). - Content-transfer-encoding using quoted-printable (both in multipart and non-multipart messages). These MIME features are detected and decoded by "git mailinfo". Optionally, with the '-u' flag, the output to .info and .msg is transliterated from its original chaset to utf-8. This is to encourage people to use utf8 in their commit messages for interoperability. Applymbox accepts additional flag '-u' which is passed to mailinfo. Signed-off-by: Junio C Hamano / 濱野純 <junkio@cox.net> 2005-08-28 21:33:16 +02:00			`encoding = cp[1];`
			`if (!encoding || cp[2] != '?')`
git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`goto decode_header_bq_out;`
mailinfo and applymbox updates This attempts to minimally cope with a subset of MIME "features" often seen in patches sent to our mailing lists. Namely: - People's name spelled in characters outside ASCII (both on From: header and the signed-off-by line). - Content-transfer-encoding using quoted-printable (both in multipart and non-multipart messages). These MIME features are detected and decoded by "git mailinfo". Optionally, with the '-u' flag, the output to .info and .msg is transliterated from its original chaset to utf-8. This is to encourage people to use utf8 in their commit messages for interoperability. Applymbox accepts additional flag '-u' which is passed to mailinfo. Signed-off-by: Junio C Hamano / 濱野純 <junkio@cox.net> 2005-08-28 21:33:16 +02:00			`ep = strstr(cp + 3, "?=");`
			`if (!ep)`
git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`goto decode_header_bq_out;`
			`strbuf_add(&piecebuf, cp + 3, ep - cp - 3);`
mailinfo and applymbox updates This attempts to minimally cope with a subset of MIME "features" often seen in patches sent to our mailing lists. Namely: - People's name spelled in characters outside ASCII (both on From: header and the signed-off-by line). - Content-transfer-encoding using quoted-printable (both in multipart and non-multipart messages). These MIME features are detected and decoded by "git mailinfo". Optionally, with the '-u' flag, the output to .info and .msg is transliterated from its original chaset to utf-8. This is to encourage people to use utf8 in their commit messages for interoperability. Applymbox accepts additional flag '-u' which is passed to mailinfo. Signed-off-by: Junio C Hamano / 濱野純 <junkio@cox.net> 2005-08-28 21:33:16 +02:00			`switch (tolower(encoding)) {`
			`default:`
git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`goto decode_header_bq_out;`
mailinfo and applymbox updates This attempts to minimally cope with a subset of MIME "features" often seen in patches sent to our mailing lists. Namely: - People's name spelled in characters outside ASCII (both on From: header and the signed-off-by line). - Content-transfer-encoding using quoted-printable (both in multipart and non-multipart messages). These MIME features are detected and decoded by "git mailinfo". Optionally, with the '-u' flag, the output to .info and .msg is transliterated from its original chaset to utf-8. This is to encourage people to use utf8 in their commit messages for interoperability. Applymbox accepts additional flag '-u' which is passed to mailinfo. Signed-off-by: Junio C Hamano / 濱野純 <junkio@cox.net> 2005-08-28 21:33:16 +02:00			`case 'b':`
git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`dec = decode_b_segment(&piecebuf);`
mailinfo and applymbox updates This attempts to minimally cope with a subset of MIME "features" often seen in patches sent to our mailing lists. Namely: - People's name spelled in characters outside ASCII (both on From: header and the signed-off-by line). - Content-transfer-encoding using quoted-printable (both in multipart and non-multipart messages). These MIME features are detected and decoded by "git mailinfo". Optionally, with the '-u' flag, the output to .info and .msg is transliterated from its original chaset to utf-8. This is to encourage people to use utf8 in their commit messages for interoperability. Applymbox accepts additional flag '-u' which is passed to mailinfo. Signed-off-by: Junio C Hamano / 濱野純 <junkio@cox.net> 2005-08-28 21:33:16 +02:00			`break;`
			`case 'q':`
git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`dec = decode_q_segment(&piecebuf, 1);`
mailinfo and applymbox updates This attempts to minimally cope with a subset of MIME "features" often seen in patches sent to our mailing lists. Namely: - People's name spelled in characters outside ASCII (both on From: header and the signed-off-by line). - Content-transfer-encoding using quoted-printable (both in multipart and non-multipart messages). These MIME features are detected and decoded by "git mailinfo". Optionally, with the '-u' flag, the output to .info and .msg is transliterated from its original chaset to utf-8. This is to encourage people to use utf8 in their commit messages for interoperability. Applymbox accepts additional flag '-u' which is passed to mailinfo. Signed-off-by: Junio C Hamano / 濱野純 <junkio@cox.net> 2005-08-28 21:33:16 +02:00			`break;`
			`}`
mailinfo: allow -u to fall back on latin1 to utf8 conversion. When the message body does not identify what encoding it is in, -u assumes it is in latin-1 and converts it to utf8, which is the recommended encoding for git commit log messages. With -u=<encoding>, the conversion is made into the specified one, instead of utf8, to allow project-local policies. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-11-28 01:22:16 +01:00			`if (metainfo_charset)`
git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`convert_to_utf8(dec, charset_q.buf);`
Temporary fix for stack smashing in mailinfo Signed-off-by: Alex Riesen <raa.lkml@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-08-30 23:48:24 +02:00
git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`strbuf_addbuf(&outbuf, dec);`
			`strbuf_release(dec);`
			`free(dec);`
mailinfo and applymbox updates This attempts to minimally cope with a subset of MIME "features" often seen in patches sent to our mailing lists. Namely: - People's name spelled in characters outside ASCII (both on From: header and the signed-off-by line). - Content-transfer-encoding using quoted-printable (both in multipart and non-multipart messages). These MIME features are detected and decoded by "git mailinfo". Optionally, with the '-u' flag, the output to .info and .msg is transliterated from its original chaset to utf-8. This is to encourage people to use utf8 in their commit messages for interoperability. Applymbox accepts additional flag '-u' which is passed to mailinfo. Signed-off-by: Junio C Hamano / 濱野純 <junkio@cox.net> 2005-08-28 21:33:16 +02:00			`in = ep + 2;`
			`}`
git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`strbuf_addstr(&outbuf, in);`
			`strbuf_reset(it);`
			`strbuf_addbuf(it, &outbuf);`
			`decode_header_bq_out:`
			`strbuf_release(&outbuf);`
			`strbuf_release(&charset_q);`
			`strbuf_release(&piecebuf);`
mailinfo: assume input is latin-1 on the header as we do for the body When the input mbox does not identify what encoding it is in, and already have RFC2047 stripped away, we cannot tell what encoding the header text is in. For body text, when the message does not say what charset it is in, we fall back to assume latin-1 input when converting to utf8. This should be done consistently to the header as well. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-07-05 23:17:49 +02:00			`return rfc2047;`
			`}`
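
The encoded-word layout that `decode_header_bq` walks through (`=?charset?encoding?payload?=`) can be illustrated with a small standalone parser. This is only a sketch under assumptions: `struct sketch_encoded_word` and `sketch_parse_encoded_word` are hypothetical names, it uses plain NUL-terminated strings instead of `struct strbuf`, and it handles a single encoded-word rather than the whole header.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result type for one "=?charset?enc?payload?=" token. */
struct sketch_encoded_word {
	char charset[64];	/* NUL-terminated charset name */
	char encoding;		/* 'b' or 'q', lowercased */
	const char *payload;	/* points into the input string */
	size_t payload_len;	/* payload length in bytes */
	const char *rest;	/* first byte after the closing "?=" */
};

/* Returns 1 and fills *w if an encoded-word is found in s, else 0. */
static int sketch_parse_encoded_word(const char *s, struct sketch_encoded_word *w)
{
	const char *start, *q, *end;
	size_t n;

	start = strstr(s, "=?");
	if (!start)
		return 0;
	start += 2;			/* skip "=?" */

	q = strchr(start, '?');		/* end of the charset name */
	if (!q)
		return 0;
	n = (size_t)(q - start);
	if (n == 0 || n >= sizeof(w->charset))
		return 0;

	/* one encoding letter followed by '?' */
	if (!q[1] || q[2] != '?')
		return 0;
	w->encoding = (char)tolower((unsigned char)q[1]);
	if (w->encoding != 'b' && w->encoding != 'q')
		return 0;

	end = strstr(q + 3, "?=");	/* closing delimiter */
	if (!end)
		return 0;

	memcpy(w->charset, start, n);
	w->charset[n] = '\0';
	w->payload = q + 3;
	w->payload_len = (size_t)(end - (q + 3));
	w->rest = end + 2;
	return 1;
}
```

For the example from the comment above, `"=?ISO-8859-1?Q?Foo=FCbar?= baz"`, this sketch would report charset `ISO-8859-1`, encoding `q`, a 9-byte payload `Foo=FCbar`, and `" baz"` as the remainder.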

git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`static void decode_header(struct strbuf *it)`
mailinfo: assume input is latin-1 on the header as we do for the body When the input mbox does not identify what encoding it is in, and already have RFC2047 stripped away, we cannot tell what encoding the header text is in. For body text, when the message does not say what charset it is in, we fall back to assume latin-1 input when converting to utf8. This should be done consistently to the header as well. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-07-05 23:17:49 +02:00			`{`
git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`if (decode_header_bq(it))`
mailinfo: assume input is latin-1 on the header as we do for the body When the input mbox does not identify what encoding it is in, and already have RFC2047 stripped away, we cannot tell what encoding the header text is in. For body text, when the message does not say what charset it is in, we fall back to assume latin-1 input when converting to utf8. This should be done consistently to the header as well. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-07-05 23:17:49 +02:00			`return;`
			`/* otherwise "it" is a straight copy of the input.`
			`* This can be binary guck but there is no charset specified.`
			`*/`
			`if (metainfo_charset)`
git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`convert_to_utf8(it, "");`
mailinfo and applymbox updates This attempts to minimally cope with a subset of MIME "features" often seen in patches sent to our mailing lists. Namely: - People's name spelled in characters outside ASCII (both on From: header and the signed-off-by line). - Content-transfer-encoding using quoted-printable (both in multipart and non-multipart messages). These MIME features are detected and decoded by "git mailinfo". Optionally, with the '-u' flag, the output to .info and .msg is transliterated from its original chaset to utf-8. This is to encourage people to use utf8 in their commit messages for interoperability. Applymbox accepts additional flag '-u' which is passed to mailinfo. Signed-off-by: Junio C Hamano / 濱野純 <junkio@cox.net> 2005-08-28 21:33:16 +02:00			`}`

git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`static void decode_transfer_encoding(struct strbuf *line)`
mailinfo and applymbox updates This attempts to minimally cope with a subset of MIME "features" often seen in patches sent to our mailing lists. Namely: - People's name spelled in characters outside ASCII (both on From: header and the signed-off-by line). - Content-transfer-encoding using quoted-printable (both in multipart and non-multipart messages). These MIME features are detected and decoded by "git mailinfo". Optionally, with the '-u' flag, the output to .info and .msg is transliterated from its original chaset to utf-8. This is to encourage people to use utf8 in their commit messages for interoperability. Applymbox accepts additional flag '-u' which is passed to mailinfo. Signed-off-by: Junio C Hamano / 濱野純 <junkio@cox.net> 2005-08-28 21:33:16 +02:00			`{`
git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`struct strbuf *ret;`
mailinfo and applymbox updates This attempts to minimally cope with a subset of MIME "features" often seen in patches sent to our mailing lists. Namely: - People's name spelled in characters outside ASCII (both on From: header and the signed-off-by line). - Content-transfer-encoding using quoted-printable (both in multipart and non-multipart messages). These MIME features are detected and decoded by "git mailinfo". Optionally, with the '-u' flag, the output to .info and .msg is transliterated from its original chaset to utf-8. This is to encourage people to use utf8 in their commit messages for interoperability. Applymbox accepts additional flag '-u' which is passed to mailinfo. Signed-off-by: Junio C Hamano / 濱野純 <junkio@cox.net> 2005-08-28 21:33:16 +02:00
			`switch (transfer_encoding) {`
			`case TE_QP:`
git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`ret = decode_q_segment(line, 0);`
			`break;`
mailinfo and applymbox updates This attempts to minimally cope with a subset of MIME "features" often seen in patches sent to our mailing lists. Namely: - People's name spelled in characters outside ASCII (both on From: header and the signed-off-by line). - Content-transfer-encoding using quoted-printable (both in multipart and non-multipart messages). These MIME features are detected and decoded by "git mailinfo". Optionally, with the '-u' flag, the output to .info and .msg is transliterated from its original chaset to utf-8. This is to encourage people to use utf8 in their commit messages for interoperability. Applymbox accepts additional flag '-u' which is passed to mailinfo. Signed-off-by: Junio C Hamano / 濱野純 <junkio@cox.net> 2005-08-28 21:33:16 +02:00			`case TE_BASE64:`
git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`ret = decode_b_segment(line);`
			`break;`
mailinfo and applymbox updates This attempts to minimally cope with a subset of MIME "features" often seen in patches sent to our mailing lists. Namely: - People's name spelled in characters outside ASCII (both on From: header and the signed-off-by line). - Content-transfer-encoding using quoted-printable (both in multipart and non-multipart messages). These MIME features are detected and decoded by "git mailinfo". Optionally, with the '-u' flag, the output to .info and .msg is transliterated from its original chaset to utf-8. This is to encourage people to use utf8 in their commit messages for interoperability. Applymbox accepts additional flag '-u' which is passed to mailinfo. Signed-off-by: Junio C Hamano / 濱野純 <junkio@cox.net> 2005-08-28 21:33:16 +02:00			`case TE_DONTCARE:`
mailinfo: apply the same fix not to lose NULs in BASE64 and QP codepaths Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-05-25 10:16:05 +02:00			`default:`
git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`return;`
mailinfo and applymbox updates This attempts to minimally cope with a subset of MIME "features" often seen in patches sent to our mailing lists. Namely: - People's name spelled in characters outside ASCII (both on From: header and the signed-off-by line). - Content-transfer-encoding using quoted-printable (both in multipart and non-multipart messages). These MIME features are detected and decoded by "git mailinfo". Optionally, with the '-u' flag, the output to .info and .msg is transliterated from its original chaset to utf-8. This is to encourage people to use utf8 in their commit messages for interoperability. Applymbox accepts additional flag '-u' which is passed to mailinfo. Signed-off-by: Junio C Hamano / 濱野純 <junkio@cox.net> 2005-08-28 21:33:16 +02:00			`}`
git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`strbuf_reset(line);`
			`strbuf_addbuf(line, ret);`
			`strbuf_release(ret);`
			`free(ret);`
mailinfo and applymbox updates This attempts to minimally cope with a subset of MIME "features" often seen in patches sent to our mailing lists. Namely: - People's name spelled in characters outside ASCII (both on From: header and the signed-off-by line). - Content-transfer-encoding using quoted-printable (both in multipart and non-multipart messages). These MIME features are detected and decoded by "git mailinfo". Optionally, with the '-u' flag, the output to .info and .msg is transliterated from its original chaset to utf-8. This is to encourage people to use utf8 in their commit messages for interoperability. Applymbox accepts additional flag '-u' which is passed to mailinfo. Signed-off-by: Junio C Hamano / 濱野純 <junkio@cox.net> 2005-08-28 21:33:16 +02:00			`}`
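
Both `decode_header_bq` and `decode_transfer_encoding` hand quoted-printable text to `decode_q_segment`, whose body is not shown in this section. As a rough sketch of what such a decoder has to do, assuming the second argument selects the RFC 2047 header variant in which `_` stands for a space, something like the following would work. `sketch_decode_q` and `sketch_hexval` are hypothetical names, and the caller-supplied output buffer is a simplification of the `strbuf` interface used in this file.

```c
#include <stddef.h>

/* Hypothetical helper: value of a hex digit, or -1 if c is not one. */
static int sketch_hexval(int c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	if (c >= 'A' && c <= 'F')
		return c - 'A' + 10;
	return -1;
}

/*
 * Decode quoted-printable text from in[0..len) into out, which must have
 * room for len + 1 bytes. Returns the decoded length; out is NUL-terminated.
 * When rfc2047 is nonzero, '_' is decoded as a space (header "Q" encoding).
 */
static size_t sketch_decode_q(const char *in, size_t len, int rfc2047, char *out)
{
	size_t i = 0, o = 0;

	while (i < len) {
		char c = in[i++];

		if (c == '=') {
			if (i < len && in[i] == '\n') {
				i++;		/* soft line break: drop "=\n" */
				continue;
			}
			if (i + 1 < len) {
				int hi = sketch_hexval((unsigned char)in[i]);
				int lo = sketch_hexval((unsigned char)in[i + 1]);
				if (hi >= 0 && lo >= 0) {
					out[o++] = (char)((hi << 4) | lo);
					i += 2;
					continue;
				}
			}
			out[o++] = c;		/* malformed escape: keep '=' literally */
			continue;
		}
		if (rfc2047 && c == '_')
			c = ' ';
		out[o++] = c;
	}
	out[o] = '\0';
	return o;
}
```

With this sketch, the header payload `Foo=FCbar` decodes to seven bytes (`Foo`, the byte 0xFC, `bar`), which would still need charset conversion afterwards, as `convert_to_utf8` does above.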

git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`static void handle_filter(struct strbuf *line);`
builtin-mailinfo.c infrastrcture changes I am working on a project that required parsing through regular mboxes that didn't necessarily have patches embedded in them. I started by creating my own modified copy of git-am and working from there. Very quickly, I noticed git-mailinfo wasn't able to handle a big chunk of my email. After hacking up numerous solutions and running into more limitations, I decided it was just easier to rewrite a big chunk of it. The following patch has a bunch of fixes and features that I needed in order for me do what I wanted. Note: I'm didn't follow any email rfc papers but I don't think any of the changes I did required much knowledge (besides the boundary stuff). List of major changes/fixes: - can't create empty patch files fix - empty patch files don't fail, this failure will come inside git-am - multipart boundaries are now handled - only output inbody headers if a patch exists otherwise assume those headers are part of the reply and instead output the original headers - decode and filter base64 patches correctly - various other accidental fixes I believe I didn't break any existing functionality or compatibility (other than what I describe above, which is really only the empty patch file). I tested this through various mailing list archives and everything seemed to parse correctly (a couple thousand emails). [jc: squashed in another patch from Don's five patch series to fix the test case, as this patch exposes the bug in the test.] Signed-off-by: Don Zickus <dzickus@redhat.com> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-12 20:52:04 +01:00
			`static int find_boundary(void)`
Start of early patch applicator tools for git. I looked a bit at my old BK tools for the same thing, but they were just so horrid in many ways that I largely rewrote it all and these tools do things a bit differently. Instead of aggressively piping data from one process to another (which was clever but very hard to follow), this first just splits out the mbox into many smaller email files, and then does some scripts on these temporary files. 2005-04-12 08:46:50 +02:00			`{`
git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`while (!strbuf_getline(&line, fin, '\n')) {`
mailinfo: re-fix MIME multipart boundary parsing Recent changes to is_multipart_boundary() caused git-mailinfo to segfault. The reason was after handling the end of the boundary the code tried to look for another boundary. Because the boundary list was empty, dereferencing the pointer to the top of the boundary caused the program to go boom. The fix is to check to see if the list is empty and if so go on its merry way instead of looking for another boundary. I also fixed a couple of increments and decrements that didn't look correct relating to content_top. The boundary test case was updated to catch future problems like this again. Signed-off-by: Don Zickus <dzickus@redhat.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-08-14 17:35:42 +02:00			`if (*content_top && is_multipart_boundary(&line))`
			`return 1;`
			`}`
			`return 0;`
			`}`
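
The `*content_top &&` guard in find_boundary() matters because the boundary list can be empty once the last terminating boundary has been popped. Below is a minimal sketch of that idea, assuming a hypothetical fixed-depth stack of delimiter strings. The names `boundary_stack`, `stack_push`, `stack_pop`, `stack_current` and `MAX_DEPTH` are illustrative only and are not part of mailinfo, which keeps its own array of strbuf pointers.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_DEPTH 5	/* hypothetical nesting limit, chosen for the sketch */

struct boundary_stack {
	char *slot[MAX_DEPTH];
	char **top;	/* current slot; *top == NULL means the stack is empty */
};

static void stack_init(struct boundary_stack *s)
{
	assert(s != NULL);
	memset(s->slot, 0, sizeof(s->slot));
	s->top = s->slot;
}

/* Returns 0 on success, -1 if nesting is too deep or allocation fails. */
static int stack_push(struct boundary_stack *s, const char *delim)
{
	char *copy;

	assert(s != NULL && delim != NULL);
	if (*s->top && s->top + 1 >= s->slot + MAX_DEPTH)
		return -1;
	copy = malloc(strlen(delim) + 1);
	if (!copy)
		return -1;
	strcpy(copy, delim);
	if (*s->top)
		s->top++;	/* move up only when the current slot is occupied */
	*s->top = copy;
	return 0;
}

/* Returns 0 on success, -1 when there is nothing left to pop. */
static int stack_pop(struct boundary_stack *s)
{
	assert(s != NULL);
	if (!*s->top)
		return -1;
	free(*s->top);
	*s->top = NULL;
	if (s->top > s->slot)
		s->top--;
	return 0;
}

/* NULL when empty, so callers must check before dereferencing. */
static const char *stack_current(const struct boundary_stack *s)
{
	assert(s != NULL);
	return *s->top;
}
```

A caller that looks for the next boundary would test `stack_current()` for NULL first, which mirrors the role of the `*content_top` check above.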

			`static int handle_boundary(void)`
			`{`
			`struct strbuf newline = STRBUF_INIT;`

			`strbuf_addch(&newline, '\n');`
			`again:`
			`if (line.len >= (*content_top)->len + 2 &&`
			`!memcmp(line.buf + (*content_top)->len, "--", 2)) {`
			`/* we hit an end boundary */`
			`/* pop the current boundary off the stack */`
			`strbuf_release(*content_top);`
			`free(*content_top);`
			`*content_top = NULL;`

			`/* technically won't happen as is_multipart_boundary()`
			`will fail first. But just in case..`
			`*/`
			`if (--content_top < content) {`
			`fprintf(stderr, "Detected mismatched boundaries, "`
			`"can't recover\n");`
			`exit(1);`
			`}`
			`handle_filter(&newline);`
			`strbuf_release(&newline);`

			`/* skip to the next boundary */`
			`if (!find_boundary())`
			`return 0;`
			`goto again;`
			`}`

			`/* set some defaults */`
			`transfer_encoding = TE_DONTCARE;`
			`strbuf_reset(&charset);`
			`message_type = TYPE_TEXT;`
mailinfo and applymbox updates This attempts to minimally cope with a subset of MIME "features" often seen in patches sent to our mailing lists. Namely: - People's name spelled in characters outside ASCII (both on From: header and the signed-off-by line). - Content-transfer-encoding using quoted-printable (both in multipart and non-multipart messages). These MIME features are detected and decoded by "git mailinfo". Optionally, with the '-u' flag, the output to .info and .msg is transliterated from its original chaset to utf-8. This is to encourage people to use utf8 in their commit messages for interoperability. Applymbox accepts additional flag '-u' which is passed to mailinfo. Signed-off-by: Junio C Hamano / 濱野純 <junkio@cox.net> 2005-08-28 21:33:16 +02:00
			`/* slurp in this section's info */`
			`while (read_one_header_line(&line, fin))`
			`check_header(&line, p_hdr_data, 0);`

			`strbuf_release(&newline);`
mailinfo: fix MIME multi-part message boundary handling After finding a MIME multi-part message boundary line, the handle_body() function is supposed to first flush any accumulated contents from the previous part to the output stream. However, the code mistakenly output the boundary line it found. The old code that used one global, fixed-length buffer line[] used an alternate static buffer newline[] for keeping track of this accumulated contents and flushed newline[] upon seeing the boundary; when 3b6121f (git-mailinfo: use strbuf's instead of fixed buffers, 2008-07-13) converted a fixed-length buffer in this program to use strbuf,these two buffers were converted to "line" and "prev" (the latter of which now has a much more sensible name) strbufs, but the code mistakenly flushed "line" (which contains the boundary we have just found), instead of "prev". This resulted in the first boundary to be output in front of the first line of the message. The rewritten implementation of handle_boundary() lost the terminating newline; this would then result in the second line of the message to be stuck with the first line. The is_multipart_boundary() was designed to catch both the internal boundary and the terminating one (the one with trailing "--"); this also was broken with the rewrite, and the code in the handle_boundary() to handle the terminating boundary was never triggered. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-08-09 10:17:24 +02:00			`/* replenish line */`
			`if (strbuf_getline(&line, fin, '\n'))`
			`return 0;`
			`strbuf_addch(&line, '\n');`
			`return 1;`
			`}`
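
The commit message above distinguishes an internal boundary from the terminating one, which carries a trailing "--". Below is a minimal sketch of that classification on plain C strings. The names `classify_boundary` and `boundary_kind` are hypothetical, and the sketch assumes `delim` already includes the leading "--", as the `(*content_top)->len + 2` test in handle_boundary() suggests.

```c
#include <assert.h>
#include <string.h>

enum boundary_kind { NOT_BOUNDARY, PART_BOUNDARY, END_BOUNDARY };

/*
 * Hypothetical helper, not mailinfo's own code.
 * delim is the full delimiter including its leading "--", e.g. "--=-=-=".
 */
static enum boundary_kind classify_boundary(const char *line, const char *delim)
{
	size_t dlen, llen;

	assert(line != NULL && delim != NULL);
	dlen = strlen(delim);
	llen = strlen(line);

	/* The line must start with the delimiter to be a boundary at all. */
	if (llen < dlen || memcmp(line, delim, dlen))
		return NOT_BOUNDARY;

	/* A trailing "--" right after the delimiter marks the end boundary. */
	if (llen >= dlen + 2 && !memcmp(line + dlen, "--", 2))
		return END_BOUNDARY;

	return PART_BOUNDARY;
}
```

An end boundary is the point at which a parser would pop the current delimiter and resume scanning for the enclosing one. A part boundary is followed by that part's own headers.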

			`static inline int patchbreak(const struct strbuf *line)`
restrict the patch filtering I have come across many emails that use long strings of '-'s as separators for ideas. This patch below limits the separator to only 3 '-', with the intent that long string of '-'s will stay in the commit msg and not in the patch file. Signed-off-by: Don Zickus <dzickus@redhat.com> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-12 20:52:06 +01:00			`{`
			`size_t i;`

			`/* Beginning of a "diff -" header? */`
			`if (!prefixcmp(line->buf, "diff -"))`
			`return 1;`

			`/* CVS "Index: " line? */`
			`if (!prefixcmp(line->buf, "Index: "))`
			`return 1;`

			`/*`
			`* "--- <filename>" starts patches without headers`
			`* "---<sp>*" is a manual separator`
			`*/`
			`if (line->len < 4)`
			`return 0;`

			`if (!prefixcmp(line->buf, "---")) {`
			`/* space followed by a filename? */`
			`if (line->buf[3] == ' ' && !isspace(line->buf[4]))`
			`return 1;`
			`/* Just whitespace? */`
			`for (i = 3; i < line->len; i++) {`
			`unsigned char c = line->buf[i];`
			`if (c == '\n')`
			`return 1;`
			`if (!isspace(c))`
			`break;`
			`}`
			`return 0;`
			`}`
			`return 0;`
			`}`
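
The rule described in the "restrict the patch filtering" commit (exactly three dashes count as a separator, while longer runs of dashes stay in the commit message) can be tried in isolation. The following is a minimal sketch that restates patchbreak() on a NUL-terminated string instead of a strbuf. The name `is_patch_break` is hypothetical, and `strncmp` stands in for git's `prefixcmp`.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical stand-in for patchbreak(); takes a NUL-terminated line. */
static int is_patch_break(const char *line)
{
	size_t len, i;

	assert(line != NULL);
	len = strlen(line);

	/* Beginning of a "diff -" header, or a CVS "Index: " line. */
	if (!strncmp(line, "diff -", 6) || !strncmp(line, "Index: ", 7))
		return 1;

	/* Need "---" plus at least one more character. */
	if (len < 4 || strncmp(line, "---", 3))
		return 0;

	/* "--- <filename>" starts a patch without headers. */
	if (line[3] == ' ' && !isspace((unsigned char)line[4]))
		return 1;

	/* "---" followed only by whitespace up to the newline is a separator. */
	for (i = 3; i < len; i++) {
		unsigned char c = (unsigned char)line[i];
		if (c == '\n')
			return 1;
		if (!isspace(c))
			break;
	}
	return 0;
}
```

With this sketch, `"---\n"` and `"--- a/file.c\n"` are treated as patch breaks, while a ten-dash ruler such as `"----------\n"` is not, because the fourth character is neither a space nor the end of the line.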

			`static int handle_commit_msg(struct strbuf *line)`
			`{`
			`static int still_looking = 1;`

			`if (!cmitmsg)`
			`return 0;`

			`if (still_looking) {`
			`strbuf_ltrim(line);`
			`if (!line->len)`
			`return 0;`
			`if ((still_looking = check_header(line, s_hdr_data, 0)) != 0)`
			`return 0;`
			`}`
Refactor commit message handling. - Move handle_info into main so it is called once after everything has been parsed. This allows the removal of a static variable and removes two duplicate calls. - Move parsing of inbody headers into handle_commit. This means we parse the in-body headers after we have decoded the character set, and it removes code duplication between handle_multipart_one_part and handle_body. - Change the flag indicating that we have seen an in body prefix header into another bit in seen. This is a little more general and allows the possibility of parsing in body headers after the body message has begun. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-23 21:47:28 +02:00
git-mailinfo fixes for patch munging Don't translate the patch to UTF-8, instead preserve the data as is. This also reverts a test case that was included in the original patch series. Also allow overwriting the authorship and title information we gather from RFC2822 mail headers with additional in-body headers, which was pointed out by Linus. Signed-off-by: Don Zickus <dzickus@redhat.com> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-30 18:18:45 +02:00			`/* normalize the log message to UTF-8. */`
			`if (metainfo_charset)`
			`convert_to_utf8(line, charset.buf);`
restrict the patch filtering I have come across many emails that use long strings of '-'s as separators for ideas. This patch below limits the separator to only 3 '-', with the intent that long string of '-'s will stay in the commit msg and not in the patch file. Signed-off-by: Don Zickus <dzickus@redhat.com> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-12 20:52:06 +01:00			`if (patchbreak(line)) {`
			`fclose(cmitmsg);`
			`cmitmsg = NULL;`
			`return 1;`
			`}`
			`fputs(line->buf, cmitmsg);`
mailinfo and applymbox updates This attempts to minimally cope with a subset of MIME "features" often seen in patches sent to our mailing lists. Namely: - People's name spelled in characters outside ASCII (both on From: header and the signed-off-by line). - Content-transfer-encoding using quoted-printable (both in multipart and non-multipart messages). These MIME features are detected and decoded by "git mailinfo". Optionally, with the '-u' flag, the output to .info and .msg is transliterated from its original charset to utf-8. This is to encourage people to use utf8 in their commit messages for interoperability. Applymbox accepts additional flag '-u' which is passed to mailinfo. Signed-off-by: Junio C Hamano / 濱野純 <junkio@cox.net> 2005-08-28 21:33:16 +02:00			`return 0;`
			`}`
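The "restrict the patch filtering" change above describes treating only a bare three-dash line as the separator, so that long runs of dashes stay in the commit message. The following is a minimal sketch of that rule in standard C. It is not the body of `patchbreak()`, which is not shown in this section; the helper name `looks_like_patch_break` and the extra `diff -` and `Index: ` prefixes are assumptions made for illustration.

```c
#include <ctype.h>
#include <string.h>

/*
 * Hypothetical helper (an assumption, not git's patchbreak()):
 * returns 1 if `line` looks like the first line of the patch part,
 * 0 if it should still be treated as commit message text.
 */
static int looks_like_patch_break(const char *line)
{
	size_t i;

	/* A diff header or a CVS-style index line starts the patch. */
	if (!strncmp(line, "diff -", 6))
		return 1;
	if (!strncmp(line, "Index: ", 7))
		return 1;

	/* Everything else must begin with exactly the "---" marker. */
	if (strncmp(line, "---", 3))
		return 0;

	/* "--- <filename>" begins a patch that has no diff header. */
	if (line[3] == ' ' && line[4] != '\0' &&
	    !isspace((unsigned char)line[4]))
		return 1;

	/*
	 * Three dashes followed only by whitespace is a separator.
	 * A fourth dash (or any other text) means this is prose,
	 * for example a long "----------" divider in the message.
	 */
	for (i = 3; line[i]; i++)
		if (!isspace((unsigned char)line[i]))
			return 0;
	return 1;
}
```

With this sketch, `"---\n"` and `"--- a/file.c\n"` count as breaks, while `"----------\n"` stays in the log message.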

			`static void handle_patch(const struct strbuf *line)`
			`{`
			`fwrite(line->buf, 1, line->len, patchfile);`
			`patch_lines++;`
			`}`

			`static void handle_filter(struct strbuf *line)`
			`{`
			`static int filter = 0;`

			`/* filter tells us which part we left off on */`
			`switch (filter) {`
			`case 0:`
			`if (!handle_commit_msg(line))`
			`break;`
			`filter++;`
			`case 1:`
			`handle_patch(line);`
			`break;`
Start of early patch applicator tools for git. I looked a bit at my old BK tools for the same thing, but they were just so horrid in many ways that I largely rewrote it all and these tools do things a bit differently. Instead of aggressively piping data from one process to another (which was clever but very hard to follow), this first just splits out the mbox into many smaller email files, and then does some scripts on these temporary files. 2005-04-12 08:46:50 +02:00			`}`
			`}`
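`handle_filter()` above relies on a deliberate `switch` fall-through: the line that ends the commit message is not dropped but is handed straight to the patch stage. The sketch below shows the same pattern with in-memory buffers instead of the `cmitmsg` and `patchfile` streams. The names `struct splitter`, `feed` and `is_break` are hypothetical, and the separator test is reduced to a plain `---` prefix check for brevity.

```c
#include <string.h>

enum { IN_MESSAGE, IN_PATCH };

/* Hypothetical state holder; the original keeps this in a static int. */
struct splitter {
	int stage;
	char msg[256];
	char patch[256];
};

/* Simplified stand-in for the real separator test (an assumption). */
static int is_break(const char *line)
{
	return !strncmp(line, "---", 3);
}

/* Append `line` to `dst` without overflowing a buffer of `cap` bytes. */
static void append(char *dst, size_t cap, const char *line)
{
	strncat(dst, line, cap - strlen(dst) - 1);
}

static void feed(struct splitter *s, const char *line)
{
	switch (s->stage) {
	case IN_MESSAGE:
		if (!is_break(line)) {
			append(s->msg, sizeof(s->msg), line);
			break;
		}
		s->stage = IN_PATCH;
		/* fall through: the separator itself goes to the patch */
	case IN_PATCH:
		append(s->patch, sizeof(s->patch), line);
		break;
	}
}
```

Feeding `"fix typo\n"`, `"---\n"` and `"diff --git a/x b/x\n"` in that order leaves the first line in `msg` and the other two in `patch`, which mirrors how the separator line ends up in the patch file rather than in the log.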

			`static void handle_body(void)`
[PATCH] mailinfo: handle folded header. Some people split their long E-mail address over two lines using the RFC2822 header "folding". We can lose authorship information this way, so make a minimum effort to deal with it, instead of special casing only the "Subject:" field. We could teach mailsplit to unfold the folded header, but teaching mailinfo about folding would make more sense; a single message can be fed to mailinfo without going through mailsplit. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-07-23 11:10:31 +02:00			`{`
			`int len = 0;`
			`struct strbuf prev = STRBUF_INIT;`
			`/* Skip up to the first boundary */`
			`if (*content_top) {`
			`if (!find_boundary())`
			`goto handle_body_out;`
			`}`

			`do {`
			`strbuf_setlen(&line, line.len + len);`

			`/* process any boundary lines */`
			`if (*content_top && is_multipart_boundary(&line)) {`
			`/* flush any leftover */`
mailinfo: fix MIME multi-part message boundary handling After finding a MIME multi-part message boundary line, the handle_body() function is supposed to first flush any accumulated contents from the previous part to the output stream. However, the code mistakenly output the boundary line it found. The old code that used one global, fixed-length buffer line[] used an alternate static buffer newline[] for keeping track of this accumulated contents and flushed newline[] upon seeing the boundary; when 3b6121f (git-mailinfo: use strbuf's instead of fixed buffers, 2008-07-13) converted a fixed-length buffer in this program to use strbuf, these two buffers were converted to "line" and "prev" (the latter of which now has a much more sensible name) strbufs, but the code mistakenly flushed "line" (which contains the boundary we have just found), instead of "prev". This resulted in the first boundary being output in front of the first line of the message. The rewritten implementation of handle_boundary() lost the terminating newline; this would then result in the second line of the message being stuck to the first line. The is_multipart_boundary() was designed to catch both the internal boundary and the terminating one (the one with trailing "--"); this also was broken with the rewrite, and the code in the handle_boundary() to handle the terminating boundary was never triggered. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-08-09 10:17:24 +02:00			`if (prev.len) {`
			`handle_filter(&prev);`
			`strbuf_reset(&prev);`
			`}`
			`if (!handle_boundary())`
			`goto handle_body_out;`
			`}`

git-mailinfo fixes for patch munging Don't translate the patch to UTF-8, instead preserve the data as is. This also reverts a test case that was included in the original patch series. Also allow overwriting the authorship and title information we gather from RFC2822 mail headers with additional in-body headers, which was pointed out by Linus. Signed-off-by: Don Zickus <dzickus@redhat.com> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-30 18:18:45 +02:00			`/* Unwrap transfer encoding */`
			`decode_transfer_encoding(&line);`

			`switch (transfer_encoding) {`
			`case TE_BASE64:`
mailinfo: feed only one line to handle_filter() for QP input The function is intended to be fed one logical line at a time to inspect, but a QP encoded raw input line can have more than one lines, just like BASE64 encoded one. Quoting LF as =0A may be unusual but RFC2045 allows it. The issue was noticed and fixed by Jay Soffian. JC added a test to protect the fix from regressing later. Signed-off-by: Jay Soffian <jaysoffian@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-02-15 22:53:36 +01:00			`case TE_QP:`
			`{`
			`struct strbuf **lines, **it, *sb;`

			`/* Prepend any previous partial lines */`
			`strbuf_insert(&line, 0, prev.buf, prev.len);`
			`strbuf_reset(&prev);`

			`/* binary data most likely doesn't have newlines */`
			`if (message_type != TYPE_TEXT) {`
			`handle_filter(&line);`
			`break;`
			`}`
mailinfo: apply the same fix not to lose NULs in BASE64 and QP codepaths Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-05-25 10:16:05 +02:00			`/*`
			`* This is a decoded line that may contain`
			`* multiple new lines. Pass only one chunk`
			`* at a time to handle_filter()`
			`*/`
			`lines = strbuf_split(&line, '\n');`
			`for (it = lines; (sb = *it); it++) {`
			`if (*(it + 1) == NULL) /* The last line */`
			`if (sb->buf[sb->len - 1] != '\n') {`
			`/* Partial line, save it for later. */`
			`strbuf_addbuf(&prev, sb);`
			`break;`
			`}`
			`handle_filter(sb);`
			`}`
			`/*`
			`* The partial chunk is saved in "prev" and will be`
			`* appended by the next iteration of read_line_with_nul().`
			`*/`
			`strbuf_list_free(lines);`
mailinfo and applymbox updates This attempts to minimally cope with a subset of MIME "features" often seen in patches sent to our mailing lists. Namely: - People's name spelled in characters outside ASCII (both on From: header and the signed-off-by line). - Content-transfer-encoding using quoted-printable (both in multipart and non-multipart messages). These MIME features are detected and decoded by "git mailinfo". Optionally, with the '-u' flag, the output to .info and .msg is transliterated from its original chaset to utf-8. This is to encourage people to use utf8 in their commit messages for interoperability. Applymbox accepts additional flag '-u' which is passed to mailinfo. Signed-off-by: Junio C Hamano / 濱野純 <junkio@cox.net> 2005-08-28 21:33:16 +02:00			`break;`
[PATCH] mailinfo: handle folded header. Some people split their long E-mail address over two lines using the RFC2822 header "folding". We can lose authorship information this way, so make a minimum effort to deal with it, instead of special casing only the "Subject:" field. We could teach mailsplit to unfold the folded header, but teaching mailinfo about folding would make more sense; a single message can be fed to mailinfo without going through mailsplit. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-07-23 11:10:31 +02:00			`}`
			`default:`
			`handle_filter(&line);`
			`}`

			`strbuf_reset(&line);`
			`if (strbuf_avail(&line) < 100)`
			`strbuf_grow(&line, 100);`
			`} while ((len = read_line_with_nul(line.buf, strbuf_avail(&line), fin)));`

			`handle_body_out:`
			`strbuf_release(&prev);`
			`}`
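
The BASE64/QP branch of the loop above hands complete lines to `handle_filter()` and holds back an unterminated tail in `prev` until the next read. The following is a minimal standalone sketch of that carry-over idea, assuming plain `malloc` buffers in place of git's strbuf API. The names `struct carry`, `line_fn`, `struct seen`, `remember_line` and `feed_chunk` are hypothetical and do not exist in git.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the "prev" strbuf: an unterminated tail. */
struct carry {
	char *buf;
	size_t len;
};

/* Hypothetical consumer type, playing the role of handle_filter(). */
typedef void (*line_fn)(const char *line, size_t len, void *ctx);

/* Hypothetical consumer: counts lines and remembers the last one. */
struct seen {
	int count;
	char last[64];
};

void remember_line(const char *line, size_t len, void *ctx)
{
	struct seen *s = ctx;

	if (len >= sizeof(s->last))
		len = sizeof(s->last) - 1;
	memcpy(s->last, line, len);
	s->last[len] = '\0';
	s->count++;
}

/*
 * Feed one decoded chunk. Each complete line, including its '\n', goes to
 * fn; a trailing piece without '\n' is kept in c for the next call.
 */
void feed_chunk(struct carry *c, const char *chunk, size_t len,
		line_fn fn, void *ctx)
{
	size_t total, start = 0, i;
	char *all;

	assert(c && chunk && fn);
	total = c->len + len;
	all = malloc(total + 1);
	if (!all)
		abort();

	/* Prepend the saved partial line, then append the new chunk. */
	if (c->len)
		memcpy(all, c->buf, c->len);
	memcpy(all + c->len, chunk, len);

	for (i = 0; i < total; i++) {
		if (all[i] == '\n') {
			fn(all + start, i - start + 1, ctx);
			start = i + 1;
		}
	}

	/* Whatever follows the last newline waits for the next chunk. */
	free(c->buf);
	c->buf = NULL;
	c->len = total - start;
	if (c->len) {
		c->buf = malloc(c->len);
		if (!c->buf)
			abort();
		memcpy(c->buf, all + start, c->len);
	}
	free(all);
}
```

Starting from an empty `struct carry`, feeding `"ab\ncd"` and then `"ef\n"` should deliver `"ab\n"` on the first call and `"cdef\n"` on the second. The sketch leaves out the `message_type != TYPE_TEXT` shortcut that the loop above applies to binary data.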

			`static void output_header_lines(FILE *fout, const char *hdr, const struct strbuf *data)`
rebase: try not to munge commit log message This makes rebase/am keep the original commit log message better, even when it does not conform to "single line paragraph to say what it does, then explain and defend why it is a good change in later paragraphs" convention. This change is a two-edged sword. While the earlier behaviour would make such commit log messages more friendly to readers who expect to get the birds-eye view with oneline summary formats, users who primarily use git as a way to interact with foreign SCM systems would not care much about the convenience of oneline git log tools, but care more about preserving their own convention. This changes their commits less useful to readers who read them with git tools while keeping them more consistent with the foreign SCM systems they interact with. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-07-29 02:57:25 +02:00			`{`
			`const char *sp = data->buf;`
			`while (1) {`
			`char *ep = strchr(sp, '\n');`
			`int len;`
			`if (!ep)`
			`len = strlen(sp);`
			`else`
			`len = ep - sp;`
			`fprintf(fout, "%s: %.*s\n", hdr, len, sp);`
			`if (!ep)`
			`break;`
			`sp = ep + 1;`
			`}`
			`}`
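
`output_header_lines()` above repeats the header name in front of every line of a multi-line value. The following is a minimal sketch of the same loop, assuming a NUL-terminated C string instead of a `struct strbuf`. The name `print_header_lines` is hypothetical and is not part of git.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Same loop shape as output_header_lines(), fed from a plain C string. */
void print_header_lines(FILE *fout, const char *hdr, const char *data)
{
	const char *sp = data;

	assert(fout && hdr && data);
	for (;;) {
		const char *ep = strchr(sp, '\n');
		/* Length of the current line, newline excluded. */
		int len = ep ? (int)(ep - sp) : (int)strlen(sp);

		fprintf(fout, "%s: %.*s\n", hdr, len, sp);
		if (!ep)
			break;
		sp = ep + 1;
	}
}
```

With `hdr` set to `"Subject"` and `data` set to `"first\nsecond"`, the sketch writes two lines, each starting with `Subject: `. If `data` ends in a newline, the loop emits one more line with an empty value, because the text after the last newline is treated as a line of its own.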

			`static void handle_info(void)`
Start of early patch applicator tools for git. I looked a bit at my old BK tools for the same thing, but they were just so horrid in many ways that I largely rewrote it all and these tools do things a bit differently. Instead of aggressively piping data from one process to another (which was clever but very hard to follow), this first just splits out the mbox into many smaller email files, and then does some scripts on these temporary files. 2005-04-12 08:46:50 +02:00			`{`
			`struct strbuf *hdr;`
			`int i;`

			`for (i = 0; header[i]; i++) {`
			`/* only print inbody headers if we output a patch file */`
			`if (patch_lines && s_hdr_data[i])`
			`hdr = s_hdr_data[i];`
			`else if (p_hdr_data[i])`
			`hdr = p_hdr_data[i];`
			`else`
			`continue;`

			`if (!memcmp(header[i], "Subject", 7)) {`
			`if (!keep_subject) {`
			`cleanup_subject(hdr);`
			`cleanup_space(hdr);`
			`}`
			`output_header_lines(fout, "Subject", hdr);`
			`} else if (!memcmp(header[i], "From", 4)) {`
mailinfo: 'From:' header should be unfold as well At present we do headers unfolding (see RFC822 3.1.1. LONG HEADER FIELDS) for all fields except 'From' (always) and 'Subject' (when keep_subject is set) Not unfolding 'From' is a bug -- see above-mentioned RFC link. Signed-off-by: Kirill Smelkov <kirr@landau.phys.spbu.ru> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-01-13 00:22:11 +01:00			`cleanup_space(hdr);`
builtin-mailinfo.c infrastrcture changes I am working on a project that required parsing through regular mboxes that didn't necessarily have patches embedded in them. I started by creating my own modified copy of git-am and working from there. Very quickly, I noticed git-mailinfo wasn't able to handle a big chunk of my email. After hacking up numerous solutions and running into more limitations, I decided it was just easier to rewrite a big chunk of it. The following patch has a bunch of fixes and features that I needed in order for me do what I wanted. Note: I'm didn't follow any email rfc papers but I don't think any of the changes I did required much knowledge (besides the boundary stuff). List of major changes/fixes: - can't create empty patch files fix - empty patch files don't fail, this failure will come inside git-am - multipart boundaries are now handled - only output inbody headers if a patch exists otherwise assume those headers are part of the reply and instead output the original headers - decode and filter base64 patches correctly - various other accidental fixes I believe I didn't break any existing functionality or compatibility (other than what I describe above, which is really only the empty patch file). I tested this through various mailing list archives and everything seemed to parse correctly (a couple thousand emails). [jc: squashed in another patch from Don's five patch series to fix the test case, as this patch exposes the bug in the test.] Signed-off-by: Don Zickus <dzickus@redhat.com> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-12 20:52:04 +01:00			`handle_from(hdr);`
git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`fprintf(fout, "Author: %s\n", name.buf);`
			`fprintf(fout, "Email: %s\n", email.buf);`
builtin-mailinfo.c infrastrcture changes I am working on a project that required parsing through regular mboxes that didn't necessarily have patches embedded in them. I started by creating my own modified copy of git-am and working from there. Very quickly, I noticed git-mailinfo wasn't able to handle a big chunk of my email. After hacking up numerous solutions and running into more limitations, I decided it was just easier to rewrite a big chunk of it. The following patch has a bunch of fixes and features that I needed in order for me do what I wanted. Note: I'm didn't follow any email rfc papers but I don't think any of the changes I did required much knowledge (besides the boundary stuff). List of major changes/fixes: - can't create empty patch files fix - empty patch files don't fail, this failure will come inside git-am - multipart boundaries are now handled - only output inbody headers if a patch exists otherwise assume those headers are part of the reply and instead output the original headers - decode and filter base64 patches correctly - various other accidental fixes I believe I didn't break any existing functionality or compatibility (other than what I describe above, which is really only the empty patch file). I tested this through various mailing list archives and everything seemed to parse correctly (a couple thousand emails). [jc: squashed in another patch from Don's five patch series to fix the test case, as this patch exposes the bug in the test.] Signed-off-by: Don Zickus <dzickus@redhat.com> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-12 20:52:04 +01:00			`} else {`
			`cleanup_space(hdr);`
git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`fprintf(fout, "%s: %s\n", header[i], hdr->buf);`
builtin-mailinfo.c infrastrcture changes I am working on a project that required parsing through regular mboxes that didn't necessarily have patches embedded in them. I started by creating my own modified copy of git-am and working from there. Very quickly, I noticed git-mailinfo wasn't able to handle a big chunk of my email. After hacking up numerous solutions and running into more limitations, I decided it was just easier to rewrite a big chunk of it. The following patch has a bunch of fixes and features that I needed in order for me do what I wanted. Note: I'm didn't follow any email rfc papers but I don't think any of the changes I did required much knowledge (besides the boundary stuff). List of major changes/fixes: - can't create empty patch files fix - empty patch files don't fail, this failure will come inside git-am - multipart boundaries are now handled - only output inbody headers if a patch exists otherwise assume those headers are part of the reply and instead output the original headers - decode and filter base64 patches correctly - various other accidental fixes I believe I didn't break any existing functionality or compatibility (other than what I describe above, which is really only the empty patch file). I tested this through various mailing list archives and everything seemed to parse correctly (a couple thousand emails). [jc: squashed in another patch from Don's five patch series to fix the test case, as this patch exposes the bug in the test.] Signed-off-by: Don Zickus <dzickus@redhat.com> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-12 20:52:04 +01:00			`}`
mailinfo and applymbox updates This attempts to minimally cope with a subset of MIME "features" often seen in patches sent to our mailing lists. Namely: - People's name spelled in characters outside ASCII (both on From: header and the signed-off-by line). - Content-transfer-encoding using quoted-printable (both in multipart and non-multipart messages). These MIME features are detected and decoded by "git mailinfo". Optionally, with the '-u' flag, the output to .info and .msg is transliterated from its original chaset to utf-8. This is to encourage people to use utf8 in their commit messages for interoperability. Applymbox accepts additional flag '-u' which is passed to mailinfo. Signed-off-by: Junio C Hamano / 濱野純 <junkio@cox.net> 2005-08-28 21:33:16 +02:00			`}`
builtin-mailinfo.c infrastrcture changes I am working on a project that required parsing through regular mboxes that didn't necessarily have patches embedded in them. I started by creating my own modified copy of git-am and working from there. Very quickly, I noticed git-mailinfo wasn't able to handle a big chunk of my email. After hacking up numerous solutions and running into more limitations, I decided it was just easier to rewrite a big chunk of it. The following patch has a bunch of fixes and features that I needed in order for me do what I wanted. Note: I'm didn't follow any email rfc papers but I don't think any of the changes I did required much knowledge (besides the boundary stuff). List of major changes/fixes: - can't create empty patch files fix - empty patch files don't fail, this failure will come inside git-am - multipart boundaries are now handled - only output inbody headers if a patch exists otherwise assume those headers are part of the reply and instead output the original headers - decode and filter base64 patches correctly - various other accidental fixes I believe I didn't break any existing functionality or compatibility (other than what I describe above, which is really only the empty patch file). I tested this through various mailing list archives and everything seemed to parse correctly (a couple thousand emails). [jc: squashed in another patch from Don's five patch series to fix the test case, as this patch exposes the bug in the test.] Signed-off-by: Don Zickus <dzickus@redhat.com> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-12 20:52:04 +01:00			`fprintf(fout, "\n");`
Start of early patch applicator tools for git. I looked a bit at my old BK tools for the same thing, but they were just so horrid in many ways that I largely rewrote it all and these tools do things a bit differently. Instead of aggressively piping data from one process to another (which was clever but very hard to follow), this first just splits out the mbox into many smaller email files, and then does some scripts on these temporary files. 2005-04-12 08:46:50 +02:00			`}`
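
The selection rule in the loop above (use an in-body header only when a patch was actually output, otherwise fall back to the header from the original mail) can be sketched in isolation. This is a minimal sketch under stated assumptions, not the code git uses: `pick_header` and its plain `const char *` arguments are hypothetical stand-ins for the `strbuf` arrays `s_hdr_data` and `p_hdr_data`.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not part of builtin-mailinfo.c): choose which
 * copy of a header to print.
 *   in_body  - header found in the message body, or NULL
 *   original - header found in the mail header block, or NULL
 * Returns NULL when neither exists, meaning the header is skipped.
 */
const char *pick_header(int patch_lines, const char *in_body,
			const char *original)
{
	/* in-body headers only count if a patch file was output */
	if (patch_lines && in_body)
		return in_body;
	if (original)
		return original;
	return NULL;
}
```

With `patch_lines == 0` an in-body `Subject:` is ignored and treated as part of a reply, which matches the behaviour described in the commit message for this hunk.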

More missing static Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-06-08 11:22:56 +02:00			`static int mailinfo(FILE *in, FILE *out, int ks, const char *encoding,`
			`const char *msg, const char *patch)`
Make git-mailinfo a builtin [jc: with a bit of constness tightening] Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-06-13 22:21:50 +02:00			`{`
Make mailsplit and mailinfo strip whitespace from the start of the input Signed-off-by: Simon Sasburg <Simon.Sasburg@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-01 23:57:45 +01:00			`int peek;`
Make git-mailinfo a builtin [jc: with a bit of constness tightening] Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-06-13 22:21:50 +02:00			`keep_subject = ks;`
			`metainfo_charset = encoding;`
			`fin = in;`
			`fout = out;`

			`cmitmsg = fopen(msg, "w");`
			`if (!cmitmsg) {`
			`perror(msg);`
			`return -1;`
			`}`
			`patchfile = fopen(patch, "w");`
			`if (!patchfile) {`
			`perror(patch);`
			`fclose(cmitmsg);`
			`return -1;`
			`}`
builtin-mailinfo.c infrastrcture changes I am working on a project that required parsing through regular mboxes that didn't necessarily have patches embedded in them. I started by creating my own modified copy of git-am and working from there. Very quickly, I noticed git-mailinfo wasn't able to handle a big chunk of my email. After hacking up numerous solutions and running into more limitations, I decided it was just easier to rewrite a big chunk of it. The following patch has a bunch of fixes and features that I needed in order for me do what I wanted. Note: I'm didn't follow any email rfc papers but I don't think any of the changes I did required much knowledge (besides the boundary stuff). List of major changes/fixes: - can't create empty patch files fix - empty patch files don't fail, this failure will come inside git-am - multipart boundaries are now handled - only output inbody headers if a patch exists otherwise assume those headers are part of the reply and instead output the original headers - decode and filter base64 patches correctly - various other accidental fixes I believe I didn't break any existing functionality or compatibility (other than what I describe above, which is really only the empty patch file). I tested this through various mailing list archives and everything seemed to parse correctly (a couple thousand emails). [jc: squashed in another patch from Don's five patch series to fix the test case, as this patch exposes the bug in the test.] Signed-off-by: Don Zickus <dzickus@redhat.com> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-12 20:52:04 +01:00
git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`p_hdr_data = xcalloc(MAX_HDR_PARSED, sizeof(*p_hdr_data));`
			`s_hdr_data = xcalloc(MAX_HDR_PARSED, sizeof(*s_hdr_data));`
builtin-mailinfo.c infrastrcture changes I am working on a project that required parsing through regular mboxes that didn't necessarily have patches embedded in them. I started by creating my own modified copy of git-am and working from there. Very quickly, I noticed git-mailinfo wasn't able to handle a big chunk of my email. After hacking up numerous solutions and running into more limitations, I decided it was just easier to rewrite a big chunk of it. The following patch has a bunch of fixes and features that I needed in order for me do what I wanted. Note: I'm didn't follow any email rfc papers but I don't think any of the changes I did required much knowledge (besides the boundary stuff). List of major changes/fixes: - can't create empty patch files fix - empty patch files don't fail, this failure will come inside git-am - multipart boundaries are now handled - only output inbody headers if a patch exists otherwise assume those headers are part of the reply and instead output the original headers - decode and filter base64 patches correctly - various other accidental fixes I believe I didn't break any existing functionality or compatibility (other than what I describe above, which is really only the empty patch file). I tested this through various mailing list archives and everything seemed to parse correctly (a couple thousand emails). [jc: squashed in another patch from Don's five patch series to fix the test case, as this patch exposes the bug in the test.] Signed-off-by: Don Zickus <dzickus@redhat.com> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-12 20:52:04 +01:00
Make mailsplit and mailinfo strip whitespace from the start of the input Signed-off-by: Simon Sasburg <Simon.Sasburg@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-01 23:57:45 +01:00			`do {`
			`peek = fgetc(in);`
			`} while (isspace(peek));`
			`ungetc(peek, in);`

builtin-mailinfo.c infrastrcture changes I am working on a project that required parsing through regular mboxes that didn't necessarily have patches embedded in them. I started by creating my own modified copy of git-am and working from there. Very quickly, I noticed git-mailinfo wasn't able to handle a big chunk of my email. After hacking up numerous solutions and running into more limitations, I decided it was just easier to rewrite a big chunk of it. The following patch has a bunch of fixes and features that I needed in order for me do what I wanted. Note: I'm didn't follow any email rfc papers but I don't think any of the changes I did required much knowledge (besides the boundary stuff). List of major changes/fixes: - can't create empty patch files fix - empty patch files don't fail, this failure will come inside git-am - multipart boundaries are now handled - only output inbody headers if a patch exists otherwise assume those headers are part of the reply and instead output the original headers - decode and filter base64 patches correctly - various other accidental fixes I believe I didn't break any existing functionality or compatibility (other than what I describe above, which is really only the empty patch file). I tested this through various mailing list archives and everything seemed to parse correctly (a couple thousand emails). [jc: squashed in another patch from Don's five patch series to fix the test case, as this patch exposes the bug in the test.] Signed-off-by: Don Zickus <dzickus@redhat.com> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-12 20:52:04 +01:00			`/* process the email header */`
git-mailinfo: use strbuf's instead of fixed buffers Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-13 20:30:12 +02:00			`while (read_one_header_line(&line, fin))`
			`check_header(&line, p_hdr_data, 1);`
builtin-mailinfo.c infrastrcture changes I am working on a project that required parsing through regular mboxes that didn't necessarily have patches embedded in them. I started by creating my own modified copy of git-am and working from there. Very quickly, I noticed git-mailinfo wasn't able to handle a big chunk of my email. After hacking up numerous solutions and running into more limitations, I decided it was just easier to rewrite a big chunk of it. The following patch has a bunch of fixes and features that I needed in order for me do what I wanted. Note: I'm didn't follow any email rfc papers but I don't think any of the changes I did required much knowledge (besides the boundary stuff). List of major changes/fixes: - can't create empty patch files fix - empty patch files don't fail, this failure will come inside git-am - multipart boundaries are now handled - only output inbody headers if a patch exists otherwise assume those headers are part of the reply and instead output the original headers - decode and filter base64 patches correctly - various other accidental fixes I believe I didn't break any existing functionality or compatibility (other than what I describe above, which is really only the empty patch file). I tested this through various mailing list archives and everything seemed to parse correctly (a couple thousand emails). [jc: squashed in another patch from Don's five patch series to fix the test case, as this patch exposes the bug in the test.] Signed-off-by: Don Zickus <dzickus@redhat.com> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-12 20:52:04 +01:00
			`handle_body();`
			`handle_info();`
Make git-mailinfo a builtin [jc: with a bit of constness tightening] Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-06-13 22:21:50 +02:00
			`return 0;`
			`}`
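
The `fgetc`/`ungetc` loop in `mailinfo()` that strips whitespace from the start of the input can be tried on its own. The sketch below is an illustration only: `skip_leading_space` is a hypothetical name, and unlike the listing above it checks for `EOF` explicitly before calling `ungetc`, which is a small variation rather than what the file does.

```c
#include <ctype.h>
#include <stdio.h>

/*
 * Hypothetical helper: consume leading whitespace from `in` and leave
 * the stream positioned at the first non-space byte (or at EOF).
 * Returns the byte that was peeked, or EOF if the input ran out.
 */
int skip_leading_space(FILE *in)
{
	int peek;

	do {
		peek = fgetc(in);
	} while (peek != EOF && isspace(peek));

	/* push the first non-space byte back so the header parser sees it */
	if (peek != EOF)
		ungetc(peek, in);
	return peek;
}
```

Because only one byte is pushed back, this stays within the single byte of pushback that `ungetc` guarantees.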

Teach applymbox to keep the Subject: line. This corresponds to the -k flag to git format-patch --mbox option. The option should probably not be used when applying a real e-mail patch, but is needed when format-patch and applymbox pair is used for cherrypicking. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-08-17 07:18:27 +02:00			`static const char mailinfo_usage[] =`
Merge branch 'sb/dashless' * sb/dashless: Make usage strings dash-less t/: Use "test_must_fail git" instead of "! git" t/test-lib.sh: exit with small negagive int is ok with test_must_fail Conflicts: builtin-blame.c builtin-mailinfo.c builtin-mailsplit.c builtin-shortlog.c git-am.sh t/t4150-am.sh t/t4200-rerere.sh 2008-07-17 02:22:50 +02:00			`"git mailinfo [-k] [-u | --encoding=<encoding> | -n] msg patch <mail >info";`
mailinfo and applymbox updates This attempts to minimally cope with a subset of MIME "features" often seen in patches sent to our mailing lists. Namely: - People's name spelled in characters outside ASCII (both on From: header and the signed-off-by line). - Content-transfer-encoding using quoted-printable (both in multipart and non-multipart messages). These MIME features are detected and decoded by "git mailinfo". Optionally, with the '-u' flag, the output to .info and .msg is transliterated from its original chaset to utf-8. This is to encourage people to use utf8 in their commit messages for interoperability. Applymbox accepts additional flag '-u' which is passed to mailinfo. Signed-off-by: Junio C Hamano / 濱野純 <junkio@cox.net> 2005-08-28 21:33:16 +02:00
Call setup_git_directory() much earlier This changes the calling convention of built-in commands and passes the "prefix" (i.e. pathname of $PWD relative to the project root level) down to them. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-07-29 07:44:25 +02:00			`int cmd_mailinfo(int argc, const char **argv, const char *prefix)`
Start of early patch applicator tools for git. I looked a bit at my old BK tools for the same thing, but they were just so horrid in many ways that I largely rewrote it all and these tools do things a bit differently. Instead of aggressively piping data from one process to another (which was clever but very hard to follow), this first just splits out the mbox into many smaller email files, and then does some scripts on these temporary files. 2005-04-12 08:46:50 +02:00			`{`
-u is now default for 'git-mailinfo'. Originally from David Woodhouse, but also adjusts the callers of mailinfo to the new default. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-10 06:31:36 +01:00			`const char *def_charset;`

mailinfo: Use i18n.commitencoding This uses i18n.commitencoding configuration item to pick up the default commit encoding for the repository when converting form e-mail encoding to commit encoding (the default is utf8). Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-11-28 01:29:38 +01:00			`/* NEEDSWORK: might want to do the optional .git/ directory`
			`* discovery`
			`*/`
Provide git_config with a callback-data parameter git_config() only had a function parameter, but no callback data parameter. This assumes that all callback functions only modify global variables. With this patch, every callback gets a void * parameter, and it is hoped that this will help the libification effort. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-05-14 19:46:53 +02:00			`git_config(git_default_config, NULL);`
mailinfo: Use i18n.commitencoding This uses i18n.commitencoding configuration item to pick up the default commit encoding for the repository when converting form e-mail encoding to commit encoding (the default is utf8). Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-11-28 01:29:38 +01:00
Use 'UTF-8' rather than 'utf-8' everywhere for backward compatibility Some ancient platforms (Solaris 7, IRIX 6.5) do not understand 'utf-8', but all tested implementations understand 'UTF-8'. Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-05-19 01:44:39 +02:00			`def_charset = (git_commit_encoding ? git_commit_encoding : "UTF-8");`
-u is now default for 'git-mailinfo'. Originally from David Woodhouse, but also adjusts the callers of mailinfo to the new default. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-10 06:31:36 +01:00			`metainfo_charset = def_charset;`

Teach applymbox to keep the Subject: line. This corresponds to the -k flag to git format-patch --mbox option. The option should probably not be used when applying a real e-mail patch, but is needed when format-patch and applymbox pair is used for cherrypicking. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-08-17 07:18:27 +02:00			`while (1 < argc && argv[1][0] == '-') {`
			`if (!strcmp(argv[1], "-k"))`
			`keep_subject = 1;`
mailinfo and applymbox updates This attempts to minimally cope with a subset of MIME "features" often seen in patches sent to our mailing lists. Namely: - People's name spelled in characters outside ASCII (both on From: header and the signed-off-by line). - Content-transfer-encoding using quoted-printable (both in multipart and non-multipart messages). These MIME features are detected and decoded by "git mailinfo". Optionally, with the '-u' flag, the output to .info and .msg is transliterated from its original chaset to utf-8. This is to encourage people to use utf8 in their commit messages for interoperability. Applymbox accepts additional flag '-u' which is passed to mailinfo. Signed-off-by: Junio C Hamano / 濱野純 <junkio@cox.net> 2005-08-28 21:33:16 +02:00			`else if (!strcmp(argv[1], "-u"))`
-u is now default for 'git-mailinfo'. Originally from David Woodhouse, but also adjusts the callers of mailinfo to the new default. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-10 06:31:36 +01:00			`metainfo_charset = def_charset;`
			`else if (!strcmp(argv[1], "-n"))`
			`metainfo_charset = NULL;`
Mechanical conversion to use prefixcmp() This mechanically converts strncmp() to use prefixcmp(), but only when the parameters match specific patterns, so that they can be verified easily. Leftover from this will be fixed in a separate step, including idiotic conversions like if (!strncmp("foo", arg, 3)) => if (!(-prefixcmp(arg, "foo"))) This was done by using this script in px.perl #!/usr/bin/perl -i.bak -p if (/strncmp\(([^,]+), "([^\\"]*)", (\d+)\)/ && (length($2) == $3)) { s|strncmp\(([^,]+), "([^\\"]*)", (\d+)\)|prefixcmp($1, "$2")|; } if (/strncmp\("([^\\"]*)", ([^,]+), (\d+)\)/ && (length($1) == $3)) { s|strncmp\("([^\\"]*)", ([^,]+), (\d+)\)|(-prefixcmp($2, "$1"))|; } and running: $ git grep -l strncmp -- '*.c' | xargs perl px.perl Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-20 10:53:29 +01:00			`else if (!prefixcmp(argv[1], "--encoding="))`
mailinfo: Do not use -u=<encoding>; say --encoding=<encoding> Specifying the value for a single letter, single dash option parameter with equal sign looked funny, and more importantly calling the flag to override encoding from utf-8 to something else "-u" (obviously abbreviated from "utf-8") did not make any sense. So spell it out. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-11-28 10:29:52 +01:00			`metainfo_charset = argv[1] + 11;`
mailinfo and applymbox updates This attempts to minimally cope with a subset of MIME "features" often seen in patches sent to our mailing lists. Namely: - People's name spelled in characters outside ASCII (both on From: header and the signed-off-by line). - Content-transfer-encoding using quoted-printable (both in multipart and non-multipart messages). These MIME features are detected and decoded by "git mailinfo". Optionally, with the '-u' flag, the output to .info and .msg is transliterated from its original chaset to utf-8. This is to encourage people to use utf8 in their commit messages for interoperability. Applymbox accepts additional flag '-u' which is passed to mailinfo. Signed-off-by: Junio C Hamano / 濱野純 <junkio@cox.net> 2005-08-28 21:33:16 +02:00			`else`
mailinfo: Use i18n.commitencoding This uses i18n.commitencoding configuration item to pick up the default commit encoding for the repository when converting form e-mail encoding to commit encoding (the default is utf8). Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-11-28 01:29:38 +01:00			`usage(mailinfo_usage);`
Teach applymbox to keep the Subject: line. This corresponds to the -k flag to git format-patch --mbox option. The option should probably not be used when applying a real e-mail patch, but is needed when format-patch and applymbox pair is used for cherrypicking. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-08-17 07:18:27 +02:00			`argc--; argv++;`
			`}`

Avoid doing the "filelist" thing, since "git-apply" picks up the files automatically ..and git-apply does a lot better job at it anyway. Also, we break the comment/diff on a line that starts with "diff -", not just on the "---" line. Especially for git diffs, we actually want that line in the diff. (We should probably also break on "Index: ..." followed by "=====") 2005-06-23 18:40:23 +02:00			`if (argc != 3)`
mailinfo: Use i18n.commitencoding This uses i18n.commitencoding configuration item to pick up the default commit encoding for the repository when converting form e-mail encoding to commit encoding (the default is utf8). Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-11-28 01:29:38 +01:00			`usage(mailinfo_usage);`
Make git-mailinfo a builtin [jc: with a bit of constness tightening] Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-06-13 22:21:50 +02:00
			`return !!mailinfo(stdin, stdout, keep_subject, metainfo_charset, argv[1], argv[2]);`
Start of early patch applicator tools for git. I looked a bit at my old BK tools for the same thing, but they were just so horrid in many ways that I largely rewrote it all and these tools do things a bit differently. Instead of aggressively piping data from one process to another (which was clever but very hard to follow), this first just splits out the mbox into many smaller email files, and then does some scripts on these temporary files. 2005-04-12 08:46:50 +02:00			`}`
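
The option loop in `cmd_mailinfo()` can be exercised without git's globals. The following is a minimal sketch, assuming a hypothetical `struct mailinfo_opts` and `parse_opts` function that are not in the original file; it uses the standard `strncmp` in place of git's `prefixcmp`, and returns `-1` where the original calls `usage()`.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical container for the parsed options. */
struct mailinfo_opts {
	int keep_subject;    /* set by -k */
	const char *charset; /* NULL means "do not re-encode" (-n) */
};

/*
 * Parse leading dash options from argv[1..].
 * Returns the index of the first non-option argument,
 * or -1 if an unknown option is seen.
 */
int parse_opts(int argc, const char **argv, const char *def_charset,
	       struct mailinfo_opts *o)
{
	int i;

	o->keep_subject = 0;
	o->charset = def_charset;

	for (i = 1; i < argc && argv[i][0] == '-'; i++) {
		if (!strcmp(argv[i], "-k"))
			o->keep_subject = 1;
		else if (!strcmp(argv[i], "-u"))
			o->charset = def_charset;
		else if (!strcmp(argv[i], "-n"))
			o->charset = NULL;
		else if (!strncmp(argv[i], "--encoding=", 11))
			o->charset = argv[i] + 11; /* skip "--encoding=" */
		else
			return -1;
	}
	return i;
}
```

The constant 11 is the length of `"--encoding="`, the same offset the listing adds to `argv[1]`. After parsing, exactly two arguments (`msg` and `patch`) should remain, which corresponds to the `argc != 3` check above.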