mirror of https://github.com/git/git.git synced 2024-11-05 00:37:55 +01:00

921 lines

18 KiB

C

Start of early patch applicator tools for git. I looked a bit at my old BK tools for the same thing, but they were just so horrid in many ways that I largely rewrote it all and these tools do things a bit differently. Instead of aggressively piping data from one process to another (which was clever but very hard to follow), this first just splits out the mbox into many smaller email files, and then does some scripts on these temporary files. 2005-04-12 08:46:50 +02:00			`/*`
			`* Another stupid program, this one parsing the headers of an`
			`* email to figure out authorship and subject`
			`*/`
mailinfo: Use i18n.commitencoding This uses i18n.commitencoding configuration item to pick up the default commit encoding for the repository when converting form e-mail encoding to commit encoding (the default is utf8). Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-11-28 01:29:38 +01:00			`#include "cache.h"`
Make git-mailinfo a builtin [jc: with a bit of constness tightening] Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-06-13 22:21:50 +02:00			`#include "builtin.h"`
Move encoding conversion routine out of mailinfo to utf8.c This moves the body of convert_to_utf8() routine used in mailinfo to the utf8.c i18n library. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-12-24 08:36:55 +01:00			`#include "utf8.h"`

			`static FILE *cmitmsg, *patchfile, *fin, *fout;`

remove unnecessary initializations [jc: I needed to hand merge the changes to the updated codebase, so the result needs to be checked.] Signed-off-by: David Rientjes <rientjes@google.com> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-08-15 19:23:48 +02:00			`static int keep_subject;`
			`static const char *metainfo_charset;`
			`static char line[1000];`
			`static char name[1000];`
			`static char email[1000];`

mailinfo and applymbox updates This attempts to minimally cope with a subset of MIME "features" often seen in patches sent to our mailing lists. Namely: - People's name spelled in characters outside ASCII (both on From: header and the signed-off-by line). - Content-transfer-encoding using quoted-printable (both in multipart and non-multipart messages). These MIME features are detected and decoded by "git mailinfo". Optionally, with the '-u' flag, the output to .info and .msg is transliterated from its original chaset to utf-8. This is to encourage people to use utf8 in their commit messages for interoperability. Applymbox accepts additional flag '-u' which is passed to mailinfo. Signed-off-by: Junio C Hamano / 濱野純 <junkio@cox.net> 2005-08-28 21:33:16 +02:00			`static enum {`
			`TE_DONTCARE, TE_QP, TE_BASE64,`
			`} transfer_encoding;`
builtin-mailinfo.c infrastrcture changes I am working on a project that required parsing through regular mboxes that didn't necessarily have patches embedded in them. I started by creating my own modified copy of git-am and working from there. Very quickly, I noticed git-mailinfo wasn't able to handle a big chunk of my email. After hacking up numerous solutions and running into more limitations, I decided it was just easier to rewrite a big chunk of it. The following patch has a bunch of fixes and features that I needed in order for me do what I wanted. Note: I'm didn't follow any email rfc papers but I don't think any of the changes I did required much knowledge (besides the boundary stuff). List of major changes/fixes: - can't create empty patch files fix - empty patch files don't fail, this failure will come inside git-am - multipart boundaries are now handled - only output inbody headers if a patch exists otherwise assume those headers are part of the reply and instead output the original headers - decode and filter base64 patches correctly - various other accidental fixes I believe I didn't break any existing functionality or compatibility (other than what I describe above, which is really only the empty patch file). I tested this through various mailing list archives and everything seemed to parse correctly (a couple thousand emails). [jc: squashed in another patch from Don's five patch series to fix the test case, as this patch exposes the bug in the test.] Signed-off-by: Don Zickus <dzickus@redhat.com> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-12 20:52:04 +01:00			`static enum {`
			`TYPE_TEXT, TYPE_OTHER,`
			`} message_type;`

			`static char charset[256];`
			`static int patch_lines;`
			`static char **p_hdr_data, **s_hdr_data;`

			`#define MAX_HDR_PARSED 10`
			`#define MAX_BOUNDARIES 5`

			`static char *sanity_check(char *name, char *email)`
			`{`
			`int len = strlen(name);`
			`if (len < 3 || len > 60)`
			`return email;`
			`if (strchr(name, '@') || strchr(name, '<') || strchr(name, '>'))`
			`return email;`
			`return name;`
			`}`
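
The check above can be tried in isolation. The following is a minimal sketch, not the function from git itself: `pick_display_name` is a hypothetical name, and the `const` qualifiers are an assumption added for the example. It keeps the same 3..60 length bounds and the same rejection of `@`, `<` and `>`.

```c
#include <string.h>

/*
 * Hypothetical stand-alone variant of the check above: return the
 * candidate name when it looks plausible, otherwise fall back to the
 * email address.
 */
static const char *pick_display_name(const char *name, const char *email)
{
	size_t len = strlen(name);

	/* Reject names that are too short or too long (same 3..60 bounds). */
	if (len < 3 || len > 60)
		return email;
	/* Reject names that still contain address punctuation. */
	if (strchr(name, '@') || strchr(name, '<') || strchr(name, '>'))
		return email;
	return name;
}
```

Calling `pick_display_name("jd", "jd@example.com")` would return the address, because a two-character name fails the length test.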

mailinfo and git-am: allow "John Doe <johndoe>" An isolated developer could have a local-only e-mail, which will be stripped out by mailinfo because it lacks '@'. Define a fallback parser to accomodate that. At the same time, reject authorless patch in git-am. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-12-15 01:31:06 +01:00			`static int bogus_from(char *line)`
			`{`
			`/* John Doe <johndoe> */`
			`char *bra, *ket, *dst, *cp;`

			`/* This is fallback, so do not bother if we already have an`
			`* e-mail address.`
			`*/`
			`if (*email)`
			`return 0;`

			`bra = strchr(line, '<');`
			`if (!bra)`
			`return 0;`
			`ket = strchr(bra, '>');`
			`if (!ket)`
			`return 0;`

			`for (dst = email, cp = bra+1; cp < ket; )`
			`*dst++ = *cp++;`
			`*dst = 0;`
			`for (cp = line; isspace(*cp); cp++)`
			`;`
			`for (bra--; isspace(*bra); bra--)`
			`*bra = 0;`
			`cp = sanity_check(cp, email);`
			`strcpy(name, cp);`
			`return 1;`
			`}`

Allow in body headers beyond the in body header prefix. - handle_from is fixed to not mangle it's input line. - Then handle_inbody_header is allowed to look in the body of a commit message for additional headers that we haven't already seen. This allows patches with all of the right information in unfortunate places to be imported. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-23 21:58:36 +02:00			`static int handle_from(char *in_line)`
			`{`
			`char line[1000];`
			`char *at;`
			`char *dst;`

			`strcpy(line, in_line);`
			`at = strchr(line, '@');`
			`if (!at)`
			`return bogus_from(line);`

			`/*`
			`* If we already have one email, don't take any confusing lines`
			`*/`
			`if (*email && strchr(at+1, '@'))`
			`return 0;`

			`/* Pick up the string around '@', possibly delimited with <>`
			`* pair; that is the email part. White them out while copying.`
			`*/`
			`while (at > line) {`
			`char c = at[-1];`
			`if (isspace(c))`
			`break;`
			`if (c == '<') {`
			`at[-1] = ' ';`
			`break;`
			`}`
			`at--;`
			`}`
			`dst = email;`
			`for (;;) {`
			`unsigned char c = *at;`
			`if (!c || c == '>' || isspace(c)) {`
			`if (c == '>')`
			`*at = ' ';`
			`break;`
			`}`
			`*at++ = ' ';`
			`*dst++ = c;`
			`}`
			`*dst++ = 0;`

			`/* The remainder is name. It could be "John Doe <john.doe@xz>"`
			`* or "john.doe@xz (John Doe)", but we have whited out the`
			`* email part, so trim from both ends, possibly removing`
			`* the () pair at the end.`
			`*/`
			`at = line + strlen(line);`
			`while (at > line) {`
			`unsigned char c = *--at;`
			`if (!isspace(c)) {`
			`at[(c == ')') ? 0 : 1] = 0;`
			`break;`
			`}`
			`}`

			`at = line;`
			`for (;;) {`
			`unsigned char c = *at;`
			`if (!c || !isspace(c)) {`
			`if (c == '(')`
			`at++;`
			`break;`
			`}`
			`at++;`
			`}`
			`at = sanity_check(at, email);`
			`strcpy(name, at);`
			`return 1;`
			`}`
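
The "white out the address, then trim what is left" approach described in the comments above can be sketched as a self-contained function. This is a simplified illustration, not the code from git: `split_from`, its explicit buffer-size parameters and its fallback to the address when no name remains are all assumptions made for the example, and it does not consult any global state.

```c
#include <ctype.h>
#include <string.h>

/*
 * Hypothetical helper, not part of git: split a From: value such as
 * "John Doe <john.doe@xz>" or "john.doe@xz (John Doe)" into name and
 * email. Returns 1 on success, 0 when there is no '@' or a buffer is
 * too small.
 */
static int split_from(const char *in, char *name, size_t name_sz,
		      char *email, size_t email_sz)
{
	char buf[1000];
	char *at, *start, *end, *p;
	size_t len;

	if (strlen(in) >= sizeof(buf))
		return 0;
	strcpy(buf, in);

	at = strchr(buf, '@');
	if (!at)
		return 0;

	/* Walk left and right from '@' to find the address boundaries. */
	start = at;
	while (start > buf && !isspace((unsigned char)start[-1]) &&
	       start[-1] != '<')
		start--;
	end = at;
	while (*end && !isspace((unsigned char)*end) && *end != '>')
		end++;

	len = (size_t)(end - start);
	if (len >= email_sz)
		return 0;
	memcpy(email, start, len);
	email[len] = '\0';

	/* White out the address and its <> delimiters. */
	if (start > buf && start[-1] == '<')
		start--;
	if (*end == '>')
		end++;
	memset(start, ' ', (size_t)(end - start));

	/* Trim whitespace and an optional () pair from both ends. */
	p = buf + strlen(buf);
	while (p > buf && isspace((unsigned char)p[-1]))
		p--;
	if (p > buf && p[-1] == ')')
		p--;
	*p = '\0';
	p = buf;
	while (isspace((unsigned char)*p))
		p++;
	if (*p == '(')
		p++;

	/* Assumption for this sketch: use the address if no name is left. */
	if (!*p)
		p = email;
	if (strlen(p) >= name_sz)
		return 0;
	strcpy(name, p);
	return 1;
}
```

Working on a private copy in `buf` mirrors the `strcpy(line, in_line)` step above, so the caller's string is never modified.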

			`static int handle_header(char *line, char *data, int ofs)`
Get AUTHOR_DATE from the email Date: line Now that git does pretty reliable date parsing, we might as well get the date from the email itself. Of course, it's still questionable whether the date on the email is all that relevant, but it's certainly no worse than taking the commit date. 2005-05-02 06:42:53 +02:00			`{`
			`if (!line || !data)`
			`return 1;`

			`strcpy(data, line+ofs);`
mailinfo and applymbox updates This attempts to minimally cope with a subset of MIME "features" often seen in patches sent to our mailing lists. Namely: - People's name spelled in characters outside ASCII (both on From: header and the signed-off-by line). - Content-transfer-encoding using quoted-printable (both in multipart and non-multipart messages). These MIME features are detected and decoded by "git mailinfo". Optionally, with the '-u' flag, the output to .info and .msg is transliterated from its original chaset to utf-8. This is to encourage people to use utf8 in their commit messages for interoperability. Applymbox accepts additional flag '-u' which is passed to mailinfo. Signed-off-by: Junio C Hamano / 濱野純 <junkio@cox.net> 2005-08-28 21:33:16 +02:00			`return 0;`
			`}`

			`/* NOTE NOTE NOTE. We do not claim we do full MIME. We just attempt`
			`* to have enough heuristics to grok MIME encoded patches often found`
			`* on our mailing lists. For example, we do not even treat header lines`
			`* case insensitively.`
			`*/`

			`static int slurp_attr(const char *line, const char *name, char *attr)`
			`{`
Make some strings const Signed-off-by: Timo Hirvonen <tihirvon@gmail.com> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-06-28 11:04:39 +02:00			`const char *ends, *ap = strcasestr(line, name);`
			`size_t sz;`

			`if (!ap) {`
			`*attr = 0;`
			`return 0;`
			`}`
			`ap += strlen(name);`
			`if (*ap == '"') {`
			`ap++;`
			`ends = "\"";`
			`}`
			`else`
			`ends = "; \t";`
			`sz = strcspn(ap, ends);`
			`memcpy(attr, ap, sz);`
			`attr[sz] = 0;`
			`return 1;`
			`}`
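
As an illustration of the attribute extraction above, here is a minimal standalone sketch. It is not part of mailinfo: `slurp_attr_bounded` and `find_nocase` are hypothetical names, the `attr_size` parameter is an added assumption (the original trusts the caller's buffer to be large enough), and the hand-written case-insensitive search stands in for `strcasestr()`, which is a non-standard extension.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Case-insensitive substring search (stand-in for strcasestr()). */
static const char *find_nocase(const char *hay, const char *needle)
{
	size_t n = strlen(needle);

	for (; *hay; hay++) {
		size_t i = 0;
		while (i < n && hay[i] &&
		       tolower((unsigned char)hay[i]) ==
		       tolower((unsigned char)needle[i]))
			i++;
		if (i == n)
			return hay;
	}
	return n ? NULL : hay;
}

/*
 * Hypothetical bounded variant of slurp_attr(): copies the value that
 * follows `name` into attr, truncating to attr_size - 1 bytes.
 * A quoted value ends at the closing quote; an unquoted one ends at
 * ';', space or tab. Returns 1 if the attribute was found, else 0.
 */
static int slurp_attr_bounded(const char *line, const char *name,
			      char *attr, size_t attr_size)
{
	const char *ends, *ap;
	size_t sz;

	if (!attr_size)
		return 0;
	ap = find_nocase(line, name);
	if (!ap) {
		*attr = 0;
		return 0;
	}
	ap += strlen(name);
	if (*ap == '"') {
		ap++;
		ends = "\"";
	} else
		ends = "; \t";
	sz = strcspn(ap, ends);
	if (sz >= attr_size)
		sz = attr_size - 1;	/* truncate instead of overflowing */
	memcpy(attr, ap, sz);
	attr[sz] = 0;
	return 1;
}
```

Called with a `Content-Type` line and `"boundary="`, the sketch fills the buffer with the boundary string without its quotes; a line lacking the attribute leaves the buffer empty and returns 0.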

			`struct content_type {`
			`char *boundary;`
			`int boundary_len;`
			`};`

			`static struct content_type content[MAX_BOUNDARIES];`

			`static struct content_type *content_top = content;`

			`static int handle_content_type(char *line)`
			`{`
			`char boundary[256];`

			`if (strcasestr(line, "text/") == NULL)`
			`message_type = TYPE_OTHER;`
			`if (slurp_attr(line, "boundary=", boundary + 2)) {`
			`memcpy(boundary, "--", 2);`
			`if (++content_top >= &content[MAX_BOUNDARIES]) {`
			`fprintf(stderr, "Too many boundaries to handle\n");`
			`exit(1);`
			`}`
			`content_top->boundary_len = strlen(boundary);`
			`content_top->boundary = xmalloc(content_top->boundary_len+1);`
			`strcpy(content_top->boundary, boundary);`
mailinfo: barf and exist upon nested multipart. At least we can detect what we do not handle. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-09-07 01:46:34 +02:00			`}`
			`if (slurp_attr(line, "charset=", charset)) {`
			`int i, c;`
			`for (i = 0; (c = charset[i]) != 0; i++)`
			`charset[i] = tolower(c);`
			`}`
			`return 0;`
			`}`

			`static int handle_content_transfer_encoding(char *line)`
			`{`
			`if (strcasestr(line, "base64"))`
			`transfer_encoding = TE_BASE64;`
			`else if (strcasestr(line, "quoted-printable"))`
			`transfer_encoding = TE_QP;`
			`else`
			`transfer_encoding = TE_DONTCARE;`
			`return 0;`
			`}`

			`static int is_multipart_boundary(const char *line)`
			`{`
			`return (!memcmp(line, content_top->boundary, content_top->boundary_len));`
			`}`

			`static int eatspace(char *line)`
			`{`
			`int len = strlen(line);`
			`while (len > 0 && isspace(line[len-1]))`
			`line[--len] = 0;`
			`return len;`
			`}`

			`static char *cleanup_subject(char *subject)`
			`{`
Teach applymbox to keep the Subject: line. This corresponds to the -k flag to git format-patch --mbox option. The option should probably not be used when applying a real e-mail patch, but is needed when format-patch and applymbox pair is used for cherrypicking. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-08-17 07:18:27 +02:00			`if (keep_subject)`
			`return subject;`
			`for (;;) {`
			`char *p;`
			`int len, remove;`
			`switch (*subject) {`
			`case 'r': case 'R':`
			`if (!memcmp("e:", subject+1, 2)) {`
			`subject += 3;`
			`continue;`
			`}`
			`break;`
			`case ' ': case '\t': case ':':`
			`subject++;`
			`continue;`

			`case '[':`
			`p = strchr(subject, ']');`
			`if (!p) {`
			`subject++;`
			`continue;`
			`}`
			`len = strlen(p);`
			`remove = p - subject;`
			`if (remove <= len *2) {`
			`subject = p+1;`
			`continue;`
			`}`
			`break;`
			`}`
mailinfo: ignore blanks after in-body headers. [jc: this is based on Eric's patch but also fixes up the parsed subject headers]. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-06-18 01:58:51 +02:00			`eatspace(subject);`
			`return subject;`
			`}`
			`}`
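
To make the prefix-stripping rules above concrete, here is a minimal standalone sketch. The name `strip_subject_prefix` is hypothetical, the `keep_subject` flag is left out, and the trailing-whitespace trim is inlined rather than delegated to `eatspace()`. It also uses `strncmp` where the original uses `memcmp`, an assumption made so that a subject consisting of a lone `r` is never read past its terminator.

```c
#include <ctype.h>
#include <string.h>

/*
 * Skip leading "Re:" markers, blanks and colons, and drop a leading
 * "[...]" tag when the tag is short relative to what follows it.
 * Trailing whitespace is trimmed in place. Returns a pointer into the
 * caller's buffer.
 */
static char *strip_subject_prefix(char *subject)
{
	size_t n;

	for (;;) {
		char *p;
		size_t rest, skipped;

		switch (*subject) {
		case 'r': case 'R':
			if (!strncmp(subject + 1, "e:", 2)) {
				subject += 3;
				continue;
			}
			break;
		case ' ': case '\t': case ':':
			subject++;
			continue;
		case '[':
			p = strchr(subject, ']');
			if (!p) {
				subject++;
				continue;
			}
			rest = strlen(p);		/* "]" plus the text after it */
			skipped = (size_t)(p - subject);	/* "[" plus the tag text */
			if (skipped <= rest * 2) {
				subject = p + 1;
				continue;
			}
			break;
		}
		break;	/* nothing more to strip */
	}
	n = strlen(subject);
	while (n > 0 && isspace((unsigned char)subject[n - 1]))
		subject[--n] = 0;
	return subject;
}
```

The length comparison is the heuristic that keeps a long bracketed remark from being mistaken for a `[PATCH n/m]` tag: the bracket is dropped only when it is at most twice as long as the text that follows it.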
			`static void cleanup_space(char *buf)`
			`{`
			`unsigned char c;`
			`while ((c = *buf) != 0) {`
			`buf++;`
			`if (isspace(c)) {`
			`buf[-1] = ' ';`
			`c = *buf;`
			`while (isspace(c)) {`
			`int len = strlen(buf);`
			`memmove(buf, buf+1, len);`
			`c = *buf;`
			`}`
			`}`
			`}`
			`}`
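
The loop above calls `memmove` on the rest of the buffer once for every surplus whitespace character. For comparison, here is a minimal single-pass sketch with separate read and write positions. `collapse_space` is a hypothetical name, not something in mailinfo, and it is meant to produce the same result: each run of whitespace becomes one space, with leading and trailing runs kept as a single space rather than trimmed.

```c
#include <ctype.h>

/* Collapse every run of whitespace in buf to a single ' ', in place. */
static void collapse_space(char *buf)
{
	char *out = buf;	/* write position, never ahead of the read position */
	int in_run = 0;		/* nonzero while inside a whitespace run */

	for (; *buf; buf++) {
		if (isspace((unsigned char)*buf)) {
			if (!in_run)
				*out++ = ' ';
			in_run = 1;
		} else {
			*out++ = *buf;
			in_run = 0;
		}
	}
	*out = 0;
}
```

For header lines of ordinary length the difference is negligible; the sketch mainly shows the read/write pointer idiom.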

mailinfo: assume input is latin-1 on the header as we do for the body When the input mbox does not identify what encoding it is in, and already have RFC2047 stripped away, we cannot tell what encoding the header text is in. For body text, when the message does not say what charset it is in, we fall back to assume latin-1 input when converting to utf8. This should be done consistently to the header as well. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-07-05 23:17:49 +02:00			`static void decode_header(char *it);`
			`static char *header[MAX_HDR_PARSED] = {`
			`"From","Subject","Date",`
			`};`

git-mailinfo fixes for patch munging Don't translate the patch to UTF-8, instead preserve the data as is. This also reverts a test case that was included in the original patch series. Also allow overwriting the authorship and title information we gather from RFC2822 mail headers with additional in-body headers, which was pointed out by Linus. Signed-off-by: Don Zickus <dzickus@redhat.com> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-30 18:18:45 +02:00			`static int check_header(char *line, char **hdr_data, int overwrite)`
			`{`
			`int i;`

			`/* search for the interesting parts */`
			`for (i = 0; header[i]; i++) {`
			`int len = strlen(header[i]);`
			`if ((!hdr_data[i] || overwrite) &&`
			`!strncasecmp(line, header[i], len) &&`
			`line[len] == ':' && isspace(line[len + 1])) {`
Move B and Q decoding into check header. B and Q decoding is not appropriate for in body headers, so move it up to where we explicitly know we have a real email header. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-23 21:45:37 +02:00			`/* Unwrap inline B and Q encoding, and optionally`
			`* normalize the meta information to utf8.`
			`*/`
mailinfo: assume input is latin-1 on the header as we do for the body When the input mbox does not identify what encoding it is in, and already have RFC2047 stripped away, we cannot tell what encoding the header text is in. For body text, when the message does not say what charset it is in, we fall back to assume latin-1 input when converting to utf8. This should be done consistently to the header as well. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-07-05 23:17:49 +02:00			`decode_header(line + len + 2);`
builtin-mailinfo.c infrastrcture changes I am working on a project that required parsing through regular mboxes that didn't necessarily have patches embedded in them. I started by creating my own modified copy of git-am and working from there. Very quickly, I noticed git-mailinfo wasn't able to handle a big chunk of my email. After hacking up numerous solutions and running into more limitations, I decided it was just easier to rewrite a big chunk of it. The following patch has a bunch of fixes and features that I needed in order for me do what I wanted. Note: I'm didn't follow any email rfc papers but I don't think any of the changes I did required much knowledge (besides the boundary stuff). List of major changes/fixes: - can't create empty patch files fix - empty patch files don't fail, this failure will come inside git-am - multipart boundaries are now handled - only output inbody headers if a patch exists otherwise assume those headers are part of the reply and instead output the original headers - decode and filter base64 patches correctly - various other accidental fixes I believe I didn't break any existing functionality or compatibility (other than what I describe above, which is really only the empty patch file). I tested this through various mailing list archives and everything seemed to parse correctly (a couple thousand emails). [jc: squashed in another patch from Don's five patch series to fix the test case, as this patch exposes the bug in the test.] Signed-off-by: Don Zickus <dzickus@redhat.com> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-12 20:52:04 +01:00			`hdr_data[i] = xmalloc(1000 * sizeof(char));`
			`if (! handle_header(line, hdr_data[i], len + 2)) {`
			`return 1;`
			`}`
mailinfo and applymbox updates This attempts to minimally cope with a subset of MIME "features" often seen in patches sent to our mailing lists. Namely: - People's name spelled in characters outside ASCII (both on From: header and the signed-off-by line). - Content-transfer-encoding using quoted-printable (both in multipart and non-multipart messages). These MIME features are detected and decoded by "git mailinfo". Optionally, with the '-u' flag, the output to .info and .msg is transliterated from its original chaset to utf-8. This is to encourage people to use utf8 in their commit messages for interoperability. Applymbox accepts additional flag '-u' which is passed to mailinfo. Signed-off-by: Junio C Hamano / 濱野純 <junkio@cox.net> 2005-08-28 21:33:16 +02:00			`}`
			`}`

builtin-mailinfo.c infrastrcture changes I am working on a project that required parsing through regular mboxes that didn't necessarily have patches embedded in them. I started by creating my own modified copy of git-am and working from there. Very quickly, I noticed git-mailinfo wasn't able to handle a big chunk of my email. After hacking up numerous solutions and running into more limitations, I decided it was just easier to rewrite a big chunk of it. The following patch has a bunch of fixes and features that I needed in order for me do what I wanted. Note: I'm didn't follow any email rfc papers but I don't think any of the changes I did required much knowledge (besides the boundary stuff). List of major changes/fixes: - can't create empty patch files fix - empty patch files don't fail, this failure will come inside git-am - multipart boundaries are now handled - only output inbody headers if a patch exists otherwise assume those headers are part of the reply and instead output the original headers - decode and filter base64 patches correctly - various other accidental fixes I believe I didn't break any existing functionality or compatibility (other than what I describe above, which is really only the empty patch file). I tested this through various mailing list archives and everything seemed to parse correctly (a couple thousand emails). [jc: squashed in another patch from Don's five patch series to fix the test case, as this patch exposes the bug in the test.] Signed-off-by: Don Zickus <dzickus@redhat.com> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-12 20:52:04 +01:00			`/* Content stuff */`
			`if (!strncasecmp(line, "Content-Type", 12) &&`
			`line[12] == ':' && isspace(line[12 + 1])) {`
			`decode_header(line + 12 + 2);`
			`if (! handle_content_type(line)) {`
			`return 1;`
			`}`
			`}`
			`if (!strncasecmp(line, "Content-Transfer-Encoding", 25) &&`
			`line[25] == ':' && isspace(line[25 + 1])) {`
			`decode_header(line + 25 + 2);`
			`if (! handle_content_transfer_encoding(line)) {`
			`return 1;`
			`}`
			`}`

			`/* for inbody stuff */`
			`if (!memcmp(">From", line, 5) && isspace(line[5]))`
			`return 1;`
			`if (!memcmp("[PATCH]", line, 7) && isspace(line[7])) {`
			`for (i = 0; header[i]; i++) {`
			`if (!memcmp("Subject: ", header[i], 9)) {`
			`if (! handle_header(line, hdr_data[i], 0)) {`
			`return 1;`
			`}`
			`}`
			`}`
			`}`

			`/* no match */`
			`return 0;`
mailinfo and applymbox updates This attempts to minimally cope with a subset of MIME "features" often seen in patches sent to our mailing lists. Namely: - People's name spelled in characters outside ASCII (both on From: header and the signed-off-by line). - Content-transfer-encoding using quoted-printable (both in multipart and non-multipart messages). These MIME features are detected and decoded by "git mailinfo". Optionally, with the '-u' flag, the output to .info and .msg is transliterated from its original chaset to utf-8. This is to encourage people to use utf8 in their commit messages for interoperability. Applymbox accepts additional flag '-u' which is passed to mailinfo. Signed-off-by: Junio C Hamano / 濱野純 <junkio@cox.net> 2005-08-28 21:33:16 +02:00			`}`

mailinfo: More carefully parse header lines in read_one_header_line() We exited prematurely from header parsing loop when the header field did not have a space after the colon but we insisted on it, and we got the check wrong because we forgot that we strip the trailing whitespace before we do the check. The space after the colon is not even required by RFC2822, so stop requiring it. While we are at it, the header line is specified to be more strict than "anything with a colon in it" (there must be one or more characters before the colon, and they must not be controls, SP or non US-ASCII), so implement that check as well, lest we mistakenly think something like: Bogus not a header line: this is not. as a header line. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-26 09:46:58 +02:00			`static int is_rfc2822_header(char *line)`
			`{`
			`/*`
			`* The section that defines the loosest possible`
			`* field name is "3.6.8 Optional fields".`
			`*`
			`* optional-field = field-name ":" unstructured CRLF`
			`* field-name = 1*ftext`
			`* ftext = %d33-57 / %d59-126`
			`*/`
			`int ch;`
			`char *cp = line;`
mailinfo: do not get confused with logical lines that are too long. It basically considers all the continuation lines to be lines of their own, and if the total line is bigger than what we can fit in it, we just truncate the result rather than stop in the middle and then get confused when we try to parse the "next" line (which is just the remainder of the first line). [jc: added test, and tightened boundary a bit per list discussion.] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-26 20:10:59 +01:00
			`/* Count mbox From headers as headers */`
			`if (!memcmp(line, "From ", 5) || !memcmp(line, ">From ", 6))`
			`return 1;`

mailinfo: More carefully parse header lines in read_one_header_line() We exited prematurely from header parsing loop when the header field did not have a space after the colon but we insisted on it, and we got the check wrong because we forgot that we strip the trailing whitespace before we do the check. The space after the colon is not even required by RFC2822, so stop requiring it. While we are at it, the header line is specified to be more strict than "anything with a colon in it" (there must be one or more characters before the colon, and they must not be controls, SP or non US-ASCII), so implement that check as well, lest we mistakenly think something like: Bogus not a header line: this is not. as a header line. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-26 09:46:58 +02:00			`while ((ch = *cp++)) {`
			`if (ch == ':')`
			`return cp != line;`
			`if ((33 <= ch && ch <= 57) ||`
			`(59 <= ch && ch <= 126))`
			`continue;`
			`break;`
			`}`
			`return 0;`
			`}`
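
The field-name grammar quoted in the comment above (one or more `ftext` bytes, then a colon) can be sketched in isolation as follows. This is a minimal illustration, not the code from builtin-mailinfo.c: `looks_like_header` is a hypothetical name, and unlike the function above it explicitly rejects a line whose first byte is the colon.

```c
#include <string.h>

/* Hypothetical helper for illustration; not part of builtin-mailinfo.c. */
static int looks_like_header(const char *line)
{
	const char *cp = line;

	/* mbox separator lines are counted as headers, as in the code above. */
	if (!strncmp(line, "From ", 5) || !strncmp(line, ">From ", 6))
		return 1;

	for (; *cp; cp++) {
		unsigned char ch = (unsigned char)*cp;

		if (ch == ':')
			return cp != line; /* need at least one ftext byte first */
		if (ch < 33 || ch > 126)
			return 0; /* control, SP, or non US-ASCII */
	}
	return 0; /* no colon at all */
}
```

With this sketch, `"Subject: hello"` is accepted, while `"Bogus not a header line: this is not."` is rejected because a space appears before the colon.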

mailinfo: do not get confused with logical lines that are too long. It basically considers all the continuation lines to be lines of their own, and if the total line is bigger than what we can fit in it, we just truncate the result rather than stop in the middle and then get confused when we try to parse the "next" line (which is just the remainder of the first line). [jc: added test, and tightened boundary a bit per list discussion.] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-26 20:10:59 +01:00			`/*`
			`* sz is size of 'line' buffer in bytes. Must be reasonably`
			`* long enough to hold one physical real-world e-mail line.`
			`*/`
mailinfo and applymbox updates This attempts to minimally cope with a subset of MIME "features" often seen in patches sent to our mailing lists. Namely: - People's name spelled in characters outside ASCII (both on From: header and the signed-off-by line). - Content-transfer-encoding using quoted-printable (both in multipart and non-multipart messages). These MIME features are detected and decoded by "git mailinfo". Optionally, with the '-u' flag, the output to .info and .msg is transliterated from its original chaset to utf-8. This is to encourage people to use utf8 in their commit messages for interoperability. Applymbox accepts additional flag '-u' which is passed to mailinfo. Signed-off-by: Junio C Hamano / 濱野純 <junkio@cox.net> 2005-08-28 21:33:16 +02:00			`static int read_one_header_line(char *line, int sz, FILE *in)`
			`{`
mailinfo: do not get confused with logical lines that are too long. It basically considers all the continuation lines to be lines of their own, and if the total line is bigger than what we can fit in it, we just truncate the result rather than stop in the middle and then get confused when we try to parse the "next" line (which is just the remainder of the first line). [jc: added test, and tightened boundary a bit per list discussion.] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-26 20:10:59 +01:00			`int len;`

			`/*`
			`* We will read at most (sz-1) bytes and then potentially`
			`* re-add NUL after it. Accessing line[sz] after this is safe`
			`* and we can allow len to grow up to and including sz.`
			`*/`
			`sz--;`

			`/* Get the first part of the line. */`
			`if (!fgets(line, sz, in))`
			`return 0;`

			`/*`
			`* Is it an empty line or not a valid rfc2822 header?`
			`* If so, stop here, and return false ("not a header")`
			`*/`
			`len = eatspace(line);`
			`if (!len || !is_rfc2822_header(line)) {`
			`/* Re-add the newline */`
			`line[len] = '\n';`
			`line[len + 1] = '\0';`
			`return 0;`
			`}`

			`/*`
			`* Now we need to eat all the continuation lines..`
			`* Yuck, 2822 header "folding"`
			`*/`
			`for (;;) {`
			`int peek, addlen;`
			`static char continuation[1000];`

More accurately detect header lines in read_one_header_line Only count lines of the form '^.*: ' and '^From ' as email header lines. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-23 21:53:20 +02:00			`peek = fgetc(in); ungetc(peek, in);`
			`if (peek != ' ' && peek != '\t')`
			`break;`
mailinfo: do not get confused with logical lines that are too long. It basically considers all the continuation lines to be lines of their own, and if the total line is bigger than what we can fit in it, we just truncate the result rather than stop in the middle and then get confused when we try to parse the "next" line (which is just the remainder of the first line). [jc: added test, and tightened boundary a bit per list discussion.] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-26 20:10:59 +01:00			`if (!fgets(continuation, sizeof(continuation), in))`
			`break;`
			`addlen = eatspace(continuation);`
			`if (len < sz - 1) {`
			`if (addlen >= sz - len)`
			`addlen = sz - len - 1;`
			`memcpy(line + len, continuation, addlen);`
			`len += addlen;`
			`}`
mailinfo and applymbox updates This attempts to minimally cope with a subset of MIME "features" often seen in patches sent to our mailing lists. Namely: - People's name spelled in characters outside ASCII (both on From: header and the signed-off-by line). - Content-transfer-encoding using quoted-printable (both in multipart and non-multipart messages). These MIME features are detected and decoded by "git mailinfo". Optionally, with the '-u' flag, the output to .info and .msg is transliterated from its original chaset to utf-8. This is to encourage people to use utf8 in their commit messages for interoperability. Applymbox accepts additional flag '-u' which is passed to mailinfo. Signed-off-by: Junio C Hamano / 濱野純 <junkio@cox.net> 2005-08-28 21:33:16 +02:00			`}`
mailinfo: do not get confused with logical lines that are too long. It basically considers all the continuation lines to be lines of their own, and if the total line is bigger than what we can fit in it, we just truncate the result rather than stop in the middle and then get confused when we try to parse the "next" line (which is just the remainder of the first line). [jc: added test, and tightened boundary a bit per list discussion.] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-26 20:10:59 +01:00			`line[len] = 0;`

			`return 1;`
mailinfo and applymbox updates This attempts to minimally cope with a subset of MIME "features" often seen in patches sent to our mailing lists. Namely: - People's name spelled in characters outside ASCII (both on From: header and the signed-off-by line). - Content-transfer-encoding using quoted-printable (both in multipart and non-multipart messages). These MIME features are detected and decoded by "git mailinfo". Optionally, with the '-u' flag, the output to .info and .msg is transliterated from its original chaset to utf-8. This is to encourage people to use utf8 in their commit messages for interoperability. Applymbox accepts additional flag '-u' which is passed to mailinfo. Signed-off-by: Junio C Hamano / 濱野純 <junkio@cox.net> 2005-08-28 21:33:16 +02:00			`}`
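
The unfolding loop in `read_one_header_line()` above treats any physical line that starts with a space or tab as a continuation of the previous header, and truncates rather than overflows when the result does not fit. A rough sketch of the same idea on an in-memory string is shown below. `unfold_header` is a hypothetical helper, and it simplifies in two assumed ways: it drops CR bytes and it does not strip trailing whitespace the way `eatspace()` does.

```c
#include <stddef.h>

/*
 * Hypothetical helper: copy the first logical header line of `src` into
 * `dst` (capacity `dstsz`, including the NUL), joining folded lines.
 * Returns the number of bytes of `src` that were consumed.
 */
static size_t unfold_header(const char *src, char *dst, size_t dstsz)
{
	size_t in = 0, out = 0;

	if (dstsz == 0)
		return 0;
	for (;;) {
		/* Copy one physical line; bytes that do not fit are dropped. */
		while (src[in] && src[in] != '\n') {
			if (src[in] != '\r' && out + 1 < dstsz)
				dst[out++] = src[in];
			in++;
		}
		if (src[in] == '\n')
			in++;
		/* A next line starting with SP or TAB continues this header. */
		if (src[in] != ' ' && src[in] != '\t')
			break;
	}
	dst[out] = '\0';
	return in;
}
```

Because the whole logical line is always consumed, even when the output is truncated, the caller never mistakes the tail of an overlong header for the start of the next one.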

mailinfo: decode underscore used in "Q" encoding properly. Quoted-Printable (RFC 2045) and the "Q" encoding (RFC 2047) are subtly different; the latter is used on the mail header and an underscore needs to be decoded to 0x20. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-21 09:06:58 +02:00			`static int decode_q_segment(char *in, char *ot, char *ep, int rfc2047)`
mailinfo and applymbox updates This attempts to minimally cope with a subset of MIME "features" often seen in patches sent to our mailing lists. Namely: - People's name spelled in characters outside ASCII (both on From: header and the signed-off-by line). - Content-transfer-encoding using quoted-printable (both in multipart and non-multipart messages). These MIME features are detected and decoded by "git mailinfo". Optionally, with the '-u' flag, the output to .info and .msg is transliterated from its original chaset to utf-8. This is to encourage people to use utf8 in their commit messages for interoperability. Applymbox accepts additional flag '-u' which is passed to mailinfo. Signed-off-by: Junio C Hamano / 濱野純 <junkio@cox.net> 2005-08-28 21:33:16 +02:00			`{`
			`int c;`
			`while ((c = *in++) != 0 && (in <= ep)) {`
			`if (c == '=') {`
			`int d = *in++;`
			`if (d == '\n' || !d)`
			`break; /* drop trailing newline */`
			`*ot++ = ((hexval(d) << 4) | hexval(*in++));`
mailinfo: decode underscore used in "Q" encoding properly. Quoted-Printable (RFC 2045) and the "Q" encoding (RFC 2047) are subtly different; the latter is used on the mail header and an underscore needs to be decoded to 0x20. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-21 09:06:58 +02:00			`continue;`
mailinfo and applymbox updates This attempts to minimally cope with a subset of MIME "features" often seen in patches sent to our mailing lists. Namely: - People's name spelled in characters outside ASCII (both on From: header and the signed-off-by line). - Content-transfer-encoding using quoted-printable (both in multipart and non-multipart messages). These MIME features are detected and decoded by "git mailinfo". Optionally, with the '-u' flag, the output to .info and .msg is transliterated from its original chaset to utf-8. This is to encourage people to use utf8 in their commit messages for interoperability. Applymbox accepts additional flag '-u' which is passed to mailinfo. Signed-off-by: Junio C Hamano / 濱野純 <junkio@cox.net> 2005-08-28 21:33:16 +02:00			`}`
mailinfo: decode underscore used in "Q" encoding properly. Quoted-Printable (RFC 2045) and the "Q" encoding (RFC 2047) are subtly different; the latter is used on the mail header and an underscore needs to be decoded to 0x20. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-21 09:06:58 +02:00			`if (rfc2047 && c == '_') /* rfc2047 4.2 (2) */`
			`c = 0x20;`
			`*ot++ = c;`
mailinfo and applymbox updates This attempts to minimally cope with a subset of MIME "features" often seen in patches sent to our mailing lists. Namely: - People's name spelled in characters outside ASCII (both on From: header and the signed-off-by line). - Content-transfer-encoding using quoted-printable (both in multipart and non-multipart messages). These MIME features are detected and decoded by "git mailinfo". Optionally, with the '-u' flag, the output to .info and .msg is transliterated from its original chaset to utf-8. This is to encourage people to use utf8 in their commit messages for interoperability. Applymbox accepts additional flag '-u' which is passed to mailinfo. Signed-off-by: Junio C Hamano / 濱野純 <junkio@cox.net> 2005-08-28 21:33:16 +02:00			`}`
			`*ot = 0;`
			`return 0;`
			`}`
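
The difference between body quoted-printable and the RFC 2047 "Q" encoding handled above comes down to the underscore: in a header it stands for a space. The sketch below shows that rule on a length-delimited buffer. `decode_q` and `hex_digit` are hypothetical names, and the handling of malformed escapes (copying them through literally) is an assumption of this sketch rather than the behaviour of `decode_q_segment()`, which also stops at a `=` followed by a newline.

```c
#include <stddef.h>

/* Hypothetical helper: value of one hex digit, or -1 if it is not one. */
static int hex_digit(int c)
{
	if ('0' <= c && c <= '9')
		return c - '0';
	if ('a' <= c && c <= 'f')
		return c - 'a' + 10;
	if ('A' <= c && c <= 'F')
		return c - 'A' + 10;
	return -1;
}

/*
 * Decode `len` bytes of Q / quoted-printable text from `in` into `out`,
 * which must have room for len + 1 bytes. Returns the decoded length.
 */
static size_t decode_q(const char *in, size_t len, char *out, int rfc2047)
{
	size_t i = 0, n = 0;

	while (i < len) {
		int c = (unsigned char)in[i++];

		if (c == '=' && i + 2 <= len) {
			int hi = hex_digit((unsigned char)in[i]);
			int lo = hex_digit((unsigned char)in[i + 1]);

			if (hi >= 0 && lo >= 0) {
				out[n++] = (char)((hi << 4) | lo);
				i += 2;
				continue;
			}
		}
		if (rfc2047 && c == '_') /* RFC 2047 section 4.2 (2) */
			c = 0x20;
		out[n++] = (char)c;
	}
	out[n] = '\0';
	return n;
}
```

Passing `rfc2047 = 0` leaves underscores alone, which is what a message body needs.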

			`static int decode_b_segment(char *in, char *ot, char *ep)`
			`{`
			`/* Decode in..ep, possibly in-place to ot */`
			`int c, pos = 0, acc = 0;`

			`while ((c = *in++) != 0 && (in <= ep)) {`
			`if (c == '+')`
			`c = 62;`
			`else if (c == '/')`
			`c = 63;`
			`else if ('A' <= c && c <= 'Z')`
			`c -= 'A';`
			`else if ('a' <= c && c <= 'z')`
			`c -= 'a' - 26;`
			`else if ('0' <= c && c <= '9')`
			`c -= '0' - 52;`
			`else if (c == '=') {`
			`/* padding is almost like (c == 0), except we do`
			`* not output NUL resulting only from it;`
			`* for now we just trust the data.`
			`*/`
			`c = 0;`
			`}`
			`else`
			`continue; /* garbage */`
			`switch (pos++) {`
			`case 0:`
			`acc = (c << 2);`
			`break;`
			`case 1:`
			`*ot++ = (acc | (c >> 4));`
			`acc = (c & 15) << 4;`
			`break;`
			`case 2:`
			`*ot++ = (acc | (c >> 2));`
			`acc = (c & 3) << 6;`
			`break;`
			`case 3:`
			`*ot++ = (acc | c);`
			`acc = pos = 0;`
			`break;`
			`}`
			`}`
			`*ot = 0;`
			`return 0;`
			`}`
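
The base64 decoder above packs four 6-bit values into three output bytes and skips anything outside the alphabet. The same technique can be written with a bit accumulator, as in the sketch below. `decode_b64` and `b64_value` are hypothetical names, and stopping at the first `=` (instead of trusting the padding, as the comment above says the real code does) is an assumption made here so that no padding-only NUL bytes are emitted.

```c
#include <stddef.h>

/* Hypothetical helper: 6-bit value of a base64 character, or -1. */
static int b64_value(int c)
{
	if ('A' <= c && c <= 'Z')
		return c - 'A';
	if ('a' <= c && c <= 'z')
		return c - 'a' + 26;
	if ('0' <= c && c <= '9')
		return c - '0' + 52;
	if (c == '+')
		return 62;
	if (c == '/')
		return 63;
	return -1;
}

/*
 * Decode `len` bytes of base64 from `in` into `out`, which must have
 * room for (len * 3) / 4 + 1 bytes. Returns the decoded length.
 */
static size_t decode_b64(const char *in, size_t len, char *out)
{
	size_t i, n = 0;
	unsigned acc = 0;
	int bits = 0;

	for (i = 0; i < len; i++) {
		int v;

		if (in[i] == '=')
			break; /* padding: no more data follows */
		v = b64_value((unsigned char)in[i]);
		if (v < 0)
			continue; /* garbage such as newlines is skipped */
		acc = (acc << 6) | (unsigned)v;
		bits += 6;
		if (bits >= 8) {
			bits -= 8;
			out[n++] = (char)((acc >> bits) & 0xff);
		}
	}
	out[n] = '\0';
	return n;
}
```

Returning the decoded length lets the caller handle payloads that contain NUL bytes, which a NUL-terminated result alone cannot express.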

General const correctness fixes We shouldn't attempt to assign constant strings into char*, as the string is not writable at runtime. Likewise we should always be treating unsigned values as unsigned values, not as signed values. Most of these are very straightforward. The only exception is the (unnecessary) xstrdup/free in builtin-branch.c for the detached head case. Since this is a user-level interactive type program and that particular code path is executed no more than once, I feel that the extra xstrdup call is well worth the easy elimination of this warning. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-07 02:44:17 +01:00			`static void convert_to_utf8(char *line, const char *charset)`
mailinfo and applymbox updates This attempts to minimally cope with a subset of MIME "features" often seen in patches sent to our mailing lists. Namely: - People's name spelled in characters outside ASCII (both on From: header and the signed-off-by line). - Content-transfer-encoding using quoted-printable (both in multipart and non-multipart messages). These MIME features are detected and decoded by "git mailinfo". Optionally, with the '-u' flag, the output to .info and .msg is transliterated from its original chaset to utf-8. This is to encourage people to use utf8 in their commit messages for interoperability. Applymbox accepts additional flag '-u' which is passed to mailinfo. Signed-off-by: Junio C Hamano / 濱野純 <junkio@cox.net> 2005-08-28 21:33:16 +02:00			`{`
General const correctness fixes We shouldn't attempt to assign constant strings into char*, as the string is not writable at runtime. Likewise we should always be treating unsigned values as unsigned values, not as signed values. Most of these are very straightforward. The only exception is the (unnecessary) xstrdup/free in builtin-branch.c for the detached head case. Since this is a user-level interactive type program and that particular code path is executed no more than once, I feel that the extra xstrdup call is well worth the easy elimination of this warning. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-07 02:44:17 +01:00			`static const char latin_one[] = "latin1";`
			`const char *input_charset = charset ? charset : latin_one;`
Move encoding conversion routine out of mailinfo to utf8.c This moves the body of convert_to_utf8() routine used in mailinfo to the utf8.c i18n library. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-12-24 08:36:55 +01:00			`char *out = reencode_string(line, metainfo_charset, input_charset);`

-u is now default for 'git-mailinfo'. Originally from David Woodhouse, but also adjusts the callers of mailinfo to the new default. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-10 06:31:36 +01:00			`if (!out)`
			`die("cannot convert from %s to %s\n",`
			`input_charset, metainfo_charset);`
Move encoding conversion routine out of mailinfo to utf8.c This moves the body of convert_to_utf8() routine used in mailinfo to the utf8.c i18n library. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-12-24 08:36:55 +01:00			`strcpy(line, out);`
			`free(out);`
mailinfo and applymbox updates This attempts to minimally cope with a subset of MIME "features" often seen in patches sent to our mailing lists. Namely: - People's name spelled in characters outside ASCII (both on From: header and the signed-off-by line). - Content-transfer-encoding using quoted-printable (both in multipart and non-multipart messages). These MIME features are detected and decoded by "git mailinfo". Optionally, with the '-u' flag, the output to .info and .msg is transliterated from its original chaset to utf-8. This is to encourage people to use utf8 in their commit messages for interoperability. Applymbox accepts additional flag '-u' which is passed to mailinfo. Signed-off-by: Junio C Hamano / 濱野純 <junkio@cox.net> 2005-08-28 21:33:16 +02:00			`}`
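
When no charset is given, `convert_to_utf8()` above falls back to treating the input as latin1. That particular conversion is simple enough to sketch by hand, which also shows why the result can be up to twice as long as the input. `latin1_to_utf8` is a hypothetical helper for illustration only; the real code delegates to `reencode_string()` and copies the result back with `strcpy()`.

```c
#include <stddef.h>

/*
 * Hypothetical helper: convert NUL-terminated Latin-1 text in `in` to
 * UTF-8 in `out` (capacity `outsz`, including the NUL). Stops early
 * rather than overflowing. Returns the number of bytes written.
 */
static size_t latin1_to_utf8(const char *in, char *out, size_t outsz)
{
	size_t n = 0;

	if (outsz == 0)
		return 0;
	for (; *in; in++) {
		unsigned char ch = (unsigned char)*in;
		size_t need = ch < 0x80 ? 1 : 2;

		if (n + need >= outsz)
			break; /* keep one byte for the terminating NUL */
		if (ch < 0x80) {
			out[n++] = (char)ch;
		} else {
			/* U+0080..U+00FF encode as two bytes: 110xxxxx 10xxxxxx */
			out[n++] = (char)(0xC0 | (ch >> 6));
			out[n++] = (char)(0x80 | (ch & 0x3F));
		}
	}
	out[n] = '\0';
	return n;
}
```

A caller that converts in place therefore needs a buffer sized for the converted text, not just for the original line.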

mailinfo: assume input is latin-1 on the header as we do for the body When the input mbox does not identify what encoding it is in, and already have RFC2047 stripped away, we cannot tell what encoding the header text is in. For body text, when the message does not say what charset it is in, we fall back to assume latin-1 input when converting to utf8. This should be done consistently to the header as well. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-07-05 23:17:49 +02:00			`static int decode_header_bq(char *it)`
mailinfo and applymbox updates This attempts to minimally cope with a subset of MIME "features" often seen in patches sent to our mailing lists. Namely: - People's name spelled in characters outside ASCII (both on From: header and the signed-off-by line). - Content-transfer-encoding using quoted-printable (both in multipart and non-multipart messages). These MIME features are detected and decoded by "git mailinfo". Optionally, with the '-u' flag, the output to .info and .msg is transliterated from its original chaset to utf-8. This is to encourage people to use utf8 in their commit messages for interoperability. Applymbox accepts additional flag '-u' which is passed to mailinfo. Signed-off-by: Junio C Hamano / 濱野純 <junkio@cox.net> 2005-08-28 21:33:16 +02:00			`{`
			`char *in, *out, *ep, *cp, *sp;`
			`char outbuf[1000];`
mailinfo: assume input is latin-1 on the header as we do for the body When the input mbox does not identify what encoding it is in, and already have RFC2047 stripped away, we cannot tell what encoding the header text is in. For body text, when the message does not say what charset it is in, we fall back to assume latin-1 input when converting to utf8. This should be done consistently to the header as well. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-07-05 23:17:49 +02:00			`int rfc2047 = 0;`
mailinfo and applymbox updates This attempts to minimally cope with a subset of MIME "features" often seen in patches sent to our mailing lists. Namely: - People's name spelled in characters outside ASCII (both on From: header and the signed-off-by line). - Content-transfer-encoding using quoted-printable (both in multipart and non-multipart messages). These MIME features are detected and decoded by "git mailinfo". Optionally, with the '-u' flag, the output to .info and .msg is transliterated from its original chaset to utf-8. This is to encourage people to use utf8 in their commit messages for interoperability. Applymbox accepts additional flag '-u' which is passed to mailinfo. Signed-off-by: Junio C Hamano / 濱野純 <junkio@cox.net> 2005-08-28 21:33:16 +02:00
			`in = it;`
			`out = outbuf;`
			`while ((ep = strstr(in, "=?")) != NULL) {`
			`int sz, encoding;`
			`char charset_q[256], piecebuf[256];`
mailinfo: assume input is latin-1 on the header as we do for the body When the input mbox does not identify what encoding it is in, and already have RFC2047 stripped away, we cannot tell what encoding the header text is in. For body text, when the message does not say what charset it is in, we fall back to assume latin-1 input when converting to utf8. This should be done consistently to the header as well. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-07-05 23:17:49 +02:00			`rfc2047 = 1;`

mailinfo and applymbox updates This attempts to minimally cope with a subset of MIME "features" often seen in patches sent to our mailing lists. Namely: - People's name spelled in characters outside ASCII (both on From: header and the signed-off-by line). - Content-transfer-encoding using quoted-printable (both in multipart and non-multipart messages). These MIME features are detected and decoded by "git mailinfo". Optionally, with the '-u' flag, the output to .info and .msg is transliterated from its original chaset to utf-8. This is to encourage people to use utf8 in their commit messages for interoperability. Applymbox accepts additional flag '-u' which is passed to mailinfo. Signed-off-by: Junio C Hamano / 濱野純 <junkio@cox.net> 2005-08-28 21:33:16 +02:00			`if (in != ep) {`
			`sz = ep - in;`
			`memcpy(out, in, sz);`
			`out += sz;`
			`in += sz;`
			`}`
			`/* E.g.`
			`* ep : "=?iso-2022-jp?B?GyR...?= foo"`
			`* ep : "=?ISO-8859-1?Q?Foo=FCbar?= baz"`
			`*/`
			`ep += 2;`
			`cp = strchr(ep, '?');`
			`if (!cp)`
mailinfo: assume input is latin-1 on the header as we do for the body When the input mbox does not identify what encoding it is in, and already have RFC2047 stripped away, we cannot tell what encoding the header text is in. For body text, when the message does not say what charset it is in, we fall back to assume latin-1 input when converting to utf8. This should be done consistently to the header as well. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-07-05 23:17:49 +02:00			`return rfc2047; /* no munging */`
mailinfo and applymbox updates This attempts to minimally cope with a subset of MIME "features" often seen in patches sent to our mailing lists. Namely: - People's name spelled in characters outside ASCII (both on From: header and the signed-off-by line). - Content-transfer-encoding using quoted-printable (both in multipart and non-multipart messages). These MIME features are detected and decoded by "git mailinfo". Optionally, with the '-u' flag, the output to .info and .msg is transliterated from its original chaset to utf-8. This is to encourage people to use utf8 in their commit messages for interoperability. Applymbox accepts additional flag '-u' which is passed to mailinfo. Signed-off-by: Junio C Hamano / 濱野純 <junkio@cox.net> 2005-08-28 21:33:16 +02:00			`for (sp = ep; sp < cp; sp++)`
			`charset_q[sp - ep] = tolower(*sp);`
			`charset_q[cp - ep] = 0;`
			`encoding = cp[1];`
			`if (!encoding || cp[2] != '?')`
mailinfo: assume input is latin-1 on the header as we do for the body When the input mbox does not identify what encoding it is in, and already have RFC2047 stripped away, we cannot tell what encoding the header text is in. For body text, when the message does not say what charset it is in, we fall back to assume latin-1 input when converting to utf8. This should be done consistently to the header as well. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-07-05 23:17:49 +02:00			`return rfc2047; /* no munging */`
mailinfo and applymbox updates This attempts to minimally cope with a subset of MIME "features" often seen in patches sent to our mailing lists. Namely: - People's name spelled in characters outside ASCII (both on From: header and the signed-off-by line). - Content-transfer-encoding using quoted-printable (both in multipart and non-multipart messages). These MIME features are detected and decoded by "git mailinfo". Optionally, with the '-u' flag, the output to .info and .msg is transliterated from its original chaset to utf-8. This is to encourage people to use utf8 in their commit messages for interoperability. Applymbox accepts additional flag '-u' which is passed to mailinfo. Signed-off-by: Junio C Hamano / 濱野純 <junkio@cox.net> 2005-08-28 21:33:16 +02:00			`ep = strstr(cp + 3, "?=");`
			`if (!ep)`
			`return rfc2047; /* no munging */`
			`switch (tolower(encoding)) {`
			`default:`
			`return rfc2047; /* no munging */`
			`case 'b':`
			`sz = decode_b_segment(cp + 3, piecebuf, ep);`
			`break;`
			`case 'q':`
mailinfo: decode underscore used in "Q" encoding properly. Quoted-Printable (RFC 2045) and the "Q" encoding (RFC 2047) are subtly different; the latter is used on the mail header and an underscore needs to be decoded to 0x20. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-21 09:06:58 +02:00			`sz = decode_q_segment(cp + 3, piecebuf, ep, 1);`
			`break;`
			`}`
			`if (sz < 0)`
			`return rfc2047;`
mailinfo: allow -u to fall back on latin1 to utf8 conversion. When the message body does not identify what encoding it is in, -u assumes it is in latin-1 and converts it to utf8, which is the recommended encoding for git commit log messages. With -u=<encoding>, the conversion is made into the specified one, instead of utf8, to allow project-local policies. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-11-28 01:22:16 +01:00			`if (metainfo_charset)`
			`convert_to_utf8(piecebuf, charset_q);`
			`strcpy(out, piecebuf);`
			`out += strlen(out);`
			`in = ep + 2;`
			`}`
			`strcpy(out, in);`
			`strcpy(it, outbuf);`
			`return rfc2047;`
			`}`
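
The loop above walks RFC 2047 encoded-words of the form `=?charset?encoding?text?=`. As a minimal sketch of just the splitting step, assuming a hypothetical `struct encoded_word` and helper `find_encoded_word()` that are not part of mailinfo, the same checks (single-character encoding, closing `?=`) could be written on their own like this. Unlike the code above, the sketch also rejects a charset name that would not fit its buffer.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical type, not from mailinfo: the pieces of one encoded-word. */
struct encoded_word {
	char charset[64];   /* lowercased charset name */
	char encoding;      /* 'b' or 'q', lowercased */
	const char *text;   /* start of the still-encoded text */
	size_t text_len;    /* length of the still-encoded text */
	const char *rest;   /* first byte after the closing "?=" */
};

/*
 * Locate the first "=?charset?enc?text?=" in `in` and fill `ew`.
 * Returns 1 on success, 0 if no well-formed encoded-word is found.
 */
static int find_encoded_word(const char *in, struct encoded_word *ew)
{
	const char *ep, *cp, *end;
	size_t i, len;

	assert(in != NULL && ew != NULL);

	ep = strstr(in, "=?");
	if (!ep)
		return 0;
	ep += 2;                        /* start of the charset name */
	cp = strchr(ep, '?');
	if (!cp)
		return 0;
	len = (size_t)(cp - ep);
	if (len >= sizeof(ew->charset))
		return 0;               /* refuse to overflow the buffer */
	for (i = 0; i < len; i++)
		ew->charset[i] = (char)tolower((unsigned char)ep[i]);
	ew->charset[len] = '\0';

	if (!cp[1] || cp[2] != '?')
		return 0;               /* encoding must be one character */
	ew->encoding = (char)tolower((unsigned char)cp[1]);
	if (ew->encoding != 'b' && ew->encoding != 'q')
		return 0;

	end = strstr(cp + 3, "?=");
	if (!end)
		return 0;
	ew->text = cp + 3;
	ew->text_len = (size_t)(end - (cp + 3));
	ew->rest = end + 2;
	return 1;
}
```

A caller would decode `text` according to `encoding`, append the result to its output, and continue scanning from `rest`.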

			`static void decode_header(char *it)`
			`{`

			`if (decode_header_bq(it))`
			`return;`
			`/* otherwise "it" is a straight copy of the input.`
			`* This can be binary guck but there is no charset specified.`
			`*/`
			`if (metainfo_charset)`
			`convert_to_utf8(it, "");`
			`}`

			`static void decode_transfer_encoding(char *line)`
			`{`
			`char *ep;`

			`switch (transfer_encoding) {`
			`case TE_QP:`
			`ep = line + strlen(line);`
			`decode_q_segment(line, line, ep, 0);`
			`break;`
			`case TE_BASE64:`
			`ep = line + strlen(line);`
			`decode_b_segment(line, line, ep);`
			`break;`
			`case TE_DONTCARE:`
			`break;`
			`}`
			`}`
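
The last argument to `decode_q_segment()` is 1 for headers and 0 for bodies: the RFC 2047 "Q" encoding decodes an underscore to a space, while body quoted-printable leaves it alone. The following is an independent sketch of that difference, not the mailinfo implementation. The names `hexval()` and `qp_decode()` and the handling of soft line breaks and malformed escapes are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Value of one hex digit, or -1 if `c` is not a hex digit. */
static int hexval(int c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	if (c >= 'A' && c <= 'F')
		return c - 'A' + 10;
	return -1;
}

/*
 * Decode quoted-printable text from `in` into `out`, which must hold at
 * least strlen(in) + 1 bytes.  When `rfc2047` is non-zero, '_' decodes
 * to a space (header "Q" encoding); otherwise it is copied unchanged.
 * Returns the number of bytes written, excluding the trailing NUL.
 */
static size_t qp_decode(const char *in, char *out, int rfc2047)
{
	char *start = out;

	assert(in != NULL && out != NULL);

	while (*in) {
		int c = (unsigned char)*in++;

		if (c == '=') {
			int hi, lo;

			if (*in == '\n') {      /* soft line break: drop it */
				in++;
				continue;
			}
			hi = hexval((unsigned char)in[0]);
			lo = hi < 0 ? -1 : hexval((unsigned char)in[1]);
			if (hi >= 0 && lo >= 0) {
				*out++ = (char)(hi * 16 + lo);
				in += 2;
				continue;
			}
			/* malformed escape: fall through, keep '=' literally */
		} else if (rfc2047 && c == '_') {
			c = ' ';
		}
		*out++ = (char)c;
	}
	*out = '\0';
	return (size_t)(out - start);
}
```

With this sketch, `Hello_World=21` decodes to `Hello World!` in header mode and to `Hello_World!` in body mode.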

builtin-mailinfo.c infrastrcture changes I am working on a project that required parsing through regular mboxes that didn't necessarily have patches embedded in them. I started by creating my own modified copy of git-am and working from there. Very quickly, I noticed git-mailinfo wasn't able to handle a big chunk of my email. After hacking up numerous solutions and running into more limitations, I decided it was just easier to rewrite a big chunk of it. The following patch has a bunch of fixes and features that I needed in order for me do what I wanted. Note: I'm didn't follow any email rfc papers but I don't think any of the changes I did required much knowledge (besides the boundary stuff). List of major changes/fixes: - can't create empty patch files fix - empty patch files don't fail, this failure will come inside git-am - multipart boundaries are now handled - only output inbody headers if a patch exists otherwise assume those headers are part of the reply and instead output the original headers - decode and filter base64 patches correctly - various other accidental fixes I believe I didn't break any existing functionality or compatibility (other than what I describe above, which is really only the empty patch file). I tested this through various mailing list archives and everything seemed to parse correctly (a couple thousand emails). [jc: squashed in another patch from Don's five patch series to fix the test case, as this patch exposes the bug in the test.] Signed-off-by: Don Zickus <dzickus@redhat.com> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-12 20:52:04 +01:00			`static int handle_filter(char *line);`

			`static int find_boundary(void)`
			`{`
			`while(fgets(line, sizeof(line), fin) != NULL) {`
			`if (is_multipart_boundary(line))`
			`return 1;`
			`}`
			`return 0;`
			`}`

			`static int handle_boundary(void)`
			`{`
git-mailinfo fixes for patch munging Don't translate the patch to UTF-8, instead preserve the data as is. This also reverts a test case that was included in the original patch series. Also allow overwriting the authorship and title information we gather from RFC2822 mail headers with additional in-body headers, which was pointed out by Linus. Signed-off-by: Don Zickus <dzickus@redhat.com> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-30 18:18:45 +02:00			`char newline[]="\n";`
			`again:`
			`if (!memcmp(line+content_top->boundary_len, "--", 2)) {`
			`/* we hit an end boundary */`
			`/* pop the current boundary off the stack */`
			`free(content_top->boundary);`

			`/* technically won't happen as is_multipart_boundary()`
			`will fail first. But just in case..`
			`*/`
			`if (content_top-- < content) {`
			`fprintf(stderr, "Detected mismatched boundaries, "`
			`"can't recover\n");`
			`exit(1);`
			`}`
			`handle_filter(newline);`

			`/* skip to the next boundary */`
			`if (!find_boundary())`
			`return 0;`
			`goto again;`
			`}`

			`/* set some defaults */`
			`transfer_encoding = TE_DONTCARE;`
			`charset[0] = 0;`
			`message_type = TYPE_TEXT;`

			`/* slurp in this section's info */`
			`while (read_one_header_line(line, sizeof(line), fin))`
			`check_header(line, p_hdr_data, 0);`

			`/* eat the blank line after section info */`
			`return (fgets(line, sizeof(line), fin) != NULL);`
			`}`
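
`handle_boundary()` tells an end boundary from a part boundary by looking for `--` immediately after the boundary string. A minimal sketch of that test in isolation follows. The enum and `classify_boundary()` are hypothetical names, and the sketch assumes the delimiter is passed in with its leading `--` already included rather than taken from the `content_top` stack.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum boundary_kind { NOT_BOUNDARY, PART_BOUNDARY, END_BOUNDARY };

/*
 * Classify `line` against `delim`, the full MIME delimiter including its
 * leading "--" (for example "--XYZ").  A delimiter followed directly by
 * another "--" closes the multipart; otherwise it starts the next part.
 */
static enum boundary_kind classify_boundary(const char *line, const char *delim)
{
	size_t len;

	assert(line != NULL && delim != NULL);

	len = strlen(delim);
	if (strncmp(line, delim, len))
		return NOT_BOUNDARY;
	if (!strncmp(line + len, "--", 2))
		return END_BOUNDARY;
	return PART_BOUNDARY;
}
```

On `END_BOUNDARY` the function above pops one level off its boundary stack and skips ahead to the next boundary; on `PART_BOUNDARY` it resets the per-part defaults and reads that part's headers.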

restrict the patch filtering I have come across many emails that use long strings of '-'s as separators for ideas. This patch below limits the separator to only 3 '-', with the intent that long string of '-'s will stay in the commit msg and not in the patch file. Signed-off-by: Don Zickus <dzickus@redhat.com> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-12 20:52:06 +01:00			`static inline int patchbreak(const char *line)`
			`{`
			`/* Beginning of a "diff -" header? */`
			`if (!memcmp("diff -", line, 6))`
			`return 1;`

			`/* CVS "Index: " line? */`
			`if (!memcmp("Index: ", line, 7))`
			`return 1;`

			`/*`
			`* "--- <filename>" starts patches without headers`
			`* "---<sp>*" is a manual separator`
			`*/`
			`if (!memcmp("---", line, 3)) {`
			`line += 3;`
			`/* space followed by a filename? */`
			`if (line[0] == ' ' && !isspace(line[1]))`
			`return 1;`
			`/* Just whitespace? */`
			`for (;;) {`
			`unsigned char c = *line++;`
			`if (c == '\n')`
			`return 1;`
			`if (!isspace(c))`
			`break;`
			`}`
			`return 0;`
			`}`
			`return 0;`
			`}`
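
To make the rules in `patchbreak()` concrete, here is a standalone restatement that can be compiled and exercised on its own. It is a sketch under two assumptions: the name `is_patch_break()` is hypothetical, and `strncmp()` replaces `memcmp()` so that short NUL-terminated test strings are never read past their end.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Returns 1 if `line` marks where the commit message ends and the patch begins. */
static int is_patch_break(const char *line)
{
	assert(line != NULL);

	/* Beginning of a "diff -" header? */
	if (!strncmp(line, "diff -", 6))
		return 1;

	/* CVS "Index: " line? */
	if (!strncmp(line, "Index: ", 7))
		return 1;

	if (!strncmp(line, "---", 3)) {
		line += 3;
		/* "--- <filename>": a patch without a diff header */
		if (line[0] == ' ' && !isspace((unsigned char)line[1]))
			return 1;
		/* "---" followed only by whitespace: a manual separator */
		for (;;) {
			unsigned char c = (unsigned char)*line++;
			if (c == '\n')
				return 1;
			if (!isspace(c))
				break;
		}
	}
	return 0;
}
```

Note that a long run of dashes such as `----------` is not a break, so it stays in the commit message, which is the behaviour the "restrict the patch filtering" commit describes.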


			`static int handle_commit_msg(char *line)`
			`{`
			`static int still_looking = 1;`

			`if (!cmitmsg)`
			`return 0;`

			`if (still_looking) {`
			`char *cp = line;`
			`if (isspace(*line)) {`
			`for (cp = line + 1; *cp; cp++) {`
			`if (!isspace(*cp))`
			`break;`
			`}`
			`if (!*cp)`
			`return 0;`
			`}`
git-mailinfo fixes for patch munging Don't translate the patch to UTF-8, instead preserve the data as is. This also reverts a test case that was included in the original patch series. Also allow overwriting the authorship and title information we gather from RFC2822 mail headers with additional in-body headers, which was pointed out by Linus. Signed-off-by: Don Zickus <dzickus@redhat.com> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-30 18:18:45 +02:00			`if ((still_looking = check_header(cp, s_hdr_data, 0)) != 0)`
			`return 0;`
			`}`
Refactor commit messge handling. - Move handle_info into main so it is called once after everything has been parsed. This allows the removal of a static variable and removes two duplicate calls. - Move parsing of inbody headers into handle_commit. This means we parse the in-body headers after we have decoded the character set, and it removes code duplication between handle_multipart_one_part and handle_body. - Change the flag indicating that we have seen an in body prefix header into another bit in seen. This is a little more general and allows the possibility of parsing in body headers after the body message has begun. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-23 21:47:28 +02:00
			`/* normalize the log message to UTF-8. */`
			`if (metainfo_charset)`
			`convert_to_utf8(line, charset);`

restrict the patch filtering I have come across many emails that use long strings of '-'s as separators for ideas. This patch below limits the separator to only 3 '-', with the intent that long string of '-'s will stay in the commit msg and not in the patch file. Signed-off-by: Don Zickus <dzickus@redhat.com> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-12 20:52:06 +01:00			`if (patchbreak(line)) {`
			`fclose(cmitmsg);`
			`cmitmsg = NULL;`
			`return 1;`
			`}`

			`fputs(line, cmitmsg);`
mailinfo and applymbox updates This attempts to minimally cope with a subset of MIME "features" often seen in patches sent to our mailing lists. Namely: - People's name spelled in characters outside ASCII (both on From: header and the signed-off-by line). - Content-transfer-encoding using quoted-printable (both in multipart and non-multipart messages). These MIME features are detected and decoded by "git mailinfo". Optionally, with the '-u' flag, the output to .info and .msg is transliterated from its original chaset to utf-8. This is to encourage people to use utf8 in their commit messages for interoperability. Applymbox accepts additional flag '-u' which is passed to mailinfo. Signed-off-by: Junio C Hamano / 濱野純 <junkio@cox.net> 2005-08-28 21:33:16 +02:00			`return 0;`
			`}`
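The "restrict the patch filtering" commit message above describes treating only a run of exactly three '-' as the separator, so that longer dash rules stay in the commit message. A minimal sketch of that rule follows. The helper name `is_three_dash_separator` is hypothetical and this is not git's actual `patchbreak()`, which is not shown in this section and may accept other patterns as well.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/*
 * Hypothetical helper, not git's patchbreak(): returns 1 when the line
 * is exactly three dashes followed only by whitespace, 0 otherwise.
 * A longer run of dashes is rejected so it can stay in the commit message.
 */
static int is_three_dash_separator(const char *line)
{
	const char *cp;

	assert(line != NULL);
	if (strncmp(line, "---", 3) != 0)
		return 0;
	for (cp = line + 3; *cp; cp++) {
		/* cast avoids undefined behaviour for negative char values */
		if (!isspace((unsigned char)*cp))
			return 0;
	}
	return 1;
}
```

Under these assumptions a line such as `---\n` ends the commit message, while `----------\n` is kept as ordinary message text.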

			`static int handle_patch(char *line)`
			`{`
			`fputs(line, patchfile);`
			`patch_lines++;`
			`return 0;`
			`}`

			`static int handle_filter(char *line)`
			`{`
			`static int filter = 0;`

			`/* filter records which stage we left off on;`
			`* a handler returning non-zero means it hit the end of its stage (a filter point)`
			`*/`
			`switch (filter) {`
			`case 0:`
			`if (!handle_commit_msg(line))`
			`break;`
			`filter++;`
			`/* fall through */`
			`case 1:`
			`if (!handle_patch(line))`
			`break;`
			`filter++;`
			`/* fall through */`
			`default:`
			`return 1;`
			`}`

			`return 0;`
			`}`
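The switch in `handle_filter()` relies on deliberate fallthrough: when one stage reports that it is finished, the same line is handed straight to the next stage. The sketch below isolates that pattern with stand-in handlers. All names here (`feed_line`, `take_message`, `take_patch`, the counters) are hypothetical and only illustrate the control flow, not the real file handling.

```c
#include <assert.h>
#include <string.h>

enum { STAGE_MESSAGE, STAGE_PATCH };

static int message_lines;	/* lines consumed by the message stage */
static int patch_lines_seen;	/* lines consumed by the patch stage */

/* Hypothetical stage handler: non-zero means the line belongs to the next stage. */
static int take_message(const char *line)
{
	if (strncmp(line, "---", 3) == 0)
		return 1;
	message_lines++;
	return 0;
}

/* Hypothetical stage handler: accepts every line, so it never returns non-zero. */
static int take_patch(const char *line)
{
	(void)line;
	patch_lines_seen++;
	return 0;
}

/* Returns non-zero only once every stage has reported that it is finished. */
static int feed_line(const char *line)
{
	static int stage = STAGE_MESSAGE;

	assert(line != NULL);
	switch (stage) {
	case STAGE_MESSAGE:
		if (!take_message(line))
			break;
		stage++;
		/* the same line is now offered to the patch stage */
		/* fall through */
	case STAGE_PATCH:
		if (!take_patch(line))
			break;
		stage++;
		/* fall through */
	default:
		return 1;
	}
	return 0;
}
```

Because the stage counter is static, the function remembers where it left off between calls, which is why the separator line itself ends up in the patch stage rather than being dropped.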

			`static void handle_body(void)`
[PATCH] mailinfo: handle folded header. Some people split their long E-mail address over two lines using the RFC2822 header "folding". We can lose authorship information this way, so make a minimum effort to deal with it, instead of special casing only the "Subject:" field. We could teach mailsplit to unfold the folded header, but teaching mailinfo about folding would make more sense; a single message can be fed to mailinfo without going through mailsplit. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-07-23 11:10:31 +02:00			`{`
			`int rc = 0;`
			`static char newline[2000];`
			`static char *np = newline;`

			`/* Skip up to the first boundary */`
			`if (content_top->boundary) {`
			`if (!find_boundary())`
			`return;`
			`}`

			`do {`
			`/* process any boundary lines */`
			`if (content_top->boundary && is_multipart_boundary(line)) {`
			`/* flush any leftover */`
			`if ((transfer_encoding == TE_BASE64) &&`
			`(np != newline)) {`
			`handle_filter(newline);`
			`}`
			`if (!handle_boundary())`
			`return;`
			`}`

git-mailinfo fixes for patch munging Don't translate the patch to UTF-8, instead preserve the data as is. This also reverts a test case that was included in the original patch series. Also allow overwriting the authorship and title information we gather from RFC2822 mail headers with additional in-body headers, which was pointed out by Linus. Signed-off-by: Don Zickus <dzickus@redhat.com> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-30 18:18:45 +02:00			`/* Unwrap transfer encoding */`
			`decode_transfer_encoding(line);`

			`switch (transfer_encoding) {`
			`case TE_BASE64:`
			`{`
			`char *op = line;`

			`/* binary data most likely doesn't have newlines */`
			`if (message_type != TYPE_TEXT) {`
			`rc = handle_filter(line);`
			`break;`
			`}`

			`/* this is a decoded line that may contain`
			`* multiple new lines. Pass only one chunk`
			`* at a time to handle_filter()`
			`*/`

			`do {`
			`while (*op != '\n' && *op != 0)`
			`*np++ = *op++;`
			`*np = *op;`
			`if (*np != 0) {`
			`/* should be sitting on a new line */`
			`*(++np) = 0;`
			`op++;`
			`rc = handle_filter(newline);`
			`np = newline;`
			`}`
			`} while (*op != 0);`
			`/* the partial chunk is saved in newline and`
			`* will be appended by the next iteration of fgets`
			`*/`
mailinfo and applymbox updates This attempts to minimally cope with a subset of MIME "features" often seen in patches sent to our mailing lists. Namely: - People's name spelled in characters outside ASCII (both on From: header and the signed-off-by line). - Content-transfer-encoding using quoted-printable (both in multipart and non-multipart messages). These MIME features are detected and decoded by "git mailinfo". Optionally, with the '-u' flag, the output to .info and .msg is transliterated from its original chaset to utf-8. This is to encourage people to use utf8 in their commit messages for interoperability. Applymbox accepts additional flag '-u' which is passed to mailinfo. Signed-off-by: Junio C Hamano / 濱野純 <junkio@cox.net> 2005-08-28 21:33:16 +02:00			`break;`
[PATCH] mailinfo: handle folded header. Some people split their long E-mail address over two lines using the RFC2822 header "folding". We can lose authorship information this way, so make a minimum effort to deal with it, instead of special casing only the "Subject:" field. We could teach mailsplit to unfold the folded header, but teaching mailinfo about folding would make more sense; a single message can be fed to mailinfo without going through mailsplit. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-07-23 11:10:31 +02:00			`}`
			`default:`
			`rc = handle_filter(line);`
			`}`
			`if (rc)`
			`/* nothing left to filter */`
			`break;`
			`} while (fgets(line, sizeof(line), fin));`

			`return;`
			`}`
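
The base64 branch of the loop above splits a decoded block into newline-terminated chunks and keeps any unfinished tail for the next read. A minimal standalone sketch of that idea follows. The names `line_carry`, `feed_decoded`, `sink`, `collect_line` and `CHUNK_MAX` are hypothetical and not part of builtin-mailinfo.c; `collect_line` only stands in for `handle_filter()`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CHUNK_MAX 1000	/* hypothetical limit for one output line */

/* Hypothetical carry-over state: an unfinished line from earlier input. */
struct line_carry {
	char buf[CHUNK_MAX];
	size_t len;
};

/* Hypothetical callback type; plays the role of handle_filter(). */
typedef void (*line_fn)(const char *line, void *ctx);

/* Example sink that concatenates every emitted line. */
struct sink {
	char text[256];
};

static void collect_line(const char *line, void *ctx)
{
	struct sink *s = ctx;
	size_t room = sizeof(s->text) - strlen(s->text) - 1;

	strncat(s->text, line, room);
}

/*
 * Feed one decoded, NUL-terminated block. Each complete line, including
 * its '\n', is passed to emit(); a trailing partial line stays in carry
 * until a later call completes it. Returns the number of lines emitted.
 */
static int feed_decoded(struct line_carry *carry, const char *decoded,
			line_fn emit, void *ctx)
{
	int emitted = 0;

	assert(carry && decoded && emit);
	for (const char *op = decoded; *op; op++) {
		if (*op != '\n') {
			/* keep two bytes free for '\n' and NUL; extra bytes are dropped */
			if (carry->len < CHUNK_MAX - 2)
				carry->buf[carry->len++] = *op;
			continue;
		}
		carry->buf[carry->len++] = '\n';
		carry->buf[carry->len] = '\0';
		emit(carry->buf, ctx);
		carry->len = 0;
		emitted++;
	}
	return emitted;
}
```

Calling `feed_decoded()` once per decoded input line, with the same `line_carry`, means a line that straddles two base64 input lines still reaches the callback in one piece.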

			`static void handle_info(void)`
			`{`
			`char *sub;`
			`char *hdr;`
			`int i;`

			`for (i = 0; header[i]; i++) {`

			`/* only print inbody headers if we output a patch file */`
			`if (patch_lines && s_hdr_data[i])`
			`hdr = s_hdr_data[i];`
			`else if (p_hdr_data[i])`
			`hdr = p_hdr_data[i];`
			`else`
			`continue;`

			`if (!memcmp(header[i], "Subject", 7)) {`
			`sub = cleanup_subject(hdr);`
			`cleanup_space(sub);`
			`fprintf(fout, "Subject: %s\n", sub);`
			`} else if (!memcmp(header[i], "From", 4)) {`
			`handle_from(hdr);`
			`fprintf(fout, "Author: %s\n", name);`
			`fprintf(fout, "Email: %s\n", email);`
			`} else {`
			`cleanup_space(hdr);`
			`fprintf(fout, "%s: %s\n", header[i], hdr);`
			`}`
			`}`
			`fprintf(fout, "\n");`
			`}`
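
The precedence rule in `handle_info()` is: an in-body header is used only when a patch was written, otherwise the mail header is used, and a header present in neither is skipped. A small sketch of just that rule, under the assumption that both tables are indexed the same way; `pick_header`, `inbody`, `mail` and `have_patch` are hypothetical names standing in for the `s_hdr_data`, `p_hdr_data` and `patch_lines` logic above.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Pick the value to report for header slot i.
 * inbody and mail stand in for s_hdr_data and p_hdr_data;
 * have_patch stands in for a non-zero patch_lines count.
 * Returns NULL when neither source has the header, which the
 * caller would treat as "skip this header".
 */
static const char *pick_header(char *const *inbody, char *const *mail,
			       size_t i, int have_patch)
{
	assert(inbody && mail);
	if (have_patch && inbody[i])
		return inbody[i];
	return mail[i];
}
```

Without a patch, in-body header lines are treated as part of a reply and the original mail headers are reported instead.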

More missing static Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-06-08 11:22:56 +02:00			`static int mailinfo(FILE *in, FILE *out, int ks, const char *encoding,`
			`const char *msg, const char *patch)`
			`{`
			`keep_subject = ks;`
			`metainfo_charset = encoding;`
			`fin = in;`
			`fout = out;`

			`cmitmsg = fopen(msg, "w");`
			`if (!cmitmsg) {`
			`perror(msg);`
			`return -1;`
			`}`
			`patchfile = fopen(patch, "w");`
			`if (!patchfile) {`
			`perror(patch);`
			`fclose(cmitmsg);`
			`return -1;`
			`}`
			`p_hdr_data = xcalloc(MAX_HDR_PARSED, sizeof(char *));`
			`s_hdr_data = xcalloc(MAX_HDR_PARSED, sizeof(char *));`

			`/* process the email header */`
			`while (read_one_header_line(line, sizeof(line), fin))`
			`check_header(line, p_hdr_data, 1);`
			`handle_body();`
			`handle_info();`
			`return 0;`
			`}`

Teach applymbox to keep the Subject: line. This corresponds to the -k flag to git format-patch --mbox option. The option should probably not be used when applying a real e-mail patch, but is needed when format-patch and applymbox pair is used for cherrypicking. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-08-17 07:18:27 +02:00			`static const char mailinfo_usage[] =`
mailinfo: Do not use -u=<encoding>; say --encoding=<encoding> Specifying the value for a single letter, single dash option parameter with equal sign looked funny, and more importantly calling the flag to override encoding from utf-8 to something else "-u" (obviously abbreviated from "utf-8") did not make any sense. So spell it out. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-11-28 10:29:52 +01:00			`"git-mailinfo [-k] [-u | --encoding=<encoding>] msg patch <mail >info";`
Call setup_git_directory() much earlier This changes the calling convention of built-in commands and passes the "prefix" (i.e. pathname of $PWD relative to the project root level) down to them. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-07-29 07:44:25 +02:00			`int cmd_mailinfo(int argc, const char **argv, const char *prefix)`
			`{`
-u is now default for 'git-mailinfo'. Originally from David Woodhouse, but also adjusts the callers of mailinfo to the new default. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-10 06:31:36 +01:00			`const char *def_charset;`

			`/* NEEDSWORK: might want to do the optional .git/ directory`
			`* discovery`
			`*/`
			`git_config(git_default_config);`

			`def_charset = (git_commit_encoding ? git_commit_encoding : "utf-8");`
			`metainfo_charset = def_charset;`

			`while (1 < argc && argv[1][0] == '-') {`
			`if (!strcmp(argv[1], "-k"))`
			`keep_subject = 1;`
			`else if (!strcmp(argv[1], "-u"))`
			`metainfo_charset = def_charset;`
			`else if (!strcmp(argv[1], "-n"))`
			`metainfo_charset = NULL;`
Mechanical conversion to use prefixcmp() This mechanically converts strncmp() to use prefixcmp(), but only when the parameters match specific patterns, so that they can be verified easily. Leftover from this will be fixed in a separate step, including idiotic conversions like if (!strncmp("foo", arg, 3)) => if (!(-prefixcmp(arg, "foo"))) This was done by using this script in px.perl #!/usr/bin/perl -i.bak -p if (/strncmp\(([^,]+), "([^\\"]*)", (\d+)\)/ && (length($2) == $3)) { s|strncmp\(([^,]+), "([^\\"]*)", (\d+)\)|prefixcmp($1, "$2")|; } if (/strncmp\("([^\\"]*)", ([^,]+), (\d+)\)/ && (length($1) == $3)) { s|strncmp\("([^\\"]*)", ([^,]+), (\d+)\)|(-prefixcmp($2, "$1"))|; } and running: $ git grep -l strncmp -- '*.c' | xargs perl px.perl Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-20 10:53:29 +01:00			`else if (!prefixcmp(argv[1], "--encoding="))`
			`metainfo_charset = argv[1] + 11;`
			`else`
			`usage(mailinfo_usage);`
			`argc--; argv++;`
			`}`

Avoid doing the "filelist" thing, since "git-apply" picks up the files automatically ..and git-apply does a lot better job at it anyway. Also, we break the comment/diff on a line that starts with "diff -", not just on the "---" line. Especially for git diffs, we actually want that line in the diff. (We should probably also break on "Index: ..." followed by "=====") 2005-06-23 18:40:23 +02:00			`if (argc != 3)`
			`usage(mailinfo_usage);`
			`return !!mailinfo(stdin, stdout, keep_subject, metainfo_charset, argv[1], argv[2]);`
			`}`
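
The option loop in `cmd_mailinfo()` can be read in isolation as: consume leading `-` arguments, match `--encoding=` by prefix, then require exactly two positional arguments. The sketch below restates that under stated assumptions. `mi_opts` and `parse_mi_opts` are hypothetical names, `strncmp()` stands in for git's `prefixcmp()`, and errors are returned rather than reported through `usage()`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result type for this sketch. */
struct mi_opts {
	int keep_subject;
	const char *charset;	/* NULL means "do not re-encode" */
	int first_arg;		/* index of the first non-option argument */
};

/* Returns 0 on success, -1 on an unknown option or wrong argument count. */
static int parse_mi_opts(int argc, const char **argv,
			 const char *def_charset, struct mi_opts *o)
{
	static const char enc[] = "--encoding=";
	int i = 1;

	assert(argv && def_charset && o);
	o->keep_subject = 0;
	o->charset = def_charset;

	for (; i < argc && argv[i][0] == '-'; i++) {
		if (!strcmp(argv[i], "-k"))
			o->keep_subject = 1;
		else if (!strcmp(argv[i], "-u"))
			o->charset = def_charset;
		else if (!strcmp(argv[i], "-n"))
			o->charset = NULL;
		else if (!strncmp(argv[i], enc, sizeof(enc) - 1))
			/* value starts right after the 11-byte prefix */
			o->charset = argv[i] + sizeof(enc) - 1;
		else
			return -1;
	}
	o->first_arg = i;
	/* exactly two positional arguments must remain: msg and patch */
	return (argc - i == 2) ? 0 : -1;
}
```

Using `sizeof(enc) - 1` ties the offset to the prefix string, so the two cannot drift apart the way a literal `11` could.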