mirrors/git - Incest Forge: Beyond sex. We incest.

mirrors/git

mirror of https://github.com/git/git.git synced 2024-11-14 21:23:03 +01:00

2502 lines

62 KiB

C

Raw Normal View History

Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`/*`
			`Format of STDIN stream:`

			`stream ::= cmd*;`

			`cmd ::= new_blob`
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00			`\| new_commit`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`\| new_tag`
Added 'reset' command to clear a branch's tree. Sometimes an import frontend may need to work with a temporary branch which will actually contain many different branches over the life of the import. This is especially useful when the frontend needs to create a tag from a set of file versions which are otherwise never a commit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-27 12:20:49 +02:00			`\| reset_branch`
Include checkpoint command in the BNF. This command isn't encouraged (as its slow) but it does exist and is accepted, so it still should be covered in the BNF. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-05 22:05:11 +01:00			`\| checkpoint`
Allow frontends to bidirectionally communicate with fast-import The existing checkpoint command is very useful to force fast-import to dump the branches out to disk so that standard Git tools can access them and the objects they refer to. However there was not a way to know when fast-import had finished executing the checkpoint and it was safe to read those refs. The progress command can be used to make fast-import output any message of the frontend's choosing to standard out. The frontend can scan for these messages using select() or poll() to monitor a pipe connected to the standard output of fast-import. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-08-01 16:23:08 +02:00			`\| progress`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`;`

Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00			`new_blob ::= 'blob' lf`
Fix whitespace in "Format of STDIN stream" of fast-import Something probably assumed that HT indentation is 4 characters. Signed-off-by: Alex Riesen <raa.lkml@gmail.com> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-07-15 10:57:40 +02:00			`mark?`
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00			`file_content;`
			`file_content ::= data;`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00			`new_commit ::= 'commit' sp ref_str lf`
Remove branch creation command from fast-import. Jon Smirl was finding it difficult to alter cvs2svn to generate branch commands prior to the first commit of the same branch. This change moves the 'from' command to be an optional parameter of the 'commit' command, thereby allowing a new branch to be defined at the moment it gets used to create the first commit on that branch. This change makes it impossible to create a branch with no commits on it as at least one commit is needed to register the branch. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-25 00:45:26 +02:00			`mark?`
Support RFC 2822 date parsing in fast-import. Since some frontends may be working with source material where the dates are only readily available as RFC 2822 strings, it is more friendly if fast-import exposes Git's parse_date() function to handle the conversion. This way the frontend doesn't need to perform the parsing itself. The new --date-format option to fast-import can be used by a frontend to select which format it will supply date strings in. The default is the standard `raw` Git format, which fast-import has always supported. Format rfc2822 can be used to activate the parse_date() function instead. Because fast-import could also be useful for creating new, current commits, the format `now` is also supported to generate the current system timestamp. The implementation of `now` is a trivial call to datestamp(), but is actually a whole whopping 3 lines so that fast-import can verify the frontend really meant `now`. As part of this change I have added validation of the `raw` date format. Prior to this change fast-import would accept anything in a `committer` command, even if it was seriously malformed. Now fast-import requires the '> ' near the end of the string and verifies the timestamp is formatted properly. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-06 20:58:30 +01:00			`('author' sp name '<' email '>' when lf)?`
			`'committer' sp name '<' email '>' when lf`
Remove branch creation command from fast-import. Jon Smirl was finding it difficult to alter cvs2svn to generate branch commands prior to the first commit of the same branch. This change moves the 'from' command to be an optional parameter of the 'commit' command, thereby allowing a new branch to be defined at the moment it gets used to create the first commit on that branch. This change makes it impossible to create a branch with no commits on it as at least one commit is needed to register the branch. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-25 00:45:26 +02:00			`commit_msg`
Moved from command to after data to help cvs2svn. cvs2svn has three phases: begin_commit, middle_commit, end_commit. The ancester is computed in the middle_commit phase. So its easier to generate a stream if the from command appears after the commit message itself but before the file change commands. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-25 04:38:13 +02:00			`('from' sp (ref_str \| hexsha1 \| sha1exp_str \| idnum) lf)?`
Support creation of merge commits in fast-import. Some importers are able to determine when branch merges occurred within their source data. In these cases they will want to supply the correct commits to fast-import so that a proper merge commit will exist in Git. This is now supported by supplying a 'merge ' command after the commit message and optional from command. A merge is not actually performed by fast-import, its assumed that the frontend performed any sort of merging activity already and that fast-import should simply be storing its result. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-12 04:21:38 +01:00			`('merge' sp (ref_str \| hexsha1 \| sha1exp_str \| idnum) lf)*`
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00			`file_change*`
Make trailing LF optional for all fast-import commands For the same reasons as the prior change we want to allow frontends to omit the trailing LF that usually delimits commands. In some cases these just make the input stream more verbose looking than it needs to be, and its just simpler for the frontend developer to get started if our parser is slightly more lenient about where an LF is required and where it isn't. To make this optional LF feature work we now have to buffer up to one line of input in command_buf. This buffering can happen if we look at the current input command but don't recognize it at this point in the code. In such a case we need to "unget" the entire line, but we cannot depend upon the stdio library to let us do ungetc() for that many characters at once. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-08-01 08:22:53 +02:00			`lf?;`
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00			`commit_msg ::= data;`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00
Teach fast-import to recursively copy files/directories Some source material (e.g. Subversion dump files) perform directory renames by telling us the directory was copied, then deleted in the same revision. This makes it difficult for a frontend to convert such data formats to a fast-import stream, as all the frontend has on hand is "Copy a/ to b/; Delete a/" with no details about what files are in a/, unless the frontend also kept track of all files. The new 'C' subcommand within a commit allows the frontend to make a recursive copy of one path to another path within the branch, without needing to keep track of the individual file paths. The metadata copy is performed in memory efficiently, but is implemented as a copy-immediately operation, rather than copy-on-write. With this new 'C' subcommand frontends could obviously implement an 'R' (rename) on their own as a combination of 'C' and 'D' (delete), but since we have already offered up 'R' in the past and it is a trivial thing to keep implemented I'm not going to deprecate it. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-07-15 07:40:37 +02:00			`file_change ::= file_clr`
			`\| file_del`
			`\| file_rnm`
			`\| file_cpy`
			`\| file_obm`
			`\| file_inm;`
Teach fast-import how to clear the internal branch content. Some frontends may not be able to (easily) keep track of which files are included in the branch, and which aren't. Performing this tracking can be tedious and error prone for the frontend to do, especially if its foreign data source cannot supply the changed path list on a per-commit basis. fast-import now allows a frontend to request that a branch's tree be wiped clean (reset to the empty tree) at the start of a commit, allowing the frontend to feed in all paths which belong on the branch. This is ideal for a tar-file importer frontend, for example, as the frontend just needs to reformat the tar data stream into a gfi data stream, which may be something a few Perl regexps can take care of. :) Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-07 08:03:03 +01:00			`file_clr ::= 'deleteall' lf;`
Accept 'inline' file data in fast-import commit structure. Its very annoying to need to specify the file content ahead of a commit and use marks to connect the individual blobs to the commit's file modification entry, especially if the frontend can't/won't generate the blob SHA1s itself. Instead it would much easier to use if we can accept the blob data at the same time as we receive each file_change line. Now fast-import accepts 'inline' instead of a mark idnum or blob SHA1 within the 'M' type file_change command. If an inline is detected the very next line must be a 'data n' command, supplying the file data. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-18 21:17:58 +01:00			`file_del ::= 'D' sp path_str lf;`
Support wholesale directory renames in fast-import Some source material (e.g. Subversion dump files) perform directory renames without telling us exactly which files in that subdirectory were moved. This makes it hard for a frontend to convert such data formats to a fast-import stream, as all the frontend has on hand is "Rename a/ to b/" with no details about what files are in a/, unless the frontend also kept track of all files. The new 'R' subcommand within a commit allows the frontend to rename either a file or an entire subdirectory, without needing to know the object's SHA-1 or the specific files contained within it. The rename is performed as efficiently as possible internally, making it cheaper than a 'D'/'M' pair for a file rename. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-07-10 04:58:23 +02:00			`file_rnm ::= 'R' sp path_str sp path_str lf;`
Teach fast-import to recursively copy files/directories Some source material (e.g. Subversion dump files) perform directory renames by telling us the directory was copied, then deleted in the same revision. This makes it difficult for a frontend to convert such data formats to a fast-import stream, as all the frontend has on hand is "Copy a/ to b/; Delete a/" with no details about what files are in a/, unless the frontend also kept track of all files. The new 'C' subcommand within a commit allows the frontend to make a recursive copy of one path to another path within the branch, without needing to keep track of the individual file paths. The metadata copy is performed in memory efficiently, but is implemented as a copy-immediately operation, rather than copy-on-write. With this new 'C' subcommand frontends could obviously implement an 'R' (rename) on their own as a combination of 'C' and 'D' (delete), but since we have already offered up 'R' in the past and it is a trivial thing to keep implemented I'm not going to deprecate it. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-07-15 07:40:37 +02:00			`file_cpy ::= 'C' sp path_str sp path_str lf;`
Accept 'inline' file data in fast-import commit structure. Its very annoying to need to specify the file content ahead of a commit and use marks to connect the individual blobs to the commit's file modification entry, especially if the frontend can't/won't generate the blob SHA1s itself. Instead it would much easier to use if we can accept the blob data at the same time as we receive each file_change line. Now fast-import accepts 'inline' instead of a mark idnum or blob SHA1 within the 'M' type file_change command. If an inline is detected the very next line must be a 'data n' command, supplying the file data. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-18 21:17:58 +01:00			`file_obm ::= 'M' sp mode sp (hexsha1 \| idnum) sp path_str lf;`
			`file_inm ::= 'M' sp mode sp 'inline' sp path_str lf`
			`data;`
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00
			`new_tag ::= 'tag' sp tag_str lf`
			`'from' sp (ref_str \| hexsha1 \| sha1exp_str \| idnum) lf`
Fix whitespace in "Format of STDIN stream" of fast-import Something probably assumed that HT indentation is 4 characters. Signed-off-by: Alex Riesen <raa.lkml@gmail.com> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-07-15 10:57:40 +02:00			`'tagger' sp name '<' email '>' when lf`
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00			`tag_msg;`
			`tag_msg ::= data;`

Allow creating branches without committing in fast-import. Some importers may want to create a branch long before they actually commit to it, or in some cases they may never commit to the branch but they still need the ref to be created in the repository after the import is complete. This extends the 'reset ' command to automatically create a new branch if the supplied reference isn't already known as a branch. While I'm at it I also modified the syntax of the reset command to terminate with an empty line, like commit and tag operate. This just makes the command set more consistent. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-12 04:28:39 +01:00			`reset_branch ::= 'reset' sp ref_str lf`
			`('from' sp (ref_str \| hexsha1 \| sha1exp_str \| idnum) lf)?`
Make trailing LF optional for all fast-import commands For the same reasons as the prior change we want to allow frontends to omit the trailing LF that usually delimits commands. In some cases these just make the input stream more verbose looking than it needs to be, and its just simpler for the frontend developer to get started if our parser is slightly more lenient about where an LF is required and where it isn't. To make this optional LF feature work we now have to buffer up to one line of input in command_buf. This buffering can happen if we look at the current input command but don't recognize it at this point in the code. In such a case we need to "unget" the entire line, but we cannot depend upon the stdio library to let us do ungetc() for that many characters at once. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-08-01 08:22:53 +02:00			`lf?;`
Added 'reset' command to clear a branch's tree. Sometimes an import frontend may need to work with a temporary branch which will actually contain many different branches over the life of the import. This is especially useful when the frontend needs to create a tag from a set of file versions which are otherwise never a commit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-27 12:20:49 +02:00
Implemented manual packfile switching in fast-import. To help importers which are dealing with massive amounts of data fast-import needs to be able to close the packfile it is currently writing to and open a new packfile for any additional data that will be received. A new 'checkpoint' command has been introduced which can be used by the frontend import process to force this to occur at any time. This may be useful to ensure a very long running import doesn't lose any work due to unexpected failures. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 12:35:41 +01:00			`checkpoint ::= 'checkpoint' lf`
Make trailing LF optional for all fast-import commands For the same reasons as the prior change we want to allow frontends to omit the trailing LF that usually delimits commands. In some cases these just make the input stream more verbose looking than it needs to be, and its just simpler for the frontend developer to get started if our parser is slightly more lenient about where an LF is required and where it isn't. To make this optional LF feature work we now have to buffer up to one line of input in command_buf. This buffering can happen if we look at the current input command but don't recognize it at this point in the code. In such a case we need to "unget" the entire line, but we cannot depend upon the stdio library to let us do ungetc() for that many characters at once. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-08-01 08:22:53 +02:00			`lf?;`
Implemented manual packfile switching in fast-import. To help importers which are dealing with massive amounts of data fast-import needs to be able to close the packfile it is currently writing to and open a new packfile for any additional data that will be received. A new 'checkpoint' command has been introduced which can be used by the frontend import process to force this to occur at any time. This may be useful to ensure a very long running import doesn't lose any work due to unexpected failures. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 12:35:41 +01:00
Allow frontends to bidirectionally communicate with fast-import The existing checkpoint command is very useful to force fast-import to dump the branches out to disk so that standard Git tools can access them and the objects they refer to. However there was not a way to know when fast-import had finished executing the checkpoint and it was safe to read those refs. The progress command can be used to make fast-import output any message of the frontend's choosing to standard out. The frontend can scan for these messages using select() or poll() to monitor a pipe connected to the standard output of fast-import. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-08-01 16:23:08 +02:00			`progress ::= 'progress' sp not_lf* lf`
			`lf?;`

Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00			`# note: the first idnum in a stream should be 1 and subsequent`
			`# idnums should not have gaps between values as this will cause`
			`# the stream parser to reserve space for the gapped values. An`
Fix whitespace in "Format of STDIN stream" of fast-import Something probably assumed that HT indentation is 4 characters. Signed-off-by: Alex Riesen <raa.lkml@gmail.com> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-07-15 10:57:40 +02:00			`# idnum can be updated in the future to a new object by issuing`
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00			`# a new mark directive with the old idnum.`
Fix whitespace in "Format of STDIN stream" of fast-import Something probably assumed that HT indentation is 4 characters. Signed-off-by: Alex Riesen <raa.lkml@gmail.com> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-07-15 10:57:40 +02:00			`#`
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00			`mark ::= 'mark' sp idnum lf;`
Support delimited data regions in fast-import. During testing its nice to not have to feed the length of a data chunk to the 'data' command of fast-import. Instead we would prefer to be able to establish a data chunk much like shell's << operator and use a line delimiter to denote the end of the input. So now if a data command is started as 'data <<EOF' we will look for a terminator line containing only the string EOF on that line. Once found, we stop the data command. Everything between the two lines is used as the data value. The 'data <<' syntax is slower than 'data n', as we don't know how many bytes to expect and instead must grow our buffer on the fly. It also has the problem that the frontend must use a string which will not appear on a line by itself in the input, and the data region will always end in an LF. For these reasons real import frontends are encouraged to continue to use _only_ 'data n'. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-18 19:14:27 +01:00			`data ::= (delimited_data \| exact_data)`
Make trailing LF following fast-import `data` commands optional A few fast-import frontend developers have found it odd that we require the LF following a `data` command, especially in the exact byte count format. Technically we don't need this LF to parse the stream properly, but having it here does make the stream more readable to humans. We can easily make the LF optional by peeking at the next byte available from the stream and pushing it back into the buffer if its not LF. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-08-01 06:24:25 +02:00			`lf?;`
Support delimited data regions in fast-import. During testing its nice to not have to feed the length of a data chunk to the 'data' command of fast-import. Instead we would prefer to be able to establish a data chunk much like shell's << operator and use a line delimiter to denote the end of the input. So now if a data command is started as 'data <<EOF' we will look for a terminator line containing only the string EOF on that line. Once found, we stop the data command. Everything between the two lines is used as the data value. The 'data <<' syntax is slower than 'data n', as we don't know how many bytes to expect and instead must grow our buffer on the fly. It also has the problem that the frontend must use a string which will not appear on a line by itself in the input, and the data region will always end in an LF. For these reasons real import frontends are encouraged to continue to use _only_ 'data n'. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-18 19:14:27 +01:00
			`# note: delim may be any string but must not contain lf.`
			`# data_line may contain any data but must not be exactly`
			`# delim.`
			`delimited_data ::= 'data' sp '<<' delim lf`
			`(data_line lf)*`
Fix whitespace in "Format of STDIN stream" of fast-import Something probably assumed that HT indentation is 4 characters. Signed-off-by: Alex Riesen <raa.lkml@gmail.com> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-07-15 10:57:40 +02:00			`delim lf;`
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00
			`# note: declen indicates the length of binary_data in bytes.`
Support delimited data regions in fast-import. During testing its nice to not have to feed the length of a data chunk to the 'data' command of fast-import. Instead we would prefer to be able to establish a data chunk much like shell's << operator and use a line delimiter to denote the end of the input. So now if a data command is started as 'data <<EOF' we will look for a terminator line containing only the string EOF on that line. Once found, we stop the data command. Everything between the two lines is used as the data value. The 'data <<' syntax is slower than 'data n', as we don't know how many bytes to expect and instead must grow our buffer on the fly. It also has the problem that the frontend must use a string which will not appear on a line by itself in the input, and the data region will always end in an LF. For these reasons real import frontends are encouraged to continue to use _only_ 'data n'. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-18 19:14:27 +01:00			`# declen does not include the lf preceeding the binary data.`
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00			`#`
Support delimited data regions in fast-import. During testing its nice to not have to feed the length of a data chunk to the 'data' command of fast-import. Instead we would prefer to be able to establish a data chunk much like shell's << operator and use a line delimiter to denote the end of the input. So now if a data command is started as 'data <<EOF' we will look for a terminator line containing only the string EOF on that line. Once found, we stop the data command. Everything between the two lines is used as the data value. The 'data <<' syntax is slower than 'data n', as we don't know how many bytes to expect and instead must grow our buffer on the fly. It also has the problem that the frontend must use a string which will not appear on a line by itself in the input, and the data region will always end in an LF. For these reasons real import frontends are encouraged to continue to use _only_ 'data n'. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-18 19:14:27 +01:00			`exact_data ::= 'data' sp declen lf`
			`binary_data;`
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00
			`# note: quoted strings are C-style quoting supporting \c for`
			`# common escapes of 'c' (e..g \n, \t, \\, \") or \nnn where nnn`
Fix whitespace in "Format of STDIN stream" of fast-import Something probably assumed that HT indentation is 4 characters. Signed-off-by: Alex Riesen <raa.lkml@gmail.com> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-07-15 10:57:40 +02:00			`# is the signed byte value in octal. Note that the only`
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00			`# characters which must actually be escaped to protect the`
			`# stream formatting is: \, " and LF. Otherwise these values`
Fix whitespace in "Format of STDIN stream" of fast-import Something probably assumed that HT indentation is 4 characters. Signed-off-by: Alex Riesen <raa.lkml@gmail.com> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-07-15 10:57:40 +02:00			`# are UTF8.`
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00			`#`
Don't support shell-quoted refnames in fast-import. The current implementation of shell-style quoted refnames and SHA-1 expressions within fast-import contains a bad memory leak. We leak the unquoted strings used by the `from` and `merge` commands, maybe others. Its also just muddling up the docs. Since Git refnames cannot contain LF, and that is our delimiter for the end of the refname, and we accept any other character as-is, there is no reason for these strings to support quoting, except to be nice to frontends. But frontends shouldn't be expecting to use funny refs in Git, and its just as simple to never quote them as it is to always pass them through the same quoting filter as pathnames. So frontends should never quote refs, or ref expressions. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-06 02:30:37 +01:00			`ref_str ::= ref;`
			`sha1exp_str ::= sha1exp;`
			`tag_str ::= tag;`
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00			`path_str ::= path \| '"' quoted(path) '"' ;`
Accept 'inline' file data in fast-import commit structure. Its very annoying to need to specify the file content ahead of a commit and use marks to connect the individual blobs to the commit's file modification entry, especially if the frontend can't/won't generate the blob SHA1s itself. Instead it would much easier to use if we can accept the blob data at the same time as we receive each file_change line. Now fast-import accepts 'inline' instead of a mark idnum or blob SHA1 within the 'M' type file_change command. If an inline is detected the very next line must be a 'data n' command, supplying the file data. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-18 21:17:58 +01:00			`mode ::= '100644' \| '644'`
			`\| '100755' \| '755'`
S_IFLNK != 0140000 Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-06 21:46:11 +01:00			`\| '120000'`
Accept 'inline' file data in fast-import commit structure. Its very annoying to need to specify the file content ahead of a commit and use marks to connect the individual blobs to the commit's file modification entry, especially if the frontend can't/won't generate the blob SHA1s itself. Instead it would much easier to use if we can accept the blob data at the same time as we receive each file_change line. Now fast-import accepts 'inline' instead of a mark idnum or blob SHA1 within the 'M' type file_change command. If an inline is detected the very next line must be a 'data n' command, supplying the file data. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-18 21:17:58 +01:00			`;`
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00
			`declen ::= # unsigned 32 bit value, ascii base10 notation;`
Corrected BNF input documentation for fast-import. Now that fast-import uses uintmax_t (the largest available unsigned integer type) for marks we don't want to say its an unsigned 32 bit integer in ASCII base 10 notation. It could be much larger, especially on 64 bit systems, and especially if a frontend uses a very large number of marks (1 per file revision on a very, very large import). Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-17 06:33:18 +01:00			`bigint ::= # unsigned integer value, ascii base10 notation;`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`binary_data ::= # file content, not interpreted;`
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00
Support RFC 2822 date parsing in fast-import. Since some frontends may be working with source material where the dates are only readily available as RFC 2822 strings, it is more friendly if fast-import exposes Git's parse_date() function to handle the conversion. This way the frontend doesn't need to perform the parsing itself. The new --date-format option to fast-import can be used by a frontend to select which format it will supply date strings in. The default is the standard `raw` Git format, which fast-import has always supported. Format rfc2822 can be used to activate the parse_date() function instead. Because fast-import could also be useful for creating new, current commits, the format `now` is also supported to generate the current system timestamp. The implementation of `now` is a trivial call to datestamp(), but is actually a whole whopping 3 lines so that fast-import can verify the frontend really meant `now`. As part of this change I have added validation of the `raw` date format. Prior to this change fast-import would accept anything in a `committer` command, even if it was seriously malformed. Now fast-import requires the '> ' near the end of the string and verifies the timestamp is formatted properly. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-06 20:58:30 +01:00			`when ::= raw_when \| rfc2822_when;`
			`raw_when ::= ts sp tz;`
			`rfc2822_when ::= # Valid RFC 2822 date and time;`

Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`sp ::= # ASCII space character;`
			`lf ::= # ASCII newline (LF) character;`
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00
			`# note: a colon (':') must precede the numerical value assigned to`
Fix whitespace in "Format of STDIN stream" of fast-import Something probably assumed that HT indentation is 4 characters. Signed-off-by: Alex Riesen <raa.lkml@gmail.com> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-07-15 10:57:40 +02:00			`# an idnum. This is to distinguish it from a ref or tag name as`
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00			`# GIT does not permit ':' in ref or tag strings.`
Fix whitespace in "Format of STDIN stream" of fast-import Something probably assumed that HT indentation is 4 characters. Signed-off-by: Alex Riesen <raa.lkml@gmail.com> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-07-15 10:57:40 +02:00			`#`
Corrected BNF input documentation for fast-import. Now that fast-import uses uintmax_t (the largest available unsigned integer type) for marks we don't want to say its an unsigned 32 bit integer in ASCII base 10 notation. It could be much larger, especially on 64 bit systems, and especially if a frontend uses a very large number of marks (1 per file revision on a very, very large import). Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-17 06:33:18 +01:00			`idnum ::= ':' bigint;`
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00			`path ::= # GIT style file path, e.g. "a/b/c";`
			`ref ::= # GIT ref name, e.g. "refs/heads/MOZ_GECKO_EXPERIMENT";`
			`tag ::= # GIT tag name, e.g. "FIREFOX_1_5";`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`sha1exp ::= # Any valid GIT SHA1 expression;`
			`hexsha1 ::= # SHA1 in hexadecimal format;`
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00
			`# note: name and email are UTF8 strings, however name must not`
Fix whitespace in "Format of STDIN stream" of fast-import Something probably assumed that HT indentation is 4 characters. Signed-off-by: Alex Riesen <raa.lkml@gmail.com> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-07-15 10:57:40 +02:00			`# contain '<' or lf and email must not contain any of the`
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00			`# following: '<', '>', lf.`
Fix whitespace in "Format of STDIN stream" of fast-import Something probably assumed that HT indentation is 4 characters. Signed-off-by: Alex Riesen <raa.lkml@gmail.com> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-07-15 10:57:40 +02:00			`#`
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00			`name ::= # valid GIT author/committer name;`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`email ::= # valid GIT author/committer email;`
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00			`ts ::= # time since the epoch in seconds, ascii base10 notation;`
			`tz ::= # GIT style timezone;`
Teach fast-import to ignore lines starting with '#' Several frontend developers have asked that some form of stream comments be permitted within a fast-import data stream. This way they can include information from their own frontend program about where specific data was taken from in the source system, or about a decision that their frontend may have made while creating the fast-import data stream. This change introduces comments in the Bourne-shell/Tcl/Perl style. Lines starting with '#' are ignored, up to and including the LF. Unlike the above mentioned three languages however we do not look for and ignore leading whitespace. This just simplifies the definition of the comment format and the code that parses them. To make comments work we had to stop using read_next_command() within cmd_data() and directly invoke read_line() during the inline variant of the function. This is necessary to retain any lines of the input data that might otherwise look like a comment to fast-import. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-08-01 06:05:15 +02:00
			`# note: comments may appear anywhere in the input, except`
			`# within a data command. Any form of the data command`
			`# always escapes the related input from comment processing.`
			`#`
			`# In case it is not clear, the '#' that starts the comment`
			`# must be the first character on that the line (an lf have`
			`# preceeded it).`
			`#`
			`comment ::= '#' not_lf* lf;`
			`not_lf ::= # Any byte that is not ASCII newline (LF);`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`*/`

Created fast-import, a tool to quickly generating a pack from blobs. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-05 08:04:21 +02:00			`#include "builtin.h"`
			`#include "cache.h"`
			`#include "object.h"`
			`#include "blob.h"`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`#include "tree.h"`
Don't do non-fastforward updates in fast-import. If fast-import is being used to update an existing branch of a repository, the user may not want to lose commits if another process updates the same ref at the same time. For example, the user might be using fast-import to make just one or two commits against a live branch. We now perform a fast-forward check during the ref updating process. If updating a branch would cause commits in that branch to be lost, we skip over it and display the new SHA1 to standard error. This new default behavior can be overridden with `--force`, like git-push and git-fetch. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-06 22:08:06 +01:00			`#include "commit.h"`
Created fast-import, a tool to quickly generating a pack from blobs. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-05 08:04:21 +02:00			`#include "delta.h"`
			`#include "pack.h"`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`#include "refs.h"`
Created fast-import, a tool to quickly generating a pack from blobs. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-05 08:04:21 +02:00			`#include "csum-file.h"`
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00			`#include "quote.h"`
Created fast-import, a tool to quickly generating a pack from blobs. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-05 08:04:21 +02:00
Correct packfile edge output in fast-import. Branches are only contained by a packfile if the branch actually had its most recent commit in that packfile. So new branches are set to MAX_PACK_ID to ensure they don't cause their commit to list as part of the first packfile when it closes out if the commit was actually in existance before fast-import started. Also corrected the type of last_commit to be umaxint_t to prevent overflow and wraparound on very large imports. Though that is highly unlikely to occur as we're talking 4 billion commits, which no real project has right now. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-17 08:42:43 +01:00			`#define PACK_ID_BITS 16`
			`#define MAX_PACK_ID ((1<<PACK_ID_BITS)-1)`
Don't allow fast-import tree delta chains to exceed maximum depth Brian Downing noticed fast-import can produce tree depths of up to 6,035 objects and even deeper. Long delta chains can create very small packfiles but cause problems during repacking as git needs to unpack each tree to count the reachable blobs. What's happening here is the active branch cache isn't big enough. We're swapping out the branch and thus recycling the tree information (struct tree_content) back into the free pool. When we later reload the tree we set the delta_depth to 0 but we kept the tree we just reloaded as a delta base. So if the tree we reloaded was already at the maximum depth we wouldn't know it and make the new tree a delta. Multiply the number of times the branch cache has to swap out the tree times max_depth (10) and you get the maximum delta depth of a tree created by fast-import. In Brian's case above the active branch cache had to swap the branch out 603/604 times during this import to produce a tree with a delta depth of 6035. Acked-by: Brian Downing <bdowning@lavos.net> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-14 05:48:42 +01:00			`#define DEPTH_BITS 13`
			`#define MAX_DEPTH ((1<<DEPTH_BITS)-1)`
Correct packfile edge output in fast-import. Branches are only contained by a packfile if the branch actually had its most recent commit in that packfile. So new branches are set to MAX_PACK_ID to ensure they don't cause their commit to list as part of the first packfile when it closes out if the commit was actually in existance before fast-import started. Also corrected the type of last_commit to be umaxint_t to prevent overflow and wraparound on very large imports. Though that is highly unlikely to occur as we're talking 4 billion commits, which no real project has right now. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-17 08:42:43 +01:00
Cleaned up memory allocation for object_entry structs. Although its easy to ask the user to tell us how many objects they will need, its probably better to dynamically grow the object table in large units. But if the user can give us a hint as to roughly how many objects then we can still use it during startup. Also stopped printing the SHA1 strings to stdout as no user is currently making use of that facility. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 06:03:59 +02:00			`struct object_entry`
			`{`
			`struct object_entry *next;`
Reduce memory usage of fast-import. Some structs are allocated rather frequently, but were using integer types which were far larger than required to actually store their full value range. As packfiles are limited to 4 GiB we don't need more than 32 bits to store the offset of an object within that packfile, an `unsigned long` on a 64 bit system is likely a 64 bit unsigned value. Saving 4 bytes per object on a 64 bit system can add up fast on any sizable import. As atom strings are strictly single components in a path name these are probably limited to just 255 bytes by the underlying OS. Going to that short of a string is probably too restrictive, but certainly `unsigned int` is far too large for their lengths. `unsigned short` is a reasonable limit. Modes within a tree really only need two bytes to store their whole value; using `unsigned int` here is vast overkill. Saving 4 bytes per file entry in an active branch can add up quickly on a project with a large number of files. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-05 22:34:56 +01:00			`uint32_t offset;`
Don't allow fast-import tree delta chains to exceed maximum depth Brian Downing noticed fast-import can produce tree depths of up to 6,035 objects and even deeper. Long delta chains can create very small packfiles but cause problems during repacking as git needs to unpack each tree to count the reachable blobs. What's happening here is the active branch cache isn't big enough. We're swapping out the branch and thus recycling the tree information (struct tree_content) back into the free pool. When we later reload the tree we set the delta_depth to 0 but we kept the tree we just reloaded as a delta base. So if the tree we reloaded was already at the maximum depth we wouldn't know it and make the new tree a delta. Multiply the number of times the branch cache has to swap out the tree times max_depth (10) and you get the maximum delta depth of a tree created by fast-import. In Brian's case above the active branch cache had to swap the branch out 603/604 times during this import to produce a tree with a delta depth of 6035. Acked-by: Brian Downing <bdowning@lavos.net> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-14 05:48:42 +01:00			`uint32_t type : TYPE_BITS,`
			`pack_id : PACK_ID_BITS,`
			`depth : DEPTH_BITS;`
Cleaned up memory allocation for object_entry structs. Although its easy to ask the user to tell us how many objects they will need, its probably better to dynamically grow the object table in large units. But if the user can give us a hint as to roughly how many objects then we can still use it during startup. Also stopped printing the SHA1 strings to stdout as no user is currently making use of that facility. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 06:03:59 +02:00			`unsigned char sha1[20];`
			`};`

Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`struct object_entry_pool`
Cleaned up memory allocation for object_entry structs. Although its easy to ask the user to tell us how many objects they will need, its probably better to dynamically grow the object table in large units. But if the user can give us a hint as to roughly how many objects then we can still use it during startup. Also stopped printing the SHA1 strings to stdout as no user is currently making use of that facility. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 06:03:59 +02:00			`{`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`struct object_entry_pool *next_pool;`
Cleaned up memory allocation for object_entry structs. Although its easy to ask the user to tell us how many objects they will need, its probably better to dynamically grow the object table in large units. But if the user can give us a hint as to roughly how many objects then we can still use it during startup. Also stopped printing the SHA1 strings to stdout as no user is currently making use of that facility. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 06:03:59 +02:00			`struct object_entry *next_free;`
			`struct object_entry *end;`
Refactored fast-import's internals for future additions. Too many globals variables were being used not not enough code was resuable to process trees and commits so this is a simple refactoring of the existing blob processing code to get into a state that will be easier to handle trees and commits in. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 06:46:13 +02:00			`struct object_entry entries[FLEX_ARRAY]; /* more */`
Cleaned up memory allocation for object_entry structs. Although its easy to ask the user to tell us how many objects they will need, its probably better to dynamically grow the object table in large units. But if the user can give us a hint as to roughly how many objects then we can still use it during startup. Also stopped printing the SHA1 strings to stdout as no user is currently making use of that facility. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 06:03:59 +02:00			`};`

Added mark store/find to fast-import. Marks are now saved when the mark directive gets used by the frontend and may be used in place of a SHA1 expression to locate a previous SHA1 which fast-import may have generated. This is particularly useful with commits where the frontend does not (easily) have the ability to compute the SHA1 for an arbitrary commit but needs it to generate a branch or tag from that commit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-23 10:17:45 +02:00			`struct mark_set`
			`{`
			`union {`
			`struct object_entry *marked[1024];`
			`struct mark_set *sets[1024];`
			`} data;`
Correct a few types to be unsigned in fast-import. The length of an atom string cannot be negative. So make it explicit and declare it as an unsigned value. The shift width in a mark table node also cannot be negative. I'm also moving it to after the pointer arrays to prevent any possible alignment problems on a 64 bit system. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-17 06:57:23 +01:00			`unsigned int shift;`
Added mark store/find to fast-import. Marks are now saved when the mark directive gets used by the frontend and may be used in place of a SHA1 expression to locate a previous SHA1 which fast-import may have generated. This is particularly useful with commits where the frontend does not (easily) have the ability to compute the SHA1 for an arbitrary commit but needs it to generate a branch or tag from that commit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-23 10:17:45 +02:00			`};`

Refactored fast-import's internals for future additions. Too many globals variables were being used not not enough code was resuable to process trees and commits so this is a simple refactoring of the existing blob processing code to get into a state that will be easier to handle trees and commits in. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 06:46:13 +02:00			`struct last_object`
			`{`
fast-import optimization: Now that cmd_data acts on a strbuf, make last_object stashed buffer be a strbuf as well. On new stash, don't free the last stashed buffer, rather swap it with the one you will stash, this way, callers of store_object can act on static strbufs, and at some point, fast-import won't allocate new memory for objects buffers. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-17 14:00:38 +02:00			`struct strbuf data;`
Reduce memory usage of fast-import. Some structs are allocated rather frequently, but were using integer types which were far larger than required to actually store their full value range. As packfiles are limited to 4 GiB we don't need more than 32 bits to store the offset of an object within that packfile, an `unsigned long` on a 64 bit system is likely a 64 bit unsigned value. Saving 4 bytes per object on a 64 bit system can add up fast on any sizable import. As atom strings are strictly single components in a path name these are probably limited to just 255 bytes by the underlying OS. Going to that short of a string is probably too restrictive, but certainly `unsigned int` is far too large for their lengths. `unsigned short` is a reasonable limit. Modes within a tree really only need two bytes to store their whole value; using `unsigned int` here is vast overkill. Saving 4 bytes per file entry in an active branch can add up quickly on a project with a large number of files. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-05 22:34:56 +01:00			`uint32_t offset;`
Implemented branch handling and basic tree support in fast-import. This provides the basic data structures needed to store trees in memory while we are processing them for a branch. What we are attempting to do is track one complete tree for each branch that the frontend has registered with us through the 'newb' (new_branch) command. When the frontend edits that tree through 'updf' or 'delf' commands we'll mark the affected tree(s) as being dirty and recompute their objects during 'comt' (commit). Currently the protocol is decidedly _not_ user friendly. I crashed fast-import by giving it bad input data from Perl. I may try to improve upon it, or at least upon its error handling. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 09:36:45 +02:00			`unsigned int depth;`
fast-import optimization: Now that cmd_data acts on a strbuf, make last_object stashed buffer be a strbuf as well. On new stash, don't free the last stashed buffer, rather swap it with the one you will stash, this way, callers of store_object can act on static strbufs, and at some point, fast-import won't allocate new memory for objects buffers. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-17 14:00:38 +02:00			`unsigned no_swap : 1;`
Refactored fast-import's internals for future additions. Too many globals variables were being used not not enough code was resuable to process trees and commits so this is a simple refactoring of the existing blob processing code to get into a state that will be easier to handle trees and commits in. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 06:46:13 +02:00			`};`

Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`struct mem_pool`
			`{`
			`struct mem_pool *next_pool;`
			`char *next_free;`
			`char *end;`
fast-import: fix unalinged allocation and access The specialized pool allocator fast-import uses aligned objects on the size of a pointer, which was not sufficient at least on Sparc. Instead, make the alignment for objects of type unitmax_t. Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-12-15 05:39:16 +01:00			`uintmax_t space[FLEX_ARRAY]; /* more */`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`};`

			`struct atom_str`
			`{`
			`struct atom_str *next_atom;`
Reduce memory usage of fast-import. Some structs are allocated rather frequently, but were using integer types which were far larger than required to actually store their full value range. As packfiles are limited to 4 GiB we don't need more than 32 bits to store the offset of an object within that packfile, an `unsigned long` on a 64 bit system is likely a 64 bit unsigned value. Saving 4 bytes per object on a 64 bit system can add up fast on any sizable import. As atom strings are strictly single components in a path name these are probably limited to just 255 bytes by the underlying OS. Going to that short of a string is probably too restrictive, but certainly `unsigned int` is far too large for their lengths. `unsigned short` is a reasonable limit. Modes within a tree really only need two bytes to store their whole value; using `unsigned int` here is vast overkill. Saving 4 bytes per file entry in an active branch can add up quickly on a project with a large number of files. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-05 22:34:56 +01:00			`unsigned short str_len;`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`char str_dat[FLEX_ARRAY]; /* more */`
			`};`

			`struct tree_content;`
Implemented branch handling and basic tree support in fast-import. This provides the basic data structures needed to store trees in memory while we are processing them for a branch. What we are attempting to do is track one complete tree for each branch that the frontend has registered with us through the 'newb' (new_branch) command. When the frontend edits that tree through 'updf' or 'delf' commands we'll mark the affected tree(s) as being dirty and recompute their objects during 'comt' (commit). Currently the protocol is decidedly _not_ user friendly. I crashed fast-import by giving it bad input data from Perl. I may try to improve upon it, or at least upon its error handling. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 09:36:45 +02:00			`struct tree_entry`
			`{`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`struct tree_content *tree;`
			`struct atom_str* name;`
Implemented tree delta compression in fast-import. We now store for every tree entry two modes and two sha1 values; the base (aka "version 0") and the current/new (aka "version 1"). When we generate a tree object we also regenerate the prior version object and use that as our base object for a delta. This strategy saves a significant amount of memory as we can continue to use the atom pool for file/directory names and only increases each tree entry by an additional 24 bytes of memory. Branches should automatically delta against their ancestor tree, unless the ancestor tree is already at the delta chain limit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-28 18:22:50 +02:00			`struct tree_entry_ms`
			`{`
Reduce memory usage of fast-import. Some structs are allocated rather frequently, but were using integer types which were far larger than required to actually store their full value range. As packfiles are limited to 4 GiB we don't need more than 32 bits to store the offset of an object within that packfile, an `unsigned long` on a 64 bit system is likely a 64 bit unsigned value. Saving 4 bytes per object on a 64 bit system can add up fast on any sizable import. As atom strings are strictly single components in a path name these are probably limited to just 255 bytes by the underlying OS. Going to that short of a string is probably too restrictive, but certainly `unsigned int` is far too large for their lengths. `unsigned short` is a reasonable limit. Modes within a tree really only need two bytes to store their whole value; using `unsigned int` here is vast overkill. Saving 4 bytes per file entry in an active branch can add up quickly on a project with a large number of files. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-05 22:34:56 +01:00			`uint16_t mode;`
Implemented tree delta compression in fast-import. We now store for every tree entry two modes and two sha1 values; the base (aka "version 0") and the current/new (aka "version 1"). When we generate a tree object we also regenerate the prior version object and use that as our base object for a delta. This strategy saves a significant amount of memory as we can continue to use the atom pool for file/directory names and only increases each tree entry by an additional 24 bytes of memory. Branches should automatically delta against their ancestor tree, unless the ancestor tree is already at the delta chain limit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-28 18:22:50 +02:00			`unsigned char sha1[20];`
			`} versions[2];`
Implemented branch handling and basic tree support in fast-import. This provides the basic data structures needed to store trees in memory while we are processing them for a branch. What we are attempting to do is track one complete tree for each branch that the frontend has registered with us through the 'newb' (new_branch) command. When the frontend edits that tree through 'updf' or 'delf' commands we'll mark the affected tree(s) as being dirty and recompute their objects during 'comt' (commit). Currently the protocol is decidedly _not_ user friendly. I crashed fast-import by giving it bad input data from Perl. I may try to improve upon it, or at least upon its error handling. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 09:36:45 +02:00			`};`

Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`struct tree_content`
Implemented branch handling and basic tree support in fast-import. This provides the basic data structures needed to store trees in memory while we are processing them for a branch. What we are attempting to do is track one complete tree for each branch that the frontend has registered with us through the 'newb' (new_branch) command. When the frontend edits that tree through 'updf' or 'delf' commands we'll mark the affected tree(s) as being dirty and recompute their objects during 'comt' (commit). Currently the protocol is decidedly _not_ user friendly. I crashed fast-import by giving it bad input data from Perl. I may try to improve upon it, or at least upon its error handling. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 09:36:45 +02:00			`{`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`unsigned int entry_capacity; /* must match avail_tree_content */`
			`unsigned int entry_count;`
Implemented tree delta compression in fast-import. We now store for every tree entry two modes and two sha1 values; the base (aka "version 0") and the current/new (aka "version 1"). When we generate a tree object we also regenerate the prior version object and use that as our base object for a delta. This strategy saves a significant amount of memory as we can continue to use the atom pool for file/directory names and only increases each tree entry by an additional 24 bytes of memory. Branches should automatically delta against their ancestor tree, unless the ancestor tree is already at the delta chain limit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-28 18:22:50 +02:00			`unsigned int delta_depth;`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`struct tree_entry entries[FLEX_ARRAY]; / more */`
			`};`

			`struct avail_tree_content`
			`{`
			`unsigned int entry_capacity; /* must match tree_content */`
			`struct avail_tree_content *next_avail;`
Implemented branch handling and basic tree support in fast-import. This provides the basic data structures needed to store trees in memory while we are processing them for a branch. What we are attempting to do is track one complete tree for each branch that the frontend has registered with us through the 'newb' (new_branch) command. When the frontend edits that tree through 'updf' or 'delf' commands we'll mark the affected tree(s) as being dirty and recompute their objects during 'comt' (commit). Currently the protocol is decidedly _not_ user friendly. I crashed fast-import by giving it bad input data from Perl. I may try to improve upon it, or at least upon its error handling. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 09:36:45 +02:00			`};`

			`struct branch`
			`{`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`struct branch *table_next_branch;`
			`struct branch *active_next_branch;`
Implemented branch handling and basic tree support in fast-import. This provides the basic data structures needed to store trees in memory while we are processing them for a branch. What we are attempting to do is track one complete tree for each branch that the frontend has registered with us through the 'newb' (new_branch) command. When the frontend edits that tree through 'updf' or 'delf' commands we'll mark the affected tree(s) as being dirty and recompute their objects during 'comt' (commit). Currently the protocol is decidedly _not_ user friendly. I crashed fast-import by giving it bad input data from Perl. I may try to improve upon it, or at least upon its error handling. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 09:36:45 +02:00			`const char *name;`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`struct tree_entry branch_tree;`
Correct packfile edge output in fast-import. Branches are only contained by a packfile if the branch actually had its most recent commit in that packfile. So new branches are set to MAX_PACK_ID to ensure they don't cause their commit to list as part of the first packfile when it closes out if the commit was actually in existance before fast-import started. Also corrected the type of last_commit to be umaxint_t to prevent overflow and wraparound on very large imports. Though that is highly unlikely to occur as we're talking 4 billion commits, which no real project has right now. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-17 08:42:43 +01:00			`uintmax_t last_commit;`
fast-import: Avoid infinite loop after reset Johannes Sixt noticed that a 'reset' command applied to a branch that is already active in the branch LRU cache can cause fast-import to relink the same branch into the LRU cache twice. This will cause the LRU cache to contain a cycle, making unload_one_branch run in an infinite loop as it tries to select the oldest branch for eviction. I have trivially fixed the problem by adding an active bit to each branch object; this bit indicates if the branch is already in the LRU and allows us to avoid trying to add it a second time. Converting the pack_id field into a bitfield makes this change take up no additional memory. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-03-05 18:31:09 +01:00			`unsigned active : 1;`
			`unsigned pack_id : PACK_ID_BITS;`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`unsigned char sha1[20];`
Implemented branch handling and basic tree support in fast-import. This provides the basic data structures needed to store trees in memory while we are processing them for a branch. What we are attempting to do is track one complete tree for each branch that the frontend has registered with us through the 'newb' (new_branch) command. When the frontend edits that tree through 'updf' or 'delf' commands we'll mark the affected tree(s) as being dirty and recompute their objects during 'comt' (commit). Currently the protocol is decidedly _not_ user friendly. I crashed fast-import by giving it bad input data from Perl. I may try to improve upon it, or at least upon its error handling. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 09:36:45 +02:00			`};`

Implemented 'tag' command in fast-import. Tags received from the frontend are generated in memory in a simple linked list in the order that the tag commands were sent by the frontend. If multiple different tag objects for the same tag name get generated the last one sent by the frontend will be the one that gets written out at termination. Multiple tag objects for the same name will cause all older tags of the same name to be lost. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-24 09:12:13 +02:00			`struct tag`
			`{`
			`struct tag *next_tag;`
			`const char *name;`
Print out the edge commits for each packfile in fast-import. To help callers repack very large repositories into a series of packfiles fast-import now outputs the last commits/tags it wrote to a packfile when it prints out the packfile name. This information can be feed to pack-objects --revs to repack. For the first pack of an initial import this is pretty easy (just feed those SHA1s on stdin) but for subsequent packs you want to feed the subsequent pack's final SHA1s but also all prior pack's SHA1s prefixed with the negation operator. This way the prior pack's data does not get included into the subsequent pack. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-16 22:18:44 +01:00			`unsigned int pack_id;`
Implemented 'tag' command in fast-import. Tags received from the frontend are generated in memory in a simple linked list in the order that the tag commands were sent by the frontend. If multiple different tag objects for the same tag name get generated the last one sent by the frontend will be the one that gets written out at termination. Multiple tag objects for the same name will cause all older tags of the same name to be lost. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-24 09:12:13 +02:00			`unsigned char sha1[20];`
			`};`

Support creation of merge commits in fast-import. Some importers are able to determine when branch merges occurred within their source data. In these cases they will want to supply the correct commits to fast-import so that a proper merge commit will exist in Git. This is now supported by supplying a 'merge ' command after the commit message and optional from command. A merge is not actually performed by fast-import, its assumed that the frontend performed any sort of merging activity already and that fast-import should simply be storing its result. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-12 04:21:38 +01:00			`struct hash_list`
			`{`
			`struct hash_list *next;`
			`unsigned char sha1[20];`
			`};`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00
Support RFC 2822 date parsing in fast-import. Since some frontends may be working with source material where the dates are only readily available as RFC 2822 strings, it is more friendly if fast-import exposes Git's parse_date() function to handle the conversion. This way the frontend doesn't need to perform the parsing itself. The new --date-format option to fast-import can be used by a frontend to select which format it will supply date strings in. The default is the standard `raw` Git format, which fast-import has always supported. Format rfc2822 can be used to activate the parse_date() function instead. Because fast-import could also be useful for creating new, current commits, the format `now` is also supported to generate the current system timestamp. The implementation of `now` is a trivial call to datestamp(), but is actually a whole whopping 3 lines so that fast-import can verify the frontend really meant `now`. As part of this change I have added validation of the `raw` date format. Prior to this change fast-import would accept anything in a `committer` command, even if it was seriously malformed. Now fast-import requires the '> ' near the end of the string and verifies the timestamp is formatted properly. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-06 20:58:30 +01:00			`typedef enum {`
			`WHENSPEC_RAW = 1,`
			`WHENSPEC_RFC2822,`
			`WHENSPEC_NOW,`
			`} whenspec_type;`

Include recent command history in fast-import crash reports When we crash the frontend developer (or end-user) may need to know roughly around what part of the input stream we had a problem with and aborted on. Because line numbers aren't very useful in this sort of application we instead just keep the last 100 commands in a FIFO queue and print them as part of the crash report. Currently one problem with this design is a commit that has more than 100 modified files in it will flood the FIFO and any context regarding branch/from/committer/mark/comments will be lost. We really should save only the last few (10?) file changes for the current commit, ensuring we have some prior higher level commands in the FIFO when we crash on a file M/D/C/R command. Another issue with this approach is the FIFO only includes the commands, it does not include the commit messages. Yet having a commit message may be useful to help locate the relevant change in the source material. In practice I don't think this is going to be a major concern as the frontend can always embed its own source change set identifier as a comment (which will appear in the crash report) and the commit message(s) for the most recent commits of any given branch should be obtainable from the (packed) commit objects. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-08-03 10:47:04 +02:00			`struct recent_command`
			`{`
			`struct recent_command *prev;`
			`struct recent_command *next;`
			`char *buf;`
			`};`

Use uintmax_t for marks in fast-import. If a frontend wants to use a mark per file revision and per commit and is doing a truly huge import (such as a 32 GiB SVN repository) we may need more than 2**32 unique mark values, especially if the frontend is unable (or unwilling) to recycle mark values. For mark idnums we should use the largest unsigned integer type available, hoping that will be at least 64 bits when we are compiled as a 64 bit executable. This way we may consume huge amounts of memory storing our mark table, but we'll at least be able to process the entire import without failing. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-16 06:33:19 +01:00			`/* Configured limits on output */`
Converted fast-import to accept standard command line parameters. The following command line options are now accepted before the pack name: --objects=n # replaces the object count after the pack name --depth=n # delta chain depth to use (default is 10) --active-branches=n # maximum number of branches to keep in memory Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-23 08:00:31 +02:00			`static unsigned long max_depth = 10;`
Use off_t in pack-objects/fast-import when we mean an offset Always use an off_t value in pack-objects anytime we are dealing with an offset to some data within a packfile. Also fixed a minor uintmax_t that was incorrectly defined before. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-07 02:44:34 +01:00			`static off_t max_packsize = (1LL << 32) - 1;`
Don't do non-fastforward updates in fast-import. If fast-import is being used to update an existing branch of a repository, the user may not want to lose commits if another process updates the same ref at the same time. For example, the user might be using fast-import to make just one or two commits against a live branch. We now perform a fast-forward check during the ref updating process. If updating a branch would cause commits in that branch to be lost, we skip over it and display the new SHA1 to standard error. This new default behavior can be overridden with `--force`, like git-push and git-fetch. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-06 22:08:06 +01:00			`static int force_update;`
Teach fast-import to honor pack.compression and pack.depth We now use the configured pack.compression and pack.depth values within fast-import, as like builtin-pack-objects fast-import is generating a packfile for consumption by the Git tools. We use the same behavior as builtin-pack-objects does for these options, allowing core.compression to supply the default value for pack.compression. The default setting for pack.depth within fast-import is still 10 as users will generally repack fast-import generated packfiles by `repack -f`. A large delta depth within the fast-import packfile can significantly slow down such a later repack. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-01-21 05:36:54 +01:00			`static int pack_compression_level = Z_DEFAULT_COMPRESSION;`
			`static int pack_compression_seen;`
Use uintmax_t for marks in fast-import. If a frontend wants to use a mark per file revision and per commit and is doing a truly huge import (such as a 32 GiB SVN repository) we may need more than 2**32 unique mark values, especially if the frontend is unable (or unwilling) to recycle mark values. For mark idnums we should use the largest unsigned integer type available, hoping that will be at least 64 bits when we are compiled as a 64 bit executable. This way we may consume huge amounts of memory storing our mark table, but we'll at least be able to process the entire import without failing. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-16 06:33:19 +01:00
			`/* Stats and misc. counters */`
			`static uintmax_t alloc_count;`
			`static uintmax_t marks_set_count;`
			`static uintmax_t object_count_by_type[1 << TYPE_BITS];`
			`static uintmax_t duplicate_count_by_type[1 << TYPE_BITS];`
			`static uintmax_t delta_count_by_type[1 << TYPE_BITS];`
Correct object_count type and stat output in fast-import. Since object_count is limited to 'unsigned long' (really an unsigned 32 bit integer value) by the pack file format we may as well use exactly that type here in fast-import for that counter. An earlier change by me incorrectly made it uintmax_t. But since object_count is a counter for the current packfile only, we don't want to output its value at the end. Instead we should sum up the individual type counters and report that total, as that will cover all of the packfiles. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-16 10:55:41 +01:00			`static unsigned long object_count;`
Implemented branch handling and basic tree support in fast-import. This provides the basic data structures needed to store trees in memory while we are processing them for a branch. What we are attempting to do is track one complete tree for each branch that the frontend has registered with us through the 'newb' (new_branch) command. When the frontend edits that tree through 'updf' or 'delf' commands we'll mark the affected tree(s) as being dirty and recompute their objects during 'comt' (commit). Currently the protocol is decidedly _not_ user friendly. I crashed fast-import by giving it bad input data from Perl. I may try to improve upon it, or at least upon its error handling. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 09:36:45 +02:00			`static unsigned long branch_count;`
Added branch load counter to fast-import. If the branch load count exceeds the number of branches created then the frontend is causing fast-import to page branches into and out of memory due to the way its ordering its commits. Performance can likely be increased if the frontend were to alter its commit sequence such that it stays on one branch before switching to another branch, then never returns to the prior branch. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-23 10:31:12 +02:00			`static unsigned long branch_load_count;`
Don't do non-fastforward updates in fast-import. If fast-import is being used to update an existing branch of a repository, the user may not want to lose commits if another process updates the same ref at the same time. For example, the user might be using fast-import to make just one or two commits against a live branch. We now perform a fast-forward check during the ref updating process. If updating a branch would cause commits in that branch to be lost, we skip over it and display the new SHA1 to standard error. This new default behavior can be overridden with `--force`, like git-push and git-fetch. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-06 22:08:06 +01:00			`static int failure;`
fast-import: Hide the pack boundary commits by default. Most users don't need the pack boundary information that fast-import was printing to standard output, especially if they were calling it with --quiet. Those users who do want this information probably want it captured so they can go back and use it to repack the imported repository. So dumping the boundary commits to a log file makes more sense then printing them to standard output. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-12 01:45:56 +01:00			`static FILE *pack_edges;`
Refactored fast-import's internals for future additions. Too many globals variables were being used not not enough code was resuable to process trees and commits so this is a simple refactoring of the existing blob processing code to get into a state that will be easier to handle trees and commits in. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 06:46:13 +02:00
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`/* Memory pools */`
			`static size_t mem_pool_alloc = 210241024 - sizeof(struct mem_pool);`
			`static size_t total_allocd;`
			`static struct mem_pool *mem_pool;`

Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00			`/* Atom management */`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`static unsigned int atom_table_sz = 4451;`
			`static unsigned int atom_cnt;`
			`static struct atom_str **atom_table;`

			`/* The .pack file being generated */`
Implemented manual packfile switching in fast-import. To help importers which are dealing with massive amounts of data fast-import needs to be able to close the packfile it is currently writing to and open a new packfile for any additional data that will be received. A new 'checkpoint' command has been introduced which can be used by the frontend import process to force this to occur at any time. This may be useful to ensure a very long running import doesn't lose any work due to unexpected failures. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 12:35:41 +01:00			`static unsigned int pack_id;`
Improve reuse of sha1_file library within fast-import. Now that the sha1_file.c library routines use the sliding mmap routines to perform efficient access to portions of a packfile I can remove that code from fast-import.c and just invoke it. One benefit is we now have reloading support for any packfile which uses OBJ_OFS_DELTA. Another is we have significantly less code to maintain. This code reuse change requires that fast-import generate only an OBJ_OFS_DELTA format packfile, as there is absolutely no index available to perform OBJ_REF_DELTA lookup in while unpacking an object. This is probably reasonable to require as the delta offsets result in smaller packfiles and are faster to unpack, as no index searching is required. Its also only a temporary requirement as users could always repack without offsets before making the import available to older versions of Git. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-14 12:20:23 +01:00			`static struct packed_git *pack_data;`
Implemented manual packfile switching in fast-import. To help importers which are dealing with massive amounts of data fast-import needs to be able to close the packfile it is currently writing to and open a new packfile for any additional data that will be received. A new 'checkpoint' command has been introduced which can be used by the frontend import process to force this to occur at any time. This may be useful to ensure a very long running import doesn't lose any work due to unexpected failures. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 12:35:41 +01:00			`static struct packed_git **all_packs;`
Implemented tree reloading in fast-import. Tree reloading allows fast-import to swap out the least-recently used branch by simply deallocating the data structures from memory that were associated with that branch. Later if the branch becomes active again it can lazily recreate those structures on demand by reloading the necessary trees from the pack file it originally wrote them to. The reloading process is implemented by mmap'ing the pack into memory and using a much tighter variant of the pack reading code contained in sha1_file.c. This was a blatent copy from sha1_file.c but the unpacking functions were significantly simplified and are actually now in a form that should make it easier to map only the necessary regions of a pack rather than the entire file. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-24 10:37:35 +02:00			`static unsigned long pack_size;`
Refactored fast-import's internals for future additions. Too many globals variables were being used not not enough code was resuable to process trees and commits so this is a simple refactoring of the existing blob processing code to get into a state that will be easier to handle trees and commits in. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 06:46:13 +02:00
			`/* Table of objects we've written. */`
Implemented tree delta compression in fast-import. We now store for every tree entry two modes and two sha1 values; the base (aka "version 0") and the current/new (aka "version 1"). When we generate a tree object we also regenerate the prior version object and use that as our base object for a delta. This strategy saves a significant amount of memory as we can continue to use the atom pool for file/directory names and only increases each tree entry by an additional 24 bytes of memory. Branches should automatically delta against their ancestor tree, unless the ancestor tree is already at the delta chain limit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-28 18:22:50 +02:00			`static unsigned int object_entry_alloc = 5000;`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`static struct object_entry_pool *blocks;`
			`static struct object_entry *object_table[1 << 16];`
Added mark store/find to fast-import. Marks are now saved when the mark directive gets used by the frontend and may be used in place of a SHA1 expression to locate a previous SHA1 which fast-import may have generated. This is particularly useful with commits where the frontend does not (easily) have the ability to compute the SHA1 for an arbitrary commit but needs it to generate a branch or tag from that commit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-23 10:17:45 +02:00			`static struct mark_set *marks;`
Added option to export the marks table when fast-import terminates. The marks table can be used by the frontend to load any commit after the import and compare it to whatever data the frontend knows about that commit. If the mark idnums can be easily correlated to some reference source then its relatively trivial to compare the GIT tree to the reference to verify the accuracy of the import. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-25 22:03:04 +02:00			`static const char* mark_file;`
Refactored fast-import's internals for future additions. Too many globals variables were being used not not enough code was resuable to process trees and commits so this is a simple refactoring of the existing blob processing code to get into a state that will be easier to handle trees and commits in. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 06:46:13 +02:00
			`/* Our last blob */`
fast-import optimization: Now that cmd_data acts on a strbuf, make last_object stashed buffer be a strbuf as well. On new stash, don't free the last stashed buffer, rather swap it with the one you will stash, this way, callers of store_object can act on static strbufs, and at some point, fast-import won't allocate new memory for objects buffers. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-17 14:00:38 +02:00			`static struct last_object last_blob = { STRBUF_INIT, 0, 0, 0 };`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00
			`/* Tree management */`
			`static unsigned int tree_entry_alloc = 1000;`
			`static void *avail_tree_entry;`
			`static unsigned int avail_tree_table_sz = 100;`
			`static struct avail_tree_content **avail_tree_table;`
fast-import was using dbuf's, replace them with strbuf's. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-17 13:48:17 +02:00			`static struct strbuf old_tree = STRBUF_INIT;`
			`static struct strbuf new_tree = STRBUF_INIT;`
Added automatic index generation to fast-import. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-06 19:51:39 +02:00
Implemented branch handling and basic tree support in fast-import. This provides the basic data structures needed to store trees in memory while we are processing them for a branch. What we are attempting to do is track one complete tree for each branch that the frontend has registered with us through the 'newb' (new_branch) command. When the frontend edits that tree through 'updf' or 'delf' commands we'll mark the affected tree(s) as being dirty and recompute their objects during 'comt' (commit). Currently the protocol is decidedly _not_ user friendly. I crashed fast-import by giving it bad input data from Perl. I may try to improve upon it, or at least upon its error handling. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 09:36:45 +02:00			`/* Branch data */`
Converted fast-import to accept standard command line parameters. The following command line options are now accepted before the pack name: --objects=n # replaces the object count after the pack name --depth=n # delta chain depth to use (default is 10) --active-branches=n # maximum number of branches to keep in memory Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-23 08:00:31 +02:00			`static unsigned long max_active_branches = 5;`
			`static unsigned long cur_active_branches;`
			`static unsigned long branch_table_sz = 1039;`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`static struct branch **branch_table;`
			`static struct branch *active_branches;`

Implemented 'tag' command in fast-import. Tags received from the frontend are generated in memory in a simple linked list in the order that the tag commands were sent by the frontend. If multiple different tag objects for the same tag name get generated the last one sent by the frontend will be the one that gets written out at termination. Multiple tag objects for the same name will cause all older tags of the same name to be lost. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-24 09:12:13 +02:00			`/* Tag data */`
			`static struct tag *first_tag;`
			`static struct tag *last_tag;`

Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00			`/* Input stream parsing */`
Support RFC 2822 date parsing in fast-import. Since some frontends may be working with source material where the dates are only readily available as RFC 2822 strings, it is more friendly if fast-import exposes Git's parse_date() function to handle the conversion. This way the frontend doesn't need to perform the parsing itself. The new --date-format option to fast-import can be used by a frontend to select which format it will supply date strings in. The default is the standard `raw` Git format, which fast-import has always supported. Format rfc2822 can be used to activate the parse_date() function instead. Because fast-import could also be useful for creating new, current commits, the format `now` is also supported to generate the current system timestamp. The implementation of `now` is a trivial call to datestamp(), but is actually a whole whopping 3 lines so that fast-import can verify the frontend really meant `now`. As part of this change I have added validation of the `raw` date format. Prior to this change fast-import would accept anything in a `committer` command, even if it was seriously malformed. Now fast-import requires the '> ' near the end of the string and verifies the timestamp is formatted properly. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-06 20:58:30 +01:00			`static whenspec_type whenspec = WHENSPEC_RAW;`
fast-import: Use strbuf API, and simplify cmd_data() This patch features the use of strbuf_detach, and prevent the programmer to mess with allocation directly. The code is as efficent as before, just more concise and more straightforward. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-06 13:20:07 +02:00			`static struct strbuf command_buf = STRBUF_INIT;`
Make trailing LF optional for all fast-import commands For the same reasons as the prior change we want to allow frontends to omit the trailing LF that usually delimits commands. In some cases these just make the input stream more verbose looking than it needs to be, and its just simpler for the frontend developer to get started if our parser is slightly more lenient about where an LF is required and where it isn't. To make this optional LF feature work we now have to buffer up to one line of input in command_buf. This buffering can happen if we look at the current input command but don't recognize it at this point in the code. In such a case we need to "unget" the entire line, but we cannot depend upon the stdio library to let us do ungetc() for that many characters at once. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-08-01 08:22:53 +02:00			`static int unread_command_buf;`
Include recent command history in fast-import crash reports When we crash the frontend developer (or end-user) may need to know roughly around what part of the input stream we had a problem with and aborted on. Because line numbers aren't very useful in this sort of application we instead just keep the last 100 commands in a FIFO queue and print them as part of the crash report. Currently one problem with this design is a commit that has more than 100 modified files in it will flood the FIFO and any context regarding branch/from/committer/mark/comments will be lost. We really should save only the last few (10?) file changes for the current commit, ensuring we have some prior higher level commands in the FIFO when we crash on a file M/D/C/R command. Another issue with this approach is the FIFO only includes the commands, it does not include the commit messages. Yet having a commit message may be useful to help locate the relevant change in the source material. In practice I don't think this is going to be a major concern as the frontend can always embed its own source change set identifier as a comment (which will appear in the crash report) and the commit message(s) for the most recent commits of any given branch should be obtainable from the (packed) commit objects. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-08-03 10:47:04 +02:00			`static struct recent_command cmd_hist = {&cmd_hist, &cmd_hist, NULL};`
			`static struct recent_command *cmd_tail = &cmd_hist;`
			`static struct recent_command *rc_free;`
			`static unsigned int cmd_save = 100;`
Use uintmax_t for marks in fast-import. If a frontend wants to use a mark per file revision and per commit and is doing a truly huge import (such as a 32 GiB SVN repository) we may need more than 2**32 unique mark values, especially if the frontend is unable (or unwilling) to recycle mark values. For mark idnums we should use the largest unsigned integer type available, hoping that will be at least 64 bits when we are compiled as a 64 bit executable. This way we may consume huge amounts of memory storing our mark table, but we'll at least be able to process the entire import without failing. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-16 06:33:19 +01:00			`static uintmax_t next_mark;`
fast-import was using dbuf's, replace them with strbuf's. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-17 13:48:17 +02:00			`static struct strbuf new_data = STRBUF_INIT;`
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00
Generate crash reports on die in fast-import As fast-import is quite strict about its input and die()'s anytime something goes wrong it can be difficult for a frontend developer to troubleshoot why fast-import rejected their input, or to even determine what input command it rejected. This change introduces a custom handler for Git's die() routine. When we receive a die() for any reason (fast-import or a lower level core Git routine we called) the error is first dumped onto stderr and then a more extensive crash report file is prepared in GIT_DIR. Finally we exit the process with status 128, just like the stock builtin die handler. An internal flag is set to prevent any further die()'s that may be invoked during the crash report generator from causing us to enter into an infinite loop. We shouldn't die() from our crash report handler, but just in case someone makes a future code change we are prepared to gaurd against small mistakes turning into huge problems for the end-user. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-08-03 08:00:37 +02:00			`static void write_branch_report(FILE rpt, struct branch b)`
			`{`
			`fprintf(rpt, "%s:\n", b->name);`

			`fprintf(rpt, " status :");`
			`if (b->active)`
			`fputs(" active", rpt);`
			`if (b->branch_tree.tree)`
			`fputs(" loaded", rpt);`
			`if (is_null_sha1(b->branch_tree.versions[1].sha1))`
			`fputs(" dirty", rpt);`
			`fputc('\n', rpt);`

			`fprintf(rpt, " tip commit : %s\n", sha1_to_hex(b->sha1));`
			`fprintf(rpt, " old tree : %s\n", sha1_to_hex(b->branch_tree.versions[0].sha1));`
			`fprintf(rpt, " cur tree : %s\n", sha1_to_hex(b->branch_tree.versions[1].sha1));`
			`fprintf(rpt, " commit clock: %" PRIuMAX "\n", b->last_commit);`

			`fputs(" last pack : ", rpt);`
			`if (b->pack_id < MAX_PACK_ID)`
			`fprintf(rpt, "%u", b->pack_id);`
			`fputc('\n', rpt);`

			`fputc('\n', rpt);`
			`}`

Include the fast-import marks table in crash reports If fast-import was not run with --export-marks but we are crashing the frontend application developer may still benefit from having that information available to them. We now include the marks table as part of the crash report if --export-marks was not supplied on the command line. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-02-14 07:34:40 +01:00			`static void dump_marks_helper(FILE , uintmax_t, struct mark_set );`

Avoid using va_copy in fast-import: it seems to be unportable. [sp: minor change to use fputs, thus reducing the patch size] Signed-off-by: Alex Riesen <raa.lkml@gmail.com> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-08-21 05:38:14 +02:00			`static void write_crash_report(const char *err)`
Generate crash reports on die in fast-import As fast-import is quite strict about its input and die()'s anytime something goes wrong it can be difficult for a frontend developer to troubleshoot why fast-import rejected their input, or to even determine what input command it rejected. This change introduces a custom handler for Git's die() routine. When we receive a die() for any reason (fast-import or a lower level core Git routine we called) the error is first dumped onto stderr and then a more extensive crash report file is prepared in GIT_DIR. Finally we exit the process with status 128, just like the stock builtin die handler. An internal flag is set to prevent any further die()'s that may be invoked during the crash report generator from causing us to enter into an infinite loop. We shouldn't die() from our crash report handler, but just in case someone makes a future code change we are prepared to gaurd against small mistakes turning into huge problems for the end-user. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-08-03 08:00:37 +02:00			`{`
			`char *loc = git_path("fast_import_crash_%d", getpid());`
			`FILE *rpt = fopen(loc, "w");`
			`struct branch *b;`
			`unsigned long lu;`
Include recent command history in fast-import crash reports When we crash the frontend developer (or end-user) may need to know roughly around what part of the input stream we had a problem with and aborted on. Because line numbers aren't very useful in this sort of application we instead just keep the last 100 commands in a FIFO queue and print them as part of the crash report. Currently one problem with this design is a commit that has more than 100 modified files in it will flood the FIFO and any context regarding branch/from/committer/mark/comments will be lost. We really should save only the last few (10?) file changes for the current commit, ensuring we have some prior higher level commands in the FIFO when we crash on a file M/D/C/R command. Another issue with this approach is the FIFO only includes the commands, it does not include the commit messages. Yet having a commit message may be useful to help locate the relevant change in the source material. In practice I don't think this is going to be a major concern as the frontend can always embed its own source change set identifier as a comment (which will appear in the crash report) and the commit message(s) for the most recent commits of any given branch should be obtainable from the (packed) commit objects. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-08-03 10:47:04 +02:00			`struct recent_command *rc;`
Generate crash reports on die in fast-import As fast-import is quite strict about its input and die()'s anytime something goes wrong it can be difficult for a frontend developer to troubleshoot why fast-import rejected their input, or to even determine what input command it rejected. This change introduces a custom handler for Git's die() routine. When we receive a die() for any reason (fast-import or a lower level core Git routine we called) the error is first dumped onto stderr and then a more extensive crash report file is prepared in GIT_DIR. Finally we exit the process with status 128, just like the stock builtin die handler. An internal flag is set to prevent any further die()'s that may be invoked during the crash report generator from causing us to enter into an infinite loop. We shouldn't die() from our crash report handler, but just in case someone makes a future code change we are prepared to gaurd against small mistakes turning into huge problems for the end-user. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-08-03 08:00:37 +02:00
			`if (!rpt) {`
			`error("can't write crash report %s: %s", loc, strerror(errno));`
			`return;`
			`}`

			`fprintf(stderr, "fast-import: dumping crash report to %s\n", loc);`

			`fprintf(rpt, "fast-import crash report:\n");`
			`fprintf(rpt, " fast-import process: %d\n", getpid());`
			`fprintf(rpt, " parent process : %d\n", getppid());`
			`fprintf(rpt, " at %s\n", show_date(time(NULL), 0, DATE_LOCAL));`
			`fputc('\n', rpt);`

			`fputs("fatal: ", rpt);`
Avoid using va_copy in fast-import: it seems to be unportable. [sp: minor change to use fputs, thus reducing the patch size] Signed-off-by: Alex Riesen <raa.lkml@gmail.com> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-08-21 05:38:14 +02:00			`fputs(err, rpt);`
Generate crash reports on die in fast-import As fast-import is quite strict about its input and die()'s anytime something goes wrong it can be difficult for a frontend developer to troubleshoot why fast-import rejected their input, or to even determine what input command it rejected. This change introduces a custom handler for Git's die() routine. When we receive a die() for any reason (fast-import or a lower level core Git routine we called) the error is first dumped onto stderr and then a more extensive crash report file is prepared in GIT_DIR. Finally we exit the process with status 128, just like the stock builtin die handler. An internal flag is set to prevent any further die()'s that may be invoked during the crash report generator from causing us to enter into an infinite loop. We shouldn't die() from our crash report handler, but just in case someone makes a future code change we are prepared to gaurd against small mistakes turning into huge problems for the end-user. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-08-03 08:00:37 +02:00			`fputc('\n', rpt);`

Include recent command history in fast-import crash reports When we crash the frontend developer (or end-user) may need to know roughly around what part of the input stream we had a problem with and aborted on. Because line numbers aren't very useful in this sort of application we instead just keep the last 100 commands in a FIFO queue and print them as part of the crash report. Currently one problem with this design is a commit that has more than 100 modified files in it will flood the FIFO and any context regarding branch/from/committer/mark/comments will be lost. We really should save only the last few (10?) file changes for the current commit, ensuring we have some prior higher level commands in the FIFO when we crash on a file M/D/C/R command. Another issue with this approach is the FIFO only includes the commands, it does not include the commit messages. Yet having a commit message may be useful to help locate the relevant change in the source material. In practice I don't think this is going to be a major concern as the frontend can always embed its own source change set identifier as a comment (which will appear in the crash report) and the commit message(s) for the most recent commits of any given branch should be obtainable from the (packed) commit objects. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-08-03 10:47:04 +02:00			`fputc('\n', rpt);`
			`fputs("Most Recent Commands Before Crash\n", rpt);`
			`fputs("---------------------------------\n", rpt);`
			`for (rc = cmd_hist.next; rc != &cmd_hist; rc = rc->next) {`
			`if (rc->next == &cmd_hist)`
			`fputs("* ", rpt);`
			`else`
			`fputs(" ", rpt);`
			`fputs(rc->buf, rpt);`
			`fputc('\n', rpt);`
			`}`

Generate crash reports on die in fast-import As fast-import is quite strict about its input and die()'s anytime something goes wrong it can be difficult for a frontend developer to troubleshoot why fast-import rejected their input, or to even determine what input command it rejected. This change introduces a custom handler for Git's die() routine. When we receive a die() for any reason (fast-import or a lower level core Git routine we called) the error is first dumped onto stderr and then a more extensive crash report file is prepared in GIT_DIR. Finally we exit the process with status 128, just like the stock builtin die handler. An internal flag is set to prevent any further die()'s that may be invoked during the crash report generator from causing us to enter into an infinite loop. We shouldn't die() from our crash report handler, but just in case someone makes a future code change we are prepared to gaurd against small mistakes turning into huge problems for the end-user. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-08-03 08:00:37 +02:00			`fputc('\n', rpt);`
			`fputs("Active Branch LRU\n", rpt);`
			`fputs("-----------------\n", rpt);`
			`fprintf(rpt, " active_branches = %lu cur, %lu max\n",`
			`cur_active_branches,`
			`max_active_branches);`
			`fputc('\n', rpt);`
			`fputs(" pos clock name\n", rpt);`
			`fputs(" ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n", rpt);`
			`for (b = active_branches, lu = 0; b; b = b->active_next_branch)`
			`fprintf(rpt, " %2lu) %6" PRIuMAX" %s\n",`
			`++lu, b->last_commit, b->name);`

			`fputc('\n', rpt);`
			`fputs("Inactive Branches\n", rpt);`
			`fputs("-----------------\n", rpt);`
			`for (lu = 0; lu < branch_table_sz; lu++) {`
			`for (b = branch_table[lu]; b; b = b->table_next_branch)`
			`write_branch_report(rpt, b);`
			`}`

Include annotated tags in fast-import crash reports If annotated tags were created they exist in a different namespace within the fast-import process' internal memory tables so we did not export them in the inactive branch table. Now they are written out after the branches, in the order that they were defined by the frontend process. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-02-14 07:34:36 +01:00			`if (first_tag) {`
			`struct tag *tg;`
			`fputc('\n', rpt);`
			`fputs("Annotated Tags\n", rpt);`
			`fputs("--------------\n", rpt);`
			`for (tg = first_tag; tg; tg = tg->next_tag) {`
			`fputs(sha1_to_hex(tg->sha1), rpt);`
			`fputc(' ', rpt);`
			`fputs(tg->name, rpt);`
			`fputc('\n', rpt);`
			`}`
			`}`

Include the fast-import marks table in crash reports If fast-import was not run with --export-marks but we are crashing the frontend application developer may still benefit from having that information available to them. We now include the marks table as part of the crash report if --export-marks was not supplied on the command line. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-02-14 07:34:40 +01:00			`fputc('\n', rpt);`
			`fputs("Marks\n", rpt);`
			`fputs("-----\n", rpt);`
			`if (mark_file)`
			`fprintf(rpt, " exported to %s\n", mark_file);`
			`else`
			`dump_marks_helper(rpt, 0, marks);`

Generate crash reports on die in fast-import As fast-import is quite strict about its input and die()'s anytime something goes wrong it can be difficult for a frontend developer to troubleshoot why fast-import rejected their input, or to even determine what input command it rejected. This change introduces a custom handler for Git's die() routine. When we receive a die() for any reason (fast-import or a lower level core Git routine we called) the error is first dumped onto stderr and then a more extensive crash report file is prepared in GIT_DIR. Finally we exit the process with status 128, just like the stock builtin die handler. An internal flag is set to prevent any further die()'s that may be invoked during the crash report generator from causing us to enter into an infinite loop. We shouldn't die() from our crash report handler, but just in case someone makes a future code change we are prepared to gaurd against small mistakes turning into huge problems for the end-user. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-08-03 08:00:37 +02:00			`fputc('\n', rpt);`
			`fputs("-------------------\n", rpt);`
			`fputs("END OF CRASH REPORT\n", rpt);`
			`fclose(rpt);`
			`}`

Finish current packfile during fast-import crash handler If fast-import is in the middle of crashing due to a protocol error or something like that then it can be very useful to have the mark table and all objects up until that point be available for a new import to resume from. Currently we just close the active packfile, unkeep all of our newly created packfiles (so they can be deleted), and dump the marks table to a temporary file. We don't attempt to update the refs/tags that the process has in memory as much of that data can be found in the crash report and I'm not sure it would be the right thing to do under every type of crash. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-02-14 07:34:43 +01:00			`static void end_packfile(void);`
			`static void unkeep_all_packs(void);`
			`static void dump_marks(void);`

Generate crash reports on die in fast-import As fast-import is quite strict about its input and die()'s anytime something goes wrong it can be difficult for a frontend developer to troubleshoot why fast-import rejected their input, or to even determine what input command it rejected. This change introduces a custom handler for Git's die() routine. When we receive a die() for any reason (fast-import or a lower level core Git routine we called) the error is first dumped onto stderr and then a more extensive crash report file is prepared in GIT_DIR. Finally we exit the process with status 128, just like the stock builtin die handler. An internal flag is set to prevent any further die()'s that may be invoked during the crash report generator from causing us to enter into an infinite loop. We shouldn't die() from our crash report handler, but just in case someone makes a future code change we are prepared to gaurd against small mistakes turning into huge problems for the end-user. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-08-03 08:00:37 +02:00			`static NORETURN void die_nicely(const char *err, va_list params)`
			`{`
			`static int zombie;`
Avoid using va_copy in fast-import: it seems to be unportable. [sp: minor change to use fputs, thus reducing the patch size] Signed-off-by: Alex Riesen <raa.lkml@gmail.com> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-08-21 05:38:14 +02:00			`char message[2 * PATH_MAX];`
Generate crash reports on die in fast-import As fast-import is quite strict about its input and die()'s anytime something goes wrong it can be difficult for a frontend developer to troubleshoot why fast-import rejected their input, or to even determine what input command it rejected. This change introduces a custom handler for Git's die() routine. When we receive a die() for any reason (fast-import or a lower level core Git routine we called) the error is first dumped onto stderr and then a more extensive crash report file is prepared in GIT_DIR. Finally we exit the process with status 128, just like the stock builtin die handler. An internal flag is set to prevent any further die()'s that may be invoked during the crash report generator from causing us to enter into an infinite loop. We shouldn't die() from our crash report handler, but just in case someone makes a future code change we are prepared to gaurd against small mistakes turning into huge problems for the end-user. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-08-03 08:00:37 +02:00
Avoid using va_copy in fast-import: it seems to be unportable. [sp: minor change to use fputs, thus reducing the patch size] Signed-off-by: Alex Riesen <raa.lkml@gmail.com> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-08-21 05:38:14 +02:00			`vsnprintf(message, sizeof(message), err, params);`
Generate crash reports on die in fast-import As fast-import is quite strict about its input and die()'s anytime something goes wrong it can be difficult for a frontend developer to troubleshoot why fast-import rejected their input, or to even determine what input command it rejected. This change introduces a custom handler for Git's die() routine. When we receive a die() for any reason (fast-import or a lower level core Git routine we called) the error is first dumped onto stderr and then a more extensive crash report file is prepared in GIT_DIR. Finally we exit the process with status 128, just like the stock builtin die handler. An internal flag is set to prevent any further die()'s that may be invoked during the crash report generator from causing us to enter into an infinite loop. We shouldn't die() from our crash report handler, but just in case someone makes a future code change we are prepared to gaurd against small mistakes turning into huge problems for the end-user. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-08-03 08:00:37 +02:00			`fputs("fatal: ", stderr);`
Avoid using va_copy in fast-import: it seems to be unportable. [sp: minor change to use fputs, thus reducing the patch size] Signed-off-by: Alex Riesen <raa.lkml@gmail.com> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-08-21 05:38:14 +02:00			`fputs(message, stderr);`
Generate crash reports on die in fast-import As fast-import is quite strict about its input and die()'s anytime something goes wrong it can be difficult for a frontend developer to troubleshoot why fast-import rejected their input, or to even determine what input command it rejected. This change introduces a custom handler for Git's die() routine. When we receive a die() for any reason (fast-import or a lower level core Git routine we called) the error is first dumped onto stderr and then a more extensive crash report file is prepared in GIT_DIR. Finally we exit the process with status 128, just like the stock builtin die handler. An internal flag is set to prevent any further die()'s that may be invoked during the crash report generator from causing us to enter into an infinite loop. We shouldn't die() from our crash report handler, but just in case someone makes a future code change we are prepared to gaurd against small mistakes turning into huge problems for the end-user. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-08-03 08:00:37 +02:00			`fputc('\n', stderr);`

			`if (!zombie) {`
			`zombie = 1;`
Avoid using va_copy in fast-import: it seems to be unportable. [sp: minor change to use fputs, thus reducing the patch size] Signed-off-by: Alex Riesen <raa.lkml@gmail.com> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-08-21 05:38:14 +02:00			`write_crash_report(message);`
Finish current packfile during fast-import crash handler If fast-import is in the middle of crashing due to a protocol error or something like that then it can be very useful to have the mark table and all objects up until that point be available for a new import to resume from. Currently we just close the active packfile, unkeep all of our newly created packfiles (so they can be deleted), and dump the marks table to a temporary file. We don't attempt to update the refs/tags that the process has in memory as much of that data can be found in the crash report and I'm not sure it would be the right thing to do under every type of crash. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-02-14 07:34:43 +01:00			`end_packfile();`
			`unkeep_all_packs();`
			`dump_marks();`
Generate crash reports on die in fast-import As fast-import is quite strict about its input and die()'s anytime something goes wrong it can be difficult for a frontend developer to troubleshoot why fast-import rejected their input, or to even determine what input command it rejected. This change introduces a custom handler for Git's die() routine. When we receive a die() for any reason (fast-import or a lower level core Git routine we called) the error is first dumped onto stderr and then a more extensive crash report file is prepared in GIT_DIR. Finally we exit the process with status 128, just like the stock builtin die handler. An internal flag is set to prevent any further die()'s that may be invoked during the crash report generator from causing us to enter into an infinite loop. We shouldn't die() from our crash report handler, but just in case someone makes a future code change we are prepared to gaurd against small mistakes turning into huge problems for the end-user. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-08-03 08:00:37 +02:00			`}`
			`exit(128);`
			`}`
Implemented branch handling and basic tree support in fast-import. This provides the basic data structures needed to store trees in memory while we are processing them for a branch. What we are attempting to do is track one complete tree for each branch that the frontend has registered with us through the 'newb' (new_branch) command. When the frontend edits that tree through 'updf' or 'delf' commands we'll mark the affected tree(s) as being dirty and recompute their objects during 'comt' (commit). Currently the protocol is decidedly _not_ user friendly. I crashed fast-import by giving it bad input data from Perl. I may try to improve upon it, or at least upon its error handling. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 09:36:45 +02:00
Misc. type cleanups within fast-import. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 06:16:23 +01:00			`static void alloc_objects(unsigned int cnt)`
Added automatic index generation to fast-import. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-06 19:51:39 +02:00			`{`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`struct object_entry_pool *b;`
Cleaned up memory allocation for object_entry structs. Although its easy to ask the user to tell us how many objects they will need, its probably better to dynamically grow the object table in large units. But if the user can give us a hint as to roughly how many objects then we can still use it during startup. Also stopped printing the SHA1 strings to stdout as no user is currently making use of that facility. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 06:03:59 +02:00
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`b = xmalloc(sizeof(struct object_entry_pool)`
Cleaned up memory allocation for object_entry structs. Although its easy to ask the user to tell us how many objects they will need, its probably better to dynamically grow the object table in large units. But if the user can give us a hint as to roughly how many objects then we can still use it during startup. Also stopped printing the SHA1 strings to stdout as no user is currently making use of that facility. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 06:03:59 +02:00			`+ cnt * sizeof(struct object_entry));`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`b->next_pool = blocks;`
Cleaned up memory allocation for object_entry structs. Although its easy to ask the user to tell us how many objects they will need, its probably better to dynamically grow the object table in large units. But if the user can give us a hint as to roughly how many objects then we can still use it during startup. Also stopped printing the SHA1 strings to stdout as no user is currently making use of that facility. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 06:03:59 +02:00			`b->next_free = b->entries;`
			`b->end = b->entries + cnt;`
			`blocks = b;`
			`alloc_count += cnt;`
			`}`
Added automatic index generation to fast-import. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-06 19:51:39 +02:00
Correct minor style issue in fast-import. Junio noticed that I was using a different style in fast-import for returned pointers than the rest of Git. Before merging this code into the main git.git tree I'd like to make it consistent, as this style variation was not intentional. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-06 06:43:59 +01:00			`static struct object_entry new_object(unsigned char sha1)`
Added automatic index generation to fast-import. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-06 19:51:39 +02:00			`{`
Cleaned up memory allocation for object_entry structs. Although its easy to ask the user to tell us how many objects they will need, its probably better to dynamically grow the object table in large units. But if the user can give us a hint as to roughly how many objects then we can still use it during startup. Also stopped printing the SHA1 strings to stdout as no user is currently making use of that facility. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 06:03:59 +02:00			`struct object_entry *e;`
Added automatic index generation to fast-import. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-06 19:51:39 +02:00
Cleaned up memory allocation for object_entry structs. Although its easy to ask the user to tell us how many objects they will need, its probably better to dynamically grow the object table in large units. But if the user can give us a hint as to roughly how many objects then we can still use it during startup. Also stopped printing the SHA1 strings to stdout as no user is currently making use of that facility. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 06:03:59 +02:00			`if (blocks->next_free == blocks->end)`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`alloc_objects(object_entry_alloc);`
Added automatic index generation to fast-import. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-06 19:51:39 +02:00
Cleaned up memory allocation for object_entry structs. Although its easy to ask the user to tell us how many objects they will need, its probably better to dynamically grow the object table in large units. But if the user can give us a hint as to roughly how many objects then we can still use it during startup. Also stopped printing the SHA1 strings to stdout as no user is currently making use of that facility. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 06:03:59 +02:00			`e = blocks->next_free++;`
Converted hash memcpy/memcmp to new hashcpy/hashcmp/hashclr. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-28 16:46:58 +02:00			`hashcpy(e->sha1, sha1);`
Cleaned up memory allocation for object_entry structs. Although its easy to ask the user to tell us how many objects they will need, its probably better to dynamically grow the object table in large units. But if the user can give us a hint as to roughly how many objects then we can still use it during startup. Also stopped printing the SHA1 strings to stdout as no user is currently making use of that facility. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 06:03:59 +02:00			`return e;`
Added automatic index generation to fast-import. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-06 19:51:39 +02:00			`}`

Correct minor style issue in fast-import. Junio noticed that I was using a different style in fast-import for returned pointers than the rest of Git. Before merging this code into the main git.git tree I'd like to make it consistent, as this style variation was not intentional. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-06 06:43:59 +01:00			`static struct object_entry find_object(unsigned char sha1)`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`{`
			`unsigned int h = sha1[0] << 8 \| sha1[1];`
			`struct object_entry *e;`
			`for (e = object_table[h]; e; e = e->next)`
Converted hash memcpy/memcmp to new hashcpy/hashcmp/hashclr. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-28 16:46:58 +02:00			`if (!hashcmp(sha1, e->sha1))`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`return e;`
			`return NULL;`
			`}`

Correct minor style issue in fast-import. Junio noticed that I was using a different style in fast-import for returned pointers than the rest of Git. Before merging this code into the main git.git tree I'd like to make it consistent, as this style variation was not intentional. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-06 06:43:59 +01:00			`static struct object_entry insert_object(unsigned char sha1)`
Added automatic index generation to fast-import. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-06 19:51:39 +02:00			`{`
			`unsigned int h = sha1[0] << 8 \| sha1[1];`
Cleaned up memory allocation for object_entry structs. Although its easy to ask the user to tell us how many objects they will need, its probably better to dynamically grow the object table in large units. But if the user can give us a hint as to roughly how many objects then we can still use it during startup. Also stopped printing the SHA1 strings to stdout as no user is currently making use of that facility. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 06:03:59 +02:00			`struct object_entry *e = object_table[h];`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`struct object_entry *p = NULL;`
Added automatic index generation to fast-import. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-06 19:51:39 +02:00
			`while (e) {`
Converted hash memcpy/memcmp to new hashcpy/hashcmp/hashclr. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-28 16:46:58 +02:00			`if (!hashcmp(sha1, e->sha1))`
Added automatic index generation to fast-import. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-06 19:51:39 +02:00			`return e;`
			`p = e;`
			`e = e->next;`
			`}`

			`e = new_object(sha1);`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`e->next = NULL;`
Added automatic index generation to fast-import. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-06 19:51:39 +02:00			`e->offset = 0;`
			`if (p)`
			`p->next = e;`
			`else`
Cleaned up memory allocation for object_entry structs. Although its easy to ask the user to tell us how many objects they will need, its probably better to dynamically grow the object table in large units. But if the user can give us a hint as to roughly how many objects then we can still use it during startup. Also stopped printing the SHA1 strings to stdout as no user is currently making use of that facility. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 06:03:59 +02:00			`object_table[h] = e;`
Added automatic index generation to fast-import. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-06 19:51:39 +02:00			`return e;`
			`}`
Created fast-import, a tool to quickly generating a pack from blobs. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-05 08:04:21 +02:00
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`static unsigned int hc_str(const char *s, size_t len)`
			`{`
			`unsigned int r = 0;`
			`while (len-- > 0)`
			`r = r * 31 + *s++;`
			`return r;`
			`}`

Correct minor style issue in fast-import. Junio noticed that I was using a different style in fast-import for returned pointers than the rest of Git. Before merging this code into the main git.git tree I'd like to make it consistent, as this style variation was not intentional. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-06 06:43:59 +01:00			`static void *pool_alloc(size_t len)`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`{`
			`struct mem_pool *p;`
			`void *r;`

			`for (p = mem_pool; p; p = p->next_pool)`
			`if ((p->end - p->next_free >= len))`
			`break;`

			`if (!p) {`
			`if (len >= (mem_pool_alloc/2)) {`
			`total_allocd += len;`
			`return xmalloc(len);`
			`}`
			`total_allocd += sizeof(struct mem_pool) + mem_pool_alloc;`
			`p = xmalloc(sizeof(struct mem_pool) + mem_pool_alloc);`
			`p->next_pool = mem_pool;`
fast-import: fix unalinged allocation and access The specialized pool allocator fast-import uses aligned objects on the size of a pointer, which was not sufficient at least on Sparc. Instead, make the alignment for objects of type unitmax_t. Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-12-15 05:39:16 +01:00			`p->next_free = (char *) p->space;`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`p->end = p->next_free + mem_pool_alloc;`
			`mem_pool = p;`
			`}`

			`r = p->next_free;`
fast-import: fix unalinged allocation and access The specialized pool allocator fast-import uses aligned objects on the size of a pointer, which was not sufficient at least on Sparc. Instead, make the alignment for objects of type unitmax_t. Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-12-15 05:39:16 +01:00			`/* round out to a 'uintmax_t' alignment */`
			`if (len & (sizeof(uintmax_t) - 1))`
			`len += sizeof(uintmax_t) - (len & (sizeof(uintmax_t) - 1));`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`p->next_free += len;`
			`return r;`
			`}`

Correct minor style issue in fast-import. Junio noticed that I was using a different style in fast-import for returned pointers than the rest of Git. Before merging this code into the main git.git tree I'd like to make it consistent, as this style variation was not intentional. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-06 06:43:59 +01:00			`static void *pool_calloc(size_t count, size_t size)`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`{`
			`size_t len = count * size;`
			`void *r = pool_alloc(len);`
			`memset(r, 0, len);`
			`return r;`
			`}`

Correct minor style issue in fast-import. Junio noticed that I was using a different style in fast-import for returned pointers than the rest of Git. Before merging this code into the main git.git tree I'd like to make it consistent, as this style variation was not intentional. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-06 06:43:59 +01:00			`static char pool_strdup(const char s)`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`{`
			`char *r = pool_alloc(strlen(s) + 1);`
			`strcpy(r, s);`
			`return r;`
			`}`

Use uintmax_t for marks in fast-import. If a frontend wants to use a mark per file revision and per commit and is doing a truly huge import (such as a 32 GiB SVN repository) we may need more than 2**32 unique mark values, especially if the frontend is unable (or unwilling) to recycle mark values. For mark idnums we should use the largest unsigned integer type available, hoping that will be at least 64 bits when we are compiled as a 64 bit executable. This way we may consume huge amounts of memory storing our mark table, but we'll at least be able to process the entire import without failing. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-16 06:33:19 +01:00			`static void insert_mark(uintmax_t idnum, struct object_entry *oe)`
Added mark store/find to fast-import. Marks are now saved when the mark directive gets used by the frontend and may be used in place of a SHA1 expression to locate a previous SHA1 which fast-import may have generated. This is particularly useful with commits where the frontend does not (easily) have the ability to compute the SHA1 for an arbitrary commit but needs it to generate a branch or tag from that commit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-23 10:17:45 +02:00			`{`
			`struct mark_set *s = marks;`
			`while ((idnum >> s->shift) >= 1024) {`
			`s = pool_calloc(1, sizeof(struct mark_set));`
			`s->shift = marks->shift + 10;`
			`s->data.sets[0] = marks;`
			`marks = s;`
			`}`
			`while (s->shift) {`
Use uintmax_t for marks in fast-import. If a frontend wants to use a mark per file revision and per commit and is doing a truly huge import (such as a 32 GiB SVN repository) we may need more than 2**32 unique mark values, especially if the frontend is unable (or unwilling) to recycle mark values. For mark idnums we should use the largest unsigned integer type available, hoping that will be at least 64 bits when we are compiled as a 64 bit executable. This way we may consume huge amounts of memory storing our mark table, but we'll at least be able to process the entire import without failing. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-16 06:33:19 +01:00			`uintmax_t i = idnum >> s->shift;`
Added mark store/find to fast-import. Marks are now saved when the mark directive gets used by the frontend and may be used in place of a SHA1 expression to locate a previous SHA1 which fast-import may have generated. This is particularly useful with commits where the frontend does not (easily) have the ability to compute the SHA1 for an arbitrary commit but needs it to generate a branch or tag from that commit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-23 10:17:45 +02:00			`idnum -= i << s->shift;`
			`if (!s->data.sets[i]) {`
			`s->data.sets[i] = pool_calloc(1, sizeof(struct mark_set));`
			`s->data.sets[i]->shift = s->shift - 10;`
			`}`
			`s = s->data.sets[i];`
			`}`
			`if (!s->data.marked[idnum])`
			`marks_set_count++;`
			`s->data.marked[idnum] = oe;`
			`}`

Correct minor style issue in fast-import. Junio noticed that I was using a different style in fast-import for returned pointers than the rest of Git. Before merging this code into the main git.git tree I'd like to make it consistent, as this style variation was not intentional. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-06 06:43:59 +01:00			`static struct object_entry *find_mark(uintmax_t idnum)`
Added mark store/find to fast-import. Marks are now saved when the mark directive gets used by the frontend and may be used in place of a SHA1 expression to locate a previous SHA1 which fast-import may have generated. This is particularly useful with commits where the frontend does not (easily) have the ability to compute the SHA1 for an arbitrary commit but needs it to generate a branch or tag from that commit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-23 10:17:45 +02:00			`{`
Use uintmax_t for marks in fast-import. If a frontend wants to use a mark per file revision and per commit and is doing a truly huge import (such as a 32 GiB SVN repository) we may need more than 2**32 unique mark values, especially if the frontend is unable (or unwilling) to recycle mark values. For mark idnums we should use the largest unsigned integer type available, hoping that will be at least 64 bits when we are compiled as a 64 bit executable. This way we may consume huge amounts of memory storing our mark table, but we'll at least be able to process the entire import without failing. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-16 06:33:19 +01:00			`uintmax_t orig_idnum = idnum;`
Added mark store/find to fast-import. Marks are now saved when the mark directive gets used by the frontend and may be used in place of a SHA1 expression to locate a previous SHA1 which fast-import may have generated. This is particularly useful with commits where the frontend does not (easily) have the ability to compute the SHA1 for an arbitrary commit but needs it to generate a branch or tag from that commit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-23 10:17:45 +02:00			`struct mark_set *s = marks;`
			`struct object_entry *oe = NULL;`
			`if ((idnum >> s->shift) < 1024) {`
			`while (s && s->shift) {`
Use uintmax_t for marks in fast-import. If a frontend wants to use a mark per file revision and per commit and is doing a truly huge import (such as a 32 GiB SVN repository) we may need more than 2**32 unique mark values, especially if the frontend is unable (or unwilling) to recycle mark values. For mark idnums we should use the largest unsigned integer type available, hoping that will be at least 64 bits when we are compiled as a 64 bit executable. This way we may consume huge amounts of memory storing our mark table, but we'll at least be able to process the entire import without failing. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-16 06:33:19 +01:00			`uintmax_t i = idnum >> s->shift;`
Added mark store/find to fast-import. Marks are now saved when the mark directive gets used by the frontend and may be used in place of a SHA1 expression to locate a previous SHA1 which fast-import may have generated. This is particularly useful with commits where the frontend does not (easily) have the ability to compute the SHA1 for an arbitrary commit but needs it to generate a branch or tag from that commit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-23 10:17:45 +02:00			`idnum -= i << s->shift;`
			`s = s->data.sets[i];`
			`}`
			`if (s)`
			`oe = s->data.marked[idnum];`
			`}`
			`if (!oe)`
Check for PRIuMAX rather than NO_C99_FORMAT in fast-import.c. Thanks to Simon 'corecode' Schubert <corecode@fs.ei.tum.de> for the clean-up. Defining the C99 standard PRIuMAX when necessary replaces UM_FMT and the awkward UM10_FMT. There are no direct C99 translations for other uses of NO_C99_FORMAT in git, alas. Signed-off-by: Jason Riedy <ejr@cs.berkeley.edu> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-21 02:34:56 +01:00			`die("mark :%" PRIuMAX " not declared", orig_idnum);`
Added mark store/find to fast-import. Marks are now saved when the mark directive gets used by the frontend and may be used in place of a SHA1 expression to locate a previous SHA1 which fast-import may have generated. This is particularly useful with commits where the frontend does not (easily) have the ability to compute the SHA1 for an arbitrary commit but needs it to generate a branch or tag from that commit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-23 10:17:45 +02:00			`return oe;`
			`}`

Correct minor style issue in fast-import. Junio noticed that I was using a different style in fast-import for returned pointers than the rest of Git. Before merging this code into the main git.git tree I'd like to make it consistent, as this style variation was not intentional. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-06 06:43:59 +01:00			`static struct atom_str to_atom(const char s, unsigned short len)`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`{`
			`unsigned int hc = hc_str(s, len) % atom_table_sz;`
			`struct atom_str *c;`

			`for (c = atom_table[hc]; c; c = c->next_atom)`
			`if (c->str_len == len && !strncmp(s, c->str_dat, len))`
			`return c;`

			`c = pool_alloc(sizeof(struct atom_str) + len + 1);`
			`c->str_len = len;`
			`strncpy(c->str_dat, s, len);`
			`c->str_dat[len] = 0;`
			`c->next_atom = atom_table[hc];`
			`atom_table[hc] = c;`
			`atom_cnt++;`
			`return c;`
			`}`

Correct minor style issue in fast-import. Junio noticed that I was using a different style in fast-import for returned pointers than the rest of Git. Before merging this code into the main git.git tree I'd like to make it consistent, as this style variation was not intentional. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-06 06:43:59 +01:00			`static struct branch lookup_branch(const char name)`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`{`
			`unsigned int hc = hc_str(name, strlen(name)) % branch_table_sz;`
			`struct branch *b;`

			`for (b = branch_table[hc]; b; b = b->table_next_branch)`
			`if (!strcmp(name, b->name))`
			`return b;`
			`return NULL;`
			`}`

Correct minor style issue in fast-import. Junio noticed that I was using a different style in fast-import for returned pointers than the rest of Git. Before merging this code into the main git.git tree I'd like to make it consistent, as this style variation was not intentional. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-06 06:43:59 +01:00			`static struct branch new_branch(const char name)`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`{`
			`unsigned int hc = hc_str(name, strlen(name)) % branch_table_sz;`
			`struct branch* b = lookup_branch(name);`

			`if (b)`
			`die("Invalid attempt to create duplicate branch: %s", name);`
Actually allow TAG_FIXUP branches in fast-import Michael Haggerty <mhagger@alum.mit.edu> noticed while debugging a Git backend for cvs2svn that fast-import was barfing when he tried to use "TAG_FIXUP" as a branch name for temporary work needed to cleanup the tree prior to creating an annotated tag object. The reason we were rejecting the branch name was check_ref_format() returns -2 when there are less than 2 '/' characters in the input name. TAG_FIXUP has 0 '/' characters, but is technically just as valid of a ref as HEAD and MERGE_HEAD, so we really should permit it (and any other similar looking name) during import. New test cases have been added to make sure we still detect very wrong branch names (e.g. containing [ or starting with .) and yet still permit reasonable names (e.g. TAG_FIXUP). Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-08-03 05:37:21 +02:00			`switch (check_ref_format(name)) {`
Update callers of check_ref_format() This updates send-pack and fast-import to use symbolic constants for checking the return values from check_ref_format(), and also futureproof the logic in lock_any_ref_for_update() to explicitly name the case that is usually considered an error but is Ok for this particular use. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-01-02 20:14:40 +01:00			`case 0: break; /* its valid */`
			`case CHECK_REF_FORMAT_ONELEVEL:`
			`break; /* valid, but too few '/', allow anyway */`
Actually allow TAG_FIXUP branches in fast-import Michael Haggerty <mhagger@alum.mit.edu> noticed while debugging a Git backend for cvs2svn that fast-import was barfing when he tried to use "TAG_FIXUP" as a branch name for temporary work needed to cleanup the tree prior to creating an annotated tag object. The reason we were rejecting the branch name was check_ref_format() returns -2 when there are less than 2 '/' characters in the input name. TAG_FIXUP has 0 '/' characters, but is technically just as valid of a ref as HEAD and MERGE_HEAD, so we really should permit it (and any other similar looking name) during import. New test cases have been added to make sure we still detect very wrong branch names (e.g. containing [ or starting with .) and yet still permit reasonable names (e.g. TAG_FIXUP). Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-08-03 05:37:21 +02:00			`default:`
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00			`die("Branch name doesn't conform to GIT standards: %s", name);`
Actually allow TAG_FIXUP branches in fast-import Michael Haggerty <mhagger@alum.mit.edu> noticed while debugging a Git backend for cvs2svn that fast-import was barfing when he tried to use "TAG_FIXUP" as a branch name for temporary work needed to cleanup the tree prior to creating an annotated tag object. The reason we were rejecting the branch name was check_ref_format() returns -2 when there are less than 2 '/' characters in the input name. TAG_FIXUP has 0 '/' characters, but is technically just as valid of a ref as HEAD and MERGE_HEAD, so we really should permit it (and any other similar looking name) during import. New test cases have been added to make sure we still detect very wrong branch names (e.g. containing [ or starting with .) and yet still permit reasonable names (e.g. TAG_FIXUP). Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-08-03 05:37:21 +02:00			`}`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00
			`b = pool_calloc(1, sizeof(struct branch));`
			`b->name = pool_strdup(name);`
			`b->table_next_branch = branch_table[hc];`
Additional fast-import tree delta corruption cleanups. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-29 04:06:13 +02:00			`b->branch_tree.versions[0].mode = S_IFDIR;`
			`b->branch_tree.versions[1].mode = S_IFDIR;`
fast-import: Avoid infinite loop after reset Johannes Sixt noticed that a 'reset' command applied to a branch that is already active in the branch LRU cache can cause fast-import to relink the same branch into the LRU cache twice. This will cause the LRU cache to contain a cycle, making unload_one_branch run in an infinite loop as it tries to select the oldest branch for eviction. I have trivially fixed the problem by adding an active bit to each branch object; this bit indicates if the branch is already in the LRU and allows us to avoid trying to add it a second time. Converting the pack_id field into a bitfield makes this change take up no additional memory. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-03-05 18:31:09 +01:00			`b->active = 0;`
Correct packfile edge output in fast-import. Branches are only contained by a packfile if the branch actually had its most recent commit in that packfile. So new branches are set to MAX_PACK_ID to ensure they don't cause their commit to list as part of the first packfile when it closes out if the commit was actually in existance before fast-import started. Also corrected the type of last_commit to be umaxint_t to prevent overflow and wraparound on very large imports. Though that is highly unlikely to occur as we're talking 4 billion commits, which no real project has right now. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-17 08:42:43 +01:00			`b->pack_id = MAX_PACK_ID;`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`branch_table[hc] = b;`
			`branch_count++;`
			`return b;`
			`}`

			`static unsigned int hc_entries(unsigned int cnt)`
			`{`
			`cnt = cnt & 7 ? (cnt / 8) + 1 : cnt / 8;`
			`return cnt < avail_tree_table_sz ? cnt : avail_tree_table_sz - 1;`
			`}`

Correct minor style issue in fast-import. Junio noticed that I was using a different style in fast-import for returned pointers than the rest of Git. Before merging this code into the main git.git tree I'd like to make it consistent, as this style variation was not intentional. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-06 06:43:59 +01:00			`static struct tree_content *new_tree_content(unsigned int cnt)`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`{`
			`struct avail_tree_content f, l = NULL;`
			`struct tree_content *t;`
			`unsigned int hc = hc_entries(cnt);`

			`for (f = avail_tree_table[hc]; f; l = f, f = f->next_avail)`
			`if (f->entry_capacity >= cnt)`
			`break;`

			`if (f) {`
			`if (l)`
			`l->next_avail = f->next_avail;`
			`else`
			`avail_tree_table[hc] = f->next_avail;`
			`} else {`
			`cnt = cnt & 7 ? ((cnt / 8) + 1) * 8 : cnt;`
			`f = pool_alloc(sizeof(t) + sizeof(t->entries[0]) cnt);`
			`f->entry_capacity = cnt;`
			`}`

			`t = (struct tree_content*)f;`
			`t->entry_count = 0;`
Implemented tree delta compression in fast-import. We now store for every tree entry two modes and two sha1 values; the base (aka "version 0") and the current/new (aka "version 1"). When we generate a tree object we also regenerate the prior version object and use that as our base object for a delta. This strategy saves a significant amount of memory as we can continue to use the atom pool for file/directory names and only increases each tree entry by an additional 24 bytes of memory. Branches should automatically delta against their ancestor tree, unless the ancestor tree is already at the delta chain limit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-28 18:22:50 +02:00			`t->delta_depth = 0;`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`return t;`
			`}`

			`static void release_tree_entry(struct tree_entry *e);`
			`static void release_tree_content(struct tree_content *t)`
			`{`
			`struct avail_tree_content f = (struct avail_tree_content)t;`
			`unsigned int hc = hc_entries(f->entry_capacity);`
Fixed segfault in fast-import after growing a tree. Growing a tree caused all subtrees to be deallocated and put back into the free list yet those subtree's contents were still actively in use. Consequently they were doled out again and got stomped on elsewhere. Releasing a tree is now performed in two parts, either releasing only the content array or releasing the content array and recursively releasing the subtree(s). Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-23 07:33:47 +02:00			`f->next_avail = avail_tree_table[hc];`
			`avail_tree_table[hc] = f;`
			`}`

			`static void release_tree_content_recursive(struct tree_content *t)`
			`{`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`unsigned int i;`
			`for (i = 0; i < t->entry_count; i++)`
			`release_tree_entry(t->entries[i]);`
Fixed segfault in fast-import after growing a tree. Growing a tree caused all subtrees to be deallocated and put back into the free list yet those subtree's contents were still actively in use. Consequently they were doled out again and got stomped on elsewhere. Releasing a tree is now performed in two parts, either releasing only the content array or releasing the content array and recursively releasing the subtree(s). Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-23 07:33:47 +02:00			`release_tree_content(t);`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`}`

Correct minor style issue in fast-import. Junio noticed that I was using a different style in fast-import for returned pointers than the rest of Git. Before merging this code into the main git.git tree I'd like to make it consistent, as this style variation was not intentional. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-06 06:43:59 +01:00			`static struct tree_content *grow_tree_content(`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`struct tree_content *t,`
			`int amt)`
			`{`
			`struct tree_content *r = new_tree_content(t->entry_count + amt);`
			`r->entry_count = t->entry_count;`
Implemented tree delta compression in fast-import. We now store for every tree entry two modes and two sha1 values; the base (aka "version 0") and the current/new (aka "version 1"). When we generate a tree object we also regenerate the prior version object and use that as our base object for a delta. This strategy saves a significant amount of memory as we can continue to use the atom pool for file/directory names and only increases each tree entry by an additional 24 bytes of memory. Branches should automatically delta against their ancestor tree, unless the ancestor tree is already at the delta chain limit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-28 18:22:50 +02:00			`r->delta_depth = t->delta_depth;`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`memcpy(r->entries,t->entries,t->entry_count*sizeof(t->entries[0]));`
			`release_tree_content(t);`
			`return r;`
			`}`

Correct minor style issue in fast-import. Junio noticed that I was using a different style in fast-import for returned pointers than the rest of Git. Before merging this code into the main git.git tree I'd like to make it consistent, as this style variation was not intentional. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-06 06:43:59 +01:00			`static struct tree_entry *new_tree_entry(void)`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`{`
			`struct tree_entry *e;`

			`if (!avail_tree_entry) {`
			`unsigned int n = tree_entry_alloc;`
Account for tree entry memory costs in fast-import. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-25 20:53:32 +02:00			`total_allocd += n * sizeof(struct tree_entry);`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`avail_tree_entry = e = xmalloc(n * sizeof(struct tree_entry));`
Fixed GPF in fast-import caused by unterminated linked list. fast-import was encounting a GPF when it ran out of free tree_entry objects but didn't know this was the cause because the last tree_entry wasn't terminated with a NULL pointer. The missing NULL pointer occurred when we allocated additional entries via xmalloc but didn't set the last tree_entry's "next" pointer to NULL. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-27 04:38:02 +02:00			`while (n-- > 1) {`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`((void*)e) = e + 1;`
			`e++;`
			`}`
Fixed compile error in fast-import. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-27 05:37:31 +02:00			`((void*)e) = NULL;`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`}`

			`e = avail_tree_entry;`
			`avail_tree_entry = ((void*)e);`
			`return e;`
			`}`

			`static void release_tree_entry(struct tree_entry *e)`
			`{`
			`if (e->tree)`
Fixed segfault in fast-import after growing a tree. Growing a tree caused all subtrees to be deallocated and put back into the free list yet those subtree's contents were still actively in use. Consequently they were doled out again and got stomped on elsewhere. Releasing a tree is now performed in two parts, either releasing only the content array or releasing the content array and recursively releasing the subtree(s). Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-23 07:33:47 +02:00			`release_tree_content_recursive(e->tree);`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`((void*)e) = avail_tree_entry;`
			`avail_tree_entry = e;`
			`}`

Teach fast-import to recursively copy files/directories Some source material (e.g. Subversion dump files) perform directory renames by telling us the directory was copied, then deleted in the same revision. This makes it difficult for a frontend to convert such data formats to a fast-import stream, as all the frontend has on hand is "Copy a/ to b/; Delete a/" with no details about what files are in a/, unless the frontend also kept track of all files. The new 'C' subcommand within a commit allows the frontend to make a recursive copy of one path to another path within the branch, without needing to keep track of the individual file paths. The metadata copy is performed in memory efficiently, but is implemented as a copy-immediately operation, rather than copy-on-write. With this new 'C' subcommand frontends could obviously implement an 'R' (rename) on their own as a combination of 'C' and 'D' (delete), but since we have already offered up 'R' in the past and it is a trivial thing to keep implemented I'm not going to deprecate it. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-07-15 07:40:37 +02:00			`static struct tree_content dup_tree_content(struct tree_content s)`
			`{`
			`struct tree_content *d;`
			`struct tree_entry a, b;`
			`unsigned int i;`

			`if (!s)`
			`return NULL;`
			`d = new_tree_content(s->entry_count);`
			`for (i = 0; i < s->entry_count; i++) {`
			`a = s->entries[i];`
			`b = new_tree_entry();`
			`memcpy(b, a, sizeof(*a));`
			`if (a->tree && is_null_sha1(b->versions[1].sha1))`
			`b->tree = dup_tree_content(a->tree);`
			`else`
			`b->tree = NULL;`
			`d->entries[i] = b;`
			`}`
			`d->entry_count = s->entry_count;`
			`d->delta_depth = s->delta_depth;`

			`return d;`
			`}`

Declare no-arg functions as (void) in fast-import. Apparently the git convention is to declare any function which takes no arguments as taking void. I did not do this during the early fast-import development, but should have. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-17 07:47:25 +01:00			`static void start_packfile(void)`
Restructure fast-import to support creating multiple packfiles. Now that we are starting to see some really large projects (such as KDE or a fork of FreeBSD) get imported into Git we're running into the upper limit on packfile object count as well as overall byte length. The KDE and FreeBSD projects are both likely to require more than 4 GiB to store their current history, which means we really need multiple packfiles to handle their content. This is a fairly simple restructuring of the internal code to help us support creating multiple packfiles from within fast-import. We are now adding a 5 digit incrementing suffix to the end of the basename supplied to us by the caller, permitting up to 99,999 packs to be generated in a single fast-import run. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 10:39:05 +01:00			`{`
Use .keep files in fast-import during processing. Because fast-import automatically updates all references (heads and tags) at the end of its run the repository is corrupt unless the objects are available in the .git/objects/pack directory prior to the refs being modified. The easiest way to ensure that is true is to move the packfile and its associated index directly into the .git/objects/pack directory as soon as we have finished output to it. But the only safe way to do this is to create the a temporary .keep file for that pack, so we use the same tricks that index-pack uses when its being invoked by receive-pack. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-16 07:15:31 +01:00			`static char tmpfile[PATH_MAX];`
Implemented manual packfile switching in fast-import. To help importers which are dealing with massive amounts of data fast-import needs to be able to close the packfile it is currently writing to and open a new packfile for any additional data that will be received. A new 'checkpoint' command has been introduced which can be used by the frontend import process to force this to occur at any time. This may be useful to ensure a very long running import doesn't lose any work due to unexpected failures. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 12:35:41 +01:00			`struct packed_git *p;`
Restructure fast-import to support creating multiple packfiles. Now that we are starting to see some really large projects (such as KDE or a fork of FreeBSD) get imported into Git we're running into the upper limit on packfile object count as well as overall byte length. The KDE and FreeBSD projects are both likely to require more than 4 GiB to store their current history, which means we really need multiple packfiles to handle their content. This is a fairly simple restructuring of the internal code to help us support creating multiple packfiles from within fast-import. We are now adding a 5 digit incrementing suffix to the end of the basename supplied to us by the caller, permitting up to 99,999 packs to be generated in a single fast-import run. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 10:39:05 +01:00			`struct pack_header hdr;`
Remove unnecessary pack_fd global in fast-import. Much like the pack_sha1 the pack_fd is an unnecessary global variable, we already have the fd stored in our struct packed_git *pack_data so that the core library functions in sha1_file.c are able to lookup and decompress object data that we have previously written. Keeping an extra copy of this value in our own variable is just a hold-over from earlier versions of fast-import and is now completely unnecessary. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-16 07:20:57 +01:00			`int pack_fd;`
Restructure fast-import to support creating multiple packfiles. Now that we are starting to see some really large projects (such as KDE or a fork of FreeBSD) get imported into Git we're running into the upper limit on packfile object count as well as overall byte length. The KDE and FreeBSD projects are both likely to require more than 4 GiB to store their current history, which means we really need multiple packfiles to handle their content. This is a fairly simple restructuring of the internal code to help us support creating multiple packfiles from within fast-import. We are now adding a 5 digit incrementing suffix to the end of the basename supplied to us by the caller, permitting up to 99,999 packs to be generated in a single fast-import run. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 10:39:05 +01:00
Use .keep files in fast-import during processing. Because fast-import automatically updates all references (heads and tags) at the end of its run the repository is corrupt unless the objects are available in the .git/objects/pack directory prior to the refs being modified. The easiest way to ensure that is true is to move the packfile and its associated index directly into the .git/objects/pack directory as soon as we have finished output to it. But the only safe way to do this is to create the a temporary .keep file for that pack, so we use the same tricks that index-pack uses when its being invoked by receive-pack. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-16 07:15:31 +01:00			`snprintf(tmpfile, sizeof(tmpfile),`
make it more obvious that temporary files are temporary files When some operations are interrupted (or "die()'d" or crashed) then the partial object/pack/index file may remain around. Make it more obvious in their name that those files are temporary stuff and can be cleaned up if no operation is in progress. Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-24 16:58:22 +01:00			`"%s/tmp_pack_XXXXXX", get_object_directory());`
Use xmkstemp() instead of mkstemp() xmkstemp() performs error checking and prints a standard error message when an error occur. Signed-off-by: Luiz Fernando N. Capitulino <lcapitulino@mandriva.com.br> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-08-14 21:45:58 +02:00			`pack_fd = xmkstemp(tmpfile);`
Use .keep files in fast-import during processing. Because fast-import automatically updates all references (heads and tags) at the end of its run the repository is corrupt unless the objects are available in the .git/objects/pack directory prior to the refs being modified. The easiest way to ensure that is true is to move the packfile and its associated index directly into the .git/objects/pack directory as soon as we have finished output to it. But the only safe way to do this is to create the a temporary .keep file for that pack, so we use the same tricks that index-pack uses when its being invoked by receive-pack. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-16 07:15:31 +01:00			`p = xcalloc(1, sizeof(*p) + strlen(tmpfile) + 2);`
			`strcpy(p->pack_name, tmpfile);`
Implemented manual packfile switching in fast-import. To help importers which are dealing with massive amounts of data fast-import needs to be able to close the packfile it is currently writing to and open a new packfile for any additional data that will be received. A new 'checkpoint' command has been introduced which can be used by the frontend import process to force this to occur at any time. This may be useful to ensure a very long running import doesn't lose any work due to unexpected failures. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 12:35:41 +01:00			`p->pack_fd = pack_fd;`
Restructure fast-import to support creating multiple packfiles. Now that we are starting to see some really large projects (such as KDE or a fork of FreeBSD) get imported into Git we're running into the upper limit on packfile object count as well as overall byte length. The KDE and FreeBSD projects are both likely to require more than 4 GiB to store their current history, which means we really need multiple packfiles to handle their content. This is a fairly simple restructuring of the internal code to help us support creating multiple packfiles from within fast-import. We are now adding a 5 digit incrementing suffix to the end of the basename supplied to us by the caller, permitting up to 99,999 packs to be generated in a single fast-import run. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 10:39:05 +01:00
			`hdr.hdr_signature = htonl(PACK_SIGNATURE);`
			`hdr.hdr_version = htonl(2);`
			`hdr.hdr_entries = 0;`
Remove unnecessary pack_fd global in fast-import. Much like the pack_sha1 the pack_fd is an unnecessary global variable, we already have the fd stored in our struct packed_git *pack_data so that the core library functions in sha1_file.c are able to lookup and decompress object data that we have previously written. Keeping an extra copy of this value in our own variable is just a hold-over from earlier versions of fast-import and is now completely unnecessary. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-16 07:20:57 +01:00			`write_or_die(p->pack_fd, &hdr, sizeof(hdr));`
Implemented manual packfile switching in fast-import. To help importers which are dealing with massive amounts of data fast-import needs to be able to close the packfile it is currently writing to and open a new packfile for any additional data that will be received. A new 'checkpoint' command has been introduced which can be used by the frontend import process to force this to occur at any time. This may be useful to ensure a very long running import doesn't lose any work due to unexpected failures. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 12:35:41 +01:00
			`pack_data = p;`
Restructure fast-import to support creating multiple packfiles. Now that we are starting to see some really large projects (such as KDE or a fork of FreeBSD) get imported into Git we're running into the upper limit on packfile object count as well as overall byte length. The KDE and FreeBSD projects are both likely to require more than 4 GiB to store their current history, which means we really need multiple packfiles to handle their content. This is a fairly simple restructuring of the internal code to help us support creating multiple packfiles from within fast-import. We are now adding a 5 digit incrementing suffix to the end of the basename supplied to us by the caller, permitting up to 99,999 packs to be generated in a single fast-import run. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 10:39:05 +01:00			`pack_size = sizeof(hdr);`
			`object_count = 0;`
Implemented manual packfile switching in fast-import. To help importers which are dealing with massive amounts of data fast-import needs to be able to close the packfile it is currently writing to and open a new packfile for any additional data that will be received. A new 'checkpoint' command has been introduced which can be used by the frontend import process to force this to occur at any time. This may be useful to ensure a very long running import doesn't lose any work due to unexpected failures. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 12:35:41 +01:00
			`all_packs = xrealloc(all_packs, sizeof(all_packs) (pack_id + 1));`
			`all_packs[pack_id] = p;`
Restructure fast-import to support creating multiple packfiles. Now that we are starting to see some really large projects (such as KDE or a fork of FreeBSD) get imported into Git we're running into the upper limit on packfile object count as well as overall byte length. The KDE and FreeBSD projects are both likely to require more than 4 GiB to store their current history, which means we really need multiple packfiles to handle their content. This is a fairly simple restructuring of the internal code to help us support creating multiple packfiles from within fast-import. We are now adding a 5 digit incrementing suffix to the end of the basename supplied to us by the caller, permitting up to 99,999 packs to be generated in a single fast-import run. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 10:39:05 +01:00			`}`

			`static int oecmp (const void a_, const void b_)`
			`{`
			`struct object_entry a = ((struct object_entry**)a_);`
			`struct object_entry b = ((struct object_entry**)b_);`
			`return hashcmp(a->sha1, b->sha1);`
			`}`

Correct minor style issue in fast-import. Junio noticed that I was using a different style in fast-import for returned pointers than the rest of Git. Before merging this code into the main git.git tree I'd like to make it consistent, as this style variation was not intentional. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-06 06:43:59 +01:00			`static char *create_index(void)`
Restructure fast-import to support creating multiple packfiles. Now that we are starting to see some really large projects (such as KDE or a fork of FreeBSD) get imported into Git we're running into the upper limit on packfile object count as well as overall byte length. The KDE and FreeBSD projects are both likely to require more than 4 GiB to store their current history, which means we really need multiple packfiles to handle their content. This is a fairly simple restructuring of the internal code to help us support creating multiple packfiles from within fast-import. We are now adding a 5 digit incrementing suffix to the end of the basename supplied to us by the caller, permitting up to 99,999 packs to be generated in a single fast-import run. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 10:39:05 +01:00			`{`
Use .keep files in fast-import during processing. Because fast-import automatically updates all references (heads and tags) at the end of its run the repository is corrupt unless the objects are available in the .git/objects/pack directory prior to the refs being modified. The easiest way to ensure that is true is to move the packfile and its associated index directly into the .git/objects/pack directory as soon as we have finished output to it. But the only safe way to do this is to create the a temporary .keep file for that pack, so we use the same tricks that index-pack uses when its being invoked by receive-pack. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-16 07:15:31 +01:00			`static char tmpfile[PATH_MAX];`
			`SHA_CTX ctx;`
Restructure fast-import to support creating multiple packfiles. Now that we are starting to see some really large projects (such as KDE or a fork of FreeBSD) get imported into Git we're running into the upper limit on packfile object count as well as overall byte length. The KDE and FreeBSD projects are both likely to require more than 4 GiB to store their current history, which means we really need multiple packfiles to handle their content. This is a fairly simple restructuring of the internal code to help us support creating multiple packfiles from within fast-import. We are now adding a 5 digit incrementing suffix to the end of the basename supplied to us by the caller, permitting up to 99,999 packs to be generated in a single fast-import run. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 10:39:05 +01:00			`struct sha1file *f;`
			`struct object_entry idx, c, *last, e;`
			`struct object_entry_pool *o;`
Use fixed-size integers when writing out the index in fast-import. Currently the pack .idx file format uses 32-bit unsigned integers for the fan-out table and the object offsets. We had previously defined these as 'unsigned int', but not every system will define that type to be a 32 bit value. To ensure maximum portability we should always use 'uint32_t'. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-18 17:30:17 +01:00			`uint32_t array[256];`
Use .keep files in fast-import during processing. Because fast-import automatically updates all references (heads and tags) at the end of its run the repository is corrupt unless the objects are available in the .git/objects/pack directory prior to the refs being modified. The easiest way to ensure that is true is to move the packfile and its associated index directly into the .git/objects/pack directory as soon as we have finished output to it. But the only safe way to do this is to create the a temporary .keep file for that pack, so we use the same tricks that index-pack uses when its being invoked by receive-pack. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-16 07:15:31 +01:00			`int i, idx_fd;`
Restructure fast-import to support creating multiple packfiles. Now that we are starting to see some really large projects (such as KDE or a fork of FreeBSD) get imported into Git we're running into the upper limit on packfile object count as well as overall byte length. The KDE and FreeBSD projects are both likely to require more than 4 GiB to store their current history, which means we really need multiple packfiles to handle their content. This is a fairly simple restructuring of the internal code to help us support creating multiple packfiles from within fast-import. We are now adding a 5 digit incrementing suffix to the end of the basename supplied to us by the caller, permitting up to 99,999 packs to be generated in a single fast-import run. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 10:39:05 +01:00
			`/* Build the sorted table of object IDs. */`
			`idx = xmalloc(object_count * sizeof(struct object_entry*));`
			`c = idx;`
			`for (o = blocks; o; o = o->next_pool)`
Implemented automatic checkpoints within fast-import. When the number of objects or number of bytes gets close to the limit allowed by the packfile format (or configured on the command line by our caller) we should automatically checkpoint the current packfile and start a new one before writing the object out. This does however require that we abandon the delta (if we had one) as its not valid in a new packfile. I also added the simple rule that if we got a delta back but the delta itself is the same size as or larger than the uncompressed object to ignore the delta and just store the object data. This should avoid some really bad behavior caused by our current delta strategy. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 14:00:49 +01:00			`for (e = o->next_free; e-- != o->entries;)`
			`if (pack_id == e->pack_id)`
			`*c++ = e;`
Restructure fast-import to support creating multiple packfiles. Now that we are starting to see some really large projects (such as KDE or a fork of FreeBSD) get imported into Git we're running into the upper limit on packfile object count as well as overall byte length. The KDE and FreeBSD projects are both likely to require more than 4 GiB to store their current history, which means we really need multiple packfiles to handle their content. This is a fairly simple restructuring of the internal code to help us support creating multiple packfiles from within fast-import. We are now adding a 5 digit incrementing suffix to the end of the basename supplied to us by the caller, permitting up to 99,999 packs to be generated in a single fast-import run. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 10:39:05 +01:00			`last = idx + object_count;`
Optimize index creation on large object sets in fast-import. When we are generating multiple packfiles at once we only need to scan the blocks of object_entry structs which contain objects for the current packfile. Because the most recent blocks are at the front of the linked list, and because all new objects going into the current file are allocated from the front of that list, we can stop scanning for objects as soon as we identify one which doesn't belong to the current packfile. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 12:51:58 +01:00			`if (c != last)`
			`die("internal consistency error creating the index");`
Restructure fast-import to support creating multiple packfiles. Now that we are starting to see some really large projects (such as KDE or a fork of FreeBSD) get imported into Git we're running into the upper limit on packfile object count as well as overall byte length. The KDE and FreeBSD projects are both likely to require more than 4 GiB to store their current history, which means we really need multiple packfiles to handle their content. This is a fairly simple restructuring of the internal code to help us support creating multiple packfiles from within fast-import. We are now adding a 5 digit incrementing suffix to the end of the basename supplied to us by the caller, permitting up to 99,999 packs to be generated in a single fast-import run. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 10:39:05 +01:00			`qsort(idx, object_count, sizeof(struct object_entry*), oecmp);`

			`/* Generate the fan-out array. */`
			`c = idx;`
			`for (i = 0; i < 256; i++) {`
			`struct object_entry **next = c;;`
			`while (next < last) {`
			`if ((*next)->sha1[0] != i)`
			`break;`
			`next++;`
			`}`
			`array[i] = htonl(next - idx);`
			`c = next;`
			`}`

Use .keep files in fast-import during processing. Because fast-import automatically updates all references (heads and tags) at the end of its run the repository is corrupt unless the objects are available in the .git/objects/pack directory prior to the refs being modified. The easiest way to ensure that is true is to move the packfile and its associated index directly into the .git/objects/pack directory as soon as we have finished output to it. But the only safe way to do this is to create the a temporary .keep file for that pack, so we use the same tricks that index-pack uses when its being invoked by receive-pack. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-16 07:15:31 +01:00			`snprintf(tmpfile, sizeof(tmpfile),`
make it more obvious that temporary files are temporary files When some operations are interrupted (or "die()'d" or crashed) then the partial object/pack/index file may remain around. Make it more obvious in their name that those files are temporary stuff and can be cleaned up if no operation is in progress. Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-24 16:58:22 +01:00			`"%s/tmp_idx_XXXXXX", get_object_directory());`
Use xmkstemp() instead of mkstemp() xmkstemp() performs error checking and prints a standard error message when an error occur. Signed-off-by: Luiz Fernando N. Capitulino <lcapitulino@mandriva.com.br> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-08-14 21:45:58 +02:00			`idx_fd = xmkstemp(tmpfile);`
Use .keep files in fast-import during processing. Because fast-import automatically updates all references (heads and tags) at the end of its run the repository is corrupt unless the objects are available in the .git/objects/pack directory prior to the refs being modified. The easiest way to ensure that is true is to move the packfile and its associated index directly into the .git/objects/pack directory as soon as we have finished output to it. But the only safe way to do this is to create the a temporary .keep file for that pack, so we use the same tricks that index-pack uses when its being invoked by receive-pack. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-16 07:15:31 +01:00			`f = sha1fd(idx_fd, tmpfile);`
Restructure fast-import to support creating multiple packfiles. Now that we are starting to see some really large projects (such as KDE or a fork of FreeBSD) get imported into Git we're running into the upper limit on packfile object count as well as overall byte length. The KDE and FreeBSD projects are both likely to require more than 4 GiB to store their current history, which means we really need multiple packfiles to handle their content. This is a fairly simple restructuring of the internal code to help us support creating multiple packfiles from within fast-import. We are now adding a 5 digit incrementing suffix to the end of the basename supplied to us by the caller, permitting up to 99,999 packs to be generated in a single fast-import run. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 10:39:05 +01:00			`sha1write(f, array, 256 * sizeof(int));`
Use .keep files in fast-import during processing. Because fast-import automatically updates all references (heads and tags) at the end of its run the repository is corrupt unless the objects are available in the .git/objects/pack directory prior to the refs being modified. The easiest way to ensure that is true is to move the packfile and its associated index directly into the .git/objects/pack directory as soon as we have finished output to it. But the only safe way to do this is to create the a temporary .keep file for that pack, so we use the same tricks that index-pack uses when its being invoked by receive-pack. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-16 07:15:31 +01:00			`SHA1_Init(&ctx);`
Restructure fast-import to support creating multiple packfiles. Now that we are starting to see some really large projects (such as KDE or a fork of FreeBSD) get imported into Git we're running into the upper limit on packfile object count as well as overall byte length. The KDE and FreeBSD projects are both likely to require more than 4 GiB to store their current history, which means we really need multiple packfiles to handle their content. This is a fairly simple restructuring of the internal code to help us support creating multiple packfiles from within fast-import. We are now adding a 5 digit incrementing suffix to the end of the basename supplied to us by the caller, permitting up to 99,999 packs to be generated in a single fast-import run. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 10:39:05 +01:00			`for (c = idx; c != last; c++) {`
Use fixed-size integers when writing out the index in fast-import. Currently the pack .idx file format uses 32-bit unsigned integers for the fan-out table and the object offsets. We had previously defined these as 'unsigned int', but not every system will define that type to be a 32 bit value. To ensure maximum portability we should always use 'uint32_t'. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-18 17:30:17 +01:00			`uint32_t offset = htonl((*c)->offset);`
Restructure fast-import to support creating multiple packfiles. Now that we are starting to see some really large projects (such as KDE or a fork of FreeBSD) get imported into Git we're running into the upper limit on packfile object count as well as overall byte length. The KDE and FreeBSD projects are both likely to require more than 4 GiB to store their current history, which means we really need multiple packfiles to handle their content. This is a fairly simple restructuring of the internal code to help us support creating multiple packfiles from within fast-import. We are now adding a 5 digit incrementing suffix to the end of the basename supplied to us by the caller, permitting up to 99,999 packs to be generated in a single fast-import run. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 10:39:05 +01:00			`sha1write(f, &offset, 4);`
			`sha1write(f, (c)->sha1, sizeof((c)->sha1));`
Use .keep files in fast-import during processing. Because fast-import automatically updates all references (heads and tags) at the end of its run the repository is corrupt unless the objects are available in the .git/objects/pack directory prior to the refs being modified. The easiest way to ensure that is true is to move the packfile and its associated index directly into the .git/objects/pack directory as soon as we have finished output to it. But the only safe way to do this is to create the a temporary .keep file for that pack, so we use the same tricks that index-pack uses when its being invoked by receive-pack. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-16 07:15:31 +01:00			`SHA1_Update(&ctx, (*c)->sha1, 20);`
Restructure fast-import to support creating multiple packfiles. Now that we are starting to see some really large projects (such as KDE or a fork of FreeBSD) get imported into Git we're running into the upper limit on packfile object count as well as overall byte length. The KDE and FreeBSD projects are both likely to require more than 4 GiB to store their current history, which means we really need multiple packfiles to handle their content. This is a fairly simple restructuring of the internal code to help us support creating multiple packfiles from within fast-import. We are now adding a 5 digit incrementing suffix to the end of the basename supplied to us by the caller, permitting up to 99,999 packs to be generated in a single fast-import run. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 10:39:05 +01:00			`}`
Reuse sha1 in packed_git in fast-import. Rather than maintaing our own packfile level sha1 variable we can make use of the one already available in struct packed_git. Its meant for the SHA1 of the index but it can also hold the SHA1 of the packfile itself between final checksumming of the packfile and creation of the index. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-16 06:44:48 +01:00			`sha1write(f, pack_data->sha1, sizeof(pack_data->sha1));`
Use .keep files in fast-import during processing. Because fast-import automatically updates all references (heads and tags) at the end of its run the repository is corrupt unless the objects are available in the .git/objects/pack directory prior to the refs being modified. The easiest way to ensure that is true is to move the packfile and its associated index directly into the .git/objects/pack directory as soon as we have finished output to it. But the only safe way to do this is to create the a temporary .keep file for that pack, so we use the same tricks that index-pack uses when its being invoked by receive-pack. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-16 07:15:31 +01:00			`sha1close(f, NULL, 1);`
Restructure fast-import to support creating multiple packfiles. Now that we are starting to see some really large projects (such as KDE or a fork of FreeBSD) get imported into Git we're running into the upper limit on packfile object count as well as overall byte length. The KDE and FreeBSD projects are both likely to require more than 4 GiB to store their current history, which means we really need multiple packfiles to handle their content. This is a fairly simple restructuring of the internal code to help us support creating multiple packfiles from within fast-import. We are now adding a 5 digit incrementing suffix to the end of the basename supplied to us by the caller, permitting up to 99,999 packs to be generated in a single fast-import run. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 10:39:05 +01:00			`free(idx);`
Use .keep files in fast-import during processing. Because fast-import automatically updates all references (heads and tags) at the end of its run the repository is corrupt unless the objects are available in the .git/objects/pack directory prior to the refs being modified. The easiest way to ensure that is true is to move the packfile and its associated index directly into the .git/objects/pack directory as soon as we have finished output to it. But the only safe way to do this is to create the a temporary .keep file for that pack, so we use the same tricks that index-pack uses when its being invoked by receive-pack. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-16 07:15:31 +01:00			`SHA1_Final(pack_data->sha1, &ctx);`
			`return tmpfile;`
			`}`

Correct minor style issue in fast-import. Junio noticed that I was using a different style in fast-import for returned pointers than the rest of Git. Before merging this code into the main git.git tree I'd like to make it consistent, as this style variation was not intentional. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-06 06:43:59 +01:00			`static char keep_pack(char curr_index_name)`
Use .keep files in fast-import during processing. Because fast-import automatically updates all references (heads and tags) at the end of its run the repository is corrupt unless the objects are available in the .git/objects/pack directory prior to the refs being modified. The easiest way to ensure that is true is to move the packfile and its associated index directly into the .git/objects/pack directory as soon as we have finished output to it. But the only safe way to do this is to create the a temporary .keep file for that pack, so we use the same tricks that index-pack uses when its being invoked by receive-pack. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-16 07:15:31 +01:00			`{`
			`static char name[PATH_MAX];`
General const correctness fixes We shouldn't attempt to assign constant strings into char*, as the string is not writable at runtime. Likewise we should always be treating unsigned values as unsigned values, not as signed values. Most of these are very straightforward. The only exception is the (unnecessary) xstrdup/free in builtin-branch.c for the detached head case. Since this is a user-level interactive type program and that particular code path is executed no more than once, I feel that the extra xstrdup call is well worth the easy elimination of this warning. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-07 02:44:17 +01:00			`static const char *keep_msg = "fast-import";`
Use .keep files in fast-import during processing. Because fast-import automatically updates all references (heads and tags) at the end of its run the repository is corrupt unless the objects are available in the .git/objects/pack directory prior to the refs being modified. The easiest way to ensure that is true is to move the packfile and its associated index directly into the .git/objects/pack directory as soon as we have finished output to it. But the only safe way to do this is to create the a temporary .keep file for that pack, so we use the same tricks that index-pack uses when its being invoked by receive-pack. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-16 07:15:31 +01:00			`int keep_fd;`

			`chmod(pack_data->pack_name, 0444);`
			`chmod(curr_index_name, 0444);`

			`snprintf(name, sizeof(name), "%s/pack/pack-%s.keep",`
			`get_object_directory(), sha1_to_hex(pack_data->sha1));`
			`keep_fd = open(name, O_RDWR\|O_CREAT\|O_EXCL, 0600);`
			`if (keep_fd < 0)`
			`die("cannot create keep file");`
bundle, fast-import: detect write failure I noticed some unchecked writes. This fixes them. * bundle.c (create_bundle): Die upon write failure. * fast-import.c (keep_pack): Die upon write or close failure. Signed-off-by: Jim Meyering <meyering@redhat.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-01-10 09:54:25 +01:00			`write_or_die(keep_fd, keep_msg, strlen(keep_msg));`
			`if (close(keep_fd))`
			`die("failed to write keep file");`
Use .keep files in fast-import during processing. Because fast-import automatically updates all references (heads and tags) at the end of its run the repository is corrupt unless the objects are available in the .git/objects/pack directory prior to the refs being modified. The easiest way to ensure that is true is to move the packfile and its associated index directly into the .git/objects/pack directory as soon as we have finished output to it. But the only safe way to do this is to create the a temporary .keep file for that pack, so we use the same tricks that index-pack uses when its being invoked by receive-pack. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-16 07:15:31 +01:00
			`snprintf(name, sizeof(name), "%s/pack/pack-%s.pack",`
			`get_object_directory(), sha1_to_hex(pack_data->sha1));`
			`if (move_temp_to_file(pack_data->pack_name, name))`
			`die("cannot store pack file");`

			`snprintf(name, sizeof(name), "%s/pack/pack-%s.idx",`
			`get_object_directory(), sha1_to_hex(pack_data->sha1));`
			`if (move_temp_to_file(curr_index_name, name))`
			`die("cannot store index file");`
			`return name;`
			`}`

Declare no-arg functions as (void) in fast-import. Apparently the git convention is to declare any function which takes no arguments as taking void. I did not do this during the early fast-import development, but should have. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-17 07:47:25 +01:00			`static void unkeep_all_packs(void)`
Use .keep files in fast-import during processing. Because fast-import automatically updates all references (heads and tags) at the end of its run the repository is corrupt unless the objects are available in the .git/objects/pack directory prior to the refs being modified. The easiest way to ensure that is true is to move the packfile and its associated index directly into the .git/objects/pack directory as soon as we have finished output to it. But the only safe way to do this is to create the a temporary .keep file for that pack, so we use the same tricks that index-pack uses when its being invoked by receive-pack. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-16 07:15:31 +01:00			`{`
			`static char name[PATH_MAX];`
			`int k;`

			`for (k = 0; k < pack_id; k++) {`
			`struct packed_git *p = all_packs[k];`
			`snprintf(name, sizeof(name), "%s/pack/pack-%s.keep",`
			`get_object_directory(), sha1_to_hex(p->sha1));`
			`unlink(name);`
			`}`
Restructure fast-import to support creating multiple packfiles. Now that we are starting to see some really large projects (such as KDE or a fork of FreeBSD) get imported into Git we're running into the upper limit on packfile object count as well as overall byte length. The KDE and FreeBSD projects are both likely to require more than 4 GiB to store their current history, which means we really need multiple packfiles to handle their content. This is a fairly simple restructuring of the internal code to help us support creating multiple packfiles from within fast-import. We are now adding a 5 digit incrementing suffix to the end of the basename supplied to us by the caller, permitting up to 99,999 packs to be generated in a single fast-import run. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 10:39:05 +01:00			`}`

Declare no-arg functions as (void) in fast-import. Apparently the git convention is to declare any function which takes no arguments as taking void. I did not do this during the early fast-import development, but should have. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-17 07:47:25 +01:00			`static void end_packfile(void)`
Restructure fast-import to support creating multiple packfiles. Now that we are starting to see some really large projects (such as KDE or a fork of FreeBSD) get imported into Git we're running into the upper limit on packfile object count as well as overall byte length. The KDE and FreeBSD projects are both likely to require more than 4 GiB to store their current history, which means we really need multiple packfiles to handle their content. This is a fairly simple restructuring of the internal code to help us support creating multiple packfiles from within fast-import. We are now adding a 5 digit incrementing suffix to the end of the basename supplied to us by the caller, permitting up to 99,999 packs to be generated in a single fast-import run. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 10:39:05 +01:00			`{`
Implemented manual packfile switching in fast-import. To help importers which are dealing with massive amounts of data fast-import needs to be able to close the packfile it is currently writing to and open a new packfile for any additional data that will be received. A new 'checkpoint' command has been introduced which can be used by the frontend import process to force this to occur at any time. This may be useful to ensure a very long running import doesn't lose any work due to unexpected failures. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 12:35:41 +01:00			`struct packed_git old_p = pack_data, new_p;`

Don't create a final empty packfile in fast-import. If the last packfile is going to be empty (has 0 objects) then it shouldn't be kept after the import has terminated, as there is no point to the packfile. So rather than hashing it and making the index file, just delete the packfile. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 12:39:39 +01:00			`if (object_count) {`
Use .keep files in fast-import during processing. Because fast-import automatically updates all references (heads and tags) at the end of its run the repository is corrupt unless the objects are available in the .git/objects/pack directory prior to the refs being modified. The easiest way to ensure that is true is to move the packfile and its associated index directly into the .git/objects/pack directory as soon as we have finished output to it. But the only safe way to do this is to create the a temporary .keep file for that pack, so we use the same tricks that index-pack uses when its being invoked by receive-pack. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-16 07:15:31 +01:00			`char *idx_name;`
Print out the edge commits for each packfile in fast-import. To help callers repack very large repositories into a series of packfiles fast-import now outputs the last commits/tags it wrote to a packfile when it prints out the packfile name. This information can be feed to pack-objects --revs to repack. For the first pack of an initial import this is pretty easy (just feed those SHA1s on stdin) but for subsequent packs you want to feed the subsequent pack's final SHA1s but also all prior pack's SHA1s prefixed with the negation operator. This way the prior pack's data does not get included into the subsequent pack. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-16 22:18:44 +01:00			`int i;`
			`struct branch *b;`
			`struct tag *t;`
Use .keep files in fast-import during processing. Because fast-import automatically updates all references (heads and tags) at the end of its run the repository is corrupt unless the objects are available in the .git/objects/pack directory prior to the refs being modified. The easiest way to ensure that is true is to move the packfile and its associated index directly into the .git/objects/pack directory as soon as we have finished output to it. But the only safe way to do this is to create the a temporary .keep file for that pack, so we use the same tricks that index-pack uses when its being invoked by receive-pack. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-16 07:15:31 +01:00
Fix random fast-import errors when compiled with NO_MMAP fast-import was relying on the fact that on most systems mmap() and write() are synchronized by the filesystem's buffer cache. We were relying on the ability to mmap() 20 bytes beyond the current end of the file, then later fill in those bytes with a future write() call, then read them through the previously obtained mmap() address. This isn't always true with some implementations of NFS, but it is especially not true with our NO_MMAP=YesPlease build time option used on some platforms. If fast-import was built with NO_MMAP=YesPlease we used the malloc()+pread() emulation and the subsequent write() call does not update the trailing 20 bytes of a previously obtained "mmap()" (aka malloc'd) address. Under NO_MMAP that behavior causes unpack_entry() in sha1_file.c to be unable to read an object header (or data) that has been unlucky enough to be written to the packfile at a location such that it is in the trailing 20 bytes of a window previously opened on that same packfile. This bug has gone unnoticed for a very long time as it is highly data dependent. Not only does the object have to be placed at the right position, but it also needs to be positioned behind some other object that has been accessed due to a branch cache invalidation. In other words the stars had to align just right, and if you did run into this bug you probably should also have purchased a lottery ticket. Fortunately the workaround is a lot easier than the bug explanation. Before we allow unpack_entry() to read data from a pack window that has also (possibly) been modified through write() we force all existing windows on that packfile to be closed. By closing the windows we ensure that any new access via the emulated mmap() will reread the packfile, updating to the current file content. This comes at a slight performance degredation as we cannot reuse previously cached windows when we update the packfile. But it is a fairly minor difference as the window closes happen at only two points: - When the packfile is finalized and its .idx is generated: At this stage we are getting ready to update the refs and any data access into the packfile is going to be random, and is going after only the branch tips (to ensure they are valid). Our existing windows (if any) are not likely to be positioned at useful locations to access those final tip commits so we probably were closing them before anyway. - When the branch cache missed and we need to reload: At this point fast-import is getting change commands for the next commit and it needs to go re-read a tree object it previously had written out to the packfile. What windows we had (if any) are not likely to cover the tree in question so we probably were closing them before anyway. We do try to avoid unnecessarily closing windows in the second case by checking to see if the packfile size has increased since the last time we called unpack_entry() on that packfile. If the size has not changed then we have not written additional data, and any existing window is still vaild. This nicely handles the cases where fast-import is going through a branch cache reload and needs to read many trees at once. During such an event we are not likely to be updating the packfile so we do not cycle the windows between reads. With this change in place t9301-fast-export.sh (which was broken by c3b0dec509fe136c5417422f31898b5a4e2d5e02) finally works again. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-01-18 04:57:00 +01:00			`close_pack_windows(pack_data);`
Create pack-write.c for common pack writing code Include a generalized fixup_pack_header_footer() in this new file. Needed by git-repack --max-pack-size feature in a later patchset. [sp: Moved close(pack_fd) to callers, to support index-pack, and changed name to better indicate it is for packfiles.] Signed-off-by: Dana L. How <danahow@gmail.com> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-05-02 18:13:14 +02:00			`fixup_pack_header_footer(pack_data->pack_fd, pack_data->sha1,`
			`pack_data->pack_name, object_count);`
			`close(pack_data->pack_fd);`
Use .keep files in fast-import during processing. Because fast-import automatically updates all references (heads and tags) at the end of its run the repository is corrupt unless the objects are available in the .git/objects/pack directory prior to the refs being modified. The easiest way to ensure that is true is to move the packfile and its associated index directly into the .git/objects/pack directory as soon as we have finished output to it. But the only safe way to do this is to create the a temporary .keep file for that pack, so we use the same tricks that index-pack uses when its being invoked by receive-pack. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-16 07:15:31 +01:00			`idx_name = keep_pack(create_index());`
Don't create a final empty packfile in fast-import. If the last packfile is going to be empty (has 0 objects) then it shouldn't be kept after the import has terminated, as there is no point to the packfile. So rather than hashing it and making the index file, just delete the packfile. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 12:39:39 +01:00
			`/* Register the packfile with core git's machinary. */`
			`new_p = add_packed_git(idx_name, strlen(idx_name), 1);`
			`if (!new_p)`
			`die("core git rejected index %s", idx_name);`
Print out the edge commits for each packfile in fast-import. To help callers repack very large repositories into a series of packfiles fast-import now outputs the last commits/tags it wrote to a packfile when it prints out the packfile name. This information can be feed to pack-objects --revs to repack. For the first pack of an initial import this is pretty easy (just feed those SHA1s on stdin) but for subsequent packs you want to feed the subsequent pack's final SHA1s but also all prior pack's SHA1s prefixed with the negation operator. This way the prior pack's data does not get included into the subsequent pack. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-16 22:18:44 +01:00			`all_packs[pack_id] = new_p;`
Don't create a final empty packfile in fast-import. If the last packfile is going to be empty (has 0 objects) then it shouldn't be kept after the import has terminated, as there is no point to the packfile. So rather than hashing it and making the index file, just delete the packfile. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 12:39:39 +01:00			`install_packed_git(new_p);`
Print out the edge commits for each packfile in fast-import. To help callers repack very large repositories into a series of packfiles fast-import now outputs the last commits/tags it wrote to a packfile when it prints out the packfile name. This information can be feed to pack-objects --revs to repack. For the first pack of an initial import this is pretty easy (just feed those SHA1s on stdin) but for subsequent packs you want to feed the subsequent pack's final SHA1s but also all prior pack's SHA1s prefixed with the negation operator. This way the prior pack's data does not get included into the subsequent pack. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-16 22:18:44 +01:00
			`/* Print the boundary */`
fast-import: Hide the pack boundary commits by default. Most users don't need the pack boundary information that fast-import was printing to standard output, especially if they were calling it with --quiet. Those users who do want this information probably want it captured so they can go back and use it to repack the imported repository. So dumping the boundary commits to a log file makes more sense then printing them to standard output. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-12 01:45:56 +01:00			`if (pack_edges) {`
			`fprintf(pack_edges, "%s:", new_p->pack_name);`
			`for (i = 0; i < branch_table_sz; i++) {`
			`for (b = branch_table[i]; b; b = b->table_next_branch) {`
			`if (b->pack_id == pack_id)`
			`fprintf(pack_edges, " %s", sha1_to_hex(b->sha1));`
			`}`
Print out the edge commits for each packfile in fast-import. To help callers repack very large repositories into a series of packfiles fast-import now outputs the last commits/tags it wrote to a packfile when it prints out the packfile name. This information can be feed to pack-objects --revs to repack. For the first pack of an initial import this is pretty easy (just feed those SHA1s on stdin) but for subsequent packs you want to feed the subsequent pack's final SHA1s but also all prior pack's SHA1s prefixed with the negation operator. This way the prior pack's data does not get included into the subsequent pack. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-16 22:18:44 +01:00			`}`
fast-import: Hide the pack boundary commits by default. Most users don't need the pack boundary information that fast-import was printing to standard output, especially if they were calling it with --quiet. Those users who do want this information probably want it captured so they can go back and use it to repack the imported repository. So dumping the boundary commits to a log file makes more sense then printing them to standard output. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-12 01:45:56 +01:00			`for (t = first_tag; t; t = t->next_tag) {`
			`if (t->pack_id == pack_id)`
			`fprintf(pack_edges, " %s", sha1_to_hex(t->sha1));`
			`}`
			`fputc('\n', pack_edges);`
			`fflush(pack_edges);`
Print out the edge commits for each packfile in fast-import. To help callers repack very large repositories into a series of packfiles fast-import now outputs the last commits/tags it wrote to a packfile when it prints out the packfile name. This information can be feed to pack-objects --revs to repack. For the first pack of an initial import this is pretty easy (just feed those SHA1s on stdin) but for subsequent packs you want to feed the subsequent pack's final SHA1s but also all prior pack's SHA1s prefixed with the negation operator. This way the prior pack's data does not get included into the subsequent pack. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-16 22:18:44 +01:00			`}`

			`pack_id++;`
Don't create a final empty packfile in fast-import. If the last packfile is going to be empty (has 0 objects) then it shouldn't be kept after the import has terminated, as there is no point to the packfile. So rather than hashing it and making the index file, just delete the packfile. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 12:39:39 +01:00			`}`
Ensure we close the packfile after creating it in fast-import. Because we are renaming the packfile into its file destination we need to be sure its not open when the rename is called, otherwise some operating systems (e.g. Windows) may prevent the rename from occurring. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-16 07:17:47 +01:00			`else`
Don't create a final empty packfile in fast-import. If the last packfile is going to be empty (has 0 objects) then it shouldn't be kept after the import has terminated, as there is no point to the packfile. So rather than hashing it and making the index file, just delete the packfile. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 12:39:39 +01:00			`unlink(old_p->pack_name);`
Implemented manual packfile switching in fast-import. To help importers which are dealing with massive amounts of data fast-import needs to be able to close the packfile it is currently writing to and open a new packfile for any additional data that will be received. A new 'checkpoint' command has been introduced which can be used by the frontend import process to force this to occur at any time. This may be useful to ensure a very long running import doesn't lose any work due to unexpected failures. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 12:35:41 +01:00			`free(old_p);`

			`/* We can't carry a delta across packfiles. */`
fast-import optimization: Now that cmd_data acts on a strbuf, make last_object stashed buffer be a strbuf as well. On new stash, don't free the last stashed buffer, rather swap it with the one you will stash, this way, callers of store_object can act on static strbufs, and at some point, fast-import won't allocate new memory for objects buffers. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-17 14:00:38 +02:00			`strbuf_release(&last_blob.data);`
Implemented manual packfile switching in fast-import. To help importers which are dealing with massive amounts of data fast-import needs to be able to close the packfile it is currently writing to and open a new packfile for any additional data that will be received. A new 'checkpoint' command has been introduced which can be used by the frontend import process to force this to occur at any time. This may be useful to ensure a very long running import doesn't lose any work due to unexpected failures. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 12:35:41 +01:00			`last_blob.offset = 0;`
			`last_blob.depth = 0;`
Restructure fast-import to support creating multiple packfiles. Now that we are starting to see some really large projects (such as KDE or a fork of FreeBSD) get imported into Git we're running into the upper limit on packfile object count as well as overall byte length. The KDE and FreeBSD projects are both likely to require more than 4 GiB to store their current history, which means we really need multiple packfiles to handle their content. This is a fairly simple restructuring of the internal code to help us support creating multiple packfiles from within fast-import. We are now adding a 5 digit incrementing suffix to the end of the basename supplied to us by the caller, permitting up to 99,999 packs to be generated in a single fast-import run. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 10:39:05 +01:00			`}`

Dump all refs and marks during a checkpoint in fast-import. If the frontend asks us to checkpoint (via the explicit checkpoint command) its probably because they are afraid the current import will crash/fail/whatever and want to make sure they can pickup from the last checkpoint. To do that sort of recovery, we will need the current tip of every branch and tag available at the next startup. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-07 08:42:44 +01:00			`static void cycle_packfile(void)`
Implemented automatic checkpoints within fast-import. When the number of objects or number of bytes gets close to the limit allowed by the packfile format (or configured on the command line by our caller) we should automatically checkpoint the current packfile and start a new one before writing the object out. This does however require that we abandon the delta (if we had one) as its not valid in a new packfile. I also added the simple rule that if we got a delta back but the delta itself is the same size as or larger than the uncompressed object to ignore the delta and just store the object data. This should avoid some really bad behavior caused by our current delta strategy. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 14:00:49 +01:00			`{`
			`end_packfile();`
			`start_packfile();`
			`}`

Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00			`static size_t encode_header(`
Refactored fast-import's internals for future additions. Too many globals variables were being used not not enough code was resuable to process trees and commits so this is a simple refactoring of the existing blob processing code to get into a state that will be easier to handle trees and commits in. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 06:46:13 +02:00			`enum object_type type,`
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00			`size_t size,`
Refactored fast-import's internals for future additions. Too many globals variables were being used not not enough code was resuable to process trees and commits so this is a simple refactoring of the existing blob processing code to get into a state that will be easier to handle trees and commits in. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 06:46:13 +02:00			`unsigned char *hdr)`
Created fast-import, a tool to quickly generating a pack from blobs. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-05 08:04:21 +02:00			`{`
			`int n = 1;`
			`unsigned char c;`

Merge branch 'master' into sp/fast-import I'm bringing master in early so that the OBJ_OFS_DELTA implementation is available as part of the topic. This way git-fast-import can learn about this new slightly smaller and faster packfile format, and can generate them directly rather than needing to have them be repacked with git-pack-objects. Due to the API changes in master during the period of development of git-fast-import, a few minor tweaks to fast-import.c are needed to produce a working merge. I've done them here as part of the merge to ensure bisection always works. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-14 08:44:18 +01:00			`if (type < OBJ_COMMIT \|\| type > OBJ_REF_DELTA)`
Created fast-import, a tool to quickly generating a pack from blobs. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-05 08:04:21 +02:00			`die("bad type %d", type);`

			`c = (type << 4) \| (size & 15);`
			`size >>= 4;`
			`while (size) {`
			`*hdr++ = c \| 0x80;`
			`c = size & 0x7f;`
			`size >>= 7;`
			`n++;`
			`}`
			`*hdr = c;`
			`return n;`
			`}`

Refactored fast-import's internals for future additions. Too many globals variables were being used not not enough code was resuable to process trees and commits so this is a simple refactoring of the existing blob processing code to get into a state that will be easier to handle trees and commits in. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 06:46:13 +02:00			`static int store_object(`
			`enum object_type type,`
fast-import was using dbuf's, replace them with strbuf's. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-17 13:48:17 +02:00			`struct strbuf *dat,`
Implemented branch handling and basic tree support in fast-import. This provides the basic data structures needed to store trees in memory while we are processing them for a branch. What we are attempting to do is track one complete tree for each branch that the frontend has registered with us through the 'newb' (new_branch) command. When the frontend edits that tree through 'updf' or 'delf' commands we'll mark the affected tree(s) as being dirty and recompute their objects during 'comt' (commit). Currently the protocol is decidedly _not_ user friendly. I crashed fast-import by giving it bad input data from Perl. I may try to improve upon it, or at least upon its error handling. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 09:36:45 +02:00			`struct last_object *last,`
Added mark store/find to fast-import. Marks are now saved when the mark directive gets used by the frontend and may be used in place of a SHA1 expression to locate a previous SHA1 which fast-import may have generated. This is particularly useful with commits where the frontend does not (easily) have the ability to compute the SHA1 for an arbitrary commit but needs it to generate a branch or tag from that commit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-23 10:17:45 +02:00			`unsigned char *sha1out,`
Use uintmax_t for marks in fast-import. If a frontend wants to use a mark per file revision and per commit and is doing a truly huge import (such as a 32 GiB SVN repository) we may need more than 2**32 unique mark values, especially if the frontend is unable (or unwilling) to recycle mark values. For mark idnums we should use the largest unsigned integer type available, hoping that will be at least 64 bits when we are compiled as a 64 bit executable. This way we may consume huge amounts of memory storing our mark table, but we'll at least be able to process the entire import without failing. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-16 06:33:19 +01:00			`uintmax_t mark)`
Created fast-import, a tool to quickly generating a pack from blobs. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-05 08:04:21 +02:00			`{`
			`void out, delta;`
Refactored fast-import's internals for future additions. Too many globals variables were being used not not enough code was resuable to process trees and commits so this is a simple refactoring of the existing blob processing code to get into a state that will be easier to handle trees and commits in. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 06:46:13 +02:00			`struct object_entry *e;`
			`unsigned char hdr[96];`
			`unsigned char sha1[20];`
Created fast-import, a tool to quickly generating a pack from blobs. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-05 08:04:21 +02:00			`unsigned long hdrlen, deltalen;`
Refactored fast-import's internals for future additions. Too many globals variables were being used not not enough code was resuable to process trees and commits so this is a simple refactoring of the existing blob processing code to get into a state that will be easier to handle trees and commits in. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 06:46:13 +02:00			`SHA_CTX c;`
			`z_stream s;`

formalize typename(), and add its reverse type_from_string() Sometime typename() is used, sometimes type_names[] is accessed directly. Let's enforce typename() all the time which allows for validating the type. Also let's add a function to go from a name to a type and use it instead of manual memcpy() when appropriate. Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-26 20:55:58 +01:00			`hdrlen = sprintf((char*)hdr,"%s %lu", typename(type),`
fast-import was using dbuf's, replace them with strbuf's. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-17 13:48:17 +02:00			`(unsigned long)dat->len) + 1;`
Refactored fast-import's internals for future additions. Too many globals variables were being used not not enough code was resuable to process trees and commits so this is a simple refactoring of the existing blob processing code to get into a state that will be easier to handle trees and commits in. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 06:46:13 +02:00			`SHA1_Init(&c);`
			`SHA1_Update(&c, hdr, hdrlen);`
fast-import was using dbuf's, replace them with strbuf's. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-17 13:48:17 +02:00			`SHA1_Update(&c, dat->buf, dat->len);`
Refactored fast-import's internals for future additions. Too many globals variables were being used not not enough code was resuable to process trees and commits so this is a simple refactoring of the existing blob processing code to get into a state that will be easier to handle trees and commits in. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 06:46:13 +02:00			`SHA1_Final(sha1, &c);`
Implemented branch handling and basic tree support in fast-import. This provides the basic data structures needed to store trees in memory while we are processing them for a branch. What we are attempting to do is track one complete tree for each branch that the frontend has registered with us through the 'newb' (new_branch) command. When the frontend edits that tree through 'updf' or 'delf' commands we'll mark the affected tree(s) as being dirty and recompute their objects during 'comt' (commit). Currently the protocol is decidedly _not_ user friendly. I crashed fast-import by giving it bad input data from Perl. I may try to improve upon it, or at least upon its error handling. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 09:36:45 +02:00			`if (sha1out)`
Converted hash memcpy/memcmp to new hashcpy/hashcmp/hashclr. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-28 16:46:58 +02:00			`hashcpy(sha1out, sha1);`
Refactored fast-import's internals for future additions. Too many globals variables were being used not not enough code was resuable to process trees and commits so this is a simple refactoring of the existing blob processing code to get into a state that will be easier to handle trees and commits in. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 06:46:13 +02:00
			`e = insert_object(sha1);`
Added mark store/find to fast-import. Marks are now saved when the mark directive gets used by the frontend and may be used in place of a SHA1 expression to locate a previous SHA1 which fast-import may have generated. This is particularly useful with commits where the frontend does not (easily) have the ability to compute the SHA1 for an arbitrary commit but needs it to generate a branch or tag from that commit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-23 10:17:45 +02:00			`if (mark)`
			`insert_mark(mark, e);`
Refactored fast-import's internals for future additions. Too many globals variables were being used not not enough code was resuable to process trees and commits so this is a simple refactoring of the existing blob processing code to get into a state that will be easier to handle trees and commits in. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 06:46:13 +02:00			`if (e->offset) {`
Added basic command handler to fast-import. Moved the new_blob logic off into a new subroutine and invoked it when getting the 'blob' command. Added statistics dump to STDERR when the program terminates listing what it did at a high level. This is somewhat interesting. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 07:14:21 +02:00			`duplicate_count_by_type[type]++;`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`return 1;`
Don't repack existing objects in fast-import Some users of fast-import have been trying to use it to rewrite commits and trees, an activity where the all of the relevant blobs are already available from the existing packfiles. In such a case we don't want to repack a blob, even if the frontend application has supplied us the raw data rather than a mark or a SHA-1 name. I'm intentionally only checking the packfiles that existed when fast-import started and am always ignoring all loose object files. We ignore loose objects because fast-import tends to operate on a very large number of objects in a very short timespan, and it is usually creating new objects, not reusing existing ones. In such a situtation the majority of the objects will not be found in the existing packfiles, nor will they be loose object files. If the frontend application really wants us to look at loose object files, then they can just repack the repository before running fast-import. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-04-20 17:23:45 +02:00			`} else if (find_sha1_pack(sha1, packed_git)) {`
			`e->type = type;`
			`e->pack_id = MAX_PACK_ID;`
			`e->offset = 1; /* just not zero! */`
			`duplicate_count_by_type[type]++;`
			`return 1;`
Refactored fast-import's internals for future additions. Too many globals variables were being used not not enough code was resuable to process trees and commits so this is a simple refactoring of the existing blob processing code to get into a state that will be easier to handle trees and commits in. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 06:46:13 +02:00			`}`
Created fast-import, a tool to quickly generating a pack from blobs. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-05 08:04:21 +02:00
fast-import optimization: Now that cmd_data acts on a strbuf, make last_object stashed buffer be a strbuf as well. On new stash, don't free the last stashed buffer, rather swap it with the one you will stash, this way, callers of store_object can act on static strbufs, and at some point, fast-import won't allocate new memory for objects buffers. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-17 14:00:38 +02:00			`if (last && last->data.buf && last->depth < max_depth) {`
			`delta = diff_delta(last->data.buf, last->data.len,`
fast-import was using dbuf's, replace them with strbuf's. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-17 13:48:17 +02:00			`dat->buf, dat->len,`
Created fast-import, a tool to quickly generating a pack from blobs. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-05 08:04:21 +02:00			`&deltalen, 0);`
fast-import was using dbuf's, replace them with strbuf's. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-17 13:48:17 +02:00			`if (delta && deltalen >= dat->len) {`
Implemented automatic checkpoints within fast-import. When the number of objects or number of bytes gets close to the limit allowed by the packfile format (or configured on the command line by our caller) we should automatically checkpoint the current packfile and start a new one before writing the object out. This does however require that we abandon the delta (if we had one) as its not valid in a new packfile. I also added the simple rule that if we got a delta back but the delta itself is the same size as or larger than the uncompressed object to ignore the delta and just store the object data. This should avoid some really bad behavior caused by our current delta strategy. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 14:00:49 +01:00			`free(delta);`
			`delta = NULL;`
			`}`
			`} else`
			`delta = NULL;`
Created fast-import, a tool to quickly generating a pack from blobs. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-05 08:04:21 +02:00
			`memset(&s, 0, sizeof(s));`
Teach fast-import to honor pack.compression and pack.depth We now use the configured pack.compression and pack.depth values within fast-import, as like builtin-pack-objects fast-import is generating a packfile for consumption by the Git tools. We use the same behavior as builtin-pack-objects does for these options, allowing core.compression to supply the default value for pack.compression. The default setting for pack.depth within fast-import is still 10 as users will generally repack fast-import generated packfiles by `repack -f`. A large delta depth within the fast-import packfile can significantly slow down such a later repack. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-01-21 05:36:54 +01:00			`deflateInit(&s, pack_compression_level);`
Implemented automatic checkpoints within fast-import. When the number of objects or number of bytes gets close to the limit allowed by the packfile format (or configured on the command line by our caller) we should automatically checkpoint the current packfile and start a new one before writing the object out. This does however require that we abandon the delta (if we had one) as its not valid in a new packfile. I also added the simple rule that if we got a delta back but the delta itself is the same size as or larger than the uncompressed object to ignore the delta and just store the object data. This should avoid some really bad behavior caused by our current delta strategy. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 14:00:49 +01:00			`if (delta) {`
			`s.next_in = delta;`
			`s.avail_in = deltalen;`
			`} else {`
fast-import was using dbuf's, replace them with strbuf's. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-17 13:48:17 +02:00			`s.next_in = (void *)dat->buf;`
			`s.avail_in = dat->len;`
Implemented automatic checkpoints within fast-import. When the number of objects or number of bytes gets close to the limit allowed by the packfile format (or configured on the command line by our caller) we should automatically checkpoint the current packfile and start a new one before writing the object out. This does however require that we abandon the delta (if we had one) as its not valid in a new packfile. I also added the simple rule that if we got a delta back but the delta itself is the same size as or larger than the uncompressed object to ignore the delta and just store the object data. This should avoid some really bad behavior caused by our current delta strategy. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 14:00:49 +01:00			`}`
			`s.avail_out = deflateBound(&s, s.avail_in);`
			`s.next_out = out = xmalloc(s.avail_out);`
			`while (deflate(&s, Z_FINISH) == Z_OK)`
			`/* nothing */;`
			`deflateEnd(&s);`

			`/* Determine if we should auto-checkpoint. */`
Remove unnecessary options from fast-import. The --objects command line option is rather unnecessary. Internally we allocate objects in 5000 unit blocks, ensuring that any sort of malloc overhead is ammortized over the individual objects to almost nothing. Since most frontends don't know how many objects they will need for a given import run (and its hard for them to predict without just doing the run) we probably won't see anyone using --objects. Further since there's really no major benefit to using the option, most frontends won't even bother supplying it even if they could estimate the number of objects. So I'm removing it. The --max-objects-per-pack option was probably a mistake to even have added in the first place. The packfile format is limited to 4 GiB today; given that objects need at least 3 bytes of data (and probably need even more) there's no way we are going to exceed the limit of 1<<32-1 objects before we reach the file size limit. So I'm removing it (to slightly reduce the complexity of the code) before anyone gets any wise ideas and tries to use it. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-18 18:00:49 +01:00			`if ((pack_size + 60 + s.total_out) > max_packsize`
Implemented automatic checkpoints within fast-import. When the number of objects or number of bytes gets close to the limit allowed by the packfile format (or configured on the command line by our caller) we should automatically checkpoint the current packfile and start a new one before writing the object out. This does however require that we abandon the delta (if we had one) as its not valid in a new packfile. I also added the simple rule that if we got a delta back but the delta itself is the same size as or larger than the uncompressed object to ignore the delta and just store the object data. This should avoid some really bad behavior caused by our current delta strategy. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 14:00:49 +01:00			`\|\| (pack_size + 60 + s.total_out) < pack_size) {`

			`/* This new object needs to not have the current pack_id. */`
			`e->pack_id = pack_id + 1;`
Dump all refs and marks during a checkpoint in fast-import. If the frontend asks us to checkpoint (via the explicit checkpoint command) its probably because they are afraid the current import will crash/fail/whatever and want to make sure they can pickup from the last checkpoint. To do that sort of recovery, we will need the current tip of every branch and tag available at the next startup. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-07 08:42:44 +01:00			`cycle_packfile();`
Implemented automatic checkpoints within fast-import. When the number of objects or number of bytes gets close to the limit allowed by the packfile format (or configured on the command line by our caller) we should automatically checkpoint the current packfile and start a new one before writing the object out. This does however require that we abandon the delta (if we had one) as its not valid in a new packfile. I also added the simple rule that if we got a delta back but the delta itself is the same size as or larger than the uncompressed object to ignore the delta and just store the object data. This should avoid some really bad behavior caused by our current delta strategy. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 14:00:49 +01:00
			`/* We cannot carry a delta into the new pack. */`
			`if (delta) {`
			`free(delta);`
			`delta = NULL;`
Corrected buffer overflow during automatic checkpoint in fast-import. If we previously were using a delta but we needed to checkpoint the current packfile and switch to a new packfile we need to throw away the delta and compress the raw object by itself, as delta chains cannot span non-thin packfiles. Unfortunately the output buffer in this case needs to grow, as the size of the compressed object may be quite a bit larger than the size of the compressed delta. I've also avoided recompressing the object if we are checkpointing and we didn't use a delta. In this case the output buffer is the correct size and has already been populated with the right data, we just need to close out the current packfile and open a new one. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-16 05:40:27 +01:00
			`memset(&s, 0, sizeof(s));`
Teach fast-import to honor pack.compression and pack.depth We now use the configured pack.compression and pack.depth values within fast-import, as like builtin-pack-objects fast-import is generating a packfile for consumption by the Git tools. We use the same behavior as builtin-pack-objects does for these options, allowing core.compression to supply the default value for pack.compression. The default setting for pack.depth within fast-import is still 10 as users will generally repack fast-import generated packfiles by `repack -f`. A large delta depth within the fast-import packfile can significantly slow down such a later repack. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-01-21 05:36:54 +01:00			`deflateInit(&s, pack_compression_level);`
fast-import was using dbuf's, replace them with strbuf's. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-17 13:48:17 +02:00			`s.next_in = (void *)dat->buf;`
			`s.avail_in = dat->len;`
Corrected buffer overflow during automatic checkpoint in fast-import. If we previously were using a delta but we needed to checkpoint the current packfile and switch to a new packfile we need to throw away the delta and compress the raw object by itself, as delta chains cannot span non-thin packfiles. Unfortunately the output buffer in this case needs to grow, as the size of the compressed object may be quite a bit larger than the size of the compressed delta. I've also avoided recompressing the object if we are checkpointing and we didn't use a delta. In this case the output buffer is the correct size and has already been populated with the right data, we just need to close out the current packfile and open a new one. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-16 05:40:27 +01:00			`s.avail_out = deflateBound(&s, s.avail_in);`
			`s.next_out = out = xrealloc(out, s.avail_out);`
			`while (deflate(&s, Z_FINISH) == Z_OK)`
			`/* nothing */;`
			`deflateEnd(&s);`
Implemented automatic checkpoints within fast-import. When the number of objects or number of bytes gets close to the limit allowed by the packfile format (or configured on the command line by our caller) we should automatically checkpoint the current packfile and start a new one before writing the object out. This does however require that we abandon the delta (if we had one) as its not valid in a new packfile. I also added the simple rule that if we got a delta back but the delta itself is the same size as or larger than the uncompressed object to ignore the delta and just store the object data. This should avoid some really bad behavior caused by our current delta strategy. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 14:00:49 +01:00			`}`
			`}`

			`e->type = type;`
			`e->pack_id = pack_id;`
			`e->offset = pack_size;`
			`object_count++;`
			`object_count_by_type[type]++;`
Created fast-import, a tool to quickly generating a pack from blobs. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-05 08:04:21 +02:00
			`if (delta) {`
Improve reuse of sha1_file library within fast-import. Now that the sha1_file.c library routines use the sliding mmap routines to perform efficient access to portions of a packfile I can remove that code from fast-import.c and just invoke it. One benefit is we now have reloading support for any packfile which uses OBJ_OFS_DELTA. Another is we have significantly less code to maintain. This code reuse change requires that fast-import generate only an OBJ_OFS_DELTA format packfile, as there is absolutely no index available to perform OBJ_REF_DELTA lookup in while unpacking an object. This is probably reasonable to require as the delta offsets result in smaller packfiles and are faster to unpack, as no index searching is required. Its also only a temporary requirement as users could always repack without offsets before making the import available to older versions of Git. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-14 12:20:23 +01:00			`unsigned long ofs = e->offset - last->offset;`
			`unsigned pos = sizeof(hdr) - 1;`

Implemented tree delta compression in fast-import. We now store for every tree entry two modes and two sha1 values; the base (aka "version 0") and the current/new (aka "version 1"). When we generate a tree object we also regenerate the prior version object and use that as our base object for a delta. This strategy saves a significant amount of memory as we can continue to use the atom pool for file/directory names and only increases each tree entry by an additional 24 bytes of memory. Branches should automatically delta against their ancestor tree, unless the ancestor tree is already at the delta chain limit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-28 18:22:50 +02:00			`delta_count_by_type[type]++;`
Don't allow fast-import tree delta chains to exceed maximum depth Brian Downing noticed fast-import can produce tree depths of up to 6,035 objects and even deeper. Long delta chains can create very small packfiles but cause problems during repacking as git needs to unpack each tree to count the reachable blobs. What's happening here is the active branch cache isn't big enough. We're swapping out the branch and thus recycling the tree information (struct tree_content) back into the free pool. When we later reload the tree we set the delta_depth to 0 but we kept the tree we just reloaded as a delta base. So if the tree we reloaded was already at the maximum depth we wouldn't know it and make the new tree a delta. Multiply the number of times the branch cache has to swap out the tree times max_depth (10) and you get the maximum delta depth of a tree created by fast-import. In Brian's case above the active branch cache had to swap the branch out 603/604 times during this import to produce a tree with a delta depth of 6035. Acked-by: Brian Downing <bdowning@lavos.net> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-14 05:48:42 +01:00			`e->depth = last->depth + 1;`
Improve reuse of sha1_file library within fast-import. Now that the sha1_file.c library routines use the sliding mmap routines to perform efficient access to portions of a packfile I can remove that code from fast-import.c and just invoke it. One benefit is we now have reloading support for any packfile which uses OBJ_OFS_DELTA. Another is we have significantly less code to maintain. This code reuse change requires that fast-import generate only an OBJ_OFS_DELTA format packfile, as there is absolutely no index available to perform OBJ_REF_DELTA lookup in while unpacking an object. This is probably reasonable to require as the delta offsets result in smaller packfiles and are faster to unpack, as no index searching is required. Its also only a temporary requirement as users could always repack without offsets before making the import available to older versions of Git. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-14 12:20:23 +01:00
			`hdrlen = encode_header(OBJ_OFS_DELTA, deltalen, hdr);`
Remove unnecessary pack_fd global in fast-import. Much like the pack_sha1 the pack_fd is an unnecessary global variable, we already have the fd stored in our struct packed_git *pack_data so that the core library functions in sha1_file.c are able to lookup and decompress object data that we have previously written. Keeping an extra copy of this value in our own variable is just a hold-over from earlier versions of fast-import and is now completely unnecessary. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-16 07:20:57 +01:00			`write_or_die(pack_data->pack_fd, hdr, hdrlen);`
Improve reuse of sha1_file library within fast-import. Now that the sha1_file.c library routines use the sliding mmap routines to perform efficient access to portions of a packfile I can remove that code from fast-import.c and just invoke it. One benefit is we now have reloading support for any packfile which uses OBJ_OFS_DELTA. Another is we have significantly less code to maintain. This code reuse change requires that fast-import generate only an OBJ_OFS_DELTA format packfile, as there is absolutely no index available to perform OBJ_REF_DELTA lookup in while unpacking an object. This is probably reasonable to require as the delta offsets result in smaller packfiles and are faster to unpack, as no index searching is required. Its also only a temporary requirement as users could always repack without offsets before making the import available to older versions of Git. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-14 12:20:23 +01:00			`pack_size += hdrlen;`

			`hdr[pos] = ofs & 127;`
			`while (ofs >>= 7)`
			`hdr[--pos] = 128 \| (--ofs & 127);`
Remove unnecessary pack_fd global in fast-import. Much like the pack_sha1 the pack_fd is an unnecessary global variable, we already have the fd stored in our struct packed_git *pack_data so that the core library functions in sha1_file.c are able to lookup and decompress object data that we have previously written. Keeping an extra copy of this value in our own variable is just a hold-over from earlier versions of fast-import and is now completely unnecessary. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-16 07:20:57 +01:00			`write_or_die(pack_data->pack_fd, hdr + pos, sizeof(hdr) - pos);`
Improve reuse of sha1_file library within fast-import. Now that the sha1_file.c library routines use the sliding mmap routines to perform efficient access to portions of a packfile I can remove that code from fast-import.c and just invoke it. One benefit is we now have reloading support for any packfile which uses OBJ_OFS_DELTA. Another is we have significantly less code to maintain. This code reuse change requires that fast-import generate only an OBJ_OFS_DELTA format packfile, as there is absolutely no index available to perform OBJ_REF_DELTA lookup in while unpacking an object. This is probably reasonable to require as the delta offsets result in smaller packfiles and are faster to unpack, as no index searching is required. Its also only a temporary requirement as users could always repack without offsets before making the import available to older versions of Git. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-14 12:20:23 +01:00			`pack_size += sizeof(hdr) - pos;`
Created fast-import, a tool to quickly generating a pack from blobs. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-05 08:04:21 +02:00			`} else {`
Don't allow fast-import tree delta chains to exceed maximum depth Brian Downing noticed fast-import can produce tree depths of up to 6,035 objects and even deeper. Long delta chains can create very small packfiles but cause problems during repacking as git needs to unpack each tree to count the reachable blobs. What's happening here is the active branch cache isn't big enough. We're swapping out the branch and thus recycling the tree information (struct tree_content) back into the free pool. When we later reload the tree we set the delta_depth to 0 but we kept the tree we just reloaded as a delta base. So if the tree we reloaded was already at the maximum depth we wouldn't know it and make the new tree a delta. Multiply the number of times the branch cache has to swap out the tree times max_depth (10) and you get the maximum delta depth of a tree created by fast-import. In Brian's case above the active branch cache had to swap the branch out 603/604 times during this import to produce a tree with a delta depth of 6035. Acked-by: Brian Downing <bdowning@lavos.net> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-14 05:48:42 +01:00			`e->depth = 0;`
fast-import was using dbuf's, replace them with strbuf's. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-17 13:48:17 +02:00			`hdrlen = encode_header(type, dat->len, hdr);`
Remove unnecessary pack_fd global in fast-import. Much like the pack_sha1 the pack_fd is an unnecessary global variable, we already have the fd stored in our struct packed_git *pack_data so that the core library functions in sha1_file.c are able to lookup and decompress object data that we have previously written. Keeping an extra copy of this value in our own variable is just a hold-over from earlier versions of fast-import and is now completely unnecessary. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-16 07:20:57 +01:00			`write_or_die(pack_data->pack_fd, hdr, hdrlen);`
Implemented tree reloading in fast-import. Tree reloading allows fast-import to swap out the least-recently used branch by simply deallocating the data structures from memory that were associated with that branch. Later if the branch becomes active again it can lazily recreate those structures on demand by reloading the necessary trees from the pack file it originally wrote them to. The reloading process is implemented by mmap'ing the pack into memory and using a much tighter variant of the pack reading code contained in sha1_file.c. This was a blatent copy from sha1_file.c but the unpacking functions were significantly simplified and are actually now in a form that should make it easier to map only the necessary regions of a pack rather than the entire file. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-24 10:37:35 +02:00			`pack_size += hdrlen;`
Created fast-import, a tool to quickly generating a pack from blobs. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-05 08:04:21 +02:00			`}`

Remove unnecessary pack_fd global in fast-import. Much like the pack_sha1 the pack_fd is an unnecessary global variable, we already have the fd stored in our struct packed_git *pack_data so that the core library functions in sha1_file.c are able to lookup and decompress object data that we have previously written. Keeping an extra copy of this value in our own variable is just a hold-over from earlier versions of fast-import and is now completely unnecessary. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-16 07:20:57 +01:00			`write_or_die(pack_data->pack_fd, out, s.total_out);`
Implemented tree reloading in fast-import. Tree reloading allows fast-import to swap out the least-recently used branch by simply deallocating the data structures from memory that were associated with that branch. Later if the branch becomes active again it can lazily recreate those structures on demand by reloading the necessary trees from the pack file it originally wrote them to. The reloading process is implemented by mmap'ing the pack into memory and using a much tighter variant of the pack reading code contained in sha1_file.c. This was a blatent copy from sha1_file.c but the unpacking functions were significantly simplified and are actually now in a form that should make it easier to map only the necessary regions of a pack rather than the entire file. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-24 10:37:35 +02:00			`pack_size += s.total_out;`
Created fast-import, a tool to quickly generating a pack from blobs. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-05 08:04:21 +02:00
			`free(out);`
Remove unnecessary null pointer checks in fast-import. There is no need to check for a NULL pointer before invoking free(), the runtime library automatically performs this check anyway and does nothing if a NULL pointer is supplied. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-06 18:05:51 +01:00			`free(delta);`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`if (last) {`
fast-import optimization: Now that cmd_data acts on a strbuf, make last_object stashed buffer be a strbuf as well. On new stash, don't free the last stashed buffer, rather swap it with the one you will stash, this way, callers of store_object can act on static strbufs, and at some point, fast-import won't allocate new memory for objects buffers. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-17 14:00:38 +02:00			`if (last->no_swap) {`
			`last->data = *dat;`
			`} else {`
strbuf API additions and enhancements. Add strbuf_remove, change strbuf_insert: As both are special cases of strbuf_splice, implement them as such. gcc is able to do the math and generate almost optimal code this way. Add strbuf_swap: Exchange the values of its arguments. Use it in fast-import.c Also fix spacing issues in strbuf.h Signed-off-by: Pierre Habouzit <madcoder@debian.org> 2007-09-20 00:42:12 +02:00			`strbuf_swap(&last->data, dat);`
fast-import optimization: Now that cmd_data acts on a strbuf, make last_object stashed buffer be a strbuf as well. On new stash, don't free the last stashed buffer, rather swap it with the one you will stash, this way, callers of store_object can act on static strbufs, and at some point, fast-import won't allocate new memory for objects buffers. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-17 14:00:38 +02:00			`}`
Improve reuse of sha1_file library within fast-import. Now that the sha1_file.c library routines use the sliding mmap routines to perform efficient access to portions of a packfile I can remove that code from fast-import.c and just invoke it. One benefit is we now have reloading support for any packfile which uses OBJ_OFS_DELTA. Another is we have significantly less code to maintain. This code reuse change requires that fast-import generate only an OBJ_OFS_DELTA format packfile, as there is absolutely no index available to perform OBJ_REF_DELTA lookup in while unpacking an object. This is probably reasonable to require as the delta offsets result in smaller packfiles and are faster to unpack, as no index searching is required. Its also only a temporary requirement as users could always repack without offsets before making the import available to older versions of Git. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-14 12:20:23 +01:00			`last->offset = e->offset;`
Don't allow fast-import tree delta chains to exceed maximum depth Brian Downing noticed fast-import can produce tree depths of up to 6,035 objects and even deeper. Long delta chains can create very small packfiles but cause problems during repacking as git needs to unpack each tree to count the reachable blobs. What's happening here is the active branch cache isn't big enough. We're swapping out the branch and thus recycling the tree information (struct tree_content) back into the free pool. When we later reload the tree we set the delta_depth to 0 but we kept the tree we just reloaded as a delta base. So if the tree we reloaded was already at the maximum depth we wouldn't know it and make the new tree a delta. Multiply the number of times the branch cache has to swap out the tree times max_depth (10) and you get the maximum delta depth of a tree created by fast-import. In Brian's case above the active branch cache had to swap the branch out 603/604 times during this import to produce a tree with a delta depth of 6035. Acked-by: Brian Downing <bdowning@lavos.net> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-14 05:48:42 +01:00			`last->depth = e->depth;`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`}`
			`return 0;`
			`}`

Document the hairy gfi_unpack_entry part of fast-import Junio pointed out this part of fast-import wasn't very clear on initial read, and it took some time for someone who was new to fast-import's "dirty little tricks" to understand how this was even working. So a little bit of commentary in the proper place may help future readers. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-01-21 05:37:01 +01:00			`/* All calls must be guarded by find_object() or find_mark() to`
			`* ensure the 'struct object_entry' passed was written by this`
			`* process instance. We unpack the entry by the offset, avoiding`
			`* the need for the corresponding .idx file. This unpacking rule`
			`* works because we only use OBJ_REF_DELTA within the packfiles`
			`* created by fast-import.`
			`*`
			`* oe must not be NULL. Such an oe usually comes from giving`
			`* an unknown SHA-1 to find_object() or an undefined mark to`
			`* find_mark(). Callers must test for this condition and use`
			`* the standard read_sha1_file() when it happens.`
			`*`
			`* oe->pack_id must not be MAX_PACK_ID. Such an oe is usually from`
			`* find_mark(), where the mark was reloaded from an existing marks`
			`* file and is referencing an object that this fast-import process`
			`* instance did not write out to a packfile. Callers must test for`
			`* this condition and use read_sha1_file() instead.`
			`*/`
Implemented manual packfile switching in fast-import. To help importers which are dealing with massive amounts of data fast-import needs to be able to close the packfile it is currently writing to and open a new packfile for any additional data that will be received. A new 'checkpoint' command has been introduced which can be used by the frontend import process to force this to occur at any time. This may be useful to ensure a very long running import doesn't lose any work due to unexpected failures. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 12:35:41 +01:00			`static void *gfi_unpack_entry(`
			`struct object_entry *oe,`
			`unsigned long *sizep)`
Implemented tree reloading in fast-import. Tree reloading allows fast-import to swap out the least-recently used branch by simply deallocating the data structures from memory that were associated with that branch. Later if the branch becomes active again it can lazily recreate those structures on demand by reloading the necessary trees from the pack file it originally wrote them to. The reloading process is implemented by mmap'ing the pack into memory and using a much tighter variant of the pack reading code contained in sha1_file.c. This was a blatent copy from sha1_file.c but the unpacking functions were significantly simplified and are actually now in a form that should make it easier to map only the necessary regions of a pack rather than the entire file. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-24 10:37:35 +02:00			`{`
convert object type handling from a string to a number We currently have two parallel notation for dealing with object types in the code: a string and a numerical value. One of them is obviously redundent, and the most used one requires more stack space and a bunch of strcmp() all over the place. This is an initial step for the removal of the version using a char array found in object reading code paths. The patch is unfortunately large but there is no sane way to split it in smaller parts without breaking the system. Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-26 20:55:59 +01:00			`enum object_type type;`
Implemented manual packfile switching in fast-import. To help importers which are dealing with massive amounts of data fast-import needs to be able to close the packfile it is currently writing to and open a new packfile for any additional data that will be received. A new 'checkpoint' command has been introduced which can be used by the frontend import process to force this to occur at any time. This may be useful to ensure a very long running import doesn't lose any work due to unexpected failures. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 12:35:41 +01:00			`struct packed_git *p = all_packs[oe->pack_id];`
Fix random fast-import errors when compiled with NO_MMAP fast-import was relying on the fact that on most systems mmap() and write() are synchronized by the filesystem's buffer cache. We were relying on the ability to mmap() 20 bytes beyond the current end of the file, then later fill in those bytes with a future write() call, then read them through the previously obtained mmap() address. This isn't always true with some implementations of NFS, but it is especially not true with our NO_MMAP=YesPlease build time option used on some platforms. If fast-import was built with NO_MMAP=YesPlease we used the malloc()+pread() emulation and the subsequent write() call does not update the trailing 20 bytes of a previously obtained "mmap()" (aka malloc'd) address. Under NO_MMAP that behavior causes unpack_entry() in sha1_file.c to be unable to read an object header (or data) that has been unlucky enough to be written to the packfile at a location such that it is in the trailing 20 bytes of a window previously opened on that same packfile. This bug has gone unnoticed for a very long time as it is highly data dependent. Not only does the object have to be placed at the right position, but it also needs to be positioned behind some other object that has been accessed due to a branch cache invalidation. In other words the stars had to align just right, and if you did run into this bug you probably should also have purchased a lottery ticket. Fortunately the workaround is a lot easier than the bug explanation. Before we allow unpack_entry() to read data from a pack window that has also (possibly) been modified through write() we force all existing windows on that packfile to be closed. By closing the windows we ensure that any new access via the emulated mmap() will reread the packfile, updating to the current file content. This comes at a slight performance degredation as we cannot reuse previously cached windows when we update the packfile. But it is a fairly minor difference as the window closes happen at only two points: - When the packfile is finalized and its .idx is generated: At this stage we are getting ready to update the refs and any data access into the packfile is going to be random, and is going after only the branch tips (to ensure they are valid). Our existing windows (if any) are not likely to be positioned at useful locations to access those final tip commits so we probably were closing them before anyway. - When the branch cache missed and we need to reload: At this point fast-import is getting change commands for the next commit and it needs to go re-read a tree object it previously had written out to the packfile. What windows we had (if any) are not likely to cover the tree in question so we probably were closing them before anyway. We do try to avoid unnecessarily closing windows in the second case by checking to see if the packfile size has increased since the last time we called unpack_entry() on that packfile. If the size has not changed then we have not written additional data, and any existing window is still vaild. This nicely handles the cases where fast-import is going through a branch cache reload and needs to read many trees at once. During such an event we are not likely to be updating the packfile so we do not cycle the windows between reads. With this change in place t9301-fast-export.sh (which was broken by c3b0dec509fe136c5417422f31898b5a4e2d5e02) finally works again. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-01-18 04:57:00 +01:00			`if (p == pack_data && p->pack_size < (pack_size + 20)) {`
Document the hairy gfi_unpack_entry part of fast-import Junio pointed out this part of fast-import wasn't very clear on initial read, and it took some time for someone who was new to fast-import's "dirty little tricks" to understand how this was even working. So a little bit of commentary in the proper place may help future readers. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-01-21 05:37:01 +01:00			`/* The object is stored in the packfile we are writing to`
			`* and we have modified it since the last time we scanned`
			`* back to read a previously written object. If an old`
			`* window covered [p->pack_size, p->pack_size + 20) its`
			`* data is stale and is not valid. Closing all windows`
			`* and updating the packfile length ensures we can read`
			`* the newly written data.`
			`*/`
Fix random fast-import errors when compiled with NO_MMAP fast-import was relying on the fact that on most systems mmap() and write() are synchronized by the filesystem's buffer cache. We were relying on the ability to mmap() 20 bytes beyond the current end of the file, then later fill in those bytes with a future write() call, then read them through the previously obtained mmap() address. This isn't always true with some implementations of NFS, but it is especially not true with our NO_MMAP=YesPlease build time option used on some platforms. If fast-import was built with NO_MMAP=YesPlease we used the malloc()+pread() emulation and the subsequent write() call does not update the trailing 20 bytes of a previously obtained "mmap()" (aka malloc'd) address. Under NO_MMAP that behavior causes unpack_entry() in sha1_file.c to be unable to read an object header (or data) that has been unlucky enough to be written to the packfile at a location such that it is in the trailing 20 bytes of a window previously opened on that same packfile. This bug has gone unnoticed for a very long time as it is highly data dependent. Not only does the object have to be placed at the right position, but it also needs to be positioned behind some other object that has been accessed due to a branch cache invalidation. In other words the stars had to align just right, and if you did run into this bug you probably should also have purchased a lottery ticket. Fortunately the workaround is a lot easier than the bug explanation. Before we allow unpack_entry() to read data from a pack window that has also (possibly) been modified through write() we force all existing windows on that packfile to be closed. By closing the windows we ensure that any new access via the emulated mmap() will reread the packfile, updating to the current file content. This comes at a slight performance degredation as we cannot reuse previously cached windows when we update the packfile. But it is a fairly minor difference as the window closes happen at only two points: - When the packfile is finalized and its .idx is generated: At this stage we are getting ready to update the refs and any data access into the packfile is going to be random, and is going after only the branch tips (to ensure they are valid). Our existing windows (if any) are not likely to be positioned at useful locations to access those final tip commits so we probably were closing them before anyway. - When the branch cache missed and we need to reload: At this point fast-import is getting change commands for the next commit and it needs to go re-read a tree object it previously had written out to the packfile. What windows we had (if any) are not likely to cover the tree in question so we probably were closing them before anyway. We do try to avoid unnecessarily closing windows in the second case by checking to see if the packfile size has increased since the last time we called unpack_entry() on that packfile. If the size has not changed then we have not written additional data, and any existing window is still vaild. This nicely handles the cases where fast-import is going through a branch cache reload and needs to read many trees at once. During such an event we are not likely to be updating the packfile so we do not cycle the windows between reads. With this change in place t9301-fast-export.sh (which was broken by c3b0dec509fe136c5417422f31898b5a4e2d5e02) finally works again. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-01-18 04:57:00 +01:00			`close_pack_windows(p);`
Document the hairy gfi_unpack_entry part of fast-import Junio pointed out this part of fast-import wasn't very clear on initial read, and it took some time for someone who was new to fast-import's "dirty little tricks" to understand how this was even working. So a little bit of commentary in the proper place may help future readers. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-01-21 05:37:01 +01:00
			`/* We have to offer 20 bytes additional on the end of`
			`* the packfile as the core unpacker code assumes the`
			`* footer is present at the file end and must promise`
			`* at least 20 bytes within any window it maps. But`
			`* we don't actually create the footer here.`
			`*/`
Implemented manual packfile switching in fast-import. To help importers which are dealing with massive amounts of data fast-import needs to be able to close the packfile it is currently writing to and open a new packfile for any additional data that will be received. A new 'checkpoint' command has been introduced which can be used by the frontend import process to force this to occur at any time. This may be useful to ensure a very long running import doesn't lose any work due to unexpected failures. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 12:35:41 +01:00			`p->pack_size = pack_size + 20;`
Fix random fast-import errors when compiled with NO_MMAP fast-import was relying on the fact that on most systems mmap() and write() are synchronized by the filesystem's buffer cache. We were relying on the ability to mmap() 20 bytes beyond the current end of the file, then later fill in those bytes with a future write() call, then read them through the previously obtained mmap() address. This isn't always true with some implementations of NFS, but it is especially not true with our NO_MMAP=YesPlease build time option used on some platforms. If fast-import was built with NO_MMAP=YesPlease we used the malloc()+pread() emulation and the subsequent write() call does not update the trailing 20 bytes of a previously obtained "mmap()" (aka malloc'd) address. Under NO_MMAP that behavior causes unpack_entry() in sha1_file.c to be unable to read an object header (or data) that has been unlucky enough to be written to the packfile at a location such that it is in the trailing 20 bytes of a window previously opened on that same packfile. This bug has gone unnoticed for a very long time as it is highly data dependent. Not only does the object have to be placed at the right position, but it also needs to be positioned behind some other object that has been accessed due to a branch cache invalidation. In other words the stars had to align just right, and if you did run into this bug you probably should also have purchased a lottery ticket. Fortunately the workaround is a lot easier than the bug explanation. Before we allow unpack_entry() to read data from a pack window that has also (possibly) been modified through write() we force all existing windows on that packfile to be closed. By closing the windows we ensure that any new access via the emulated mmap() will reread the packfile, updating to the current file content. This comes at a slight performance degredation as we cannot reuse previously cached windows when we update the packfile. But it is a fairly minor difference as the window closes happen at only two points: - When the packfile is finalized and its .idx is generated: At this stage we are getting ready to update the refs and any data access into the packfile is going to be random, and is going after only the branch tips (to ensure they are valid). Our existing windows (if any) are not likely to be positioned at useful locations to access those final tip commits so we probably were closing them before anyway. - When the branch cache missed and we need to reload: At this point fast-import is getting change commands for the next commit and it needs to go re-read a tree object it previously had written out to the packfile. What windows we had (if any) are not likely to cover the tree in question so we probably were closing them before anyway. We do try to avoid unnecessarily closing windows in the second case by checking to see if the packfile size has increased since the last time we called unpack_entry() on that packfile. If the size has not changed then we have not written additional data, and any existing window is still vaild. This nicely handles the cases where fast-import is going through a branch cache reload and needs to read many trees at once. During such an event we are not likely to be updating the packfile so we do not cycle the windows between reads. With this change in place t9301-fast-export.sh (which was broken by c3b0dec509fe136c5417422f31898b5a4e2d5e02) finally works again. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-01-18 04:57:00 +01:00			`}`
convert object type handling from a string to a number We currently have two parallel notation for dealing with object types in the code: a string and a numerical value. One of them is obviously redundent, and the most used one requires more stack space and a bunch of strcmp() all over the place. This is an initial step for the removal of the version using a char array found in object reading code paths. The patch is unfortunately large but there is no sane way to split it in smaller parts without breaking the system. Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-26 20:55:59 +01:00			`return unpack_entry(p, oe->offset, &type, sizep);`
Implemented tree reloading in fast-import. Tree reloading allows fast-import to swap out the least-recently used branch by simply deallocating the data structures from memory that were associated with that branch. Later if the branch becomes active again it can lazily recreate those structures on demand by reloading the necessary trees from the pack file it originally wrote them to. The reloading process is implemented by mmap'ing the pack into memory and using a much tighter variant of the pack reading code contained in sha1_file.c. This was a blatent copy from sha1_file.c but the unpacking functions were significantly simplified and are actually now in a form that should make it easier to map only the necessary regions of a pack rather than the entire file. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-24 10:37:35 +02:00			`}`

Reduce memory usage of fast-import. Some structs are allocated rather frequently, but were using integer types which were far larger than required to actually store their full value range. As packfiles are limited to 4 GiB we don't need more than 32 bits to store the offset of an object within that packfile, an `unsigned long` on a 64 bit system is likely a 64 bit unsigned value. Saving 4 bytes per object on a 64 bit system can add up fast on any sizable import. As atom strings are strictly single components in a path name these are probably limited to just 255 bytes by the underlying OS. Going to that short of a string is probably too restrictive, but certainly `unsigned int` is far too large for their lengths. `unsigned short` is a reasonable limit. Modes within a tree really only need two bytes to store their whole value; using `unsigned int` here is vast overkill. Saving 4 bytes per file entry in an active branch can add up quickly on a project with a large number of files. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-05 22:34:56 +01:00			`static const char get_mode(const char str, uint16_t *modep)`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`{`
			`unsigned char c;`
Reduce memory usage of fast-import. Some structs are allocated rather frequently, but were using integer types which were far larger than required to actually store their full value range. As packfiles are limited to 4 GiB we don't need more than 32 bits to store the offset of an object within that packfile, an `unsigned long` on a 64 bit system is likely a 64 bit unsigned value. Saving 4 bytes per object on a 64 bit system can add up fast on any sizable import. As atom strings are strictly single components in a path name these are probably limited to just 255 bytes by the underlying OS. Going to that short of a string is probably too restrictive, but certainly `unsigned int` is far too large for their lengths. `unsigned short` is a reasonable limit. Modes within a tree really only need two bytes to store their whole value; using `unsigned int` here is vast overkill. Saving 4 bytes per file entry in an active branch can add up quickly on a project with a large number of files. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-05 22:34:56 +01:00			`uint16_t mode = 0;`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00
			`while ((c = *str++) != ' ') {`
			`if (c < '0' \|\| c > '7')`
			`return NULL;`
			`mode = (mode << 3) + (c - '0');`
			`}`
			`*modep = mode;`
			`return str;`
			`}`

			`static void load_tree(struct tree_entry *root)`
			`{`
Implemented tree delta compression in fast-import. We now store for every tree entry two modes and two sha1 values; the base (aka "version 0") and the current/new (aka "version 1"). When we generate a tree object we also regenerate the prior version object and use that as our base object for a delta. This strategy saves a significant amount of memory as we can continue to use the atom pool for file/directory names and only increases each tree entry by an additional 24 bytes of memory. Branches should automatically delta against their ancestor tree, unless the ancestor tree is already at the delta chain limit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-28 18:22:50 +02:00			`unsigned char* sha1 = root->versions[1].sha1;`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`struct object_entry *myoe;`
			`struct tree_content *t;`
			`unsigned long size;`
			`char *buf;`
			`const char *c;`

			`root->tree = t = new_tree_content(8);`
Implemented tree delta compression in fast-import. We now store for every tree entry two modes and two sha1 values; the base (aka "version 0") and the current/new (aka "version 1"). When we generate a tree object we also regenerate the prior version object and use that as our base object for a delta. This strategy saves a significant amount of memory as we can continue to use the atom pool for file/directory names and only increases each tree entry by an additional 24 bytes of memory. Branches should automatically delta against their ancestor tree, unless the ancestor tree is already at the delta chain limit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-28 18:22:50 +02:00			`if (is_null_sha1(sha1))`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`return;`

Implemented tree delta compression in fast-import. We now store for every tree entry two modes and two sha1 values; the base (aka "version 0") and the current/new (aka "version 1"). When we generate a tree object we also regenerate the prior version object and use that as our base object for a delta. This strategy saves a significant amount of memory as we can continue to use the atom pool for file/directory names and only increases each tree entry by an additional 24 bytes of memory. Branches should automatically delta against their ancestor tree, unless the ancestor tree is already at the delta chain limit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-28 18:22:50 +02:00			`myoe = find_object(sha1);`
fast-import: Fix crash when referencing already existing objects Commit a5c1780a0355a71b9fb70f1f1977ce726ee5b8d8 sets the pack_id of existing objects to MAX_PACK_ID. When the same object is referenced later again it is found in the local object hash. With such a pack_id fast-import should not try to locate that object in the newly created pack(s). Signed-off-by: Simon Hausmann <simon@lst.de> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-05-23 23:01:49 +02:00			`if (myoe && myoe->pack_id != MAX_PACK_ID) {`
Implemented tree reloading in fast-import. Tree reloading allows fast-import to swap out the least-recently used branch by simply deallocating the data structures from memory that were associated with that branch. Later if the branch becomes active again it can lazily recreate those structures on demand by reloading the necessary trees from the pack file it originally wrote them to. The reloading process is implemented by mmap'ing the pack into memory and using a much tighter variant of the pack reading code contained in sha1_file.c. This was a blatent copy from sha1_file.c but the unpacking functions were significantly simplified and are actually now in a form that should make it easier to map only the necessary regions of a pack rather than the entire file. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-24 10:37:35 +02:00			`if (myoe->type != OBJ_TREE)`
Implemented tree delta compression in fast-import. We now store for every tree entry two modes and two sha1 values; the base (aka "version 0") and the current/new (aka "version 1"). When we generate a tree object we also regenerate the prior version object and use that as our base object for a delta. This strategy saves a significant amount of memory as we can continue to use the atom pool for file/directory names and only increases each tree entry by an additional 24 bytes of memory. Branches should automatically delta against their ancestor tree, unless the ancestor tree is already at the delta chain limit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-28 18:22:50 +02:00			`die("Not a tree: %s", sha1_to_hex(sha1));`
Don't allow fast-import tree delta chains to exceed maximum depth Brian Downing noticed fast-import can produce tree depths of up to 6,035 objects and even deeper. Long delta chains can create very small packfiles but cause problems during repacking as git needs to unpack each tree to count the reachable blobs. What's happening here is the active branch cache isn't big enough. We're swapping out the branch and thus recycling the tree information (struct tree_content) back into the free pool. When we later reload the tree we set the delta_depth to 0 but we kept the tree we just reloaded as a delta base. So if the tree we reloaded was already at the maximum depth we wouldn't know it and make the new tree a delta. Multiply the number of times the branch cache has to swap out the tree times max_depth (10) and you get the maximum delta depth of a tree created by fast-import. In Brian's case above the active branch cache had to swap the branch out 603/604 times during this import to produce a tree with a delta depth of 6035. Acked-by: Brian Downing <bdowning@lavos.net> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-14 05:48:42 +01:00			`t->delta_depth = myoe->depth;`
Implemented manual packfile switching in fast-import. To help importers which are dealing with massive amounts of data fast-import needs to be able to close the packfile it is currently writing to and open a new packfile for any additional data that will be received. A new 'checkpoint' command has been introduced which can be used by the frontend import process to force this to occur at any time. This may be useful to ensure a very long running import doesn't lose any work due to unexpected failures. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 12:35:41 +01:00			`buf = gfi_unpack_entry(myoe, &size);`
fast-import: check return value from unpack_entry() If the tree object we have asked for is deltafied in the packfile and the delta did not apply correctly or was not able to be decompressed from the packfile then we can get back NULL instead of the tree data. This is (part of) the reason why read_sha1_file() can return NULL, so we need to also handle it the same way. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-02-14 07:34:34 +01:00			`if (!buf)`
			`die("Can't load tree %s", sha1_to_hex(sha1));`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`} else {`
convert object type handling from a string to a number We currently have two parallel notation for dealing with object types in the code: a string and a numerical value. One of them is obviously redundent, and the most used one requires more stack space and a bunch of strcmp() all over the place. This is an initial step for the removal of the version using a char array found in object reading code paths. The patch is unfortunately large but there is no sane way to split it in smaller parts without breaking the system. Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-26 20:55:59 +01:00			`enum object_type type;`
			`buf = read_sha1_file(sha1, &type, &size);`
			`if (!buf \|\| type != OBJ_TREE)`
Implemented tree delta compression in fast-import. We now store for every tree entry two modes and two sha1 values; the base (aka "version 0") and the current/new (aka "version 1"). When we generate a tree object we also regenerate the prior version object and use that as our base object for a delta. This strategy saves a significant amount of memory as we can continue to use the atom pool for file/directory names and only increases each tree entry by an additional 24 bytes of memory. Branches should automatically delta against their ancestor tree, unless the ancestor tree is already at the delta chain limit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-28 18:22:50 +02:00			`die("Can't load tree %s", sha1_to_hex(sha1));`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`}`

			`c = buf;`
			`while (c != (buf + size)) {`
			`struct tree_entry *e = new_tree_entry();`

			`if (t->entry_count == t->entry_capacity)`
fast-import: grow tree storage more aggressively When building up a tree for a commit, fast-import dynamically allocates memory for the tree entries. When more space is needed, the allocated memory is increased by a constant amount. For very large trees, this means re-allocating and memcpy()ing the memory O(n) times. To compound this problem, releasing the previous tree resource does not free the memory; it is kept in a pool for future trees. This means that each of the O(n) allocations will consume increasing amounts of memory, giving O(n^2) memory consumption. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-03-11 03:39:17 +01:00			`root->tree = t = grow_tree_content(t, t->entry_count);`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`t->entries[t->entry_count++] = e;`

			`e->tree = NULL;`
Implemented tree delta compression in fast-import. We now store for every tree entry two modes and two sha1 values; the base (aka "version 0") and the current/new (aka "version 1"). When we generate a tree object we also regenerate the prior version object and use that as our base object for a delta. This strategy saves a significant amount of memory as we can continue to use the atom pool for file/directory names and only increases each tree entry by an additional 24 bytes of memory. Branches should automatically delta against their ancestor tree, unless the ancestor tree is already at the delta chain limit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-28 18:22:50 +02:00			`c = get_mode(c, &e->versions[1].mode);`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`if (!c)`
Implemented tree delta compression in fast-import. We now store for every tree entry two modes and two sha1 values; the base (aka "version 0") and the current/new (aka "version 1"). When we generate a tree object we also regenerate the prior version object and use that as our base object for a delta. This strategy saves a significant amount of memory as we can continue to use the atom pool for file/directory names and only increases each tree entry by an additional 24 bytes of memory. Branches should automatically delta against their ancestor tree, unless the ancestor tree is already at the delta chain limit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-28 18:22:50 +02:00			`die("Corrupt mode in %s", sha1_to_hex(sha1));`
			`e->versions[0].mode = e->versions[1].mode;`
Remove unnecessary casts from fast-import Jeff King pointed out that these casts are quite unnecessary, as the compiler should be doing them anyway, and may cause problems in the future if the size of the argument for to_atom were to ever be increased. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-03-12 20:48:37 +01:00			`e->name = to_atom(c, strlen(c));`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`c += e->name->str_len + 1;`
Implemented tree delta compression in fast-import. We now store for every tree entry two modes and two sha1 values; the base (aka "version 0") and the current/new (aka "version 1"). When we generate a tree object we also regenerate the prior version object and use that as our base object for a delta. This strategy saves a significant amount of memory as we can continue to use the atom pool for file/directory names and only increases each tree entry by an additional 24 bytes of memory. Branches should automatically delta against their ancestor tree, unless the ancestor tree is already at the delta chain limit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-28 18:22:50 +02:00			`hashcpy(e->versions[0].sha1, (unsigned char*)c);`
			`hashcpy(e->versions[1].sha1, (unsigned char*)c);`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`c += 20;`
			`}`
			`free(buf);`
			`}`

Implemented tree delta compression in fast-import. We now store for every tree entry two modes and two sha1 values; the base (aka "version 0") and the current/new (aka "version 1"). When we generate a tree object we also regenerate the prior version object and use that as our base object for a delta. This strategy saves a significant amount of memory as we can continue to use the atom pool for file/directory names and only increases each tree entry by an additional 24 bytes of memory. Branches should automatically delta against their ancestor tree, unless the ancestor tree is already at the delta chain limit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-28 18:22:50 +02:00			`static int tecmp0 (const void _a, const void _b)`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`{`
			`struct tree_entry a = ((struct tree_entry**)_a);`
			`struct tree_entry b = ((struct tree_entry**)_b);`
			`return base_name_compare(`
Implemented tree delta compression in fast-import. We now store for every tree entry two modes and two sha1 values; the base (aka "version 0") and the current/new (aka "version 1"). When we generate a tree object we also regenerate the prior version object and use that as our base object for a delta. This strategy saves a significant amount of memory as we can continue to use the atom pool for file/directory names and only increases each tree entry by an additional 24 bytes of memory. Branches should automatically delta against their ancestor tree, unless the ancestor tree is already at the delta chain limit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-28 18:22:50 +02:00			`a->name->str_dat, a->name->str_len, a->versions[0].mode,`
			`b->name->str_dat, b->name->str_len, b->versions[0].mode);`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`}`

Implemented tree delta compression in fast-import. We now store for every tree entry two modes and two sha1 values; the base (aka "version 0") and the current/new (aka "version 1"). When we generate a tree object we also regenerate the prior version object and use that as our base object for a delta. This strategy saves a significant amount of memory as we can continue to use the atom pool for file/directory names and only increases each tree entry by an additional 24 bytes of memory. Branches should automatically delta against their ancestor tree, unless the ancestor tree is already at the delta chain limit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-28 18:22:50 +02:00			`static int tecmp1 (const void _a, const void _b)`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`{`
Implemented tree delta compression in fast-import. We now store for every tree entry two modes and two sha1 values; the base (aka "version 0") and the current/new (aka "version 1"). When we generate a tree object we also regenerate the prior version object and use that as our base object for a delta. This strategy saves a significant amount of memory as we can continue to use the atom pool for file/directory names and only increases each tree entry by an additional 24 bytes of memory. Branches should automatically delta against their ancestor tree, unless the ancestor tree is already at the delta chain limit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-28 18:22:50 +02:00			`struct tree_entry a = ((struct tree_entry**)_a);`
			`struct tree_entry b = ((struct tree_entry**)_b);`
			`return base_name_compare(`
			`a->name->str_dat, a->name->str_len, a->versions[1].mode,`
			`b->name->str_dat, b->name->str_len, b->versions[1].mode);`
			`}`

fast-import was using dbuf's, replace them with strbuf's. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-17 13:48:17 +02:00			`static void mktree(struct tree_content t, int v, struct strbuf b)`
Implemented tree delta compression in fast-import. We now store for every tree entry two modes and two sha1 values; the base (aka "version 0") and the current/new (aka "version 1"). When we generate a tree object we also regenerate the prior version object and use that as our base object for a delta. This strategy saves a significant amount of memory as we can continue to use the atom pool for file/directory names and only increases each tree entry by an additional 24 bytes of memory. Branches should automatically delta against their ancestor tree, unless the ancestor tree is already at the delta chain limit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-28 18:22:50 +02:00			`{`
			`size_t maxlen = 0;`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`unsigned int i;`

Implemented tree delta compression in fast-import. We now store for every tree entry two modes and two sha1 values; the base (aka "version 0") and the current/new (aka "version 1"). When we generate a tree object we also regenerate the prior version object and use that as our base object for a delta. This strategy saves a significant amount of memory as we can continue to use the atom pool for file/directory names and only increases each tree entry by an additional 24 bytes of memory. Branches should automatically delta against their ancestor tree, unless the ancestor tree is already at the delta chain limit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-28 18:22:50 +02:00			`if (!v)`
			`qsort(t->entries,t->entry_count,sizeof(t->entries[0]),tecmp0);`
			`else`
			`qsort(t->entries,t->entry_count,sizeof(t->entries[0]),tecmp1);`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00
			`for (i = 0; i < t->entry_count; i++) {`
Implemented tree delta compression in fast-import. We now store for every tree entry two modes and two sha1 values; the base (aka "version 0") and the current/new (aka "version 1"). When we generate a tree object we also regenerate the prior version object and use that as our base object for a delta. This strategy saves a significant amount of memory as we can continue to use the atom pool for file/directory names and only increases each tree entry by an additional 24 bytes of memory. Branches should automatically delta against their ancestor tree, unless the ancestor tree is already at the delta chain limit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-28 18:22:50 +02:00			`if (t->entries[i]->versions[v].mode)`
			`maxlen += t->entries[i]->name->str_len + 34;`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`}`

fast-import was using dbuf's, replace them with strbuf's. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-17 13:48:17 +02:00			`strbuf_reset(b);`
			`strbuf_grow(b, maxlen);`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`for (i = 0; i < t->entry_count; i++) {`
			`struct tree_entry *e = t->entries[i];`
Implemented tree delta compression in fast-import. We now store for every tree entry two modes and two sha1 values; the base (aka "version 0") and the current/new (aka "version 1"). When we generate a tree object we also regenerate the prior version object and use that as our base object for a delta. This strategy saves a significant amount of memory as we can continue to use the atom pool for file/directory names and only increases each tree entry by an additional 24 bytes of memory. Branches should automatically delta against their ancestor tree, unless the ancestor tree is already at the delta chain limit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-28 18:22:50 +02:00			`if (!e->versions[v].mode)`
			`continue;`
fast-import was using dbuf's, replace them with strbuf's. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-17 13:48:17 +02:00			`strbuf_addf(b, "%o %s%c", (unsigned int)e->versions[v].mode,`
			`e->name->str_dat, '\0');`
			`strbuf_add(b, e->versions[v].sha1, 20);`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`}`
Implemented tree delta compression in fast-import. We now store for every tree entry two modes and two sha1 values; the base (aka "version 0") and the current/new (aka "version 1"). When we generate a tree object we also regenerate the prior version object and use that as our base object for a delta. This strategy saves a significant amount of memory as we can continue to use the atom pool for file/directory names and only increases each tree entry by an additional 24 bytes of memory. Branches should automatically delta against their ancestor tree, unless the ancestor tree is already at the delta chain limit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-28 18:22:50 +02:00			`}`

			`static void store_tree(struct tree_entry *root)`
			`{`
			`struct tree_content *t = root->tree;`
			`unsigned int i, j, del;`
fast-import optimization: Now that cmd_data acts on a strbuf, make last_object stashed buffer be a strbuf as well. On new stash, don't free the last stashed buffer, rather swap it with the one you will stash, this way, callers of store_object can act on static strbufs, and at some point, fast-import won't allocate new memory for objects buffers. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-17 14:00:38 +02:00			`struct last_object lo = { STRBUF_INIT, 0, 0, /* no_swap */ 1 };`
Improve reuse of sha1_file library within fast-import. Now that the sha1_file.c library routines use the sliding mmap routines to perform efficient access to portions of a packfile I can remove that code from fast-import.c and just invoke it. One benefit is we now have reloading support for any packfile which uses OBJ_OFS_DELTA. Another is we have significantly less code to maintain. This code reuse change requires that fast-import generate only an OBJ_OFS_DELTA format packfile, as there is absolutely no index available to perform OBJ_REF_DELTA lookup in while unpacking an object. This is probably reasonable to require as the delta offsets result in smaller packfiles and are faster to unpack, as no index searching is required. Its also only a temporary requirement as users could always repack without offsets before making the import available to older versions of Git. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-14 12:20:23 +01:00			`struct object_entry *le;`
Implemented tree delta compression in fast-import. We now store for every tree entry two modes and two sha1 values; the base (aka "version 0") and the current/new (aka "version 1"). When we generate a tree object we also regenerate the prior version object and use that as our base object for a delta. This strategy saves a significant amount of memory as we can continue to use the atom pool for file/directory names and only increases each tree entry by an additional 24 bytes of memory. Branches should automatically delta against their ancestor tree, unless the ancestor tree is already at the delta chain limit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-28 18:22:50 +02:00
			`if (!is_null_sha1(root->versions[1].sha1))`
			`return;`

			`for (i = 0; i < t->entry_count; i++) {`
			`if (t->entries[i]->tree)`
			`store_tree(t->entries[i]);`
			`}`

Improve reuse of sha1_file library within fast-import. Now that the sha1_file.c library routines use the sliding mmap routines to perform efficient access to portions of a packfile I can remove that code from fast-import.c and just invoke it. One benefit is we now have reloading support for any packfile which uses OBJ_OFS_DELTA. Another is we have significantly less code to maintain. This code reuse change requires that fast-import generate only an OBJ_OFS_DELTA format packfile, as there is absolutely no index available to perform OBJ_REF_DELTA lookup in while unpacking an object. This is probably reasonable to require as the delta offsets result in smaller packfiles and are faster to unpack, as no index searching is required. Its also only a temporary requirement as users could always repack without offsets before making the import available to older versions of Git. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-14 12:20:23 +01:00			`le = find_object(root->versions[0].sha1);`
fast-import optimization: Now that cmd_data acts on a strbuf, make last_object stashed buffer be a strbuf as well. On new stash, don't free the last stashed buffer, rather swap it with the one you will stash, this way, callers of store_object can act on static strbufs, and at some point, fast-import won't allocate new memory for objects buffers. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-17 14:00:38 +02:00			`if (S_ISDIR(root->versions[0].mode) && le && le->pack_id == pack_id) {`
fast-import was using dbuf's, replace them with strbuf's. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-17 13:48:17 +02:00			`mktree(t, 0, &old_tree);`
fast-import optimization: Now that cmd_data acts on a strbuf, make last_object stashed buffer be a strbuf as well. On new stash, don't free the last stashed buffer, rather swap it with the one you will stash, this way, callers of store_object can act on static strbufs, and at some point, fast-import won't allocate new memory for objects buffers. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-17 14:00:38 +02:00			`lo.data = old_tree;`
Improve reuse of sha1_file library within fast-import. Now that the sha1_file.c library routines use the sliding mmap routines to perform efficient access to portions of a packfile I can remove that code from fast-import.c and just invoke it. One benefit is we now have reloading support for any packfile which uses OBJ_OFS_DELTA. Another is we have significantly less code to maintain. This code reuse change requires that fast-import generate only an OBJ_OFS_DELTA format packfile, as there is absolutely no index available to perform OBJ_REF_DELTA lookup in while unpacking an object. This is probably reasonable to require as the delta offsets result in smaller packfiles and are faster to unpack, as no index searching is required. Its also only a temporary requirement as users could always repack without offsets before making the import available to older versions of Git. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-14 12:20:23 +01:00			`lo.offset = le->offset;`
Implemented tree delta compression in fast-import. We now store for every tree entry two modes and two sha1 values; the base (aka "version 0") and the current/new (aka "version 1"). When we generate a tree object we also regenerate the prior version object and use that as our base object for a delta. This strategy saves a significant amount of memory as we can continue to use the atom pool for file/directory names and only increases each tree entry by an additional 24 bytes of memory. Branches should automatically delta against their ancestor tree, unless the ancestor tree is already at the delta chain limit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-28 18:22:50 +02:00			`lo.depth = t->delta_depth;`
			`}`

fast-import was using dbuf's, replace them with strbuf's. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-17 13:48:17 +02:00			`mktree(t, 1, &new_tree);`
			`store_object(OBJ_TREE, &new_tree, &lo, root->versions[1].sha1, 0);`
Implemented tree delta compression in fast-import. We now store for every tree entry two modes and two sha1 values; the base (aka "version 0") and the current/new (aka "version 1"). When we generate a tree object we also regenerate the prior version object and use that as our base object for a delta. This strategy saves a significant amount of memory as we can continue to use the atom pool for file/directory names and only increases each tree entry by an additional 24 bytes of memory. Branches should automatically delta against their ancestor tree, unless the ancestor tree is already at the delta chain limit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-28 18:22:50 +02:00
			`t->delta_depth = lo.depth;`
			`for (i = 0, j = 0, del = 0; i < t->entry_count; i++) {`
			`struct tree_entry *e = t->entries[i];`
			`if (e->versions[1].mode) {`
			`e->versions[0].mode = e->versions[1].mode;`
			`hashcpy(e->versions[0].sha1, e->versions[1].sha1);`
			`t->entries[j++] = e;`
			`} else {`
			`release_tree_entry(e);`
			`del++;`
			`}`
			`}`
			`t->entry_count -= del;`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`}`

			`static int tree_content_set(`
			`struct tree_entry *root,`
			`const char *p,`
			`const unsigned char *sha1,`
Support wholesale directory renames in fast-import Some source material (e.g. Subversion dump files) perform directory renames without telling us exactly which files in that subdirectory were moved. This makes it hard for a frontend to convert such data formats to a fast-import stream, as all the frontend has on hand is "Rename a/ to b/" with no details about what files are in a/, unless the frontend also kept track of all files. The new 'R' subcommand within a commit allows the frontend to rename either a file or an entire subdirectory, without needing to know the object's SHA-1 or the specific files contained within it. The rename is performed as efficiently as possible internally, making it cheaper than a 'D'/'M' pair for a file rename. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-07-10 04:58:23 +02:00			`const uint16_t mode,`
			`struct tree_content *subtree)`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`{`
			`struct tree_content *t = root->tree;`
			`const char *slash1;`
			`unsigned int i, n;`
			`struct tree_entry *e;`

			`slash1 = strchr(p, '/');`
			`if (slash1)`
			`n = slash1 - p;`
			`else`
			`n = strlen(p);`
Don't allow empty pathnames in fast-import riddochc on #git noticed corruption caused by import-tars. This was fixed in the prior commit by Dscho, but fast-import was wrong to have allowed a tree to be created with an empty string as the filename. No operating system allows this, and Git itself doesn't accept this into the index. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-04-29 02:01:27 +02:00			`if (!n)`
			`die("Empty path component found in input");`
Support wholesale directory renames in fast-import Some source material (e.g. Subversion dump files) perform directory renames without telling us exactly which files in that subdirectory were moved. This makes it hard for a frontend to convert such data formats to a fast-import stream, as all the frontend has on hand is "Rename a/ to b/" with no details about what files are in a/, unless the frontend also kept track of all files. The new 'R' subcommand within a commit allows the frontend to rename either a file or an entire subdirectory, without needing to know the object's SHA-1 or the specific files contained within it. The rename is performed as efficiently as possible internally, making it cheaper than a 'D'/'M' pair for a file rename. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-07-10 04:58:23 +02:00			`if (!slash1 && !S_ISDIR(mode) && subtree)`
			`die("Non-directories cannot have subtrees");`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00
			`for (i = 0; i < t->entry_count; i++) {`
			`e = t->entries[i];`
			`if (e->name->str_len == n && !strncmp(p, e->name->str_dat, n)) {`
			`if (!slash1) {`
Support wholesale directory renames in fast-import Some source material (e.g. Subversion dump files) perform directory renames without telling us exactly which files in that subdirectory were moved. This makes it hard for a frontend to convert such data formats to a fast-import stream, as all the frontend has on hand is "Rename a/ to b/" with no details about what files are in a/, unless the frontend also kept track of all files. The new 'R' subcommand within a commit allows the frontend to rename either a file or an entire subdirectory, without needing to know the object's SHA-1 or the specific files contained within it. The rename is performed as efficiently as possible internally, making it cheaper than a 'D'/'M' pair for a file rename. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-07-10 04:58:23 +02:00			`if (!S_ISDIR(mode)`
			`&& e->versions[1].mode == mode`
Implemented tree delta compression in fast-import. We now store for every tree entry two modes and two sha1 values; the base (aka "version 0") and the current/new (aka "version 1"). When we generate a tree object we also regenerate the prior version object and use that as our base object for a delta. This strategy saves a significant amount of memory as we can continue to use the atom pool for file/directory names and only increases each tree entry by an additional 24 bytes of memory. Branches should automatically delta against their ancestor tree, unless the ancestor tree is already at the delta chain limit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-28 18:22:50 +02:00			`&& !hashcmp(e->versions[1].sha1, sha1))`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`return 0;`
Implemented tree delta compression in fast-import. We now store for every tree entry two modes and two sha1 values; the base (aka "version 0") and the current/new (aka "version 1"). When we generate a tree object we also regenerate the prior version object and use that as our base object for a delta. This strategy saves a significant amount of memory as we can continue to use the atom pool for file/directory names and only increases each tree entry by an additional 24 bytes of memory. Branches should automatically delta against their ancestor tree, unless the ancestor tree is already at the delta chain limit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-28 18:22:50 +02:00			`e->versions[1].mode = mode;`
			`hashcpy(e->versions[1].sha1, sha1);`
Support wholesale directory renames in fast-import Some source material (e.g. Subversion dump files) perform directory renames without telling us exactly which files in that subdirectory were moved. This makes it hard for a frontend to convert such data formats to a fast-import stream, as all the frontend has on hand is "Rename a/ to b/" with no details about what files are in a/, unless the frontend also kept track of all files. The new 'R' subcommand within a commit allows the frontend to rename either a file or an entire subdirectory, without needing to know the object's SHA-1 or the specific files contained within it. The rename is performed as efficiently as possible internally, making it cheaper than a 'D'/'M' pair for a file rename. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-07-10 04:58:23 +02:00			`if (e->tree)`
Fixed segfault in fast-import after growing a tree. Growing a tree caused all subtrees to be deallocated and put back into the free list yet those subtree's contents were still actively in use. Consequently they were doled out again and got stomped on elsewhere. Releasing a tree is now performed in two parts, either releasing only the content array or releasing the content array and recursively releasing the subtree(s). Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-23 07:33:47 +02:00			`release_tree_content_recursive(e->tree);`
Support wholesale directory renames in fast-import Some source material (e.g. Subversion dump files) perform directory renames without telling us exactly which files in that subdirectory were moved. This makes it hard for a frontend to convert such data formats to a fast-import stream, as all the frontend has on hand is "Rename a/ to b/" with no details about what files are in a/, unless the frontend also kept track of all files. The new 'R' subcommand within a commit allows the frontend to rename either a file or an entire subdirectory, without needing to know the object's SHA-1 or the specific files contained within it. The rename is performed as efficiently as possible internally, making it cheaper than a 'D'/'M' pair for a file rename. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-07-10 04:58:23 +02:00			`e->tree = subtree;`
Implemented tree delta compression in fast-import. We now store for every tree entry two modes and two sha1 values; the base (aka "version 0") and the current/new (aka "version 1"). When we generate a tree object we also regenerate the prior version object and use that as our base object for a delta. This strategy saves a significant amount of memory as we can continue to use the atom pool for file/directory names and only increases each tree entry by an additional 24 bytes of memory. Branches should automatically delta against their ancestor tree, unless the ancestor tree is already at the delta chain limit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-28 18:22:50 +02:00			`hashclr(root->versions[1].sha1);`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`return 1;`
			`}`
Implemented tree delta compression in fast-import. We now store for every tree entry two modes and two sha1 values; the base (aka "version 0") and the current/new (aka "version 1"). When we generate a tree object we also regenerate the prior version object and use that as our base object for a delta. This strategy saves a significant amount of memory as we can continue to use the atom pool for file/directory names and only increases each tree entry by an additional 24 bytes of memory. Branches should automatically delta against their ancestor tree, unless the ancestor tree is already at the delta chain limit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-28 18:22:50 +02:00			`if (!S_ISDIR(e->versions[1].mode)) {`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`e->tree = new_tree_content(8);`
Implemented tree delta compression in fast-import. We now store for every tree entry two modes and two sha1 values; the base (aka "version 0") and the current/new (aka "version 1"). When we generate a tree object we also regenerate the prior version object and use that as our base object for a delta. This strategy saves a significant amount of memory as we can continue to use the atom pool for file/directory names and only increases each tree entry by an additional 24 bytes of memory. Branches should automatically delta against their ancestor tree, unless the ancestor tree is already at the delta chain limit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-28 18:22:50 +02:00			`e->versions[1].mode = S_IFDIR;`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`}`
			`if (!e->tree)`
			`load_tree(e);`
Support wholesale directory renames in fast-import Some source material (e.g. Subversion dump files) perform directory renames without telling us exactly which files in that subdirectory were moved. This makes it hard for a frontend to convert such data formats to a fast-import stream, as all the frontend has on hand is "Rename a/ to b/" with no details about what files are in a/, unless the frontend also kept track of all files. The new 'R' subcommand within a commit allows the frontend to rename either a file or an entire subdirectory, without needing to know the object's SHA-1 or the specific files contained within it. The rename is performed as efficiently as possible internally, making it cheaper than a 'D'/'M' pair for a file rename. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-07-10 04:58:23 +02:00			`if (tree_content_set(e, slash1 + 1, sha1, mode, subtree)) {`
Implemented tree delta compression in fast-import. We now store for every tree entry two modes and two sha1 values; the base (aka "version 0") and the current/new (aka "version 1"). When we generate a tree object we also regenerate the prior version object and use that as our base object for a delta. This strategy saves a significant amount of memory as we can continue to use the atom pool for file/directory names and only increases each tree entry by an additional 24 bytes of memory. Branches should automatically delta against their ancestor tree, unless the ancestor tree is already at the delta chain limit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-28 18:22:50 +02:00			`hashclr(root->versions[1].sha1);`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`return 1;`
			`}`
			`return 0;`
			`}`
			`}`

			`if (t->entry_count == t->entry_capacity)`
fast-import: grow tree storage more aggressively When building up a tree for a commit, fast-import dynamically allocates memory for the tree entries. When more space is needed, the allocated memory is increased by a constant amount. For very large trees, this means re-allocating and memcpy()ing the memory O(n) times. To compound this problem, releasing the previous tree resource does not free the memory; it is kept in a pool for future trees. This means that each of the O(n) allocations will consume increasing amounts of memory, giving O(n^2) memory consumption. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-03-11 03:39:17 +01:00			`root->tree = t = grow_tree_content(t, t->entry_count);`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`e = new_tree_entry();`
Remove unnecessary casts from fast-import Jeff King pointed out that these casts are quite unnecessary, as the compiler should be doing them anyway, and may cause problems in the future if the size of the argument for to_atom were to ever be increased. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-03-12 20:48:37 +01:00			`e->name = to_atom(p, n);`
Implemented tree delta compression in fast-import. We now store for every tree entry two modes and two sha1 values; the base (aka "version 0") and the current/new (aka "version 1"). When we generate a tree object we also regenerate the prior version object and use that as our base object for a delta. This strategy saves a significant amount of memory as we can continue to use the atom pool for file/directory names and only increases each tree entry by an additional 24 bytes of memory. Branches should automatically delta against their ancestor tree, unless the ancestor tree is already at the delta chain limit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-28 18:22:50 +02:00			`e->versions[0].mode = 0;`
			`hashclr(e->versions[0].sha1);`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`t->entries[t->entry_count++] = e;`
			`if (slash1) {`
			`e->tree = new_tree_content(8);`
Implemented tree delta compression in fast-import. We now store for every tree entry two modes and two sha1 values; the base (aka "version 0") and the current/new (aka "version 1"). When we generate a tree object we also regenerate the prior version object and use that as our base object for a delta. This strategy saves a significant amount of memory as we can continue to use the atom pool for file/directory names and only increases each tree entry by an additional 24 bytes of memory. Branches should automatically delta against their ancestor tree, unless the ancestor tree is already at the delta chain limit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-28 18:22:50 +02:00			`e->versions[1].mode = S_IFDIR;`
Support wholesale directory renames in fast-import Some source material (e.g. Subversion dump files) perform directory renames without telling us exactly which files in that subdirectory were moved. This makes it hard for a frontend to convert such data formats to a fast-import stream, as all the frontend has on hand is "Rename a/ to b/" with no details about what files are in a/, unless the frontend also kept track of all files. The new 'R' subcommand within a commit allows the frontend to rename either a file or an entire subdirectory, without needing to know the object's SHA-1 or the specific files contained within it. The rename is performed as efficiently as possible internally, making it cheaper than a 'D'/'M' pair for a file rename. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-07-10 04:58:23 +02:00			`tree_content_set(e, slash1 + 1, sha1, mode, subtree);`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`} else {`
Support wholesale directory renames in fast-import Some source material (e.g. Subversion dump files) perform directory renames without telling us exactly which files in that subdirectory were moved. This makes it hard for a frontend to convert such data formats to a fast-import stream, as all the frontend has on hand is "Rename a/ to b/" with no details about what files are in a/, unless the frontend also kept track of all files. The new 'R' subcommand within a commit allows the frontend to rename either a file or an entire subdirectory, without needing to know the object's SHA-1 or the specific files contained within it. The rename is performed as efficiently as possible internally, making it cheaper than a 'D'/'M' pair for a file rename. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-07-10 04:58:23 +02:00			`e->tree = subtree;`
Implemented tree delta compression in fast-import. We now store for every tree entry two modes and two sha1 values; the base (aka "version 0") and the current/new (aka "version 1"). When we generate a tree object we also regenerate the prior version object and use that as our base object for a delta. This strategy saves a significant amount of memory as we can continue to use the atom pool for file/directory names and only increases each tree entry by an additional 24 bytes of memory. Branches should automatically delta against their ancestor tree, unless the ancestor tree is already at the delta chain limit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-28 18:22:50 +02:00			`e->versions[1].mode = mode;`
			`hashcpy(e->versions[1].sha1, sha1);`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`}`
Implemented tree delta compression in fast-import. We now store for every tree entry two modes and two sha1 values; the base (aka "version 0") and the current/new (aka "version 1"). When we generate a tree object we also regenerate the prior version object and use that as our base object for a delta. This strategy saves a significant amount of memory as we can continue to use the atom pool for file/directory names and only increases each tree entry by an additional 24 bytes of memory. Branches should automatically delta against their ancestor tree, unless the ancestor tree is already at the delta chain limit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-28 18:22:50 +02:00			`hashclr(root->versions[1].sha1);`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`return 1;`
			`}`

Support wholesale directory renames in fast-import Some source material (e.g. Subversion dump files) perform directory renames without telling us exactly which files in that subdirectory were moved. This makes it hard for a frontend to convert such data formats to a fast-import stream, as all the frontend has on hand is "Rename a/ to b/" with no details about what files are in a/, unless the frontend also kept track of all files. The new 'R' subcommand within a commit allows the frontend to rename either a file or an entire subdirectory, without needing to know the object's SHA-1 or the specific files contained within it. The rename is performed as efficiently as possible internally, making it cheaper than a 'D'/'M' pair for a file rename. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-07-10 04:58:23 +02:00			`static int tree_content_remove(`
			`struct tree_entry *root,`
			`const char *p,`
			`struct tree_entry *backup_leaf)`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`{`
			`struct tree_content *t = root->tree;`
			`const char *slash1;`
			`unsigned int i, n;`
			`struct tree_entry *e;`

			`slash1 = strchr(p, '/');`
			`if (slash1)`
			`n = slash1 - p;`
			`else`
			`n = strlen(p);`

			`for (i = 0; i < t->entry_count; i++) {`
			`e = t->entries[i];`
			`if (e->name->str_len == n && !strncmp(p, e->name->str_dat, n)) {`
Implemented tree delta compression in fast-import. We now store for every tree entry two modes and two sha1 values; the base (aka "version 0") and the current/new (aka "version 1"). When we generate a tree object we also regenerate the prior version object and use that as our base object for a delta. This strategy saves a significant amount of memory as we can continue to use the atom pool for file/directory names and only increases each tree entry by an additional 24 bytes of memory. Branches should automatically delta against their ancestor tree, unless the ancestor tree is already at the delta chain limit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-28 18:22:50 +02:00			`if (!slash1 \|\| !S_ISDIR(e->versions[1].mode))`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`goto del_entry;`
			`if (!e->tree)`
			`load_tree(e);`
Support wholesale directory renames in fast-import Some source material (e.g. Subversion dump files) perform directory renames without telling us exactly which files in that subdirectory were moved. This makes it hard for a frontend to convert such data formats to a fast-import stream, as all the frontend has on hand is "Rename a/ to b/" with no details about what files are in a/, unless the frontend also kept track of all files. The new 'R' subcommand within a commit allows the frontend to rename either a file or an entire subdirectory, without needing to know the object's SHA-1 or the specific files contained within it. The rename is performed as efficiently as possible internally, making it cheaper than a 'D'/'M' pair for a file rename. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-07-10 04:58:23 +02:00			`if (tree_content_remove(e, slash1 + 1, backup_leaf)) {`
Correct tree corruption problems in fast-import. The new tree delta implementation caused blob SHA1s to be used instead of a tree SHA1 when a tree was written out. This really only appeared to happen when converting an existing file to a tree, but may have been possible in some other situations. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-29 03:43:04 +02:00			`for (n = 0; n < e->tree->entry_count; n++) {`
			`if (e->tree->entries[n]->versions[1].mode) {`
			`hashclr(root->versions[1].sha1);`
			`return 1;`
			`}`
			`}`
Support wholesale directory renames in fast-import Some source material (e.g. Subversion dump files) perform directory renames without telling us exactly which files in that subdirectory were moved. This makes it hard for a frontend to convert such data formats to a fast-import stream, as all the frontend has on hand is "Rename a/ to b/" with no details about what files are in a/, unless the frontend also kept track of all files. The new 'R' subcommand within a commit allows the frontend to rename either a file or an entire subdirectory, without needing to know the object's SHA-1 or the specific files contained within it. The rename is performed as efficiently as possible internally, making it cheaper than a 'D'/'M' pair for a file rename. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-07-10 04:58:23 +02:00			`backup_leaf = NULL;`
Correct tree corruption problems in fast-import. The new tree delta implementation caused blob SHA1s to be used instead of a tree SHA1 when a tree was written out. This really only appeared to happen when converting an existing file to a tree, but may have been possible in some other situations. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-29 03:43:04 +02:00			`goto del_entry;`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`}`
			`return 0;`
			`}`
			`}`
			`return 0;`

			`del_entry:`
Support wholesale directory renames in fast-import Some source material (e.g. Subversion dump files) perform directory renames without telling us exactly which files in that subdirectory were moved. This makes it hard for a frontend to convert such data formats to a fast-import stream, as all the frontend has on hand is "Rename a/ to b/" with no details about what files are in a/, unless the frontend also kept track of all files. The new 'R' subcommand within a commit allows the frontend to rename either a file or an entire subdirectory, without needing to know the object's SHA-1 or the specific files contained within it. The rename is performed as efficiently as possible internally, making it cheaper than a 'D'/'M' pair for a file rename. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-07-10 04:58:23 +02:00			`if (backup_leaf)`
			`memcpy(backup_leaf, e, sizeof(*backup_leaf));`
			`else if (e->tree)`
Implemented tree delta compression in fast-import. We now store for every tree entry two modes and two sha1 values; the base (aka "version 0") and the current/new (aka "version 1"). When we generate a tree object we also regenerate the prior version object and use that as our base object for a delta. This strategy saves a significant amount of memory as we can continue to use the atom pool for file/directory names and only increases each tree entry by an additional 24 bytes of memory. Branches should automatically delta against their ancestor tree, unless the ancestor tree is already at the delta chain limit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-28 18:22:50 +02:00			`release_tree_content_recursive(e->tree);`
Support wholesale directory renames in fast-import Some source material (e.g. Subversion dump files) perform directory renames without telling us exactly which files in that subdirectory were moved. This makes it hard for a frontend to convert such data formats to a fast-import stream, as all the frontend has on hand is "Rename a/ to b/" with no details about what files are in a/, unless the frontend also kept track of all files. The new 'R' subcommand within a commit allows the frontend to rename either a file or an entire subdirectory, without needing to know the object's SHA-1 or the specific files contained within it. The rename is performed as efficiently as possible internally, making it cheaper than a 'D'/'M' pair for a file rename. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-07-10 04:58:23 +02:00			`e->tree = NULL;`
Implemented tree delta compression in fast-import. We now store for every tree entry two modes and two sha1 values; the base (aka "version 0") and the current/new (aka "version 1"). When we generate a tree object we also regenerate the prior version object and use that as our base object for a delta. This strategy saves a significant amount of memory as we can continue to use the atom pool for file/directory names and only increases each tree entry by an additional 24 bytes of memory. Branches should automatically delta against their ancestor tree, unless the ancestor tree is already at the delta chain limit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-28 18:22:50 +02:00			`e->versions[1].mode = 0;`
			`hashclr(e->versions[1].sha1);`
			`hashclr(root->versions[1].sha1);`
Refactored fast-import's internals for future additions. Too many globals variables were being used not not enough code was resuable to process trees and commits so this is a simple refactoring of the existing blob processing code to get into a state that will be easier to handle trees and commits in. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 06:46:13 +02:00			`return 1;`
Created fast-import, a tool to quickly generating a pack from blobs. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-05 08:04:21 +02:00			`}`

Teach fast-import to recursively copy files/directories Some source material (e.g. Subversion dump files) perform directory renames by telling us the directory was copied, then deleted in the same revision. This makes it difficult for a frontend to convert such data formats to a fast-import stream, as all the frontend has on hand is "Copy a/ to b/; Delete a/" with no details about what files are in a/, unless the frontend also kept track of all files. The new 'C' subcommand within a commit allows the frontend to make a recursive copy of one path to another path within the branch, without needing to keep track of the individual file paths. The metadata copy is performed in memory efficiently, but is implemented as a copy-immediately operation, rather than copy-on-write. With this new 'C' subcommand frontends could obviously implement an 'R' (rename) on their own as a combination of 'C' and 'D' (delete), but since we have already offered up 'R' in the past and it is a trivial thing to keep implemented I'm not going to deprecate it. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-07-15 07:40:37 +02:00			`static int tree_content_get(`
			`struct tree_entry *root,`
			`const char *p,`
			`struct tree_entry *leaf)`
			`{`
			`struct tree_content *t = root->tree;`
			`const char *slash1;`
			`unsigned int i, n;`
			`struct tree_entry *e;`

			`slash1 = strchr(p, '/');`
			`if (slash1)`
			`n = slash1 - p;`
			`else`
			`n = strlen(p);`

			`for (i = 0; i < t->entry_count; i++) {`
			`e = t->entries[i];`
			`if (e->name->str_len == n && !strncmp(p, e->name->str_dat, n)) {`
			`if (!slash1) {`
			`memcpy(leaf, e, sizeof(*leaf));`
			`if (e->tree && is_null_sha1(e->versions[1].sha1))`
			`leaf->tree = dup_tree_content(e->tree);`
			`else`
			`leaf->tree = NULL;`
			`return 1;`
			`}`
			`if (!S_ISDIR(e->versions[1].mode))`
			`return 0;`
			`if (!e->tree)`
			`load_tree(e);`
			`return tree_content_get(e, slash1 + 1, leaf);`
			`}`
			`}`
			`return 0;`
			`}`

Don't do non-fastforward updates in fast-import. If fast-import is being used to update an existing branch of a repository, the user may not want to lose commits if another process updates the same ref at the same time. For example, the user might be using fast-import to make just one or two commits against a live branch. We now perform a fast-forward check during the ref updating process. If updating a branch would cause commits in that branch to be lost, we skip over it and display the new SHA1 to standard error. This new default behavior can be overridden with `--force`, like git-push and git-fetch. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-06 22:08:06 +01:00			`static int update_branch(struct branch *b)`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`{`
			`static const char *msg = "fast-import";`
Don't do non-fastforward updates in fast-import. If fast-import is being used to update an existing branch of a repository, the user may not want to lose commits if another process updates the same ref at the same time. For example, the user might be using fast-import to make just one or two commits against a live branch. We now perform a fast-forward check during the ref updating process. If updating a branch would cause commits in that branch to be lost, we skip over it and display the new SHA1 to standard error. This new default behavior can be overridden with `--force`, like git-push and git-fetch. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-06 22:08:06 +01:00			`struct ref_lock *lock;`
			`unsigned char old_sha1[20];`

			`if (read_ref(b->name, old_sha1))`
			`hashclr(old_sha1);`
git-update-ref: add --no-deref option for overwriting/detaching ref git-checkout is also adapted to make use of this new option instead of the handcrafted command sequence. Signed-off-by: Sven Verdoolaege <skimo@kotnet.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-05-09 12:33:20 +02:00			`lock = lock_any_ref_for_update(b->name, old_sha1, 0);`
Don't do non-fastforward updates in fast-import. If fast-import is being used to update an existing branch of a repository, the user may not want to lose commits if another process updates the same ref at the same time. For example, the user might be using fast-import to make just one or two commits against a live branch. We now perform a fast-forward check during the ref updating process. If updating a branch would cause commits in that branch to be lost, we skip over it and display the new SHA1 to standard error. This new default behavior can be overridden with `--force`, like git-push and git-fetch. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-06 22:08:06 +01:00			`if (!lock)`
			`return error("Unable to lock %s", b->name);`
			`if (!force_update && !is_null_sha1(old_sha1)) {`
			`struct commit old_cmit, new_cmit;`

			`old_cmit = lookup_commit_reference_gently(old_sha1, 0);`
			`new_cmit = lookup_commit_reference_gently(b->sha1, 0);`
			`if (!old_cmit \|\| !new_cmit) {`
			`unlock_ref(lock);`
			`return error("Branch %s is missing commits.", b->name);`
			`}`

Merge branch 'jc/merge-base' (early part) This contains an evil merge to fast-import, in order to resolve in_merge_bases() update. 2007-02-14 01:50:32 +01:00			`if (!in_merge_bases(old_cmit, &new_cmit, 1)) {`
Don't do non-fastforward updates in fast-import. If fast-import is being used to update an existing branch of a repository, the user may not want to lose commits if another process updates the same ref at the same time. For example, the user might be using fast-import to make just one or two commits against a live branch. We now perform a fast-forward check during the ref updating process. If updating a branch would cause commits in that branch to be lost, we skip over it and display the new SHA1 to standard error. This new default behavior can be overridden with `--force`, like git-push and git-fetch. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-06 22:08:06 +01:00			`unlock_ref(lock);`
Rename warn() to warning() to fix symbol conflicts on BSD and Mac OS This fixes a problem reported by Randal Schwartz: >I finally tracked down all the (albeit inconsequential) errors I was getting >on both OpenBSD and OSX. It's the warn() function in usage.c. There's >warn(3) in BSD-style distros. It'd take a "great rename" to change it, but if >someone with better C skills than I have could do that, my linker and I would >appreciate it. It was annoying to me, too, when I was doing some mergetool testing on Mac OS X, so here's a fix. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: "Randal L. Schwartz" <merlyn@stonehenge.com> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-31 01:07:05 +02:00			`warning("Not updating %s"`
Don't do non-fastforward updates in fast-import. If fast-import is being used to update an existing branch of a repository, the user may not want to lose commits if another process updates the same ref at the same time. For example, the user might be using fast-import to make just one or two commits against a live branch. We now perform a fast-forward check during the ref updating process. If updating a branch would cause commits in that branch to be lost, we skip over it and display the new SHA1 to standard error. This new default behavior can be overridden with `--force`, like git-push and git-fetch. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-06 22:08:06 +01:00			`" (new tip %s does not contain %s)",`
			`b->name, sha1_to_hex(b->sha1), sha1_to_hex(old_sha1));`
			`return -1;`
			`}`
			`}`
			`if (write_ref_sha1(lock, b->sha1, msg) < 0)`
			`return error("Unable to update %s", b->name);`
			`return 0;`
			`}`

			`static void dump_branches(void)`
			`{`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`unsigned int i;`
			`struct branch *b;`

			`for (i = 0; i < branch_table_sz; i++) {`
Don't do non-fastforward updates in fast-import. If fast-import is being used to update an existing branch of a repository, the user may not want to lose commits if another process updates the same ref at the same time. For example, the user might be using fast-import to make just one or two commits against a live branch. We now perform a fast-forward check during the ref updating process. If updating a branch would cause commits in that branch to be lost, we skip over it and display the new SHA1 to standard error. This new default behavior can be overridden with `--force`, like git-push and git-fetch. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-06 22:08:06 +01:00			`for (b = branch_table[i]; b; b = b->table_next_branch)`
			`failure \|= update_branch(b);`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`}`
			`}`

Declare no-arg functions as (void) in fast-import. Apparently the git convention is to declare any function which takes no arguments as taking void. I did not do this during the early fast-import development, but should have. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-17 07:47:25 +01:00			`static void dump_tags(void)`
Implemented 'tag' command in fast-import. Tags received from the frontend are generated in memory in a simple linked list in the order that the tag commands were sent by the frontend. If multiple different tag objects for the same tag name get generated the last one sent by the frontend will be the one that gets written out at termination. Multiple tag objects for the same name will cause all older tags of the same name to be lost. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-24 09:12:13 +02:00			`{`
			`static const char *msg = "fast-import";`
			`struct tag *t;`
			`struct ref_lock *lock;`
Don't do non-fastforward updates in fast-import. If fast-import is being used to update an existing branch of a repository, the user may not want to lose commits if another process updates the same ref at the same time. For example, the user might be using fast-import to make just one or two commits against a live branch. We now perform a fast-forward check during the ref updating process. If updating a branch would cause commits in that branch to be lost, we skip over it and display the new SHA1 to standard error. This new default behavior can be overridden with `--force`, like git-push and git-fetch. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-06 22:08:06 +01:00			`char ref_name[PATH_MAX];`
Implemented 'tag' command in fast-import. Tags received from the frontend are generated in memory in a simple linked list in the order that the tag commands were sent by the frontend. If multiple different tag objects for the same tag name get generated the last one sent by the frontend will be the one that gets written out at termination. Multiple tag objects for the same name will cause all older tags of the same name to be lost. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-24 09:12:13 +02:00
			`for (t = first_tag; t; t = t->next_tag) {`
Don't do non-fastforward updates in fast-import. If fast-import is being used to update an existing branch of a repository, the user may not want to lose commits if another process updates the same ref at the same time. For example, the user might be using fast-import to make just one or two commits against a live branch. We now perform a fast-forward check during the ref updating process. If updating a branch would cause commits in that branch to be lost, we skip over it and display the new SHA1 to standard error. This new default behavior can be overridden with `--force`, like git-push and git-fetch. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-06 22:08:06 +01:00			`sprintf(ref_name, "tags/%s", t->name);`
			`lock = lock_ref_sha1(ref_name, NULL);`
Implemented 'tag' command in fast-import. Tags received from the frontend are generated in memory in a simple linked list in the order that the tag commands were sent by the frontend. If multiple different tag objects for the same tag name get generated the last one sent by the frontend will be the one that gets written out at termination. Multiple tag objects for the same name will cause all older tags of the same name to be lost. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-24 09:12:13 +02:00			`if (!lock \|\| write_ref_sha1(lock, t->sha1, msg) < 0)`
Don't do non-fastforward updates in fast-import. If fast-import is being used to update an existing branch of a repository, the user may not want to lose commits if another process updates the same ref at the same time. For example, the user might be using fast-import to make just one or two commits against a live branch. We now perform a fast-forward check during the ref updating process. If updating a branch would cause commits in that branch to be lost, we skip over it and display the new SHA1 to standard error. This new default behavior can be overridden with `--force`, like git-push and git-fetch. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-06 22:08:06 +01:00			`failure \|= error("Unable to update %s", ref_name);`
Implemented 'tag' command in fast-import. Tags received from the frontend are generated in memory in a simple linked list in the order that the tag commands were sent by the frontend. If multiple different tag objects for the same tag name get generated the last one sent by the frontend will be the one that gets written out at termination. Multiple tag objects for the same name will cause all older tags of the same name to be lost. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-24 09:12:13 +02:00			`}`
			`}`

Added option to export the marks table when fast-import terminates. The marks table can be used by the frontend to load any commit after the import and compare it to whatever data the frontend knows about that commit. If the mark idnums can be easily correlated to some reference source then its relatively trivial to compare the GIT tree to the reference to verify the accuracy of the import. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-25 22:03:04 +02:00			`static void dump_marks_helper(FILE *f,`
Use uintmax_t for marks in fast-import. If a frontend wants to use a mark per file revision and per commit and is doing a truly huge import (such as a 32 GiB SVN repository) we may need more than 2**32 unique mark values, especially if the frontend is unable (or unwilling) to recycle mark values. For mark idnums we should use the largest unsigned integer type available, hoping that will be at least 64 bits when we are compiled as a 64 bit executable. This way we may consume huge amounts of memory storing our mark table, but we'll at least be able to process the entire import without failing. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-16 06:33:19 +01:00			`uintmax_t base,`
Added option to export the marks table when fast-import terminates. The marks table can be used by the frontend to load any commit after the import and compare it to whatever data the frontend knows about that commit. If the mark idnums can be easily correlated to some reference source then its relatively trivial to compare the GIT tree to the reference to verify the accuracy of the import. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-25 22:03:04 +02:00			`struct mark_set *m)`
			`{`
Use uintmax_t for marks in fast-import. If a frontend wants to use a mark per file revision and per commit and is doing a truly huge import (such as a 32 GiB SVN repository) we may need more than 2**32 unique mark values, especially if the frontend is unable (or unwilling) to recycle mark values. For mark idnums we should use the largest unsigned integer type available, hoping that will be at least 64 bits when we are compiled as a 64 bit executable. This way we may consume huge amounts of memory storing our mark table, but we'll at least be able to process the entire import without failing. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-16 06:33:19 +01:00			`uintmax_t k;`
Added option to export the marks table when fast-import terminates. The marks table can be used by the frontend to load any commit after the import and compare it to whatever data the frontend knows about that commit. If the mark idnums can be easily correlated to some reference source then its relatively trivial to compare the GIT tree to the reference to verify the accuracy of the import. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-25 22:03:04 +02:00			`if (m->shift) {`
			`for (k = 0; k < 1024; k++) {`
			`if (m->data.sets[k])`
			`dump_marks_helper(f, (base + k) << m->shift,`
			`m->data.sets[k]);`
			`}`
			`} else {`
			`for (k = 0; k < 1024; k++) {`
			`if (m->data.marked[k])`
Check for PRIuMAX rather than NO_C99_FORMAT in fast-import.c. Thanks to Simon 'corecode' Schubert <corecode@fs.ei.tum.de> for the clean-up. Defining the C99 standard PRIuMAX when necessary replaces UM_FMT and the awkward UM10_FMT. There are no direct C99 translations for other uses of NO_C99_FORMAT in git, alas. Signed-off-by: Jason Riedy <ejr@cs.berkeley.edu> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-21 02:34:56 +01:00			`fprintf(f, ":%" PRIuMAX " %s\n", base + k,`
Added option to export the marks table when fast-import terminates. The marks table can be used by the frontend to load any commit after the import and compare it to whatever data the frontend knows about that commit. If the mark idnums can be easily correlated to some reference source then its relatively trivial to compare the GIT tree to the reference to verify the accuracy of the import. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-25 22:03:04 +02:00			`sha1_to_hex(m->data.marked[k]->sha1));`
			`}`
			`}`
			`}`

Declare no-arg functions as (void) in fast-import. Apparently the git convention is to declare any function which takes no arguments as taking void. I did not do this during the early fast-import development, but should have. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-17 07:47:25 +01:00			`static void dump_marks(void)`
Added option to export the marks table when fast-import terminates. The marks table can be used by the frontend to load any commit after the import and compare it to whatever data the frontend knows about that commit. If the mark idnums can be easily correlated to some reference source then its relatively trivial to compare the GIT tree to the reference to verify the accuracy of the import. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-25 22:03:04 +02:00			`{`
Use atomic updates to the fast-import mark file When we allow fast-import frontends to reload a mark file from a prior session we want to let them use the same file as they exported the marks to. This makes it very simple for the frontend to save state across incremental imports. But we don't want to lose the old marks table if anything goes wrong while writing our current marks table. So instead of truncating and overwriting the path specified to --export-marks we use the standard lockfile code to write the current marks out to a temporary file, then rename it over the old marks table. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-03-08 00:05:38 +01:00			`static struct lock_file mark_lock;`
			`int mark_fd;`
			`FILE *f;`

			`if (!mark_file)`
			`return;`

			`mark_fd = hold_lock_file_for_update(&mark_lock, mark_file, 0);`
			`if (mark_fd < 0) {`
			`failure \|= error("Unable to write marks file %s: %s",`
			`mark_file, strerror(errno));`
			`return;`
Added option to export the marks table when fast-import terminates. The marks table can be used by the frontend to load any commit after the import and compare it to whatever data the frontend knows about that commit. If the mark idnums can be easily correlated to some reference source then its relatively trivial to compare the GIT tree to the reference to verify the accuracy of the import. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-25 22:03:04 +02:00			`}`
Use atomic updates to the fast-import mark file When we allow fast-import frontends to reload a mark file from a prior session we want to let them use the same file as they exported the marks to. This makes it very simple for the frontend to save state across incremental imports. But we don't want to lose the old marks table if anything goes wrong while writing our current marks table. So instead of truncating and overwriting the path specified to --export-marks we use the standard lockfile code to write the current marks out to a temporary file, then rename it over the old marks table. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-03-08 00:05:38 +01:00
			`f = fdopen(mark_fd, "w");`
			`if (!f) {`
fast-import: Don't use a maybe-clobbered errno value Without this change, each diagnostic could use an errno value clobbered by the close or unlink in rollback_lock_file. Signed-off-by: Jim Meyering <meyering@redhat.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-01-18 19:35:49 +01:00			`int saved_errno = errno;`
Use atomic updates to the fast-import mark file When we allow fast-import frontends to reload a mark file from a prior session we want to let them use the same file as they exported the marks to. This makes it very simple for the frontend to save state across incremental imports. But we don't want to lose the old marks table if anything goes wrong while writing our current marks table. So instead of truncating and overwriting the path specified to --export-marks we use the standard lockfile code to write the current marks out to a temporary file, then rename it over the old marks table. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-03-08 00:05:38 +01:00			`rollback_lock_file(&mark_lock);`
			`failure \|= error("Unable to write marks file %s: %s",`
fast-import: Don't use a maybe-clobbered errno value Without this change, each diagnostic could use an errno value clobbered by the close or unlink in rollback_lock_file. Signed-off-by: Jim Meyering <meyering@redhat.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-01-18 19:35:49 +01:00			`mark_file, strerror(saved_errno));`
Use atomic updates to the fast-import mark file When we allow fast-import frontends to reload a mark file from a prior session we want to let them use the same file as they exported the marks to. This makes it very simple for the frontend to save state across incremental imports. But we don't want to lose the old marks table if anything goes wrong while writing our current marks table. So instead of truncating and overwriting the path specified to --export-marks we use the standard lockfile code to write the current marks out to a temporary file, then rename it over the old marks table. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-03-08 00:05:38 +01:00			`return;`
Added option to export the marks table when fast-import terminates. The marks table can be used by the frontend to load any commit after the import and compare it to whatever data the frontend knows about that commit. If the mark idnums can be easily correlated to some reference source then its relatively trivial to compare the GIT tree to the reference to verify the accuracy of the import. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-25 22:03:04 +02:00			`}`
Use atomic updates to the fast-import mark file When we allow fast-import frontends to reload a mark file from a prior session we want to let them use the same file as they exported the marks to. This makes it very simple for the frontend to save state across incremental imports. But we don't want to lose the old marks table if anything goes wrong while writing our current marks table. So instead of truncating and overwriting the path specified to --export-marks we use the standard lockfile code to write the current marks out to a temporary file, then rename it over the old marks table. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-03-08 00:05:38 +01:00
Improve use of lockfile API Remove remaining double close(2)'s. i.e. close() before commit_locked_index() or commit_lock_file(). Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-01-16 20:12:46 +01:00			`/*`
fast-import.c: don't try to commit marks file if write failed We also move the assignment of -1 to the lock file descriptor up, so that rollback_lock_file() can be called safely after a possible attempt to fclose(). This matches the contents of the 'if' statement just above testing success of fdopen(). Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil> Acked-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-01-17 17:58:34 +01:00			`* Since the lock file was fdopen()'ed, it should not be close()'ed.`
			`* Assign -1 to the lock file descriptor so that commit_lock_file()`
Improve use of lockfile API Remove remaining double close(2)'s. i.e. close() before commit_locked_index() or commit_lock_file(). Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-01-16 20:12:46 +01:00			`* won't try to close() it.`
			`*/`
			`mark_lock.fd = -1;`
fast-import.c: don't try to commit marks file if write failed We also move the assignment of -1 to the lock file descriptor up, so that rollback_lock_file() can be called safely after a possible attempt to fclose(). This matches the contents of the 'if' statement just above testing success of fdopen(). Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil> Acked-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-01-17 17:58:34 +01:00
			`dump_marks_helper(f, 0, marks);`
			`if (ferror(f) \|\| fclose(f)) {`
fast-import: Don't use a maybe-clobbered errno value Without this change, each diagnostic could use an errno value clobbered by the close or unlink in rollback_lock_file. Signed-off-by: Jim Meyering <meyering@redhat.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-01-18 19:35:49 +01:00			`int saved_errno = errno;`
fast-import.c: don't try to commit marks file if write failed We also move the assignment of -1 to the lock file descriptor up, so that rollback_lock_file() can be called safely after a possible attempt to fclose(). This matches the contents of the 'if' statement just above testing success of fdopen(). Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil> Acked-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-01-17 17:58:34 +01:00			`rollback_lock_file(&mark_lock);`
			`failure \|= error("Unable to write marks file %s: %s",`
fast-import: Don't use a maybe-clobbered errno value Without this change, each diagnostic could use an errno value clobbered by the close or unlink in rollback_lock_file. Signed-off-by: Jim Meyering <meyering@redhat.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-01-18 19:35:49 +01:00			`mark_file, strerror(saved_errno));`
fast-import.c: don't try to commit marks file if write failed We also move the assignment of -1 to the lock file descriptor up, so that rollback_lock_file() can be called safely after a possible attempt to fclose(). This matches the contents of the 'if' statement just above testing success of fdopen(). Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil> Acked-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-01-17 17:58:34 +01:00			`return;`
			`}`

			`if (commit_lock_file(&mark_lock)) {`
fast-import: Don't use a maybe-clobbered errno value Without this change, each diagnostic could use an errno value clobbered by the close or unlink in rollback_lock_file. Signed-off-by: Jim Meyering <meyering@redhat.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-01-18 19:35:49 +01:00			`int saved_errno = errno;`
fast-import.c: don't try to commit marks file if write failed We also move the assignment of -1 to the lock file descriptor up, so that rollback_lock_file() can be called safely after a possible attempt to fclose(). This matches the contents of the 'if' statement just above testing success of fdopen(). Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil> Acked-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-01-17 17:58:34 +01:00			`rollback_lock_file(&mark_lock);`
			`failure \|= error("Unable to commit marks file %s: %s",`
fast-import: Don't use a maybe-clobbered errno value Without this change, each diagnostic could use an errno value clobbered by the close or unlink in rollback_lock_file. Signed-off-by: Jim Meyering <meyering@redhat.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-01-18 19:35:49 +01:00			`mark_file, strerror(saved_errno));`
fast-import.c: don't try to commit marks file if write failed We also move the assignment of -1 to the lock file descriptor up, so that rollback_lock_file() can be called safely after a possible attempt to fclose(). This matches the contents of the 'if' statement just above testing success of fdopen(). Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil> Acked-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-01-17 17:58:34 +01:00			`return;`
			`}`
Added option to export the marks table when fast-import terminates. The marks table can be used by the frontend to load any commit after the import and compare it to whatever data the frontend knows about that commit. If the mark idnums can be easily correlated to some reference source then its relatively trivial to compare the GIT tree to the reference to verify the accuracy of the import. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-25 22:03:04 +02:00			`}`

Drop strbuf's 'eof' marker, and make read_line a first class citizen. read_line is now strbuf_getline, and is a first class citizen, it returns 0 when reading a line worked, EOF else. The ->eof marker was used non-locally by fast-import.c, mimic the same behaviour using a static int in "read_next_command", that now returns -1 on EOF, and avoids to call strbuf_getline when it's in EOF state. Also no longer automagically strbuf_release the buffer, it's counter intuitive and breaks fast-import in a very subtle way. Note: being at EOF implies that command_buf.len == 0. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-17 11:19:04 +02:00			`static int read_next_command(void)`
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00			`{`
Drop strbuf's 'eof' marker, and make read_line a first class citizen. read_line is now strbuf_getline, and is a first class citizen, it returns 0 when reading a line worked, EOF else. The ->eof marker was used non-locally by fast-import.c, mimic the same behaviour using a static int in "read_next_command", that now returns -1 on EOF, and avoids to call strbuf_getline when it's in EOF state. Also no longer automagically strbuf_release the buffer, it's counter intuitive and breaks fast-import in a very subtle way. Note: being at EOF implies that command_buf.len == 0. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-17 11:19:04 +02:00			`static int stdin_eof = 0;`

			`if (stdin_eof) {`
			`unread_command_buf = 0;`
			`return EOF;`
			`}`

Teach fast-import to ignore lines starting with '#' Several frontend developers have asked that some form of stream comments be permitted within a fast-import data stream. This way they can include information from their own frontend program about where specific data was taken from in the source system, or about a decision that their frontend may have made while creating the fast-import data stream. This change introduces comments in the Bourne-shell/Tcl/Perl style. Lines starting with '#' are ignored, up to and including the LF. Unlike the above mentioned three languages however we do not look for and ignore leading whitespace. This just simplifies the definition of the comment format and the code that parses them. To make comments work we had to stop using read_next_command() within cmd_data() and directly invoke read_line() during the inline variant of the function. This is necessary to retain any lines of the input data that might otherwise look like a comment to fast-import. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-08-01 06:05:15 +02:00			`do {`
Include recent command history in fast-import crash reports When we crash the frontend developer (or end-user) may need to know roughly around what part of the input stream we had a problem with and aborted on. Because line numbers aren't very useful in this sort of application we instead just keep the last 100 commands in a FIFO queue and print them as part of the crash report. Currently one problem with this design is a commit that has more than 100 modified files in it will flood the FIFO and any context regarding branch/from/committer/mark/comments will be lost. We really should save only the last few (10?) file changes for the current commit, ensuring we have some prior higher level commands in the FIFO when we crash on a file M/D/C/R command. Another issue with this approach is the FIFO only includes the commands, it does not include the commit messages. Yet having a commit message may be useful to help locate the relevant change in the source material. In practice I don't think this is going to be a major concern as the frontend can always embed its own source change set identifier as a comment (which will appear in the crash report) and the commit message(s) for the most recent commits of any given branch should be obtainable from the (packed) commit objects. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-08-03 10:47:04 +02:00			`if (unread_command_buf) {`
Make trailing LF optional for all fast-import commands For the same reasons as the prior change we want to allow frontends to omit the trailing LF that usually delimits commands. In some cases these just make the input stream more verbose looking than it needs to be, and its just simpler for the frontend developer to get started if our parser is slightly more lenient about where an LF is required and where it isn't. To make this optional LF feature work we now have to buffer up to one line of input in command_buf. This buffering can happen if we look at the current input command but don't recognize it at this point in the code. In such a case we need to "unget" the entire line, but we cannot depend upon the stdio library to let us do ungetc() for that many characters at once. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-08-01 08:22:53 +02:00			`unread_command_buf = 0;`
Include recent command history in fast-import crash reports When we crash the frontend developer (or end-user) may need to know roughly around what part of the input stream we had a problem with and aborted on. Because line numbers aren't very useful in this sort of application we instead just keep the last 100 commands in a FIFO queue and print them as part of the crash report. Currently one problem with this design is a commit that has more than 100 modified files in it will flood the FIFO and any context regarding branch/from/committer/mark/comments will be lost. We really should save only the last few (10?) file changes for the current commit, ensuring we have some prior higher level commands in the FIFO when we crash on a file M/D/C/R command. Another issue with this approach is the FIFO only includes the commands, it does not include the commit messages. Yet having a commit message may be useful to help locate the relevant change in the source material. In practice I don't think this is going to be a major concern as the frontend can always embed its own source change set identifier as a comment (which will appear in the crash report) and the commit message(s) for the most recent commits of any given branch should be obtainable from the (packed) commit objects. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-08-03 10:47:04 +02:00			`} else {`
			`struct recent_command *rc;`

strbuf change: be sure ->buf is never ever NULL. For that purpose, the ->buf is always initialized with a char * buf living in the strbuf module. It is made a char * so that we can sloppily accept things that perform: sb->buf[0] = '\0', and because you can't pass "" as an initializer for ->buf without making gcc unhappy for very good reasons. strbuf_init/_detach/_grow have been fixed to trust ->alloc and not ->buf anymore. as a consequence strbuf_detach is _mandatory_ to detach a buffer, copying ->buf isn't an option anymore, if ->buf is going to escape from the scope, and eventually be free'd. API changes: * strbuf_setlen now always works, so just make strbuf_reset a convenience macro. * strbuf_detatch takes a size_t* optional argument (meaning it can be NULL) to copy the buffer's len, as it was needed for this refactor to make the code more readable, and working like the callers. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-27 12:58:23 +02:00			`strbuf_detach(&command_buf, NULL);`
Drop strbuf's 'eof' marker, and make read_line a first class citizen. read_line is now strbuf_getline, and is a first class citizen, it returns 0 when reading a line worked, EOF else. The ->eof marker was used non-locally by fast-import.c, mimic the same behaviour using a static int in "read_next_command", that now returns -1 on EOF, and avoids to call strbuf_getline when it's in EOF state. Also no longer automagically strbuf_release the buffer, it's counter intuitive and breaks fast-import in a very subtle way. Note: being at EOF implies that command_buf.len == 0. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-17 11:19:04 +02:00			`stdin_eof = strbuf_getline(&command_buf, stdin, '\n');`
			`if (stdin_eof)`
			`return EOF;`
Include recent command history in fast-import crash reports When we crash the frontend developer (or end-user) may need to know roughly around what part of the input stream we had a problem with and aborted on. Because line numbers aren't very useful in this sort of application we instead just keep the last 100 commands in a FIFO queue and print them as part of the crash report. Currently one problem with this design is a commit that has more than 100 modified files in it will flood the FIFO and any context regarding branch/from/committer/mark/comments will be lost. We really should save only the last few (10?) file changes for the current commit, ensuring we have some prior higher level commands in the FIFO when we crash on a file M/D/C/R command. Another issue with this approach is the FIFO only includes the commands, it does not include the commit messages. Yet having a commit message may be useful to help locate the relevant change in the source material. In practice I don't think this is going to be a major concern as the frontend can always embed its own source change set identifier as a comment (which will appear in the crash report) and the commit message(s) for the most recent commits of any given branch should be obtainable from the (packed) commit objects. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-08-03 10:47:04 +02:00
			`rc = rc_free;`
			`if (rc)`
			`rc_free = rc->next;`
			`else {`
			`rc = cmd_hist.next;`
			`cmd_hist.next = rc->next;`
			`cmd_hist.next->prev = &cmd_hist;`
			`free(rc->buf);`
			`}`

			`rc->buf = command_buf.buf;`
			`rc->prev = cmd_tail;`
			`rc->next = cmd_hist.prev;`
			`rc->prev->next = rc;`
			`cmd_tail = rc;`
			`}`
			`} while (command_buf.buf[0] == '#');`
Drop strbuf's 'eof' marker, and make read_line a first class citizen. read_line is now strbuf_getline, and is a first class citizen, it returns 0 when reading a line worked, EOF else. The ->eof marker was used non-locally by fast-import.c, mimic the same behaviour using a static int in "read_next_command", that now returns -1 on EOF, and avoids to call strbuf_getline when it's in EOF state. Also no longer automagically strbuf_release the buffer, it's counter intuitive and breaks fast-import in a very subtle way. Note: being at EOF implies that command_buf.len == 0. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-17 11:19:04 +02:00
			`return 0;`
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00			`}`

fast-import pull request * skip_optional_lf() decl is old-style -- please say static skip_optional_lf(void) { ... } * t9300 #14 fails, like this: * expecting failure: git-fast-import <input fatal: Branch name doesn't conform to GIT standards: .badbranchname fast-import: dumping crash report to .git/fast_import_crash_14354 ./test-lib.sh: line 143: 14354 Segmentation fault git-fast-import <input -- >8 -- Subject: [PATCH] fastimport: Fix re-use of va_list The va_list is designed to be used only once. The current code reuses va_list argument may cause segmentation fault. Copy and release the arguments to avoid this problem. While we are at it, fix old-style function declaration of skip_optional_lf(). Signed-off-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-08-19 11:50:18 +02:00			`static void skip_optional_lf(void)`
Make trailing LF following fast-import `data` commands optional A few fast-import frontend developers have found it odd that we require the LF following a `data` command, especially in the exact byte count format. Technically we don't need this LF to parse the stream properly, but having it here does make the stream more readable to humans. We can easily make the LF optional by peeking at the next byte available from the stream and pushing it back into the buffer if its not LF. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-08-01 06:24:25 +02:00			`{`
			`int term_char = fgetc(stdin);`
			`if (term_char != '\n' && term_char != EOF)`
			`ungetc(term_char, stdin);`
			`}`

Declare no-arg functions as (void) in fast-import. Apparently the git convention is to declare any function which takes no arguments as taking void. I did not do this during the early fast-import development, but should have. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-17 07:47:25 +01:00			`static void cmd_mark(void)`
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00			`{`
prefixcmp(): fix-up mechanical conversion. Previous step converted use of strncmp() with literal string mechanically even when the result is only used as a boolean: if (!strncmp("foo", arg, 3)) ==> if (!(-prefixcmp(arg, "foo"))) This step manually cleans them up to read: if (!prefixcmp(arg, "foo")) Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-20 10:54:00 +01:00			`if (!prefixcmp(command_buf.buf, "mark :")) {`
Use uintmax_t for marks in fast-import. If a frontend wants to use a mark per file revision and per commit and is doing a truly huge import (such as a 32 GiB SVN repository) we may need more than 2**32 unique mark values, especially if the frontend is unable (or unwilling) to recycle mark values. For mark idnums we should use the largest unsigned integer type available, hoping that will be at least 64 bits when we are compiled as a 64 bit executable. This way we may consume huge amounts of memory storing our mark table, but we'll at least be able to process the entire import without failing. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-16 06:33:19 +01:00			`next_mark = strtoumax(command_buf.buf + 6, NULL, 10);`
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00			`read_next_command();`
			`}`
			`else`
Added mark store/find to fast-import. Marks are now saved when the mark directive gets used by the frontend and may be used in place of a SHA1 expression to locate a previous SHA1 which fast-import may have generated. This is particularly useful with commits where the frontend does not (easily) have the ability to compute the SHA1 for an arbitrary commit but needs it to generate a branch or tag from that commit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-23 10:17:45 +02:00			`next_mark = 0;`
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00			`}`

fast-import was using dbuf's, replace them with strbuf's. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-17 13:48:17 +02:00			`static void cmd_data(struct strbuf *sb)`
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00			`{`
fast-import was using dbuf's, replace them with strbuf's. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-17 13:48:17 +02:00			`strbuf_reset(sb);`
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00
prefixcmp(): fix-up mechanical conversion. Previous step converted use of strncmp() with literal string mechanically even when the result is only used as a boolean: if (!strncmp("foo", arg, 3)) ==> if (!(-prefixcmp(arg, "foo"))) This step manually cleans them up to read: if (!prefixcmp(arg, "foo")) Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-20 10:54:00 +01:00			`if (prefixcmp(command_buf.buf, "data "))`
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00			`die("Expected 'data n' command, found: %s", command_buf.buf);`

prefixcmp(): fix-up mechanical conversion. Previous step converted use of strncmp() with literal string mechanically even when the result is only used as a boolean: if (!strncmp("foo", arg, 3)) ==> if (!(-prefixcmp(arg, "foo"))) This step manually cleans them up to read: if (!prefixcmp(arg, "foo")) Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-20 10:54:00 +01:00			`if (!prefixcmp(command_buf.buf + 5, "<<")) {`
Support delimited data regions in fast-import. During testing its nice to not have to feed the length of a data chunk to the 'data' command of fast-import. Instead we would prefer to be able to establish a data chunk much like shell's << operator and use a line delimiter to denote the end of the input. So now if a data command is started as 'data <<EOF' we will look for a terminator line containing only the string EOF on that line. Once found, we stop the data command. Everything between the two lines is used as the data value. The 'data <<' syntax is slower than 'data n', as we don't know how many bytes to expect and instead must grow our buffer on the fly. It also has the problem that the frontend must use a string which will not appear on a line by itself in the input, and the data region will always end in an LF. For these reasons real import frontends are encouraged to continue to use _only_ 'data n'. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-18 19:14:27 +01:00			`char *term = xstrdup(command_buf.buf + 5 + 2);`
fast-import: Use strbuf API, and simplify cmd_data() This patch features the use of strbuf_detach, and prevent the programmer to mess with allocation directly. The code is as efficent as before, just more concise and more straightforward. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-06 13:20:07 +02:00			`size_t term_len = command_buf.len - 5 - 2;`

fast-import.c: fix regression due to strbuf conversion Without this strbuf_detach(), it yields a double free later, the command is in fact stashed, and this is not a memory leak. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-10-26 09:59:12 +02:00			`strbuf_detach(&command_buf, NULL);`
Support delimited data regions in fast-import. During testing its nice to not have to feed the length of a data chunk to the 'data' command of fast-import. Instead we would prefer to be able to establish a data chunk much like shell's << operator and use a line delimiter to denote the end of the input. So now if a data command is started as 'data <<EOF' we will look for a terminator line containing only the string EOF on that line. Once found, we stop the data command. Everything between the two lines is used as the data value. The 'data <<' syntax is slower than 'data n', as we don't know how many bytes to expect and instead must grow our buffer on the fly. It also has the problem that the frontend must use a string which will not appear on a line by itself in the input, and the data region will always end in an LF. For these reasons real import frontends are encouraged to continue to use _only_ 'data n'. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-18 19:14:27 +01:00			`for (;;) {`
Drop strbuf's 'eof' marker, and make read_line a first class citizen. read_line is now strbuf_getline, and is a first class citizen, it returns 0 when reading a line worked, EOF else. The ->eof marker was used non-locally by fast-import.c, mimic the same behaviour using a static int in "read_next_command", that now returns -1 on EOF, and avoids to call strbuf_getline when it's in EOF state. Also no longer automagically strbuf_release the buffer, it's counter intuitive and breaks fast-import in a very subtle way. Note: being at EOF implies that command_buf.len == 0. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-17 11:19:04 +02:00			`if (strbuf_getline(&command_buf, stdin, '\n') == EOF)`
Support delimited data regions in fast-import. During testing its nice to not have to feed the length of a data chunk to the 'data' command of fast-import. Instead we would prefer to be able to establish a data chunk much like shell's << operator and use a line delimiter to denote the end of the input. So now if a data command is started as 'data <<EOF' we will look for a terminator line containing only the string EOF on that line. Once found, we stop the data command. Everything between the two lines is used as the data value. The 'data <<' syntax is slower than 'data n', as we don't know how many bytes to expect and instead must grow our buffer on the fly. It also has the problem that the frontend must use a string which will not appear on a line by itself in the input, and the data region will always end in an LF. For these reasons real import frontends are encouraged to continue to use _only_ 'data n'. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-18 19:14:27 +01:00			`die("EOF in data (terminator '%s' not found)", term);`
			`if (term_len == command_buf.len`
			`&& !strcmp(term, command_buf.buf))`
			`break;`
fast-import was using dbuf's, replace them with strbuf's. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-17 13:48:17 +02:00			`strbuf_addbuf(sb, &command_buf);`
			`strbuf_addch(sb, '\n');`
Support delimited data regions in fast-import. During testing its nice to not have to feed the length of a data chunk to the 'data' command of fast-import. Instead we would prefer to be able to establish a data chunk much like shell's << operator and use a line delimiter to denote the end of the input. So now if a data command is started as 'data <<EOF' we will look for a terminator line containing only the string EOF on that line. Once found, we stop the data command. Everything between the two lines is used as the data value. The 'data <<' syntax is slower than 'data n', as we don't know how many bytes to expect and instead must grow our buffer on the fly. It also has the problem that the frontend must use a string which will not appear on a line by itself in the input, and the data region will always end in an LF. For these reasons real import frontends are encouraged to continue to use _only_ 'data n'. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-18 19:14:27 +01:00			`}`
			`free(term);`
			`}`
			`else {`
fast-import: Use strbuf API, and simplify cmd_data() This patch features the use of strbuf_detach, and prevent the programmer to mess with allocation directly. The code is as efficent as before, just more concise and more straightforward. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-06 13:20:07 +02:00			`size_t n = 0, length;`

Support delimited data regions in fast-import. During testing its nice to not have to feed the length of a data chunk to the 'data' command of fast-import. Instead we would prefer to be able to establish a data chunk much like shell's << operator and use a line delimiter to denote the end of the input. So now if a data command is started as 'data <<EOF' we will look for a terminator line containing only the string EOF on that line. Once found, we stop the data command. Everything between the two lines is used as the data value. The 'data <<' syntax is slower than 'data n', as we don't know how many bytes to expect and instead must grow our buffer on the fly. It also has the problem that the frontend must use a string which will not appear on a line by itself in the input, and the data region will always end in an LF. For these reasons real import frontends are encouraged to continue to use _only_ 'data n'. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-18 19:14:27 +01:00			`length = strtoul(command_buf.buf + 5, NULL, 10);`
fast-import: Use strbuf API, and simplify cmd_data() This patch features the use of strbuf_detach, and prevent the programmer to mess with allocation directly. The code is as efficent as before, just more concise and more straightforward. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-06 13:20:07 +02:00
Support delimited data regions in fast-import. During testing its nice to not have to feed the length of a data chunk to the 'data' command of fast-import. Instead we would prefer to be able to establish a data chunk much like shell's << operator and use a line delimiter to denote the end of the input. So now if a data command is started as 'data <<EOF' we will look for a terminator line containing only the string EOF on that line. Once found, we stop the data command. Everything between the two lines is used as the data value. The 'data <<' syntax is slower than 'data n', as we don't know how many bytes to expect and instead must grow our buffer on the fly. It also has the problem that the frontend must use a string which will not appear on a line by itself in the input, and the data region will always end in an LF. For these reasons real import frontends are encouraged to continue to use _only_ 'data n'. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-18 19:14:27 +01:00			`while (n < length) {`
fast-import was using dbuf's, replace them with strbuf's. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-17 13:48:17 +02:00			`size_t s = strbuf_fread(sb, length - n, stdin);`
Support delimited data regions in fast-import. During testing its nice to not have to feed the length of a data chunk to the 'data' command of fast-import. Instead we would prefer to be able to establish a data chunk much like shell's << operator and use a line delimiter to denote the end of the input. So now if a data command is started as 'data <<EOF' we will look for a terminator line containing only the string EOF on that line. Once found, we stop the data command. Everything between the two lines is used as the data value. The 'data <<' syntax is slower than 'data n', as we don't know how many bytes to expect and instead must grow our buffer on the fly. It also has the problem that the frontend must use a string which will not appear on a line by itself in the input, and the data region will always end in an LF. For these reasons real import frontends are encouraged to continue to use _only_ 'data n'. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-18 19:14:27 +01:00			`if (!s && feof(stdin))`
fast-import: Fix compile warnings Not on all platforms are size_t and unsigned long equivalent. Since I do not know how portable %z is, I play safe, and just cast the respective variables to unsigned long. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-07 12:38:21 +01:00			`die("EOF in data (%lu bytes remaining)",`
			`(unsigned long)(length - n));`
Support delimited data regions in fast-import. During testing its nice to not have to feed the length of a data chunk to the 'data' command of fast-import. Instead we would prefer to be able to establish a data chunk much like shell's << operator and use a line delimiter to denote the end of the input. So now if a data command is started as 'data <<EOF' we will look for a terminator line containing only the string EOF on that line. Once found, we stop the data command. Everything between the two lines is used as the data value. The 'data <<' syntax is slower than 'data n', as we don't know how many bytes to expect and instead must grow our buffer on the fly. It also has the problem that the frontend must use a string which will not appear on a line by itself in the input, and the data region will always end in an LF. For these reasons real import frontends are encouraged to continue to use _only_ 'data n'. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-18 19:14:27 +01:00			`n += s;`
			`}`
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00			`}`

Make trailing LF following fast-import `data` commands optional A few fast-import frontend developers have found it odd that we require the LF following a `data` command, especially in the exact byte count format. Technically we don't need this LF to parse the stream properly, but having it here does make the stream more readable to humans. We can easily make the LF optional by peeking at the next byte available from the stream and pushing it back into the buffer if its not LF. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-08-01 06:24:25 +02:00			`skip_optional_lf();`
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00			`}`

Support RFC 2822 date parsing in fast-import. Since some frontends may be working with source material where the dates are only readily available as RFC 2822 strings, it is more friendly if fast-import exposes Git's parse_date() function to handle the conversion. This way the frontend doesn't need to perform the parsing itself. The new --date-format option to fast-import can be used by a frontend to select which format it will supply date strings in. The default is the standard `raw` Git format, which fast-import has always supported. Format rfc2822 can be used to activate the parse_date() function instead. Because fast-import could also be useful for creating new, current commits, the format `now` is also supported to generate the current system timestamp. The implementation of `now` is a trivial call to datestamp(), but is actually a whole whopping 3 lines so that fast-import can verify the frontend really meant `now`. As part of this change I have added validation of the `raw` date format. Prior to this change fast-import would accept anything in a `committer` command, even if it was seriously malformed. Now fast-import requires the '> ' near the end of the string and verifies the timestamp is formatted properly. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-06 20:58:30 +01:00			`static int validate_raw_date(const char src, char result, int maxlen)`
			`{`
			`const char *orig_src = src;`
			`char *endp, sign;`

			`strtoul(src, &endp, 10);`
			`if (endp == src \|\| *endp != ' ')`
			`return -1;`

			`src = endp + 1;`
			`if (src != '-' && src != '+')`
			`return -1;`
			`sign = *src;`

			`strtoul(src + 1, &endp, 10);`
			`if (endp == src \|\| *endp \|\| (endp - orig_src) >= maxlen)`
			`return -1;`

			`strcpy(result, orig_src);`
			`return 0;`
			`}`

			`static char parse_ident(const char buf)`
			`{`
			`const char *gt;`
			`size_t name_len;`
			`char *ident;`

			`gt = strrchr(buf, '>');`
			`if (!gt)`
			`die("Missing > in ident string: %s", buf);`
			`gt++;`
			`if (*gt != ' ')`
			`die("Missing space after > in ident string: %s", buf);`
			`gt++;`
			`name_len = gt - buf;`
			`ident = xmalloc(name_len + 24);`
			`strncpy(ident, buf, name_len);`

			`switch (whenspec) {`
			`case WHENSPEC_RAW:`
			`if (validate_raw_date(gt, ident + name_len, 24) < 0)`
			`die("Invalid raw date \"%s\" in ident: %s", gt, buf);`
			`break;`
			`case WHENSPEC_RFC2822:`
			`if (parse_date(gt, ident + name_len, 24) < 0)`
			`die("Invalid rfc2822 date \"%s\" in ident: %s", gt, buf);`
			`break;`
			`case WHENSPEC_NOW:`
			`if (strcmp("now", gt))`
			`die("Date in ident must be 'now': %s", buf);`
			`datestamp(ident + name_len, 24);`
			`break;`
			`}`

			`return ident;`
			`}`

Declare no-arg functions as (void) in fast-import. Apparently the git convention is to declare any function which takes no arguments as taking void. I did not do this during the early fast-import development, but should have. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-17 07:47:25 +01:00			`static void cmd_new_blob(void)`
Added basic command handler to fast-import. Moved the new_blob logic off into a new subroutine and invoked it when getting the 'blob' command. Added statistics dump to STDERR when the program terminates listing what it did at a high level. This is somewhat interesting. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 07:14:21 +02:00			`{`
fast-import optimization: Now that cmd_data acts on a strbuf, make last_object stashed buffer be a strbuf as well. On new stash, don't free the last stashed buffer, rather swap it with the one you will stash, this way, callers of store_object can act on static strbufs, and at some point, fast-import won't allocate new memory for objects buffers. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-17 14:00:38 +02:00			`static struct strbuf buf = STRBUF_INIT;`
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00
			`read_next_command();`
			`cmd_mark();`
fast-import was using dbuf's, replace them with strbuf's. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-17 13:48:17 +02:00			`cmd_data(&buf);`
fast-import optimization: Now that cmd_data acts on a strbuf, make last_object stashed buffer be a strbuf as well. On new stash, don't free the last stashed buffer, rather swap it with the one you will stash, this way, callers of store_object can act on static strbufs, and at some point, fast-import won't allocate new memory for objects buffers. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-17 14:00:38 +02:00			`store_object(OBJ_BLOB, &buf, &last_blob, NULL, next_mark);`
Added basic command handler to fast-import. Moved the new_blob logic off into a new subroutine and invoked it when getting the 'blob' command. Added statistics dump to STDERR when the program terminates listing what it did at a high level. This is somewhat interesting. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 07:14:21 +02:00			`}`

Declare no-arg functions as (void) in fast-import. Apparently the git convention is to declare any function which takes no arguments as taking void. I did not do this during the early fast-import development, but should have. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-17 07:47:25 +01:00			`static void unload_one_branch(void)`
Implemented branch handling and basic tree support in fast-import. This provides the basic data structures needed to store trees in memory while we are processing them for a branch. What we are attempting to do is track one complete tree for each branch that the frontend has registered with us through the 'newb' (new_branch) command. When the frontend edits that tree through 'updf' or 'delf' commands we'll mark the affected tree(s) as being dirty and recompute their objects during 'comt' (commit). Currently the protocol is decidedly _not_ user friendly. I crashed fast-import by giving it bad input data from Perl. I may try to improve upon it, or at least upon its error handling. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 09:36:45 +02:00			`{`
Implemented tree reloading in fast-import. Tree reloading allows fast-import to swap out the least-recently used branch by simply deallocating the data structures from memory that were associated with that branch. Later if the branch becomes active again it can lazily recreate those structures on demand by reloading the necessary trees from the pack file it originally wrote them to. The reloading process is implemented by mmap'ing the pack into memory and using a much tighter variant of the pack reading code contained in sha1_file.c. This was a blatent copy from sha1_file.c but the unpacking functions were significantly simplified and are actually now in a form that should make it easier to map only the necessary regions of a pack rather than the entire file. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-24 10:37:35 +02:00			`while (cur_active_branches`
			`&& cur_active_branches >= max_active_branches) {`
Use off_t in pack-objects/fast-import when we mean an offset Always use an off_t value in pack-objects anytime we are dealing with an offset to some data within a packfile. Also fixed a minor uintmax_t that was incorrectly defined before. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-07 02:44:34 +01:00			`uintmax_t min_commit = ULONG_MAX;`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`struct branch e, l = NULL, *p = NULL;`

			`for (e = active_branches; e; e = e->active_next_branch) {`
			`if (e->last_commit < min_commit) {`
			`p = l;`
			`min_commit = e->last_commit;`
			`}`
			`l = e;`
			`}`

			`if (p) {`
			`e = p->active_next_branch;`
			`p->active_next_branch = e->active_next_branch;`
			`} else {`
			`e = active_branches;`
			`active_branches = e->active_next_branch;`
			`}`
fast-import: Avoid infinite loop after reset Johannes Sixt noticed that a 'reset' command applied to a branch that is already active in the branch LRU cache can cause fast-import to relink the same branch into the LRU cache twice. This will cause the LRU cache to contain a cycle, making unload_one_branch run in an infinite loop as it tries to select the oldest branch for eviction. I have trivially fixed the problem by adding an active bit to each branch object; this bit indicates if the branch is already in the LRU and allows us to avoid trying to add it a second time. Converting the pack_id field into a bitfield makes this change take up no additional memory. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-03-05 18:31:09 +01:00			`e->active = 0;`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`e->active_next_branch = NULL;`
			`if (e->branch_tree.tree) {`
Fixed segfault in fast-import after growing a tree. Growing a tree caused all subtrees to be deallocated and put back into the free list yet those subtree's contents were still actively in use. Consequently they were doled out again and got stomped on elsewhere. Releasing a tree is now performed in two parts, either releasing only the content array or releasing the content array and recursively releasing the subtree(s). Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-23 07:33:47 +02:00			`release_tree_content_recursive(e->branch_tree.tree);`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`e->branch_tree.tree = NULL;`
			`}`
			`cur_active_branches--;`
Implemented branch handling and basic tree support in fast-import. This provides the basic data structures needed to store trees in memory while we are processing them for a branch. What we are attempting to do is track one complete tree for each branch that the frontend has registered with us through the 'newb' (new_branch) command. When the frontend edits that tree through 'updf' or 'delf' commands we'll mark the affected tree(s) as being dirty and recompute their objects during 'comt' (commit). Currently the protocol is decidedly _not_ user friendly. I crashed fast-import by giving it bad input data from Perl. I may try to improve upon it, or at least upon its error handling. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 09:36:45 +02:00			`}`
			`}`

Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`static void load_branch(struct branch *b)`
Implemented branch handling and basic tree support in fast-import. This provides the basic data structures needed to store trees in memory while we are processing them for a branch. What we are attempting to do is track one complete tree for each branch that the frontend has registered with us through the 'newb' (new_branch) command. When the frontend edits that tree through 'updf' or 'delf' commands we'll mark the affected tree(s) as being dirty and recompute their objects during 'comt' (commit). Currently the protocol is decidedly _not_ user friendly. I crashed fast-import by giving it bad input data from Perl. I may try to improve upon it, or at least upon its error handling. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 09:36:45 +02:00			`{`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`load_tree(&b->branch_tree);`
fast-import: Avoid infinite loop after reset Johannes Sixt noticed that a 'reset' command applied to a branch that is already active in the branch LRU cache can cause fast-import to relink the same branch into the LRU cache twice. This will cause the LRU cache to contain a cycle, making unload_one_branch run in an infinite loop as it tries to select the oldest branch for eviction. I have trivially fixed the problem by adding an active bit to each branch object; this bit indicates if the branch is already in the LRU and allows us to avoid trying to add it a second time. Converting the pack_id field into a bitfield makes this change take up no additional memory. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-03-05 18:31:09 +01:00			`if (!b->active) {`
			`b->active = 1;`
			`b->active_next_branch = active_branches;`
			`active_branches = b;`
			`cur_active_branches++;`
			`branch_load_count++;`
			`}`
Implemented branch handling and basic tree support in fast-import. This provides the basic data structures needed to store trees in memory while we are processing them for a branch. What we are attempting to do is track one complete tree for each branch that the frontend has registered with us through the 'newb' (new_branch) command. When the frontend edits that tree through 'updf' or 'delf' commands we'll mark the affected tree(s) as being dirty and recompute their objects during 'comt' (commit). Currently the protocol is decidedly _not_ user friendly. I crashed fast-import by giving it bad input data from Perl. I may try to improve upon it, or at least upon its error handling. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 09:36:45 +02:00			`}`

Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`static void file_change_m(struct branch *b)`
Implemented branch handling and basic tree support in fast-import. This provides the basic data structures needed to store trees in memory while we are processing them for a branch. What we are attempting to do is track one complete tree for each branch that the frontend has registered with us through the 'newb' (new_branch) command. When the frontend edits that tree through 'updf' or 'delf' commands we'll mark the affected tree(s) as being dirty and recompute their objects during 'comt' (commit). Currently the protocol is decidedly _not_ user friendly. I crashed fast-import by giving it bad input data from Perl. I may try to improve upon it, or at least upon its error handling. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 09:36:45 +02:00			`{`
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00			`const char *p = command_buf.buf + 2;`
Rework unquote_c_style to work on a strbuf. If the gain is not obvious in the diffstat, the resulting code is more readable, _and_ in checkout-index/update-index we now reuse the same buffer to unquote strings instead of always freeing/mallocing. This also is more coherent with the next patch that reworks quoting functions. The quoting function is also made more efficient scanning for backslashes and treating portions of strings without a backslash at once. Signed-off-by: Pierre Habouzit <madcoder@debian.org> 2007-09-20 00:42:14 +02:00			`static struct strbuf uq = STRBUF_INIT;`
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00			`const char *endp;`
Correct compiler warnings in fast-import. Junio noticed these warnings/errors in fast-import when compiling with `-Werror -ansi -pedantic`. A few changes are to reduce compiler warnings, while one (in cmd_merge) is a bug fix. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-06 06:26:49 +01:00			`struct object_entry *oe = oe;`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`unsigned char sha1[20];`
Reduce memory usage of fast-import. Some structs are allocated rather frequently, but were using integer types which were far larger than required to actually store their full value range. As packfiles are limited to 4 GiB we don't need more than 32 bits to store the offset of an object within that packfile, an `unsigned long` on a 64 bit system is likely a 64 bit unsigned value. Saving 4 bytes per object on a 64 bit system can add up fast on any sizable import. As atom strings are strictly single components in a path name these are probably limited to just 255 bytes by the underlying OS. Going to that short of a string is probably too restrictive, but certainly `unsigned int` is far too large for their lengths. `unsigned short` is a reasonable limit. Modes within a tree really only need two bytes to store their whole value; using `unsigned int` here is vast overkill. Saving 4 bytes per file entry in an active branch can add up quickly on a project with a large number of files. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-05 22:34:56 +01:00			`uint16_t mode, inline_data = 0;`
Implemented branch handling and basic tree support in fast-import. This provides the basic data structures needed to store trees in memory while we are processing them for a branch. What we are attempting to do is track one complete tree for each branch that the frontend has registered with us through the 'newb' (new_branch) command. When the frontend edits that tree through 'updf' or 'delf' commands we'll mark the affected tree(s) as being dirty and recompute their objects during 'comt' (commit). Currently the protocol is decidedly _not_ user friendly. I crashed fast-import by giving it bad input data from Perl. I may try to improve upon it, or at least upon its error handling. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 09:36:45 +02:00
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00			`p = get_mode(p, &mode);`
			`if (!p)`
			`die("Corrupt mode: %s", command_buf.buf);`
			`switch (mode) {`
			`case S_IFREG \| 0644:`
			`case S_IFREG \| 0755:`
Allow symlink blobs in trees during fast-import. If a frontend is smart enough to import a symlink then we should let them do so. We'll assume that they were smart enough to first generate a blob to hold the link target, as that's how symlinks get represented in GIT. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-21 09:29:13 +02:00			`case S_IFLNK:`
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00			`case 0644:`
			`case 0755:`
			`/* ok */`
			`break;`
			`default:`
			`die("Corrupt mode: %s", command_buf.buf);`
			`}`

Added mark store/find to fast-import. Marks are now saved when the mark directive gets used by the frontend and may be used in place of a SHA1 expression to locate a previous SHA1 which fast-import may have generated. This is particularly useful with commits where the frontend does not (easily) have the ability to compute the SHA1 for an arbitrary commit but needs it to generate a branch or tag from that commit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-23 10:17:45 +02:00			`if (*p == ':') {`
			`char *x;`
Use uintmax_t for marks in fast-import. If a frontend wants to use a mark per file revision and per commit and is doing a truly huge import (such as a 32 GiB SVN repository) we may need more than 2**32 unique mark values, especially if the frontend is unable (or unwilling) to recycle mark values. For mark idnums we should use the largest unsigned integer type available, hoping that will be at least 64 bits when we are compiled as a 64 bit executable. This way we may consume huge amounts of memory storing our mark table, but we'll at least be able to process the entire import without failing. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-16 06:33:19 +01:00			`oe = find_mark(strtoumax(p + 1, &x, 10));`
Fix repository corruption when using marks for modified blobs. Apparently we did not copy the blob SHA1 into the stack variable 'sha1' when a mark is used to refer to a prior blob. This code was not previously tested as the Mozilla CVS -> git-fast-import program always fed us full SHA1s for modified blobs and did not use the mark feature there. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-12 03:25:01 +01:00			`hashcpy(sha1, oe->sha1);`
Added mark store/find to fast-import. Marks are now saved when the mark directive gets used by the frontend and may be used in place of a SHA1 expression to locate a previous SHA1 which fast-import may have generated. This is particularly useful with commits where the frontend does not (easily) have the ability to compute the SHA1 for an arbitrary commit but needs it to generate a branch or tag from that commit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-23 10:17:45 +02:00			`p = x;`
prefixcmp(): fix-up mechanical conversion. Previous step converted use of strncmp() with literal string mechanically even when the result is only used as a boolean: if (!strncmp("foo", arg, 3)) ==> if (!(-prefixcmp(arg, "foo"))) This step manually cleans them up to read: if (!prefixcmp(arg, "foo")) Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-20 10:54:00 +01:00			`} else if (!prefixcmp(p, "inline")) {`
Accept 'inline' file data in fast-import commit structure. Its very annoying to need to specify the file content ahead of a commit and use marks to connect the individual blobs to the commit's file modification entry, especially if the frontend can't/won't generate the blob SHA1s itself. Instead it would much easier to use if we can accept the blob data at the same time as we receive each file_change line. Now fast-import accepts 'inline' instead of a mark idnum or blob SHA1 within the 'M' type file_change command. If an inline is detected the very next line must be a 'data n' command, supplying the file data. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-18 21:17:58 +01:00			`inline_data = 1;`
			`p += 6;`
Added mark store/find to fast-import. Marks are now saved when the mark directive gets used by the frontend and may be used in place of a SHA1 expression to locate a previous SHA1 which fast-import may have generated. This is particularly useful with commits where the frontend does not (easily) have the ability to compute the SHA1 for an arbitrary commit but needs it to generate a branch or tag from that commit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-23 10:17:45 +02:00			`} else {`
			`if (get_sha1_hex(p, sha1))`
			`die("Invalid SHA1: %s", command_buf.buf);`
			`oe = find_object(sha1);`
			`p += 40;`
			`}`
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00			`if (*p++ != ' ')`
			`die("Missing space after SHA1: %s", command_buf.buf);`

Rework unquote_c_style to work on a strbuf. If the gain is not obvious in the diffstat, the resulting code is more readable, _and_ in checkout-index/update-index we now reuse the same buffer to unquote strings instead of always freeing/mallocing. This also is more coherent with the next patch that reworks quoting functions. The quoting function is also made more efficient scanning for backslashes and treating portions of strings without a backslash at once. Signed-off-by: Pierre Habouzit <madcoder@debian.org> 2007-09-20 00:42:14 +02:00			`strbuf_reset(&uq);`
			`if (!unquote_c_style(&uq, p, &endp)) {`
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00			`if (*endp)`
			`die("Garbage after path in: %s", command_buf.buf);`
Rework unquote_c_style to work on a strbuf. If the gain is not obvious in the diffstat, the resulting code is more readable, _and_ in checkout-index/update-index we now reuse the same buffer to unquote strings instead of always freeing/mallocing. This also is more coherent with the next patch that reworks quoting functions. The quoting function is also made more efficient scanning for backslashes and treating portions of strings without a backslash at once. Signed-off-by: Pierre Habouzit <madcoder@debian.org> 2007-09-20 00:42:14 +02:00			`p = uq.buf;`
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00			`}`
Implemented branch handling and basic tree support in fast-import. This provides the basic data structures needed to store trees in memory while we are processing them for a branch. What we are attempting to do is track one complete tree for each branch that the frontend has registered with us through the 'newb' (new_branch) command. When the frontend edits that tree through 'updf' or 'delf' commands we'll mark the affected tree(s) as being dirty and recompute their objects during 'comt' (commit). Currently the protocol is decidedly _not_ user friendly. I crashed fast-import by giving it bad input data from Perl. I may try to improve upon it, or at least upon its error handling. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 09:36:45 +02:00
Accept 'inline' file data in fast-import commit structure. Its very annoying to need to specify the file content ahead of a commit and use marks to connect the individual blobs to the commit's file modification entry, especially if the frontend can't/won't generate the blob SHA1s itself. Instead it would much easier to use if we can accept the blob data at the same time as we receive each file_change line. Now fast-import accepts 'inline' instead of a mark idnum or blob SHA1 within the 'M' type file_change command. If an inline is detected the very next line must be a 'data n' command, supplying the file data. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-18 21:17:58 +01:00			`if (inline_data) {`
fast-import optimization: Now that cmd_data acts on a strbuf, make last_object stashed buffer be a strbuf as well. On new stash, don't free the last stashed buffer, rather swap it with the one you will stash, this way, callers of store_object can act on static strbufs, and at some point, fast-import won't allocate new memory for objects buffers. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-17 14:00:38 +02:00			`static struct strbuf buf = STRBUF_INIT;`
fast-import was using dbuf's, replace them with strbuf's. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-17 13:48:17 +02:00
Rework unquote_c_style to work on a strbuf. If the gain is not obvious in the diffstat, the resulting code is more readable, _and_ in checkout-index/update-index we now reuse the same buffer to unquote strings instead of always freeing/mallocing. This also is more coherent with the next patch that reworks quoting functions. The quoting function is also made more efficient scanning for backslashes and treating portions of strings without a backslash at once. Signed-off-by: Pierre Habouzit <madcoder@debian.org> 2007-09-20 00:42:14 +02:00			`if (p != uq.buf) {`
			`strbuf_addstr(&uq, p);`
			`p = uq.buf;`
			`}`
Accept 'inline' file data in fast-import commit structure. Its very annoying to need to specify the file content ahead of a commit and use marks to connect the individual blobs to the commit's file modification entry, especially if the frontend can't/won't generate the blob SHA1s itself. Instead it would much easier to use if we can accept the blob data at the same time as we receive each file_change line. Now fast-import accepts 'inline' instead of a mark idnum or blob SHA1 within the 'M' type file_change command. If an inline is detected the very next line must be a 'data n' command, supplying the file data. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-18 21:17:58 +01:00			`read_next_command();`
fast-import was using dbuf's, replace them with strbuf's. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-17 13:48:17 +02:00			`cmd_data(&buf);`
fast-import optimization: Now that cmd_data acts on a strbuf, make last_object stashed buffer be a strbuf as well. On new stash, don't free the last stashed buffer, rather swap it with the one you will stash, this way, callers of store_object can act on static strbufs, and at some point, fast-import won't allocate new memory for objects buffers. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-17 14:00:38 +02:00			`store_object(OBJ_BLOB, &buf, &last_blob, sha1, 0);`
Accept 'inline' file data in fast-import commit structure. Its very annoying to need to specify the file content ahead of a commit and use marks to connect the individual blobs to the commit's file modification entry, especially if the frontend can't/won't generate the blob SHA1s itself. Instead it would much easier to use if we can accept the blob data at the same time as we receive each file_change line. Now fast-import accepts 'inline' instead of a mark idnum or blob SHA1 within the 'M' type file_change command. If an inline is detected the very next line must be a 'data n' command, supplying the file data. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-18 21:17:58 +01:00			`} else if (oe) {`
Implement blob ID validation in fast-import. When accepting revision SHA1 IDs from the frontend verify the SHA1 actually refers to a blob and is known to exist. Its an error to use a SHA1 in a tree if the blob doesn't exist as this would cause git-fsck-objects to report a missing blob should the pack get closed without the blob being appended into it or a subsequent pack. So right now we'll just ask that the frontend "pre-declare" any blobs it wants to use in a tree before it can use them. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 08:50:18 +02:00			`if (oe->type != OBJ_BLOB)`
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00			`die("Not a blob (actually a %s): %s",`
fast-import: Fix argument order to die in file_change_m The arguments to the "Not a blob" die call in file_change_m were transposed, so that the command was printed as the type, and the type as the command. Switch them around so that the error message comes out correctly. Signed-off-by: Julian Phillips <julian@quantumfyre.co.uk> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-10-20 18:15:38 +02:00			`typename(oe->type), command_buf.buf);`
Implement blob ID validation in fast-import. When accepting revision SHA1 IDs from the frontend verify the SHA1 actually refers to a blob and is known to exist. Its an error to use a SHA1 in a tree if the blob doesn't exist as this would cause git-fsck-objects to report a missing blob should the pack get closed without the blob being appended into it or a subsequent pack. So right now we'll just ask that the frontend "pre-declare" any blobs it wants to use in a tree before it can use them. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 08:50:18 +02:00			`} else {`
convert object type handling from a string to a number We currently have two parallel notation for dealing with object types in the code: a string and a numerical value. One of them is obviously redundent, and the most used one requires more stack space and a bunch of strcmp() all over the place. This is an initial step for the removal of the version using a char array found in object reading code paths. The patch is unfortunately large but there is no sane way to split it in smaller parts without breaking the system. Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-26 20:55:59 +01:00			`enum object_type type = sha1_object_info(sha1, NULL);`
			`if (type < 0)`
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00			`die("Blob not found: %s", command_buf.buf);`
convert object type handling from a string to a number We currently have two parallel notation for dealing with object types in the code: a string and a numerical value. One of them is obviously redundent, and the most used one requires more stack space and a bunch of strcmp() all over the place. This is an initial step for the removal of the version using a char array found in object reading code paths. The patch is unfortunately large but there is no sane way to split it in smaller parts without breaking the system. Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-26 20:55:59 +01:00			`if (type != OBJ_BLOB)`
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00			`die("Not a blob (actually a %s): %s",`
convert object type handling from a string to a number We currently have two parallel notation for dealing with object types in the code: a string and a numerical value. One of them is obviously redundent, and the most used one requires more stack space and a bunch of strcmp() all over the place. This is an initial step for the removal of the version using a char array found in object reading code paths. The patch is unfortunately large but there is no sane way to split it in smaller parts without breaking the system. Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-26 20:55:59 +01:00			`typename(type), command_buf.buf);`
Implement blob ID validation in fast-import. When accepting revision SHA1 IDs from the frontend verify the SHA1 actually refers to a blob and is known to exist. Its an error to use a SHA1 in a tree if the blob doesn't exist as this would cause git-fsck-objects to report a missing blob should the pack get closed without the blob being appended into it or a subsequent pack. So right now we'll just ask that the frontend "pre-declare" any blobs it wants to use in a tree before it can use them. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 08:50:18 +02:00			`}`
Implemented branch handling and basic tree support in fast-import. This provides the basic data structures needed to store trees in memory while we are processing them for a branch. What we are attempting to do is track one complete tree for each branch that the frontend has registered with us through the 'newb' (new_branch) command. When the frontend edits that tree through 'updf' or 'delf' commands we'll mark the affected tree(s) as being dirty and recompute their objects during 'comt' (commit). Currently the protocol is decidedly _not_ user friendly. I crashed fast-import by giving it bad input data from Perl. I may try to improve upon it, or at least upon its error handling. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 09:36:45 +02:00
Support wholesale directory renames in fast-import Some source material (e.g. Subversion dump files) perform directory renames without telling us exactly which files in that subdirectory were moved. This makes it hard for a frontend to convert such data formats to a fast-import stream, as all the frontend has on hand is "Rename a/ to b/" with no details about what files are in a/, unless the frontend also kept track of all files. The new 'R' subcommand within a commit allows the frontend to rename either a file or an entire subdirectory, without needing to know the object's SHA-1 or the specific files contained within it. The rename is performed as efficiently as possible internally, making it cheaper than a 'D'/'M' pair for a file rename. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-07-10 04:58:23 +02:00			`tree_content_set(&b->branch_tree, p, sha1, S_IFREG \| mode, NULL);`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`}`
Implemented branch handling and basic tree support in fast-import. This provides the basic data structures needed to store trees in memory while we are processing them for a branch. What we are attempting to do is track one complete tree for each branch that the frontend has registered with us through the 'newb' (new_branch) command. When the frontend edits that tree through 'updf' or 'delf' commands we'll mark the affected tree(s) as being dirty and recompute their objects during 'comt' (commit). Currently the protocol is decidedly _not_ user friendly. I crashed fast-import by giving it bad input data from Perl. I may try to improve upon it, or at least upon its error handling. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 09:36:45 +02:00
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`static void file_change_d(struct branch *b)`
			`{`
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00			`const char *p = command_buf.buf + 2;`
Rework unquote_c_style to work on a strbuf. If the gain is not obvious in the diffstat, the resulting code is more readable, _and_ in checkout-index/update-index we now reuse the same buffer to unquote strings instead of always freeing/mallocing. This also is more coherent with the next patch that reworks quoting functions. The quoting function is also made more efficient scanning for backslashes and treating portions of strings without a backslash at once. Signed-off-by: Pierre Habouzit <madcoder@debian.org> 2007-09-20 00:42:14 +02:00			`static struct strbuf uq = STRBUF_INIT;`
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00			`const char *endp;`

Rework unquote_c_style to work on a strbuf. If the gain is not obvious in the diffstat, the resulting code is more readable, _and_ in checkout-index/update-index we now reuse the same buffer to unquote strings instead of always freeing/mallocing. This also is more coherent with the next patch that reworks quoting functions. The quoting function is also made more efficient scanning for backslashes and treating portions of strings without a backslash at once. Signed-off-by: Pierre Habouzit <madcoder@debian.org> 2007-09-20 00:42:14 +02:00			`strbuf_reset(&uq);`
			`if (!unquote_c_style(&uq, p, &endp)) {`
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00			`if (*endp)`
			`die("Garbage after path in: %s", command_buf.buf);`
Rework unquote_c_style to work on a strbuf. If the gain is not obvious in the diffstat, the resulting code is more readable, _and_ in checkout-index/update-index we now reuse the same buffer to unquote strings instead of always freeing/mallocing. This also is more coherent with the next patch that reworks quoting functions. The quoting function is also made more efficient scanning for backslashes and treating portions of strings without a backslash at once. Signed-off-by: Pierre Habouzit <madcoder@debian.org> 2007-09-20 00:42:14 +02:00			`p = uq.buf;`
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00			`}`
Support wholesale directory renames in fast-import Some source material (e.g. Subversion dump files) perform directory renames without telling us exactly which files in that subdirectory were moved. This makes it hard for a frontend to convert such data formats to a fast-import stream, as all the frontend has on hand is "Rename a/ to b/" with no details about what files are in a/, unless the frontend also kept track of all files. The new 'R' subcommand within a commit allows the frontend to rename either a file or an entire subdirectory, without needing to know the object's SHA-1 or the specific files contained within it. The rename is performed as efficiently as possible internally, making it cheaper than a 'D'/'M' pair for a file rename. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-07-10 04:58:23 +02:00			`tree_content_remove(&b->branch_tree, p, NULL);`
Implemented branch handling and basic tree support in fast-import. This provides the basic data structures needed to store trees in memory while we are processing them for a branch. What we are attempting to do is track one complete tree for each branch that the frontend has registered with us through the 'newb' (new_branch) command. When the frontend edits that tree through 'updf' or 'delf' commands we'll mark the affected tree(s) as being dirty and recompute their objects during 'comt' (commit). Currently the protocol is decidedly _not_ user friendly. I crashed fast-import by giving it bad input data from Perl. I may try to improve upon it, or at least upon its error handling. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 09:36:45 +02:00			`}`

Teach fast-import to recursively copy files/directories Some source material (e.g. Subversion dump files) perform directory renames by telling us the directory was copied, then deleted in the same revision. This makes it difficult for a frontend to convert such data formats to a fast-import stream, as all the frontend has on hand is "Copy a/ to b/; Delete a/" with no details about what files are in a/, unless the frontend also kept track of all files. The new 'C' subcommand within a commit allows the frontend to make a recursive copy of one path to another path within the branch, without needing to keep track of the individual file paths. The metadata copy is performed in memory efficiently, but is implemented as a copy-immediately operation, rather than copy-on-write. With this new 'C' subcommand frontends could obviously implement an 'R' (rename) on their own as a combination of 'C' and 'D' (delete), but since we have already offered up 'R' in the past and it is a trivial thing to keep implemented I'm not going to deprecate it. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-07-15 07:40:37 +02:00			`static void file_change_cr(struct branch *b, int rename)`
Support wholesale directory renames in fast-import Some source material (e.g. Subversion dump files) perform directory renames without telling us exactly which files in that subdirectory were moved. This makes it hard for a frontend to convert such data formats to a fast-import stream, as all the frontend has on hand is "Rename a/ to b/" with no details about what files are in a/, unless the frontend also kept track of all files. The new 'R' subcommand within a commit allows the frontend to rename either a file or an entire subdirectory, without needing to know the object's SHA-1 or the specific files contained within it. The rename is performed as efficiently as possible internally, making it cheaper than a 'D'/'M' pair for a file rename. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-07-10 04:58:23 +02:00			`{`
			`const char s, d;`
Rework unquote_c_style to work on a strbuf. If the gain is not obvious in the diffstat, the resulting code is more readable, _and_ in checkout-index/update-index we now reuse the same buffer to unquote strings instead of always freeing/mallocing. This also is more coherent with the next patch that reworks quoting functions. The quoting function is also made more efficient scanning for backslashes and treating portions of strings without a backslash at once. Signed-off-by: Pierre Habouzit <madcoder@debian.org> 2007-09-20 00:42:14 +02:00			`static struct strbuf s_uq = STRBUF_INIT;`
			`static struct strbuf d_uq = STRBUF_INIT;`
Support wholesale directory renames in fast-import Some source material (e.g. Subversion dump files) perform directory renames without telling us exactly which files in that subdirectory were moved. This makes it hard for a frontend to convert such data formats to a fast-import stream, as all the frontend has on hand is "Rename a/ to b/" with no details about what files are in a/, unless the frontend also kept track of all files. The new 'R' subcommand within a commit allows the frontend to rename either a file or an entire subdirectory, without needing to know the object's SHA-1 or the specific files contained within it. The rename is performed as efficiently as possible internally, making it cheaper than a 'D'/'M' pair for a file rename. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-07-10 04:58:23 +02:00			`const char *endp;`
			`struct tree_entry leaf;`

			`s = command_buf.buf + 2;`
Rework unquote_c_style to work on a strbuf. If the gain is not obvious in the diffstat, the resulting code is more readable, _and_ in checkout-index/update-index we now reuse the same buffer to unquote strings instead of always freeing/mallocing. This also is more coherent with the next patch that reworks quoting functions. The quoting function is also made more efficient scanning for backslashes and treating portions of strings without a backslash at once. Signed-off-by: Pierre Habouzit <madcoder@debian.org> 2007-09-20 00:42:14 +02:00			`strbuf_reset(&s_uq);`
			`if (!unquote_c_style(&s_uq, s, &endp)) {`
Support wholesale directory renames in fast-import Some source material (e.g. Subversion dump files) perform directory renames without telling us exactly which files in that subdirectory were moved. This makes it hard for a frontend to convert such data formats to a fast-import stream, as all the frontend has on hand is "Rename a/ to b/" with no details about what files are in a/, unless the frontend also kept track of all files. The new 'R' subcommand within a commit allows the frontend to rename either a file or an entire subdirectory, without needing to know the object's SHA-1 or the specific files contained within it. The rename is performed as efficiently as possible internally, making it cheaper than a 'D'/'M' pair for a file rename. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-07-10 04:58:23 +02:00			`if (*endp != ' ')`
			`die("Missing space after source: %s", command_buf.buf);`
Rework unquote_c_style to work on a strbuf. If the gain is not obvious in the diffstat, the resulting code is more readable, _and_ in checkout-index/update-index we now reuse the same buffer to unquote strings instead of always freeing/mallocing. This also is more coherent with the next patch that reworks quoting functions. The quoting function is also made more efficient scanning for backslashes and treating portions of strings without a backslash at once. Signed-off-by: Pierre Habouzit <madcoder@debian.org> 2007-09-20 00:42:14 +02:00			`} else {`
Support wholesale directory renames in fast-import Some source material (e.g. Subversion dump files) perform directory renames without telling us exactly which files in that subdirectory were moved. This makes it hard for a frontend to convert such data formats to a fast-import stream, as all the frontend has on hand is "Rename a/ to b/" with no details about what files are in a/, unless the frontend also kept track of all files. The new 'R' subcommand within a commit allows the frontend to rename either a file or an entire subdirectory, without needing to know the object's SHA-1 or the specific files contained within it. The rename is performed as efficiently as possible internally, making it cheaper than a 'D'/'M' pair for a file rename. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-07-10 04:58:23 +02:00			`endp = strchr(s, ' ');`
			`if (!endp)`
			`die("Missing space after source: %s", command_buf.buf);`
Rework unquote_c_style to work on a strbuf. If the gain is not obvious in the diffstat, the resulting code is more readable, _and_ in checkout-index/update-index we now reuse the same buffer to unquote strings instead of always freeing/mallocing. This also is more coherent with the next patch that reworks quoting functions. The quoting function is also made more efficient scanning for backslashes and treating portions of strings without a backslash at once. Signed-off-by: Pierre Habouzit <madcoder@debian.org> 2007-09-20 00:42:14 +02:00			`strbuf_add(&s_uq, s, endp - s);`
Support wholesale directory renames in fast-import Some source material (e.g. Subversion dump files) perform directory renames without telling us exactly which files in that subdirectory were moved. This makes it hard for a frontend to convert such data formats to a fast-import stream, as all the frontend has on hand is "Rename a/ to b/" with no details about what files are in a/, unless the frontend also kept track of all files. The new 'R' subcommand within a commit allows the frontend to rename either a file or an entire subdirectory, without needing to know the object's SHA-1 or the specific files contained within it. The rename is performed as efficiently as possible internally, making it cheaper than a 'D'/'M' pair for a file rename. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-07-10 04:58:23 +02:00			`}`
Rework unquote_c_style to work on a strbuf. If the gain is not obvious in the diffstat, the resulting code is more readable, _and_ in checkout-index/update-index we now reuse the same buffer to unquote strings instead of always freeing/mallocing. This also is more coherent with the next patch that reworks quoting functions. The quoting function is also made more efficient scanning for backslashes and treating portions of strings without a backslash at once. Signed-off-by: Pierre Habouzit <madcoder@debian.org> 2007-09-20 00:42:14 +02:00			`s = s_uq.buf;`
Support wholesale directory renames in fast-import Some source material (e.g. Subversion dump files) perform directory renames without telling us exactly which files in that subdirectory were moved. This makes it hard for a frontend to convert such data formats to a fast-import stream, as all the frontend has on hand is "Rename a/ to b/" with no details about what files are in a/, unless the frontend also kept track of all files. The new 'R' subcommand within a commit allows the frontend to rename either a file or an entire subdirectory, without needing to know the object's SHA-1 or the specific files contained within it. The rename is performed as efficiently as possible internally, making it cheaper than a 'D'/'M' pair for a file rename. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-07-10 04:58:23 +02:00
			`endp++;`
			`if (!*endp)`
			`die("Missing dest: %s", command_buf.buf);`

			`d = endp;`
Rework unquote_c_style to work on a strbuf. If the gain is not obvious in the diffstat, the resulting code is more readable, _and_ in checkout-index/update-index we now reuse the same buffer to unquote strings instead of always freeing/mallocing. This also is more coherent with the next patch that reworks quoting functions. The quoting function is also made more efficient scanning for backslashes and treating portions of strings without a backslash at once. Signed-off-by: Pierre Habouzit <madcoder@debian.org> 2007-09-20 00:42:14 +02:00			`strbuf_reset(&d_uq);`
			`if (!unquote_c_style(&d_uq, d, &endp)) {`
Support wholesale directory renames in fast-import Some source material (e.g. Subversion dump files) perform directory renames without telling us exactly which files in that subdirectory were moved. This makes it hard for a frontend to convert such data formats to a fast-import stream, as all the frontend has on hand is "Rename a/ to b/" with no details about what files are in a/, unless the frontend also kept track of all files. The new 'R' subcommand within a commit allows the frontend to rename either a file or an entire subdirectory, without needing to know the object's SHA-1 or the specific files contained within it. The rename is performed as efficiently as possible internally, making it cheaper than a 'D'/'M' pair for a file rename. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-07-10 04:58:23 +02:00			`if (*endp)`
			`die("Garbage after dest in: %s", command_buf.buf);`
Rework unquote_c_style to work on a strbuf. If the gain is not obvious in the diffstat, the resulting code is more readable, _and_ in checkout-index/update-index we now reuse the same buffer to unquote strings instead of always freeing/mallocing. This also is more coherent with the next patch that reworks quoting functions. The quoting function is also made more efficient scanning for backslashes and treating portions of strings without a backslash at once. Signed-off-by: Pierre Habouzit <madcoder@debian.org> 2007-09-20 00:42:14 +02:00			`d = d_uq.buf;`
Support wholesale directory renames in fast-import Some source material (e.g. Subversion dump files) perform directory renames without telling us exactly which files in that subdirectory were moved. This makes it hard for a frontend to convert such data formats to a fast-import stream, as all the frontend has on hand is "Rename a/ to b/" with no details about what files are in a/, unless the frontend also kept track of all files. The new 'R' subcommand within a commit allows the frontend to rename either a file or an entire subdirectory, without needing to know the object's SHA-1 or the specific files contained within it. The rename is performed as efficiently as possible internally, making it cheaper than a 'D'/'M' pair for a file rename. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-07-10 04:58:23 +02:00			`}`

			`memset(&leaf, 0, sizeof(leaf));`
Teach fast-import to recursively copy files/directories Some source material (e.g. Subversion dump files) perform directory renames by telling us the directory was copied, then deleted in the same revision. This makes it difficult for a frontend to convert such data formats to a fast-import stream, as all the frontend has on hand is "Copy a/ to b/; Delete a/" with no details about what files are in a/, unless the frontend also kept track of all files. The new 'C' subcommand within a commit allows the frontend to make a recursive copy of one path to another path within the branch, without needing to keep track of the individual file paths. The metadata copy is performed in memory efficiently, but is implemented as a copy-immediately operation, rather than copy-on-write. With this new 'C' subcommand frontends could obviously implement an 'R' (rename) on their own as a combination of 'C' and 'D' (delete), but since we have already offered up 'R' in the past and it is a trivial thing to keep implemented I'm not going to deprecate it. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-07-15 07:40:37 +02:00			`if (rename)`
			`tree_content_remove(&b->branch_tree, s, &leaf);`
			`else`
			`tree_content_get(&b->branch_tree, s, &leaf);`
Support wholesale directory renames in fast-import Some source material (e.g. Subversion dump files) perform directory renames without telling us exactly which files in that subdirectory were moved. This makes it hard for a frontend to convert such data formats to a fast-import stream, as all the frontend has on hand is "Rename a/ to b/" with no details about what files are in a/, unless the frontend also kept track of all files. The new 'R' subcommand within a commit allows the frontend to rename either a file or an entire subdirectory, without needing to know the object's SHA-1 or the specific files contained within it. The rename is performed as efficiently as possible internally, making it cheaper than a 'D'/'M' pair for a file rename. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-07-10 04:58:23 +02:00			`if (!leaf.versions[1].mode)`
			`die("Path %s not in branch", s);`
			`tree_content_set(&b->branch_tree, d,`
			`leaf.versions[1].sha1,`
			`leaf.versions[1].mode,`
			`leaf.tree);`
			`}`

Teach fast-import how to clear the internal branch content. Some frontends may not be able to (easily) keep track of which files are included in the branch, and which aren't. Performing this tracking can be tedious and error prone for the frontend to do, especially if its foreign data source cannot supply the changed path list on a per-commit basis. fast-import now allows a frontend to request that a branch's tree be wiped clean (reset to the empty tree) at the start of a commit, allowing the frontend to feed in all paths which belong on the branch. This is ideal for a tar-file importer frontend, for example, as the frontend just needs to reformat the tar data stream into a gfi data stream, which may be something a few Perl regexps can take care of. :) Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-07 08:03:03 +01:00			`static void file_change_deleteall(struct branch *b)`
			`{`
			`release_tree_content_recursive(b->branch_tree.tree);`
			`hashclr(b->branch_tree.versions[0].sha1);`
			`hashclr(b->branch_tree.versions[1].sha1);`
			`load_tree(&b->branch_tree);`
			`}`

Refactor fast-import branch creation from existing commit To resolve a corner case uncovered by Simon Hausmann I need to reuse the logic for the SHA-1 expression version of the 'from ' command within the mark version of the 'from ' command. This change doesn't alter any functionality, but is merely breaking the common code out to a function that I can reuse. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-05-24 06:05:19 +02:00			`static void cmd_from_commit(struct branch b, char buf, unsigned long size)`
			`{`
			`if (!buf \|\| size < 46)`
			`die("Not a valid commit: %s", sha1_to_hex(b->sha1));`
			`if (memcmp("tree ", buf, 5)`
			`\|\| get_sha1_hex(buf + 5, b->branch_tree.versions[1].sha1))`
			`die("The commit %s is corrupt", sha1_to_hex(b->sha1));`
			`hashcpy(b->branch_tree.versions[0].sha1,`
			`b->branch_tree.versions[1].sha1);`
			`}`

			`static void cmd_from_existing(struct branch *b)`
			`{`
			`if (is_null_sha1(b->sha1)) {`
			`hashclr(b->branch_tree.versions[0].sha1);`
			`hashclr(b->branch_tree.versions[1].sha1);`
			`} else {`
			`unsigned long size;`
			`char *buf;`

			`buf = read_object_with_reference(b->sha1,`
			`commit_type, &size, b->sha1);`
			`cmd_from_commit(b, buf, size);`
			`free(buf);`
			`}`
			`}`

Make trailing LF optional for all fast-import commands For the same reasons as the prior change we want to allow frontends to omit the trailing LF that usually delimits commands. In some cases these just make the input stream more verbose looking than it needs to be, and its just simpler for the frontend developer to get started if our parser is slightly more lenient about where an LF is required and where it isn't. To make this optional LF feature work we now have to buffer up to one line of input in command_buf. This buffering can happen if we look at the current input command but don't recognize it at this point in the code. In such a case we need to "unget" the entire line, but we cannot depend upon the stdio library to let us do ungetc() for that many characters at once. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-08-01 08:22:53 +02:00			`static int cmd_from(struct branch *b)`
Remove branch creation command from fast-import. Jon Smirl was finding it difficult to alter cvs2svn to generate branch commands prior to the first commit of the same branch. This change moves the 'from' command to be an optional parameter of the 'commit' command, thereby allowing a new branch to be defined at the moment it gets used to create the first commit on that branch. This change makes it impossible to create a branch with no commits on it as at least one commit is needed to register the branch. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-25 00:45:26 +02:00			`{`
Don't support shell-quoted refnames in fast-import. The current implementation of shell-style quoted refnames and SHA-1 expressions within fast-import contains a bad memory leak. We leak the unquoted strings used by the `from` and `merge` commands, maybe others. Its also just muddling up the docs. Since Git refnames cannot contain LF, and that is our delimiter for the end of the refname, and we accept any other character as-is, there is no reason for these strings to support quoting, except to be nice to frontends. But frontends shouldn't be expecting to use funny refs in Git, and its just as simple to never quote them as it is to always pass them through the same quoting filter as pathnames. So frontends should never quote refs, or ref expressions. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-06 02:30:37 +01:00			`const char *from;`
Remove branch creation command from fast-import. Jon Smirl was finding it difficult to alter cvs2svn to generate branch commands prior to the first commit of the same branch. This change moves the 'from' command to be an optional parameter of the 'commit' command, thereby allowing a new branch to be defined at the moment it gets used to create the first commit on that branch. This change makes it impossible to create a branch with no commits on it as at least one commit is needed to register the branch. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-25 00:45:26 +02:00			`struct branch *s;`

prefixcmp(): fix-up mechanical conversion. Previous step converted use of strncmp() with literal string mechanically even when the result is only used as a boolean: if (!strncmp("foo", arg, 3)) ==> if (!(-prefixcmp(arg, "foo"))) This step manually cleans them up to read: if (!prefixcmp(arg, "foo")) Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-20 10:54:00 +01:00			`if (prefixcmp(command_buf.buf, "from "))`
Make trailing LF optional for all fast-import commands For the same reasons as the prior change we want to allow frontends to omit the trailing LF that usually delimits commands. In some cases these just make the input stream more verbose looking than it needs to be, and its just simpler for the frontend developer to get started if our parser is slightly more lenient about where an LF is required and where it isn't. To make this optional LF feature work we now have to buffer up to one line of input in command_buf. This buffering can happen if we look at the current input command but don't recognize it at this point in the code. In such a case we need to "unget" the entire line, but we cannot depend upon the stdio library to let us do ungetc() for that many characters at once. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-08-01 08:22:53 +02:00			`return 0;`
Remove branch creation command from fast-import. Jon Smirl was finding it difficult to alter cvs2svn to generate branch commands prior to the first commit of the same branch. This change moves the 'from' command to be an optional parameter of the 'commit' command, thereby allowing a new branch to be defined at the moment it gets used to create the first commit on that branch. This change makes it impossible to create a branch with no commits on it as at least one commit is needed to register the branch. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-25 00:45:26 +02:00
fast-import: Support reusing 'from' and brown paper bag fix reset. It was suggested on the mailing list that being able to use `from` in any commit to reset the current branch is useful in some types of importers, such as a darcs importer. We originally did not permit resetting an existing branch with a new `from` command during a `commit` command, but this restriction was only to help debug the hacked up cvs2svn that Jon Smirl was developing in parallel with git-fast-import. It is probably more of a problem to disallow it than to allow it. So now we permit a `from` during any `commit`. While making the changes required to permit multiple `from` commands on the same branch, I discovered we no longer needed the last_commit field to be set to 0 during a reset, so that was removed. (Reset was originally setting the field to 0 to signal cmd_from() that it was OK to execute on the branch.) While poking around in this section of fast-import I also realized the `reset` command was not working as intended if the corresponding `from` command was omitted (as allowed by the BNF grammar and the code). If `from` was omitted we cleared out the tree but we left the tree SHA-1 and parent commit SHA-1 intact. This is not what the user intended in this case. Instead they would be trying to reset the branch to have no parent and to have no tree, making the branch look new-born during the next commit. We now clear these SHA-1 values during `reset`, ensuring the branch looks new-born if `from` does not get supplied. New test cases for these were also added. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-12 10:08:43 +01:00			`if (b->branch_tree.tree) {`
			`release_tree_content_recursive(b->branch_tree.tree);`
			`b->branch_tree.tree = NULL;`
			`}`
Remove branch creation command from fast-import. Jon Smirl was finding it difficult to alter cvs2svn to generate branch commands prior to the first commit of the same branch. This change moves the 'from' command to be an optional parameter of the 'commit' command, thereby allowing a new branch to be defined at the moment it gets used to create the first commit on that branch. This change makes it impossible to create a branch with no commits on it as at least one commit is needed to register the branch. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-25 00:45:26 +02:00
			`from = strchr(command_buf.buf, ' ') + 1;`
			`s = lookup_branch(from);`
			`if (b == s)`
			`die("Can't create a branch from itself: %s", b->name);`
			`else if (s) {`
Implemented tree delta compression in fast-import. We now store for every tree entry two modes and two sha1 values; the base (aka "version 0") and the current/new (aka "version 1"). When we generate a tree object we also regenerate the prior version object and use that as our base object for a delta. This strategy saves a significant amount of memory as we can continue to use the atom pool for file/directory names and only increases each tree entry by an additional 24 bytes of memory. Branches should automatically delta against their ancestor tree, unless the ancestor tree is already at the delta chain limit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-28 18:22:50 +02:00			`unsigned char *t = s->branch_tree.versions[1].sha1;`
Converted hash memcpy/memcmp to new hashcpy/hashcmp/hashclr. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-28 16:46:58 +02:00			`hashcpy(b->sha1, s->sha1);`
Implemented tree delta compression in fast-import. We now store for every tree entry two modes and two sha1 values; the base (aka "version 0") and the current/new (aka "version 1"). When we generate a tree object we also regenerate the prior version object and use that as our base object for a delta. This strategy saves a significant amount of memory as we can continue to use the atom pool for file/directory names and only increases each tree entry by an additional 24 bytes of memory. Branches should automatically delta against their ancestor tree, unless the ancestor tree is already at the delta chain limit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-28 18:22:50 +02:00			`hashcpy(b->branch_tree.versions[0].sha1, t);`
			`hashcpy(b->branch_tree.versions[1].sha1, t);`
Remove branch creation command from fast-import. Jon Smirl was finding it difficult to alter cvs2svn to generate branch commands prior to the first commit of the same branch. This change moves the 'from' command to be an optional parameter of the 'commit' command, thereby allowing a new branch to be defined at the moment it gets used to create the first commit on that branch. This change makes it impossible to create a branch with no commits on it as at least one commit is needed to register the branch. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-25 00:45:26 +02:00			`} else if (*from == ':') {`
Use uintmax_t for marks in fast-import. If a frontend wants to use a mark per file revision and per commit and is doing a truly huge import (such as a 32 GiB SVN repository) we may need more than 2**32 unique mark values, especially if the frontend is unable (or unwilling) to recycle mark values. For mark idnums we should use the largest unsigned integer type available, hoping that will be at least 64 bits when we are compiled as a 64 bit executable. This way we may consume huge amounts of memory storing our mark table, but we'll at least be able to process the entire import without failing. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-16 06:33:19 +01:00			`uintmax_t idnum = strtoumax(from + 1, NULL, 10);`
Remove branch creation command from fast-import. Jon Smirl was finding it difficult to alter cvs2svn to generate branch commands prior to the first commit of the same branch. This change moves the 'from' command to be an optional parameter of the 'commit' command, thereby allowing a new branch to be defined at the moment it gets used to create the first commit on that branch. This change makes it impossible to create a branch with no commits on it as at least one commit is needed to register the branch. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-25 00:45:26 +02:00			`struct object_entry *oe = find_mark(idnum);`
			`if (oe->type != OBJ_COMMIT)`
Check for PRIuMAX rather than NO_C99_FORMAT in fast-import.c. Thanks to Simon 'corecode' Schubert <corecode@fs.ei.tum.de> for the clean-up. Defining the C99 standard PRIuMAX when necessary replaces UM_FMT and the awkward UM10_FMT. There are no direct C99 translations for other uses of NO_C99_FORMAT in git, alas. Signed-off-by: Jason Riedy <ejr@cs.berkeley.edu> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-21 02:34:56 +01:00			`die("Mark :%" PRIuMAX " not a commit", idnum);`
Converted hash memcpy/memcmp to new hashcpy/hashcmp/hashclr. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-28 16:46:58 +02:00			`hashcpy(b->sha1, oe->sha1);`
Fix possible coredump with fast-import --import-marks When e8438420bb7d368bec3647b90c557b9931582267 allowed us to reload the marks table on subsequent runs of fast-import we really broke things, as we set pack_id to MAX_PACK_ID for any objects we imported into the marks table. Creating a branch from that mark should fail as we attempt to read the object through a non-existant packed_git pointer. Instead we have to use the normal Git object system to locate the older commit, as we ourselves do not have a reference to the packed_git it resides in. This bug only occurred because t9300 was not complete enough. When we added the --import-marks feature we didn't actually test its implementation enough to verify the function worked as intended. I have corrected that, and included the changes as part of this fix. Prior versions of fast-import fail the new test(s); this commit allows them to pass. Credit for this bug find goes to Simon Hausmann <simon@lst.de> as he recently identified a similiar bug in the tree lazy-loading path. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-05-24 06:32:31 +02:00			`if (oe->pack_id != MAX_PACK_ID) {`
Remove branch creation command from fast-import. Jon Smirl was finding it difficult to alter cvs2svn to generate branch commands prior to the first commit of the same branch. This change moves the 'from' command to be an optional parameter of the 'commit' command, thereby allowing a new branch to be defined at the moment it gets used to create the first commit on that branch. This change makes it impossible to create a branch with no commits on it as at least one commit is needed to register the branch. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-25 00:45:26 +02:00			`unsigned long size;`
Fix possible coredump with fast-import --import-marks When e8438420bb7d368bec3647b90c557b9931582267 allowed us to reload the marks table on subsequent runs of fast-import we really broke things, as we set pack_id to MAX_PACK_ID for any objects we imported into the marks table. Creating a branch from that mark should fail as we attempt to read the object through a non-existant packed_git pointer. Instead we have to use the normal Git object system to locate the older commit, as we ourselves do not have a reference to the packed_git it resides in. This bug only occurred because t9300 was not complete enough. When we added the --import-marks feature we didn't actually test its implementation enough to verify the function worked as intended. I have corrected that, and included the changes as part of this fix. Prior versions of fast-import fail the new test(s); this commit allows them to pass. Credit for this bug find goes to Simon Hausmann <simon@lst.de> as he recently identified a similiar bug in the tree lazy-loading path. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-05-24 06:32:31 +02:00			`char *buf = gfi_unpack_entry(oe, &size);`
			`cmd_from_commit(b, buf, size);`
Remove branch creation command from fast-import. Jon Smirl was finding it difficult to alter cvs2svn to generate branch commands prior to the first commit of the same branch. This change moves the 'from' command to be an optional parameter of the 'commit' command, thereby allowing a new branch to be defined at the moment it gets used to create the first commit on that branch. This change makes it impossible to create a branch with no commits on it as at least one commit is needed to register the branch. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-25 00:45:26 +02:00			`free(buf);`
Fix possible coredump with fast-import --import-marks When e8438420bb7d368bec3647b90c557b9931582267 allowed us to reload the marks table on subsequent runs of fast-import we really broke things, as we set pack_id to MAX_PACK_ID for any objects we imported into the marks table. Creating a branch from that mark should fail as we attempt to read the object through a non-existant packed_git pointer. Instead we have to use the normal Git object system to locate the older commit, as we ourselves do not have a reference to the packed_git it resides in. This bug only occurred because t9300 was not complete enough. When we added the --import-marks feature we didn't actually test its implementation enough to verify the function worked as intended. I have corrected that, and included the changes as part of this fix. Prior versions of fast-import fail the new test(s); this commit allows them to pass. Credit for this bug find goes to Simon Hausmann <simon@lst.de> as he recently identified a similiar bug in the tree lazy-loading path. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-05-24 06:32:31 +02:00			`} else`
			`cmd_from_existing(b);`
Refactor fast-import branch creation from existing commit To resolve a corner case uncovered by Simon Hausmann I need to reuse the logic for the SHA-1 expression version of the 'from ' command within the mark version of the 'from ' command. This change doesn't alter any functionality, but is merely breaking the common code out to a function that I can reuse. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-05-24 06:05:19 +02:00			`} else if (!get_sha1(from, b->sha1))`
			`cmd_from_existing(b);`
			`else`
Remove branch creation command from fast-import. Jon Smirl was finding it difficult to alter cvs2svn to generate branch commands prior to the first commit of the same branch. This change moves the 'from' command to be an optional parameter of the 'commit' command, thereby allowing a new branch to be defined at the moment it gets used to create the first commit on that branch. This change makes it impossible to create a branch with no commits on it as at least one commit is needed to register the branch. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-25 00:45:26 +02:00			`die("Invalid ref name or SHA1 expression: %s", from);`

			`read_next_command();`
Make trailing LF optional for all fast-import commands For the same reasons as the prior change we want to allow frontends to omit the trailing LF that usually delimits commands. In some cases these just make the input stream more verbose looking than it needs to be, and its just simpler for the frontend developer to get started if our parser is slightly more lenient about where an LF is required and where it isn't. To make this optional LF feature work we now have to buffer up to one line of input in command_buf. This buffering can happen if we look at the current input command but don't recognize it at this point in the code. In such a case we need to "unget" the entire line, but we cannot depend upon the stdio library to let us do ungetc() for that many characters at once. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-08-01 08:22:53 +02:00			`return 1;`
Remove branch creation command from fast-import. Jon Smirl was finding it difficult to alter cvs2svn to generate branch commands prior to the first commit of the same branch. This change moves the 'from' command to be an optional parameter of the 'commit' command, thereby allowing a new branch to be defined at the moment it gets used to create the first commit on that branch. This change makes it impossible to create a branch with no commits on it as at least one commit is needed to register the branch. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-25 00:45:26 +02:00			`}`

Correct minor style issue in fast-import. Junio noticed that I was using a different style in fast-import for returned pointers than the rest of Git. Before merging this code into the main git.git tree I'd like to make it consistent, as this style variation was not intentional. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-06 06:43:59 +01:00			`static struct hash_list cmd_merge(unsigned int count)`
Support creation of merge commits in fast-import. Some importers are able to determine when branch merges occurred within their source data. In these cases they will want to supply the correct commits to fast-import so that a proper merge commit will exist in Git. This is now supported by supplying a 'merge ' command after the commit message and optional from command. A merge is not actually performed by fast-import, its assumed that the frontend performed any sort of merging activity already and that fast-import should simply be storing its result. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-12 04:21:38 +01:00			`{`
Correct compiler warnings in fast-import. Junio noticed these warnings/errors in fast-import when compiling with `-Werror -ansi -pedantic`. A few changes are to reduce compiler warnings, while one (in cmd_merge) is a bug fix. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-06 06:26:49 +01:00			`struct hash_list list = NULL, n, *e = e;`
Don't support shell-quoted refnames in fast-import. The current implementation of shell-style quoted refnames and SHA-1 expressions within fast-import contains a bad memory leak. We leak the unquoted strings used by the `from` and `merge` commands, maybe others. Its also just muddling up the docs. Since Git refnames cannot contain LF, and that is our delimiter for the end of the refname, and we accept any other character as-is, there is no reason for these strings to support quoting, except to be nice to frontends. But frontends shouldn't be expecting to use funny refs in Git, and its just as simple to never quote them as it is to always pass them through the same quoting filter as pathnames. So frontends should never quote refs, or ref expressions. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-06 02:30:37 +01:00			`const char *from;`
Support creation of merge commits in fast-import. Some importers are able to determine when branch merges occurred within their source data. In these cases they will want to supply the correct commits to fast-import so that a proper merge commit will exist in Git. This is now supported by supplying a 'merge ' command after the commit message and optional from command. A merge is not actually performed by fast-import, its assumed that the frontend performed any sort of merging activity already and that fast-import should simply be storing its result. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-12 04:21:38 +01:00			`struct branch *s;`

			`*count = 0;`
prefixcmp(): fix-up mechanical conversion. Previous step converted use of strncmp() with literal string mechanically even when the result is only used as a boolean: if (!strncmp("foo", arg, 3)) ==> if (!(-prefixcmp(arg, "foo"))) This step manually cleans them up to read: if (!prefixcmp(arg, "foo")) Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-20 10:54:00 +01:00			`while (!prefixcmp(command_buf.buf, "merge ")) {`
Support creation of merge commits in fast-import. Some importers are able to determine when branch merges occurred within their source data. In these cases they will want to supply the correct commits to fast-import so that a proper merge commit will exist in Git. This is now supported by supplying a 'merge ' command after the commit message and optional from command. A merge is not actually performed by fast-import, its assumed that the frontend performed any sort of merging activity already and that fast-import should simply be storing its result. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-12 04:21:38 +01:00			`from = strchr(command_buf.buf, ' ') + 1;`
			`n = xmalloc(sizeof(*n));`
			`s = lookup_branch(from);`
			`if (s)`
			`hashcpy(n->sha1, s->sha1);`
			`else if (*from == ':') {`
Use uintmax_t for marks in fast-import. If a frontend wants to use a mark per file revision and per commit and is doing a truly huge import (such as a 32 GiB SVN repository) we may need more than 2**32 unique mark values, especially if the frontend is unable (or unwilling) to recycle mark values. For mark idnums we should use the largest unsigned integer type available, hoping that will be at least 64 bits when we are compiled as a 64 bit executable. This way we may consume huge amounts of memory storing our mark table, but we'll at least be able to process the entire import without failing. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-16 06:33:19 +01:00			`uintmax_t idnum = strtoumax(from + 1, NULL, 10);`
Support creation of merge commits in fast-import. Some importers are able to determine when branch merges occurred within their source data. In these cases they will want to supply the correct commits to fast-import so that a proper merge commit will exist in Git. This is now supported by supplying a 'merge ' command after the commit message and optional from command. A merge is not actually performed by fast-import, its assumed that the frontend performed any sort of merging activity already and that fast-import should simply be storing its result. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-12 04:21:38 +01:00			`struct object_entry *oe = find_mark(idnum);`
			`if (oe->type != OBJ_COMMIT)`
Check for PRIuMAX rather than NO_C99_FORMAT in fast-import.c. Thanks to Simon 'corecode' Schubert <corecode@fs.ei.tum.de> for the clean-up. Defining the C99 standard PRIuMAX when necessary replaces UM_FMT and the awkward UM10_FMT. There are no direct C99 translations for other uses of NO_C99_FORMAT in git, alas. Signed-off-by: Jason Riedy <ejr@cs.berkeley.edu> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-21 02:34:56 +01:00			`die("Mark :%" PRIuMAX " not a commit", idnum);`
Support creation of merge commits in fast-import. Some importers are able to determine when branch merges occurred within their source data. In these cases they will want to supply the correct commits to fast-import so that a proper merge commit will exist in Git. This is now supported by supplying a 'merge ' command after the commit message and optional from command. A merge is not actually performed by fast-import, its assumed that the frontend performed any sort of merging activity already and that fast-import should simply be storing its result. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-12 04:21:38 +01:00			`hashcpy(n->sha1, oe->sha1);`
fast-import: Fail if a non-existant commit is used for merge Johannes Sixt noticed during one of his own imports that fast-import did not fail if a non-existant commit is referenced by SHA-1 value as an argument to the 'merge' command. This allowed the user to unknowingly create commits that would fail in fsck, as the commit contents would not be completely reachable. A side effect of this bug was that a frontend process could mark any SHA-1 object (blob, tree, tag) as a parent of a merge commit. This should also fail in fsck, as the commit is not a valid commit. We now use the same rule as the 'from' command. If a commit is referenced in the 'merge' command by hex formatted SHA-1 then the SHA-1 must be a commit or a tag that can be peeled back to a commit, the commit must already exist, and must be readable by the core Git infrastructure code. This requirement means that the commit must have existed prior to fast-import starting, or the commit must have been flushed out by a prior 'checkpoint' command. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-03-05 18:43:14 +01:00			`} else if (!get_sha1(from, n->sha1)) {`
			`unsigned long size;`
			`char *buf = read_object_with_reference(n->sha1,`
Merge branch 'maint' * maint: fast-import: Fail if a non-existant commit is used for merge fast-import: Avoid infinite loop after reset [sp: Minor evil merge to deal with type_names array moving to be private in 'master'.] 2007-03-05 18:49:02 +01:00			`commit_type, &size, n->sha1);`
fast-import: Fail if a non-existant commit is used for merge Johannes Sixt noticed during one of his own imports that fast-import did not fail if a non-existant commit is referenced by SHA-1 value as an argument to the 'merge' command. This allowed the user to unknowingly create commits that would fail in fsck, as the commit contents would not be completely reachable. A side effect of this bug was that a frontend process could mark any SHA-1 object (blob, tree, tag) as a parent of a merge commit. This should also fail in fsck, as the commit is not a valid commit. We now use the same rule as the 'from' command. If a commit is referenced in the 'merge' command by hex formatted SHA-1 then the SHA-1 must be a commit or a tag that can be peeled back to a commit, the commit must already exist, and must be readable by the core Git infrastructure code. This requirement means that the commit must have existed prior to fast-import starting, or the commit must have been flushed out by a prior 'checkpoint' command. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-03-05 18:43:14 +01:00			`if (!buf \|\| size < 46)`
			`die("Not a valid commit: %s", from);`
			`free(buf);`
			`} else`
Support creation of merge commits in fast-import. Some importers are able to determine when branch merges occurred within their source data. In these cases they will want to supply the correct commits to fast-import so that a proper merge commit will exist in Git. This is now supported by supplying a 'merge ' command after the commit message and optional from command. A merge is not actually performed by fast-import, its assumed that the frontend performed any sort of merging activity already and that fast-import should simply be storing its result. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-12 04:21:38 +01:00			`die("Invalid ref name or SHA1 expression: %s", from);`

			`n->next = NULL;`
			`if (list)`
			`e->next = n;`
			`else`
			`list = n;`
			`e = n;`
Correct compiler warnings in fast-import. Junio noticed these warnings/errors in fast-import when compiling with `-Werror -ansi -pedantic`. A few changes are to reduce compiler warnings, while one (in cmd_merge) is a bug fix. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-06 06:26:49 +01:00			`(*count)++;`
Support creation of merge commits in fast-import. Some importers are able to determine when branch merges occurred within their source data. In these cases they will want to supply the correct commits to fast-import so that a proper merge commit will exist in Git. This is now supported by supplying a 'merge ' command after the commit message and optional from command. A merge is not actually performed by fast-import, its assumed that the frontend performed any sort of merging activity already and that fast-import should simply be storing its result. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-12 04:21:38 +01:00			`read_next_command();`
			`}`
			`return list;`
			`}`

Declare no-arg functions as (void) in fast-import. Apparently the git convention is to declare any function which takes no arguments as taking void. I did not do this during the early fast-import development, but should have. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-17 07:47:25 +01:00			`static void cmd_new_commit(void)`
Implemented branch handling and basic tree support in fast-import. This provides the basic data structures needed to store trees in memory while we are processing them for a branch. What we are attempting to do is track one complete tree for each branch that the frontend has registered with us through the 'newb' (new_branch) command. When the frontend edits that tree through 'updf' or 'delf' commands we'll mark the affected tree(s) as being dirty and recompute their objects during 'comt' (commit). Currently the protocol is decidedly _not_ user friendly. I crashed fast-import by giving it bad input data from Perl. I may try to improve upon it, or at least upon its error handling. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 09:36:45 +02:00			`{`
fast-import was using dbuf's, replace them with strbuf's. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-17 13:48:17 +02:00			`static struct strbuf msg = STRBUF_INIT;`
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00			`struct branch *b;`
			`char *sp;`
			`char *author = NULL;`
			`char *committer = NULL;`
Support creation of merge commits in fast-import. Some importers are able to determine when branch merges occurred within their source data. In these cases they will want to supply the correct commits to fast-import so that a proper merge commit will exist in Git. This is now supported by supplying a 'merge ' command after the commit message and optional from command. A merge is not actually performed by fast-import, its assumed that the frontend performed any sort of merging activity already and that fast-import should simply be storing its result. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-12 04:21:38 +01:00			`struct hash_list *merge_list = NULL;`
			`unsigned int merge_count;`
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00
			`/* Obtain the branch name from the rest of our command */`
			`sp = strchr(command_buf.buf, ' ') + 1;`
			`b = lookup_branch(sp);`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`if (!b)`
Remove branch creation command from fast-import. Jon Smirl was finding it difficult to alter cvs2svn to generate branch commands prior to the first commit of the same branch. This change moves the 'from' command to be an optional parameter of the 'commit' command, thereby allowing a new branch to be defined at the moment it gets used to create the first commit on that branch. This change makes it impossible to create a branch with no commits on it as at least one commit is needed to register the branch. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-25 00:45:26 +02:00			`b = new_branch(sp);`
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00
			`read_next_command();`
			`cmd_mark();`
prefixcmp(): fix-up mechanical conversion. Previous step converted use of strncmp() with literal string mechanically even when the result is only used as a boolean: if (!strncmp("foo", arg, 3)) ==> if (!(-prefixcmp(arg, "foo"))) This step manually cleans them up to read: if (!prefixcmp(arg, "foo")) Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-20 10:54:00 +01:00			`if (!prefixcmp(command_buf.buf, "author ")) {`
Support RFC 2822 date parsing in fast-import. Since some frontends may be working with source material where the dates are only readily available as RFC 2822 strings, it is more friendly if fast-import exposes Git's parse_date() function to handle the conversion. This way the frontend doesn't need to perform the parsing itself. The new --date-format option to fast-import can be used by a frontend to select which format it will supply date strings in. The default is the standard `raw` Git format, which fast-import has always supported. Format rfc2822 can be used to activate the parse_date() function instead. Because fast-import could also be useful for creating new, current commits, the format `now` is also supported to generate the current system timestamp. The implementation of `now` is a trivial call to datestamp(), but is actually a whole whopping 3 lines so that fast-import can verify the frontend really meant `now`. As part of this change I have added validation of the `raw` date format. Prior to this change fast-import would accept anything in a `committer` command, even if it was seriously malformed. Now fast-import requires the '> ' near the end of the string and verifies the timestamp is formatted properly. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-06 20:58:30 +01:00			`author = parse_ident(command_buf.buf + 7);`
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00			`read_next_command();`
			`}`
prefixcmp(): fix-up mechanical conversion. Previous step converted use of strncmp() with literal string mechanically even when the result is only used as a boolean: if (!strncmp("foo", arg, 3)) ==> if (!(-prefixcmp(arg, "foo"))) This step manually cleans them up to read: if (!prefixcmp(arg, "foo")) Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-20 10:54:00 +01:00			`if (!prefixcmp(command_buf.buf, "committer ")) {`
Support RFC 2822 date parsing in fast-import. Since some frontends may be working with source material where the dates are only readily available as RFC 2822 strings, it is more friendly if fast-import exposes Git's parse_date() function to handle the conversion. This way the frontend doesn't need to perform the parsing itself. The new --date-format option to fast-import can be used by a frontend to select which format it will supply date strings in. The default is the standard `raw` Git format, which fast-import has always supported. Format rfc2822 can be used to activate the parse_date() function instead. Because fast-import could also be useful for creating new, current commits, the format `now` is also supported to generate the current system timestamp. The implementation of `now` is a trivial call to datestamp(), but is actually a whole whopping 3 lines so that fast-import can verify the frontend really meant `now`. As part of this change I have added validation of the `raw` date format. Prior to this change fast-import would accept anything in a `committer` command, even if it was seriously malformed. Now fast-import requires the '> ' near the end of the string and verifies the timestamp is formatted properly. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-06 20:58:30 +01:00			`committer = parse_ident(command_buf.buf + 10);`
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00			`read_next_command();`
			`}`
			`if (!committer)`
			`die("Expected committer but didn't get one");`
fast-import was using dbuf's, replace them with strbuf's. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-17 13:48:17 +02:00			`cmd_data(&msg);`
Moved from command to after data to help cvs2svn. cvs2svn has three phases: begin_commit, middle_commit, end_commit. The ancester is computed in the middle_commit phase. So its easier to generate a stream if the from command appears after the commit message itself but before the file change commands. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-25 04:38:13 +02:00			`read_next_command();`
			`cmd_from(b);`
Support creation of merge commits in fast-import. Some importers are able to determine when branch merges occurred within their source data. In these cases they will want to supply the correct commits to fast-import so that a proper merge commit will exist in Git. This is now supported by supplying a 'merge ' command after the commit message and optional from command. A merge is not actually performed by fast-import, its assumed that the frontend performed any sort of merging activity already and that fast-import should simply be storing its result. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-12 04:21:38 +01:00			`merge_list = cmd_merge(&merge_count);`
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00
			`/* ensure the branch is active/loaded */`
Implemented tree reloading in fast-import. Tree reloading allows fast-import to swap out the least-recently used branch by simply deallocating the data structures from memory that were associated with that branch. Later if the branch becomes active again it can lazily recreate those structures on demand by reloading the necessary trees from the pack file it originally wrote them to. The reloading process is implemented by mmap'ing the pack into memory and using a much tighter variant of the pack reading code contained in sha1_file.c. This was a blatent copy from sha1_file.c but the unpacking functions were significantly simplified and are actually now in a form that should make it easier to map only the necessary regions of a pack rather than the entire file. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-24 10:37:35 +02:00			`if (!b->branch_tree.tree \|\| !max_active_branches) {`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`unload_one_branch();`
			`load_branch(b);`
			`}`
Implemented branch handling and basic tree support in fast-import. This provides the basic data structures needed to store trees in memory while we are processing them for a branch. What we are attempting to do is track one complete tree for each branch that the frontend has registered with us through the 'newb' (new_branch) command. When the frontend edits that tree through 'updf' or 'delf' commands we'll mark the affected tree(s) as being dirty and recompute their objects during 'comt' (commit). Currently the protocol is decidedly _not_ user friendly. I crashed fast-import by giving it bad input data from Perl. I may try to improve upon it, or at least upon its error handling. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 09:36:45 +02:00
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`/* file_change* */`
Drop strbuf's 'eof' marker, and make read_line a first class citizen. read_line is now strbuf_getline, and is a first class citizen, it returns 0 when reading a line worked, EOF else. The ->eof marker was used non-locally by fast-import.c, mimic the same behaviour using a static int in "read_next_command", that now returns -1 on EOF, and avoids to call strbuf_getline when it's in EOF state. Also no longer automagically strbuf_release the buffer, it's counter intuitive and breaks fast-import in a very subtle way. Note: being at EOF implies that command_buf.len == 0. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-17 11:19:04 +02:00			`while (command_buf.len > 0) {`
Make trailing LF optional for all fast-import commands For the same reasons as the prior change we want to allow frontends to omit the trailing LF that usually delimits commands. In some cases these just make the input stream more verbose looking than it needs to be, and its just simpler for the frontend developer to get started if our parser is slightly more lenient about where an LF is required and where it isn't. To make this optional LF feature work we now have to buffer up to one line of input in command_buf. This buffering can happen if we look at the current input command but don't recognize it at this point in the code. In such a case we need to "unget" the entire line, but we cannot depend upon the stdio library to let us do ungetc() for that many characters at once. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-08-01 08:22:53 +02:00			`if (!prefixcmp(command_buf.buf, "M "))`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`file_change_m(b);`
prefixcmp(): fix-up mechanical conversion. Previous step converted use of strncmp() with literal string mechanically even when the result is only used as a boolean: if (!strncmp("foo", arg, 3)) ==> if (!(-prefixcmp(arg, "foo"))) This step manually cleans them up to read: if (!prefixcmp(arg, "foo")) Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-20 10:54:00 +01:00			`else if (!prefixcmp(command_buf.buf, "D "))`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`file_change_d(b);`
Support wholesale directory renames in fast-import Some source material (e.g. Subversion dump files) perform directory renames without telling us exactly which files in that subdirectory were moved. This makes it hard for a frontend to convert such data formats to a fast-import stream, as all the frontend has on hand is "Rename a/ to b/" with no details about what files are in a/, unless the frontend also kept track of all files. The new 'R' subcommand within a commit allows the frontend to rename either a file or an entire subdirectory, without needing to know the object's SHA-1 or the specific files contained within it. The rename is performed as efficiently as possible internally, making it cheaper than a 'D'/'M' pair for a file rename. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-07-10 04:58:23 +02:00			`else if (!prefixcmp(command_buf.buf, "R "))`
Teach fast-import to recursively copy files/directories Some source material (e.g. Subversion dump files) perform directory renames by telling us the directory was copied, then deleted in the same revision. This makes it difficult for a frontend to convert such data formats to a fast-import stream, as all the frontend has on hand is "Copy a/ to b/; Delete a/" with no details about what files are in a/, unless the frontend also kept track of all files. The new 'C' subcommand within a commit allows the frontend to make a recursive copy of one path to another path within the branch, without needing to keep track of the individual file paths. The metadata copy is performed in memory efficiently, but is implemented as a copy-immediately operation, rather than copy-on-write. With this new 'C' subcommand frontends could obviously implement an 'R' (rename) on their own as a combination of 'C' and 'D' (delete), but since we have already offered up 'R' in the past and it is a trivial thing to keep implemented I'm not going to deprecate it. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-07-15 07:40:37 +02:00			`file_change_cr(b, 1);`
			`else if (!prefixcmp(command_buf.buf, "C "))`
			`file_change_cr(b, 0);`
Teach fast-import how to clear the internal branch content. Some frontends may not be able to (easily) keep track of which files are included in the branch, and which aren't. Performing this tracking can be tedious and error prone for the frontend to do, especially if its foreign data source cannot supply the changed path list on a per-commit basis. fast-import now allows a frontend to request that a branch's tree be wiped clean (reset to the empty tree) at the start of a commit, allowing the frontend to feed in all paths which belong on the branch. This is ideal for a tar-file importer frontend, for example, as the frontend just needs to reformat the tar data stream into a gfi data stream, which may be something a few Perl regexps can take care of. :) Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-07 08:03:03 +01:00			`else if (!strcmp("deleteall", command_buf.buf))`
			`file_change_deleteall(b);`
Make trailing LF optional for all fast-import commands For the same reasons as the prior change we want to allow frontends to omit the trailing LF that usually delimits commands. In some cases these just make the input stream more verbose looking than it needs to be, and its just simpler for the frontend developer to get started if our parser is slightly more lenient about where an LF is required and where it isn't. To make this optional LF feature work we now have to buffer up to one line of input in command_buf. This buffering can happen if we look at the current input command but don't recognize it at this point in the code. In such a case we need to "unget" the entire line, but we cannot depend upon the stdio library to let us do ungetc() for that many characters at once. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-08-01 08:22:53 +02:00			`else {`
			`unread_command_buf = 1;`
			`break;`
			`}`
Drop strbuf's 'eof' marker, and make read_line a first class citizen. read_line is now strbuf_getline, and is a first class citizen, it returns 0 when reading a line worked, EOF else. The ->eof marker was used non-locally by fast-import.c, mimic the same behaviour using a static int in "read_next_command", that now returns -1 on EOF, and avoids to call strbuf_getline when it's in EOF state. Also no longer automagically strbuf_release the buffer, it's counter intuitive and breaks fast-import in a very subtle way. Note: being at EOF implies that command_buf.len == 0. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-17 11:19:04 +02:00			`if (read_next_command() == EOF)`
			`break;`
Implemented branch handling and basic tree support in fast-import. This provides the basic data structures needed to store trees in memory while we are processing them for a branch. What we are attempting to do is track one complete tree for each branch that the frontend has registered with us through the 'newb' (new_branch) command. When the frontend edits that tree through 'updf' or 'delf' commands we'll mark the affected tree(s) as being dirty and recompute their objects during 'comt' (commit). Currently the protocol is decidedly _not_ user friendly. I crashed fast-import by giving it bad input data from Perl. I may try to improve upon it, or at least upon its error handling. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 09:36:45 +02:00			`}`

Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00			`/* build the tree and the commit */`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`store_tree(&b->branch_tree);`
Additional fast-import tree delta corruption cleanups. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-29 04:06:13 +02:00			`hashcpy(b->branch_tree.versions[0].sha1,`
			`b->branch_tree.versions[1].sha1);`
fast-import was using dbuf's, replace them with strbuf's. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-17 13:48:17 +02:00
			`strbuf_reset(&new_data);`
			`strbuf_addf(&new_data, "tree %s\n",`
Implemented tree delta compression in fast-import. We now store for every tree entry two modes and two sha1 values; the base (aka "version 0") and the current/new (aka "version 1"). When we generate a tree object we also regenerate the prior version object and use that as our base object for a delta. This strategy saves a significant amount of memory as we can continue to use the atom pool for file/directory names and only increases each tree entry by an additional 24 bytes of memory. Branches should automatically delta against their ancestor tree, unless the ancestor tree is already at the delta chain limit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-28 18:22:50 +02:00			`sha1_to_hex(b->branch_tree.versions[1].sha1));`
Converted hash memcpy/memcmp to new hashcpy/hashcmp/hashclr. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-28 16:46:58 +02:00			`if (!is_null_sha1(b->sha1))`
fast-import was using dbuf's, replace them with strbuf's. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-17 13:48:17 +02:00			`strbuf_addf(&new_data, "parent %s\n", sha1_to_hex(b->sha1));`
Support creation of merge commits in fast-import. Some importers are able to determine when branch merges occurred within their source data. In these cases they will want to supply the correct commits to fast-import so that a proper merge commit will exist in Git. This is now supported by supplying a 'merge ' command after the commit message and optional from command. A merge is not actually performed by fast-import, its assumed that the frontend performed any sort of merging activity already and that fast-import should simply be storing its result. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-12 04:21:38 +01:00			`while (merge_list) {`
			`struct hash_list *next = merge_list->next;`
fast-import was using dbuf's, replace them with strbuf's. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-17 13:48:17 +02:00			`strbuf_addf(&new_data, "parent %s\n", sha1_to_hex(merge_list->sha1));`
Support creation of merge commits in fast-import. Some importers are able to determine when branch merges occurred within their source data. In these cases they will want to supply the correct commits to fast-import so that a proper merge commit will exist in Git. This is now supported by supplying a 'merge ' command after the commit message and optional from command. A merge is not actually performed by fast-import, its assumed that the frontend performed any sort of merging activity already and that fast-import should simply be storing its result. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-12 04:21:38 +01:00			`free(merge_list);`
			`merge_list = next;`
			`}`
fast-import was using dbuf's, replace them with strbuf's. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-17 13:48:17 +02:00			`strbuf_addf(&new_data,`
			`"author %s\n"`
			`"committer %s\n"`
			`"\n",`
			`author ? author : committer, committer);`
			`strbuf_addbuf(&new_data, &msg);`
Remove unnecessary null pointer checks in fast-import. There is no need to check for a NULL pointer before invoking free(), the runtime library automatically performs this check anyway and does nothing if a NULL pointer is supplied. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-06 18:05:51 +01:00			`free(author);`
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00			`free(committer);`

fast-import was using dbuf's, replace them with strbuf's. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-17 13:48:17 +02:00			`if (!store_object(OBJ_COMMIT, &new_data, NULL, b->sha1, next_mark))`
Correct packfile edge output in fast-import. Branches are only contained by a packfile if the branch actually had its most recent commit in that packfile. So new branches are set to MAX_PACK_ID to ensure they don't cause their commit to list as part of the first packfile when it closes out if the commit was actually in existance before fast-import started. Also corrected the type of last_commit to be umaxint_t to prevent overflow and wraparound on very large imports. Though that is highly unlikely to occur as we're talking 4 billion commits, which no real project has right now. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-17 08:42:43 +01:00			`b->pack_id = pack_id;`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`b->last_commit = object_count_by_type[OBJ_COMMIT];`
Implemented branch handling and basic tree support in fast-import. This provides the basic data structures needed to store trees in memory while we are processing them for a branch. What we are attempting to do is track one complete tree for each branch that the frontend has registered with us through the 'newb' (new_branch) command. When the frontend edits that tree through 'updf' or 'delf' commands we'll mark the affected tree(s) as being dirty and recompute their objects during 'comt' (commit). Currently the protocol is decidedly _not_ user friendly. I crashed fast-import by giving it bad input data from Perl. I may try to improve upon it, or at least upon its error handling. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 09:36:45 +02:00			`}`

Declare no-arg functions as (void) in fast-import. Apparently the git convention is to declare any function which takes no arguments as taking void. I did not do this during the early fast-import development, but should have. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-17 07:47:25 +01:00			`static void cmd_new_tag(void)`
Implemented 'tag' command in fast-import. Tags received from the frontend are generated in memory in a simple linked list in the order that the tag commands were sent by the frontend. If multiple different tag objects for the same tag name get generated the last one sent by the frontend will be the one that gets written out at termination. Multiple tag objects for the same name will cause all older tags of the same name to be lost. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-24 09:12:13 +02:00			`{`
fast-import was using dbuf's, replace them with strbuf's. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-17 13:48:17 +02:00			`static struct strbuf msg = STRBUF_INIT;`
Implemented 'tag' command in fast-import. Tags received from the frontend are generated in memory in a simple linked list in the order that the tag commands were sent by the frontend. If multiple different tag objects for the same tag name get generated the last one sent by the frontend will be the one that gets written out at termination. Multiple tag objects for the same name will cause all older tags of the same name to be lost. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-24 09:12:13 +02:00			`char *sp;`
			`const char *from;`
			`char *tagger;`
			`struct branch *s;`
			`struct tag *t;`
Use uintmax_t for marks in fast-import. If a frontend wants to use a mark per file revision and per commit and is doing a truly huge import (such as a 32 GiB SVN repository) we may need more than 2**32 unique mark values, especially if the frontend is unable (or unwilling) to recycle mark values. For mark idnums we should use the largest unsigned integer type available, hoping that will be at least 64 bits when we are compiled as a 64 bit executable. This way we may consume huge amounts of memory storing our mark table, but we'll at least be able to process the entire import without failing. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-16 06:33:19 +01:00			`uintmax_t from_mark = 0;`
Implemented 'tag' command in fast-import. Tags received from the frontend are generated in memory in a simple linked list in the order that the tag commands were sent by the frontend. If multiple different tag objects for the same tag name get generated the last one sent by the frontend will be the one that gets written out at termination. Multiple tag objects for the same name will cause all older tags of the same name to be lost. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-24 09:12:13 +02:00			`unsigned char sha1[20];`

			`/* Obtain the new tag name from the rest of our command */`
			`sp = strchr(command_buf.buf, ' ') + 1;`
			`t = pool_alloc(sizeof(struct tag));`
			`t->next_tag = NULL;`
			`t->name = pool_strdup(sp);`
			`if (last_tag)`
			`last_tag->next_tag = t;`
			`else`
			`first_tag = t;`
			`last_tag = t;`
			`read_next_command();`

			`/* from ... */`
prefixcmp(): fix-up mechanical conversion. Previous step converted use of strncmp() with literal string mechanically even when the result is only used as a boolean: if (!strncmp("foo", arg, 3)) ==> if (!(-prefixcmp(arg, "foo"))) This step manually cleans them up to read: if (!prefixcmp(arg, "foo")) Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-20 10:54:00 +01:00			`if (prefixcmp(command_buf.buf, "from "))`
Implemented 'tag' command in fast-import. Tags received from the frontend are generated in memory in a simple linked list in the order that the tag commands were sent by the frontend. If multiple different tag objects for the same tag name get generated the last one sent by the frontend will be the one that gets written out at termination. Multiple tag objects for the same name will cause all older tags of the same name to be lost. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-24 09:12:13 +02:00			`die("Expected from command, got %s", command_buf.buf);`
			`from = strchr(command_buf.buf, ' ') + 1;`
			`s = lookup_branch(from);`
			`if (s) {`
Converted hash memcpy/memcmp to new hashcpy/hashcmp/hashclr. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-28 16:46:58 +02:00			`hashcpy(sha1, s->sha1);`
Implemented 'tag' command in fast-import. Tags received from the frontend are generated in memory in a simple linked list in the order that the tag commands were sent by the frontend. If multiple different tag objects for the same tag name get generated the last one sent by the frontend will be the one that gets written out at termination. Multiple tag objects for the same name will cause all older tags of the same name to be lost. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-24 09:12:13 +02:00			`} else if (*from == ':') {`
Correct compiler warnings in fast-import. Junio noticed these warnings/errors in fast-import when compiling with `-Werror -ansi -pedantic`. A few changes are to reduce compiler warnings, while one (in cmd_merge) is a bug fix. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-06 06:26:49 +01:00			`struct object_entry *oe;`
Use uintmax_t for marks in fast-import. If a frontend wants to use a mark per file revision and per commit and is doing a truly huge import (such as a 32 GiB SVN repository) we may need more than 2**32 unique mark values, especially if the frontend is unable (or unwilling) to recycle mark values. For mark idnums we should use the largest unsigned integer type available, hoping that will be at least 64 bits when we are compiled as a 64 bit executable. This way we may consume huge amounts of memory storing our mark table, but we'll at least be able to process the entire import without failing. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-16 06:33:19 +01:00			`from_mark = strtoumax(from + 1, NULL, 10);`
Correct compiler warnings in fast-import. Junio noticed these warnings/errors in fast-import when compiling with `-Werror -ansi -pedantic`. A few changes are to reduce compiler warnings, while one (in cmd_merge) is a bug fix. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-06 06:26:49 +01:00			`oe = find_mark(from_mark);`
Implemented 'tag' command in fast-import. Tags received from the frontend are generated in memory in a simple linked list in the order that the tag commands were sent by the frontend. If multiple different tag objects for the same tag name get generated the last one sent by the frontend will be the one that gets written out at termination. Multiple tag objects for the same name will cause all older tags of the same name to be lost. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-24 09:12:13 +02:00			`if (oe->type != OBJ_COMMIT)`
Check for PRIuMAX rather than NO_C99_FORMAT in fast-import.c. Thanks to Simon 'corecode' Schubert <corecode@fs.ei.tum.de> for the clean-up. Defining the C99 standard PRIuMAX when necessary replaces UM_FMT and the awkward UM10_FMT. There are no direct C99 translations for other uses of NO_C99_FORMAT in git, alas. Signed-off-by: Jason Riedy <ejr@cs.berkeley.edu> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-21 02:34:56 +01:00			`die("Mark :%" PRIuMAX " not a commit", from_mark);`
Converted hash memcpy/memcmp to new hashcpy/hashcmp/hashclr. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-28 16:46:58 +02:00			`hashcpy(sha1, oe->sha1);`
Implemented 'tag' command in fast-import. Tags received from the frontend are generated in memory in a simple linked list in the order that the tag commands were sent by the frontend. If multiple different tag objects for the same tag name get generated the last one sent by the frontend will be the one that gets written out at termination. Multiple tag objects for the same name will cause all older tags of the same name to be lost. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-24 09:12:13 +02:00			`} else if (!get_sha1(from, sha1)) {`
			`unsigned long size;`
			`char *buf;`

			`buf = read_object_with_reference(sha1,`
formalize typename(), and add its reverse type_from_string() Sometime typename() is used, sometimes type_names[] is accessed directly. Let's enforce typename() all the time which allows for validating the type. Also let's add a function to go from a name to a type and use it instead of manual memcpy() when appropriate. Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-26 20:55:58 +01:00			`commit_type, &size, sha1);`
Implemented 'tag' command in fast-import. Tags received from the frontend are generated in memory in a simple linked list in the order that the tag commands were sent by the frontend. If multiple different tag objects for the same tag name get generated the last one sent by the frontend will be the one that gets written out at termination. Multiple tag objects for the same name will cause all older tags of the same name to be lost. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-24 09:12:13 +02:00			`if (!buf \|\| size < 46)`
			`die("Not a valid commit: %s", from);`
			`free(buf);`
			`} else`
			`die("Invalid ref name or SHA1 expression: %s", from);`
			`read_next_command();`

			`/* tagger ... */`
prefixcmp(): fix-up mechanical conversion. Previous step converted use of strncmp() with literal string mechanically even when the result is only used as a boolean: if (!strncmp("foo", arg, 3)) ==> if (!(-prefixcmp(arg, "foo"))) This step manually cleans them up to read: if (!prefixcmp(arg, "foo")) Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-20 10:54:00 +01:00			`if (prefixcmp(command_buf.buf, "tagger "))`
Implemented 'tag' command in fast-import. Tags received from the frontend are generated in memory in a simple linked list in the order that the tag commands were sent by the frontend. If multiple different tag objects for the same tag name get generated the last one sent by the frontend will be the one that gets written out at termination. Multiple tag objects for the same name will cause all older tags of the same name to be lost. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-24 09:12:13 +02:00			`die("Expected tagger command, got %s", command_buf.buf);`
Support RFC 2822 date parsing in fast-import. Since some frontends may be working with source material where the dates are only readily available as RFC 2822 strings, it is more friendly if fast-import exposes Git's parse_date() function to handle the conversion. This way the frontend doesn't need to perform the parsing itself. The new --date-format option to fast-import can be used by a frontend to select which format it will supply date strings in. The default is the standard `raw` Git format, which fast-import has always supported. Format rfc2822 can be used to activate the parse_date() function instead. Because fast-import could also be useful for creating new, current commits, the format `now` is also supported to generate the current system timestamp. The implementation of `now` is a trivial call to datestamp(), but is actually a whole whopping 3 lines so that fast-import can verify the frontend really meant `now`. As part of this change I have added validation of the `raw` date format. Prior to this change fast-import would accept anything in a `committer` command, even if it was seriously malformed. Now fast-import requires the '> ' near the end of the string and verifies the timestamp is formatted properly. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-06 20:58:30 +01:00			`tagger = parse_ident(command_buf.buf + 7);`
Implemented 'tag' command in fast-import. Tags received from the frontend are generated in memory in a simple linked list in the order that the tag commands were sent by the frontend. If multiple different tag objects for the same tag name get generated the last one sent by the frontend will be the one that gets written out at termination. Multiple tag objects for the same name will cause all older tags of the same name to be lost. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-24 09:12:13 +02:00
			`/* tag payload/message */`
			`read_next_command();`
fast-import was using dbuf's, replace them with strbuf's. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-17 13:48:17 +02:00			`cmd_data(&msg);`
Implemented 'tag' command in fast-import. Tags received from the frontend are generated in memory in a simple linked list in the order that the tag commands were sent by the frontend. If multiple different tag objects for the same tag name get generated the last one sent by the frontend will be the one that gets written out at termination. Multiple tag objects for the same name will cause all older tags of the same name to be lost. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-24 09:12:13 +02:00
			`/* build the tag object */`
fast-import was using dbuf's, replace them with strbuf's. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-17 13:48:17 +02:00			`strbuf_reset(&new_data);`
			`strbuf_addf(&new_data,`
			`"object %s\n"`
			`"type %s\n"`
			`"tag %s\n"`
			`"tagger %s\n"`
			`"\n",`
			`sha1_to_hex(sha1), commit_type, t->name, tagger);`
			`strbuf_addbuf(&new_data, &msg);`
Implemented 'tag' command in fast-import. Tags received from the frontend are generated in memory in a simple linked list in the order that the tag commands were sent by the frontend. If multiple different tag objects for the same tag name get generated the last one sent by the frontend will be the one that gets written out at termination. Multiple tag objects for the same name will cause all older tags of the same name to be lost. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-24 09:12:13 +02:00			`free(tagger);`

fast-import was using dbuf's, replace them with strbuf's. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-17 13:48:17 +02:00			`if (store_object(OBJ_TAG, &new_data, NULL, t->sha1, 0))`
Correct packfile edge output in fast-import. Branches are only contained by a packfile if the branch actually had its most recent commit in that packfile. So new branches are set to MAX_PACK_ID to ensure they don't cause their commit to list as part of the first packfile when it closes out if the commit was actually in existance before fast-import started. Also corrected the type of last_commit to be umaxint_t to prevent overflow and wraparound on very large imports. Though that is highly unlikely to occur as we're talking 4 billion commits, which no real project has right now. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-17 08:42:43 +01:00			`t->pack_id = MAX_PACK_ID;`
			`else`
			`t->pack_id = pack_id;`
Implemented 'tag' command in fast-import. Tags received from the frontend are generated in memory in a simple linked list in the order that the tag commands were sent by the frontend. If multiple different tag objects for the same tag name get generated the last one sent by the frontend will be the one that gets written out at termination. Multiple tag objects for the same name will cause all older tags of the same name to be lost. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-24 09:12:13 +02:00			`}`

Declare no-arg functions as (void) in fast-import. Apparently the git convention is to declare any function which takes no arguments as taking void. I did not do this during the early fast-import development, but should have. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-17 07:47:25 +01:00			`static void cmd_reset_branch(void)`
Added 'reset' command to clear a branch's tree. Sometimes an import frontend may need to work with a temporary branch which will actually contain many different branches over the life of the import. This is especially useful when the frontend needs to create a tag from a set of file versions which are otherwise never a commit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-27 12:20:49 +02:00			`{`
			`struct branch *b;`
			`char *sp;`

			`/* Obtain the branch name from the rest of our command */`
			`sp = strchr(command_buf.buf, ' ') + 1;`
			`b = lookup_branch(sp);`
			`if (b) {`
fast-import: Support reusing 'from' and brown paper bag fix reset. It was suggested on the mailing list that being able to use `from` in any commit to reset the current branch is useful in some types of importers, such as a darcs importer. We originally did not permit resetting an existing branch with a new `from` command during a `commit` command, but this restriction was only to help debug the hacked up cvs2svn that Jon Smirl was developing in parallel with git-fast-import. It is probably more of a problem to disallow it than to allow it. So now we permit a `from` during any `commit`. While making the changes required to permit multiple `from` commands on the same branch, I discovered we no longer needed the last_commit field to be set to 0 during a reset, so that was removed. (Reset was originally setting the field to 0 to signal cmd_from() that it was OK to execute on the branch.) While poking around in this section of fast-import I also realized the `reset` command was not working as intended if the corresponding `from` command was omitted (as allowed by the BNF grammar and the code). If `from` was omitted we cleared out the tree but we left the tree SHA-1 and parent commit SHA-1 intact. This is not what the user intended in this case. Instead they would be trying to reset the branch to have no parent and to have no tree, making the branch look new-born during the next commit. We now clear these SHA-1 values during `reset`, ensuring the branch looks new-born if `from` does not get supplied. New test cases for these were also added. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-12 10:08:43 +01:00			`hashclr(b->sha1);`
			`hashclr(b->branch_tree.versions[0].sha1);`
			`hashclr(b->branch_tree.versions[1].sha1);`
Added 'reset' command to clear a branch's tree. Sometimes an import frontend may need to work with a temporary branch which will actually contain many different branches over the life of the import. This is especially useful when the frontend needs to create a tag from a set of file versions which are otherwise never a commit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-27 12:20:49 +02:00			`if (b->branch_tree.tree) {`
			`release_tree_content_recursive(b->branch_tree.tree);`
			`b->branch_tree.tree = NULL;`
			`}`
			`}`
Allow creating branches without committing in fast-import. Some importers may want to create a branch long before they actually commit to it, or in some cases they may never commit to the branch but they still need the ref to be created in the repository after the import is complete. This extends the 'reset ' command to automatically create a new branch if the supplied reference isn't already known as a branch. While I'm at it I also modified the syntax of the reset command to terminate with an empty line, like commit and tag operate. This just makes the command set more consistent. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-12 04:28:39 +01:00			`else`
			`b = new_branch(sp);`
			`read_next_command();`
Rework strbuf API and semantics. The gory details are explained in strbuf.h. The change of semantics this patch enforces is that the embeded buffer has always a '\0' character after its last byte, to always make it a C-string. The offs-by-one changes are all related to that very change. A strbuf can be used to store byte arrays, or as an extended string library. The `buf' member can be passed to any C legacy string function, because strbuf operations always ensure there is a terminating \0 at the end of the buffer, not accounted in the `len' field of the structure. A strbuf can be used to generate a string/buffer whose final size is not really known, and then "strbuf_detach" can be used to get the built buffer, and keep the wrapping "strbuf" structure usable for further work again. Other interesting feature: strbuf_grow(sb, size) ensure that there is enough allocated space in `sb' to put `size' new octets of data in the buffer. It helps avoiding reallocating data for nothing when the problem the strbuf helps to solve has a known typical size. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-06 13:20:05 +02:00			`if (!cmd_from(b) && command_buf.len > 0)`
Make trailing LF optional for all fast-import commands For the same reasons as the prior change we want to allow frontends to omit the trailing LF that usually delimits commands. In some cases these just make the input stream more verbose looking than it needs to be, and its just simpler for the frontend developer to get started if our parser is slightly more lenient about where an LF is required and where it isn't. To make this optional LF feature work we now have to buffer up to one line of input in command_buf. This buffering can happen if we look at the current input command but don't recognize it at this point in the code. In such a case we need to "unget" the entire line, but we cannot depend upon the stdio library to let us do ungetc() for that many characters at once. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-08-01 08:22:53 +02:00			`unread_command_buf = 1;`
Added 'reset' command to clear a branch's tree. Sometimes an import frontend may need to work with a temporary branch which will actually contain many different branches over the life of the import. This is especially useful when the frontend needs to create a tag from a set of file versions which are otherwise never a commit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-27 12:20:49 +02:00			`}`

Declare no-arg functions as (void) in fast-import. Apparently the git convention is to declare any function which takes no arguments as taking void. I did not do this during the early fast-import development, but should have. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-17 07:47:25 +01:00			`static void cmd_checkpoint(void)`
Implemented manual packfile switching in fast-import. To help importers which are dealing with massive amounts of data fast-import needs to be able to close the packfile it is currently writing to and open a new packfile for any additional data that will be received. A new 'checkpoint' command has been introduced which can be used by the frontend import process to force this to occur at any time. This may be useful to ensure a very long running import doesn't lose any work due to unexpected failures. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 12:35:41 +01:00			`{`
Dump all refs and marks during a checkpoint in fast-import. If the frontend asks us to checkpoint (via the explicit checkpoint command) its probably because they are afraid the current import will crash/fail/whatever and want to make sure they can pickup from the last checkpoint. To do that sort of recovery, we will need the current tip of every branch and tag available at the next startup. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-07 08:42:44 +01:00			`if (object_count) {`
			`cycle_packfile();`
			`dump_branches();`
			`dump_tags();`
			`dump_marks();`
			`}`
Make trailing LF optional for all fast-import commands For the same reasons as the prior change we want to allow frontends to omit the trailing LF that usually delimits commands. In some cases these just make the input stream more verbose looking than it needs to be, and its just simpler for the frontend developer to get started if our parser is slightly more lenient about where an LF is required and where it isn't. To make this optional LF feature work we now have to buffer up to one line of input in command_buf. This buffering can happen if we look at the current input command but don't recognize it at this point in the code. In such a case we need to "unget" the entire line, but we cannot depend upon the stdio library to let us do ungetc() for that many characters at once. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-08-01 08:22:53 +02:00			`skip_optional_lf();`
Implemented manual packfile switching in fast-import. To help importers which are dealing with massive amounts of data fast-import needs to be able to close the packfile it is currently writing to and open a new packfile for any additional data that will be received. A new 'checkpoint' command has been introduced which can be used by the frontend import process to force this to occur at any time. This may be useful to ensure a very long running import doesn't lose any work due to unexpected failures. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 12:35:41 +01:00			`}`

Allow frontends to bidirectionally communicate with fast-import The existing checkpoint command is very useful to force fast-import to dump the branches out to disk so that standard Git tools can access them and the objects they refer to. However there was not a way to know when fast-import had finished executing the checkpoint and it was safe to read those refs. The progress command can be used to make fast-import output any message of the frontend's choosing to standard out. The frontend can scan for these messages using select() or poll() to monitor a pipe connected to the standard output of fast-import. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-08-01 16:23:08 +02:00			`static void cmd_progress(void)`
			`{`
Rework strbuf API and semantics. The gory details are explained in strbuf.h. The change of semantics this patch enforces is that the embeded buffer has always a '\0' character after its last byte, to always make it a C-string. The offs-by-one changes are all related to that very change. A strbuf can be used to store byte arrays, or as an extended string library. The `buf' member can be passed to any C legacy string function, because strbuf operations always ensure there is a terminating \0 at the end of the buffer, not accounted in the `len' field of the structure. A strbuf can be used to generate a string/buffer whose final size is not really known, and then "strbuf_detach" can be used to get the built buffer, and keep the wrapping "strbuf" structure usable for further work again. Other interesting feature: strbuf_grow(sb, size) ensure that there is enough allocated space in `sb' to put `size' new octets of data in the buffer. It helps avoiding reallocating data for nothing when the problem the strbuf helps to solve has a known typical size. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-06 13:20:05 +02:00			`fwrite(command_buf.buf, 1, command_buf.len, stdout);`
Allow frontends to bidirectionally communicate with fast-import The existing checkpoint command is very useful to force fast-import to dump the branches out to disk so that standard Git tools can access them and the objects they refer to. However there was not a way to know when fast-import had finished executing the checkpoint and it was safe to read those refs. The progress command can be used to make fast-import output any message of the frontend's choosing to standard out. The frontend can scan for these messages using select() or poll() to monitor a pipe connected to the standard output of fast-import. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-08-01 16:23:08 +02:00			`fputc('\n', stdout);`
			`fflush(stdout);`
			`skip_optional_lf();`
			`}`

Allow fast-import frontends to reload the marks table I'm giving fast-import a lesson on how to reload the marks table using the same format it outputs with --export-marks. This way a frontend can reload the marks table from a prior import, making incremental imports less painful. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-03-08 00:07:26 +01:00			`static void import_marks(const char *input_file)`
			`{`
			`char line[512];`
			`FILE *f = fopen(input_file, "r");`
			`if (!f)`
			`die("cannot read %s: %s", input_file, strerror(errno));`
			`while (fgets(line, sizeof(line), f)) {`
			`uintmax_t mark;`
			`char *end;`
			`unsigned char sha1[20];`
			`struct object_entry *e;`

			`end = strchr(line, '\n');`
			`if (line[0] != ':' \|\| !end)`
			`die("corrupt mark line: %s", line);`
			`*end = 0;`
			`mark = strtoumax(line + 1, &end, 10);`
			`if (!mark \|\| end == line + 1`
			`\|\| *end != ' ' \|\| get_sha1(end + 1, sha1))`
			`die("corrupt mark line: %s", line);`
			`e = find_object(sha1);`
			`if (!e) {`
			`enum object_type type = sha1_object_info(sha1, NULL);`
			`if (type < 0)`
			`die("object not found: %s", sha1_to_hex(sha1));`
			`e = insert_object(sha1);`
			`e->type = type;`
			`e->pack_id = MAX_PACK_ID;`
Don't repack existing objects in fast-import Some users of fast-import have been trying to use it to rewrite commits and trees, an activity where the all of the relevant blobs are already available from the existing packfiles. In such a case we don't want to repack a blob, even if the frontend application has supplied us the raw data rather than a mark or a SHA-1 name. I'm intentionally only checking the packfiles that existed when fast-import started and am always ignoring all loose object files. We ignore loose objects because fast-import tends to operate on a very large number of objects in a very short timespan, and it is usually creating new objects, not reusing existing ones. In such a situtation the majority of the objects will not be found in the existing packfiles, nor will they be loose object files. If the frontend application really wants us to look at loose object files, then they can just repack the repository before running fast-import. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-04-20 17:23:45 +02:00			`e->offset = 1; /* just not zero! */`
Allow fast-import frontends to reload the marks table I'm giving fast-import a lesson on how to reload the marks table using the same format it outputs with --export-marks. This way a frontend can reload the marks table from a prior import, making incremental imports less painful. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-03-08 00:07:26 +01:00			`}`
			`insert_mark(mark, e);`
			`}`
			`fclose(f);`
			`}`

Teach fast-import to honor pack.compression and pack.depth We now use the configured pack.compression and pack.depth values within fast-import, as like builtin-pack-objects fast-import is generating a packfile for consumption by the Git tools. We use the same behavior as builtin-pack-objects does for these options, allowing core.compression to supply the default value for pack.compression. The default setting for pack.depth within fast-import is still 10 as users will generally repack fast-import generated packfiles by `repack -f`. A large delta depth within the fast-import packfile can significantly slow down such a later repack. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-01-21 05:36:54 +01:00			`static int git_pack_config(const char k, const char v)`
			`{`
			`if (!strcmp(k, "pack.depth")) {`
			`max_depth = git_config_int(k, v);`
			`if (max_depth > MAX_DEPTH)`
			`max_depth = MAX_DEPTH;`
			`return 0;`
			`}`
			`if (!strcmp(k, "pack.compression")) {`
			`int level = git_config_int(k, v);`
			`if (level == -1)`
			`level = Z_DEFAULT_COMPRESSION;`
			`else if (level < 0 \|\| level > Z_BEST_COMPRESSION)`
			`die("bad pack compression level %d", level);`
			`pack_compression_level = level;`
			`pack_compression_seen = 1;`
			`return 0;`
			`}`
			`return git_default_config(k, v);`
			`}`

Converted fast-import to accept standard command line parameters. The following command line options are now accepted before the pack name: --objects=n # replaces the object count after the pack name --depth=n # delta chain depth to use (default is 10) --active-branches=n # maximum number of branches to keep in memory Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-23 08:00:31 +02:00			`static const char fast_import_usage[] =`
Support RFC 2822 date parsing in fast-import. Since some frontends may be working with source material where the dates are only readily available as RFC 2822 strings, it is more friendly if fast-import exposes Git's parse_date() function to handle the conversion. This way the frontend doesn't need to perform the parsing itself. The new --date-format option to fast-import can be used by a frontend to select which format it will supply date strings in. The default is the standard `raw` Git format, which fast-import has always supported. Format rfc2822 can be used to activate the parse_date() function instead. Because fast-import could also be useful for creating new, current commits, the format `now` is also supported to generate the current system timestamp. The implementation of `now` is a trivial call to datestamp(), but is actually a whole whopping 3 lines so that fast-import can verify the frontend really meant `now`. As part of this change I have added validation of the `raw` date format. Prior to this change fast-import would accept anything in a `committer` command, even if it was seriously malformed. Now fast-import requires the '> ' near the end of the string and verifies the timestamp is formatted properly. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-06 20:58:30 +01:00			`"git-fast-import [--date-format=f] [--max-pack-size=n] [--depth=n] [--active-branches=n] [--export-marks=marks.file]";`
Converted fast-import to accept standard command line parameters. The following command line options are now accepted before the pack name: --objects=n # replaces the object count after the pack name --depth=n # delta chain depth to use (default is 10) --active-branches=n # maximum number of branches to keep in memory Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-23 08:00:31 +02:00
Added automatic index generation to fast-import. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-06 19:51:39 +02:00			`int main(int argc, const char **argv)`
			`{`
Include recent command history in fast-import crash reports When we crash the frontend developer (or end-user) may need to know roughly around what part of the input stream we had a problem with and aborted on. Because line numbers aren't very useful in this sort of application we instead just keep the last 100 commands in a FIFO queue and print them as part of the crash report. Currently one problem with this design is a commit that has more than 100 modified files in it will flood the FIFO and any context regarding branch/from/committer/mark/comments will be lost. We really should save only the last few (10?) file changes for the current commit, ensuring we have some prior higher level commands in the FIFO when we crash on a file M/D/C/R command. Another issue with this approach is the FIFO only includes the commands, it does not include the commit messages. Yet having a commit message may be useful to help locate the relevant change in the source material. In practice I don't think this is going to be a major concern as the frontend can always embed its own source change set identifier as a comment (which will appear in the crash report) and the commit message(s) for the most recent commits of any given branch should be obtainable from the (packed) commit objects. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-08-03 10:47:04 +02:00			`unsigned int i, show_stats = 1;`
Added automatic index generation to fast-import. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-06 19:51:39 +02:00
fast-import: exit with proper message if not a git dir git fast-import expects to be run from an existing (possibly empty) repository. It was dying with a suboptimal message if that wasn't the case. Signed-off-by: Jean-Luc Herren <jlh@gmx.ch> Acked-by: Shawn O. Pearce <spearce@spearce.org> 2008-02-28 23:29:54 +01:00			`setup_git_directory();`
Teach fast-import to honor pack.compression and pack.depth We now use the configured pack.compression and pack.depth values within fast-import, as like builtin-pack-objects fast-import is generating a packfile for consumption by the Git tools. We use the same behavior as builtin-pack-objects does for these options, allowing core.compression to supply the default value for pack.compression. The default setting for pack.depth within fast-import is still 10 as users will generally repack fast-import generated packfiles by `repack -f`. A large delta depth within the fast-import packfile can significantly slow down such a later repack. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-01-21 05:36:54 +01:00			`git_config(git_pack_config);`
			`if (!pack_compression_seen && core_compression_seen)`
			`pack_compression_level = core_compression_level;`

Preallocate memory earlier in fast-import I'm about to teach fast-import how to reload the marks file created by a prior session. The general approach that I want to use is to immediately parse the marks file when the specific argument is found in argv, thereby allowing the caller to supply multiple marks files, as the mark space can be sparsely populated. To make that work out we need to allocate our object tables before we parse the command line options. Since none of these tables depend on the command line options, we can easily relocate them. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-03-07 23:09:21 +01:00			`alloc_objects(object_entry_alloc);`
Strbuf API extensions and fixes. * Add strbuf_rtrim to remove trailing spaces. * Add strbuf_insert to insert data at a given position. * Off-by one fix in strbuf_addf: strbuf_avail() does not counts the final \0 so the overflow test for snprintf is the strict comparison. This is not critical as the growth mechanism chosen will always allocate _more_ memory than asked, so the second test will not fail. It's some kind of miracle though. * Add size extension hints for strbuf_init and strbuf_read. If 0, default applies, else: + initial buffer has the given size for strbuf_init. + first growth checks it has at least this size rather than the default 8192. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-10 12:35:04 +02:00			`strbuf_init(&command_buf, 0);`
Preallocate memory earlier in fast-import I'm about to teach fast-import how to reload the marks file created by a prior session. The general approach that I want to use is to immediately parse the marks file when the specific argument is found in argv, thereby allowing the caller to supply multiple marks files, as the mark space can be sparsely populated. To make that work out we need to allocate our object tables before we parse the command line options. Since none of these tables depend on the command line options, we can easily relocate them. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-03-07 23:09:21 +01:00			`atom_table = xcalloc(atom_table_sz, sizeof(struct atom_str*));`
			`branch_table = xcalloc(branch_table_sz, sizeof(struct branch*));`
			`avail_tree_table = xcalloc(avail_tree_table_sz, sizeof(struct avail_tree_content*));`
			`marks = pool_calloc(1, sizeof(struct mark_set));`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00
Converted fast-import to accept standard command line parameters. The following command line options are now accepted before the pack name: --objects=n # replaces the object count after the pack name --depth=n # delta chain depth to use (default is 10) --active-branches=n # maximum number of branches to keep in memory Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-23 08:00:31 +02:00			`for (i = 1; i < argc; i++) {`
			`const char *a = argv[i];`

			`if (*a != '-' \|\| !strcmp(a, "--"))`
			`break;`
Mechanical conversion to use prefixcmp() This mechanically converts strncmp() to use prefixcmp(), but only when the parameters match specific patterns, so that they can be verified easily. Leftover from this will be fixed in a separate step, including idiotic conversions like if (!strncmp("foo", arg, 3)) => if (!(-prefixcmp(arg, "foo"))) This was done by using this script in px.perl #!/usr/bin/perl -i.bak -p if (/strncmp\(([^,]+), "([^\\"])", (\d+)\)/ && (length($2) == $3)) { s\|strncmp\(([^,]+), "([^\\"])", (\d+)\)\|prefixcmp($1, "$2")\|; } if (/strncmp\("([^\\"])", ([^,]+), (\d+)\)/ && (length($1) == $3)) { s\|strncmp\("([^\\"])", ([^,]+), (\d+)\)\|(-prefixcmp($2, "$1"))\|; } and running: $ git grep -l strncmp -- '*.c' \| xargs perl px.perl Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-20 10:53:29 +01:00			`else if (!prefixcmp(a, "--date-format=")) {`
Support RFC 2822 date parsing in fast-import. Since some frontends may be working with source material where the dates are only readily available as RFC 2822 strings, it is more friendly if fast-import exposes Git's parse_date() function to handle the conversion. This way the frontend doesn't need to perform the parsing itself. The new --date-format option to fast-import can be used by a frontend to select which format it will supply date strings in. The default is the standard `raw` Git format, which fast-import has always supported. Format rfc2822 can be used to activate the parse_date() function instead. Because fast-import could also be useful for creating new, current commits, the format `now` is also supported to generate the current system timestamp. The implementation of `now` is a trivial call to datestamp(), but is actually a whole whopping 3 lines so that fast-import can verify the frontend really meant `now`. As part of this change I have added validation of the `raw` date format. Prior to this change fast-import would accept anything in a `committer` command, even if it was seriously malformed. Now fast-import requires the '> ' near the end of the string and verifies the timestamp is formatted properly. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-06 20:58:30 +01:00			`const char *fmt = a + 14;`
			`if (!strcmp(fmt, "raw"))`
			`whenspec = WHENSPEC_RAW;`
			`else if (!strcmp(fmt, "rfc2822"))`
			`whenspec = WHENSPEC_RFC2822;`
			`else if (!strcmp(fmt, "now"))`
			`whenspec = WHENSPEC_NOW;`
			`else`
			`die("unknown --date-format argument %s", fmt);`
			`}`
Mechanical conversion to use prefixcmp() This mechanically converts strncmp() to use prefixcmp(), but only when the parameters match specific patterns, so that they can be verified easily. Leftover from this will be fixed in a separate step, including idiotic conversions like if (!strncmp("foo", arg, 3)) => if (!(-prefixcmp(arg, "foo"))) This was done by using this script in px.perl #!/usr/bin/perl -i.bak -p if (/strncmp\(([^,]+), "([^\\"])", (\d+)\)/ && (length($2) == $3)) { s\|strncmp\(([^,]+), "([^\\"])", (\d+)\)\|prefixcmp($1, "$2")\|; } if (/strncmp\("([^\\"])", ([^,]+), (\d+)\)/ && (length($1) == $3)) { s\|strncmp\("([^\\"])", ([^,]+), (\d+)\)\|(-prefixcmp($2, "$1"))\|; } and running: $ git grep -l strncmp -- '*.c' \| xargs perl px.perl Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-20 10:53:29 +01:00			`else if (!prefixcmp(a, "--max-pack-size="))`
Use uintmax_t for marks in fast-import. If a frontend wants to use a mark per file revision and per commit and is doing a truly huge import (such as a 32 GiB SVN repository) we may need more than 2**32 unique mark values, especially if the frontend is unable (or unwilling) to recycle mark values. For mark idnums we should use the largest unsigned integer type available, hoping that will be at least 64 bits when we are compiled as a 64 bit executable. This way we may consume huge amounts of memory storing our mark table, but we'll at least be able to process the entire import without failing. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-16 06:33:19 +01:00			`max_packsize = strtoumax(a + 16, NULL, 0) * 1024 * 1024;`
Don't allow fast-import tree delta chains to exceed maximum depth Brian Downing noticed fast-import can produce tree depths of up to 6,035 objects and even deeper. Long delta chains can create very small packfiles but cause problems during repacking as git needs to unpack each tree to count the reachable blobs. What's happening here is the active branch cache isn't big enough. We're swapping out the branch and thus recycling the tree information (struct tree_content) back into the free pool. When we later reload the tree we set the delta_depth to 0 but we kept the tree we just reloaded as a delta base. So if the tree we reloaded was already at the maximum depth we wouldn't know it and make the new tree a delta. Multiply the number of times the branch cache has to swap out the tree times max_depth (10) and you get the maximum delta depth of a tree created by fast-import. In Brian's case above the active branch cache had to swap the branch out 603/604 times during this import to produce a tree with a delta depth of 6035. Acked-by: Brian Downing <bdowning@lavos.net> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-14 05:48:42 +01:00			`else if (!prefixcmp(a, "--depth=")) {`
Converted fast-import to accept standard command line parameters. The following command line options are now accepted before the pack name: --objects=n # replaces the object count after the pack name --depth=n # delta chain depth to use (default is 10) --active-branches=n # maximum number of branches to keep in memory Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-23 08:00:31 +02:00			`max_depth = strtoul(a + 8, NULL, 0);`
Don't allow fast-import tree delta chains to exceed maximum depth Brian Downing noticed fast-import can produce tree depths of up to 6,035 objects and even deeper. Long delta chains can create very small packfiles but cause problems during repacking as git needs to unpack each tree to count the reachable blobs. What's happening here is the active branch cache isn't big enough. We're swapping out the branch and thus recycling the tree information (struct tree_content) back into the free pool. When we later reload the tree we set the delta_depth to 0 but we kept the tree we just reloaded as a delta base. So if the tree we reloaded was already at the maximum depth we wouldn't know it and make the new tree a delta. Multiply the number of times the branch cache has to swap out the tree times max_depth (10) and you get the maximum delta depth of a tree created by fast-import. In Brian's case above the active branch cache had to swap the branch out 603/604 times during this import to produce a tree with a delta depth of 6035. Acked-by: Brian Downing <bdowning@lavos.net> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-14 05:48:42 +01:00			`if (max_depth > MAX_DEPTH)`
			`die("--depth cannot exceed %u", MAX_DEPTH);`
			`}`
Mechanical conversion to use prefixcmp() This mechanically converts strncmp() to use prefixcmp(), but only when the parameters match specific patterns, so that they can be verified easily. Leftover from this will be fixed in a separate step, including idiotic conversions like if (!strncmp("foo", arg, 3)) => if (!(-prefixcmp(arg, "foo"))) This was done by using this script in px.perl #!/usr/bin/perl -i.bak -p if (/strncmp\(([^,]+), "([^\\"])", (\d+)\)/ && (length($2) == $3)) { s\|strncmp\(([^,]+), "([^\\"])", (\d+)\)\|prefixcmp($1, "$2")\|; } if (/strncmp\("([^\\"])", ([^,]+), (\d+)\)/ && (length($1) == $3)) { s\|strncmp\("([^\\"])", ([^,]+), (\d+)\)\|(-prefixcmp($2, "$1"))\|; } and running: $ git grep -l strncmp -- '*.c' \| xargs perl px.perl Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-20 10:53:29 +01:00			`else if (!prefixcmp(a, "--active-branches="))`
Converted fast-import to accept standard command line parameters. The following command line options are now accepted before the pack name: --objects=n # replaces the object count after the pack name --depth=n # delta chain depth to use (default is 10) --active-branches=n # maximum number of branches to keep in memory Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-23 08:00:31 +02:00			`max_active_branches = strtoul(a + 18, NULL, 0);`
Allow fast-import frontends to reload the marks table I'm giving fast-import a lesson on how to reload the marks table using the same format it outputs with --export-marks. This way a frontend can reload the marks table from a prior import, making incremental imports less painful. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-03-08 00:07:26 +01:00			`else if (!prefixcmp(a, "--import-marks="))`
			`import_marks(a + 15);`
Mechanical conversion to use prefixcmp() This mechanically converts strncmp() to use prefixcmp(), but only when the parameters match specific patterns, so that they can be verified easily. Leftover from this will be fixed in a separate step, including idiotic conversions like if (!strncmp("foo", arg, 3)) => if (!(-prefixcmp(arg, "foo"))) This was done by using this script in px.perl #!/usr/bin/perl -i.bak -p if (/strncmp\(([^,]+), "([^\\"])", (\d+)\)/ && (length($2) == $3)) { s\|strncmp\(([^,]+), "([^\\"])", (\d+)\)\|prefixcmp($1, "$2")\|; } if (/strncmp\("([^\\"])", ([^,]+), (\d+)\)/ && (length($1) == $3)) { s\|strncmp\("([^\\"])", ([^,]+), (\d+)\)\|(-prefixcmp($2, "$1"))\|; } and running: $ git grep -l strncmp -- '*.c' \| xargs perl px.perl Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-20 10:53:29 +01:00			`else if (!prefixcmp(a, "--export-marks="))`
Added option to export the marks table when fast-import terminates. The marks table can be used by the frontend to load any commit after the import and compare it to whatever data the frontend knows about that commit. If the mark idnums can be easily correlated to some reference source then its relatively trivial to compare the GIT tree to the reference to verify the accuracy of the import. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-25 22:03:04 +02:00			`mark_file = a + 15;`
Mechanical conversion to use prefixcmp() This mechanically converts strncmp() to use prefixcmp(), but only when the parameters match specific patterns, so that they can be verified easily. Leftover from this will be fixed in a separate step, including idiotic conversions like if (!strncmp("foo", arg, 3)) => if (!(-prefixcmp(arg, "foo"))) This was done by using this script in px.perl #!/usr/bin/perl -i.bak -p if (/strncmp\(([^,]+), "([^\\"])", (\d+)\)/ && (length($2) == $3)) { s\|strncmp\(([^,]+), "([^\\"])", (\d+)\)\|prefixcmp($1, "$2")\|; } if (/strncmp\("([^\\"])", ([^,]+), (\d+)\)/ && (length($1) == $3)) { s\|strncmp\("([^\\"])", ([^,]+), (\d+)\)\|(-prefixcmp($2, "$1"))\|; } and running: $ git grep -l strncmp -- '*.c' \| xargs perl px.perl Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-20 10:53:29 +01:00			`else if (!prefixcmp(a, "--export-pack-edges=")) {`
fast-import: Hide the pack boundary commits by default. Most users don't need the pack boundary information that fast-import was printing to standard output, especially if they were calling it with --quiet. Those users who do want this information probably want it captured so they can go back and use it to repack the imported repository. So dumping the boundary commits to a log file makes more sense then printing them to standard output. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-12 01:45:56 +01:00			`if (pack_edges)`
			`fclose(pack_edges);`
			`pack_edges = fopen(a + 20, "a");`
			`if (!pack_edges)`
			`die("Cannot open %s: %s", a + 20, strerror(errno));`
			`} else if (!strcmp(a, "--force"))`
Don't do non-fastforward updates in fast-import. If fast-import is being used to update an existing branch of a repository, the user may not want to lose commits if another process updates the same ref at the same time. For example, the user might be using fast-import to make just one or two commits against a live branch. We now perform a fast-forward check during the ref updating process. If updating a branch would cause commits in that branch to be lost, we skip over it and display the new SHA1 to standard error. This new default behavior can be overridden with `--force`, like git-push and git-fetch. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-06 22:08:06 +01:00			`force_update = 1;`
Teach fast-import how to sit quietly in the corner. Often users will be running fast-import from within a larger frontend process, and this may be a frequent periodic tool such as a future edition of `git-svn fetch`. We don't want to bombard users with our large stats output if they won't be interested in it, so `--quiet` is now an option to make gfi be more silent. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-07 08:19:31 +01:00			`else if (!strcmp(a, "--quiet"))`
			`show_stats = 0;`
			`else if (!strcmp(a, "--stats"))`
			`show_stats = 1;`
Converted fast-import to accept standard command line parameters. The following command line options are now accepted before the pack name: --objects=n # replaces the object count after the pack name --depth=n # delta chain depth to use (default is 10) --active-branches=n # maximum number of branches to keep in memory Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-23 08:00:31 +02:00			`else`
			`die("unknown option %s", a);`
			`}`
Use .keep files in fast-import during processing. Because fast-import automatically updates all references (heads and tags) at the end of its run the repository is corrupt unless the objects are available in the .git/objects/pack directory prior to the refs being modified. The easiest way to ensure that is true is to move the packfile and its associated index directly into the .git/objects/pack directory as soon as we have finished output to it. But the only safe way to do this is to create the a temporary .keep file for that pack, so we use the same tricks that index-pack uses when its being invoked by receive-pack. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-16 07:15:31 +01:00			`if (i != argc)`
Converted fast-import to accept standard command line parameters. The following command line options are now accepted before the pack name: --objects=n # replaces the object count after the pack name --depth=n # delta chain depth to use (default is 10) --active-branches=n # maximum number of branches to keep in memory Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-23 08:00:31 +02:00			`usage(fast_import_usage);`

Include recent command history in fast-import crash reports When we crash the frontend developer (or end-user) may need to know roughly around what part of the input stream we had a problem with and aborted on. Because line numbers aren't very useful in this sort of application we instead just keep the last 100 commands in a FIFO queue and print them as part of the crash report. Currently one problem with this design is a commit that has more than 100 modified files in it will flood the FIFO and any context regarding branch/from/committer/mark/comments will be lost. We really should save only the last few (10?) file changes for the current commit, ensuring we have some prior higher level commands in the FIFO when we crash on a file M/D/C/R command. Another issue with this approach is the FIFO only includes the commands, it does not include the commit messages. Yet having a commit message may be useful to help locate the relevant change in the source material. In practice I don't think this is going to be a major concern as the frontend can always embed its own source change set identifier as a comment (which will appear in the crash report) and the commit message(s) for the most recent commits of any given branch should be obtainable from the (packed) commit objects. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-08-03 10:47:04 +02:00			`rc_free = pool_alloc(cmd_save * sizeof(*rc_free));`
			`for (i = 0; i < (cmd_save - 1); i++)`
			`rc_free[i].next = &rc_free[i + 1];`
			`rc_free[cmd_save - 1].next = NULL;`

Don't repack existing objects in fast-import Some users of fast-import have been trying to use it to rewrite commits and trees, an activity where the all of the relevant blobs are already available from the existing packfiles. In such a case we don't want to repack a blob, even if the frontend application has supplied us the raw data rather than a mark or a SHA-1 name. I'm intentionally only checking the packfiles that existed when fast-import started and am always ignoring all loose object files. We ignore loose objects because fast-import tends to operate on a very large number of objects in a very short timespan, and it is usually creating new objects, not reusing existing ones. In such a situtation the majority of the objects will not be found in the existing packfiles, nor will they be loose object files. If the frontend application really wants us to look at loose object files, then they can just repack the repository before running fast-import. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-04-20 17:23:45 +02:00			`prepare_packed_git();`
Restructure fast-import to support creating multiple packfiles. Now that we are starting to see some really large projects (such as KDE or a fork of FreeBSD) get imported into Git we're running into the upper limit on packfile object count as well as overall byte length. The KDE and FreeBSD projects are both likely to require more than 4 GiB to store their current history, which means we really need multiple packfiles to handle their content. This is a fairly simple restructuring of the internal code to help us support creating multiple packfiles from within fast-import. We are now adding a 5 digit incrementing suffix to the end of the basename supplied to us by the caller, permitting up to 99,999 packs to be generated in a single fast-import run. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 10:39:05 +01:00			`start_packfile();`
Generate crash reports on die in fast-import As fast-import is quite strict about its input and die()'s anytime something goes wrong it can be difficult for a frontend developer to troubleshoot why fast-import rejected their input, or to even determine what input command it rejected. This change introduces a custom handler for Git's die() routine. When we receive a die() for any reason (fast-import or a lower level core Git routine we called) the error is first dumped onto stderr and then a more extensive crash report file is prepared in GIT_DIR. Finally we exit the process with status 128, just like the stock builtin die handler. An internal flag is set to prevent any further die()'s that may be invoked during the crash report generator from causing us to enter into an infinite loop. We shouldn't die() from our crash report handler, but just in case someone makes a future code change we are prepared to gaurd against small mistakes turning into huge problems for the end-user. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-08-03 08:00:37 +02:00			`set_die_routine(die_nicely);`
Drop strbuf's 'eof' marker, and make read_line a first class citizen. read_line is now strbuf_getline, and is a first class citizen, it returns 0 when reading a line worked, EOF else. The ->eof marker was used non-locally by fast-import.c, mimic the same behaviour using a static int in "read_next_command", that now returns -1 on EOF, and avoids to call strbuf_getline when it's in EOF state. Also no longer automagically strbuf_release the buffer, it's counter intuitive and breaks fast-import in a very subtle way. Note: being at EOF implies that command_buf.len == 0. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-17 11:19:04 +02:00			`while (read_next_command() != EOF) {`
			`if (!strcmp("blob", command_buf.buf))`
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00			`cmd_new_blob();`
prefixcmp(): fix-up mechanical conversion. Previous step converted use of strncmp() with literal string mechanically even when the result is only used as a boolean: if (!strncmp("foo", arg, 3)) ==> if (!(-prefixcmp(arg, "foo"))) This step manually cleans them up to read: if (!prefixcmp(arg, "foo")) Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-20 10:54:00 +01:00			`else if (!prefixcmp(command_buf.buf, "commit "))`
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00			`cmd_new_commit();`
prefixcmp(): fix-up mechanical conversion. Previous step converted use of strncmp() with literal string mechanically even when the result is only used as a boolean: if (!strncmp("foo", arg, 3)) ==> if (!(-prefixcmp(arg, "foo"))) This step manually cleans them up to read: if (!prefixcmp(arg, "foo")) Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-20 10:54:00 +01:00			`else if (!prefixcmp(command_buf.buf, "tag "))`
Implemented 'tag' command in fast-import. Tags received from the frontend are generated in memory in a simple linked list in the order that the tag commands were sent by the frontend. If multiple different tag objects for the same tag name get generated the last one sent by the frontend will be the one that gets written out at termination. Multiple tag objects for the same name will cause all older tags of the same name to be lost. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-24 09:12:13 +02:00			`cmd_new_tag();`
prefixcmp(): fix-up mechanical conversion. Previous step converted use of strncmp() with literal string mechanically even when the result is only used as a boolean: if (!strncmp("foo", arg, 3)) ==> if (!(-prefixcmp(arg, "foo"))) This step manually cleans them up to read: if (!prefixcmp(arg, "foo")) Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-20 10:54:00 +01:00			`else if (!prefixcmp(command_buf.buf, "reset "))`
Added 'reset' command to clear a branch's tree. Sometimes an import frontend may need to work with a temporary branch which will actually contain many different branches over the life of the import. This is especially useful when the frontend needs to create a tag from a set of file versions which are otherwise never a commit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-27 12:20:49 +02:00			`cmd_reset_branch();`
Implemented manual packfile switching in fast-import. To help importers which are dealing with massive amounts of data fast-import needs to be able to close the packfile it is currently writing to and open a new packfile for any additional data that will be received. A new 'checkpoint' command has been introduced which can be used by the frontend import process to force this to occur at any time. This may be useful to ensure a very long running import doesn't lose any work due to unexpected failures. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 12:35:41 +01:00			`else if (!strcmp("checkpoint", command_buf.buf))`
			`cmd_checkpoint();`
Allow frontends to bidirectionally communicate with fast-import The existing checkpoint command is very useful to force fast-import to dump the branches out to disk so that standard Git tools can access them and the objects they refer to. However there was not a way to know when fast-import had finished executing the checkpoint and it was safe to read those refs. The progress command can be used to make fast-import output any message of the frontend's choosing to standard out. The frontend can scan for these messages using select() or poll() to monitor a pipe connected to the standard output of fast-import. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-08-01 16:23:08 +02:00			`else if (!prefixcmp(command_buf.buf, "progress "))`
			`cmd_progress();`
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00			`else`
			`die("Unsupported command: %s", command_buf.buf);`
Created fast-import, a tool to quickly generating a pack from blobs. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-05 08:04:21 +02:00			`}`
Restructure fast-import to support creating multiple packfiles. Now that we are starting to see some really large projects (such as KDE or a fork of FreeBSD) get imported into Git we're running into the upper limit on packfile object count as well as overall byte length. The KDE and FreeBSD projects are both likely to require more than 4 GiB to store their current history, which means we really need multiple packfiles to handle their content. This is a fairly simple restructuring of the internal code to help us support creating multiple packfiles from within fast-import. We are now adding a 5 digit incrementing suffix to the end of the basename supplied to us by the caller, permitting up to 99,999 packs to be generated in a single fast-import run. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 10:39:05 +01:00			`end_packfile();`
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`dump_branches();`
Implemented 'tag' command in fast-import. Tags received from the frontend are generated in memory in a simple linked list in the order that the tag commands were sent by the frontend. If multiple different tag objects for the same tag name get generated the last one sent by the frontend will be the one that gets written out at termination. Multiple tag objects for the same name will cause all older tags of the same name to be lost. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-24 09:12:13 +02:00			`dump_tags();`
Use .keep files in fast-import during processing. Because fast-import automatically updates all references (heads and tags) at the end of its run the repository is corrupt unless the objects are available in the .git/objects/pack directory prior to the refs being modified. The easiest way to ensure that is true is to move the packfile and its associated index directly into the .git/objects/pack directory as soon as we have finished output to it. But the only safe way to do this is to create the a temporary .keep file for that pack, so we use the same tricks that index-pack uses when its being invoked by receive-pack. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-16 07:15:31 +01:00			`unkeep_all_packs();`
Added option to export the marks table when fast-import terminates. The marks table can be used by the frontend to load any commit after the import and compare it to whatever data the frontend knows about that commit. If the mark idnums can be easily correlated to some reference source then its relatively trivial to compare the GIT tree to the reference to verify the accuracy of the import. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-25 22:03:04 +02:00			`dump_marks();`
Added automatic index generation to fast-import. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-06 19:51:39 +02:00
fast-import: Hide the pack boundary commits by default. Most users don't need the pack boundary information that fast-import was printing to standard output, especially if they were calling it with --quiet. Those users who do want this information probably want it captured so they can go back and use it to repack the imported repository. So dumping the boundary commits to a log file makes more sense then printing them to standard output. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-12 01:45:56 +01:00			`if (pack_edges)`
			`fclose(pack_edges);`

Teach fast-import how to sit quietly in the corner. Often users will be running fast-import from within a larger frontend process, and this may be a frequent periodic tool such as a future edition of `git-svn fetch`. We don't want to bombard users with our large stats output if they won't be interested in it, so `--quiet` is now an option to make gfi be more silent. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-07 08:19:31 +01:00			`if (show_stats) {`
			`uintmax_t total_count = 0, duplicate_count = 0;`
			`for (i = 0; i < ARRAY_SIZE(object_count_by_type); i++)`
			`total_count += object_count_by_type[i];`
			`for (i = 0; i < ARRAY_SIZE(duplicate_count_by_type); i++)`
			`duplicate_count += duplicate_count_by_type[i];`

			`fprintf(stderr, "%s statistics:\n", argv[0]);`
			`fprintf(stderr, "---------------------------------------------------------------------\n");`
Check for PRIuMAX rather than NO_C99_FORMAT in fast-import.c. Thanks to Simon 'corecode' Schubert <corecode@fs.ei.tum.de> for the clean-up. Defining the C99 standard PRIuMAX when necessary replaces UM_FMT and the awkward UM10_FMT. There are no direct C99 translations for other uses of NO_C99_FORMAT in git, alas. Signed-off-by: Jason Riedy <ejr@cs.berkeley.edu> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-21 02:34:56 +01:00			`fprintf(stderr, "Alloc'd objects: %10" PRIuMAX "\n", alloc_count);`
			`fprintf(stderr, "Total objects: %10" PRIuMAX " (%10" PRIuMAX " duplicates )\n", total_count, duplicate_count);`
			`fprintf(stderr, " blobs : %10" PRIuMAX " (%10" PRIuMAX " duplicates %10" PRIuMAX " deltas)\n", object_count_by_type[OBJ_BLOB], duplicate_count_by_type[OBJ_BLOB], delta_count_by_type[OBJ_BLOB]);`
			`fprintf(stderr, " trees : %10" PRIuMAX " (%10" PRIuMAX " duplicates %10" PRIuMAX " deltas)\n", object_count_by_type[OBJ_TREE], duplicate_count_by_type[OBJ_TREE], delta_count_by_type[OBJ_TREE]);`
			`fprintf(stderr, " commits: %10" PRIuMAX " (%10" PRIuMAX " duplicates %10" PRIuMAX " deltas)\n", object_count_by_type[OBJ_COMMIT], duplicate_count_by_type[OBJ_COMMIT], delta_count_by_type[OBJ_COMMIT]);`
			`fprintf(stderr, " tags : %10" PRIuMAX " (%10" PRIuMAX " duplicates %10" PRIuMAX " deltas)\n", object_count_by_type[OBJ_TAG], duplicate_count_by_type[OBJ_TAG], delta_count_by_type[OBJ_TAG]);`
Teach fast-import how to sit quietly in the corner. Often users will be running fast-import from within a larger frontend process, and this may be a frequent periodic tool such as a future edition of `git-svn fetch`. We don't want to bombard users with our large stats output if they won't be interested in it, so `--quiet` is now an option to make gfi be more silent. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-07 08:19:31 +01:00			`fprintf(stderr, "Total branches: %10lu (%10lu loads )\n", branch_count, branch_load_count);`
Check for PRIuMAX rather than NO_C99_FORMAT in fast-import.c. Thanks to Simon 'corecode' Schubert <corecode@fs.ei.tum.de> for the clean-up. Defining the C99 standard PRIuMAX when necessary replaces UM_FMT and the awkward UM10_FMT. There are no direct C99 translations for other uses of NO_C99_FORMAT in git, alas. Signed-off-by: Jason Riedy <ejr@cs.berkeley.edu> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-21 02:34:56 +01:00			`fprintf(stderr, " marks: %10" PRIuMAX " (%10" PRIuMAX " unique )\n", (((uintmax_t)1) << marks->shift) * 1024, marks_set_count);`
Teach fast-import how to sit quietly in the corner. Often users will be running fast-import from within a larger frontend process, and this may be a frequent periodic tool such as a future edition of `git-svn fetch`. We don't want to bombard users with our large stats output if they won't be interested in it, so `--quiet` is now an option to make gfi be more silent. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-07 08:19:31 +01:00			`fprintf(stderr, " atoms: %10u\n", atom_cnt);`
Check for PRIuMAX rather than NO_C99_FORMAT in fast-import.c. Thanks to Simon 'corecode' Schubert <corecode@fs.ei.tum.de> for the clean-up. Defining the C99 standard PRIuMAX when necessary replaces UM_FMT and the awkward UM10_FMT. There are no direct C99 translations for other uses of NO_C99_FORMAT in git, alas. Signed-off-by: Jason Riedy <ejr@cs.berkeley.edu> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-21 02:34:56 +01:00			`fprintf(stderr, "Memory total: %10" PRIuMAX " KiB\n", (total_allocd + alloc_count*sizeof(struct object_entry))/1024);`
fast-import: Fix compile warnings Not on all platforms are size_t and unsigned long equivalent. Since I do not know how portable %z is, I play safe, and just cast the respective variables to unsigned long. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-07 12:38:21 +01:00			`fprintf(stderr, " pools: %10lu KiB\n", (unsigned long)(total_allocd/1024));`
Check for PRIuMAX rather than NO_C99_FORMAT in fast-import.c. Thanks to Simon 'corecode' Schubert <corecode@fs.ei.tum.de> for the clean-up. Defining the C99 standard PRIuMAX when necessary replaces UM_FMT and the awkward UM10_FMT. There are no direct C99 translations for other uses of NO_C99_FORMAT in git, alas. Signed-off-by: Jason Riedy <ejr@cs.berkeley.edu> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-21 02:34:56 +01:00			`fprintf(stderr, " objects: %10" PRIuMAX " KiB\n", (alloc_count*sizeof(struct object_entry))/1024);`
Teach fast-import how to sit quietly in the corner. Often users will be running fast-import from within a larger frontend process, and this may be a frequent periodic tool such as a future edition of `git-svn fetch`. We don't want to bombard users with our large stats output if they won't be interested in it, so `--quiet` is now an option to make gfi be more silent. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-07 08:19:31 +01:00			`fprintf(stderr, "---------------------------------------------------------------------\n");`
			`pack_report();`
			`fprintf(stderr, "---------------------------------------------------------------------\n");`
			`fprintf(stderr, "\n");`
			`}`
Created fast-import, a tool to quickly generating a pack from blobs. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-05 08:04:21 +02:00
Don't do non-fastforward updates in fast-import. If fast-import is being used to update an existing branch of a repository, the user may not want to lose commits if another process updates the same ref at the same time. For example, the user might be using fast-import to make just one or two commits against a live branch. We now perform a fast-forward check during the ref updating process. If updating a branch would cause commits in that branch to be lost, we skip over it and display the new SHA1 to standard error. This new default behavior can be overridden with `--force`, like git-push and git-fetch. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-06 22:08:06 +01:00			`return failure ? 1 : 0;`
Created fast-import, a tool to quickly generating a pack from blobs. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-05 08:04:21 +02:00			`}`