mirrors/git

mirror of https://github.com/git/git.git synced 2024-11-17 22:44:49 +01:00

3462 lines

87 KiB

C

			`/*`
			`(See Documentation/git-fast-import.txt for maintained documentation.)`
			`Format of STDIN stream:`

			`stream ::= cmd*;`

			`cmd ::= new_blob`
			`| new_commit`
			`| new_tag`
			`| reset_branch`
			`| checkpoint`
			`| progress`
			`;`

			`new_blob ::= 'blob' lf`
			`mark?`
			`file_content;`
			`file_content ::= data;`
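			`# example (illustrative, not part of the upstream grammar): a`
			`# blob assigned mark :1, using the exact byte count form of`
			`# data. The count 6 covers "hello" plus its trailing lf.`
			`#   blob`
			`#   mark :1`
			`#   data 6`
			`#   hello`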

			`new_commit ::= 'commit' sp ref_str lf`
			`mark?`
			`('author' (sp name)? sp '<' email '>' sp when lf)?`
			`'committer' (sp name)? sp '<' email '>' sp when lf`
			`commit_msg`
			`('from' sp committish lf)?`
			`('merge' sp committish lf)*`
			`(file_change | ls)*`
			`lf?;`
			`commit_msg ::= data;`

			`ls ::= 'ls' sp '"' quoted(path) '"' lf;`

			`file_change ::= file_clr`
			`| file_del`
			`| file_rnm`
			`| file_cpy`
			`| file_obm`
			`| file_inm;`
			`file_clr ::= 'deleteall' lf;`
			`file_del ::= 'D' sp path_str lf;`
			`file_rnm ::= 'R' sp path_str sp path_str lf;`
			`file_cpy ::= 'C' sp path_str sp path_str lf;`
			`file_obm ::= 'M' sp mode sp (hexsha1 | idnum) sp path_str lf;`
			`file_inm ::= 'M' sp mode sp 'inline' sp path_str lf`
			`data;`
			`note_obm ::= 'N' sp (hexsha1 | idnum) sp committish lf;`
			`note_inm ::= 'N' sp 'inline' sp committish lf`
			`data;`
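			`# example (illustrative, not part of the upstream grammar; the`
			`# name, email and timestamp are made up): a commit on`
			`# refs/heads/master that adds one file via file_inm. Each`
			`# data count includes the trailing lf of its payload.`
			`#   commit refs/heads/master`
			`#   mark :2`
			`#   committer C O Mitter <committer@example.com> 1112911993 -0700`
			`#   data 15`
			`#   initial import`
			`#   M 100644 inline README`
			`#   data 6`
			`#   hello`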

			`new_tag ::= 'tag' sp tag_str lf`
			`'from' sp committish lf`
			`('tagger' (sp name)? sp '<' email '>' sp when lf)?`
			`tag_msg;`
			`tag_msg ::= data;`
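			`# example (illustrative, not part of the upstream grammar; the`
			`# tagger identity is made up): an annotated tag pointing at`
			`# the object previously assigned mark :2.`
			`#   tag v1.0`
			`#   from :2`
			`#   tagger T Agger <tagger@example.com> 1112911993 -0700`
			`#   data 12`
			`#   version 1.0`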

			`reset_branch ::= 'reset' sp ref_str lf`
			`('from' sp committish lf)?`
			`lf?;`

			`checkpoint ::= 'checkpoint' lf`
			`lf?;`

			`progress ::= 'progress' sp not_lf* lf`
			`lf?;`
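			`# example (illustrative, not part of the upstream grammar): create`
			`# a branch from mark :2 without committing, force a checkpoint,`
			`# then emit a message the frontend can wait for on stdout.`
			`#   reset refs/heads/topic`
			`#   from :2`
			`#`
			`#   checkpoint`
			`#   progress checkpoint complete`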

Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00			`# note: the first idnum in a stream should be 1 and subsequent`
			`# idnums should not have gaps between values as this will cause`
			`# the stream parser to reserve space for the gapped values. An`
Fix whitespace in "Format of STDIN stream" of fast-import Something probably assumed that HT indentation is 4 characters. Signed-off-by: Alex Riesen <raa.lkml@gmail.com> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-07-15 10:57:40 +02:00			`# idnum can be updated in the future to a new object by issuing`
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00			`# a new mark directive with the old idnum.`
Fix whitespace in "Format of STDIN stream" of fast-import Something probably assumed that HT indentation is 4 characters. Signed-off-by: Alex Riesen <raa.lkml@gmail.com> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-07-15 10:57:40 +02:00			`#`
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00			`mark ::= 'mark' sp idnum lf;`
Support delimited data regions in fast-import. During testing its nice to not have to feed the length of a data chunk to the 'data' command of fast-import. Instead we would prefer to be able to establish a data chunk much like shell's << operator and use a line delimiter to denote the end of the input. So now if a data command is started as 'data <<EOF' we will look for a terminator line containing only the string EOF on that line. Once found, we stop the data command. Everything between the two lines is used as the data value. The 'data <<' syntax is slower than 'data n', as we don't know how many bytes to expect and instead must grow our buffer on the fly. It also has the problem that the frontend must use a string which will not appear on a line by itself in the input, and the data region will always end in an LF. For these reasons real import frontends are encouraged to continue to use _only_ 'data n'. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-18 19:14:27 +01:00			`data ::= (delimited_data \| exact_data)`
Make trailing LF following fast-import `data` commands optional A few fast-import frontend developers have found it odd that we require the LF following a `data` command, especially in the exact byte count format. Technically we don't need this LF to parse the stream properly, but having it here does make the stream more readable to humans. We can easily make the LF optional by peeking at the next byte available from the stream and pushing it back into the buffer if its not LF. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-08-01 06:24:25 +02:00			`lf?;`

			`# note: delim may be any string but must not contain lf.`
			`# data_line may contain any data but must not be exactly`
			`# delim.`
			`delimited_data ::= 'data' sp '<<' delim lf`
			`(data_line lf)*`
			`delim lf;`

			`# note: declen indicates the length of binary_data in bytes.`
Fix typos / spelling in comments Signed-off-by: Mike Ralphson <mike@abacus.co.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-04-17 20:13:30 +02:00			`# declen does not include the lf preceding the binary data.`
			`#`
			`exact_data ::= 'data' sp declen lf`
			`binary_data;`

			`# note: quoted strings are C-style quoting supporting \c for`
			`# common escapes of 'c' (e.g. \n, \t, \\, \") or \nnn where nnn`
			`# is the signed byte value in octal. Note that the only`
			`# characters which must actually be escaped to protect the`
			`# stream formatting are: \, " and LF. Otherwise these values`
			`# are UTF8.`
			`#`
fast-import: Add support for importing commit notes Introduce a 'notemodify' subcommand of the 'commit' command. This subcommand is similar to 'filemodify', except that no mode is supplied (all notes have mode 0644), and the path is set to the hex SHA1 of the given "comittish". This enables fast import of note objects along with their associated commits, since the notes can now be named using the mark references of their corresponding commits. The patch also includes a test case of the added functionality. Signed-off-by: Johan Herland <johan@herland.net> Acked-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-09 12:22:02 +02:00			`committish ::= (ref_str \| hexsha1 \| sha1exp_str \| idnum);`
Don't support shell-quoted refnames in fast-import. The current implementation of shell-style quoted refnames and SHA-1 expressions within fast-import contains a bad memory leak. We leak the unquoted strings used by the `from` and `merge` commands, maybe others. Its also just muddling up the docs. Since Git refnames cannot contain LF, and that is our delimiter for the end of the refname, and we accept any other character as-is, there is no reason for these strings to support quoting, except to be nice to frontends. But frontends shouldn't be expecting to use funny refs in Git, and its just as simple to never quote them as it is to always pass them through the same quoting filter as pathnames. So frontends should never quote refs, or ref expressions. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-06 02:30:37 +01:00			`ref_str ::= ref;`
			`sha1exp_str ::= sha1exp;`
			`tag_str ::= tag;`
			`path_str ::= path \| '"' quoted(path) '"' ;`
Accept 'inline' file data in fast-import commit structure. Its very annoying to need to specify the file content ahead of a commit and use marks to connect the individual blobs to the commit's file modification entry, especially if the frontend can't/won't generate the blob SHA1s itself. Instead it would much easier to use if we can accept the blob data at the same time as we receive each file_change line. Now fast-import accepts 'inline' instead of a mark idnum or blob SHA1 within the 'M' type file_change command. If an inline is detected the very next line must be a 'data n' command, supplying the file data. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-18 21:17:58 +01:00			`mode ::= '100644' \| '644'`
			`\| '100755' \| '755'`
S_IFLNK != 0140000 Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-06 21:46:11 +01:00			`\| '120000'`
			`;`

			`declen ::= # unsigned 32 bit value, ascii base10 notation;`
Corrected BNF input documentation for fast-import. Now that fast-import uses uintmax_t (the largest available unsigned integer type) for marks we don't want to say its an unsigned 32 bit integer in ASCII base 10 notation. It could be much larger, especially on 64 bit systems, and especially if a frontend uses a very large number of marks (1 per file revision on a very, very large import). Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-17 06:33:18 +01:00			`bigint ::= # unsigned integer value, ascii base10 notation;`
			`binary_data ::= # file content, not interpreted;`

Support RFC 2822 date parsing in fast-import. Since some frontends may be working with source material where the dates are only readily available as RFC 2822 strings, it is more friendly if fast-import exposes Git's parse_date() function to handle the conversion. This way the frontend doesn't need to perform the parsing itself. The new --date-format option to fast-import can be used by a frontend to select which format it will supply date strings in. The default is the standard `raw` Git format, which fast-import has always supported. Format rfc2822 can be used to activate the parse_date() function instead. Because fast-import could also be useful for creating new, current commits, the format `now` is also supported to generate the current system timestamp. The implementation of `now` is a trivial call to datestamp(), but is actually a whole whopping 3 lines so that fast-import can verify the frontend really meant `now`. As part of this change I have added validation of the `raw` date format. Prior to this change fast-import would accept anything in a `committer` command, even if it was seriously malformed. Now fast-import requires the '> ' near the end of the string and verifies the timestamp is formatted properly. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-06 20:58:30 +01:00			`when ::= raw_when \| rfc2822_when;`
			`raw_when ::= ts sp tz;`
			`rfc2822_when ::= # Valid RFC 2822 date and time;`

			`sp ::= # ASCII space character;`
			`lf ::= # ASCII newline (LF) character;`

			`# note: a colon (':') must precede the numerical value assigned to`
			`# an idnum. This is to distinguish it from a ref or tag name as`
			`# GIT does not permit ':' in ref or tag strings.`
			`#`
			`idnum ::= ':' bigint;`
			`path ::= # GIT style file path, e.g. "a/b/c";`
			`ref ::= # GIT ref name, e.g. "refs/heads/MOZ_GECKO_EXPERIMENT";`
			`tag ::= # GIT tag name, e.g. "FIREFOX_1_5";`
			`sha1exp ::= # Any valid GIT SHA1 expression;`
			`hexsha1 ::= # SHA1 in hexadecimal format;`

			`# note: name and email are UTF8 strings, however name must not`
			`# contain '<' or lf and email must not contain any of the`
			`# following: '<', '>', lf.`
			`#`
			`name ::= # valid GIT author/committer name;`
			`email ::= # valid GIT author/committer email;`
			`ts ::= # time since the epoch in seconds, ascii base10 notation;`
			`tz ::= # GIT style timezone;`
Teach fast-import to ignore lines starting with '#' Several frontend developers have asked that some form of stream comments be permitted within a fast-import data stream. This way they can include information from their own frontend program about where specific data was taken from in the source system, or about a decision that their frontend may have made while creating the fast-import data stream. This change introduces comments in the Bourne-shell/Tcl/Perl style. Lines starting with '#' are ignored, up to and including the LF. Unlike the above mentioned three languages however we do not look for and ignore leading whitespace. This just simplifies the definition of the comment format and the code that parses them. To make comments work we had to stop using read_next_command() within cmd_data() and directly invoke read_line() during the inline variant of the function. This is necessary to retain any lines of the input data that might otherwise look like a comment to fast-import. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-08-01 06:05:15 +02:00
fast-import: add 'ls' command Lazy fast-import frontend authors that want to rely on the backend to keep track of the content of the imported trees _almost_ have what they need in the 'cat-blob' command (v1.7.4-rc0~30^2~3, 2010-11-28). But it is not quite enough, since (1) cat-blob can be used to retrieve the content of files, but not their mode, and (2) using cat-blob requires the frontend to keep track of a name (mark number or object id) for each blob to be retrieved Introduce an 'ls' command to complement cat-blob and take care of the remaining needs. The 'ls' command finds what is at a given path within a given tree-ish (tag, commit, or tree): 'ls' SP <dataref> SP <path> LF or in fast-import's active commit: 'ls' SP <path> LF The response is a single line sent through the cat-blob channel, imitating ls-tree output. So for example: FE> ls :1 Documentation gfi> 040000 tree 9e6c2b599341d28a2a375f8207507e0a2a627fe9 Documentation FE> ls 9e6c2b599341d28a2a375f8207507e0a2a627fe9 git-fast-import.txt gfi> 100644 blob 4f92954396e3f0f97e75b6838a5635b583708870 git-fast-import.txt FE> ls :1 RelNotes gfi> 120000 blob b942e499449d97aeb50c73ca2bdc1c6e6d528743 RelNotes FE> cat-blob b942e499449d97aeb50c73ca2bdc1c6e6d528743 gfi> b942e499449d97aeb50c73ca2bdc1c6e6d528743 blob 32 gfi> Documentation/RelNotes/1.7.4.txt The most interesting parts of the reply are the first word, which is a 6-digit octal mode (regular file, executable, symlink, directory, or submodule), and the part from the second space to the tab, which is a <dataref> that can be used in later cat-blob, ls, and filemodify (M) commands to refer to the content (blob, tree, or commit) at that path. If there is nothing there, the response is "missing some/path". The intent is for this command to be used to read files from the active commit, so a frontend can apply patches to them, and to copy files and directories from previous revisions. 
For example, proposed updates to svn-fe use this command in place of its internal representation of the repository directory structure. This simplifies the frontend a great deal and means support for resuming an import in a separate fast-import run (i.e., incremental import) is basically free. Signed-off-by: David Barr <david.barr@cordelta.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Improved-by: Junio C Hamano <gitster@pobox.com> Improved-by: Sverre Rabbelier <srabbelier@gmail.com> 2010-12-02 11:40:20 +01:00			`# note: comments, ls and cat requests may appear anywhere`
fast-import: Allow cat-blob requests at arbitrary points in stream The new rule: a "cat-blob" can be inserted wherever a comment is allowed, which means at the start of any line except in the middle of a "data" command. This saves frontends from having to loop over everything they want to commit in the next commit and cat-ing the necessary objects in advance. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: David Barr <david.barr@cordelta.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-11-28 20:45:58 +01:00			`# in the input, except within a data command. Any form`
			`# of the data command always escapes the related input`
			`# from comment processing.`
			`#`
			`# In case it is not clear, the '#' that starts the comment`
Fix more typos/spelling in comments A few more fixes on top of the automatic spell checker generated ones. Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-04-22 23:15:56 +02:00			`# must be the first character on that line (an lf`
			`# preceded it).`
			`#`

			`cat_blob ::= 'cat-blob' sp (hexsha1 \| idnum) lf;`
			`ls_tree ::= 'ls' sp (hexsha1 \| idnum) sp path_str lf;`

			`comment ::= '#' not_lf* lf;`
			`not_lf ::= # Any byte that is not ASCII newline (LF);`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`*/`
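The 'ls' commit message above describes the reply as a 6-digit octal mode, a type, a dataref that runs from the second space to the tab, and then the path, or the literal `missing <path>` when nothing is there. A frontend might split such a line roughly as follows. This is only a sketch: `struct ls_reply` and `parse_ls_reply` are hypothetical names that do not exist in fast-import, and the code assumes a 40-digit hex SHA-1 and an unquoted path.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical holder for one parsed 'ls' reply; not a fast-import type. */
struct ls_reply {
	int missing;        /* 1 if the reply was "missing <path>" */
	unsigned int mode;  /* octal mode, e.g. 0100644 */
	char type[8];       /* "blob", "tree" or "commit" */
	char dataref[41];   /* assumption: 40 hex digits plus NUL */
	const char *path;   /* points into the caller's line */
};

/* Returns 0 on success, -1 if the line does not have the expected shape. */
static int parse_ls_reply(const char *line, struct ls_reply *out)
{
	const char *sp1, *sp2, *tab;
	char *end;

	assert(line && out);
	memset(out, 0, sizeof(*out)); /* also NUL-terminates type and dataref */

	if (!strncmp(line, "missing ", 8)) {
		out->missing = 1;
		out->path = line + 8;
		return 0;
	}

	/* mode SP type SP dataref TAB path */
	sp1 = strchr(line, ' ');
	if (!sp1 || sp1 - line != 6)
		return -1;
	sp2 = strchr(sp1 + 1, ' ');
	if (!sp2)
		return -1;
	tab = strchr(sp2 + 1, '\t');
	if (!tab)
		return -1;

	out->mode = (unsigned int)strtoul(line, &end, 8);
	if (end != sp1)
		return -1;
	if ((size_t)(sp2 - sp1 - 1) >= sizeof(out->type) || (tab - sp2 - 1) != 40)
		return -1;

	memcpy(out->type, sp1 + 1, (size_t)(sp2 - sp1 - 1));
	memcpy(out->dataref, sp2 + 1, 40);
	out->path = tab + 1;
	return 0;
}
```

The parsed `dataref` is the value the commit message says can be fed back into later cat-blob, ls, and filemodify (M) commands.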

Created fast-import, a tool to quickly generating a pack from blobs. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-05 08:04:21 +02:00			`#include "builtin.h"`
			`#include "cache.h"`
			`#include "object.h"`
			`#include "blob.h"`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`#include "tree.h"`
Don't do non-fastforward updates in fast-import. If fast-import is being used to update an existing branch of a repository, the user may not want to lose commits if another process updates the same ref at the same time. For example, the user might be using fast-import to make just one or two commits against a live branch. We now perform a fast-forward check during the ref updating process. If updating a branch would cause commits in that branch to be lost, we skip over it and display the new SHA1 to standard error. This new default behavior can be overridden with `--force`, like git-push and git-fetch. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-06 22:08:06 +01:00			`#include "commit.h"`
Created fast-import, a tool to quickly generating a pack from blobs. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-05 08:04:21 +02:00			`#include "delta.h"`
			`#include "pack.h"`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`#include "refs.h"`
Created fast-import, a tool to quickly generating a pack from blobs. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-05 08:04:21 +02:00			`#include "csum-file.h"`
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00			`#include "quote.h"`
Add calls to git_extract_argv0_path() in programs that call git_config_* Programs that use git_config need to find the global configuration. When runtime prefix computation is enabled, this requires that git_extract_argv0_path() is called early in the program's main(). This commit adds the necessary calls. Signed-off-by: Steffen Prohaska <prohaska@zib.de> Acked-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-01-18 13:00:12 +01:00			`#include "exec_cmd.h"`
Support case folding in git fast-import when core.ignorecase=true When core.ignorecase=true, imported file paths will be folded to match existing directory case. Signed-off-by: Joshua Jensen <jjensen@workspacewhiz.com> Signed-off-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-10-03 11:56:46 +02:00			`#include "dir.h"`
Created fast-import, a tool to quickly generating a pack from blobs. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-05 08:04:21 +02:00
Correct packfile edge output in fast-import. Branches are only contained by a packfile if the branch actually had its most recent commit in that packfile. So new branches are set to MAX_PACK_ID to ensure they don't cause their commit to list as part of the first packfile when it closes out if the commit was actually in existance before fast-import started. Also corrected the type of last_commit to be umaxint_t to prevent overflow and wraparound on very large imports. Though that is highly unlikely to occur as we're talking 4 billion commits, which no real project has right now. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-17 08:42:43 +01:00			`#define PACK_ID_BITS 16`
			`#define MAX_PACK_ID ((1<<PACK_ID_BITS)-1)`
Don't allow fast-import tree delta chains to exceed maximum depth Brian Downing noticed fast-import can produce tree depths of up to 6,035 objects and even deeper. Long delta chains can create very small packfiles but cause problems during repacking as git needs to unpack each tree to count the reachable blobs. What's happening here is the active branch cache isn't big enough. We're swapping out the branch and thus recycling the tree information (struct tree_content) back into the free pool. When we later reload the tree we set the delta_depth to 0 but we kept the tree we just reloaded as a delta base. So if the tree we reloaded was already at the maximum depth we wouldn't know it and make the new tree a delta. Multiply the number of times the branch cache has to swap out the tree times max_depth (10) and you get the maximum delta depth of a tree created by fast-import. In Brian's case above the active branch cache had to swap the branch out 603/604 times during this import to produce a tree with a delta depth of 6035. Acked-by: Brian Downing <bdowning@lavos.net> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-14 05:48:42 +01:00			`#define DEPTH_BITS 13`
			`#define MAX_DEPTH ((1<<DEPTH_BITS)-1)`
Correct packfile edge output in fast-import. Branches are only contained by a packfile if the branch actually had its most recent commit in that packfile. So new branches are set to MAX_PACK_ID to ensure they don't cause their commit to list as part of the first packfile when it closes out if the commit was actually in existance before fast-import started. Also corrected the type of last_commit to be umaxint_t to prevent overflow and wraparound on very large imports. Though that is highly unlikely to occur as we're talking 4 billion commits, which no real project has right now. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-17 08:42:43 +01:00
fast-import: prevent producing bad delta To produce deltas for tree objects fast-import tracks two versions of tree's entries - base and current one. Base version stands both for a delta base of this tree, and for a entry inside a delta base of a parent tree. So care should be taken to keep it in sync. tree_content_set cuts away a whole subtree and replaces it with a new one (or NULL for lazy load of a tree with known sha1). It keeps a base sha1 for this subtree (needed for parent tree). And here is the problem, 'subtree' tree root doesn't have the implied base version entries. Adjusting the subtree to include them would mean a deep rewrite of subtree. Invalidating the subtree base version would mean recursive invalidation of parents' base versions. So just mark this tree as do-not-delta me. Abuse setuid bit for this purpose. tree_content_replace is the same as tree_content_set except that is is used to replace the root, so just clearing base sha1 here (instead of setting the bit) is fine. [di: log message] Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Dmitry Ivankov <divanorama@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-08-14 20:32:24 +02:00			`/*`
			`* We abuse the setuid bit on directories to mean "do not delta".`
			`*/`
			`#define NO_DELTA S_ISUID`

standardize brace placement in struct definitions In a struct definitions, unlike functions, the prevailing style is for the opening brace to go on the same line as the struct name, like so: struct foo { int bar; char baz; }; Indeed, grepping for 'struct [a-z_] {$' yields about 5 times as many matches as 'struct [a-z_]*$'. Linus sayeth: Heretic people all over the world have claimed that this inconsistency is ... well ... inconsistent, but all right-thinking people know that (a) K&R are _right_ and (b) K&R are right. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-03-16 08:08:34 +01:00			`struct object_entry {`
fast-import: start using struct pack_idx_entry This is in preparation for using write_idx_file(). Signed-off-by: Nicolas Pitre <nico@fluxnic.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-02-17 20:05:51 +01:00			`struct pack_idx_entry idx;`
Cleaned up memory allocation for object_entry structs. Although its easy to ask the user to tell us how many objects they will need, its probably better to dynamically grow the object table in large units. But if the user can give us a hint as to roughly how many objects then we can still use it during startup. Also stopped printing the SHA1 strings to stdout as no user is currently making use of that facility. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 06:03:59 +02:00			`struct object_entry *next;`
Don't allow fast-import tree delta chains to exceed maximum depth Brian Downing noticed fast-import can produce tree depths of up to 6,035 objects and even deeper. Long delta chains can create very small packfiles but cause problems during repacking as git needs to unpack each tree to count the reachable blobs. What's happening here is the active branch cache isn't big enough. We're swapping out the branch and thus recycling the tree information (struct tree_content) back into the free pool. When we later reload the tree we set the delta_depth to 0 but we kept the tree we just reloaded as a delta base. So if the tree we reloaded was already at the maximum depth we wouldn't know it and make the new tree a delta. Multiply the number of times the branch cache has to swap out the tree times max_depth (10) and you get the maximum delta depth of a tree created by fast-import. In Brian's case above the active branch cache had to swap the branch out 603/604 times during this import to produce a tree with a delta depth of 6035. Acked-by: Brian Downing <bdowning@lavos.net> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-14 05:48:42 +01:00			`uint32_t type : TYPE_BITS,`
			`pack_id : PACK_ID_BITS,`
			`depth : DEPTH_BITS;`
Cleaned up memory allocation for object_entry structs. Although its easy to ask the user to tell us how many objects they will need, its probably better to dynamically grow the object table in large units. But if the user can give us a hint as to roughly how many objects then we can still use it during startup. Also stopped printing the SHA1 strings to stdout as no user is currently making use of that facility. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 06:03:59 +02:00			`};`
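The three bit-fields at the end of `struct object_entry` share a single `uint32_t`. With `PACK_ID_BITS` at 16 and `DEPTH_BITS` at 13, that leaves 3 bits for the type. The sketch below illustrates that budget and the resulting limits. `TYPE_BITS` is not defined in this excerpt, so the value 3 is an assumption here, and `struct packed_fields` and `store_depth` are hypothetical names used only for illustration. The clamping is not something this excerpt shows fast-import doing.

```c
#include <stdint.h>

#define PACK_ID_BITS 16
#define MAX_PACK_ID ((1<<PACK_ID_BITS)-1)
#define DEPTH_BITS 13
#define MAX_DEPTH ((1<<DEPTH_BITS)-1)
#define TYPE_BITS 3 /* assumption: defined elsewhere, not in this excerpt */

/* Hypothetical stand-in for the bit-field tail of struct object_entry. */
struct packed_fields {
	uint32_t type : TYPE_BITS,
		pack_id : PACK_ID_BITS,
		depth : DEPTH_BITS;
};

/* Compile-time check: the three widths fill exactly one 32-bit word. */
typedef char fields_fit_in_32_bits[
	(TYPE_BITS + PACK_ID_BITS + DEPTH_BITS == 32) ? 1 : -1];

/*
 * Store a delta depth, clamping to MAX_DEPTH so that an oversized value
 * cannot silently wrap when it is truncated to DEPTH_BITS bits.
 * Returns the value actually stored.
 */
static unsigned int store_depth(struct packed_fields *f, unsigned int depth)
{
	f->depth = depth > MAX_DEPTH ? MAX_DEPTH : depth;
	return f->depth;
}
```

Without the clamp, assigning `MAX_DEPTH + 1` to the 13-bit field would store 0, which is the kind of wraparound the explicit limits are there to make visible.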

standardize brace placement in struct definitions In a struct definitions, unlike functions, the prevailing style is for the opening brace to go on the same line as the struct name, like so: struct foo { int bar; char baz; }; Indeed, grepping for 'struct [a-z_] {$' yields about 5 times as many matches as 'struct [a-z_]*$'. Linus sayeth: Heretic people all over the world have claimed that this inconsistency is ... well ... inconsistent, but all right-thinking people know that (a) K&R are _right_ and (b) K&R are right. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-03-16 08:08:34 +01:00			`struct object_entry_pool {`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`struct object_entry_pool *next_pool;`
Cleaned up memory allocation for object_entry structs. Although its easy to ask the user to tell us how many objects they will need, its probably better to dynamically grow the object table in large units. But if the user can give us a hint as to roughly how many objects then we can still use it during startup. Also stopped printing the SHA1 strings to stdout as no user is currently making use of that facility. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 06:03:59 +02:00			`struct object_entry *next_free;`
			`struct object_entry *end;`
Refactored fast-import's internals for future additions. Too many globals variables were being used not not enough code was resuable to process trees and commits so this is a simple refactoring of the existing blob processing code to get into a state that will be easier to handle trees and commits in. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 06:46:13 +02:00			`struct object_entry entries[FLEX_ARRAY]; /* more */`
Cleaned up memory allocation for object_entry structs. Although its easy to ask the user to tell us how many objects they will need, its probably better to dynamically grow the object table in large units. But if the user can give us a hint as to roughly how many objects then we can still use it during startup. Also stopped printing the SHA1 strings to stdout as no user is currently making use of that facility. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 06:03:59 +02:00			`};`

standardize brace placement in struct definitions In a struct definitions, unlike functions, the prevailing style is for the opening brace to go on the same line as the struct name, like so: struct foo { int bar; char baz; }; Indeed, grepping for 'struct [a-z_] {$' yields about 5 times as many matches as 'struct [a-z_]*$'. Linus sayeth: Heretic people all over the world have claimed that this inconsistency is ... well ... inconsistent, but all right-thinking people know that (a) K&R are _right_ and (b) K&R are right. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-03-16 08:08:34 +01:00			`struct mark_set {`
Added mark store/find to fast-import. Marks are now saved when the mark directive gets used by the frontend and may be used in place of a SHA1 expression to locate a previous SHA1 which fast-import may have generated. This is particularly useful with commits where the frontend does not (easily) have the ability to compute the SHA1 for an arbitrary commit but needs it to generate a branch or tag from that commit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-23 10:17:45 +02:00			`union {`
			`struct object_entry *marked[1024];`
			`struct mark_set *sets[1024];`
			`} data;`
Correct a few types to be unsigned in fast-import. The length of an atom string cannot be negative. So make it explicit and declare it as an unsigned value. The shift width in a mark table node also cannot be negative. I'm also moving it to after the pointer arrays to prevent any possible alignment problems on a 64 bit system. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-17 06:57:23 +01:00			`unsigned int shift;`
Added mark store/find to fast-import. Marks are now saved when the mark directive gets used by the frontend and may be used in place of a SHA1 expression to locate a previous SHA1 which fast-import may have generated. This is particularly useful with commits where the frontend does not (easily) have the ability to compute the SHA1 for an arbitrary commit but needs it to generate a branch or tag from that commit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-23 10:17:45 +02:00			`};`
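`struct mark_set` reads as a 1024-way radix tree keyed by mark number: inner nodes use `sets`, leaves use `marked`, and `shift` says how many low bits are handled below a node. The following sketch shows one way insertion and lookup over such a layout can work. It is an illustration under assumptions, not the file's own code: the payload is a plain `void *` instead of `struct object_entry *`, nodes come from `calloc` rather than a pool, and `new_set`, `insert_mark`, and `find_mark` as written here are illustrative helpers.

```c
#include <stdint.h>
#include <stdlib.h>

struct mark_set {
	union {
		void *marked[1024];          /* leaf level: payload pointers */
		struct mark_set *sets[1024]; /* inner level: child nodes */
	} data;
	unsigned int shift;              /* 0 at leaves, +10 per level above */
};

static struct mark_set *new_set(unsigned int shift)
{
	struct mark_set *s = calloc(1, sizeof(*s));
	if (!s)
		abort();
	s->shift = shift;
	return s;
}

static void insert_mark(struct mark_set **root, uintmax_t idnum, void *payload)
{
	struct mark_set *s = *root;

	/* Grow upward until the root covers idnum; the old root becomes slot 0. */
	while ((idnum >> s->shift) >= 1024) {
		struct mark_set *top = new_set(s->shift + 10);
		top->data.sets[0] = s;
		s = top;
	}
	*root = s;

	/* Walk down 10 bits at a time, creating missing inner nodes. */
	while (s->shift) {
		uintmax_t i = idnum >> s->shift;
		idnum -= i << s->shift;
		if (!s->data.sets[i])
			s->data.sets[i] = new_set(s->shift - 10);
		s = s->data.sets[i];
	}
	s->data.marked[idnum] = payload;
}

static void *find_mark(const struct mark_set *s, uintmax_t idnum)
{
	while (s && s->shift) {
		uintmax_t i = idnum >> s->shift;
		if (i >= 1024)
			return NULL; /* beyond what this tree covers */
		idnum -= i << s->shift;
		s = s->data.sets[i];
	}
	if (!s || idnum >= 1024)
		return NULL;
	return s->data.marked[idnum];
}
```

Starting from a single leaf (`new_set(0)`), small mark numbers cost one array lookup, and the tree only gains a level when a mark number exceeds the range the current root covers.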

standardize brace placement in struct definitions In a struct definitions, unlike functions, the prevailing style is for the opening brace to go on the same line as the struct name, like so: struct foo { int bar; char baz; }; Indeed, grepping for 'struct [a-z_] {$' yields about 5 times as many matches as 'struct [a-z_]*$'. Linus sayeth: Heretic people all over the world have claimed that this inconsistency is ... well ... inconsistent, but all right-thinking people know that (a) K&R are _right_ and (b) K&R are right. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-03-16 08:08:34 +01:00			`struct last_object {`
fast-import optimization: Now that cmd_data acts on a strbuf, make last_object stashed buffer be a strbuf as well. On new stash, don't free the last stashed buffer, rather swap it with the one you will stash, this way, callers of store_object can act on static strbufs, and at some point, fast-import won't allocate new memory for objects buffers. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-17 14:00:38 +02:00			`struct strbuf data;`
fast-import: make default pack size unlimited Now that fast-import is creating packs with index version 2, there is no point limiting the pack size by default. A pack split will still happen if off_t is not sufficiently large to hold large offsets. While updating the doc, let's remove the "packfiles fit on CDs" suggestion. Pack files created by fast-import are still suboptimal and a 'git repack -a -f -d' or even 'git gc --aggressive' would be a pretty good idea before considering storage on CDs. Signed-off-by: Nicolas Pitre <nico@fluxnic.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-02-17 20:05:54 +01:00			`off_t offset;`
Implemented branch handling and basic tree support in fast-import. This provides the basic data structures needed to store trees in memory while we are processing them for a branch. What we are attempting to do is track one complete tree for each branch that the frontend has registered with us through the 'newb' (new_branch) command. When the frontend edits that tree through 'updf' or 'delf' commands we'll mark the affected tree(s) as being dirty and recompute their objects during 'comt' (commit). Currently the protocol is decidedly _not_ user friendly. I crashed fast-import by giving it bad input data from Perl. I may try to improve upon it, or at least upon its error handling. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 09:36:45 +02:00			`unsigned int depth;`
fast-import optimization: Now that cmd_data acts on a strbuf, make last_object stashed buffer be a strbuf as well. On new stash, don't free the last stashed buffer, rather swap it with the one you will stash, this way, callers of store_object can act on static strbufs, and at some point, fast-import won't allocate new memory for objects buffers. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-17 14:00:38 +02:00			`unsigned no_swap : 1;`
Refactored fast-import's internals for future additions. Too many globals variables were being used not not enough code was resuable to process trees and commits so this is a simple refactoring of the existing blob processing code to get into a state that will be easier to handle trees and commits in. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 06:46:13 +02:00			`};`

standardize brace placement in struct definitions In a struct definitions, unlike functions, the prevailing style is for the opening brace to go on the same line as the struct name, like so: struct foo { int bar; char baz; }; Indeed, grepping for 'struct [a-z_] {$' yields about 5 times as many matches as 'struct [a-z_]*$'. Linus sayeth: Heretic people all over the world have claimed that this inconsistency is ... well ... inconsistent, but all right-thinking people know that (a) K&R are _right_ and (b) K&R are right. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-03-16 08:08:34 +01:00			`struct mem_pool {`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`struct mem_pool *next_pool;`
			`char *next_free;`
			`char *end;`
fast-import: fix unalinged allocation and access The specialized pool allocator fast-import uses aligned objects on the size of a pointer, which was not sufficient at least on Sparc. Instead, make the alignment for objects of type unitmax_t. Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-12-15 05:39:16 +01:00			`uintmax_t space[FLEX_ARRAY]; /* more */`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`};`
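The commit message on the `space` member explains why it is declared as `uintmax_t`: pointer-sized alignment was not sufficient on at least Sparc. A bump allocator over this layout might look like the sketch below. It is a simplified illustration: `round_up`, `pool_new`, and this `pool_alloc` are hypothetical helpers written for the example, a C99 flexible array member stands in for `FLEX_ARRAY`, and the sketch returns NULL when no pool has room instead of allocating a new pool.

```c
#include <stdint.h>
#include <stdlib.h>

struct mem_pool {
	struct mem_pool *next_pool;
	char *next_free;
	char *end;
	uintmax_t space[]; /* C99 flexible array standing in for FLEX_ARRAY */
};

/* Round len up to a multiple of sizeof(uintmax_t). */
static size_t round_up(size_t len)
{
	size_t a = sizeof(uintmax_t);
	return (len + a - 1) / a * a;
}

/* Create a pool with at least `capacity` usable bytes, chained to `next`. */
static struct mem_pool *pool_new(size_t capacity, struct mem_pool *next)
{
	struct mem_pool *p;

	capacity = round_up(capacity);
	p = malloc(sizeof(*p) + capacity);
	if (!p)
		abort();
	p->next_pool = next;
	p->next_free = (char *)p->space;
	p->end = p->next_free + capacity;
	return p;
}

/*
 * Bump-allocate from the first pool in the chain that has room.
 * Every request is rounded up, so each returned pointer stays
 * aligned for uintmax_t. Returns NULL if no pool has room.
 */
static void *pool_alloc(struct mem_pool *head, size_t len)
{
	struct mem_pool *p;

	len = round_up(len);
	for (p = head; p; p = p->next_pool) {
		if ((size_t)(p->end - p->next_free) >= len) {
			void *r = p->next_free;
			p->next_free += len;
			return r;
		}
	}
	return NULL;
}
```

Because `space` is itself an array of `uintmax_t`, the first allocation starts aligned, and rounding every length keeps all later ones aligned too.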

standardize brace placement in struct definitions In a struct definitions, unlike functions, the prevailing style is for the opening brace to go on the same line as the struct name, like so: struct foo { int bar; char baz; }; Indeed, grepping for 'struct [a-z_] {$' yields about 5 times as many matches as 'struct [a-z_]*$'. Linus sayeth: Heretic people all over the world have claimed that this inconsistency is ... well ... inconsistent, but all right-thinking people know that (a) K&R are _right_ and (b) K&R are right. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-03-16 08:08:34 +01:00			`struct atom_str {`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`struct atom_str *next_atom;`
Reduce memory usage of fast-import. Some structs are allocated rather frequently, but were using integer types which were far larger than required to actually store their full value range. As packfiles are limited to 4 GiB we don't need more than 32 bits to store the offset of an object within that packfile, an `unsigned long` on a 64 bit system is likely a 64 bit unsigned value. Saving 4 bytes per object on a 64 bit system can add up fast on any sizable import. As atom strings are strictly single components in a path name these are probably limited to just 255 bytes by the underlying OS. Going to that short of a string is probably too restrictive, but certainly `unsigned int` is far too large for their lengths. `unsigned short` is a reasonable limit. Modes within a tree really only need two bytes to store their whole value; using `unsigned int` here is vast overkill. Saving 4 bytes per file entry in an active branch can add up quickly on a project with a large number of files. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-05 22:34:56 +01:00			`unsigned short str_len;`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`char str_dat[FLEX_ARRAY]; /* more */`
			`};`

			`struct tree_content;`
standardize brace placement in struct definitions In a struct definitions, unlike functions, the prevailing style is for the opening brace to go on the same line as the struct name, like so: struct foo { int bar; char baz; }; Indeed, grepping for 'struct [a-z_] {$' yields about 5 times as many matches as 'struct [a-z_]*$'. Linus sayeth: Heretic people all over the world have claimed that this inconsistency is ... well ... inconsistent, but all right-thinking people know that (a) K&R are _right_ and (b) K&R are right. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-03-16 08:08:34 +01:00			`struct tree_entry {`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`struct tree_content *tree;`
Fix a bunch of pointer declarations (codestyle) Essentially; s/type* /type */ as per the coding guidelines. Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-05-01 11:06:36 +02:00			`struct atom_str *name;`
standardize brace placement in struct definitions In a struct definitions, unlike functions, the prevailing style is for the opening brace to go on the same line as the struct name, like so: struct foo { int bar; char baz; }; Indeed, grepping for 'struct [a-z_] {$' yields about 5 times as many matches as 'struct [a-z_]*$'. Linus sayeth: Heretic people all over the world have claimed that this inconsistency is ... well ... inconsistent, but all right-thinking people know that (a) K&R are _right_ and (b) K&R are right. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-03-16 08:08:34 +01:00			`struct tree_entry_ms {`
Reduce memory usage of fast-import. Some structs are allocated rather frequently, but were using integer types which were far larger than required to actually store their full value range. As packfiles are limited to 4 GiB we don't need more than 32 bits to store the offset of an object within that packfile, an `unsigned long` on a 64 bit system is likely a 64 bit unsigned value. Saving 4 bytes per object on a 64 bit system can add up fast on any sizable import. As atom strings are strictly single components in a path name these are probably limited to just 255 bytes by the underlying OS. Going to that short of a string is probably too restrictive, but certainly `unsigned int` is far too large for their lengths. `unsigned short` is a reasonable limit. Modes within a tree really only need two bytes to store their whole value; using `unsigned int` here is vast overkill. Saving 4 bytes per file entry in an active branch can add up quickly on a project with a large number of files. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-05 22:34:56 +01:00			`uint16_t mode;`
Implemented tree delta compression in fast-import. We now store for every tree entry two modes and two sha1 values; the base (aka "version 0") and the current/new (aka "version 1"). When we generate a tree object we also regenerate the prior version object and use that as our base object for a delta. This strategy saves a significant amount of memory as we can continue to use the atom pool for file/directory names and only increases each tree entry by an additional 24 bytes of memory. Branches should automatically delta against their ancestor tree, unless the ancestor tree is already at the delta chain limit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-28 18:22:50 +02:00			`unsigned char sha1[20];`
			`} versions[2];`
Implemented branch handling and basic tree support in fast-import. This provides the basic data structures needed to store trees in memory while we are processing them for a branch. What we are attempting to do is track one complete tree for each branch that the frontend has registered with us through the 'newb' (new_branch) command. When the frontend edits that tree through 'updf' or 'delf' commands we'll mark the affected tree(s) as being dirty and recompute their objects during 'comt' (commit). Currently the protocol is decidedly _not_ user friendly. I crashed fast-import by giving it bad input data from Perl. I may try to improve upon it, or at least upon its error handling. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 09:36:45 +02:00			`};`

standardize brace placement in struct definitions In a struct definitions, unlike functions, the prevailing style is for the opening brace to go on the same line as the struct name, like so: struct foo { int bar; char baz; }; Indeed, grepping for 'struct [a-z_] {$' yields about 5 times as many matches as 'struct [a-z_]*$'. Linus sayeth: Heretic people all over the world have claimed that this inconsistency is ... well ... inconsistent, but all right-thinking people know that (a) K&R are _right_ and (b) K&R are right. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-03-16 08:08:34 +01:00			`struct tree_content {`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`unsigned int entry_capacity; /* must match avail_tree_content */`
			`unsigned int entry_count;`
Implemented tree delta compression in fast-import. We now store for every tree entry two modes and two sha1 values; the base (aka "version 0") and the current/new (aka "version 1"). When we generate a tree object we also regenerate the prior version object and use that as our base object for a delta. This strategy saves a significant amount of memory as we can continue to use the atom pool for file/directory names and only increases each tree entry by an additional 24 bytes of memory. Branches should automatically delta against their ancestor tree, unless the ancestor tree is already at the delta chain limit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-28 18:22:50 +02:00			`unsigned int delta_depth;`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`struct tree_entry *entries[FLEX_ARRAY]; /* more */`
			`};`

standardize brace placement in struct definitions In a struct definitions, unlike functions, the prevailing style is for the opening brace to go on the same line as the struct name, like so: struct foo { int bar; char baz; }; Indeed, grepping for 'struct [a-z_] {$' yields about 5 times as many matches as 'struct [a-z_]*$'. Linus sayeth: Heretic people all over the world have claimed that this inconsistency is ... well ... inconsistent, but all right-thinking people know that (a) K&R are _right_ and (b) K&R are right. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-03-16 08:08:34 +01:00			`struct avail_tree_content {`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`unsigned int entry_capacity; /* must match tree_content */`
			`struct avail_tree_content *next_avail;`
Implemented branch handling and basic tree support in fast-import. This provides the basic data structures needed to store trees in memory while we are processing them for a branch. What we are attempting to do is track one complete tree for each branch that the frontend has registered with us through the 'newb' (new_branch) command. When the frontend edits that tree through 'updf' or 'delf' commands we'll mark the affected tree(s) as being dirty and recompute their objects during 'comt' (commit). Currently the protocol is decidedly _not_ user friendly. I crashed fast-import by giving it bad input data from Perl. I may try to improve upon it, or at least upon its error handling. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 09:36:45 +02:00			`};`

standardize brace placement in struct definitions In a struct definitions, unlike functions, the prevailing style is for the opening brace to go on the same line as the struct name, like so: struct foo { int bar; char baz; }; Indeed, grepping for 'struct [a-z_] {$' yields about 5 times as many matches as 'struct [a-z_]*$'. Linus sayeth: Heretic people all over the world have claimed that this inconsistency is ... well ... inconsistent, but all right-thinking people know that (a) K&R are _right_ and (b) K&R are right. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-03-16 08:08:34 +01:00			`struct branch {`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`struct branch *table_next_branch;`
			`struct branch *active_next_branch;`
Implemented branch handling and basic tree support in fast-import. This provides the basic data structures needed to store trees in memory while we are processing them for a branch. What we are attempting to do is track one complete tree for each branch that the frontend has registered with us through the 'newb' (new_branch) command. When the frontend edits that tree through 'updf' or 'delf' commands we'll mark the affected tree(s) as being dirty and recompute their objects during 'comt' (commit). Currently the protocol is decidedly _not_ user friendly. I crashed fast-import by giving it bad input data from Perl. I may try to improve upon it, or at least upon its error handling. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 09:36:45 +02:00			`const char *name;`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`struct tree_entry branch_tree;`
Correct packfile edge output in fast-import. Branches are only contained by a packfile if the branch actually had its most recent commit in that packfile. So new branches are set to MAX_PACK_ID to ensure they don't cause their commit to list as part of the first packfile when it closes out if the commit was actually in existance before fast-import started. Also corrected the type of last_commit to be umaxint_t to prevent overflow and wraparound on very large imports. Though that is highly unlikely to occur as we're talking 4 billion commits, which no real project has right now. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-17 08:42:43 +01:00			`uintmax_t last_commit;`
fast-import: Proper notes tree manipulation This patch teaches 'git fast-import' to automatically organize note objects in a fast-import stream into an appropriate fanout structure. The notes API in notes.h is NOT used to accomplish this, because trying to keep the fast-import and notes data structures in sync would yield a significantly larger patch with higher complexity. Note objects are added with the 'N' command, and accounted for with a per-branch counter, which is used to trigger fanout restructuring when needed. Note that when restructuring the branch tree, _any_ entry whose path consists of 40 hex chars (not including directory separators) will be recognized as a note object. It is therefore not advisable to manipulate note entries with M/D/R/C commands. Since note objects are stored in the same tree structure as other objects, the unloading and reloading of a fast-import branches handle note objects transparently. This patch has been improved by the following contributions: - Shawn O. Pearce: Several style- and logic-related improvements Cc: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Johan Herland <johan@herland.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-12-07 12:27:24 +01:00			`uintmax_t num_notes;`
fast-import: Avoid infinite loop after reset Johannes Sixt noticed that a 'reset' command applied to a branch that is already active in the branch LRU cache can cause fast-import to relink the same branch into the LRU cache twice. This will cause the LRU cache to contain a cycle, making unload_one_branch run in an infinite loop as it tries to select the oldest branch for eviction. I have trivially fixed the problem by adding an active bit to each branch object; this bit indicates if the branch is already in the LRU and allows us to avoid trying to add it a second time. Converting the pack_id field into a bitfield makes this change take up no additional memory. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-03-05 18:31:09 +01:00			`unsigned active : 1;`
			`unsigned pack_id : PACK_ID_BITS;`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`unsigned char sha1[20];`
Implemented branch handling and basic tree support in fast-import. This provides the basic data structures needed to store trees in memory while we are processing them for a branch. What we are attempting to do is track one complete tree for each branch that the frontend has registered with us through the 'newb' (new_branch) command. When the frontend edits that tree through 'updf' or 'delf' commands we'll mark the affected tree(s) as being dirty and recompute their objects during 'comt' (commit). Currently the protocol is decidedly _not_ user friendly. I crashed fast-import by giving it bad input data from Perl. I may try to improve upon it, or at least upon its error handling. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 09:36:45 +02:00			`};`
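
The `active` bit described in the "Avoid infinite loop after reset" commit message can be illustrated with a minimal sketch. The names `lru_branch`, `active_head`, `active_count` and `activate` are hypothetical and chosen for illustration only; this is not the code fast-import uses, just one way such a guard against double insertion might look.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct branch. */
struct lru_branch {
	struct lru_branch *active_next; /* next entry in the active list */
	const char *name;
	unsigned active : 1;            /* set while linked into the list */
};

static struct lru_branch *active_head;
static unsigned long active_count;

/*
 * Link a branch into the active list at most once. Without the
 * check on b->active, linking the same node twice would make its
 * active_next point back at itself, creating a cycle.
 */
static void activate(struct lru_branch *b)
{
	if (b->active)
		return;
	b->active_next = active_head;
	active_head = b;
	b->active = 1;
	active_count++;
}
```

Calling `activate()` twice on the same branch leaves the list unchanged after the first call, so a later walk over the list still terminates.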

standardize brace placement in struct definitions In a struct definitions, unlike functions, the prevailing style is for the opening brace to go on the same line as the struct name, like so: struct foo { int bar; char baz; }; Indeed, grepping for 'struct [a-z_] {$' yields about 5 times as many matches as 'struct [a-z_]*$'. Linus sayeth: Heretic people all over the world have claimed that this inconsistency is ... well ... inconsistent, but all right-thinking people know that (a) K&R are _right_ and (b) K&R are right. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-03-16 08:08:34 +01:00			`struct tag {`
Implemented 'tag' command in fast-import. Tags received from the frontend are generated in memory in a simple linked list in the order that the tag commands were sent by the frontend. If multiple different tag objects for the same tag name get generated the last one sent by the frontend will be the one that gets written out at termination. Multiple tag objects for the same name will cause all older tags of the same name to be lost. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-24 09:12:13 +02:00			`struct tag *next_tag;`
			`const char *name;`
Print out the edge commits for each packfile in fast-import. To help callers repack very large repositories into a series of packfiles fast-import now outputs the last commits/tags it wrote to a packfile when it prints out the packfile name. This information can be feed to pack-objects --revs to repack. For the first pack of an initial import this is pretty easy (just feed those SHA1s on stdin) but for subsequent packs you want to feed the subsequent pack's final SHA1s but also all prior pack's SHA1s prefixed with the negation operator. This way the prior pack's data does not get included into the subsequent pack. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-16 22:18:44 +01:00			`unsigned int pack_id;`
Implemented 'tag' command in fast-import. Tags received from the frontend are generated in memory in a simple linked list in the order that the tag commands were sent by the frontend. If multiple different tag objects for the same tag name get generated the last one sent by the frontend will be the one that gets written out at termination. Multiple tag objects for the same name will cause all older tags of the same name to be lost. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-24 09:12:13 +02:00			`unsigned char sha1[20];`
			`};`

standardize brace placement in struct definitions In a struct definitions, unlike functions, the prevailing style is for the opening brace to go on the same line as the struct name, like so: struct foo { int bar; char baz; }; Indeed, grepping for 'struct [a-z_] {$' yields about 5 times as many matches as 'struct [a-z_]*$'. Linus sayeth: Heretic people all over the world have claimed that this inconsistency is ... well ... inconsistent, but all right-thinking people know that (a) K&R are _right_ and (b) K&R are right. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-03-16 08:08:34 +01:00			`struct hash_list {`
Support creation of merge commits in fast-import. Some importers are able to determine when branch merges occurred within their source data. In these cases they will want to supply the correct commits to fast-import so that a proper merge commit will exist in Git. This is now supported by supplying a 'merge ' command after the commit message and optional from command. A merge is not actually performed by fast-import, its assumed that the frontend performed any sort of merging activity already and that fast-import should simply be storing its result. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-12 04:21:38 +01:00			`struct hash_list *next;`
			`unsigned char sha1[20];`
			`};`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00
Support RFC 2822 date parsing in fast-import. Since some frontends may be working with source material where the dates are only readily available as RFC 2822 strings, it is more friendly if fast-import exposes Git's parse_date() function to handle the conversion. This way the frontend doesn't need to perform the parsing itself. The new --date-format option to fast-import can be used by a frontend to select which format it will supply date strings in. The default is the standard `raw` Git format, which fast-import has always supported. Format rfc2822 can be used to activate the parse_date() function instead. Because fast-import could also be useful for creating new, current commits, the format `now` is also supported to generate the current system timestamp. The implementation of `now` is a trivial call to datestamp(), but is actually a whole whopping 3 lines so that fast-import can verify the frontend really meant `now`. As part of this change I have added validation of the `raw` date format. Prior to this change fast-import would accept anything in a `committer` command, even if it was seriously malformed. Now fast-import requires the '> ' near the end of the string and verifies the timestamp is formatted properly. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-06 20:58:30 +01:00			`typedef enum {`
			`WHENSPEC_RAW = 1,`
			`WHENSPEC_RFC2822,`
enums: omit trailing comma for portability Without this patch at least IBM VisualAge C 5.0 (I have 5.0.2) on AIX 5.1 fails to compile git. enum style is inconsistent already, with some enums declared on one line, some over 3 lines with the enum values all on the middle line, sometimes with 1 enum value per line... and independently of that the trailing comma is sometimes present and other times absent, often mixing with/without trailing comma styles in a single file, and sometimes in consecutive enum declarations. Clearly, omitting the comma is the more portable style, and this patch changes all enum declarations to use the portable omitted dangling comma style consistently. Signed-off-by: Gary V. Vaughan <gary@thewrittenword.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-05-14 11:31:35 +02:00			`WHENSPEC_NOW`
Support RFC 2822 date parsing in fast-import. Since some frontends may be working with source material where the dates are only readily available as RFC 2822 strings, it is more friendly if fast-import exposes Git's parse_date() function to handle the conversion. This way the frontend doesn't need to perform the parsing itself. The new --date-format option to fast-import can be used by a frontend to select which format it will supply date strings in. The default is the standard `raw` Git format, which fast-import has always supported. Format rfc2822 can be used to activate the parse_date() function instead. Because fast-import could also be useful for creating new, current commits, the format `now` is also supported to generate the current system timestamp. The implementation of `now` is a trivial call to datestamp(), but is actually a whole whopping 3 lines so that fast-import can verify the frontend really meant `now`. As part of this change I have added validation of the `raw` date format. Prior to this change fast-import would accept anything in a `committer` command, even if it was seriously malformed. Now fast-import requires the '> ' near the end of the string and verifies the timestamp is formatted properly. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-06 20:58:30 +01:00			`} whenspec_type;`

standardize brace placement in struct definitions In a struct definitions, unlike functions, the prevailing style is for the opening brace to go on the same line as the struct name, like so: struct foo { int bar; char baz; }; Indeed, grepping for 'struct [a-z_] {$' yields about 5 times as many matches as 'struct [a-z_]*$'. Linus sayeth: Heretic people all over the world have claimed that this inconsistency is ... well ... inconsistent, but all right-thinking people know that (a) K&R are _right_ and (b) K&R are right. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-03-16 08:08:34 +01:00			`struct recent_command {`
Include recent command history in fast-import crash reports When we crash the frontend developer (or end-user) may need to know roughly around what part of the input stream we had a problem with and aborted on. Because line numbers aren't very useful in this sort of application we instead just keep the last 100 commands in a FIFO queue and print them as part of the crash report. Currently one problem with this design is a commit that has more than 100 modified files in it will flood the FIFO and any context regarding branch/from/committer/mark/comments will be lost. We really should save only the last few (10?) file changes for the current commit, ensuring we have some prior higher level commands in the FIFO when we crash on a file M/D/C/R command. Another issue with this approach is the FIFO only includes the commands, it does not include the commit messages. Yet having a commit message may be useful to help locate the relevant change in the source material. In practice I don't think this is going to be a major concern as the frontend can always embed its own source change set identifier as a comment (which will appear in the crash report) and the commit message(s) for the most recent commits of any given branch should be obtainable from the (packed) commit objects. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-08-03 10:47:04 +02:00			`struct recent_command *prev;`
			`struct recent_command *next;`
			`char *buf;`
			`};`
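
The bounded command history described in the commit message above can be sketched as a doubly linked list with a sentinel node that recycles its oldest entry once a limit is reached. Everything here (`cmd_node`, `cmd_history`, `history_init`, `history_push`, `dup_line`) is a hypothetical illustration under that assumption, not the code in fast-import itself.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical node, shaped like struct recent_command. */
struct cmd_node {
	struct cmd_node *prev;
	struct cmd_node *next;
	char *buf;
};

struct cmd_history {
	struct cmd_node head; /* sentinel: head.next is oldest, head.prev is newest */
	unsigned int count;
	unsigned int limit;   /* assumed to be at least 1 */
};

static void history_init(struct cmd_history *h, unsigned int limit)
{
	assert(limit > 0);
	h->head.prev = h->head.next = &h->head;
	h->head.buf = NULL;
	h->count = 0;
	h->limit = limit;
}

/* Duplicate a string with malloc; returns NULL on allocation failure. */
static char *dup_line(const char *s)
{
	size_t len = strlen(s) + 1;
	char *copy = malloc(len);
	if (copy)
		memcpy(copy, s, len);
	return copy;
}

/* Append a command line, dropping the oldest one when the list is full. */
static int history_push(struct cmd_history *h, const char *line)
{
	struct cmd_node *n;
	char *copy = dup_line(line);
	if (!copy)
		return -1;

	if (h->count == h->limit) {
		/* Full: unlink the oldest node and reuse it. */
		n = h->head.next;
		n->prev->next = n->next;
		n->next->prev = n->prev;
		free(n->buf);
	} else {
		n = malloc(sizeof(*n));
		if (!n) {
			free(copy);
			return -1;
		}
		h->count++;
	}

	/* Link the node in as the newest entry. */
	n->buf = copy;
	n->prev = h->head.prev;
	n->next = &h->head;
	h->head.prev->next = n;
	h->head.prev = n;
	return 0;
}
```

A crash reporter could then walk from `head.next` to `head.prev` to print the retained commands from oldest to newest.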

Use uintmax_t for marks in fast-import. If a frontend wants to use a mark per file revision and per commit and is doing a truly huge import (such as a 32 GiB SVN repository) we may need more than 2**32 unique mark values, especially if the frontend is unable (or unwilling) to recycle mark values. For mark idnums we should use the largest unsigned integer type available, hoping that will be at least 64 bits when we are compiled as a 64 bit executable. This way we may consume huge amounts of memory storing our mark table, but we'll at least be able to process the entire import without failing. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-16 06:33:19 +01:00			`/* Configured limits on output */`
Converted fast-import to accept standard command line parameters. The following command line options are now accepted before the pack name: --objects=n # replaces the object count after the pack name --depth=n # delta chain depth to use (default is 10) --active-branches=n # maximum number of branches to keep in memory Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-23 08:00:31 +02:00			`static unsigned long max_depth = 10;`
fast-import: make default pack size unlimited Now that fast-import is creating packs with index version 2, there is no point limiting the pack size by default. A pack split will still happen if off_t is not sufficiently large to hold large offsets. While updating the doc, let's remove the "packfiles fit on CDs" suggestion. Pack files created by fast-import are still suboptimal and a 'git repack -a -f -d' or even 'git gc --aggressive' would be a pretty good idea before considering storage on CDs. Signed-off-by: Nicolas Pitre <nico@fluxnic.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-02-17 20:05:54 +01:00			`static off_t max_packsize;`
Don't do non-fastforward updates in fast-import. If fast-import is being used to update an existing branch of a repository, the user may not want to lose commits if another process updates the same ref at the same time. For example, the user might be using fast-import to make just one or two commits against a live branch. We now perform a fast-forward check during the ref updating process. If updating a branch would cause commits in that branch to be lost, we skip over it and display the new SHA1 to standard error. This new default behavior can be overridden with `--force`, like git-push and git-fetch. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-06 22:08:06 +01:00			`static int force_update;`
Teach fast-import to honor pack.compression and pack.depth We now use the configured pack.compression and pack.depth values within fast-import, as like builtin-pack-objects fast-import is generating a packfile for consumption by the Git tools. We use the same behavior as builtin-pack-objects does for these options, allowing core.compression to supply the default value for pack.compression. The default setting for pack.depth within fast-import is still 10 as users will generally repack fast-import generated packfiles by `repack -f`. A large delta depth within the fast-import packfile can significantly slow down such a later repack. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-01-21 05:36:54 +01:00			`static int pack_compression_level = Z_DEFAULT_COMPRESSION;`
			`static int pack_compression_seen;`
Use uintmax_t for marks in fast-import. If a frontend wants to use a mark per file revision and per commit and is doing a truly huge import (such as a 32 GiB SVN repository) we may need more than 2**32 unique mark values, especially if the frontend is unable (or unwilling) to recycle mark values. For mark idnums we should use the largest unsigned integer type available, hoping that will be at least 64 bits when we are compiled as a 64 bit executable. This way we may consume huge amounts of memory storing our mark table, but we'll at least be able to process the entire import without failing. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-16 06:33:19 +01:00
			`/* Stats and misc. counters */`
			`static uintmax_t alloc_count;`
			`static uintmax_t marks_set_count;`
			`static uintmax_t object_count_by_type[1 << TYPE_BITS];`
			`static uintmax_t duplicate_count_by_type[1 << TYPE_BITS];`
			`static uintmax_t delta_count_by_type[1 << TYPE_BITS];`
fast-import: count and report # of calls to diff_delta in stats It's an interesting number, how often do we try to deltify each type of objects and how often do we succeed. So do add it to stats. Success doesn't mean much gain in pack size though. As we allow delta to be as big as (data.len - 20). And delta close to data.len gains nothing compared to no delta at all even after zlib compression (delta is pretty much the same as data, just with few modifications). We should try to make less attempts that result in huge deltas as these consume more cpu than trivial small deltas. Either by choosing a better delta base or reducing delta size upper bound or doing less delta attempts at all. Currently, delta base for blobs is a waste literally. Each blob delta base is chosen as a previously stored blob. Disabling deltas for blobs doesn't increase pack size and reduce import time, or at least doesn't increase time for all fast-import streams I've tried. Signed-off-by: Dmitry Ivankov <divanorama@gmail.com> Acked-by: David Barr <davidbarr@google.com> Acked-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-08-20 21:04:11 +02:00			`static uintmax_t delta_count_attempts_by_type[1 << TYPE_BITS];`
Correct object_count type and stat output in fast-import. Since object_count is limited to 'unsigned long' (really an unsigned 32 bit integer value) by the pack file format we may as well use exactly that type here in fast-import for that counter. An earlier change by me incorrectly made it uintmax_t. But since object_count is a counter for the current packfile only, we don't want to output its value at the end. Instead we should sum up the individual type counters and report that total, as that will cover all of the packfiles. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-16 10:55:41 +01:00			`static unsigned long object_count;`
Implemented branch handling and basic tree support in fast-import. This provides the basic data structures needed to store trees in memory while we are processing them for a branch. What we are attempting to do is track one complete tree for each branch that the frontend has registered with us through the 'newb' (new_branch) command. When the frontend edits that tree through 'updf' or 'delf' commands we'll mark the affected tree(s) as being dirty and recompute their objects during 'comt' (commit). Currently the protocol is decidedly _not_ user friendly. I crashed fast-import by giving it bad input data from Perl. I may try to improve upon it, or at least upon its error handling. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 09:36:45 +02:00			`static unsigned long branch_count;`
Added branch load counter to fast-import. If the branch load count exceeds the number of branches created then the frontend is causing fast-import to page branches into and out of memory due to the way its ordering its commits. Performance can likely be increased if the frontend were to alter its commit sequence such that it stays on one branch before switching to another branch, then never returns to the prior branch. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-23 10:31:12 +02:00			`static unsigned long branch_load_count;`
Don't do non-fastforward updates in fast-import. If fast-import is being used to update an existing branch of a repository, the user may not want to lose commits if another process updates the same ref at the same time. For example, the user might be using fast-import to make just one or two commits against a live branch. We now perform a fast-forward check during the ref updating process. If updating a branch would cause commits in that branch to be lost, we skip over it and display the new SHA1 to standard error. This new default behavior can be overridden with `--force`, like git-push and git-fetch. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-06 22:08:06 +01:00			`static int failure;`
fast-import: Hide the pack boundary commits by default. Most users don't need the pack boundary information that fast-import was printing to standard output, especially if they were calling it with --quiet. Those users who do want this information probably want it captured so they can go back and use it to repack the imported repository. So dumping the boundary commits to a log file makes more sense then printing them to standard output. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-12 01:45:56 +01:00			`static FILE *pack_edges;`
fast-import: put option parsing code in separate functions Putting the options in their own functions increases readability of the option parsing block and makes it easier to reuse the option parsing code later on. Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-12-04 18:06:54 +01:00			`static unsigned int show_stats = 1;`
fast-import: add option command This allows the frontend to specify any of the supported options as long as no non-option command has been given. This way the user does not have to include any frontend-specific options, but instead she can rely on the frontend to tell fast-import what it needs. Also factor out parsing of argv and have it execute when we reach the first non-option command, or after all commands have been read and no non-option command has been encountered. Non-git options are ignored, unrecognised options result in an error. Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-12-04 18:06:57 +01:00			`static int global_argc;`
			`static const char **global_argv;`
Refactored fast-import's internals for future additions. Too many globals variables were being used not not enough code was resuable to process trees and commits so this is a simple refactoring of the existing blob processing code to get into a state that will be easier to handle trees and commits in. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 06:46:13 +02:00
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`/* Memory pools */`
			`static size_t mem_pool_alloc = 2*1024*1024 - sizeof(struct mem_pool);`
			`static size_t total_allocd;`
			`static struct mem_pool *mem_pool;`

Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00			`/* Atom management */`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`static unsigned int atom_table_sz = 4451;`
			`static unsigned int atom_cnt;`
			`static struct atom_str **atom_table;`

			`/* The .pack file being generated */`
write_idx_file: introduce a struct to hold idx customization options Remove two globals, pack_idx_default version and pack_idx_off32_limit, and place them in a pack_idx_option structure. Allow callers to pass it to write_idx_file() as a parameter. Adjust all callers to the API change. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-02-26 00:43:25 +01:00			`static struct pack_idx_option pack_idx_opts;`
Implemented manual packfile switching in fast-import. To help importers which are dealing with massive amounts of data fast-import needs to be able to close the packfile it is currently writing to and open a new packfile for any additional data that will be received. A new 'checkpoint' command has been introduced which can be used by the frontend import process to force this to occur at any time. This may be useful to ensure a very long running import doesn't lose any work due to unexpected failures. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 12:35:41 +01:00			`static unsigned int pack_id;`
fast-import: use sha1write() for pack data This is in preparation for using write_idx_file(). Also, by using sha1write() we get some buffering to reduces the number of write syscalls, and the written data is SHA1 summed which allows for the extra data integrity validation check performed in fixup_pack_header_footer() (details on this in commit abeb40e5aa). Signed-off-by: Nicolas Pitre <nico@fluxnic.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-02-17 20:05:52 +01:00			`static struct sha1file *pack_file;`
Improve reuse of sha1_file library within fast-import. Now that the sha1_file.c library routines use the sliding mmap routines to perform efficient access to portions of a packfile I can remove that code from fast-import.c and just invoke it. One benefit is we now have reloading support for any packfile which uses OBJ_OFS_DELTA. Another is we have significantly less code to maintain. This code reuse change requires that fast-import generate only an OBJ_OFS_DELTA format packfile, as there is absolutely no index available to perform OBJ_REF_DELTA lookup in while unpacking an object. This is probably reasonable to require as the delta offsets result in smaller packfiles and are faster to unpack, as no index searching is required. Its also only a temporary requirement as users could always repack without offsets before making the import available to older versions of Git. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-14 12:20:23 +01:00			`static struct packed_git *pack_data;`
Implemented manual packfile switching in fast-import. To help importers which are dealing with massive amounts of data fast-import needs to be able to close the packfile it is currently writing to and open a new packfile for any additional data that will be received. A new 'checkpoint' command has been introduced which can be used by the frontend import process to force this to occur at any time. This may be useful to ensure a very long running import doesn't lose any work due to unexpected failures. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 12:35:41 +01:00			`static struct packed_git **all_packs;`
fast-import: make default pack size unlimited Now that fast-import is creating packs with index version 2, there is no point limiting the pack size by default. A pack split will still happen if off_t is not sufficiently large to hold large offsets. While updating the doc, let's remove the "packfiles fit on CDs" suggestion. Pack files created by fast-import are still suboptimal and a 'git repack -a -f -d' or even 'git gc --aggressive' would be a pretty good idea before considering storage on CDs. Signed-off-by: Nicolas Pitre <nico@fluxnic.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-02-17 20:05:54 +01:00			`static off_t pack_size;`
Refactored fast-import's internals for future additions. Too many globals variables were being used not not enough code was resuable to process trees and commits so this is a simple refactoring of the existing blob processing code to get into a state that will be easier to handle trees and commits in. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 06:46:13 +02:00
			`/* Table of objects we've written. */`
Implemented tree delta compression in fast-import. We now store for every tree entry two modes and two sha1 values; the base (aka "version 0") and the current/new (aka "version 1"). When we generate a tree object we also regenerate the prior version object and use that as our base object for a delta. This strategy saves a significant amount of memory as we can continue to use the atom pool for file/directory names and only increases each tree entry by an additional 24 bytes of memory. Branches should automatically delta against their ancestor tree, unless the ancestor tree is already at the delta chain limit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-28 18:22:50 +02:00			`static unsigned int object_entry_alloc = 5000;`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`static struct object_entry_pool *blocks;`
			`static struct object_entry *object_table[1 << 16];`
Added mark store/find to fast-import. Marks are now saved when the mark directive gets used by the frontend and may be used in place of a SHA1 expression to locate a previous SHA1 which fast-import may have generated. This is particularly useful with commits where the frontend does not (easily) have the ability to compute the SHA1 for an arbitrary commit but needs it to generate a branch or tag from that commit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-23 10:17:45 +02:00			`static struct mark_set *marks;`
fast-import: put marks reading in its own function All options do nothing but set settings, with the exception of the --input-marks option. Delay the reading of the marks file till after all options have been parsed. Also, rename mark_file to export_marks_file as it is now ambiguous. Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-12-04 18:06:55 +01:00			`static const char *export_marks_file;`
			`static const char *import_marks_file;`
fast-import: allow for multiple --import-marks= arguments The --import-marks= option may be specified multiple times on the commandline and should result in all marks being read in. Only one import-marks feature may be specified in the stream, which is overriden by any --import-marks= commandline options. If one wishes to specify import-marks files in addition to the one specified in the stream, it is easy to repeat the stream option as a --import-marks= commandline option. Also verify this behavior with tests. Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-12-04 18:06:59 +01:00			`static int import_marks_file_from_stream;`
fast-import: Introduce --import-marks-if-exists When a frontend uses a marks file to ensure its state persists between runs, it may represent "clean slate" when bootstrapping with "no marks yet". In such a case, feeding the last state with --import-marks and saving the state after the current run with --export-marks would be a natural thing to do. The --import-marks option however errors out when the specified marks file doesn't exist; this makes bootstrapping a bit difficult. The location of the marks file becomes backend-dependent when --relative-marks is in effect, and the frontend cannot check for the existence of the file in such a case. The --import-marks-if-exists option does the same thing as --import-marks but does not flag an error if the named file does not exist yet to help these frontends. Helped-by: Junio C Hamano <gitster@pobox.com> Helped-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-01-15 07:31:46 +01:00			`static int import_marks_file_ignore_missing;`
fast-import: add (non-)relative-marks feature After specifying 'feature relative-marks' the paths specified with 'feature import-marks' and 'feature export-marks' are relative to an internal directory in the current repository. In git-fast-import this means that the paths are relative to the '.git/info/fast-import' directory. However, other importers may use a different location. Add 'feature non-relative-marks' to disable this behavior, this way it is possible to, for example, specify the import-marks location as relative, and the export-marks location as non-relative. Also add tests to verify this behavior. Cc: Daniel Barkalow <barkalow@iabervon.org> Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-12-04 18:07:00 +01:00			`static int relative_marks_paths;`
Refactored fast-import's internals for future additions. Too many globals variables were being used not not enough code was resuable to process trees and commits so this is a simple refactoring of the existing blob processing code to get into a state that will be easier to handle trees and commits in. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 06:46:13 +02:00
			`/* Our last blob */`
fast-import optimization: Now that cmd_data acts on a strbuf, make last_object stashed buffer be a strbuf as well. On new stash, don't free the last stashed buffer, rather swap it with the one you will stash, this way, callers of store_object can act on static strbufs, and at some point, fast-import won't allocate new memory for objects buffers. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-17 14:00:38 +02:00			`static struct last_object last_blob = { STRBUF_INIT, 0, 0, 0 };`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00
			`/* Tree management */`
			`static unsigned int tree_entry_alloc = 1000;`
			`static void *avail_tree_entry;`
			`static unsigned int avail_tree_table_sz = 100;`
			`static struct avail_tree_content **avail_tree_table;`
fast-import was using dbuf's, replace them with strbuf's. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-17 13:48:17 +02:00			`static struct strbuf old_tree = STRBUF_INIT;`
			`static struct strbuf new_tree = STRBUF_INIT;`
Added automatic index generation to fast-import. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-06 19:51:39 +02:00
Implemented branch handling and basic tree support in fast-import. This provides the basic data structures needed to store trees in memory while we are processing them for a branch. What we are attempting to do is track one complete tree for each branch that the frontend has registered with us through the 'newb' (new_branch) command. When the frontend edits that tree through 'updf' or 'delf' commands we'll mark the affected tree(s) as being dirty and recompute their objects during 'comt' (commit). Currently the protocol is decidedly _not_ user friendly. I crashed fast-import by giving it bad input data from Perl. I may try to improve upon it, or at least upon its error handling. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 09:36:45 +02:00			`/* Branch data */`
Converted fast-import to accept standard command line parameters. The following command line options are now accepted before the pack name: --objects=n # replaces the object count after the pack name --depth=n # delta chain depth to use (default is 10) --active-branches=n # maximum number of branches to keep in memory Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-23 08:00:31 +02:00			`static unsigned long max_active_branches = 5;`
			`static unsigned long cur_active_branches;`
			`static unsigned long branch_table_sz = 1039;`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`static struct branch **branch_table;`
			`static struct branch *active_branches;`

Implemented 'tag' command in fast-import. Tags received from the frontend are generated in memory in a simple linked list in the order that the tag commands were sent by the frontend. If multiple different tag objects for the same tag name get generated the last one sent by the frontend will be the one that gets written out at termination. Multiple tag objects for the same name will cause all older tags of the same name to be lost. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-24 09:12:13 +02:00			`/* Tag data */`
			`static struct tag *first_tag;`
			`static struct tag *last_tag;`

Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00			`/* Input stream parsing */`
Support RFC 2822 date parsing in fast-import. Since some frontends may be working with source material where the dates are only readily available as RFC 2822 strings, it is more friendly if fast-import exposes Git's parse_date() function to handle the conversion. This way the frontend doesn't need to perform the parsing itself. The new --date-format option to fast-import can be used by a frontend to select which format it will supply date strings in. The default is the standard `raw` Git format, which fast-import has always supported. Format rfc2822 can be used to activate the parse_date() function instead. Because fast-import could also be useful for creating new, current commits, the format `now` is also supported to generate the current system timestamp. The implementation of `now` is a trivial call to datestamp(), but is actually a whole whopping 3 lines so that fast-import can verify the frontend really meant `now`. As part of this change I have added validation of the `raw` date format. Prior to this change fast-import would accept anything in a `committer` command, even if it was seriously malformed. Now fast-import requires the '> ' near the end of the string and verifies the timestamp is formatted properly. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-06 20:58:30 +01:00			`static whenspec_type whenspec = WHENSPEC_RAW;`
fast-import: Use strbuf API, and simplify cmd_data() This patch features the use of strbuf_detach, and prevent the programmer to mess with allocation directly. The code is as efficent as before, just more concise and more straightforward. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-06 13:20:07 +02:00			`static struct strbuf command_buf = STRBUF_INIT;`
Make trailing LF optional for all fast-import commands For the same reasons as the prior change we want to allow frontends to omit the trailing LF that usually delimits commands. In some cases these just make the input stream more verbose looking than it needs to be, and its just simpler for the frontend developer to get started if our parser is slightly more lenient about where an LF is required and where it isn't. To make this optional LF feature work we now have to buffer up to one line of input in command_buf. This buffering can happen if we look at the current input command but don't recognize it at this point in the code. In such a case we need to "unget" the entire line, but we cannot depend upon the stdio library to let us do ungetc() for that many characters at once. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-08-01 08:22:53 +02:00			`static int unread_command_buf;`
Include recent command history in fast-import crash reports When we crash the frontend developer (or end-user) may need to know roughly around what part of the input stream we had a problem with and aborted on. Because line numbers aren't very useful in this sort of application we instead just keep the last 100 commands in a FIFO queue and print them as part of the crash report. Currently one problem with this design is a commit that has more than 100 modified files in it will flood the FIFO and any context regarding branch/from/committer/mark/comments will be lost. We really should save only the last few (10?) file changes for the current commit, ensuring we have some prior higher level commands in the FIFO when we crash on a file M/D/C/R command. Another issue with this approach is the FIFO only includes the commands, it does not include the commit messages. Yet having a commit message may be useful to help locate the relevant change in the source material. In practice I don't think this is going to be a major concern as the frontend can always embed its own source change set identifier as a comment (which will appear in the crash report) and the commit message(s) for the most recent commits of any given branch should be obtainable from the (packed) commit objects. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-08-03 10:47:04 +02:00			`static struct recent_command cmd_hist = {&cmd_hist, &cmd_hist, NULL};`
			`static struct recent_command *cmd_tail = &cmd_hist;`
			`static struct recent_command *rc_free;`
			`static unsigned int cmd_save = 100;`
Use uintmax_t for marks in fast-import. If a frontend wants to use a mark per file revision and per commit and is doing a truly huge import (such as a 32 GiB SVN repository) we may need more than 2**32 unique mark values, especially if the frontend is unable (or unwilling) to recycle mark values. For mark idnums we should use the largest unsigned integer type available, hoping that will be at least 64 bits when we are compiled as a 64 bit executable. This way we may consume huge amounts of memory storing our mark table, but we'll at least be able to process the entire import without failing. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-16 06:33:19 +01:00			`static uintmax_t next_mark;`
fast-import was using dbuf's, replace them with strbuf's. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-17 13:48:17 +02:00			`static struct strbuf new_data = STRBUF_INIT;`
fast-import: add feature command This allows the fronted to require a specific feature to be supported by the backend, or abort. Also add support for four initial feature, date-format=, force=, import-marks=, export-marks=. Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-12-04 18:06:56 +01:00			`static int seen_data_command;`
fast-import: introduce 'done' command Add a 'done' command that causes fast-import to stop reading from the stream and exit. If the new --done command line flag was passed on the command line (or a "feature done" declaration included at the start of the stream), make the 'done' command mandatory. So "git fast-import --done"'s input format will be prefix-free, making errors easier to detect when they show up as early termination at some convenient time of the upstream of a pipe writing to fast-import. Another possible application of the 'done' command would to be allow a fast-import stream that is only a small part of a larger encapsulating stream to be easily parsed, leaving the file offset after the "done\n" so the other application can pick up from there. This patch does not teach fast-import to do that --- fast-import still uses buffered input (stdio). Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com> Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-07-16 15:03:32 +02:00			`static int require_explicit_termination;`
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00
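The crash-report history declared above (cmd_hist, cmd_tail, cmd_save) keeps only the most recent commands so that a report can show what fast-import was reading when it failed. The following is a minimal sketch of that bounded-history idea, not the code used here: it assumes a fixed-size array ring instead of the circular list of struct recent_command, and the names hist_ring, hist_push, hist_get, HIST_CAP and HIST_LINE are hypothetical.

```c
#include <stdio.h>
#include <string.h>

#define HIST_CAP 4      /* illustrative; the source defaults cmd_save to 100 */
#define HIST_LINE 128   /* illustrative cap on a stored command's length */

/* Hypothetical ring buffer holding the last HIST_CAP command lines. */
struct hist_ring {
	char lines[HIST_CAP][HIST_LINE];
	unsigned int next;   /* slot the next command will overwrite */
	unsigned int count;  /* number of valid entries, at most HIST_CAP */
};

/* Record one command, evicting the oldest entry once the ring is full. */
static void hist_push(struct hist_ring *h, const char *cmd)
{
	snprintf(h->lines[h->next], HIST_LINE, "%s", cmd);
	h->next = (h->next + 1) % HIST_CAP;
	if (h->count < HIST_CAP)
		h->count++;
}

/* Return the i-th retained command, oldest first, or NULL if out of range. */
static const char *hist_get(const struct hist_ring *h, unsigned int i)
{
	unsigned int start;

	if (i >= h->count)
		return NULL;
	start = (h->next + HIST_CAP - h->count) % HIST_CAP;
	return h->lines[(start + i) % HIST_CAP];
}
```

As the commit message for the history notes, a commit that touches more files than the history can hold pushes the earlier branch, committer and mark commands out of the window; the sketch has the same property.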
fast-import: treat SIGUSR1 as a request to access objects early It can be tedious to wait for a multi-million-revision import. Unfortunately it is hard to spy on the import because fast-import works by continuously streaming out objects, without updating the pack index or refs until a checkpoint command or the end of the stream. So allow the impatient operator to request checkpoints by sending a signal, like so: killall -USR1 git-fast-import When receiving such a signal, fast-import would schedule a checkpoint to take place after the current top-level command (usually a "commit" or "blob" request) finishes. Caveats: just like ordinary checkpoint commands, such requests slow down the import. Switching to a new pack at a suboptimal moment is also likely to result in a less dense initial collection of packs. That's the price. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-11-22 09:16:02 +01:00			`/* Signal handling */`
			`static volatile sig_atomic_t checkpoint_requested;`
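The SIGUSR1 behaviour described above follows a common pattern: the handler only sets a flag, and the main loop acts on it once the current top-level command has finished. The sketch below illustrates that pattern under stated assumptions. It uses the ISO C signal() call for brevity, assumes a POSIX system that defines SIGUSR1, and every demo_ name is hypothetical rather than taken from this file.

```c
#include <signal.h>

/* Hypothetical stand-in for the checkpoint_requested flag. */
static volatile sig_atomic_t demo_checkpoint_requested;

/* The handler only sets the flag, which is safe inside a signal handler. */
static void demo_checkpoint_signal(int signo)
{
	(void)signo;
	demo_checkpoint_requested = 1;
}

/* Install the handler for SIGUSR1; returns 0 on success, -1 on failure. */
static int demo_install_handler(void)
{
	return signal(SIGUSR1, demo_checkpoint_signal) == SIG_ERR ? -1 : 0;
}

/*
 * Meant to be called between top-level commands: returns 1 and clears the
 * flag if a checkpoint was requested since the last call, otherwise 0.
 */
static int demo_take_checkpoint_request(void)
{
	if (!demo_checkpoint_requested)
		return 0;
	demo_checkpoint_requested = 0;
	return 1;
}
```

Deferring the work to the main loop keeps the handler trivial, so the pack is never switched in the middle of writing an object.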

fast-import: let importers retrieve blobs New objects written by fast-import are not available immediately. Until a checkpoint has been started and finishes writing the pack index, any new blobs will not be accessible using standard git tools. So introduce a new way to access them: a "cat-blob" command in the command stream requests for fast-import to print a blob to stdout or a file descriptor specified by the argument to --cat-blob-fd. The value for cat-blob-fd cannot be specified in the stream because that would be a layering violation: the decision of where to direct a stream has to be made when fast-import is started anyway, so we might as well make the stream format is independent of that detail. Output uses the same format as "git cat-file --batch". Thanks to Sverre Rabbelier and Sam Vilain for guidance in designing the protocol. Based-on-patch-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: David Barr <david.barr@cordelta.com> Acked-by: Ramkumar Ramachandra <artagnon@gmail.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-11-28 20:45:01 +01:00			`/* Where to write output of cat-blob commands */`
			`static int cat_blob_fd = STDOUT_FILENO;`

fast-import: add option command This allows the frontend to specify any of the supported options as long as no non-option command has been given. This way the user does not have to include any frontend-specific options, but instead she can rely on the frontend to tell fast-import what it needs. Also factor out parsing of argv and have it execute when we reach the first non-option command, or after all commands have been read and no non-option command has been encountered. Non-git options are ignored, unrecognised options result in an error. Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-12-04 18:06:57 +01:00			`static void parse_argv(void);`
fast-import: Allow cat-blob requests at arbitrary points in stream The new rule: a "cat-blob" can be inserted wherever a comment is allowed, which means at the start of any line except in the middle of a "data" command. This saves frontends from having to loop over everything they want to commit in the next commit and cat-ing the necessary objects in advance. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: David Barr <david.barr@cordelta.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-11-28 20:45:58 +01:00			`static void parse_cat_blob(void);`
fast-import: add 'ls' command Lazy fast-import frontend authors that want to rely on the backend to keep track of the content of the imported trees _almost_ have what they need in the 'cat-blob' command (v1.7.4-rc0~30^2~3, 2010-11-28). But it is not quite enough, since (1) cat-blob can be used to retrieve the content of files, but not their mode, and (2) using cat-blob requires the frontend to keep track of a name (mark number or object id) for each blob to be retrieved Introduce an 'ls' command to complement cat-blob and take care of the remaining needs. The 'ls' command finds what is at a given path within a given tree-ish (tag, commit, or tree): 'ls' SP <dataref> SP <path> LF or in fast-import's active commit: 'ls' SP <path> LF The response is a single line sent through the cat-blob channel, imitating ls-tree output. So for example: FE> ls :1 Documentation gfi> 040000 tree 9e6c2b599341d28a2a375f8207507e0a2a627fe9 Documentation FE> ls 9e6c2b599341d28a2a375f8207507e0a2a627fe9 git-fast-import.txt gfi> 100644 blob 4f92954396e3f0f97e75b6838a5635b583708870 git-fast-import.txt FE> ls :1 RelNotes gfi> 120000 blob b942e499449d97aeb50c73ca2bdc1c6e6d528743 RelNotes FE> cat-blob b942e499449d97aeb50c73ca2bdc1c6e6d528743 gfi> b942e499449d97aeb50c73ca2bdc1c6e6d528743 blob 32 gfi> Documentation/RelNotes/1.7.4.txt The most interesting parts of the reply are the first word, which is a 6-digit octal mode (regular file, executable, symlink, directory, or submodule), and the part from the second space to the tab, which is a <dataref> that can be used in later cat-blob, ls, and filemodify (M) commands to refer to the content (blob, tree, or commit) at that path. If there is nothing there, the response is "missing some/path". The intent is for this command to be used to read files from the active commit, so a frontend can apply patches to them, and to copy files and directories from previous revisions. 
For example, proposed updates to svn-fe use this command in place of its internal representation of the repository directory structure. This simplifies the frontend a great deal and means support for resuming an import in a separate fast-import run (i.e., incremental import) is basically free. Signed-off-by: David Barr <david.barr@cordelta.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Improved-by: Junio C Hamano <gitster@pobox.com> Improved-by: Sverre Rabbelier <srabbelier@gmail.com> 2010-12-02 11:40:20 +01:00			`static void parse_ls(struct branch *b);`
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00
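The 'ls' reply described in the commit message above is a single line: a 6-digit octal mode, the object type, a dataref, a tab, and the path. A minimal sketch of producing such a line is shown here. The helper name format_ls_reply and its signature are hypothetical, and the sketch does no path quoting.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: format one 'ls' reply as
 *   <6-digit octal mode> SP <type> SP <dataref> TAB <path> LF
 * Returns the number of bytes written, or -1 if buf is too small.
 */
static int format_ls_reply(char *buf, size_t len, unsigned int mode,
			   const char *type, const char *dataref,
			   const char *path)
{
	int n = snprintf(buf, len, "%06o %s %s\t%s\n",
			 mode, type, dataref, path);

	if (n < 0 || (size_t)n >= len)
		return -1;
	return n;
}
```

A frontend reading the reply can split at the first two spaces and at the tab to recover the mode, the type, the dataref and the path.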
Generate crash reports on die in fast-import As fast-import is quite strict about its input and die()'s anytime something goes wrong it can be difficult for a frontend developer to troubleshoot why fast-import rejected their input, or to even determine what input command it rejected. This change introduces a custom handler for Git's die() routine. When we receive a die() for any reason (fast-import or a lower level core Git routine we called) the error is first dumped onto stderr and then a more extensive crash report file is prepared in GIT_DIR. Finally we exit the process with status 128, just like the stock builtin die handler. An internal flag is set to prevent any further die()'s that may be invoked during the crash report generator from causing us to enter into an infinite loop. We shouldn't die() from our crash report handler, but just in case someone makes a future code change we are prepared to gaurd against small mistakes turning into huge problems for the end-user. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-08-03 08:00:37 +02:00			`static void write_branch_report(FILE *rpt, struct branch *b)`
			`{`
			`fprintf(rpt, "%s:\n", b->name);`

			`fprintf(rpt, " status :");`
			`if (b->active)`
			`fputs(" active", rpt);`
			`if (b->branch_tree.tree)`
			`fputs(" loaded", rpt);`
			`if (is_null_sha1(b->branch_tree.versions[1].sha1))`
			`fputs(" dirty", rpt);`
			`fputc('\n', rpt);`

			`fprintf(rpt, " tip commit : %s\n", sha1_to_hex(b->sha1));`
			`fprintf(rpt, " old tree : %s\n", sha1_to_hex(b->branch_tree.versions[0].sha1));`
			`fprintf(rpt, " cur tree : %s\n", sha1_to_hex(b->branch_tree.versions[1].sha1));`
			`fprintf(rpt, " commit clock: %" PRIuMAX "\n", b->last_commit);`

			`fputs(" last pack : ", rpt);`
			`if (b->pack_id < MAX_PACK_ID)`
			`fprintf(rpt, "%u", b->pack_id);`
			`fputc('\n', rpt);`

			`fputc('\n', rpt);`
			`}`

Include the fast-import marks table in crash reports If fast-import was not run with --export-marks but we are crashing the frontend application developer may still benefit from having that information available to them. We now include the marks table as part of the crash report if --export-marks was not supplied on the command line. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-02-14 07:34:40 +01:00			`static void dump_marks_helper(FILE *, uintmax_t, struct mark_set *);`

Avoid using va_copy in fast-import: it seems to be unportable. [sp: minor change to use fputs, thus reducing the patch size] Signed-off-by: Alex Riesen <raa.lkml@gmail.com> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-08-21 05:38:14 +02:00			`static void write_crash_report(const char *err)`
Generate crash reports on die in fast-import As fast-import is quite strict about its input and die()'s anytime something goes wrong it can be difficult for a frontend developer to troubleshoot why fast-import rejected their input, or to even determine what input command it rejected. This change introduces a custom handler for Git's die() routine. When we receive a die() for any reason (fast-import or a lower level core Git routine we called) the error is first dumped onto stderr and then a more extensive crash report file is prepared in GIT_DIR. Finally we exit the process with status 128, just like the stock builtin die handler. An internal flag is set to prevent any further die()'s that may be invoked during the crash report generator from causing us to enter into an infinite loop. We shouldn't die() from our crash report handler, but just in case someone makes a future code change we are prepared to gaurd against small mistakes turning into huge problems for the end-user. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-08-03 08:00:37 +02:00			`{`
cast pid_t's to uintmax_t to improve portability Some systems (like e.g. OpenSolaris) define pid_t as long, therefore all our sprintf that use %i/%d cause a compiler warning beacuse of the implicit long->int cast. To make sure that we fit the limits, we display pids as PRIuMAX and cast them explicitly to uintmax_t. Signed-off-by: David Soria Parra <dsp@php.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-08-31 14:09:39 +02:00			`char *loc = git_path("fast_import_crash_%"PRIuMAX, (uintmax_t) getpid());`
Generate crash reports on die in fast-import As fast-import is quite strict about its input and die()'s anytime something goes wrong it can be difficult for a frontend developer to troubleshoot why fast-import rejected their input, or to even determine what input command it rejected. This change introduces a custom handler for Git's die() routine. When we receive a die() for any reason (fast-import or a lower level core Git routine we called) the error is first dumped onto stderr and then a more extensive crash report file is prepared in GIT_DIR. Finally we exit the process with status 128, just like the stock builtin die handler. An internal flag is set to prevent any further die()'s that may be invoked during the crash report generator from causing us to enter into an infinite loop. We shouldn't die() from our crash report handler, but just in case someone makes a future code change we are prepared to gaurd against small mistakes turning into huge problems for the end-user. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-08-03 08:00:37 +02:00			`FILE *rpt = fopen(loc, "w");`
			`struct branch *b;`
			`unsigned long lu;`
Include recent command history in fast-import crash reports When we crash the frontend developer (or end-user) may need to know roughly around what part of the input stream we had a problem with and aborted on. Because line numbers aren't very useful in this sort of application we instead just keep the last 100 commands in a FIFO queue and print them as part of the crash report. Currently one problem with this design is a commit that has more than 100 modified files in it will flood the FIFO and any context regarding branch/from/committer/mark/comments will be lost. We really should save only the last few (10?) file changes for the current commit, ensuring we have some prior higher level commands in the FIFO when we crash on a file M/D/C/R command. Another issue with this approach is the FIFO only includes the commands, it does not include the commit messages. Yet having a commit message may be useful to help locate the relevant change in the source material. In practice I don't think this is going to be a major concern as the frontend can always embed its own source change set identifier as a comment (which will appear in the crash report) and the commit message(s) for the most recent commits of any given branch should be obtainable from the (packed) commit objects. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-08-03 10:47:04 +02:00			`struct recent_command *rc;`
Generate crash reports on die in fast-import As fast-import is quite strict about its input and die()'s anytime something goes wrong it can be difficult for a frontend developer to troubleshoot why fast-import rejected their input, or to even determine what input command it rejected. This change introduces a custom handler for Git's die() routine. When we receive a die() for any reason (fast-import or a lower level core Git routine we called) the error is first dumped onto stderr and then a more extensive crash report file is prepared in GIT_DIR. Finally we exit the process with status 128, just like the stock builtin die handler. An internal flag is set to prevent any further die()'s that may be invoked during the crash report generator from causing us to enter into an infinite loop. We shouldn't die() from our crash report handler, but just in case someone makes a future code change we are prepared to gaurd against small mistakes turning into huge problems for the end-user. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-08-03 08:00:37 +02:00
			`if (!rpt) {`
			`error("can't write crash report %s: %s", loc, strerror(errno));`
			`return;`
			`}`

			`fprintf(stderr, "fast-import: dumping crash report to %s\n", loc);`

			`fprintf(rpt, "fast-import crash report:\n");`
cast pid_t's to uintmax_t to improve portability Some systems (like e.g. OpenSolaris) define pid_t as long, therefore all our sprintf that use %i/%d cause a compiler warning beacuse of the implicit long->int cast. To make sure that we fit the limits, we display pids as PRIuMAX and cast them explicitly to uintmax_t. Signed-off-by: David Soria Parra <dsp@php.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-08-31 14:09:39 +02:00			`fprintf(rpt, " fast-import process: %"PRIuMAX"\n", (uintmax_t) getpid());`
			`fprintf(rpt, " parent process : %"PRIuMAX"\n", (uintmax_t) getppid());`
Generate crash reports on die in fast-import As fast-import is quite strict about its input and die()'s anytime something goes wrong it can be difficult for a frontend developer to troubleshoot why fast-import rejected their input, or to even determine what input command it rejected. This change introduces a custom handler for Git's die() routine. When we receive a die() for any reason (fast-import or a lower level core Git routine we called) the error is first dumped onto stderr and then a more extensive crash report file is prepared in GIT_DIR. Finally we exit the process with status 128, just like the stock builtin die handler. An internal flag is set to prevent any further die()'s that may be invoked during the crash report generator from causing us to enter into an infinite loop. We shouldn't die() from our crash report handler, but just in case someone makes a future code change we are prepared to gaurd against small mistakes turning into huge problems for the end-user. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-08-03 08:00:37 +02:00			`fprintf(rpt, " at %s\n", show_date(time(NULL), 0, DATE_LOCAL));`
			`fputc('\n', rpt);`

			`fputs("fatal: ", rpt);`
Avoid using va_copy in fast-import: it seems to be unportable. [sp: minor change to use fputs, thus reducing the patch size] Signed-off-by: Alex Riesen <raa.lkml@gmail.com> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-08-21 05:38:14 +02:00			`fputs(err, rpt);`
Generate crash reports on die in fast-import As fast-import is quite strict about its input and die()'s anytime something goes wrong it can be difficult for a frontend developer to troubleshoot why fast-import rejected their input, or to even determine what input command it rejected. This change introduces a custom handler for Git's die() routine. When we receive a die() for any reason (fast-import or a lower level core Git routine we called) the error is first dumped onto stderr and then a more extensive crash report file is prepared in GIT_DIR. Finally we exit the process with status 128, just like the stock builtin die handler. An internal flag is set to prevent any further die()'s that may be invoked during the crash report generator from causing us to enter into an infinite loop. We shouldn't die() from our crash report handler, but just in case someone makes a future code change we are prepared to gaurd against small mistakes turning into huge problems for the end-user. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-08-03 08:00:37 +02:00			`fputc('\n', rpt);`

Include recent command history in fast-import crash reports When we crash the frontend developer (or end-user) may need to know roughly around what part of the input stream we had a problem with and aborted on. Because line numbers aren't very useful in this sort of application we instead just keep the last 100 commands in a FIFO queue and print them as part of the crash report. Currently one problem with this design is a commit that has more than 100 modified files in it will flood the FIFO and any context regarding branch/from/committer/mark/comments will be lost. We really should save only the last few (10?) file changes for the current commit, ensuring we have some prior higher level commands in the FIFO when we crash on a file M/D/C/R command. Another issue with this approach is the FIFO only includes the commands, it does not include the commit messages. Yet having a commit message may be useful to help locate the relevant change in the source material. In practice I don't think this is going to be a major concern as the frontend can always embed its own source change set identifier as a comment (which will appear in the crash report) and the commit message(s) for the most recent commits of any given branch should be obtainable from the (packed) commit objects. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-08-03 10:47:04 +02:00			`fputc('\n', rpt);`
			`fputs("Most Recent Commands Before Crash\n", rpt);`
			`fputs("---------------------------------\n", rpt);`
			`for (rc = cmd_hist.next; rc != &cmd_hist; rc = rc->next) {`
			`if (rc->next == &cmd_hist)`
			`fputs("* ", rpt);`
			`else`
			`fputs("  ", rpt);`
			`fputs(rc->buf, rpt);`
			`fputc('\n', rpt);`
			`}`

Generate crash reports on die in fast-import As fast-import is quite strict about its input and die()'s anytime something goes wrong it can be difficult for a frontend developer to troubleshoot why fast-import rejected their input, or to even determine what input command it rejected. This change introduces a custom handler for Git's die() routine. When we receive a die() for any reason (fast-import or a lower level core Git routine we called) the error is first dumped onto stderr and then a more extensive crash report file is prepared in GIT_DIR. Finally we exit the process with status 128, just like the stock builtin die handler. An internal flag is set to prevent any further die()'s that may be invoked during the crash report generator from causing us to enter into an infinite loop. We shouldn't die() from our crash report handler, but just in case someone makes a future code change we are prepared to gaurd against small mistakes turning into huge problems for the end-user. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-08-03 08:00:37 +02:00			`fputc('\n', rpt);`
			`fputs("Active Branch LRU\n", rpt);`
			`fputs("-----------------\n", rpt);`
			`fprintf(rpt, " active_branches = %lu cur, %lu max\n",`
			`cur_active_branches,`
			`max_active_branches);`
			`fputc('\n', rpt);`
			`fputs(" pos clock name\n", rpt);`
			`fputs(" ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n", rpt);`
			`for (b = active_branches, lu = 0; b; b = b->active_next_branch)`
			`fprintf(rpt, " %2lu) %6" PRIuMAX" %s\n",`
			`++lu, b->last_commit, b->name);`

			`fputc('\n', rpt);`
			`fputs("Inactive Branches\n", rpt);`
			`fputs("-----------------\n", rpt);`
			`for (lu = 0; lu < branch_table_sz; lu++) {`
			`for (b = branch_table[lu]; b; b = b->table_next_branch)`
			`write_branch_report(rpt, b);`
			`}`

Include annotated tags in fast-import crash reports If annotated tags were created they exist in a different namespace within the fast-import process' internal memory tables so we did not export them in the inactive branch table. Now they are written out after the branches, in the order that they were defined by the frontend process. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-02-14 07:34:36 +01:00			`if (first_tag) {`
			`struct tag *tg;`
			`fputc('\n', rpt);`
			`fputs("Annotated Tags\n", rpt);`
			`fputs("--------------\n", rpt);`
			`for (tg = first_tag; tg; tg = tg->next_tag) {`
			`fputs(sha1_to_hex(tg->sha1), rpt);`
			`fputc(' ', rpt);`
			`fputs(tg->name, rpt);`
			`fputc('\n', rpt);`
			`}`
			`}`

Include the fast-import marks table in crash reports If fast-import was not run with --export-marks but we are crashing the frontend application developer may still benefit from having that information available to them. We now include the marks table as part of the crash report if --export-marks was not supplied on the command line. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-02-14 07:34:40 +01:00			`fputc('\n', rpt);`
			`fputs("Marks\n", rpt);`
			`fputs("-----\n", rpt);`
fast-import: put marks reading in its own function All options do nothing but set settings, with the exception of the --input-marks option. Delay the reading of the marks file till after all options have been parsed. Also, rename mark_file to export_marks_file as it is now ambiguous. Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-12-04 18:06:55 +01:00			`if (export_marks_file)`
			`fprintf(rpt, " exported to %s\n", export_marks_file);`
Include the fast-import marks table in crash reports If fast-import was not run with --export-marks but we are crashing the frontend application developer may still benefit from having that information available to them. We now include the marks table as part of the crash report if --export-marks was not supplied on the command line. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-02-14 07:34:40 +01:00			`else`
			`dump_marks_helper(rpt, 0, marks);`

Generate crash reports on die in fast-import As fast-import is quite strict about its input and die()'s anytime something goes wrong it can be difficult for a frontend developer to troubleshoot why fast-import rejected their input, or to even determine what input command it rejected. This change introduces a custom handler for Git's die() routine. When we receive a die() for any reason (fast-import or a lower level core Git routine we called) the error is first dumped onto stderr and then a more extensive crash report file is prepared in GIT_DIR. Finally we exit the process with status 128, just like the stock builtin die handler. An internal flag is set to prevent any further die()'s that may be invoked during the crash report generator from causing us to enter into an infinite loop. We shouldn't die() from our crash report handler, but just in case someone makes a future code change we are prepared to gaurd against small mistakes turning into huge problems for the end-user. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-08-03 08:00:37 +02:00			`fputc('\n', rpt);`
			`fputs("-------------------\n", rpt);`
			`fputs("END OF CRASH REPORT\n", rpt);`
			`fclose(rpt);`
			`}`
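
The "Most Recent Commands Before Crash" loop above walks a circular list anchored at `cmd_hist`. A minimal sketch of how such a bounded history can be maintained is shown below. The names `struct recent`, `hist`, `hist_len`, `hist_max` and `remember()` are hypothetical and chosen for illustration; they are not taken from fast-import.c, and the limit of 3 stands in for the 100 commands mentioned in the commit message.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a recent-command record. */
struct recent {
	struct recent *prev;
	struct recent *next;
	char *buf;
};

/* Sentinel node: an empty list points at itself in both directions. */
static struct recent hist = { &hist, &hist, NULL };
static unsigned int hist_len;
static unsigned int hist_max = 3; /* assumption: small limit for the demo */

/* Append a copy of line; once the list is full, recycle the oldest node. */
static void remember(const char *line)
{
	struct recent *rc;
	size_t n = strlen(line) + 1;

	if (hist_len < hist_max) {
		rc = malloc(sizeof(*rc));
		if (!rc)
			abort();
		hist_len++;
	} else {
		/* Unlink the oldest entry, which sits right after the sentinel. */
		rc = hist.next;
		hist.next = rc->next;
		rc->next->prev = &hist;
		free(rc->buf);
	}

	rc->buf = malloc(n);
	if (!rc->buf)
		abort();
	memcpy(rc->buf, line, n);

	/* Link the node in just before the sentinel, i.e. at the tail. */
	rc->prev = hist.prev;
	rc->next = &hist;
	hist.prev->next = rc;
	hist.prev = rc;
}
```

With this layout the newest entry is always the one whose `next` is the sentinel, which is the condition the report loop uses to decide where to print the `*` marker.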

Finish current packfile during fast-import crash handler If fast-import is in the middle of crashing due to a protocol error or something like that then it can be very useful to have the mark table and all objects up until that point be available for a new import to resume from. Currently we just close the active packfile, unkeep all of our newly created packfiles (so they can be deleted), and dump the marks table to a temporary file. We don't attempt to update the refs/tags that the process has in memory as much of that data can be found in the crash report and I'm not sure it would be the right thing to do under every type of crash. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-02-14 07:34:43 +01:00			`static void end_packfile(void);`
			`static void unkeep_all_packs(void);`
			`static void dump_marks(void);`

Generate crash reports on die in fast-import As fast-import is quite strict about its input and die()'s anytime something goes wrong it can be difficult for a frontend developer to troubleshoot why fast-import rejected their input, or to even determine what input command it rejected. This change introduces a custom handler for Git's die() routine. When we receive a die() for any reason (fast-import or a lower level core Git routine we called) the error is first dumped onto stderr and then a more extensive crash report file is prepared in GIT_DIR. Finally we exit the process with status 128, just like the stock builtin die handler. An internal flag is set to prevent any further die()'s that may be invoked during the crash report generator from causing us to enter into an infinite loop. We shouldn't die() from our crash report handler, but just in case someone makes a future code change we are prepared to gaurd against small mistakes turning into huge problems for the end-user. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-08-03 08:00:37 +02:00			`static NORETURN void die_nicely(const char *err, va_list params)`
			`{`
			`static int zombie;`
Avoid using va_copy in fast-import: it seems to be unportable. [sp: minor change to use fputs, thus reducing the patch size] Signed-off-by: Alex Riesen <raa.lkml@gmail.com> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-08-21 05:38:14 +02:00			`char message[2 * PATH_MAX];`
Generate crash reports on die in fast-import As fast-import is quite strict about its input and die()'s anytime something goes wrong it can be difficult for a frontend developer to troubleshoot why fast-import rejected their input, or to even determine what input command it rejected. This change introduces a custom handler for Git's die() routine. When we receive a die() for any reason (fast-import or a lower level core Git routine we called) the error is first dumped onto stderr and then a more extensive crash report file is prepared in GIT_DIR. Finally we exit the process with status 128, just like the stock builtin die handler. An internal flag is set to prevent any further die()'s that may be invoked during the crash report generator from causing us to enter into an infinite loop. We shouldn't die() from our crash report handler, but just in case someone makes a future code change we are prepared to gaurd against small mistakes turning into huge problems for the end-user. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-08-03 08:00:37 +02:00
Avoid using va_copy in fast-import: it seems to be unportable. [sp: minor change to use fputs, thus reducing the patch size] Signed-off-by: Alex Riesen <raa.lkml@gmail.com> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-08-21 05:38:14 +02:00			`vsnprintf(message, sizeof(message), err, params);`
Generate crash reports on die in fast-import As fast-import is quite strict about its input and die()'s anytime something goes wrong it can be difficult for a frontend developer to troubleshoot why fast-import rejected their input, or to even determine what input command it rejected. This change introduces a custom handler for Git's die() routine. When we receive a die() for any reason (fast-import or a lower level core Git routine we called) the error is first dumped onto stderr and then a more extensive crash report file is prepared in GIT_DIR. Finally we exit the process with status 128, just like the stock builtin die handler. An internal flag is set to prevent any further die()'s that may be invoked during the crash report generator from causing us to enter into an infinite loop. We shouldn't die() from our crash report handler, but just in case someone makes a future code change we are prepared to gaurd against small mistakes turning into huge problems for the end-user. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-08-03 08:00:37 +02:00			`fputs("fatal: ", stderr);`
Avoid using va_copy in fast-import: it seems to be unportable. [sp: minor change to use fputs, thus reducing the patch size] Signed-off-by: Alex Riesen <raa.lkml@gmail.com> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-08-21 05:38:14 +02:00			`fputs(message, stderr);`
Generate crash reports on die in fast-import As fast-import is quite strict about its input and die()'s anytime something goes wrong it can be difficult for a frontend developer to troubleshoot why fast-import rejected their input, or to even determine what input command it rejected. This change introduces a custom handler for Git's die() routine. When we receive a die() for any reason (fast-import or a lower level core Git routine we called) the error is first dumped onto stderr and then a more extensive crash report file is prepared in GIT_DIR. Finally we exit the process with status 128, just like the stock builtin die handler. An internal flag is set to prevent any further die()'s that may be invoked during the crash report generator from causing us to enter into an infinite loop. We shouldn't die() from our crash report handler, but just in case someone makes a future code change we are prepared to gaurd against small mistakes turning into huge problems for the end-user. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-08-03 08:00:37 +02:00			`fputc('\n', stderr);`

			`if (!zombie) {`
			`zombie = 1;`
Avoid using va_copy in fast-import: it seems to be unportable. [sp: minor change to use fputs, thus reducing the patch size] Signed-off-by: Alex Riesen <raa.lkml@gmail.com> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-08-21 05:38:14 +02:00			`write_crash_report(message);`
Finish current packfile during fast-import crash handler If fast-import is in the middle of crashing due to a protocol error or something like that then it can be very useful to have the mark table and all objects up until that point be available for a new import to resume from. Currently we just close the active packfile, unkeep all of our newly created packfiles (so they can be deleted), and dump the marks table to a temporary file. We don't attempt to update the refs/tags that the process has in memory as much of that data can be found in the crash report and I'm not sure it would be the right thing to do under every type of crash. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-02-14 07:34:43 +01:00			`end_packfile();`
			`unkeep_all_packs();`
			`dump_marks();`
Generate crash reports on die in fast-import As fast-import is quite strict about its input and die()'s anytime something goes wrong it can be difficult for a frontend developer to troubleshoot why fast-import rejected their input, or to even determine what input command it rejected. This change introduces a custom handler for Git's die() routine. When we receive a die() for any reason (fast-import or a lower level core Git routine we called) the error is first dumped onto stderr and then a more extensive crash report file is prepared in GIT_DIR. Finally we exit the process with status 128, just like the stock builtin die handler. An internal flag is set to prevent any further die()'s that may be invoked during the crash report generator from causing us to enter into an infinite loop. We shouldn't die() from our crash report handler, but just in case someone makes a future code change we are prepared to gaurd against small mistakes turning into huge problems for the end-user. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-08-03 08:00:37 +02:00			`}`
			`exit(128);`
			`}`
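
The `zombie` flag in `die_nicely()` guards against a second fatal error raised while the crash report is being written. The sketch below isolates that pattern under some simplifying assumptions: `handle_fatal()` returns the exit status instead of calling `exit()`, and `write_report()`, `reports_written` and `fail_inside_report` are hypothetical helpers that exist only to make the behaviour observable.

```c
#include <stdio.h>

static int reports_written;    /* hypothetical counter, for observation only */
static int fail_inside_report; /* hypothetical switch: simulate a nested failure */

static int handle_fatal(const char *msg);

/* Stand-in for the report writer; it may itself hit a fatal error. */
static void write_report(const char *msg)
{
	reports_written++;
	fprintf(stderr, "report: %s\n", msg);
	if (fail_inside_report)
		handle_fatal("report writer failed");
}

/*
 * Print the message, then run the expensive cleanup at most once.
 * A nested call sees zombie already set and skips straight to the return,
 * so the handler cannot recurse indefinitely.
 */
static int handle_fatal(const char *msg)
{
	static int zombie;

	fprintf(stderr, "fatal: %s\n", msg);
	if (!zombie) {
		zombie = 1;
		write_report(msg);
	}
	return 128;
}
```

Setting the flag before calling the report writer, rather than after, is what makes the guard effective: a failure inside the writer re-enters the handler with the flag already set.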
Implemented branch handling and basic tree support in fast-import. This provides the basic data structures needed to store trees in memory while we are processing them for a branch. What we are attempting to do is track one complete tree for each branch that the frontend has registered with us through the 'newb' (new_branch) command. When the frontend edits that tree through 'updf' or 'delf' commands we'll mark the affected tree(s) as being dirty and recompute their objects during 'comt' (commit). Currently the protocol is decidedly _not_ user friendly. I crashed fast-import by giving it bad input data from Perl. I may try to improve upon it, or at least upon its error handling. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 09:36:45 +02:00
fast-import: treat SIGUSR1 as a request to access objects early It can be tedious to wait for a multi-million-revision import. Unfortunately it is hard to spy on the import because fast-import works by continuously streaming out objects, without updating the pack index or refs until a checkpoint command or the end of the stream. So allow the impatient operator to request checkpoints by sending a signal, like so: killall -USR1 git-fast-import When receiving such a signal, fast-import would schedule a checkpoint to take place after the current top-level command (usually a "commit" or "blob" request) finishes. Caveats: just like ordinary checkpoint commands, such requests slow down the import. Switching to a new pack at a suboptimal moment is also likely to result in a less dense initial collection of packs. That's the price. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-11-22 09:16:02 +01:00			`#ifndef SIGUSR1 /* Windows, for example */`

			`static void set_checkpoint_signal(void)`
			`{`
			`}`

			`#else`

			`static void checkpoint_signal(int signo)`
			`{`
			`checkpoint_requested = 1;`
			`}`

			`static void set_checkpoint_signal(void)`
			`{`
			`struct sigaction sa;`

			`memset(&sa, 0, sizeof(sa));`
			`sa.sa_handler = checkpoint_signal;`
			`sigemptyset(&sa.sa_mask);`
			`sa.sa_flags = SA_RESTART;`
			`sigaction(SIGUSR1, &sa, NULL);`
			`}`

			`#endif`
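
The handler above only records that a checkpoint was requested; the work itself is deferred until the current top-level command has finished. A minimal sketch of that deferral, assuming a POSIX system, follows. The names `on_sigusr1()`, `install_checkpoint_handler()`, `checkpoint_if_requested()` and `checkpoints_done` are hypothetical, and the "checkpoint" here is just a counter increment.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Only a sig_atomic_t flag is touched from inside the signal handler. */
static volatile sig_atomic_t checkpoint_requested;
static int checkpoints_done; /* hypothetical counter standing in for real work */

static void on_sigusr1(int signo)
{
	(void)signo;
	checkpoint_requested = 1;
}

/* Returns 0 on success, -1 on failure (as sigaction() does). */
static int install_checkpoint_handler(void)
{
	struct sigaction sa;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = on_sigusr1;
	sigemptyset(&sa.sa_mask);
	sa.sa_flags = SA_RESTART; /* restart interrupted reads of the input stream */
	return sigaction(SIGUSR1, &sa, NULL);
}

/* Meant to be called between top-level commands, never from the handler. */
static void checkpoint_if_requested(void)
{
	if (checkpoint_requested) {
		checkpoint_requested = 0;
		checkpoints_done++;
	}
}
```

Because the flag is a single bit, several signals arriving during one command collapse into a single checkpoint, which seems a reasonable outcome for an operator-triggered request.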

Misc. type cleanups within fast-import. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 06:16:23 +01:00			`static void alloc_objects(unsigned int cnt)`
Added automatic index generation to fast-import. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-06 19:51:39 +02:00			`{`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`struct object_entry_pool *b;`
Cleaned up memory allocation for object_entry structs. Although its easy to ask the user to tell us how many objects they will need, its probably better to dynamically grow the object table in large units. But if the user can give us a hint as to roughly how many objects then we can still use it during startup. Also stopped printing the SHA1 strings to stdout as no user is currently making use of that facility. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 06:03:59 +02:00
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`b = xmalloc(sizeof(struct object_entry_pool)`
Cleaned up memory allocation for object_entry structs. Although its easy to ask the user to tell us how many objects they will need, its probably better to dynamically grow the object table in large units. But if the user can give us a hint as to roughly how many objects then we can still use it during startup. Also stopped printing the SHA1 strings to stdout as no user is currently making use of that facility. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 06:03:59 +02:00			`+ cnt * sizeof(struct object_entry));`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`b->next_pool = blocks;`
Cleaned up memory allocation for object_entry structs. Although its easy to ask the user to tell us how many objects they will need, its probably better to dynamically grow the object table in large units. But if the user can give us a hint as to roughly how many objects then we can still use it during startup. Also stopped printing the SHA1 strings to stdout as no user is currently making use of that facility. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 06:03:59 +02:00			`b->next_free = b->entries;`
			`b->end = b->entries + cnt;`
			`blocks = b;`
			`alloc_count += cnt;`
			`}`
Added automatic index generation to fast-import. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-06 19:51:39 +02:00
Correct minor style issue in fast-import. Junio noticed that I was using a different style in fast-import for returned pointers than the rest of Git. Before merging this code into the main git.git tree I'd like to make it consistent, as this style variation was not intentional. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-06 06:43:59 +01:00			`static struct object_entry *new_object(unsigned char *sha1)`
Added automatic index generation to fast-import. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-06 19:51:39 +02:00			`{`
Cleaned up memory allocation for object_entry structs. Although its easy to ask the user to tell us how many objects they will need, its probably better to dynamically grow the object table in large units. But if the user can give us a hint as to roughly how many objects then we can still use it during startup. Also stopped printing the SHA1 strings to stdout as no user is currently making use of that facility. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 06:03:59 +02:00			`struct object_entry *e;`
Added automatic index generation to fast-import. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-06 19:51:39 +02:00
Cleaned up memory allocation for object_entry structs. Although its easy to ask the user to tell us how many objects they will need, its probably better to dynamically grow the object table in large units. But if the user can give us a hint as to roughly how many objects then we can still use it during startup. Also stopped printing the SHA1 strings to stdout as no user is currently making use of that facility. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 06:03:59 +02:00			`if (blocks->next_free == blocks->end)`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`alloc_objects(object_entry_alloc);`
Added automatic index generation to fast-import. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-06 19:51:39 +02:00
Cleaned up memory allocation for object_entry structs. Although its easy to ask the user to tell us how many objects they will need, its probably better to dynamically grow the object table in large units. But if the user can give us a hint as to roughly how many objects then we can still use it during startup. Also stopped printing the SHA1 strings to stdout as no user is currently making use of that facility. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 06:03:59 +02:00			`e = blocks->next_free++;`
fast-import: start using struct pack_idx_entry This is in preparation for using write_idx_file(). Signed-off-by: Nicolas Pitre <nico@fluxnic.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-02-17 20:05:51 +01:00			`hashcpy(e->idx.sha1, sha1);`
Cleaned up memory allocation for object_entry structs. Although its easy to ask the user to tell us how many objects they will need, its probably better to dynamically grow the object table in large units. But if the user can give us a hint as to roughly how many objects then we can still use it during startup. Also stopped printing the SHA1 strings to stdout as no user is currently making use of that facility. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 06:03:59 +02:00			`return e;`
Added automatic index generation to fast-import. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-06 19:51:39 +02:00			`}`

Correct minor style issue in fast-import. Junio noticed that I was using a different style in fast-import for returned pointers than the rest of Git. Before merging this code into the main git.git tree I'd like to make it consistent, as this style variation was not intentional. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-06 06:43:59 +01:00			`static struct object_entry *find_object(unsigned char *sha1)`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`{`
			`unsigned int h = sha1[0] << 8 | sha1[1];`
			`struct object_entry *e;`
			`for (e = object_table[h]; e; e = e->next)`
fast-import: start using struct pack_idx_entry This is in preparation for using write_idx_file(). Signed-off-by: Nicolas Pitre <nico@fluxnic.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-02-17 20:05:51 +01:00			`if (!hashcmp(sha1, e->idx.sha1))`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`return e;`
			`return NULL;`
			`}`

Correct minor style issue in fast-import. Junio noticed that I was using a different style in fast-import for returned pointers than the rest of Git. Before merging this code into the main git.git tree I'd like to make it consistent, as this style variation was not intentional. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-06 06:43:59 +01:00			`static struct object_entry *insert_object(unsigned char *sha1)`
Added automatic index generation to fast-import. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-06 19:51:39 +02:00			`{`
			`unsigned int h = sha1[0] << 8 | sha1[1];`
Cleaned up memory allocation for object_entry structs. Although its easy to ask the user to tell us how many objects they will need, its probably better to dynamically grow the object table in large units. But if the user can give us a hint as to roughly how many objects then we can still use it during startup. Also stopped printing the SHA1 strings to stdout as no user is currently making use of that facility. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 06:03:59 +02:00			`struct object_entry *e = object_table[h];`
Added automatic index generation to fast-import. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-06 19:51:39 +02:00
			`while (e) {`
fast-import: start using struct pack_idx_entry This is in preparation for using write_idx_file(). Signed-off-by: Nicolas Pitre <nico@fluxnic.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-02-17 20:05:51 +01:00			`if (!hashcmp(sha1, e->idx.sha1))`
Added automatic index generation to fast-import. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-06 19:51:39 +02:00			`return e;`
			`e = e->next;`
			`}`

			`e = new_object(sha1);`
fast-import: insert new object entries at start of hash bucket More often than not, find_object is called for recently inserted objects. Optimise for this case by inserting new entries at the start of the chain. This doesn't affect the cost of new inserts but reduces the cost of find and insert for existing object entries. Signed-off-by: David Barr <david.barr@cordelta.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-11-23 08:53:48 +01:00			`e->next = object_table[h];`
fast-import: start using struct pack_idx_entry This is in preparation for using write_idx_file(). Signed-off-by: Nicolas Pitre <nico@fluxnic.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-02-17 20:05:51 +01:00			`e->idx.offset = 0;`
fast-import: insert new object entries at start of hash bucket More often than not, find_object is called for recently inserted objects. Optimise for this case by inserting new entries at the start of the chain. This doesn't affect the cost of new inserts but reduces the cost of find and insert for existing object entries. Signed-off-by: David Barr <david.barr@cordelta.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-11-23 08:53:48 +01:00			`object_table[h] = e;`
Added automatic index generation to fast-import. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-06 19:51:39 +02:00			`return e;`
			`}`
Created fast-import, a tool to quickly generating a pack from blobs. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-05 08:04:21 +02:00
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`static unsigned int hc_str(const char *s, size_t len)`
			`{`
			`unsigned int r = 0;`
			`while (len-- > 0)`
			`r = r * 31 + *s++;`
			`return r;`
			`}`

Correct minor style issue in fast-import. Junio noticed that I was using a different style in fast-import for returned pointers than the rest of Git. Before merging this code into the main git.git tree I'd like to make it consistent, as this style variation was not intentional. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-06 06:43:59 +01:00			`static void *pool_alloc(size_t len)`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`{`
			`struct mem_pool *p;`
			`void *r;`

git-fast-import possible memory corruption problem Internal "allocate in bulk, we will never free this memory anyway" allocator used in fast-import had a logic to round up the size of the requested memory block in a wrong place (it computed if the available space is enough to fit the request first, and then carved a chunk of memory by size rounded up to the alignment, which could go beyond the actually available space). Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-12-14 03:08:22 +01:00			`/* round up to a 'uintmax_t' alignment */`
			`if (len & (sizeof(uintmax_t) - 1))`
			`len += sizeof(uintmax_t) - (len & (sizeof(uintmax_t) - 1));`

Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`for (p = mem_pool; p; p = p->next_pool)`
			`if ((p->end - p->next_free >= len))`
			`break;`

			`if (!p) {`
			`if (len >= (mem_pool_alloc/2)) {`
			`total_allocd += len;`
			`return xmalloc(len);`
			`}`
			`total_allocd += sizeof(struct mem_pool) + mem_pool_alloc;`
			`p = xmalloc(sizeof(struct mem_pool) + mem_pool_alloc);`
			`p->next_pool = mem_pool;`
fast-import: fix unalinged allocation and access The specialized pool allocator fast-import uses aligned objects on the size of a pointer, which was not sufficient at least on Sparc. Instead, make the alignment for objects of type unitmax_t. Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-12-15 05:39:16 +01:00			`p->next_free = (char *) p->space;`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`p->end = p->next_free + mem_pool_alloc;`
			`mem_pool = p;`
			`}`

			`r = p->next_free;`
			`p->next_free += len;`
			`return r;`
			`}`

Correct minor style issue in fast-import. Junio noticed that I was using a different style in fast-import for returned pointers than the rest of Git. Before merging this code into the main git.git tree I'd like to make it consistent, as this style variation was not intentional. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-06 06:43:59 +01:00			`static void *pool_calloc(size_t count, size_t size)`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`{`
			`size_t len = count * size;`
			`void *r = pool_alloc(len);`
			`memset(r, 0, len);`
			`return r;`
			`}`

Correct minor style issue in fast-import. Junio noticed that I was using a different style in fast-import for returned pointers than the rest of Git. Before merging this code into the main git.git tree I'd like to make it consistent, as this style variation was not intentional. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-06 06:43:59 +01:00			`static char *pool_strdup(const char *s)`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`{`
			`char *r = pool_alloc(strlen(s) + 1);`
			`strcpy(r, s);`
			`return r;`
			`}`

Use uintmax_t for marks in fast-import. If a frontend wants to use a mark per file revision and per commit and is doing a truly huge import (such as a 32 GiB SVN repository) we may need more than 2**32 unique mark values, especially if the frontend is unable (or unwilling) to recycle mark values. For mark idnums we should use the largest unsigned integer type available, hoping that will be at least 64 bits when we are compiled as a 64 bit executable. This way we may consume huge amounts of memory storing our mark table, but we'll at least be able to process the entire import without failing. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-16 06:33:19 +01:00			`static void insert_mark(uintmax_t idnum, struct object_entry *oe)`
Added mark store/find to fast-import. Marks are now saved when the mark directive gets used by the frontend and may be used in place of a SHA1 expression to locate a previous SHA1 which fast-import may have generated. This is particularly useful with commits where the frontend does not (easily) have the ability to compute the SHA1 for an arbitrary commit but needs it to generate a branch or tag from that commit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-23 10:17:45 +02:00			`{`
			`struct mark_set *s = marks;`
			`while ((idnum >> s->shift) >= 1024) {`
			`s = pool_calloc(1, sizeof(struct mark_set));`
			`s->shift = marks->shift + 10;`
			`s->data.sets[0] = marks;`
			`marks = s;`
			`}`
			`while (s->shift) {`
Use uintmax_t for marks in fast-import. If a frontend wants to use a mark per file revision and per commit and is doing a truly huge import (such as a 32 GiB SVN repository) we may need more than 2**32 unique mark values, especially if the frontend is unable (or unwilling) to recycle mark values. For mark idnums we should use the largest unsigned integer type available, hoping that will be at least 64 bits when we are compiled as a 64 bit executable. This way we may consume huge amounts of memory storing our mark table, but we'll at least be able to process the entire import without failing. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-16 06:33:19 +01:00			`uintmax_t i = idnum >> s->shift;`
Added mark store/find to fast-import. Marks are now saved when the mark directive gets used by the frontend and may be used in place of a SHA1 expression to locate a previous SHA1 which fast-import may have generated. This is particularly useful with commits where the frontend does not (easily) have the ability to compute the SHA1 for an arbitrary commit but needs it to generate a branch or tag from that commit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-23 10:17:45 +02:00			`idnum -= i << s->shift;`
			`if (!s->data.sets[i]) {`
			`s->data.sets[i] = pool_calloc(1, sizeof(struct mark_set));`
			`s->data.sets[i]->shift = s->shift - 10;`
			`}`
			`s = s->data.sets[i];`
			`}`
			`if (!s->data.marked[idnum])`
			`marks_set_count++;`
			`s->data.marked[idnum] = oe;`
			`}`

Correct minor style issue in fast-import. Junio noticed that I was using a different style in fast-import for returned pointers than the rest of Git. Before merging this code into the main git.git tree I'd like to make it consistent, as this style variation was not intentional. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-06 06:43:59 +01:00			`static struct object_entry *find_mark(uintmax_t idnum)`
Added mark store/find to fast-import. Marks are now saved when the mark directive gets used by the frontend and may be used in place of a SHA1 expression to locate a previous SHA1 which fast-import may have generated. This is particularly useful with commits where the frontend does not (easily) have the ability to compute the SHA1 for an arbitrary commit but needs it to generate a branch or tag from that commit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-23 10:17:45 +02:00			`{`
Use uintmax_t for marks in fast-import. If a frontend wants to use a mark per file revision and per commit and is doing a truly huge import (such as a 32 GiB SVN repository) we may need more than 2**32 unique mark values, especially if the frontend is unable (or unwilling) to recycle mark values. For mark idnums we should use the largest unsigned integer type available, hoping that will be at least 64 bits when we are compiled as a 64 bit executable. This way we may consume huge amounts of memory storing our mark table, but we'll at least be able to process the entire import without failing. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-16 06:33:19 +01:00			`uintmax_t orig_idnum = idnum;`
Added mark store/find to fast-import. Marks are now saved when the mark directive gets used by the frontend and may be used in place of a SHA1 expression to locate a previous SHA1 which fast-import may have generated. This is particularly useful with commits where the frontend does not (easily) have the ability to compute the SHA1 for an arbitrary commit but needs it to generate a branch or tag from that commit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-23 10:17:45 +02:00			`struct mark_set *s = marks;`
			`struct object_entry *oe = NULL;`
			`if ((idnum >> s->shift) < 1024) {`
			`while (s && s->shift) {`
Use uintmax_t for marks in fast-import. If a frontend wants to use a mark per file revision and per commit and is doing a truly huge import (such as a 32 GiB SVN repository) we may need more than 2**32 unique mark values, especially if the frontend is unable (or unwilling) to recycle mark values. For mark idnums we should use the largest unsigned integer type available, hoping that will be at least 64 bits when we are compiled as a 64 bit executable. This way we may consume huge amounts of memory storing our mark table, but we'll at least be able to process the entire import without failing. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-16 06:33:19 +01:00			`uintmax_t i = idnum >> s->shift;`
Added mark store/find to fast-import. Marks are now saved when the mark directive gets used by the frontend and may be used in place of a SHA1 expression to locate a previous SHA1 which fast-import may have generated. This is particularly useful with commits where the frontend does not (easily) have the ability to compute the SHA1 for an arbitrary commit but needs it to generate a branch or tag from that commit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-23 10:17:45 +02:00			`idnum -= i << s->shift;`
			`s = s->data.sets[i];`
			`}`
			`if (s)`
			`oe = s->data.marked[idnum];`
			`}`
			`if (!oe)`
Check for PRIuMAX rather than NO_C99_FORMAT in fast-import.c. Thanks to Simon 'corecode' Schubert <corecode@fs.ei.tum.de> for the clean-up. Defining the C99 standard PRIuMAX when necessary replaces UM_FMT and the awkward UM10_FMT. There are no direct C99 translations for other uses of NO_C99_FORMAT in git, alas. Signed-off-by: Jason Riedy <ejr@cs.berkeley.edu> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-21 02:34:56 +01:00			`die("mark :%" PRIuMAX " not declared", orig_idnum);`
Added mark store/find to fast-import. Marks are now saved when the mark directive gets used by the frontend and may be used in place of a SHA1 expression to locate a previous SHA1 which fast-import may have generated. This is particularly useful with commits where the frontend does not (easily) have the ability to compute the SHA1 for an arbitrary commit but needs it to generate a branch or tag from that commit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-23 10:17:45 +02:00			`return oe;`
			`}`

Correct minor style issue in fast-import. Junio noticed that I was using a different style in fast-import for returned pointers than the rest of Git. Before merging this code into the main git.git tree I'd like to make it consistent, as this style variation was not intentional. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-06 06:43:59 +01:00			`static struct atom_str *to_atom(const char *s, unsigned short len)`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`{`
			`unsigned int hc = hc_str(s, len) % atom_table_sz;`
			`struct atom_str *c;`

			`for (c = atom_table[hc]; c; c = c->next_atom)`
			`if (c->str_len == len && !strncmp(s, c->str_dat, len))`
			`return c;`

			`c = pool_alloc(sizeof(struct atom_str) + len + 1);`
			`c->str_len = len;`
			`strncpy(c->str_dat, s, len);`
			`c->str_dat[len] = 0;`
			`c->next_atom = atom_table[hc];`
			`atom_table[hc] = c;`
			`atom_cnt++;`
			`return c;`
			`}`

Correct minor style issue in fast-import. Junio noticed that I was using a different style in fast-import for returned pointers than the rest of Git. Before merging this code into the main git.git tree I'd like to make it consistent, as this style variation was not intentional. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-06 06:43:59 +01:00			`static struct branch *lookup_branch(const char *name)`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`{`
			`unsigned int hc = hc_str(name, strlen(name)) % branch_table_sz;`
			`struct branch *b;`

			`for (b = branch_table[hc]; b; b = b->table_next_branch)`
			`if (!strcmp(name, b->name))`
			`return b;`
			`return NULL;`
			`}`

Correct minor style issue in fast-import. Junio noticed that I was using a different style in fast-import for returned pointers than the rest of Git. Before merging this code into the main git.git tree I'd like to make it consistent, as this style variation was not intentional. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-06 06:43:59 +01:00			`static struct branch *new_branch(const char *name)`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`{`
			`unsigned int hc = hc_str(name, strlen(name)) % branch_table_sz;`
Fix a bunch of pointer declarations (codestyle) Essentially; s/type* /type */ as per the coding guidelines. Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-05-01 11:06:36 +02:00			`struct branch *b = lookup_branch(name);`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00
			`if (b)`
			`die("Invalid attempt to create duplicate branch: %s", name);`
Change check_ref_format() to take a flags argument Change check_ref_format() to take a flags argument that indicates what is acceptable in the reference name (analogous to "git check-ref-format"'s "--allow-onelevel" and "--refspec-pattern"). This is more convenient for callers and also fixes a failure in the test suite (and likely elsewhere in the code) by enabling "onelevel" and "refspec-pattern" to be allowed independently of each other. Also rename check_ref_format() to check_refname_format() to make it obvious that it deals with refnames rather than references themselves. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-09-15 23:10:25 +02:00			`if (check_refname_format(name, REFNAME_ALLOW_ONELEVEL))`
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00			`die("Branch name doesn't conform to GIT standards: %s", name);`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00
			`b = pool_calloc(1, sizeof(struct branch));`
			`b->name = pool_strdup(name);`
			`b->table_next_branch = branch_table[hc];`
Additional fast-import tree delta corruption cleanups. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-29 04:06:13 +02:00			`b->branch_tree.versions[0].mode = S_IFDIR;`
			`b->branch_tree.versions[1].mode = S_IFDIR;`
fast-import: Proper notes tree manipulation This patch teaches 'git fast-import' to automatically organize note objects in a fast-import stream into an appropriate fanout structure. The notes API in notes.h is NOT used to accomplish this, because trying to keep the fast-import and notes data structures in sync would yield a significantly larger patch with higher complexity. Note objects are added with the 'N' command, and accounted for with a per-branch counter, which is used to trigger fanout restructuring when needed. Note that when restructuring the branch tree, _any_ entry whose path consists of 40 hex chars (not including directory separators) will be recognized as a note object. It is therefore not advisable to manipulate note entries with M/D/R/C commands. Since note objects are stored in the same tree structure as other objects, the unloading and reloading of a fast-import branches handle note objects transparently. This patch has been improved by the following contributions: - Shawn O. Pearce: Several style- and logic-related improvements Cc: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Johan Herland <johan@herland.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-12-07 12:27:24 +01:00			`b->num_notes = 0;`
fast-import: Avoid infinite loop after reset Johannes Sixt noticed that a 'reset' command applied to a branch that is already active in the branch LRU cache can cause fast-import to relink the same branch into the LRU cache twice. This will cause the LRU cache to contain a cycle, making unload_one_branch run in an infinite loop as it tries to select the oldest branch for eviction. I have trivially fixed the problem by adding an active bit to each branch object; this bit indicates if the branch is already in the LRU and allows us to avoid trying to add it a second time. Converting the pack_id field into a bitfield makes this change take up no additional memory. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-03-05 18:31:09 +01:00			`b->active = 0;`
Correct packfile edge output in fast-import. Branches are only contained by a packfile if the branch actually had its most recent commit in that packfile. So new branches are set to MAX_PACK_ID to ensure they don't cause their commit to list as part of the first packfile when it closes out if the commit was actually in existance before fast-import started. Also corrected the type of last_commit to be umaxint_t to prevent overflow and wraparound on very large imports. Though that is highly unlikely to occur as we're talking 4 billion commits, which no real project has right now. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-17 08:42:43 +01:00			`b->pack_id = MAX_PACK_ID;`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`branch_table[hc] = b;`
			`branch_count++;`
			`return b;`
			`}`

			`static unsigned int hc_entries(unsigned int cnt)`
			`{`
			`cnt = cnt & 7 ? (cnt / 8) + 1 : cnt / 8;`
			`return cnt < avail_tree_table_sz ? cnt : avail_tree_table_sz - 1;`
			`}`

Correct minor style issue in fast-import. Junio noticed that I was using a different style in fast-import for returned pointers than the rest of Git. Before merging this code into the main git.git tree I'd like to make it consistent, as this style variation was not intentional. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-06 06:43:59 +01:00			`static struct tree_content *new_tree_content(unsigned int cnt)`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`{`
			`struct avail_tree_content *f, *l = NULL;`
			`struct tree_content *t;`
			`unsigned int hc = hc_entries(cnt);`

			`for (f = avail_tree_table[hc]; f; l = f, f = f->next_avail)`
			`if (f->entry_capacity >= cnt)`
			`break;`

			`if (f) {`
			`if (l)`
			`l->next_avail = f->next_avail;`
			`else`
			`avail_tree_table[hc] = f->next_avail;`
			`} else {`
			`cnt = cnt & 7 ? ((cnt / 8) + 1) * 8 : cnt;`
			`f = pool_alloc(sizeof(*t) + sizeof(t->entries[0]) * cnt);`
			`f->entry_capacity = cnt;`
			`}`

			`t = (struct tree_content*)f;`
			`t->entry_count = 0;`
Implemented tree delta compression in fast-import. We now store for every tree entry two modes and two sha1 values; the base (aka "version 0") and the current/new (aka "version 1"). When we generate a tree object we also regenerate the prior version object and use that as our base object for a delta. This strategy saves a significant amount of memory as we can continue to use the atom pool for file/directory names and only increases each tree entry by an additional 24 bytes of memory. Branches should automatically delta against their ancestor tree, unless the ancestor tree is already at the delta chain limit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-28 18:22:50 +02:00			`t->delta_depth = 0;`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`return t;`
			`}`

			`static void release_tree_entry(struct tree_entry *e);`
			`static void release_tree_content(struct tree_content *t)`
			`{`
			`struct avail_tree_content *f = (struct avail_tree_content*)t;`
			`unsigned int hc = hc_entries(f->entry_capacity);`
Fixed segfault in fast-import after growing a tree. Growing a tree caused all subtrees to be deallocated and put back into the free list yet those subtree's contents were still actively in use. Consequently they were doled out again and got stomped on elsewhere. Releasing a tree is now performed in two parts, either releasing only the content array or releasing the content array and recursively releasing the subtree(s). Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-23 07:33:47 +02:00			`f->next_avail = avail_tree_table[hc];`
			`avail_tree_table[hc] = f;`
			`}`

			`static void release_tree_content_recursive(struct tree_content *t)`
			`{`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`unsigned int i;`
			`for (i = 0; i < t->entry_count; i++)`
			`release_tree_entry(t->entries[i]);`
Fixed segfault in fast-import after growing a tree. Growing a tree caused all subtrees to be deallocated and put back into the free list yet those subtree's contents were still actively in use. Consequently they were doled out again and got stomped on elsewhere. Releasing a tree is now performed in two parts, either releasing only the content array or releasing the content array and recursively releasing the subtree(s). Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-23 07:33:47 +02:00			`release_tree_content(t);`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`}`

Correct minor style issue in fast-import. Junio noticed that I was using a different style in fast-import for returned pointers than the rest of Git. Before merging this code into the main git.git tree I'd like to make it consistent, as this style variation was not intentional. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-06 06:43:59 +01:00			`static struct tree_content *grow_tree_content(`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`struct tree_content *t,`
			`int amt)`
			`{`
			`struct tree_content *r = new_tree_content(t->entry_count + amt);`
			`r->entry_count = t->entry_count;`
Implemented tree delta compression in fast-import. We now store for every tree entry two modes and two sha1 values; the base (aka "version 0") and the current/new (aka "version 1"). When we generate a tree object we also regenerate the prior version object and use that as our base object for a delta. This strategy saves a significant amount of memory as we can continue to use the atom pool for file/directory names and only increases each tree entry by an additional 24 bytes of memory. Branches should automatically delta against their ancestor tree, unless the ancestor tree is already at the delta chain limit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-28 18:22:50 +02:00			`r->delta_depth = t->delta_depth;`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`memcpy(r->entries,t->entries,t->entry_count*sizeof(t->entries[0]));`
			`release_tree_content(t);`
			`return r;`
			`}`

Correct minor style issue in fast-import. Junio noticed that I was using a different style in fast-import for returned pointers than the rest of Git. Before merging this code into the main git.git tree I'd like to make it consistent, as this style variation was not intentional. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-06 06:43:59 +01:00			`static struct tree_entry *new_tree_entry(void)`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`{`
			`struct tree_entry *e;`

			`if (!avail_tree_entry) {`
			`unsigned int n = tree_entry_alloc;`
Account for tree entry memory costs in fast-import. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-25 20:53:32 +02:00			`total_allocd += n * sizeof(struct tree_entry);`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`avail_tree_entry = e = xmalloc(n * sizeof(struct tree_entry));`
Fixed GPF in fast-import caused by unterminated linked list. fast-import was encounting a GPF when it ran out of free tree_entry objects but didn't know this was the cause because the last tree_entry wasn't terminated with a NULL pointer. The missing NULL pointer occurred when we allocated additional entries via xmalloc but didn't set the last tree_entry's "next" pointer to NULL. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-27 04:38:02 +02:00			`while (n-- > 1) {`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`((void*)e) = e + 1;`
			`e++;`
			`}`
Fixed compile error in fast-import. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-27 05:37:31 +02:00			`*((void**)e) = NULL;`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`}`

			`e = avail_tree_entry;`
			`avail_tree_entry = *((void**)e);`
			`return e;`
			`}`

			`static void release_tree_entry(struct tree_entry *e)`
			`{`
			`if (e->tree)`
Fixed segfault in fast-import after growing a tree. Growing a tree caused all subtrees to be deallocated and put back into the free list yet those subtree's contents were still actively in use. Consequently they were doled out again and got stomped on elsewhere. Releasing a tree is now performed in two parts, either releasing only the content array or releasing the content array and recursively releasing the subtree(s). Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-23 07:33:47 +02:00			`release_tree_content_recursive(e->tree);`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`*((void**)e) = avail_tree_entry;`
			`avail_tree_entry = e;`
			`}`
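`new_tree_entry()` and `release_tree_entry()` keep free entries on a singly linked list whose link is stored in the first pointer-sized bytes of each free entry. The sketch below shows that pattern on its own. `struct demo_entry`, `DEMO_BATCH` and the `demo_*` names are hypothetical, plain `malloc` stands in for `xmalloc`, and the recursive subtree release is left out.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical entry type; the real struct tree_entry has other fields. */
struct demo_entry {
	void *payload;		/* overlaid by the list link while the entry is free */
	unsigned int mode;
};

#define DEMO_BATCH 4u		/* hypothetical; stands in for tree_entry_alloc */

static struct demo_entry *demo_avail;	/* head of the free list */

/* Pop an entry, allocating and chaining a fresh batch when the list is empty. */
struct demo_entry *demo_new_entry(void)
{
	struct demo_entry *e;

	if (!demo_avail) {
		unsigned int n = DEMO_BATCH;

		demo_avail = e = malloc(n * sizeof(*e));
		assert(e != NULL);
		/* Point each free entry's first word at its successor. */
		while (n-- > 1) {
			*((void **)e) = e + 1;
			e++;
		}
		*((void **)e) = NULL;	/* terminate the list */
	}

	e = demo_avail;
	demo_avail = *((void **)e);
	return e;
}

/* Push an entry back onto the front of the free list. */
void demo_release_entry(struct demo_entry *e)
{
	*((void **)e) = demo_avail;
	demo_avail = e;
}
```

The explicit `NULL` store after the loop is the detail the "unterminated linked list" commit message above refers to: without it, the last entry of a batch carries whatever bytes the allocator returned.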

Teach fast-import to recursively copy files/directories Some source material (e.g. Subversion dump files) perform directory renames by telling us the directory was copied, then deleted in the same revision. This makes it difficult for a frontend to convert such data formats to a fast-import stream, as all the frontend has on hand is "Copy a/ to b/; Delete a/" with no details about what files are in a/, unless the frontend also kept track of all files. The new 'C' subcommand within a commit allows the frontend to make a recursive copy of one path to another path within the branch, without needing to keep track of the individual file paths. The metadata copy is performed in memory efficiently, but is implemented as a copy-immediately operation, rather than copy-on-write. With this new 'C' subcommand frontends could obviously implement an 'R' (rename) on their own as a combination of 'C' and 'D' (delete), but since we have already offered up 'R' in the past and it is a trivial thing to keep implemented I'm not going to deprecate it. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-07-15 07:40:37 +02:00			`static struct tree_content *dup_tree_content(struct tree_content *s)`
			`{`
			`struct tree_content *d;`
			`struct tree_entry *a, *b;`
			`unsigned int i;`

			`if (!s)`
			`return NULL;`
			`d = new_tree_content(s->entry_count);`
			`for (i = 0; i < s->entry_count; i++) {`
			`a = s->entries[i];`
			`b = new_tree_entry();`
			`memcpy(b, a, sizeof(*a));`
			`if (a->tree && is_null_sha1(b->versions[1].sha1))`
			`b->tree = dup_tree_content(a->tree);`
			`else`
			`b->tree = NULL;`
			`d->entries[i] = b;`
			`}`
			`d->entry_count = s->entry_count;`
			`d->delta_depth = s->delta_depth;`

			`return d;`
			`}`

Declare no-arg functions as (void) in fast-import. Apparently the git convention is to declare any function which takes no arguments as taking void. I did not do this during the early fast-import development, but should have. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-17 07:47:25 +01:00			`static void start_packfile(void)`
Restructure fast-import to support creating multiple packfiles. Now that we are starting to see some really large projects (such as KDE or a fork of FreeBSD) get imported into Git we're running into the upper limit on packfile object count as well as overall byte length. The KDE and FreeBSD projects are both likely to require more than 4 GiB to store their current history, which means we really need multiple packfiles to handle their content. This is a fairly simple restructuring of the internal code to help us support creating multiple packfiles from within fast-import. We are now adding a 5 digit incrementing suffix to the end of the basename supplied to us by the caller, permitting up to 99,999 packs to be generated in a single fast-import run. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 10:39:05 +01:00			`{`
Appease Sun Studio by renaming "tmpfile" On Solaris the system headers define the "tmpfile" name, which'll cause Git compiled with Sun Studio 12 Update 1 to whine about us redefining the name: "pack-write.c", line 76: warning: name redefined by pragma redefine_extname declared static: tmpfile (E_PRAGMA_REDEFINE_STATIC) "sha1_file.c", line 2455: warning: name redefined by pragma redefine_extname declared static: tmpfile (E_PRAGMA_REDEFINE_STATIC) "fast-import.c", line 858: warning: name redefined by pragma redefine_extname declared static: tmpfile (E_PRAGMA_REDEFINE_STATIC) "builtin/index-pack.c", line 175: warning: name redefined by pragma redefine_extname declared static: tmpfile (E_PRAGMA_REDEFINE_STATIC) Just renaming the "tmpfile" variable to "tmp_file" in the relevant places is the easiest way to fix this. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-12-21 02:18:21 +01:00			`static char tmp_file[PATH_MAX];`
Implemented manual packfile switching in fast-import. To help importers which are dealing with massive amounts of data fast-import needs to be able to close the packfile it is currently writing to and open a new packfile for any additional data that will be received. A new 'checkpoint' command has been introduced which can be used by the frontend import process to force this to occur at any time. This may be useful to ensure a very long running import doesn't lose any work due to unexpected failures. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 12:35:41 +01:00			`struct packed_git *p;`
Restructure fast-import to support creating multiple packfiles. Now that we are starting to see some really large projects (such as KDE or a fork of FreeBSD) get imported into Git we're running into the upper limit on packfile object count as well as overall byte length. The KDE and FreeBSD projects are both likely to require more than 4 GiB to store their current history, which means we really need multiple packfiles to handle their content. This is a fairly simple restructuring of the internal code to help us support creating multiple packfiles from within fast-import. We are now adding a 5 digit incrementing suffix to the end of the basename supplied to us by the caller, permitting up to 99,999 packs to be generated in a single fast-import run. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 10:39:05 +01:00			`struct pack_header hdr;`
Remove unnecessary pack_fd global in fast-import. Much like the pack_sha1 the pack_fd is an unnecessary global variable, we already have the fd stored in our struct packed_git *pack_data so that the core library functions in sha1_file.c are able to lookup and decompress object data that we have previously written. Keeping an extra copy of this value in our own variable is just a hold-over from earlier versions of fast-import and is now completely unnecessary. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-16 07:20:57 +01:00			`int pack_fd;`
Restructure fast-import to support creating multiple packfiles. Now that we are starting to see some really large projects (such as KDE or a fork of FreeBSD) get imported into Git we're running into the upper limit on packfile object count as well as overall byte length. The KDE and FreeBSD projects are both likely to require more than 4 GiB to store their current history, which means we really need multiple packfiles to handle their content. This is a fairly simple restructuring of the internal code to help us support creating multiple packfiles from within fast-import. We are now adding a 5 digit incrementing suffix to the end of the basename supplied to us by the caller, permitting up to 99,999 packs to be generated in a single fast-import run. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 10:39:05 +01:00
Appease Sun Studio by renaming "tmpfile" On Solaris the system headers define the "tmpfile" name, which'll cause Git compiled with Sun Studio 12 Update 1 to whine about us redefining the name: "pack-write.c", line 76: warning: name redefined by pragma redefine_extname declared static: tmpfile (E_PRAGMA_REDEFINE_STATIC) "sha1_file.c", line 2455: warning: name redefined by pragma redefine_extname declared static: tmpfile (E_PRAGMA_REDEFINE_STATIC) "fast-import.c", line 858: warning: name redefined by pragma redefine_extname declared static: tmpfile (E_PRAGMA_REDEFINE_STATIC) "builtin/index-pack.c", line 175: warning: name redefined by pragma redefine_extname declared static: tmpfile (E_PRAGMA_REDEFINE_STATIC) Just renaming the "tmpfile" variable to "tmp_file" in the relevant places is the easiest way to fix this. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-12-21 02:18:21 +01:00			`pack_fd = odb_mkstemp(tmp_file, sizeof(tmp_file),`
Make sure objects/pack exists before creating a new pack In a repository created with git older than f49fb35 (git-init-db: create "pack" subdirectory under objects, 2005-06-27), objects/pack/ directory is not created upon initialization. It was Ok because subdirectories are created as needed inside directories init-db creates, and back then, packfiles were recent invention. After the said commit, new codepaths started relying on the presense of objects/pack/ directory in the repository. This was exacerbated with 8b4eb6b (Do not perform cross-directory renames when creating packs, 2008-09-22) that moved the location temporary pack files are created from objects/ directory to objects/pack/ directory, because moving temporary to the final location was done carefully with lazy leading directory creation. Many packfile related operations in such an old repository can fail mysteriously because of this. This commit introduces two helper functions to make things work better. - odb_mkstemp() is a specialized version of mkstemp() to refactor the code and teach it to create leading directories as needed; - odb_pack_keep() refactors the code to create a ".keep" file while create leading directories as needed. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-02-25 08:11:29 +01:00			`"pack/tmp_pack_XXXXXX");`
Appease Sun Studio by renaming "tmpfile" On Solaris the system headers define the "tmpfile" name, which'll cause Git compiled with Sun Studio 12 Update 1 to whine about us redefining the name: "pack-write.c", line 76: warning: name redefined by pragma redefine_extname declared static: tmpfile (E_PRAGMA_REDEFINE_STATIC) "sha1_file.c", line 2455: warning: name redefined by pragma redefine_extname declared static: tmpfile (E_PRAGMA_REDEFINE_STATIC) "fast-import.c", line 858: warning: name redefined by pragma redefine_extname declared static: tmpfile (E_PRAGMA_REDEFINE_STATIC) "builtin/index-pack.c", line 175: warning: name redefined by pragma redefine_extname declared static: tmpfile (E_PRAGMA_REDEFINE_STATIC) Just renaming the "tmpfile" variable to "tmp_file" in the relevant places is the easiest way to fix this. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-12-21 02:18:21 +01:00			`p = xcalloc(1, sizeof(*p) + strlen(tmp_file) + 2);`
			`strcpy(p->pack_name, tmp_file);`
Implemented manual packfile switching in fast-import. To help importers which are dealing with massive amounts of data fast-import needs to be able to close the packfile it is currently writing to and open a new packfile for any additional data that will be received. A new 'checkpoint' command has been introduced which can be used by the frontend import process to force this to occur at any time. This may be useful to ensure a very long running import doesn't lose any work due to unexpected failures. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 12:35:41 +01:00			`p->pack_fd = pack_fd;`
sha1_file.c: Don't retain open fds on small packs If a pack file is small enough that its entire contents fits within one mmap window, mmap the file and then immediately close its file descriptor. This reduces the number of file descriptors that are needed to read from repositories with many tiny pack files, such as one that has received 1000 pushes (and created 1000 small pack files) since its last repack. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-03-02 19:01:54 +01:00			`p->do_not_close = 1;`
fast-import: use sha1write() for pack data This is in preparation for using write_idx_file(). Also, by using sha1write() we get some buffering to reduces the number of write syscalls, and the written data is SHA1 summed which allows for the extra data integrity validation check performed in fixup_pack_header_footer() (details on this in commit abeb40e5aa). Signed-off-by: Nicolas Pitre <nico@fluxnic.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-02-17 20:05:52 +01:00			`pack_file = sha1fd(pack_fd, p->pack_name);`
Restructure fast-import to support creating multiple packfiles. Now that we are starting to see some really large projects (such as KDE or a fork of FreeBSD) get imported into Git we're running into the upper limit on packfile object count as well as overall byte length. The KDE and FreeBSD projects are both likely to require more than 4 GiB to store their current history, which means we really need multiple packfiles to handle their content. This is a fairly simple restructuring of the internal code to help us support creating multiple packfiles from within fast-import. We are now adding a 5 digit incrementing suffix to the end of the basename supplied to us by the caller, permitting up to 99,999 packs to be generated in a single fast-import run. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 10:39:05 +01:00
			`hdr.hdr_signature = htonl(PACK_SIGNATURE);`
			`hdr.hdr_version = htonl(2);`
			`hdr.hdr_entries = 0;`
fast-import: use sha1write() for pack data This is in preparation for using write_idx_file(). Also, by using sha1write() we get some buffering to reduces the number of write syscalls, and the written data is SHA1 summed which allows for the extra data integrity validation check performed in fixup_pack_header_footer() (details on this in commit abeb40e5aa). Signed-off-by: Nicolas Pitre <nico@fluxnic.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-02-17 20:05:52 +01:00			`sha1write(pack_file, &hdr, sizeof(hdr));`
Implemented manual packfile switching in fast-import. To help importers which are dealing with massive amounts of data fast-import needs to be able to close the packfile it is currently writing to and open a new packfile for any additional data that will be received. A new 'checkpoint' command has been introduced which can be used by the frontend import process to force this to occur at any time. This may be useful to ensure a very long running import doesn't lose any work due to unexpected failures. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 12:35:41 +01:00
			`pack_data = p;`
Restructure fast-import to support creating multiple packfiles. Now that we are starting to see some really large projects (such as KDE or a fork of FreeBSD) get imported into Git we're running into the upper limit on packfile object count as well as overall byte length. The KDE and FreeBSD projects are both likely to require more than 4 GiB to store their current history, which means we really need multiple packfiles to handle their content. This is a fairly simple restructuring of the internal code to help us support creating multiple packfiles from within fast-import. We are now adding a 5 digit incrementing suffix to the end of the basename supplied to us by the caller, permitting up to 99,999 packs to be generated in a single fast-import run. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 10:39:05 +01:00			`pack_size = sizeof(hdr);`
			`object_count = 0;`
Implemented manual packfile switching in fast-import. To help importers which are dealing with massive amounts of data fast-import needs to be able to close the packfile it is currently writing to and open a new packfile for any additional data that will be received. A new 'checkpoint' command has been introduced which can be used by the frontend import process to force this to occur at any time. This may be useful to ensure a very long running import doesn't lose any work due to unexpected failures. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 12:35:41 +01:00
			`all_packs = xrealloc(all_packs, sizeof(*all_packs) * (pack_id + 1));`
			`all_packs[pack_id] = p;`
Restructure fast-import to support creating multiple packfiles. Now that we are starting to see some really large projects (such as KDE or a fork of FreeBSD) get imported into Git we're running into the upper limit on packfile object count as well as overall byte length. The KDE and FreeBSD projects are both likely to require more than 4 GiB to store their current history, which means we really need multiple packfiles to handle their content. This is a fairly simple restructuring of the internal code to help us support creating multiple packfiles from within fast-import. We are now adding a 5 digit incrementing suffix to the end of the basename supplied to us by the caller, permitting up to 99,999 packs to be generated in a single fast-import run. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 10:39:05 +01:00			`}`
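`start_packfile()` begins every pack with a fixed 12-byte header whose fields are written in network byte order. The sketch below builds that header into a caller-supplied buffer. `struct demo_pack_header`, `DEMO_PACK_SIGNATURE` and `demo_write_pack_header()` are hypothetical names, the field layout of three 32-bit words is assumed from the three assignments above, and `memcpy` into a buffer replaces `sha1write()`.

```c
#include <arpa/inet.h>	/* htonl (POSIX) */
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DEMO_PACK_SIGNATURE 0x5041434bu	/* the bytes "PACK" */

/* Assumed layout: three 32-bit fields, matching the assignments in start_packfile(). */
struct demo_pack_header {
	uint32_t hdr_signature;
	uint32_t hdr_version;
	uint32_t hdr_entries;
};

/* Write a version-2 header with a zero entry count; returns the bytes written. */
size_t demo_write_pack_header(unsigned char buf[12])
{
	struct demo_pack_header hdr;

	assert(sizeof(hdr) == 12);
	hdr.hdr_signature = htonl(DEMO_PACK_SIGNATURE);
	hdr.hdr_version = htonl(2);
	hdr.hdr_entries = 0;	/* object count is not known yet */
	memcpy(buf, &hdr, sizeof(hdr));
	return sizeof(hdr);
}
```

The entry count starts at zero because the number of objects is unknown when the pack is opened. The commit message on the `sha1fd()` line above mentions `fixup_pack_header_footer()`, which suggests the count is filled in once the pack is finished.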

fast-import: use write_idx_file() instead of custom code This allows for the creation of pack index version 2 with its object CRC and the possibility for a pack to be larger than 4 GB. Signed-off-by: Nicolas Pitre <nico@fluxnic.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-02-17 20:05:53 +01:00			`static const char *create_index(void)`
Restructure fast-import to support creating multiple packfiles. Now that we are starting to see some really large projects (such as KDE or a fork of FreeBSD) get imported into Git we're running into the upper limit on packfile object count as well as overall byte length. The KDE and FreeBSD projects are both likely to require more than 4 GiB to store their current history, which means we really need multiple packfiles to handle their content. This is a fairly simple restructuring of the internal code to help us support creating multiple packfiles from within fast-import. We are now adding a 5 digit incrementing suffix to the end of the basename supplied to us by the caller, permitting up to 99,999 packs to be generated in a single fast-import run. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 10:39:05 +01:00			`{`
fast-import: use write_idx_file() instead of custom code This allows for the creation of pack index version 2 with its object CRC and the possibility for a pack to be larger than 4 GB. Signed-off-by: Nicolas Pitre <nico@fluxnic.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-02-17 20:05:53 +01:00			`const char *tmpfile;`
			`struct pack_idx_entry **idx, **c, **last;`
			`struct object_entry *e;`
Restructure fast-import to support creating multiple packfiles. Now that we are starting to see some really large projects (such as KDE or a fork of FreeBSD) get imported into Git we're running into the upper limit on packfile object count as well as overall byte length. The KDE and FreeBSD projects are both likely to require more than 4 GiB to store their current history, which means we really need multiple packfiles to handle their content. This is a fairly simple restructuring of the internal code to help us support creating multiple packfiles from within fast-import. We are now adding a 5 digit incrementing suffix to the end of the basename supplied to us by the caller, permitting up to 99,999 packs to be generated in a single fast-import run. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 10:39:05 +01:00			`struct object_entry_pool *o;`

fast-import: use write_idx_file() instead of custom code This allows for the creation of pack index version 2 with its object CRC and the possibility for a pack to be larger than 4 GB. Signed-off-by: Nicolas Pitre <nico@fluxnic.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-02-17 20:05:53 +01:00			`/* Build the table of object IDs. */`
			`idx = xmalloc(object_count * sizeof(*idx));`
Restructure fast-import to support creating multiple packfiles. Now that we are starting to see some really large projects (such as KDE or a fork of FreeBSD) get imported into Git we're running into the upper limit on packfile object count as well as overall byte length. The KDE and FreeBSD projects are both likely to require more than 4 GiB to store their current history, which means we really need multiple packfiles to handle their content. This is a fairly simple restructuring of the internal code to help us support creating multiple packfiles from within fast-import. We are now adding a 5 digit incrementing suffix to the end of the basename supplied to us by the caller, permitting up to 99,999 packs to be generated in a single fast-import run. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 10:39:05 +01:00			`c = idx;`
			`for (o = blocks; o; o = o->next_pool)`
Implemented automatic checkpoints within fast-import. When the number of objects or number of bytes gets close to the limit allowed by the packfile format (or configured on the command line by our caller) we should automatically checkpoint the current packfile and start a new one before writing the object out. This does however require that we abandon the delta (if we had one) as its not valid in a new packfile. I also added the simple rule that if we got a delta back but the delta itself is the same size as or larger than the uncompressed object to ignore the delta and just store the object data. This should avoid some really bad behavior caused by our current delta strategy. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 14:00:49 +01:00			`for (e = o->next_free; e-- != o->entries;)`
			`if (pack_id == e->pack_id)`
fast-import: use write_idx_file() instead of custom code This allows for the creation of pack index version 2 with its object CRC and the possibility for a pack to be larger than 4 GB. Signed-off-by: Nicolas Pitre <nico@fluxnic.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-02-17 20:05:53 +01:00			`*c++ = &e->idx;`
Restructure fast-import to support creating multiple packfiles. Now that we are starting to see some really large projects (such as KDE or a fork of FreeBSD) get imported into Git we're running into the upper limit on packfile object count as well as overall byte length. The KDE and FreeBSD projects are both likely to require more than 4 GiB to store their current history, which means we really need multiple packfiles to handle their content. This is a fairly simple restructuring of the internal code to help us support creating multiple packfiles from within fast-import. We are now adding a 5 digit incrementing suffix to the end of the basename supplied to us by the caller, permitting up to 99,999 packs to be generated in a single fast-import run. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 10:39:05 +01:00			`last = idx + object_count;`
Optimize index creation on large object sets in fast-import. When we are generating multiple packfiles at once we only need to scan the blocks of object_entry structs which contain objects for the current packfile. Because the most recent blocks are at the front of the linked list, and because all new objects going into the current file are allocated from the front of that list, we can stop scanning for objects as soon as we identify one which doesn't belong to the current packfile. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 12:51:58 +01:00			`if (c != last)`
			`die("internal consistency error creating the index");`
Restructure fast-import to support creating multiple packfiles. Now that we are starting to see some really large projects (such as KDE or a fork of FreeBSD) get imported into Git we're running into the upper limit on packfile object count as well as overall byte length. The KDE and FreeBSD projects are both likely to require more than 4 GiB to store their current history, which means we really need multiple packfiles to handle their content. This is a fairly simple restructuring of the internal code to help us support creating multiple packfiles from within fast-import. We are now adding a 5 digit incrementing suffix to the end of the basename supplied to us by the caller, permitting up to 99,999 packs to be generated in a single fast-import run. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 10:39:05 +01:00
write_idx_file: introduce a struct to hold idx customization options Remove two globals, pack_idx_default version and pack_idx_off32_limit, and place them in a pack_idx_option structure. Allow callers to pass it to write_idx_file() as a parameter. Adjust all callers to the API change. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-02-26 00:43:25 +01:00			`tmpfile = write_idx_file(NULL, idx, object_count, &pack_idx_opts, pack_data->sha1);`
Restructure fast-import to support creating multiple packfiles. Now that we are starting to see some really large projects (such as KDE or a fork of FreeBSD) get imported into Git we're running into the upper limit on packfile object count as well as overall byte length. The KDE and FreeBSD projects are both likely to require more than 4 GiB to store their current history, which means we really need multiple packfiles to handle their content. This is a fairly simple restructuring of the internal code to help us support creating multiple packfiles from within fast-import. We are now adding a 5 digit incrementing suffix to the end of the basename supplied to us by the caller, permitting up to 99,999 packs to be generated in a single fast-import run. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 10:39:05 +01:00			`free(idx);`
Use .keep files in fast-import during processing. Because fast-import automatically updates all references (heads and tags) at the end of its run the repository is corrupt unless the objects are available in the .git/objects/pack directory prior to the refs being modified. The easiest way to ensure that is true is to move the packfile and its associated index directly into the .git/objects/pack directory as soon as we have finished output to it. But the only safe way to do this is to create the a temporary .keep file for that pack, so we use the same tricks that index-pack uses when its being invoked by receive-pack. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-16 07:15:31 +01:00			`return tmpfile;`
			`}`

fast-import: use write_idx_file() instead of custom code This allows for the creation of pack index version 2 with its object CRC and the possibility for a pack to be larger than 4 GB. Signed-off-by: Nicolas Pitre <nico@fluxnic.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-02-17 20:05:53 +01:00			`static char *keep_pack(const char *curr_index_name)`
Use .keep files in fast-import during processing. Because fast-import automatically updates all references (heads and tags) at the end of its run the repository is corrupt unless the objects are available in the .git/objects/pack directory prior to the refs being modified. The easiest way to ensure that is true is to move the packfile and its associated index directly into the .git/objects/pack directory as soon as we have finished output to it. But the only safe way to do this is to create the a temporary .keep file for that pack, so we use the same tricks that index-pack uses when its being invoked by receive-pack. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-16 07:15:31 +01:00			`{`
			`static char name[PATH_MAX];`
General const correctness fixes We shouldn't attempt to assign constant strings into char*, as the string is not writable at runtime. Likewise we should always be treating unsigned values as unsigned values, not as signed values. Most of these are very straightforward. The only exception is the (unnecessary) xstrdup/free in builtin-branch.c for the detached head case. Since this is a user-level interactive type program and that particular code path is executed no more than once, I feel that the extra xstrdup call is well worth the easy elimination of this warning. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-07 02:44:17 +01:00			`static const char *keep_msg = "fast-import";`
Use .keep files in fast-import during processing. Because fast-import automatically updates all references (heads and tags) at the end of its run the repository is corrupt unless the objects are available in the .git/objects/pack directory prior to the refs being modified. The easiest way to ensure that is true is to move the packfile and its associated index directly into the .git/objects/pack directory as soon as we have finished output to it. But the only safe way to do this is to create the a temporary .keep file for that pack, so we use the same tricks that index-pack uses when its being invoked by receive-pack. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-16 07:15:31 +01:00			`int keep_fd;`

Make sure objects/pack exists before creating a new pack In a repository created with git older than f49fb35 (git-init-db: create "pack" subdirectory under objects, 2005-06-27), objects/pack/ directory is not created upon initialization. It was Ok because subdirectories are created as needed inside directories init-db creates, and back then, packfiles were recent invention. After the said commit, new codepaths started relying on the presense of objects/pack/ directory in the repository. This was exacerbated with 8b4eb6b (Do not perform cross-directory renames when creating packs, 2008-09-22) that moved the location temporary pack files are created from objects/ directory to objects/pack/ directory, because moving temporary to the final location was done carefully with lazy leading directory creation. Many packfile related operations in such an old repository can fail mysteriously because of this. This commit introduces two helper functions to make things work better. - odb_mkstemp() is a specialized version of mkstemp() to refactor the code and teach it to create leading directories as needed; - odb_pack_keep() refactors the code to create a ".keep" file while create leading directories as needed. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-02-25 08:11:29 +01:00			`keep_fd = odb_pack_keep(name, sizeof(name), pack_data->sha1);`
Use .keep files in fast-import during processing. Because fast-import automatically updates all references (heads and tags) at the end of its run the repository is corrupt unless the objects are available in the .git/objects/pack directory prior to the refs being modified. The easiest way to ensure that is true is to move the packfile and its associated index directly into the .git/objects/pack directory as soon as we have finished output to it. But the only safe way to do this is to create the a temporary .keep file for that pack, so we use the same tricks that index-pack uses when its being invoked by receive-pack. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-16 07:15:31 +01:00			`if (keep_fd < 0)`
Use die_errno() instead of die() when checking syscalls Lots of die() calls did not actually report the kind of error, which can leave the user confused as to the real problem. Use die_errno() where we check a system/library call that sets errno on failure, or one of the following that wrap such calls: Function Passes on error from -------- -------------------- odb_pack_keep open read_ancestry fopen read_in_full xread strbuf_read xread strbuf_read_file open or strbuf_read_file strbuf_readlink readlink write_in_full xwrite Signed-off-by: Thomas Rast <trast@student.ethz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-06-27 17:58:47 +02:00			`die_errno("cannot create keep file");`
bundle, fast-import: detect write failure I noticed some unchecked writes. This fixes them. * bundle.c (create_bundle): Die upon write failure. * fast-import.c (keep_pack): Die upon write or close failure. Signed-off-by: Jim Meyering <meyering@redhat.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-01-10 09:54:25 +01:00			`write_or_die(keep_fd, keep_msg, strlen(keep_msg));`
			`if (close(keep_fd))`
Use die_errno() instead of die() when checking syscalls Lots of die() calls did not actually report the kind of error, which can leave the user confused as to the real problem. Use die_errno() where we check a system/library call that sets errno on failure, or one of the following that wrap such calls: Function Passes on error from -------- -------------------- odb_pack_keep open read_ancestry fopen read_in_full xread strbuf_read xread strbuf_read_file open or strbuf_read_file strbuf_readlink readlink write_in_full xwrite Signed-off-by: Thomas Rast <trast@student.ethz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-06-27 17:58:47 +02:00			`die_errno("failed to write keep file");`
Use .keep files in fast-import during processing. Because fast-import automatically updates all references (heads and tags) at the end of its run the repository is corrupt unless the objects are available in the .git/objects/pack directory prior to the refs being modified. The easiest way to ensure that is true is to move the packfile and its associated index directly into the .git/objects/pack directory as soon as we have finished output to it. But the only safe way to do this is to create the a temporary .keep file for that pack, so we use the same tricks that index-pack uses when its being invoked by receive-pack. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-16 07:15:31 +01:00
			`snprintf(name, sizeof(name), "%s/pack/pack-%s.pack",`
			`get_object_directory(), sha1_to_hex(pack_data->sha1));`
			`if (move_temp_to_file(pack_data->pack_name, name))`
			`die("cannot store pack file");`

			`snprintf(name, sizeof(name), "%s/pack/pack-%s.idx",`
			`get_object_directory(), sha1_to_hex(pack_data->sha1));`
			`if (move_temp_to_file(curr_index_name, name))`
			`die("cannot store index file");`
fast-import: use write_idx_file() instead of custom code This allows for the creation of pack index version 2 with its object CRC and the possibility for a pack to be larger than 4 GB. Signed-off-by: Nicolas Pitre <nico@fluxnic.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-02-17 20:05:53 +01:00			`free((void *)curr_index_name);`
Use .keep files in fast-import during processing. Because fast-import automatically updates all references (heads and tags) at the end of its run the repository is corrupt unless the objects are available in the .git/objects/pack directory prior to the refs being modified. The easiest way to ensure that is true is to move the packfile and its associated index directly into the .git/objects/pack directory as soon as we have finished output to it. But the only safe way to do this is to create the a temporary .keep file for that pack, so we use the same tricks that index-pack uses when its being invoked by receive-pack. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-16 07:15:31 +01:00			`return name;`
			`}`

Declare no-arg functions as (void) in fast-import. Apparently the git convention is to declare any function which takes no arguments as taking void. I did not do this during the early fast-import development, but should have. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-17 07:47:25 +01:00			`static void unkeep_all_packs(void)`
Use .keep files in fast-import during processing. Because fast-import automatically updates all references (heads and tags) at the end of its run the repository is corrupt unless the objects are available in the .git/objects/pack directory prior to the refs being modified. The easiest way to ensure that is true is to move the packfile and its associated index directly into the .git/objects/pack directory as soon as we have finished output to it. But the only safe way to do this is to create the a temporary .keep file for that pack, so we use the same tricks that index-pack uses when its being invoked by receive-pack. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-16 07:15:31 +01:00			`{`
			`static char name[PATH_MAX];`
			`int k;`

			`for (k = 0; k < pack_id; k++) {`
			`struct packed_git *p = all_packs[k];`
			`snprintf(name, sizeof(name), "%s/pack/pack-%s.keep",`
			`get_object_directory(), sha1_to_hex(p->sha1));`
replace direct calls to unlink(2) with unlink_or_warn This helps to notice when something's going wrong, especially on systems which lock open files. I used the following criteria when selecting the code for replacement: - it was already printing a warning for the unlink failures - it is in a function which already printing something or is called from such a function - it is in a static function, returning void and the function is only called from a builtin main function (cmd_) - it is in a function which handles emergency exit (signal handlers) - it is in a function which is obvously cleaning up the lockfiles Signed-off-by: Alex Riesen <raa.lkml@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-04-29 23:22:56 +02:00			`unlink_or_warn(name);`
Use .keep files in fast-import during processing. Because fast-import automatically updates all references (heads and tags) at the end of its run the repository is corrupt unless the objects are available in the .git/objects/pack directory prior to the refs being modified. The easiest way to ensure that is true is to move the packfile and its associated index directly into the .git/objects/pack directory as soon as we have finished output to it. But the only safe way to do this is to create the a temporary .keep file for that pack, so we use the same tricks that index-pack uses when its being invoked by receive-pack. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-16 07:15:31 +01:00			`}`
Restructure fast-import to support creating multiple packfiles. Now that we are starting to see some really large projects (such as KDE or a fork of FreeBSD) get imported into Git we're running into the upper limit on packfile object count as well as overall byte length. The KDE and FreeBSD projects are both likely to require more than 4 GiB to store their current history, which means we really need multiple packfiles to handle their content. This is a fairly simple restructuring of the internal code to help us support creating multiple packfiles from within fast-import. We are now adding a 5 digit incrementing suffix to the end of the basename supplied to us by the caller, permitting up to 99,999 packs to be generated in a single fast-import run. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 10:39:05 +01:00			`}`

Declare no-arg functions as (void) in fast-import. Apparently the git convention is to declare any function which takes no arguments as taking void. I did not do this during the early fast-import development, but should have. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-17 07:47:25 +01:00			`static void end_packfile(void)`
Restructure fast-import to support creating multiple packfiles. Now that we are starting to see some really large projects (such as KDE or a fork of FreeBSD) get imported into Git we're running into the upper limit on packfile object count as well as overall byte length. The KDE and FreeBSD projects are both likely to require more than 4 GiB to store their current history, which means we really need multiple packfiles to handle their content. This is a fairly simple restructuring of the internal code to help us support creating multiple packfiles from within fast-import. We are now adding a 5 digit incrementing suffix to the end of the basename supplied to us by the caller, permitting up to 99,999 packs to be generated in a single fast-import run. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 10:39:05 +01:00			`{`
Implemented manual packfile switching in fast-import. To help importers which are dealing with massive amounts of data fast-import needs to be able to close the packfile it is currently writing to and open a new packfile for any additional data that will be received. A new 'checkpoint' command has been introduced which can be used by the frontend import process to force this to occur at any time. This may be useful to ensure a very long running import doesn't lose any work due to unexpected failures. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 12:35:41 +01:00			`struct packed_git *old_p = pack_data, *new_p;`

Clear the delta base cache during fast-import checkpoint Otherwise we may reuse the same memory address for a totally different "struct packed_git", and a previously cached object from the prior occupant might be returned when trying to unpack an object from the new pack. Found-by: Daniel Barkalow <barkalow@iabervon.org> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-02-10 22:36:12 +01:00			`clear_delta_base_cache();`
Don't create a final empty packfile in fast-import. If the last packfile is going to be empty (has 0 objects) then it shouldn't be kept after the import has terminated, as there is no point to the packfile. So rather than hashing it and making the index file, just delete the packfile. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 12:39:39 +01:00			`if (object_count) {`
fast-import: use sha1write() for pack data This is in preparation for using write_idx_file(). Also, by using sha1write() we get some buffering to reduces the number of write syscalls, and the written data is SHA1 summed which allows for the extra data integrity validation check performed in fixup_pack_header_footer() (details on this in commit abeb40e5aa). Signed-off-by: Nicolas Pitre <nico@fluxnic.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-02-17 20:05:52 +01:00			`unsigned char cur_pack_sha1[20];`
Use .keep files in fast-import during processing. Because fast-import automatically updates all references (heads and tags) at the end of its run the repository is corrupt unless the objects are available in the .git/objects/pack directory prior to the refs being modified. The easiest way to ensure that is true is to move the packfile and its associated index directly into the .git/objects/pack directory as soon as we have finished output to it. But the only safe way to do this is to create the a temporary .keep file for that pack, so we use the same tricks that index-pack uses when its being invoked by receive-pack. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-16 07:15:31 +01:00			`char *idx_name;`
Print out the edge commits for each packfile in fast-import. To help callers repack very large repositories into a series of packfiles fast-import now outputs the last commits/tags it wrote to a packfile when it prints out the packfile name. This information can be feed to pack-objects --revs to repack. For the first pack of an initial import this is pretty easy (just feed those SHA1s on stdin) but for subsequent packs you want to feed the subsequent pack's final SHA1s but also all prior pack's SHA1s prefixed with the negation operator. This way the prior pack's data does not get included into the subsequent pack. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-16 22:18:44 +01:00			`int i;`
			`struct branch *b;`
			`struct tag *t;`
Use .keep files in fast-import during processing. Because fast-import automatically updates all references (heads and tags) at the end of its run the repository is corrupt unless the objects are available in the .git/objects/pack directory prior to the refs being modified. The easiest way to ensure that is true is to move the packfile and its associated index directly into the .git/objects/pack directory as soon as we have finished output to it. But the only safe way to do this is to create the a temporary .keep file for that pack, so we use the same tricks that index-pack uses when its being invoked by receive-pack. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-16 07:15:31 +01:00
Fix random fast-import errors when compiled with NO_MMAP fast-import was relying on the fact that on most systems mmap() and write() are synchronized by the filesystem's buffer cache. We were relying on the ability to mmap() 20 bytes beyond the current end of the file, then later fill in those bytes with a future write() call, then read them through the previously obtained mmap() address. This isn't always true with some implementations of NFS, but it is especially not true with our NO_MMAP=YesPlease build time option used on some platforms. If fast-import was built with NO_MMAP=YesPlease we used the malloc()+pread() emulation and the subsequent write() call does not update the trailing 20 bytes of a previously obtained "mmap()" (aka malloc'd) address. Under NO_MMAP that behavior causes unpack_entry() in sha1_file.c to be unable to read an object header (or data) that has been unlucky enough to be written to the packfile at a location such that it is in the trailing 20 bytes of a window previously opened on that same packfile. This bug has gone unnoticed for a very long time as it is highly data dependent. Not only does the object have to be placed at the right position, but it also needs to be positioned behind some other object that has been accessed due to a branch cache invalidation. In other words the stars had to align just right, and if you did run into this bug you probably should also have purchased a lottery ticket. Fortunately the workaround is a lot easier than the bug explanation. Before we allow unpack_entry() to read data from a pack window that has also (possibly) been modified through write() we force all existing windows on that packfile to be closed. By closing the windows we ensure that any new access via the emulated mmap() will reread the packfile, updating to the current file content. This comes at a slight performance degredation as we cannot reuse previously cached windows when we update the packfile. 
But it is a fairly minor difference as the window closes happen at only two points: - When the packfile is finalized and its .idx is generated: At this stage we are getting ready to update the refs and any data access into the packfile is going to be random, and is going after only the branch tips (to ensure they are valid). Our existing windows (if any) are not likely to be positioned at useful locations to access those final tip commits so we probably were closing them before anyway. - When the branch cache missed and we need to reload: At this point fast-import is getting change commands for the next commit and it needs to go re-read a tree object it previously had written out to the packfile. What windows we had (if any) are not likely to cover the tree in question so we probably were closing them before anyway. We do try to avoid unnecessarily closing windows in the second case by checking to see if the packfile size has increased since the last time we called unpack_entry() on that packfile. If the size has not changed then we have not written additional data, and any existing window is still vaild. This nicely handles the cases where fast-import is going through a branch cache reload and needs to read many trees at once. During such an event we are not likely to be updating the packfile so we do not cycle the windows between reads. With this change in place t9301-fast-export.sh (which was broken by c3b0dec509fe136c5417422f31898b5a4e2d5e02) finally works again. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-01-18 04:57:00 +01:00			`close_pack_windows(pack_data);`
fast-import: use sha1write() for pack data This is in preparation for using write_idx_file(). Also, by using sha1write() we get some buffering to reduces the number of write syscalls, and the written data is SHA1 summed which allows for the extra data integrity validation check performed in fixup_pack_header_footer() (details on this in commit abeb40e5aa). Signed-off-by: Nicolas Pitre <nico@fluxnic.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-02-17 20:05:52 +01:00			`sha1close(pack_file, cur_pack_sha1, 0);`
Create pack-write.c for common pack writing code Include a generalized fixup_pack_header_footer() in this new file. Needed by git-repack --max-pack-size feature in a later patchset. [sp: Moved close(pack_fd) to callers, to support index-pack, and changed name to better indicate it is for packfiles.] Signed-off-by: Dana L. How <danahow@gmail.com> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-05-02 18:13:14 +02:00			`fixup_pack_header_footer(pack_data->pack_fd, pack_data->sha1,`
improve reliability of fixup_pack_header_footer() Currently, this function has the potential to read corrupted pack data from disk and give it a valid SHA1 checksum. Let's add the ability to validate SHA1 checksum of existing data along the way, including before and after any arbitrary point in the pack. Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-08-29 22:07:59 +02:00			`pack_data->pack_name, object_count,`
fast-import: use sha1write() for pack data This is in preparation for using write_idx_file(). Also, by using sha1write() we get some buffering to reduces the number of write syscalls, and the written data is SHA1 summed which allows for the extra data integrity validation check performed in fixup_pack_header_footer() (details on this in commit abeb40e5aa). Signed-off-by: Nicolas Pitre <nico@fluxnic.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-02-17 20:05:52 +01:00			`cur_pack_sha1, pack_size);`
Create pack-write.c for common pack writing code Include a generalized fixup_pack_header_footer() in this new file. Needed by git-repack --max-pack-size feature in a later patchset. [sp: Moved close(pack_fd) to callers, to support index-pack, and changed name to better indicate it is for packfiles.] Signed-off-by: Dana L. How <danahow@gmail.com> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-05-02 18:13:14 +02:00			`close(pack_data->pack_fd);`
Use .keep files in fast-import during processing. Because fast-import automatically updates all references (heads and tags) at the end of its run the repository is corrupt unless the objects are available in the .git/objects/pack directory prior to the refs being modified. The easiest way to ensure that is true is to move the packfile and its associated index directly into the .git/objects/pack directory as soon as we have finished output to it. But the only safe way to do this is to create the a temporary .keep file for that pack, so we use the same tricks that index-pack uses when its being invoked by receive-pack. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-16 07:15:31 +01:00			`idx_name = keep_pack(create_index());`
Don't create a final empty packfile in fast-import. If the last packfile is going to be empty (has 0 objects) then it shouldn't be kept after the import has terminated, as there is no point to the packfile. So rather than hashing it and making the index file, just delete the packfile. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 12:39:39 +01:00
Fix typos / spelling in comments Signed-off-by: Mike Ralphson <mike@abacus.co.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-04-17 20:13:30 +02:00			`/* Register the packfile with core git's machinery. */`
Don't create a final empty packfile in fast-import. If the last packfile is going to be empty (has 0 objects) then it shouldn't be kept after the import has terminated, as there is no point to the packfile. So rather than hashing it and making the index file, just delete the packfile. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 12:39:39 +01:00			`new_p = add_packed_git(idx_name, strlen(idx_name), 1);`
			`if (!new_p)`
			`die("core git rejected index %s", idx_name);`
Print out the edge commits for each packfile in fast-import. To help callers repack very large repositories into a series of packfiles fast-import now outputs the last commits/tags it wrote to a packfile when it prints out the packfile name. This information can be feed to pack-objects --revs to repack. For the first pack of an initial import this is pretty easy (just feed those SHA1s on stdin) but for subsequent packs you want to feed the subsequent pack's final SHA1s but also all prior pack's SHA1s prefixed with the negation operator. This way the prior pack's data does not get included into the subsequent pack. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-16 22:18:44 +01:00			`all_packs[pack_id] = new_p;`
Don't create a final empty packfile in fast-import. If the last packfile is going to be empty (has 0 objects) then it shouldn't be kept after the import has terminated, as there is no point to the packfile. So rather than hashing it and making the index file, just delete the packfile. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 12:39:39 +01:00			`install_packed_git(new_p);`
Print out the edge commits for each packfile in fast-import. To help callers repack very large repositories into a series of packfiles fast-import now outputs the last commits/tags it wrote to a packfile when it prints out the packfile name. This information can be feed to pack-objects --revs to repack. For the first pack of an initial import this is pretty easy (just feed those SHA1s on stdin) but for subsequent packs you want to feed the subsequent pack's final SHA1s but also all prior pack's SHA1s prefixed with the negation operator. This way the prior pack's data does not get included into the subsequent pack. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-16 22:18:44 +01:00
			`/* Print the boundary */`
fast-import: Hide the pack boundary commits by default. Most users don't need the pack boundary information that fast-import was printing to standard output, especially if they were calling it with --quiet. Those users who do want this information probably want it captured so they can go back and use it to repack the imported repository. So dumping the boundary commits to a log file makes more sense then printing them to standard output. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-12 01:45:56 +01:00			`if (pack_edges) {`
			`fprintf(pack_edges, "%s:", new_p->pack_name);`
			`for (i = 0; i < branch_table_sz; i++) {`
			`for (b = branch_table[i]; b; b = b->table_next_branch) {`
			`if (b->pack_id == pack_id)`
			`fprintf(pack_edges, " %s", sha1_to_hex(b->sha1));`
			`}`
Print out the edge commits for each packfile in fast-import. To help callers repack very large repositories into a series of packfiles fast-import now outputs the last commits/tags it wrote to a packfile when it prints out the packfile name. This information can be feed to pack-objects --revs to repack. For the first pack of an initial import this is pretty easy (just feed those SHA1s on stdin) but for subsequent packs you want to feed the subsequent pack's final SHA1s but also all prior pack's SHA1s prefixed with the negation operator. This way the prior pack's data does not get included into the subsequent pack. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-16 22:18:44 +01:00			`}`
fast-import: Hide the pack boundary commits by default. Most users don't need the pack boundary information that fast-import was printing to standard output, especially if they were calling it with --quiet. Those users who do want this information probably want it captured so they can go back and use it to repack the imported repository. So dumping the boundary commits to a log file makes more sense then printing them to standard output. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-12 01:45:56 +01:00			`for (t = first_tag; t; t = t->next_tag) {`
			`if (t->pack_id == pack_id)`
			`fprintf(pack_edges, " %s", sha1_to_hex(t->sha1));`
			`}`
			`fputc('\n', pack_edges);`
			`fflush(pack_edges);`
Print out the edge commits for each packfile in fast-import. To help callers repack very large repositories into a series of packfiles fast-import now outputs the last commits/tags it wrote to a packfile when it prints out the packfile name. This information can be feed to pack-objects --revs to repack. For the first pack of an initial import this is pretty easy (just feed those SHA1s on stdin) but for subsequent packs you want to feed the subsequent pack's final SHA1s but also all prior pack's SHA1s prefixed with the negation operator. This way the prior pack's data does not get included into the subsequent pack. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-16 22:18:44 +01:00			`}`

			`pack_id++;`
Don't create a final empty packfile in fast-import. If the last packfile is going to be empty (has 0 objects) then it shouldn't be kept after the import has terminated, as there is no point to the packfile. So rather than hashing it and making the index file, just delete the packfile. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 12:39:39 +01:00			`}`
fast-import: close pack before unlinking it This is sort of a companion patch to 4723ee9(Close files opened by lock_file() before unlinking.): on Windows, you cannot delete what is still open. This makes test 9300-fast-import pass on Windows for me; quite a few fast-imports leave temporary packs until the test "blank lines not necessary after other commands" actually tests for the number of files in .git/objects/pack/, which has a few temporary packs now. I guess that 8b4eb6b(Do not perform cross-directory renames when creating packs) was "responsible" for the breakage. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-12-15 22:11:40 +01:00			`else {`
			`close(old_p->pack_fd);`
replace direct calls to unlink(2) with unlink_or_warn This helps to notice when something's going wrong, especially on systems which lock open files. I used the following criteria when selecting the code for replacement: - it was already printing a warning for the unlink failures - it is in a function which already printing something or is called from such a function - it is in a static function, returning void and the function is only called from a builtin main function (cmd_) - it is in a function which handles emergency exit (signal handlers) - it is in a function which is obvously cleaning up the lockfiles Signed-off-by: Alex Riesen <raa.lkml@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-04-29 23:22:56 +02:00			`unlink_or_warn(old_p->pack_name);`
fast-import: close pack before unlinking it This is sort of a companion patch to 4723ee9(Close files opened by lock_file() before unlinking.): on Windows, you cannot delete what is still open. This makes test 9300-fast-import pass on Windows for me; quite a few fast-imports leave temporary packs until the test "blank lines not necessary after other commands" actually tests for the number of files in .git/objects/pack/, which has a few temporary packs now. I guess that 8b4eb6b(Do not perform cross-directory renames when creating packs) was "responsible" for the breakage. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-12-15 22:11:40 +01:00			`}`
Implemented manual packfile switching in fast-import. To help importers which are dealing with massive amounts of data fast-import needs to be able to close the packfile it is currently writing to and open a new packfile for any additional data that will be received. A new 'checkpoint' command has been introduced which can be used by the frontend import process to force this to occur at any time. This may be useful to ensure a very long running import doesn't lose any work due to unexpected failures. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 12:35:41 +01:00			`free(old_p);`

			`/* We can't carry a delta across packfiles. */`
fast-import optimization: Now that cmd_data acts on a strbuf, make last_object stashed buffer be a strbuf as well. On new stash, don't free the last stashed buffer, rather swap it with the one you will stash, this way, callers of store_object can act on static strbufs, and at some point, fast-import won't allocate new memory for objects buffers. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-17 14:00:38 +02:00			`strbuf_release(&last_blob.data);`
Implemented manual packfile switching in fast-import. To help importers which are dealing with massive amounts of data fast-import needs to be able to close the packfile it is currently writing to and open a new packfile for any additional data that will be received. A new 'checkpoint' command has been introduced which can be used by the frontend import process to force this to occur at any time. This may be useful to ensure a very long running import doesn't lose any work due to unexpected failures. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 12:35:41 +01:00			`last_blob.offset = 0;`
			`last_blob.depth = 0;`
Restructure fast-import to support creating multiple packfiles. Now that we are starting to see some really large projects (such as KDE or a fork of FreeBSD) get imported into Git we're running into the upper limit on packfile object count as well as overall byte length. The KDE and FreeBSD projects are both likely to require more than 4 GiB to store their current history, which means we really need multiple packfiles to handle their content. This is a fairly simple restructuring of the internal code to help us support creating multiple packfiles from within fast-import. We are now adding a 5 digit incrementing suffix to the end of the basename supplied to us by the caller, permitting up to 99,999 packs to be generated in a single fast-import run. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 10:39:05 +01:00			`}`
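
The pack-edges log written by the code above holds one line per packfile: the pack name, a colon, then the tips of the branches and tags last written into that pack. The commit message for that feature describes feeding these names to `pack-objects --revs`, with the names from all earlier packs negated. A minimal sketch of that step, assuming the log lines have already been read into memory; `append_edges` and `build_revs_input` are hypothetical helpers for illustration and are not part of fast-import, and the fixed-size output buffer is a simplification.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Append every object name found on one pack-edges line
 * ("<pack name>: <name> <name> ...\n") to the NUL-terminated
 * string in out, one per line, each preceded by prefix.
 * Returns the number of names appended.
 */
static int append_edges(const char *line, const char *prefix,
			char *out, size_t outsz)
{
	/* Object names hold no ':', so the last colon ends the pack name. */
	const char *p = strrchr(line, ':');
	size_t used = strlen(out);
	int n = 0;

	if (!p)
		return 0;
	for (p++; ; n++) {
		size_t len;

		p += strspn(p, " \n");
		len = strcspn(p, " \n");
		if (!len || used + strlen(prefix) + len + 2 > outsz)
			break;	/* end of line, or the illustrative buffer is full */
		used += (size_t)snprintf(out + used, outsz - used, "%s%.*s\n",
					 prefix, (int)len, p);
		p += len;
	}
	return n;
}

/*
 * Build input for "pack-objects --revs" that selects pack number
 * target: its own edges as wanted tips, then the edges of every
 * earlier pack negated with '^'.
 * Returns the total number of names written to out.
 */
static int build_revs_input(const char **lines, int target,
			    char *out, size_t outsz)
{
	int i, n;

	assert(out && outsz > 0);
	out[0] = '\0';
	n = append_edges(lines[target], "", out, outsz);
	for (i = 0; i < target; i++)
		n += append_edges(lines[i], "^", out, outsz);
	return n;
}
```

For the first pack (`target` 0) the result is just that pack's names. For a later pack, the negated names keep the earlier packs' data out of the repacked result.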

Dump all refs and marks during a checkpoint in fast-import. If the frontend asks us to checkpoint (via the explicit checkpoint command) its probably because they are afraid the current import will crash/fail/whatever and want to make sure they can pickup from the last checkpoint. To do that sort of recovery, we will need the current tip of every branch and tag available at the next startup. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-07 08:42:44 +01:00			`static void cycle_packfile(void)`
Implemented automatic checkpoints within fast-import. When the number of objects or number of bytes gets close to the limit allowed by the packfile format (or configured on the command line by our caller) we should automatically checkpoint the current packfile and start a new one before writing the object out. This does however require that we abandon the delta (if we had one) as its not valid in a new packfile. I also added the simple rule that if we got a delta back but the delta itself is the same size as or larger than the uncompressed object to ignore the delta and just store the object data. This should avoid some really bad behavior caused by our current delta strategy. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 14:00:49 +01:00			`{`
			`end_packfile();`
			`start_packfile();`
			`}`
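
`store_object`, which follows, names each object by hashing a short header together with the object data. The header is the type name, a space, the decimal length, and a terminating NUL, which is why the listing adds 1 to the value returned by `sprintf`. A minimal sketch of just the header step; `format_object_header` is a hypothetical helper, and it uses `snprintf` with an explicit buffer size where the listing calls `sprintf` on a 96-byte buffer.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Format the header that is hashed ahead of an object's content:
 * "<type> <decimal length>" followed by a NUL byte.
 * Returns the header length including the NUL, or 0 if hdr is too small.
 */
static size_t format_object_header(char *hdr, size_t hdrsz,
				   const char *type, unsigned long len)
{
	int n;

	assert(hdr && type);
	n = snprintf(hdr, hdrsz, "%s %lu", type, len);
	if (n < 0 || (size_t)n >= hdrsz)
		return 0;
	return (size_t)n + 1;	/* count the terminating NUL as well */
}
```

Feeding exactly the returned number of header bytes to the hash, followed by the raw object data, reproduces the sequence of `git_SHA1_Update` calls in the listing.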

Refactored fast-import's internals for future additions. Too many globals variables were being used not not enough code was resuable to process trees and commits so this is a simple refactoring of the existing blob processing code to get into a state that will be easier to handle trees and commits in. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 06:46:13 +02:00			`static int store_object(`
			`enum object_type type,`
fast-import was using dbuf's, replace them with strbuf's. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-17 13:48:17 +02:00			`struct strbuf *dat,`
Implemented branch handling and basic tree support in fast-import. This provides the basic data structures needed to store trees in memory while we are processing them for a branch. What we are attempting to do is track one complete tree for each branch that the frontend has registered with us through the 'newb' (new_branch) command. When the frontend edits that tree through 'updf' or 'delf' commands we'll mark the affected tree(s) as being dirty and recompute their objects during 'comt' (commit). Currently the protocol is decidedly _not_ user friendly. I crashed fast-import by giving it bad input data from Perl. I may try to improve upon it, or at least upon its error handling. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 09:36:45 +02:00			`struct last_object *last,`
Added mark store/find to fast-import. Marks are now saved when the mark directive gets used by the frontend and may be used in place of a SHA1 expression to locate a previous SHA1 which fast-import may have generated. This is particularly useful with commits where the frontend does not (easily) have the ability to compute the SHA1 for an arbitrary commit but needs it to generate a branch or tag from that commit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-23 10:17:45 +02:00			`unsigned char *sha1out,`
Use uintmax_t for marks in fast-import. If a frontend wants to use a mark per file revision and per commit and is doing a truly huge import (such as a 32 GiB SVN repository) we may need more than 2**32 unique mark values, especially if the frontend is unable (or unwilling) to recycle mark values. For mark idnums we should use the largest unsigned integer type available, hoping that will be at least 64 bits when we are compiled as a 64 bit executable. This way we may consume huge amounts of memory storing our mark table, but we'll at least be able to process the entire import without failing. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-16 06:33:19 +01:00			`uintmax_t mark)`
Created fast-import, a tool to quickly generating a pack from blobs. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-05 08:04:21 +02:00			`{`
			`void *out, *delta;`
Refactored fast-import's internals for future additions. Too many globals variables were being used not not enough code was resuable to process trees and commits so this is a simple refactoring of the existing blob processing code to get into a state that will be easier to handle trees and commits in. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 06:46:13 +02:00			`struct object_entry *e;`
			`unsigned char hdr[96];`
			`unsigned char sha1[20];`
Created fast-import, a tool to quickly generating a pack from blobs. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-05 08:04:21 +02:00			`unsigned long hdrlen, deltalen;`
fix openssl headers conflicting with custom SHA1 implementations On ARM I have the following compilation errors: CC fast-import.o In file included from cache.h:8, from builtin.h:6, from fast-import.c:142: arm/sha1.h:14: error: conflicting types for 'SHA_CTX' /usr/include/openssl/sha.h:105: error: previous declaration of 'SHA_CTX' was here arm/sha1.h:16: error: conflicting types for 'SHA1_Init' /usr/include/openssl/sha.h:115: error: previous declaration of 'SHA1_Init' was here arm/sha1.h:17: error: conflicting types for 'SHA1_Update' /usr/include/openssl/sha.h:116: error: previous declaration of 'SHA1_Update' was here arm/sha1.h:18: error: conflicting types for 'SHA1_Final' /usr/include/openssl/sha.h:117: error: previous declaration of 'SHA1_Final' was here make: *** [fast-import.o] Error 1 This is because openssl header files are always included in git-compat-util.h since commit 684ec6c63c whenever NO_OPENSSL is not set, which somehow brings in <openssl/sha1.h> clashing with the custom ARM version. Compilation of git is probably broken on PPC too for the same reason. Turns out that the only file requiring openssl/ssl.h and openssl/err.h is imap-send.c. But only moving those problematic includes there doesn't solve the issue as it also includes cache.h which brings in the conflicting local SHA1 header file. As suggested by Jeff King, the best solution is to rename our references to SHA1 functions and structure to something git specific, and define those according to the implementation used. Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2008-10-01 20:05:20 +02:00			`git_SHA_CTX c;`
zlib: zlib can only process 4GB at a time The size of objects we read from the repository and data we try to put into the repository are represented in "unsigned long", so that on larger architectures we can handle objects that weigh more than 4GB. But the interface defined in zlib.h to communicate with inflate/deflate limits avail_in (how many bytes of input are we calling zlib with) and avail_out (how many bytes of output from zlib are we ready to accept) fields effectively to 4GB by defining their type to be uInt. In many places in our code, we allocate a large buffer (e.g. mmap'ing a large loose object file) and tell zlib its size by assigning the size to avail_in field of the stream, but that will truncate the high octets of the real size. The worst part of this story is that we often pass around z_stream (the state object used by zlib) to keep track of the number of used bytes in input/output buffer by inspecting these two fields, which practically limits our callchain to the same 4GB limit. Wrap z_stream in another structure git_zstream that can express avail_in and avail_out in unsigned long. For now, just die() when the caller gives a size that cannot be given to a single zlib call. In later patches in the series, we would make git_inflate() and git_deflate() internally loop to give callers an illusion that our "improved" version of zlib interface can operate on a buffer larger than 4GB in one go. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-06-10 20:52:15 +02:00			`git_zstream s;`
Refactored fast-import's internals for future additions. Too many globals variables were being used not not enough code was resuable to process trees and commits so this is a simple refactoring of the existing blob processing code to get into a state that will be easier to handle trees and commits in. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 06:46:13 +02:00
Fix a bunch of pointer declarations (codestyle) Essentially; s/type* /type */ as per the coding guidelines. Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-05-01 11:06:36 +02:00			`hdrlen = sprintf((char *)hdr,"%s %lu", typename(type),`
fast-import was using dbuf's, replace them with strbuf's. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-17 13:48:17 +02:00			`(unsigned long)dat->len) + 1;`
fix openssl headers conflicting with custom SHA1 implementations On ARM I have the following compilation errors: CC fast-import.o In file included from cache.h:8, from builtin.h:6, from fast-import.c:142: arm/sha1.h:14: error: conflicting types for 'SHA_CTX' /usr/include/openssl/sha.h:105: error: previous declaration of 'SHA_CTX' was here arm/sha1.h:16: error: conflicting types for 'SHA1_Init' /usr/include/openssl/sha.h:115: error: previous declaration of 'SHA1_Init' was here arm/sha1.h:17: error: conflicting types for 'SHA1_Update' /usr/include/openssl/sha.h:116: error: previous declaration of 'SHA1_Update' was here arm/sha1.h:18: error: conflicting types for 'SHA1_Final' /usr/include/openssl/sha.h:117: error: previous declaration of 'SHA1_Final' was here make: *** [fast-import.o] Error 1 This is because openssl header files are always included in git-compat-util.h since commit 684ec6c63c whenever NO_OPENSSL is not set, which somehow brings in <openssl/sha1.h> clashing with the custom ARM version. Compilation of git is probably broken on PPC too for the same reason. Turns out that the only file requiring openssl/ssl.h and openssl/err.h is imap-send.c. But only moving those problematic includes there doesn't solve the issue as it also includes cache.h which brings in the conflicting local SHA1 header file. As suggested by Jeff King, the best solution is to rename our references to SHA1 functions and structure to something git specific, and define those according to the implementation used. Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2008-10-01 20:05:20 +02:00			`git_SHA1_Init(&c);`
			`git_SHA1_Update(&c, hdr, hdrlen);`
			`git_SHA1_Update(&c, dat->buf, dat->len);`
			`git_SHA1_Final(sha1, &c);`
Implemented branch handling and basic tree support in fast-import. This provides the basic data structures needed to store trees in memory while we are processing them for a branch. What we are attempting to do is track one complete tree for each branch that the frontend has registered with us through the 'newb' (new_branch) command. When the frontend edits that tree through 'updf' or 'delf' commands we'll mark the affected tree(s) as being dirty and recompute their objects during 'comt' (commit). Currently the protocol is decidedly _not_ user friendly. I crashed fast-import by giving it bad input data from Perl. I may try to improve upon it, or at least upon its error handling. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 09:36:45 +02:00			`if (sha1out)`
Converted hash memcpy/memcmp to new hashcpy/hashcmp/hashclr. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-28 16:46:58 +02:00			`hashcpy(sha1out, sha1);`
Refactored fast-import's internals for future additions. Too many globals variables were being used not not enough code was resuable to process trees and commits so this is a simple refactoring of the existing blob processing code to get into a state that will be easier to handle trees and commits in. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 06:46:13 +02:00
			`e = insert_object(sha1);`
Added mark store/find to fast-import. Marks are now saved when the mark directive gets used by the frontend and may be used in place of a SHA1 expression to locate a previous SHA1 which fast-import may have generated. This is particularly useful with commits where the frontend does not (easily) have the ability to compute the SHA1 for an arbitrary commit but needs it to generate a branch or tag from that commit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-23 10:17:45 +02:00			`if (mark)`
			`insert_mark(mark, e);`
fast-import: start using struct pack_idx_entry This is in preparation for using write_idx_file(). Signed-off-by: Nicolas Pitre <nico@fluxnic.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-02-17 20:05:51 +01:00			`if (e->idx.offset) {`
Added basic command handler to fast-import. Moved the new_blob logic off into a new subroutine and invoked it when getting the 'blob' command. Added statistics dump to STDERR when the program terminates listing what it did at a high level. This is somewhat interesting. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 07:14:21 +02:00			`duplicate_count_by_type[type]++;`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`return 1;`
Don't repack existing objects in fast-import Some users of fast-import have been trying to use it to rewrite commits and trees, an activity where the all of the relevant blobs are already available from the existing packfiles. In such a case we don't want to repack a blob, even if the frontend application has supplied us the raw data rather than a mark or a SHA-1 name. I'm intentionally only checking the packfiles that existed when fast-import started and am always ignoring all loose object files. We ignore loose objects because fast-import tends to operate on a very large number of objects in a very short timespan, and it is usually creating new objects, not reusing existing ones. In such a situtation the majority of the objects will not be found in the existing packfiles, nor will they be loose object files. If the frontend application really wants us to look at loose object files, then they can just repack the repository before running fast-import. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-04-20 17:23:45 +02:00			`} else if (find_sha1_pack(sha1, packed_git)) {`
			`e->type = type;`
			`e->pack_id = MAX_PACK_ID;`
fast-import: start using struct pack_idx_entry This is in preparation for using write_idx_file(). Signed-off-by: Nicolas Pitre <nico@fluxnic.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-02-17 20:05:51 +01:00			`e->idx.offset = 1; /* just not zero! */`
Don't repack existing objects in fast-import Some users of fast-import have been trying to use it to rewrite commits and trees, an activity where the all of the relevant blobs are already available from the existing packfiles. In such a case we don't want to repack a blob, even if the frontend application has supplied us the raw data rather than a mark or a SHA-1 name. I'm intentionally only checking the packfiles that existed when fast-import started and am always ignoring all loose object files. We ignore loose objects because fast-import tends to operate on a very large number of objects in a very short timespan, and it is usually creating new objects, not reusing existing ones. In such a situtation the majority of the objects will not be found in the existing packfiles, nor will they be loose object files. If the frontend application really wants us to look at loose object files, then they can just repack the repository before running fast-import. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-04-20 17:23:45 +02:00			`duplicate_count_by_type[type]++;`
			`return 1;`
Refactored fast-import's internals for future additions. Too many globals variables were being used not not enough code was resuable to process trees and commits so this is a simple refactoring of the existing blob processing code to get into a state that will be easier to handle trees and commits in. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 06:46:13 +02:00			`}`
Created fast-import, a tool to quickly generating a pack from blobs. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-05 08:04:21 +02:00
fast-import: use the diff_delta() max_delta_size argument This let diff_delta() abort early if it is going to bust the given size limit. Also, only objects larger than 20 bytes are considered as objects smaller than that are most certainly going to produce larger deltas than the original object due to the additional headers. Signed-off-by: Nicolas Pitre <nico@fluxnic.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-02-17 20:05:56 +01:00			`if (last && last->data.buf && last->depth < max_depth && dat->len > 20) {`
fast-import: count and report # of calls to diff_delta in stats It's an interesting number, how often do we try to deltify each type of objects and how often do we succeed. So do add it to stats. Success doesn't mean much gain in pack size though. As we allow delta to be as big as (data.len - 20). And delta close to data.len gains nothing compared to no delta at all even after zlib compression (delta is pretty much the same as data, just with few modifications). We should try to make less attempts that result in huge deltas as these consume more cpu than trivial small deltas. Either by choosing a better delta base or reducing delta size upper bound or doing less delta attempts at all. Currently, delta base for blobs is a waste literally. Each blob delta base is chosen as a previously stored blob. Disabling deltas for blobs doesn't increase pack size and reduce import time, or at least doesn't increase time for all fast-import streams I've tried. Signed-off-by: Dmitry Ivankov <divanorama@gmail.com> Acked-by: David Barr <davidbarr@google.com> Acked-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-08-20 21:04:11 +02:00			`delta_count_attempts_by_type[type]++;`
fast-import optimization: Now that cmd_data acts on a strbuf, make last_object stashed buffer be a strbuf as well. On new stash, don't free the last stashed buffer, rather swap it with the one you will stash, this way, callers of store_object can act on static strbufs, and at some point, fast-import won't allocate new memory for objects buffers. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-17 14:00:38 +02:00			`delta = diff_delta(last->data.buf, last->data.len,`
fast-import was using dbuf's, replace them with strbuf's. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-17 13:48:17 +02:00			`dat->buf, dat->len,`
fast-import: use the diff_delta() max_delta_size argument This let diff_delta() abort early if it is going to bust the given size limit. Also, only objects larger than 20 bytes are considered as objects smaller than that are most certainly going to produce larger deltas than the original object due to the additional headers. Signed-off-by: Nicolas Pitre <nico@fluxnic.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-02-17 20:05:56 +01:00			`&deltalen, dat->len - 20);`
Implemented automatic checkpoints within fast-import. When the number of objects or number of bytes gets close to the limit allowed by the packfile format (or configured on the command line by our caller) we should automatically checkpoint the current packfile and start a new one before writing the object out. This does however require that we abandon the delta (if we had one) as its not valid in a new packfile. I also added the simple rule that if we got a delta back but the delta itself is the same size as or larger than the uncompressed object to ignore the delta and just store the object data. This should avoid some really bad behavior caused by our current delta strategy. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 14:00:49 +01:00			`} else`
			`delta = NULL;`
Created fast-import, a tool to quickly generating a pack from blobs. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-05 08:04:21 +02:00
			`memset(&s, 0, sizeof(s));`
zlib: wrap deflate side of the API Wrap deflateInit, deflate, and deflateEnd for everybody, and the sole use of deflateInit2 in remote-curl.c to tell the library to use gzip header and trailer in git_deflate_init_gzip(). There is only one caller that cares about the status from deflateEnd(). Introduce git_deflate_end_gently() to let that sole caller retrieve the status and act on it (i.e. die) for now, but we would probably want to make inflate_end/deflate_end die when they ran out of memory and get rid of the _gently() kind. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-06-10 19:55:10 +02:00			`git_deflate_init(&s, pack_compression_level);`
Implemented automatic checkpoints within fast-import. When the number of objects or number of bytes gets close to the limit allowed by the packfile format (or configured on the command line by our caller) we should automatically checkpoint the current packfile and start a new one before writing the object out. This does however require that we abandon the delta (if we had one) as its not valid in a new packfile. I also added the simple rule that if we got a delta back but the delta itself is the same size as or larger than the uncompressed object to ignore the delta and just store the object data. This should avoid some really bad behavior caused by our current delta strategy. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 14:00:49 +01:00			`if (delta) {`
			`s.next_in = delta;`
			`s.avail_in = deltalen;`
			`} else {`
fast-import was using dbuf's, replace them with strbuf's. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-17 13:48:17 +02:00			`s.next_in = (void *)dat->buf;`
			`s.avail_in = dat->len;`
Implemented automatic checkpoints within fast-import. When the number of objects or number of bytes gets close to the limit allowed by the packfile format (or configured on the command line by our caller) we should automatically checkpoint the current packfile and start a new one before writing the object out. This does however require that we abandon the delta (if we had one) as its not valid in a new packfile. I also added the simple rule that if we got a delta back but the delta itself is the same size as or larger than the uncompressed object to ignore the delta and just store the object data. This should avoid some really bad behavior caused by our current delta strategy. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 14:00:49 +01:00			`}`
zlib: wrap deflateBound() too Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-06-10 20:18:17 +02:00			`s.avail_out = git_deflate_bound(&s, s.avail_in);`
Implemented automatic checkpoints within fast-import. When the number of objects or number of bytes gets close to the limit allowed by the packfile format (or configured on the command line by our caller) we should automatically checkpoint the current packfile and start a new one before writing the object out. This does however require that we abandon the delta (if we had one) as its not valid in a new packfile. I also added the simple rule that if we got a delta back but the delta itself is the same size as or larger than the uncompressed object to ignore the delta and just store the object data. This should avoid some really bad behavior caused by our current delta strategy. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 14:00:49 +01:00			`s.next_out = out = xmalloc(s.avail_out);`
zlib: wrap deflate side of the API Wrap deflateInit, deflate, and deflateEnd for everybody, and the sole use of deflateInit2 in remote-curl.c to tell the library to use gzip header and trailer in git_deflate_init_gzip(). There is only one caller that cares about the status from deflateEnd(). Introduce git_deflate_end_gently() to let that sole caller retrieve the status and act on it (i.e. die) for now, but we would probably want to make inflate_end/deflate_end die when they ran out of memory and get rid of the _gently() kind. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-06-10 19:55:10 +02:00			`while (git_deflate(&s, Z_FINISH) == Z_OK)`
			`; /* nothing */`
			`git_deflate_end(&s);`
Implemented automatic checkpoints within fast-import. When the number of objects or number of bytes gets close to the limit allowed by the packfile format (or configured on the command line by our caller) we should automatically checkpoint the current packfile and start a new one before writing the object out. This does however require that we abandon the delta (if we had one) as its not valid in a new packfile. I also added the simple rule that if we got a delta back but the delta itself is the same size as or larger than the uncompressed object to ignore the delta and just store the object data. This should avoid some really bad behavior caused by our current delta strategy. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 14:00:49 +01:00
			`/* Determine if we should auto-checkpoint. */`
fast-import: make default pack size unlimited Now that fast-import is creating packs with index version 2, there is no point limiting the pack size by default. A pack split will still happen if off_t is not sufficiently large to hold large offsets. While updating the doc, let's remove the "packfiles fit on CDs" suggestion. Pack files created by fast-import are still suboptimal and a 'git repack -a -f -d' or even 'git gc --aggressive' would be a pretty good idea before considering storage on CDs. Signed-off-by: Nicolas Pitre <nico@fluxnic.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-02-17 20:05:54 +01:00			`if ((max_packsize && (pack_size + 60 + s.total_out) > max_packsize)`
Implemented automatic checkpoints within fast-import. When the number of objects or number of bytes gets close to the limit allowed by the packfile format (or configured on the command line by our caller) we should automatically checkpoint the current packfile and start a new one before writing the object out. This does however require that we abandon the delta (if we had one) as its not valid in a new packfile. I also added the simple rule that if we got a delta back but the delta itself is the same size as or larger than the uncompressed object to ignore the delta and just store the object data. This should avoid some really bad behavior caused by our current delta strategy. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 14:00:49 +01:00			`\|\| (pack_size + 60 + s.total_out) < pack_size) {`

			`/* This new object needs to not have the current pack_id. */`
			`e->pack_id = pack_id + 1;`
Dump all refs and marks during a checkpoint in fast-import. If the frontend asks us to checkpoint (via the explicit checkpoint command) its probably because they are afraid the current import will crash/fail/whatever and want to make sure they can pickup from the last checkpoint. To do that sort of recovery, we will need the current tip of every branch and tag available at the next startup. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-07 08:42:44 +01:00			`cycle_packfile();`
Implemented automatic checkpoints within fast-import. When the number of objects or number of bytes gets close to the limit allowed by the packfile format (or configured on the command line by our caller) we should automatically checkpoint the current packfile and start a new one before writing the object out. This does however require that we abandon the delta (if we had one) as its not valid in a new packfile. I also added the simple rule that if we got a delta back but the delta itself is the same size as or larger than the uncompressed object to ignore the delta and just store the object data. This should avoid some really bad behavior caused by our current delta strategy. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 14:00:49 +01:00
			`/* We cannot carry a delta into the new pack. */`
			`if (delta) {`
			`free(delta);`
			`delta = NULL;`
Corrected buffer overflow during automatic checkpoint in fast-import. If we previously were using a delta but we needed to checkpoint the current packfile and switch to a new packfile we need to throw away the delta and compress the raw object by itself, as delta chains cannot span non-thin packfiles. Unfortunately the output buffer in this case needs to grow, as the size of the compressed object may be quite a bit larger than the size of the compressed delta. I've also avoided recompressing the object if we are checkpointing and we didn't use a delta. In this case the output buffer is the correct size and has already been populated with the right data, we just need to close out the current packfile and open a new one. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-16 05:40:27 +01:00
			`memset(&s, 0, sizeof(s));`
zlib: wrap deflate side of the API Wrap deflateInit, deflate, and deflateEnd for everybody, and the sole use of deflateInit2 in remote-curl.c to tell the library to use gzip header and trailer in git_deflate_init_gzip(). There is only one caller that cares about the status from deflateEnd(). Introduce git_deflate_end_gently() to let that sole caller retrieve the status and act on it (i.e. die) for now, but we would probably want to make inflate_end/deflate_end die when they ran out of memory and get rid of the _gently() kind. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-06-10 19:55:10 +02:00			`git_deflate_init(&s, pack_compression_level);`
fast-import was using dbuf's, replace them with strbuf's. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-17 13:48:17 +02:00			`s.next_in = (void *)dat->buf;`
			`s.avail_in = dat->len;`
zlib: wrap deflateBound() too Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-06-10 20:18:17 +02:00			`s.avail_out = git_deflate_bound(&s, s.avail_in);`
Corrected buffer overflow during automatic checkpoint in fast-import. If we previously were using a delta but we needed to checkpoint the current packfile and switch to a new packfile we need to throw away the delta and compress the raw object by itself, as delta chains cannot span non-thin packfiles. Unfortunately the output buffer in this case needs to grow, as the size of the compressed object may be quite a bit larger than the size of the compressed delta. I've also avoided recompressing the object if we are checkpointing and we didn't use a delta. In this case the output buffer is the correct size and has already been populated with the right data, we just need to close out the current packfile and open a new one. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-16 05:40:27 +01:00			`s.next_out = out = xrealloc(out, s.avail_out);`
zlib: wrap deflate side of the API Wrap deflateInit, deflate, and deflateEnd for everybody, and the sole use of deflateInit2 in remote-curl.c to tell the library to use gzip header and trailer in git_deflate_init_gzip(). There is only one caller that cares about the status from deflateEnd(). Introduce git_deflate_end_gently() to let that sole caller retrieve the status and act on it (i.e. die) for now, but we would probably want to make inflate_end/deflate_end die when they ran out of memory and get rid of the _gently() kind. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-06-10 19:55:10 +02:00			`while (git_deflate(&s, Z_FINISH) == Z_OK)`
			`; /* nothing */`
			`git_deflate_end(&s);`
Implemented automatic checkpoints within fast-import. When the number of objects or number of bytes gets close to the limit allowed by the packfile format (or configured on the command line by our caller) we should automatically checkpoint the current packfile and start a new one before writing the object out. This does however require that we abandon the delta (if we had one) as its not valid in a new packfile. I also added the simple rule that if we got a delta back but the delta itself is the same size as or larger than the uncompressed object to ignore the delta and just store the object data. This should avoid some really bad behavior caused by our current delta strategy. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 14:00:49 +01:00			`}`
			`}`

			`e->type = type;`
			`e->pack_id = pack_id;`
fast-import: start using struct pack_idx_entry This is in preparation for using write_idx_file(). Signed-off-by: Nicolas Pitre <nico@fluxnic.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-02-17 20:05:51 +01:00			`e->idx.offset = pack_size;`
Implemented automatic checkpoints within fast-import. When the number of objects or number of bytes gets close to the limit allowed by the packfile format (or configured on the command line by our caller) we should automatically checkpoint the current packfile and start a new one before writing the object out. This does however require that we abandon the delta (if we had one) as its not valid in a new packfile. I also added the simple rule that if we got a delta back but the delta itself is the same size as or larger than the uncompressed object to ignore the delta and just store the object data. This should avoid some really bad behavior caused by our current delta strategy. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 14:00:49 +01:00			`object_count++;`
			`object_count_by_type[type]++;`
Created fast-import, a tool to quickly generating a pack from blobs. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-05 08:04:21 +02:00
fast-import: use write_idx_file() instead of custom code This allows for the creation of pack index version 2 with its object CRC and the possibility for a pack to be larger than 4 GB. Signed-off-by: Nicolas Pitre <nico@fluxnic.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-02-17 20:05:53 +01:00			`crc32_begin(pack_file);`

Created fast-import, a tool to quickly generating a pack from blobs. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-05 08:04:21 +02:00			`if (delta) {`
fast-import: make default pack size unlimited Now that fast-import is creating packs with index version 2, there is no point limiting the pack size by default. A pack split will still happen if off_t is not sufficiently large to hold large offsets. While updating the doc, let's remove the "packfiles fit on CDs" suggestion. Pack files created by fast-import are still suboptimal and a 'git repack -a -f -d' or even 'git gc --aggressive' would be a pretty good idea before considering storage on CDs. Signed-off-by: Nicolas Pitre <nico@fluxnic.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-02-17 20:05:54 +01:00			`off_t ofs = e->idx.offset - last->offset;`
Improve reuse of sha1_file library within fast-import. Now that the sha1_file.c library routines use the sliding mmap routines to perform efficient access to portions of a packfile I can remove that code from fast-import.c and just invoke it. One benefit is we now have reloading support for any packfile which uses OBJ_OFS_DELTA. Another is we have significantly less code to maintain. This code reuse change requires that fast-import generate only an OBJ_OFS_DELTA format packfile, as there is absolutely no index available to perform OBJ_REF_DELTA lookup in while unpacking an object. This is probably reasonable to require as the delta offsets result in smaller packfiles and are faster to unpack, as no index searching is required. Its also only a temporary requirement as users could always repack without offsets before making the import available to older versions of Git. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-14 12:20:23 +01:00			`unsigned pos = sizeof(hdr) - 1;`

Implemented tree delta compression in fast-import. We now store for every tree entry two modes and two sha1 values; the base (aka "version 0") and the current/new (aka "version 1"). When we generate a tree object we also regenerate the prior version object and use that as our base object for a delta. This strategy saves a significant amount of memory as we can continue to use the atom pool for file/directory names and only increases each tree entry by an additional 24 bytes of memory. Branches should automatically delta against their ancestor tree, unless the ancestor tree is already at the delta chain limit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-28 18:22:50 +02:00			`delta_count_by_type[type]++;`
Don't allow fast-import tree delta chains to exceed maximum depth Brian Downing noticed fast-import can produce tree depths of up to 6,035 objects and even deeper. Long delta chains can create very small packfiles but cause problems during repacking as git needs to unpack each tree to count the reachable blobs. What's happening here is the active branch cache isn't big enough. We're swapping out the branch and thus recycling the tree information (struct tree_content) back into the free pool. When we later reload the tree we set the delta_depth to 0 but we kept the tree we just reloaded as a delta base. So if the tree we reloaded was already at the maximum depth we wouldn't know it and make the new tree a delta. Multiply the number of times the branch cache has to swap out the tree times max_depth (10) and you get the maximum delta depth of a tree created by fast-import. In Brian's case above the active branch cache had to swap the branch out 603/604 times during this import to produce a tree with a delta depth of 6035. Acked-by: Brian Downing <bdowning@lavos.net> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-14 05:48:42 +01:00			`e->depth = last->depth + 1;`
Improve reuse of sha1_file library within fast-import. Now that the sha1_file.c library routines use the sliding mmap routines to perform efficient access to portions of a packfile I can remove that code from fast-import.c and just invoke it. One benefit is we now have reloading support for any packfile which uses OBJ_OFS_DELTA. Another is we have significantly less code to maintain. This code reuse change requires that fast-import generate only an OBJ_OFS_DELTA format packfile, as there is absolutely no index available to perform OBJ_REF_DELTA lookup in while unpacking an object. This is probably reasonable to require as the delta offsets result in smaller packfiles and are faster to unpack, as no index searching is required. Its also only a temporary requirement as users could always repack without offsets before making the import available to older versions of Git. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-14 12:20:23 +01:00
refactor duplicated encode_header in pack-objects and fast-import The following function is duplicated: encode_header Move this function to sha1_file.c and rename it 'encode_in_pack_object_header', as suggested by Junio C Hamano Signed-off-by: Michael Lukashov <michael.lukashov@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-02-17 00:42:54 +01:00			`hdrlen = encode_in_pack_object_header(OBJ_OFS_DELTA, deltalen, hdr);`
fast-import: use sha1write() for pack data This is in preparation for using write_idx_file(). Also, by using sha1write() we get some buffering to reduces the number of write syscalls, and the written data is SHA1 summed which allows for the extra data integrity validation check performed in fixup_pack_header_footer() (details on this in commit abeb40e5aa). Signed-off-by: Nicolas Pitre <nico@fluxnic.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-02-17 20:05:52 +01:00			`sha1write(pack_file, hdr, hdrlen);`
Improve reuse of sha1_file library within fast-import. Now that the sha1_file.c library routines use the sliding mmap routines to perform efficient access to portions of a packfile I can remove that code from fast-import.c and just invoke it. One benefit is we now have reloading support for any packfile which uses OBJ_OFS_DELTA. Another is we have significantly less code to maintain. This code reuse change requires that fast-import generate only an OBJ_OFS_DELTA format packfile, as there is absolutely no index available to perform OBJ_REF_DELTA lookup in while unpacking an object. This is probably reasonable to require as the delta offsets result in smaller packfiles and are faster to unpack, as no index searching is required. Its also only a temporary requirement as users could always repack without offsets before making the import available to older versions of Git. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-14 12:20:23 +01:00			`pack_size += hdrlen;`

			`hdr[pos] = ofs & 127;`
			`while (ofs >>= 7)`
			`hdr[--pos] = 128 \| (--ofs & 127);`
fast-import: use sha1write() for pack data This is in preparation for using write_idx_file(). Also, by using sha1write() we get some buffering to reduces the number of write syscalls, and the written data is SHA1 summed which allows for the extra data integrity validation check performed in fixup_pack_header_footer() (details on this in commit abeb40e5aa). Signed-off-by: Nicolas Pitre <nico@fluxnic.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-02-17 20:05:52 +01:00			`sha1write(pack_file, hdr + pos, sizeof(hdr) - pos);`
Improve reuse of sha1_file library within fast-import. Now that the sha1_file.c library routines use the sliding mmap routines to perform efficient access to portions of a packfile I can remove that code from fast-import.c and just invoke it. One benefit is we now have reloading support for any packfile which uses OBJ_OFS_DELTA. Another is we have significantly less code to maintain. This code reuse change requires that fast-import generate only an OBJ_OFS_DELTA format packfile, as there is absolutely no index available to perform OBJ_REF_DELTA lookup in while unpacking an object. This is probably reasonable to require as the delta offsets result in smaller packfiles and are faster to unpack, as no index searching is required. Its also only a temporary requirement as users could always repack without offsets before making the import available to older versions of Git. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-14 12:20:23 +01:00			`pack_size += sizeof(hdr) - pos;`
Created fast-import, a tool to quickly generating a pack from blobs. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-05 08:04:21 +02:00			`} else {`
Don't allow fast-import tree delta chains to exceed maximum depth Brian Downing noticed fast-import can produce tree depths of up to 6,035 objects and even deeper. Long delta chains can create very small packfiles but cause problems during repacking as git needs to unpack each tree to count the reachable blobs. What's happening here is the active branch cache isn't big enough. We're swapping out the branch and thus recycling the tree information (struct tree_content) back into the free pool. When we later reload the tree we set the delta_depth to 0 but we kept the tree we just reloaded as a delta base. So if the tree we reloaded was already at the maximum depth we wouldn't know it and make the new tree a delta. Multiply the number of times the branch cache has to swap out the tree times max_depth (10) and you get the maximum delta depth of a tree created by fast-import. In Brian's case above the active branch cache had to swap the branch out 603/604 times during this import to produce a tree with a delta depth of 6035. Acked-by: Brian Downing <bdowning@lavos.net> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-14 05:48:42 +01:00			`e->depth = 0;`
refactor duplicated encode_header in pack-objects and fast-import The following function is duplicated: encode_header Move this function to sha1_file.c and rename it 'encode_in_pack_object_header', as suggested by Junio C Hamano Signed-off-by: Michael Lukashov <michael.lukashov@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-02-17 00:42:54 +01:00			`hdrlen = encode_in_pack_object_header(type, dat->len, hdr);`
fast-import: use sha1write() for pack data This is in preparation for using write_idx_file(). Also, by using sha1write() we get some buffering to reduces the number of write syscalls, and the written data is SHA1 summed which allows for the extra data integrity validation check performed in fixup_pack_header_footer() (details on this in commit abeb40e5aa). Signed-off-by: Nicolas Pitre <nico@fluxnic.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-02-17 20:05:52 +01:00			`sha1write(pack_file, hdr, hdrlen);`
Implemented tree reloading in fast-import. Tree reloading allows fast-import to swap out the least-recently used branch by simply deallocating the data structures from memory that were associated with that branch. Later if the branch becomes active again it can lazily recreate those structures on demand by reloading the necessary trees from the pack file it originally wrote them to. The reloading process is implemented by mmap'ing the pack into memory and using a much tighter variant of the pack reading code contained in sha1_file.c. This was a blatent copy from sha1_file.c but the unpacking functions were significantly simplified and are actually now in a form that should make it easier to map only the necessary regions of a pack rather than the entire file. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-24 10:37:35 +02:00			`pack_size += hdrlen;`
Created fast-import, a tool to quickly generating a pack from blobs. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-05 08:04:21 +02:00			`}`

fast-import: use sha1write() for pack data This is in preparation for using write_idx_file(). Also, by using sha1write() we get some buffering to reduces the number of write syscalls, and the written data is SHA1 summed which allows for the extra data integrity validation check performed in fixup_pack_header_footer() (details on this in commit abeb40e5aa). Signed-off-by: Nicolas Pitre <nico@fluxnic.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-02-17 20:05:52 +01:00			`sha1write(pack_file, out, s.total_out);`
Implemented tree reloading in fast-import. Tree reloading allows fast-import to swap out the least-recently used branch by simply deallocating the data structures from memory that were associated with that branch. Later if the branch becomes active again it can lazily recreate those structures on demand by reloading the necessary trees from the pack file it originally wrote them to. The reloading process is implemented by mmap'ing the pack into memory and using a much tighter variant of the pack reading code contained in sha1_file.c. This was a blatent copy from sha1_file.c but the unpacking functions were significantly simplified and are actually now in a form that should make it easier to map only the necessary regions of a pack rather than the entire file. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-24 10:37:35 +02:00			`pack_size += s.total_out;`
Created fast-import, a tool to quickly generating a pack from blobs. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-05 08:04:21 +02:00
fast-import: use write_idx_file() instead of custom code This allows for the creation of pack index version 2 with its object CRC and the possibility for a pack to be larger than 4 GB. Signed-off-by: Nicolas Pitre <nico@fluxnic.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-02-17 20:05:53 +01:00			`e->idx.crc32 = crc32_end(pack_file);`

Created fast-import, a tool to quickly generating a pack from blobs. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-05 08:04:21 +02:00			`free(out);`
Remove unnecessary null pointer checks in fast-import. There is no need to check for a NULL pointer before invoking free(), the runtime library automatically performs this check anyway and does nothing if a NULL pointer is supplied. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-06 18:05:51 +01:00			`free(delta);`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`if (last) {`
fast-import optimization: Now that cmd_data acts on a strbuf, make last_object stashed buffer be a strbuf as well. On new stash, don't free the last stashed buffer, rather swap it with the one you will stash, this way, callers of store_object can act on static strbufs, and at some point, fast-import won't allocate new memory for objects buffers. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-17 14:00:38 +02:00			`if (last->no_swap) {`
			`last->data = *dat;`
			`} else {`
strbuf API additions and enhancements. Add strbuf_remove, change strbuf_insert: As both are special cases of strbuf_splice, implement them as such. gcc is able to do the math and generate almost optimal code this way. Add strbuf_swap: Exchange the values of its arguments. Use it in fast-import.c Also fix spacing issues in strbuf.h Signed-off-by: Pierre Habouzit <madcoder@debian.org> 2007-09-20 00:42:12 +02:00			`strbuf_swap(&last->data, dat);`
fast-import optimization: Now that cmd_data acts on a strbuf, make last_object stashed buffer be a strbuf as well. On new stash, don't free the last stashed buffer, rather swap it with the one you will stash, this way, callers of store_object can act on static strbufs, and at some point, fast-import won't allocate new memory for objects buffers. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-17 14:00:38 +02:00			`}`
fast-import: start using struct pack_idx_entry This is in preparation for using write_idx_file(). Signed-off-by: Nicolas Pitre <nico@fluxnic.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-02-17 20:05:51 +01:00			`last->offset = e->idx.offset;`
Don't allow fast-import tree delta chains to exceed maximum depth Brian Downing noticed fast-import can produce tree depths of up to 6,035 objects and even deeper. Long delta chains can create very small packfiles but cause problems during repacking as git needs to unpack each tree to count the reachable blobs. What's happening here is the active branch cache isn't big enough. We're swapping out the branch and thus recycling the tree information (struct tree_content) back into the free pool. When we later reload the tree we set the delta_depth to 0 but we kept the tree we just reloaded as a delta base. So if the tree we reloaded was already at the maximum depth we wouldn't know it and make the new tree a delta. Multiply the number of times the branch cache has to swap out the tree times max_depth (10) and you get the maximum delta depth of a tree created by fast-import. In Brian's case above the active branch cache had to swap the branch out 603/604 times during this import to produce a tree with a delta depth of 6035. Acked-by: Brian Downing <bdowning@lavos.net> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-14 05:48:42 +01:00			`last->depth = e->depth;`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`}`
			`return 0;`
			`}`
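
The `OBJ_OFS_DELTA` branch above writes the distance back to the delta base as a variable-length number, filling `hdr` from its last byte toward the front. A minimal standalone sketch of that encoding follows, with a matching decoder for round-trip checking. The names `ofs_encode`, `ofs_decode`, and `OFS_MAX_BYTES` are hypothetical and do not appear in fast-import.c. The 10-byte scratch size is an assumption that covers any 64-bit offset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical names for illustration; not part of fast-import.c. */
enum { OFS_MAX_BYTES = 10 }; /* assumed bound: enough for any 64-bit offset */

/*
 * Encode a backward distance in the same way as the loop in the
 * OBJ_OFS_DELTA branch: build the bytes from the end of a scratch
 * buffer toward the front, then copy them to out.
 * Returns the number of bytes written.
 */
static size_t ofs_encode(uint64_t ofs, unsigned char *out, size_t out_sz)
{
	unsigned char tmp[OFS_MAX_BYTES];
	size_t pos = sizeof(tmp) - 1;
	size_t n;

	tmp[pos] = ofs & 127;                     /* last byte: high bit clear */
	while (ofs >>= 7)
		tmp[--pos] = 128 | (--ofs & 127); /* earlier bytes: high bit set */

	n = sizeof(tmp) - pos;
	assert(n <= out_sz);
	memcpy(out, tmp + pos, n);
	return n;
}

/*
 * Decode the bytes produced by ofs_encode. Stores the number of bytes
 * consumed in *used and returns the decoded distance.
 */
static uint64_t ofs_decode(const unsigned char *in, size_t *used)
{
	size_t i = 0;
	unsigned char c = in[i++];
	uint64_t ofs = c & 127;

	while (c & 128) {
		c = in[i++];
		/* The +1 undoes the --ofs bias applied by the encoder. */
		ofs = ((ofs + 1) << 7) | (c & 127);
	}
	*used = i;
	return ofs;
}
```

The decrement inside the loop means a multi-byte encoding never overlaps the range already covered by a shorter one, so each distance has exactly one representation. For example, 127 fits in one byte, while 128 becomes the two bytes `0x80 0x00`.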

csum-file: introduce sha1file_checkpoint It is useful to be able to rewind a check-summed file to a certain previous state after writing data into it using sha1write() API. The fast-import command does this after streaming a blob data to the packfile being generated and then noticing that the same blob has already been written, and it does this with a private code truncate_pack() that is commented as "Yes, this is a layering violation". Introduce two API functions, sha1file_checkpoint(), that allows the caller to save a state of a sha1file, and then later revert it to the saved state. Use it to reimplement truncate_pack(). Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-11-18 01:26:54 +01:00			`static void truncate_pack(struct sha1file_checkpoint *checkpoint)`
fast-import: Stream very large blobs directly to pack If a blob is larger than the configured big-file-threshold, instead of reading it into a single buffer obtained from malloc, stream it onto the end of the current pack file. Streaming the larger objects into the pack avoids the 4+ GiB memory footprint that occurs when fast-import is processing 2+ GiB blobs. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-02-01 18:27:35 +01:00			`{`
csum-file: introduce sha1file_checkpoint It is useful to be able to rewind a check-summed file to a certain previous state after writing data into it using sha1write() API. The fast-import command does this after streaming a blob data to the packfile being generated and then noticing that the same blob has already been written, and it does this with a private code truncate_pack() that is commented as "Yes, this is a layering violation". Introduce two API functions, sha1file_checkpoint(), that allows the caller to save a state of a sha1file, and then later revert it to the saved state. Use it to reimplement truncate_pack(). Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-11-18 01:26:54 +01:00			`if (sha1file_truncate(pack_file, checkpoint))`
fast-import: Stream very large blobs directly to pack If a blob is larger than the configured big-file-threshold, instead of reading it into a single buffer obtained from malloc, stream it onto the end of the current pack file. Streaming the larger objects into the pack avoids the 4+ GiB memory footprint that occurs when fast-import is processing 2+ GiB blobs. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-02-01 18:27:35 +01:00			`die_errno("cannot truncate pack to skip duplicate");`
csum-file: introduce sha1file_checkpoint It is useful to be able to rewind a check-summed file to a certain previous state after writing data into it using sha1write() API. The fast-import command does this after streaming a blob data to the packfile being generated and then noticing that the same blob has already been written, and it does this with a private code truncate_pack() that is commented as "Yes, this is a layering violation". Introduce two API functions, sha1file_checkpoint(), that allows the caller to save a state of a sha1file, and then later revert it to the saved state. Use it to reimplement truncate_pack(). Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-11-18 01:26:54 +01:00			`pack_size = checkpoint->offset;`
fast-import: Stream very large blobs directly to pack If a blob is larger than the configured big-file-threshold, instead of reading it into a single buffer obtained from malloc, stream it onto the end of the current pack file. Streaming the larger objects into the pack avoids the 4+ GiB memory footprint that occurs when fast-import is processing 2+ GiB blobs. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-02-01 18:27:35 +01:00			`}`
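
`truncate_pack()` rewinds the checksummed pack file to a previously saved state, which lets a streamed blob be discarded once it turns out to be a duplicate. The idea can be sketched with an in-memory buffer, under loud assumptions: `struct sum_buf`, `struct sum_checkpoint`, and the `sum_*` functions are hypothetical stand-ins, and the toy multiplicative checksum replaces the hash state that the real checkpoint would have to capture. The real code operates on a file on disk, which this sketch does not attempt to model.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical in-memory stand-in for a checksummed output file. */
struct sum_buf {
	unsigned char data[256];
	size_t len;   /* bytes written so far */
	uint32_t sum; /* toy running checksum, standing in for a real hash state */
};

/* Saved state: everything needed to rewind both length and checksum. */
struct sum_checkpoint {
	size_t len;
	uint32_t sum;
};

/* Append n bytes and fold them into the running checksum. */
static void sum_write(struct sum_buf *f, const void *p, size_t n)
{
	const unsigned char *b = p;
	size_t i;

	assert(f->len + n <= sizeof(f->data));
	for (i = 0; i < n; i++)
		f->sum = f->sum * 31 + b[i];
	memcpy(f->data + f->len, p, n);
	f->len += n;
}

/* Record the current length and checksum state. */
static void sum_checkpoint_save(const struct sum_buf *f,
				struct sum_checkpoint *cp)
{
	cp->len = f->len;
	cp->sum = f->sum;
}

/*
 * Rewind to a saved checkpoint. Returns 0 on success, or -1 if the
 * checkpoint lies beyond the current end of the buffer.
 */
static int sum_truncate(struct sum_buf *f, const struct sum_checkpoint *cp)
{
	if (cp->len > f->len)
		return -1;
	f->len = cp->len;
	f->sum = cp->sum;
	return 0;
}
```

Restoring the checksum along with the length is the point of the exercise: after a rewind, later writes produce the same checksum as if the discarded bytes had never been written.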

			`static void stream_blob(uintmax_t len, unsigned char *sha1out, uintmax_t mark)`
			`{`
			`size_t in_sz = 64 * 1024, out_sz = 64 * 1024;`
			`unsigned char *in_buf = xmalloc(in_sz);`
			`unsigned char *out_buf = xmalloc(out_sz);`
			`struct object_entry *e;`
			`unsigned char sha1[20];`
			`unsigned long hdrlen;`
			`off_t offset;`
			`git_SHA_CTX c;`
zlib: zlib can only process 4GB at a time The size of objects we read from the repository and data we try to put into the repository are represented in "unsigned long", so that on larger architectures we can handle objects that weigh more than 4GB. But the interface defined in zlib.h to communicate with inflate/deflate limits avail_in (how many bytes of input are we calling zlib with) and avail_out (how many bytes of output from zlib are we ready to accept) fields effectively to 4GB by defining their type to be uInt. In many places in our code, we allocate a large buffer (e.g. mmap'ing a large loose object file) and tell zlib its size by assigning the size to avail_in field of the stream, but that will truncate the high octets of the real size. The worst part of this story is that we often pass around z_stream (the state object used by zlib) to keep track of the number of used bytes in input/output buffer by inspecting these two fields, which practically limits our callchain to the same 4GB limit. Wrap z_stream in another structure git_zstream that can express avail_in and avail_out in unsigned long. For now, just die() when the caller gives a size that cannot be given to a single zlib call. In later patches in the series, we would make git_inflate() and git_deflate() internally loop to give callers an illusion that our "improved" version of zlib interface can operate on a buffer larger than 4GB in one go. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-06-10 20:52:15 +02:00			`git_zstream s;`
			`struct sha1file_checkpoint checkpoint;`
			`int status = Z_OK;`

			`/* Determine if we should auto-checkpoint. */`
fast-import: make default pack size unlimited Now that fast-import is creating packs with index version 2, there is no point limiting the pack size by default. A pack split will still happen if off_t is not sufficiently large to hold large offsets. While updating the doc, let's remove the "packfiles fit on CDs" suggestion. Pack files created by fast-import are still suboptimal and a 'git repack -a -f -d' or even 'git gc --aggressive' would be a pretty good idea before considering storage on CDs. Signed-off-by: Nicolas Pitre <nico@fluxnic.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-02-17 20:05:54 +01:00			`if ((max_packsize && (pack_size + 60 + len) > max_packsize)`
			`|| (pack_size + 60 + len) < pack_size)`
			`cycle_packfile();`

			`sha1file_checkpoint(pack_file, &checkpoint);`
			`offset = checkpoint.offset;`
fast-import: use sha1write() for pack data This is in preparation for using write_idx_file(). Also, by using sha1write() we get some buffering to reduce the number of write syscalls, and the written data is SHA1 summed which allows for the extra data integrity validation check performed in fixup_pack_header_footer() (details on this in commit abeb40e5aa). Signed-off-by: Nicolas Pitre <nico@fluxnic.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-02-17 20:05:52 +01:00
			`hdrlen = snprintf((char *)out_buf, out_sz, "blob %" PRIuMAX, len) + 1;`
			`if (out_sz <= hdrlen)`
			`die("impossibly large object header");`

			`git_SHA1_Init(&c);`
			`git_SHA1_Update(&c, out_buf, hdrlen);`

fast-import: use write_idx_file() instead of custom code This allows for the creation of pack index version 2 with its object CRC and the possibility for a pack to be larger than 4 GB. Signed-off-by: Nicolas Pitre <nico@fluxnic.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-02-17 20:05:53 +01:00			`crc32_begin(pack_file);`

			`memset(&s, 0, sizeof(s));`
zlib: wrap deflate side of the API Wrap deflateInit, deflate, and deflateEnd for everybody, and the sole use of deflateInit2 in remote-curl.c to tell the library to use gzip header and trailer in git_deflate_init_gzip(). There is only one caller that cares about the status from deflateEnd(). Introduce git_deflate_end_gently() to let that sole caller retrieve the status and act on it (i.e. die) for now, but we would probably want to make inflate_end/deflate_end die when they ran out of memory and get rid of the _gently() kind. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-06-10 19:55:10 +02:00			`git_deflate_init(&s, pack_compression_level);`

refactor duplicated encode_header in pack-objects and fast-import The following function is duplicated: encode_header Move this function to sha1_file.c and rename it 'encode_in_pack_object_header', as suggested by Junio C Hamano Signed-off-by: Michael Lukashov <michael.lukashov@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-02-17 00:42:54 +01:00			`hdrlen = encode_in_pack_object_header(OBJ_BLOB, len, out_buf);`
			`if (out_sz <= hdrlen)`
			`die("impossibly large object header");`

			`s.next_out = out_buf + hdrlen;`
			`s.avail_out = out_sz - hdrlen;`

			`while (status != Z_STREAM_END) {`
			`if (0 < len && !s.avail_in) {`
			`size_t cnt = in_sz < len ? in_sz : (size_t)len;`
			`size_t n = fread(in_buf, 1, cnt, stdin);`
			`if (!n && feof(stdin))`
			`die("EOF in data (%" PRIuMAX " bytes remaining)", len);`

			`git_SHA1_Update(&c, in_buf, n);`
			`s.next_in = in_buf;`
			`s.avail_in = n;`
			`len -= n;`
			`}`

			`status = git_deflate(&s, len ? 0 : Z_FINISH);`

			`if (!s.avail_out || status == Z_STREAM_END) {`
			`size_t n = s.next_out - out_buf;`
			`sha1write(pack_file, out_buf, n);`
			`pack_size += n;`
			`s.next_out = out_buf;`
			`s.avail_out = out_sz;`
			`}`

			`switch (status) {`
			`case Z_OK:`
			`case Z_BUF_ERROR:`
			`case Z_STREAM_END:`
			`continue;`
			`default:`
			`die("unexpected deflate failure: %d", status);`
			`}`
			`}`
			`git_deflate_end(&s);`
			`git_SHA1_Final(sha1, &c);`

			`if (sha1out)`
			`hashcpy(sha1out, sha1);`

			`e = insert_object(sha1);`

			`if (mark)`
			`insert_mark(mark, e);`

fast-import: start using struct pack_idx_entry This is in preparation for using write_idx_file(). Signed-off-by: Nicolas Pitre <nico@fluxnic.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-02-17 20:05:51 +01:00			`if (e->idx.offset) {`
			`duplicate_count_by_type[OBJ_BLOB]++;`
			`truncate_pack(&checkpoint);`

			`} else if (find_sha1_pack(sha1, packed_git)) {`
			`e->type = OBJ_BLOB;`
			`e->pack_id = MAX_PACK_ID;`
			`e->idx.offset = 1; /* just not zero! */`
			`duplicate_count_by_type[OBJ_BLOB]++;`
			`truncate_pack(&checkpoint);`

			`} else {`
			`e->depth = 0;`
			`e->type = OBJ_BLOB;`
			`e->pack_id = pack_id;`
			`e->idx.offset = offset;`
			`e->idx.crc32 = crc32_end(pack_file);`
			`object_count++;`
			`object_count_by_type[OBJ_BLOB]++;`
			`}`

			`free(in_buf);`
			`free(out_buf);`
			`}`
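For readers unfamiliar with the in-pack object header that `stream_blob()` writes before the deflated data, the following is a minimal sketch of the variable-length layout: the first byte carries a 3-bit type and the low 4 bits of the size, and each following byte carries 7 more size bits, with the high bit marking continuation. `sketch_encode_header` is a hypothetical illustration with an added bounds check, not git's `encode_in_pack_object_header()`, and it assumes `type` fits in 3 bits.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical sketch of the pack object header layout:
 *   first byte:  continuation bit | 3-bit type | low 4 bits of size
 *   later bytes: continuation bit | next 7 bits of size
 * Returns the number of bytes written to hdr, or 0 if hdr is too small.
 */
static size_t sketch_encode_header(unsigned type, uintmax_t size,
				   unsigned char *hdr, size_t hdr_len)
{
	size_t n = 0;
	unsigned char c;

	if (hdr_len == 0)
		return 0;

	c = (unsigned char)((type << 4) | (size & 0x0f));
	size >>= 4;
	while (size) {
		/* Need room for this byte and at least one more. */
		if (n + 1 >= hdr_len)
			return 0;
		hdr[n++] = (unsigned char)(c | 0x80);
		c = (unsigned char)(size & 0x7f);
		size >>= 7;
	}
	hdr[n++] = c;	/* final byte has the continuation bit clear */
	return n;
}
```

With type 3 and size 100, the sketch emits two bytes, `0xb4` followed by `0x06`.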

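The auto-checkpoint test at the top of `stream_blob()` combines a size limit with a wraparound guard. A minimal sketch of that idea follows, assuming unsigned arithmetic so that wraparound is well defined. The function name `sketch_needs_new_pack` and its parameters are hypothetical, and the types used by the real code may differ.

```c
#include <stdint.h>

/*
 * Hypothetical sketch: returns 1 when appending an object of len bytes
 * (plus slack bytes of header allowance) would exceed max_packsize or
 * wrap the size counter. max_packsize == 0 means "no limit".
 */
static int sketch_needs_new_pack(uintmax_t pack_size, uintmax_t len,
				 uintmax_t slack, uintmax_t max_packsize)
{
	/* Unsigned addition wraps modulo 2^N instead of overflowing. */
	uintmax_t projected = pack_size + slack + len;

	if (max_packsize && projected > max_packsize)
		return 1;
	/* A projected size smaller than the current one means we wrapped. */
	return projected < pack_size;
}
```

The second test matters when no limit is configured: without it, a counter that wraps would silently look like a small pack.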
Document the hairy gfi_unpack_entry part of fast-import Junio pointed out this part of fast-import wasn't very clear on initial read, and it took some time for someone who was new to fast-import's "dirty little tricks" to understand how this was even working. So a little bit of commentary in the proper place may help future readers. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-01-21 05:37:01 +01:00			`/* All calls must be guarded by find_object() or find_mark() to`
			`* ensure the 'struct object_entry' passed was written by this`
			`* process instance. We unpack the entry by the offset, avoiding`
			`* the need for the corresponding .idx file. This unpacking rule`
			`* works because we only use OBJ_REF_DELTA within the packfiles`
			`* created by fast-import.`
			`*`
			`* oe must not be NULL. Such an oe usually comes from giving`
			`* an unknown SHA-1 to find_object() or an undefined mark to`
			`* find_mark(). Callers must test for this condition and use`
			`* the standard read_sha1_file() when it happens.`
			`*`
			`* oe->pack_id must not be MAX_PACK_ID. Such an oe is usually from`
			`* find_mark(), where the mark was reloaded from an existing marks`
			`* file and is referencing an object that this fast-import process`
			`* instance did not write out to a packfile. Callers must test for`
			`* this condition and use read_sha1_file() instead.`
			`*/`
Implemented manual packfile switching in fast-import. To help importers which are dealing with massive amounts of data fast-import needs to be able to close the packfile it is currently writing to and open a new packfile for any additional data that will be received. A new 'checkpoint' command has been introduced which can be used by the frontend import process to force this to occur at any time. This may be useful to ensure a very long running import doesn't lose any work due to unexpected failures. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 12:35:41 +01:00			`static void *gfi_unpack_entry(`
			`struct object_entry *oe,`
			`unsigned long *sizep)`
Implemented tree reloading in fast-import. Tree reloading allows fast-import to swap out the least-recently used branch by simply deallocating the data structures from memory that were associated with that branch. Later if the branch becomes active again it can lazily recreate those structures on demand by reloading the necessary trees from the pack file it originally wrote them to. The reloading process is implemented by mmap'ing the pack into memory and using a much tighter variant of the pack reading code contained in sha1_file.c. This was a blatant copy from sha1_file.c but the unpacking functions were significantly simplified and are actually now in a form that should make it easier to map only the necessary regions of a pack rather than the entire file. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-24 10:37:35 +02:00			`{`
convert object type handling from a string to a number We currently have two parallel notations for dealing with object types in the code: a string and a numerical value. One of them is obviously redundant, and the most used one requires more stack space and a bunch of strcmp() all over the place. This is an initial step for the removal of the version using a char array found in object reading code paths. The patch is unfortunately large but there is no sane way to split it in smaller parts without breaking the system. Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-26 20:55:59 +01:00			`enum object_type type;`
			`struct packed_git *p = all_packs[oe->pack_id];`
Fix random fast-import errors when compiled with NO_MMAP fast-import was relying on the fact that on most systems mmap() and write() are synchronized by the filesystem's buffer cache. We were relying on the ability to mmap() 20 bytes beyond the current end of the file, then later fill in those bytes with a future write() call, then read them through the previously obtained mmap() address. This isn't always true with some implementations of NFS, but it is especially not true with our NO_MMAP=YesPlease build time option used on some platforms. If fast-import was built with NO_MMAP=YesPlease we used the malloc()+pread() emulation and the subsequent write() call does not update the trailing 20 bytes of a previously obtained "mmap()" (aka malloc'd) address. Under NO_MMAP that behavior causes unpack_entry() in sha1_file.c to be unable to read an object header (or data) that has been unlucky enough to be written to the packfile at a location such that it is in the trailing 20 bytes of a window previously opened on that same packfile. This bug has gone unnoticed for a very long time as it is highly data dependent. Not only does the object have to be placed at the right position, but it also needs to be positioned behind some other object that has been accessed due to a branch cache invalidation. In other words the stars had to align just right, and if you did run into this bug you probably should also have purchased a lottery ticket. Fortunately the workaround is a lot easier than the bug explanation. Before we allow unpack_entry() to read data from a pack window that has also (possibly) been modified through write() we force all existing windows on that packfile to be closed. By closing the windows we ensure that any new access via the emulated mmap() will reread the packfile, updating to the current file content. This comes at a slight performance degradation as we cannot reuse previously cached windows when we update the packfile. 
But it is a fairly minor difference as the window closes happen at only two points: - When the packfile is finalized and its .idx is generated: At this stage we are getting ready to update the refs and any data access into the packfile is going to be random, and is going after only the branch tips (to ensure they are valid). Our existing windows (if any) are not likely to be positioned at useful locations to access those final tip commits so we probably were closing them before anyway. - When the branch cache missed and we need to reload: At this point fast-import is getting change commands for the next commit and it needs to go re-read a tree object it previously had written out to the packfile. What windows we had (if any) are not likely to cover the tree in question so we probably were closing them before anyway. We do try to avoid unnecessarily closing windows in the second case by checking to see if the packfile size has increased since the last time we called unpack_entry() on that packfile. If the size has not changed then we have not written additional data, and any existing window is still valid. This nicely handles the cases where fast-import is going through a branch cache reload and needs to read many trees at once. During such an event we are not likely to be updating the packfile so we do not cycle the windows between reads. With this change in place t9301-fast-export.sh (which was broken by c3b0dec509fe136c5417422f31898b5a4e2d5e02) finally works again. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-01-18 04:57:00 +01:00			`if (p == pack_data && p->pack_size < (pack_size + 20)) {`
			`/* The object is stored in the packfile we are writing to`
			`* and we have modified it since the last time we scanned`
			`* back to read a previously written object. If an old`
			`* window covered [p->pack_size, p->pack_size + 20) its`
			`* data is stale and is not valid. Closing all windows`
			`* and updating the packfile length ensures we can read`
			`* the newly written data.`
			`*/`
			`close_pack_windows(p);`
			`sha1flush(pack_file);`

			`/* We have to offer 20 bytes additional on the end of`
			`* the packfile as the core unpacker code assumes the`
			`* footer is present at the file end and must promise`
			`* at least 20 bytes within any window it maps. But`
			`* we don't actually create the footer here.`
			`*/`
Implemented manual packfile switching in fast-import. To help importers which are dealing with massive amounts of data fast-import needs to be able to close the packfile it is currently writing to and open a new packfile for any additional data that will be received. A new 'checkpoint' command has been introduced which can be used by the frontend import process to force this to occur at any time. This may be useful to ensure a very long running import doesn't lose any work due to unexpected failures. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 12:35:41 +01:00			`p->pack_size = pack_size + 20;`
Fix random fast-import errors when compiled with NO_MMAP fast-import was relying on the fact that on most systems mmap() and write() are synchronized by the filesystem's buffer cache. We were relying on the ability to mmap() 20 bytes beyond the current end of the file, then later fill in those bytes with a future write() call, then read them through the previously obtained mmap() address. This isn't always true with some implementations of NFS, but it is especially not true with our NO_MMAP=YesPlease build time option used on some platforms. If fast-import was built with NO_MMAP=YesPlease we used the malloc()+pread() emulation and the subsequent write() call does not update the trailing 20 bytes of a previously obtained "mmap()" (aka malloc'd) address. Under NO_MMAP that behavior causes unpack_entry() in sha1_file.c to be unable to read an object header (or data) that has been unlucky enough to be written to the packfile at a location such that it is in the trailing 20 bytes of a window previously opened on that same packfile. This bug has gone unnoticed for a very long time as it is highly data dependent. Not only does the object have to be placed at the right position, but it also needs to be positioned behind some other object that has been accessed due to a branch cache invalidation. In other words the stars had to align just right, and if you did run into this bug you probably should also have purchased a lottery ticket. Fortunately the workaround is a lot easier than the bug explanation. Before we allow unpack_entry() to read data from a pack window that has also (possibly) been modified through write() we force all existing windows on that packfile to be closed. By closing the windows we ensure that any new access via the emulated mmap() will reread the packfile, updating to the current file content. This comes at a slight performance degradation as we cannot reuse previously cached windows when we update the packfile. 
But it is a fairly minor difference as the window closes happen at only two points: - When the packfile is finalized and its .idx is generated: At this stage we are getting ready to update the refs and any data access into the packfile is going to be random, and is going after only the branch tips (to ensure they are valid). Our existing windows (if any) are not likely to be positioned at useful locations to access those final tip commits so we probably were closing them before anyway. - When the branch cache missed and we need to reload: At this point fast-import is getting change commands for the next commit and it needs to go re-read a tree object it previously had written out to the packfile. What windows we had (if any) are not likely to cover the tree in question so we probably were closing them before anyway. We do try to avoid unnecessarily closing windows in the second case by checking to see if the packfile size has increased since the last time we called unpack_entry() on that packfile. If the size has not changed then we have not written additional data, and any existing window is still valid. This nicely handles the cases where fast-import is going through a branch cache reload and needs to read many trees at once. During such an event we are not likely to be updating the packfile so we do not cycle the windows between reads. With this change in place t9301-fast-export.sh (which was broken by c3b0dec509fe136c5417422f31898b5a4e2d5e02) finally works again. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-01-18 04:57:00 +01:00			`}`
fast-import: start using struct pack_idx_entry This is in preparation for using write_idx_file(). Signed-off-by: Nicolas Pitre <nico@fluxnic.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-02-17 20:05:51 +01:00			`return unpack_entry(p, oe->idx.offset, &type, sizep);`
Implemented tree reloading in fast-import. Tree reloading allows fast-import to swap out the least-recently used branch by simply deallocating the data structures from memory that were associated with that branch. Later if the branch becomes active again it can lazily recreate those structures on demand by reloading the necessary trees from the pack file it originally wrote them to. The reloading process is implemented by mmap'ing the pack into memory and using a much tighter variant of the pack reading code contained in sha1_file.c. This was a blatant copy from sha1_file.c but the unpacking functions were significantly simplified and are actually now in a form that should make it easier to map only the necessary regions of a pack rather than the entire file. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-24 10:37:35 +02:00			`}`

Reduce memory usage of fast-import. Some structs are allocated rather frequently, but were using integer types which were far larger than required to actually store their full value range. As packfiles are limited to 4 GiB we don't need more than 32 bits to store the offset of an object within that packfile, an `unsigned long` on a 64 bit system is likely a 64 bit unsigned value. Saving 4 bytes per object on a 64 bit system can add up fast on any sizable import. As atom strings are strictly single components in a path name these are probably limited to just 255 bytes by the underlying OS. Going to that short of a string is probably too restrictive, but certainly `unsigned int` is far too large for their lengths. `unsigned short` is a reasonable limit. Modes within a tree really only need two bytes to store their whole value; using `unsigned int` here is vast overkill. Saving 4 bytes per file entry in an active branch can add up quickly on a project with a large number of files. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-05 22:34:56 +01:00			`static const char *get_mode(const char *str, uint16_t *modep)`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`{`
			`unsigned char c;`
Reduce memory usage of fast-import. Some structs are allocated rather frequently, but were using integer types which were far larger than required to actually store their full value range. As packfiles are limited to 4 GiB we don't need more than 32 bits to store the offset of an object within that packfile, an `unsigned long` on a 64 bit system is likely a 64 bit unsigned value. Saving 4 bytes per object on a 64 bit system can add up fast on any sizable import. As atom strings are strictly single components in a path name these are probably limited to just 255 bytes by the underlying OS. Going to that short of a string is probably too restrictive, but certainly `unsigned int` is far too large for their lengths. `unsigned short` is a reasonable limit. Modes within a tree really only need two bytes to store their whole value; using `unsigned int` here is vast overkill. Saving 4 bytes per file entry in an active branch can add up quickly on a project with a large number of files. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-05 22:34:56 +01:00			`uint16_t mode = 0;`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00
			`while ((c = *str++) != ' ') {`
			`if (c < '0' || c > '7')`
			`return NULL;`
			`mode = (mode << 3) + (c - '0');`
			`}`
			`*modep = mode;`
			`return str;`
			`}`
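The octal parsing done by `get_mode()` can be tried in isolation. The following is a minimal standalone sketch, not the code fast-import ships: the name `parse_octal_mode` is hypothetical, and the `assert()` guard is an addition for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical standalone variant of get_mode(): read octal digits up to
 * the first space, store the value in *modep, and return a pointer to the
 * character after the space. Return NULL on any non-octal character,
 * which includes the terminating NUL of a string with no space in it.
 */
static const char *parse_octal_mode(const char *str, uint16_t *modep)
{
	unsigned char c;
	uint16_t mode = 0;

	assert(str != NULL && modep != NULL);
	while ((c = (unsigned char)*str++) != ' ') {
		if (c < '0' || c > '7')
			return NULL;
		/* Each octal digit contributes three bits. */
		mode = (uint16_t)((mode << 3) + (c - '0'));
	}
	*modep = mode;
	return str;
}
```

Given the input `"100644 hello.c"`, this stores `0100644` in `*modep` and returns a pointer to `"hello.c"`. The largest mode Git normally writes, `0160000`, still fits in 16 bits, which is why `uint16_t` is sufficient here.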

			`static void load_tree(struct tree_entry *root)`
			`{`
Fix a bunch of pointer declarations (codestyle) Essentially; s/type* /type */ as per the coding guidelines. Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-05-01 11:06:36 +02:00			`unsigned char *sha1 = root->versions[1].sha1;`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`struct object_entry *myoe;`
			`struct tree_content *t;`
			`unsigned long size;`
			`char *buf;`
			`const char *c;`

			`root->tree = t = new_tree_content(8);`
Implemented tree delta compression in fast-import. We now store for every tree entry two modes and two sha1 values; the base (aka "version 0") and the current/new (aka "version 1"). When we generate a tree object we also regenerate the prior version object and use that as our base object for a delta. This strategy saves a significant amount of memory as we can continue to use the atom pool for file/directory names and only increases each tree entry by an additional 24 bytes of memory. Branches should automatically delta against their ancestor tree, unless the ancestor tree is already at the delta chain limit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-28 18:22:50 +02:00			`if (is_null_sha1(sha1))`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`return;`

Implemented tree delta compression in fast-import. We now store for every tree entry two modes and two sha1 values; the base (aka "version 0") and the current/new (aka "version 1"). When we generate a tree object we also regenerate the prior version object and use that as our base object for a delta. This strategy saves a significant amount of memory as we can continue to use the atom pool for file/directory names and only increases each tree entry by an additional 24 bytes of memory. Branches should automatically delta against their ancestor tree, unless the ancestor tree is already at the delta chain limit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-28 18:22:50 +02:00			`myoe = find_object(sha1);`
fast-import: Fix crash when referencing already existing objects Commit a5c1780a0355a71b9fb70f1f1977ce726ee5b8d8 sets the pack_id of existing objects to MAX_PACK_ID. When the same object is referenced later again it is found in the local object hash. With such a pack_id fast-import should not try to locate that object in the newly created pack(s). Signed-off-by: Simon Hausmann <simon@lst.de> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-05-23 23:01:49 +02:00			`if (myoe && myoe->pack_id != MAX_PACK_ID) {`
Implemented tree reloading in fast-import. Tree reloading allows fast-import to swap out the least-recently used branch by simply deallocating the data structures from memory that were associated with that branch. Later if the branch becomes active again it can lazily recreate those structures on demand by reloading the necessary trees from the pack file it originally wrote them to. The reloading process is implemented by mmap'ing the pack into memory and using a much tighter variant of the pack reading code contained in sha1_file.c. This was a blatant copy from sha1_file.c but the unpacking functions were significantly simplified and are actually now in a form that should make it easier to map only the necessary regions of a pack rather than the entire file. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-24 10:37:35 +02:00			`if (myoe->type != OBJ_TREE)`
Implemented tree delta compression in fast-import. We now store for every tree entry two modes and two sha1 values; the base (aka "version 0") and the current/new (aka "version 1"). When we generate a tree object we also regenerate the prior version object and use that as our base object for a delta. This strategy saves a significant amount of memory as we can continue to use the atom pool for file/directory names and only increases each tree entry by an additional 24 bytes of memory. Branches should automatically delta against their ancestor tree, unless the ancestor tree is already at the delta chain limit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-28 18:22:50 +02:00			`die("Not a tree: %s", sha1_to_hex(sha1));`
Don't allow fast-import tree delta chains to exceed maximum depth Brian Downing noticed fast-import can produce tree depths of up to 6,035 objects and even deeper. Long delta chains can create very small packfiles but cause problems during repacking as git needs to unpack each tree to count the reachable blobs. What's happening here is the active branch cache isn't big enough. We're swapping out the branch and thus recycling the tree information (struct tree_content) back into the free pool. When we later reload the tree we set the delta_depth to 0 but we kept the tree we just reloaded as a delta base. So if the tree we reloaded was already at the maximum depth we wouldn't know it and make the new tree a delta. Multiply the number of times the branch cache has to swap out the tree times max_depth (10) and you get the maximum delta depth of a tree created by fast-import. In Brian's case above the active branch cache had to swap the branch out 603/604 times during this import to produce a tree with a delta depth of 6035. Acked-by: Brian Downing <bdowning@lavos.net> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-14 05:48:42 +01:00			`t->delta_depth = myoe->depth;`
Implemented manual packfile switching in fast-import. To help importers which are dealing with massive amounts of data fast-import needs to be able to close the packfile it is currently writing to and open a new packfile for any additional data that will be received. A new 'checkpoint' command has been introduced which can be used by the frontend import process to force this to occur at any time. This may be useful to ensure a very long running import doesn't lose any work due to unexpected failures. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 12:35:41 +01:00			`buf = gfi_unpack_entry(myoe, &size);`
fast-import: check return value from unpack_entry() If the tree object we have asked for is deltafied in the packfile and the delta did not apply correctly or was not able to be decompressed from the packfile then we can get back NULL instead of the tree data. This is (part of) the reason why read_sha1_file() can return NULL, so we need to also handle it the same way. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-02-14 07:34:34 +01:00			`if (!buf)`
			`die("Can't load tree %s", sha1_to_hex(sha1));`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`} else {`
convert object type handling from a string to a number We currently have two parallel notations for dealing with object types in the code: a string and a numerical value. One of them is obviously redundant, and the most used one requires more stack space and a bunch of strcmp() all over the place. This is an initial step for the removal of the version using a char array found in object reading code paths. The patch is unfortunately large but there is no sane way to split it in smaller parts without breaking the system. Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-26 20:55:59 +01:00			`enum object_type type;`
			`buf = read_sha1_file(sha1, &type, &size);`
			`if (!buf || type != OBJ_TREE)`
Implemented tree delta compression in fast-import. We now store for every tree entry two modes and two sha1 values; the base (aka "version 0") and the current/new (aka "version 1"). When we generate a tree object we also regenerate the prior version object and use that as our base object for a delta. This strategy saves a significant amount of memory as we can continue to use the atom pool for file/directory names and only increases each tree entry by an additional 24 bytes of memory. Branches should automatically delta against their ancestor tree, unless the ancestor tree is already at the delta chain limit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-28 18:22:50 +02:00			`die("Can't load tree %s", sha1_to_hex(sha1));`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`}`

			`c = buf;`
			`while (c != (buf + size)) {`
			`struct tree_entry *e = new_tree_entry();`

			`if (t->entry_count == t->entry_capacity)`
fast-import: grow tree storage more aggressively When building up a tree for a commit, fast-import dynamically allocates memory for the tree entries. When more space is needed, the allocated memory is increased by a constant amount. For very large trees, this means re-allocating and memcpy()ing the memory O(n) times. To compound this problem, releasing the previous tree resource does not free the memory; it is kept in a pool for future trees. This means that each of the O(n) allocations will consume increasing amounts of memory, giving O(n^2) memory consumption. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-03-11 03:39:17 +01:00			`root->tree = t = grow_tree_content(t, t->entry_count);`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`t->entries[t->entry_count++] = e;`

			`e->tree = NULL;`
Implemented tree delta compression in fast-import. We now store for every tree entry two modes and two sha1 values; the base (aka "version 0") and the current/new (aka "version 1"). When we generate a tree object we also regenerate the prior version object and use that as our base object for a delta. This strategy saves a significant amount of memory as we can continue to use the atom pool for file/directory names and only increases each tree entry by an additional 24 bytes of memory. Branches should automatically delta against their ancestor tree, unless the ancestor tree is already at the delta chain limit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-28 18:22:50 +02:00			`c = get_mode(c, &e->versions[1].mode);`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`if (!c)`
Implemented tree delta compression in fast-import. We now store for every tree entry two modes and two sha1 values; the base (aka "version 0") and the current/new (aka "version 1"). When we generate a tree object we also regenerate the prior version object and use that as our base object for a delta. This strategy saves a significant amount of memory as we can continue to use the atom pool for file/directory names and only increases each tree entry by an additional 24 bytes of memory. Branches should automatically delta against their ancestor tree, unless the ancestor tree is already at the delta chain limit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-28 18:22:50 +02:00			`die("Corrupt mode in %s", sha1_to_hex(sha1));`
			`e->versions[0].mode = e->versions[1].mode;`
Remove unnecessary casts from fast-import Jeff King pointed out that these casts are quite unnecessary, as the compiler should be doing them anyway, and may cause problems in the future if the size of the argument for to_atom were to ever be increased. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-03-12 20:48:37 +01:00			`e->name = to_atom(c, strlen(c));`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`c += e->name->str_len + 1;`
Fix a bunch of pointer declarations (codestyle) Essentially; s/type* /type */ as per the coding guidelines. Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-05-01 11:06:36 +02:00			`hashcpy(e->versions[0].sha1, (unsigned char *)c);`
			`hashcpy(e->versions[1].sha1, (unsigned char *)c);`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`c += 20;`
			`}`
			`free(buf);`
			`}`
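The loop in `load_tree()` walks a raw tree object, where each entry is an octal mode, a space, a NUL-terminated name, and a 20-byte binary SHA-1. The sketch below parses one such entry from a byte buffer. It is an illustration under stated assumptions, not the fast-import code: `raw_tree_entry`, `next_tree_entry` and `RAW_HASH_LEN` are hypothetical names, and the bounds checks against `end` are additions that the original loop does not perform.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RAW_HASH_LEN 20 /* SHA-1 width, matching `c += 20` in load_tree() */

/* Hypothetical view of one parsed entry; pointers alias the input buffer. */
struct raw_tree_entry {
	uint16_t mode;
	const char *name;
	size_t name_len;
	const unsigned char *hash;
};

/*
 * Parse the entry at *pos. On success fill `out`, advance *pos past the
 * entry and return 0. Return -1 on malformed or truncated input.
 */
static int next_tree_entry(const unsigned char **pos, const unsigned char *end,
			   struct raw_tree_entry *out)
{
	const unsigned char *p = *pos;
	const unsigned char *nul;
	uint16_t mode = 0;

	assert(p != NULL && end != NULL && p <= end);

	/* Octal mode, terminated by a single space. */
	while (p < end && *p != ' ') {
		if (*p < '0' || *p > '7')
			return -1;
		mode = (uint16_t)((mode << 3) + (*p - '0'));
		p++;
	}
	if (p == end)
		return -1;
	p++; /* skip the space */

	/* NUL-terminated entry name. */
	nul = memchr(p, '\0', (size_t)(end - p));
	if (!nul)
		return -1;
	out->mode = mode;
	out->name = (const char *)p;
	out->name_len = (size_t)(nul - p);
	p = nul + 1;

	/* Fixed-width binary hash follows the name. */
	if ((size_t)(end - p) < RAW_HASH_LEN)
		return -1;
	out->hash = p;
	*pos = p + RAW_HASH_LEN;
	return 0;
}
```

A caller would invoke `next_tree_entry()` repeatedly until `*pos == end`, which mirrors the `while (c != (buf + size))` condition above.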

Implemented tree delta compression in fast-import. We now store for every tree entry two modes and two sha1 values; the base (aka "version 0") and the current/new (aka "version 1"). When we generate a tree object we also regenerate the prior version object and use that as our base object for a delta. This strategy saves a significant amount of memory as we can continue to use the atom pool for file/directory names and only increases each tree entry by an additional 24 bytes of memory. Branches should automatically delta against their ancestor tree, unless the ancestor tree is already at the delta chain limit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-28 18:22:50 +02:00			`static int tecmp0 (const void *_a, const void *_b)`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`{`
			`struct tree_entry *a = *((struct tree_entry**)_a);`
			`struct tree_entry *b = *((struct tree_entry**)_b);`
			`return base_name_compare(`
Implemented tree delta compression in fast-import. We now store for every tree entry two modes and two sha1 values; the base (aka "version 0") and the current/new (aka "version 1"). When we generate a tree object we also regenerate the prior version object and use that as our base object for a delta. This strategy saves a significant amount of memory as we can continue to use the atom pool for file/directory names and only increases each tree entry by an additional 24 bytes of memory. Branches should automatically delta against their ancestor tree, unless the ancestor tree is already at the delta chain limit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-28 18:22:50 +02:00			`a->name->str_dat, a->name->str_len, a->versions[0].mode,`
			`b->name->str_dat, b->name->str_len, b->versions[0].mode);`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`}`

Implemented tree delta compression in fast-import. We now store for every tree entry two modes and two sha1 values; the base (aka "version 0") and the current/new (aka "version 1"). When we generate a tree object we also regenerate the prior version object and use that as our base object for a delta. This strategy saves a significant amount of memory as we can continue to use the atom pool for file/directory names and only increases each tree entry by an additional 24 bytes of memory. Branches should automatically delta against their ancestor tree, unless the ancestor tree is already at the delta chain limit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-28 18:22:50 +02:00			`static int tecmp1 (const void *_a, const void *_b)`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`{`
Implemented tree delta compression in fast-import. We now store for every tree entry two modes and two sha1 values; the base (aka "version 0") and the current/new (aka "version 1"). When we generate a tree object we also regenerate the prior version object and use that as our base object for a delta. This strategy saves a significant amount of memory as we can continue to use the atom pool for file/directory names and only increases each tree entry by an additional 24 bytes of memory. Branches should automatically delta against their ancestor tree, unless the ancestor tree is already at the delta chain limit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-28 18:22:50 +02:00			`struct tree_entry *a = *((struct tree_entry**)_a);`
			`struct tree_entry *b = *((struct tree_entry**)_b);`
			`return base_name_compare(`
			`a->name->str_dat, a->name->str_len, a->versions[1].mode,`
			`b->name->str_dat, b->name->str_len, b->versions[1].mode);`
			`}`
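`tecmp0()` and `tecmp1()` are `qsort()` comparators over an array of pointers, so each argument is a pointer to a pointer and must be dereferenced once before use. The sketch below shows that pattern with a simplified comparison. It assumes, rather than quotes, that `base_name_compare()` orders a directory as if its name ended in `/`; all `demo_*` names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct tree_entry: just a name and a mode. */
struct demo_entry {
	const char *name; /* NUL-terminated */
	unsigned int mode;
};

static int demo_is_dir(unsigned int mode)
{
	return (mode & 0170000) == 0040000;
}

/* Assumed behavior: a directory compares as if its name ended in '/'. */
static int demo_name_compare(const char *n1, size_t len1, unsigned int mode1,
			     const char *n2, size_t len2, unsigned int mode2)
{
	size_t len = len1 < len2 ? len1 : len2;
	int cmp = memcmp(n1, n2, len);
	unsigned char c1, c2;

	if (cmp)
		return cmp;
	c1 = (unsigned char)n1[len];
	c2 = (unsigned char)n2[len];
	if (!c1 && demo_is_dir(mode1))
		c1 = '/';
	if (!c2 && demo_is_dir(mode2))
		c2 = '/';
	return (c1 < c2) ? -1 : (c1 > c2) ? 1 : 0;
}

/* qsort() hands us pointers to array elements, i.e. pointers to pointers. */
static int demo_entry_cmp(const void *_a, const void *_b)
{
	const struct demo_entry *a = *(const struct demo_entry *const *)_a;
	const struct demo_entry *b = *(const struct demo_entry *const *)_b;

	return demo_name_compare(a->name, strlen(a->name), a->mode,
				 b->name, strlen(b->name), b->mode);
}

static void sort_demo_entries(struct demo_entry **entries, size_t count)
{
	assert(entries != NULL || count == 0);
	qsort(entries, count, sizeof(entries[0]), demo_entry_cmp);
}
```

Under that assumption, a directory `foo` sorts after the files `foo-bar` and `foo.c`, because `/` (0x2F) is greater than both `-` (0x2D) and `.` (0x2E). A plain `strcmp()` ordering would put `foo` first.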

fast-import was using dbuf's, replace them with strbuf's. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-17 13:48:17 +02:00			`static void mktree(struct tree_content *t, int v, struct strbuf *b)`
Implemented tree delta compression in fast-import. We now store for every tree entry two modes and two sha1 values; the base (aka "version 0") and the current/new (aka "version 1"). When we generate a tree object we also regenerate the prior version object and use that as our base object for a delta. This strategy saves a significant amount of memory as we can continue to use the atom pool for file/directory names and only increases each tree entry by an additional 24 bytes of memory. Branches should automatically delta against their ancestor tree, unless the ancestor tree is already at the delta chain limit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-28 18:22:50 +02:00			`{`
			`size_t maxlen = 0;`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`unsigned int i;`

Implemented tree delta compression in fast-import. We now store for every tree entry two modes and two sha1 values; the base (aka "version 0") and the current/new (aka "version 1"). When we generate a tree object we also regenerate the prior version object and use that as our base object for a delta. This strategy saves a significant amount of memory as we can continue to use the atom pool for file/directory names and only increases each tree entry by an additional 24 bytes of memory. Branches should automatically delta against their ancestor tree, unless the ancestor tree is already at the delta chain limit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-28 18:22:50 +02:00			`if (!v)`
			`qsort(t->entries,t->entry_count,sizeof(t->entries[0]),tecmp0);`
			`else`
			`qsort(t->entries,t->entry_count,sizeof(t->entries[0]),tecmp1);`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00
			`for (i = 0; i < t->entry_count; i++) {`
Implemented tree delta compression in fast-import. We now store for every tree entry two modes and two sha1 values; the base (aka "version 0") and the current/new (aka "version 1"). When we generate a tree object we also regenerate the prior version object and use that as our base object for a delta. This strategy saves a significant amount of memory as we can continue to use the atom pool for file/directory names and only increases each tree entry by an additional 24 bytes of memory. Branches should automatically delta against their ancestor tree, unless the ancestor tree is already at the delta chain limit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-28 18:22:50 +02:00			`if (t->entries[i]->versions[v].mode)`
			`maxlen += t->entries[i]->name->str_len + 34;`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`}`

fast-import was using dbuf's, replace them with strbuf's. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-17 13:48:17 +02:00			`strbuf_reset(b);`
			`strbuf_grow(b, maxlen);`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`for (i = 0; i < t->entry_count; i++) {`
			`struct tree_entry *e = t->entries[i];`
Implemented tree delta compression in fast-import. We now store for every tree entry two modes and two sha1 values; the base (aka "version 0") and the current/new (aka "version 1"). When we generate a tree object we also regenerate the prior version object and use that as our base object for a delta. This strategy saves a significant amount of memory as we can continue to use the atom pool for file/directory names and only increases each tree entry by an additional 24 bytes of memory. Branches should automatically delta against their ancestor tree, unless the ancestor tree is already at the delta chain limit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-28 18:22:50 +02:00			`if (!e->versions[v].mode)`
			`continue;`
fast-import: prevent producing bad delta To produce deltas for tree objects fast-import tracks two versions of tree's entries - base and current one. Base version stands both for a delta base of this tree, and for a entry inside a delta base of a parent tree. So care should be taken to keep it in sync. tree_content_set cuts away a whole subtree and replaces it with a new one (or NULL for lazy load of a tree with known sha1). It keeps a base sha1 for this subtree (needed for parent tree). And here is the problem, 'subtree' tree root doesn't have the implied base version entries. Adjusting the subtree to include them would mean a deep rewrite of subtree. Invalidating the subtree base version would mean recursive invalidation of parents' base versions. So just mark this tree as do-not-delta me. Abuse setuid bit for this purpose. tree_content_replace is the same as tree_content_set except that is is used to replace the root, so just clearing base sha1 here (instead of setting the bit) is fine. [di: log message] Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Dmitry Ivankov <divanorama@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-08-14 20:32:24 +02:00			`strbuf_addf(b, "%o %s%c",`
			`(unsigned int)(e->versions[v].mode & ~NO_DELTA),`
			`e->name->str_dat, '\0');`
fast-import was using dbuf's, replace them with strbuf's. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-17 13:48:17 +02:00			`strbuf_add(b, e->versions[v].sha1, 20);`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`}`
Implemented tree delta compression in fast-import. We now store for every tree entry two modes and two sha1 values; the base (aka "version 0") and the current/new (aka "version 1"). When we generate a tree object we also regenerate the prior version object and use that as our base object for a delta. This strategy saves a significant amount of memory as we can continue to use the atom pool for file/directory names and only increases each tree entry by an additional 24 bytes of memory. Branches should automatically delta against their ancestor tree, unless the ancestor tree is already at the delta chain limit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-28 18:22:50 +02:00			`}`
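The loop in `mktree` above writes one record per live entry: the mode in octal, a space, the name, a NUL byte, then the 20 raw hash bytes. A minimal sketch of that record layout using a plain byte buffer instead of a `strbuf`; `sketch_append_entry` and `SKETCH_RAWSZ` are hypothetical names introduced only for illustration:

```c
#include <stdio.h>
#include <string.h>

#define SKETCH_RAWSZ 20 /* a SHA-1 is 20 raw bytes */

/*
 * Write one tree record ("<octal mode> <name>\0<20 raw bytes>") into
 * buf. Returns the number of bytes written, or 0 if it does not fit.
 */
static size_t sketch_append_entry(unsigned char *buf, size_t cap,
				  unsigned int mode, const char *name,
				  const unsigned char *sha1)
{
	int n = snprintf((char *)buf, cap, "%o %s", mode, name);

	if (n < 0 || (size_t)n + 1 + SKETCH_RAWSZ > cap)
		return 0;
	/* snprintf has already stored the terminating NUL at buf[n]. */
	memcpy(buf + n + 1, sha1, SKETCH_RAWSZ);
	return (size_t)n + 1 + SKETCH_RAWSZ;
}
```

Because the mode is printed with `%o`, a directory mode of `040000` appears in the record as `40000`, without a leading zero.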

			`static void store_tree(struct tree_entry *root)`
			`{`
			`struct tree_content *t = root->tree;`
			`unsigned int i, j, del;`
fast-import optimization: Now that cmd_data acts on a strbuf, make last_object stashed buffer be a strbuf as well. On new stash, don't free the last stashed buffer, rather swap it with the one you will stash, this way, callers of store_object can act on static strbufs, and at some point, fast-import won't allocate new memory for objects buffers. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-17 14:00:38 +02:00			`struct last_object lo = { STRBUF_INIT, 0, 0, /* no_swap */ 1 };`
fast-import: prevent producing bad delta To produce deltas for tree objects fast-import tracks two versions of tree's entries - base and current one. Base version stands both for a delta base of this tree, and for a entry inside a delta base of a parent tree. So care should be taken to keep it in sync. tree_content_set cuts away a whole subtree and replaces it with a new one (or NULL for lazy load of a tree with known sha1). It keeps a base sha1 for this subtree (needed for parent tree). And here is the problem, 'subtree' tree root doesn't have the implied base version entries. Adjusting the subtree to include them would mean a deep rewrite of subtree. Invalidating the subtree base version would mean recursive invalidation of parents' base versions. So just mark this tree as do-not-delta me. Abuse setuid bit for this purpose. tree_content_replace is the same as tree_content_set except that is is used to replace the root, so just clearing base sha1 here (instead of setting the bit) is fine. [di: log message] Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Dmitry Ivankov <divanorama@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-08-14 20:32:24 +02:00			`struct object_entry *le = NULL;`
Implemented tree delta compression in fast-import. We now store for every tree entry two modes and two sha1 values; the base (aka "version 0") and the current/new (aka "version 1"). When we generate a tree object we also regenerate the prior version object and use that as our base object for a delta. This strategy saves a significant amount of memory as we can continue to use the atom pool for file/directory names and only increases each tree entry by an additional 24 bytes of memory. Branches should automatically delta against their ancestor tree, unless the ancestor tree is already at the delta chain limit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-28 18:22:50 +02:00
			`if (!is_null_sha1(root->versions[1].sha1))`
			`return;`

			`for (i = 0; i < t->entry_count; i++) {`
			`if (t->entries[i]->tree)`
			`store_tree(t->entries[i]);`
			`}`

fast-import: prevent producing bad delta To produce deltas for tree objects fast-import tracks two versions of tree's entries - base and current one. Base version stands both for a delta base of this tree, and for a entry inside a delta base of a parent tree. So care should be taken to keep it in sync. tree_content_set cuts away a whole subtree and replaces it with a new one (or NULL for lazy load of a tree with known sha1). It keeps a base sha1 for this subtree (needed for parent tree). And here is the problem, 'subtree' tree root doesn't have the implied base version entries. Adjusting the subtree to include them would mean a deep rewrite of subtree. Invalidating the subtree base version would mean recursive invalidation of parents' base versions. So just mark this tree as do-not-delta me. Abuse setuid bit for this purpose. tree_content_replace is the same as tree_content_set except that is is used to replace the root, so just clearing base sha1 here (instead of setting the bit) is fine. [di: log message] Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Dmitry Ivankov <divanorama@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-08-14 20:32:24 +02:00			`if (!(root->versions[0].mode & NO_DELTA))`
			`le = find_object(root->versions[0].sha1);`
fast-import optimization: Now that cmd_data acts on a strbuf, make last_object stashed buffer be a strbuf as well. On new stash, don't free the last stashed buffer, rather swap it with the one you will stash, this way, callers of store_object can act on static strbufs, and at some point, fast-import won't allocate new memory for objects buffers. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-17 14:00:38 +02:00			`if (S_ISDIR(root->versions[0].mode) && le && le->pack_id == pack_id) {`
fast-import was using dbuf's, replace them with strbuf's. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-17 13:48:17 +02:00			`mktree(t, 0, &old_tree);`
fast-import optimization: Now that cmd_data acts on a strbuf, make last_object stashed buffer be a strbuf as well. On new stash, don't free the last stashed buffer, rather swap it with the one you will stash, this way, callers of store_object can act on static strbufs, and at some point, fast-import won't allocate new memory for objects buffers. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-17 14:00:38 +02:00			`lo.data = old_tree;`
fast-import: start using struct pack_idx_entry This is in preparation for using write_idx_file(). Signed-off-by: Nicolas Pitre <nico@fluxnic.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-02-17 20:05:51 +01:00			`lo.offset = le->idx.offset;`
Implemented tree delta compression in fast-import. We now store for every tree entry two modes and two sha1 values; the base (aka "version 0") and the current/new (aka "version 1"). When we generate a tree object we also regenerate the prior version object and use that as our base object for a delta. This strategy saves a significant amount of memory as we can continue to use the atom pool for file/directory names and only increases each tree entry by an additional 24 bytes of memory. Branches should automatically delta against their ancestor tree, unless the ancestor tree is already at the delta chain limit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-28 18:22:50 +02:00			`lo.depth = t->delta_depth;`
			`}`

fast-import was using dbuf's, replace them with strbuf's. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-17 13:48:17 +02:00			`mktree(t, 1, &new_tree);`
			`store_object(OBJ_TREE, &new_tree, &lo, root->versions[1].sha1, 0);`
Implemented tree delta compression in fast-import. We now store for every tree entry two modes and two sha1 values; the base (aka "version 0") and the current/new (aka "version 1"). When we generate a tree object we also regenerate the prior version object and use that as our base object for a delta. This strategy saves a significant amount of memory as we can continue to use the atom pool for file/directory names and only increases each tree entry by an additional 24 bytes of memory. Branches should automatically delta against their ancestor tree, unless the ancestor tree is already at the delta chain limit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-28 18:22:50 +02:00
			`t->delta_depth = lo.depth;`
			`for (i = 0, j = 0, del = 0; i < t->entry_count; i++) {`
			`struct tree_entry *e = t->entries[i];`
			`if (e->versions[1].mode) {`
			`e->versions[0].mode = e->versions[1].mode;`
			`hashcpy(e->versions[0].sha1, e->versions[1].sha1);`
			`t->entries[j++] = e;`
			`} else {`
			`release_tree_entry(e);`
			`del++;`
			`}`
			`}`
			`t->entry_count -= del;`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`}`
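The final loop of `store_tree` above does two things in one pass: it promotes each surviving entry's current version to become the new base, and it squeezes deleted entries (current mode 0) out of the array. A simplified sketch of that in-place compaction, assuming a stripped-down entry type; `struct sketch_entry` and `sketch_compact` are hypothetical and omit the hashes and the release of dropped entries:

```c
#include <stddef.h>

/* Simplified entry: mode[0] is the base, mode[1] the current version. */
struct sketch_entry {
	unsigned int mode[2]; /* 0 means "not present in this version" */
};

/*
 * Promote version 1 to version 0 for every live entry and drop the
 * entries deleted in version 1. Returns the new entry count.
 */
static size_t sketch_compact(struct sketch_entry **entries, size_t count)
{
	size_t i, j = 0;

	for (i = 0; i < count; i++) {
		struct sketch_entry *e = entries[i];

		if (e->mode[1]) {
			e->mode[0] = e->mode[1];
			entries[j++] = e; /* keep, preserving order */
		}
		/* The real code calls release_tree_entry(e) for dropped entries. */
	}
	return j;
}
```

After this pass both versions of every remaining entry agree, so the next commit starts with the tree that was just written as its base.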

fast-import: tighten M 040000 syntax When tree_content_set() is asked to modify the path "foo/bar/", it first recurses like so: tree_content_set(root, "foo/bar/", sha1, S_IFDIR) -> tree_content_set(root:foo, "bar/", ...) -> tree_content_set(root:foo/bar, "", ...) And as a side-effect of 2794ad5 (fast-import: Allow filemodify to set the root, 2010-10-10), this last call is accepted and changes the tree entry for root:foo/bar to refer to the specified tree. That seems safe enough but let's reject the new syntax (we never meant to support it) and make it harder for frontends to introduce pointless incompatibilities with git fast-import 1.7.3. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-10-18 03:08:53 +02:00			`static void tree_content_replace(`
			`struct tree_entry *root,`
			`const unsigned char *sha1,`
			`const uint16_t mode,`
			`struct tree_content *newtree)`
			`{`
			`if (!S_ISDIR(mode))`
			`die("Root cannot be a non-directory");`
fast-import: prevent producing bad delta To produce deltas for tree objects fast-import tracks two versions of tree's entries - base and current one. Base version stands both for a delta base of this tree, and for a entry inside a delta base of a parent tree. So care should be taken to keep it in sync. tree_content_set cuts away a whole subtree and replaces it with a new one (or NULL for lazy load of a tree with known sha1). It keeps a base sha1 for this subtree (needed for parent tree). And here is the problem, 'subtree' tree root doesn't have the implied base version entries. Adjusting the subtree to include them would mean a deep rewrite of subtree. Invalidating the subtree base version would mean recursive invalidation of parents' base versions. So just mark this tree as do-not-delta me. Abuse setuid bit for this purpose. tree_content_replace is the same as tree_content_set except that is is used to replace the root, so just clearing base sha1 here (instead of setting the bit) is fine. [di: log message] Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Dmitry Ivankov <divanorama@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-08-14 20:32:24 +02:00			`hashclr(root->versions[0].sha1);`
fast-import: tighten M 040000 syntax When tree_content_set() is asked to modify the path "foo/bar/", it first recurses like so: tree_content_set(root, "foo/bar/", sha1, S_IFDIR) -> tree_content_set(root:foo, "bar/", ...) -> tree_content_set(root:foo/bar, "", ...) And as a side-effect of 2794ad5 (fast-import: Allow filemodify to set the root, 2010-10-10), this last call is accepted and changes the tree entry for root:foo/bar to refer to the specified tree. That seems safe enough but let's reject the new syntax (we never meant to support it) and make it harder for frontends to introduce pointless incompatibilities with git fast-import 1.7.3. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-10-18 03:08:53 +02:00			`hashcpy(root->versions[1].sha1, sha1);`
			`if (root->tree)`
			`release_tree_content_recursive(root->tree);`
			`root->tree = newtree;`
			`}`
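The commit message above describes marking a subtree as "do not delta" by borrowing the setuid bit of the base mode, which `store_tree` tests and `mktree` masks off before writing. A minimal sketch of that flag handling, assuming the bit has the POSIX setuid value `04000`; all `sketch_` names and constants are hypothetical:

```c
/* Assumed value: the setuid bit, which tree modes never use. */
#define SKETCH_NO_DELTA 04000u
#define SKETCH_S_IFMT   0170000u
#define SKETCH_S_IFDIR  0040000u

/* Mark a base mode so the tree is written without a delta base. */
static unsigned int sketch_mark_no_delta(unsigned int mode)
{
	return mode | SKETCH_NO_DELTA;
}

/* A delta against the base is only attempted when the flag is clear. */
static int sketch_may_delta(unsigned int mode)
{
	return !(mode & SKETCH_NO_DELTA);
}

/* The flag is internal bookkeeping and must not reach the tree object. */
static unsigned int sketch_mode_for_output(unsigned int mode)
{
	return mode & ~SKETCH_NO_DELTA;
}

/* The file-type test ignores the flag because it masks with S_IFMT. */
static int sketch_is_dir(unsigned int mode)
{
	return (mode & SKETCH_S_IFMT) == SKETCH_S_IFDIR;
}
```

Keeping the flag inside the mode field avoids adding a new member to every tree entry, at the cost of having to mask it off wherever the mode is written out.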

Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`static int tree_content_set(`
			`struct tree_entry *root,`
			`const char *p,`
			`const unsigned char *sha1,`
Support wholesale directory renames in fast-import Some source material (e.g. Subversion dump files) perform directory renames without telling us exactly which files in that subdirectory were moved. This makes it hard for a frontend to convert such data formats to a fast-import stream, as all the frontend has on hand is "Rename a/ to b/" with no details about what files are in a/, unless the frontend also kept track of all files. The new 'R' subcommand within a commit allows the frontend to rename either a file or an entire subdirectory, without needing to know the object's SHA-1 or the specific files contained within it. The rename is performed as efficiently as possible internally, making it cheaper than a 'D'/'M' pair for a file rename. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-07-10 04:58:23 +02:00			`const uint16_t mode,`
			`struct tree_content *subtree)`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`{`
fast-import: filemodify after M 040000 <tree> "" crashes Until M 040000 <tree> "" syntax was introduced in commit 2794ad5 (fast-import: Allow filemodify to set the root, 2010-10-10), it was impossible for the root entry to refer to an unloaded tree. Update various functions to take that possibility into account. Otherwise M 040000 <tree> "" M 100644 :1 "foo" and similar commands (using D, C, or R after resetting the root tree) segfault. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-10-18 03:03:38 +02:00			`struct tree_content *t;`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`const char *slash1;`
			`unsigned int i, n;`
			`struct tree_entry *e;`

			`slash1 = strchr(p, '/');`
			`if (slash1)`
			`n = slash1 - p;`
			`else`
			`n = strlen(p);`
Don't allow empty pathnames in fast-import riddochc on #git noticed corruption caused by import-tars. This was fixed in the prior commit by Dscho, but fast-import was wrong to have allowed a tree to be created with an empty string as the filename. No operating system allows this, and Git itself doesn't accept this into the index. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-04-29 02:01:27 +02:00			`if (!n)`
			`die("Empty path component found in input");`
Support wholesale directory renames in fast-import Some source material (e.g. Subversion dump files) perform directory renames without telling us exactly which files in that subdirectory were moved. This makes it hard for a frontend to convert such data formats to a fast-import stream, as all the frontend has on hand is "Rename a/ to b/" with no details about what files are in a/, unless the frontend also kept track of all files. The new 'R' subcommand within a commit allows the frontend to rename either a file or an entire subdirectory, without needing to know the object's SHA-1 or the specific files contained within it. The rename is performed as efficiently as possible internally, making it cheaper than a 'D'/'M' pair for a file rename. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-07-10 04:58:23 +02:00			`if (!slash1 && !S_ISDIR(mode) && subtree)`
			`die("Non-directories cannot have subtrees");`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00
fast-import: filemodify after M 040000 <tree> "" crashes Until M 040000 <tree> "" syntax was introduced in commit 2794ad5 (fast-import: Allow filemodify to set the root, 2010-10-10), it was impossible for the root entry to refer to an unloaded tree. Update various functions to take that possibility into account. Otherwise M 040000 <tree> "" M 100644 :1 "foo" and similar commands (using D, C, or R after resetting the root tree) segfault. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-10-18 03:03:38 +02:00			`if (!root->tree)`
			`load_tree(root);`
			`t = root->tree;`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`for (i = 0; i < t->entry_count; i++) {`
			`e = t->entries[i];`
Support case folding in git fast-import when core.ignorecase=true When core.ignorecase=true, imported file paths will be folded to match existing directory case. Signed-off-by: Joshua Jensen <jjensen@workspacewhiz.com> Signed-off-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-10-03 11:56:46 +02:00			`if (e->name->str_len == n && !strncmp_icase(p, e->name->str_dat, n)) {`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`if (!slash1) {`
Support wholesale directory renames in fast-import Some source material (e.g. Subversion dump files) perform directory renames without telling us exactly which files in that subdirectory were moved. This makes it hard for a frontend to convert such data formats to a fast-import stream, as all the frontend has on hand is "Rename a/ to b/" with no details about what files are in a/, unless the frontend also kept track of all files. The new 'R' subcommand within a commit allows the frontend to rename either a file or an entire subdirectory, without needing to know the object's SHA-1 or the specific files contained within it. The rename is performed as efficiently as possible internally, making it cheaper than a 'D'/'M' pair for a file rename. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-07-10 04:58:23 +02:00			`if (!S_ISDIR(mode)`
			`&& e->versions[1].mode == mode`
Implemented tree delta compression in fast-import. We now store for every tree entry two modes and two sha1 values; the base (aka "version 0") and the current/new (aka "version 1"). When we generate a tree object we also regenerate the prior version object and use that as our base object for a delta. This strategy saves a significant amount of memory as we can continue to use the atom pool for file/directory names and only increases each tree entry by an additional 24 bytes of memory. Branches should automatically delta against their ancestor tree, unless the ancestor tree is already at the delta chain limit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-28 18:22:50 +02:00			`&& !hashcmp(e->versions[1].sha1, sha1))`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`return 0;`
Implemented tree delta compression in fast-import. We now store for every tree entry two modes and two sha1 values; the base (aka "version 0") and the current/new (aka "version 1"). When we generate a tree object we also regenerate the prior version object and use that as our base object for a delta. This strategy saves a significant amount of memory as we can continue to use the atom pool for file/directory names and only increases each tree entry by an additional 24 bytes of memory. Branches should automatically delta against their ancestor tree, unless the ancestor tree is already at the delta chain limit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-28 18:22:50 +02:00			`e->versions[1].mode = mode;`
			`hashcpy(e->versions[1].sha1, sha1);`
Support wholesale directory renames in fast-import Some source material (e.g. Subversion dump files) perform directory renames without telling us exactly which files in that subdirectory were moved. This makes it hard for a frontend to convert such data formats to a fast-import stream, as all the frontend has on hand is "Rename a/ to b/" with no details about what files are in a/, unless the frontend also kept track of all files. The new 'R' subcommand within a commit allows the frontend to rename either a file or an entire subdirectory, without needing to know the object's SHA-1 or the specific files contained within it. The rename is performed as efficiently as possible internally, making it cheaper than a 'D'/'M' pair for a file rename. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-07-10 04:58:23 +02:00			`if (e->tree)`
Fixed segfault in fast-import after growing a tree. Growing a tree caused all subtrees to be deallocated and put back into the free list yet those subtree's contents were still actively in use. Consequently they were doled out again and got stomped on elsewhere. Releasing a tree is now performed in two parts, either releasing only the content array or releasing the content array and recursively releasing the subtree(s). Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-23 07:33:47 +02:00			`release_tree_content_recursive(e->tree);`
Support wholesale directory renames in fast-import Some source material (e.g. Subversion dump files) perform directory renames without telling us exactly which files in that subdirectory were moved. This makes it hard for a frontend to convert such data formats to a fast-import stream, as all the frontend has on hand is "Rename a/ to b/" with no details about what files are in a/, unless the frontend also kept track of all files. The new 'R' subcommand within a commit allows the frontend to rename either a file or an entire subdirectory, without needing to know the object's SHA-1 or the specific files contained within it. The rename is performed as efficiently as possible internally, making it cheaper than a 'D'/'M' pair for a file rename. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-07-10 04:58:23 +02:00			`e->tree = subtree;`
fast-import: prevent producing bad delta To produce deltas for tree objects fast-import tracks two versions of tree's entries - base and current one. Base version stands both for a delta base of this tree, and for a entry inside a delta base of a parent tree. So care should be taken to keep it in sync. tree_content_set cuts away a whole subtree and replaces it with a new one (or NULL for lazy load of a tree with known sha1). It keeps a base sha1 for this subtree (needed for parent tree). And here is the problem, 'subtree' tree root doesn't have the implied base version entries. Adjusting the subtree to include them would mean a deep rewrite of subtree. Invalidating the subtree base version would mean recursive invalidation of parents' base versions. So just mark this tree as do-not-delta me. Abuse setuid bit for this purpose. tree_content_replace is the same as tree_content_set except that is is used to replace the root, so just clearing base sha1 here (instead of setting the bit) is fine. [di: log message] Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Dmitry Ivankov <divanorama@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-08-14 20:32:24 +02:00
			`/*`
			`* We need to leave e->versions[0].sha1 alone`
			`* to avoid modifying the preimage tree used`
			`* when writing out the parent directory.`
			`* But after replacing the subdir with a`
			`* completely different one, it's not a good`
			`* delta base any more, and besides, we've`
			`* thrown away the tree entries needed to`
			`* make a delta against it.`
			`*`
			`* So let's just explicitly disable deltas`
			`* for the subtree.`
			`*/`
			`if (S_ISDIR(e->versions[0].mode))`
			`e->versions[0].mode |= NO_DELTA;`

Implemented tree delta compression in fast-import. We now store for every tree entry two modes and two sha1 values; the base (aka "version 0") and the current/new (aka "version 1"). When we generate a tree object we also regenerate the prior version object and use that as our base object for a delta. This strategy saves a significant amount of memory as we can continue to use the atom pool for file/directory names and only increases each tree entry by an additional 24 bytes of memory. Branches should automatically delta against their ancestor tree, unless the ancestor tree is already at the delta chain limit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-28 18:22:50 +02:00			`hashclr(root->versions[1].sha1);`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`return 1;`
			`}`
Implemented tree delta compression in fast-import. We now store for every tree entry two modes and two sha1 values; the base (aka "version 0") and the current/new (aka "version 1"). When we generate a tree object we also regenerate the prior version object and use that as our base object for a delta. This strategy saves a significant amount of memory as we can continue to use the atom pool for file/directory names and only increases each tree entry by an additional 24 bytes of memory. Branches should automatically delta against their ancestor tree, unless the ancestor tree is already at the delta chain limit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-28 18:22:50 +02:00			`if (!S_ISDIR(e->versions[1].mode)) {`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`e->tree = new_tree_content(8);`
Implemented tree delta compression in fast-import. We now store for every tree entry two modes and two sha1 values; the base (aka "version 0") and the current/new (aka "version 1"). When we generate a tree object we also regenerate the prior version object and use that as our base object for a delta. This strategy saves a significant amount of memory as we can continue to use the atom pool for file/directory names and only increases each tree entry by an additional 24 bytes of memory. Branches should automatically delta against their ancestor tree, unless the ancestor tree is already at the delta chain limit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-28 18:22:50 +02:00			`e->versions[1].mode = S_IFDIR;`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`}`
			`if (!e->tree)`
			`load_tree(e);`
Support wholesale directory renames in fast-import Some source material (e.g. Subversion dump files) perform directory renames without telling us exactly which files in that subdirectory were moved. This makes it hard for a frontend to convert such data formats to a fast-import stream, as all the frontend has on hand is "Rename a/ to b/" with no details about what files are in a/, unless the frontend also kept track of all files. The new 'R' subcommand within a commit allows the frontend to rename either a file or an entire subdirectory, without needing to know the object's SHA-1 or the specific files contained within it. The rename is performed as efficiently as possible internally, making it cheaper than a 'D'/'M' pair for a file rename. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-07-10 04:58:23 +02:00			`if (tree_content_set(e, slash1 + 1, sha1, mode, subtree)) {`
Implemented tree delta compression in fast-import. We now store for every tree entry two modes and two sha1 values; the base (aka "version 0") and the current/new (aka "version 1"). When we generate a tree object we also regenerate the prior version object and use that as our base object for a delta. This strategy saves a significant amount of memory as we can continue to use the atom pool for file/directory names and only increases each tree entry by an additional 24 bytes of memory. Branches should automatically delta against their ancestor tree, unless the ancestor tree is already at the delta chain limit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-28 18:22:50 +02:00			`hashclr(root->versions[1].sha1);`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`return 1;`
			`}`
			`return 0;`
			`}`
			`}`

			`if (t->entry_count == t->entry_capacity)`
fast-import: grow tree storage more aggressively When building up a tree for a commit, fast-import dynamically allocates memory for the tree entries. When more space is needed, the allocated memory is increased by a constant amount. For very large trees, this means re-allocating and memcpy()ing the memory O(n) times. To compound this problem, releasing the previous tree resource does not free the memory; it is kept in a pool for future trees. This means that each of the O(n) allocations will consume increasing amounts of memory, giving O(n^2) memory consumption. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-03-11 03:39:17 +01:00			`root->tree = t = grow_tree_content(t, t->entry_count);`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`e = new_tree_entry();`
Remove unnecessary casts from fast-import Jeff King pointed out that these casts are quite unnecessary, as the compiler should be doing them anyway, and may cause problems in the future if the size of the argument for to_atom were to ever be increased. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-03-12 20:48:37 +01:00			`e->name = to_atom(p, n);`
Implemented tree delta compression in fast-import. We now store for every tree entry two modes and two sha1 values; the base (aka "version 0") and the current/new (aka "version 1"). When we generate a tree object we also regenerate the prior version object and use that as our base object for a delta. This strategy saves a significant amount of memory as we can continue to use the atom pool for file/directory names and only increases each tree entry by an additional 24 bytes of memory. Branches should automatically delta against their ancestor tree, unless the ancestor tree is already at the delta chain limit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-28 18:22:50 +02:00			`e->versions[0].mode = 0;`
			`hashclr(e->versions[0].sha1);`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`t->entries[t->entry_count++] = e;`
			`if (slash1) {`
			`e->tree = new_tree_content(8);`
Implemented tree delta compression in fast-import. We now store for every tree entry two modes and two sha1 values; the base (aka "version 0") and the current/new (aka "version 1"). When we generate a tree object we also regenerate the prior version object and use that as our base object for a delta. This strategy saves a significant amount of memory as we can continue to use the atom pool for file/directory names and only increases each tree entry by an additional 24 bytes of memory. Branches should automatically delta against their ancestor tree, unless the ancestor tree is already at the delta chain limit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-28 18:22:50 +02:00			`e->versions[1].mode = S_IFDIR;`
Support wholesale directory renames in fast-import Some source material (e.g. Subversion dump files) perform directory renames without telling us exactly which files in that subdirectory were moved. This makes it hard for a frontend to convert such data formats to a fast-import stream, as all the frontend has on hand is "Rename a/ to b/" with no details about what files are in a/, unless the frontend also kept track of all files. The new 'R' subcommand within a commit allows the frontend to rename either a file or an entire subdirectory, without needing to know the object's SHA-1 or the specific files contained within it. The rename is performed as efficiently as possible internally, making it cheaper than a 'D'/'M' pair for a file rename. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-07-10 04:58:23 +02:00			`tree_content_set(e, slash1 + 1, sha1, mode, subtree);`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`} else {`
Support wholesale directory renames in fast-import Some source material (e.g. Subversion dump files) perform directory renames without telling us exactly which files in that subdirectory were moved. This makes it hard for a frontend to convert such data formats to a fast-import stream, as all the frontend has on hand is "Rename a/ to b/" with no details about what files are in a/, unless the frontend also kept track of all files. The new 'R' subcommand within a commit allows the frontend to rename either a file or an entire subdirectory, without needing to know the object's SHA-1 or the specific files contained within it. The rename is performed as efficiently as possible internally, making it cheaper than a 'D'/'M' pair for a file rename. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-07-10 04:58:23 +02:00			`e->tree = subtree;`
Implemented tree delta compression in fast-import. We now store for every tree entry two modes and two sha1 values; the base (aka "version 0") and the current/new (aka "version 1"). When we generate a tree object we also regenerate the prior version object and use that as our base object for a delta. This strategy saves a significant amount of memory as we can continue to use the atom pool for file/directory names and only increases each tree entry by an additional 24 bytes of memory. Branches should automatically delta against their ancestor tree, unless the ancestor tree is already at the delta chain limit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-28 18:22:50 +02:00			`e->versions[1].mode = mode;`
			`hashcpy(e->versions[1].sha1, sha1);`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`}`
Implemented tree delta compression in fast-import. We now store for every tree entry two modes and two sha1 values; the base (aka "version 0") and the current/new (aka "version 1"). When we generate a tree object we also regenerate the prior version object and use that as our base object for a delta. This strategy saves a significant amount of memory as we can continue to use the atom pool for file/directory names and only increases each tree entry by an additional 24 bytes of memory. Branches should automatically delta against their ancestor tree, unless the ancestor tree is already at the delta chain limit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-28 18:22:50 +02:00			`hashclr(root->versions[1].sha1);`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`return 1;`
			`}`
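The "grow tree storage more aggressively" change noted above replaces constant-increment growth with growth proportional to the current size; the call `grow_tree_content(t, t->entry_count)` appears to request as much extra room as is already in use. A minimal sketch of that doubling strategy follows. The `int_vec` type and `int_vec_push` function are hypothetical stand-ins, not part of fast-import, and plain `realloc` is used here instead of fast-import's pooled tree storage.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a growable entry array; not fast-import's type. */
struct int_vec {
	size_t count;     /* number of items in use */
	size_t capacity;  /* number of items allocated */
	int *items;
};

/*
 * Append one item. When the array is full, double its capacity
 * (starting at 8), so n appends cause only O(log n) reallocations.
 * Returns 0 on success, -1 if the allocation fails.
 */
static int int_vec_push(struct int_vec *v, int value)
{
	assert(v != NULL);
	if (v->count == v->capacity) {
		size_t new_cap = v->capacity ? v->capacity * 2 : 8;
		int *grown = realloc(v->items, new_cap * sizeof(*grown));
		if (!grown)
			return -1;
		v->items = grown;
		v->capacity = new_cap;
	}
	v->items[v->count++] = value;
	return 0;
}
```

Starting from a zero-initialized `struct int_vec`, repeated calls to `int_vec_push` grow the capacity through 8, 16, 32 and so on, so the total number of elements copied stays linear in the final size.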

Support wholesale directory renames in fast-import Some source material (e.g. Subversion dump files) perform directory renames without telling us exactly which files in that subdirectory were moved. This makes it hard for a frontend to convert such data formats to a fast-import stream, as all the frontend has on hand is "Rename a/ to b/" with no details about what files are in a/, unless the frontend also kept track of all files. The new 'R' subcommand within a commit allows the frontend to rename either a file or an entire subdirectory, without needing to know the object's SHA-1 or the specific files contained within it. The rename is performed as efficiently as possible internally, making it cheaper than a 'D'/'M' pair for a file rename. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-07-10 04:58:23 +02:00			`static int tree_content_remove(`
			`struct tree_entry *root,`
			`const char *p,`
			`struct tree_entry *backup_leaf)`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`{`
fast-import: filemodify after M 040000 <tree> "" crashes Until M 040000 <tree> "" syntax was introduced in commit 2794ad5 (fast-import: Allow filemodify to set the root, 2010-10-10), it was impossible for the root entry to refer to an unloaded tree. Update various functions to take that possibility into account. Otherwise M 040000 <tree> "" M 100644 :1 "foo" and similar commands (using D, C, or R after resetting the root tree) segfault. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-10-18 03:03:38 +02:00			`struct tree_content *t;`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`const char *slash1;`
			`unsigned int i, n;`
			`struct tree_entry *e;`

			`slash1 = strchr(p, '/');`
			`if (slash1)`
			`n = slash1 - p;`
			`else`
			`n = strlen(p);`

fast-import: filemodify after M 040000 <tree> "" crashes Until M 040000 <tree> "" syntax was introduced in commit 2794ad5 (fast-import: Allow filemodify to set the root, 2010-10-10), it was impossible for the root entry to refer to an unloaded tree. Update various functions to take that possibility into account. Otherwise M 040000 <tree> "" M 100644 :1 "foo" and similar commands (using D, C, or R after resetting the root tree) segfault. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-10-18 03:03:38 +02:00			`if (!root->tree)`
			`load_tree(root);`
			`t = root->tree;`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`for (i = 0; i < t->entry_count; i++) {`
			`e = t->entries[i];`
Support case folding in git fast-import when core.ignorecase=true When core.ignorecase=true, imported file paths will be folded to match existing directory case. Signed-off-by: Joshua Jensen <jjensen@workspacewhiz.com> Signed-off-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-10-03 11:56:46 +02:00			`if (e->name->str_len == n && !strncmp_icase(p, e->name->str_dat, n)) {`
fast-import: Improve robustness when D->F changes provided in wrong order When older versions of fast-export came across a directory changing to a symlink (or regular file), it would output the changes in the form M 120000 :239821 dir-changing-to-symlink D dir-changing-to-symlink/filename1 When fast-import sees the first line, it deletes the directory named dir-changing-to-symlink (and any files below it) and creates a symlink in its place. When fast-import came across the second line, it was previously trying to remove the file and relevant leading directories in tree_content_remove(), and as a side effect it would delete the symlink that was just created. This resulted in the symlink silently missing from the resulting repository. To improve robustness, we ignore file deletions underneath directory names that correspond to non-directories. This can also be viewed as a minor optimization: since there cannot be a file and a directory with the same name in the same directory, the file clearly can't exist so nothing needs to be done to delete it. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-07-09 15:10:56 +02:00			`if (slash1 && !S_ISDIR(e->versions[1].mode))`
			`/*`
			`* If p names a file in some subdirectory, and a`
			`* file or symlink matching the name of the`
			`* parent directory of p exists, then p cannot`
			`* exist and need not be deleted.`
			`*/`
			`return 1;`
Implemented tree delta compression in fast-import. We now store for every tree entry two modes and two sha1 values; the base (aka "version 0") and the current/new (aka "version 1"). When we generate a tree object we also regenerate the prior version object and use that as our base object for a delta. This strategy saves a significant amount of memory as we can continue to use the atom pool for file/directory names and only increases each tree entry by an additional 24 bytes of memory. Branches should automatically delta against their ancestor tree, unless the ancestor tree is already at the delta chain limit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-28 18:22:50 +02:00			`if (!slash1 || !S_ISDIR(e->versions[1].mode))`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`goto del_entry;`
			`if (!e->tree)`
			`load_tree(e);`
Support wholesale directory renames in fast-import Some source material (e.g. Subversion dump files) perform directory renames without telling us exactly which files in that subdirectory were moved. This makes it hard for a frontend to convert such data formats to a fast-import stream, as all the frontend has on hand is "Rename a/ to b/" with no details about what files are in a/, unless the frontend also kept track of all files. The new 'R' subcommand within a commit allows the frontend to rename either a file or an entire subdirectory, without needing to know the object's SHA-1 or the specific files contained within it. The rename is performed as efficiently as possible internally, making it cheaper than a 'D'/'M' pair for a file rename. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-07-10 04:58:23 +02:00			`if (tree_content_remove(e, slash1 + 1, backup_leaf)) {`
Correct tree corruption problems in fast-import. The new tree delta implementation caused blob SHA1s to be used instead of a tree SHA1 when a tree was written out. This really only appeared to happen when converting an existing file to a tree, but may have been possible in some other situations. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-29 03:43:04 +02:00			`for (n = 0; n < e->tree->entry_count; n++) {`
			`if (e->tree->entries[n]->versions[1].mode) {`
			`hashclr(root->versions[1].sha1);`
			`return 1;`
			`}`
			`}`
Support wholesale directory renames in fast-import Some source material (e.g. Subversion dump files) perform directory renames without telling us exactly which files in that subdirectory were moved. This makes it hard for a frontend to convert such data formats to a fast-import stream, as all the frontend has on hand is "Rename a/ to b/" with no details about what files are in a/, unless the frontend also kept track of all files. The new 'R' subcommand within a commit allows the frontend to rename either a file or an entire subdirectory, without needing to know the object's SHA-1 or the specific files contained within it. The rename is performed as efficiently as possible internally, making it cheaper than a 'D'/'M' pair for a file rename. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-07-10 04:58:23 +02:00			`backup_leaf = NULL;`
Correct tree corruption problems in fast-import. The new tree delta implementation caused blob SHA1s to be used instead of a tree SHA1 when a tree was written out. This really only appeared to happen when converting an existing file to a tree, but may have been possible in some other situations. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-29 03:43:04 +02:00			`goto del_entry;`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`}`
			`return 0;`
			`}`
			`}`
			`return 0;`

			`del_entry:`
Support wholesale directory renames in fast-import Some source material (e.g. Subversion dump files) perform directory renames without telling us exactly which files in that subdirectory were moved. This makes it hard for a frontend to convert such data formats to a fast-import stream, as all the frontend has on hand is "Rename a/ to b/" with no details about what files are in a/, unless the frontend also kept track of all files. The new 'R' subcommand within a commit allows the frontend to rename either a file or an entire subdirectory, without needing to know the object's SHA-1 or the specific files contained within it. The rename is performed as efficiently as possible internally, making it cheaper than a 'D'/'M' pair for a file rename. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-07-10 04:58:23 +02:00			`if (backup_leaf)`
			`memcpy(backup_leaf, e, sizeof(*backup_leaf));`
			`else if (e->tree)`
Implemented tree delta compression in fast-import. We now store for every tree entry two modes and two sha1 values; the base (aka "version 0") and the current/new (aka "version 1"). When we generate a tree object we also regenerate the prior version object and use that as our base object for a delta. This strategy saves a significant amount of memory as we can continue to use the atom pool for file/directory names and only increases each tree entry by an additional 24 bytes of memory. Branches should automatically delta against their ancestor tree, unless the ancestor tree is already at the delta chain limit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-28 18:22:50 +02:00			`release_tree_content_recursive(e->tree);`
Support wholesale directory renames in fast-import Some source material (e.g. Subversion dump files) perform directory renames without telling us exactly which files in that subdirectory were moved. This makes it hard for a frontend to convert such data formats to a fast-import stream, as all the frontend has on hand is "Rename a/ to b/" with no details about what files are in a/, unless the frontend also kept track of all files. The new 'R' subcommand within a commit allows the frontend to rename either a file or an entire subdirectory, without needing to know the object's SHA-1 or the specific files contained within it. The rename is performed as efficiently as possible internally, making it cheaper than a 'D'/'M' pair for a file rename. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-07-10 04:58:23 +02:00			`e->tree = NULL;`
Implemented tree delta compression in fast-import. We now store for every tree entry two modes and two sha1 values; the base (aka "version 0") and the current/new (aka "version 1"). When we generate a tree object we also regenerate the prior version object and use that as our base object for a delta. This strategy saves a significant amount of memory as we can continue to use the atom pool for file/directory names and only increases each tree entry by an additional 24 bytes of memory. Branches should automatically delta against their ancestor tree, unless the ancestor tree is already at the delta chain limit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-28 18:22:50 +02:00			`e->versions[1].mode = 0;`
			`hashclr(e->versions[1].sha1);`
			`hashclr(root->versions[1].sha1);`
Refactored fast-import's internals for future additions. Too many globals variables were being used not not enough code was resuable to process trees and commits so this is a simple refactoring of the existing blob processing code to get into a state that will be easier to handle trees and commits in. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 06:46:13 +02:00			`return 1;`
Created fast-import, a tool to quickly generating a pack from blobs. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-05 08:04:21 +02:00			`}`
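The tree-walking functions in this file share one idiom: find the first `/` in the path with `strchr`, treat the text before it as the name to match in the current tree, and recurse on the text after it. The sketch below isolates that step. The helper name `first_component` and its `rest` out-parameter are illustrative assumptions, not identifiers from fast-import.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper: return the length of the first '/'-separated
 * component of path. If rest is non-NULL, *rest is set to the text
 * after the first '/', or to NULL when path has no '/'.
 */
static size_t first_component(const char *path, const char **rest)
{
	const char *slash;

	assert(path != NULL);
	slash = strchr(path, '/');
	if (rest)
		*rest = slash ? slash + 1 : NULL;
	return slash ? (size_t)(slash - path) : strlen(path);
}
```

For `"a/b/c"` this yields length 1 with `"b/c"` left to process, while a path with no slash yields its full length and a NULL remainder. A result of 0, as for `"/foo"`, corresponds to an empty path component.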

Teach fast-import to recursively copy files/directories Some source material (e.g. Subversion dump files) perform directory renames by telling us the directory was copied, then deleted in the same revision. This makes it difficult for a frontend to convert such data formats to a fast-import stream, as all the frontend has on hand is "Copy a/ to b/; Delete a/" with no details about what files are in a/, unless the frontend also kept track of all files. The new 'C' subcommand within a commit allows the frontend to make a recursive copy of one path to another path within the branch, without needing to keep track of the individual file paths. The metadata copy is performed in memory efficiently, but is implemented as a copy-immediately operation, rather than copy-on-write. With this new 'C' subcommand frontends could obviously implement an 'R' (rename) on their own as a combination of 'C' and 'D' (delete), but since we have already offered up 'R' in the past and it is a trivial thing to keep implemented I'm not going to deprecate it. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-07-15 07:40:37 +02:00			`static int tree_content_get(`
			`struct tree_entry *root,`
			`const char *p,`
			`struct tree_entry *leaf)`
			`{`
fast-import: filemodify after M 040000 <tree> "" crashes Until M 040000 <tree> "" syntax was introduced in commit 2794ad5 (fast-import: Allow filemodify to set the root, 2010-10-10), it was impossible for the root entry to refer to an unloaded tree. Update various functions to take that possibility into account. Otherwise M 040000 <tree> "" M 100644 :1 "foo" and similar commands (using D, C, or R after resetting the root tree) segfault. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-10-18 03:03:38 +02:00			`struct tree_content *t;`
Teach fast-import to recursively copy files/directories Some source material (e.g. Subversion dump files) perform directory renames by telling us the directory was copied, then deleted in the same revision. This makes it difficult for a frontend to convert such data formats to a fast-import stream, as all the frontend has on hand is "Copy a/ to b/; Delete a/" with no details about what files are in a/, unless the frontend also kept track of all files. The new 'C' subcommand within a commit allows the frontend to make a recursive copy of one path to another path within the branch, without needing to keep track of the individual file paths. The metadata copy is performed in memory efficiently, but is implemented as a copy-immediately operation, rather than copy-on-write. With this new 'C' subcommand frontends could obviously implement an 'R' (rename) on their own as a combination of 'C' and 'D' (delete), but since we have already offered up 'R' in the past and it is a trivial thing to keep implemented I'm not going to deprecate it. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-07-15 07:40:37 +02:00			`const char *slash1;`
			`unsigned int i, n;`
			`struct tree_entry *e;`

			`slash1 = strchr(p, '/');`
			`if (slash1)`
			`n = slash1 - p;`
			`else`
			`n = strlen(p);`
fast-import: don't allow 'ls' of path with empty components As the fast-import manual explains: The value of <path> must be in canonical form. That is it must not: . contain an empty directory component (e.g. foo//bar is invalid), . end with a directory separator (e.g. foo/ is invalid), . start with a directory separator (e.g. /foo is invalid), Unfortunately the "ls" command accepts these invalid syntaxes and responds by declaring that the indicated path is missing. This is too subtle and causes importers to silently misbehave; better to error out so the operator knows what's happening. The C, R, and M commands already error out for such paths. Reported-by: Andrew Sayers <andrew-git@pileofstuff.org> Analysis-by: David Barr <davidbarr@google.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> 2012-03-10 05:07:22 +01:00			`if (!n)`
			`die("Empty path component found in input");`

			`if (!root->tree)`
			`load_tree(root);`
			`t = root->tree;`
			`for (i = 0; i < t->entry_count; i++) {`
			`e = t->entries[i];`
Support case folding in git fast-import when core.ignorecase=true When core.ignorecase=true, imported file paths will be folded to match existing directory case. Signed-off-by: Joshua Jensen <jjensen@workspacewhiz.com> Signed-off-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-10-03 11:56:46 +02:00			`if (e->name->str_len == n && !strncmp_icase(p, e->name->str_dat, n)) {`
			`if (!slash1) {`
			`memcpy(leaf, e, sizeof(*leaf));`
			`if (e->tree && is_null_sha1(e->versions[1].sha1))`
			`leaf->tree = dup_tree_content(e->tree);`
			`else`
			`leaf->tree = NULL;`
			`return 1;`
			`}`
			`if (!S_ISDIR(e->versions[1].mode))`
			`return 0;`
			`if (!e->tree)`
			`load_tree(e);`
			`return tree_content_get(e, slash1 + 1, leaf);`
			`}`
			`}`
			`return 0;`
			`}`
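
The component-by-component lookup in `tree_content_get` can be illustrated with a minimal sketch. Everything named `toy_*` below is hypothetical and is not part of fast-import.c. The sketch assumes a fully loaded in-memory tree, uses a case-sensitive `strncmp` rather than `strncmp_icase`, and returns NULL for an empty path component where the real function calls `die()`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy tree node; not the struct tree_entry used above. */
struct toy_node {
	const char *name;
	const struct toy_node *children;
	size_t child_count;
};

/* Sample tree (illustrative data only): b/c.txt and README. */
static const struct toy_node toy_b_children[] = {
	{ "c.txt", NULL, 0 },
};
static const struct toy_node toy_root_children[] = {
	{ "b", toy_b_children, 1 },
	{ "README", NULL, 0 },
};
static const struct toy_node toy_root = { "", toy_root_children, 2 };

/*
 * Resolve a slash-separated path one component at a time.
 * Returns the matching node, or NULL if a component is empty,
 * missing, or has no children to descend into.
 */
static const struct toy_node *toy_lookup(const struct toy_node *root,
					 const char *p)
{
	const char *slash = strchr(p, '/');
	size_t n = slash ? (size_t)(slash - p) : strlen(p);
	size_t i;

	assert(root && p);
	if (!n)
		return NULL; /* empty component: "a//b", "/a" or "a/" */

	for (i = 0; i < root->child_count; i++) {
		const struct toy_node *e = &root->children[i];

		if (strlen(e->name) == n && !strncmp(p, e->name, n)) {
			if (!slash)
				return e; /* last component matched */
			return toy_lookup(e, slash + 1);
		}
	}
	return NULL;
}
```

With the sample tree, `toy_lookup(&toy_root, "b/c.txt")` returns the `c.txt` node, while `"b//c.txt"`, `"b/"` and `"/b"` are all rejected as containing an empty component.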

Don't do non-fastforward updates in fast-import. If fast-import is being used to update an existing branch of a repository, the user may not want to lose commits if another process updates the same ref at the same time. For example, the user might be using fast-import to make just one or two commits against a live branch. We now perform a fast-forward check during the ref updating process. If updating a branch would cause commits in that branch to be lost, we skip over it and display the new SHA1 to standard error. This new default behavior can be overridden with `--force`, like git-push and git-fetch. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-06 22:08:06 +01:00			`static int update_branch(struct branch *b)`
			`{`
			`static const char *msg = "fast-import";`
			`struct ref_lock *lock;`
			`unsigned char old_sha1[20];`

fast-import: Allow "reset" to delete a new branch without error Creating a branch in fast-import and then resetting it without making any further commits to it currently causes an error message at the end of the import. This error is triggered by cvs2svn's git backend, which uses a temporary fixup branch when it creates tags, because the fixup branch is reset after each tag. This patch prevents the error, allowing "reset" to be used to delete temporary branches. Signed-off-by: Eyvind Bernhardsen <eyvind-git@orakel.ntnu.no> Acked-by: Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-03-16 20:49:09 +01:00			`if (is_null_sha1(b->sha1))`
			`return 0;`
			`if (read_ref(b->name, old_sha1))`
			`hashclr(old_sha1);`
git-update-ref: add --no-deref option for overwriting/detaching ref git-checkout is also adapted to make use of this new option instead of the handcrafted command sequence. Signed-off-by: Sven Verdoolaege <skimo@kotnet.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-05-09 12:33:20 +02:00			`lock = lock_any_ref_for_update(b->name, old_sha1, 0);`
			`if (!lock)`
			`return error("Unable to lock %s", b->name);`
			`if (!force_update && !is_null_sha1(old_sha1)) {`
			`struct commit *old_cmit, *new_cmit;`

			`old_cmit = lookup_commit_reference_gently(old_sha1, 0);`
			`new_cmit = lookup_commit_reference_gently(b->sha1, 0);`
			`if (!old_cmit || !new_cmit) {`
			`unlock_ref(lock);`
			`return error("Branch %s is missing commits.", b->name);`
			`}`

Merge branch 'jc/merge-base' (early part) This contains an evil merge to fast-import, in order to resolve in_merge_bases() update. 2007-02-14 01:50:32 +01:00			`if (!in_merge_bases(old_cmit, &new_cmit, 1)) {`
			`unlock_ref(lock);`
Rename warn() to warning() to fix symbol conflicts on BSD and Mac OS This fixes a problem reported by Randal Schwartz: >I finally tracked down all the (albeit inconsequential) errors I was getting >on both OpenBSD and OSX. It's the warn() function in usage.c. There's >warn(3) in BSD-style distros. It'd take a "great rename" to change it, but if >someone with better C skills than I have could do that, my linker and I would >appreciate it. It was annoying to me, too, when I was doing some mergetool testing on Mac OS X, so here's a fix. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: "Randal L. Schwartz" <merlyn@stonehenge.com> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-31 01:07:05 +02:00			`warning("Not updating %s"`
			`" (new tip %s does not contain %s)",`
			`b->name, sha1_to_hex(b->sha1), sha1_to_hex(old_sha1));`
			`return -1;`
			`}`
			`}`
			`if (write_ref_sha1(lock, b->sha1, msg) < 0)`
			`return error("Unable to update %s", b->name);`
			`return 0;`
			`}`
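
The fast-forward test in `update_branch` amounts to asking whether the old tip is reachable from the new tip. A minimal sketch of that idea follows, assuming a hypothetical `toy_commit` type with at most two parents. It is not how `in_merge_bases` is implemented, and the naive recursion keeps no visited set, so it is only suitable for tiny graphs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical commit node with up to two parents (NULL when absent). */
struct toy_commit {
	const struct toy_commit *parents[2];
};

/* Sample history (illustrative only): a <- b, a <- c, merge m of b and c. */
static const struct toy_commit toy_a = { { NULL, NULL } };
static const struct toy_commit toy_b = { { &toy_a, NULL } };
static const struct toy_commit toy_c = { { &toy_a, NULL } };
static const struct toy_commit toy_m = { { &toy_b, &toy_c } };

/*
 * Return 1 if old_tip is reachable from new_tip by following parent
 * links (a commit counts as reachable from itself), else 0.
 */
static int toy_is_ancestor(const struct toy_commit *old_tip,
			   const struct toy_commit *new_tip)
{
	assert(old_tip);
	if (!new_tip)
		return 0;
	if (new_tip == old_tip)
		return 1;
	return toy_is_ancestor(old_tip, new_tip->parents[0]) ||
	       toy_is_ancestor(old_tip, new_tip->parents[1]);
}
```

In this toy history, moving a branch from `toy_a` to `toy_m` would be a fast-forward, whereas moving it from `toy_b` to `toy_c` would drop `toy_b` and so would be refused unless forced.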

			`static void dump_branches(void)`
			`{`
			`unsigned int i;`
			`struct branch *b;`

			`for (i = 0; i < branch_table_sz; i++) {`
			`for (b = branch_table[i]; b; b = b->table_next_branch)`
			`failure |= update_branch(b);`
			`}`
			`}`

Declare no-arg functions as (void) in fast-import. Apparently the git convention is to declare any function which takes no arguments as taking void. I did not do this during the early fast-import development, but should have. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-17 07:47:25 +01:00			`static void dump_tags(void)`
Implemented 'tag' command in fast-import. Tags received from the frontend are generated in memory in a simple linked list in the order that the tag commands were sent by the frontend. If multiple different tag objects for the same tag name get generated the last one sent by the frontend will be the one that gets written out at termination. Multiple tag objects for the same name will cause all older tags of the same name to be lost. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-24 09:12:13 +02:00			`{`
			`static const char *msg = "fast-import";`
			`struct tag *t;`
			`struct ref_lock *lock;`
			`char ref_name[PATH_MAX];`

			`for (t = first_tag; t; t = t->next_tag) {`
			`sprintf(ref_name, "tags/%s", t->name);`
			`lock = lock_ref_sha1(ref_name, NULL);`
			`if (!lock || write_ref_sha1(lock, t->sha1, msg) < 0)`
			`failure |= error("Unable to update %s", ref_name);`
			`}`
			`}`

Added option to export the marks table when fast-import terminates. The marks table can be used by the frontend to load any commit after the import and compare it to whatever data the frontend knows about that commit. If the mark idnums can be easily correlated to some reference source then its relatively trivial to compare the GIT tree to the reference to verify the accuracy of the import. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-25 22:03:04 +02:00			`static void dump_marks_helper(FILE *f,`
Use uintmax_t for marks in fast-import. If a frontend wants to use a mark per file revision and per commit and is doing a truly huge import (such as a 32 GiB SVN repository) we may need more than 2**32 unique mark values, especially if the frontend is unable (or unwilling) to recycle mark values. For mark idnums we should use the largest unsigned integer type available, hoping that will be at least 64 bits when we are compiled as a 64 bit executable. This way we may consume huge amounts of memory storing our mark table, but we'll at least be able to process the entire import without failing. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-16 06:33:19 +01:00			`uintmax_t base,`
			`struct mark_set *m)`
			`{`
			`uintmax_t k;`
			`if (m->shift) {`
			`for (k = 0; k < 1024; k++) {`
			`if (m->data.sets[k])`
fast-import: export correctly marks larger than 2^20-1 dump_marks_helper() has a bug when dumping marks larger than 2^20-1, i.e., when the sparse array has more than two levels. The bug was that the 'base' counter was being shifted by 20 bits at level 3, and then again by 10 bits at level 2, rather than a total shift of 20 bits in this argument to the recursive call: (base + k) << m->shift There are two ways to fix this correctly, the elegant: (base + k) << 10 and the one I chose due to edit distance: base + (k << m->shift) Signed-off-by: Raja R Harinath <harinath@hurrynot.org> Acked-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-07-13 13:51:48 +02:00			`dump_marks_helper(f, base + (k << m->shift),`
			`m->data.sets[k]);`
			`}`
			`} else {`
			`for (k = 0; k < 1024; k++) {`
			`if (m->data.marked[k])`
Check for PRIuMAX rather than NO_C99_FORMAT in fast-import.c. Thanks to Simon 'corecode' Schubert <corecode@fs.ei.tum.de> for the clean-up. Defining the C99 standard PRIuMAX when necessary replaces UM_FMT and the awkward UM10_FMT. There are no direct C99 translations for other uses of NO_C99_FORMAT in git, alas. Signed-off-by: Jason Riedy <ejr@cs.berkeley.edu> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-21 02:34:56 +01:00			`fprintf(f, ":%" PRIuMAX " %s\n", base + k,`
fast-import: start using struct pack_idx_entry This is in preparation for using write_idx_file(). Signed-off-by: Nicolas Pitre <nico@fluxnic.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-02-17 20:05:51 +01:00			`sha1_to_hex(m->data.marked[k]->idx.sha1));`
			`}`
			`}`
			`}`
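
The `base + (k << m->shift)` expression in `dump_marks_helper` can be checked against a small model of the mark table. The sketch below assumes a hypothetical `toy_mark_set` radix tree with the same 1024-way fanout and collects mark ids into an array instead of printing them. The names and the sample table are illustrative and are not taken from fast-import.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_FANOUT 1024

/* Hypothetical stand-in for the mark table: a 1024-way radix tree. */
struct toy_mark_set {
	unsigned int shift; /* 0 for a leaf, else bits covered per child */
	union {
		struct toy_mark_set *sets[TOY_FANOUT];
		const char *marked[TOY_FANOUT];
	} data;
};

/* Append every mark id found under m to out[]; returns the new count. */
static size_t toy_collect_ids(const struct toy_mark_set *m, uintmax_t base,
			      uintmax_t *out, size_t count, size_t cap)
{
	uintmax_t k;

	assert(m && out);
	for (k = 0; k < TOY_FANOUT; k++) {
		if (m->shift) {
			/* Each level adds only its own contribution to base. */
			if (m->data.sets[k])
				count = toy_collect_ids(m->data.sets[k],
							base + (k << m->shift),
							out, count, cap);
		} else if (m->data.marked[k] && count < cap) {
			out[count++] = base + k;
		}
	}
	return count;
}

/* Build a three-level table holding one mark: (1 << 20) + (2 << 10) + 3. */
static struct toy_mark_set *toy_example(void)
{
	static struct toy_mark_set top, mid, leaf;

	leaf.shift = 0;
	leaf.data.marked[3] = "mark";
	mid.shift = 10;
	mid.data.sets[2] = &leaf;
	top.shift = 20;
	top.data.sets[1] = &mid;
	return &top;
}
```

With three levels, the earlier form `(base + k) << m->shift` would shift the top-level contribution a second time at the middle level, which is why marks above 2^20-1 were exported with the wrong id.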

			`static void dump_marks(void)`
			`{`
Use atomic updates to the fast-import mark file When we allow fast-import frontends to reload a mark file from a prior session we want to let them use the same file as they exported the marks to. This makes it very simple for the frontend to save state across incremental imports. But we don't want to lose the old marks table if anything goes wrong while writing our current marks table. So instead of truncating and overwriting the path specified to --export-marks we use the standard lockfile code to write the current marks out to a temporary file, then rename it over the old marks table. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-03-08 00:05:38 +01:00			`static struct lock_file mark_lock;`
			`int mark_fd;`
			`FILE *f;`

fast-import: put marks reading in its own function All options do nothing but set settings, with the exception of the --input-marks option. Delay the reading of the marks file till after all options have been parsed. Also, rename mark_file to export_marks_file as it is now ambiguous. Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-12-04 18:06:55 +01:00			`if (!export_marks_file)`
			`return;`

			`mark_fd = hold_lock_file_for_update(&mark_lock, export_marks_file, 0);`
			`if (mark_fd < 0) {`
			`failure |= error("Unable to write marks file %s: %s",`
			`export_marks_file, strerror(errno));`
			`return;`
			`}`

			`f = fdopen(mark_fd, "w");`
			`if (!f) {`
fast-import: Don't use a maybe-clobbered errno value Without this change, each diagnostic could use an errno value clobbered by the close or unlink in rollback_lock_file. Signed-off-by: Jim Meyering <meyering@redhat.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-01-18 19:35:49 +01:00			`int saved_errno = errno;`
Use atomic updates to the fast-import mark file When we allow fast-import frontends to reload a mark file from a prior session we want to let them use the same file as they exported the marks to. This makes it very simple for the frontend to save state across incremental imports. But we don't want to lose the old marks table if anything goes wrong while writing our current marks table. So instead of truncating and overwriting the path specified to --export-marks we use the standard lockfile code to write the current marks out to a temporary file, then rename it over the old marks table. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-03-08 00:05:38 +01:00			`rollback_lock_file(&mark_lock);`
			`failure |= error("Unable to write marks file %s: %s",`
fast-import: put marks reading in its own function All options do nothing but set settings, with the exception of the --input-marks option. Delay the reading of the marks file till after all options have been parsed. Also, rename mark_file to export_marks_file as it is now ambiguous. Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-12-04 18:06:55 +01:00			`export_marks_file, strerror(saved_errno));`
Use atomic updates to the fast-import mark file When we allow fast-import frontends to reload a mark file from a prior session we want to let them use the same file as they exported the marks to. This makes it very simple for the frontend to save state across incremental imports. But we don't want to lose the old marks table if anything goes wrong while writing our current marks table. So instead of truncating and overwriting the path specified to --export-marks we use the standard lockfile code to write the current marks out to a temporary file, then rename it over the old marks table. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-03-08 00:05:38 +01:00			`return;`
Added option to export the marks table when fast-import terminates. The marks table can be used by the frontend to load any commit after the import and compare it to whatever data the frontend knows about that commit. If the mark idnums can be easily correlated to some reference source then its relatively trivial to compare the GIT tree to the reference to verify the accuracy of the import. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-25 22:03:04 +02:00			`}`
Use atomic updates to the fast-import mark file When we allow fast-import frontends to reload a mark file from a prior session we want to let them use the same file as they exported the marks to. This makes it very simple for the frontend to save state across incremental imports. But we don't want to lose the old marks table if anything goes wrong while writing our current marks table. So instead of truncating and overwriting the path specified to --export-marks we use the standard lockfile code to write the current marks out to a temporary file, then rename it over the old marks table. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-03-08 00:05:38 +01:00
Improve use of lockfile API Remove remaining double close(2)'s. i.e. close() before commit_locked_index() or commit_lock_file(). Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-01-16 20:12:46 +01:00			`/*`
fast-import.c: don't try to commit marks file if write failed We also move the assignment of -1 to the lock file descriptor up, so that rollback_lock_file() can be called safely after a possible attempt to fclose(). This matches the contents of the 'if' statement just above testing success of fdopen(). Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil> Acked-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-01-17 17:58:34 +01:00			`* Since the lock file was fdopen()'ed, it should not be close()'ed.`
			`* Assign -1 to the lock file descriptor so that commit_lock_file()`
Improve use of lockfile API Remove remaining double close(2)'s. i.e. close() before commit_locked_index() or commit_lock_file(). Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-01-16 20:12:46 +01:00			`* won't try to close() it.`
			`*/`
			`mark_lock.fd = -1;`
fast-import.c: don't try to commit marks file if write failed We also move the assignment of -1 to the lock file descriptor up, so that rollback_lock_file() can be called safely after a possible attempt to fclose(). This matches the contents of the 'if' statement just above testing success of fdopen(). Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil> Acked-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-01-17 17:58:34 +01:00
			`dump_marks_helper(f, 0, marks);`
			`if (ferror(f) || fclose(f)) {`
fast-import: Don't use a maybe-clobbered errno value Without this change, each diagnostic could use an errno value clobbered by the close or unlink in rollback_lock_file. Signed-off-by: Jim Meyering <meyering@redhat.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-01-18 19:35:49 +01:00			`int saved_errno = errno;`
fast-import.c: don't try to commit marks file if write failed We also move the assignment of -1 to the lock file descriptor up, so that rollback_lock_file() can be called safely after a possible attempt to fclose(). This matches the contents of the 'if' statement just above testing success of fdopen(). Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil> Acked-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-01-17 17:58:34 +01:00			`rollback_lock_file(&mark_lock);`
			`failure |= error("Unable to write marks file %s: %s",`
fast-import: put marks reading in its own function All options do nothing but set settings, with the exception of the --input-marks option. Delay the reading of the marks file till after all options have been parsed. Also, rename mark_file to export_marks_file as it is now ambiguous. Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-12-04 18:06:55 +01:00			`export_marks_file, strerror(saved_errno));`
fast-import.c: don't try to commit marks file if write failed We also move the assignment of -1 to the lock file descriptor up, so that rollback_lock_file() can be called safely after a possible attempt to fclose(). This matches the contents of the 'if' statement just above testing success of fdopen(). Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil> Acked-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-01-17 17:58:34 +01:00			`return;`
			`}`

			`if (commit_lock_file(&mark_lock)) {`
fast-import: Don't use a maybe-clobbered errno value Without this change, each diagnostic could use an errno value clobbered by the close or unlink in rollback_lock_file. Signed-off-by: Jim Meyering <meyering@redhat.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-01-18 19:35:49 +01:00			`int saved_errno = errno;`
fast-import.c: don't try to commit marks file if write failed We also move the assignment of -1 to the lock file descriptor up, so that rollback_lock_file() can be called safely after a possible attempt to fclose(). This matches the contents of the 'if' statement just above testing success of fdopen(). Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil> Acked-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-01-17 17:58:34 +01:00			`rollback_lock_file(&mark_lock);`
			`failure |= error("Unable to commit marks file %s: %s",`
fast-import: put marks reading in its own function All options do nothing but set settings, with the exception of the --input-marks option. Delay the reading of the marks file till after all options have been parsed. Also, rename mark_file to export_marks_file as it is now ambiguous. Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-12-04 18:06:55 +01:00			`export_marks_file, strerror(saved_errno));`
fast-import.c: don't try to commit marks file if write failed We also move the assignment of -1 to the lock file descriptor up, so that rollback_lock_file() can be called safely after a possible attempt to fclose(). This matches the contents of the 'if' statement just above testing success of fdopen(). Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil> Acked-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-01-17 17:58:34 +01:00			`return;`
			`}`
Added option to export the marks table when fast-import terminates. The marks table can be used by the frontend to load any commit after the import and compare it to whatever data the frontend knows about that commit. If the mark idnums can be easily correlated to some reference source then its relatively trivial to compare the GIT tree to the reference to verify the accuracy of the import. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-25 22:03:04 +02:00			`}`

fast-import: put marks reading in its own function All options do nothing but set settings, with the exception of the --input-marks option. Delay the reading of the marks file till after all options have been parsed. Also, rename mark_file to export_marks_file as it is now ambiguous. Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-12-04 18:06:55 +01:00			`static void read_marks(void)`
			`{`
			`char line[512];`
			`FILE *f = fopen(import_marks_file, "r");`
fast-import: Introduce --import-marks-if-exists When a frontend uses a marks file to ensure its state persists between runs, it may represent "clean slate" when bootstrapping with "no marks yet". In such a case, feeding the last state with --import-marks and saving the state after the current run with --export-marks would be a natural thing to do. The --import-marks option however errors out when the specified marks file doesn't exist; this makes bootstrapping a bit difficult. The location of the marks file becomes backend-dependent when --relative-marks is in effect, and the frontend cannot check for the existence of the file in such a case. The --import-marks-if-exists option does the same thing as --import-marks but does not flag an error if the named file does not exist yet to help these frontends. Helped-by: Junio C Hamano <gitster@pobox.com> Helped-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-01-15 07:31:46 +01:00			`if (f)`
			`;`
			`else if (import_marks_file_ignore_missing && errno == ENOENT)`
			`return; /* Marks file does not exist */`
			`else`
fast-import: put marks reading in its own function All options do nothing but set settings, with the exception of the --input-marks option. Delay the reading of the marks file till after all options have been parsed. Also, rename mark_file to export_marks_file as it is now ambiguous. Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-12-04 18:06:55 +01:00			`die_errno("cannot read '%s'", import_marks_file);`
			`while (fgets(line, sizeof(line), f)) {`
			`uintmax_t mark;`
			`char *end;`
			`unsigned char sha1[20];`
			`struct object_entry *e;`

			`end = strchr(line, '\n');`
			`if (line[0] != ':' || !end)`
			`die("corrupt mark line: %s", line);`
			`*end = 0;`
			`mark = strtoumax(line + 1, &end, 10);`
			`if (!mark || end == line + 1`
			`|| *end != ' ' || get_sha1(end + 1, sha1))`
			`die("corrupt mark line: %s", line);`
			`e = find_object(sha1);`
			`if (!e) {`
			`enum object_type type = sha1_object_info(sha1, NULL);`
			`if (type < 0)`
			`die("object not found: %s", sha1_to_hex(sha1));`
			`e = insert_object(sha1);`
			`e->type = type;`
			`e->pack_id = MAX_PACK_ID;`
fast-import: start using struct pack_idx_entry This is in preparation for using write_idx_file(). Signed-off-by: Nicolas Pitre <nico@fluxnic.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-02-17 20:05:51 +01:00			`e->idx.offset = 1; /* just not zero! */`
fast-import: put marks reading in its own function All options do nothing but set settings, with the exception of the --input-marks option. Delay the reading of the marks file till after all options have been parsed. Also, rename mark_file to export_marks_file as it is now ambiguous. Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-12-04 18:06:55 +01:00			`}`
			`insert_mark(mark, e);`
			`}`
			`fclose(f);`
			`}`
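The line format accepted by `read_marks()` is `:<decimal mark> <object name>` followed by a newline. A standalone sketch of the same validation steps is shown here. It is an illustration only: `parse_mark_line` is a hypothetical name, and it assumes the object name is always a full 40-character hex SHA-1, whereas the real code hands the text to `get_sha1()`.

```c
#include <assert.h>
#include <ctype.h>
#include <inttypes.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper, not part of fast-import.c.
 * Parses ":<mark> <40 hex digits>\n". On success stores the mark
 * number and a NUL-terminated copy of the hex name and returns 0;
 * returns -1 for any malformed line.
 */
static int parse_mark_line(const char *line, uintmax_t *mark, char hex[41])
{
	char *end;
	size_t i;

	assert(line && mark && hex);
	if (line[0] != ':')
		return -1;

	/* Mark 0 is rejected, as is a line with no digits at all. */
	*mark = strtoumax(line + 1, &end, 10);
	if (!*mark || end == line + 1 || *end != ' ')
		return -1;
	end++;

	/* Assumption: a full hex SHA-1 rather than any object name. */
	for (i = 0; i < 40; i++) {
		if (!isxdigit((unsigned char)end[i]))
			return -1;
		hex[i] = end[i];
	}
	hex[40] = '\0';

	/* Like read_marks(), insist on a terminating newline. */
	if (end[40] != '\n')
		return -1;
	return 0;
}
```

The early return inside the loop also covers short lines: a NUL byte is not a hex digit, so the scan stops before reading past the end of the string.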


Drop strbuf's 'eof' marker, and make read_line a first class citizen. read_line is now strbuf_getline, and is a first class citizen, it returns 0 when reading a line worked, EOF else. The ->eof marker was used non-locally by fast-import.c, mimic the same behaviour using a static int in "read_next_command", that now returns -1 on EOF, and avoids to call strbuf_getline when it's in EOF state. Also no longer automagically strbuf_release the buffer, it's counter intuitive and breaks fast-import in a very subtle way. Note: being at EOF implies that command_buf.len == 0. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-17 11:19:04 +02:00			`static int read_next_command(void)`
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00			`{`
Drop strbuf's 'eof' marker, and make read_line a first class citizen. read_line is now strbuf_getline, and is a first class citizen, it returns 0 when reading a line worked, EOF else. The ->eof marker was used non-locally by fast-import.c, mimic the same behaviour using a static int in "read_next_command", that now returns -1 on EOF, and avoids to call strbuf_getline when it's in EOF state. Also no longer automagically strbuf_release the buffer, it's counter intuitive and breaks fast-import in a very subtle way. Note: being at EOF implies that command_buf.len == 0. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-17 11:19:04 +02:00			`static int stdin_eof = 0;`

			`if (stdin_eof) {`
			`unread_command_buf = 0;`
			`return EOF;`
			`}`

fast-import: Allow cat-blob requests at arbitrary points in stream The new rule: a "cat-blob" can be inserted wherever a comment is allowed, which means at the start of any line except in the middle of a "data" command. This saves frontends from having to loop over everything they want to commit in the next commit and cat-ing the necessary objects in advance. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: David Barr <david.barr@cordelta.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-11-28 20:45:58 +01:00			`for (;;) {`
Include recent command history in fast-import crash reports When we crash the frontend developer (or end-user) may need to know roughly around what part of the input stream we had a problem with and aborted on. Because line numbers aren't very useful in this sort of application we instead just keep the last 100 commands in a FIFO queue and print them as part of the crash report. Currently one problem with this design is a commit that has more than 100 modified files in it will flood the FIFO and any context regarding branch/from/committer/mark/comments will be lost. We really should save only the last few (10?) file changes for the current commit, ensuring we have some prior higher level commands in the FIFO when we crash on a file M/D/C/R command. Another issue with this approach is the FIFO only includes the commands, it does not include the commit messages. Yet having a commit message may be useful to help locate the relevant change in the source material. In practice I don't think this is going to be a major concern as the frontend can always embed its own source change set identifier as a comment (which will appear in the crash report) and the commit message(s) for the most recent commits of any given branch should be obtainable from the (packed) commit objects. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-08-03 10:47:04 +02:00			`if (unread_command_buf) {`
Make trailing LF optional for all fast-import commands For the same reasons as the prior change we want to allow frontends to omit the trailing LF that usually delimits commands. In some cases these just make the input stream more verbose looking than it needs to be, and its just simpler for the frontend developer to get started if our parser is slightly more lenient about where an LF is required and where it isn't. To make this optional LF feature work we now have to buffer up to one line of input in command_buf. This buffering can happen if we look at the current input command but don't recognize it at this point in the code. In such a case we need to "unget" the entire line, but we cannot depend upon the stdio library to let us do ungetc() for that many characters at once. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-08-01 08:22:53 +02:00			`unread_command_buf = 0;`
Include recent command history in fast-import crash reports When we crash the frontend developer (or end-user) may need to know roughly around what part of the input stream we had a problem with and aborted on. Because line numbers aren't very useful in this sort of application we instead just keep the last 100 commands in a FIFO queue and print them as part of the crash report. Currently one problem with this design is a commit that has more than 100 modified files in it will flood the FIFO and any context regarding branch/from/committer/mark/comments will be lost. We really should save only the last few (10?) file changes for the current commit, ensuring we have some prior higher level commands in the FIFO when we crash on a file M/D/C/R command. Another issue with this approach is the FIFO only includes the commands, it does not include the commit messages. Yet having a commit message may be useful to help locate the relevant change in the source material. In practice I don't think this is going to be a major concern as the frontend can always embed its own source change set identifier as a comment (which will appear in the crash report) and the commit message(s) for the most recent commits of any given branch should be obtainable from the (packed) commit objects. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-08-03 10:47:04 +02:00			`} else {`
			`struct recent_command *rc;`

strbuf change: be sure ->buf is never ever NULL. For that purpose, the ->buf is always initialized with a char * buf living in the strbuf module. It is made a char * so that we can sloppily accept things that perform: sb->buf[0] = '\0', and because you can't pass "" as an initializer for ->buf without making gcc unhappy for very good reasons. strbuf_init/_detach/_grow have been fixed to trust ->alloc and not ->buf anymore. as a consequence strbuf_detach is _mandatory_ to detach a buffer, copying ->buf isn't an option anymore, if ->buf is going to escape from the scope, and eventually be free'd. API changes: * strbuf_setlen now always works, so just make strbuf_reset a convenience macro. * strbuf_detatch takes a size_t* optional argument (meaning it can be NULL) to copy the buffer's len, as it was needed for this refactor to make the code more readable, and working like the callers. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-27 12:58:23 +02:00			`strbuf_detach(&command_buf, NULL);`
Drop strbuf's 'eof' marker, and make read_line a first class citizen. read_line is now strbuf_getline, and is a first class citizen, it returns 0 when reading a line worked, EOF else. The ->eof marker was used non-locally by fast-import.c, mimic the same behaviour using a static int in "read_next_command", that now returns -1 on EOF, and avoids to call strbuf_getline when it's in EOF state. Also no longer automagically strbuf_release the buffer, it's counter intuitive and breaks fast-import in a very subtle way. Note: being at EOF implies that command_buf.len == 0. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-17 11:19:04 +02:00			`stdin_eof = strbuf_getline(&command_buf, stdin, '\n');`
			`if (stdin_eof)`
			`return EOF;`
Include recent command history in fast-import crash reports When we crash the frontend developer (or end-user) may need to know roughly around what part of the input stream we had a problem with and aborted on. Because line numbers aren't very useful in this sort of application we instead just keep the last 100 commands in a FIFO queue and print them as part of the crash report. Currently one problem with this design is a commit that has more than 100 modified files in it will flood the FIFO and any context regarding branch/from/committer/mark/comments will be lost. We really should save only the last few (10?) file changes for the current commit, ensuring we have some prior higher level commands in the FIFO when we crash on a file M/D/C/R command. Another issue with this approach is the FIFO only includes the commands, it does not include the commit messages. Yet having a commit message may be useful to help locate the relevant change in the source material. In practice I don't think this is going to be a major concern as the frontend can always embed its own source change set identifier as a comment (which will appear in the crash report) and the commit message(s) for the most recent commits of any given branch should be obtainable from the (packed) commit objects. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-08-03 10:47:04 +02:00
fast-import: add feature command This allows the fronted to require a specific feature to be supported by the backend, or abort. Also add support for four initial feature, date-format=, force=, import-marks=, export-marks=. Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-12-04 18:06:56 +01:00			`if (!seen_data_command`
fast-import: add option command This allows the frontend to specify any of the supported options as long as no non-option command has been given. This way the user does not have to include any frontend-specific options, but instead she can rely on the frontend to tell fast-import what it needs. Also factor out parsing of argv and have it execute when we reach the first non-option command, or after all commands have been read and no non-option command has been encountered. Non-git options are ignored, unrecognised options result in an error. Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-12-04 18:06:57 +01:00			`&& prefixcmp(command_buf.buf, "feature ")`
			`&& prefixcmp(command_buf.buf, "option ")) {`
			`parse_argv();`
fast-import: add feature command This allows the fronted to require a specific feature to be supported by the backend, or abort. Also add support for four initial feature, date-format=, force=, import-marks=, export-marks=. Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-12-04 18:06:56 +01:00			`}`

Include recent command history in fast-import crash reports When we crash the frontend developer (or end-user) may need to know roughly around what part of the input stream we had a problem with and aborted on. Because line numbers aren't very useful in this sort of application we instead just keep the last 100 commands in a FIFO queue and print them as part of the crash report. Currently one problem with this design is a commit that has more than 100 modified files in it will flood the FIFO and any context regarding branch/from/committer/mark/comments will be lost. We really should save only the last few (10?) file changes for the current commit, ensuring we have some prior higher level commands in the FIFO when we crash on a file M/D/C/R command. Another issue with this approach is the FIFO only includes the commands, it does not include the commit messages. Yet having a commit message may be useful to help locate the relevant change in the source material. In practice I don't think this is going to be a major concern as the frontend can always embed its own source change set identifier as a comment (which will appear in the crash report) and the commit message(s) for the most recent commits of any given branch should be obtainable from the (packed) commit objects. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-08-03 10:47:04 +02:00			`rc = rc_free;`
			`if (rc)`
			`rc_free = rc->next;`
			`else {`
			`rc = cmd_hist.next;`
			`cmd_hist.next = rc->next;`
			`cmd_hist.next->prev = &cmd_hist;`
			`free(rc->buf);`
			`}`

			`rc->buf = command_buf.buf;`
			`rc->prev = cmd_tail;`
			`rc->next = cmd_hist.prev;`
			`rc->prev->next = rc;`
			`cmd_tail = rc;`
			`}`
fast-import: Allow cat-blob requests at arbitrary points in stream The new rule: a "cat-blob" can be inserted wherever a comment is allowed, which means at the start of any line except in the middle of a "data" command. This saves frontends from having to loop over everything they want to commit in the next commit and cat-ing the necessary objects in advance. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: David Barr <david.barr@cordelta.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-11-28 20:45:58 +01:00			`if (!prefixcmp(command_buf.buf, "cat-blob ")) {`
			`parse_cat_blob();`
			`continue;`
			`}`
			`if (command_buf.buf[0] == '#')`
			`continue;`
			`return 0;`
			`}`
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00			`}`

fast-import pull request * skip_optional_lf() decl is old-style -- please say static skip_optional_lf(void) { ... } * t9300 #14 fails, like this: * expecting failure: git-fast-import <input fatal: Branch name doesn't conform to GIT standards: .badbranchname fast-import: dumping crash report to .git/fast_import_crash_14354 ./test-lib.sh: line 143: 14354 Segmentation fault git-fast-import <input -- >8 -- Subject: [PATCH] fastimport: Fix re-use of va_list The va_list is designed to be used only once. The current code reuses va_list argument may cause segmentation fault. Copy and release the arguments to avoid this problem. While we are at it, fix old-style function declaration of skip_optional_lf(). Signed-off-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-08-19 11:50:18 +02:00			`static void skip_optional_lf(void)`
Make trailing LF following fast-import `data` commands optional A few fast-import frontend developers have found it odd that we require the LF following a `data` command, especially in the exact byte count format. Technically we don't need this LF to parse the stream properly, but having it here does make the stream more readable to humans. We can easily make the LF optional by peeking at the next byte available from the stream and pushing it back into the buffer if its not LF. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-08-01 06:24:25 +02:00			`{`
			`int term_char = fgetc(stdin);`
			`if (term_char != '\n' && term_char != EOF)`
			`ungetc(term_char, stdin);`
			`}`
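The one-byte peek in `skip_optional_lf()` works because the C standard guarantees that a single `ungetc()` after a read succeeds. A variant that takes the stream as a parameter makes the behaviour easy to try against a temporary file; `skip_optional_lf_from` is a hypothetical name used only for this sketch.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical variant of skip_optional_lf() that reads from any
 * stream: consume one LF if it is the next byte, otherwise push the
 * byte back so the next reader sees it unchanged.
 */
static void skip_optional_lf_from(FILE *in)
{
	int term_char;

	assert(in);
	term_char = fgetc(in);
	if (term_char != '\n' && term_char != EOF)
		ungetc(term_char, in);
}
```

Only one byte is ever pushed back, which is why the commit message for the optional trailing LF notes that a whole command line cannot be "ungot" this way and has to be buffered in `command_buf` instead.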

git-fast-import: rename cmd_() functions to parse_() There is a cmd_merge() function in fast-import that will conflict with builtin-merge's cmd_merge() function. To keep it consistent, rename all cmd_() function to parse_() Signed-off-by: Miklos Vajna <vmiklos@frugalware.org> Acked-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-05-16 00:35:56 +02:00			`static void parse_mark(void)`
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00			`{`
prefixcmp(): fix-up mechanical conversion. Previous step converted use of strncmp() with literal string mechanically even when the result is only used as a boolean: if (!strncmp("foo", arg, 3)) ==> if (!(-prefixcmp(arg, "foo"))) This step manually cleans them up to read: if (!prefixcmp(arg, "foo")) Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-20 10:54:00 +01:00			`if (!prefixcmp(command_buf.buf, "mark :")) {`
Use uintmax_t for marks in fast-import. If a frontend wants to use a mark per file revision and per commit and is doing a truly huge import (such as a 32 GiB SVN repository) we may need more than 2**32 unique mark values, especially if the frontend is unable (or unwilling) to recycle mark values. For mark idnums we should use the largest unsigned integer type available, hoping that will be at least 64 bits when we are compiled as a 64 bit executable. This way we may consume huge amounts of memory storing our mark table, but we'll at least be able to process the entire import without failing. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-16 06:33:19 +01:00			`next_mark = strtoumax(command_buf.buf + 6, NULL, 10);`
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00			`read_next_command();`
			`}`
			`else`
Added mark store/find to fast-import. Marks are now saved when the mark directive gets used by the frontend and may be used in place of a SHA1 expression to locate a previous SHA1 which fast-import may have generated. This is particularly useful with commits where the frontend does not (easily) have the ability to compute the SHA1 for an arbitrary commit but needs it to generate a branch or tag from that commit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-23 10:17:45 +02:00			`next_mark = 0;`
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00			`}`
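`parse_mark()` treats the `mark` command as optional: if the current line does not start with `mark :`, the mark number is simply 0, which means "no mark". The same decision can be sketched as a pure function over a string. `parse_mark_command` is a hypothetical name, and `strncmp()` stands in for git's `prefixcmp()`.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper, not part of fast-import.c.
 * Returns the mark number from a "mark :<n>" line, or 0 when the
 * line is some other command (0 means "no mark requested").
 */
static uintmax_t parse_mark_command(const char *cmd)
{
	assert(cmd);
	if (strncmp(cmd, "mark :", 6))
		return 0;
	return strtoumax(cmd + 6, NULL, 10);
}
```

Unlike the real function, this sketch does not advance the input; in fast-import.c a recognized `mark` line is followed by `read_next_command()` so the caller continues with the next command.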

fast-import: Stream very large blobs directly to pack If a blob is larger than the configured big-file-threshold, instead of reading it into a single buffer obtained from malloc, stream it onto the end of the current pack file. Streaming the larger objects into the pack avoids the 4+ GiB memory footprint that occurs when fast-import is processing 2+ GiB blobs. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-02-01 18:27:35 +01:00			`static int parse_data(struct strbuf *sb, uintmax_t limit, uintmax_t *len_res)`
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00			`{`
fast-import was using dbuf's, replace them with strbuf's. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-17 13:48:17 +02:00			`strbuf_reset(sb);`
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00
prefixcmp(): fix-up mechanical conversion. Previous step converted use of strncmp() with literal string mechanically even when the result is only used as a boolean: if (!strncmp("foo", arg, 3)) ==> if (!(-prefixcmp(arg, "foo"))) This step manually cleans them up to read: if (!prefixcmp(arg, "foo")) Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-20 10:54:00 +01:00			`if (prefixcmp(command_buf.buf, "data "))`
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00			`die("Expected 'data n' command, found: %s", command_buf.buf);`

prefixcmp(): fix-up mechanical conversion. Previous step converted use of strncmp() with literal string mechanically even when the result is only used as a boolean: if (!strncmp("foo", arg, 3)) ==> if (!(-prefixcmp(arg, "foo"))) This step manually cleans them up to read: if (!prefixcmp(arg, "foo")) Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-20 10:54:00 +01:00			`if (!prefixcmp(command_buf.buf + 5, "<<")) {`
Support delimited data regions in fast-import. During testing its nice to not have to feed the length of a data chunk to the 'data' command of fast-import. Instead we would prefer to be able to establish a data chunk much like shell's << operator and use a line delimiter to denote the end of the input. So now if a data command is started as 'data <<EOF' we will look for a terminator line containing only the string EOF on that line. Once found, we stop the data command. Everything between the two lines is used as the data value. The 'data <<' syntax is slower than 'data n', as we don't know how many bytes to expect and instead must grow our buffer on the fly. It also has the problem that the frontend must use a string which will not appear on a line by itself in the input, and the data region will always end in an LF. For these reasons real import frontends are encouraged to continue to use _only_ 'data n'. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-18 19:14:27 +01:00			`char *term = xstrdup(command_buf.buf + 5 + 2);`
fast-import: Use strbuf API, and simplify cmd_data() This patch features the use of strbuf_detach, and prevent the programmer to mess with allocation directly. The code is as efficent as before, just more concise and more straightforward. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-06 13:20:07 +02:00			`size_t term_len = command_buf.len - 5 - 2;`

fast-import.c: fix regression due to strbuf conversion Without this strbuf_detach(), it yields a double free later, the command is in fact stashed, and this is not a memory leak. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-10-26 09:59:12 +02:00			`strbuf_detach(&command_buf, NULL);`
Support delimited data regions in fast-import. During testing its nice to not have to feed the length of a data chunk to the 'data' command of fast-import. Instead we would prefer to be able to establish a data chunk much like shell's << operator and use a line delimiter to denote the end of the input. So now if a data command is started as 'data <<EOF' we will look for a terminator line containing only the string EOF on that line. Once found, we stop the data command. Everything between the two lines is used as the data value. The 'data <<' syntax is slower than 'data n', as we don't know how many bytes to expect and instead must grow our buffer on the fly. It also has the problem that the frontend must use a string which will not appear on a line by itself in the input, and the data region will always end in an LF. For these reasons real import frontends are encouraged to continue to use _only_ 'data n'. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-18 19:14:27 +01:00			`for (;;) {`
Drop strbuf's 'eof' marker, and make read_line a first class citizen. read_line is now strbuf_getline, and is a first class citizen, it returns 0 when reading a line worked, EOF else. The ->eof marker was used non-locally by fast-import.c, mimic the same behaviour using a static int in "read_next_command", that now returns -1 on EOF, and avoids to call strbuf_getline when it's in EOF state. Also no longer automagically strbuf_release the buffer, it's counter intuitive and breaks fast-import in a very subtle way. Note: being at EOF implies that command_buf.len == 0. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-17 11:19:04 +02:00			`if (strbuf_getline(&command_buf, stdin, '\n') == EOF)`
Support delimited data regions in fast-import. During testing its nice to not have to feed the length of a data chunk to the 'data' command of fast-import. Instead we would prefer to be able to establish a data chunk much like shell's << operator and use a line delimiter to denote the end of the input. So now if a data command is started as 'data <<EOF' we will look for a terminator line containing only the string EOF on that line. Once found, we stop the data command. Everything between the two lines is used as the data value. The 'data <<' syntax is slower than 'data n', as we don't know how many bytes to expect and instead must grow our buffer on the fly. It also has the problem that the frontend must use a string which will not appear on a line by itself in the input, and the data region will always end in an LF. For these reasons real import frontends are encouraged to continue to use _only_ 'data n'. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-18 19:14:27 +01:00			`die("EOF in data (terminator '%s' not found)", term);`
			`if (term_len == command_buf.len`
			`&& !strcmp(term, command_buf.buf))`
			`break;`
fast-import was using dbuf's, replace them with strbuf's. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-17 13:48:17 +02:00			`strbuf_addbuf(sb, &command_buf);`
			`strbuf_addch(sb, '\n');`
Support delimited data regions in fast-import. During testing its nice to not have to feed the length of a data chunk to the 'data' command of fast-import. Instead we would prefer to be able to establish a data chunk much like shell's << operator and use a line delimiter to denote the end of the input. So now if a data command is started as 'data <<EOF' we will look for a terminator line containing only the string EOF on that line. Once found, we stop the data command. Everything between the two lines is used as the data value. The 'data <<' syntax is slower than 'data n', as we don't know how many bytes to expect and instead must grow our buffer on the fly. It also has the problem that the frontend must use a string which will not appear on a line by itself in the input, and the data region will always end in an LF. For these reasons real import frontends are encouraged to continue to use _only_ 'data n'. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-18 19:14:27 +01:00			`}`
			`free(term);`
			`}`
			`else {`
fast-import: Stream very large blobs directly to pack If a blob is larger than the configured big-file-threshold, instead of reading it into a single buffer obtained from malloc, stream it onto the end of the current pack file. Streaming the larger objects into the pack avoids the 4+ GiB memory footprint that occurs when fast-import is processing 2+ GiB blobs. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-02-01 18:27:35 +01:00			`uintmax_t len = strtoumax(command_buf.buf + 5, NULL, 10);`
			`size_t n = 0, length = (size_t)len;`
fast-import: Use strbuf API, and simplify cmd_data() This patch features the use of strbuf_detach, and prevent the programmer to mess with allocation directly. The code is as efficent as before, just more concise and more straightforward. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-06 13:20:07 +02:00
fast-import: Stream very large blobs directly to pack If a blob is larger than the configured big-file-threshold, instead of reading it into a single buffer obtained from malloc, stream it onto the end of the current pack file. Streaming the larger objects into the pack avoids the 4+ GiB memory footprint that occurs when fast-import is processing 2+ GiB blobs. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-02-01 18:27:35 +01:00			`if (limit && limit < len) {`
			`*len_res = len;`
			`return 0;`
			`}`
			`if (length < len)`
			`die("data is too large to use in this context");`
fast-import: Use strbuf API, and simplify cmd_data() This patch features the use of strbuf_detach, and prevent the programmer to mess with allocation directly. The code is as efficent as before, just more concise and more straightforward. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-06 13:20:07 +02:00
Support delimited data regions in fast-import. During testing its nice to not have to feed the length of a data chunk to the 'data' command of fast-import. Instead we would prefer to be able to establish a data chunk much like shell's << operator and use a line delimiter to denote the end of the input. So now if a data command is started as 'data <<EOF' we will look for a terminator line containing only the string EOF on that line. Once found, we stop the data command. Everything between the two lines is used as the data value. The 'data <<' syntax is slower than 'data n', as we don't know how many bytes to expect and instead must grow our buffer on the fly. It also has the problem that the frontend must use a string which will not appear on a line by itself in the input, and the data region will always end in an LF. For these reasons real import frontends are encouraged to continue to use _only_ 'data n'. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-18 19:14:27 +01:00			`while (n < length) {`
fast-import was using dbuf's, replace them with strbuf's. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-17 13:48:17 +02:00			`size_t s = strbuf_fread(sb, length - n, stdin);`
Support delimited data regions in fast-import. During testing its nice to not have to feed the length of a data chunk to the 'data' command of fast-import. Instead we would prefer to be able to establish a data chunk much like shell's << operator and use a line delimiter to denote the end of the input. So now if a data command is started as 'data <<EOF' we will look for a terminator line containing only the string EOF on that line. Once found, we stop the data command. Everything between the two lines is used as the data value. The 'data <<' syntax is slower than 'data n', as we don't know how many bytes to expect and instead must grow our buffer on the fly. It also has the problem that the frontend must use a string which will not appear on a line by itself in the input, and the data region will always end in an LF. For these reasons real import frontends are encouraged to continue to use _only_ 'data n'. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-18 19:14:27 +01:00			`if (!s && feof(stdin))`
fast-import: Fix compile warnings Not on all platforms are size_t and unsigned long equivalent. Since I do not know how portable %z is, I play safe, and just cast the respective variables to unsigned long. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-07 12:38:21 +01:00			`die("EOF in data (%lu bytes remaining)",`
			`(unsigned long)(length - n));`
Support delimited data regions in fast-import. During testing its nice to not have to feed the length of a data chunk to the 'data' command of fast-import. Instead we would prefer to be able to establish a data chunk much like shell's << operator and use a line delimiter to denote the end of the input. So now if a data command is started as 'data <<EOF' we will look for a terminator line containing only the string EOF on that line. Once found, we stop the data command. Everything between the two lines is used as the data value. The 'data <<' syntax is slower than 'data n', as we don't know how many bytes to expect and instead must grow our buffer on the fly. It also has the problem that the frontend must use a string which will not appear on a line by itself in the input, and the data region will always end in an LF. For these reasons real import frontends are encouraged to continue to use _only_ 'data n'. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-18 19:14:27 +01:00			`n += s;`
			`}`
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00			`}`

Make trailing LF following fast-import `data` commands optional A few fast-import frontend developers have found it odd that we require the LF following a `data` command, especially in the exact byte count format. Technically we don't need this LF to parse the stream properly, but having it here does make the stream more readable to humans. We can easily make the LF optional by peeking at the next byte available from the stream and pushing it back into the buffer if its not LF. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-08-01 06:24:25 +02:00			`skip_optional_lf();`
fast-import: Stream very large blobs directly to pack If a blob is larger than the configured big-file-threshold, instead of reading it into a single buffer obtained from malloc, stream it onto the end of the current pack file. Streaming the larger objects into the pack avoids the 4+ GiB memory footprint that occurs when fast-import is processing 2+ GiB blobs. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-02-01 18:27:35 +01:00			`return 1;`
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00			`}`
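
The `data` command handled above comes in two forms: an exact byte count (`data n`) and a delimited region (`data <<TERM`). The following is a minimal standalone sketch of how a header line could be classified into those two forms. It is not the code fast-import uses; `classify_data_header` and `enum data_kind` are hypothetical names introduced only for illustration, and plain `strncmp` stands in for git's `prefixcmp`.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical result type, not part of fast-import.c. */
enum data_kind { DATA_INVALID, DATA_COUNTED, DATA_DELIMITED };

/*
 * Hypothetical helper: classify one "data ..." header line.
 * For "data <<TERM", *term points at TERM inside the line.
 * For "data n", *len receives n parsed as a decimal number.
 */
static enum data_kind classify_data_header(const char *line,
					   uintmax_t *len,
					   const char **term)
{
	assert(line != NULL);

	/* Every form must start with the literal prefix "data ". */
	if (strncmp(line, "data ", 5))
		return DATA_INVALID;

	/* Delimited form: the terminator follows the "<<" marker. */
	if (!strncmp(line + 5, "<<", 2)) {
		*term = line + 5 + 2;
		return DATA_DELIMITED;
	}

	/* Exact byte count form. */
	*len = strtoumax(line + 5, NULL, 10);
	return DATA_COUNTED;
}
```

As in the function above, the sketch does not check that the byte count is well formed; a stricter parser would also inspect the end pointer returned by `strtoumax`.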

Support RFC 2822 date parsing in fast-import. Since some frontends may be working with source material where the dates are only readily available as RFC 2822 strings, it is more friendly if fast-import exposes Git's parse_date() function to handle the conversion. This way the frontend doesn't need to perform the parsing itself. The new --date-format option to fast-import can be used by a frontend to select which format it will supply date strings in. The default is the standard `raw` Git format, which fast-import has always supported. Format rfc2822 can be used to activate the parse_date() function instead. Because fast-import could also be useful for creating new, current commits, the format `now` is also supported to generate the current system timestamp. The implementation of `now` is a trivial call to datestamp(), but is actually a whole whopping 3 lines so that fast-import can verify the frontend really meant `now`. As part of this change I have added validation of the `raw` date format. Prior to this change fast-import would accept anything in a `committer` command, even if it was seriously malformed. Now fast-import requires the '> ' near the end of the string and verifies the timestamp is formatted properly. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-06 20:58:30 +01:00			`static int validate_raw_date(const char *src, char *result, int maxlen)`
			`{`
			`const char *orig_src = src;`
Remove unused function scope local variables These variables were unused and can be removed safely: builtin-clone.c::cmd_clone(): use_local_hardlinks, use_separate_remote builtin-fetch-pack.c::find_common(): len builtin-remote.c::mv(): symref diff.c::show_stats():show_stats(): total diffcore-break.c::should_break(): base_size fast-import.c::validate_raw_date(): date, sign fsck.c::fsck_tree(): o_sha1, sha1 xdiff-interface.c::parse_num(): read_some Signed-off-by: Benjamin Kramer <benny.kra@googlemail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-03-07 21:02:10 +01:00			`char *endp;`
fast-import.c::validate_raw_date(): really validate the value When reading the "raw format" timestamp from the input stream, make sure that the timezone offset is a reasonable value by imitating 7122f82 (date.c: improve guess between timezone offset and year., 2006-06-08). We _might_ want to also check if the timestamp itself is reasonable, but that is left for a separate commit. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-09-29 08:40:09 +02:00			`unsigned long num;`
Support RFC 2822 date parsing in fast-import. Since some frontends may be working with source material where the dates are only readily available as RFC 2822 strings, it is more friendly if fast-import exposes Git's parse_date() function to handle the conversion. This way the frontend doesn't need to perform the parsing itself. The new --date-format option to fast-import can be used by a frontend to select which format it will supply date strings in. The default is the standard `raw` Git format, which fast-import has always supported. Format rfc2822 can be used to activate the parse_date() function instead. Because fast-import could also be useful for creating new, current commits, the format `now` is also supported to generate the current system timestamp. The implementation of `now` is a trivial call to datestamp(), but is actually a whole whopping 3 lines so that fast-import can verify the frontend really meant `now`. As part of this change I have added validation of the `raw` date format. Prior to this change fast-import would accept anything in a `committer` command, even if it was seriously malformed. Now fast-import requires the '> ' near the end of the string and verifies the timestamp is formatted properly. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-06 20:58:30 +01:00
fast-import.c: stricter strtoul check, silence compiler warning Store the return value of strtoul() in order to avoid compiler warnings on Ubuntu 8.10. Also check errno after each call, which is the only way to notice an overflow without making ULONG_MAX an illegal date. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-12-21 02:28:48 +01:00			`errno = 0;`

fast-import.c::validate_raw_date(): really validate the value When reading the "raw format" timestamp from the input stream, make sure that the timezone offset is a reasonable value by imitating 7122f82 (date.c: improve guess between timezone offset and year., 2006-06-08). We _might_ want to also check if the timestamp itself is reasonable, but that is left for a separate commit. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-09-29 08:40:09 +02:00			`num = strtoul(src, &endp, 10);`
			`/* NEEDSWORK: perhaps check for reasonable values? */`
fast-import.c: stricter strtoul check, silence compiler warning Store the return value of strtoul() in order to avoid compiler warnings on Ubuntu 8.10. Also check errno after each call, which is the only way to notice an overflow without making ULONG_MAX an illegal date. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-12-21 02:28:48 +01:00			`if (errno \|\| endp == src \|\| *endp != ' ')`
Support RFC 2822 date parsing in fast-import. Since some frontends may be working with source material where the dates are only readily available as RFC 2822 strings, it is more friendly if fast-import exposes Git's parse_date() function to handle the conversion. This way the frontend doesn't need to perform the parsing itself. The new --date-format option to fast-import can be used by a frontend to select which format it will supply date strings in. The default is the standard `raw` Git format, which fast-import has always supported. Format rfc2822 can be used to activate the parse_date() function instead. Because fast-import could also be useful for creating new, current commits, the format `now` is also supported to generate the current system timestamp. The implementation of `now` is a trivial call to datestamp(), but is actually a whole whopping 3 lines so that fast-import can verify the frontend really meant `now`. As part of this change I have added validation of the `raw` date format. Prior to this change fast-import would accept anything in a `committer` command, even if it was seriously malformed. Now fast-import requires the '> ' near the end of the string and verifies the timestamp is formatted properly. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-06 20:58:30 +01:00			`return -1;`

			`src = endp + 1;`
			`if (*src != '-' && *src != '+')`
			`return -1;`

fast-import.c::validate_raw_date(): really validate the value When reading the "raw format" timestamp from the input stream, make sure that the timezone offset is a reasonable value by imitating 7122f82 (date.c: improve guess between timezone offset and year., 2006-06-08). We _might_ want to also check if the timestamp itself is reasonable, but that is left for a separate commit. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-09-29 08:40:09 +02:00			`num = strtoul(src + 1, &endp, 10);`
			`if (errno \|\| endp == src + 1 \|\| *endp \|\| (endp - orig_src) >= maxlen \|\|`
			`1400 < num)`
Support RFC 2822 date parsing in fast-import. Since some frontends may be working with source material where the dates are only readily available as RFC 2822 strings, it is more friendly if fast-import exposes Git's parse_date() function to handle the conversion. This way the frontend doesn't need to perform the parsing itself. The new --date-format option to fast-import can be used by a frontend to select which format it will supply date strings in. The default is the standard `raw` Git format, which fast-import has always supported. Format rfc2822 can be used to activate the parse_date() function instead. Because fast-import could also be useful for creating new, current commits, the format `now` is also supported to generate the current system timestamp. The implementation of `now` is a trivial call to datestamp(), but is actually a whole whopping 3 lines so that fast-import can verify the frontend really meant `now`. As part of this change I have added validation of the `raw` date format. Prior to this change fast-import would accept anything in a `committer` command, even if it was seriously malformed. Now fast-import requires the '> ' near the end of the string and verifies the timestamp is formatted properly. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-06 20:58:30 +01:00			`return -1;`

			`strcpy(result, orig_src);`
			`return 0;`
			`}`
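
The raw date format checked above is a decimal timestamp, one space, a sign, and a timezone offset no larger than 1400. The following is a minimal standalone sketch of the same checks using only the C standard library. `is_valid_raw_date` is a hypothetical name, and the sketch returns 1 for valid and 0 for invalid instead of copying the string into a result buffer, so it omits the `maxlen` check.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical helper: return 1 if src looks like "<timestamp> <+|-><offset>",
 * 0 otherwise. Mirrors the checks above minus the output buffer handling.
 */
static int is_valid_raw_date(const char *src)
{
	char *endp;
	unsigned long num;

	assert(src != NULL);

	/* errno is the only way to notice an overflow from strtoul(). */
	errno = 0;

	/* Timestamp: at least one digit, followed by exactly one space. */
	num = strtoul(src, &endp, 10);
	(void)num; /* the value itself is not range-checked here */
	if (errno || endp == src || *endp != ' ')
		return 0;

	/* The timezone must start with an explicit sign. */
	src = endp + 1;
	if (*src != '-' && *src != '+')
		return 0;

	/* Offset: digits up to the end of the string, at most 1400. */
	num = strtoul(src + 1, &endp, 10);
	if (errno || endp == src + 1 || *endp || num > 1400)
		return 0;

	return 1;
}
```

For example, `is_valid_raw_date("1234567890 +0200")` accepts the string, while a missing sign or an offset such as `+9999` is rejected.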

			`static char *parse_ident(const char *buf)`
			`{`
fast-import: check committer name more strictly The documentation declares following identity format: (<name> SP)? LT <email> GT where name is any string without LF and LT characters. But fast-import just accepts any string up to first GT instead of checking the whole format, and moreover just writes it as is to the commit object. git-fsck checks for [^<\n]* <[^<>\n]*> format. Note that the space is mandatory. And the space quirk is already handled via extending the string to the left when needed. Modify fast-import input identity format to a slightly stricter one - deny LF, LT and GT in both <name> and <email>. And check for it. This is stricter then git-fsck as fsck accepts "Name> <email>" currently, but soon fsck check will be adjusted likewise. Signed-off-by: Dmitry Ivankov <divanorama@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-08-11 12:21:08 +02:00			`const char *ltgt;`
Support RFC 2822 date parsing in fast-import. Since some frontends may be working with source material where the dates are only readily available as RFC 2822 strings, it is more friendly if fast-import exposes Git's parse_date() function to handle the conversion. This way the frontend doesn't need to perform the parsing itself. The new --date-format option to fast-import can be used by a frontend to select which format it will supply date strings in. The default is the standard `raw` Git format, which fast-import has always supported. Format rfc2822 can be used to activate the parse_date() function instead. Because fast-import could also be useful for creating new, current commits, the format `now` is also supported to generate the current system timestamp. The implementation of `now` is a trivial call to datestamp(), but is actually a whole whopping 3 lines so that fast-import can verify the frontend really meant `now`. As part of this change I have added validation of the `raw` date format. Prior to this change fast-import would accept anything in a `committer` command, even if it was seriously malformed. Now fast-import requires the '> ' near the end of the string and verifies the timestamp is formatted properly. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-06 20:58:30 +01:00			`size_t name_len;`
			`char *ident;`

fast-import: don't fail on omitted committer name fast-import format declares 'committer_name SP' to be optional in 'committer_name SP LT email GT'. But for a (commit) object SP is obligatory while zero length committer_name is ok. git-fsck checks that SP is present, so fast-import must prepend it if the name SP part is omitted. It doesn't do so and thus for "LT email GT" ident it writes a bad object. Name cannot contain LT or GT, ident always comes after SP in fast-import. So if ident starts with LT reuse the SP as if a valid 'SP LT email GT' ident was passed. This fixes a ident parsing bug for a well-formed fast-import input. Though the parsing is still loose and can accept a ill-formed input. Signed-off-by: Dmitry Ivankov <divanorama@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-08-11 12:21:07 +02:00			`/* ensure there is a space delimiter even if there is no name */`
			`if (*buf == '<')`
			`--buf;`

fast-import: check committer name more strictly The documentation declares following identity format: (<name> SP)? LT <email> GT where name is any string without LF and LT characters. But fast-import just accepts any string up to first GT instead of checking the whole format, and moreover just writes it as is to the commit object. git-fsck checks for [^<\n]* <[^<>\n]*> format. Note that the space is mandatory. And the space quirk is already handled via extending the string to the left when needed. Modify fast-import input identity format to a slightly stricter one - deny LF, LT and GT in both <name> and <email>. And check for it. This is stricter then git-fsck as fsck accepts "Name> <email>" currently, but soon fsck check will be adjusted likewise. Signed-off-by: Dmitry Ivankov <divanorama@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-08-11 12:21:08 +02:00			`ltgt = buf + strcspn(buf, "<>");`
			`if (*ltgt != '<')`
			`die("Missing < in ident string: %s", buf);`
			`if (ltgt != buf && ltgt[-1] != ' ')`
			`die("Missing space before < in ident string: %s", buf);`
			`ltgt = ltgt + 1 + strcspn(ltgt + 1, "<>");`
			`if (*ltgt != '>')`
Support RFC 2822 date parsing in fast-import. Since some frontends may be working with source material where the dates are only readily available as RFC 2822 strings, it is more friendly if fast-import exposes Git's parse_date() function to handle the conversion. This way the frontend doesn't need to perform the parsing itself. The new --date-format option to fast-import can be used by a frontend to select which format it will supply date strings in. The default is the standard `raw` Git format, which fast-import has always supported. Format rfc2822 can be used to activate the parse_date() function instead. Because fast-import could also be useful for creating new, current commits, the format `now` is also supported to generate the current system timestamp. The implementation of `now` is a trivial call to datestamp(), but is actually a whole whopping 3 lines so that fast-import can verify the frontend really meant `now`. As part of this change I have added validation of the `raw` date format. Prior to this change fast-import would accept anything in a `committer` command, even if it was seriously malformed. Now fast-import requires the '> ' near the end of the string and verifies the timestamp is formatted properly. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-06 20:58:30 +01:00			`die("Missing > in ident string: %s", buf);`
fast-import: check committer name more strictly The documentation declares following identity format: (<name> SP)? LT <email> GT where name is any string without LF and LT characters. But fast-import just accepts any string up to first GT instead of checking the whole format, and moreover just writes it as is to the commit object. git-fsck checks for [^<\n]* <[^<>\n]*> format. Note that the space is mandatory. And the space quirk is already handled via extending the string to the left when needed. Modify fast-import input identity format to a slightly stricter one - deny LF, LT and GT in both <name> and <email>. And check for it. This is stricter then git-fsck as fsck accepts "Name> <email>" currently, but soon fsck check will be adjusted likewise. Signed-off-by: Dmitry Ivankov <divanorama@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-08-11 12:21:08 +02:00			`ltgt++;`
			`if (*ltgt != ' ')`
Support RFC 2822 date parsing in fast-import. Since some frontends may be working with source material where the dates are only readily available as RFC 2822 strings, it is more friendly if fast-import exposes Git's parse_date() function to handle the conversion. This way the frontend doesn't need to perform the parsing itself. The new --date-format option to fast-import can be used by a frontend to select which format it will supply date strings in. The default is the standard `raw` Git format, which fast-import has always supported. Format rfc2822 can be used to activate the parse_date() function instead. Because fast-import could also be useful for creating new, current commits, the format `now` is also supported to generate the current system timestamp. The implementation of `now` is a trivial call to datestamp(), but is actually a whole whopping 3 lines so that fast-import can verify the frontend really meant `now`. As part of this change I have added validation of the `raw` date format. Prior to this change fast-import would accept anything in a `committer` command, even if it was seriously malformed. Now fast-import requires the '> ' near the end of the string and verifies the timestamp is formatted properly. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-06 20:58:30 +01:00			`die("Missing space after > in ident string: %s", buf);`
fast-import: check committer name more strictly The documentation declares following identity format: (<name> SP)? LT <email> GT where name is any string without LF and LT characters. But fast-import just accepts any string up to first GT instead of checking the whole format, and moreover just writes it as is to the commit object. git-fsck checks for [^<\n]* <[^<>\n]*> format. Note that the space is mandatory. And the space quirk is already handled via extending the string to the left when needed. Modify fast-import input identity format to a slightly stricter one - deny LF, LT and GT in both <name> and <email>. And check for it. This is stricter then git-fsck as fsck accepts "Name> <email>" currently, but soon fsck check will be adjusted likewise. Signed-off-by: Dmitry Ivankov <divanorama@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-08-11 12:21:08 +02:00			`ltgt++;`
			`name_len = ltgt - buf;`
Support RFC 2822 date parsing in fast-import. Since some frontends may be working with source material where the dates are only readily available as RFC 2822 strings, it is more friendly if fast-import exposes Git's parse_date() function to handle the conversion. This way the frontend doesn't need to perform the parsing itself. The new --date-format option to fast-import can be used by a frontend to select which format it will supply date strings in. The default is the standard `raw` Git format, which fast-import has always supported. Format rfc2822 can be used to activate the parse_date() function instead. Because fast-import could also be useful for creating new, current commits, the format `now` is also supported to generate the current system timestamp. The implementation of `now` is a trivial call to datestamp(), but is actually a whole whopping 3 lines so that fast-import can verify the frontend really meant `now`. As part of this change I have added validation of the `raw` date format. Prior to this change fast-import would accept anything in a `committer` command, even if it was seriously malformed. Now fast-import requires the '> ' near the end of the string and verifies the timestamp is formatted properly. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-06 20:58:30 +01:00			`ident = xmalloc(name_len + 24);`
			`strncpy(ident, buf, name_len);`

			`switch (whenspec) {`
			`case WHENSPEC_RAW:`
fast-import: check committer name more strictly The documentation declares following identity format: (<name> SP)? LT <email> GT where name is any string without LF and LT characters. But fast-import just accepts any string up to first GT instead of checking the whole format, and moreover just writes it as is to the commit object. git-fsck checks for [^<\n]* <[^<>\n]*> format. Note that the space is mandatory. And the space quirk is already handled via extending the string to the left when needed. Modify fast-import input identity format to a slightly stricter one - deny LF, LT and GT in both <name> and <email>. And check for it. This is stricter then git-fsck as fsck accepts "Name> <email>" currently, but soon fsck check will be adjusted likewise. Signed-off-by: Dmitry Ivankov <divanorama@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-08-11 12:21:08 +02:00			`if (validate_raw_date(ltgt, ident + name_len, 24) < 0)`
			`die("Invalid raw date \"%s\" in ident: %s", ltgt, buf);`
Support RFC 2822 date parsing in fast-import. Since some frontends may be working with source material where the dates are only readily available as RFC 2822 strings, it is more friendly if fast-import exposes Git's parse_date() function to handle the conversion. This way the frontend doesn't need to perform the parsing itself. The new --date-format option to fast-import can be used by a frontend to select which format it will supply date strings in. The default is the standard `raw` Git format, which fast-import has always supported. Format rfc2822 can be used to activate the parse_date() function instead. Because fast-import could also be useful for creating new, current commits, the format `now` is also supported to generate the current system timestamp. The implementation of `now` is a trivial call to datestamp(), but is actually a whole whopping 3 lines so that fast-import can verify the frontend really meant `now`. As part of this change I have added validation of the `raw` date format. Prior to this change fast-import would accept anything in a `committer` command, even if it was seriously malformed. Now fast-import requires the '> ' near the end of the string and verifies the timestamp is formatted properly. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-06 20:58:30 +01:00			`break;`
			`case WHENSPEC_RFC2822:`
fast-import: check committer name more strictly The documentation declares following identity format: (<name> SP)? LT <email> GT where name is any string without LF and LT characters. But fast-import just accepts any string up to first GT instead of checking the whole format, and moreover just writes it as is to the commit object. git-fsck checks for [^<\n]* <[^<>\n]*> format. Note that the space is mandatory. And the space quirk is already handled via extending the string to the left when needed. Modify fast-import input identity format to a slightly stricter one - deny LF, LT and GT in both <name> and <email>. And check for it. This is stricter then git-fsck as fsck accepts "Name> <email>" currently, but soon fsck check will be adjusted likewise. Signed-off-by: Dmitry Ivankov <divanorama@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-08-11 12:21:08 +02:00			`if (parse_date(ltgt, ident + name_len, 24) < 0)`
			`die("Invalid rfc2822 date \"%s\" in ident: %s", ltgt, buf);`
Support RFC 2822 date parsing in fast-import. Since some frontends may be working with source material where the dates are only readily available as RFC 2822 strings, it is more friendly if fast-import exposes Git's parse_date() function to handle the conversion. This way the frontend doesn't need to perform the parsing itself. The new --date-format option to fast-import can be used by a frontend to select which format it will supply date strings in. The default is the standard `raw` Git format, which fast-import has always supported. Format rfc2822 can be used to activate the parse_date() function instead. Because fast-import could also be useful for creating new, current commits, the format `now` is also supported to generate the current system timestamp. The implementation of `now` is a trivial call to datestamp(), but is actually a whole whopping 3 lines so that fast-import can verify the frontend really meant `now`. As part of this change I have added validation of the `raw` date format. Prior to this change fast-import would accept anything in a `committer` command, even if it was seriously malformed. Now fast-import requires the '> ' near the end of the string and verifies the timestamp is formatted properly. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-06 20:58:30 +01:00			`break;`
			`case WHENSPEC_NOW:`
fast-import: check committer name more strictly The documentation declares following identity format: (<name> SP)? LT <email> GT where name is any string without LF and LT characters. But fast-import just accepts any string up to first GT instead of checking the whole format, and moreover just writes it as is to the commit object. git-fsck checks for [^<\n]* <[^<>\n]*> format. Note that the space is mandatory. And the space quirk is already handled via extending the string to the left when needed. Modify fast-import input identity format to a slightly stricter one - deny LF, LT and GT in both <name> and <email>. And check for it. This is stricter then git-fsck as fsck accepts "Name> <email>" currently, but soon fsck check will be adjusted likewise. Signed-off-by: Dmitry Ivankov <divanorama@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-08-11 12:21:08 +02:00			`if (strcmp("now", ltgt))`
Support RFC 2822 date parsing in fast-import. Since some frontends may be working with source material where the dates are only readily available as RFC 2822 strings, it is more friendly if fast-import exposes Git's parse_date() function to handle the conversion. This way the frontend doesn't need to perform the parsing itself. The new --date-format option to fast-import can be used by a frontend to select which format it will supply date strings in. The default is the standard `raw` Git format, which fast-import has always supported. Format rfc2822 can be used to activate the parse_date() function instead. Because fast-import could also be useful for creating new, current commits, the format `now` is also supported to generate the current system timestamp. The implementation of `now` is a trivial call to datestamp(), but is actually a whole whopping 3 lines so that fast-import can verify the frontend really meant `now`. As part of this change I have added validation of the `raw` date format. Prior to this change fast-import would accept anything in a `committer` command, even if it was seriously malformed. Now fast-import requires the '> ' near the end of the string and verifies the timestamp is formatted properly. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-06 20:58:30 +01:00			`die("Date in ident must be 'now': %s", buf);`
			`datestamp(ident + name_len, 24);`
			`break;`
			`}`

			`return ident;`
			`}`
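
The identity checks described in the annotation above (no LF, LT or GT inside the name or the email, and a mandatory space after GT) can be illustrated with a small standalone sketch. The helper name `sketch_ident_prefix_len` is hypothetical and not part of fast-import.c. It returns 0 instead of calling die(), and it does not reproduce the trick of extending the string to the left when the name is empty.

```c
#include <stddef.h>
#include <string.h>

/*
 * Sketch only: return the length of the "name <email> " prefix of an
 * ident string, or 0 if that prefix is malformed. The date that follows
 * the prefix is not inspected here.
 */
static size_t sketch_ident_prefix_len(const char *buf)
{
	const char *lt, *gt;

	lt = buf + strcspn(buf, "<>\n");
	if (*lt != '<')
		return 0;	/* '>' or LF before '<', or no '<' at all */
	if (lt != buf && lt[-1] != ' ')
		return 0;	/* a non-empty name must end with a space */
	gt = lt + 1 + strcspn(lt + 1, "<>\n");
	if (*gt != '>')
		return 0;	/* stray '<' or LF inside the email */
	if (gt[1] != ' ')
		return 0;	/* a space must separate '>' from the date */
	return (size_t)(gt + 2 - buf);
}
```

The returned length plays the role of `name_len` in the function above: it is the number of bytes copied verbatim before the 24-byte date field is appended.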

fast-import: Stream very large blobs directly to pack If a blob is larger than the configured big-file-threshold, instead of reading it into a single buffer obtained from malloc, stream it onto the end of the current pack file. Streaming the larger objects into the pack avoids the 4+ GiB memory footprint that occurs when fast-import is processing 2+ GiB blobs. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-02-01 18:27:35 +01:00			`static void parse_and_store_blob(`
			`struct last_object *last,`
			`unsigned char *sha1out,`
			`uintmax_t mark)`
Added basic command handler to fast-import. Moved the new_blob logic off into a new subroutine and invoked it when getting the 'blob' command. Added statistics dump to STDERR when the program terminates listing what it did at a high level. This is somewhat interesting. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 07:14:21 +02:00			`{`
fast-import optimization: Now that cmd_data acts on a strbuf, make last_object stashed buffer be a strbuf as well. On new stash, don't free the last stashed buffer, rather swap it with the one you will stash, this way, callers of store_object can act on static strbufs, and at some point, fast-import won't allocate new memory for objects buffers. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-17 14:00:38 +02:00			`static struct strbuf buf = STRBUF_INIT;`
fast-import: Stream very large blobs directly to pack If a blob is larger than the configured big-file-threshold, instead of reading it into a single buffer obtained from malloc, stream it onto the end of the current pack file. Streaming the larger objects into the pack avoids the 4+ GiB memory footprint that occurs when fast-import is processing 2+ GiB blobs. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-02-01 18:27:35 +01:00			`uintmax_t len;`
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00
fast-import: Stream very large blobs directly to pack If a blob is larger than the configured big-file-threshold, instead of reading it into a single buffer obtained from malloc, stream it onto the end of the current pack file. Streaming the larger objects into the pack avoids the 4+ GiB memory footprint that occurs when fast-import is processing 2+ GiB blobs. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-02-01 18:27:35 +01:00			`if (parse_data(&buf, big_file_threshold, &len))`
			`store_object(OBJ_BLOB, &buf, last, sha1out, mark);`
			`else {`
			`if (last) {`
			`strbuf_release(&last->data);`
			`last->offset = 0;`
			`last->depth = 0;`
			`}`
			`stream_blob(len, sha1out, mark);`
			`skip_optional_lf();`
			`}`
			`}`

			`static void parse_new_blob(void)`
			`{`
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00			`read_next_command();`
git-fast-import: rename cmd_() functions to parse_() There is a cmd_merge() function in fast-import that will conflict with builtin-merge's cmd_merge() function. To keep it consistent, rename all cmd_() function to parse_() Signed-off-by: Miklos Vajna <vmiklos@frugalware.org> Acked-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-05-16 00:35:56 +02:00			`parse_mark();`
fast-import: Stream very large blobs directly to pack If a blob is larger than the configured big-file-threshold, instead of reading it into a single buffer obtained from malloc, stream it onto the end of the current pack file. Streaming the larger objects into the pack avoids the 4+ GiB memory footprint that occurs when fast-import is processing 2+ GiB blobs. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-02-01 18:27:35 +01:00			`parse_and_store_blob(&last_blob, NULL, next_mark);`
Added basic command handler to fast-import. Moved the new_blob logic off into a new subroutine and invoked it when getting the 'blob' command. Added statistics dump to STDERR when the program terminates listing what it did at a high level. This is somewhat interesting. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 07:14:21 +02:00			`}`

Declare no-arg functions as (void) in fast-import. Apparently the git convention is to declare any function which takes no arguments as taking void. I did not do this during the early fast-import development, but should have. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-17 07:47:25 +01:00			`static void unload_one_branch(void)`
Implemented branch handling and basic tree support in fast-import. This provides the basic data structures needed to store trees in memory while we are processing them for a branch. What we are attempting to do is track one complete tree for each branch that the frontend has registered with us through the 'newb' (new_branch) command. When the frontend edits that tree through 'updf' or 'delf' commands we'll mark the affected tree(s) as being dirty and recompute their objects during 'comt' (commit). Currently the protocol is decidedly _not_ user friendly. I crashed fast-import by giving it bad input data from Perl. I may try to improve upon it, or at least upon its error handling. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 09:36:45 +02:00			`{`
Implemented tree reloading in fast-import. Tree reloading allows fast-import to swap out the least-recently used branch by simply deallocating the data structures from memory that were associated with that branch. Later if the branch becomes active again it can lazily recreate those structures on demand by reloading the necessary trees from the pack file it originally wrote them to. The reloading process is implemented by mmap'ing the pack into memory and using a much tighter variant of the pack reading code contained in sha1_file.c. This was a blatent copy from sha1_file.c but the unpacking functions were significantly simplified and are actually now in a form that should make it easier to map only the necessary regions of a pack rather than the entire file. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-24 10:37:35 +02:00			`while (cur_active_branches`
			`&& cur_active_branches >= max_active_branches) {`
Use off_t in pack-objects/fast-import when we mean an offset Always use an off_t value in pack-objects anytime we are dealing with an offset to some data within a packfile. Also fixed a minor uintmax_t that was incorrectly defined before. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-07 02:44:34 +01:00			`uintmax_t min_commit = ULONG_MAX;`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`struct branch *e, *l = NULL, *p = NULL;`

			`for (e = active_branches; e; e = e->active_next_branch) {`
			`if (e->last_commit < min_commit) {`
			`p = l;`
			`min_commit = e->last_commit;`
			`}`
			`l = e;`
			`}`

			`if (p) {`
			`e = p->active_next_branch;`
			`p->active_next_branch = e->active_next_branch;`
			`} else {`
			`e = active_branches;`
			`active_branches = e->active_next_branch;`
			`}`
fast-import: Avoid infinite loop after reset Johannes Sixt noticed that a 'reset' command applied to a branch that is already active in the branch LRU cache can cause fast-import to relink the same branch into the LRU cache twice. This will cause the LRU cache to contain a cycle, making unload_one_branch run in an infinite loop as it tries to select the oldest branch for eviction. I have trivially fixed the problem by adding an active bit to each branch object; this bit indicates if the branch is already in the LRU and allows us to avoid trying to add it a second time. Converting the pack_id field into a bitfield makes this change take up no additional memory. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-03-05 18:31:09 +01:00			`e->active = 0;`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`e->active_next_branch = NULL;`
			`if (e->branch_tree.tree) {`
Fixed segfault in fast-import after growing a tree. Growing a tree caused all subtrees to be deallocated and put back into the free list yet those subtree's contents were still actively in use. Consequently they were doled out again and got stomped on elsewhere. Releasing a tree is now performed in two parts, either releasing only the content array or releasing the content array and recursively releasing the subtree(s). Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-23 07:33:47 +02:00			`release_tree_content_recursive(e->branch_tree.tree);`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`e->branch_tree.tree = NULL;`
			`}`
			`cur_active_branches--;`
Implemented branch handling and basic tree support in fast-import. This provides the basic data structures needed to store trees in memory while we are processing them for a branch. What we are attempting to do is track one complete tree for each branch that the frontend has registered with us through the 'newb' (new_branch) command. When the frontend edits that tree through 'updf' or 'delf' commands we'll mark the affected tree(s) as being dirty and recompute their objects during 'comt' (commit). Currently the protocol is decidedly _not_ user friendly. I crashed fast-import by giving it bad input data from Perl. I may try to improve upon it, or at least upon its error handling. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 09:36:45 +02:00			`}`
			`}`
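
The eviction scan in unload_one_branch() walks a singly linked list once, remembering the predecessor of the entry with the smallest `last_commit` so that the entry can be unlinked without a second pass. A minimal standalone sketch of that pattern follows. `struct sketch_branch` and `sketch_evict_oldest` are hypothetical stand-ins, not names from fast-import.c, and the tree release step is left out.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct branch; only the fields the scan needs. */
struct sketch_branch {
	struct sketch_branch *next;
	uintmax_t last_commit;
};

/*
 * Unlink and return the node with the smallest last_commit,
 * or NULL if the list is empty.
 */
static struct sketch_branch *sketch_evict_oldest(struct sketch_branch **head)
{
	struct sketch_branch *e, *prev = NULL, *min_prev = NULL;
	uintmax_t min_commit = UINTMAX_MAX;

	if (!*head)
		return NULL;
	for (e = *head; e; e = e->next) {
		if (e->last_commit < min_commit) {
			min_prev = prev;	/* predecessor of the current minimum */
			min_commit = e->last_commit;
		}
		prev = e;
	}
	if (min_prev) {
		/* minimum is in the middle or at the tail */
		e = min_prev->next;
		min_prev->next = e->next;
	} else {
		/* minimum is the head (or nothing compared lower) */
		e = *head;
		*head = e->next;
	}
	e->next = NULL;
	return e;
}
```

Keeping the predecessor rather than the node itself is what makes the unlink a constant-time pointer update after the scan.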

Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`static void load_branch(struct branch *b)`
Implemented branch handling and basic tree support in fast-import. This provides the basic data structures needed to store trees in memory while we are processing them for a branch. What we are attempting to do is track one complete tree for each branch that the frontend has registered with us through the 'newb' (new_branch) command. When the frontend edits that tree through 'updf' or 'delf' commands we'll mark the affected tree(s) as being dirty and recompute their objects during 'comt' (commit). Currently the protocol is decidedly _not_ user friendly. I crashed fast-import by giving it bad input data from Perl. I may try to improve upon it, or at least upon its error handling. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 09:36:45 +02:00			`{`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`load_tree(&b->branch_tree);`
fast-import: Avoid infinite loop after reset Johannes Sixt noticed that a 'reset' command applied to a branch that is already active in the branch LRU cache can cause fast-import to relink the same branch into the LRU cache twice. This will cause the LRU cache to contain a cycle, making unload_one_branch run in an infinite loop as it tries to select the oldest branch for eviction. I have trivially fixed the problem by adding an active bit to each branch object; this bit indicates if the branch is already in the LRU and allows us to avoid trying to add it a second time. Converting the pack_id field into a bitfield makes this change take up no additional memory. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-03-05 18:31:09 +01:00			`if (!b->active) {`
			`b->active = 1;`
			`b->active_next_branch = active_branches;`
			`active_branches = b;`
			`cur_active_branches++;`
			`branch_load_count++;`
			`}`
Implemented branch handling and basic tree support in fast-import. This provides the basic data structures needed to store trees in memory while we are processing them for a branch. What we are attempting to do is track one complete tree for each branch that the frontend has registered with us through the 'newb' (new_branch) command. When the frontend edits that tree through 'updf' or 'delf' commands we'll mark the affected tree(s) as being dirty and recompute their objects during 'comt' (commit). Currently the protocol is decidedly _not_ user friendly. I crashed fast-import by giving it bad input data from Perl. I may try to improve upon it, or at least upon its error handling. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 09:36:45 +02:00			`}`

fast-import: Proper notes tree manipulation This patch teaches 'git fast-import' to automatically organize note objects in a fast-import stream into an appropriate fanout structure. The notes API in notes.h is NOT used to accomplish this, because trying to keep the fast-import and notes data structures in sync would yield a significantly larger patch with higher complexity. Note objects are added with the 'N' command, and accounted for with a per-branch counter, which is used to trigger fanout restructuring when needed. Note that when restructuring the branch tree, _any_ entry whose path consists of 40 hex chars (not including directory separators) will be recognized as a note object. It is therefore not advisable to manipulate note entries with M/D/R/C commands. Since note objects are stored in the same tree structure as other objects, the unloading and reloading of a fast-import branches handle note objects transparently. This patch has been improved by the following contributions: - Shawn O. Pearce: Several style- and logic-related improvements Cc: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Johan Herland <johan@herland.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-12-07 12:27:24 +01:00			`static unsigned char convert_num_notes_to_fanout(uintmax_t num_notes)`
			`{`
			`unsigned char fanout = 0;`
			`while ((num_notes >>= 8))`
			`fanout++;`
			`return fanout;`
			`}`
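
As a worked illustration of the shift loop above: each additional factor of 256 notes adds one fanout level. The sketch below repeats the same loop under a hypothetical name (`sketch_fanout_for`) so that it can be compiled and checked on its own.

```c
#include <stdint.h>

/* Sketch: one fanout level per full factor of 256 in the note count. */
static unsigned char sketch_fanout_for(uintmax_t num_notes)
{
	unsigned char fanout = 0;

	while ((num_notes >>= 8))
		fanout++;
	return fanout;
}
```

With this rule, up to 255 notes stay at fanout 0, 256 to 65535 notes use fanout 1, and 65536 notes move to fanout 2.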

			`static void construct_path_with_fanout(const char *hex_sha1,`
			`unsigned char fanout, char *path)`
			`{`
			`unsigned int i = 0, j = 0;`
			`if (fanout >= 20)`
			`die("Too large fanout (%u)", fanout);`
			`while (fanout) {`
			`path[i++] = hex_sha1[j++];`
			`path[i++] = hex_sha1[j++];`
			`path[i++] = '/';`
			`fanout--;`
			`}`
			`memcpy(path + i, hex_sha1 + j, 40 - j);`
			`path[i + 40 - j] = '\0';`
			`}`
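
To make the path layout concrete, here is a standalone sketch of the same splitting step. `sketch_fanout_path` is a hypothetical name, and it returns -1 where the function above calls die(). The caller is assumed to pass exactly 40 hex characters and an output buffer of at least 40 + fanout + 1 bytes.

```c
#include <string.h>

/*
 * Sketch: spread a 40-character hex name over `fanout` two-character
 * directory levels. Returns 0 on success, -1 if fanout is too large.
 */
static int sketch_fanout_path(const char *hex, unsigned int fanout, char *out)
{
	unsigned int i = 0, j = 0;

	if (fanout >= 20)
		return -1;
	while (fanout--) {
		out[i++] = hex[j++];
		out[i++] = hex[j++];
		out[i++] = '/';
	}
	/* copy whatever hex digits were not consumed by directory levels */
	memcpy(out + i, hex + j, 40 - j);
	out[i + 40 - j] = '\0';
	return 0;
}
```

For fanout 2, the name `0123456789abcdef0123456789abcdef01234567` becomes `01/23/456789abcdef0123456789abcdef01234567`. The worst case, fanout 19, needs 60 bytes including the terminator, which matches the `path[60]` and `realpath[60]` buffers used by the callers.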

			`static uintmax_t do_change_note_fanout(`
			`struct tree_entry *orig_root, struct tree_entry *root,`
			`char *hex_sha1, unsigned int hex_sha1_len,`
			`char *fullpath, unsigned int fullpath_len,`
			`unsigned char fanout)`
			`{`
			`struct tree_content *t = root->tree;`
			`struct tree_entry *e, leaf;`
			`unsigned int i, tmp_hex_sha1_len, tmp_fullpath_len;`
			`uintmax_t num_notes = 0;`
			`unsigned char sha1[20];`
			`char realpath[60];`

			`for (i = 0; t && i < t->entry_count; i++) {`
			`e = t->entries[i];`
			`tmp_hex_sha1_len = hex_sha1_len + e->name->str_len;`
			`tmp_fullpath_len = fullpath_len;`

			`/*`
			`* We're interested in EITHER existing note entries (entries`
			`* with exactly 40 hex chars in path, not including directory`
			`* separators), OR directory entries that may contain note`
			`* entries (with < 40 hex chars in path).`
			`* Also, each path component in a note entry must be a multiple`
			`* of 2 chars.`
			`*/`
			`if (!e->versions[1].mode ||`
			`tmp_hex_sha1_len > 40 ||`
			`e->name->str_len % 2)`
			`continue;`

			`/* This _may_ be a note entry, or a subdir containing notes */`
			`memcpy(hex_sha1 + hex_sha1_len, e->name->str_dat,`
			`e->name->str_len);`
			`if (tmp_fullpath_len)`
			`fullpath[tmp_fullpath_len++] = '/';`
			`memcpy(fullpath + tmp_fullpath_len, e->name->str_dat,`
			`e->name->str_len);`
			`tmp_fullpath_len += e->name->str_len;`
			`fullpath[tmp_fullpath_len] = '\0';`

			`if (tmp_hex_sha1_len == 40 && !get_sha1_hex(hex_sha1, sha1)) {`
			`/* This is a note entry */`
fast-import: Fix incorrect fanout level when modifying existing notes refs This fixes the bug uncovered by the tests added in the previous two patches. When an existing notes ref was loaded into the fast-import machinery, the num_notes counter associated with that ref remained == 0, even though the true number of notes in the loaded ref was higher. This caused a fanout level of 0 to be used, although the actual fanout of the tree could be > 0. Manipulating the notes tree at an incorrect fanout level causes removals to silently fail, and modifications of existing notes to instead produce an additional note (leaving the old object in place at a different fanout level). This patch fixes the bug by explicitly counting the number of notes in the notes tree whenever it looks like the num_notes counter could be wrong (when num_notes == 0). There may be false positives (i.e. triggering the counting when the notes tree is truly empty), but in those cases, the counting should not take long. Signed-off-by: Johan Herland <johan@herland.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-11-25 01:09:47 +01:00			`if (fanout == 0xff) {`
			`/* Counting mode, no rename */`
			`num_notes++;`
			`continue;`
			`}`
fast-import: Proper notes tree manipulation This patch teaches 'git fast-import' to automatically organize note objects in a fast-import stream into an appropriate fanout structure. The notes API in notes.h is NOT used to accomplish this, because trying to keep the fast-import and notes data structures in sync would yield a significantly larger patch with higher complexity. Note objects are added with the 'N' command, and accounted for with a per-branch counter, which is used to trigger fanout restructuring when needed. Note that when restructuring the branch tree, _any_ entry whose path consists of 40 hex chars (not including directory separators) will be recognized as a note object. It is therefore not advisable to manipulate note entries with M/D/R/C commands. Since note objects are stored in the same tree structure as other objects, the unloading and reloading of a fast-import branches handle note objects transparently. This patch has been improved by the following contributions: - Shawn O. Pearce: Several style- and logic-related improvements Cc: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Johan Herland <johan@herland.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-12-07 12:27:24 +01:00			`construct_path_with_fanout(hex_sha1, fanout, realpath);`
			`if (!strcmp(fullpath, realpath)) {`
			`/* Note entry is in correct location */`
			`num_notes++;`
			`continue;`
			`}`

			`/* Rename fullpath to realpath */`
			`if (!tree_content_remove(orig_root, fullpath, &leaf))`
			`die("Failed to remove path %s", fullpath);`
			`tree_content_set(orig_root, realpath,`
			`leaf.versions[1].sha1,`
			`leaf.versions[1].mode,`
			`leaf.tree);`
			`} else if (S_ISDIR(e->versions[1].mode)) {`
			`/* This is a subdir that may contain note entries */`
			`if (!e->tree)`
			`load_tree(e);`
			`num_notes += do_change_note_fanout(orig_root, e,`
			`hex_sha1, tmp_hex_sha1_len,`
			`fullpath, tmp_fullpath_len, fanout);`
			`}`

			`/* The above may have reallocated the current tree_content */`
			`t = root->tree;`
			`}`
			`return num_notes;`
			`}`

			`static uintmax_t change_note_fanout(struct tree_entry *root,`
			`unsigned char fanout)`
			`{`
			`char hex_sha1[40], path[60];`
			`return do_change_note_fanout(root, root, hex_sha1, 0, path, 0, fanout);`
			`}`

fast-import: tighten parsing of datarefs The syntax for the use of mark references in fast-import demands either a SP (space) or LF (end-of-line) after a mark reference. Fast-import does not complain when garbage appears after a mark reference in some cases. Factor out parsing of mark references and complain if errant characters are found. Also be a little more careful when parsing "inline" and SHA1s, complaining if extra characters appear or if the form of the dataref is unrecognized. Buggy input can cause fast-import to produce the wrong output, silently, without error. This makes it difficult to track down buggy generators of fast-import streams. An example is seen in the last line of this commit command: commit refs/heads/S2 committer Name <name@example.com> 1112912893 -0400 data <<COMMIT commit message COMMIT from :1M 100644 :103 hello.c It is missing a newline and should be: [...] from :1 M 100644 :103 hello.c What fast-import does is to produce a commit with the same contents for hello.c as in refs/heads/S2^. What the buggy program was expecting was the contents of blob :103. While the resulting commit graph looked correct, the contents in some commits were wrong. Signed-off-by: Pete Wyckoff <pw@padd.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-04-08 00:59:20 +02:00			`/*`
			`* Given a pointer into a string, parse a mark reference:`
			`*`
			`* idnum ::= ':' bigint;`
			`*`
			`* Return the first character after the value in *endptr.`
			`*`
			`* Complain if the following character is not what is expected,`
			`* either a space or end of the string.`
			`*/`
			`static uintmax_t parse_mark_ref(const char *p, char **endptr)`
			`{`
			`uintmax_t mark;`

			`assert(*p == ':');`
			`p++;`
			`mark = strtoumax(p, endptr, 10);`
			`if (*endptr == p)`
			`die("No value after ':' in mark: %s", command_buf.buf);`
			`return mark;`
			`}`

			`/*`
			`* Parse the mark reference, and complain if this is not the end of`
			`* the string.`
			`*/`
			`static uintmax_t parse_mark_ref_eol(const char *p)`
			`{`
			`char *end;`
			`uintmax_t mark;`

			`mark = parse_mark_ref(p, &end);`
			`if (*end != '\0')`
			`die("Garbage after mark: %s", command_buf.buf);`
			`return mark;`
			`}`

			`/*`
			`* Parse the mark reference, demanding a trailing space. Return the`
			`* mark and advance *p to point to the space.`
			`*/`
			`static uintmax_t parse_mark_ref_space(const char **p)`
			`{`
			`uintmax_t mark;`
			`char *end;`

			`mark = parse_mark_ref(*p, &end);`
			`if (*end != ' ')`
			`die("Missing space after mark: %s", command_buf.buf);`
			`*p = end;`
			`return mark;`
			`}`
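
The three helpers above layer a terminator check on top of one shared parser. A minimal standalone sketch of that layering follows, assuming a hypothetical `enum mark_status` return code in place of `die()` and hypothetical `sketch_*` function names; none of these exist in fast-import.c. Unlike the original, the sketch also rejects a non-digit directly after the colon, because `strtoumax()` on its own skips leading whitespace and accepts a sign.

```c
#include <assert.h>
#include <errno.h>
#include <inttypes.h>
#include <stddef.h>

/* Hypothetical status codes; the original code calls die() instead. */
enum mark_status {
	MARK_OK = 0,
	MARK_NO_COLON,
	MARK_NO_VALUE,
	MARK_OVERFLOW,
	MARK_BAD_TERMINATOR
};

/*
 * Parse "idnum ::= ':' bigint" starting at p. On success, store the
 * value in *mark and the address of the first unparsed character in *end.
 */
enum mark_status sketch_parse_mark(const char *p, uintmax_t *mark,
				   const char **end)
{
	char *stop;

	assert(p != NULL && mark != NULL && end != NULL);
	if (*p != ':')
		return MARK_NO_COLON;
	p++;
	/* Assumption of this sketch: require a digit, so " 1" and "-1" fail. */
	if (*p < '0' || *p > '9')
		return MARK_NO_VALUE;
	errno = 0;
	*mark = strtoumax(p, &stop, 10);
	if (errno == ERANGE)
		return MARK_OVERFLOW;
	*end = stop;
	return MARK_OK;
}

/* Counterpart of parse_mark_ref_eol(): the mark must end the string. */
enum mark_status sketch_parse_mark_eol(const char *p, uintmax_t *mark)
{
	const char *end;
	enum mark_status st = sketch_parse_mark(p, mark, &end);

	if (st != MARK_OK)
		return st;
	return *end == '\0' ? MARK_OK : MARK_BAD_TERMINATOR;
}

/*
 * Counterpart of parse_mark_ref_space(): the mark must be followed by a
 * space. On success *p is advanced to that space.
 */
enum mark_status sketch_parse_mark_space(const char **p, uintmax_t *mark)
{
	const char *end;
	enum mark_status st = sketch_parse_mark(*p, mark, &end);

	if (st != MARK_OK)
		return st;
	if (*end != ' ')
		return MARK_BAD_TERMINATOR;
	*p = end;
	return MARK_OK;
}
```

With these definitions, `sketch_parse_mark_eol(":1M 100644 :103 hello.c", &m)` reports `MARK_BAD_TERMINATOR`, which corresponds to the buggy `from :1M ...` input that the tightened parser is meant to reject.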

Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`static void file_change_m(struct branch *b)`
Implemented branch handling and basic tree support in fast-import. This provides the basic data structures needed to store trees in memory while we are processing them for a branch. What we are attempting to do is track one complete tree for each branch that the frontend has registered with us through the 'newb' (new_branch) command. When the frontend edits that tree through 'updf' or 'delf' commands we'll mark the affected tree(s) as being dirty and recompute their objects during 'comt' (commit). Currently the protocol is decidedly _not_ user friendly. I crashed fast-import by giving it bad input data from Perl. I may try to improve upon it, or at least upon its error handling. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 09:36:45 +02:00			`{`
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00			`const char *p = command_buf.buf + 2;`
Rework unquote_c_style to work on a strbuf. If the gain is not obvious in the diffstat, the resulting code is more readable, _and_ in checkout-index/update-index we now reuse the same buffer to unquote strings instead of always freeing/mallocing. This also is more coherent with the next patch that reworks quoting functions. The quoting function is also made more efficient scanning for backslashes and treating portions of strings without a backslash at once. Signed-off-by: Pierre Habouzit <madcoder@debian.org> 2007-09-20 00:42:14 +02:00			`static struct strbuf uq = STRBUF_INIT;`
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00			`const char *endp;`
Correct compiler warnings in fast-import. Junio noticed these warnings/errors in fast-import when compiling with `-Werror -ansi -pedantic`. A few changes are to reduce compiler warnings, while one (in cmd_merge) is a bug fix. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-06 06:26:49 +01:00			`struct object_entry *oe = oe;`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`unsigned char sha1[20];`
Reduce memory usage of fast-import. Some structs are allocated rather frequently, but were using integer types which were far larger than required to actually store their full value range. As packfiles are limited to 4 GiB we don't need more than 32 bits to store the offset of an object within that packfile, an `unsigned long` on a 64 bit system is likely a 64 bit unsigned value. Saving 4 bytes per object on a 64 bit system can add up fast on any sizable import. As atom strings are strictly single components in a path name these are probably limited to just 255 bytes by the underlying OS. Going to that short of a string is probably too restrictive, but certainly `unsigned int` is far too large for their lengths. `unsigned short` is a reasonable limit. Modes within a tree really only need two bytes to store their whole value; using `unsigned int` here is vast overkill. Saving 4 bytes per file entry in an active branch can add up quickly on a project with a large number of files. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-05 22:34:56 +01:00			`uint16_t mode, inline_data = 0;`
Implemented branch handling and basic tree support in fast-import. This provides the basic data structures needed to store trees in memory while we are processing them for a branch. What we are attempting to do is track one complete tree for each branch that the frontend has registered with us through the 'newb' (new_branch) command. When the frontend edits that tree through 'updf' or 'delf' commands we'll mark the affected tree(s) as being dirty and recompute their objects during 'comt' (commit). Currently the protocol is decidedly _not_ user friendly. I crashed fast-import by giving it bad input data from Perl. I may try to improve upon it, or at least upon its error handling. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 09:36:45 +02:00
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00			`p = get_mode(p, &mode);`
			`if (!p)`
			`die("Corrupt mode: %s", command_buf.buf);`
			`switch (mode) {`
fast-import: Cleanup mode setting. "S_IFREG | mode" makes only sense for 0644 and 0755. Even though doing (S_IFREG | mode) may not hurt when mode is any other supported value, that is only true because S_IFREG mode bit happens to be already on for S_IFLNK or S_IFGITLINK. Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> Acked-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-01-14 02:37:07 +01:00			`case 0644:`
			`case 0755:`
			`mode |= S_IFREG;`
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00			`case S_IFREG | 0644:`
			`case S_IFREG | 0755:`
Allow symlink blobs in trees during fast-import. If a frontend is smart enough to import a symlink then we should let them do so. We'll assume that they were smart enough to first generate a blob to hold the link target, as that's how symlinks get represented in GIT. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-21 09:29:13 +02:00			`case S_IFLNK:`
Teach fast-import to import subtrees named by tree id To simulate the svn cp command, it would be very useful to be replace an arbitrary file in the current revision by an arbitrary directory from a previous one. Modify the filemodify command to allow that: M 040000 <tree id> pathname This would be most useful in combination with a facility to print the commit ids for new revisions as they are written. Cc: Shawn O. Pearce <spearce@spearce.org> Cc: Sverre Rabbelier <srabbelier@gmail.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-07-01 05:18:19 +02:00			`case S_IFDIR:`
Support gitlinks in fast-import. Currently fast-import/export cannot be used for repositories with submodules. This patch extends the relevant programs to make them correctly process gitlinks. Links can be represented by two forms of the Modify command: M 160000 SHA1 some/path which sets the link target explicitly, or M 160000 :mark some/path where the mark refers to a commit. The latter form can be used by importing tools to build all submodules simultaneously in one physical repository, and then simply fetch them apart. Signed-off-by: Alexander Gavrilov <angavrilov@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-19 14:21:24 +02:00			`case S_IFGITLINK:`
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00			`/* ok */`
			`break;`
			`default:`
			`die("Corrupt mode: %s", command_buf.buf);`
			`}`

Added mark store/find to fast-import. Marks are now saved when the mark directive gets used by the frontend and may be used in place of a SHA1 expression to locate a previous SHA1 which fast-import may have generated. This is particularly useful with commits where the frontend does not (easily) have the ability to compute the SHA1 for an arbitrary commit but needs it to generate a branch or tag from that commit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-23 10:17:45 +02:00			`if (*p == ':') {`
fast-import: tighten parsing of datarefs The syntax for the use of mark references in fast-import demands either a SP (space) or LF (end-of-line) after a mark reference. Fast-import does not complain when garbage appears after a mark reference in some cases. Factor out parsing of mark references and complain if errant characters are found. Also be a little more careful when parsing "inline" and SHA1s, complaining if extra characters appear or if the form of the dataref is unrecognized. Buggy input can cause fast-import to produce the wrong output, silently, without error. This makes it difficult to track down buggy generators of fast-import streams. An example is seen in the last line of this commit command: commit refs/heads/S2 committer Name <name@example.com> 1112912893 -0400 data <<COMMIT commit message COMMIT from :1M 100644 :103 hello.c It is missing a newline and should be: [...] from :1 M 100644 :103 hello.c What fast-import does is to produce a commit with the same contents for hello.c as in refs/heads/S2^. What the buggy program was expecting was the contents of blob :103. While the resulting commit graph looked correct, the contents in some commits were wrong. Signed-off-by: Pete Wyckoff <pw@padd.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-04-08 00:59:20 +02:00			`oe = find_mark(parse_mark_ref_space(&p));`
fast-import: start using struct pack_idx_entry This is in preparation for using write_idx_file(). Signed-off-by: Nicolas Pitre <nico@fluxnic.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-02-17 20:05:51 +01:00			`hashcpy(sha1, oe->idx.sha1);`
fast-import: tighten parsing of datarefs The syntax for the use of mark references in fast-import demands either a SP (space) or LF (end-of-line) after a mark reference. Fast-import does not complain when garbage appears after a mark reference in some cases. Factor out parsing of mark references and complain if errant characters are found. Also be a little more careful when parsing "inline" and SHA1s, complaining if extra characters appear or if the form of the dataref is unrecognized. Buggy input can cause fast-import to produce the wrong output, silently, without error. This makes it difficult to track down buggy generators of fast-import streams. An example is seen in the last line of this commit command: commit refs/heads/S2 committer Name <name@example.com> 1112912893 -0400 data <<COMMIT commit message COMMIT from :1M 100644 :103 hello.c It is missing a newline and should be: [...] from :1 M 100644 :103 hello.c What fast-import does is to produce a commit with the same contents for hello.c as in refs/heads/S2^. What the buggy program was expecting was the contents of blob :103. While the resulting commit graph looked correct, the contents in some commits were wrong. Signed-off-by: Pete Wyckoff <pw@padd.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-04-08 00:59:20 +02:00			`} else if (!prefixcmp(p, "inline ")) {`
Accept 'inline' file data in fast-import commit structure. Its very annoying to need to specify the file content ahead of a commit and use marks to connect the individual blobs to the commit's file modification entry, especially if the frontend can't/won't generate the blob SHA1s itself. Instead it would much easier to use if we can accept the blob data at the same time as we receive each file_change line. Now fast-import accepts 'inline' instead of a mark idnum or blob SHA1 within the 'M' type file_change command. If an inline is detected the very next line must be a 'data n' command, supplying the file data. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-18 21:17:58 +01:00			`inline_data = 1;`
fast-import: tighten parsing of datarefs The syntax for the use of mark references in fast-import demands either a SP (space) or LF (end-of-line) after a mark reference. Fast-import does not complain when garbage appears after a mark reference in some cases. Factor out parsing of mark references and complain if errant characters are found. Also be a little more careful when parsing "inline" and SHA1s, complaining if extra characters appear or if the form of the dataref is unrecognized. Buggy input can cause fast-import to produce the wrong output, silently, without error. This makes it difficult to track down buggy generators of fast-import streams. An example is seen in the last line of this commit command: commit refs/heads/S2 committer Name <name@example.com> 1112912893 -0400 data <<COMMIT commit message COMMIT from :1M 100644 :103 hello.c It is missing a newline and should be: [...] from :1 M 100644 :103 hello.c What fast-import does is to produce a commit with the same contents for hello.c as in refs/heads/S2^. What the buggy program was expecting was the contents of blob :103. While the resulting commit graph looked correct, the contents in some commits were wrong. Signed-off-by: Pete Wyckoff <pw@padd.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-04-08 00:59:20 +02:00			`p += strlen("inline"); /* advance to space */`
Added mark store/find to fast-import. Marks are now saved when the mark directive gets used by the frontend and may be used in place of a SHA1 expression to locate a previous SHA1 which fast-import may have generated. This is particularly useful with commits where the frontend does not (easily) have the ability to compute the SHA1 for an arbitrary commit but needs it to generate a branch or tag from that commit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-23 10:17:45 +02:00			`} else {`
			`if (get_sha1_hex(p, sha1))`
fast-import: tighten parsing of datarefs The syntax for the use of mark references in fast-import demands either a SP (space) or LF (end-of-line) after a mark reference. Fast-import does not complain when garbage appears after a mark reference in some cases. Factor out parsing of mark references and complain if errant characters are found. Also be a little more careful when parsing "inline" and SHA1s, complaining if extra characters appear or if the form of the dataref is unrecognized. Buggy input can cause fast-import to produce the wrong output, silently, without error. This makes it difficult to track down buggy generators of fast-import streams. An example is seen in the last line of this commit command: commit refs/heads/S2 committer Name <name@example.com> 1112912893 -0400 data <<COMMIT commit message COMMIT from :1M 100644 :103 hello.c It is missing a newline and should be: [...] from :1 M 100644 :103 hello.c What fast-import does is to produce a commit with the same contents for hello.c as in refs/heads/S2^. What the buggy program was expecting was the contents of blob :103. While the resulting commit graph looked correct, the contents in some commits were wrong. Signed-off-by: Pete Wyckoff <pw@padd.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-04-08 00:59:20 +02:00			`die("Invalid dataref: %s", command_buf.buf);`
Added mark store/find to fast-import. Marks are now saved when the mark directive gets used by the frontend and may be used in place of a SHA1 expression to locate a previous SHA1 which fast-import may have generated. This is particularly useful with commits where the frontend does not (easily) have the ability to compute the SHA1 for an arbitrary commit but needs it to generate a branch or tag from that commit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-23 10:17:45 +02:00			`oe = find_object(sha1);`
			`p += 40;`
fast-import: tighten parsing of datarefs The syntax for the use of mark references in fast-import demands either a SP (space) or LF (end-of-line) after a mark reference. Fast-import does not complain when garbage appears after a mark reference in some cases. Factor out parsing of mark references and complain if errant characters are found. Also be a little more careful when parsing "inline" and SHA1s, complaining if extra characters appear or if the form of the dataref is unrecognized. Buggy input can cause fast-import to produce the wrong output, silently, without error. This makes it difficult to track down buggy generators of fast-import streams. An example is seen in the last line of this commit command: commit refs/heads/S2 committer Name <name@example.com> 1112912893 -0400 data <<COMMIT commit message COMMIT from :1M 100644 :103 hello.c It is missing a newline and should be: [...] from :1 M 100644 :103 hello.c What fast-import does is to produce a commit with the same contents for hello.c as in refs/heads/S2^. What the buggy program was expecting was the contents of blob :103. While the resulting commit graph looked correct, the contents in some commits were wrong. Signed-off-by: Pete Wyckoff <pw@padd.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-04-08 00:59:20 +02:00			`if (*p != ' ')`
			`die("Missing space after SHA1: %s", command_buf.buf);`
Added mark store/find to fast-import. Marks are now saved when the mark directive gets used by the frontend and may be used in place of a SHA1 expression to locate a previous SHA1 which fast-import may have generated. This is particularly useful with commits where the frontend does not (easily) have the ability to compute the SHA1 for an arbitrary commit but needs it to generate a branch or tag from that commit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-23 10:17:45 +02:00			`}`
fast-import: tighten parsing of datarefs The syntax for the use of mark references in fast-import demands either a SP (space) or LF (end-of-line) after a mark reference. Fast-import does not complain when garbage appears after a mark reference in some cases. Factor out parsing of mark references and complain if errant characters are found. Also be a little more careful when parsing "inline" and SHA1s, complaining if extra characters appear or if the form of the dataref is unrecognized. Buggy input can cause fast-import to produce the wrong output, silently, without error. This makes it difficult to track down buggy generators of fast-import streams. An example is seen in the last line of this commit command: commit refs/heads/S2 committer Name <name@example.com> 1112912893 -0400 data <<COMMIT commit message COMMIT from :1M 100644 :103 hello.c It is missing a newline and should be: [...] from :1 M 100644 :103 hello.c What fast-import does is to produce a commit with the same contents for hello.c as in refs/heads/S2^. What the buggy program was expecting was the contents of blob :103. While the resulting commit graph looked correct, the contents in some commits were wrong. Signed-off-by: Pete Wyckoff <pw@padd.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-04-08 00:59:20 +02:00			`assert(*p == ' ');`
			`p++; /* skip space */`
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00
Rework unquote_c_style to work on a strbuf. If the gain is not obvious in the diffstat, the resulting code is more readable, _and_ in checkout-index/update-index we now reuse the same buffer to unquote strings instead of always freeing/mallocing. This also is more coherent with the next patch that reworks quoting functions. The quoting function is also made more efficient scanning for backslashes and treating portions of strings without a backslash at once. Signed-off-by: Pierre Habouzit <madcoder@debian.org> 2007-09-20 00:42:14 +02:00			`strbuf_reset(&uq);`
			`if (!unquote_c_style(&uq, p, &endp)) {`
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00			`if (*endp)`
			`die("Garbage after path in: %s", command_buf.buf);`
Rework unquote_c_style to work on a strbuf. If the gain is not obvious in the diffstat, the resulting code is more readable, _and_ in checkout-index/update-index we now reuse the same buffer to unquote strings instead of always freeing/mallocing. This also is more coherent with the next patch that reworks quoting functions. The quoting function is also made more efficient scanning for backslashes and treating portions of strings without a backslash at once. Signed-off-by: Pierre Habouzit <madcoder@debian.org> 2007-09-20 00:42:14 +02:00			`p = uq.buf;`
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00			`}`
Implemented branch handling and basic tree support in fast-import. This provides the basic data structures needed to store trees in memory while we are processing them for a branch. What we are attempting to do is track one complete tree for each branch that the frontend has registered with us through the 'newb' (new_branch) command. When the frontend edits that tree through 'updf' or 'delf' commands we'll mark the affected tree(s) as being dirty and recompute their objects during 'comt' (commit). Currently the protocol is decidedly _not_ user friendly. I crashed fast-import by giving it bad input data from Perl. I may try to improve upon it, or at least upon its error handling. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 09:36:45 +02:00
fast-import: treat filemodify with empty tree as delete Normal git processes do not allow one to build a tree with an empty subtree entry without trying hard at it. This is in keeping with the general UI philosophy: git tracks content, not empty directories. v1.7.3-rc0~75^2 (2010-06-30) changed that by making it easy to include an empty subtree in fast-import's active commit: M 040000 4b825dc642cb6eb9a060e54bf8d69288fbee4904 subdir One can trigger this by reading an empty tree (for example, the tree corresponding to an empty root commit) and trying to move it to a subtree. It is better and more closely analogous to 'git read-tree --prefix' to treat such commands as requests to remove the subtree. Noticed-by: David Barr <david.barr@cordelta.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-01-27 07:07:49 +01:00			`/* Git does not track empty, non-toplevel directories. */`
			`if (S_ISDIR(mode) && !memcmp(sha1, EMPTY_TREE_SHA1_BIN, 20) && *p) {`
			`tree_content_remove(&b->branch_tree, p, NULL);`
			`return;`
			`}`

Support gitlinks in fast-import. Currently fast-import/export cannot be used for repositories with submodules. This patch extends the relevant programs to make them correctly process gitlinks. Links can be represented by two forms of the Modify command: M 160000 SHA1 some/path which sets the link target explicitly, or M 160000 :mark some/path where the mark refers to a commit. The latter form can be used by importing tools to build all submodules simultaneously in one physical repository, and then simply fetch them apart. Signed-off-by: Alexander Gavrilov <angavrilov@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-19 14:21:24 +02:00			`if (S_ISGITLINK(mode)) {`
			`if (inline_data)`
			`die("Git links cannot be specified 'inline': %s",`
			`command_buf.buf);`
			`else if (oe) {`
			`if (oe->type != OBJ_COMMIT)`
			`die("Not a commit (actually a %s): %s",`
			`typename(oe->type), command_buf.buf);`
			`}`
			`/*`
			`* Accept the sha1 without checking; it is expected to be in`
			`* another repository.`
			`*/`
			`} else if (inline_data) {`
Teach fast-import to import subtrees named by tree id To simulate the svn cp command, it would be very useful to be replace an arbitrary file in the current revision by an arbitrary directory from a previous one. Modify the filemodify command to allow that: M 040000 <tree id> pathname This would be most useful in combination with a facility to print the commit ids for new revisions as they are written. Cc: Shawn O. Pearce <spearce@spearce.org> Cc: Sverre Rabbelier <srabbelier@gmail.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-07-01 05:18:19 +02:00			`if (S_ISDIR(mode))`
			`die("Directories cannot be specified 'inline': %s",`
			`command_buf.buf);`
Rework unquote_c_style to work on a strbuf. If the gain is not obvious in the diffstat, the resulting code is more readable, _and_ in checkout-index/update-index we now reuse the same buffer to unquote strings instead of always freeing/mallocing. This also is more coherent with the next patch that reworks quoting functions. The quoting function is also made more efficient scanning for backslashes and treating portions of strings without a backslash at once. Signed-off-by: Pierre Habouzit <madcoder@debian.org> 2007-09-20 00:42:14 +02:00			`if (p != uq.buf) {`
			`strbuf_addstr(&uq, p);`
			`p = uq.buf;`
			`}`
Accept 'inline' file data in fast-import commit structure. Its very annoying to need to specify the file content ahead of a commit and use marks to connect the individual blobs to the commit's file modification entry, especially if the frontend can't/won't generate the blob SHA1s itself. Instead it would much easier to use if we can accept the blob data at the same time as we receive each file_change line. Now fast-import accepts 'inline' instead of a mark idnum or blob SHA1 within the 'M' type file_change command. If an inline is detected the very next line must be a 'data n' command, supplying the file data. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-18 21:17:58 +01:00			`read_next_command();`
fast-import: Stream very large blobs directly to pack If a blob is larger than the configured big-file-threshold, instead of reading it into a single buffer obtained from malloc, stream it onto the end of the current pack file. Streaming the larger objects into the pack avoids the 4+ GiB memory footprint that occurs when fast-import is processing 2+ GiB blobs. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-02-01 18:27:35 +01:00			`parse_and_store_blob(&last_blob, sha1, 0);`
Implement blob ID validation in fast-import. When accepting revision SHA1 IDs from the frontend verify the SHA1 actually refers to a blob and is known to exist. Its an error to use a SHA1 in a tree if the blob doesn't exist as this would cause git-fsck-objects to report a missing blob should the pack get closed without the blob being appended into it or a subsequent pack. So right now we'll just ask that the frontend "pre-declare" any blobs it wants to use in a tree before it can use them. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 08:50:18 +02:00			`} else {`
Teach fast-import to import subtrees named by tree id To simulate the svn cp command, it would be very useful to be replace an arbitrary file in the current revision by an arbitrary directory from a previous one. Modify the filemodify command to allow that: M 040000 <tree id> pathname This would be most useful in combination with a facility to print the commit ids for new revisions as they are written. Cc: Shawn O. Pearce <spearce@spearce.org> Cc: Sverre Rabbelier <srabbelier@gmail.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-07-01 05:18:19 +02:00			`enum object_type expected = S_ISDIR(mode) ?`
			`OBJ_TREE : OBJ_BLOB;`
			`enum object_type type = oe ? oe->type :`
			`sha1_object_info(sha1, NULL);`
convert object type handling from a string to a number We currently have two parallel notation for dealing with object types in the code: a string and a numerical value. One of them is obviously redundent, and the most used one requires more stack space and a bunch of strcmp() all over the place. This is an initial step for the removal of the version using a char array found in object reading code paths. The patch is unfortunately large but there is no sane way to split it in smaller parts without breaking the system. Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-26 20:55:59 +01:00			`if (type < 0)`
Teach fast-import to import subtrees named by tree id To simulate the svn cp command, it would be very useful to be replace an arbitrary file in the current revision by an arbitrary directory from a previous one. Modify the filemodify command to allow that: M 040000 <tree id> pathname This would be most useful in combination with a facility to print the commit ids for new revisions as they are written. Cc: Shawn O. Pearce <spearce@spearce.org> Cc: Sverre Rabbelier <srabbelier@gmail.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-07-01 05:18:19 +02:00			`die("%s not found: %s",`
			`S_ISDIR(mode) ? "Tree" : "Blob",`
			`command_buf.buf);`
			`if (type != expected)`
			`die("Not a %s (actually a %s): %s",`
			`typename(expected), typename(type),`
			`command_buf.buf);`
Implement blob ID validation in fast-import. When accepting revision SHA1 IDs from the frontend verify the SHA1 actually refers to a blob and is known to exist. Its an error to use a SHA1 in a tree if the blob doesn't exist as this would cause git-fsck-objects to report a missing blob should the pack get closed without the blob being appended into it or a subsequent pack. So right now we'll just ask that the frontend "pre-declare" any blobs it wants to use in a tree before it can use them. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 08:50:18 +02:00			`}`
Implemented branch handling and basic tree support in fast-import. This provides the basic data structures needed to store trees in memory while we are processing them for a branch. What we are attempting to do is track one complete tree for each branch that the frontend has registered with us through the 'newb' (new_branch) command. When the frontend edits that tree through 'updf' or 'delf' commands we'll mark the affected tree(s) as being dirty and recompute their objects during 'comt' (commit). Currently the protocol is decidedly _not_ user friendly. I crashed fast-import by giving it bad input data from Perl. I may try to improve upon it, or at least upon its error handling. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 09:36:45 +02:00
fast-import: tighten M 040000 syntax When tree_content_set() is asked to modify the path "foo/bar/", it first recurses like so: tree_content_set(root, "foo/bar/", sha1, S_IFDIR) -> tree_content_set(root:foo, "bar/", ...) -> tree_content_set(root:foo/bar, "", ...) And as a side-effect of 2794ad5 (fast-import: Allow filemodify to set the root, 2010-10-10), this last call is accepted and changes the tree entry for root:foo/bar to refer to the specified tree. That seems safe enough but let's reject the new syntax (we never meant to support it) and make it harder for frontends to introduce pointless incompatibilities with git fast-import 1.7.3. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-10-18 03:08:53 +02:00			`if (!*p) {`
			`tree_content_replace(&b->branch_tree, sha1, mode, NULL);`
			`return;`
			`}`
fast-import: Cleanup mode setting. "S_IFREG \| mode" makes only sense for 0644 and 0755. Even though doing (S_IFREG \| mode) may not hurt when mode is any other supported value, that is only true because S_IFREG mode bit happens to be already on for S_IFLNK or S_IFGITLINK. Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> Acked-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-01-14 02:37:07 +01:00			`tree_content_set(&b->branch_tree, p, sha1, mode, NULL);`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`}`
Implemented branch handling and basic tree support in fast-import. This provides the basic data structures needed to store trees in memory while we are processing them for a branch. What we are attempting to do is track one complete tree for each branch that the frontend has registered with us through the 'newb' (new_branch) command. When the frontend edits that tree through 'updf' or 'delf' commands we'll mark the affected tree(s) as being dirty and recompute their objects during 'comt' (commit). Currently the protocol is decidedly _not_ user friendly. I crashed fast-import by giving it bad input data from Perl. I may try to improve upon it, or at least upon its error handling. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 09:36:45 +02:00
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`static void file_change_d(struct branch *b)`
			`{`
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00			`const char *p = command_buf.buf + 2;`
Rework unquote_c_style to work on a strbuf. If the gain is not obvious in the diffstat, the resulting code is more readable, _and_ in checkout-index/update-index we now reuse the same buffer to unquote strings instead of always freeing/mallocing. This also is more coherent with the next patch that reworks quoting functions. The quoting function is also made more efficient scanning for backslashes and treating portions of strings without a backslash at once. Signed-off-by: Pierre Habouzit <madcoder@debian.org> 2007-09-20 00:42:14 +02:00			`static struct strbuf uq = STRBUF_INIT;`
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00			`const char *endp;`

Rework unquote_c_style to work on a strbuf. If the gain is not obvious in the diffstat, the resulting code is more readable, _and_ in checkout-index/update-index we now reuse the same buffer to unquote strings instead of always freeing/mallocing. This also is more coherent with the next patch that reworks quoting functions. The quoting function is also made more efficient scanning for backslashes and treating portions of strings without a backslash at once. Signed-off-by: Pierre Habouzit <madcoder@debian.org> 2007-09-20 00:42:14 +02:00			`strbuf_reset(&uq);`
			`if (!unquote_c_style(&uq, p, &endp)) {`
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00			`if (*endp)`
			`die("Garbage after path in: %s", command_buf.buf);`
Rework unquote_c_style to work on a strbuf. If the gain is not obvious in the diffstat, the resulting code is more readable, _and_ in checkout-index/update-index we now reuse the same buffer to unquote strings instead of always freeing/mallocing. This also is more coherent with the next patch that reworks quoting functions. The quoting function is also made more efficient scanning for backslashes and treating portions of strings without a backslash at once. Signed-off-by: Pierre Habouzit <madcoder@debian.org> 2007-09-20 00:42:14 +02:00			`p = uq.buf;`
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00			`}`
Support wholesale directory renames in fast-import Some source material (e.g. Subversion dump files) perform directory renames without telling us exactly which files in that subdirectory were moved. This makes it hard for a frontend to convert such data formats to a fast-import stream, as all the frontend has on hand is "Rename a/ to b/" with no details about what files are in a/, unless the frontend also kept track of all files. The new 'R' subcommand within a commit allows the frontend to rename either a file or an entire subdirectory, without needing to know the object's SHA-1 or the specific files contained within it. The rename is performed as efficiently as possible internally, making it cheaper than a 'D'/'M' pair for a file rename. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-07-10 04:58:23 +02:00			`tree_content_remove(&b->branch_tree, p, NULL);`
Implemented branch handling and basic tree support in fast-import. This provides the basic data structures needed to store trees in memory while we are processing them for a branch. What we are attempting to do is track one complete tree for each branch that the frontend has registered with us through the 'newb' (new_branch) command. When the frontend edits that tree through 'updf' or 'delf' commands we'll mark the affected tree(s) as being dirty and recompute their objects during 'comt' (commit). Currently the protocol is decidedly _not_ user friendly. I crashed fast-import by giving it bad input data from Perl. I may try to improve upon it, or at least upon its error handling. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 09:36:45 +02:00			`}`

Teach fast-import to recursively copy files/directories Some source material (e.g. Subversion dump files) perform directory renames by telling us the directory was copied, then deleted in the same revision. This makes it difficult for a frontend to convert such data formats to a fast-import stream, as all the frontend has on hand is "Copy a/ to b/; Delete a/" with no details about what files are in a/, unless the frontend also kept track of all files. The new 'C' subcommand within a commit allows the frontend to make a recursive copy of one path to another path within the branch, without needing to keep track of the individual file paths. The metadata copy is performed in memory efficiently, but is implemented as a copy-immediately operation, rather than copy-on-write. With this new 'C' subcommand frontends could obviously implement an 'R' (rename) on their own as a combination of 'C' and 'D' (delete), but since we have already offered up 'R' in the past and it is a trivial thing to keep implemented I'm not going to deprecate it. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-07-15 07:40:37 +02:00			`static void file_change_cr(struct branch *b, int rename)`
Support wholesale directory renames in fast-import Some source material (e.g. Subversion dump files) perform directory renames without telling us exactly which files in that subdirectory were moved. This makes it hard for a frontend to convert such data formats to a fast-import stream, as all the frontend has on hand is "Rename a/ to b/" with no details about what files are in a/, unless the frontend also kept track of all files. The new 'R' subcommand within a commit allows the frontend to rename either a file or an entire subdirectory, without needing to know the object's SHA-1 or the specific files contained within it. The rename is performed as efficiently as possible internally, making it cheaper than a 'D'/'M' pair for a file rename. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-07-10 04:58:23 +02:00			`{`
			`const char *s, *d;`
Rework unquote_c_style to work on a strbuf. If the gain is not obvious in the diffstat, the resulting code is more readable, _and_ in checkout-index/update-index we now reuse the same buffer to unquote strings instead of always freeing/mallocing. This also is more coherent with the next patch that reworks quoting functions. The quoting function is also made more efficient scanning for backslashes and treating portions of strings without a backslash at once. Signed-off-by: Pierre Habouzit <madcoder@debian.org> 2007-09-20 00:42:14 +02:00			`static struct strbuf s_uq = STRBUF_INIT;`
			`static struct strbuf d_uq = STRBUF_INIT;`
Support wholesale directory renames in fast-import Some source material (e.g. Subversion dump files) perform directory renames without telling us exactly which files in that subdirectory were moved. This makes it hard for a frontend to convert such data formats to a fast-import stream, as all the frontend has on hand is "Rename a/ to b/" with no details about what files are in a/, unless the frontend also kept track of all files. The new 'R' subcommand within a commit allows the frontend to rename either a file or an entire subdirectory, without needing to know the object's SHA-1 or the specific files contained within it. The rename is performed as efficiently as possible internally, making it cheaper than a 'D'/'M' pair for a file rename. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-07-10 04:58:23 +02:00			`const char *endp;`
			`struct tree_entry leaf;`

			`s = command_buf.buf + 2;`
Rework unquote_c_style to work on a strbuf. If the gain is not obvious in the diffstat, the resulting code is more readable, _and_ in checkout-index/update-index we now reuse the same buffer to unquote strings instead of always freeing/mallocing. This also is more coherent with the next patch that reworks quoting functions. The quoting function is also made more efficient scanning for backslashes and treating portions of strings without a backslash at once. Signed-off-by: Pierre Habouzit <madcoder@debian.org> 2007-09-20 00:42:14 +02:00			`strbuf_reset(&s_uq);`
			`if (!unquote_c_style(&s_uq, s, &endp)) {`
Support wholesale directory renames in fast-import Some source material (e.g. Subversion dump files) perform directory renames without telling us exactly which files in that subdirectory were moved. This makes it hard for a frontend to convert such data formats to a fast-import stream, as all the frontend has on hand is "Rename a/ to b/" with no details about what files are in a/, unless the frontend also kept track of all files. The new 'R' subcommand within a commit allows the frontend to rename either a file or an entire subdirectory, without needing to know the object's SHA-1 or the specific files contained within it. The rename is performed as efficiently as possible internally, making it cheaper than a 'D'/'M' pair for a file rename. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-07-10 04:58:23 +02:00			`if (*endp != ' ')`
			`die("Missing space after source: %s", command_buf.buf);`
Rework unquote_c_style to work on a strbuf. If the gain is not obvious in the diffstat, the resulting code is more readable, _and_ in checkout-index/update-index we now reuse the same buffer to unquote strings instead of always freeing/mallocing. This also is more coherent with the next patch that reworks quoting functions. The quoting function is also made more efficient scanning for backslashes and treating portions of strings without a backslash at once. Signed-off-by: Pierre Habouzit <madcoder@debian.org> 2007-09-20 00:42:14 +02:00			`} else {`
Support wholesale directory renames in fast-import Some source material (e.g. Subversion dump files) perform directory renames without telling us exactly which files in that subdirectory were moved. This makes it hard for a frontend to convert such data formats to a fast-import stream, as all the frontend has on hand is "Rename a/ to b/" with no details about what files are in a/, unless the frontend also kept track of all files. The new 'R' subcommand within a commit allows the frontend to rename either a file or an entire subdirectory, without needing to know the object's SHA-1 or the specific files contained within it. The rename is performed as efficiently as possible internally, making it cheaper than a 'D'/'M' pair for a file rename. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-07-10 04:58:23 +02:00			`endp = strchr(s, ' ');`
			`if (!endp)`
			`die("Missing space after source: %s", command_buf.buf);`
Rework unquote_c_style to work on a strbuf. If the gain is not obvious in the diffstat, the resulting code is more readable, _and_ in checkout-index/update-index we now reuse the same buffer to unquote strings instead of always freeing/mallocing. This also is more coherent with the next patch that reworks quoting functions. The quoting function is also made more efficient scanning for backslashes and treating portions of strings without a backslash at once. Signed-off-by: Pierre Habouzit <madcoder@debian.org> 2007-09-20 00:42:14 +02:00			`strbuf_add(&s_uq, s, endp - s);`
Support wholesale directory renames in fast-import Some source material (e.g. Subversion dump files) perform directory renames without telling us exactly which files in that subdirectory were moved. This makes it hard for a frontend to convert such data formats to a fast-import stream, as all the frontend has on hand is "Rename a/ to b/" with no details about what files are in a/, unless the frontend also kept track of all files. The new 'R' subcommand within a commit allows the frontend to rename either a file or an entire subdirectory, without needing to know the object's SHA-1 or the specific files contained within it. The rename is performed as efficiently as possible internally, making it cheaper than a 'D'/'M' pair for a file rename. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-07-10 04:58:23 +02:00			`}`
Rework unquote_c_style to work on a strbuf. If the gain is not obvious in the diffstat, the resulting code is more readable, _and_ in checkout-index/update-index we now reuse the same buffer to unquote strings instead of always freeing/mallocing. This also is more coherent with the next patch that reworks quoting functions. The quoting function is also made more efficient scanning for backslashes and treating portions of strings without a backslash at once. Signed-off-by: Pierre Habouzit <madcoder@debian.org> 2007-09-20 00:42:14 +02:00			`s = s_uq.buf;`
Support wholesale directory renames in fast-import Some source material (e.g. Subversion dump files) perform directory renames without telling us exactly which files in that subdirectory were moved. This makes it hard for a frontend to convert such data formats to a fast-import stream, as all the frontend has on hand is "Rename a/ to b/" with no details about what files are in a/, unless the frontend also kept track of all files. The new 'R' subcommand within a commit allows the frontend to rename either a file or an entire subdirectory, without needing to know the object's SHA-1 or the specific files contained within it. The rename is performed as efficiently as possible internally, making it cheaper than a 'D'/'M' pair for a file rename. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-07-10 04:58:23 +02:00
			`endp++;`
			`if (!*endp)`
			`die("Missing dest: %s", command_buf.buf);`

			`d = endp;`
Rework unquote_c_style to work on a strbuf. If the gain is not obvious in the diffstat, the resulting code is more readable, _and_ in checkout-index/update-index we now reuse the same buffer to unquote strings instead of always freeing/mallocing. This also is more coherent with the next patch that reworks quoting functions. The quoting function is also made more efficient scanning for backslashes and treating portions of strings without a backslash at once. Signed-off-by: Pierre Habouzit <madcoder@debian.org> 2007-09-20 00:42:14 +02:00			`strbuf_reset(&d_uq);`
			`if (!unquote_c_style(&d_uq, d, &endp)) {`
Support wholesale directory renames in fast-import Some source material (e.g. Subversion dump files) perform directory renames without telling us exactly which files in that subdirectory were moved. This makes it hard for a frontend to convert such data formats to a fast-import stream, as all the frontend has on hand is "Rename a/ to b/" with no details about what files are in a/, unless the frontend also kept track of all files. The new 'R' subcommand within a commit allows the frontend to rename either a file or an entire subdirectory, without needing to know the object's SHA-1 or the specific files contained within it. The rename is performed as efficiently as possible internally, making it cheaper than a 'D'/'M' pair for a file rename. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-07-10 04:58:23 +02:00			`if (*endp)`
			`die("Garbage after dest in: %s", command_buf.buf);`
Rework unquote_c_style to work on a strbuf. If the gain is not obvious in the diffstat, the resulting code is more readable, _and_ in checkout-index/update-index we now reuse the same buffer to unquote strings instead of always freeing/mallocing. This also is more coherent with the next patch that reworks quoting functions. The quoting function is also made more efficient scanning for backslashes and treating portions of strings without a backslash at once. Signed-off-by: Pierre Habouzit <madcoder@debian.org> 2007-09-20 00:42:14 +02:00			`d = d_uq.buf;`
Support wholesale directory renames in fast-import Some source material (e.g. Subversion dump files) perform directory renames without telling us exactly which files in that subdirectory were moved. This makes it hard for a frontend to convert such data formats to a fast-import stream, as all the frontend has on hand is "Rename a/ to b/" with no details about what files are in a/, unless the frontend also kept track of all files. The new 'R' subcommand within a commit allows the frontend to rename either a file or an entire subdirectory, without needing to know the object's SHA-1 or the specific files contained within it. The rename is performed as efficiently as possible internally, making it cheaper than a 'D'/'M' pair for a file rename. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-07-10 04:58:23 +02:00			`}`

			`memset(&leaf, 0, sizeof(leaf));`
Teach fast-import to recursively copy files/directories Some source material (e.g. Subversion dump files) perform directory renames by telling us the directory was copied, then deleted in the same revision. This makes it difficult for a frontend to convert such data formats to a fast-import stream, as all the frontend has on hand is "Copy a/ to b/; Delete a/" with no details about what files are in a/, unless the frontend also kept track of all files. The new 'C' subcommand within a commit allows the frontend to make a recursive copy of one path to another path within the branch, without needing to keep track of the individual file paths. The metadata copy is performed in memory efficiently, but is implemented as a copy-immediately operation, rather than copy-on-write. With this new 'C' subcommand frontends could obviously implement an 'R' (rename) on their own as a combination of 'C' and 'D' (delete), but since we have already offered up 'R' in the past and it is a trivial thing to keep implemented I'm not going to deprecate it. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-07-15 07:40:37 +02:00			`if (rename)`
			`tree_content_remove(&b->branch_tree, s, &leaf);`
			`else`
			`tree_content_get(&b->branch_tree, s, &leaf);`
Support wholesale directory renames in fast-import Some source material (e.g. Subversion dump files) perform directory renames without telling us exactly which files in that subdirectory were moved. This makes it hard for a frontend to convert such data formats to a fast-import stream, as all the frontend has on hand is "Rename a/ to b/" with no details about what files are in a/, unless the frontend also kept track of all files. The new 'R' subcommand within a commit allows the frontend to rename either a file or an entire subdirectory, without needing to know the object's SHA-1 or the specific files contained within it. The rename is performed as efficiently as possible internally, making it cheaper than a 'D'/'M' pair for a file rename. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-07-10 04:58:23 +02:00			`if (!leaf.versions[1].mode)`
			`die("Path %s not in branch", s);`
fast-import: tighten M 040000 syntax When tree_content_set() is asked to modify the path "foo/bar/", it first recurses like so: tree_content_set(root, "foo/bar/", sha1, S_IFDIR) -> tree_content_set(root:foo, "bar/", ...) -> tree_content_set(root:foo/bar, "", ...) And as a side-effect of 2794ad5 (fast-import: Allow filemodify to set the root, 2010-10-10), this last call is accepted and changes the tree entry for root:foo/bar to refer to the specified tree. That seems safe enough but let's reject the new syntax (we never meant to support it) and make it harder for frontends to introduce pointless incompatibilities with git fast-import 1.7.3. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-10-18 03:08:53 +02:00			`if (!*d) { /* C "path/to/subdir" "" */`
			`tree_content_replace(&b->branch_tree,`
			`leaf.versions[1].sha1,`
			`leaf.versions[1].mode,`
			`leaf.tree);`
			`return;`
			`}`
Support wholesale directory renames in fast-import Some source material (e.g. Subversion dump files) perform directory renames without telling us exactly which files in that subdirectory were moved. This makes it hard for a frontend to convert such data formats to a fast-import stream, as all the frontend has on hand is "Rename a/ to b/" with no details about what files are in a/, unless the frontend also kept track of all files. The new 'R' subcommand within a commit allows the frontend to rename either a file or an entire subdirectory, without needing to know the object's SHA-1 or the specific files contained within it. The rename is performed as efficiently as possible internally, making it cheaper than a 'D'/'M' pair for a file rename. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-07-10 04:58:23 +02:00			`tree_content_set(&b->branch_tree, d,`
			`leaf.versions[1].sha1,`
			`leaf.versions[1].mode,`
			`leaf.tree);`
			`}`

fast-import: Fix incorrect fanout level when modifying existing notes refs This fixes the bug uncovered by the tests added in the previous two patches. When an existing notes ref was loaded into the fast-import machinery, the num_notes counter associated with that ref remained == 0, even though the true number of notes in the loaded ref was higher. This caused a fanout level of 0 to be used, although the actual fanout of the tree could be > 0. Manipulating the notes tree at an incorrect fanout level causes removals to silently fail, and modifications of existing notes to instead produce an additional note (leaving the old object in place at a different fanout level). This patch fixes the bug by explicitly counting the number of notes in the notes tree whenever it looks like the num_notes counter could be wrong (when num_notes == 0). There may be false positives (i.e. triggering the counting when the notes tree is truly empty), but in those cases, the counting should not take long. Signed-off-by: Johan Herland <johan@herland.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-11-25 01:09:47 +01:00			`static void note_change_n(struct branch *b, unsigned char *old_fanout)`
fast-import: Add support for importing commit notes Introduce a 'notemodify' subcommand of the 'commit' command. This subcommand is similar to 'filemodify', except that no mode is supplied (all notes have mode 0644), and the path is set to the hex SHA1 of the given "comittish". This enables fast import of note objects along with their associated commits, since the notes can now be named using the mark references of their corresponding commits. The patch also includes a test case of the added functionality. Signed-off-by: Johan Herland <johan@herland.net> Acked-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-09 12:22:02 +02:00			`{`
			`const char *p = command_buf.buf + 2;`
			`static struct strbuf uq = STRBUF_INIT;`
			`struct object_entry *oe = oe;`
			`struct branch *s;`
			`unsigned char sha1[20], commit_sha1[20];`
fast-import: Proper notes tree manipulation This patch teaches 'git fast-import' to automatically organize note objects in a fast-import stream into an appropriate fanout structure. The notes API in notes.h is NOT used to accomplish this, because trying to keep the fast-import and notes data structures in sync would yield a significantly larger patch with higher complexity. Note objects are added with the 'N' command, and accounted for with a per-branch counter, which is used to trigger fanout restructuring when needed. Note that when restructuring the branch tree, _any_ entry whose path consists of 40 hex chars (not including directory separators) will be recognized as a note object. It is therefore not advisable to manipulate note entries with M/D/R/C commands. Since note objects are stored in the same tree structure as other objects, the unloading and reloading of a fast-import branches handle note objects transparently. This patch has been improved by the following contributions: - Shawn O. Pearce: Several style- and logic-related improvements Cc: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Johan Herland <johan@herland.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-12-07 12:27:24 +01:00			`char path[60];`
fast-import: Add support for importing commit notes Introduce a 'notemodify' subcommand of the 'commit' command. This subcommand is similar to 'filemodify', except that no mode is supplied (all notes have mode 0644), and the path is set to the hex SHA1 of the given "comittish". This enables fast import of note objects along with their associated commits, since the notes can now be named using the mark references of their corresponding commits. The patch also includes a test case of the added functionality. Signed-off-by: Johan Herland <johan@herland.net> Acked-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-09 12:22:02 +02:00			`uint16_t inline_data = 0;`
fast-import: Proper notes tree manipulation This patch teaches 'git fast-import' to automatically organize note objects in a fast-import stream into an appropriate fanout structure. The notes API in notes.h is NOT used to accomplish this, because trying to keep the fast-import and notes data structures in sync would yield a significantly larger patch with higher complexity. Note objects are added with the 'N' command, and accounted for with a per-branch counter, which is used to trigger fanout restructuring when needed. Note that when restructuring the branch tree, _any_ entry whose path consists of 40 hex chars (not including directory separators) will be recognized as a note object. It is therefore not advisable to manipulate note entries with M/D/R/C commands. Since note objects are stored in the same tree structure as other objects, the unloading and reloading of a fast-import branches handle note objects transparently. This patch has been improved by the following contributions: - Shawn O. Pearce: Several style- and logic-related improvements Cc: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Johan Herland <johan@herland.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-12-07 12:27:24 +01:00			`unsigned char new_fanout;`
fast-import: Add support for importing commit notes Introduce a 'notemodify' subcommand of the 'commit' command. This subcommand is similar to 'filemodify', except that no mode is supplied (all notes have mode 0644), and the path is set to the hex SHA1 of the given "comittish". This enables fast import of note objects along with their associated commits, since the notes can now be named using the mark references of their corresponding commits. The patch also includes a test case of the added functionality. Signed-off-by: Johan Herland <johan@herland.net> Acked-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-09 12:22:02 +02:00
fast-import: Fix incorrect fanout level when modifying existing notes refs This fixes the bug uncovered by the tests added in the previous two patches. When an existing notes ref was loaded into the fast-import machinery, the num_notes counter associated with that ref remained == 0, even though the true number of notes in the loaded ref was higher. This caused a fanout level of 0 to be used, although the actual fanout of the tree could be > 0. Manipulating the notes tree at an incorrect fanout level causes removals to silently fail, and modifications of existing notes to instead produce an additional note (leaving the old object in place at a different fanout level). This patch fixes the bug by explicitly counting the number of notes in the notes tree whenever it looks like the num_notes counter could be wrong (when num_notes == 0). There may be false positives (i.e. triggering the counting when the notes tree is truly empty), but in those cases, the counting should not take long. Signed-off-by: Johan Herland <johan@herland.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-11-25 01:09:47 +01:00			`/*`
			`* When loading a branch, we don't traverse its tree to count the real`
			`* number of notes (too expensive to do this for all non-note refs).`
			`* This means that recently loaded notes refs might incorrectly have`
			`* b->num_notes == 0, and consequently, old_fanout might be wrong.`
			`*`
			`* Fix this by traversing the tree and counting the number of notes`
			`* when b->num_notes == 0. If the notes tree is truly empty, the`
			`* calculation should not take long.`
			`*/`
			`if (b->num_notes == 0 && *old_fanout == 0) {`
			`/* Invoke change_note_fanout() in "counting mode". */`
			`b->num_notes = change_note_fanout(&b->branch_tree, 0xff);`
			`*old_fanout = convert_num_notes_to_fanout(b->num_notes);`
			`}`

			`/* Now parse the notemodify command. */`
fast-import: Add support for importing commit notes Introduce a 'notemodify' subcommand of the 'commit' command. This subcommand is similar to 'filemodify', except that no mode is supplied (all notes have mode 0644), and the path is set to the hex SHA1 of the given "comittish". This enables fast import of note objects along with their associated commits, since the notes can now be named using the mark references of their corresponding commits. The patch also includes a test case of the added functionality. Signed-off-by: Johan Herland <johan@herland.net> Acked-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-09 12:22:02 +02:00			`/* <dataref> or 'inline' */`
			`if (*p == ':') {`
fast-import: tighten parsing of datarefs The syntax for the use of mark references in fast-import demands either a SP (space) or LF (end-of-line) after a mark reference. Fast-import does not complain when garbage appears after a mark reference in some cases. Factor out parsing of mark references and complain if errant characters are found. Also be a little more careful when parsing "inline" and SHA1s, complaining if extra characters appear or if the form of the dataref is unrecognized. Buggy input can cause fast-import to produce the wrong output, silently, without error. This makes it difficult to track down buggy generators of fast-import streams. An example is seen in the last line of this commit command: commit refs/heads/S2 committer Name <name@example.com> 1112912893 -0400 data <<COMMIT commit message COMMIT from :1M 100644 :103 hello.c It is missing a newline and should be: [...] from :1 M 100644 :103 hello.c What fast-import does is to produce a commit with the same contents for hello.c as in refs/heads/S2^. What the buggy program was expecting was the contents of blob :103. While the resulting commit graph looked correct, the contents in some commits were wrong. Signed-off-by: Pete Wyckoff <pw@padd.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-04-08 00:59:20 +02:00			`oe = find_mark(parse_mark_ref_space(&p));`
fast-import: start using struct pack_idx_entry This is in preparation for using write_idx_file(). Signed-off-by: Nicolas Pitre <nico@fluxnic.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-02-17 20:05:51 +01:00			`hashcpy(sha1, oe->idx.sha1);`
fast-import: tighten parsing of datarefs The syntax for the use of mark references in fast-import demands either a SP (space) or LF (end-of-line) after a mark reference. Fast-import does not complain when garbage appears after a mark reference in some cases. Factor out parsing of mark references and complain if errant characters are found. Also be a little more careful when parsing "inline" and SHA1s, complaining if extra characters appear or if the form of the dataref is unrecognized. Buggy input can cause fast-import to produce the wrong output, silently, without error. This makes it difficult to track down buggy generators of fast-import streams. An example is seen in the last line of this commit command: commit refs/heads/S2 committer Name <name@example.com> 1112912893 -0400 data <<COMMIT commit message COMMIT from :1M 100644 :103 hello.c It is missing a newline and should be: [...] from :1 M 100644 :103 hello.c What fast-import does is to produce a commit with the same contents for hello.c as in refs/heads/S2^. What the buggy program was expecting was the contents of blob :103. While the resulting commit graph looked correct, the contents in some commits were wrong. Signed-off-by: Pete Wyckoff <pw@padd.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-04-08 00:59:20 +02:00			`} else if (!prefixcmp(p, "inline ")) {`
fast-import: Add support for importing commit notes Introduce a 'notemodify' subcommand of the 'commit' command. This subcommand is similar to 'filemodify', except that no mode is supplied (all notes have mode 0644), and the path is set to the hex SHA1 of the given "comittish". This enables fast import of note objects along with their associated commits, since the notes can now be named using the mark references of their corresponding commits. The patch also includes a test case of the added functionality. Signed-off-by: Johan Herland <johan@herland.net> Acked-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-09 12:22:02 +02:00			`inline_data = 1;`
fast-import: tighten parsing of datarefs The syntax for the use of mark references in fast-import demands either a SP (space) or LF (end-of-line) after a mark reference. Fast-import does not complain when garbage appears after a mark reference in some cases. Factor out parsing of mark references and complain if errant characters are found. Also be a little more careful when parsing "inline" and SHA1s, complaining if extra characters appear or if the form of the dataref is unrecognized. Buggy input can cause fast-import to produce the wrong output, silently, without error. This makes it difficult to track down buggy generators of fast-import streams. An example is seen in the last line of this commit command: commit refs/heads/S2 committer Name <name@example.com> 1112912893 -0400 data <<COMMIT commit message COMMIT from :1M 100644 :103 hello.c It is missing a newline and should be: [...] from :1 M 100644 :103 hello.c What fast-import does is to produce a commit with the same contents for hello.c as in refs/heads/S2^. What the buggy program was expecting was the contents of blob :103. While the resulting commit graph looked correct, the contents in some commits were wrong. Signed-off-by: Pete Wyckoff <pw@padd.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-04-08 00:59:20 +02:00			`p += strlen("inline"); /* advance to space */`
fast-import: Add support for importing commit notes Introduce a 'notemodify' subcommand of the 'commit' command. This subcommand is similar to 'filemodify', except that no mode is supplied (all notes have mode 0644), and the path is set to the hex SHA1 of the given "comittish". This enables fast import of note objects along with their associated commits, since the notes can now be named using the mark references of their corresponding commits. The patch also includes a test case of the added functionality. Signed-off-by: Johan Herland <johan@herland.net> Acked-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-09 12:22:02 +02:00			`} else {`
			`if (get_sha1_hex(p, sha1))`
fast-import: tighten parsing of datarefs The syntax for the use of mark references in fast-import demands either a SP (space) or LF (end-of-line) after a mark reference. Fast-import does not complain when garbage appears after a mark reference in some cases. Factor out parsing of mark references and complain if errant characters are found. Also be a little more careful when parsing "inline" and SHA1s, complaining if extra characters appear or if the form of the dataref is unrecognized. Buggy input can cause fast-import to produce the wrong output, silently, without error. This makes it difficult to track down buggy generators of fast-import streams. An example is seen in the last line of this commit command: commit refs/heads/S2 committer Name <name@example.com> 1112912893 -0400 data <<COMMIT commit message COMMIT from :1M 100644 :103 hello.c It is missing a newline and should be: [...] from :1 M 100644 :103 hello.c What fast-import does is to produce a commit with the same contents for hello.c as in refs/heads/S2^. What the buggy program was expecting was the contents of blob :103. While the resulting commit graph looked correct, the contents in some commits were wrong. Signed-off-by: Pete Wyckoff <pw@padd.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-04-08 00:59:20 +02:00			`die("Invalid dataref: %s", command_buf.buf);`
fast-import: Add support for importing commit notes Introduce a 'notemodify' subcommand of the 'commit' command. This subcommand is similar to 'filemodify', except that no mode is supplied (all notes have mode 0644), and the path is set to the hex SHA1 of the given "comittish". This enables fast import of note objects along with their associated commits, since the notes can now be named using the mark references of their corresponding commits. The patch also includes a test case of the added functionality. Signed-off-by: Johan Herland <johan@herland.net> Acked-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-09 12:22:02 +02:00			`oe = find_object(sha1);`
			`p += 40;`
fast-import: tighten parsing of datarefs The syntax for the use of mark references in fast-import demands either a SP (space) or LF (end-of-line) after a mark reference. Fast-import does not complain when garbage appears after a mark reference in some cases. Factor out parsing of mark references and complain if errant characters are found. Also be a little more careful when parsing "inline" and SHA1s, complaining if extra characters appear or if the form of the dataref is unrecognized. Buggy input can cause fast-import to produce the wrong output, silently, without error. This makes it difficult to track down buggy generators of fast-import streams. An example is seen in the last line of this commit command: commit refs/heads/S2 committer Name <name@example.com> 1112912893 -0400 data <<COMMIT commit message COMMIT from :1M 100644 :103 hello.c It is missing a newline and should be: [...] from :1 M 100644 :103 hello.c What fast-import does is to produce a commit with the same contents for hello.c as in refs/heads/S2^. What the buggy program was expecting was the contents of blob :103. While the resulting commit graph looked correct, the contents in some commits were wrong. Signed-off-by: Pete Wyckoff <pw@padd.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-04-08 00:59:20 +02:00			`if (*p != ' ')`
			`die("Missing space after SHA1: %s", command_buf.buf);`
fast-import: Add support for importing commit notes Introduce a 'notemodify' subcommand of the 'commit' command. This subcommand is similar to 'filemodify', except that no mode is supplied (all notes have mode 0644), and the path is set to the hex SHA1 of the given "comittish". This enables fast import of note objects along with their associated commits, since the notes can now be named using the mark references of their corresponding commits. The patch also includes a test case of the added functionality. Signed-off-by: Johan Herland <johan@herland.net> Acked-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-09 12:22:02 +02:00			`}`
fast-import: tighten parsing of datarefs The syntax for the use of mark references in fast-import demands either a SP (space) or LF (end-of-line) after a mark reference. Fast-import does not complain when garbage appears after a mark reference in some cases. Factor out parsing of mark references and complain if errant characters are found. Also be a little more careful when parsing "inline" and SHA1s, complaining if extra characters appear or if the form of the dataref is unrecognized. Buggy input can cause fast-import to produce the wrong output, silently, without error. This makes it difficult to track down buggy generators of fast-import streams. An example is seen in the last line of this commit command: commit refs/heads/S2 committer Name <name@example.com> 1112912893 -0400 data <<COMMIT commit message COMMIT from :1M 100644 :103 hello.c It is missing a newline and should be: [...] from :1 M 100644 :103 hello.c What fast-import does is to produce a commit with the same contents for hello.c as in refs/heads/S2^. What the buggy program was expecting was the contents of blob :103. While the resulting commit graph looked correct, the contents in some commits were wrong. Signed-off-by: Pete Wyckoff <pw@padd.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-04-08 00:59:20 +02:00			`assert(*p == ' ');`
			`p++; /* skip space */`
fast-import: Add support for importing commit notes Introduce a 'notemodify' subcommand of the 'commit' command. This subcommand is similar to 'filemodify', except that no mode is supplied (all notes have mode 0644), and the path is set to the hex SHA1 of the given "comittish". This enables fast import of note objects along with their associated commits, since the notes can now be named using the mark references of their corresponding commits. The patch also includes a test case of the added functionality. Signed-off-by: Johan Herland <johan@herland.net> Acked-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-09 12:22:02 +02:00
			`/* <committish> */`
			`s = lookup_branch(p);`
			`if (s) {`
fast-import: don't allow to note on empty branch 'reset' command makes fast-import start a branch from scratch. It's name is kept in lookup table but it's sha1 is null_sha1 (special value). 'notemodify' command can be used to add a note on branch head given it's name. lookup_branch() is used it that case and it doesn't check for null_sha1. So fast-import writes a note for null_sha1 object instead of giving a error. Add a check to deny adding a note on empty branch and add a corresponding test. Signed-off-by: Dmitry Ivankov <divanorama@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-09-22 21:47:05 +02:00			`if (is_null_sha1(s->sha1))`
			`die("Can't add a note on empty branch.");`
fast-import: Add support for importing commit notes Introduce a 'notemodify' subcommand of the 'commit' command. This subcommand is similar to 'filemodify', except that no mode is supplied (all notes have mode 0644), and the path is set to the hex SHA1 of the given "comittish". This enables fast import of note objects along with their associated commits, since the notes can now be named using the mark references of their corresponding commits. The patch also includes a test case of the added functionality. Signed-off-by: Johan Herland <johan@herland.net> Acked-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-09 12:22:02 +02:00			`hashcpy(commit_sha1, s->sha1);`
			`} else if (*p == ':') {`
fast-import: tighten parsing of datarefs The syntax for the use of mark references in fast-import demands either a SP (space) or LF (end-of-line) after a mark reference. Fast-import does not complain when garbage appears after a mark reference in some cases. Factor out parsing of mark references and complain if errant characters are found. Also be a little more careful when parsing "inline" and SHA1s, complaining if extra characters appear or if the form of the dataref is unrecognized. Buggy input can cause fast-import to produce the wrong output, silently, without error. This makes it difficult to track down buggy generators of fast-import streams. An example is seen in the last line of this commit command: commit refs/heads/S2 committer Name <name@example.com> 1112912893 -0400 data <<COMMIT commit message COMMIT from :1M 100644 :103 hello.c It is missing a newline and should be: [...] from :1 M 100644 :103 hello.c What fast-import does is to produce a commit with the same contents for hello.c as in refs/heads/S2^. What the buggy program was expecting was the contents of blob :103. While the resulting commit graph looked correct, the contents in some commits were wrong. Signed-off-by: Pete Wyckoff <pw@padd.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-04-08 00:59:20 +02:00			`uintmax_t commit_mark = parse_mark_ref_eol(p);`
fast-import: Add support for importing commit notes Introduce a 'notemodify' subcommand of the 'commit' command. This subcommand is similar to 'filemodify', except that no mode is supplied (all notes have mode 0644), and the path is set to the hex SHA1 of the given "comittish". This enables fast import of note objects along with their associated commits, since the notes can now be named using the mark references of their corresponding commits. The patch also includes a test case of the added functionality. Signed-off-by: Johan Herland <johan@herland.net> Acked-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-09 12:22:02 +02:00			`struct object_entry *commit_oe = find_mark(commit_mark);`
			`if (commit_oe->type != OBJ_COMMIT)`
			`die("Mark :%" PRIuMAX " not a commit", commit_mark);`
fast-import: start using struct pack_idx_entry This is in preparation for using write_idx_file(). Signed-off-by: Nicolas Pitre <nico@fluxnic.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-02-17 20:05:51 +01:00			`hashcpy(commit_sha1, commit_oe->idx.sha1);`
fast-import: Add support for importing commit notes Introduce a 'notemodify' subcommand of the 'commit' command. This subcommand is similar to 'filemodify', except that no mode is supplied (all notes have mode 0644), and the path is set to the hex SHA1 of the given "comittish". This enables fast import of note objects along with their associated commits, since the notes can now be named using the mark references of their corresponding commits. The patch also includes a test case of the added functionality. Signed-off-by: Johan Herland <johan@herland.net> Acked-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-09 12:22:02 +02:00			`} else if (!get_sha1(p, commit_sha1)) {`
			`unsigned long size;`
			`char *buf = read_object_with_reference(commit_sha1,`
			`commit_type, &size, commit_sha1);`
			`if (!buf || size < 46)`
			`die("Not a valid commit: %s", p);`
			`free(buf);`
			`} else`
			`die("Invalid ref name or SHA1 expression: %s", p);`

			`if (inline_data) {`
			`if (p != uq.buf) {`
			`strbuf_addstr(&uq, p);`
			`p = uq.buf;`
			`}`
			`read_next_command();`
fast-import: Stream very large blobs directly to pack If a blob is larger than the configured big-file-threshold, instead of reading it into a single buffer obtained from malloc, stream it onto the end of the current pack file. Streaming the larger objects into the pack avoids the 4+ GiB memory footprint that occurs when fast-import is processing 2+ GiB blobs. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-02-01 18:27:35 +01:00			`parse_and_store_blob(&last_blob, sha1, 0);`
fast-import: Add support for importing commit notes Introduce a 'notemodify' subcommand of the 'commit' command. This subcommand is similar to 'filemodify', except that no mode is supplied (all notes have mode 0644), and the path is set to the hex SHA1 of the given "comittish". This enables fast import of note objects along with their associated commits, since the notes can now be named using the mark references of their corresponding commits. The patch also includes a test case of the added functionality. Signed-off-by: Johan Herland <johan@herland.net> Acked-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-09 12:22:02 +02:00			`} else if (oe) {`
			`if (oe->type != OBJ_BLOB)`
			`die("Not a blob (actually a %s): %s",`
			`typename(oe->type), command_buf.buf);`
fast-import: Proper notes tree manipulation This patch teaches 'git fast-import' to automatically organize note objects in a fast-import stream into an appropriate fanout structure. The notes API in notes.h is NOT used to accomplish this, because trying to keep the fast-import and notes data structures in sync would yield a significantly larger patch with higher complexity. Note objects are added with the 'N' command, and accounted for with a per-branch counter, which is used to trigger fanout restructuring when needed. Note that when restructuring the branch tree, _any_ entry whose path consists of 40 hex chars (not including directory separators) will be recognized as a note object. It is therefore not advisable to manipulate note entries with M/D/R/C commands. Since note objects are stored in the same tree structure as other objects, the unloading and reloading of a fast-import branches handle note objects transparently. This patch has been improved by the following contributions: - Shawn O. Pearce: Several style- and logic-related improvements Cc: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Johan Herland <johan@herland.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-12-07 12:27:24 +01:00			`} else if (!is_null_sha1(sha1)) {`
fast-import: Add support for importing commit notes Introduce a 'notemodify' subcommand of the 'commit' command. This subcommand is similar to 'filemodify', except that no mode is supplied (all notes have mode 0644), and the path is set to the hex SHA1 of the given "comittish". This enables fast import of note objects along with their associated commits, since the notes can now be named using the mark references of their corresponding commits. The patch also includes a test case of the added functionality. Signed-off-by: Johan Herland <johan@herland.net> Acked-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-09 12:22:02 +02:00			`enum object_type type = sha1_object_info(sha1, NULL);`
			`if (type < 0)`
			`die("Blob not found: %s", command_buf.buf);`
			`if (type != OBJ_BLOB)`
			`die("Not a blob (actually a %s): %s",`
			`typename(type), command_buf.buf);`
			`}`

fast-import: Fix incorrect fanout level when modifying existing notes refs This fixes the bug uncovered by the tests added in the previous two patches. When an existing notes ref was loaded into the fast-import machinery, the num_notes counter associated with that ref remained == 0, even though the true number of notes in the loaded ref was higher. This caused a fanout level of 0 to be used, although the actual fanout of the tree could be > 0. Manipulating the notes tree at an incorrect fanout level causes removals to silently fail, and modifications of existing notes to instead produce an additional note (leaving the old object in place at a different fanout level). This patch fixes the bug by explicitly counting the number of notes in the notes tree whenever it looks like the num_notes counter could be wrong (when num_notes == 0). There may be false positives (i.e. triggering the counting when the notes tree is truly empty), but in those cases, the counting should not take long. Signed-off-by: Johan Herland <johan@herland.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-11-25 01:09:47 +01:00			`construct_path_with_fanout(sha1_to_hex(commit_sha1), *old_fanout, path);`
fast-import: Proper notes tree manipulation This patch teaches 'git fast-import' to automatically organize note objects in a fast-import stream into an appropriate fanout structure. The notes API in notes.h is NOT used to accomplish this, because trying to keep the fast-import and notes data structures in sync would yield a significantly larger patch with higher complexity. Note objects are added with the 'N' command, and accounted for with a per-branch counter, which is used to trigger fanout restructuring when needed. Note that when restructuring the branch tree, _any_ entry whose path consists of 40 hex chars (not including directory separators) will be recognized as a note object. It is therefore not advisable to manipulate note entries with M/D/R/C commands. Since note objects are stored in the same tree structure as other objects, the unloading and reloading of a fast-import branches handle note objects transparently. This patch has been improved by the following contributions: - Shawn O. Pearce: Several style- and logic-related improvements Cc: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Johan Herland <johan@herland.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-12-07 12:27:24 +01:00			`if (tree_content_remove(&b->branch_tree, path, NULL))`
			`b->num_notes--;`

			`if (is_null_sha1(sha1))`
			`return; /* nothing to insert */`

			`b->num_notes++;`
			`new_fanout = convert_num_notes_to_fanout(b->num_notes);`
			`construct_path_with_fanout(sha1_to_hex(commit_sha1), new_fanout, path);`
			`tree_content_set(&b->branch_tree, path, sha1, S_IFREG | 0644, NULL);`
fast-import: Add support for importing commit notes Introduce a 'notemodify' subcommand of the 'commit' command. This subcommand is similar to 'filemodify', except that no mode is supplied (all notes have mode 0644), and the path is set to the hex SHA1 of the given "comittish". This enables fast import of note objects along with their associated commits, since the notes can now be named using the mark references of their corresponding commits. The patch also includes a test case of the added functionality. Signed-off-by: Johan Herland <johan@herland.net> Acked-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-09 12:22:02 +02:00			`}`
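The note path handling above relies on two helpers defined elsewhere in the file, `convert_num_notes_to_fanout()` and `construct_path_with_fanout()`. The following is a minimal, self-contained sketch of what such helpers could look like, assuming one extra fanout level per factor of 256 notes and two hex characters per directory level. The `sketch_` names are hypothetical and the real helpers may differ in detail.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in: one extra fanout level per factor of 256 notes. */
static unsigned char sketch_num_notes_to_fanout(uintmax_t num_notes)
{
	unsigned char fanout = 0;

	while ((num_notes >>= 8))
		fanout++;
	return fanout;
}

/*
 * Hypothetical stand-in: split a 40-character hex name into `fanout`
 * two-character directory levels followed by the remaining characters.
 * `path` must have room for 40 + fanout + 1 bytes.
 */
static void sketch_path_with_fanout(const char *hex_sha1,
				    unsigned char fanout, char *path)
{
	unsigned int i = 0, j = 0;

	assert(fanout < 20);	/* 40 hex chars allow at most 19 levels */
	while (fanout) {
		path[i++] = hex_sha1[j++];
		path[i++] = hex_sha1[j++];
		path[i++] = '/';
		fanout--;
	}
	memcpy(path + i, hex_sha1 + j, 40 - j);
	path[i + 40 - j] = '\0';
}
```

Under these assumptions, a branch holding fewer than 256 notes stores each note at the plain 40-character name, and a branch holding 256 to 65535 notes stores it one directory level down. This is why the old note is removed at `*old_fanout` before the new one is written at `new_fanout`.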

Teach fast-import how to clear the internal branch content. Some frontends may not be able to (easily) keep track of which files are included in the branch, and which aren't. Performing this tracking can be tedious and error prone for the frontend to do, especially if its foreign data source cannot supply the changed path list on a per-commit basis. fast-import now allows a frontend to request that a branch's tree be wiped clean (reset to the empty tree) at the start of a commit, allowing the frontend to feed in all paths which belong on the branch. This is ideal for a tar-file importer frontend, for example, as the frontend just needs to reformat the tar data stream into a gfi data stream, which may be something a few Perl regexps can take care of. :) Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-07 08:03:03 +01:00			`static void file_change_deleteall(struct branch *b)`
			`{`
			`release_tree_content_recursive(b->branch_tree.tree);`
			`hashclr(b->branch_tree.versions[0].sha1);`
			`hashclr(b->branch_tree.versions[1].sha1);`
			`load_tree(&b->branch_tree);`
fast-import: Proper notes tree manipulation This patch teaches 'git fast-import' to automatically organize note objects in a fast-import stream into an appropriate fanout structure. The notes API in notes.h is NOT used to accomplish this, because trying to keep the fast-import and notes data structures in sync would yield a significantly larger patch with higher complexity. Note objects are added with the 'N' command, and accounted for with a per-branch counter, which is used to trigger fanout restructuring when needed. Note that when restructuring the branch tree, _any_ entry whose path consists of 40 hex chars (not including directory separators) will be recognized as a note object. It is therefore not advisable to manipulate note entries with M/D/R/C commands. Since note objects are stored in the same tree structure as other objects, the unloading and reloading of a fast-import branches handle note objects transparently. This patch has been improved by the following contributions: - Shawn O. Pearce: Several style- and logic-related improvements Cc: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Johan Herland <johan@herland.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-12-07 12:27:24 +01:00			`b->num_notes = 0;`
Teach fast-import how to clear the internal branch content. Some frontends may not be able to (easily) keep track of which files are included in the branch, and which aren't. Performing this tracking can be tedious and error prone for the frontend to do, especially if its foreign data source cannot supply the changed path list on a per-commit basis. fast-import now allows a frontend to request that a branch's tree be wiped clean (reset to the empty tree) at the start of a commit, allowing the frontend to feed in all paths which belong on the branch. This is ideal for a tar-file importer frontend, for example, as the frontend just needs to reformat the tar data stream into a gfi data stream, which may be something a few Perl regexps can take care of. :) Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-07 08:03:03 +01:00			`}`

git-fast-import: rename cmd_() functions to parse_() There is a cmd_merge() function in fast-import that will conflict with builtin-merge's cmd_merge() function. To keep it consistent, rename all cmd_() function to parse_() Signed-off-by: Miklos Vajna <vmiklos@frugalware.org> Acked-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-05-16 00:35:56 +02:00			`static void parse_from_commit(struct branch *b, char *buf, unsigned long size)`
Refactor fast-import branch creation from existing commit To resolve a corner case uncovered by Simon Hausmann I need to reuse the logic for the SHA-1 expression version of the 'from ' command within the mark version of the 'from ' command. This change doesn't alter any functionality, but is merely breaking the common code out to a function that I can reuse. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-05-24 06:05:19 +02:00			`{`
			`if (!buf || size < 46)`
			`die("Not a valid commit: %s", sha1_to_hex(b->sha1));`
			`if (memcmp("tree ", buf, 5)`
			`|| get_sha1_hex(buf + 5, b->branch_tree.versions[1].sha1))`
			`die("The commit %s is corrupt", sha1_to_hex(b->sha1));`
			`hashcpy(b->branch_tree.versions[0].sha1,`
			`b->branch_tree.versions[1].sha1);`
			`}`
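The `size < 46` bound used above presumably corresponds to the shortest possible commit header: `"tree "` (5 bytes), 40 hex digits and a terminating LF. The sketch below restates that check as a standalone predicate. The `sketch_` names are hypothetical, and the explicit LF test is an addition for illustration that the function above does not perform.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* "tree " (5 bytes) + 40 hex digits + LF = 46 bytes minimum. */
enum { SKETCH_MIN_COMMIT_SIZE = 5 + 40 + 1 };

/* Hypothetical helper: returns 1 if buf starts with a well-formed tree header. */
static int sketch_has_tree_header(const char *buf, unsigned long size)
{
	int i;

	if (!buf || size < SKETCH_MIN_COMMIT_SIZE)
		return 0;
	if (memcmp("tree ", buf, 5))
		return 0;
	for (i = 0; i < 40; i++)
		if (!isxdigit((unsigned char)buf[5 + i]))
			return 0;
	return buf[45] == '\n';	/* extra strictness, not in the original */
}

/* Tiny self-check of the sketch above. */
static void sketch_self_check(void)
{
	const char *ok = "tree 0123456789abcdef0123456789abcdef01234567\n";

	assert((int)strlen(ok) == SKETCH_MIN_COMMIT_SIZE);
	assert(sketch_has_tree_header(ok, strlen(ok)));
}
```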

git-fast-import: rename cmd_() functions to parse_() There is a cmd_merge() function in fast-import that will conflict with builtin-merge's cmd_merge() function. To keep it consistent, rename all cmd_() function to parse_() Signed-off-by: Miklos Vajna <vmiklos@frugalware.org> Acked-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-05-16 00:35:56 +02:00			`static void parse_from_existing(struct branch *b)`
Refactor fast-import branch creation from existing commit To resolve a corner case uncovered by Simon Hausmann I need to reuse the logic for the SHA-1 expression version of the 'from ' command within the mark version of the 'from ' command. This change doesn't alter any functionality, but is merely breaking the common code out to a function that I can reuse. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-05-24 06:05:19 +02:00			`{`
			`if (is_null_sha1(b->sha1)) {`
			`hashclr(b->branch_tree.versions[0].sha1);`
			`hashclr(b->branch_tree.versions[1].sha1);`
			`} else {`
			`unsigned long size;`
			`char *buf;`

			`buf = read_object_with_reference(b->sha1,`
			`commit_type, &size, b->sha1);`
git-fast-import: rename cmd_() functions to parse_() There is a cmd_merge() function in fast-import that will conflict with builtin-merge's cmd_merge() function. To keep it consistent, rename all cmd_() function to parse_() Signed-off-by: Miklos Vajna <vmiklos@frugalware.org> Acked-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-05-16 00:35:56 +02:00			`parse_from_commit(b, buf, size);`
Refactor fast-import branch creation from existing commit To resolve a corner case uncovered by Simon Hausmann I need to reuse the logic for the SHA-1 expression version of the 'from ' command within the mark version of the 'from ' command. This change doesn't alter any functionality, but is merely breaking the common code out to a function that I can reuse. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-05-24 06:05:19 +02:00			`free(buf);`
			`}`
			`}`
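
`parse_from_existing()` splits on whether the branch has a commit yet: an all-zero SHA-1 means the branch is unborn, so both tree versions are cleared; otherwise the commit has to be read to find its tree. The following is only a sketch of that decision. `struct toy_branch`, `is_null_id()`, `clear_id()` and `reset_if_unborn()` are hypothetical stand-ins for the real `struct branch`, `is_null_sha1()` and `hashclr()`.

```c
#include <assert.h>
#include <string.h>

#define ID_LEN 20

/* Hypothetical, flattened stand-in for the fields used from struct branch. */
struct toy_branch {
	unsigned char sha1[ID_LEN];	/* current commit of the branch */
	unsigned char tree_old[ID_LEN];	/* plays the role of versions[0].sha1 */
	unsigned char tree_new[ID_LEN];	/* plays the role of versions[1].sha1 */
};

/* Stand-in for is_null_sha1(): true when every byte is zero. */
static int is_null_id(const unsigned char *id)
{
	static const unsigned char zero[ID_LEN];

	assert(id != NULL);
	return !memcmp(id, zero, ID_LEN);
}

/* Stand-in for hashclr(): zero the whole id. */
static void clear_id(unsigned char *id)
{
	assert(id != NULL);
	memset(id, 0, ID_LEN);
}

/*
 * Returns 0 if the branch is unborn and its trees were cleared,
 * 1 if the caller still has to load the commit to learn its tree.
 */
static int reset_if_unborn(struct toy_branch *b)
{
	if (is_null_id(b->sha1)) {
		clear_id(b->tree_old);
		clear_id(b->tree_new);
		return 0;
	}
	return 1;
}
```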

git-fast-import: rename cmd_() functions to parse_() There is a cmd_merge() function in fast-import that will conflict with builtin-merge's cmd_merge() function. To keep it consistent, rename all cmd_() function to parse_() Signed-off-by: Miklos Vajna <vmiklos@frugalware.org> Acked-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-05-16 00:35:56 +02:00			`static int parse_from(struct branch *b)`
Remove branch creation command from fast-import. Jon Smirl was finding it difficult to alter cvs2svn to generate branch commands prior to the first commit of the same branch. This change moves the 'from' command to be an optional parameter of the 'commit' command, thereby allowing a new branch to be defined at the moment it gets used to create the first commit on that branch. This change makes it impossible to create a branch with no commits on it as at least one commit is needed to register the branch. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-25 00:45:26 +02:00			`{`
Don't support shell-quoted refnames in fast-import. The current implementation of shell-style quoted refnames and SHA-1 expressions within fast-import contains a bad memory leak. We leak the unquoted strings used by the `from` and `merge` commands, maybe others. Its also just muddling up the docs. Since Git refnames cannot contain LF, and that is our delimiter for the end of the refname, and we accept any other character as-is, there is no reason for these strings to support quoting, except to be nice to frontends. But frontends shouldn't be expecting to use funny refs in Git, and its just as simple to never quote them as it is to always pass them through the same quoting filter as pathnames. So frontends should never quote refs, or ref expressions. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-06 02:30:37 +01:00			`const char *from;`
Remove branch creation command from fast-import. Jon Smirl was finding it difficult to alter cvs2svn to generate branch commands prior to the first commit of the same branch. This change moves the 'from' command to be an optional parameter of the 'commit' command, thereby allowing a new branch to be defined at the moment it gets used to create the first commit on that branch. This change makes it impossible to create a branch with no commits on it as at least one commit is needed to register the branch. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-25 00:45:26 +02:00			`struct branch *s;`

prefixcmp(): fix-up mechanical conversion. Previous step converted use of strncmp() with literal string mechanically even when the result is only used as a boolean: if (!strncmp("foo", arg, 3)) ==> if (!(-prefixcmp(arg, "foo"))) This step manually cleans them up to read: if (!prefixcmp(arg, "foo")) Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-20 10:54:00 +01:00			`if (prefixcmp(command_buf.buf, "from "))`
Make trailing LF optional for all fast-import commands For the same reasons as the prior change we want to allow frontends to omit the trailing LF that usually delimits commands. In some cases these just make the input stream more verbose looking than it needs to be, and its just simpler for the frontend developer to get started if our parser is slightly more lenient about where an LF is required and where it isn't. To make this optional LF feature work we now have to buffer up to one line of input in command_buf. This buffering can happen if we look at the current input command but don't recognize it at this point in the code. In such a case we need to "unget" the entire line, but we cannot depend upon the stdio library to let us do ungetc() for that many characters at once. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-08-01 08:22:53 +02:00			`return 0;`
Remove branch creation command from fast-import. Jon Smirl was finding it difficult to alter cvs2svn to generate branch commands prior to the first commit of the same branch. This change moves the 'from' command to be an optional parameter of the 'commit' command, thereby allowing a new branch to be defined at the moment it gets used to create the first commit on that branch. This change makes it impossible to create a branch with no commits on it as at least one commit is needed to register the branch. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-25 00:45:26 +02:00
fast-import: Support reusing 'from' and brown paper bag fix reset. It was suggested on the mailing list that being able to use `from` in any commit to reset the current branch is useful in some types of importers, such as a darcs importer. We originally did not permit resetting an existing branch with a new `from` command during a `commit` command, but this restriction was only to help debug the hacked up cvs2svn that Jon Smirl was developing in parallel with git-fast-import. It is probably more of a problem to disallow it than to allow it. So now we permit a `from` during any `commit`. While making the changes required to permit multiple `from` commands on the same branch, I discovered we no longer needed the last_commit field to be set to 0 during a reset, so that was removed. (Reset was originally setting the field to 0 to signal cmd_from() that it was OK to execute on the branch.) While poking around in this section of fast-import I also realized the `reset` command was not working as intended if the corresponding `from` command was omitted (as allowed by the BNF grammar and the code). If `from` was omitted we cleared out the tree but we left the tree SHA-1 and parent commit SHA-1 intact. This is not what the user intended in this case. Instead they would be trying to reset the branch to have no parent and to have no tree, making the branch look new-born during the next commit. We now clear these SHA-1 values during `reset`, ensuring the branch looks new-born if `from` does not get supplied. New test cases for these were also added. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-12 10:08:43 +01:00			`if (b->branch_tree.tree) {`
			`release_tree_content_recursive(b->branch_tree.tree);`
			`b->branch_tree.tree = NULL;`
			`}`
Remove branch creation command from fast-import. Jon Smirl was finding it difficult to alter cvs2svn to generate branch commands prior to the first commit of the same branch. This change moves the 'from' command to be an optional parameter of the 'commit' command, thereby allowing a new branch to be defined at the moment it gets used to create the first commit on that branch. This change makes it impossible to create a branch with no commits on it as at least one commit is needed to register the branch. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-25 00:45:26 +02:00
			`from = strchr(command_buf.buf, ' ') + 1;`
			`s = lookup_branch(from);`
			`if (b == s)`
			`die("Can't create a branch from itself: %s", b->name);`
			`else if (s) {`
Implemented tree delta compression in fast-import. We now store for every tree entry two modes and two sha1 values; the base (aka "version 0") and the current/new (aka "version 1"). When we generate a tree object we also regenerate the prior version object and use that as our base object for a delta. This strategy saves a significant amount of memory as we can continue to use the atom pool for file/directory names and only increases each tree entry by an additional 24 bytes of memory. Branches should automatically delta against their ancestor tree, unless the ancestor tree is already at the delta chain limit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-28 18:22:50 +02:00			`unsigned char *t = s->branch_tree.versions[1].sha1;`
Converted hash memcpy/memcmp to new hashcpy/hashcmp/hashclr. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-28 16:46:58 +02:00			`hashcpy(b->sha1, s->sha1);`
Implemented tree delta compression in fast-import. We now store for every tree entry two modes and two sha1 values; the base (aka "version 0") and the current/new (aka "version 1"). When we generate a tree object we also regenerate the prior version object and use that as our base object for a delta. This strategy saves a significant amount of memory as we can continue to use the atom pool for file/directory names and only increases each tree entry by an additional 24 bytes of memory. Branches should automatically delta against their ancestor tree, unless the ancestor tree is already at the delta chain limit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-28 18:22:50 +02:00			`hashcpy(b->branch_tree.versions[0].sha1, t);`
			`hashcpy(b->branch_tree.versions[1].sha1, t);`
Remove branch creation command from fast-import. Jon Smirl was finding it difficult to alter cvs2svn to generate branch commands prior to the first commit of the same branch. This change moves the 'from' command to be an optional parameter of the 'commit' command, thereby allowing a new branch to be defined at the moment it gets used to create the first commit on that branch. This change makes it impossible to create a branch with no commits on it as at least one commit is needed to register the branch. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-25 00:45:26 +02:00			`} else if (*from == ':') {`
fast-import: tighten parsing of datarefs The syntax for the use of mark references in fast-import demands either a SP (space) or LF (end-of-line) after a mark reference. Fast-import does not complain when garbage appears after a mark reference in some cases. Factor out parsing of mark references and complain if errant characters are found. Also be a little more careful when parsing "inline" and SHA1s, complaining if extra characters appear or if the form of the dataref is unrecognized. Buggy input can cause fast-import to produce the wrong output, silently, without error. This makes it difficult to track down buggy generators of fast-import streams. An example is seen in the last line of this commit command: commit refs/heads/S2 committer Name <name@example.com> 1112912893 -0400 data <<COMMIT commit message COMMIT from :1M 100644 :103 hello.c It is missing a newline and should be: [...] from :1 M 100644 :103 hello.c What fast-import does is to produce a commit with the same contents for hello.c as in refs/heads/S2^. What the buggy program was expecting was the contents of blob :103. While the resulting commit graph looked correct, the contents in some commits were wrong. Signed-off-by: Pete Wyckoff <pw@padd.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-04-08 00:59:20 +02:00			`uintmax_t idnum = parse_mark_ref_eol(from);`
Remove branch creation command from fast-import. Jon Smirl was finding it difficult to alter cvs2svn to generate branch commands prior to the first commit of the same branch. This change moves the 'from' command to be an optional parameter of the 'commit' command, thereby allowing a new branch to be defined at the moment it gets used to create the first commit on that branch. This change makes it impossible to create a branch with no commits on it as at least one commit is needed to register the branch. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-25 00:45:26 +02:00			`struct object_entry *oe = find_mark(idnum);`
			`if (oe->type != OBJ_COMMIT)`
Check for PRIuMAX rather than NO_C99_FORMAT in fast-import.c. Thanks to Simon 'corecode' Schubert <corecode@fs.ei.tum.de> for the clean-up. Defining the C99 standard PRIuMAX when necessary replaces UM_FMT and the awkward UM10_FMT. There are no direct C99 translations for other uses of NO_C99_FORMAT in git, alas. Signed-off-by: Jason Riedy <ejr@cs.berkeley.edu> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-21 02:34:56 +01:00			`die("Mark :%" PRIuMAX " not a commit", idnum);`
fast-import: start using struct pack_idx_entry This is in preparation for using write_idx_file(). Signed-off-by: Nicolas Pitre <nico@fluxnic.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-02-17 20:05:51 +01:00			`hashcpy(b->sha1, oe->idx.sha1);`
Fix possible coredump with fast-import --import-marks When e8438420bb7d368bec3647b90c557b9931582267 allowed us to reload the marks table on subsequent runs of fast-import we really broke things, as we set pack_id to MAX_PACK_ID for any objects we imported into the marks table. Creating a branch from that mark should fail as we attempt to read the object through a non-existant packed_git pointer. Instead we have to use the normal Git object system to locate the older commit, as we ourselves do not have a reference to the packed_git it resides in. This bug only occurred because t9300 was not complete enough. When we added the --import-marks feature we didn't actually test its implementation enough to verify the function worked as intended. I have corrected that, and included the changes as part of this fix. Prior versions of fast-import fail the new test(s); this commit allows them to pass. Credit for this bug find goes to Simon Hausmann <simon@lst.de> as he recently identified a similiar bug in the tree lazy-loading path. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-05-24 06:32:31 +02:00			`if (oe->pack_id != MAX_PACK_ID) {`
Remove branch creation command from fast-import. Jon Smirl was finding it difficult to alter cvs2svn to generate branch commands prior to the first commit of the same branch. This change moves the 'from' command to be an optional parameter of the 'commit' command, thereby allowing a new branch to be defined at the moment it gets used to create the first commit on that branch. This change makes it impossible to create a branch with no commits on it as at least one commit is needed to register the branch. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-25 00:45:26 +02:00			`unsigned long size;`
Fix possible coredump with fast-import --import-marks When e8438420bb7d368bec3647b90c557b9931582267 allowed us to reload the marks table on subsequent runs of fast-import we really broke things, as we set pack_id to MAX_PACK_ID for any objects we imported into the marks table. Creating a branch from that mark should fail as we attempt to read the object through a non-existant packed_git pointer. Instead we have to use the normal Git object system to locate the older commit, as we ourselves do not have a reference to the packed_git it resides in. This bug only occurred because t9300 was not complete enough. When we added the --import-marks feature we didn't actually test its implementation enough to verify the function worked as intended. I have corrected that, and included the changes as part of this fix. Prior versions of fast-import fail the new test(s); this commit allows them to pass. Credit for this bug find goes to Simon Hausmann <simon@lst.de> as he recently identified a similiar bug in the tree lazy-loading path. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-05-24 06:32:31 +02:00			`char *buf = gfi_unpack_entry(oe, &size);`
git-fast-import: rename cmd_() functions to parse_() There is a cmd_merge() function in fast-import that will conflict with builtin-merge's cmd_merge() function. To keep it consistent, rename all cmd_() function to parse_() Signed-off-by: Miklos Vajna <vmiklos@frugalware.org> Acked-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-05-16 00:35:56 +02:00			`parse_from_commit(b, buf, size);`
Remove branch creation command from fast-import. Jon Smirl was finding it difficult to alter cvs2svn to generate branch commands prior to the first commit of the same branch. This change moves the 'from' command to be an optional parameter of the 'commit' command, thereby allowing a new branch to be defined at the moment it gets used to create the first commit on that branch. This change makes it impossible to create a branch with no commits on it as at least one commit is needed to register the branch. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-25 00:45:26 +02:00			`free(buf);`
Fix possible coredump with fast-import --import-marks When e8438420bb7d368bec3647b90c557b9931582267 allowed us to reload the marks table on subsequent runs of fast-import we really broke things, as we set pack_id to MAX_PACK_ID for any objects we imported into the marks table. Creating a branch from that mark should fail as we attempt to read the object through a non-existant packed_git pointer. Instead we have to use the normal Git object system to locate the older commit, as we ourselves do not have a reference to the packed_git it resides in. This bug only occurred because t9300 was not complete enough. When we added the --import-marks feature we didn't actually test its implementation enough to verify the function worked as intended. I have corrected that, and included the changes as part of this fix. Prior versions of fast-import fail the new test(s); this commit allows them to pass. Credit for this bug find goes to Simon Hausmann <simon@lst.de> as he recently identified a similiar bug in the tree lazy-loading path. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-05-24 06:32:31 +02:00			`} else`
git-fast-import: rename cmd_() functions to parse_() There is a cmd_merge() function in fast-import that will conflict with builtin-merge's cmd_merge() function. To keep it consistent, rename all cmd_() function to parse_() Signed-off-by: Miklos Vajna <vmiklos@frugalware.org> Acked-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-05-16 00:35:56 +02:00			`parse_from_existing(b);`
Refactor fast-import branch creation from existing commit To resolve a corner case uncovered by Simon Hausmann I need to reuse the logic for the SHA-1 expression version of the 'from ' command within the mark version of the 'from ' command. This change doesn't alter any functionality, but is merely breaking the common code out to a function that I can reuse. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-05-24 06:05:19 +02:00			`} else if (!get_sha1(from, b->sha1))`
git-fast-import: rename cmd_() functions to parse_() There is a cmd_merge() function in fast-import that will conflict with builtin-merge's cmd_merge() function. To keep it consistent, rename all cmd_() function to parse_() Signed-off-by: Miklos Vajna <vmiklos@frugalware.org> Acked-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-05-16 00:35:56 +02:00			`parse_from_existing(b);`
Refactor fast-import branch creation from existing commit To resolve a corner case uncovered by Simon Hausmann I need to reuse the logic for the SHA-1 expression version of the 'from ' command within the mark version of the 'from ' command. This change doesn't alter any functionality, but is merely breaking the common code out to a function that I can reuse. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-05-24 06:05:19 +02:00			`else`
Remove branch creation command from fast-import. Jon Smirl was finding it difficult to alter cvs2svn to generate branch commands prior to the first commit of the same branch. This change moves the 'from' command to be an optional parameter of the 'commit' command, thereby allowing a new branch to be defined at the moment it gets used to create the first commit on that branch. This change makes it impossible to create a branch with no commits on it as at least one commit is needed to register the branch. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-25 00:45:26 +02:00			`die("Invalid ref name or SHA1 expression: %s", from);`

			`read_next_command();`
Make trailing LF optional for all fast-import commands For the same reasons as the prior change we want to allow frontends to omit the trailing LF that usually delimits commands. In some cases these just make the input stream more verbose looking than it needs to be, and its just simpler for the frontend developer to get started if our parser is slightly more lenient about where an LF is required and where it isn't. To make this optional LF feature work we now have to buffer up to one line of input in command_buf. This buffering can happen if we look at the current input command but don't recognize it at this point in the code. In such a case we need to "unget" the entire line, but we cannot depend upon the stdio library to let us do ungetc() for that many characters at once. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-08-01 08:22:53 +02:00			`return 1;`
Remove branch creation command from fast-import. Jon Smirl was finding it difficult to alter cvs2svn to generate branch commands prior to the first commit of the same branch. This change moves the 'from' command to be an optional parameter of the 'commit' command, thereby allowing a new branch to be defined at the moment it gets used to create the first commit on that branch. This change makes it impossible to create a branch with no commits on it as at least one commit is needed to register the branch. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-25 00:45:26 +02:00			`}`
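
`parse_from()` resolves its argument in a fixed order: a branch already known to fast-import is tried first, then a `:mark` reference, and only then a general SHA-1 expression. A line that does not start with `from ` is left alone and the function returns 0. The sketch below mirrors only that dispatch order. The `from_kind` enum, `classify_from()` and the plain string array used in place of `lookup_branch()` are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical classification of a "from" line. */
enum from_kind { FROM_NONE, FROM_BRANCH, FROM_MARK, FROM_EXPR };

/*
 * Decide how the argument of a "from" command would be resolved.
 * branches/nr is a toy replacement for fast-import's branch table.
 */
static enum from_kind classify_from(const char *line,
				    const char *const *branches, size_t nr)
{
	const char *arg;
	size_t i;

	assert(line != NULL);
	if (strncmp(line, "from ", 5))
		return FROM_NONE;	/* not a from command; caller keeps the line */
	arg = line + 5;

	/* 1. A branch name known to the importer wins. */
	for (i = 0; i < nr; i++)
		if (!strcmp(branches[i], arg))
			return FROM_BRANCH;
	/* 2. Otherwise a leading ':' marks a mark reference. */
	if (*arg == ':')
		return FROM_MARK;
	/* 3. Everything else is handed to the SHA-1 expression parser. */
	return FROM_EXPR;
}
```

In the real function the third case can still fail: if `get_sha1()` rejects the expression, fast-import dies with "Invalid ref name or SHA1 expression".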

git-fast-import: rename cmd_() functions to parse_() There is a cmd_merge() function in fast-import that will conflict with builtin-merge's cmd_merge() function. To keep it consistent, rename all cmd_() function to parse_() Signed-off-by: Miklos Vajna <vmiklos@frugalware.org> Acked-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-05-16 00:35:56 +02:00			`static struct hash_list *parse_merge(unsigned int *count)`
Support creation of merge commits in fast-import. Some importers are able to determine when branch merges occurred within their source data. In these cases they will want to supply the correct commits to fast-import so that a proper merge commit will exist in Git. This is now supported by supplying a 'merge ' command after the commit message and optional from command. A merge is not actually performed by fast-import, its assumed that the frontend performed any sort of merging activity already and that fast-import should simply be storing its result. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-12 04:21:38 +01:00			`{`
Correct compiler warnings in fast-import. Junio noticed these warnings/errors in fast-import when compiling with `-Werror -ansi -pedantic`. A few changes are to reduce compiler warnings, while one (in cmd_merge) is a bug fix. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-06 06:26:49 +01:00			`struct hash_list *list = NULL, *n, *e = e;`
Don't support shell-quoted refnames in fast-import. The current implementation of shell-style quoted refnames and SHA-1 expressions within fast-import contains a bad memory leak. We leak the unquoted strings used by the `from` and `merge` commands, maybe others. Its also just muddling up the docs. Since Git refnames cannot contain LF, and that is our delimiter for the end of the refname, and we accept any other character as-is, there is no reason for these strings to support quoting, except to be nice to frontends. But frontends shouldn't be expecting to use funny refs in Git, and its just as simple to never quote them as it is to always pass them through the same quoting filter as pathnames. So frontends should never quote refs, or ref expressions. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-06 02:30:37 +01:00			`const char *from;`
Support creation of merge commits in fast-import. Some importers are able to determine when branch merges occurred within their source data. In these cases they will want to supply the correct commits to fast-import so that a proper merge commit will exist in Git. This is now supported by supplying a 'merge ' command after the commit message and optional from command. A merge is not actually performed by fast-import, its assumed that the frontend performed any sort of merging activity already and that fast-import should simply be storing its result. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-12 04:21:38 +01:00			`struct branch *s;`

			`*count = 0;`
prefixcmp(): fix-up mechanical conversion. Previous step converted use of strncmp() with literal string mechanically even when the result is only used as a boolean: if (!strncmp("foo", arg, 3)) ==> if (!(-prefixcmp(arg, "foo"))) This step manually cleans them up to read: if (!prefixcmp(arg, "foo")) Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-20 10:54:00 +01:00			`while (!prefixcmp(command_buf.buf, "merge ")) {`
Support creation of merge commits in fast-import. Some importers are able to determine when branch merges occurred within their source data. In these cases they will want to supply the correct commits to fast-import so that a proper merge commit will exist in Git. This is now supported by supplying a 'merge ' command after the commit message and optional from command. A merge is not actually performed by fast-import, its assumed that the frontend performed any sort of merging activity already and that fast-import should simply be storing its result. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-12 04:21:38 +01:00			`from = strchr(command_buf.buf, ' ') + 1;`
			`n = xmalloc(sizeof(*n));`
			`s = lookup_branch(from);`
			`if (s)`
			`hashcpy(n->sha1, s->sha1);`
			`else if (*from == ':') {`
fast-import: tighten parsing of datarefs The syntax for the use of mark references in fast-import demands either a SP (space) or LF (end-of-line) after a mark reference. Fast-import does not complain when garbage appears after a mark reference in some cases. Factor out parsing of mark references and complain if errant characters are found. Also be a little more careful when parsing "inline" and SHA1s, complaining if extra characters appear or if the form of the dataref is unrecognized. Buggy input can cause fast-import to produce the wrong output, silently, without error. This makes it difficult to track down buggy generators of fast-import streams. An example is seen in the last line of this commit command: commit refs/heads/S2 committer Name <name@example.com> 1112912893 -0400 data <<COMMIT commit message COMMIT from :1M 100644 :103 hello.c It is missing a newline and should be: [...] from :1 M 100644 :103 hello.c What fast-import does is to produce a commit with the same contents for hello.c as in refs/heads/S2^. What the buggy program was expecting was the contents of blob :103. While the resulting commit graph looked correct, the contents in some commits were wrong. Signed-off-by: Pete Wyckoff <pw@padd.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-04-08 00:59:20 +02:00			`uintmax_t idnum = parse_mark_ref_eol(from);`
Support creation of merge commits in fast-import. Some importers are able to determine when branch merges occurred within their source data. In these cases they will want to supply the correct commits to fast-import so that a proper merge commit will exist in Git. This is now supported by supplying a 'merge ' command after the commit message and optional from command. A merge is not actually performed by fast-import, its assumed that the frontend performed any sort of merging activity already and that fast-import should simply be storing its result. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-12 04:21:38 +01:00			`struct object_entry *oe = find_mark(idnum);`
			`if (oe->type != OBJ_COMMIT)`
Check for PRIuMAX rather than NO_C99_FORMAT in fast-import.c. Thanks to Simon 'corecode' Schubert <corecode@fs.ei.tum.de> for the clean-up. Defining the C99 standard PRIuMAX when necessary replaces UM_FMT and the awkward UM10_FMT. There are no direct C99 translations for other uses of NO_C99_FORMAT in git, alas. Signed-off-by: Jason Riedy <ejr@cs.berkeley.edu> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-21 02:34:56 +01:00			`die("Mark :%" PRIuMAX " not a commit", idnum);`
fast-import: start using struct pack_idx_entry This is in preparation for using write_idx_file(). Signed-off-by: Nicolas Pitre <nico@fluxnic.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-02-17 20:05:51 +01:00			`hashcpy(n->sha1, oe->idx.sha1);`
fast-import: Fail if a non-existant commit is used for merge Johannes Sixt noticed during one of his own imports that fast-import did not fail if a non-existant commit is referenced by SHA-1 value as an argument to the 'merge' command. This allowed the user to unknowingly create commits that would fail in fsck, as the commit contents would not be completely reachable. A side effect of this bug was that a frontend process could mark any SHA-1 object (blob, tree, tag) as a parent of a merge commit. This should also fail in fsck, as the commit is not a valid commit. We now use the same rule as the 'from' command. If a commit is referenced in the 'merge' command by hex formatted SHA-1 then the SHA-1 must be a commit or a tag that can be peeled back to a commit, the commit must already exist, and must be readable by the core Git infrastructure code. This requirement means that the commit must have existed prior to fast-import starting, or the commit must have been flushed out by a prior 'checkpoint' command. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-03-05 18:43:14 +01:00			`} else if (!get_sha1(from, n->sha1)) {`
			`unsigned long size;`
			`char *buf = read_object_with_reference(n->sha1,`
Merge branch 'maint' * maint: fast-import: Fail if a non-existant commit is used for merge fast-import: Avoid infinite loop after reset [sp: Minor evil merge to deal with type_names array moving to be private in 'master'.] 2007-03-05 18:49:02 +01:00			`commit_type, &size, n->sha1);`
fast-import: Fail if a non-existant commit is used for merge Johannes Sixt noticed during one of his own imports that fast-import did not fail if a non-existant commit is referenced by SHA-1 value as an argument to the 'merge' command. This allowed the user to unknowingly create commits that would fail in fsck, as the commit contents would not be completely reachable. A side effect of this bug was that a frontend process could mark any SHA-1 object (blob, tree, tag) as a parent of a merge commit. This should also fail in fsck, as the commit is not a valid commit. We now use the same rule as the 'from' command. If a commit is referenced in the 'merge' command by hex formatted SHA-1 then the SHA-1 must be a commit or a tag that can be peeled back to a commit, the commit must already exist, and must be readable by the core Git infrastructure code. This requirement means that the commit must have existed prior to fast-import starting, or the commit must have been flushed out by a prior 'checkpoint' command. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-03-05 18:43:14 +01:00			`if (!buf || size < 46)`
			`die("Not a valid commit: %s", from);`
			`free(buf);`
			`} else`
Support creation of merge commits in fast-import. Some importers are able to determine when branch merges occurred within their source data. In these cases they will want to supply the correct commits to fast-import so that a proper merge commit will exist in Git. This is now supported by supplying a 'merge ' command after the commit message and optional from command. A merge is not actually performed by fast-import, its assumed that the frontend performed any sort of merging activity already and that fast-import should simply be storing its result. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-12 04:21:38 +01:00			`die("Invalid ref name or SHA1 expression: %s", from);`

			`n->next = NULL;`
			`if (list)`
			`e->next = n;`
			`else`
			`list = n;`
			`e = n;`
Correct compiler warnings in fast-import. Junio noticed these warnings/errors in fast-import when compiling with `-Werror -ansi -pedantic`. A few changes are to reduce compiler warnings, while one (in cmd_merge) is a bug fix. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-06 06:26:49 +01:00			`(*count)++;`
Support creation of merge commits in fast-import. Some importers are able to determine when branch merges occurred within their source data. In these cases they will want to supply the correct commits to fast-import so that a proper merge commit will exist in Git. This is now supported by supplying a 'merge ' command after the commit message and optional from command. A merge is not actually performed by fast-import, its assumed that the frontend performed any sort of merging activity already and that fast-import should simply be storing its result. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-12 04:21:38 +01:00			`read_next_command();`
			`}`
			`return list;`
			`}`
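
The `size < 46` test in the merge parser above presumably reflects the smallest possible commit body: a `tree ` header (5 bytes), a 40-character hex SHA-1, and a trailing LF. The following is a minimal sketch of that arithmetic, not code from fast-import. The names `MIN_COMMIT_SIZE` and `has_tree_header` are hypothetical, and the check is deliberately stricter than the original, which only compares the size after `read_object_with_reference()` has already verified the object type.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Assumed minimum: strlen("tree ") + 40 hex digits + '\n' = 46 bytes. */
#define MIN_COMMIT_SIZE 46

/*
 * Hypothetical helper: returns 1 if buf starts with a well-formed
 * "tree <40 hex digits>\n" header, 0 otherwise.
 */
static int has_tree_header(const char *buf, size_t size)
{
	size_t i;

	if (!buf || size < MIN_COMMIT_SIZE)
		return 0;
	if (memcmp(buf, "tree ", 5) != 0)
		return 0;
	for (i = 5; i < 45; i++)
		if (!isxdigit((unsigned char)buf[i]))
			return 0;
	return buf[45] == '\n';
}
```

A buffer shorter than 46 bytes cannot hold even the tree header, so rejecting it early avoids treating a truncated or non-commit object as a merge parent.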

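The tail of the merge parser appends each parsed parent to a singly linked list while keeping a pointer to the last node, so every append is O(1) and the parents stay in input order. A self-contained sketch of that pattern follows. The types `parent_node` and `parent_list` and both functions are hypothetical stand-ins for illustration, not fast-import's own `hash_list` handling.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a list node holding one 20-byte SHA-1. */
struct parent_node {
	struct parent_node *next;
	unsigned char sha1[20];
};

/* Head, tail, and running count, mirroring list / e / *count above. */
struct parent_list {
	struct parent_node *head;
	struct parent_node *tail;
	unsigned int count;
};

/* Append one id at the tail; returns 0 on success, -1 if malloc fails. */
static int parent_list_append(struct parent_list *l, const unsigned char *sha1)
{
	struct parent_node *n = malloc(sizeof(*n));

	if (!n)
		return -1;
	memcpy(n->sha1, sha1, sizeof(n->sha1));
	n->next = NULL;
	if (l->head)
		l->tail->next = n;	/* non-empty list: link after the old tail */
	else
		l->head = n;		/* empty list: the node becomes the head */
	l->tail = n;
	l->count++;
	return 0;
}

/* Free every node and reset the list to empty. */
static void parent_list_free(struct parent_list *l)
{
	struct parent_node *n = l->head;

	while (n) {
		struct parent_node *next = n->next;
		free(n);
		n = next;
	}
	l->head = l->tail = NULL;
	l->count = 0;
}
```

Keeping the tail pointer avoids walking the list on each append, which matters little for typical merges but keeps the cost linear for commits with many parents.
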
git-fast-import: rename cmd_() functions to parse_() There is a cmd_merge() function in fast-import that will conflict with builtin-merge's cmd_merge() function. To keep it consistent, rename all cmd_() function to parse_() Signed-off-by: Miklos Vajna <vmiklos@frugalware.org> Acked-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-05-16 00:35:56 +02:00			`static void parse_new_commit(void)`
Implemented branch handling and basic tree support in fast-import. This provides the basic data structures needed to store trees in memory while we are processing them for a branch. What we are attempting to do is track one complete tree for each branch that the frontend has registered with us through the 'newb' (new_branch) command. When the frontend edits that tree through 'updf' or 'delf' commands we'll mark the affected tree(s) as being dirty and recompute their objects during 'comt' (commit). Currently the protocol is decidedly _not_ user friendly. I crashed fast-import by giving it bad input data from Perl. I may try to improve upon it, or at least upon its error handling. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 09:36:45 +02:00			`{`
fast-import was using dbuf's, replace them with strbuf's. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-17 13:48:17 +02:00			`static struct strbuf msg = STRBUF_INIT;`
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00			`struct branch *b;`
			`char *sp;`
			`char *author = NULL;`
			`char *committer = NULL;`
Support creation of merge commits in fast-import. Some importers are able to determine when branch merges occurred within their source data. In these cases they will want to supply the correct commits to fast-import so that a proper merge commit will exist in Git. This is now supported by supplying a 'merge ' command after the commit message and optional from command. A merge is not actually performed by fast-import, its assumed that the frontend performed any sort of merging activity already and that fast-import should simply be storing its result. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-12 04:21:38 +01:00			`struct hash_list *merge_list = NULL;`
			`unsigned int merge_count;`
fast-import: Proper notes tree manipulation This patch teaches 'git fast-import' to automatically organize note objects in a fast-import stream into an appropriate fanout structure. The notes API in notes.h is NOT used to accomplish this, because trying to keep the fast-import and notes data structures in sync would yield a significantly larger patch with higher complexity. Note objects are added with the 'N' command, and accounted for with a per-branch counter, which is used to trigger fanout restructuring when needed. Note that when restructuring the branch tree, _any_ entry whose path consists of 40 hex chars (not including directory separators) will be recognized as a note object. It is therefore not advisable to manipulate note entries with M/D/R/C commands. Since note objects are stored in the same tree structure as other objects, the unloading and reloading of a fast-import branches handle note objects transparently. This patch has been improved by the following contributions: - Shawn O. Pearce: Several style- and logic-related improvements Cc: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Johan Herland <johan@herland.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-12-07 12:27:24 +01:00			`unsigned char prev_fanout, new_fanout;`
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00
			`/* Obtain the branch name from the rest of our command */`
			`sp = strchr(command_buf.buf, ' ') + 1;`
			`b = lookup_branch(sp);`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`if (!b)`
Remove branch creation command from fast-import. Jon Smirl was finding it difficult to alter cvs2svn to generate branch commands prior to the first commit of the same branch. This change moves the 'from' command to be an optional parameter of the 'commit' command, thereby allowing a new branch to be defined at the moment it gets used to create the first commit on that branch. This change makes it impossible to create a branch with no commits on it as at least one commit is needed to register the branch. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-25 00:45:26 +02:00			`b = new_branch(sp);`
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00
			`read_next_command();`
git-fast-import: rename cmd_() functions to parse_() There is a cmd_merge() function in fast-import that will conflict with builtin-merge's cmd_merge() function. To keep it consistent, rename all cmd_() function to parse_() Signed-off-by: Miklos Vajna <vmiklos@frugalware.org> Acked-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-05-16 00:35:56 +02:00			`parse_mark();`
prefixcmp(): fix-up mechanical conversion. Previous step converted use of strncmp() with literal string mechanically even when the result is only used as a boolean: if (!strncmp("foo", arg, 3)) ==> if (!(-prefixcmp(arg, "foo"))) This step manually cleans them up to read: if (!prefixcmp(arg, "foo")) Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-20 10:54:00 +01:00			`if (!prefixcmp(command_buf.buf, "author ")) {`
Support RFC 2822 date parsing in fast-import. Since some frontends may be working with source material where the dates are only readily available as RFC 2822 strings, it is more friendly if fast-import exposes Git's parse_date() function to handle the conversion. This way the frontend doesn't need to perform the parsing itself. The new --date-format option to fast-import can be used by a frontend to select which format it will supply date strings in. The default is the standard `raw` Git format, which fast-import has always supported. Format rfc2822 can be used to activate the parse_date() function instead. Because fast-import could also be useful for creating new, current commits, the format `now` is also supported to generate the current system timestamp. The implementation of `now` is a trivial call to datestamp(), but is actually a whole whopping 3 lines so that fast-import can verify the frontend really meant `now`. As part of this change I have added validation of the `raw` date format. Prior to this change fast-import would accept anything in a `committer` command, even if it was seriously malformed. Now fast-import requires the '> ' near the end of the string and verifies the timestamp is formatted properly. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-06 20:58:30 +01:00			`author = parse_ident(command_buf.buf + 7);`
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00			`read_next_command();`
			`}`
prefixcmp(): fix-up mechanical conversion. Previous step converted use of strncmp() with literal string mechanically even when the result is only used as a boolean: if (!strncmp("foo", arg, 3)) ==> if (!(-prefixcmp(arg, "foo"))) This step manually cleans them up to read: if (!prefixcmp(arg, "foo")) Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-20 10:54:00 +01:00			`if (!prefixcmp(command_buf.buf, "committer ")) {`
Support RFC 2822 date parsing in fast-import. Since some frontends may be working with source material where the dates are only readily available as RFC 2822 strings, it is more friendly if fast-import exposes Git's parse_date() function to handle the conversion. This way the frontend doesn't need to perform the parsing itself. The new --date-format option to fast-import can be used by a frontend to select which format it will supply date strings in. The default is the standard `raw` Git format, which fast-import has always supported. Format rfc2822 can be used to activate the parse_date() function instead. Because fast-import could also be useful for creating new, current commits, the format `now` is also supported to generate the current system timestamp. The implementation of `now` is a trivial call to datestamp(), but is actually a whole whopping 3 lines so that fast-import can verify the frontend really meant `now`. As part of this change I have added validation of the `raw` date format. Prior to this change fast-import would accept anything in a `committer` command, even if it was seriously malformed. Now fast-import requires the '> ' near the end of the string and verifies the timestamp is formatted properly. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-06 20:58:30 +01:00			`committer = parse_ident(command_buf.buf + 10);`
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00			`read_next_command();`
			`}`
			`if (!committer)`
			`die("Expected committer but didn't get one");`
fast-import: Stream very large blobs directly to pack If a blob is larger than the configured big-file-threshold, instead of reading it into a single buffer obtained from malloc, stream it onto the end of the current pack file. Streaming the larger objects into the pack avoids the 4+ GiB memory footprint that occurs when fast-import is processing 2+ GiB blobs. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-02-01 18:27:35 +01:00			`parse_data(&msg, 0, NULL);`
Moved from command to after data to help cvs2svn. cvs2svn has three phases: begin_commit, middle_commit, end_commit. The ancester is computed in the middle_commit phase. So its easier to generate a stream if the from command appears after the commit message itself but before the file change commands. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-25 04:38:13 +02:00			`read_next_command();`
git-fast-import: rename cmd_() functions to parse_() There is a cmd_merge() function in fast-import that will conflict with builtin-merge's cmd_merge() function. To keep it consistent, rename all cmd_() function to parse_() Signed-off-by: Miklos Vajna <vmiklos@frugalware.org> Acked-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-05-16 00:35:56 +02:00			`parse_from(b);`
			`merge_list = parse_merge(&merge_count);`
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00
			`/* ensure the branch is active/loaded */`
Implemented tree reloading in fast-import. Tree reloading allows fast-import to swap out the least-recently used branch by simply deallocating the data structures from memory that were associated with that branch. Later if the branch becomes active again it can lazily recreate those structures on demand by reloading the necessary trees from the pack file it originally wrote them to. The reloading process is implemented by mmap'ing the pack into memory and using a much tighter variant of the pack reading code contained in sha1_file.c. This was a blatent copy from sha1_file.c but the unpacking functions were significantly simplified and are actually now in a form that should make it easier to map only the necessary regions of a pack rather than the entire file. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-24 10:37:35 +02:00			`if (!b->branch_tree.tree || !max_active_branches) {`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`unload_one_branch();`
			`load_branch(b);`
			`}`
Implemented branch handling and basic tree support in fast-import. This provides the basic data structures needed to store trees in memory while we are processing them for a branch. What we are attempting to do is track one complete tree for each branch that the frontend has registered with us through the 'newb' (new_branch) command. When the frontend edits that tree through 'updf' or 'delf' commands we'll mark the affected tree(s) as being dirty and recompute their objects during 'comt' (commit). Currently the protocol is decidedly _not_ user friendly. I crashed fast-import by giving it bad input data from Perl. I may try to improve upon it, or at least upon its error handling. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 09:36:45 +02:00
fast-import: Proper notes tree manipulation This patch teaches 'git fast-import' to automatically organize note objects in a fast-import stream into an appropriate fanout structure. The notes API in notes.h is NOT used to accomplish this, because trying to keep the fast-import and notes data structures in sync would yield a significantly larger patch with higher complexity. Note objects are added with the 'N' command, and accounted for with a per-branch counter, which is used to trigger fanout restructuring when needed. Note that when restructuring the branch tree, _any_ entry whose path consists of 40 hex chars (not including directory separators) will be recognized as a note object. It is therefore not advisable to manipulate note entries with M/D/R/C commands. Since note objects are stored in the same tree structure as other objects, the unloading and reloading of a fast-import branches handle note objects transparently. This patch has been improved by the following contributions: - Shawn O. Pearce: Several style- and logic-related improvements Cc: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Johan Herland <johan@herland.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-12-07 12:27:24 +01:00			`prev_fanout = convert_num_notes_to_fanout(b->num_notes);`

Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`/* file_change* */`
Drop strbuf's 'eof' marker, and make read_line a first class citizen. read_line is now strbuf_getline, and is a first class citizen, it returns 0 when reading a line worked, EOF else. The ->eof marker was used non-locally by fast-import.c, mimic the same behaviour using a static int in "read_next_command", that now returns -1 on EOF, and avoids to call strbuf_getline when it's in EOF state. Also no longer automagically strbuf_release the buffer, it's counter intuitive and breaks fast-import in a very subtle way. Note: being at EOF implies that command_buf.len == 0. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-17 11:19:04 +02:00			`while (command_buf.len > 0) {`
Make trailing LF optional for all fast-import commands For the same reasons as the prior change we want to allow frontends to omit the trailing LF that usually delimits commands. In some cases these just make the input stream more verbose looking than it needs to be, and its just simpler for the frontend developer to get started if our parser is slightly more lenient about where an LF is required and where it isn't. To make this optional LF feature work we now have to buffer up to one line of input in command_buf. This buffering can happen if we look at the current input command but don't recognize it at this point in the code. In such a case we need to "unget" the entire line, but we cannot depend upon the stdio library to let us do ungetc() for that many characters at once. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-08-01 08:22:53 +02:00			`if (!prefixcmp(command_buf.buf, "M "))`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`file_change_m(b);`
prefixcmp(): fix-up mechanical conversion. Previous step converted use of strncmp() with literal string mechanically even when the result is only used as a boolean: if (!strncmp("foo", arg, 3)) ==> if (!(-prefixcmp(arg, "foo"))) This step manually cleans them up to read: if (!prefixcmp(arg, "foo")) Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-20 10:54:00 +01:00			`else if (!prefixcmp(command_buf.buf, "D "))`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`file_change_d(b);`
Support wholesale directory renames in fast-import Some source material (e.g. Subversion dump files) perform directory renames without telling us exactly which files in that subdirectory were moved. This makes it hard for a frontend to convert such data formats to a fast-import stream, as all the frontend has on hand is "Rename a/ to b/" with no details about what files are in a/, unless the frontend also kept track of all files. The new 'R' subcommand within a commit allows the frontend to rename either a file or an entire subdirectory, without needing to know the object's SHA-1 or the specific files contained within it. The rename is performed as efficiently as possible internally, making it cheaper than a 'D'/'M' pair for a file rename. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-07-10 04:58:23 +02:00			`else if (!prefixcmp(command_buf.buf, "R "))`
Teach fast-import to recursively copy files/directories Some source material (e.g. Subversion dump files) perform directory renames by telling us the directory was copied, then deleted in the same revision. This makes it difficult for a frontend to convert such data formats to a fast-import stream, as all the frontend has on hand is "Copy a/ to b/; Delete a/" with no details about what files are in a/, unless the frontend also kept track of all files. The new 'C' subcommand within a commit allows the frontend to make a recursive copy of one path to another path within the branch, without needing to keep track of the individual file paths. The metadata copy is performed in memory efficiently, but is implemented as a copy-immediately operation, rather than copy-on-write. With this new 'C' subcommand frontends could obviously implement an 'R' (rename) on their own as a combination of 'C' and 'D' (delete), but since we have already offered up 'R' in the past and it is a trivial thing to keep implemented I'm not going to deprecate it. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-07-15 07:40:37 +02:00			`file_change_cr(b, 1);`
			`else if (!prefixcmp(command_buf.buf, "C "))`
			`file_change_cr(b, 0);`
fast-import: Add support for importing commit notes Introduce a 'notemodify' subcommand of the 'commit' command. This subcommand is similar to 'filemodify', except that no mode is supplied (all notes have mode 0644), and the path is set to the hex SHA1 of the given "comittish". This enables fast import of note objects along with their associated commits, since the notes can now be named using the mark references of their corresponding commits. The patch also includes a test case of the added functionality. Signed-off-by: Johan Herland <johan@herland.net> Acked-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-09 12:22:02 +02:00			`else if (!prefixcmp(command_buf.buf, "N "))`
fast-import: Fix incorrect fanout level when modifying existing notes refs This fixes the bug uncovered by the tests added in the previous two patches. When an existing notes ref was loaded into the fast-import machinery, the num_notes counter associated with that ref remained == 0, even though the true number of notes in the loaded ref was higher. This caused a fanout level of 0 to be used, although the actual fanout of the tree could be > 0. Manipulating the notes tree at an incorrect fanout level causes removals to silently fail, and modifications of existing notes to instead produce an additional note (leaving the old object in place at a different fanout level). This patch fixes the bug by explicitly counting the number of notes in the notes tree whenever it looks like the num_notes counter could be wrong (when num_notes == 0). There may be false positives (i.e. triggering the counting when the notes tree is truly empty), but in those cases, the counting should not take long. Signed-off-by: Johan Herland <johan@herland.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-11-25 01:09:47 +01:00			`note_change_n(b, &prev_fanout);`
Teach fast-import how to clear the internal branch content. Some frontends may not be able to (easily) keep track of which files are included in the branch, and which aren't. Performing this tracking can be tedious and error prone for the frontend to do, especially if its foreign data source cannot supply the changed path list on a per-commit basis. fast-import now allows a frontend to request that a branch's tree be wiped clean (reset to the empty tree) at the start of a commit, allowing the frontend to feed in all paths which belong on the branch. This is ideal for a tar-file importer frontend, for example, as the frontend just needs to reformat the tar data stream into a gfi data stream, which may be something a few Perl regexps can take care of. :) Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-07 08:03:03 +01:00			`else if (!strcmp("deleteall", command_buf.buf))`
			`file_change_deleteall(b);`
fast-import: add 'ls' command Lazy fast-import frontend authors that want to rely on the backend to keep track of the content of the imported trees _almost_ have what they need in the 'cat-blob' command (v1.7.4-rc0~30^2~3, 2010-11-28). But it is not quite enough, since (1) cat-blob can be used to retrieve the content of files, but not their mode, and (2) using cat-blob requires the frontend to keep track of a name (mark number or object id) for each blob to be retrieved Introduce an 'ls' command to complement cat-blob and take care of the remaining needs. The 'ls' command finds what is at a given path within a given tree-ish (tag, commit, or tree): 'ls' SP <dataref> SP <path> LF or in fast-import's active commit: 'ls' SP <path> LF The response is a single line sent through the cat-blob channel, imitating ls-tree output. So for example: FE> ls :1 Documentation gfi> 040000 tree 9e6c2b599341d28a2a375f8207507e0a2a627fe9 Documentation FE> ls 9e6c2b599341d28a2a375f8207507e0a2a627fe9 git-fast-import.txt gfi> 100644 blob 4f92954396e3f0f97e75b6838a5635b583708870 git-fast-import.txt FE> ls :1 RelNotes gfi> 120000 blob b942e499449d97aeb50c73ca2bdc1c6e6d528743 RelNotes FE> cat-blob b942e499449d97aeb50c73ca2bdc1c6e6d528743 gfi> b942e499449d97aeb50c73ca2bdc1c6e6d528743 blob 32 gfi> Documentation/RelNotes/1.7.4.txt The most interesting parts of the reply are the first word, which is a 6-digit octal mode (regular file, executable, symlink, directory, or submodule), and the part from the second space to the tab, which is a <dataref> that can be used in later cat-blob, ls, and filemodify (M) commands to refer to the content (blob, tree, or commit) at that path. If there is nothing there, the response is "missing some/path". The intent is for this command to be used to read files from the active commit, so a frontend can apply patches to them, and to copy files and directories from previous revisions. For example, proposed updates to svn-fe use this command in place of its internal representation of the repository directory structure. This simplifies the frontend a great deal and means support for resuming an import in a separate fast-import run (i.e., incremental import) is basically free. Signed-off-by: David Barr <david.barr@cordelta.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Improved-by: Junio C Hamano <gitster@pobox.com> Improved-by: Sverre Rabbelier <srabbelier@gmail.com> 2010-12-02 11:40:20 +01:00			`else if (!prefixcmp(command_buf.buf, "ls "))`
			`parse_ls(b);`
Make trailing LF optional for all fast-import commands For the same reasons as the prior change we want to allow frontends to omit the trailing LF that usually delimits commands. In some cases these just make the input stream more verbose looking than it needs to be, and its just simpler for the frontend developer to get started if our parser is slightly more lenient about where an LF is required and where it isn't. To make this optional LF feature work we now have to buffer up to one line of input in command_buf. This buffering can happen if we look at the current input command but don't recognize it at this point in the code. In such a case we need to "unget" the entire line, but we cannot depend upon the stdio library to let us do ungetc() for that many characters at once. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-08-01 08:22:53 +02:00			`else {`
			`unread_command_buf = 1;`
			`break;`
			`}`
Drop strbuf's 'eof' marker, and make read_line a first class citizen. read_line is now strbuf_getline, and is a first class citizen, it returns 0 when reading a line worked, EOF else. The ->eof marker was used non-locally by fast-import.c, mimic the same behaviour using a static int in "read_next_command", that now returns -1 on EOF, and avoids to call strbuf_getline when it's in EOF state. Also no longer automagically strbuf_release the buffer, it's counter intuitive and breaks fast-import in a very subtle way. Note: being at EOF implies that command_buf.len == 0. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-17 11:19:04 +02:00			`if (read_next_command() == EOF)`
			`break;`
Implemented branch handling and basic tree support in fast-import. This provides the basic data structures needed to store trees in memory while we are processing them for a branch. What we are attempting to do is track one complete tree for each branch that the frontend has registered with us through the 'newb' (new_branch) command. When the frontend edits that tree through 'updf' or 'delf' commands we'll mark the affected tree(s) as being dirty and recompute their objects during 'comt' (commit). Currently the protocol is decidedly _not_ user friendly. I crashed fast-import by giving it bad input data from Perl. I may try to improve upon it, or at least upon its error handling. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 09:36:45 +02:00			`}`

fast-import: Proper notes tree manipulation This patch teaches 'git fast-import' to automatically organize note objects in a fast-import stream into an appropriate fanout structure. The notes API in notes.h is NOT used to accomplish this, because trying to keep the fast-import and notes data structures in sync would yield a significantly larger patch with higher complexity. Note objects are added with the 'N' command, and accounted for with a per-branch counter, which is used to trigger fanout restructuring when needed. Note that when restructuring the branch tree, _any_ entry whose path consists of 40 hex chars (not including directory separators) will be recognized as a note object. It is therefore not advisable to manipulate note entries with M/D/R/C commands. Since note objects are stored in the same tree structure as other objects, the unloading and reloading of a fast-import branches handle note objects transparently. This patch has been improved by the following contributions: - Shawn O. Pearce: Several style- and logic-related improvements Cc: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Johan Herland <johan@herland.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-12-07 12:27:24 +01:00			`new_fanout = convert_num_notes_to_fanout(b->num_notes);`
			`if (new_fanout != prev_fanout)`
			`b->num_notes = change_note_fanout(&b->branch_tree, new_fanout);`

Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00			`/* build the tree and the commit */`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`store_tree(&b->branch_tree);`
Additional fast-import tree delta corruption cleanups. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-29 04:06:13 +02:00			`hashcpy(b->branch_tree.versions[0].sha1,`
			`b->branch_tree.versions[1].sha1);`
fast-import was using dbuf's, replace them with strbuf's. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-17 13:48:17 +02:00
			`strbuf_reset(&new_data);`
			`strbuf_addf(&new_data, "tree %s\n",`
Implemented tree delta compression in fast-import. We now store for every tree entry two modes and two sha1 values; the base (aka "version 0") and the current/new (aka "version 1"). When we generate a tree object we also regenerate the prior version object and use that as our base object for a delta. This strategy saves a significant amount of memory as we can continue to use the atom pool for file/directory names and only increases each tree entry by an additional 24 bytes of memory. Branches should automatically delta against their ancestor tree, unless the ancestor tree is already at the delta chain limit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-28 18:22:50 +02:00			`sha1_to_hex(b->branch_tree.versions[1].sha1));`
Converted hash memcpy/memcmp to new hashcpy/hashcmp/hashclr. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-28 16:46:58 +02:00			`if (!is_null_sha1(b->sha1))`
fast-import was using dbuf's, replace them with strbuf's. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-17 13:48:17 +02:00			`strbuf_addf(&new_data, "parent %s\n", sha1_to_hex(b->sha1));`
Support creation of merge commits in fast-import. Some importers are able to determine when branch merges occurred within their source data. In these cases they will want to supply the correct commits to fast-import so that a proper merge commit will exist in Git. This is now supported by supplying a 'merge ' command after the commit message and optional from command. A merge is not actually performed by fast-import, its assumed that the frontend performed any sort of merging activity already and that fast-import should simply be storing its result. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-12 04:21:38 +01:00			`while (merge_list) {`
			`struct hash_list *next = merge_list->next;`
fast-import was using dbuf's, replace them with strbuf's. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-17 13:48:17 +02:00			`strbuf_addf(&new_data, "parent %s\n", sha1_to_hex(merge_list->sha1));`
Support creation of merge commits in fast-import. Some importers are able to determine when branch merges occurred within their source data. In these cases they will want to supply the correct commits to fast-import so that a proper merge commit will exist in Git. This is now supported by supplying a 'merge ' command after the commit message and optional from command. A merge is not actually performed by fast-import, its assumed that the frontend performed any sort of merging activity already and that fast-import should simply be storing its result. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-12 04:21:38 +01:00			`free(merge_list);`
			`merge_list = next;`
			`}`
fast-import was using dbuf's, replace them with strbuf's. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-17 13:48:17 +02:00			`strbuf_addf(&new_data,`
			`"author %s\n"`
			`"committer %s\n"`
			`"\n",`
			`author ? author : committer, committer);`
			`strbuf_addbuf(&new_data, &msg);`
Remove unnecessary null pointer checks in fast-import. There is no need to check for a NULL pointer before invoking free(), the runtime library automatically performs this check anyway and does nothing if a NULL pointer is supplied. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-06 18:05:51 +01:00			`free(author);`
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00			`free(committer);`

fast-import was using dbuf's, replace them with strbuf's. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-17 13:48:17 +02:00			`if (!store_object(OBJ_COMMIT, &new_data, NULL, b->sha1, next_mark))`
Correct packfile edge output in fast-import. Branches are only contained by a packfile if the branch actually had its most recent commit in that packfile. So new branches are set to MAX_PACK_ID to ensure they don't cause their commit to list as part of the first packfile when it closes out if the commit was actually in existance before fast-import started. Also corrected the type of last_commit to be umaxint_t to prevent overflow and wraparound on very large imports. Though that is highly unlikely to occur as we're talking 4 billion commits, which no real project has right now. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-17 08:42:43 +01:00			`b->pack_id = pack_id;`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`b->last_commit = object_count_by_type[OBJ_COMMIT];`
Implemented branch handling and basic tree support in fast-import. This provides the basic data structures needed to store trees in memory while we are processing them for a branch. What we are attempting to do is track one complete tree for each branch that the frontend has registered with us through the 'newb' (new_branch) command. When the frontend edits that tree through 'updf' or 'delf' commands we'll mark the affected tree(s) as being dirty and recompute their objects during 'comt' (commit). Currently the protocol is decidedly _not_ user friendly. I crashed fast-import by giving it bad input data from Perl. I may try to improve upon it, or at least upon its error handling. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-08 09:36:45 +02:00			`}`
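The commit-building code above assembles the object payload in a fixed order: one `tree` line, zero or more `parent` lines (the branch's previous commit first, then each `merge` entry), `author`, `committer`, a blank line, and the message. The following is a minimal standalone sketch of that layout. It assumes a plain fixed-size buffer instead of `strbuf`, and `append_str`, `append_header` and `build_commit_payload` are hypothetical names for illustration, not functions in fast-import.c.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: append s to out, keeping room for the trailing NUL.
 * Returns 0 on success, -1 if s does not fit. */
static int append_str(char *out, size_t cap, size_t *len, const char *s)
{
	size_t n = strlen(s);

	if (n >= cap - *len)
		return -1;
	memcpy(out + *len, s, n);
	*len += n;
	out[*len] = '\0';
	return 0;
}

/* Hypothetical helper: append one "key value\n" header line. */
static int append_header(char *out, size_t cap, size_t *len,
			 const char *key, const char *value)
{
	if (append_str(out, cap, len, key) ||
	    append_str(out, cap, len, " ") ||
	    append_str(out, cap, len, value) ||
	    append_str(out, cap, len, "\n"))
		return -1;
	return 0;
}

/*
 * Build a commit payload in the same field order as the code above.
 * As in the original, a NULL author falls back to the committer.
 * Returns the payload length, or -1 if the buffer is too small.
 */
static int build_commit_payload(char *out, size_t cap,
				const char *tree_hex,
				const char *const *parents, size_t nr_parents,
				const char *author, const char *committer,
				const char *msg)
{
	size_t len = 0, i;

	assert(out && tree_hex && committer && msg);
	if (cap == 0)
		return -1;
	out[0] = '\0';

	if (append_header(out, cap, &len, "tree", tree_hex))
		return -1;
	for (i = 0; i < nr_parents; i++)
		if (append_header(out, cap, &len, "parent", parents[i]))
			return -1;
	if (append_header(out, cap, &len, "author",
			  author ? author : committer))
		return -1;
	if (append_header(out, cap, &len, "committer", committer))
		return -1;
	/* A blank line separates the headers from the message. */
	if (append_str(out, cap, &len, "\n") ||
	    append_str(out, cap, &len, msg))
		return -1;
	return (int)len;
}
```

A caller would pass the hex tree id, an array such as `const char *const parents[] = { prev_hex, merge_hex };`, and identity strings in the `Name <email> when` form. The real code then hands the finished buffer to `store_object(OBJ_COMMIT, ...)`, which this sketch leaves out.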

git-fast-import: rename cmd_() functions to parse_() There is a cmd_merge() function in fast-import that will conflict with builtin-merge's cmd_merge() function. To keep it consistent, rename all cmd_() function to parse_() Signed-off-by: Miklos Vajna <vmiklos@frugalware.org> Acked-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-05-16 00:35:56 +02:00			`static void parse_new_tag(void)`
Implemented 'tag' command in fast-import. Tags received from the frontend are generated in memory in a simple linked list in the order that the tag commands were sent by the frontend. If multiple different tag objects for the same tag name get generated the last one sent by the frontend will be the one that gets written out at termination. Multiple tag objects for the same name will cause all older tags of the same name to be lost. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-24 09:12:13 +02:00			`{`
fast-import was using dbuf's, replace them with strbuf's. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-17 13:48:17 +02:00			`static struct strbuf msg = STRBUF_INIT;`
Implemented 'tag' command in fast-import. Tags received from the frontend are generated in memory in a simple linked list in the order that the tag commands were sent by the frontend. If multiple different tag objects for the same tag name get generated the last one sent by the frontend will be the one that gets written out at termination. Multiple tag objects for the same name will cause all older tags of the same name to be lost. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-24 09:12:13 +02:00			`char *sp;`
			`const char *from;`
			`char *tagger;`
			`struct branch *s;`
			`struct tag *t;`
Use uintmax_t for marks in fast-import. If a frontend wants to use a mark per file revision and per commit and is doing a truly huge import (such as a 32 GiB SVN repository) we may need more than 2**32 unique mark values, especially if the frontend is unable (or unwilling) to recycle mark values. For mark idnums we should use the largest unsigned integer type available, hoping that will be at least 64 bits when we are compiled as a 64 bit executable. This way we may consume huge amounts of memory storing our mark table, but we'll at least be able to process the entire import without failing. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-16 06:33:19 +01:00			`uintmax_t from_mark = 0;`
Implemented 'tag' command in fast-import. Tags received from the frontend are generated in memory in a simple linked list in the order that the tag commands were sent by the frontend. If multiple different tag objects for the same tag name get generated the last one sent by the frontend will be the one that gets written out at termination. Multiple tag objects for the same name will cause all older tags of the same name to be lost. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-24 09:12:13 +02:00			`unsigned char sha1[20];`
fast-import: tag may point to any object type If you tried to export the official git repository, and then to import it back then git-fast-import would die complaining that "Mark :1 not a commit". Accordingly to a generated crash file, Mark 1 is not a commit but a blob, which is pointed by junio-gpg-pub tag. Because git-tag allows to create such tags, git-fast-import should import them. Signed-off-by: Dmitry Potapov <dpotapov@gmail.com> Acked-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-01-14 05:44:19 +01:00			`enum object_type type;`
Implemented 'tag' command in fast-import. Tags received from the frontend are generated in memory in a simple linked list in the order that the tag commands were sent by the frontend. If multiple different tag objects for the same tag name get generated the last one sent by the frontend will be the one that gets written out at termination. Multiple tag objects for the same name will cause all older tags of the same name to be lost. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-24 09:12:13 +02:00
			`/* Obtain the new tag name from the rest of our command */`
			`sp = strchr(command_buf.buf, ' ') + 1;`
			`t = pool_alloc(sizeof(struct tag));`
fast-import: zero all of 'struct tag' to silence valgrind When running t9300, valgrind (correctly) complains about an uninitialized value in write_crash_report: ==2971== Use of uninitialised value of size 8 ==2971== at 0x4164F4: sha1_to_hex (hex.c:70) ==2971== by 0x4073E4: die_nicely (fast-import.c:468) ==2971== by 0x43284C: die (usage.c:86) ==2971== by 0x40420D: main (fast-import.c:2731) ==2971== Uninitialised value was created by a heap allocation ==2971== at 0x4C29B3D: malloc (vg_replace_malloc.c:263) ==2971== by 0x433645: xmalloc (wrapper.c:35) ==2971== by 0x405DF5: pool_alloc (fast-import.c:619) ==2971== by 0x407755: pool_calloc.constprop.14 (fast-import.c:634) ==2971== by 0x403F33: main (fast-import.c:3324) Fix this by zeroing all of the 'struct tag'. We would only need to zero out the 'sha1' field, but this way seems more future-proof. Signed-off-by: Thomas Rast <trast@student.ethz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-03-05 14:48:49 +01:00			`memset(t, 0, sizeof(struct tag));`
Implemented 'tag' command in fast-import. Tags received from the frontend are generated in memory in a simple linked list in the order that the tag commands were sent by the frontend. If multiple different tag objects for the same tag name get generated the last one sent by the frontend will be the one that gets written out at termination. Multiple tag objects for the same name will cause all older tags of the same name to be lost. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-24 09:12:13 +02:00			`t->name = pool_strdup(sp);`
			`if (last_tag)`
			`last_tag->next_tag = t;`
			`else`
			`first_tag = t;`
			`last_tag = t;`
			`read_next_command();`

			`/* from ... */`
prefixcmp(): fix-up mechanical conversion. Previous step converted use of strncmp() with literal string mechanically even when the result is only used as a boolean: if (!strncmp("foo", arg, 3)) ==> if (!(-prefixcmp(arg, "foo"))) This step manually cleans them up to read: if (!prefixcmp(arg, "foo")) Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-20 10:54:00 +01:00			`if (prefixcmp(command_buf.buf, "from "))`
Implemented 'tag' command in fast-import. Tags received from the frontend are generated in memory in a simple linked list in the order that the tag commands were sent by the frontend. If multiple different tag objects for the same tag name get generated the last one sent by the frontend will be the one that gets written out at termination. Multiple tag objects for the same name will cause all older tags of the same name to be lost. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-24 09:12:13 +02:00			`die("Expected from command, got %s", command_buf.buf);`
			`from = strchr(command_buf.buf, ' ') + 1;`
			`s = lookup_branch(from);`
			`if (s) {`
fast-import: don't allow to tag empty branch 'reset' command makes fast-import start a branch from scratch. It's name is kept in lookup table but it's sha1 is null_sha1 (special value). 'tag' command can be used to tag a branch by it's name. lookup_branch() is used it that case and it doesn't check for null_sha1. So fast-import writes a tag for null_sha1 object instead of giving a error. Add a check to deny tagging an empty branch and add a corresponding test. Signed-off-by: Dmitry Ivankov <divanorama@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-09-22 21:47:04 +02:00			`if (is_null_sha1(s->sha1))`
			`die("Can't tag an empty branch.");`
Converted hash memcpy/memcmp to new hashcpy/hashcmp/hashclr. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-28 16:46:58 +02:00			`hashcpy(sha1, s->sha1);`
fast-import: tag may point to any object type If you tried to export the official git repository, and then to import it back then git-fast-import would die complaining that "Mark :1 not a commit". Accordingly to a generated crash file, Mark 1 is not a commit but a blob, which is pointed by junio-gpg-pub tag. Because git-tag allows to create such tags, git-fast-import should import them. Signed-off-by: Dmitry Potapov <dpotapov@gmail.com> Acked-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-01-14 05:44:19 +01:00			`type = OBJ_COMMIT;`
Implemented 'tag' command in fast-import. Tags received from the frontend are generated in memory in a simple linked list in the order that the tag commands were sent by the frontend. If multiple different tag objects for the same tag name get generated the last one sent by the frontend will be the one that gets written out at termination. Multiple tag objects for the same name will cause all older tags of the same name to be lost. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-24 09:12:13 +02:00			`} else if (*from == ':') {`
Correct compiler warnings in fast-import. Junio noticed these warnings/errors in fast-import when compiling with `-Werror -ansi -pedantic`. A few changes are to reduce compiler warnings, while one (in cmd_merge) is a bug fix. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-06 06:26:49 +01:00			`struct object_entry *oe;`
fast-import: tighten parsing of datarefs The syntax for the use of mark references in fast-import demands either a SP (space) or LF (end-of-line) after a mark reference. Fast-import does not complain when garbage appears after a mark reference in some cases. Factor out parsing of mark references and complain if errant characters are found. Also be a little more careful when parsing "inline" and SHA1s, complaining if extra characters appear or if the form of the dataref is unrecognized. Buggy input can cause fast-import to produce the wrong output, silently, without error. This makes it difficult to track down buggy generators of fast-import streams. An example is seen in the last line of this commit command: commit refs/heads/S2 committer Name <name@example.com> 1112912893 -0400 data <<COMMIT commit message COMMIT from :1M 100644 :103 hello.c It is missing a newline and should be: [...] from :1 M 100644 :103 hello.c What fast-import does is to produce a commit with the same contents for hello.c as in refs/heads/S2^. What the buggy program was expecting was the contents of blob :103. While the resulting commit graph looked correct, the contents in some commits were wrong. Signed-off-by: Pete Wyckoff <pw@padd.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-04-08 00:59:20 +02:00			`from_mark = parse_mark_ref_eol(from);`
Correct compiler warnings in fast-import. Junio noticed these warnings/errors in fast-import when compiling with `-Werror -ansi -pedantic`. A few changes are to reduce compiler warnings, while one (in cmd_merge) is a bug fix. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-06 06:26:49 +01:00			`oe = find_mark(from_mark);`
fast-import: tag may point to any object type If you tried to export the official git repository, and then to import it back then git-fast-import would die complaining that "Mark :1 not a commit". Accordingly to a generated crash file, Mark 1 is not a commit but a blob, which is pointed by junio-gpg-pub tag. Because git-tag allows to create such tags, git-fast-import should import them. Signed-off-by: Dmitry Potapov <dpotapov@gmail.com> Acked-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-01-14 05:44:19 +01:00			`type = oe->type;`
fast-import: start using struct pack_idx_entry This is in preparation for using write_idx_file(). Signed-off-by: Nicolas Pitre <nico@fluxnic.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-02-17 20:05:51 +01:00			`hashcpy(sha1, oe->idx.sha1);`
Implemented 'tag' command in fast-import. Tags received from the frontend are generated in memory in a simple linked list in the order that the tag commands were sent by the frontend. If multiple different tag objects for the same tag name get generated the last one sent by the frontend will be the one that gets written out at termination. Multiple tag objects for the same name will cause all older tags of the same name to be lost. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-24 09:12:13 +02:00			`} else if (!get_sha1(from, sha1)) {`
fast-import: allow to tag newly created objects fast-import allows to tag objects by sha1 and to query sha1 of objects being imported. So it should allow to tag these objects, make it do so. Signed-off-by: Dmitry Ivankov <divanorama@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-08-22 14:10:19 +02:00			`struct object_entry *oe = find_object(sha1);`
			`if (!oe) {`
			`type = sha1_object_info(sha1, NULL);`
			`if (type < 0)`
			`die("Not a valid object: %s", from);`
			`} else`
			`type = oe->type;`
Implemented 'tag' command in fast-import. Tags received from the frontend are generated in memory in a simple linked list in the order that the tag commands were sent by the frontend. If multiple different tag objects for the same tag name get generated the last one sent by the frontend will be the one that gets written out at termination. Multiple tag objects for the same name will cause all older tags of the same name to be lost. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-24 09:12:13 +02:00			`} else`
			`die("Invalid ref name or SHA1 expression: %s", from);`
			`read_next_command();`

			`/* tagger ... */`
fast-import: make tagger information optional Even though newer Porcelain tools always record the tagger information when creating new tags, export/import pair should be able to faithfully reproduce ancient tag objects that lack tagger information. Signed-off-by: Junio C Hamano <gitster@pobox.com> Acked-by: Shawn O. Pearce <spearce@spearce.org> 2008-12-19 23:41:21 +01:00			`if (!prefixcmp(command_buf.buf, "tagger ")) {`
			`tagger = parse_ident(command_buf.buf + 7);`
			`read_next_command();`
			`} else`
			`tagger = NULL;`
Implemented 'tag' command in fast-import. Tags received from the frontend are generated in memory in a simple linked list in the order that the tag commands were sent by the frontend. If multiple different tag objects for the same tag name get generated the last one sent by the frontend will be the one that gets written out at termination. Multiple tag objects for the same name will cause all older tags of the same name to be lost. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-24 09:12:13 +02:00
			`/* tag payload/message */`
fast-import: Stream very large blobs directly to pack If a blob is larger than the configured big-file-threshold, instead of reading it into a single buffer obtained from malloc, stream it onto the end of the current pack file. Streaming the larger objects into the pack avoids the 4+ GiB memory footprint that occurs when fast-import is processing 2+ GiB blobs. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-02-01 18:27:35 +01:00			`parse_data(&msg, 0, NULL);`
Implemented 'tag' command in fast-import. Tags received from the frontend are generated in memory in a simple linked list in the order that the tag commands were sent by the frontend. If multiple different tag objects for the same tag name get generated the last one sent by the frontend will be the one that gets written out at termination. Multiple tag objects for the same name will cause all older tags of the same name to be lost. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-24 09:12:13 +02:00
			`/* build the tag object */`
fast-import was using dbuf's, replace them with strbuf's. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-17 13:48:17 +02:00			`strbuf_reset(&new_data);`
fast-import: make tagger information optional Even though newer Porcelain tools always record the tagger information when creating new tags, export/import pair should be able to faithfully reproduce ancient tag objects that lack tagger information. Signed-off-by: Junio C Hamano <gitster@pobox.com> Acked-by: Shawn O. Pearce <spearce@spearce.org> 2008-12-19 23:41:21 +01:00
fast-import was using dbuf's, replace them with strbuf's. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-17 13:48:17 +02:00			`strbuf_addf(&new_data,`
fast-import: make tagger information optional Even though newer Porcelain tools always record the tagger information when creating new tags, export/import pair should be able to faithfully reproduce ancient tag objects that lack tagger information. Signed-off-by: Junio C Hamano <gitster@pobox.com> Acked-by: Shawn O. Pearce <spearce@spearce.org> 2008-12-19 23:41:21 +01:00			`"object %s\n"`
			`"type %s\n"`
			`"tag %s\n",`
fast-import: tag may point to any object type If you tried to export the official git repository, and then to import it back then git-fast-import would die complaining that "Mark :1 not a commit". Accordingly to a generated crash file, Mark 1 is not a commit but a blob, which is pointed by junio-gpg-pub tag. Because git-tag allows to create such tags, git-fast-import should import them. Signed-off-by: Dmitry Potapov <dpotapov@gmail.com> Acked-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-01-14 05:44:19 +01:00			`sha1_to_hex(sha1), typename(type), t->name);`
fast-import: make tagger information optional Even though newer Porcelain tools always record the tagger information when creating new tags, export/import pair should be able to faithfully reproduce ancient tag objects that lack tagger information. Signed-off-by: Junio C Hamano <gitster@pobox.com> Acked-by: Shawn O. Pearce <spearce@spearce.org> 2008-12-19 23:41:21 +01:00			`if (tagger)`
			`strbuf_addf(&new_data,`
			`"tagger %s\n", tagger);`
			`strbuf_addch(&new_data, '\n');`
fast-import was using dbuf's, replace them with strbuf's. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-17 13:48:17 +02:00			`strbuf_addbuf(&new_data, &msg);`
Implemented 'tag' command in fast-import. Tags received from the frontend are generated in memory in a simple linked list in the order that the tag commands were sent by the frontend. If multiple different tag objects for the same tag name get generated the last one sent by the frontend will be the one that gets written out at termination. Multiple tag objects for the same name will cause all older tags of the same name to be lost. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-24 09:12:13 +02:00			`free(tagger);`

fast-import was using dbuf's, replace them with strbuf's. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-17 13:48:17 +02:00			`if (store_object(OBJ_TAG, &new_data, NULL, t->sha1, 0))`
Correct packfile edge output in fast-import. Branches are only contained by a packfile if the branch actually had its most recent commit in that packfile. So new branches are set to MAX_PACK_ID to ensure they don't cause their commit to list as part of the first packfile when it closes out if the commit was actually in existance before fast-import started. Also corrected the type of last_commit to be umaxint_t to prevent overflow and wraparound on very large imports. Though that is highly unlikely to occur as we're talking 4 billion commits, which no real project has right now. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-17 08:42:43 +01:00			`t->pack_id = MAX_PACK_ID;`
			`else`
			`t->pack_id = pack_id;`
Implemented 'tag' command in fast-import. Tags received from the frontend are generated in memory in a simple linked list in the order that the tag commands were sent by the frontend. If multiple different tag objects for the same tag name get generated the last one sent by the frontend will be the one that gets written out at termination. Multiple tag objects for the same name will cause all older tags of the same name to be lost. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-24 09:12:13 +02:00			`}`
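`parse_new_tag()` reads its input in a fixed order: the tag name from the `tag` line, a mandatory `from` line (a branch name, a `:mark` reference, or a SHA-1 expression), an optional `tagger` line, and finally a `data` block holding the message. A frontend could emit a matching command roughly as sketched below. The sketch uses the counted `data <count>` form and writes into a caller-supplied buffer; `format_tag_command` is a hypothetical helper for illustration, not part of git.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Format one fast-import 'tag' command into out.
 * tagger may be NULL, because the parser treats the tagger line as optional.
 * Returns the number of bytes written (excluding the NUL), or -1 if the
 * command does not fit in the buffer.
 */
static int format_tag_command(char *out, size_t cap,
			      const char *name, const char *from,
			      const char *tagger, const char *msg)
{
	size_t msg_len;
	int n;

	assert(out && name && from && msg);
	msg_len = strlen(msg);	/* byte count for the 'data' header */

	if (tagger)
		n = snprintf(out, cap,
			     "tag %s\nfrom %s\ntagger %s\ndata %zu\n%s\n",
			     name, from, tagger, msg_len, msg);
	else
		n = snprintf(out, cap,
			     "tag %s\nfrom %s\ndata %zu\n%s\n",
			     name, from, msg_len, msg);

	if (n < 0 || (size_t)n >= cap)
		return -1;
	return n;
}
```

For example, `format_tag_command(buf, sizeof(buf), "v1.0", ":1", NULL, "release")` yields a command that tags whatever object mark `:1` refers to. On the parsing side, the tag object's `type` line is taken from the marked object's type, and `from` naming a branch that has no commit yet is rejected with "Can't tag an empty branch."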

git-fast-import: rename cmd_() functions to parse_() There is a cmd_merge() function in fast-import that will conflict with builtin-merge's cmd_merge() function. To keep it consistent, rename all cmd_() function to parse_() Signed-off-by: Miklos Vajna <vmiklos@frugalware.org> Acked-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-05-16 00:35:56 +02:00			`static void parse_reset_branch(void)`
Added 'reset' command to clear a branch's tree. Sometimes an import frontend may need to work with a temporary branch which will actually contain many different branches over the life of the import. This is especially useful when the frontend needs to create a tag from a set of file versions which are otherwise never a commit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-27 12:20:49 +02:00			`{`
			`struct branch *b;`
			`char *sp;`

			`/* Obtain the branch name from the rest of our command */`
			`sp = strchr(command_buf.buf, ' ') + 1;`
			`b = lookup_branch(sp);`
			`if (b) {`
fast-import: Support reusing 'from' and brown paper bag fix reset. It was suggested on the mailing list that being able to use `from` in any commit to reset the current branch is useful in some types of importers, such as a darcs importer. We originally did not permit resetting an existing branch with a new `from` command during a `commit` command, but this restriction was only to help debug the hacked up cvs2svn that Jon Smirl was developing in parallel with git-fast-import. It is probably more of a problem to disallow it than to allow it. So now we permit a `from` during any `commit`. While making the changes required to permit multiple `from` commands on the same branch, I discovered we no longer needed the last_commit field to be set to 0 during a reset, so that was removed. (Reset was originally setting the field to 0 to signal cmd_from() that it was OK to execute on the branch.) While poking around in this section of fast-import I also realized the `reset` command was not working as intended if the corresponding `from` command was omitted (as allowed by the BNF grammar and the code). If `from` was omitted we cleared out the tree but we left the tree SHA-1 and parent commit SHA-1 intact. This is not what the user intended in this case. Instead they would be trying to reset the branch to have no parent and to have no tree, making the branch look new-born during the next commit. We now clear these SHA-1 values during `reset`, ensuring the branch looks new-born if `from` does not get supplied. New test cases for these were also added. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-12 10:08:43 +01:00			`hashclr(b->sha1);`
			`hashclr(b->branch_tree.versions[0].sha1);`
			`hashclr(b->branch_tree.versions[1].sha1);`
Added 'reset' command to clear a branch's tree. Sometimes an import frontend may need to work with a temporary branch which will actually contain many different branches over the life of the import. This is especially useful when the frontend needs to create a tag from a set of file versions which are otherwise never a commit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-27 12:20:49 +02:00			`if (b->branch_tree.tree) {`
			`release_tree_content_recursive(b->branch_tree.tree);`
			`b->branch_tree.tree = NULL;`
			`}`
			`}`
Allow creating branches without committing in fast-import. Some importers may want to create a branch long before they actually commit to it, or in some cases they may never commit to the branch but they still need the ref to be created in the repository after the import is complete. This extends the 'reset ' command to automatically create a new branch if the supplied reference isn't already known as a branch. While I'm at it I also modified the syntax of the reset command to terminate with an empty line, like commit and tag operate. This just makes the command set more consistent. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-12 04:28:39 +01:00			`else`
			`b = new_branch(sp);`
			`read_next_command();`
git-fast-import: rename cmd_*() functions to parse_*() There is a cmd_merge() function in fast-import that will conflict with builtin-merge's cmd_merge() function. To keep it consistent, rename all cmd_*() function to parse_*() Signed-off-by: Miklos Vajna <vmiklos@frugalware.org> Acked-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-05-16 00:35:56 +02:00			`parse_from(b);`
Really make the LF after reset in fast-import optional cmd_from() ends with a call to read_next_command(), which is needed when using cmd_from() from commands where from is not the last element. With reset, however, "from" is the last command, after which the flow returns to the main loop, which calls read_next_command() again. Because of this, always set unread_command_buf in cmd_reset_branch(), even if cmd_from() was successful. Add a test case for this in t9300-fast-import.sh. Signed-off-by: Adeodato Simó <dato@net.com.org.es> Acked-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-03-07 21:22:17 +01:00			`if (command_buf.len > 0)`
Make trailing LF optional for all fast-import commands For the same reasons as the prior change we want to allow frontends to omit the trailing LF that usually delimits commands. In some cases these just make the input stream more verbose looking than it needs to be, and its just simpler for the frontend developer to get started if our parser is slightly more lenient about where an LF is required and where it isn't. To make this optional LF feature work we now have to buffer up to one line of input in command_buf. This buffering can happen if we look at the current input command but don't recognize it at this point in the code. In such a case we need to "unget" the entire line, but we cannot depend upon the stdio library to let us do ungetc() for that many characters at once. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-08-01 08:22:53 +02:00			`unread_command_buf = 1;`
Added 'reset' command to clear a branch's tree. Sometimes an import frontend may need to work with a temporary branch which will actually contain many different branches over the life of the import. This is especially useful when the frontend needs to create a tag from a set of file versions which are otherwise never a commit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-27 12:20:49 +02:00			`}`
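
The one-line "unget" described in the annotations above (an unrecognized line is kept in `command_buf` and `unread_command_buf` makes the next read return it again) can be illustrated in isolation. This is a minimal sketch, not fast-import's code: `struct line_reader`, `next_line()` and the in-memory input string are hypothetical names chosen for the example, and it reads from a string rather than stdin.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical reader for this sketch; fast-import itself uses
 * command_buf and unread_command_buf on top of stdin. */
struct line_reader {
	const char *cur;	/* unread remainder of the input */
	char line[128];		/* most recently returned line */
	int unread;		/* non-zero: hand out `line` again */
};

/* Return the next line without its LF, or NULL at end of input.
 * If `unread` is set, the previous line is returned once more. */
static const char *next_line(struct line_reader *r)
{
	size_t len;

	if (r->unread) {
		r->unread = 0;
		return r->line;
	}
	if (!*r->cur)
		return NULL;
	len = strcspn(r->cur, "\n");
	if (len >= sizeof(r->line))
		len = sizeof(r->line) - 1;	/* over-long lines are split */
	memcpy(r->line, r->cur, len);
	r->line[len] = '\0';
	r->cur += len;
	if (*r->cur == '\n')
		r->cur++;
	return r->line;
}
```

A caller that peeks at a line and does not recognize it sets `unread = 1`, so the main loop sees the same line on its next read; only one line is ever buffered, which is why no multi-character `ungetc()` is needed.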

fast-import: let importers retrieve blobs New objects written by fast-import are not available immediately. Until a checkpoint has been started and finishes writing the pack index, any new blobs will not be accessible using standard git tools. So introduce a new way to access them: a "cat-blob" command in the command stream requests for fast-import to print a blob to stdout or a file descriptor specified by the argument to --cat-blob-fd. The value for cat-blob-fd cannot be specified in the stream because that would be a layering violation: the decision of where to direct a stream has to be made when fast-import is started anyway, so we might as well make the stream format is independent of that detail. Output uses the same format as "git cat-file --batch". Thanks to Sverre Rabbelier and Sam Vilain for guidance in designing the protocol. Based-on-patch-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: David Barr <david.barr@cordelta.com> Acked-by: Ramkumar Ramachandra <artagnon@gmail.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-11-28 20:45:01 +01:00			`static void cat_blob_write(const char *buf, unsigned long size)`
			`{`
			`if (write_in_full(cat_blob_fd, buf, size) != size)`
			`die_errno("Write to frontend failed");`
			`}`

			`static void cat_blob(struct object_entry *oe, unsigned char sha1[20])`
			`{`
			`struct strbuf line = STRBUF_INIT;`
			`unsigned long size;`
			`enum object_type type = 0;`
			`char *buf;`

			`if (!oe || oe->pack_id == MAX_PACK_ID) {`
			`buf = read_sha1_file(sha1, &type, &size);`
			`} else {`
			`type = oe->type;`
			`buf = gfi_unpack_entry(oe, &size);`
			`}`

			`/*`
			`* Output based on batch_one_object() from cat-file.c.`
			`*/`
			`if (type <= 0) {`
			`strbuf_reset(&line);`
			`strbuf_addf(&line, "%s missing\n", sha1_to_hex(sha1));`
			`cat_blob_write(line.buf, line.len);`
			`strbuf_release(&line);`
			`free(buf);`
			`return;`
			`}`
			`if (!buf)`
			`die("Can't read object %s", sha1_to_hex(sha1));`
			`if (type != OBJ_BLOB)`
			`die("Object %s is a %s but a blob was expected.",`
			`sha1_to_hex(sha1), typename(type));`
			`strbuf_reset(&line);`
			`strbuf_addf(&line, "%s %s %lu\n", sha1_to_hex(sha1),`
			`typename(type), size);`
			`cat_blob_write(line.buf, line.len);`
			`strbuf_release(&line);`
			`cat_blob_write(buf, size);`
			`cat_blob_write("\n", 1);`
fast-import: treat cat-blob as a delta base hint for next blob Delta base for blobs is chosen as a previously saved blob. If we treat cat-blob's blob as a delta base for the next blob, nothing is likely to become worse. For fast-import stream producer like svn-fe cat-blob is used like following: - svn-fe reads file delta in svn format - to apply it, svn-fe asks cat-blob 'svn delta base' - applies 'svn delta' to the response - produces a blob command to store the result Currently there is no way for svn-fe to give fast-import a hint on object delta base. While what's requested in cat-blob is most of the time a best delta base possible. Of course, it could be not a good delta base, but we don't know any better one anyway. So do treat cat-blob's result as a delta base for next blob. The profit is nice: 2x to 7x reduction in pack size AND 1.2x to 3x time speedup due to diff_delta being faster on good deltas. git gc --aggressive can compress it even more, by 10% to 70%, utilizing more cpu time, real time and 3 cpu cores. Tested on 213M and 2.7G fast-import streams, resulting packs are 22M and 113M, import time is 7s and 60s, both streams are produced by svn-fe, sniffed and then used as raw input for fast-import. For git-fast-export produced streams there is no change as it doesn't use cat-blob and doesn't try to reorder blobs in some smart way to make successive deltas small. Signed-off-by: Dmitry Ivankov <divanorama@gmail.com> Acked-by: David Barr <davidbarr@google.com> Acked-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-08-20 21:04:12 +02:00			`if (oe && oe->pack_id == pack_id) {`
			`last_blob.offset = oe->idx.offset;`
			`strbuf_attach(&last_blob.data, buf, size, size);`
			`last_blob.depth = oe->depth;`
			`} else`
			`free(buf);`
fast-import: let importers retrieve blobs New objects written by fast-import are not available immediately. Until a checkpoint has been started and finishes writing the pack index, any new blobs will not be accessible using standard git tools. So introduce a new way to access them: a "cat-blob" command in the command stream requests for fast-import to print a blob to stdout or a file descriptor specified by the argument to --cat-blob-fd. The value for cat-blob-fd cannot be specified in the stream because that would be a layering violation: the decision of where to direct a stream has to be made when fast-import is started anyway, so we might as well make the stream format is independent of that detail. Output uses the same format as "git cat-file --batch". Thanks to Sverre Rabbelier and Sam Vilain for guidance in designing the protocol. Based-on-patch-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: David Barr <david.barr@cordelta.com> Acked-by: Ramkumar Ramachandra <artagnon@gmail.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-11-28 20:45:01 +01:00			`}`
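
On the frontend side, the header that `cat_blob()` writes is either `<sha1> SP missing LF` or `<sha1> SP blob SP <size> LF`, followed in the second case by `<size>` bytes of content and one LF. The following is a rough sketch of how a frontend might classify that header line; `parse_cat_blob_header()` and `enum reply_kind` are hypothetical names, and the sketch assumes lowercase 40-digit hex object names.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical result codes for this sketch; not part of fast-import. */
enum reply_kind { REPLY_MALFORMED = -1, REPLY_MISSING = 0, REPLY_BLOB = 1 };

/*
 * Classify one cat-blob header line (trailing LF optional):
 *   <sha1> SP "missing"        -> REPLY_MISSING
 *   <sha1> SP "blob" SP <size> -> REPLY_BLOB, *size = content length
 * The object name is copied into hex[] in both of those cases.
 */
static enum reply_kind parse_cat_blob_header(const char *line,
					     char hex[41],
					     unsigned long *size)
{
	const char *rest;
	char *end;

	if (strlen(line) < 42 || line[40] != ' ')
		return REPLY_MALFORMED;
	if (strspn(line, "0123456789abcdef") != 40)
		return REPLY_MALFORMED;
	memcpy(hex, line, 40);
	hex[40] = '\0';

	rest = line + 41;
	if (!strcmp(rest, "missing") || !strcmp(rest, "missing\n"))
		return REPLY_MISSING;
	if (strncmp(rest, "blob ", 5))
		return REPLY_MALFORMED;
	rest += 5;
	if (*rest < '0' || *rest > '9')
		return REPLY_MALFORMED;
	*size = strtoul(rest, &end, 10);
	if (*end != '\0' && *end != '\n')
		return REPLY_MALFORMED;
	return REPLY_BLOB;
}
```

After a `REPLY_BLOB` header the frontend would read exactly `*size` bytes and then consume the single trailing LF that `cat_blob()` appends.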

			`static void parse_cat_blob(void)`
			`{`
			`const char *p;`
			`struct object_entry *oe = oe;`
			`unsigned char sha1[20];`

			`/* cat-blob SP <object> LF */`
			`p = command_buf.buf + strlen("cat-blob ");`
			`if (*p == ':') {`
fast-import: tighten parsing of datarefs The syntax for the use of mark references in fast-import demands either a SP (space) or LF (end-of-line) after a mark reference. Fast-import does not complain when garbage appears after a mark reference in some cases. Factor out parsing of mark references and complain if errant characters are found. Also be a little more careful when parsing "inline" and SHA1s, complaining if extra characters appear or if the form of the dataref is unrecognized. Buggy input can cause fast-import to produce the wrong output, silently, without error. This makes it difficult to track down buggy generators of fast-import streams. An example is seen in the last line of this commit command: commit refs/heads/S2 committer Name <name@example.com> 1112912893 -0400 data <<COMMIT commit message COMMIT from :1M 100644 :103 hello.c It is missing a newline and should be: [...] from :1 M 100644 :103 hello.c What fast-import does is to produce a commit with the same contents for hello.c as in refs/heads/S2^. What the buggy program was expecting was the contents of blob :103. While the resulting commit graph looked correct, the contents in some commits were wrong. Signed-off-by: Pete Wyckoff <pw@padd.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-04-08 00:59:20 +02:00			`oe = find_mark(parse_mark_ref_eol(p));`
fast-import: let importers retrieve blobs New objects written by fast-import are not available immediately. Until a checkpoint has been started and finishes writing the pack index, any new blobs will not be accessible using standard git tools. So introduce a new way to access them: a "cat-blob" command in the command stream requests for fast-import to print a blob to stdout or a file descriptor specified by the argument to --cat-blob-fd. The value for cat-blob-fd cannot be specified in the stream because that would be a layering violation: the decision of where to direct a stream has to be made when fast-import is started anyway, so we might as well make the stream format is independent of that detail. Output uses the same format as "git cat-file --batch". Thanks to Sverre Rabbelier and Sam Vilain for guidance in designing the protocol. Based-on-patch-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: David Barr <david.barr@cordelta.com> Acked-by: Ramkumar Ramachandra <artagnon@gmail.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-11-28 20:45:01 +01:00			`if (!oe)`
			`die("Unknown mark: %s", command_buf.buf);`
			`hashcpy(sha1, oe->idx.sha1);`
			`} else {`
			`if (get_sha1_hex(p, sha1))`
fast-import: tighten parsing of datarefs The syntax for the use of mark references in fast-import demands either a SP (space) or LF (end-of-line) after a mark reference. Fast-import does not complain when garbage appears after a mark reference in some cases. Factor out parsing of mark references and complain if errant characters are found. Also be a little more careful when parsing "inline" and SHA1s, complaining if extra characters appear or if the form of the dataref is unrecognized. Buggy input can cause fast-import to produce the wrong output, silently, without error. This makes it difficult to track down buggy generators of fast-import streams. An example is seen in the last line of this commit command: commit refs/heads/S2 committer Name <name@example.com> 1112912893 -0400 data <<COMMIT commit message COMMIT from :1M 100644 :103 hello.c It is missing a newline and should be: [...] from :1 M 100644 :103 hello.c What fast-import does is to produce a commit with the same contents for hello.c as in refs/heads/S2^. What the buggy program was expecting was the contents of blob :103. While the resulting commit graph looked correct, the contents in some commits were wrong. Signed-off-by: Pete Wyckoff <pw@padd.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-04-08 00:59:20 +02:00			`die("Invalid dataref: %s", command_buf.buf);`
fast-import: let importers retrieve blobs New objects written by fast-import are not available immediately. Until a checkpoint has been started and finishes writing the pack index, any new blobs will not be accessible using standard git tools. So introduce a new way to access them: a "cat-blob" command in the command stream requests for fast-import to print a blob to stdout or a file descriptor specified by the argument to --cat-blob-fd. The value for cat-blob-fd cannot be specified in the stream because that would be a layering violation: the decision of where to direct a stream has to be made when fast-import is started anyway, so we might as well make the stream format is independent of that detail. Output uses the same format as "git cat-file --batch". Thanks to Sverre Rabbelier and Sam Vilain for guidance in designing the protocol. Based-on-patch-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: David Barr <david.barr@cordelta.com> Acked-by: Ramkumar Ramachandra <artagnon@gmail.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-11-28 20:45:01 +01:00			`if (p[40])`
			`die("Garbage after SHA1: %s", command_buf.buf);`
			`oe = find_object(sha1);`
			`}`

			`cat_blob(oe, sha1);`
			`}`

fast-import: add 'ls' command Lazy fast-import frontend authors that want to rely on the backend to keep track of the content of the imported trees _almost_ have what they need in the 'cat-blob' command (v1.7.4-rc0~30^2~3, 2010-11-28). But it is not quite enough, since (1) cat-blob can be used to retrieve the content of files, but not their mode, and (2) using cat-blob requires the frontend to keep track of a name (mark number or object id) for each blob to be retrieved Introduce an 'ls' command to complement cat-blob and take care of the remaining needs. The 'ls' command finds what is at a given path within a given tree-ish (tag, commit, or tree): 'ls' SP <dataref> SP <path> LF or in fast-import's active commit: 'ls' SP <path> LF The response is a single line sent through the cat-blob channel, imitating ls-tree output. So for example: FE> ls :1 Documentation gfi> 040000 tree 9e6c2b599341d28a2a375f8207507e0a2a627fe9 Documentation FE> ls 9e6c2b599341d28a2a375f8207507e0a2a627fe9 git-fast-import.txt gfi> 100644 blob 4f92954396e3f0f97e75b6838a5635b583708870 git-fast-import.txt FE> ls :1 RelNotes gfi> 120000 blob b942e499449d97aeb50c73ca2bdc1c6e6d528743 RelNotes FE> cat-blob b942e499449d97aeb50c73ca2bdc1c6e6d528743 gfi> b942e499449d97aeb50c73ca2bdc1c6e6d528743 blob 32 gfi> Documentation/RelNotes/1.7.4.txt The most interesting parts of the reply are the first word, which is a 6-digit octal mode (regular file, executable, symlink, directory, or submodule), and the part from the second space to the tab, which is a <dataref> that can be used in later cat-blob, ls, and filemodify (M) commands to refer to the content (blob, tree, or commit) at that path. If there is nothing there, the response is "missing some/path". The intent is for this command to be used to read files from the active commit, so a frontend can apply patches to them, and to copy files and directories from previous revisions. 
For example, proposed updates to svn-fe use this command in place of its internal representation of the repository directory structure. This simplifies the frontend a great deal and means support for resuming an import in a separate fast-import run (i.e., incremental import) is basically free. Signed-off-by: David Barr <david.barr@cordelta.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Improved-by: Junio C Hamano <gitster@pobox.com> Improved-by: Sverre Rabbelier <srabbelier@gmail.com> 2010-12-02 11:40:20 +01:00			`static struct object_entry *dereference(struct object_entry *oe,`
			`unsigned char sha1[20])`
			`{`
			`unsigned long size;`
fast-import: make code "-Wpointer-arith" clean The dereference() function to peel a tree-ish and find the underlying tree expects arithmetic to (void *) to work on byte addresses. We should be reading the text of objects through a char * anyway. Noticed-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> 2011-02-28 22:16:59 +01:00			`char *buf = NULL;`
fast-import: add 'ls' command Lazy fast-import frontend authors that want to rely on the backend to keep track of the content of the imported trees _almost_ have what they need in the 'cat-blob' command (v1.7.4-rc0~30^2~3, 2010-11-28). But it is not quite enough, since (1) cat-blob can be used to retrieve the content of files, but not their mode, and (2) using cat-blob requires the frontend to keep track of a name (mark number or object id) for each blob to be retrieved Introduce an 'ls' command to complement cat-blob and take care of the remaining needs. The 'ls' command finds what is at a given path within a given tree-ish (tag, commit, or tree): 'ls' SP <dataref> SP <path> LF or in fast-import's active commit: 'ls' SP <path> LF The response is a single line sent through the cat-blob channel, imitating ls-tree output. So for example: FE> ls :1 Documentation gfi> 040000 tree 9e6c2b599341d28a2a375f8207507e0a2a627fe9 Documentation FE> ls 9e6c2b599341d28a2a375f8207507e0a2a627fe9 git-fast-import.txt gfi> 100644 blob 4f92954396e3f0f97e75b6838a5635b583708870 git-fast-import.txt FE> ls :1 RelNotes gfi> 120000 blob b942e499449d97aeb50c73ca2bdc1c6e6d528743 RelNotes FE> cat-blob b942e499449d97aeb50c73ca2bdc1c6e6d528743 gfi> b942e499449d97aeb50c73ca2bdc1c6e6d528743 blob 32 gfi> Documentation/RelNotes/1.7.4.txt The most interesting parts of the reply are the first word, which is a 6-digit octal mode (regular file, executable, symlink, directory, or submodule), and the part from the second space to the tab, which is a <dataref> that can be used in later cat-blob, ls, and filemodify (M) commands to refer to the content (blob, tree, or commit) at that path. If there is nothing there, the response is "missing some/path". The intent is for this command to be used to read files from the active commit, so a frontend can apply patches to them, and to copy files and directories from previous revisions. 
For example, proposed updates to svn-fe use this command in place of its internal representation of the repository directory structure. This simplifies the frontend a great deal and means support for resuming an import in a separate fast-import run (i.e., incremental import) is basically free. Signed-off-by: David Barr <david.barr@cordelta.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Improved-by: Junio C Hamano <gitster@pobox.com> Improved-by: Sverre Rabbelier <srabbelier@gmail.com> 2010-12-02 11:40:20 +01:00			`if (!oe) {`
			`enum object_type type = sha1_object_info(sha1, NULL);`
			`if (type < 0)`
			`die("object not found: %s", sha1_to_hex(sha1));`
			`/* cache it! */`
			`oe = insert_object(sha1);`
			`oe->type = type;`
			`oe->pack_id = MAX_PACK_ID;`
			`oe->idx.offset = 1;`
			`}`
			`switch (oe->type) {`
			`case OBJ_TREE: /* easy case. */`
			`return oe;`
			`case OBJ_COMMIT:`
			`case OBJ_TAG:`
			`break;`
			`default:`
			`die("Not a treeish: %s", command_buf.buf);`
			`}`

			`if (oe->pack_id != MAX_PACK_ID) { /* in a pack being written */`
			`buf = gfi_unpack_entry(oe, &size);`
			`} else {`
			`enum object_type unused;`
			`buf = read_sha1_file(sha1, &unused, &size);`
			`}`
			`if (!buf)`
			`die("Can't load object %s", sha1_to_hex(sha1));`

			`/* Peel one layer. */`
			`switch (oe->type) {`
			`case OBJ_TAG:`
			`if (size < 40 + strlen("object ") ||`
			`get_sha1_hex(buf + strlen("object "), sha1))`
			`die("Invalid SHA1 in tag: %s", command_buf.buf);`
			`break;`
			`case OBJ_COMMIT:`
			`if (size < 40 + strlen("tree ") ||`
			`get_sha1_hex(buf + strlen("tree "), sha1))`
			`die("Invalid SHA1 in commit: %s", command_buf.buf);`
			`}`

			`free(buf);`
			`return find_object(sha1);`
			`}`
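
The "peel one layer" step in `dereference()` relies on a tag body starting with `object <sha1>` and a commit body starting with `tree <sha1>`. A self-contained sketch of that extraction is shown here; `peel_header()` is a hypothetical helper, it works on hex strings rather than binary SHA-1s, it accepts only lowercase hex, and unlike the code above it also checks that the prefix is actually present.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy the 40-digit hex object name that follows `prefix` at the very
 * start of an object body into hex[41].
 * Returns 0 on success, -1 if the body is too short, does not start
 * with `prefix`, or the name is not lowercase hex.
 */
static int peel_header(const char *buf, unsigned long size,
		       const char *prefix, char hex[41])
{
	size_t plen = strlen(prefix);
	size_t i;

	if (size < plen + 40 || memcmp(buf, prefix, plen))
		return -1;
	for (i = 0; i < 40; i++) {
		char c = buf[plen + i];
		if (!((c >= '0' && c <= '9') || (c >= 'a' && c <= 'f')))
			return -1;
	}
	memcpy(hex, buf + plen, 40);
	hex[40] = '\0';
	return 0;
}
```

Calling it with `"object "` on a tag and `"tree "` on a commit mirrors the two `case` arms; repeating the lookup on the result until a tree is reached corresponds to the loop in `parse_treeish_dataref()`.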

			`static struct object_entry *parse_treeish_dataref(const char **p)`
			`{`
			`unsigned char sha1[20];`
			`struct object_entry *e;`

			`if (**p == ':') { /* <mark> */`
fast-import: tighten parsing of datarefs The syntax for the use of mark references in fast-import demands either a SP (space) or LF (end-of-line) after a mark reference. Fast-import does not complain when garbage appears after a mark reference in some cases. Factor out parsing of mark references and complain if errant characters are found. Also be a little more careful when parsing "inline" and SHA1s, complaining if extra characters appear or if the form of the dataref is unrecognized. Buggy input can cause fast-import to produce the wrong output, silently, without error. This makes it difficult to track down buggy generators of fast-import streams. An example is seen in the last line of this commit command: commit refs/heads/S2 committer Name <name@example.com> 1112912893 -0400 data <<COMMIT commit message COMMIT from :1M 100644 :103 hello.c It is missing a newline and should be: [...] from :1 M 100644 :103 hello.c What fast-import does is to produce a commit with the same contents for hello.c as in refs/heads/S2^. What the buggy program was expecting was the contents of blob :103. While the resulting commit graph looked correct, the contents in some commits were wrong. Signed-off-by: Pete Wyckoff <pw@padd.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-04-08 00:59:20 +02:00			`e = find_mark(parse_mark_ref_space(p));`
fast-import: add 'ls' command Lazy fast-import frontend authors that want to rely on the backend to keep track of the content of the imported trees _almost_ have what they need in the 'cat-blob' command (v1.7.4-rc0~30^2~3, 2010-11-28). But it is not quite enough, since (1) cat-blob can be used to retrieve the content of files, but not their mode, and (2) using cat-blob requires the frontend to keep track of a name (mark number or object id) for each blob to be retrieved Introduce an 'ls' command to complement cat-blob and take care of the remaining needs. The 'ls' command finds what is at a given path within a given tree-ish (tag, commit, or tree): 'ls' SP <dataref> SP <path> LF or in fast-import's active commit: 'ls' SP <path> LF The response is a single line sent through the cat-blob channel, imitating ls-tree output. So for example: FE> ls :1 Documentation gfi> 040000 tree 9e6c2b599341d28a2a375f8207507e0a2a627fe9 Documentation FE> ls 9e6c2b599341d28a2a375f8207507e0a2a627fe9 git-fast-import.txt gfi> 100644 blob 4f92954396e3f0f97e75b6838a5635b583708870 git-fast-import.txt FE> ls :1 RelNotes gfi> 120000 blob b942e499449d97aeb50c73ca2bdc1c6e6d528743 RelNotes FE> cat-blob b942e499449d97aeb50c73ca2bdc1c6e6d528743 gfi> b942e499449d97aeb50c73ca2bdc1c6e6d528743 blob 32 gfi> Documentation/RelNotes/1.7.4.txt The most interesting parts of the reply are the first word, which is a 6-digit octal mode (regular file, executable, symlink, directory, or submodule), and the part from the second space to the tab, which is a <dataref> that can be used in later cat-blob, ls, and filemodify (M) commands to refer to the content (blob, tree, or commit) at that path. If there is nothing there, the response is "missing some/path". The intent is for this command to be used to read files from the active commit, so a frontend can apply patches to them, and to copy files and directories from previous revisions. 
For example, proposed updates to svn-fe use this command in place of its internal representation of the repository directory structure. This simplifies the frontend a great deal and means support for resuming an import in a separate fast-import run (i.e., incremental import) is basically free. Signed-off-by: David Barr <david.barr@cordelta.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Improved-by: Junio C Hamano <gitster@pobox.com> Improved-by: Sverre Rabbelier <srabbelier@gmail.com> 2010-12-02 11:40:20 +01:00			`if (!e)`
			`die("Unknown mark: %s", command_buf.buf);`
			`hashcpy(sha1, e->idx.sha1);`
			`} else { /* <sha1> */`
			`if (get_sha1_hex(*p, sha1))`
fast-import: tighten parsing of datarefs The syntax for the use of mark references in fast-import demands either a SP (space) or LF (end-of-line) after a mark reference. Fast-import does not complain when garbage appears after a mark reference in some cases. Factor out parsing of mark references and complain if errant characters are found. Also be a little more careful when parsing "inline" and SHA1s, complaining if extra characters appear or if the form of the dataref is unrecognized. Buggy input can cause fast-import to produce the wrong output, silently, without error. This makes it difficult to track down buggy generators of fast-import streams. An example is seen in the last line of this commit command: commit refs/heads/S2 committer Name <name@example.com> 1112912893 -0400 data <<COMMIT commit message COMMIT from :1M 100644 :103 hello.c It is missing a newline and should be: [...] from :1 M 100644 :103 hello.c What fast-import does is to produce a commit with the same contents for hello.c as in refs/heads/S2^. What the buggy program was expecting was the contents of blob :103. While the resulting commit graph looked correct, the contents in some commits were wrong. Signed-off-by: Pete Wyckoff <pw@padd.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-04-08 00:59:20 +02:00			`die("Invalid dataref: %s", command_buf.buf);`
fast-import: add 'ls' command Lazy fast-import frontend authors that want to rely on the backend to keep track of the content of the imported trees _almost_ have what they need in the 'cat-blob' command (v1.7.4-rc0~30^2~3, 2010-11-28). But it is not quite enough, since (1) cat-blob can be used to retrieve the content of files, but not their mode, and (2) using cat-blob requires the frontend to keep track of a name (mark number or object id) for each blob to be retrieved Introduce an 'ls' command to complement cat-blob and take care of the remaining needs. The 'ls' command finds what is at a given path within a given tree-ish (tag, commit, or tree): 'ls' SP <dataref> SP <path> LF or in fast-import's active commit: 'ls' SP <path> LF The response is a single line sent through the cat-blob channel, imitating ls-tree output. So for example: FE> ls :1 Documentation gfi> 040000 tree 9e6c2b599341d28a2a375f8207507e0a2a627fe9 Documentation FE> ls 9e6c2b599341d28a2a375f8207507e0a2a627fe9 git-fast-import.txt gfi> 100644 blob 4f92954396e3f0f97e75b6838a5635b583708870 git-fast-import.txt FE> ls :1 RelNotes gfi> 120000 blob b942e499449d97aeb50c73ca2bdc1c6e6d528743 RelNotes FE> cat-blob b942e499449d97aeb50c73ca2bdc1c6e6d528743 gfi> b942e499449d97aeb50c73ca2bdc1c6e6d528743 blob 32 gfi> Documentation/RelNotes/1.7.4.txt The most interesting parts of the reply are the first word, which is a 6-digit octal mode (regular file, executable, symlink, directory, or submodule), and the part from the second space to the tab, which is a <dataref> that can be used in later cat-blob, ls, and filemodify (M) commands to refer to the content (blob, tree, or commit) at that path. If there is nothing there, the response is "missing some/path". The intent is for this command to be used to read files from the active commit, so a frontend can apply patches to them, and to copy files and directories from previous revisions. 
For example, proposed updates to svn-fe use this command in place of its internal representation of the repository directory structure. This simplifies the frontend a great deal and means support for resuming an import in a separate fast-import run (i.e., incremental import) is basically free. Signed-off-by: David Barr <david.barr@cordelta.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Improved-by: Junio C Hamano <gitster@pobox.com> Improved-by: Sverre Rabbelier <srabbelier@gmail.com> 2010-12-02 11:40:20 +01:00			`e = find_object(sha1);`
			`*p += 40;`
			`}`

			`while (!e || e->type != OBJ_TREE)`
			`e = dereference(e, sha1);`
			`return e;`
			`}`

			`static void print_ls(int mode, const unsigned char *sha1, const char *path)`
			`{`
			`static struct strbuf line = STRBUF_INIT;`

			`/* See show_tree(). */`
			`const char *type =`
			`S_ISGITLINK(mode) ? commit_type :`
			`S_ISDIR(mode) ? tree_type :`
			`blob_type;`

			`if (!mode) {`
			`/* missing SP path LF */`
			`strbuf_reset(&line);`
			`strbuf_addstr(&line, "missing ");`
			`quote_c_style(path, &line, NULL, 0);`
			`strbuf_addch(&line, '\n');`
			`} else {`
			`/* mode SP type SP object_name TAB path LF */`
			`strbuf_reset(&line);`
			`strbuf_addf(&line, "%06o %s %s\t",`
fast-import: prevent producing bad delta To produce deltas for tree objects fast-import tracks two versions of tree's entries - base and current one. Base version stands both for a delta base of this tree, and for a entry inside a delta base of a parent tree. So care should be taken to keep it in sync. tree_content_set cuts away a whole subtree and replaces it with a new one (or NULL for lazy load of a tree with known sha1). It keeps a base sha1 for this subtree (needed for parent tree). And here is the problem, 'subtree' tree root doesn't have the implied base version entries. Adjusting the subtree to include them would mean a deep rewrite of subtree. Invalidating the subtree base version would mean recursive invalidation of parents' base versions. So just mark this tree as do-not-delta me. Abuse setuid bit for this purpose. tree_content_replace is the same as tree_content_set except that is is used to replace the root, so just clearing base sha1 here (instead of setting the bit) is fine. [di: log message] Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Dmitry Ivankov <divanorama@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-08-14 20:32:24 +02:00			`mode & ~NO_DELTA, type, sha1_to_hex(sha1));`
fast-import: add 'ls' command Lazy fast-import frontend authors that want to rely on the backend to keep track of the content of the imported trees _almost_ have what they need in the 'cat-blob' command (v1.7.4-rc0~30^2~3, 2010-11-28). But it is not quite enough, since (1) cat-blob can be used to retrieve the content of files, but not their mode, and (2) using cat-blob requires the frontend to keep track of a name (mark number or object id) for each blob to be retrieved Introduce an 'ls' command to complement cat-blob and take care of the remaining needs. The 'ls' command finds what is at a given path within a given tree-ish (tag, commit, or tree): 'ls' SP <dataref> SP <path> LF or in fast-import's active commit: 'ls' SP <path> LF The response is a single line sent through the cat-blob channel, imitating ls-tree output. So for example: FE> ls :1 Documentation gfi> 040000 tree 9e6c2b599341d28a2a375f8207507e0a2a627fe9 Documentation FE> ls 9e6c2b599341d28a2a375f8207507e0a2a627fe9 git-fast-import.txt gfi> 100644 blob 4f92954396e3f0f97e75b6838a5635b583708870 git-fast-import.txt FE> ls :1 RelNotes gfi> 120000 blob b942e499449d97aeb50c73ca2bdc1c6e6d528743 RelNotes FE> cat-blob b942e499449d97aeb50c73ca2bdc1c6e6d528743 gfi> b942e499449d97aeb50c73ca2bdc1c6e6d528743 blob 32 gfi> Documentation/RelNotes/1.7.4.txt The most interesting parts of the reply are the first word, which is a 6-digit octal mode (regular file, executable, symlink, directory, or submodule), and the part from the second space to the tab, which is a <dataref> that can be used in later cat-blob, ls, and filemodify (M) commands to refer to the content (blob, tree, or commit) at that path. If there is nothing there, the response is "missing some/path". The intent is for this command to be used to read files from the active commit, so a frontend can apply patches to them, and to copy files and directories from previous revisions. 
For example, proposed updates to svn-fe use this command in place of its internal representation of the repository directory structure. This simplifies the frontend a great deal and means support for resuming an import in a separate fast-import run (i.e., incremental import) is basically free. Signed-off-by: David Barr <david.barr@cordelta.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Improved-by: Junio C Hamano <gitster@pobox.com> Improved-by: Sverre Rabbelier <srabbelier@gmail.com> 2010-12-02 11:40:20 +01:00			`quote_c_style(path, &line, NULL, 0);`
			`strbuf_addch(&line, '\n');`
			`}`
			`cat_blob_write(line.buf, line.len);`
			`}`

			`static void parse_ls(struct branch *b)`
			`{`
			`const char *p;`
			`struct tree_entry *root = NULL;`
Fix sparse warnings Fix warnings from 'make check'. - These files don't include 'builtin.h' causing sparse to complain that cmd_* isn't declared: builtin/clone.c:364, builtin/fetch-pack.c:797, builtin/fmt-merge-msg.c:34, builtin/hash-object.c:78, builtin/merge-index.c:69, builtin/merge-recursive.c:22 builtin/merge-tree.c:341, builtin/mktag.c:156, builtin/notes.c:426 builtin/notes.c:822, builtin/pack-redundant.c:596, builtin/pack-refs.c:10, builtin/patch-id.c:60, builtin/patch-id.c:149, builtin/remote.c:1512, builtin/remote-ext.c:240, builtin/remote-fd.c:53, builtin/reset.c:236, builtin/send-pack.c:384, builtin/unpack-file.c:25, builtin/var.c:75 - These files have symbols which should be marked static since they're only file scope: submodule.c:12, diff.c:631, replace_object.c:92, submodule.c:13, submodule.c:14, trace.c:78, transport.c:195, transport-helper.c:79, unpack-trees.c:19, url.c:3, url.c:18, url.c:104, url.c:117, url.c:123, url.c:129, url.c:136, thread-utils.c:21, thread-utils.c:48 - These files redeclare symbols to be different types: builtin/index-pack.c:210, parse-options.c:564, parse-options.c:571, usage.c:49, usage.c:58, usage.c:63, usage.c:72 - These files use a literal integer 0 when they really should use a NULL pointer: daemon.c:663, fast-import.c:2942, imap-send.c:1072, notes-merge.c:362 While we're in the area, clean up some unused #includes in builtin files (mostly exec_cmd.h). Signed-off-by: Stephen Boyd <bebarino@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-03-22 08:51:05 +01:00			`struct tree_entry leaf = {NULL};`
fast-import: add 'ls' command Lazy fast-import frontend authors that want to rely on the backend to keep track of the content of the imported trees _almost_ have what they need in the 'cat-blob' command (v1.7.4-rc0~30^2~3, 2010-11-28). But it is not quite enough, since (1) cat-blob can be used to retrieve the content of files, but not their mode, and (2) using cat-blob requires the frontend to keep track of a name (mark number or object id) for each blob to be retrieved Introduce an 'ls' command to complement cat-blob and take care of the remaining needs. The 'ls' command finds what is at a given path within a given tree-ish (tag, commit, or tree): 'ls' SP <dataref> SP <path> LF or in fast-import's active commit: 'ls' SP <path> LF The response is a single line sent through the cat-blob channel, imitating ls-tree output. So for example: FE> ls :1 Documentation gfi> 040000 tree 9e6c2b599341d28a2a375f8207507e0a2a627fe9 Documentation FE> ls 9e6c2b599341d28a2a375f8207507e0a2a627fe9 git-fast-import.txt gfi> 100644 blob 4f92954396e3f0f97e75b6838a5635b583708870 git-fast-import.txt FE> ls :1 RelNotes gfi> 120000 blob b942e499449d97aeb50c73ca2bdc1c6e6d528743 RelNotes FE> cat-blob b942e499449d97aeb50c73ca2bdc1c6e6d528743 gfi> b942e499449d97aeb50c73ca2bdc1c6e6d528743 blob 32 gfi> Documentation/RelNotes/1.7.4.txt The most interesting parts of the reply are the first word, which is a 6-digit octal mode (regular file, executable, symlink, directory, or submodule), and the part from the second space to the tab, which is a <dataref> that can be used in later cat-blob, ls, and filemodify (M) commands to refer to the content (blob, tree, or commit) at that path. If there is nothing there, the response is "missing some/path". The intent is for this command to be used to read files from the active commit, so a frontend can apply patches to them, and to copy files and directories from previous revisions. 
For example, proposed updates to svn-fe use this command in place of its internal representation of the repository directory structure. This simplifies the frontend a great deal and means support for resuming an import in a separate fast-import run (i.e., incremental import) is basically free. Signed-off-by: David Barr <david.barr@cordelta.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Improved-by: Junio C Hamano <gitster@pobox.com> Improved-by: Sverre Rabbelier <srabbelier@gmail.com> 2010-12-02 11:40:20 +01:00
			`/* ls SP (<treeish> SP)? <path> */`
			`p = command_buf.buf + strlen("ls ");`
			`if (*p == '"') {`
			`if (!b)`
			`die("Not in a commit: %s", command_buf.buf);`
			`root = &b->branch_tree;`
			`} else {`
			`struct object_entry *e = parse_treeish_dataref(&p);`
			`root = new_tree_entry();`
			`hashcpy(root->versions[1].sha1, e->idx.sha1);`
			`load_tree(root);`
			`if (*p++ != ' ')`
			`die("Missing space after tree-ish: %s", command_buf.buf);`
			`}`
			`if (*p == '"') {`
			`static struct strbuf uq = STRBUF_INIT;`
			`const char *endp;`
			`strbuf_reset(&uq);`
			`if (unquote_c_style(&uq, p, &endp))`
			`die("Invalid path: %s", command_buf.buf);`
			`if (*endp)`
			`die("Garbage after path in: %s", command_buf.buf);`
			`p = uq.buf;`
			`}`
			`tree_content_get(root, p, &leaf);`
			`/*`
			`* A directory in preparation would have a sha1 of zero`
			`* until it is saved. Save, for simplicity.`
			`*/`
			`if (S_ISDIR(leaf.versions[1].mode))`
			`store_tree(&leaf);`

			`print_ls(leaf.versions[1].mode, leaf.versions[1].sha1, p);`
fast-import: leakfix for 'ls' of dirty trees When the chosen directory has changed since it was last written to pack, "tree_content_get" makes a deep copy of its content to scribble on while computing the tree name, which we forgot to free. This leak has been present since the 'ls' command was introduced in v1.7.5-rc0~3^2~33 (fast-import: add 'ls' command, 2010-12-02). Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> 2012-03-10 04:20:34 +01:00			`if (leaf.tree)`
			`release_tree_content_recursive(leaf.tree);`
fast-import: add 'ls' command Lazy fast-import frontend authors that want to rely on the backend to keep track of the content of the imported trees _almost_ have what they need in the 'cat-blob' command (v1.7.4-rc0~30^2~3, 2010-11-28). But it is not quite enough, since (1) cat-blob can be used to retrieve the content of files, but not their mode, and (2) using cat-blob requires the frontend to keep track of a name (mark number or object id) for each blob to be retrieved Introduce an 'ls' command to complement cat-blob and take care of the remaining needs. The 'ls' command finds what is at a given path within a given tree-ish (tag, commit, or tree): 'ls' SP <dataref> SP <path> LF or in fast-import's active commit: 'ls' SP <path> LF The response is a single line sent through the cat-blob channel, imitating ls-tree output. So for example: FE> ls :1 Documentation gfi> 040000 tree 9e6c2b599341d28a2a375f8207507e0a2a627fe9 Documentation FE> ls 9e6c2b599341d28a2a375f8207507e0a2a627fe9 git-fast-import.txt gfi> 100644 blob 4f92954396e3f0f97e75b6838a5635b583708870 git-fast-import.txt FE> ls :1 RelNotes gfi> 120000 blob b942e499449d97aeb50c73ca2bdc1c6e6d528743 RelNotes FE> cat-blob b942e499449d97aeb50c73ca2bdc1c6e6d528743 gfi> b942e499449d97aeb50c73ca2bdc1c6e6d528743 blob 32 gfi> Documentation/RelNotes/1.7.4.txt The most interesting parts of the reply are the first word, which is a 6-digit octal mode (regular file, executable, symlink, directory, or submodule), and the part from the second space to the tab, which is a <dataref> that can be used in later cat-blob, ls, and filemodify (M) commands to refer to the content (blob, tree, or commit) at that path. If there is nothing there, the response is "missing some/path". The intent is for this command to be used to read files from the active commit, so a frontend can apply patches to them, and to copy files and directories from previous revisions. 
For example, proposed updates to svn-fe use this command in place of its internal representation of the repository directory structure. This simplifies the frontend a great deal and means support for resuming an import in a separate fast-import run (i.e., incremental import) is basically free. Signed-off-by: David Barr <david.barr@cordelta.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Improved-by: Junio C Hamano <gitster@pobox.com> Improved-by: Sverre Rabbelier <srabbelier@gmail.com> 2010-12-02 11:40:20 +01:00			`if (!b || root != &b->branch_tree)`
			`release_tree_entry(root);`
			`}`
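The reply that `parse_ls` sends through the cat-blob channel is meant to be read back by the frontend. As a minimal sketch of that consumer side, assuming the line format described in the commit message above (`<mode> SP <type> SP <dataref> HT <path>`, or `missing SP <path>`), a frontend could split the line as follows. The `ls_reply` struct and `parse_ls_reply` function are hypothetical names for illustration and are not part of fast-import.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical frontend-side view of one 'ls' reply line. */
struct ls_reply {
	unsigned int mode;   /* value of the 6-digit octal mode field */
	char type[8];        /* "blob", "tree", or "commit" */
	char dataref[65];    /* hex object name (40 hex digits for SHA-1) */
	const char *path;    /* points into the caller's line, after the tab */
};

/*
 * Returns 1 for an entry, 0 for "missing <path>", -1 for an
 * unrecognized line. The path is returned as sent: it may still be
 * C-style quoted and may carry a trailing LF if the caller kept it.
 */
static int parse_ls_reply(const char *line, struct ls_reply *out)
{
	const char *tab;

	if (!strncmp(line, "missing ", 8)) {
		out->path = line + 8;
		return 0;
	}
	if (sscanf(line, "%o %7s %64s", &out->mode, out->type, out->dataref) != 3)
		return -1;
	tab = strchr(line, '\t');
	if (!tab)
		return -1;
	out->path = tab + 1;
	return 1;
}
```

The `dataref` field can then be passed to later `cat-blob`, `ls`, or filemodify (`M`) commands, as the commit message describes.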

fast-import: treat SIGUSR1 as a request to access objects early It can be tedious to wait for a multi-million-revision import. Unfortunately it is hard to spy on the import because fast-import works by continuously streaming out objects, without updating the pack index or refs until a checkpoint command or the end of the stream. So allow the impatient operator to request checkpoints by sending a signal, like so: killall -USR1 git-fast-import When receiving such a signal, fast-import would schedule a checkpoint to take place after the current top-level command (usually a "commit" or "blob" request) finishes. Caveats: just like ordinary checkpoint commands, such requests slow down the import. Switching to a new pack at a suboptimal moment is also likely to result in a less dense initial collection of packs. That's the price. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-11-22 09:16:02 +01:00			`static void checkpoint(void)`
Implemented manual packfile switching in fast-import. To help importers which are dealing with massive amounts of data fast-import needs to be able to close the packfile it is currently writing to and open a new packfile for any additional data that will be received. A new 'checkpoint' command has been introduced which can be used by the frontend import process to force this to occur at any time. This may be useful to ensure a very long running import doesn't lose any work due to unexpected failures. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 12:35:41 +01:00			`{`
fast-import: treat SIGUSR1 as a request to access objects early It can be tedious to wait for a multi-million-revision import. Unfortunately it is hard to spy on the import because fast-import works by continuously streaming out objects, without updating the pack index or refs until a checkpoint command or the end of the stream. So allow the impatient operator to request checkpoints by sending a signal, like so: killall -USR1 git-fast-import When receiving such a signal, fast-import would schedule a checkpoint to take place after the current top-level command (usually a "commit" or "blob" request) finishes. Caveats: just like ordinary checkpoint commands, such requests slow down the import. Switching to a new pack at a suboptimal moment is also likely to result in a less dense initial collection of packs. That's the price. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-11-22 09:16:02 +01:00			`checkpoint_requested = 0;`
Dump all refs and marks during a checkpoint in fast-import. If the frontend asks us to checkpoint (via the explicit checkpoint command) its probably because they are afraid the current import will crash/fail/whatever and want to make sure they can pickup from the last checkpoint. To do that sort of recovery, we will need the current tip of every branch and tag available at the next startup. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-07 08:42:44 +01:00			`if (object_count) {`
			`cycle_packfile();`
			`dump_branches();`
			`dump_tags();`
			`dump_marks();`
			`}`
fast-import: treat SIGUSR1 as a request to access objects early It can be tedious to wait for a multi-million-revision import. Unfortunately it is hard to spy on the import because fast-import works by continuously streaming out objects, without updating the pack index or refs until a checkpoint command or the end of the stream. So allow the impatient operator to request checkpoints by sending a signal, like so: killall -USR1 git-fast-import When receiving such a signal, fast-import would schedule a checkpoint to take place after the current top-level command (usually a "commit" or "blob" request) finishes. Caveats: just like ordinary checkpoint commands, such requests slow down the import. Switching to a new pack at a suboptimal moment is also likely to result in a less dense initial collection of packs. That's the price. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-11-22 09:16:02 +01:00			`}`

			`static void parse_checkpoint(void)`
			`{`
			`checkpoint_requested = 1;`
Make trailing LF optional for all fast-import commands For the same reasons as the prior change we want to allow frontends to omit the trailing LF that usually delimits commands. In some cases these just make the input stream more verbose looking than it needs to be, and its just simpler for the frontend developer to get started if our parser is slightly more lenient about where an LF is required and where it isn't. To make this optional LF feature work we now have to buffer up to one line of input in command_buf. This buffering can happen if we look at the current input command but don't recognize it at this point in the code. In such a case we need to "unget" the entire line, but we cannot depend upon the stdio library to let us do ungetc() for that many characters at once. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-08-01 08:22:53 +02:00			`skip_optional_lf();`
Implemented manual packfile switching in fast-import. To help importers which are dealing with massive amounts of data fast-import needs to be able to close the packfile it is currently writing to and open a new packfile for any additional data that will be received. A new 'checkpoint' command has been introduced which can be used by the frontend import process to force this to occur at any time. This may be useful to ensure a very long running import doesn't lose any work due to unexpected failures. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 12:35:41 +01:00			`}`

git-fast-import: rename cmd_() functions to parse_() There is a cmd_merge() function in fast-import that will conflict with builtin-merge's cmd_merge() function. To keep it consistent, rename all cmd_() function to parse_() Signed-off-by: Miklos Vajna <vmiklos@frugalware.org> Acked-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-05-16 00:35:56 +02:00			`static void parse_progress(void)`
Allow frontends to bidirectionally communicate with fast-import The existing checkpoint command is very useful to force fast-import to dump the branches out to disk so that standard Git tools can access them and the objects they refer to. However there was not a way to know when fast-import had finished executing the checkpoint and it was safe to read those refs. The progress command can be used to make fast-import output any message of the frontend's choosing to standard out. The frontend can scan for these messages using select() or poll() to monitor a pipe connected to the standard output of fast-import. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-08-01 16:23:08 +02:00			`{`
Rework strbuf API and semantics. The gory details are explained in strbuf.h. The change of semantics this patch enforces is that the embeded buffer has always a '\0' character after its last byte, to always make it a C-string. The offs-by-one changes are all related to that very change. A strbuf can be used to store byte arrays, or as an extended string library. The `buf' member can be passed to any C legacy string function, because strbuf operations always ensure there is a terminating \0 at the end of the buffer, not accounted in the `len' field of the structure. A strbuf can be used to generate a string/buffer whose final size is not really known, and then "strbuf_detach" can be used to get the built buffer, and keep the wrapping "strbuf" structure usable for further work again. Other interesting feature: strbuf_grow(sb, size) ensure that there is enough allocated space in `sb' to put `size' new octets of data in the buffer. It helps avoiding reallocating data for nothing when the problem the strbuf helps to solve has a known typical size. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-06 13:20:05 +02:00			`fwrite(command_buf.buf, 1, command_buf.len, stdout);`
Allow frontends to bidirectionally communicate with fast-import The existing checkpoint command is very useful to force fast-import to dump the branches out to disk so that standard Git tools can access them and the objects they refer to. However there was not a way to know when fast-import had finished executing the checkpoint and it was safe to read those refs. The progress command can be used to make fast-import output any message of the frontend's choosing to standard out. The frontend can scan for these messages using select() or poll() to monitor a pipe connected to the standard output of fast-import. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-08-01 16:23:08 +02:00			`fputc('\n', stdout);`
			`fflush(stdout);`
			`skip_optional_lf();`
			`}`
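Because `parse_progress` echoes the whole command line to stdout and flushes it, a frontend can pair `checkpoint` with `progress` to learn when the checkpoint has finished. The sketch below shows only the sending half, under the assumption that the frontend holds a `FILE *` connected to fast-import's stdin; `request_checkpoint` and its `token` parameter are hypothetical names.

```c
#include <stdio.h>

/*
 * Hypothetical frontend helper: ask for a checkpoint, then ask
 * fast-import to echo a marker line once it has processed that request.
 * Returns 0 on success, -1 if writing or flushing failed.
 */
static int request_checkpoint(FILE *to_fast_import, const char *token)
{
	if (fprintf(to_fast_import, "checkpoint\nprogress %s\n", token) < 0)
		return -1;
	return fflush(to_fast_import) ? -1 : 0;
}
```

After calling `request_checkpoint(out, "checkpoint-1")`, the frontend would wait for the line `progress checkpoint-1` on fast-import's stdout before reading the refs with ordinary Git tools.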

fast-import: add (non-)relative-marks feature After specifying 'feature relative-marks' the paths specified with 'feature import-marks' and 'feature export-marks' are relative to an internal directory in the current repository. In git-fast-import this means that the paths are relative to the '.git/info/fast-import' directory. However, other importers may use a different location. Add 'feature non-relative-marks' to disable this behavior, this way it is possible to, for example, specify the import-marks location as relative, and the export-marks location as non-relative. Also add tests to verify this behavior. Cc: Daniel Barkalow <barkalow@iabervon.org> Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-12-04 18:07:00 +01:00			`static char* make_fast_import_path(const char *path)`
Allow fast-import frontends to reload the marks table I'm giving fast-import a lesson on how to reload the marks table using the same format it outputs with --export-marks. This way a frontend can reload the marks table from a prior import, making incremental imports less painful. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-03-08 00:07:26 +01:00			`{`
fast-import: add (non-)relative-marks feature After specifying 'feature relative-marks' the paths specified with 'feature import-marks' and 'feature export-marks' are relative to an internal directory in the current repository. In git-fast-import this means that the paths are relative to the '.git/info/fast-import' directory. However, other importers may use a different location. Add 'feature non-relative-marks' to disable this behavior, this way it is possible to, for example, specify the import-marks location as relative, and the export-marks location as non-relative. Also add tests to verify this behavior. Cc: Daniel Barkalow <barkalow@iabervon.org> Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-12-04 18:07:00 +01:00			`struct strbuf abs_path = STRBUF_INIT;`
Allow fast-import frontends to reload the marks table I'm giving fast-import a lesson on how to reload the marks table using the same format it outputs with --export-marks. This way a frontend can reload the marks table from a prior import, making incremental imports less painful. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-03-08 00:07:26 +01:00
fast-import: add (non-)relative-marks feature After specifying 'feature relative-marks' the paths specified with 'feature import-marks' and 'feature export-marks' are relative to an internal directory in the current repository. In git-fast-import this means that the paths are relative to the '.git/info/fast-import' directory. However, other importers may use a different location. Add 'feature non-relative-marks' to disable this behavior, this way it is possible to, for example, specify the import-marks location as relative, and the export-marks location as non-relative. Also add tests to verify this behavior. Cc: Daniel Barkalow <barkalow@iabervon.org> Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-12-04 18:07:00 +01:00			`if (!relative_marks_paths || is_absolute_path(path))`
			`return xstrdup(path);`
			`strbuf_addf(&abs_path, "%s/info/fast-import/%s", get_git_dir(), path);`
			`return strbuf_detach(&abs_path, NULL);`
			`}`
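The path rule in `make_fast_import_path` can be tried outside of Git with plain libc. This is a rough sketch, not the Git implementation: `resolve_marks_path` is a hypothetical name, the Git directory is passed in as a parameter instead of coming from `get_git_dir()`, and "absolute" is approximated as "starts with `/`", which is an assumption that only holds on POSIX-style paths.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical stand-in for make_fast_import_path().
 * Returns a newly allocated string the caller must free(), or NULL if
 * allocation fails.
 */
static char *resolve_marks_path(const char *git_dir, const char *path,
				int relative_marks)
{
	size_t len;
	char *out;

	/* Absolute paths, or relative-marks disabled: use the path as given. */
	if (!relative_marks || path[0] == '/') {
		out = malloc(strlen(path) + 1);
		if (out)
			strcpy(out, path);
		return out;
	}

	/* Otherwise anchor the path under <git_dir>/info/fast-import/. */
	len = strlen(git_dir) + strlen("/info/fast-import/") + strlen(path) + 1;
	out = malloc(len);
	if (out)
		snprintf(out, len, "%s/info/fast-import/%s", git_dir, path);
	return out;
}
```

With `relative_marks` set, `resolve_marks_path(".git", "marks", 1)` yields `.git/info/fast-import/marks`, while an absolute path such as `/tmp/marks` is returned unchanged.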

fast-import: Introduce --import-marks-if-exists When a frontend uses a marks file to ensure its state persists between runs, it may represent "clean slate" when bootstrapping with "no marks yet". In such a case, feeding the last state with --import-marks and saving the state after the current run with --export-marks would be a natural thing to do. The --import-marks option however errors out when the specified marks file doesn't exist; this makes bootstrapping a bit difficult. The location of the marks file becomes backend-dependent when --relative-marks is in effect, and the frontend cannot check for the existence of the file in such a case. The --import-marks-if-exists option does the same thing as --import-marks but does not flag an error if the named file does not exist yet to help these frontends. Helped-by: Junio C Hamano <gitster@pobox.com> Helped-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-01-15 07:31:46 +01:00			`static void option_import_marks(const char *marks,`
			`int from_stream, int ignore_missing)`
Allow fast-import frontends to reload the marks table I'm giving fast-import a lesson on how to reload the marks table using the same format it outputs with --export-marks. This way a frontend can reload the marks table from a prior import, making incremental imports less painful. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-03-08 00:07:26 +01:00			`{`
fast-import: allow for multiple --import-marks= arguments The --import-marks= option may be specified multiple times on the commandline and should result in all marks being read in. Only one import-marks feature may be specified in the stream, which is overriden by any --import-marks= commandline options. If one wishes to specify import-marks files in addition to the one specified in the stream, it is easy to repeat the stream option as a --import-marks= commandline option. Also verify this behavior with tests. Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-12-04 18:06:59 +01:00			`if (import_marks_file) {`
			`if (from_stream)`
			`die("Only one import-marks command allowed per stream");`

			`/* read previous mark file */`
			`if (!import_marks_file_from_stream)`
			`read_marks();`
Allow fast-import frontends to reload the marks table I'm giving fast-import a lesson on how to reload the marks table using the same format it outputs with --export-marks. This way a frontend can reload the marks table from a prior import, making incremental imports less painful. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-03-08 00:07:26 +01:00			`}`
fast-import: allow for multiple --import-marks= arguments The --import-marks= option may be specified multiple times on the commandline and should result in all marks being read in. Only one import-marks feature may be specified in the stream, which is overriden by any --import-marks= commandline options. If one wishes to specify import-marks files in addition to the one specified in the stream, it is easy to repeat the stream option as a --import-marks= commandline option. Also verify this behavior with tests. Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-12-04 18:06:59 +01:00
fast-import: add (non-)relative-marks feature After specifying 'feature relative-marks' the paths specified with 'feature import-marks' and 'feature export-marks' are relative to an internal directory in the current repository. In git-fast-import this means that the paths are relative to the '.git/info/fast-import' directory. However, other importers may use a different location. Add 'feature non-relative-marks' to disable this behavior, this way it is possible to, for example, specify the import-marks location as relative, and the export-marks location as non-relative. Also add tests to verify this behavior. Cc: Daniel Barkalow <barkalow@iabervon.org> Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-12-04 18:07:00 +01:00			`import_marks_file = make_fast_import_path(marks);`
fast-import: always create marks_file directories CC: "Shawn O. Pearce" <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-03-29 18:48:25 +02:00			`safe_create_leading_directories_const(import_marks_file);`
fast-import: allow for multiple --import-marks= arguments The --import-marks= option may be specified multiple times on the commandline and should result in all marks being read in. Only one import-marks feature may be specified in the stream, which is overriden by any --import-marks= commandline options. If one wishes to specify import-marks files in addition to the one specified in the stream, it is easy to repeat the stream option as a --import-marks= commandline option. Also verify this behavior with tests. Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-12-04 18:06:59 +01:00			`import_marks_file_from_stream = from_stream;`
fast-import: Introduce --import-marks-if-exists When a frontend uses a marks file to ensure its state persists between runs, it may represent "clean slate" when bootstrapping with "no marks yet". In such a case, feeding the last state with --import-marks and saving the state after the current run with --export-marks would be a natural thing to do. The --import-marks option however errors out when the specified marks file doesn't exist; this makes bootstrapping a bit difficult. The location of the marks file becomes backend-dependent when --relative-marks is in effect, and the frontend cannot check for the existence of the file in such a case. The --import-marks-if-exists option does the same thing as --import-marks but does not flag an error if the named file does not exist yet to help these frontends. Helped-by: Junio C Hamano <gitster@pobox.com> Helped-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-01-15 07:31:46 +01:00			`import_marks_file_ignore_missing = ignore_missing;`
Allow fast-import frontends to reload the marks table I'm giving fast-import a lesson on how to reload the marks table using the same format it outputs with --export-marks. This way a frontend can reload the marks table from a prior import, making incremental imports less painful. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-03-08 00:07:26 +01:00			`}`

fast-import: put option parsing code in separate functions Putting the options in their own functions increases readability of the option parsing block and makes it easier to reuse the option parsing code later on. Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-12-04 18:06:54 +01:00			`static void option_date_format(const char *fmt)`
			`{`
			`if (!strcmp(fmt, "raw"))`
			`whenspec = WHENSPEC_RAW;`
			`else if (!strcmp(fmt, "rfc2822"))`
			`whenspec = WHENSPEC_RFC2822;`
			`else if (!strcmp(fmt, "now"))`
			`whenspec = WHENSPEC_NOW;`
			`else`
			`die("unknown --date-format argument %s", fmt);`
			`}`
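The `strcmp` chain in `option_date_format` maps three fixed names to enum values. One possible alternative, shown here only as a sketch, is a small lookup table that reports failure to the caller instead of calling `die()`. The enum `date_format_sketch` and the function `lookup_date_format` are hypothetical and merely mirror the three names accepted above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical mirror of the three accepted --date-format values. */
enum date_format_sketch { SKETCH_RAW, SKETCH_RFC2822, SKETCH_NOW };

/* Returns 0 and stores the value on a match, -1 for an unknown name. */
static int lookup_date_format(const char *fmt, enum date_format_sketch *out)
{
	static const struct {
		const char *name;
		enum date_format_sketch value;
	} table[] = {
		{ "raw", SKETCH_RAW },
		{ "rfc2822", SKETCH_RFC2822 },
		{ "now", SKETCH_NOW },
	};
	size_t i;

	for (i = 0; i < sizeof(table) / sizeof(table[0]); i++) {
		if (!strcmp(fmt, table[i].name)) {
			*out = table[i].value;
			return 0;
		}
	}
	return -1;
}
```

For three entries the original chain is just as readable; a table mainly pays off if the list of formats grows.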

fast-import: stricter parsing of integer options Check the result from strtoul to avoid accepting arguments like --depth=-1 and --active-branches=foo,bar,baz. Requested-by: Ramkumar Ramachandra <artagnon@gmail.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-11-28 20:42:46 +01:00			`static unsigned long ulong_arg(const char *option, const char *arg)`
			`{`
			`char *endptr;`
			`unsigned long rv = strtoul(arg, &endptr, 0);`
			`if (strchr(arg, '-') || endptr == arg || *endptr)`
			`die("%s: argument must be a non-negative integer", option);`
			`return rv;`
			`}`
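The three conditions in `ulong_arg` each reject a different kind of bad input: a minus sign anywhere (which `strtoul` would otherwise accept and wrap around), no digits at all, and trailing characters after the number. The standalone sketch below applies the same checks but returns an error code instead of calling `die()`; `parse_ulong_strict` is a hypothetical name. Like the original, it does not check `errno` for overflow.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical variant of ulong_arg(): returns 0 and stores the value
 * on success, -1 if arg is not a plain non-negative integer.
 * Base 0 means "0x" and leading-"0" prefixes select hex and octal.
 */
static int parse_ulong_strict(const char *arg, unsigned long *out)
{
	char *endptr;
	unsigned long rv = strtoul(arg, &endptr, 0);

	if (strchr(arg, '-') ||   /* strtoul() would negate and wrap */
	    endptr == arg ||      /* no digits were consumed */
	    *endptr)              /* trailing garbage, e.g. "foo,bar" */
		return -1;
	*out = rv;
	return 0;
}
```

This is why `--depth=-1` and `--active-branches=foo,bar,baz`, the two cases named in the commit message, are both rejected.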

fast-import: put option parsing code in separate functions Putting the options in their own functions increases readability of the option parsing block and makes it easier to reuse the option parsing code later on. Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-12-04 18:06:54 +01:00			`static void option_depth(const char *depth)`
			`{`
fast-import: stricter parsing of integer options Check the result from strtoul to avoid accepting arguments like --depth=-1 and --active-branches=foo,bar,baz. Requested-by: Ramkumar Ramachandra <artagnon@gmail.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-11-28 20:42:46 +01:00			`max_depth = ulong_arg("--depth", depth);`
fast-import: put option parsing code in separate functions Putting the options in their own functions increases readability of the option parsing block and makes it easier to reuse the option parsing code later on. Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-12-04 18:06:54 +01:00			`if (max_depth > MAX_DEPTH)`
			`die("--depth cannot exceed %u", MAX_DEPTH);`
			`}`

			`static void option_active_branches(const char *branches)`
			`{`
fast-import: stricter parsing of integer options Check the result from strtoul to avoid accepting arguments like --depth=-1 and --active-branches=foo,bar,baz. Requested-by: Ramkumar Ramachandra <artagnon@gmail.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-11-28 20:42:46 +01:00			`max_active_branches = ulong_arg("--active-branches", branches);`
fast-import: put option parsing code in separate functions Putting the options in their own functions increases readability of the option parsing block and makes it easier to reuse the option parsing code later on. Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-12-04 18:06:54 +01:00			`}`

			`static void option_export_marks(const char *marks)`
			`{`
fast-import: add (non-)relative-marks feature After specifying 'feature relative-marks' the paths specified with 'feature import-marks' and 'feature export-marks' are relative to an internal directory in the current repository. In git-fast-import this means that the paths are relative to the '.git/info/fast-import' directory. However, other importers may use a different location. Add 'feature non-relative-marks' to disable this behavior, this way it is possible to, for example, specify the import-marks location as relative, and the export-marks location as non-relative. Also add tests to verify this behavior. Cc: Daniel Barkalow <barkalow@iabervon.org> Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-12-04 18:07:00 +01:00			`export_marks_file = make_fast_import_path(marks);`
fast-import: always create marks_file directories CC: "Shawn O. Pearce" <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-03-29 18:48:25 +02:00			`safe_create_leading_directories_const(export_marks_file);`
fast-import: put option parsing code in separate functions Putting the options in their own functions increases readability of the option parsing block and makes it easier to reuse the option parsing code later on. Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-12-04 18:06:54 +01:00			`}`

fast-import: let importers retrieve blobs New objects written by fast-import are not available immediately. Until a checkpoint has been started and finishes writing the pack index, any new blobs will not be accessible using standard git tools. So introduce a new way to access them: a "cat-blob" command in the command stream requests for fast-import to print a blob to stdout or a file descriptor specified by the argument to --cat-blob-fd. The value for cat-blob-fd cannot be specified in the stream because that would be a layering violation: the decision of where to direct a stream has to be made when fast-import is started anyway, so we might as well make the stream format is independent of that detail. Output uses the same format as "git cat-file --batch". Thanks to Sverre Rabbelier and Sam Vilain for guidance in designing the protocol. Based-on-patch-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: David Barr <david.barr@cordelta.com> Acked-by: Ramkumar Ramachandra <artagnon@gmail.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-11-28 20:45:01 +01:00			`static void option_cat_blob_fd(const char *fd)`
			`{`
			`unsigned long n = ulong_arg("--cat-blob-fd", fd);`
			`if (n > (unsigned long) INT_MAX)`
			`die("--cat-blob-fd cannot exceed %d", INT_MAX);`
			`cat_blob_fd = (int) n;`
			`}`
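
The "stricter parsing of integer options" change quoted above rejects arguments such as `--depth=-1` and `--active-branches=foo,bar,baz` by checking the result of `strtoul`. The body of `ulong_arg()` is not shown in this part of the file, so the following is only a minimal sketch of that kind of check; `parse_ulong_strict` and `parse_fd_strict` are hypothetical names used for illustration, not functions from fast-import.c.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not the ulong_arg() used above): parse a decimal
 * string strictly. Returns 0 and stores the value in *out on success;
 * returns -1 for empty input, a sign, trailing junk, or overflow.
 */
static int parse_ulong_strict(const char *arg, unsigned long *out)
{
	char *end;
	unsigned long v;

	/* strtoul() accepts "-1" and leading blanks, so insist on a digit. */
	if (*arg < '0' || *arg > '9')
		return -1;
	errno = 0;
	v = strtoul(arg, &end, 10);
	if (errno == ERANGE || *end != '\0')
		return -1;
	*out = v;
	return 0;
}

/* Hypothetical: narrow to an int descriptor, mirroring the INT_MAX check. */
static int parse_fd_strict(const char *arg, int *fd)
{
	unsigned long v;

	if (parse_ulong_strict(arg, &v) || v > (unsigned long) INT_MAX)
		return -1;
	*fd = (int) v;
	return 0;
}
```

Under these assumptions, `parse_fd_strict("3", &fd)` succeeds, while `"-1"`, `"7x"` and `"foo,bar,baz"` are all rejected instead of being silently truncated or wrapped.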

fast-import: put option parsing code in separate functions Putting the options in their own functions increases readability of the option parsing block and makes it easier to reuse the option parsing code later on. Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-12-04 18:06:54 +01:00			`static void option_export_pack_edges(const char *edges)`
			`{`
			`if (pack_edges)`
			`fclose(pack_edges);`
			`pack_edges = fopen(edges, "a");`
			`if (!pack_edges)`
			`die_errno("Cannot open '%s'", edges);`
			`}`

fast-import: add option command This allows the frontend to specify any of the supported options as long as no non-option command has been given. This way the user does not have to include any frontend-specific options, but instead she can rely on the frontend to tell fast-import what it needs. Also factor out parsing of argv and have it execute when we reach the first non-option command, or after all commands have been read and no non-option command has been encountered. Non-git options are ignored, unrecognised options result in an error. Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-12-04 18:06:57 +01:00			`static int parse_one_option(const char *option)`
fast-import: put option parsing code in separate functions Putting the options in their own functions increases readability of the option parsing block and makes it easier to reuse the option parsing code later on. Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-12-04 18:06:54 +01:00			`{`
fast-import: add option command This allows the frontend to specify any of the supported options as long as no non-option command has been given. This way the user does not have to include any frontend-specific options, but instead she can rely on the frontend to tell fast-import what it needs. Also factor out parsing of argv and have it execute when we reach the first non-option command, or after all commands have been read and no non-option command has been encountered. Non-git options are ignored, unrecognised options result in an error. Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-12-04 18:06:57 +01:00			`if (!prefixcmp(option, "max-pack-size=")) {`
fast-import: count --max-pack-size in bytes Similar in spirit to 07cf0f2 (make --max-pack-size argument to 'git pack-object' count in bytes, 2010-02-03) which made the option by the same name to pack-objects, this counts the pack size limit in bytes. In order not to cause havoc with people used to the previous megabyte scale an integer smaller than 8192 is interpreted in megabytes but the user gets a warning. Also a minimum size of 1 MiB is enforced to avoid an explosion of pack files. Signed-off-by: Junio C Hamano <gitster@pobox.com> Acked-by: Shawn O. Pearce <spearce@spearce.org> Acked-by: Nicolas Pitre <nico@fluxnic.net> 2010-02-04 20:10:44 +01:00			`unsigned long v;`
			`if (!git_parse_ulong(option + 14, &v))`
			`return 0;`
			`if (v < 8192) {`
			`warning("max-pack-size is now in bytes, assuming --max-pack-size=%lum", v);`
			`v *= 1024 * 1024;`
			`} else if (v < 1024 * 1024) {`
			`warning("minimum max-pack-size is 1 MiB");`
			`v = 1024 * 1024;`
			`}`
			`max_packsize = v;`
Merge branch 'sp/maint-fast-import-large-blob' into sp/fast-import-large-blob * sp/maint-fast-import-large-blob: fast-import: Stream very large blobs directly to pack bash: don't offer remote transport helpers as subcommands Conflicts: fast-import.c 2010-02-01 21:41:31 +01:00			`} else if (!prefixcmp(option, "big-file-threshold=")) {`
fast-import.c: Fix big-file-threshold parsing bug Manual merge made at 844ad3d (Merge branch 'sp/maint-fast-import-large-blob' into sp/fast-import-large-blob, 2010-02-01) did not correctly reflect the change of unit in which this variable's value is counted from its previous version. Now it counts in bytes, not in megabytes. Signed-off-by: Junio C Hamano <gitster@pobox.com> Acked-by: Shawn O. Pearce <spearce@spearce.org> 2010-02-04 03:27:08 +01:00			`unsigned long v;`
			`if (!git_parse_ulong(option + 19, &v))`
			`return 0;`
			`big_file_threshold = v;`
fast-import: put option parsing code in separate functions Putting the options in their own functions increases readability of the option parsing block and makes it easier to reuse the option parsing code later on. Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-12-04 18:06:54 +01:00			`} else if (!prefixcmp(option, "depth=")) {`
			`option_depth(option + 6);`
			`} else if (!prefixcmp(option, "active-branches=")) {`
			`option_active_branches(option + 16);`
			`} else if (!prefixcmp(option, "export-pack-edges=")) {`
			`option_export_pack_edges(option + 18);`
			`} else if (!prefixcmp(option, "quiet")) {`
			`show_stats = 0;`
			`} else if (!prefixcmp(option, "stats")) {`
			`show_stats = 1;`
			`} else {`
fast-import: add option command This allows the frontend to specify any of the supported options as long as no non-option command has been given. This way the user does not have to include any frontend-specific options, but instead she can rely on the frontend to tell fast-import what it needs. Also factor out parsing of argv and have it execute when we reach the first non-option command, or after all commands have been read and no non-option command has been encountered. Non-git options are ignored, unrecognised options result in an error. Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-12-04 18:06:57 +01:00			`return 0;`
fast-import: put option parsing code in separate functions Putting the options in their own functions increases readability of the option parsing block and makes it easier to reuse the option parsing code later on. Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-12-04 18:06:54 +01:00			`}`
fast-import: add option command This allows the frontend to specify any of the supported options as long as no non-option command has been given. This way the user does not have to include any frontend-specific options, but instead she can rely on the frontend to tell fast-import what it needs. Also factor out parsing of argv and have it execute when we reach the first non-option command, or after all commands have been read and no non-option command has been encountered. Non-git options are ignored, unrecognised options result in an error. Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-12-04 18:06:57 +01:00
			`return 1;`
fast-import: put option parsing code in separate functions Putting the options in their own functions increases readability of the option parsing block and makes it easier to reuse the option parsing code later on. Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-12-04 18:06:54 +01:00			`}`
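
The unit rule applied to `max-pack-size=` above can be isolated as a pure function. This is a sketch only: `normalize_max_pack_size` is a hypothetical name, and the warnings emitted by the real code are left out.

```c
/*
 * Hypothetical stand-alone version of the max-pack-size unit rule:
 *   - values below 8192 are taken to be on the old megabyte scale,
 *   - other values below 1 MiB are raised to the 1 MiB minimum,
 *   - everything else is already a byte count and passes through.
 */
static unsigned long normalize_max_pack_size(unsigned long v)
{
	const unsigned long mib = 1024UL * 1024UL;

	if (v < 8192)
		return v * mib;	/* legacy megabyte scale */
	if (v < mib)
		return mib;	/* enforce the 1 MiB minimum */
	return v;
}
```

For example, `normalize_max_pack_size(100)` yields 104857600 (100 MiB), `normalize_max_pack_size(500000)` is clamped to 1048576, and `normalize_max_pack_size(2000000)` is returned unchanged.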

fast-import: allow for multiple --import-marks= arguments The --import-marks= option may be specified multiple times on the commandline and should result in all marks being read in. Only one import-marks feature may be specified in the stream, which is overriden by any --import-marks= commandline options. If one wishes to specify import-marks files in addition to the one specified in the stream, it is easy to repeat the stream option as a --import-marks= commandline option. Also verify this behavior with tests. Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-12-04 18:06:59 +01:00			`static int parse_one_feature(const char *feature, int from_stream)`
fast-import: add feature command This allows the fronted to require a specific feature to be supported by the backend, or abort. Also add support for four initial feature, date-format=, force=, import-marks=, export-marks=. Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-12-04 18:06:56 +01:00			`{`
			`if (!prefixcmp(feature, "date-format=")) {`
			`option_date_format(feature + 12);`
			`} else if (!prefixcmp(feature, "import-marks=")) {`
fast-import: Introduce --import-marks-if-exists When a frontend uses a marks file to ensure its state persists between runs, it may represent "clean slate" when bootstrapping with "no marks yet". In such a case, feeding the last state with --import-marks and saving the state after the current run with --export-marks would be a natural thing to do. The --import-marks option however errors out when the specified marks file doesn't exist; this makes bootstrapping a bit difficult. The location of the marks file becomes backend-dependent when --relative-marks is in effect, and the frontend cannot check for the existence of the file in such a case. The --import-marks-if-exists option does the same thing as --import-marks but does not flag an error if the named file does not exist yet to help these frontends. Helped-by: Junio C Hamano <gitster@pobox.com> Helped-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-01-15 07:31:46 +01:00			`option_import_marks(feature + 13, from_stream, 0);`
			`} else if (!prefixcmp(feature, "import-marks-if-exists=")) {`
			`option_import_marks(feature + strlen("import-marks-if-exists="),`
			`from_stream, 1);`
fast-import: add feature command This allows the fronted to require a specific feature to be supported by the backend, or abort. Also add support for four initial feature, date-format=, force=, import-marks=, export-marks=. Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-12-04 18:06:56 +01:00			`} else if (!prefixcmp(feature, "export-marks=")) {`
			`option_export_marks(feature + 13);`
fast-import: let importers retrieve blobs New objects written by fast-import are not available immediately. Until a checkpoint has been started and finishes writing the pack index, any new blobs will not be accessible using standard git tools. So introduce a new way to access them: a "cat-blob" command in the command stream requests for fast-import to print a blob to stdout or a file descriptor specified by the argument to --cat-blob-fd. The value for cat-blob-fd cannot be specified in the stream because that would be a layering violation: the decision of where to direct a stream has to be made when fast-import is started anyway, so we might as well make the stream format is independent of that detail. Output uses the same format as "git cat-file --batch". Thanks to Sverre Rabbelier and Sam Vilain for guidance in designing the protocol. Based-on-patch-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: David Barr <david.barr@cordelta.com> Acked-by: Ramkumar Ramachandra <artagnon@gmail.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-11-28 20:45:01 +01:00			`} else if (!strcmp(feature, "cat-blob")) {`
			`; /* Don't die - this feature is supported */`
fast-import: fix option parser for no-arg options While refactoring the options parser in bc3c79a (fast-import: add (non-)relative-marks feature, 2009-12-04), it was made too lenient for options that take no argument, fix that. Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-05-05 20:56:00 +02:00			`} else if (!strcmp(feature, "relative-marks")) {`
fast-import: add (non-)relative-marks feature After specifying 'feature relative-marks' the paths specified with 'feature import-marks' and 'feature export-marks' are relative to an internal directory in the current repository. In git-fast-import this means that the paths are relative to the '.git/info/fast-import' directory. However, other importers may use a different location. Add 'feature non-relative-marks' to disable this behavior, this way it is possible to, for example, specify the import-marks location as relative, and the export-marks location as non-relative. Also add tests to verify this behavior. Cc: Daniel Barkalow <barkalow@iabervon.org> Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-12-04 18:07:00 +01:00			`relative_marks_paths = 1;`
fast-import: fix option parser for no-arg options While refactoring the options parser in bc3c79a (fast-import: add (non-)relative-marks feature, 2009-12-04), it was made too lenient for options that take no argument, fix that. Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-05-05 20:56:00 +02:00			`} else if (!strcmp(feature, "no-relative-marks")) {`
fast-import: add (non-)relative-marks feature After specifying 'feature relative-marks' the paths specified with 'feature import-marks' and 'feature export-marks' are relative to an internal directory in the current repository. In git-fast-import this means that the paths are relative to the '.git/info/fast-import' directory. However, other importers may use a different location. Add 'feature non-relative-marks' to disable this behavior, this way it is possible to, for example, specify the import-marks location as relative, and the export-marks location as non-relative. Also add tests to verify this behavior. Cc: Daniel Barkalow <barkalow@iabervon.org> Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-12-04 18:07:00 +01:00			`relative_marks_paths = 0;`
fast-import: introduce 'done' command Add a 'done' command that causes fast-import to stop reading from the stream and exit. If the new --done command line flag was passed on the command line (or a "feature done" declaration included at the start of the stream), make the 'done' command mandatory. So "git fast-import --done"'s input format will be prefix-free, making errors easier to detect when they show up as early termination at some convenient time of the upstream of a pipe writing to fast-import. Another possible application of the 'done' command would to be allow a fast-import stream that is only a small part of a larger encapsulating stream to be easily parsed, leaving the file offset after the "done\n" so the other application can pick up from there. This patch does not teach fast-import to do that --- fast-import still uses buffered input (stdio). Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com> Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-07-16 15:03:32 +02:00			`} else if (!strcmp(feature, "done")) {`
			`require_explicit_termination = 1;`
fast-import: fix option parser for no-arg options While refactoring the options parser in bc3c79a (fast-import: add (non-)relative-marks feature, 2009-12-04), it was made too lenient for options that take no argument, fix that. Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-05-05 20:56:00 +02:00			`} else if (!strcmp(feature, "force")) {`
fast-import: add feature command This allows the fronted to require a specific feature to be supported by the backend, or abort. Also add support for four initial feature, date-format=, force=, import-marks=, export-marks=. Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-12-04 18:06:56 +01:00			`force_update = 1;`
fast-import: add 'ls' command Lazy fast-import frontend authors that want to rely on the backend to keep track of the content of the imported trees _almost_ have what they need in the 'cat-blob' command (v1.7.4-rc0~30^2~3, 2010-11-28). But it is not quite enough, since (1) cat-blob can be used to retrieve the content of files, but not their mode, and (2) using cat-blob requires the frontend to keep track of a name (mark number or object id) for each blob to be retrieved Introduce an 'ls' command to complement cat-blob and take care of the remaining needs. The 'ls' command finds what is at a given path within a given tree-ish (tag, commit, or tree): 'ls' SP <dataref> SP <path> LF or in fast-import's active commit: 'ls' SP <path> LF The response is a single line sent through the cat-blob channel, imitating ls-tree output. So for example: FE> ls :1 Documentation gfi> 040000 tree 9e6c2b599341d28a2a375f8207507e0a2a627fe9 Documentation FE> ls 9e6c2b599341d28a2a375f8207507e0a2a627fe9 git-fast-import.txt gfi> 100644 blob 4f92954396e3f0f97e75b6838a5635b583708870 git-fast-import.txt FE> ls :1 RelNotes gfi> 120000 blob b942e499449d97aeb50c73ca2bdc1c6e6d528743 RelNotes FE> cat-blob b942e499449d97aeb50c73ca2bdc1c6e6d528743 gfi> b942e499449d97aeb50c73ca2bdc1c6e6d528743 blob 32 gfi> Documentation/RelNotes/1.7.4.txt The most interesting parts of the reply are the first word, which is a 6-digit octal mode (regular file, executable, symlink, directory, or submodule), and the part from the second space to the tab, which is a <dataref> that can be used in later cat-blob, ls, and filemodify (M) commands to refer to the content (blob, tree, or commit) at that path. If there is nothing there, the response is "missing some/path". The intent is for this command to be used to read files from the active commit, so a frontend can apply patches to them, and to copy files and directories from previous revisions. 
For example, proposed updates to svn-fe use this command in place of its internal representation of the repository directory structure. This simplifies the frontend a great deal and means support for resuming an import in a separate fast-import run (i.e., incremental import) is basically free. Signed-off-by: David Barr <david.barr@cordelta.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Improved-by: Junio C Hamano <gitster@pobox.com> Improved-by: Sverre Rabbelier <srabbelier@gmail.com> 2010-12-02 11:40:20 +01:00			`} else if (!strcmp(feature, "notes") || !strcmp(feature, "ls")) {`
fast-import: introduce "feature notes" command Here is a 'feature' command for streams to use to require support for the notemodify (N) command. When the 'feature' facility was introduced (v1.7.0-rc0~95^2~4, 2009-12-04), the notes import feature was old news (v1.6.6-rc0~21^2~8, 2009-10-09) and it was not obvious it deserved to be a named feature. But now that is clear, since all major non-git fast-import backends lack support for it. Details: on git version with this patch applied, any "feature notes" command in the features/options section at the beginning of a stream will be treated as a no-op. On fast-import implementations without the feature (and older git versions), the command instead errors out with a message like This version of fast-import does not support feature notes. So by declaring use of notes at the beginning of a stream, frontends can avoid wasting time and other resources when the backend does not support notes. (This would be especially important for backends that do not support rewinding history after a botched import.) Improved-by: Thomas Rast <trast@student.ethz.ch> Improved-by: Sverre Rabbelier <srabbelier@gmail.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-02-09 23:43:57 +01:00			`; /* do nothing; we have the feature */`
fast-import: add feature command This allows the fronted to require a specific feature to be supported by the backend, or abort. Also add support for four initial feature, date-format=, force=, import-marks=, export-marks=. Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-12-04 18:06:56 +01:00			`} else {`
			`return 0;`
			`}`

			`return 1;`
			`}`
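
`parse_one_feature()` mixes two kinds of comparison: prefix matching for features that carry a value (`export-marks=<file>`) and exact matching for features that take no argument (`force`, `done`). The sketch below illustrates why the "fix option parser for no-arg options" change matters. `has_prefix`, `is_force_lenient` and `is_force_strict` are hypothetical helpers written for this example; `has_prefix` only approximates how `prefixcmp()` is used above.

```c
#include <string.h>

/* Hypothetical stand-in for a prefix test: 1 when str starts with prefix. */
static int has_prefix(const char *str, const char *prefix)
{
	return strncmp(str, prefix, strlen(prefix)) == 0;
}

/* Lenient: anything that merely starts with "force" is accepted. */
static int is_force_lenient(const char *feature)
{
	return has_prefix(feature, "force");
}

/* Strict: only the exact word "force" is accepted. */
static int is_force_strict(const char *feature)
{
	return strcmp(feature, "force") == 0;
}
```

With these definitions, `is_force_lenient("forceful")` returns 1 while `is_force_strict("forceful")` returns 0, so an unknown feature falls through to the error path. Prefix matching remains the right tool when a value follows, as in `has_prefix("export-marks=out.marks", "export-marks=")`.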

			`static void parse_feature(void)`
			`{`
			`char *feature = command_buf.buf + 8;`

			`if (seen_data_command)`
			`die("Got feature command '%s' after data command", feature);`

fast-import: allow for multiple --import-marks= arguments The --import-marks= option may be specified multiple times on the commandline and should result in all marks being read in. Only one import-marks feature may be specified in the stream, which is overriden by any --import-marks= commandline options. If one wishes to specify import-marks files in addition to the one specified in the stream, it is easy to repeat the stream option as a --import-marks= commandline option. Also verify this behavior with tests. Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-12-04 18:06:59 +01:00			`if (parse_one_feature(feature, 1))`
fast-import: add feature command This allows the fronted to require a specific feature to be supported by the backend, or abort. Also add support for four initial feature, date-format=, force=, import-marks=, export-marks=. Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-12-04 18:06:56 +01:00			`return;`

			`die("This version of fast-import does not support feature %s.", feature);`
			`}`

fast-import: add option command This allows the frontend to specify any of the supported options as long as no non-option command has been given. This way the user does not have to include any frontend-specific options, but instead she can rely on the frontend to tell fast-import what it needs. Also factor out parsing of argv and have it execute when we reach the first non-option command, or after all commands have been read and no non-option command has been encountered. Non-git options are ignored, unrecognised options result in an error. Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-12-04 18:06:57 +01:00			`static void parse_option(void)`
			`{`
			`char *option = command_buf.buf + 11;`

			`if (seen_data_command)`
			`die("Got option command '%s' after data command", option);`

			`if (parse_one_option(option))`
			`return;`

			`die("This version of fast-import does not support option: %s", option);`
Allow fast-import frontends to reload the marks table I'm giving fast-import a lesson on how to reload the marks table using the same format it outputs with --export-marks. This way a frontend can reload the marks table from a prior import, making incremental imports less painful. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-03-08 00:07:26 +01:00			`}`

Provide git_config with a callback-data parameter git_config() only had a function parameter, but no callback data parameter. This assumes that all callback functions only modify global variables. With this patch, every callback gets a void * parameter, and it is hoped that this will help the libification effort. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-05-14 19:46:53 +02:00			`static int git_pack_config(const char *k, const char *v, void *cb)`
Teach fast-import to honor pack.compression and pack.depth We now use the configured pack.compression and pack.depth values within fast-import, as like builtin-pack-objects fast-import is generating a packfile for consumption by the Git tools. We use the same behavior as builtin-pack-objects does for these options, allowing core.compression to supply the default value for pack.compression. The default setting for pack.depth within fast-import is still 10 as users will generally repack fast-import generated packfiles by `repack -f`. A large delta depth within the fast-import packfile can significantly slow down such a later repack. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-01-21 05:36:54 +01:00			`{`
			`if (!strcmp(k, "pack.depth")) {`
			`max_depth = git_config_int(k, v);`
			`if (max_depth > MAX_DEPTH)`
			`max_depth = MAX_DEPTH;`
			`return 0;`
			`}`
			`if (!strcmp(k, "pack.compression")) {`
			`int level = git_config_int(k, v);`
			`if (level == -1)`
			`level = Z_DEFAULT_COMPRESSION;`
			`else if (level < 0 || level > Z_BEST_COMPRESSION)`
			`die("bad pack compression level %d", level);`
			`pack_compression_level = level;`
			`pack_compression_seen = 1;`
			`return 0;`
			`}`
fast-import: honor pack.indexversion and pack.packsizelimit config vars Signed-off-by: Nicolas Pitre <nico@fluxnic.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-02-17 20:05:55 +01:00			`if (!strcmp(k, "pack.indexversion")) {`
write_idx_file: introduce a struct to hold idx customization options Remove two globals, pack_idx_default version and pack_idx_off32_limit, and place them in a pack_idx_option structure. Allow callers to pass it to write_idx_file() as a parameter. Adjust all callers to the API change. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-02-26 00:43:25 +01:00			`pack_idx_opts.version = git_config_int(k, v);`
			`if (pack_idx_opts.version > 2)`
fast-import: honor pack.indexversion and pack.packsizelimit config vars Signed-off-by: Nicolas Pitre <nico@fluxnic.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-02-17 20:05:55 +01:00			`die("bad pack.indexversion=%"PRIu32,`
write_idx_file: introduce a struct to hold idx customization options Remove two globals, pack_idx_default version and pack_idx_off32_limit, and place them in a pack_idx_option structure. Allow callers to pass it to write_idx_file() as a parameter. Adjust all callers to the API change. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-02-26 00:43:25 +01:00			`pack_idx_opts.version);`
fast-import: honor pack.indexversion and pack.packsizelimit config vars Signed-off-by: Nicolas Pitre <nico@fluxnic.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-02-17 20:05:55 +01:00			`return 0;`
			`}`
			`if (!strcmp(k, "pack.packsizelimit")) {`
			`max_packsize = git_config_ulong(k, v);`
			`return 0;`
			`}`
Provide git_config with a callback-data parameter git_config() only had a function parameter, but no callback data parameter. This assumes that all callback functions only modify global variables. With this patch, every callback gets a void * parameter, and it is hoped that this will help the libification effort. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-05-14 19:46:53 +02:00			`return git_default_config(k, v, cb);`
Teach fast-import to honor pack.compression and pack.depth We now use the configured pack.compression and pack.depth values within fast-import, as like builtin-pack-objects fast-import is generating a packfile for consumption by the Git tools. We use the same behavior as builtin-pack-objects does for these options, allowing core.compression to supply the default value for pack.compression. The default setting for pack.depth within fast-import is still 10 as users will generally repack fast-import generated packfiles by `repack -f`. A large delta depth within the fast-import packfile can significantly slow down such a later repack. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-01-21 05:36:54 +01:00			`}`

Converted fast-import to accept standard command line parameters. The following command line options are now accepted before the pack name: --objects=n # replaces the object count after the pack name --depth=n # delta chain depth to use (default is 10) --active-branches=n # maximum number of branches to keep in memory Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-23 08:00:31 +02:00			`static const char fast_import_usage[] =`
Use angles for placeholders consistently Signed-off-by: Štěpán Němec <stepnem@gmail.com> Acked-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-10-08 19:31:15 +02:00			`"git fast-import [--date-format=<f>] [--max-pack-size=<n>] [--big-file-threshold=<n>] [--depth=<n>] [--active-branches=<n>] [--export-marks=<marks.file>]";`
Converted fast-import to accept standard command line parameters. The following command line options are now accepted before the pack name: --objects=n # replaces the object count after the pack name --depth=n # delta chain depth to use (default is 10) --active-branches=n # maximum number of branches to keep in memory Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-23 08:00:31 +02:00
fast-import: add option command This allows the frontend to specify any of the supported options as long as no non-option command has been given. This way the user does not have to include any frontend-specific options, but instead she can rely on the frontend to tell fast-import what it needs. Also factor out parsing of argv and have it execute when we reach the first non-option command, or after all commands have been read and no non-option command has been encountered. Non-git options are ignored, unrecognised options result in an error. Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-12-04 18:06:57 +01:00			`static void parse_argv(void)`
			`{`
			`unsigned int i;`

			`for (i = 1; i < global_argc; i++) {`
			`const char *a = global_argv[i];`

			`if (*a != '-' || !strcmp(a, "--"))`
			`break;`

			`if (parse_one_option(a + 2))`
			`continue;`

fast-import: allow for multiple --import-marks= arguments The --import-marks= option may be specified multiple times on the commandline and should result in all marks being read in. Only one import-marks feature may be specified in the stream, which is overriden by any --import-marks= commandline options. If one wishes to specify import-marks files in addition to the one specified in the stream, it is easy to repeat the stream option as a --import-marks= commandline option. Also verify this behavior with tests. Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-12-04 18:06:59 +01:00			`if (parse_one_feature(a + 2, 0))`
fast-import: add option command This allows the frontend to specify any of the supported options as long as no non-option command has been given. This way the user does not have to include any frontend-specific options, but instead she can rely on the frontend to tell fast-import what it needs. Also factor out parsing of argv and have it execute when we reach the first non-option command, or after all commands have been read and no non-option command has been encountered. Non-git options are ignored, unrecognised options result in an error. Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-12-04 18:06:57 +01:00			`continue;`

fast-import: let importers retrieve blobs New objects written by fast-import are not available immediately. Until a checkpoint has been started and finishes writing the pack index, any new blobs will not be accessible using standard git tools. So introduce a new way to access them: a "cat-blob" command in the command stream requests for fast-import to print a blob to stdout or a file descriptor specified by the argument to --cat-blob-fd. The value for cat-blob-fd cannot be specified in the stream because that would be a layering violation: the decision of where to direct a stream has to be made when fast-import is started anyway, so we might as well make the stream format is independent of that detail. Output uses the same format as "git cat-file --batch". Thanks to Sverre Rabbelier and Sam Vilain for guidance in designing the protocol. Based-on-patch-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: David Barr <david.barr@cordelta.com> Acked-by: Ramkumar Ramachandra <artagnon@gmail.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-11-28 20:45:01 +01:00			`if (!prefixcmp(a + 2, "cat-blob-fd=")) {`
			`option_cat_blob_fd(a + 2 + strlen("cat-blob-fd="));`
			`continue;`
			`}`

fast-import: add option command This allows the frontend to specify any of the supported options as long as no non-option command has been given. This way the user does not have to include any frontend-specific options, but instead she can rely on the frontend to tell fast-import what it needs. Also factor out parsing of argv and have it execute when we reach the first non-option command, or after all commands have been read and no non-option command has been encountered. Non-git options are ignored, unrecognised options result in an error. Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-12-04 18:06:57 +01:00			`die("unknown option %s", a);`
			`}`
			`if (i != global_argc)`
			`usage(fast_import_usage);`

			`seen_data_command = 1;`
			`if (import_marks_file)`
			`read_marks();`
			`}`
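
The option scan in `parse_argv()` above can be illustrated in isolation. The following is a minimal sketch, not the code from fast-import.c: `demo_scan_argv`, `demo_parse_option`, `demo_parse_feature`, `struct demo_settings` and the `depth=`/`quiet` names are all hypothetical stand-ins. It returns a status code where the listing calls `die()` or `usage()`, and it stops at any argument that does not begin with `--`, which is slightly stricter than the `*a != '-'` test in the listing.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Outcome codes for this sketch (hypothetical; the listing calls die()/usage()). */
enum scan_result { SCAN_OK, SCAN_UNKNOWN_OPTION, SCAN_LEFTOVER_ARGS };

/* Hypothetical settings filled in by the handlers below. */
struct demo_settings {
	int depth;
	int quiet;
};

/* Stand-in for parse_one_option(): returns 1 if it consumed `name`, else 0. */
static int demo_parse_option(struct demo_settings *s, const char *name)
{
	if (!strncmp(name, "depth=", 6)) {
		s->depth = atoi(name + 6);
		return 1;
	}
	return 0;
}

/* Stand-in for parse_one_feature(): same contract as above. */
static int demo_parse_feature(struct demo_settings *s, const char *name)
{
	if (!strcmp(name, "quiet")) {
		s->quiet = 1;
		return 1;
	}
	return 0;
}

/* Walk argv[1..], handing each "--name" to the handlers in order. */
enum scan_result demo_scan_argv(int argc, const char **argv,
				struct demo_settings *s)
{
	int i;

	assert(argc >= 1 && argv != NULL && s != NULL);

	for (i = 1; i < argc; i++) {
		const char *a = argv[i];

		/* Stop at the first non-option or at a bare "--". */
		if (strncmp(a, "--", 2) != 0 || a[2] == '\0')
			break;

		/* Skip the leading "--" and try each handler in turn. */
		if (demo_parse_option(s, a + 2))
			continue;
		if (demo_parse_feature(s, a + 2))
			continue;

		return SCAN_UNKNOWN_OPTION;
	}

	/* Anything left after the options is treated as a usage error. */
	return i == argc ? SCAN_OK : SCAN_LEFTOVER_ARGS;
}
```

Under these assumptions, calling `demo_scan_argv` with `{"prog", "--depth=5", "--quiet"}` fills in both settings and returns `SCAN_OK`, while `{"prog", "--bogus"}` yields `SCAN_UNKNOWN_OPTION`. Trying the handlers in a fixed order mirrors how the listing tries `parse_one_option()` before `parse_one_feature()`.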

Added automatic index generation to fast-import. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-06 19:51:39 +02:00			`int main(int argc, const char **argv)`
			`{`
fast-import: put option parsing code in separate functions Putting the options in their own functions increases readability of the option parsing block and makes it easier to reuse the option parsing code later on. Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-12-04 18:06:54 +01:00			`unsigned int i;`
Added automatic index generation to fast-import. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-06 19:51:39 +02:00
Add calls to git_extract_argv0_path() in programs that call git_config_* Programs that use git_config need to find the global configuration. When runtime prefix computation is enabled, this requires that git_extract_argv0_path() is called early in the program's main(). This commit adds the necessary calls. Signed-off-by: Steffen Prohaska <prohaska@zib.de> Acked-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-01-18 13:00:12 +01:00			`git_extract_argv0_path(argv[0]);`

i18n: add infrastructure for translating Git with gettext Change the skeleton implementation of i18n in Git to one that can show localized strings to users for our C, Shell and Perl programs using either GNU libintl or the Solaris gettext implementation. This new internationalization support is enabled by default. If gettext isn't available, or if Git is compiled with NO_GETTEXT=YesPlease, Git falls back on its current behavior of showing interface messages in English. When using the autoconf script we'll auto-detect if the gettext libraries are installed and act appropriately. This change is somewhat large because as well as adding a C, Shell and Perl i18n interface we're adding a lot of tests for them, and for those tests to work we need a skeleton PO file to actually test translations. A minimal Icelandic translation is included for this purpose. Icelandic includes multi-byte characters which makes it easy to test various edge cases, and it's a language I happen to understand. The rest of the commit message goes into detail about various sub-parts of this commit. = Installation Gettext .mo files will be installed and looked for in the standard $(prefix)/share/locale path. GIT_TEXTDOMAINDIR can also be set to override that, but that's only intended to be used to test Git itself. = Perl Perl code that's to be localized should use the new Git::I18n module. It imports a __ function into the caller's package by default. Instead of using the high level Locale::TextDomain interface I've opted to use the low-level (equivalent to the C interface) Locale::Messages module, which Locale::TextDomain itself uses. Locale::TextDomain does a lot of redundant work we don't need, and some of it would potentially introduce bugs. It tries to set the $TEXTDOMAIN based on package of the caller, and has its own hardcoded paths where it'll search for messages. I found it easier just to completely avoid it rather than try to circumvent its behavior. 
In any case, this is an issue wholly internal Git::I18N. Its guts can be changed later if that's deemed necessary. See <AANLkTilYD_NyIZMyj9dHtVk-ylVBfvyxpCC7982LWnVd@mail.gmail.com> for a further elaboration on this topic. = Shell Shell code that's to be localized should use the git-sh-i18n library. It's basically just a wrapper for the system's gettext.sh. If gettext.sh isn't available we'll fall back on gettext(1) if it's available. The latter is available without the former on Solaris, which has its own non-GNU gettext implementation. We also need to emulate eval_gettext() there. If neither are present we'll use a dumb printf(1) fall-through wrapper. = About libcharset.h and langinfo.h We use libcharset to query the character set of the current locale if it's available. I.e. we'll use it instead of nl_langinfo if HAVE_LIBCHARSET_H is set. The GNU gettext manual recommends using langinfo.h's nl_langinfo(CODESET) to acquire the current character set, but on systems that have libcharset.h's locale_charset() using the latter is either saner, or the only option on those systems. GNU and Solaris have a nl_langinfo(CODESET), FreeBSD can use either, but MinGW and some others need to use libcharset.h's locale_charset() instead. =Credits This patch is based on work by Jeff Epler <jepler@unpythonic.net> who did the initial Makefile / C work, and a lot of comments from the Git mailing list, including Jonathan Nieder, Jakub Narebski, Johannes Sixt, Erik Faye-Lund, Peter Krefting, Junio C Hamano, Thomas Rast and others. [jc: squashed a small Makefile fix from Ramsay] Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-11-18 00:14:42 +01:00			`git_setup_gettext();`

Show usage string for 'git fast-import -h' Let "git fast-import -h" (with no other arguments) print usage before exiting, even when run outside any repository. Cc: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-11-09 16:04:49 +01:00			`if (argc == 2 && !strcmp(argv[1], "-h"))`
			`usage(fast_import_usage);`

fast-import: exit with proper message if not a git dir git fast-import expects to be run from an existing (possibly empty) repository. It was dying with a suboptimal message if that wasn't the case. Signed-off-by: Jean-Luc Herren <jlh@gmx.ch> Acked-by: Shawn O. Pearce <spearce@spearce.org> 2008-02-28 23:29:54 +01:00			`setup_git_directory();`
write_idx_file: introduce a struct to hold idx customization options Remove two globals, pack_idx_default version and pack_idx_off32_limit, and place them in a pack_idx_option structure. Allow callers to pass it to write_idx_file() as a parameter. Adjust all callers to the API change. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-02-26 00:43:25 +01:00			`reset_pack_idx_option(&pack_idx_opts);`
Provide git_config with a callback-data parameter git_config() only had a function parameter, but no callback data parameter. This assumes that all callback functions only modify global variables. With this patch, every callback gets a void * parameter, and it is hoped that this will help the libification effort. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-05-14 19:46:53 +02:00			`git_config(git_pack_config, NULL);`
Teach fast-import to honor pack.compression and pack.depth We now use the configured pack.compression and pack.depth values within fast-import, as like builtin-pack-objects fast-import is generating a packfile for consumption by the Git tools. We use the same behavior as builtin-pack-objects does for these options, allowing core.compression to supply the default value for pack.compression. The default setting for pack.depth within fast-import is still 10 as users will generally repack fast-import generated packfiles by `repack -f`. A large delta depth within the fast-import packfile can significantly slow down such a later repack. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-01-21 05:36:54 +01:00			`if (!pack_compression_seen && core_compression_seen)`
			`pack_compression_level = core_compression_level;`

Preallocate memory earlier in fast-import I'm about to teach fast-import how to reload the marks file created by a prior session. The general approach that I want to use is to immediately parse the marks file when the specific argument is found in argv, thereby allowing the caller to supply multiple marks files, as the mark space can be sparsely populated. To make that work out we need to allocate our object tables before we parse the command line options. Since none of these tables depend on the command line options, we can easily relocate them. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-03-07 23:09:21 +01:00			`alloc_objects(object_entry_alloc);`
Strbuf API extensions and fixes. * Add strbuf_rtrim to remove trailing spaces. * Add strbuf_insert to insert data at a given position. * Off-by one fix in strbuf_addf: strbuf_avail() does not counts the final \0 so the overflow test for snprintf is the strict comparison. This is not critical as the growth mechanism chosen will always allocate _more_ memory than asked, so the second test will not fail. It's some kind of miracle though. * Add size extension hints for strbuf_init and strbuf_read. If 0, default applies, else: + initial buffer has the given size for strbuf_init. + first growth checks it has at least this size rather than the default 8192. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-10 12:35:04 +02:00			`strbuf_init(&command_buf, 0);`
Preallocate memory earlier in fast-import I'm about to teach fast-import how to reload the marks file created by a prior session. The general approach that I want to use is to immediately parse the marks file when the specific argument is found in argv, thereby allowing the caller to supply multiple marks files, as the mark space can be sparsely populated. To make that work out we need to allocate our object tables before we parse the command line options. Since none of these tables depend on the command line options, we can easily relocate them. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-03-07 23:09:21 +01:00			`atom_table = xcalloc(atom_table_sz, sizeof(struct atom_str*));`
			`branch_table = xcalloc(branch_table_sz, sizeof(struct branch*));`
			`avail_tree_table = xcalloc(avail_tree_table_sz, sizeof(struct avail_tree_content*));`
			`marks = pool_calloc(1, sizeof(struct mark_set));`
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00
fast-import: add option command This allows the frontend to specify any of the supported options as long as no non-option command has been given. This way the user does not have to include any frontend-specific options, but instead she can rely on the frontend to tell fast-import what it needs. Also factor out parsing of argv and have it execute when we reach the first non-option command, or after all commands have been read and no non-option command has been encountered. Non-git options are ignored, unrecognised options result in an error. Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-12-04 18:06:57 +01:00			`global_argc = argc;`
			`global_argv = argv;`
Converted fast-import to accept standard command line parameters. The following command line options are now accepted before the pack name: --objects=n # replaces the object count after the pack name --depth=n # delta chain depth to use (default is 10) --active-branches=n # maximum number of branches to keep in memory Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-23 08:00:31 +02:00
Include recent command history in fast-import crash reports When we crash the frontend developer (or end-user) may need to know roughly around what part of the input stream we had a problem with and aborted on. Because line numbers aren't very useful in this sort of application we instead just keep the last 100 commands in a FIFO queue and print them as part of the crash report. Currently one problem with this design is a commit that has more than 100 modified files in it will flood the FIFO and any context regarding branch/from/committer/mark/comments will be lost. We really should save only the last few (10?) file changes for the current commit, ensuring we have some prior higher level commands in the FIFO when we crash on a file M/D/C/R command. Another issue with this approach is the FIFO only includes the commands, it does not include the commit messages. Yet having a commit message may be useful to help locate the relevant change in the source material. In practice I don't think this is going to be a major concern as the frontend can always embed its own source change set identifier as a comment (which will appear in the crash report) and the commit message(s) for the most recent commits of any given branch should be obtainable from the (packed) commit objects. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-08-03 10:47:04 +02:00			`rc_free = pool_alloc(cmd_save * sizeof(*rc_free));`
			`for (i = 0; i < (cmd_save - 1); i++)`
			`rc_free[i].next = &rc_free[i + 1];`
			`rc_free[cmd_save - 1].next = NULL;`

Don't repack existing objects in fast-import Some users of fast-import have been trying to use it to rewrite commits and trees, an activity where the all of the relevant blobs are already available from the existing packfiles. In such a case we don't want to repack a blob, even if the frontend application has supplied us the raw data rather than a mark or a SHA-1 name. I'm intentionally only checking the packfiles that existed when fast-import started and am always ignoring all loose object files. We ignore loose objects because fast-import tends to operate on a very large number of objects in a very short timespan, and it is usually creating new objects, not reusing existing ones. In such a situtation the majority of the objects will not be found in the existing packfiles, nor will they be loose object files. If the frontend application really wants us to look at loose object files, then they can just repack the repository before running fast-import. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-04-20 17:23:45 +02:00			`prepare_packed_git();`
Restructure fast-import to support creating multiple packfiles. Now that we are starting to see some really large projects (such as KDE or a fork of FreeBSD) get imported into Git we're running into the upper limit on packfile object count as well as overall byte length. The KDE and FreeBSD projects are both likely to require more than 4 GiB to store their current history, which means we really need multiple packfiles to handle their content. This is a fairly simple restructuring of the internal code to help us support creating multiple packfiles from within fast-import. We are now adding a 5 digit incrementing suffix to the end of the basename supplied to us by the caller, permitting up to 99,999 packs to be generated in a single fast-import run. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 10:39:05 +01:00			`start_packfile();`
Generate crash reports on die in fast-import As fast-import is quite strict about its input and die()'s anytime something goes wrong it can be difficult for a frontend developer to troubleshoot why fast-import rejected their input, or to even determine what input command it rejected. This change introduces a custom handler for Git's die() routine. When we receive a die() for any reason (fast-import or a lower level core Git routine we called) the error is first dumped onto stderr and then a more extensive crash report file is prepared in GIT_DIR. Finally we exit the process with status 128, just like the stock builtin die handler. An internal flag is set to prevent any further die()'s that may be invoked during the crash report generator from causing us to enter into an infinite loop. We shouldn't die() from our crash report handler, but just in case someone makes a future code change we are prepared to gaurd against small mistakes turning into huge problems for the end-user. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-08-03 08:00:37 +02:00			`set_die_routine(die_nicely);`
fast-import: treat SIGUSR1 as a request to access objects early It can be tedious to wait for a multi-million-revision import. Unfortunately it is hard to spy on the import because fast-import works by continuously streaming out objects, without updating the pack index or refs until a checkpoint command or the end of the stream. So allow the impatient operator to request checkpoints by sending a signal, like so: killall -USR1 git-fast-import When receiving such a signal, fast-import would schedule a checkpoint to take place after the current top-level command (usually a "commit" or "blob" request) finishes. Caveats: just like ordinary checkpoint commands, such requests slow down the import. Switching to a new pack at a suboptimal moment is also likely to result in a less dense initial collection of packs. That's the price. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-11-22 09:16:02 +01:00			`set_checkpoint_signal();`
Drop strbuf's 'eof' marker, and make read_line a first class citizen. read_line is now strbuf_getline, and is a first class citizen, it returns 0 when reading a line worked, EOF else. The ->eof marker was used non-locally by fast-import.c, mimic the same behaviour using a static int in "read_next_command", that now returns -1 on EOF, and avoids to call strbuf_getline when it's in EOF state. Also no longer automagically strbuf_release the buffer, it's counter intuitive and breaks fast-import in a very subtle way. Note: being at EOF implies that command_buf.len == 0. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-17 11:19:04 +02:00			`while (read_next_command() != EOF) {`
			`if (!strcmp("blob", command_buf.buf))`
git-fast-import: rename cmd_() functions to parse_() There is a cmd_merge() function in fast-import that will conflict with builtin-merge's cmd_merge() function. To keep it consistent, rename all cmd_() function to parse_() Signed-off-by: Miklos Vajna <vmiklos@frugalware.org> Acked-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-05-16 00:35:56 +02:00			`parse_new_blob();`
fast-import: add 'ls' command Lazy fast-import frontend authors that want to rely on the backend to keep track of the content of the imported trees _almost_ have what they need in the 'cat-blob' command (v1.7.4-rc0~30^2~3, 2010-11-28). But it is not quite enough, since (1) cat-blob can be used to retrieve the content of files, but not their mode, and (2) using cat-blob requires the frontend to keep track of a name (mark number or object id) for each blob to be retrieved Introduce an 'ls' command to complement cat-blob and take care of the remaining needs. The 'ls' command finds what is at a given path within a given tree-ish (tag, commit, or tree): 'ls' SP <dataref> SP <path> LF or in fast-import's active commit: 'ls' SP <path> LF The response is a single line sent through the cat-blob channel, imitating ls-tree output. So for example: FE> ls :1 Documentation gfi> 040000 tree 9e6c2b599341d28a2a375f8207507e0a2a627fe9 Documentation FE> ls 9e6c2b599341d28a2a375f8207507e0a2a627fe9 git-fast-import.txt gfi> 100644 blob 4f92954396e3f0f97e75b6838a5635b583708870 git-fast-import.txt FE> ls :1 RelNotes gfi> 120000 blob b942e499449d97aeb50c73ca2bdc1c6e6d528743 RelNotes FE> cat-blob b942e499449d97aeb50c73ca2bdc1c6e6d528743 gfi> b942e499449d97aeb50c73ca2bdc1c6e6d528743 blob 32 gfi> Documentation/RelNotes/1.7.4.txt The most interesting parts of the reply are the first word, which is a 6-digit octal mode (regular file, executable, symlink, directory, or submodule), and the part from the second space to the tab, which is a <dataref> that can be used in later cat-blob, ls, and filemodify (M) commands to refer to the content (blob, tree, or commit) at that path. If there is nothing there, the response is "missing some/path". The intent is for this command to be used to read files from the active commit, so a frontend can apply patches to them, and to copy files and directories from previous revisions. 
For example, proposed updates to svn-fe use this command in place of its internal representation of the repository directory structure. This simplifies the frontend a great deal and means support for resuming an import in a separate fast-import run (i.e., incremental import) is basically free. Signed-off-by: David Barr <david.barr@cordelta.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Improved-by: Junio C Hamano <gitster@pobox.com> Improved-by: Sverre Rabbelier <srabbelier@gmail.com> 2010-12-02 11:40:20 +01:00			`else if (!prefixcmp(command_buf.buf, "ls "))`
			`parse_ls(NULL);`
prefixcmp(): fix-up mechanical conversion. Previous step converted use of strncmp() with literal string mechanically even when the result is only used as a boolean: if (!strncmp("foo", arg, 3)) ==> if (!(-prefixcmp(arg, "foo"))) This step manually cleans them up to read: if (!prefixcmp(arg, "foo")) Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-20 10:54:00 +01:00			`else if (!prefixcmp(command_buf.buf, "commit "))`
git-fast-import: rename cmd_() functions to parse_() There is a cmd_merge() function in fast-import that will conflict with builtin-merge's cmd_merge() function. To keep it consistent, rename all cmd_() function to parse_() Signed-off-by: Miklos Vajna <vmiklos@frugalware.org> Acked-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-05-16 00:35:56 +02:00			`parse_new_commit();`
prefixcmp(): fix-up mechanical conversion. Previous step converted use of strncmp() with literal string mechanically even when the result is only used as a boolean: if (!strncmp("foo", arg, 3)) ==> if (!(-prefixcmp(arg, "foo"))) This step manually cleans them up to read: if (!prefixcmp(arg, "foo")) Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-20 10:54:00 +01:00			`else if (!prefixcmp(command_buf.buf, "tag "))`
git-fast-import: rename cmd_() functions to parse_() There is a cmd_merge() function in fast-import that will conflict with builtin-merge's cmd_merge() function. To keep it consistent, rename all cmd_() function to parse_() Signed-off-by: Miklos Vajna <vmiklos@frugalware.org> Acked-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-05-16 00:35:56 +02:00			`parse_new_tag();`
prefixcmp(): fix-up mechanical conversion. Previous step converted use of strncmp() with literal string mechanically even when the result is only used as a boolean: if (!strncmp("foo", arg, 3)) ==> if (!(-prefixcmp(arg, "foo"))) This step manually cleans them up to read: if (!prefixcmp(arg, "foo")) Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-20 10:54:00 +01:00			`else if (!prefixcmp(command_buf.buf, "reset "))`
git-fast-import: rename cmd_() functions to parse_() There is a cmd_merge() function in fast-import that will conflict with builtin-merge's cmd_merge() function. To keep it consistent, rename all cmd_() function to parse_() Signed-off-by: Miklos Vajna <vmiklos@frugalware.org> Acked-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-05-16 00:35:56 +02:00			`parse_reset_branch();`
Implemented manual packfile switching in fast-import. To help importers which are dealing with massive amounts of data fast-import needs to be able to close the packfile it is currently writing to and open a new packfile for any additional data that will be received. A new 'checkpoint' command has been introduced which can be used by the frontend import process to force this to occur at any time. This may be useful to ensure a very long running import doesn't lose any work due to unexpected failures. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 12:35:41 +01:00			`else if (!strcmp("checkpoint", command_buf.buf))`
git-fast-import: rename cmd_() functions to parse_() There is a cmd_merge() function in fast-import that will conflict with builtin-merge's cmd_merge() function. To keep it consistent, rename all cmd_() function to parse_() Signed-off-by: Miklos Vajna <vmiklos@frugalware.org> Acked-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-05-16 00:35:56 +02:00			`parse_checkpoint();`
fast-import: introduce 'done' command Add a 'done' command that causes fast-import to stop reading from the stream and exit. If the new --done command line flag was passed on the command line (or a "feature done" declaration included at the start of the stream), make the 'done' command mandatory. So "git fast-import --done"'s input format will be prefix-free, making errors easier to detect when they show up as early termination at some convenient time of the upstream of a pipe writing to fast-import. Another possible application of the 'done' command would to be allow a fast-import stream that is only a small part of a larger encapsulating stream to be easily parsed, leaving the file offset after the "done\n" so the other application can pick up from there. This patch does not teach fast-import to do that --- fast-import still uses buffered input (stdio). Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com> Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-07-16 15:03:32 +02:00			`else if (!strcmp("done", command_buf.buf))`
			`break;`
Allow frontends to bidirectionally communicate with fast-import The existing checkpoint command is very useful to force fast-import to dump the branches out to disk so that standard Git tools can access them and the objects they refer to. However there was not a way to know when fast-import had finished executing the checkpoint and it was safe to read those refs. The progress command can be used to make fast-import output any message of the frontend's choosing to standard out. The frontend can scan for these messages using select() or poll() to monitor a pipe connected to the standard output of fast-import. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-08-01 16:23:08 +02:00			`else if (!prefixcmp(command_buf.buf, "progress "))`
git-fast-import: rename cmd_() functions to parse_() There is a cmd_merge() function in fast-import that will conflict with builtin-merge's cmd_merge() function. To keep it consistent, rename all cmd_() function to parse_() Signed-off-by: Miklos Vajna <vmiklos@frugalware.org> Acked-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-05-16 00:35:56 +02:00			`parse_progress();`
fast-import: add feature command This allows the frontend to require a specific feature to be supported by the backend, or abort. Also add support for four initial features: date-format=, force=, import-marks=, export-marks=. Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-12-04 18:06:56 +01:00			`else if (!prefixcmp(command_buf.buf, "feature "))`
			`parse_feature();`
fast-import: add option command This allows the frontend to specify any of the supported options as long as no non-option command has been given. This way the user does not have to include any frontend-specific options, but instead she can rely on the frontend to tell fast-import what it needs. Also factor out parsing of argv and have it execute when we reach the first non-option command, or after all commands have been read and no non-option command has been encountered. Non-git options are ignored, unrecognised options result in an error. Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-12-04 18:06:57 +01:00			`else if (!prefixcmp(command_buf.buf, "option git "))`
			`parse_option();`
			`else if (!prefixcmp(command_buf.buf, "option "))`
			`/* ignore non-git options*/;`
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug my hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00			`else`
			`die("Unsupported command: %s", command_buf.buf);`
fast-import: treat SIGUSR1 as a request to access objects early It can be tedious to wait for a multi-million-revision import. Unfortunately it is hard to spy on the import because fast-import works by continuously streaming out objects, without updating the pack index or refs until a checkpoint command or the end of the stream. So allow the impatient operator to request checkpoints by sending a signal, like so: killall -USR1 git-fast-import When receiving such a signal, fast-import would schedule a checkpoint to take place after the current top-level command (usually a "commit" or "blob" request) finishes. Caveats: just like ordinary checkpoint commands, such requests slow down the import. Switching to a new pack at a suboptimal moment is also likely to result in a less dense initial collection of packs. That's the price. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-11-22 09:16:02 +01:00
			`if (checkpoint_requested)`
			`checkpoint();`
Created fast-import, a tool to quickly generate a pack from blobs. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-05 08:04:21 +02:00			`}`
fast-import: add option command This allows the frontend to specify any of the supported options as long as no non-option command has been given. This way the user does not have to include any frontend-specific options, but instead she can rely on the frontend to tell fast-import what it needs. Also factor out parsing of argv and have it execute when we reach the first non-option command, or after all commands have been read and no non-option command has been encountered. Non-git options are ignored, unrecognised options result in an error. Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-12-04 18:06:57 +01:00
			`/* argv hasn't been parsed yet, do so */`
			`if (!seen_data_command)`
			`parse_argv();`

fast-import: introduce 'done' command Add a 'done' command that causes fast-import to stop reading from the stream and exit. If the new --done command line flag was passed on the command line (or a "feature done" declaration included at the start of the stream), make the 'done' command mandatory. So "git fast-import --done"'s input format will be prefix-free, making errors easier to detect when they show up as early termination at some convenient time of the upstream of a pipe writing to fast-import. Another possible application of the 'done' command would be to allow a fast-import stream that is only a small part of a larger encapsulating stream to be easily parsed, leaving the file offset after the "done\n" so the other application can pick up from there. This patch does not teach fast-import to do that --- fast-import still uses buffered input (stdio). Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com> Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-07-16 15:03:32 +02:00			`if (require_explicit_termination && feof(stdin))`
			`die("stream ends early");`
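The early-termination check above can be illustrated with a small reader that distinguishes an explicit `done` from plain end of input. This is a sketch under stated assumptions: `read_commands` and `stream_from_string` are hypothetical helpers, the line-based parsing is deliberately simplistic (a final `done` without a trailing newline is not recognised), and the real command dispatch is omitted.

```c
#include <stdio.h>
#include <string.h>

/*
 * Read commands line by line from `in`.
 * Returns 0 on a clean end, -1 if `require_done` is set and the
 * input ran out before a "done" line was seen.
 */
static int read_commands(FILE *in, int require_done)
{
	char line[1024];

	while (fgets(line, sizeof(line), in)) {
		if (!strcmp(line, "done\n"))
			return 0;	/* explicit terminator: stop reading */
		/* a real parser would dispatch the command here */
	}
	return require_done ? -1 : 0;	/* EOF reached without "done" */
}

/* Helper for experiments: expose a string as a readable stream. */
static FILE *stream_from_string(const char *text)
{
	FILE *f = tmpfile();

	if (!f)
		return NULL;
	fputs(text, f);
	rewind(f);
	return f;
}
```

With `require_done` set, a stream that was cut off by a failing upstream process is reported as an error instead of being treated as a complete import.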

Restructure fast-import to support creating multiple packfiles. Now that we are starting to see some really large projects (such as KDE or a fork of FreeBSD) get imported into Git we're running into the upper limit on packfile object count as well as overall byte length. The KDE and FreeBSD projects are both likely to require more than 4 GiB to store their current history, which means we really need multiple packfiles to handle their content. This is a fairly simple restructuring of the internal code to help us support creating multiple packfiles from within fast-import. We are now adding a 5 digit incrementing suffix to the end of the basename supplied to us by the caller, permitting up to 99,999 packs to be generated in a single fast-import run. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-15 10:39:05 +01:00			`end_packfile();`
Converted fast-import to a text based protocol. Frontend clients can now send a text stream to fast-import rather than a binary stream. This should facilitate developing frontend software as the data stream is easier to view, manipulate and debug by hand and Mark-I eyeball. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-15 02:16:28 +02:00
Added tree and commit writing to fast-import. The tree of the current commit can be altered by file_change commands before the commit gets written to the pack. The file changes are rather primitive as they simply allow removal of a tree entry or setting/adding a tree entry. Currently trees and commits aren't being deltafied when written to the pack and branch reloading from the current pack doesn't work, so at most 5 branches can be worked with at any one time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-14 06:58:19 +02:00			`dump_branches();`
Implemented 'tag' command in fast-import. Tags received from the frontend are generated in memory in a simple linked list in the order that the tag commands were sent by the frontend. If multiple different tag objects for the same tag name get generated the last one sent by the frontend will be the one that gets written out at termination. Multiple tag objects for the same name will cause all older tags of the same name to be lost. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-24 09:12:13 +02:00			`dump_tags();`
Use .keep files in fast-import during processing. Because fast-import automatically updates all references (heads and tags) at the end of its run the repository is corrupt unless the objects are available in the .git/objects/pack directory prior to the refs being modified. The easiest way to ensure that is true is to move the packfile and its associated index directly into the .git/objects/pack directory as soon as we have finished output to it. But the only safe way to do this is to create a temporary .keep file for that pack, so we use the same tricks that index-pack uses when it's being invoked by receive-pack. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-01-16 07:15:31 +01:00			`unkeep_all_packs();`
Added option to export the marks table when fast-import terminates. The marks table can be used by the frontend to load any commit after the import and compare it to whatever data the frontend knows about that commit. If the mark idnums can be easily correlated to some reference source then it's relatively trivial to compare the GIT tree to the reference to verify the accuracy of the import. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-25 22:03:04 +02:00			`dump_marks();`
Added automatic index generation to fast-import. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-06 19:51:39 +02:00
fast-import: Hide the pack boundary commits by default. Most users don't need the pack boundary information that fast-import was printing to standard output, especially if they were calling it with --quiet. Those users who do want this information probably want it captured so they can go back and use it to repack the imported repository. So dumping the boundary commits to a log file makes more sense than printing them to standard output. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-12 01:45:56 +01:00			`if (pack_edges)`
			`fclose(pack_edges);`

Teach fast-import how to sit quietly in the corner. Often users will be running fast-import from within a larger frontend process, and this may be a frequent periodic tool such as a future edition of `git-svn fetch`. We don't want to bombard users with our large stats output if they won't be interested in it, so `--quiet` is now an option to make gfi be more silent. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-07 08:19:31 +01:00			`if (show_stats) {`
			`uintmax_t total_count = 0, duplicate_count = 0;`
			`for (i = 0; i < ARRAY_SIZE(object_count_by_type); i++)`
			`total_count += object_count_by_type[i];`
			`for (i = 0; i < ARRAY_SIZE(duplicate_count_by_type); i++)`
			`duplicate_count += duplicate_count_by_type[i];`

			`fprintf(stderr, "%s statistics:\n", argv[0]);`
			`fprintf(stderr, "---------------------------------------------------------------------\n");`
Check for PRIuMAX rather than NO_C99_FORMAT in fast-import.c. Thanks to Simon 'corecode' Schubert <corecode@fs.ei.tum.de> for the clean-up. Defining the C99 standard PRIuMAX when necessary replaces UM_FMT and the awkward UM10_FMT. There are no direct C99 translations for other uses of NO_C99_FORMAT in git, alas. Signed-off-by: Jason Riedy <ejr@cs.berkeley.edu> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-21 02:34:56 +01:00			`fprintf(stderr, "Alloc'd objects: %10" PRIuMAX "\n", alloc_count);`
			`fprintf(stderr, "Total objects: %10" PRIuMAX " (%10" PRIuMAX " duplicates )\n", total_count, duplicate_count);`
fast-import: count and report # of calls to diff_delta in stats It's an interesting number, how often do we try to deltify each type of objects and how often do we succeed. So do add it to stats. Success doesn't mean much gain in pack size though. As we allow delta to be as big as (data.len - 20). And delta close to data.len gains nothing compared to no delta at all even after zlib compression (delta is pretty much the same as data, just with few modifications). We should try to make less attempts that result in huge deltas as these consume more cpu than trivial small deltas. Either by choosing a better delta base or reducing delta size upper bound or doing less delta attempts at all. Currently, delta base for blobs is a waste literally. Each blob delta base is chosen as a previously stored blob. Disabling deltas for blobs doesn't increase pack size and reduce import time, or at least doesn't increase time for all fast-import streams I've tried. Signed-off-by: Dmitry Ivankov <divanorama@gmail.com> Acked-by: David Barr <davidbarr@google.com> Acked-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-08-20 21:04:11 +02:00			`fprintf(stderr, " blobs : %10" PRIuMAX " (%10" PRIuMAX " duplicates %10" PRIuMAX " deltas of %10" PRIuMAX" attempts)\n", object_count_by_type[OBJ_BLOB], duplicate_count_by_type[OBJ_BLOB], delta_count_by_type[OBJ_BLOB], delta_count_attempts_by_type[OBJ_BLOB]);`
			`fprintf(stderr, " trees : %10" PRIuMAX " (%10" PRIuMAX " duplicates %10" PRIuMAX " deltas of %10" PRIuMAX" attempts)\n", object_count_by_type[OBJ_TREE], duplicate_count_by_type[OBJ_TREE], delta_count_by_type[OBJ_TREE], delta_count_attempts_by_type[OBJ_TREE]);`
			`fprintf(stderr, " commits: %10" PRIuMAX " (%10" PRIuMAX " duplicates %10" PRIuMAX " deltas of %10" PRIuMAX" attempts)\n", object_count_by_type[OBJ_COMMIT], duplicate_count_by_type[OBJ_COMMIT], delta_count_by_type[OBJ_COMMIT], delta_count_attempts_by_type[OBJ_COMMIT]);`
			`fprintf(stderr, " tags : %10" PRIuMAX " (%10" PRIuMAX " duplicates %10" PRIuMAX " deltas of %10" PRIuMAX" attempts)\n", object_count_by_type[OBJ_TAG], duplicate_count_by_type[OBJ_TAG], delta_count_by_type[OBJ_TAG], delta_count_attempts_by_type[OBJ_TAG]);`
Teach fast-import how to sit quietly in the corner. Often users will be running fast-import from within a larger frontend process, and this may be a frequent periodic tool such as a future edition of `git-svn fetch`. We don't want to bombard users with our large stats output if they won't be interested in it, so `--quiet` is now an option to make gfi be more silent. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-07 08:19:31 +01:00			`fprintf(stderr, "Total branches: %10lu (%10lu loads )\n", branch_count, branch_load_count);`
Check for PRIuMAX rather than NO_C99_FORMAT in fast-import.c. Thanks to Simon 'corecode' Schubert <corecode@fs.ei.tum.de> for the clean-up. Defining the C99 standard PRIuMAX when necessary replaces UM_FMT and the awkward UM10_FMT. There are no direct C99 translations for other uses of NO_C99_FORMAT in git, alas. Signed-off-by: Jason Riedy <ejr@cs.berkeley.edu> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-21 02:34:56 +01:00			`fprintf(stderr, " marks: %10" PRIuMAX " (%10" PRIuMAX " unique )\n", (((uintmax_t)1) << marks->shift) * 1024, marks_set_count);`
Teach fast-import how to sit quietly in the corner. Often users will be running fast-import from within a larger frontend process, and this may be a frequent periodic tool such as a future edition of `git-svn fetch`. We don't want to bombard users with our large stats output if they won't be interested in it, so `--quiet` is now an option to make gfi be more silent. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-07 08:19:31 +01:00			`fprintf(stderr, " atoms: %10u\n", atom_cnt);`
Check for PRIuMAX rather than NO_C99_FORMAT in fast-import.c. Thanks to Simon 'corecode' Schubert <corecode@fs.ei.tum.de> for the clean-up. Defining the C99 standard PRIuMAX when necessary replaces UM_FMT and the awkward UM10_FMT. There are no direct C99 translations for other uses of NO_C99_FORMAT in git, alas. Signed-off-by: Jason Riedy <ejr@cs.berkeley.edu> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-21 02:34:56 +01:00			`fprintf(stderr, "Memory total: %10" PRIuMAX " KiB\n", (total_allocd + alloc_count*sizeof(struct object_entry))/1024);`
fast-import: Fix compile warnings Not on all platforms are size_t and unsigned long equivalent. Since I do not know how portable %z is, I play safe, and just cast the respective variables to unsigned long. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-07 12:38:21 +01:00			`fprintf(stderr, " pools: %10lu KiB\n", (unsigned long)(total_allocd/1024));`
Check for PRIuMAX rather than NO_C99_FORMAT in fast-import.c. Thanks to Simon 'corecode' Schubert <corecode@fs.ei.tum.de> for the clean-up. Defining the C99 standard PRIuMAX when necessary replaces UM_FMT and the awkward UM10_FMT. There are no direct C99 translations for other uses of NO_C99_FORMAT in git, alas. Signed-off-by: Jason Riedy <ejr@cs.berkeley.edu> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-21 02:34:56 +01:00			`fprintf(stderr, " objects: %10" PRIuMAX " KiB\n", (alloc_count*sizeof(struct object_entry))/1024);`
Teach fast-import how to sit quietly in the corner. Often users will be running fast-import from within a larger frontend process, and this may be a frequent periodic tool such as a future edition of `git-svn fetch`. We don't want to bombard users with our large stats output if they won't be interested in it, so `--quiet` is now an option to make gfi be more silent. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-07 08:19:31 +01:00			`fprintf(stderr, "---------------------------------------------------------------------\n");`
			`pack_report();`
			`fprintf(stderr, "---------------------------------------------------------------------\n");`
			`fprintf(stderr, "\n");`
			`}`
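The `marks:` line in the statistics reports table capacity as `(1 << marks->shift) * 1024`. Assuming the marks table is a radix tree with 1024 slots per node whose root gains 10 bits of shift each time it grows (an assumption inferred from that expression, not shown in this section), the relationship can be sketched as follows. `SLOTS_PER_NODE`, `marks_capacity` and `shift_for_mark` are hypothetical names used only for illustration.

```c
#include <stdint.h>

#define SLOTS_PER_NODE 1024	/* assumed fan-out, matching the "* 1024" in the stats line */

/* Number of mark ids addressable by a root node with the given shift. */
static uintmax_t marks_capacity(unsigned int shift)
{
	return (((uintmax_t)1) << shift) * SLOTS_PER_NODE;
}

/* Smallest shift (a multiple of 10) whose capacity covers `idnum`. */
static unsigned int shift_for_mark(uintmax_t idnum)
{
	unsigned int shift = 0;

	while ((idnum >> shift) >= SLOTS_PER_NODE)
		shift += 10;
	return shift;
}
```

Under these assumptions a frontend that uses mark ids below 1024 reports a capacity of 1024, and the first id of 1024 or more raises the reported capacity to 1048576, which is why the capacity figure can be far larger than the "unique" count printed next to it.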
Created fast-import, a tool to quickly generate a pack from blobs. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-05 08:04:21 +02:00
Don't do non-fastforward updates in fast-import. If fast-import is being used to update an existing branch of a repository, the user may not want to lose commits if another process updates the same ref at the same time. For example, the user might be using fast-import to make just one or two commits against a live branch. We now perform a fast-forward check during the ref updating process. If updating a branch would cause commits in that branch to be lost, we skip over it and display the new SHA1 to standard error. This new default behavior can be overridden with `--force`, like git-push and git-fetch. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-02-06 22:08:06 +01:00			`return failure ? 1 : 0;`
Created fast-import, a tool to quickly generate a pack from blobs. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2006-08-05 08:04:21 +02:00			`}`