This let diff_delta() abort early if it is going to bust the given
size limit. Also, only objects larger than 20 bytes are considered
as objects smaller than that are most certainly going to produce
larger deltas than the original object due to the additional headers.
Signed-off-by: Nicolas Pitre <nico@fluxnic.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that fast-import is creating packs with index version 2, there is
no point limiting the pack size by default. A pack split will still
happen if off_t is not sufficiently large to hold large offsets.
While updating the doc, let's remove the "packfiles fit on CDs"
suggestion. Pack files created by fast-import are still suboptimal and
a 'git repack -a -f -d' or even 'git gc --aggressive' would be a pretty
good idea before considering storage on CDs.
Signed-off-by: Nicolas Pitre <nico@fluxnic.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This allows for the creation of pack index version 2 with its object
CRC and the possibility for a pack to be larger than 4 GB.
Signed-off-by: Nicolas Pitre <nico@fluxnic.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is in preparation for using write_idx_file(). Also, by using
sha1write() we get some buffering to reduces the number of write
syscalls, and the written data is SHA1 summed which allows for the extra
data integrity validation check performed in fixup_pack_header_footer()
(details on this in commit abeb40e5aa).
Signed-off-by: Nicolas Pitre <nico@fluxnic.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Similar in spirit to 07cf0f2 (make --max-pack-size argument to 'git
pack-object' count in bytes, 2010-02-03) which made the option by the same
name to pack-objects, this counts the pack size limit in bytes.
In order not to cause havoc with people used to the previous megabyte
scale an integer smaller than 8192 is interpreted in megabytes but the
user gets a warning. Also a minimum size of 1 MiB is enforced to avoid an
explosion of pack files.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Acked-by: Shawn O. Pearce <spearce@spearce.org>
Acked-by: Nicolas Pitre <nico@fluxnic.net>
Manual merge made at 844ad3d (Merge branch 'sp/maint-fast-import-large-blob'
into sp/fast-import-large-blob, 2010-02-01) did not correctly reflect the change
of unit in which this variable's value is counted from its previous version.
Now it counts in bytes, not in megabytes.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Acked-by: Shawn O. Pearce <spearce@spearce.org>
* sp/maint-fast-import-large-blob:
fast-import: Stream very large blobs directly to pack
bash: don't offer remote transport helpers as subcommands
Conflicts:
fast-import.c
If a blob is larger than the configured big-file-threshold, instead
of reading it into a single buffer obtained from malloc, stream it
onto the end of the current pack file. Streaming the larger objects
into the pack avoids the 4+ GiB memory footprint that occurs when
fast-import is processing 2+ GiB blobs.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* 'jh/notes' (early part):
Add more testcases to test fast-import of notes
Rename t9301 to t9350, to make room for more fast-import tests
fast-import: Proper notes tree manipulation
* sr/gfi-options:
fast-import: add (non-)relative-marks feature
fast-import: allow for multiple --import-marks= arguments
fast-import: test the new option command
fast-import: add option command
fast-import: add feature command
fast-import: put marks reading in its own function
fast-import: put option parsing code in separate functions
If you tried to export the official git repository, and then to import it
back then git-fast-import would die complaining that "Mark :1 not a commit".
Accordingly to a generated crash file, Mark 1 is not a commit but a blob,
which is pointed by junio-gpg-pub tag. Because git-tag allows to create such
tags, git-fast-import should import them.
Signed-off-by: Dmitry Potapov <dpotapov@gmail.com>
Acked-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The fast-import parser does not validate that the author, committer
or tagger name component contains both a name and an email address.
Therefore the name component has always been optional. Correct the
documentation to match the implementation.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This patch teaches 'git fast-import' to automatically organize note objects
in a fast-import stream into an appropriate fanout structure. The notes API
in notes.h is NOT used to accomplish this, because trying to keep the
fast-import and notes data structures in sync would yield a significantly
larger patch with higher complexity.
Note objects are added with the 'N' command, and accounted for with a
per-branch counter, which is used to trigger fanout restructuring when
needed. Note that when restructuring the branch tree, _any_ entry whose
path consists of 40 hex chars (not including directory separators) will
be recognized as a note object. It is therefore not advisable to
manipulate note entries with M/D/R/C commands.
Since note objects are stored in the same tree structure as other objects,
the unloading and reloading of a fast-import branches handle note objects
transparently.
This patch has been improved by the following contributions:
- Shawn O. Pearce: Several style- and logic-related improvements
Cc: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Johan Herland <johan@herland.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
After specifying 'feature relative-marks' the paths specified with
'feature import-marks' and 'feature export-marks' are relative to an
internal directory in the current repository.
In git-fast-import this means that the paths are relative to the
'.git/info/fast-import' directory. However, other importers may use a
different location.
Add 'feature non-relative-marks' to disable this behavior, this way
it is possible to, for example, specify the import-marks location as
relative, and the export-marks location as non-relative.
Also add tests to verify this behavior.
Cc: Daniel Barkalow <barkalow@iabervon.org>
Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The --import-marks= option may be specified multiple times on the
commandline and should result in all marks being read in. Only one
import-marks feature may be specified in the stream, which is
overriden by any --import-marks= commandline options.
If one wishes to specify import-marks files in addition to the one
specified in the stream, it is easy to repeat the stream option as a
--import-marks= commandline option.
Also verify this behavior with tests.
Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This allows the frontend to specify any of the supported options as
long as no non-option command has been given. This way the
user does not have to include any frontend-specific options, but
instead she can rely on the frontend to tell fast-import what it
needs.
Also factor out parsing of argv and have it execute when we reach the
first non-option command, or after all commands have been read and
no non-option command has been encountered.
Non-git options are ignored, unrecognised options result in an error.
Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This allows the fronted to require a specific feature to be supported
by the backend, or abort.
Also add support for four initial feature, date-format=, force=,
import-marks=, export-marks=.
Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
All options do nothing but set settings, with the exception of the
--input-marks option. Delay the reading of the marks file till after
all options have been parsed.
Also, rename mark_file to export_marks_file as it is now ambiguous.
Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Putting the options in their own functions increases readability of
the option parsing block and makes it easier to reuse the option
parsing code later on.
Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* 'jh/notes' (early part):
Add selftests verifying concatenation of multiple notes for the same commit
Refactor notes code to concatenate multiple notes annotating the same object
Add selftests verifying that we can parse notes trees with various fanouts
Teach the notes lookup code to parse notes trees with various fanout schemes
Teach notes code to free its internal data structures on request
Add '%N'-format for pretty-printing commit notes
Add flags to get_commit_notes() to control the format of the note string
t3302-notes-index-expensive: Speed up create_repo()
fast-import: Add support for importing commit notes
Teach "-m <msg>" and "-F <file>" to "git notes edit"
Add an expensive test for git-notes
Speed up git notes lookup
Add a script to edit/inspect notes
Introduce commit notes
Conflicts:
.gitignore
Documentation/pretty-formats.txt
pretty.c
Let "git fast-import -h" (with no other arguments) print usage
before exiting, even when run outside any repository.
Cc: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Introduce a 'notemodify' subcommand of the 'commit' command. This subcommand
is similar to 'filemodify', except that no mode is supplied (all notes have
mode 0644), and the path is set to the hex SHA1 of the given "comittish".
This enables fast import of note objects along with their associated commits,
since the notes can now be named using the mark references of their
corresponding commits.
The patch also includes a test case of the added functionality.
Signed-off-by: Johan Herland <johan@herland.net>
Acked-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When reading the "raw format" timestamp from the input stream, make sure
that the timezone offset is a reasonable value by imitating 7122f82
(date.c: improve guess between timezone offset and year., 2006-06-08).
We _might_ want to also check if the timestamp itself is reasonable, but
that is left for a separate commit.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Lots of die() calls did not actually report the kind of error, which
can leave the user confused as to the real problem. Use die_errno()
where we check a system/library call that sets errno on failure, or
one of the following that wrap such calls:
Function Passes on error from
-------- --------------------
odb_pack_keep open
read_ancestry fopen
read_in_full xread
strbuf_read xread
strbuf_read_file open or strbuf_read_file
strbuf_readlink readlink
write_in_full xwrite
Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change calls to die(..., strerror(errno)) to use the new die_errno().
In the process, also make slight style adjustments: at least state
_something_ about the function that failed (instead of just printing
the pathname), and put paths in single quotes.
Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* ar/unlink-err:
print unlink(2) errno in copy_or_link_directory
replace direct calls to unlink(2) with unlink_or_warn
Introduce an unlink(2) wrapper which gives warning if unlink failed
Essentially; s/type* /type */ as per the coding guidelines.
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This helps to notice when something's going wrong, especially on
systems which lock open files.
I used the following criteria when selecting the code for replacement:
- it was already printing a warning for the unlink failures
- it is in a function which already printing something or is
called from such a function
- it is in a static function, returning void and the function is only
called from a builtin main function (cmd_)
- it is in a function which handles emergency exit (signal handlers)
- it is in a function which is obvously cleaning up the lockfiles
Signed-off-by: Alex Riesen <raa.lkml@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A few more fixes on top of the automatic spell checker generated ones.
Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* jc/shared-literally:
t1301: loosen test for forced modes
set_shared_perm(): sometimes we know what the final mode bits should look like
move_temp_to_file(): do not forget to chmod() in "Coda hack" codepath
Move chmod(foo, 0444) into move_temp_to_file()
"core.sharedrepository = 0mode" should set, not loosen
When writing out a loose object or a pack (index), move_temp_to_file() is
called to finalize the resulting file. These files (loose files and packs)
should all have permission mode 0444 (modulo adjust_shared_perm()).
Therefore, instead of doing chmod(foo, 0444) explicitly from each callsite
(or even forgetting to chmod() at all), do the chmod() call from within
move_temp_to_file().
Signed-off-by: Johan Herland <johan@herland.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
These variables were unused and can be removed safely:
builtin-clone.c::cmd_clone(): use_local_hardlinks, use_separate_remote
builtin-fetch-pack.c::find_common(): len
builtin-remote.c::mv(): symref
diff.c::show_stats():show_stats(): total
diffcore-break.c::should_break(): base_size
fast-import.c::validate_raw_date(): date, sign
fsck.c::fsck_tree(): o_sha1, sha1
xdiff-interface.c::parse_num(): read_some
Signed-off-by: Benjamin Kramer <benny.kra@googlemail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a repository created with git older than f49fb35 (git-init-db: create
"pack" subdirectory under objects, 2005-06-27), objects/pack/ directory is
not created upon initialization. It was Ok because subdirectories are
created as needed inside directories init-db creates, and back then,
packfiles were recent invention.
After the said commit, new codepaths started relying on the presense of
objects/pack/ directory in the repository. This was exacerbated with
8b4eb6b (Do not perform cross-directory renames when creating packs,
2008-09-22) that moved the location temporary pack files are created from
objects/ directory to objects/pack/ directory, because moving temporary to
the final location was done carefully with lazy leading directory creation.
Many packfile related operations in such an old repository can fail
mysteriously because of this.
This commit introduces two helper functions to make things work better.
- odb_mkstemp() is a specialized version of mkstemp() to refactor the
code and teach it to create leading directories as needed;
- odb_pack_keep() refactors the code to create a ".keep" file while
create leading directories as needed.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Otherwise we may reuse the same memory address for a totally
different "struct packed_git", and a previously cached object from
the prior occupant might be returned when trying to unpack an object
from the new pack.
Found-by: Daniel Barkalow <barkalow@iabervon.org>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Programs that use git_config need to find the global configuration.
When runtime prefix computation is enabled, this requires that
git_extract_argv0_path() is called early in the program's main().
This commit adds the necessary calls.
Signed-off-by: Steffen Prohaska <prohaska@zib.de>
Acked-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"S_IFREG | mode" makes only sense for 0644 and 0755.
Even though doing (S_IFREG | mode) may not hurt when mode is any other
supported value, that is only true because S_IFREG mode bit happens to
be already on for S_IFLNK or S_IFGITLINK.
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Acked-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Store the return value of strtoul() in order to avoid compiler
warnings on Ubuntu 8.10.
Also check errno after each call, which is the only way to notice
an overflow without making ULONG_MAX an illegal date.
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Even though newer Porcelain tools always record the tagger information
when creating new tags, export/import pair should be able to faithfully
reproduce ancient tag objects that lack tagger information.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Acked-by: Shawn O. Pearce <spearce@spearce.org>