Let "git svn" run "git gc --auto" every 1000 imported commits to
reduce the number of loose objects.
To handle the common use case of frequent imports, where each
invocation typically fetches much less than 1000 commits, also run gc
unconditionally at the end of the import.
"1000" is the same number that was used by default when we called
git-repack. It isn't necessarily still the best choice.
Signed-off-by: Karl Hasselström <kha@treskal.com>
Acked-by: Eric Wong <normalperson@yhbt.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a moment, we'll start calling git-gc --auto instead, since it is a
better fit to what we're trying to accomplish.
The command line options are still accepted, but don't have any
effect, and we warn the user about that.
Signed-off-by: Karl Hasselström <kha@treskal.com>
Acked-by: Eric Wong <normalperson@yhbt.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This should reduce disk space usage when doing large imports.
We'll be switching to "gc --auto" post-1.5.4 to handle
repacking for us.
Signed-off-by: Eric Wong <normalperson@yhbt.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Repositories generated by svnsync cannot be relied on to have
properly set revprops without newlines in UUIDs and URLs. There
may be broken versions of svnsync out there that append extra
newlines to UUIDs, or the revprops could've been changed by
repository administrators at any time, too.
At least one repository we've come across has an embedded
newline erroneously set in the svnsync-uuid prop. This is bad
because the trailing newline is taken as another record by the
Git.pm library, and the wantarray detection causes tmp_config()
to return an array with an empty-but-existing second element.
We will now strip leading and trailing whitespace both before
setting and after reading the uuid and url for svnsync values.
We will also force tmp_config to return a single scalar when
reading existing values.
SVN UUIDs should never have whitespace in them, and SVN
repository URLs should be URI-escaped, so neither of those
values we ever see in git-svn should actually have whitespace
in them.
Thanks to Dennis Schridde for the bug report and Junio for
helping diagnose this.
Signed-off-by: Eric Wong <normalperson@yhbt.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
prop_walk adds a leading / to all subdirectory paths. Unfortunately
this causes a problem when the remote repo lives in a subdirectory itself,
as the leading / causes subsequent PROPFIND calls to be executed on
the wrong path. Trimming the / before calling the PROPFIND fixes this problem.
Signed-off-by: Kevin Ballard <kevin@sb.org>
Acked-by: Eric Wong <normalperson@yhbt.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
I've heard of several users puzzled by this, and it sometimes it
appears as if git-svn is doing nothing on slower connections and
larger repositories.
Signed-off-by: Eric Wong <normalperson@yhbt.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
SVN requires that paths be URI-escaped for HTTP(S) repositories.
file:// and svn:// repositories do not need these rules.
Additionally, accessing individual paths inside repositories
(check_path() and get_log() do NOT require escapes to function
and in fact it breaks things).
Noticed-by: Michael J. Cohen <mjc@cruiseplanners.com>
Signed-off-by: Eric Wong <normalperson@yhbt.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
commit 3157dd9e89 (git-svn: unlink
internal index files after operations) introduced unlinking
index files after fetching. However, this missed indices for
refs that were created by globbing branches and tags. This will
track all refs we ever touch during a fetch and unlink them at
exit time.
Signed-off-by: Eric Wong <normalperson@yhbt.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It's possible for bad clients to commit symlinks without the
5-character "link " prefix in symlinks. So guard around this
bug in SVN and make a best effort to create symlinks if the
"link " prefix is missing.
More information on this SVN bug is described here:
http://subversion.tigris.org/issues/show_bug.cgi?id=2692
To be pedantic, there is still a corner case that neither we nor
SVN can handle: If somebody made a link using a broken SVN
client where "link " is the first part of its path, e.g.
"link sausage", then we'd end up having a symlink which points
to "sausage" because we incorrectly stripped the "link ".
Hopefully this hasn't happened in practice, but if it has,
it's not our fault SVN is broken :)
Thanks to Benoit Sigoure and Sverre Johansen for reporting
and feedback.
Signed-off-by: Eric Wong <normalperson@yhbt.net>
We set the 6 environment variables for controlling
committer/author email/name/time for every commit.
We do this in the parent process to be passed to
git-commit-tree, because open3() doesn't afford us the control
of doing it only in the child process. This means we leave them
hanging around in the main process until the next revision comes
around and all 6 environment variables are overwridden again.
Unfortunately, for the last commit, leaving them hanging around
means the git-rebase invocation will pick it up, rewriting the
rebased commit with incorrect author information. This should fix
it.
Signed-off-by: Eric Wong <normalperson@yhbt.net>
Previously, git-svn would ignore cases where the path we're
tracking is removed from the repository. This was to prevent
heads with follow-parent from ending up with a tree full of
empty revisions (and thus breaking rename detection).
The previous behavior is fine until the path we're tracking
is re-added later on, leading to the old files being merged
in with the new files in the directory (because the old
files were never marked as deleted)
We will now only remove all the old files locally that were
deleted remotely iff we detect the directory we're in is being
created from scratch.
Thanks for Marcus D. Hanwell for the bug report and
Peter Baumann for the analysis.
Signed-off-by: Eric Wong <normalperson@yhbt.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Being git, we can generate these very quickly on the fly as
needed, so there's no point in wasting space for these things
for large projects.
Signed-off-by: Eric Wong <normalperson@yhbt.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The current parsing for From: and Signed-off-by: lines handles fully
specified names:
From: Full Name <email@address>
Expand this to include the raw email addresses and straight "names":
From: email@address -> email <email@address>
From: Full Name -> Full Name <unknown>
Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Acked-by: Eric Wong <normalperson@yhbt.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
git-config recently learned a --get-colorbool option. By
using it, we will get the same color=auto behavior that
other git commands have.
Specifically, this fixes the case where "color.diff = true"
meant "always" in git-svn, but "auto" in other programs.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This reinstates an old optimization in .rev_db which
stored the highest revision number we scanned, allowing
us to avoid scanning the SVN log for those revisions
again in a subsequent invocation.
This means the last 24-byte record in a .rev_map file
can be a 4-byte SVN revision number with 20-bytes of
zeroes representing a non-existent commit. This record
can and will be overwritten when a new commit iff
the commit is all zeroes.
Signed-off-by: Eric Wong <normalperson@yhbt.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Migrations are done automatically on an as-needed basis when new
revisions are to be fetched. Stale remote branches do not get
migrated, yet.
However, unless you set noMetadata or useSvkProps it's safe to
just do:
find $GIT_DIR/svn -name '.rev_db*' -print0 | xargs rm -f
to purge all the old .rev_db files.
The new format is a one-way migration and is NOT compatible with
old versions of git-svn.
This is the replacement for the rev_db format, which was too big
and inefficient for large repositories with a lot of sparse history
(mainly tags).
The format is this:
- 24 bytes for every record,
* 4 bytes for the integer representing an SVN revision number
* 20 bytes representing the sha1 of a git commit
- No empty padding records like the old format
- new records are written append-only since SVN revision numbers
increase monotonically
- lookups on SVN revision number are done via a binary search
- Piping the file to xxd(1) -c24 is a good way of dumping it for
viewing or editing, should the need ever arise.
As with .rev_db, these files are disposable unless noMetadata or
useSvmProps is set.
Signed-off-by: Eric Wong <normalperson@yhbt.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If you run "git-svn rebase" while sitting on a topic branch, there is
no need to create a "master" branch if one didn't exist already. The
branch was created implicitly by the automatic checkout after fetching,
which in the case of rebase isn't actually necessary anyway.
Signed-off-by: Steven Grimm <koreth@midwinter.com>
Acked-by: Eric Wong <normalperson@yhbt.net>
show-externals can be used by scripts to provide svn:externals-like
functionality. For example, a script can list all of the externals and then
use check out the listed URLs at the appropriate paths, similar to what the svn
client does. Said script (or perhaps git-svn itself, in the future) could
simply invoke svn export on the paths, or it could go one further, using
git-svn clone and even git-submodule together to better integrate externals
checkouts.
The implementation is shamelessly copied from show-ignores. A more general
command to list user-specified properties is probably a better idea.
Signed-off-by: Vineet Kumar <vineet@doorstop.net>
Acked-by: Eric Wong <normalperson@yhbt.net>
Digest::MD5 is loaded regardless of the package in which it's
declared, so move its 'use' statement and the md5sum() function
into the main package.
Signed-off-by: David D. Kilzer <ddkilzer@kilzer.net>
Acked-by: Eric Wong <normalperson@yhbt.net>
Add support for pulling the real author of a commit from the From:
and first Signed-off-by: fields of the SVN commit message.
Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Acked-by: Eric Wong <normalperson@yhbt.net>
Previously, git-svn first read the .git/config file for settings as if
current working directory was the repository top-directory, and after
that made sure to cd into top-directory. The result was a silent
failur to read configuration settings. This patch changes the order
these two things are done.
Signed-off-by: Gustaf Hendeby <hendeby@isy.liu.se>
Acked-by: Eric Wong <normalperson@yhbt.net>
Cache the repository root whenever we connect to the repository.
This will allow us to notice URL changes if the user changes the
URL in .git/config, too.
If the repository is no longer accessible, or if `git svn info'
is the first and only command run; then '(offline)' will be
displayed for "Repository Root:" in the output.
Signed-off-by: Eric Wong <normalperson@yhbt.net>
Return the svn URL for the given path, or return the svn
repository URL if no path is given.
Added 18 tests to t/t9119-git-svn-info.sh.
Signed-off-by: David D. Kilzer <ddkilzer@kilzer.net>
Acked-by: Eric Wong <normalperson@yhbt.net>
Implement "git-svn info" for files and directories based on the
"svn info" command. Note that the -r/--revision argument is not
supported yet.
Added 18 tests in t/t9119-git-svn-info.sh.
[ew: small fix to work without arguments on all working directories]
Signed-off-by: David D. Kilzer <ddkilzer@kilzer.net>
Acked-by: Eric Wong <normalperson@yhbt.net>
Extacted canonicalize_path() in the main package.
Created new Git::SVN::Util package with an md5sum() function. A
new package was created so that Digest::MD5 did not have to be
loaded in the main package. Replaced code in the SVN::Git::Editor
and SVN::Git::Fetcher packages with calls to md5sum().
Extracted the format_svn_date(), parse_git_date() and
set_local_timezone() functions within the Git::SVN::Log package.
Signed-off-by: David D. Kilzer <ddkilzer@kilzer.net>
Acked-by: Eric Wong <normalperson@yhbt.net>
When unreachable revisions are given to "svn log", it displays all commit
logs in the given range that exist in the current tree. (If no commit
logs are found in the current tree, it simply prints a single commit log
separator.) This patch makes "git-svn log" behave the same way.
Ten tests added to t/t9116-git-svn-log.sh.
Signed-off-by: David D Kilzer <ddkilzer@kilzer.net>
Acked-by: Eric Wong <normalperson@yhbt.net>
The "svn log -rM:N" command shows commit logs inclusive in the range [M,N].
Previously "git-svn log" always excluded the commit log for the smallest
revision in a range, whether the range was ascending or descending. With
this patch, the smallest revision in a range is always shown.
Updated tests for ascending and descending revision ranges.
Signed-off-by: David D Kilzer <ddkilzer@kilzer.net>
Acked-by: Eric Wong <normalperson@yhbt.net>
Fixed typo in Git::SVN::Log::git_svn_log_cmd(). Previously a command like
"git-svn log -r1:4" would only show a commit log separator.
Added tests for ascending and descending revision ranges.
Signed-off-by: David D Kilzer <ddkilzer@kilzer.net>
Acked-by: Eric Wong <normalperson@yhbt.net>
When doing dcommit git-svn must use subversion's config or newly created
files will not include svn's properties
(defined in [auto-props] with 'enable-auto-props = yes').
Signed-off-by: Konstantin V. Arkhipov <voxus@onphp.org>
Acked-by: Eric Wong <normalperson@yhbt.net>
SVN requires that paths be URI-escaped for HTTP(S) repositories.
file:// and svn:// repositories do not need these rules.
Additionally, accessing individual paths inside repositories
(check_path() and get_log() do NOT require escapes to function
and in fact it breaks things).
Noticed-by: Michael J. Cohen <mjc@cruiseplanners.com>
Signed-off-by: Eric Wong <normalperson@yhbt.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* maint:
for-each-ref: fix off by one read.
git-branch: remove mention of non-existent '-b' option
git-svn: prevent dcommitting if the index is dirty.
Fix memory leak in traverse_commit_list
dcommit uses rebase to sync the history with what has just been pushed to
SVN. Trying to dcommit with a dirty index is troublesome for rebase, so now
the user will get an error message if he attempts to dcommit with a dirty
index.
Signed-off-by: Benoit Sigoure <tsuna@lrde.epita.fr>
Acked-by: Eric Wong <normalperson@yhbt.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* maint:
Remove a couple of duplicated include
grep with unmerged index
git-daemon: fix remote port number in log entry
git-svn: t9114: verify merge commit message in test
git-svn: fix dcommit clobbering when committing a series of diffs
Our revision number sent to SVN is set to the last revision we
committed if we've made any previous commits in a dcommit
invocation.
Although our SVN Editor code uses the delta of two (old) trees
to generate information to send upstream, it'll still send
complete resultant files upstream; even if the tree they're
based against is out-of-date.
The combination of sending a file that does not include the
latest changes, but set with a revision number of a commit we
just made will cause SVN to accept the resultant file even if it
was generated against an old tree.
More trouble was caused when fixing this because we were
rebasing uncessarily at times. We used git-diff-tree to check
the imported SVN revision against our HEAD, not the last tree we
committed to SVN. The unnecessary rebasing caused merge commits
upstream to SVN to fail.
Signed-off-by: Eric Wong <normalperson@yhbt.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git svn <cmd> --help" gave options in the order they were found in a
Perl hash, which meant "randomly" to humans.
Signed-off-by: Benoit Sigoure <tsuna@lrde.epita.fr>
Acked-by: Eric Wong <normalperson@yhbt.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* git-svn.perl (&fatal): Append the newline at the end of the error
message.
Adjust all callers.
Signed-off-by: Benoit Sigoure <tsuna@lrde.epita.fr>
Acked-by: Eric Wong <normalperson@yhbt.net>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
This allows one to easily retrieve a list of svn properties from within
git-svn without requiring svn or knowing the URL of a repository.
* git-svn.perl (%cmd): Add the command `proplist'.
(&cmd_proplist): New.
* t/t9101-git-svn-props.sh: Test git svn proplist.
Signed-off-by: Benoit Sigoure <tsuna@lrde.epita.fr>
Acked-by: Eric Wong <normalperson@yhbt.net>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
This allows one to easily retrieve a single SVN property from within
git-svn without requiring svn or remembering the URL of a repository
* git-svn.perl (%cmd): Add the new command `propget'.
($cmd_dir_prefix): New global.
(&get_svnprops): New helper.
(&cmd_propget): New. Use &get_svnprops.
* t/t9101-git-svn-props.sh: Add a test case for propget.
[ew: make sure the rev-parse --show-prefix call doesn't break
the `git-svn clone' command]
Signed-off-by: Benoit Sigoure <tsuna@lrde.epita.fr>
Acked-by: Eric Wong <normalperson@yhbt.net>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
git svn create-ignore (to create one .gitignore per directory
from the svn:ignore properties. This has the disadvantage of
committing the .gitignore during the next dcommit, but when you
import a repo with tons of ignores (>1000), using git svn show-ignore
to build .git/info/exclude is *not* a good idea, because things like
git-status will end up doing >1000 fnmatch *per file* in the repo,
which leads to git-status taking more than 4s on my Core2Duo 2Ghz 2G
RAM)
* git-svn.perl (%cmd): Add the new command `create-ignore'.
(&cmd_create_ignore): New.
* t/t9101-git-svn-props.sh: Adjust the test-case for show-ignore and
add a test case for create-ignore.
[ew: added commit message from
<05CAB148-56ED-4FF1-8AAB-4BA2A0B70C2C@lrde.epita.fr> ]
Signed-off-by: Benoit Sigoure <tsuna@lrde.epita.fr>
Acked-by: Eric Wong <normalperson@yhbt.net>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
* git-svn.perl (&traverse_ignore): Remove.
(&prop_walk): New.
(&cmd_show_ignore): Use prop_walk.
[ew: This will ease the implementation of the `create-ignore',
`propget', and `proplist' commands]
Signed-off-by: Benoit Sigoure <tsuna@lrde.epita.fr>
Acked-by: Eric Wong <normalperson@yhbt.net>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Some variables coming from the Subversion's Perl bindings are used
in our code only once, so the interpreter warns us about it. These
warnings are false-positives, because the variables themselves are
initialized in the binding's guts, that are made by SWIG.
Credits to Sam Vilain for his note about "no warnings 'once'".
[ew: minor formatting change]
Signed-off-by: Eygene Ryabinkin <rea-git@codelabs.ru>
Acked-by: Eric Wong <normalperson@yhbt.net>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Parameters 'store-passwords' and 'store-auth-creds' from Subversion's
configuration (~/.subversion/config) were not respected. This was
fixed: the default values for these parameters are set to 'yes' to
follow Subversion behaviour.
Signed-off-by: Eygene Ryabinkin <rea-git@codelabs.ru>
Signed-off-by: Lars Hjemli <hjemli@gmail.com>
Acked-by: Eric Wong <normalperson@yhbt.net>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
In most cases of branching, the tree is copied unmodified from the trunk
to the branch. When that is done, we can simply start with the parent's
index and apply the changes on the branch as usual.
[ew: rewritten from Steven's original to use SVN::Client instead
of the command-line svn client.
Since SVN::Client connects separately, we'll share our
authentication providers array between our usages of
SVN::Client and SVN::Ra, too. Bypassing the high-level
SVN::Client library can avoid this, but the code will be
much more complex. Regardless, any implementation of this
seems to require restarting a connection to the remote
server.
Also of note is that SVN 1.4 and later allows a more
efficient diff_summary to be done instead of a full diff,
but since this code is only to support SVN < 1.4.4, we'll
ignore it for now.]
Signed-off-by: Steven Walter <stevenrwalter@gmail.com>
Signed-off-by: Eric Wong <normalperson@yhbt.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* maint:
git-svn: don't attempt to spawn pager if we don't want one
Supplant the "while case ... break ;; esac" idiom
User Manual: add a chapter for submodules
user-manual: don't assume refs are stored under .git/refs
Detect exec bit in more cases.
Conjugate "search" correctly in the git-prune-packed man page.
Move the paragraph specifying where the .idx and .pack files should be
Documentation/git-lost-found.txt: drop unnecessarily duplicated name.
Even though config_pager() unset the $pager variable, we were
blindly calling exec() on it through run_pager().
Noticed-by: Chris Moore <christopher.ian.moore@gmail.com>
Signed-off-by: Eric Wong <normalperson@yhbt.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>