1
0
Fork 0
mirror of https://github.com/git/git.git synced 2024-11-15 05:33:04 +01:00
Commit graph

36 commits

Author SHA1 Message Date
Eric Wong
da0bc948ac git-svn: add space after "W:" prefix in warning
And minor reformatting while we're in the area.

Signed-off-by: Eric Wong <normalperson@yhbt.net>
2014-10-30 08:31:28 +00:00
Eric Wong
4ae9a7b966 git-svn: (cleanup) remove editor param passing
Neither find_extra_svk_parents or find_extra_svn_parents ever
used the `$ed' parameter.

Signed-off-by: Eric Wong <normalperson@yhbt.net>
2014-10-30 08:29:43 +00:00
Eric Wong
7676aff709 git-svn: disable _rev_list memoization
This memoization appears unneeded as the check_cherry_pick2 cache is
in front of it does enough.

With this change applied, importing from local svn+ssh and http copies
of the R repo[1] takes only 2:00 (2 hours) on my system and the git-svn
process never uses more than 60MB RSS on my x86-64 GNU/Linux system[2].
This 60M measurement is only for the git-svn Perl process itself and
does not include memory used by git subprocesses accessing large packs
(subprocess memory usage _is_ measured by my time(1) tool).

Before this change, an import took longer (2:20) on svn+ssh:// but
git-svn used around 240MB during the imports.  Worse yet, git-svn
ballooned to over 400M when writing out the cache to the filesystem.

I also tried removing memoization for `has_no_changes', too, but a
local copy of the R repository(*) was not close to finishing within
10 hours on my system.

[1] http://svn.r-project.org/R
[2] file:// repos causes libsvn to use more memory internally

Signed-off-by: Eric Wong <normalperson@yhbt.net>
Cc: Hin-Tak Leung <htl10@users.sourceforge.net>
2014-10-27 01:39:39 +00:00
Eric Wong
2b6c613f1a git-svn: remove mergeinfo rev caching
This should further reduce memory usage from the new mergeinfo
speedups without hurting performance too much, assuming
reasonable latency to the SVN server.

Cc: Hin-Tak Leung <htl10@users.sourceforge.net>
Suggested-by: Jakob Stoklund Olesen <stoklund@2pi.dk>
Signed-off-by: Eric Wong <normalperson@yhbt.net>
2014-10-24 22:55:43 +00:00
Eric Wong
54b95346c1 git-svn: cache only mergeinfo revisions
This should reduce excessive memory usage from the new mergeinfo
caches without hurting performance too much, assuming reasonable
latency to the SVN server.

Cc: Hin-Tak Leung <htl10@users.sourceforge.net>
Suggested-by: Jakob Stoklund Olesen <stoklund@2pi.dk>
Signed-off-by: Eric Wong <normalperson@yhbt.net>
2014-10-24 22:55:39 +00:00
Eric Wong
d0b34f241d git-svn: reduce check_cherry_pick cache overhead
We do not need to store entire lists of commits, only the
number of incomplete and the first commit for reference.
This reduces the amount of data we need to store in memory
and on disk stores.

Signed-off-by: Eric Wong <normalperson@yhbt.net>
2014-10-24 22:55:35 +00:00
Jakob Stoklund Olesen
9ee13a934e git-svn: only look at the root path for svn:mergeinfo
Subversion can put mergeinfo on any sub-directory to track cherry-picks.
Since cherry-picks are not represented explicitly in git, git-svn should
just ignore it.

Signed-off-by: Jakob Stoklund Olesen <stoklund@2pi.dk>
Signed-off-by: Eric Wong <normalperson@yhbt.net>
2014-10-24 22:55:29 +00:00
Jakob Stoklund Olesen
abfef3bbf5 git-svn: only look at the new parts of svn:mergeinfo
In a Subversion repository where many feature branches are merged into a
trunk, the svn:mergeinfo property can grow very large. This severely
slows down git-svn's make_log_entry() because it is checking all
mergeinfo entries every time the property changes.

In most cases, the additions to svn:mergeinfo since the last commit are
pretty small, and there is nothing to gain by checking merges that were
already checked for the last commit in the branch.

Add a mergeinfo_changes() function which computes the set of interesting
changes to svn:mergeinfo since the last commit. Filter out merged
branches whose ranges haven't changed, and remove a common prefix of
ranges from other merged branches.

This speeds up "git svn fetch" by several orders of magnitude on a large
repository where thousands of feature branches have been merged.

Signed-off-by: Jakob Stoklund Olesen <stoklund@2pi.dk>
Signed-off-by: Eric Wong <normalperson@yhbt.net>
2014-10-24 22:55:26 +00:00
RomanBelinsky
784f4b6f33 SVN.pm::parse_svn_date: allow timestamps with a single-digit hour
Some broken subversion server gives timestamps with only one digit
in the hour part, like this:

    2014-01-07T5:01:02.048176Z

Loosen the regexp that expected to see two-digit hour, minute and
second parts to accept a single-digit hour (but not minute or
second).

Signed-off-by: Stepan Kasal <kasal@ucw.cz>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-04-17 11:01:26 -07:00
Justin Lebar
235e8d5914 code and test: fix misuses of "nor"
Signed-off-by: Justin Lebar <jlebar@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-03-31 15:29:33 -07:00
Justin Lebar
01689909eb comments: fix misuses of "nor"
Signed-off-by: Justin Lebar <jlebar@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-03-31 15:29:27 -07:00
lin zuojian
ab0bcec987 git-svn: memoize _rev_list and rebuild
According to profile data, _rev_list and rebuild consume a large
portion of time.  Memoize the results of _rev_list and memoize
rebuild internals to avoid subprocess invocation.

When importing 15152 revisions on a LAN, time improved from 10
hours to 3-4 hours.

Signed-off-by: lin zuojian <manjian2006@gmail.com>
Signed-off-by: Eric Wong <normalperson@yhbt.net>
2014-01-23 02:54:26 +00:00
Eric Wong
eae6cf5aa8 git svn: consistent spacing after "W:" in warnings
All other instances of "W:"-prefixed warning messages have a space after
the "W:" to help with readability.

Signed-off-by: Eric Wong <normalperson@yhbt.net>
2013-03-08 09:53:57 +00:00
Jan Pešta
47543d161e git svn: ignore partial svn:mergeinfo
Currently this is cosmetic change - the merges are ignored, becuase the methods
(lookup_svn_merge, find_rev_before, find_rev_after) are failing on comparing text with number.

See http://www.open.collab.net/community/subversion/articles/merge-info.html
Extract:
The range r30430:30435 that was added to 1.5.x in this merge has a '*' suffix for 1.5.x\www.
This '*' is the marker for a non-inheritable mergeinfo range.
The '*' means that only the path on which the mergeinfo is explicitly set has had this range merged into it.

Signed-off-by: Jan Pesta <jan.pesta@certicon.cz>
Signed-off-by: Eric Wong <normalperson@yhbt.net>
2013-03-08 09:46:03 +00:00
Junio C Hamano
01e1406100 Merge branch 'bw/get-tz-offset-perl'
* bw/get-tz-offset-perl:
  cvsimport: format commit timestamp ourselves without using strftime
  perl/Git.pm: fix get_tz_offset to properly handle DST boundary cases
  Move Git::SVN::get_tz to Git::get_tz_offset
2013-02-14 10:29:44 -08:00
Ben Walton
68868ff573 Move Git::SVN::get_tz to Git::get_tz_offset
This function has utility outside of the SVN module for any routine
that needs the equivalent of GNU strftime's %z formatting option.
Move it to the top-level Git.pm so that non-SVN modules don't need to
import the SVN module to use it.

The rename makes the purpose of the function clearer.

Signed-off-by: Ben Walton <bdwalton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-02-09 14:01:28 -08:00
Eric Wong
1b67bef256 git-svn: cleanup sprintf usage for uppercasing hex
We do not need to call uc() separately for sprintf("%x")
as sprintf("%X") is available.

Signed-off-by: Eric Wong <normalperson@yhbt.net>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
2013-01-24 10:21:23 +00:00
Jonathan Nieder
52de6fa2c7 Git::SVN: rename private path field
All users of $gs->{path} should have been converted to use the
accessor by now.  Check our work by renaming the underlying variable
to break callers that try to use it directly.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Eric Wong <normalperson@yhbt.net>
2012-10-05 22:48:12 +00:00
Steven Walter
14d3ce1120 git-svn.perl: keep processing all commits in parents_exclude
This fixes a bug where git finds the incorrect merge parent.  Consider a
repository with trunk, branch1 of trunk, and branch2 of branch1.
Without this change, git interprets a merge of branch2 into trunk as a
merge of branch1 into trunk.

Signed-off-by: Steven Walter <stevenrwalter@gmail.com>
Reviewed-by: Sam Vilain <sam@vilain.net>
Signed-off-by: Eric Wong <normalperson@yhbt.net>
2012-10-05 22:48:12 +00:00
Steven Walter
f271fad266 git-svn.perl: consider all ranges for a given merge, instead of only tip-by-tip
Consider the case where you have trunk, branch1 of trunk, and branch2 of
branch1.  trunk is merged back into branch2, and then branch2 is
reintegrated into trunk.  The merge of branch2 into trunk will have
svn:mergeinfo property references to both branch1 and branch2.  When
git-svn fetches the commit that merges branch2 (check_cherry_pick),
it is necessary to eliminate the merged contents of branch1 as well as
branch2, or else the merge will be incorrectly ignored as a cherry-pick.

Signed-off-by: Steven Walter <stevenrwalter@gmail.com>
Reviewed-by: Sam Vilain <sam@vilain.net>
Signed-off-by: Eric Wong <normalperson@yhbt.net>
2012-10-05 22:48:12 +00:00
Junio C Hamano
9d305e5e70 Merge branch 'ms/git-svn-1.7'
A series by Michael Schwern via Eric to update git-svn to revamp the
way URLs are internally passed around, to make it work with SVN 1.7.

* ms/git-svn-1.7:
  git-svn: remove ad-hoc canonicalizations
  git-svn: canonicalize newly-minted URLs
  git-svn: introduce add_path_to_url function
  git-svn: canonicalize earlier
  git-svn: replace URL escapes with canonicalization
  git-svn: attempt to mimic SVN 1.7 URL canonicalization
  t9107: fix typo
  t9118: workaround inconsistency between SVN versions
  Git::SVN{,::Ra}: canonicalize earlier
  git-svn: path canonicalization uses SVN API
  Git::SVN::Utils: remove irrelevant comment
  git-svn: add join_paths() to safely concatenate paths
  git-svn: factor out _collapse_dotdot function
  git-svn: use SVN 1.7 to canonicalize when possible
  git-svn: move canonicalization to Git::SVN::Utils
  use Git::SVN{,::RA}->url accessor globally
  use Git::SVN->path accessor globally
  Git::SVN::Ra: use accessor for URLs
  Git::SVN: use accessor for URLs internally
  Git::SVN: use accessors internally for path
2012-08-22 11:51:20 -07:00
Peter Baumann
61b472ed8b git svn: reset invalidates the memoized mergeinfo caches
Since v1.7.0-rc2~11 (git-svn: persistent memoization, 2010-01-30),
git-svn has maintained some private per-repository caches in
.git/svn/.caches to avoid refetching and recalculating some
mergeinfo-related information with every 'git svn fetch'.

This memoization can cause problems, e.g consider the following case:

SVN repo:

  ... - a - b - c - m  <- trunk
          \        /
            d  -  e    <- branch1

The Git import of the above repo is at commit 'a' and doesn't know about
the branch1. In case of an 'git svn rebase', only the trunk of the
SVN repo is imported. During the creation of the git commit 'm', git svn
uses the svn:mergeinfo property and tries to find the corresponding git
commit 'e' to create 'm' with 'c' and 'e' as parents. But git svn rebase
only imports the current branch so commit 'e' is not imported.
Therefore git svn fails to create commit 'm' as a merge commit, because one
of its parents is not known to git. The imported history looks like this:

  ... - a - b - c - m  <- trunk

A later 'git svn fetch' to import all branches can't rewrite the commit 'm'
to add 'e' as a parent and to make it a real git merge commit, because it
was already imported.

That's why the imported history misses the merge and looks like this:

  ... - a - b - c - m  <- trunk
          \
            d  -  e    <- branch1

Right now the only known workaround for importing 'm' as a merge is to
force reimporting 'm' again from SVN, e.g. via

  $ git svn reset --revision $(git find-rev $c)
  $ git svn fetch

Sadly, this is where the behavior has regressed: git svn reset doesn't
invalidate the old mergeinfo cache, which is no longer valid for the
reimport, which leads to 'm' beeing imprted with only 'c' as parent.

As solution to this problem, this commit invalidates the mergeinfo cache
to force correct recalculation of the parents.

During development of this patch, several ways for invalidating the cache
where considered. One of them is to use Memoize::flush_cache, which will
call the CLEAR method on the underlying Memoize persistency implementation.
Sadly, neither Memoize::Storable nor the newer Memoize::YAML module
introduced in 68f532f4ba could optionally be used implement the
CLEAR method, so this is not an option.

Reseting the internal hash used to store the memoized values has the same
problem, because it calls the non-existing CLEAR method of the
underlying persistency layer, too.

Considering this and taking into account the different implementations
of the memoization modules, where Memoize::Storable is not in our control,
implementing the missing CLEAR method is not an option, at least not if
Memoize::Storable is still used.

Therefore the easiest solution to clear the cache is to delete the files
on disk in 'git svn reset'. Normally, deleting the files behind the back
of the memoization module would be problematic, because the in-memory
representation would still exist and contain wrong data. Fortunately, the
memoization is active in memory only for a small portion of the code.
Invalidating the cache by deleting the files on disk if it isn't active
should be safe.

Signed-off-by: Peter Baumann <waste.manager@gmx.de>
Signed-off-by: Steven Walter <stevenrwalter@gmail.com>
Signed-off-by: Eric Wong <normalperson@yhbt.net>
2012-08-10 19:53:18 +00:00
Michael G. Schwern
5eaa1fd086 git-svn: remove ad-hoc canonicalizations
[ew: commit title]

Signed-off-by: Eric Wong <normalperson@yhbt.net>
2012-08-02 21:46:06 +00:00
Michael G. Schwern
705b49cb81 git-svn: canonicalize newly-minted URLs
Go through all the spots that use the new add_path_to_url() to
make a new URL and canonicalize them.

* copyfrom_path has to be canonicalized else find_parent_branch
  will get confused

* due to the `canonicalize_url($full_url) ne $full_url)` line of
  logic in gs_do_switch(), $full_url is left alone until after.

At this point SVN 1.7 passes except for 3 tests in
t9100-git-svn-basic.sh that look like an SVN bug to do with
symlinks.

[ew: commit title]

Signed-off-by: Eric Wong <normalperson@yhbt.net>
2012-08-02 21:46:05 +00:00
Michael G. Schwern
d2fd119c4f git-svn: introduce add_path_to_url function
Remove the ad-hoc versions.

This is mostly to normalize the process and ensure the URLs produced
don't have double slashes or anything.

Also provides a place to fix the corner case where a file path
contains a percent sign.

[ew: commit title]

Signed-off-by: Eric Wong <normalperson@yhbt.net>
2012-08-02 21:46:03 +00:00
Michael G. Schwern
9c27a57b2d git-svn: replace URL escapes with canonicalization
The old hand-rolled URL escape functions were inferior to
canonicalization functions.

Continuing to move towards getting everything canonicalizing the same way.

* Git::SVN->init_remote_config and Git::SVN::Ra->minimize_url both
  have to canonicalize the same way else init_remote_config
  will incorrectly think they're different URLs causing
  t9107-git-svn-migrate.sh to fail.

[ew: commit title]

Signed-off-by: Eric Wong <normalperson@yhbt.net>
2012-08-02 21:46:01 +00:00
Michael G. Schwern
93c3fcbe4d git-svn: attempt to mimic SVN 1.7 URL canonicalization
Previously, our URL canonicalization didn't do much of anything.
Now it actually escapes and collapses slashes.  This is mostly a cut & paste
of escape_url from git-svn.

This is closer to how SVN 1.7's canonicalization behaves.  Doing it with
1.6 lets us chase down some problems caused by more effective canonicalization
without having to deal with all the other 1.7 issues on top of that.

* Remote URLs have to be canonicalized otherwise Git::SVN->find_existing_remote
  will think they're different.

* The SVN remote is now written to the git config canonicalized.  That
  should be ok.  Adjust a test to account for that.

[ew: commit title]

Signed-off-by: Eric Wong <normalperson@yhbt.net>
2012-08-02 21:46:00 +00:00
Michael G. Schwern
565e56c2cc Git::SVN{,::Ra}: canonicalize earlier
This canonicalizes paths and urls as early as possible so we don't
have to remember to do it at the point of use.  It will fix a swath
of SVN 1.7 problems in one go.

Its ok to double canonicalize things.

SVN 1.7 still fails, still not worrying about that.

[ew: commit title]

Signed-off-by: Eric Wong <normalperson@yhbt.net>
2012-08-02 21:45:56 +00:00
Michael G. Schwern
ca475a61f8 git-svn: add join_paths() to safely concatenate paths
Otherwise you might wind up with things like...

    my $path1 = undef;
    my $path2 = 'foo';
    my $path = $path1 . '/' . $path2;

creating '/foo'.  Or this...

    my $path1 = 'foo/';
    my $path2 = 'bar';
    my $path = $path1 . '/' . $path2;

creating 'foo//bar'.

Could have used File::Spec, but I'm shying away from it due to SVN
1.7's pickiness about paths.  Felt it would be better to have our own
we can control completely.

[ew: commit title]

Signed-off-by: Eric Wong <normalperson@yhbt.net>
2012-08-02 21:44:04 +00:00
Michael G. Schwern
b1ea6c3829 use Git::SVN{,::RA}->url accessor globally
Note: The structure returned from Git::SVN->read_all_remotes() does not
appear to contain objects, so I'm leaving them alone.

That's everything converted over to the url and path accessors.

No functional change.

[ew: commit title]

Signed-off-by: Eric Wong <normalperson@yhbt.net>
2012-08-02 21:42:59 +00:00
Michael G. Schwern
6a8d999ed4 use Git::SVN->path accessor globally
No functional change.

[ew: commit title]

Signed-off-by: Eric Wong <normalperson@yhbt.net>
2012-08-02 21:42:58 +00:00
Michael G. Schwern
06ee19e8e5 Git::SVN: use accessor for URLs internally
So later it can do automatic canonicalization.

A later patch will make other things use the accessor.

No functional change here.

[ew: commit title]
Signed-off-by: Eric Wong <normalperson@yhbt.net>
2012-08-02 21:42:47 +00:00
Michael G. Schwern
5578ed744d Git::SVN: use accessors internally for path
Then later it can be canonicalized automatically rather than everywhere
its used.

Later patch will make other things use it.

[ew: commit title, reformatted accessor to match existing style]

Signed-off-by: Eric Wong <normalperson@yhbt.net>
2012-08-02 21:42:45 +00:00
Michael G. Schwern
3d9be15fc2 Extract Git::SVN::GlobSpec from git-svn.
Straight cut & paste.  That's the last class.

* Make Git::SVN load it on its own, its the only thing that needs it.

Signed-off-by: Eric Wong <normalperson@yhbt.net>
2012-07-27 22:36:19 +00:00
Michael G. Schwern
5c71028fce Move initialization of Git::SVN variables into Git::SVN.
Also it can compile on its own now, yay!

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Eric Wong <normalperson@yhbt.net>
2012-07-27 22:14:54 +00:00
Michael G. Schwern
29499c0b27 Extract Git::SVN from git-svn into its own .pm file.
Except for adding the 1; at the end, this is a straight copy & paste.

Tests still pass, but its doubtful Git::SVN will compile on its own
without git-svn being loaded.  Next commit will fix that.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Eric Wong <normalperson@yhbt.net>
2012-07-27 22:14:53 +00:00