mirror of https://github.com/git/git.git synced 2024-11-18 15:04:49 +01:00
Commit graph

24 commits

Author SHA1 Message Date
Jonathan Nieder
3ac10b2e3f vcs-svn: avoid hangs from corrupt deltas
A corrupt Subversion-format delta can request reads past the end of
the preimage.  Set sliding_view::max_off so such corruption is caught
when it appears rather than blocking in an impossible-to-fulfill
read() when input is coming from a socket or pipe.
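
The idea of bounding reads by a known maximum offset can be illustrated with a small Python sketch. This is not the vcs-svn code: the `SlidingView` class, its `read_window` method, and the error messages are hypothetical stand-ins, and only the `max_off` name comes from the message above. A seekable in-memory stream stands in for the socket or pipe.

```python
import io


class SlidingView:
    """Toy window over a preimage stream (illustrative, not vcs-svn's struct)."""

    def __init__(self, stream, max_off=None):
        self.stream = stream
        self.max_off = max_off  # None means the preimage length is unknown

    def read_window(self, off, width):
        # Reject a window that extends past the known end of the preimage
        # before touching the stream, so the reader is never asked for
        # bytes that cannot arrive.
        if self.max_off is not None and off + width > self.max_off:
            raise ValueError("delta requests data past end of preimage")
        self.stream.seek(off)
        data = self.stream.read(width)
        if len(data) != width:
            raise ValueError("unexpected end of preimage")
        return data


preimage = b"hello world"
view = SlidingView(io.BytesIO(preimage), max_off=len(preimage))
window = view.read_window(6, 5)
```

With the bound in place, a request such as `view.read_window(8, 10)` fails immediately instead of waiting on input.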

Inspired-by: Ramkumar Ramachandra <artagnon@gmail.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
2011-06-15 02:32:50 -05:00
David Barr
7a75e661c5 vcs-svn: implement text-delta handling
Handle input in Subversion's dumpfile format, version 3.  This is the
format produced by "svnrdump dump" and "svnadmin dump --deltas", and
the main difference between v3 dumpfiles and the dumpfiles already
handled is that these can include nodes whose properties and text are
expressed relative to some other node.

To handle such nodes, we find which node the text and properties are
based on, handle its property changes, use the cat-blob command to
request the basis blob from the fast-import backend, use the
svndiff0_apply() helper to apply the text delta on the fly, writing
output to a temporary file, and then measure that postimage file's
length and write its content to the fast-import stream.
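
The last step above (measure the postimage, then write it to the stream) can be sketched in Python. The helper name `emit_postimage` is hypothetical, and the delta application itself is assumed to have already produced the postimage bytes. The output follows fast-import's `data <count>` framing with its optional trailing newline.

```python
import io
import tempfile


def emit_postimage(postimage: bytes, out) -> None:
    """Spool the postimage to a temporary file, then emit a 'data' command."""
    with tempfile.TemporaryFile() as tmp:
        tmp.write(postimage)
        length = tmp.tell()          # measure the spooled postimage
        tmp.seek(0)
        out.write(b"data %d\n" % length)
        out.write(tmp.read())
        out.write(b"\n")             # optional LF after the raw data


stream = io.BytesIO()
emit_postimage(b"hello\n", stream)
```

The length has to be known before the content is written, which is why the postimage is spooled to a file first.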

The temporary postimage file is shared between delta-using nodes to
avoid some file system overhead.

The svn-fe interface needs to be more complicated to accommodate the
backward flow of information from the fast-import backend to svn-fe.
The backflow fd is not needed when parsing streams without deltas,
though, so existing scripts using svn-fe on v2 dumps should
continue to work.

NEEDSWORK: generalize interface so caller sets the backflow fd, close
temporary file before exiting

Signed-off-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
2011-05-26 02:28:04 -05:00
Jonathan Nieder
9ecfa8ae4c Merge branch 'db/vcs-svn-incremental' into svn-fe
This teaches svn-fe to incrementally import into an existing
repository (at last!) at the expense of less convenient UI.  Think of
it as growing pains.  This opens the door to many excellent things,
and it would be a bad idea to discourage people from building on it
for much longer.

* db/vcs-svn-incremental:
  vcs-svn: avoid using ls command twice
  vcs-svn: use mark from previous import for parent commit
  vcs-svn: handle filenames with dq correctly
  vcs-svn: quote paths correctly for ls command
  vcs-svn: eliminate repo_tree structure
  vcs-svn: add a comment before each commit
  vcs-svn: save marks for imported commits
  vcs-svn: use higher mark numbers for blobs
  vcs-svn: set up channel to read fast-import cat-blob response

Conflicts:
	t/t9010-svn-fe.sh
	vcs-svn/fast_export.c
	vcs-svn/fast_export.h
	vcs-svn/repo_tree.c
	vcs-svn/svndump.c
2011-05-26 02:02:44 -05:00
Jonathan Nieder
4c502d6866 tests: make sure input to sed is newline terminated
POSIX only requires sed to work on text files, and because it does
not end with a newline, this commit's content is not a text file.
Add a newline to fix it.  Without this change, OS X sed helpfully
adds a newline to actual.message, causing t9010.13 to fail.

Reported-by: Torsten Bögershausen <tboegi@web.de>
Tested-by: Brian Gernhardt <benji@silverinsanity.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
2011-03-29 02:14:23 -05:00
Jonathan Nieder
195b7ca6f2 vcs-svn: handle log message with embedded NUL
Pass the log message by strbuf instead of as a C-style string and use
fwrite instead of printf to write it to fast-import so embedded '\0'
bytes can be preserved.

Currently "git log" doesn't show the embedded NULs but "git cat-file
commit" can.

While at it, stop including system headers from repo_tree.h.  git
source files need to include git-compat-util.h (or cache.h or
builtin.h) sooner to ensure the appropriate feature test macros are
defined.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
2011-03-26 00:49:37 -05:00
Jonathan Nieder
e7d04ee147 vcs-svn: make reading of properties binary-safe
svn-fe errors out on revision 59151 of the ASF repository:

 fatal: invalid dump: unexpected end of file

The proximate cause is a property with an embedded NUL character.
Previously such anomalies were ignored but commit c9d1c8ba
(2010-12-28) introduced a check strlen(val) == len to avoid reading
uninitialized data when a property list ends early and unfortunately
this test does not distinguish between "foo" followed by EOF and the
string "foo\0bar\0baz".

Fix it by using buffer_read_binary to read to a strbuf and checking
the actual length read.  Most consumers of properties still use
C-style strings, so in practice an author or log message with embedded
NULs will be truncated, but at least this way svn-fe won't error out
(fixing the regression).
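
A short Python sketch shows why the strlen-based check cannot tell the two inputs apart while a check on the number of bytes read can. The `c_strlen` helper is a hypothetical emulation of C's `strlen` on a NUL-terminated buffer. It is not taken from svn-fe.

```python
def c_strlen(buf: bytes) -> int:
    """Length up to the first NUL, as strlen() sees a NUL-terminated buffer."""
    nul = buf.find(b"\0")
    return len(buf) if nul < 0 else nul


declared_len = 11
value = b"foo\0bar\0baz"   # complete value containing embedded NULs
truncated = b"foo"         # only 3 of the declared 11 bytes arrived before EOF

# The strlen-based test reports a mismatch for both inputs.
strlen_ok_value = c_strlen(value) == declared_len
strlen_ok_truncated = c_strlen(truncated) == declared_len

# Comparing the number of bytes actually read separates the two cases.
length_ok_value = len(value) == declared_len
length_ok_truncated = len(truncated) == declared_len
```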

Reported-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
2011-03-26 00:15:10 -05:00
David Barr
e435811208 vcs-svn: quote paths correctly for ls command
This bug was found while importing rev 601865 of ASF.

[jn: with test]

Signed-off-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
2011-03-07 01:43:58 -06:00
David Barr
41529bbce4 vcs-svn: set up channel to read fast-import cat-blob response
Set up some plumbing: teach the svndump lib to pass a file descriptor
number to the fast_export lib, representing where cat-blob/ls
responses can be read from, and add a get_response_line helper
function to the fast_export lib to read a line from that file.

Unfortunately this means that svn-fe needs file descriptor 3 to be
redirected from somewhere (preferably the cat-blob stream of a
fast-import backend); otherwise it will fail:

	$ svndump <path> | svn-fe
	fatal: cannot read from file descriptor 3: Bad file descriptor

For the moment, "svn-fe 3</dev/null" works as a workaround but it
will not work for very long.  A fast-import backend that can retrieve
old commits is needed in order to be able to fulfill svn
"Node-copyfrom-rev" requests that refer to revs from a previous run.

[jn: with new change description]

Based-on-patch-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
2011-03-07 01:43:57 -06:00
Jonathan Nieder
a62bbf8f01 Merge commit 'jn/svn-fe' of git://github.com/gitster/git into svn-fe
* git://github.com/gitster/git:
  vcs-svn: Allow change nodes for root of tree (/)
  vcs-svn: Implement Prop-delta handling
  vcs-svn: Sharpen parsing of property lines
  vcs-svn: Split off function for handling of individual properties
  vcs-svn: Make source easier to read on small screens
  vcs-svn: More dump format sanity checks
  vcs-svn: Reject path nodes without Node-action
  vcs-svn: Delay read of per-path properties
  vcs-svn: Combine repo_replace and repo_modify functions
  vcs-svn: Replace = Delete + Add
  vcs-svn: handle_node: Handle deletion case early
  vcs-svn: Use mark to indicate nodes with included text
  vcs-svn: Unclutter handle_node by introducing have_props var
  vcs-svn: Eliminate node_ctx.mark global
  vcs-svn: Eliminate node_ctx.srcRev global
  vcs-svn: Check for errors from open()
  vcs-svn: Allow simple v3 dumps (no deltas yet)

Conflicts:
	t/t9010-svn-fe.sh
	vcs-svn/svndump.c
2011-02-26 05:21:29 -06:00
Jonathan Nieder
0316bba80f t9010: svnadmin can fail even if available
If svn is built against one version of SQLite and run against another,
libsvn_subr needlessly errors out in operations that need to make a
commit.

That is clearly not a bug in git but let us consider the ramifications for
the test suite.  git-svn uses libsvn directly and is probably broken by
that bug; it is right for git-svn tests to fail.  The vcs-svn lib, on the
other hand, does not use libsvn and the test t9010 only uses svn to check
its work.  This points to two possible improvements:

 - do not disable most vcs-svn tests if svn is missing.
 - skip validation rather than failing it when svn fails.

Bring about both by putting the svn invocations into a single test that
builds a repo to compare the test-svn-fe result against.  The test will
always pass but will only set the new SVNREPO test prereq if svn succeeds;
and validation using that repo gets an SVNREPO prerequisite so it only
runs with working svn installations.

Works-around: http://bugs.debian.org/608925
Noticed-by: A Large Angry SCM <gitzilla@gmail.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
2011-01-10 09:35:17 -08:00
Junio C Hamano
bf9b46c16d Merge branch 'jn/svn-fe' (early part)
* 'jn/svn-fe' (early part):
  vcs-svn: Error out for v3 dumps

Conflicts:
	t/t9010-svn-fe.sh
2011-01-05 13:34:43 -08:00
Junio C Hamano
8e9d453ce7 t9010 fails when no svn is available
Running test t9010 without svn currently errors out for no good reason.

The test uses "svnadmin" without checking if svn is available.  This was a
regression introduced by b0ad24b (t9010 (svn-fe): Eliminate dependency on
svn perl bindings, 2010-10-10) when it stopped including ./lib-git-svn.sh
that had the safety.

This should fix it.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-12-09 09:26:07 -08:00
Jonathan Nieder
9e8c532108 vcs-svn: Allow change nodes for root of tree (/)
It is not uncommon for a svn repository to include change records for
properties at the top level of the tracked tree:

	Node-path:
	Node-kind: dir
	Node-action: change
	Prop-delta: true
	Prop-content-length: 43
	Content-length: 43

	K 10
	svn:ignore
	V 11
	build-area

	PROPS-END

Unfortunately a recent svn-fe change (vcs-svn: More dump format sanity
checks, 2010-11-19) causes such nodes to be rejected with the error
message

	fatal: invalid dump: path to be modified is missing

The repo_tree module does not keep a dirent for the root of the tree.
Add a block to the dump parser to take care of this case.
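
For readers unfamiliar with the property block in the node above, here is a minimal Python sketch of how its length-prefixed `K`/`V` records can be read. The function name `parse_props` is hypothetical, this is not the svn-fe parser, and only the `K` and `V` record types from the example are handled.

```python
def parse_props(payload: bytes):
    """Return the (key, value) pairs found before PROPS-END."""
    props = []
    pos = 0
    key = None
    while True:
        end = payload.index(b"\n", pos)
        line = payload[pos:end]
        pos = end + 1
        if line == b"PROPS-END":
            return props
        kind, length = line.split(b" ")
        n = int(length)
        data = payload[pos:pos + n]
        # Each record is exactly n bytes followed by a newline terminator.
        if len(data) != n or payload[pos + n:pos + n + 1] != b"\n":
            raise ValueError("invalid property block")
        pos += n + 1
        if kind == b"K":
            key = data
        elif kind == b"V":
            props.append((key, data))
        else:
            raise ValueError("unsupported property record")


payload = b"K 10\nsvn:ignore\nV 11\nbuild-area\n\nPROPS-END\n"
props = parse_props(payload)
```

The value is 11 bytes because it includes the newline after `build-area`, and the whole payload is 43 bytes, matching `Prop-content-length: 43`.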

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-12-07 16:04:56 -08:00
David Barr
6b01b67658 vcs-svn: Implement Prop-delta handling
The rules for what file is used as delta source for each file are not
documented in dump-load-format.txt.  Luckily, the Apache Software
Foundation repository has rich enough examples to figure out most of
the rules:

Node-action: replace implies the empty property set and empty text as
preimage for deltas.  Otherwise, if a copyfrom source is given, that
node is the preimage for deltas.  Lastly, if none of the above applies
and the node path exists in the current revision, then that version
forms the basis.
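
The three rules can be read as an ordered decision. The Python sketch below is illustrative only: the function name `delta_basis`, its parameters, and the returned labels are hypothetical, and the final fallthrough case is not described in this message, so it is marked as unspecified rather than guessed.

```python
from typing import Optional, Tuple


def delta_basis(action: str, copyfrom: Optional[str],
                path: str, path_exists: bool) -> Tuple[str, Optional[str]]:
    """Pick the preimage for a node's deltas, checking the rules in order."""
    if action == "replace":
        return ("empty", None)        # empty property set and empty text
    if copyfrom is not None:
        return ("copyfrom", copyfrom) # the copy source is the preimage
    if path_exists:
        return ("current", path)      # version in the current revision
    return ("unspecified", None)      # not covered by the rules above


basis = delta_basis("change", None, "trunk/README", True)
```

Note that `replace` wins even when a copyfrom source is present, because it is checked first.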

[jn: refactored, with tests]

Signed-off-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-11-24 14:53:59 -08:00
Jonathan Nieder
c7dbf35e91 vcs-svn: More dump format sanity checks
Node-action: change is not appropriate when switching between file and
directory or adding a new file.  Current svn-fe silently accepts such
nodes and the resulting tree has missing files in the "changed when
meant to add" case.

Node-action: add requires some content (text or directory); there is
no such thing as an "intent to add" node in svn dumps.  Current svn-fe
accepts such contentless adds but produces an invalid fast-import
stream that refers to nonexistent mark :0 in response.
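
The two checks described above might look roughly like this in Python. This is a sketch under stated assumptions, not the svn-fe implementation: `validate_node` and its parameters are hypothetical, and the only error text taken from this log is "path to be modified is missing".

```python
from typing import Optional


def validate_node(action: str, kind: str,
                  existing_kind: Optional[str], has_text: bool) -> None:
    """Raise ValueError for the node shapes the message calls invalid."""
    if action == "change":
        # "change" cannot add a new path ...
        if existing_kind is None:
            raise ValueError("invalid dump: path to be modified is missing")
        # ... and cannot switch between file and directory.
        if existing_kind != kind:
            raise ValueError("invalid dump: change switches node kind")
    elif action == "add":
        # An add must carry content: text for a file, or be a directory.
        if kind != "dir" and not has_text:
            raise ValueError("invalid dump: add without content")


validate_node("add", "dir", None, False)        # directory add is accepted
validate_node("change", "file", "file", True)   # ordinary modification
```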

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-11-24 14:52:51 -08:00
Jonathan Nieder
414e569e45 vcs-svn: Reject path nodes without Node-action
It would be better to flag such errors and let the import proceed
anyway, but for now it is simpler not to worry about recovery
from such weird cases.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-11-24 14:52:47 -08:00
Jonathan Nieder
1d13e9f600 vcs-svn: Eliminate node_ctx.srcRev global
The srcRev variable is only used in handle_node(); its purpose
is to hold the old mode for a path, to only be used if properties
are not being changed.  Narrow its scope to make its meaningful
lifetime more obvious.

No functional change intended.  Add some tests as a sanity-check
for the simplest case (no renames).

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-11-24 14:51:42 -08:00
David Barr
1f05d07c45 vcs-svn: Allow simple v3 dumps (no deltas yet)
Since the dumpfile version 1 days, the Subversion dump format
gained some new fields:

 - a unique identifier for the repository (version 2 format)
 - whether the text and properties for a node should be
   interpreted as deltas
 - checksums for a delta's preimage
 - SHA-1 sums as alternatives to the existing MD5 checksums for
   copy source and the payload (delta).

For now what is relevant to us is the Text-delta and Prop-delta
fields, since not noticing these causes a dump file to be
misinterpreted (see the previous commit).

[jn: with tests]

Signed-off-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-11-24 14:48:54 -08:00
Jonathan Nieder
b3e5bce1aa vcs-svn: Error out for v3 dumps
By ignoring the Text-delta and Prop-delta node fields, current svn-fe
happily mistakes deltas for full text and instead of cleanly erroring
out, it produces a valid but semantically bogus fast-import stream
when fed a dump file in the modern "svnadmin dump --deltas" format.

Dump file parsers are supposed to ignore header fields they don't
understand (to allow for backward-compatible extensions), but they are
also supposed to check the SVN-fs-dump-format-version header to
prevent misinterpretation of non-backward-compatible extensions.
Do so.
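
A version gate of this kind can be sketched in a few lines of Python. The function name `check_dump_version`, the `max_supported` default of 2, and the error wording are assumptions for illustration. Only the `SVN-fs-dump-format-version` header name comes from the message above.

```python
def check_dump_version(header_line: str, max_supported: int = 2) -> int:
    """Parse the version header and refuse formats newer than supported."""
    name, sep, value = header_line.partition(": ")
    if not sep or name != "SVN-fs-dump-format-version":
        raise ValueError("missing dump format version header")
    version = int(value.strip())
    if version > max_supported:
        # Erroring out here avoids mistaking deltas for full text later.
        raise ValueError(
            "unsupported dump format version %d (maximum %d)"
            % (version, max_supported)
        )
    return version


version = check_dump_version("SVN-fs-dump-format-version: 2\n")
```

Unknown header fields elsewhere in the dump can still be skipped. Only the version header is treated as a hard gate.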

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-11-24 14:48:52 -08:00
Ramkumar Ramachandra
b0ad24be8c t9010 (svn-fe): Eliminate dependency on svn perl bindings
Running test t9010 without the SVN:: perl modules currently errors
out, for no good reason.  We can make these tests easier to read and
run by not using the perl libsvn bindings and instead duplicating only
the relevant code from lib-git-svn.sh.

Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-11-23 16:20:22 -08:00
Ævar Arnfjörð Bjarmason
cd9a7b57a7 t/t9010-svn-fe.sh: add an +x bit to this test
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-08-19 13:07:14 -07:00
Jonathan Nieder
7e45e0569c t9010 (svn-fe): avoid symlinks in test
The svn-fe test fails on Windows in the “svn export” step because of
the lack of symlink support.  With a less ambitious dump, it passes.

Acked-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-08-14 19:35:38 -07:00
Jonathan Nieder
24f1136894 t9010 (svn-fe): use Unix-style path in URI
Ever since v1.6.3-rc0~101^2~14 (Tests on Windows: $(pwd) must return
Windows-style paths, 2009-03-13), there is a subtle difference between
$(pwd) and $PWD in tests: the former returns Windows-style paths as
might be output by git and the latter Unix-style paths which msys
programs tend to prefer.

In file:// URIs, Unix-style paths are needed.  Before: “svn export”
declares it cannot find

 file://c:/apps/git/git/t/trash directory/simple-svco

After: “svn export” successfully finds

 file:///c/apps/git/git/...

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-08-14 19:35:38 -07:00
David Barr
21746aa34f SVN dump parser
svndump parses data that is in SVN dumpfile format produced by
`svnadmin dump` with the help of line_buffer and uses repo_tree and
fast_export to emit a git fast-import stream.

Based roughly on com.hydrografix.svndump 0.92 from the SvnToCCase
project at <http://svn2cc.sarovar.org/>, by Stefan Hegny and
others.

[rr: allow input from files other than stdin]
[jn: with test, more error reporting]

Signed-off-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-08-14 19:35:38 -07:00