mirror of https://github.com/git/git.git synced 2024-11-14 13:13:01 +01:00
Commit graph

16368 commits

Author SHA1 Message Date
Junio C Hamano
eeaa460314 [PATCH] diff: Update -B heuristics.
As Linus pointed out on the mailing list discussion, -B should
break a file that has many inserts even if it still keeps
enough of the original contents, so that the broken pieces can
later be matched with other files by -M or -C.  However, if such
a broken pair does not get picked up by -M or -C, we would want
to apply different criteria; namely, regardless of the amount of
new material in the result, the determination of "rewrite"
should be done by looking at the amount of original material
still left in the result.  If you still have the original 97
lines from a 100-line document, it does not matter if you add
your own 13 lines to make a 110-line document, or if you add 903
lines to make a 1000-line document.  It is not a rewrite but an
in-place edit.  On the other hand, if you did lose 97 lines from
the original, it does not matter if you added 27 lines to make a
30-line document or if you added 997 lines to make a 1000-line
document.  You did a complete rewrite in either case.
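The criterion in the paragraph above can be sketched as follows. This only illustrates the arithmetic: the function name, its arguments, and the use of line counts are assumptions made for the example and are not taken from the diffcore code. The 80% threshold is the default merge-avoidance value the message names.

```python
def is_rewrite(original_lines, surviving_lines, min_delete=0.80):
    """Hypothetical helper: decide "rewrite" from the original material lost.

    The amount of new material in the result is deliberately not an input,
    mirroring the message's point that added lines should not matter.
    """
    if original_lines == 0:
        return False
    deleted = original_lines - surviving_lines
    return deleted / original_lines >= min_delete


# 97 of 100 original lines survive: an in-place edit, however much was added.
print(is_rewrite(100, 97))
# Only 3 of 100 original lines survive: a rewrite, however much was added.
print(is_rewrite(100, 3))
```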

This patch introduces a post-processing phase that runs after
diffcore-rename matches up broken pairs diffcore-break creates.
The purpose of this post-processing is to pick up these broken
pieces and merge them back into in-place modifications.  For
this, the score parameter -B option takes is changed into a pair
of numbers, and it takes "-B99/80" format when fully spelled
out.  The first number is the minimum amount of "edit" (same
definition as what diffcore-rename uses, which is "sum of
deletion and insertion") that a modification needs to have to be
broken, and the second number is the minimum amount of "delete"
a surviving broken pair must have to avoid being merged back
together.  It can be abbreviated to "-B" to use default for
both, "-B9" or "-B9/" to use 90% for "edit" but default (80%)
for merge avoidance, or "-B/75" to use default (99%) "edit" and
75% for merge avoidance.
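A minimal sketch of how the "-B99/80" notation described above might be parsed. It is written in Python for illustration and is not the diff_scoreopt_parse() implementation; the function names are hypothetical, and reading the digits as a decimal fraction is an assumption based on "-B9" meaning 90%.

```python
DEFAULT_BREAK = 0.99  # default "edit" threshold named in the message
DEFAULT_MERGE = 0.80  # default merge-avoidance threshold named in the message


def _fraction(digits, default):
    """Read '9' as 0.9 and '75' as 0.75; an empty string selects the default."""
    if digits == "":
        return default
    if not (digits.isascii() and digits.isdigit()):
        raise ValueError(f"malformed score: {digits!r}")
    return int(digits) / 10 ** len(digits)


def parse_break_opt(arg):
    """Hypothetical parser returning (edit_threshold, merge_threshold)."""
    if not arg.startswith("-B"):
        raise ValueError(f"not a -B option: {arg!r}")
    edit, _, merge = arg[2:].partition("/")
    return _fraction(edit, DEFAULT_BREAK), _fraction(merge, DEFAULT_MERGE)


for opt in ("-B", "-B9", "-B9/", "-B/75", "-B99/80"):
    print(opt, parse_break_opt(opt))
```

Malformed input raises an error here instead of silently falling back on the defaults.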

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-03 11:23:03 -07:00
Junio C Hamano
0e3994fa97 [PATCH] diff: Clean up diff_scoreopt_parse().
This cleans up the diff_scoreopt_parse() function that is used to
parse the fractional notation the -B, -C and -M options take.  The
callers are modified to check for errors and complain.  Earlier
they silently ignored malformed input and fell back on the
default.

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-03 11:23:03 -07:00
Junio C Hamano
ce24067549 [PATCH] diff: Fix docs and add -O to diff-helper.
This patch updates diff documentation and usage strings:

 - clarify the semantics of -R.  It is not "output in reverse";
   rather, it is "I will feed diff backwards".  Semantically
   they are different when -C is involved.

 - describe -O in usage strings of diff-* brothers.  It was
   implemented, documented but not described in usage text.

Also it adds -O to diff-helper.  Like -S (and unlike -M/-C/-B),
this option can work on sanitized diff-raw output produced by
the diff-* brothers.  While we are at it, the call it makes to
diffcore is cleaned up to use the diffcore_std() like everybody
else, and the declarations for the low level diffcore routines
are moved from diff.h (public) to diffcore.h (private between
diff.c and diffcore backends).

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-03 11:23:03 -07:00
Junio C Hamano
355e76a4a3 [PATCH] Tweak count-delta interface
Make it return copied source and insertion separately, so that
later implementation of heuristics can use them more flexibly.

This does not change the heuristics implemented in
diffcore-rename nor diffcore-break in any way.

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-03 11:23:03 -07:00
Rene Scharfe
5b86040679 [PATCH] git-tar-tree: do only basic tests in t/t5000-git-tar-tree.sh
git-tar-tree: remove tests of long path handling out of t5000-tar-tree.sh
and make test script cope with tar programs displaying file modification
date as hh:mm (newer variants show it as hh:mm:ss).

This makes the test cover only basic functionality that is expected to
be handled even by older tar programs.  Tests for long filenames (which
require pax extended headers) can be added separately.

I ran this test successfully with GNU tar 1.13, 1.14 and 1.15.1.

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-03 09:51:01 -07:00
Rene Scharfe
a325a11b88 [PATCH] git-tar-tree: fix write_trailer
write_trailer() writes the last 10k (a full block) of the tar archive.
write_if_needed() writes out a block *if* it is full and then sets
the offset to 0.  In nine out of ten cases the messed up write_trailer()
function didn't manage to fill the block thus not writing anything at
all, truncating the archive.  I was "lucky" to hit the other case and so
my testing ran OK.
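The failure mode described above can be illustrated with a small buffered writer. This is a sketch in Python, not the git-tar-tree code: the class and method bodies are hypothetical, and it only shows why the trailer has to fill the block before the flush helper will write anything. It ignores tar's own end-of-archive record rules.

```python
import io

BLOCKSIZE = 10240  # "10k (a full block)", per the message


class BlockWriter:
    """Hypothetical writer that only emits complete blocks."""

    def __init__(self, out):
        self.out = out
        self.block = bytearray(BLOCKSIZE)
        self.offset = 0

    def write_if_needed(self):
        # Writes the block *only if* it is full, then resets the offset.
        if self.offset == BLOCKSIZE:
            self.out.write(bytes(self.block))
            self.offset = 0

    def append(self, data):
        view = memoryview(data)
        while len(view):
            chunk = view[:BLOCKSIZE - self.offset]
            self.block[self.offset:self.offset + len(chunk)] = chunk
            self.offset += len(chunk)
            view = view[len(chunk):]
            self.write_if_needed()

    def write_trailer(self):
        # Pad the partial block with zeros so that it is full; otherwise
        # write_if_needed() would write nothing and the output is truncated.
        if self.offset:
            self.block[self.offset:] = bytes(BLOCKSIZE - self.offset)
            self.offset = BLOCKSIZE
            self.write_if_needed()


buf = io.BytesIO()
writer = BlockWriter(buf)
writer.append(b"x" * 100)
writer.write_trailer()
print(len(buf.getvalue()))
```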

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-03 07:36:42 -07:00
Rene Scharfe
d3d49c3d35 [PATCH] git-tar-tree: add a test case
add a simple test case.

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-02 18:30:08 -07:00
Rene Scharfe
d3a15c49d4 [PATCH] git-tar-tree: small doc update
document difference in behaviour w/ regard to tree vs.  commit and
correct author information.

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-02 18:30:08 -07:00
Rene Scharfe
9b5b9f398c [PATCH] git-tar-tree: cleanup write_trailer()
replace open-coded variants of get_record().

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-02 18:30:08 -07:00
Linus Torvalds
a7b209091a Clarify git-diff-cache semantics in the tutorial.
Adam Kropelin points out that it wasn't all that clear
at all what the thing does. This hopefully helps a bit.
2005-06-02 17:15:32 -07:00
Junio C Hamano
65c2e0c349 [PATCH] Find size of SHA1 object without inflating everything.
This adds sha1_file_size() helper function and uses it in the
rename/copy similarity estimator.  The helper function handles
deltified objects as well.

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-02 15:48:33 -07:00
Junio C Hamano
4a62b61939 [PATCH] Handle deltified object correctly in git-*-pull family.
When a remote repository is deltified, we need to get the
objects that a deltified object we want to obtain is based upon.
The initial part of each retrieved SHA1 file is inflated and
inspected to see if it is deltified, and its base object is
asked from the remote side when it is.  Since this partial
inflation and inspection has a small performance hit, it can
optionally be skipped by giving -d flag to git-*-pull commands.
This flag should be used only when the remote repository is
known to have no deltified objects.
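The base-chasing behaviour described above might look roughly like this. It is an illustrative Python sketch, not the git-*-pull code: the `remote` mapping stands in for whatever the transport provides, and the `base` key and function name are hypothetical.

```python
def objects_to_fetch(sha1, remote, check_delta=True):
    """Return the object plus the chain of base objects it depends on.

    `remote` maps an object name to a dict whose optional "base" entry names
    the object it is a delta against (a stand-in for inspecting the inflated
    header).  Passing check_delta=False mimics skipping the inspection.
    """
    needed = []
    seen = set()
    while sha1 is not None and sha1 not in seen:
        seen.add(sha1)
        needed.append(sha1)
        if not check_delta:
            break
        sha1 = remote[sha1].get("base")
    return needed


remote = {
    "c3": {"base": "c2"},  # c3 is stored as a delta against c2
    "c2": {"base": "c1"},
    "c1": {},              # c1 is stored whole
}
print(objects_to_fetch("c3", remote))
print(objects_to_fetch("c3", remote, check_delta=False))
```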

Rsync transport does not have this problem since it fetches
everything the remote side has.

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-02 15:48:33 -07:00
Linus Torvalds
3b42a63cb5 git-rev-list: split out commit limiting from main() too.
Ok, now I'm happier.
2005-06-02 09:25:44 -07:00
Linus Torvalds
81f2bb1f54 git-rev-list: factor out the commit printing from "main()"
Functions that do many things are bad. We should basically
just parse the arguments in main(). We're not quite there
yet, but it's a step in the right direction.
2005-06-02 09:19:53 -07:00
Linus Torvalds
cc29f73285 Run the tutorial through ispell once more
People are making fun of me for being a bad speeler.
2005-06-02 07:58:41 -07:00
Linus Torvalds
5180cacc20 Split up unpack_sha1_file() some more
Make a separate helper for parsing the header of an object file
(really carefully) and for unpacking the rest. This means that
anybody who uses the "unpack_sha1_header()" interface can easily
look at the header and decide to unpack the rest too, without
doing any extra work.
2005-06-02 07:57:25 -07:00
Linus Torvalds
c4483576b8 Add "unpack_sha1_header()" helper function
It's for people who aren't necessarily interested in the whole
unpacked file, but do want to know the header information (size,
type, etc..)

For example, the delta code can use this to figure out whether
an object is already a delta object, and what it is a delta
against, without actually bothering to unpack all of the actual
data in the delta.
2005-06-01 17:54:59 -07:00
Linus Torvalds
f35ca9ed3e tutorial.txt: start describing how to copy repositories
Both locally and remotely.
2005-06-01 17:48:33 -07:00
Junio C Hamano
67574c403f [PATCH] diff: mode bits fixes
The core GIT repository has trees that record regular file mode
in 0664 instead of normalized 0644 pattern.  Comparing such a
tree with another tree that records the same file in 0644
pattern without content changes with git-diff-tree causes it to
feed otherwise unmodified pairs to the diff_change() routine,
which triggers a sanity check routine and barfs.  This patch
fixes the problem, along with the fix to another caller that
uses unnormalized mode bits to call diff_change() routine in a
similar way.

Without this patch, you will see "fatal error" from diff-tree
when you run git-deltafy-script on the core GIT repository
itself.

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-01 13:24:03 -07:00
Linus Torvalds
81bb573ed8 Update tutorial for simplified "git" script.
Use "git commit" instead of "git-commit-script", and talk about using
"git log" before introducing the more complex "git-whatchanged".

In short, try to make it feel a bit more normal to those poor souls
using CVS.

Do some whitespace edits too, to make the side notes stand out a bit
more.
2005-06-01 09:27:22 -07:00
Linus Torvalds
e764b8e8b3 Add "git" and "git-log-script" helper scripts.
The "git" script is just shorthand: "git xyz <args>" will just execute
"git-xyz-script <args>", which is useful for people used to the CVS
naming convention. So "git log" will run the new git-log-script, which
is just a wrapper around the new pretty-printing git-rev-list.
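The name mapping described above is simple enough to show directly. The original is a shell script; this Python sketch uses a hypothetical function name and only builds the command line, without executing anything.

```python
def expand_git_command(argv):
    """Map ["xyz", *args] to ["git-xyz-script", *args]."""
    if not argv:
        raise ValueError("usage: git <command> [<args>]")
    subcommand, *rest = argv
    return [f"git-{subcommand}-script", *rest]


print(expand_git_command(["log"]))
print(expand_git_command(["commit", "-m", "message"]))
```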

Cheesy.
2005-06-01 09:13:26 -07:00
Linus Torvalds
9d97aa6466 git-rev-list: add "--pretty" command line option
That pretty-prints the resulting commit messages, so

	git-rev-list --pretty HEAD v2.6.12-rc5 | less -S

basically ends up being a log of the changes between -rc5
and current head.

It uses the pretty-printing helper function I just extracted
from diff-tree.c.
2005-06-01 08:42:22 -07:00
Linus Torvalds
e3bc7a3bc7 Add generic commit "pretty print" function.
It's really just the header printing function from diff-tree.c,
and it's usable for other things too.
2005-06-01 08:34:23 -07:00
Alexey Guzeev
ef6a46e6ea [PATCH] git: git-commit-script ignores $GIT_DIR
2005-06-01 07:51:51 -07:00
Linus Torvalds
837eedf41b tutorial.txt: fix typos and a 'git-whatchanged' example
Pointed out by Junio. I kant't speel.
2005-06-01 07:39:36 -07:00
Linus Torvalds
95bedc9eec git-apply --stat: limit lines to 79 characters
It had already tried to do that, but with the independent
rounding of the number of '+' and '-' characters, it would
sometimes do 80-char lines after all.
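The rounding problem described above is easy to reproduce. The sketch below is not the git-apply code; both function names are hypothetical, and rounding down is just one way to keep the combined width within the limit.

```python
def scale_independently(added, deleted, width):
    """Round each count to nearest on its own; the sum can exceed `width`."""
    total = added + deleted
    if total <= width:
        return added, deleted

    def half_up(n):
        return (2 * n * width + total) // (2 * total)

    return half_up(added), half_up(deleted)


def scale_within_limit(added, deleted, width):
    """Round both counts down, so plus + minus never exceeds `width`."""
    total = added + deleted
    if total <= width:
        return added, deleted
    return added * width // total, deleted * width // total


# 7 additions and 7 deletions squeezed into 9 columns:
print(scale_independently(7, 7, 9))  # each 4.5 rounds up, one column too many
print(scale_within_limit(7, 7, 9))
```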
2005-05-31 20:50:49 -07:00
Junio C Hamano
66204988fe [PATCH] ls-tree: handle trailing slashes in the pathspec properly.
This fixes the problem with ls-tree which failed to show
"drivers/char" directory when the user asked for "drivers/char/"
from the command line.  At the same time, if "drivers/char" were
a non directory, "drivers/char/" would not show it.  This is
consistent with the way diffcore-pathspec has been recently
fixed.

This adds back the diffcore-pathspec test, dropped when my
earlier diffcore-pathspec fix was rejected.

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-31 20:32:27 -07:00
Linus Torvalds
8c7fa2478e Add first cut at a simple git tutorial.
This really is very basic stuff, no branches, no merging, no CVS
imports. Let's start small.
2005-05-31 19:50:34 -07:00
Paul Mackerras
d4e95cb6cf cope with changed git-diff-tree output format
2005-06-01 00:02:13 +00:00
Junio C Hamano
edb0c72428 [PATCH] diff: consolidate test helper script pieces.
There were duplicate script pieces to help comparing diff
output, which this patch consolidates into the t/diff-lib.sh
library.

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-31 16:17:27 -07:00
Linus Torvalds
1d9e6f92bc pathspec: fix pathspecs with '/' at the end
Removing (and ignoring) them is wrong, since that means
that a pathspec of "xxxx/" would match a regular filename
of "xxxx", which is obviously incorrect.
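The rule above can be written down as a small predicate. This is an illustrative Python sketch with a hypothetical function name and signature, not the pathspec code itself; it assumes the caller knows whether `path` names a directory.

```python
def pathspec_matches(spec, path, is_dir):
    """Return True if `path` is selected by `spec`.

    A trailing '/' in the spec means "this must be a directory", so
    "xxxx/" must not match a regular file named "xxxx".
    """
    if spec.endswith("/"):
        base = spec.rstrip("/")
        return path.startswith(base + "/") or (path == base and is_dir)
    return path == spec or path.startswith(spec + "/")


print(pathspec_matches("xxxx/", "xxxx", is_dir=False))  # regular file
print(pathspec_matches("xxxx/", "xxxx", is_dir=True))   # directory
print(pathspec_matches("xxxx/", "xxxx/file.c", is_dir=False))
```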
2005-05-31 15:17:58 -07:00
Linus Torvalds
381ca9a3bc git-apply: don't try to be clever about filenames and the index
It just causes things like "git-apply --stat" to parse traditional
patch headers differently depending on what your index is, which
is nasty.
2005-05-31 15:05:59 -07:00
Paul Mackerras
cfb4563c83 Use git-rev-list instead of git-rev-tree.
Fix bug in changing font size in entry widgets.
Fix bug with B1 click before anything has been drawn.
Use "units" and "pages" instead of "u" and "p" for tk8.5.
2005-05-31 12:14:42 +00:00
Linus Torvalds
aff9f97a4f cvs2git: use CVS (rather than RCS) to extract the different
file versions.

This allows you to do the conversion (although slowly) from
a remote repository, and besides, it's one less thing to worry
about when you don't need to look up the CVS Attic directories
etc.
2005-05-30 21:00:09 -07:00
Linus Torvalds
97658004c3 git-rev-list: add "--parents" command line flag
It makes rev-list show the list of parents, the same
way git-rev-tree does (but without the expense).
2005-05-30 19:30:07 -07:00
Linus Torvalds
8906300f65 git-rev-list: use proper lazy reachability analysis
This means that you can give a beginning/end pair to git-rev-list,
and it will show all entries that are reachable from the beginning
but not the end.

For example

	git-rev-list v2.6.12-rc5 v2.6.12-rc4

shows all commits that are in -rc5 but are not in -rc4.
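The set being computed above can be sketched on a toy history. This Python example is eager, not lazy, so it shows only the result and not the traversal strategy the commit introduces; the `parents` mapping and function names are made up for the illustration.

```python
def reachable(parents, start):
    """All commits reachable from `start` by following parent links."""
    seen = set()
    stack = [start]
    while stack:
        commit = stack.pop()
        if commit in seen:
            continue
        seen.add(commit)
        stack.extend(parents.get(commit, ()))
    return seen


def rev_list(parents, beginning, end):
    """Commits reachable from `beginning` but not from `end`."""
    return reachable(parents, beginning) - reachable(parents, end)


# Toy history: rc5 -> c2 -> c1 -> rc4 -> base
parents = {"rc5": ["c2"], "c2": ["c1"], "c1": ["rc4"], "rc4": ["base"], "base": []}
print(sorted(rev_list(parents, "rc5", "rc4")))
```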
2005-05-30 18:46:32 -07:00
Linus Torvalds
ac5155ef59 commit_list_insert: return the new commit list entry
This is useful for when we want to insert the next one after
this new one, for example.
2005-05-30 18:44:02 -07:00
Junio C Hamano
70aadac081 [PATCH] Show dissimilarity index for D and N case.
The way broken deletes and creates are shown in the -p
(diff-patch) output format has become consistent with how
rename/copy edits are shown.  They will show "dissimilarity
index" value, immediately following the "deleted file mode" and
"new file mode" lines.

The git-apply is taught to grok such an extended header.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-05-30 18:10:46 -07:00
Junio C Hamano
af5323e027 [PATCH] Add -O<orderfile> option to diff-* brothers.
A new diffcore filter diffcore-order is introduced.  This takes
a text file, each line of which is a shell glob pattern.  Patches
that match a glob pattern on an earlier line in the file are
output before patches that match a later line, and patches that
do not match any glob pattern are output last.

A typical orderfile for git project probably should look like
this:

    README
    Makefile
    Documentation
    *.h
    *.c

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-30 18:10:46 -07:00
Junio C Hamano
2036d84102 [PATCH] Buglets fix in the new two scripts
Should be obvious...

 - Use $VISUAL, $EDITOR, in this order if set, and fall back on
   vi.

 - Status R, C, D, N usually are followed by number, so adjust
   case arms to that pattern.
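Both fixes in the list above are small lookups. The originals are shell scripts; the Python below is an illustrative sketch with hypothetical names, and it treats an empty variable the same as an unset one, which is an assumption.

```python
import re


def pick_editor(env):
    """$VISUAL first, then $EDITOR, then fall back on vi."""
    return env.get("VISUAL") or env.get("EDITOR") or "vi"


# Status letters R, C, D and N may be followed by a number.
STATUS_RE = re.compile(r"^([RCDN])(\d*)$")


def split_status(field):
    """Split a status field such as "R86" into ("R", 86)."""
    match = STATUS_RE.match(field)
    if match is None:
        return field, None
    letter, digits = match.groups()
    return letter, int(digits) if digits else None


print(pick_editor({"EDITOR": "emacs"}))
print(split_status("R86"))
print(split_status("D"))
```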

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-30 18:10:46 -07:00
Linus Torvalds
866b973b5d git-resolve-script: use "git-apply --stat" instead of diffstat
Not everybody necessarily even has diffstat installed.
2005-05-30 17:45:41 -07:00
Nicolas Pitre
53d4b46085 [PATCH] mkdelta enhancements (take 2)
Although it was described as such, git-mkdelta didn't really attempt to
find the best delta against any previous object in the list, but was
only able to create a delta against the preceding object.  This patch
reworks the code to fix that limitation and hopefully makes it a bit
clearer than before, including fixing the delta loop detection which was
broken.

This means that

	git-mkdelta sha1 sha2 sha3 sha4 sha5 sha6

will now create a sha2 delta against sha1, a sha3 delta against either
sha2 or sha1 and keep the best one, a sha4 delta against either sha3,
sha2 or sha1, etc.  The --max-behind argument limits that search for the
best delta to the specified number of previous objects in the list.  If
no limit is specified it is unlimited (note: it might run out of
memory with long object lists).
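The search described above can be sketched as follows. This is Python for illustration, not the git-mkdelta code: `delta_size` is a caller-supplied stand-in for actually computing a delta, and the function name and return shape are hypothetical.

```python
def choose_bases(objects, delta_size, max_behind=None):
    """For each object, pick the earlier object giving the smallest delta.

    Returns a list parallel to `objects`; the first entry is None because
    the first object has nothing before it.  `max_behind` limits how many
    previous objects are considered (None means unlimited).
    """
    choices = []
    for i, target in enumerate(objects):
        start = 0 if max_behind is None else max(0, i - max_behind)
        candidates = objects[start:i]
        if not candidates:
            choices.append(None)
            continue
        choices.append(min(candidates, key=lambda base: delta_size(base, target)))
    return choices


# Toy "delta size": number of characters present in one string but not the other.
def toy_delta_size(base, target):
    return len(set(base) ^ set(target))


objects = ["abc", "abd", "xyz", "abcd"]
print(choose_bases(objects, toy_delta_size))
print(choose_bases(objects, toy_delta_size, max_behind=1))
```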

Also added a -q (quiet) switch so it is possible to have 3 levels of
output: -q for nothing, -v for verbose, and if neither -q nor -v is
specified then only actual changes on the object database are shown.

Finally the git-deltafy-script has been updated accordingly, and some
bugs fixed (thanks to Stephen C. Tweedie for spotting them).

This version has been thoroughly tested and I think it is ready
for public consumption.

Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-30 17:37:20 -07:00
Linus Torvalds
a3e870f2e2 Add "commit" helper script
This is meant to make raw git not hugely less usable than something
like raw CVS. I want to make a 1.0 release of the plumbing, and the
actual commit part was just too intimidating.
2005-05-30 12:51:00 -07:00
Junio C Hamano
f345b0a066 [PATCH] Add -B flag to diff-* brothers.
A new diffcore transformation, diffcore-break.c, is introduced.

When the -B flag is given, a patch that represents a complete
rewrite is broken into a deletion followed by a creation.  This
makes it easier to review such a complete rewrite patch.

The -B flag takes the same syntax as the -M and -C flags to
specify the minimum amount of non-source material the resulting
file needs to have to be considered a complete rewrite, and
defaults to 99% if not specified.

As the new test t4008-diff-break-rewrite.sh demonstrates, if a
file is a complete rewrite, it is broken into a delete/create
pair, which can further be subjected to the usual rename
detection if -M or -C is used.  For example, if file0 gets
completely rewritten to make it as if it were rather based on
file1 which itself disappeared, the following happens:

    The original change looks like this:

	file0     --> file0' (quite different from file0)
	file1     --> /dev/null

    After diffcore-break runs, it would become this:

	file0     --> /dev/null
	/dev/null --> file0'
	file1     --> /dev/null

    Then diffcore-rename matches them up:

	file1     --> file0'

The internal score values are finer grained now.  The earlier
maximum of 10000 has been raised to 60000; there are no user
visible changes but there is no reason to waste available bits.

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-30 10:35:49 -07:00
Junio C Hamano
2cd68882ee [PATCH] diff: fix the culling of unneeded delete record.
The commit 15d061b435

    [PATCH] Fix the way diffcore-rename records unremoved source.

still leaves unneeded delete records in its output stream by
mistake, which was covered up by having an extra check to turn
such a delete into a no-op downstream.  Fix the check in the
diffcore-rename to simplify the output routine.

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-30 10:35:49 -07:00
Junio C Hamano
9d429ff6ff [PATCH] diff: further cleanup.
When preparing data to feed the external diff, we should give
the mode we obtained from the caller, even when we are dealing
with a file with 0{40} SHA1 (i.e. the caller said "look at the
filesystem"), since the mode passed by the caller via
diff_addremove() or diff_change() is always trustworthy.

This is _not_ a bugfix --- the existing code calls stat() on the file
itself and does the same computation on st.st_mode to compute
the mode the same way the caller did to give the original mode.
We cannot remove the stat() call from here, but the extra
computation to create the mode value is unnecessary.

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-30 10:35:49 -07:00
Junio C Hamano
01c4e70f63 [PATCH] diff: code clean-up and removal of rename hack.
A new macro, DIFF_PAIR_RENAME(), is introduced to distinguish a
filepair that is a rename/copy (the definition of which is src
and dst are different paths, of course).  This removes the hack
used in the record_rename_pair() to always put a non-zero value
in the score field.

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-30 10:35:49 -07:00
Junio C Hamano
befe86392c [PATCH] diff: consolidate various calls into diffcore.
The three diff-* brothers had a sequence of calls into diffcore
that were almost identical.  Introduce a new diffcore_std()
function that takes all the necessary arguments to consolidate
it.  This will make later enhancements and changing the order of
diffcore application simpler.

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-30 10:35:49 -07:00
Junio C Hamano
ddafa7e933 [PATCH] diff-helper: Fix R/C score parsing under -z flag.
The score number that follows the R/C status was parsed but the
parse pointer was not updated, causing the entire line to become
unrecognized.  This patch fixes this problem.

There was a test missing to catch this breakage, which this
commit adds as t4009-diff-rename-4.sh.  The diff-raw tests used
in related t4005-diff-rename-2.sh (the same test without -z) and
t4007-rename-3.sh were stricter than necessary, even though
the comment for the tests said otherwise.  This patch also
corrects them.

The documentation is updated to say that the status can
optionally be followed by a number called "score"; it does not
have to stay similarity index forever and there is no reason to
limit it only to C and R.

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-30 10:35:49 -07:00
Linus Torvalds
cad88fdf8d git-init-db: set up the full default environment
Create .git/refs/{heads,tags} and make .git/HEAD be a symlink to
(the as yet non-existent) .git/refs/heads/master.
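The layout described above can be reproduced in a scratch directory. This Python sketch is not the git-init-db code: the function name is hypothetical, and using a relative symlink target is an assumption.

```python
import os
import tempfile


def init_db(root):
    """Create .git/refs/{heads,tags} and a HEAD symlink under `root`."""
    git_dir = os.path.join(root, ".git")
    for sub in ("heads", "tags"):
        os.makedirs(os.path.join(git_dir, "refs", sub), exist_ok=True)
    head = os.path.join(git_dir, "HEAD")
    if not os.path.lexists(head):
        # The target does not exist yet; it appears with the first commit.
        os.symlink("refs/heads/master", head)
    return git_dir


with tempfile.TemporaryDirectory() as tmp:
    git_dir = init_db(tmp)
    print(os.readlink(os.path.join(git_dir, "HEAD")))
    print(sorted(os.listdir(os.path.join(git_dir, "refs"))))
```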
2005-05-30 10:20:44 -07:00