1
0
Fork 0
mirror of https://github.com/git/git.git synced 2024-11-15 21:53:44 +01:00
Commit graph

84 commits

Author SHA1 Message Date
Nguyễn Thái Ngọc Duy
4a2d5ae262 pathspec: stop --*-pathspecs impact on internal parse_pathspec() uses
Normally parse_pathspec() is used on command line arguments where it
can do fancy thing like parsing magic on each argument or adding magic
for all pathspecs based on --*-pathspecs options.

There's another use of parse_pathspec(), where pathspec is needed, but
the input is known to be pure paths. In this case we usually don't
want --*-pathspecs to interfere. And we definitely do not want to
parse magic in these paths, regardless of --literal-pathspecs.

Add new flag PATHSPEC_LITERAL_PATH for this purpose. When it's set,
--*-pathspecs are ignored, no magic is parsed. And if the caller
allows PATHSPEC_LITERAL (i.e. the next calls can take literal magic),
then PATHSPEC_LITERAL will be set.

This fixes cases where git chokes when GIT_*_PATHSPECS are set because
parse_pathspec() indicates it won't take any magic. But
GIT_*_PATHSPECS add them anyway. These are

   export GIT_LITERAL_PATHSPECS=1
   git blame -- something
   git log --follow something
   git log --merge

"git ls-files --with-tree=path" (aka parse_pathspec() in
overlay_tree_on_cache()) is safe because the input is empty, and
producing one pathspec due to PATHSPEC_PREFER_CWD does not take any
magic into account.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-10-28 09:57:36 -07:00
Junio C Hamano
b02f5aeda6 Merge branch 'jl/submodule-mv'
"git mv A B" when moving a submodule A does "the right thing",
inclusing relocating its working tree and adjusting the paths in
the .gitmodules file.

* jl/submodule-mv: (53 commits)
  rm: delete .gitmodules entry of submodules removed from the work tree
  mv: update the path entry in .gitmodules for moved submodules
  submodule.c: add .gitmodules staging helper functions
  mv: move submodules using a gitfile
  mv: move submodules together with their work trees
  rm: do not set a variable twice without intermediate reading.
  t6131 - skip tests if on case-insensitive file system
  parse_pathspec: accept :(icase)path syntax
  pathspec: support :(glob) syntax
  pathspec: make --literal-pathspecs disable pathspec magic
  pathspec: support :(literal) syntax for noglob pathspec
  kill limit_pathspec_to_literal() as it's only used by parse_pathspec()
  parse_pathspec: preserve prefix length via PATHSPEC_PREFIX_ORIGIN
  parse_pathspec: make sure the prefix part is wildcard-free
  rename field "raw" to "_raw" in struct pathspec
  tree-diff: remove the use of pathspec's raw[] in follow-rename codepath
  remove match_pathspec() in favor of match_pathspec_depth()
  remove init_pathspec() in favor of parse_pathspec()
  remove diff_tree_{setup,release}_paths
  convert common_prefix() to use struct pathspec
  ...
2013-09-09 14:36:15 -07:00
Junio C Hamano
de9a25354a Merge branch 'es/blame-L-twice'
Teaches "git blame" to take more than one -L ranges.

* es/blame-L-twice:
  line-range: reject -L line numbers less than 1
  t8001/t8002: blame: add tests of -L line numbers less than 1
  line-range: teach -L^:RE to search from start of file
  line-range: teach -L:RE to search from end of previous -L range
  line-range: teach -L^/RE/ to search from start of file
  line-range-format.txt: document -L/RE/ relative search
  log: teach -L/RE/ to search from end of previous -L range
  blame: teach -L/RE/ to search from end of previous -L range
  line-range: teach -L/RE/ to search relative to anchor point
  blame: document multiple -L support
  t8001/t8002: blame: add tests of multiple -L options
  blame: accept multiple -L ranges
  blame: inline one-line function into its lone caller
  range-set: publish API for re-use by git-blame -L
  line-range-format.txt: clarify -L:regex usage form
  git-log.txt: place each -L option variation on its own line
2013-09-09 14:35:11 -07:00
Junio C Hamano
118b9d5836 Merge branch 'es/blame-L-more'
More fixes to the code to parse the "-L" option in "log" and "blame".

* es/blame-L-more:
  blame: reject empty ranges -L,+0 and -L,-0
  t8001/t8002: blame: demonstrate acceptance of bogus -L,+0 and -L,-0
  blame: reject empty ranges -LX,+0 and -LX,-0
  t8001/t8002: blame: demonstrate acceptance of bogus -LX,+0 and -LX,-0
  log: fix -L bounds checking bug
  t4211: retire soon-to-be unimplementable tests
  t4211: log: demonstrate -L bounds checking bug
  blame: fix -L bounds checking bug
  t8001/t8002: blame: add empty file & partial-line tests
  t8001/t8002: blame: demonstrate -L bounds checking bug
  t8001/t8002: blame: decompose overly-large test
2013-09-09 14:32:45 -07:00
Eric Sunshine
52f4d12648 blame: teach -L/RE/ to search from end of previous -L range
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-08-06 14:44:25 -07:00
Eric Sunshine
815834e9aa line-range: teach -L/RE/ to search relative to anchor point
Range specification -L/RE/ for blame/log unconditionally begins
searching at line one. Mailing list discussion [1] suggests that, in the
presence of multiple -L options, -L/RE/ should search relative to the
endpoint of the previous -L range, if any.

Teach the parsing machinery underlying blame's and log's -L options to
accept a start point for -L/RE/ searches. Follow-up patches will upgrade
blame and log to take advantage of this ability.

[1]: http://thread.gmane.org/gmane.comp.version-control.git/229755/focus=229966

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-08-06 14:36:34 -07:00
Eric Sunshine
58dbfa2e59 blame: accept multiple -L ranges
git-blame accepts only a single -L option or none. Clients requiring
blame information for multiple disjoint ranges are therefore forced
either to invoke git-blame multiple times, once for each range, or only
once with no -L option to cover the entire file, both of which can be
costly.  Teach git-blame to accept multiple -L ranges.  Overlapping and
out-of-order ranges are accepted.

In this patch, the X in -LX,Y is absolute (for instance, /RE/ patterns
search from line 1), and Y is relative to X. Follow-up patches provide
more flexibility over how X is anchored.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-08-06 14:29:35 -07:00
Eric Sunshine
753935749f blame: inline one-line function into its lone caller
As of 25ed3412 (Refactor parse_loc; 2013-03-28),
blame.c:prepare_blame_range() became effectively a one-line function
which merely passes its arguments along to another function. This
indirection does not bring clarity to the code. Simplify by inlining
prepare_blame_range() into its lone caller.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-08-06 14:28:09 -07:00
Eric Sunshine
164a9cf430 blame: fix -L bounds checking bug
Since inception, -LX,Y has correctly reported an out-of-range error when
Y is beyond end of file, however, X was not checked, and an out-of-range
X would cause a crash.  92f9e273 (blame: prevent a segv when -L given
start > EOF; 2010-02-08) attempted to rectify this shortcoming but has
its own off-by-one error which allows X to extend one line past end of
file.  For example, given a file with 5 lines:

  git blame -L5 foo  # OK, blames line 5
  git blame -L6 foo  # accepted, no error, no output, huh?
  git blame -L7 foo  # error "fatal: file foo has only 5 lines"

Fix this bug.

In order to avoid regressing "blame foo" when foo is an empty file, the
fix is slightly more complicated than changing '<' to '<='.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-08-05 11:54:31 -07:00
Stefan Beller
d5d09d4754 Replace deprecated OPT_BOOLEAN by OPT_BOOL
This task emerged from b04ba2bb (parse-options: deprecate OPT_BOOLEAN,
2011-09-27). All occurrences of the respective variables have
been reviewed and none of them relied on the counting up mechanism,
but all of them were using the variable as a true boolean.

This patch does not change semantics of any command intentionally.

Signed-off-by: Stefan Beller <stefanbeller@googlemail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-08-05 11:32:19 -07:00
Nguyễn Thái Ngọc Duy
9a08727443 remove init_pathspec() in favor of parse_pathspec()
While at there, move free_pathspec() to pathspec.c

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-15 10:56:09 -07:00
Nguyễn Thái Ngọc Duy
bd1928df1d remove diff_tree_{setup,release}_paths
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-15 10:56:09 -07:00
Junio C Hamano
ed73fe5642 Merge branch 'tr/line-log'
* tr/line-log:
  git-log(1): remove --full-line-diff description
  line-log: fix documentation formatting
  log -L: improve comments in process_all_files()
  log -L: store the path instead of a diff_filespec
  log -L: test merge of parallel modify/rename
  t4211: pass -M to 'git log -M -L...' test
  log -L: fix overlapping input ranges
  log -L: check range set invariants when we look it up
  Speed up log -L... -M
  log -L: :pattern:file syntax to find by funcname
  Implement line-history search (git log -L)
  Export rewrite_parents() for 'log -L'
  Refactor parse_loc
2013-06-02 16:00:44 -07:00
Junio C Hamano
e52e6f79cc Merge branch 'nd/pretty-formats'
pretty-printing body of the commit that is stored in non UTF-8
encoding did not work well.  The early part of this series fixes
it.  And then it adds %C(auto) specifier that turns the coloring on
when we are emitting to the terminal, and adds column-aligning
format directives.

* nd/pretty-formats:
  pretty: support %>> that steal trailing spaces
  pretty: support truncating in %>, %< and %><
  pretty: support padding placeholders, %< %> and %><
  pretty: add %C(auto) for auto-coloring
  pretty: split color parsing into a separate function
  pretty: two phase conversion for non utf-8 commits
  utf8.c: add reencode_string_len() that can handle NULs in string
  utf8.c: add utf8_strnwidth() with the ability to skip ansi sequences
  utf8.c: move display_mode_esc_sequence_len() for use by other functions
  pretty: share code between format_decoration and show_decorations
  pretty-formats.txt: wrap long lines
  pretty: get the correct encoding for --pretty:format=%e
  pretty: save commit encoding from logmsg_reencode if the caller needs it
2013-04-23 11:22:48 -07:00
Nguyễn Thái Ngọc Duy
5a10d23658 pretty: save commit encoding from logmsg_reencode if the caller needs it
The commit encoding is parsed by logmsg_reencode, there's no need for
the caller to re-parse it again. The reencoded message now has the new
encoding, not the original one. The caller would need to read commit
object again before parsing.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-04-18 16:28:27 -07:00
René Scharfe
de5abe9fe9 blame: handle broken commit headers gracefully
split_ident_line() can leave us with the pointers date_begin, date_end,
tz_begin and tz_end all set to NULL.  Check them before use and supply
the same fallback values as in the case of a negative return code from
split_ident_line().

The "(unknown)" is not actually shown in the output, though, because it
will be converted to a number (zero) eventually.

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-04-17 14:50:45 -07:00
Thomas Rast
13b8f68c1f log -L: :pattern:file syntax to find by funcname
This new syntax finds a funcname matching /pattern/, and then takes from there
up to (but not including) the next funcname.  So you can say

  git log -L:main:main.c

and it will dig up the main() function and show its line-log, provided
there are no other funcnames matching 'main'.

Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-03-28 10:30:04 -07:00
Bo Yang
25ed3412f8 Refactor parse_loc
We want to use the same style of -L n,m argument for 'git log -L' as
for git-blame.  Refactor the argument parsing of the range arguments
from builtin/blame.c to the (new) file that will hold the 'git log -L'
logic.

To accommodate different data structures in blame and log -L, the file
contents are abstracted away; parse_range_arg takes a callback that it
uses to get the contents of a line of the (notional) file.

The new test is for a case that made me pause during debugging: the
'blame -L with invalid end' test was the only one that noticed an
outright failure to parse the end *at all*.  So make a more explicit
test for that.

Signed-off-by: Bo Yang <struggleyb.nku@gmail.com>
Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-03-28 10:28:41 -07:00
Jeff King
be5c9fb904 logmsg_reencode: lazily load missing commit buffers
Usually a commit that makes it to logmsg_reencode will have
been parsed, and the commit->buffer struct member will be
valid. However, some code paths will free commit buffers
after having used them (for example, the log traversal
machinery will do so to keep memory usage down).

Most of the time this is fine; log should only show a commit
once, and then exits. However, there are some code paths
where this does not work. At least two are known:

  1. A commit may be shown as part of a regular ref, and
     then it may be shown again as part of a submodule diff
     (e.g., if a repo contains refs to both the superproject
     and subproject).

  2. A notes-cache commit may be shown during "log --all",
     and then later used to access a textconv cache during a
     diff.

Lazily loading in logmsg_reencode does not necessarily catch
all such cases, but it should catch most of them. Users of
the commit buffer tend to be either parsing for structure
(in which they will call parse_commit, and either we will
already have parsed, or we will load commit->buffer lazily
there), or outputting (either to the user, or fetching a
part of the commit message via format_commit_message). In
the latter case, we should always be using logmsg_reencode
anyway (and typically we do so via the pretty-print
machinery).

If there are any cases that this misses, we can fix them up
to use logmsg_reencode (or handle them on a case-by-case
basis if that is inappropriate).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-01-26 13:28:22 -08:00
Jeff King
dd0d388c44 logmsg_reencode: never return NULL
The logmsg_reencode function will return the reencoded
commit buffer, or NULL if reencoding failed or no reencoding
was necessary. Since every caller then ends up checking for NULL
and just using the commit's original buffer, anyway, we can
be a bit more helpful and just return that buffer when we
would have returned NULL.

Since the resulting string may or may not need to be freed,
we introduce a logmsg_free, which checks whether the buffer
came from the commit object or not (callers either
implemented the same check already, or kept two separate
pointers, one to mark the buffer to be used, and one for the
to-be-freed string).

Pushing this logic into logmsg_* simplifies the callers, and
will let future patches lazily load the commit buffer in a
single place.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-01-26 13:28:21 -08:00
Junio C Hamano
577f63e781 Merge branch 'ap/log-mailmap'
Teach commands in the "log" family to optionally pay attention to
the mailmap.

* ap/log-mailmap:
  log --use-mailmap: optimize for cases without --author/--committer search
  log: add log.mailmap configuration option
  log: grep author/committer using mailmap
  test: add test for --use-mailmap option
  log: add --use-mailmap option
  pretty: use mailmap to display username and email
  mailmap: add mailmap structure to rev_info and pp
  mailmap: simplify map_user() interface
  mailmap: remove email copy and length limitation
  Use split_ident_line to parse author and committer
  string-list: allow case-insensitive string list
2013-01-20 17:06:53 -08:00
Junio C Hamano
90d0b8a9f0 Merge branch 'jc/blame-no-follow'
Teaches "--no-follow" option to "git blame" to disable its
whole-file rename detection.

* jc/blame-no-follow:
  blame: pay attention to --no-follow
  diff: accept --no-follow option
2013-01-14 08:15:51 -08:00
Antoine Pelisse
ea02ffa385 mailmap: simplify map_user() interface
Simplify map_user(), mostly to avoid copies of string buffers. It
also simplifies caller functions.

map_user() directly receive pointers and length from the commit buffer
as mail and name. If mapping of the user and mail can be done, the
pointer is updated to a new location. Lengths are also updated if
necessary.

The caller of map_user() can then copy the new email and name if
necessary.

Signed-off-by: Antoine Pelisse <apelisse@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-01-10 12:33:08 -08:00
Antoine Pelisse
3c020bd528 Use split_ident_line to parse author and committer
Currently blame.c::get_acline(), pretty.c::pp_user_info() and
shortlog.c::insert_one_record() are parsing author name, email, time
and tz themselves.

Use ident.c::split_ident_line() for better code reuse.

Signed-off-by: Antoine Pelisse <apelisse@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-01-07 15:59:32 -08:00
Junio C Hamano
e297cf5aff pretty: remove reencode_commit_message()
This function has only two callsites, and is a thin wrapper whose
usefulness is dubious.  When the caller needs to learn the log
output encoding, it should be able to do so by directly calling
get_log_output_encoding() and calling the underlying
logmsg_reencode() with it.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-10-17 22:42:40 -07:00
Junio C Hamano
3d1aa56671 blame: pay attention to --no-follow
If you know your history did not have renames, or if you care only
about the history after a large rename that happened some time ago,
"git blame --no-follow $path" is a way to tell the command not to
bother about renames.

When you use -C, the lines that came from the renamed file will
still be found without the whole-file rename detection, so it is not
all that interesting either way, though.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-09-21 13:52:25 -07:00
Junio C Hamano
992311cf86 Merge branch 'jc/maint-blame-no-such-path'
"git blame MAKEFILE" run in a history that has "Makefile" but not
"MAKEFILE" should say "No such file MAKEFILE in HEAD", but got
confused on a case insensitive filesystem and failed to do so.

Even during a conflicted merge, "git blame $path" always meant to
blame uncommitted changes to the "working tree" version; make it
more useful by showing cleanly merged parts as coming from the other
branch that is being merged.

* jc/maint-blame-no-such-path:
  blame: allow "blame file" in the middle of a conflicted merge
  blame $path: avoid getting fooled by case insensitive filesystems
2012-09-17 15:52:32 -07:00
Junio C Hamano
9aeaab6811 blame: allow "blame file" in the middle of a conflicted merge
"git blame file" has always meant "find the origin of each line of
the file in the history leading to HEAD, oh by the way, blame the
lines that are modified locally to the working tree".

This teaches "git blame" that during a conflicted merge, some
uncommitted changes may have come from the other history that is
being merged.

The verify_working_tree_path() function introduced in the previous
patch to notice a typo in the filename (primarily on case insensitive
filesystems) has been updated to allow a filename that does not exist
in HEAD (i.e. the tip of our history) as long as it exists one of the
commits being merged, so that a "we deleted, the other side modified"
case tracks the history of the file in the history of the other side.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-09-11 14:30:03 -07:00
Junio C Hamano
738c218760 Merge branch 'tr/void-diff-setup-done' into maint-1.7.11
* tr/void-diff-setup-done:
  diff_setup_done(): return void
2012-09-11 10:53:40 -07:00
Junio C Hamano
ffcabccf5d blame $path: avoid getting fooled by case insensitive filesystems
"git blame MAKEFILE" run in a history that has "Makefile" but not
MAKEFILE can get confused on a case insensitive filesystem, because
the check we run to see if there is a corresponding file in the
working tree with lstat("MAKEFILE") succeeds.  In addition to that
check, we have to make sure that the given path also exists in the
commit we start digging history from (i.e. "HEAD").

Note that this reveals the breakage in a test added in cd8ae20
(git-blame shouldn't crash if run in an unmerged tree, 2007-10-18),
which expects the entire merge-in-progress path to be blamed to the
working tree when it did not exist in our tree.  As it is clear in
the log message of that commit, the old breakage was that it was
causing an internal error and the fix was about avoiding it.

Just check that the command does not die an uncontrolled death.  For
this particular case, the blame should fail, as the history for the
file in that contents has not been committed yet at the point in the
test.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-09-10 18:42:30 -07:00
Junio C Hamano
03adeeaad6 Merge branch 'jk/maint-null-in-trees' into maint-1.7.11
"git diff" had a confusion between taking data from a path in the
working tree and taking data from an object that happens to have
name 0{40} recorded in a tree.

* jk/maint-null-in-trees:
  fsck: detect null sha1 in tree entries
  do not write null sha1s to on-disk index
  diff: do not use null sha1 as a sentinel value
2012-09-10 15:24:54 -07:00
Junio C Hamano
096bbd6537 Merge branch 'nd/i18n-parseopt-help'
A lot of i18n mark-up for the help text from "git <cmd> -h".

* nd/i18n-parseopt-help: (66 commits)
  Use imperative form in help usage to describe an action
  Reduce translations by using same terminologies
  i18n: write-tree: mark parseopt strings for translation
  i18n: verify-tag: mark parseopt strings for translation
  i18n: verify-pack: mark parseopt strings for translation
  i18n: update-server-info: mark parseopt strings for translation
  i18n: update-ref: mark parseopt strings for translation
  i18n: update-index: mark parseopt strings for translation
  i18n: tag: mark parseopt strings for translation
  i18n: symbolic-ref: mark parseopt strings for translation
  i18n: show-ref: mark parseopt strings for translation
  i18n: show-branch: mark parseopt strings for translation
  i18n: shortlog: mark parseopt strings for translation
  i18n: rm: mark parseopt strings for translation
  i18n: revert, cherry-pick: mark parseopt strings for translation
  i18n: rev-parse: mark parseopt strings for translation
  i18n: reset: mark parseopt strings for translation
  i18n: rerere: mark parseopt strings for translation
  i18n: status: mark parseopt strings for translation
  i18n: replace: mark parseopt strings for translation
  ...
2012-09-07 11:09:09 -07:00
Junio C Hamano
3b753148b6 Merge branch 'jk/maint-null-in-trees'
We do not want a link to 0{40} object stored anywhere in our objects.

* jk/maint-null-in-trees:
  fsck: detect null sha1 in tree entries
  do not write null sha1s to on-disk index
  diff: do not use null sha1 as a sentinel value
2012-08-27 11:54:28 -07:00
Junio C Hamano
9cd33bbc52 Merge branch 'tr/void-diff-setup-done'
Remove unnecessary code.

* tr/void-diff-setup-done:
  diff_setup_done(): return void
2012-08-22 11:52:27 -07:00
Nguyễn Thái Ngọc Duy
efd2a8bd38 i18n: blame: mark parseopt strings for translation
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-08-20 12:23:15 -07:00
Thomas Rast
28452655af diff_setup_done(): return void
diff_setup_done() has historically returned an error code, but lost
the last nonzero return in 943d5b7 (allow diff.renamelimit to be set
regardless of -M/-C, 2006-08-09).  The callers were in a pretty
confused state: some actually checked for the return code, and some
did not.

Let it return void, and patch all callers to take this into account.
This conveniently also gets rid of a handful of different(!) error
messages that could never be triggered anyway.

Note that the function can still die().

Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-08-03 12:11:07 -07:00
Jeff King
e54501004a diff: do not use null sha1 as a sentinel value
The diff code represents paths using the diff_filespec
struct. This struct has a sha1 to represent the sha1 of the
content at that path, as well as a sha1_valid member which
indicates whether its sha1 field is actually useful. If
sha1_valid is not true, then the filespec represents a
working tree file (e.g., for the no-index case, or for when
the index is not up-to-date).

The diff_filespec is only used internally, though. At the
interfaces to the diff subsystem, callers feed the sha1
directly, and we create a diff_filespec from it. It's at
that point that we look at the sha1 and decide whether it is
valid or not; callers may pass the null sha1 as a sentinel
value to indicate that it is not.

We should not typically see the null sha1 coming from any
other source (e.g., in the index itself, or from a tree).
However, a corrupt tree might have a null sha1, which would
cause "diff --patch" to accidentally diff the working tree
version of a file instead of treating it as a blob.

This patch extends the edges of the diff interface to accept
a "sha1_valid" flag whenever we accept a sha1, and to use
that flag when creating a filespec. In some cases, this
means passing the flag through several layers, making the
code change larger than would be desirable.

One alternative would be to simply die() upon seeing
corrupted trees with null sha1s. However, this fix more
directly addresses the problem (while bogus sha1s in a tree
are probably a bad thing, it is really the sentinel
confusion sending us down the wrong code path that is what
makes it devastating). And it means that git is more capable
of examining and debugging these corrupted trees. For
example, you can still "diff --raw" such a tree to find out
when the bogus entry was introduced; you just cannot do a
"--patch" diff (just as you could not with any other
corrupted tree, as we do not have any content to diff).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-07-29 15:04:32 -07:00
Junio C Hamano
b700086d84 Merge branch 'jc/maint-blame-unique-abbrev' into maint
"git blame" did not try to make sure that the abbreviated commit
object names in its output are unique.

* jc/maint-blame-unique-abbrev:
  blame: compute abbreviation width that ensures uniqueness
2012-07-11 12:58:28 -07:00
Thomas Gummerer
b60e188c51 Strip namelen out of ce_flags into a ce_namelen field
Strip the name length from the ce_flags field and move it
into its own ce_namelen field in struct cache_entry. This
will both give us a tiny bit of a performance enhancement
when working with long pathnames and is a refactoring for
more readability of the code.

It enhances readability, by making it more clear what
is a flag, and where the length is stored and make it clear
which functions use stages in comparisions and which only
use the length.

It also makes CE_NAMEMASK private, so that users don't
mistakenly write the name length in the flags.

Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-07-11 09:42:45 -07:00
Junio C Hamano
3a335ee2da Merge branch 'jc/maint-blame-unique-abbrev'
"git blame" did not try to make sure the abbreviated commit object
names in its output are unique.

* jc/maint-blame-unique-abbrev:
  blame: compute abbreviation width that ensures uniqueness
2012-07-09 09:01:38 -07:00
Junio C Hamano
b31272f704 blame: compute abbreviation width that ensures uniqueness
Julia Lawall noticed that in linux-next repository the commit object
60d5c9f5 (shown with the default abbreviation width baked into "git
blame") in output from

  $ git blame -L 3675,3675 60d5c9f5b -- \
      drivers/staging/brcm80211/brcmfmac/wl_iw.c

is no longer unique in the repository, which results in "short SHA1
60d5c9f5 is ambiguous".

Compute the minimum abbreviation width that ensures uniqueness when
the user did not specify the --abbrev option to avoid this.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-07-02 00:54:19 -07:00
Ramsay Jones
85c20c304f builtin/blame.c: Fix a "Using plain integer as NULL pointer" warning
Plain gcc may not but sparse catches and complains about this sort of
stuff.

Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-05-14 10:19:42 -07:00
René Scharfe
4b4132fd89 blame: factor out helper for calling xdi_diff()
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-05-09 14:08:27 -07:00
René Scharfe
5d23ec7664 blame: use hunk_func(), part 2
Use handle_split_cb() directly as hunk_func() callback, without going
through xdi_diff_hunks().

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-05-09 14:03:23 -07:00
René Scharfe
0af596c6ff blame: use hunk_func(), part 1
Use blame_chunk_cb() directly as hunk_func() callback, without detour
through xdi_diff_hunks().

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-05-09 14:03:03 -07:00
Junio C Hamano
089c0ca8b6 Merge branch 'jc/maint-blame-minimal' into maint
"git blame" started missing quite a few changes from the origin since we
stopped using the diff minimalization by default in v1.7.2 era.

Teach "--minimal" option to "git blame" to work around this regression.

* jc/maint-blame-minimal:
  blame: accept --need-minimal
2012-05-01 21:11:49 -07:00
Junio C Hamano
9d76db4e67 Merge branch 'jc/maint-blame-minimal'
"git blame" started missing quite a few changes from the origin since we
stopped using the diff minimalization by default in v1.7.2 era.

* jc/maint-blame-minimal:
  blame: accept --need-minimal
2012-04-23 12:58:23 -07:00
Junio C Hamano
059a500d25 blame: accept --need-minimal
Between v1.7.1 and v1.7.2, 582aa00bdf switched the default "diff"
invocation not to use XDF_NEED_MINIMAL, but this breaks "git blame"
rather badly.

Allow the command line option to ask for an extra careful matching.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-04-11 13:11:55 -07:00
Junio C Hamano
4d9e079e82 Merge branch 'zj/decimal-width'
* zj/decimal-width:
  make lineno_width() from blame reusable for others

Conflicts:
	cache.h
	pager.c
2012-02-20 00:15:11 -08:00
Zbigniew Jędrzejewski-Szmek
ec7ff5ba27 make lineno_width() from blame reusable for others
builtin/blame.c has a helper function to compute how many columns
we need to show a line-number, whose implementation is reusable as
a more generic helper function to count the number of columns
necessary to show any cardinal number.

Rename it to decimal_width(), move it to pager.c and export it for
use by future callers.

Signed-off-by: Zbigniew Jędrzejewski-Szmek <zbyszek@in.waw.pl>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-02-14 16:16:19 -08:00