Use "wordRegex" for configuration variable names. Use "word_regex" for C
language tokens.
Signed-off-by: Boyd Stephen Smith Jr. <bss@iguanasuicide.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Make the --color-words splitting regular expression configurable via
the diff driver's 'wordregex' attribute. The user can then set the
driver on a file in .gitattributes. If a regex is given on the
command line, it overrides the driver's setting.
We also provide built-in regexes for the languages that already had
funcname patterns, and add an appropriate diff driver entry for C/++.
(The patterns are designed to run UTF-8 sequences into a single chunk
to make sure they remain readable.)
Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Diffs that have been produced with textconv almost certainly
cannot be applied, so we want to be careful not to generate
them in things like format-patch.
This introduces a new diff options, ALLOW_TEXTCONV, which
controls this behavior. It is off by default, but is
explicitly turned on for the "log" family of commands, as
well as the "diff" porcelain (but not diff-* plumbing).
Because both text conversion and external diffing are
controlled by these diff options, we can get rid of the
"plumbing versus porcelain" distinction when reading the
config. This was an attempt to control the same thing, but
suffered from being too coarse-grained.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When diffing binary files, it is sometimes nice to see the
differences of a canonical text form rather than either a
binary patch or simply "binary files differ."
Until now, the only option for doing this was to define an
external diff command to perform the diff. This was a lot of
work, since the external command needed to take care of
doing the diff itself (including mode changes), and lost the
benefit of git's colorization and other options.
This patch adds a text conversion option, which converts a
file to its canonical format before performing the diff.
This is less flexible than an arbitrary external diff, but
is much less work to set up. For example:
$ echo '*.jpg diff=exif' >>.gitattributes
$ git config diff.exif.textconv exiftool
$ git config diff.exif.binary false
allows one to see jpg diffs represented by the text output
of exiftool.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
The "diff" gitattribute is somewhat overloaded right now. It
can say one of three things:
1. this file is definitely binary, or definitely not
(i.e., diff or !diff)
2. this file should use an external diff engine (i.e.,
diff=foo, diff.foo.command = custom-script)
3. this file should use particular funcname patterns
(i.e., diff=foo, diff.foo.(x?)funcname = some-regex)
Most of the time, there is no conflict between these uses,
since using one implies that the other is irrelevant (e.g.,
an external diff engine will decide for itself whether the
file is binary).
However, there is at least one conflicting situation: there
is no way to say "use the regular rules to determine whether
this file is binary, but if we do diff it textually, use
this funcname pattern." That is, currently setting diff=foo
indicates that the file is definitely text.
This patch introduces a "binary" config option for a diff
driver, so that one can explicitly set diff.foo.binary. We
default this value to "don't know". That is, setting a diff
attribute to "foo" and using "diff.foo.funcname" will have
no effect on the binaryness of a file. To get the current
behavior, one can set diff.foo.binary to true.
This patch also has one additional advantage: it cleans up
the interface to the userdiff code a bit. Before, calling
code had to know more about whether attributes were false,
true, or unset to determine binaryness. Now that binaryness
is a property of a driver, we can represent these situations
just by passing back a driver struct.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Both sets of code assume that one specifies a diff profile
as a gitattribute via the "diff=foo" attribute. They then
pull information about that profile from the config as
diff.foo.*.
The code for each is currently completely separate from the
other, which has several disadvantages:
- there is duplication as we maintain code to create and
search the separate lists of external drivers and
funcname patterns
- it is difficult to add new profile options, since it is
unclear where they should go
- the code is difficult to follow, as we rely on the
"check if this file is binary" code to find the funcname
pattern as a side effect. This is the first step in
refactoring the binary-checking code.
This patch factors out these diff profiles into "userdiff"
drivers. A file with "diff=foo" uses the "foo" driver, which
is specified by a single struct.
Note that one major difference between the two pieces of
code is that the funcname patterns are always loaded,
whereas external drivers are loaded only for the "git diff"
porcelain; the new code takes care to retain that situation.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>