The sha1_to_hex and find_unique_abbrev functions always
write into reusable static buffers. There are a few problems
with this:
- future calls overwrite our result. This is especially
annoying with find_unique_abbrev, which does not have a
ring of buffers, so you cannot even printf() a result
that has two abbreviated sha1s.
- if you want to put the result into another buffer, we
often strcpy, which looks suspicious when auditing for
overflows.
This patch introduces sha1_to_hex_r and find_unique_abbrev_r,
which write into a user-provided buffer. Of course this is
just punting on the overflow-auditing, as the buffer
obviously needs to be GIT_SHA1_HEXSZ + 1 bytes. But it is
much easier to audit, since that is a well-known size.
We retain the non-reentrant forms, which just become thin
wrappers around the reentrant ones. This patch also adds a
strbuf variant of find_unique_abbrev, which will be handy in
later patches.
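The reentrant pattern can be sketched as follows. This is a minimal illustration, not git's implementation: `HASH_RAWSZ`, `HASH_HEXSZ` and `hash_to_hex_r` are hypothetical stand-ins for the real constants and for sha1_to_hex_r.

```c
#include <assert.h>
#include <stddef.h>

#define HASH_RAWSZ 20                 /* assumed: raw SHA-1 size in bytes */
#define HASH_HEXSZ (2 * HASH_RAWSZ)   /* hex digits, without the NUL */

/*
 * Write the hex form of `hash` into `out`, which the caller must size
 * to HASH_HEXSZ + 1 bytes. No static state is touched, so two results
 * can be alive at the same time.
 */
char *hash_to_hex_r(char *out, const unsigned char *hash)
{
	static const char digits[] = "0123456789abcdef";
	char *p = out;
	size_t i;

	assert(out && hash);
	for (i = 0; i < HASH_RAWSZ; i++) {
		*p++ = digits[hash[i] >> 4];
		*p++ = digits[hash[i] & 0xf];
	}
	*p = '\0';
	return out;
}
```

Because each caller supplies its own `char buf[HASH_HEXSZ + 1]`, auditing reduces to checking that one declaration.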
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The strbuf_complete_line function makes sure that a buffer
ends in a newline. But we may want to do this for any
character (e.g., "/" on the end of a path). Let's factor out
a generic version, and keep strbuf_complete_line as a thin
wrapper.
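A rough sketch of the factoring, using a hypothetical `mini_buf` type in place of struct strbuf. Leaving an empty buffer untouched is a choice made in this sketch.

```c
#include <assert.h>
#include <stdlib.h>

struct mini_buf {
	char *buf;      /* NUL-terminated once anything has been added */
	size_t len;
	size_t alloc;
};

/* Append one character, growing the allocation as needed. */
void mini_buf_addch(struct mini_buf *sb, char c)
{
	if (sb->len + 2 > sb->alloc) {
		size_t n = sb->alloc ? 2 * sb->alloc : 16;
		char *p = realloc(sb->buf, n);

		if (!p)
			abort();
		sb->buf = p;
		sb->alloc = n;
	}
	sb->buf[sb->len++] = c;
	sb->buf[sb->len] = '\0';
}

/* Generic version: make sure a non-empty buffer ends in `term`. */
void mini_buf_complete(struct mini_buf *sb, char term)
{
	assert(sb);
	if (sb->len && sb->buf[sb->len - 1] != term)
		mini_buf_addch(sb, term);
}

/* The line-oriented helper becomes a thin wrapper. */
void mini_buf_complete_line(struct mini_buf *sb)
{
	mini_buf_complete(sb, '\n');
}
```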
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach "git log" and friends a new "--date=format:..." option to
format timestamps using the system's strftime(3).
* jk/date-mode-format:
strbuf: make strbuf_addftime more robust
introduce "format" date-mode
convert "enum date_mode" into a struct
show-branch: use DATE_RELATIVE instead of magic number
It is currently declared to return int, which could overflow for
large files.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This feeds the format directly to strftime. Besides being a
little more flexible, the main advantage is that your system
strftime may know more about your locale's preferred format
(e.g., how to spell the days of the week).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We mark strbuf_addch as inline, because we expect it may be
called from a tight loop. However, the first thing it does
is call the non-inline strbuf_grow(), which can handle
arbitrary-sized growth. Since we know that we only need a
single character, we can use the inline strbuf_avail() to
quickly check whether we need to grow at all.
Our check is redundant when we do call strbuf_grow(), but
that's OK. The common case is that we avoid calling it at
all, and we have made that case faster.
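The fast path can be sketched like this, with hypothetical `mini_*` names standing in for the strbuf functions. The growth policy (doubling from 16 bytes) is an assumption of the sketch, and overflow checks are omitted.

```c
#include <assert.h>
#include <stdlib.h>

struct mini_buf {
	char *buf;
	size_t len;     /* bytes in use, not counting the trailing NUL */
	size_t alloc;   /* bytes allocated, 0 if nothing is allocated yet */
};

/* Room left for data, keeping one byte back for the NUL. */
static inline size_t mini_avail(const struct mini_buf *sb)
{
	return sb->alloc ? sb->alloc - sb->len - 1 : 0;
}

/* Out-of-line slow path: handles arbitrary-sized growth. */
void mini_grow(struct mini_buf *sb, size_t extra)
{
	size_t need = sb->len + extra + 1;

	if (need > sb->alloc) {
		size_t n = sb->alloc ? sb->alloc : 16;
		char *p;

		while (n < need)
			n *= 2;
		p = realloc(sb->buf, n);
		if (!p)
			abort();
		sb->buf = p;
		sb->alloc = n;
	}
}

/* Inline fast path: only call mini_grow() when no room is left. */
static inline void mini_addch(struct mini_buf *sb, char c)
{
	if (!mini_avail(sb))
		mini_grow(sb, 1);
	assert(sb->len + 1 < sb->alloc);
	sb->buf[sb->len++] = c;
	sb->buf[sb->len] = '\0';
}
```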
On a silly pathological case:
    perl -le '
      print "[core]";
      print "key$_ = value$_" for (1..1000000)
    ' >input
    git config -f input core.key1
this dropped the time to run git-config from:
    real    0m0.159s
    user    0m0.152s
    sys     0m0.004s
to:
    real    0m0.140s
    user    0m0.136s
    sys     0m0.004s
for a savings of 12%.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The relationship between these makes more sense if you read
them as a group, which can help people who are looking for
the right function. Let's give them a single comment.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The description of strbuf_split_buf says most of what
needs to be said for all of the split variants that take
strings, raw memory, etc. We have a boilerplate comment
above each that points to the first. This boilerplate
ends up making it harder to read, because it spaces out the
functions, which could otherwise be read as a group.
Let's drop the boilerplate completely, and mention the
variants in the top comment. This is perhaps slightly worse
for a hypothetical system which pulls the documentation for
each function out of the comment immediately preceding it.
But such a system does not yet exist, and anyway, the end
result of extracting the boilerplate comments would not lead
to a very easy-to-read result. We would do better in the
long run to teach the extraction system about groups of
related functions.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The original API doc had something like:
    Functions
    ---------
    * Life cycle
      ... some life-cycle functions ...
    * Related to the contents of the buffer
      ... functions related to contents ....
    etc
This grouping can be hard to read in the comment sources,
given the "*" in the comment lines, and the amount of text
between each section.
Instead, let's make a flat list of groupings, and underline
each as a section header. That makes them stand out, and
eliminates the weird half-phrase of "Related to...". Like:
    Functions related to the contents of the buffer
    -----------------------------------------------
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is much easier to read when the whole thing is stuffed
inside a comment block. And there is precedent for this
convention in markdown (and just in general ascii text).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Using a hanging indent is much more readable. This means we
won't format as asciidoc anymore, but since we don't have a
working system for extracting these comments anyway, it's
probably more important to just make the source readable.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The prior patch uses "/**" to denote "documentation"
comments that we pulled from api-strbuf.txt. Let's use a
consistent style for similar comments that were already in
strbuf.h.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some of strbuf is documented as comments above functions,
and some is separate in Documentation/technical/api-strbuf.txt.
This makes it annoying to find the appropriate documentation.
We'd rather have it all in one place, which means all in the
text document, or all in the header.
Let's choose the header as that place. Even though the
formatting is not quite as pretty, this keeps the
documentation close to the related code. The hope is that
this makes it easier to find what you want (human-readable
comments are right next to the C declarations), and easier
for writers to keep the documentation up to date.
This is more or less a straight import of the text from
api-strbuf.txt into C comments, complete with asciidoc
formatting. The exceptions are:
1. All comments created in this way are started with "/**"
to indicate they are part of the API documentation. This
may help later with extracting the text to pretty-print
it.
2. Function descriptions do not repeat the function name,
as it is available in the context directly below. So:
    `strbuf_add`::
        Add data of given length to the buffer.
from api-strbuf.txt becomes:
    /**
     * Add data of given length to the buffer.
     */
    void strbuf_add(struct strbuf *sb, const void *, size_t);
As a result, any block-continuation required in asciidoc
for that list item was dropped in favor of straight
blank-line paragraph (since it is not necessary when we
are not in a list item).
3. There is minor re-wording to integrate existing comments
and api-strbuf text. In each case, I took whichever
version was more descriptive, and eliminated any
redundancies. In one case, for strbuf_addstr, the api
documentation gave its inline definition; I eliminated
this as redundant with the actual definition, which can
be seen directly below the comment.
4. The functions in the header file are re-ordered to match
the ordering of the API documentation, under the
assumption that more thought went into the grouping
there.
Helped-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move strbuf_addchars() to strbuf.c, where it belongs, and make it
available for other callers.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Reduce the use of fixed-size buffers passed to getcwd() calls
by introducing the xgetcwd() helper.
* rs/strbuf-getcwd:
use strbuf_add_absolute_path() to add absolute paths
abspath: convert absolute_path() to strbuf
use xgetcwd() to set $GIT_DIR
use xgetcwd() to get the current directory or die
wrapper: add xgetcwd()
abspath: convert real_path_internal() to strbuf
abspath: use strbuf_getcwd() to remember original working directory
setup: convert setup_git_directory_gently_1 et al. to strbuf
unix-sockets: use strbuf_getcwd()
strbuf: add strbuf_getcwd()
Move most of the code of absolute_path() into the new function
strbuf_add_absolute_path() and in the process transform it to use
struct strbuf and xgetcwd() instead of a PATH_MAX-sized buffer,
which can be too small on some file systems.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add strbuf_getcwd(), which puts the current working directory into a
strbuf. Because it doesn't use a fixed-size buffer it supports
arbitrarily long paths, provided the platform's getcwd() does as well.
At least on Linux and FreeBSD it handles paths longer than PATH_MAX
just fine.
Suggested-by: Karsten Blees <karsten.blees@gmail.com>
Helped-by: Duy Nguyen <pclouds@gmail.com>
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* jk/strip-suffix:
prepare_packed_git_one: refactor duplicate-pack check
verify-pack: use strbuf_strip_suffix
strbuf: implement strbuf_strip_suffix
index-pack: use strip_suffix to avoid magic numbers
use strip_suffix instead of ends_with in simple cases
replace has_extension with ends_with
implement ends_with via strip_suffix
add strip_suffix function
sha1_file: replace PATH_MAX buffer with strbuf in prepare_packed_git_one()
You can almost get away with just calling "strip_suffix_mem"
on a strbuf's buf and len fields. But we also need to move
the NUL-terminator to satisfy strbuf's invariants. Let's
provide a convenience wrapper that handles this.
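A sketch of why the wrapper is needed, using hypothetical `mini_*` names. The memory-level helper only adjusts the length; the wrapper also restores the terminator.

```c
#include <assert.h>
#include <string.h>

struct mini_buf {
	char *buf;
	size_t len;
};

/* Return 1 and shrink *len if buf[0..*len) ends with `suffix`. */
int mini_strip_suffix_mem(const char *buf, size_t *len, const char *suffix)
{
	size_t suflen;

	assert(buf && len && suffix);
	suflen = strlen(suffix);
	if (*len < suflen || memcmp(buf + (*len - suflen), suffix, suflen))
		return 0;
	*len -= suflen;
	return 1;
}

/* Same, but also move the NUL so that buf stays a valid C string. */
int mini_buf_strip_suffix(struct mini_buf *sb, const char *suffix)
{
	if (!mini_strip_suffix_mem(sb->buf, &sb->len, suffix))
		return 0;
	sb->buf[sb->len] = '\0';
	return 1;
}
```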
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
You can use a strbuf to build up a string from parts, and
then detach it. In the general case, you might use multiple
strbuf_add* functions to do the building. However, in many
cases, a single strbuf_addf is sufficient, and we end up
with:
    struct strbuf buf = STRBUF_INIT;
    ...
    strbuf_addf(&buf, fmt, some, args);
    str = strbuf_detach(&buf, NULL);
We can make this much more readable (and avoid introducing
an extra variable, which can clutter the code) by
introducing a convenience function:
    str = xstrfmt(fmt, some, args);
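For illustration, a function with this shape can be built on plain C99 without any strbuf. `fmt_alloc` is a hypothetical name and this is only a sketch of the idea, not how xstrfmt is implemented.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Format into a freshly allocated string; the caller frees it. */
char *fmt_alloc(const char *fmt, ...)
{
	va_list ap, cp;
	int len;
	char *out;

	va_start(ap, fmt);
	va_copy(cp, ap);
	len = vsnprintf(NULL, 0, fmt, cp);   /* first pass: measure */
	va_end(cp);
	if (len < 0)
		abort();

	out = malloc((size_t)len + 1);
	if (!out)
		abort();
	vsnprintf(out, (size_t)len + 1, fmt, ap);   /* second pass: write */
	va_end(ap);

	assert(out[len] == '\0');
	return out;
}
```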
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Propagate the error messages from the webserver better to the
client coming over the HTTP transport.
* jk/http-errors:
http: default text charset to iso-8859-1
remote-curl: reencode http error messages
strbuf: add strbuf_reencode helper
http: optionally extract charset parameter from content-type
http: extract type/subtype portion of content-type
t5550: test display of remote http error messages
t/lib-httpd: use write_script to copy CGI scripts
test-lib: preserve GIT_CURL_VERBOSE from the environment
This is a convenience wrapper around `reencode_string_len`
and `strbuf_attach`.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is a convenience wrapper to call tolower on each
character of the string.
This makes config's lowercase() function obsolete, though
note that because we have a strbuf, we are careful to
operate over the whole strbuf, rather than assuming that a
NUL is the end-of-string.
We could continue to offer a pure-string lowercase, but
there would be no callers (in most pure-string cases, we
actually duplicate and lowercase the duplicate, for which we
have the xstrdup_tolower wrapper).
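The difference from a pure-string lowercase can be sketched as below. `lower_mem` is a hypothetical name; the point is that the loop is bounded by the length, not by the first NUL.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Lowercase exactly `len` bytes, including any that follow a NUL. */
void lower_mem(char *buf, size_t len)
{
	size_t i;

	assert(buf || !len);
	for (i = 0; i < len; i++)
		buf[i] = (char)tolower((unsigned char)buf[i]);
}
```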
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We have two implementations of the same function; let's drop
that to one. We take the name from daemon.c, but the
implementation (which is just slightly more efficient) from
the config code.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Humanization of downloaded size is done in the same function as text
formatting in 'progress.c'. The code cannot be reused easily elsewhere.
Separate text formatting from size simplification and make the
function public in strbuf so that it can easily be used by other
callers.
We now can use strbuf_humanise_bytes() for both downloaded size and
download speed calculation. One of the drawbacks is that speed will
now look like this when download is stalled: "0 bytes/s" instead of
"0 KiB/s".
Signed-off-by: Antoine Pelisse <apelisse@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some users do want to write a line that begins with a pound sign, #,
in their commit log message. Many tracking systems recognise
a token of the #<bugid> form, for example.
The support we offer these use cases is not very friendly to the end
users. They have a choice between
- Don't do it. Avoid such a line by rewrapping or indenting; and
- Use --cleanup=whitespace but remove all the hint lines we add.
Give them a way to set a custom comment char, e.g.
$ git -c core.commentchar="%" commit
so that they do not have to do either of the two workarounds.
[jc: although I started the topic, all the tests and documentation
updates, many of the call sites of the new strbuf_add_commented_*()
functions, and the change to git-submodule.sh scripted Porcelain are
from Ralf.]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Ralf Thielow <ralf.thielow@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Update imap-send to reuse xml quoting code from http-push codepath,
clean up some code, and fix a small bug.
* mh/unify-xml-in-imap-send-and-http-push:
wrap_in_html(): process message in bulk rather than line-by-line
wrap_in_html(): use strbuf_addstr_xml_quoted()
imap-send: change msg_data from storing (ptr, len) to storing strbuf
imap-send: correctly report errors reading from stdin
imap-send: store all_msgs as a strbuf
lf_to_crlf(): NUL-terminate msg_data::data
xml_entities(): use function strbuf_addstr_xml_quoted()
Add new function strbuf_add_xml_quoted()
Substantially the same code is present in http-push.c and imap-send.c,
so make a library function out of it.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Document strbuf_split_buf(), strbuf_split_str(), strbuf_split_max(),
strbuf_split(), and strbuf_list_free() in the header file and in
api-strbuf.txt. (These functions were previously completely
undocumented.)
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Jeff King <peff@peff.net>
The word "delimiter" suggests that the argument separates the
substrings, whereas in fact (1) the delimiter characters are included
in the output, and (2) if the input string ends with the delimiter,
then the output does not include a final empty string. So rename the
"delim" arguments of the strbuf_split() family of functions to
"terminator", which is more suggestive of how it is used.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Jeff King <peff@peff.net>
These functions are helpful when we do not want to expose \n to
translators. For example
printf("hello world\n");
can be converted to
printf_ln(_("hello world"));
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* tr/maint-bundle-long-subject:
t5704: match tests to modern style
strbuf: improve strbuf_get*line documentation
bundle: use a strbuf to scan the log for boundary commits
bundle: put strbuf_readline_fd in strbuf.c with adjustments
The comment even said that it should eventually go there. While at
it, match the calling convention and name of the function to the
strbuf_get*line family. So it now is strbuf_getwholeline_fd.
Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* jk/credentials:
t: add test harness for external credential helpers
credentials: add "store" helper
strbuf: add strbuf_add*_urlencode
Makefile: unix sockets may not available on some platforms
credentials: add "cache" helper
docs: end-user documentation for the credential subsystem
credential: make relevance of http path configurable
credential: add credential.*.username
credential: apply helper config
http: use credential API to get passwords
credential: add function for parsing url components
introduce credentials API
t5550: fix typo
test-lib: add test_config_global variant
Conflicts:
strbuf.c
This just follows the rfc3986 rules for percent-encoding
url data into a strbuf.
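A minimal sketch of the rule, writing into a caller-supplied array rather than a strbuf. `urlencode` and `is_unreserved` are hypothetical names. This version escapes every byte outside the RFC 3986 "unreserved" set and uses uppercase hex digits, as the RFC recommends; the real helper may make different choices.

```c
#include <assert.h>
#include <stddef.h>

/* RFC 3986 section 2.3: ALPHA / DIGIT / "-" / "." / "_" / "~" */
static int is_unreserved(unsigned char c)
{
	return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
	       (c >= '0' && c <= '9') ||
	       c == '-' || c == '.' || c == '_' || c == '~';
}

/*
 * Percent-encode in[0..inlen) into out (capacity outsz, including the
 * NUL). Returns the encoded length, or (size_t)-1 if out is too small.
 */
size_t urlencode(char *out, size_t outsz, const char *in, size_t inlen)
{
	static const char hex[] = "0123456789ABCDEF";
	size_t i, o = 0;

	assert(out && (in || !inlen));
	if (!outsz)
		return (size_t)-1;
	for (i = 0; i < inlen; i++) {
		unsigned char c = (unsigned char)in[i];
		size_t need = is_unreserved(c) ? 1 : 3;

		if (outsz - o <= need)   /* keep one byte for the NUL */
			return (size_t)-1;
		if (need == 1) {
			out[o++] = (char)c;
		} else {
			out[o++] = '%';
			out[o++] = hex[c >> 4];
			out[o++] = hex[c & 0xf];
		}
	}
	out[o] = '\0';
	return o;
}
```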
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When a contributor asks the integrator to merge her history, a signed tag
can be a good vehicle to communicate the authenticity of the request while
conveying other information such as the purpose of the topic.
E.g. a signed tag "for-linus" can be created, and the integrator can run:
$ git pull git://example.com/work.git/ for-linus
This would allow the integrator to run "git verify-tag FETCH_HEAD" to
validate the signed tag.
Update fmt-merge-msg so that it pre-fills the merge message template with
the body (but not signature) of the tag object to help the integrator write
a better merge message, in the same spirit as the existing merge.log summary
lines.
The message that comes from GPG signature validation is also included in
the merge message template to help the integrator verify it, but its lines
are prefixed with "#" to make them comments.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* jk/maint-config-param:
config: use strbuf_split_str instead of a temporary strbuf
strbuf: allow strbuf_split to work on non-strbufs
config: avoid segfault when parsing command-line config
config: die on error in command-line config
fix "git -c" parsing of values with equals signs
strbuf_split: add a max parameter
The strbuf_split function takes a strbuf as input, and
outputs a list of strbufs. However, there is no reason that
the input has to be a strbuf, and not an arbitrary buffer.
This patch adds strbuf_split_buf for a length-delimited
buffer, and strbuf_split_str for NUL-terminated strings.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Sometimes when splitting, you only want a limited number of
fields, and for the final field to contain "everything
else", even if it includes the delimiter.
This patch introduces strbuf_split_max, which provides a
"max number of fields" parameter; it behaves similarly to
perl's "split" with a 3rd field.
The existing 2-argument form of strbuf_split is retained for
compatibility and ease-of-use.
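The "max number of fields" behaviour can be sketched on plain C strings. `split_max` and `split_free` are hypothetical names, and the pieces are malloc'd strings rather than strbufs. As in this sketch, each piece keeps its trailing separator, and a trailing separator does not produce a final empty piece.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Split `str` into pieces that each end with `term`. If max > 0, return
 * at most `max` pieces; the last one holds the rest of the input, even
 * if it contains `term`. The result is a NULL-terminated array.
 */
char **split_max(const char *str, char term, size_t max)
{
	size_t left, nr = 0, alloc = 4;
	char **list = malloc(alloc * sizeof(*list));

	assert(str);
	if (!list)
		abort();
	left = strlen(str);
	while (left) {
		size_t len = left;

		if (!max || nr + 1 < max) {
			const char *end = memchr(str, term, left);
			if (end)
				len = (size_t)(end - str) + 1;
		}
		if (nr + 2 > alloc) {   /* room for this piece and the NULL */
			alloc *= 2;
			list = realloc(list, alloc * sizeof(*list));
			if (!list)
				abort();
		}
		list[nr] = malloc(len + 1);
		if (!list[nr])
			abort();
		memcpy(list[nr], str, len);
		list[nr][len] = '\0';
		nr++;
		str += len;
		left -= len;
	}
	list[nr] = NULL;
	return list;
}

void split_free(char **list)
{
	char **p;

	for (p = list; *p; p++)
		free(*p);
	free(list);
}
```

With max set to 2, splitting "a=b=c" on '=' yields "a=" and "b=c", which is the shape needed for a key followed by a value that may itself contain '='.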
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit a8f3e2219 introduced the strbuf_grow() call to strbuf_setlen() to
ensure that there was at least one byte available to write the
mandatory trailing NUL, even for previously unallocated strbufs.
Then b315c5c0 added strbuf_slopbuf for the same reason, only globally for
all uses of strbufs.
Thus the strbuf_grow() call can be removed now. This keeps readers of
strbuf.h from mistakenly thinking that strbuf_setlen() can be used to
extend a strbuf.
The following assert() needs to be changed to cope with the fact that
sb->alloc can now be zero, which is OK as long as len is also zero. As
suggested by Junio, use the chance to convert it to a die() with a short
explanatory message. The pattern of 'die("BUG: ...")' is already used in
strbuf.c.
This was the only assert() in strbuf.[ch], so assert.h doesn't have to be
included anymore either.
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* maint:
contrib/thunderbird-patch-inline: do not require bash to run the script
t8001: check the exit status of the command being tested
strbuf.h: remove a tad stale docs-in-comment and reference api-doc instead
Typos: t/README
Documentation/config.txt: make truth value of numbers more explicit
git-pack-objects.txt: fix grammatical errors
parse-remote: replace unnecessary sed invocation
In a variable-args function, the code for writing into a strbuf is
non-trivial. We ended up cutting and pasting it in several places
because there was no vprintf-style function for strbufs (which in turn
was held up by a lack of va_copy).
Now that we have a fallback va_copy, we can add strbuf_vaddf, the
strbuf equivalent of vsprintf. And we can clean up the cut and paste
mess.
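The role of va_copy can be sketched as follows, with hypothetical `mini_*` names. The va_list is consumed once to measure and once to write, so the first pass has to work on a copy. Measuring first is a simplification made here; it is not claimed to be what strbuf_vaddf does.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

struct mini_buf {
	char *buf;
	size_t len;
	size_t alloc;
};

/* Make room for `extra` more bytes plus the trailing NUL. */
void mini_grow(struct mini_buf *sb, size_t extra)
{
	size_t need = sb->len + extra + 1;

	if (need > sb->alloc) {
		size_t n = sb->alloc ? sb->alloc : 16;
		char *p;

		while (n < need)
			n *= 2;
		p = realloc(sb->buf, n);
		if (!p)
			abort();
		sb->buf = p;
		sb->alloc = n;
	}
}

/* vprintf-style append: usable from any variable-args caller. */
void mini_vaddf(struct mini_buf *sb, const char *fmt, va_list ap)
{
	va_list cp;
	int len;

	va_copy(cp, ap);                     /* `ap` is still needed below */
	len = vsnprintf(NULL, 0, fmt, cp);
	va_end(cp);
	if (len < 0)
		abort();
	mini_grow(sb, (size_t)len);
	vsnprintf(sb->buf + sb->len, (size_t)len + 1, fmt, ap);
	sb->len += (size_t)len;
	assert(sb->buf[sb->len] == '\0');
}

/* printf-style append becomes a thin wrapper. */
void mini_addf(struct mini_buf *sb, const char *fmt, ...)
{
	va_list ap;

	va_start(ap, fmt);
	mini_vaddf(sb, fmt, ap);
	va_end(ap);
}
```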
Signed-off-by: Jeff King <peff@peff.net>
Improved-by: Christian Couder <christian.couder@gmail.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* jk/warn-author-committer-after-commit:
user_ident_sufficiently_given(): refactor the logic to be usable from elsewhere
commit.c::print_summary: do not release the format string too early
commit: allow suppression of implicit identity advice
commit: show interesting ident information in summary
strbuf: add strbuf_addbuf_percentquote
strbuf_expand: convert "%%" to "%"
Conflicts:
builtin-commit.c
ident.c
This is handy for creating strings which will be fed to printf() or
strbuf_expand().
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If sb and sb2 are the same (i.e. doubling the string), the underlying
strbuf_add() can make sb2->buf invalid by calling strbuf_grow(sb) at
the beginning; if realloc(3) done by strbuf_grow() needs to move the
string, strbuf_add() will read from an already freed buffer.
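The hazard and one way around it can be sketched with hypothetical `mini_*` names: grow the destination before the source pointer is read, so that a self-append sees the buffer's new address.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct mini_buf {
	char *buf;
	size_t len;
	size_t alloc;
};

/* May realloc, which can move sb->buf. */
void mini_grow(struct mini_buf *sb, size_t extra)
{
	size_t need = sb->len + extra + 1;

	if (need > sb->alloc) {
		size_t n = sb->alloc ? sb->alloc : 16;
		char *p;

		while (n < need)
			n *= 2;
		p = realloc(sb->buf, n);
		if (!p)
			abort();
		sb->buf = p;
		sb->alloc = n;
	}
}

void mini_add(struct mini_buf *sb, const void *data, size_t len)
{
	mini_grow(sb, len);
	if (len)
		memcpy(sb->buf + sb->len, data, len);
	sb->len += len;
	sb->buf[sb->len] = '\0';
}

void mini_addbuf(struct mini_buf *sb, const struct mini_buf *sb2)
{
	/*
	 * Grow first. If sb == sb2, sb2->buf is only evaluated after any
	 * realloc, so mini_add() never reads from a freed buffer.
	 */
	mini_grow(sb, sb2->len);
	assert(sb->alloc > sb->len + sb2->len);
	mini_add(sb, sb2->buf, sb2->len);
}
```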
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We already have these checks in many printf-type functions that have
prototypes which are in header files. Add these same checks to some
more prototypes in header functions and to static functions in .c
files.
cc: Miklos Vajna <vmiklos@frugalware.org>
Signed-off-by: Tarmigan Casebolt <tarmigan+git@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This function is just like strbuf_getline() except it retains the
line-termination character. This function will be used by the mailinfo
and mailsplit builtins which require the entire line for parsing.
Signed-off-by: Brandon Casey <drafnel@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This allows a common calling sequence
    strbuf_branchname(&ref, name);
    strbuf_splice(&ref, 0, 0, "refs/heads/", 11);
    if (check_ref_format(ref.buf))
        die(...);
to be refactored into
    if (strbuf_check_branch_ref(&ref, name))
        die(...);
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The function takes a user-supplied string that is supposed to be a branch
name, and puts it in a strbuf after expanding possible shorthand notation.
A handful of open-coded sequences that do this in the existing code have
been changed to use this helper function.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It was already what 'git apply' did in read_old_data(), just export it
as a real function, and make it be more generic.
In particular, this handles the case of the lstat() st_size data not
matching the readlink() return value properly (which apparently happens
at least on NTFS under Linux). But as a result of this you could also
use the new function without even knowing how big the link is going to
be, and it will allocate an appropriately sized buffer.
So we pass in the st_size of the link as just a hint, rather than a
fixed requirement.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The new callback function strbuf_expand_dict_cb() can be used together
with strbuf_expand() if there is only a small number of placeholders
for static replacement texts. It expects its dictionary as an array of
placeholder+value pairs, passed as the context parameter and terminated
by an entry with the placeholder member set to NULL.
The new helper is intended to aid converting the remaining calls of
interpolate(). strbuf_expand() is smaller, more flexible and can be
used to go faster than interpolate(), so it should replace the latter.
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This patch removes exit()/die() calls and builtin-specific messages
from launch_editor(), so that it can be used as a general libgit.a
function to launch an editor.
Signed-off-by: Stephan Beyer <s-beyer@gmx.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently, the --pretty=format prefix is looked up in a
tight loop in strbuf_expand(). If the prefix is found, it is then
passed to format_commit_item(), which does another
search with a switch statement to select the proper operation.
Because the switch statement is already able to discard
unknown matches, we don't need the prefix lookup before
calling format_commit_item().
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The editor program to let the user edit the log message used to
get GIT_INDEX_FILE environment variable pointing at the right
file, but this was lost when git-commit was rewritten in C.
Signed-off-by: Pierre Habouzit <madcoder@debian.org>
Acked-by: Kristian Høgsberg <krh@redhat.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a new function, strbuf_adddup(), that appends a duplicate of a
part of a struct strbuf to end of the latter.
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some of the --pretty=format placeholder expansions are expensive to
calculate. This is made worse by the current code's use of
interpolate(), which requires _all_ placeholders to be prepared
up front.
One way to speed this up is to check which placeholders are present
in the format string and to prepare only the expansions that are
needed. That still leaves the allocation overhead of interpolate().
Another way is to use a callback based approach together with the
strbuf library to keep allocations to a minimum and avoid string
copies. That's what this patch does. It introduces a new strbuf
function, strbuf_expand().
The function takes a format string, list of placeholder strings,
a user supplied function 'fn', and an opaque pointer 'context'
to tell 'fn' what thingy to operate on.
The function 'fn' is expected to accept a strbuf, a parsed
placeholder string and the 'context' pointer, and append the
interpolated value for the 'context' thingy, according to the
format specified by the placeholder.
Thanks to Pierre Habouzit for his suggestion to use strchrnul() and
the code surrounding its callsite. And thanks to Junio for most of
this commit message. :)
Here are my measurements of most of Paul Mackerras' test cases that
highlighted the performance problem (best of three runs):
    (master)
    $ time git log --pretty=oneline >/dev/null
    real    0m0.390s
    user    0m0.340s
    sys     0m0.040s
    (master)
    $ time git log --pretty=raw >/dev/null
    real    0m0.434s
    user    0m0.408s
    sys     0m0.016s
    (master)
    $ time git log --pretty="format:%H {%P} %ct" >/dev/null
    real    0m1.347s
    user    0m0.080s
    sys     0m1.256s
    (interp_find_active -- Dscho)
    $ time ./git log --pretty="format:%H {%P} %ct" >/dev/null
    real    0m0.694s
    user    0m0.020s
    sys     0m0.672s
    (strbuf_expand -- this patch)
    $ time ./git log --pretty="format:%H {%P} %ct" >/dev/null
    real    0m0.395s
    user    0m0.352s
    sys     0m0.028s
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* make strbuf_read_file take a size hint (works like strbuf_read)
* use it in a couple of places.
Signed-off-by: Pierre Habouzit <madcoder@debian.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
For that purpose, the ->buf is always initialized with a char * buf living
in the strbuf module. It is made a char * so that we can sloppily accept
things that perform: sb->buf[0] = '\0', and because you can't pass "" as an
initializer for ->buf without making gcc unhappy for very good reasons.
strbuf_init/_detach/_grow have been fixed to trust ->alloc and not ->buf
anymore.
As a consequence, strbuf_detach is _mandatory_ to detach a buffer; copying
->buf isn't an option anymore if ->buf is going to escape from the scope
and eventually be free'd.
API changes:
* strbuf_setlen now always works, so just make strbuf_reset a convenience
macro.
* strbuf_detach takes an optional size_t* argument (it may be NULL)
through which it returns the buffer's len; this refactor needed it to
keep the code readable and to match how the callers work.
Signed-off-by: Pierre Habouzit <madcoder@debian.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
strbuf_setlen() expects to be able to NUL terminate the buffer,
but a completely empty strbuf could have an empty buffer with 0
allocation; both the assert() and the assignment for NUL
termination would fail.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add strbuf_remove, change strbuf_insert:
As both are special cases of strbuf_splice, implement them as such.
gcc is able to do the math and generate almost optimal code this way.
Add strbuf_swap:
Exchange the values of its arguments.
Use it in fast-import.c
Also fix spacing issues in strbuf.h
Signed-off-by: Pierre Habouzit <madcoder@debian.org>
read_line is now strbuf_getline, and is a first-class citizen; it returns 0
when reading a line worked, and EOF otherwise.
The ->eof marker was used non-locally by fast-import.c; mimic the same
behaviour using a static int in "read_next_command", which now returns -1 on
EOF and avoids calling strbuf_getline when it's in EOF state.
Also no longer automagically strbuf_release the buffer; it's
counterintuitive and breaks fast-import in a very subtle way.
Note: being at EOF implies that command_buf.len == 0.
Signed-off-by: Pierre Habouzit <madcoder@debian.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* strbuf_splice replaces a portion of the buffer with another.
* strbuf_attach replaces a strbuf's buffer with the given one, which should
be malloc'ed. It then enforces strbuf's invariants. If alloc > len, this
function has negligible cost; otherwise it performs a realloc, possibly
with a cost.
Also some style issues are fixed now.
Signed-off-by: Pierre Habouzit <madcoder@debian.org>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* Add strbuf_rtrim to remove trailing spaces.
* Add strbuf_insert to insert data at a given position.
* Off-by-one fix in strbuf_addf: strbuf_avail() does not count the final
\0, so the overflow test for snprintf is the strict comparison. This is
not critical as the growth mechanism chosen will always allocate _more_
memory than asked, so the second test will not fail. It's some kind of
miracle though.
* Add size extension hints for strbuf_init and strbuf_read. If 0, default
applies, else:
+ initial buffer has the given size for strbuf_init.
+ first growth checks it has at least this size rather than the
default 8192.
Signed-off-by: Pierre Habouzit <madcoder@debian.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The gory details are explained in strbuf.h. The change of semantics this
patch enforces is that the embedded buffer always has a '\0' character after
its last byte, to always make it a C-string. The off-by-one changes are all
related to that very change.
A strbuf can be used to store byte arrays, or as an extended string
library. The `buf' member can be passed to any C legacy string function,
because strbuf operations always ensure there is a terminating \0 at the end
of the buffer, not accounted in the `len' field of the structure.
A strbuf can be used to generate a string/buffer whose final size is not
really known, and then "strbuf_detach" can be used to get the built buffer,
and keep the wrapping "strbuf" structure usable for further work again.
Another interesting feature: strbuf_grow(sb, size) ensures that there is
enough allocated space in `sb' to put `size' new octets of data in the
buffer. It helps avoid reallocating data for nothing when the problem the
strbuf helps to solve has a known typical size.
Signed-off-by: Pierre Habouzit <madcoder@debian.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
- Raw hashes should be unsigned char.
- String functions want signed char.
- Hash and compress functions want unsigned char.
Signed-off By: Brian Gerst <bgerst@didntduck.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch introduces a new program, diff-tree-helper. It reads
output from diff-cache and diff-tree, and produces a patch file.
The diff format can be customized the same way as for show-diff;
the external diff interface introduced by the previous patch to
drive diff from show-diff is reused here, so this is not
surprising.
It is used like the following examples:
$ diff-cache --cached -z <tree> | diff-tree-helper -z -R paths...
$ diff-tree -r -z <tree1> <tree2> | diff-tree-helper -z paths...
- As usual, the use of the -z flag is recommended in scripts
to pass NUL-terminated filenames through the pipe between
commands.
- The -R flag is used to generate a reverse diff. It does not
matter for the diff-tree case, but it is sometimes useful to get
a patch in the desired direction out of diff-cache.
- The paths parameters are used to restrict the paths that
appear in the output. Again this is useful with
diff-cache, which, unlike diff-tree, does not take such path
restriction parameters.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>