The strbuf_getline() interface allows a byte other than LF or NUL as
the line terminator, but this is only because I wrote these
codepaths anticipating that there might be a value other than NUL
and LF that could be useful when I introduced line_termination long
time ago. No useful caller that uses other value has emerged.
By now, it is clear that the interface is overly broad without a
good reason. Many codepaths have hardcoded preference to read
either LF terminated or NUL terminated records from their input, and
then call strbuf_getline() with LF or NUL as the third parameter.
This step introduces two thin wrappers around strbuf_getline(),
namely, strbuf_getline_lf() and strbuf_getline_nul(), and
mechanically rewrites these call sites to call either one of
them. The changes contained in this patch are:
* introduction of these two functions in strbuf.[ch]
* mechanical conversion of all callers to strbuf_getline() with
either '\n' or '\0' as the third parameter to instead call the
respective thin wrapper.
After this step, output from "git grep 'strbuf_getline('" would
become a lot smaller. An interim goal of this series is to make
this an empty set, so that we can have strbuf_getline_crlf() take
over the shorter name strbuf_getline().
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The error message after a failing commit_lock_file() call sometimes
looks like this, causing confusion:
$ git remote add remote git@server.com/repo.git
error: could not commit config file .git/config
# Huh?!
# I didn't want to commit anything, especially not my config file!
While in the narrow context of the lockfile module using the verb
'commit' in the error message makes perfect sense, in the broader
context of git the word 'commit' already has a very specific meaning,
hence the confusion.
Reword these error messages to say "could not write" instead of "could
not commit".
While at it, include strerror in the error messages after writing the
config file or the credential store fails to provide some information
about the cause of the failure, and update the style of the error
message after writing the reflog fails to match surrounding error
messages (i.e. no '' around the pathname and no () around the error
description).
Signed-off-by: SZEDER Gábor <szeder@ira.uka.de>
Signed-off-by: Jeff King <peff@peff.net>
Many allocations that is manually counted (correctly) that are
followed by strcpy/sprintf have been replaced with a less error
prone constructs such as xstrfmt.
Macintosh-specific breakage was noticed and corrected in this
reroll.
* jk/war-on-sprintf: (70 commits)
name-rev: use strip_suffix to avoid magic numbers
use strbuf_complete to conditionally append slash
fsck: use for_each_loose_file_in_objdir
Makefile: drop D_INO_IN_DIRENT build knob
fsck: drop inode-sorting code
convert strncpy to memcpy
notes: document length of fanout path with a constant
color: add color_set helper for copying raw colors
prefer memcpy to strcpy
help: clean up kfmclient munging
receive-pack: simplify keep_arg computation
avoid sprintf and strcpy with flex arrays
use alloc_ref rather than hand-allocating "struct ref"
color: add overflow checks for parsing colors
drop strcpy in favor of raw sha1_to_hex
use sha1_to_hex_r() instead of strcpy
daemon: use cld->env_array when re-spawning
stat_tracking_info: convert to argv_array
http-push: use an argv_array for setup_revisions
fetch-pack: use argv_array for index-pack / unpack-objects
...
"git log --date=local" used to only show the normal (default)
format in the local timezone. The command learned to take 'local'
as an instruction to use the local timezone with other formats,
e.g. "git show --date=rfc-local".
* jk/date-local:
t6300: add tests for "-local" date formats
t6300: make UTC and local dates different
date: make "local" orthogonal to date format
date: check for "local" before anything else
t6300: add test for "raw" date format
t6300: introduce test_date() helper
fast-import: switch crash-report date to iso8601
Documentation/rev-list: don't list date formats
Documentation/git-for-each-ref: don't list date formats
Documentation/config: don't list date formats
Documentation/blame-options: don't list date formats
strncpy is known to be a confusing function because of its
termination semantics. These calls are all correct, but it
takes some examination to see why. In particular, every one
of them expects to copy up to the length limit, and then
makes some arrangement for terminating the result.
We can just use memcpy, along with noting explicitly how the
result is terminated (if it is not already obvious). That
should make it more clear to a reader that we are doing the
right thing.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we already know the length of a string (e.g., because
we just malloc'd to fit it), it's nicer to use memcpy than
strcpy, as it makes it more obvious that we are not going to
overflow the buffer (because the size we pass matches the
size in the allocation).
This also eliminates calls to strcpy, which make auditing
the code base harder.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we are allocating a struct with a FLEX_ARRAY member, we
generally compute the size of the array and then sprintf or
strcpy into it. Normally we could improve a dynamic allocation
like this by using xstrfmt, but it doesn't work here; we
have to account for the size of the rest of the struct.
But we can improve things a bit by storing the length that
we use for the allocation, and then feeding it to xsnprintf
or memcpy, which makes it more obvious that we are not
writing more than the allocated number of bytes.
It would be nice if we had some kind of helper for
allocating generic flex arrays, but it doesn't work that
well:
- the call signature is a little bit unwieldy:
d = flex_struct(sizeof(*d), offsetof(d, path), fmt, ...);
You need offsetof here instead of just writing to the
end of the base size, because we don't know how the
struct is packed (partially this is because FLEX_ARRAY
might not be zero, though we can account for that; but
the size of the struct may actually be rounded up for
alignment, and we can't know that).
- some sites do clever things, like over-allocating because
they know they will write larger things into the buffer
later (e.g., struct packed_git here).
So we're better off to just write out each allocation (or
add type-specific helpers, though many of these are one-off
allocations anyway).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We generally use 32-byte buffers to format git's "type size"
header fields. These should not generally overflow unless
you can produce some truly gigantic objects (and our types
come from our internal array of constant strings). But it is
a good idea to use xsnprintf to make sure this is the case.
Note that we slightly modify the interface to
write_sha1_file_prepare, which nows uses "hdrlen" as an "in"
parameter as well as an "out" (on the way in it stores the
allocated size of the header, and on the way out it returns
the ultimate size of the header).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When fast-import emits a crash report, it does so in the
user's local timezone. But because we omit the timezone
completely for DATE_LOCAL, a reader of the report does not
immediately know which time zone was used. Let's switch this
to ISO8601 instead, which includes the time zone.
This does mean we will show the time in UTC, but that's not
a big deal. A crash report like this will either be looked
at immediately (in which case nobody even looks at the
timestamp), or it will be passed along to a developer to
debug, in which case the original timezone is less likely to
be of interest.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: John Keeping <john@keeping.me.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
git_path() and mkpath() are handy helper functions but it is easy
to misuse, as the callers need to be careful to keep the number of
active results below 4. Their uses have been reduced.
* jk/git-path:
memoize common git-path "constant" files
get_repo_path: refactor path-allocation
find_hook: keep our own static buffer
refs.c: remove_empty_directories can take a strbuf
refs.c: avoid git_path assignment in lock_ref_sha1_basic
refs.c: avoid repeated git_path calls in rename_tmp_log
refs.c: simplify strbufs in reflog setup and writing
path.c: drop git_path_submodule
refs.c: remove extra git_path calls from read_loose_refs
remote.c: drop extraneous local variable from migrate_file
prefer mkpathdup to mkpath in assignments
prefer git_pathdup to git_path in some possibly-dangerous cases
add_to_alternates_file: don't add duplicate entries
t5700: modernize style
cache.h: complete set of git_path_submodule helpers
cache.h: clarify documentation for git_path, et al
Because git_path uses a static buffer that is shared with
calls to git_path, mkpath, etc, it can be dangerous to
assign the result to a variable or pass it to a non-trivial
function. The value may change unexpectedly due to other
calls.
None of the cases changed here has a known bug, but they're
worth converting away from git_path because:
1. It's easy to use git_pathdup in these cases.
2. They use constructs (like assignment) that make it
hard to tell whether they're safe or not.
The extra malloc overhead should be trivial, as an
allocation should be an order of magnitude cheaper than a
system call (which we are clearly about to make, since we
are constructing a filename). The real cost is that we must
remember to free the result.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since 5a688fe4 ("core.sharedrepository = 0mode" should set, not
loosen, 2009-03-25), we kept reminding ourselves:
NEEDSWORK: this should be renamed to finalize_temp_file() as
"moving" is only a part of what it does, when no patch between
master to pu changes the call sites of this function.
without doing anything about it. Let's do so.
The purpose of this function was not to move but to finalize. The
detail of the primarily implementation of finalizing was to link the
temporary file to its final name and then to unlink, which wasn't
even "moving". The alternative implementation did "move" by calling
rename(2), which is a fun tangent.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach "git log" and friends a new "--date=format:..." option to
format timestamps using system's strftime(3).
* jk/date-mode-format:
strbuf: make strbuf_addftime more robust
introduce "format" date-mode
convert "enum date_mode" into a struct
show-branch: use DATE_RELATIVE instead of magic number
Often a fast-import stream builds a new commit on top of the
previous commit it built, and it often unconditionally emits a
"from" command to specify the first parent, which can be omitted in
such a case. This caused fast-import to forget the tree of the
previous commit and then re-read it from scratch, which was
inefficient. Optimize for this common case.
* mh/fast-import-optimize-current-from:
fast-import: do less work when given "from" matches current branch head
"git fast-import" learned to respond to the get-mark command via
its cat-blob-fd interface.
* mh/fast-import-get-mark:
fast-import: add a get-mark command
When building a fast-import stream, it's easy to forget the fact
that for non-merge commits happening on top of the current branch
head, there is no need for a "from" command. That is corroborated by
the fact that at least git-p4, hg-fast-export and felipec's
git-remote-hg all unconditionally use a "from" command.
Unfortunately, giving a "from" command always resets the branch
tree, forcing it to be re-read, and in many cases, the pack is also
closed and reopened through gfi_unpack_entry. Both are unnecessary
overhead, and the latter is particularly slow at least on OSX.
Avoid resetting the tree when it's unmodified, and avoid calling
gfi_unpack_entry when the given mark points to the same commit as
the current branch head.
Signed-off-by: Mike Hommey <mh@glandium.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It is sometimes useful for importers to be able to read the SHA-1
corresponding to a mark that they have created via fast-import. For
example, they might want to embed the SHA-1 into the commit message of
a later commit. Or it might be useful for internal bookkeeping uses,
or for logging.
Add a "get-mark" command to "git fast-import" that allows the importer
to ask for the value of a mark that has been created earlier.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In preparation for adding date modes that may carry extra
information beyond the mode itself, this patch converts the
date_mode enum into a struct.
Most of the conversion is fairly straightforward; we pass
the struct as a pointer and dereference the type field where
necessary. Locations that declare a date_mode can use a "{}"
constructor. However, the tricky case is where we use the
enum labels as constants, like:
show_date(t, tz, DATE_NORMAL);
Ideally we could say:
show_date(t, tz, &{ DATE_NORMAL });
but of course C does not allow that. Likewise, we cannot
cast the constant to a struct, because we need to pass an
actual address. Our options are basically:
1. Manually add a "struct date_mode d = { DATE_NORMAL }"
definition to each caller, and pass "&d". This makes
the callers uglier, because they sometimes do not even
have their own scope (e.g., they are inside a switch
statement).
2. Provide a pre-made global "date_normal" struct that can
be passed by address. We'd also need "date_rfc2822",
"date_iso8601", and so forth. But at least the ugliness
is defined in one place.
3. Provide a wrapper that generates the correct struct on
the fly. The big downside is that we end up pointing to
a single global, which makes our wrapper non-reentrant.
But show_date is already not reentrant, so it does not
matter.
This patch implements 3, along with a minor macro to keep
the size of the callers sane.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If we are deleting the reference, then we don't need to read the
reference's old value. It doesn't provide any race safety, because the
value read just before the delete is no "better" than the value that
would be read under lock during the delete. And even if the reference
previously didn't exist, we can call delete_ref() on it if we don't
provide an old_sha1 value.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A replacement for contrib/workdir/git-new-workdir that does not
rely on symbolic links and make sharing of objects and refs safer
by making the borrowee and borrowers aware of each other.
* nd/multiple-work-trees: (41 commits)
prune --worktrees: fix expire vs worktree existence condition
t1501: fix test with split index
t2026: fix broken &&-chain
t2026 needs procondition SANITY
git-checkout.txt: a note about multiple checkout support for submodules
checkout: add --ignore-other-wortrees
checkout: pass whole struct to parse_branchname_arg instead of individual flags
git-common-dir: make "modules/" per-working-directory directory
checkout: do not fail if target is an empty directory
t2025: add a test to make sure grafts is working from a linked checkout
checkout: don't require a work tree when checking out into a new one
git_path(): keep "info/sparse-checkout" per work-tree
count-objects: report unused files in $GIT_DIR/worktrees/...
gc: support prune --worktrees
gc: factor out gc.pruneexpire parsing code
gc: style change -- no SP before closing parenthesis
checkout: clean up half-prepared directories in --to mode
checkout: reject if the branch is already checked out elsewhere
prune: strategies for linked checkouts
checkout: support checking out into a new working directory
...
Clear the git_zstream variable at the start of git_deflate_init() etc.
so that callers don't have to do that.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Simplify the ref transaction API around how "the ref should be
pointing at this object" is specified.
* mh/refs-have-new:
refs.h: remove duplication in function docstrings
update_ref(): improve documentation
ref_transaction_verify(): new function to check a reference's value
ref_transaction_delete(): check that old_sha1 is not null_sha1
ref_transaction_create(): check that new_sha1 is valid
commit: avoid race when creating orphan commits
commit: add tests of commit races
ref_transaction_delete(): remove "have_old" parameter
ref_transaction_update(): remove "have_old" parameter
struct ref_update: move "have_old" into "flags"
refs.c: change some "flags" to "unsigned int"
refs: remove the gap in the REF_* constant values
refs: move REF_DELETING to refs.c
"git fast-import" used to crash when it could not close and
conclude the resulting packfile cleanly.
* jk/fast-import-die-nicely-fix:
fast-import: avoid running end_packfile recursively
Instead, verify the reference's old value if and only if old_sha1 is
non-NULL.
ref_transaction_delete() will get the same treatment in a moment.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When an import has finished, we run end_packfile() to
finalize the data and move the packfile into place. If this
process fails, we call die() and end up in our die_nicely()
handler. Which unfortunately includes running end_packfile
to save any progress we made. We enter the function again,
and start operating on the pack_data struct while it is in
an inconsistent state, leading to a segfault.
One way to trigger this is to simply start two identical
fast-imports at the same time. They will both create the
same packfiles, which will then try to create identically
named ".keep" files. One will win the race, and the other
will die(), and end up with the segfault.
Since 3c078b9, we already reset the pack_data pointer to
NULL at the end of end_packfile. That covers the case of us
calling die() right after end_packfile, before we have
reinitialized the pack_data pointer. This new problem is
quite similar, except that we are worried about calling
die() _during_ end_packfile, not right after. Ideally we
would simply set pack_data to NULL as soon as we enter the
function, and operate on a copy of the pointer.
Unfortunately, it is not so easy. pack_data is a global, and
end_packfile calls into other functions which operate on the
global directly. We would have to teach each of these to
take an argument, and there is no guarantee that we would
catch all of the spots.
Instead, we can simply use a static flag to avoid
recursively entering the function. This is a little less
elegant, but it's short and fool-proof.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This allows git_path() to redirect info/fast-import to another place
if needed
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before the previous commit, get_pathname returns an array of PATH_MAX
length. Even if git_path() and similar functions does not use the
whole array, git_path() caller can, in theory.
After the commit, get_pathname() may return a buffer that has just
enough room for the returned string and git_path() caller should never
write beyond that.
Make git_path(), mkpath() and git_path_submodule() return a const
buffer to make sure callers do not write in it at all.
This could have been part of the previous commit, but the "const"
conversion is too much distraction from the core changes in path.c.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the ref transaction API so that we pass the reflog message to the
create/delete/update functions instead of to ref_transaction_commit.
This allows different reflog messages for each ref update in a multi-ref
transaction.
Signed-off-by: Ronnie Sahlberg <sahlberg@google.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move the interface declaration for the functions in lockfile.c from
cache.h to a new file, lockfile.h. Add #includes where necessary (and
remove some redundant includes of cache.h by files that already
include builtin.h).
Move the documentation of the lock_file state diagram from lockfile.c
to the new header file.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When commit_lock_file() fails, it now always calls
rollback_lock_file() internally, so there is no need to call that
function here.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
An attempt to remove the entire tree in the "git fast-import" input
stream caused it to misbehave.
* mb/fast-import-delete-root:
fast-import: fix segfault in store_tree()
t9300: test filedelete command
An attempt to remove the entire tree in the "git fast-import" input
stream caused it to misbehave.
* mb/fast-import-delete-root:
fast-import: fix segfault in store_tree()
t9300: test filedelete command
Code clean-up.
* jk/commit-author-parsing:
determine_author_info(): copy getenv output
determine_author_info(): reuse parsing functions
date: use strbufs in date-formatting functions
record_author_date(): use find_commit_header()
record_author_date(): fix memory leak on malformed commit
commit: provide a function to find a header in a buffer
With sufficiently long refnames, fast-import could have overflown
an on-stack buffer.
* jk/fast-import-fixes:
fast-import: fix buffer overflow in dump_tags
fast-import: clean up pack_data pointer in end_packfile
The second batch of the transactional ref update series.
* rs/ref-transaction-1: (22 commits)
update-ref --stdin: pass transaction around explicitly
update-ref --stdin: narrow scope of err strbuf
refs.c: make delete_ref use a transaction
refs.c: make prune_ref use a transaction to delete the ref
refs.c: remove lock_ref_sha1
refs.c: remove the update_ref_write function
refs.c: remove the update_ref_lock function
refs.c: make lock_ref_sha1 static
walker.c: use ref transaction for ref updates
fast-import.c: use a ref transaction when dumping tags
receive-pack.c: use a reference transaction for updating the refs
refs.c: change update_ref to use a transaction
branch.c: use ref transaction for all ref updates
fast-import.c: change update_branch to use ref transactions
sequencer.c: use ref transactions for all ref updates
commit.c: use ref transactions for updates
replace.c: use the ref transaction functions for updates
tag.c: use ref transactions when doing updates
refs.c: add transaction.status and track OPEN/CLOSED
refs.c: make ref_transaction_begin take an err argument
...
Signed-off-by: Ronnie Sahlberg <sahlberg@google.com>
Reviewed-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change update_branch() to use ref transactions for updates.
Signed-off-by: Ronnie Sahlberg <sahlberg@google.com>
Reviewed-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Branch tree is NULLified by filedelete command if we are trying
to delete root tree. Add sanity check and use load_tree() in that case.
Signed-off-by: Maxim Bublis <satori@yandex-team.ru>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Many of the date functions write into fixed-size buffers.
This is a minor pain, as we have to take special
precautions, and frequently end up copying the result into a
strbuf or heap-allocated buffer anyway (for which we
sometimes use strcpy!).
Let's instead teach parse_date, datestamp, etc to write to a
strbuf. The obvious downside is that we might need to
perform a heap allocation where we otherwise would not need
to. However, it turns out that the only two new allocations
required are:
1. In test-date.c, where we don't care about efficiency.
2. In determine_author_info, which is not performance
critical (and where the use of a strbuf will help later
refactoring).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When creating a new annotated tag, we sprintf the refname
into a static-sized buffer. If we have an absurdly long
tagname, like:
git init repo &&
cd repo &&
git commit --allow-empty -m foo &&
git tag -m message mytag &&
git fast-export mytag |
perl -lpe '/^tag/ and s/mytag/"a" x 8192/e' |
git fast-import <input
we'll overflow the buffer. We can fix it by using a strbuf.
Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Michael Haggerty <mhagger@alum.mit.edu>
Reviewed-by: Ronnie Sahlberg <sahlberg@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We have a global pointer pack_data pointing to the current
pack we have open. Inside end_packfile we have two new
pointers, old_p and new_p. The latter points to pack_data,
and the former points to the new "installed" version of the
packfile we get when we hand the file off to the regular
sha1_file machinery. When then free old_p.
Presumably the extra old_p pointer was there so that we
could overwrite pack_data with new_p and still free old_p,
but we don't do that. We just leave pack_data pointing to
bogus memory, and don't overwrite it until we call
start_packfile again (if ever).
This can cause problems for our die routine, which calls
end_packfile to clean things up. If we die at the wrong
moment, we can end up looking at invalid memory in
pack_data left after the last end_packfile().
Instead, let's make sure we set pack_data to NULL after we
free it, and make calling endfile() again with a NULL
pack_data a noop (there is nothing to end).
We can further make things less confusing by dropping old_p
entirely, and moving new_p closer to its point of use.
Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Ronnie Sahlberg <sahlberg@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use `git_config_get_*()` family instead of `git_config()` to take
advantage of the config-set API which provides a cleaner control flow.
Signed-off-by: Tanay Abhra <tanayabh@gmail.com>
Reviewed-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>