* jc/streaming-filter:
t0021: test application of both crlf and ident
t0021-conversion.sh: fix NoTerminatingSymbolAtEOF test
streaming: filter cascading
streaming filter: ident filter
Add LF-to-CRLF streaming conversion
stream filter: add "no more input" to the filters
Add streaming filter API
convert.h: move declarations for conversion from cache.h
* jc/streaming:
sha1_file: use the correct type (ssize_t, not size_t) for read-style function
streaming: read loose objects incrementally
sha1_file.c: expose helpers to read loose objects
streaming: read non-delta incrementally from a pack
streaming_write_entry(): support files with holes
convert: CRLF_INPUT is a no-op in the output codepath
streaming_write_entry(): use streaming API in write_entry()
streaming: a new API to read from the object store
write_entry(): separate two helper functions out
unpack_object_header(): make it public
sha1_object_info_extended(): hint about objects in delta-base cache
sha1_object_info_extended(): expose a bit more info
packed_object_info_detail(): do not return a string
* ef/maint-win-verify-path:
verify_dotfile(): do not assume '/' is the path seperator
verify_path(): simplify check at the directory boundary
verify_path: consider dos drive prefix
real_path: do not assume '/' is the path seperator
A Windows path starting with a backslash is absolute
* jk/maint-config-alias-fix:
handle_options(): do not miscount how many arguments were used
config: always parse GIT_CONFIG_PARAMETERS during git_config
git_config: don't peek at global config_parameters
config: make environment parsing routines static
Conflicts:
config.c
This fixes prefix_path() not recognizing e.g. \foo\bar as an absolute path
on Windows.
Signed-off-by: Theo Niessink <theo@taletn.com>
Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before adding the streaming filter API to the conversion layer,
move the existing declarations related to the conversion to its
own header file.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* jk/git-connection-deadlock-fix:
test core.gitproxy configuration
send-pack: avoid deadlock on git:// push with failed pack-objects
connect: let callers know if connection is a socket
connect: treat generic proxy processes like ssh processes
Conflicts:
connect.c
* jc/bigfile:
Bigfile: teach "git add" to send a large file straight to a pack
index_fd(): split into two helper functions
index_fd(): turn write_object and format_check arguments into one flag
Nobody outside of git_config_from_parameters should need
to use the GIT_CONFIG_PARAMETERS parsing functions, so let's
make them private.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* jc/magic-pathspec:
setup.c: Fix some "symbol not declared" sparse warnings
t3703: Skip tests using directory name ":" on Windows
revision.c: leave a note for "a lone :" enhancement
t3703, t4208: add test cases for magic pathspec
rev/path disambiguation: further restrict "misspelled index entry" diag
fix overslow :/no-such-string-ever-existed diagnostics
fix overstrict :<path> diagnosis
grep: use get_pathspec() correctly
pathspec: drop "lone : means no pathspec" from get_pathspec()
Revert "magic pathspec: add ":(icase)path" to match case insensitively"
magic pathspec: add ":(icase)path" to match case insensitively
magic pathspec: futureproof shorthand form
magic pathspec: add tentative ":/path/from/top/level" pathspec support
Make map_sha1_file(), parse_sha1_header() and unpack_sha1_header()
available to the streaming read API by exporting them via cache.h header
file.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When the output to a path does not have to be converted, we can read from
the object database from the streaming API and write to the file in the
working tree, without having to hold everything in the memory.
The ident, auto- and safe- crlf conversions inherently require you to read
the whole thing before deciding what to do, so while it is technically
possible to support them by using a buffer of an unbound size or rewinding
and reading the stream twice, it is less practical than the traditional
"read the whole thing in core and convert" approach.
Adding streaming filters for the other conversions on top of this should
be doable by tweaking the can_bypass_conversion() function (it should be
renamed to can_filter_stream() when it happens). Then the streaming API
can be extended to wrap the git_istream streaming_write_entry() opens on
the underlying object in another git_istream that reads from it, filters
what is read, and let the streaming_write_entry() read the filtered
result. But that is outside the scope of this series.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
An object found in the delta-base cache is not guaranteed to
stay there, but we know it came from a pack and it is likely
to give us a quick access if we read_sha1_file() it right now,
which is a piece of useful information.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* jc/replacing:
read_sha1_file(): allow selective bypassing of replacement mechanism
inline lookup_replace_object() calls
read_sha1_file(): get rid of read_sha1_file_repl() madness
t6050: make sure we test not just commit replacement
Declare lookup_replace_object() in cache.h, not in commit.h
Conflicts:
environment.c
* jk/git-connection-deadlock-fix:
test core.gitproxy configuration
send-pack: avoid deadlock on git:// push with failed pack-objects
connect: let callers know if connection is a socket
connect: treat generic proxy processes like ssh processes
Conflicts:
connect.c
The original interface for sha1_object_info() takes an object name and
gives back a type and its size (the latter is given only when it was
asked). The new interface wraps its implementation and exposes a bit
more pieces of information that the interface used to discard, namely:
- where the object is stored (loose? cached? packed?)
- if packed, where in which packfile?
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
* In the earlier round, this used u.pack.delta to record the length of
the delta chain, but the caller is not necessarily interested in the
length of the delta chain per-se, but may only want to know if it is a
delta against another object or is stored as a deflated data. Calling
packed_object_info_detail() involves walking the reverse index chain to
compute the store size of the object and is unnecessarily expensive.
We could resurrect the code if a new caller wants to know, but I doubt
it.
The return codes of git_config_set() and friends are magic numbers right
in the source. #define them in cache.h where the functions are declared,
and use the constants in the source.
Also, mention the resulting exit codes of "git config" in its man page
(and complete the list).
Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead return an integer that can be given to typename() if
the caller wants a string, just like everybody else does.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
They might care because they want to do a half-duplex close.
With pipes, that means simply closing the output descriptor;
with a socket, you must actually call shutdown.
Instead of exposing the magic no_fork child_process struct,
let's encapsulate the test in a function.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* jc/convert:
convert: make it harder to screw up adding a conversion attribute
convert: make it safer to add conversion attributes
convert: give saner names to crlf/eol variables, types and functions
convert: rename the "eol" global variable to "core_eol"
* jc/bigfile:
Bigfile: teach "git add" to send a large file straight to a pack
index_fd(): split into two helper functions
index_fd(): turn write_object and format_check arguments into one flag
* jc/replacing:
read_sha1_file(): allow selective bypassing of replacement mechanism
inline lookup_replace_object() calls
read_sha1_file(): get rid of read_sha1_file_repl() madness
t6050: make sure we test not just commit replacement
Declare lookup_replace_object() in cache.h, not in commit.h
The way "object replacement" mechanism was tucked to the read_sha1_file()
interface was suboptimal in a couple of ways:
- Callers that want it to die with useful diagnosis upon seeing a corrupt
object does not have a way to say that they do not want any object
replacement.
- Callers who do not want it to die but want to handle the errors
themselves are told to arrange to call read_object(), but the function
does not use the replacement mechanism, and also it is a file scope
static function that not many callers can call to begin with.
This adds a read_sha1_file_extended() that takes a set of flags; the
callers of read_sha1_file() passes a flag READ_SHA1_FILE_REPLACE to ask
for object replacement mechanism to kick in.
Later, we could add another flag bit to tell the function to return an
error instead of dying and then remove the misguided "call read_object()
yourself".
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a repository without object replacement, lookup_replace_object() should
be a no-op. Check the flag "read_replace_refs" on the side of the caller,
and bypess a function call when we know we are not dealing with replacement.
Also, even when we are set up to replace objects, if we do not find any
replacement defined, flip that flag off to avoid function call overhead
for all the later object accesses.
As this change the semantics of the flag from "do we need read the
replacement definition?" to "do we need to check with the lookup table?"
the flag needs to be renamed later to something saner, e.g. "use_replace",
when the codebase is calmer, but not now.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Most callers want to silently get a replacement object, and they do not
care what the real name of the replacement object is. Worse yet, no sane
interface to return the underlying object without replacement is provided.
Remove the function and make only the few callers that want the name of
the replacement object find it themselves.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The declaration is misplaced as the replace API is supposed to affect
not just commits, but all types of objects.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git cmd :/no-such-string-ever-existed" runs an extra round of get_sha1()
since 009fee4 (Detailed diagnosis when parsing an object name fails.,
2009-12-07). Once without error diagnosis to see there is no commit with
such a string in the log message (hence "it cannot be a ref"), and after
seeing that :/no-such-string-ever-existed is not a filename (hence "it
cannot be a path, either"), another time to give "better diagnosis".
The thing is, the second time it runs, we already know that traversing the
history all the way down to the root will _not_ find any matching commit.
Rename misguided "gently" parameter, which is turned off _only_ when the
"detailed diagnosis" codepath knows that it cannot be a ref and making the
call only for the caller to die with a message. Flip its meaning (and
adjust the callers) and call it "only_to_die", which is not a great name,
but it describes far more clearly what the codepaths that switches their
behaviour based on this variable do.
On my box, the command spends ~1.8 seconds without the patch to make the
report; with the patch it spends ~1.12 seconds.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Yes, it is clear that "eol" wants to mean some sort of end-of-line thing,
but as the name of a global variable, it is way too short to describe what
kind of end-of-line thing it wants to represent. Besides, there are many
codepaths that want to use their own local "char *eol" variable to point
at the end of the current line they are processing.
This global variable holds what we read from core.eol configuration
variable. Name it as such.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "format_check" parameter tucked after the existing parameters is too
ugly an afterthought to live in any reasonable API.
Combine it with the other boolean parameter "write_object" into a single
"flags" parameter.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* nd/struct-pathspec:
pathspec: rename per-item field has_wildcard to use_wildcard
Improve tree_entry_interesting() handling code
Convert read_tree{,_recursive} to support struct pathspec
Reimplement read_tree_recursive() using tree_entry_interesting()
The pack-objects command should take notice of the object file and
refrain from attempting to delta large ones, to be consistent with
the fast-import command.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As the point of the last change is to allow use of strings as
literals no matter what characters are in them, "has_wildcard"
does not match what we use this field for anymore.
It is used to decide if the wildcard matching should be used, so
rename it to match the usage better.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The git-new-workdir script in contrib/ makes a new work tree by sharing
many subdirectories of the .git directory with the original repository.
When rerere.enabled is set in the original repository, but the user has
not encountered any conflicts yet, the original repository may not yet
have .git/rr-cache directory.
When rerere wants to run in a new work tree created from such a young
original repository, it fails to mkdir(2) .git/rr-cache that is a symlink
to a yet-to-be-created directory.
There are three possible approaches to this:
- A naive solution is not to create a symlink in the git-new-workdir
script to a directory the original does not have (yet). This is not a
solution, as we tend to lazily create subdirectories of .git/, and
having rerere.enabled configuration set is a strong indication that the
user _wants_ to have this lazy creation to happen;
- We could always create .git/rr-cache upon repository creation. This is
tempting but will not help people with existing repositories.
- Detect this case by seeing that mkdir(2) failed with EEXIST, checking
that the path is a symlink, and try running mkdir(2) on the link
target.
This patch solves the issue by doing the third one.
Strictly speaking, this is incomplete. It does not attempt to handle
relative symbolic link that points into the original repository, but this
is good enough to help people who use contrib/workdir/git-new-workdir
script.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* jn/test-sanitize-git-env:
tests: scrub environment of GIT_* variables
config: drop support for GIT_CONFIG_NOGLOBAL
gitattributes: drop support for GIT_ATTR_NOGLOBAL
tests: suppress system gitattributes
tests: stop worrying about obsolete environment variables
When we had to refresh the index internally before running diff or status,
we opportunistically updated the $GIT_INDEX_FILE so that later invocation
of git can use the lstat(2) we already did in this invocation.
Make them share a helper function to do so.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* sp/maint-fd-limit:
sha1_file.c: Don't retain open fds on small packs
mingw: add minimum getrlimit() compatibility stub
Limit file descriptors used by packs
* ab/i18n-basic:
i18n: "make distclean" should clean up after "make pot"
i18n: Makefile: "pot" target to extract messages marked for translation
i18n: add stub Q_() wrapper for ngettext
i18n: do not poison translations unless GIT_GETTEXT_POISON envvar is set
i18n: add GETTEXT_POISON to simulate unfriendly translator
i18n: add no-op _() and N_() wrappers
commit, status: use status_printf{,_ln,_more} helpers
commit: refer to commit template as s->fp
wt-status: add helpers for printing wt-status lines
Conflicts:
builtin/commit.c
* jk/trace-sifter:
trace: give repo_setup trace its own key
add packet tracing debug code
trace: add trace_strbuf
trace: factor out "do we want to trace" logic
trace: refactor to support multiple env variables
trace: add trace_vprintf
--separate-git-dir tells git to create git dir at the specified
location, instead of where it is supposed to be. A .git file that
points to that location will be put in place so that it appears normal
to repo discovery process.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Rename the make_*_path functions so it's clearer what they do, in
particlar make clear what the differnce between make_absolute_path and
make_nonrelative_path is by renaming them real_path and absolute_path
respectively. make_relative_path has an understandable name and is
renamed to relative_path to maintain the name convention.
The function calls have been replaced 1-to-1 in their usage.
Signed-off-by: Carlos Martín Nieto <cmn@elego.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>