mirror of
https://github.com/git/git.git
synced 2024-11-01 06:47:52 +01:00
c6458e60ed
Once we know the number of objects in the input pack, we allocate an array of nr_objects of struct delta_entry. On x86-64, this struct is 32 bytes long. The union delta_base, which is part of struct delta_entry, provides enough space to store either ofs-delta (8 bytes) or ref-delta (20 bytes). Because ofs-delta encoding is more efficient space-wise and more performant at runtime than ref-delta encoding, Git packers try to use ofs-delta whenever possible, and it is expected that objects encoded as ref-delta are minority. In the best clone case where no ref-delta object is present, we waste (20-8) * nr_objects bytes because of this union. That's about 38MB out of 100MB for deltas[] with 3.4M objects, or 38%. deltas[] would be around 62MB without the waste. This patch attempts to eliminate that. deltas[] array is split into two: one for ofs-delta and one for ref-delta. Many functions are also duplicated because of this split. With this patch, ofs_deltas[] array takes 51MB. ref_deltas[] should remain unallocated in clone case (0 bytes). This array grows as we see ref-delta. We save about half in this case, or 25% of total bookkeeping. The saving is more than the calculation above because some padding in the old delta_entry struct is removed. ofs_delta_entry is 16 bytes, including the 4 bytes padding. That's 13MB for padding, but packing the struct could break platforms that do not support unaligned access. If someone on 32-bit is really low on memory and only deals with packs smaller than 2G, using 32-bit off_t would eliminate the padding and save 27MB on top. A note about ofs_deltas allocation. We could use ref_deltas memory allocation strategy for ofs_deltas. But that probably just adds more overhead on top. ofs-deltas are generally the majority (1/2 to 2/3) in any pack. Incremental realloc may lead to too many memcpy. And if we preallocate, say 1/2 or 2/3 of nr_objects initially, the growth rate of ALLOC_GROW() could make this array larger than nr_objects, wasting more memory. Brought-up-by: Matthew Sporleder <msporleder@gmail.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> |
||
---|---|---|
.. | ||
add.c | ||
annotate.c | ||
apply.c | ||
archive.c | ||
bisect--helper.c | ||
blame.c | ||
branch.c | ||
bundle.c | ||
cat-file.c | ||
check-attr.c | ||
check-ignore.c | ||
check-mailmap.c | ||
check-ref-format.c | ||
checkout-index.c | ||
checkout.c | ||
clean.c | ||
clone.c | ||
column.c | ||
commit-tree.c | ||
commit.c | ||
config.c | ||
count-objects.c | ||
credential.c | ||
describe.c | ||
diff-files.c | ||
diff-index.c | ||
diff-tree.c | ||
diff.c | ||
fast-export.c | ||
fetch-pack.c | ||
fetch.c | ||
fmt-merge-msg.c | ||
for-each-ref.c | ||
fsck.c | ||
gc.c | ||
get-tar-commit-id.c | ||
grep.c | ||
hash-object.c | ||
help.c | ||
index-pack.c | ||
init-db.c | ||
interpret-trailers.c | ||
log.c | ||
ls-files.c | ||
ls-remote.c | ||
ls-tree.c | ||
mailinfo.c | ||
mailsplit.c | ||
merge-base.c | ||
merge-file.c | ||
merge-index.c | ||
merge-ours.c | ||
merge-recursive.c | ||
merge-tree.c | ||
merge.c | ||
mktag.c | ||
mktree.c | ||
mv.c | ||
name-rev.c | ||
notes.c | ||
pack-objects.c | ||
pack-redundant.c | ||
pack-refs.c | ||
patch-id.c | ||
prune-packed.c | ||
prune.c | ||
push.c | ||
read-tree.c | ||
receive-pack.c | ||
reflog.c | ||
remote-ext.c | ||
remote-fd.c | ||
remote.c | ||
repack.c | ||
replace.c | ||
rerere.c | ||
reset.c | ||
rev-list.c | ||
rev-parse.c | ||
revert.c | ||
rm.c | ||
send-pack.c | ||
shortlog.c | ||
show-branch.c | ||
show-ref.c | ||
stripspace.c | ||
symbolic-ref.c | ||
tag.c | ||
unpack-file.c | ||
unpack-objects.c | ||
update-index.c | ||
update-ref.c | ||
update-server-info.c | ||
upload-archive.c | ||
var.c | ||
verify-commit.c | ||
verify-pack.c | ||
verify-tag.c | ||
write-tree.c |