mirrors/git

mirror of https://github.com/git/git.git synced 2024-11-18 15:04:49 +01:00

Author	SHA1	Message	Date
Junio C Hamano	bec2a69fe4	Revert "Revert "diff-delta: produce optimal pack data""	2006-02-27 21:37:56 -08:00
Junio C Hamano	eae3fe5e50	Revert "diff-delta: produce optimal pack data" This reverts `6b7d25d97b` commit. It turns out that the new algorithm has a really bad corner case that literally spends minutes on inputs that take less than a quarter of a second to delta with the old algorithm. The resulting delta is 50% smaller, which is admirable, but the performance degradation is simply unacceptable for unconditional use. Some example cases are these blobs in the Linux 2.6 repository: 4917ec509720a42846d513addc11cbd25e0e3c4f 9af06ba723df75fed49f7ccae5b6c9c34bc5115f dfc9cd58dc065d17030d875d3fea6e7862ede143 Signed-off-by: Junio C Hamano <junkio@cox.net>	2006-02-24 01:29:00 -08:00
Nicolas Pitre	6b7d25d97b	diff-delta: produce optimal pack data Indexing based on adler32 has a match precision based on the block size (currently 16). Lowering the block size would produce smaller deltas, but the indexing memory and computing cost increase significantly. For an optimal delta result the indexing block size should be 3 with an increment of 1 (instead of 16 and 16). With such low params the adler32 becomes a clear overhead, increasing the time for git-repack by a factor of 3. And with such small blocks the adler32 is not very useful, as all of the block's bits can be used directly. This patch replaces the adler32 with an open coded index value based on 3 characters directly. This gives sufficient bits for hashing and allows for optimal delta with reasonable CPU cycles. The resulting packs are 6% smaller on average. The increase in CPU time is about 25%. But this cost is now hidden by the delta reuse patch while the saving on data transfers is always there. Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <junkio@cox.net>	2006-02-22 00:36:09 -08:00
Nicolas Pitre	8e1454b5ad	diff-delta: big code simplification This is much smaller and hopefully clearer code now. Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <junkio@cox.net>	2006-02-22 00:36:09 -08:00
Nicolas Pitre	fe474b588b	diff-delta: fold two special tests into one plus cleanups Testing for realloc and size limit can be done with only one test per loop. Make it so and fix a theoretical off-by-one comparison error in the process. The output buffer memory allocation is also bounded by max_size when specified. Finally make some variables unsigned to allow the handling of files up to 4GB in size instead of 2GB. Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <junkio@cox.net>	2006-02-22 00:36:09 -08:00
Peter Eriksen	04fe2a1706	Use adler32() from zlib instead of defining our own. Since we already depend on zlib, we don't need to define our own adler32(). Spotted by oprofile. Signed-off-by: Peter Eriksen <s022018@student.dtu.dk> Signed-off-by: Junio C Hamano <junkio@cox.net>	2006-02-05 13:45:01 -08:00
Nicolas Pitre	e5e3a9d8f9	small cleanup for diff-delta.c This patch removes unused remnants of the original xdiff source. No functional change. Possible tiny speed improvement. Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <junkio@cox.net>	2005-12-15 16:19:11 -08:00
Junio C Hamano	c7a45bd20e	Revert "diff-delta.c: allow delta with empty blob." This reverts `962537a3eb` commit to play safe.	2005-12-12 16:42:38 -08:00
Junio C Hamano	962537a3eb	diff-delta.c: allow delta with empty blob. Delta computation with an empty blob used to punt and return NULL. This commit allows creation with an empty blob; all combinations of empty->empty, empty->something, and something->empty are allowed. Signed-off-by: Junio C Hamano <junkio@cox.net>	2005-12-12 12:57:25 -08:00
Nicolas Pitre	dcde55bc58	[PATCH] assorted delta code cleanup This is a wrap-up patch including all the cleanups I've done to the delta code and its usage. The most important change is the factorization of the delta header handling code. Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-29 09:11:38 -07:00
Nicolas Pitre	69a2d426f0	[PATCH] denser delta header encoding Since the delta data format is not tied to any actual git object anymore, now is the time to add a small improvement to the delta data header, as has been done for the packed object header. This patch reduces the delta header by about 2 bytes and makes for simpler code. Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-28 21:38:47 -07:00
Linus Torvalds	75c42d8cc3	Add a "max_size" parameter to diff_delta() Anything that generates a delta to see if two objects are close usually isn't interested in the delta if it ends up being bigger than some specified size, and this allows us to stop delta generation early when that happens.	2005-06-25 19:30:20 -07:00
Nicolas Pitre	a310d43494	[PATCH] Deltification library work by Nicolas Pitre. This patch adds the basic library functions to create and replay delta information. Also included is a test-delta utility to validate the code. diff-delta was based on LibXDiff written by Davide Libenzi Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Davide Libenzi <davidel@xmailserver.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-05-19 08:56:22 -07:00
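
The indexing idea described in commit 6b7d25d97b (a block size of 3 with an increment of 1, using the block's bits directly instead of adler32) can be illustrated with a minimal sketch. This is not the code from that commit: the names `hash3`, `index3_build`, `index3_find`, `index3_free`, the `HASH_BITS` value, and the chained-table layout are all assumptions chosen for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define HASH_BITS 12                 /* hypothetical table size: 4096 buckets */
#define HASH_SIZE (1u << HASH_BITS)

/* Pack three consecutive bytes into a 24-bit value and fold it to HASH_BITS. */
static unsigned int hash3(const unsigned char *p)
{
    uint32_t v = ((uint32_t)p[0] << 16) | ((uint32_t)p[1] << 8) | p[2];
    return (v ^ (v >> HASH_BITS)) & (HASH_SIZE - 1);
}

struct index3 {
    const unsigned char *buf;  /* source buffer (not owned) */
    size_t len;
    long head[HASH_SIZE];      /* lowest offset in each bucket, or -1 */
    long *prev;                /* next (higher) offset in the same bucket, or -1 */
};

/* Index every 3-byte window of buf, i.e. block size 3 with an increment of 1. */
static struct index3 *index3_build(const unsigned char *buf, size_t len)
{
    struct index3 *ix = malloc(sizeof(*ix));
    if (!ix)
        return NULL;
    ix->buf = buf;
    ix->len = len;
    ix->prev = len ? malloc(len * sizeof(*ix->prev)) : NULL;
    if (len && !ix->prev) {
        free(ix);
        return NULL;
    }
    for (size_t i = 0; i < HASH_SIZE; i++)
        ix->head[i] = -1;
    if (len >= 3) {
        /* Walk backwards so each chain lists the lowest offsets first. */
        for (size_t i = len - 2; i-- > 0; ) {
            unsigned int h = hash3(buf + i);
            ix->prev[i] = ix->head[h];
            ix->head[h] = (long)i;
        }
    }
    return ix;
}

/* Return the lowest source offset whose 3 bytes equal p[0..2], or -1. */
static long index3_find(const struct index3 *ix, const unsigned char *p)
{
    for (long i = ix->head[hash3(p)]; i >= 0; i = ix->prev[i])
        if (!memcmp(ix->buf + i, p, 3))
            return i;
    return -1;
}

static void index3_free(struct index3 *ix)
{
    if (ix) {
        free(ix->prev);
        free(ix);
    }
}
```

A delta generator would presumably call something like `index3_find` at each target position and then extend the match byte by byte. Because every source position is indexed, long runs of repeated bytes can produce very long chains, which may be the kind of corner case that the revert in eae3fe5e50 describes.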