mirrors/git

mirror of https://github.com/git/git.git synced 2024-10-28 04:49:43 +01:00

445 lines

13 KiB

C

hash.h: move SHA-1 implementation selection into a header file Many developers use functionality in their editors that allows for quick syntax checks, including warning about questionable constructs. This functionality allows rapid development with fewer errors. However, such functionality generally does not allow the specification of project-specific defines or command-line options. Since the SHA1_HEADER include is not defined in such a case, developers see spurious errors when using these tools. Furthermore, there are known implementations of "cc" whose '#include' is unhappy with this construct. Instead of using SHA1_HEADER, create a hash.h header and use #if and #elif to select the desired header. Have the Makefile pass an appropriate option to help the header select the right implementation to use. [jc: make BLK_SHA1 the fallback default as discussed on list, e.g. <20170314201424.vccij5z2ortq4a4o@sigill.intra.peff.net>; also remove SHA1_HEADER and SHA1_HEADER_SQ that are no longer used]. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-03-11 23:28:18 +01:00			`#ifndef HASH_H`
			`#define HASH_H`

hash-ll: merge with "hash.h" The "hash-ll.h" header was introduced via d1cbe1e6d8 (hash-ll.h: split out of hash.h to remove dependency on repository.h, 2023-04-22) to make explicit the split between hash-related functions that rely on the global `the_repository`, and those that don't. This split is no longer necessary now that we have removed the reliance on `the_repository`. Merge "hash-ll.h" back into "hash.h". This causes some code units to not include "repository.h" anymore, which requires us to add some forward declarations. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2024-06-14 08:50:32 +02:00			`#if defined(SHA1_APPLE)`
			`#include <CommonCrypto/CommonDigest.h>`
			`#elif defined(SHA1_OPENSSL)`
			`# include <openssl/sha.h>`
			`# if defined(OPENSSL_API_LEVEL) && OPENSSL_API_LEVEL >= 3`
			`# define SHA1_NEEDS_CLONE_HELPER`
			`# include "sha1/openssl.h"`
			`# endif`
			`#elif defined(SHA1_DC)`
			`#include "sha1dc_git.h"`
			`#else /* SHA1_BLK */`
			`#include "block-sha1/sha1.h"`
			`#endif`

Makefile: allow specifying a SHA-1 for non-cryptographic uses Introduce _UNSAFE variants of the OPENSSL_SHA1, BLK_SHA1, and APPLE_COMMON_CRYPTO_SHA1 compile-time knobs which indicate which SHA-1 implementation is to be used for non-cryptographic uses. There are a couple of small implementation notes worth mentioning: - There is no way to select the collision detecting SHA-1 as the "fast" fallback, since the fast fallback is only for non-cryptographic uses, and is meant to be faster than our collision-detecting implementation. - There are no similar knobs for SHA-256, since no collision attacks are presently known and thus no collision-detecting implementations actually exist. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2024-09-26 17:22:50 +02:00			`#if defined(SHA1_APPLE_UNSAFE)`
			`# include <CommonCrypto/CommonDigest.h>`
			`# define platform_SHA_CTX_unsafe CC_SHA1_CTX`
			`# define platform_SHA1_Init_unsafe CC_SHA1_Init`
			`# define platform_SHA1_Update_unsafe CC_SHA1_Update`
			`# define platform_SHA1_Final_unsafe CC_SHA1_Final`
			`#elif defined(SHA1_OPENSSL_UNSAFE)`
			`# include <openssl/sha.h>`
			`# if defined(OPENSSL_API_LEVEL) && OPENSSL_API_LEVEL >= 3`
			`# define SHA1_NEEDS_CLONE_HELPER_UNSAFE`
			`# include "sha1/openssl.h"`
			`# define platform_SHA_CTX_unsafe openssl_SHA1_CTX`
			`# define platform_SHA1_Init_unsafe openssl_SHA1_Init`
			`# define platform_SHA1_Clone_unsafe openssl_SHA1_Clone`
			`# define platform_SHA1_Update_unsafe openssl_SHA1_Update`
			`# define platform_SHA1_Final_unsafe openssl_SHA1_Final`
			`# else`
			`# define platform_SHA_CTX_unsafe SHA_CTX`
			`# define platform_SHA1_Init_unsafe SHA1_Init`
			`# define platform_SHA1_Update_unsafe SHA1_Update`
			`# define platform_SHA1_Final_unsafe SHA1_Final`
			`# endif`
			`#elif defined(SHA1_BLK_UNSAFE)`
			`# include "block-sha1/sha1.h"`
			`# define platform_SHA_CTX_unsafe blk_SHA_CTX`
			`# define platform_SHA1_Init_unsafe blk_SHA1_Init`
			`# define platform_SHA1_Update_unsafe blk_SHA1_Update`
			`# define platform_SHA1_Final_unsafe blk_SHA1_Final`
			`#endif`

			`#if defined(SHA256_NETTLE)`
			`#include "sha256/nettle.h"`
			`#elif defined(SHA256_GCRYPT)`
			`#define SHA256_NEEDS_CLONE_HELPER`
			`#include "sha256/gcrypt.h"`
			`#elif defined(SHA256_OPENSSL)`
			`# include <openssl/sha.h>`
			`# if defined(OPENSSL_API_LEVEL) && OPENSSL_API_LEVEL >= 3`
			`# define SHA256_NEEDS_CLONE_HELPER`
			`# include "sha256/openssl.h"`
			`# endif`
			`#else`
			`#include "sha256/block/sha256.h"`
			`#endif`

			`#ifndef platform_SHA_CTX`
			`/*`
			`* platform's underlying implementation of SHA-1; could be OpenSSL,`
			`* blk_SHA, Apple CommonCrypto, etc... Note that the relevant`
			`* SHA-1 header may have already defined platform_SHA_CTX for our`
			`* own implementations like block-sha1, so we list`
			`* the default for OpenSSL compatible SHA-1 implementations here.`
			`*/`
			`#define platform_SHA_CTX SHA_CTX`
			`#define platform_SHA1_Init SHA1_Init`
			`#define platform_SHA1_Update SHA1_Update`
			`#define platform_SHA1_Final SHA1_Final`
			`#endif`

hash.h: scaffolding for _unsafe hashing variants Git's default SHA-1 implementation is collision-detecting, which hardens us against known SHA-1 attacks against Git objects. This makes Git object writes safer at the expense of some speed when hashing through the collision-detecting implementation, which is slower than non-collision detecting alternatives. Prepare for loading a separate "unsafe" SHA-1 implementation that can be used for non-cryptographic purposes, like computing the checksum of files that use the hashwrite() API. This commit does not actually introduce any new compile-time knobs to control which implementation is used as the unsafe SHA-1 variant, but does add scaffolding so that the "git_hash_algo" structure has five new function pointers which are "unsafe" variants of the five existing hashing-related function pointers: - git_hash_init_fn unsafe_init_fn - git_hash_clone_fn unsafe_clone_fn - git_hash_update_fn unsafe_update_fn - git_hash_final_fn unsafe_final_fn - git_hash_final_oid_fn unsafe_final_oid_fn The following commit will introduce compile-time knobs to specify which SHA-1 implementation is used for non-cryptographic uses. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2024-09-26 17:22:47 +02:00			`#ifndef platform_SHA_CTX_unsafe`
			`# define platform_SHA_CTX_unsafe platform_SHA_CTX`
			`# define platform_SHA1_Init_unsafe platform_SHA1_Init`
			`# define platform_SHA1_Update_unsafe platform_SHA1_Update`
			`# define platform_SHA1_Final_unsafe platform_SHA1_Final`
			`# ifdef platform_SHA1_Clone`
			`# define platform_SHA1_Clone_unsafe platform_SHA1_Clone`
			`# endif`
hash.h: set NEEDS_CLONE_HELPER_UNSAFE in fallback mode

Commit 253ed9ecff (hash.h: scaffolding for _unsafe hashing variants, 2024-09-26) introduced the concept of having two hash algorithms: a safe and an unsafe one. When the Makefile knobs do not explicitly request an unsafe one, we fall back to using the safe algorithm.

However, the fallback to do so forgot one case: we should inherit the NEEDS_CLONE_HELPER flag from the safe variant. Failing to do so means that we'll end up defining two clone functions (the algorithm specific one, and the generic one that just calls memcpy). You'll see an error like this:

    $ make OPENSSL_SHA1=1
    [...]
    sha1/openssl.h:46:29: error: redefinition of ‘openssl_SHA1_Clone’
       46 | #define platform_SHA1_Clone openssl_SHA1_Clone
          |                             ^~~~~~~~~~~~~~~~~~
    hash.h:83:40: note: in expansion of macro ‘platform_SHA1_Clone’
       83 | # define platform_SHA1_Clone_unsafe platform_SHA1_Clone
          |                                     ^~~~~~~~~~~~~~~~~~~
    hash.h:101:33: note: in expansion of macro ‘platform_SHA1_Clone_unsafe’
      101 | # define git_SHA1_Clone_unsafe platform_SHA1_Clone_unsafe
          |                                ^~~~~~~~~~~~~~~~~~~~~~~~~~
    hash.h:133:20: note: in expansion of macro ‘git_SHA1_Clone_unsafe’
      133 | static inline void git_SHA1_Clone_unsafe(git_SHA_CTX_unsafe *dst,
          |                    ^~~~~~~~~~~~~~~~~~~~~
    sha1/openssl.h:37:20: note: previous definition of ‘openssl_SHA1_Clone’ with type ‘void(struct openssl_SHA1_CTX *, const struct openssl_SHA1_CTX *)’
       37 | static inline void openssl_SHA1_Clone(struct openssl_SHA1_CTX *dst,
          |                    ^~~~~~~~~~~~~~~~~~

This only matters when compiling with openssl as the "safe" variant, since it's the only algorithm that requires a clone helper (and even then, only if you are using openssl 3.0+). And you should never do that, because it's not safe. But still, the invocation above used to work and should continue to do so until we decide to require a collision-detecting variant for the safe algorithm entirely.

Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2024-10-03 01:26:18 +02:00			`# ifdef SHA1_NEEDS_CLONE_HELPER`
			`# define SHA1_NEEDS_CLONE_HELPER_UNSAFE`
			`# endif`
			`#endif`
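The fallback above can be tried in isolation. The following is a minimal sketch under stated assumptions: `demo_init` and the `platform_demo_*` / `git_demo_*` macro names are hypothetical stand-ins, not identifiers from Git. It only illustrates that, when no `_unsafe` macro was defined by an earlier selection block, the `_unsafe` name ends up resolving to the same function as the safe one.

```c
#include <assert.h>

/* Hypothetical stand-in for a real SHA-1 init function. */
static int demo_init(void)
{
	return 1;
}

/* The "safe" implementation is always selected. */
#define platform_demo_Init demo_init

/* No unsafe variant was requested, so inherit the safe one. */
#ifndef platform_demo_Init_unsafe
# define platform_demo_Init_unsafe platform_demo_Init
#endif

/* Public names, mirroring the git_* aliasing layer. */
#define git_demo_Init platform_demo_Init
#define git_demo_Init_unsafe platform_demo_Init_unsafe

/* Returns 1 when both public names resolve to the same function. */
static int demo_unsafe_is_fallback(void)
{
	assert(git_demo_Init() == 1);
	return git_demo_Init_unsafe == git_demo_Init;
}
```

Defining `platform_demo_Init_unsafe` before the `#ifndef` would skip the fallback, which is how an explicitly selected unsafe implementation would take precedence in this sketch.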

			`#define git_SHA_CTX platform_SHA_CTX`
			`#define git_SHA1_Init platform_SHA1_Init`
			`#define git_SHA1_Update platform_SHA1_Update`
			`#define git_SHA1_Final platform_SHA1_Final`

			`#define git_SHA_CTX_unsafe platform_SHA_CTX_unsafe`
			`#define git_SHA1_Init_unsafe platform_SHA1_Init_unsafe`
			`#define git_SHA1_Update_unsafe platform_SHA1_Update_unsafe`
			`#define git_SHA1_Final_unsafe platform_SHA1_Final_unsafe`

			`#ifdef platform_SHA1_Clone`
			`#define git_SHA1_Clone platform_SHA1_Clone`
			`#endif`
			`#ifdef platform_SHA1_Clone_unsafe`
			`# define git_SHA1_Clone_unsafe platform_SHA1_Clone_unsafe`
			`#endif`
			`#ifndef platform_SHA256_CTX`
			`#define platform_SHA256_CTX SHA256_CTX`
			`#define platform_SHA256_Init SHA256_Init`
			`#define platform_SHA256_Update SHA256_Update`
			`#define platform_SHA256_Final SHA256_Final`
			`#endif`

			`#define git_SHA256_CTX platform_SHA256_CTX`
			`#define git_SHA256_Init platform_SHA256_Init`
			`#define git_SHA256_Update platform_SHA256_Update`
			`#define git_SHA256_Final platform_SHA256_Final`

			`#ifdef platform_SHA256_Clone`
			`#define git_SHA256_Clone platform_SHA256_Clone`
			`#endif`

			`#ifdef SHA1_MAX_BLOCK_SIZE`
			`#include "compat/sha1-chunked.h"`
			`#undef git_SHA1_Update`
			`#define git_SHA1_Update git_SHA1_Update_Chunked`
			`#endif`
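When `SHA1_MAX_BLOCK_SIZE` is defined, `git_SHA1_Update` is redirected to a chunked wrapper. The sketch below shows the general idea of such a wrapper, feeding the underlying update function pieces no larger than a fixed limit. It is an illustration only: `toy_ctx`, `toy_update`, `toy_update_chunked`, and `TOY_MAX_BLOCK_SIZE` are hypothetical names, and the actual code in `compat/sha1-chunked.h` may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical limit, kept tiny so the chunking is easy to observe. */
#define TOY_MAX_BLOCK_SIZE 4

struct toy_ctx {
	unsigned long sum; /* toy "digest": sum of all input bytes */
	size_t calls;      /* number of underlying update calls */
	size_t largest;    /* largest chunk handed to toy_update() */
};

/* Underlying update that cannot accept more than the limit at once. */
static void toy_update(struct toy_ctx *c, const void *in, size_t len)
{
	const unsigned char *p = in;
	size_t i;

	assert(len <= TOY_MAX_BLOCK_SIZE);
	for (i = 0; i < len; i++)
		c->sum += p[i];
	c->calls++;
	if (len > c->largest)
		c->largest = len;
}

/* Wrapper: split the input into chunks of at most the limit. */
static void toy_update_chunked(struct toy_ctx *c, const void *in, size_t len)
{
	const unsigned char *p = in;

	while (len) {
		size_t n = len < TOY_MAX_BLOCK_SIZE ? len : TOY_MAX_BLOCK_SIZE;
		toy_update(c, p, n);
		p += n;
		len -= n;
	}
}
```

Because the hash state carries over between calls, the result is the same as a single large update; only the size of each individual call changes.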

			`#ifndef SHA1_NEEDS_CLONE_HELPER`
			`static inline void git_SHA1_Clone(git_SHA_CTX *dst, const git_SHA_CTX *src)`
			`{`
			`memcpy(dst, src, sizeof(*dst));`
			`}`
			`#endif`
			`#ifndef SHA1_NEEDS_CLONE_HELPER_UNSAFE`
			`static inline void git_SHA1_Clone_unsafe(git_SHA_CTX_unsafe *dst,`
			`const git_SHA_CTX_unsafe *src)`
			`{`
			`memcpy(dst, src, sizeof(*dst));`
			`}`
			`#endif`
			`#ifndef SHA256_NEEDS_CLONE_HELPER`
			`static inline void git_SHA256_Clone(git_SHA256_CTX *dst, const git_SHA256_CTX *src)`
			`{`
			`memcpy(dst, src, sizeof(*dst));`
			`}`
			`#endif`
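The generic clone helpers above copy the context with `memcpy`, which works when the context is plain data. A minimal sketch of why cloning is useful follows, using a hypothetical FNV-1a style toy context (`fnv_ctx` and the `fnv_*` functions are assumptions for illustration, not Git code): a shared prefix is hashed once, then the state is cloned so several different suffixes can be finished independently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical toy context: a running 32-bit FNV-1a state, not a real SHA. */
struct fnv_ctx {
	uint32_t state;
};

static void fnv_init(struct fnv_ctx *c)
{
	c->state = 2166136261u; /* FNV-1a 32-bit offset basis */
}

static void fnv_update(struct fnv_ctx *c, const void *in, size_t len)
{
	const unsigned char *p = in;
	size_t i;

	for (i = 0; i < len; i++) {
		c->state ^= p[i];
		c->state *= 16777619u; /* FNV 32-bit prime */
	}
}

static uint32_t fnv_final(const struct fnv_ctx *c)
{
	return c->state;
}

/* Same shape as the generic helper: valid because fnv_ctx is plain data. */
static void fnv_clone(struct fnv_ctx *dst, const struct fnv_ctx *src)
{
	assert(dst && src);
	memcpy(dst, src, sizeof(*dst));
}

/* Finish "prefix + suffix" on a copy, leaving the prefix state untouched. */
static uint32_t fnv_hash_with_suffix(const struct fnv_ctx *prefix,
				     const char *suffix)
{
	struct fnv_ctx c;

	fnv_clone(&c, prefix);
	fnv_update(&c, suffix, strlen(suffix));
	return fnv_final(&c);
}

/* Hash a whole string from scratch, for comparison. */
static uint32_t fnv_hash_str(const char *s)
{
	struct fnv_ctx c;

	fnv_init(&c);
	fnv_update(&c, s, strlen(s));
	return fnv_final(&c);
}
```

A byte-wise copy would presumably not be enough for a context that owns external resources, which is consistent with some backends defining a `*_NEEDS_CLONE_HELPER` macro and supplying their own clone function.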

			`/*`
			`* Note that these constants are suitable for indexing the hash_algos array and`
			`* comparing against each other, but are otherwise arbitrary, so they should not`
			`* be exposed to the user or serialized to disk. To know whether a`
			`* git_hash_algo struct points to some usable hash function, test the format_id`
			`* field for being non-zero. Use the name field for user-visible situations and`
			`* the format_id field for fixed-length fields on disk.`
			`*/`
			`/* An unknown hash function. */`
			`#define GIT_HASH_UNKNOWN 0`
			`/* SHA-1 */`
			`#define GIT_HASH_SHA1 1`
			`/* SHA-256 */`
			`#define GIT_HASH_SHA256 2`
			`/* Number of algorithms supported (including unknown). */`
			`#define GIT_HASH_NALGOS (GIT_HASH_SHA256 + 1)`

			`/* "sha1", big-endian */`
			`#define GIT_SHA1_FORMAT_ID 0x73686131`

			`/* The length in bytes and in hex digits of an object name (SHA-1 value). */`
			`#define GIT_SHA1_RAWSZ 20`
			`#define GIT_SHA1_HEXSZ (2 * GIT_SHA1_RAWSZ)`
			`/* The block size of SHA-1. */`
			`#define GIT_SHA1_BLKSZ 64`

			`/* "s256", big-endian */`
			`#define GIT_SHA256_FORMAT_ID 0x73323536`

			`/* The length in bytes and in hex digits of an object name (SHA-256 value). */`
			`#define GIT_SHA256_RAWSZ 32`
			`#define GIT_SHA256_HEXSZ (2 * GIT_SHA256_RAWSZ)`
			`/* The block size of SHA-256. */`
			`#define GIT_SHA256_BLKSZ 64`

			`/* The length in bytes and in hex digits of the largest possible hash value. */`
			`#define GIT_MAX_RAWSZ GIT_SHA256_RAWSZ`
			`#define GIT_MAX_HEXSZ GIT_SHA256_HEXSZ`
			`/* The largest possible block size for any supported hash. */`
			`#define GIT_MAX_BLKSZ GIT_SHA256_BLKSZ`
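The format IDs above are four ASCII characters packed big-endian into a 32-bit value, as the comments `"sha1"` and `"s256"` indicate. The following sketch decodes and encodes them; the two constants are copied from the header, while `format_id_to_string` and `string_to_format_id` are hypothetical helper names introduced only for this illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Values copied from the header above. */
#define GIT_SHA1_FORMAT_ID 0x73686131
#define GIT_SHA256_FORMAT_ID 0x73323536

/* Hypothetical helper: unpack a format ID into 4 characters plus NUL. */
static void format_id_to_string(uint32_t id, char out[5])
{
	out[0] = (char)((id >> 24) & 0xff); /* most significant byte first */
	out[1] = (char)((id >> 16) & 0xff);
	out[2] = (char)((id >> 8) & 0xff);
	out[3] = (char)(id & 0xff);
	out[4] = '\0';
}

/* Hypothetical helper: pack exactly 4 characters into a format ID. */
static uint32_t string_to_format_id(const char *s)
{
	assert(strlen(s) == 4);
	return ((uint32_t)(unsigned char)s[0] << 24) |
	       ((uint32_t)(unsigned char)s[1] << 16) |
	       ((uint32_t)(unsigned char)s[2] << 8) |
	       (uint32_t)(unsigned char)s[3];
}
```

For example, `0x73686131` splits into the bytes `0x73 0x68 0x61 0x31`, which are the ASCII codes for `s`, `h`, `a`, `1`.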

			`struct object_id {`
			`unsigned char hash[GIT_MAX_RAWSZ];`
			`int algo; /* XXX requires 4-byte alignment */`
			`};`

			`#define GET_OID_QUIETLY 01`
			`#define GET_OID_COMMIT 02`
			`#define GET_OID_COMMITTISH 04`
			`#define GET_OID_TREE 010`
			`#define GET_OID_TREEISH 020`
			`#define GET_OID_BLOB 040`
			`#define GET_OID_FOLLOW_SYMLINKS 0100`
			`#define GET_OID_RECORD_PATH 0200`
			`#define GET_OID_ONLY_TO_DIE 04000`
			`#define GET_OID_REQUIRE_PATH 010000`
			`#define GET_OID_HASH_ANY 020000`

			`#define GET_OID_DISAMBIGUATORS \`
			`(GET_OID_COMMIT | GET_OID_COMMITTISH | \`
			`GET_OID_TREE | GET_OID_TREEISH | \`
			`GET_OID_BLOB)`
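The `GET_OID_*` values are octal literals, each a distinct bit, so they can be combined with `|` and tested with `&`. The sketch below copies the definitions from the header and adds two hypothetical helpers (`disambiguators_of` and `has_disambiguator`, not part of Git) to show how the `GET_OID_DISAMBIGUATORS` mask isolates the object-type bits from the other flags.

```c
#include <assert.h>

/* Definitions copied from the header above (octal literals). */
#define GET_OID_QUIETLY 01
#define GET_OID_COMMIT 02
#define GET_OID_COMMITTISH 04
#define GET_OID_TREE 010
#define GET_OID_TREEISH 020
#define GET_OID_BLOB 040
#define GET_OID_FOLLOW_SYMLINKS 0100
#define GET_OID_RECORD_PATH 0200
#define GET_OID_ONLY_TO_DIE 04000
#define GET_OID_REQUIRE_PATH 010000
#define GET_OID_HASH_ANY 020000

#define GET_OID_DISAMBIGUATORS \
	(GET_OID_COMMIT | GET_OID_COMMITTISH | \
	GET_OID_TREE | GET_OID_TREEISH | \
	GET_OID_BLOB)

/* The five type bits occupy octal 02 through 040, i.e. 076 in total. */
static_assert(GET_OID_DISAMBIGUATORS == 076, "mask covers bits 1..5");

/* Hypothetical helper: keep only the disambiguator bits of a flag set. */
static unsigned disambiguators_of(unsigned flags)
{
	return flags & GET_OID_DISAMBIGUATORS;
}

/* Hypothetical helper: nonzero if any disambiguator bit is set. */
static int has_disambiguator(unsigned flags)
{
	return disambiguators_of(flags) != 0;
}
```

With these helpers, `GET_OID_QUIETLY | GET_OID_TREE` reduces to `GET_OID_TREE`, while `GET_OID_QUIETLY | GET_OID_ONLY_TO_DIE` contains no disambiguator at all.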

			`enum get_oid_result {`
			`FOUND = 0,`
			`MISSING_OBJECT = -1, /* The requested object is missing */`
			`SHORT_NAME_AMBIGUOUS = -2,`
			`/* The following only apply when symlinks are followed */`
			`DANGLING_SYMLINK = -4, /*`
			`* The initial symlink is there, but`
			`* (transitively) points to a missing`
			`* in-tree file`
			`*/`
			`SYMLINK_LOOP = -5,`
			`NOT_DIR = -6, /*`
			`* Somewhere along the symlink chain, a path is`
			`* requested which contains a file as a`
			`* non-final element.`
			`*/`
			`};`
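A caller typically branches on `enum get_oid_result` to decide how to report a failure. The sketch below copies the enum from the header and adds a hypothetical `describe_get_oid_result` helper; the helper and its message strings are assumptions made for illustration and are not Git's actual wording.

```c
#include <assert.h>
#include <string.h>

/* Enum copied from the header above. */
enum get_oid_result {
	FOUND = 0,
	MISSING_OBJECT = -1,
	SHORT_NAME_AMBIGUOUS = -2,
	DANGLING_SYMLINK = -4,
	SYMLINK_LOOP = -5,
	NOT_DIR = -6,
};

/* Hypothetical helper: map a result code to an illustrative message. */
static const char *describe_get_oid_result(enum get_oid_result r)
{
	switch (r) {
	case FOUND:
		return "found";
	case MISSING_OBJECT:
		return "object is missing";
	case SHORT_NAME_AMBIGUOUS:
		return "short name is ambiguous";
	case DANGLING_SYMLINK:
		return "symlink points to a missing in-tree file";
	case SYMLINK_LOOP:
		return "symlink loop";
	case NOT_DIR:
		return "a non-final path element is a file";
	}
	return "unknown result";
}

/* Hypothetical helper: true for the codes that only occur with symlinks. */
static int is_symlink_failure(enum get_oid_result r)
{
	return r == DANGLING_SYMLINK || r == SYMLINK_LOOP || r == NOT_DIR;
}
```

Since `FOUND` is zero and every failure is negative, a plain `if (result < 0)` check is enough when the specific reason does not matter.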
Add structure representing hash algorithm Since in the future we want to support an additional hash algorithm, add a structure that represents a hash algorithm and all the data that must go along with it. Add a constant to allow easy enumeration of hash algorithms. Implement function typedefs to create an abstract API that can be used by any hash algorithm, and wrappers for the existing SHA1 functions that conform to this API. Expose a value for hex size as well as binary size. While one will always be twice the other, the two values are both used extremely commonly throughout the codebase and providing both leads to improved readability. Don't include an entry in the hash algorithm structure for the null object ID. As this value is all zeros, any suitably sized all-zero object ID can be used, and there's no need to store a given one on a per-hash basis. The current hash function transition plan envisions a time when we will accept input from the user that might be in SHA-1 or in the NewHash format. Since we cannot know which the user has provided, add a constant representing the unknown algorithm to allow us to indicate that we must look the correct value up. Provide dummy API functions that die in this case. Finally, include git-compat-util.h in hash.h so that the required types are available. This aids people using automated tools their editors. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-11-12 22:28:52 +01:00
global: introduce `USE_THE_REPOSITORY_VARIABLE` macro Use of the `the_repository` variable is deprecated nowadays, and we slowly but steadily convert the codebase to not use it anymore. Instead, callers should be passing down the repository to work on via parameters. It is hard though to prove that a given code unit does not use this variable anymore. The most trivial case, merely demonstrating that there is no direct use of `the_repository`, is already a bit of a pain during code reviews as the reviewer needs to manually verify claims made by the patch author. The bigger problem though is that we have many interfaces that implicitly rely on `the_repository`. Introduce a new `USE_THE_REPOSITORY_VARIABLE` macro that allows code units to opt into usage of `the_repository`. The intent of this macro is to demonstrate that a certain code unit does not use this variable anymore, and to keep it from new dependencies on it in future changes, be it explicit or implicit. For now, the macro only guards `the_repository` itself as well as `the_hash_algo`. There are many more known interfaces where we have an implicit dependency on `the_repository`, but those are not guarded at the current point in time. Over time though, we should start to add guards as required (or even better, just remove them). Define the macro as required in our code units. As expected, most of our code still relies on the global variable. Nearly all of our builtins rely on the variable as there is no way yet to pass `the_repository` to their entry point. For now, declare the macro in "builtin.h" to keep the required changes at least a little bit more contained. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2024-06-14 08:50:23 +02:00			`#ifdef USE_THE_REPOSITORY_VARIABLE`
			`# include "repository.h"`
			`# define the_hash_algo the_repository->hash_algo`
			`#endif`
hash.h: move object_id definition from cache.h Our hashmap.h helpfully defines a sha1hash() function. But it cannot define a similar oidhash() without including all of cache.h, which itself wants to include hashmap.h! Let's break this circular dependency by moving the definition to hash.h, along with the remaining RAWSZ macros, etc. That will put them with the existing git_hash_algo definition. One alternative would be to move oidhash() into cache.h, but it's already quite bloated. We're better off moving things out than in. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2019-06-20 09:41:45 +02:00
			`/* A suitably aligned type for stack allocations of hash contexts. */`
			`union git_hash_ctx {`
			`git_SHA_CTX sha1;`
			`git_SHA_CTX_unsafe sha1_unsafe;`

			`git_SHA256_CTX sha256;`
			`};`
			`typedef union git_hash_ctx git_hash_ctx;`

			`typedef void (*git_hash_init_fn)(git_hash_ctx *ctx);`
			`typedef void (*git_hash_clone_fn)(git_hash_ctx *dst, const git_hash_ctx *src);`
			`typedef void (*git_hash_update_fn)(git_hash_ctx *ctx, const void *in, size_t len);`
			`typedef void (*git_hash_final_fn)(unsigned char *hash, git_hash_ctx *ctx);`
			`typedef void (*git_hash_final_oid_fn)(struct object_id *oid, git_hash_ctx *ctx);`

			`struct git_hash_algo {`
			`/*`
			`* The name of the algorithm, as appears in the config file and in`
			`* messages.`
			`*/`
			`const char *name;`

			`/* A four-byte version identifier, used in pack indices. */`
			`uint32_t format_id;`

			`/* The length of the hash in binary. */`
			`size_t rawsz;`

			`/* The length of the hash in hex characters. */`
			`size_t hexsz;`

			`/* The block size of the hash. */`
			`size_t blksz;`

			`/* The hash initialization function. */`
			`git_hash_init_fn init_fn;`

			`/* The hash context cloning function. */`
			`git_hash_clone_fn clone_fn;`

			`/* The hash update function. */`
			`git_hash_update_fn update_fn;`

			`/* The hash finalization function. */`
			`git_hash_final_fn final_fn;`

			`/* The hash finalization function for object IDs. */`
			`git_hash_final_oid_fn final_oid_fn;`

			`/* The non-cryptographic hash initialization function. */`
			`git_hash_init_fn unsafe_init_fn;`

			`/* The non-cryptographic hash context cloning function. */`
			`git_hash_clone_fn unsafe_clone_fn;`

			`/* The non-cryptographic hash update function. */`
			`git_hash_update_fn unsafe_update_fn;`

			`/* The non-cryptographic hash finalization function. */`
			`git_hash_final_fn unsafe_final_fn;`

			`/* The non-cryptographic hash finalization function for object IDs. */`
			`git_hash_final_oid_fn unsafe_final_oid_fn;`

			`/* The OID of the empty tree. */`
			`const struct object_id *empty_tree;`

			`/* The OID of the empty blob. */`
			`const struct object_id *empty_blob;`

			`/* The all-zeros OID. */`
			`const struct object_id *null_oid;`
			`};`
			`extern const struct git_hash_algo hash_algos[GIT_HASH_NALGOS];`
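
The function-pointer table above lets callers hash data without naming a concrete implementation. The following is a minimal sketch of that dispatch pattern, not Git's code: `toy_ctx`, `struct toy_algo`, `toy_fnv`, and `toy_hash_buffer()` are hypothetical stand-ins, and a 32-bit FNV-1a checksum is swapped in for a real cryptographic hash.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for git_hash_ctx: a union with a single member. */
typedef union toy_ctx {
	uint32_t fnv; /* running FNV-1a state */
} toy_ctx;

typedef void (*toy_init_fn)(toy_ctx *ctx);
typedef void (*toy_update_fn)(toy_ctx *ctx, const void *in, size_t len);
typedef void (*toy_final_fn)(unsigned char *hash, toy_ctx *ctx);

/* Hypothetical, trimmed-down analogue of struct git_hash_algo. */
struct toy_algo {
	const char *name;
	size_t rawsz;
	toy_init_fn init_fn;
	toy_update_fn update_fn;
	toy_final_fn final_fn;
};

static void fnv_init(toy_ctx *ctx)
{
	ctx->fnv = 2166136261u; /* FNV-1a 32-bit offset basis */
}

static void fnv_update(toy_ctx *ctx, const void *in, size_t len)
{
	const unsigned char *p = in;
	size_t i;

	for (i = 0; i < len; i++) {
		ctx->fnv ^= p[i];
		ctx->fnv *= 16777619u; /* FNV-1a 32-bit prime */
	}
}

static void fnv_final(unsigned char *hash, toy_ctx *ctx)
{
	/* Emit the state big-endian so the bytes do not depend on the host. */
	hash[0] = (unsigned char)(ctx->fnv >> 24);
	hash[1] = (unsigned char)(ctx->fnv >> 16);
	hash[2] = (unsigned char)(ctx->fnv >> 8);
	hash[3] = (unsigned char)ctx->fnv;
}

static const struct toy_algo toy_fnv = {
	"toy-fnv1a", 4, fnv_init, fnv_update, fnv_final
};

/* Hash a buffer through the table only, never naming the implementation. */
static void toy_hash_buffer(const struct toy_algo *algop, const void *buf,
			    size_t len, unsigned char *out)
{
	toy_ctx ctx;

	algop->init_fn(&ctx);
	algop->update_fn(&ctx, buf, len);
	algop->final_fn(out, &ctx);
}
```

Because the caller only touches `algop`, swapping in a second table entry (for instance an "unsafe" variant) would require no change to `toy_hash_buffer()`.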

			`/*`
			`* Return a GIT_HASH_* constant based on the name. Returns GIT_HASH_UNKNOWN if`
			`* the name doesn't match a known algorithm.`
			`*/`
			`int hash_algo_by_name(const char *name);`
			`/* Identical, except based on the format ID. */`
			`int hash_algo_by_id(uint32_t format_id);`
			`/* Identical, except based on the length. */`
			`int hash_algo_by_length(int len);`
			`/* Identical, except for a pointer to struct git_hash_algo. */`
			`static inline int hash_algo_by_ptr(const struct git_hash_algo *p)`
			`{`
			`return p - hash_algos;`
			`}`
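
`hash_algo_by_ptr()` relies on pointer subtraction within a single array. A small sketch of the same idea, using a hypothetical `toy_table` in place of `hash_algos` (the entry names and sizes are illustrative assumptions):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical table standing in for hash_algos[]. */
struct toy_entry {
	const char *name;
	size_t rawsz;
};

static const struct toy_entry toy_table[] = {
	{ "unknown", 0 },
	{ "sha1", 20 },
	{ "sha256", 32 },
};

/*
 * Recover the array index from a pointer into the table. Subtracting
 * two pointers yields the distance in elements, not in bytes.
 */
static int toy_index_of(const struct toy_entry *p)
{
	return (int)(p - toy_table);
}
```

The subtraction is only defined when `p` points into `toy_table` (or one past its end), so a helper like this should only be handed pointers obtained from the table itself.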

			`const struct object_id *null_oid(void);`

			`static inline int hashcmp(const unsigned char *sha1, const unsigned char *sha2, const struct git_hash_algo *algop)`
			`{`
			`/*`
			`* Teach the compiler that there are only two possibilities of hash size`
			`* here, so that it can optimize for this case as much as possible.`
			`*/`
			`if (algop->rawsz == GIT_MAX_RAWSZ)`
			`return memcmp(sha1, sha2, GIT_MAX_RAWSZ);`
			`return memcmp(sha1, sha2, GIT_SHA1_RAWSZ);`
			`}`

			`static inline int hasheq(const unsigned char *sha1, const unsigned char *sha2, const struct git_hash_algo *algop)`
			`{`
			`/*`
			`* We write this here instead of deferring to hashcmp so that the`
			`* compiler can properly inline it and avoid calling memcmp.`
			`*/`
			`if (algop->rawsz == GIT_MAX_RAWSZ)`
			`return !memcmp(sha1, sha2, GIT_MAX_RAWSZ);`
			`return !memcmp(sha1, sha2, GIT_SHA1_RAWSZ);`
			`}`
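
Both `hashcmp()` and `hasheq()` branch on the hash size so that each `memcmp()` call sees a compile-time constant length. A self-contained sketch of that pattern follows; `toy_hasheq()` is hypothetical, and the values 20 and 32 for the two sizes are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_SHA1_RAWSZ 20 /* assumed raw size of the shorter hash */
#define TOY_MAX_RAWSZ 32  /* assumed raw size of the longest hash */

/*
 * Compare two raw hashes. Each memcmp() gets a constant length, which
 * gives the compiler the chance to expand the comparison inline.
 */
static int toy_hasheq(const unsigned char *a, const unsigned char *b,
		      size_t rawsz)
{
	if (rawsz == TOY_MAX_RAWSZ)
		return !memcmp(a, b, TOY_MAX_RAWSZ);
	return !memcmp(a, b, TOY_SHA1_RAWSZ);
}
```

Note that any size other than the maximum falls through to the shorter comparison, so bytes beyond the first 20 are ignored in that branch.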

			`static inline void hashcpy(unsigned char *sha_dst, const unsigned char *sha_src,`
			`const struct git_hash_algo *algop)`
			`{`
			`memcpy(sha_dst, sha_src, algop->rawsz);`
			`}`

			`static inline void hashclr(unsigned char *hash, const struct git_hash_algo *algop)`
			`{`
			`memset(hash, 0, algop->rawsz);`
			`}`

			`static inline int oidcmp(const struct object_id *oid1, const struct object_id *oid2)`
			`{`
			`return memcmp(oid1->hash, oid2->hash, GIT_MAX_RAWSZ);`
			`}`

			`static inline int oideq(const struct object_id *oid1, const struct object_id *oid2)`
			`{`
			`return !memcmp(oid1->hash, oid2->hash, GIT_MAX_RAWSZ);`
			`}`

			`static inline void oidcpy(struct object_id *dst, const struct object_id *src)`
			`{`
			`memcpy(dst->hash, src->hash, GIT_MAX_RAWSZ);`
			`dst->algo = src->algo;`
			`}`

			`static inline void oidread(struct object_id *oid, const unsigned char *hash,`
			`const struct git_hash_algo *algop)`
			`{`
			`memcpy(oid->hash, hash, algop->rawsz);`
			`if (algop->rawsz < GIT_MAX_RAWSZ)`
			`memset(oid->hash + algop->rawsz, 0, GIT_MAX_RAWSZ - algop->rawsz);`
			`oid->algo = hash_algo_by_ptr(algop);`
			`}`
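
`oidread()` zero-fills the tail of the buffer when the hash is shorter than the maximum, which is what allows `oideq()` to compare the full `GIT_MAX_RAWSZ` bytes unconditionally. The sketch below illustrates this with hypothetical names (`struct toy_oid`, `toy_oidread()`, `toy_oideq()`) and an assumed maximum size of 32 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_RAWSZ 32 /* assumed size of the longest supported hash */

/* Hypothetical stand-in for struct object_id. */
struct toy_oid {
	unsigned char hash[TOY_MAX_RAWSZ];
	int algo;
};

/* Copy rawsz bytes and clear the unused tail of the buffer. */
static void toy_oidread(struct toy_oid *oid, const unsigned char *hash,
			size_t rawsz, int algo)
{
	memcpy(oid->hash, hash, rawsz);
	if (rawsz < TOY_MAX_RAWSZ)
		memset(oid->hash + rawsz, 0, TOY_MAX_RAWSZ - rawsz);
	oid->algo = algo;
}

/* Full-width comparison; correct only because the tail is zeroed. */
static int toy_oideq(const struct toy_oid *a, const struct toy_oid *b)
{
	return !memcmp(a->hash, b->hash, TOY_MAX_RAWSZ);
}
```

Without the `memset()`, two IDs read from the same 20-byte hash into buffers holding different stale bytes would compare as unequal.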

			`static inline void oidclr(struct object_id *oid,`
			`const struct git_hash_algo *algop)`
			`{`
			`memset(oid->hash, 0, GIT_MAX_RAWSZ);`
			`oid->algo = hash_algo_by_ptr(algop);`
			`}`

			`static inline struct object_id *oiddup(const struct object_id *src)`
			`{`
			`struct object_id *dst = xmalloc(sizeof(struct object_id));`
			`oidcpy(dst, src);`
			`return dst;`
			`}`

			`static inline void oid_set_algo(struct object_id *oid, const struct git_hash_algo *algop)`
			`{`
			`oid->algo = hash_algo_by_ptr(algop);`
			`}`

			`/*`
			`* Converts a cryptographic hash (e.g. SHA-1) into an int-sized hash code`
			`* for use in hash tables. Cryptographic hashes are supposed to have`
			``* uniform distribution, so in contrast to `memhash()`, this just copies``
			``* the first `sizeof(int)` bytes without shuffling any bits. Note that``
			`* the results will be different on big-endian and little-endian`
			`* platforms, so they should not be stored or transferred over the net.`
			`*/`
			`static inline unsigned int oidhash(const struct object_id *oid)`
			`{`
			`/*`
			`* Equivalent to 'return *(unsigned int *)oid->hash;', but safe on`
			`* platforms that don't support unaligned reads.`
			`*/`
			`unsigned int hash;`
			`memcpy(&hash, oid->hash, sizeof(hash));`
			`return hash;`
			`}`
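
The `memcpy()` in `oidhash()` is the portable way to read an integer from a byte buffer that may not be suitably aligned. A short sketch of the same technique, using a hypothetical `toy_hashcode()` helper:

```c
#include <assert.h>
#include <string.h>

/*
 * Read the first sizeof(unsigned int) bytes of a buffer as an integer.
 * memcpy() makes no alignment demands on its source, unlike casting
 * the byte pointer to unsigned int * and dereferencing it.
 */
static unsigned int toy_hashcode(const unsigned char *bytes)
{
	unsigned int code;

	memcpy(&code, bytes, sizeof(code));
	return code;
}
```

As with `oidhash()`, the resulting value depends on the host byte order, so it is suitable for in-memory hash tables but not for anything stored or sent elsewhere.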

			`static inline int is_null_oid(const struct object_id *oid)`
			`{`
			`static const unsigned char null_hash[GIT_MAX_RAWSZ];`
			`return !memcmp(oid->hash, null_hash, GIT_MAX_RAWSZ);`
			`}`

			`const char *empty_tree_oid_hex(const struct git_hash_algo *algop);`

			`static inline int is_empty_blob_oid(const struct object_id *oid,`
			`const struct git_hash_algo *algop)`
			`{`
			`return oideq(oid, algop->empty_blob);`
			`}`

			`static inline int is_empty_tree_oid(const struct object_id *oid,`
			`const struct git_hash_algo *algop)`
			`{`
			`return oideq(oid, algop->empty_tree);`
			`}`

			`#endif`