mirrors/git - Incest Forge: Beyond sex. We incest.

mirrors/git

mirror of https://github.com/git/git.git synced 2024-11-05 00:37:55 +01:00

821 lines

23 KiB

Bash

Raw Normal View History

Add 'ident' conversion. The 'ident' attribute set to path squashes "$ident:<any bytes except dollor sign>$" to "$ident$" upon checkin, and expands it to "$ident: <blob SHA-1> $" upon checkout. As we have two conversions that affect checkin/checkout paths, clarify how they interact with each other. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-22 04:09:02 +02:00			`#!/bin/sh`

			`test_description='blob conversion via gitattributes'`

			`. ./test-lib.sh`

t0021, t5615: use $PWD instead of $(pwd) in PATH-like shell variables We have to use $PWD instead of $(pwd) because on Windows the latter would add a C: style path to bash's Unix-style $PATH variable, which becomes confused by the colon after the drive letter. ($PWD is a Unix-style path.) In the case of GIT_ALTERNATE_OBJECT_DIRECTORIES, bash on Windows assembles a Unix-style path list with the colon as separators. It converts the value to a Windows-style path list with the semicolon as path separator when it forwards the variable to git.exe. The same confusion happens when bash's original value is contaminated with Windows style paths. Signed-off-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-11-11 18:31:48 +01:00			`TEST_ROOT="$PWD"`
t0021: put $TEST_ROOT in $PATH We create a rot13.sh script in the trash directory, but need to call it by its full path when we have moved our cwd to another directory. Let's just put $TEST_ROOT in our $PATH so that the script is always found. This is a minor convenience for rot13.sh, but will be a major one when we switch rot13-filter.pl to a script in the same directory, as it means we will not have to deal with shell quoting inside the filter-process config. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-11-02 19:18:25 +01:00			`PATH=$TEST_ROOT:$PATH`
convert: add filter.<driver>.process option Git's clean/smudge mechanism invokes an external filter process for every single blob that is affected by a filter. If Git filters a lot of blobs then the startup time of the external filter processes can become a significant part of the overall Git execution time. In a preliminary performance test this developer used a clean/smudge filter written in golang to filter 12,000 files. This process took 364s with the existing filter mechanism and 5s with the new mechanism. See details here: https://github.com/github/git-lfs/pull/1382 This patch adds the `filter.<driver>.process` string option which, if used, keeps the external filter process running and processes all blobs with the packet format (pkt-line) based protocol over standard input and standard output. The full protocol is explained in detail in `Documentation/gitattributes.txt`. A few key decisions: * The long running filter process is referred to as filter protocol version 2 because the existing single shot filter invocation is considered version 1. * Git sends a welcome message and expects a response right after the external filter process has started. This ensures that Git will not hang if a version 1 filter is incorrectly used with the filter.<driver>.process option for version 2 filters. In addition, Git can detect this kind of error and warn the user. * The status of a filter operation (e.g. "success" or "error) is set before the actual response and (if necessary!) re-set after the response. The advantage of this two step status response is that if the filter detects an error early, then the filter can communicate this and Git does not even need to create structures to read the response. * All status responses are pkt-line lists terminated with a flush packet. This allows us to send other status fields with the same protocol in the future. Helped-by: Martin-Louis Bright <mlbright@gmail.com> Reviewed-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-10-17 01:20:37 +02:00
t0021: use write_script to create rot13 shell script This avoids us fooling around with $SHELL_PATH and the executable bit ourselves. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-11-02 19:17:51 +01:00			`write_script <<\EOF "$TEST_ROOT/rot13.sh"`
t0021: tr portability fix for Solaris Solaris' /usr/bin/tr doesn't seem to like multiple character ranges in brackets (it simply prints "Bad string"). Instead, let's just enumerate the transformation we want. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-03-11 18:40:45 +01:00			`tr \`
			`'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ' \`
			`'nopqrstuvwxyzabcdefghijklmNOPQRSTUVWXYZABCDEFGHIJKLM'`
Add 'filter' attribute and external filter driver definition. The interface is similar to the custom low-level merge drivers. First you configure your filter driver by defining 'filter.<name>.' variables in the configuration. filter.<name>.clean filter command to run upon checkin filter.<name>.smudge filter command to run upon checkout Then you assign filter attribute to each path, whose name matches the custom filter driver's name. Example: (in .gitattributes) .c filter=indent (in config) [filter "indent"] clean = indent smudge = cat Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-21 12:14:13 +02:00			`EOF`
convert: add filter.<driver>.process option Git's clean/smudge mechanism invokes an external filter process for every single blob that is affected by a filter. If Git filters a lot of blobs then the startup time of the external filter processes can become a significant part of the overall Git execution time. In a preliminary performance test this developer used a clean/smudge filter written in golang to filter 12,000 files. This process took 364s with the existing filter mechanism and 5s with the new mechanism. See details here: https://github.com/github/git-lfs/pull/1382 This patch adds the `filter.<driver>.process` string option which, if used, keeps the external filter process running and processes all blobs with the packet format (pkt-line) based protocol over standard input and standard output. The full protocol is explained in detail in `Documentation/gitattributes.txt`. A few key decisions: * The long running filter process is referred to as filter protocol version 2 because the existing single shot filter invocation is considered version 1. * Git sends a welcome message and expects a response right after the external filter process has started. This ensures that Git will not hang if a version 1 filter is incorrectly used with the filter.<driver>.process option for version 2 filters. In addition, Git can detect this kind of error and warn the user. * The status of a filter operation (e.g. "success" or "error) is set before the actual response and (if necessary!) re-set after the response. The advantage of this two step status response is that if the filter detects an error early, then the filter can communicate this and Git does not even need to create structures to read the response. * All status responses are pkt-line lists terminated with a flush packet. This allows us to send other status fields with the same protocol in the future. Helped-by: Martin-Louis Bright <mlbright@gmail.com> Reviewed-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-10-17 01:20:37 +02:00
t0021: use $PERL_PATH for rot13-filter.pl The rot13-filter.pl script hardcodes "#!/usr/bin/perl", and does not respect $PERL_PATH at all. That is a problem if the system does not have perl at that path, or if it has a perl that is too old to run a complicated script like the rot13-filter (but PERL_PATH points to a more modern one). We can fix this by using write_script() to create a new copy of the script with the correct #!-line. In theory we could move the whole script inside t0021-conversion.sh rather than having it as an auxiliary file, but it's long enough that it just makes things harder to read. As a bonus, we can stop using the full path to the script in the filter-process config we add (because the trash directory is in our PATH). Not only is this shorter, but it sidesteps any shell-quoting issues. The original was broken when $TEST_DIRECTORY contained a space, because it was interpolated in the outer script. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-11-02 19:20:22 +01:00			`write_script rot13-filter.pl "$PERL_PATH" \`
			`<"$TEST_DIRECTORY"/t0021/rot13-filter.pl`
convert: add filter.<driver>.process option Git's clean/smudge mechanism invokes an external filter process for every single blob that is affected by a filter. If Git filters a lot of blobs then the startup time of the external filter processes can become a significant part of the overall Git execution time. In a preliminary performance test this developer used a clean/smudge filter written in golang to filter 12,000 files. This process took 364s with the existing filter mechanism and 5s with the new mechanism. See details here: https://github.com/github/git-lfs/pull/1382 This patch adds the `filter.<driver>.process` string option which, if used, keeps the external filter process running and processes all blobs with the packet format (pkt-line) based protocol over standard input and standard output. The full protocol is explained in detail in `Documentation/gitattributes.txt`. A few key decisions: * The long running filter process is referred to as filter protocol version 2 because the existing single shot filter invocation is considered version 1. * Git sends a welcome message and expects a response right after the external filter process has started. This ensures that Git will not hang if a version 1 filter is incorrectly used with the filter.<driver>.process option for version 2 filters. In addition, Git can detect this kind of error and warn the user. * The status of a filter operation (e.g. "success" or "error) is set before the actual response and (if necessary!) re-set after the response. The advantage of this two step status response is that if the filter detects an error early, then the filter can communicate this and Git does not even need to create structures to read the response. * All status responses are pkt-line lists terminated with a flush packet. This allows us to send other status fields with the same protocol in the future. Helped-by: Martin-Louis Bright <mlbright@gmail.com> Reviewed-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-10-17 01:20:37 +02:00
			`generate_random_characters () {`
			`LEN=$1`
			`NAME=$2`
			`test-genrandom some-seed $LEN \|`
			`perl -pe "s/./chr((ord($&) % 26) + ord('a'))/sge" >"$TEST_ROOT/$NAME"`
			`}`

			`file_size () {`
t0021: compute file size with a single process instead of a pipeline Avoid unwanted coding patterns (prodigal use of pipelines), and in particular a useless use of cat. Signed-off-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Jeff King <peff@peff.net> 2016-11-06 20:31:19 +01:00			`perl -e 'print -s $ARGV[0]' "$1"`
convert: add filter.<driver>.process option Git's clean/smudge mechanism invokes an external filter process for every single blob that is affected by a filter. If Git filters a lot of blobs then the startup time of the external filter processes can become a significant part of the overall Git execution time. In a preliminary performance test this developer used a clean/smudge filter written in golang to filter 12,000 files. This process took 364s with the existing filter mechanism and 5s with the new mechanism. See details here: https://github.com/github/git-lfs/pull/1382 This patch adds the `filter.<driver>.process` string option which, if used, keeps the external filter process running and processes all blobs with the packet format (pkt-line) based protocol over standard input and standard output. The full protocol is explained in detail in `Documentation/gitattributes.txt`. A few key decisions: * The long running filter process is referred to as filter protocol version 2 because the existing single shot filter invocation is considered version 1. * Git sends a welcome message and expects a response right after the external filter process has started. This ensures that Git will not hang if a version 1 filter is incorrectly used with the filter.<driver>.process option for version 2 filters. In addition, Git can detect this kind of error and warn the user. * The status of a filter operation (e.g. "success" or "error) is set before the actual response and (if necessary!) re-set after the response. The advantage of this two step status response is that if the filter detects an error early, then the filter can communicate this and Git does not even need to create structures to read the response. * All status responses are pkt-line lists terminated with a flush packet. This allows us to send other status fields with the same protocol in the future. Helped-by: Martin-Louis Bright <mlbright@gmail.com> Reviewed-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-10-17 01:20:37 +02:00			`}`

			`filter_git () {`
t0021: make debug log file name configurable The "rot13-filter.pl" helper wrote its debug logs always to "rot13-filter.log". Make this configurable by defining the log file as first parameter of "rot13-filter.pl". This is useful if "rot13-filter.pl" is configured multiple times similar to the subsequent patch 'convert: add "status=delayed" to filter process protocol'. Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-06-01 10:22:00 +02:00			`rm -f *.log &&`
t0021: remove debugging cruft The redirection of the standard error stream to a temporary file is a leftover cruft during debugging. Remove it. Besides, it is reported by folks on the Windows that the test is flaky with this redirection; somebody gets confused and this merely-redirected-to file gets marked as delete-pending by git.exe and makes it finish with a non-zero exit status when "git checkout" finishes. Windows folks may want to figure that one out, but for the purpose of this test, it shouldn't become a show-stopper. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-11-11 22:07:37 +01:00			`git "$@"`
convert: add filter.<driver>.process option Git's clean/smudge mechanism invokes an external filter process for every single blob that is affected by a filter. If Git filters a lot of blobs then the startup time of the external filter processes can become a significant part of the overall Git execution time. In a preliminary performance test this developer used a clean/smudge filter written in golang to filter 12,000 files. This process took 364s with the existing filter mechanism and 5s with the new mechanism. See details here: https://github.com/github/git-lfs/pull/1382 This patch adds the `filter.<driver>.process` string option which, if used, keeps the external filter process running and processes all blobs with the packet format (pkt-line) based protocol over standard input and standard output. The full protocol is explained in detail in `Documentation/gitattributes.txt`. A few key decisions: * The long running filter process is referred to as filter protocol version 2 because the existing single shot filter invocation is considered version 1. * Git sends a welcome message and expects a response right after the external filter process has started. This ensures that Git will not hang if a version 1 filter is incorrectly used with the filter.<driver>.process option for version 2 filters. In addition, Git can detect this kind of error and warn the user. * The status of a filter operation (e.g. "success" or "error) is set before the actual response and (if necessary!) re-set after the response. The advantage of this two step status response is that if the filter detects an error early, then the filter can communicate this and Git does not even need to create structures to read the response. * All status responses are pkt-line lists terminated with a flush packet. This allows us to send other status fields with the same protocol in the future. Helped-by: Martin-Louis Bright <mlbright@gmail.com> Reviewed-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-10-17 01:20:37 +02:00			`}`

			# Compare two files and ensure that `clean` and `smudge` respectively are
			# called at least once if specified in the `expect` file. The actual
			`# invocation count is not relevant because their number can vary.`
			`# c.f. http://public-inbox.org/git/xmqqshv18i8i.fsf@gitster.mtv.corp.google.com/`
			`test_cmp_count () {`
			`expect=$1`
			`actual=$2`
			`for FILE in "$expect" "$actual"`
			`do`
t0021: expect more variations in the output of uniq -c Some versions of uniq -c write the count left-justified, other version write it right-justified. Be prepared for both kinds. Signed-off-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Jeff King <peff@peff.net> 2016-11-03 21:12:13 +01:00			`sort "$FILE" \| uniq -c \|`
t0021: keep filter log files on comparison The filter log files are modified on comparison. That might be unexpected by the caller. It would be even undesirable if the caller wants to reuse the original log files. Address these issues by using temp files for modifications. This is useful for the subsequent patch 'convert: add "status=delayed" to filter process protocol'. Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-06-01 10:21:59 +02:00			`sed -e "s/^ [0-9][0-9][ ]*IN: /x IN: /" >"$FILE.tmp"`
convert: add filter.<driver>.process option Git's clean/smudge mechanism invokes an external filter process for every single blob that is affected by a filter. If Git filters a lot of blobs then the startup time of the external filter processes can become a significant part of the overall Git execution time. In a preliminary performance test this developer used a clean/smudge filter written in golang to filter 12,000 files. This process took 364s with the existing filter mechanism and 5s with the new mechanism. See details here: https://github.com/github/git-lfs/pull/1382 This patch adds the `filter.<driver>.process` string option which, if used, keeps the external filter process running and processes all blobs with the packet format (pkt-line) based protocol over standard input and standard output. The full protocol is explained in detail in `Documentation/gitattributes.txt`. A few key decisions: * The long running filter process is referred to as filter protocol version 2 because the existing single shot filter invocation is considered version 1. * Git sends a welcome message and expects a response right after the external filter process has started. This ensures that Git will not hang if a version 1 filter is incorrectly used with the filter.<driver>.process option for version 2 filters. In addition, Git can detect this kind of error and warn the user. * The status of a filter operation (e.g. "success" or "error) is set before the actual response and (if necessary!) re-set after the response. The advantage of this two step status response is that if the filter detects an error early, then the filter can communicate this and Git does not even need to create structures to read the response. * All status responses are pkt-line lists terminated with a flush packet. This allows us to send other status fields with the same protocol in the future. Helped-by: Martin-Louis Bright <mlbright@gmail.com> Reviewed-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-10-17 01:20:37 +02:00			`done &&`
t0021: keep filter log files on comparison The filter log files are modified on comparison. That might be unexpected by the caller. It would be even undesirable if the caller wants to reuse the original log files. Address these issues by using temp files for modifications. This is useful for the subsequent patch 'convert: add "status=delayed" to filter process protocol'. Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-06-01 10:21:59 +02:00			`test_cmp "$expect.tmp" "$actual.tmp" &&`
			`rm "$expect.tmp" "$actual.tmp"`
convert: add filter.<driver>.process option Git's clean/smudge mechanism invokes an external filter process for every single blob that is affected by a filter. If Git filters a lot of blobs then the startup time of the external filter processes can become a significant part of the overall Git execution time. In a preliminary performance test this developer used a clean/smudge filter written in golang to filter 12,000 files. This process took 364s with the existing filter mechanism and 5s with the new mechanism. See details here: https://github.com/github/git-lfs/pull/1382 This patch adds the `filter.<driver>.process` string option which, if used, keeps the external filter process running and processes all blobs with the packet format (pkt-line) based protocol over standard input and standard output. The full protocol is explained in detail in `Documentation/gitattributes.txt`. A few key decisions: * The long running filter process is referred to as filter protocol version 2 because the existing single shot filter invocation is considered version 1. * Git sends a welcome message and expects a response right after the external filter process has started. This ensures that Git will not hang if a version 1 filter is incorrectly used with the filter.<driver>.process option for version 2 filters. In addition, Git can detect this kind of error and warn the user. * The status of a filter operation (e.g. "success" or "error) is set before the actual response and (if necessary!) re-set after the response. The advantage of this two step status response is that if the filter detects an error early, then the filter can communicate this and Git does not even need to create structures to read the response. * All status responses are pkt-line lists terminated with a flush packet. This allows us to send other status fields with the same protocol in the future. Helped-by: Martin-Louis Bright <mlbright@gmail.com> Reviewed-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-10-17 01:20:37 +02:00			`}`

			# Compare two files but exclude all `clean` invocations because Git can
			# call `clean` zero or more times.
			`# c.f. http://public-inbox.org/git/xmqqshv18i8i.fsf@gitster.mtv.corp.google.com/`
			`test_cmp_exclude_clean () {`
			`expect=$1`
			`actual=$2`
			`for FILE in "$expect" "$actual"`
			`do`
t0021: keep filter log files on comparison The filter log files are modified on comparison. That might be unexpected by the caller. It would be even undesirable if the caller wants to reuse the original log files. Address these issues by using temp files for modifications. This is useful for the subsequent patch 'convert: add "status=delayed" to filter process protocol'. Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-06-01 10:21:59 +02:00			`grep -v "IN: clean" "$FILE" >"$FILE.tmp"`
convert: add filter.<driver>.process option Git's clean/smudge mechanism invokes an external filter process for every single blob that is affected by a filter. If Git filters a lot of blobs then the startup time of the external filter processes can become a significant part of the overall Git execution time. In a preliminary performance test this developer used a clean/smudge filter written in golang to filter 12,000 files. This process took 364s with the existing filter mechanism and 5s with the new mechanism. See details here: https://github.com/github/git-lfs/pull/1382 This patch adds the `filter.<driver>.process` string option which, if used, keeps the external filter process running and processes all blobs with the packet format (pkt-line) based protocol over standard input and standard output. The full protocol is explained in detail in `Documentation/gitattributes.txt`. A few key decisions: * The long running filter process is referred to as filter protocol version 2 because the existing single shot filter invocation is considered version 1. * Git sends a welcome message and expects a response right after the external filter process has started. This ensures that Git will not hang if a version 1 filter is incorrectly used with the filter.<driver>.process option for version 2 filters. In addition, Git can detect this kind of error and warn the user. * The status of a filter operation (e.g. "success" or "error) is set before the actual response and (if necessary!) re-set after the response. The advantage of this two step status response is that if the filter detects an error early, then the filter can communicate this and Git does not even need to create structures to read the response. * All status responses are pkt-line lists terminated with a flush packet. This allows us to send other status fields with the same protocol in the future. Helped-by: Martin-Louis Bright <mlbright@gmail.com> Reviewed-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-10-17 01:20:37 +02:00			`done &&`
t0021: keep filter log files on comparison The filter log files are modified on comparison. That might be unexpected by the caller. It would be even undesirable if the caller wants to reuse the original log files. Address these issues by using temp files for modifications. This is useful for the subsequent patch 'convert: add "status=delayed" to filter process protocol'. Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-06-01 10:21:59 +02:00			`test_cmp "$expect.tmp" "$actual.tmp" &&`
			`rm "$expect.tmp" "$actual.tmp"`
convert: add filter.<driver>.process option Git's clean/smudge mechanism invokes an external filter process for every single blob that is affected by a filter. If Git filters a lot of blobs then the startup time of the external filter processes can become a significant part of the overall Git execution time. In a preliminary performance test this developer used a clean/smudge filter written in golang to filter 12,000 files. This process took 364s with the existing filter mechanism and 5s with the new mechanism. See details here: https://github.com/github/git-lfs/pull/1382 This patch adds the `filter.<driver>.process` string option which, if used, keeps the external filter process running and processes all blobs with the packet format (pkt-line) based protocol over standard input and standard output. The full protocol is explained in detail in `Documentation/gitattributes.txt`. A few key decisions: * The long running filter process is referred to as filter protocol version 2 because the existing single shot filter invocation is considered version 1. * Git sends a welcome message and expects a response right after the external filter process has started. This ensures that Git will not hang if a version 1 filter is incorrectly used with the filter.<driver>.process option for version 2 filters. In addition, Git can detect this kind of error and warn the user. * The status of a filter operation (e.g. "success" or "error) is set before the actual response and (if necessary!) re-set after the response. The advantage of this two step status response is that if the filter detects an error early, then the filter can communicate this and Git does not even need to create structures to read the response. * All status responses are pkt-line lists terminated with a flush packet. This allows us to send other status fields with the same protocol in the future. Helped-by: Martin-Louis Bright <mlbright@gmail.com> Reviewed-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-10-17 01:20:37 +02:00			`}`

			`# Check that the contents of two files are equal and that their rot13 version`
			`# is equal to the committed content.`
			`test_cmp_committed_rot13 () {`
			`test_cmp "$1" "$2" &&`
t0021: put $TEST_ROOT in $PATH We create a rot13.sh script in the trash directory, but need to call it by its full path when we have moved our cwd to another directory. Let's just put $TEST_ROOT in our $PATH so that the script is always found. This is a minor convenience for rot13.sh, but will be a major one when we switch rot13-filter.pl to a script in the same directory, as it means we will not have to deal with shell quoting inside the filter-process config. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-11-02 19:18:25 +01:00			`rot13.sh <"$1" >expected &&`
convert: add filter.<driver>.process option Git's clean/smudge mechanism invokes an external filter process for every single blob that is affected by a filter. If Git filters a lot of blobs then the startup time of the external filter processes can become a significant part of the overall Git execution time. In a preliminary performance test this developer used a clean/smudge filter written in golang to filter 12,000 files. This process took 364s with the existing filter mechanism and 5s with the new mechanism. See details here: https://github.com/github/git-lfs/pull/1382 This patch adds the `filter.<driver>.process` string option which, if used, keeps the external filter process running and processes all blobs with the packet format (pkt-line) based protocol over standard input and standard output. The full protocol is explained in detail in `Documentation/gitattributes.txt`. A few key decisions: * The long running filter process is referred to as filter protocol version 2 because the existing single shot filter invocation is considered version 1. * Git sends a welcome message and expects a response right after the external filter process has started. This ensures that Git will not hang if a version 1 filter is incorrectly used with the filter.<driver>.process option for version 2 filters. In addition, Git can detect this kind of error and warn the user. * The status of a filter operation (e.g. "success" or "error) is set before the actual response and (if necessary!) re-set after the response. The advantage of this two step status response is that if the filter detects an error early, then the filter can communicate this and Git does not even need to create structures to read the response. * All status responses are pkt-line lists terminated with a flush packet. This allows us to send other status fields with the same protocol in the future. Helped-by: Martin-Louis Bright <mlbright@gmail.com> Reviewed-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-10-17 01:20:37 +02:00			`git cat-file blob :"$2" >actual &&`
			`test_cmp expected actual`
			`}`
Add 'filter' attribute and external filter driver definition. The interface is similar to the custom low-level merge drivers. First you configure your filter driver by defining 'filter.<name>.' variables in the configuration. filter.<name>.clean filter command to run upon checkin filter.<name>.smudge filter command to run upon checkout Then you assign filter attribute to each path, whose name matches the custom filter driver's name. Example: (in .gitattributes) .c filter=indent (in config) [filter "indent"] clean = indent smudge = cat Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-21 12:14:13 +02:00
Add 'ident' conversion. The 'ident' attribute set to path squashes "$ident:<any bytes except dollor sign>$" to "$ident$" upon checkin, and expands it to "$ident: <blob SHA-1> $" upon checkout. As we have two conversions that affect checkin/checkout paths, clarify how they interact with each other. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-22 04:09:02 +02:00			`test_expect_success setup '`
Add 'filter' attribute and external filter driver definition. The interface is similar to the custom low-level merge drivers. First you configure your filter driver by defining 'filter.<name>.' variables in the configuration. filter.<name>.clean filter command to run upon checkin filter.<name>.smudge filter command to run upon checkout Then you assign filter attribute to each path, whose name matches the custom filter driver's name. Example: (in .gitattributes) .c filter=indent (in config) [filter "indent"] clean = indent smudge = cat Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-21 12:14:13 +02:00			`git config filter.rot13.smudge ./rot13.sh &&`
			`git config filter.rot13.clean ./rot13.sh &&`

Add 'ident' conversion. The 'ident' attribute set to path squashes "$ident:<any bytes except dollor sign>$" to "$ident$" upon checkin, and expands it to "$ident: <blob SHA-1> $" upon checkout. As we have two conversions that affect checkin/checkout paths, clarify how they interact with each other. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-22 04:09:02 +02:00			`{`
Add 'filter' attribute and external filter driver definition. The interface is similar to the custom low-level merge drivers. First you configure your filter driver by defining 'filter.<name>.' variables in the configuration. filter.<name>.clean filter command to run upon checkin filter.<name>.smudge filter command to run upon checkout Then you assign filter attribute to each path, whose name matches the custom filter driver's name. Example: (in .gitattributes) .c filter=indent (in config) [filter "indent"] clean = indent smudge = cat Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-21 12:14:13 +02:00			`echo "*.t filter=rot13"`
Add 'ident' conversion. The 'ident' attribute set to path squashes "$ident:<any bytes except dollor sign>$" to "$ident$" upon checkin, and expands it to "$ident: <blob SHA-1> $" upon checkout. As we have two conversions that affect checkin/checkout paths, clarify how they interact with each other. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-22 04:09:02 +02:00			`echo "*.i ident"`
			`} >.gitattributes &&`

			`{`
			`echo a b c d e f g h i j k l m`
			`echo n o p q r s t u v w x y z`
Use $Id$ as the ident attribute keyword rather than $ident$ to be consistent with other VCSs $Id$ is present already in SVN and CVS; it would mean that people converting their existing repositories won't have to make any changes to the source files should they want to make use of the ident attribute. Given that it's a feature that's meant to calm those very people, it seems obtuse to make them edit every file just to make use of it. I think that bzr uses $Id$; Mercurial has examples hooks for $Id$; monotone has $Id$ on its wishlist. I can't think of a good reason not to stick with the de-facto standard and call ours $Id$ instead of $ident$. Signed-off-by: Andy Parkins <andyparkins@gmail.com> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-05-14 15:37:25 +02:00			`echo '\''$Id$'\''`
Add 'ident' conversion. The 'ident' attribute set to path squashes "$ident:<any bytes except dollor sign>$" to "$ident$" upon checkin, and expands it to "$ident: <blob SHA-1> $" upon checkout. As we have two conversions that affect checkin/checkout paths, clarify how they interact with each other. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-22 04:09:02 +02:00			`} >test &&`
			`cat test >test.t &&`
			`cat test >test.o &&`
			`cat test >test.i &&`
			`git add test test.t test.i &&`
			`rm -f test test.t test.i &&`
convert: add filter.<driver>.process option Git's clean/smudge mechanism invokes an external filter process for every single blob that is affected by a filter. If Git filters a lot of blobs then the startup time of the external filter processes can become a significant part of the overall Git execution time. In a preliminary performance test this developer used a clean/smudge filter written in golang to filter 12,000 files. This process took 364s with the existing filter mechanism and 5s with the new mechanism. See details here: https://github.com/github/git-lfs/pull/1382 This patch adds the `filter.<driver>.process` string option which, if used, keeps the external filter process running and processes all blobs with the packet format (pkt-line) based protocol over standard input and standard output. The full protocol is explained in detail in `Documentation/gitattributes.txt`. A few key decisions: * The long running filter process is referred to as filter protocol version 2 because the existing single shot filter invocation is considered version 1. * Git sends a welcome message and expects a response right after the external filter process has started. This ensures that Git will not hang if a version 1 filter is incorrectly used with the filter.<driver>.process option for version 2 filters. In addition, Git can detect this kind of error and warn the user. * The status of a filter operation (e.g. "success" or "error) is set before the actual response and (if necessary!) re-set after the response. The advantage of this two step status response is that if the filter detects an error early, then the filter can communicate this and Git does not even need to create structures to read the response. * All status responses are pkt-line lists terminated with a flush packet. This allows us to send other status fields with the same protocol in the future. Helped-by: Martin-Louis Bright <mlbright@gmail.com> Reviewed-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-10-17 01:20:37 +02:00			`git checkout -- test test.t test.i &&`

			`echo "content-test2" >test2.o &&`
docs: warn about possible '=' in clean/smudge filter process values A pathname value in a clean/smudge filter process "key=value" pair can contain the '=' character (introduced in edcc858). Make the user aware of this issue in the docs, add a corresponding test case, and fix the issue in filter process value parser of the example implementation in contrib. Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-12-03 20:45:16 +01:00			`echo "content-test3 - filename with special characters" >"test3 '\''sq'\'',\$x=.o"`
Add 'ident' conversion. The 'ident' attribute set to path squashes "$ident:<any bytes except dollor sign>$" to "$ident$" upon checkin, and expands it to "$ident: <blob SHA-1> $" upon checkout. As we have two conversions that affect checkin/checkout paths, clarify how they interact with each other. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-22 04:09:02 +02:00			`'`

Use $Id$ as the ident attribute keyword rather than $ident$ to be consistent with other VCSs $Id$ is present already in SVN and CVS; it would mean that people converting their existing repositories won't have to make any changes to the source files should they want to make use of the ident attribute. Given that it's a feature that's meant to calm those very people, it seems obtuse to make them edit every file just to make use of it. I think that bzr uses $Id$; Mercurial has examples hooks for $Id$; monotone has $Id$ on its wishlist. I can't think of a good reason not to stick with the de-facto standard and call ours $Id$ instead of $ident$. Signed-off-by: Andy Parkins <andyparkins@gmail.com> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-05-14 15:37:25 +02:00			`script='s/^\$Id: \([0-9a-f]*\) \$/\1/p'`
Add 'ident' conversion. The 'ident' attribute set to path squashes "$ident:<any bytes except dollor sign>$" to "$ident$" upon checkin, and expands it to "$ident: <blob SHA-1> $" upon checkout. As we have two conversions that affect checkin/checkout paths, clarify how they interact with each other. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-22 04:09:02 +02:00
			`test_expect_success check '`

convert: modernize tests Use `test_config` to set the config, check that files are empty with `test_must_be_empty`, compare files with `test_cmp`, and remove spaces after ">" and "<". Please note that the "rot13" filter configured in "setup" keeps using `git config` instead of `test_config` because subsequent tests might depend on it. Reviewed-by: Stefan Beller <sbeller@google.com> Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-10-17 01:20:26 +02:00			`test_cmp test.o test &&`
			`test_cmp test.o test.t &&`
Add 'ident' conversion. The 'ident' attribute set to path squashes "$ident:<any bytes except dollor sign>$" to "$ident$" upon checkin, and expands it to "$ident: <blob SHA-1> $" upon checkout. As we have two conversions that affect checkin/checkout paths, clarify how they interact with each other. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-22 04:09:02 +02:00
			`# ident should be stripped in the repository`
			`git diff --raw --exit-code :test :test.i &&`
			`id=$(git rev-parse --verify :test) &&`
			`embedded=$(sed -ne "$script" test.i) &&`
t0021-conversion.sh: Test that the clean filter really cleans content. This test uses a rot13 filter, which is its own inverse. It tested only that the content was the same as the original after both the 'clean' and the 'smudge' filter were applied. This way it would not detect whether any filter was run at all. Hence, here we add another test that checks that the repository contained content that was processed by the 'clean' filter. Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-10-19 21:48:04 +02:00			`test "z$id" = "z$embedded" &&`

convert: modernize tests Use `test_config` to set the config, check that files are empty with `test_must_be_empty`, compare files with `test_cmp`, and remove spaces after ">" and "<". Please note that the "rot13" filter configured in "setup" keeps using `git config` instead of `test_config` because subsequent tests might depend on it. Reviewed-by: Stefan Beller <sbeller@google.com> Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-10-17 01:20:26 +02:00			`git cat-file blob :test.t >test.r &&`
t0021-conversion.sh: Test that the clean filter really cleans content. This test uses a rot13 filter, which is its own inverse. It tested only that the content was the same as the original after both the 'clean' and the 'smudge' filter were applied. This way it would not detect whether any filter was run at all. Hence, here we add another test that checks that the repository contained content that was processed by the 'clean' filter. Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-10-19 21:48:04 +02:00
convert: modernize tests Use `test_config` to set the config, check that files are empty with `test_must_be_empty`, compare files with `test_cmp`, and remove spaces after ">" and "<". Please note that the "rot13" filter configured in "setup" keeps using `git config` instead of `test_config` because subsequent tests might depend on it. Reviewed-by: Stefan Beller <sbeller@google.com> Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-10-17 01:20:26 +02:00			`./rot13.sh <test.o >test.t &&`
			`test_cmp test.r test.t`
Add 'ident' conversion. The 'ident' attribute set to path squashes "$ident:<any bytes except dollor sign>$" to "$ident$" upon checkin, and expands it to "$ident: <blob SHA-1> $" upon checkout. As we have two conversions that affect checkin/checkout paths, clarify how they interact with each other. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-22 04:09:02 +02:00			`'`

Add test case for $Id$ expanded in the repository This test case would have caught the bug fixed by revision c23290d5. It puts various forms of $Id$ into a file in the repository, without allowing git to collapse them to uniformity. Then enables the $Id$ expansion on checkout, and checks that what is checked out has coped with the various forms. Signed-off-by: Andy Parkins <andyparkins@gmail.com> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-05-27 12:52:11 +02:00			`# If an expanded ident ever gets into the repository, we want to make sure that`
			`# it is collapsed before being expanded again on checkout`
			`test_expect_success expanded_in_repo '`
			`{`
			`echo "File with expanded keywords"`
			`echo "\$Id\$"`
			`echo "\$Id:\$"`
			`echo "\$Id: 0000000000000000000000000000000000000000 \$"`
			`echo "\$Id: NoSpaceAtEnd\$"`
			`echo "\$Id:NoSpaceAtFront \$"`
			`echo "\$Id:NoSpaceAtEitherEnd\$"`
			`echo "\$Id: NoTerminatingSymbol"`
convert: Safer handling of $Id$ contraction. The code to contract $Id:xxxxx$ strings could eat an arbitrary amount of source text if the terminating $ was lost. It now refuses to contract $Id:xxxxx$ strings spanning multiple lines. Signed-off-by: Henrik Grubbström <grubba@grubba.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-04-06 14:46:37 +02:00			`echo "\$Id: Foreign Commit With Spaces \$"`
t0021: test application of both crlf and ident Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-05-25 03:02:48 +02:00			`} >expanded-keywords.0 &&`
Add test case for $Id$ expanded in the repository This test case would have caught the bug fixed by revision c23290d5. It puts various forms of $Id$ into a file in the repository, without allowing git to collapse them to uniformity. Then enables the $Id$ expansion on checkout, and checks that what is checked out has coped with the various forms. Signed-off-by: Andy Parkins <andyparkins@gmail.com> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-05-27 12:52:11 +02:00
t0021: test application of both crlf and ident Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-05-25 03:02:48 +02:00			`{`
			`cat expanded-keywords.0 &&`
			`printf "\$Id: NoTerminatingSymbolAtEOF"`
			`} >expanded-keywords &&`
			`cat expanded-keywords >expanded-keywords-crlf &&`
			`git add expanded-keywords expanded-keywords-crlf &&`
t0021-conversion.sh: fix NoTerminatingSymbolAtEOF test The last line of the test file "expanded-keywords" ended in a newline, which is a valid terminator for ident. Use printf instead of echo to omit it and thus really test if a file that ends unexpectedly in the middle of an ident tag is handled properly. Also take the oppertunity to calculate the expected ID dynamically instead of hardcoding it into the test script. This should make future changes easier. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-05-21 23:25:06 +02:00			`git commit -m "File with keywords expanded" &&`
			`id=$(git rev-parse --verify :expanded-keywords) &&`

Add test case for $Id$ expanded in the repository This test case would have caught the bug fixed by revision c23290d5. It puts various forms of $Id$ into a file in the repository, without allowing git to collapse them to uniformity. Then enables the $Id$ expansion on checkout, and checks that what is checked out has coped with the various forms. Signed-off-by: Andy Parkins <andyparkins@gmail.com> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-05-27 12:52:11 +02:00			`{`
			`echo "File with expanded keywords"`
t0021-conversion.sh: fix NoTerminatingSymbolAtEOF test The last line of the test file "expanded-keywords" ended in a newline, which is a valid terminator for ident. Use printf instead of echo to omit it and thus really test if a file that ends unexpectedly in the middle of an ident tag is handled properly. Also take the oppertunity to calculate the expected ID dynamically instead of hardcoding it into the test script. This should make future changes easier. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-05-21 23:25:06 +02:00			`echo "\$Id: $id \$"`
			`echo "\$Id: $id \$"`
			`echo "\$Id: $id \$"`
			`echo "\$Id: $id \$"`
			`echo "\$Id: $id \$"`
			`echo "\$Id: $id \$"`
Add test case for $Id$ expanded in the repository This test case would have caught the bug fixed by revision c23290d5. It puts various forms of $Id$ into a file in the repository, without allowing git to collapse them to uniformity. Then enables the $Id$ expansion on checkout, and checks that what is checked out has coped with the various forms. Signed-off-by: Andy Parkins <andyparkins@gmail.com> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-05-27 12:52:11 +02:00			`echo "\$Id: NoTerminatingSymbol"`
convert: Keep foreign $Id$ on checkout. If there are foreign $Id$ keywords in the repository, they are most likely there for a reason. Let's keep them on checkout (which is also what the documentation indicates). Foreign $Id$ keywords are now recognized by there being multiple space separated fields in $Id:xxxxx$. Signed-off-by: Henrik Grubbström <grubba@grubba.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-04-06 14:46:38 +02:00			`echo "\$Id: Foreign Commit With Spaces \$"`
t0021: test application of both crlf and ident Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-05-25 03:02:48 +02:00			`} >expected-output.0 &&`
			`{`
			`cat expected-output.0 &&`
t0021-conversion.sh: fix NoTerminatingSymbolAtEOF test The last line of the test file "expanded-keywords" ended in a newline, which is a valid terminator for ident. Use printf instead of echo to omit it and thus really test if a file that ends unexpectedly in the middle of an ident tag is handled properly. Also take the oppertunity to calculate the expected ID dynamically instead of hardcoding it into the test script. This should make future changes easier. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-05-21 23:25:06 +02:00			`printf "\$Id: NoTerminatingSymbolAtEOF"`
t0021: test application of both crlf and ident Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-05-25 03:02:48 +02:00			`} >expected-output &&`
			`{`
			`append_cr <expected-output.0 &&`
			`printf "\$Id: NoTerminatingSymbolAtEOF"`
			`} >expected-output-crlf &&`
			`{`
			`echo "expanded-keywords ident"`
			`echo "expanded-keywords-crlf ident text eol=crlf"`
			`} >>.gitattributes &&`
Add test case for $Id$ expanded in the repository This test case would have caught the bug fixed by revision c23290d5. It puts various forms of $Id$ into a file in the repository, without allowing git to collapse them to uniformity. Then enables the $Id$ expansion on checkout, and checks that what is checked out has coped with the various forms. Signed-off-by: Andy Parkins <andyparkins@gmail.com> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-05-27 12:52:11 +02:00
t0021: test application of both crlf and ident Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-05-25 03:02:48 +02:00			`rm -f expanded-keywords expanded-keywords-crlf &&`
Add test case for $Id$ expanded in the repository This test case would have caught the bug fixed by revision c23290d5. It puts various forms of $Id$ into a file in the repository, without allowing git to collapse them to uniformity. Then enables the $Id$ expansion on checkout, and checks that what is checked out has coped with the various forms. Signed-off-by: Andy Parkins <andyparkins@gmail.com> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-05-27 12:52:11 +02:00
			`git checkout -- expanded-keywords &&`
t0021: test application of both crlf and ident Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-05-25 03:02:48 +02:00			`test_cmp expanded-keywords expected-output &&`

			`git checkout -- expanded-keywords-crlf &&`
			`test_cmp expanded-keywords-crlf expected-output-crlf`
Add test case for $Id$ expanded in the repository This test case would have caught the bug fixed by revision c23290d5. It puts various forms of $Id$ into a file in the repository, without allowing git to collapse them to uniformity. Then enables the $Id$ expansion on checkout, and checks that what is checked out has coped with the various forms. Signed-off-by: Andy Parkins <andyparkins@gmail.com> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-05-27 12:52:11 +02:00			`'`

convert filter: supply path to external driver Filtering to support keyword expansion may need the name of the file being filtered. In particular, to support p4 keywords like $File: //depot/product/dir/script.sh $ the smudge filter needs to know the name of the file it is smudging. Allow "%f" in the custom filter command line specified in the configuration. This will be substituted by the filename inside a single-quote pair to be passed to the shell. Signed-off-by: Pete Wyckoff <pw@padd.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-12-22 15:40:13 +01:00			`# The use of %f in a filter definition is expanded to the path to`
			`# the filename being smudged or cleaned. It must be shell escaped.`
			`# First, set up some interesting file names and pet them in`
			`# .gitattributes.`
			`test_expect_success 'filter shell-escaped filenames' '`
			`cat >argc.sh <<-EOF &&`
			`#!$SHELL_PATH`
t0021: avoid getting filter killed with SIGPIPE The fake filter did not read from the standard input at all, which caused the calling side to die with SIGPIPE, depending on the timing. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-12-23 00:18:47 +01:00			`cat >/dev/null`
convert filter: supply path to external driver Filtering to support keyword expansion may need the name of the file being filtered. In particular, to support p4 keywords like $File: //depot/product/dir/script.sh $ the smudge filter needs to know the name of the file it is smudging. Allow "%f" in the custom filter command line specified in the configuration. This will be substituted by the filename inside a single-quote pair to be passed to the shell. Signed-off-by: Pete Wyckoff <pw@padd.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-12-22 15:40:13 +01:00			`echo argc: \$# "\$@"`
			`EOF`
			`normal=name-no-magic &&`
			`special="name with '\''sq'\'' and \$x" &&`
			`echo some test text >"$normal" &&`
			`echo some test text >"$special" &&`
			`git add "$normal" "$special" &&`
			`git commit -q -m "add files" &&`
			`echo "name* filter=argc" >.gitattributes &&`

			`# delete the files and check them out again, using a smudge filter`
			`# that will count the args and echo the command-line back to us`
convert: modernize tests Use `test_config` to set the config, check that files are empty with `test_must_be_empty`, compare files with `test_cmp`, and remove spaces after ">" and "<". Please note that the "rot13" filter configured in "setup" keeps using `git config` instead of `test_config` because subsequent tests might depend on it. Reviewed-by: Stefan Beller <sbeller@google.com> Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-10-17 01:20:26 +02:00			`test_config filter.argc.smudge "sh ./argc.sh %f" &&`
convert filter: supply path to external driver Filtering to support keyword expansion may need the name of the file being filtered. In particular, to support p4 keywords like $File: //depot/product/dir/script.sh $ the smudge filter needs to know the name of the file it is smudging. Allow "%f" in the custom filter command line specified in the configuration. This will be substituted by the filename inside a single-quote pair to be passed to the shell. Signed-off-by: Pete Wyckoff <pw@padd.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-12-22 15:40:13 +01:00			`rm "$normal" "$special" &&`
			`git checkout -- "$normal" "$special" &&`

			`# make sure argc.sh counted the right number of args`
			`echo "argc: 1 $normal" >expect &&`
			`test_cmp expect "$normal" &&`
			`echo "argc: 1 $special" >expect &&`
			`test_cmp expect "$special" &&`

			`# do the same thing, but with more args in the filter expression`
convert: modernize tests Use `test_config` to set the config, check that files are empty with `test_must_be_empty`, compare files with `test_cmp`, and remove spaces after ">" and "<". Please note that the "rot13" filter configured in "setup" keeps using `git config` instead of `test_config` because subsequent tests might depend on it. Reviewed-by: Stefan Beller <sbeller@google.com> Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-10-17 01:20:26 +02:00			`test_config filter.argc.smudge "sh ./argc.sh %f --my-extra-arg" &&`
convert filter: supply path to external driver Filtering to support keyword expansion may need the name of the file being filtered. In particular, to support p4 keywords like $File: //depot/product/dir/script.sh $ the smudge filter needs to know the name of the file it is smudging. Allow "%f" in the custom filter command line specified in the configuration. This will be substituted by the filename inside a single-quote pair to be passed to the shell. Signed-off-by: Pete Wyckoff <pw@padd.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-12-22 15:40:13 +01:00			`rm "$normal" "$special" &&`
			`git checkout -- "$normal" "$special" &&`

			`# make sure argc.sh counted the right number of args`
			`echo "argc: 2 $normal --my-extra-arg" >expect &&`
			`test_cmp expect "$normal" &&`
			`echo "argc: 2 $special --my-extra-arg" >expect &&`
			`test_cmp expect "$special" &&`
			`:`
			`'`

convert: stream from fd to required clean filter to reduce used address space The data is streamed to the filter process anyway. Better avoid mapping the file if possible. This is especially useful if a clean filter reduces the size, for example if it computes a sha1 for binary data, like git media. The file size that the previous implementation could handle was limited by the available address space; large files for example could not be handled with (32-bit) msysgit. The new implementation can filter files of any size as long as the filter output is small enough. The new code path is only taken if the filter is required. The filter consumes data directly from the fd. If it fails, the original data is not immediately available. The condition can easily be handled as a fatal error, which is expected for a required filter anyway. If the filter was not required, the condition would need to be handled in a different way, like seeking to 0 and reading the data. But this would require more restructuring of the code and is probably not worth it. The obvious approach of falling back to reading all data would not help achieving the main purpose of this patch, which is to handle large files with limited address space. If reading all data is an option, we can simply take the old code path right away and mmap the entire file. The environment variable GIT_MMAP_LIMIT, which has been introduced in a previous commit is used to test that the expected code path is taken. A related test that exercises required filters is modified to verify that the data actually has been modified on its way from the file system to the object store. Signed-off-by: Steffen Prohaska <prohaska@zib.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-08-26 17:23:25 +02:00			`test_expect_success 'required filter should filter data' '`
convert: modernize tests Use `test_config` to set the config, check that files are empty with `test_must_be_empty`, compare files with `test_cmp`, and remove spaces after ">" and "<". Please note that the "rot13" filter configured in "setup" keeps using `git config` instead of `test_config` because subsequent tests might depend on it. Reviewed-by: Stefan Beller <sbeller@google.com> Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-10-17 01:20:26 +02:00			`test_config filter.required.smudge ./rot13.sh &&`
			`test_config filter.required.clean ./rot13.sh &&`
			`test_config filter.required.required true &&`
Add a setting to require a filter to be successful By default, a missing filter driver or a failure from the filter driver is not an error, but merely makes the filter operation a no-op pass through. This is useful to massage the content into a shape that is more convenient for the platform, filesystem, and the user to use, and the content filter mechanism is not used to turn something unusable into usable. However, we could also use of the content filtering mechanism and store the content that cannot be directly used in the repository (e.g. a UUID that refers to the true content stored outside git, or an encrypted content) and turn it into a usable form upon checkout (e.g. download the external content, or decrypt the encrypted content). For such a use case, the content cannot be used when filter driver fails, and we need a way to tell Git to abort the whole operation for such a failing or missing filter driver. Add a new "filter.<driver>.required" configuration variable to mark the second use case. When it is set, git will abort the operation when the filter driver does not exist or exits with a non-zero status code. Signed-off-by: Jehan Bing <jehan@orb.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-02-17 02:19:03 +01:00
			`echo "*.r filter=required" >.gitattributes &&`

convert: stream from fd to required clean filter to reduce used address space The data is streamed to the filter process anyway. Better avoid mapping the file if possible. This is especially useful if a clean filter reduces the size, for example if it computes a sha1 for binary data, like git media. The file size that the previous implementation could handle was limited by the available address space; large files for example could not be handled with (32-bit) msysgit. The new implementation can filter files of any size as long as the filter output is small enough. The new code path is only taken if the filter is required. The filter consumes data directly from the fd. If it fails, the original data is not immediately available. The condition can easily be handled as a fatal error, which is expected for a required filter anyway. If the filter was not required, the condition would need to be handled in a different way, like seeking to 0 and reading the data. But this would require more restructuring of the code and is probably not worth it. The obvious approach of falling back to reading all data would not help achieving the main purpose of this patch, which is to handle large files with limited address space. If reading all data is an option, we can simply take the old code path right away and mmap the entire file. The environment variable GIT_MMAP_LIMIT, which has been introduced in a previous commit is used to test that the expected code path is taken. A related test that exercises required filters is modified to verify that the data actually has been modified on its way from the file system to the object store. Signed-off-by: Steffen Prohaska <prohaska@zib.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-08-26 17:23:25 +02:00			`cat test.o >test.r &&`
Add a setting to require a filter to be successful By default, a missing filter driver or a failure from the filter driver is not an error, but merely makes the filter operation a no-op pass through. This is useful to massage the content into a shape that is more convenient for the platform, filesystem, and the user to use, and the content filter mechanism is not used to turn something unusable into usable. However, we could also use of the content filtering mechanism and store the content that cannot be directly used in the repository (e.g. a UUID that refers to the true content stored outside git, or an encrypted content) and turn it into a usable form upon checkout (e.g. download the external content, or decrypt the encrypted content). For such a use case, the content cannot be used when filter driver fails, and we need a way to tell Git to abort the whole operation for such a failing or missing filter driver. Add a new "filter.<driver>.required" configuration variable to mark the second use case. When it is set, git will abort the operation when the filter driver does not exist or exits with a non-zero status code. Signed-off-by: Jehan Bing <jehan@orb.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-02-17 02:19:03 +01:00			`git add test.r &&`
convert: stream from fd to required clean filter to reduce used address space The data is streamed to the filter process anyway. Better avoid mapping the file if possible. This is especially useful if a clean filter reduces the size, for example if it computes a sha1 for binary data, like git media. The file size that the previous implementation could handle was limited by the available address space; large files for example could not be handled with (32-bit) msysgit. The new implementation can filter files of any size as long as the filter output is small enough. The new code path is only taken if the filter is required. The filter consumes data directly from the fd. If it fails, the original data is not immediately available. The condition can easily be handled as a fatal error, which is expected for a required filter anyway. If the filter was not required, the condition would need to be handled in a different way, like seeking to 0 and reading the data. But this would require more restructuring of the code and is probably not worth it. The obvious approach of falling back to reading all data would not help achieving the main purpose of this patch, which is to handle large files with limited address space. If reading all data is an option, we can simply take the old code path right away and mmap the entire file. The environment variable GIT_MMAP_LIMIT, which has been introduced in a previous commit is used to test that the expected code path is taken. A related test that exercises required filters is modified to verify that the data actually has been modified on its way from the file system to the object store. Signed-off-by: Steffen Prohaska <prohaska@zib.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-08-26 17:23:25 +02:00
Add a setting to require a filter to be successful By default, a missing filter driver or a failure from the filter driver is not an error, but merely makes the filter operation a no-op pass through. This is useful to massage the content into a shape that is more convenient for the platform, filesystem, and the user to use, and the content filter mechanism is not used to turn something unusable into usable. However, we could also use of the content filtering mechanism and store the content that cannot be directly used in the repository (e.g. a UUID that refers to the true content stored outside git, or an encrypted content) and turn it into a usable form upon checkout (e.g. download the external content, or decrypt the encrypted content). For such a use case, the content cannot be used when filter driver fails, and we need a way to tell Git to abort the whole operation for such a failing or missing filter driver. Add a new "filter.<driver>.required" configuration variable to mark the second use case. When it is set, git will abort the operation when the filter driver does not exist or exits with a non-zero status code. Signed-off-by: Jehan Bing <jehan@orb.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-02-17 02:19:03 +01:00			`rm -f test.r &&`
convert: stream from fd to required clean filter to reduce used address space The data is streamed to the filter process anyway. Better avoid mapping the file if possible. This is especially useful if a clean filter reduces the size, for example if it computes a sha1 for binary data, like git media. The file size that the previous implementation could handle was limited by the available address space; large files for example could not be handled with (32-bit) msysgit. The new implementation can filter files of any size as long as the filter output is small enough. The new code path is only taken if the filter is required. The filter consumes data directly from the fd. If it fails, the original data is not immediately available. The condition can easily be handled as a fatal error, which is expected for a required filter anyway. If the filter was not required, the condition would need to be handled in a different way, like seeking to 0 and reading the data. But this would require more restructuring of the code and is probably not worth it. The obvious approach of falling back to reading all data would not help achieving the main purpose of this patch, which is to handle large files with limited address space. If reading all data is an option, we can simply take the old code path right away and mmap the entire file. The environment variable GIT_MMAP_LIMIT, which has been introduced in a previous commit is used to test that the expected code path is taken. A related test that exercises required filters is modified to verify that the data actually has been modified on its way from the file system to the object store. Signed-off-by: Steffen Prohaska <prohaska@zib.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-08-26 17:23:25 +02:00			`git checkout -- test.r &&`
convert: modernize tests Use `test_config` to set the config, check that files are empty with `test_must_be_empty`, compare files with `test_cmp`, and remove spaces after ">" and "<". Please note that the "rot13" filter configured in "setup" keeps using `git config` instead of `test_config` because subsequent tests might depend on it. Reviewed-by: Stefan Beller <sbeller@google.com> Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-10-17 01:20:26 +02:00			`test_cmp test.o test.r &&`
convert: stream from fd to required clean filter to reduce used address space The data is streamed to the filter process anyway. Better avoid mapping the file if possible. This is especially useful if a clean filter reduces the size, for example if it computes a sha1 for binary data, like git media. The file size that the previous implementation could handle was limited by the available address space; large files for example could not be handled with (32-bit) msysgit. The new implementation can filter files of any size as long as the filter output is small enough. The new code path is only taken if the filter is required. The filter consumes data directly from the fd. If it fails, the original data is not immediately available. The condition can easily be handled as a fatal error, which is expected for a required filter anyway. If the filter was not required, the condition would need to be handled in a different way, like seeking to 0 and reading the data. But this would require more restructuring of the code and is probably not worth it. The obvious approach of falling back to reading all data would not help achieving the main purpose of this patch, which is to handle large files with limited address space. If reading all data is an option, we can simply take the old code path right away and mmap the entire file. The environment variable GIT_MMAP_LIMIT, which has been introduced in a previous commit is used to test that the expected code path is taken. A related test that exercises required filters is modified to verify that the data actually has been modified on its way from the file system to the object store. Signed-off-by: Steffen Prohaska <prohaska@zib.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-08-26 17:23:25 +02:00
			`./rot13.sh <test.o >expected &&`
			`git cat-file blob :test.r >actual &&`
convert: modernize tests Use `test_config` to set the config, check that files are empty with `test_must_be_empty`, compare files with `test_cmp`, and remove spaces after ">" and "<". Please note that the "rot13" filter configured in "setup" keeps using `git config` instead of `test_config` because subsequent tests might depend on it. Reviewed-by: Stefan Beller <sbeller@google.com> Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-10-17 01:20:26 +02:00			`test_cmp expected actual`
Add a setting to require a filter to be successful By default, a missing filter driver or a failure from the filter driver is not an error, but merely makes the filter operation a no-op pass through. This is useful to massage the content into a shape that is more convenient for the platform, filesystem, and the user to use, and the content filter mechanism is not used to turn something unusable into usable. However, we could also use of the content filtering mechanism and store the content that cannot be directly used in the repository (e.g. a UUID that refers to the true content stored outside git, or an encrypted content) and turn it into a usable form upon checkout (e.g. download the external content, or decrypt the encrypted content). For such a use case, the content cannot be used when filter driver fails, and we need a way to tell Git to abort the whole operation for such a failing or missing filter driver. Add a new "filter.<driver>.required" configuration variable to mark the second use case. When it is set, git will abort the operation when the filter driver does not exist or exits with a non-zero status code. Signed-off-by: Jehan Bing <jehan@orb.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-02-17 02:19:03 +01:00			`'`

			`test_expect_success 'required filter smudge failure' '`
convert: modernize tests Use `test_config` to set the config, check that files are empty with `test_must_be_empty`, compare files with `test_cmp`, and remove spaces after ">" and "<". Please note that the "rot13" filter configured in "setup" keeps using `git config` instead of `test_config` because subsequent tests might depend on it. Reviewed-by: Stefan Beller <sbeller@google.com> Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-10-17 01:20:26 +02:00			`test_config filter.failsmudge.smudge false &&`
			`test_config filter.failsmudge.clean cat &&`
			`test_config filter.failsmudge.required true &&`
Add a setting to require a filter to be successful By default, a missing filter driver or a failure from the filter driver is not an error, but merely makes the filter operation a no-op pass through. This is useful to massage the content into a shape that is more convenient for the platform, filesystem, and the user to use, and the content filter mechanism is not used to turn something unusable into usable. However, we could also use of the content filtering mechanism and store the content that cannot be directly used in the repository (e.g. a UUID that refers to the true content stored outside git, or an encrypted content) and turn it into a usable form upon checkout (e.g. download the external content, or decrypt the encrypted content). For such a use case, the content cannot be used when filter driver fails, and we need a way to tell Git to abort the whole operation for such a failing or missing filter driver. Add a new "filter.<driver>.required" configuration variable to mark the second use case. When it is set, git will abort the operation when the filter driver does not exist or exits with a non-zero status code. Signed-off-by: Jehan Bing <jehan@orb.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-02-17 02:19:03 +01:00
			`echo "*.fs filter=failsmudge" >.gitattributes &&`

			`echo test >test.fs &&`
			`git add test.fs &&`
			`rm -f test.fs &&`
			`test_must_fail git checkout -- test.fs`
			`'`

			`test_expect_success 'required filter clean failure' '`
convert: modernize tests Use `test_config` to set the config, check that files are empty with `test_must_be_empty`, compare files with `test_cmp`, and remove spaces after ">" and "<". Please note that the "rot13" filter configured in "setup" keeps using `git config` instead of `test_config` because subsequent tests might depend on it. Reviewed-by: Stefan Beller <sbeller@google.com> Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-10-17 01:20:26 +02:00			`test_config filter.failclean.smudge cat &&`
			`test_config filter.failclean.clean false &&`
			`test_config filter.failclean.required true &&`
Add a setting to require a filter to be successful By default, a missing filter driver or a failure from the filter driver is not an error, but merely makes the filter operation a no-op pass through. This is useful to massage the content into a shape that is more convenient for the platform, filesystem, and the user to use, and the content filter mechanism is not used to turn something unusable into usable. However, we could also use of the content filtering mechanism and store the content that cannot be directly used in the repository (e.g. a UUID that refers to the true content stored outside git, or an encrypted content) and turn it into a usable form upon checkout (e.g. download the external content, or decrypt the encrypted content). For such a use case, the content cannot be used when filter driver fails, and we need a way to tell Git to abort the whole operation for such a failing or missing filter driver. Add a new "filter.<driver>.required" configuration variable to mark the second use case. When it is set, git will abort the operation when the filter driver does not exist or exits with a non-zero status code. Signed-off-by: Jehan Bing <jehan@orb.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-02-17 02:19:03 +01:00
			`echo "*.fc filter=failclean" >.gitattributes &&`

			`echo test >test.fc &&`
			`test_must_fail git add test.fc`
			`'`

convert: stream from fd to required clean filter to reduce used address space The data is streamed to the filter process anyway. Better avoid mapping the file if possible. This is especially useful if a clean filter reduces the size, for example if it computes a sha1 for binary data, like git media. The file size that the previous implementation could handle was limited by the available address space; large files for example could not be handled with (32-bit) msysgit. The new implementation can filter files of any size as long as the filter output is small enough. The new code path is only taken if the filter is required. The filter consumes data directly from the fd. If it fails, the original data is not immediately available. The condition can easily be handled as a fatal error, which is expected for a required filter anyway. If the filter was not required, the condition would need to be handled in a different way, like seeking to 0 and reading the data. But this would require more restructuring of the code and is probably not worth it. The obvious approach of falling back to reading all data would not help achieving the main purpose of this patch, which is to handle large files with limited address space. If reading all data is an option, we can simply take the old code path right away and mmap the entire file. The environment variable GIT_MMAP_LIMIT, which has been introduced in a previous commit is used to test that the expected code path is taken. A related test that exercises required filters is modified to verify that the data actually has been modified on its way from the file system to the object store. Signed-off-by: Steffen Prohaska <prohaska@zib.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-08-26 17:23:25 +02:00			`test_expect_success 'filtering large input to small output should use little memory' '`
convert: modernize tests Use `test_config` to set the config, check that files are empty with `test_must_be_empty`, compare files with `test_cmp`, and remove spaces after ">" and "<". Please note that the "rot13" filter configured in "setup" keeps using `git config` instead of `test_config` because subsequent tests might depend on it. Reviewed-by: Stefan Beller <sbeller@google.com> Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-10-17 01:20:26 +02:00			`test_config filter.devnull.clean "cat >/dev/null" &&`
			`test_config filter.devnull.required true &&`
convert: stream from fd to required clean filter to reduce used address space The data is streamed to the filter process anyway. Better avoid mapping the file if possible. This is especially useful if a clean filter reduces the size, for example if it computes a sha1 for binary data, like git media. The file size that the previous implementation could handle was limited by the available address space; large files for example could not be handled with (32-bit) msysgit. The new implementation can filter files of any size as long as the filter output is small enough. The new code path is only taken if the filter is required. The filter consumes data directly from the fd. If it fails, the original data is not immediately available. The condition can easily be handled as a fatal error, which is expected for a required filter anyway. If the filter was not required, the condition would need to be handled in a different way, like seeking to 0 and reading the data. But this would require more restructuring of the code and is probably not worth it. The obvious approach of falling back to reading all data would not help achieving the main purpose of this patch, which is to handle large files with limited address space. If reading all data is an option, we can simply take the old code path right away and mmap the entire file. The environment variable GIT_MMAP_LIMIT, which has been introduced in a previous commit is used to test that the expected code path is taken. A related test that exercises required filters is modified to verify that the data actually has been modified on its way from the file system to the object store. Signed-off-by: Steffen Prohaska <prohaska@zib.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-08-26 17:23:25 +02:00			`for i in $(test_seq 1 30); do printf "%1048576d" 1; done >30MB &&`
			`echo "30MB filter=devnull" >.gitattributes &&`
			`GIT_MMAP_LIMIT=1m GIT_ALLOC_LIMIT=1m git add 30MB`
			`'`

filter_buffer_or_fd(): ignore EPIPE We are explicitly ignoring SIGPIPE, as we fully expect that the filter program may not read our output fully. Ignore EPIPE that may come from writing to it as well. A new test was stolen from Jeff's suggestion. Helped-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-05-19 20:08:23 +02:00			`test_expect_success 'filter that does not read is fine' '`
			`test-genrandom foo $((128 * 1024 + 1)) >big &&`
			`echo "big filter=epipe" >.gitattributes &&`
convert: modernize tests Use `test_config` to set the config, check that files are empty with `test_must_be_empty`, compare files with `test_cmp`, and remove spaces after ">" and "<". Please note that the "rot13" filter configured in "setup" keeps using `git config` instead of `test_config` because subsequent tests might depend on it. Reviewed-by: Stefan Beller <sbeller@google.com> Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-10-17 01:20:26 +02:00			`test_config filter.epipe.clean "echo xyzzy" &&`
filter_buffer_or_fd(): ignore EPIPE We are explicitly ignoring SIGPIPE, as we fully expect that the filter program may not read our output fully. Ignore EPIPE that may come from writing to it as well. A new test was stolen from Jeff's suggestion. Helped-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-05-19 20:08:23 +02:00			`git add big &&`
			`git cat-file blob :big >actual &&`
			`echo xyzzy >expect &&`
			`test_cmp expect actual`
			`'`

xread, xwrite: limit size of IO to 8MB Checking out 2GB or more through an external filter (see test) fails on Mac OS X 10.8.4 (12E55) for a 64-bit executable with: error: read from external filter cat failed error: cannot feed the input to external filter cat error: cat died of signal 13 error: external filter cat failed 141 error: external filter cat failed The reason is that read() immediately returns with EINVAL when asked to read more than 2GB. According to POSIX [1], if the value of nbyte passed to read() is greater than SSIZE_MAX, the result is implementation-defined. The write function has the same restriction [2]. Since OS X still supports running 32-bit executables, the 32-bit limit (SSIZE_MAX = INT_MAX = 2GB - 1) seems to be also imposed on 64-bit executables under certain conditions. For write, the problem has been addressed earlier [6c642a]. Address the problem for read() and write() differently, by limiting size of IO chunks unconditionally on all platforms in xread() and xwrite(). Large chunks only cause problems, like causing latencies when killing the process, even if OS X was not buggy. Doing IO in reasonably sized smaller chunks should have no negative impact on performance. The compat wrapper clipped_write() introduced earlier [6c642a] is not needed anymore. It will be reverted in a separate commit. The new test catches read and write problems. Note that 'git add' exits with 0 even if it prints filtering errors to stderr. The test, therefore, checks stderr. 'git add' should probably be changed (sometime in another commit) to exit with nonzero if filtering fails. The test could then be changed to use test_must_fail. Thanks to the following people for suggestions and testing: Johannes Sixt <j6t@kdbg.org> John Keeping <john@keeping.me.uk> Jonathan Nieder <jrnieder@gmail.com> Kyle J. McKay <mackyle@gmail.com> Linus Torvalds <torvalds@linux-foundation.org> Torsten Bögershausen <tboegi@web.de> [1] http://pubs.opengroup.org/onlinepubs/009695399/functions/read.html [2] http://pubs.opengroup.org/onlinepubs/009695399/functions/write.html [6c642a] commit 6c642a878688adf46b226903858b53e2d31ac5c3 compate/clipped-write.c: large write(2) fails on Mac OS X/XNU Signed-off-by: Steffen Prohaska <prohaska@zib.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-08-20 08:43:54 +02:00			`test_expect_success EXPENSIVE 'filter large file' '`
convert: modernize tests Use `test_config` to set the config, check that files are empty with `test_must_be_empty`, compare files with `test_cmp`, and remove spaces after ">" and "<". Please note that the "rot13" filter configured in "setup" keeps using `git config` instead of `test_config` because subsequent tests might depend on it. Reviewed-by: Stefan Beller <sbeller@google.com> Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-10-17 01:20:26 +02:00			`test_config filter.largefile.smudge cat &&`
			`test_config filter.largefile.clean cat &&`
xread, xwrite: limit size of IO to 8MB Checking out 2GB or more through an external filter (see test) fails on Mac OS X 10.8.4 (12E55) for a 64-bit executable with: error: read from external filter cat failed error: cannot feed the input to external filter cat error: cat died of signal 13 error: external filter cat failed 141 error: external filter cat failed The reason is that read() immediately returns with EINVAL when asked to read more than 2GB. According to POSIX [1], if the value of nbyte passed to read() is greater than SSIZE_MAX, the result is implementation-defined. The write function has the same restriction [2]. Since OS X still supports running 32-bit executables, the 32-bit limit (SSIZE_MAX = INT_MAX = 2GB - 1) seems to be also imposed on 64-bit executables under certain conditions. For write, the problem has been addressed earlier [6c642a]. Address the problem for read() and write() differently, by limiting size of IO chunks unconditionally on all platforms in xread() and xwrite(). Large chunks only cause problems, like causing latencies when killing the process, even if OS X was not buggy. Doing IO in reasonably sized smaller chunks should have no negative impact on performance. The compat wrapper clipped_write() introduced earlier [6c642a] is not needed anymore. It will be reverted in a separate commit. The new test catches read and write problems. Note that 'git add' exits with 0 even if it prints filtering errors to stderr. The test, therefore, checks stderr. 'git add' should probably be changed (sometime in another commit) to exit with nonzero if filtering fails. The test could then be changed to use test_must_fail. Thanks to the following people for suggestions and testing: Johannes Sixt <j6t@kdbg.org> John Keeping <john@keeping.me.uk> Jonathan Nieder <jrnieder@gmail.com> Kyle J. McKay <mackyle@gmail.com> Linus Torvalds <torvalds@linux-foundation.org> Torsten Bögershausen <tboegi@web.de> [1] http://pubs.opengroup.org/onlinepubs/009695399/functions/read.html [2] http://pubs.opengroup.org/onlinepubs/009695399/functions/write.html [6c642a] commit 6c642a878688adf46b226903858b53e2d31ac5c3 compate/clipped-write.c: large write(2) fails on Mac OS X/XNU Signed-off-by: Steffen Prohaska <prohaska@zib.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-08-20 08:43:54 +02:00			`for i in $(test_seq 1 2048); do printf "%1048576d" 1; done >2GB &&`
			`echo "2GB filter=largefile" >.gitattributes &&`
			`git add 2GB 2>err &&`
convert: modernize tests Use `test_config` to set the config, check that files are empty with `test_must_be_empty`, compare files with `test_cmp`, and remove spaces after ">" and "<". Please note that the "rot13" filter configured in "setup" keeps using `git config` instead of `test_config` because subsequent tests might depend on it. Reviewed-by: Stefan Beller <sbeller@google.com> Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-10-17 01:20:26 +02:00			`test_must_be_empty err &&`
xread, xwrite: limit size of IO to 8MB Checking out 2GB or more through an external filter (see test) fails on Mac OS X 10.8.4 (12E55) for a 64-bit executable with: error: read from external filter cat failed error: cannot feed the input to external filter cat error: cat died of signal 13 error: external filter cat failed 141 error: external filter cat failed The reason is that read() immediately returns with EINVAL when asked to read more than 2GB. According to POSIX [1], if the value of nbyte passed to read() is greater than SSIZE_MAX, the result is implementation-defined. The write function has the same restriction [2]. Since OS X still supports running 32-bit executables, the 32-bit limit (SSIZE_MAX = INT_MAX = 2GB - 1) seems to be also imposed on 64-bit executables under certain conditions. For write, the problem has been addressed earlier [6c642a]. Address the problem for read() and write() differently, by limiting size of IO chunks unconditionally on all platforms in xread() and xwrite(). Large chunks only cause problems, like causing latencies when killing the process, even if OS X was not buggy. Doing IO in reasonably sized smaller chunks should have no negative impact on performance. The compat wrapper clipped_write() introduced earlier [6c642a] is not needed anymore. It will be reverted in a separate commit. The new test catches read and write problems. Note that 'git add' exits with 0 even if it prints filtering errors to stderr. The test, therefore, checks stderr. 'git add' should probably be changed (sometime in another commit) to exit with nonzero if filtering fails. The test could then be changed to use test_must_fail. Thanks to the following people for suggestions and testing: Johannes Sixt <j6t@kdbg.org> John Keeping <john@keeping.me.uk> Jonathan Nieder <jrnieder@gmail.com> Kyle J. McKay <mackyle@gmail.com> Linus Torvalds <torvalds@linux-foundation.org> Torsten Bögershausen <tboegi@web.de> [1] http://pubs.opengroup.org/onlinepubs/009695399/functions/read.html [2] http://pubs.opengroup.org/onlinepubs/009695399/functions/write.html [6c642a] commit 6c642a878688adf46b226903858b53e2d31ac5c3 compate/clipped-write.c: large write(2) fails on Mac OS X/XNU Signed-off-by: Steffen Prohaska <prohaska@zib.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-08-20 08:43:54 +02:00			`rm -f 2GB &&`
			`git checkout -- 2GB 2>err &&`
convert: modernize tests Use `test_config` to set the config, check that files are empty with `test_must_be_empty`, compare files with `test_cmp`, and remove spaces after ">" and "<". Please note that the "rot13" filter configured in "setup" keeps using `git config` instead of `test_config` because subsequent tests might depend on it. Reviewed-by: Stefan Beller <sbeller@google.com> Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-10-17 01:20:26 +02:00			`test_must_be_empty err`
xread, xwrite: limit size of IO to 8MB Checking out 2GB or more through an external filter (see test) fails on Mac OS X 10.8.4 (12E55) for a 64-bit executable with: error: read from external filter cat failed error: cannot feed the input to external filter cat error: cat died of signal 13 error: external filter cat failed 141 error: external filter cat failed The reason is that read() immediately returns with EINVAL when asked to read more than 2GB. According to POSIX [1], if the value of nbyte passed to read() is greater than SSIZE_MAX, the result is implementation-defined. The write function has the same restriction [2]. Since OS X still supports running 32-bit executables, the 32-bit limit (SSIZE_MAX = INT_MAX = 2GB - 1) seems to be also imposed on 64-bit executables under certain conditions. For write, the problem has been addressed earlier [6c642a]. Address the problem for read() and write() differently, by limiting size of IO chunks unconditionally on all platforms in xread() and xwrite(). Large chunks only cause problems, like causing latencies when killing the process, even if OS X was not buggy. Doing IO in reasonably sized smaller chunks should have no negative impact on performance. The compat wrapper clipped_write() introduced earlier [6c642a] is not needed anymore. It will be reverted in a separate commit. The new test catches read and write problems. Note that 'git add' exits with 0 even if it prints filtering errors to stderr. The test, therefore, checks stderr. 'git add' should probably be changed (sometime in another commit) to exit with nonzero if filtering fails. The test could then be changed to use test_must_fail. Thanks to the following people for suggestions and testing: Johannes Sixt <j6t@kdbg.org> John Keeping <john@keeping.me.uk> Jonathan Nieder <jrnieder@gmail.com> Kyle J. McKay <mackyle@gmail.com> Linus Torvalds <torvalds@linux-foundation.org> Torsten Bögershausen <tboegi@web.de> [1] http://pubs.opengroup.org/onlinepubs/009695399/functions/read.html [2] http://pubs.opengroup.org/onlinepubs/009695399/functions/write.html [6c642a] commit 6c642a878688adf46b226903858b53e2d31ac5c3 compate/clipped-write.c: large write(2) fails on Mac OS X/XNU Signed-off-by: Steffen Prohaska <prohaska@zib.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-08-20 08:43:54 +02:00			`'`

sha1_file: pass empty buffer to index empty file `git add` of an empty file with a filter pops complaints from `copy_fd` about a bad file descriptor. This traces back to these lines in sha1_file.c:index_core: if (!size) { ret = index_mem(sha1, NULL, size, type, path, flags); The problem here is that content to be added to the index can be supplied from an fd, or from a memory buffer, or from a pathname. This call is supplying a NULL buffer pointer and a zero size. Downstream logic takes the complete absence of a buffer to mean the data is to be found elsewhere -- for instance, these, from convert.c: if (params->src) { write_err = (write_in_full(child_process.in, params->src, params->size) < 0); } else { write_err = copy_fd(params->fd, child_process.in); } ~If there's a buffer, write from that, otherwise the data must be coming from an open fd.~ Perfectly reasonable logic in a routine that's going to write from either a buffer or an fd. So change `index_core` to supply an empty buffer when indexing an empty file. There's a patch out there that instead changes the logic quoted above to take a `-1` fd to mean "use the buffer", but it seems to me that the distinction between a missing buffer and an empty one carries intrinsic semantics, where the logic change is adapting the code to handle incorrect arguments. Signed-off-by: Jim Hill <gjthill@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-05-18 02:41:45 +02:00			`test_expect_success "filter: clean empty file" '`
convert: modernize tests Use `test_config` to set the config, check that files are empty with `test_must_be_empty`, compare files with `test_cmp`, and remove spaces after ">" and "<". Please note that the "rot13" filter configured in "setup" keeps using `git config` instead of `test_config` because subsequent tests might depend on it. Reviewed-by: Stefan Beller <sbeller@google.com> Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-10-17 01:20:26 +02:00			`test_config filter.in-repo-header.clean "echo cleaned && cat" &&`
			`test_config filter.in-repo-header.smudge "sed 1d" &&`
sha1_file: pass empty buffer to index empty file `git add` of an empty file with a filter pops complaints from `copy_fd` about a bad file descriptor. This traces back to these lines in sha1_file.c:index_core: if (!size) { ret = index_mem(sha1, NULL, size, type, path, flags); The problem here is that content to be added to the index can be supplied from an fd, or from a memory buffer, or from a pathname. This call is supplying a NULL buffer pointer and a zero size. Downstream logic takes the complete absence of a buffer to mean the data is to be found elsewhere -- for instance, these, from convert.c: if (params->src) { write_err = (write_in_full(child_process.in, params->src, params->size) < 0); } else { write_err = copy_fd(params->fd, child_process.in); } ~If there's a buffer, write from that, otherwise the data must be coming from an open fd.~ Perfectly reasonable logic in a routine that's going to write from either a buffer or an fd. So change `index_core` to supply an empty buffer when indexing an empty file. There's a patch out there that instead changes the logic quoted above to take a `-1` fd to mean "use the buffer", but it seems to me that the distinction between a missing buffer and an empty one carries intrinsic semantics, where the logic change is adapting the code to handle incorrect arguments. Signed-off-by: Jim Hill <gjthill@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-05-18 02:41:45 +02:00
			`echo "empty-in-worktree filter=in-repo-header" >>.gitattributes &&`
			`>empty-in-worktree &&`

			`echo cleaned >expected &&`
			`git add empty-in-worktree &&`
			`git show :empty-in-worktree >actual &&`
			`test_cmp expected actual`
			`'`

			`test_expect_success "filter: smudge empty file" '`
convert: modernize tests Use `test_config` to set the config, check that files are empty with `test_must_be_empty`, compare files with `test_cmp`, and remove spaces after ">" and "<". Please note that the "rot13" filter configured in "setup" keeps using `git config` instead of `test_config` because subsequent tests might depend on it. Reviewed-by: Stefan Beller <sbeller@google.com> Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-10-17 01:20:26 +02:00			`test_config filter.empty-in-repo.clean "cat >/dev/null" &&`
			`test_config filter.empty-in-repo.smudge "echo smudged && cat" &&`
sha1_file: pass empty buffer to index empty file `git add` of an empty file with a filter pops complaints from `copy_fd` about a bad file descriptor. This traces back to these lines in sha1_file.c:index_core: if (!size) { ret = index_mem(sha1, NULL, size, type, path, flags); The problem here is that content to be added to the index can be supplied from an fd, or from a memory buffer, or from a pathname. This call is supplying a NULL buffer pointer and a zero size. Downstream logic takes the complete absence of a buffer to mean the data is to be found elsewhere -- for instance, these, from convert.c: if (params->src) { write_err = (write_in_full(child_process.in, params->src, params->size) < 0); } else { write_err = copy_fd(params->fd, child_process.in); } ~If there's a buffer, write from that, otherwise the data must be coming from an open fd.~ Perfectly reasonable logic in a routine that's going to write from either a buffer or an fd. So change `index_core` to supply an empty buffer when indexing an empty file. There's a patch out there that instead changes the logic quoted above to take a `-1` fd to mean "use the buffer", but it seems to me that the distinction between a missing buffer and an empty one carries intrinsic semantics, where the logic change is adapting the code to handle incorrect arguments. Signed-off-by: Jim Hill <gjthill@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-05-18 02:41:45 +02:00
			`echo "empty-in-repo filter=empty-in-repo" >>.gitattributes &&`
			`echo dead data walking >empty-in-repo &&`
			`git add empty-in-repo &&`

			`echo smudged >expected &&`
			`git checkout-index --prefix=filtered- empty-in-repo &&`
			`test_cmp expected filtered-empty-in-repo`
			`'`

convert: treat an empty string for clean/smudge filters as "cat" Once a lower-priority configuration file defines a clean or smudge filter, there is no convenient way to override it to produce as-is output. Even though the configuration mechanism implements "the last one wins" semantics, you cannot set them to an empty string and expect them to work, as apply_filter() would try to run the empty string as an external command and fail. The conversion is not done, but the function would still report a failure to convert. Even though resetting the variable to "cat" (i.e. pass the data back as-is and report success) is an obvious and a viable way to solve this, it is wasteful to spawn an external process just as a workaround. Instead, teach apply_filter() to treat an empty string as a no-op filter that always returns successfully its input as-is without conversion. Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-01-29 09:21:37 +01:00			`test_expect_success 'disable filter with empty override' '`
			`test_config_global filter.disable.smudge false &&`
			`test_config_global filter.disable.clean false &&`
			`test_config filter.disable.smudge false &&`
			`test_config filter.disable.clean false &&`

			`echo "*.disable filter=disable" >.gitattributes &&`

			`echo test >test.disable &&`
			`git -c filter.disable.clean= add test.disable 2>err &&`
			`test_must_be_empty err &&`
			`rm -f test.disable &&`
			`git -c filter.disable.smudge= checkout -- test.disable 2>err &&`
			`test_must_be_empty err`
			`'`

diff: do not reuse worktree files that need "clean" conversion When accessing a blob for a diff, we may try to reuse file contents in the working tree, under the theory that it is faster to mmap those file contents than it would be to extract the content from the object database. When we have to filter those contents, though, that assumption does not hold. Even for our internal conversions like CRLF, we have to allocate and fill a new buffer anyway. But much worse, for external clean filters we have to exec an arbitrary script, and we have no idea how expensive it may be to run. So let's skip this optimization when conversion into git's "clean" form is required. This applies whenever the "want_file" flag is false. When it's true, the caller actually wants the smudged worktree contents, which the reused file by definition already has (in fact, this is a key optimization going the other direction, since reusing the worktree file there lets us skip smudge filters). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-07-22 17:27:53 +02:00			`test_expect_success 'diff does not reuse worktree files that need cleaning' '`
			`test_config filter.counter.clean "echo . >>count; sed s/^/clean:/" &&`
			`echo "file filter=counter" >.gitattributes &&`
			`test_commit one file &&`
			`test_commit two file &&`

			`>count &&`
			`git diff-tree -p HEAD &&`
			`test_line_count = 0 count`
			`'`

convert: add filter.<driver>.process option Git's clean/smudge mechanism invokes an external filter process for every single blob that is affected by a filter. If Git filters a lot of blobs then the startup time of the external filter processes can become a significant part of the overall Git execution time. In a preliminary performance test this developer used a clean/smudge filter written in golang to filter 12,000 files. This process took 364s with the existing filter mechanism and 5s with the new mechanism. See details here: https://github.com/github/git-lfs/pull/1382 This patch adds the `filter.<driver>.process` string option which, if used, keeps the external filter process running and processes all blobs with the packet format (pkt-line) based protocol over standard input and standard output. The full protocol is explained in detail in `Documentation/gitattributes.txt`. A few key decisions: * The long running filter process is referred to as filter protocol version 2 because the existing single shot filter invocation is considered version 1. * Git sends a welcome message and expects a response right after the external filter process has started. This ensures that Git will not hang if a version 1 filter is incorrectly used with the filter.<driver>.process option for version 2 filters. In addition, Git can detect this kind of error and warn the user. * The status of a filter operation (e.g. "success" or "error) is set before the actual response and (if necessary!) re-set after the response. The advantage of this two step status response is that if the filter detects an error early, then the filter can communicate this and Git does not even need to create structures to read the response. * All status responses are pkt-line lists terminated with a flush packet. This allows us to send other status fields with the same protocol in the future. Helped-by: Martin-Louis Bright <mlbright@gmail.com> Reviewed-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-10-17 01:20:37 +02:00			`test_expect_success PERL 'required process filter should filter data' '`
t0021: make debug log file name configurable The "rot13-filter.pl" helper wrote its debug logs always to "rot13-filter.log". Make this configurable by defining the log file as first parameter of "rot13-filter.pl". This is useful if "rot13-filter.pl" is configured multiple times similar to the subsequent patch 'convert: add "status=delayed" to filter process protocol'. Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-06-01 10:22:00 +02:00			`test_config_global filter.protocol.process "rot13-filter.pl debug.log clean smudge" &&`
convert: add filter.<driver>.process option Git's clean/smudge mechanism invokes an external filter process for every single blob that is affected by a filter. If Git filters a lot of blobs then the startup time of the external filter processes can become a significant part of the overall Git execution time. In a preliminary performance test this developer used a clean/smudge filter written in golang to filter 12,000 files. This process took 364s with the existing filter mechanism and 5s with the new mechanism. See details here: https://github.com/github/git-lfs/pull/1382 This patch adds the `filter.<driver>.process` string option which, if used, keeps the external filter process running and processes all blobs with the packet format (pkt-line) based protocol over standard input and standard output. The full protocol is explained in detail in `Documentation/gitattributes.txt`. A few key decisions: * The long running filter process is referred to as filter protocol version 2 because the existing single shot filter invocation is considered version 1. * Git sends a welcome message and expects a response right after the external filter process has started. This ensures that Git will not hang if a version 1 filter is incorrectly used with the filter.<driver>.process option for version 2 filters. In addition, Git can detect this kind of error and warn the user. * The status of a filter operation (e.g. "success" or "error) is set before the actual response and (if necessary!) re-set after the response. The advantage of this two step status response is that if the filter detects an error early, then the filter can communicate this and Git does not even need to create structures to read the response. * All status responses are pkt-line lists terminated with a flush packet. This allows us to send other status fields with the same protocol in the future. Helped-by: Martin-Louis Bright <mlbright@gmail.com> Reviewed-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-10-17 01:20:37 +02:00			`test_config_global filter.protocol.required true &&`
			`rm -rf repo &&`
			`mkdir repo &&`
			`(`
			`cd repo &&`
			`git init &&`

			`echo "*.r filter=protocol" >.gitattributes &&`
			`git add . &&`
t0021: minor filter process test cleanup Remove superfluous .gitignore pattern and invalid '.' in `git commit` calls. Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-12-04 14:37:31 +01:00			`git commit -m "test commit 1" &&`
convert: add filter.<driver>.process option Git's clean/smudge mechanism invokes an external filter process for every single blob that is affected by a filter. If Git filters a lot of blobs then the startup time of the external filter processes can become a significant part of the overall Git execution time. In a preliminary performance test this developer used a clean/smudge filter written in golang to filter 12,000 files. This process took 364s with the existing filter mechanism and 5s with the new mechanism. See details here: https://github.com/github/git-lfs/pull/1382 This patch adds the `filter.<driver>.process` string option which, if used, keeps the external filter process running and processes all blobs with the packet format (pkt-line) based protocol over standard input and standard output. The full protocol is explained in detail in `Documentation/gitattributes.txt`. A few key decisions: * The long running filter process is referred to as filter protocol version 2 because the existing single shot filter invocation is considered version 1. * Git sends a welcome message and expects a response right after the external filter process has started. This ensures that Git will not hang if a version 1 filter is incorrectly used with the filter.<driver>.process option for version 2 filters. In addition, Git can detect this kind of error and warn the user. * The status of a filter operation (e.g. "success" or "error) is set before the actual response and (if necessary!) re-set after the response. The advantage of this two step status response is that if the filter detects an error early, then the filter can communicate this and Git does not even need to create structures to read the response. * All status responses are pkt-line lists terminated with a flush packet. This allows us to send other status fields with the same protocol in the future. Helped-by: Martin-Louis Bright <mlbright@gmail.com> Reviewed-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-10-17 01:20:37 +02:00			`git branch empty-branch &&`

			`cp "$TEST_ROOT/test.o" test.r &&`
			`cp "$TEST_ROOT/test2.o" test2.r &&`
			`mkdir testsubdir &&`
docs: warn about possible '=' in clean/smudge filter process values A pathname value in a clean/smudge filter process "key=value" pair can contain the '=' character (introduced in edcc858). Make the user aware of this issue in the docs, add a corresponding test case, and fix the issue in filter process value parser of the example implementation in contrib. Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-12-03 20:45:16 +01:00			`cp "$TEST_ROOT/test3 '\''sq'\'',\$x=.o" "testsubdir/test3 '\''sq'\'',\$x=.r" &&`
convert: add filter.<driver>.process option Git's clean/smudge mechanism invokes an external filter process for every single blob that is affected by a filter. If Git filters a lot of blobs then the startup time of the external filter processes can become a significant part of the overall Git execution time. In a preliminary performance test this developer used a clean/smudge filter written in golang to filter 12,000 files. This process took 364s with the existing filter mechanism and 5s with the new mechanism. See details here: https://github.com/github/git-lfs/pull/1382 This patch adds the `filter.<driver>.process` string option which, if used, keeps the external filter process running and processes all blobs with the packet format (pkt-line) based protocol over standard input and standard output. The full protocol is explained in detail in `Documentation/gitattributes.txt`. A few key decisions: * The long running filter process is referred to as filter protocol version 2 because the existing single shot filter invocation is considered version 1. * Git sends a welcome message and expects a response right after the external filter process has started. This ensures that Git will not hang if a version 1 filter is incorrectly used with the filter.<driver>.process option for version 2 filters. In addition, Git can detect this kind of error and warn the user. * The status of a filter operation (e.g. "success" or "error) is set before the actual response and (if necessary!) re-set after the response. The advantage of this two step status response is that if the filter detects an error early, then the filter can communicate this and Git does not even need to create structures to read the response. * All status responses are pkt-line lists terminated with a flush packet. This allows us to send other status fields with the same protocol in the future. Helped-by: Martin-Louis Bright <mlbright@gmail.com> Reviewed-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-10-17 01:20:37 +02:00			`>test4-empty.r &&`

			`S=$(file_size test.r) &&`
			`S2=$(file_size test2.r) &&`
docs: warn about possible '=' in clean/smudge filter process values A pathname value in a clean/smudge filter process "key=value" pair can contain the '=' character (introduced in edcc858). Make the user aware of this issue in the docs, add a corresponding test case, and fix the issue in filter process value parser of the example implementation in contrib. Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-12-03 20:45:16 +01:00			`S3=$(file_size "testsubdir/test3 '\''sq'\'',\$x=.r") &&`
convert: add filter.<driver>.process option Git's clean/smudge mechanism invokes an external filter process for every single blob that is affected by a filter. If Git filters a lot of blobs then the startup time of the external filter processes can become a significant part of the overall Git execution time. In a preliminary performance test this developer used a clean/smudge filter written in golang to filter 12,000 files. This process took 364s with the existing filter mechanism and 5s with the new mechanism. See details here: https://github.com/github/git-lfs/pull/1382 This patch adds the `filter.<driver>.process` string option which, if used, keeps the external filter process running and processes all blobs with the packet format (pkt-line) based protocol over standard input and standard output. The full protocol is explained in detail in `Documentation/gitattributes.txt`. A few key decisions: * The long running filter process is referred to as filter protocol version 2 because the existing single shot filter invocation is considered version 1. * Git sends a welcome message and expects a response right after the external filter process has started. This ensures that Git will not hang if a version 1 filter is incorrectly used with the filter.<driver>.process option for version 2 filters. In addition, Git can detect this kind of error and warn the user. * The status of a filter operation (e.g. "success" or "error) is set before the actual response and (if necessary!) re-set after the response. The advantage of this two step status response is that if the filter detects an error early, then the filter can communicate this and Git does not even need to create structures to read the response. * All status responses are pkt-line lists terminated with a flush packet. This allows us to send other status fields with the same protocol in the future. Helped-by: Martin-Louis Bright <mlbright@gmail.com> Reviewed-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-10-17 01:20:37 +02:00
			`filter_git add . &&`
			`cat >expected.log <<-EOF &&`
			`START`
			`init handshake complete`
			`IN: clean test.r $S [OK] -- OUT: $S . [OK]`
			`IN: clean test2.r $S2 [OK] -- OUT: $S2 . [OK]`
			`IN: clean test4-empty.r 0 [OK] -- OUT: 0 [OK]`
docs: warn about possible '=' in clean/smudge filter process values A pathname value in a clean/smudge filter process "key=value" pair can contain the '=' character (introduced in edcc858). Make the user aware of this issue in the docs, add a corresponding test case, and fix the issue in filter process value parser of the example implementation in contrib. Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-12-03 20:45:16 +01:00			`IN: clean testsubdir/test3 '\''sq'\'',\$x=.r $S3 [OK] -- OUT: $S3 . [OK]`
convert: add filter.<driver>.process option Git's clean/smudge mechanism invokes an external filter process for every single blob that is affected by a filter. If Git filters a lot of blobs then the startup time of the external filter processes can become a significant part of the overall Git execution time. In a preliminary performance test this developer used a clean/smudge filter written in golang to filter 12,000 files. This process took 364s with the existing filter mechanism and 5s with the new mechanism. See details here: https://github.com/github/git-lfs/pull/1382 This patch adds the `filter.<driver>.process` string option which, if used, keeps the external filter process running and processes all blobs with the packet format (pkt-line) based protocol over standard input and standard output. The full protocol is explained in detail in `Documentation/gitattributes.txt`. A few key decisions: * The long running filter process is referred to as filter protocol version 2 because the existing single shot filter invocation is considered version 1. * Git sends a welcome message and expects a response right after the external filter process has started. This ensures that Git will not hang if a version 1 filter is incorrectly used with the filter.<driver>.process option for version 2 filters. In addition, Git can detect this kind of error and warn the user. * The status of a filter operation (e.g. "success" or "error) is set before the actual response and (if necessary!) re-set after the response. The advantage of this two step status response is that if the filter detects an error early, then the filter can communicate this and Git does not even need to create structures to read the response. * All status responses are pkt-line lists terminated with a flush packet. This allows us to send other status fields with the same protocol in the future. Helped-by: Martin-Louis Bright <mlbright@gmail.com> Reviewed-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-10-17 01:20:37 +02:00			`STOP`
			`EOF`
t0021: make debug log file name configurable The "rot13-filter.pl" helper wrote its debug logs always to "rot13-filter.log". Make this configurable by defining the log file as first parameter of "rot13-filter.pl". This is useful if "rot13-filter.pl" is configured multiple times similar to the subsequent patch 'convert: add "status=delayed" to filter process protocol'. Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-06-01 10:22:00 +02:00			`test_cmp_count expected.log debug.log &&`
convert: add filter.<driver>.process option Git's clean/smudge mechanism invokes an external filter process for every single blob that is affected by a filter. If Git filters a lot of blobs then the startup time of the external filter processes can become a significant part of the overall Git execution time. In a preliminary performance test this developer used a clean/smudge filter written in golang to filter 12,000 files. This process took 364s with the existing filter mechanism and 5s with the new mechanism. See details here: https://github.com/github/git-lfs/pull/1382 This patch adds the `filter.<driver>.process` string option which, if used, keeps the external filter process running and processes all blobs with the packet format (pkt-line) based protocol over standard input and standard output. The full protocol is explained in detail in `Documentation/gitattributes.txt`. A few key decisions: * The long running filter process is referred to as filter protocol version 2 because the existing single shot filter invocation is considered version 1. * Git sends a welcome message and expects a response right after the external filter process has started. This ensures that Git will not hang if a version 1 filter is incorrectly used with the filter.<driver>.process option for version 2 filters. In addition, Git can detect this kind of error and warn the user. * The status of a filter operation (e.g. "success" or "error) is set before the actual response and (if necessary!) re-set after the response. The advantage of this two step status response is that if the filter detects an error early, then the filter can communicate this and Git does not even need to create structures to read the response. * All status responses are pkt-line lists terminated with a flush packet. This allows us to send other status fields with the same protocol in the future. Helped-by: Martin-Louis Bright <mlbright@gmail.com> Reviewed-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-10-17 01:20:37 +02:00
t0021: fix flaky test t0021.15 creates files, adds them to the index, and commits them. All this usually happens in a test run within the same second and Git cannot know if the files have been changed between `add` and `commit`. Thus, Git has to run the clean filter in both operations. Sometimes these invocations spread over two different seconds and Git can infer that the files were not changed between `add` and `commit` based on their modification timestamp. The test would fail as it expects the filter invocation. Remove this expectation to make the test stable. Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-12-18 13:37:48 +01:00			`git commit -m "test commit 2" &&`
docs: warn about possible '=' in clean/smudge filter process values A pathname value in a clean/smudge filter process "key=value" pair can contain the '=' character (introduced in edcc858). Make the user aware of this issue in the docs, add a corresponding test case, and fix the issue in filter process value parser of the example implementation in contrib. Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-12-03 20:45:16 +01:00			`rm -f test2.r "testsubdir/test3 '\''sq'\'',\$x=.r" &&`
convert: add filter.<driver>.process option Git's clean/smudge mechanism invokes an external filter process for every single blob that is affected by a filter. If Git filters a lot of blobs then the startup time of the external filter processes can become a significant part of the overall Git execution time. In a preliminary performance test this developer used a clean/smudge filter written in golang to filter 12,000 files. This process took 364s with the existing filter mechanism and 5s with the new mechanism. See details here: https://github.com/github/git-lfs/pull/1382 This patch adds the `filter.<driver>.process` string option which, if used, keeps the external filter process running and processes all blobs with the packet format (pkt-line) based protocol over standard input and standard output. The full protocol is explained in detail in `Documentation/gitattributes.txt`. A few key decisions: * The long running filter process is referred to as filter protocol version 2 because the existing single shot filter invocation is considered version 1. * Git sends a welcome message and expects a response right after the external filter process has started. This ensures that Git will not hang if a version 1 filter is incorrectly used with the filter.<driver>.process option for version 2 filters. In addition, Git can detect this kind of error and warn the user. * The status of a filter operation (e.g. "success" or "error) is set before the actual response and (if necessary!) re-set after the response. The advantage of this two step status response is that if the filter detects an error early, then the filter can communicate this and Git does not even need to create structures to read the response. * All status responses are pkt-line lists terminated with a flush packet. This allows us to send other status fields with the same protocol in the future. Helped-by: Martin-Louis Bright <mlbright@gmail.com> Reviewed-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-10-17 01:20:37 +02:00
			`filter_git checkout --quiet --no-progress . &&`
			`cat >expected.log <<-EOF &&`
			`START`
			`init handshake complete`
			`IN: smudge test2.r $S2 [OK] -- OUT: $S2 . [OK]`
docs: warn about possible '=' in clean/smudge filter process values A pathname value in a clean/smudge filter process "key=value" pair can contain the '=' character (introduced in edcc858). Make the user aware of this issue in the docs, add a corresponding test case, and fix the issue in filter process value parser of the example implementation in contrib. Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-12-03 20:45:16 +01:00			`IN: smudge testsubdir/test3 '\''sq'\'',\$x=.r $S3 [OK] -- OUT: $S3 . [OK]`
convert: add filter.<driver>.process option Git's clean/smudge mechanism invokes an external filter process for every single blob that is affected by a filter. If Git filters a lot of blobs then the startup time of the external filter processes can become a significant part of the overall Git execution time. In a preliminary performance test this developer used a clean/smudge filter written in golang to filter 12,000 files. This process took 364s with the existing filter mechanism and 5s with the new mechanism. See details here: https://github.com/github/git-lfs/pull/1382 This patch adds the `filter.<driver>.process` string option which, if used, keeps the external filter process running and processes all blobs with the packet format (pkt-line) based protocol over standard input and standard output. The full protocol is explained in detail in `Documentation/gitattributes.txt`. A few key decisions: * The long running filter process is referred to as filter protocol version 2 because the existing single shot filter invocation is considered version 1. * Git sends a welcome message and expects a response right after the external filter process has started. This ensures that Git will not hang if a version 1 filter is incorrectly used with the filter.<driver>.process option for version 2 filters. In addition, Git can detect this kind of error and warn the user. * The status of a filter operation (e.g. "success" or "error) is set before the actual response and (if necessary!) re-set after the response. The advantage of this two step status response is that if the filter detects an error early, then the filter can communicate this and Git does not even need to create structures to read the response. * All status responses are pkt-line lists terminated with a flush packet. This allows us to send other status fields with the same protocol in the future. Helped-by: Martin-Louis Bright <mlbright@gmail.com> Reviewed-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-10-17 01:20:37 +02:00			`STOP`
			`EOF`
t0021: make debug log file name configurable The "rot13-filter.pl" helper wrote its debug logs always to "rot13-filter.log". Make this configurable by defining the log file as first parameter of "rot13-filter.pl". This is useful if "rot13-filter.pl" is configured multiple times similar to the subsequent patch 'convert: add "status=delayed" to filter process protocol'. Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-06-01 10:22:00 +02:00			`test_cmp_exclude_clean expected.log debug.log &&`
convert: add filter.<driver>.process option Git's clean/smudge mechanism invokes an external filter process for every single blob that is affected by a filter. If Git filters a lot of blobs then the startup time of the external filter processes can become a significant part of the overall Git execution time. In a preliminary performance test this developer used a clean/smudge filter written in golang to filter 12,000 files. This process took 364s with the existing filter mechanism and 5s with the new mechanism. See details here: https://github.com/github/git-lfs/pull/1382 This patch adds the `filter.<driver>.process` string option which, if used, keeps the external filter process running and processes all blobs with the packet format (pkt-line) based protocol over standard input and standard output. The full protocol is explained in detail in `Documentation/gitattributes.txt`. A few key decisions: * The long running filter process is referred to as filter protocol version 2 because the existing single shot filter invocation is considered version 1. * Git sends a welcome message and expects a response right after the external filter process has started. This ensures that Git will not hang if a version 1 filter is incorrectly used with the filter.<driver>.process option for version 2 filters. In addition, Git can detect this kind of error and warn the user. * The status of a filter operation (e.g. "success" or "error) is set before the actual response and (if necessary!) re-set after the response. The advantage of this two step status response is that if the filter detects an error early, then the filter can communicate this and Git does not even need to create structures to read the response. * All status responses are pkt-line lists terminated with a flush packet. This allows us to send other status fields with the same protocol in the future. Helped-by: Martin-Louis Bright <mlbright@gmail.com> Reviewed-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-10-17 01:20:37 +02:00
			`filter_git checkout --quiet --no-progress empty-branch &&`
			`cat >expected.log <<-EOF &&`
			`START`
			`init handshake complete`
			`IN: clean test.r $S [OK] -- OUT: $S . [OK]`
			`STOP`
			`EOF`
t0021: make debug log file name configurable The "rot13-filter.pl" helper wrote its debug logs always to "rot13-filter.log". Make this configurable by defining the log file as first parameter of "rot13-filter.pl". This is useful if "rot13-filter.pl" is configured multiple times similar to the subsequent patch 'convert: add "status=delayed" to filter process protocol'. Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-06-01 10:22:00 +02:00			`test_cmp_exclude_clean expected.log debug.log &&`
convert: add filter.<driver>.process option Git's clean/smudge mechanism invokes an external filter process for every single blob that is affected by a filter. If Git filters a lot of blobs then the startup time of the external filter processes can become a significant part of the overall Git execution time. In a preliminary performance test this developer used a clean/smudge filter written in golang to filter 12,000 files. This process took 364s with the existing filter mechanism and 5s with the new mechanism. See details here: https://github.com/github/git-lfs/pull/1382 This patch adds the `filter.<driver>.process` string option which, if used, keeps the external filter process running and processes all blobs with the packet format (pkt-line) based protocol over standard input and standard output. The full protocol is explained in detail in `Documentation/gitattributes.txt`. A few key decisions: * The long running filter process is referred to as filter protocol version 2 because the existing single shot filter invocation is considered version 1. * Git sends a welcome message and expects a response right after the external filter process has started. This ensures that Git will not hang if a version 1 filter is incorrectly used with the filter.<driver>.process option for version 2 filters. In addition, Git can detect this kind of error and warn the user. * The status of a filter operation (e.g. "success" or "error) is set before the actual response and (if necessary!) re-set after the response. The advantage of this two step status response is that if the filter detects an error early, then the filter can communicate this and Git does not even need to create structures to read the response. * All status responses are pkt-line lists terminated with a flush packet. This allows us to send other status fields with the same protocol in the future. Helped-by: Martin-Louis Bright <mlbright@gmail.com> Reviewed-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-10-17 01:20:37 +02:00
			`filter_git checkout --quiet --no-progress master &&`
			`cat >expected.log <<-EOF &&`
			`START`
			`init handshake complete`
			`IN: smudge test.r $S [OK] -- OUT: $S . [OK]`
			`IN: smudge test2.r $S2 [OK] -- OUT: $S2 . [OK]`
			`IN: smudge test4-empty.r 0 [OK] -- OUT: 0 [OK]`
docs: warn about possible '=' in clean/smudge filter process values A pathname value in a clean/smudge filter process "key=value" pair can contain the '=' character (introduced in edcc858). Make the user aware of this issue in the docs, add a corresponding test case, and fix the issue in filter process value parser of the example implementation in contrib. Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-12-03 20:45:16 +01:00			`IN: smudge testsubdir/test3 '\''sq'\'',\$x=.r $S3 [OK] -- OUT: $S3 . [OK]`
convert: add filter.<driver>.process option Git's clean/smudge mechanism invokes an external filter process for every single blob that is affected by a filter. If Git filters a lot of blobs then the startup time of the external filter processes can become a significant part of the overall Git execution time. In a preliminary performance test this developer used a clean/smudge filter written in golang to filter 12,000 files. This process took 364s with the existing filter mechanism and 5s with the new mechanism. See details here: https://github.com/github/git-lfs/pull/1382 This patch adds the `filter.<driver>.process` string option which, if used, keeps the external filter process running and processes all blobs with the packet format (pkt-line) based protocol over standard input and standard output. The full protocol is explained in detail in `Documentation/gitattributes.txt`. A few key decisions: * The long running filter process is referred to as filter protocol version 2 because the existing single shot filter invocation is considered version 1. * Git sends a welcome message and expects a response right after the external filter process has started. This ensures that Git will not hang if a version 1 filter is incorrectly used with the filter.<driver>.process option for version 2 filters. In addition, Git can detect this kind of error and warn the user. * The status of a filter operation (e.g. "success" or "error) is set before the actual response and (if necessary!) re-set after the response. The advantage of this two step status response is that if the filter detects an error early, then the filter can communicate this and Git does not even need to create structures to read the response. * All status responses are pkt-line lists terminated with a flush packet. This allows us to send other status fields with the same protocol in the future. Helped-by: Martin-Louis Bright <mlbright@gmail.com> Reviewed-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-10-17 01:20:37 +02:00			`STOP`
			`EOF`
t0021: make debug log file name configurable The "rot13-filter.pl" helper wrote its debug logs always to "rot13-filter.log". Make this configurable by defining the log file as first parameter of "rot13-filter.pl". This is useful if "rot13-filter.pl" is configured multiple times similar to the subsequent patch 'convert: add "status=delayed" to filter process protocol'. Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-06-01 10:22:00 +02:00			`test_cmp_exclude_clean expected.log debug.log &&`
convert: add filter.<driver>.process option Git's clean/smudge mechanism invokes an external filter process for every single blob that is affected by a filter. If Git filters a lot of blobs then the startup time of the external filter processes can become a significant part of the overall Git execution time. In a preliminary performance test this developer used a clean/smudge filter written in golang to filter 12,000 files. This process took 364s with the existing filter mechanism and 5s with the new mechanism. See details here: https://github.com/github/git-lfs/pull/1382 This patch adds the `filter.<driver>.process` string option which, if used, keeps the external filter process running and processes all blobs with the packet format (pkt-line) based protocol over standard input and standard output. The full protocol is explained in detail in `Documentation/gitattributes.txt`. A few key decisions: * The long running filter process is referred to as filter protocol version 2 because the existing single shot filter invocation is considered version 1. * Git sends a welcome message and expects a response right after the external filter process has started. This ensures that Git will not hang if a version 1 filter is incorrectly used with the filter.<driver>.process option for version 2 filters. In addition, Git can detect this kind of error and warn the user. * The status of a filter operation (e.g. "success" or "error) is set before the actual response and (if necessary!) re-set after the response. The advantage of this two step status response is that if the filter detects an error early, then the filter can communicate this and Git does not even need to create structures to read the response. * All status responses are pkt-line lists terminated with a flush packet. This allows us to send other status fields with the same protocol in the future. Helped-by: Martin-Louis Bright <mlbright@gmail.com> Reviewed-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-10-17 01:20:37 +02:00
			`test_cmp_committed_rot13 "$TEST_ROOT/test.o" test.r &&`
			`test_cmp_committed_rot13 "$TEST_ROOT/test2.o" test2.r &&`
docs: warn about possible '=' in clean/smudge filter process values A pathname value in a clean/smudge filter process "key=value" pair can contain the '=' character (introduced in edcc858). Make the user aware of this issue in the docs, add a corresponding test case, and fix the issue in filter process value parser of the example implementation in contrib. Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-12-03 20:45:16 +01:00			`test_cmp_committed_rot13 "$TEST_ROOT/test3 '\''sq'\'',\$x=.o" "testsubdir/test3 '\''sq'\'',\$x=.r"`
convert: add filter.<driver>.process option Git's clean/smudge mechanism invokes an external filter process for every single blob that is affected by a filter. If Git filters a lot of blobs then the startup time of the external filter processes can become a significant part of the overall Git execution time. In a preliminary performance test this developer used a clean/smudge filter written in golang to filter 12,000 files. This process took 364s with the existing filter mechanism and 5s with the new mechanism. See details here: https://github.com/github/git-lfs/pull/1382 This patch adds the `filter.<driver>.process` string option which, if used, keeps the external filter process running and processes all blobs with the packet format (pkt-line) based protocol over standard input and standard output. The full protocol is explained in detail in `Documentation/gitattributes.txt`. A few key decisions: * The long running filter process is referred to as filter protocol version 2 because the existing single shot filter invocation is considered version 1. * Git sends a welcome message and expects a response right after the external filter process has started. This ensures that Git will not hang if a version 1 filter is incorrectly used with the filter.<driver>.process option for version 2 filters. In addition, Git can detect this kind of error and warn the user. * The status of a filter operation (e.g. "success" or "error) is set before the actual response and (if necessary!) re-set after the response. The advantage of this two step status response is that if the filter detects an error early, then the filter can communicate this and Git does not even need to create structures to read the response. * All status responses are pkt-line lists terminated with a flush packet. This allows us to send other status fields with the same protocol in the future. Helped-by: Martin-Louis Bright <mlbright@gmail.com> Reviewed-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-10-17 01:20:37 +02:00			`)`
			`'`

			`test_expect_success PERL 'required process filter takes precedence' '`
			`test_config_global filter.protocol.clean false &&`
t0021: make debug log file name configurable The "rot13-filter.pl" helper wrote its debug logs always to "rot13-filter.log". Make this configurable by defining the log file as first parameter of "rot13-filter.pl". This is useful if "rot13-filter.pl" is configured multiple times similar to the subsequent patch 'convert: add "status=delayed" to filter process protocol'. Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-06-01 10:22:00 +02:00			`test_config_global filter.protocol.process "rot13-filter.pl debug.log clean" &&`
convert: add filter.<driver>.process option Git's clean/smudge mechanism invokes an external filter process for every single blob that is affected by a filter. If Git filters a lot of blobs then the startup time of the external filter processes can become a significant part of the overall Git execution time. In a preliminary performance test this developer used a clean/smudge filter written in golang to filter 12,000 files. This process took 364s with the existing filter mechanism and 5s with the new mechanism. See details here: https://github.com/github/git-lfs/pull/1382 This patch adds the `filter.<driver>.process` string option which, if used, keeps the external filter process running and processes all blobs with the packet format (pkt-line) based protocol over standard input and standard output. The full protocol is explained in detail in `Documentation/gitattributes.txt`. A few key decisions: * The long running filter process is referred to as filter protocol version 2 because the existing single shot filter invocation is considered version 1. * Git sends a welcome message and expects a response right after the external filter process has started. This ensures that Git will not hang if a version 1 filter is incorrectly used with the filter.<driver>.process option for version 2 filters. In addition, Git can detect this kind of error and warn the user. * The status of a filter operation (e.g. "success" or "error) is set before the actual response and (if necessary!) re-set after the response. The advantage of this two step status response is that if the filter detects an error early, then the filter can communicate this and Git does not even need to create structures to read the response. * All status responses are pkt-line lists terminated with a flush packet. This allows us to send other status fields with the same protocol in the future. Helped-by: Martin-Louis Bright <mlbright@gmail.com> Reviewed-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-10-17 01:20:37 +02:00			`test_config_global filter.protocol.required true &&`
			`rm -rf repo &&`
			`mkdir repo &&`
			`(`
			`cd repo &&`
			`git init &&`

			`echo "*.r filter=protocol" >.gitattributes &&`
			`cp "$TEST_ROOT/test.o" test.r &&`
			`S=$(file_size test.r) &&`

			`# Check that the process filter is invoked here`
			`filter_git add . &&`
			`cat >expected.log <<-EOF &&`
			`START`
			`init handshake complete`
			`IN: clean test.r $S [OK] -- OUT: $S . [OK]`
			`STOP`
			`EOF`
t0021: make debug log file name configurable The "rot13-filter.pl" helper wrote its debug logs always to "rot13-filter.log". Make this configurable by defining the log file as first parameter of "rot13-filter.pl". This is useful if "rot13-filter.pl" is configured multiple times similar to the subsequent patch 'convert: add "status=delayed" to filter process protocol'. Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-06-01 10:22:00 +02:00			`test_cmp_count expected.log debug.log`
convert: add filter.<driver>.process option Git's clean/smudge mechanism invokes an external filter process for every single blob that is affected by a filter. If Git filters a lot of blobs then the startup time of the external filter processes can become a significant part of the overall Git execution time. In a preliminary performance test this developer used a clean/smudge filter written in golang to filter 12,000 files. This process took 364s with the existing filter mechanism and 5s with the new mechanism. See details here: https://github.com/github/git-lfs/pull/1382 This patch adds the `filter.<driver>.process` string option which, if used, keeps the external filter process running and processes all blobs with the packet format (pkt-line) based protocol over standard input and standard output. The full protocol is explained in detail in `Documentation/gitattributes.txt`. A few key decisions: * The long running filter process is referred to as filter protocol version 2 because the existing single shot filter invocation is considered version 1. * Git sends a welcome message and expects a response right after the external filter process has started. This ensures that Git will not hang if a version 1 filter is incorrectly used with the filter.<driver>.process option for version 2 filters. In addition, Git can detect this kind of error and warn the user. * The status of a filter operation (e.g. "success" or "error) is set before the actual response and (if necessary!) re-set after the response. The advantage of this two step status response is that if the filter detects an error early, then the filter can communicate this and Git does not even need to create structures to read the response. * All status responses are pkt-line lists terminated with a flush packet. This allows us to send other status fields with the same protocol in the future. Helped-by: Martin-Louis Bright <mlbright@gmail.com> Reviewed-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-10-17 01:20:37 +02:00			`)`
			`'`

			`test_expect_success PERL 'required process filter should be used only for "clean" operation only' '`
t0021: make debug log file name configurable The "rot13-filter.pl" helper wrote its debug logs always to "rot13-filter.log". Make this configurable by defining the log file as first parameter of "rot13-filter.pl". This is useful if "rot13-filter.pl" is configured multiple times similar to the subsequent patch 'convert: add "status=delayed" to filter process protocol'. Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-06-01 10:22:00 +02:00			`test_config_global filter.protocol.process "rot13-filter.pl debug.log clean" &&`
convert: add filter.<driver>.process option Git's clean/smudge mechanism invokes an external filter process for every single blob that is affected by a filter. If Git filters a lot of blobs then the startup time of the external filter processes can become a significant part of the overall Git execution time. In a preliminary performance test this developer used a clean/smudge filter written in golang to filter 12,000 files. This process took 364s with the existing filter mechanism and 5s with the new mechanism. See details here: https://github.com/github/git-lfs/pull/1382 This patch adds the `filter.<driver>.process` string option which, if used, keeps the external filter process running and processes all blobs with the packet format (pkt-line) based protocol over standard input and standard output. The full protocol is explained in detail in `Documentation/gitattributes.txt`. A few key decisions: * The long running filter process is referred to as filter protocol version 2 because the existing single shot filter invocation is considered version 1. * Git sends a welcome message and expects a response right after the external filter process has started. This ensures that Git will not hang if a version 1 filter is incorrectly used with the filter.<driver>.process option for version 2 filters. In addition, Git can detect this kind of error and warn the user. * The status of a filter operation (e.g. "success" or "error) is set before the actual response and (if necessary!) re-set after the response. The advantage of this two step status response is that if the filter detects an error early, then the filter can communicate this and Git does not even need to create structures to read the response. * All status responses are pkt-line lists terminated with a flush packet. This allows us to send other status fields with the same protocol in the future. Helped-by: Martin-Louis Bright <mlbright@gmail.com> Reviewed-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-10-17 01:20:37 +02:00			`rm -rf repo &&`
			`mkdir repo &&`
			`(`
			`cd repo &&`
			`git init &&`

			`echo "*.r filter=protocol" >.gitattributes &&`
			`cp "$TEST_ROOT/test.o" test.r &&`
			`S=$(file_size test.r) &&`

			`filter_git add . &&`
			`cat >expected.log <<-EOF &&`
			`START`
			`init handshake complete`
			`IN: clean test.r $S [OK] -- OUT: $S . [OK]`
			`STOP`
			`EOF`
t0021: make debug log file name configurable The "rot13-filter.pl" helper wrote its debug logs always to "rot13-filter.log". Make this configurable by defining the log file as first parameter of "rot13-filter.pl". This is useful if "rot13-filter.pl" is configured multiple times similar to the subsequent patch 'convert: add "status=delayed" to filter process protocol'. Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-06-01 10:22:00 +02:00			`test_cmp_count expected.log debug.log &&`
convert: add filter.<driver>.process option Git's clean/smudge mechanism invokes an external filter process for every single blob that is affected by a filter. If Git filters a lot of blobs then the startup time of the external filter processes can become a significant part of the overall Git execution time. In a preliminary performance test this developer used a clean/smudge filter written in golang to filter 12,000 files. This process took 364s with the existing filter mechanism and 5s with the new mechanism. See details here: https://github.com/github/git-lfs/pull/1382 This patch adds the `filter.<driver>.process` string option which, if used, keeps the external filter process running and processes all blobs with the packet format (pkt-line) based protocol over standard input and standard output. The full protocol is explained in detail in `Documentation/gitattributes.txt`. A few key decisions: * The long running filter process is referred to as filter protocol version 2 because the existing single shot filter invocation is considered version 1. * Git sends a welcome message and expects a response right after the external filter process has started. This ensures that Git will not hang if a version 1 filter is incorrectly used with the filter.<driver>.process option for version 2 filters. In addition, Git can detect this kind of error and warn the user. * The status of a filter operation (e.g. "success" or "error) is set before the actual response and (if necessary!) re-set after the response. The advantage of this two step status response is that if the filter detects an error early, then the filter can communicate this and Git does not even need to create structures to read the response. * All status responses are pkt-line lists terminated with a flush packet. This allows us to send other status fields with the same protocol in the future. Helped-by: Martin-Louis Bright <mlbright@gmail.com> Reviewed-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-10-17 01:20:37 +02:00
			`rm test.r &&`

			`filter_git checkout --quiet --no-progress . &&`
			`# If the filter would be used for "smudge", too, we would see`
			`# "IN: smudge test.r 57 [OK] -- OUT: 57 . [OK]" here`
			`cat >expected.log <<-EOF &&`
			`START`
			`init handshake complete`
			`STOP`
			`EOF`
t0021: make debug log file name configurable The "rot13-filter.pl" helper wrote its debug logs always to "rot13-filter.log". Make this configurable by defining the log file as first parameter of "rot13-filter.pl". This is useful if "rot13-filter.pl" is configured multiple times similar to the subsequent patch 'convert: add "status=delayed" to filter process protocol'. Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-06-01 10:22:00 +02:00			`test_cmp_exclude_clean expected.log debug.log`
convert: add filter.<driver>.process option Git's clean/smudge mechanism invokes an external filter process for every single blob that is affected by a filter. If Git filters a lot of blobs then the startup time of the external filter processes can become a significant part of the overall Git execution time. In a preliminary performance test this developer used a clean/smudge filter written in golang to filter 12,000 files. This process took 364s with the existing filter mechanism and 5s with the new mechanism. See details here: https://github.com/github/git-lfs/pull/1382 This patch adds the `filter.<driver>.process` string option which, if used, keeps the external filter process running and processes all blobs with the packet format (pkt-line) based protocol over standard input and standard output. The full protocol is explained in detail in `Documentation/gitattributes.txt`. A few key decisions: * The long running filter process is referred to as filter protocol version 2 because the existing single shot filter invocation is considered version 1. * Git sends a welcome message and expects a response right after the external filter process has started. This ensures that Git will not hang if a version 1 filter is incorrectly used with the filter.<driver>.process option for version 2 filters. In addition, Git can detect this kind of error and warn the user. * The status of a filter operation (e.g. "success" or "error) is set before the actual response and (if necessary!) re-set after the response. The advantage of this two step status response is that if the filter detects an error early, then the filter can communicate this and Git does not even need to create structures to read the response. * All status responses are pkt-line lists terminated with a flush packet. This allows us to send other status fields with the same protocol in the future. Helped-by: Martin-Louis Bright <mlbright@gmail.com> Reviewed-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-10-17 01:20:37 +02:00			`)`
			`'`

			`test_expect_success PERL 'required process filter should process multiple packets' '`
t0021: make debug log file name configurable The "rot13-filter.pl" helper wrote its debug logs always to "rot13-filter.log". Make this configurable by defining the log file as first parameter of "rot13-filter.pl". This is useful if "rot13-filter.pl" is configured multiple times similar to the subsequent patch 'convert: add "status=delayed" to filter process protocol'. Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-06-01 10:22:00 +02:00			`test_config_global filter.protocol.process "rot13-filter.pl debug.log clean smudge" &&`
convert: add filter.<driver>.process option Git's clean/smudge mechanism invokes an external filter process for every single blob that is affected by a filter. If Git filters a lot of blobs then the startup time of the external filter processes can become a significant part of the overall Git execution time. In a preliminary performance test this developer used a clean/smudge filter written in golang to filter 12,000 files. This process took 364s with the existing filter mechanism and 5s with the new mechanism. See details here: https://github.com/github/git-lfs/pull/1382 This patch adds the `filter.<driver>.process` string option which, if used, keeps the external filter process running and processes all blobs with the packet format (pkt-line) based protocol over standard input and standard output. The full protocol is explained in detail in `Documentation/gitattributes.txt`. A few key decisions: * The long running filter process is referred to as filter protocol version 2 because the existing single shot filter invocation is considered version 1. * Git sends a welcome message and expects a response right after the external filter process has started. This ensures that Git will not hang if a version 1 filter is incorrectly used with the filter.<driver>.process option for version 2 filters. In addition, Git can detect this kind of error and warn the user. * The status of a filter operation (e.g. "success" or "error) is set before the actual response and (if necessary!) re-set after the response. The advantage of this two step status response is that if the filter detects an error early, then the filter can communicate this and Git does not even need to create structures to read the response. * All status responses are pkt-line lists terminated with a flush packet. This allows us to send other status fields with the same protocol in the future. Helped-by: Martin-Louis Bright <mlbright@gmail.com> Reviewed-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-10-17 01:20:37 +02:00			`test_config_global filter.protocol.required true &&`

			`rm -rf repo &&`
			`mkdir repo &&`
			`(`
			`cd repo &&`
			`git init &&`

			`# Generate data requiring 1, 2, 3 packets`
			`S=65516 && # PKTLINE_DATA_MAXLEN -> Maximal size of a packet`
			`generate_random_characters $(($S )) 1pkt_1__.file &&`
			`generate_random_characters $(($S +1)) 2pkt_1+1.file &&`
			`generate_random_characters $(($S*2-1)) 2pkt_2-1.file &&`
			`generate_random_characters $(($S*2 )) 2pkt_2__.file &&`
			`generate_random_characters $(($S*2+1)) 3pkt_2+1.file &&`

			`for FILE in "$TEST_ROOT"/*.file`
			`do`
			`cp "$FILE" . &&`
t0021: put $TEST_ROOT in $PATH We create a rot13.sh script in the trash directory, but need to call it by its full path when we have moved our cwd to another directory. Let's just put $TEST_ROOT in our $PATH so that the script is always found. This is a minor convenience for rot13.sh, but will be a major one when we switch rot13-filter.pl to a script in the same directory, as it means we will not have to deal with shell quoting inside the filter-process config. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-11-02 19:18:25 +01:00			`rot13.sh <"$FILE" >"$FILE.rot13"`
convert: add filter.<driver>.process option Git's clean/smudge mechanism invokes an external filter process for every single blob that is affected by a filter. If Git filters a lot of blobs then the startup time of the external filter processes can become a significant part of the overall Git execution time. In a preliminary performance test this developer used a clean/smudge filter written in golang to filter 12,000 files. This process took 364s with the existing filter mechanism and 5s with the new mechanism. See details here: https://github.com/github/git-lfs/pull/1382 This patch adds the `filter.<driver>.process` string option which, if used, keeps the external filter process running and processes all blobs with the packet format (pkt-line) based protocol over standard input and standard output. The full protocol is explained in detail in `Documentation/gitattributes.txt`. A few key decisions: * The long running filter process is referred to as filter protocol version 2 because the existing single shot filter invocation is considered version 1. * Git sends a welcome message and expects a response right after the external filter process has started. This ensures that Git will not hang if a version 1 filter is incorrectly used with the filter.<driver>.process option for version 2 filters. In addition, Git can detect this kind of error and warn the user. * The status of a filter operation (e.g. "success" or "error) is set before the actual response and (if necessary!) re-set after the response. The advantage of this two step status response is that if the filter detects an error early, then the filter can communicate this and Git does not even need to create structures to read the response. * All status responses are pkt-line lists terminated with a flush packet. This allows us to send other status fields with the same protocol in the future. Helped-by: Martin-Louis Bright <mlbright@gmail.com> Reviewed-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-10-17 01:20:37 +02:00			`done &&`

			`echo "*.file filter=protocol" >.gitattributes &&`
			`filter_git add *.file .gitattributes &&`
			`cat >expected.log <<-EOF &&`
			`START`
			`init handshake complete`
			`IN: clean 1pkt_1__.file $(($S )) [OK] -- OUT: $(($S )) . [OK]`
			`IN: clean 2pkt_1+1.file $(($S +1)) [OK] -- OUT: $(($S +1)) .. [OK]`
			`IN: clean 2pkt_2-1.file $(($S2-1)) [OK] -- OUT: $(($S2-1)) .. [OK]`
			`IN: clean 2pkt_2__.file $(($S2 )) [OK] -- OUT: $(($S2 )) .. [OK]`
			`IN: clean 3pkt_2+1.file $(($S2+1)) [OK] -- OUT: $(($S2+1)) ... [OK]`
			`STOP`
			`EOF`
t0021: make debug log file name configurable The "rot13-filter.pl" helper wrote its debug logs always to "rot13-filter.log". Make this configurable by defining the log file as first parameter of "rot13-filter.pl". This is useful if "rot13-filter.pl" is configured multiple times similar to the subsequent patch 'convert: add "status=delayed" to filter process protocol'. Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-06-01 10:22:00 +02:00			`test_cmp_count expected.log debug.log &&`
convert: add filter.<driver>.process option Git's clean/smudge mechanism invokes an external filter process for every single blob that is affected by a filter. If Git filters a lot of blobs then the startup time of the external filter processes can become a significant part of the overall Git execution time. In a preliminary performance test this developer used a clean/smudge filter written in golang to filter 12,000 files. This process took 364s with the existing filter mechanism and 5s with the new mechanism. See details here: https://github.com/github/git-lfs/pull/1382 This patch adds the `filter.<driver>.process` string option which, if used, keeps the external filter process running and processes all blobs with the packet format (pkt-line) based protocol over standard input and standard output. The full protocol is explained in detail in `Documentation/gitattributes.txt`. A few key decisions: * The long running filter process is referred to as filter protocol version 2 because the existing single shot filter invocation is considered version 1. * Git sends a welcome message and expects a response right after the external filter process has started. This ensures that Git will not hang if a version 1 filter is incorrectly used with the filter.<driver>.process option for version 2 filters. In addition, Git can detect this kind of error and warn the user. * The status of a filter operation (e.g. "success" or "error) is set before the actual response and (if necessary!) re-set after the response. The advantage of this two step status response is that if the filter detects an error early, then the filter can communicate this and Git does not even need to create structures to read the response. * All status responses are pkt-line lists terminated with a flush packet. This allows us to send other status fields with the same protocol in the future. Helped-by: Martin-Louis Bright <mlbright@gmail.com> Reviewed-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-10-17 01:20:37 +02:00
			`rm -f *.file &&`

			`filter_git checkout --quiet --no-progress -- *.file &&`
			`cat >expected.log <<-EOF &&`
			`START`
			`init handshake complete`
			`IN: smudge 1pkt_1__.file $(($S )) [OK] -- OUT: $(($S )) . [OK]`
			`IN: smudge 2pkt_1+1.file $(($S +1)) [OK] -- OUT: $(($S +1)) .. [OK]`
			`IN: smudge 2pkt_2-1.file $(($S2-1)) [OK] -- OUT: $(($S2-1)) .. [OK]`
			`IN: smudge 2pkt_2__.file $(($S2 )) [OK] -- OUT: $(($S2 )) .. [OK]`
			`IN: smudge 3pkt_2+1.file $(($S2+1)) [OK] -- OUT: $(($S2+1)) ... [OK]`
			`STOP`
			`EOF`
t0021: make debug log file name configurable The "rot13-filter.pl" helper wrote its debug logs always to "rot13-filter.log". Make this configurable by defining the log file as first parameter of "rot13-filter.pl". This is useful if "rot13-filter.pl" is configured multiple times similar to the subsequent patch 'convert: add "status=delayed" to filter process protocol'. Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-06-01 10:22:00 +02:00			`test_cmp_exclude_clean expected.log debug.log &&`
convert: add filter.<driver>.process option Git's clean/smudge mechanism invokes an external filter process for every single blob that is affected by a filter. If Git filters a lot of blobs then the startup time of the external filter processes can become a significant part of the overall Git execution time. In a preliminary performance test this developer used a clean/smudge filter written in golang to filter 12,000 files. This process took 364s with the existing filter mechanism and 5s with the new mechanism. See details here: https://github.com/github/git-lfs/pull/1382 This patch adds the `filter.<driver>.process` string option which, if used, keeps the external filter process running and processes all blobs with the packet format (pkt-line) based protocol over standard input and standard output. The full protocol is explained in detail in `Documentation/gitattributes.txt`. A few key decisions: * The long running filter process is referred to as filter protocol version 2 because the existing single shot filter invocation is considered version 1. * Git sends a welcome message and expects a response right after the external filter process has started. This ensures that Git will not hang if a version 1 filter is incorrectly used with the filter.<driver>.process option for version 2 filters. In addition, Git can detect this kind of error and warn the user. * The status of a filter operation (e.g. "success" or "error) is set before the actual response and (if necessary!) re-set after the response. The advantage of this two step status response is that if the filter detects an error early, then the filter can communicate this and Git does not even need to create structures to read the response. * All status responses are pkt-line lists terminated with a flush packet. This allows us to send other status fields with the same protocol in the future. Helped-by: Martin-Louis Bright <mlbright@gmail.com> Reviewed-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-10-17 01:20:37 +02:00
			`for FILE in *.file`
			`do`
			`test_cmp_committed_rot13 "$TEST_ROOT/$FILE" $FILE`
			`done`
			`)`
			`'`

			`test_expect_success PERL 'required process filter with clean error should fail' '`
t0021: make debug log file name configurable The "rot13-filter.pl" helper wrote its debug logs always to "rot13-filter.log". Make this configurable by defining the log file as first parameter of "rot13-filter.pl". This is useful if "rot13-filter.pl" is configured multiple times similar to the subsequent patch 'convert: add "status=delayed" to filter process protocol'. Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-06-01 10:22:00 +02:00			`test_config_global filter.protocol.process "rot13-filter.pl debug.log clean smudge" &&`
convert: add filter.<driver>.process option Git's clean/smudge mechanism invokes an external filter process for every single blob that is affected by a filter. If Git filters a lot of blobs then the startup time of the external filter processes can become a significant part of the overall Git execution time. In a preliminary performance test this developer used a clean/smudge filter written in golang to filter 12,000 files. This process took 364s with the existing filter mechanism and 5s with the new mechanism. See details here: https://github.com/github/git-lfs/pull/1382 This patch adds the `filter.<driver>.process` string option which, if used, keeps the external filter process running and processes all blobs with the packet format (pkt-line) based protocol over standard input and standard output. The full protocol is explained in detail in `Documentation/gitattributes.txt`. A few key decisions: * The long running filter process is referred to as filter protocol version 2 because the existing single shot filter invocation is considered version 1. * Git sends a welcome message and expects a response right after the external filter process has started. This ensures that Git will not hang if a version 1 filter is incorrectly used with the filter.<driver>.process option for version 2 filters. In addition, Git can detect this kind of error and warn the user. * The status of a filter operation (e.g. "success" or "error) is set before the actual response and (if necessary!) re-set after the response. The advantage of this two step status response is that if the filter detects an error early, then the filter can communicate this and Git does not even need to create structures to read the response. * All status responses are pkt-line lists terminated with a flush packet. This allows us to send other status fields with the same protocol in the future. Helped-by: Martin-Louis Bright <mlbright@gmail.com> Reviewed-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-10-17 01:20:37 +02:00			`test_config_global filter.protocol.required true &&`
			`rm -rf repo &&`
			`mkdir repo &&`
			`(`
			`cd repo &&`
			`git init &&`

			`echo "*.r filter=protocol" >.gitattributes &&`

			`cp "$TEST_ROOT/test.o" test.r &&`
			`echo "this is going to fail" >clean-write-fail.r &&`
			`echo "content-test3-subdir" >test3.r &&`

			`test_must_fail git add .`
			`)`
			`'`

			`test_expect_success PERL 'process filter should restart after unexpected write failure' '`
t0021: make debug log file name configurable The "rot13-filter.pl" helper wrote its debug logs always to "rot13-filter.log". Make this configurable by defining the log file as first parameter of "rot13-filter.pl". This is useful if "rot13-filter.pl" is configured multiple times similar to the subsequent patch 'convert: add "status=delayed" to filter process protocol'. Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-06-01 10:22:00 +02:00			`test_config_global filter.protocol.process "rot13-filter.pl debug.log clean smudge" &&`
convert: add filter.<driver>.process option Git's clean/smudge mechanism invokes an external filter process for every single blob that is affected by a filter. If Git filters a lot of blobs then the startup time of the external filter processes can become a significant part of the overall Git execution time. In a preliminary performance test this developer used a clean/smudge filter written in golang to filter 12,000 files. This process took 364s with the existing filter mechanism and 5s with the new mechanism. See details here: https://github.com/github/git-lfs/pull/1382 This patch adds the `filter.<driver>.process` string option which, if used, keeps the external filter process running and processes all blobs with the packet format (pkt-line) based protocol over standard input and standard output. The full protocol is explained in detail in `Documentation/gitattributes.txt`. A few key decisions: * The long running filter process is referred to as filter protocol version 2 because the existing single shot filter invocation is considered version 1. * Git sends a welcome message and expects a response right after the external filter process has started. This ensures that Git will not hang if a version 1 filter is incorrectly used with the filter.<driver>.process option for version 2 filters. In addition, Git can detect this kind of error and warn the user. * The status of a filter operation (e.g. "success" or "error) is set before the actual response and (if necessary!) re-set after the response. The advantage of this two step status response is that if the filter detects an error early, then the filter can communicate this and Git does not even need to create structures to read the response. * All status responses are pkt-line lists terminated with a flush packet. This allows us to send other status fields with the same protocol in the future. Helped-by: Martin-Louis Bright <mlbright@gmail.com> Reviewed-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-10-17 01:20:37 +02:00			`rm -rf repo &&`
			`mkdir repo &&`
			`(`
			`cd repo &&`
			`git init &&`

			`echo "*.r filter=protocol" >.gitattributes &&`

			`cp "$TEST_ROOT/test.o" test.r &&`
			`cp "$TEST_ROOT/test2.o" test2.r &&`
			`echo "this is going to fail" >smudge-write-fail.o &&`
			`cp smudge-write-fail.o smudge-write-fail.r &&`

			`S=$(file_size test.r) &&`
			`S2=$(file_size test2.r) &&`
			`SF=$(file_size smudge-write-fail.r) &&`

			`git add . &&`
			`rm -f *.r &&`

t0021: make debug log file name configurable The "rot13-filter.pl" helper wrote its debug logs always to "rot13-filter.log". Make this configurable by defining the log file as first parameter of "rot13-filter.pl". This is useful if "rot13-filter.pl" is configured multiple times similar to the subsequent patch 'convert: add "status=delayed" to filter process protocol'. Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-06-01 10:22:00 +02:00			`rm -f debug.log &&`
convert: add filter.<driver>.process option Git's clean/smudge mechanism invokes an external filter process for every single blob that is affected by a filter. If Git filters a lot of blobs then the startup time of the external filter processes can become a significant part of the overall Git execution time. In a preliminary performance test this developer used a clean/smudge filter written in golang to filter 12,000 files. This process took 364s with the existing filter mechanism and 5s with the new mechanism. See details here: https://github.com/github/git-lfs/pull/1382 This patch adds the `filter.<driver>.process` string option which, if used, keeps the external filter process running and processes all blobs with the packet format (pkt-line) based protocol over standard input and standard output. The full protocol is explained in detail in `Documentation/gitattributes.txt`. A few key decisions: * The long running filter process is referred to as filter protocol version 2 because the existing single shot filter invocation is considered version 1. * Git sends a welcome message and expects a response right after the external filter process has started. This ensures that Git will not hang if a version 1 filter is incorrectly used with the filter.<driver>.process option for version 2 filters. In addition, Git can detect this kind of error and warn the user. * The status of a filter operation (e.g. "success" or "error) is set before the actual response and (if necessary!) re-set after the response. The advantage of this two step status response is that if the filter detects an error early, then the filter can communicate this and Git does not even need to create structures to read the response. * All status responses are pkt-line lists terminated with a flush packet. This allows us to send other status fields with the same protocol in the future. Helped-by: Martin-Louis Bright <mlbright@gmail.com> Reviewed-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-10-17 01:20:37 +02:00			`git checkout --quiet --no-progress . 2>git-stderr.log &&`

			`grep "smudge write error at" git-stderr.log &&`
			`grep "error: external filter" git-stderr.log &&`

			`cat >expected.log <<-EOF &&`
			`START`
			`init handshake complete`
t0021: write "OUT <size>" only on success "rot13-filter.pl" always writes "OUT <size>" to the debug log at the end of a response. This works perfectly for the existing responses "abort", "error", and "success". A new response "delayed", that will be introduced in a subsequent patch, accepts the input without giving the filtered result right away. At this point we cannot know the size of the response. Therefore, we do not write "OUT <size>" for "delayed" responses. To simplify the code we do not write "OUT <size>" for "abort" and "error" responses either as their size is always zero. Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-06-28 23:29:49 +02:00			`IN: smudge smudge-write-fail.r $SF [OK] -- [WRITE FAIL]`
convert: add filter.<driver>.process option Git's clean/smudge mechanism invokes an external filter process for every single blob that is affected by a filter. If Git filters a lot of blobs then the startup time of the external filter processes can become a significant part of the overall Git execution time. In a preliminary performance test this developer used a clean/smudge filter written in golang to filter 12,000 files. This process took 364s with the existing filter mechanism and 5s with the new mechanism. See details here: https://github.com/github/git-lfs/pull/1382 This patch adds the `filter.<driver>.process` string option which, if used, keeps the external filter process running and processes all blobs with the packet format (pkt-line) based protocol over standard input and standard output. The full protocol is explained in detail in `Documentation/gitattributes.txt`. A few key decisions: * The long running filter process is referred to as filter protocol version 2 because the existing single shot filter invocation is considered version 1. * Git sends a welcome message and expects a response right after the external filter process has started. This ensures that Git will not hang if a version 1 filter is incorrectly used with the filter.<driver>.process option for version 2 filters. In addition, Git can detect this kind of error and warn the user. * The status of a filter operation (e.g. "success" or "error) is set before the actual response and (if necessary!) re-set after the response. The advantage of this two step status response is that if the filter detects an error early, then the filter can communicate this and Git does not even need to create structures to read the response. * All status responses are pkt-line lists terminated with a flush packet. This allows us to send other status fields with the same protocol in the future. Helped-by: Martin-Louis Bright <mlbright@gmail.com> Reviewed-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-10-17 01:20:37 +02:00			`START`
			`init handshake complete`
			`IN: smudge test.r $S [OK] -- OUT: $S . [OK]`
			`IN: smudge test2.r $S2 [OK] -- OUT: $S2 . [OK]`
			`STOP`
			`EOF`
t0021: make debug log file name configurable The "rot13-filter.pl" helper wrote its debug logs always to "rot13-filter.log". Make this configurable by defining the log file as first parameter of "rot13-filter.pl". This is useful if "rot13-filter.pl" is configured multiple times similar to the subsequent patch 'convert: add "status=delayed" to filter process protocol'. Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-06-01 10:22:00 +02:00			`test_cmp_exclude_clean expected.log debug.log &&`
convert: add filter.<driver>.process option Git's clean/smudge mechanism invokes an external filter process for every single blob that is affected by a filter. If Git filters a lot of blobs then the startup time of the external filter processes can become a significant part of the overall Git execution time. In a preliminary performance test this developer used a clean/smudge filter written in golang to filter 12,000 files. This process took 364s with the existing filter mechanism and 5s with the new mechanism. See details here: https://github.com/github/git-lfs/pull/1382 This patch adds the `filter.<driver>.process` string option which, if used, keeps the external filter process running and processes all blobs with the packet format (pkt-line) based protocol over standard input and standard output. The full protocol is explained in detail in `Documentation/gitattributes.txt`. A few key decisions: * The long running filter process is referred to as filter protocol version 2 because the existing single shot filter invocation is considered version 1. * Git sends a welcome message and expects a response right after the external filter process has started. This ensures that Git will not hang if a version 1 filter is incorrectly used with the filter.<driver>.process option for version 2 filters. In addition, Git can detect this kind of error and warn the user. * The status of a filter operation (e.g. "success" or "error) is set before the actual response and (if necessary!) re-set after the response. The advantage of this two step status response is that if the filter detects an error early, then the filter can communicate this and Git does not even need to create structures to read the response. * All status responses are pkt-line lists terminated with a flush packet. This allows us to send other status fields with the same protocol in the future. Helped-by: Martin-Louis Bright <mlbright@gmail.com> Reviewed-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-10-17 01:20:37 +02:00
			`test_cmp_committed_rot13 "$TEST_ROOT/test.o" test.r &&`
			`test_cmp_committed_rot13 "$TEST_ROOT/test2.o" test2.r &&`

			`# Smudge failed`
			`! test_cmp smudge-write-fail.o smudge-write-fail.r &&`
t0021: put $TEST_ROOT in $PATH We create a rot13.sh script in the trash directory, but need to call it by its full path when we have moved our cwd to another directory. Let's just put $TEST_ROOT in our $PATH so that the script is always found. This is a minor convenience for rot13.sh, but will be a major one when we switch rot13-filter.pl to a script in the same directory, as it means we will not have to deal with shell quoting inside the filter-process config. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-11-02 19:18:25 +01:00			`rot13.sh <smudge-write-fail.o >expected &&`
convert: add filter.<driver>.process option Git's clean/smudge mechanism invokes an external filter process for every single blob that is affected by a filter. If Git filters a lot of blobs then the startup time of the external filter processes can become a significant part of the overall Git execution time. In a preliminary performance test this developer used a clean/smudge filter written in golang to filter 12,000 files. This process took 364s with the existing filter mechanism and 5s with the new mechanism. See details here: https://github.com/github/git-lfs/pull/1382 This patch adds the `filter.<driver>.process` string option which, if used, keeps the external filter process running and processes all blobs with the packet format (pkt-line) based protocol over standard input and standard output. The full protocol is explained in detail in `Documentation/gitattributes.txt`. A few key decisions: * The long running filter process is referred to as filter protocol version 2 because the existing single shot filter invocation is considered version 1. * Git sends a welcome message and expects a response right after the external filter process has started. This ensures that Git will not hang if a version 1 filter is incorrectly used with the filter.<driver>.process option for version 2 filters. In addition, Git can detect this kind of error and warn the user. * The status of a filter operation (e.g. "success" or "error) is set before the actual response and (if necessary!) re-set after the response. The advantage of this two step status response is that if the filter detects an error early, then the filter can communicate this and Git does not even need to create structures to read the response. * All status responses are pkt-line lists terminated with a flush packet. This allows us to send other status fields with the same protocol in the future. Helped-by: Martin-Louis Bright <mlbright@gmail.com> Reviewed-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-10-17 01:20:37 +02:00			`git cat-file blob :smudge-write-fail.r >actual &&`
			`test_cmp expected actual`
			`)`
			`'`

			`test_expect_success PERL 'process filter should not be restarted if it signals an error' '`
t0021: make debug log file name configurable The "rot13-filter.pl" helper wrote its debug logs always to "rot13-filter.log". Make this configurable by defining the log file as first parameter of "rot13-filter.pl". This is useful if "rot13-filter.pl" is configured multiple times similar to the subsequent patch 'convert: add "status=delayed" to filter process protocol'. Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-06-01 10:22:00 +02:00			`test_config_global filter.protocol.process "rot13-filter.pl debug.log clean smudge" &&`
convert: add filter.<driver>.process option Git's clean/smudge mechanism invokes an external filter process for every single blob that is affected by a filter. If Git filters a lot of blobs then the startup time of the external filter processes can become a significant part of the overall Git execution time. In a preliminary performance test this developer used a clean/smudge filter written in golang to filter 12,000 files. This process took 364s with the existing filter mechanism and 5s with the new mechanism. See details here: https://github.com/github/git-lfs/pull/1382 This patch adds the `filter.<driver>.process` string option which, if used, keeps the external filter process running and processes all blobs with the packet format (pkt-line) based protocol over standard input and standard output. The full protocol is explained in detail in `Documentation/gitattributes.txt`. A few key decisions: * The long running filter process is referred to as filter protocol version 2 because the existing single shot filter invocation is considered version 1. * Git sends a welcome message and expects a response right after the external filter process has started. This ensures that Git will not hang if a version 1 filter is incorrectly used with the filter.<driver>.process option for version 2 filters. In addition, Git can detect this kind of error and warn the user. * The status of a filter operation (e.g. "success" or "error) is set before the actual response and (if necessary!) re-set after the response. The advantage of this two step status response is that if the filter detects an error early, then the filter can communicate this and Git does not even need to create structures to read the response. * All status responses are pkt-line lists terminated with a flush packet. This allows us to send other status fields with the same protocol in the future. Helped-by: Martin-Louis Bright <mlbright@gmail.com> Reviewed-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-10-17 01:20:37 +02:00			`rm -rf repo &&`
			`mkdir repo &&`
			`(`
			`cd repo &&`
			`git init &&`

			`echo "*.r filter=protocol" >.gitattributes &&`

			`cp "$TEST_ROOT/test.o" test.r &&`
			`cp "$TEST_ROOT/test2.o" test2.r &&`
			`echo "this will cause an error" >error.o &&`
			`cp error.o error.r &&`

			`S=$(file_size test.r) &&`
			`S2=$(file_size test2.r) &&`
			`SE=$(file_size error.r) &&`

			`git add . &&`
			`rm -f *.r &&`

			`filter_git checkout --quiet --no-progress . &&`
			`cat >expected.log <<-EOF &&`
			`START`
			`init handshake complete`
t0021: write "OUT <size>" only on success "rot13-filter.pl" always writes "OUT <size>" to the debug log at the end of a response. This works perfectly for the existing responses "abort", "error", and "success". A new response "delayed", that will be introduced in a subsequent patch, accepts the input without giving the filtered result right away. At this point we cannot know the size of the response. Therefore, we do not write "OUT <size>" for "delayed" responses. To simplify the code we do not write "OUT <size>" for "abort" and "error" responses either as their size is always zero. Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-06-28 23:29:49 +02:00			`IN: smudge error.r $SE [OK] -- [ERROR]`
convert: add filter.<driver>.process option Git's clean/smudge mechanism invokes an external filter process for every single blob that is affected by a filter. If Git filters a lot of blobs then the startup time of the external filter processes can become a significant part of the overall Git execution time. In a preliminary performance test this developer used a clean/smudge filter written in golang to filter 12,000 files. This process took 364s with the existing filter mechanism and 5s with the new mechanism. See details here: https://github.com/github/git-lfs/pull/1382 This patch adds the `filter.<driver>.process` string option which, if used, keeps the external filter process running and processes all blobs with the packet format (pkt-line) based protocol over standard input and standard output. The full protocol is explained in detail in `Documentation/gitattributes.txt`. A few key decisions: * The long running filter process is referred to as filter protocol version 2 because the existing single shot filter invocation is considered version 1. * Git sends a welcome message and expects a response right after the external filter process has started. This ensures that Git will not hang if a version 1 filter is incorrectly used with the filter.<driver>.process option for version 2 filters. In addition, Git can detect this kind of error and warn the user. * The status of a filter operation (e.g. "success" or "error) is set before the actual response and (if necessary!) re-set after the response. The advantage of this two step status response is that if the filter detects an error early, then the filter can communicate this and Git does not even need to create structures to read the response. * All status responses are pkt-line lists terminated with a flush packet. This allows us to send other status fields with the same protocol in the future. Helped-by: Martin-Louis Bright <mlbright@gmail.com> Reviewed-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-10-17 01:20:37 +02:00			`IN: smudge test.r $S [OK] -- OUT: $S . [OK]`
			`IN: smudge test2.r $S2 [OK] -- OUT: $S2 . [OK]`
			`STOP`
			`EOF`
t0021: make debug log file name configurable The "rot13-filter.pl" helper wrote its debug logs always to "rot13-filter.log". Make this configurable by defining the log file as first parameter of "rot13-filter.pl". This is useful if "rot13-filter.pl" is configured multiple times similar to the subsequent patch 'convert: add "status=delayed" to filter process protocol'. Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-06-01 10:22:00 +02:00			`test_cmp_exclude_clean expected.log debug.log &&`
convert: add filter.<driver>.process option Git's clean/smudge mechanism invokes an external filter process for every single blob that is affected by a filter. If Git filters a lot of blobs then the startup time of the external filter processes can become a significant part of the overall Git execution time. In a preliminary performance test this developer used a clean/smudge filter written in golang to filter 12,000 files. This process took 364s with the existing filter mechanism and 5s with the new mechanism. See details here: https://github.com/github/git-lfs/pull/1382 This patch adds the `filter.<driver>.process` string option which, if used, keeps the external filter process running and processes all blobs with the packet format (pkt-line) based protocol over standard input and standard output. The full protocol is explained in detail in `Documentation/gitattributes.txt`. A few key decisions: * The long running filter process is referred to as filter protocol version 2 because the existing single shot filter invocation is considered version 1. * Git sends a welcome message and expects a response right after the external filter process has started. This ensures that Git will not hang if a version 1 filter is incorrectly used with the filter.<driver>.process option for version 2 filters. In addition, Git can detect this kind of error and warn the user. * The status of a filter operation (e.g. "success" or "error) is set before the actual response and (if necessary!) re-set after the response. The advantage of this two step status response is that if the filter detects an error early, then the filter can communicate this and Git does not even need to create structures to read the response. * All status responses are pkt-line lists terminated with a flush packet. This allows us to send other status fields with the same protocol in the future. Helped-by: Martin-Louis Bright <mlbright@gmail.com> Reviewed-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-10-17 01:20:37 +02:00
			`test_cmp_committed_rot13 "$TEST_ROOT/test.o" test.r &&`
			`test_cmp_committed_rot13 "$TEST_ROOT/test2.o" test2.r &&`
			`test_cmp error.o error.r`
			`)`
			`'`

			`test_expect_success PERL 'process filter abort stops processing of all further files' '`
t0021: make debug log file name configurable The "rot13-filter.pl" helper wrote its debug logs always to "rot13-filter.log". Make this configurable by defining the log file as first parameter of "rot13-filter.pl". This is useful if "rot13-filter.pl" is configured multiple times similar to the subsequent patch 'convert: add "status=delayed" to filter process protocol'. Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-06-01 10:22:00 +02:00			`test_config_global filter.protocol.process "rot13-filter.pl debug.log clean smudge" &&`
convert: add filter.<driver>.process option Git's clean/smudge mechanism invokes an external filter process for every single blob that is affected by a filter. If Git filters a lot of blobs then the startup time of the external filter processes can become a significant part of the overall Git execution time. In a preliminary performance test this developer used a clean/smudge filter written in golang to filter 12,000 files. This process took 364s with the existing filter mechanism and 5s with the new mechanism. See details here: https://github.com/github/git-lfs/pull/1382 This patch adds the `filter.<driver>.process` string option which, if used, keeps the external filter process running and processes all blobs with the packet format (pkt-line) based protocol over standard input and standard output. The full protocol is explained in detail in `Documentation/gitattributes.txt`. A few key decisions: * The long running filter process is referred to as filter protocol version 2 because the existing single shot filter invocation is considered version 1. * Git sends a welcome message and expects a response right after the external filter process has started. This ensures that Git will not hang if a version 1 filter is incorrectly used with the filter.<driver>.process option for version 2 filters. In addition, Git can detect this kind of error and warn the user. * The status of a filter operation (e.g. "success" or "error) is set before the actual response and (if necessary!) re-set after the response. The advantage of this two step status response is that if the filter detects an error early, then the filter can communicate this and Git does not even need to create structures to read the response. * All status responses are pkt-line lists terminated with a flush packet. This allows us to send other status fields with the same protocol in the future. Helped-by: Martin-Louis Bright <mlbright@gmail.com> Reviewed-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-10-17 01:20:37 +02:00			`rm -rf repo &&`
			`mkdir repo &&`
			`(`
			`cd repo &&`
			`git init &&`

			`echo "*.r filter=protocol" >.gitattributes &&`

			`cp "$TEST_ROOT/test.o" test.r &&`
			`cp "$TEST_ROOT/test2.o" test2.r &&`
			`echo "error this blob and all future blobs" >abort.o &&`
			`cp abort.o abort.r &&`

			`SA=$(file_size abort.r) &&`

			`git add . &&`
			`rm -f *.r &&`

			`# Note: This test assumes that Git filters files in alphabetical`
			`# order ("abort.r" before "test.r").`
			`filter_git checkout --quiet --no-progress . &&`
			`cat >expected.log <<-EOF &&`
			`START`
			`init handshake complete`
t0021: write "OUT <size>" only on success "rot13-filter.pl" always writes "OUT <size>" to the debug log at the end of a response. This works perfectly for the existing responses "abort", "error", and "success". A new response "delayed", that will be introduced in a subsequent patch, accepts the input without giving the filtered result right away. At this point we cannot know the size of the response. Therefore, we do not write "OUT <size>" for "delayed" responses. To simplify the code we do not write "OUT <size>" for "abort" and "error" responses either as their size is always zero. Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-06-28 23:29:49 +02:00			`IN: smudge abort.r $SA [OK] -- [ABORT]`
convert: add filter.<driver>.process option Git's clean/smudge mechanism invokes an external filter process for every single blob that is affected by a filter. If Git filters a lot of blobs then the startup time of the external filter processes can become a significant part of the overall Git execution time. In a preliminary performance test this developer used a clean/smudge filter written in golang to filter 12,000 files. This process took 364s with the existing filter mechanism and 5s with the new mechanism. See details here: https://github.com/github/git-lfs/pull/1382 This patch adds the `filter.<driver>.process` string option which, if used, keeps the external filter process running and processes all blobs with the packet format (pkt-line) based protocol over standard input and standard output. The full protocol is explained in detail in `Documentation/gitattributes.txt`. A few key decisions: * The long running filter process is referred to as filter protocol version 2 because the existing single shot filter invocation is considered version 1. * Git sends a welcome message and expects a response right after the external filter process has started. This ensures that Git will not hang if a version 1 filter is incorrectly used with the filter.<driver>.process option for version 2 filters. In addition, Git can detect this kind of error and warn the user. * The status of a filter operation (e.g. "success" or "error) is set before the actual response and (if necessary!) re-set after the response. The advantage of this two step status response is that if the filter detects an error early, then the filter can communicate this and Git does not even need to create structures to read the response. * All status responses are pkt-line lists terminated with a flush packet. This allows us to send other status fields with the same protocol in the future. Helped-by: Martin-Louis Bright <mlbright@gmail.com> Reviewed-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-10-17 01:20:37 +02:00			`STOP`
			`EOF`
t0021: make debug log file name configurable The "rot13-filter.pl" helper wrote its debug logs always to "rot13-filter.log". Make this configurable by defining the log file as first parameter of "rot13-filter.pl". This is useful if "rot13-filter.pl" is configured multiple times similar to the subsequent patch 'convert: add "status=delayed" to filter process protocol'. Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-06-01 10:22:00 +02:00			`test_cmp_exclude_clean expected.log debug.log &&`
convert: add filter.<driver>.process option Git's clean/smudge mechanism invokes an external filter process for every single blob that is affected by a filter. If Git filters a lot of blobs then the startup time of the external filter processes can become a significant part of the overall Git execution time. In a preliminary performance test this developer used a clean/smudge filter written in golang to filter 12,000 files. This process took 364s with the existing filter mechanism and 5s with the new mechanism. See details here: https://github.com/github/git-lfs/pull/1382 This patch adds the `filter.<driver>.process` string option which, if used, keeps the external filter process running and processes all blobs with the packet format (pkt-line) based protocol over standard input and standard output. The full protocol is explained in detail in `Documentation/gitattributes.txt`. A few key decisions: * The long running filter process is referred to as filter protocol version 2 because the existing single shot filter invocation is considered version 1. * Git sends a welcome message and expects a response right after the external filter process has started. This ensures that Git will not hang if a version 1 filter is incorrectly used with the filter.<driver>.process option for version 2 filters. In addition, Git can detect this kind of error and warn the user. * The status of a filter operation (e.g. "success" or "error) is set before the actual response and (if necessary!) re-set after the response. The advantage of this two step status response is that if the filter detects an error early, then the filter can communicate this and Git does not even need to create structures to read the response. * All status responses are pkt-line lists terminated with a flush packet. This allows us to send other status fields with the same protocol in the future. Helped-by: Martin-Louis Bright <mlbright@gmail.com> Reviewed-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-10-17 01:20:37 +02:00
			`test_cmp "$TEST_ROOT/test.o" test.r &&`
			`test_cmp "$TEST_ROOT/test2.o" test2.r &&`
			`test_cmp abort.o abort.r`
			`)`
			`'`

			`test_expect_success PERL 'invalid process filter must fail (and not hang!)' '`
			`test_config_global filter.protocol.process cat &&`
			`test_config_global filter.protocol.required true &&`
			`rm -rf repo &&`
			`mkdir repo &&`
			`(`
			`cd repo &&`
			`git init &&`

			`echo "*.r filter=protocol" >.gitattributes &&`

			`cp "$TEST_ROOT/test.o" test.r &&`
			`test_must_fail git add . 2>git-stderr.log &&`
sub-process: refactor handshake to common function Refactor, into a common function, the version and capability negotiation done when invoking a long-running process as a clean or smudge filter. This will be useful for other Git code that needs to interact similarly with a long-running process. As you can see in the change to t0021, this commit changes the error message reported when the long-running process does not introduce itself with the expected "server"-terminated line. Originally, the error message reports that the filter "does not support filter protocol version 2", differentiating between the old single-file filter protocol and the new multi-file filter protocol - I have updated it to something more generic and useful. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-07-26 20:17:29 +02:00			`grep "expected git-filter-server" git-stderr.log`
convert: add filter.<driver>.process option Git's clean/smudge mechanism invokes an external filter process for every single blob that is affected by a filter. If Git filters a lot of blobs then the startup time of the external filter processes can become a significant part of the overall Git execution time. In a preliminary performance test this developer used a clean/smudge filter written in golang to filter 12,000 files. This process took 364s with the existing filter mechanism and 5s with the new mechanism. See details here: https://github.com/github/git-lfs/pull/1382 This patch adds the `filter.<driver>.process` string option which, if used, keeps the external filter process running and processes all blobs with the packet format (pkt-line) based protocol over standard input and standard output. The full protocol is explained in detail in `Documentation/gitattributes.txt`. A few key decisions: * The long running filter process is referred to as filter protocol version 2 because the existing single shot filter invocation is considered version 1. * Git sends a welcome message and expects a response right after the external filter process has started. This ensures that Git will not hang if a version 1 filter is incorrectly used with the filter.<driver>.process option for version 2 filters. In addition, Git can detect this kind of error and warn the user. * The status of a filter operation (e.g. "success" or "error) is set before the actual response and (if necessary!) re-set after the response. The advantage of this two step status response is that if the filter detects an error early, then the filter can communicate this and Git does not even need to create structures to read the response. * All status responses are pkt-line lists terminated with a flush packet. This allows us to send other status fields with the same protocol in the future. Helped-by: Martin-Louis Bright <mlbright@gmail.com> Reviewed-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-10-17 01:20:37 +02:00			`)`
			`'`

convert: add "status=delayed" to filter process protocol Some `clean` / `smudge` filters may require a significant amount of time to process a single blob (e.g. the Git LFS smudge filter might perform network requests). During this process the Git checkout operation is blocked and Git needs to wait until the filter is done to continue with the checkout. Teach the filter process protocol, introduced in edcc8581 ("convert: add filter.<driver>.process option", 2016-10-16), to accept the status "delayed" as response to a filter request. Upon this response Git continues with the checkout operation. After the checkout operation Git calls "finish_delayed_checkout" which queries the filter for remaining blobs. If the filter is still working on the completion, then the filter is expected to block. If the filter has completed all remaining blobs then an empty response is expected. Git has a multiple code paths that checkout a blob. Support delayed checkouts only in `clone` (in unpack-trees.c) and `checkout` operations for now. The optimization is most effective in these code paths as all files of the tree are processed. Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-06-30 22:41:28 +02:00			`test_expect_success PERL 'delayed checkout in process filter' '`
			`test_config_global filter.a.process "rot13-filter.pl a.log clean smudge delay" &&`
			`test_config_global filter.a.required true &&`
			`test_config_global filter.b.process "rot13-filter.pl b.log clean smudge delay" &&`
			`test_config_global filter.b.required true &&`

			`rm -rf repo &&`
			`mkdir repo &&`
			`(`
			`cd repo &&`
			`git init &&`
			`echo "*.a filter=a" >.gitattributes &&`
			`echo "*.b filter=b" >>.gitattributes &&`
			`cp "$TEST_ROOT/test.o" test.a &&`
			`cp "$TEST_ROOT/test.o" test-delay10.a &&`
			`cp "$TEST_ROOT/test.o" test-delay11.a &&`
			`cp "$TEST_ROOT/test.o" test-delay20.a &&`
			`cp "$TEST_ROOT/test.o" test-delay10.b &&`
			`git add . &&`
			`git commit -m "test commit"`
			`) &&`

			`S=$(file_size "$TEST_ROOT/test.o") &&`
			`cat >a.exp <<-EOF &&`
			`START`
			`init handshake complete`
			`IN: smudge test.a $S [OK] -- OUT: $S . [OK]`
			`IN: smudge test-delay10.a $S [OK] -- [DELAYED]`
			`IN: smudge test-delay11.a $S [OK] -- [DELAYED]`
			`IN: smudge test-delay20.a $S [OK] -- [DELAYED]`
			`IN: list_available_blobs test-delay10.a test-delay11.a [OK]`
			`IN: smudge test-delay10.a 0 [OK] -- OUT: $S . [OK]`
			`IN: smudge test-delay11.a 0 [OK] -- OUT: $S . [OK]`
			`IN: list_available_blobs test-delay20.a [OK]`
			`IN: smudge test-delay20.a 0 [OK] -- OUT: $S . [OK]`
			`IN: list_available_blobs [OK]`
			`STOP`
			`EOF`
			`cat >b.exp <<-EOF &&`
			`START`
			`init handshake complete`
			`IN: smudge test-delay10.b $S [OK] -- [DELAYED]`
			`IN: list_available_blobs test-delay10.b [OK]`
			`IN: smudge test-delay10.b 0 [OK] -- OUT: $S . [OK]`
			`IN: list_available_blobs [OK]`
			`STOP`
			`EOF`

			`rm -rf repo-cloned &&`
			`filter_git clone repo repo-cloned &&`
			`test_cmp_count a.exp repo-cloned/a.log &&`
			`test_cmp_count b.exp repo-cloned/b.log &&`

			`(`
			`cd repo-cloned &&`
			`test_cmp_committed_rot13 "$TEST_ROOT/test.o" test.a &&`
			`test_cmp_committed_rot13 "$TEST_ROOT/test.o" test-delay10.a &&`
			`test_cmp_committed_rot13 "$TEST_ROOT/test.o" test-delay11.a &&`
			`test_cmp_committed_rot13 "$TEST_ROOT/test.o" test-delay20.a &&`
			`test_cmp_committed_rot13 "$TEST_ROOT/test.o" test-delay10.b &&`

			`rm .a .b &&`
			`filter_git checkout . &&`
			`test_cmp_count ../a.exp a.log &&`
			`test_cmp_count ../b.exp b.log &&`

			`test_cmp_committed_rot13 "$TEST_ROOT/test.o" test.a &&`
			`test_cmp_committed_rot13 "$TEST_ROOT/test.o" test-delay10.a &&`
			`test_cmp_committed_rot13 "$TEST_ROOT/test.o" test-delay11.a &&`
			`test_cmp_committed_rot13 "$TEST_ROOT/test.o" test-delay20.a &&`
			`test_cmp_committed_rot13 "$TEST_ROOT/test.o" test-delay10.b`
			`)`
			`'`

			`test_expect_success PERL 'missing file in delayed checkout' '`
			`test_config_global filter.bug.process "rot13-filter.pl bug.log clean smudge delay" &&`
			`test_config_global filter.bug.required true &&`

			`rm -rf repo &&`
			`mkdir repo &&`
			`(`
			`cd repo &&`
			`git init &&`
			`echo "*.a filter=bug" >.gitattributes &&`
			`cp "$TEST_ROOT/test.o" missing-delay.a`
			`git add . &&`
			`git commit -m "test commit"`
			`) &&`

			`rm -rf repo-cloned &&`
			`test_must_fail git clone repo repo-cloned 2>git-stderr.log &&`
			`cat git-stderr.log &&`
			`grep "error: .missing-delay\.a. was not filtered properly" git-stderr.log`
			`'`

			`test_expect_success PERL 'invalid file in delayed checkout' '`
			`test_config_global filter.bug.process "rot13-filter.pl bug.log clean smudge delay" &&`
			`test_config_global filter.bug.required true &&`

			`rm -rf repo &&`
			`mkdir repo &&`
			`(`
			`cd repo &&`
			`git init &&`
			`echo "*.a filter=bug" >.gitattributes &&`
			`cp "$TEST_ROOT/test.o" invalid-delay.a &&`
			`cp "$TEST_ROOT/test.o" unfiltered`
			`git add . &&`
			`git commit -m "test commit"`
			`) &&`

			`rm -rf repo-cloned &&`
			`test_must_fail git clone repo repo-cloned 2>git-stderr.log &&`
			`grep "error: external filter .* signaled that .unfiltered. is now available although it has not been delayed earlier" git-stderr.log`
			`'`

Add 'ident' conversion. The 'ident' attribute set to path squashes "$ident:<any bytes except dollor sign>$" to "$ident$" upon checkin, and expands it to "$ident: <blob SHA-1> $" upon checkout. As we have two conversions that affect checkin/checkout paths, clarify how they interact with each other. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-22 04:09:02 +02:00			`test_done`