Note: this page is a translation of a popular Stack Overflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same license and attribute it to the original authors (not me). Original question: http://stackoverflow.com/questions/1011985/

Line endings messed up in Git - how to track changes from another branch after a huge line ending fix?

Tags: git, history, rewrite, branch, newline

Asked by keo

We are working with a 3rd party PHP engine that gets regular updates. The releases are kept on a separate branch in git, and our fork is the master branch.

This way we'll be able to apply patches to our fork from the new releases of the engine.

My problem is, after many commits to our branch, I realized that the initial import of the engine was done with CRLF line endings.

I converted every file to LF, but this made a huge commit, with 100k lines removed and 100k lines added, which obviously breaks what we intended to do: easily merge in patches from the factory releases of that 3rd party engine.

What should I do now? How can I fix this? I already have hundreds of commits on our fork.

What would be good is to somehow do a line endings fix commit after the initial import and before branching our own fork, and removing that huge line ending commit later in history.

However I have no idea how to do this in Git.

Thanks!

Answered by keo

I finally managed to solve it.

The answer is:

git filter-branch --tree-filter '~/Scripts/fix-line-endings.sh' -- --all

fix-line-endings.sh contains:

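#!/bin/sh
# Convert CRLF to LF in every matching text file of the checked-out tree.
# -print0 / xargs -0 keep file names that contain spaces intact.
find . -type f -a \( -name '*.tpl' -o -name '*.php' -o -name '*.js' -o -name '*.css' -o -name '*.sh' -o -name '*.txt' -o -iname '*.html' \) -print0 | xargs -0 fromdos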
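If fromdos is not installed, the same conversion can be approximated with standard tools. The following is a minimal sketch rather than the author's script: the function name strip_cr and the demo file are hypothetical, and because it deletes every carriage return byte it is only suitable for text files.

```shell
#!/bin/sh
# strip_cr FILE...: rewrite each file in place without carriage returns.
# Hypothetical helper; it deletes every CR byte, so use it on text files only.
strip_cr() {
    for f in "$@"; do
        tmp=$(mktemp) || return 1
        # Writing back with cat (rather than mv) keeps the file's permissions.
        tr -d '\r' < "$f" > "$tmp" && cat "$tmp" > "$f"
        rm -f "$tmp"
    done
}

# Demo on a throwaway file saved with CRLF endings.
demo=$(mktemp)
printf 'one\r\ntwo\r\n' > "$demo"
strip_cr "$demo"
cr_left=$(tr -cd '\r' < "$demo" | wc -c | tr -d ' ')
echo "carriage returns left: $cr_left"
```

In a tree filter, a helper like this would take the place of the fromdos call, fed by the same find expression.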

After all line endings were fixed in all trees in all commits, I did an interactive rebase and removed all commits that were fixing line endings.

Now my repo is clean and fresh, ready to be pushed :)

Note to visitors: do not do this if your repo has been pushed / cloned because it will mess things up badly!

Answered by Greg Bacon

Going forward, avoid this problem with the core.autocrlf setting, documented in git config --help:

core.autocrlf

If true, makes git convert CRLF at the end of lines in text files to LF when reading from the filesystem, and convert in reverse when writing to the filesystem. The variable can be set to input, in which case the conversion happens only while reading from the filesystem but files are written out with LF at the end of lines. A file is considered "text" (i.e. be subjected to the autocrlf mechanism) based on the file's crlf attribute, or if crlf is unspecified, based on the file's contents. See gitattributes.
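To see the input mode in action, here is a minimal sketch run in a throwaway repository. The scratch directory and the file name sample.txt are hypothetical, and it assumes git's default settings otherwise.

```shell
#!/bin/sh
# Sketch: show that core.autocrlf=input stores LF in the index.
repo=$(mktemp -d)                  # hypothetical scratch repository
cd "$repo" || exit 1
git init -q .
git config core.autocrlf input     # CRLF -> LF when staging, no conversion on checkout

printf 'first\r\nsecond\r\n' > sample.txt   # a file saved with CRLF endings
git add sample.txt 2>/dev/null              # git may warn that CRLF will be replaced

# Count lines of the staged blob that still end in a carriage return.
cr=$(printf '\r')
staged_crlf=$(git show :sample.txt | grep -c "${cr}\$")
echo "CRLF lines in the index: $staged_crlf"
```

The working-tree file keeps its CRLF endings; only the staged content is normalized.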

Answered by Robert Munteanu

Did you look at git rebase?

You will need to rebase the history of your repository, as follows:

  • commit the line terminator fixes
  • start the rebase
  • leave the third-party import commit first
  • apply the line terminator fixes
  • apply your other patches


What you do need to understand, though, is that this will break all downstream repositories, meaning those that are cloned from your parent repo. Ideally you will start from scratch with those.



Update: sample usage:

target=$(git rev-list --max-count=4 HEAD | tail -n1)
git rebase -i "$target"

In a linear history this starts a rebase session for the last 3 commits: the fourth-newest commit becomes the base, and the three commits after it are offered for editing.

Answered by Jakub Narębski

One solution (not necessarily the best one) would be to use git-filter-branch to rewrite history to always use correct line endings. This should be a better solution than interactive rebase, at least for a larger number of commits; it might also be easier to deal with merges using git-filter-branch.

That is, of course, assuming that the history was not published (the repository was not cloned).

Answered by keo

We are avoiding this problem in the future with:

1) Everyone uses an editor that strips trailing whitespace, and we save all files with LF.

2) If 1) fails (it can: someone may accidentally save a file with CRLF for whatever reason), we have a pre-commit script that checks for CRLF characters:

#!/bin/bash
#
# bash is required: the $'\r$' quoting used below is not POSIX sh syntax.
#
# An example hook script to verify what is about to be committed.
# Called by git-commit with no arguments.  The hook should
# exit with non-zero status after issuing an appropriate message if
# it wants to stop the commit.
#
# To enable this hook, rename this file to "pre-commit" and set executable bit

# original by Junio C Hamano

# modified by Barnabas Debreceni to disallow CR characters in commits


if git rev-parse --verify HEAD >/dev/null 2>&1
then
    against=HEAD
else
    # Initial commit: diff against an empty tree object
    against=4b825dc642cb6eb9a060e54bf8d69288fbee4904
fi

crlf=0

IFS="
"
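# Only look at added, copied, or modified entries: a deleted file has an
# all-zero blob hash that git show cannot read.
for FILE in `git diff-index --cached --diff-filter=ACM $against`
do
    fhash=`echo "$FILE" | cut -d' ' -f4`
    fname=`echo "$FILE" | cut -f2`

    # -I skips binary blobs; -U stops grep from stripping CRs on platforms that do so.
    if git show "$fhash" | grep -EUIlq $'\r$'
    then
        echo "$fname contains CRLF characters"
        crlf=1
    fi
done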

if [ $crlf -eq 1 ]
then
    echo Some files have CRLF line endings. Please fix it to be LF and try committing again.
    exit 1
fi

exec git diff-index --check --cached $against --

This script uses GNU grep and works on Mac OS X. However, it should be tested before use on other platforms (we had problems with Cygwin and BSD grep).
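Where GNU grep or the $'\r$' quoting is not available, a more portable check is possible. The following is a minimal sketch that assumes only POSIX sh, printf, and grep; the function name has_cr is hypothetical and not part of the original hook.

```shell
#!/bin/sh
# has_cr: succeed if standard input contains a line ending in a carriage return.
# Hypothetical helper; the CR is produced by printf instead of $'\r' quoting.
has_cr() {
    cr=$(printf '\r')
    grep -q "${cr}\$"
}

# Demo: one CRLF sample and one LF sample.
if printf 'a\r\nb\r\n' | has_cr; then crlf_result=found; else crlf_result=clean; fi
if printf 'a\nb\n' | has_cr; then lf_result=found; else lf_result=clean; fi
echo "CRLF sample: $crlf_result, LF sample: $lf_result"
```

In the hook, piping git show into a helper like this would take the place of the grep call. Unlike the -I flag, it makes no attempt to skip binary files.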

3) In case we find any whitespace errors, we use the following script on erroneous files:

#!/usr/bin/env php
<?php

    // Remove various whitespace errors and convert to LF from CRLF line endings
    // written by Barnabas Debreceni
    // licensed under the terms of WTFPL (http://en.wikipedia.org/wiki/WTFPL)

    // handle no args
    if( $argc <2 ) die( "nothing to do" );


    // blacklist

    $bl = array( 'smarty' . DIRECTORY_SEPARATOR . 'templates_c' . DIRECTORY_SEPARATOR . '.*' );

    // whitelist

    $wl = array(    '\.tpl', '\.php', '\.inc', '\.js', '\.css', '\.sh', '\.html', '\.txt', '\.htc', '\.afm',
                    '\.cfm', '\.cfc', '\.asp', '\.aspx', '\.ascx' ,'\.lasso', '\.py', '\.afp', '\.xml',
                    '\.htm', '\.sql', '\.as', '\.mxml', '\.ini', '\.yaml', '\.yml'  );

    // remove $argv[0]
    array_shift( $argv );

    // make file list
    $files = getFileList( $argv );

    // sort files
    sort( $files );

    // filter them for blacklist and whitelist entries

    $filtered = preg_grep( '#(' . implode( '|', $wl ) . ')$#', $files );
    $filtered = preg_grep( '#(' . implode( '|', $bl ) . ')$#', $filtered, PREG_GREP_INVERT );

    // fix whitespace errors
    fix_whitespace_errors( $filtered );





    ///////////////////////////////////////////////////////////////////////////////////////////////
    ///////////////////////////////////////////////////////////////////////////////////////////////


    // whitespace error fixer
    function fix_whitespace_errors( $files ) {
        foreach( $files as $file ) {

            // read in file
            $rawlines = file_get_contents( $file );

            // remove \r
            $lines = preg_replace( "/(\r\n)|(\n\r)/m", "\n", $rawlines );
            $lines = preg_replace( "/\r/m", "\n", $lines );

            // remove spaces from before tabs
            $lines = preg_replace( "/ +\t/m", "\t", $lines );

            // remove spaces from line endings
            $lines = preg_replace( "/[ \t]+$/m", "", $lines );

            // remove tabs from line endings
            $lines = preg_replace( "/\t+$/m", "", $lines );

            // remove EOF newlines
            $lines = preg_replace( "/\n+$/", "", $lines );

            // write file if changed and set old permissions
            // (compare contents, not lengths: replacing a lone \r with \n keeps the length)
            if( $lines !== $rawlines ){

                $perms = fileperms( $file );

                // Uncomment to save original files

                //rename( $file, $file.".old" );
                file_put_contents( $file, $lines);
                chmod( $file, $perms );
                echo "{$file}: FIXED\n";
            } else {
                echo "{$file}: unchanged\n";
            }

        }
    }

    // get file list from argument array
    function getFileList( $argv ) {
        $files = array();
        foreach( $argv as $arg ) {
            // is a directory
            if( is_dir( $arg ) )  {
                $files = array_merge( $files, getDirectoryTree( $arg ) );
            }
            // is a file
            if( is_file( $arg ) ) {
                $files[] = $arg;
            }
        }
        return $files;
    }

    // recursively scan directory
    function getDirectoryTree( $outerDir ){
        // strip the trailing separator; rtrim avoids regex-escaping issues with "\" on Windows
        $outerDir = rtrim( $outerDir, DIRECTORY_SEPARATOR );
        $dirs = array_diff( scandir( $outerDir ), array( ".", ".." ) );
        $dir_array = array();
        foreach( $dirs as $d ){
            if( is_dir( $outerDir . DIRECTORY_SEPARATOR . $d ) ) {
                $otherdir = getDirectoryTree( $outerDir . DIRECTORY_SEPARATOR . $d );
                $dir_array = array_merge( $dir_array, $otherdir );
            }
            else $dir_array[] = $outerDir . DIRECTORY_SEPARATOR . $d;
        }
        return $dir_array;
    }
?>