Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same license and attribute it to the original authors (not me): StackOverflow, original question at http://stackoverflow.com/questions/1504724/

Time: 2020-09-17 21:14:23  Source: igfitidea

Automatically remove *.pyc files and otherwise-empty directories when I check out a new branch

git python bash

Asked by Apreche

So here's an interesting situation when using git and python, and I'm sure it happens for other situations as well.

Let's say I make a git repo with a folder /foo/. In that folder I put /foo/program.py. I run program.py and program.pyc is created. I have *.pyc in the .gitignore file, so git doesn't track it.

Now let's say I make another branch, dev. In this dev branch, I remove the /foo/ folder entirely.

Now I switch back to the master branch, and /foo/ reappears. I run the program.py and the program.pyc file reappears. All is well.

I switch back to my dev branch. The /foo/ directory should disappear. It only exists in the master branch, not the dev branch. However, it is still there. Why? Because the ignored program.pyc file prevents the folder from being deleted when switching branches.
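
The sequence above can be reproduced in a throwaway repository. This is only a sketch: the temporary directory, the demo identity, and the empty foo/program.pyc (standing in for the file Python would write) are assumptions made for illustration.

```shell
#!/bin/sh
# Reproduce the leftover-directory problem in a scratch repository.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master        # name the first branch master
git config user.name "Demo"                    # placeholder identity for the demo
git config user.email "demo@example.invalid"

# master: foo/program.py is tracked, *.pyc is ignored.
printf '*.pyc\n' > .gitignore
mkdir foo
printf 'print("hi")\n' > foo/program.py
git add .gitignore foo/program.py
git commit -q -m "Add foo/program.py"

# dev: the foo directory is removed entirely.
git checkout -q -b dev
git rm -q -r foo
git commit -q -m "Remove foo"

# Back on master, a bytecode file appears next to program.py.
git checkout -q master
: > foo/program.pyc        # stand-in for the file Python would write

# Switching to dev removes program.py but cannot remove foo/.
git checkout -q dev
ls -A foo
```

After the last checkout, foo/ is still present because the ignored program.pyc keeps it from being empty.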

The solution to this problem is to recursively delete all *.pyc files before switching branches. I can do that easily with this command.

find . -name "*.pyc" -exec rm '{}' ';'

The problem is that it is annoying to have to remember to do this almost every time I change branches. I could make an alias for this command, but then I still have to remember to type it every time I change branches. I could also make an alias for git-branch, but that's no good either. The git branch command does other things besides just change branches, and I don't want to delete all pyc files every time I use it. Heck, I might even use it in a non-python repo, then what?

Is there a way to set a git hook that only executes when I change branches? Or is there some other way to set all *.pyc files to get erased whenever I switch branches?

Accepted answer by Cascabel

There is a post-checkout hook, to be placed in .git/hooks/post-checkout. There's probably a sample there, possibly named .sample or possibly not executable, depending on your git version. Short description: it gets three parameters, the previous HEAD, the new HEAD, and a flag which is 1 if the branch changed and 0 if it was merely a file checkout. See man githooks for more information! You should be able to write a shell script to do what you need and put it there.
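
As a rough illustration of those three parameters, a hook along these lines would only clean up on a real branch switch. It is a minimal sketch, not the sample shipped with git; the function name clean_after_checkout is made up for this example.

```shell
#!/bin/sh
# Sketch of a .git/hooks/post-checkout script (the file must be executable).

# Arguments, as git passes them to the hook:
#   $1 = previous HEAD, $2 = new HEAD,
#   $3 = 1 for a branch checkout, 0 for a file checkout.
clean_after_checkout() {
    branch_checkout=$3
    # Do nothing for plain file checkouts.
    [ "$branch_checkout" = "1" ] || return 0
    # Delete compiled files left in the working tree.
    find . -name "*.pyc" -delete
}

clean_after_checkout "$@"
```

Because the hook runs after the checkout, directories that only held .pyc files still have to be removed by the script itself.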

Edit: I realize you're looking to do this pre-checkout, so that the checkout automatically cleans up directories which become empty. There's no pre-checkout hook, though, so you'll have to use your script to remove the directories too.

Another note: Aliases are part of gitconfig, which can be local to a repository (in .git/config, not ~/.gitconfig). If you choose to do this with aliases (for git-checkout, not git-branch) you can easily put them only in python-related repositories. Also in this case, I'd make an alias specifically for this purpose (e.g. cc for checkout clean). You can still use checkout (or another aliased form of it) if you don't want to clean up pyc files.
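
A repository-local alias of that kind might look like the following. This is a sketch run in a scratch repository: the temporary directory, the demo identity, and the stale.pyc file are assumptions for illustration, while cc is the alias name suggested above.

```shell
#!/bin/sh
# Demonstrate a repository-local "cc" (checkout clean) alias in a scratch repo.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git -c user.name=Demo -c user.email=demo@example.invalid \
    commit -q --allow-empty -m "Initial commit"
git branch dev

# --local stores the alias in this repository's .git/config only.
# The leading "!" makes it a shell command; git appends the arguments.
git config --local alias.cc '!find . -name "*.pyc" -delete; git checkout'

: > stale.pyc          # stand-in for a leftover bytecode file
git cc -q dev          # deletes *.pyc files, then runs git checkout -q dev
```

Since the .pyc files are gone before the checkout runs, git itself can remove the directories that become empty.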

Answer by Christian Oudard

Just copying and updating a good solution by Apreche that was buried in the comments:

Save this shell script to the file /path/to/repo/.git/hooks/post-checkout, and make it executable.

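#!/bin/sh

# Start from the repository root.
cd "./$(git rev-parse --show-cdup)" || exit 1

# Delete .pyc files and empty directories.
# .git is skipped for directories: git keeps some empty ones of its own there.
find . -name "*.pyc" -delete
find . -type d -empty ! -path "./.git/*" -delete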

Answer by Ned Batchelder

Another option is to not solve this as a git problem at all, but as a Python problem. You can use the PYTHONDONTWRITEBYTECODE environment variable to prevent Python from writing .pyc files in the first place. Then you won't have anything to clean up when you switch branches.
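
For example, in a POSIX shell the variable can be set as follows; where to put the line (for instance a shell startup file) is left to the reader.

```shell
# Any non-empty value stops Python started from this shell from writing .pyc files.
export PYTHONDONTWRITEBYTECODE=1
```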

Answer by hynekcer

My solution is more compatible with git: git removes only empty directories in which some file has been deleted by the checkout. It doesn't search the complete working tree. That is useful for big repositories or repositories with a very big ignored tree, like virtual environments created by the tox package for testing with many different Python versions, etc.

My first implementation explains the principle very clearly: only pyc files related to files under version control are cleaned. This is for efficiency and to avoid unwanted side effects.

#!/bin/bash
# A hook that removes orphan "*.pyc" files for "*.py" files being deleted.
# It does not clean anything else, e.g. for .py files deleted manually.
oldrev="$1"
newrev="$2"
# ignored param: branchcheckout="$3"

for x in $(git diff --name-only --diff-filter=DR "$oldrev..$newrev" | grep "\.py$")
do
    # Remove the .pyc only if it exists and its .py source is gone.
    if test -e "${x}c" && ! test -e "${x}"; then
        rm "${x}c"
    fi
done

The post-checkout hook receives three useful parameters that make it possible to know exactly which files were deleted by git checkout, without searching the complete tree.

After reading the question I rewrote my hook code in Python and extended it according to your requirements about empty directories.

My complete short source code (Python) is at
https://gist.github.com/hynekcer/476a593a3fc584278b87#file-post-checkout-py

The doc string:

"""
A hook to git that removes orphan files "*.pyc" and "*.pyo" for "*.py"
being deleted or renamed by git checkout. It also removes their empty parent
directories.
Nothing is cleaned for .py files deleted manually or by "git rm" etc.
Place it at "my_local_repository/.git/hooks/post-checkout" and make it executable.
"""
  • The problem with *.pyc files is not important for Python 3, because *.pyc files in __pycache__ cannot be executed without the related *.py file in its parent directory.

  • No change of directory is necessary, because hooks are always started in the root of the repository.

  • Cache directories for compiled code (__pycache__) are cleaned completely, because they are never important (they don't take part in any binary distribution), and also for efficiency, because deleting individual __pycache__/some_name.*.pyc files could be slow.