C flock(): removing a locked file without a race condition?

Notice: this page is based on a popular StackOverFlow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverFlow. Original URL: http://stackoverflow.com/questions/17708885/

Time: 2020-09-02 06:56:18  Source: igfitidea

flock(): removing locked file without race condition?

c flock

Asked by Arnaud Le Blanc

I'm using flock() for inter-process named mutexes (i.e. some process can decide to hold a lock on "some_name", which is implemented by locking a file named "some_name" in a temp directory):

const char *lockfile = "/tmp/some_name.lock";
int fd = open(lockfile, O_CREAT | O_RDONLY, 0644);  /* O_CREAT requires a mode argument */
flock(fd, LOCK_EX);

do_something();

unlink(lockfile);
flock(fd, LOCK_UN);

The lock file should be removed at some point, to avoid filling the temp directory with hundreds of files.

However, there is an obvious race condition in this code; example with processes A, B and C:

A opens file
A locks file
B opens file
A unlinks file
A unlocks file
B locks file (B holds a lock on the deleted file)
C opens file (a new file is created)
C locks file (two processes hold the same named mutex!)
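
The interleaving above can be replayed in a single process, because flock() treats descriptors obtained from separate open() calls as independent lock owners. The following is only a sketch under that assumption; the function name replay_race is made up for illustration, and the three descriptors a, b and c stand in for processes A, B and C:

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/* Replays the A/B/C interleaving with three independent descriptors.
 * Returns 1 if "B" and "C" both hold an exclusive lock at the same time,
 * 0 if they do not, and -1 if the file could not be opened at all. */
int replay_race(const char *path)
{
    assert(path != NULL);

    int a = open(path, O_CREAT | O_RDONLY, 0644);    /* A opens file */
    if (a < 0)
        return -1;
    flock(a, LOCK_EX);                               /* A locks file */
    int b = open(path, O_CREAT | O_RDONLY, 0644);    /* B opens file */
    unlink(path);                                    /* A unlinks file */
    flock(a, LOCK_UN);                               /* A unlocks file */

    /* B locks the file it opened earlier, which is now deleted. */
    int b_locked = b >= 0 && flock(b, LOCK_EX | LOCK_NB) == 0;

    /* C opens the same name, which creates a brand-new file, and locks it. */
    int c = open(path, O_CREAT | O_RDONLY, 0644);
    int c_locked = c >= 0 && flock(c, LOCK_EX | LOCK_NB) == 0;

    /* Clean up C's file and all descriptors. */
    unlink(path);
    close(a);
    if (b >= 0)
        close(b);
    if (c >= 0)
        close(c);
    return b_locked && c_locked;
}
```

If replay_race() returns 1, both non-blocking lock attempts succeeded, which is exactly the situation where two holders believe they own the same named mutex.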

Is there a way to remove the lock file at some point without introducing this race condition?

Answered by user2769258

Sorry if I reply to a dead question:

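After locking the file, look the path up again (the code below uses stat() on the path rather than a second open()), fstat the locked descriptor, and check that both report the same inode number, like this: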

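const char *lockfile = "/tmp/some_name.lock";
struct stat st0, st1;
int fd;

    while(1) {
        fd = open(lockfile, O_CREAT | O_RDONLY, 0644);  /* O_CREAT requires a mode argument */
        flock(fd, LOCK_EX);

        fstat(fd, &st0);        /* inode of the file we actually locked */
        /* inode the name refers to now; a failed stat() means the file was unlinked */
        if(stat(lockfile, &st1) == 0 && st0.st_ino == st1.st_ino) break;

        close(fd);              /* we locked a stale file: drop it and retry */
    }

    do_something();

    unlink(lockfile);
    flock(fd, LOCK_UN);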

This prevents the race condition, because if a program holds a lock on a file that is still on the file system, every other program that has a leftover file will have a wrong inode number.

I actually proved it in the state-machine model, using the following properties:

If P_i has a descriptor locked on the filesystem then no other process is in the critical section.

If P_i is after the stat with the right inode or in the critical section it has the descriptor locked on the filesystem.
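
For reference, here is how the same loop might look with error checking added. This is a sketch, not the code the proof above was written against: the helper names acquire_named_lock and release_named_lock are invented here, and comparing st_dev in addition to st_ino is an extra precaution of mine.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/file.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: blocks until the named lock is held.
 * Returns the locked descriptor, or -1 on error. */
int acquire_named_lock(const char *path)
{
    assert(path != NULL);
    for (;;) {
        int fd = open(path, O_CREAT | O_RDONLY, 0644);
        if (fd < 0)
            return -1;
        if (flock(fd, LOCK_EX) != 0) {
            close(fd);
            return -1;
        }
        struct stat locked, named;
        if (fstat(fd, &locked) == 0 && stat(path, &named) == 0 &&
            locked.st_dev == named.st_dev && locked.st_ino == named.st_ino)
            return fd;      /* the path still names the file we locked */
        close(fd);          /* the file was unlinked or replaced: retry */
    }
}

/* Hypothetical helper: removes the lock file, then drops the lock. */
void release_named_lock(const char *path, int fd)
{
    assert(path != NULL);
    unlink(path);   /* unlink while the lock is still held */
    close(fd);      /* closing the descriptor releases the lock,
                       provided it was not duplicated or inherited */
}
```

A caller would wrap its critical section as `int fd = acquire_named_lock(p); do_something(); release_named_lock(p, fd);`, checking fd for -1 first.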

Answered by Guido U. Draheim

  1. In Unix it is possible to delete a file while it is open: the inode is kept until every process that has it in its file descriptor list has closed it or exited.
  2. In Unix it is possible to check that a file has been removed from all directories by checking its link count, which becomes zero at that point.

So instead of comparing the ino-value of the old/new file paths you can simply check the nlink count on the file that is already open. This assumes that the file is just an ephemeral lock file and not a real mutex resource or device.

const char *lockfile = "/tmp/some_name.lock";

// timeout: maximum number of attempts, assumed to be defined by the caller
for(int attempt = 0; attempt < timeout; ++attempt) {
    int fd = open(lockfile, O_CREAT | O_RDONLY, 0444);
    int done = flock(fd, LOCK_EX | LOCK_NB);
    if (done != 0) {
        close(fd);
        sleep(1);     // lock held by another proc
        continue;
    }
    struct stat st0;
    fstat(fd, &st0);
    if(st0.st_nlink == 0) {
       close(fd);     // lockfile deleted, create a new one
       continue;
    }
    do_something();
    unlink(lockfile); // nlink := 0 before releasing the lock
    flock(fd, LOCK_UN);
    close(fd);        // release the ino if no other proc has it open
    return true;
}
return false;
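
The check relies on fstat() reporting a link count of zero for a descriptor whose file has been unlinked. A small sketch to observe this on a given system follows; the function names and the scratch file template are made up for illustration, and some filesystems (certain network filesystems, for instance) may not behave this way:

```c
#define _DEFAULT_SOURCE   /* for mkstemp() under strict compiler modes with glibc */
#include <assert.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Returns the link count seen through an open descriptor, or -1 on error. */
long link_count(int fd)
{
    struct stat st;
    return fstat(fd, &st) == 0 ? (long)st.st_nlink : -1;
}

/* Creates a scratch file, unlinks it while keeping it open, and stores the
 * link count observed before and after the unlink. Returns 0 on success. */
int nlink_before_after_unlink(long *before, long *after)
{
    assert(before != NULL && after != NULL);

    char path[] = "/tmp/nlink_demo_XXXXXX";   /* hypothetical scratch name */
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;

    *before = link_count(fd);   /* one directory entry refers to the inode */
    unlink(path);
    *after = link_count(fd);    /* the inode is still open but has no name */
    close(fd);
    return 0;
}
```

On typical local Linux filesystems I would expect the two values to be 1 and 0, which is what the st_nlink == 0 test in the loop above depends on.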

Answered by MvG

If you use these files for locking only, and do not actually write to them, then I suggest you treat the existence of the directory entry itself as an indication for a held lock, and avoid using flock altogether.

To do so, you need to construct an operation which creates a directory entry and reports an error if it already existed. On Linux and with most file systems, passing O_EXCL to open will work for this. But some platforms and some file systems (older NFS in particular) do not support this. The man page for open therefore suggests an alternative:

Portable programs that want to perform atomic file locking using a lockfile, and need to avoid reliance on NFS support for O_EXCL, can create a unique file on the same file system (e.g., incorporating hostname and PID), and use link(2) to make a link to the lockfile. If link(2) returns 0, the lock is successful. Otherwise, use stat(2) on the unique file to check if its link count has increased to 2, in which case the lock is also successful.
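
A rough sketch of the scheme the man page describes might look like this. The function name try_link_lock and the "lockfile.hostname.pid" naming of the unique file are my own choices, not anything the man page prescribes:

```c
#define _DEFAULT_SOURCE   /* for gethostname() under strict compiler modes with glibc */
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Tries to take the lock named by lockfile without relying on O_EXCL.
 * Returns 1 if the lock was acquired, 0 if someone else holds it,
 * and -1 on error. Release the lock with unlink(lockfile). */
int try_link_lock(const char *lockfile)
{
    assert(lockfile != NULL);

    /* Build a unique name next to the lockfile, so both are on one filesystem. */
    char host[256] = "localhost";
    gethostname(host, sizeof host - 1);
    char unique[4096];
    int n = snprintf(unique, sizeof unique, "%s.%s.%ld",
                     lockfile, host, (long)getpid());
    if (n < 0 || (size_t)n >= sizeof unique)
        return -1;

    int fd = open(unique, O_CREAT | O_WRONLY, 0644);
    if (fd < 0)
        return -1;
    close(fd);

    /* link() fails if lockfile already exists, so at most one caller wins. */
    int acquired = link(unique, lockfile) == 0;
    if (!acquired) {
        /* link() may report failure even though the link was made,
         * so fall back to the link count as the man page suggests. */
        struct stat st;
        acquired = stat(unique, &st) == 0 && st.st_nlink == 2;
    }

    unlink(unique);   /* the lockfile entry alone now represents the lock */
    return acquired;
}
```

Since the lock is nothing more than the directory entry, a crashed holder leaves a stale lockfile behind; the sketch makes no attempt to detect that.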

So this looks like a locking scheme which is officially documented and therefore indicates a certain level of support and best practice suggestion. But I have seen other approaches as well. bzr, for example, uses directories instead of symlinks in most places. Quoting from its source code:

A lock is represented on disk by a directory of a particular name, containing an information file. Taking a lock is done by renaming a temporary directory into place. We use temporary directories because for all known transports and filesystems we believe that exactly one attempt to claim the lock will succeed and the others will fail. (Files won't do because some filesystems or transports only have rename-and-overwrite, making it hard to tell who won.)
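
To illustrate the idea in C (bzr itself is not written in C, and this is not its code), here is a minimal sketch for a local POSIX filesystem. The names try_dir_lock and release_dir_lock and the contents of the info file are assumptions of mine. It relies on rename() refusing to replace a non-empty directory, which is why the information file matters:

```c
#define _DEFAULT_SOURCE   /* for mkdtemp() under strict compiler modes with glibc */
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Tries to take the lock by renaming a prepared temporary directory to
 * lockdir. Returns 1 if acquired, 0 if the lock is held, -1 on error. */
int try_dir_lock(const char *lockdir)
{
    assert(lockdir != NULL);

    char tmpdir[4096], info[4200];
    int n = snprintf(tmpdir, sizeof tmpdir, "%s.tmp.XXXXXX", lockdir);
    if (n < 0 || (size_t)n >= sizeof tmpdir)
        return -1;
    if (mkdtemp(tmpdir) == NULL)
        return -1;

    /* The info file keeps the directory non-empty once it is in place. */
    snprintf(info, sizeof info, "%s/info", tmpdir);
    FILE *f = fopen(info, "w");
    if (f == NULL) {
        rmdir(tmpdir);
        return -1;
    }
    fprintf(f, "pid %ld\n", (long)getpid());
    fclose(f);

    if (rename(tmpdir, lockdir) == 0)
        return 1;

    /* An existing, non-empty lockdir makes rename() fail: the lock is held. */
    int held = (errno == ENOTEMPTY || errno == EEXIST);
    unlink(info);
    rmdir(tmpdir);
    return held ? 0 : -1;
}

/* Releases the lock by removing the info file and then the directory. */
void release_dir_lock(const char *lockdir)
{
    assert(lockdir != NULL);
    char info[4200];
    snprintf(info, sizeof info, "%s/info", lockdir);
    unlink(info);
    rmdir(lockdir);
}
```

Only the current holder should call release_dir_lock(); the sketch does not verify ownership.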

One downside to the above approaches is that they won't block: a failed locking attempt will result in an error, but not wait till the lock becomes available. You will have to poll for the lock, which might be problematic in the light of lock contention. In that case, you might want to further depart from your filesystem-based approach, and use third party implementations instead. But general questions on how to do ipc mutexes have already been asked, so I suggest you search for [ipc] [mutex] and have a look at the results, this one in particular. By the way, these tags might be useful for your post as well.
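
To make the polling concrete, here is a sketch that combines an O_EXCL-based try-lock with a simple doubling delay between attempts. The function names, the 10 ms starting delay and the 500 ms cap are arbitrary choices for illustration:

```c
#define _DEFAULT_SOURCE   /* for nanosleep() under strict compiler modes with glibc */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <time.h>
#include <unistd.h>

/* Tries once to create the lock file exclusively.
 * Returns 1 if acquired, 0 if it already exists, -1 on any other error.
 * Release the lock with unlink(lockfile). */
int try_excl_lock(const char *lockfile)
{
    assert(lockfile != NULL);
    int fd = open(lockfile, O_CREAT | O_EXCL | O_WRONLY, 0644);
    if (fd >= 0) {
        close(fd);   /* the directory entry is the lock, not the descriptor */
        return 1;
    }
    return errno == EEXIST ? 0 : -1;
}

/* Polls for the lock up to max_attempts times, sleeping between attempts
 * with a delay that doubles until it reaches the cap.
 * Returns 1 if acquired, 0 if still held after all attempts, -1 on error. */
int poll_excl_lock(const char *lockfile, int max_attempts)
{
    long delay_ms = 10;
    for (int i = 0; i < max_attempts; ++i) {
        int r = try_excl_lock(lockfile);
        if (r != 0)
            return r;   /* acquired, or a real error */

        struct timespec ts = { delay_ms / 1000, (delay_ms % 1000) * 1000000L };
        nanosleep(&ts, NULL);
        if (delay_ms < 500)
            delay_ms *= 2;
    }
    return 0;
}
```

Under heavy contention every waiter wakes up and retries on its own schedule, so there is no fairness guarantee, which is the problem hinted at above.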
