Linux: How to create multiple network namespaces from a single process instance

Question

Asked by user389238

I am using the following C function to create multiple network namespaces from a single process instance:

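#define _GNU_SOURCE
#include <fcntl.h>
#include <sched.h>
#include <stdio.h>
#include <sys/mount.h>
#include <unistd.h>

void create_namespace(const char *ns_name)
{
    char ns_path[100];

    snprintf(ns_path, sizeof(ns_path), "%s/%s", "/var/run/netns", ns_name);
    close(open(ns_path, O_RDONLY|O_CREAT|O_EXCL, 0));
    unshare(CLONE_NEWNET);
    mount("/proc/self/ns/net", ns_path, "none", MS_BIND, NULL);
}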

After my process creates all the namespaces and I add a tap interface to any one of them (with the ip link set tap1 netns ns1 command), I see this interface in all of the namespaces (presumably, this is actually a single namespace that goes under different names).

But if I create the namespaces from multiple processes, everything works just fine.

What could be wrong here? Do I have to pass any additional flags to unshare() to get this working from a single process instance? Is there a limitation that a single process instance can't create multiple network namespaces? Or is there a problem with the mount() call, because /proc/self/ns/net is actually mounted multiple times?

Update: It seems that the unshare() function creates multiple network namespaces correctly, but all the mount points in /var/run/netns/ actually refer to the first network namespace that was mounted in that directory.

Update2: It seems that the best approach is to fork() another process and execute the create_namespace() function from there. Still, I would be glad to hear a better solution that does not involve a fork() call, or at least to get confirmation that it is impossible to create and manage multiple network namespaces from a single process.

Update3: I am able to create multiple namespaces with unshare() by using the following code:

int main() {
    create_namespace("a");
    system("ip tuntap add mode tap tapa");
    system("ifconfig -a"); // shows the lo and tapa interfaces
    create_namespace("b");
    system("ip tuntap add mode tap tapb");
    system("ifconfig -a"); // shows lo and tapb but not tapa, so a second namespace was created
}

But after the process terminates and I execute ip netns exec a ifconfig -a and ip netns exec b ifconfig -a, it seems that both commands are executed in namespace a. So the actual problem is storing the references to the namespaces (or calling mount() the right way, but I am not sure whether that is possible).

Answer 1

Accepted answer by juris

You only have to bind mount /proc/*/ns/* if you need to access these namespaces from another process, or need a handle to be able to switch back and forth between the two. It is not needed in order to use multiple namespaces from a single process.

unshare does create a new namespace.
clone and fork by default do not create any new namespaces.
There is one "current" namespace of each kind assigned to a process. It can be changed by unshare or setns. The set of namespaces is (by default) inherited by child processes.
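
One way to observe the points above is to compare the identity of the process's "current" network namespace before and after unshare. The sketch below is only an illustration, not part of the original answer: netns_inode and show_unshare_effect are hypothetical helper names, and it assumes a kernel (3.8 or later, as far as I know) where the inode number of /proc/self/ns/net identifies the namespace. unshare(CLONE_NEWNET) normally needs CAP_SYS_ADMIN.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper (not from the answer): returns the inode number of a
 * namespace file such as /proc/self/ns/net, or 0 if stat() fails. */
ino_t netns_inode(const char *ns_file)
{
    struct stat st;

    assert(ns_file != NULL);
    if (stat(ns_file, &st) != 0)
        return 0;
    return st.st_ino;
}

/* Hypothetical helper: prints the namespace identity before and after
 * unshare(). Returns 0 on success, -1 if unshare() failed. */
int show_unshare_effect(void)
{
    printf("before unshare: net:[%lu]\n",
           (unsigned long)netns_inode("/proc/self/ns/net"));
    if (unshare(CLONE_NEWNET) != 0) {
        perror("unshare(CLONE_NEWNET)");
        return -1;
    }
    printf("after unshare:  net:[%lu]\n",
           (unsigned long)netns_inode("/proc/self/ns/net"));
    return 0;
}
```

If the two printed numbers differ, the process has moved into a new namespace; a child forked afterwards would report the second number.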

Whenever you do open(/proc/N/ns/net), it creates an inode for this file, and all subsequent open()s will return a file that is bound to the same namespace. The details are lost in the depths of the kernel dentry cache.

Also, each process has only one /proc/self/ns/net file entry, and a bind mount does not create new instances of this proc file. Opening those mounted files is exactly the same as opening the /proc/self/ns/net file directly (which will keep pointing to the namespace it pointed to when you first opened it).

It seems that "/proc/*/ns" is half-baked in this respect.

So, if you only need 2 namespaces, you can:

open /proc/1/ns/net
unshare
open /proc/self/ns/net

and switch between the two.
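
The two-namespace approach could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the answer's own code: struct netns_pair, netns_pair_create and netns_enter are hypothetical names, and the first handle is taken from /proc/self/ns/net before unshare rather than from /proc/1/ns/net. That assumes a kernel where re-opening /proc/self/ns/net after unshare yields the new namespace; on a kernel that behaves as described in this answer, the first handle would have to come from /proc/1/ns/net as in the steps listed.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sched.h>
#include <unistd.h>

/* Hypothetical container for the two handles. */
struct netns_pair {
    int original_fd; /* namespace the process started in */
    int created_fd;  /* namespace created by unshare() */
};

/* Opens a handle to the current namespace, creates a new namespace and
 * opens a handle to that one too. Returns 0 on success; on failure returns
 * -1 with errno set and both descriptors set to -1. */
int netns_pair_create(struct netns_pair *pair)
{
    int saved_errno;

    assert(pair != NULL);
    pair->created_fd = -1;
    pair->original_fd = open("/proc/self/ns/net", O_RDONLY | O_CLOEXEC);
    if (pair->original_fd < 0)
        return -1;

    if (unshare(CLONE_NEWNET) == 0) {
        pair->created_fd = open("/proc/self/ns/net", O_RDONLY | O_CLOEXEC);
        if (pair->created_fd >= 0)
            return 0;
    }

    /* Clean up without losing the errno of the call that failed. */
    saved_errno = errno;
    close(pair->original_fd);
    pair->original_fd = -1;
    errno = saved_errno;
    return -1;
}

/* Makes the namespace behind ns_fd the current one for the caller.
 * Returns 0 on success, -1 on failure (errno is set by setns). */
int netns_enter(int ns_fd)
{
    return setns(ns_fd, CLONE_NEWNET);
}
```

After netns_pair_create succeeds, calling netns_enter(pair.original_fd) or netns_enter(pair.created_fd) switches the process between the two namespaces.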

For more than 2 you might have to clone(). There seems to be no way to create more than one /proc/N/ns/net file per process.

However, if you do not need to switch between namespaces at runtime, or to share them with other processes, you can use many namespaces like this:

open sockets and run processes for the main namespace
unshare
open sockets and run processes for the 2nd namespace (netlink, tcp, etc.)
unshare
...
unshare
open sockets and run processes for the Nth namespace (netlink, tcp, etc.)

Open sockets keep a reference to their network namespace, so a namespace will not be collected until its sockets are closed.
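
The "open sockets, unshare, repeat" sequence above might look like the following sketch. It is an illustration only: open_socket_per_netns is a hypothetical name, and a plain UDP socket stands in for whatever sockets (netlink, tcp, etc.) the application actually needs.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <sys/socket.h>

/* Hypothetical helper: stores one UDP socket per namespace in socks[].
 * socks[0] belongs to the namespace the process starts in; every later
 * socket is opened right after unshare(), so it belongs to a fresh
 * namespace. Returns how many sockets were opened (it stops early if
 * unshare() or socket() fails). The process remains in the last namespace
 * it created. */
int open_socket_per_netns(int *socks, int count)
{
    int opened = 0;

    assert(socks != NULL && count > 0);
    for (int i = 0; i < count; i++) {
        /* Leave the previous namespace; its socket keeps it alive. */
        if (i > 0 && unshare(CLONE_NEWNET) != 0)
            break;
        int s = socket(AF_INET, SOCK_DGRAM, 0);
        if (s < 0)
            break;
        socks[opened++] = s;
    }
    return opened;
}
```

Each descriptor keeps operating in the namespace it was created in, even though the process itself has since moved on to a different one.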

You can also use netlink to move interfaces between namespaces, by sending a netlink command in the source namespace and specifying the destination namespace either by PID or by namespace FD (the latter you don't have).
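
A rough sketch of the PID variant is shown below. It is my own illustration rather than code from the answer: move_link_to_netns_of_pid is a hypothetical name, and the request is an rtnetlink RTM_SETLINK message carrying an IFLA_NET_NS_PID attribute. The netlink socket must be created while the process is in the source namespace, and the operation normally requires CAP_NET_ADMIN.

```c
#include <assert.h>
#include <errno.h>
#include <linux/rtnetlink.h>
#include <net/if.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: asks the kernel to move interface ifname from the
 * caller's current namespace into the network namespace of process pid.
 * Returns 0 on success, -1 on failure with errno set. */
int move_link_to_netns_of_pid(const char *ifname, pid_t pid)
{
    struct {
        struct nlmsghdr  nh;
        struct ifinfomsg ifi;
        char             attrs[RTA_SPACE(sizeof(uint32_t))];
    } req;
    union {
        struct nlmsghdr nh;
        char            raw[4096];
    } ack;
    uint32_t target = (uint32_t)pid;

    assert(ifname != NULL);
    unsigned int ifindex = if_nametoindex(ifname);
    if (ifindex == 0)
        return -1; /* no such interface in the current namespace */

    int sock = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);
    if (sock < 0)
        return -1;

    memset(&req, 0, sizeof(req));
    req.nh.nlmsg_len = NLMSG_LENGTH(sizeof(struct ifinfomsg));
    req.nh.nlmsg_type = RTM_SETLINK;
    req.nh.nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK;
    req.ifi.ifi_family = AF_UNSPEC;
    req.ifi.ifi_index = (int)ifindex;

    /* Append the attribute that names the destination namespace by PID. */
    struct rtattr *rta = (struct rtattr *)req.attrs;
    rta->rta_type = IFLA_NET_NS_PID;
    rta->rta_len = RTA_LENGTH(sizeof(target));
    memcpy(RTA_DATA(rta), &target, sizeof(target));
    req.nh.nlmsg_len = NLMSG_ALIGN(req.nh.nlmsg_len) + RTA_ALIGN(rta->rta_len);

    /* Send the request and wait for the kernel's acknowledgement. */
    ssize_t len = -1;
    if (send(sock, &req, req.nh.nlmsg_len, 0) >= 0)
        len = recv(sock, &ack, sizeof(ack), 0);
    int saved_errno = errno;
    close(sock);
    if (len < 0) {
        errno = saved_errno;
        return -1;
    }
    if (len < (ssize_t)NLMSG_LENGTH(sizeof(struct nlmsgerr)) ||
        ack.nh.nlmsg_type != NLMSG_ERROR) {
        errno = EPROTO;
        return -1;
    }

    /* An NLMSG_ERROR reply with error == 0 is the positive acknowledgement. */
    int err = ((struct nlmsgerr *)NLMSG_DATA(&ack.nh))->error;
    if (err != 0) {
        errno = -err;
        return -1;
    }
    return 0;
}
```

For example, move_link_to_netns_of_pid("tap1", child_pid) would correspond to ip link set tap1 netns child_pid, where child_pid is a placeholder for a process living in the destination namespace.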

You need to switch the process namespace before accessing /proc entries that depend on that namespace. Once a "proc" file is open, it keeps its reference to the namespace.

Answer 2

Answer by Coren

Network namespaces are, by design, created with a call to clone, and can be modified afterwards by unshare. Note that even if you do create a new network namespace with unshare, you are in fact only modifying the network stack of your running process. unshare is unable to modify the network stack of other processes, so you won't be able to create another one with unshare alone.

In order to work, a new network namespace needs a new network stack, and so it needs a new process. That's all.

The good news is that it can be made very lightweight with clone, see:

Clone() differs from the traditional fork() system call in UNIX, in that it allows the parent and child processes to selectively share or duplicate resources.

You can split off only the network stack (and keep sharing the memory space, the table of file descriptors and the table of signal handlers). Your new network process can be made more like a thread than a real fork.

You can manipulate them with C code or with Linux kernel and/or LXC tools.

For instance, adding a device to a new network namespace is as simple as:

echo $PID > /sys/class/net/ethX/new_ns_pid

See this page for more info about the available CLI.

On the C side, one can take a look at the lxc-unshare implementation. Despite its name it uses clone, as you can see (lxc_clone is here). One can also look at the LTP implementation, where the author has chosen to use fork directly.

EDIT: There is a trick that you can use to make them persistent, but you will still need to fork, even if only temporarily.

Take a look at this code from iproute2 (I have removed error checking for clarity):

snprintf(netns_path, sizeof(netns_path), "%s/%s", NETNS_RUN_DIR, name);

/* Create the base netns directory if it doesn't exist */
mkdir(NETNS_RUN_DIR, S_IRWXU|S_IRGRP|S_IXGRP|S_IROTH|S_IXOTH);

/* Create the filesystem state */
fd = open(netns_path, O_RDONLY|O_CREAT|O_EXCL, 0);
[...]
close(fd);
unshare(CLONE_NEWNET);
/* Bind the netns last so I can watch for it */
mount("/proc/self/ns/net", netns_path, "none", MS_BIND, NULL);

If you execute this code in a forked process, you'll be able to create new network namespaces at will. In order to delete them, you can simply umount and delete this bind:

umount2(netns_path, MNT_DETACH);
if (unlink(netns_path) < 0) [...]
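
Running the creation code in a forked process, as suggested above, could be wrapped like this. It is a minimal sketch under assumptions: create_netns_in_child is a hypothetical name, the run directory (for example /var/run/netns) is passed in by the caller and must already exist, and the caller needs enough privilege for unshare and mount.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <sched.h>
#include <stdio.h>
#include <sys/mount.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: creates a persistent network namespace called name
 * under run_dir by doing the unshare + bind mount in a short-lived child.
 * The parent's own network namespace is left untouched.
 * Returns 0 on success, -1 on failure. */
int create_netns_in_child(const char *run_dir, const char *name)
{
    char path[256];
    int status;

    assert(run_dir != NULL && name != NULL);
    int n = snprintf(path, sizeof(path), "%s/%s", run_dir, name);
    if (n < 0 || (size_t)n >= sizeof(path))
        return -1;

    pid_t child = fork();
    if (child < 0)
        return -1;

    if (child == 0) {
        /* Child: create the mount point, then bind the new namespace to it. */
        int fd = open(path, O_RDONLY | O_CREAT | O_EXCL, 0);
        if (fd < 0)
            _exit(1);
        close(fd);
        if (unshare(CLONE_NEWNET) != 0 ||
            mount("/proc/self/ns/net", path, "none", MS_BIND, NULL) != 0) {
            unlink(path);
            _exit(1);
        }
        _exit(0);
    }

    /* Parent: the bind mount outlives the child, so just collect its status. */
    if (waitpid(child, &status, 0) < 0)
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

Because every call uses a fresh child, each bind mount is taken from a different process's /proc/self/ns/net, so calling it once for "a" and once for "b" should leave two distinct mount points.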

EDIT2: Another (dirty) trick would be simply to execute the "ip netns add .." CLI with system.
