Linux: questions about putenv() and setenv()

Notice: this page reproduces a popular Stack Overflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). Stack Overflow original: http://stackoverflow.com/questions/5873029/

Time: 2020-08-05 03:54:20  Source: igfitidea

Questions about putenv() and setenv()

Tags: c, linux, unix, environment-variables, setenv

Asked by ValenceElectron

I have been thinking a little about environment variables and have a few questions/observations.

  • putenv(char *string);

    This call seems fatally flawed. Because it doesn't copy the passed string, you can't call it with a local, and there is no guarantee that a heap-allocated string won't be overwritten or accidentally deleted. Furthermore (though I haven't tested it), since one use of environment variables is to pass values to a child's environment, this seems useless if the child calls one of the exec*() functions. Am I wrong in that?

  • The Linux man page indicates that glibc 2.0-2.1.1 abandoned the above behavior and began copying the string but this led to a memory leak that was fixed in glibc 2.1.2. It's not clear to me what this memory leak was or how it was fixed.

  • setenv() copies the string but I don't know exactly how that works. Space for the environment is allocated when the process loads but it is fixed. Is there some (arbitrary?) convention at work here? For example, allocating more slots in the env string pointer array than currently used and moving the null terminating pointer down as needed? Is the memory for the new (copied) string allocated in the address space of the environment itself, and if it is too big to fit do you just get ENOMEM?

  • Considering the above issues, is there any reason to prefer putenv() over setenv()?

Accepted answer by Jonathan Leffler

  • "[The] putenv(char *string); [...] call seems fatally flawed."

Yes, it is fatally flawed. It was preserved in POSIX (1988) because that was the prior art. The setenv() mechanism arrived later. Correction: The POSIX 1990 standard says in §B.4.6.1 "Additional functions putenv() and clearenv() were considered but rejected". The Single Unix Specification (SUS) version 2 from 1997 lists putenv() but not setenv() or unsetenv(). The next revision (2004) did define both setenv() and unsetenv() as well.

"Because it doesn't copy the passed string you can't call it with a local and there is no guarantee a heap allocated string won't be overwritten or accidentally deleted."

You're correct that a local variable is almost invariably a bad choice to pass to putenv(); the exceptions are obscure to the point of almost not existing. If the string is allocated on the heap (with malloc() et al.), you must ensure that your code does not modify it. If it does, it is modifying the environment at the same time.
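The aliasing described above can be observed directly. The following is a minimal sketch, assuming a POSIX-conforming putenv() that stores the caller's pointer rather than a copy; the helper name putenv_aliases_buffer and the variable name GR_DEMO_VAR are made up for illustration.

```c
#define _XOPEN_SOURCE 700
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns 1 if a later write to the caller's buffer
 * is visible through getenv(), 0 if it is not, and -1 on error. */
int putenv_aliases_buffer(void)
{
    /* Static storage, so the string outlives this function call. */
    static char entry[] = "GR_DEMO_VAR=first";

    if (putenv(entry) != 0)
        return -1;

    /* Overwrite the value in place without calling putenv() again.
     * "other" has the same length as "first", so the terminator stays. */
    memcpy(entry + strlen("GR_DEMO_VAR="), "other", 5);

    const char *seen = getenv("GR_DEMO_VAR");
    return seen != NULL && strcmp(seen, "other") == 0;
}
```

Calling putenv_aliases_buffer() from main() should return 1 on an implementation that keeps the pointer, which is why the buffer must neither be modified by accident nor go out of scope while it is still part of the environment.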

"Furthermore (though I haven't tested it), since one use of environment variables is to pass values to child's environment this seems useless if the child calls one of the exec*() functions. Am I wrong in that?"

The exec*() functions make a copy of the environment and pass that to the executed process. There's no problem there.

"The Linux man page indicates that glibc 2.0-2.1.1 abandoned the above behavior and began copying the string but this led to a memory leak that was fixed in glibc 2.1.2. It's not clear to me what this memory leak was or how it was fixed."

The memory leak arises because once you have called putenv() with a string, you cannot use that string again for any purpose because you can't tell whether it is still in use, though you could modify the value by overwriting it (with indeterminate results if you change the name to that of an environment variable found at another position in the environment). So, if you have allocated space, the classic putenv() leaks it if you change the variable again. When putenv() began to copy data, allocated variables became unreferenced because putenv() no longer kept a reference to the argument, but the user expected that the environment would be referencing it, so the memory was leaked. I'm not sure what the fix was; I would three-quarters expect it was to revert to the old behaviour.

"setenv() copies the string but I don't know exactly how that works. Space for the environment is allocated when the process loads but it is fixed."

The original environment space is fixed; when you start modifying it, the rules change. Even with putenv(), the original environment is modified and could grow as a result of adding new variables, or as a result of changing existing variables to have longer values.

"Is there some (arbitrary?) convention at work here? For example, allocating more slots in the env string pointer array than currently used and moving the null terminating pointer down as needed?"

That is what the setenv() mechanism is likely to do. The (global) variable environ points to the start of the array of pointers to environment variables. If it points to one block of memory at one time and a different block at a different time, then the environment is switched, just like that.
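To make the role of environ concrete, here is a small sketch that walks the NULL-terminated pointer array and counts its entries. The helper name count_env_entries is hypothetical, and whether environ itself is reassigned to a new block when the array grows is implementation-specific, so the sketch only looks at the number of entries.

```c
#define _XOPEN_SOURCE 700
#include <stddef.h>
#include <stdlib.h>

/* POSIX requires the program to declare environ itself. */
extern char **environ;

/* Hypothetical helper: count the "NAME=value" pointers in the array
 * that environ currently points to, stopping at the NULL terminator. */
size_t count_env_entries(void)
{
    size_t n = 0;

    if (environ == NULL)
        return 0;
    for (char **p = environ; *p != NULL; ++p)
        ++n;
    return n;
}
```

Calling count_env_entries() before and after a setenv() of a name that was not previously set should show the count rising by one; unsetenv() of that name should bring it back down.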

"Is the memory for the new (copied) string allocated in the address space of the environment itself and if it is too big to fit you just get ENOMEM?"

Well, yes, you could get ENOMEM, but you'd have to be trying pretty hard. And if you grow the environment too large, you may be unable to exec other programs properly - either the environment will be truncated or the exec operation will fail.

"Considering the above issues, is there any reason to prefer putenv() over setenv()?"

  • Use setenv() in new code.
  • Update old code to use setenv(), but don't make it a top priority.
  • Do not use putenv() in new code.

Answer by jim mcnamara

Read the RATIONALE section of the setenv man page from The Open Group Base Specifications Issue 6.

putenv and setenv are both supposed to be POSIX compliant. If you have code with putenv in it, and the code works well, leave it alone. If you are developing new code you may want to consider setenv.

Look at the glibc source code if you want to see an example of an implementation of setenv (stdlib/setenv.c) or putenv (stdlib/putenv.c).

Answer by R.. GitHub STOP HELPING ICE

I would highly recommend against using either of these functions. Either can be used safely and without leaks, as long as you're careful and only one part of your code is responsible for modifying the environment, but it's hard to get right, and dangerous if any code might be using threads and might read the environment (e.g. for timezone, locale, DNS config, etc. purposes).

The only two purposes I can think of for modifying the environment are to change the timezone at runtime, or to pass a modified environment to child processes. For the former, you probably have to use one of these functions (setenv/putenv), or you could walk environ manually to change it (this might be safer if you're worried other threads could try to read the environment at the same time). For the latter use (child processes), use one of the exec-family functions that lets you specify your own environment array, or simply clobber environ (the global) or use setenv/putenv in the child process after fork but before exec, in which case you don't have to care about memory leaks or thread-safety because there are no other threads and you're about to destroy your address space and replace it with a new process image.
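The "modify after fork but before exec" approach might look like the sketch below. It assumes /bin/sh exists and that the calling process is single-threaded; the helper name run_child_with_modified_env and the variable GR_DEMO_FORK are made up for illustration.

```c
#define _XOPEN_SOURCE 700
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: set a variable only in the child, then exec a
 * shell that checks for it. Returns the shell's exit status (0 when the
 * variable was seen), or -1 on failure. */
int run_child_with_modified_env(void)
{
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: this change never reaches the parent, and any memory
         * setenv() allocates disappears with the exec below. */
        if (setenv("GR_DEMO_FORK", "1", 1) != 0)
            _exit(126);
        execl("/bin/sh", "sh", "-c", "test -n \"$GR_DEMO_FORK\"", (char *)NULL);
        _exit(127); /* only reached if execl() failed */
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

After run_child_with_modified_env() returns, getenv("GR_DEMO_FORK") in the parent is still NULL (assuming it was not set beforehand), so the parent's environment is never touched.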

Answer by Ben Hymanson

"Furthermore (though I haven't tested it), since one use of environment variables is to pass values to child's environment this seems useless if the child calls one of the exec() functions. Am I wrong in that?"

That's not how the environment is passed to the child. All of the various flavors of exec() (which you find in section 3 of the manual because they are library functions) ultimately invoke the system call execve() (which you find in section 2 of the manual). The arguments are:

   int execve(const char *filename, char *const argv[], char *const envp[]);

The vector of environment variables is passed explicitly (and may be partly constructed from the results of your putenv() and setenv() calls). The kernel copies these into the address space of the new process. Historically there was a limit to the size of your environment derived from the space available for this copy (similar to the argument limit) but I'm not familiar with the restrictions on a modern Linux kernel.
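As an illustration of passing the vector explicitly, the sketch below hands execve() a hand-built envp, so the child sees exactly that environment and nothing inherited from the parent. It assumes /bin/sh exists; the helper name run_with_explicit_env and the variable GR_DEMO_CHILD are hypothetical.

```c
#define _XOPEN_SOURCE 700
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: exec a shell with a hand-built environment and
 * return its exit status (0 when the variable arrived), or -1 on failure. */
int run_with_explicit_env(void)
{
    char *const argv[] = { "sh", "-c", "test \"$GR_DEMO_CHILD\" = hello", NULL };
    char *const envp[] = { "GR_DEMO_CHILD=hello", NULL };

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* The kernel copies argv and envp into the new process image. */
        execve("/bin/sh", argv, envp);
        _exit(127); /* only reached if execve() failed */
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Neither putenv() nor setenv() is involved here, which is one reason this route sidesteps the ownership questions discussed in the other answers.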

Answer by Random832

There is no special "the environment" space - setenv just dynamically allocates space for the strings (with malloc for example) as you would do normally. Because the environment doesn't contain any indication of where each string in it came from, it is impossible for setenv or unsetenv to free any space which may have been dynamically allocated by previous calls to setenv.
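That setenv() works on its own copy can be checked with a short sketch like the following; the helper name setenv_copies_value and the variable GR_DEMO_COPY are made up for illustration.

```c
#define _XOPEN_SOURCE 700
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns 1 if the environment keeps the original
 * value after the caller's buffer is changed, 0 if not, -1 on error. */
int setenv_copies_value(void)
{
    char value[] = "first"; /* automatic storage is fine with setenv() */

    if (setenv("GR_DEMO_COPY", value, 1) != 0)
        return -1;

    /* Scribble on the caller's buffer; the environment has its own copy. */
    value[0] = 'X';

    const char *seen = getenv("GR_DEMO_COPY");
    return seen != NULL && strcmp(seen, "first") == 0;
}
```

The copy is what makes a local buffer safe to pass, and it is also why the caller has no handle with which to free the string later.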

"Because it doesn't copy the passed string you can't call it with a local and there is no guarantee a heap allocated string won't be overwritten or accidentally deleted." The purpose of putenv is to make sure that if you have a heap-allocated string it's possible to delete it on purpose. That's what the rationale text means by "the only function available to add to the environment without permitting memory leaks." And yes, you can call it with a local, just remove the string from the environment (putenv("FOO=") or unsetenv) before you return from the function.
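A sketch of the "local, but removed before returning" pattern follows. It uses unsetenv() for the removal and assumes a single-threaded program in which nothing else caches the pointer returned by getenv(); the helper name with_temporary_variable and the variable GR_DEMO_TMP are hypothetical.

```c
#define _XOPEN_SOURCE 700
#include <stdlib.h>

/* Hypothetical helper: expose a variable backed by a local buffer for
 * the duration of this call only. Returns 1 if the variable was visible
 * while the buffer was live, 0 if not, -1 on error. */
int with_temporary_variable(void)
{
    char entry[] = "GR_DEMO_TMP=1"; /* automatic storage */

    if (putenv(entry) != 0)
        return -1;

    int visible = getenv("GR_DEMO_TMP") != NULL;

    /* Remove the entry before 'entry' goes out of scope, so the
     * environment never holds a dangling pointer. */
    if (unsetenv("GR_DEMO_TMP") != 0)
        return -1;

    return visible;
}
```

Any work that needs the variable, such as spawning a child, would go between the putenv() and unsetenv() calls.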

The point is that using putenv makes the process of removing a string from the environment entirely deterministic. Whereas setenv will on some existing implementations modify an existing string in the environment if the new value is shorter (to avoid always leaking memory), and since it made a copy when you called setenv you're not in control of the originally dynamically allocated string, so you can't free it when it's removed.

Meanwhile, setenv itself (or unsetenv) can't free the previous string, since - even ignoring putenv - the string may have come from the original environment instead of being allocated by a previous invocation of setenv.

(This whole answer assumes a correctly implemented putenv, i.e. not the one in glibc 2.0-2.1.1 you mentioned.)