How does NOHZ=ON affect do_timer() in the Linux kernel?

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original: http://stackoverflow.com/questions/9775042/

Time: 2020-08-06 05:16:27  Source: igfitidea

Tags: linux, time, timer, linux-kernel

Asked by Eastern Monk

In a simple experiment I set NOHZ=OFF and used printk() to print how often the do_timer() function gets called. It gets called every 10 ms on my machine.

However, with NOHZ=ON there is a lot of jitter in how often do_timer() gets called. Most of the time it is called every 10 ms, but at times it completely misses its deadlines.

I have researched both do_timer() and NOHZ. do_timer() is the function responsible for updating the jiffies value, and it is also responsible for the round-robin scheduling of processes.

The NOHZ feature switches off the hi-res timers on the system.

What I am unable to understand is how hi-res timers can affect do_timer(). Even if the hi-res hardware is in a sleep state, the persistent clock is more than capable of executing do_timer() every 10 ms. Secondly, if do_timer() is not executing when it should, that means some processes are not getting their time slice when they ideally should. A lot of googling shows that, for many people, many applications start working much better with NOHZ=OFF.

To make a long story short, how does NOHZ=ON affect do_timer()?
Why does do_timer() miss its deadlines?

Accepted answer by Pavan Manjunath

First, let's understand what a tickless kernel is (NOHZ=ON, i.e. CONFIG_NO_HZ set) and what motivated its introduction into the mainline Linux kernel in 2.6.21.

From http://www.lesswatts.org/projects/tickless/index.php,

Traditionally, the Linux kernel used a periodic timer for each CPU. This timer did a variety of things, such as process accounting, scheduler load balancing, and maintaining per-CPU timer events. Older Linux kernels used a timer with a frequency of 100Hz (100 timer events per second or one event every 10ms), while newer kernels use 250Hz (250 events per second or one event every 4ms) or 1000Hz (1000 events per second or one event every 1ms).

This periodic timer event is often called "the timer tick". The timer tick is simple in its design, but has a significant drawback: the timer tick happens periodically, irrespective of the processor state, whether it's idle or busy. If the processor is idle, it has to wake up from its power saving sleep state every 1, 4, or 10 milliseconds. This costs quite a bit of energy, consuming battery life in laptops and causing unnecessary power consumption in servers.

With "tickless idle", the Linux kernel has eliminated this periodic timer tick when the CPU is idle. This allows the CPU to remain in power saving states for a longer period of time, reducing the overall system power consumption.
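The arithmetic in the quoted passage can be sketched in a few lines of C. This is only an illustration of the numbers above, not kernel code; the function names `tick_period_us` and `periodic_idle_wakeups` are made up for this example.

```c
#include <assert.h>

/* Length of one periodic timer tick in microseconds for a tick
 * frequency of hz (e.g. 100, 250 or 1000).
 * Hypothetical helper, not a kernel function. */
unsigned long tick_period_us(unsigned long hz)
{
    assert(hz > 0);
    return 1000000UL / hz;
}

/* Number of times an idle CPU is woken over idle_ms milliseconds
 * if the tick keeps firing periodically (no tickless idle).
 * Hypothetical helper, not a kernel function. */
unsigned long periodic_idle_wakeups(unsigned long hz, unsigned long idle_ms)
{
    assert(hz > 0);
    return hz * idle_ms / 1000UL;
}
```

For instance, `tick_period_us(100)` gives 10000 (the 10 ms tick seen in the question), and an idle CPU at 1000 Hz would be woken `periodic_idle_wakeups(1000, 1500)` = 1500 times during a 1.5 second idle period.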

So reducing power consumption was one of the main motivations for the tickless kernel. But, as is usually the case, lower power consumption comes at some cost in performance. For desktop computers performance is the primary concern, which is why NOHZ=OFF works well for most of them.

In Ingo Molnar's own words:

The tickless kernel feature (CONFIG_NO_HZ) enables 'on-demand' timer interrupts: if there is no timer to be expired for say 1.5 seconds when the system goes idle, then the system will stay totally idle for 1.5 seconds. This should bring cooler CPUs and power savings: on our (x86) testboxes we have measured the effective IRQ rate to go from HZ to 1-2 timer interrupts per second.
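A toy model may make the 'on-demand' idea more concrete. The sketch below counts time in ticks and compares how many interrupts an idle CPU takes before its next pending timer expires, with and without tickless idle. It is a simplification under stated assumptions: `next_wakeup` and `idle_interrupts` are hypothetical names, and the model ignores everything else that can wake a CPU (device interrupts, other CPUs).

```c
#include <assert.h>
#include <stdbool.h>

/* Tick at which an idle CPU is next interrupted by the timer.
 * Hypothetical model, time measured in ticks. */
unsigned long next_wakeup(unsigned long now,
                          unsigned long next_timer_expiry,
                          bool tickless)
{
    assert(next_timer_expiry > now);
    if (tickless)
        return next_timer_expiry; /* sleep through to the next pending timer */
    return now + 1;               /* periodic tick: wake on every tick */
}

/* Count timer interrupts taken while idling from `now` until the
 * next pending timer expires. */
unsigned long idle_interrupts(unsigned long now,
                              unsigned long next_timer_expiry,
                              bool tickless)
{
    unsigned long count = 0;

    while (now < next_timer_expiry) {
        now = next_wakeup(now, next_timer_expiry, tickless);
        count++;
    }
    return count;
}
```

Assuming a 10 ms tick, 1.5 seconds is 150 ticks: `idle_interrupts(0, 150, false)` yields 150 interrupts, while `idle_interrupts(0, 150, true)` yields a single one.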

Now, let's try to answer your queries:

What I am unable to understand is how can hi-res timers affect the do_timer?

If a system supports high-res timers, timer interrupts can occur more frequently than the usual 10 ms interval found on most systems; i.e. these timers try to make the system more responsive by leveraging the hardware's capabilities and firing timer interrupts even faster, say every 100 us. With the NOHZ option these timers are throttled back, and hence do_timer() executes less often.

Even if hi-res hardware is in sleep state the persistent clock is more than capable to execute do_timer every 10ms

Yes, it is capable. But the intention of NOHZ is exactly the opposite: to prevent frequent timer interrupts!

Secondly if do_timer is not executing when it should that means some processes are not getting their timeshare when they should ideally be getting it

As caf noted in the comments, NOHZ does not cause processes to get scheduled less often, because it only kicks in when the CPU is idle, in other words when no processes are schedulable. Only the process accounting work is done at a delayed time.

Why does do_timer miss its deadlines?

As elaborated above, this is the intended design of NOHZ.

I suggest you go through the tick-sched.c kernel source as a starting point. Search for CONFIG_NO_HZ and try to understand the new functionality added for the NOHZ feature.
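As a rough aid for reading that file, here is a user-space model of the catch-up idea: when the CPU wakes after a long idle period, all the tick periods that elapsed are accounted for in one go rather than one call per tick. As far as I understand it, this is why do_timer() takes a tick count as its argument, and why a printk() inside it can show long gaps without jiffies falling behind. The struct, the function name `update_jiffies` and the 10 ms period are assumptions made for this sketch; it is not the kernel's actual code.

```c
#include <assert.h>
#include <stdint.h>

#define TICK_PERIOD_NS 10000000LL /* 10 ms, i.e. HZ=100 (assumed) */

/* Hypothetical model of the timekeeping state. */
struct tick_state {
    int64_t  last_update_ns; /* time of the last accounted tick boundary */
    uint64_t jiffies;        /* model of the jiffies counter */
};

/* Account for every whole tick period elapsed since the last update.
 * Returns the number of ticks folded into this single call
 * (0 if less than one period has passed). */
unsigned long update_jiffies(struct tick_state *ts, int64_t now_ns)
{
    assert(now_ns >= ts->last_update_ns);

    int64_t delta = now_ns - ts->last_update_ns;
    unsigned long ticks = (unsigned long)(delta / TICK_PERIOD_NS);

    ts->last_update_ns += (int64_t)ticks * TICK_PERIOD_NS;
    ts->jiffies += ticks;
    return ticks;
}
```

In this model, a busy CPU calling `update_jiffies` every 10 ms advances the counter by 1 each time, while a CPU that slept for 1.5 seconds makes one call that advances it by 150.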

Here is one test performed to measure the Impact of a Tickless Kernel
