[Original] chrt: failed to set pid xxxx's policy: Operation not permitted

KEYWORDS: scheduler, scheduling, sysctl, sched_fifo, sched_rr, chrt, deadline, sched_deadline

HISTORY:

  • Created at 09:50:43 on 2019-03-20.

First observed on a physical machine (Dell Inc. OptiPlex 790) running CentOS 7.6.

  • kernel-3.10.0-957.5.1.el7.x86_64.
  • util-linux-2.23.2-59.el7.x86_64.

# cat /proc/sys/kernel/sched_rt_runtime_us
950000

# ulimit -a | grep priority
scheduling priority             (-e) 0
real-time priority              (-r) 0

# sleep 300 &
[1] 9222
# chrt -f -p 20 9222
chrt: failed to set pid 9222's policy: Operation not permitted

# chrt -f 20 whoami
chrt: failed to set pid 0's policy: Operation not permitted

Analysis

According to [1] (see References), chrt works after running the following command.

# sysctl -w kernel.sched_rt_runtime_us=-1
kernel.sched_rt_runtime_us = -1

# chrt -f 20 whoami
root

The kernel documentation (/usr/share/doc/kernel-doc-3.10.0/Documentation/scheduler/sched-rt-group.txt) says:

A run time of -1 specifies runtime == period, ie. no limit.

For more information about sched_rt_runtime_us, refer to Terminology/scheduling_policies.md.
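
To make the meaning of these values concrete, here is a minimal sketch that turns a runtime/period pair into the share of CPU time available to real-time tasks. It assumes the default period of 1000000 microseconds given for /proc/sys/kernel/sched_rt_period_us in sched-rt-group.txt; the function name rt_budget is made up for illustration.

```shell
#!/bin/sh
# rt_budget (hypothetical helper): describe the real-time budget implied by
# a runtime/period pair, both given in microseconds.
rt_budget() {
    runtime=$1
    period=$2
    if [ "$runtime" -eq -1 ]; then
        # -1 means runtime == period, i.e. no limit.
        echo "unlimited"
    else
        # Integer percentage of each period that real-time tasks may use.
        echo "$(( runtime * 100 / period ))%"
    fi
}

# On a live system the values would come from procfs:
#   rt_budget "$(cat /proc/sys/kernel/sched_rt_runtime_us)" \
#             "$(cat /proc/sys/kernel/sched_rt_period_us)"
rt_budget 950000 1000000   # the default shown above
rt_budget -1 1000000       # after sysctl -w kernel.sched_rt_runtime_us=-1
```

With the default 950000, real-time tasks may use 95% of each one-second period, leaving 5% for everything else.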


However, this is not an issue on a virtual machine with the same package versions, nor on a physical host or virtual machine running CentOS 6.x.

  • Is it a bug?
  • Does this issue relate to control groups?
  • There must be some difference between the physical machine and the virtual machine. But what is it?
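
One way to start on the control-group question is to look up which cpu cgroup a process belongs to and read that group's real-time budget. sched-rt-group.txt describes a per-group cpu.rt_runtime_us file and notes that new groups get a run time of 0, so a zero budget there would be a candidate explanation. This is a hypothesis and was not verified on the machines above. The sketch below assumes the cgroup v1 format of /proc/PID/cgroup and a cpu controller mounted under /sys/fs/cgroup/cpu; the function name cpu_cgroup_path and the sample input are made up for illustration.

```shell
#!/bin/sh
# cpu_cgroup_path (hypothetical helper): read /proc/PID/cgroup content on
# stdin and print the path of the group attached to the "cpu" controller.
cpu_cgroup_path() {
    # Each line is "hierarchy-ID:controller-list:path". Match "cpu" as a
    # whole controller name so that cpuset and cpuacct are not picked up.
    awk -F: '{
        n = split($2, ctrl, ",")
        for (i = 1; i <= n; i++)
            if (ctrl[i] == "cpu") {
                sub(/^[^:]*:[^:]*:/, "")   # drop the first two fields
                print
                exit
            }
    }'
}

# Made-up sample input in cgroup v1 format:
cpu_cgroup_path <<'EOF'
11:memory:/user.slice
4:cpuacct,cpu:/user.slice
2:cpuset:/
EOF

# On a live system (mount point is an assumption):
#   path=$(cpu_cgroup_path < /proc/$$/cgroup)
#   cat "/sys/fs/cgroup/cpu${path}/cpu.rt_runtime_us"
```

Comparing that value between the physical and the virtual machine, for the same login method, would show whether the two shells really start in different groups.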

Digging Deeper

  • On Physical Machine with CentOS 7.x (kernel-3.10.0-957.5.1.el7.x86_64 and util-linux-2.23.2-59.el7.x86_64)

    # sysctl -a | grep sched_rt_runtime_us
    kernel.sched_rt_runtime_us = 950000
    
    # strace -fe sched_setattr,sched_setscheduler chrt -f 10 whoami
    sched_setattr(0, {size=48, sched_policy=SCHED_FIFO, sched_flags=0, sched_nice=0, sched_priority=10, sched_runtime=0, sched_deadline=0, sched_period=0}, 0) = -1 EPERM (Operation not permitted)
    chrt: failed to set pid 0's policy: Operation not permitted
    +++ exited with 1 +++
    
    # sysctl -w kernel.sched_rt_runtime_us=-1
    kernel.sched_rt_runtime_us = -1
    
    # strace -fe sched_setattr,sched_setscheduler chrt -f 10 whoami
    sched_setattr(0, {size=48, sched_policy=SCHED_FIFO, sched_flags=0, sched_nice=0, sched_priority=10, sched_runtime=0, sched_deadline=0, sched_period=0}, 0) = 0
    root
    +++ exited with 0 +++
    

    man 2 sched_setattr (read on Mint 19.1; no sched_setattr man page is available on CentOS 7.x)

    • EPERM. The caller does not have appropriate privileges.
    • EPERM. The CPU affinity mask of the thread specified by pid does not include all CPUs in the system (see sched_setaffinity(2)).

    These system calls first appeared in Linux 3.14.


  • On Virtual Machine with CentOS 7.x (kernel.sched_rt_runtime_us = 950000)

    The results differ between package versions.

    • kernel-3.10.0-862.6.3.el7.x86_64 and util-linux-2.23.2-52.el7.x86_64

      # strace -fe sched_setattr,sched_setscheduler chrt -f 10 whoami
      sched_setattr(0, {size=48, sched_policy=SCHED_FIFO, sched_flags=0, sched_nice=0, sched_priority=10, sched_runtime=0, sched_deadline=0, sched_period=0}, 0) = -1 ENOSYS (Function not implemented)
      sched_setscheduler(0, SCHED_FIFO, [10]) = 0
      root
      +++ exited with 0 +++
      
    • kernel-3.10.0-957.5.1.el7.x86_64 and util-linux-2.23.2-59.el7.x86_64

      # strace -fe sched_setattr,sched_setscheduler chrt -f 10 whoami
      sched_setattr(0, {size=48, sched_policy=SCHED_FIFO, sched_flags=0, sched_nice=0, sched_priority=10, sched_runtime=0, sched_deadline=0, sched_period=0}, 0) = 0
      root
      +++ exited with 0 +++
      
  • On Virtual Machine with CentOS 6.x (kernel-2.6.32-696.20.1.el6.x86_64 and util-linux-ng-2.17.2-12.28.el6.x86_64, kernel.sched_rt_runtime_us = 950000)

    # strace -fe sched_setattr,sched_setscheduler chrt -f 10 whoami
    strace: invalid system call `sched_setattr'
    
    # strace -fe sched_setscheduler chrt -f 10 whoami
    sched_setscheduler(0, SCHED_FIFO, { 10 }) = 0
    root
    
  • On Physical Machine with Mint 19.1 (linux-image-4.15.0-45-generic and util-linux-2.31.1-0.4ubuntu3.3) or Ubuntu 16.04 (linux-image-4.15.0-39-generic and util-linux-2.27.1-6ubuntu3.2)

    # strace -fe sched_setattr,sched_setscheduler chrt -f 10 whoami
    sched_setscheduler(0, SCHED_FIFO, [10]) = 0
    root
    +++ exited with 0 +++
    

I guess the key to this issue is sched_setattr, and it is probably a bug; see [2].


It is a bug. A related issue is “EPERM running chrt from shell with positive niceness #359” [3].

However, nobody there mentioned SCHED_FIFO, and in my example I was running as root.

As the developer said: “Well, both syscalls have priority as argument and it should be zero for SCHED_BATCH. It’s strange and inconsistent API that sched_setattr() interprets zero priority as attempt to change task priority. It makes the syscall useless.”

So I guess this problem is related to two parameters of sched_setattr(): sched_runtime (because kernel.sched_rt_runtime_us=-1 makes chrt work, and this parameter is also used by SCHED_DEADLINE [4]) and sched_priority.

The following example seems to support that.

# sleep 3000 &
[1] 7000

# sysctl -w kernel.sched_rt_runtime_us=-1
kernel.sched_rt_runtime_us = -1

# strace -fe sched_setattr,sched_setscheduler chrt --verbose -f -p 1 7000
pid 7000's current scheduling policy: SCHED_OTHER
pid 7000's current scheduling priority: 0
sched_setattr(7000, {size=48, sched_policy=SCHED_FIFO, sched_flags=0, sched_nice=0, sched_priority=1, sched_runtime=0, sched_deadline=0, sched_period=0}, 0) = 0
pid 7000's new scheduling policy: SCHED_FIFO
pid 7000's new scheduling priority: 1
+++ exited with 0 +++

# sysctl -w kernel.sched_rt_runtime_us=950000
kernel.sched_rt_runtime_us = 950000

# chrt -f -p 2 7000
chrt: failed to set pid 7000's policy: Operation not permitted

# renice -n 0 -p 7000
7000 (process ID) old priority -2, new priority 0
# renice -n 2 -p 7000
7000 (process ID) old priority 0, new priority 2
# renice -n -2 -p 7000
7000 (process ID) old priority 2, new priority -2

# strace -fe sched_setattr,sched_setscheduler chrt -f -p 2 7000
sched_setattr(7000, {size=48, sched_policy=SCHED_FIFO, sched_flags=0, sched_nice=0, sched_priority=2, sched_runtime=0, sched_deadline=0, sched_period=0}, 0) = -1 EPERM (Operation not permitted)
chrt: failed to set pid 7000's policy: Operation not permitted
+++ exited with 1 +++

Remaining Questions

So far, I still have no idea why chrt works on the virtual machine.

Temporary Workaround

Make sure you know exactly what you are doing before using it. With the limit removed, a runaway real-time task can monopolize a CPU.

# On the fly
sysctl -w kernel.sched_rt_runtime_us=-1

# Persistent (both lines are appended to /etc/sysctl.conf)
echo "# Added by dreamer_catcher on 2019-03-20" >> /etc/sysctl.conf
echo "kernel.sched_rt_runtime_us = -1" >> /etc/sysctl.conf
sysctl -p

Solution (Not Tested)

You may need to download the latest released source code of util-linux and compile it manually.

References

  • [1]. FIFO Policy.
  • [2]. chrt from shell scripts: operation not permitted.
  • [3]. EPERM running chrt from shell with positive niceness #359.
  • [4]. Deadline Task Scheduling.
