psutil 4.0.0 and how to get “real” process memory and environ in Python

psutil 4.0.0 is out, with some interesting news about process memory metrics. I’ll get straight to the point and describe what’s new.

“Real” process memory info

Determining how much memory a process really uses is not an easy matter (see this and this). RSS (Resident Set Size), which is what most people rely on, is misleading because it includes both the memory which is unique to the process and the memory shared with other processes. What is more interesting for profiling is the memory which would be freed if the process were terminated right now. In the Linux world this is called USS (Unique Set Size), and it is the major feature introduced in psutil 4.0.0 (not only for Linux but also for Windows and OSX).

USS memory

The USS (Unique Set Size) is the memory which is unique to a process and which would be freed if the process were terminated right now. On Linux it can be determined by parsing all the “private” blocks in /proc/pid/smaps. The Firefox team pushed this further and managed to do the same on OSX and Windows, which is great. The new version of psutil can now do the same:

>>> psutil.Process().memory_full_info()
pfullmem(rss=101990, vms=521888, shared=38804, text=28200, lib=0, data=59672, dirty=0, 
         uss=81623, pss=91788, swap=0)
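
To make the “parse the private blocks” idea concrete, here is a minimal sketch of how USS, PSS and swap could be summed from smaps-formatted text. It is not psutil’s actual implementation: the sample text, the parse_smaps() name and the regular expression are illustrative assumptions, and the field names (Private_Clean, Private_Dirty, Pss, Swap) are the ones the Linux kernel prints in /proc/pid/smaps.

```python
import re

# Hypothetical, trimmed sample in the format of /proc/<pid>/smaps (two mappings).
SAMPLE_SMAPS = """\
00400000-0040b000 r-xp 00000000 08:01 131 /bin/cat
Rss:                  44 kB
Pss:                  22 kB
Shared_Clean:         40 kB
Shared_Dirty:          0 kB
Private_Clean:         4 kB
Private_Dirty:         0 kB
Swap:                  0 kB
7ffd1c000000-7ffd1c021000 rw-p 00000000 00:00 0 [stack]
Rss:                  16 kB
Pss:                  16 kB
Shared_Clean:          0 kB
Shared_Dirty:          0 kB
Private_Clean:         0 kB
Private_Dirty:        16 kB
Swap:                  8 kB
"""

# Matches "Name:   123 kB" lines; mapping header lines do not match.
_FIELD = re.compile(r"^(\w+):\s+(\d+) kB$", re.MULTILINE)


def parse_smaps(text):
    """Return (uss, pss, swap) in bytes, summed over all mappings."""
    uss = pss = swap = 0
    for name, value in _FIELD.findall(text):
        kb = int(value)
        if name.startswith("Private_"):
            uss += kb  # memory unique to this process
        elif name == "Pss":
            pss += kb  # private memory plus a proportional share of shared memory
        elif name == "Swap":
            swap += kb  # memory swapped out to disk
    return uss * 1024, pss * 1024, swap * 1024


print(parse_smaps(SAMPLE_SMAPS))
```

On a Linux box the same function can be fed the contents of /proc/self/smaps; reading another user’s process usually requires higher privileges.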

PSS and swap

On Linux there are two additional metrics which can also be determined via /proc/pid/smaps: PSS and swap. PSS, aka “Proportional Set Size”, is the memory private to the process plus its share of the memory shared with other processes, where each shared amount is divided evenly between the processes that share it. I.e. if a process has 10 MBs all to itself (USS) and 10 MBs shared with another process, its PSS will be 10 + 10 / 2 = 15 MBs. “swap” is simply the amount of memory that has been swapped out to disk. With memory_full_info() it is possible to implement a tool like this, similar to smem on Linux, which provides a list of processes sorted by “USS”. It is interesting to notice how RSS differs from USS:

~/svn/psutil$ ./scripts/procsmem.py
PID     User    Cmdline                            USS     PSS    Swap     RSS
==============================================================================
...
3986    giampao /usr/bin/python3 /usr/bin/indi   15.3M   16.6M      0B   25.6M
3906    giampao /usr/lib/ibus/ibus-ui-gtk3       17.6M   18.1M      0B   26.7M
3991    giampao python /usr/bin/hp-systray -x    19.0M   23.3M      0B   40.7M
3830    giampao /usr/bin/ibus-daemon --daemoni   19.0M   19.0M      0B   21.4M
20529   giampao /opt/sublime_text/plugin_host    19.9M   20.1M      0B   22.0M
3990    giampao nautilus -n                      20.6M   29.9M      0B   50.2M
3898    giampao /usr/lib/unity/unity-panel-ser   27.1M   27.9M      0B   37.7M
4176    giampao /usr/lib/evolution/evolution-c   35.7M   36.2M      0B   41.5M
20712   giampao /usr/bin/python -B /home/giamp   45.6M   45.9M      0B   49.4M
3880    giampao /usr/lib/x86_64-linux-gnu/hud/   51.6M   52.7M      0B   61.3M
20513   giampao /opt/sublime_text/sublime_text   65.8M   73.0M      0B   87.9M
3976    giampao compiz                          115.0M  117.0M      0B  130.9M
32486   giampao skype                           145.1M  147.5M      0B  149.6M
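
A stripped-down version of such a tool could look like the sketch below. It is not the procsmem.py script itself: the human() and processes_by_uss() helpers are made-up names, the column layout is simplified, and the code assumes a platform where memory_full_info() reports a uss field (Linux, Windows or OSX).

```python
import psutil


def human(nbytes):
    """Format a byte count roughly like the table above (e.g. 15.3M)."""
    if nbytes < 1024:
        return f"{nbytes}B"
    for unit in ("K", "M", "G"):
        nbytes /= 1024
        if nbytes < 1024 or unit == "G":
            return f"{nbytes:.1f}{unit}"


def processes_by_uss():
    """Return [(pid, name, uss, rss)] sorted by ascending USS."""
    rows = []
    for p in psutil.process_iter():
        try:
            mem = p.memory_full_info()
            rows.append((p.pid, p.name(), mem.uss, mem.rss))
        except psutil.Error:
            # AccessDenied, NoSuchProcess and friends: skip this process.
            continue
    return sorted(rows, key=lambda row: row[2])


for pid, name, uss, rss in processes_by_uss():
    print(f"{pid:<8}{name:<25.25}{human(uss):>10}{human(rss):>10}")
```

Run as a normal user, this only lists the processes whose address space you are allowed to inspect; run as root it covers everything.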

Implementation

In order to get these values (USS, PSS and swap) we need to pass through the whole process address space. This usually requires higher user privileges and is considerably slower than getting the “usual” memory metrics via Process.memory_info(), which is probably the reason why tools like ps and top show RSS/VMS instead of USS. A big thanks goes to the Mozilla team, who figured out all this stuff on Windows and OSX, and to Eric Rahm, who put the PRs for psutil together (see #744, #745 and #746). For those of you who don’t use Python and would like to port the code to other languages, here are the interesting parts:

  • Linux
  • OSX
  • Windows

Memory type percent

After reorganizing process memory APIs I decided to add a new memtype parameter to Process.memory_percent(). With this it is now possible to compare a specific memory type (not only RSS) against the total physical memory. E.g.

>>> psutil.Process().memory_percent(memtype='pss')
0.06877466326787016
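
As a small sketch of how this can be used, the snippet below prints the percentage for two memory types of the current process and shows a hand-computed RSS percentage next to it for comparison. It assumes a platform where memory_full_info() exposes uss; the manual formula is my reading of “against the total physical memory”, not a quote of psutil’s code.

```python
import psutil

p = psutil.Process()  # the current process

# memtype names are field names of memory_full_info(); 'uss' is assumed to be
# available here (Linux, Windows or OSX according to the post).
for memtype in ("rss", "uss"):
    percent = p.memory_percent(memtype=memtype)
    print(f"{memtype}: {percent:.3f}% of physical memory")

# Roughly the same RSS figure computed by hand (the values can differ slightly
# because memory usage changes between the two calls).
manual = 100.0 * p.memory_info().rss / psutil.virtual_memory().total
print(f"rss (manual): {manual:.3f}%")
```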

Process environ

The second biggest improvement in psutil 4.0.0 is the ability to determine the process environment variables. This opens interesting possibilities for process recognition and monitoring techniques. For instance, one might start a process by passing a certain custom environment variable, then iterate over all processes to find the one of interest (and figure out whether it’s running or whatever):

import psutil
for p in psutil.process_iter():
    try:
        env = p.environ()
    except psutil.Error:
        pass
    else:
        if 'MYAPP' in env:
            ...
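
To show both halves of that idea together, here is a hedged end-to-end sketch: it starts a child process tagged with the MYAPP variable and then looks it up through environ(). The find_marked() helper and the sleeping child are illustrative assumptions, not part of psutil.

```python
import os
import subprocess
import sys

import psutil

MARKER = "MYAPP"  # same illustrative variable name as in the loop above


def find_marked(marker):
    """Return the processes whose environment contains `marker`."""
    found = []
    for p in psutil.process_iter():
        try:
            if marker in p.environ():
                found.append(p)
        except psutil.Error:
            # Process is gone or not accessible: ignore it.
            continue
    return found


# Start a child that just sleeps, tagged through its environment.
child = subprocess.Popen(
    [sys.executable, "-c", "import time; time.sleep(30)"],
    env=dict(os.environ, **{MARKER: "1"}),
)
try:
    pids = [p.pid for p in find_marked(MARKER)]
    print(child.pid in pids)
finally:
    child.terminate()
    child.wait()
```

The marker has to be passed at launch time through env=..., since the environment is looked up from outside the process.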

Process environ is a long standing issue (year 2009) which I had given up on implementing because the Windows implementation worked for the current process only. Frank Benkstein solved that, and the process environ can now be determined on Linux, Windows and OSX for all processes (of course you may still bump into AccessDenied for processes owned by another user):

>>> import psutil
>>> from pprint import pprint as pp
>>> pp(psutil.Process().environ())
{...
 'CLUTTER_IM_MODULE': 'xim',
 'COLORTERM': 'gnome-terminal',
 'COMPIZ_BIN_PATH': '/usr/bin/',
 'HOME': '/home/giampaolo',
 'PWD': '/home/giampaolo/svn/psutil',
  }
>>>

It must be noted that the resulting dict usually does not reflect changes made after the process started (e.g. os.environ['MYAPP'] = '1'). Again, for whoever is interested in doing this in other languages, here are the interesting parts:

  • Linux
  • OSX
  • Windows

Extended disk IO stats

psutil.disk_io_counters() returns a new busy_time field on Linux and FreeBSD, and two new read_merged_count and write_merged_count fields on Linux only. With these new values it is now possible to have a better representation of actual disk utilization, similar to the iostat command on Linux.

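
As a rough sketch of how busy_time can be turned into a utilization figure, the snippet below samples the counters twice and reports the share of the interval each disk spent doing I/O, which is approximately what iostat shows as %util. The busy_percent() helper and the one-second interval are assumptions for illustration; busy_time is expressed in milliseconds.

```python
import time

import psutil


def busy_percent(busy_before_ms, busy_after_ms, interval_s):
    """Percentage of the interval the disk spent doing I/O."""
    return 100.0 * (busy_after_ms - busy_before_ms) / (interval_s * 1000.0)


interval = 1.0
before = psutil.disk_io_counters(perdisk=True)
time.sleep(interval)
after = psutil.disk_io_counters(perdisk=True)

for disk, counters in after.items():
    # busy_time is only reported on Linux and FreeBSD, hence the hasattr guard.
    if disk in before and hasattr(counters, "busy_time"):
        util = busy_percent(before[disk].busy_time, counters.busy_time, interval)
        print(f"{disk:<12}{util:6.1f}%")
```

Because the sleep is not perfectly precise, the result can land slightly above 100% on a saturated disk; measuring the elapsed time around the two samples would tighten that.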

OS constants

psutil now exposes constants to quickly differentiate what platform you’re on: psutil.LINUX, psutil.WINDOWS, etc.

Main bug fixes

  • #734: on Python 3 invalid UTF-8 data was not correctly handled for the Process name(), cwd(), exe(), cmdline() and open_files() methods, resulting in UnicodeDecodeError. This was affecting all platforms. The surrogateescape error handler is now used as a workaround for replacing the corrupted data.
  • #761: [Windows] psutil.boot_time() no longer wraps to 0 after 49 days.
  • #767: [Linux] disk_io_counters() may raise ValueError on 2.6 kernels and it’s broken on 2.4 kernels.
  • #764: psutil can now be compiled on NetBSD-6.X.
  • #704: psutil can now be compiled on Solaris sparc.
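
For readers who have not met surrogateescape before (see #734 above), here is a minimal illustration of what the error handler does in plain Python. It only demonstrates the standard library behaviour; it is not psutil’s code, and the byte string is a made-up example.

```python
raw = b"caf\xe9"  # Latin-1 encoded bytes: not valid UTF-8

try:
    raw.decode("utf-8")
except UnicodeDecodeError as exc:
    print("strict decoding fails:", exc.reason)

# surrogateescape maps each undecodable byte to a lone surrogate code point
# instead of raising...
name = raw.decode("utf-8", errors="surrogateescape")
print(ascii(name))

# ...so encoding with the same handler restores the original bytes.
assert name.encode("utf-8", errors="surrogateescape") == raw
```

Note the use of ascii() for printing: a string holding lone surrogates cannot be encoded with the strict handler, so writing it to a UTF-8 terminal as-is would fail.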

Porting code

Since 4.0.0 is a major version, I took the chance to (lightly) change / break some APIs.
  • Process.memory_info() no longer returns just an (rss, vms) namedtuple. Instead it returns a namedtuple of variable length, changing depending on the platform (rss and vms are always present though, also on Windows). Basically it returns the same result as the old Process.memory_info_ex(). This shouldn’t break your existing code, unless you were doing "rss, vms = p.memory_info()".
  • At the same time Process.memory_info_ex() is now deprecated. The method is still there as an alias for memory_info(), issuing a deprecation warning.
  • psutil.disk_io_counters() returns 2 additional fields on Linux and 1 additional field on FreeBSD.
  • psutil.disk_io_counters() on NetBSD and OpenBSD no longer returns the write_count and read_count metrics because the kernel does not provide them (we were returning the busy time instead). I don’t expect this to be a big issue because NetBSD and OpenBSD support is very recent.
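
If your code did unpack the result of memory_info() into two names, the following sketch shows one way to port it. It is a suggestion rather than an official migration recipe; the variable names are arbitrary.

```python
import psutil

p = psutil.Process()
mem = p.memory_info()

# Breaks on 4.0.0 wherever the namedtuple has more than two fields:
#     rss, vms = p.memory_info()

# Attribute access keeps working no matter how many fields the platform adds.
rss, vms = mem.rss, mem.vms
print(rss, vms)

# The available fields can be inspected at runtime.
print(mem._fields)
```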

Final notes and looking for a job

OK, this is it. I would like to spend a couple more words to announce to the world that I’m currently unemployed and looking for a remote gig again! =) I want remote because my plan for this year is to remain in Prague (Czech Republic) and possibly spend 2-3 months in Asia. If you know of any company looking for a Python backend dev who can work from afar, feel free to share my Linkedin profile or mail me at g.rodola [at] gmail [dot] com.

External links

  • reddit
  • hacker news

Translated from: https://www.pybloggers.com/2016/02/psutil-4-0-0-and-how-to-get-real-process-memory-and-environ-in-python/
