Static Code Analysis for Python
Static code analysis looks at the code without executing it. It is usually extremely fast to execute, requires little effort to add to your workflow, and can uncover common mistakes. The only downside is that it is not tailored towards your code.
In this article, you will learn how to perform various types of static code analysis in Python. While the article focuses on Python, the types of analysis can be done in any programming language.
Code Complexity
One way to measure code complexity is cyclomatic complexity, also called McCabe complexity, as defined in “A Complexity Measure”:
CC = E - N + 2*P
where N is the number of nodes in the control flow graph, E is the number of edges, and P is the number of connected components (1 for a single function or program). For a single function, the result equals the number of decision points (if-statements, while/for loops) plus one.
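To make the formula concrete, here is a minimal sketch that evaluates E - N + 2*P for a hand-written control flow graph. It takes P as the number of connected components (1 for a single function), as in McCabe's definition. The function name `cyclomatic_complexity` and the example graph are made up for illustration; this is not how radon computes the value.

```python
def cyclomatic_complexity(edges, connected_components=1):
    """Compute E - N + 2*P for a control flow graph given as (source, target) edges."""
    nodes = {node for edge in edges for node in edge}
    return len(edges) - len(nodes) + 2 * connected_components


# Hypothetical control flow graph of a function with a single if/else:
# the condition branches to two blocks, which both lead to the exit.
if_else_graph = [
    ("condition", "then_block"),
    ("condition", "else_block"),
    ("then_block", "exit"),
    ("else_block", "exit"),
]

print(cyclomatic_complexity(if_else_graph))  # E=4, N=4, P=1 -> 2
```

One decision point gives a complexity of 2, which matches the "decision points plus one" reading for a single function.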
You can calculate it in Python with radon:
$ pip install radon
$ radon cc mpu/aws.py -s
mpu/aws.py
F 85:0 s3_download - B (6)
F 16:0 list_files - A (3)
F 165:0 _s3_path_split - A (2)
F 46:0 s3_read - A (1)
F 141:0 s3_upload - A (1)
C 77:0 ExistsStrategy - A (1)
The first letter shows the type of block (F for function, C for class). Then radon gives the line number, the name of the class/function, a grade (A, B, C, D, E, or F), and the actual complexity as a number. Typically, a complexity below 10 is ok. The most complex part of scipy has a complexity of 61.
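If you want a feeling for where such numbers come from, the following sketch approximates the complexity of each function by counting decision points in the syntax tree and adding one. It only uses the standard library. The set of counted node types is my own simplification, so the results can differ from what radon reports, and nested functions are also counted towards their enclosing function.

```python
import ast

# Node types that open an additional path through the code.
BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp)


def approximate_complexity(source):
    """Return {function name: 1 + number of decision points} for a module."""
    result = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            decisions = 0
            for child in ast.walk(node):
                if isinstance(child, BRANCH_NODES):
                    decisions += 1
                elif isinstance(child, ast.BoolOp):
                    # "a and b and c" adds two extra paths
                    decisions += len(child.values) - 1
            result[node.name] = 1 + decisions
    return result


EXAMPLE = '''
def sign(x):
    if x > 0:
        return 1
    elif x < 0:
        return -1
    return 0
'''

print(approximate_complexity(EXAMPLE))  # {'sign': 3}
```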
Besides radon, there are various other packages and Flake8 plugins:
flake8-annotations-complexity: Nudges you to name complex types
flake8-cognitive-complexity: Validates the cognitive complexity of functions
flake8-expression-complexity: Makes sure that single expressions are not too complicated; similar to cyclomatic complexity for functions / classes.
flake8-functions: Reports functions that are too long or have too many arguments
mccabe: This is used by a couple of other tools and projects
wily: A command-line application for tracking and reporting on the complexity of Python tests and applications.
xenon: Relies on radon
Style Guides
Make your code look professional.
You might have heard the term “pythonic code”. It means not only writing correct Python code but also using the language’s features the way they are intended to be used (source). It is certainly an opinionated term, but there are a lot of plugins that show you what a large part of the community considers to be pythonic.
Writing code in a similar style to other Python projects is valuable as people will have an easier time reading the code. This is important as we read software more often than we write it (source).
So, what is pythonic code?
Let’s start with PEP-8: It’s a style guide written and accepted by the Python community in 2001. It has been around for a while, and most people want to follow most of it. The part I’ve seen most people disagree with is the maximum line length of 79 characters. I always recommend following this advice in 95% of your codebase. I gave reasons for that.
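To show how simple such a style rule is to check mechanically, here is a minimal sketch of a line-length check. It mimics only the idea behind the "line too long" warning that pycodestyle reports as E501. The helper `find_long_lines` and the sample source are made up for illustration.

```python
MAX_LINE_LENGTH = 79  # the limit PEP 8 suggests for code


def find_long_lines(source, limit=MAX_LINE_LENGTH):
    """Return (line number, length) for every line longer than the limit."""
    return [
        (number, len(line))
        for number, line in enumerate(source.splitlines(), start=1)
        if len(line) > limit
    ]


# Hypothetical source: a short line followed by a 93-character line
sample = "short = 1\n" + "long_name = " + "1 + " * 20 + "1\n"
print(find_long_lines(sample))  # [(2, 93)]
```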
For pure code formatting, you should use an auto formatter. I have grown to like black because it does NOT allow customization. Code formatted by black always looks the same. As you cannot customize it, you don’t need to discuss it. It simply settles conflicting styles and the arguments around them. Black is maintained by the Python Software Foundation and is likely the most commonly adopted auto formatter for Python.
yapf by Google is another auto formatter.
Docstrings
Reading the manual can be fun if it’s written well. Lasagne and Scipy have pretty good documentation.
For docstrings, there is PEP-257. All of those rules are widely accepted in the community, but they still allow a wide variety of docstrings. There are three commonly used styles:
NumpyDoc-style docstrings: Used by Numpy and Scipy. It is reStructuredText with some specified sections such as Parameters and Returns in a fixed order.
Google-style docstrings: A super-slim format which has Args: and Returns: sections.
Sphinx-style docstrings: A very flexible format that uses reStructuredText.
I love the NumpyDoc format as it is super easy to read even when you just have it inside a text editor. Numpydoc is also well-supported by editors.
Here you can see the three in comparison:
def get_meta_numpydoc(filepath, a_number, a_dict):
    """
    Get meta-information of an image.

    Lorem ipsum dolor sit amet, consectetuer adipiscing elit.
    Aenean commodo ligula eget dolor. Aenean massa.
    Cum sociis natoque penatibus et magnis dis
    parturient montes, nascetur ridiculus mus.

    Parameters
    ----------
    filepath : str
        Get metadata from this file
    a_number : int
        Some more details
    a_dict : dict
        Configuration

    Returns
    -------
    meta : dict
        Extracted meta information

    Raises
    ------
    IOError
        File could not be read
    """


def get_meta_google_doc(filepath, a_number, a_dict):
    """Get meta-information of an image.

    Lorem ipsum dolor sit amet, consectetuer adipiscing elit.
    Aenean commodo ligula eget dolor. Aenean massa.
    Cum sociis natoque penatibus et magnis dis
    parturient montes, nascetur ridiculus mus.

    Args:
        filepath: Get metadata from this file.
        a_number: Some more details.
        a_dict: Configuration.

    Returns:
        Extracted meta information.

    Raises:
        IOError: File could not be read.
    """


def get_meta_sphinx_doc(filepath, a_number, a_dict):
    """
    Get meta-information of an image.

    Lorem ipsum dolor sit amet, consectetuer adipiscing elit.
    Aenean commodo ligula eget dolor. Aenean massa.
    Cum sociis natoque penatibus et magnis dis
    parturient montes, nascetur ridiculus mus.

    :param filepath: Get metadata from this file
    :type filepath: str
    :param a_number: Some more details
    :type a_number: int
    :param a_dict: Configuration
    :type a_dict: dict
    :returns: dict -- Extracted meta information
    :raises: IOError
    """
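Whichever style you pick, the first thing a docstring checker verifies is that a docstring exists at all. The following sketch shows that single check with the standard library's `ast` module. The function `functions_without_docstring` and the example source are hypothetical, and real tools check far more than this.

```python
import ast


def functions_without_docstring(source):
    """Return the names of all functions in the source that lack a docstring."""
    tree = ast.parse(source)
    return [
        node.name
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
        and ast.get_docstring(node) is None
    ]


EXAMPLE = '''
def documented(filepath):
    """Get meta-information of an image."""


def undocumented(filepath):
    return filepath
'''

print(functions_without_docstring(EXAMPLE))  # ['undocumented']
```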
Flake8
You should always use a linter, as Alberto Gimeno pointed out. Linters can check your style, but more importantly, show potential errors.
Flake8 is a wrapper around PyFlakes, pycodestyle, and a McCabe script. It is the most commonly used tool for linting in Python. Flake8 is awesome because there are so many plugins for it. I found 223 packages with the string “flake8” in the name and looked at many of them. I’ve also looked at packages with the trove classifier Framework :: Flake8 and found 143 packages, of which 122 started with flake8-. Only 21 packages had the Flake8 Framework trove classifier but didn’t start with flake8-, and only two of them looked interesting.
Side note: Typosquatting is an issue every open package repository has to fight (Bachelor’s Thesis: Typosquatting in Programming Language Package Managers, which has a blog post and an interesting follow-up, Bachelor’s Thesis: Attacks on Package Managers). There are examples of it causing harm in Python (2017, 2017, 2017, 2019, 2019, 2019). There is pypi-scan for finding examples and pypi-parker to prevent common typos from being used. William Bengtsson also did something similar to harden the Python community against this threat. See his article for more information about his project. Package parking inflates the number of packages on PyPI, and I filtered them out by looking for the summary “A package to prevent exploit”.
Here are some of the interesting flake8 plugins:
cohesion: Checks if class cohesion is below a threshold. This indicates that functionality should be split out of a class.
flake8-assert-msg: Makes sure assert statements have messages
flake8-blind-except: Prevents Pokemon exception catching
flake8-builtins: Checks for Python builtins being used as variables or parameters.
flake8-docstrings: Adds pydocstyle support
flake8-isort: Uses isort to check if the imports in your Python files are sorted the way you expect
flake8-logging-format: Validates (lack of) logging format strings
flake8-pytest-style: Checks for common style issues or inconsistencies in pytest-based tests
flake8-requirements: Checks/validates package import requirements. It reports missing and/or unused direct project dependencies
flake8-graphql: Lints GraphQL query strings
flake8_implicit_str_concat: Goes well with black
flake8-mock: Checks for mistakes using mocks
flake8-nb: Checks Jupyter notebooks
flake8-pyi: Lints stub files
flake8-variables-names: Finds common “meaningless” variable names
pep8-naming: Checks your code against PEP 8 naming conventions
pandas-vet: Opinionated linting for Pandas code
wemake-python-styleguide: An opinionated style guide/checker which seems to be pretty popular. I haven’t seen that one before, though.
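As an illustration of what “Pokemon exception catching” (gotta catch ’em all) means, here is a small sketch. The two functions are made up for this example. The bare `except` in the first one is the kind of pattern such plugins complain about, and pycodestyle reports it as E722.

```python
def parse_port_blind(text):
    """Swallows every error, including bugs in the calling code."""
    try:
        return int(text)
    except:  # noqa: E722 - the pattern such a plugin complains about
        return None


def parse_port(text):
    """Catches only the error that is expected for bad input."""
    try:
        return int(text)
    except ValueError:
        return None


print(parse_port("8080"))  # 8080
print(parse_port("http"))  # None
```

Passing `None` by accident shows the difference: `parse_port_blind(None)` quietly returns `None`, while `parse_port(None)` raises a `TypeError` that points you to the real bug.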
An alternative to parts of Flake8 is prospector. It combines several tools, but it is far less commonly used and thus not as flexible as Flake8.
Flake8: Security and Bugs
Be safe by looking at warning signs.
flake8-bandit: Security testing
flake8-bugbear: Finds likely bugs and design problems in your program. Usually it’s silent, but when it’s not, you should have a look
flake8-requests: Checks usage of the requests framework
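A classic example of a “likely bug” is a mutable default argument, which flake8-bugbear reports as B006. The sketch below shows the problem and the usual fix. The function names are made up for illustration.

```python
def append_buggy(item, items=[]):
    """The default list is created once and shared between calls."""
    items.append(item)
    return items


def append_fixed(item, items=None):
    """A fresh list is created on every call that omits `items`."""
    if items is None:
        items = []
    items.append(item)
    return items


print(append_buggy(1), append_buggy(2))  # [1, 2] [1, 2]
print(append_fixed(1), append_fixed(2))  # [1] [2]
```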
Flake8: Remove Debugging Artifacts
This has happened to me quite a few times: I added some code while developing a new feature or debugging an old one and forgot to remove it afterward. It was most often caught by the reviewer, but there is no need to distract the reviewer with this.
flake8-breakpoint checks for forgotten breakpoints and flake8-print will complain about every print statement. flake8-debugger, flake8-fixme, flake8-todo go in the same direction.
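The idea behind these plugins can be sketched in a few lines with the standard library: walk the syntax tree and report calls to `print` and `breakpoint`. This is only an illustration with a made-up helper name, not the implementation of any of the plugins above.

```python
import ast

DEBUG_CALLS = {"print", "breakpoint"}


def find_debug_calls(source):
    """Return (line number, function name) for leftover debugging calls."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id in DEBUG_CALLS
        ):
            hits.append((node.lineno, node.func.id))
    return sorted(hits)


EXAMPLE = '''
def total(values):
    print("values:", values)
    breakpoint()
    return sum(values)
'''

print(find_debug_calls(EXAMPLE))  # [(3, 'print'), (4, 'breakpoint')]
```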
Pylint
pylint is one of the most widespread linters in Python. The features of pylint certainly overlap with Flake8, but there is one feature I love: Checking for code duplication ❤
$ pylint --disable=all --enable=duplicate-code .
************* Module mpu.datastructures.trie.base
mpu/datastructures/trie/base.py:1:0: R0801: Similar lines in 2 files
==mpu.datastructures.trie.char_trie:85
==mpu.datastructures.trie.string_trie:138
            string += child.print(_indent=_indent + 1)
        return string

    def __str__(self):
        return f"TrieNode(value='{self._value}', nb_children='{len(self.children)}')"

    __repr__ = __str__


EMPTY_NODE = TrieNode(value="", is_word=False, count=0, freeze=True)


class Trie(AbstractTrie):
    def __init__(self, container=None):
        if container is None:
            container = [] (duplicate-code)
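To get an intuition for how duplicate detection can work, here is a deliberately naive sketch: it slides a window of consecutive lines over two sources and reports the windows both have in common. This is my own simplification and not pylint's actual algorithm. The function name and the sample sources are made up.

```python
def shared_blocks(source_a, source_b, min_lines=4):
    """Return blocks of `min_lines` consecutive lines found in both sources."""

    def windows(source):
        # Ignore indentation and blank lines so that layout does not matter.
        lines = [line.strip() for line in source.splitlines() if line.strip()]
        return {
            tuple(lines[i : i + min_lines])
            for i in range(len(lines) - min_lines + 1)
        }

    return windows(source_a) & windows(source_b)


# Hypothetical sources that share a three-line block
module_a = "x = 1\nif x:\n    y = 2\n    z = 3\nprint(a)\n"
module_b = "k = 0\nif x:\n    y = 2\n    z = 3\n"

print(shared_blocks(module_a, module_b, min_lines=3))
```

Here the only shared window is the `if x:` block with its two assignments.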
Let Dead Code Die
Who hasn’t done it: You removed a piece of functionality, but the code could come in handy later. So you comment it out. Or you add an if False block around it. Sometimes it gets more sophisticated, with a configuration option you don’t need.
The clean solution is to have a single, clear commit that removes that feature. Maybe add a git tag so that you can find it later if you want to add it again.
And then there is code which is dead, but you forgot about it. Luckily, you can automatically detect it:
flake8-eradicate: Find commented out (or so-called “dead”) code.
vulture: Finds unused code in Python programs
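As a rough idea of how unused code can be found statically, the sketch below collects all function definitions in a module and subtracts every name that is referenced anywhere. It is a simplification under my own assumptions (a single module, no dynamic lookups) and not how vulture is implemented. The helper name and example are hypothetical.

```python
import ast


def unused_functions(source):
    """Return names of functions that are defined but never referenced."""
    tree = ast.parse(source)
    defined = {
        node.name
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    }
    used = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Name):
            used.add(node.id)
        elif isinstance(node, ast.Attribute):
            used.add(node.attr)
    return sorted(defined - used)


EXAMPLE = '''
def helper():
    return 1


def forgotten():
    return 2


print(helper())
'''

print(unused_functions(EXAMPLE))  # ['forgotten']
```

A function that is only imported and used by another module would be a false positive here, which is why results from this kind of check should be reviewed before deleting anything.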
Flake8: Nudging Yourself to Use Good Style
Having an experienced developer review your code is awesome. In the best case, you will learn something new that you can apply in all further projects. And some plugins act like that.
Some plugins helped me to learn something about Python. For example, the following helped me to get rid of small bugs and inconsistencies:
flake8-comprehensions: Helps you write better list/set/dict comprehensions. I love this one
flake8-executable: Checks executable permissions and shebangs. Files should either be executable and have a shebang, or not be executable and not have a shebang.
flake8-raise: Finds improvements for raise statements
flake8-pytest: Use assert instead of assertEqual
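To give an impression of the comprehension rewrites, here is a small sketch. The “old” forms are the kind of construct a comprehension-focused plugin flags, as far as I know; the variable names are made up for illustration.

```python
numbers = [1, 2, 3, 4]

# Constructs that build a throwaway generator or list first ...
squares_old = list(n * n for n in numbers)
lookup_old = dict([(n, n * n) for n in numbers])
evens_old = set([n for n in numbers if n % 2 == 0])

# ... and the equivalent comprehensions
squares = [n * n for n in numbers]
lookup = {n: n * n for n in numbers}
evens = {n for n in numbers if n % 2 == 0}

print(squares == squares_old, lookup == lookup_old, evens == evens_old)
```

Both versions produce the same values; the comprehensions are shorter and avoid the intermediate object.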
The following new style nudging plugins aim to push you to use modern style Python:
flake8-pathlib: Pathlib was added in Python 3.4 and I’m still not quite used to it. This plugin might nudge me to use it when it’s appropriate.
flake8-string-format, flake8-printf-formatting, flake8-sts: String formatting.
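The following sketch contrasts the older style with the modern one these plugins nudge you towards: `os.path` and %-formatting on one side, `pathlib` and f-strings on the other. The directory and file names are hypothetical.

```python
import os.path
from pathlib import Path

# Hypothetical file location, used only for illustration
directory, name = "data", "report.csv"

# Older style: string-based path handling and %-formatting
old_path = os.path.join(directory, name)
old_message = "Reading %s (%s)" % (old_path, os.path.splitext(name)[1])

# Modern style: pathlib and f-strings
new_path = Path(directory) / name
new_message = f"Reading {new_path} ({new_path.suffix})"

print(old_message == new_message)  # True
```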
This is one of the most valuable categories for me. If you know more plugins which help to use new styles, let me know.
Flake8 Meta Plugins
Flake8 has some plugins which don’t add more linting functionality, but improve flake8 in another way:
flake8-colors: ANSI color highlighting for Flake8
flake8-csv: Generates error reports in CSV format
flake8-json: Generates error reports in JSON format
flake8-dashboard and flake8-html: Generate an HTML report (dashboard demo)
flake8-immediate: Prints the errors directly without any delay
flake8-strftime: Checks for use of platform-specific strftime codes
flake8-SQL and py-find-injection: Look for SQL queries and check them against an opinionated style
flake8-tuple: Checks for (probably) unintended one-element tuples
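An unintended one-element tuple usually comes from a stray trailing comma. A minimal illustration, with a made-up variable name:

```python
timeout = 30,  # trailing comma: this is the tuple (30,), not the int 30
print(type(timeout).__name__)  # tuple

timeout_fixed = 30
print(type(timeout_fixed).__name__)  # int
```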
There are also some plugins people might need for legal reasons, such as flake8-author, flake8-copyright, and flake8-license.
To Flake8 plugin authors: Please make sure that you list the error codes your plugin introduces and that you give at least some examples of what your plugin considers bad / good.
Type Annotations and Type Checking
Type checking is possible in Python, but you need to do it yourself. It’s not done automatically. I’ve written a longer article about how type annotations work in Python. There are multiple tools you can use, but I recommend mypy. You can run it via pytest by using pytest-mypy or via flake8 by using flake8-mypy, but I prefer to run it separately. The main reason is that the output given by CI pipelines is cleaner.
You can integrate type checking (e.g. via mypy) into your editor, but the type annotations alone already go a long way as they document what is expected.
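Here is a small sketch of what annotated code looks like. The function `count_words` is made up for illustration. The commented-out call at the end is the kind of mistake a type checker such as mypy is expected to report before the code ever runs.

```python
from typing import Dict, List


def count_words(lines: List[str]) -> Dict[str, int]:
    """Count how often each word occurs in the given lines."""
    counts: Dict[str, int] = {}
    for line in lines:
        for word in line.split():
            counts[word] = counts.get(word, 0) + 1
    return counts


print(count_words(["a b", "b c"]))  # {'a': 1, 'b': 2, 'c': 1}

# A type checker should reject the following call, because a str is passed
# where a List[str] is required. At runtime it would not fail; it would
# silently iterate over the single characters of the string instead:
# count_words("a b")
```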
Package Structure
Check that your package looks fine before shipping it.
pyroma rates how well a Python project complies with the best practices of the Python packaging ecosystem.
Here are some examples of my projects:
$ pyroma mpu
------------------------------
Checking mpu
Found mpu
------------------------------
Final rating: 10/10
Your cheese is so fresh most pe
$ pyroma nox
------------------------------
Checking nox
Found nox
------------------------------
Your long_description is not valid ReST:
:2: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent.
:3: (WARNING/2) Field list ends without a blank line; unexpected unindent.
:4: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent.
------------------------------
Final rating: 9/10
Cottage Cheese
------------------------------
Want to Know More About Unit Testing?
In this series, we already had:
Part 1: The basics of Unit Testing in Python
Part 2: Patching, Mocks, and Dependency Injection
Part 3: How to test Flask applications with Databases, Templates and Protected Pages
Part 4: tox and nox
Part 5: Structuring Unit Tests
Part 6: CI-Pipelines
Part 7: Property-based Testing
Part 8: Mutation Testing
Part 9: Static Code Analysis: Linters, Type Checking, and Code Complexity
Let me know if you’re interested in other topics around testing with Python or professional software development with Python: [email protected]
Translated from: https://towardsdatascience.com/static-code-analysis-for-python-bdce10b8d287