The Standard Library: stringprep

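The Python standard library documentation gives only a brief introduction to this module, so today I (严小日) am pasting it here together with my own restatement of it (written from my understanding of the meaning rather than word for word, so it may differ from the original in places):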
When identifying things (such as host names) in the internet, it is often necessary to compare such identifications for “equality”. Exactly how this comparison is executed may depend on the application domain, e.g. whether it should be case-insensitive or not. It may be also necessary to restrict the possible identifications, to allow only identifications consisting of “printable” characters.
RFC 3454 defines a procedure for “preparing” Unicode strings in internet protocols. Before passing strings onto the wire, they are processed with the preparation procedure, after which they have a certain normalized form. The RFC defines a set of tables, which can be combined into profiles. Each profile must define which tables it uses, and what other optional parts of the stringprep procedure are part of the profile. One example of a stringprep profile is nameprep, which is used for internationalized domain names.
The module stringprep only exposes the tables from RFC 3454. As these tables would be very large to represent them as dictionaries or lists, the module uses the Unicode character database internally. The module source code itself was generated using the mkstringprep.py utility.
As a result, these tables are exposed as functions, not as data structures. There are two kinds of tables in the RFC: sets and mappings. For a set, stringprep provides the “characteristic function”, i.e. a function that returns true if the parameter is part of the set. For mappings, it provides the mapping function: given the key, it returns the associated value. Below is a list of all functions available in the module.

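When you identify things on the internet, such as host names, you often need to decide whether two identifiers are "equal". How that comparison is done depends on the application, for example whether it is case-sensitive. It may also be necessary to restrict which identifiers are allowed, for instance to those made up only of printable characters.
RFC 3454 therefore defines a procedure for preparing Unicode strings in internet protocols: before a string is sent over the wire, it is run through the preparation procedure, after which it has a normalized form.
The RFC defines a set of tables that can be combined into profiles. Each profile states which tables it uses and which optional parts of the stringprep procedure it includes. One example is nameprep, which is used for internationalized domain names.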
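To get a feel for how a profile combines tables, here is a rough sketch. Note that toy_prepare is a hypothetical helper of my own, not part of stringprep and not a real nameprep implementation; the choice of tables is an assumption made for illustration, and it uses the current Unicode data for NFKC rather than the Unicode 3.2 data the RFC is based on.

```python
import stringprep
import unicodedata


def toy_prepare(label: str) -> str:
    """Hypothetical, simplified profile: map, normalize, then prohibit."""
    # Step 1 (map): drop characters in table B.1 ("mapped to nothing")
    # and case-fold everything else with table B.2.
    mapped = "".join(
        stringprep.map_table_b2(ch)
        for ch in label
        if not stringprep.in_table_b1(ch)
    )
    # Step 2 (normalize): apply NFKC. Simplification: this uses the
    # interpreter's current Unicode database, not Unicode 3.2.
    normalized = unicodedata.normalize("NFKC", mapped)
    # Step 3 (prohibit): reject non-ASCII spaces (C.1.2) and control
    # characters (C.2.1 and C.2.2). A real profile checks more tables.
    for ch in normalized:
        if stringprep.in_table_c12(ch) or stringprep.in_table_c21_c22(ch):
            raise ValueError(f"prohibited character {ch!r}")
    return normalized


print(toy_prepare("Exa\u00admple.COM"))  # example.com
print(toy_prepare("Stra\u00dfe"))        # strasse
```

The point is only that a profile is a recipe: which mapping tables to apply, whether to normalize, and which set tables to treat as prohibited. For real internationalized domain names, use an existing IDNA implementation instead of a sketch like this.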
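The stringprep module only exposes the tables from RFC 3454. Because the tables would be very large if written out as dictionaries or lists, the module uses the Unicode character database internally. The module's source code was itself generated with mkstringprep.py.
As a result, the tables are exposed as functions rather than as data structures. The RFC has two kinds of tables: sets and mappings. For a set, the module provides a "characteristic function" that tells you whether the argument is in the set, returning True if it is.
For a mapping, it provides a mapping function: given a key, it returns the associated value.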
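To make the set/mapping distinction concrete, here is a small sketch that calls a few of the module's functions. The particular tables (C.1.1, B.1, D.1, B.2) are just my picks for illustration; each function takes a single character.

```python
import stringprep

# Set tables: the characteristic function answers
# "is this character in the table?"
print(stringprep.in_table_c11(" "))       # True: C.1.1 is the ASCII space
print(stringprep.in_table_c11("a"))       # False
print(stringprep.in_table_b1("\u00ad"))   # True: soft hyphen, "mapped to nothing"
print(stringprep.in_table_d1("\u05d0"))   # True: Hebrew alef is right-to-left

# Mapping tables: given a key (a character), return the mapped value.
print(stringprep.map_table_b2("A"))       # a
print(stringprep.map_table_b2("\u00df"))  # ss
```

Notice that a mapping can return more than one character (ß becomes ss), which is one reason the result is a string and not a single code point.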

PS: this module is fairly obscure and rarely used, but it is a good excuse to learn a bit about RFCs along the way.
