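Following on from the earlier note "Python Source Notes: Memory Management", this one looks at how Python objects are created and destroyed. Python has quite a few object types; I tried to list them all in a table in "Python Source Notes: Data Types", but never finished it.
Rather than take on too much, I'll look at four built-in objects. Creating an object means creating an instance of one of the following structs: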
| Object | Struct | General C API? | tp_new in the type object |
|---|---|---|---|
| Integer | PyLongObject | _PyLong_New() | long_new |
| String | PyUnicodeObject | _PyUnicode_New() | unicode_new |
| List | PyListObject | PyList_New | PyType_GenericNew |
| Dict | PyDictObject | PyDict_New() | dict_new |
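There are so many layers of wrapping that it gets confusing. Put simply, isn't it just a matter of malloc-ing a block of memory of the right size, pointing a PyXxxxObject pointer at it, and assigning the members?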
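How the memory can be allocated:
Thin wrappers: PyMem_{Malloc, Realloc, Free}
Higher-level memory interface (memory pool): PyObject_{Malloc, Free} and friends
Type-aware allocators that know the struct size: PyObject_{New, NewVar} and friends
Allocators that also carry garbage-collection info: _PyObject_GC_{Malloc, New, NewVar, Del} and friends
Fortunately, we don't need to call these ourselves when creating built-in types, because the types provide: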
_PyLong_New()
_PyUnicode_New()
PyList_New
PyDict_New()
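What does the leading underscore mean? Presumably these are functions we are not supposed to use; _PyUnicode_New() in particular is a plain static function. So creating an integer or a string has to go through something else, namely: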
    PyLong_FromLong(long ival)
    PyLong_FromUnsignedLong(unsigned long ival)
    PyLong_FromDouble(double dval)
    PyLong_FromVoidPtr(void *p)
    PyLong_FromLongLong(PY_LONG_LONG ival)
    ...
    PyObject *PyUnicode_FromString(const char *u)
    PyObject *PyUnicode_FromUnicode(const Py_UNICODE *u, Py_ssize_t size)
    ...
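These public constructors can be tried out without writing an extension module. The following is only a sketch, assuming a standard CPython build in which `ctypes.pythonapi` exposes the interpreter's C API; the helper name `call_c_api` is made up for this illustration and is not part of any library.

```python
import ctypes


def call_c_api(name, argtypes, *args):
    """Call a CPython C API function that returns a new PyObject* reference.

    `call_c_api` is a hypothetical helper written for this note.
    """
    func = getattr(ctypes.pythonapi, name)
    func.argtypes = argtypes
    # py_object tells ctypes to hand the result back as an ordinary Python object
    func.restype = ctypes.py_object
    return func(*args)


an_int = call_c_api("PyLong_FromLong", [ctypes.c_long], 42)
a_str = call_c_api("PyUnicode_FromString", [ctypes.c_char_p], b"hello")
a_list = call_c_api("PyList_New", [ctypes.c_ssize_t], 0)  # size 0: no unset slots
a_dict = call_c_api("PyDict_New", [])

print(type(an_int), type(a_str), type(a_list), type(a_dict))
```

PyList_New is called with size 0 on purpose: with a larger size the item slots start out as NULL and must be filled before the list is used.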
Besides the approach above, Python also offers a "unified" creation interface, through the function pointers stored in each PyXxxx_Type (fixme).
| Slot | PyLong_Type | PyUnicode_Type | PyList_Type | PyDict_Type |
|---|---|---|---|---|
| tp_base | 0 | &PyBaseObject_Type | 0 | 0 |
| tp_init | 0 | 0 | list_init | dict_init |
| tp_alloc | 0 | 0 | PyType_GenericAlloc | PyType_GenericAlloc |
| tp_new | long_new | unicode_new | PyType_GenericNew | dict_new |
| tp_free | PyObject_Del | PyObject_Del | PyObject_GC_Del | PyObject_GC_Del |
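Part of this table can be checked from Python itself, because tp_new surfaces as `__new__` and tp_init as `__init__`. A small sketch, assuming CPython 3 behaviour: a type whose tp_init is 0 has no `__init__` of its own, and a tp_base left at 0 is filled in with `object` when the type is readied.

```python
# tp_init is 0 for int and str, so neither defines its own __init__ ...
int_has_init = "__init__" in int.__dict__
str_has_init = "__init__" in str.__dict__
# ... while list and dict have list_init / dict_init.
list_has_init = "__init__" in list.__dict__
dict_has_init = "__init__" in dict.__dict__

# int does all of its work in tp_new (long_new).
n = int.__new__(int, "42")

# list's tp_new only allocates an empty object; tp_init (list_init) fills it.
filled = list.__new__(list)
was_empty = (len(filled) == 0)
list.__init__(filled, "abc")

# All four types end up with object as their base.
bases = [t.__base__ for t in (int, str, list, dict)]

print(int_has_init, str_has_init, list_has_init, dict_has_init)
print(n, was_empty, filled, bases)
```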
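This is complicated. Taking it slowly, let's try to read through each one, starting from tp_new every time:
Integer (long_new): this should be the simplest one. It mostly ends up calling the PyLong_From*** functions.
String (unicode_new): it mostly calls _PyUnicode_New and PyObject_Str(x).
The only difference is that this type sets tp_base, and I don't know what that is for.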
List: PyType_GenericNew
The source of this function:
    PyObject *
    PyType_GenericNew(PyTypeObject *type, PyObject *args, PyObject *kwds)
    {
        return type->tp_alloc(type, 0);
    }
That looks simple: it calls tp_alloc directly, which for list is PyType_GenericAlloc.
Its source:
    PyObject *
    PyType_GenericAlloc(PyTypeObject *type, Py_ssize_t nitems)
    {
        PyObject *obj;
        const size_t size = _PyObject_VAR_SIZE(type, nitems+1);

        if (PyType_IS_GC(type))
            obj = _PyObject_GC_Malloc(size);
        else
            obj = (PyObject *)PyObject_MALLOC(size);
        ...
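Depending on the type, it calls either _PyObject_GC_Malloc or PyObject_MALLOC; for list it is the former, since list is a GC type.
Dict (dict_new): similar to PyDict_New(), although I still don't see what tp_alloc is for.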
Every object in Python carries a reference count. When the count drops to zero, the object is deleted.
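| Macro | Purpose |
|---|---|
| Py_REFCNT(ob), Py_TYPE(ob), Py_SIZE(ob) | Three simple macros, one for each of three struct members |
| _Py_NewReference(op) | Sets the reference count to 1 |
| Py_INCREF(op), Py_XINCREF(op), Py_DECREF(op), Py_XDECREF(op) | Increment or decrement the reference count; the X variants first check whether op is NULL |
| _Py_Dealloc(op) | Called when the reference count drops to zero; it in turn calls tp_dealloc from the object's type |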
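The effect of these macros can be watched from Python with `sys.getrefcount`. This is a sketch that assumes CPython's reference-counting behaviour; the `Probe` class is hypothetical and exists only to record when it is destroyed. Absolute counts vary between versions, so only differences are compared.

```python
import sys


class Probe:
    """Hypothetical helper: records its own destruction."""
    destroyed = []

    def __init__(self, name):
        self.name = name

    def __del__(self):
        Probe.destroyed.append(self.name)


obj = Probe("p")
base = sys.getrefcount(obj)            # includes the call's own temporary reference
holder = [obj]                         # the list stores a reference: Py_INCREF
grew_by = sys.getrefcount(obj) - base
del holder                             # the list is freed, its item is Py_DECREF'ed
back_to_base = (sys.getrefcount(obj) == base)
del obj                                # last reference gone: deallocation, __del__ runs

print(grew_by, back_to_base, Probe.destroyed)
```

On CPython the object is destroyed immediately at the final `del`, without waiting for any garbage-collector pass.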
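Every type provides its own tp_dealloc, for example unicode_dealloc, long_dealloc, list_dealloc and dict_dealloc.
It is not yet clear to me how these relate to tp_free and how the two are wired together. (So much still to learn.)
There should be two more questions related to deletion: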
    static void
    long_dealloc(PyObject *v)
    {
        Py_TYPE(v)->tp_free(v);
    }
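Right, all it takes is making sure the reference count never reaches 0.
Circular references? Both list and dict can end up in reference cycles, so they are handled differently from integers: creation and destruction both go through the GC-aware functions, PyObject_GC_***.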
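Why containers need the GC-aware path can be seen with a self-referencing list. A minimal sketch, assuming CPython's cycle collector; `Node` and `cycle_demo` are names invented for this example, and `Node` subclasses list only because plain lists cannot be weakly referenced.

```python
import gc
import weakref


class Node(list):
    """Hypothetical list subclass, used so the object can be weakly referenced."""


def cycle_demo():
    """Return (alive after del, alive after gc.collect()) for a self-referencing list."""
    was_enabled = gc.isenabled()
    gc.disable()                  # keep the automatic collector out of the way
    try:
        node = Node()
        node.append(node)         # the list now holds a reference to itself
        probe = weakref.ref(node)
        del node                  # the self-reference keeps the count above zero
        alive_after_del = probe() is not None
        gc.collect()              # the cycle detector finds and frees it
        alive_after_gc = probe() is not None
        return alive_after_del, alive_after_gc
    finally:
        if was_enabled:
            gc.enable()


# Integers and strings cannot form cycles, so the collector does not track them.
tracked = (gc.is_tracked(1), gc.is_tracked("text"),
           gc.is_tracked([]), gc.is_tracked({"k": []}))
print(tracked)
print(cycle_demo())
```

Reference counting alone never frees the self-referencing list; only the collector pass does.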
These don't seem closely related, but I'll attach them anyway:
Definitions of the four structs PyLongObject, PyUnicodeObject, PyListObject and PyDictObject:
Integer object: PyLongObject

    struct _longobject {
        PyObject_VAR_HEAD
        digit ob_digit[1];
    };
    typedef struct _longobject PyLongObject;
String object: PyUnicodeObject

    typedef struct {
        PyObject_HEAD
        Py_ssize_t length;      /* Length of raw Unicode data in buffer */
        Py_UNICODE *str;        /* Raw Unicode buffer */
        Py_hash_t hash;         /* Hash value; -1 if not set */
        int state;              /* != 0 if interned. In this case the two
                                 * references from the dictionary to this object
                                 * are *not* counted in ob_refcnt. */
        PyObject *defenc;       /* (Default) Encoded version as Python
                                   string, or NULL; this is used for
                                   implementing the buffer protocol */
    } PyUnicodeObject;
List object: PyListObject

    typedef struct {
        PyObject_VAR_HEAD
        /* Vector of pointers to list elements.  list[0] is ob_item[0], etc. */
        PyObject **ob_item;

        /* ob_item contains space for 'allocated' elements.  The number
         * currently in use is ob_size.
         * Invariants:
         *     0 <= ob_size <= allocated
         *     len(list) == ob_size
         *     ob_item == NULL implies ob_size == allocated == 0
         * list.sort() temporarily sets allocated to -1 to detect mutations.
         *
         * Items must normally not be NULL, except during construction when
         * the list is not yet visible outside the function that builds it.
         */
        Py_ssize_t allocated;
    } PyListObject;
Dict object: PyDictObject

    typedef struct _dictobject PyDictObject;
    struct _dictobject {
        PyObject_HEAD
        Py_ssize_t ma_fill;  /* # Active + # Dummy */
        Py_ssize_t ma_used;  /* # Active */

        /* The table contains ma_mask + 1 slots, and that's a power of 2.
         * We store the mask instead of the size because the mask is more
         * frequently needed.
         */
        Py_ssize_t ma_mask;

        /* ma_table points to ma_smalltable for small tables, else to
         * additional malloc'ed memory.  ma_table is never NULL!  This rule
         * saves repeated runtime null-tests in the workhorse getitem and
         * setitem calls.
         */
        PyDictEntry *ma_table;
        PyDictEntry *(*ma_lookup)(PyDictObject *mp, PyObject *key, Py_hash_t hash);
        PyDictEntry ma_smalltable[PyDict_MINSIZE];
    };
A quick listing of the macros (only so I can look them up easily).
Three macros that read members of the PyObject/PyVarObject structs:
    #define Py_REFCNT(ob)           (((PyObject*)(ob))->ob_refcnt)
    #define Py_TYPE(ob)             (((PyObject*)(ob))->ob_type)
    #define Py_SIZE(ob)             (((PyVarObject*)(ob))->ob_size)

    #define _Py_NewReference(op) (Py_REFCNT(op) = 1)

    #define Py_INCREF(op) (((PyObject*)(op))->ob_refcnt++)

    #define Py_DECREF(op)                                   \
        do {                                                \
            if ( --((PyObject*)(op))->ob_refcnt != 0)       \
                ;                                           \
            else                                            \
                _Py_Dealloc((PyObject *)(op));              \
        } while (0)

    #define Py_CLEAR(op)                            \
        do {                                        \
            if (op) {                               \
                PyObject *_py_tmp = (PyObject *)(op);   \
                (op) = NULL;                        \
                Py_DECREF(_py_tmp);                 \
            }                                       \
        } while (0)

    #define Py_XINCREF(op) do { if ((op) == NULL) ; else Py_INCREF(op); } while (0)
    #define Py_XDECREF(op) do { if ((op) == NULL) ; else Py_DECREF(op); } while (0)

    #define _Py_Dealloc(op) ((*Py_TYPE(op)->tp_dealloc)((PyObject *)(op)))
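The control flow of these macros is easier to follow as a toy model. The sketch below is not CPython code: `ToyObject` and the `toy_*` functions are hypothetical names, and "freeing" is simulated by appending to a list. It only mirrors the logic of Py_INCREF, Py_DECREF, Py_XDECREF and _Py_Dealloc shown above.

```python
class ToyObject:
    """Toy stand-in for PyObject: a reference count plus a tp_dealloc hook."""

    def __init__(self, tp_dealloc):
        self.ob_refcnt = 1            # what _Py_NewReference sets up
        self.tp_dealloc = tp_dealloc


def toy_incref(op):
    op.ob_refcnt += 1


def toy_decref(op):
    op.ob_refcnt -= 1
    if op.ob_refcnt == 0:
        op.tp_dealloc(op)             # the _Py_Dealloc step


def toy_xdecref(op):
    if op is not None:                # the X variants only add this NULL check
        toy_decref(op)


freed = []


def toy_long_dealloc(op):
    # Like long_dealloc, which only forwards to tp_free; here we just record the event.
    freed.append(op)


obj = ToyObject(toy_long_dealloc)
toy_incref(obj)                       # count: 2
toy_decref(obj)                       # count: 1, nothing freed yet
alive_after_first_decref = (len(freed) == 0)
toy_xdecref(None)                     # safe no-op
toy_decref(obj)                       # count: 0, tp_dealloc is called

print(alive_after_first_decref, len(freed), obj.ob_refcnt)
```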