Python: studying the C source of the range class (argument count and length calculation)

range is something we use all the time when writing programs: in for loops, in list and dict comprehensions, and so on.

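range(start, stop, step) takes three parameters and produces a range object that behaves like a list of the following form:
[start, start+step, start+2*step, start+3*step, start+4*step, ......, start+k*step]
where start+k*step < stop (for a positive step).
It does not actually build a list. It is a lazy sequence object that computes each value on demand, as in the following for loop: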

for x in range(100000000):
    pass

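This does not build a list of 100000000 elements and then walk through it. The loop gets an iterator from the range object, which starts at 0 and adds 1 each time.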
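To make the "no list is built" point concrete, here is a small sketch. The variable names are mine, and the exact byte count is a CPython implementation detail, so the code only checks that it is small.

```python
import sys

big = range(100_000_000)

# The object stores only start, stop and step (plus a cached length),
# so it stays small no matter how many values it describes.
size_in_bytes = sys.getsizeof(big)

# Length and indexing are computed arithmetically, without iterating.
length = len(big)
item = big[12_345_678]

# Each call to iter() returns a fresh, independent iterator,
# which is why the same range can be looped over more than once.
it1 = iter(big)
it2 = iter(big)
first_values = (next(it1), next(it1), next(it2))
```

So the range itself is a reusable sequence, and the thing that "starts at 0 and adds 1" is the iterator the for loop obtains from it.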

Number of arguments accepted

However, range runs fine with one, two, or three arguments:

range(10)      # values: 0,1,2,3,4,5,6,7,8,9
range(0, 10)      # values: 0,1,2,3,4,5,6,7,8,9
range(0, 10, 2)      # values: 0,2,4,6,8

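Notice something odd? Anyone who has learned Python knows that positional arguments are bound by position, and start, stop, and step are the three parameters in that order. The last two calls are fine, but what about the first one? It has only one argument, yet it produces the same range as the two-argument call.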
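A quick check of that observation, as a sketch (the variable names are mine): the three spellings compare equal, and the attributes of the one-argument form show where the single value ended up.

```python
one = range(10)
two = range(0, 10)
three = range(0, 10, 1)

# range objects compare equal when they describe the same sequence.
all_equal = (one == two == three)

# The attributes show how the single argument was interpreted:
# it became stop, while start and step were filled in.
attrs = (one.start, one.stop, one.step)
```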

I went to look at the CPython source on GitHub. The code lives in a C file called rangeobject.c, under the Objects directory.
Here is the code of make_range_object():

static PyObject *
compute_range_length(PyObject *start, PyObject *stop, PyObject *step);

static rangeobject *
make_range_object(PyTypeObject *type, PyObject *start,
                  PyObject *stop, PyObject *step)
{
    rangeobject *obj = NULL;
    PyObject *length;
    length = compute_range_length(start, stop, step);
    if (length == NULL) {
        return NULL;
    }
    obj = PyObject_New(rangeobject, type);
    if (obj == NULL) {
        Py_DECREF(length);
        return NULL;
    }
    obj->start = start;
    obj->stop = stop;
    obj->step = step;
    obj->length = length;
    return obj;
}

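I just spent another half hour reading the source and it is wonderful. This C code is so well written! It is very clear, and there is a lot to learn from the experts here.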

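Here is what we see:
the function make_range_object(PyTypeObject *type, PyObject *start, PyObject *stop, PyObject *step) contains this line:
obj = PyObject_New(rangeobject, type); which allocates the new object that the function returns.
The very next function below it is range_new(PyTypeObject *type, PyObject *args, PyObject *kw).
This function, together with the comment above it, is the part we need to explain: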

/* XXX(nnorwitz): should we error check if the user passes any empty ranges?
   range(-10)
   range(0, -5)
   range(0, 5, -1)
*/
static PyObject *
range_new(PyTypeObject *type, PyObject *args, PyObject *kw)
{
    rangeobject *obj;
    PyObject *start = NULL, *stop = NULL, *step = NULL;

    if (!_PyArg_NoKeywords("range", kw))
        return NULL;

    if (PyTuple_Size(args) <= 1) {
        if (!PyArg_UnpackTuple(args, "range", 1, 1, &stop))
            return NULL;
        stop = PyNumber_Index(stop);
        if (!stop)
            return NULL;
        Py_INCREF(_PyLong_Zero);
        start = _PyLong_Zero;
        Py_INCREF(_PyLong_One);
        step = _PyLong_One;
    }
    else {
        if (!PyArg_UnpackTuple(args, "range", 2, 3,
                               &start, &stop, &step))
            return NULL;

        /* Convert borrowed refs to owned refs */
        start = PyNumber_Index(start);
        if (!start)
            return NULL;
        stop = PyNumber_Index(stop);
        if (!stop) {
            Py_DECREF(start);
            return NULL;
        }
        step = validate_step(step);    /* Caution, this can clear exceptions */
        if (!step) {
            Py_DECREF(start);
            Py_DECREF(stop);
            return NULL;
        }
    }

    obj = make_range_object(type, start, stop, step);
    if (obj != NULL)
        return (PyObject *) obj;

    /* Failed to create object, release attributes */
    Py_DECREF(start);
    Py_DECREF(stop);
    Py_DECREF(step);
    return NULL;
}

Really, really well written!

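The function first checks if (!_PyArg_NoKeywords("range", kw)), which verifies that no keyword arguments were passed ("range" is only the function name used in the error message). If any were passed, it returns NULL with a TypeError set; otherwise it continues.
Then it checks the number of arguments: if (PyTuple_Size(args) <= 1). This matches the ways we can call it: zero or one argument goes to the first branch, two or three go to the second.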
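These checks can be observed from the Python side. A minimal sketch, where error_of is a hypothetical helper of mine that just reports which exception a call raises:

```python
def error_of(call):
    """Run call() and return the name of the exception it raises, or None."""
    try:
        call()
    except Exception as exc:
        return type(exc).__name__
    return None

# Keyword arguments are rejected by the _PyArg_NoKeywords check.
keyword_error = error_of(lambda: range(stop=10))

# Zero arguments fall into the "<= 1" branch and fail to unpack.
no_args_error = error_of(lambda: range())

# Four arguments fall into the other branch, which allows at most 3.
too_many_error = error_of(lambda: range(1, 2, 3, 4))
```

All three calls fail with a TypeError, which is what a NULL return from range_new looks like in Python.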

At most one argument

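The line if (!PyArg_UnpackTuple(args, "range", 1, 1, &stop)) shows that the single argument is treated as stop. Because both the minimum and the maximum count are 1, a call with zero arguments fails here and the function returns NULL (which shows up as a TypeError in Python).
Look at the code that follows: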

        Py_INCREF(_PyLong_Zero);
        start = _PyLong_Zero;
        Py_INCREF(_PyLong_One);
        step = _PyLong_One;

This shows that when there is one argument, start is set to 0 and step is set to 1.

More than one argument

The line if (!PyArg_UnpackTuple(args, "range", 2, 3, &start, &stop, &step)) shows that this branch takes two or three arguments, which are assigned to start, stop, and step in that order. Look at this part:

        /* Convert borrowed refs to owned refs */
        start = PyNumber_Index(start);
        if (!start)
            return NULL;
        stop = PyNumber_Index(stop);
        if (!stop) {
            Py_DECREF(start);
            return NULL;
        }
        step = validate_step(step);    /* Caution, this can clear exceptions */
        if (!step) {
            Py_DECREF(start);
            Py_DECREF(stop);
            return NULL;
        }

/* Convert borrowed refs to owned refs */
/* Caution, this can clear exceptions */
See, the comments are already written for you.

Convert the borrowed references into owned ones; and caution, this can clear exceptions!

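The code is very clear. PyArg_UnpackTuple only hands out borrowed references, so start and stop each go through PyNumber_Index, which converts them to Python integers and returns new references that the function owns. step may not have been given at all, in which case it is still NULL when it reaches validate_step. The "exceptions" in the comment are Python exceptions (the error state), not special cases. If every value is valid, they become start, stop, and step; if any conversion fails, the references obtained so far are released and the function returns NULL.
I have not read validate_step in detail yet. Judging from how range behaves in Python, an omitted step becomes 1 and a step of 0 is rejected with a ValueError.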
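The visible effects of PyNumber_Index and validate_step can be probed from Python without reading more C. This is only a sketch of observed behavior: Three and error_of are hypothetical helpers of mine, not anything from CPython.

```python
class Three:
    """Hypothetical helper type: usable as the integer 3 via __index__."""
    def __index__(self):
        return 3

def error_of(call):
    """Run call() and return the name of the exception it raises, or None."""
    try:
        call()
    except Exception as exc:
        return type(exc).__name__
    return None

# Anything with __index__ is accepted and stored as a real int.
r = range(Three())
values = list(r)
stop_type = type(r.stop)

# A float has no __index__, so the conversion fails.
float_error = error_of(lambda: range(1.5))

# A missing step defaults to 1; an explicit step of 0 is rejected.
default_step = range(2, 5).step
zero_step_error = error_of(lambda: range(0, 10, 0))
```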

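Next comes the creation of the new range object:
obj = make_range_object(type, start, stop, step); takes four arguments. The first is the type of the Python object; the other three are always start, stop, and step, no matter how many arguments we passed in.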
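To summarize the branching, here is a rough Python rendering of the argument handling in range_new. It is a sketch under my own assumptions: normalize_range_args is a hypothetical name, operator.index stands in for PyNumber_Index, and reference counting has no Python equivalent, so it is left out.

```python
import operator

def normalize_range_args(*args):
    """Hypothetical Python rendering of the argument handling in range_new."""
    if len(args) <= 1:
        # Branch 1: the single argument is stop; start and step are filled in.
        if len(args) != 1:
            raise TypeError("range expected at least 1 argument, got 0")
        (stop,) = args
        return 0, operator.index(stop), 1

    # Branch 2: two or three arguments, bound by position.
    if len(args) > 3:
        raise TypeError(f"range expected at most 3 arguments, got {len(args)}")
    start = operator.index(args[0])
    stop = operator.index(args[1])
    if len(args) == 3:
        step = operator.index(args[2])
        if step == 0:
            raise ValueError("range() arg 3 must not be zero")
    else:
        step = 1  # step was omitted
    return start, stop, step
```

Whatever the caller passed, the result is always a full (start, stop, step) triple, which mirrors the three values handed to make_range_object.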

I came here to learn Python, and instead I am discovering how deep C goes......

That more or less settles the question of how many arguments range accepts.

Length calculation

This part had me stuck for a long time. I started reading it yesterday and I am still a bit confused. Let's look at the source code:

static PyObject*
compute_range_length(PyObject *start, PyObject *stop, PyObject *step)
{
    /* -------------------------------------------------------------
    Algorithm is equal to that of get_len_of_range(), but it operates
    on PyObjects (which are assumed to be PyLong objects).
    ---------------------------------------------------------------*/
    int cmp_result;
    PyObject *lo, *hi;
    PyObject *diff = NULL;
    PyObject *tmp1 = NULL, *tmp2 = NULL, *result;
                /* holds sub-expression evaluations */

    cmp_result = PyObject_RichCompareBool(step, _PyLong_Zero, Py_GT);
    if (cmp_result == -1)
        return NULL;

    if (cmp_result == 1) {
        lo = start;
        hi = stop;
        Py_INCREF(step);
    } else {
        lo = stop;
        hi = start;
        step = PyNumber_Negative(step);
        if (!step)
            return NULL;
    }

    /* if (lo >= hi), return length of 0. */
    cmp_result = PyObject_RichCompareBool(lo, hi, Py_GE);
    if (cmp_result != 0) {
        Py_DECREF(step);
        if (cmp_result < 0)
            return NULL;
        return PyLong_FromLong(0);
    }

    if ((tmp1 = PyNumber_Subtract(hi, lo)) == NULL)
        goto Fail;

    if ((diff = PyNumber_Subtract(tmp1, _PyLong_One)) == NULL)
        goto Fail;

    if ((tmp2 = PyNumber_FloorDivide(diff, step)) == NULL)
        goto Fail;

    if ((result = PyNumber_Add(tmp2, _PyLong_One)) == NULL)
        goto Fail;

    Py_DECREF(tmp2);
    Py_DECREF(diff);
    Py_DECREF(step);
    Py_DECREF(tmp1);
    return result;

  Fail:
    Py_DECREF(step);
    Py_XDECREF(tmp2);
    Py_XDECREF(diff);
    Py_XDECREF(tmp1);
    return NULL;
}

Hmm... this code is a bit long and I am not skilled enough yet, so I only paid attention to the comment... haha

    /* -------------------------------------------------------------
    Algorithm is equal to that of get_len_of_range(), but it operates
    on PyObjects (which are assumed to be PyLong objects).
    ---------------------------------------------------------------*/

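The comment says that this algorithm is the same as that of get_len_of_range(), except that it operates on PyLong objects (arbitrary-size Python integers) rather than on C longs.
My C teacher could probably follow this, but I can't yet, so I went off to study get_len_of_range() instead, scrolling down to line 834: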

/* Return number of items in range (lo, hi, step).  step != 0
 * required.  The result always fits in an unsigned long.
 */
static unsigned long
get_len_of_range(long lo, long hi, long step)
{
    /* -------------------------------------------------------------
    If step > 0 and lo >= hi, or step < 0 and lo <= hi, the range is empty.
    Else for step > 0, if n values are in the range, the last one is
    lo + (n-1)*step, which must be <= hi-1.  Rearranging,
    n <= (hi - lo - 1)/step + 1, so taking the floor of the RHS gives
    the proper value.  Since lo < hi in this case, hi-lo-1 >= 0, so
    the RHS is non-negative and so truncation is the same as the
    floor.  Letting M be the largest positive long, the worst case
    for the RHS numerator is hi=M, lo=-M-1, and then
    hi-lo-1 = M-(-M-1)-1 = 2*M.  Therefore unsigned long has enough
    precision to compute the RHS exactly.  The analysis for step < 0
    is similar.
    ---------------------------------------------------------------*/
    assert(step != 0);
    if (step > 0 && lo < hi)
        return 1UL + (hi - 1UL - lo) / step;
    else if (step < 0 && lo > hi)
        return 1UL + (lo - 1UL - hi) / (0UL - step);
    else
        return 0UL;
}

Haha, this comment....

The code uses lo (low) and hi (high) for start and stop, respectively.

If step > 0 and lo >= hi, or step < 0 and lo <= hi, the range is empty.
Else for step > 0, if n values are in the range, the last one is
lo + (n-1)*step, which must be <= hi-1. Rearranging,
n <= (hi - lo - 1)/step + 1, so taking the floor of the RHS gives
the proper value. Since lo < hi in this case, hi-lo-1 >= 0, so
the RHS is non-negative and so truncation is the same as the
floor. Letting M be the largest positive long, the worst case
for the RHS numerator is hi=M, lo=-M-1, and then
hi-lo-1 = M-(-M-1)-1 = 2*M. Therefore unsigned long has enough
precision to compute the RHS exactly. The analysis for step < 0
is similar.

If you can follow this English, there is no need to read any further; the post is done.

If you can't:

If step > 0 and lo >= hi, or step < 0 and lo <= hi, the range is empty.

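How should we read this? If start >= stop, a non-empty range would have to count downward, but the given step is positive, so that is impossible and the range is empty. The analysis for start <= stop with a negative step is similar (to borrow the author's phrase).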
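The empty cases are easy to try out. The sketch below uses the three examples from the XXX comment above range_new, plus range(5, 5) which I added as one more boundary case:

```python
# step > 0 with start >= stop, or step < 0 with start <= stop: nothing to yield.
empty_cases = [range(-10), range(0, -5), range(0, 5, -1), range(5, 5)]

lengths = [len(r) for r in empty_cases]
as_lists = [list(r) for r in empty_cases]
```

None of these raise an error; they are simply ranges of length 0.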

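Otherwise, for step > 0: if the range has n items, i.e. its length is n, then the last element is last_item = lo + (n-1)*step, and last_item <= hi-1. Rearranging the inequality gives n <= (hi - lo - 1)/step + 1, "so taking the floor of the RHS gives the proper value". RHS means the right-hand side of the inequality: n is the largest integer that does not exceed it, which is exactly its floor.

That makes it quite clear, and it makes a lot of sense!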
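A worked example of that inequality, as a sketch with numbers I picked myself (lo=0, hi=11, step=3), chosen so that the right-hand side is not a whole number:

```python
import math

lo, hi, step = 0, 11, 3

# Right-hand side of n <= (hi - lo - 1)/step + 1, as a real number.
rhs = (hi - lo - 1) / step + 1      # 10/3 + 1, a bit above 4

# n is the largest integer not exceeding the right-hand side.
n = math.floor(rhs)

# The same value with pure integer arithmetic, as the C code computes it.
n_int = (hi - lo - 1) // step + 1

# Check the two sides of the argument: n items fit, n + 1 items do not.
last_item = lo + (n - 1) * step     # last value actually produced
next_item = lo + n * step           # would be the (n+1)-th value
actual = list(range(lo, hi, step))
```

Here the range is 0, 3, 6, 9: the last item 9 is at most hi-1 = 10, and one more step would land on 12, which is too far.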

I can't follow the part after that yet (the argument about unsigned long)... still working on it.
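A tentative reading of that remaining part: it seems to be about overflow in C rather than about the formula itself. The sketch below replays the comment's worst case using Python integers, which never overflow. The 64-bit width is my assumption; the comment itself only speaks of "the largest positive long".

```python
BITS = 64                      # assumption: a 64-bit C long

M = 2 ** (BITS - 1) - 1        # largest positive long (LONG_MAX)
ULONG_MAX = 2 ** BITS - 1      # largest unsigned long

# Worst case from the comment: hi as large and lo as small as possible.
hi, lo = M, -M - 1
worst_numerator = hi - lo - 1  # M - (-M - 1) - 1 = 2*M

# 2*M does not fit in a signed long, but it does fit in an unsigned long.
fits_signed = worst_numerator <= M
fits_unsigned = worst_numerator <= ULONG_MAX

# Even the longest possible range (step 1) has a length that still fits.
longest = worst_numerator // 1 + 1
```

This would explain why the C function mixes 1UL into its arithmetic: it forces the subtraction to be carried out in unsigned long.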

The analysis for step < 0 is similar. In other words, the step < 0 case can be shown in the same way.

Finally the author gives the algorithm below, which splits on whether step is positive or negative:

    assert(step != 0);
    if (step > 0 && lo < hi)
        return 1UL + (hi - 1UL - lo) / step;
    else if (step < 0 && lo > hi)
        return 1UL + (lo - 1UL - hi) / (0UL - step);
    else
        return 0UL;

Assert that step is not zero.

step > 0 && lo < hi

1 + (hi - 1 - lo) / step

step < 0 && lo > hi

1 + (lo - 1 - hi) / (-step)

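I can follow the first case, though I can't explain it clearly yet. I don't understand the second one yet... I am asking some experts and will wait for their replies!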
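Short of a proof, one way to gain confidence in both branches is to port the function to Python and compare it with len(range(...)) over many small inputs. This is a sketch: the port reuses the C function's name for readability, and since Python integers do not overflow, the unsigned tricks are dropped.

```python
def get_len_of_range(lo, hi, step):
    """Python port of the C function of the same name."""
    assert step != 0
    if step > 0 and lo < hi:
        # Counting up: floor((hi - 1 - lo) / step) further steps after lo.
        return 1 + (hi - 1 - lo) // step
    elif step < 0 and lo > hi:
        # Counting down: the mirror image, with lo and hi swapped
        # and the step made positive.
        return 1 + (lo - 1 - hi) // (-step)
    else:
        return 0

# Brute-force comparison against the built-in on a grid of small values.
mismatches = [
    (lo, hi, step)
    for lo in range(-6, 7)
    for hi in range(-6, 7)
    for step in (-4, -3, -2, -1, 1, 2, 3, 4)
    if get_len_of_range(lo, hi, step) != len(range(lo, hi, step))
]

# One downward example: 10, 7, 4, 1.
down_len = get_len_of_range(10, 0, -3)
```

The second branch appears to be the first one seen in a mirror: walking down from lo to hi in steps of size -step covers the same distance as walking up from hi to lo.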

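Today I originally wanted to define a simple range class of my own, but after reading the source I feel it is just about perfect, and extremely rigorous...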

The length calculation alone was enough to stump me... I'll leave it here for now!

I spent the whole day on C today... and did some English translation as well.