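Query Builder

What is the query builder

The query builder is an abstraction over SQL statements. It is a set of prepackaged methods: you pass in parameters, its internal logic turns them into an SQL statement, and that statement is then sent to the database.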

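Why the query builder matters

The point of the query builder is that it lets you read, insert, and update data with less code, and that code is easier to maintain. It also protects against SQL injection to a certain extent.

Let's use the query builder to write a piece of code that fetches some data: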

$query = $this->db
          ->select('user_id,user_phone')
          ->from('users')
          ->where(['user_id > ' => 1, 'is_lock' => 0])
          ->limit(10, 0)
          ->order_by('user_id desc')
          ->get();

Now let's see how the query builder parses this query.

select()

The arguments passed to select() are the fields to query, so the natural question is how those fields are processed: are they escaped, for example?

public function select($select = '*', $escape = NULL)
    {
        // Turn the field list into an array
        if (is_string($select))
        {
            $select = explode(',', $select);
        }

        // The second parameter controls whether the field names are escaped; if it is not given, the default setting is used
        is_bool($escape) OR $escape = $this->_protect_identifiers;

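        // Store each field in a dedicated array, and record its escape setting and, if caching is on, a cached copy.
        // The cached copy is used by the query builder caching and cache-reset logic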
        foreach ($select as $val)
        {
            $val = trim($val);

            if ($val !== '')
            {
                $this->qb_select[] = $val;
                $this->qb_no_escape[] = $escape;

                if ($this->qb_caching === TRUE)
                {
                    $this->qb_cache_select[] = $val;
                    $this->qb_cache_exists[] = 'select';
                    $this->qb_cache_no_escape[] = $escape;
                }
            }
        }

        return $this;
    }

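Reading the source of select(), we find that the query fields are simply stored in a dedicated array. Does that feel familiar?

If you have studied template engines, you will notice that the handling here resembles template compilation: the query builder parameters are parsed in a way similar to how a template engine compiles a template into an abstract syntax tree. It is a clever approach. Concatenating SQL by hand while also keeping it safe would turn the code into a tangled mess.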
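
To make the "collect first, build the string later" idea concrete, here is a minimal sketch. TinySelectBuilder and its methods are hypothetical names invented for this illustration, not CodeIgniter code, and it skips escaping and caching entirely.

```php
<?php
// Hypothetical toy class, not CodeIgniter code: it only illustrates
// "store the parts in arrays first, build the SQL string at the end".
class TinySelectBuilder
{
    private $select = array();
    private $from = '';

    public function select($fields)
    {
        // Same idea as select() above: split on commas, trim, skip empty entries.
        foreach (explode(',', $fields) as $field) {
            $field = trim($field);
            if ($field !== '') {
                $this->select[] = $field;
            }
        }
        return $this; // returning $this is what makes chaining possible
    }

    public function from($table)
    {
        $this->from = trim($table);
        return $this;
    }

    public function compile()
    {
        // Nothing is concatenated until this point.
        $fields = count($this->select) === 0 ? '*' : implode(', ', $this->select);
        return 'SELECT '.$fields.' FROM '.$this->from;
    }
}

$sql = (new TinySelectBuilder())->select('user_id, user_phone')->from('users')->compile();
echo $sql, "\n"; // SELECT user_id, user_phone FROM users
```

Because every method only records its input, the calls can come in any order, and the final string is produced in one place.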

So we can make a bold guess: do the other query builder functions work in a similar way to select()?

where()

As we know, the query builder gives us four ways to pass query conditions:

Simple key => value: $this->db->where('name', $name);
key => value with an operator: $this->db->where('name !=', $name);
Associative array: $this->db->where(['name' => $name]);
Custom string: $this->db->where("name=$name");

So how does where() handle these four cases?

public function where($key, $value = NULL, $escape = NULL)
    {
        return $this->_wh('qb_where', $key, $value, 'AND ', $escape);
    }

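As you can see, the arguments of where() are again parsed into a dedicated array, so it is now fairly certain that the other methods also parse their arguments in this AST-like way.

Note: the last two styles, unlike the first two, take no second argument. All conditions related to where(), or_where(), and so on are processed in the _wh() method: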

protected function _wh($qb_key, $key, $value = NULL, $type = 'AND ', $escape = NULL)
    {

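        // $qb_key decides whether the cache key is the one for having or the one for where; that key is also used by the query builder caching logic.
        // Because WHERE and HAVING share almost the same syntax, you will find that having() calls _wh() as well!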
        $qb_cache_key = ($qb_key === 'qb_having') ? 'qb_cache_having' : 'qb_cache_where';

        // Turn all four calling styles into an associative array; for the last one (the custom string) the value is therefore NULL
        if ( ! is_array($key))
        {
            $key = array($key => $value);
        }

        // Decide whether the conditions should be escaped
        is_bool($escape) OR $escape = $this->_protect_identifiers;

        foreach ($key as $k => $v)
        {
            // $prefix is the keyword that joins conditions; it is derived from $type, i.e. keywords such as AND or OR
            $prefix = (count($this->$qb_key) === 0 && count($this->$qb_cache_key) === 0)
                ? $this->_group_get_type('')
                : $this->_group_get_type($type);

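            // First check whether the value is NULL. A non-NULL value means one of the key => value styles;
            // a $key without an operator gets an equals sign appended (it is a plain key => value condition)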
            if ($v !== NULL)
            {
                if ($escape === TRUE)
                {
                    $v = ' '.$this->escape($v);
                }

                if ( ! $this->_has_operator($k))
                {
                    $k .= ' = ';
                }
            }
            // If $k has no operator and the value is NULL, the call probably looked like $this->db->where('name'), with no value;
            // to avoid an SQL error, append IS NULL
            elseif ( ! $this->_has_operator($k))
            {
                // value appears not to have been set, assign the test to IS NULL
                $k .= ' IS NULL';
            }
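            // $k ends with an operator (=, !=, <>, IS, IS NOT) but the value is NULL: rewrite it as IS NULL or IS NOT NULL
            elseif (preg_match('/\s*(!?=|<>|IS(?:\s+NOT)?)\s*$/i', $k, $match, PREG_OFFSET_CAPTURE))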
            {
                $k = substr($k, 0, $match[0][1]).($match[1][0] === '=' ? ' IS NULL' : ' IS NOT NULL');
            }

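            // Finally join the condition together and push it onto qb_where (or qb_having). Compiling the conditions later only has to concatenate these entries; the condition is also cached when caching is on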
            $this->{$qb_key}[] = array('condition' => $prefix.$k.$v, 'escape' => $escape);
            if ($this->qb_caching === TRUE)
            {
                $this->{$qb_cache_key}[] = array('condition' => $prefix.$k.$v, 'escape' => $escape);
                $this->qb_cache_exists[] = substr($qb_key, 3);
            }

        }
        
        // Return the current object to support method chaining
        return $this;
    }
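
The branches of _wh() can be condensed into a small standalone sketch. sketch_condition() is a hypothetical helper written for this article; its operator check is a simplified stand-in for _has_operator(), and the quoting with addslashes() is only a placeholder for the driver's real escape().

```php
<?php
// Hypothetical helper, loosely modelled on the branches of _wh(). Not CodeIgniter code.
function sketch_condition($key, $value = null)
{
    $key = trim($key);
    // Simplified stand-in for _has_operator(): comparison symbols or IS / LIKE / IN.
    $has_operator = (bool) preg_match('/[<>!=]|\s(IS|LIKE|IN)\b/i', $key);

    if ($value !== null) {
        // Placeholder quoting only; real code must use the database driver's escaping.
        $quoted = is_string($value) ? "'".addslashes($value)."'" : (string) $value;
        return $has_operator ? $key.' '.$quoted : $key.' = '.$quoted;
    }

    if (!$has_operator) {
        return $key.' IS NULL'; // where('name') with no value
    }

    // An operator but no value: "name !=" becomes "name IS NOT NULL".
    if (preg_match('/\s*(!?=|<>|IS(?:\s+NOT)?)\s*$/i', $key, $match, PREG_OFFSET_CAPTURE)) {
        return substr($key, 0, $match[0][1]).($match[1][0] === '=' ? ' IS NULL' : ' IS NOT NULL');
    }

    return $key; // custom string such as "name='Joe'" passes through unchanged
}

echo sketch_condition('name', 'Joe'), "\n";  // name = 'Joe'
echo sketch_condition('user_id >', 1), "\n"; // user_id > 1
echo sketch_condition('name'), "\n";         // name IS NULL
echo sketch_condition('name !='), "\n";      // name IS NOT NULL
echo sketch_condition("name='Joe'"), "\n";   // name='Joe'
```

The last call shows why the custom-string style needs care: the string passes through as written, so any user input inside it has to be escaped by the caller.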

from()

public function from($from)
    {
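        /*
         * from() sets the table(s) to query. As you can see, it accepts several table names,
         * although that is not a common case.
         * _track_aliases() handles table aliases: it looks for an alias after the table name
         * (with or without the AS keyword) and stores it in the qb_aliased_tables array.
         *
         * You may wonder what happens to the previously selected fields once a table alias is set.
         * In a join, fields without a table qualifier can be ambiguous. The protect_identifiers()
         * call you see here is what deals with that situation, and it can also handle a table name
         * of the form host.dbname.table table_alias.
         *
         * Finally the parsed table name is written to the corresponding array and cached.
         * */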
        foreach ((array) $from as $val)
        {
            if (strpos($val, ',') !== FALSE)
            {
                foreach (explode(',', $val) as $v)
                {
                    $v = trim($v);
                    $this->_track_aliases($v);

                    $this->qb_from[] = $v = $this->protect_identifiers($v, TRUE, NULL, FALSE);

                    if ($this->qb_caching === TRUE)
                    {
                        $this->qb_cache_from[] = $v;
                        $this->qb_cache_exists[] = 'from';
                    }
                }
            }
            else
            {
                $val = trim($val);

                // Extract any aliases that might exist. We use this information
                // in the protect_identifiers to know whether to add a table prefix
                $this->_track_aliases($val);

                $this->qb_from[] = $val = $this->protect_identifiers($val, TRUE, NULL, FALSE);

                if ($this->qb_caching === TRUE)
                {
                    $this->qb_cache_from[] = $val;
                    $this->qb_cache_exists[] = 'from';
                }
            }
        }

        return $this;
    }
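
The alias handling described in the comment above can be sketched roughly as follows. Both functions are hypothetical and written only for this article; they assume an alias is whatever follows the last space once an optional AS keyword has been removed, and they ignore quoting and table prefixes.

```php
<?php
// Hypothetical sketch of alias extraction, not CodeIgniter code.
function sketch_extract_alias($table)
{
    $table = trim($table);
    if (strpos($table, ' ') === false) {
        return null; // no space means no alias
    }
    // Drop an optional AS keyword, then take the last word as the alias.
    $table = preg_replace('/\s+AS\s+/i', ' ', $table);
    return trim(strrchr($table, ' '));
}

// Collect the aliases from a comma-separated table list, skipping duplicates.
function sketch_collect_aliases($from)
{
    $aliases = array();
    foreach (explode(',', $from) as $table) {
        $alias = sketch_extract_alias($table);
        if ($alias !== null && !in_array($alias, $aliases, true)) {
            $aliases[] = $alias;
        }
    }
    return $aliases;
}

print_r(sketch_collect_aliases('users AS u, orders o, logs')); // contains 'u' and 'o'
```

Knowing the aliases up front is what lets later steps tell an alias such as u apart from a real table name when deciding whether to add a table prefix.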

limit()

limit() is very simple. Unlike where(), limit does not take multiple groups of parameters, so its two arguments are stored directly in two properties.

public function limit($value, $offset = 0)
{
    is_null($value) OR $this->qb_limit = (int) $value;
    empty($offset) OR $this->qb_offset = (int) $offset;
    return $this;
}
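
One detail is easy to miss: the first argument is the row count and the second is the offset. The sketch below shows how the two stored values could end up in the SQL. sketch_limit_clause() is a hypothetical function, and the MySQL-style "LIMIT offset, count" output is an assumption, since the final syntax depends on the database driver.

```php
<?php
// Hypothetical sketch, not CodeIgniter code. MySQL-style output is assumed.
function sketch_limit_clause($value, $offset = 0)
{
    // Same guards as limit(): NULL leaves the limit unset, an empty offset is ignored.
    $limit = is_null($value) ? null : (int) $value;
    $start = empty($offset) ? null : (int) $offset;

    // Mirrors the `if ($this->qb_limit)` check in _compile_select(): a limit of 0 adds nothing.
    if (!$limit) {
        return '';
    }

    return ' LIMIT '.($start ? $start.', ' : '').$limit;
}

echo sketch_limit_clause(10), "\n";     // " LIMIT 10"
echo sketch_limit_clause(10, 20), "\n"; // " LIMIT 20, 10"
var_dump(sketch_limit_clause(0, 10));   // string(0) ""
```

The last call shows that passing 0 as the first argument produces no LIMIT clause at all under these assumptions.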

order_by()

order_by() handles sorting. As we know, it accepts its arguments in two ways:

Simple key => value: $this->db->order_by('id', 'DESC');
String: $this->db->order_by('id DESC, ctime DESC');

Let's look at how order_by() handles these two forms:

public function order_by($orderby, $direction = '', $escape = NULL)
{
        // Upper-case the direction keyword: desc => DESC, asc => ASC
        $direction = strtoupper(trim($direction));

        // A direction of RANDOM means random ordering
        if ($direction === 'RANDOM')
        {
            $direction = '';

            // Do we have a seed value?
            $orderby = ctype_digit((string) $orderby)
                ? sprintf($this->_random_keyword[1], $orderby)
                : $this->_random_keyword[0];

        }
        elseif (empty($orderby))
        {
            return $this;
        }
        // Normalize the direction keyword: anything other than ASC or DESC is dropped. For random ordering $direction is already empty
        elseif ($direction !== '')
        {
            $direction = in_array($direction, array('ASC', 'DESC'), TRUE) ? ' '.$direction : '';
        }

        is_bool($escape) OR $escape = $this->_protect_identifiers;

        if ($escape === FALSE)
        {
            $qb_orderby[] = array('field' => $orderby, 'direction' => $direction, 'escape' => FALSE);
        }
        else
        {
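            /*
             * This is the core of parsing the order-by arguments. Whether the call used the key => value style
             * or the string style, the input is split into an array of fields.
             * If $direction is empty, a trailing ASC or DESC is extracted from each field string;
             * otherwise $direction is applied to every field.
             * */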

            $qb_orderby = array();
            foreach (explode(',', $orderby) as $field)
            {
                $qb_orderby[] = ($direction === '' && preg_match('/\s+(ASC|DESC)$/i', rtrim($field), $match, PREG_OFFSET_CAPTURE))
                    ? array('field' => ltrim(substr($field, 0, $match[0][1])), 'direction' => ' '.$match[1][0], 'escape' => TRUE)
                    : array('field' => trim($field), 'direction' => $direction, 'escape' => TRUE);
            }
        }
        
        // order_by() may be called several times, so qb_orderby may already hold entries; each call appends to the existing order-by array
        $this->qb_orderby = array_merge($this->qb_orderby, $qb_orderby);
        if ($this->qb_caching === TRUE)
        {
            $this->qb_cache_orderby = array_merge($this->qb_cache_orderby, $qb_orderby);
            $this->qb_cache_exists[] = 'orderby';
        }

        return $this;
}
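
The field-splitting part of order_by() can be tried in isolation. sketch_parse_order_by() below is a hypothetical function that reuses the same regular expression but leaves out escaping, RANDOM ordering, and caching.

```php
<?php
// Hypothetical sketch of the order_by() parsing step, not CodeIgniter code.
function sketch_parse_order_by($orderby, $direction = '')
{
    $direction = strtoupper(trim($direction));
    if ($direction !== '') {
        // Only ASC and DESC are accepted; anything else is dropped.
        $direction = in_array($direction, array('ASC', 'DESC'), true) ? ' '.$direction : '';
    }

    $parts = array();
    foreach (explode(',', $orderby) as $field) {
        if ($direction === '' && preg_match('/\s+(ASC|DESC)$/i', rtrim($field), $match, PREG_OFFSET_CAPTURE)) {
            // String style: the direction is read from the end of the field itself.
            $parts[] = ltrim(substr($field, 0, $match[0][1])).' '.strtoupper($match[1][0]);
        } else {
            // key => value style: the explicit direction applies to the field.
            $parts[] = trim($field).$direction;
        }
    }

    return 'ORDER BY '.implode(', ', $parts);
}

echo sketch_parse_order_by('id DESC, ctime desc'), "\n"; // ORDER BY id DESC, ctime DESC
echo sketch_parse_order_by('id', 'desc'), "\n";          // ORDER BY id DESC
echo sketch_parse_order_by('id', 'sideways'), "\n";      // ORDER BY id
```

The third call shows the whitelist at work: an unknown direction is silently discarded rather than passed into the SQL.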

get()

Once all query builder parameters have been parsed, get() only has to assemble them into SQL and fetch the query result.

public function get($table = '', $limit = NULL, $offset = NULL)
{
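        // If you did not set the table with from(), you can pass the table name to get() instead;
        // the alias is tracked first, and as you can see get() simply calls from() to set the table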
        if ($table !== '')
        {
            $this->_track_aliases($table);
            $this->from($table);
        }
        
        if ( ! empty($limit))
        {
            $this->limit($limit, $offset);
        }
        
        // As you can see, the result is ultimately obtained by building the SQL in _compile_select() and passing it to query()
        $result = $this->query($this->_compile_select());
        $this->_reset_select();
        return $result;
 }

The core of the query builder is the _compile_select() function. In it we can see how the query builder parameters are turned into SQL.

protected function _compile_select($select_override = FALSE)
    {
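        // Merge the cached query builder parts into the current ones. If query builder caching is on, the qb_xxx arrays below
        // also contain the qb_cache_xxx entries, because _merge_cache() merges qb_cache_xxx into qb_xxx so that cached parts take part in the query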
        $this->_merge_cache();

        // $select_override replaces the part before FROM
        if ($select_override !== FALSE)
        {
            $sql = $select_override;
        }
        else
        {
            
            $sql = ( ! $this->qb_distinct) ? 'SELECT ' : 'SELECT DISTINCT ';

            if (count($this->qb_select) === 0)
            {
                $sql .= '*';
            }
            else
            {
                /*
                 * This part joins the selected fields. Each field carries its own escape setting,
                 * and protect_identifiers() escapes the field and handles any table prefix or alias.
                 * */
                
                foreach ($this->qb_select as $key => $val)
                {
                    $no_escape = isset($this->qb_no_escape[$key]) ? $this->qb_no_escape[$key] : NULL;
                    $this->qb_select[$key] = $this->protect_identifiers($val, FALSE, $no_escape);
                }

                $sql .= implode(', ', $this->qb_select);
            }
        }

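        // Append the table names. from() accepts several tables, so _from_tables() is essentially just an implode of
        // the table names passed in; passing several tables to from() is rare in practice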
        if (count($this->qb_from) > 0)
        {
            $sql .= "\nFROM ".$this->_from_tables();
        }

        // Append the JOIN clauses
        if (count($this->qb_join) > 0)
        {
            $sql .= "\n".implode("\n", $this->qb_join);
        }
        
        // Append the WHERE conditions, GROUP BY, HAVING, and ORDER BY
        $sql .= $this->_compile_wh('qb_where')
            .$this->_compile_group_by()
            .$this->_compile_wh('qb_having')
            .$this->_compile_order_by(); // ORDER BY

        // LIMIT
        if ($this->qb_limit)
        {
            return $this->_limit($sql."\n");
        }
        
        // Finally return the assembled SQL
        return $sql;
    }
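
Stripped of escaping, caching, joins, grouping, and driver-specific LIMIT handling, the assembly step boils down to concatenating the stored parts in a fixed order. The function and array keys below are hypothetical and chosen only for this sketch.

```php
<?php
// Hypothetical sketch of the final assembly step, not CodeIgniter code.
// $parts holds what the builder methods collected; each where entry after
// the first already carries its "AND " / "OR " prefix, as in _wh().
function sketch_compile_select(array $parts)
{
    $sql = 'SELECT '.(empty($parts['select']) ? '*' : implode(', ', $parts['select']));

    if (!empty($parts['from'])) {
        $sql .= "\nFROM ".implode(', ', $parts['from']);
    }
    if (!empty($parts['where'])) {
        $sql .= "\nWHERE ".implode("\n", $parts['where']);
    }
    if (!empty($parts['order_by'])) {
        $sql .= "\nORDER BY ".implode(', ', $parts['order_by']);
    }
    if (!empty($parts['limit'])) {
        $sql .= "\nLIMIT ".(int) $parts['limit'];
    }

    return $sql;
}

$example_sql = sketch_compile_select(array(
    'select'   => array('user_id', 'user_phone'),
    'from'     => array('users'),
    'where'    => array('user_id > 1', 'AND is_lock = 0'),
    'order_by' => array('user_id DESC'),
    'limit'    => 10,
));
echo $example_sql, "\n";
```

The fixed order of the if blocks is what guarantees valid clause ordering no matter in which order the builder methods were called.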

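That wraps up the source analysis of the key query builder functions. In the next section we look at the source code related to transaction handling.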

CodeIgniter Source Code Analysis 7.2 - Database Driver: Query Builder