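In Django you constantly see calls such as model.objects.using(db).all(), model.objects.get(pk=xx), or model.objects.first(). Taking MySQL as the example backend, this post briefly walks through what these operations actually do behind the scenes. (A hint up front: keep an eye on the _fetch_all function.)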

First, the objects in model.objects is the model's manager, an instance of the Manager class. Manager inherits from BaseManager, and through BaseManager.from_queryset(QuerySet) the public methods of QuerySet are copied onto the manager. So model.objects.using(db).all() is really model.objects.get_queryset().using(db).all(), and model.objects.get(pk=xx) is equivalent to model.objects.get_queryset().get(pk=xx).

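So where does each model's manager come from? Look at the _prepare function, which is called at the end of ModelBase.__new__(). The last thing _prepare does is create the manager attribute. Its source is as follows: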

    def _prepare(cls):
        """
        Creates some methods once self._meta has been populated.
        """
        opts = cls._meta
        opts._prepare(cls)

        if opts.order_with_respect_to:
            cls.get_next_in_order = curry(cls._get_next_or_previous_in_order, is_next=True)
            cls.get_previous_in_order = curry(cls._get_next_or_previous_in_order, is_next=False)

            # Defer creating accessors on the foreign class until it has been
            # created and registered. If remote_field is None, we're ordering
            # with respect to a GenericForeignKey and don't know what the
            # foreign class is - we'll add those accessors later in
            # contribute_to_class().
            if opts.order_with_respect_to.remote_field:
                wrt = opts.order_with_respect_to
                remote = wrt.remote_field.model
                lazy_related_operation(make_foreign_order_accessors, cls, remote)

        # Give the class a docstring -- its definition.
        if cls.__doc__ is None:
            cls.__doc__ = "%s(%s)" % (cls.__name__, ", ".join(f.name for f in opts.fields))

        get_absolute_url_override = settings.ABSOLUTE_URL_OVERRIDES.get(opts.label_lower)
        if get_absolute_url_override:
            setattr(cls, 'get_absolute_url', get_absolute_url_override)

        if not opts.managers or cls._requires_legacy_default_manager():
            if any(f.name == 'objects' for f in opts.fields):
                raise ValueError(
                    "Model %s must specify a custom Manager, because it has a "
                    "field named 'objects'." % cls.__name__
                )
            manager = Manager()   # key line: the default manager is created here
            manager.auto_created = True
            cls.add_to_class('objects', manager)

        signals.class_prepared.send(sender=cls)

So every model class gets its own Manager instance.

As for the Manager class itself, django/db/models/manager.py shows:

class Manager(BaseManager.from_queryset(QuerySet)):
    pass

Manager inherits from whatever class BaseManager.from_queryset(QuerySet) returns. What does that class look like? The answer is in the same file, django/db/models/manager.py:

    @classmethod
    def from_queryset(cls, queryset_class, class_name=None):
        if class_name is None:
            class_name = '%sFrom%s' % (cls.__name__, queryset_class.__name__)
        class_dict = {
            '_queryset_class': queryset_class,
        }
        class_dict.update(cls._get_queryset_methods(queryset_class))  # key line
        return type(class_name, (cls,), class_dict)  # key line

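It returns a new class built with the type metaclass. The new class inherits from cls (here BaseManager), and the QuerySet methods are added to it as attributes, which is the job of cls._get_queryset_methods(queryset_class). The _get_queryset_methods function is also in django/db/models/manager.py (the excerpt shows only the inner create_method helper; the remainder of the function is omitted):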

    @classmethod
    def _get_queryset_methods(cls, queryset_class):
        def create_method(name, method):
            def manager_method(self, *args, **kwargs):
                return getattr(self.get_queryset(), name)(*args, **kwargs)
            manager_method.__name__ = method.__name__
            manager_method.__doc__ = method.__doc__
            return manager_method
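
The mechanism can be tried outside Django. Below is a minimal sketch, assuming a toy queryset and manager: ToyQuerySet, ToyBaseManager, ToyManager, filter_even, and the rows attribute are hypothetical names invented for illustration, and the method-copying loop is a simplified guess at the omitted part, not the library's exact code.

```python
import inspect


class ToyQuerySet:
    """Stand-in for QuerySet: holds rows and offers chainable methods."""

    def __init__(self, rows=None):
        self.rows = list(rows or [])

    def all(self):
        return ToyQuerySet(self.rows)

    def filter_even(self):
        return ToyQuerySet([r for r in self.rows if r % 2 == 0])

    def _private_helper(self):
        return "not copied"


class ToyBaseManager:
    _queryset_class = None
    rows = ()

    def get_queryset(self):
        return self._queryset_class(self.rows)

    @classmethod
    def _get_queryset_methods(cls, queryset_class):
        def create_method(name):
            def manager_method(self, *args, **kwargs):
                # Forward the call to a fresh queryset.
                return getattr(self.get_queryset(), name)(*args, **kwargs)
            manager_method.__name__ = name
            return manager_method

        new_methods = {}
        for name, _ in inspect.getmembers(queryset_class, inspect.isfunction):
            # Skip private methods and anything the manager already defines.
            if name.startswith("_") or hasattr(cls, name):
                continue
            new_methods[name] = create_method(name)
        return new_methods

    @classmethod
    def from_queryset(cls, queryset_class):
        class_dict = {"_queryset_class": queryset_class}
        class_dict.update(cls._get_queryset_methods(queryset_class))
        name = "%sFrom%s" % (cls.__name__, queryset_class.__name__)
        return type(name, (cls,), class_dict)


class ToyManager(ToyBaseManager.from_queryset(ToyQuerySet)):
    rows = (1, 2, 3, 4)


manager = ToyManager()
print(ToyManager.__mro__[1].__name__)      # name of the class built by type()
print(manager.filter_even().rows)          # method proxied to the queryset
print(hasattr(manager, "_private_helper"))
```

The point of the design is that the manager never stores query state itself: every proxied call starts from a fresh queryset returned by get_queryset().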

With that in place, let's take model.objects.all() as the example.
From the analysis above, all here is simply the all method of QuerySet:

    def all(self):
        """
        Returns a new QuerySet that is a copy of the current one. This allows a
        QuerySet to proxy for a model manager in some cases.
        """
        return self._clone()

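In Django we usually take the return value of all() and loop over it with for, handling each item in turn. So what does looping over self._clone() give us? First, here is what _clone returns: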

    def _clone(self, **kwargs):
        query = self.query.clone()
        if self._sticky_filter:
            query.filter_is_sticky = True
        clone = self.__class__(model=self.model, query=query, using=self._db, hints=self._hints)
        clone._for_write = self._for_write
        clone._prefetch_related_lookups = self._prefetch_related_lookups[:]
        clone._known_related_objects = self._known_related_objects
        clone._iterable_class = self._iterable_class
        clone._fields = self._fields

        clone.__dict__.update(kwargs)
        return clone

It is still a QuerySet instance. By Python's iteration protocol, a for loop over an object calls its __iter__ method, so go straight to QuerySet.__iter__:

    def __iter__(self):
        """
        The queryset iterator protocol uses three nested iterators in the
        default case:
            1. sql.compiler:execute_sql()
               - Returns 100 rows at time (constants.GET_ITERATOR_CHUNK_SIZE)
                 using cursor.fetchmany(). This part is responsible for
                 doing some column masking, and returning the rows in chunks.
            2. sql/compiler.results_iter()
               - Returns one row at time. At this point the rows are still just
                 tuples. In some cases the return values are converted to
                 Python values at this location.
            3. self.iterator()
               - Responsible for turning the rows into model objects.
        """
        self._fetch_all()  # key line
        return iter(self._result_cache)

Now look at _fetch_all():

    def _fetch_all(self):
        if self._result_cache is None:
            self._result_cache = list(self.iterator())  # key line
        if self._prefetch_related_lookups and not self._prefetch_done:
            self._prefetch_related_objects()
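
The effect of _result_cache is easy to reproduce in isolation. The following is a minimal sketch, assuming a toy class: LazyRows, fake_query, and the calls list are hypothetical names, and the "query" is just a Python function that records when it runs.

```python
class LazyRows:
    """Toy queryset: runs its "query" only when first iterated."""

    def __init__(self, fetch):
        self._fetch = fetch          # callable that returns an iterable of rows
        self._result_cache = None    # None means "not evaluated yet"

    def _fetch_all(self):
        if self._result_cache is None:
            self._result_cache = list(self._fetch())

    def __iter__(self):
        self._fetch_all()
        return iter(self._result_cache)

    def __len__(self):
        self._fetch_all()
        return len(self._result_cache)


calls = []


def fake_query():
    calls.append("hit")              # record every "database" round trip
    return [("alice",), ("bob",)]


rows = LazyRows(fake_query)
calls_before = list(calls)           # still empty: nothing has run yet
first_pass = [r for r in rows]       # triggers the single fetch
second_pass = list(rows)             # served from the cache
count = len(rows)                    # also served from the cache
print(calls_before, calls, count)
```

Creating the object costs nothing; the first iteration pays for the fetch, and every later iteration or len() call reuses the cached list.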

So we just need to look at self.iterator():

    def iterator(self):
        """
        An iterator over the results from applying this QuerySet to the
        database.
        """
        return iter(self._iterable_class(self))

QuerySet.__init__ sets self._iterable_class = ModelIterable, so look at the ModelIterable class:

class ModelIterable(BaseIterable):
    """
    Iterable that yields a model instance for each row.
    """

    def __iter__(self):
        queryset = self.queryset
        db = queryset.db
        compiler = queryset.query.get_compiler(using=db)  # key line
        # Execute the query. This will also fill compiler.select, klass_info,
        # and annotations.
        results = compiler.execute_sql()  # key line
        select, klass_info, annotation_col_map = (compiler.select, compiler.klass_info,
                                                  compiler.annotation_col_map)
        if klass_info is None:
            return
        model_cls = klass_info['model']  # the model class from models.py; see get_select (line 165) in django/db/models/sql/compiler.py
        select_fields = klass_info['select_fields']
        model_fields_start, model_fields_end = select_fields[0], select_fields[-1] + 1
        init_list = [f[0].target.attname
                     for f in select[model_fields_start:model_fields_end]]
        related_populators = get_related_populators(klass_info, select, db)
        for row in compiler.results_iter(results):
            obj = model_cls.from_db(db, init_list, row[model_fields_start:model_fields_end])  # key line
            if related_populators:
                for rel_populator in related_populators:
                    rel_populator.populate(row, obj)
            if annotation_col_map:
                for attr_name, col_pos in annotation_col_map.items():
                    setattr(obj, attr_name, row[col_pos])

            # Add the known related objects to the model, if there are any
            if queryset._known_related_objects:
                for field, rel_objs in queryset._known_related_objects.items():
                    # Avoid overwriting objects loaded e.g. by select_related
                    if hasattr(obj, field.get_cache_name()):
                        continue
                    pk = getattr(obj, field.get_attname())
                    try:
                        rel_obj = rel_objs[pk]
                    except KeyError:
                        pass  # may happen in qs1 | qs2 scenarios
                    else:
                        setattr(obj, field.name, rel_obj)

            yield obj

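So when we loop over the result of all(), what we finally get is obj, and obj is what to focus on. But so far we still don't know how Django decides which database to operate on, how it establishes the connection to that database, or where obj actually comes from.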

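For the MySQL side of Django, the line to focus on in the source of class ModelIterable(BaseIterable) is queryset.query.get_compiler(using=db), that is, the get_compiler method in django/db/models/sql/query.py. This is the entry point that determines which database the data is fetched from.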

    def get_compiler(self, using=None, connection=None):
        if using is None and connection is None:
            raise ValueError("Need either using or connection")
        if using:
            # connections is a module-level global; see connections = ConnectionHandler() in django/db/__init__.py.
            # So connections[using] triggers ConnectionHandler.__getitem__ (one of Python's special methods).
            connection = connections[using]  
        return connection.ops.compiler(self.compiler)(self, connection, using)

Now look at ConnectionHandler.__getitem__:

    def __getitem__(self, alias):
        if hasattr(self._connections, alias):
            return getattr(self._connections, alias)

        self.ensure_defaults(alias)
        self.prepare_test_settings(alias)
        # self.databases is settings.DATABASES from the settings module, a dict
        db = self.databases[alias]
        # Load the configured database backend engine
        backend = load_backend(db['ENGINE'])
        # See DatabaseWrapper in django/db/backends/mysql/base.py
        conn = backend.DatabaseWrapper(db, alias)
        setattr(self._connections, alias, conn)
        return conn
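
The create-once-then-reuse behaviour of this lookup can be sketched without Django. This is a minimal sketch under stated assumptions: ToyConnectionHandler and ToyWrapper are hypothetical stand-ins, the settings checks are left out, and storing the wrappers on a threading.local object is an assumption about how per-thread caching could be done.

```python
from threading import local


class ToyWrapper:
    """Stand-in for a backend's DatabaseWrapper."""

    def __init__(self, settings_dict, alias):
        self.settings_dict = settings_dict
        self.alias = alias


class ToyConnectionHandler:
    def __init__(self, databases):
        self.databases = databases
        self._connections = local()   # one set of wrappers per thread

    def __getitem__(self, alias):
        # Reuse the wrapper if this thread already created one.
        if hasattr(self._connections, alias):
            return getattr(self._connections, alias)
        conn = ToyWrapper(self.databases[alias], alias)
        setattr(self._connections, alias, conn)
        return conn


connections = ToyConnectionHandler({
    "default": {"NAME": "main_db"},
    "replica": {"NAME": "replica_db"},
})
a = connections["default"]
b = connections["default"]
c = connections["replica"]
print(a is b, c.settings_dict["NAME"])
```

Within one thread, the same alias always yields the same wrapper object, which is why the wrapper can later hold on to an open connection.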

So the connection variable in get_compiler is a DatabaseWrapper instance. Then what exactly is connection.ops.compiler(self.compiler)(self, connection, using)? First see what ops is on DatabaseWrapper.

class DatabaseWrapper(BaseDatabaseWrapper):
    vendor = 'mysql'
    # (middle omitted)
    Database = Database
    SchemaEditorClass = DatabaseSchemaEditor

    def __init__(self, *args, **kwargs):
        super(DatabaseWrapper, self).__init__(*args, **kwargs)

        self.features = DatabaseFeatures(self)
        self.ops = DatabaseOperations(self)  # key line
        self.client = DatabaseClient(self)
        self.creation = DatabaseCreation(self)
        self.introspection = DatabaseIntrospection(self)
        self.validation = DatabaseValidation(self)

connection.ops is a DatabaseOperations instance. For now, the only part of the DatabaseOperations source that matters is:

class DatabaseOperations(BaseDatabaseOperations):
    compiler_module = "django.db.backends.mysql.compiler"

Next, look at the compiler method. DatabaseOperations does not define compiler itself, but it inherits from BaseDatabaseOperations, so see BaseDatabaseOperations.compiler in django/db/backends/base/operations.py:

    def compiler(self, compiler_name):
        """
        Returns the SQLCompiler class corresponding to the given name,
        in the namespace corresponding to the `compiler_module` attribute
        on this backend.
        """
        if self._cache is None:
            self._cache = import_module(self.compiler_module)
        return getattr(self._cache, compiler_name)
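
The lookup is ordinary dynamic import plus getattr. Here is a minimal sketch, assuming the standard-library json module plays the role of the backend's compiler module; ToyOperations is a hypothetical class, and JSONDecoder merely stands in for a compiler class name.

```python
from importlib import import_module


class ToyOperations:
    # Hypothetical stand-in: "json" plays the role of
    # "django.db.backends.mysql.compiler".
    compiler_module = "json"

    def __init__(self):
        self._cache = None

    def compiler(self, compiler_name):
        # Import the module once, then look the class up by name.
        if self._cache is None:
            self._cache = import_module(self.compiler_module)
        return getattr(self._cache, compiler_name)


ops = ToyOperations()
decoder_cls = ops.compiler("JSONDecoder")   # a class, looked up by its name
decoder = decoder_cls()                     # instantiated in a second step
print(decoder_cls.__name__, decoder.decode("[1, 2]"))
```

Note the two steps: the lookup returns a class, and only the following call creates an instance, which mirrors the shape of connection.ops.compiler(self.compiler)(self, connection, using).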

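For MySQL, the DatabaseOperations source shows that self.compiler_module is "django.db.backends.mysql.compiler". So connection.ops.compiler(self.compiler) looks up self.compiler by name in the django/db/backends/mysql/compiler module and returns the SQLCompiler class or one of its subclasses (depending on the query type, that is, self.compiler; see django/db/models/sql/compiler.py). Calling that class with (self, connection, using) then creates the compiler instance. The execute_sql method of SQLCompiler calls cursor = self.connection.cursor():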

    def execute_sql(self, result_type=MULTI):
        """
        Run the query against the database and returns the result(s). The
        return value is a single data item if result_type is SINGLE, or an
        iterator over the results if the result_type is MULTI.

        result_type is either MULTI (use fetchmany() to retrieve all rows),
        SINGLE (only retrieve a single row), or None. In this last case, the
        cursor is returned if any query is executed, since it's used by
        subclasses such as InsertQuery). It's possible, however, that no query
        is needed, as the filters describe an empty set. In that case, None is
        returned, to avoid any unnecessary database interaction.
        """
        if not result_type:
            result_type = NO_RESULTS
        try:
            sql, params = self.as_sql()
            if not sql:
                raise EmptyResultSet
        except EmptyResultSet:
            if result_type == MULTI:
                return iter([])
            else:
                return

        cursor = self.connection.cursor()  # key line
        try:
            cursor.execute(sql, params)
        except Exception:
            cursor.close()
            raise

        if result_type == CURSOR:
            # Caller didn't specify a result_type, so just give them back the
            # cursor to process (and close).
            return cursor
        if result_type == SINGLE:
            try:
                val = cursor.fetchone()
                if val:
                    return val[0:self.col_count]
                return val
            finally:
                # done with the cursor
                cursor.close()
        if result_type == NO_RESULTS:
            cursor.close()
            return

        result = cursor_iter(
            cursor, self.connection.features.empty_fetchmany_value,
            self.col_count
        )
        if not self.connection.features.can_use_chunked_reads:
            try:
                # If we are using non-chunked reads, we return the same data
                # structure as normally, but ensure it is all read into memory
                # before going any further.
                return list(result)
            finally:
                # done with the cursor
                cursor.close()
        return result

SQLCompiler.__init__ sets self.connection = connection, and that connection is the DatabaseWrapper instance, so cursor = self.connection.cursor() is the cursor method of DatabaseWrapper:

    def cursor(self):
        """
        Creates a cursor, opening a connection if necessary.
        """
        self.validate_thread_sharing()
        if self.queries_logged:
            cursor = self.make_debug_cursor(self._cursor())
        else:
            cursor = self.make_cursor(self._cursor())
        return cursor

The part to focus on here is self._cursor(), the _cursor function in django/db/backends/base/base.py.

    def _cursor(self):
        self.ensure_connection()  # key line
        with self.wrap_database_errors:
            return self.create_cursor()

Focus on self.ensure_connection():

    def ensure_connection(self):
        """
        Guarantees that a connection to the database is established.
        """
        if self.connection is None:
            with self.wrap_database_errors:
                self.connect()
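
The same connect-on-first-use pattern can be tried with the standard-library sqlite3 module. This is only a sketch under stated assumptions: LazyWrapper and connect_calls are hypothetical names, and sqlite3 is swapped in for MySQLdb so the example runs without a MySQL server.

```python
import sqlite3


class LazyWrapper:
    """Toy wrapper that opens its connection only when a cursor is needed."""

    def __init__(self, settings_dict):
        self.settings_dict = settings_dict
        self.connection = None       # nothing is opened at construction time
        self.connect_calls = 0

    def connect(self):
        self.connect_calls += 1
        self.connection = sqlite3.connect(self.settings_dict["NAME"])

    def ensure_connection(self):
        if self.connection is None:
            self.connect()

    def cursor(self):
        self.ensure_connection()
        return self.connection.cursor()


wrapper = LazyWrapper({"NAME": ":memory:"})
before = wrapper.connect_calls                          # 0: not connected yet
one = wrapper.cursor().execute("SELECT 1").fetchone()   # opens the connection
two = wrapper.cursor().execute("SELECT 2").fetchone()   # reuses it
print(before, one, two, wrapper.connect_calls)
```

Asking for a second cursor does not reconnect; the wrapper keeps the connection it already has.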

self.connection is set to None in BaseDatabaseWrapper.__init__, so the first call runs self.connect():

    def connect(self):
        """Connects to the database. Assumes that the connection is closed."""
        # Check for invalid configurations.
        self.check_settings()
        # In case the previous connection was closed while in an atomic block
        self.in_atomic_block = False
        self.savepoint_ids = []
        self.needs_rollback = False
        # Reset parameters defining when to close the connection
        max_age = self.settings_dict['CONN_MAX_AGE']
        self.close_at = None if max_age is None else time.time() + max_age
        self.closed_in_transaction = False
        self.errors_occurred = False
        # Establish the connection
        conn_params = self.get_connection_params()
        self.connection = self.get_new_connection(conn_params)  # key line
        self.set_autocommit(self.settings_dict['AUTOCOMMIT'])
        self.init_connection_state()
        connection_created.send(sender=self.__class__, connection=self)

        self.run_on_commit = []

Focus on self.get_new_connection here. It is overridden in DatabaseWrapper in django/db/backends/mysql/base.py (line 146), so the code that runs is:

    def get_new_connection(self, conn_params):
        conn = Database.connect(**conn_params)
        conn.encoders[SafeText] = conn.encoders[six.text_type]
        conn.encoders[SafeBytes] = conn.encoders[bytes]
        return conn

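Database comes from import MySQLdb as Database. So in the end Django's ORM calls a third-party library: the connect function is the Connect function in MySQLdb/__init__.py. Which MySQL server does it end up connecting to? That depends on conn_params. In connect we saw conn_params = self.get_connection_params(), so look at get_connection_params: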

    def get_connection_params(self):
        kwargs = {
            'conv': django_conversions,
            'charset': 'utf8',
        }
        if six.PY2:
            kwargs['use_unicode'] = True
        settings_dict = self.settings_dict
        if settings_dict['USER']:
            kwargs['user'] = settings_dict['USER']
        if settings_dict['NAME']:
            kwargs['db'] = settings_dict['NAME']
        if settings_dict['PASSWORD']:
            kwargs['passwd'] = force_str(settings_dict['PASSWORD'])
        if settings_dict['HOST'].startswith('/'):
            kwargs['unix_socket'] = settings_dict['HOST']
        elif settings_dict['HOST']:
            kwargs['host'] = settings_dict['HOST']
        if settings_dict['PORT']:
            kwargs['port'] = int(settings_dict['PORT'])
        # We need the number of potentially affected rows after an
        # "UPDATE", not the number of changed rows.
        kwargs['client_flag'] = CLIENT.FOUND_ROWS
        kwargs.update(settings_dict['OPTIONS'])
        return kwargs
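
To see what this mapping produces for concrete settings, here is a simplified sketch. toy_connection_params is a hypothetical helper that copies only the key-renaming logic shown above; it leaves out conv, client_flag, and the Python 2 branch, and the host, port, and socket path values are made up for illustration.

```python
def toy_connection_params(settings_dict):
    """Simplified, hypothetical version of the settings-to-kwargs mapping."""
    kwargs = {"charset": "utf8"}
    if settings_dict.get("USER"):
        kwargs["user"] = settings_dict["USER"]
    if settings_dict.get("NAME"):
        kwargs["db"] = settings_dict["NAME"]
    if settings_dict.get("PASSWORD"):
        kwargs["passwd"] = settings_dict["PASSWORD"]
    host = settings_dict.get("HOST", "")
    if host.startswith("/"):
        kwargs["unix_socket"] = host      # a filesystem path means a socket
    elif host:
        kwargs["host"] = host
    if settings_dict.get("PORT"):
        kwargs["port"] = int(settings_dict["PORT"])
    # OPTIONS is applied last, so it can override the defaults above.
    kwargs.update(settings_dict.get("OPTIONS", {}))
    return kwargs


tcp = toy_connection_params({
    "NAME": "dbname", "USER": "dbuser", "PASSWORD": "dbpasswd",
    "HOST": "127.0.0.1", "PORT": "3306", "OPTIONS": {"charset": "utf8mb4"},
})
sock = toy_connection_params({"NAME": "dbname", "HOST": "/tmp/mysql.sock"})
print(tcp)
print(sock)
```

Two details stand out: a HOST beginning with a slash is treated as a Unix socket path, and anything in OPTIONS wins over the built-in defaults because it is merged last.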

Everything comes from settings_dict. Let's go back to ConnectionHandler.__getitem__:

    def __getitem__(self, alias):
        if hasattr(self._connections, alias):
            return getattr(self._connections, alias)

        self.ensure_defaults(alias)
        self.prepare_test_settings(alias)
        db = self.databases[alias]
        backend = load_backend(db['ENGINE'])
        conn = backend.DatabaseWrapper(db, alias)
        setattr(self._connections, alias, conn)
        return conn

Looking back, settings_dict = db = self.databases[alias] is the dictionary defined for alias in DATABASES in settings. The alias defaults to default, so it is something like this:

DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.mysql',
        'NAME':  'dbname',
        'USER': 'dbuser',
        'PASSWORD': 'dbpasswd',
        'HOST': 'dbhost',
        'PORT': 'dbport',
    }
}

So the cursor in cursor = self.connection.cursor(), called in SQLCompiler.execute_sql, is the return value of the cursor function in django/db/backends/base/base.py:

    def cursor(self):
        """
        Creates a cursor, opening a connection if necessary.
        """
        self.validate_thread_sharing()
        if self.queries_logged:
            cursor = self.make_debug_cursor(self._cursor())
        else:
            cursor = self.make_cursor(self._cursor())
        return cursor

So what exactly is cursor? make_cursor is only a thin wrapper around self._cursor(); the cursor that actually does the work is the return value of self._cursor().

    def _cursor(self):
        self.ensure_connection()
        with self.wrap_database_errors:
            return self.create_cursor()

So just look at create_cursor, which is the create_cursor method of DatabaseWrapper in django/db/backends/mysql/base.py.

    def create_cursor(self):
        cursor = self.connection.cursor()
        return CursorWrapper(cursor)

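From the analysis above, self.connection is the connection created in self.connect, that is, a MySQLdb connection, so the cursor is the one returned by that MySQLdb connection's cursor() method (wrapped in CursorWrapper). In other words, the connection and cursor are established against the MySQL database specified by DATABASES in settings.py.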

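For MySQL in Django, the flow is as follows. When SQLCompiler or one of its subclasses executes a SQL statement, it first obtains a cursor with cursor = self.connection.cursor(). Because self.connection is a DatabaseWrapper instance, this calls the cursor method of DatabaseWrapper. That method uses the database settings passed in (the DATABASES entry from settings) to open a new connection through the MySQLdb library if none exists yet, gets a cursor from it, and then uses that cursor to execute the SQL. The MySQL query path really starts at query.get_compiler, that is, queryset.query.get_compiler(using). Finally, back to obj.
In obj = model_cls.from_db(db, init_list, row[model_fields_start:model_fields_end]), focus on the from_db function: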

    def from_db(cls, db, field_names, values):
        if len(values) != len(cls._meta.concrete_fields):
            values = list(values)
            values.reverse()
            values = [values.pop() if f.attname in field_names else DEFERRED for f in cls._meta.concrete_fields]
        new = cls(*values)
        new._state.adding = False
        new._state.db = db
        return new
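
The only non-obvious part of from_db is the branch that pads missing columns. Below is a minimal sketch of that step on plain lists, assuming field names are simple strings; align_values is a hypothetical helper, and the DEFERRED object here is a toy sentinel standing in for the DEFERRED marker used in the source.

```python
DEFERRED = object()  # toy sentinel; stands in for the marker in the source


def align_values(concrete_fields, field_names, values):
    """Pad a partial row so it lines up with every concrete field."""
    if len(values) != len(concrete_fields):
        remaining = iter(values)
        values = [
            next(remaining) if name in field_names else DEFERRED
            for name in concrete_fields
        ]
    return list(values)


fields = ["id", "name", "bio"]
full = align_values(fields, ["id", "name", "bio"], (1, "ann", "hi"))
partial = align_values(fields, ["id", "bio"], (1, "hi"))
print(full)
print(partial[1] is DEFERRED)
```

When the row has one value per field it is passed through unchanged; when some columns were not loaded, their positions are filled with the sentinel so the values can still be passed positionally to the model constructor.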

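So after all that, it is just a model object. The values in the object come from values. Where does values come from? values = row[model_fields_start:model_fields_end]. And row? From for row in compiler.results_iter(results):, which ultimately comes from results, and results = compiler.execute_sql(). execute_sql runs the query against the connected database, so let's look at execute_sql once more: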

    def execute_sql(self, result_type=MULTI):
        """
        Run the query against the database and returns the result(s). The
        return value is a single data item if result_type is SINGLE, or an
        iterator over the results if the result_type is MULTI.

        result_type is either MULTI (use fetchmany() to retrieve all rows),
        SINGLE (only retrieve a single row), or None. In this last case, the
        cursor is returned if any query is executed, since it's used by
        subclasses such as InsertQuery). It's possible, however, that no query
        is needed, as the filters describe an empty set. In that case, None is
        returned, to avoid any unnecessary database interaction.
        """
        if not result_type:
            result_type = NO_RESULTS
        try:
            sql, params = self.as_sql()
            if not sql:
                raise EmptyResultSet
        except EmptyResultSet:
            if result_type == MULTI:
                return iter([])
            else:
                return
        # key line
        cursor = self.connection.cursor()
        try:
            cursor.execute(sql, params)
        except Exception:
            cursor.close()
            raise

        if result_type == CURSOR:
            # Caller didn't specify a result_type, so just give them back the
            # cursor to process (and close).
            return cursor
        if result_type == SINGLE:
            try:
                val = cursor.fetchone()
                if val:
                    return val[0:self.col_count]
                return val
            finally:
                # done with the cursor
                cursor.close()
        if result_type == NO_RESULTS:
            cursor.close()
            return

        # key lines
        result = cursor_iter(
            cursor, self.connection.features.empty_fetchmany_value,
            self.col_count
        )
        if not self.connection.features.can_use_chunked_reads:
            try:
                # If we are using non-chunked reads, we return the same data
                # structure as normally, but ensure it is all read into memory
                # before going any further.
                return list(result)
            finally:
                # done with the cursor
                cursor.close()
        return result

Here we only need to focus on this part:

        result = cursor_iter(
            cursor, self.connection.features.empty_fetchmany_value,
            self.col_count
        )

So what does cursor_iter do?

def cursor_iter(cursor, sentinel, col_count):
    """
    Yields blocks of rows from a cursor and ensures the cursor is closed when
    done.
    """
    try:
        for rows in iter((lambda: cursor.fetchmany(GET_ITERATOR_CHUNK_SIZE)),
                         sentinel):
            yield [r[0:col_count] for r in rows]
    finally:
        cursor.close()
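
The two-argument form iter(callable, sentinel) used here keeps calling the callable until it returns the sentinel. A minimal sketch of the same idiom follows, assuming the standard-library sqlite3 module in place of MySQLdb; chunked_rows, the person table, and CHUNK_SIZE = 2 are hypothetical choices made so the chunking is visible with three rows.

```python
import sqlite3

CHUNK_SIZE = 2  # hypothetical small value so the chunks are easy to see


def chunked_rows(cursor, sentinel, col_count):
    """Yield lists of rows until fetchmany() returns the sentinel."""
    try:
        for rows in iter(lambda: cursor.fetchmany(CHUNK_SIZE), sentinel):
            # Keep only the first col_count columns of each row.
            yield [r[0:col_count] for r in rows]
    finally:
        cursor.close()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (id INTEGER, name TEXT, note TEXT)")
conn.executemany(
    "INSERT INTO person VALUES (?, ?, ?)",
    [(1, "a", "x"), (2, "b", "y"), (3, "c", "z")],
)
cursor = conn.execute("SELECT id, name, note FROM person ORDER BY id")
# sqlite3 returns an empty list when the result set is exhausted.
chunks = list(chunked_rows(cursor, [], col_count=2))
print(chunks)
```

The sentinel is passed in rather than hard-coded because different drivers signal exhaustion with different empty values, which is what features.empty_fetchmany_value supplies in the source.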

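If this still isn't clear, try calling cursor.fetchmany yourself with the MySQLdb library. cursor.fetchmany returns the rows fetched from MySQL by the query. So in the end, Django simply wraps the data fetched from MySQL into the corresponding model objects.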

Above we only analyzed the common model.objects.all() case. What about model.objects.first(), model.objects.get(), and the rest?
Start with model.objects.first(), that is, the first method:

    def first(self):
        """
        Returns the first object of a query, returns None if no match is found.
        """
        objects = list((self if self.ordered else self.order_by('pk'))[:1])
        if objects:
            return objects[0]
        return None

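The part to focus on here is the slice [:1]. Whether the expression in parentheses is self or the return value of self.order_by('pk'), it is a QuerySet, so [:1] calls QuerySet.__getitem__ (line 159 of django/db/models/query.py) with a slice object. The later objects[0] is plain list indexing, because by then objects is already a list.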

    def __getitem__(self, k):
        """
        Retrieves an item or slice from the set of results.
        """
        if not isinstance(k, (slice,) + six.integer_types):  # k is a slice here (or an int for qs[0]), so this check passes
            raise TypeError
        assert ((not isinstance(k, slice) and (k >= 0)) or
                (isinstance(k, slice) and (k.start is None or k.start >= 0) and
                 (k.stop is None or k.stop >= 0))), \
            "Negative indexing is not supported."

        if self._result_cache is not None:
            return self._result_cache[k]

        if isinstance(k, slice):
            qs = self._clone()
            if k.start is not None:
                start = int(k.start)
            else:
                start = None
            if k.stop is not None:
                stop = int(k.stop)
            else:
                stop = None
            qs.query.set_limits(start, stop)
            return list(qs)[::k.step] if k.step else qs

        qs = self._clone()
        qs.query.set_limits(k, k + 1)
        return list(qs)[0]
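
How slicing stays lazy until something iterates the result can be shown with a toy class. This is a sketch under stated assumptions: SliceableRows, fake_select, first, and fetch_log are hypothetical names, the "database" is a Python list, and only non-negative bounds without a step are handled.

```python
class SliceableRows:
    """Toy lazy sequence: slicing narrows the window, iterating fetches."""

    def __init__(self, source, low=0, high=None):
        self._source = source          # callable(low, high) -> list of rows
        self._low, self._high = low, high
        self._result_cache = None

    def __getitem__(self, k):
        if isinstance(k, slice):
            # Assumes non-negative bounds and no step, for brevity.
            low = self._low + (k.start or 0)
            high = self._high if k.stop is None else self._low + k.stop
            if self._high is not None and high is not None:
                high = min(high, self._high)
            return SliceableRows(self._source, low, high)   # still lazy
        return list(self[k:k + 1])[0]                       # integer index

    def __iter__(self):
        if self._result_cache is None:
            self._result_cache = self._source(self._low, self._high)
        return iter(self._result_cache)


DATA = ["a", "b", "c", "d"]
fetch_log = []


def fake_select(low, high):
    fetch_log.append((low, high))      # record the limits of each "query"
    return DATA[low:high]


def first(rows):
    objects = list(rows[:1])           # the slice is lazy; list() fetches
    return objects[0] if objects else None


rows = SliceableRows(fake_select)
sliced = rows[:1]
log_after_slice = list(fetch_log)      # still empty: slicing fetched nothing
first_row = first(rows)
third_row = rows[2]
print(log_after_slice, first_row, third_row, fetch_log)
```

Taking the slice only records the limits on a new object. The fetch happens when list() iterates it, and it asks for just the requested window instead of loading everything and discarding the rest.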

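When self._result_cache is not None, the cached result is indexed directly, so there is nothing more to say. When self._result_cache is None, the slice branch sets the limits on a clone and returns it (there is no step here), and the list() call in first() then iterates that QuerySet. For an integer index, list(qs)[0] does the same thing. Either way this is another piece of Python's protocol at work: list() calls the object's __iter__ method. So look at QuerySet.__iter__ again: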

    def __iter__(self):
        """
        The queryset iterator protocol uses three nested iterators in the
        default case:
            1. sql.compiler:execute_sql()
               - Returns 100 rows at time (constants.GET_ITERATOR_CHUNK_SIZE)
                 using cursor.fetchmany(). This part is responsible for
                 doing some column masking, and returning the rows in chunks.
            2. sql/compiler.results_iter()
               - Returns one row at time. At this point the rows are still just
                 tuples. In some cases the return values are converted to
                 Python values at this location.
            3. self.iterator()
               - Responsible for turning the rows into model objects.
        """
        self._fetch_all()
        return iter(self._result_cache)

And we are back to the usual for loop over model.objects.all(), so there is no need to repeat the rest.

Operations such as model.objects.get() come down to the get() method:

    def get(self, *args, **kwargs):
        """
        Performs the query and returns a single object matching the given
        keyword arguments.
        """
        clone = self.filter(*args, **kwargs) 
        if self.query.can_filter() and not self.query.distinct_fields:
            clone = clone.order_by()
        num = len(clone)  # key line
        if num == 1:
            return clone._result_cache[0]
        if not num:
            raise self.model.DoesNotExist(
                "%s matching query does not exist." %
                self.model._meta.object_name
            )
        raise self.model.MultipleObjectsReturned(
            "get() returned more than one %s -- it returned %s!" %
            (self.model._meta.object_name, num)
        )

Focus on num = len(clone). len() triggers another Python special method: it calls __len__, so look at the source of def __len__(self):

    def __len__(self):
        self._fetch_all()  # key line
        return len(self._result_cache)

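From here it is the same as above. get returns clone._result_cache[0]; the only difference is that _result_cache holds the result already narrowed down by self.filter. In fact, for every operation discussed here, you can trace things backwards just by finding who calls self._fetch_all().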
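
The three outcomes of get() (exactly one match, none, or several) can be sketched on a plain list of dicts. toy_get, PEOPLE, and the two exception classes below are hypothetical stand-ins; in Django the exceptions are attributes of the model class, as the source above shows.

```python
class DoesNotExist(Exception):
    pass


class MultipleObjectsReturned(Exception):
    pass


def toy_get(rows, **conditions):
    """Filter first, then insist on exactly one match."""
    matches = [
        row for row in rows
        if all(row.get(key) == value for key, value in conditions.items())
    ]
    num = len(matches)
    if num == 1:
        return matches[0]
    if not num:
        raise DoesNotExist("row matching query does not exist.")
    raise MultipleObjectsReturned("get() returned %s rows" % num)


PEOPLE = [
    {"pk": 1, "name": "ann"},
    {"pk": 2, "name": "bob"},
    {"pk": 3, "name": "ann"},
]
found = toy_get(PEOPLE, pk=2)
print(found)
```

Calling toy_get(PEOPLE, pk=9) raises DoesNotExist, and toy_get(PEOPLE, name="ann") raises MultipleObjectsReturned, matching the two error branches in the source.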

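One thing is still unclear: every operation goes through the same execute_sql function, so how does it know whether it is an insert, delete, select, or update, and where does the SQL statement come from? Look at self.as_sql, which execute_sql calls. Each subclass of SQLCompiler has its own as_sql that builds a specific kind of SQL statement, filling in the table name and other parameters. For example: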

class SQLDeleteCompiler(SQLCompiler):
    def as_sql(self):
        """
        Creates the SQL for this query. Returns the SQL string and list of
        parameters.
        """
        assert len([t for t in self.query.tables if self.query.alias_refcount[t] > 0]) == 1, \
            "Can only delete from one table at a time."
        qn = self.quote_name_unless_alias
        result = ['DELETE FROM %s' % qn(self.query.tables[0])]
        where, params = self.compile(self.query.where)
        if where:
            result.append('WHERE %s' % where)
        return ' '.join(result), tuple(params)


class SQLUpdateCompiler(SQLCompiler):
    def as_sql(self):
        """
        Creates the SQL for this query. Returns the SQL string and list of
        parameters.
        """
        self.pre_sql_setup()
        if not self.query.values:
            return '', ()
        qn = self.quote_name_unless_alias
        values, update_params = [], []
        for field, model, val in self.query.values:
            if hasattr(val, 'resolve_expression'):
                val = val.resolve_expression(self.query, allow_joins=False, for_save=True)
                if val.contains_aggregate:
                    raise FieldError("Aggregate functions are not allowed in this query")
            elif hasattr(val, 'prepare_database_save'):
                if field.remote_field:
                    val = field.get_db_prep_save(
                        val.prepare_database_save(field),
                        connection=self.connection,
                    )
                else:
                    raise TypeError(
                        "Tried to update field %s with a model instance, %r. "
                        "Use a value compatible with %s."
                        % (field, val, field.__class__.__name__)
                    )
            else:
                val = field.get_db_prep_save(val, connection=self.connection)

            # Getting the placeholder for the field.
            if hasattr(field, 'get_placeholder'):
                placeholder = field.get_placeholder(val, self, self.connection)
            else:
                placeholder = '%s'
            name = field.column
            if hasattr(val, 'as_sql'):
                sql, params = self.compile(val)
                values.append('%s = %s' % (qn(name), sql))
                update_params.extend(params)
            elif val is not None:
                values.append('%s = %s' % (qn(name), placeholder))
                update_params.append(val)
            else:
                values.append('%s = NULL' % qn(name))
        if not values:
            return '', ()
        table = self.query.tables[0]
        result = [
            'UPDATE %s SET' % qn(table),
            ', '.join(values),
        ]
        where, params = self.compile(self.query.where)
        if where:
            result.append('WHERE %s' % where)
        return ' '.join(result), tuple(update_params + params)

When are the different SQLCompiler subclasses used? Look at the update method of QuerySet:

    def update(self, **kwargs):
        """
        Updates all elements in the current QuerySet, setting all the given
        fields to the appropriate values.
        """
        assert self.query.can_filter(), \
            "Cannot update a query once a slice has been taken."
        self._for_write = True
        query = self.query.clone(sql.UpdateQuery)  # key line
        query.add_update_values(kwargs)
        with transaction.atomic(using=self.db, savepoint=False):
            rows = query.get_compiler(self.db).execute_sql(CURSOR)
        self._result_cache = None
        return rows
    update.alters_data = True

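What ends up being used is the SQLUpdateCompiler class, because query.get_compiler(self.db) picks the compiler according to the type of query. The Query types are defined in django/db/models/sql/subqueries.py. So the type of query determines the SQL statement. And where is the type of query decided? In the operation itself: model.objects.filter(pk=xx).update(...) uses UpdateQuery, delete is handled along the same lines with its own query type, and so on. The operation simply clones the ordinary Query and changes the clone's __class__ attribute.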
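
The dispatch described here (the query's class names its compiler, and the compiler builds the SQL) can be sketched in a few lines. Everything below is hypothetical: ToyQuery, ToyDeleteQuery, the two toy compilers, and the COMPILERS registry are illustration-only names, and the clone method is a simplified guess at the class-swapping idea, not Django's code.

```python
class ToySelectCompiler:
    def __init__(self, query):
        self.query = query

    def as_sql(self):
        return "SELECT * FROM %s" % self.query.table, ()


class ToyDeleteCompiler(ToySelectCompiler):
    def as_sql(self):
        return "DELETE FROM %s" % self.query.table, ()


# Registry playing the role of the backend's compiler module.
COMPILERS = {
    "ToySelectCompiler": ToySelectCompiler,
    "ToyDeleteCompiler": ToyDeleteCompiler,
}


class ToyQuery:
    compiler = "ToySelectCompiler"     # name of the compiler class to use

    def __init__(self, table):
        self.table = table

    def clone(self, klass=None):
        # Copy the state, then switch the clone's class if one is given.
        obj = ToyQuery.__new__(ToyQuery)
        obj.__dict__.update(self.__dict__)
        obj.__class__ = klass or self.__class__
        return obj

    def get_compiler(self):
        return COMPILERS[self.compiler](self)


class ToyDeleteQuery(ToyQuery):
    compiler = "ToyDeleteCompiler"


select_query = ToyQuery("person")
delete_query = select_query.clone(ToyDeleteQuery)
print(select_query.get_compiler().as_sql())
print(delete_query.get_compiler().as_sql())
```

The clone keeps the original table and filters but, because its class changed, it now names a different compiler, so the same get_compiler().as_sql() call produces a different statement.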

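So the key to MySQL operations in Django is query.get_compiler(using).execute_sql(). The query type is determined by the operation (update, delete, insert, and so on), using determines which database is operated on, and the query type determines the SQL statement that execute_sql runs. Together these pin down the whole operation.
The last step after these operations is the save() function, at line 718 of django/db/models/base.py. I don't think it needs analysis for now, so I'll skip it.
In short, an ORM is a wrapper, written in some programming language, around that language's database driver libraries.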

Source (from the internet): Django ORM Analysis
