We can force the issue and get an error when we close the session as soon as we get our publisher back. In this case the lazy loading is no longer possible and SQLAlchemy throws an exception:. Even better, we can do that when we create the query and do not need to change our mapping. This allows us to get exactly the data back that we need. There is a second option called selectinload that we can use to optimize the database access.
Which method should you use? In the SQLAlchemy documentation we find this advice:. Take this as a starting point and then check what queries your code produces. It is a small change to switch between selectinload and joinedload if you do not like the generated SQL. We can now write our queries in a way that will not exhaust the database. The next step is to change the database to keep it in sync with our changing application.
For this reason, while lazy loading can be expensive for related collections, in the case that one is loading lots of objects with simple many-to-ones against a relatively small set of possible target objects, lazy loading may be able to refer to these objects locally without emitting as many SELECT statements as there are parent objects.
Lazy loading can be enabled for a given attribute that is normally configured in some other way using the lazyload loader option:. However, eager loading requires that the attributes which are to be loaded be specified with the Query up front. The problem of code that may access other attributes that were not eagerly loaded, where lazy loading is not desired, may be addressed using the raiseload strategy; this loader strategy replaces the behavior of lazy loading with an informative error being raised:.
Above, a User object loaded from the above query will not have the. For example, to set up only one attribute as eager loading, and all the rest as raise:. The above wildcard will apply to all relationships not just on Order besides items , but all those on the Item objects as well. To set up raiseload for only the Order objects, specify a full path with Load :.
Conversely, to set up the raise for just the Item objects:. The raiseload option applies only to relationship attributes. For column-oriented attributes, the defer option supports the defer. Changed in version 1. This means that if the unit of work needs to load a particular attribute in order to complete its work, it will perform the load. Wildcard Loading Strategies. Raiseload for Deferred Columns. Joined eager loading is the most fundamental style of eager loading in the ORM.
Joined eager loading is usually applied as an option to a query, rather than as a default loading option on the mapping, in particular when used for collections rather than many-to-one-references.
This is achieved using the joinedload loader option:. For an attribute that is guaranteed to have an element, such as a many-to-one reference to a related object where the referencing foreign key is NOT NULL, the query can be made more efficient by using an inner join; this is available at the mapping level via the relationship. At the query option level, via the joinedload. Older versions of SQLAlchemy would convert right-nested joins into subqueries in all cases.
A central concept of joined eager loading when applied to collections is that the Query object must de-duplicate rows against the leading entity being queried.
Such as above, if the User object we loaded referred to three Address objects, the result of the SQL statement would have had three rows; yet the Query returns only one User object. As additional rows are received for a User object just loaded in a previous row, the additional columns that refer to new Address objects are directed into additional results within the User. Joined eager loading used for scalar references is however compatible with Query.
The Query. Since joined eager loading seems to have many resemblances to the use of Query. It is critical to understand the distinction that while Query. A particular query might start out using all lazy loads. After using it in context, it might be revealed that particular attributes or collections are always accessed, and that it would be more efficient to change the loader strategy for these.
The strategy can be changed with no other modifications to the query, the results will remain identical, but fewer SQL statements would be emitted. In theory and pretty much in practice , nothing you can do to the Query would make it load a different set of primary or related objects based on a change in loader strategy.
The correct way to load the User records and order by email address is to use Query. The statement above is of course not the same as the previous one, in that the columns from addresses are not included in the result at all. We can add joinedload back in, so that there are two joins - one is that which we are ordering on, the other is used anonymously to load the contents of the User.
What we see above is that our usage of Query. In this case, the two joins most probably appear redundant - which they are. But to see why joinedload does what it does, consider if we were filtering on a particular Address :. Above, we can see that the two JOINs have very different roles. One will match exactly one row, that of the join of User and Address where Address. By changing the usage of joinedload to another style of loading, we can change how the collection is loaded completely independently of SQL used to retrieve the actual User rows we want.
Below we change joinedload into subqueryload :. Subqueryload eager loading is configured in the same manner as that of joined eager loading; for the relationship.
The operation of subquery eager loading is to emit a second SELECT statement for each relationship to be loaded, across all result objects at once. This SELECT statement refers to the original SELECT statement, wrapped inside of a subquery, so that we retrieve the same list of primary keys for the primary object being returned, then link that to the sum of all the collection members to load them at once:. The subqueryload strategy has many advantages over joined eager loading in the area of loading collections.
Secondly, it allows for many collections to be eagerly loaded without producing a single query that has many JOINs in it, which can be even less efficient; each relationship is loaded in a fully separate query. Finally, because the additional query only needs to load the collection items and not the lead object, it can use an inner JOIN in all cases for greater query efficiency. Disadvantages of subqueryload include that the complexity of the original query is transferred to the relationship queries, which when combined with the use of a subquery, can on some backends in some cases notably MySQL produce significantly slow queries.
The newer style of loading provided by selectinload solves these limitations of subqueryload. Select IN loading. A query which makes use of subqueryload in conjunction with a limiting modifier such as Query. Without it, there is a chance that the inner query could return the wrong rows:.
Select IN loading is similar in operation to subquery eager loading, however the SELECT statement which is emitted has a much simpler structure than that of subquery eager loading. In most cases, selectin loading is the most simple and efficient way to eagerly load collections of objects. The only scenario in which selectin eager loading is not feasible is when the model is using composite primary keys, and the backend database does not support tuples with IN, which currently includes SQL Server.
This style of loading emits a SELECT that refers to the primary key values of the parent object, or in the case of a many-to-one relationship to the those of the child objects, inside of an IN clause, in order to load related associations:. Because the relationship between User and Address has a simple 1 primary join condition and provides that the primary key values for User can be derived from Address.
For simple 1 many-to-one loads, a JOIN is also not needed as the foreign key value from the parent object is used:. Select IN loading also supports many-to-many relationships, where it currently will JOIN across all three tables to match rows from one side to the other.
In comparison, joined and subquery eager loading always refer to multiple JOINs up to the original parent. There is no special logic in SQLAlchemy to check ahead of time which platforms support this syntax or not; if run against a non-supporting platform, the database will return an error immediately.
As always, developers should spend time looking at the statements and results generated by their applications in development to check that things are working efficiently. Which type of loading to use typically comes down to optimizing the tradeoff between number of SQL executions, complexity of SQL emitted, and amount of data fetched.
Lets take two examples, a relationship which references a collection, and a relationship that references a scalar many-to-one reference. When using the default lazy loading, if you load objects, and then access a collection on each of them, a total of SQL statements will be emitted, although each statement will typically be a simple SELECT without any joins.
When using joined loading, the load of objects and their collections will emit only one SQL statement. However, the total number of rows fetched will be equal to the sum of the size of all the collections, plus one extra row for each parent object that has an empty collection. Each row will also contain the full set of columns represented by the parents, repeated for each collection item - SQLAlchemy does not re-fetch these columns other than those of the primary key, however most DBAPIs with some exceptions will transmit the full data of each parent over the wire to the client connection in any case.
Therefore joined eager loading only makes sense when the size of the collections are relatively small. When using subquery loading, the load of objects will emit two SQL statements. The second statement will fetch a total number of rows equal to the sum of the size of all collections.
So a subquery load makes sense when the collections are larger. Databases allow various locking schemes, and tests can be programmed. But these ad hoc solutions divert attention away from the model, and soon you are back to hacking and hoping.
Aggregates define consistency boundaries. You can eager load your entire aggregate in one database call. This not only reduces the number of database calls but it also avoids the problem with inconsistent in-memory state.
I am not going to discuss this further because I am sure I cannot do justice. If you are interested you should read the book. Do not use GetCustomer method from everywhere in your code you need something related to customers. You should create more explicit methods that are context-aware and from those methods you should prefetch all required entities.
In other words, design your application so whenever you are loading an entity from database you know exactly what you need and how that entity is going to be used; so load everything you need upfront. If you are not loading something upfront it means that it is not going to be used. So you will not need lazy loading because everything you need is already loaded, and ONLY things that are needed in that specific context are loaded — nothing more, nothing less. Implement explicit methods with business meaning to load required entities There are some scenarios where you cannot know in advance what you are going to need.
For example, you may need to load and check the last orders for silver customers before authorizing a large payment while for gold customers you do not. In cases like this I prefer to load required entities explicitly from my business logic code to make that requirement very clear and readable in the code. A side benefit of this is that I will not need lazy loading support from ORM. I must say that this is much harder to implement in Active Record because, in the absence of identity map and a proper unit of work, every object graph only tracks its own changes, and if you load an object explicitly and not through lazy loading then its changes are not saved as part of the main object graph and have to be handled explicitly.
So you have to remember to call save on all separately loaded object graphs you have changed as part of one transaction. Use caching If you are using Lazy Loading and it is too late to change then caching could really help. Recently I added a cache with 5 second sliding expiry to a property call which reduced the number of database hits significantly.
The call that took database hits now only needs 5 hits, because the other calls were made to load the objects that had already been loaded. Caching could significantly decrease the number of database calls in your application; but it also means that you use potentially stale data.
You should always think very carefully before caching. When you are using caching to avoid unnecessary database calls, that is when the entity being lazily loaded has just been loaded from database, then you should cache your entities for a very short period of time.
For example, if you have an operation that takes 10 seconds to run and hits database times due to lazy loading and if you know a lot of those calls are unnecessary and are to load a very small number of entities, then you may cache those entities for 10 seconds. If your operation response time is reduced to half a second, then you can reduce your cache expiry to half a second.
Lazy loading may provide a tiny bit of benefit in some rare cases; but it is being used almost everywhere an ORM is used! The way it is used is doing far more harm than good and I believe the good it is doing is not justifiable at all. In my opinion, lazy loading should be limited to Active Record pattern, and Active Record pattern should only be used in forms over data applications.
Even in that case lazy loading should be used only when there is no high concurrency requirement or you will have the risk of working with inconsistent state. All the other issues explained above will also apply.
0コメント