Monday, July 02, 2007

Rico Mariani's DLinq (LINQ to SQL) Performance Tips (Round 3)

While waiting for Rico to drop his other shoe and provide some Beta 2 LINQ to SQL performance numbers, it occurred to me that increasing the size of the object property (field) values would have an deleterious effect on LINQ to SQL performance ratios.

Update 7/6/2007: Rico dropped the other shoe yesterday at 5:30 PM: DLinq (Linq to SQL) Performance (Part 4). Matt Warren comments at LINQ to SQL: Rico drops the other shoe. There's not much more to say until we get Beta 2.

Update 7/16/2007: Prior to moving to his new position as Chief Architect of Visual Studio 2008, Rico summarized his LINQ to SQL performance tests with Beta 2 bits in his July 16, 2007 DLinq (Linq to SQL) Performance Part 5 post. Rico achieved 93% of the performance of the underlying SqlClient/SqlDataReader data provider with compiled LINQ to SQL query expressions that exaggerate LINQ overhead. I'll see if I can duplicate his results when Beta 2 becomes generally available in two to three weeks.

Per Row Costs

As noted in my updated response to Rico's part 2, his tests used a short-form version of the Northwind Orders table: Just the OrderID, CustomerID, EmployeeID, and ShippedDate fields, which total 26 bytes of SQL Server datatypes. The objects have an additional ShipCountry property that SELECT AVG(LEN(ShipCountry)) FROM Orders says is 5 Unicode characters or 10 bytes. So you don't need to look it up, here's a copy of the performance metrics for the lightweight objects with the 36-byte (average) load:


Rows

Base, s.
LINQ (Param), s. Param / Base LINQ (Compiled), s. Compiled / Base Param / Compiled
0 0.293 1.941 6.63 0.589 2.01 3.30
6 0.317 2.051 6.47 0.664 2.09 3.08
22 0.340 2.299 6.76 0.803 2.36 2.86
77 0.429 3.164 7.38 1.273 2.96 2.49
122 0.507 3.894 7.68 1.684 3.32 2.31

A reminder: Base is the time to create and populate 500 instances of the objects with an SqlDataReader from an SQL Server 2005 Express instance on the same machine as the test harness. The code adds one instance per row to a List<Order> to emulate populating LINQ to SQL's IEnumerable<Order>. (The C# test harness now has 330 lines of code.)The LINQ (Param) and LINQ (Compiled) are times to execute 500 conventional-parameterized and compiled-parameterized LINQ queries, respectively.

Following are execution times and performance ratios for objects created from the entire Orders table with rows that average 172 bytes. The LINQ (Param) and LINQ (Compiled) times had more scatter than those for the smaller objects, but the performance patterns are similar (and to be expected).


Rows

Base, s.
LINQ (Param), s. Param / Base LINQ (Compiled), s. Compiled / Base Param / Compiled
0 0.304 1.948 6.41 0.579 1.90 3.36
6 0.350 2.504 7.15 0.739 2.11 3.38
22 0.426 2.866 6.73 1.084 2.54 2.64
77 0.735 4.059 5.55 2.243 3.05 1.81
122 0.973 5.033 5.17 3.126 3.21 1.61

The moral of this story is that increasing the average width of the row (weight of the object) by a factor of 4.78 increases execution time of a compiled query by a factor of 1.13 for 6 rows to 1.86 for 122 rows. If you need high performance when querying heavyweight objects, you might consider adding a second object widget with only the fields you need to the designer. But don't fall victim to an "entity explosion" by creating a custom entity for every table with lengthy varchar or nvarchar fields.

Entity/Identity Management

Rico recommends providing a read-only DataContext type to improve performance. The way I read the tea leaves is that setting the DataContext.EnableObjectTracking property value to false makes the DataContext read-only. However, my tests show no measurable performance difference with object tracking on or off.

Data Binding

Rico didn't provide the source code for the "necessary helper for getting and storing the fields so that there is effectively straight line code calling the underlying provider," so I'm not sure what he had in mind. Perhaps SqlDataReader code similar to that for the base data?

1 comments:

Anonymous said...

>>Rico didn't provide the source code for the "necessary helper for getting and storing the fields so that there is effectively straight line code calling the underlying provider,"

The helper is automatically generated. It's step 4 in my latest blog

http://blogs.msdn.com/ricom/archive/2007/07/16/dlinq-linq-to-sql-performance-part-5.aspx

So there's no source needed.