Stored Procedures

zippy

Honorary Master
Joined
May 31, 2005
Messages
10,320
Reaction score
897
Location
Kent, UK
http://grokable.com/2003/11/dont-us...ffering-from-nihs-not-invented-here-syndrome/

At just about every talk I give I always try to make several consistent statements. One of which is: ‘Whenever possible use stored procedures to access your data’.



The following day I usually receive some email from people that attended the talk – it’s always great to hear from people and how they build their applications. Every once in a while I’ll get a comment such as: ‘I don’t use stored procedures for several reasons, one of which is that it ties me to a particular database’. I just finished answering several such emails after our last couple of roadshows events and thought – great blog topic!



As soon as you put data in a database, whether its SQL Server, Oracle, or DB2, you just became tied to that particular database — surprised??? Sure, it’s just your data that is in there and you can use ANSI SQL for ‘generic/independent’ access to the data from your application. But claiming that ad-hoc SQL (SQL script embedded directly in your code) is the only way to accomplish database independence is simply incorrect. In fact there are many ways to efficiently solve this problem without creating such a brittle link between the application and the database.



One of the ways we solve this problem in ASP.NET “Whidbey” and the ASP.NET Forums (and now .Text and DotNetNuke too) is through the provider model pattern: a pluggable data access layer component that can easily be swapped out for other data access layer components. The difference is we do everything possible to take advantage of the database we’re running on. This means ensuring that all data access is through stored procedures and that the stored procedures return data to the business logic layer with no wasted cycles and no wasted data, i.e. sort the data in the database and only return records which will be used, no-more no-less.



There are several reasons for using stored procedures to access data, below I’ve attempted to capture the 3 that I always reference:



1. Ad-hoc SQL Script is brittle

You would never put business logic code in your presentation tier, right? Why put ad-hoc SQL in your business logic layer? Embedded SQL script is very brittle; small changes to the database can have severe impacts on the application. Whereas stored procedures afford abstraction between the data and the business logic layer. The data model can be dramatically changed, but the stored procedures can still return identical data.



2. Security

What permissions are required for working with your data? If embedded SQL is used it likely means that the application can execute any INSERT, UPDATE, DELETE, SELECT script it desires. You wouldn’t – hopefully you don’t – run as the administrator (sa) account on your server, so why allow full access to the database itself?



Stored procedures allow for better data protection by controlling how the data is accessed. By granting a database login EXECUTE permissions on stored procedures you can specify limited actions to the application. Additionally, stored procedures are a counter-measure to dangerous SQL Script injection attacks, a susceptibility that applications using embedded SQL are more vulnerable to.



3. Performance

A counter argument for performing data manipulation in a stored procedure, vs. simply retrieving the desired records in a SELECT statement, is it slows SQL Server down. That’s not exactly correct. In many cases you can get better performance by looping and filtering data in SQL Server than you could performing the same loops and filters in the Data Access Layer – databases are intrinsically designed to do this, while you and I have to write our own code (which do you think is going to be faster?). It is, however, important to understand how SQL Server uses indexes and clustered indexes. I’d recommend SQL Server 2000 Performance and Optimization Tuning to learn more about this and other SQL Server performance guidelines.



Beyond sorting and filtering data in the stored procedures you can also batch common work together or retrieve multiple sets of data. For example, retrieve some data, update a datetime stamp, and then insert a new record. If you were to execute these 3 tasks once a second as ad-hoc SQL this would result in 259,200/day independent database request vs. 86,400/day if all were encapsulated in a stored procedure. That’s 172,800 database connections and network IO usages that you no longer require! Consolidating work through stored procedures makes more effective use of a connection (and your system).



There are also internal performance benefits to SQL Server for using stored procedures vs. ad-hoc SQL script. When stored procedures are used SQL Server can cache the ‘execution plan’ that it uses to execute the SQL vs. having to recalculate the execution plan on each request – would you recompile your business logic class on each request? …probably not. While small details such as this appear to be trivial, when you start considering total requests/second that your SQL Server is serving, small performance improvements to repetitive code paths add up very quickly and make your server more efficient and scalable.



It’s also worth mentioning that the easiest way to get performance out of your database is to do everything you can to take advantage of the platform you are running on. For example, if I were to write an application that used a DB2 database, I would do everything I could to get the maximum performance out of that system vs. trying to write generic data access code. Why? Well the SQL Server team here at Microsoft has some of the best minds in the industry making sure SQL Server has phenomenal performance and scalability – it would be a shame not to take advantage of all that effort, right? We can assume that other database companies hire smart people for their products too — don’t get striken by N.I.H.S. (Not Invented Here S<
/STRONG>yndrome) a dreaded&n
bsp;software developer disease where nothing is good unless you wrote every line of code!



Side-note: if you jumped to the conclusion that the point of all this is to advocate all business logic residing in stored procedures, you missed the point. There needs to be balance. My general rule of thumb is: return only the necessary data, no-more no less. The more efficient you can make your data access code by rapidly opening and closing the connection to the database the better; stored procedures provide you with an effective tool for accomplishing this. Performing business logic is frowned upon, i.e. transforming the data or performing calculations on the data. Use the power of the database to thresh the wheat from the chaff. Use your business logic to turn the wheat into dough
 
I think that article was spot on for 2003.

Since then (at least in the MS world), we have the likes of LINQ and Entity Framework. That has changed things quite a bit. Stored procs still have their place, but LINQ (in part designed by the SQL server development team if I recall correctly - or at least they had input on the original Linq to SQL) has meant that much of those traditional benefits of stored procs no longer hold true.
 
*Yawn* - Go play with your mac.

But yeah you post an article from 10 years ago and think it still holds true. lol.
 
Last edited:
It still holds true mostly. Another important point missed was that in general your DB server is MUCH more powerful than the server/pc/laptop/etc the client app is running on. Anyhoo, getting a beer now ...
 
You would never put business logic code in your presentation tier, right? Why put ad-hoc SQL in your business logic layer?

I wouldnt, and dont...

There are also internal performance benefits to SQL Server for using stored procedures vs. ad-hoc SQL script. When stored procedures are used SQL Server can cache the ‘execution plan’ that it uses to execute the SQL vs. having to recalculate the execution plan on each request

except that SQL Server, and pretty much every other database DOES cache the execution plan of paramaterized queries


SQL Injection??? is this 2003??? oh wait......

The more efficient you can make your data access code by rapidly opening and closing the connection to the database the better; stored procedures provide you with an effective tool for accomplishing this.

that doesnt even make sense. firstly, in this decade, we have things called connection pools.
secondly, if your stored proc is doing extra work, how does this decrease the time spent on the DB.


what happens in the following cause using a stored proc
Code:
BEGIN TRANSACTION
..//do some stuff
COMMIT TRANSACTION

Code:
@Transactional
public void doSomething(some args) {
   executeStoredProc();
   executeAnotherStoredProc(); this fails.
}

If the 2nd stored proc fails, you can consider the process in the first stored proc to be throw-away.
If you didnt have intimate knowledge of the stored proc, you might not know that it internally commits it's transaction. when the 2nd stored proc fails, you expect that it will rollback the first.


another thing. pretty much every app i have written in the last 7 years will work on any db, with a 1 line change and dependancy import. might have to write a small adapter to handle pagination between say mySQL and oracle, but that is small compared to, having to write stored process for each provider. and then when you need to make a change, you need to update each of the stored procs, and remember that they have been updated.

stored procs are throwbacks to when people thought that you start developing an application by designing the database/table structure.

if IO performance becomes an issue (which it isnt for 99.9% of apps), we now have bigdata in memory datastores, exponentially faster than that traditional RDBMS
 
Last edited:
Stored procs do have the advantage of being more isolated and handled by a DBA (hopefully) which - means they are optimised - hopefully - whereas using something like LINQ allows any developer to write any query he likes (bypassing indexes etc in the process). Or am I talking rubbish?

Also a fix on a stored proc is much easier than fixing LINQ and doing a production roll out to the client machines (not that this happens all that much).

Not that I am in favour of one above the other - each has their place.
 
In all honesty, who has built an enterprise-scale application that's been running for several years at 50+ large clients and then just decided to change the data store? Obviously, there should be loose enough dependencies between the DAL and the queries it calls to make it feasible to do change, but I advocate taking advantage of a data store's native programming / retrieval mechanisms to get the best performance out of it.
 
Stored procs do have the advantage of being more isolated and handled by a DBA (hopefully) which - means they are optimised - hopefully - whereas using something like LINQ allows any developer to write any query he likes (bypassing indexes etc in the process). Or am I talking rubbish?

Also a fix on a stored proc is much easier than fixing LINQ and doing a production roll out to the client machines (not that this happens all that much).

Not that I am in favour of one above the other - each has their place.

Lets first get the terminology right, LINQ is not exclusive to database querying. There is LINQ To Sql, Linq To Entities, etc. If one does not design the DBML correctly then indexes will not be used but that is not the developer writing the queries problem, its in fact the person who designed the table. The same can be applied in your situation (and it happened numerous times when i was there).

The fact is that in this day and age with all these compute clusters and with bigdata like _kabal_ said a lot of the heavy lifting can be loaded onto a clustered EC instance (2 Intel Xeon E5-2670 8-core processors, 244 GiB of RAM, 240 GB of SSD-based). With that type of power i don't need some puny database to do my "heavy work". And there are much more intelligent ways to write datawrappers/layers without the need for

C/R/U/D sprocs and other random ones.
 
In all honesty, who has built an enterprise-scale application that's been running for several years at 50+ large clients and then just decided to change the data store? Obviously, there should be loose enough dependencies between the DAL and the queries it calls to make it feasible to do change, but I advocate taking advantage of a data store's native programming / retrieval mechanisms to get the best performance out of it.

So you use Sprocs for CRUD methods.. i think a part of me just died.
 
I am busy on a new project at the mo (about 70% in) and so far not used a single sproc, UDF or view.
I started this project in March using EF6 and Entity Framework Code First with migrations. It has been a bit of an adjustment as I have been SQL Server daily since 1997 and always used sprocs and views where necessary.
Part of the migration is to generate my indexes and integrity is maintained via the fluent API. Everything else is straight forward LINQ.

It has been a few months now and I have to admit - I am liking it. Having all logic built into the domain layer makes it so much easier to manage - and especially - unit test!
The staging DB is used purely for data persistence with zero logic.
Next week I start on the loading into the presentation database code (final part of the ETL) and that I will be using sprocs as it just makes more sense to aggregate information from staging to presentation using them. So they do seem to have their uses, but limited as far as I am concerned.
 
Microsoft was late to the ORM party so please don't make like they came up with the idea. They just copied it like they did with .net, MVC etc
 
Microsoft was late to the ORM party so please don't make like they came up with the idea. They just copied it like they did with .net, MVC etc

They copied .NET ? rofl okay exactly where did they copy that from? MVC is a software engineering pattern they copied it from no one they just implemented their own framework on its principals.Goddam junior developers.
 
I am busy on a new project at the mo (about 70% in) and so far not used a single sproc, UDF or view.
I started this project in March using EF6 and Entity Framework Code First with migrations. It has been a bit of an adjustment as I have been SQL Server daily since 1997 and always used sprocs and views where necessary.
Part of the migration is to generate my indexes and integrity is maintained via the fluent API. Everything else is straight forward LINQ.

It has been a few months now and I have to admit - I am liking it. Having all logic built into the domain layer makes it so much easier to manage - and especially - unit test!
The staging DB is used purely for data persistence with zero logic.
Next week I start on the loading into the presentation database code (final part of the ETL) and that I will be using sprocs as it just makes more sense to aggregate information from staging to presentation using them. So they do seem to have their uses, but limited as far as I am concerned.

Thats pretty cool, however i would not use EF even if i was paid to its just hellishly slow

Code:
Hand coded (using a SqlDataReader)	 47ms
Dapper ExecuteMapperQuery<Post>	         49ms
ServiceStack.OrmLite (QueryById)	         50ms
PetaPoco	                                         52ms	
BLToolkit	                                         80ms
SubSonic CodingHorror	                         107ms
NHibernate SQL	                                 104ms
Linq 2 SQL ExecuteQuery	                 181ms
Entity framework ExecuteStoreQuery	 631ms
 
Thats pretty cool, however i would not use EF even if i was paid to its just hellishly slow

Code:
Hand coded (using a SqlDataReader) 47ms
Dapper ExecuteMapperQuery         49ms
ServiceStack.OrmLite (QueryById)         50ms
PetaPoco                                         52ms
BLToolkit                                         80ms
SubSonic CodingHorror                         107ms
NHibernate SQL                                 104ms
Linq 2 SQL ExecuteQuery                 181ms
Entity framework ExecuteStoreQuery 631ms

Switch off change tracking when not required. Makes a big difference.
IE: In one instance I do massive amounts of inserts. I switch off change tracking which allows the objects to be populated a lot faster. I then have an extension that maps the objects to an SqlBulkCopy and I stuff the data into into the DB in that way. Works great - and very fast.

If reading large amounts of data for whatever read only reason - switch off change tracking.

But yes - there are instances where I have to leave change tracking on and I can feel the performance hit. Nothing major enough to cripple the system - but it's there. The simplicity of EF outweighs this cost though.

For CRUD operations it makes no difference.

I have used Subsonic and nHiberbate extensively. Way prefer EF.
I also prefer to stick to EF as it is an MS tech with very active development. I used Subsonic on multiple projects and them stopping further dev just p1ssed me off. Will rather use a tech that belongs to MS and we - hopefully - see support for years to come.
I still have live projects that are tied to Subsonic. Over time I am slowly converting over to EF.

What version of EF where those tests executed with? I am curious because EF has moved on leaps and bounds since version 4 (using 6 beta at the mo). I stuck with SubSonic and nHibernate until EF5 as I felt EF4 was not enterprise ready.
The test implies that straight TSQL commands were executed for the tests. Curious to see how it would pan out using EF 6 and code first.

I implement my DAL using the repository and UnitOfWork patterns anyway, so the main project does not have a clue that EF is the data driver. If the client decides to go with another ORM (or DB) , no problem.
 
"Render onto SQL what is SQL's due." I always say. (Sorry Mr Shakespeare)
I like LINQ and the likes, but sometimes the best option is SQL, and then I use it to the max - Views, Functions, Table Variables, the works. I specialize in integration, so a lot of time is working with the data and business layer. The moment you start working with external data, presentation layers, MVC or such - SQL stored procedures may not be the best option. But for ETL or the likes, lets make the server squeal.
 
Switch off change tracking when not required. Makes a big difference.
IE: In one instance I do massive amounts of inserts. I switch off change tracking which allows the objects to be populated a lot faster. I then have an extension that maps the objects to an SqlBulkCopy and I stuff the data into into the DB in that way. Works great - and very fast.

If reading large amounts of data for whatever read only reason - switch off change tracking.

But yes - there are instances where I have to leave change tracking on and I can feel the performance hit. Nothing major enough to cripple the system - but it's there. The simplicity of EF outweighs this cost though.

For CRUD operations it makes no difference.

I have used Subsonic and nHiberbate extensively. Way prefer EF.
I also prefer to stick to EF as it is an MS tech with very active development. I used Subsonic on multiple projects and them stopping further dev just p1ssed me off. Will rather use a tech that belongs to MS and we - hopefully - see support for years to come.
I still have live projects that are tied to Subsonic. Over time I am slowly converting over to EF.

What version of EF where those tests executed with? I am curious because EF has moved on leaps and bounds since version 4 (using 6 beta at the mo). I stuck with SubSonic and nHibernate until EF5 as I felt EF4 was not enterprise ready.
The test implies that straight TSQL commands were executed for the tests. Curious to see how it would pan out using EF 6 and code first.

I implement my DAL using the repository and UnitOfWork patterns anyway, so the main project does not have a clue that EF is the data driver. If the client decides to go with another ORM (or DB) , no problem.

That test was for POCO serialization

https://code.google.com/p/dapper-dot-net/

We tried using EF6 in one of our projects at work and we just found that it was way to slow in POCO materialization.

I'm not one of those people that need to only use MS tech because its MS tech i strongly believe in other options out there. We're actually moving our entire infrastructure to postgresql this weekend getting away from MS-SQL.
 
"Render onto SQL what is SQL's due." I always say. (Sorry Mr Shakespeare)
I like LINQ and the likes, but sometimes the best option is SQL, and then I use it to the max - Views, Functions, Table Variables, the works. I specialize in integration, so a lot of time is working with the data and business layer. The moment you start working with external data, presentation layers, MVC or such - SQL stored procedures may not be the best option. But for ETL or the likes, lets make the server squeal.

Yes i agree with you 100%, but there are people on this form that believe that stored procedures are the correct way to write database layers. Slamming all business logic into hundreds of stored procedures is just piss poor pathetic design.
 
So you use Sprocs for CRUD methods.. i think a part of me just died.

Why yes we do! Makes it easier to modify the data model and not make ANY changes to the DAL, BL or any other code. You can also customize the solution to a degree to suit the client's needs and leave your core solution intact.
 
"customize to suit a clients needs"? So another advocate of business logic in the DAL? Jesus!!!
 
Top
Sign up to the MyBroadband newsletter
X