Input on SQL Server window function

DominionZA

Executive Member
Joined
May 5, 2005
Messages
8,309
I use window functions quite extensively and have finally gotten myself into a hole.

Consider the following expression...

AvgOutgoingMass = AVG(sr_Hoppers.Mass) OVER (PARTITION BY sr_Levels.Shaft_Id, sr_Hoppers.Alias ORDER BY sr_Trains.EventDateTime, sr_Hoppers.N1 ROWS BETWEEN 10 PRECEDING AND 1 PRECEDING)

has the range fixed at the previous 10 records. I need to make this dynamic as each of our clients needs to be able to set how many they want to use to work out the average.

The solutions I have come up with - which are not options really...
1. Dynamic SQL. Firstly - yuk. Secondly - the query is rather large. Just don't like dynamic SQL if I can help it.
2. CASE statements where for each potential option (1, 2, 3, ...) your statement defines the entire window function with each specific n PRECEDING. Also no go as this value can be whatever the client wishes - no cap.

We are targeting SQL 2014, with a minimum requirement of SQL 2012.

Pity variables are not supported as that would just make my life simple.

Any ideas from the SQL gurus?
 

prod

Executive Member
Joined
Nov 11, 2009
Messages
6,132
Hmm my gut feeling says using row_number() somehow. Not sure though
 

DominionZA

Executive Member
Joined
May 5, 2005
Messages
8,309
Hmm my gut feeling says using row_number() somehow. Not sure though
I use that quite a bit, particularly for framing my data groups. Can not see how I would use it in this instance though.
Shot for the attempt though...

Edit: There is actually a solution using it, but would like to try avoid it as it will involve a CTE or 2. I will do this if I have to, but would way prefer a single expression.
 
Last edited:

Pho3nix

The Legend
Joined
Jul 31, 2009
Messages
30,589
I use that quite a bit, particularly for framing my data groups. Can not see how I would use it in this instance though.
Shot for the attempt though...

Edit: There is actually a solution using it, but would like to try avoid it as it will involve a CTE or 2. I will do this if I have to, but would way prefer a single expression.

CTE was going to be my suggestion :\
 

[)roi(]

Executive Member
Joined
Apr 15, 2005
Messages
6,282
Assuming this is related to calling a stored procedure; side question have you tried explictly assigning variables with EXEC, for example:
PHP:
EXEC SpProc @RowFrom = 1, @To = 23

Alternatively store the rows limits in a temp table link to this.
 

DominionZA

Executive Member
Joined
May 5, 2005
Messages
8,309
[)roi(];19254306 said:
Assuming this is related to calling a stored procedure; side question have you tried explictly assigning variables with EXEC, for example:
PHP:
EXEC SpProc @RowFrom = 1, @To = 23

Alternatively store the rows limits in a temp table link to this.
It's a view which takes care of transforming data in the ETL process.

I am sorted though - but not with a solution to this specific issue. I would still like to figure how to use dynamic values for preceding, but at least pressure is off.

Been chatting to the national site manager. He has agreed to lock down the end user options to 1 (last mass instead if average), 5 and 10. Using this I will have a 3 line CASE statement negating the need for another CTE.

I would still like to bounce around ideas though to see if there is a nice elegant way of making the preceding count dynamic :)

Shot for the input thus far peeps.
 

[)roi(]

Executive Member
Joined
Apr 15, 2005
Messages
6,282
It's a view which takes care of transforming data in the ETL process.

I am sorted though - but not with a solution to this specific issue. I would still like to figure how to use dynamic values for preceding, but at least pressure is off.

Been chatting to the national site manager. He has agreed to lock down the end user options to 1 (last mass instead if average), 5 and 10. Using this I will have a 3 line CASE statement negating the need for another CTE.

I would still like to bounce around ideas though to see if there is a nice elegant way of making the preceding count dynamic :)

Shot for the input thus far peeps.
Sure... the last option I mentioned should work though; because it's the way (in the past) that we substituted for a lack of variables; populate temp table with variable values, and then link that into your query.
 

DominionZA

Executive Member
Joined
May 5, 2005
Messages
8,309
[)roi(];19254350 said:
Sure... the last option I mentioned should work though; because it's the way (in the past) that we substituted for a lack of variables; populate temp table with variable values, and then link that into your query.
I get that dude, but it's a performance hit I would like to try avoid.
We are currently doing this using old techniques (limited to 2008 - and not even R2), and while it works - it is slow. Using a window / framing function has resulted in a massive performance improvement in our transformation processes.

We finally have all sites on 2012 / 14 so I am upgrading code.
 

[)roi(]

Executive Member
Joined
Apr 15, 2005
Messages
6,282
I get that dude, but it's a performance hit I would like to try avoid.
We are currently doing this using old techniques (limited to 2008 - and not even R2), and while it works - it is slow. Using a window / framing function has resulted in a massive performance improvement in our transformation processes.

We finally have all sites on 2012 / 14 so I am upgrading code.
No worries... you know your data / queries; but strange though I'd expect the performance hit to minor.
 

FarligOpptreden

Executive Member
Joined
Mar 5, 2007
Messages
5,396
[)roi(];19254448 said:
No worries... you know your data / queries; but strange though I'd expect the performance hit to minor.

Dealt with a similar issue at a client end of last year. The queries were fairly complex and were run against large datasets. I wrote it initially against a 2014 server, which worked a treat with windowed functions. Once I found out it had to be deployed to 2008, I had to refactor it back to CTEs and variable temp tables. The performance knock was in the order of 20x slower than windowed functions... and that's after I optimised it as much as possible in the time available.
 

[)roi(]

Executive Member
Joined
Apr 15, 2005
Messages
6,282
Dealt with a similar issue at a client end of last year. The queries were fairly complex and were run against large datasets. I wrote it initially against a 2014 server, which worked a treat with windowed functions. Once I found out it had to be deployed to 2008, I had to refactor it back to CTEs and variable temp tables. The performance knock was in the order of 20x slower than windowed functions... and that's after I optimised it as much as possible in the time available.
There are arguments for and against; some feel it's just the opposite; its like many things relative: to the data, and to the approach (code)
 

prod

Executive Member
Joined
Nov 11, 2009
Messages
6,132
I use that quite a bit, particularly for framing my data groups. Can not see how I would use it in this instance though.
Shot for the attempt though...

Edit: There is actually a solution using it, but would like to try avoid it as it will involve a CTE or 2. I will do this if I have to, but would way prefer a single expression.

This is such a nice problem to try and solve! Worthy of an interview question me thinks :D The only way I got it close was to use a CTE to limit the data with ROW_NUMBER() using a variable, and then to use ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING, also assuming the '1' is constant:

PHP:
DECLARE @RowNumber INT = 10
;WITH CTE1 AS (
	SELECT	ROW_NUMBER() OVER (PARTITION BY CustomerKey ORDER BY SalesAmount) AS [RN],
			SalesAmount,
			CustomerKey
	FROM FactInternetSales 
	WHERE CustomerKey = 11001			
)
SELECT	SUM(SalesAmount) OVER (PARTITION BY CustomerKey ORDER BY CustomerKey ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING ) AS [SumOverNew],
		SUM(SalesAmount) OVER (PARTITION BY CustomerKey ORDER BY CustomerKey ROWS BETWEEN 10 PRECEDING AND 1 PRECEDING) AS [SumOverOld]
FROM CTE1
WHERE RN <= @RowNumber

Output:

PHP:
SumOverNew	SumOverOld
NULL	NULL
4.99	4.99
9.98	9.98
18.97	18.97
27.96	27.96
37.95	37.95
59.93	59.93
94.92	94.92
148.91	148.91
688.90	688.90

Windowing functions were added rather late to MSSQL right? Maybe they will introduce the ability to use a variable in the future.
 
Last edited:

DominionZA

Executive Member
Joined
May 5, 2005
Messages
8,309
This is such a nice problem to try and solve! Worthy of an interview question me thinks :D The only way I got it close was to use a CTE to limit the data with ROW_NUMBER() using a variable, and then to use ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING, also assuming the '1' is constant:

PHP:
DECLARE @RowNumber INT = 10
;WITH CTE1 AS (
	SELECT	ROW_NUMBER() OVER (PARTITION BY CustomerKey ORDER BY SalesAmount) AS [RN],
			SalesAmount,
			CustomerKey
	FROM FactInternetSales 
	WHERE CustomerKey = 11001			
)
SELECT	SUM(SalesAmount) OVER (PARTITION BY CustomerKey ORDER BY CustomerKey ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING ) AS [SumOverNew],
		SUM(SalesAmount) OVER (PARTITION BY CustomerKey ORDER BY CustomerKey ROWS BETWEEN 10 PRECEDING AND 1 PRECEDING) AS [SumOverOld]
FROM CTE1
WHERE RN <= @RowNumber

Output:

PHP:
SumOverNew	SumOverOld
NULL	NULL
4.99	4.99
9.98	9.98
18.97	18.97
27.96	27.96
37.95	37.95
59.93	59.93
94.92	94.92
148.91	148.91
688.90	688.90

Windowing functions were added rather late to MSSQL right? Maybe they will introduce the ability to use a variable in the future.

Ye, window functions are relatively new - well the more advanced stuff at least.

2000 had bugger all.
2005 introduced ROW_NUMBER, OVER, etc.. The basics.
2008 (and R2) had no changes.
2012 and the world opened up. MS finally starting to pay some real attention to this powerful feature set.

If we had not decided to stick to only one of 3 options, then I was going to go with a CTE and ROW_NUMBER(). This would have worked, but not ideal.
Instead I am now going to use a CASE statement on the variable. If 1, then LAG(Mass), if 5 then AVG over 5 preceding. Anything else will be AVG over 10 preceding making 10 the default basically.
 

prod

Executive Member
Joined
Nov 11, 2009
Messages
6,132
Ye, window functions are relatively new - well the more advanced stuff at least.

2000 had bugger all.
2005 introduced ROW_NUMBER, OVER, etc.. The basics.
2008 (and R2) had no changes.
2012 and the world opened up. MS finally starting to pay some real attention to this powerful feature set.

If we had not decided to stick to only one of 3 options, then I was going to go with a CTE and ROW_NUMBER(). This would have worked, but not ideal.
Instead I am now going to use a CASE statement on the variable. If 1, then LAG(Mass), if 5 then AVG over 5 preceding. Anything else will be AVG over 10 preceding making 10 the default basically.

Yea, it's nice when you can get business to compromise on a solution - in my experience it only happens after you've struggled your ass off already. :p

Them: "Oh well, just limit it to X then"
Me: -___________-
 

RaptorSA

Executive Member
Joined
Sep 2, 2008
Messages
5,553
What if you just used a subquery with ROW_NUMBER() or DENSE_RANK() to build a subset that limits the amount of rows to how many they want to use in the Window function to get the average, then use unbounded...?
If they need to see all the records in the original query you can just populate a table with the above values and join it back to the original dataset.

Sorry, hope it makes sense. I'm actually also sitting with a SQL problem currently, will try out a piece of sample code later and post it.

SELECT AvgOutgoingMass = AVG(SubQ.sr_Hoppers.Mass) OVER (PARTITION BY SubQ.Shaft_Id, SubQ.Alias ORDER BY SubQ.EventDateTime, SubQ.N1 ROWS BETWEEN UNBOUNDED PRECEDING)
FROM (SELECT ROW_NUMBER()/DENSE_RANK() AS RowID,
Shaft_Id,
Alias ,
EventDateTime,
N1
FROM [youtables]) SubQ
WHERE SubQ.RowID <= @RangeAvgCalcInt)
 
Last edited:
Top