How good is your SQL Kung-Fu?

envo

Expert Member
Joined
Jan 14, 2014
Messages
3,265
Reaction score
437
I need some help and I don't know if this is possible or not

I have a query that returns a few hundred thousand records based on some specific criteria management needs to see.

I need to return that data including any data that wasn't returned in the initial query (left joins and such isn't possible in this one)

Is it possible to do something like:

select some,data from query_using_table
union all
select other,data from data_from_same_table_as_above where records not in (select_query_above)

Or do I have to do crap like temp tables to get what I want?
 
I think you can use the following.

select some,data from query_using_table where some_where_clause
union
select some,data from data_from_same_table_as_above

Read up on how union works, but basically a union should contain all entries which have not been found yet.

To differentiate between the 2 sets of data you could add a column to both queries with some identifier like 1 or 2. You can call that column anything as long as they are both the same in the 2 queries.
 
Temp tables would be easiest, but i would just use Cursors.
 
I need some help and I don't know if this is possible or not

I have a query that returns a few hundred thousand records based on some specific criteria management needs to see.

I need to return that data including any data that wasn't returned in the initial query (left joins and such isn't possible in this one)

Is it possible to do something like:

select some,data from query_using_table
union all
select other,data from data_from_same_table_as_above where records not in (select_query_above)

Or do I have to do crap like temp tables to get what I want?

Woah there cowboy

Cr@p like temp tables ?

I think you will find that when working with a large amount of data like that temp tables are your friend.

I'd say most definitely go the temp table route, performance will be vastly superior.

EDIT: I see you want to basically run the same query twice, so this is probably the best reason to use a temp table.
 
Errr,i don't get it, so you are selecting -everything- from the table anyway? What is the difference between the 2 queries then?

If TableX have 100k rows in it, and you are going to return all 100k , why aren't you doing this in 1 single select statement? Why are you doing 2 statements? Is there a difference between the 2 data sets?

I would probably first try something like this, depending on how complicated the difference between the 2 datasets are i.e


SELECT
(CASE WHEN {whatever you used in your where clause for first dataset}
THEN 'Group 1' ELSE 'Group 2' END) AS [Category] ,

(all the other columns you need here)

FROM TableX


So you don't have a where clause, since you are returning all the data anyway ?


The TEMP table route would probably involved having to go.

1. Declare the TEMP table (#table , not @table variable since we're talking alot of data here, i found table variables are very slow when it comes to alot of rows and lack of indexing, but i guess you play around with this.....table variables are easier to work with, don't need to drop etc)
2. Insert into #TEMP from TableX -> your first select statement
3. Insert into #TEMP from TableX with a left join back to #TEMP where key column is null.
4. Select all from #TEMP

But again, above doesn't make sense, since you are selecting all the data anyway.
 
Last edited:
the first query is a subset of data, I then want to get the rest that doesn't match the subset based on other criteria.

for example, our system generates invoices based off of work orders produced by CRM. That's the data I report off of. They now want to see manual invoices generated too. So I need to get invoices that's not in the system generated list.

I'll just use temp tables
 
Tricky - it always is with huge volumes. From what I understand - I'd play around with Union until I get what I wanted - just like Scooby suggested. No Temp Table or Table Variable required (Better housekeeping)
Just remember - Union will remove duplicates - so it would automatically exclude Result1 from Result2. Union All won't filter duplicates - so Result2 would need a Not In clause. (Let SQL do the hard work for you.)

And depending on how many rows you need to return - I'd even try to see if the opposite is not more efficient - try to exclude the rest of the data that does not fit into your initial criteria. It all depends on what you want to do with it.
Sometimes you need to step back and look at the problem from a different angle.
 
the first query is a subset of data, I then want to get the rest that doesn't match the subset based on other criteria.

for example, our system generates invoices based off of work orders produced by CRM. That's the data I report off of. They now want to see manual invoices generated too. So I need to get invoices that's not in the system generated list.

I'll just use temp tables

Ah - now it makes sense. Use two table variables or temp tables - just add your own indicator to differentiate.
 
the first query is a subset of data, I then want to get the rest that doesn't match the subset based on other criteria.

for example, our system generates invoices based off of work orders produced by CRM. That's the data I report off of. They now want to see manual invoices generated too. So I need to get invoices that's not in the system generated list.

I'll just use temp tables


and you can't do this?

SELECT
[InvoiceType] =
CASE
WHEN invoices.worksOrder is null THEN 'Manual Invoice'
WHEN invoices.something = "something" THEN 'System Invoice'
WHEN [as many as you want]
ELSE 'Invalid Invoice' END,


(the rest of the columns from the invoices table)


FROM invoices


and then you can filter out where InvoiceType = 'Invalid Invoice' in the report or in the where clause etc.
 
Last edited:
Assuming that your source table is the same for both statements, you're basically selecting all of the data, but you want to indicate which data is 'selected' and which wasn't all in one result set.

In that case, just use 'CASE' to flag the row as 'selected' or 'not selected'; you might want to first order by the CASE value.

Give us a bit more visibility to the query you currently have going so that we can fully understand what you're trying to achieve.
 
If you are using MS SQL..

Investigate CTE's as well as possibly the MERGE statement (which can do includes, excludes etc)
 
No offense to any of the guys who responded, but you'll get much better responses if you ask this on StackOverflow.com. Get answers from the experts.

My SQL skills are a bit rusty, but using a NOT IN clause, especially on such big datasets, is a big NO NO. The performance of your query will go down the toilet. It won't matter if you're running this once a week or once a month, but for a query that will get executed often, you're looking for trouble.

And as pointed out by ToxicBunny, you need to provide us with the query you already have going, so that we can see what you're trying to achieve. Otherwise people will be advising you based on what they THINK you're trying to achieve.
 
Learnt and used CTE's today and they are really powerful.
Worth a look.

Yeah... they're just a bit of a mindfsck the first few times you look at them, but once you figure them out.. MASSIVELY powerful, and give great performance as well.
 
I need some help and I don't know if this is possible or not

I have a query that returns a few hundred thousand records based on some specific criteria management needs to see.

I need to return that data including any data that wasn't returned in the initial query (left joins and such isn't possible in this one)

Is it possible to do something like:

select some,data from query_using_table
union all
select other,data from data_from_same_table_as_above where records not in (select_query_above)

Or do I have to do crap like temp tables to get what I want?

Just as a side note on this (even though we would need to see exactly what you're trying to achieve), in its most basic form.. what you are trying to achieve is a SELECT * FROM result set with the way you have laid out your query.
 
Top
Sign up to the MyBroadband newsletter
X