Some pretty basic SQL query help

Brawler · Jan 18, 2014

Hi. I haven't done SQL in a damn long time. I'd appreciate some help with this rather weird query that I need to run. I've made this one up but the logic is the same.

Bear with me:

2x tables:
Customer(custId, name, surname, etc)
Order(orderId, custId, numberOfOrder)

Clarity on orderId and numberOfOrder: OrderID is just a unique number for each row (PK) and numberOfOrder is the number of the order per customer, i.e. the first order will be 1, the second 2 and so forth.

Obviously each customer can have many orders.

What I want the query to do:
I want to select all of the customers names and surnames who do not have a NumberOfOrder of 1.

I have joined the tables and all and I can easily search for those that have an order number of 1 and then filter them out from the original dataset to get the values I am looking for but I am hoping there is an easier way to do this in SQL?

froot · Jan 18, 2014

SELECT Customer.name, Customer.surname, Orders.numberOfOrder FROM Customer
JOIN Orders ON
Customer.custID = Orders.custID
WHERE Orders.numberOfOrder <> 1

You shouldn't use the table name "Order". It's a keyword.

Bernie · Jan 18, 2014

froot said:
SELECT Customer.name, Customer.surname, Orders.numberOfOrder FROM Customer
JOIN Orders ON
Customer.custID = Orders.custID
WHERE Orders.numberOfOrder <> 1

You shouldn't use the table name "Order". It's a keyword.

If there are no records at all for a customer in the order table will this be picked up? If not, then maybe some version of this plus an outer join of sorts.

skimread · Jan 18, 2014

SELECT Customer.custID, Customer.name, Customer.surname
FROM Customer
INNER JOIN Orders
ON Customer.custID = Orders.custID
WHERE Orders.numberOfOrder <> 1
GROUP BY Customer.custID, Customer.name, Customer.surname

froot · Jan 18, 2014

skimread said:
GROUP BY Customer.name, Customer.surname

Of course, and add the group by clause to eliminate duplicates of people, if that is what you require.

DominionZA · Jan 18, 2014

Bernie said:
If there are no records at all for a customer in the order table will this be picked up? If not, then maybe some version of this plus an outer join of sorts.

Just left join to ensure you get all customers regardless of having orders or not.

If you got duplicate customers then bad initial design. Should have a unique index on customer name and surname preventing duplicates getting in in the first place.
Can't understand why grouping to identify duplicates even came up to be honest.

skimread · Jan 18, 2014

froot said:
Of course, and add the group by clause to eliminate duplicates of people, if that is what you require.

Since the order table have multiples of the same custID the grouping is to eliminate the same results appearing in the output not to eliminate duplicates of different people with the same name

semaphore · Jan 18, 2014

Mike Smit said:
Just left join to ensure you get all customers regardless of having orders or not.

If you got duplicate customers then bad initial design. Should have a unique index on customer name and surname preventing duplicates getting in in the first place.
Can't understand why grouping to identify duplicates even came up to be honest.

Indexing on name and surname? Now who is talking bad design. You do know people can have the same name and surname right? Like at my work place we got two people identical name and surname so i guess one of them cant order anything :wtf:.

He will need to think of a better way to isolate duplicate clients, or actually not at all. I mean i've registered three times on amazon with the exact same details except for my email. So use an email has a unique identifier per customer.

DominionZA · Jan 18, 2014

semaphore said:
Indexing on name and surname? Now who is talking bad design. You do know people can have the same name and surname right? Like at my work place we got two people identical name and surname so i guess one of them cant order anything :wtf:.

He will need to think of a better way to isolate duplicate clients, or actually not at all. I mean i've registered three times on amazon with the exact same details except for my email. So use an email has a unique identifier per customer.

Was in argument to grouping by name/surname to handle duplicate customers. Read first - then argue.
I even quoted it. How did you miss that?

Hamster · Jan 18, 2014

Mike Smit said:
Just left join to ensure you get all customers regardless of having orders or not.

If you got duplicate customers then bad initial design. Should have a unique index on customer name and surname preventing duplicates getting in in the first place.
Can't understand why grouping to identify duplicates even came up to be honest.

Left join, yes

Unique constraint on name/surname, no

Interesting to see the group by. I would have used "distinct". Does the group by offer a performance gain?

semaphore · Jan 18, 2014

Mike Smit said:
Was in argument to grouping by name/surname to handle duplicate customers. Read first - then argue.
I even quoted it. How did you miss that?

Whether you where responding to something of that nature or not is immaterial, the advice you put forward was flawed. And if you read in the actual group by the customer id would have been there so in essence it would have been unique unless the post was edited after the fact, which it seems so. Even then still your advice was flawed.

GROUP BY Customer.custID, Customer.name, Customer.surname

DominionZA · Jan 18, 2014

semaphore said:
Whether you where responding to something of that nature or not is immaterial, the advice you put forward was flawed. And if you read in the actual group by the customer id would have been there so in essence it would have been unique unless the post was edited after the fact, which it seems so. Even then still your advice was flawed.

Okay.

semaphore · Jan 18, 2014

Hamster said:
Left join, yes

Unique constraint on name/surname, no

Interesting to see the group by. I would have used "distinct". Does the group by offer a performance gain?

Distinct will be faster if you have indexes, where as group by's will suffer performance if its having to cluster fields together.

Hamster · Jan 18, 2014

semaphore said:
Distinct will be faster if you have indexes, where as group by's will suffer performance if its having to cluster fields together.

Never knew.

semaphore · Jan 18, 2014

Hamster said:
Never knew.

But yeah that being said sql server is clever enough to do internal optimizations and use what is required. Unless you're using DB2 of course

Hamster · Jan 18, 2014

semaphore said:
But yeah that being said sql server is clever enough to do internal optimizations and use what is required. Unless you're using DB2 of course

%#}€*!! DB2

zippy · Jan 18, 2014

Indexes on names can be very unreliable. Some people have strange spellings and the user might spell it incorrectly or it may have been spelled incorrectly when first captured. The solution to this is have a column which stores the SOUNDEX version of the last name. The search is on this column. Any customer search should always list the customers with the matching names and then include date of birth and address as well. Customer services users should have been trained to verify the identity by asking for the customers date if birth and address. It's then easy to identify the correct record. The chances of someone living at the same address with the same name, same gender and same date of birth are very remote.

Don't try unique indexes on names. It's a world of pain. Duplicate customer records are always an issue. Proper training of your customer services helps

Sodan · Jan 19, 2014

Brawler said:
Hi. I haven't done SQL in a damn long time. I'd appreciate some help with this rather weird query that I need to run. I've made this one up but the logic is the same.

Bear with me:

2x tables:
Customer(custId, name, surname, etc)
Order(orderId, custId, numberOfOrder)

Clarity on orderId and numberOfOrder: OrderID is just a unique number for each row (PK) and numberOfOrder is the number of the order per customer, i.e. the first order will be 1, the second 2 and so forth [1].

Obviously each customer can have many orders.

What I want the query to do:
I want to select all of the customers names and surnames who do not have a NumberOfOrder of 1 [2].

I have joined the tables and all and I can easily search for those that have an order number of 1 and then filter them out from the original dataset to get the values I am looking for but I am hoping there is an easier way to do this in SQL?

This is confusing: as per [1] above, each customer's first order will have a numberOfOrder value of 1, second order will have a numberOfOrder value of 2, and so on. However, as per [2] you're looking for all customers that don't have a numberOfOrder of 1. These then can only be customer that have no orders at all!

However, if we're only looking for customers that have placed more than one order, then, apart from duplicates, this will work fine:

froot said:
SELECT Customer.name, Customer.surname, Orders.numberOfOrder FROM Customer
JOIN Orders ON
Customer.custID = Orders.custID
WHERE Orders.numberOfOrder <> 1

You shouldn't use the table name "Order". It's a keyword.

But that doesn't include customers that have no orders. To include them, and simultaneously remove duplicates, try:

Code:

select c.custid, c.name, c.surname
from Customers c
where not exists (select 1 from "Order" o.custid = c.custid and o.numberOfOrder <> 1)

What is actually the business information you want out of this?

Hamster · Jan 19, 2014

Sodan said:
What is actually the business information you want out of this?

Methinks he has been tasked to identify bad data. Maintenance instead of business code.

Brawler · Jan 19, 2014

Hamster said:
Methinks he has been tasked to identify bad data. Maintenance instead of business code.

Correct.

Sodan said:
This is confusing: as per [1] above, each customer's first order will have a numberOfOrder value of 1, second order will have a numberOfOrder value of 2, and so on. However, as per [2] you're looking for all customers that don't have a numberOfOrder of 1. These then can only be customer that have no orders at all!

What is actually the business information you want out of this?

It's bad data. I am trying to find all customers that have a missing first order.

Join the MyBroadband community

Get started

Some pretty basic SQL query help

Honorary Master

Honorary Master

Expert Member

Honorary Master

Honorary Master

Executive Member

Honorary Master

Honorary Master

Executive Member

Resident Rodent

Honorary Master

Executive Member

Honorary Master

Resident Rodent

Honorary Master

Resident Rodent

Honorary Master

Expert Member

Resident Rodent

Honorary Master