SQL Select Statement Help

The fun stuff begins when you have to write the SQL to remove the duplicates.... NO stored procs/cursors just plain SQL

Read, steady, go...
 
The fun stuff begins when you have to write the SQL to remove the duplicates.... NO stored procs/cursors just plain SQL

Read, steady, go...

Not really, actually a very easy query:

Code:
DELETE n1 FROM [B][I]table [/I][/B]n1, [B][I]table [/I][/B]n2 WHERE n1.[B][I]id [/I][/B]< n2.[B][I]id [/I][/B]AND n1.[B][I]field [/I][/B]= n2.[B][I]field[/I][/B]

Replace the table above - id refers to your id field - replace the field. This will remove all duplicates and leave only 1.
 
Not really, actually a very easy query:

Code:
DELETE n1 FROM [B][I]table [/I][/B]n1, [B][I]table [/I][/B]n2 WHERE n1.[B][I]id [/I][/B]< n2.[B][I]id [/I][/B]AND n1.[B][I]field [/I][/B]= n2.[B][I]field[/I][/B]

Replace the table above - id refers to your id field - replace the field. This will remove all duplicates and leave only 1.
Hmm dont understand the syntax. Is this sql server? Not sure if this is std SQL. Is it like delete from table where exists?

same sql, just

delete from table where id in (SQL select)
Nope, the group by having does not give you ids
 
Just to confirm that the 1st answer is correct but just needs a minor tweak

SELECT Project,Task,Resource,Status, Count(*) from Table
group by Project,Task,Resource,Status
having count(*) > 1

this will show all duplicates based on the columns listed in the select. The count will give you how many records have the exact same values for Project,Task,Resource,Status

Trillkops code is pretty accurate to delete the duplicates.. self join a table.
 
The fun stuff begins when you have to write the SQL to remove the duplicates.... NO stored procs/cursors just plain SQL

Read, steady, go...

Oracle has a nice way of dealing with this:
Code:
 delete from $table_name where rowid in
  (
  select "rowid" from
     (select "rowid", rank_n from
         (select rank() over (partition by $primary_key order by rowid) rank_n, rowid as "rowid"
             from $table_name
             where $primary_key in
                (select $primary_key from $table_name
                  group by $all_columns
                  having count(*) > 1
                )
             )
         )
     where rank_n > 1
  )

Keeps the "first version" of the duplicates and deletes the rest

Im sure sql server also has the rank function. You can use a timestamp/date field in the order by to also rank the rows you want to discard
 
Last edited:
Top
Sign up to the MyBroadband newsletter
X