Trying to bulk-load a huge amount of data into MS SQL Server.
But, unfortunately the table already contains a few records with matching keys.
What to do?
So what to do.
Googled till I was sweaty.
Finally, the most useful advice is the one documented in SQLMag.com:
Umachandar Jayachandran – A Bulk-Copy Procedure
The solution is actually attributed to
- Alejandro Mesa, a database programmer for Simplex Medical in Fort Myers, Florida
- Marcos Kirchner, a student at the University of Blumenau (FURB) in Santa Catarina, Brazil
The advice is to create a unique index on the columns that infer sameness and be sure to indicate that duplicates should simply be discarded.
The following statement creates the new index:
CREATE UNIQUE INDEX [index-name] ON [table-name] ( [column-1] , [column-2] , [column-3] ) WITH ignore_dup_key
CREATE UNIQUE INDEX [uq_idx_companies_id] ON Companies(CompanyId) WITH ignore_dup_key
So it looks like it is OK to leave everything as is, just create a unique index (with ignore_no_dup) that has the same columns as the primary key. Import the data and go back and drop the ignore_dup_key index.