Monday, October 8, 2012

SQL - Import Data

A few pointers on importing data into a SQL database.

The project that I'm currently working on required data to be imported into SQL server from a different database. Fortunately, SQL Management Studio has a nice feature for importing data. But importing data into SQL is not that strait forward. Things can go wrong depending on the type of data that your importing so it pays to spend some time looking at the data that you are going to be importing before you start.

The database that I was exporting the data from gave me three options for export; Comma Separated Values (CSV), Tab Separated Values (TSV), and XML. CSV was the one that I used the most and XML wasn't useful at all with the data that I needed to export.

My first attempt was importing the CSV file into SQL. This didn't work at all because I had a field the contained carriage returns and line feeds. Since a record is denoted by a new line you can see how this would cause a problem. Other fields like date fields could also cause problems with the import.

The next step was to open the CSV file in Excel and save it as an Excel (xls) file. In some cases this was all that I needed to do to be able to import the data into SQL.

When the Excel file wouldn't work I would import either the CSV or the Excel file into Access. The nice thing about doing the Access import is that Access will show you the records and fields where it's having a problem with the data. The Access file import into SQL usually worked but occasionally I would need to change the field type in the Access database.

The last thing that caused a problem was bogus (end of line ) characters in foreign key fields. This prevented my SQL statements from being able to match the foreign keys from the imported data with my SQL tables. They would show up as space characters and were not removable using the SQL RTIM function because they are not a space character! The fix for that was the following SQL to remove the problem characters:

Update [QuickBaseImportsv1].[dbo].[SubAgreements3]
set [Task#] = ltrim(rtrim(replace([Task#], char(160), char(32))))
Update [QuickBaseImportsv1].[dbo].[SubAgreements3]
set [Task#] = ltrim(rtrim(replace([Task#], char(10), char(32))))
Update [QuickBaseImportsv1].[dbo].[SubAgreements3]
set [Task#] = ltrim(rtrim(replace([Task#], char(13), char(32))))

One final piece of advice - unless you are dealing with a very simple table, do not try to import the data directly into your final, target SQL tables. Import the data into an intermediary table first where you can massage the data before transferring the data to your target SQL tables.

No comments:

Post a Comment

Thank you for commenting!