I’m going to go over some methods to import data from text files into SQL Server today. The particular file I went out and grabbed is comma delimited and with a few qualifiers in it. It is a typical file you may get and a request made to import or just for your own administrative tasks.

Below is the location of field layout and file that I grabbed off the net to play with. This is just a text file comma separated of zip codes. I will attach the file as well to this blog.

http://spatialnews.geocomm.com/newsletter/2000/jan/zipcodes.html

Field 1 – State Fips Code
Field 2 – 5-digit Zipcode
Field 3 – State Abbreviation
Field 4 – Zipcode Name
Field 5 – Longitude in Decimal Degrees (West is assumed, no minus sign)
Field 6 – Latitude in Decimal Degrees (North is assumed, no plus sign)
Field 7 – 1990 Population (100%)
Field 8 – Allocation Factor (decimal portion of state within zipcode)

Example of file

6 ways to import data into SQL Server

Import Wizard

First and very manual technique is the import wizard. This is great for ad-hoc and just to slam it in tasks.

In SSMS right click the database you want to import into. Scroll to Tasks and select Import Data…

6 ways to import data into SQL Server

For the data source we want out zips.txt file. Browse for it and select it. You should notice the wizard tries to fill in the blanks for you. One key thing here with this file I picked is there are “ “ qualifiers. So we need to make sure we add “ into the text qualifier field. The wizard will not do this for you.

6 ways to import data into SQL Server

Go through the remaining pages to view everything. No further changes should be needed though

6 ways to import data into SQL Server
6 ways to import data into SQL Server

Hit next after checking the pages out and select your destination. This in our case will be DBA.dbo.zips.

Following the destination step, go into the edit mappings section to ensure we look good on the types and counts.

6 ways to import data into SQL Server

Hit next and then finish. Once completed you will see the count of rows transferred and the success or failure rate

6 ways to import data into SQL Server

Import wizard completed and you have the data!

bcp utility

Method two is bcp with a format file http://msdn.microsoft.com/en-us/library/ms162802.aspx

This is probably going to win for speed on most occasions but is limited to the formatting of the file being imported. For this file it actually works well with a small format file to show the contents and mappings to SQL Server.

To create a format file all we really need is the type and the count of columns for the most basic files. In our case the qualifier makes it a bit difficult but there is a trick to ignoring them. The trick is to basically throw a field into the format file that will reference it but basically ignore it in the import process.

Given that our format file in this case would appear like this

9.0

9

1       SQLCHAR       0       0       """         0     dummy1             ""

2       SQLCHAR       0       50      "",""      1     Field1             ""

3       SQLCHAR       0       50      "",""      2     Field2             ""

4       SQLCHAR       0       50      "",""      3     Field3             ""

5       SQLCHAR       0       50      "","        4     Field4             ""

6       SQLCHAR       0       50      ","          5     Field5             ""

7       SQLCHAR       0       50      ","          6     Field6             ""

8       SQLCHAR       0       50      ","          7     Field7             ""

9       SQLCHAR       0       50      "n"         8     Field8             ""

The bcp call would be as follows

C:Program FilesMicrosoft SQL Server90ToolsBinn>bcp DBA..zips in “C:zips.txt” -f “c:zip_format_file.txt” -S LKFW0133 -T

Given a successful run you should see this in command prompt after executing the statement

Starting copy...

1000 rows sent to SQL Server. Total sent: 1000

1000 rows sent to SQL Server. Total sent: 2000

1000 rows sent to SQL Server. Total sent: 3000

1000 rows sent to SQL Server. Total sent: 4000

1000 rows sent to SQL Server. Total sent: 5000

1000 rows sent to SQL Server. Total sent: 6000

1000 rows sent to SQL Server. Total sent: 7000

1000 rows sent to SQL Server. Total sent: 8000

1000 rows sent to SQL Server. Total sent: 9000

1000 rows sent to SQL Server. Total sent: 10000

1000 rows sent to SQL Server. Total sent: 11000

1000 rows sent to SQL Server. Total sent: 12000

1000 rows sent to SQL Server. Total sent: 13000

1000 rows sent to SQL Server. Total sent: 14000

1000 rows sent to SQL Server. Total sent: 15000

1000 rows sent to SQL Server. Total sent: 16000

1000 rows sent to SQL Server. Total sent: 17000

1000 rows sent to SQL Server. Total sent: 18000

1000 rows sent to SQL Server. Total sent: 19000

1000 rows sent to SQL Server. Total sent: 20000

1000 rows sent to SQL Server. Total sent: 21000

1000 rows sent to SQL Server. Total sent: 22000

1000 rows sent to SQL Server. Total sent: 23000

1000 rows sent to SQL Server. Total sent: 24000

1000 rows sent to SQL Server. Total sent: 25000

1000 rows sent to SQL Server. Total sent: 26000

1000 rows sent to SQL Server. Total sent: 27000

1000 rows sent to SQL Server. Total sent: 28000

1000 rows sent to SQL Server. Total sent: 29000

bcp import completed!

BULK INSERT

Next, we have BULK INSERT given the same format file from bcp

CREATE TABLE zips (

   Col1 nvarchar(50),

   Col2 nvarchar(50),

   Col3 nvarchar(50),

   Col4 nvarchar(50),

   Col5 nvarchar(50),

   Col6 nvarchar(50),

   Col7 nvarchar(50),

   Col8 nvarchar(50)

   );

GO

INSERT INTO zips

    SELECT *

      FROM  OPENROWSET(BULK  'C:Documents and SettingstkruegerMy Documentsblogcenzuszipcodeszips.txt',

      FORMATFILE='C:Documents and SettingstkruegerMy Documentsblogzip_format_file.txt'     

      ) as t1 ;

GO

That was simple enough given the work on the format file that we already did. Bulk insert isn’t as fast as bcp but gives you some freedom from within TSQL and SSMS to add functionality to the import.

SSIS

Next is my favorite playground in SSIS

We can do many methods in SSIS to get data from point A, to point B. I’ll show you data flow task and the SSIS version of BULK INSERT

First create a new integrated services project.

Create a new flat file connection by right clicking the connection managers area. This will be used in both methods

6 ways to import data into SQL Server
6 ways to import data into SQL Server

Bulk insert

You can use format file here as well which is beneficial to moving methods around. This essentially is calling the same processes with format file usage. Drag over a bulk insert task and double click it to go into the editor.

Fill in the information starting with connection. This will populate much as the wizard did.

Example of format file usage

6 ways to import data into SQL Server

Or specify your own details

6 ways to import data into SQL Server

Execute this and again, we have some data

6 ways to import data into SQL Server

Data Flow method

Bring over a data flow task and double click it to go into the data flow tab.

Bring over a flat file source and SQL Server destination. Edit the flat file source to use the connection manager “The file” we already created. Connect the two once they are there

6 ways to import data into SQL Server

Double click the SQL Server Destination task to open the editor. Enter in the connection manager information and select the table to import into.

6 ways to import data into SQL Server

Go into the mappings and connect the dots per say

6 ways to import data into SQL Server

Typical issue of type conversions is Unicode to non-unicode.

6 ways to import data into SQL Server

We fix this with a Data conversion or explicit conversion in the editor. Data conversion tasks are usually the route I take. Drag over a data conversation task and place it between the connection from the flat file source to the SQL Server destination.

New look in the mappings

6 ways to import data into SQL Server

And after execution…

6 ways to import data into SQL Server

SqlBulkCopy Method

Sense we’re in the SSIS package we can use that awesome “script task” to show SlqBulkCopy. Not only fast but also handy for those really “unique” file formats we receive so often

Bring over a script task into the control flow

6 ways to import data into SQL Server

Double click the task and go to the script page. Click the Design script to open up the code behind

6 ways to import data into SQL Server

Go ahead and put this code into the task.

Imports System

Imports System.Data

Imports System.Math

Imports System.Xml

Imports System.Data.SqlClient

Imports System.Data.OleDb

Imports Microsoft.SqlServer.Dts.Runtime



Public Class ScriptMain





    Public Sub Main()

        Dim conn As New SqlConnection("Data Source=LKFW0133;Initial Catalog=DBA;Integrated Security=SSPI")



        Using bulk As New SqlBulkCopy(conn.ConnectionString)

            bulk.BatchSize = 1000

            bulk.NotifyAfter = 1000

            bulk.DestinationTableName = "zips"

            AddHandler bulk.SqlRowsCopied, AddressOf OnSqlRowsCopied

            bulk.WriteToServer(LoadupDTFromTxt)

        End Using



        Dts.TaskResult = Dts.Results.Success

    End Sub



    Private Sub OnSqlRowsCopied(ByVal sender As Object, _

        ByVal args As SqlRowsCopiedEventArgs)

        Console.WriteLine("Copied {0} so far...", args.RowsCopied)

    End Sub





    Private Function LoadupDTFromTxt() As DataTable

        Dim cn As New OleDbConnection("Provider=Microsoft.Jet.OLEDB.4.0; Data Source=C:;Extended Properties=""Text;HDR=No;FMT=Delimited""")

        Dim da As New OleDbDataAdapter()

        Dim ds As New DataSet

        Dim cd As New OleDbCommand("SELECT * FROM C:zips.txt", cn)

        cn.Open()

        da.SelectCommand = cd

        ds.Clear()

        da.Fill(ds, "zips")

        Return ds.Tables(0)

        cn.Close()

    End Function

End Class
6 ways to import data into SQL Server

Then execute the script task

Again the same results as previous methods but with a new look.

All of these methods have a place in each unique situation. Performance wise in my experience, bcp wins on speed typically. This is not always a method you can use though which leads us to other resources SQL Server and services provide us.

Of course none of this would be completely finished unless we added some statistics on runtimes along with these methods. I went ahead and created 3 addition zips.txt files to go with the original. In these files we have the following

zips.txt – 29,471 rows at around 1.8MB
zips_halfmill.txt – 500,991 rows at around 31.4MB
zips_million.txt – 1,001,981 rows at around 62.8MB
zips_5mill.txt – 5,009,901 rows and around 314.3MB

I ran these each through all the methods. My results are below. Mind you, the important thing to understand is that I write my blogs/articles off my eprsonal test lab. In no way do I utilize monster servers that would be more suited for benchmarking each. All hardware is created not so equal and results will be varying given that variable. Take these results but keep in mind that resources will make them move up and down the chart. Memory, I/O and CPU is a big factor in speed.

Tests were complete by running each process 5 times. All resources cleared on each execution.
Shown in AVG of milliseconds between the types. Import Wizard was not tested as this is basically a Data Flow Task behind the scenes and can be seen (minus the slow user clicking things) from the Data Flow Task in SSIS results

6 ways to import data into SQL Server

Hope this helps as a good reference in your own imports.