On the recent project I worked on, I had a requirement to load data from an Excel spreadsheet to a database on a regular basis. My goal was to load the data from Excel to a ADO.NET Dataset and run my business rules. I couldn't find any articles addressing this problem and so I decided to write one.
There are two projects included in the source code package.
ExcelADOAdapter
builds the library, ExcelADOAdapter.dll. ExcelAdapterTest
is a test Windows Form application. In addition, you need to add the following references.
All sample codes in this articles are in C#.
This component uses Microsoft Office 2003 Interop Assemblies to handle the Excel source and the code is written in C#. The individual worksheets on the Excel workbook will be loaded into separate Datatable
s on the DataSet. I have a workbook, PurchaseOrder, with 5 worksheets; Customers, Orders, OrderDetails, Shippers and Products. The load process will create a dataset with 5 tables. The name of the dataset will be set to the name of the Excel workbook, in this example PurchaseOrder. The DataTable
names will be the name of the worksheets.
For more information on the Office 2003 Interop Assemblies, refer MSDN Reference.
The most important thing to remember when working with Excel is to get a valid range of cells on the worksheet to work with. We certainly don't want all the empty cells to get loaded into the DataTable
s. The logic that I used to get the valid range is depicted below:
/// <summary /> /// Gets the valid range of cells to work with. /// </summary /> /// /// <returns />Excel Range</returns /> private Microsoft.Office.Interop.Excel.Range getValidRange(Worksheet sheet)
{
string downAddress = "";
string rightAddress = "";
long indexOld = 0;
long index = 0;
Microsoft.Office.Interop.Excel.Range startRange;
Microsoft.Office.Interop.Excel.Range rightRange;
Microsoft.Office.Interop.Excel.Range downRange;
try
{
// get a range to work with startRange = sheet.get_Range("A1", Missing.Value);
// get the end of values to the right // (will stop at the first empty cell) rightRange = startRange.get_End(XlDirection.xlToRight);
/*get_End method scans the sheet in the direction specified *until it finds an empty cell and returns the previous cell. *We need to scan all the columns and find the column with *the highest number of rows (row count). *Then use the Prefix character on the right cell and the *row count to determine the address for the valid range. */
while (true)
{
downRange = rightRange.get_End(XlDirection.xlDown);
downAddress = downRange.get_Address(false, false,
eferenceStyle.xlA1, Type.Missing, Type.Missing);
index = getIndex(downAddress);
if (index >= 65536) index = 0;
if (index > indexOld) indexOld = index;
if (rightRange.Column == 1) break;
rightRange = rightRange.Previous;
}
rightRange = startRange.get_End(XlDirection.xlToRight);
rightAddress = rightRange.get_Address(false, false,
ReferenceStyle.xlA1, Type.Missing, Type.Missing);
return sheet.get_Range("A1", getPrefix(rightAddress) + indexOld);
}
catch (Exception)
{
throw;
}
finally
{
startRange = null;
rightRange = null;
downRange = null;
}
}
startRange = sheet.get_Range("A1", Missing.Value);
sets the starting cell as A1
which is the top left cell on the Excel worksheet. The get_End
method on the Range
object scans the sheet in the direction specified until it finds a empty cell and returns the previous cell. But a blank cell can also hold valid values as shown below.
To avoid this problem, you need to check all the columns and find the column with the highest number of rows and use that as the boundary for the valid range. But still if all the cells in the row are blank, the get_End
method will not find any rows after that row.
Once after a valid range is found, we can create a two dimensional array and load the array to a DataTable
. See the code segment below.
/// <summary /> /// Loads the data table. /// </summary /> /// /// <returns /></returns /> private System.Data.DataTable loadDataTable(Worksheet sheet)
{
object[] columnValues;
System.Data.DataTable dt = new System.Data.DataTable();
dt.TableName = sheet.Name;
Microsoft.Office.Interop.Excel.Range range;
try
{
range = getValidRange(sheet);
object[,] values = (object[,])range.Value2;
columnValues = new object[values.GetLength(1)];
for (int i = 1; i <= values.GetLength(0); i++)
{
if (i == 1 && _firstRowHeader)
{
for (int j = 1; j <= values.GetLength(1); j++)
{
object value = values[i, j];
if (value != null) dt.Columns.Add(value.ToString());
else dt.Columns.Add("");
}
}
else
{
for (int j = 1; j <= values.GetLength(1); j++)
{
object value = values[i, j];
if (value != null)
{
columnValues[j - 1] = value.ToString();
}
else
{
columnValues[j - 1] = "";
}
}
dt.Rows.Add(columnValues);
}
}
return dt;
}
catch (Exception)
{
throw;
}
finally
{
range = null;
}
}
I hope you will find the code useful. Thank you.