Monday, July 20, 2009

Import Flat Files into DataSet

In many legacy applications, data is stored in text files, sometimes called flat files. It might be necessary to read data from these files into an ADO.NET Dataset

The System.IO Namespace facilitates file access, and the methods exposed by the System.String class enable parsing the data contained in text files.

When reading data from a text file, you must first determine how the data is stored. Data stored in text files is usually separated by a common character called a delimiter. A delimiter can be a comma(,), a colon(:), a semicolon(;), or any other character. Rows of data are often stored as lines of text with a carriage return signifying the beginning of a new row. The number of entries in each row should correspond to the number of columns in the table.

Once the basic structure of the stored data has been determined, you can begin to construct your Dataset around it. You must create a single DataTable for each table of data you want to represent. Next, you should add the appropriate number of columns to your DataSet. Sometimes, the data in the file’s first row is used for the column names. If this is the case, you should read the first line of the text file and name the columns programmatically.

A text file can be read with the System.IO.StreamReader class. This class can open a file and return the characters represented within it. Once the file has been opened, you can use the method of the System.String class to separate the data entries and add them to the new data rows.

Because every flat file conceivably could have a different format, you must determine the correct procedure for accessing the data on an individual basis. The following code example demonstrates how to access data from a simple, common scenario: data stored in a text file where the rows are represented by lines and the entries are delimited by commas, a common format known as comma-Separated values (CSV).

Imports System.IO

Creates a new dataset

Dim myDataSet As New DataSet()

‘Creates a new datatable and adds it to the Tables collection

Dim aTable As New DataTable(“MyTable”)

Creates and names seven columns and adds them to MyTable

Dim Counter As Integer

Dim aColumn As DataColumn

For Counter = 0 to 5

aColumn = new DataColumn(“Column “ & Counter.ToString())

myDataSet.Tables(“MyTable”).Columns.Add(aColumn)

Next

‘Creates the StreamReader to read the file and a string variable to hold the output of the StreamReader

Dim myReader As New System.IO.StreamReader(“C:\Viji\MyTestFile.txt”)

Dim myString As String

‘Checks to see if the reader has reached the end of the stream

While myreder.Peek <> -1

‘Reads a line of data from the text file

myString = myReader.ReadLine

‘Uses the String.Split method to create an array of strings that represents each entry in the line.

myDataSet.Tables(“MyTable”).Rows.Add(myString.Split(“,”c))

End While

1 comment:

  1. I believe there is an error in your code.

    'Creates a new datatable and adds it to the Tables collection
    Dim aTable As New DataTable(“MyTable”)

    The piece of code above does create a new datatable, but it does not add that table to the Tables collection of the established dataset (myDataset).

    You need a line similar to this:

    myDataSet.Tables().Add(aTable)

    in order to add the new table to the tables collection. Otherwise, the following line within the For...Next loop fails.

    myDataSet.Tables(“MyTable”).Columns.Add(aColumn)

    ReplyDelete