snippetMinor
Efficent way to Query a DB and write to text file - Large Recordset
Viewed 0 times
filerecordsetefficentquerytextwaywritelargeand
Problem
I have a SQL query that returns a recordset with between 2 and 5 million records and I need to write that to a .csv file.
I wrote the following procedure and I'm curious to see if there is a better / more efficient way to do this.
```
Dim conn As New OracleConnection()
...
Sub DBExecuteQueryWriteToFile(ByVal SQLCommand As String, ByVal FileName As String, Optional ByVal Delimiter As String = ",")
Dim cmd = New OracleCommand(SQLCommand, conn)
Dim sb As New StringBuilder
Dim x As Integer = 0
Dim CountRow As Integer = 0
If File.Exists(FileName) Then
File.Delete(FileName)
End If
Using FileObject As New FileStream(FileName, FileMode.OpenOrCreate)
Using MStream As New MemoryStream()
Using StreamWriterObj As New StreamWriter(MStream)
Using Reader As OracleDataReader = cmd.ExecuteReader()
Dim FieldCount As Integer = Reader.FieldCount - 1
Do While Reader.Read()
sb.Append(Reader.Item(0))
For i = 1 To FieldCount
sb.Append(Delimiter)
sb.Append(Reader.Item(i))
Next
sb.Append(vbCrLf)
'Write every 25000 rows of data to the file from the buffer
If x = 25000 Then
StreamWriterObj.Write(sb.ToString().ToCharArray())
MStream.Seek(0, SeekOrigin.Begin)
MStream.WriteTo(FileObject)
sb = New StringBuilder()
CountRow = CountRow + x
x = 0
End If
x = x + 1
Loop
'Write any remaining data from the buffer to the file
StreamWriterObj.Write(sb.ToString().ToCharArray())
MStream.WriteTo(FileObjec
I wrote the following procedure and I'm curious to see if there is a better / more efficient way to do this.
```
Dim conn As New OracleConnection()
...
Sub DBExecuteQueryWriteToFile(ByVal SQLCommand As String, ByVal FileName As String, Optional ByVal Delimiter As String = ",")
Dim cmd = New OracleCommand(SQLCommand, conn)
Dim sb As New StringBuilder
Dim x As Integer = 0
Dim CountRow As Integer = 0
If File.Exists(FileName) Then
File.Delete(FileName)
End If
Using FileObject As New FileStream(FileName, FileMode.OpenOrCreate)
Using MStream As New MemoryStream()
Using StreamWriterObj As New StreamWriter(MStream)
Using Reader As OracleDataReader = cmd.ExecuteReader()
Dim FieldCount As Integer = Reader.FieldCount - 1
Do While Reader.Read()
sb.Append(Reader.Item(0))
For i = 1 To FieldCount
sb.Append(Delimiter)
sb.Append(Reader.Item(i))
Next
sb.Append(vbCrLf)
'Write every 25000 rows of data to the file from the buffer
If x = 25000 Then
StreamWriterObj.Write(sb.ToString().ToCharArray())
MStream.Seek(0, SeekOrigin.Begin)
MStream.WriteTo(FileObject)
sb = New StringBuilder()
CountRow = CountRow + x
x = 0
End If
x = x + 1
Loop
'Write any remaining data from the buffer to the file
StreamWriterObj.Write(sb.ToString().ToCharArray())
MStream.WriteTo(FileObjec
Solution
You can simplify the code greatly without loosing any performance, more likely gaining instead.
There is no need to use
Next, there is no need to use
You should also play around with the options how to open
Of course, as the other answer proposed, separating reading from database and writing to the file in two threads might work as well. Try and measure three separate scenarios - just reading from the database (measure separately the query (
P.S. few other issues in your code:
There is no need to use
StringBuilder - you can write directly to the StreamWriter - it will be faster since there will be no need of copy the whole data twice.Next, there is no need to use
MemoryStream - you are using it to buffer data before writing it to the disk. The same can be achieved by specifying the buffer size when creating FileStream. You should play around with the buffer size to see what is fastest on your environment - usually 4KB is used, in the sample below it is 1MB, but you have to try it on your hardware.You should also play around with the options how to open
FileStream. You should try what impact Asynchronous and WriteThrough has on your solution.Of course, as the other answer proposed, separating reading from database and writing to the file in two threads might work as well. Try and measure three separate scenarios - just reading from the database (measure separately the query (
ExecuteReader() and first Read() and then reading the other rows - because it might be that most of the time is spending to initially execute the query, not to iterate through the data), just writing to the file (dummy data) and then combined. This way you will see if there is actual reason to put the two operations in parallel.Dim conn As New OracleConnection()
...
Sub DBExecuteQueryWriteToFile(ByVal SQLCommand As String, ByVal FileName As String, Optional ByVal Delimiter As String = ",")
Dim cmd = New OracleCommand(SQLCommand, conn)
Dim bufferSize = 1024*1024; // 1Mb
If File.Exists(FileName) Then
File.Delete(FileName)
End If
Using FileObject As New FileStream(FileName, FileMode.OpenOrCreate, FileAccess.Write, FileShare.None, bufferSize)
Using StreamWriterObj As New StreamWriter(FileObject)
Using Reader As OracleDataReader = cmd.ExecuteReader()
Dim FieldCount As Integer = Reader.FieldCount - 1
Do While Reader.Read()
StreamWriterObj.Write(Reader.Item(0))
For i = 1 To FieldCount
StreamWriterObj.Write(Delimiter)
StreamWriterObj.Write(Reader.Item(i))
Next
StreamWriterObj.WriteLine();
Loop
End Using
End Using
End Using
End SubP.S. few other issues in your code:
StringBuilder().ToString().ToCharArray()- no need to convert to char array.
new StringBuilder()- instead of reusing the memory already allocated to the existing builder, you are throwing that away and creating new one.
MStream.Seek()- you are seeking to the beginning of the stream but are not settingLengthto 0. So if the second block written there is smaller than first one, junk data will be written to the file.
Code Snippets
Dim conn As New OracleConnection()
...
Sub DBExecuteQueryWriteToFile(ByVal SQLCommand As String, ByVal FileName As String, Optional ByVal Delimiter As String = ",")
Dim cmd = New OracleCommand(SQLCommand, conn)
Dim bufferSize = 1024*1024; // 1Mb
If File.Exists(FileName) Then
File.Delete(FileName)
End If
Using FileObject As New FileStream(FileName, FileMode.OpenOrCreate, FileAccess.Write, FileShare.None, bufferSize)
Using StreamWriterObj As New StreamWriter(FileObject)
Using Reader As OracleDataReader = cmd.ExecuteReader()
Dim FieldCount As Integer = Reader.FieldCount - 1
Do While Reader.Read()
StreamWriterObj.Write(Reader.Item(0))
For i = 1 To FieldCount
StreamWriterObj.Write(Delimiter)
StreamWriterObj.Write(Reader.Item(i))
Next
StreamWriterObj.WriteLine();
Loop
End Using
End Using
End Using
End SubContext
StackExchange Code Review Q#41683, answer score: 4
Revisions (0)
No revisions yet.