如何使用脚本组件过滤平面文件源



我有以下方案:我有成千上万的文本文件具有以下格式。列名称以单独的行编写,其中作为管道值(|)界定行值。

START-OF-FILE
PROGRAMNAME=getdata
DATEFORMAT=yyyymmdd
#Some Text
#Some Text
#Some Text
#Some Text
#Some Text
START-OF-FIELDS
Field1
Field2
Field3
------
FieldN
END-OF-FIELDS
TIMESTARTED=Tue May 12 16:04:42 JST 2015
START-OF-DATA
Field1Value|Field2value|Field3Value|...|Field N Value
Field1Value|Field2value|Field3Value|...|Field N Value
------|...........|----|-------
END-OF-DATA
DATARECORDS=30747
TIMEFINISHED=Tue May 12 16:11:53 JST 2015
END-OF-FILE

现在我有了一个相应的SQL Server表,可以轻松地将数据加载为目的地。由于我是SSI的新手,因此在如何编写脚本组件方面遇到麻烦,以便我可以过滤源文本文件并轻松加载到SQL Server表中。

预先感谢!

有几种方法可以做到。如果文件的格式是恒定的,则具有Flat File Connection Manager编辑器的一些有用属性。例如,您可以将新的平面文件连接添加到连接管理器中。有一些属性,例如"要跳过"的上述文件,您可以将其设置为18。然后,它将以" |"。

从列线开始。

可能有用的平面文件连接管理器的另一个属性是,如果您打开"平面文件连接管理器",然后单击侧面菜单中的列,可以将列特征设置为管道" |"

但是,如果文件的格式将更改,例如可变的标题行数,您可以使用脚本任务删除任何非填充行。例如标题和页脚。

例如,您可以添加一个方法,例如file.readalllines,然后根据需要编辑或删除行,然后保存文件。

有关该方法的信息在这里:https://msdn.microsoft.com/en-us/library/s2tte0y1(v=vs.110).aspx

例如。要删除脚本任务中的最后一行

string[] lines = File.ReadAllLines( "input.txt" );
StringBuilder sb = new StringBuilder();
int count = lines.Length - 1; // all except last line
for (int i = 0; i < count; i++)
{
    sb.AppendLine(lines[i]);
}
File.WriteAllText("output.txt", sb.ToString());

在您的SSIS脚本组件任务中使用以下Vb脚本作为源

enter code here
Imports System
Imports System.Data
Imports System.Math
Imports System.IO
Imports Microsoft.SqlServer.Dts.Runtime
Imports Microsoft.SqlServer.Dts.Pipeline.Wrapper
Imports Microsoft.SqlServer.Dts.Runtime.Wrapper

<Microsoft.SqlServer.Dts.Pipeline.SSISScriptComponentEntryPointAttribute()> _
<CLSCompliant(False)> _
Public Class ScriptMain
    Inherits UserComponent
    'Private strSourceDirectory As String
    'Private strSourceFileName As String
    Private strSourceSystem As String
    Private strSourceSubSystem As String
    Private dtBusinessDate As Date

    Public Overrides Sub PreExecute()
        MyBase.PreExecute()
        '
        ' Add your code here for preprocessing or remove if not needed
        ''
    End Sub
    Public Overrides Sub PostExecute()
        MyBase.PostExecute()
        '
        ' Add your code here for postprocessing or remove if not needed
        ' You can set read/write variables here, for example:
        Dim strSourceDirectory As String = Me.Variables.GLOBALSourceDirectory.ToString()
        Dim strSourceFileName As String = Me.Variables.GLOBALSourceFileName.ToString()
        'Dim strSourceSystem As String = Me.Variables.GLOBALSourceSystem.ToString()
        'Dim strSourceSubSystem As String = Me.Variables.GLOBALSourceSubSystem.ToString()
        'Dim dtBusinessDate As Date = Me.Variables.GLOBALBusinessDate.Date

    End Sub
    Public Overrides Sub CreateNewOutputRows()
        '
        ' Add rows by calling the AddRow method on the member variable named "<Output Name>Buffer".
        ' For example, call MyOutputBuffer.AddRow() if your output was named "MyOutput".
        '
        Dim sr As System.IO.StreamReader
        Dim strSourceDirectory As String = Me.Variables.GLOBALSourceDirectory.ToString()
        Dim strSourceFileName As String = Me.Variables.GLOBALSourceFileName.ToString()
        'Dim strSourceSystem As String = Me.Variables.GLOBALSourceSystem.ToString()
        'Dim strSourceSubSystem As String = Me.Variables.GLOBALSourceSubSystem.ToString()
        'Dim dtBusinessDate As Date = Me.Variables.GLOBALBusinessDate.Date
        'sr = New System.IO.StreamReader("C:QRM_SourceFilesBBG_BONDS_OUTPUT_YYYYMMDD.txt")
        sr = New System.IO.StreamReader(strSourceDirectory & strSourceFileName)
        Dim lineIndex As Integer = 0
        While (Not sr.EndOfStream)
            Dim line As String = sr.ReadLine()
            If (lineIndex <> 0) Then 'remove header row
                Dim columnArray As String() = line.Split(Convert.ToChar("|"))
                If (columnArray.Length > 1) Then
                    Output0Buffer.AddRow()
                    Output0Buffer.Col0 = columnArray(0).ToString()
                    Output0Buffer.Col3 = columnArray(3).ToString()
                    Output0Buffer.Col4 = columnArray(4).ToString()
                    Output0Buffer.Col5 = columnArray(5).ToString()
                    Output0Buffer.Col6 = columnArray(6).ToString()
                    Output0Buffer.Col7 = columnArray(7).ToString()
                    Output0Buffer.Col8 = columnArray(8).ToString()
                    Output0Buffer.Col9 = columnArray(9).ToString()
                    Output0Buffer.Col10 = columnArray(10).ToString()
                    Output0Buffer.Col11 = columnArray(11).ToString()
                    Output0Buffer.Col12 = columnArray(12).ToString()
                    Output0Buffer.Col13 = columnArray(13).ToString()
                    Output0Buffer.Col14 = columnArray(14).ToString()
                    Output0Buffer.Col15 = columnArray(15).ToString()
                    Output0Buffer.Col16 = columnArray(16).ToString()
                    Output0Buffer.Col17 = columnArray(17).ToString()
                    Output0Buffer.Col18 = columnArray(18).ToString()
                    Output0Buffer.Col19 = columnArray(19).ToString()
                    Output0Buffer.Col20 = columnArray(20).ToString()
                    Output0Buffer.Col21 = columnArray(21).ToString()
                    Output0Buffer.Col22 = columnArray(22).ToString()
                    Output0Buffer.Col23 = columnArray(23).ToString()
                    Output0Buffer.Col24 = columnArray(24).ToString()
                End If
            End If
            lineIndex = lineIndex + 1
        End While
        sr.Close()
    End Sub
End Class

代码结束

最新更新