Preprocessing a dataset for SQL



I have a dataset in a flat file with 100 rows and 2001 attributes (2000 numeric attributes plus a type column), as follows:

att1,att2,att3,att4,att5,att6,att7,att8,att9,att10,....,att2000,type
5,3,4,4,1,5,1,3,2,4,...,12,Agresti
4,4,2,0,1,0,2,0,0,0,...,22,bbresti
0,0,0,0,0,0,0,0,1,0,...,34,bbresti
0,0,0,1,0,0,0,1,1,1,...,45,Agresti
...
0,6,0,0,0,0,1,0,3.5,...,1,Agresti

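I want to preprocess this data into the following form, where i is the attribute index, j is the row index, and val is the value: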

i,j,val
1,1,5
1,2,4
1,3,0
1,4,0
...
1,100,0
2,1,3
2,2,4
2,3,0
2,4,0
...
2,100,6
3,1,4
3,2,2
3,3,0
3,4,0
...
...
2000,100,1

So I drop the last column, type. I know that in SQL I would do something like this:

CREATE TABLE matrix ([Row] int NOT NULL, [Column] int NOT NULL, [Value] <datatype> NOT NULL);
SELECT [Row] AS [Column]
       ,[Column] AS [Row]
       ,[Value]
FROM matrix;

However, I only want to import the table into SQL. What would the code for this look like in C#?

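Here is one idea in C#: the program below takes your input file and reformats it into the row, column, value layout you asked for. You can then use the SqlBulkCopy class to push the data into a table in SQL Server.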

Hope this helps.

using System.IO;

namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {
            // Sample input; for a real file use:
            // string[] lines = File.ReadAllLines(@"somefile.csv");
            string[] lines = new string[] {
                "a1,a2,a3,a4,type",
                "22,33,44,55,t1",
                "222,333,444,555,t2",
                "2222,3333,4444,5555,t3" };
            string path = @"c:\output.csv";
            using (TextWriter writer = File.CreateText(path))
            {
                // Start at index 1 to skip the header line, so the first data row is row 1.
                for (int row = 1; row < lines.Length; row++)
                {
                    string[] parts = lines[row].Split(',');
                    // Stop before the last field to drop the trailing "type" column.
                    // Each output line is: row number, column number, value.
                    for (int a = 0; a < parts.Length - 1; a++)
                        writer.WriteLine(row + "," + (a + 1) + "," + parts[a]);
                }
            }
        }
    }
}
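
The upload step itself is not shown in the program above, so here is a minimal sketch of how it might look. It assumes the destination is the matrix table from the question, with columns Row, Column and Value, and that Value is numeric (the question leaves the data type open). The class name MatrixUpload and the connection string are placeholders, not part of the original answer.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient; // on newer .NET, SqlBulkCopy lives in the Microsoft.Data.SqlClient package
using System.Globalization;

// Hypothetical helper class, for illustration only.
static class MatrixUpload
{
    // Turn "row,column,value" lines (as written by the program above) into a DataTable.
    public static DataTable ToTable(IEnumerable<string> tripleLines)
    {
        var table = new DataTable();
        table.Columns.Add("Row", typeof(int));
        table.Columns.Add("Column", typeof(int));
        // Assumption: values are numeric; decimal also covers entries such as 3.5.
        table.Columns.Add("Value", typeof(decimal));

        foreach (string line in tripleLines)
        {
            if (string.IsNullOrWhiteSpace(line)) continue; // skip blank lines
            string[] f = line.Split(',');
            table.Rows.Add(
                int.Parse(f[0], CultureInfo.InvariantCulture),
                int.Parse(f[1], CultureInfo.InvariantCulture),
                decimal.Parse(f[2], CultureInfo.InvariantCulture));
        }
        return table;
    }

    // Bulk-insert the table into dbo.matrix; the connection string is supplied by the caller.
    public static void Upload(DataTable table, string connectionString)
    {
        using (var bulk = new SqlBulkCopy(connectionString))
        {
            bulk.DestinationTableName = "matrix";
            // Map by name so the column order in the destination table does not matter.
            bulk.ColumnMappings.Add("Row", "Row");
            bulk.ColumnMappings.Add("Column", "Column");
            bulk.ColumnMappings.Add("Value", "Value");
            bulk.WriteToServer(table);
        }
    }
}
```

To use it, pass the lines of the output file to MatrixUpload.ToTable (for example via System.IO.File.ReadLines) and hand the resulting table to MatrixUpload.Upload together with your own connection string. Once the rows are in the table, the SELECT from the question swaps Row and Column to give the i, j, val layout.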
