我有 3 个 csv 文件位于不同的网络文件夹中。网络文件夹/子文件夹中可能包含空格。我想联接这 3 个 csv 文件以创建包含所需列的单个 ADO 记录集。
Test1.csv(我已经从所有csv中排除了不必要的列(
T1Id | Gpos | lbl
-----------------------
1001 | 0 | Innovate
1002 | 1 | Buys
1003 | 2 | Sales
1004 | 3 | Forecasts
1005 | 4 | Usage
1006 | 5 | Forum
Test2.csv:(我已经从所有csv中排除了不必要的列(
T2Id | T1Id | Apos | tval
-----------------------------------
382 | 1001 | 1 | my life my rules.
203 | 1001 | 2 | earth wind rain and fire.
658 | 1002 | 1 | wealth power blood desire.
200 | 1003 | 1 | one good to live for.
301 | 1003 | 2 | before we die.
439 | 1004 | 1 | one taste to glory
795 | 1004 | 2 | one mouthful of sky.
494 | 1004 | 3 | some other text.
Test3.csv:(我已经从所有csv中排除了不必要的列(
(blank) Aggregate (blank) Aggregate 149_SG_Bryl_Cream 891_SG_Myo__Sky_Blue_Dress
------------------------------------------------------------------------------
X0.1 0.422300 0.424658 0.458014 0.434639
X0.2 0.318628 0.345475 0.334548 0.333675
X0.3 0.274694 0.274643 0.243424 0.286865
X0.4 0.294568 0.346758 0.276552 0.366648
X1.1 0.565734 0.293436 0.283564 0.235366
X1.2 0.286657 0.755456 0.283233 0.310544
X2.1 0.234643 0.245459 0.245434 0.343423
X2.2 0.343645 0.455659 0.343282 0.334343
X2.3 0.234643 0.245459 0.245434 0.343423
如您所见,Test3.csv 有 4 个问题:
- 它具有带有空白标题的列,但包含数据,例如
X0.1
. - 它具有带有空白标题的列,但带有
no data
. - 它有 2 个同名的
Aggregate
列。 - 它有
SG
列(许多(,我需要提取only the part starting from 'SG'.
要求
Test1
&Test2
CSV 需要加入T1Id
。- 只需要保留这些文件的
lbl
列和tval
列。 Apos
和Gpos
将用于连接从文件中的第一个空白标题列(包含X0.1
等值(创建的 2 列Test3.csv
。
法典:
Sub Doit(cFiles As Collection)
Application.DisplayAlerts = False
Application.ScreenUpdating = False
Dim strSQL1$, TempSG$,sFullDirectory$
sFullDirectory = `\xxx.xxx.xxxclient name`
strFF1 = "Test1.csv"
strFF2 = "Test2.csv"
strFF3 = "Test3.csv"
' Test1.csv Path: `\xxx.xxx.xxxclient nameXYZsub-folder1 name`
strF1 = cFiles(strFF1)
' Test1.csv Path: `\xxx.xxx.xxxclient nameABCsub-folder2 name`
strF2 = cFiles(strFF2)
' Test1.csv Path: `\xxx.xxx.xxxclient nameDEFGHIsub-folder1 name`
strF3 = cFiles(strFF3)
Set oCon = CreateObject("ADODB.Connection")
Set oRs = CreateObject("ADODB.Recordset")
strCon = "Driver=Microsoft Access Text Driver (*.txt, *.csv);Dbq=" & sFullDirectory & ";Extensions=asc,csv,tab,txt;HDR=Yes;"
' Select TOP 1 row with headers from `Test3.csv`
strSQL = "SELECT TOP 1 * FROM " & strF3 & strFF3
oCon.Open strCon
Set oRs = oCon.Execute(strSQL)
i = 1
strSQL = "SELECT "
For Each Fld In oRs.Fields
Select Case True
Case Is = Fld.Name = "NoName" 'Blank header columns
If Fld.Value <> vbNullString Then
strSQL = strSQL & " CLng(Replace(Left(" & Fld.Name & ", InStr(" & Fld.Name & ", '.') - 1), 'X', ''))" & " AS [gval],"
strSQL = strSQL & " CLng(Right(" & Fld.Name & ", Len(" & Fld.Name & ") - InStr(" & Fld.Name & ", '.')))" & " AS [pos],"
Else
' Do nothing here
End If
Case Is = Fld.Name = "Aggregate"
strSQL = strSQL & " CDbl([" & Fld.Name & "]) AS [Aggregate " & i & "],"
i = i + 1
Case Is = InStr(1, Fld.Name, "SG") > 0
TempSG = Trim(Mid(Fld.Name, InStr(1, Fld.Name, "SG"), Len(Fld.Name)))
strSQL = strSQL & " CDbl([" & Fld.Name & "]) AS [" & TempSG & "], "
End Select
Next Fld
If Right(Trim(strSQL), 1) = "," Then strSQL = Left(Trim(strSQL), Len(Trim(strSQL)) - 1)
strSQL = strSQL & " FROM " & strF3 & strFF3
strSQL = strSQL & " WHERE ((NoName) <> 'Base Sizes')"
oRs.Close
' This `strSQL1` will be used to join `strSQL`.
strSQL1 = "SELECT G.[lbl], A.[tval], Q.*"
strSQL1 = strSQL1 & " FROM "
strSQL1 = strSQL1 & " (SELECT G.[Gpos], A.[Apos], G.[lbl], A.[tval] FROM " & strF1 & strFF1 & " G," & strF2 & strFF2 & " A WHERE G.[T1Id] = A.[T1Id]) T, (" & strSQL & ") Q "
strSQL1 = strSQL1 & " WHERE (CLng(T.[G].[Gpos]) = CLng(Q.[gval])) AND (CLng(T.[A].[Apos]) = CLng(Q.[pos]))"
strSQL1 = strSQL1 & " ORDER BY CLng(Q.[gval]), CDbl(Q.[Aggregate 1]) DESC, G.[lbl];"
'CREATE RECORDSET FROM SQL STRING
Set oRs = oCon.Execute(strSQL1)
ExitSub:
oRs.Close
oCon.Close
Set oRs = Nothing
Set oCon = Nothing
Application.ScreenUpdating = True
Application.DisplayAlerts = True
Exit Sub
ErrorHandler:
MsgBox "Error No: " & Err.Number & vbCrLf & "Description: " & Err.Description, vbCritical + vbOKOnly, "An Error occurred!"
Err.Clear
On Error GoTo 0
Resume ExitSub
End Sub
拆分测试3后.csv表如下所示:
gval pos (blank) Aggregate (blank) Aggregate 149_SG_Bryl_Cream 891_SG_Myo__Sky_Blue_Dress
------------------------------------------------------------------------------
0 1 0.422300 0.424658 0.458014 0.434639
0 2 0.318628 0.345475 0.334548 0.333675
0 3 0.274694 0.274643 0.243424 0.286865
0 4 0.294568 0.346758 0.276552 0.366648
1 1 0.565734 0.293436 0.283564 0.235366
1 2 0.286657 0.755456 0.283233 0.310544
2 1 0.234643 0.245459 0.245434 0.343423
2 2 0.343645 0.455659 0.343282 0.334343
2 3 0.234643 0.245459 0.245434 0.343423
最终表:(简短示例(
lbl txval gval pos Aggregate 1 Aggregate 2 SG_Bryl_Cream SG_Myo__Sky_Blue_Dress
-------------------------------------------------------------------------------------------------------------------
Innovate My life my rules. 0 1 0.422300 0.424658 0.458014 0.434639
Innovate earth wind rain and fire. 0 2 0.318628 0.345475 0.334548 0.333675
Buys my life my rules. 1 1 0.565734 0.293436 0.283564 0.235366
Buys earth wind rain and fire. 1 2 0.286657 0.755456 0.283233 0.310544
Sales my life my rules. 2 1 0.234643 0.245459 0.245434 0.343423
Sales earth wind rain and fire. 2 2 0.343645 0.455659 0.343282 0.334343
Sales Some other text. 2 3 0.234643 0.245459 0.245434 0.343423
...
问题
- 有没有办法通过SQL获取重复列
Aggregate
列? - 有没有办法从
Q
中仅选择SG
列。Test3.csv
通过 SQL ?
例如
strSQL1 = "SELECT G.[lbl], A.[tval], Q.* "
相反:
strSQL1 = "SELECT G.[lbl], A.[tval], Q.* LIKE 'SG' "
考虑通过使用所需的列别名直接查询 CSV 文件来运行纯 SQL:
SELECT t.Food, t.Bev, t.Meds, t.[Average], t.Midpoint, t.Average AS [OtherAverage]
FROM [text;database=C:FolderToCSV With Spaces].[my File.csv] AS t;
此外,查询可以集成到操作查询中:
生成表查询
SELECT t.Food, t.Bev, t.Meds, t.[Average], t.Midpoint, t.Average AS [OtherAverage]
INTO [myNewtable]
FROM [text;database=C:FolderToCSV With Spaces].[my File.csv] AS t;
追加查询
INSERT INTO myFinalTable (Food, Bev, Meds, Average, Midpoint, OtherAverage)
SELECT t.Food, t.Bev, t.Meds, t.Average, t.Midpoint, t.Average AS [OtherAverage]
FROM [text;database=C:FolderToCSV With Spaces].[my File.csv] AS t;
若要在 VBA 中使用 ADO 运行,请将 Jet/ACE SQL 引擎与 Excel 或 Access ODBC 驱动程序一起使用,其中工作簿或数据库文件源无关紧要,因为您远程连接到 CSV:
Set conn = CreateObject("ADODB.Connection")
Set rst = CreateObject("ADODB.Recordset")
' EXCEL DRIVER
conn.Open "Driver={Microsoft Excel Driver (*.xls, *.xlsx, *.xlsm, *.xlsb)};" _
& "DBQ=" & ThisWorkbook.FullName & ";"
rst.Open "SELECT t.Food, t.Bev, t.Meds, t.[Average], t.Midpoint, t.Average AS [OtherAverage] " _
& " FROM [text;database=C:FolderToCSV With Spaces].[my File.csv] AS t", conn
' ACCESS DRIVER
conn.Open "Driver={Microsoft Access Driver (*.mdb, *.accdb)};" _
& "DBQ=C:PathToAnyDatabase.accdb"
rst.Open "SELECT t.Food, t.Bev, t.Meds, t.[Average], t.Midpoint, t.Average AS [OtherAverage] " _
& " FROM [text;database=C:FolderToCSV With Spaces].[my File.csv] AS t", conn
使用 Access ODBC 文本驱动程序
Set oCon = CreateObject("ADODB.Connection")
Set oRs = CreateObject("ADODB.Recordset")
sFullDirectory = "C:FolderToCSV With Spaces"
strCon = "Driver=Microsoft Access Text Driver (*.txt, *.csv);" _
& "Dbq=" & sFullDirectory & ";Extensions=asc,csv,tab,txt;HDR=Yes;"
strSQL = "SELECT t.Food, t.Bev, t.Meds, t.[Average], t.Midpoint, t.Average AS [OtherAverage] " _
& " FROM [my File.csv] AS t"
oCon.Open strCon
Set oRs = oCon.Execute(strSQL)