Excel字符串分析



我有一个电子表格,它使用字符串约定来传达有关单个"案例"的详细级别

约定如下:字符串中的数据有5个主要类别。1=类别,2=工具,3=文档,4=流程,5=工作援助。

每个类别可以有多个子类别,每个类别按1a、1b、1c或2a、2b、2c等细分。

主类别与子类别用":"分隔,子类别与子分类用"分隔,而子类别与新的主类别用";"分隔

字符串示例:"1:1i;2:2a;3:3a,3d,3l;4:4a">

以下是按价值分类的示例:

CATEGORY    1
Incorrect:VG:QOC    1i
TOOLS   2
Macro:Used  2a
DOCUMENTATION   3
TAT:Missing 3a
ROUTING:Missing 3d
STORY:Missing Impact to Health  3i
PROCESS 4
CNX Checklist Not Used  4a

我希望能够以文本形式提取出标记的主要类别和子类别。

示例查询:
在所有字符串中,主类别4被标记了多少次?在所有4个标记中,4a标记了多少次?在一个"case"或字符串上标记的所有元素是什么?

我可以构建字符串,但不能解析它们。请帮忙…:)

您要查找的函数是Split(此链接用于VB函数,但VBA中的行为几乎相同)。您可以将特定的字符串传递给它并指定分隔符,它将返回每个值的数组。

在您的情况下,由于字符串有多个分隔符,因此您需要多次执行此操作。

第一个潜在的问题是,如果您没有给定类别的子类别,该怎么办?如果字符串中的每个类别总是至少有一个子类别,那么这很好,但如果可能存在没有子类别的情况,那么您需要确保最高级别的分组仍然由;分隔。

由于您没有说明信息将如何呈现给用户,下面的示例将在Excel的中间窗口中打印出与您期望的内容接近的内容。

Option Explicit
Sub SplitExample()
Dim inputString As String
Dim categories() As String
Dim subCategories() As String
Dim individualSubCat() As String
Dim cat As Variant
Dim subCat As Variant
Dim cnt As Integer
inputString = "1:1i;2:2a;3:3a,3d,3l;4:4a"
categories = Split(inputString, ";")
For Each cat In categories
subCategories = Split(cat, ":")
If UBound(subCategories) >= 0 Then
Debug.Print ("Category " & subCategories(0))
If UBound(subCategories) > 0 Then
individualSubCat = Split(subCategories(1), ",")
Debug.Print (vbTab & "Has " & UBound(individualSubCat) - LBound(individualSubCat) + 1 & " flag(s)")
For Each subCat In individualSubCat
Debug.Print (vbTab & subCat & " was flagged " & CountSubCategory(individualSubCat, subCat) & " time(s)")
Next
Else
Debug.Print (vbTab & "No Subcategories flagged")
End If
Debug.Print ("")
End If
Erase subCategories
Erase individualSubCat
Next
End Sub

这是一个功能,将计算子类别容易

Private Function CountSubCategory(individualSubCategories() As String, ByVal subCat As String) As Integer
Dim cnt As Integer
Dim value As Variant
cnt = 0
For Each value In individualSubCategories
If value = subCat Then cnt = cnt + 1
Next
CountSubCategory = cnt
End Function

使用您的示例字符串作为输入,上面的代码将打印:

Category 1
Has 1 flag(s)
1i was flagged 1 time(s)
Category 2
Has 1 flag(s)
2a was flagged 1 time(s)
Category 3
Has 3 flag(s)
3a was flagged 1 time(s)
3d was flagged 1 time(s)
3l was flagged 1 time(s)
Category 4
Has 1 flag(s)
4a was flagged 1 time(s)

上面的代码将打印每个标志,即使有重复的标志。你没有说明这是否是你想要的行为。从数组中筛选或分组重复项并不简单,但最好使用VBA中的Collection或Dictionary类。(查看此问题以获得从数组中筛选重复项的帮助)。

上面的代码只是一个示例,展示了需要做什么以及如何完成解析(因为这是您的特定请求)。要真正将其转化为工作代码,您需要做两件事:

  1. 在VBA中创建一个函数或Sub,用上面的代码解析它(基本上是上面SplitExample()内部的代码),给它一个名称(像ParseErrorCodes),并让它接受一个名为inputString的字符串参数。然后,您将从构建字符串的方法中调用它(您说您已经可以这样做了),并将该字符串传递给该方法
  2. 决定如何输出结果。实际上,你可以用将结果输出到某个地方的东西来替换Debug.Print行(可能是另一个excel电子表格,这样你就可以创建你想要的图表)

基本想法是:

Sub OutputErrorCodes()
Dim inputString as String
' You code to read your string values from where-ever you keep them
'  and build the inputString
' this source could be a file, or a worksheet in the Excel Workbook
'  or could be an external datasource like a database or even an internet 
'  location
' once you build your inputString, you just need to call ParseErrorCodes 
ParseErrorCodes inputString
End Sub
Sub ParseErrorCodes(input as String)
' MyCode from above with some modifications
'  - You need to remove the Dim for inputString and the assignment for 
'     inputString
'  - You need to replace the Debug.Print lines with your method of 
'     defining the output
'    *  this can be via outputing directly to an excel spreadsheet or 
'        maybe a global variable
'    *  outputing to an excel spreadsheet would probably be the best 
'        option and allow for more dynamic flags, but you need to decide 
'        where to output it in the code
End Sub
Private Function CountSubCategory(individualSubCategories() As String, 
ByVal subCat As String) As Integer)
' this code can probably be taken directly from my example
End Function

最新更新