Parsing an OUTLOOK.HOL file into CSV



The OUTLOOK.HOL (holiday) file is structured as follows:

[Portugal] 207
All Saints' Day,2021/11/1
All Saints' Day,2022/11/1
Assumption,2021/8/15
Assumption,2022/8/15
Carnival,2021/2/16
Carnival,2022/3/1
[Puerto Rico] 489
Birthday of Eugenio María de Hostos,2021/1/11
Birthday of Eugenio María de Hostos,2022/1/10
Birthday of José de Diego,2021/4/19
Birthday of José de Diego,2022/4/18
Birthday of Don Luis Muñoz Rivera,2021/7/19
Birthday of Don Luis Muñoz Rivera,2022/7/18
[Qatar] 118
...

How can I use PowerShell to parse the file into structured data and convert it to a CSV file with this header:

Country;Number;Holiday name;Date

/Michal

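You need to loop through the lines of the file one by one and use regex to parse the different "fields". A line like [Portugal] 207 starts a new country section, and every holiday line that follows belongs to that country until the next bracketed line.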

$result = switch -Regex -File 'D:\Test\outlook.hol' {
    '^\[([^\]]+)\]\s+(\d+)' {
        # found a section header like "[Portugal] 207": remember country and number
        $country = $matches[1]
        $number  = $matches[2]
    }
    '^([^,]+),(\d{4}/\d{1,2}/\d{1,2})$' {
        # found a data line, output a PSObject
        [PsCustomObject]@{
            Country      = $country
            Number       = $number
            Holiday_name = $matches[1]
            Date         = $matches[2]
        }
    }
}
# output on screen
$result | Format-Table -AutoSize
# output to CSV file
$result | Export-Csv -Path 'D:\Test\OutlookHolidays.csv' -NoTypeInformation -Encoding UTF8

Output (on screen):

Country     Number Holiday_name                        Date     
-------     ------ ------------                        ----     
Portugal    207    All Saints' Day                     2021/11/1
Portugal    207    All Saints' Day                     2022/11/1
Portugal    207    Assumption                          2021/8/15
Portugal    207    Assumption                          2022/8/15
Portugal    207    Carnival                            2021/2/16
Portugal    207    Carnival                            2022/3/1 
Puerto Rico 489    Birthday of Eugenio María de Hostos 2021/1/11
Puerto Rico 489    Birthday of Eugenio María de Hostos 2022/1/10
Puerto Rico 489    Birthday of José de Diego           2021/4/19
Puerto Rico 489    Birthday of José de Diego           2022/4/18
Puerto Rico 489    Birthday of Don Luis Muñoz Rivera   2021/7/19
Puerto Rico 489    Birthday of Don Luis Muñoz Rivera   2022/7/18
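
The question asks for a semicolon-separated header, while Export-Csv writes commas unless told otherwise. A minimal sketch of the -Delimiter parameter, using two hand-built sample rows (taken from the output above) in place of the parsed $result, so it runs without the .hol file:

```powershell
# Two sample rows shaped like the objects the switch loop emits
$sample = @(
    [PsCustomObject]@{ Country = 'Portugal';    Number = '207'; Holiday_name = "All Saints' Day";           Date = '2021/11/1' }
    [PsCustomObject]@{ Country = 'Puerto Rico'; Number = '489'; Holiday_name = 'Birthday of José de Diego'; Date = '2021/4/19' }
)

# ConvertTo-Csv returns the CSV lines as strings, which makes the effect easy to inspect
$csvLines = $sample | ConvertTo-Csv -Delimiter ';' -NoTypeInformation
$csvLines

# For the real data, the same parameter works on Export-Csv:
# $result | Export-Csv -Path 'D:\Test\OutlookHolidays.csv' -Delimiter ';' -NoTypeInformation -Encoding UTF8
```

The column names are still the property names of the objects (Holiday_name rather than "Holiday name"); rename the property in the [PsCustomObject] if the header has to match exactly.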

Regex 1 details:

^                  Assert position at the beginning of the string
\[                 Match the character “[” literally
(                  Match the regular expression below and capture its match into backreference number 1
   [^\]]           Match any character that is NOT a “]” character
      +            Between one and unlimited times, as many times as possible, giving back as needed (greedy)
)
\]                 Match the character “]” literally
\s                 Match a single character that is a “whitespace character” (spaces, tabs, line breaks, etc.)
   +               Between one and unlimited times, as many times as possible, giving back as needed (greedy)
(                  Match the regular expression below and capture its match into backreference number 2
   \d              Match a single digit 0..9
      +            Between one and unlimited times, as many times as possible, giving back as needed (greedy)
)
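
To see the first pattern in isolation, here is a small sketch that applies it to one header line with the -match operator. The variable names $headerPattern and $line are only illustrative and are not part of the answer's script:

```powershell
# Pattern for section headers such as "[Puerto Rico] 489"
$headerPattern = '^\[([^\]]+)\]\s+(\d+)'
$line = '[Puerto Rico] 489'

if ($line -match $headerPattern) {
    $country = $Matches[1]   # text between the square brackets
    $number  = $Matches[2]   # digits after the whitespace
}
$summary = "$country / $number"
$summary

# A holiday line does not start with "[", so the header pattern rejects it
$isHeader = 'Carnival,2021/2/16' -match $headerPattern
$isHeader
```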

Regex 2 details:

^                 Assert position at the beginning of the string
(                 Match the regular expression below and capture its match into backreference number 1
   [^,]           Match any character that is NOT a “,”
      +           Between one and unlimited times, as many times as possible, giving back as needed (greedy)
)
,                 Match the character “,” literally
(                 Match the regular expression below and capture its match into backreference number 2
   \d             Match a single digit 0..9
      {4}         Exactly 4 times
   /              Match the character “/” literally
   \d             Match a single digit 0..9
      {1,2}       Between one and 2 times, as many times as possible, giving back as needed (greedy)
   /              Match the character “/” literally
   \d             Match a single digit 0..9
      {1,2}       Between one and 2 times, as many times as possible, giving back as needed (greedy)
)
$                 Assert position at the end of the string (or before the line break at the end of the string, if any)
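
The second pattern captures the date as plain text. If a real date value is wanted, for sorting or reformatting, one option is to convert the captured text with [datetime]::ParseExact. This is an optional extra step, not something the script above does, and the format string 'yyyy/M/d' assumes every date in the file looks like the samples shown in the question:

```powershell
# Pattern for holiday lines such as "Carnival,2022/3/1"
$dataPattern = '^([^,]+),(\d{4}/\d{1,2}/\d{1,2})$'
$line = 'Carnival,2022/3/1'

if ($line -match $dataPattern) {
    $name = $Matches[1]
    # 'M' and 'd' accept one- or two-digit months and days
    $date = [datetime]::ParseExact($Matches[2], 'yyyy/M/d', [cultureinfo]::InvariantCulture)
}

# Render the parsed date in a zero-padded, sortable form
$iso = $date.ToString('yyyy-MM-dd', [cultureinfo]::InvariantCulture)
"$name on $iso"
```

Note that [^,]+ stops at the first comma, so a holiday name that itself contains a comma would not match this pattern.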
