NiFi: best way to convert deeply nested XML to CSV (ExecuteScript vs ExecuteStreamCommand)

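I have been working with NiFi.

The requirement is to create many small tables (each with a different number of columns) from one large XML file. All of the tables are then merged, or concatenated with a special character (for example a hyphen), to output a single CSV.

However, I am not sure whether my approach is optimal.

My NiFi pipeline is as follows.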

GetFile
ExecuteStreamCommand (Python script)
SplitJson
ConvertRecord (JSON to CSV)
MergeContent (strategy based on fragment.identifier)
UpdateAttribute (appends the .csv extension to the filename)
PutFile
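
To make the SplitJson, ConvertRecord and MergeContent steps concrete, here is a minimal Python sketch of the same idea outside NiFi. It is not what those processors run internally. The function names `table_to_csv` and `merge_tables` are hypothetical, and putting the hyphen on its own line between tables is an assumption about how the tables are "concatenated".

```python
import csv
import io


def table_to_csv(rows):
    """Render one table (a list of row dicts) as CSV text with a header line."""
    buf = io.StringIO()
    # Assumption: every row in a table has the same keys as the first row.
    writer = csv.DictWriter(buf, fieldnames=list(rows[0]), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


def merge_tables(tables, separator="-"):
    """Join the per-table CSV blocks, with a separator line between blocks."""
    blocks = [table_to_csv(rows) for rows in tables.values() if rows]
    return (separator + "\n").join(blocks)


# Hypothetical sample data: two tables with different column counts.
tables = {
    "table1": [{"id": "1", "name": "a"}],
    "table2": [{"code": "x"}],
}
merged = merge_tables(tables)
print(merged)
```

Each table keeps its own header, so tables with different column counts can share one output file.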

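My approach is to build JSON from the XML in the shape shown below, split that JSON into one piece per table, and then convert each piece from JSON to CSV with a Controller Service. Building a {column: value} dictionary or JSON is much faster than rewriting the XML from scratch.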

{table1: [{column1: value1, ..., column_n: value_n}, {}, {}],
 table2: [{column1: value1, ..., column_n: value_n}, {}, {}, {}, {}, {}]}

* The length of each table's list is the number of records in that table's CSV.
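
A minimal sketch of how a Python script could build that structure with the standard library. The XML layout is an assumption (root, then table elements, then record elements, then column elements), and the function name `xml_to_tables` and the sample document are hypothetical. Real, deeply nested XML would need its own traversal logic.

```python
import json
import xml.etree.ElementTree as ET


def xml_to_tables(xml_text):
    """Group records by table name: {table: [{column: value, ...}, ...]}."""
    root = ET.fromstring(xml_text)
    tables = {}
    # Assumption: each child of the root is a table, each grandchild is a
    # record, and each element under a record is one column.
    for table in root:
        rows = tables.setdefault(table.tag, [])
        for record in table:
            rows.append({col.tag: (col.text or "").strip() for col in record})
    return tables


# Hypothetical sample input with two tables of different widths.
sample = """
<export>
  <table1>
    <row><id>1</id><name>a</name></row>
    <row><id>2</id><name>b</name></row>
  </table1>
  <table2>
    <row><code>x</code></row>
  </table2>
</export>
"""

tables = xml_to_tables(sample)
print(json.dumps(tables, indent=2))
```

The length of `tables["table1"]` is 2 here, which matches the two records that would end up in that table's CSV.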

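When I tried the pipeline above in my local environment, it processed 250 XML files in 60 seconds, roughly 0.25 seconds per file. However, when I replaced ExecuteStreamCommand with ExecuteScript (Jython), I expected faster performance, but instead NiFi went down with a memory error. Processing also took more than 30 seconds per file.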

Why does ExecuteScript (Jython) perform so poorly? If I have to use ExecuteScript, should I use Groovy?

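The documentation states that ExecuteScript is experimental.

ExecuteStreamCommand is better suited to your goal:

Executes an external command on the contents of a flow file, and creates a new flow file with the results of the command.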

https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.5.0/org.apache.nifi.processors.standard.ExecuteStreamCommand/index.html
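
As a rough illustration of that contract, the external script only needs to read the flow file content from standard input and write the new content to standard output. The sketch below assumes the processor's default behaviour of piping content through stdin and stdout. The function name `run` and the placeholder transformation are hypothetical.

```python
import io
import json


def run(stdin, stdout):
    """Read the whole flow file from stdin and write the new content to stdout."""
    content = stdin.read()
    # Placeholder transformation (hypothetical): report how much was read.
    json.dump({"characters_read": len(content)}, stdout)


# In the script that NiFi launches, the call would be run(sys.stdin, sys.stdout).
# In-memory streams stand in for the pipes here so the sketch runs anywhere.
out = io.StringIO()
run(io.StringIO("<root/>"), out)
result = out.getvalue()
print(result)
```

Because the script runs in a separate CPython process, it can use any installed Python library, and its memory is not taken from the NiFi JVM heap.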
