Snowflake: Read a comma-separated payload that has commas between double quotes



How can I read a comma-separated payload in Snowflake when some columns may contain commas between double quotes? For example: (o,"sadasdasd",123123123,"this is an example, of data field, with commas and backslashes which should be read as one unique column",0,...)

So far, I have been using:

with your_table as 
(select json_text:message::string as payload from table)
select
split_part(payload,',',1) as firstfield, 
...
from 
your_table

Here is a JavaScript UDF that can do this:

create or replace function SPLIT_QUOTED_STRING(STR string)
returns array
language javascript
as
$$
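    // Match either a double-quoted field or a run of non-quote, non-comma,
    // non-whitespace characters, followed by a comma or the end of the string.
    var arr = STR.match(/(".*?"|[^",\s]+)(?=\s*,|\s*$)/g);
    if (arr === null) return [];  // match() returns null when nothing matches
    for (var i = 0; i < arr.length; i++) {
        // Strip the surrounding quotes (this also removes any single quotes).
        arr[i] = arr[i].replace(/['"]+/g, '');
    }
    return arr;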
$$;
select split_quoted_string('o,"sadasdasd",123123123,"asdasdasd.www.org,123123,link.com",0');
-- To get a member of the array:
select split_quoted_string('o,"sadasdasd",123123123,"asdasdasd.www.org,123123,link.com",0')[1]::string;
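The splitting logic can also be checked outside Snowflake. The following is a minimal sketch that wraps the same regular expression in a plain JavaScript function so it can be run locally (for example in Node.js); the name `splitQuotedString` is hypothetical and is not part of Snowflake.

```javascript
// Local version of the UDF body, for experimenting with the regex.
// splitQuotedString is a hypothetical helper name used only for illustration.
function splitQuotedString(str) {
  // A field is either a double-quoted run (non-greedy) or a run of characters
  // that are not quotes, commas, or whitespace. The lookahead requires the
  // field to be followed by a comma or by the end of the string.
  const fields = str.match(/(".*?"|[^",\s]+)(?=\s*,|\s*$)/g);
  if (fields === null) return []; // match() returns null when nothing matches
  // Remove the surrounding double quotes (single quotes are removed as well).
  return fields.map((f) => f.replace(/['"]+/g, ''));
}

const row = 'o,"sadasdasd",123123123,"asdasdasd.www.org,123123,link.com",0';
console.log(splitQuotedString(row));
// → [ 'o', 'sadasdasd', '123123123', 'asdasdasd.www.org,123123,link.com', '0' ]
```

Two limits of this pattern are worth keeping in mind: empty fields (as in `a,,b`) produce no array element, so positions shift, and an unquoted field that contains spaces is not returned whole. If the payload can contain either, a different parsing approach may be needed.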

To split the string into a table rather than an array, use the FLATTEN table function:

select "VALUE"::string as MY_VALUE
from table(flatten (split_quoted_string('o,"sadasdasd",123123123,"asdasdasd.www.org,123123,link.com",0')));
