Extracting sentences from a JSON file using bash



I have a JSON file with the following content:

"1034280": {
    "transcript": ["1040560",
    "Ok, so what Akanksha is saying is that..."],
    "interaction": "Student Question"
},
"1041600": {
    "transcript": ["1044860",
    "this is also not correct because it will take some time."],
    "explain": "Describing&Interpreting"
},
"1046800": {
    "transcript": ["1050620",
    "So, what you have to do is, what is the closest to the answer?"],
    "question": "FocusingInformation"
},

I want to extract the transcript sentences and concatenate them.

For example, I would like the output to be:

"Ok, so what Akanksha is saying is that..." "this is also not correct because it will take some time." "So, what you have to do is, what is the closest to the answer?"

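With caveats:

  • You really should use a JSON parsing library, as noted in the comments
  • This will probably only work if your data matches the question exactly
  • I'll leave deciphering this awk to you, since you didn't specify what you have tried

With the input data in a file named data: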

awk -F"]," 'BEGIN { ORS="" } /"transcript":/ {p=1} NF==2 && p==1 { sub( /^[[:space:]]*"/, (++i==1?"":" ")"\"", $1 ); print $1; p=0 } END { print "\n" }' data

Output:

"Ok, so what Akanksha is saying is that..." "this is also not correct because it will take some time." "So, what you have to do is, what is the closest to the answer?"

This might work for you (GNU sed):

sed '/{/,+2{//,+1d;s/^\s*\|],\s*$//g;H;};$!d;x;s/\n//;y/\n/ /' file
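
As the caveats above point out, a JSON parser is more robust than either one-liner. Here is a minimal sketch using jq. It assumes jq is installed and that the file is one complete JSON object, i.e. the fragment from the question wrapped in { } with no trailing comma after the last entry. The function name extract_transcripts is made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: assumes jq is available and "$1" holds a single valid JSON object
# whose values each carry a "transcript" array of [timestamp, sentence].

extract_transcripts() {
    # .[]             iterate over the values of the top-level object
    # .transcript[1]  pick the sentence (index 0 is the timestamp string)
    # @json           re-encode the sentence as a quoted JSON string
    # join(" ")       concatenate all quoted sentences with single spaces
    # -r              print the joined string raw, without an extra layer of quotes
    jq -r '[.[] | .transcript[1] | @json] | join(" ")' "$1"
}
```

Calling extract_transcripts data should print the quoted sentences on one line, separated by spaces. Unlike the awk and sed versions, this does not depend on how the file is indented or where the line breaks fall.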
