I have a JSON file with the following content:
"1034280": {
"transcript": ["1040560",
"Ok, so what Akanksha is saying is that..."],
"interaction": "Student Question"
},
"1041600": {
"transcript": ["1044860",
"this is also not correct because it will take some time."],
"explain": "Describing&Interpreting"
},
"1046800": {
"transcript": ["1050620",
"So, what you have to do is, what is the closest to the answer?"],
"question": "FocusingInformation"
},
I want to extract the transcript sentences and concatenate them.
For example, I want the output to be:
"Ok, so what Akanksha is saying is that..." "this is also not correct because it will take some time." "So, what you have to do is, what is the closest to the answer?"
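A few caveats:
- You really should use a JSON parsing library, as noted in the comments.
- This will probably only work if your data looks exactly like the sample in the question.
- I'll leave deciphering this awk to you, since you didn't say what you have tried.
With the input data in a file named data: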
awk -F"]," 'BEGIN { ORS="" } /"transcript":/ { p=1 } NF==2 && p==1 { sub( /^[[:space:]]*"/, (++i==1 ? "" : " ") "\"", $1 ); print $1; p=0 } END { print "\n" }' data
Output:
"Ok, so what Akanksha is saying is that..." "this is also not correct because it will take some time." "So, what you have to do is, what is the closest to the answer?"
This might work for you (GNU sed):
sed '/{/,+2{//,+1d;s/^\s*\|],\s*$//g;H;};$!d;x;s/\n//;y/\n/ /' file
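For comparison, here is a rough sketch of the JSON-library route that the first caveat recommends, written in Python with only the standard json module. The function name join_transcripts is made up for illustration. It assumes each "transcript" value is a two-element list whose second element is the sentence, and that the file may be a bare fragment like the one in the question (no outer braces, trailing comma), in which case it is wrapped before parsing.

```python
import json


def join_transcripts(text):
    """Return the transcript sentences, each double-quoted, joined by spaces."""
    text = text.strip()
    # Assumption: if the text does not start with "{", it is a bare fragment
    # like the sample in the question, so add the outer braces and drop a
    # trailing comma to make it a valid JSON object.
    if not text.startswith("{"):
        text = "{" + text.rstrip(",") + "}"
    data = json.loads(text)
    # json.loads keeps the keys in document order (Python 3.7+ dicts).
    sentences = [
        entry["transcript"][1]  # assumed layout: [timestamp, sentence]
        for entry in data.values()
        if "transcript" in entry
    ]
    return " ".join('"' + sentence + '"' for sentence in sentences)
```

To use it on the file from the question, something like print(join_transcripts(open("data", encoding="utf-8").read())) should do. Unlike the awk and sed one-liners, this does not depend on how the lines happen to be broken.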