使用另一个文本文件单词列表按顺序替换单词 - data_file



在我的脑海中,我有:

读"original_file",将第 3 行"ENTRY1"更改为 data_file 中第一个单词的单词。写出new_file1.读"original_file",将第 3 行"ENTRY1"更改为 data_file 中的第二个单词。写出new_file2

重复整个data_file

摘录/示例:

original_file:
    line1      {
    line2        "id": "b5902627-0ba0-40b6-8127-834a3ddd6c2c",
    line3        "name": "ENTRY1",
    line4        "auto": true,
    line5        "contexts": [],
    line6        "responses": [
    line7      {
    ------------
    data_file:(simply a word/number List)
    line1   AAA11
    line2   BBB12
    line3   CCC13
    ..100lines/Words..
    -------------
    *the First output/finished file would look like:
    newfile1:
    line1      {
    line2        "id": "b5902627-0ba0-40b6-8127-834a3ddd6c2c",
    line3        "name": "AAA11",
    line4        "auto": true,
    line5        "contexts": [],
    line6        "responses": [
    line7      {
    ------------
    and the Second:
    newfile2:
    line1      {
    line2        "id": "b5902627-0ba0-40b6-8127-834a3ddd6c2c",
    line3        "name": "BBB12",
    line4        "auto": true,
    line5        "contexts": [],
    line6        "responses": [
    line7      {
    ------------

..等等。

我一直在尝试使用 sed,类似

awk 'FNR==$n1{if((getline line < "data_file") > 0) fprint '/"id:"/' '/""/' line ; next}$n2' < newfile

并且.. 作为 shell 脚本的开头。

#!/bin/bash
n1=3
n2=2
sed '$n1;$n2 data_file' original_file > newfile

任何帮助将不胜感激。我一直在尝试将SO.上发现的各种技术粘合在一起。一次一件事..学习如何更换..然后从第二个文件替换。.但它超出了我的知识范围。再次感谢。我的data_file大约有 31,000 行。所以这是必要的。.(待自动化)。这是一次性的事情,但可能对其他人非常有用?

假设我们尝试更改某些 JSON 数据中的"名称",并且新值将是纯字母数字(以便双引号正常工作):

#!/bin/bash
n=1
cat data_file | while read value; do
    jq <original_file >"newfile$n" ".name = "$value""
    ((n++))
done

在Python 2.+中:

输入:

more original_file.json data_file 
::::::::::::::
original_file.json
::::::::::::::
{
  "id": "b5902627-0ba0-40b6-8127-834a3ddd6c2c",
  "name": "ENTRY1",
  "auto": true,
  "contexts": [],
  "responses": []
}
::::::::::::::
data_file
::::::::::::::
AAA11
BBB12
CCC13

蟒蛇脚本:

import json
#open the original json file
with open('original_file.json') as handle:
  #create a dict based on the json content
  dictdump = json.loads(handle.read())
  #file counter
  i = 1
  #open the data file
  f = open("data_file", "r")
  #get all lines of the data file
  lines = f.read().splitlines()
  #close it
  f.close()
  #for each line of the data file
  for line in lines:
    #change the value of the json name element by the current line
    dictdump['name'] = line
    #open newfileX
    o = open("newfile" + str(i),'w')
    #dump the content of modified json
    json.dump(dictdump,o)
    #close the file
    o.close()
    #increase the counter value
    i += 1

输出:

more newfile*
::::::::::::::
newfile1
::::::::::::::
{"contexts": [], "auto": true, "responses": [], "id": "b5902627-0ba0-40b6-8127-834a3ddd6c2c", "name": "AAA11"}
::::::::::::::
newfile2
::::::::::::::
{"contexts": [], "auto": true, "responses": [], "id": "b5902627-0ba0-40b6-8127-834a3ddd6c2c", "name": "BBB12"}
::::::::::::::
newfile3
::::::::::::::
{"contexts": [], "auto": true, "responses": [], "id": "b5902627-0ba0-40b6-8127-834a3ddd6c2c", "name": "CCC13"}

如果需要垂直输出 json 转储,可以调整以下行: json.dump(dictdump, o)json.dump(dictdump, o, indent=4).这将产生:

more newfile*
::::::::::::::
newfile1
::::::::::::::
{
    "contexts": [], 
    "auto": true, 
    "responses": [], 
    "id": "b5902627-0ba0-40b6-8127-834a3ddd6c2c", 
    "name": "AAA11"
}
::::::::::::::
newfile2
::::::::::::::
{
    "contexts": [], 
    "auto": true, 
    "responses": [], 
    "id": "b5902627-0ba0-40b6-8127-834a3ddd6c2c", 
    "name": "BBB12"
}
::::::::::::::
newfile3
::::::::::::::
{
    "contexts": [], 
    "auto": true, 
    "responses": [], 
    "id": "b5902627-0ba0-40b6-8127-834a3ddd6c2c", 
    "name": "CCC13"
}

文档:https://docs.python.org/2/library/json.html

较新的版本保持与输入相同的顺序:

import json
from collections import OrderedDict
#open the original json file
with open('original_file.json') as handle:
  #create a dict based on the json content
  dictdump = json.loads(handle.read(), object_pairs_hook=OrderedDict)
  #file counter
  i = 1
  #open the data file
  f = open("data_file", "r")
  #get all lines of the data file
  lines = f.read().splitlines()
  #close it
  f.close()
  #for each line of the data file
  for line in lines:
    #change the value of the json name element by the current line
    dictdump['name'] = line
    #open newfileX
    o = open("newfile" + str(i),'w')
    #dump the content of modified json
    json.dump(dictdump, o, indent=4)
    #close the file
    o.close()
    #increase the counter value
    i += 1

输出:

more newfile*
::::::::::::::
newfile1
::::::::::::::
{
    "id": "b5902627-0ba0-40b6-8127-834a3ddd6c2c", 
    "name": "AAA11", 
    "auto": true, 
    "contexts": [], 
    "responses": []
}
::::::::::::::
newfile2
::::::::::::::
{
    "id": "b5902627-0ba0-40b6-8127-834a3ddd6c2c", 
    "name": "BBB12", 
    "auto": true, 
    "contexts": [], 
    "responses": []
}
::::::::::::::
newfile3
::::::::::::::
{
    "id": "b5902627-0ba0-40b6-8127-834a3ddd6c2c", 
    "name": "CCC13", 
    "auto": true, 
    "contexts": [], 
    "responses": []
}

最新更新