uk1 ['http://scope1.com', 'http://scope2.com'] jane.doe@gmail.com
uk2 ['http://scopeapp2-1.com', 'http://scopeapp2-1.com'] jane.doe@gmail.com
I have a CSV file that should really be JSON, and I am trying to sort it into multiple columns.
It could be JSON like this (if that helps):
{"username": "jane.doe@gmail.com",
 "app": [
   {"appid": "123456",
    "appname": "apppname",
    "scopes": ["scope1", "scope2"]},
   {"appid": "23456",
    "appname": "apppname2",
    "scopes": ["scope1", "scope2"]}]},
{"username": "john.doe@gmail.com",
 ...}
Here is the data:
Value |
---|
User: jane.doe@gmail.com |
Client ID: CI1 |
anonymous: False |
displayText: app1 |
nativeApp: False |
userKey: uk1 |
scopes: |
http://scope1.com |
http://scope2.com |
Client ID: CI2 |
anonymous: False |
displayText: app2 |
nativeApp: False |
userKey: uk2 |
scopes: |
http://scopeapp2-1.com |
http://scopeapp2-1.com |
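I think you mean that you have a CSV file.
If you can count on the structure, i.e. one User line followed by 1 to N Client ID sections, each with one scopes section listing N URLs, you can do something like this: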
if __name__ == '__main__':
    from itertools import islice
    from pprint import pprint

    def fieldv(line):
        # Return the text after the last ':' on the line, stripped
        return line.rsplit(':', 1)[1].strip()

    users = []
    client_data = []
    user_record = None
    scopes = []
    with open(..., 'r') as infile:
        while line := infile.readline():  # needs Python 3.8+
            if line.startswith('User'):
                # A new user starts a fresh list of client records
                user = fieldv(line)
                client_data = []
                user_record = {'User': user, 'client_data': client_data}
                users.append(user_record)
            elif line.startswith('http://'):
                # A scope URL belongs to the most recent client record
                scopes.append(line.strip())
            else:
                # 'Client ID' line: the next 5 lines are anonymous,
                # displayText, nativeApp, userKey and the 'scopes:' header
                d = list(islice(infile, 5))
                scopes = []
                app = {'Client ID': fieldv(line),
                       'anonymous': fieldv(d[0]),
                       # other fields d[1], d[2]...,
                       'scopes': scopes}
                client_data.append(app)
    pprint(users)
With the data provided, printing the list of users gives:
[{'User': 'jane.doe@gmail.com',
  'client_data': [{'Client ID': 'CI1',
                   'anonymous': 'False',
                   'scopes': ['http://scope1.com', 'http://scope2.com']},
                  {'Client ID': 'CI2',
                   'anonymous': 'False',
                   'scopes': ['http://scopeapp2-1.com',
                              'http://scopeapp2-1.com']}]}]
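If the goal is a flat table with one row per client, which is what the question seems to be after, the nested list can be flattened in a few lines. This is only a sketch: the `users` literal is copied from the printed output above, and the helper name `flatten`, the choice of columns, and joining the scopes with a space are assumptions made for illustration.

```python
import csv
import io

# Copied from the output printed above
users = [{'User': 'jane.doe@gmail.com',
          'client_data': [{'Client ID': 'CI1',
                           'anonymous': 'False',
                           'scopes': ['http://scope1.com', 'http://scope2.com']},
                          {'Client ID': 'CI2',
                           'anonymous': 'False',
                           'scopes': ['http://scopeapp2-1.com',
                                      'http://scopeapp2-1.com']}]}]


def flatten(users):
    """Yield one flat dict per client (hypothetical helper)."""
    for user in users:
        for app in user['client_data']:
            yield {'User': user['User'],
                   'Client ID': app['Client ID'],
                   'anonymous': app['anonymous'],
                   # join the scopes so they fit in a single CSV cell
                   'scopes': ' '.join(app['scopes'])}


# Sort on several columns at once: first by user, then by client ID
rows = sorted(flatten(users), key=lambda r: (r['User'], r['Client ID']))

# Write the rows as CSV into an in-memory buffer
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=['User', 'Client ID', 'anonymous', 'scopes'])
writer.writeheader()
writer.writerows(rows)
print(out.getvalue())
```

Swapping `io.StringIO()` for an open file handle (opened with `newline=''`) would write the same rows to disk.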
Your file is so close to YAML that it is easy to insert the missing indentation and list markers, and then use json_normalize():
import io
from pathlib import Path

import pandas as pd
import yaml

raw = """User: jane.doe@gmail.com
Client ID: CI1
anonymous: False
displayText: app1
nativeApp: False
userKey: uk1
scopes:
http://scope1.com
http://scope2.com
Client ID: CI2
anonymous: False
displayText: app2
nativeApp: False
userKey: uk2
scopes:
http://scopeapp2-1.com
http://scopeapp2-1.com"""

fn = Path.cwd().joinpath("so.yaml")
with io.StringIO(raw) as f, open(fn, "w") as fw:
    while True:
        suffix = ""
        line = f.readline()
        if not line:
            break
        elif line.startswith("User:"):
            # top-level key, followed by the start of the app list
            prefix = ""
            suffix = "\napp:"
        elif line.startswith("Client ID:"):
            # each client begins a new list item
            prefix = "  - "
        elif (" " in line) or line.startswith("scopes:"):
            # remaining fields of the current client
            prefix = "    "
        else:
            # bare URLs become items of the scopes list
            prefix = "      - "
        fw.write(f"{prefix}{line.strip()}{suffix}\n")

with open(fn) as f:
    myyaml = yaml.safe_load(f)

pd.json_normalize(myyaml, record_path="app", meta="User")
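To see what the rewritten text looks like before it reaches the YAML parser, the same idea can be tried in memory with only the standard library. This is a rough sketch, not the answer's exact code: the function name `to_yaml_text` is hypothetical, the two-spaces-per-level indentation is an assumption, URLs are recognised simply by their `http://` prefix, and the sample input is a shortened version of the data.

```python
def to_yaml_text(raw):
    """Add YAML indentation and list markers to the flat text (hypothetical helper)."""
    out = []
    for line in raw.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith("User:"):
            out.append(line)
            out.append("app:")  # start the list of client records
        elif line.startswith("Client ID:"):
            out.append("  - " + line)  # new list item under app
        elif line.startswith("http://"):
            out.append("      - " + line)  # list item under scopes
        else:
            out.append("    " + line)  # ordinary field of the current client
    return "\n".join(out) + "\n"


# Shortened sample of the data, for illustration only
raw = """User: jane.doe@gmail.com
Client ID: CI1
anonymous: False
scopes:
http://scope1.com
http://scope2.com"""

print(to_yaml_text(raw))
```

The printed text has `app:` directly under the `User:` line, each client as a `- ` item beneath it, and the URLs as a nested list under `scopes:`, which is the shape that `record_path="app"` expects.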