Expanding one DataFrame column into multiple columns



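I've read several posts but couldn't get what I want. I have a DataFrame of about 4k rows and a few columns, exported from Infoblox (a DNS server). One of the columns holds the DHCP attributes, and I'd like to expand it so that each value gets its own column. This is my df (I've attached a screenshot of the Excel export).

One of the columns is a list of dictionaries with all the options. Here is a (sanitized) sample: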

[
{"name": "tftp-server-name", "num": 66, "value": "10.70.0.27", "vendor_class": "DHCP"},
{"name": "bootfile-name", "num": 67, "value": "pxelinux.0", "vendor_class": "DHCP"},
{"name": "dhcp-lease-time", "num": 51, "use_option": False, "value": "21600", "vendor_class": "DHCP"},
{"name": "domain-name-servers", "num": 6, "use_option": False, "value": "10.71.73.143,10.71.74.163", "vendor_class": "DHCP"},
{"name": "domain-name", "num": 15, "use_option": False, "value": "example.com", "vendor_class": "DHCP"},
{"name": "routers", "num": 3, "use_option": True, "value": "10.70.1.200", "vendor_class": "DHCP"},
]

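I'd like to expand this column into several columns (within the same row), as shown below, using "name" as the df column and "value" as the row value. This would be the goal: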

tftp-server-name         voip-tftp-server                  dhcp-lease-time        domain-name-server        domain-name        routers
0      10.71.69.58              10.71.69.58,10.71.69.59           86400           10.71.73.143,10.71.74.163       example.com      10.70.12.254

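To end up with one global df containing all the information, I think I should create a new df that keeps the index and merge it with the primary one, but I couldn't get it to work. I've tried expand, append, explode... Could you please help me?

Thank you very much for both solutions. I got it working. I'm adding my complete final solution in case someone needs it (there may be a more Pythonic way, but it works):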

import pandas as pd

def formato(df):
    opciones = df['options']
    # One single-row frame per record: option names become columns, values the row.
    filas = [pd.DataFrame(i).set_index("name")[["value"]].T.reset_index(drop=True)
             for i in opciones]
    # DataFrame.append was removed in pandas 2.0; pd.concat replaces it.
    df_int = pd.concat(filas, ignore_index=True)
    # Merging on the index assumes df has the default 0..n-1 index.
    df_global = pd.merge(df, df_int, left_index=True, right_index=True, how="inner")
    # Rename the merged frame (not the original df), otherwise the merge is discarded.
    df_global = df_global.rename(columns={"comment": "Comentario", "end_addr": "IP Fin", "network": "Red",
                                          "start_addr": "IP Inicio", "disable": "Deshabilitado"})
    df_global = df_global[["Red", "Comentario", "IP Inicio", "IP Fin", "dhcp-lease-time",
                           "domain-name-servers", "domain-name", "routers", "tftp-server-name", "bootfile-name",
                           "voip-tftp-server", "wdm-server-ip-address", "ftp-file-server", "vendor-encapsulated-options"]]
    return df_global
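
A more compact variant of the function above might look like the following. This is only a minimal sketch: it assumes the options column already holds Python lists of dicts (not strings), the names `expand_options` and `wanted` are hypothetical, and the network values are made up for illustration. Building the new frame with `index=df.index` keeps it aligned with the original rows even when the index is not 0..n-1, and `reindex(columns=...)` produces NaN columns instead of a KeyError when an option never appears in the data.

```python
import pandas as pd

def expand_options(df, column="options", wanted=None):
    """Return df with one extra column per option name found in `column`."""
    # One {name: value} dict per row; DataFrame fills absent options with NaN.
    rows = [{opt["name"]: opt["value"] for opt in opts} for opts in df[column]]
    expanded = pd.DataFrame(rows, index=df.index)
    if wanted is not None:
        # Fix the column order; options missing from the data become NaN columns.
        expanded = expanded.reindex(columns=wanted)
    return df.drop(columns=[column]).join(expanded)

# Illustrative data only (two rows, non-default index).
df = pd.DataFrame(
    {
        "network": ["10.70.0.0/23", "10.71.0.0/24"],
        "options": [
            [{"name": "routers", "num": 3, "value": "10.70.1.200"},
             {"name": "domain-name", "num": 15, "value": "example.com"}],
            [{"name": "routers", "num": 3, "value": "10.71.0.1"}],
        ],
    },
    index=[10, 20],
)
result = expand_options(df, wanted=["routers", "domain-name", "bootfile-name"])
print(result)
```

The rename and final column selection from `formato` can then be applied to `result` unchanged.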

Here is a solution:

import pandas as pd
data = [{'name': 'tftp-server-name', 'num': 66, 'value': '10.70.0.27', 'vendor_class': 'DHCP'}, {'name': 'bootfile-name', 'num': 67, 'value': 'pxelinux.0', 'vendor_class': 'DHCP'}, {'name': 'dhcp-lease-time', 'num': 51, 'use_option': False, 'value': '21600', 'vendor_class': 'DHCP'}, {'name': 'domain-name-servers', 'num': 6, 'use_option': False, 'value': '10.71.73.143,10.71.74.163', 'vendor_class': 'DHCP'}, {'name': 'domain-name', 'num': 15, 'use_option': False, 'value': 'example.com', 'vendor_class': 'DHCP'}, {'name': 'routers', 'num': 3, 'use_option': True, 'value': '10.70.1.200', 'vendor_class': 'DHCP'}]
df = pd.DataFrame(data).set_index("name")[["value"]].T.reset_index(drop=True)    

Output:

name tftp-server-name bootfile-name dhcp-lease-time        domain-name-servers  domain-name      routers
0          10.70.0.27    pxelinux.0           21600  10.71.73.143,10.71.74.163  example.com  10.70.1.200

You can use json_normalize as follows:

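import ast
import pandas as pd
from pandas import json_normalize  # importing it from pandas.io.json has been deprecated since pandas 1.0

def extract_dict(ld):
    # Parse the string form of the list and map each option name to its value.
    res = {}
    for d in ast.literal_eval(ld):
        res[d['name']] = d['value']
    return res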
# load dataframe (I made a dummy, replace it with read from file)
df = pd.DataFrame.from_dict({'temp':['temp'],'option':['''[{'name': 'tftp-server-name', 'num': 66, 'value': '10.70.0.27', 'vendor_class': 'DHCP'}, {'name': 'bootfile-name', 'num': 67, 'value': 'pxelinux.0', 'vendor_class': 'DHCP'}, {'name': 'dhcp-lease-time', 'num': 51, 'use_option': False, 'value': '21600', 'vendor_class': 'DHCP'}, {'name': 'domain-name-servers', 'num': 6, 'use_option': False, 'value': '10.71.73.143,10.71.74.163', 'vendor_class': 'DHCP'}, {'name': 'domain-name', 'num': 15, 'use_option': False, 'value': 'example.com', 'vendor_class': 'DHCP'}, {'name': 'routers', 'num': 3, 'use_option': True, 'value': '10.70.1.200', 'vendor_class': 'DHCP'}]''']})
B = json_normalize(df['option'].apply(extract_dict).tolist())
print(B)

The output is as follows:

tftp-server-name bootfile-name dhcp-lease-time        domain-name-servers  domain-name      routers
0       10.70.0.27    pxelinux.0           21600  10.71.73.143,10.71.74.163  example.com  10.70.1.200
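
The `extract_dict` helper above expects each cell to be a string, which is what you typically get after reading the export back from CSV or Excel; `ast.literal_eval` raises an error if the cell is already a list. As a small sketch that accepts both forms (the name `to_option_dict` is hypothetical, and the sample rows are shortened for illustration):

```python
import ast
import pandas as pd

def to_option_dict(cell):
    """Map option name -> value for a list of dicts or its string form."""
    # Cells read from CSV/Excel arrive as text; literal_eval parses them without eval().
    options = ast.literal_eval(cell) if isinstance(cell, str) else cell
    return {opt["name"]: opt["value"] for opt in options}

df = pd.DataFrame({
    "option": [
        # String form, as in the answer above.
        "[{'name': 'routers', 'num': 3, 'use_option': True, 'value': '10.70.1.200'}]",
        # Already-parsed form, as in the question.
        [{"name": "routers", "num": 3, "value": "10.70.12.254"},
         {"name": "domain-name", "num": 15, "value": "example.com"}],
    ]
})
expanded = pd.json_normalize(df["option"].apply(to_option_dict).tolist())
print(expanded)
```

Rows that lack an option simply get NaN in that column.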
