Is there a way to convert one column into multiple columns in Python?


I have a dataset with a single column that has the following structure:

Datetime stamp 1
Obs1
Obs2
Obs3
Datetime stamp 2
Obs1
Obs2
Obs3

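I want to transform it as shown below, so that each datetime stamp becomes a column header and all observations belonging to that datetime stamp become the rows under it: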

Datetime stamp 1     Datetime stamp 2
Obs1                 Obs1
Obs2                 Obs2
Obs3                 Obs3

Assuming your single column is stored in a list or array, you can build the sub-lists like this:

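lst = ['Datetime stamp 1', 'Obs1', 'Obs2', 'Obs3', 'Datetime stamp 2', 'Obs1', 'Obs2', 'Obs3']
result = []
temp = [lst[0]]  # start the first group with the first header
for item in lst[1:]:
    if item.startswith('Datetime'):
        # a new header closes the current group and opens the next one
        result.append(temp)
        temp = [item]
    else:
        temp.append(item)
result.append(temp)  # do not forget the last group
print(result)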

Output:

[['Datetime stamp 1', 'Obs1', 'Obs2', 'Obs3'], ['Datetime stamp 2', 'Obs1', 'Obs2', 'Obs3']]

The result is now a list of lists, where each inner list can represent one column.
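As a minimal sketch of how that list of lists could be turned into a header row plus data rows, the example below uses only the standard library. The names `headers`, `columns`, and `rows` are illustrative, and the second group is deliberately one observation short to show how `itertools.zip_longest` pads uneven columns with `None`; whether padding is the right behavior for your data is an assumption.

```python
from itertools import zip_longest

# Grouped data in the same shape as `result` above; the second group is
# intentionally shorter to illustrate padding (an assumption for this demo).
result = [
    ['Datetime stamp 1', 'Obs1', 'Obs2', 'Obs3'],
    ['Datetime stamp 2', 'Obs1', 'Obs2'],
]

# The first element of each group is the header, the rest are its values.
headers = [group[0] for group in result]
columns = [group[1:] for group in result]

# Transpose columns into rows; short columns are padded with None.
rows = [list(row) for row in zip_longest(*columns, fillvalue=None)]

print(headers)
for row in rows:
    print(row)
```

If every group is guaranteed to have the same length, plain `zip(*columns)` would do the same job without padding.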

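Assuming the format is always the same (that is, every block starts with the string "Datetime"), you can find the indices where the "Datetime" strings occur and select all the data between those split points: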

import pandas as pd

data = pd.Series(["Datetime stamp 1",
                  "Obs1",
                  "Obs2",
                  "Obs3",
                  "Datetime stamp 2",
                  "Obs1",
                  "Obs2",
                  "Obs3"])

# Get the positions where each block starts
idx_split = data.str.startswith("Datetime ")
idx_split = idx_split.index[idx_split]  # [0, 4]

N_COLS = len(idx_split)  # number of columns
vals = [0] * N_COLS  # initialize values
# Loop over each split index and slice the data
for i in range(N_COLS - 1):
    vals[i] = list(data[idx_split[i]:idx_split[i + 1]])
# The last block runs to the end; vals[-1] also works when there is only one block
vals[-1] = list(data[idx_split[-1]:])
print(vals)
# [['Datetime stamp 1', 'Obs1', 'Obs2', 'Obs3'],
#  ['Datetime stamp 2', 'Obs1', 'Obs2', 'Obs3']]

# Take the first element of each list as the column name and remove it from the list
cols = [p.pop(0) for p in vals]

# The list is in the wrong shape for pandas; transpose it first
# (see https://stackoverflow.com/questions/6473679/transpose-list-of-lists).
# Note that zip() truncates to the shortest block if the blocks differ in length.
df = pd.DataFrame(list(map(list, zip(*vals))), columns=cols)
print(df)
#   Datetime stamp 1 Datetime stamp 2
# 0             Obs1             Obs1
# 1             Obs2             Obs2
# 2             Obs3             Obs3
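
As an alternative sketch (not part of the answer above), the same grouping can be done without an explicit index loop by labelling each block with a running count of header rows. This assumes the series starts with a header and that every header begins with "Datetime"; the names `is_header`, `group_id`, and `columns` are illustrative. Building the frame from a dict of Series lets pandas pad shorter blocks with NaN, which is shown here with a second block that is one observation short.

```python
import pandas as pd

data = pd.Series(["Datetime stamp 1", "Obs1", "Obs2", "Obs3",
                  "Datetime stamp 2", "Obs1", "Obs2"])

# True on header rows; the cumulative sum gives every row its block number.
is_header = data.str.startswith("Datetime")
group_id = is_header.cumsum()

columns = {}
for _, block in data.groupby(group_id):
    values = block.tolist()
    # First value is the header, the rest are that column's observations.
    columns[values[0]] = pd.Series(values[1:])

# Series of different lengths are aligned on their index and padded with NaN.
df = pd.DataFrame(columns)
print(df)
```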
