Slicing a 3D pandas DataFrame



I have a nested, three-dimensional pandas DataFrame, built like this:

data = pd.DataFrame([pd.Series([k for k in range(10)]) for j in range(5)] for i in range(8))

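I want to slice this DataFrame so that the Series along the k dimension are half their current length (10).

I tried data.iloc[:,:][0:6], but that only returns the first 6 rows (the i dimension). I also tried iterating over the whole DataFrame and replacing each cell, but I'd like to know whether there is a cleaner way to do this.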

UPDATED:

Here is one way to do what the question asks:

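import pandas as pd
# 8 x 5 frame (i x j); every cell holds a Series of length 10 (k)
data = pd.DataFrame([pd.Series([k for k in range(10)]) for j in range(5)] for i in range(8))
ni, nj = data.shape
nk = len(data.loc[0, 0])
print(ni, nj, nk)
# Apply the slice to every cell, keeping the first half of each Series
data2 = data.applymap(lambda x: x[:len(x) // 2])
print(*data2.shape, len(data2.loc[0, 0]))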

Output:

8 5 10
8 5 5
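As a side note, DataFrame.applymap is deprecated as of pandas 2.1 in favor of DataFrame.map. Below is a minimal sketch of the same element-wise slice that works under either name. The helper name halve_cells is made up for illustration, and the sketch assumes every cell holds a Series:

```python
import pandas as pd

# Same 8 x 5 frame as above; each cell holds a Series of length 10.
data = pd.DataFrame([[pd.Series(range(10)) for j in range(5)] for i in range(8)])


def halve_cells(frame):
    """Return a new frame whose cells keep the first half of each Series."""
    def first_half(s):
        # .iloc makes the slice explicitly positional
        return s.iloc[: len(s) // 2]

    # DataFrame.map exists in pandas >= 2.1; older versions only have applymap.
    if hasattr(pd.DataFrame, "map"):
        return frame.map(first_half)
    return frame.applymap(first_half)


data2 = halve_cells(data)
print(*data2.shape, len(data2.loc[0, 0]))
```

The original frame is left untouched, since each cell is replaced by a new, shorter Series in the returned frame.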

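If your data were in a 3D numpy array instead, you could slice along the third axis directly.

Here is a round-trip solution that goes from pandas to numpy and back to pandas: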
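To illustrate what direct slicing looks like, here is a small sketch. The array n is a stand-in built with np.tile purely for demonstration, not something taken from the question:

```python
import numpy as np

# Stand-in data: an 8 x 5 x 10 array whose last axis always holds 0..9.
n = np.tile(np.arange(10), (8, 5, 1))

# Keep the first half of the last (k) axis; the i and j axes are untouched.
n2 = n[:, :, : n.shape[2] // 2]

print(n.shape, n2.shape)
```

Basic slicing like this returns a view onto the original array, so call n2.copy() if you need an independent array.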

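import pandas as pd
import numpy as np
# 8 x 5 frame (i x j); every cell holds a Series of length 10 (k)
data = pd.DataFrame([pd.Series([k for k in range(10)]) for j in range(5)] for i in range(8))
ni, nj = data.shape
nk = len(data.loc[0, 0])
print(ni, nj, nk)
# Flatten the cells, turn each Series into an array, then rebuild as (ni, nj, nk)
xf = [y.to_numpy() for y in data.to_numpy().flatten()]
n = np.array(xf).reshape([ni, nj, nk])
print(n.shape)
print(n)
# Slice the k axis down to its first half
n2 = n[:, :, :nk // 2]
print(n2.shape)
print(n2)
# Wrap each length-(nk // 2) vector in a Series again to get back an i x j frame
data2 = pd.DataFrame([pd.Series(n2[i, j, :]) for j in range(n2.shape[1])] for i in range(n2.shape[0]))
ni, nj = data2.shape
nk = len(data2.loc[0, 0])
print(ni, nj, nk)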

Here is the input n, the 3D numpy array obtained by converting the i x j DataFrame of length-k Series values:

[[[0 1 2 3 4 5 6 7 8 9]
  [0 1 2 3 4 5 6 7 8 9]
  [0 1 2 3 4 5 6 7 8 9]
  [0 1 2 3 4 5 6 7 8 9]
  [0 1 2 3 4 5 6 7 8 9]]

 [[0 1 2 3 4 5 6 7 8 9]
  [0 1 2 3 4 5 6 7 8 9]
  [0 1 2 3 4 5 6 7 8 9]
  [0 1 2 3 4 5 6 7 8 9]
  [0 1 2 3 4 5 6 7 8 9]]

 [[0 1 2 3 4 5 6 7 8 9]
  [0 1 2 3 4 5 6 7 8 9]
  [0 1 2 3 4 5 6 7 8 9]
  [0 1 2 3 4 5 6 7 8 9]
  [0 1 2 3 4 5 6 7 8 9]]

 [[0 1 2 3 4 5 6 7 8 9]
  [0 1 2 3 4 5 6 7 8 9]
  [0 1 2 3 4 5 6 7 8 9]
  [0 1 2 3 4 5 6 7 8 9]
  [0 1 2 3 4 5 6 7 8 9]]

 [[0 1 2 3 4 5 6 7 8 9]
  [0 1 2 3 4 5 6 7 8 9]
  [0 1 2 3 4 5 6 7 8 9]
  [0 1 2 3 4 5 6 7 8 9]
  [0 1 2 3 4 5 6 7 8 9]]

 [[0 1 2 3 4 5 6 7 8 9]
  [0 1 2 3 4 5 6 7 8 9]
  [0 1 2 3 4 5 6 7 8 9]
  [0 1 2 3 4 5 6 7 8 9]
  [0 1 2 3 4 5 6 7 8 9]]

 [[0 1 2 3 4 5 6 7 8 9]
  [0 1 2 3 4 5 6 7 8 9]
  [0 1 2 3 4 5 6 7 8 9]
  [0 1 2 3 4 5 6 7 8 9]
  [0 1 2 3 4 5 6 7 8 9]]

 [[0 1 2 3 4 5 6 7 8 9]
  [0 1 2 3 4 5 6 7 8 9]
  [0 1 2 3 4 5 6 7 8 9]
  [0 1 2 3 4 5 6 7 8 9]
  [0 1 2 3 4 5 6 7 8 9]]]

And here is the 3D numpy array n2, whose k extent (nk//2) is half that of the original 3D array:

[[[0 1 2 3 4]
  [0 1 2 3 4]
  [0 1 2 3 4]
  [0 1 2 3 4]
  [0 1 2 3 4]]

 [[0 1 2 3 4]
  [0 1 2 3 4]
  [0 1 2 3 4]
  [0 1 2 3 4]
  [0 1 2 3 4]]

 [[0 1 2 3 4]
  [0 1 2 3 4]
  [0 1 2 3 4]
  [0 1 2 3 4]
  [0 1 2 3 4]]

 [[0 1 2 3 4]
  [0 1 2 3 4]
  [0 1 2 3 4]
  [0 1 2 3 4]
  [0 1 2 3 4]]

 [[0 1 2 3 4]
  [0 1 2 3 4]
  [0 1 2 3 4]
  [0 1 2 3 4]
  [0 1 2 3 4]]

 [[0 1 2 3 4]
  [0 1 2 3 4]
  [0 1 2 3 4]
  [0 1 2 3 4]
  [0 1 2 3 4]]

 [[0 1 2 3 4]
  [0 1 2 3 4]
  [0 1 2 3 4]
  [0 1 2 3 4]
  [0 1 2 3 4]]

 [[0 1 2 3 4]
  [0 1 2 3 4]
  [0 1 2 3 4]
  [0 1 2 3 4]
  [0 1 2 3 4]]]

As the final step, the sliced 3D numpy array n2 is converted back into an i x j DataFrame whose cells are Series of length nk//2.
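If you need this round trip more than once, the two conversions could be wrapped in small helpers. This is only a sketch: the names frame_to_cube and cube_to_frame are made up here, and it assumes every cell holds a Series of the same length (np.stack raises a ValueError otherwise):

```python
import numpy as np
import pandas as pd


def frame_to_cube(frame):
    """Stack the Series stored in each cell into an (i, j, k) numpy array."""
    ni, nj = frame.shape
    # Row-major walk over the cells, one 1D array of length k per cell
    cells = [cell.to_numpy() for cell in frame.to_numpy().ravel()]
    return np.stack(cells).reshape(ni, nj, -1)


def cube_to_frame(cube):
    """Wrap each length-k vector of an (i, j, k) array in a Series."""
    return pd.DataFrame([[pd.Series(vec) for vec in row] for row in cube])


data = pd.DataFrame([[pd.Series(range(10)) for j in range(5)] for i in range(8)])
cube = frame_to_cube(data)
data2 = cube_to_frame(cube[:, :, : cube.shape[2] // 2])
print(cube.shape, data2.shape, len(data2.loc[0, 0]))
```

Keeping the data as a numpy array for as long as possible, and converting back only at the end, avoids rebuilding the nested Series after every slice.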
