Formatted printing of two pandas columns


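  • I can print two columns of a pandas DataFrame as shown below
  • How can I format the output row by row?
  • Here is my "ugly" solution, followed by what I expected to work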
import pandas

def date_normalization(data: pandas.core.frame.DataFrame) -> None:
    # EDIT: add completed code
    # convert to desired date format (unparseable values become NaN)
    data[normalized] = pandas.to_datetime(
        data[original],
        errors="coerce",
    ).dt.strftime('%d/%m/%Y')

original = "start"
normalized = "normalized"
data = pandas.DataFrame({
    original:
    {
        0: "AUG 26 2016",
        1: "JAN-FEB 2021",
        2: "2017-06-01 00:00:00"
    }})
date_normalization(data)
# remove rows with invalid date
data = data[data[normalized].notnull()]
# arrggghh ... this is working, but ugly 👹👹👹 ...
for i, before in enumerate(data[original]):
    for j, after in enumerate(data[normalized]):
        if i == j:
            print(f"row {i}: {before} -> {after}")
print("\n")
# surprisingly (?) this doesn't work 🥴
for row in data:
    print(f"{row[original]} -> {row[normalized]}")

This is the output of the first loop, followed by the error I get on my second attempt:

row 0: AUG 26 2016 -> 26/08/2016
row 1: 2017-06-01 00:00:00 -> 01/06/2017

Traceback (most recent call last):
  File "/home/oren/Downloads/GGG/main.py", line 36, in <module>
    print(f"{row[original]} -> {row[normalized]}")
TypeError: string indices must be integers

Since the new column normalized has already been created, you can iterate over both columns in parallel with zip:

import pandas as pd

def date_normalization(data: pd.core.frame.DataFrame) -> pd.core.frame.DataFrame:
    # EDIT: add completed code
    # convert to desired date format (unparseable values become NaN)
    data[normalized] = pd.to_datetime(
        data[original],
        errors="coerce",
    ).dt.strftime('%d/%m/%Y')
    # drop the rows whose date could not be parsed
    return data.dropna(subset=[normalized])

original = "start"
normalized = "normalized"

data = pd.DataFrame({
    original:
    {
        0: "AUG 26 2016",
        1: "JAN-FEB 2021",
        2: "2017-06-01 00:00:00"
    }})

data = date_normalization(data)
print(data)
                 start  normalized
0          AUG 26 2016  26/08/2016
2  2017-06-01 00:00:00  01/06/2017

for o, n in zip(data[original], data[normalized]):
    print(f"{o} -> {n}")
AUG 26 2016 -> 26/08/2016
2017-06-01 00:00:00 -> 01/06/2017

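After dropping the NaN rows, you can reset the index with data.reset_index(drop=True, inplace=True). If you do not reset it, the original index labels are kept even though some rows have been removed (note the 0 and 2 in the output above).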
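A minimal sketch of the index behaviour described above. The sample frame is hand-written for illustration (the normalized values are typed in rather than computed), and the names cleaned, index_before and index_after are hypothetical. It uses the assignment form of reset_index rather than inplace=True, which should be equivalent here.

```python
import pandas as pd

# Hand-written sample data (an assumption for illustration): the second row
# has no normalized date, mimicking a value that failed to parse.
data = pd.DataFrame({
    "start": ["AUG 26 2016", "JAN-FEB 2021", "2017-06-01 00:00:00"],
    "normalized": ["26/08/2016", None, "01/06/2017"],
})

# Dropping the invalid row keeps the original index labels of the survivors.
cleaned = data.dropna(subset=["normalized"])
index_before = list(cleaned.index)

# reset_index(drop=True) renumbers the rows from 0 and discards the old labels.
cleaned = cleaned.reset_index(drop=True)
index_after = list(cleaned.index)

print(index_before, "->", index_after)
```

Without drop=True, the old labels would be kept as an extra column named index instead of being discarded.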

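Iterating over a DataFrame directly yields its column labels (strings), which is why row[original] raises the TypeError. To iterate over rows, you can use DataFrame.iterrows: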

for index, row in data.iterrows():
    print(f"{row[original]} -> {row[normalized]}")
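As a hedged sketch, the snippet below shows what a plain for loop over a DataFrame actually produces, and uses DataFrame.itertuples as an alternative to iterrows. The sample frame is hand-written for illustration, and the names labels and lines are hypothetical. Attribute access such as row.start relies on the column names being valid Python identifiers.

```python
import pandas as pd

# Hand-written sample data (an assumption for illustration).
data = pd.DataFrame({
    "start": ["AUG 26 2016", "2017-06-01 00:00:00"],
    "normalized": ["26/08/2016", "01/06/2017"],
})

# A plain loop over the DataFrame walks the column labels, not the rows.
labels = [label for label in data]

# itertuples yields one namedtuple per row; index=False leaves out the index,
# so the fields are exactly the columns, accessed by attribute.
lines = [
    f"{row.start} -> {row.normalized}"
    for row in data.itertuples(index=False)
]
for line in lines:
    print(line)
```

For a two-column printout like this one, zip over the two columns is usually the lightest option; itertuples tends to be handier when several columns are needed per row.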
