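I want to split a DataFrame into chunks (for example, 100 rows split into 20 chunks of 5 rows each) and then run 5 update queries (against 5 different tables) on the data in each chunk.
How can I accomplish this? I'm new and learning this on the job, so could you suggest an approach?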
for item in np.array_split(df1, 10):
    print(item)  # I was able to divide into chunks
    for i, j in item.iterrows():
        print(item.iloc[i]['ColumnName'])
My idea is to add the update query lines right after this print statement.
But this code raises an exception:
Traceback (most recent call last):
  File "/Users/gd/Documents/myproj/test.py", line 63, in <module>
    func()
  File "/Users/gd/Documents/myproj/test.py", line 45, in dedupe_pe
    print(item.iloc[i]['ColumnName'])
  File "/Users/gd/Documents/myproj/lib/python3.9/site-packages/pandas/core/indexing.py", line 931, in __getitem__
    return self._getitem_axis(maybe_callable, axis=axis)
  File "/Users/gd/Documents/myproj/lib/python3.9/site-packages/pandas/core/indexing.py", line 1566, in _getitem_axis
    self._validate_integer(key, axis)
  File "/Users/gd/Documents/myproj/lib/python3.9/site-packages/pandas/core/indexing.py", line 1500, in _validate_integer
    raise IndexError("single positional indexer is out-of-bounds")
IndexError: single positional indexer is out-of-bounds
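item.iterrows() yields each row's index label together with the row itself (as a Series). item.iloc[i] looks rows up by position within the chunk, so, assuming df1 has the default integer index, the labels in every chunk after the first are larger than any valid position and raise the IndexError. Read the values from the row object instead, for example: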
for item in np.array_split(df1, 10):
    print(item)  # I was able to divide into chunks
    # Build one UPDATE statement per row by concatenating string columns
    item["sql"] = "UPDATE " + item["table_name"] + " SET column1 = '" + item["ColumnName_DATA"] + "' WHERE condition"
    for i, j in item.iterrows():
        # j is the row itself, so read values from it directly
        print(j['ColumnName'])
        print(j['sql'])
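
If the queries are actually executed rather than printed, a parameterized variant may be safer than concatenating values into the SQL string. The following is only a minimal sketch under stated assumptions: the in-memory SQLite database, the table table_a, the id column, the sample data, and the helper iter_chunks are all hypothetical and stand in for the real database and DataFrame. It slices the frame by position into fixed-size chunks and runs one UPDATE per chunk; for several tables, the executemany call would be repeated once per table.

```python
import sqlite3

import pandas as pd


def iter_chunks(df, chunk_size):
    """Yield consecutive row slices of df, each with at most chunk_size rows."""
    for start in range(0, len(df), chunk_size):
        yield df.iloc[start:start + chunk_size]


# Hypothetical data standing in for df1: 100 rows with an id and a new value.
df1 = pd.DataFrame({
    "id": list(range(100)),
    "ColumnName": [f"value_{n}" for n in range(100)],
})

# Hypothetical target table in an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_a (id INTEGER PRIMARY KEY, column1 TEXT)")
conn.executemany("INSERT INTO table_a (id) VALUES (?)", [(n,) for n in range(100)])

chunk_count = 0
for chunk in iter_chunks(df1, 5):
    # Convert to plain Python types so the driver can bind them as parameters.
    params = [(str(value), int(row_id))
              for value, row_id in zip(chunk["ColumnName"], chunk["id"])]
    # Placeholders (?) let the driver handle quoting of the values.
    conn.executemany("UPDATE table_a SET column1 = ? WHERE id = ?", params)
    conn.commit()  # one commit per chunk
    chunk_count += 1

updated = conn.execute(
    "SELECT COUNT(*) FROM table_a WHERE column1 IS NOT NULL"
).fetchone()[0]
print(chunk_count, updated)
```

Slicing with iloc in fixed steps gives exactly 5 rows per chunk for 100 rows, whereas np.array_split(df1, 10) would give 10 chunks of 10 rows, so the chunk size should be chosen to match what is intended.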