bulk_create in Django with pandas from the Django shell



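I'm trying to run a script from the Django shell to bulk-create database rows from a CSV. I'm not sure whether my pandas code is wrong or my Django model is the culprit. I'm using Python 3, and I don't know whether that affects things. I'm lost in the Django docs.

I want to import this CSV from Kaggle: https://www.kaggle.com/weil41/flights/data

Script: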

import pandas as pd
from .models import Flight
data = pd.read_csv('data/Flights.csv', sep=',')
# year,month,day,dep_time,dep_delay,arr_time,arr_delay,cancelled,
# carrier,tailnum,flight,origin,dest,air_time,distance,hour,min
flights = [
    Flight(
        year=data.ix[row]['year'],
        month=data.ix[row]['month'],
        day=data.ix[row]['day'],
        dep_time=data.ix[row]['dep_time'],
        dep_delay=data.ix[row]['dep_delay'],
        arr_time=data.ix[row]['arr_time'],
        arr_delay=data.ix[row]['arr_delay'],
        cancelled=data.ix[row]['cancelled'],
        carrier=data.ix[row]['carrier'],
        tailnum=data.ix[row]['tailnum'],
        flight=data.ix[row]['flight'],
        origin=data.ix[row]['origin'],
        dest=data.ix[row]['dest'],
        air_time=data.ix[row]['air_time'],
        distance=data.ix[row]['distance'],
        hour=data.ix[row]['hour'],
        min=data.ix[row]['min'],
    )
    for row in data
]
Flight.objects.bulk_create(flights)

models.py

from django.db import models
# year,month,day,dep_time,dep_delay,arr_time,arr_delay,cancelled,
# carrier,tailnum,flight,origin,dest,air_time,distance,hour,min
class Flight(models.Model):
    year = models.CharField(max_length=100, default='')
    month = models.CharField(max_length=100, default='')
    day = models.CharField(max_length=100, default='')
    dep_time = models.CharField(max_length=100, default='')
    arr_time = models.CharField(max_length=100, default='')
    arr_delay = models.CharField(max_length=100, default='')
    cancelled = models.CharField(max_length=100, default='')
    carrier = models.CharField(max_length=100, default='')
    tailnum = models.CharField(max_length=100, default='')
    flight = models.CharField(max_length=100, default='')
    origin = models.CharField(max_length=100, default='')
    dest = models.CharField(max_length=100, default='')
    air_time = models.CharField(max_length=100, default='')
    distance = models.CharField(max_length=100, default='')
    hour = models.CharField(max_length=100, default='')
    min = models.CharField(max_length=100, default='')

    def __str__(self):
        return f'{self.flight} {self.dest} {self.year} {self.month} {self.day}'

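The error I get is KeyError: "'__name__' not in globals"?

Error message:

>>> exec(open('calendarapp/get_data.py').read())
Traceback (most recent call last):
  File "<console>", line 1, in <module>
  File "<string>", line 2, in <module>
KeyError: "'__name__' not in globals"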

See this question for a similar case.

Following the solution there, you could try changing the import statement from

from .models import Flight

to

from [app_name].models import Flight

In your case, that would presumably be:

from calendarapp.models import Flight
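The KeyError comes from Python's import machinery rather than from Django or pandas: a relative import has to look up the current package in the namespace's globals, and code run through exec in the shell apparently has no `__name__` there. A minimal sketch that reproduces this with an empty namespace (the helper `try_relative_import` is made up for illustration and is not part of the question's code):

```python
def try_relative_import(namespace):
    """Run a relative import inside `namespace` and describe the outcome."""
    try:
        # Same kind of statement as `from .models import Flight`.
        exec("from . import models", namespace)
    except (KeyError, ImportError) as exc:
        return f"{type(exc).__name__}: {exc}"
    return "ok"


# An empty dict stands in for the shell's namespace, which is an assumption
# about how the Django shell sets up its globals.
result = try_relative_import({})
print(result)
```

An absolute import such as `from calendarapp.models import Flight` does not need to know the current package, which is why the change above avoids the error.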

Edit: I would also suggest changing how you iterate over the rows.

# `data` is the DataFrame read in the question's script
flights = [
    Flight(
        year=row['year'],
        ...
    )
    for _, row in data.iterrows()
]
Flight.objects.bulk_create(flights)

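Note the use of pandas' iterrows, which makes the code more readable. It also fixes a bug: iterating over a DataFrame directly (for row in data) yields the column labels, not the rows.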
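To see the difference between the two loops, here is a small sketch on a toy DataFrame. The three columns are a subset of the CSV header, and the values are invented sample data, not taken from the Kaggle file.

```python
import pandas as pd

# Toy stand-in for the flights CSV (values are made up for illustration).
data = pd.DataFrame(
    {
        "year": [2013, 2013],
        "carrier": ["UA", "AA"],
        "dest": ["IAH", "MIA"],
    }
)

# Iterating a DataFrame directly yields column labels, not rows.
column_labels = [row for row in data]

# iterrows() yields (index, Series) pairs, one per row.
dests = [row["dest"] for _, row in data.iterrows()]

# to_dict("records") gives one plain dict per row; this is an alternative
# to iterrows, not something the answer itself uses.
records = data.to_dict("records")

print(column_labels)
print(dests)
print(records[0])
```

With one dict per row, the objects could be built as `Flight(**record)`, assuming every CSV column has a matching model field. The model shown in the question has no dep_delay field, so that column would have to be added to the model or dropped from the data first.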

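You can read through this post for some context on how to use .ix (or why not to). Note that .ix was deprecated in pandas 0.20 and removed in pandas 1.0; .loc and .iloc replace it.

Also, bulk_create does not yet set the ID field on the objects it creates (unless you are on Postgres).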
