Error when exporting to SQLite using SQLAlchemy with Scrapy



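I am building a scraper that collects ETF data from etf.com. I am trying to export the collected data to a SQLite database, but every time I do so I get the following error message (once for every item I try to add):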

sqlalchemy.exc.InterfaceError: (sqlite3.InterfaceError) Error binding parameter 0 - probably unsupported type.
[SQL: INSERT INTO etf (ticker, name, issuer, aum, expense_ratio, tr, segment) VALUES (?, ?, ?, ?, ?, ?, ?)]
[parameters: (['ROKT'], ['SPDR S&P Kensho Final Frontiers ETF'], ['State Street Global Advisors'], ['$19.24M'], ['0.45%'], ['6.45%'], ['Equity: U.S. Space'])]

My spider (brandetfs_spider.py):

for etf in etfs:
    loader = ItemLoader(item=BrandetfsItem(), selector=etf)
    loader.add_css('ticker', 'a.linkTickerName::text')
    loader.add_css('name', 'td.col_2::text')
    loader.add_css('issuer', 'td.col_3::text')
    loader.add_css('aum', 'td.col_4::text')
    loader.add_css('expense_ratio', 'td.col_5::text')
    loader.add_css('tr', 'td.col_6::text')
    loader.add_css('segment', 'td.col_7::text')
    yield loader.load_item()

Models:

from scrapy.utils.project import get_project_settings
from sqlalchemy import Column, Integer, String, Text, create_engine
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()


def db_connect():
    """
    performs database connection using database settings from settings.py
    returns sqlalchemy engine instance
    """
    return create_engine(get_project_settings().get("CONNECTION_STRING"))  # connects to a database


def create_table(engine):
    Base.metadata.create_all(engine)


class ETF(Base):
    __tablename__ = "etf"
    print("-------------------------------------------")
    id = Column(Integer, primary_key=True)
    ticker = Column('ticker', Text())
    name = Column('name', Text())
    issuer = Column('issuer', Text())
    aum = Column('aum', String(10))
    expense_ratio = Column('expense_ratio', Text())
    tr = Column('tr', Text())
    segment = Column('segment', Text())

Pipeline:

from sqlalchemy.orm import sessionmaker

# ETF, db_connect and create_table come from the models module


class BrandetfsPipeline(object):
    def __init__(self):
        """
        Initializes database connection and sessionmaker
        creates tables
        """
        engine = db_connect()
        create_table(engine)
        self.Session = sessionmaker(bind=engine)

    def process_item(self, item, spider):
        """
        Save etfs in the database
        This method is called for every item pipeline component
        """
        session = self.Session()
        # build an ETF row from the scraped item
        etf = ETF()
        etf.ticker = item["ticker"]
        etf.name = item["name"]
        etf.issuer = item["issuer"]
        etf.aum = item["aum"]
        etf.expense_ratio = item["expense_ratio"]
        etf.tr = item["tr"]
        etf.segment = item["segment"]
        try:
            session.add(etf)
            session.commit()
        except:
            session.rollback()
            raise
        finally:
            session.close()
        return item

settings.py:

CONNECTION_STRING = 'sqlite:///scrapy_etfs.db' 

I don't understand why I am getting this error, since the database columns use text types and text is exactly what I am trying to put into them.

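This is most likely because the item loader returns lists: as the parameters in the error message show, every field arrives as a one-element list such as ['ROKT'], and sqlite3 cannot bind a list to a text column. You can fix it by using TakeFirst as the output processor.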
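To illustrate the point, here is a minimal sketch that reproduces the problem with only the standard-library sqlite3 module, so it runs without a Scrapy project. The take_first helper is a hypothetical stand-in that mimics what Scrapy's TakeFirst does (return the first value that is neither None nor an empty string), and the two-column table is a cut-down version of the etf table from the question.

```python
import sqlite3


def take_first(values):
    """Hypothetical stand-in for Scrapy's TakeFirst: first non-empty value."""
    for value in values:
        if value is not None and value != "":
            return value
    return None


# What an ItemLoader produces by default: every field is a list.
raw_item = {"ticker": ["ROKT"], "aum": ["$19.24M"]}

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE etf (ticker TEXT, aum TEXT)")

# Binding a list fails: sqlite3 only accepts scalar parameters
# (str, int, float, bytes, None). The exact exception class depends
# on the Python version, so the common base class is caught here.
try:
    conn.execute(
        "INSERT INTO etf VALUES (?, ?)",
        (raw_item["ticker"], raw_item["aum"]),
    )
    list_insert_failed = False
except sqlite3.Error as exc:
    list_insert_failed = True
    print("insert with lists failed:", exc)

# Unwrapping each list first yields plain strings, which bind fine.
clean_item = {key: take_first(value) for key, value in raw_item.items()}
conn.execute(
    "INSERT INTO etf VALUES (?, ?)",
    (clean_item["ticker"], clean_item["aum"]),
)
rows = conn.execute("SELECT ticker, aum FROM etf").fetchall()
print(rows)
```

In the spider itself, the usual way to apply this is an ItemLoader subclass with default_output_processor = TakeFirst(), or output_processor=TakeFirst() on each scrapy.Field of the item. TakeFirst can be imported from itemloaders.processors in recent Scrapy releases (scrapy.loader.processors in older ones).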
