Converting column values in a CSV file to NULL



I have a CSV file that contains data like this:

54861,54850,Doe,John,NULL,-87.1181407064,30.3773576858
54862,54851,Doe,Linda,Lee,None,None

The last two columns are longitude and latitude.

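I am using mysql.connector to insert into the database. The first record is processed fine, but because the last two columns are floats, it fails when it reaches the second record, where the values are set to "None".

I tried setting the values to NULL programmatically, but that is not accepted either.

I have tried several different things and cannot figure it out.

This has to be done in Python.

Here is the code: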

import sys
import mysql.connector
import csv
import os
from mysql.connector import Error
from mysql.connector import errorcode

#Specify the import file
try:

    inputCSV = 'geocoded_test.csv'
    #Open the file and give it a handle
    csvFile = open(inputCSV, 'r')
    #Create a reader object for the input file
    reader = csv.reader(csvFile, delimiter = ',')
except IOError as e:
    print("The input file ", inputCSV, " was not found", e)
    exit()

try:
    mydb = mysql.connector.connect(host='localhost',
                                   database='wordpress',
                                   user='wp_user',
                                   password='XXXXXXXX!'
                                   )
    mycursor = mydb.cursor()
except mysql.connector.Error as error:
    print( "Failed to connect to database: {}".format(error))
    exit()
try:
    record_count = 0
    for row in reader:
        contact_id,address_id,last_name, first_name, middle_name, longitude, latitude = row
        print(row)
        # It is here that I want to convert to NULL.
        if longitude == "None":
            longitude = -1.0
        if latitude == "None":
            latitude = -1.0
        #Update single record now
        mycursor.execute("""
            update civicrm_address
            set
                geo_code_1 = %s,
                geo_code_2 = %s
            where
                id = %s
            and
                location_type_id = %s
            """,
            (latitude, longitude, address_id, 6)
        )
        mydb.commit()
        print(mycursor.rowcount)
        record_count +=1
        print("Record", record_count, " updated successfully")

finally:
    print(record_count, " records updated")
    #closing database connection.
    if(mydb.is_connected()):
        mydb.close()
        print("connection is closed")

One option is to use LOAD DATA with custom logic that catches the None string values and converts them to NULL:

-- The file lists longitude before latitude, so @lng comes first.
-- Use '\n' instead of '\r\n' if the file has Unix line endings.
LOAD DATA LOCAL INFILE 'your_file.csv'
INTO TABLE yourTable
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\r\n'
(col1, col2, last, first, middle, @lng, @lat)
SET lat = CASE WHEN @lat = 'None'
               THEN NULL
               ELSE CAST(@lat AS DECIMAL(10,8)) END,
    lng = CASE WHEN @lng = 'None'
               THEN NULL
               ELSE CAST(@lng AS DECIMAL(11,8)) END;

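Above, I assume that your latitude column is called lat and your longitude column is called lng. I only gave the other columns dummy placeholder names; you will need to use the actual column names for the load to work.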
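
Since the question asks for a Python solution, the same conversion can also be done on the Python side before the UPDATE runs. The following is only a minimal sketch: the names NULL_MARKERS, to_nullable_float and build_params are hypothetical helpers, not part of mysql.connector, and the set of "null" strings is an assumption based on the two sample rows. It relies on mysql.connector sending a Python None parameter as SQL NULL, and it assumes that geo_code_1 and geo_code_2 allow NULL.

```python
import csv
import io

# Strings treated as "no value" (an assumption based on the sample rows).
NULL_MARKERS = {"", "None", "NULL"}


def to_nullable_float(text):
    """Return None for a null marker, otherwise the text parsed as a float."""
    text = text.strip()
    if text in NULL_MARKERS:
        return None
    return float(text)


def build_params(row, location_type_id=6):
    """Turn one CSV row into the parameter tuple for the UPDATE statement."""
    contact_id, address_id, last_name, first_name, middle_name, longitude, latitude = row
    return (
        to_nullable_float(latitude),   # geo_code_1
        to_nullable_float(longitude),  # geo_code_2
        int(address_id),
        location_type_id,
    )


# Same statement as in the question, kept as a parameterized query.
UPDATE_SQL = """
    UPDATE civicrm_address
    SET geo_code_1 = %s,
        geo_code_2 = %s
    WHERE id = %s
      AND location_type_id = %s
"""

# Demonstration on the two sample rows from the question; no database needed.
sample = io.StringIO(
    "54861,54850,Doe,John,NULL,-87.1181407064,30.3773576858\n"
    "54862,54851,Doe,Linda,Lee,None,None\n"
)
params = [build_params(row) for row in csv.reader(sample)]
for p in params:
    print(p)
```

In the loop from the question, the call would then become mycursor.execute(UPDATE_SQL, build_params(row)), replacing the two if blocks that substitute -1.0. The important point is to pass the Python object None, not the string "None" or "NULL", so that the connector can write a real NULL.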
