我正在尝试从本地保存的musicbrainz数据库中随机获取100K歌曲。我是编程新手,想知道我的电脑变慢(可能是内存填充(背后的原因是什么。我在虚拟机上运行 ubuntu。请提出一些更改建议,以便我可以在运行此内容后睡觉。
import psycopg2
import random
import MySQLdb
from contextlib import closing
conn = psycopg2.connect("dbname='musicbrainz' user='musicbrainz' host='localhost' password='musicbrainz'")
conn1 = MySQLdb.connect(host = "localhost", user = "root", passwd = "40OZlike", db = "plalyst")
print("connections and cursors made...")
cur= conn1.cursor()
conn1.set_character_set('utf8')
cur.execute('SET NAMES utf8;')
cur.execute('SET CHARACTER SET utf8;')
cur.execute('SET character_set_connection=utf8;')
cur.close()
def migrateSongDB():
try:
cur1 = conn1.cursor()
cur1.execute("select count(*) from Song")
numberOfSongs = cur1.fetchall()[0][0]
cur1.close()
print("number of songs in our database is ")
print(numberOfSongs)
rnumbers = random.sample(range(1, 22660511), 100000-numberOfSongs)
print("random numbers generated....")
for eachnum in rnumbers:
cur = conn.cursor()
cur1 = conn1.cursor()
print(eachnum)
songName=""
while(songName==""):
cur.execute("""select name from track where id = %s """, (eachnum,))
rows = cur.fetchall()
print(rows)
if not len(rows)==0:
songName = rows[0][0]
eachnum+=1
print("Got the track name:")
print(songName)
sql = 'INSERT into Song (name) values ( "'+songName+'")'
print(sql)
cur1.execute(sql)
cur1.execute('commit')
print("inserted into the song table....")
cur.close()
cur1.close()
print("Songs Saved into new Data Base...")
conn.close()
conn1.close()
print("Connections Closed")
except:
with conn1 as cursor:
cursor.execute('select 1;')
result = cursor.fetchall()
for cur in result:
print(cur)
migrateSongDB()
def main():
migrateSongDB()
conn.close()
conn1.close()
if __name__ == "__main__": main()
感谢您抽出宝贵时间阅读此代码。另外,如果你们对我有任何建议来改善我的编码风格,我很想学习。再次感谢您。
我的预感是,你从帽子里拿出的ID最终会徘徊到ID不存在的领域。 这意味着您将运行一个永久长*的循环,这是低效的。
与其使用这种方法,为什么不直接从数据库中提取已知的 ID 并选择它呢?
你可以用这个来实现这一点。 我从这个堆栈溢出答案中借用了一个列表扁平化操作,这将使使用 ID 列表变得轻而易举。
cur1.execute("select id from Song")
result = cur1.fetchall()
result = [item for sublist in result for item in sublist]
# result holds the total amount of elements in your Song DB
# use len() to get the total number of rows from here
rnumbers = random.sample(result, 100000-len(result))
然后,您可以摆脱 while 循环,因为您可以保证数据库中实际存在一个 ID。