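Input file:
CREATE TABLE IF NOT EXISTS test.employee ( eid int, name String,salary String, destination String) COMMENT ‘Employee details’ ROW FORMAT DELIMITED FIELDS TERMINATED BY ‘\t’ LINES TERMINATED BY ‘\n’ STORED AS TEXTFILE;
CREATE TABLE IF NOT EXISTS test.employee1 ( eid int, name String,salary String, destination String) COMMENT ‘Employee details’ ROW FORMAT DELIMITED FIELDS TERMINATED BY ‘\t’ LINES TERMINATED BY ‘\n’ STORED AS TEXTFILE;
CREATE TABLE IF NOT EXISTS test.employee2 ( eid int, name String,salary String, destination String) COMMENT ‘Employee details’ ROW FORMAT DELIMITED FIELDS TERMINATED BY ‘\t’ LINES TERMINATED BY ‘\n’ STORED AS TEXTFILE;
... and so on.
Expected output:
CREATE TABLE IF NOT EXISTS test.employee ( eid int, name String,salary String, destination String) COMMENT ‘Employee details’ ROW FORMAT DELIMITED FIELDS TERMINATED BY ‘\t’ LINES TERMINATED BY ‘\n’ STORED AS TEXTFILE;
Save this as employee.hql, and
CREATE TABLE IF NOT EXISTS test.employee1 ( eid int, name String,salary String, destination String) COMMENT ‘Employee details’ ROW FORMAT DELIMITED FIELDS TERMINATED BY ‘\t’ LINES TERMINATED BY ‘\n’ STORED AS TEXTFILE;
Save this as employee1.hql, and so on. I tried using regular expressions, but I could only capture either the table name or the text up to the end of the statement; I could not combine the two expressions. I am new to Python. How can I achieve this?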
It depends on the size of your input file, since this reads the whole file into memory, but here is one solution:
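import re

sql_file = "sql.txt"

# Read the whole file and turn newlines into spaces so statements can be split on "; ".
with open(sql_file) as f:
    data = f.read().replace("\n", " ")

# Statements are separated by a semicolon followed by a space.
items = data.split("; ")

for item in items:
    item = item.strip().rstrip(";")  # the last statement still carries its own ";"
    if item:  # skip blank entries
        sql = item + ";"
        # The first "word.word" match is the qualified table name, e.g. test.employee
        table_name = re.findall(r"(\w+\.\w+)", sql)
        if table_name:
            _, outfile = table_name[0].split(".")
            outfile += ".hql"
            print("Writing:", outfile)
            with open(outfile, "w") as out:
                out.write(sql + "\n")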
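Essentially, it reads in the whole SQL input and then splits it on a semicolon followed by a space. If your input contains other semicolons (inside a comment string, for example), you will need to modify this to make it less brittle.
Next, it looks at each item and searches for word characters, a period, and more word characters (test.employee, for instance). It splits that match on the period and uses the second part, which is your table name, as the file name.
Finally, it writes the SQL out to a file with the name you described.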
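For the case where other semicolons do appear, one possible way to make the split less brittle is to ignore semicolons inside quoted strings. The following is only a sketch: `split_statements` and `STATEMENT` are hypothetical names, and it assumes quotes are straight, balanced, and never escaped inside a string.

```python
import re

# Hypothetical pattern: a statement is a run of quoted strings and of
# characters that are neither a semicolon nor a quote.
STATEMENT = re.compile(r"""(?:'[^']*'|"[^"]*"|[^;'"])+""")


def split_statements(sql_text):
    """Return the non-empty statements in sql_text, each ending with ';'."""
    statements = []
    for match in STATEMENT.finditer(sql_text):
        statement = match.group().strip()
        if statement:  # skip whitespace-only fragments between semicolons
            statements.append(statement + ";")
    return statements
```

Each returned statement can then go through the same table-name lookup and file write as in the loop above.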
# -*- coding: utf-8 -*-
import re
text = "CREATE TABLE IF NOT EXISTS test.employee ( eid int, name String,salary String, destination String) COMMENT ‘Employee details’ ROW FORMAT DELIMITED FIELDS TERMINATED BY ‘\t’ LINES TERMINATED BY ‘\n’ STORED AS TEXTFILE; CREATE TABLE IF NOT EXISTS test.employee1 ( eid int, name String,salary String, destination String) COMMENT ‘Employee details’ ROW FORMAT DELIMITED FIELDS TERMINATED BY ‘\t’ LINES TERMINATED BY ‘\n’ STORED AS TEXTFILE; CREATE TABLE IF NOT EXISTS test.employee2 ( eid int, name String,salary String, destination String) COMMENT ‘Employee details’ ROW FORMAT DELIMITED FIELDS TERMINATED BY ‘\t’ LINES TERMINATED BY ‘\n’ STORED AS TEXTFILE;"
for item in re.findall(r'CREATE[^;]*;', text):
    print('Save in', re.search(r'(?<=\.)\w+', item).group() + '.hql')
    print(item)
It is not entirely clear what you need, but this may help you.
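Since the question mentions being unable to join the two regular expressions, here is a sketch of how a single pattern could capture both the whole statement and the table name in one pass. The names `CREATE_TABLE`, `statements_by_file`, and `write_statements` are made up for illustration, and the pattern assumes every table name is written as schema.table and that no statement contains a semicolon before its end.

```python
import re

# One pattern: group 0 is the whole statement, the named group "table"
# is the part after the schema prefix (e.g. "employee" in "test.employee").
CREATE_TABLE = re.compile(
    r"CREATE\s+TABLE\s+(?:IF\s+NOT\s+EXISTS\s+)?\w+\.(?P<table>\w+)[^;]*;",
    re.IGNORECASE,
)


def statements_by_file(sql_text, extension=".hql"):
    """Map each output file name to the CREATE TABLE statement it should hold."""
    return {
        match.group("table") + extension: match.group(0)
        for match in CREATE_TABLE.finditer(sql_text)
    }


def write_statements(sql_text):
    """Write every statement found in sql_text to its own file."""
    for filename, statement in statements_by_file(sql_text).items():
        with open(filename, "w") as out:
            out.write(statement + "\n")
```

Calling `write_statements(text)` with the `text` string above should produce one file per table, named after the table.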