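I have two files containing SQL data, and I want to remove the rows from the second file whose course code and student number match a row in the first file. The files look like this:
File 1: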
INSERT INTO RegisteredCourses (course,student) VALUES ('BKE974','3941021693');
INSERT INTO RegisteredCourses (course,student) VALUES ('BKE974','5044463260');
INSERT INTO RegisteredCourses (course,student) VALUES ('BKE974','5923001715');
INSERT INTO RegisteredCourses (course,student) VALUES ('DQY359','7539643746');
INSERT INTO RegisteredCourses (course,student) VALUES ('DQY359','9604636424');
INSERT INTO RegisteredCourses (course,student) VALUES ('DQY359','9649249670');
File 2:
INSERT INTO Queue (course,student,registrationDate) VALUES ('BKE974','3941021693','1354811709');
INSERT INTO Queue (course,student,registrationDate) VALUES ('BKE974','5044463260','1378352712');
INSERT INTO Queue (course,student,registrationDate) VALUES ('BKE974','3421728825','1368144500');
INSERT INTO Queue (course,student,registrationDate) VALUES ('DQY359','7421758823','1375874278');
INSERT INTO Queue (course,student,registrationDate) VALUES ('DQY359','9604636424','1374587707');
INSERT INTO Queue (course,student,registrationDate) VALUES ('DQY359','9649249670','1370542279');
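I have manipulated the files so that the course and student fields match in the first two lines and the last two lines of each file. In the first line you can see that they have the same course (BKE974) and student (3941021693) values. If the values do not match, I want to print the whole line from File2 to a new file.
I have been trying to solve this with some bash scripting, and I would love a bash solution because I am trying to learn more about bash. I have tried grep, awk and cut, but my bash knowledge is seriously lacking :P
Edit: So the result I want to end up with is these two lines printed to a new file: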
INSERT INTO Queue (course,student,registrationDate) VALUES ('BKE974','3421728825','1368144500');
INSERT INTO Queue (course,student,registrationDate) VALUES ('DQY359','7421758823','1375874278');
Here's one way using GNU awk:
awk -F "[()]" 'FNR==NR { a[$(NF-1)]++; next } !(gensub(/(.*),.*/,"\\1","g",$(NF-1)) in a)' File1 File2
Results:
INSERT INTO Queue (course,student,registrationDate) VALUES ('BKE974','3421728825','1368144500');
INSERT INTO Queue (course,student,registrationDate) VALUES ('DQY359','7421758823','1375874278');
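Since gensub() is a GNU awk extension, here is a rough sketch of the same idea using only sub(), which should also work in other awk implementations. The function name filter_queue is made up for this example, and it assumes every line has the same "VALUES (...);" shape as in the question.

```shell
#!/bin/bash
# filter_queue FILE1 FILE2 (hypothetical helper name):
# print the lines of FILE2 whose course/student pair is not in FILE1.
filter_queue() {
    awk -F "[()]" '
        # First file: store each "course,student" VALUES list as an array key.
        FNR == NR { seen[$(NF-1)]; next }
        # Second file: copy the VALUES list, drop the trailing
        # ",registrationDate" part, and print the untouched line only
        # when the remaining pair was not seen in the first file.
        {
            key = $(NF-1)
            sub(/,[^,]*$/, "", key)
            if (!(key in seen)) print
        }
    ' "$1" "$2"
}
```

It could then be called as: filter_queue File1 File2 > new_file.sql (the output file name is just an example).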
Try this:
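#!/bin/bash
# Note: sed -i (GNU sed) edits file2.txt in place; run this on a copy to keep the original.
while IFS= read -r line
do
    # Extract the quoted pair from "VALUES (...);", e.g. 'BKE974','3941021693'
    x=$(printf '%s\n' "$line" | sed -n "s/.*VALUES (\(.*\));.*/\1/p")
    # Skip lines that are not INSERT statements, otherwise delete the matching row
    [ -n "$x" ] && sed -i "/$x/d" file2.txt
done < file1.txt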
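The loop above runs sed once for every line of file1.txt. As a possible alternative (a different technique, not what the script above does), the pairs can be extracted once and handed to grep as fixed-string patterns, which also leaves the second file untouched. This is only a sketch: the function name remove_registered is made up, and it assumes a grep that accepts "-f -" to read patterns from standard input, as GNU grep does.

```shell
#!/bin/bash
# remove_registered FILE1 FILE2 (hypothetical helper name):
# print the lines of FILE2 that contain none of FILE1's course/student pairs.
remove_registered() {
    # sed pulls the text between "VALUES (" and ");" out of every INSERT
    # line of the first file. grep then treats each extracted pair as a
    # fixed string (-F), reads the patterns from stdin (-f -), and keeps
    # only the lines of the second file that match none of them (-v).
    sed -n "s/.*VALUES (\(.*\));.*/\1/p" "$1" | grep -vFf - "$2"
}
```

For example: remove_registered file1.txt file2.txt > new_file.txt (the output name is arbitrary). Because the match is a plain substring test, it relies on the course,student pair appearing together in the File 2 line exactly as it is written in File 1.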