我有一个sql文件来过滤数据
-- Edit this file by adding your SQL below each question.
-------------------------------------------------------------------------------
-------------------------------------------------------------
-- The following queries are based on the 1994 census data.
-------------------------------------------------------------
.read 1994
-census-summary-1.sql
-- 4. what is the average age of people from China?
select avg(age)
from census
where native_country ='China';
-- 5. what is the average age of people from Taiwan?
select avg(age)
from census
where native_country ='Taiwan';
-- 6. which native countries have "land" in their name?
select distinct(native_country)
from census
where native_country like '%land%';
--------------------------------------------------------------------------------------
-- The following queries are based on the courses-ddl.sql and courses-small.sql data
--------------------------------------------------------------------------------------
drop table census;
.read courses-ddl.sql
.read courses-small-1.sql
-- 11. what are the names of all students who have taken some course? Don't show duplicates.
select distinct(name)
from student
where tot_cred > 0;
-- 12. what are the names of departments that offer 4-credit courses? Don't list duplicates.
select distinct(dept_name)
from course
where credits=4;
-- 13. What are the names and IDs of all students who have received an A in a computer science class?
select distinct(name), id
from student natural join takes natural join course
where dept_name="Comp. Sci." and grade="A";
如果我运行
/script.awk-v ID=6 file.sql
请注意,问题id在命令行中作为变量id传递给awk脚本,如下所示:-v ID=6
我怎样才能得到这样的结果结果:
select distinct(native_country) from census where native_country like '%land%';
使用所示的示例和GNUawk
,请尝试使用其match
函数执行以下GNUawk
代码。其中id
是一个awk
变量,它具有要确保在Input_file的行中检查的值。此外,我使用exit
来获取/打印第一个匹配,并退出程序以节省一些时间/周期,如果您有多个匹配,那么只需将其从以下代码中删除即可。
awk -v RS= -v id="6" '
match($0,/(n|^)-- ([0-9]+).[^n]*n(select[^;]*;)/,arr) && arr[2]==id{
gsub(/n/,"",arr[3])
print arr[3]
exit
}
' Input_file
awk
的一个选项可以将行的开头与--6匹配。其中6是ID。
然后移到下一行,设置一个变量,使您想要匹配的零件的开头为seen
然后打印所有未以空格开头且可见的行。
当遇到"seed"时,将seed设置为0;"空";线路
将输出中所需的行连接为一行,并在末尾删除尾部空格。
gawk -v ID=6 '
match($0, "^-- "ID"\.") {
seen=1
next
}
/^[[:space:]]*$/ {
seen=0
}
seen {
a = a $0 " "
}
END {
sub(/ $/, "", a)
print a
}
' file.sql
或者作为单行
gawk -v ID=6 'match($0,"^-- "ID"\."){seen=1;next};/^[[:space:]]*$/{seen=0};seen{a=a$0" "};END{sub(/ $/,"",a);print a}' file.sql
输出
select distinct(native_country) from census where native_country like '%land%';
CCD_ 10将行分隔符设置为"0"的另一个选项;"空";行,并使用带有捕获组的正则表达式来匹配初始-- ID
匹配之后不以空格开头的所有行
gawk -v ID=6 '
match($0, "\n-- "ID"\.[^\n]*\n(([^[:space:]][^\n]*(\n|$))*)", m) {
gsub(/n/, " ", m[1])
print m[1]
}
' RS='^[[:space:]]*$' file