使用awk脚本需要帮助才能做到这一点,只需要你的建议或逻辑



我有一个sql文件来过滤数据

-- Edit this file by adding your SQL below each question.
-------------------------------------------------------------------------------
-------------------------------------------------------------
-- The following queries are based on the 1994 census data.
-------------------------------------------------------------
.read 1994
-census-summary-1.sql
-- 4. what is the average age of people from China?
select avg(age)
from census
where native_country ='China';
-- 5. what is the average age of people from Taiwan?
select avg(age)
from census
where native_country ='Taiwan';
-- 6. which native countries have "land" in their name?
select distinct(native_country)
from census
where native_country like '%land%';
--------------------------------------------------------------------------------------
-- The following queries are based on the courses-ddl.sql and courses-small.sql data
--------------------------------------------------------------------------------------
drop table census;
.read courses-ddl.sql
.read courses-small-1.sql
-- 11. what are the names of all students who have taken some course? Don't show duplicates.
select distinct(name)
from student
where tot_cred > 0;
-- 12. what are the names of departments that offer 4-credit courses? Don't list duplicates.
select distinct(dept_name)
from course
where credits=4;
-- 13. What are the names and IDs of all students who have received an A in a computer science class?
select distinct(name), id
from student natural join takes natural join course  
where dept_name="Comp. Sci." and grade="A";

如果我运行

/script.awk-v ID=6 file.sql

请注意,问题id在命令行中作为变量id传递给awk脚本,如下所示:-v ID=6

我怎样才能得到这样的结果结果:

select distinct(native_country) from census where native_country like '%land%';

使用所示的示例和GNUawk,请尝试使用其match函数执行以下GNUawk代码。其中id是一个awk变量,它具有要确保在Input_file的行中检查的值。此外,我使用exit来获取/打印第一个匹配,并退出程序以节省一些时间/周期,如果您有多个匹配,那么只需将其从以下代码中删除即可。

awk -v RS= -v id="6" '
match($0,/(n|^)-- ([0-9]+).[^n]*n(select[^;]*;)/,arr) && arr[2]==id{
gsub(/n/,"",arr[3])
print arr[3]
exit
}
'  Input_file

awk的一个选项可以将行的开头与--6匹配。其中6是ID。

然后移到下一行,设置一个变量,使您想要匹配的零件的开头为seen

然后打印所有未以空格开头且可见的行。

当遇到"seed"时,将seed设置为0;"空";线路

将输出中所需的行连接为一行,并在末尾删除尾部空格。

gawk -v ID=6 '
match($0, "^-- "ID"\.") {
seen=1
next
}
/^[[:space:]]*$/ {
seen=0
}
seen {
a = a $0 " "
}
END {
sub(/ $/, "", a)
print a
}
' file.sql

或者作为单行

gawk -v ID=6 'match($0,"^-- "ID"\."){seen=1;next};/^[[:space:]]*$/{seen=0};seen{a=a$0" "};END{sub(/ $/,"",a);print a}' file.sql

输出

select distinct(native_country) from census where native_country like '%land%';

CCD_ 10将行分隔符设置为"0"的另一个选项;"空";行,并使用带有捕获组的正则表达式来匹配初始-- ID匹配之后不以空格开头的所有行

gawk -v ID=6 '
match($0, "\n-- "ID"\.[^\n]*\n(([^[:space:]][^\n]*(\n|$))*)", m) {
gsub(/n/, " ", m[1])
print m[1]
}
' RS='^[[:space:]]*$' file

相关内容

最新更新