Extracting two values from a website's source code



I'm trying to write a bash script that extracts two different values, "vendor" and "product", from the CVEdetails page source and stores each one in a bash variable, i.e. vendor=$(requested code) and product=$(requested code).

The snippet of the page source that contains the information I need is:
<tr>
<th>
Vendor
</th>
<th>
Product
</th>
<th>
Vulnerable Versions
</th>
</tr>
<tr>
<td>
<a href="/vendor/45/Apache.html" title="Details for Apache">Apache</a>                              </td>
<td><a href="/product/66/Apache-Http-Server.html?vendor_id=45" title="Product Details Apache Http Server">Http Server</a></td>
<td class="num">
34                             </td>
</tr>
</table>

From this, the information I need is vendor=Apache and product=Http Server, but the closest I've managed to get on my own is:

wget https://www.cvedetails.com/cve/CVE-2017-3169 &>/dev/null; grep -C 6 "Vulnerable Versions" CVE-2017-3169

Do you know how I could get this information? Thanks in advance!

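To process structured data such as HTML and JSON you should use a proper parser; sed, grep, awk and the like are not parsers. As a command-line tool I highly recommend xidel, which parses both HTML and JSON!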

HTML source

$ xidel -s https://www.cvedetails.com/cve/CVE-2017-3169 -e '
//table[@id="vulnversconuttable"]//td[position() lt 3]/a
'
Apache
Http Server

([position() = (1,2)] would also work to return the first and second <td> nodes.)

$ xidel -s https://www.cvedetails.com/cve/CVE-2017-3169 -e '
//table[@id="vulnversconuttable"]/(vendor:=.//td[1]/a,product:=.//td[2]/a)
'
vendor := Apache
product := Http Server
$ xidel -s https://www.cvedetails.com/cve/CVE-2017-3169 -e '
//table[@id="vulnversconuttable"]/(vendor:=.//td[1]/a,product:=.//td[2]/a)
' --output-format=bash
vendor='Apache'
product='Http Server'
$ eval "$(
xidel -s https://www.cvedetails.com/cve/CVE-2017-3169 -e '
//table[@id="vulnversconuttable"]/(vendor:=.//td[1]/a,product:=.//td[2]/a)
' --output-format=bash
)"
$ printf '%s\n' "$vendor" "$product"
Apache
Http Server
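
If you would rather not use eval, the two-line output of the first xidel command can be read straight into the variables. The following is only a minimal sketch: fetch_values is a hypothetical stand-in that prints the sample output shown above, so that the snippet runs without network access; replace its body with the real xidel call.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for the first xidel command above, which prints
# one value per line. Replace the body with the real xidel call.
fetch_values() {
  printf '%s\n' 'Apache' 'Http Server'
}

# Read the first output line into vendor and the second into product.
{
  IFS= read -r vendor
  IFS= read -r product
} < <(fetch_values)

printf 'vendor=%s\nproduct=%s\n' "$vendor" "$product"
```

This relies on each value occupying exactly one line, which holds for the sample output but is an assumption for other pages.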

JSON API

$ xidel -s "https://cve.circl.lu/api/cve/CVE-2017-3169" -e '
$json/(vulnerable_product)(1)
'
cpe:2.3:a:apache:http_server:2.2.2:*:*:*:*:*:*:*
$ xidel -s "https://cve.circl.lu/api/cve/CVE-2017-3169" -e '
tokenize($json/(vulnerable_product)(1),":")
'
cpe
2.3
a
apache
http_server
2.2.2
*
*
*
*
*
*
*
$ xidel -s "https://cve.circl.lu/api/cve/CVE-2017-3169" -e '
tokenize($json/(vulnerable_product)(1),":")[position() = (4,5)]
'
apache
http_server
$ xidel -s "https://cve.circl.lu/api/cve/CVE-2017-3169" -e '
let $a:=tokenize($json/(vulnerable_product)(1),":") return (
vendor:=$a[4],product:=$a[5]
)
'
vendor := apache
product := http_server
$ eval "$(
xidel -s "https://cve.circl.lu/api/cve/CVE-2017-3169" -e '
let $a:=tokenize($json/(vulnerable_product)(1),":") return (
vendor:=$a[4],product:=$a[5]
)
' --output-format=bash
)"
$ printf '%s\n' "$vendor" "$product"
apache
http_server
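
The tokenize step can also be done in plain bash once you have the CPE string. This is just a sketch that uses the sample string from the output above rather than a live request; note that bash arrays are 0-indexed, so XPath positions 4 and 5 become indexes 3 and 4.

```shell
#!/usr/bin/env bash
# Sample CPE string copied from the output above; no request is made here.
cpe='cpe:2.3:a:apache:http_server:2.2.2:*:*:*:*:*:*:*'

# Split on ":" into an array. read does not glob-expand the "*" fields.
IFS=: read -r -a fields <<< "$cpe"

# XPath positions 4 and 5 correspond to bash indexes 3 and 4.
vendor=${fields[3]}
product=${fields[4]}

printf '%s\n' "$vendor" "$product"
```

This simple split assumes that no field contains an escaped colon; a CPE string with one would need more careful handling.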

Here is an example of how simple it becomes when you use an API and a proper parser:

#!/usr/bin/env bash
API_URL='https://cve.circl.lu/api'
cve_id='CVE-2017-3169'
# Read parsed JSON data
IFS=: read -r _ _ _ vendor product _ < <(
# Perform API request
curl -s "$API_URL/cve/$cve_id" |
# Parse JSON data returned by the API to get only what we need
jq -r '.vulnerable_product[0]'
)
# Demo what we got
printf 'CVE ID: %s\n' "$cve_id"
printf 'Vendor: %s\n' "${vendor^}"
printf 'Product: %s\n' "${product}"

Sample output:

CVE ID: CVE-2017-3169
Vendor: Apache
Product: http_server
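
The API returns the product as http_server, whereas the question asked for "Http Server". If the display form matters, one possible approach is to replace the underscores with spaces and capitalize each word. prettify_cpe_name is a hypothetical helper name, and the sketch assumes bash 4 or later for the ${word^} expansion already used in the script above.

```shell
#!/usr/bin/env bash
# prettify_cpe_name is a hypothetical helper, not part of jq, curl or the API.
# It turns e.g. "http_server" into "Http Server".
prettify_cpe_name() {
  local -a words
  local word result=''
  # Split the name on underscores.
  IFS=_ read -r -a words <<< "$1"
  for word in "${words[@]}"; do
    # Capitalize the first letter and join the words with single spaces.
    result+="${result:+ }${word^}"
  done
  printf '%s\n' "$result"
}

prettify_cpe_name 'http_server'
```

The result is only a cosmetic guess at the display name; it will not necessarily match the spelling CVEdetails uses for every product.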
