I want to extract the plugin names and theme names from the URLs below:
http://example.com/wp-content/plugins/contact-form-7/includes/css/styles.css?ver=4.2.1
http://example.com/wp-content/plugins/recent-tweets-widget/tp_twitter_plugin.css?ver=1.0
http://example.com/wp-content/plugins/revslider/rs-plugin/css/settings.css?rev=4.6.0&ver=4.2.2
http://example.com/wp-content/plugins/js_composer/assets/css/vc-ie8.css
http://example.com/wp-content/themes/themeforest-9412083-specular-responsive-multipurpose-business-theme/specular/style.css?ver=4.2.2
I tried awk and sed, but could not get the desired result.
sed
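Use this sed command:
sed 's/.*\(plugin\|theme\)s\/\([^/]*\)\/.*/\2/'
It looks for plugins or themes followed by a slash (/). Because the leading .* is greedy, this is the last such occurrence on the line. Next, it takes a sequence of non-slash characters ([^/]*) followed by a slash. That sequence is captured in a group (\(...\)) and reinserted in the replacement as \2. Note that the \| alternation is a GNU sed extension.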
Example usage:
$ cat file
http://example.com/wp-content/plugins/contact-form-7/includes/css/styles.css?ver=4.2.1
http://example.com/wp-content/plugins/recent-tweets-widget/tp_twitter_plugin.css?ver=1.0
http://example.com/wp-content/plugins/revslider/rs-plugin/css/settings.css?rev=4.6.0&ver=4.2.2
http://example.com/wp-content/plugins/js_composer/assets/css/vc-ie8.css
http://example.com/wp-content/themes/themeforest-9412083-specular-responsive-multipurpose-business-theme/specular/style.css?ver=4.2.2
$ sed 's/.*\(plugin\|theme\)s\/\([^/]*\)\/.*/\2/' file
contact-form-7
recent-tweets-widget
revslider
js_composer
themeforest-9412083-specular-responsive-multipurpose-business-theme
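For comparison, the same pattern can be sketched with Python's re module. This is only an illustration, not part of the sed answer: the function name extract_name is hypothetical, and the sketch returns None for lines that contain neither plugins/ nor themes/, whereas sed would print such lines unchanged.

```python
import re

# Python counterpart of the sed pattern: capture the path segment that
# follows "plugins/" or "themes/". The leading .* is greedy, so the last
# such occurrence in the URL is the one that gets captured.
PATTERN = re.compile(r".*(?:plugin|theme)s/([^/]*)/")


def extract_name(url):
    """Return the plugin or theme name, or None if the URL has neither."""
    match = PATTERN.match(url)
    return match.group(1) if match else None


if __name__ == "__main__":
    sample_urls = [
        "http://example.com/wp-content/plugins/contact-form-7/includes/css/styles.css?ver=4.2.1",
        "http://example.com/wp-content/plugins/js_composer/assets/css/vc-ie8.css",
    ]
    for url in sample_urls:
        print(extract_name(url))
```

A non-capturing group (?:...) is used for the alternation, so the name is group 1 here rather than group 2 as in the sed version.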
awk
With awk it is actually easier: just set the field separator to a slash and print the sixth field.
awk -F '/' '{ print $6 }' file
This produces the same result as the sed command above.
A very simple Python approach:
with open('urls.txt') as f:
    for url in f:
        # Index 5 is the segment right after "plugins" or "themes"
        print(url.split('/')[5])
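Splitting on a fixed position assumes wp-content sits directly under the site root. If WordPress is installed in a subdirectory, the name shifts to a different index. A slightly more defensive sketch, assuming only that the name is the path segment right after plugins or themes, could look like this (name_from_url is a hypothetical helper name):

```python
from urllib.parse import urlsplit


def name_from_url(url):
    """Return the path segment after 'plugins' or 'themes', else None."""
    # urlsplit drops the query string, so "?ver=..." never ends up in a segment.
    segments = urlsplit(url).path.split("/")
    for marker in ("plugins", "themes"):
        if marker in segments:
            index = segments.index(marker)
            if index + 1 < len(segments):
                return segments[index + 1]
    return None


if __name__ == "__main__":
    # Hypothetical URL with WordPress installed under /blog/
    print(name_from_url(
        "http://example.com/blog/wp-content/plugins/js_composer/assets/css/vc-ie8.css"
    ))
```

Because whole segments are compared, a directory such as rs-plugin is not mistaken for the plugins marker.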