How to split a URL into separate parts in Bash



I have a URL (https://example.com/someone/something), and I want to break it down into three variables:

  • https://example.com/
  • someone
  • something

I'd like to know how to do this in Bash using grep, awk, or some other tool.

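To clarify, I'm not asking whether there is a way to get all three variables at once. Running three separate commands, one per variable, is perfectly fine. Something like this: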

URL="https://example.com/someone/something"
DOMAIN=$(echo "${URL}" | <some wizardry here>)
USER=$(echo "${URL}" | <some wizardry here>)
THING=$(echo "${URL}" | <some wizardry here>)

#!/bin/bash
URL="https://example.com/someone/something"
DOMAIN=$(echo "${URL}" | awk -F'/' '{print $1FS$2FS$3}')
USER=$(echo "${URL}" | awk -F'/' '{print $4}')
THING=$(echo "${URL}" | awk -F'/' '{print $5}')
echo "$DOMAIN" "$USER" "$THING"

Output:

https://example.com someone something
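
To see why fields 1 to 3 together make up the scheme and host, it can help to print every field awk sees when it splits on `/`. This is only an illustrative sketch using the same example URL; the `fields` variable name is mine, not part of the answer.

```shell
#!/bin/bash
URL="https://example.com/someone/something"
# Print each '/'-separated field with its index.
# Field 2 is empty because of the two consecutive slashes in "https://".
fields=$(printf '%s\n' "$URL" | awk -F'/' '{ for (i = 1; i <= NF; i++) printf "%d=[%s]\n", i, $i }')
printf '%s\n' "$fields"
```

Field 1 is `https:`, field 2 is empty, and field 3 is `example.com`, which is why the answer joins `$1FS$2FS$3` to rebuild `https://example.com`.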

A pure Bash solution:

url_str="https://example.com/someone/something"
if [[ "$url_str" =~ ^(http.+)/([^/]+)/([^/]+)$ ]]; then
    domain="${BASH_REMATCH[1]}"
    section1="${BASH_REMATCH[2]}"
    section2="${BASH_REMATCH[3]}"
fi

Result:

$ echo $domain
https://example.com
$ echo $section1
someone
$ echo $section2
something
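
If the URL doesn't match, the `if` above silently leaves all three variables unset. A minimal sketch of handling that case, reusing the same regex; `parse_url` is a hypothetical helper name I made up for illustration:

```shell
#!/bin/bash
# parse_url is a hypothetical helper, not part of the answer above.
# On success it sets the globals domain, section1 and section2.
parse_url() {
    local url_str=$1
    if [[ "$url_str" =~ ^(http.+)/([^/]+)/([^/]+)$ ]]; then
        domain="${BASH_REMATCH[1]}"
        section1="${BASH_REMATCH[2]}"
        section2="${BASH_REMATCH[3]}"
    else
        # No match: report the problem and signal failure to the caller.
        printf 'unexpected URL format: %s\n' "$url_str" >&2
        return 1
    fi
}

if parse_url "https://example.com/someone/something"; then
    echo "$domain $section1 $section2"
fi
```

Note that the regex always takes the last two path segments, so a URL with more than two segments would leave the extra ones inside `domain`.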

Try this:

$ URL="https://example.com/someone/something"
$ IFS=' ' read -r DOMAIN USER THING <<< "$(sed 's|/| |3g' <<< "${URL}")"
$ echo ${DOMAIN}
https://example.com
$ echo ${USER}
someone
$ echo ${THING}
something
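
Combining a number with `g` in the sed `s` command (`3g`) is a GNU extension, so the command above may not behave the same with other sed implementations. As a rough alternative sketch that needs no sed at all, `read` can split on `/` directly. The names `scheme`, `skip`, `host` and `USER_PART` are my own placeholders:

```shell
#!/bin/bash
URL="https://example.com/someone/something"
# Split on '/': scheme ("https:"), the empty field produced by "//",
# the host, and then the two path segments.
IFS='/' read -r scheme skip host USER_PART THING <<< "$URL"
DOMAIN="${scheme}//${host}"
echo "$DOMAIN" "$USER_PART" "$THING"
```

This prints `https://example.com someone something`. If the path has more than two segments, the last variable (`THING`) receives the remainder, slashes included.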

You could also try this:

URL="https://example.com/someone/something?test=a"
# Python 3: urlparse lives in urllib.parse and print is a function
DOMAIN=$(echo "$URL" | python3 -c "from urllib.parse import urlparse; import sys; print(urlparse(sys.stdin.read().strip()).hostname)")
USER=$(echo "$URL" | python3 -c "from urllib.parse import urlparse; import sys; print(urlparse(sys.stdin.read().strip()).path.split('/')[1])")
THING=$(echo "$URL" | python3 -c "from urllib.parse import urlparse; import sys; print(urlparse(sys.stdin.read().strip()).path.split('/')[2])")
QUERYSTRING=$(echo "$URL" | python3 -c "from urllib.parse import urlparse; import sys; print(urlparse(sys.stdin.read().strip()).query)")
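
If Python isn't available, the query string can also be peeled off with Bash parameter expansion before splitting the path. A minimal sketch, assuming the URL has no `#fragment` and that everything after the first `?` is the query; `BASE` is a name I introduced for illustration:

```shell
#!/bin/bash
URL="https://example.com/someone/something?test=a"
# Remove the longest suffix starting at the first '?': the URL without its query.
BASE="${URL%%\?*}"
# Keep what follows the first '?'; stays empty when there is no '?' at all.
QUERYSTRING=""
if [[ "$URL" == *\?* ]]; then
    QUERYSTRING="${URL#*\?}"
fi
echo "$BASE"
echo "$QUERYSTRING"
```

`BASE` can then be fed to any of the other approaches in this thread to get the domain and the two path parts.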

Just Bash :)

URL="https://example.com/someone/something"
regex='(https?://[^/]*)/([^/]*)/(.*)'
if [[ "$URL" =~ $regex ]]
then
  DOMAIN=${BASH_REMATCH[1]}
  UUSER=${BASH_REMATCH[2]}
  THING=${BASH_REMATCH[3]}
fi
echo "$DOMAIN -- $UUSER -- $THING"

I used UUSER to avoid clobbering the USER environment variable.
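
One thing worth knowing about this regex: the last group is `(.*)`, so it captures everything left in the path, not just one segment. A small sketch with a longer, made-up URL to show the effect:

```shell
#!/bin/bash
# Hypothetical URL with an extra path segment, for illustration only.
URL="https://example.com/someone/something/else"
regex='(https?://[^/]*)/([^/]*)/(.*)'
if [[ "$URL" =~ $regex ]]; then
  DOMAIN=${BASH_REMATCH[1]}
  UUSER=${BASH_REMATCH[2]}
  # The third group holds the whole remainder, including further slashes.
  THING=${BASH_REMATCH[3]}
fi
echo "$DOMAIN -- $UUSER -- $THING"
```

Here `THING` ends up as `something/else`. If only a single segment is wanted, the last group could be changed to `([^/]*)`.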
