I have a set of data:
www.google.com.sg
www.yahoo.com
marketwatch
bing.com
bbc.co.uk
Some of the data has www. and some doesn't. Some has .com / .com.sg / .com.ul and some doesn't.
How do I extract just the name, e.g. google, yahoo, marketwatch, bing, bbc, using SQL?
Using MS SQL Server's CHARINDEX and SUBSTRING functions, you could do something like this.
(Deliberately over-split so that each step is obvious.)
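WITH
url_start AS
(
    SELECT
        *,
        -- First character of the name: position 5 when a leading 'www.' is present, else 1
        CASE WHEN LEFT(myURL, 4) = 'www.' THEN 5 ELSE 1 END AS d_start
    FROM
        myTable
),
url_end AS
(
    SELECT
        *,
        -- Position of the first '.' at or after d_start, or one past the end if there is none
        CASE WHEN
            CHARINDEX('.', myURL, d_start) = 0
        THEN
            LEN(myURL) + 1
        ELSE
            CHARINDEX('.', myURL, d_start)
        END AS d_end
    FROM
        url_start
)
SELECT
    *,
    SUBSTRING(myURL, d_start, d_end - d_start) AS domain
FROM
    url_end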
You can use the Replace function in SQL to remove the www. prefix; if it is not present, the string is left as it is.
Select Replace(URLColumn, 'www.', '') as [CleanURLColumn]
From YourTable
Edit
Sorry, I missed the ending. Based on the sample data you provided, this will extract the name:
Select Case
    When CharIndex('.', Replace(URL, 'www.', '')) > 0 then
        Left(Replace(URL, 'www.', ''), CharIndex('.', Replace(URL, 'www.', '')) - 1)
    Else
        Replace(URL, 'www.', '')
    End as [CleanURL]
From dbo.YourTable
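One thing to watch with Replace: it removes every occurrence of 'www.', not only a leading one. If that matters for your data, a prefix-only variant is possible. The following is just a sketch for SQL Server; the table variable @t and the value 'mywww.example.com' are made up for illustration and are not from the question.

```sql
-- Hypothetical sample rows; 'mywww.example.com' is invented to show the difference.
DECLARE @t TABLE (URL varchar(100));

INSERT INTO @t (URL)
VALUES ('www.google.com.sg'),
       ('mywww.example.com'),
       ('bing.com');

SELECT
    URL,
    -- Removes 'www.' wherever it occurs in the string
    REPLACE(URL, 'www.', '') AS replaced_anywhere,
    -- Removes the first four characters only when the value starts with 'www.'
    CASE WHEN URL LIKE 'www.%' THEN STUFF(URL, 1, 4, '') ELSE URL END AS prefix_only
FROM @t;
```

For 'mywww.example.com', replaced_anywhere comes back as 'myexample.com' while prefix_only leaves the value untouched; for the other two rows both columns agree.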
;with cte as
(
    -- Strip 'www.' and append a '.' so that charindex always finds one
    select replace(URL, 'www.', '') + '.' as url
    from myTable
)
select
    left(url, charindex('.', url) - 1)
from cte
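If you want to try this against the sample values from the question without creating a table, something like the following should work on SQL Server. It is only a sketch: the table variable @myTable and the aliases url_dot and name are assumptions chosen for illustration.

```sql
-- Sample values copied from the question, loaded into a table variable.
DECLARE @myTable TABLE (URL varchar(100));

INSERT INTO @myTable (URL)
VALUES ('www.google.com.sg'),
       ('www.yahoo.com'),
       ('marketwatch'),
       ('bing.com'),
       ('bbc.co.uk');

WITH cte AS
(
    -- Strip 'www.' and append a '.' so CHARINDEX always finds a match,
    -- even for values such as 'marketwatch' that contain no dot.
    SELECT URL, REPLACE(URL, 'www.', '') + '.' AS url_dot
    FROM @myTable
)
SELECT
    URL,
    LEFT(url_dot, CHARINDEX('.', url_dot) - 1) AS name
FROM cte;
```

With these five rows the name column should come back as google, yahoo, marketwatch, bing and bbc.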