将数据从SQL Server中的html列传输并清理到其他关系表



我有一个包含HTML数据的表,如下所示:

SELECT 
[ID],
[title],
[authors_html],
[authors_text]
FROM 
[wiley].[dbo].[library]

authors_html列数据示例:

<div class="accordion-tabbed__tab-mobile accordion__closed">
<a href="http://185.141.105.238/action/doSearch?ContribAuthorStored=PRICE%2C+M+A" class="author-name accordion-tabbed__control" data-id="a1" data-db-target-for="a1" aria-controls="a1" aria-haspopup="true" id="a1_Ctrl" role="button"><span>M. A. PRICE</span><i aria-hidden="true" class="icon-section_arrow_d"></i></a>
<div class="author-info accordion-tabbed__content" data-db-target-of="a1" aria-labelledby="a1_Ctrl" role="region" id="a1">
<p>Department of Mechanical and Manufacturing Engineering, The Queen's University of Belfast, Belfast BT95AH, U.K.</p>
<a class="moreInfoLink" href="http://185.141.105.238/action/doSearch?ContribAuthorStored=PRICE%2C+M+A">Search for more papers by this author</a>
</div>
</div>
<div class="accordion-tabbed__tab-mobile accordion__closed">
<a href="http://185.141.105.238/action/doSearch?ContribAuthorStored=ARMSTRONG%2C+C+G" class="author-name accordion-tabbed__control" data-id="a2" data-db-target-for="a2" aria-controls="a2" aria-haspopup="true" id="a2_Ctrl" role="button"><span>C. G. ARMSTRONG</span><i aria-hidden="true" class="icon-section_arrow_d"></i></a>
<div class="author-info accordion-tabbed__content" data-db-target-of="a2" aria-labelledby="a2_Ctrl" role="region" id="a2">
<p>Department of Mechanical and Manufacturing Engineering, The Queen's University of Belfast, Belfast BT95AH, U.K.</p>
<a class="moreInfoLink" href="http://185.141.105.238/action/doSearch?ContribAuthorStored=ARMSTRONG%2C+C+G">Search for more papers by this author</a>
</div>
</div>

我需要将此列数据传输到Researcher表:

空http://185.141.105.238/action/doSearch?ContribAuthorStored=HUANG%2C+MIN-WEI空NULL
1007M。A.价格为空http://185.141.105.238/action/doSearch?ContribAuthorStored=PRICE%2C+M+A英国贝尔法斯特BT95AH女王大学机械与制造工程系
1008C。G.ARMSTRONGhttp://185.141.105.238/action/doSearch?ContribAuthorStored=ARMSTRONG%2C+C+G英国贝尔法斯特女王大学机械与制造工程系,贝尔法斯特BT95AH
1009B。BOROOMANDhttp://185.141.105.238/action/doSearch?ContribAuthorStored=BOROOMAND%2C+B英国斯旺西威尔士大学工程数值方法研究所
1010O。C.ZIENKIEWICZhttp://185.141.105.238/action/doSearch?ContribAuthorStored=ZIENKIEWICZ%2C+O+C英国斯旺西威尔士大学工程数值方法研究所英国
1011http://185.141.105.238/action/doSearch?ContribAuthorStored=JIAO%2C+赵平中国建筑大学土木工程系,中国广州5104051012http://185.141.105.238/action/doSearch?ContribAuthorStored=PIAN%2C+THEODORE+H+H美国马萨诸塞州剑桥市麻省理工学院航空航天系美国
1013沈勇http://185.141.105.238/action/doSearch?ContribAuthorStored=YONG%2C+沈中国科学技术大学现代力学系,中国河北省
11014MIN‐WEI HUANGNULL优化设计实验室,美国爱荷华州爱荷华市爱荷华大学工程学院,IA 52242
1015JASBIR S.ARORAhttp://185.141.105.238/action/doSearch?ContribAuthorStored=ARORA%2C+JASBIR+S优化设计实验室,美国爱荷华州爱荷华市爱荷华大学工程学院,IA 52242
1016C。S.TSAINULLhttp://185.141.105.238/action/doSearch?ContribAuthorStored=TSAI%2C+C+S中华民国台湾台中逢甲大学土木工程系中华人民共和国逢甲大学NULL

这是您的起点。

它适用于SQL Server 2016以后的版本。

SQL

DECLARE @HtmlTbl TABLE (ID INT IDENTITY, Html XML);
INSERT INTO @HtmlTbl(Html) VALUES('<div class="accordion-tabbed__tab-mobile accordion__closed">
<a href="http://185.141.105.238/action/doSearch?ContribAuthorStored=PRICE%2C+M+A"
class="author-name accordion-tabbed__control" data-id="a1"
data-db-target-for="a1" aria-controls="a1" aria-haspopup="true"
id="a1_Ctrl" role="button">
<span>M. A. PRICE</span>
<i aria-hidden="true" class="icon-section_arrow_d"></i>
</a>
<div class="author-info accordion-tabbed__content"
data-db-target-of="a1" aria-labelledby="a1_Ctrl" role="region"
id="a1">
<p>Department of Mechanical and Manufacturing Engineering, The Queen''s University of Belfast, Belfast BT95AH, U.K.</p>
<a class="moreInfoLink"
href="http://185.141.105.238/action/doSearch?ContribAuthorStored=PRICE%2C+M+A">Search for more papers by this author</a>
</div>
</div>
<div class="accordion-tabbed__tab-mobile accordion__closed">
<a href="http://185.141.105.238/action/doSearch?ContribAuthorStored=ARMSTRONG%2C+C+G"
class="author-name accordion-tabbed__control" data-id="a2"
data-db-target-for="a2" aria-controls="a2" aria-haspopup="true"
id="a2_Ctrl" role="button">
<span>C. G. ARMSTRONG</span>
<i aria-hidden="true" class="icon-section_arrow_d"></i>
</a>
<div class="author-info accordion-tabbed__content"
data-db-target-of="a2" aria-labelledby="a2_Ctrl" role="region"
id="a2">
<p>Department of Mechanical and Manufacturing Engineering, The Queen''s University of Belfast, Belfast BT95AH, U.K.</p>
<a class="moreInfoLink"
href="http://185.141.105.238/action/doSearch?ContribAuthorStored=ARMSTRONG%2C+C+G">Search for more papers by this author</a>
</div>
</div>');
-- INSERT INTO dbo.Researcher (Full_Name, [URL], [Address], University, Country) -- uncommemnt when you are ready
SELECT ID
, c.value('(a/span/text())[1]', 'nvarchar(50)') AS Full_Name
, c.value('(div/a/@href)[1]', 'nvarchar(max)') AS [URL]
, c.value('(div/p/text())[1]', 'nvarchar(max)') AS [Address]
, JSON_VALUE(x,'$[1]') AS University
, JSON_VALUE(x,'$[3]') AS Country
-- continue with the rest
FROM @HtmlTbl
CROSS APPLY Html.nodes('/div') AS t(c)
CROSS APPLY (VALUES ('["' + REPLACE(c.value('(div/p/text())[1]', 'nvarchar(max)'),',','","') + '"]')) AS t2(x);

输出

+----+-----------------+-----------------------------------------------------------------------------+-----------------------------------------------------------------------------------------------------------------+------------------------------------+---------+
| ID |    Full_Name    |                                     URL                                     |                                                     Address                                                     |             University             | Country |
+----+-----------------+-----------------------------------------------------------------------------+-----------------------------------------------------------------------------------------------------------------+------------------------------------+---------+
|  1 | M. A. PRICE     | http://185.141.105.238/action/doSearch?ContribAuthorStored=PRICE%2C+M+A     | Department of Mechanical and Manufacturing Engineering, The Queen's University of Belfast, Belfast BT95AH, U.K. |  The Queen's University of Belfast |  U.K.   |
|  1 | C. G. ARMSTRONG | http://185.141.105.238/action/doSearch?ContribAuthorStored=ARMSTRONG%2C+C+G | Department of Mechanical and Manufacturing Engineering, The Queen's University of Belfast, Belfast BT95AH, U.K. |  The Queen's University of Belfast |  U.K.   |
+----+-----------------+-----------------------------------------------------------------------------+-----------------------------------------------------------------------------------------------------------------+------------------------------------+---------+

相关内容

  • 没有找到相关文章

最新更新