SQL Server将XML列解析为表格式-IRS 990



表中有三列,第三列是有效的XML类型。下面列出了一个XML示例,其中包括几个标记。该XML是从IRS表格990下载的。我正在尝试以表格格式返回一些标记。我在下面完成了这个查询,它运行了,但没有返回任何值。我可能在这条路上做错了什么,但我似乎想不通。

这是我的表格结构:

CREATE TABLE dbo.XMLFilesTable
(
Id INT IDENTITY PRIMARY KEY,
FileName VARCHAR(100),
XMLData XML,
LoadedDateTime DATETIME
)
GO

以下是xml列中的xml:

<?xml version="1.0" encoding="utf-8"?>
<Return xmlns="http://www.irs.gov/efile" returnVersion="2019v5.1">
<ReturnData documentCnt="7">
<IRS990 documentId="IRS990">
<PrincipalOfficerNm>Fr Francis Pizzarelli</PrincipalOfficerNm>
<USAddress>
<AddressLine1Txt>One High Street</AddressLine1Txt>
<CityNm>Port Jefferson</CityNm>
<StateAbbreviationCd>NY</StateAbbreviationCd>
<ZIPCd>11777</ZIPCd>
</USAddress>
<GrossReceiptsAmt>6463631</GrossReceiptsAmt>
<GroupReturnForAffiliatesInd>false</GroupReturnForAffiliatesInd>
<Organization501c3Ind>X</Organization501c3Ind>
<TypeOfOrganizationCorpInd>X</TypeOfOrganizationCorpInd>
<FormationYr>1980</FormationYr>
<LegalDomicileStateCd>NY</LegalDomicileStateCd>
<ActivityOrMissionDesc>To provide help to individuals who can not find help in the form of housing, counseling, and other support so that they can eventually live productive independent lives. Hope House provides hope, care and compassion to nearly 2,000 individuals in need each month.</ActivityOrMissionDesc>
<VotingMembersGoverningBodyCnt>8</VotingMembersGoverningBodyCnt>
<VotingMembersIndependentCnt>7</VotingMembersIndependentCnt>
<TotalEmployeeCnt>122</TotalEmployeeCnt>
<TotalGrossUBIAmt>0</TotalGrossUBIAmt>
<NetUnrelatedBusTxblIncmAmt>0</NetUnrelatedBusTxblIncmAmt>
<PYContributionsGrantsAmt>3307653</PYContributionsGrantsAmt>
<CYContributionsGrantsAmt>4963545</CYContributionsGrantsAmt>
<CYProgramServiceRevenueAmt>0</CYProgramServiceRevenueAmt>
<PYInvestmentIncomeAmt>25158</PYInvestmentIncomeAmt>
<CYInvestmentIncomeAmt>122678</CYInvestmentIncomeAmt>
<PYOtherRevenueAmt>1302778</PYOtherRevenueAmt>
<CYOtherRevenueAmt>1016131</CYOtherRevenueAmt>
<PYTotalRevenueAmt>4635589</PYTotalRevenueAmt>
<CYTotalRevenueAmt>6102354</CYTotalRevenueAmt>
<CYGrantsAndSimilarPaidAmt>0</CYGrantsAndSimilarPaidAmt>
<CYBenefitsPaidToMembersAmt>0</CYBenefitsPaidToMembersAmt>
<PYSalariesCompEmpBnftPaidAmt>3294184</PYSalariesCompEmpBnftPaidAmt>
<CYSalariesCompEmpBnftPaidAmt>3352675</CYSalariesCompEmpBnftPaidAmt>
<CYTotalProfFndrsngExpnsAmt>0</CYTotalProfFndrsngExpnsAmt>
<CYTotalFundraisingExpenseAmt>501828</CYTotalFundraisingExpenseAmt>
<PYOtherExpensesAmt>1793710</PYOtherExpensesAmt>

我尝试了以下代码,它运行了,但没有返回任何结果:

DECLARE @XML AS XML, @hDoc AS INT, @SQL NVARCHAR (MAX), @rootxmlns VARCHAR(100)
SELECT @XML = XMLData FROM XMLFilesTable
SET @rootxmlns = '<Return xmlns="http://www.irs.gov/efile"/'
EXEC sp_xml_preparedocument @hDoc OUTPUT, @XML

SELECT PYContributionsGrantsAmt, CYContributionsGrantsAmt, PYInvestmentIncomeAmt
FROM OPENXML(@hDoc, 'Return/ReturnData')
WITH
(
PYContributionsGrantsAmt [BIGINT] 'PYContributionsGrantsAmt',
CYContributionsGrantsAmt [BIGINT] 'CYContributionsGrantsAmt',
PYInvestmentIncomeAmt [BIGINT] 'PYInvestmentIncomeAmt'
)
EXEC sp_xml_removedocument @hDoc
GO

我的想法是打开xml之后的路径不正确。

对此有什么想法吗?

发布的代码中有几个问题。。。

  1. SET @rootxmlns = '<Return xmlns="http://www.irs.gov/efile"/'不是一个自关闭标记,并且还试图定义一个默认命名空间,这与OPENXML有关。

  2. EXEC sp_xml_preparedocument @hDoc OUTPUT, @XML不包括@rootxmlns参数,因此efile命名空间不用于查询。

  3. OPENXML(@hDoc, 'Return/ReturnData')在XPath中不包括IRS990元素。

纠正这些问题可以通过以下方法获得结果:

DECLARE @XML AS XML, @hDoc AS INT, @SQL NVARCHAR (MAX), @rootxmlns VARCHAR(100)
SELECT @XML = XMLData FROM XMLFilesTable
SET @rootxmlns = '<Return xmlns:example="http://www.irs.gov/efile"/>'
EXEC sp_xml_preparedocument @hDoc OUTPUT, @XML, @rootxmlns

SELECT PYContributionsGrantsAmt, CYContributionsGrantsAmt, PYInvestmentIncomeAmt
FROM OPENXML(@hDoc, 'example:Return/example:ReturnData/example:IRS990')
WITH
(
PYContributionsGrantsAmt [BIGINT] 'example:PYContributionsGrantsAmt',
CYContributionsGrantsAmt [BIGINT] 'example:CYContributionsGrantsAmt',
PYInvestmentIncomeAmt [BIGINT] 'example:PYInvestmentIncomeAmt'
)
EXEC sp_xml_removedocument @hDoc

结果是:

PY出资额
3307653

保留Microsoft专有的OPENXML及其配套sp_xml_preparedocumentsp_xml_removedocument只是为了与过时的SQL Server 2000向后兼容。它们的使用减少到只有极少数的边缘案例。

从SQLServer2005开始,强烈建议重新编写SQL并将其切换到XQuery。

SQL

-- DDL and sample data population, start
DECLARE @tbl TABLE (ID INT IDENTITY PRIMARY KEY, XMLData XML);
INSERT INTO @tbl (XMLData) VALUES
('<?xml version="1.0" encoding="utf-8"?>
<Return xmlns="http://www.irs.gov/efile" returnVersion="2019v5.1">
<ReturnData documentCnt="7">
<IRS990 documentId="IRS990">
<PrincipalOfficerNm>Fr Francis Pizzarelli</PrincipalOfficerNm>
<USAddress>
<AddressLine1Txt>One High Street</AddressLine1Txt>
<CityNm>Port Jefferson</CityNm>
<StateAbbreviationCd>NY</StateAbbreviationCd>
<ZIPCd>11777</ZIPCd>
</USAddress>
<GrossReceiptsAmt>6463631</GrossReceiptsAmt>
<GroupReturnForAffiliatesInd>false</GroupReturnForAffiliatesInd>
<Organization501c3Ind>X</Organization501c3Ind>
<TypeOfOrganizationCorpInd>X</TypeOfOrganizationCorpInd>
<FormationYr>1980</FormationYr>
<LegalDomicileStateCd>NY</LegalDomicileStateCd>
<ActivityOrMissionDesc>To provide help to individuals who can not find help in the form of housing, counseling, and other support so that they can eventually live productive independent lives. Hope House provides hope, care and compassion to nearly 2,000 individuals in need each month.</ActivityOrMissionDesc>
<VotingMembersGoverningBodyCnt>8</VotingMembersGoverningBodyCnt>
<VotingMembersIndependentCnt>7</VotingMembersIndependentCnt>
<TotalEmployeeCnt>122</TotalEmployeeCnt>
<TotalGrossUBIAmt>0</TotalGrossUBIAmt>
<NetUnrelatedBusTxblIncmAmt>0</NetUnrelatedBusTxblIncmAmt>
<PYContributionsGrantsAmt>3307653</PYContributionsGrantsAmt>
<CYContributionsGrantsAmt>4963545</CYContributionsGrantsAmt>
<CYProgramServiceRevenueAmt>0</CYProgramServiceRevenueAmt>
<PYInvestmentIncomeAmt>25158</PYInvestmentIncomeAmt>
<CYInvestmentIncomeAmt>122678</CYInvestmentIncomeAmt>
<PYOtherRevenueAmt>1302778</PYOtherRevenueAmt>
<CYOtherRevenueAmt>1016131</CYOtherRevenueAmt>
<PYTotalRevenueAmt>4635589</PYTotalRevenueAmt>
<CYTotalRevenueAmt>6102354</CYTotalRevenueAmt>
<CYGrantsAndSimilarPaidAmt>0</CYGrantsAndSimilarPaidAmt>
<CYBenefitsPaidToMembersAmt>0</CYBenefitsPaidToMembersAmt>
<PYSalariesCompEmpBnftPaidAmt>3294184</PYSalariesCompEmpBnftPaidAmt>
<CYSalariesCompEmpBnftPaidAmt>3352675</CYSalariesCompEmpBnftPaidAmt>
<CYTotalProfFndrsngExpnsAmt>0</CYTotalProfFndrsngExpnsAmt>
<CYTotalFundraisingExpenseAmt>501828</CYTotalFundraisingExpenseAmt>
<PYOtherExpensesAmt>1793710</PYOtherExpensesAmt>
</IRS990>
</ReturnData>
</Return>');
WITH XMLNAMESPACES (DEFAULT 'http://www.irs.gov/efile')
SELECT ID
, PYContributionsGrantsAmt = c.value('(PYContributionsGrantsAmt/text())[1]', 'BIGINT')
, CYContributionsGrantsAmt  = c.value('(CYContributionsGrantsAmt/text())[1]', 'BIGINT')
, PYInvestmentIncomeAmt  = c.value('(PYInvestmentIncomeAmt /text())[1]', 'BIGINT')
FROM @tbl
CROSS APPLY XMLData.nodes('/Return/ReturnData/IRS990') AS t(c);

最新更新