I am trying to extract a table from an HTML file. The table looks like this:
Form 990 FYE Date Published Overall Score Stars
CN 2.1
2019-06 12/23/2020 96.98
2017-06 05/01/2018 97.46
2016-06 06/01/2017 100.00
2015-06 07/01/2016 99.98
2015-06 06/01/2016 97.87
CN 2.0
2015-06 04/01/2016 95.22
2014-06 10/01/2015 94.56
2014-06 09/01/2015 86.22
2013-06 02/01/2014 95.01
2012-06 09/01/2013 95.24
2012-06 07/01/2013 88.04
2011-06 12/01/2012 99.13
2011-06 04/01/2012 92.17
2010-06 09/20/2011 92.17
The HTML for the table is as follows:
<table class="summaryPage ratings" width="100%">
<tr>
<th align="left" scope="col">Form 990 FYE</th>
<th align="left" scope="col">Date Published</th>
<th align="center" scope="col">Overall Score</th>
<th scope="col" style="text-align: center;">Overall Rating</th>
</tr>
<tr class="methodology-2-1 current">
<td colspan="10">
<b><a href="/index.cfm?bay=content.view&cpid=2200">CN 2.1</a></b>
</td>
</tr>
<tr class="current">
<td>
2019-06
</td>
<td>
12/23/2020
</td>
<td align="center">96.98</td>
<td align="center">
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
<svg class="stars" enable-background="new 0 0 61 15" version="1.1" viewbox="0 0 61 15" x="0px" xml:space="preserve" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" y="0px">
<title>four stars</title>
<g>
<g>
<polygon clip-rule="evenodd" fill="#3499CD" fill-rule="evenodd" points="12.14,15 10.37,9.27 15,5.72 9.27,5.73 7.5,0 5.729,5.73 0,5.73 4.64,9.27 2.87,15 7.5,11.459"></polygon>
<polygon clip-rule="evenodd" fill="#3499CD" fill-rule="evenodd" points="27.14,15 25.37,9.27 30,5.72 24.27,5.73 22.5,0 20.729,5.73 15,5.73 19.64,9.27 17.87,15 22.5,11.459"></polygon>
<polygon clip-rule="evenodd" fill="#3499CD" fill-rule="evenodd" points="58.141,15 56.369,9.27 61,5.72 55.27,5.73 53.5,0 51.73,5.73 46,5.73 50.641,9.27 48.869,15 53.5,11.459"></polygon>
</g>
<polygon clip-rule="evenodd" fill="#3499CD" fill-rule="evenodd" points="42.141,15 40.369,9.27 45,5.72 39.27,5.73 37.5,0 35.73,5.73 30,5.73 34.641,9.27 32.869,15 37.5,11.459"></polygon>
</g>
</svg>
</td>
</tr>
<tr class="methodology-2-1">
<td>
2017-06
</td>
<td>
05/01/2018
</td>
<td align="center">97.46</td>
<td align="center">
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
<svg class="stars" enable-background="new 0 0 61 15" version="1.1" viewbox="0 0 61 15" x="0px" xml:space="preserve" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" y="0px">
<title>four stars</title>
<g>
<g>
<polygon clip-rule="evenodd" fill="#3499CD" fill-rule="evenodd" points="12.14,15 10.37,9.27 15,5.72 9.27,5.73 7.5,0 5.729,5.73 0,5.73 4.64,9.27 2.87,15 7.5,11.459"></polygon>
<polygon clip-rule="evenodd" fill="#3499CD" fill-rule="evenodd" points="27.14,15 25.37,9.27 30,5.72 24.27,5.73 22.5,0 20.729,5.73 15,5.73 19.64,9.27 17.87,15 22.5,11.459"></polygon>
<polygon clip-rule="evenodd" fill="#3499CD" fill-rule="evenodd" points="58.141,15 56.369,9.27 61,5.72 55.27,5.73 53.5,0 51.73,5.73 46,5.73 50.641,9.27 48.869,15 53.5,11.459"></polygon>
</g>
<polygon clip-rule="evenodd" fill="#3499CD" fill-rule="evenodd" points="42.141,15 40.369,9.27 45,5.72 39.27,5.73 37.5,0 35.73,5.73 30,5.73 34.641,9.27 32.869,15 37.5,11.459"></polygon>
</g>
</svg>
</td>
</tr>
<tr class="methodology-2-1">
<td>
2016-06
</td>
<td>
06/01/2017
</td>
<td align="center">100.00</td>
<td align="center">
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
<svg class="stars" enable-background="new 0 0 61 15" version="1.1" viewbox="0 0 61 15" x="0px" xml:space="preserve" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" y="0px">
<title>four stars</title>
<g>
<g>
<polygon clip-rule="evenodd" fill="#3499CD" fill-rule="evenodd" points="12.14,15 10.37,9.27 15,5.72 9.27,5.73 7.5,0 5.729,5.73 0,5.73 4.64,9.27 2.87,15 7.5,11.459"></polygon>
<polygon clip-rule="evenodd" fill="#3499CD" fill-rule="evenodd" points="27.14,15 25.37,9.27 30,5.72 24.27,5.73 22.5,0 20.729,5.73 15,5.73 19.64,9.27 17.87,15 22.5,11.459"></polygon>
<polygon clip-rule="evenodd" fill="#3499CD" fill-rule="evenodd" points="58.141,15 56.369,9.27 61,5.72 55.27,5.73 53.5,0 51.73,5.73 46,5.73 50.641,9.27 48.869,15 53.5,11.459"></polygon>
</g>
<polygon clip-rule="evenodd" fill="#3499CD" fill-rule="evenodd" points="42.141,15 40.369,9.27 45,5.72 39.27,5.73 37.5,0 35.73,5.73 30,5.73 34.641,9.27 32.869,15 37.5,11.459"></polygon>
</g>
</svg>
</td>
</tr>
<tr class="methodology-2-1">
<td>
2015-06
</td>
<td>
07/01/2016
</td>
<td align="center">99.98</td>
<td align="center">
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
<svg class="stars" enable-background="new 0 0 61 15" version="1.1" viewbox="0 0 61 15" x="0px" xml:space="preserve" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" y="0px">
<title>four stars</title>
<g>
<g>
<polygon clip-rule="evenodd" fill="#3499CD" fill-rule="evenodd" points="12.14,15 10.37,9.27 15,5.72 9.27,5.73 7.5,0 5.729,5.73 0,5.73 4.64,9.27 2.87,15 7.5,11.459"></polygon>
<polygon clip-rule="evenodd" fill="#3499CD" fill-rule="evenodd" points="27.14,15 25.37,9.27 30,5.72 24.27,5.73 22.5,0 20.729,5.73 15,5.73 19.64,9.27 17.87,15 22.5,11.459"></polygon>
<polygon clip-rule="evenodd" fill="#3499CD" fill-rule="evenodd" points="58.141,15 56.369,9.27 61,5.72 55.27,5.73 53.5,0 51.73,5.73 46,5.73 50.641,9.27 48.869,15 53.5,11.459"></polygon>
</g>
<polygon clip-rule="evenodd" fill="#3499CD" fill-rule="evenodd" points="42.141,15 40.369,9.27 45,5.72 39.27,5.73 37.5,0 35.73,5.73 30,5.73 34.641,9.27 32.869,15 37.5,11.459"></polygon>
</g>
</svg>
</td>
</tr>
<tr class="methodology-2-1">
<td>
<span id="cf_tooltip_28842661508586">
2015-06 <span style="color: grey;"><i aria-hidden="true" class="fa fa-info-circle"></i></span>
</span>
</td>
<td>
06/01/2016
</td>
<td align="center">97.87</td>
<td align="center">
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
<svg class="stars" enable-background="new 0 0 61 15" version="1.1" viewbox="0 0 61 15" x="0px" xml:space="preserve" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" y="0px">
<title>four stars</title>
<g>
<g>
<polygon clip-rule="evenodd" fill="#3499CD" fill-rule="evenodd" points="12.14,15 10.37,9.27 15,5.72 9.27,5.73 7.5,0 5.729,5.73 0,5.73 4.64,9.27 2.87,15 7.5,11.459"></polygon>
<polygon clip-rule="evenodd" fill="#3499CD" fill-rule="evenodd" points="27.14,15 25.37,9.27 30,5.72 24.27,5.73 22.5,0 20.729,5.73 15,5.73 19.64,9.27 17.87,15 22.5,11.459"></polygon>
<polygon clip-rule="evenodd" fill="#3499CD" fill-rule="evenodd" points="58.141,15 56.369,9.27 61,5.72 55.27,5.73 53.5,0 51.73,5.73 46,5.73 50.641,9.27 48.869,15 53.5,11.459"></polygon>
</g>
<polygon clip-rule="evenodd" fill="#3499CD" fill-rule="evenodd" points="42.141,15 40.369,9.27 45,5.72 39.27,5.73 37.5,0 35.73,5.73 30,5.73 34.641,9.27 32.869,15 37.5,11.459"></polygon>
</g>
</svg>
</td>
</tr>
<tr class="methodology-2-0">
<td colspan="10"></td>
</tr>
<tr class="">
<td colspan="10">
<b><a href="/index.cfm?bay=content.view&cpid=2200">CN 2.0</a></b>
</td>
</tr>
<tr class="">
<td>
2015-06
</td>
<td>
04/01/2016
</td>
<td align="center">95.22</td>
<td align="center">
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
<svg class="stars" enable-background="new 0 0 61 15" version="1.1" viewbox="0 0 61 15" x="0px" xml:space="preserve" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" y="0px">
<title>four stars</title>
<g>
<g>
<polygon clip-rule="evenodd" fill="#3499CD" fill-rule="evenodd" points="12.14,15 10.37,9.27 15,5.72 9.27,5.73 7.5,0 5.729,5.73 0,5.73 4.64,9.27 2.87,15 7.5,11.459"></polygon>
<polygon clip-rule="evenodd" fill="#3499CD" fill-rule="evenodd" points="27.14,15 25.37,9.27 30,5.72 24.27,5.73 22.5,0 20.729,5.73 15,5.73 19.64,9.27 17.87,15 22.5,11.459"></polygon>
<polygon clip-rule="evenodd" fill="#3499CD" fill-rule="evenodd" points="58.141,15 56.369,9.27 61,5.72 55.27,5.73 53.5,0 51.73,5.73 46,5.73 50.641,9.27 48.869,15 53.5,11.459"></polygon>
</g>
<polygon clip-rule="evenodd" fill="#3499CD" fill-rule="evenodd" points="42.141,15 40.369,9.27 45,5.72 39.27,5.73 37.5,0 35.73,5.73 30,5.73 34.641,9.27 32.869,15 37.5,11.459"></polygon>
</g>
</svg>
</td>
</tr>
<tr class="methodology-2-0">
<td>
2014-06
</td>
<td>
10/01/2015
</td>
<td align="center">94.56</td>
<td align="center">
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
<svg class="stars" enable-background="new 0 0 61 15" version="1.1" viewbox="0 0 61 15" x="0px" xml:space="preserve" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" y="0px">
<title>four stars</title>
<g>
<g>
<polygon clip-rule="evenodd" fill="#3499CD" fill-rule="evenodd" points="12.14,15 10.37,9.27 15,5.72 9.27,5.73 7.5,0 5.729,5.73 0,5.73 4.64,9.27 2.87,15 7.5,11.459"></polygon>
<polygon clip-rule="evenodd" fill="#3499CD" fill-rule="evenodd" points="27.14,15 25.37,9.27 30,5.72 24.27,5.73 22.5,0 20.729,5.73 15,5.73 19.64,9.27 17.87,15 22.5,11.459"></polygon>
<polygon clip-rule="evenodd" fill="#3499CD" fill-rule="evenodd" points="58.141,15 56.369,9.27 61,5.72 55.27,5.73 53.5,0 51.73,5.73 46,5.73 50.641,9.27 48.869,15 53.5,11.459"></polygon>
</g>
<polygon clip-rule="evenodd" fill="#3499CD" fill-rule="evenodd" points="42.141,15 40.369,9.27 45,5.72 39.27,5.73 37.5,0 35.73,5.73 30,5.73 34.641,9.27 32.869,15 37.5,11.459"></polygon>
</g>
</svg>
</td>
</tr>
<tr class="methodology-2-0">
<td>
<span id="cf_tooltip_28842661508587">
2014-06 <span style="color: grey;"><i aria-hidden="true" class="fa fa-info-circle"></i></span>
</span>
</td>
<td>
09/01/2015
</td>
<td align="center">86.22</td>
<td align="center">
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
<svg class="stars" enable-background="new 0 0 61 15" version="1.1" viewbox="0 0 61 15" x="0px" xml:space="preserve" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" y="0px">
<title>three stars</title>
<g>
<g>
<polygon clip-rule="evenodd" fill="#3499CD" fill-rule="evenodd" points="12.14,15 10.37,9.27 15,5.72 9.27,5.73 7.5,0 5.729,5.73 0,5.73 4.64,9.27 2.87,15 7.5,11.459"></polygon>
<polygon clip-rule="evenodd" fill="#3499CD" fill-rule="evenodd" points="27.14,15 25.37,9.27 30,5.72 24.27,5.73 22.5,0 20.729,5.73 15,5.73 19.64,9.27 17.87,15 22.5,11.459"></polygon>
<polygon clip-rule="evenodd" fill="#fff" fill-rule="evenodd" points="58.141,15 56.369,9.27 61,5.72 55.27,5.73 53.5,0 51.73,5.73 46,5.73 50.641,9.27 48.869,15 53.5,11.459" stroke="#CDCCCC"></polygon>
</g>
<polygon clip-rule="evenodd" fill="#3499CD" fill-rule="evenodd" points="42.141,15 40.369,9.27 45,5.72 39.27,5.73 37.5,0 35.73,5.73 30,5.73 34.641,9.27 32.869,15 37.5,11.459"></polygon>
</g>
</svg>
</td>
</tr>
<tr class="methodology-2-0">
<td>
2013-06
</td>
<td>
02/01/2014
</td>
<td align="center">95.01</td>
<td align="center">
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
<svg class="stars" enable-background="new 0 0 61 15" version="1.1" viewbox="0 0 61 15" x="0px" xml:space="preserve" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" y="0px">
<title>four stars</title>
<g>
<g>
<polygon clip-rule="evenodd" fill="#3499CD" fill-rule="evenodd" points="12.14,15 10.37,9.27 15,5.72 9.27,5.73 7.5,0 5.729,5.73 0,5.73 4.64,9.27 2.87,15 7.5,11.459"></polygon>
<polygon clip-rule="evenodd" fill="#3499CD" fill-rule="evenodd" points="27.14,15 25.37,9.27 30,5.72 24.27,5.73 22.5,0 20.729,5.73 15,5.73 19.64,9.27 17.87,15 22.5,11.459"></polygon>
<polygon clip-rule="evenodd" fill="#3499CD" fill-rule="evenodd" points="58.141,15 56.369,9.27 61,5.72 55.27,5.73 53.5,0 51.73,5.73 46,5.73 50.641,9.27 48.869,15 53.5,11.459"></polygon>
</g>
<polygon clip-rule="evenodd" fill="#3499CD" fill-rule="evenodd" points="42.141,15 40.369,9.27 45,5.72 39.27,5.73 37.5,0 35.73,5.73 30,5.73 34.641,9.27 32.869,15 37.5,11.459"></polygon>
</g>
</svg>
</td>
</tr>
<tr class="methodology-2-0">
<td>
2012-06
</td>
<td>
09/01/2013
</td>
<td align="center">95.24</td>
<td align="center">
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
<svg class="stars" enable-background="new 0 0 61 15" version="1.1" viewbox="0 0 61 15" x="0px" xml:space="preserve" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" y="0px">
<title>four stars</title>
<g>
<g>
<polygon clip-rule="evenodd" fill="#3499CD" fill-rule="evenodd" points="12.14,15 10.37,9.27 15,5.72 9.27,5.73 7.5,0 5.729,5.73 0,5.73 4.64,9.27 2.87,15 7.5,11.459"></polygon>
<polygon clip-rule="evenodd" fill="#3499CD" fill-rule="evenodd" points="27.14,15 25.37,9.27 30,5.72 24.27,5.73 22.5,0 20.729,5.73 15,5.73 19.64,9.27 17.87,15 22.5,11.459"></polygon>
<polygon clip-rule="evenodd" fill="#3499CD" fill-rule="evenodd" points="58.141,15 56.369,9.27 61,5.72 55.27,5.73 53.5,0 51.73,5.73 46,5.73 50.641,9.27 48.869,15 53.5,11.459"></polygon>
</g>
<polygon clip-rule="evenodd" fill="#3499CD" fill-rule="evenodd" points="42.141,15 40.369,9.27 45,5.72 39.27,5.73 37.5,0 35.73,5.73 30,5.73 34.641,9.27 32.869,15 37.5,11.459"></polygon>
</g>
</svg>
</td>
</tr>
<tr class="methodology-2-0">
<td>
<span id="cf_tooltip_28842661508588">
2012-06 <span style="color: grey;"><i aria-hidden="true" class="fa fa-info-circle"></i></span>
</span>
</td>
<td>
07/01/2013
</td>
<td align="center">88.04</td>
<td align="center">
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
<svg class="stars" enable-background="new 0 0 61 15" version="1.1" viewbox="0 0 61 15" x="0px" xml:space="preserve" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" y="0px">
<title>three stars</title>
<g>
<g>
<polygon clip-rule="evenodd" fill="#3499CD" fill-rule="evenodd" points="12.14,15 10.37,9.27 15,5.72 9.27,5.73 7.5,0 5.729,5.73 0,5.73 4.64,9.27 2.87,15 7.5,11.459"></polygon>
<polygon clip-rule="evenodd" fill="#3499CD" fill-rule="evenodd" points="27.14,15 25.37,9.27 30,5.72 24.27,5.73 22.5,0 20.729,5.73 15,5.73 19.64,9.27 17.87,15 22.5,11.459"></polygon>
<polygon clip-rule="evenodd" fill="#fff" fill-rule="evenodd" points="58.141,15 56.369,9.27 61,5.72 55.27,5.73 53.5,0 51.73,5.73 46,5.73 50.641,9.27 48.869,15 53.5,11.459" stroke="#CDCCCC"></polygon>
</g>
<polygon clip-rule="evenodd" fill="#3499CD" fill-rule="evenodd" points="42.141,15 40.369,9.27 45,5.72 39.27,5.73 37.5,0 35.73,5.73 30,5.73 34.641,9.27 32.869,15 37.5,11.459"></polygon>
</g>
</svg>
</td>
</tr>
<tr class="methodology-2-0">
<td>
2011-06
</td>
<td>
12/01/2012
</td>
<td align="center">99.13</td>
<td align="center">
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
<svg class="stars" enable-background="new 0 0 61 15" version="1.1" viewbox="0 0 61 15" x="0px" xml:space="preserve" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" y="0px">
<title>four stars</title>
<g>
<g>
<polygon clip-rule="evenodd" fill="#3499CD" fill-rule="evenodd" points="12.14,15 10.37,9.27 15,5.72 9.27,5.73 7.5,0 5.729,5.73 0,5.73 4.64,9.27 2.87,15 7.5,11.459"></polygon>
<polygon clip-rule="evenodd" fill="#3499CD" fill-rule="evenodd" points="27.14,15 25.37,9.27 30,5.72 24.27,5.73 22.5,0 20.729,5.73 15,5.73 19.64,9.27 17.87,15 22.5,11.459"></polygon>
<polygon clip-rule="evenodd" fill="#3499CD" fill-rule="evenodd" points="58.141,15 56.369,9.27 61,5.72 55.27,5.73 53.5,0 51.73,5.73 46,5.73 50.641,9.27 48.869,15 53.5,11.459"></polygon>
</g>
<polygon clip-rule="evenodd" fill="#3499CD" fill-rule="evenodd" points="42.141,15 40.369,9.27 45,5.72 39.27,5.73 37.5,0 35.73,5.73 30,5.73 34.641,9.27 32.869,15 37.5,11.459"></polygon>
</g>
</svg>
</td>
</tr>
<tr class="methodology-2-0">
<td>
<span id="cf_tooltip_28842661508589">
2011-06 <span style="color: grey;"><i aria-hidden="true" class="fa fa-info-circle"></i></span>
</span>
</td>
<td>
04/01/2012
</td>
<td align="center">92.17</td>
<td align="center">
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
<svg class="stars" enable-background="new 0 0 61 15" version="1.1" viewbox="0 0 61 15" x="0px" xml:space="preserve" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" y="0px">
<title>four stars</title>
<g>
<g>
<polygon clip-rule="evenodd" fill="#3499CD" fill-rule="evenodd" points="12.14,15 10.37,9.27 15,5.72 9.27,5.73 7.5,0 5.729,5.73 0,5.73 4.64,9.27 2.87,15 7.5,11.459"></polygon>
<polygon clip-rule="evenodd" fill="#3499CD" fill-rule="evenodd" points="27.14,15 25.37,9.27 30,5.72 24.27,5.73 22.5,0 20.729,5.73 15,5.73 19.64,9.27 17.87,15 22.5,11.459"></polygon>
<polygon clip-rule="evenodd" fill="#3499CD" fill-rule="evenodd" points="58.141,15 56.369,9.27 61,5.72 55.27,5.73 53.5,0 51.73,5.73 46,5.73 50.641,9.27 48.869,15 53.5,11.459"></polygon>
</g>
<polygon clip-rule="evenodd" fill="#3499CD" fill-rule="evenodd" points="42.141,15 40.369,9.27 45,5.72 39.27,5.73 37.5,0 35.73,5.73 30,5.73 34.641,9.27 32.869,15 37.5,11.459"></polygon>
</g>
</svg>
</td>
</tr>
<tr class="methodology-2-0">
<td>
2010-06
</td>
<td>
09/20/2011
</td>
<td align="center">92.17</td>
<td align="center">
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
<svg class="stars" enable-background="new 0 0 61 15" version="1.1" viewbox="0 0 61 15" x="0px" xml:space="preserve" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" y="0px">
<title>four stars</title>
<g>
<g>
<polygon clip-rule="evenodd" fill="#3499CD" fill-rule="evenodd" points="12.14,15 10.37,9.27 15,5.72 9.27,5.73 7.5,0 5.729,5.73 0,5.73 4.64,9.27 2.87,15 7.5,11.459"></polygon>
<polygon clip-rule="evenodd" fill="#3499CD" fill-rule="evenodd" points="27.14,15 25.37,9.27 30,5.72 24.27,5.73 22.5,0 20.729,5.73 15,5.73 19.64,9.27 17.87,15 22.5,11.459"></polygon>
<polygon clip-rule="evenodd" fill="#3499CD" fill-rule="evenodd" points="58.141,15 56.369,9.27 61,5.72 55.27,5.73 53.5,0 51.73,5.73 46,5.73 50.641,9.27 48.869,15 53.5,11.459"></polygon>
</g>
<polygon clip-rule="evenodd" fill="#3499CD" fill-rule="evenodd" points="42.141,15 40.369,9.27 45,5.72 39.27,5.73 37.5,0 35.73,5.73 30,5.73 34.641,9.27 32.869,15 37.5,11.459"></polygon>
</g>
</svg>
</td>
</tr>
</table>
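Note that the table itself is simple, but the HTML is somewhat messy. The data for the Stars column is found in the svg class="stars" blocks, and the rest of the data is found in blocks such as tr class="methodology-2-0". I want to extract the table and store it. Since I will be running the extraction on several thousand files, I would like to know the best way to do it. My desired output is: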
Form 990 FYE Date Published Overall Score Stars CN
2019-06 12/23/2020 96.98 X stars CN 2.1
2017-06 05/01/2018 97.46 Y star CN 2.0
2016-06 06/01/2017 100.00 .... ......
The first approach I found here did not work when I adapted it:
sumtab= soup.find('table',class_='summaryPage ratings')
sumdf = pd.DataFrame(columns=['Form 990 FYE','Date Published','Overall Score','Overall Rating'])
for row in sumtab.find_all('tr'):
    cols = row.find_all('td')
    row_list = [ data.text for data in cols ]
    temp_df = pd.DataFrame([row_list], columns = ['Form 990 FYE','Date Published','Overall Score','Overall Rating'])
    sumdf = sumdf.append(temp_df).reset_index(drop = True)
sumdf = sumdf.iloc[1:, :]
The following attempt did not work either:
table = pd.read_html(soup.find(class_="summaryPage ratings"))
print(table)
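Do you have any suggestions?
When you encounter a CN row while iterating over the table rows, you can store its value in a variable and keep appending the current CN value to each data row's list: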
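from bs4 import BeautifulSoup
import pandas as pd

# Name the parser explicitly to avoid the "no parser specified" warning
soup = BeautifulSoup(your_html, 'html.parser')
lists = []
cn = None  # most recent CN label seen (e.g. "CN 2.1")
for row in soup.find_all('tr'):
    cols = row.find_all('td')  # the header row has only <th>, so it yields []
    c = [i.text.strip() for i in cols]
    if len(c) == 1:
        # single-cell (colspan) row: a CN label row
        cn = c[0]
    elif len(c) > 1:
        # data row: the Stars cell text is the SVG <title>, e.g. "four stars"
        c = c + [cn]
        lists.append(c)
df = pd.DataFrame(lists, columns = ['Form 990 FYE','Date Published','Overall Score','Stars', 'CN'])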
Result:
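 | Form 990 FYE | Date Published | Overall Score | Stars | CN
---|---|---|---|---|---
0 | 2019-06 | 12/23/2020 | 96.98 | four stars | CN 2.1
1 | 2017-06 | 05/01/2018 | 97.46 | four stars | CN 2.1
2 | 2016-06 | 06/01/2017 | 100.00 | four stars | CN 2.1
3 | 2015-06 | 07/01/2016 | 99.98 | four stars | CN 2.1
4 | 2015-06 | 06/01/2016 | 97.87 | four stars | CN 2.1
5 | 2015-06 | 04/01/2016 | 95.22 | four stars | CN 2.0
6 | 2014-06 | 10/01/2015 | 94.56 | four stars | CN 2.0
7 | 2014-06 | 09/01/2015 | 86.22 | three stars | CN 2.0
8 | 2013-06 | 02/01/2014 | 95.01 | four stars | CN 2.0
9 | 2012-06 | 09/01/2013 | 95.24 | four stars | CN 2.0
10 | 2012-06 | 07/01/2013 | 88.04 | three stars | CN 2.0
11 | 2011-06 | 12/01/2012 | 99.13 | four stars | CN 2.0
12 | 2011-06 | 04/01/2012 | 92.17 | four stars | CN 2.0
13 | 2010-06 | 09/20/2011 | 92.17 | four stars | CN 2.0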
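Since the question mentions several thousand files, here is a rough sketch of how the same row-walking idea might be wrapped in a reusable function. Everything named here (parse_ratings, parse_folder, STAR_WORDS, SAMPLE) is made up for illustration, and the word-to-number lookup is an assumption: only "three stars" and "four stars" appear in the posted HTML, so the other entries are guesses. It returns plain dicts, which can be handed to pd.DataFrame or csv.DictWriter.

```python
from pathlib import Path

from bs4 import BeautifulSoup

# Assumed vocabulary for the SVG <title> text; only "three" and "four"
# actually occur in the HTML shown in the question.
STAR_WORDS = {"zero": 0, "one": 1, "two": 2, "three": 3, "four": 4}


def parse_ratings(html):
    """Return one dict per data row of the 'summaryPage ratings' table."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", class_="summaryPage ratings")
    if table is None:  # this file has no ratings table
        return []
    rows, cn = [], None
    for tr in table.find_all("tr"):
        tds = tr.find_all("td")
        if len(tds) == 1:
            # Single-cell row: either a CN label or an empty separator.
            label = tds[0].get_text(strip=True)
            if label:
                cn = label
        elif len(tds) == 4:
            fye, published, score = (td.get_text(strip=True) for td in tds[:3])
            # Read the rating from the SVG <title>, e.g. "four stars".
            title = tds[3].find("title")
            words = title.get_text(strip=True).split() if title else []
            stars = STAR_WORDS.get(words[0].lower()) if words else None
            rows.append({
                "Form 990 FYE": fye,
                "Date Published": published,
                "Overall Score": score,
                "Stars": stars,
                "CN": cn,
            })
    return rows


def parse_folder(folder):
    """Parse every *.html file in a folder (hypothetical layout)."""
    all_rows = []
    for path in sorted(Path(folder).glob("*.html")):
        for row in parse_ratings(path.read_text(encoding="utf-8")):
            all_rows.append({"file": path.name, **row})
    return all_rows


# Trimmed-down stand-in for the table in the question.
SAMPLE = """
<table class="summaryPage ratings">
<tr><th>Form 990 FYE</th><th>Date Published</th><th>Overall Score</th><th>Overall Rating</th></tr>
<tr><td colspan="10"><b><a href="#">CN 2.1</a></b></td></tr>
<tr><td>2019-06</td><td>12/23/2020</td><td align="center">96.98</td>
<td><svg class="stars"><title>four stars</title></svg></td></tr>
<tr><td colspan="10"></td></tr>
<tr><td colspan="10"><b><a href="#">CN 2.0</a></b></td></tr>
<tr><td>2014-06</td><td>09/01/2015</td><td align="center">86.22</td>
<td><svg class="stars"><title>three stars</title></svg></td></tr>
</table>
"""

rows = parse_ratings(SAMPLE)
for row in rows:
    print(row)
```

Collecting all rows first and building a single DataFrame at the end (for example pd.DataFrame(parse_folder(...))) is likely to scale better than appending to a DataFrame row by row, though that has not been measured here.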