Python throws "'ascii' codec can't encode character u'\xed' in position 108: ordinal not in range(128)" when parsing XML



I am running the following code in Python:

import xml.etree.ElementTree as ET
tree = ET.parse('dblp_140.xml')
root = tree.getroot()
f = open('hi', 'w')
for country in root.findall('article'):
    rank = country.find('year').text
    name = country.find('title')
    if(int(rank)>2009):
        f.write(name.text)
        f.write(':')
        auth = country.findall('author')
        for a in auth:
            #print str(a)
            f.write(a.text.encode('utf8'))  
            f.write(',')
        f.write('\n')

I get the error "'ascii' codec can't encode character u'\xed' in position 108: ordinal not in range(128)". I am trying to parse DBLP data, which looks like this:

<?xml version="1.0" encoding="iso-8859-1"?>
<dblp>
<article mdate="2011-08-25" key="journals/envsoft/ElbernSTE00">
<author>Hendrik Elbern</author>
<author>H. Schmidt</author>
<author>O. Talagrand</author>
<author>Adolf Ebel</author>
<title>4D-variational data assimilation with an adjoint air quality model for emission analysis.</title>
<pages>539-548</pages>
<year>2000</year>
<volume>15</volume>
<journal>Environmental Modelling and Software</journal>
<number>6-7</number>
<url>db/journals/envsoft/envsoft15.html#ElbernSTE00</url>
<ee>http://dx.doi.org/10.1016/S1364-8152(00)00049-9</ee>
</article>
<article mdate="2015-01-12" key="journals/envsoft/VerstegenKHF14">
<author>Judith Anne Verstegen</author>
<author>Derek Karssenberg</author>
<author>Floor van der Hilst</author>
<author>André P. C. Faaij</author>
<title>Identifying a land use change cellular automaton by Bayesian data assimilation.</title>
<pages>121-136</pages>
<year>2014</year>
<volume>53</volume>
<journal>Environmental Modelling and Software</journal>
<ee>http://dx.doi.org/10.1016/j.envsoft.2013.11.009</ee>
<url>db/journals/envsoft/envsoft53.html#VerstegenKHF14</url>
</article>
<article mdate="2014-07-15" key="journals/envsoft/FechtBB14">
<author>Daniela Fecht</author>
<author>Linda Beale</author>
<author>David Briggs</author>
<title>A GIS-based urban simulation model for environmental health analysis.</title>
<pages>1-11</pages>
<year>2014</year>
<volume>58</volume>
<journal>Environmental Modelling and Software</journal>
<ee>http://dx.doi.org/10.1016/j.envsoft.2014.03.013</ee>
<url>db/journals/envsoft/envsoft58.html#FechtBB14</url>
</article>
<article mdate="2008-03-03" key="journals/envsoft/GhenuRS08">
<author>A. Ghenu</author>
<author>J.-M. Rosant</author>
<author>J.-F. Sini</author>
<title>Dispersion of pollutants and estimation of emissions in a street canyon in Rouen, France.</title>
<pages>314-321</pages>
<year>2008</year>
<volume>23</volume>
<journal>Environmental Modelling and Software</journal>
<number>3</number>
<ee>http://dx.doi.org/10.1016/j.envsoft.2007.05.017</ee>
<url>db/journals/envsoft/envsoft23.html#GhenuRS08</url>
</article>
<article mdate="2014-07-15" key="journals/envsoft/ChaS14">
<author>YoonKyung Cha</author>
<author>Craig A. Stow</author>
<title>A Bayesian network incorporating observation error to predict phosphorus and chlorophyll a in Saginaw Bay.</title>
<pages>90-100</pages>
<year>2014</year>
<volume>57</volume>
<journal>Environmental Modelling and Software</journal>
<ee>http://dx.doi.org/10.1016/j.envsoft.2014.02.010</ee>
<url>db/journals/envsoft/envsoft57.html#ChaS14</url>
</article>
<article mdate="2015-04-10" key="journals/envsoft/PinetDS07">
<author>François Pinet</author>
<author>Magali Duboisset</author>
<author>Vincent Soulignac</author>
<title>Using UML and OCL to maintain the consistency of spatial data in environmental information systems.</title>
<pages>1217-1220</pages>
<year>2007</year>
<volume>22</volume>
<journal>Environmental Modelling and Software</journal>
<number>8</number>
<ee>http://dx.doi.org/10.1016/j.envsoft.2006.10.003</ee>
<url>db/journals/envsoft/envsoft22.html#PinetDS07</url>
</article>
<article mdate="2015-01-13" key="journals/envsoft/CastronovaG13">
<author>Anthony M. Castronova</author>
<author>Jonathan L. Goodall</author>
<title>Simulating watersheds using loosely integrated model components: Evaluation of computational scaling using OpenMI.</title>
<pages>304-313</pages>
<year>2013</year>
<volume>39</volume>
<journal>Environmental Modelling and Software</journal>
<ee>http://dx.doi.org/10.1016/j.envsoft.2012.01.020</ee>
<url>db/journals/envsoft/envsoft39.html#CastronovaG13</url>
</article>
<article mdate="2005-10-28" key="journals/envsoft/BishopHS05">
<author>Ian D. Bishop</author>
<author>R. Bruce Hull IV</author>
<author>Christian Stock</author>
<title>Supporting personal world-views in an envisioning system.</title>
<pages>1459-1468</pages>
<year>2005</year>
<volume>20</volume>
<journal>Environmental Modelling and Software</journal>
<number>12</number>
<ee>http://dx.doi.org/10.1016/j.envsoft.2004.06.014</ee>
<url>db/journals/envsoft/envsoft20.html#BishopHS05</url>
</article>
<article mdate="2015-01-12" key="journals/envsoft/LaucelliBG12">
<author>Daniele Laucelli</author>
<author>Luigi Berardi</author>
<author>Orazio Giustolisi</author>
<title>Assessing climate change and asset deterioration impacts on water distribution networks: Demand-driven or pressure-driven network modeling?</title>
<pages>206-216</pages>
<year>2012</year>
<volume>37</volume>
<journal>Environmental Modelling and Software</journal>
<ee>http://dx.doi.org/10.1016/j.envsoft.2012.04.004</ee>
<url>db/journals/envsoft/envsoft37.html#LaucelliBG12</url>
</article>
<article mdate="2005-10-28" key="journals/envsoft/GeMSH05">
<author>Ying Ge</author>
<author>Doug MacDonald</author>
<author>Sébastien Sauvé</author>
<author>William Hendershot</author>
<title>Modeling of Cd and Pb speciation in soil solutions by WinHumicV and NICA-Donnan model.</title>
<pages>353-359</pages>
<year>2005</year>
<volume>20</volume>
<journal>Environmental Modelling and Software</journal>
<number>3</number>
<ee>http://dx.doi.org/10.1016/j.envsoft.2003.12.014</ee>
<url>db/journals/envsoft/envsoft20.html#GeMSH05</url>
</article>
<article mdate="2015-06-29" key="journals/envsoft/DupasPRJDG15">
<author>Rémi Dupas</author>
<author>Virginie Parnaudeau</author>
<author>Raymond Reau</author>
<author>Marie Hélène Jeuffroy</author>
<author>Patrick Durand</author>
<author>Chantal Gascuel-Odoux</author>
<title>Integrating local knowledge and biophysical modeling to assess nitrate losses from cropping systems in drinking water protection areas.</title>
<pages>101-110</pages>
<year>2015</year>
<volume>69</volume>
<journal>Environmental Modelling and Software</journal>
<ee>http://dx.doi.org/10.1016/j.envsoft.2015.03.009</ee>
<url>db/journals/envsoft/envsoft69.html#DupasPRJDG15</url>
</article>
<article mdate="2005-10-28" key="journals/envsoft/RobertsT04">
<author>Philip J. W. Roberts</author>
<author>Xiaodong Tian</author>
<title>New experimental techniques for validation of marine discharge models.</title>
<pages>691-699</pages>
<year>2004</year>
<volume>19</volume>
<journal>Environmental Modelling and Software</journal>
<number>7-8</number>
<ee>http://dx.doi.org/10.1016/j.envsoft.2003.08.005</ee>
<url>db/journals/envsoft/envsoft19.html#RobertsT04</url>
</article>
<article mdate="2007-02-12" key="journals/envsoft/ElliottT07">
<author>A. H. Elliott</author>
<author>S. A. Trowsdale</author>
<title>A review of models for low impact urban stormwater drainage.</title>
<pages>394-405</pages>
<year>2007</year>
<volume>22</volume>
<journal>Environmental Modelling and Software</journal>
<number>3</number>
<ee>http://dx.doi.org/10.1016/j.envsoft.2005.12.005</ee>
<url>db/journals/envsoft/envsoft22.html#ElliottT07</url>
</article>
<article mdate="2010-01-05" key="journals/envsoft/WegmannCMSH09">
<author>Fabio Wegmann</author>
<author>Laurent Cavin</author>
<author>Matthew MacLeod</author>
<author>Martin Scheringer</author>
<author>Konrad Hungerbühler</author>
<title>The OECD software tool for screening chemicals for persistence and long-range transport potential.</title>
<pages>228-237</pages>
<year>2009</year>
<volume>24</volume>
<journal>Environmental Modelling and Software</journal>
<number>2</number>
<ee>http://dx.doi.org/10.1016/j.envsoft.2008.06.014</ee>
<url>db/journals/envsoft/envsoft24.html#WegmannCMSH09</url>
</article>
<article mdate="2015-11-10" key="journals/envsoft/MaciejewskaJRK15">
<author>Katarzyna Maciejewska</author>
<author>Katarzyna Juda-Rezler</author>
<author>Magdalena Reizer</author>
<author>Krzysztof Klejnowski</author>
<title>Modelling of black carbon statistical distribution and return periods of extreme concentrations.</title>
<pages>212-226</pages>
<year>2015</year>
<volume>74</volume>
<journal>Environmental Modelling and Software</journal>
<ee>http://dx.doi.org/10.1016/j.envsoft.2015.04.016</ee>
<url>db/journals/envsoft/envsoft74.html#MaciejewskaJRK15</url>
</article>

The actual file is larger than what I have pasted here. How can I fix this?

StackOverflow mangled the encoding of the XML file, so I can't really test this, but try the following:

import xml.etree.ElementTree as ET

# Tell the parser explicitly that the file is ISO-8859-1 (Latin-1).
parser = ET.XMLParser(encoding="iso-8859-1")
parser.parser.UseForeignDTD(True)
etree = ET.ElementTree()
tree = etree.parse('dblp_140.xml', parser=parser)
f = open('hi', 'w')
for country in tree.findall('article'):
    rank = country.find('year').text
    name = country.find('title')
    if int(rank) > 2009:
        # Encode the title as well: writing a unicode object directly
        # makes Python 2 fall back to the ascii codec.
        f.write(name.text.encode('utf-8'))
        f.write(':')
        auth = country.findall('author')
        for a in auth:
            print a.text.encode('utf-8')
            f.write(a.text.encode('utf-8'))
            f.write(',')
        f.write('\n')
f.close()
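
The error most likely comes from the title write rather than from the parser: in Python 2, passing a unicode object to f.write() encodes it implicitly with the ascii codec, which fails on the first character outside the 0-127 range. For comparison, here is a minimal sketch of the same extraction, assuming Python 3, where the output file is opened with an explicit encoding and no manual .encode() calls are needed. The names SAMPLE, recent_articles and write_report are hypothetical, and the two sample records (including the title) are invented stand-ins for dblp_140.xml.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical two-record stand-in for dblp_140.xml, stored as ISO-8859-1
# bytes so that it contains the non-ASCII characters that trigger the error.
SAMPLE = (
    u'<?xml version="1.0" encoding="iso-8859-1"?>\n'
    u'<dblp>\n'
    u'<article><author>Andr\xe9 P. C. Faaij</author>'
    u'<author>Derek Karssenberg</author>'
    u'<title>Un caso de estudio en Almer\xeda.</title>'
    u'<year>2014</year></article>\n'
    u'<article><author>Hendrik Elbern</author>'
    u'<title>An older article.</title><year>2000</year></article>\n'
    u'</dblp>\n'
).encode('iso-8859-1')


def recent_articles(xml_bytes, min_year=2009):
    """Yield one 'title:author,author,' string per article newer than min_year."""
    # The parser reads the encoding from the XML declaration in the bytes.
    root = ET.fromstring(xml_bytes)
    for article in root.findall('article'):
        if int(article.find('year').text) > min_year:
            authors = ''.join(a.text + ',' for a in article.findall('author'))
            yield article.find('title').text + ':' + authors


def write_report(xml_bytes, path):
    """Write the selected articles to path, one per line, encoded as UTF-8."""
    # The explicit encoding replaces the implicit ascii default that
    # caused the UnicodeEncodeError in the question.
    with io.open(path, 'w', encoding='utf-8') as f:
        for line in recent_articles(xml_bytes):
            f.write(line + '\n')
```

For the real file, ET.parse('dblp_140.xml').getroot() would take the place of ET.fromstring(xml_bytes); the rest of the loop stays the same.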
