xml parsing - SAXParser and XMLReader fail to read some tags that end with /> instead of </tag-name>



Background:

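I am working on an application that retrieves information from a website. I developed it a few months ago and everything worked fine then, but I have come back to do maintenance and continue the project.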

What I am retrieving:

view-source:http://services.runescape.com/m=news/latest_news.rss

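Please view the source to see the XML. Note that every item has a category, link, pubDate, title, description, and guid, and that only a few items have an enclosure.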

The problem:

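Every tag starts with <tag> and ends with </tag>, except the enclosures, which end with />. That is what throws off the reading process, but I don't know how to account for it, and I don't know whether it is even correctly formed XML.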

Question:

Is there a way to handle tags that do not end in the traditional </tag> way?

Below is all of my code so far. If anyone has questions, feel free to ask. Any help or comments from fellow Stack Overflow users are much appreciated.

How I am handling it:

I use SAXParserFactory and XMLReader:

            try 
            {
                /** Handling XML */
                SAXParserFactory spf = SAXParserFactory.newInstance();
                SAXParser sp = spf.newSAXParser();
                XMLReader xr = sp.getXMLReader();
                /** Send URL to parse XML Tags */
                URL sourceUrl = new URL("http://services.runescape.com/m=news/latest_news.rss");
                /**
                 * Create handler to handle XML Tags ( extends
                 * DefaultHandler )
                 */
                //MyXMLHandler myXMLHandler = new MyXMLHandler();
                xr.setContentHandler(new MyXMLHandler());
                xr.parse(new InputSource(sourceUrl.openStream()));
            } 
            catch (Exception e) 
            {
                e.printStackTrace();
                //Log.i("EXCEPTION:","HomeActivity.java, line 121 - xml not parsed");
            }
            /** Get result from MyXMLHandler XMLlist Object */
            newsList = MyXMLHandler.xMLList; 

Global variable:

private XMLList newsList;

The XMLList class:

I display the news "items" by retrieving the lists from all of the get methods in the XMLList class.

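So if there are 15 items, there should be 15 dates, categories, links, and so on. But because of the enclosure problem, the handler starts the enclosures and never ends them.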

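So I get 15 titles but only 12 dates, 12 categories, 12 links, and so on. If you search the source of the RSS feed above for the word "enclosure" (view-source:http://services.runescape.com/m=news/latest_news.rss), you will see that only 3 items have enclosures.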

public class XMLList
{
/** Variables */
private ArrayList<String> title = new ArrayList<String>();
private ArrayList<String>  detail = new ArrayList<String>();
private ArrayList<String> description = new ArrayList<String>();
private ArrayList<String> link = new ArrayList<String>();
private ArrayList<String> date = new ArrayList<String>();
private ArrayList<String> category = new ArrayList<String>();
/**
 * In Setter method default it will return arraylist change that to add
 */
public XMLList()
{
    title.clear();
    detail.clear();
    description.clear();
    link.clear();
    date.clear();
    category.clear();
}
public ArrayList<String> getTitle()
{
    return title;
}
public void setTitle(String name)
{
    this.title.add(name);
}
public ArrayList<String> getDetail()
{
    return detail;
}
public void setDetail(String detail)
{
    this.detail.add(detail);
}
public ArrayList<String> getDescription()
{
    return description;
}
public void setDescription(String description)
{
    this.description.add(description);
}
public void setLink(String link)
{
    this.link.add(link);
}
public ArrayList<String> getLink()
{
    return link;
}
public void setDate(String date)
{
    this.date.add(date);
}
public ArrayList<String> getDate()
{
    return date;
}
public void setCategory(String cat)
{
    this.category.add(cat);
}
public ArrayList<String> getCategory()
{
    return category;
}
}

The MyXMLHandler class:

public class MyXMLHandler extends DefaultHandler
{
public static XMLList xMLList;
Boolean currentElement = false;
String currentValue = null;
Boolean inTitle = false;
Boolean inDescription = false;
Boolean inItem = false;
Boolean inDate = false;
Boolean inLink = false;
Boolean inCategory = false;
StringBuilder buff = null;
public MyXMLHandler()
{
    xMLList = new XMLList();
}
// All methods auto called in this order - start, characters, end
/*
 * Called when an xml tag starts
 * imgView.setImageResource(R.drawable.newImage);
 */
@Override
public void startElement(String uri, String localName, String qName, Attributes attributes)
{
    if(xMLList == null)
    {
        xMLList = new XMLList();
    }
    if(localName.equals("item"))
    {
        inItem = true;
    }
    if (inItem) 
    {
        Log.d("START " + localName,"");
        if (localName.equals("title")) 
        {
            inTitle = true;
            buff = new StringBuilder();
        }
        if (localName.equals("description")) 
        {
            inDescription = true;
            buff = new StringBuilder();
        }
        if (localName.equals("link")) 
        {
            inLink = true;
            buff = new StringBuilder();
        }
        if (localName.equals("pubDate")) 
        {
            inDate = true;
            buff = new StringBuilder();
        }
        if (localName.equals("category")) 
        {
            inCategory = true;
            buff = new StringBuilder();
        }
    }
}
/*
 * Called when an xml tag ends
 */
@Override
public void endElement(String uri, String localName, String qName)throws SAXException
{
    if (inItem && !inTitle && !inDescription && !inLink && !inDate && !inCategory) 
    {
        Log.d("END ITEM", "");
        inItem = false;
    } 
    else if (inTitle) 
    {
        String check = buff.toString().trim();
        Log.d("TITLE:", check);
        Log.d("END " + localName,"");
        xMLList.setTitle(check);
        inTitle = false;
        buff = null;
    }
    else if (inDescription) 
    {
        String check  = buff.toString().trim();
        Log.d("DESC:", check);
        Log.d("END " + localName,"");
        xMLList.setDescription(check);
        inDescription = false;
        buff = null;
    }
    else if (inLink) 
    {
        String check  = buff.toString().trim();
        Log.d("LINK:", check);
        Log.d("END " + localName,"");
        xMLList.setLink(check);
        inLink = false;
        buff = null;
    }
    else if (inDate) 
    {
        String check  = buff.toString().trim();
        Log.d("DATE:", check);
        Log.d("END " + localName,"");
        check = check.substring(0,16);
        xMLList.setDate(check);
        inDate = false;
        buff = null;
    }
    else if(inCategory)
    {
        String check  = buff.toString().trim();
        Log.d("CATEGORY:", check);
        Log.d("END " + localName,"");
        xMLList.setCategory(check);
        inCategory = false;
        buff = null;
    }
}
/*
 * Called to get tag characters
 */
@Override
public void characters(char[] ch, int start, int length)throws SAXException
{
    if (buff != null) 
    {
        for (int i = start; i < start + length; i++) 
        {
            buff.append(ch[i]);
        }
    }
}
}

In your endElement method, you have:

    if (inItem && !inTitle && !inDescription && !inLink && !inDate && !inCategory)
    {
        Log.d("END ITEM", "");
        inItem = false;
    }

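This sets inItem to false when the parser hits the <enclosure /> tag. A self-closing tag is well-formed XML, and SAX reports it as a startElement call immediately followed by an endElement call. At that point none of your other flags are set, so the condition above is true.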

You should only set inItem to false when you actually encounter the </item> end tag, which you detect by checking for a localName of "item" in endElement.
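The following is a minimal sketch of that check, not a drop-in replacement for your handler. The class name EndItemDemo, the SAMPLE feed, and the events list are made up for illustration, and only title and link are collected to keep it short. It assumes a namespace-aware parser so that localName is populated on a desktop JVM the way it is in your Android code.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

class EndItemDemo {
    // Hypothetical two-item feed; only the first item has an enclosure.
    static final String SAMPLE =
          "<rss><channel>"
        + "<item><title>A</title>"
        + "<enclosure url=\"a.jpg\" type=\"image/jpeg\"/>"
        + "<link>http://example.com/a</link></item>"
        + "<item><title>B</title><link>http://example.com/b</link></item>"
        + "</channel></rss>";

    static class Handler extends DefaultHandler {
        final List<String> titles = new ArrayList<>();
        final List<String> links = new ArrayList<>();
        final List<String> events = new ArrayList<>(); // trace of SAX callbacks
        private boolean inItem = false;
        private StringBuilder buff = null;

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            events.add("start " + localName);
            if (localName.equals("item")) {
                inItem = true;
            } else if (inItem && (localName.equals("title") || localName.equals("link"))) {
                buff = new StringBuilder();
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            events.add("end " + localName);
            if (localName.equals("item")) {
                // Only the real </item> end tag closes the item.
                inItem = false;
            } else if (inItem && buff != null) {
                if (localName.equals("title")) {
                    titles.add(buff.toString().trim());
                } else if (localName.equals("link")) {
                    links.add(buff.toString().trim());
                }
                buff = null;
            }
            // Any other end tag, including the one for <enclosure/>, is ignored.
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            if (buff != null) {
                buff.append(ch, start, length);
            }
        }
    }

    // Parses the given XML string and returns the populated handler.
    static Handler parse(String xml) {
        try {
            SAXParserFactory spf = SAXParserFactory.newInstance();
            spf.setNamespaceAware(true); // assumption: makes localName non-empty on a desktop JVM
            XMLReader xr = spf.newSAXParser().getXMLReader();
            Handler handler = new Handler();
            xr.setContentHandler(handler);
            xr.parse(new InputSource(new StringReader(xml)));
            return handler;
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
    }

    public static void main(String[] args) {
        Handler h = parse(SAMPLE);
        System.out.println(h.events);
        System.out.println(h.titles.size() + " titles, " + h.links.size() + " links");
    }
}
```

With this check, the end of the enclosure no longer ends the item, so the link that follows the enclosure in the first item is still collected. The events trace shows "start enclosure" directly followed by "end enclosure", which is how SAX reports a self-closing tag.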

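Using multiple lists in the XMLList class is also very fragile: as soon as one item is missing a field, the lists fall out of step with each other. You would be better off creating a class with fields that match the tags you are reading, and building a single list of objects of that class.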
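One way this could look is sketched below. NewsItem, ItemHandler, and the SAMPLE feed are hypothetical names, not something from your project, and the field set is only a guess based on the tags you listed. The sketch again assumes a namespace-aware parser and that RSS elements contain plain text only, so the text buffer can be reset on every start tag.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

class NewsItemDemo {
    // Hypothetical feed: the second item has no pubDate and no enclosure.
    static final String SAMPLE =
          "<rss><channel>"
        + "<item><title>A</title><pubDate>Mon, 01 Jan 2001</pubDate>"
        + "<enclosure url=\"a.jpg\" type=\"image/jpeg\"/></item>"
        + "<item><title>B</title></item>"
        + "</channel></rss>";

    // One object per <item>; a missing tag simply leaves its field null.
    static class NewsItem {
        String title, link, pubDate, category, description, enclosureUrl;
    }

    static class ItemHandler extends DefaultHandler {
        final List<NewsItem> items = new ArrayList<>();
        private NewsItem current;                        // non-null while inside an <item>
        private final StringBuilder text = new StringBuilder();

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            text.setLength(0); // assumption: no mixed content, so start fresh per element
            if (localName.equals("item")) {
                current = new NewsItem();
            } else if (current != null && localName.equals("enclosure")) {
                current.enclosureUrl = atts.getValue("url"); // data lives in attributes
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if (current == null) {
                return; // outside any item, e.g. the channel's own title
            }
            String value = text.toString().trim();
            switch (localName) {
                case "title":       current.title = value; break;
                case "link":        current.link = value; break;
                case "pubDate":     current.pubDate = value; break;
                case "category":    current.category = value; break;
                case "description": current.description = value; break;
                case "item":        items.add(current); current = null; break;
                default:            break; // enclosure, guid, and anything else
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length);
        }
    }

    // Parses the given XML string into a list of NewsItem objects.
    static List<NewsItem> parse(String xml) {
        try {
            SAXParserFactory spf = SAXParserFactory.newInstance();
            spf.setNamespaceAware(true); // assumption: makes localName non-empty on a desktop JVM
            XMLReader xr = spf.newSAXParser().getXMLReader();
            ItemHandler handler = new ItemHandler();
            xr.setContentHandler(handler);
            xr.parse(new InputSource(new StringReader(xml)));
            return handler.items;
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
    }

    public static void main(String[] args) {
        for (NewsItem item : parse(SAMPLE)) {
            System.out.println(item.title + " | " + item.pubDate + " | " + item.enclosureUrl);
        }
    }
}
```

Because each item carries its own fields, an item without a pubDate or an enclosure just has null in that field, and the titles can never drift out of step with the dates the way parallel lists can.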
