Python - 如何从 RSS 提要获取时区 - Python - How to get timezone from RSS feed 小贝子编程网

我需要获取RSS提要的已发布字段，并且我需要知道时区是什么。我以 UTC 格式存储日期，并且我希望另一个字段来存储时区，以便以后可以操作日期时间。

我当前的代码如下：

for entry in feed['entries']:
    if hasattr(entry, 'published'):
        if isinstance(entry.published_parsed, struct_time):
            dt = datetime(*entry.published_parsed[:-3])

dt 的最终值是 UTC 格式的正确日期时间，但我还需要获取原始时区。谁能帮忙？

编辑：

为了将来参考，即使它不是我原始问题的一部分，如果您需要操作非标准时区（如 est），您需要根据您的规范制作一个转换表。感谢这个答案：在 Python 中使用时区缩写名称解析日期/时间字符串？

您可以使用parser.parse dateutil包的方法。

例如，对于 statckoverflow：

import feedparser
from dateutil import parser, tz
url = 'http://stackoverflow.com/feeds/tag/python'
feed = feedparser.parse(url)
published = feed.entries[0].published
dt = parser.parse(published)
print(published)
print(dt) # that is timezone aware
print(dt.utcoffset()) # time zone of time
print(dt.astimezone(tz.tzutc())) # that is timezone aware as UTC
2012-11-28T19:07:32Z
2012-11-28 19:07:32+00:00
0:00:00
2012-11-28 19:07:32+00:00

您可以看到published以 Z 结尾，这意味着时区采用 UTC：

在提要解析器中查看它的日期格式历史记录：

Atom 1.0 states that all date elements “MUST conform to the date-time 
production in RFC 3339. 
In addition, an uppercase T character MUST be used to separate date and time, 
and an uppercase Z character MUST be present in the absence of 
a numeric time zone offset.”

再举一个例子：

import feedparser
from dateutil import parser, tz
url = 'http://omidraha.com/rss/'
feed = feedparser.parse(url)
published = feed.entries[0].published
dt = parser.parse(published)
print(published)
print(dt) # that is timezone aware
print(dt.utcoffset()) # time zone of time
print(dt.astimezone(tz.tzutc())) # that is timezone aware as UTC
Thu, 26 Dec 2013 14:24:04 +0330
2013-12-26 14:24:04+03:30
3:30:00
2013-12-26 10:54:04+00:00

但它也取决于接收的时间数据的格式和馈送的类型。

Python - 如何从 RSS 提要获取时区

相关内容

最新更新

热门标签：