By calling the Google Dictionary API from Emacs, http://www.google.com/dictionary/json?callback=cb&q=word&sl=en&tl=en&restrict=pr%%2Cde&client=te, I get a response like the following:
"entries": [{
"type": "example",
"terms": [{
"type": "text",
"text": "his grandfather\x27s \x3cem\x3ewords\x3c/em\x3e had been meant kindly",
"language": "en"
}]
}]
As you can see, the "text" field contains escaped Unicode characters. I would like to convert them with a function like the one below.
(defun unescape-string (string)
  "Return STRING with its escaped Unicode characters unescaped."
  ...
  )
(unescape-string "his grandfather\\x27s \\x3cem\\x3ewords\\x3c/em\\x3e")
=> "his grandfather's <em>words</em>"
(insert #x27) ; inserts '
(insert #x3c) ; inserts <
(insert #x3e) ; inserts >
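Here is what I have tried:
- replace-regexp-in-string
- a custom replacement, as in http://www.emacswiki.org/emacs/ElispCookbook#toc33
However, I don't think I know how to replace "\x123" with the corresponding Unicode character in a buffer or a string.
Thanks in advance.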
The simplest way:
(read (princ "\"his grandfather\\x27s \\x3cem\\x3ewords\\x3c/em\\x3e had been meant kindly\""))
;; "his grandfather's ώm>words</em> had been meant kindly"
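It is also interesting that Emacs parses `\x3ce` rather than `\x3c`. I am not sure whether this is a bug or the expected behavior. I always thought that no more than two characters should be read after `\x`...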
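To make the behavior concrete, here is a small sketch of how the Lisp reader treats the escape. As far as I can tell from the Emacs Lisp manual, `\x` accepts any number of hex digits, and a backslash followed by a space ends the escape without adding a character, so this looks like the expected behavior rather than a bug.

```elisp
;; The reader consumes every hex digit after \x, so "\x3ce" is the
;; single character U+03CE, not "<" followed by "e".
(length "\x3ce")            ; one character
(string= "\x3ce" (string #x3ce))

;; Backslash-space terminates the hex escape and contributes nothing
;; to the string, so this is "<" followed by "e".
(string= "\x3c\ e" "<e")
```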
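If you still want to use the `read` + `princ` combination, you need to add a backslash followed by a space to stop Emacs from reading further hex digits, like this: `\x3c\ e`. Alternatively, here is something I could come up with quickly: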
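(defun replace-c-escape-codes (input)
  "Return INPUT with each two-digit \\xHH escape replaced by its character."
  (replace-regexp-in-string
   ;; Four backslashes in the string give a regexp matching one literal backslash.
   "\\\\x[[:xdigit:]][[:xdigit:]]"
   (lambda (match)
     ;; Drop the leading "\x" and parse the two remaining digits as hex.
     (make-string 1 (string-to-number (substring match 2) 16)))
   ;; FIXEDCASE and LITERAL: insert the replacement exactly as returned.
   input t t))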
(replace-c-escape-codes "his grandfather\\x27s \\x3cem\\x3ewords\\x3c/em\\x3e")
;; => "his grandfather's <em>words</em>"
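Since the question also mentions replacing the escapes in a buffer, the same regexp can be applied to buffer text instead of a string. The following is only a minimal sketch: `unescape-hex-in-region` is a hypothetical name, and it assumes, like the function above, that every escape has exactly two hex digits.

```elisp
(defun unescape-hex-in-region (start end)
  "Replace each \\xHH escape between START and END with its character.
Hypothetical helper; assumes exactly two hex digits per escape."
  (save-excursion
    (save-restriction
      (narrow-to-region start end)
      (goto-char (point-min))
      (while (re-search-forward "\\\\x\\([[:xdigit:]]\\{2\\}\\)" nil t)
        ;; FIXEDCASE and LITERAL are both t, so the character is
        ;; inserted exactly as computed.
        (replace-match (string (string-to-number (match-string 1) 16))
                       t t)))))

;; Usage: unescape the contents of a temporary buffer.
(with-temp-buffer
  (insert "his grandfather\\x27s \\x3cem\\x3ewords\\x3c/em\\x3e")
  (unescape-hex-in-region (point-min) (point-max))
  (buffer-string))
;; => "his grandfather's <em>words</em>"
```

Limiting the match to two digits sidesteps the `\x3ce` ambiguity seen with the reader, at the cost of not handling longer escapes.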