Haskell: read a CSV file -> load XML files from a URL -> write out a CSV file again



I am trying to

  1. Load a CSV file
  2. Read the IDs from the file
  3. Load an external XML file for each ID
  4. Read some names from the XML
  5. Write the IDs and names to a new CSV file
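
The five steps can be sketched as one small pipeline. This is only a shape sketch using base: `lookupName` is a hypothetical stand-in for the XML fetch (it does no network access), `rowId`, `renderRow` and `processRows` are names made up for illustration, and CSV parsing is reduced to splitting on the first comma, which a real CSV library would replace.

```haskell
import Data.Char (isSpace)

-- Step 2: take the ID, i.e. everything before the first comma of a line.
-- (Assumption: IDs contain no commas or quotes; a CSV library handles the general case.)
rowId :: String -> String
rowId = takeWhile (/= ',')

-- Steps 3-4: hypothetical stand-in for "fetch the XML for this ID and read a name".
-- A real version would perform an HTTP request here, hence the IO result type.
lookupName :: String -> IO String
lookupName pid = return ("name-for-" ++ pid)

-- Step 5: render one output row; the name is quoted, as in the sample input.
-- (Assumption: names contain no double quotes.)
renderRow :: String -> String -> String
renderRow pid name = pid ++ ",\"" ++ name ++ "\""

-- Steps 2-5 for every non-blank input line, run in order.
processRows :: (String -> IO String) -> [String] -> IO [String]
processRows fetch rows = mapM step (filter (not . all isSpace) rows)
  where
    step row = do
      let pid = rowId row
      name <- fetch pid
      return (renderRow pid name)

main :: IO ()
main = do
  -- Step 1 would be `lines <$> readFile "input.csv"`; inline data keeps the sketch self-contained.
  out <- processRows lookupName ["736572,\"Mount Athos\"", "6697806,\"North Aegean\""]
  mapM_ putStrLn out
```

Passing the fetch function as an argument keeps the network step separate from the row handling, so the row handling can be tried without any HTTP access.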

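I am new to Haskell and really want to learn it, but I am still at the copy-and-paste stage of understanding. I found tutorials for each part on its own, but I am having trouble combining them.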

The CSV is simple, for example:

736572,"Mount Athos"
6697806,"North Aegean"

I use cassava to read the CSV and HandsomeSoup to read the XML.

Here I try to read the IDs, load the XML, and at least print the name.

{-# LANGUAGE ScopedTypeVariables #-}
import qualified Data.ByteString.Lazy as BL
import Data.Csv
import qualified Data.Vector as V
import Text.XML.HXT.Core
import Text.HandsomeSoup
import Data.List
import Data.Char

getPlaceNames::String->String->String
getPlaceNames pid name = do
    let doc = fromUrl ("http://api.geonames.org/get?geonameId="++pid++"&username=demo")
    c<-runX $ doc >>> css "alternateNames" >>> deep getText
    return (head c)

main :: IO ()
main = do
    csvData <- BL.readFile "input.csv"
    case decode NoHeader csvData of
        Left err -> putStrLn err
        Right v -> V.forM_ v $ \( pid, name ) ->
          putStrLn $  getPlaceNames pid name

I think I am doing something wrong when I call getPlaceNames and return the name. I am not even sure whether I should use a "do" block in getPlaceNames.

The error says

 Couldn't match expected type ‘[[Char]]’
            with actual type ‘IO [String]’
In a stmt of a 'do' block:
  c <- runX $ doc >>> css "alternateNames" >>> deep getText
In the expression:
  do { let doc
             = fromUrl
                 ("http://api.geonames.org/get?geonameId="
                  ++ pid ++ "&username=demo");
       c <- runX $ doc >>> css "alternateNames" >>> deep getText;
       return (head c) }
In an equation for ‘getPlaceNames’:
    getPlaceNames pid name
      = do { let doc = ...;
             c <- runX $ doc >>> css "alternateNames" >>> deep getText;
             return (head c) }

but this is probably only one of the things I am doing wrong, since I lack an understanding of monads and binding.

Any help is appreciated, even just a pointer to the right documentation.

Cheers

bjorn
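
The error message above comes from the type signature: `getPlaceNames` promises a plain `String`, but its body runs `runX`, which is an `IO` action. Because `String` is `[Char]`, GHC reads the `do` block in the list monad, which is why it expects `[[Char]]` where it finds `IO [String]`. One way out is to return `IO String` and bind the result with `<-` in the caller. A minimal sketch of that shape, using only base: `fetchNames` is a hypothetical stand-in for the `runX $ doc >>> ...` pipeline and returns canned data, and the unused `name` argument is dropped for brevity.

```haskell
-- Hypothetical stand-in for `runX $ doc >>> css "alternateNames" >>> deep getText`.
-- Like runX, it yields its results inside IO.
fetchNames :: String -> IO [String]
fetchNames pid = return ["names for " ++ pid]

-- The result type is IO String because the body runs an IO action.
getPlaceNames :: String -> IO String
getPlaceNames pid = do
  c <- fetchNames pid
  return $ case c of
    (x:_) -> x    -- first match
    []    -> ""   -- no match; `head` would raise an exception here

main :: IO ()
main = mapM_ printNames ["736572", "6697806"]
  where
    printNames pid = do
      names <- getPlaceNames pid   -- bind the IO result first
      putStrLn names               -- then use the plain String
```

With this signature, `putStrLn $ getPlaceNames pid name` no longer type-checks either, since `putStrLn` wants a `String` and not an `IO String`; the bind inside a `do` block is what extracts the value.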

Thanks to chi, I got the whole process working. I am posting my code for anyone else who needs to do something similar.

In the end I take not just the name from the XML but several fields, so I changed getPlaceNames to getPlaceDetails.

I show the full code because it also shows how I read different fields of the XML and how I merge the alternateName elements from the XML into one string.

{-# LANGUAGE ScopedTypeVariables #-}

import qualified Data.ByteString.Lazy.Char8 as BL

import Data.Csv
import qualified Data.Vector as V
import Text.XML.HXT.Core
import Text.HandsomeSoup
import Data.List
import Data.Char

uppercase :: String -> String
uppercase = map toUpper

toLanguageStr :: (String, String) -> String
toLanguageStr (lan,name) = uppercase lan ++ ":" ++ name

getPlaceDetails::String->String->IO (Int,String,Float,Float,Float,Float,Float,Float,String,String)
getPlaceDetails pid name = do
    let doc = fromUrl ("http://api.geonames.org/get?geonameId="++pid++"&username=demo")
    id<-runX $ doc >>> css "geonameId" >>> deep getText
    name<-runX $ doc >>> css "name" >>> deep getText
    s<- runX $ doc >>> css "south" >>> deep getText
    w<- runX $ doc >>> css "west" >>> deep getText
    n<- runX $ doc >>> css "north" >>>  deep getText
    e<- runX $ doc >>> css "east" >>> deep getText
    lat<- runX $ doc >>> css "lat" >>> deep getText
    lng<- runX $ doc >>> css "lng" >>> deep getText
    translations<- runX $ doc >>> css "alternateName" >>> (getAttrValue "lang" &&& (deep getText))
    terms<- runX $ doc >>> css "alternateNames" >>> deep getText
    return ( read (head id),head name, read (head lat), read (head lng), read (head s), read (head w), read (head n), read (head e), intercalate "|" $ map toLanguageStr translations, head terms )

main :: IO ()
main = do
    csvData <- BL.readFile "input.csv"
    case decode NoHeader csvData of
        Left err -> putStrLn err
        Right v -> V.forM_ v $ \( pid, name ) -> do
            details <- getPlaceDetails pid name
            BL.appendFile "out.csv" $ encode [details]
            BL.putStrLn  (encode [details]) 
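
One caveat with the code above: `head` and `read` are partial, so an ID whose XML lacks a field, or carries a non-numeric value, raises an exception. A possible defensive variant, sketched with base only; `firstText` and `firstNumber` are hypothetical helper names, and the input lists stand for what each `runX` call returns.

```haskell
import Data.Maybe (listToMaybe)
import Text.Read (readMaybe)

-- First extracted text node, or Nothing when the selector matched nothing.
firstText :: [String] -> Maybe String
firstText = listToMaybe

-- First extracted text node parsed as a number; Nothing if absent or malformed.
firstNumber :: [String] -> Maybe Float
firstNumber xs = listToMaybe xs >>= readMaybe

main :: IO ()
main = do
  print (firstNumber ["40.15798"])   -- a well-formed value
  print (firstNumber [])             -- selector matched nothing
  print (firstNumber ["n/a"])        -- text that is not a number
  print (firstText ["Mount Athos"])
```

The caller then decides what a missing field means, for example skipping the row or writing an empty column.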

For example, the input.csv line

736572,"Mount Athos"

maps to this line in out.csv

736572,"Mount Athos",40.15798,24.33021,40.11294,23.99234,40.4563,24.40044,"KO:아토스 산|:Aftónomos Periochí Agíou Órous|:Ágion Óros|:Ágio Óros|:Athos|NO:Áthos|EN:Autonomous Monastic State of the Holy Mountain|:Avtonómos Periokhí Ayíou Órous|:Áyion Óros|:Dhioíkisis Ayíou Órous|:Hagion Oros|:Holy Athonite Republic|LINK:http://en.wikipedia.org/wiki/Mount_Athos|CA:Mont Athos|FR:Mont Athos|EN:Mount Athos|FR:République monastique du Mont Athos|EL:Αυτόνομη Μοναστική Πολιτεία Αγίου Όρους","Aftonomos Periochi Agiou Orous,Aftónomos Periochí Agíou Órous,Agio Oros,Agion Oros,Athos,Autonome Monastike Politeia Agiou Orous,Autonomous Monastic State of the Holy Mountain,Avtonomos Periokhi Ayiou Orous,Avtonómos Periokhí Ayíou Órous,Ayion Oros,Dhioikisis Ayiou Orous,Dhioíkisis Ayíou Órous,Hagion Oros,Holy Athonite Republic,Mont Athos,Mount Athos,Republique monastique du Mont Athos,République monastique du Mont Athos,atoseu san,Ágio Óros,Ágion Óros,Áthos,Áyion Óros,Αυτόνομη Μοναστική Πολιτεία Αγίου Όρους,아토스 산"
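
The ninth field of that row is the merged translations string: each alternateName becomes `LANG:name`, and the pieces are joined with `|`. The step can be tried in isolation with base only. `mergeTranslations` is a name introduced here for illustration, and the sample pairs are hand-written in the (language, name) shape that the `getAttrValue "lang" &&& deep getText` arrow yields.

```haskell
import Data.Char (toUpper)
import Data.List (intercalate)

-- Prefix a name with its upper-cased language code; an empty code leaves just ":".
toLanguageStr :: (String, String) -> String
toLanguageStr (lan, name) = map toUpper lan ++ ":" ++ name

-- Join all (language, name) pairs into one "|"-separated field.
mergeTranslations :: [(String, String)] -> String
mergeTranslations = intercalate "|" . map toLanguageStr

main :: IO ()
main = putStrLn (mergeTranslations [("no", "Athos"), ("", "Hagion Oros"), ("en", "Mount Athos")])
-- prints: NO:Athos|:Hagion Oros|EN:Mount Athos
```

Entries without a `lang` attribute come out with a bare leading colon, which matches the `:Athos`-style entries in the output row.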
