How to parse Yahoo historical CSV with Attoparsec



I'm a Haskell beginner. How can I use attoparsec to parse the file into an array of open prices, an array of high prices, and so on?

module CsvParser (
      Quote (..)
    , csvFile
    , quote
    ) where
import System.IO
import Data.Attoparsec.Text
import Data.Attoparsec.Combinator
import Data.Text (Text, unpack)
import Data.Time
import System.Locale
import Data.Maybe
data Quote = Quote {
        qTime       :: LocalTime,
        qAsk        :: Double,
        qBid        :: Double,
        qAskVolume  :: Double,
        qBidVolume  :: Double
    } deriving (Show, Eq)
csvFile :: Parser [Quote]
csvFile = do
    q <- many1 quote
    endOfInput
    return q
quote   :: Parser Quote
quote   = do
    time        <- qtime
    qcomma
    ask         <- double
    qcomma
    bid         <- double
    qcomma
    askVolume   <- double
    qcomma
    bidVolume   <- double
    endOfLine
    return $ Quote time ask bid askVolume bidVolume 
qcomma  :: Parser ()
qcomma  = do 
    char ','
    return ()
qtime   :: Parser LocalTime
qtime   = do
    tstring     <- takeTill (\x -> x == ',')
    let time    = parseTime defaultTimeLocale "%d.%m.%Y %H:%M:%S%Q" (unpack tstring)
    return $ fromMaybe (LocalTime (fromGregorian 0001 01 01) (TimeOfDay 00 00 00 )) time
--testString :: Text
--testString = "01.10.2012 00:00:00.741,1.28082,1.28077,1500000.00,1500000.00\n"
quoteParser = parseOnly quote
main = do  
    handle <- openFile "C:\\Users\\ivan\\Downloads\\0005.HK.csv" ReadMode
    contents <- hGetContents handle
    let allLines = lines contents
    map (\line -> quoteParser line) allLines
    --putStr contents  
    hClose handle

Error messages:

testhaskell.hs:89:5:
    Couldn't match type `[]' with `IO'
    Expected type: IO (Either String Quote)
      Actual type: [Either String Quote]
    In the return type of a call of `map'
    In a stmt of a 'do' block:
      map (\ line -> quoteParser line) allLines
    In the expression:
      do { handle <- openFile
                       "C:\\Users\\ivan\\Downloads\\0005.HK.csv" ReadMode;
           contents <- hGetContents handle;
           let allLines = lines contents;
           map (\ line -> quoteParser line) allLines;
           .... }
testhaskell.hs:89:37:
    Couldn't match type `[Char]' with `Text'
    Expected type: [Text]
      Actual type: [String]
    In the second argument of `map', namely `allLines'
    In a stmt of a 'do' block:
      map (\ line -> quoteParser line) allLines
    In the expression:
      do { handle <- openFile
                       "C:\\Users\\ivan\\Downloads\\0005.HK.csv" ReadMode;
           contents <- hGetContents handle;
           let allLines = lines contents;
           map (\ line -> quoteParser line) allLines;
           .... }

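The error has nothing to do with parsec or attoparsec. The line the error message points to is not an IO action, so using it as a statement in an IO do block causes the error: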

main = do
    handle <- openFile "C:\\Users\\ivan\\Downloads\\0005.HK.csv" ReadMode
    contents <- hGetContents handle
    let allLines = lines contents
    map (\line -> quoteParser line) allLines   -- <== This is not an IO action
    --putStr contents
    hClose handle

You are also ignoring the result of the map call. You should bind it to a name with let, just as you did with the result of lines.
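A minimal sketch of that fix, assuming a hypothetical parseLine that stands in for quoteParser so the example depends only on base. The pure map result is bound with let, and only the printing step is an IO action:

```haskell
import Text.Read (readEither)

-- Hypothetical stand-in for quoteParser: parses one line as a Double.
parseLine :: String -> Either String Double
parseLine = readEither

-- Pure: no IO involved, so it cannot be a statement in an IO do block.
parseAll :: String -> [Either String Double]
parseAll contents = map parseLine (lines contents)

-- IO: bind the pure result with let, then consume it with an IO action.
printResults :: String -> IO ()
printResults contents = do
    let results = parseAll contents
    mapM_ print results
```

Something like readFile path >>= printResults would then parse and print every line of a file.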

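The second error arises because you are passing a String where Text is expected. They are different types, even though both represent an ordered sequence of characters (they also have different internal representations). You can convert between the two types with pack and unpack: http://hackage.haskell.org/package/text/docs/Data-Text.html#g:5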
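As a rough illustration of the conversion, the sketch below shows two options: packing the String lines from the question into Text, or reading the file as Text from the start. The helper names linesAsText and readLinesAsText are made up for this example:

```haskell
import Data.Text (Text)
import qualified Data.Text as T
import qualified Data.Text.IO as TIO

-- Convert the String lines from the question into the Text values
-- that a Data.Attoparsec.Text parser expects.
linesAsText :: String -> [Text]
linesAsText = map T.pack . lines

-- Alternative: read the file as Text directly and split it with
-- Data.Text.lines, so no String conversion is needed at all.
readLinesAsText :: FilePath -> IO [Text]
readLinesAsText path = T.lines <$> TIO.readFile path
```

Reading the file as Text is probably the simpler choice here, since the parser already works on Text.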

Also, you should always give main an explicit type signature, main :: IO (). If you don't, it can sometimes lead to subtle problems.

That said, as others have mentioned, you should probably use a CSV parser package.

You can use the attoparsec-csv package, or look at its source code to see how to write one yourself.

The code would look something like this:

import qualified Data.Text as T
import qualified Data.Text.IO as T
import Text.ParseCSV

main :: IO ()
main = do
  txt <- T.readFile "file.csv"
  case parseCSV txt of
    Left  err -> error err
    Right csv -> mapM_ (print . mkQuote) csv

-- Convert one CSV row into a Quote (left unimplemented here).
mkQuote :: [T.Text] -> Quote
mkQuote = error "Not implemented yet"
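One possible way to fill in the row conversion is sketched below. It assumes the five-field layout and the timestamp format from the question, and it returns Either instead of a bare Quote so that malformed rows are reported rather than crashing. The names mkQuoteEither, numField and timeField are hypothetical, and the Quote type is repeated only to keep the sketch self-contained:

```haskell
import qualified Data.Text as T
import qualified Data.Text.Read as TR
import Data.Time (LocalTime, defaultTimeLocale, parseTimeM)

-- Same record as in the question.
data Quote = Quote
    { qTime      :: LocalTime
    , qAsk       :: Double
    , qBid       :: Double
    , qAskVolume :: Double
    , qBidVolume :: Double
    } deriving (Show, Eq)

-- Parse one numeric field, rejecting any trailing input.
numField :: T.Text -> Either String Double
numField t = case TR.double t of
    Right (x, rest) | T.null rest -> Right x
    Right _                       -> Left ("trailing input in " ++ show t)
    Left err                      -> Left err

-- Parse the timestamp using the format string from the question.
timeField :: T.Text -> Either String LocalTime
timeField t =
    maybe (Left ("bad timestamp " ++ show t)) Right
          (parseTimeM True defaultTimeLocale "%d.%m.%Y %H:%M:%S%Q" (T.unpack t))

-- Hypothetical Either-returning variant of mkQuote.
mkQuoteEither :: [T.Text] -> Either String Quote
mkQuoteEither [t, a, b, av, bv] =
    Quote <$> timeField t <*> numField a <*> numField b
          <*> numField av <*> numField bv
mkQuoteEither row = Left ("expected 5 fields, got " ++ show (length row))
```

With this variant, the Right branch above could use mapM_ (print . mkQuoteEither) csv and show a Left for each row that fails to convert.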
