使用Haskell类型族或gadt的模块化算法

我经常有机会在Haskell中执行模运算，其中模通常很大并且通常是素数(如2000000011)。目前，我只是使用函数像(modAdd m a b)， (modMul m a b)， (modDiv m a b)等。但这是相当不方便的，需要一个额外的参数来指定和携带，并以正则积分形式和分别以mod-形式创建各种函数。

因此，创建一个像这样的新类可能是个好主意:

class Integral a => Mod a
m = 2000000011
instance Integral a => Num (Mod a) where
   ... defining (+), etc. as modulo m

那么就可以使用正则函数执行正则算术，并定义有用的结构，如

factorials :: [Mod Int]
factorials = 1:zipWith (*) factorials [1..]

但是有一个问题:所有Mod Int类型的值必须具有相同的模数。然而，我经常需要在一个程序中使用多个模(当然总是只组合相同模的值)。

我认为，但不太了解，不能肯定，这可以通过这样的事情来克服:

class Integral a => Mod Nat a

，其中Nat是一个以Peano方式编码模数的类型。这将是有利的:我可以有不同模的值，并且类型检查器将使我避免意外地组合这些值。

这样的事情是可行和有效的吗?它是否会导致编译器或RTS尝试着去构建巨大的(成功)(成功)(成功)……重复2000000011次)，如果我尝试使用该模量，使解决方案有效无效?RTS会在每次操作中检查类型匹配吗?每个价值的RTS表现形式是否会从原本只是未装箱的内容中被放大?

有更好的方法吗?

结论

感谢来自cirdec, dfeuer, user5402和tikhon-jelvis的有益评论，我了解到(毫不奇怪)我不是第一个有这个想法的人。特别是，Kiselyov和Shan最近发表了一篇论文，给出了一个实现，tikhon-jelvis在Hackage上发布了一个名为(惊喜!)模块化算法的解决方案，它使用花哨的ghc pragmas提供了更好的语义。

OPEN QUESTION (TO ME)

幕后发生了什么?特别是，一个包含百万元素的[Mod Int 2000000011]列表是否会包含大约100万个额外的2000000011副本?或者它是否编译成与单独携带模数参数的百万Int列表相同的代码?后者比较好。

性能附录

我对我正在处理的当前问题运行了一点基准测试。第一次运行使用了一个未装箱的包含10,000个元素的Int向量，并对其执行了10,000次操作:

   4,810,589,520 bytes allocated in the heap
         107,496 bytes copied during GC
       1,197,320 bytes maximum residency (1454 sample(s))
         734,960 bytes maximum slop
              10 MB total memory in use (0 MB lost due to fragmentation)
                                     Tot time (elapsed)  Avg pause  Max pause
  Gen  0      6905 colls,     0 par    0.109s   0.101s     0.0000s    0.0006s
  Gen  1      1454 colls,     0 par    0.812s   0.914s     0.0006s    0.0019s
  TASKS: 13 (1 bound, 12 peak workers (12 total), using -N11)
  SPARKS: 0 (0 converted, 0 overflowed, 0 dud, 0 GC'd, 0 fizzled)
  INIT    time    0.000s  (  0.001s elapsed)
  MUT     time    2.672s  (  2.597s elapsed)
  GC      time    0.922s  (  1.015s elapsed)
  EXIT    time    0.000s  (  0.001s elapsed)
  Total   time    3.594s  (  3.614s elapsed)
  Alloc rate    1,800,454,557 bytes per MUT second
  Productivity  74.3% of total user, 73.9% of total elapsed

对于第二次运行，我对未装箱的向量10,000 (Mod Int 1000000007)执行相同的操作。这使我的代码更简单一些，但花费了大约3倍的时间(同时具有几乎相同的内存配置文件):

   4,810,911,824 bytes allocated in the heap
         107,128 bytes copied during GC
       1,199,408 bytes maximum residency (1453 sample(s))
         736,928 bytes maximum slop
              10 MB total memory in use (0 MB lost due to fragmentation)
                                     Tot time (elapsed)  Avg pause  Max pause
  Gen  0      6906 colls,     0 par    0.094s   0.107s     0.0000s    0.0007s
  Gen  1      1453 colls,     0 par    1.516s   1.750s     0.0012s    0.0035s
  TASKS: 13 (1 bound, 12 peak workers (12 total), using -N11)
  SPARKS: 0 (0 converted, 0 overflowed, 0 dud, 0 GC'd, 0 fizzled)
  INIT    time    0.000s  (  0.001s elapsed)
  MUT     time    8.562s  (  8.323s elapsed)
  GC      time    1.609s  (  1.857s elapsed)
  EXIT    time    0.000s  (  0.001s elapsed)
  Total   time   10.172s  ( 10.183s elapsed)
  Alloc rate    561,858,315 bytes per MUT second
  Productivity  84.2% of total user, 84.1% of total elapsed

我想知道为什么会发生这种情况，如果它可以修复。尽管如此，我还是非常喜欢模块化算术包，并将在性能不是绝对关键的地方使用它。

新版本的GHC内置了类型级别的数字，这应该比使用Peano算法生成的数字更有效。您可以通过启用DataKinds来使用它们。作为奖励，您还可以获得一些不错的语法:

factorials :: [Mod Int 20]

这是否有效取决于您如何实现Mod类型。在最一般的情况下，你可能想在每次算术运算之后只使用mod。除非您处于热循环中，需要保存少量指令，否则这应该没问题。(在热循环中，无论如何，最好明确说明何时进行修改。)

我实际上在Hackage: modular-arithmetic的一个库中实现了这种类型。它有一个测试套件，但没有基准测试，所以我不能保证绝对的性能，但它没有做任何应该很慢的事情，而且对于我的目的来说已经足够快了。(不可否认，这涉及到很小的模数。)如果您尝试它并遇到性能问题，我很乐意听到它们，以便我可以尝试修复它们。

下面是一些使用Data.Reflection的工作代码:

{-# LANGUAGE Rank2Types #-}
{-# LANGUAGE FlexibleContexts #-}
import Data.Reflection
import Data.Proxy
data M a s = M a -- Note the phantom comes *after* the concrete
-- In `normalize` we're tying the knot to get the phantom types to align
-- note that reflect :: Reifies s a => forall proxy. proxy s -> a
normalize :: (Reifies s a, Integral a) => a -> M a s
normalize a = b where b = M (mod a (reflect b)) 
instance (Reifies s a, Integral a) => Num (M a s) where
  M a + M b = normalize (a + b)
  M a - M b = normalize (a - b)
  M a * M b = normalize (a * b)
  fromInteger n = normalize (fromInteger n)
  abs _     = error "abs not implemented"
  signum _  = error "sgn not implemented"
withModulus :: Integral a => a -> (forall s. Reifies s a => M a s) -> a
withModulus m ma = reify m (runM . asProxyOf ma)
  where asProxyOf :: f s -> Proxy s -> f s
        asProxyOf a _ = a
runM :: M a s -> a
runM (M a) = a
example :: (Reifies s a, Integral a) => M a s
example = normalize 3
example2 :: (Reifies s a, Integral a, Num (M a s)) => M a s
example2 = 3*3 + 5*5
mfactorial :: (Reifies s a, Integral a, Num (M a s)) => Int -> M a s
mfactorial n = product $ map fromIntegral [1..n]
test1 p n = withModulus p $ mfactorial n
madd :: (Reifies s Int, Num (M Int s)) => M Int s -> M Int s -> M Int s
madd a b = a + b
test2 :: Int -> Int -> Int -> Int
test2 p a b = withModulus p $ madd (fromIntegral a) (fromIntegral b)

OPEN QUESTION (TO ME)

性能附录

相关内容

最新更新

热门标签：