使用递归方案的表达式扩展



我有一个表示算术表达式的数据类型:

data E = Add E E | Mul E E | Var String

我想写一个扩展函数,它将表达式转换为变量乘积的总和(有点大括号扩展)。当然使用递归方案。

我只能想到一种本着"进步和保存"精神的算法。每一步的算法都会构建完全扩展的术语,因此无需重新检查。

Mul的处理让我发疯,所以我没有直接这样做,而是使用了同构类型的[[String]],并利用了已经为我实现concatconcatMap

type Poly = [Mono]
type Mono = [String]
mulMonoBy :: Mono -> Poly -> Poly
mulMonoBy x = map (x ++)
mulPoly :: Poly -> Poly -> Poly
mulPoly x = concatMap (flip mulMonoBy x)

所以我只使用cata

expandList :: E -> Poly
expandList = cata $ case
Var x -> [[x]]
Add e1 e2 = e1 ++ e2
Mul e1 e2 = mulPoly e1 e2

并转换回来:

fromPoly :: Poly -> Expr
fromPoly = foldr1 Add . map fromMono where
fromMono = foldr1 Mul . map Var

有没有明显更好的方法?

Upd:很少有困惑。

  1. 该解决方案允许多行变量名称。Add (Val "foo" (Mul (Val "foo) (Var "bar")))foo + foo * bar的表示形式。我不是在用Val "xyz"或其他东西来代表x*y*z。请注意,由于没有标量,因此完全允许重复的变量,例如"foo * foo * quux"。

  2. 我所说的产品总和是指一种"柯里化"的n元产品总和。乘积总和的简明定义是,我想要一个没有任何括号的表达式,所有括号都由关联性和优先级表示。

所以(foo * bar + bar) + (foo * bar + bar)不是产品的总和,因为中间+是总和

(foo * bar + (bar + (foo * bar + bar)))或相应的左关联版本都是正确的答案,尽管我们必须保证结合性总是左的,总是右的。所以右结合解的正确类型是

data Poly = Sum Mono Poly
| Product Mono

它与非空列表同构:NonEmpty Poly(注意Sum Mono Poly而不是Sum Poly Poly)。如果我们允许空和或产品,那么我们只得到我使用的列表表示列表。

  1. 你们也不关心性能,乘法似乎只是liftA2 (++)

我不是递归方案的专家,但由于听起来您正在尝试练习它们,希望您不会发现将使用手动递归的解决方案转换为使用递归方案的解决方案太繁琐。我将首先用混合散文和代码编写它,并在最后再次包含完整的代码,以便更轻松地复制/粘贴。

简单地使用分配属性和一点递归代数并不难。不过,在开始之前,让我们定义一个更好的结果类型,一种保证我们只能表示乘积总和的结果类型:

data Poly term = Sum (Poly term) (Poly term)
| Product (Mono term) 
deriving Show
data Mono term = Term term
| MonoMul (Mono term) (Mono term)
deriving Show

这样我们就不可能搞砸并意外产生不正确的结果,例如

(Mul (Var "x") (Add (Var "y") (Var "z")))

现在,让我们编写我们的函数。

expand :: E -> Poly String

首先,一个基本情况:扩展 Var 是微不足道的,因为它已经是乘积总和的形式。但是我们必须对其进行一些转换以使其适合我们的 Poly 结果类型:

expand (Var x) = Product (Term x)

接下来,请注意,扩展加法很容易:只需展开两个子表达式,然后将它们相加即可。

expand (Add x y) = Sum (expand x) (expand y)

乘法呢?这有点复杂,因为

Product (expand x) (expand y)

类型错误:我们不能乘多项式,只能乘以单项式。但我们确实知道如何进行代数操作,通过分配规则将多项式的乘法转换为单项式的乘法之和。就像你的问题一样,我们需要一个函数mulPoly。但是,让我们假设它存在,并在以后实现它。

expand (Mul x y) = mulPoly (expand x) (expand y)

这处理了所有情况,因此剩下的就是通过在两个多项式的项中分配乘法来实现mulPoly。我们只是一次分解一个多项式,然后将该项乘以另一个多项式中的每个项,将结果相加。

mulPoly :: Poly String -> Poly String -> Poly String
mulPoly (Product x) y = mulMonoBy x y
mulPoly (Sum a b) x = Sum (mulPoly a x) (mulPoly b x)
mulMonoBy :: Mono String -> Poly -> Poly
mulMonoBy x (Product y) = Product $ MonoMul x y
mulMonoBy x (Sum a b) = Sum (mulPoly a x') (mulPoly b x')
where x' = Product x

最后,我们可以测试它是否按预期工作:

expand (Mul (Add (Var "a") (Var "b")) (Add (Var "y") (Var "z")))
{- results in: Sum (Sum (Product (MonoMul (Term "y") (Term "a"))) 
(Product (MonoMul (Term "z") (Term "a")))) 
(Sum (Product (MonoMul (Term "y") (Term "b"))) 
(Product (MonoMul (Term "z") (Term "b"))))
-}

(a + b)(y * z) = ay + az + by + bz

我们知道这是正确的。

完整的解决方案,如上所述

data E = Add E E | Mul E E | Var String
data Poly term = Sum (Poly term) (Poly term)
| Product (Mono term) 
deriving Show
data Mono term = Term term
| MonoMul (Mono term) (Mono term)
deriving Show
expand :: E -> Poly String
expand (Var x) = Product (Term x)
expand (Add x y) = Sum (expand x) (expand y)
expand (Mul x y) = mulPoly (expand x) (expand y)
mulPoly :: Poly String -> Poly String -> Poly String
mulPoly (Product x) y = mulMonoBy x y
mulPoly (Sum a b) x = Sum (mulPoly a x) (mulPoly b x)
mulMonoBy :: Mono String -> Poly String -> Poly String
mulMonoBy x (Product y) = Product $ MonoMul x y
mulMonoBy x (Sum a b) = Sum (mulPoly a x') (mulPoly b x')
where x' = Product x
main = print $ expand (Mul (Add (Var "a") (Var "b")) (Add (Var "y") (Var "z")))

这个答案有三个部分。第一部分是我介绍我最喜欢的两个解决方案的摘要,是最重要的部分。第二部分包含类型和导入,以及对解决方案的扩展评论。第三部分侧重于重新关联表达式的任务,这是答案的原始版本(即第二部分)没有给予应有的关注。

归根结底,我最终得到了两个值得讨论的解决方案。第一个是expandDirect(参见第三部分):

expandDirect :: E a -> E a
expandDirect = cata alg
where
alg = case
Var' s -> Var s
Add' x y -> apo coalgAdd (Add x y)
Mul' x y -> (apo coalgAdd' . apo coalgMul) (Mul x y)
coalgAdd = case
Add (Add x x') y -> Add' (Left x) (Right (Add x' y))
x -> Left <$> project x
coalgAdd' = case
Add (Add x x') y -> Add' (Left x) (Right (Add x' y))
Add x (Add y y') -> Add' (Left x) (Right (Add y y'))
x -> Left <$> project x
coalgMul = case
Mul (Add x x') y -> Add' (Right (Mul x y)) (Right (Mul x' y))
Mul x (Add y y') -> Add' (Right (Mul x y)) (Right (Mul x y'))
x -> Left <$> project x

有了它,我们从底部重建树(cata)。在每个分支上,如果我们发现无效的东西,我们会返回并重写子树(apo),根据需要重新分发和重新关联,直到所有直接子级都正确排列(apo可以做到这一点,而不必重写到最底层)。

第二种解决方案,expandMeta,是第三部分中expandFlat的简化版本。

expandMeta :: E a -> E a
expandMeta = apo coalg . cata alg
where
alg = case
Var' s -> pure (Var s)
Add' x y -> x <> y
Mul' x y -> Mul <$> x <*> y
coalg = case
x :| [] -> Left <$> project x
x :| (y:ys) -> Add' (Left x) (Right (y :| ys))

expandMeta是一种变质;也就是说,一个后跟一个变形(虽然我们在这里也使用apo,但只是一种花哨的变形,所以我想命名法仍然适用)。同态将树更改为非空列表 - 隐式处理Add的重新关联 - 列表应用用于分发乘法(就像你建议的那样)。然后,代数非常简单地将非空列表转换回具有适当形状的树。


谢谢你的问题 - 我玩得很开心!预赛:

{-# LANGUAGE LambdaCase #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE DeriveFunctor #-}
{-# LANGUAGE DeriveFoldable #-}
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
import Data.Functor.Foldable
import qualified Data.List.NonEmpty as N
import Data.List.NonEmpty (NonEmpty(..))
import Data.Semigroup
import Data.Foldable (toList)
import Data.List (nub)
import qualified Data.Map as M
import Data.Map (Map, (!))
import Test.QuickCheck
data E a = Var a | Add (E a) (E a) | Mul (E a) (E a)
deriving (Eq, Show, Functor, Foldable)
data EF a b = Var' a | Add' b b | Mul' b b
deriving (Eq, Show, Functor)
type instance Base (E a) = EF a
instance Recursive (E a) where
project = case
Var x -> Var' x
Add x y -> Add' x y
Mul x y -> Mul' x y
instance Corecursive (E a) where
embed = case
Var' x -> Var x
Add' x y -> Add x y
Mul' x y -> Mul x y

首先,我的第一个工作(如果有缺陷)尝试,它使用(非空)列表的应用实例来分发:

expandTooClever :: E a -> E a
expandTooClever = cata $ case
Var' s -> Var s
Add' x y -> Add x y
Mul' x y -> foldr1 Add (Mul <$> flatten x <*> flatten y)
where
flatten :: E a -> NonEmpty (E a)
flatten = cata $ case
Var' s -> pure (Var s)
Add' x y -> x <> y
Mul' x y -> pure (foldr1 Mul (x <> y))

expandTooClever有一个相对严重的问题:正如它所说的flatten,一个成熟的折叠,对于两个子树来说,每当它达到一个Mul时,它对Mul链都有可怕的渐近。

蛮力,最简单的可能工作解决方案,具有递归自称的代数:

expandBrute :: E a -> E a
expandBrute = cata alg
where
alg = case
Var' s -> Var s
Add' x y -> Add x y
Mul' (Add x x') y -> Add (alg (Mul' x y)) (alg (Mul' x' y))
Mul' x (Add y y') -> Add (alg (Mul' x y)) (alg (Mul' x y'))
Mul' x y -> Mul x y

需要递归调用,因为分布可能会在Mul下引入新的Add

expandBrute的一个稍微有品味的变体,递归调用被分解成一个单独的函数:

expandNotSoBrute :: E a -> E a
expandNotSoBrute = cata alg
where
alg = case
Var' s -> Var s
Add' x y -> Add x y
Mul' x y -> dis x y
dis (Add x x') y = Add (dis x y) (dis x' y)
dis x (Add y y') = Add (dis x y) (dis x y')
dis x y = Mul x y

一个被驯服的expandNotSoBrutedis变成了一个。这种措辞方式很好地表达了正在发生的事情的大局:如果你只有VarAdds,你可以自下而上地复制这棵树,而不必关心这个世界;但是,如果您遇到Mul,则必须返回并重新构建整个子树以执行分发(我想知道是否有专门的递归方案可以捕获此模式)。

expandEvert :: E a -> E a
expandEvert = cata alg
where
alg :: EF a (E a) -> E a
alg = case
Var' s -> Var s
Add' x y -> Add x y
Mul' x y -> apo coalg (x, y)
coalg :: (E a, E a) -> EF a (Either (E a) (E a, E a))
coalg (Add x x', y) = Add' (Right (x, y)) (Right (x', y))
coalg (x, Add y y') = Add' (Right (x, y)) (Right (x, y'))
coalg (x, y) = Mul' (Left x) (Left y)

apo是必要的,因为如果没有其他要分发的内容,我们希望预测最终结果。(有一种方法可以用ana编写它;但是,这需要浪费地重建Mul树而不进行更改,这导致了expandTooClever相同的渐近问题。

最后但并非最不重要的一点是,一个解决方案,既是成功实现我对expandTooClever的尝试,也是我对amalloy答案的解释。BT是一棵花园品种的二叉树,叶子上有价值。乘积用BT a表示,而乘积的总和是树的树。

expandSOP :: E a -> E a
expandSOP = cata algS . fmap (cata algP) . cata algSOP
where
algSOP :: EF a (BT (BT a)) -> BT (BT a)
algSOP = case
Var' s -> pure (pure s)
Add' x y -> x <> y
Mul' x y -> (<>) <$> x <*> y
algP :: BTF a (E a) -> E a
algP = case
Leaf' s -> Var s
Branch' x y -> Mul x y
algS :: BTF (E a) (E a) -> E a
algS = case
Leaf' x -> x
Branch' x y -> Add x y

BT及其实例:

data BT a = Leaf a | Branch (BT a) (BT a)
deriving (Eq, Show)
data BTF a b = Leaf' a | Branch' b b
deriving (Eq, Show, Functor)
type instance Base (BT a) = BTF a
instance Recursive (BT a) where
project (Leaf s) = Leaf' s
project (Branch l r) = Branch' l r
instance Corecursive (BT a) where
embed (Leaf' s) = Leaf s
embed (Branch' l r) = Branch l r
instance Semigroup (BT a) where
l <> r = Branch l r
-- Writing this, as opposed to deriving it, for the sake of illustration.
instance Functor BT where
fmap f = cata $ case
Leaf' x -> Leaf (f x)
Branch' l r -> Branch l r
instance Applicative BT where
pure x = Leaf x
u <*> v = ana coalg (u, v)
where
coalg = case
(Leaf f, Leaf x) -> Leaf' (f x)
(Leaf f, Branch xl xr) -> Branch' (Leaf f, xl) (Leaf f, xr)
(Branch fl fr, v) -> Branch' (fl, v) (fr, v)

总结一下,测试套件:

newtype TestE = TestE { getTestE :: E Char }
deriving (Eq, Show)
instance Arbitrary TestE where
arbitrary = TestE <$> sized genExpr
where
genVar = Var <$> choose ('a', 'z')
genAdd n = Add <$> genSub n <*> genSub n
genMul n = Mul <$> genSub n <*> genSub n
genSub n = genExpr (n `div` 2)
genExpr = case
0 -> genVar
n -> oneof [genVar, genAdd n, genMul n]
data TestRig b = TestRig (Map Char b) (E Char)
deriving (Show)
instance Arbitrary b => Arbitrary (TestRig b) where
arbitrary = do
e <- genExpr
d <- genDict e
return (TestRig d e)
where
genExpr = getTestE <$> arbitrary
genDict x = M.fromList . zip (keys x) <$> (infiniteListOf arbitrary)
keys = nub . toList
unsafeSubst :: Ord a => Map a b -> E a -> E b
unsafeSubst dict = fmap (dict !)
eval :: Num a => E a -> a
eval = cata $ case
Var' x -> x
Add' x y -> x + y
Mul' x y -> x * y
evalRig :: (E Char -> E Char) -> TestRig Integer -> Integer
evalRig f (TestRig d e) = eval (unsafeSubst d (f e))
mkPropEval :: (E Char -> E Char) -> TestRig Integer -> Bool
mkPropEval f = (==) <$> evalRig id <*> evalRig f
isDistributed :: E a -> Bool
isDistributed = para $ case
Add' (_, x) (_, y) -> x && y
Mul' (Add _ _, _) _ -> False
Mul' _ (Add _ _, _) -> False
Mul' (_, x) (_, y) -> x && y
_ -> True
mkPropDist :: (E Char -> E Char) -> TestE -> Bool
mkPropDist f = isDistributed . f . getTestE
main = mapM_ test
[ ("expandTooClever" , expandTooClever)
, ("expandBrute"     , expandBrute)
, ("expandNotSoBrute", expandNotSoBrute)
, ("expandEvert"     , expandEvert)
, ("expandSOP"       , expandSOP)
]
where
test (header, func) = do
putStrLn $ "Testing: " ++ header
putStr "Evaluation test:   "
quickCheck $ mkPropEval func
putStr "Distribution test: "
quickCheck $ mkPropDist func

我所说的产品总和是指一种"柯里化"的n元产品总和。乘积总和的简明定义是,我想要一个没有任何括号的表达式,所有括号都由关联性和优先级表示。

我们可以调整上面的解决方案,以便重新关联总和。最简单的方法是用NonEmpty替换expandSOP中的外部BT。鉴于那里的乘法,就像你建议的那样,liftA2 (<>),这立即起作用。

expandFlat :: E a -> E a
expandFlat = cata algS . fmap (cata algP) . cata algSOP
where
algSOP :: EF a (NonEmpty (BT a)) -> NonEmpty (BT a)
algSOP = case
Var' s -> pure (Leaf s)
Add' x y -> x <> y
Mul' x y -> (<>) <$> x <*> y
algP :: BTF a (E a) -> E a
algP = case
Leaf' s -> Var s
Branch' x y -> Mul x y
algS :: NonEmptyF (E a) (E a) -> E a
algS = case
NonEmptyF x Nothing -> x
NonEmptyF x (Just y) -> Add x y

另一种选择是使用任何其他解决方案,并在单独的步骤中重新关联分布式树中的总和。

flattenSum :: E a -> E a
flattenSum = cata alg
where
alg = case
Add' x y -> apo coalg (x, y)
x -> embed x
coalg = case
(Add x x', y) -> Add' (Left x) (Right (x', y))
(x, y) -> Add' (Left x) (Left y)

我们还可以将flattenSumexpandEvert滚动到一个函数中。请注意,当总和代数获得分布代数的结果时,它需要一个额外的情况。发生这种情况是因为,当代数从上到下进行时,我们无法确定它生成的子树是否正确关联。

-- This is written in a slightly different style than the previous functions.
expandDirect :: E a -> E a
expandDirect = cata alg
where
alg = case
Var' s -> Var s
Add' x y -> apo coalgAdd (Add x y)
Mul' x y -> (apo coalgAdd' . apo coalgMul) (Mul x y)
coalgAdd = case
Add (Add x x') y -> Add' (Left x) (Right (Add x' y))
x -> Left <$> project x
coalgAdd' = case
Add (Add x x') y -> Add' (Left x) (Right (Add x' y))
Add x (Add y y') -> Add' (Left x) (Right (Add y y'))
x -> Left <$> project x
coalgMul = case
Mul (Add x x') y -> Add' (Right (Mul x y)) (Right (Mul x' y))
Mul x (Add y y') -> Add' (Right (Mul x y)) (Right (Mul x y'))
x -> Left <$> project x

也许有一种更聪明的写法expandDirect,但我还没有弄清楚。

最新更新