约束生长树:类型族和类型类

这个问题是关于围绕生长的树的习惯用法设计抽象的。

假设我有一个非常简单的语法树:

data Expr χ  
= Lam  (Lamfam χ) Name (Ty χ) (Expr χ)
| App  (Appfam χ) (Expr χ)
| Unit (Ufam χ)
data Ty χ  
= UnitTy (Utyfam χ)
| Arrow (Arrfam χ) (Ty χ) (Ty χ)
type family Lamfam χ
type family Appfam χ
type family Ufam χ
type family Utyfam χ
type family Arrfam χ

我的问题出现在寻找一种对这组类型族施加约束的方法。本文给出了一种解决方案:约束类型:

type ForAllχ (c :: * -> Constraint) χ
= (c (LamFam χ), c (Appfam χ), c (Ufam χ), c (Utyfam χ), c (Arrfam χ))

Forallχ Show χ => Show (Ty χ) where   
show = ...
Forallχ Show χ => Show (Exp χ)  
show = ...

但是，这只有在我想对所有类型设置相同的约束时才有效。

假设，例如，我正在编写一个双向类型检查器算法，它是参数χ上的泛型。

infer env term = case term of 
...
Lam ex name ty body -> do
ret_ty <- infer (insert name ty env) body
pure $ Arr ??? a ret_ty
...

我们没有任何东西可以代入???的位置。为了解决这个问题，也许我希望有一个函数lamToArr :: (Lamfam χ) -> (Arrfam χ)，它允许类型决定扩展信息如何流经类型检查器。

infer lamtoArr env term = case term of 
...
Lam ex name ty body -> do
ret_ty <- infer lamtoArr (insert name ty env) body
pure $ Arr lamtoArr a ret_ty
...

当然，随着添加更多的函数，这可能会变得很尴尬，所以理想情况下，我希望能够使用类型类-例如:

class Checker χ where  
lamToArr :: (Lamfam χ) -> (Arrfam χ)
unitToTy :: (Ufam χ) -> (Utyfam χ)

但是，这会产生一个类型错误:

• Couldn't match type: Lamfam χ0
with: Lamfam χ
Expected: Lamfam χ -> Arrfam χ
Actual: Lamfam χ0 -> Arrfam χ0
NB: ‘Lamfam’ is a non-injective type family
The type variable ‘χ0’ is ambiguous
• In the ambiguity check for ‘lamToArr’
To defer the ambiguity check to use sites, enable AllowAmbiguousTypes
When checking the class method:
lamToArr :: forall χ. Checker χ => Lamfam χ -> Arrfam χ
In the class declaration for ‘Checker’
|
41 |   lamToArr :: (Lamfam χ) -> (Arrfam χ)
|   ^^^^^^^^
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

这个错误表明禁用歧义检查和/或使我的类型族注入(不太确定这意味着什么)可能是一个解决方案，但我想检查这是否是一个很好的解决方案，我的问题摆在首位，如果是这样，那么这两种方法中的哪一种更像是我正在寻找的?

约束的一般意图是描述特定算法/阶段的信息如何在类型之间流动，仅提供特定阶段所需的信息。这将使我能够更新或扩展树的扩展，而不必更新核心算法。

在您的情况下，(Lamfam χ) -> (Arrfam χ)是模糊的，因为不能保证Lamfam χ或Arrfam χ足以告诉您在调用站点需要哪个χ。例如，新解析的没有注释的AST可能会将它们都设置为某些data SrcLoc。

这里的AllowAmbiguousTypes没有问题。它表示您同意有时在调用站点编写类型注释或TypeApplications。可以使用TypeFamilyDependencies注释使这些类型族注入:

type family Lamfam χ = λ | λ -> χ

隐式地，类型族必须是函数，其输入χ唯一地决定其输出λ。这只是在另一个方向上增加了依赖性，即对于来自Lamfam χ应用程序的任何输出λ，必须足以告诉输入χ是什么。这是一个相当严格的要求，否定了TTG的一些预期好处(避免包装器类型)。在当前的GHC中，注入性检查也存在一些表达性问题。尽管如此，当它工作时，它确实倾向于使推理和错误消息更有帮助。创建族注入的一个常用习惯是将输入类型保留为虚类型参数。基本上，像这样替换:

newtype Name = Name String
newtype Id   = Id   Int
type family Namefam χ
type instance Namefam Parsed  = Name
type instance Namefam Renamed = Id    -- conflict
type instance Namefam Typed   = Id    -- conflict

用这样的东西:

newtype Name  = Name String
newtype Id' χ = Id'  Int
-- or: Const Id χ
type family Namefam χ
type instance Namefam 'Parsed      = Name
type instance Namefam 'Renamed     = Id' 'Renamed
type instance Namefam 'Typechecked = Id' 'Typechecked

然而，这有点误入歧途。通常使用TTG，类型检查器不会在χ中是多态的，它会通过重新注释树将χ从'Renamed更改为'Typechecked。在这种情况下，您可以使用具体类型，而不需要类型类。

您的class Checker χ仅描述您有足够信息运行类型检查器的AST状态集。所以我不知道它是否真的发挥了抽象的作用，除非你希望能够在两次传递之间重复地重新检查程序的类型，比如，确保重写不会破坏任何东西。

也就是说，在GHC中有一个值得注意的解决方案来解决大型约束集的问题，您可能会发现它很有帮助。首先，它们从Pass类型开始，索引每次传递的结果。

data Pass
= Parsed
| Renamed
| Typechecked

(只有几个，因为他们现在只在前端使用TTG，但如果你从一开始就使用TTG编写编译器，你当然可以在更多的管道中使用它。)

接下来，他们不是像HsExpr 'Parsed那样直接用这个类型索引AST和扩展点，而是使用GADTGhcPass(像Sing Pass和singletons)。

data GhcPass (c :: Pass) where
GhcPs :: GhcPass 'Parsed
GhcRn :: GhcPass 'Renamed
GhcTc :: GhcPass 'Typechecked
-- Type synonyms as a shorthand for tagging
type GhcPs   = GhcPass 'Parsed      -- Output of parser
type GhcRn   = GhcPass 'Renamed     -- Output of renamer
type GhcTc   = GhcPass 'Typechecked -- Output of typechecker

最后，他们添加了一个类IsPass(像SingI或KnownNat)，允许以统一的方式对该类型进行调度。

class IsPass p where
ghcPass :: GhcPass p
instance IsPass 'Parsed where
ghcPass = GhcPs
instance IsPass 'Renamed where
ghcPass = GhcRn
instance IsPass 'Typechecked where
ghcPass = GhcTc

特别用于美观的打印，有时用于构造树。这意味着他们可以使用IsPass p => HsExpr (GhcPass p)，而不是像ForAllχ c χ那样携带一组所有可能需要的约束。使用模式匹配提供的χ，每个c χ都可用。值得注意的是，您甚至可以以一种特别的方式使用这些约束，而不需要任何特定的类:

pprIfPs :: forall p. IsPass p => (p ~ 'Parsed => SDoc) -> SDoc
pprIfPs pp = case ghcPass @p of
GhcPs -> pp
_ -> empty

如果你想写很多传递多态代码，我发现它是到目前为止使用数据族或GADT代替TTG更容易、更清晰。编译器消息更有帮助，并且有更多的工具可用来减少样板文件，例如StandaloneDeriving。

相关内容

最新更新

热门标签：