How to avoid ambiguity in TypeScript template literal type inference?



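I'm trying to write a type that validates whether a given input string consists of valid class names separated by one or more whitespace characters. The input may also have leading and trailing whitespace.

The type I have now is very close, but the TS compiler can infer the template literal in more than one way, which means the grammar is ambiguous. This leads to unwanted results.

First, define the basic types: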

// To avoid recursion as much as possible
type Spaces = (
  | "     "
  | "    "
  | "   "
  | "  "
  | " "
);
type Whitespace = Spaces | "\n" | "\t";
type ValidClass = 'a-class' | 'b-class' | 'c-class';

Then the utility types:

// Utility type to provide nicer error messages
type Err<Message extends string> = `Error: ${Message}`;
type TrimEnd<T extends string> = (
  T extends `${infer Rest}${Whitespace}`
    ? TrimEnd<Rest>
    : T
);
type TrimStart<T extends string> = (
  T extends `${Whitespace}${infer Rest}`
    ? TrimStart<Rest>
    : T
);
type Trim<T extends string> = TrimEnd<TrimStart<T>>;

Finally, the actual type that checks the input string:

// Forces the string to be trimmed before starting the recursive loop.
type SplitToValidClasses<T extends string> = SplitToValidClassesInner<Trim<T>>;
// Splits the input string into an `Array<Token | 'Error: ...'>` of strings.
// The input is converted to an array mostly because I found arrays easier to
// work with in other TS generics than, e.g., space-separated values.
type SplitToValidClassesInner<T extends string> =
  // Does `T` contain more than one string? For example 'aaaa\n\n  bbbb'
  T extends `${infer Head}${Whitespace}${infer Tail}`
    // Yes, `T` could be inferred into three parts.
    // Is `Head` a valid class name?
    ? Trim<Head> extends ValidClass
      // Yes, it's a valid name. Continue recursively with the rest of the string,
      // but trim whitespace from both sides.
      ? [Trim<Head>, ...SplitToValidClassesInner<Trim<Tail>>]
      : [Err<`'${Head}' is not a valid class`>]
    : T extends `${infer Tail}`
      ? Tail extends ValidClass
        ? [Tail]
        : [Err<`'${Tail}' is not a valid class`>]
      : [never];
// This works
type CorrectResult = SplitToValidClasses<'a-class b-class c-class'>

But when testing with different inputs, we can see incorrect results:

// Should be ["a-class", "b-class", "c-class"]
type Input1 = `a-class b-class  c-class`;
type Result = SplitToValidClasses<Input1>;
// Should be ["a-class", "b-class", "c-class", "a-class"]
type Result2 = SplitToValidClasses<`
a-class    b-class
c-class
a-class
`>;
// Should be ["a-class", "Error: 'wrong-class' is not a valid class"]
type Result3 = SplitToValidClasses<`
a-class
wrong-class
c-class
`>;

The problem occurs in the template inference:

type SplitToValidClassesInnerFirstLevelDebug<T extends string> =
  T extends `${infer Head}${Whitespace}${infer Tail}`
    ? [Head, Whitespace, Tail]
    : never
// The grammar is ambiguous, leading to
// ["a-class b-class" | "a-class", Whitespace, "c-class" | "b-class  c-class"]
// Removing the ambiguity should fix the issue
type Result4 = SplitToValidClassesInnerFirstLevelDebug<Input1>

Playground link
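To see where the extra candidates come from, it may help to shrink the example. As far as I can tell, a union used inside a template literal pattern is distributed, so the pattern behaves like a union of patterns; each member is matched against the input separately and the inferred candidates are then combined. The sketch below uses a two-member `Sep` union and a `SplitOnce` helper, both hypothetical names that are not part of the code above.

```typescript
// Hypothetical two-member separator, standing in for the larger Whitespace union.
type Sep = " " | "  ";

// Hypothetical helper: split once on Sep and expose both inferred parts.
type SplitOnce<T extends string> =
  T extends `${infer Head}${Sep}${infer Tail}` ? [Head, Tail] : never;

// The pattern behaves like `${Head} ${Tail}` | `${Head}  ${Tail}`.
// Against "a  b" (two spaces), the one-space pattern yields Tail = " b"
// and the two-space pattern yields Tail = "b"; both candidates are kept.
type Ambiguous = SplitOnce<"a  b">; // ["a", " b" | "b"]
```

With the full `Whitespace` union the same thing happens on a larger scale, which would explain why both `Head` and `Tail` come back as unions in the debug type.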

I couldn't find much detailed documentation on how template literals are inferred, apart from what Anders Hejlsberg explained in his PR:

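For inference to succeed, the starting and ending literal character spans (if any) of the target must exactly match the starting and ending spans of the source. Inference proceeds by matching each placeholder to a substring in the source from left to right: a placeholder followed by a literal character span is matched by inferring zero or more characters from the source until the first occurrence of that literal character span in the source. A placeholder immediately followed by another placeholder is matched by inferring a single character from the source.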
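The matching rules in that description can be checked in isolation. This is only a small sketch; the helper names (`SplitOnDash`, `FirstChar`, `Unwrap`) are made up for illustration.

```typescript
// Rule 1: a placeholder followed by a literal span consumes characters
// up to the FIRST occurrence of that span.
type SplitOnDash<S extends string> =
  S extends `${infer Before}-${infer After}` ? [Before, After] : never;
type R1 = SplitOnDash<"a-b-c">; // ["a", "b-c"]

// Rule 2: a placeholder immediately followed by another placeholder
// consumes exactly one character.
type FirstChar<S extends string> =
  S extends `${infer Head}${infer Rest}` ? [Head, Rest] : never;
type R2 = FirstChar<"abc">; // ["a", "bc"]

// The starting and ending literal spans must match exactly,
// otherwise the conditional type takes its false branch.
type Unwrap<S extends string> =
  S extends `(${infer Inner})` ? Inner : never;
type R3 = Unwrap<"(x)">; // "x"
type R4 = Unwrap<"(x">;  // never
```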

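How can I implement this type without the ambiguity? One approach I thought of is to parse the input recursively, character by character, but that quickly hits the recursion limit in TS.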

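I came up with two solutions, but neither solves the original problem, because the types become too complex or too deeply recursive. The second solution is definitely more scalable than the first.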

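Solution 1: Recursive parsing

This solution parses the input string recursively. The type Split splits the input string on whitespace and returns an array of tokens (or words).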

type EndOfInput = '';
// Validates the given `UnprocessedInput` string.
// It recursively iterates through each character in the string and appends
// characters to the second type parameter `Current` until the token has been
// consumed. When the token is fully consumed, it is added to `Result` and
// `Current` is cleared.
//
// NOTE: Do not pass anything other than the first type parameter. The other
//       type parameters are for internal tracking during the recursive loop.
//
// See https://github.com/microsoft/TypeScript/pull/40336 for more template literal
// examples.
type Split<UnprocessedInput extends string, Current extends string = '', Result extends string[] = []> =
  // Have we reached the end of the input string?
  UnprocessedInput extends EndOfInput
    // Yes. Is `Current` empty?
    ? Current extends EndOfInput
      // Yes, we're done processing and there is nothing more to add to the result
      ? Result
      // No, add the last item to the result and return it
      : [...Result, Current]
    // No, use template literal inference to get the first char and the rest of the string
    : UnprocessedInput extends `${infer Head}${infer Rest}`
      // Is the next character whitespace?
      ? Head extends Whitespace
        // Yes. Is `Current` empty?
        ? Current extends EndOfInput
          // Yes, continue "eating" whitespace
          ? Split<Rest, Current, Result>
          // No, it means we went from a token to whitespace, meaning the token
          // is fully parsed and can be added to the result
          : Split<Rest, '', [...Result, Current]>
        // No, add the character to `Current`
        : Split<Rest, `${Current}${Head}`, Result>
      // This shouldn't happen since UnprocessedInput is restricted by the
      // `extends string` constraint.
      // For example, Split<null> would be a `never` type if it didn't
      // already fail with "Type 'null' does not satisfy the constraint 'string'"
      : [never];

This works for smaller inputs, but not for larger strings, because of the TS recursion limit:

type Result5 = Split<`
a   

b 
c`>
// Fails for larger string values, because of recursion limit
type Result6 = Split<`aaaaaaaaaaaaaaaaaaa
bbbbbbbbbbbbbbbbbbbbb`>

Playground link
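For readers who find the nested conditional type hard to follow, here is a value-level sketch of the same state machine. The function `splitTokens` and the constant `WHITESPACE` are hypothetical names, not part of the type-level code; the function only mirrors its three pieces of state (remaining input, current token, result) and handles one character per step, which is also why the type version needs one recursion step per input character.

```typescript
// Hypothetical value-level counterpart of the `Split` type.
// Single-character whitespace only, since input is consumed one char at a time.
const WHITESPACE = " \n\t";

function splitTokens(input: string): string[] {
  const result: string[] = [];
  let current = "";
  for (let i = 0; i < input.length; i++) {
    const ch = input.charAt(i);
    if (WHITESPACE.indexOf(ch) !== -1) {
      // Whitespace after a token: the token is complete, so flush it.
      // Whitespace with an empty `current`: just keep skipping.
      if (current !== "") {
        result.push(current);
        current = "";
      }
    } else {
      // Any other character extends the current token.
      current += ch;
    }
  }
  // End of input: flush the last token, if there is one.
  if (current !== "") {
    result.push(current);
  }
  return result;
}

console.log(splitTokens("\na   \n\nb \nc")); // ["a", "b", "c"]
```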

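Solution 2: Known classes as tokens

Since we already have the valid class names as a string union, we can use that union as part of the template literal type to consume an entire class name.

To understand this solution, let's build it up piece by piece. First, let's use ValidClass in the template literal: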

type SplitDebug1<T extends string> =
  T extends `${ValidClass}${Whitespace}${infer Tail}`
    ? [ValidClass, Whitespace, Tail]
    : never
// The grammar is not ambiguous anymore!
// [ValidClass, Whitespace, "b-class c-class"]
type Result1 = SplitDebug1<"a-class b-class c-class">

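This resolves the ambiguity, but now we can no longer access the parsed Head, because ValidClass simply refers to the type ValidClass = "a-class" | "b-class" | "c-class". Unfortunately, at the time of writing TypeScript didn't allow inferring and constraining a placeholder at the same time (constraints of the form infer X extends Y were only added later, in TypeScript 4.7), so this isn't possible: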

type SplitDebug2<T extends string> =
  T extends `${infer Head extends ValidClass ? infer Head : never}${Whitespace}${infer Tail}`
    ? [Head, Whitespace, Tail]
    : never
// Still just [ValidClass, Whitespace, "b-class c-class"]
type Result2 = SplitDebug1<"a-class b-class c-class">

But here comes the hack. We can use the already-known Tail as a kind of reverse match to get access to Head:

type SplitDebug3<T extends string> =
  T extends `${ValidClass}${Whitespace}${infer Tail}`
    ? T extends `${infer Head}${Whitespace}${Tail}`
      ? [Head, Whitespace, Tail]
      : never
    : never
// Now we know the first valid token, aka class name!
// ["a-class", Whitespace, "b-class c-class"]
type Result3 = SplitDebug3<"a-class b-class c-class">

This trick can be used to parse the valid class names. The complete solution:


// Demonstrating with a large number of class names
// Breaks with "too complex union type" at 20k class names
type Digit = '0' | '1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9';
type ValidClass1000 = `class-${Digit}${Digit}${Digit}`;
type SplitToValidClasses<T extends string> = SplitToValidClassesInner<Trim<T>>;
type SplitToValidClassesInner<T extends string> =
  T extends `${ValidClass1000}${Whitespace}${infer Tail}`
    ? T extends `${infer Head}${Whitespace}${Tail}`
      ? Trim<Head> extends ValidClass1000
        ? [Trim<Head>, ...SplitToValidClassesInner<Trim<Tail>>]
        : [Err<`'${Head}' is not a valid class`>]
      : never
    : T extends `${infer Tail}`
      ? Tail extends ValidClass1000
        ? [Tail]
        : [Err<`'${Tail}' is not a valid class`>]
      : [never];
// ["class-001", "class-002", "class-003", "class-004", "class-000"]
type Result4 = SplitToValidClasses<`
class-001 class-002 
class-003
class-004 class-000
`>

Playground link

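This is the best solution I could come up with, and it also works for fairly large union types. The error message could be refined, but it still hints at the correct location.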
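To illustrate what the error looks like, here is a self-contained sketch of the same approach. It is an assumption-laden variant, not the exact code above: it uses the three-member ValidClass union from the question instead of ValidClass1000 to keep the example small, and it repeats the helper types so that it compiles on its own.

```typescript
type Spaces = "     " | "    " | "   " | "  " | " ";
type Whitespace = Spaces | "\n" | "\t";
// Assumption: the small union from the question, not ValidClass1000.
type ValidClass = "a-class" | "b-class" | "c-class";

type Err<Message extends string> = `Error: ${Message}`;
type TrimEnd<T extends string> =
  T extends `${infer Rest}${Whitespace}` ? TrimEnd<Rest> : T;
type TrimStart<T extends string> =
  T extends `${Whitespace}${infer Rest}` ? TrimStart<Rest> : T;
type Trim<T extends string> = TrimEnd<TrimStart<T>>;

type SplitToValidClasses<T extends string> = SplitToValidClassesInner<Trim<T>>;
type SplitToValidClassesInner<T extends string> =
  // Anchor on a known class name first, so only Tail has to be inferred...
  T extends `${ValidClass}${Whitespace}${infer Tail}`
    // ...then match again with the known Tail to recover Head.
    ? T extends `${infer Head}${Whitespace}${Tail}`
      ? Trim<Head> extends ValidClass
        ? [Trim<Head>, ...SplitToValidClassesInner<Trim<Tail>>]
        : [Err<`'${Head}' is not a valid class`>]
      : never
    : T extends `${infer Tail}`
      ? Tail extends ValidClass
        ? [Tail]
        : [Err<`'${Tail}' is not a valid class`>]
      : [never];

// Mixed separators (two spaces, then a newline) are handled.
type Ok = SplitToValidClasses<"a-class  b-class\nc-class">;
// ["a-class", "b-class", "c-class"]

// Parsing stops at the first token that is not a known class.
type Bad = SplitToValidClasses<"a-class wrong-class c-class">;
// ["a-class", "Error: 'wrong-class c-class' is not a valid class"]
```

Note that the message quotes the whole unparsed remainder rather than just the offending token, so it points at where the problem starts without isolating the bad name.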

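Although this supports a large number of members in the union type, it doesn't work for our real use case, where we have 40k Tailwind class names in a single union type. That type represents all possible class names that might be added during development (unused ones are purged in prod).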
