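I'm trying to write a type that validates that a given input string contains valid class names separated by one or more whitespace characters. The input may also have leading and trailing whitespace.
The type I have now gets very close, but the TS compiler can infer the template literal in multiple ways, which means the grammar is ambiguous. That leads to unwanted results.
First, the basic types: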
// To avoid recursion as much as possible
type Spaces = (
| "     "
| "    "
| "   "
| "  "
| " "
);
type Whitespace = Spaces | "\n" | "\t";
type ValidClass = 'a-class' | 'b-class' | 'c-class';
Then the utility types:
// Utility type to provide nicer error messages
type Err<Message extends string> = `Error: ${Message}`;
type TrimEnd<T extends string> = (
T extends `${infer Rest}${Whitespace}`
? TrimEnd<Rest>
: T
);
type TrimStart<T extends string> = (
T extends `${Whitespace}${infer Rest}`
? TrimStart<Rest>
: T
);
type Trim<T extends string> = TrimEnd<TrimStart<T>>;
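As a quick sanity check of the trimming helpers, the sketch below restates them and trims a sample string. The run lengths in `Spaces` (one to five spaces) and the sample input are assumptions chosen for illustration, not values taken from a real project.

```typescript
// Assumption: runs of 1 to 5 spaces, so longer runs are consumed in fewer steps
type Spaces = "     " | "    " | "   " | "  " | " ";
type Whitespace = Spaces | "\n" | "\t";

// Strip trailing whitespace, one matching run at a time
type TrimEnd<T extends string> =
  T extends `${infer Rest}${Whitespace}` ? TrimEnd<Rest> : T;

// Strip leading whitespace, one matching run at a time
type TrimStart<T extends string> =
  T extends `${Whitespace}${infer Rest}` ? TrimStart<Rest> : T;

type Trim<T extends string> = TrimEnd<TrimStart<T>>;

// Illustrative input with mixed leading and trailing whitespace
type Trimmed = Trim<"\n  a-class \t">;

// Compiles only if `Trimmed` accepts the fully trimmed literal
const trimmed: Trimmed = "a-class";
console.log(trimmed);
```

Because every member of `Whitespace` is tried, an intermediate step may infer a union of partially trimmed strings. The recursion distributes over that union, and each member ends up at the same trimmed result.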
And finally the actual type that checks the input string:
// Forces the string to be trimmed before starting recursive loop.
type SplitToValidClasses<T extends string> = SplitToValidClassesInner<Trim<T>>;
// Splits the input string into array of `Array<Token | 'Error: ...'>`
// strings. The input is converted to an array format mostly because I found it
// easier to work with arrays in other TS generics, instead of e.g space separated
// values.
type SplitToValidClassesInner<T extends string> =
// Does `T` contain more than one string? For example 'aaaa\n\n bbbb'
T extends `${infer Head}${Whitespace}${infer Tail}`
// Yes, `T` could be inferred into three parts.
// Is `Head` a valid class name?
? Trim<Head> extends ValidClass
// Yes, it's a valid name. Continue recursively with rest of the string
// but trim white space from both sides.
? [Trim<Head>, ...SplitToValidClassesInner<Trim<Tail>>]
: [Err<`'${Head}' is not a valid class`>]
: T extends `${infer Tail}`
? Tail extends ValidClass
? [Tail]
: [Err<`'${Tail}' is not a valid class`>]
: [never];
// This works
type CorrectResult = SplitToValidClasses<'a-class b-class c-class'>
But when testing with different inputs, we notice incorrect results:
// Should be ["a-class", "b-class", "c-class"]
type Input1 = `a-class b-class c-class`;
type Result = SplitToValidClasses<Input1>;
// Should be ["a-class", "b-class", "c-class", "a-class"]
type Result2 = SplitToValidClasses<`
a-class b-class
c-class
a-class
`>;
// Should be ["a-class", "Error: 'wrong-class' is not a valid class"]
type Result3 = SplitToValidClasses<`
a-class
wrong-class
c-class
`>;
The problem arises in the template inference:
type SplitToValidClassesInnerFirstLevelDebug<T extends string> =
T extends `${infer Head}${Whitespace}${infer Tail}`
? [Head, Whitespace, Tail]
: never
// The grammar is ambiguous, leading to
// ["a-class b-class" | "a-class", Whitespace, "c-class" | "b-class c-class"]
// Removing the ambiguity should fix the issue
type Result4 = SplitToValidClassesInnerFirstLevelDebug<Input1>
Playground link
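To see where the extra union members come from, here is a reduced sketch. It assumes a hypothetical two-member whitespace union `Gap` and a made-up input with uneven spacing. A union inside a template literal is expanded into one pattern per member, and every pattern that matches contributes its own inference candidate.

```typescript
// Hypothetical, reduced whitespace union: one space or two spaces
type Gap = " " | "  ";

type FirstSplit<T extends string> =
  T extends `${infer Head}${Gap}${infer Tail}` ? [Head, Tail] : never;

// Made-up input: one space after "a", two spaces after "b"
type Ambiguous = FirstSplit<"a b  c">;

// The one-space pattern splits at the first " "
const splitAtSingleSpace: Ambiguous = ["a", "b  c"];

// The two-space pattern splits at the first "  "
const splitAtDoubleSpace: Ambiguous = ["a b", "c"];

console.log(splitAtSingleSpace, splitAtDoubleSpace);
```

Both assignments type-check, so `Head` here is `"a" | "a b"` rather than a single token. That is the same shape of result as in the debug type above.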
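I couldn't find much detailed documentation on how template literals are inferred, other than what Anders Hejlsberg explained in his PR:
For inference to succeed the starting and ending literal character spans (if any) of the target must exactly match the starting and ending spans of the source. Inference proceeds by matching each placeholder to a substring in the source from left to right: A placeholder followed by a literal character span is matched by inferring zero or more characters from the source until the first occurrence of that literal character span in the source. A placeholder immediately followed by another placeholder is matched by inferring a single character from the source.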
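The two matching rules in that description can be illustrated with a minimal sketch. The type names `FirstChar` and `SplitAtDash` are made up for this example.

```typescript
// Two adjacent placeholders: the first one takes exactly one character
type FirstChar<T extends string> =
  T extends `${infer Head}${infer Rest}` ? [Head, Rest] : never;

// A placeholder followed by a literal span: matched up to the FIRST "-"
type SplitAtDash<T extends string> =
  T extends `${infer Head}-${infer Tail}` ? [Head, Tail] : never;

// Each assignment compiles only if the inferred tuple matches
const firstChar: FirstChar<"abc"> = ["a", "bc"];
const atDash: SplitAtDash<"a-b-c"> = ["a", "b-c"];

console.log(firstChar, atDash);
```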
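How can this type be implemented without the ambiguity? One approach I thought of is to parse the input recursively, character by character, but that quickly hits the recursion limit in TS.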
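I came up with two solutions, but neither solves the original problem, because the types become either too complex or too recursive. The second solution definitely scales better than the first.
Solution 1: Recursive parsing
This solution parses the input string recursively. The `Split` type splits the input string on whitespace and returns an array of tokens (words).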
type EndOfInput = '';
// Validates given `UnprocessedInput` input string
// It recursively iterates through each character in the string,
// and appends characters into the second type parameter `Current` until the
// token has been consumed. When the token is fully consumed, it is added to
// `Result` and `Current` memory is cleared.
//
// NOTE: Do not pass anything else than the first type parameter. Other type
// parameters are for internal tracking during recursive loop
//
// See https://github.com/microsoft/TypeScript/pull/40336 for more template literal
// examples.
type Split<UnprocessedInput extends string, Current extends string = '', Result extends string[] = []> =
// Have we reached the end of the input string?
UnprocessedInput extends EndOfInput
// Yes. Is the `Current` empty?
? Current extends EndOfInput
// Yes, we're at the end of processing and no need to add new items to result
? Result
// No, add the last item to results, and return result
: [...Result, Current]
// No, use template literal inference to get first char, and the rest of the string
: UnprocessedInput extends `${infer Head}${infer Rest}`
// Is the next character whitespace?
? Head extends Whitespace
// Yes, and is the `Current` empty?
? Current extends EndOfInput
// Yes, continue "eating" whitespace
? Split<Rest, Current, Result>
// No, it means we went from a token to whitespace, meaning the token
// is fully parsed and can be added to the result
: Split<Rest, '', [...Result, Current]>
// No, add the character to Current
: Split<Rest, `${Current}${Head}`, Result>
// This shouldn't happen since UnprocessedInput is restricted with
// `extends string` type narrowing.
// For example Split<null> would be a `never` type if it didn't
// already fail to "Type 'null' does not satisfy the constraint 'string'"
: [never]
This works for smaller inputs, but not for larger strings, because of the TS recursion limit:
type Result5 = Split<`
a
b
c`>
// Fails for larger string values, because of recursion limit
type Result6 = Split<`aaaaaaaaaaaaaaaaaaa
bbbbbbbbbbbbbbbbbbbbb`>
Playground link
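The tokens produced by `Split` still have to be checked against the allowed class names. A minimal sketch of that step follows. It assumes a reduced `Whitespace` union and a hypothetical `ValidateTokens` helper that is not part of the code above. Unlike the type in the question, it reports every invalid token instead of stopping at the first one.

```typescript
// Assumption: reduced whitespace union to keep the sketch short
type Whitespace = " " | "\n" | "\t";
type ValidClass = "a-class" | "b-class" | "c-class";
type Err<Message extends string> = `Error: ${Message}`;

// Compact restatement of the character-by-character Split type
type Split<
  Input extends string,
  Current extends string = "",
  Result extends string[] = []
> = Input extends ""
  ? Current extends "" ? Result : [...Result, Current]
  : Input extends `${infer Head}${infer Rest}`
    ? Head extends Whitespace
      ? Current extends ""
        ? Split<Rest, Current, Result> // still skipping whitespace
        : Split<Rest, "", [...Result, Current]> // token finished
      : Split<Rest, `${Current}${Head}`, Result> // extend current token
    : [never];

// Hypothetical helper: keeps valid tokens, replaces others with an error
type ValidateTokens<Tokens extends string[]> = {
  [Index in keyof Tokens]: Tokens[Index] extends ValidClass
    ? Tokens[Index]
    : Err<`'${Tokens[Index] & string}' is not a valid class`>;
};

type Checked = ValidateTokens<Split<" a-class\n wrong-class ">>;

// Compiles only if the tuple matches the validated tokens
const checked: Checked = [
  "a-class",
  "Error: 'wrong-class' is not a valid class",
];
console.log(checked);
```

This keeps tokenizing and validation separate, but it inherits the recursion depth of `Split`, so it is only practical for short inputs.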
Solution 2: Known classes as tokens
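Since we already have the valid class names as a string union, we can use that union as part of the template literal type to consume an entire class name at once.
To understand this solution, let's build it up piece by piece. First, let's use `ValidClass` in the template literal: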
type SplitDebug1<T extends string> =
T extends `${ValidClass}${Whitespace}${infer Tail}`
? [ValidClass, Whitespace, Tail]
: never
// The grammar is not ambiguous anymore!
// [ValidClass, Whitespace, "b-class c-class"]
type Result1 = SplitDebug1<"a-class b-class c-class">
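This resolves the ambiguity, but now we can no longer access the parsed `Head`, because `ValidClass` just refers to the type `type ValidClass = "a-class" | "b-class" | "c-class"`. Unfortunately, TypeScript (at the time of writing) does not allow inferring and constraining a token at the same time, so this is not possible: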
type SplitDebug2<T extends string> =
T extends `${infer Head extends ValidClass ? infer Head : never}${Whitespace}${infer Tail}`
? [Head, Whitespace, Tail]
: never
// Still just [ValidClass, Whitespace, "b-class c-class"]
type Result2 = SplitDebug1<"a-class b-class c-class">
But here comes the hack. We can use the known `Tail` as a kind of reverse match to get access to `Head`:
type SplitDebug3<T extends string> =
T extends `${ValidClass}${Whitespace}${infer Tail}`
? T extends `${infer Head}${Whitespace}${Tail}`
? [Head, Whitespace, Tail]
: never
: never
// Now we know the first valid token aka class name!
// ["a-class", Whitespace, "b-class c-class"]
type Result3 = SplitDebug3<"a-class b-class c-class">
This trick can be used to parse valid class names. The complete solution:
// Demonstrating with large amount of class names
// Breaks to "too complex union type" with 20k class names
type Digit = '0' | '1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9';
type ValidClass1000 = `class-${Digit}${Digit}${Digit}`;
type SplitToValidClasses<T extends string> = SplitToValidClassesInner<Trim<T>>;
type SplitToValidClassesInner<T extends string> =
T extends `${ValidClass1000}${Whitespace}${infer Tail}`
? T extends `${infer Head}${Whitespace}${Tail}`
? Trim<Head> extends ValidClass1000
? [Trim<Head>, ...SplitToValidClassesInner<Trim<Tail>>]
: [Err<`'${Head}' is not a valid class`>]
: never
: T extends `${infer Tail}`
? Tail extends ValidClass1000
? [Tail]
: [Err<`'${Tail}' is not a valid class`>]
: [never];
// ["class-001", "class-002", "class-003", "class-004", "class-000"]
type Result4 = SplitToValidClasses<`
class-001 class-002
class-003
class-004 class-000
`>
Playground link
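This is the best solution I could come up with, and it also works with fairly large union types. The error messages could be improved, but they still point to the right location.
While it supports a large number of options in the union type, it does not work for our real use case, where we have 40k Tailwind class names in a single type union. The type represents all possible class names that could be added during development (unused ones are purged in prod).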