在groupbyKey之后的下面的代码中,我得到了PCollection>>。如何在发送到FileIO之前使值中的Iterable变平。
.apply(GroupByKey.<String, String>create())
.apply("Write file to output",FileIO.< String, KV<String,String>>writeDynamic()
.by(KV::getKey)
.withDestinationCoder(StringUtf8Coder.of())
.via(Contextful.fn(KV::getValue), TextIO.sink())
.to("Out")
.withNaming(key -> FileIO.Write.defaultNaming("file-" + key, ".txt")));
谢谢你的好心帮助。
您需要使用ParDo来压平PCollection
的Iterable部分,如下所示:-
PCollection<KV<String, Doc>> urlDocPairs = ...;
PCollection<KV<String, Iterable<Doc>>> urlToDocs =
urlDocPairs.apply(GroupByKey.<String, Doc>create());
PCollection<R> results =
urlToDocs.apply(ParDo.of(new DoFn<KV<String, Iterable<Doc>>, R>() {
{@literal @}ProcessElement
public void processElement(ProcessContext c) {
String url = c.element().getKey();
for <String,Doc> docsWithThatUrl : c.element().getValue();
c.output(docsWithThatUrl)
}}));