Snakemake: do not delete the outputs of a failed rule

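I have a snakemake workflow that contains a rule which runs another, "inner" snakemake workflow.
Sometimes a rule of the inner workflow fails, which means the inner workflow as a whole fails. As a result, every file listed under output for the inner workflow is deleted by the outer workflow, even if the inner-workflow rule that created it completed successfully.
Is there a way to prevent snakemake from deleting the outputs of a failed rule? Or can you suggest another workaround?
A few notes:

The outputs of the inner workflow must be listed, because they are used as inputs to other rules in the outer workflow.
I tried marking the outputs of the inner workflow as protected, but that did not help.
I also tried adding exit 0 to the end of the inner workflow call, to make snakemake think it completed successfully,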

like this:

    rule run_inner:
        input:
            inputs...
        output:
            outputs...
        shell:
            """
            snakemake -s inner.snakefile
            exit 0
            """

But the outputs are still deleted.
Any help would be appreciated. Thanks!

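You can pass the --keep-incomplete option to snakemake, either on the command line or through a configuration file. It prevents snakemake from removing the incomplete output files of failed jobs.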

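One option may be to have run_inner produce a dummy output file that flags completion of the rule. Rules that follow run_inner then take the dummy file as their input. For example: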

    rule run_inner:
        ...
        output:
            # or just 'run_inner.done' if wildcards are not involved
            touch('{sample}.run_inner.done'),
        shell:
            'snakemake -s inner.snakefile'

    rule next:
        input:
            '{sample}.run_inner.done',
        params:
            real_input='{sample}.data.txt',  # This is what run_inner actually produces
        shell:
            'do stuff {params.real_input}'

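If snakemake -s inner.snakefile fails, the dummy output is deleted, but the files written by the inner workflow are not declared as outputs of run_inner and so are left alone. Rerunning snakemake -s inner.snakefile then resumes from where it left off.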
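The marker-file idea can be tried outside Snakemake with a few lines of plain Python. This is only a sketch of the pattern, not of Snakemake's internals: `run_with_marker`, `demo`, and the file names are hypothetical, and a small Python child process stands in for the inner workflow.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_with_marker(cmd, marker):
    """Run cmd and create the marker file only if cmd exits with status 0."""
    result = subprocess.run(cmd)
    if result.returncode == 0:
        Path(marker).touch()
    return result.returncode


def demo():
    """Return (data_exists, marker_exists) after a failing run and after a successful one."""
    with tempfile.TemporaryDirectory() as tmp:
        data = Path(tmp) / "sample.data.txt"          # hypothetical real output
        marker = Path(tmp) / "sample.run_inner.done"  # hypothetical marker file

        # Stand-in for an inner workflow that writes its data and then fails.
        failing = [
            sys.executable, "-c",
            "import sys, pathlib; pathlib.Path(sys.argv[1]).write_text('partial'); sys.exit(1)",
            str(data),
        ]
        run_with_marker(failing, marker)
        after_failure = (data.exists(), marker.exists())

        # Stand-in for a rerun that completes successfully.
        run_with_marker([sys.executable, "-c", "pass"], marker)
        after_success = (data.exists(), marker.exists())
        return after_failure, after_success
```

After the failing run the data file is still on disk and only the marker is missing, which is the property the dummy-output approach relies on.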

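Another option could be to integrate the rules from inner.snakefile into the outer pipeline with the include statement. I think this option is preferable, but of course it would be more complex to implement.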

One workaround is to use run instead of shell:

    rule run_inner:
        input:
            inputs...
        output:
            outputs...
        run:
            shell("""snakemake -s inner.snakefile""")
            # Add your code here to store the files before removing

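Even if the script in the shell function call fails, the files still exist until the code in the run section finishes. You can copy the files to a safe place.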

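Update: you need to handle the exception in order to continue execution whenever the script returns an error. The script below illustrates the idea: the print call in the except: block prints True, while the one in onerror prints False.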

    import os

    rule run_inner:
        output:
            "output.txt"
        run:
            try:
                shell("""touch output.txt; exit 1""")
            except:
                # Still inside the job: the output has not been cleaned up yet
                print(os.path.exists("output.txt"))

    onerror:
        print(os.path.exists("output.txt"))
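The "copy the files to a safe place" step could look roughly like the sketch below. It is written in plain Python so it can be run on its own: `subprocess.run(..., check=True)` stands in for Snakemake's shell(), which as far as I know also raises subprocess.CalledProcessError on a non-zero exit status. The names `run_and_rescue`, `demo`, and the `rescued` directory are made up for illustration.

```python
import shutil
import subprocess
import sys
import tempfile
from pathlib import Path


def run_and_rescue(cmd, outputs, rescue_dir):
    """Run cmd; on failure, copy the outputs that exist into rescue_dir, then re-raise."""
    try:
        subprocess.run(cmd, check=True)
    except subprocess.CalledProcessError:
        rescue_dir = Path(rescue_dir)
        rescue_dir.mkdir(parents=True, exist_ok=True)
        for out in map(Path, outputs):
            if out.exists():
                shutil.copy2(out, rescue_dir / out.name)
        raise  # the job is still reported as failed


def demo():
    """Run a failing command that leaves a partial output; return the rescued content."""
    with tempfile.TemporaryDirectory() as tmp:
        out = Path(tmp) / "output.txt"
        rescue = Path(tmp) / "rescued"  # hypothetical safe location
        cmd = [
            sys.executable, "-c",
            "import sys, pathlib; pathlib.Path(sys.argv[1]).write_text('partial'); sys.exit(1)",
            str(out),
        ]
        try:
            run_and_rescue(cmd, [out], rescue)
        except subprocess.CalledProcessError:
            pass
        return (rescue / "output.txt").read_text()
```

Re-raising keeps the failure visible to the caller. In a run: block the same body would go inside the except branch, with the rescue directory placed somewhere that is not listed under output.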

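A program "fails" when it returns a non-zero exit status. So we just need to "fix" this by tricking the inner shell into thinking that every program completed successfully. The simplest way is to use some error command || true. Here is a minimal example: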

    rule test:
        output:
            "test.output",
        shell:
            """
            touch test.output
            # below cat will trigger error
            cat file_not_exist || true
            """

You will find that test.output still exists even though cat raised an error.
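The difference between a trailing exit 0 and || true can be checked directly. The sketch below assumes a POSIX sh is on the PATH and uses `set -e` to approximate the strict mode that, to my knowledge, Snakemake applies to shell commands (set -euo pipefail under bash). `exit_status` is a hypothetical helper, and `false` stands in for the failing command.

```python
import subprocess


def exit_status(script):
    """Run script with POSIX sh and return its exit status."""
    return subprocess.run(["sh", "-c", script]).returncode


# With errexit on, the shell stops at the first failing command,
# so the trailing `exit 0` is never reached.
strict_then_exit0 = exit_status("set -e; false; exit 0")

# `|| true` absorbs the failure, so the script carries on and ends with status 0.
strict_with_or_true = exit_status("set -e; false || true")
```

This would explain why appending exit 0 had no effect in the question. Keep in mind that || true hides every failure of that command, so the outer workflow will treat whatever outputs exist as complete.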
