Running Serial Steps Inside a Parallel Bash Loop

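I have added some detail to my explanation. Conceptually, I am running a script that processes a file in a loop, calling shell scripts that take the content of each line as an input parameter. (FYI: the "a" script starts an execution and the "b" script monitors that execution.)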

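I need 1a and 1b to run first, in parallel, for the first two $param values.
Next, once step 1 is complete, 2a and 2b need to run serially for those $params.
Once 2a and 2b are complete, 3a and 3b start (it doesn't matter whether serially or in parallel).
The loop then continues with the next two lines of input.txt.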

I can't get the second step to run serially, only in parallel. What I need is the following:

cat filename | while readline 
export param=$line
do
./script1a.sh "param" > process.lg && ./script2b.sh > monitor.log &&
##wait for processes to finish, running 2 in parallel in script1.sh
./script2a.sh "param" > process2.log && ./script2b.sh > minitor2.log &&
##run each of the 2 in serial for script2.sh
./script3a.sh && ./script3b.sh

I tried adding wait, and I tried an if statement containing script2a.sh and script2b.sh that would run them serially, but with no luck.

if ((++i % 2 ==0)) then wait fi
done
#only run two lines at a time, then cycle back through loop

How on earth can I get script2.sh to run serially after the parallel run of script1?

Locking!

If you want to parallelize script1 and script3, but need all invocations of script2 to be serialized, continue to use:

./script1.sh && ./script2.sh && ./script3.sh &

But modify script2 to grab a lock before it does anything else:

#!/bin/bash
exec 3>.lock2
flock -x 3
# ... continue with script2's business here.

Note that the .lock2 file used here must not be deleted, or multiple processes may each believe they hold the lock at the same time.
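To see the effect of the lock in isolation, here is a minimal sketch. It assumes the flock utility from util-linux is installed. serial_step is a hypothetical stand-in for script2.sh, and the temporary directory and log file exist only for the demonstration. Three copies are started in the background, yet each job's start/end pair should appear in the log without another job's lines in between.

```shell
#!/bin/bash
# Demo only: a throwaway directory holds the lock file and the log.
workdir=$(mktemp -d)
lock="$workdir/.lock2"
log="$workdir/order.log"

# Hypothetical stand-in for script2.sh: take the lock, then do the work.
serial_step() {
    (
        flock -x 3                  # block until we hold the exclusive lock
        echo "start $1" >> "$log"
        sleep 0.2                   # pretend to do something slow
        echo "end $1" >> "$log"
    ) 3>"$lock"                     # fd 3 is closed (and the lock released) here
}

# Launch three instances at once; the lock forces them to run one by one.
for n in 1 2 3; do
    serial_step "$n" &
done
wait

cat "$log"
```

The order in which the three jobs acquire the lock is not guaranteed; only the mutual exclusion is.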

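You are not showing us how the lines you read from the file are consumed.
If I understand your question correctly, you want to run script1 on two lines of filename, each in parallel, and then run script2 serially once both are done?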

while read first; do
echo "$first" | ./script1.sh &
read second
echo "$second" | ./script1.sh &
wait
script2.sh &    # optionally don't background here?
script3.sh
done <filename &

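The while loop contains two read statements, so each iteration reads two lines from filename and feeds each one to a separate instance of script1. Then we wait until both have finished before running script2. I background it so that script3 can start while it is running, and I background the whole while loop too; but you probably don't need to background the entire job by default. Development will be much easier if you write it as a regular foreground job and then, once it works, background the whole thing when you start it, if needed.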
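Here is a self-contained sketch of the same two-lines-per-iteration pattern, written as a plain foreground job. step1 and step2 are hypothetical stand-ins for script1.sh and script2.sh, and the input file is generated just for the demonstration. It also handles an odd number of input lines, where the second read finds nothing.

```shell
#!/bin/bash
# Demo only: generate a five-line input file and an empty log.
input=$(mktemp)
log=$(mktemp)
printf '%s\n' a b c d e > "$input"

# Hypothetical stand-ins for script1.sh and script2.sh.
step1() { sleep 0.1; echo "step1 $1" >> "$log"; }
step2() { echo "step2 $1" >> "$log"; }

while read -r first; do
    step1 "$first" &                # first line of the pair, in parallel
    have_second=0
    if read -r second; then         # second line, if the file has one left
        have_second=1
        step1 "$second" &
    fi
    wait                            # both step1 jobs are done past this point
    step2 "$first"                  # step2 runs strictly one after the other
    if [ "$have_second" -eq 1 ]; then
        step2 "$second"
    fi
done < "$input"

cat "$log"
```

Within each pair the two step1 lines may land in the log in either order, but both always precede that pair's step2 lines.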

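I can think of many variations depending on how you actually want your data to flow; here is an update in response to your most recently updated question.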

export param  # is this really necessary?
while read param; do
# First instance
./script1a.sh "$param" > process.lg  &&
./script2b.sh > monitor.log &
# Second instance
read param
./script2a.sh "$param" > process2.log && ./script2b.sh > minitor2.log &
# Wait for both to finish
wait
./script3a.sh && ./script3b.sh
done <filename

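If this still doesn't help, maybe you should post a third question where you really explain what you want...

I am not 100% sure what your question means, but for now I think you want something like this in your inner loop: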

(
# run script1 and script2 in parallel
script1 &
s1pid=$!
# start no more than one script2 using GNU Parallel as a mutex
sem --fg script2
# when they are both done...
wait $s1pid
# run script3
script3
) &    # and do that lot in parallel with previous/next loop iteration

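@triplee If you're interested, here is what I put together. (Note: I changed some variables for the post, so apologies if there are inconsistencies anywhere. The exports have their reasons too; I think there are better ways than export, but for now it works.)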

cat input.txt | while read first; do
# Strip double quotes from the line, then build the log-name prefixes
export step=${first//\"/}
export stepem=EM_${step//,/_}
export steptd=TD_${step//,/_}
export stepeg=EG_${step//,/_}
# Step 1 for the first line: start the model and monitor it, in the background
echo "$step" |  $directory"/ws_client.sh" processOptions  "$appName" "$step" "$layers" "$stages" "" "$stages" "$stages" FALSE > "$Folder""/""$stepem""_ProcessID.log" &&
$dir_model"/check_status.sh" "$Folder" "$stepem" > "$Folder""/""$stepem""_Monitor.log" &
read second
export step2=${second//\"/}
export stepem2=ExecuteModel_${step2//,/_}
export steptd2=TransferData_${step2//,/_}
export stepeg2=ExecuteGeneology_${step2//,/_}
# Step 1 for the second line, in parallel with the first
echo "$step2" |  $directory"/ws_client.sh" processOptions  "$appName" "$step2" "$layers" "$stages" "" "$stages" "$stages" FALSE > "$Folder""/""$stepem2""_ProcessID.log" &&
$dir_model"/check_status.sh" "$Folder" "$stepem2" > "$Folder""/""$stepem2""_Monitor.log" &
wait
# Step 2: one && chain, so the two transfers run one after the other
$directory"/ws_client.sh" processOptions "$appName" "$step" "$layers" "" ""  "$stage_final" "" TRUE > "$appLogFolder""/""$steptd""_ProcessID.log" &&
$dir_model"/check_status.sh" "$Folder" "$steptd" > "$Folder""/""$steptd""_Monitor.log" &&
$directory"/ws_client.sh" processOptions "$appName" "$step2" "$layers" "" ""  "$stage_final" "" TRUE > "$appLogFolder""/""$steptd2""_ProcessID.log" &&
$dir_model"/check_status.sh" "$Folder" "$steptd2" > "$Folder""/""$steptd2""_Monitor.log" &
wait
# Step 3: also a single serial chain
$directory"/ws_client.sh" processPaths "$appName" "$step" "$layers" "$genPath_01" > "$appLogFolder""/""$stepeg""_ProcessID.log" &&
$dir_model"/check_status.sh" "$Folder" "$stepeg" > "$Folder""/""$stepeg""_Monitor.log" &&
$directory"/ws_client.sh" processPaths "$appName" "$step2" "$layers" "$genPath_01" > "$appLogFolder""/""$stepeg2""_ProcessID.log" &&
$dir_model"/check_status.sh" "$Folder" "$stepeg2" > "$Folder""/""$stepeg2""_Monitor.log" &
wait
if (( ++i % 2 == 0))
then
echo "Waiting..."
wait
fi
done

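I understand your question as follows:

You have a list of models. These models need to be run. After they have run, they must be transferred. The simple solution is: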

run_model model1
transfer_result model1
run_model model2
transfer_result model2

But to speed things up, we want to parallelize parts of this. Unfortunately, transfer_result cannot be parallelized.

run_model model1
run_model model2
transfer_result model1
transfer_result model2

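The model names (model1, model2, and so on) are read from a text file. run_model can run in parallel, and you want 2 of them running at a time. transfer_result can only run one at a time, and a result can only be transferred after it has been computed.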

This can be done like so:

cat models.txt | parallel -j2 'run_model {} && sem --id transfer transfer_result {}'

run_model {} && sem --id transfer transfer_result {} runs a model and, if it succeeds, transfers the result. A transfer only starts when no other transfer is running.

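parallel -j2 runs two of these jobs in parallel.

If transferring takes less time than computing a model, you should see no surprises: a transfer will at most be swapped with the next transfer. If transferring takes longer than running a model, you may see the models transferred completely out of order (for example, job 10 might be transferred before job 2). But they will all be transferred eventually.

You can see the execution sequence with this example: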

seq 10 | parallel -uj2 'echo ran model {} && sem --id transfer "sleep .{};echo transferred {}"'

This solution is better than a wait-based solution because you can run model 3 while models 1+2 are being transferred.
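If GNU Parallel is not available, a similar effect can be approximated with other tools. The sketch below is a substitute, not the method described above: it assumes xargs supports -P (the GNU and BSD versions do) and that flock from util-linux is installed. The echo and sleep lines are hypothetical placeholders for run_model and transfer_result, and the temporary directory exists only for the demonstration.

```shell
#!/bin/bash
# Demo only: a throwaway directory holds the logs and the lock file.
workdir=$(mktemp -d)
export workdir                      # make it visible to the sh -c workers

# Run at most two workers at a time; each gets one model number as $1.
seq 1 6 | xargs -P 2 -n 1 sh -c '
    model=$1
    # Parallel part: placeholder for run_model
    echo "ran model $model" >> "$workdir/run.log"
    # Serialized part: placeholder for transfer_result, guarded by a lock
    (
        flock -x 3
        echo "start $model" >> "$workdir/transfer.log"
        sleep 0.1
        echo "end $model" >> "$workdir/transfer.log"
    ) 3>"$workdir/transfer.lock"
' _

cat "$workdir/transfer.log"
```

In this variant each worker stays occupied until its own transfer has finished, so a slow transfer holds up one of the two slots. The transfers may still complete in a different order than the models were started.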
