Child process limit for os.fork() in a recursive function



I'm wondering whether there is a good way to limit the number of processes spawned by os.fork() inside a recursive function. Say I want at most 30 processes running at the same time, to make sure the system doesn't get completely overloaded.

I'm parsing a (text) file that contains paths to files, and those files may in turn contain references to further files. My solution is a recursive function that uses os.fork() to "parallelize" itself.

def recursive_copying(file, target_path):
    newlines = []
    with open(file) as f:
        for line in f:
            # If a pattern matching a path is detected
            if re.match(pathpattern, line):
                file_line = line
                # Spawn a new process and scan that file for paths
                pid = os.fork()
                if pid == 0:
                    recursive_copying(file_line, target_path)
                    os._exit(0)
                else:
                    processes.append(pid)
            # Perform operations on the currently read line and save the result
            newlines.append(modified_line)
    # Create a new file with the modified lines
    new_file = target_path + file.split("/")[-1]
    with open(new_file, 'w+') as f:
        for i in newlines:
            f.write(i)
    # Wait for all child processes to close
    for i in processes:
        os.waitpid(i, 0)
    return
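For reference, the fork/wait pattern used above, reduced to a minimal runnable form (the file-scanning logic is omitted; the pipe is only there so the example has an observable result). This requires a POSIX system, since os.fork() is unavailable on Windows:

```python
import os

def spawn_children(n):
    """Fork n children; each writes its index to a pipe and exits."""
    r, w = os.pipe()
    pids = []
    for i in range(n):
        pid = os.fork()
        if pid == 0:
            # Child: do the work, then exit without running parent cleanup.
            os.write(w, b"%d\n" % i)
            os._exit(0)
        pids.append(pid)
    os.close(w)
    # Parent: wait for every child so none is left as a zombie.
    for pid in pids:
        os.waitpid(pid, 0)
    data = os.read(r, 4096)
    os.close(r)
    return sorted(data.decode().split())

print(spawn_children(3))  # → ['0', '1', '2']
```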

Yours truly, sadly.

So, the only way I managed to implement it was this:

I collect all the child processes created by the parent process with a shell command:

p4 = subprocess.Popen("pgrep -P " + str(motherpid), stdout=subprocess.PIPE, stderr=subprocess.PIPE, shell=True)
p4.wait()
all_children = p4.communicate()[0].split('\n')
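On Python 3 the pgrep output is bytes rather than str, so the split has to happen after decoding (or on the raw bytes). A sketch of the same child-listing using subprocess.run, assuming pgrep is available as in the original:

```python
import os
import subprocess

def list_children(parent_pid):
    """Return the PIDs of parent_pid's direct children via pgrep -P."""
    result = subprocess.run(
        ["pgrep", "-P", str(parent_pid)],
        stdout=subprocess.PIPE, stderr=subprocess.PIPE,
    )
    # pgrep exits non-zero when there are no matches; that simply
    # means an empty list here, so the return code is not checked.
    return [int(pid) for pid in result.stdout.split()]

print(list_children(os.getpid()))
```

Passing the command as a list avoids shell=True, and splitting on whitespace sidesteps the trailing-empty-string problem that the original handles with `all_children.remove('')`.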

I use this list of child processes to find out how many of them are active (rather than defunct):

def activeProcesses():
    p4 = subprocess.Popen("pgrep -P " + str(motherpid), stdout=subprocess.PIPE, stderr=subprocess.PIPE, shell=True)
    p4.wait()
    all_children = p4.communicate()[0].split('\n')
    all_children.remove('')
    all_children_static = all_children[:]
    for i in all_children_static:
        # os.kill(pid, 0) returns None for a live process and raises
        # OSError for a dead one, so it must be caught, not boolean-tested
        try:
            os.kill(int(i), 0)
        except OSError:
            all_children.remove(i)
    return all_children
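The os.kill(pid, 0) trick works because signal 0 performs only the existence/permission check without delivering anything: the call returns None for a live process and raises an OSError subclass for a dead one. Isolated into a small helper:

```python
import os

def is_alive(pid):
    """Check process existence by sending signal 0 (nothing is delivered)."""
    try:
        os.kill(pid, 0)
        return True
    except ProcessLookupError:
        # No process with this PID exists.
        return False
    except PermissionError:
        # The process exists but belongs to another user.
        return True

print(is_alive(os.getpid()))  # → True
```

One caveat: a zombie (defunct) child still "exists" for os.kill until it is reaped, so on its own this check does not distinguish active children from defunct ones.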

A waiter function is used in the parent process; the child processes, however, I let continue without forking, to make sure they actually close out the recursion.

def processWaiter(limit):
    while len(activeProcesses()) >= limit:
        time.sleep(0.1)
    return
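An alternative in the same sticks-and-stones spirit, without polling pgrep: the parent tracks how many children it has forked and blocks in os.waitpid whenever the limit is reached, reaping one child before spawning the next. A minimal sketch (the child body is a stand-in, not the copying logic):

```python
import os

def run_limited(n_tasks, limit):
    """Fork up to `limit` concurrent children, reaping before exceeding it."""
    active = 0
    for _ in range(n_tasks):
        if active >= limit:
            os.waitpid(-1, 0)   # block until any child exits
            active -= 1
        pid = os.fork()
        if pid == 0:
            os._exit(0)         # child: the real work would go here
        active += 1
    # Reap the remaining children.
    while active:
        os.waitpid(-1, 0)
        active -= 1
    return True

print(run_limited(10, 3))  # → True
```

Unlike pgrep -P, os.waitpid only sees direct children, so in the recursive setting each process would be limiting its own children rather than the global total.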

The code then looks like this:

import os
import re
import subprocess
import sys
import time

processes = []

def recursive_copying(file, target_path):
    newlines = []
    processlimit = 30
    with open(file) as f:
        for line in f:
            # If a pattern matching a path is detected
            if re.match(pathpattern, line):
                file_line = line
                # If we are in the main/parent process, wait until
                # the number of processes allows forking
                if os.getpid() == motherpid:
                    processWaiter(processlimit)
                # If below the active process limit -> fork
                if len(activeProcesses()) < processlimit:
                    # Spawn a new process and scan that file for paths
                    pid = os.fork()
                    if pid == 0:
                        recursive_copying(file_line, target_path)
                        os._exit(0)
                    else:
                        processes.append(pid)
                else:
                    recursive_copying(file_line, target_path)
            # Perform operations on the currently read line and save the result
            newlines.append(modified_line)
    # Create a new file with the modified lines
    new_file = target_path + file.split("/")[-1]
    with open(new_file, 'w+') as f:
        for i in newlines:
            f.write(i)
    # Wait for all child processes to close
    for i in processes:
        os.waitpid(i, 0)
    return

def activeProcesses():
    p4 = subprocess.Popen("pgrep -P " + str(motherpid), stdout=subprocess.PIPE, stderr=subprocess.PIPE, shell=True)
    p4.wait()
    all_children = p4.communicate()[0].split('\n')
    all_children.remove('')
    all_children_static = all_children[:]
    for i in all_children_static:
        try:
            os.kill(int(i), 0)
        except OSError:
            all_children.remove(i)
    return all_children

def processWaiter(limit):
    while len(activeProcesses()) >= limit:
        time.sleep(0.1)
    return

if __name__ == '__main__':
    startfile = sys.argv[1]
    target = sys.argv[2]
    motherpid = os.getpid()
    recursive_copying(startfile, target)

Maybe using the multiprocessing module would be better, but I really wanted to see whether it could be done in this sticks-and-stones way.
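For comparison, the multiprocessing version of the limit is essentially a Semaphore shared across processes: the parent acquires before each spawn, and each worker releases when it finishes. A hedged sketch under that design (the doubling worker is a stand-in, not the file-copying logic):

```python
import multiprocessing as mp

def worker(sem, value, results):
    # The parent acquired the semaphore before spawning this process;
    # release it when the work is done so another worker may start.
    try:
        results.append(value * 2)
    finally:
        sem.release()

def run_all(values, limit=30):
    sem = mp.Semaphore(limit)
    manager = mp.Manager()
    results = manager.list()
    procs = []
    for v in values:
        sem.acquire()          # blocks once `limit` workers are running
        p = mp.Process(target=worker, args=(sem, v, results))
        p.start()
        procs.append(p)
    for p in procs:
        p.join()
    return sorted(results)

if __name__ == "__main__":
    print(run_all([1, 2, 3], limit=2))  # → [2, 4, 6]
```

This caps concurrency globally regardless of nesting, which is exactly what the pgrep polling approximates by hand.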
