Multiprocessing Pool does not improve processing speed



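I have written an image-approximating genetic algorithm using Python 3 and OpenCV. It creates a population of individuals, each of which draws circles of random color, size, and opacity onto a blank image. The fittest individual eventually saturates after a few hundred generations.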

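I tried to add multiprocessing because rendering the images takes time that grows with the population size, the number of circles, and the size of the target image (which matters for the level of detail).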

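What I did was use multiprocessing with a Pool, passing the array of individual objects as the iterable to map over and returning only the fitness and ID. In effect, none of the individuals has its own canvas in the main process; in the worker processes, each individual renders its canvas and computes its fitness/difference.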

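However, using multiprocessing seems to make the whole program slower. The rendering itself appears to run at the same speed as in the serial version, but the run ends up slower overall because of the multiprocessing overhead.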

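import multiprocessing as mp
import time

import cv2
import numpy as np


class PopulationCircle:
    def renderPop(self, individual):
        # Runs in a worker process: render the individual and return only its index and fitness.
        individual.render()
        return [individual.index, individual.fitness]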
class IndividualCircle:
    def render(self):
        self.genes.sort(key=lambda x: x[-1], reverse=False)
        self.canvas = np.zeros((self.height,self.width, 4), np.uint8)
        for i in range(self.maxCount):
            overlay=self.canvas.copy()
            cv2.circle(overlay, (self.genes[i][0], self.genes[i][1]), self.genes[i][2], (self.genes[i][3],self.genes[i][4],self.genes[i][5]), -1, lineType=cv2.LINE_AA)
            self.canvas = cv2.addWeighted(overlay, self.genes[i][6], self.canvas, 1-self.genes[i][6], 0)
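        # Cast to a signed type first so that a uint8 subtraction cannot wrap around.
        diff = np.absolute(np.array(self.target, dtype=np.int32) - np.array(self.canvas, dtype=np.int32))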
        diffSum = np.sum(diff)
        self.fitness = diffSum
def evolution(mainPop, generationLimit):
    p = mp.Pool()
    for i in range(int(generationLimit)):
        start_time = time.time()
        result =[]
        print(f"""
-----------------------------------------
Current Generation: {mainPop.generation}
Initial Score: {mainPop.score}
-----------------------------------------
        """)
        #Multiprocessing used for rendering out canvas since it takes time.
        result = p.map(mainPop.renderPop, mainPop.population)
        #returns [individual.index, individual.fitness]; results is a list of list
        result.sort(key = lambda x: x[0], reverse=False)
        #Once multiprocessing is done, we only receive fitness value and index. 
        for k in mainPop.population:
            k.fitness = result[k.index][1]
        mainPop.population.sort(key = lambda x: x.fitness, reverse = True)
        if mainPop.generation == 0:
            mainPop.score = mainPop.population[-1].fitness
        """
        Things to note:
            In main process, none of the individuals have a canvas since the rendering
            is done on a different process tree.
            The only thing that changes in this main process is the individual's 
            fitness.
            After calling .renderHD and .renderLD, the fittest member will have a canvas
            drawn in this process. 
        """
        end_time = time.time() - start_time
        print(f"Time taken: {end_time}")
        if i%50==0:
            mainPop.population[0].renderHD()
            cv2.imwrite( f"../output/generationsPoly/generation{i}.jpg", mainPop.population[0].canvasHD)
        if i%10==0:
            mainPop.population[0].renderLD()
            cv2.imwrite( f"../output/allGenPoly/image{i}.jpg", mainPop.population[0].canvas)
        mainPop.toJSON()
        mainPop.breed()

    p.close()
    p.join()
if __name__ == "__main__":
    # Creates the Population object.
    # __init__ generates self.population, an array of IndividualCircle objects that contain DNA and render methods.
    pop = PopulationCircle(targetDIR, maxPop, circleAmount, mutationRate, mutationAmount, cutOff)
    # Starts the loop.
    evolution(pop, generations)

With a population of 600 and 800 circles: serial averages 11 s/iteration; multiprocessing averages 18 s/iteration.

I am new to multiprocessing, so any help would be appreciated.

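This happens because OpenCV spawns a lot of threads internally. When you fork from the main process and run many worker processes, each of them creates its own set of OpenCV threads, which results in a small avalanche. The problem is that these threads end up synchronizing and waiting for locks to be released, which you can easily check by profiling your code with cProfile.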
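The profiling check mentioned above could look roughly like the following. This is only a minimal sketch using the standard library: `evaluate_generation` is a hypothetical stand-in for one generation's `p.map(...)` call and `profile_call` is a helper name invented for illustration; neither is part of the original code.

```python
import cProfile
import io
import pstats


def evaluate_generation(n=200_000):
    # Hypothetical stand-in for the work done in one generation.
    return sum(i * i for i in range(n))


def profile_call(func, *args, top=10):
    """Run func under cProfile and return (result, report text)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Sort by cumulative time so the most expensive call chains come first.
    stats.sort_stats("cumulative").print_stats(top)
    return result, buffer.getvalue()


if __name__ == "__main__":
    _, report = profile_call(evaluate_generation)
    print(report)
```

Note that cProfile only sees the process it runs in. To look inside the workers, the same helper would have to be called from within the function handed to the pool.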

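The problem is described in the Joblib documentation. That may also be your solution: switch to joblib. I have run into a similar problem in the past; you will find it in this post.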

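[Edit] Additional evidence and a solution here. In short, according to that post this is a known issue, but since OpenCV releases the GIL, you can run multithreading instead of multiprocessing and so reduce the overhead.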
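One way to try the multithreading suggestion is `multiprocessing.pool.ThreadPool`, which offers the same `map` interface as `mp.Pool`. The sketch below is an assumption about how that swap might look, not the original author's code: `Individual`, `render_individual`, and `evaluate` are hypothetical names, and the toy `render()` uses plain arithmetic in place of the `cv2` calls, so it will not show a speedup by itself.

```python
from multiprocessing.pool import ThreadPool


class Individual:
    # Hypothetical stand-in for IndividualCircle; the real render() would
    # call cv2.circle / cv2.addWeighted instead of this toy arithmetic.
    def __init__(self, index, genes):
        self.index = index
        self.genes = genes
        self.fitness = None

    def render(self):
        self.fitness = sum(abs(g) for g in self.genes)


def render_individual(individual):
    individual.render()
    return individual.index, individual.fitness


def evaluate(population, workers=4):
    # Threads share memory, so individuals are not pickled and copied to
    # workers, and map() returns results in input order.
    with ThreadPool(workers) as pool:
        return pool.map(render_individual, population)


if __name__ == "__main__":
    population = [Individual(i, [i, -2 * i, 3]) for i in range(5)]
    print(evaluate(population))
```

Because the threads work on the same objects as the main thread, each individual's `fitness` is already set when `map` returns, so the step that copies fitness values back from the results would no longer be needed.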
