How to use multithreading to optimize face detection

I have some code that takes a list of image URLs from a CSV file, runs face detection on those images, and then loads some models and makes predictions on them.

I did some load testing and found that the get_face function accounts for most of the time needed to produce the results, and that creating the pickle file used for prediction takes additional time.

Question: is it possible to reduce the time by running these steps in threads, and how would I do that in a multithreaded way?

Here is the code sample:

from __future__ import division
import numpy as np
from multiprocessing import Process, Queue, Pool
import os
import pickle
import pandas as pd
import dlib
from skimage import io
from skimage.transform import resize
df = pd.read_csv('/home/instaurls.csv')
detector = dlib.get_frontal_face_detector()
img_width, img_height = 139, 139
confidence = 0.8
def get_face():
    output = None
    data1 = []
    for row in df.itertuples():
        img = io.imread(row[1])
        dets = detector(img, 1)
        for i, d in enumerate(dets):
            img = img[d.top():d.bottom(), d.left():d.right()]
            img = resize(img, (img_width, img_height))
            output = np.expand_dims(img, axis=0)
            break
        data1.append(output)
    data1 = np.concatenate(data1)
    return data1
get_face()

CSV sample

data
https://scontent-frt3-2.cdninstagram.com/t51.2885-19/s320x320/23101834_1502115223199537_1230866541029883904_n.jpg
https://scontent-frt3-2.cdninstagram.com/t51.2885-19/s320x320/17883193_940000882769400_8455736118338387968_a.jpg
https://scontent-frt3-2.cdninstagram.com/t51.2885-19/s320x320/22427207_1737576603205281_7879421442167668736_n.jpg
https://scontent-frt3-2.cdninstagram.com/t51.2885-19/s320x320/12976287_1720757518213286_1180118177_a.jpg
https://scontent-frt3-2.cdninstagram.com/t51.2885-19/s320x320/23101834_1502115223199537_1230866541029883904_n.jpg
https://scontent-frx5-1.cdninstagram.com/t51.2885-19/s320x320/16788491_748497378632253_566270225134125056_a.jpg
https://scontent-frx5-1.cdninstagram.com/t51.2885-19/s320x320/21819738_128551217878233_9151523109507956736_n.jpg
https://scontent-frx5-1.cdninstagram.com/t51.2885-19/s320x320/14295447_318848895135407_524281974_a.jpg
https://scontent-frx5-1.cdninstagram.com/t51.2885-19/s320x320/18160229_445050155844926_2783054824017494016_a.jpg
https://scontent-frt3-2.cdninstagram.com/t51.2885-19/s320x320/23101834_1502115223199537_1230866541029883904_n.jpg
https://scontent-frt3-2.cdninstagram.com/t51.2885-19/s320x320/17883193_940000882769400_8455736118338387968_a.jpg
https://scontent-frt3-2.cdninstagram.com/t51.2885-19/s320x320/22427207_1737576603205281_7879421442167668736_n.jpg
https://scontent-frt3-2.cdninstagram.com/t51.2885-19/s320x320/12976287_1720757518213286_1180118177_a.jpg
https://scontent-frt3-2.cdninstagram.com/t51.2885-19/s320x320/23101834_1502115223199537_1230866541029883904_n.jpg
https://scontent-frx5-1.cdninstagram.com/t51.2885-19/s320x320/16788491_748497378632253_566270225134125056_a.jpg
https://scontent-frx5-1.cdninstagram.com/t51.2885-19/s320x320/21819738_128551217878233_9151523109507956736_n.jpg
https://scontent-frx5-1.cdninstagram.com/t51.2885-19/s320x320/14295447_318848895135407_524281974_a.jpg
https://scontent-frx5-1.cdninstagram.com/t51.2885-19/s320x320/18160229_445050155844926_2783054824017494016_a.jpg
https://scontent-frt3-2.cdninstagram.com/t51.2885-19/s320x320/23101834_1502115223199537_1230866541029883904_n.jpg

Here is what you could try in order to run it in parallel:

from __future__ import division
import numpy as np
from multiprocessing import Process, Queue, Pool
import os
import pickle
import pandas as pd
import dlib
from skimage import io
from skimage.transform import resize
from csv import DictReader
df = DictReader(open('/home/instaurls.csv')) # DictReader is iterable
detector = dlib.get_frontal_face_detector() 
img_width, img_height = 139, 139
confidence = 0.8
def get_face(row):
    """
    Here row is dictionary where keys are CSV header names
    and values are values from current CSV row.
    """
    output = None
    img = io.imread(row['data'])  # DictReader rows are dicts keyed by the CSV header, here 'data'
    dets = detector(img, 1)
    for i, d in enumerate(dets):
        img = img[d.top():d.bottom(), d.left():d.right()]
        img = resize(img, (img_width, img_height))
        output = np.expand_dims(img, axis=0)
        break
    return output
if __name__ == '__main__':
    pool = Pool()  # defaults to the number of CPU cores
    data = [face for face in pool.imap(get_face, df) if face is not None]  # drop rows with no detected face
    print(np.concatenate(data))

Note that get_face now takes an argument, and note what it returns. That is what I meant by smaller pieces of work: get_face now processes a single row of the CSV.

When you run this script, pool is a reference to a Pool instance, and get_face is then called for every row the DictReader df yields.
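
A small tuning note: pool.imap yields results in order and, by default, hands rows to the workers one at a time (chunksize=1). For a long CSV you can reduce the inter-process overhead by batching rows per task; the value 16 below is only an example, treat it as a drop-in replacement for the imap line above:

    data = [face for face in pool.imap(get_face, df, chunksize=16) if face is not None]  # batch 16 rows per worker task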

Once everything has finished, data holds the processed results, and np.concatenate is applied to them.
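
Since the title asks about multithreading specifically: most of the per-row time in get_face is spent downloading the image over the network, which is I/O-bound, so a thread pool can also overlap those downloads. Below is a minimal sketch that replaces the __main__ block above with concurrent.futures.ThreadPoolExecutor and then pickles the result for the prediction step. The worker count (8), the output name faces.pkl, and the use of HIGHEST_PROTOCOL are assumptions rather than anything from the original code, and the dlib detection itself is CPU-bound, so the Pool version above is usually the better fit when detection rather than downloading dominates.

from concurrent.futures import ThreadPoolExecutor  # on Python 2 this needs the 'futures' backport package

if __name__ == '__main__':
    # Threads overlap the network downloads that io.imread performs inside get_face.
    with ThreadPoolExecutor(max_workers=8) as executor:      # 8 workers is an assumption; tune it
        results = list(executor.map(get_face, df))           # df is the DictReader defined above
    faces = [face for face in results if face is not None]   # skip rows where no face was detected
    faces = np.concatenate(faces)
    # Persist the array so the prediction step can load it later; 'faces.pkl' is a placeholder name.
    with open('faces.pkl', 'wb') as f:
        pickle.dump(faces, f, protocol=pickle.HIGHEST_PROTOCOL)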
