FFmpeg: Is it possible to run piped I/O of PCM audio data block by block (filtering)?

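I am developing a Python FFmpeg wrapper called ffmpegio, and one feature I want to implement is block-wise avfiltering of raw video and audio data. A block of data is piped to FFmpeg, Python waits for FFmpeg to process it and pipe back whatever output is available, rinse and repeat. I have this working for video feeds, but I have a problem with PCM audio I/O. The PCM encoder or decoder appears to block until stdin is closed. Is there any way to get around this behavior?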

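This question is related to another question, "FFmpeg blocking pipe until done", but none of its answers apply (I think).

Edit #1: (removed much of the original text for clarity)

Here is a minimal Python example.

First, this is the common script, in which load_setup() loads the video or audio data: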

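import shlex
import subprocess as sp
from threading import Thread
from time import sleep

import numpy as np  # used by load_setup()


def reader(stdout):
    # Report when the first output byte arrives, then drain the pipe
    print("reading stdout...")
    y = stdout.read(1)
    print(f"  stdout: read the first byte")
    try:
        stdout.read()
    except (ValueError, OSError):
        pass  # the pipe was closed by the main thread


def logger(stderr):
    # Report the first log line, then keep draining stderr
    print("log stderr...")
    l = stderr.readline()
    print(f"  stderr: {l.decode('utf8')}")
    while True:
        try:
            l = stderr.readline()
        except (ValueError, OSError):
            break  # the pipe was closed by the main thread
        if not l:
            break  # EOF: FFmpeg has exited


cmd, x = load_setup()  # <- 2 cases: video & audio
nbytes = x.size * x.itemsize  # size of the input block in bytes

# Split the command string so it also runs without a shell on POSIX
p = sp.Popen(shlex.split(cmd), stdin=sp.PIPE, stdout=sp.PIPE, stderr=sp.PIPE)

rd = Thread(target=reader, args=(p.stdout,))
rd.start()
lg = Thread(target=logger, args=(p.stderr,))
lg.start()

try:
    print("written input data to buffer")
    p.stdin.write(x)
    print("written input data")
    sleep(1)
    print("slept 1 second, closing stdin")
finally:
    p.stdin.close()
    print("stdin closed")
    p.stdout.close()
    p.stderr.close()
    rd.join()
    lg.join()
    p.wait()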

First, rawvideo I/O with this setup function:

def load_setup():
    return (
        "ffmpeg -hide_banner -f rawvideo -s 100x100 -pix_fmt rgb24 -i - -vf 'transpose' -f rawvideo -s 100x100 -",
        np.ones((100, 100, 3), "u1"),
    )

It produces the following output:

reading stdout...
written input data to buffer
log stderr...
written input data
stderr: Input #0, rawvideo, from 'pipe:':
stdout: read the first byte
slept 1 second, closing stdin
stdin closed

Note that the stderr: ... and stdout: ... lines appear before slept 1 second, closing stdin.

Now, the audio counterpart:

def load_setup():
    return (
        "ffmpeg -hide_banner -f f64le -ar 8000 -ac 1 -i - -af lowpass -f f64le -ac 1 -",
        np.ones((16000, 1)),
    )

reading stdout...
written input data to bufferlog stderr...
written input data
slept 1 second, closing stdin
stdin closed
stderr: Guessed Channel Layout for Input Stream #0.0 : mono
stdout: read the first byte

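Here, the stderr and stdout lines both appear after stdin closed, which indicates that FFmpeg outputs the filtered audio samples only after the stdin pipe is closed. This behavior persists with different numbers of samples or with additional stdin.write() calls.

So the question is whether there is any workaround for audio I/O that makes it behave like video I/O, that is, produce some output right after the initial write.

I skimmed pcm.c in the FFmpeg repo, and it looks to me as if the PCM encoder is not behaving correctly. So I am looking for a workaround, something simpler than using an AVI container.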

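Edit #2: Modified the example to read only the first byte, to use a different audio filter, and to use more audio samples.

In case anyone else is curious, I was able to answer my own question by running a longer experiment.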
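The longer experiment itself is not shown in the post, so the following is only a guess at its shape: keep writing fixed-size blocks to stdin and note how many had been written when the first stdout byte arrives. The names first_output_after and FAKE_CHILD are hypothetical, and FAKE_CHILD is a stand-in Python process that holds back its output until it has received 5000 bytes; it is not FFmpeg.

```python
import subprocess as sp
import sys
from threading import Event, Thread


def first_output_after(cmd, block, max_blocks=100, wait=0.2):
    """Write `block` repeatedly to the child's stdin and return how many
    blocks had been written when the first stdout byte arrived, or None
    if nothing arrived before stdin was closed."""
    p = sp.Popen(cmd, stdin=sp.PIPE, stdout=sp.PIPE, stderr=sp.DEVNULL)
    got_output = Event()

    def reader():
        # read(1) blocks until the child emits a byte or closes stdout
        if p.stdout.read(1):
            got_output.set()
        p.stdout.read()  # drain so the child never stalls on a full pipe

    rd = Thread(target=reader, daemon=True)
    rd.start()
    count = None
    try:
        for n in range(1, max_blocks + 1):
            p.stdin.write(block)
            p.stdin.flush()
            # give the child up to `wait` seconds to respond to this block
            if got_output.wait(wait):
                count = n
                break
    finally:
        p.stdin.close()  # EOF lets the child finish and the reader return
        rd.join()
        p.stdout.close()
        p.wait()
    return count


# Hypothetical stand-in for FFmpeg: stays silent until 5000 bytes have arrived
FAKE_CHILD_SRC = """
import sys
src, n = sys.stdin.buffer, 0
while n < 5000:
    chunk = src.read1(4096)
    if not chunk:
        break
    n += len(chunk)
sys.stdout.buffer.write(b"x")
sys.stdout.buffer.flush()
src.read()
"""
FAKE_CHILD = [sys.executable, "-c", FAKE_CHILD_SRC]

if __name__ == "__main__":
    n = first_output_after(FAKE_CHILD, b"\0" * 1000, max_blocks=20, wait=0.5)
    print("first output after block", n)
```

For the audio case in the question, cmd would be the FFmpeg command from load_setup() (split into a list) and block something like np.ones(1024).tobytes(). The returned count times 1024 would then be the upper edge of a range of samples, and the lower edge is one block less. The wait value is a trade-off: too short and a slow response gets attributed to a later block.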

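(Speculation) The PCM encoder/decoder (both use pcm_f32le) initially over-buffers its input, and the maximum buffer size appears to be related to the sampling rate. It tops out somewhere between 51200 and 52224 samples.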

Input rate (S/s)	stderr output first appears between (samples written)
44100	51200-52224
32000	51200-52224
16000	32768-33792
8000	16384-17408