Channel: Intel® Software - Media

Maximum Number of QuickSync sessions

Hi,

I'm trying to determine how many QuickSync H.264 encodes I can run simultaneously (in real time) on a 3rd-gen i7 @ 2.1GHz w/ 16GB of RAM. My source content is encoded using HEVC, 640x360, 30f/s @ 550kb/s, packaged in MPEG2-TS/UDP. I am transcoding that material (in real time) to H.264-MP, 640x360, 30f/s @ 900kb/s, packaged in HLS. I am using a slightly modified version of ffmpeg 2.7 which includes a highly optimized CPU-only HEVC decoder, and a version of the h264_qsv encoder for ffmpeg which still supports the 3rd-gen i7. The box runs 64-bit Ubuntu 12 w/ VA-API version 0.34, driver version 16.3.1.18283. I have confirmed that QuickSync is configured and running properly (vainfo returns supported profiles for H.264 encoding).

As an initial test, I tried a non-real-time transcode operation (HEVC, 640x360, 30f/s @ 550kb/s, packaged in MP4, to H.264-MP, 640x360, 30f/s @ 900kb/s, packaged in MP4). I am achieving roughly 700f/s in this configuration. While ffmpeg is running, the overall CPU utilization is ~25% (spread across the 4 cores). This would suggest to me that 700f/s is the upper limit of what QuickSync can do on this platform (i.e., the GPU is probably pegged at 100%).

Dividing 700 by 30 gives roughly 23, which suggests that the platform should be capable of at least 15 *simultaneous* real-time HEVC>H264 transcodes, with some headroom to spare?
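
To make that arithmetic explicit, here is a minimal sketch of the estimate. The function name and the `headroom` parameter are my own assumptions for illustration; the only measured input is the roughly 700f/s figure from the offline test, and the model assumes throughput divides evenly across sessions, which may not hold in practice.

```python
import math


def max_realtime_sessions(throughput_fps: float, stream_fps: float,
                          headroom: float = 0.0) -> int:
    """Estimate how many real-time streams fit into a measured throughput.

    throughput_fps: frames/s measured in the non-real-time transcode test.
    stream_fps:     frame rate of each real-time stream.
    headroom:       fraction of throughput held in reserve (hypothetical
                    safety margin, not something measured on the box).
    """
    if stream_fps <= 0:
        raise ValueError("stream_fps must be positive")
    if not 0.0 <= headroom < 1.0:
        raise ValueError("headroom must be in [0, 1)")
    usable_fps = throughput_fps * (1.0 - headroom)
    return math.floor(usable_fps / stream_fps)


if __name__ == "__main__":
    # 700 f/s offline throughput, 30 f/s per stream, no margin.
    print(max_realtime_sessions(700, 30))                 # 23
    # Same numbers while keeping 35% of the throughput in reserve.
    print(max_realtime_sessions(700, 30, headroom=0.35))  # 15
```

With no margin the estimate is 23 sessions; 15 sessions corresponds to leaving roughly a third of the measured throughput unused.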

I then spun up 15 instances of ffmpeg in my 'production' configuration (i.e., real-time HEVC/MPEG2-TS/UDP in, real-time H264/HLS out). Everything appears to be working, and my overall CPU utilization (spread across the 4 cores) is again about 25%.

What is interesting is that the 'load average', as reported by top, hits around 4 if I don't force the ffmpeg instances to run on a specific core, or around 10 if I force the 15 ffmpeg instances to run on the same hyper-threaded core (e.g., via taskset -c 0,1).

Obviously, the load average of 10 is concerning, although my CPU utilization is quite reasonable. I understand that load average is the number of processes sitting in a queue waiting on (I thought) CPU or I/O resources to become available. Does 'load average' on Linux also take into account a process waiting on GPU availability (e.g., QuickSync availability)? If so, does that indicate that QuickSync cannot keep up with my 15 real-time requests? If I spot-check the output, everything looks OK.
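
One way to see what is feeding the number is to look at task states directly. On Linux, the load average counts tasks that are runnable (state R) plus tasks in uninterruptible sleep (state D). The sketch below reads `/proc/loadavg` and tallies the state letter of every thread from `/proc/<pid>/task/<tid>/stat`. The function names are hypothetical, and whether a thread waiting on the GPU driver shows up as D is an assumption to be checked, not something this script proves.

```python
import glob
import os
from collections import Counter


def parse_loadavg(text):
    """Return the 1-, 5- and 15-minute load averages from /proc/loadavg text."""
    one, five, fifteen = (float(x) for x in text.split()[:3])
    return one, five, fifteen


def state_from_stat(line):
    """Return the state letter from a /proc/<pid>/stat line.

    The command name is wrapped in parentheses and may itself contain
    spaces or parentheses, so the state is taken as the first field
    after the last ')'.
    """
    return line[line.rfind(")") + 1:].split()[0]


def count_task_states():
    """Tally the current state letter of every thread on the system."""
    counts = Counter()
    for path in glob.glob("/proc/[0-9]*/task/[0-9]*/stat"):
        try:
            with open(path) as f:
                counts[state_from_stat(f.read())] += 1
        except (OSError, IndexError):
            continue  # thread exited while we were scanning
    return counts


if __name__ == "__main__" and os.path.exists("/proc/loadavg"):
    with open("/proc/loadavg") as f:
        print("load averages:", parse_loadavg(f.read()))
    states = count_task_states()
    # R = runnable, D = uninterruptible sleep; both count toward the load.
    print("runnable (R):", states.get("R", 0))
    print("uninterruptible (D):", states.get("D", 0))
```

Sampling this a few times while the 15 instances are running should show whether the load comes mostly from runnable threads (CPU contention, which would fit the taskset case) or from threads in uninterruptible sleep.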

So should I be worried? Is the high load average expected in this scenario? Will it impact other things? Does it suggest I'm 'over budget' on QuickSync resources on this machine? Is there another way to determine if QuickSync is able to maintain real-time encoding?

Thank you!

Ty

