View Poll Results: Which gcc version makes x264 1745 encode faster? | |||
gcc 4.4.4 | 3 | 17.65% | |
gcc 4.5.1 | 14 | 82.35% | |
Voters: 17. |
|
17th October 2010, 00:57 | #1 | Link |
Pain and suffering
Join Date: Jul 2002
Posts: 1,337
|
x264 1745 gcc 4.4.4 versus gcc 4.5.1
gcc 4.4.4: http://x264.nl/dump/x264_1745_gcc_4.4.4/x264.exe
gcc 4.5.1: http://x264.nl/dump/x264_1745_gcc_4.5.1/x264.exe
Which build is faster at encoding: x264 1745 compiled with gcc 4.4.4 or with gcc 4.5.1? Please post the command line you used, and run each test 3x.
Both builds were compiled with ./configure and make. All libs were recompiled with the same gcc version as well: pthreads cvs: 2010-06-22, gpac svn: 2122, ffmpeg git: 25289, ffmpeg-libswscale git: 1252, ffmpegsource svn: 347.
gcc 4.4.5 crashes while compiling: http://x264.nl/dump/x264_1745_gcc_4.4.5_crach.txt
It is a very weird crash: it only happens inside the script http://x264.nl/x264_updater_git.sh, on the second compile. When I run the compile manually after it fails, it works.... |
17th October 2010, 03:39 | #4 | Link |
I often say "maybe"...
Join Date: Jul 2009
Location: France
Posts: 583
|
Hmmm....problem ?
My previous test showed a 1.67% speed gain for 4.5.1... but with low settings:
--crf 22 --me dia -m 2 --b-bias 100 -b 16 -r 2
it's only 0.65%.
So I decided to try slightly higher settings than in my first test:
--crf 20 --me tesa -m 9 --b-adapt 2 --b-bias 100 -b 16 -r 16
(on the same clip, but with a pointresize(640,360)). And... 4.5.1 is slower than 4.4.4! (avg 4.51 against 4.59 fps) |
17th October 2010, 07:23 | #5 | Link |
Registered User
Join Date: Aug 2006
Posts: 2,229
|
Can I ask why --b-bias 100? That effectively forces x264 to use 16 b-frames all the time and disables the internal logic that inserts them only where beneficial. Using this setting impacts quality. If you want to use more b-frames: unless it's animation, anything over around 6 is pointless. If you still wish to set the bias, you could use a more practical number like 5...
The main reason I mentioned it is that in the 'low settings' script you have --me dia and -r 2, but also 16 b-frames and a b-bias of 100. In the higher settings script, --me umh and -m 10 should be more beneficial... |
17th October 2010, 09:00 | #7 | Link |
Registered User
Join Date: Aug 2006
Posts: 2,229
|
I was thinking that was the case, but just to make sure: was the gcc 4.5.1 build consistently slower than 4.4.4? Other applications and system processes can skew the results. Also, the encode statistics should be identical (apart from encode speed) for 4.4.4 and 4.5.1. Are there any anomalies?
|
17th October 2010, 09:17 | #8 | Link |
Pain and suffering
Join Date: Jul 2002
Posts: 1,337
|
-b 2000 ../1280x720p50_parkrun_ter.yuv -o NUL
gcc 4.4.4:
encoded 504 frames, 9.12 fps, 11827.43 kb/s
encoded 504 frames, 14.73 fps, 11827.43 kb/s
encoded 504 frames, 15.00 fps, 11827.43 kb/s
encoded 504 frames, 14.85 fps, 11827.43 kb/s
gcc 4.5.1:
encoded 504 frames, 14.89 fps, 11827.43 kb/s
encoded 504 frames, 14.65 fps, 11827.43 kb/s
encoded 504 frames, 14.72 fps, 11827.43 kb/s
encoded 504 frames, 14.83 fps, 11827.43 kb/s
Except for the first run, gcc 4.4.4 seems a bit faster here. That's why you have to run it 3x or more: the first run loads the file into your system's memory, and after that you can take an average, because your CPU is always busy. Ran on Intel Q66200 stock. |
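The "skip the first run, then average" idea above could be sketched roughly like this in Python. It is only an illustration: the function names and the regular expression are my own assumptions (they are not part of x264), and the numbers are simply copied from the results in this post.

```python
import re
from statistics import mean

# Matches x264's final summary line, e.g.
# "encoded 504 frames, 14.73 fps, 11827.43 kb/s"
SUMMARY_RE = re.compile(r"encoded (\d+) frames, ([\d.]+) fps, ([\d.]+) kb/s")


def parse_fps(log_text):
    """Collect the fps value of every summary line found in log_text."""
    return [float(m.group(2)) for m in SUMMARY_RE.finditer(log_text)]


def steady_state_mean(fps_values, warmup_runs=1):
    """Average the runs after dropping the first warmup_runs (cold cache)."""
    steady = fps_values[warmup_runs:]
    if not steady:
        raise ValueError("need more runs than warm-up runs")
    return mean(steady)


# Summary lines copied from the post above.
GCC_444 = """
encoded 504 frames, 9.12 fps, 11827.43 kb/s
encoded 504 frames, 14.73 fps, 11827.43 kb/s
encoded 504 frames, 15.00 fps, 11827.43 kb/s
encoded 504 frames, 14.85 fps, 11827.43 kb/s
"""
GCC_451 = """
encoded 504 frames, 14.89 fps, 11827.43 kb/s
encoded 504 frames, 14.65 fps, 11827.43 kb/s
encoded 504 frames, 14.72 fps, 11827.43 kb/s
encoded 504 frames, 14.83 fps, 11827.43 kb/s
"""

avg_444 = steady_state_mean(parse_fps(GCC_444))
avg_451 = steady_state_mean(parse_fps(GCC_451))
print(f"gcc 4.4.4: {avg_444:.2f} fps, gcc 4.5.1: {avg_451:.2f} fps")
```

Dropping the first run of both builds keeps the comparison symmetric, even though only the 4.4.4 series shows an obvious cold-start outlier here.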
17th October 2010, 09:20 | #9 | Link | |
Pain and suffering
Join Date: Jul 2002
Posts: 1,337
|
Quote:
If 100 people run this test, and 95% say gcc 4.5.1 is fastest, we can check whether they use slow or fast settings. But I think most people use the default or slower settings. We shall see how the results end up! |
|
17th October 2010, 09:28 | #10 | Link |
Registered User
Join Date: Oct 2004
Location: France
Posts: 567
|
AMD Athlon X2 6000+ 3 GHz - 1610 frames 704x384
--crf 23 --preset medium:
GCC 4.4.4: 27.70 fps, 27.81 fps, 27.69 fps
GCC 4.5.1: 28.45 fps, 28.42 fps, 28.50 fps
--> GCC 4.5.1 2.61% faster
--crf 23 --preset fast:
GCC 4.4.4: 30.79 fps, 30.76 fps, 30.80 fps
GCC 4.5.1: 31.45 fps, 31.55 fps, 31.53 fps
--> GCC 4.5.1 2.36% faster
--crf 23 --preset slow:
GCC 4.4.4: 16.96 fps, 16.97 fps, 17.00 fps
GCC 4.5.1: 17.40 fps, 17.40 fps, 17.40 fps
--> GCC 4.5.1 2.49% faster
(all runs with --output NUL)
Last edited by Underground78; 17th October 2010 at 09:31. |
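For anyone who wants to check or reproduce the percentages above, they appear to be the relative difference between the mean fps of the two builds. A small Python sketch, assuming exactly that definition (the helper name is made up for illustration, and the fps values are the ones posted above):

```python
from statistics import mean


def percent_gain(baseline_fps, candidate_fps):
    """Relative speed-up of candidate over baseline, in percent of baseline."""
    base = mean(baseline_fps)
    return (mean(candidate_fps) - base) / base * 100.0


# (GCC 4.4.4 runs, GCC 4.5.1 runs) per preset, copied from this post.
RUNS = {
    "medium": ([27.70, 27.81, 27.69], [28.45, 28.42, 28.50]),
    "fast": ([30.79, 30.76, 30.80], [31.45, 31.55, 31.53]),
    "slow": ([16.96, 16.97, 17.00], [17.40, 17.40, 17.40]),
}

for preset, (gcc_444, gcc_451) in RUNS.items():
    gain = percent_gain(gcc_444, gcc_451)
    print(f"--preset {preset}: GCC 4.5.1 {gain:.2f}% faster")
```

A negative result would mean the candidate build is slower than the baseline.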
18th October 2010, 11:30 | #13 | Link |
Join Date: Mar 2006
Location: Barcelona
Posts: 5,034
|
I see.
So, there isn't anything that 4.4.x (or 4.5.x) could screw up worse than 3.4.x, right? I tried both (3.4.5 and 4.5.1) and there is very little performance difference. The only thing I noticed is that the 3.4.5 build is much smaller than 4.5.1 (780KB compared to 1MB, without gpac). |
18th October 2010, 19:53 | #15 | Link |
User of free A/V tools
Join Date: Jul 2006
Location: SK
Posts: 826
|
I made 9 rounds with each GCC version, 3 runs with 3 different presets on i7-920 @ stock frequency 2.66GHz.
Command lines used:
Code:
x264.exe --preset slower --tune film --crf 18.0 --sar 1:1 --level 4.1 --keyint 240 --output null test.avs
x264.exe --preset medium --tune film --crf 18.0 --sar 1:1 --level 4.1 --keyint 240 --output null test.avs
x264.exe --preset faster --tune film --crf 18.0 --sar 1:1 --level 4.1 --keyint 240 --output null test.avs
GCC 4.4.4 results:
GCC 4.5.1 results:
However, in the very last run (--preset faster) there occurred a non-deterministic result for GCC 4.4.4:
Code:
avs [info]: 1920x800p 1:1 @ 24000/1001 fps (cfr)
x264 [info]: using SAR=1/1
x264 [info]: using cpu capabilities: MMX2 SSE2Fast SSSE3 FastShuffle SSE4.2
x264 [info]: profile High, level 4.1
x264 [info]: frame I:16 Avg QP:16.58 size:217109
x264 [info]: frame P:375 Avg QP:20.04 size:150895
x264 [info]: frame B:329 Avg QP:21.87 size: 98865
x264 [info]: consecutive B-frames: 9.2% 85.2% 0.4% 5.1%
x264 [info]: mb I I16..4: 39.3% 18.8% 42.0%
x264 [info]: mb P I16..4: 13.2% 17.5% 14.4% P16..4: 15.8% 11.6% 10.5% 0.0% 0.0% skip:16.9%
x264 [info]: mb B I16..4: 6.3% 10.4% 4.6% B16..8: 24.2% 14.8% 3.6% direct: 14.9% skip:21.3% L0:26.6% L1:22.0% BI:51.4%
x264 [info]: 8x8 transform intra:40.3% inter:19.6%
x264 [info]: coded y,uvDC,uvAC intra: 89.1% 51.0% 17.3% inter: 53.0% 19.9% 2.4%
x264 [info]: i16 v,h,dc,p: 30% 6% 44% 20%
x264 [info]: i8 v,h,dc,ddl,ddr,vr,hd,vl,hu: 10% 11% 42% 4% 9% 5% 6% 6% 8%
x264 [info]: i4 v,h,dc,ddl,ddr,vr,hd,vl,hu: 11% 15% 24% 5% 12% 7% 8% 7% 11%
x264 [info]: i8c dc,h,v,p: 58% 18% 20% 3%
x264 [info]: ref P L0: 69.3% 9.8% 20.9%
x264 [info]: ref B L0: 77.4% 22.6%
x264 [info]: ref B L1: 99.6% 0.4%
x264 [info]: kb/s:24664.96
encoded 720 frames, 14.09 fps, 24664.96 kb/s
Code:
avs [info]: 1920x800p 1:1 @ 24000/1001 fps (cfr)
x264 [info]: using SAR=1/1
x264 [info]: using cpu capabilities: MMX2 SSE2Fast SSSE3 FastShuffle SSE4.2
x264 [info]: profile High, level 4.1
x264 [info]: frame I:16 Avg QP:16.58 size:217110
x264 [info]: frame P:375 Avg QP:20.04 size:150898
x264 [info]: frame B:329 Avg QP:21.87 size: 98881
x264 [info]: consecutive B-frames: 9.2% 85.2% 0.4% 5.1%
x264 [info]: mb I I16..4: 39.3% 18.8% 41.9%
x264 [info]: mb P I16..4: 13.2% 17.5% 14.4% P16..4: 15.8% 11.6% 10.6% 0.0% 0.0% skip:16.9%
x264 [info]: mb B I16..4: 6.3% 10.4% 4.6% B16..8: 24.2% 14.8% 3.6% direct: 14.9% skip:21.2% L0:26.6% L1:22.0% BI:51.4%
x264 [info]: 8x8 transform intra:40.3% inter:19.7%
x264 [info]: coded y,uvDC,uvAC intra: 89.1% 51.0% 17.2% inter: 53.1% 19.9% 2.4%
x264 [info]: i16 v,h,dc,p: 30% 6% 44% 20%
x264 [info]: i8 v,h,dc,ddl,ddr,vr,hd,vl,hu: 10% 11% 42% 4% 9% 5% 6% 6% 8%
x264 [info]: i4 v,h,dc,ddl,ddr,vr,hd,vl,hu: 11% 15% 24% 5% 12% 7% 8% 7% 11%
x264 [info]: i8c dc,h,v,p: 58% 18% 20% 3%
x264 [info]: ref P L0: 69.4% 9.7% 20.9%
x264 [info]: ref B L0: 77.4% 22.6%
x264 [info]: ref B L1: 99.6% 0.4%
x264 [info]: kb/s:24666.65
encoded 720 frames, 14.05 fps, 24666.65 kb/s |
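Spotting the differences between two such logs by eye is tedious, so here is a rough Python sketch that lists only the lines that differ while ignoring the fps figure (speed is expected to vary between runs). The function name and the fps-masking regex are assumptions made for illustration; the sample lines are a short excerpt of the two logs above.

```python
import re

# Encode speed legitimately differs between runs, so mask it before comparing.
FPS_RE = re.compile(r"[\d.]+ fps")


def changed_lines(log_a, log_b):
    """Return (line_number, line_a, line_b) for each line that differs."""
    rows_a = log_a.strip().splitlines()
    rows_b = log_b.strip().splitlines()
    if len(rows_a) != len(rows_b):
        raise ValueError("logs have different line counts")
    diffs = []
    for number, (a, b) in enumerate(zip(rows_a, rows_b), start=1):
        if FPS_RE.sub("<fps>", a) != FPS_RE.sub("<fps>", b):
            diffs.append((number, a, b))
    return diffs


# Excerpt of the two GCC 4.4.4 logs posted above.
RUN_1 = """
x264 [info]: frame I:16 Avg QP:16.58 size:217109
x264 [info]: ref B L0: 77.4% 22.6%
x264 [info]: kb/s:24664.96
encoded 720 frames, 14.09 fps, 24664.96 kb/s
"""
RUN_2 = """
x264 [info]: frame I:16 Avg QP:16.58 size:217110
x264 [info]: ref B L0: 77.4% 22.6%
x264 [info]: kb/s:24666.65
encoded 720 frames, 14.05 fps, 24666.65 kb/s
"""

for number, a, b in changed_lines(RUN_1, RUN_2):
    print(f"line {number}:\n  run 1: {a}\n  run 2: {b}")
```

An empty result for two runs of the same build with the same settings is what a deterministic encode should produce.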
19th October 2010, 08:07 | #17 | Link |
User of free A/V tools
Join Date: Jul 2006
Location: SK
Posts: 826
|
I strongly doubt that. If anybody can suggest how to make a 100% deterministic input for x264 (preferably using my sample script, not some other material), I could try to locate the problem better. As it stands, I lean towards an x264 encoding issue (with GCC 4.4.4) rather than a DGDecodeNV decoding issue.
|
19th October 2010, 08:40 | #19 | Link | |
Software Developer
Join Date: Jun 2005
Location: Last House on Slunk Street
Posts: 13,249
|
Quote:
http://media.xiph.org/video/derf/
Using an H.264 source only adds another layer of uncertainty to the "deterministic" test. I usually do an MD5 comparison of the output files after updating my compiler, using one of those Xiph samples. It's not a guarantee, because it only tests one particular combination of settings with one particular source, but it has helped me to identify compiler issues in the past...
Here are the results from the latest test: http://pastie.org/1232074
(BTW: Is there any reason why we are comparing GCC 4.4.4 here, after 4.4.5 has already been officially released from the 4.4.x tree? Any regressions I should know about?)
__________________
Go to https://standforukraine.com/ to find legitimate Ukrainian Charities 🇺🇦✊ Last edited by LoRd_MuldeR; 19th October 2010 at 09:15. |
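The MD5 comparison of output files mentioned in this post could look roughly like the Python sketch below. It is not the script actually used for the linked results; the helper names are hypothetical, and the in-memory byte strings merely stand in for the files two builds would write.

```python
import hashlib
import io


def md5_of_stream(stream, chunk_size=1 << 20):
    """Hash a binary file-like object in chunks to keep memory use flat."""
    digest = hashlib.md5()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def md5_of_file(path):
    """Convenience wrapper for a file on disk (path chosen by the caller)."""
    with open(path, "rb") as handle:
        return md5_of_stream(handle)


# In-memory stand-ins for the outputs of two builds (made-up bytes).
build_a = io.BytesIO(b"\x00\x00\x00\x01 same bitstream")
build_b = io.BytesIO(b"\x00\x00\x00\x01 same bitstream")
identical = md5_of_stream(build_a) == md5_of_stream(build_b)
print("outputs identical:", identical)
```

In practice one would encode the same raw clip with the same settings using each build and pass the two output paths to md5_of_file. As noted above, a match only covers that one source and settings combination.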