View Poll Results: Which gcc version makes x264 1745 encode faster?
gcc 4.4.4: 3 votes (17.65%)
gcc 4.5.1: 14 votes (82.35%)
Voters: 17.

Old 17th October 2010, 00:57   #1  |  Link
bob0r
Pain and suffering
 
Join Date: Jul 2002
Posts: 1,337
x264 1745 gcc 4.4.4 versus gcc 4.5.1

gcc 4.4.4: http://x264.nl/dump/x264_1745_gcc_4.4.4/x264.exe
gcc 4.5.1: http://x264.nl/dump/x264_1745_gcc_4.5.1/x264.exe



x264 1745, gcc 4.4.4 versus gcc 4.5.1: which build encodes fastest?
Please post the command line you used, and run each test 3x.

Compiled with ./configure and make.

All libs were recompiled with the same gcc version as well:
pthreads cvs: 2010-06-22, gpac svn: 2122, ffmpeg git: 25289,
ffmpeg-libswscale git: 1252, ffmpegsource svn: 347


gcc 4.4.5 crashes while compiling: http://x264.nl/dump/x264_1745_gcc_4.4.5_crach.txt
It's a very weird crash: it only happens inside the script http://x264.nl/x264_updater_git.sh,
on the second compile. When I run it manually after it fails, it works....
Old 17th October 2010, 01:08   #2  |  Link
Dark Shikari
x264 developer
 
Join Date: Sep 2005
Posts: 8,666
I still use gcc 3.4
Old 17th October 2010, 02:05   #3  |  Link
MatLz
I often say "maybe"...
 
Join Date: Jul 2009
Location: France
Posts: 583
Core2duo @1.66 GHz, 701 frames @1280x720

--crf 20 --me umh -m 9 --b-adapt 0 --b-bias 100 -b 16 -r 8 -o mkv avs

gcc4.4.4 : 4.79, 4.79 and 4.80 fps

gcc4.5.1 : 4.87, 4.88 and 4.87 fps
Old 17th October 2010, 03:39   #4  |  Link
MatLz
I often say "maybe"...
 
Join Date: Jul 2009
Location: France
Posts: 583
Hmmm....problem ?
My previous test shows a 1.67% speed gain for 4.5.1... but with low settings:
--crf 22 --me dia -m 2 --b-bias 100 -b 16 -r 2
It's only 0.65%.


So I decided to try slightly higher settings than in my first test:
--crf 20 --me tesa -m 9 --b-adapt 2 --b-bias 100 -b 16 -r 16 (on the same clip, but with a pointresize(640,360))
And... 4.5.1 is slower than 4.4.4! (avg 4.51 against 4.59 fps)
Old 17th October 2010, 07:23   #5  |  Link
burfadel
Registered User
 
Join Date: Aug 2006
Posts: 2,229
Can I ask why --b-bias 100? That effectively forces x264 to use 16 b-frames all the time and disables the internal logic that inserts them where beneficial, so this setting hurts quality. If you want more b-frames: unless it's animation, anything over around 6 is pointless. If you still want to set the bias, you could use a more practical number like 5...

The main reason I mention it is that in the 'low settings' command line you have --me dia and -r 2, but also 16 b-frames and --b-bias 100. In the higher-settings command line, --me umh and -m 10 should be more beneficial...
Old 17th October 2010, 07:26   #6  |  Link
MatLz
I often say "maybe"...
 
Join Date: Jul 2009
Location: France
Posts: 583
These are only tests...
Don't pay attention to the settings...
Old 17th October 2010, 09:00   #7  |  Link
burfadel
Registered User
 
Join Date: Aug 2006
Posts: 2,229
I was thinking that was the case, but just making sure. Was gcc 4.5.1 consistently slower than 4.4.4? Other processes/system processes can skew the results. Also, the encode statistics should be identical (apart from encode speed) for 4.4.4 and 4.5.1. Are there any anomalies?
Old 17th October 2010, 09:17   #8  |  Link
bob0r
Pain and suffering
 
Join Date: Jul 2002
Posts: 1,337
-b 2000 ../1280x720p50_parkrun_ter.yuv -o NUL

4.4.4
encoded 504 frames, 9.12 fps, 11827.43 kb/s
encoded 504 frames, 14.73 fps, 11827.43 kb/s
encoded 504 frames, 15.00 fps, 11827.43 kb/s
encoded 504 frames, 14.85 fps, 11827.43 kb/s

4.5.1
encoded 504 frames, 14.89 fps, 11827.43 kb/s
encoded 504 frames, 14.65 fps, 11827.43 kb/s
encoded 504 frames, 14.72 fps, 11827.43 kb/s
encoded 504 frames, 14.83 fps, 11827.43 kb/s

Except for the first run, gcc 4.4.4 seems a bit faster here.
That's why you have to run it 3x or more: the first run loads everything into your system's memory, and the later runs can then be averaged to even out whatever else is keeping your CPU busy.

Ran on Intel Q6600 stock.
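
For anyone who wants to script the "skip the first run, then average" idea, here is a minimal Python sketch. It assumes the fps figure is scraped from x264's final "encoded N frames, X fps, Y kb/s" line as pasted above. The helper names (parse_fps, steady_state_mean) are made up for illustration, and dropping exactly one warm-up run is an assumption, not a rule.

```python
import re
from statistics import mean

# Matches x264's final summary line, e.g.
# "encoded 504 frames, 14.73 fps, 11827.43 kb/s"
SUMMARY_RE = re.compile(r"encoded\s+(\d+)\s+frames,\s+([\d.]+)\s+fps")


def parse_fps(line):
    """Return the fps value from a summary line, or None if it does not match."""
    match = SUMMARY_RE.search(line)
    return float(match.group(2)) if match else None


def steady_state_mean(lines, warmup=1):
    """Average fps over all runs after skipping the first `warmup` runs."""
    fps = [value for value in map(parse_fps, lines) if value is not None]
    if len(fps) <= warmup:
        raise ValueError("need more runs than warm-up runs")
    return mean(fps[warmup:])


# Summary lines copied from the runs posted above.
gcc444 = [
    "encoded 504 frames, 9.12 fps, 11827.43 kb/s",
    "encoded 504 frames, 14.73 fps, 11827.43 kb/s",
    "encoded 504 frames, 15.00 fps, 11827.43 kb/s",
    "encoded 504 frames, 14.85 fps, 11827.43 kb/s",
]
gcc451 = [
    "encoded 504 frames, 14.89 fps, 11827.43 kb/s",
    "encoded 504 frames, 14.65 fps, 11827.43 kb/s",
    "encoded 504 frames, 14.72 fps, 11827.43 kb/s",
    "encoded 504 frames, 14.83 fps, 11827.43 kb/s",
]

print("gcc 4.4.4:", round(steady_state_mean(gcc444), 2), "fps")
print("gcc 4.5.1:", round(steady_state_mean(gcc451), 2), "fps")
```

Note that the sketch drops the first run of both builds, so the 14.89 fps run of 4.5.1 is discarded too. Whether that is fair depends on which binary was started first.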
Old 17th October 2010, 09:20   #9  |  Link
bob0r
Pain and suffering
 
Join Date: Jul 2002
Posts: 1,337
Quote:
Originally Posted by MatLz View Post
Hmmm....problem ?
My previous test shows a 1.67% speed gain for 4.5.1... but with low settings:
--crf 22 --me dia -m 2 --b-bias 100 -b 16 -r 2
It's only 0.65%.


So I decided to try slightly higher settings than in my first test:
--crf 20 --me tesa -m 9 --b-adapt 2 --b-bias 100 -b 16 -r 16 (on the same clip, but with a pointresize(640,360))
And... 4.5.1 is slower than 4.4.4! (avg 4.51 against 4.59 fps)
That's why I only ask for the command line used.
If 100 people run this test and 95% say gcc 4.5.1 is fastest, we can check whether they used slow or fast settings.
But I think most people use the default or slower settings. We shall see how the results end up!
Old 17th October 2010, 09:28   #10  |  Link
Underground78
Registered User
 
Join Date: Oct 2004
Location: France
Posts: 567
AMD Athlon X2 6000+ 3 GHz - 1610 frames 704x384

--crf 23 --preset medium:
GCC 4.4.4: 27.70 fps, 27.81 fps, 27.69 fps
GCC 4.5.1: 28.45 fps, 28.42 fps, 28.50 fps
--> GCC 4.5.1 2.61% faster

--crf 23 --preset fast:
GCC 4.4.4: 30.79 fps, 30.76 fps, 30.80 fps
GCC 4.5.1: 31.45 fps, 31.55 fps, 31.53 fps
--> GCC 4.5.1 2.36% faster

--crf 23 --preset slow:
GCC 4.4.4: 16.96 fps, 16.97 fps, 17.00 fps
GCC 4.5.1: 17.40 fps, 17.40 fps, 17.40 fps
--> GCC 4.5.1 2.49% faster

(all runs with --output NUL)
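
The percentages above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic (mean fps of one build relative to the mean fps of the other); the function name speedup_percent is made up for illustration.

```python
from statistics import mean


def speedup_percent(baseline_fps, candidate_fps):
    """Relative change of the candidate's mean fps over the baseline's, in percent."""
    base = mean(baseline_fps)
    return (mean(candidate_fps) - base) / base * 100.0


# (gcc 4.4.4 runs, gcc 4.5.1 runs) per preset, copied from the numbers above.
runs = {
    "medium": ([27.70, 27.81, 27.69], [28.45, 28.42, 28.50]),
    "fast": ([30.79, 30.76, 30.80], [31.45, 31.55, 31.53]),
    "slow": ([16.96, 16.97, 17.00], [17.40, 17.40, 17.40]),
}

for preset, (gcc444, gcc451) in runs.items():
    print(f"{preset}: {speedup_percent(gcc444, gcc451):+.2f}%")
```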

Last edited by Underground78; 17th October 2010 at 09:31.
Old 18th October 2010, 10:24   #11  |  Link
Groucho2004
 
Join Date: Mar 2006
Location: Barcelona
Posts: 5,034
Quote:
Originally Posted by Dark Shikari View Post
I still use gcc 3.4
I assume that you have a good reason why you're still using 3.4.x. Could you elaborate?
Old 18th October 2010, 11:18   #12  |  Link
Dark Shikari
x264 developer
 
Join Date: Sep 2005
Posts: 8,666
Quote:
Originally Posted by Groucho2004 View Post
I assume that you have a good reason why you're still using 3.4.x. Could you elaborate?
It comes with Cygwin and I'm lazy.
Old 18th October 2010, 11:30   #13  |  Link
Groucho2004
 
Join Date: Mar 2006
Location: Barcelona
Posts: 5,034
Quote:
Originally Posted by Dark Shikari View Post
It comes with Cygwin and I'm lazy.
I see.

So, there isn't anything that 4.4.x (or 4.5.x) could screw up worse than 3.4.x, right? I tried both (3.4.5 and 4.5.1) and there is very little performance difference. The only thing I noticed is that the 3.4.5 build is much smaller than 4.5.1 (780KB compared to 1MB, without gpac).
Old 18th October 2010, 16:12   #14  |  Link
bob0r
Pain and suffering
 
Join Date: Jul 2002
Posts: 1,337
As you can see, the size difference between 4.4.4 and 4.5.1 is quite large as well. I guess too many factors go into why a gcc version does what it does.
Old 18th October 2010, 19:53   #15  |  Link
kypec
User of free A/V tools
 
Join Date: Jul 2006
Location: SK
Posts: 826
I ran 9 rounds with each GCC version (3 runs with each of 3 presets) on an i7-920 at its stock frequency of 2.66GHz.

Command lines used:
Code:
x264.exe --preset slower --tune film --crf 18.0 --sar 1:1 --level 4.1 --keyint 240 --output null test.avs
x264.exe --preset medium --tune film --crf 18.0 --sar 1:1 --level 4.1 --keyint 240 --output null test.avs
x264.exe --preset faster --tune film --crf 18.0 --sar 1:1 --level 4.1 --keyint 240 --output null test.avs
Sample = 1920 x 800 24p sourced via DGDecodeNV, 720 frames

GCC 4.4.4 results:
  • slower: 1.53 / 1.39 / 1.39 ~ 1.44
  • medium: 7.96 / 7.95 / 8.03 ~ 7.98
  • faster: 13.99 / 14.05 / 14.09 ~ 14.04

GCC 4.5.1 results:
  • slower: 1.72 / 1.51 / 1.44 ~ 1.56 +8.33%
  • medium: 8.19 / 8.18 / 8.08 ~ 8.15 +2.13%
  • faster: 14.43 / 14.55 / 14.36 ~ 14.45 +2.92%
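
With only three runs per build it is worth checking whether the gap is larger than the run-to-run spread. The sketch below is a crude check, not a statistical test: it simply asks whether the min..max ranges of the two builds overlap. The function name ranges_overlap is made up for illustration.

```python
def ranges_overlap(a, b):
    """True if the min..max spans of two sets of runs intersect."""
    return min(a) <= max(b) and min(b) <= max(a)


# (gcc 4.4.4 runs, gcc 4.5.1 runs) per preset, copied from the lists above.
kypec_runs = {
    "slower": ([1.53, 1.39, 1.39], [1.72, 1.51, 1.44]),
    "medium": ([7.96, 7.95, 8.03], [8.19, 8.18, 8.08]),
    "faster": ([13.99, 14.05, 14.09], [14.43, 14.55, 14.36]),
}

for preset, (gcc444, gcc451) in kypec_runs.items():
    if ranges_overlap(gcc444, gcc451):
        print(preset, "ranges overlap, difference may be noise")
    else:
        print(preset, "ranges are separated")
```

By this check the medium and faster results are cleanly separated, while the slower preset (the one with the +8.33% figure) has overlapping ranges and would need more runs.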

However, in the very last run (--preset faster), GCC 4.4.4 produced a non-deterministic result:
Code:
avs [info]: 1920x800p 1:1 @ 24000/1001 fps (cfr)
x264 [info]: using SAR=1/1
x264 [info]: using cpu capabilities: MMX2 SSE2Fast SSSE3 FastShuffle SSE4.2
x264 [info]: profile High, level 4.1
x264 [info]: frame I:16    Avg QP:16.58  size:217109
x264 [info]: frame P:375   Avg QP:20.04  size:150895
x264 [info]: frame B:329   Avg QP:21.87  size: 98865
x264 [info]: consecutive B-frames:  9.2% 85.2%  0.4%  5.1%
x264 [info]: mb I  I16..4: 39.3% 18.8% 42.0%
x264 [info]: mb P  I16..4: 13.2% 17.5% 14.4%  P16..4: 15.8% 11.6% 10.5%  0.0%  0.0%    skip:16.9%
x264 [info]: mb B  I16..4:  6.3% 10.4%  4.6%  B16..8: 24.2% 14.8%  3.6%  direct:14.9%  skip:21.3%  L0:26.6% L1:22.0% BI:51.4%
x264 [info]: 8x8 transform intra:40.3% inter:19.6%
x264 [info]: coded y,uvDC,uvAC intra: 89.1% 51.0% 17.3% inter: 53.0% 19.9% 2.4%
x264 [info]: i16 v,h,dc,p: 30%  6% 44% 20%
x264 [info]: i8 v,h,dc,ddl,ddr,vr,hd,vl,hu: 10% 11% 42%  4%  9%  5%  6%  6%  8%
x264 [info]: i4 v,h,dc,ddl,ddr,vr,hd,vl,hu: 11% 15% 24%  5% 12%  7%  8%  7% 11%
x264 [info]: i8c dc,h,v,p: 58% 18% 20%  3%
x264 [info]: ref P L0: 69.3%  9.8% 20.9%
x264 [info]: ref B L0: 77.4% 22.6%
x264 [info]: ref B L1: 99.6%  0.4%
x264 [info]: kb/s:24664.96

encoded 720 frames, 14.09 fps, 24664.96 kb/s
as opposed to all other runs (2 times GCC 4.4.4 and 3 times GCC 4.5.1)
Code:
avs [info]: 1920x800p 1:1 @ 24000/1001 fps (cfr)
x264 [info]: using SAR=1/1
x264 [info]: using cpu capabilities: MMX2 SSE2Fast SSSE3 FastShuffle SSE4.2
x264 [info]: profile High, level 4.1
x264 [info]: frame I:16    Avg QP:16.58  size:217110
x264 [info]: frame P:375   Avg QP:20.04  size:150898
x264 [info]: frame B:329   Avg QP:21.87  size: 98881
x264 [info]: consecutive B-frames:  9.2% 85.2%  0.4%  5.1%
x264 [info]: mb I  I16..4: 39.3% 18.8% 41.9%
x264 [info]: mb P  I16..4: 13.2% 17.5% 14.4%  P16..4: 15.8% 11.6% 10.6%  0.0%  0.0%    skip:16.9%
x264 [info]: mb B  I16..4:  6.3% 10.4%  4.6%  B16..8: 24.2% 14.8%  3.6%  direct:14.9%  skip:21.2%  L0:26.6% L1:22.0% BI:51.4%
x264 [info]: 8x8 transform intra:40.3% inter:19.7%
x264 [info]: coded y,uvDC,uvAC intra: 89.1% 51.0% 17.2% inter: 53.1% 19.9% 2.4%
x264 [info]: i16 v,h,dc,p: 30%  6% 44% 20%
x264 [info]: i8 v,h,dc,ddl,ddr,vr,hd,vl,hu: 10% 11% 42%  4%  9%  5%  6%  6%  8%
x264 [info]: i4 v,h,dc,ddl,ddr,vr,hd,vl,hu: 11% 15% 24%  5% 12%  7%  8%  7% 11%
x264 [info]: i8c dc,h,v,p: 58% 18% 20%  3%
x264 [info]: ref P L0: 69.4%  9.7% 20.9%
x264 [info]: ref B L0: 77.4% 22.6%
x264 [info]: ref B L1: 99.6%  0.4%
x264 [info]: kb/s:24666.65

encoded 720 frames, 14.05 fps, 24666.65 kb/s
Is this normal? I thought that non-deterministic results are possible only when VBV model is specified with multithread processing...
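
To spot exactly where two such logs diverge without comparing them by eye, a line-by-line comparison is enough. The sketch below assumes both logs have the same number of lines in the same order (true for the two dumps above); the function name differing_lines is made up for illustration, and the sample strings are a four-line excerpt of the logs, not the full output.

```python
def differing_lines(log_a, log_b):
    """Return (line_from_a, line_from_b) pairs for lines that differ.

    Assumes both logs have the same number of lines in the same order.
    """
    pairs = zip(log_a.splitlines(), log_b.splitlines())
    return [(a, b) for a, b in pairs if a != b]


# Excerpt of the odd GCC 4.4.4 run.
odd_run = """\
x264 [info]: frame I:16    Avg QP:16.58  size:217109
x264 [info]: frame P:375   Avg QP:20.04  size:150895
x264 [info]: ref B L1: 99.6%  0.4%
x264 [info]: kb/s:24664.96"""

# Same excerpt from the other five runs.
usual_run = """\
x264 [info]: frame I:16    Avg QP:16.58  size:217110
x264 [info]: frame P:375   Avg QP:20.04  size:150898
x264 [info]: ref B L1: 99.6%  0.4%
x264 [info]: kb/s:24666.65"""

for a, b in differing_lines(odd_run, usual_run):
    print("-", a)
    print("+", b)
```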
Old 18th October 2010, 22:00   #16  |  Link
MasterNobody
Registered User
 
Join Date: Jul 2007
Posts: 552
Quote:
Originally Posted by kypec View Post
Is this normal? I thought that non-deterministic results are possible only when VBV model is specified with multithread processing...
It must be deterministic. Maybe the decoding (DGDecodeNV) is non-deterministic?
Old 19th October 2010, 08:07   #17  |  Link
kypec
User of free A/V tools
 
Join Date: Jul 2006
Location: SK
Posts: 826
Quote:
Originally Posted by MasterNobody View Post
It must be deterministic. Maybe the decoding (DGDecodeNV) is non-deterministic?
I strongly doubt that. If anybody can suggest how to make 100% deterministic input for x264 (preferably using my sample script, not some other material), I could try to narrow down the problem. As it stands, I lean towards an x264 encoding issue (with GCC 4.4.4) rather than DGDecodeNV decoding.
Old 19th October 2010, 08:21   #18  |  Link
MatLz
I often say "maybe"...
 
Join Date: Jul 2009
Location: France
Posts: 583
In my three series of tests, the output files were identical (for the same command line... of course).
Old 19th October 2010, 08:40   #19  |  Link
LoRd_MuldeR
Software Developer
 
Join Date: Jun 2005
Location: Last House on Slunk Street
Posts: 13,249
Quote:
Originally Posted by kypec View Post
I strongly doubt that. If anybody can suggest how to make 100% deterministic input for x264 (preferably using my sample script, not some other material), I could try to narrow down the problem. As it stands, I lean towards an x264 encoding issue (with GCC 4.4.4) rather than DGDecodeNV decoding.
Use a lossless sample file (raw YUV data) from here, for example:
http://media.xiph.org/video/derf/

Using an H.264 source only adds another layer of uncertainty to the "deterministic" test. I usually do an MD5 comparison of the output files after updating my compiler, using one of those Xiph samples. It's not a guarantee, because it only tests one particular combination of settings with one particular source, but it has helped me identify compiler issues in the past...

Here are the results from the latest test:
http://pastie.org/1232074
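
A minimal Python sketch of the MD5 comparison described above, for anyone without an md5 tool at hand. The function names are made up for illustration, and it assumes each encode was written to a real file rather than to NUL.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large encodes need not fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def outputs_identical(path_a, path_b):
    """True if both files have the same MD5 digest."""
    return md5_of_file(path_a) == md5_of_file(path_b)
```

Usage would look like outputs_identical("out_gcc444.264", "out_gcc451.264"), where both file names are hypothetical outputs of the same command line run with the two builds.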

(BTW: Is there any reason why we are comparing GCC 4.4.4 here, after 4.4.5 has already been officially released from the 4.4.x tree? Any regressions I should know about?)
__________________
Go to https://standforukraine.com/ to find legitimate Ukrainian Charities 🇺🇦✊

Last edited by LoRd_MuldeR; 19th October 2010 at 09:15.
Old 19th October 2010, 12:53   #20  |  Link
video_magic
Registered User
 
Join Date: Jan 2005
Posts: 368
Quote:
Originally Posted by kypec View Post
I made 9 rounds with each GCC version, 3 runs with 3 different presets on i7-920 @ stock frequency 2.66GHz......
....Is this normal? I thought that non-deterministic results are possible only when VBV model is specified with multithread processing...
Maybe you could check for a RAM error, or a HDD read/write error. You might have an unreliable component.
__________________
Thankyou!, I am grateful for any help