x86 Assembly

slow floating point on Pentium 4 Posted by ckdh on 25 Apr 2011 at 6:32 AM
Hi,

I have performance issues with the following code:

...
mulsd %xmm1, %xmm0      # a = a * b
addsd %xmm2, %xmm0      # a = a + c
add $1, %eax            # ++i
cmp $10000000, %eax     # compare i with 10000000
...

basically: for (i = 0; i < 10000000; ++i) { a = a * b + c; }

When c != 0.0 the loop runs about 20 times faster than when c == 0.0.

Does anybody have any ideas on what the problem is?

thanks,

--ckdh
Re: slow floating point on Pentium 4 Posted by Bret on 26 Apr 2011 at 8:19 AM
I'm not familiar with those specific instructions, but there are a couple of general things to keep in mind.

On modern hardware, at least MUL and DIV are usually processed iteratively in microcode and use "exit early" algorithms, so the time an operation takes will vary depending on exactly what the input values are. In addition, floating point results are rounded and truncated, so a value you expect to be 0.0 may not be precisely 0.0. I don't know what your particular application involves, but if you're worried about speed you should not use floating point numbers when integers will work just as well.

IOW, I'm not surprised that the timing is variable, and a factor of 20 may not be all that unreasonable.
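
One way to check whether the input values really are what makes the difference is to run the same recurrence in C and look at what a ends up as. This is only a sketch under assumptions: the original post doesn't show the starting values of a and b, so the 1.0 and 0.9 mentioned below are made up, and the helper names (run_loop, is_subnormal, time_loop) are hypothetical. With c == 0.0 and |b| < 1, a shrinks on every pass and can end up stuck as a subnormal (denormal) number, and operations on subnormals are known to be much slower than normal ones on some processors, the Pentium 4 among them. Whether that is what is happening in ckdh's code can't be confirmed from the fragment shown.

```c
#include <math.h>
#include <time.h>

/* Same recurrence as the loop in the question: a = a * b + c, n times. */
double run_loop(double a, double b, double c, long n)
{
    for (long i = 0; i < n; ++i)
        a = a * b + c;
    return a;
}

/* Hypothetical helper: returns 1 if x is a subnormal (denormal) double. */
int is_subnormal(double x)
{
    return fpclassify(x) == FP_SUBNORMAL;
}

/* Hypothetical helper: CPU seconds spent in run_loop.
   The final value is written to *result so the loop is not optimized away. */
double time_loop(double a, double b, double c, long n, double *result)
{
    clock_t start = clock();
    *result = run_loop(a, b, c, n);
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Calling time_loop(1.0, 0.9, 0.0, 10000000, &r) and time_loop(1.0, 0.9, 1.0, 10000000, &r) from main and printing both times, together with is_subnormal(r), should show whether the slow case is the one where a ends up subnormal. Build without options such as -ffast-math, since those may turn on flush-to-zero and hide the effect.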



 
