Archive: A question of speed.
your-dentist
6th July 2004 10:49 UTC
When I build superscopes with individual coords for each point, I mostly use bnot() to do this.
Example: superscope has three points.
Coords are: x1, x2, x3 and y1, y2, y3
In perpoint textfield:
cnt=i*(n-1);
x=bnot(cnt)*x1+bnot(cnt-1)*x2+bnot(cnt-2)*x3;
y=bnot(cnt)*y1+bnot(cnt-1)*y2+bnot(cnt-2)*y3;
So I recently read that multiplications are not good for this, because they're too slow.
Is multiplication by 1 or 0 as slow as, let's say, .693482*1.593782?
I mean 'whatever'*1 is 'whatever' and 'whatever'*0 is 0.
Is it faster to use IFs in this case?
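For readers without AVS at hand, the bnot() masking above can be modelled in Python. This is only a sketch: the tolerance EPS and the coordinate values are assumptions, not taken from AVS or from the post.

```python
EPS = 1e-5  # assumed tolerance for "is zero"; not taken from the AVS source


def bnot(v):
    """Model of AVS bnot(): 1.0 when v is (nearly) zero, else 0.0."""
    return 1.0 if abs(v) < EPS else 0.0


def select_bnot(i, n, coords):
    """Pick the coordinate for position i by summing masked terms.

    Every term is evaluated for every point, which is the cost
    the question is about.
    """
    cnt = i * (n - 1)
    return sum(bnot(cnt - k) * c for k, c in enumerate(coords))


xs = [-0.8, 0.1, 0.6]  # hypothetical x1, x2, x3
picked = [select_bnot(k / 2, 3, xs) for k in range(3)]
```

Each point costs n bnot() calls and n multiplications here, no matter which coordinate ends up selected.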
piR
6th July 2004 13:14 UTC
For only three points in a superscope, I don't think any optimization is really needed ;)
About multiplication: unless the compiler applies an optimization, a multiplication takes the same time whatever the values being multiplied.
For a real optimization, you would do better to use a series of if(), because AVS only executes the code it needs. In your example it always executes 3 bnot() and 3 multiplications; with if() it executes between 1 and 3 bnot() and nothing more.
Jaak
6th July 2004 16:24 UTC
why not use ifs?
i think they're the fastest. for less than... let's say 200 points you really don't have to worry whether you use ifs or multiplications
:p
just dont use divisions =)
your-dentist
6th July 2004 16:41 UTC
Is there no case differentiation in the multiplication algo, that makes multiplication with whole numbers faster?
I prefer the bnot() method, because it's easier to read the source.
Having tons of if(b,if(1-c,if(a,0,d),1),5); things sux.
I gave up using division too.
Though /3 is shorter than *.333. :)
Raz
6th July 2004 22:35 UTC
if you want easy to read just do this:
frame:
a=0;
point:
a=a+1;
x1=if(equal(a,1),point position on x axis,x1);
y1=if(equal(a,1),point position on y axis,y1);
x1=if(equal(a,2),point position on x axis,x1);
etc
x=x1;
y=y1;
NemoOrange
7th July 2004 20:45 UTC
I've been experimenting along these lines & I've been using i & if statements.
For your scope, you have 3 points, so n=3. The i "points" are located at 0, .5 and 1
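With this in mind, using an if-statement shortcut, you can skip the equal or above conditions & just use values of i: if() treats any nonzero first argument as true, so if(i-.5,...) only takes its last argument when i is .5.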
I'm having trouble explaining it, but in the end you'll come up with:
x=if(i,if(i-.5,x3,x2),x1);
y=if(i,if(i-.5,y3,y2),y1);
I've done larger statements like these where n=9:
x=if(i,if(i-1/8,if(i-2/8,if(i-3/8,if(i-4/8,if(i-5/8,if(i-6/8,if(i-7/8,x9,x8),x7),x6),x5),x4),x3),x2),x1)
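Writing these chains by hand gets error-prone as n grows. A small Python helper can generate the AVS expression instead. This is a sketch only; the function name nested_if is made up for illustration, and it writes each threshold as a decimal literal rather than a division.

```python
def nested_if(n, var="x"):
    """Build the nested if() chain for n points as an AVS expression string."""
    if n < 2:
        raise ValueError("need at least two points")
    expr = f"{var}{n}"  # innermost fallback: the last point
    for k in range(n - 2, -1, -1):
        threshold = k / (n - 1)  # position of point k+1 along i
        cond = "i" if k == 0 else f"i-{threshold!r}"
        # A zero condition selects the last argument, i.e. point k+1
        expr = f"if({cond},{expr},{var}{k + 1})"
    return expr


three_points = nested_if(3)
nine_points = nested_if(9, var="y")
```

For n=3 this produces the same shape as the hand-written line, with 0.5 in place of .5. Thresholds such as 1/3 cannot be written exactly as decimals, so for those values of n the chain relies on AVS treating a very small difference as zero, which is an assumption here.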
Raz
7th July 2004 21:09 UTC
Looks interesting nemo, but get rid of all those needless divisions. Makes it much slower.
1/8 = 0.125
2/8 = 0.25
3/8 = 0.375
etc
piR
8th July 2004 12:46 UTC
Originally posted by your-dentist
Is there no case differentiation in the multiplication algo, that makes multiplication with whole numbers faster?
By "whole numbers" I think you mean integer values (like 1 or 3, but not 1.5 or 3.05). Integer operations are faster than floating-point operations (though I'm not so sure that still holds on current processors), but these are two types of data stored differently in memory, and so they use different instructions.
Unfortunately, AVS only uses floating point numbers, so, 0, 1, 12546.4567862 are all the same ... ;)
NemoOrange
8th July 2004 16:31 UTC
Originally posted by Raz
Looks interesting nemo, but get rid of all those needless divisions. Makes it much slower.
1/8 = 0.125
2/8 = 0.25
3/8 = 0.375
etc
Of course, but I wanted to point out the relationship between the fractions and n.
Although, are you sure this is correct? I remember somebody mentioning how 1/3 is faster than .3333. Maybe I'm wrong.
your-dentist
8th July 2004 21:50 UTC
Another way, using megabuf().
I doubt that it's faster, because it needs to access the cache.
All coords are put into the megabuf.
Like this:
assign(megabuf(0),x1);assign(megabuf(1),x2);
assign(megabuf(0+yoffset),y1);assign(megabuf(1+yoffset),y2);
Then used like this:
Per point:
cnt=i*(n-1);
x=megabuf(cnt);
y=megabuf(cnt+yoffset);
The var 'yoffset' can be chosen freely, though it must be at least as large as the number of points, so that the x and y entries don't overlap.
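The lookup-table idea can be modelled in Python with a plain list standing in for megabuf(). This is a sketch under assumptions: the buffer size, Y_OFFSET and the coordinates are made-up values, and rounding the index to the nearest integer is a choice made here, not necessarily what AVS does internally.

```python
N_POINTS = 3
Y_OFFSET = 16  # hypothetical; any value >= N_POINTS keeps x and y slots apart

megabuf = [0.0] * 64  # stand-in for AVS's megabuf() storage

xs = [-0.8, 0.1, 0.6]  # hypothetical x1, x2, x3
ys = [0.3, -0.4, 0.7]  # hypothetical y1, y2, y3

# Fill the table once (the two assign() lines in the post)
for k in range(N_POINTS):
    megabuf[k] = xs[k]
    megabuf[k + Y_OFFSET] = ys[k]


def point(i, n=N_POINTS):
    """Per-point step: one index computation, then two table reads."""
    cnt = int(round(i * (n - 1)))
    return megabuf[cnt], megabuf[cnt + Y_OFFSET]


points = [point(k / (N_POINTS - 1)) for k in range(N_POINTS)]
```

The per-point work stays the same however many coordinates are stored.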
Jaak
9th July 2004 16:28 UTC
Originally posted by NemoOrange
Although, are you sure this is correct? I remember somebody mentioning how 1/3 is faster than .3333. Maybe I'm wrong.
test setup: a single blank ssc (not even an n=1 line), code in the frame part, window size at 256x256
loop(2000,
loop(2000,
assign(a,a/3)
)
);
8.8 FPS
loop(2000,
loop(2000,
assign(a,a*.333)
)
);
11.1 FPS
yes, it does make difference
[edit]
@yourdentist, that's not a new idea, and apparently it's slower too.
It's really useful in TomyLobo's triangle ape; if it gets released you'll see my examples and my way of using gmegabuf :p
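The two FPS figures from the benchmark can be turned into a rough per-operation cost. This Python sketch assumes the frame time is dominated by the loop and that the only difference between the two runs is the division versus the multiplication; both are assumptions, and the result only applies to the machine the test ran on.

```python
ITERATIONS = 2000 * 2000  # loop(2000, loop(2000, ...)) runs the body 4,000,000 times


def frame_ms(fps):
    """Time spent on one frame, in milliseconds."""
    return 1000.0 / fps


div_ms = frame_ms(8.8)   # a/3 version
mul_ms = frame_ms(11.1)  # a*.333 version

extra_ms_per_frame = div_ms - mul_ms
extra_ns_per_op = extra_ms_per_frame * 1e6 / ITERATIONS
```

Under those assumptions the division costs roughly 6 nanoseconds more per evaluation, so a scope with a few hundred points and one division each adds only microseconds per frame.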
dirkdeftly
9th July 2004 17:57 UTC
1/3 is more accurate than .3333, but not faster
*however*
unless you're going over a division 4,000,000 times, you're not going to lose 2.3 fps over it. and it would take running over a division 100,000 times to even lose .1 fps. it's really not that big of a deal, and in many cases using the division is a lot more convenient than the decimal.
so jaak, for the love of god, stop obsessing over divisions. there are more important optimisations.
your-dentist: avs is really quite shitty and not well optimised, and afaik x*1 and x*0 are no faster than x*1.23456789. the bottom line is, ifs are faster than mults, and the difference in speed is not quite as negligible as that between a division and a multiplication.
(so yeah, this is one of those important optimisations)
Jaak
10th July 2004 09:13 UTC
Originally posted by Atero
so jaak, for the love of god, stop obsessing over divisions. there are more important optimisations.
yes, it doesn't matter *that* much, but in quite heavy scopes it will matter, besides it always feels good to code optimized code :p
TomyLobo
13th July 2004 20:44 UTC
about the speed difference between using megabuf() and if/equal/bnot/whatever:
megabuf() is most probably faster in terms of memory access because it accesses memory only 4 times:
1. fetch the index (if it isnt already in the fpu from previous calculations) (prolly fld [e*x])
2. store the index to a temporary integer (fist [ebp+/-something])
3. put the index into a cpu register (mov e*x, [ebp+/-something])
4. get the value from the megabuf using that index and the base address of the megabuf (fld [e*x+e*x])
that would be a complexity of 4, which is a constant cost, while using if()s, equal()s or bnot()s takes an amount of time that grows linearly with the number of points, which would be slower if you have more than 3 or 4 or so...
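The constant-versus-linear argument can be checked by counting how many conditions a nested if() chain has to evaluate before it reaches a given point. This Python sketch only counts conditions; it does not measure time, and comparing that count directly with the 4 memory accesses above is a simplification.

```python
def chain_tests(k, n):
    """Conditions a nested if() chain evaluates to reach point k (0-based) of n.

    Point 0 needs one test, point 1 needs two, and so on; the last two
    points share the innermost test, hence the cap at n - 1.
    """
    return min(k + 1, n - 1)


def average_chain_tests(n):
    """Average number of conditions per point over a whole scope of n points."""
    return sum(chain_tests(k, n) for k in range(n)) / n


avg_3 = average_chain_tests(3)
avg_9 = average_chain_tests(9)
avg_200 = average_chain_tests(200)
```

The average grows with n (about n/2 for large scopes), while the table lookup stays at the same fixed number of steps.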
your-dentist
14th July 2004 15:42 UTC
Sounds plausible.
So for large numbers of points, using megabuf() is faster.
Whoa, you really seem to have become an expert at AVS code!
TomyLobo
17th July 2004 15:40 UTC
a little ASM knowledge helps ;)
jheriko
28th July 2004 20:01 UTC
btw, if you want to know whats fastest you should get the winamp sdk and look at the evallib source. the actual functions are in nseel-cfunc.c, the pro/epilogs are in the nseel-addfuncs.h file
its all pretty interesting but you need to be a real tech head to understand 90% of the rest of it cos its written in c and asm and in a code style almost totally unreadable to me (and therefore by logical extension everyone :P).