Misc / Comic - AVS Forums

PAK-9

20th February 2005 15:54 UTC

Misc / Comic
Some fun at the expense of my favorite kind of AVS'r

http://www.deviantart.com/view/15352657/

^..^

21st February 2005 12:34 UTC

:D nice comic...
funny but unfortunately a frequent occurrence...

S-uper_T-oast

21st February 2005 15:08 UTC

sines of factors of pi on init...
:D

PAK-9

21st February 2005 16:25 UTC

sines of factors of pi on init is a legitimate optimisation

^..^

21st February 2005 16:29 UTC

sure... but why should it be advantageous?
factors of pi, ok! but sines of factors of pi?

PAK-9

21st February 2005 16:36 UTC

because sine is an expensive operation. If you fill the megabuf with a load of sines of 0-$pi you can reference the megabuf rather than doing an actual sin(x), because sin(x)=megabuf(x*scaling factor) is much cheaper. The advantage really comes if you're doing sines in the point box. Mind you, the rotation matrices can be done in frame so the speed benefit there would be pretty small... good for any other trig stuff in point/pixel tho

^..^

21st February 2005 16:53 UTC

yeah, somehow i thought of rotation matrices... cause then it's really just a marginal + in speed.

let me just repeat that idea, to see if i understood it correctly: in init you store the y-values of a sin function into the megabuf using loop and assign and stuff. Then, when it comes to the point where you need the values, you don't calculate the sin of a variable, but get it directly from the corresponding entry in megabuf. Tricky, and that read-out method really provides a noticeable advantage compared to "sin(..)"?

PAK-9

21st February 2005 18:33 UTC

yea that's the idea, a sin() is a lot lot slower than an x=megabuf(y), just fill an ssc point box with 100 sines and compare it to 100 megabuf retrievals. The slower part is the conversion process, because say you have 100 megabuf entries of:

sin(($pi*2)/100)
sin(($pi*2)/100*2)
sin(($pi*2)/100*3)
etc..

you can have a counter t=(t+1)%100 which is nice and quick to reference, but if you specifically want sin(23) or something you need a calculation to get the megabuf reference...

i.e. make it fall between 0 and 2*pi

((23*100)%628)*0.01;

but convert it to a reference

floor((23*100)%628);

It's a multiply and divide (mod is a divide) as opposed to a sin() so it's still cheaper, but it's a bit more complicated, and of course the accuracy of the result is limited by the number of sines you calculated on init.

I think that maths is right, I'm not in front of AVS so I can't check.

UnConeD

21st February 2005 21:30 UTC

Look-up tables are known to be slower than the FPU in many, many cases. Megabuf access is slow as well, as it is stored in chunks of 16KB and allocated dynamically. Every access requires a check.

^..^

21st February 2005 22:43 UTC

i'm working on a preset to compare both kinds of calculation. But as i expected i got stuck, cause i haven't had enough time to learn the correct use of loop/megabuf/assign until recently.

However, it'd be nice if you could have a look: here

PAK-9

23rd February 2005 20:58 UTC

Originally posted by UnConeD
Look-up tables are known to be slower than the FPU in many, many cases. Megabuf access is slow as well, as it is stored in chunks of 16KB and allocated dynamically. Every access requires a check.

I still maintain that retrieving the item off the stack is quicker than a sine. I'll have to check evallib to see the exact method by which each is performed, but I would be very surprised to find otherwise.

UnConeD

27th February 2005 06:50 UTC

From megabuf.c in ns-eel:


static double * NSEEL_CGEN_CALL megabuf_(double ***blocks, double *which)
{
  static double error;
  int w=(int)(*which + 0.0001);
  int whichblock = w/MEGABUF_ITEMSPERBLOCK;

  if (!*blocks)
  {
    *blocks = (double **)GlobalAlloc(GPTR,sizeof(double *)*MEGABUF_BLOCKS);
  }
  if (!*blocks) return &error;

  if (w >= 0 && whichblock >= 0 && whichblock < MEGABUF_BLOCKS)
  {
    int whichentry = w%MEGABUF_ITEMSPERBLOCK;
    if (!(*blocks)[whichblock])
    {
      (*blocks)[whichblock]=(double *)GlobalAlloc(GPTR,sizeof(double)*MEGABUF_ITEMSPERBLOCK);
    }
    if ((*blocks)[whichblock])
      return &(*blocks)[whichblock][whichentry];
  }

  return &error;
}

static double * (NSEEL_CGEN_CALL *__megabuf)(double ***,double *) = &megabuf_;
__declspec ( naked ) void _asm_megabuf(void)
{
  double ***my_ctx;
  double *parm_a, *__nextBlock;
  __asm { mov edx, 0xFFFFFFFF }
  __asm { mov ebp, esp }
  __asm { sub esp, __LOCAL_SIZE }
  __asm { mov dword ptr my_ctx, edx }
  __asm { mov dword ptr parm_a, eax }
  
  __nextBlock = __megabuf(my_ctx,parm_a);

  __asm { mov eax, __nextBlock } // this is custom, returning pointer
  __asm { mov esp, ebp }
}
__declspec ( naked ) void _asm_megabuf_end(void) {}

If you can't figure it out (it's justin-style coding):

_asm_megabuf is the ASM code for the AVS function megabuf(x)

It calls __megabuf, which is in fact the same as megabuf_, just with a slightly different declaration:

static double * (NSEEL_CGEN_CALL *__megabuf)(double ***,double *) = &megabuf_;

And megabuf_ is the huge-ass function at the top.

Now, for sin, the asm code for the function is:


  __asm 
  {
    fld qword ptr [eax]
    fsin
    mov eax, esi
    fstp qword ptr [esi]
    add esi, 8
  }

Still convinced megabuf is faster?

^..^

27th February 2005 11:56 UTC

wow, the code behind megabuf is huge! It is beyond me to understand that code, though :eek: But is all this executed when avs uses megabuf or gmegabuf?
OK, i see that calculating a sin here is actually faster than retrieving data from the megabuf, though it seems strange when you look at it with pure logic.

btw: how did you look up the code for sin and megabuf? Can it be read from the vis_avs.dll?

PAK-9

27th February 2005 16:34 UTC

Just because the megabuf code is long and convoluted doesn't mean it's slower. The sine function looks all lovely and neat but it doesn't have any clever hacks; the bottom line is it calls an fsin which is 16-126 clock cycles, and we're double precision here so most of the time I reckon you're looking at the upper end of that figure.

The megabuf function is almost entirely mov instructions which are... you guessed it... a big fat 1 clock cycle each. In fact, depending on the alignment of code in memory, accessing data from a register can be LESS than 1 clock cycle. I think I also see a sub, that's 2 clock cycles, and an add, that's about 2. Those condition statements, all the ifs, are jne's, jnz's etc., which are mostly 3 clock cycles.

Also the compiler will optimise the ASM a bit for the megabuf code, but it can't optimise an fsin, it's just one instruction. I'd guess if you follow the code through a megabuf retrieval it's (very) roughly 40-50 clocks, and that's without any compiler optimisation, and not accounting for code alignment etc...

Some clock cycle reference
http://www.singlix.org/trdos/pentium.txt

TomyLobo

28th February 2005 15:27 UTC

fsin takes longer for higher numbers

double precision is just a means of storing floating point values in RAM. it doesn't influence fsin's speed at all.

when loading a double or a float, the FPU converts it to an 80 bit floating point value, which is then stored to one of the FPU's registers

and afaik msvc6 doesn't optimize asm

do you really think those dozens of pointer operations are faster than fsin?
not counting the stuff that is needed around the megabuf call

PAK-9

28th February 2005 17:09 UTC

Originally posted by TomyLobo
double precision is just a means of storing floating point values in RAM. it doesn't influence fsin's speed at all.

That's not strictly true, because for example a number that is extremely small might be rounded to zero in a single, which would make the sin operation faster.

Originally posted by TomyLobo
and afaik msvc6 doesn't optimize asm

...which means the evallib sin function will not be optimised by the compiler but the megabuf function will (because it's not all asm)

Originally posted by TomyLobo
do you really think those dozens of pointer operations are faster than fsin?

I already explained this: pointer operations are just movs and adds etc... really cheap operations, 1 or 2 clock cycles each.

_asm_megabuf is about 8 movs and a sub, and some quick stack work to call megabuf_, so it's really quick. megabuf_ looks big and ugly but if you break it down:


  static double error;
  int w=(int)(*which + 0.0001);
  int whichblock = w/MEGABUF_ITEMSPERBLOCK;

this is the setup for the function: a few movs, an add and a div (most expensive op in the function)


  if (!*blocks)
  {
    *blocks = (double **)GlobalAlloc(GPTR,sizeof(double *)*MEGABUF_BLOCKS);
  }
  if (!*blocks) return &error;

These conditions aren't met in a normal retrieval so it's just two condition checks, about 3 clocks each


  if (w >= 0 && whichblock >= 0 && whichblock < MEGABUF_BLOCKS)
  {
    int whichentry = w%MEGABUF_ITEMSPERBLOCK;

This condition gets entered: a few condition checks, movs and a mod (another div, damnit justin)


    if (!(*blocks)[whichblock])
    {
      (*blocks)[whichblock]=(double *)GlobalAlloc(GPTR,sizeof(double)*MEGABUF_ITEMSPERBLOCK);
    }

This condition is not met in a normal megabuf retrieval, so it's just a few clocks for the check


    if ((*blocks)[whichblock])
      return &(*blocks)[whichblock][whichentry];
  }

This is the actual memory retrieval: a few clocks for the condition and some stack work to return the value, another few clocks.


  return &error;

This doesn't get reached

you see it's really not that slow. Those divs are probably the slowest thing in it, but don't forget they are integer divisions, not floating point, so they aren't that slow.

TomyLobo

6th March 2005 22:15 UTC

megabuf involves lots of memory access, whereas fsin is just 2 of them (getting the value from [eax] i think and storing the result back to it)

UnConeD

6th March 2005 22:18 UTC

Pak, are you just being obtuse? Any AVS code involves doubles. That means that even the megabuf indexes are calculated in floating point and then rounded to ints (which is a significant speed hit as well). Then there's the huge megabuf lookup procedure, which translates to 100 lines of ASM code or so.

The floating point sine op has to take one value which is already in the cache and sine it. The algorithm to do this is implemented in hardware and fully pipelined. As Tomy said, the upper time limit for sine calculations is for really huge angles, which tend to be rare in AVS anyway, and which you don't even handle in your current code. Handling them would be even worse because you'd need a bunch of evallib code in AVS to do that and there is no easy floating point modulus function/instruction/operator in evallib.

The fact is: evallib is pretty darn close to ASM when it comes to floating point operations. That's what it is designed for. No amount of wishful thinking or self-righteousness is going to change that. So let's drop this topic while you still have your dignity, sweetheart. :rolleyes:

PAK-9

7th March 2005 14:01 UTC

Fine, I will drop this subject but after a last post (because I want the last word):

First of all, my point about the divisions is that they are integer divisions; in assembly they will be 'divs' not 'fdivs' because Justin has explicitly arranged it that way, for example

int w=(int)(*which + 0.0001);
int whichblock = w/MEGABUF_ITEMSPERBLOCK;

a division of two integer values. Yes, the values have to be rounded first and yes, that will slow it down, but the divides themselves are integer.

Secondly I already explained that more lines of code doesn't necessarily mean it's slower

Thirdly I already explained that memory access and moving is all just movs, which are incredibly quick

Fourth, if you actually bothered making some preset tests with loads of sines and loads of megabuf retrievals you'd see the megabuf retrievals come out faster. So bottom line, in real world tests megabuf is faster.

Fifth and finally, don't fucking condescend to me. I'm just voicing my opinion on a technical subject; I don't appreciate you trying to make me look stupid.

Okay, consider the topic finished, as we said at the start the realistic difference it makes in normal AVS use is negligible.

UnConeD

7th March 2005 15:48 UTC

Re divs: You're forgetting all the extra evallib code needed to actually /use/ the megabuf. For sin(), you just pass in the angle directly and don't need to do anything.

"Fourth, if you actually bothered making some preset tests with loads of sines and loads of megabuf retrievals you'd see the megabuf retrievals come out faster. So bottom line, in real world tests megabuf is faster."

So did you make actual test cases, or just a loop where you calculate a sine 10000 times? Caching will completely destroy any validity of such an artificial test. Furthermore, if you just do a loop in evallib code, you're not actually doing anything in between, like actually drawing SSC points or lines, which will cause the cache to get filled with image data instead.

PS: For someone making a comic, your sense of humor is pretty bad :rolleyes:

PAK-9

7th March 2005 18:18 UTC

"Re divs: You're forgetting all the extra evallib code needed to actually /use/ the megabuf. For sin(), you just pass in the angle directly and don't need to do anything."

if you mean the code the user types to get the right megabuf entry then yes, that is another overhead; I already mentioned that a few posts up.

I have done tests; the difference is insignificant but it is faster with megabuf. The tests I've done are just making presets of average complexity in my own style, and that's probably a harsher test than is necessary since 90% of AVS'rs don't make presets with as much code as I or you would.

I'd be lying if I said I regularly make sin lookup tables for presets, it's a massive fag for little result. The closest thing I use is lookups for my rotation matrices for high point/triangle count stuff (please god don't start an argument about that as well)

PS: Sorry if you think I have a bad sense of humor, I just don't think that little roll eyes smiley is a licence to insult someone under the guise of a bad joke

mysterious_w

7th March 2005 20:02 UTC

I make comics too! Mine are funnier!

http://img87.exs.cx/img87/2571/balleandshorty17jc.png

http://img101.exs.cx/img101/4459/balleandshorty38wx.png

MaTTFURY

7th March 2005 23:58 UTC

Maybe the next installment could feature a certain newbie spontaneously combusting due to flames.

hey im working on a new pack for later anyways...

UnConeD

8th March 2005 01:45 UTC

Oh well, back to the original topic I guess... maybe this'll cheer you up Pak-9 :P.
http://www.acko.net/dumpx/notfunny.png

MaTTFURY

9th March 2005 00:32 UTC

I really do see such irony in anything.
you need some flames :p

PAK-9

9th March 2005 12:17 UTC

since we are returning swiftly to the topic I made another misc/comic, but I'm hesitant to upload it because it involves Tugg graphically shooting himself in the face, and he might object.

Warrior of the Light

9th March 2005 15:42 UTC

.. perhaps you should mail him the image and ask for his approval? if he doesn't mind then you're free to upload it...

Tuggummi

13th March 2005 11:25 UTC

Why would i object?
Is it because i might mind about blowing my face up?
Is it because you would use graphics from my avatar or pictures?
Or is it that this whole post is just a clever piece of sarcasm involving late actions of mine and there really isn't any comic about me blowing my face up.

:rolleyes:

really PAK, i expected more from you

[Ishan]

14th March 2005 09:08 UTC

Yes PAK we all wanna see the comic :D

PAK-9

14th March 2005 13:50 UTC

Originally posted by Tuggummi
Why would i object?
Is it because i might mind about blowing my face up?
Is it because you would use graphics from my avatar or pictures?
Or is it that this whole post is just a clever piece of sarcasm involving late actions of mine and there really isn't any comic about me blowing my face up.

:rolleyes:

really PAK, i expected more from you

No, it's because of your sudden and violent mood swings that mean I can't really guess your reaction to things

I'll upload it next time i remember to bring it to uni

Tuggummi

14th March 2005 14:03 UTC

Oh... that thing.

But that's really quite easy to predict, you only need to know how much beer i've consumed before taking an action of approach.

PAK-9

14th March 2005 18:07 UTC

Shit I went home and forgot to put it on my usb drive

PAK-9

14th March 2005 18:14 UTC

whoops, It was on my usb drive all along

[edit]
I guess you can't image link bmp's

http://www.deviantart.com/view/16097090/
[/edit]

Tuggummi

14th March 2005 18:40 UTC

God damn it PAK! What the hell is wrong with you?! You're one sidd motherfudder! What's your problem? Why do you hate me so much?! All I ever wanted to be was your friend, and you treat me like this?!

My PC is in a TOWER case, not some sucky-ass 486-age horizontal shitbox!!!

PAK-9

14th March 2005 20:23 UTC

Don't hit me!

^..^

14th March 2005 20:35 UTC

guess pak has got an asshole day. He also offended me in the i-café. God knows what his problem with life is...

(yes, pak im really pissed off)

PAK-9

14th March 2005 20:50 UTC

Actually Tug was joking

But you're right, I have terrible and disturbing problems with my life; like sometimes I'm hungry but I can't be arsed to cook anything, so I think maybe I'll just get a takeaway from somewhere... but I'm too lazy to go out too! WHAT THE HELL SHOULD I DO!?!?!

Terrible, disturbing problems.

^..^

14th March 2005 20:52 UTC

perhaps starve to death?

Tuggummi

14th March 2005 22:31 UTC

Smoke, drink coffee (black, no sugar) or masturbate. All of these things take your mind off hunger. -Tug's tip for student life #203

MaTTFURY

14th March 2005 23:41 UTC

pak-9 has a desktop with the almighty quaking machine?!! :igor:

PAK-9

15th March 2005 14:11 UTC

Originally posted by Tuggummi
Smoke, drink coffee (black, no sugar) or masturbate. All of these things take your mind off hunger. -Tug's tip for student life #203

Coffee? yuck