Some of you might wonder: why disassemble executables?
Well, simply because our C/C++ compilers produce slow code.
Today I disassembled StarShipW3D, which was compiled by Alain Thellier with an old GCC version, 2.90.27...
Well, the generated code and the code in Warp3D are like two peas in a pod. I am almost certain that the whole of version 4.2 was compiled at the time with this version of GCC...
Now look at this edifying example from W3D_Permedia2.library: the Per2_SetState function. We must understand that our compilers are "mechanical", meaning that they convert C/C++ code without thinking!
Per2_SetState weighs 134 bytes:
Looking closely at the routine and understanding what it does, it is clear that we are dealing with bit tests. With a little thought, it is possible to group all these tests into a single one!
Whenever one of the tested bits in d0 is set to one, the routine does a "moveq #0,d0". For a better understanding, simply convert all these comparisons to binary:
- $00002000 = %0000000000000000 0010000000000000 (W3D_BLENDING)
- $00000400 = %0000000000000000 0000010000000000 (W3D_GOURAUD)
- $00000100 = %0000000000000000 0000000100000000 (W3D_TEXMAPPING)
- $00000010 = %0000000000000000 0000000000010000 (W3D_GLOBALTEXENV)
- $00000200 = %0000000000000000 0000001000000000 (W3D_PERSPECTIVE)
- $00000800 = %0000000000000000 0000100000000000 (W3D_ZBUFFER)
- $00001000 = %0000000000000000 0001000000000000 (W3D_ZBUFFERUPDATE)
- $02000000 = %0000001000000000 0000000000000000 (W3D_SCISSOR)
- $00080000 = %0000000000001000 0000000000000000 (W3D_DITHERING)
- $00004000 = %0000000000000000 0100000000000000 (W3D_FOGGING)
- $00400000 = %0000000001000000 0000000000000000 (W3D_ALPHATEST)
- $04000000 = %0000010000000000 0000000000000000 (W3D_CHROMATEST)
- $08000000 = %0000100000000000 0000000000000000 (W3D_CULLFACE)
Then you just need to gather all the bits to be tested into a single value, which gives:
- %0000111001001000 0111111100010000 = $0E487F10
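The gathering step can be checked with a few lines of C. This is only a sketch: the flag values are copied from the list above rather than taken from the Warp3D headers, and the names PER2_TESTED_MASK and per2_tested_mask are hypothetical, introduced here for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* State flags, with the values copied from the list in the article. */
#define W3D_GLOBALTEXENV   0x00000010u
#define W3D_TEXMAPPING     0x00000100u
#define W3D_PERSPECTIVE    0x00000200u
#define W3D_GOURAUD        0x00000400u
#define W3D_ZBUFFER        0x00000800u
#define W3D_ZBUFFERUPDATE  0x00001000u
#define W3D_BLENDING       0x00002000u
#define W3D_FOGGING        0x00004000u
#define W3D_DITHERING      0x00080000u
#define W3D_ALPHATEST      0x00400000u
#define W3D_SCISSOR        0x02000000u
#define W3D_CHROMATEST     0x04000000u
#define W3D_CULLFACE       0x08000000u

/* Hypothetical name: every tested flag OR-ed into one 32-bit value. */
#define PER2_TESTED_MASK \
    (W3D_GLOBALTEXENV | W3D_TEXMAPPING | W3D_PERSPECTIVE | W3D_GOURAUD | \
     W3D_ZBUFFER | W3D_ZBUFFERUPDATE | W3D_BLENDING | W3D_FOGGING |      \
     W3D_DITHERING | W3D_ALPHATEST | W3D_SCISSOR | W3D_CHROMATEST |      \
     W3D_CULLFACE)

/* Compile-time check against the value worked out by hand (C11). */
static_assert(PER2_TESTED_MASK == 0x0E487F10u, "mask should be $0E487F10");

/* Hypothetical helper so the mask can also be inspected at run time. */
uint32_t per2_tested_mask(void)
{
    return PER2_TESTED_MASK;
}
```

Since each flag occupies a distinct bit, OR-ing them is the same as adding them, and the compiler folds the whole expression into a single constant.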
To get rid of the last "moveq #0,d0", we must invert this value with a not.l:
- not.l $0E487F10 = $F1B780EF
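For readers more at home in C than in 68k assembly, the inversion and the grouped test might look like the sketch below. It is not the assembly routine itself: the names PER2_REJECT_MASK and per2_state_in_tested_set are hypothetical, and the 1/0 return convention is an assumption made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Combined mask of the thirteen tested flags, as computed in the article. */
#define PER2_TESTED_MASK  0x0E487F10u

/* Hypothetical name: the 32-bit complement, i.e. what not.l produces. */
#define PER2_REJECT_MASK  ((uint32_t)~PER2_TESTED_MASK)

/* Compile-time check against the value given in the article (C11). */
static_assert(PER2_REJECT_MASK == 0xF1B780EFu, "complement should be $F1B780EF");

/* Returns 1 when `state` has no bit outside the tested set, 0 otherwise.
 * One AND and one comparison replace the whole chain of tests.
 * The return convention is an assumption, not taken from the library. */
int per2_state_in_tested_set(uint32_t state)
{
    return (state & PER2_REJECT_MASK) == 0u;
}
```

Note that a single mask test like this also accepts zero and combinations of several flags, so it is equivalent to a chain of equality comparisons only if callers always pass exactly one flag.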
Here is a nicely optimised routine that takes fewer than 12 bytes instead of the original 134:
Of course, this is a rather special case, but it nevertheless gives a good idea of the human capacity to improve on what the C/C++ compiler robots do...
An Amiga coder told me that the 68k Macintosh CodeWarrior compiler produced quality code, which I have not been able to verify... Maybe it should be ported to our Amigas?
So, will Warp3D end up being faster overall?
We have to believe so; in any case, there is lots and lots of work to do!!
(translated by Squaley)