
Strange gcc "de-optimizations" happening on SH pla

Posted: Thu Jan 12, 2006 6:32 pm
by cboles
I don't know if anyone on here gets down and dirty with gcc, especially on embedded targets, but I figured it was worth trying...

I'm writing code for the SH2 processor family (SH7055 specifically) using a gcc-3.4.1 cross-compiler, and I'm running into a strange "optimization" bug with the compiler. If I AND an unsigned char memory-mapped peripheral register with a constant that has only one '1' bit, in this example 0x40, the compiler tries to "optimize" the test by right-shifting a copy of the register 6 places, masking it with 1, and then testing the result, which is a silly way to do things. By contrast, if the bit mask is instead 0x41, the resulting assembly code is less than half as large and faster. I get the same results regardless of the optimization options I set. Any ideas why this is happening or how I can turn this "optimization" off?


Code:
// real code which has the "optimization" problem

#define   SSR1   (*((volatile unsigned char *) 0xFFFFF00C))
if (SSR1 & (unsigned char)0x40) // test RDRF

 179 00ae 6030           mov.b   @r3,r0
 180 00b0 4009           shlr2   r0
 181 00b2 4009           shlr2   r0
 182 00b4 4009           shlr2   r0
 183 00b6 C901           and   #1,r0
 184 00b8 2008           tst   r0,r0
 185 00ba 8901           bt   .L20

Code:
// sample of how changing the constant to 0x41 prevents the "optimization"
// from being possible and makes for faster, smaller code

#define   SSR1   (*((volatile unsigned char *) 0xFFFFF00C))
if (SSR1 & (unsigned char)0x41) // sample mask with two '1' bits
 179 00ae 6030           mov.b   @r3,r0
 180 00b0 C841           tst   #65,r0
 181 00b2 8901           bt   .L20
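
For anyone who wants to see what the compiler did in C terms, here is a rough sketch. The first listing behaves like "shift bit 6 down to bit 0, then AND with 1" instead of "AND with 0x40". The function names are made up for illustration, and the register is passed in as a plain byte so this runs on a host machine rather than on the SH7055. The two forms always give the same answer, which is why the compiler is allowed to swap one for the other; it just picks the longer one here.

```c
#include <stdint.h>

/* Hypothetical helper names, for illustration only. */

/* The test as written in the source: mask the byte with 0x40. */
int rdrf_by_mask(uint8_t ssr)
{
    return (ssr & 0x40u) != 0;
}

/* The test as the first listing performs it: shift bit 6 down to
   bit 0 (three shlr2 instructions, 2 places each), then AND with 1. */
int rdrf_by_shift(uint8_t ssr)
{
    return ((ssr >> 6) & 1u) != 0;
}

/* Returns 1 if both forms agree for every possible byte value. */
int forms_agree(void)
{
    unsigned v;
    for (v = 0; v < 256; v++) {
        if (rdrf_by_mask((uint8_t)v) != rdrf_by_shift((uint8_t)v))
            return 0;
    }
    return 1;
}
```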

Posted: Thu Jan 12, 2006 8:00 pm
by cdvma
I think it's a case of "double-optimization". In this case, 0x40 has 6 insignificant bits below its single set bit, so the compiler tries to eliminate them by shifting. That way it reduces the comparison to 1 bit versus 7, or halves the number of 8-bit registers used. I would imagine the comparators built into the CPU handle 8 or 16 bits in parallel, so it shouldn't matter. Bit shifting and comparing to 1 is probably a legacy 8-bit optimization. Dunno if there are any options to force higher-bit operations only.
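
To put numbers on that, here is a small sketch with hypothetical helper names. It only describes the two masks from the listings above; it is not a claim about how gcc decides internally. 0x40 has exactly one set bit with 6 zero bits below it, which matches the three shlr2 instructions (2 places each). 0x41 has two set bits, so it cannot be reduced to a single-bit test.

```c
/* Hypothetical helpers, for illustration only. */

/* Returns 1 if exactly one bit is set in mask, otherwise 0. */
int is_single_bit(unsigned mask)
{
    return mask != 0u && (mask & (mask - 1u)) == 0u;
}

/* Counts the zero bits below the lowest set bit.
   Returns 0 for a zero mask to avoid looping forever. */
unsigned trailing_zero_bits(unsigned mask)
{
    unsigned n = 0;
    if (mask == 0u)
        return 0;
    while ((mask & 1u) == 0u) {
        mask >>= 1;
        n++;
    }
    return n;
}
```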

Good find tho.

Posted: Thu Jan 12, 2006 8:21 pm
by cboles
Right. If the mask's '1' bits were above the first 8 bits, I could see how this would be useful. I guess I shouldn't sweat it, but this is an extremely common type of operation to do on a microcontroller when you are checking peripheral status registers, whether it be an 8-, 16-, or 32-bit CPU. I always want to believe that compilers are good and can do a good job of writing assembly, but whenever I look, I find stuff like this...