openecu.org

Posted: **Thu Jan 12, 2006 6:32 pm**

I don't know if anyone on here gets down and dirty with gcc, especially on embedded targets, but I figured it was worth trying...

I'm writing code for the SH2 processor family (SH7055 specifically) using a gcc-3.4.1 cross-compiler, and I'm running into a strange "optimization" bug with the compiler. If I AND an unsigned char memory-mapped peripheral register with a constant that has only one '1' bit (0x40 in this example), the compiler tries to "optimize" the test by right-shifting a copy of the register by 6 bits and then masking the result with 1, which is a silly way to do things. By contrast, if the bit mask is 0x41 instead, the resulting assembly is less than half as large and faster. I get the same results regardless of the optimization options I set. Any ideas why this is happening, or how I can turn this "optimization" off?

Colby

Code:

    // real code which has the "optimization" problem
    #define SSR1 (*((volatile unsigned char *) 0xFFFFF00C))

    if (SSR1 & (unsigned char)0x40) // test RDRF

    00ae 6030    mov.b  @r3,r0
    00b0 4009    shlr2  r0
    00b2 4009    shlr2  r0
    00b4 4009    shlr2  r0
    00b6 C901    and    #1,r0
    00b8 2008    tst    r0,r0
    00ba 8901    bt     .L20

Code:

    // sample of how changing the constant to 0x41 prevents the "optimization"
    // from being possible and makes for faster, smaller code
    #define SSR1 (*((volatile unsigned char *) 0xFFFFF00C))

    if (SSR1 & (unsigned char)0x41) // sample mask with two '1' bits

    00ae 6030    mov.b  @r3,r0
    00b0 C841    tst    #65,r0
    00b2 8901    bt     .L20
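For anyone who wants to convince themselves that the two shapes compute the same thing, here is a minimal host-side sketch (not target code). The first function mirrors the direct mask test one would expect for 0x40, the second mirrors the shift-then-mask sequence in the first listing. The names `RDRF_MASK`, `rdrf_set_masked`, `rdrf_set_shifted` and `check_equivalence` are made up for illustration and do not come from the original code.

```c
#include <assert.h>
#include <stdint.h>

#define RDRF_MASK 0x40u /* bit 6, the single-bit mask from the post */

/* Direct test: AND the register value with the mask, compare against zero. */
int rdrf_set_masked(uint8_t ssr)
{
    return (ssr & RDRF_MASK) != 0;
}

/* Shift-then-mask: same shape as the emitted shlr2 x3 / and #1 sequence. */
int rdrf_set_shifted(uint8_t ssr)
{
    return (int)((ssr >> 6) & 1u);
}

/* Exhaustively confirm both forms agree for every 8-bit register value. */
void check_equivalence(void)
{
    for (unsigned v = 0; v <= 0xFFu; ++v) {
        assert(rdrf_set_masked((uint8_t)v) == rdrf_set_shifted((uint8_t)v));
    }
}
```

So the generated code is correct, just longer than it needs to be: the result is identical for all 256 possible register values.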

Posted: **Thu Jan 12, 2006 8:00 pm**

I think it's a case of "double-optimization". In this case, 0x40 has 6 "insignificant" low-order bits below the one set bit, so it tries to eliminate them. That way it reduces the comparison to 1 bit versus 7, or halves the number of 8-bit registers used. I would imagine the comparators built into the CPU handle 8 or 16 bits in parallel, so it shouldn't matter. Bit shifting and comparing to 1 is probably a legacy 8-bit optimization. Dunno if there are any options to force higher-bit operations only.
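To make the difference between the two masks concrete, here is a small sketch of the property that separates 0x40 from 0x41. A mask with exactly one bit set is a power of two, and the number of zero bits below that bit is the shift count seen in the listing. This is only an illustration of the arithmetic, presumably similar to what the compiler checks before rewriting the test, and not a description of gcc's actual internals. The function names `is_single_bit` and `trailing_zeros` are hypothetical.

```c
#include <assert.h>

/* Nonzero when exactly one bit of mask is set (mask is a power of two). */
int is_single_bit(unsigned mask)
{
    return mask != 0 && (mask & (mask - 1u)) == 0;
}

/* Number of zero bits below the lowest set bit; mask must be nonzero. */
int trailing_zeros(unsigned mask)
{
    int n = 0;
    assert(mask != 0);
    while ((mask & 1u) == 0) {
        mask >>= 1;
        ++n;
    }
    return n;
}
```

With these, `is_single_bit(0x40)` is true and `trailing_zeros(0x40)` is 6, matching the three `shlr2` instructions (2 bits each), while `is_single_bit(0x41)` is false, which is consistent with the rewrite not being applied to that mask.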

Good find tho.

Posted: **Thu Jan 12, 2006 8:21 pm**

Right. If the mask '1' bits were above the first 8 bits, I could see how this would be useful. I guess I shouldn't sweat it, but this is an extremely common type of operation on a microcontroller when you are checking peripheral status registers, whether it be an 8-, 16-, or 32-bit CPU. I always want to believe that compilers are good and can do a good job of writing assembly, but whenever I look, I find stuff like this...


Strange gcc "de-optimizations" happening on SH pla
