Strange gcc "de-optimizations" happening on SH pla
Posted: Thu Jan 12, 2006 6:32 pm
I don't know if anyone on here gets down and dirty with gcc, especially on embedded targets, but I figured it was worth trying...
I'm writing code for the SH2 processor family (SH7055 specifically) using a gcc-3.4.1 cross-compiler, and I'm running into a strange "optimization" bug. If I AND an unsigned char memory-mapped peripheral register with a constant that has only one '1' bit (0x40 in this example), the compiler tries to "optimize" the test by right-shifting a copy of the register by 6 bits, masking the result with 1, and then testing it against zero, which is a silly way to do things. By contrast, if the bit mask is 0x41, the resulting assembly code is less than half as large and faster. I get the same results regardless of the optimization options I set. Any ideas why this is happening, or how I can turn this "optimization" off?
Colby
Code:
// real code which has the "optimization" problem
#define SSR1 (*((volatile unsigned char *) 0xFFFFF00C))
if (SSR1 & (unsigned char)0x40) // test RDRF
179 00ae 6030 mov.b @r3,r0
180 00b0 4009 shlr2 r0
181 00b2 4009 shlr2 r0
182 00b4 4009 shlr2 r0
183 00b6 C901 and #1,r0
184 00b8 2008 tst r0,r0
185 00ba 8901 bt .L20
Code:
// sample of how changing the constant to 0x41 prevents the "optimization"
// from being possible and makes for faster, smaller code
#define SSR1 (*((volatile unsigned char *) 0xFFFFF00C))
if (SSR1 & (unsigned char)0x41) // sample mask with two '1' bits
179 00ae 6030 mov.b @r3,r0
180 00b0 C841 tst #65,r0
181 00b2 8901 bt .L20