Tag Archives: bit-band

Bit Banding in the STM32

Wondrous though the STM32 (ARM Cortex M3) might be, it makes something of a meal of atomic access to individual bits in memory. The technique used is called bit-banding. Although it is simple enough in concept and pretty friendly to the assembly language programmer, it is easy enough to get lost in C. Or should that be at C?

So why would you want atomic access to individual bits? Consider a single bit used as a flag in your program. Perhaps you have a data buffer and the interrupt service routine that looks after it sets a bit in a memory location to signal that it is full. A higher level function sees the bit set and does something about it, then resets the bit. the problem is that this is just one bit in a whole memory word of 32 bits and, to modify it you need to read the word, change the bit and write back the word. What happens if some other interrupt function changes one of the other bits in that word after you read it but before you wrote back the modified version? when you write back your version, it will put all the bits back to how they were before you started thus destroying a piece of information.

You can generally solve this problem by using an entire variable for each such flag. At best this will use up a whole byte for each flag and so wastes memory. That may not be a problem for you and if that is the case then that will work out just fine. Things are not so easy if you wanted to keep all these flags neatly together in a single location as a status word that might get sent to a host or recorded in a log.

This kind of thing happens all the time in the peripheral control registers. Take the USART for example. The CTS flag goes high and an interrupt handler as part of its job, wants to reset the flag. Don’t ask me why, it just does. Meantime, you just received a byte and the RXNE flag is set to indicate that there is data waiting. But the CTS handler is in the middle of a read-modify-write cycle. It has read the status register and is in the process of clearing the CTS flag. When it writes the result back, the RXNE flag will be cleared and the arrival of the character could go un-noticed.

OK, so I just invented all that. the point is, the peripheral registers may need care in setting and clearing bits and the safest way to do that in both cases is through bit-banding. Why? because the changes are atomic. That is, they happen to single bits in on cycle and cannot be interrupted. Thus, there will never be an occasion where you read a bit which can be modified by some other code before you get to write it back out.

Right, so, how is it done? 8051 programmers had a rich set of bit set/reset instruction that could do all this very neatly. then again, the 8051 had a tiny address space and a suitably simple architecture.

On the STM32, some magic is worked internally so that each bit in a pre-defined memory range can be addressed as another location in a kind of virtual address space somewhere else. So, for example, the 32 bit value stored at address 0×20000000 also appears as 32 sequential memory locations starting at 0×22000000. There are two regions of memory that have bit-band alias regions. First there is a 1Mbyte SRAM region from 0×20000000 – 0×20100000 where each bit is aliases by a byte address in the range 0×22000000 – 0x23FFFFFF. Then there is the peripheral space from 0×40000000 – 0×40100000 which is aliased in the same way to the range 0×42000000 – 0x43FFFFFF.

Using this scheme, a read or write to memory location 0×22000000 is the same as a read or write to the least significant bit of SRAM location 0×20000000. I have no intention of going through the whole thing.

If you want to find out more about this and many other dark STM32 secrets, read the excellent book by Joseph Yiu – The Definitive Guide to the Arm Cortex – M3

The Peripheral Library, among other sources, provides us C programmers with macros to do the address translation. They look like this for the SRAM memory space:


#define RAM_BASE 0x20000000
#define RAM_BB_BASE 0x22000000
#define Var_ResetBit_BB(VarAddr, BitNumber)
(*(vu32 *) (RAM_BB_BASE | ((VarAddr - RAM_BASE) << 5) | ((BitNumber) << 2)) = 0)
#define Var_SetBit_BB(VarAddr, BitNumber)
(*(vu32 *) (RAM_BB_BASE | ((VarAddr - RAM_BASE) << 5) | ((BitNumber) << 2)) = 1)
#define Var_GetBit_BB(VarAddr, BitNumber)
(*(vu32 *) (RAM_BB_BASE | ((VarAddr - RAM_BASE) << 5) | ((BitNumber) << 2)))

These are all well and good but not too intuitive to use. Even if you understand what they do. Rather than mess with these, I define a couple of additional versions that look like this:


#define varSetBit(var,bit) (Var_SetBit_BB((u32)&var,bit))
#define varGetBit(var,bit) (Var_GetBit_BB((u32)&var,bit))
#define varGetBit(var,bit) (Var_GetBit_BB((u32)&var,bit))

Using these macros is quite simple. The following are all legitimate ways to use them:


uint32_t flags;
uint32_t status;
varSetBit(flags,1);
varSetBit(flags,READY_BIT);
varClrBit(flags,3);
ready = varGetBit(flags,READY_BIT);

It is interesting to note that the varGetBit macro is an LValue so that it can be used in an assignment like this:


varGetBit(flags,4) = y;
varGetBit(flags,ARRIVED) = varGetBit(status,READY);

These methods are not primarily about speed but convenience. The compiler cannot know where the variables will be stored when the code is generated so you will see some of the calculations done at run time by your program. However, if you use the pointer method in the code fragment above, the calculation of the alias address is done only once and you will be able to get to the bit variables quite efficiently after that. To access bits in the peripheral registers, you use the exact same technique but with different base registers. There is, of course, no reason why you could not define a macro that refers to a specific bit in a peripheral register at a known address. Then you get the addresses completely pre-calculated by the compiler and the most efficient code since the peripheral register addresses are known at compile time. Continue reading

Posted in STM32 | Tagged | 12 Comments