So far, we have created two display implementations for Mecrisp Forth: a 128x64 OLED display, connected via (overclocked) I2C, and a 320x240 colour LCD, connected via hardware SPI clocked to 9 MHz. While quite usable, these displays are not terribly snappy:
- The OLED display driver uses a 1 KB RAM buffer, which it sends in full to the OLED whenever “display” is called. This currently takes about 60 milliseconds.
- The TFT display uses a much faster connection, but it also has to handle a lot more data: 320x240 pixels at 16 bits per pixel is 150 KB. Changes are written directly into the display controller, but this means that it now takes over 1.4 seconds to clear the entire screen!
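The two data sizes are easy to check at the Forth prompt. This is only a quick sketch, and the word names `oled-bytes` and `tft-bytes` are made up for this illustration, they are not part of either driver:

```forth
\ Hypothetical helper words, for checking the arithmetic only.
: oled-bytes ( -- n ) 128 64 * 8 / ;   \ 1 bit per pixel
: tft-bytes  ( -- n ) 320 240 * 2 * ;  \ 16 bits = 2 bytes per pixel

oled-bytes .         \ 1024
tft-bytes .          \ 153600
tft-bytes 1024 / .   \ 150
```

So each full TFT update moves 150 times as much data as a full OLED update.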
Fortunately, there are much faster options available, even on low-end STM32F103 chips. They are based on STM’s Flexible Static Memory Controller (FSMC), a hardware peripheral which can map various types of external memory into the ARM’s address space. This requires a lot of pins, because such interfaces to external memory are either 8 or 16 bits wide.
But the results can be quite impressive. To access an LCD controller connected in this way, you can now simply write to specific memory addresses in code.
Let’s try it out, using the Hy-MiniSTM32V board from Haoyu. It has an STM32F103VC µC on board, i.e. 100 pins, 256K flash, 48K RAM. Still not enough to keep a complete display copy in RAM, but as you’ll see, this no longer matters. The implementation is available on GitHub.
The code is just under 100 lines, a bit lengthy for inclusion in this article. Some of the highlights:
: tft-pins ( -- )
  8 bit RCC-AHBENR bis!     \ enable FSMC clock
  OMODE-AF-PP OMODE-FAST +  \ alternate function, push-pull, fast output
  dup PE7  io-mode!  dup PE8  io-mode!  dup PE9  io-mode!  dup PE10 io-mode!
  dup PE11 io-mode!  dup PE12 io-mode!  dup PE13 io-mode!  dup PE14 io-mode!
  dup PE15 io-mode!  dup PD0  io-mode!  dup PD1  io-mode!  dup PD4  io-mode!
  dup PD5  io-mode!  dup PD7  io-mode!  dup PD8  io-mode!  dup PD9  io-mode!
  dup PD10 io-mode!  dup PD11 io-mode!  dup PD14 io-mode!  dup PD15 io-mode!
  drop ;
As mentioned, we need to set up a lot of GPIO pins for this (twenty in the code above), and of course they have to match the actual connections on this particular board.
Next, we need to set up three registers in the FSMC hardware (the last write enables the FSMC):
: tft-fsmc ( -- )
  [...] FSMC-BCR1 !
  [...] FSMC-BTR1 !
  [...] FSMC-BWTR1 !
  1 FSMC-BCR1 bis! ;  \ set the bank enable bit
For full details, see GitHub and the 1,100-page STM32F103 Reference Manual (RM0008).
So much for the FSMC. We also need to initialise this particular “R61505U” LCD controller on our board, which requires sending it just the right magic mix of config settings on startup:
create tft:R61505U
hex
  E5 h, 8000 h, 00 h, 0001 h, 2B h, 0010 h, 01 h, 0100 h, [...]
decimal align

: tft-init ( -- )
  tft-pins tft-fsmc
  tft:R61505U begin
    dup h@ dup $200 < while  ( addr reg )      \ a reg of $200 or more ends the table
    over 2+ h@ swap          ( addr val reg )
    dup $100 = if drop ms else tft! then       \ reg $100 means: wait val milliseconds
  4 + repeat 2drop ;
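The table format is easiest to see with a dry run. The sketch below is not part of the driver: `demo-table` holds made-up values and `tft-dump` is a hypothetical word which walks a table exactly like `tft-init` does, but prints each step instead of sending it to the LCD. It assumes Mecrisp's `h,`, `h@`, and `hex.` words.

```forth
create demo-table    \ made-up values, for illustration only
hex
  00 h, 0001 h,      \ write $0001 to register $00
  100 h, 32 h,       \ pseudo-register $100: pause $32 = 50 ms
  200 h,             \ anything from $200 upwards ends the table
decimal align

: tft-dump ( addr -- )  \ print what tft-init would do with this table
  begin
    dup h@ dup $200 < while  ( addr reg )
    over 2+ h@ swap          ( addr val reg )
    dup $100 = if
      drop ." delay " . ." ms"
    else
      ." reg " hex. ." = " hex.
    then cr
  4 + repeat 2drop ;

demo-table tft-dump
```

Each entry takes two 16-bit values, which is why the loop steps through the table in units of 4 bytes.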
And that’s about it. But here is the interesting bit with respect to the FSMC:
: tft! ( val reg -- ) LCD-REG h! LCD-RAM h! ;
That little definition is our sole interface to the LCD, and it just writes two values to two different memory addresses, now mapped by the FSMC.
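The two addresses themselves are not shown here. As a rough sketch of where they could come from: FSMC bank 1 starts at $60000000, and on a 16-bit bus each FSMC address line An follows bit n+1 of the byte address. Assuming the LCD's register-select line is wired to A16 (PD11 in the pin list above, but this wiring is an assumption), the second address works out as below. The word `fsmc-addr` and both constant names are hypothetical, and the GitHub code may well define its constants differently.

```forth
\ Hypothetical helper, not part of the driver.
\ On a 16-bit FSMC bus, address line An follows byte-address bit n+1.
: fsmc-addr ( line -- addr ) 1+ 1 swap lshift $60000000 or ;

$60000000    constant LCD-REG-GUESS  \ select line low: pick a controller register
16 fsmc-addr constant LCD-RAM-GUESS  \ select line high: transfer its value

LCD-RAM-GUESS hex.   \ value is $60020000
```

The point is that the FSMC turns one address bit into the LCD's register/data select signal, so an ordinary `h!` to either address is all it takes.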
This same approach can probably be used with a huge variety of LCD displays out there, as long as they are connected via a parallel bus and the µC has support for FSMC. You “just” need to connect the LCD properly, set up all the GPIO pins and the FSMC to match (including proper read/write timing), and initialise the LCD controller with its matching power-up sequence.
The rest is mostly boilerplate to provide the 3 definitions needed by the display-independent graphics.fs library from Mecrisp:
$0000 variable tft-bg
$FFFF variable tft-fg

: clear ( -- )
  0 $20 tft! 0 $21 tft! $22 LCD-REG h!  \ start at address 0,0 and select pixel data
  tft-bg @ 320 240 * 0 do dup LCD-RAM h! loop drop ;

: putpixel ( x y -- )  \ set a pixel in display memory
  $21 tft! $20 tft! tft-fg @ $22 tft! ;

: display ( -- ) ;  \ update tft from display memory (ignored)
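Both colour variables hold a 16-bit pixel value. Assuming the controller uses the common 5-6-5 RGB layout (an assumption: the panel could be wired as BGR, or the init table could select another format), a hypothetical `rgb565` word could build such values from 8-bit components:

```forth
\ Hypothetical helper, assuming a 5-6-5 RGB pixel layout.
: rgb565 ( r g b -- colour )     \ each component is 0..255
  3 rshift                       \ blue: keep the top 5 bits
  swap 2 rshift 5 lshift or      \ green: top 6 bits, shifted to bits 5..10
  swap 3 rshift 11 lshift or ;   \ red: top 5 bits, shifted to bits 11..15

255 255 255 rgb565 hex.   \ white, same value as the $FFFF default foreground
```

With this in place, something like `255 0 0 rgb565 tft-fg !` would switch the pen to red for subsequent `putpixel` calls.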
And here’s the result of running all this code with the Mecrisp graphics demo:
(with apologies for the low image quality of this snapshot)
So now we’re back to displaying stuff on the screen, just like the previous two display implementations. But with the above FSMC-based code, a clear screen takes just 15 ms!
As you can see, the “clear” word above simply brute-forces its way through, by setting each screen pixel in a big loop. That’s 5,000 16-bit writes per millisecond, i.e. 200 ns cycle time.
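The relation between cycle time and fill time can be checked with a little arithmetic. This is only a sketch, and `tft-pixels` and `fill-us` are hypothetical words, not part of the driver:

```forth
\ Hypothetical helper words, for checking the arithmetic only.
: tft-pixels ( -- n ) 320 240 * ;    \ one 16-bit write per pixel
: fill-us ( ns-per-write -- us ) tft-pixels * 1000 / ;

tft-pixels .     \ 76800
200 fill-us .    \ 15360
```

At 200 ns per write, the 76,800 writes needed for a full screen add up to a little over 15,000 µs.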
Which goes to show that performance is the result of optimising (only) the right things!