After some investigation, I discovered that writing to the framebuffer 16 bytes at a time sets all the pixels correctly. This holds for both memset() and memcpy(), that is, both when setting a block to one colour and when copying from a backup copy of the display to the framebuffer as required. I assume that either the C library implementations of these functions do something different for 16-byte blocks, or the kernel video driver behaves differently; I don't know which. If anyone has an explanation, please let me know. As far as I can tell, you have to update the framebuffer 16 bytes at a time or you will not get the results you want.
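As a rough sketch of what updating 16 bytes at a time looks like in practice, the two helpers below wrap memset() and memcpy() so that each call touches exactly 16 bytes. The names fb_fill_chunked, fb_copy_chunked and FB_CHUNK are made up for this illustration, and fb is assumed to be a pointer to the already-mapped framebuffer (an ordinary byte array works for testing). How a trailing block shorter than 16 bytes behaves on the real hardware is not something I have established, so the tail handling here is only a guess.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Chunk size that was observed to work; hypothetical name. */
#define FB_CHUNK 16

/* Fill len bytes at fb with value, one 16-byte memset() at a time. */
void fb_fill_chunked(uint8_t *fb, int value, size_t len)
{
    size_t off = 0;

    assert(fb != NULL);
    for (; off + FB_CHUNK <= len; off += FB_CHUNK)
        memset(fb + off, value, FB_CHUNK);

    /* Assumption: any tail shorter than 16 bytes is written as-is. */
    if (off < len)
        memset(fb + off, value, len - off);
}

/* Copy len bytes from a backup buffer to fb, one 16-byte memcpy() at a time.
   The two buffers must not overlap. */
void fb_copy_chunked(uint8_t *fb, const uint8_t *backup, size_t len)
{
    size_t off = 0;

    assert(fb != NULL && backup != NULL);
    for (; off + FB_CHUNK <= len; off += FB_CHUNK)
        memcpy(fb + off, backup + off, FB_CHUNK);

    /* Assumption: any tail shorter than 16 bytes is copied as-is. */
    if (off < len)
        memcpy(fb + off, backup + off, len - off);
}
```

To restore the whole screen from the backup copy, you would call fb_copy_chunked(fb, backup, size), where size is the framebuffer size in bytes. If the size is a multiple of 16, the tail branch never runs.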