Design choices
This guide serves as a loose explanation of some design choices made during the development of BlocksDS.
A supplementary good source for understanding its design choices is the BlocksDS issue tracker, where many were and continue to be openly discussed.
Existing builds of GCC used for embedded development come with various, often not fully matching C/C++ standard libraries. The compatibility issues and added support headache make relying on user-provided GCC a non-viable option.
Conversely, requiring the user to build GCC is a long and not-too-easy process that doesn’t give big advantages over providing binary builds. It would make setting up the SDK a more fragile and time-consuming process.
LLVM/Clang has also been evaluated, but as of writing, LLVM 18 appears to generate worse-performing machine code on NDS than GCC 13. This has not been extensively tested, however.
Therefore, in order to take advantage of per-CPU optimizations and to have a robust toolchain, it has been decided to standardize on a custom binary build of GCC. For this purpose, BlocksDS relies on a courtesy build provided by the Wonderful toolchain’s infrastructure.
Often, SDKs come with pre-built libraries - users only need to build them manually if they want to modify them. It is the more convenient route for most, and the Wonderful toolchain’s infrastructure currently provides it for BlocksDS users.
However, part of the reason for creating this SDK was to show users how easy it actually is to modify any part of the code in your program. As such, one can easily build (make) and use (export BLOCKSDS) a custom variant of the BlocksDS SDK. This is a quick process (under a minute on most modern machines for a full rebuild) which allows hands-on tinkering with it.
The exceptions are picolibc (standard C library) and libstdc++ (standard C++ library), which are fairly difficult to build reliably. They are, as such, provided by the toolchain.
The C library picolibc is used in BlocksDS instead of newlib. The reasons are:
- Clearer licensing: The maintainer of picolibc has ensured that all of the code in the library has consistent BSD-style licensing. Even though non-BSD code was not linked in a typical embedded build, it provides additional peace of mind.
- Standard thread-local storage: newlib utilizes a custom re-entrancy structure for those C library functions which have thread-dependent state. Conversely, picolibc uses standard thread-local storage mechanisms provided by the compiler.
- Additional optimizations: picolibc modifies many aspects of newlib which help reduce code size and thus memory usage. In particular, tinystdio is a much leaner implementation of POSIX-compatible standard I/O which omits features not typically necessary in an embedded environment.
- Compatibility: picolibc is a fork of newlib, and so is largely compatible with it at the user application level.
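To make the thread-local storage point concrete, here is a minimal sketch contrasting the two styles. It is not code from newlib or picolibc: every name in it (sketch_reent, sketch_parse_digit_r, sketch_parse_digit, sketch_last_error) is hypothetical, and the "error state" stands in for whatever per-thread state a real C library function keeps.

```c
/* Style 1 (newlib-like, heavily simplified): per-thread state lives in an
 * explicit structure that the caller has to pass to every function.
 * All names here are hypothetical. */
struct sketch_reent {
    int last_error;
};

int sketch_parse_digit_r(struct sketch_reent *r, char c)
{
    if (c < '0' || c > '9') {
        r->last_error = 1;
        return -1;
    }
    r->last_error = 0;
    return c - '0';
}

/* Style 2 (picolibc-like): the compiler's own thread-local storage keeps
 * one copy of the variable per thread, so no extra argument is needed. */
_Thread_local int sketch_last_error;

int sketch_parse_digit(char c)
{
    if (c < '0' || c > '9') {
        sketch_last_error = 1;
        return -1;
    }
    sketch_last_error = 0;
    return c - '0';
}
```

In the second style the function signature stays the same as in a single-threaded program, which is part of why it is easier to integrate with a threading system.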
The main disadvantage I’ve seen is that picolibc’s documentation on porting it to a new system is worse than newlib’s.
Other C library alternatives with more convenient licensing have been evaluated, such as PDCLib, but as of writing (May 2024) have been deemed insufficiently complete for production use in the context of BlocksDS.
picolibc only provides the generic functionality of the standard C library. For example, it provides working versions of memset() and strlen(). However, it can’t access any OS services, so functions like malloc() or fopen() don’t work right away: they need to be ported to the platform.
libnds is the library that has the drivers to access the SD card, that provides argc and argv to main(), and that knows where to locate the heap memory used by malloc().
For example, for malloc() to work, picolibc expects the port to provide a function called sbrk(). This function needs to get information from libnds to work. The glue code between picolibc and libnds is in libnds, in source/arm9/libc.
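As a rough illustration of what such a port function does, here is a minimal sketch of an sbrk()-style allocator over a fixed region. It is not the libnds implementation: the name sketch_sbrk, the 4 KiB array standing in for the heap, and the bounds policy are all assumptions made for the example. On real hardware the region would come from information that libnds has about the memory layout.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the heap region that the platform provides. */
#define SKETCH_HEAP_SIZE 4096
static unsigned char sketch_heap[SKETCH_HEAP_SIZE];
static unsigned char *sketch_heap_cur = sketch_heap;

/* Moves the end of the heap by `increment` bytes and returns the previous
 * end, or (void *)-1 with errno set to ENOMEM if the request doesn't fit. */
void *sketch_sbrk(intptr_t increment)
{
    ptrdiff_t used = sketch_heap_cur - sketch_heap;
    ptrdiff_t available = SKETCH_HEAP_SIZE - used;

    if (increment > available || increment < -used) {
        errno = ENOMEM;
        return (void *)-1;
    }

    unsigned char *previous = sketch_heap_cur;
    sketch_heap_cur += increment;
    return previous;
}
```

malloc() calls a function like this whenever it needs more memory from the system, which is why the C library can’t provide it on its own.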
The extent of this support is documented in this document.
The reason to keep this as a separate library, instead of adding it to picolibc as a target, is to make updating picolibc mostly independent from refactors in libnds.
This section will describe how the filesystem support has been implemented in libnds. Check this document if you’re interested in the C standard functions that are supported.
Filesystem support requires 3 things:
- Something that provides POSIX file I/O functions, like fopen(). This is done by picolibc.
- Something that reads and writes raw bytes from the SD card. This is done by libnds.
- Something that understands the raw bytes read from the SD card and interprets them as a FAT filesystem. This is done by a modified version of Elm’s FatFs library, which is included in libnds as a submodule.
Glue code is also needed between the 3 components. For example: picolibc provides fopen(), and expects the user to implement open(), which should work like the Linux system call open. open() must have code that calls functions in FatFs to do the right thing. In this case, open() translates its arguments to arguments that f_open() from FatFs can understand.
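The argument translation can be sketched as follows. This is not the libnds code: the function name is hypothetical, and the SK_FA_* constants are local stand-ins modelled on the FA_* mode flags of FatFs so that the example compiles without the FatFs headers. The mapping follows the POSIX-to-FatFs correspondence that the FatFs documentation gives for f_open(), simplified to the most common cases.

```c
#include <fcntl.h>

/* Local stand-ins modelled on the FatFs FA_* mode flags (assumption: a real
 * port would include ff.h and use the FatFs constants directly). */
enum {
    SK_FA_READ          = 0x01,
    SK_FA_WRITE         = 0x02,
    SK_FA_CREATE_NEW    = 0x04,
    SK_FA_CREATE_ALWAYS = 0x08,
    SK_FA_OPEN_ALWAYS   = 0x10,
    SK_FA_OPEN_APPEND   = 0x30
};

/* Translates open() flags to an f_open()-style mode byte.
 * Returns 0 if the access mode isn't recognized. */
unsigned sketch_open_flags_to_fatfs(int flags)
{
    unsigned mode;

    switch (flags & O_ACCMODE) {
    case O_RDONLY: mode = SK_FA_READ; break;
    case O_WRONLY: mode = SK_FA_WRITE; break;
    case O_RDWR:   mode = SK_FA_READ | SK_FA_WRITE; break;
    default:       return 0;
    }

    if (flags & O_CREAT) {
        if (flags & O_EXCL)
            mode |= SK_FA_CREATE_NEW;    /* fail if the file exists */
        else if (flags & O_TRUNC)
            mode |= SK_FA_CREATE_ALWAYS; /* create or overwrite */
        else if (flags & O_APPEND)
            mode |= SK_FA_OPEN_APPEND;   /* create or append */
        else
            mode |= SK_FA_OPEN_ALWAYS;   /* create or open as-is */
    }

    return mode;
}
```

For instance, the flags that fopen() uses for mode "w" (O_WRONLY | O_CREAT | O_TRUNC) become the equivalent of FA_WRITE | FA_CREATE_ALWAYS. Combinations not listed here, such as O_TRUNC without O_CREAT, need extra handling that this sketch leaves out.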
Internally, f_open() requires a function called disk_read(), which calls libnds functions to read raw bytes from the SD card. Reading raw bytes is complicated. If you’re running the code on a DSi, and you want to read from the internal SD card, you need one specific driver. If you are running the code from a DS slot 1 flashcart, for example, the instructions on how to read from the SD card are provided as a DLDI driver. f_open() must determine the location of the file (based on the filesystem prefix, fat: or sd:) and use DLDI driver functions or DSi SD driver functions accordingly.
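The prefix-based choice can be sketched like this. The enum and function names are hypothetical, and the real selection happens inside the FatFs glue rather than in a standalone helper; the sketch only shows the idea of mapping fat: to the DLDI driver and sd: to the DSi SD driver.

```c
#include <string.h>

/* Hypothetical identifiers for the two storage drivers described above. */
enum sketch_drive {
    SKETCH_DRIVE_UNKNOWN,
    SKETCH_DRIVE_DLDI,   /* "fat:" paths, flashcart SD via the DLDI driver */
    SKETCH_DRIVE_DSI_SD  /* "sd:" paths, internal SD card of the DSi */
};

/* Picks a driver by looking only at the filesystem prefix of the path. */
enum sketch_drive sketch_drive_for_path(const char *path)
{
    if (strncmp(path, "fat:", 4) == 0)
        return SKETCH_DRIVE_DLDI;
    if (strncmp(path, "sd:", 3) == 0)
        return SKETCH_DRIVE_DSI_SD;
    return SKETCH_DRIVE_UNKNOWN;
}
```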
When creating a game, you need to add a lot of assets such as graphics and music. Initially, most people just include them in their ARM9 binary, but this is a bad idea. ARM7 and ARM9 binaries are loaded into RAM, and there are only 4 MiB of memory available (actually a bit less, as some RAM is used for things like a hook to exit to the loader). The ARM9 binary is loaded into RAM in full, and your program also needs RAM to work. This means that, in most cases, you’re limited to binaries of 1 or 2 MiB, which isn’t enough for larger projects. There is the option to provide a folder with all your assets and tell your users to copy it to their SD card, but this is messy.
The solution is to append a filesystem to the .nds file. On the DS platform, a filesystem format called NitroROM is traditionally used. There is a library that can be used to access this filesystem, called libfilesystem (formerly known as Nitrofs). The problem is that this library’s licensing status is unclear. As such, instead of using this library, BlocksDS has a compatible reimplementation of NitroFS.
Accessing the filesystem itself is tricky.
Commercial games access their own data by issuing card read commands. These only work on emulators and official cartridges. Flashcarts typically rely on patching specific instruction sequences, which is not viable for homebrew projects. The solution, instead, is argv.
When it is initialized, NitroFS checks if argv[0] has been provided and if it can be opened. argv[0] is a path to the executable .nds file being run. For example, it may look like fat:/games/my-game.nds.
First, NitroFS will try to open the file using FatFs. If it can be opened, whenever fopen() is called with a path that starts with nitro:/, FatFs will read blocks from the file in argv[0] with fseek() and fread(). This route is typically used on cartridges and to read from SD cards.
If this fails, the Slot-2 cartridge memory area is checked. This is provided solely for compatibility with very legacy execution methods.
If this fails, Slot-1 card read commands will be attempted. This approach is provided for compatibility with emulators which do not emulate a full FAT storage device and the DLDI protocol required for homebrew filesystem access.
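The order of these three attempts can be summarized in a short sketch. It is only a model of the decision, not the BlocksDS code: the struct sketch_env flags stand in for the results of the real probes (opening argv[0] through FatFs, checking the Slot-2 memory area, issuing Slot-1 card reads), and all names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

enum sketch_source {
    SKETCH_SOURCE_NONE,
    SKETCH_SOURCE_ARGV0_FILE, /* read the .nds file named by argv[0] */
    SKETCH_SOURCE_SLOT2,      /* Slot-2 cartridge memory area */
    SKETCH_SOURCE_SLOT1_CARD  /* Slot-1 card read commands */
};

/* Hypothetical stand-ins for the outcome of each real probe. */
struct sketch_env {
    bool argv0_opens;     /* the file in argv[0] can be opened */
    bool slot2_has_rom;   /* a ROM is visible in Slot-2 memory */
    bool card_reads_work; /* card read commands return valid data */
};

/* Tries each access method in the order described above. */
enum sketch_source sketch_pick_source(const char *argv0,
                                      const struct sketch_env *env)
{
    if (argv0 != NULL && argv0[0] != '\0' && env->argv0_opens)
        return SKETCH_SOURCE_ARGV0_FILE;
    if (env->slot2_has_rom)
        return SKETCH_SOURCE_SLOT2;
    if (env->card_reads_work)
        return SKETCH_SOURCE_SLOT1_CARD;
    return SKETCH_SOURCE_NONE;
}
```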
This system makes it possible to use the bundled filesystem to read assets in a transparent manner. The developer doesn’t need to worry about the method of execution used for their homebrew; on the vast majority of platforms, it will be handled transparently.
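From the application side, this means assets can be read with plain standard C I/O. The helper below is a minimal sketch under some assumptions: its name is hypothetical, NitroFS is assumed to have been initialized successfully beforehand, and the path passed to it (for example something like nitro:/sprites.bin) is up to the project.

```c
#include <stdio.h>

/* Reads up to `capacity` bytes of the file at `path` into `buffer`.
 * Returns the number of bytes read, or -1 if the file can't be opened
 * or a read error occurs. */
long sketch_load_asset(const char *path, void *buffer, size_t capacity)
{
    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return -1;

    size_t got = fread(buffer, 1, capacity, f);
    int failed = ferror(f);
    fclose(f);

    return failed ? -1 : (long)got;
}
```

The same function works unchanged with fat: or sd: paths, since the prefix is the only thing that tells the filesystem layer where the data lives.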
The original libnds did not support any kind of multithreading. This made it impossible to fully utilize the CPUs of the NDS during blocking operations. For example, it wasn’t possible to offload file operations to the ARM7 while the ARM9 continued to execute code; the ARM9 had to be stalled until the operation was complete.
BlocksDS supports cooperative multithreading in the form of cothreads. By integrating it with libnds, it is possible for functions like fopen or fread to switch to a different thread while they are waiting for the SD card to finish reading a block.
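The benefit of cooperative scheduling can be illustrated with a toy model in plain C. It does not use the cothread API and doesn’t switch stacks: each "thread" is a hypothetical step function that does a bit of work and then returns to a round-robin loop, which is the point where a real cothread would yield. One task simulates waiting for SD card blocks while the other keeps running frames.

```c
#include <stdbool.h>
#include <stddef.h>

/* A task is a step function that returns true once it has finished. */
struct sketch_task {
    bool (*step)(void *state);
    void *state;
    bool done;
};

/* Round-robin loop: every unfinished task gets one step per pass. */
void sketch_run_all(struct sketch_task *tasks, size_t count)
{
    size_t remaining = count;
    while (remaining > 0) {
        for (size_t i = 0; i < count; i++) {
            if (tasks[i].done)
                continue;
            if (tasks[i].step(tasks[i].state)) {
                tasks[i].done = true;
                remaining--;
            }
        }
    }
}

struct sketch_io { int blocks_left; };
struct sketch_game { int frames; const struct sketch_io *io; };

/* Simulated file read: one block "arrives" per step. */
static bool sketch_io_step(void *state)
{
    struct sketch_io *io = state;
    if (io->blocks_left > 0)
        io->blocks_left--;
    return io->blocks_left <= 0;
}

/* Simulated game loop: runs one frame per step until the read is done. */
static bool sketch_game_step(void *state)
{
    struct sketch_game *game = state;
    game->frames++;
    return game->io->blocks_left <= 0;
}

/* Returns how many frames ran while `blocks` blocks were being read. */
int sketch_demo(int blocks)
{
    struct sketch_io io = { blocks };
    struct sketch_game game = { 0, &io };
    struct sketch_task tasks[] = {
        { sketch_io_step, &io, false },
        { sketch_game_step, &game, false },
    };
    sketch_run_all(tasks, 2);
    return game.frames;
}
```

Without the interleaving, the game task would not run at all until the whole read had finished, which is the stall described above.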
While accessing the internal SD card on the DSi is performed on the ARM7, DLDI drivers used by cartridges are traditionally executed on the ARM9. However, as they are executed synchronously, this stalls the CPU whenever a file operation is performed, prohibiting the use of multithreading.
As a workaround, BlocksDS supports moving DLDI execution to the ARM7 sub CPU; this should be compatible with many, but not all cartridges. Feel free to read this document for more information.