This project proposes to restructure and improve the existing Bela software code so that it is compatible with, and easier to port to, newer Texas Instruments Sitara processors (like the AM5729 in the BeagleBone AI).
As given on the official website, Bela is a hardware and software system for creating beautiful interaction with sensors and sound.
Bela has many analog and digital inputs and outputs for hooking up sensors and controlling other devices, and most importantly Bela has stereo audio I/O, allowing you to interact with the world of sound.
All Bela systems so far use the same Bela software. It uses a customized Debian distribution which, most notably, uses a Xenomai kernel instead of a stock kernel. Xenomai is a co-kernel for Linux which makes it possible to achieve hard real-time performance on Linux machines (ref: xenomai.org). It thus takes advantage of features of the BeagleBone computers and can achieve extremely low-latency audio and sensor processing times.
Although the proposal title mentions support for the BeagleBone AI, I have developed a standardized setup that allows an easy jump across all TI chips.
Bela and BB
Bela systems have used BeagleBoard computers from the very beginning. Bela uses the BeagleBone Black, and Bela Mini uses the PocketBeagle.
The BeagleBone Black, the PocketBeagle, and the BBAI all feature programmable real-time units, or PRUs, which are central to the way Bela works. These PRUs enable Bela's ultra-low latency processing: they are fast (200 MHz, 32-bit) processors with single-cycle I/O access to a number of the board's pins, as well as full access to the internal memory and peripherals.
Applications of Bela:
Bela is ideal for creating anything interactive that uses sensors and sound. So far, Bela has been used to create:
Why add support for BBAI/newer TI chips?
The BeagleBone Black was launched over 7 years ago, in 2013, and newer and better TI Sitara processors have been launched since. It would be better to have a more standardized setup that allows an easier jump across TI chips. Newer boards with different and more efficient chips, like the AM5X and the TI C66x digital-signal-processor (DSP) cores in the BBAI, are coming up and will need to be compatible with the Bela software and hardware.
Programming languages and tools to be used:
C, C++, PRU, dtb, GNU Make, ARM Assembly
The hardware was partially working on the BBAI using only ALSA (Advanced Linux Sound Architecture) and the SPI driver (refer). However, the Bela real-time code on ARM and PRU was not yet running on the BBAI.
This project involved pinmuxing (using overlays), PRU assembly, and C and C++ for Linux user-space applications. I also had to study the Technical Reference Manuals of the Sitara family of SoCs (AM5729 and AM335x).
What is RProc?
The remoteproc framework allows different platforms/architectures to control (power on, load firmware, power off) those remote processors while abstracting the hardware differences, so the entire driver doesn't need to be duplicated. In addition, this framework also adds rpmsg virtio devices for remote processors that support this kind of communication. This way, platform-specific remoteproc drivers only need to provide a few low-level handlers.
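From user space, the power on / load firmware / power off control described above is exposed by the Linux kernel as `firmware` and `state` attribute files under `/sys/class/remoteproc/remoteprocN/`. The following is a minimal sketch of driving those files from C++. The function names and the firmware file name used in the usage note are hypothetical, and this is not necessarily how the Bela code does it.

```cpp
#include <cassert>
#include <fstream>
#include <string>

// Write a single value to a sysfs-style attribute file; returns true on success.
bool writeAttribute(const std::string& path, const std::string& value)
{
    std::ofstream file(path);
    if(!file.is_open())
        return false;
    file << value;
    file.flush();
    return file.good();
}

// Ask the kernel to load a firmware image into a remote processor and boot it.
// instanceDir would be e.g. /sys/class/remoteproc/remoteproc0 on a real board;
// the kernel looks firmwareName up in its firmware search path.
bool bootRemoteProcessor(const std::string& instanceDir, const std::string& firmwareName)
{
    assert(!instanceDir.empty());
    return writeAttribute(instanceDir + "/firmware", firmwareName)
        && writeAttribute(instanceDir + "/state", "start");
}

// Power the remote processor off again.
bool stopRemoteProcessor(const std::string& instanceDir)
{
    assert(!instanceDir.empty());
    return writeAttribute(instanceDir + "/state", "stop");
}
```

On a board, a call such as `bootRemoteProcessor("/sys/class/remoteproc/remoteproc0", "example-fw.out")` would typically need root privileges, and which `remoteprocN` index corresponds to which PRU core depends on the kernel and device tree in use.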
What is a Device Tree Overlay?
Sometimes it is not convenient to describe an entire system with a single FDT (Flattened Device Tree). For example, processor modules that are plugged into one or more modules (a la the BeagleBone), or systems with an FPGA peripheral that is programmed after the system is booted. For these cases it is proposed to implement an overlay feature so that the initial device tree data can be modified by userspace at runtime by loading additional device tree overlays that amend the original data.
Pinmuxing
The following pin diagram from a MathWorks forum greatly helped to visualize and compare the pins on the BeagleBone Black versus the BeagleBone AI.
Also, in order to write a DTO (Device Tree Overlay) using the CCL (Cape Compatibility Layer), I referred to am572x-bone-common-univ.dtsi, which helps one understand the names of and references to the pinmuxes.
How is an overlay compiled?
dtc (Device Tree Compiler) converts between the human-editable device tree source "dts" format and the compact device tree blob "dtb" representation usable by the kernel or assembler source.
Once an overlay is compiled, it generates a .dtbo file which we can then use in the next stage to load the overlay.
To know more on how to compile and load the overlay, just head over to overlay-instructions.
Xenomai is a Free Software project in which engineers from a wide background collaborate to build a robust and resource-efficient real-time core for Linux© following the dual kernel approach, for applications with stringent latency requirements.
Xenomai kernel (v4.19.94-ti-xenomai-r64) has been built and tested on the BBAI with a few minor bugs. I installed the Xenomai kernel through the default procedure to update the kernel and libraries, which I have documented here. I have successfully built the entire Bela core code without needing to modify any Xenomai-dependent code.
Hardware required: The hardware listed below was used for testing if my code implementation works correctly.
The places within the Bela core code that required intervention are:
IS_AM572x flag that is set automatically to 1 on the BBAI and not set on the BBB. Updated the workflow to build the PRU code for remoteproc. Also implemented auto-detection of which processor the code is being compiled on, which is passed as a compile-time flag to the Bela PRU and core code.
New flags and their brief description:
| Flag | Values | Description |
|---|---|---|
| | 1 and 0 | Set as 1 on BBAI; set as 0 on BBB |
| | 1 and 0 | Tells; set 0 on BBAI, set 1 on BBB |
| | 1 and 0 | Tell; set 1 on BBAI, set 0 on BBB |
| | | Only gets set on BBAI |
| | | useful mainly with RProc for passing |
| | | useful mainly with RProc for passing |
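The automatic detection of the target processor could work, for instance, by inspecting the board's device-tree model string (readable from `/proc/device-tree/model` on a running board). The sketch below shows that idea only. The matched substrings, the `Soc` enum and both function names are assumptions made for illustration, and the real build system may detect the board differently.

```cpp
#include <string>

// SoC families relevant to this project.
enum class Soc { AM335x, AM572x, Unknown };

// Classify a board from its device-tree model string.
// The substrings below are assumptions about what the boards report,
// e.g. "BeagleBoard.org BeagleBone AI" or "TI AM335x BeagleBone Black".
Soc socFromModel(const std::string& model)
{
    if(model.find("BeagleBone AI") != std::string::npos)
        return Soc::AM572x;
    if(model.find("AM335x") != std::string::npos)
        return Soc::AM335x;
    return Soc::Unknown;
}

// Compile-time flag to hand to the compiler for the detected SoC:
// IS_AM572x is set to 1 on the BBAI and left unset elsewhere.
std::string compileFlagFor(Soc soc)
{
    return soc == Soc::AM572x ? "-DIS_AM572x=1" : "";
}
```

Keeping the classification separate from the flag generation means the same result can feed both the PRU build and the core build.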
Workflow to build pasm code
Although pasm is outdated, the binary it generates is still valid for the PRU. The issue is that it does not generate a file ready to be packaged up as a valid .out ELF which remoteproc would recognise and load as firmware into the PRU.
To solve this issue, we came up with the following workflow to build the PRU Firmware to be used by RProc:
pasm -V2 -L -c -b.
__asm__ directive (i.e.: add quotes and prepend a space at the beginning of each line):
prudis $< | sed 's/^\(.*\)$$/" \1\\n"/' > $(RPROC_INCLUDED_ASSEMBLY)
.c file mentioned above with the regular clpru toolchain.
clpru -fe $(RPROC_TMP_FILE).o $(RPROC_TEMPLATE) -v3 --endian=little --include_path=$(RPROC_BUILD_DIR) --include_path=$(RPROC_INCLUDE) --include_path=/usr/lib/ti/pru-software-support-package/include
lnkpru -o $(RPROC_TMP_FILE).out $(RPROC_TMP_FILE).o --stack_size=0x0 --heap_size=0x0 -m $(RPROC_TMP_FILE).map $(RPROC_CMD)
QBBx instructions in there, so in the
dd if=$< of=$(RPROC_TMP_FILE).out bs=1 obs=1 seek=52 conv=notrunc status=none
.bin that had been created by
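To make the quoting step of the workflow above concrete: the sed expression wraps every disassembled line in double quotes, prepends a space and appends a literal `\n`, so that the result can be pasted into an `__asm__` directive as a sequence of C string literals. The function below is a small C++ rendering of that same transformation, for illustration only. The function name is hypothetical and the actual build uses sed as shown.

```cpp
#include <sstream>
#include <string>

// Turn each input line X into the C string literal " X\n" (quotes included),
// one literal per output line, mirroring: sed 's/^\(.*\)$/" \1\\n"/'
std::string quoteForAsmDirective(const std::string& disassembly)
{
    std::istringstream in(disassembly);
    std::ostringstream out;
    std::string line;
    while(std::getline(in, line))
        out << "\" " << line << "\\n\"\n";
    return out.str();
}
```

Like the sed expression, this sketch does not escape quotes or backslashes inside a line, which is fine as long as the disassembly contains none.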
PruManager brings the RProc and the UIO PRUSS (using the libprussdrv API) implementations together under one roof.
Transitioning from libprussdrv to rproc:
I initially believed that I needed to change the initialization code in PRU.cpp, which currently relies on libprussdrv, and move to using rproc. I was not sure whether rproc provides a way to access the PRU's RAM the way prussdrv_map_prumem() used to, which essentially gives access to a previously mmap'ed area of memory.
On the latest Bela code there's a Mmap class which can make this somewhat simpler (ref. here).
For this transition, maintaining backward compatibility was also essential. This is what the PruManager class achieves. Below is a rough structure of the class:
class PruManager is an abstract base class.
void stop(); as the name suggests, stops the PRU.
int start(bool useMcaspIrq); The first start is called by the PRU.cpp code where required, where useMcaspIrq is a flag that decides which PRU code is to be used out of
int start(const std::string& path); The second start is called within the first one after the choice of PRU code is made. This function then loads the firmware file and starts the PRU.
void* getOwnMemory(); Each PRU has its own 8-KiB data RAM (designated RAM0 for PRU0 and RAM1 for PRU1) and a 32-KiB general-purpose RAM (designated RAM2) shared between PRU0 and PRU1. (ref. prucookbook)
void* getSharedMemory(); refer to the DATA RAM2 (shared) block in the diagram below from the AM572x Ref. Manual.
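The structure listed above can be sketched as a C++ interface. The member signatures follow the list. The `PruManagerFake` backend and the two firmware file names are hypothetical and exist only so that the interface can be exercised without a PRU. Treat it as an illustration of the design, not as the actual Bela implementation.

```cpp
#include <string>
#include <vector>

// Abstract interface, following the member list above.
class PruManager
{
public:
    virtual ~PruManager() = default;
    virtual int start(bool useMcaspIrq) = 0;         // choose the firmware, then start it
    virtual int start(const std::string& path) = 0;  // load the firmware at path and run it
    virtual void stop() = 0;
    virtual void* getOwnMemory() = 0;                // this PRU's own data RAM
    virtual void* getSharedMemory() = 0;             // RAM shared between PRU0 and PRU1
};

// Hypothetical backend that only keeps its state in ordinary memory.
class PruManagerFake : public PruManager
{
public:
    int start(bool useMcaspIrq) override
    {
        // Placeholder firmware names, not the real Bela files.
        // The explicit std::string keeps this call from resolving to start(bool).
        return start(std::string(useMcaspIrq ? "firmware_irq.out" : "firmware_polling.out"));
    }
    int start(const std::string& path) override
    {
        loadedPath = path;
        running = true;
        return 0;
    }
    void stop() override { running = false; }
    void* getOwnMemory() override { return ownRam.data(); }
    void* getSharedMemory() override { return sharedRam.data(); }

    std::string loadedPath;
    bool running = false;

private:
    std::vector<unsigned char> ownRam = std::vector<unsigned char>(8 * 1024);
    std::vector<unsigned char> sharedRam = std::vector<unsigned char>(32 * 1024);
};
```

One detail worth noting with this pair of overloads: passing a string literal directly, as in `start("x.out")`, selects `start(bool)`, because the pointer-to-bool conversion is preferred over constructing a `std::string`. Callers therefore need to pass a `std::string` explicitly.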
The classes below are children of the above PruManager abstract base class.
class PruManagerRprocMmap is responsible for the RProc implementation. The RProc Mmap class is named so because we are also using the Mmap.h header mentioned above to access /dev/mem on the BeagleBones to read or write to desired global memory locations.
prussAddresses, which is a vector of type <uint32_t> and is responsible for keeping the base addresses of the 2 PRU subsystems.
getOwnMemory() then accesses the DATA RAM using an object of
1 (shown in the PRU-ICSS block diagram) is
getSharedMemory function then accesses the shared RAM using an object of
class PruManagerUio is basically a ditto implementation of the libprussdrv approach that was being used earlier. This class is mainly useful to maintain backward compatibility with v4.14 on the BBB+Bela. It does not support the AM572x processor.
ENABLE_PRU_RPROC, which tells the core code to use the RProc implementation, or else ENABLE_PRU_UIO tells the core code to keep using the old
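As an illustration of how the RProc+Mmap variant can locate the PRU memories, the sketch below computes the physical addresses of a PRU's own data RAM and of the shared RAM from a vector of subsystem base addresses. The base addresses and offsets are the values I understand the AM572x to use and should be checked against the Technical Reference Manual. The constant and function names are hypothetical, and the actual mapping through /dev/mem is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Offsets of the data RAMs from the start of one PRU subsystem
// (assumed values, to be verified against the TRM).
constexpr uint32_t kDataRamOffset[2] = { 0x0000, 0x2000 }; // RAM0 (PRU0), RAM1 (PRU1)
constexpr uint32_t kSharedRamOffset = 0x10000;             // RAM2 (shared)

// Base addresses of the two PRU subsystems on the AM572x
// (assumed values, to be verified against the TRM).
const std::vector<uint32_t> prussAddresses = { 0x4B200000, 0x4B280000 };

// Physical address of the data RAM owned by core pruCore of subsystem pruss.
uint32_t ownMemoryAddress(std::size_t pruss, std::size_t pruCore)
{
    assert(pruss < prussAddresses.size() && pruCore < 2);
    return prussAddresses[pruss] + kDataRamOffset[pruCore];
}

// Physical address of the RAM shared by both cores of subsystem pruss.
uint32_t sharedMemoryAddress(std::size_t pruss)
{
    assert(pruss < prussAddresses.size());
    return prussAddresses[pruss] + kSharedRamOffset;
}
```

An address obtained this way is what would then be mapped into the process, for example through /dev/mem, before any pointer is handed back to the caller.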
In pru/pru_rtaudio.p, the hard-coded McASP, SPI and GPIO constants were replaced with board-dependent ones.
pru/board_specific.h uses the IS_AM572x flag to set the proper BASE constants depending on which board it is compiling on. The GPIO, CLOCK_SPI2, SPI2, and a few other base addresses needed changing for the AM572x. For more details visit my PR.
Other places, like bela_hw_settings.h and a few other files, also needed their base addresses updated to include the new AM572x constants. Those changes can also be viewed all at once here.
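The board-dependent constants described above follow a common pattern: one compile-time flag selects between two sets of base addresses, and the rest of the code only ever refers to the symbolic names. A minimal sketch of that pattern is shown below in C++. The names `GPIO1_BASE`, `BOARD_SOC` and `gpio1Register` are hypothetical, the two addresses are the GPIO1 bases as I understand them from the respective manuals, and the real header is written for the PRU assembler rather than for C++.

```cpp
#include <cstdint>

// IS_AM572x is normally passed in by the build system (e.g. -DIS_AM572x=1 on the BBAI).
#ifdef IS_AM572x
constexpr uint32_t GPIO1_BASE = 0x4AE10000; // AM572x, to be verified against the TRM
constexpr const char* BOARD_SOC = "AM572x";
#else
constexpr uint32_t GPIO1_BASE = 0x4804C000; // AM335x, to be verified against the TRM
constexpr const char* BOARD_SOC = "AM335x";
#endif

// Address of the GPIO1 register located at the given byte offset.
constexpr uint32_t gpio1Register(uint32_t offset)
{
    return GPIO1_BASE + offset;
}

// The symbolic name resolves to the right address for whichever board is selected.
static_assert(gpio1Register(0) == GPIO1_BASE, "offset 0 is the base itself");
```

Because the selection happens at compile time, code built for one board carries no runtime cost for supporting the other.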
Two device tree overlays were also created using the CCL, and a debugger for the PRU called PRUDebug was ported to the BBAI.
Created a device tree overlay using the Cape Compatibility Layer to port the BB-BONE-AUDI overlay to the BBAI.
The overlay I wrote has been accepted by BeagleBone maintainer Robert Nelson, and you can find it here.
Created a BBAI-BELA-00A1 device tree overlay which helps in setting the right pinmux for BELA.
Installed a Xenomai patched kernel and ran the full Bela stack.
beagleboard/BeagleBoard-DeviceTrees BBAI-AUDI-02-00A0 overlay using the CCL PR#33
BBAI-AUDI-02-00A0.dts: Solved the output audio frequency issue in PR#36
cloud9-examples: solved a compilation issue in PR#57
Bela: PruManager Rproc + MMap/ prussdrv+UIO implementation PR#1
giuliomoro/prudebug: Add support for AM57x PR#2
MarkAYoder/BeagleBoard-exercises: prudebug: Add BBAI support PR#7
Bela: Add support for BeagleBone AI PR#668
Documentation: using Doxygen for PruManager PR#4
Due to the introduction of a new concept called IRQ_CROSSBAR for handling interrupts from peripherals in the AM572x chips, porting the existing Bela code that uses interrupts proved to be a bit complicated.
After going through the AM572x manual, a workflow was suggested. However, on testing this workflow, a few steps still seem to be missing.
Essentially, what we were trying to achieve was McASP-to-PRU interrupts like those in the pru_rtaudio_irq.p code.
Materials referred were: AM572x sitara manual and PRU-ICSS Migration Guide
TLDR of shortcomings:
IRQ_CROSSBARS, which the Bela team and I have been working on for a few days. This may take some more time to implement (however, this shouldn't be much of a deal-breaker for most users).
This project adds support for the Bela cape + Xenomai + PRU on the BeagleBone AI, and the code will now be easier to port to other Texas Instruments systems-on-chip.
By going through the steps needed to have the Bela environment running on BBAI, we will go through refactoring and rationalization, using mainline drivers and APIs where possible. This will make Bela easier to maintain and to port to new platforms, benefiting the project's longevity and allowing it to expand its user base.
Just ordered a Bela cape. The platform seems really cool and exactly what I was looking for :) Just thought I would add that, I would also be very interested in having a bit more processing power under the hood and the ai would definitely be enough for my purposes.
-A User on Bela Forum