First try with ARM MCU bare metal
Added by Runar Tenfjord about 1 year ago
Based on the rpi2brun.asm runtime I tried to generate code for
an ARM Cortex-M0 and run the result on the QEMU ARM emulator
through the ARM semihost debug interface.
I use the following files.
test.mod:
MODULE Test;

IMPORT SYSTEM;

(** A Boolean value indicating whether the last output operation was successful. *)
VAR Done-: BOOLEAN;

PROCEDURE ^ Putchar ["putchar"] (character: INTEGER): INTEGER;

(** Prints a character value to the standard output stream. *)
PROCEDURE Char (value: CHAR);
BEGIN Done := Putchar (ORD (value)) = ORD (value);
END Char;

(** Prints a string value to the standard output stream. *)
PROCEDURE String (value-: ARRAY OF CHAR);
VAR i: LENGTH; char: CHAR;
BEGIN FOR i := 0 TO LEN (value) - 1 DO char := value[i]; IF char = 0X THEN RETURN END; Char (char) END;
END String;

BEGIN
	String('test123')
END Test.
armt32semihostrun.asm:
.header _header
	.required
	.origin 0x00001000
	ldr r0, 0x00 ; dummy code

.data ram
	.required
	.origin 0x20000000 ; ram start
	.reserve 0x1000 ; how to define size?

; vector table
.const bootflash
	.required
	.origin 0x0000000
	.qbyte 0x20004000 ; end of ram, start of stack, should be able to calculate this
	.qbyte 0x00001000 ; flash start of code
	.qbyte @handle
	.qbyte @handle
	.qbyte @handle
	.qbyte @handle
	.qbyte @handle
	.qbyte @handle
	.qbyte @handle
	.qbyte @handle
	.qbyte @handle
	.qbyte @handle
	.qbyte @handle
	.qbyte @handle
	.qbyte @handle
	.qbyte @handle
	.reserve 0xFC0 ; how to define size easier?

; standard handle
.code handle
	loop: b loop

; standard stderr variable
.const stderr
	.alignment 4
	.qbyte 0

; standard stdin variable
.const stdin
	.alignment 4
	.qbyte 0

; standard stdout variable
.const stdout
	.alignment 4
	.qbyte 0

; standard abort function
.code abort
	ldr r0, 0x18
	mov r1, 0x20002
	bkpt 0xab
	bx lr

; standard _Exit function
.code _Exit
	ldr r0, 0x18
	mov r1, 0x20002
	bkpt 0xab
	bx lr

; standard fclose function
.code fclose
	bx lr

; standard fgetc function
.code fgetc
	ldr r0, 0x07
	mov r1, 0
	bkpt 0xab
	bx lr

; standard fputc function
.code fputc
	ldr r0, 0x03
	mov r1, sp
	add r1, 4
	bkpt 0xab
	bx lr

; standard putchar function
.code putchar
	ldr r0, 0x03
	mov r1, sp
	bkpt 0xab
	ldr r0, 1
	bx lr

; standard free function
.code free
	bx lr

; standard malloc function
.code malloc
	ldr r2, [pc, offset (heap)]
	ldr r0, [r2, 0]
	ldr r3, [sp, 0]
	add r3, r3, r0
	str r3, [r2, 0]
	bx lr
	heap: .qbyte @_heap_start

; heap start
.data _heap_start
	.alignment 4
	.qbyte 0x20000000

; last section
.trailer _trailer
	.required
	.alignment 4
Makefile:
OB := obarmt32
AS := armt32asm
LK := linkmem
QEMU := qemu-system-arm
DEVICE := microbit
ADR := 0x00000000

build: test.rom

test.rom: test.obf armt32semihostrun.obf
	$(LK) $^

test.obf: test.mod
	$(OB) $<

armt32semihostrun.obf: armt32semihostrun.asm
	$(AS) $<

run: test.rom
	$(QEMU) -M $(DEVICE) -semihosting -nographic -device loader,file=$<,addr=$(ADR)

test.elf: test.mod
	llvm-objcopy -I binary -O elf32-littlearm --rename-section=.data=.text,code $< $@

dis: test.elf
	llvm-objdump -d $<

clean:
	rm -f *.sym *.hex *.elf *.obf *.ram *.rom *.map *.lst *.dbg

.PHONY: clean run dis
Building is OK. When I run the generated .rom file in QEMU,
the stack pointer and pc seem to be correct, confirming that
it was loaded correctly, but it ends with a hard fault:
qemu-system-arm -M microbit -semihosting -nographic -device loader,file=test.rom,addr=0x00000000
qemu: fatal: Lockup: can't escalate 3 to HardFault (current priority -1)
R00=00000000 R01=00000000 R02=00000000 R03=00000000
R04=00000000 R05=00000000 R06=00000000 R07=00000000
R08=00000000 R09=00000000 R10=00000000 R11=00000000
R12=00000000 R13=20003fe0 R14=fffffff9 R15=000010fc
XPSR=40000003 -Z-- A handler
FPSCR: 00000000
The .map file seems fine, with the code placed after 0x1000 and
the variable placed in ram.
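For readers who want to interpret a register dump like the one in this post, here is a minimal Python sketch that decodes the XPSR and R14 values. The bit positions follow the Armv6-M/Armv7-M definitions of xPSR and EXC_RETURN; the function and table names are made up for this illustration and are not part of QEMU or the ECS.

```python
# Hypothetical helper names; only the bit layout comes from the Arm architecture.
EXCEPTION_NAMES = {
    0: "Thread mode", 2: "NMI", 3: "HardFault",
    11: "SVCall", 14: "PendSV", 15: "SysTick",
}

EXC_RETURN_TARGETS = {
    0xFFFFFFF1: "Handler mode, main stack",
    0xFFFFFFF9: "Thread mode, main stack",
    0xFFFFFFFD: "Thread mode, process stack",
}


def decode_xpsr(xpsr):
    """Split an xPSR value into exception number, condition flags and Thumb bit."""
    number = xpsr & 0x1FF  # IPSR field: number of the active exception
    flags = "".join(
        name
        for name, bit in (("N", 31), ("Z", 30), ("C", 29), ("V", 28))
        if (xpsr >> bit) & 1
    )
    return {
        "exception": EXCEPTION_NAMES.get(number, "exception %d" % number),
        "flags": flags,
        "thumb": bool((xpsr >> 24) & 1),  # EPSR.T, must be set on Cortex-M
    }


def decode_exc_return(lr):
    """Describe where an EXC_RETURN value found in LR would return to."""
    return EXC_RETURN_TARGETS.get(lr, "not an EXC_RETURN value")


info = decode_xpsr(0x40000003)       # XPSR from the dump above
target = decode_exc_return(0xFFFFFFF9)  # R14 from the dump above
```

For the values in the dump, this reports an active HardFault with the Thumb bit cleared, which appears to match the "A" printed after the flags. A cleared Thumb bit is one plausible reason for a Cortex-M core to fault on the very first instruction.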
I could not find out how to disassemble the raw binary and therefore
used llvm for this.
Looking at the disassembled binary file, the code looks
messed up. Using an online decompiler tool
on the hex dump, it seems to be Thumb code in big-endian encoding,
which should perhaps be little endian.
Is this an endian issue or is it something else I am doing wrong here?
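One way to check the byte order question without a disassembler is to look for a known instruction in the raw image. The sketch below assumes the image contains the 16-bit Thumb encoding of bkpt 0xab, which is 0xBEAB; on a little-endian target it is stored as the bytes AB BE. The helper names are invented for this example.

```python
import struct

BKPT_AB = 0xBEAB  # 16-bit Thumb encoding of "bkpt 0xab"


def thumb_halfwords(data, byteorder="<"):
    """Interpret raw bytes as 16-bit halfwords ('<' little, '>' big endian)."""
    count = len(data) // 2
    return list(struct.unpack("%s%dH" % (byteorder, count), data[: count * 2]))


def guess_byte_order(data):
    """Return 'little', 'big' or 'unknown' depending on where bkpt 0xab shows up."""
    if BKPT_AB in thumb_halfwords(data, "<"):
        return "little"
    if BKPT_AB in thumb_halfwords(data, ">"):
        return "big"
    return "unknown"


sample = bytes([0xAB, 0xBE])  # how a little-endian image stores bkpt 0xab
order = guess_byte_order(sample)
```

This only works when the marker instruction is halfword aligned and actually present, so a result of "unknown" says nothing either way.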
Replies (17)
RE: First try with ARM MCU bare metal - Added by Runar Tenfjord about 1 year ago
The disassembler line should read:
test.elf: test.rom
	llvm-objcopy -I binary -O elf32-littlearm --rename-section=.data=.text,code $< $@
RE: First try with ARM MCU bare metal - Added by Florian Negele about 1 year ago
> Is this an endian issue or is it something else I am doing wrong here?
I do not know the details of the emulated machine and its boot process, so I cannot really help here. While it seems that the stack pointer is initialised correctly, putting a loop: b loop at the start of the _header section does not change the behaviour, so it seems that something is executed prior to that, or there is indeed some encoding issue that possibly causes nested interrupts.
> I could not find out how to disassemble the raw binary and therefore used llvm for this.
For disassembling, you can create an object file from a raw file by embedding it in an assembly file using the .embed "test.rom" directive.
> .reserve 0x1000 ; how to define size?
There is no need to define the size, as all data sections will be placed by the linker after the fixed ram section. The _trailer section is empty and will be placed behind all other data sections, so its address can be used to compute the size of the required memory.
> .reserve 0xFC0 ; how to define size easier?
Use the .pad 0x1000 directive to reserve space up to the required padding.
RE: First try with ARM MCU bare metal - Added by Runar Tenfjord about 1 year ago
A reduced test case which works with GCC.
test.asm:
.cpu cortex-m0
.thumb

.thumb_func
.global _start
_start:
stacktop: .word 0x20004000
.word run
.word handle
.word handle
.word handle
.word handle
.word handle
.word handle
.word handle
.word handle
.word handle
.word handle
.word handle
.word handle
.word handle
.word handle

.thumb_func
run:
	ldr r0,=0x04
	ldr r1,=string
	bkpt 0xab
	ldr r0,=0x18
	ldr r1,=0x20026
	bkpt 0xab
loop:	b loop

.thumb_func
handle:	b .

.section .rodata
.align 2
string: .ascii "testing123\0"
.end
Makefile:
ARM := /c/GNUArmEmbeddedToolchain/bin/arm-none-eabi
AFLAGS := --warn --fatal-warnings -mcpu=cortex-m0
QEMU := qemu-system-arm
DEVICE := microbit
ADR := 0x00000000

build: test.rom

test.rom: test.o test.ld
	$(ARM)-ld -o test.elf -T test.ld test.o
	$(ARM)-objdump -D test.elf > test.lst
	$(ARM)-objcopy test.elf test.rom -O binary

test.o: test.asm
	$(ARM)-as $(AFLAGS) test.asm -o test.o

run: test.rom
	$(QEMU) -M $(DEVICE) -semihosting -nographic -device loader,file=$<,addr=$(ADR)

clean:
	rm -f *.o *.elf *.lst *.rom

.PHONY: clean run
This prints the string 'testing123' and exits with an exit code of 0.
RE: First try with ARM MCU bare metal - Added by Runar Tenfjord about 1 year ago
The same test case with ECS.
test.asm:
.header _header
	.required
	.origin 0x00000100
	ldr r0, [pc, offset (SYS_WRITE0)]
	ldr r1, [pc, offset (string)]
	bkpt 0xab
	ldr r0, [pc, offset (angel_SWIreason_ReportException)]
	ldr r1, [pc, offset (ADP_Stopped_ApplicationExit)]
	bkpt 0xab
	j: b j
	SYS_WRITE0: .qbyte @_SYS_WRITE0
	string: .qbyte @_string
	angel_SWIreason_ReportException: .qbyte @_angel_SWIreason_ReportException
	ADP_Stopped_ApplicationExit: .qbyte @_ADP_Stopped_ApplicationExit

.const _SYS_WRITE0
	.alignment 4
	.qbyte 0x04

.const _angel_SWIreason_ReportException
	.alignment 4
	.qbyte 0x18

.const _ADP_Stopped_ApplicationExit
	.alignment 4
	.qbyte 0x20026

.const _string
	.alignment 4
	.byte "testing123", 0

.data ram
	.required
	.origin 0x20000000 ; ram start
	.reserve 0x1000

; vector table
.const bootflash
	.required
	.origin 0x0000000
	.qbyte 0x20001000 ; end of ram, start of stack, should be able to calculate this
	.qbyte 0x00000100 ; flash start of code
	.qbyte @handle
	.qbyte @handle
	.qbyte @handle
	.qbyte @handle
	.qbyte @handle
	.qbyte @handle
	.qbyte @handle
	.qbyte @handle
	.qbyte @handle
	.qbyte @handle
	.qbyte @handle
	.qbyte @handle
	.qbyte @handle
	.qbyte @handle

; standard handle
.code handle
	loop: b loop

; main
.code main
	loop: b loop

; last section
.trailer _trailer
	.required
	.alignment 4
Makefile:
OB := obarmt32
AS := armt32asm
DAS := armt32dism
LK := linkmem
QEMU := qemu-system-arm
DEVICE := microbit # m0
ADR := 0x00000000

build: test.rom

test.rom: test.obf
	$(LK) $^

test.obf: test.asm
	$(AS) $<

run: test.rom
	$(QEMU) -M $(DEVICE) -semihosting -nographic -device loader,file=$<,addr=$(ADR)

dis.obf: dis.asm test.rom
	$(AS) $<

dis: dis.obf
	$(DAS) $<

clean:
	rm -f *.sym *.hex *.elf *.obf *.ram *.rom *.map *.lst *.dbg

.PHONY: clean run dis
This fails with:
qemu-system-arm -M microbit -semihosting -nographic -device loader,file=test.rom,addr=0x00000000
qemu: fatal: Lockup: can't escalate 3 to HardFault (current priority -1)
R00=00000000 R01=00000000 R02=00000000 R03=00000000
R04=00000000 R05=00000000 R06=00000000 R07=00000000
R08=00000000 R09=00000000 R10=00000000 R11=00000000
R12=00000000 R13=20000fe0 R14=fffffff9 R15=00000042
XPSR=40000003 -Z-- A handler
FPSCR: 00000000
Looking at the disassembly, it seems that ECS emits the ldr.w instruction,
which is a 32-bit instruction, whereas GCC generates the normal 16-bit ldr instruction.
The Cortex-M0 only supports:
- Thumb-1 (most), missing CBZ, CBNZ, IT
- Thumb-2 (some), only BL, DMB, DSB, ISB, MRS, MSR
I tried to change the CPU to a Cortex-M4, which should support all of
Thumb-1 and Thumb-2, but it still failed.
So it is probably some illegal instruction or unaligned access.
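To spot 32-bit encodings such as ldr.w in a hex dump, it is enough to look at the top five bits of each instruction's first halfword: the patterns 0b11101, 0b11110 and 0b11111 start a 32-bit Thumb instruction, everything else is 16 bits wide. The following Python sketch applies that rule; the function names are hypothetical, and the Cortex-M0 check is only a coarse filter based on the six 32-bit instructions listed above, all of which start with 0b11110.

```python
def is_32bit_thumb(first_halfword):
    """True if this halfword is the first half of a 32-bit Thumb instruction."""
    return (first_halfword >> 11) in (0b11101, 0b11110, 0b11111)


def instruction_sizes(halfwords):
    """Walk a list of little-endian-decoded halfwords and return each instruction size in bytes."""
    sizes = []
    index = 0
    while index < len(halfwords):
        size = 4 if is_32bit_thumb(halfwords[index]) else 2
        sizes.append(size)
        index += size // 2
    return sizes


def may_be_valid_on_cortex_m0(first_halfword):
    """Coarse filter: BL, DMB, DSB, ISB, MRS and MSR all start with 0b11110."""
    if not is_32bit_thumb(first_halfword):
        return True
    return (first_halfword >> 11) == 0b11110


# 0x4801: 16-bit ldr (literal); 0xF8DF: first half of ldr.w (literal); 0xBEAB: bkpt 0xab
sizes = instruction_sizes([0x4801, 0xF8DF, 0x0004, 0xBEAB])
```

A first halfword of 0xF8DF therefore cannot belong to any instruction the Cortex-M0 implements, while a 16-bit halfword still needs a real disassembler to be sure.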
RE: First try with ARM MCU bare metal - Added by Florian Negele about 1 year ago
Is there some documentation about the specifics of the boot process and memory map of the emulated MCU?
RE: First try with ARM MCU bare metal - Added by Runar Tenfjord about 1 year ago
The emulated chip is the Nordic nRF51822, which has a
Cortex-M0 core with peripherals added by Nordic:
I believe the real chip has firmware which is not included
in the QEMU version, which is blank.
Cortex-M0 Technical Reference Manual:
Armv6-M Architecture Reference Manual:
RE: First try with ARM MCU bare metal - Added by Florian Negele about 1 year ago
Thanks for the information. Please note that the compiler does not support 16-bit only instructions at the moment. In order to make sure that the assembler uses only 16-bit instructions, use the .n suffix. Also, the boot vector seems to need odd target addresses to select the T32 instruction set. The .rom file contains code sections while the .ram file contains all other sections, which is why the bootflash memory needs to be marked as code.
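The odd-address requirement can be checked on the host before starting the emulator. This is a small Python sketch that reads the first 16 words of a ROM image as a Cortex-M vector table and reports handler entries with bit 0 cleared. The function name and the sample handler address 0x10F0 are assumptions for the example; only the stack and reset values are taken from the first post.

```python
import struct


def check_vector_table(image, entries=16):
    """Return (initial_sp, problems) for a little-endian Cortex-M vector table.

    problems lists (index, address) for every non-zero handler entry whose
    bit 0 is clear, i.e. which would not select the Thumb instruction set.
    """
    words = struct.unpack_from("<%dI" % entries, image, 0)
    initial_sp, handlers = words[0], words[1:]
    problems = [
        (index, address)
        for index, address in enumerate(handlers, start=1)
        if address != 0 and address & 1 == 0
    ]
    return initial_sp, problems


# Table as in the first post: even reset and handler addresses (0x10F0 is made up).
broken = struct.pack("<16I", 0x20004000, 0x00001000, *([0x000010F0] * 14))
# Same table with the Thumb bit set on every handler entry.
fixed = struct.pack("<16I", 0x20004000, 0x00001001, *([0x000010F1] * 14))

sp, problems = check_vector_table(broken)
```

With a real image, the bytes would come from something like open("test.rom", "rb").read(); an empty problems list means every populated entry is odd.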
RE: First try with ARM MCU bare metal - Added by Florian Negele about 1 year ago
The following code seems to work but requires the attached patch to compile:
.code vector
	.required
	.origin 0x00000000
	.qbyte 0x20004000
	.qbyte @main + 1
	#repeat 15
		.qbyte @handle + 1
	#endrep

.code main
	.alignment 4
	ldr.n r0, offset (SYS_WRITE0) + offset (SYS_WRITE0) % 4
	ldr.n r1, offset (text) + offset (text) % 4
	bkpt.n 0xab
	ldr.n r0, offset (SYS_EXIT) + offset (SYS_EXIT) % 4
	ldr.n r1, offset (ADP_Stopped_ApplicationExit) + offset (ADP_Stopped_ApplicationExit) % 4
	bkpt.n 0xab
	.align 4
	SYS_WRITE0: .qbyte 0x04
	SYS_EXIT: .qbyte 0x18
	ADP_Stopped_ApplicationExit: .qbyte 0x20026
	text: .qbyte @text

.code text
	.byte "hello\n", 0

.code handle
	loop: b loop
RE: First try with ARM MCU bare metal - Added by Runar Tenfjord about 1 year ago
Excellent. One step further. Now it is working in the armt32 assembler.
Apparently the BX instruction and the vectors use bit 0 to indicate the Thumb instruction set.
Based on this I updated the Oberon test code to use the Cortex-M4 CPU,
which supports Thumb-1 and Thumb-2, since as I understand it the Oberon
compiler creates mixed code.
I tested the assembler code with the Cortex-M4 CPU and it works as expected.
I believe all Cortex-M types are backward compatible with Thumb-1 code.
armt32semihostrun.asm:
; Only works on Cortex-M profile ARMv6-M and ARMv7-M
; Uses only Thumb1 16bit instructions to support Cortex-M0/M0+
; Flash/memory origin and size must be changed to values for target device.

.code vector
	.required
	.origin 0x00000000 ; flash
	.qbyte 0x20004000 ; stack = ram top
	.qbyte @start + 1 ; +1 for Thumb flag
	#repeat 15
		.qbyte @handle + 1 ; +1 for Thumb flag
	#endrep

.code handle
	.alignment 4
	loop: b.n loop

.code start
	.alignment 4
	nop.n

; last section
.trailer _trailer

.data ram
	.required
	.origin 0x20000000 ; ram start

; standard abort function
.code abort
	.alignment 4
	ldr.n r0, offset (SYS_EXIT) + offset (SYS_EXIT) % 4
	ldr.n r1, offset (ADP_Stopped_ApplicationExit) + offset (ADP_Stopped_ApplicationExit) % 4
	bkpt.n 0xab
	loop: b.n loop
	.align 4
	SYS_EXIT: .qbyte 0x18
	ADP_Stopped_ApplicationExit: .qbyte 0x20026

; standard _Exit function
.code _Exit
	.alignment 4
	bl @abort

; standard getchar function
.code getchar
	.alignment 4
	ldr.n r0, offset (SYS_READC) + offset (SYS_READC) % 4
	mov r1, 0x00
	bkpt.n 0xab
	bx.n lr
	.align 4
	SYS_READC: .qbyte 0x07

; standard free function
.code free
	.alignment 4
	bx.n lr

; standard malloc function
.code malloc
	.alignment 4
	ldr.n r2, offset (heap) + offset (heap) % 4
	ldr.n r0, [r2, 0]
	ldr.n r3, [sp, 0]
	add.n r3, r3, r0
	str.n r3, [r2, 0]
	bx.n lr
	heap: .qbyte @_heap_start

; heap start
.data _heap_start
	.alignment 4
	.qbyte 0x20000000 ; ram start

; standard putchar function
.code putchar
	.alignment 4
	ldr.n r0, offset (SYS_WRITEC) + offset (SYS_WRITEC) % 4
	ldr.n r1, [sp, 0]
	bkpt.n 0xab
	mov r0, 0x01
	bx.n lr
	.align 4
	SYS_WRITEC: .qbyte 0x03
test.mod:
MODULE Test;

IMPORT SYSTEM;

VAR Done-: BOOLEAN;

PROCEDURE ^ Putchar ["putchar"] (character: INTEGER): INTEGER;

BEGIN
	Done := Putchar (ORD ('x')) = ORD ('x');
END Test.
Makefile:
OB := obarmt32
AS := armt32asm
DAS := armt32dism
LK := linkmem
QEMU := qemu-system-arm
DEVICE := mps2-an386 # cortex-m4
ADR := 0x00000000

build: test.rom

test.rom: test.obf armt32semihostrun.obf
	$(LK) $^

test.obf: test.mod
	$(OB) $<

armt32semihostrun.obf: armt32semihostrun.asm
	$(AS) $<

run: test.rom
	$(QEMU) -M $(DEVICE) -semihosting -nographic -device loader,file=$<,addr=$(ADR)

dis.obf: dis.asm test.rom
	$(AS) $<

dis: dis.obf
	$(DAS) $<

clean:
	rm -f *.sym *.hex *.elf *.obf *.ram *.rom *.map *.lst *.dbg

.PHONY: clean run dis
It runs, but without the expected output.
Is this perhaps because the BX instruction is given an even address?
RE: First try with ARM MCU bare metal - Added by Florian Negele about 1 year ago
By jumping to start, your boot vector skips the module initialisation and flows off into abort; see the generated map file. Start at the end of the vector instead:
	.qbyte extent (@vector) + 1
The SYS_WRITEC operation requires a pointer to the character rather than the character itself:
	mov r1, sp
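To make the pointer convention concrete, here is a toy Python model of what the host side does with the two output operations used in this thread. The operation numbers 0x03 and 0x04 are the ones from the posts above; the memory model, the base address parameter and the function name are purely illustrative and not how QEMU is implemented.

```python
SYS_WRITEC = 0x03  # r1 points to the character to print
SYS_WRITE0 = 0x04  # r1 points to a NUL-terminated string


def host_side_output(op, r1, memory, base=0x20000000):
    """Return the text a semihosting host would print for the given call.

    memory is a bytes object modelling target RAM starting at address base.
    """
    offset = r1 - base
    if op == SYS_WRITEC:
        # The host dereferences r1; it does not treat r1 as the character.
        return chr(memory[offset])
    if op == SYS_WRITE0:
        end = memory.index(0, offset)
        return memory[offset:end].decode("ascii")
    raise ValueError("operation 0x%02x is not modelled here" % op)


ram = b"x\x00hello\x00"
char_text = host_side_output(SYS_WRITEC, 0x20000000, ram)
string_text = host_side_output(SYS_WRITE0, 0x20000002, ram)
```

Passing the character value itself in r1 would make the host read from an unrelated low address, which matches the missing output described above.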
RE: First try with ARM MCU bare metal - Added by Runar Tenfjord about 1 year ago
Thank you very much for your time fixing this.
I guess now Oberon code should just work with
NEW, DISPOSE and TRACE covered by the runtime code.
There is an error in the putchar code.
It should return the character printed:
	ldr.n r0, offset (SYS_WRITEC) + offset (SYS_WRITEC) % 4
	mov r1, sp
	bkpt.n 0xab
	ldr.n r0, [r1]
	bx.n lr
I believe this runtime should now work with a large number of MCUs.
I will test this code with real hardware when I get back to the
office. The nice thing here is that the semihost interface works
through the programming probe with gdb or similar tools.
There are many more features in this semihost interface, but
I have not needed anything other than printing.
This also makes automatic testing through QEMU possible,
as the printed output can be redirected and checked on
the host.
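A minimal sketch of such a host-side check in Python could look as follows. It reuses the QEMU command line from the Makefiles in this thread; the function names, the timeout and the decision to merge stderr into stdout are assumptions, since where the semihosting output ends up can depend on the QEMU version and configuration.

```python
import subprocess


def qemu_command(rom, machine="microbit", address=0):
    """Build the QEMU command line used by the 'run' target of the Makefiles."""
    return [
        "qemu-system-arm",
        "-M", machine,
        "-semihosting",
        "-nographic",
        "-device", "loader,file=%s,addr=0x%08x" % (rom, address),
    ]


def run_and_capture(rom, timeout=10.0, **options):
    """Run the image and return (exit_code, combined_output)."""
    result = subprocess.run(
        qemu_command(rom, **options),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # capture both streams in one string
        text=True,
        timeout=timeout,           # a locked-up guest raises TimeoutExpired
    )
    return result.returncode, result.stdout


def output_matches(rom, expected, **options):
    """True if the image exits with code 0 and prints exactly the expected text."""
    code, output = run_and_capture(rom, **options)
    return code == 0 and output == expected
```

A test runner would then call something like output_matches("test.rom", "testing123") for each image, which requires qemu-system-arm to be on the PATH.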
It would also be nice to have a way to run code in RAM only
for testing, to avoid wearing out the flash. I think that
is possible by first flashing a ROM which redirects start to
RAM and then having a second "ROM" which is placed in RAM.
The only thing I can think of that could be a problem is interrupt
handlers in Oberon. As these are run directly by the hardware,
the procedures cannot operate on the stack. We would then need
some kind of "naked" functions similar to LLVM/GCC.
RE: First try with ARM MCU bare metal - Added by Florian Negele about 1 year ago
Thanks for the fix. For code running in RAM only you can use the ordinary binary file linker and have an empty vector at the desired target address. For "naked procedures" you could define assembler stubs that set up a fake stack in order to call Oberon procedures. The stubs need to store and restore registers and return from the interrupt with special instructions anyway.
RE: First try with ARM MCU bare metal - Added by Runar Tenfjord about 1 year ago
I have now updated the runtime and moved the semihost functionality to
an Oberon module which replaces the dummy routines in the runtime at
link time. I think this is a cleaner solution as it keeps the runtime general.
I was also able to use the interrupts from an Oberon module.
It turns out that on the ARMv7-M platform the stack is set up by the hardware
when an ISR is called, so there is no issue.
Attached is some test code with this functionality, which
implements an SVC call trap and prints the information
pushed to the stack at the call location.
The only thing is that I am not able to find the SVC value in the XPSR
register. Otherwise it is working as expected.
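On the SVC value: as far as the Arm architecture goes, the immediate is not copied into any register. It is encoded in the low byte of the 16-bit SVC instruction (0xDF00 | imm8), so a handler usually reads it from memory just below the stacked return address. The Python sketch below models that lookup; the frame layout R0, R1, R2, R3, R12, LR, PC, xPSR is the standard exception frame, while the function name and the dictionary used as memory are made up for the example.

```python
STACKED_PC_INDEX = 6  # frame order: R0, R1, R2, R3, R12, LR, PC, xPSR


def svc_number(stacked_frame, read_byte):
    """Return the SVC immediate for an exception frame pushed on SVC entry.

    read_byte(address) must return the byte stored at the given target address.
    """
    return_address = stacked_frame[STACKED_PC_INDEX] & ~1
    # The SVC instruction is the halfword just before the return address;
    # on a little-endian target its low byte (the immediate) comes first.
    return read_byte(return_address - 2)


# "svc 0x2a" assembled at 0x1000 is the halfword 0xDF2A, stored as 2A DF.
memory = {0x1000: 0x2A, 0x1001: 0xDF}
frame = [0, 0, 0, 0, 0, 0xFFFFFFFF, 0x1002, 0x01000000]

number = svc_number(frame, memory.__getitem__)
```

In the runtime itself this corresponds to loading the stacked PC from offset 24 of the frame and then reading one byte at that address minus 2.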
I see that the language-defined traps use a HLT instruction.
I do not think I am able to catch this in the runtime. Perhaps
this should be a replaceable function in the runtime?
Ideally I would like to use an SVC call with R0 set to the
trap number. Then I could handle this in the runtime and, when in
deployed mode, be able to hard reset the MCU.
Attached is also the general runtime armv7mrun.asm.
Is there perhaps a way to reuse this in the stm32f4run.asm code
and just replace the empty ISR block in armv7mrun.asm?
Semihost.mod (1.33 KB)
stm32f4run.asm (11.1 KB)
testTrap.mod (1.96 KB)
Makefile (1.01 KB)
armv7mrun.asm (3.01 KB)
RE: First try with ARM MCU bare metal - Added by Runar Tenfjord about 1 year ago
After testing directly on hardware I found some issues:
- QEMU does not complain about non-aligned access to memory, but the MCU triggers
a hard fault trap. It turns out the malloc code in the runtime needs to be updated here.
The free list allocator also needs to be updated.
- Semihosting is a really expensive operation. Printing character by character is really
slow, and I had to implement some line buffering.
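The line buffering mentioned in the second point can be illustrated with a short Python model. It collects characters and hands them to a single string-writing callback (standing in for one SYS_WRITE0 call) when a newline arrives or the buffer is full. The class name, the callback and the default capacity are assumptions; the actual Oberon implementation is not shown in this thread.

```python
class LineBufferedWriter:
    """Collect characters and emit them in one call per line."""

    def __init__(self, write_string, capacity=80):
        self._write_string = write_string  # stands in for one semihosting call
        self._capacity = capacity
        self._buffer = []

    def put(self, character):
        """Queue one character; flush on newline or when the buffer is full."""
        self._buffer.append(character)
        if character == "\n" or len(self._buffer) >= self._capacity:
            self.flush()

    def flush(self):
        """Send any pending characters to the host in a single call."""
        if self._buffer:
            self._write_string("".join(self._buffer))
            self._buffer.clear()


calls = []
writer = LineBufferedWriter(calls.append, capacity=8)
for character in "ab\ncd":
    writer.put(character)
pending_before_flush = list(calls)
writer.flush()
```

Here five characters cost two host calls instead of five; the explicit flush at the end matters, because text without a trailing newline would otherwise never be printed.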
Attached is the updated runtime code with a fix for aligned memory access and
clearing of RAM on startup.
The alignment issue in the free list allocator was a one-line fix:
- Heap := ADR(heapStart);
+ Heap := ADDRESS(SET32(ADR(heapStart) + 3) * (-SET32(3))); (* round up address to next word *)
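The set expression in the fix is the usual round-up idiom: add 3, then clear the two lowest bits (set intersection with the complement of 3). The same arithmetic in Python, with a hypothetical function name and a general power-of-two alignment:

```python
def align_up(address, alignment=4):
    """Round address up to the next multiple of alignment (a power of two)."""
    mask = alignment - 1
    return (address + mask) & ~mask


aligned = align_up(0x20000001)
```

Addresses that are already word aligned are returned unchanged, so applying the fix to an aligned heap start has no effect.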
Now I am able to run about 600 tests directly on hardware as well, without any issue.
There are some failures related to REAL support, but I will look at these at a later
stage (emulated, non-FPU) as it is not needed now.
armv7mrun.asm (3.01 KB)
RE: First try with ARM MCU bare metal - Added by Runar Tenfjord 7 months ago
Regarding using GDB with the runtime.
In most examples I find online, the ELF format is used in order
to supply extra information to GDB. I do not think I can use the DWARF
information without an ELF file. At least I could not find information on this.
Is it possible to create this with the linkbin command for a
runtime with fixed memory locations, e.g. a bare-metal target?
I tried to convert my runtime following armt32linuxrun.asm, adding the
ELF header information from this file excluding the DLL sections, but just ended
up with a very large binary file of 800 MB.
I followed the link instructions at: https://ecs.openbrace.org/manual/manualse13.html#x19-370003.6
RE: First try with ARM MCU bare metal - Added by Florian Negele 7 months ago
The debugging information linked together with a program currently depends on the linker to resolve the addresses of debugging symbols. Getting the address of a symbol from an external source like the map file is not supported at the moment, sorry.
RE: First try with ARM MCU bare metal - Added by Runar Tenfjord 6 months ago
Thanks for the quick reply and clarification on this subject.