Boot0
Like the original Wii, the Wii U has a first-stage bootloader, dubbed boot0, which resides in 16 KB of Mask ROM inside Starbuck, the Latte's ARM core. Twice the size of the Wii version, the Wii U's boot0 includes several additional features, among them the ability to load a recovery second-stage image from an SD card.
What follows are general descriptions and pseudo-code that illustrate the sub-stages of the Wii U's boot0.
Initialization
boot0 runs from address 0xFFFF0000, where the ARM exception vectors are located. At this point, every exception vector except the reset vector points to itself (a deadlock). Once execution enters the reset vector, the following takes place:
    // Clear 0x10 bytes (will be used later)
    memset_range(0x0D417FE0, 0x0D417FF0, 0, 0x04);
    // Copy boot0 into a mirror in SRAM
    memcpy(0x0D4100A0, 0xFFFF00A0, 0x3680);
    // Set LR to a deadlock
    LR = 0xFFFF004C
    // Set PC
    PC = sub_D4100A0
    // Deadlock
    loc_D41004C();
In other words, boot0 copies its own code (which starts at 0xFFFF00A0) into a special memory region in SRAM (0x0D410000) and jumps there. This behavior could also be observed, to some extent, in the old Wii's bootloader.
Setup
Right after boot0 copies itself over to SRAM, it does the following:
    // Invalidate ICache
    MCR p15, 0, R0,c7,c5, 0
    // Invalidate DCache
    MCR p15, 0, R0,c7,c6, 0
    // Read control register
    MRC p15, 0, R0,c1,c0, 0
    // Set bit 12 (0x1000); on ARM9 cores this is the I bit, which enables the ICache
    r0 = r0 | 0x1000
    // Write control register
    MCR p15, 0, R0,c1,c0, 0
    // Clear 0x351C bytes after boot0
    memset_range(0x0D413720, 0x0D416C3C, 0, 0x04);
    // Set stack pointer
    SP = 0x0D414204 + 0x1000
    // Set return address
    LR = sub_D4100F8    // Jump to boot1
    // Set PC
    PC = sub_D410384    // boot0 main
Essentially, boot0 sets up its own stack and jumps to its main function.
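The addresses used in the two listings above fit together: the SRAM copy ends exactly where the cleared region begins, and the initial stack pointer falls inside that cleared region. The following C sketch only redoes this arithmetic on a host machine; the constant and function names are made up for illustration and do not come from boot0 itself.

```c
#include <stdint.h>
#include <stdio.h>

/* Values copied from the listings above; the names are illustrative. */
#define SRAM_CODE_START 0x0D4100A0u  /* destination of the memcpy */
#define CODE_SIZE       0x3680u      /* bytes copied from Mask ROM */
#define CLEAR_START     0x0D413720u  /* first address zeroed afterwards */
#define CLEAR_END       0x0D416C3Cu  /* end of the zeroed range */
#define STACK_BASE      0x0D414204u
#define STACK_SIZE      0x1000u

/* First address past the SRAM mirror of boot0's code. */
uint32_t sram_code_end(void) { return SRAM_CODE_START + CODE_SIZE; }

/* Number of bytes zeroed after the mirror. */
uint32_t cleared_bytes(void) { return CLEAR_END - CLEAR_START; }

/* Initial stack pointer (the stack grows downwards on ARM). */
uint32_t initial_sp(void) { return STACK_BASE + STACK_SIZE; }

/* Print the derived values for inspection. */
void print_layout(void)
{
    printf("code end:      0x%08lX\n", (unsigned long)sram_code_end());
    printf("cleared bytes: 0x%lX\n", (unsigned long)cleared_bytes());
    printf("initial SP:    0x%08lX\n", (unsigned long)initial_sp());
}
```

The copy ends at 0x0D413720, which is the start of the cleared range, and the cleared range is 0x351C bytes long, matching the comment in the listing.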
Main
This is the bulk of the first-stage bootloader. During the main function's execution, boot0 will send different signals to debug ports via GPIO. These signals can be used to represent error codes or mark different execution sub-stages.
Start
boot0 reads and saves the value from register LT_TIMER.
    // Get timer value
    u32 time_now = *(u32 *)LT_TIMER;
    // Set start time
    *(u32 *)0x0D413768 = 0;
Stage 0x00
boot0 sets a flag in LT_BOOT0 and configures debug ports' GPIOs.
    // Send debug mark
    SendGPIODebugOut(0x00);
    u32 boot0_val = *(u32 *)LT_BOOT0;
    // Set something in LT_BOOT0
    *(u32 *)LT_BOOT0 = boot0_val | 0xC0;
    // Enable GPIO for debug ports
    u32 gpio_enable_val = *(u32 *)LT_GPIO_ENABLE;
    *(u32 *)LT_GPIO_ENABLE = gpio_enable_val | 0x00FF0000;
    // Set direction to output
    u32 gpio_dir_val = *(u32 *)LT_GPIO_DIR;
    *(u32 *)LT_GPIO_DIR = gpio_dir_val | 0x00FF0000;
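The body of SendGPIODebugOut is not shown here. Since the debug ports are enabled with the mask 0x00FF0000, a plausible reading is that the 8-bit mark is placed on GPIO bits 16 to 23. The C sketch below models that assumption against a simulated register; `sim_gpio_out` and `send_gpio_debug_out_sketch` are hypothetical names, not boot0's actual code.

```c
#include <stdint.h>

/* Debug ports occupy GPIO bits 16..23, per the 0x00FF0000 mask above. */
#define DEBUG_GPIO_MASK  0x00FF0000u
#define DEBUG_GPIO_SHIFT 16

/* Hypothetical stand-in for the GPIO output register (plain RAM here). */
uint32_t sim_gpio_out = 0;

/* Assumed behaviour: put the 8-bit mark on the debug pins and leave
 * every other GPIO bit untouched. */
void send_gpio_debug_out_sketch(uint8_t mark)
{
    uint32_t out = sim_gpio_out;
    out &= ~DEBUG_GPIO_MASK;                    /* drop the previous mark */
    out |= (uint32_t)mark << DEBUG_GPIO_SHIFT;  /* insert the new mark */
    sim_gpio_out = out;
}
```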
Stage 0x01
boot0 writes 0x07 to LT_MEMIRR, which appears to configure the memory swap.
    // Send debug mark
    SendGPIODebugOut(0x01);
    // Set memory swap
    *(u32 *)LT_MEMIRR = 0x07;
Stage 0x02
boot0 initializes the AES engine.
    // Send debug mark
    SendGPIODebugOut(0x02);
    *(u32 *)AES_CTRL = 0;
    *(u32 *)AES_KEY = 0;
    *(u32 *)AES_KEY = 0;
    *(u32 *)AES_KEY = 0;
    *(u32 *)AES_KEY = 0;
    *(u32 *)AES_IV = 0;
    *(u32 *)AES_IV = 0;
    *(u32 *)AES_IV = 0;
    *(u32 *)AES_IV = 0;
    *(u32 *)AES_SRC = 0;
    *(u32 *)AES_DEST = 0;
Stage 0x03
boot0 initializes the SHA-1 engine.
    // Send debug mark
    SendGPIODebugOut(0x03);
    *(u32 *)SHA_CTRL = 0;
    *(u32 *)SHA_H0 = 0;
    *(u32 *)SHA_H1 = 0;
    *(u32 *)SHA_H2 = 0;
    *(u32 *)SHA_H3 = 0;
    *(u32 *)SHA_H4 = 0;
    *(u32 *)SHA_SRC = 0;
Stage 0x04
boot0 sets something in LT_COMPAT_AHB and enables OTP reading.
    // Send debug mark
    SendGPIODebugOut(0x04);
    // Set something in AHB compat
    u32 ahb_val = *(u32 *)LT_COMPAT_AHB;
    *(u32 *)LT_COMPAT_AHB = (ahb_val & 0xFFFFF3FF) | 0xC00;
    // Enable OTP reading
    SetOTPReadCommand();
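The LT_COMPAT_AHB update above is a read-modify-write of a 2-bit field: the mask 0xFFFFF3FF clears bits 10 and 11, and 0xC00 then sets both. The C sketch below expresses this with a generic helper; the function names are illustrative and the register is modelled as a plain integer.

```c
#include <stdint.h>

/* Bits 10 and 11, i.e. the complement of the 0xFFFFF3FF mask above. */
#define AHB_FIELD_MASK 0x00000C00u

/* Replace the bits selected by `mask` with `bits`, keeping the rest. */
uint32_t update_field(uint32_t value, uint32_t mask, uint32_t bits)
{
    return (value & ~mask) | (bits & mask);
}

/* The update applied to LT_COMPAT_AHB in stage 0x04. */
uint32_t ahb_compat_update(uint32_t ahb_val)
{
    return update_field(ahb_val, AHB_FIELD_MASK, 0x00000C00u);
}
```

Because both bits of the field end up set, the result is the same as a plain OR with 0xC00; the clear-then-set form simply mirrors the pattern used in the listing.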
Stages 0x05, 0x06 and 0x07
These three stages are all merged together and the only signal sent to the debug ports is effectively 0x05. boot0 asserts some resets and enables EXI.
    // Send debug mark
    SendGPIODebugOut(0x05);
    // Assert RSTB_IOPI
    u32 resets = *(u32 *)LT_RESETS;
    *(u32 *)LT_RESETS = resets | 0x80000;
    // Assert unknown reset
    resets = *(u32 *)LT_RESETS;
    *(u32 *)LT_RESETS = resets | 0x40000;
    // Enable EXI
    u32 exi_ctrl = *(u32 *)LT_EXICTRL;
    *(u32 *)LT_EXICTRL = exi_ctrl | 0x01;
Stage 0x08
During this stage boot0 reads all it needs from the OTP.
    // Send debug mark
    SendGPIODebugOut(0x08);
    // Read security level flag from OTP
    ReadOTP(0x20, 0x0D413760, 0x04);
    // Read IOStrength flags from OTP
    ReadOTP(0x21, 0x0D41375C, 0x04);
    // Read SEEPROM pulse length from OTP
    ReadOTP(0x22, 0x0D413758, 0x04);
    // Read SEEPROM key from OTP
    ReadOTP(0x28, 0x0D41376C, 0x10);
    // Read boot1 key from OTP
    ReadOTP(0xE8, 0x0D41377C, 0x10);
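The first argument of ReadOTP appears to be an index in 32-bit words rather than a byte offset, since the three consecutive 4-byte reads use 0x20, 0x21 and 0x22. Under that assumption, the C sketch below models the call against a simulated OTP array. `sim_otp`, `otp_byte_offset` and `read_otp_sketch` are hypothetical names, and the array size is only chosen to cover the indices used above.

```c
#include <stdint.h>
#include <string.h>

/* Simulated OTP contents, indexed in 32-bit words (assumption). */
#define OTP_WORDS 0x100u
uint32_t sim_otp[OTP_WORDS];

/* Byte offset corresponding to a word index. */
uint32_t otp_byte_offset(uint32_t word_index)
{
    return word_index * 4u;
}

/* Copy `size_bytes` from the simulated OTP, starting at `word_index`.
 * Returns 0 on success, -1 if the request is unaligned or out of range. */
int read_otp_sketch(uint32_t word_index, void *dst, uint32_t size_bytes)
{
    uint32_t words = size_bytes / 4u;

    if (size_bytes % 4u != 0)
        return -1;
    if (word_index >= OTP_WORDS || words > OTP_WORDS - word_index)
        return -1;

    memcpy(dst, &sim_otp[word_index], size_bytes);
    return 0;
}
```

With this reading, the SEEPROM key (index 0x28, 0x10 bytes) covers byte offsets 0xA0 to 0xAF and the boot1 key (index 0xE8) starts at byte offset 0x3A0.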
Stage 0x09
boot0 analyses the IOStrength flags read in the previous stage and sets the strength of various devices.
Stage 0x0A
boot0 sets the SEEPROM's pulse length and configures the SEEPROM GPIOs.
Stage 0x0B
boot0 analyses the OTP security level flag and decides which keys to use.
    // Send debug mark
    SendGPIODebugOut(0x0B);
    u32 sec_lvl_flag = *(u32 *)0x0D413760;
    // Forcefully throw error
    if (sec_lvl_flag == 0x40000000)
        throw_error();
    // Factory mode
    if (sec_lvl_flag == 0x00000000)
        set_empty_aes_keys();
    else
    {
        // Disable boot1 AES key access
        u32 otpprot_val = *(u32 *)LT_OTPPROT;
        *(u32 *)LT_OTPPROT = otpprot_val & 0xDFFFFFFF;
        // This is a devkit unit
        if ((sec_lvl_flag & 0x18000000) == 0x08000000)
            set_debug_aes_keys();
        // This is a retail unit
        else if ((sec_lvl_flag & 0x10000000) == 0x10000000)
            set_retail_aes_keys();
        else
            throw_error();
    }
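The decision in the listing above can be restated as a pure function, which makes the outcome for each flag value easy to check. This is a sketch of the same branching only; the enum and the function name are illustrative and do not appear in boot0.

```c
#include <stdint.h>

/* Which key set the listing above would select (names are illustrative). */
typedef enum {
    KEYS_ERROR,    /* throw_error() */
    KEYS_FACTORY,  /* set_empty_aes_keys() */
    KEYS_DEVKIT,   /* set_debug_aes_keys() */
    KEYS_RETAIL    /* set_retail_aes_keys() */
} key_set_t;

key_set_t select_key_set(uint32_t sec_lvl_flag)
{
    if (sec_lvl_flag == 0x40000000u)
        return KEYS_ERROR;                 /* forced error */
    if (sec_lvl_flag == 0x00000000u)
        return KEYS_FACTORY;               /* factory mode */
    if ((sec_lvl_flag & 0x18000000u) == 0x08000000u)
        return KEYS_DEVKIT;                /* bit 27 set, bit 28 clear */
    if ((sec_lvl_flag & 0x10000000u) == 0x10000000u)
        return KEYS_RETAIL;                /* bit 28 set */
    return KEYS_ERROR;                     /* anything else */
}
```

Note that a flag with both bit 27 and bit 28 set (0x18000000) fails the devkit test and is therefore treated as retail.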
Stage 0x0C
boot0 generates a CRC32 table on its stack.
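The exact routine boot0 uses is not reproduced here. As an illustration, the C sketch below builds the lookup table for the common reflected CRC-32 (polynomial 0xEDB88320), assuming boot0 uses that standard variant. The table is kept in a global rather than on the stack so that it can be inspected, and all names are illustrative.

```c
#include <stddef.h>
#include <stdint.h>

/* 256-entry lookup table for the reflected CRC-32 polynomial. */
uint32_t crc32_table[256];

/* Fill the table: entry i is the CRC of the single byte i. */
void crc32_build_table(void)
{
    for (uint32_t i = 0; i < 256; i++) {
        uint32_t c = i;
        for (int k = 0; k < 8; k++)
            c = (c & 1u) ? (0xEDB88320u ^ (c >> 1)) : (c >> 1);
        crc32_table[i] = c;
    }
}

/* Table-driven CRC-32 over a buffer (initial value and final XOR are
 * both 0xFFFFFFFF, as in the standard variant). */
uint32_t crc32_compute(const uint8_t *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++)
        crc = crc32_table[(crc ^ data[i]) & 0xFFu] ^ (crc >> 8);
    return crc ^ 0xFFFFFFFFu;
}
```

Call `crc32_build_table()` once before the first `crc32_compute()`; the standard check string "123456789" then yields 0xCBF43926.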
Stage 0x0D
boot0 uses the SEEPROM key to decrypt SEEPROM data related to boot1.
    // Send debug mark
    SendGPIODebugOut(0x0D);
    // Set SEEPROM AES key
    AES_Set_Key(aes_seeprom_key);
    // Read and decrypt from SEEPROM
    SEEPROM_Read(0x1C, 0x0D41378C, 0x01);
    // Read and decrypt from SEEPROM
    SEEPROM_Read(0x1D, 0x0D41379C, 0x01);
    // Read and decrypt from SEEPROM
    SEEPROM_Read(0x1E, 0x0D4137AC, 0x01);
Stage 0x0E
boot0 validates the data decrypted from the SEEPROM at offset 0x1C0 in the previous stage using CRC32. This data is used to set miscellaneous settings during boot0's execution.
Stage 0x0F
boot0 validates the data decrypted from the SEEPROM at offsets 0x1D0 and 0x1E0 in the previous stage using CRC32. This data is used to determine boot1's version and sector inside the NAND. This stage is skipped in factory mode.
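The indices passed to SEEPROM_Read in stage 0x0D (0x1C, 0x1D, 0x1E) and the byte offsets quoted above (0x1C0, 0x1D0, 0x1E0) differ by a factor of 0x10, and the three destination buffers are also 0x10 bytes apart. This suggests the function addresses the SEEPROM in 16-byte blocks, with the last argument being a block count. The C sketch below captures that inferred relationship; the names are illustrative.

```c
#include <stdint.h>

/* Inferred block size: one AES block, 16 bytes. */
#define SEEPROM_BLOCK_SIZE 0x10u

/* Byte offset inside the SEEPROM for a given block index. */
uint32_t seeprom_byte_offset(uint32_t block_index)
{
    return block_index * SEEPROM_BLOCK_SIZE;
}

/* Number of bytes transferred for a given block count. */
uint32_t seeprom_read_length(uint32_t block_count)
{
    return block_count * SEEPROM_BLOCK_SIZE;
}
```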
Stages 0x10, 0x11 and 0x12
boot0 configures an unknown feature for Starbuck. These stages are optional and only execute if the 2-byte flag read from the SEEPROM at offset 0x1C0 translates to a negative value. The three stages are merged together and the only signal sent to the debug ports is effectively 0x10.
    // Send debug mark
    SendGPIODebugOut(0x10);
    // Allow IRQ 12 (AHBLT)
    *(u32 *)LT_INTMR_AHBLT_ARM = 0x1000;
    // Turn IOP2X on?
    *(u32 *)LT_IOP2X = 0x03;
    // Wait for interrupt
    sub_D4132E8();
    // Disable IRQ 12 (AHBLT)
    *(u32 *)LT_INTMR_AHBLT_ARM = 0;
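A 2-byte value "translates to a negative value" when its most significant bit (bit 15) is set and it is interpreted as a signed 16-bit integer. The C sketch below shows the equivalent bit test; the function name is illustrative, and how boot0 actually extracts the flag from the SEEPROM block is not shown here.

```c
#include <stdint.h>

/* Returns 1 if the 16-bit flag would be negative when read as a signed
 * two's-complement value, i.e. if bit 15 is set. */
int flag_is_negative(uint16_t flag)
{
    return (flag & 0x8000u) != 0;
}
```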
Stage 0x13
boot0 analyses the remaining data read from SEEPROM at offset 0x1C0 and uses it to configure the NAND_CONFIG and NAND_BANK registers. It then initializes the NAND engine.
Stage 0x14
This stage only executes if the NAND engine's initialization was successful. boot0 checks whether its start time is valid (it must be 0; otherwise boot0 throws an error).
Stages 0x15, 0x16 and 0x17
These stages only execute if the NAND engine's initialization was successful. Only signal 0x15 is sent to the debug ports. boot0 flushes AHB memory, reads boot1's ancast header from NAND and checks boot1's image size by looking at the respective field inside the header. If the size is valid (it must not exceed 0xF800, so that it doesn't overflow boot1's memory region), boot0 then proceeds to read the full boot1 image from NAND into address 0x0D400000. Finally, boot0 calculates how long this operation took and stores this value for the IOS-MCP to read later on.
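The size limit can be checked against the memory map described earlier: an image of at most 0xF800 bytes loaded at 0x0D400000 ends before 0x0D410000, where boot0's own SRAM copy lives. The C sketch below encodes only this bound; the names are illustrative, and whether boot0 applies further checks (for example rejecting a size of zero) is not covered.

```c
#include <stdint.h>

#define BOOT1_LOAD_ADDR   0x0D400000u  /* where the image is read to */
#define BOOT1_MAX_SIZE    0xF800u      /* upper bound from the header check */
#define BOOT0_SRAM_REGION 0x0D410000u  /* start of boot0's SRAM region */

/* Returns 1 if the size taken from the ancast header is acceptable. */
int boot1_size_is_valid(uint32_t size)
{
    return size <= BOOT1_MAX_SIZE;
}

/* First address past the largest image that passes the check. */
uint32_t boot1_region_end(void)
{
    return BOOT1_LOAD_ADDR + BOOT1_MAX_SIZE;
}
```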
Stage 0x18
This stage only executes if NAND engine's initialization was successful. boot0 checks if boot1 is encrypted or not (boot1 is not encrypted in factory mode).
Stage 0x19
This stage only executes if NAND engine's initialization was successful. boot0 verifies boot1's hash (SHA-1) and signature (RSA).
Stage 0x1A
This stage only executes if NAND engine's initialization was successful. boot0 decrypts boot1 (using the AES engine) in place.
Stage 0x1B
boot0 reads a flag from SEEPROM to determine how long it should wait before attempting to initialize the SD card host.
Stage 0x1C
boot0 initializes EXI.
    // Send debug mark
    SendGPIODebugOut(0x1C);
    // Assert RSTB_IOEXI
    u32 resets = *(u32 *)LT_RESETS;
    *(u32 *)LT_RESETS = resets | 0x10000;
    // Setup EXI
    sub_D410BB8();
Stage 0x1D
boot0 uses EXI to capture events coming from surface-mounted components (SMC). If a special button combo (power + eject) is being held, boot0 will attempt to load a recovery boot1 image from an SD card.
Stage 0x1E
This stage only executes if EXI told boot0 to load an image from the SD card. boot0 starts by configuring the SDC0S0Power GPIO. It then initializes and configures the SD host controller, flushes AHB memory and loads the recovery image's ancast header from the SD card. It checks the recovery image's size by looking at the size field in its header (it must not exceed 0xF800, so that it doesn't overflow boot1's memory region) and then reads the full image into memory address 0x0D400000 (replacing what was read from the NAND).
Stage 0x1F
This stage only executes if EXI told boot0 to load an image from the SD card. boot0 checks if the recovery image is encrypted or not (it is not encrypted in factory mode).
Stage 0x20
This stage only executes if EXI told boot0 to load an image from the SD card. boot0 verifies the recovery image's hash (SHA-1) and signature (RSA).
Stage 0x21
This stage only executes if EXI told boot0 to load an image from the SD card. boot0 decrypts the recovery image (using the AES engine) in place.
Stages 0x22, 0x23 and 0x24
These stages do not exist, but code leftovers indicate they could have been related to loading a recovery image via the 802.11 Wireless host.
Stage 0x25
boot0 clears boot1 and SEEPROM keys from memory, calculates and stores how long it took to run and returns.
    // Send debug mark
    SendGPIODebugOut(0x25);
    // Clear boot1 key
    memset(0x0D41377C, 0, 0x10);
    // Clear SEEPROM key
    memset(0x0D41376C, 0, 0x10);
    // boot1/recovery image start address
    r0 = 0x0D400200
    // Store boot0's boot time
    *(u32 *)0x0D417FE0 = time_now - initial_time;
    return;
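The boot time stored above is a difference of two 32-bit timer readings. With unsigned arithmetic this difference stays correct even if the counter wraps around between the two readings. The C sketch below illustrates this, assuming LT_TIMER behaves as a free-running 32-bit counter; the function name is illustrative.

```c
#include <stdint.h>

/* Ticks elapsed between two readings of a free-running 32-bit counter.
 * Unsigned subtraction is modulo 2^32, so one wrap-around is harmless. */
uint32_t elapsed_ticks(uint32_t initial_time, uint32_t time_now)
{
    return time_now - initial_time;
}
```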
Loading boot1
After boot0's main function returns, execution falls into the pointer that was set in the LR register.
    // Jump to boot1
    sub_D4100F8(addr)
    {
        r1 = addr
        // Read control register
        MRC p15, 0, R0,c1,c0, 0
        // Clear bit 12 (0x1000); on ARM9 cores this is the I bit, so this disables the ICache
        r0 = r0 & ~(0x1000)
        // Write control register
        MCR p15, 0, R0,c1,c0, 0
        PC = addr
    }
Since boot0 finishes by setting r0 to 0x0D400200, returning from boot0 is equivalent to calling sub_D4100F8(0x0D400200).
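The hand-off can be modelled in plain C: the main function returns the entry address (the value left in r0), and the routine installed in LR receives it as its argument, clears bit 12 of the control register and jumps to it. Everything in the sketch below is a host-side simulation with hypothetical names; real code would use the MRC/MCR instructions and a raw address instead of a function pointer.

```c
#include <stdint.h>

typedef void (*entry_fn)(void);

/* Simulated CP15 control register, starting with bit 12 set. */
uint32_t sim_control_reg = 0x1000u;

/* Set by the fake boot1 so the hand-off can be observed. */
int sim_boot1_ran = 0;

/* Stand-in for the boot1 entry point at 0x0D400200. */
void sim_boot1_entry(void)
{
    sim_boot1_ran = 1;
}

/* Stand-in for boot0's main: the return value models r0. */
entry_fn boot0_main_sketch(void)
{
    return sim_boot1_entry;
}

/* Stand-in for sub_D4100F8: clear bit 12, then jump to the entry point. */
void jump_to_boot1_sketch(entry_fn addr)
{
    sim_control_reg &= ~0x1000u;
    addr();
}

/* Returning from main into LR is modelled as a direct call. */
void boot_flow_sketch(void)
{
    jump_to_boot1_sketch(boot0_main_sketch());
}
```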