Ubiquiti UAP Firmware Investigations: Part 1

This series of posts will cover some interesting repair and re-programming work I performed on some Ubiquiti UAP Wi-Fi Access Points over the summer of 2015.

Since working out the method documented here I’ve used it successfully on UAPs that could be soft recovered (as in Part 1) and on ones that needed hardware recovery (which will be covered in Part 2). I haven’t posted any extracted firmware files here as they are part of UBNT’s software, and therefore their IP, and I wouldn’t want to distribute them without authorisation.

About a year ago I was discussing WiFi issues with our IT consultant at work and he offered me 3 Ubiquiti UAPs in various conditions to augment the UAP-Pro I use at home (you’ve got to love that 2.4GHz channel contention). Being a fan of a technical challenge, I took him up on the offer.

Once I powered them up it was apparent all three were “bricked”; however, the first one I tried entered TFTP soft recovery mode without needing the lid lifted. Result.

Units 2 and 3 weren’t so lucky: TFTP soft recovery made no difference, as they weren’t trying to connect to the TFTP server and grab a firmware file. Damn.

A quick bit of research showed the units had a TTL serial interface giving access to the U-Boot console on the Atheros AR7240.

I fired up unit 2, tapped Enter and saw the dreaded *** Warning - bad CRC, using default environment message. After it booted to the ar7240> prompt I typed urescue, the command in Ubiquiti’s U-Boot build that usually fires up a recovery TFTP server on the device to accept a firmware file; however, this didn’t work as expected.

A bit of investigation led me to an interesting thread on the UBNT forum, “Again on Unifi BAD DATA CRC”, where I got some ideas and techniques for investigating the problem, and on which I based the recovery method documented below.

I will preface the rest of this post with a warning. If you are not experienced with the Unix command line and the U-Boot environment, and don’t understand how memory addressing works, you can very easily brick your device. This post is more a series of notes that I developed than a step-by-step guide for novices.

It goes without saying I can’t and won’t be held responsible for anything you do to your UAP as a result of your, or anyone else’s usage or interpretation of my notes!

Additionally, for most UAP issues you probably won’t need to flash mtdblock0 or mtdblock1. These contain U-Boot itself and the U-Boot environment, and they are only included in these notes for completeness. Flashing mtdblock0 or mtdblock1 is far more likely to completely brick your device than flashing mtdblock2 to mtdblock5.

Still, even if you totally brick your device all is not lost. Wait and read Part 2 for my way to recover a completely bricked UAP.

To start with I ran a printenv command to get some more information on the units:

ar7240> printenv  
bootargs=root=31:03 rootfstype=squashfs init=/init console=tty0 panic=3  
bootcmd=run ubntappinit ubntboot  
bootdelay=1  
ipaddr=192.168.1.20  
ethaddr=00:15:6d:XX:XX:XX  
serverip=192.168.1.254  
ubntappinit=go ${ubntaddr} uappinit;go ${ubntaddr} ureset_button;urescue;go ${ubntaddr} uwrite  
mtdparts=mtdparts=ar7240-nor0:256k(u-boot),64k(u-boot-env),1024k(kernel),6528k(rootfs),256k(cfg),64k(EEPROM)  
ubntboot=bootm 0x9f050000  
stdin=serial  
stdout=serial  
stderr=serial  
ethact=eth0  
ubntaddr=80200020  

Doing a printenv shows all the environment variables, but the ones of interest are:

mtdparts=mtdparts=ar7240-nor0:256k(u-boot),64k(u-boot-env),1024k(kernel),6528k(rootfs),256k(cfg),64k(EEPROM)

ubntboot=bootm 0x9f050000  

This tells us the location and functions of the memory space. Knowing U-Boot as well as I do I know 0x9F000000 refers to the external NOR memory interface. This will come in handy later. To be certain I ran mtdparts to grab the partition table:

ar7240> mtdparts

device nor0 <ar7240-nor0>, # parts = 6  
#: name                        size            offset            mask_flags
0: u-boot                      0x00040000      0x00000000        0  
1: u-boot-env                  0x00010000      0x00040000        0  
2: kernel                      0x00100000      0x00050000        0  
3: rootfs                      0x00660000      0x00150000        0  
4: cfg                         0x00040000      0x007b0000        0  
5: EEPROM                      0x00010000      0x007f0000        0

active partition: nor0,0 - (u-boot) 0x00040000 @ 0x00000000

defaults:  
mtdids : nor0=ar7240-nor0  
mtdparts: mtdparts=ar7240-nor0:256k(u-boot),64k(u-boot-env),1024k(kernel),6528k(rootfs),256k(cfg),64k(EEPROM)  

Those offsets and sizes will come in handy later. So the partition table in a more readable format works out to:

Partition   Name         Address       Size
mtdblock0   u-boot       0x9F000000    256KB
mtdblock1   u-boot-env   0x9F040000    64KB
mtdblock2   kernel       0x9F050000    1024KB
mtdblock3   rootfs       0x9F150000    6528KB
mtdblock4   cfg          0x9F7B0000    256KB
mtdblock5   EEPROM       0x9F7F0000    64KB
Total size: 8192KB
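
The addresses in that table are just the flash base plus the running total of the partition sizes. As a rough cross-check, here is a minimal Python sketch that derives them from the mtdparts string. The function name parse_mtdparts and the constants are my own for illustration, and it assumes every size is given in "k" as it is on this unit.

```python
import re

# Flash mapping address as used in these notes; the names here are illustrative.
FLASH_BASE = 0x9F000000
MTDPARTS = ("mtdparts=ar7240-nor0:256k(u-boot),64k(u-boot-env),"
            "1024k(kernel),6528k(rootfs),256k(cfg),64k(EEPROM)")

def parse_mtdparts(spec, base=FLASH_BASE):
    """Return a list of (index, name, address, size_in_bytes) tuples."""
    # Everything after the first ':' is the comma-separated partition list.
    layout = spec.split(":", 1)[1]
    parts = []
    offset = 0
    for index, entry in enumerate(layout.split(",")):
        match = re.fullmatch(r"(\d+)k\((.+)\)", entry.strip())
        if match is None:
            raise ValueError(f"unsupported mtdparts entry: {entry!r}")
        size = int(match.group(1)) * 1024
        parts.append((index, match.group(2), base + offset, size))
        offset += size  # partitions are laid out back to back
    return parts

for index, name, address, size in parse_mtdparts(MTDPARTS):
    print(f"mtdblock{index}  {name:<11} 0x{address:08X}  {size // 1024}KB")
```

The sizes should add up to 8192KB, which is a quick way to spot a mistyped entry.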

Now, given there is a CRC error, the first thing I need is a memory dump from a known working UAP. Handily I happened to have one here.

The easiest way, as documented in the thread on the UBNT forum, is to log in via SSH to the working UAP and use cat to copy the contents of each partition to a file on another Unix machine, which is simple enough on my MacBook. Just use cat /dev/mtdblock0 >mtdblock0 and repeat for mtdblock1 to mtdblock5. Simple.
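
Before trusting the dumps, it seems worth checking that each file is exactly the size of its partition, since a truncated copy would be easy to miss. The following is a minimal Python sketch for the machine holding the files. The names check_dumps and EXPECTED_SIZES are hypothetical, and the sizes assume the partition layout reported by mtdparts on this unit.

```python
import hashlib
from pathlib import Path

# Expected dump sizes in bytes, assuming the partition layout shown by mtdparts.
EXPECTED_SIZES = {
    "mtdblock0": 0x40000,    # u-boot
    "mtdblock1": 0x10000,    # u-boot-env
    "mtdblock2": 0x100000,   # kernel
    "mtdblock3": 0x660000,   # rootfs
    "mtdblock4": 0x40000,    # cfg
    "mtdblock5": 0x10000,    # EEPROM
}

def check_dumps(directory):
    """Return {file name: SHA-256 hex digest}; raise ValueError on a size mismatch."""
    digests = {}
    for name, expected in EXPECTED_SIZES.items():
        data = (Path(directory) / name).read_bytes()
        if len(data) != expected:
            raise ValueError(f"{name}: {len(data)} bytes, expected {expected}")
        digests[name] = hashlib.sha256(data).hexdigest()
    return digests

# Example use, assuming the six files sit in the current directory:
#     for name, digest in check_dumps(".").items():
#         print(name, digest)
```

Keeping the digests alongside the backups also gives a way to confirm later that the files served over TFTP are the same ones that were dumped.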

Now, I like to be careful, so I fired up my TFTP server on 192.168.1.254 (the default serverip shown by printenv), copied the mtdblock0 to mtdblock5 files over and ran the following commands:

ar7240> tftp 0x83000000 mtdblock0  
Using eth0 device  
TFTP from server 192.168.1.254; our IP address is 192.168.1.20  
Filename 'mtdblock0'.  
Load address: 0x83000000  
Loading: ####################################################################################################################################################################################################################################################################################################################################################################################################################################################################################################################################################################################################################################################################################################################################################################################################################################################  
done  
Bytes transferred = 262144 (40000 hex)  
ar7240> cmp.b 0x83000000 0x9f000000 0x40000  
Total of 262144 bytes were the same  

This copied the mtdblock0 file I had just created to memory location 0x83000000 (a nice clear memory area), compared it to the contents of flash at 0x9F000000 to 0x9F03FFFF and confirmed how many bytes were the same (0x40000 is 262144 in decimal). I repeated this for mtdblock1 to mtdblock5 and all was good.
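
To make the arithmetic and the comparison concrete, here is a small Python sketch. The helper names last_address and cmp_bytes are my own. cmp_bytes only imitates the summary that cmp.b prints, on the assumption (as I understand the command) that it stops at the first differing byte and reports how many bytes matched up to that point.

```python
def last_address(start, size):
    """Last byte address covered by a region of `size` bytes beginning at `start`."""
    return start + size - 1

def cmp_bytes(a, b, count):
    """Compare the first `count` bytes of two buffers.

    Returns (bytes_the_same, first_mismatch_offset); the offset is None
    when all `count` bytes match.
    """
    if len(a) < count or len(b) < count:
        raise ValueError("buffer shorter than the requested count")
    for offset in range(count):
        if a[offset] != b[offset]:
            return offset, offset
    return count, None

print(hex(last_address(0x9F000000, 0x40000)))  # 0x9f03ffff
print(0x40000)                                 # 262144
```

A result equal to the full count, as in the session above, means the whole region matched.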

Once I had all 6 blocks safely saved (and backed up, you know, just in case), I wrote a set of commands to run on the bad UAP to copy them from my TFTP server to scratch memory (0x83000000), write them to the SPI flash and compare the written flash contents with the copy in scratch memory:

ar7240> mtdparts default  
ar7240> saveenv

ar7240> protect off all

ar7240> tftp 0x83000000 mtdblock0  
ar7240> erase 0x9f000000 +0x40000  
ar7240> cp.b 0x83000000 0x9f000000 0x40000  
ar7240> cmp.b 0x83000000 0x9f000000 0x40000

ar7240> tftp 0x83000000 mtdblock1  
ar7240> erase 0x9f040000 +0x10000  
ar7240> cp.b 0x83000000 0x9f040000 0x10000  
ar7240> cmp.b 0x83000000 0x9f040000 0x10000

ar7240> tftp 0x83000000 mtdblock2  
ar7240> erase 0x9f050000 +0x100000  
ar7240> cp.b 0x83000000 0x9f050000 0x100000  
ar7240> cmp.b 0x83000000 0x9f050000 0x100000

ar7240> tftp 0x83000000 mtdblock3  
ar7240> erase 0x9f150000 +0x660000  
ar7240> cp.b 0x83000000 0x9f150000 0x660000  
ar7240> cmp.b 0x83000000 0x9f150000 0x660000

ar7240> tftp 0x83000000 mtdblock4  
ar7240> erase 0x9f7b0000 +0x40000  
ar7240> cp.b 0x83000000 0x9f7b0000 0x40000  
ar7240> cmp.b 0x83000000 0x9f7b0000 0x40000

ar7240> tftp 0x83000000 mtdblock5  
ar7240> erase 0x9f7f0000 +0x10000  
ar7240> cp.b 0x83000000 0x9f7f0000 0x10000  
ar7240> cmp.b 0x83000000 0x9f7f0000 0x10000

ar7240> setenv ethaddr 00:15:6D:XX:XX:XX  
ar7240> saveenv

ar7240> reset  
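
Typing out six near-identical groups of commands by hand invites a mismatched address or size, so one option is to generate them from a single table. This is a minimal Python sketch of that idea. PARTITIONS, SCRATCH and flash_commands are illustrative names, and the addresses assume the layout reported by mtdparts on this unit. It only prints text to paste into the console and does not talk to the UAP.

```python
SCRATCH = 0x83000000  # scratch RAM address used throughout these notes

# (TFTP file name, flash address, size in bytes), assuming this unit's layout.
PARTITIONS = [
    ("mtdblock0", 0x9F000000, 0x40000),
    ("mtdblock1", 0x9F040000, 0x10000),
    ("mtdblock2", 0x9F050000, 0x100000),
    ("mtdblock3", 0x9F150000, 0x660000),
    ("mtdblock4", 0x9F7B0000, 0x40000),
    ("mtdblock5", 0x9F7F0000, 0x10000),
]

def flash_commands(name, address, size, scratch=SCRATCH):
    """Return the four console lines that load, erase, write and verify one partition."""
    return [
        f"tftp 0x{scratch:08x} {name}",
        f"erase 0x{address:08x} +0x{size:x}",
        f"cp.b 0x{scratch:08x} 0x{address:08x} 0x{size:x}",
        f"cmp.b 0x{scratch:08x} 0x{address:08x} 0x{size:x}",
    ]

for name, address, size in PARTITIONS:
    print("\n".join(flash_commands(name, address, size)))
    print()
```

Because the erase, cp.b and cmp.b lines all take their size from the same table entry, they cannot drift apart for a given partition. To skip the riskier mtdblock0 and mtdblock1, drop those two entries from the list.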

After this I had one working UAP… great, a result.

However, when I repeated the procedure on the final unit, the UAP failed to boot after the reset command, unlike the previous ‘fixed’ unit.

I scratched my head and did a quick re-check of the flash contents against the mtdblock2 file:

ar7240> tftp 0x83000000 mtdblock2  
ar7240> cmp.b 0x83000000 0x9f050000 0x100000  

And got a whole load of bytes not matching. How odd!

I repeated the procedure for copying a valid mtdblock2 over, checked it matched after writing, performed a reset and it still didn’t work. I double checked and the flash was definitely back to its original (erroneous) state.

This led me to conclude it must be one of two options:

  1. The procedure writing to flash was failing.
  2. The flash chip had physically failed.

The next step is to get my SPI memory programmer out, fire up the Metcal and see what that Flash IC is up to on a hardware level (SPOILER HINT: Dead flash memory usually goes read only).

If anyone has any questions feel free to contact me.

As a footnote, since I started using this procedure a similar method has been well documented and used over on the UBNT Forums, with GreatWhiteDan kindly posting the MTDBlock files for those who don’t have a working UAP.