Fault is triggered by trying to read LPC_CAN1->IER when the peripheral is powered off. Fixed by checking the power control register before checking the IER register.
While fixing this issue in the various LPC* ports, I noticed a comment
pointing to this mbed forum post which summarizes this bug quite well:
https://mbed.org/forum/bugs-suggestions/topic/4473/
This bug was introduced in the following commit:
2662e105c4
The following code was added to serial_putc() as part of this commit:
uint32_t lsr = obj->uart->LSR;
lsr = lsr;
uint32_t thr = obj->uart->THR;
thr = thr;
As the forum post indicates, this causes the serial_putc() routine to
actually eat an inbound received byte if it exists since reading THR is
really reading the RBR, the Receiver Buffer Register. This code looks
like code that was probably added so that the developer could take a
snapshot of these registers and look at them in the debugger. It
probably got committed in error.
LPC11U24/LPC11U24_301/LPC11U35_401 shared the same startup file for ARM
and uARM toolchains, which is wrong, because the initial SP value is
different for LPC11U24_301. This commit fixes this issue by giving each
target its own startup file.
There were lots of overlaps in the code for LPC810 and LPC812, including
duplicated source files. This commit adds a TARGET_LPC81X_COMMON folder in
both HAL and CMSIS, this folder keeps common code for the targets.
The dependency file generated by GCC might contain more than one
dependency listed on a single line, which wasn't taken into account by the
GCC dependency fille interpreter. This commit fixes this issue.
If the FileBase::lookup operation in the constructor of FilePath returns
NULL, subsequent operations (such as isFile()/isFileSystem()) will call
methods on a NULL 'fb' pointer. This commit fixes this issue by adding
explicit NULL checks and a new method in FilePath (exists()).
Asm versions of netstack memcpy() and lwip_standard_chksum()
[Note] I'm generally a bit reluctant when including optimizations like this (from an architectural standpoint), because they tend to be a bit too specific (for example, this one works only with lwIP+GCC+Cortex-M3 or M4), but for now it looks as this is the right place for them, although the optimized memcpy should ideally be in libc (or even better replaced with a DMA transfer in this particular case). But this will be both a nice optimization and a reminder of what we need to implement/change in the future.
1. Added: GCC_CR toolchain ID for LPC2368. (targets.py)
2. Modified: Startup codes for GCC_ARM and GCC_CR toolchain.
3. Verified: "ticker" and "basic" test program work well, so far.
(Fixed typo.)
1. Added: GCC_CR toolchain ID for LPC2368. (targets.py)
2. Modified: Startup codes for GCC_ARM and GCC_CR toolchain.
3. Verified: "ticker" and "basic" test program works well, so far.
I verified that the hang issue I was seeing when building and running
the mbed official networking tests with GCC_ARM was related to this
issue reported on the mbed forums:
http://mbed.org/forum/mbed/topic/3803/?page=1#comment-18934
If you are using the 4.7 2013q1 update of GCC_ARM or newer then it
will have a _sbrk() implementation which checks the new top of heap
pointer against the current thread SP, stack pointer.
See this GCC_ARM related thread for more information:
https://answers.launchpad.net/gcc-arm-embedded/+question/218972
When using RTX RTOS threads, the thread's stack pointer can easily
point to an address which is below the current top of heap so this
check will incorrectly fail the allocation.
I have added a _sbrk() implementation to the mbed SDK which checks the
heap pointer against the MSP instead of the current thread SP. I have
only enabled this for TOOLCHAIN_GCC_ARM as this is the only GCC based
toolchain that I am sure requires this.
For tests such as TCPEchoServer
(http://mbed.org/users/emilmont/notebook/networking-libraries-benchmark/)
this change showed a 28% improvement (14Mbps to 18Mbps) when the echo
test was modified to instead use 1K data buffers.
I targetted these two functions based on manual profiling samples which
showed that a great deal of time was being spent in these two functions
when the network stack was being slammed with UDP packets.
A new hooks mechanism (hooks.py) allows various targets to customize
part(s) of the build process. This was implemented to allow generation of
custom binary images for the EA LPC4088 target, but it should be generic
enough to allow other such customizations in the future. For now, only the
'binary' step is hooked in toolchains/arm.py.
I now use a signal to communicate when a packet has been received by
the ethernet hardware and should be processed by the packet_rx thread.
Previously the change to make the lwIP stack thread safe introduced
enough delay in packet_rx that the semaphore count could lag behind
the processed packets and overflow its maximum token count. Now the
ISR uses the signal to indicate that >= 1 packet has been received
since the last time packet_rx() was awaken.
Previously the ethernet driver used generic sys_arch* APIs exposed from
lwIP to manipulate the semaphores. I now call CMSIS RTOS APIs
directly when using the signals. I think this is acceptable since that
same driver source file already contains similar os* calls that talk
directly to the RTOS.
This reverts commit acb35785c9.
It turns out that this commit actually causes problems if an ethernet
interrupt is dropped because a higher privilege task is running, such
as LocalFileSystem accesses. If this happens, the semaphore count isn't
incremented enough times and the packet_rx() thread will fall behind and
end up running as though it had only one ethernet receive buffer. This
causes even more lost packets.
I plan to fix this by switching the semaphore to be a signal so that
the syncronization object is more boolean. It simply indicates if an
interrupt has arrived since the last time packet_rx() was awaken to
process inbound packets.
I recently pulled a NXP crash fix for their ethernet driver which will
requeue a pbuf to the ethernet driver rather than sending it to the
lwip stack if it can't allocate a new pbuf to keep the ethernet
hardware primed with available packet buffers. While recently
reviewing this code I noticed that the full size of the pbuf wasn't
used on this re-queueing operation but the size of the last received
packet. I now reset the pbuf size back to its originally allocated
size before doing this requeue operation.
Previously the packet_rx() function would wait on the RxSem and when
signalled it would process all available inbound packets. This used to
cause no problem but once the thread synchronization was turned
on via SYS_LIGHTWEIGHT_PROT, the semaphore actually started to overflow
its maximum token count of 65535. This caused the mbed_die() flashing
LEDs of death. The old code was really breaking the producer/consumer
pattern that I typically see with a semaphore since the consumer was
written to consume more than 1 produced object per semaphore wait.
Before the thread synchronization was enabled, the packet_rx() thread
could use a single time slice to process all of these packets and then
loop back around a few more times to decrement the semaphore count
while skipping the packet processing since it had all been done.
Now the packet processing code would cause the thread to give up its
time slice as it hit newly enabled critical sections. In the end it
was possible for the code to leak 2 semaphore signals for every 1 by
which the thread was awaken. After about 10 seconds of load, this
would cause a leak of 65535 signals.
NOTE: Two potential issues with this change:
1) The LPC_EMAC->RxConsumeIndex != LPC_EMAC->RxProduceIndex check was
removed from packet_rx(). I believe that this is Ok since the same
condition is later checked in lpc_low_level_input() anyway so it
won't now try to process more packets than what exist.
2) What if ENET_IRQHandler(void) ends up not signalling the RxSem for
every packet received? When would that happen? I could see it
happening if the ethernet hardware would try to pend more than 1
interrupt when the priority was too elevated to process the
pending requests. Putting the consumer loop back in packet_rx()
and using a Signal instead of a Semaphore might be a better
solution?
This option actually enables the use of the lwip_sys_mutex for
protecting concurrent access to such important lwIP resources as:
select_cb_list (this is the one which orig flagged problem)
sockets array
mem stats (if enabled)
heap (if LWIP_ALLOW_MEM_FREE_FROM_OTHER_CONTEXT was non-zero)
memp pool allocs/frees
netif->loop_last pbuf linked list
pbuf reference counts
...
I first noticed this issue when I hit a crash while slamming the net
stack with a large number of TCP packets (I was actually sending 1k
data buffers from the TCPEchoServer mbed sample.) It crashed in the
last line of this code snippet from event_callback:
for (scb = select_cb_list; scb != NULL; scb = scb->next) {
if (scb->sem_signalled == 0) {
It was crashing because scb had an invalid address so it generated a
bus fault. I figured that memory was either corrupted or there was
some kind of concurrency issue. In trying to determine which, I wanted
to walk through the select_cb_list linked list and see where it was
corrupted:
(gdb) p scb
$1 = (struct lwip_select_cb *) 0x85100080
(gdb) p select_cb_list
$2 = (struct lwip_select_cb *) 0x0
That was interesting, the head of the linked list was now NULL but it
must have had a non-NULL value when this loop started running or we
would have never gotten to the point where we hit this crash.
This was starting to look like a concurrency issue since the linked
list was modified out from underneath this thread. Looking through the
source code for this function, I saw use of macros like
SYS_ARCH_PROTECT and SYS_ARCH_UNPROTECT which looked like they should
be providing the thead synchronization. I disassembled the
event_callback() function in the debugger and saw no signs of the
usage of synchronizition APIs that I expected. A search
through the code for the definition of these SYS_ARCH_UN/PROTECT
macros led me to discovering that they were actually ignored unless an
implementation defined them itself (the mbed version doesn't do so) or
the SYS_LIGHTWEIGHT_PROT macro is set to non-zero (the mbed version
didn't do this either). Flipping the SYS_LIGHTWEIGHT_PROT macro on in
lwipopts.h fixed the crash I kept hitting, increased the size of the
code a bit, and unfortunately slows things down a bit since it now
actually serializes access to these data structures by making calls
to the RTOS sync APIs.
After making my previous commit to completely disable LWIP_ASSERT
macro invocations, I ended up with a warning in pbuf.c where an
err variable was set but only checked for success in an assert. I
added a "(void)err;" reference to silence this warning.
I was doing some debugging that had me looking at the disassembly of
lpc_rx_queue() from within the debugger. I was looking for the call to
pbuf_alloc() that we see in the following code snippet:
p = pbuf_alloc(PBUF_RAW, (u16_t) EMAC_ETH_MAX_FLEN, PBUF_RAM);
if (p == NULL) {
LWIP_DEBUGF(UDP_LPC_EMAC | LWIP_DBG_TRACE,
("lpc_rx_queue: could not allocate RX pbuf (free desc=%d)\n",
lpc_enetif->rx_free_descs));
return queued;
}
/* pbufs allocated from the RAM pool should be non-chained. */
LWIP_ASSERT("lpc_rx_queue: pbuf is not contiguous (chained)",
pbuf_clen(p) <= 1);
When I was looking through the disassembly for this code I noticed a
call to pbuf_clen() in the actual machine code.
=> 0x0000bab0 <+24>: bl 0x44c0 <pbuf_clen>
0x0000bab4 <+28>: ldr r3, [r4, #112] ; 0x70
0x0000bab6 <+30>: ldrh.w r12, [r5, #10]
0x0000baba <+34>: add.w r2, r3, #9
0x0000babe <+38>: add.w r0, r12, #4294967295 ; 0xffffffff
The only call to pbuf_clean made from this function is made from
within the LWIP_ASSERT. When I looked more closely at how this macro
was defined, I saw that the mbed version of the stack had disabled the
LWIP_PLATFORM_ASSERT macro when LWIP_DEBUG was false which means that
no action will be taken if the assert is false but it still allows the
LWIP_ASSERT macro to potentially evaluate the assert expression.
Defining the LWIP_NOASSERT macro will fully disable the LWIP_ASSERT
macro.
I saw one of my TCP/IP samples shrink about 0.5K when I made this
change.