Commit Graph

17463 Commits (a070e350ecd92a65e102aed05ac4f87edd026161)

Author SHA1 Message Date
Adam Green f5ec5d3ab2 net: Enable SYS_LIGHTWEIGHT_PROT
This option actually enables the use of the lwip_sys_mutex for
protecting concurrent access to such important lwIP resources as:
  select_cb_list (this is the one which orig flagged problem)
  sockets array
  mem stats (if enabled)
  heap (if LWIP_ALLOW_MEM_FREE_FROM_OTHER_CONTEXT was non-zero)
  memp pool allocs/frees
  netif->loop_last pbuf linked list
  pbuf reference counts
  ...

I first noticed this issue when I hit a crash while slamming the net
stack with a large number of TCP packets (I was actually sending 1k
data buffers from the TCPEchoServer mbed sample.)  It crashed in the
last line of this code snippet from event_callback:
  for (scb = select_cb_list; scb != NULL; scb = scb->next) {
    if (scb->sem_signalled == 0) {

It was crashing because scb had an invalid address so it generated a
bus fault.  I figured that memory was either corrupted or there was
some kind of concurrency issue.  In trying to determine which, I wanted
to walk through the select_cb_list linked list and see where it was
corrupted:
    (gdb) p scb
    $1 = (struct lwip_select_cb *) 0x85100080
    (gdb) p select_cb_list
    $2 = (struct lwip_select_cb *) 0x0

That was interesting, the head of the linked list was now NULL but it
must have had a non-NULL value when this loop started running or we
would have never gotten to the point where we hit this crash.

This was starting to look like a concurrency issue since the linked
list was modified out from underneath this thread.  Looking through the
source code for this function, I saw use of macros like
SYS_ARCH_PROTECT and SYS_ARCH_UNPROTECT which looked like they should
be providing the thead synchronization.  I disassembled the
event_callback() function in the debugger and saw no signs of the
usage of synchronizition APIs that I expected.  A search
through the code for the definition of these SYS_ARCH_UN/PROTECT
macros led me to discovering that they were actually ignored unless an
implementation defined them itself (the mbed version doesn't do so) or
the SYS_LIGHTWEIGHT_PROT macro is set to non-zero (the mbed version
didn't do this either).  Flipping the SYS_LIGHTWEIGHT_PROT macro on in
lwipopts.h fixed the crash I kept hitting, increased the size of the
code a bit, and unfortunately slows things down a bit since it now
actually serializes access to these data structures by making calls
to the RTOS sync APIs.
2013-08-27 22:24:03 -07:00
Adam Green fa392423c8 Silence GCC unused variable warning.
After making my previous commit to completely disable LWIP_ASSERT
macro invocations, I ended up with a warning in pbuf.c where an
err variable was set but only checked for success in an assert.  I
added a "(void)err;" reference to silence this warning.
2013-08-27 22:24:03 -07:00
Adam Green 4603d729f9 net: Fully disable LWIP_ASSERTs
I was doing some debugging that had me looking at the disassembly of
lpc_rx_queue() from within the debugger.  I was looking for the call to
pbuf_alloc() that we see in the following code snippet:
		p = pbuf_alloc(PBUF_RAW, (u16_t) EMAC_ETH_MAX_FLEN, PBUF_RAM);
		if (p == NULL) {
			LWIP_DEBUGF(UDP_LPC_EMAC | LWIP_DBG_TRACE,
				("lpc_rx_queue: could not allocate RX pbuf (free desc=%d)\n",
				lpc_enetif->rx_free_descs));
			return queued;
		}

		/* pbufs allocated from the RAM pool should be non-chained. */
		LWIP_ASSERT("lpc_rx_queue: pbuf is not contiguous (chained)",
			pbuf_clen(p) <= 1);

When I was looking through the disassembly for this code I noticed a
call to pbuf_clen() in the actual machine code.
=> 0x0000bab0 <+24>:	bl	0x44c0 <pbuf_clen>
   0x0000bab4 <+28>:	ldr	r3, [r4, #112]	; 0x70
   0x0000bab6 <+30>:	ldrh.w	r12, [r5, #10]
   0x0000baba <+34>:	add.w	r2, r3, #9
   0x0000babe <+38>:	add.w	r0, r12, #4294967295	; 0xffffffff

The only call to pbuf_clean made from this function is made from
within the LWIP_ASSERT.  When I looked more closely at how this macro
was defined, I saw that the mbed version of the stack had disabled the
LWIP_PLATFORM_ASSERT macro when LWIP_DEBUG was false which means that
no action will be taken if the assert is false but it still allows the
LWIP_ASSERT macro to potentially evaluate the assert expression.
Defining the LWIP_NOASSERT macro will fully disable the LWIP_ASSERT
macro.

I saw one of my TCP/IP samples shrink about 0.5K when I made this
change.
2013-08-27 22:24:03 -07:00
Bogdan Marinescu 2cdd41d9b1 Added support for LPC11U24/_301 and LPC11U35_401 in uARM 2013-08-27 15:31:47 +03:00
Bogdan Marinescu 9a270999d0 Added support for LPC11U35_401 in ARM and GCC_ARM 2013-08-27 15:19:01 +03:00
Bogdan Marinescu 1c23d68281 Added support for LPC11U24_301/401 compilation with ARM toolchain 2013-08-27 14:52:39 +03:00
ytsuboi 2e549668a8 [LPC810] fixed scatter file 2013-08-24 16:02:03 +09:00
ytsuboi 8dd6bdb701 [LPC810] add LPC810 support 2013-08-24 15:49:16 +09:00
ytsuboi 9ae383f8f4 [LPC1114] fixed simple bugs on export templates 2013-08-24 11:57:44 +09:00
Emilio Monti 4d08c0741f Merge pull request #43 from adamgreen/gccWarnFlagUpdate
Updates to GCC warning level flags
2013-08-23 02:30:08 -07:00
Adam Green 25a332d8f1 Updates to GCC warning level flags
In gcc4mbed, I have been running with "-Wall -Wextra" and then
disabling a couple of noisy warnings that result.  In particular, I
disable the unused-parameter and missing-field-initializers warnings.
The first commonly goes off for implementation of virtual methods or
other overridable functions where not all parameters are required for
every override.  I don't find the second warning to be all that useful
anyway since missing structure field initializers will be set to 0
according to the C language specification.  The RTOS code uses this
language feature and I see no reason that it shouldn't :)
2013-08-22 18:09:14 -07:00
Emilio Monti 32fa9b27f7 Merge pull request #42 from adamgreen/usbdeviceGccWarnSilence
USBDevice: Silence GCC warning
2013-08-22 01:54:24 -07:00
Adam Green 76b7bb4209 USBDevice: Silence GCC warning
The following line in USBHAL_KL25Z.cpp would generate a warning in GCC
because of a potential operator precendence issue:
    return((USB0->FRMNUML | (USB0->FRMNUMH << 8) & 0x07FF));

This would have been interpreted as:
    return((USB0->FRMNUML | ((USB0->FRMNUMH << 8) & 0x07FF)));
since & has higher precedence than |

I switched it to be:
    return((USB0->FRMNUML | (USB0->FRMNUMH << 8)) & 0x07FF);
Since it makes more sense to & with 0x7FF after having merged the lower
and upper bytes together rather than just the upper byte.  It should
have resulted in the same value either way.
2013-08-21 22:23:35 -07:00
Emilio Monti dfd6fe6fbe Merge pull request #41 from ytsuboi/master
[LPC1114] fixed to use IRC and System PLL clock
2013-08-21 05:25:51 -07:00
ytsuboi 4c0b5ab404 [LPC1114] fixed to use IRC and System PLL clock 2013-08-21 20:44:31 +09:00
Emilio Monti 05b42afb68 Merge pull request #40 from ytsuboi/master
Fixed error handling and cleanup
2013-08-21 01:57:51 -07:00
Toyomasa Watarai 1ff64de847 Fixed gpio_irq_api bugs
Corrected improper implementation of channel_ids[] usage.
Revemod redundant arrays e.g. pin_names[] and trigger_events[]
2013-08-21 16:46:03 +09:00
Toyomasa Watarai 463bf47a40 Support LPC1114 platform
Added TARGET_LPC1114 pins for InterruptIn test case and changed to
PullUp mode.
2013-08-21 16:41:18 +09:00
Toyomasa Watarai 10e30b5cfb Fixed error handling and cleanup 2013-08-21 10:46:37 +09:00
Bogdan Marinescu d4e5206fe9 Merge pull request #39 from adamgreen/fsGccWarnSilence
fs: Silence GCC signed/unsigned warnings
2013-08-20 00:36:08 -07:00
Adam Green 5e4439bb19 fs: Silence GCC signed/unsigned warnings
The SDFileSystem class contained a few routines which compared a signed
integer loop index to an unsigned integer length/size.  I switched the
loop index to be uint32_t as well.
2013-08-19 22:53:49 -07:00
Bogdan Marinescu 3210ac98d1 Added LPC1114 in the official release list 2013-08-19 13:51:39 +03:00
Emilio Monti b248827341 Add script to export mbed SDK tests to different IDEs 2013-08-16 16:40:53 +01:00
Bogdan Marinescu b7301d7a54 Merge remote-tracking branch 'github/master' 2013-08-16 16:00:25 +03:00
Bogdan Marinescu 2139c101e5 Merge pull request #38 from ytsuboi/master
Added LPC1114 support and fixed hard fault
2013-08-16 05:59:41 -07:00
Bogdan Marinescu 15ed7ea66a Fix bug in Ticker code
The bug was introduced by the interrupt chaining code.
2013-08-16 15:55:24 +03:00
Toyomasa Watarai c26d2a990d Removed detach() call in test case 2013-08-16 21:47:42 +09:00
ytsuboi c4ea5e6717 Merge branch 'master' of https://github.com/ytsuboi/mbed 2013-08-16 21:36:52 +09:00
Toyomasa Watarai 10c44a99b2 Merge branch 'master' of https://github.com/ytsuboi/mbed 2013-08-16 21:07:03 +09:00
Toyomasa Watarai 66deb5e24d Added LPC1114 support and fixed hard fault 2013-08-16 21:06:51 +09:00
Bogdan Marinescu 10c7a71c92 Merge remote-tracking branch 'github/master' 2013-08-16 12:29:50 +03:00
Bogdan Marinescu a7628510f5 Added '-fno-delete-null-pointer-checks' to GCC compilation options
From Adam Green, regarding using -fno-delete-null-pointer-checks:

"I would argue that on Cortex-M processors, it is more dangerous to not
have it.  The compiler can actually generate incorrect code because it is
making an incorrect assumption (that reads from a NULL pointer will throw
an exception.)   The GCC for ARM developers should actually never enable
the delete-null-pointer-checks optimization for Cortex-M processors.
There is a comment in the GCC manual that indicates, "Some targets,
especially embedded ones, disable this option [delete-null-pointer-checks]
at all levels."  Not having this flag is pretty risky on the current
versions of GCC_ARM.  Just to clarify, this flag doesn't enable an
optimization...it disables an unsafe optimization."
2013-08-16 12:26:50 +03:00
Emilio Monti 3b0618ee1a Merge pull request #36 from TerryGuo/find_mbed_hid_devices
A better way to find and save mbed hid devices
2013-08-16 01:59:11 -07:00
Emilio Monti 089441f75d Merge pull request #34 from ytsuboi/master
Added LPC1114 and target for some test cases, Removed SWCLK and SWDIO pins from pinmap
2013-08-16 01:57:39 -07:00
Bogdan Marinescu 73b1687b83 Merge pull request #35 from adamgreen/fixNullIpAddrBug
Don't dereference NULL ipaddr in netif_set_ipaddr()
2013-08-16 00:27:25 -07:00
Terry Guo a02cee9d1e A better way to find and save mbed hid devices 2013-08-16 11:18:06 +08:00
Adam Green e0fb0a8f9b Don't dereference NULL ipaddr in netif_set_ipaddr()
The code in netif_set_ipaddr would read the memory pointed to by its
ipaddr parameter, even if it was NULL on this line:
  if ((ip_addr_cmp(ipaddr, &(netif->ip_addr))) == 0) {
On the Cortex-M3, it is typically OK to read from address 0 so this
code will actually compare the reset stack pointer value to the
current value in netif->ip_addr.

Later in the code, this same pointer will be used for a second read:
  ip_addr_set(&(netif->ip_addr), ipaddr);

The ip_addr_set call will first check to see if the ipaddr is NULL and
if so, treats it like IP_ADDR_ANY (4 bytes of 0).
    /** Safely copy one IP address to another (src may be NULL) */
    #define ip_addr_set(dest, src) ((dest)->addr = \
                                        ((src) == NULL ? 0 : \
                                        (src)->addr))

The issue here is that when GCC optimizes this code, it assumes that
the first dereference of ipaddr would have thrown an invalid memory
access exception and execution would never make it to this second
dereference.  Therefore it optimizes out the NULL check in ip_addr_set.

The -fno-delete-null-pointer-checks will disable this optimization and
is a good thing to use with GCC in general on Cortex-M parts.  I will
let the mbed guys make that change to their build system.

I have however corrected the code so that the intent of how to handle a
NULL ipaddr is more obvious and gets rid of the potential NULL
dereference.

By the way, this bug caused connect() to fail in obtaining an
address from DHCP.  If I recall correctly from when I first debugged
this issue (late last year), I actually saw the initial value of the
stack pointer being used in the DHCP request as an IP address which
caused it to be rejected.
2013-08-15 19:02:51 -07:00
ytsuboi 478f0498e7 Merge remote-tracking branch 'upstream/master' 2013-08-16 02:38:42 +09:00
Toyomasa Watarai 268a2973ea Added LPC1114 and target for some test cases 2013-08-16 00:24:09 +09:00
Toyomasa Watarai 1d1c8c6391 Removed SWCLK and SWDIO pins from pinmap 2013-08-16 00:17:52 +09:00
Emilio Monti 4d26e2ef4a Tidy up default settings 2013-08-15 16:09:12 +01:00
Bogdan Marinescu 6744f47359 Changed definition structure of ETHMEM_SECTION
This structure makes it easier to add more targets/toolchains in the
future and it's (arguably) a bit easier to read.
2013-08-15 15:56:24 +03:00
Bogdan Marinescu 1e4c9bed66 Merge remote-tracking branch 'github/master' 2013-08-15 15:18:52 +03:00
Bogdan Marinescu 2433eebf24 Merge pull request #33 from adamgreen/netMergeFixes
Fixups to network code after recent merges.
2013-08-15 05:18:14 -07:00
Adam Green 28a3466d11 Fixups to network code after recent merges.
Peter's and my changes to LPC1768.ld ended up adding the same AHBSRAM0
and AHBSRAM1 section clauses to the script twice.  I removed one copy.

I also pulled Peter's define of the ETHMEM_SECTION macro up into the
previous nested #if so that the preprocessor wouldn't spit out a
redefined macro warning.

I verified that building the code clean before and after these changes
still results in the same .bin file but now without warnings and/or
duplicate code.
2013-08-15 04:40:53 -07:00
Bogdan Marinescu 44c43e6e38 Restore C++98 compilation mode
Also, remove the line that required C++11 from UDPSocket.h
2013-08-15 14:09:20 +03:00
Bogdan Marinescu ff55aa3e13 Merge pull request #32 from adamgreen/netPerfRobustness
Updates to network code to improve performance/robustness
2013-08-15 03:51:47 -07:00
Bogdan Marinescu 370b270848 Merge pull request #30 from pbrier/master
Experimental fix for issue #29
2013-08-15 03:24:36 -07:00
Adam Green 657e6df203 Updates to network code to improve performance/robustness
I started out looking at some UDP receive code that was only able to
handle 3 inbound 550 byte datagrams out of 16 when sent in quick
succession.  I stepped through the ethernet driver code and it
seemed to work as expected but it just couldn't queue up more than
3 PBUFs for each burst.  It was almost like it was being starved of
CPU cycles.  Based on that observation, I looked up the thread
priorities for the receive ethernet thread and found the following
close to the top of the lpc17_emac.c source file:
    #define RX_PRIORITY   (osPriorityNormal)
This got me to thinking, what is the priority of the tcp thead?  It
turns out that it gets its priority from the following line in
lwipopts.h:
    #define TCPIP_THREAD_PRIO           1
Interesting!  What priority is 1?  It turns out that it corresponds
to osPriorityAboveNormal.  This means that while the tcp thread is
handling one packet that has been posted to its mailbox from the
ethernet receive thread, the receive thread is starved from processing
any more inbound ethernet packets.

What happens if we set TCP_IP_THREAD_PRIO to osPriorityNormal?  Crash!
The ethernet driver ends up crashing in lpc_low_level_input() when
it tries to set p->len on a NULL p pointer.  The p pointer ended up
being NULL because an earlier call to pbuf_alloc() in lpc_rx_queue()
failed its allocation (I will have more to say about this failed
allocation later since that is caused by yet another bug).  I pulled a
fix from http://lpcware.com/content/bugtrackerissue/lpc17xx-mac-bugs to
remedy this issue.  When the pbuf allocation fails, it discards the
inbound packet in the pbuf and just puts it back into the rx queue.
This means we never end up with a NULL pointer in that queue to
dereference and crash on.

With that bug fixed, the application would just appear to hang after
receiving and processing a few datagrams.  I could place breakpoints in
the packet_rx() thread function and found that it was being signalled
by the ethernet ISR but it was always failing to allocate new PBUFs,
which is what led to our previous crash.  This means that the new
crash prevention code was just discarding every packet that arrived.

Why are these allocations failing?  In my opinion, this was the most
interesting bug to track down.  Is there a memory leak somewhere in
the code which maybe only triggers in low memory situations?  I
figured the easiest way to determine that would be to learn a bit
about the format of the lwIP heap from which the PBUF was failing to
be allocated.  I started by just stepping into the failing lwIP memory
allocator, mem_malloc().  The loop which search the free list starts
with this code:
    for (ptr = (mem_size_t)((u8_t *)lfree - ram);
This loop didn't even go through one iteration and when I looked at the
initial ptr value it contained a really large value.  It turns out that
lfree was actually lower than ram.  At this point I figured that lfree
had probably been corrupted during a free operation after one of the
heap allocations had been underflowed/overflowed to cause the metadata
for an allocation to be corrupted.  As I started thinking about how to
track that kind of bug down, I noticed that the ram variable might be
too large (0x20080a68).  I restarted the debugger and looked at the
initial value.  It was at a nice even address (0x2007c000) and
certainly nothing like what I saw when the allocations were failing.
This global variable shouldn't change at all during the execution of
the program.  I placed a memory access watchpoint on this ram variable
and it fired very quickly inside of the rt_mbx_send() function.  The
ram variable was being changed by this line in rt_mbx_send():
    p_MCB->msg[p_MCB->first] = p_msg;

What the what?  Why does writing to the mailbox queue overwrite the
ram global variable?  Let's start by looking at the data structure used
in the lwIP port to target RTX (defined in sys_arch.h):
// === MAIL BOX ===

typedef struct {
    osMessageQId    id;
    osMessageQDef_t def;
    uint32_t        queue[MB_SIZE];
} sys_mbox_t;

Compare that to the utility macro that RTX defines to help setup one of
these mailboxes with queue:
    #define osMessageQDef(name, queue_sz, type)   \
    uint32_t os_messageQ_q_##name[4+(queue_sz)]; \
    osMessageQDef_t os_messageQ_def_##name = \
    { (queue_sz), (os_messageQ_q_##name) }
Note the 4+(queue_sz) used in the definition of the message queue
array.  What a hack!  The RTX OS requires an extra 16 bytes to contain
its OS_MCB header and this is how it adds it in.  Obviously the
sys_mbox_t structure used in the lwIP OS targetting code doesn't have
this.  Without it, the RTX mailbox routines end up scribbling on
memory following the structure in memory.  Adding 4 in that structure
fixes the memory allocation failure that I was seeing and now the network
stack can handle between 7 and 10 datagrams within a burst.
2013-08-15 03:08:47 -07:00
Bogdan Marinescu 9d846d9902 Merge pull request #31 from adamgreen/gccWarnFixNetwork
Update LPC1768 linker script and silence GNU warnings in network stack
2013-08-15 02:55:51 -07:00