Mini2440: Bug in DM9000 Driver for CE ?
http://www.friendlyarm.net/forum/topic/34?lang=en
Max Penth
Hello,
Question:
I'm experiencing problems similar to those described here:
http://www.friendlyarm.net/forum/topic/9
When downloading a file via FTP or sending packets (TCP or UDP) from the
device, it doesn't take long for the driver to crash. Crashing means the
device loses its IP address (at least, it's no longer possible to ping it).
What seems strange is that UDP packets sent after the crash (TCP is not
possible because it can't complete a handshake anymore) still reach the
network and can be sniffed with Ethereal or Wireshark, but they go out as
raw Ethernet II frames rather than UDP packets. So it appears that the
device driver has trouble communicating with the upper layers after the
crash.
When limiting the transmit speed to about 20 kb/s, it seems to run for a
long time without a crash. However, when increasing the number of parallel
send threads, the crash occurs pretty quickly, even with a limited transfer
rate per thread.
Obviously, limiting the speed to 20 kb/s with just one thread can't be the
solution to this.
After finding the bug that was fixed in the Linux driver, I suspect the CE
bug is similar.
I hope someone with more experience in driver development could look into
this.
To reproduce the problem, just set up a mini2440 board with CE 5.0 and an
FTP server and try to download a bunch of files (e.g. \windows); it should
crash pretty quickly. The problem isn't caused by the FTP server, because I
have the same issues with a custom socket application.
Reply 1:
Baldwin NoUser:
Hello,
I have exactly the same problem. It seems that there's really a bug in that
driver.
Any ideas?
Thank you
Reply 2:
Max NoUser
After having to rebuild a new OS for the new 128MB boards, this error seems
to be gone; at least I can't get it to crash anymore, and FTP downloads are
stable at ~600 kb/s.
I used this BSP to build my OS:
http://www.andahammer.com/Downloads/mini2440-en.rar
Reply 3:
Baldwin NoUser
Hello again,
Currently I'm using the BSP labelled mini2440-bsp-en-090901 (the latest
version).
This driver sometimes crashes when using FTP. Please note that the crash is
not immediate: it takes about 5-6 FTP downloads of a 1 MB file and about 5-6
directories from the windows directory.
Examining the BSP's DM9000 driver code, I found a problem that can cause
this driver to crash. I'm not an expert in WCE driver development, but I
think there's a possible workaround:
As I understand it, the DM9000 has memory to hold up to two TX packets
(index I and index II), but the driver uses these two packets incorrectly:
it sends both without any check. The DM9000 datasheet says "... The DM9000
starts to transmit the index I packet. BEFORE the transmission of the index I
packet ends, the data of the next (index II) packet can be moved to TX RAM.",
but also "AFTER the index I packet ends transmission, write the byte count
data of the index II to BYTE_COUNT register and then set the bit 0 of TX
control register to transmit the index II packet". The driver does NOT check
whether packet I has been transmitted before it sets the length and sends
packet II. Under some timing conditions this can leave the driver in a
non-recoverable state (the m_nTx variable holds a wrong value and can't be
recovered).
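The ordering rule quoted from the datasheet can be illustrated with a small model. This is only a sketch under assumptions: the class TxModel and all of its members are hypothetical names invented for illustration, it does not touch real DM9000 registers, and it is not the BSP driver code. It shows that packet II may be loaded while packet I is on the wire, but its transmit request has to wait for packet I to complete.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>

// Toy model of the two-slot TX behaviour described in the quoted datasheet
// text. Hypothetical names throughout; no real hardware access.
class TxModel {
public:
    // Copy a packet into TX RAM. Allowed while another packet is being
    // transmitted, as long as at most two packets are held.
    bool load(std::uint16_t length) {
        if (loaded_.size() >= 2) return false;   // TX RAM holds two packets
        loaded_.push_back(length);
        return true;
    }

    // Models "write the byte count, then set the TX request bit" for the
    // oldest loaded packet. Refused while a transmission is in progress.
    bool startTx() {
        if (busy_ || loaded_.empty()) return false;
        busy_ = true;
        return true;
    }

    // Models the "packet transmitted" notification from the chip.
    void onTxComplete() {
        if (!busy_) return;                      // spurious completion, ignore
        assert(!loaded_.empty());
        loaded_.pop_front();
        busy_ = false;
        ++sent_;
        startTx();                               // now packet II may be started
    }

    bool isBusy() const { return busy_; }
    std::size_t sentCount() const { return sent_; }

private:
    std::deque<std::uint16_t> loaded_;           // lengths of packets in TX RAM
    bool busy_ = false;                          // a packet is on the wire
    std::size_t sent_ = 0;                       // completed transmissions
};
```

In this model the failure described above corresponds to calling startTx() for packet II while isBusy() is still true; the model refuses the request, whereas the driver as described issues it unconditionally.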
As a workaround, I modified the driver to send only one packet at a time:
in file "dm9000.cpp", function C_DM9000:DeviceSend (..), replace the line
"for (;m_nTx < 2;)" with the line "for (;m_nTx < 1;)".
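To see why changing the loop bound serializes transmission, here is a minimal sketch. Only the name m_nTx and the loop shape "for (;m_nTx < N;)" come from the post; the class SendPathSketch, the pending queue, and the onTxComplete handler are assumptions made for illustration and may not match how the real driver is structured.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>

// Hypothetical stand-in for the driver's send path. The bound passed to the
// constructor plays the role of the literal 2 (original) or 1 (workaround).
class SendPathSketch {
public:
    explicit SendPathSketch(int maxInFlight) : m_maxInFlight(maxInFlight) {}

    void enqueue(int packetId) { m_pending.push(packetId); }

    // Same shape as the loop in the post: hand packets to the chip until
    // the bound is reached or nothing is left to send.
    void deviceSend() {
        for (; m_nTx < m_maxInFlight; ) {
            if (m_pending.empty()) break;
            m_pending.pop();                 // packet handed to the chip
            ++m_nTx;
            if (m_nTx > m_peak) m_peak = m_nTx;
        }
    }

    // Assumed TX-complete handler: one packet has left the chip.
    void onTxComplete() {
        assert(m_nTx > 0);
        --m_nTx;
        deviceSend();                        // refill up to the bound
    }

    int inFlight() const { return m_nTx; }
    int peakInFlight() const { return m_peak; }
    std::size_t pending() const { return m_pending.size(); }

private:
    int m_maxInFlight;                       // loop bound (2 or 1)
    int m_nTx = 0;                           // packets handed to the chip, not yet completed
    int m_peak = 0;                          // highest m_nTx observed
    std::queue<int> m_pending;               // packets waiting to be sent
};
```

With a bound of 1, peakInFlight() never exceeds 1, so a second packet is never started before the first one has completed; with a bound of 2, two packets can be outstanding at once, which is the situation where the ordering rule from the datasheet matters.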
In my case this solves the crash, but in theory it also slows down
transfers (I didn't notice any slowdown).
I hope this post helps someone modify the driver in a more elegant way.