1. What are segfault rip/rsp numbers and how to use them
http://stackoverflow.com/questions/1456899/what-are-segfault-rip-rsp-numbers-and-how-to-use-them
When my Linux application crashes, it produces a line in the logs like:

segfault at 0000000 rip 00003f32a823 rsp 000123ade323 error 4

What are those rip and rsp addresses? How do I use them to pinpoint the problem? Do they correspond to something in the objdump or readelf output? Are they useful if my program has its symbols stripped out (to a separate file, which can be used with gdb)?
Tags: debugging, segmentation-fault
asked Sep 21 '09 at 21:11 by johnnys

3 Answers
Well, the rip pointer tells you the instruction that caused the crash. You need to look it up in a map file.

In the map file you will have a list of functions and their starting addresses. When the application is loaded, it is loaded at a base address. The rip pointer minus the base address gives you the map-file address. If you then search the map file for a function that starts at an address slightly lower than your rip pointer and is followed in the list by a function with a higher address, you have located the function that crashed.

From there you need to try to identify what went wrong in your code. It's not much fun, but it at least gives you a starting point.

Edit: The "segfault at" part is telling you, I'd wager, that you have dereferenced a NULL pointer. The rsp is the current stack pointer. Alas, it's probably not all that useful. With a memory dump you may be able to figure out more accurately where you'd got to in the function, but it can be really hard to work out exactly where you are in an optimised build.
answered Sep 21 '09 at 21:20 by Goz
This link was very useful to me.
answered Oct 21 '10 at 14:29 by tsotso
I got the error, too. When I saw:
probe.out[28503]: segfault at 0000000000000180 rip 00000000004450c0 rsp 00007fff4d508178 error 4
probe.out is an app that uses libavformat (ffmpeg). I disassembled it:

objdump -d probe.out

The rip is the address of the faulting instruction:
00000000004450c0 <ff_rtp_queued_packet_time>:
4450c0: 48 8b 97 80 01 00 00 mov 0x180(%rdi),%rdx
44d25d: e8 5e 7e ff ff callq 4450c0 <ff_rtp_queued_packet_time>
Finally, I found that the app crashed in the function ff_rtp_queued_packet_time.
P.S. Sometimes the address doesn't match exactly, but it is almost there.
answered Jan 24 at 7:18 by qrtt1
2. Bug 172933 - gdm segfaults on boot
https://bugzilla.redhat.com/show_bug.cgi?id=172933
3. Segfault error 4
from:http://nixcraft.com/linux-software/12412-segfault-error-4-a.html
I have been running pgcluster 1.9rc5 for some months; recently I am getting alerts in the message log for segfault error 4. What could be the problem, and is there any solution? Can anyone tell me why this error occurs and what it means? I am running CentOS 5 on 64-bit blade servers, partitioned into 4 parts using VMware with HT disabled. Here are the alerts I am getting in the message log:

/var/log/messages.2:Oct 4 10:18:49 ibn-cluster3 kernel: postgres[13458]: segfault at 00002aaaae097004 rip 0000000000536e10 rsp 00007fff97608930 error 4
/var/log/messages.2:Oct 4 10:18:49 ibn-cluster3 kernel: postgres[13438]: segfault at 00002aaaae097004 rip 0000000000536e10 rsp 00007fff97608930 error 4
/var/log/messages.2:Oct 4 13:59:26 ibn-cluster3 kernel: postgres[25406]: segfault at 0000000000000094 rip 00000000005ad88e rsp 00007fff932bcad0 error 4
/var/log/messages.2:Oct 4 19:07:23 ibn-cluster3 kernel: postgres[5698]: segfault at 0000000000000094 rip 00000000005ad88e rsp 00007fff08347500 error 4
/var/log/messages.4:Sep 20 14:02:43 ibn-cluster3 kernel: postgres[31633]: segfault at 0000000000000094 rip 00000000005ad88e rsp 00007ffff4f76370 error 4
/var/log/messages.4:Sep 20 14:48:01 ibn-cluster3 kernel: postgres[32302]: segfault at 0000000000000094 rip 00000000005ad88e rsp 00007fffc9b4aca0 error 4
/var/log/messages.4:Sep 20 14:48:23 ibn-cluster3 kernel: postgres[32330]: segfault at 0000000000000094 rip 00000000005ad88e rsp 00007fffc9b4aca0 error 4
/var/log/messages.4:Sep 20 14:48:25 ibn-cluster3 kernel: postgres[32338]: segfault at 0000000000000094 rip 00000000005ad88e rsp 00007fffc9b4aca0 error 4
/var/log/messages.4:Sep 20 14:48:28 ibn-cluster3 kernel: postgres[32347]: segfault at 0000000000000094 rip 00000000005ad88e rsp 00007fffc9b4aca0 error 4
/var/log/messages.4:Sep 20 15:23:31 ibn-cluster3 kernel: postgres[1474]: segfault at 0000000000000094 rip 00000000005ad88e rsp 00007fff34cd0340 error 4
/var/log/messages.4:Sep 20 16:46:46 ibn-cluster3 kernel: postgres[2480]: segfault at 0000000000000094 rip 00000000005ad88e rsp 00007ffffc58a6f0 error 4
/var/log/messages.4:Sep 20 16:52:53 ibn-cluster3 kernel: postgres[2984]: segfault at 0000000000000094 rip 00000000005ad88e rsp 00007fff5e7e59e0 error 4
/var/log/messages.4:Sep 20 20:00:01 ibn-cluster3 kernel: postgres[6654]: segfault at 0000000000000094 rip 00000000005ad88e rsp 00007fffa24b2df0 error 4
/var/log/messages.4:Sep 20 20:00:03 ibn-cluster3 kernel: postgres[6662]: segfault at 0000000000000094 rip 00000000005ad88e rsp 00007fffa24b2df0 error 4

If some more details are needed then please let me know. Regards
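The "error 4" at the end of each line is itself informative: it is a 3-bit code (the article pasted under item 5 below explains the bits: bit 2 = user vs kernel mode, bit 1 = write vs read, bit 0 = protection violation vs missing page). A small sketch to decode it:

```shell
# Decode the "error N" field of a kernel segfault log line
# (bit meanings as described in the segfault article quoted in item 5).
decode_segv_error() {
    local err=$1
    (( err & 4 )) && echo "user-mode access"     || echo "kernel-mode access"
    (( err & 2 )) && echo "write access"         || echo "read access"
    (( err & 1 )) && echo "protection violation" || echo "page not mapped"
}
decode_segv_error 4   # the postgres lines above: user-mode read of an unmapped page
```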
Maybe the suggestion posted below will help you out:

Why Does The Segmentation Fault Occur on Linux / UNIX Systems?
-- Vivek Gite
In my experience Apache segfaults can also be caused by having one or more damaged run-time components in Apache or one of its dependent modules. I had this happen to me last weekend, when fixes applied to a corrupted file system apparently ended up damaging some of Apache's components.

Shortly after that happened, I noticed cascades of segfaults occurring in Apache on my system. In an effort to fix it, I used aptitude on my Debian system to carefully build a detailed list of Apache2 and all the components it depended on. Once I had a complete list, I used aptitude to remove all those packages (except one which the kernel depended on) and then used "clean" to remove all traces of those apps except their config files. Finally I reinstalled Apache2 and all the components it depended on, and the result was that I managed to eliminate almost all the segfaults. Whereas before I was getting several of those errors at once (and hundreds or thousands in a 24-hour period), I'm now seeing only 5 or 6 in a 24-hour period.

The point is that segfaults can occur as a by-product of a damaged file system as well as because of a hardware problem or a poorly written program. Try removing and reinstalling Apache and the components it depends on, and be sure to delete the binary runtimes for all those apps too -- THAT's the trickiest part -- but try to save your config files if possible. In my case this solution reduced my number of Apache segfaults from hundreds per hour to 4-6 per day.

Another thing to bear in mind is that Apache makes heavy use of memory and spawns dozens of dynamic tasks (e.g. PHP and its underlying applications, Python and its apps, Ruby and its apps, etc.), which then issue requests to other apps as well (e.g. MySQL database requests, Perl requests and God KNOWS what else). In short, Apache and the tools it relies on very thoroughly exercise your system's memory. So if you had a bad stick of RAM that was throwing random errors -- especially during periods of high system demand -- Apache and the apps it calls might encounter that bad block of memory quite often. Have you tried running a hardware diagnostic on your system?

Last edited by websissy; 14th October 2008.
I read the articles posted at the links suggested by Vivek. Good articles, but really they couldn't do much to diagnose and solve a problem, especially if you are not the software developer. Most people who use Linux systems use open source software, but that does not mean they can understand all the programming that goes on inside. Developers write programs, then package and distribute them to people all over the world. Developers don't even know most of the users, and most users wouldn't know how to debug a crash. If you believe in Peter's Principle, believe me, a crash occurs when you are least expecting it.

I suppose we need a HOWTO that bridges the developers and the users, for the purpose of eliminating such crashes and flaws from the software. I hope somebody could write an article like: suppose you are using software that randomly crashes; open up /var/log/messages and grep for segfault. You will notice one or more lines like:

segfault at a594dec8 eip b7cc6283 esp ab78e658 error 4

I really wish I knew more about what to do next and could write the rest of the article. But I think the article should cover things like:

- How and why all software developers, both open source and closed source, should ship the symbol tables of their executables and libraries.
- How the end user should obtain the symbol tables and use them to analyse a crash, even without the original source code.
- All segfaults are not necessarily caused by a flaw in the application software. How do you make sure that a crash reported as a segfault is definitely NOT caused by a fault in the software?

Most of the articles seen on the web ask you to start an application under gdb and then keep using it until a crash occurs. I am amazed that so many people still believe that a software user really has nothing better to do. And moreover, if replicating the crash were so easy, wouldn't every developer be able to simply ship stable releases and cut out all that alpha and beta crap?

On numerous occasions I have even seen segfaults without any core dumps, and other forms of crashes, like stack smashing etc., that don't even generate a core dump; most users wouldn't even be able to record them if they occurred in a background application or a daemon. Hopefully this article might cover points like capturing ALL outputs emitted by an application due to systemic errors, like stack smashing, OOM, etc.

Maybe all the stuff that I am wishing for does exist somewhere and I am too stupid to have not discovered it, so I thank in advance all those who might be kind enough to point me to the correct links. Hopefully the admins and users of this forum will be able to get together and get such an article in place, under Vivek's stewardship.
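One way to get the wished-for crash details without relying on core dumps is a SIGSEGV handler that writes a raw backtrace before exiting. A minimal sketch, assuming glibc (its execinfo.h interface) and using a deliberate NULL write to trigger the handler:

```shell
cat > bthandler.c <<'EOF'
/* Print a raw backtrace from inside a SIGSEGV handler (glibc execinfo).
 * backtrace_symbols_fd writes straight to a file descriptor, which is
 * safe enough for this demonstration even inside a signal handler. */
#include <execinfo.h>
#include <signal.h>
#include <unistd.h>

static void on_segv(int sig)
{
    void *frames[32];
    int n = backtrace(frames, 32);
    backtrace_symbols_fd(frames, n, STDERR_FILENO);
    _exit(128 + sig);
}

int main(void)
{
    signal(SIGSEGV, on_segv);
    volatile int *p = 0;
    *p = 42;                 /* deliberate crash */
    return 0;
}
EOF
cc -g -rdynamic -o bthandler bthandler.c
./bthandler 2>&1 | head -5   # frames are printed even though no core was dumped
```

A daemon could write the same frames to its log file instead of stderr.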
Yes, segfault errors are a royal pain in the a$$. gdb is the best tool to debug these problems. There is another good alternative called DTrace, which is a dynamic tracing framework for troubleshooting kernel and application problems on production systems in real time. But it only works on Solaris / FreeBSD / Mac OS X, not on Linux.

So as a sysadmin you get to train yourself in using gdb. There are good books out there that teach gdb. HTH
I have been on this subject for a while now, and did a bit of surfing around in search of nirvana. Let me share with you what I discovered, and let's hope there are people on this forum who might be interested in adding more.

A super link that initiates people into the world of post-mortem analysis: YouTube - Gilad Ben-Yossef on using ldd and nm. Another discussion thread, "Getting stack traces on Unix systems, automatically" on Stack Overflow, is worth a visit. The second link requires a lot of recoding of any existing software, whereas the first link encourages analysis from whatever you already have.

A bit more surfing around, and I found that after compiling any application it is possible to export its symbols into a separate file. Thus even applications that are put into production after stripping can be analysed with gdb, without necessarily requiring the source code. For example, one could do:

Code:
gdb ./${EXECUTABLE_BINARY} --readnow <<- _EOF
	maint print symbols ${SYMBOLS_FILE_FOR_THE_EXECUTABLE_BINARY}
	quit
_EOF
wait

gdb allows invocation by specifying an executable and a separate file that contains the symbols, in case you do not have the source code and are using a stripped executable. But I am not sure whether "maint print symbols" is the accurate option; it must be verified before use. But I guess, if that works and does not allow reverse engineering, then even developers of closed-source software could be encouraged to release the symbols file within released packages.

I discovered another good reference at Tuxology - a Linux embedded, kernel and training blog. It's by the same gentleman as in the YouTube link.

DTrace, mtrace, strace, ptrace, etc. are good, but the only problem with them is that they are only good if you know the application is going to crash soon, or if you know how to replicate the crash. All of them leave me miserably occupied on the console, waiting for hours for an application to crash. The scene is worse when the app is basically supposed to run as a daemon, and we run it in the foreground just for witch-hunting. And if the stupid thing crashes just when you left for a quick cuppa ..... !!!!

The second link I mentioned above discusses possibilities of making your application capture a lot of details when it gets a SIGSEGV. And I suppose it's just that enough examples need to be collected so that newbies can learn it too.

Btw, does anybody know how to actually use objdump and nm? Their documentation only discusses how to invoke them -- nothing much about how to interpret the output and use it to analyse a segfault with fine accuracy.

Cheers
OK, I figured out a bit about objdump! It is possible to identify the location in the source code that causes problems like:

segfault at XXXXXX eip YYYYYY esp ZZZZZZ error 4

Typically such lines would be seen in /var/log/messages in the following format:
Code:
Jul 28 20:51:32 ubuntu804 kernel: [ 8146.280653] YOUR_APPLICATION[992]: segfault at 0000004c eip 08094952 esp a7acddc0 error 4
First, dump an annotated disassembly of your application:

Code:
objdump -DCl "/path/to/YOUR_APPLICATION" > APPLICATION_DEBUG

In the above example 08094952 represents YYYYYY, so I would typically do this:
Code:
grep -n -A 6 -B 6 "8094952" APPLICATION_DEBUG

The resulting output should give you a fair idea of where the problem lies in the code. grep -n tells you the line number of the relevant information in APPLICATION_DEBUG, and you might even cat or less that entire file to look at things more holistically. -A 6 -B 6 simply shows the 6 lines before and after the matching position in APPLICATION_DEBUG. Though the information in /var/log/messages could be different, like:
Code:
segfault at 00002aaaae097004 rip 0000000000536e10 rsp 00007fff97608930 error 4

Happy hunting, and if anybody else has notes to add, I guess this thread will be very useful to everybody, so please accept my thanks in advance.
4. LINUX: segfault error 4
A forum thread asked by Jojo Castro, with replies from Matti Kurkela and Jojo Castro.
5. Linux encounters a Segmentation fault
from:http://hi.baidu.com/goggle1/blog/item/1ee73d2fe90d985c4fc2261c.html
Program terminated with signal 11, Segmentation fault.
After the program had run for 8 hours, the message above appeared, together with a note that a core dump file had been produced. I located the core dump file core.2747 and ran:

#gdb -c core.2747
#bt

No stack was visible, and no source line information at all. At first I assumed memory had been trampled so badly that nothing was recoverable. I searched Baidu for "Program terminated with signal 11, Segmentation fault." and found "How to find and fix faults in Linux applications".

Discovery 1: In fact that was not the case; I was simply using gdb incorrectly. The correct usage is:

#gdb ./myprogram core.2747
#bt

Now the stack trace came out!

Discovery 2: tail -f messages showed:

Mar 16 13:59:52 localhost kernel: myprogram[2856]: segfault at 0000000000003a49 rip 000000000041f82c rsp 000000004be1bfb0 error 4

This time I googled "segfault rip rsp error 4" and found a second good article, "Posts tagged segfault". From it I learned about dmesg, which can turn up some information, and about the addr2line -e testseg 0000000000400470 command. Both articles are so good that I paste them in full below:

How to find and fix faults in Linux applications

Abstract: Everybody claims that it is easy to find and fix bugs in programs written under Linux. Unfortunately it is very hard to find documents explaining how to do that. In this article you will learn how to find and fix faults without first learning how an application works internally.

Introduction

From a user perspective there is hardly any difference between closed and open source systems as long as everything runs without faults and as expected. The situation changes, however, when things do not work, and sooner or later every computer user will come to the point where things do not work. In a closed source system you usually have only two options:
Despite those obstacles there are a few things you can do without reading all the code and without learning how the program works internally.

Logs

The most obvious and simplest thing you can do is to look at the files in /var/log/... What you find in those files, and what the names of those log files are, is configurable. /var/log/messages is usually the file you want to look at. Bigger applications may have their own log directories (/var/log/httpd/, /var/log/exim, ...). Most distributions use syslog as the system logger, and its behavior is controlled via the configuration file /etc/syslog.conf. The syntax of this file is documented in "man syslog.conf".

Logging works such that the designer of a program can add a syslog line to his code. This is much like a printf except that it writes to the system log. In this statement you specify a priority and a facility to classify the message:

#include <syslog.h>

With this C interface any application written in C can write to the system log. Other languages have similar APIs. Even shell scripts can write to the log with the command:

logger -p err "this text goes to /var/log/messages"

A standard syslog configuration (file /etc/syslog.conf) should have, among others, a line that looks like this:

# Log anything (except mail) of level info or higher.

The "*.info" will log anything with priority level LOG_INFO or higher. To see more information in /var/log/messages you can change this to "*.debug" and restart syslog (/etc/init.d/syslog restart). The procedure to "debug" an application would therefore be as follows:

1) run tail -f /var/log/messages and then start the application

The problem with this method is that it depends entirely on what the developer has done in his code. If he/she did not add syslog statements at key points then you may not see anything at all. In other words, you can find only problems where the developer already foresaw that something could go wrong.
strace

An application running under Linux can execute 3 types of function:

These system calls can be intercepted, and you can therefore follow the communication between the application and the kernel. A common problem is that an application does not work as expected because it can't find a configuration file or does not have sufficient permissions to write to a directory. These problems can easily be detected with strace. The relevant system call in this case would be called "open". You use strace like this:

strace your_application

Here is an example:

# strace /usr/sbin/uucico

What do we see here? Let's look e.g. at the following lines:

open("/etc/ld.so.preload", O_RDONLY) = -1 ENOENT (No such file or directory)

The program tries to read /etc/ld.so.preload and fails; then it carries on and reads /etc/ld.so.cache. Here it succeeds and gets file descriptor 3 allocated. Now the failure to read /etc/ld.so.preload may not be a problem at all, because the program may just try to read this and use it if possible. In other words, it is not necessarily a problem if the program fails to read a file. It all depends on the design of the program. Let's look at all the open calls in the printout from strace:

open("/usr/conf/uucp/config", O_RDONLY) = -1 ENOENT (No such file or directory)

The program now tries to read /usr/conf/uucp/config. Oh! This is strange: I have the config file in /etc/uucp/config, and there is no line where the program attempts to open /etc/uucp/config. This is the fault. Obviously the program was configured at compile time for the wrong location of the configuration file. As you can see, strace can be very useful. The problem is that it requires some experience with C programming to really understand the full output of strace, but normally you don't need to go that far.

gdb and core files

Sometimes it happens that a program just dies out of the blue with the message "Segmentation fault (core dumped)". This means that the program tried (due to a programming error) to write beyond the area of memory it has allocated.
Especially in cases where the program writes just a few bytes too much, it can be that only you see this problem, and that it happens only once in a while. This is because memory is allocated in chunks and sometimes there is accidentally still room left for the extra bytes.

When this "Segmentation fault" happens, a core file is left behind in the current working directory of the program (normally your home directory). This core file is just a dump of the memory at the time when the fault happened. Some shells provide facilities for controlling whether core files are written. Under bash, for example, the default behavior is not to write core files at all. In order to enable core files, you should use the command:

# ulimit -c unlimited

The core file can now be used with the gdb debugger to find out what was going wrong. Before you start gdb you can check that you are really looking at the right core file:

# file core.16897

OK, lshref is the program that was crashing, so let's load it into gdb. To invoke gdb for use with a core file, you must specify not only the core filename but also the name of the executable that goes along with that core file:

# gdb ./lshref core.23061

Now we know that the program is crashing while it tries to do a strcpy. The problem is that there might be many places in the code where strcpy is used. In general there will now be 2 possibilities to find out where exactly in the code it goes wrong.

What we need is a stack trace, which will tell us which function was called before strcpy was executed. The command to do such a stack trace in gdb is called "backtrace". It does not, however, work with only the core file. You have to re-run the command in gdb (reproduce the fault):

gdb ./lshref core.23061

Now we can see that main() called string_to_list(), and from string_to_list() strcpy() is called. We go to string_to_list() and look at the code:

char **string_to_list(char *string){

This malloc line looks like a typo. Probably it should have been:

dat=(char *)malloc(strlen(string)+5000);

We change it, re-compile and ... hurrah ... it works.

Let's look at a second example, where the fault is not detected inside a library but in application code. In such a case the application can be compiled with the "gcc -g" flag and gdb will be able to show the exact line where the fault is detected. Here is a simple example:

#include

We compile it:

gcc -Wall -g -o exmp exmp.c

Run it:

# ./exmp

gdb exmp core.5302

gdb tells us now that the fault was detected at line 6 and that pointer "p" pointed to memory which cannot be accessed. We look at the above code, and it is of course a simple made-up example where p is a null pointer, and you cannot store any data through a null pointer. Easy to fix...

Conclusion

We have seen cases where you can really find the cause of a fault without knowing too much about the inner workings of a program. I have on purpose excluded functional faults, e.g. a button in a GUI is in the wrong position but it works. In those cases you will have to learn about the inner workings of the program. This will generally take much more time, and there is no recipe for how to do that. However, the simple fault-finding techniques shown here can still be applied in many situations. Happy troubleshooting!
Posts tagged segfault

testseg[24850]: segfault at 0000000000000000 rip 0000000000400470 rsp 0000007fbffff8a0 error 6

Messages like this are generally caused by out-of-bounds memory accesses; whether a user-space program or kernel code accesses memory out of bounds, a core is dumped and a line like this is written to the system log. The fields are, in order: the name of the offending program, its process ID, the faulting address, the instruction pointer and stack pointer at the time of the fault, and finally the most useful piece of information, the error number.

The error number is made up of three bits, from high to low bit2, bit1 and bit0, so its value ranges from 0 to 7:

* bit2: 1 means a user-mode program caused the out-of-bounds access; 0 means kernel-mode code caused it
* bit1: 1 means a write operation caused the fault; 0 means a read operation caused it
* bit0: 1 means there were insufficient permissions to access the address; 0 means the faulting address had no page mapped at all, i.e. an invalid address

In the example above the error number is 6, which is 110 in binary, i.e. bit2=1, bit1=1, bit0=0. By the rules above, this line was caused by a user-mode program performing an out-of-bounds write.

Using the segfault message to locate the program bug:

#include <stdio.h>
int main()
{
    int *p;
    *p = 12;
    return 1;
}

1. Build with gcc testseg.c -o testseg -g and run ./testseg; dmesg then shows:
   testseg[26063]: segfault at 0000000000000000 rip 0000000000400470 rsp 0000007fbffff8a0 error 6
2. Run addr2line -e testseg 0000000000400470, which outputs:
   /home/xxx/xxx/c/testseg.c:5
6. Lighttpd php segfault at 0000000000000040 rip 0000003e30228278 rsp 0000007fbffff708 error 4
from:http://www.cyberciti.biz/tips/lighttpd-php-segfault-at-0000000000000040-rip-error.html
by Vivek Gite on October 17, 2006 · 0 comments
I have recently noticed this error. Although the server continues to work without problems, at some point it will crash, so it is better to fix this error. The main problem was a chrooted lighttpd installation: a few libraries were not copied. You need to use the ldd command to locate the names of the libraries. In my case it was the curl library used by the DOMXML PHP module. Use the following procedure to trace the required libraries:
# mkdir /webroot/bin
# cp /bin/bash /webroot/bin
# cp /usr/bin/strace /webroot/bin
# l2chroot /usr/bin/strace
# l2chroot /bin/bash
# chroot /webroot
# strace php /path/to/script.php 2> /tmp/debug.txt
# exit
# vi /webroot/tmp/debug.txt
Now find out which shared libraries were not found. Next you need to copy all missing libraries to the /lib or /usr/lib location inside the jail. Repeat the above procedure until all required shared libraries are copied into the chroot jail.
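The l2chroot helper used above is a nixcraft script; the core of the idea can be sketched with ldd directly (the /webroot base and the php path are assumptions from this article):

```shell
# Copy every shared library a binary needs into the chroot tree.
BASE=/webroot
copylibs() {
    # ldd prints lines like "libc.so.6 => /lib/.../libc.so.6 (0x...)";
    # pick out every absolute path and mirror it under $BASE.
    ldd "$1" | awk '{ for (i = 1; i <= NF; i++) if ($i ~ /^\//) print $i }' |
    while read -r lib; do
        mkdir -p "$BASE$(dirname "$lib")"
        cp -n "$lib" "$BASE$lib"    # -n: do not clobber existing copies
    done
}
# copylibs /usr/bin/php    # repeat for each binary run inside the jail
```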
The following is the recommended solution if you run Apache or lighttpd in a chroot jail.
Copy all shared libs from /lib and /usr/lib to the /chroot directory, but don't copy any executables from the /bin/, /usr/bin or /usr/sbin directories.
# cp -avr /lib/ /chroot/lib/
# cp -avr /usr/lib/ /chroot/usr/lib/
The above solution is quite secure, and I have successfully implemented it for high-performance Apache shared load-balancing business hosting. More than 800 sites are hosted using 6 Apache web servers and a 2-node MySQL cluster.
Don't forget to remove the /chroot/bin directory and all its files after troubleshooting.