Saturday, June 21, 2008

Re: [GENERAL] System in Recovery Mode But No Activity

"Scott Marlowe" <scott.marlowe@gmail.com> writes:
> On Fri, Jun 20, 2008 at 7:56 PM, John Cheng <chonger.cheng@gmail.com> wrote:
>> The state of the server when I sent this e-mail was that there were
>> two remaining connections/postgres subprocesses. I used kill -9 to
>> stop those two subprocesses. Then postgres was able to stop normally.
>> After that, I restarted postgresql normally and it went into recovery
>> mode for about 30 seconds. After that, it started to behave normally
>> again.

> Definitely look into what's causing the oom killer to come out and
> play, and look at turning off overcommit (I think the setting is 2 to
> turn it off)

If you see this again, please get stack backtraces (the 'bt' command
in gdb). The fact that both stuck processes were in
__lll_mutex_lock_wait() suggests some sort of deadlock, but it's
impossible to guess more without seeing how they got there.
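
For anyone who wants to script this, here is a minimal sketch of grabbing a
backtrace from a stuck backend non-interactively. It assumes gdb is installed
and that you have permission to attach to the process; the function names and
the all_threads option are mine, for illustration only. "thread apply all bt"
is offered as an optional extra in case the backend has more than one thread.

```python
import subprocess


def backtrace_command(pid, all_threads=False):
    """Build a gdb command line that attaches to pid, prints a backtrace, and exits."""
    # 'bt' is the command asked for above; 'thread apply all bt' is an
    # optional variant that dumps every thread of the process.
    gdb_cmd = "thread apply all bt" if all_threads else "bt"
    return ["gdb", "-p", str(pid), "-batch", "-ex", gdb_cmd]


def collect_backtrace(pid, all_threads=False, timeout=60):
    """Run gdb against pid and return whatever it printed on stdout."""
    result = subprocess.run(
        backtrace_command(pid, all_threads),
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return result.stdout


if __name__ == "__main__":
    import sys

    # Usage: python backtrace.py <pid of stuck postgres process>
    print(collect_backtrace(int(sys.argv[1]), all_threads=True))
```

Run it once per stuck process, before resorting to kill -9, and keep the
output for the list.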

Also, the reference to libpthread is a bit worrisome; we've seen
deadlocks in the past that were a direct result of the backend
unexpectedly becoming multithreaded; for example, see this thread:
http://archives.postgresql.org/pgsql-general/2007-11/msg00580.php
You should look into what's causing libpthread to get loaded, and
see if you can stop it. I don't see libperl mentioned in your
gdb output, but maybe something else is pulling libpthread in --- ldd
might help track that down.
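
As a complement to ldd, which only shows link-time dependencies, a rough
sketch like the following can check whether a running backend actually has
libpthread mapped. It assumes Linux, where /proc/<pid>/maps lists the files
mapped into a process; the function names are hypothetical.

```python
from pathlib import Path


def mapped_libraries(maps_text, needle="libpthread"):
    """Return the distinct mapped file paths in maps_text that contain needle."""
    found = []
    for line in maps_text.splitlines():
        # Fields: address, perms, offset, dev, inode, then an optional pathname.
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # anonymous mapping, no pathname
        path = fields[5].strip()
        if needle in path and path not in found:
            found.append(path)
    return found


def libraries_in_process(pid, needle="libpthread"):
    """Read /proc/<pid>/maps (Linux only) and look for needle in the mapped files."""
    return mapped_libraries(Path(f"/proc/{pid}/maps").read_text(), needle)


if __name__ == "__main__":
    import sys

    # Usage: python check_maps.py <pid of a postgres backend>
    for lib in libraries_in_process(int(sys.argv[1])):
        print(lib)
```

If the library shows up in a backend but not in the ldd output for the
postgres executable, something loaded at run time (an extension or a
procedural language, say) is the likely source.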

regards, tom lane

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
