Wednesday, May 7, 2008

Re: [HACKERS] [PATCHES] Testing pg_terminate_backend()

It looks pretty good from here. I have about 50 million lines of
output, and the only FATAL messages are the "terminating due to admin
command" ones. All other errors look consistent with a killed backend:
for example, when the backend that creates a table gets killed, anybody
trying to access that table later fails with a "does not exist" error.
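
A check like the one described above could be sketched as follows. This is
only an illustration, not the command actually used: the function name
unexpected_fatals is hypothetical, and it assumes the log marks fatal errors
with "FATAL:" and that the expected message is PostgreSQL's "terminating
connection due to administrator command".

```shell
#!/bin/bash
# Sketch: print FATAL lines that are NOT the expected message produced by
# pg_terminate_backend(). Empty output means nothing unexpected was found.
# Assumption: $1 is a log or diffs file with one message per line.
unexpected_fatals() {
    grep 'FATAL:' "$1" |
        grep -v 'terminating connection due to administrator command'
}
```

Usage would be something like "unexpected_fatals /rtmp/regression.sigterm",
with the path taken from the quoted script below.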


//Magnus


Bruce Momjian wrote:
>
> Magnus, others, how is the SIGTERM testing going?
>
> ---------------------------------------------------------------------------
>
> Bruce Momjian wrote:
> > bruce wrote:
> > > Tom Lane wrote:
> > > > Bruce Momjian <bruce@momjian.us> writes:
> > > > > Tom Lane wrote:
> > > > >> The closest thing I can think of to an automated test is to
> > > > >> run repeated sets of the parallel regression tests, and each
> > > > >> time SIGTERM a randomly chosen backend at a randomly chosen
> > > > >> time. Then see if anything "funny"
> > > >
> > > > > Yep, that was my plan, plus running the parallel regression
> > > > > tests you get the possibility of >2 backends.
> > > >
> > > > I was intentionally suggesting only one kill per test cycle.
> > > > Multiple kills will probably create an O(N^2) explosion in the
> > > > set of possible downstream-failure deltas. I doubt you'd
> > > > really get any improvement in testing coverage to justify the
> > > > much larger amount of hand validation needed.
> > > >
> > > > It also strikes me that you could make some simple alterations
> > > > to the regression tests to reduce the set of observable
> > > > downstream deltas. For example, anyplace where a test loads a
> > > > table with successive INSERTs and that table is used by later
> > > > tests, wrap the INSERT sequence with BEGIN/END. Then there is
> > > > only one possible downstream delta (empty table) and not N
> > > > different possibilities for an N-row table.
> > >
> > > I have added pg_terminate_backend() to use SIGTERM and will start
> > > running tests as discussed with Tom. I will post my scripts too.
> >
> > Attached is my test script. I ran it for 14 hours (asserts on),
> > running 450 regression tests, with up to seven backends killed per
> > regression test.
> >
> > I have processed the combined regression.diffs files by picking
> > out all the new error messages. I don't see anything unusual in
> > there.
> >
> > Should I run it differently?
> >
> > --
> > Bruce Momjian <bruce@momjian.us> http://momjian.us
> > EnterpriseDB http://enterprisedb.com
> >
> > + If your life is a hard drive, Christ can be your backup. +
>
> > #!/bin/bash
> >
> > REGRESSION_DURATION=80 # average duration of regression test in seconds
> > OUTFILE=/rtmp/regression.sigterm
> >
> > # To analyze output, use:
> > # grep '^\+ *[A-Z][A-Z]*:' /rtmp/regression.sigterm | sort | uniq | less
> >
> >
> > cd /pg/test/regress
> >
> > while :
> > do
> >     (
> >         SLEEP=`expr $RANDOM \* $REGRESSION_DURATION / 32767`
> >         echo "Sleeping $SLEEP seconds"
> >         sleep "$SLEEP"
> >         echo "Trying kill"
> >         # send up to 7 kill signals
> >         for X in 1 2 3 4 5 6 7
> >         do
> >             psql -p 55432 -qt -c "
> >                 SELECT pg_terminate_backend(stat.procpid)
> >                 FROM (SELECT procpid FROM pg_stat_activity
> >                       ORDER BY random() LIMIT 1) AS stat
> >             " template1 2> /dev/null
> >             if [ "$?" -eq 0 ]
> >             then echo "Kill sent"
> >             fi
> >             sleep 5
> >         done
> >     ) &
> >     gmake check
> >     wait
> >     [ -s regression.diffs ] && cat regression.diffs >> "$OUTFILE"
> > done
>
>
> >
> > --
> > Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org)
> > To make changes to your subscription:
> > http://www.postgresql.org/mailpref/pgsql-patches
>

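For comparison, the one-kill-per-test-cycle approach Tom suggests in the
quoted thread could be sketched as a variant of the script above. This is a
sketch under stated assumptions, not a script anyone in the thread ran: the
function names random_delay, kill_one_backend and run_cycle are hypothetical,
while the port, database name and procpid column are copied from the quoted
script.

```shell
#!/bin/bash
# Sketch only: send a single SIGTERM per regression run.
# Port 55432, template1 and procpid come from the quoted script;
# all function names here are hypothetical.

REGRESSION_DURATION=80   # average duration of a regression run, in seconds

# Pick a delay between 0 and REGRESSION_DURATION seconds.
random_delay() {
    echo $(( RANDOM * REGRESSION_DURATION / 32767 ))
}

# Terminate one randomly chosen backend, exactly once.
kill_one_backend() {
    psql -p 55432 -qt -c "
        SELECT pg_terminate_backend(stat.procpid)
        FROM (SELECT procpid FROM pg_stat_activity
              ORDER BY random() LIMIT 1) AS stat
    " template1 2> /dev/null
}

# One test cycle: a single kill at a random time during "gmake check".
run_cycle() {
    ( sleep "$(random_delay)"; kill_one_backend ) &
    gmake check
    wait
}
```

Calling run_cycle in a loop from the regression directory would then give one
kill per run, which keeps the set of possible downstream failures smaller to
validate by hand.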

