Sunday, September 7, 2008

Re: [GENERAL] A challenge for the SQL gurus out there...

On Sunday 07 September 2008, Gregory Stark wrote:
> "Uwe C. Schroeder" <uwe@oss4u.com> writes:
> > I want to get a list looking like
> >
> > forum id thread_id post_id
> > 1 6 443
> > 2 9 123
> > 3 3 557
>
> ...
>
> > It all boils down to me not being able to come up with a query that gives
> > me the latest post per forum_id.
>
> In a situation like this I would probably denormalize the tables slightly
> by adding a form_id key to the individual posts. That would make it hard to
> ever move a thread from one forum to another, though not impossible, but
> would help in this case as well as any other time you want to do an
> operation on all posts in a forum regardless of thread.
>
> If you add that column then you could index <form_id,date> and get the
> result you're looking for instantly with a DISTINCT ON query (which is a
> Postgres SQL extension).
>
> SELECT DISTINCT ON (form_id)
> forum_id, thread_id, post_id
> FROM thread
> ORDER BY forum_id, date DESC
>
> (actually you would have to make the index on <form_id, date DESC> or make
> both columns DESC in the query and then re-order them in an outer query)
>
> Alternatively you could have a trigger on posts which updates a
> last_updated field on every thread (and possibly a recent_post_id) then you
> could have a query on forums which pulls the most recently updated thread
> directly without having to join on form_post at all. That would slow down
> inserts but speed up views -- possibly a good trade-off for a forum system.

Thanks Gregory.
Just to put my final solution on the list: I ended up with a combined approach
of what you suggested:
I added the forum_id to the posts table and created 2 triggers: one that sets
the forum_id in the posts table to the forum_id in the threads table on
insert (therefor no change in the application was necessary).
The second trigger is to overcome the downside of adding the forum_id to the
posts table. On an update to forum_thread.forum_id the trigger updates all
posts in that thread to reflect the change in forum_id. That way one can just
move the whole thread by changing the forum_id and the posts are moved along
by the trigger.

Very nice! The query time is now 198ms instead of up to 48seconds !!!

Thanks for the idea

Uwe

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

No comments: