Friday, August 29, 2008

[GENERAL] Full Text Search, plus some General Ideas Beforehand

Dear Madam or Sir!

I am currently studying PostgreSQL 8.3.1 from a magazine CD, unfortunately
under MS Windows.

Aye, it is under the BSD licence, which allows the code to be made
proprietary again.

I lean more towards the GPL, although the GPL v3.x is a bit too strict for
my GNU Business Model.

This model provides source code and uncertified binaries for the cost of
media and handling, as is usual for open-source software.

Money is made, close to the ideas of the FSF, by leasing certified binaries
that are dongled to a particular machine. In this case the GPL v3.x has to
be relaxed to a kind of LGPL, so that the binary can be made tamper-proof.

Tightly bound to this licence is generous insurance against any damage
arising from use of the software.

Closed-source software would be unconditionally liable for any damage under
Article 14, paragraph 2 of the German Constitution; we have the concept of
risk liability without fault.

I have serviced mission-critical embedded computers, so my hard position is
understandable.

So much for the general politics.

Aye, I am an old fossil who learned computing on punched cards over 30
years ago.

Thus I am influenced not only by XBase, but also by database theory, where
indexing everything was paramount.

Your problem of updating B-trees is most visible in ReiserFS, whose
transactional benefits are counterbalanced by the overhead it incurs on
mail and news servers.

There is a fragmentation problem under ext3 for IMAP directories, but this
can be solved by tar-ing and untar-ing the appropriate directories.

So we should leave that to the planner.

However, the old DEC Alpha operating systems provided FX!32, an adaptive
IA-32 emulator, and Sun currently ships a HotSpot JIT system in JRE 1.6.x.

A similar adaptive strategy could serve the PostgreSQL planner. Aye, this
eats space, but it might result in a much better balance between indexing
speed-up and overhead.

Well, for Full Text Search, especially when emulating Google & Co., I
advocate augmenting the traditional presence index with a proximity index
and a permutation index. You already have functions to generate such a
proximity index; the permutation index is the least important.
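
To illustrate the difference between the two more important indexes, here
is a minimal sketch in Python. It is only an illustration under my own
assumptions: the function names (tokenize, build_presence_index,
build_proximity_index, within) are hypothetical, the tokeniser is naive,
and none of this claims to show how PostgreSQL implements its indexes. The
permutation index is left out, since its exact form is not specified above.

```python
from collections import defaultdict


def tokenize(text):
    """Naive tokeniser (assumption): lower-case and split on whitespace."""
    return text.lower().split()


def build_presence_index(docs):
    """Presence index: term -> set of ids of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def build_proximity_index(docs):
    """Proximity index: term -> {doc_id: [word positions of the term]}."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in docs.items():
        for pos, term in enumerate(tokenize(text)):
            index[term][doc_id].append(pos)
    return index


def within(index, a, b, max_gap):
    """Ids of documents where terms a and b lie at most max_gap words apart."""
    docs_a = index.get(a, {})
    docs_b = index.get(b, {})
    hits = set()
    # Only documents containing both terms can match.
    for doc_id in docs_a.keys() & docs_b.keys():
        if any(abs(pa - pb) <= max_gap
               for pa in docs_a[doc_id]
               for pb in docs_b[doc_id]):
            hits.add(doc_id)
    return hits


if __name__ == "__main__":
    docs = {
        1: "full text search in PostgreSQL",
        2: "search the full archive of text",
    }
    presence = build_presence_index(docs)
    proximity = build_proximity_index(docs)
    print(sorted(presence["full"]))
    print(sorted(within(proximity, "full", "text", 1)))
```

The presence index can only say that both documents contain "full" and
"text"; the proximity index can additionally tell that the two words are
adjacent in the first document only, which is what a phrase-like query
needs.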

This makes the whole Internet a natural test bed for PostgreSQL, and
somehow a basis for a Google successor.

Kind regards

Norbert Grün (gnor.gpl@googlemail.com)

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
