Saturday, July 12, 2008

Re: [HACKERS] PATCH: CITEXT 2.0 v3

On Jul 12, 2008, at 14:57, Tom Lane wrote:

> There was some discussion earlier about whether the proposed
> regression
> tests for citext are suitable for use in contrib or not. After
> playing
> with them for awhile, I have to come down very firmly on the side of
> "not". I have these gripes:

You're nothing if not thorough, Tom.

> 1. The style is gratuitously different from every other regression
> test
> in the system. This is not a good thing. If it were an amazingly
> better way to do things, then maybe, but as far as I can tell the
> style
> pgTAP forces on you is really pretty darn poorly suited for SQL tests.
> You have to contort what could naturally be expressed in SQL as a
> table
> result into a scalar. Plus it's redundant with the expected-output
> file.

Sure. I wasn't aware of the existing regression test methodology when
I wrote pgTAP and those tests. I'm fine to change them and use pgTAP
for testing my app code, rather than PostgreSQL itself.

> 2. It's ridiculously slow; at least a factor of ten slower than doing
> equivalent tests directly in SQL. This is a very bad thing. Speed of
> regression tests matters a lot to those of us who run them a dozen
> times
> per day --- and I do not wish to discourage any developers who don't
> work that way from learning better habits ;-)

Hrm. I'm wonder why it's so slow? The test functions don't really do a
lot. Anyway, I agree that they should perform well.

> Because of #1 and #2 I find the use of pgTAP to be a nonstarter.
> I have a couple of gripes about the substance of the tests as well:
>
> 3. What you are mostly testing is not the behavior of citext itself,
> but the behavior of the underlying strcoll function. This is pretty
> much doomed to failure, first because the regression tests are
> intended
> to work in multiple locales (and we are *not* giving that up for
> citext), and second because the behavior of strcoll isn't all that
> portable across platforms even given allegedly similar locale settings
> (as we already found out yesterday).

Yes, I *just* ran the tests on Ubuntu and opened my mail to ask about
the bizarre differences when I saw this message. Thanks for answering
my question before I asked it. :-)

> Sadly, I think you have to give up
> attempts to check the interesting multibyte cases and confine yourself
> to tests using ASCII strings.

Grr. Kind of defeats the purpose. Is there no infrastructure for
testing multibyte functionality? Are test database clusters all built
using SQL_ASCII and the C locale?

> 4. A lot of the later test cases are equally uselessly testing whether
> piggybacking over text functions works. The odds of ever finding
> anything with those tests are not distinguishable from zero (unless
> the
> underlying text function is busted, which is not your responsibility
> to
> test). So I don't see any point in putting them into the standard
> regression package. (What maybe *would* be useful to test, but you
> didn't, is whether the result of a function is considered citext
> rather
> than text.)

Well, I was doing test-driven development: I wrote the tests to ensure
that I had added the functions for CITEXT properly, and when they
passed, I could move on. As a unit tester, it'd feel weird for me to
drop these. I'm not testing the underlying functions; I'm making sure
that they work properly with CITEXT.

> I suggest something more like the attached as a suitable regression
> test. I got bored about halfway through and started to skim, so there
> might be a few tests toward the end of the original set that would be
> worth transposing into this one, but nothing jumped out at me.

Thanks! I'll work this in.

Best,

David

PS: I confirmed late yesterday that the memory leak I saw was with my
version for 8.3, where I had my own str_tolower(). It clearly has a
leak that I'll have to fix, but it has no bearing on the contribution
to 8.4, which has no such leak. Thanks for running the test yourself
to confirm.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

No comments: