Friday, August 1, 2008

Re: [PATCHES] Relation forks & FSM rewrite patches

Attached is an new version of the relation forks and FSM patches,
updated per Tom's comments (details below). I think the relation forks
patch is ready for commit, except that the docs on physical storage
needs to be updated. Barring any objections, I will commit the relation
forks patch in a few days, and submit a patch for the docs.

The FSM patch has been updated so that it applied over the new relation
forks patch.

Tom Lane wrote:
> "Heikki Linnakangas" <heikki@enterprisedb.com> writes:
>> The one part that I'm not totally satisfied in the relation forks patch
>> is the smgrcreate() function. The question problem is: which piece of
>> code decides which forks to create for a relation, and when to create
>> them? I settled on the concept that all forks that a relation will need
>> are created at once, in one smgrcreate() call. There's no function to
>> create additional forks later on.
>
> I think that's an extremely bad idea. You need look no further than
> Zdenek's hopes of in-place-upgrade to see a need for adding a fork
> to an existing relation; but even without that I can see possibly
> wanting to not create a fork right away. I think smgrcreate() should
> just create one fork as it's told (ie, same API you gave mdcreate).

Yeah, that's better.

> Other nits:
>
> relpath() is now in danger of buffer overrun: you need to increase
> the palloc() sizes.

Oops, fixed.

> Seems like a #define for the max number of digits
> in a ForkNumber wouldn't be out of place (though I sure hope it never
> gets past 1 ...).

Done. It makes it more obvious how the buffer length is calculated than
using plain numbers.

> Also, I strongly suggest *not* appending _0 to the name of the main fork's
> file. This'd break on-disk compatibility and people's expectations,
> for no particular reason.

Done.

> Don't like the name NUM_FORKS; it seems to suggest that's the actual
> number of forks in existence. MAX_NUM_FORKS would be better.

I don't like MAX_NUM_FORKS. I renamed it to MAX_FORKNUM (and changed its
semantics accordingly).

> I think that setting InvalidForkNumber to -1 is unportable: there is
> nothing compelling the compiler to represent enum ForkNumber as a signed
> type, so the places where you assume a ForkNumber variable can hold
> InvalidForkNumber all look like trouble. One solution is to include
> InvalidForkNumber in the enum:
>
> enum ForkNumber {
> InvalidForkNumber = -1,
> MAIN_FORKNUM = 0
> };
>
> This would be a bit annoying if someone wrote a switch statement listing
> different possible fork numbers, as the compiler would complain if
> there's no case for InvalidForkNumber; but I can't see a reason for code
> like that to be common.

I chose this approach. There's no switch constructs like that at the
moment, and I don't see any immediate need for one. In fact, at the
moment InvalidForkNumber is only used in bgwriter database fsync
requests, where the fork number field doesn't mean anything.

> BTW, don't put a comma at end of enum ForkNumber, I think some compilers
> will barf on that.

Done.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

No comments: