Monday, August 11, 2008

[HACKERS] Question regarding the database page layout.

Hello all,

I have been digging into the database page layout (specifically the tuples) to ensure the unsigned integer types were consuming the proper storage.
While digging around, I found one thing surprising:  

It appears the heap tuples are padded at the end to a multiple of the MAXALIGN distance.

Below is the data I used to come to this conclusion.
(This test was performed on a 64-bit system with --with-blocksize=32).

The goal was to compare data from comparable type sizes.
The first column indicates the type (char, uint1, int2, uint2, int4, and uint4);
the number in parentheses indicates the number of columns in the table.

The Length is from the .lp_len field in the ItemId structure.
The Offset is from the .lp_off field in the ItemId structure.
The Size is the difference between the offsets of adjacent tuples.

char (1)    Length  Offset  Size    char (9)    Length  Offset  Size
            25      32736   32                  33      32728   40
            25      32704   32                  33      32688   40
            25      32672   32                  33      32648   40
            25      32640                       33      32608

uint1 (1)   Length  Offset  Size    uint1 (9)   Length  Offset  Size
            25      32736   32                  33      32728   40
            25      32704   32                  33      32688   40
            25      32672   32                  33      32648   40
            25      32640                       33      32608

int2 (1)    Length  Offset  Size    int2 (5)    Length  Offset  Size
            26      32736   32                  34      32728   40
            26      32704   32                  34      32688   40
            26      32672   32                  34      32648   40
            26      32640                       34      32608

uint2 (1)   Length  Offset  Size    uint2 (5)   Length  Offset  Size
            26      32736   32                  34      32728   40
            26      32704   32                  34      32688   40
            26      32672   32                  34      32648   40
            26      32640                       34      32608

int4 (1)    Length  Offset  Size    int4 (3)    Length  Offset  Size
            28      32736   32                  36      32728   40
            28      32704   32                  36      32688   40
            28      32672   32                  36      32648   40
            28      32640                       36      32608

uint4 (1)   Length  Offset  Size    uint4 (3)   Length  Offset  Size
            28      32736   32                  36      32728   40
            28      32704   32                  36      32688   40
            28      32672   32                  36      32648   40
            28      32640                       36      32608
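
The Size values in the tables are consistent with each Length being rounded up to the next multiple of 8, which I assume is the MAXALIGN distance on this 64-bit build. A minimal sketch of that rounding in C follows; align_up, slot_size, and MAXALIGN_BYTES are hypothetical names chosen for illustration, not PostgreSQL's own macros.

```c
#include <stddef.h>

/* Assumed MAXALIGN distance for the 64-bit system used in the test. */
#define MAXALIGN_BYTES 8

/* Round len up to the next multiple of align (align must be a power of two).
 * Hypothetical helper; PostgreSQL defines its own alignment macros. */
static size_t align_up(size_t len, size_t align)
{
    return (len + align - 1) & ~(align - 1);
}

/* Space consumed by one tuple, taken as the distance between the offsets
 * of two adjacent tuples (the offsets decrease as tuples are added). */
static size_t slot_size(size_t higher_off, size_t lower_off)
{
    return higher_off - lower_off;
}
```

For example, align_up(25, MAXALIGN_BYTES) gives 32 and align_up(33, MAXALIGN_BYTES) gives 40, which match slot_size(32736, 32704) and slot_size(32728, 32688) from the tables.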

From the documentation at: http://www.postgresql.org/docs/8.3/static/storage-page-layout.html
and from the comments in src/include/access/htup.h, I understand that the start of the user data
(indicated by t_hoff) must be a multiple of the MAXALIGN distance, but I did not find anything
suggesting the heap tuple itself had this requirement.

After a cursory glance at the HeapTupleHeaderData structure, it appears it could be aligned with
INTALIGN instead of MAXALIGN.  The one structure I was worried about was the 6-byte t_ctid
structure.  The comments in the src/include/storage/itemptr.h file indicate the ItemPointerData structure
is composed of 3 int16 fields.  So everything in the HeapTupleHeaderData structure is 32 bits or less.
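
To illustrate the alignment reasoning, here is a rough C sketch. The struct types below are simplified stand-ins written for this example (their names and the HeaderPrefixSketch layout are assumptions, not the PostgreSQL definitions); they only show that a 6-byte structure built from three 16-bit fields does not force more than 4-byte alignment on a header whose widest member is 32 bits. The compile-time checks should hold on common platforms, though struct layout is ultimately up to the compiler and ABI.

```c
#include <stdint.h>

/* Stand-in for a block id stored as two 16-bit halves. */
typedef struct {
    uint16_t bi_hi;
    uint16_t bi_lo;
} BlockIdSketch;

/* Stand-in for the 6-byte item pointer: three 16-bit fields in total. */
typedef struct {
    BlockIdSketch ip_blkid;
    uint16_t      ip_posid;
} ItemPointerSketch;

/* Hypothetical header prefix: a few 32-bit fields followed by the item
 * pointer and some smaller fields. No member is wider than 32 bits. */
typedef struct {
    uint32_t          field_a;
    uint32_t          field_b;
    uint32_t          field_c;
    ItemPointerSketch ctid;
    uint16_t          mask2;
    uint16_t          mask;
    uint8_t           hoff;
} HeaderPrefixSketch;

/* Checked at compile time: 6 bytes with 2-byte alignment for the item
 * pointer, and at most 4-byte alignment for the header as a whole. */
_Static_assert(sizeof(ItemPointerSketch) == 6, "item pointer is 6 bytes");
_Static_assert(_Alignof(ItemPointerSketch) == 2, "item pointer is 2-byte aligned");
_Static_assert(_Alignof(HeaderPrefixSketch) <= 4, "header needs at most 4-byte alignment");
```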

I am interested in attempting to generate a patch if this idea appears feasible.   For the data
set I am currently playing with, it would save over 3GB of disk space.  (Back of the envelope calculations
indicate that 5% of my current storage is consumed by this padding.   My tuple length is 44 bytes.)
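
As a sketch of the per-tuple arithmetic, the C snippet below compares the padded size under an 8-byte alignment with the padded size under a 4-byte alignment. The helper names are hypothetical, and the alignment values 8 (MAXALIGN) and 4 (INTALIGN) are assumptions about this 64-bit platform.

```c
#include <stddef.h>

/* Round len up to the next multiple of align (align must be a power of two). */
static size_t align_up(size_t len, size_t align)
{
    return (len + align - 1) & ~(align - 1);
}

/* Bytes of trailing padding saved per tuple if tuples were padded to
 * int_align instead of max_align. */
static size_t padding_saved(size_t tuple_len, size_t max_align, size_t int_align)
{
    return align_up(tuple_len, max_align) - align_up(tuple_len, int_align);
}
```

Under those assumptions, padding_saved(44, 8, 4) is 4: a 44-byte tuple occupies 48 bytes today and would occupy 44 bytes with INTALIGN padding. A tuple whose length is already a multiple of 8 would save nothing.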

Thanks,

- Ryan
