Tuesday, August 12, 2008

Re: [PERFORM] Using PK value as a String

You guys totally rock!

I guess, bottom line, we should take that extra day to convert our PK and FK to a numerical value, using BIG INT to be on the save side. (Even though Wikipedia's UserID uses just an integer as data type)

To Gregory: Thank you for you valuable statement.
"But the real question here is what's the better design. If you use Username you'll be cursing if you ever want to provide a facility to allow people to change their usernames. You may not want such a facility now but one day" I think you hit the nail on the head with this comment. If a member really wants to change their username, IE: Choose to go with IloveUSara, only to be dumped on the alter, who am I to say no.

To Valentin: I wish someone would prove us both wrong or right. I still thought it wasn't a bad idea to use username a varchar(256) to interact with all the modules... Well thats what I thought when I first started writing the tables...

To Jay: Thanks for keeping it short and simple. "I'd like sequence, which is unique and just use 8 bytes(bigint) When it querying other tables, it will faster , and disk space smaller than UUID(40 bytes)." I'm taking your advice on this^^ Although wikipedia's postgresql database schema still stands.

To Craig: Yes, I agree. Please see my comment on IloveUSara.

To Mario: Let's go! I'm Mario... Sorry, I love Mario Kart. Especially on the old super famacon. Going with int8, thank you for the advice.


On Tue, Aug 12, 2008 at 6:58 PM, Mario Weilguni <mweilguni@sime.com> wrote:
Valentin Bogdanov schrieb:

--- On Mon, 11/8/08, Gregory Stark <stark@enterprisedb.com> wrote:

 
From: Gregory Stark <stark@enterprisedb.com>
Subject: Re: [PERFORM] Using PK value as a String
To: "Jay" <arrival123@gmail.com>
Cc: pgsql-performance@postgresql.org
Date: Monday, 11 August, 2008, 10:30 AM
"Jay" <arrival123@gmail.com> writes:

   
I have a table named table_Users:

CREATE TABLE table_Users (
  UserID       character(40)  NOT NULL default
     
'',
   
  Username   varchar(256)  NOT NULL default
     
'',
   
  Email          varchar(256) NOT NULL default
     
''
   
  etc...
);

The UserID is a character(40) and is generated using
     
UUID function. We
   
started making making other tables and ended up not
     
really using
   
UserID, but instead using Username as the unique
     
identifier for the
   
other tables. Now, we pass and insert the Username to
     
for discussions,
   
wikis, etc, for all the modules we have developed. I
     
was wondering if
   
it would be a performance improvement to use the 40
     
Character UserID
   
instead of Username when querying the other tables, or
     
if we should
   
change the UserID to a serial value and use that to
     
query the other
   
tables. Or just keep the way things are because it
     
doesn't really make
   
much a difference.
     
Username would not be any slower than UserID unless you
have a lot of
usernames longer than 40 characters.

However making UserID an integer would be quite a bit more
efficient. It would
take 4 bytes instead of as the length of the Username which
adds up when it's
in all your other tables... Also internationalized text
collations are quite a
bit more expensive than a simple integer comparison.

But the real question here is what's the better design.
If you use Username
you'll be cursing if you ever want to provide a
facility to allow people to
change their usernames. You may not want such a facility
now but one day...

   

I don't understand Gregory's suggestion about the design. I thought using natural primary keys as opposed to surrogate ones is a better design strategy, even when it comes to performance considerations and even more so if there are complex relationships within the database.

Regards,
Valentin

 
UUID is already a surrogate key not a natural key, in no aspect better than a numeric key, just taking a lot more space.

So why not use int4/int8?






--
Regards,
Jay Kang


This e-mail is intended only for the proper person to whom it is addressed and may contain legally privileged and/or confidential information. If you received this communication erroneously, please notify me by reply e-mail, delete this e-mail and all your copies of this e-mail and do not review, disseminate, redistribute, make other use of, rely upon, or copy this communication. Thank you.

No comments: