When a Guid is not a Guid?
Databases are fun things. Assigning identifiers to things in your database can be important. Sometimes it is handy to have an Uber alternate key that is guaranteed globally unique. This can be useful in many ways one of which is database replication.
Globally Unique Identifiers
I just ran into a spot in a legacy database where it seems they wanted to future proof globally identifying database rows. They have quite a few tables with a field called guid. That might ring a bell. In case it doesn’t GUIDs are a Microsoft implementation of globally unique identifiers. Easy to create. 16 bytes wide and lovely to look at.
string guid = Guid.NewGuid().ToString(); value: ba72c396-4b49-4d95-b9d7-03d42d5d5141
Funny thing about this “guid” database field is that it’s a Unicode string 31 characters wide. You might notice the Guid value above is 36 characters. Ok, Get rid of the dashes and it is 32 characters.
Crap still too big. Wait, I need to shove this hex encoded value into a 31 character Unicode field?
Sigh, There is a lot wrong with that statement. Something smells wrong. Good thing you can base64 encode the raw bytes of a Guid into 24 characters.
string base64Guid = Convert.ToBase64String(Guid.NewGuid().ToByteArray()); value: JD/V9w5370OzsVK0RQ4u3A==
Wait a second
But still that’s fishy why do I have so much spare room left over. What are they getting at? I started looking at what values were actually in these guid fields and found a surprise.
What the heck?
- Oh look, 12345678 is part of the database identifier for the row.
- Oh look, 0123 is the id for the database table (long story.)
- Is that a company name in there?
Why are they manually creating “globally unique identifiers” with what is technically a compound key. Sure they are technically globally unique in this database but collisions could easily occur with other databases (for the same company).
So much wrong with this.
To set the “guid” field you have to know the database row’s identifier. Which means you need to insert the row. Get the identifier back create the faux guid and then update the row again. Yuck, that is a pretty chatty create mechanism.
Worse yet. What if someone couples to the information encoded into the faux guids?
Do I need to replicate their broken technique? Nah, I’ll just shove Base64 encoded Guids in there.