Portable Sql Data Types

(Part of PortableSql)

SQL data types vary widely. Some issues involving data types:

booleans (bits)

Access uses -1 to represent true. Most others use 1.

integers

The ranges of integer types vary a lot. For example, Oracle doesn't have the usual 8- 16- and 32-bit integers. Everything gets turned into a numeric.

When deciding on a range, it's tempting to just use numeric types everywhere so you can't run out no matter how big the database grows. However, keep in mind that clients might be using 32-bit ints. If you store a larger number than that in the database, the client will break.

char and varchar

One would have hoped that by now we wouldn't have to worry about string lengths anymore. It's tempting to use varchar(255) for all "short" strings but there are other limitations. (In MySQL, the total length of all the columns in an index is 255 characters.) A varchar(80) might be a better default.

Many databases treat the empty string ('') as a special case. With Oracle, an empty string becomes a NULL when stored in a table. With some other databases, the empty string becomes a single space (' ').

For long strings you need to use blobs. Some vendors support things like LargeText?.

You can't have characters without character sets. By default you'll probably get ISO-Latin-1. (And by default, Microsoft users will be allowed to insert non-portable characters like the copyright symbol.) And what about internationalization?

Some databases now have UniCode support. It can be with special types (like "wchar, wvarchar"). PostgreSQL lets you choose the character encoding for the entire database.

Case-insensitive string matching is different for different databases.

dates, times, and timestamps

Date formats can vary widely. Go slow here. Sometimes I just use a string to make dates more portable. A format of 'YyyyMmDd' will create a natural sort and comparable sequence. Some vendors support times as part of dates, some don't.

Proposal: Lets represent a date as an integer, and time as its fractional part. In SanDiego, as I write this it's 20050429.2103 -- Time is relative, you can see that I'm writing at 9pm somewhere. If we wanted to know about the most recent earthquake to the millisecond of it's occurrence, we'd be using GMT anyway. Indonesia: 20041226.005853000000, the deadliest disaster in modern history --http://en.wikipedia.org/wiki/2004_Indian_Ocean_earthquake

We're all on the Internet now, so why don't databases have any functions for dealing with time zones? The now() function typically returns the time in the database's local time zone.

Not to mention that now() isn't even a standard function...

longvarchar, longvarbinary, blobs

(Blob="Binary Large Object", which is really a file.)

The Oracle JDBC driver maps longvarchar and longvarbinary to the older text and binary datatypes, not the blob and clob datatypes, so you can't have more than one longvarchar or longvarbinary in a table.


EditText of this page (last edited April 16, 2013) or FindPage with title or text search