Search Java Programs

Friday, February 26, 2010

Understanding the SQL BIT Data Type

When you are working with data that can take on only one of two values, use the BIT data type. For example, you can use BIT fields to store the answers to yes/no or true/false survey questions such as: these "Are you a homeowner?" "Are you married?" "Did you complete high school?" "Do you love SQL?"

You can store answers to yes/no and true/false questions in CHARACTER columns using the letters Y, N, T, and F. However, if you use a column of type CHARACTER, each data value will take 1 byte (8 bits) of storage space. If you use a BIT column instead, you can store the same amount of data using 1/8 the space.

Suppose, for example, that you create a CUSTOMER table using the SQL statement:

 CREATE TABLE customer
(id INTEGER,
name VARCHAR(25),
high_school_graduate BIT,
some_college BIT,
graduate_school BIT,
post_graduate_work BIT,
male BIT,
married BIT,
homeowner BIT,
US_citizen BIT)

If you follow normal conventions, a 1 in a BIT column would represent TRUE, and a 0 would represent FALSE. Thus, if the value of the MARRIED column were 1, that would mean that the CUSTOMER is married. Similarly, if the value in the US_CITIZEN column were 0, that would mean that the CUSTOMER is not a U.S. citizen.

Using the BIT data type instead of a CHARACTER data type for the eight two-state (BIT) columns in the current example not only saves 56 bytes of storage space per row, but it also simplifies queries based on the two-state column values.

Suppose, for example, that you wanted a list of all male customers. If the MALE column were of type CHARACTER, you would have to know whether the column would contain a T, t, Y, y, or some other value to indicate that the CUSTOMER is a male. When the column is a BIT column, you know that the value in the male column can only be a 1 or a 0-and will most likely be a 1 if the CUSTOMER is a male, since a 1 would, by convention, indicate TRUE.

You can use a BIT column to select rows that meet a specific condition by checking the value of the column in the WHERE clause of your SQL statement. For example, you could make a list of all customers that are high school graduates using the SQL SELECT statement:

 SELECT id, name
FROM customer
WHERE high_school_graduate = 1

Selecting rows that meet any one of several criteria is also easy. Suppose, for example, that you want a list of all customers that are either married or homeowners. You could use the SQL SELECT statement:

 SELECT id, name
FROM customer
WHERE married = 1 OR homeowner = 1

If, on the other hand you want to select only married homeowners, you would use an AND in place of the OR in the WHERE clause.

Understanding Standard SQL Datetime Data Types and the DATETIME Data Type

Although you can store dates and times in columns of type CHAR or VARCHAR, you will find it more convenient to use datetime columns instead. If you put dates and times into date-time columns, the DBMS will format the dates and times in a standard way when you display the contents of the columns as part of SELECT statements. More importantly, by using datetime columns, you will be able to use specialized date and time functions (such as INTERVAL and EXTRACT) to manipulate date and time data.

The SQL-92 standard specifies five datetime data types:

  • DATE. Uses 10 characters to store the four-digit year, two-digit month, and two-digit day values of the date in the format 2000-04-25. Because the DATE data type uses a four-digit year, you can use it to represent any date from the year 0001 through the year 9999. Thus, SQL will have a year 10K problem, but I, for one, will let future generations worry about it.

  • TIME. Uses eight characters, including the colons, to represent the two-digit hours, two-digits minutes, and two-digit seconds in the format 19:22:34. Because the SQL formats time using the 24-hour clock, 19:22:34 represent 22 minutes and 34 seconds past 7 P.M., whereas 07:22:34 represents 22 minutes and 34 seconds past 7 A.M. If you define a column as type TIME, the default is for the DBMS to display only whole seconds. However, you can tell the DBMS to store (and display) fractions of seconds by adding the precision you want to the TIME data type when using it to define a column of type TIME. For example, if you create a table with the SQL statement

    CREATE TABLE time_table
    (time_with_seconds TIME(3))

    the DBMS will store time data including up to three digits representing thousandths of seconds.

  • TIMESTAMP. Includes both date and time using 26 characters-10 characters to hold the date, followed by a space for separation, and then 15 characters to represent the time, including a default of fractions of seconds to six decimal places. Thus, if you create a table using

    CREATE TABLE time_table
    (timestamp_column TIMESTAMP,
    timestamp_column_no_decimal TIMESTAMP (0))

    the DBMS will store the date and time in TIMESTAMP_COLUMN formatted as 2000-04-25 19:22:34.123456, and the date and time in TIMESTAMP_COLUMN_NO_DECIMAL formatted as 2000-04-25 19:25:34. (The number in parenthesis ( () ) after TIMESTAMP specifies the precision of the fractions of seconds portion of the time-0, in the example.)

  • TIME WITH TIME ZONE. Uses 14 characters to represent the time and the offset from Universal Coordinated Time (UTC)-eight characters to hold the time followed by the offset of the local time from (UTC)-formerly known as Greenwich Mean Time or GMT. Therefore, if you create a table using

    CREATE TABLE time_table
    (time_with_gmt TIME WITH TIME ZONE,
    time_with_seconds_gmt TIME (4) WITH TIME ZONE)

    the DBMS will store the time in TIME_WITH_GMT formatted as 19:22:24-05:00, and in TIME_WITH_SECONDS_GMT formatted as 19:22:24.1234-05:00. (The (4) in the data type for the TIME_WITH_SECONDS_GMT column in the example represents the optional precision you can specify to represent the fractions of seconds in the time.)

  • TIMESTAMP WITH TIME ZONE. Uses 32 characters to represent the date, the time, and the offset from Universal Coordinated Time (UTC)-10 characters to hold the date, followed by a space for separation, and then 21 characters to represent the time, including a default of fractions of seconds given to six decimal places and the office from UTC (GMT). Thus, if you create a table using

    CREATE TABLE time_table
    (timestamp_column TIMESTAMP WITH TIME ZONE,
    timestamp_no_dec TIMESTAMP(0)WITH TIME ZONE)

    the DBMS stores the date and time in TIMESTAMP_COLUMN formatted as 2000-04-25 19:22:34.123456+04:00 and in TIMESTAMP_NO_DEC using the format 2000-04-25 19:25:34+01:00 (The number in parenthesis ( () ) after TIMESTAMP specifies the precision of the fractions of seconds portion of the time-0, in the example.)

Unfortunately, not all DBMS products support all five of the standard SQL datetime data types. In fact, some DBMS products even use TIMESTAMP for purposes other than defining columns that hold date and time data. As such, check your system manual to see which of the SQL datetime data types your DBMS supports.

Don't be surprised to find that your system uses a nonstandard data type such as DATETIME (used by SQLBase, Sybase, and MS-SQL Server) to format columns that will hold dates and times.

If your system uses the DATETIME data type, you can define a column to hold date and time using an SQL statement similar to:

CREATE TABLE date_table
(date_time DATETIME)

To insert a date and time into a DATETIME column, enclose the date and time in single quotes using an INSERT statement similar to:

INSERT INTO date_table
VALUES ('04/25/2000 21:05:06:123')

If you are using MS-SQL Server and execute the SQL statement

SELECT * FROM date_table

the DBMS will display the value in the DATE_TIME column as: 2000-04-25 21:05:06.123.

MS-SQL Server lets you specify the date in the INSERT statement using any one of a variety of formats, including but not limited to:

  • Apr 25 2000

  • APR 25 2000

  • April 25, 2000

  • 25 April 2000

  • 2000 April 25

  • 4/25/00

  • 4-25-2000

  • 4.25.2000

MS-SQL Server also gives you a number of ways to express the time you want to insert into a DATETIME column. Valid ways to express time include:

  • 9:05:06:123pm

  • 9:5:6:123pm

  • 9:05pm

  • 21:00

  • 9pm

  • 9PM

  • 9:05

(Note that the last entry in this example ["9:05"] will insert 9:05am and not 9:05pm.) If you insert a date without a time, MS-SQL Server will append 00:00:00:000 to your date. Thus, the SQL statement

 INSERT INTO date_table VALUES ("2000 Apr 25")

will set the value of DATE_TIME to 2000-04-10 00:00:00.000. (MS-SQL Server will replace the portion of the time you leave off with zeroes.)

If you insert only a time into a DATETIME column, MS-SQL Server will replace the omitted date with 01/01/1900.

Understanding the MS-SQL Server IMAGE Data Type

The MS-SQL Server IMAGE data type is similar to the TEXT data type in that it you can store 2,147,483,647 bytes of data in a column declared as data type IMAGE. You would use an image type, for example, if you wanted to create a table of graphics images such as:

CREATE TABLE graphic_images
(id INTEGER,
description VARCHAR(250),
picture IMAGE)

Typically, you won't use an INSERT statement to enter binary data into an IMAGE column. Instead, you will use an application program that passes the binary (picture) data to the DBMS for storage in the table.

Similarly, an IMAGE column is not meant for direct output using a SELECT statement, although such a SELECT statement is not prohibited. Instead, you would have the DBMS pass the image data to a graphics program (like WinJPeg) or to a Web browser for display.

If you do display an IMAGE column using the SELECT statement, you will find that the SELECT statement does not translate the values in the IMAGE column to ASCII. For example, suppose that you use the INSERT statement

 INSERT INTO graphic_images
VALUES (123,'Picture 123','Picture'123')

to place data into a row in the GRAPHICS_IMAGES table created as the example at the beginning of this tip. If you use the SELECT statement

 SELECT * FROM graphic_images

MS-SQL Server would display:

id        description    picture
----------------------------------------------------
123 Picturel23 0x50696374757265313233

By not translating the hexadecimal representation of data in the IMAGE column to ASCII when SELECTED, the DBMS makes it easy to pass the actual "raw" picture file to a graphics program in answer to a query sent to the DBMS by an application program.

Website Design by Mayuri Multimedia