SQL: Structured Query Language
Structured Query Language (SQL) is the industry standard language for communicating with Relational Database Management Systems. SQL is used to look up data in relational tables, create tables, insert data into tables, grant permissions and many other things.
Structured Query Language is used to define the answer set that is returned from the
RDBMS. SQL is a non-procedural, set-oriented language, meaning it does not rely on procedural-type statements such as those listed here:
• GO TO
• OPEN FILE
• CLOSE FILE
• END OF FILE
To offer an 'ANSI-standard' product, all vendors of SQL must become certified in ANSI (American National Standards Institute) standards by the NIST (National Institute of
Standards and Technology), a government certification agency.
There are three ANSI standards:
• ANSI SQL-89 (SQL 1)
• ANSI SQL-92 (SQL 2)
• ANSI SQL-99 (SQL 3)
Teradata SQL is an ANSI compliant product. Teradata has its own extensions to the language, as do most vendors. Teradata SQL is fully certified at the SQL92 Entry level, with some intermediate, some full and some SQL-99 Core features also implemented.
ANSI vs. Teradata SQL
Because ANSI standard features in many cases duplicate existing Teradata features, many of those features are implemented both using the original Teradata syntax as well as the ANSI standard syntax, thus giving the user a choice. When new features are added to Teradata SQL, they are always implemented using ANSI SQL-99 standards. It is generally advisable to code new applications using ANSI standard syntax whenever possible.
As the Teradata Database evolves, each major release is expected to move Teradata
SQL closer to conformance with the SQL-99 core level standard.
1.) If a feature or features are added - they will be implemented according to the ANSI standard.
2.) For features already implemented, if the differences between the standard and the Teradata dialect are slight, then both will be implemented.
3.) If the differences are major, then the user has the option of operating in either the Teradata or ANSI mode. This choice may be specified as a system default or may be made at the session level.
4.) If new features are added that are not covered by the standard, then we may broadly comply or improve upon another vendor's implementation.
5.) If there is neither a standard, or another vendor precedent available, we will implement in a way that will be as clean and generic as possible without violating any of the basic tenets of the ANSI SQL2 standard.
SQL Commands
SQL statements commonly are divided into three categories:
Data Definition Language (DDL) - Used to define and create database objects such as tables, views, macros, databases, and users.
- Create
- Drop
- Alter
Data Manipulation Language (DML) - Used to work with the data, including such tasks as inserting data rows into a table, updating an existing row, or performing queries on the data. The focal point of this course will be on SQL statements in this category.
- Select
- Insert
- Update
- Delete
Data Control Language (DCL) - Used for administrative tasks such as granting and revoking privileges to database objects or controlling ownership of those objects. DCL statements will not be covered in detail in this course. For complete Data Control Language coverage, please see the NCR Customer Education "Teradata Database Administration" course at www.TeradataEducationNetwork.com.
- Grant
- Revoke
- Give
Table Relationships
Business data in a relational database is arranged in multiple tables related to one another by some of the columns in each table. Primary and foreign keys define these relationships.
Each primary key value uniquely identifies a row in the table
Each foreign key value references a primary key value found elsewhere.
When users pose business questions to a relational database, they often access multiple tables via these related columns.
A Simple SQL SELECT
The SELECT statement allows you to retrieve data from one or more tables.
For example, you can specify certain columns of data to be returned from a single table. Only those columns of data will be returned for all rows in the table.
To retrieve all values in a single column from a single table, use the SELECT statement in its simplest form:
Select <column_name>
from <table_name>
SELECT All Columns and All Rows
You can use an asterisk “*” to retrieve all columns of data associated with the specified table.
To retrieve all columns for every entry from a single table, use this variation of the SELECT statement with the “*” (wildcard column reference):
Select * from <table_name>
SELECT All Columns and All Rows
Display all columns of information
Select * from employee
WHERE Clause
Adding a WHERE clause to the SELECT statement restricts the response set to rows that meet a specified criteria.
To retrieve selected columns for only those rows that meet some criteria from a single table, use a SELECT statement that contains a WHERE clause:
Select <column_name> <,column name> from <table name>
Where <condition>
ORDER BY Clause
Use the ORDER BY clause to have your results displayed in a sorted order. Without the ORDER BY clause, resulting output rows are displayed in a random sequence.
To retrieve selected columns from a single table, arranging the entries in some order, use a SELECT statement that contains an ORDER BY clause:
SELECT <column_name> <, column_name> FROM <table_name>
WHERE <condition> ORDER BY <column_name>
;
Example
For each member of Department number 401, obtain the employee number, date hired, first and last name. Arrange these results in ascending order of date hired.
Sort Direction
In the facing page example, results will be returned in ascending order by hire date. Ascending order is the default sort sequence for an ORDER BY clause. To explicitly specify ascending or descending order, add ASC or DESC, to the end of the ORDER BY clause. The following is an example of a sort using descending sequence.
ORDER BY hire_date DESC; (descending sort)
Naming the Sort Column
You may indicate the sort column by naming it directly (e.g., hire_date) or by specifying its position within the SELECT statement. Since hire_date is the fourth column in the SELECT statement, the following ORDER BY clause is equivalent to saying ORDER BY hire_date..
ORDER BY 4;
The ORDER BY clause specifies the column(s) to be used for sorting the result.
Select employee number
Multiple ORDER BY Columns
An ORDER BY clause may specify multiple columns. No single column in an ORDER BY clause should exceed a length of 4096 bytes, otherwise it will be truncated for sorting purposes.
The order in which columns are listed in the ORDER BY clause is significant. The column named first is the major sort column. The second and subsequent are minor sort columns. In the example seen here, results are sorted by department number in ascending order.
NOTE: Each column specified in the ORDER BY clause can have its own sort order, either ascending or descending. The following example would display descending job codes within ascending department numbers.
Select employee_number, depart_number, job_code from employee where department_number < 302 Orderby department_number ASC job_code Desc.
DISTINCT Option
The DISTINCT operator will consolidate duplicate output rows to a single occurrence. The DISTINCT option also adds an implicit ORDER BY ASC for all columns selected.
To eliminate duplicate values from retrieved data, use this variation of the SELECT statement, which contains the DISTINCT option:
SELECT DISTINCT <column_name> <, column_name>
FROM <table_name>
WHERE <condition>
In the example seen here, two people in department 501 have the same job code (512101). If our purpose is simply to find out which job codes exist in department 501, we should use DISTINCT to avoid seeing duplicate rows.
DISTINCT appears directly after SELECT, and before the first named column. It may appear to apply only to the first column, but in fact, DISTINCT applies to all columns named in the query. Two rows in our result set both have department_number 501. The combination of department_number and job_code are distinct since the job codes differ.
Naming Rules—Teradata Extensions
Case Sensitivity
ANSI compliant object names and keywords use only upper case letters.
The Teradata Database accepts mixed case object names and keywords, and is not case sensitive.
All of the following examples are valid in Teradata: Select field1 from table1;
SELECT field1 FROM table1; select Field1 From tABLE1;
Only the following example is valid in ANSI:
SELECT FIELD1 FROM TABLE1;
Teradata Extensions
In addition to mixed case object names and keywords, Teradata offers several other extensions to the ANSI naming rules. These are summarized in the matrix below.
Differences: ANSI vs. Teradata SQL
Creating Object Names:
|
ANSI
|
Teradata SQL
|
Legal Characters
|
A-Z, 0-9, _
|
Same as ANSI,
|
(underscore)
|
plus: a-z, #, $
| |
First character
|
A-Z
|
Any except 0-9
|
Last character
|
Can’t be _
|
Any
|
(Underscore)
| ||
Length
|
18 characters
|
30 characters
|
Case sensitivity
|
N/A
|
No
|
Naming Rules—Teradata Extensions
Teradata naming rules may be used as follows on all Teradata database objects:
Names are composed of the following characters: A - Z (Upper or lowercase)
0 - 9 #, $, _
Names are limited to 30 characters.
Examples of named objects : Account_Table
Financials_2001_DB
Sales $ Column
#_of_Years_Column
Naming Requirements and Qualifications
Uniqueness
Names for a database object (database, user, table, view, macro, join index, trigger, stored procedure) must adhere to these rules:
• Database names and user names must be unique within the Teradata Database.
• TABLE, VIEW, and MACRO names must be unique within a database.
• Column names must be unique within a table.
Qualifying a Name
The syntax for qualifying a name is:
Recommended Coding Conventions
SQL is considered a 'free form' language, that is, it can cover multiple lines of code and there is no restriction on how much 'white space' (i.e., blanks, tabs, carriage returns) may be embedded in the query. Having said that, SQL is syntactically a very precise language. Misplaced commas, periods and parenthesis will always generate a syntax error.
The following coding conventions are recommended because they make it easier to read, create, and change SQL statements.
Recommended Practice
Select last_name,first_name from employee Where department_number = 401 Order by last_name
Not-Recommended Practice
select last_name, first_name, hire_date, salary_amount from employee where department_number = 401 order by last_name;
The first example is easy to read and troubleshoot (if necessary). The second example appears to be a jumble of words. Both, however, are valid SQL statements.
Default Database
Setting the Default Database
As a valid user, you will normally have access rights to your own user database and the objects it contains. You may also have permission to access objects in other databases.
The user name you logon with is usually your default database. (This depends on how you were created as a user.)
For example, if you log on as:
.logon johnc;
password: xyz
then ”johnc" is normally your default database.
Note the dot (.) before "logon". Commands that begin with a dot are BTEQ commands, not SQL commands. The BTEQ command processor will read this command; the SQL processor will not see it.
Queries you make that do not specify the database name will be made against your default database.
A Database may contain:
• Tables – contain data.
• Views – are previously defined “windows” to the data.
• Macros – are stored SQL statements.
• Triggers – are SQL statements associated with a table.
• Join Indexes – are user defined pre-joins or pre-aggregations.
• Stored Procedures – are pre-written SQL scripts with defined procedural processing using Stored Procedure control statements.
The SQL Parser looks in your default database for the objects you refer to in your SQL statement.
The normal default database for this logon is “johnc”.
Your user profile may be modified to use a different default database.
Changing the Default Database
Changing the Default Database
The DATABASE command is used to change your default database. For example,
DATABASE payroll;
sets your default database to payroll. Subsequent queries (assuming the proper privileges are held) are made against the payroll database.
Changing the Default Database
Once logged on, you can change the default database by using the DATABASE command.
Example:
DATABASE payroll;
The default database is “payroll” until either:
• User logs off
• A new Database command is issued
You can override a default DATABASE in a single SQL statement by explicitly specifying the database:
Select * from payroll, employee
System Built-in Functions
The following built-in functions are available for use in creating SQL queries.
SESSION - contains the session-id of the requesting session/user.
DATABASE - contains the current default database of the requesting session/user.
ACCOUNT - contains the user account info of the requesting session/user.
USER - contains the user name associated with this session.
Each is defined internally as a VARCHAR(30) data type. Data types are discussed in a forthcoming chapter.
Example: Return the current session-id, default database, the account string and the current user.
SELECT SESSION, DATABASE, ACCOUNT, USER;
Session
|
Database
|
Account
|
User
|
-------
|
--------
|
-------
|
------------
|
1251
|
PED1
|
DBC
|
SQL00
|
Using Built-in Functions
The default database setting may change during the course of a session. SELECT DATABASE; /* Show the current database for this user session*/
Database
------------------------------
PED1
DATABASE cs_views; /* Change the default database for this user session */ SELECT DATABASE; /* Show the new current database for this user session*/
Database
------------------------------
CS_VIEWS
Using Built-in Functions
Built-in functions are useful as guideposts during a session. The default database setting may change during the course of a session.
Show the current database for this user session
SELECT DATABASE;
Database
----------------------
PED1
Change the default database for this user session
DATABASE cs_views
Show the new current database for this user session
Select DATABASE;
Database
-----------------------
CS_VIEWS
Review Questions
1. Teradata conforms to the ANSI SQL-92 standard at the ______
______ .
2. The three types of SQL statements are ________ , _______, and
__________.
3. The real world values on which data values in a
column or columns must be based is called a ________ .
4. Tables relate to each other using the columns designated as
________ _______ and _________ ___________.
5. To restrict the rows which are returned by a SELECT
statement, the _______ clause is used.
6.
The ordering of
data is controlled by the ORDER BY clause and the default sequence is ________.
7. Duplicate answer rows may be reduced to a single row
by using the ________ clause.
8. The correct sequence for full qualification of a table’s column is
______, followed by _____, followed by ______.
9.
The database that
is initially assigned to your session is referred to as the _______ database.