Skip to main content

Primary and Surrogate Keys

Updated Oct 05, 2019 ·

Primary Keys

Primary keys are essential in database design. They ensure each record in a table is unique and can be easily referenced by other tables.

  • Primary keys uniquely identify records.
  • They must be on columns with unique, non-null values.
  • They are consistent for current and future data.

Specifying Primary Keys

Defining primary keys during table creation is straightforward. You can specify a primary key for one or more columns.

  • Primary keys can be specified at table creation.
  • Multiple columns can form a single primary key.
  • Use minimal columns for efficiency.

Simply add the PRIMARY KEY command after specifying the field during table creation.

CREATE TABLE table_name (
a integer PRIMARY KEY,
b numeric
c text,
d text,
e text
);

If you want to specify more than one column as primary keys, you can write them this way. Note that regardless of how many columns field inside the PRIMARY KEY function, they are still considered as one combination, thus one primary key only

CREATE TABLE table_name (
a integer,
b numeric
c char(16),
d text,
e text,
PRIMARY KEY (a, b)
);

Adding Primary Keys

Adding primary key constraints to existing tables follows a similar process as adding unique constraints. You need to assign a name to the primary key constraint, just as with unique constraints.

  • Use a naming convention for primary key constraints.
  • The process is similar to adding unique constraints.

Syntax:

ALTER TABLE table_name
ADD CONSTRAINT some_name PRIMARY KEY (column_name);

Surrogate Keys

Surrogate keys are artificial primary keys that aren't based on existing data columns. They exist to provide a consistent, unchanging identifier for records which are independent of any natural data attributes.

  • Surrogate keys minimize the number of columns in a primary key.
  • They provide a stable identifier that doesn’t change over time.

Reasons for Surrogate Keys

Ideally, a primary key should consist of as few columns as possible and remain constant over time. By using an artificial key, you ensure the identifier stays the same even if other attributes change.

  • Surrogate keys ensure consistency over time.
  • They simplify primary key construction.

Example: Surrogate Keys

Below table listing six different cars. Since you only "make" and "model" columns are the only combination that makes the records unique, you'd need to use both for the primary key, which isn't ideal. When only multiple columns qualify, consider a surrogate key.

ModelMakeColor
CivicHondaBlue
AccordHondaRed
Model STeslaBlack
MustangFordYellow
CorollaToyotaWhite
CorollaToyotaWhite
CamryToyotaSilver
ExplorerFordRed
AltimaNissanBlue
RogueNissanBlack
A4AudiYellow
A4AudiYellow

If you want to create this table:

CREATE TABLE cars (
make VARCHAR(64) NOT NULL,
model VARCHAR(64) NOT NULL,
color VARCHAR(32) NOT NULL
);

SELECT * FROM cars;

-- add the records
INSERT INTO cars (model, make, color) VALUES
('Civic', 'Honda', 'Blue'),
('Civic', 'Honda', 'Red'),
('Accord', 'Honda', 'Green'),
('Model S', 'Tesla', 'Black'),
('Mustang', 'Ford', 'Yellow'),
('Corolla', 'Toyota', 'White'),
('Corolla', 'Toyota', 'Gray'),
('Camry', 'Toyota', 'Silver'),
('Explorer', 'Ford', 'Orange'),
('Altima', 'Nissan', 'Blue'),
('Rogue', 'Nissan', 'Red'),
('A4', 'Audi', 'White');

SELECT * FROM cars;

Adding a Surrogate Key

To address the issue on the table above, add a surrogate key column like "id." PostgreSQL's serial data type automatically generates unique, auto-incrementing numbers.

  • Add "id" columns using the "serial" type for auto-increment.
  • The primary key constraint prevents duplicate IDs.
  • This ensures each record has a unique identifier.

To do this:

ALTER TABLE cars
ADD COLUMN id serial PRIMARY KEY;

SELECT * FROM cars;
ModelMakeColorid
CivicHondaBlue1
AccordHondaRed2
Model STeslaBlack3
MustangFordYellow4
CorollaToyotaWhite5
CorollaToyotaWhite6
CamryToyotaSilver7
ExplorerFordRed8
AltimaNissanBlue9
RogueNissanBlack10
A4AudiYellow11
A4AudiYellow12

Now if we want to add new records, they will automatically get an id that doesn't exist in the table yet.

INSERT INTO cars (model, make, color) VALUES ('Civic', 'Honda', 'Green');
INSERT INTO cars (model, make, color) VALUES ('Civic', 'Honda', 'Orange');
INSERT INTO cars (model, make, color) VALUES ('Sentra', 'Nissan', 'Orange');

SELECT * FROM cars;
ModelMakeColorid
CivicHondaBlue1
AccordHondaRed2
Model STeslaBlack3
MustangFordYellow4
CorollaToyotaWhite5
CorollaToyotaWhite6
CamryToyotaSilver7
ExplorerFordRed8
AltimaNissanBlue9
RogueNissanBlack10
A4AudiYellow11
A4AudiYellow12
CivicHondaGreen13
CivicHondaOrange14
SentraNissanOrange15

Adding an Existing Surrogate Key

If we try to add a new record and reuse an existing surrogate key, like the id from the table above, it will return an error:

INSERT INTO cars
VALUES ('Porsche', 'Porsche 911 Carrera', 'Guards Red', 1);

Combining Columns

Another approach is to combine two existing columns into one. First, add a new column with the varchar type, then update it with a concatenation of existing columns using the CONCAT function. This new column can become the surrogate primary key.

ALTER TABLE table_name
ADD COLUMN column_c varchar(256);

-- combine the new two existing columns and add to the new column
UPDATE table_name
SET column_c = CONCAT(column_a, column_b);

-- turn new column into surrogate primary key
ALTER TABLE table_name
ADD CONSTRAINT prime_key PRIMARY KEY (column_c);

Example: Combining Columns

To test this, we can drop the id column for now so that it returns to the original table:

ALTER TABLE cars
DROP COLUMN id;

SELECT * FROM cars;
ModelMakeColor
CivicHondaBlue
AccordHondaRed
Model STeslaBlack
MustangFordYellow
CorollaToyotaWhite
CorollaToyotaWhite
CamryToyotaSilver
ExplorerFordRed
AltimaNissanBlue
RogueNissanBlack
A4AudiYellow
A4AudiYellow
CivicHondaGreen
CivicHondaOrange
SentraNissanOrange

Now recreate the id column but this time set it to the combination of both the make and model.

-- Add the id column
ALTER TABLE cars
ADD COLUMN id varchar(128);

-- Update id with make + model
UPDATE cars
SET id = CONCAT(make, model);

SELECT * FROM cars;
ModelMakeColorID
CivicHondaBlueHondaCivic
CivicHondaRedHondaCivic
AccordHondaGreenHondaAccord
Model STeslaBlackTeslaModel S
MustangFordYellowFordMustang
CorollaToyotaWhiteToyotaCorolla
CorollaToyotaGrayToyotaCorolla
CamryToyotaSilverToyotaCamry
ExplorerFordOrangeFordExplorer
AltimaNissanBlueNissanAltima
RogueNissanRedNissanRogue
A4AudiWhiteAudiA4
CivicHondaGreenHondaCivic
CivicHondaOrangeHondaCivic
SentraNissanOrangeNissanSentra

Now if we try to set the id as the primary key, we will get an error message.

ALTER TABLE cars
ADD CONSTRAINT id_pk PRIMARY KEY(id);

SELECT * FROM cars;

From this we can see that combining the make and model, and using it as id is not a good idea since we can have records with the same make and model.