Design tables that avoid duplicated, inconsistent data — see the anomalies a flat table creates, reach first, second, and third normal form by splitting into related tables linked by foreign keys, and know when to denormalize on purpose.
Why: normalization is organizing columns into tables so each fact is stored once. The opposite — one wide table — duplicates data and breeds "anomalies": change a customer email and you must fix every order row or they disagree; you cannot add a customer before their first order; deleting their last order erases the customer entirely. Storing each fact once removes all three.
-- Denormalized: customer details repeat on every order row.
-- orders(id, customer_name, customer_email, product, qty)
-- 1, 'Ada', 'ada@x.com', 'Keyboard', 1
-- 2, 'Ada', 'ada@x.com', 'Mouse', 2 <- email duplicated
-- Change Ada's email and you must update EVERY one of her rows.Why: 1NF means every column holds a single, atomic value — no comma-separated lists, no repeating columns like product1, product2. A row with "Keyboard, Mouse" jammed into one column cannot be filtered, joined, or counted properly. Give each value its own row instead.
-- NOT 1NF: a list crammed into one column
-- order_products(order_id, products) -> (1, 'Keyboard, Mouse')
-- 1NF: one product per row, each value atomic
CREATE TABLE order_items (
order_id bigint NOT NULL,
product text NOT NULL,
quantity integer NOT NULL
);Note: normalization is the default because it keeps data correct, but a hot read path sometimes duplicates data deliberately to avoid an expensive JOIN — a cached order total, or a copied value on a reporting table. That is a trade: faster reads for the burden of keeping the copies in sync. Normalize first; denormalize only with a measured reason.
-- A deliberate, documented copy for read speed:
ALTER TABLE orders ADD COLUMN total numeric(10, 2);
-- You now MUST keep "total" in sync whenever items change
-- (a trigger, or application code). Worth it only when the
-- JOIN-and-SUM on every read is provably too slow.