Excel

Migrating off Excel without losing the bits that worked

HCOMS April 2026 7 min read

A reference of the SQL we end up writing on every project. Aimed at people who can already SELECT * but want to lift their game.

Imagine two tables: orders(id, customer_id, total, status, created_at) and customers(id, name, email, country).

Joins, properly

An INNER JOIN returns only rows that match in both tables — every order that has a known customer.

SELECT o.id, o.total, c.name
FROM orders o
INNER JOIN customers c ON c.id = o.customer_id
WHERE o.status = 'paid';

A LEFT JOIN keeps every row from the left side, with NULLs where the right side has no match — every customer, even ones with no orders.

SELECT c.name, COUNT(o.id) AS order_count
FROM customers c
LEFT JOIN orders o ON o.customer_id = c.id
GROUP BY c.id;

Aggregates without losing your mind

The rule: every column in the SELECT must either appear in GROUP BY or inside an aggregate function. MySQL used to let you cheat on this; it does not any more (since 5.7's ONLY_FULL_GROUP_BY).

SELECT
  c.country,
  COUNT(o.id)         AS num_orders,
  SUM(o.total)        AS revenue,
  ROUND(AVG(o.total)) AS avg_order
FROM orders o
JOIN customers c ON c.id = o.customer_id
WHERE o.created_at >= '2026-01-01'
GROUP BY c.country
HAVING revenue > 1000
ORDER BY revenue DESC;

Updating rows from another table

Useful for backfills. Set every order's customer_country from the customers table:

UPDATE orders o
JOIN customers c ON c.id = o.customer_id
SET o.customer_country = c.country
WHERE o.customer_country IS NULL;

Deleting safely

Always SELECT first. Always.

-- 1. Verify what you're about to delete
SELECT * FROM orders
WHERE status = 'cancelled' AND created_at < '2024-01-01';

-- 2. Then, only when the count looks right
DELETE FROM orders
WHERE status = 'cancelled' AND created_at < '2024-01-01';

Indexes — when to add them, when to leave them

Add an index on any column you regularly filter, join on, or sort by. The cost is slower writes; the benefit is queries that run in milliseconds instead of minutes.

-- A composite index for "find orders by customer, newest first"
CREATE INDEX idx_orders_customer_date
  ON orders (customer_id, created_at DESC);

-- See whether MySQL is using it
EXPLAIN SELECT * FROM orders
WHERE customer_id = 42
ORDER BY created_at DESC
LIMIT 20;

If EXPLAIN shows type: ref and your index name in the key column, you're good. If it shows type: ALL, MySQL is doing a full table scan and your index isn't being used.

UPSERT — insert or update on conflict

INSERT INTO product_stock (sku, quantity)
VALUES ('ABC-001', 5)
ON DUPLICATE KEY UPDATE
  quantity = quantity + VALUES(quantity);

This requires a UNIQUE index on sku. The newer, cleaner version on MySQL 8.0.20+ uses an alias:

INSERT INTO product_stock (sku, quantity)
VALUES ('ABC-001', 5) AS new
ON DUPLICATE KEY UPDATE
  quantity = quantity + new.quantity;

Window functions — the modern superpower

Available since MySQL 8.0. Rank each customer's orders by date:

SELECT
  customer_id,
  id AS order_id,
  total,
  created_at,
  ROW_NUMBER() OVER (
    PARTITION BY customer_id
    ORDER BY created_at DESC
  ) AS order_rank
FROM orders;

To get only each customer's most recent order, wrap it and filter on order_rank = 1.

Common gotchas

If you've inherited a database that nobody understands any more, that's most of what we do. Drop us a line.

Related notes