PostgreSQL Indexing for Rails — What to Index and Why

Indexing is the difference between a query that takes 3 milliseconds and one that takes 3 seconds, and most Rails developers do not think about it until a page loads slowly in production. That is too late. By the time a missing index causes visible slowness, you have thousands of users waiting on sequential scans across millions of rows, and the fix — adding an index — requires a migration that may lock the table during creation. This guide covers PostgreSQL indexing from the Rails developer's perspective: which index types matter (B-tree, GIN, partial), how to read EXPLAIN output without your eyes glazing over, composite index design, ActiveRecord migration patterns, when you should explicitly not add an index, and how to monitor whether your existing indexes are actually being used. It connects to the broader Ruby performance topic, because database query time dominates response latency in virtually every Rails application I have profiled over the past decade. The PostgreSQL documentation is comprehensive but dense — what follows is the practical subset a Rails developer needs, with the trade-offs spelled out.
How PostgreSQL decides to use an index
Before you can make good indexing decisions, you need a mental model of how the query planner thinks. PostgreSQL does not blindly use every index available. For each query, the planner estimates multiple execution strategies — sequential scan, index scan, bitmap index scan, index-only scan — and picks the one with the lowest estimated cost.
Cost estimation depends on table statistics: row count, data distribution, correlation between physical row order and index order, and the selectivity of your query conditions. A query that returns 80% of the table's rows will not use an index because reading every row sequentially is cheaper than bouncing between the index and the table data. A query that returns 0.1% of rows will almost certainly use an index.
This means an index is only useful if it helps PostgreSQL narrow down the result set significantly. Indexing a boolean column on a table where 95% of rows have the same value is usually pointless — the planner will ignore the index for the common value and sequential scan instead. Understanding this saves you from creating indexes that consume disk space and slow down writes without ever being used.
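You can watch the planner make this choice in psql. A minimal sketch, using a hypothetical events table where 95% of rows share one value:

```sql
-- Hypothetical table: 95% of rows have processed = true
CREATE TABLE events (id bigserial PRIMARY KEY, processed boolean);
INSERT INTO events (processed)
SELECT (i % 20 <> 0) FROM generate_series(1, 100000) AS i;
CREATE INDEX index_events_on_processed ON events (processed);
ANALYZE events;

-- Common value: the planner will almost certainly pick a sequential scan
EXPLAIN SELECT * FROM events WHERE processed = true;

-- Rare value (5% of rows): the index is now worth using
EXPLAIN SELECT * FROM events WHERE processed = false;
```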
Index types that matter for Rails
PostgreSQL supports many index types. For Rails applications, three cover the vast majority of use cases.
B-tree indexes
The default. When you write add_index :users, :email in a Rails migration, you get a B-tree. It supports equality lookups (WHERE email = 'test@example.com'), range queries (WHERE created_at > '2025-01-01'), sorting (ORDER BY created_at DESC), and anchored prefix matching (WHERE name LIKE 'John%' — though note that PostgreSQL only uses a B-tree for LIKE prefixes when the column uses the C locale or the index was created with the text_pattern_ops operator class). It does not support pattern matching on the interior of a string (WHERE name LIKE '%john%').
B-tree indexes are the right choice for:
- Foreign keys (user_id, order_id)
- Columns used in WHERE equality conditions
- Columns used in ORDER BY
- Columns used in JOIN conditions
- Unique constraints
If you are unsure which index type to use, B-tree is almost certainly correct.
GIN indexes
Generalized Inverted Indexes are designed for composite values — arrays, JSONB columns, full-text search vectors. Where a B-tree maps a single value to a row location, a GIN index maps each element within a composite value to all the rows that contain it.
Use GIN for:
- JSONB columns queried with the @>, ?, or ?| operators
- Array columns queried with @> or &&
- Full-text search with tsvector and @@
- hstore columns
# In a Rails migration
add_index :products, :tags, using: :gin # where tags is a JSONB or array column
GIN indexes are larger than B-tree indexes and slower to update. Every write to the indexed column must update the index, and for JSONB columns with many keys, that update cost is non-trivial. Do not add a GIN index on a JSONB column just because it is JSONB — add it when you have queries that actually filter on its contents.
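For example, with the GIN index above on a JSONB tags column, these are the query shapes it accelerates (key names are illustrative):

```sql
-- Containment: products whose tags include this key/value pair
SELECT * FROM products WHERE tags @> '{"color": "red"}';

-- Key existence: products with a top-level "sale" key
SELECT * FROM products WHERE tags ? 'sale';

-- Any of several keys
SELECT * FROM products WHERE tags ?| array['sale', 'clearance'];
```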
Partial indexes
A partial index only includes rows that match a condition. This is enormously powerful for Rails applications, where queries often filter by status, soft-delete flags, or temporal conditions.
add_index :orders, :created_at, where: "status = 'pending'"
This creates a B-tree index on created_at, but only for pending orders. If you have 10 million orders and 50,000 are pending, this index is a fraction of the size of a full index and dramatically faster to scan. PostgreSQL uses it when your query includes the matching WHERE condition.
Real-world example: if you use acts_as_paranoid or Discard for soft deletes, you have a deleted_at column that is null for active records and non-null for deleted ones. Most queries filter to active records. A partial index on frequently-queried columns with where: "deleted_at IS NULL" gives you index performance on the rows you actually query without indexing the deleted ones you never touch.
add_index :users, :email, unique: true, where: "deleted_at IS NULL"
This enforces uniqueness only among active records and keeps deleted records out of the index entirely. Clean.
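For reference, the SQL Rails generates for that migration is roughly:

```sql
CREATE UNIQUE INDEX index_users_on_email
ON users (email)
WHERE deleted_at IS NULL;
```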
Reading EXPLAIN output
You cannot make informed indexing decisions without understanding what PostgreSQL is doing with your queries. EXPLAIN ANALYZE is the tool.
In Rails console:
ActiveRecord::Base.connection.execute("EXPLAIN ANALYZE SELECT * FROM users WHERE email = 'test@example.com'").to_a
Or use the explain method on an ActiveRecord relation (Rails 7.1+ also accepts options, e.g. explain(:analyze)):
User.where(email: "test@example.com").explain
The output looks intimidating. Focus on three things:
1. Scan type. Seq Scan means PostgreSQL is reading every row in the table. Index Scan means it is using an index. Index Only Scan means it is getting all needed data from the index itself without touching the table — this is the fastest option. Bitmap Index Scan means it is using the index to build a bitmap of matching rows and then fetching them — common with OR conditions or low-selectivity indexes.
2. Actual rows vs. estimated rows. If the planner estimated 10 rows but actually found 100,000, the statistics are stale. Run ANALYZE tablename to update them. Bad statistics lead to bad plan choices, and bad plan choices look like missing index problems when the index is actually there.
3. Execution time. EXPLAIN ANALYZE runs the query and reports real execution time. Compare the time before and after adding an index. If the improvement is marginal, the index may not be worth the write overhead.
One thing that trips people up: EXPLAIN (without ANALYZE) shows the estimated plan without running the query. It is fast but can be misleading. Always use EXPLAIN ANALYZE when diagnosing a real performance problem, because the planner's estimates are often wrong for skewed data distributions.
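As a reading aid, here is an illustrative (not literal) plan for the email lookup above, annotated with the three things to check:

```sql
EXPLAIN ANALYZE SELECT * FROM users WHERE email = 'test@example.com';

-- Illustrative output shape:
--   Index Scan using index_users_on_email on users    <- 1. scan type
--     (cost=0.43..8.45 rows=1 width=120)              <- estimated rows
--     (actual time=0.031..0.033 rows=1 loops=1)       <- 2. actual rows
--   Planning Time: 0.120 ms
--   Execution Time: 0.060 ms                          <- 3. real execution time
```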
Composite indexes
A composite index covers multiple columns. Column order matters — a lot.
add_index :orders, [:user_id, :created_at]
This index efficiently serves queries that filter on user_id alone, or user_id AND created_at, or user_id with an ORDER BY created_at. It does not efficiently serve queries that filter only on created_at — the leftmost column must appear in the query condition for the index to be useful.
Think of it like a phone book sorted by last name, then first name. You can look up everyone named "Smith" quickly. You can look up "Smith, John" even more quickly. But looking up everyone named "John" regardless of last name requires scanning the entire book — the first name ordering only helps after the last name is fixed.
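Concretely, with the [:user_id, :created_at] index above, the leftmost-column rule plays out like this:

```sql
-- These can use the index: user_id (the leftmost column) is constrained
SELECT * FROM orders WHERE user_id = 123;
SELECT * FROM orders WHERE user_id = 123 AND created_at > '2025-01-01';
SELECT * FROM orders WHERE user_id = 123 ORDER BY created_at DESC;

-- This cannot use it efficiently: the leftmost column is unconstrained
SELECT * FROM orders WHERE created_at > '2025-01-01';
```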
The rule: put the column with the highest selectivity (the one that narrows the result set most) first, unless you also need the index for queries that only filter on one column — in which case, put that column first.
For Rails applications, the most common composite indexes are:
- [:user_id, :created_at] — "all orders for this user, sorted by date"
- [:status, :priority] — "all pending tasks, sorted by priority"
- [:tenant_id, :email] — multi-tenant applications where every query is scoped to a tenant
Do not create a composite index when two single-column indexes would serve the same queries. PostgreSQL can combine multiple indexes using bitmap scans, and single-column indexes are more versatile. Composite indexes win when the combination is used in a specific, high-frequency query pattern.
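What that index combination looks like in a plan, assuming separate single-column indexes on status and user_id (index names are illustrative):

```sql
EXPLAIN SELECT * FROM orders WHERE status = 'pending' AND user_id = 123;

-- Illustrative plan fragment showing two indexes combined:
--   Bitmap Heap Scan on orders
--     -> BitmapAnd
--          -> Bitmap Index Scan on index_orders_on_status
--          -> Bitmap Index Scan on index_orders_on_user_id
```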
ActiveRecord migration patterns
Rails provides several ways to add indexes, and the choice matters for production deployments.
Standard index creation:
class AddIndexToOrdersStatus < ActiveRecord::Migration[7.1]
def change
add_index :orders, :status
end
end
This locks the table for the duration of the index build. For a table with millions of rows, that can be seconds to minutes, during which no writes can occur. For a small table, this is fine. For a large table in a busy production database, this is an outage.
Concurrent index creation:
class AddIndexToOrdersStatusConcurrently < ActiveRecord::Migration[7.1]
disable_ddl_transaction!
def change
add_index :orders, :status, algorithm: :concurrently
end
end
algorithm: :concurrently tells PostgreSQL to build the index without holding a lock that blocks writes. The trade-offs: it takes longer, it cannot run inside a transaction (hence disable_ddl_transaction!), and if the build fails — for example, on a uniqueness violation discovered mid-build — it leaves behind an INVALID index that you must drop before retrying. For any table that receives production traffic, concurrent index creation is the correct default.
Removing unused indexes:
class RemoveUnusedOrdersIndex < ActiveRecord::Migration[7.1]
def change
remove_index :orders, :legacy_status_column
end
end
Removing an index is fast, though not entirely non-blocking — DROP INDEX takes an exclusive lock for the brief moment it runs, and on a busy table you can pass algorithm: :concurrently to remove_index to avoid even that. Do not be afraid to remove indexes that are not being used — every index slows down writes and consumes disk space. Dead indexes are pure overhead.
When NOT to index
More indexes is not better. Each index:
- Consumes disk space (often 10-30% of the table size per index)
- Slows down INSERT, UPDATE and DELETE operations because each write must update every index on the table
- Requires vacuuming and maintenance
- Uses shared buffer memory for cached pages
Do not index:
- Small tables. A table with 1,000 rows does not benefit from an index. A sequential scan across 1,000 rows is effectively instant. PostgreSQL will likely ignore the index anyway.
- Write-heavy columns that are rarely queried. If a column is updated on every request but only queried in a nightly report, the write overhead of maintaining the index across millions of updates outweighs the benefit of one fast read per day.
- Low-selectivity columns without partial index conditions. A boolean active column where 99% of rows are true does not benefit from a full index. A partial index on the 1% that is false might.
- Columns only used in SELECT (not in WHERE, JOIN, or ORDER BY). Indexes help find rows, not retrieve columns. An index on a column that is only ever selected, never filtered, is waste. (Index-only scans are the exception, but you need to understand covering indexes before going there.)
Every index is a bet that the read performance improvement outweighs the write performance cost. Make that bet consciously.
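For completeness, the covering-index exception mentioned above: since PostgreSQL 11, an index can carry extra payload columns via INCLUDE, which makes index-only scans possible for queries that select those columns. A sketch (index and column names are illustrative):

```sql
-- Look up users by email, returning name without touching the table
CREATE INDEX index_users_on_email_covering
ON users (email) INCLUDE (name);

-- Can now be satisfied by an index-only scan
SELECT name FROM users WHERE email = 'test@example.com';
```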
Monitoring index usage
Creating indexes without monitoring whether they get used is like adding caching without measuring hit rates. PostgreSQL tracks index usage statistics, and you should review them periodically.
SELECT
schemaname,
relname AS table_name,
indexrelname AS index_name,
idx_scan AS times_used,
pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM pg_stat_user_indexes
ORDER BY idx_scan ASC;
This query shows every index in your database, how many times it has been scanned since the last statistics reset, and its disk size. An index with zero scans since the last pg_stat_reset() is a strong candidate for removal — assuming enough time has passed to capture all your application's query patterns, including scheduled jobs and monthly reports.
For Rails, the pghero gem provides a dashboard that surfaces unused indexes, missing indexes (based on sequential scan counts), and duplicate indexes. It is worth installing on any production Rails application backed by PostgreSQL.
A useful heuristic: if an index has not been used in 30 days and you have not added new features that query the indexed column, it is safe to remove. Drop it, deploy, and watch for slow queries. You can always recreate it.
What usually goes wrong
These are the indexing mistakes I see repeatedly in Rails applications:
- Missing foreign key indexes. Rails does not index every foreign key column automatically — t.references and add_reference have done so by default since Rails 5, but a plain add_column :orders, :user_id migration does not. Every belongs_to association generates queries like WHERE user_id = 123, and without an index, that is a sequential scan. Run through your schema and verify that every _id column has an index. This is the single highest-impact indexing fix in most Rails apps.
- Indexing everything "just in case." A developer adds an index to every column in a migration because "it might help." The table now has eight indexes, writes are slow, and half the indexes have zero scans. Indexes are not free. Add them when you have evidence — a slow query, a sequential scan on a large table — not speculatively.
- Wrong column order in composite indexes. The composite index on [:created_at, :user_id] does not help the query WHERE user_id = 123 ORDER BY created_at. The user_id needs to be first. Column order is not interchangeable.
- Not using concurrent index creation. A migration adds an index to a 50-million-row table without algorithm: :concurrently. The deploy locks the table for four minutes. The application queues requests, Puma workers saturate, and the site goes down. Always use concurrent creation for large tables.
- Ignoring partial index opportunities. A status column with five possible values gets a full B-tree index. But 95% of queries filter WHERE status = 'active'. A partial index on WHERE status = 'active' is a tenth the size and just as fast for the queries that matter.
- Stale statistics causing bad plans. Autovacuum is not running frequently enough, table statistics are outdated, and the planner chooses sequential scans even though good indexes exist. The index is there; the planner just does not know the data well enough to use it. Run ANALYZE manually or tune autovacuum_analyze_threshold.
- Duplicate indexes. A composite index on [:user_id, :email] makes a separate index on [:user_id] redundant — the composite index serves both use cases. But the single-column index was added first, and nobody removed it after adding the composite. Both consume space; only one is necessary.
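A quick way to spot duplicate or overlapping indexes is to list every index definition grouped by table — near-identical indexdef rows sort next to each other (pghero automates this check):

```sql
SELECT tablename, indexname, indexdef
FROM pg_indexes
WHERE schemaname = 'public'
ORDER BY tablename, indexdef;
```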
Indexing checklist
- Every foreign key column (_id) has a B-tree index
- Columns used in frequent WHERE conditions are indexed
- Composite indexes have columns in the correct order (highest selectivity first)
- Soft-delete queries use partial indexes on active records
- Large table indexes use algorithm: :concurrently
- GIN indexes are used for JSONB/array columns that are queried, not just stored
- EXPLAIN ANALYZE has been run on the slowest queries
- Unused indexes have been identified and removed
- pghero or equivalent monitoring is installed
- Autovacuum settings are tuned for your write volume
FAQ
How many indexes is too many on a single table?
There is no fixed number, but I start looking critically at any table with more than six or seven indexes. Each index adds write overhead. If your table is read-heavy (hundreds of reads per write), more indexes are justified. If it is write-heavy (logging, event tracking), keep indexes minimal and add them only for proven query patterns.
Should I add an index to every foreign key?
Yes, with very rare exceptions. Foreign key columns are used in JOIN and WHERE conditions constantly in Rails applications. Migrations that use t.references or add_reference have indexed the column by default since Rails 5, but a plain add_column :posts, :user_id does not create an index. Audit your schema — missing foreign key indexes are the most common performance problem in Rails databases.
Does adding an index fix an N+1 query?
No. An N+1 query fires hundreds of individual queries where one query would suffice. An index makes each individual query faster, but you are still paying the round-trip overhead for each one. Fix the N+1 with includes, preload, or eager_load first. Then index the column for the single efficient query that replaces the hundreds.
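At the SQL level, the shape of the fix looks like this — includes collapses the per-record lookups into one query, and the indexed id lookup makes that single query fast:

```sql
-- N+1: one query per order's user (each is indexed, but round-trips dominate)
SELECT * FROM users WHERE id = 1;
SELECT * FROM users WHERE id = 2;
-- ...hundreds more...

-- After Order.includes(:user): one batched, indexed lookup
SELECT * FROM users WHERE id IN (1, 2, 3 /* , ... */);
```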
When should I use a GIN index instead of a B-tree?
When the column contains composite values — JSONB objects, arrays, or tsvector data — and your queries search within those values. B-tree indexes work on the whole value; GIN indexes work on the elements inside. If your query uses @>, ?, ?|, &&, or @@, you want a GIN index.
How do I know if my index is actually being used?
Check pg_stat_user_indexes for the idx_scan count, or install pghero. If the count is zero after a reasonable observation period (at least a week, covering all your application's query patterns), the index is not being used. Before removing it, confirm that no scheduled jobs or monthly reports depend on it.
Can I add an index without downtime?
Yes. Use algorithm: :concurrently in your migration and disable_ddl_transaction!. The index builds in the background without blocking writes. It takes longer than a regular index build, but your application stays available throughout.
Related reading
- Ruby Performance — parent topic covering the full performance surface, where database time is often the dominant factor
- Deploy Ruby on Rails on a VPS — includes PostgreSQL setup and initial configuration
- Web Performance for Rails Developers — the full-stack performance view from browser to database