Directory Database Optimization Tips
Performance optimisation techniques for directory databases: slow query analysis, connection pooling, read replica setup, and caching layers that cut load times.
Slow query times, bloated indexes, and table lock contention are why directory sites hit a wall at scale. What worked at 10,000 listings rarely survives 500,000 without deliberate work at the database layer. Most directory operators don't hit these problems until they're already painful — at which point fixing them requires taking performance-critical systems offline or risking downtime during optimization.
This guide covers the optimizations that matter most for directories running on PostgreSQL or MySQL, with specific configuration values and tooling recommendations — all in service of the web performance metrics that determine how a directory's pages get crawled and ranked. Skip the generic database advice — these are the changes that actually move query times.
Index Strategy That Scales
The most common mistake is indexing every queried column individually without considering cardinality or query patterns. For a directory database, the high-impact indexes are on status, category_id, and created_at — columns that appear in almost every listing query.
A composite index on (status, category_id, created_at DESC) outperforms three separate indexes for paginated category views. Why: a composite index satisfies the full WHERE + ORDER BY clause in one operation. Three separate indexes force the query planner to choose one and filter the rest in memory.
Before adding any index, run EXPLAIN ANALYZE in PostgreSQL or EXPLAIN in MySQL. Read the output. If you see "Seq Scan" on a table with more than 50,000 rows, you need an index. If you see "Index Scan" but the estimated row count is far off from the actual rows, your planner statistics are stale — run ANALYZE to update them.
Indexes to audit on every directory database:
- Full-text search index on listing name and description fields — if you're doing
LIKE '%term%'queries, you're doing 10–100x more work than a proper GIN index on atsvectorcolumn - Partial index on
status = 'active'— if 80% of your queries filter for active listings, an index over just active rows is smaller and faster - Covering index for your most frequent listing API queries — if your API returns
(id, name, url, category_id, created_at)for every listing card, index those columns together so the query never touches the heap
Archiving and Table Partitioning
Directories accumulate records fast — rejected submissions, expired listings, spam flags, historical revisions. Keeping all of it in one table is a performance tax that compounds over time.
PostgreSQL range partitioning on created_at with monthly partitions reduces full-table scan time dramatically for date-filtered queries. A query for listings created in March 2026 scans only the March partition, not 3 years of historical data. Setup requires PostgreSQL 10+ and a one-time schema change — not trivial, but worth it at 500,000+ rows.
MySQL alternative: a separate listings_archive table plus a scheduled job (cron or a database event) to move records with status = 'expired' and updated_at < NOW() - INTERVAL 12 MONTH to the archive. Your production queries stay fast because the active table stays small.
Benchmark target: queries on your main listings table should return in under 50ms at the 95th percentile for category page loads. If you're seeing 200ms+ on a warm cache, the table is too large or the indexes aren't covering the query plan correctly. Use pg_stat_statements (PostgreSQL) or the slow query log (MySQL, threshold at 1 second) to identify which queries are causing pain.
Query Caching and Read Replicas
Category pages and search results are read-heavy and relatively static. Most directory category pages don't need real-time data — a listing that was approved 30 minutes ago doesn't need to appear immediately.
Cache frequently-hit queries using Redis with a TTL of 60–300 seconds. A TTL of 120 seconds means at most 120 seconds of staleness, but eliminates 95%+ of database read load during traffic spikes. At 100,000 monthly pageviews, caching category queries at 120s TTL reduces database queries by roughly 2,500x compared to uncached direct queries.
Redis configuration that matters:
- Set
maxmemory-policy allkeys-lruso Redis evicts the least-recently-used keys when memory fills up, rather than crashing or refusing writes - Use key namespacing:
dir:category:{id}:page:{n}makes cache invalidation surgical rather than requiring a full flush - Monitor cache hit rate via Redis INFO stats — target 85%+ hit rate for category and search queries
Read replicas are the right tool for queries that don't need real-time freshness — traffic reports, category counts, admin dashboards, analytics queries. If you're on Supabase, RDS, or a managed Postgres provider, adding a read replica typically takes under 30 minutes and costs $20–$80/mo for a small instance. Direct all reporting and analytics queries to the replica; only write operations and real-time reads touch the primary.
Connection Pooling
Unbounded database connections are a fast path to exhaustion. A serverless Next.js directory app on Vercel can spawn 100+ simultaneous connections during a traffic spike — far more than a typical PostgreSQL instance (max_connections defaults to 100 on many managed providers) can handle.
PgBouncer for PostgreSQL is the standard solution. Configure pool_mode = transaction for typical web workloads — each database connection is shared across multiple app connections, returning to the pool after each transaction completes. This is the default recommendation for Supabase-hosted projects, and Supabase ships PgBouncer pre-configured at port 6543.
Key PgBouncer settings to tune:
max_client_conn: set to 2–3x your expected peak concurrent connections from the application layerdefault_pool_size: start at 10–20 per database user; tune up if you see connection wait queues in the logsserver_idle_timeout: 600 seconds is a reasonable default; lower it if your database instance charges per connection-hour
For MySQL, configure max_connections explicitly in your database configuration and use ProxySQL or MySQL's built-in connection limits at the application layer.
Routine Maintenance That Operators Skip
Database performance degrades silently. A directory that performs well at launch often slows by 30–50% within 12 months purely from maintenance neglect.
Follow this maintenance schedule:
- Weekly: Run
VACUUM ANALYZEon PostgreSQL to reclaim dead rows and update planner statistics. On high-write tables (submission queue, audit log), run this daily - Monthly: Review slow query logs — pg_stat_statements shows cumulative query time, not just the slowest single query. A query that runs in 80ms but fires 500,000 times per day is a bigger problem than a 2-second query that runs once
- Quarterly: Check index bloat with
pgstattuple. Indexes on high-churn tables (submissions, audit logs) can grow 3–5x their logical size. An index that should be 200MB growing to 800MB is dead weight on every query that uses it - Annually: Review your full index set. Indexes added to solve a specific query bottleneck 18 months ago may no longer reflect current query patterns. Every unused index is write overhead on every INSERT and UPDATE
Set autovacuum_vacuum_scale_factor = 0.01 (default is 0.2) for large, high-write tables. This makes autovacuum trigger when 1% of rows are dead rather than 20% — preventing bloat accumulation between manual runs.
Monitoring Tools Worth Using
Don't fly blind on database performance. The tooling is free or cheap:
- pg_stat_statements (PostgreSQL built-in) — tracks cumulative query execution stats. Enable it with
CREATE EXTENSION pg_stat_statementsand query it weekly - Datadog or Grafana + Prometheus — database dashboards that show connection count, query latency percentiles, cache hit rate, and replication lag in one view. Datadog's PostgreSQL integration is well-maintained; Grafana is free if you self-host
- Screaming Frog — useful for identifying directory pages with slow Time to First Byte (TTFB) above 800ms, which often points back to slow database queries on the rendering path
- pgBadger — PostgreSQL log analyzer that generates query performance reports from your slow query log. Useful for monthly reviews
Benchmark target for a directory at 100,000 listings: category page TTFB under 300ms, API responses under 100ms, search queries under 200ms. If you're consistently above these thresholds and you've already implemented caching, the bottleneck is usually a missing index or an unpartitioned large table.
Knowing which directories actually matter is the hard part. DirectoryReady tracks and scores directories by quality, activity, and link type — so you can focus on submissions that move the needle.
Frequently Asked Questions
Why is a composite index better than separate indexes for category pages?
A composite index on (status, category_id, created_at DESC) satisfies the full WHERE and ORDER BY clause in a single operation, which is exactly what a paginated category view needs. Three separate indexes force the query planner to pick one and filter the rest in memory, doing more work. Before adding any index, run EXPLAIN ANALYZE in PostgreSQL or EXPLAIN in MySQL and read the output. A 'Seq Scan' on a table over 50,000 rows means you need an index; an 'Index Scan' with an estimated row count far from the actual count means your planner statistics are stale and you should run ANALYZE.
What caching and connection settings keep a directory database fast under load?
Cache frequently-hit category and search queries in Redis with a 60–300 second TTL — a 120-second TTL eliminates over 95 percent of read load during spikes while capping staleness at two minutes. Set maxmemory-policy allkeys-lru, use key namespacing like dir:category:{id}:page:{n} for surgical invalidation, and target an 85 percent-plus hit rate. For connections, use PgBouncer with pool_mode = transaction; Supabase ships it pre-configured at port 6543. This prevents a serverless app from spawning more connections than a PostgreSQL instance's default max_connections of 100 can handle.
What routine maintenance prevents a directory database from slowing over time?
Database performance degrades silently — a directory can slow 30–50 percent within 12 months from neglect alone. Run VACUUM ANALYZE weekly, or daily on high-write tables like the submission queue and audit log, to reclaim dead rows and update statistics. Review slow query logs monthly via pg_stat_statements, which surfaces cumulative time so you catch a fast query firing 500,000 times a day. Check index bloat quarterly with pgstattuple, and review your full index set annually to drop indexes that no longer match query patterns. Set autovacuum_vacuum_scale_factor to 0.01 on large high-write tables.
Read next
Directory Load Time Optimization
Load time optimisation for directory pages: critical render path analysis, font subsetting, third-party script audit, and the metrics that correlate with ranking improvement.
TechnicalDirectory Cache Management Systems
Cache management strategies for high-traffic directories: TTL tuning, cache invalidation logic, CDN integration, and handling real-time listing updates.
TechnicalDirectory Content Distribution Networks
Using CDNs to serve directory content faster: geographic distribution, image optimisation pipelines, cache-control headers, and handling dynamic listing data.
Stay ahead on directory tech
New + rising directories, scoring changes, and the technical SEO signals that move listings. One email a week.