Listen/download MITSUDA by Lefse Records
Fast, Unaccented ILIKE in PostgreSQL 9.1+
Working at Ajah we need to be able to run quick, unaccented ILIKE queries to power our French/English auto-complete features. As our database grows, these queries are incurring some serious overhead.
Because PostgreSQL’s native unaccent function isn’t immutable, we’re not able to index a direct transformation of a column, and instead we need to maintain a copy. Either via TRIGGER or, in our case, with SQLAlchemy events:
from sqlalchemy import event, func
def unaccented_name(mapper, connection, target):
target.name_en_unaccented = connection.scalar(
func.unaccent(target.name_en))
event.listen(Organization, 'before_insert', unaccented_name)
event.listen(Organization, 'before_update', unaccented_name)
This is assuming we have a table, organization, with TEXT columns name_en and name_en_unaccented.
Normal B-tree indexing won’t work here because we aren’t querying for exact matches. To speed up ILIKE queries on our name_en_unaccented column, we’re going to be using PostgreSQL’s pg_trgm module. Let’s create our extensions and index in postgres:
CREATE EXTENSION unaccent;
CREATE EXTENSION pg_trgm;
CREATE INDEX ix_trgm_name_en_unaccented
ON organization USING gin (name_en_unaccented gin_trgm_ops);
And that’s it! The speed improvement is dramatic:
# EXPLAIN ANALYZE SELECT * FROM organization
WHERE unaccent(name_en) ILIKE unaccent('%Toron%');
Total runtime: 700.570 ms
vs.
# EXPLAIN ANALYZE SELECT * FROM organization
WHERE name_en_unaccented ILIKE unaccent('%Toron%');
Total runtime: 6.669 ms
Yesssssssssss - Lil B


