obartunov: (Default)
[personal profile] obartunov
I'm going to PGConf.EU in Prague! Thanks to the Russian company 1C (it's a really big company) for the sponsorship!

Alexander Korotkov and I will present a lightning talk, "Fulltext search in PostgreSQL in milliseconds". I understand that this topic needs more time, but we submitted the talk too late, sorry. I was in Karakorum, Pakistan for the whole of July. Anyway, we got really impressive results with the prototype: on a classifieds database of 6 million records we got a median search query time of 6-8 ms, with a total of 41 million search queries in 8 hours. We used real-life data from the biggest Russian classifieds service and real search queries extracted from web logs. We hope to discuss some implementation issues with developers and to attract the attention of sponsors. The latter is important, since the amount of development work is big. We also need to get through the review nightmare :) There is not much time left for 9.3, by the way.
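For a sense of scale, the quoted figures can be turned into a rough throughput number. This is only a back-of-the-envelope sketch: the post does not say how many clients ran in parallel, so the "queries in flight" estimate assumes the median latency is a fair stand-in for the mean and applies Little's law.

```python
# Figures quoted in the post.
TOTAL_QUERIES = 41_000_000   # 41 million search queries
DURATION_HOURS = 8           # run over 8 hours
MEDIAN_LATENCY_MS = (6, 8)   # 6-8 ms median query time

duration_seconds = DURATION_HOURS * 3600
queries_per_second = TOTAL_QUERIES / duration_seconds

# Assumption (not stated in the post): median latency approximates the mean,
# so Little's law gives average concurrency = throughput * latency.
in_flight = tuple(queries_per_second * ms / 1000 for ms in MEDIAN_LATENCY_MS)

print(f"{queries_per_second:.0f} queries/s")
print(f"{in_flight[0]:.1f} to {in_flight[1]:.1f} queries in flight on average")
```

Under these assumptions the run averages roughly 1,400 queries per second, which would need about 9 to 11 queries being processed at any moment.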


Fulltext search in PostgreSQL is well known for its power and extensibility. However, there are two main reasons that prevent PostgreSQL fulltext search from being as fast as specialized solutions:

1) It is implemented inside an ACID DBMS, so it has to support atomicity, concurrency, WAL, etc. This cost is unavoidable, since we implement fulltext search inside an object-relational DBMS, so it is both an advantage and a disadvantage.
2) Fulltext indexes are used only for filtering documents, while ranking requires fetching the documents from the heap. This slows down the processing of high-selectivity queries. This disadvantage is avoidable: it can be removed by storing additional information in GIN indexes.

This talk presents a prototype PostgreSQL patch that stores positional information in the fulltext index and uses it for ranking. Ranking is then performed using only information from the index, without fetching documents from the heap. This speeds up the processing of high-selectivity queries by dozens of times. The prototype will be demonstrated on well-known large databases.
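The idea can be illustrated with a toy model in Python. This is a minimal sketch, not the patch itself: the dictionaries below only stand in for the heap and the index, the names are hypothetical, and the score (a plain count of term occurrences) is a placeholder rather than PostgreSQL's actual ranking function. It shows that once word positions are kept in the index, the same ranking can be computed without touching the heap.

```python
from collections import defaultdict

# Toy "heap": document id -> text. Stands in for table rows.
HEAP = {
    1: "postgresql fulltext search is fast",
    2: "fulltext search search search in postgresql",
    3: "cooking recipes and travel notes",
}

heap_fetches = 0  # counts simulated row reads


def fetch_from_heap(doc_id):
    """Simulate reading a row from the table, counting each access."""
    global heap_fetches
    heap_fetches += 1
    return HEAP[doc_id]


def build_plain_index(heap):
    """term -> set of doc ids; enough for filtering, not for ranking."""
    index = defaultdict(set)
    for doc_id, text in heap.items():
        for term in text.split():
            index[term].add(doc_id)
    return index


def build_positional_index(heap):
    """term -> {doc id -> list of word positions}."""
    index = defaultdict(dict)
    for doc_id, text in heap.items():
        for pos, term in enumerate(text.split()):
            index[term].setdefault(doc_id, []).append(pos)
    return index


def score(positions_per_term):
    """Placeholder rank: total number of query-term occurrences."""
    return sum(len(positions) for positions in positions_per_term)


def search_with_heap(index, terms):
    """Filter with the index, then fetch every match to rank it.

    Assumes `terms` is a non-empty list.
    """
    matches = set.intersection(*(index.get(t, set()) for t in terms))
    ranked = []
    for doc_id in matches:
        words = fetch_from_heap(doc_id).split()
        positions = [[i for i, w in enumerate(words) if w == t] for t in terms]
        ranked.append((score(positions), doc_id))
    return sorted(ranked, reverse=True)


def search_index_only(index, terms):
    """Filter and rank using only the positions stored in the index.

    Assumes `terms` is a non-empty list.
    """
    postings = [index.get(t, {}) for t in terms]
    matches = set.intersection(*(set(p) for p in postings))
    ranked = [(score([p[doc_id] for p in postings]), doc_id) for doc_id in matches]
    return sorted(ranked, reverse=True)


query = ["fulltext", "search"]
via_heap = search_with_heap(build_plain_index(HEAP), query)
fetches_for_heap_ranking = heap_fetches
via_index = search_index_only(build_positional_index(HEAP), query)
fetches_for_index_ranking = heap_fetches - fetches_for_heap_ranking

print(via_heap == via_index, fetches_for_heap_ranking, fetches_for_index_ranking)
```

Both searches return the same ranked list, but the first one reads one row per matching document while the second reads none. The price is a larger index, since every posting now carries a list of positions.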

