Solr integration for the Plone CMS

Introduction

collective.solr integrates the Plone CMS with the Solr search engine.

Apache Solr is an open source enterprise search engine built on Lucene. It powers search on sites such as Twitter, the Apple and iTunes Stores, Wikipedia, Netflix and many more.

Solr not only scales to very large volumes of content but also provides rich search functionality, such as faceting, geospatial search, suggestions, spelling corrections and indexing of binary formats, along with a variety of tools for configuring custom search solutions. Its integrated clustering and load balancing provide a high level of robustness.
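To make faceting more concrete, here is a minimal sketch of a faceted query sent to Solr's standard HTTP select handler. It does not use collective.solr's own API; it only builds the request URL with the Python standard library. The base URL, the core name `plone` and the field names `portal_type` and `review_state` are assumptions chosen for illustration and may not match your Solr schema or setup.

```python
from urllib.parse import urlencode


def build_facet_query_url(base_url, text, facet_fields, rows=10):
    """Return a Solr select URL that searches for `text` and facets on the given fields."""
    params = [
        ("q", text),          # the user's search terms
        ("rows", str(rows)),  # number of results to return
        ("wt", "json"),       # ask Solr for a JSON response
        ("facet", "true"),    # switch faceting on
    ]
    # Solr accepts one facet.field parameter per field to facet on.
    params.extend(("facet.field", field) for field in facet_fields)
    return base_url.rstrip("/") + "/select?" + urlencode(params)


# Hypothetical base URL and field names; adjust them to your own installation.
url = build_facet_query_url(
    "http://localhost:8983/solr/plone",
    "annual report",
    ["portal_type", "review_state"],
)
print(url)
```

Fetching this URL from a running Solr server would be expected to return the matching documents together with per-field value counts, which is what a faceted navigation is typically built from.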

collective.solr comes with a default configuration and setup of Solr that makes it extremely easy to get started, while providing vastly superior search quality compared to Plone’s integrated text search based on ZCTextIndex.

Current Status

The code is used in production on many sites and is considered stable. This add-on can be installed in a Plone 4.1 (or later) site to enable indexing operations as well as searching (site search and live search) using Solr. Doing so not only significantly improves search quality and performance, especially for a large number of indexed objects, but also reduces the memory footprint of your Plone instance by allowing you to remove the SearchableText, Description and Title indexes from the catalog.

On large sites with 100,000 or more content objects, searches using ZCTextIndex often take 10 seconds or more and require a good deal of memory from ZODB caches. Solr will typically answer these requests in 10 ms to 50 ms, at which point network latency and the rendering speed of Plone’s page templates become the dominant factors.

Credits

This code was inspired by enfold.solr by Enfold Systems as well as work done at the Snow Sprint 2008. The solr.py module is based on the original Python integration package from Solr itself.

Development was kindly sponsored by Elkjop, the Nordic Council and the Nordic Council of Ministers.

Contributors

  • Hanno Schlichting (hannosh)
  • Tom Gross (tomgross)
  • Timo Stollenwerk (tisto)
  • Manuel Reinhardt (reinhardt)
  • Patrick Gerken (do3cc)
  • Andreas Zeidler (witsch)
  • Martijn Pieters (mjpieters)
  • Carsten Senger (csenger)
  • Andrea Cecchi (cekk)
  • Florian Schulze (fschulze)
  • Mauro Amico (mamico)
  • Giacomo Spettoli (giacomos)
  • (jkubaile)
  • Luca Fabbri (keul)
  • Witek (witekdev)
  • Laurence Rowe (lrowe)
  • JC Brand (jcbrand)
  • Daniel Widerin (saily)
  • Wolfgang Thomas (pysailor)
  • Philip Bauer (pbauer)
  • Cédric Messiant (cedricmessiant)
  • Rodrigo (rristow)
  • (tschorr)
  • Alexander Pilz (pilz)
  • Jean Jordaan (jean)
  • Alexander Loechel (loechel)
