Solr search integration
In Newscoop 4.1.0 or later, there is optional support for Solr, the open source enterprise search platform from the Apache Lucene project (http://lucene.apache.org/solr/). Solr runs on Java, and so installation requires additional steps beyond a standard Newscoop installation on a typical LAMP web server.
1. First, you must install a Java environment. On Debian or Ubuntu GNU/Linux you can do this with the command:
sudo apt-get install openjdk-6-jre
2. Download Apache Solr from http://www.apache.org/dyn/closer.cgi/lucene/solr/4.1.0 and unpack the tarball in the directory you want to run it from, such as /var/www/ in this example:
$ cp solr-4.1.0.tgz /var/www/
$ cd /var/www
$ tar xvzf solr-4.1.0.tgz
3. Copy Newscoop's configuration for Solr into the Solr installation directory. If you are using the .deb packages of Newscoop, those files are found under the /var/lib/newscoop/example/solr/ directory:
sudo cp -a /var/lib/newscoop/example/solr/* /var/www/solr-4.1.0/example/solr/
4. To index languages other than the default of English, edit the file /var/www/solr-4.1.0/example/solr/solr.xml and add a <core> entry for each language you are using (the name of the core must be the ISO two-letter language code). Then make a copy of the en folder for each additional core, naming each copy after its core.
<cores adminPath="/admin/cores" defaultCoreName="en" host="${host:}" hostPort="${jetty.port:}" hostContext="${hostContext:}" zkClientTimeout="${zkClientTimeout:15000}">
  <core name="en" instanceDir="en" />
</cores>
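As a rough sketch of the folder copy in step 4, the shell function below duplicates the en core directory for one additional language. The function name add_language_core and the language code de are only illustrative assumptions, and the matching <core name="de" instanceDir="de" /> entry still has to be added to solr.xml by hand.

```shell
# Hypothetical helper (not part of Newscoop or Solr):
# copy the English core directory to a new language core.
#   $1 = Solr home directory (the one containing solr.xml and the en folder)
#   $2 = ISO two-letter language code for the new core, for example de
add_language_core() {
    solr_home="$1"
    code="$2"
    if [ ! -d "$solr_home/en" ]; then
        echo "No en core found in $solr_home" >&2
        return 1
    fi
    if [ -e "$solr_home/$code" ]; then
        echo "Core directory $solr_home/$code already exists" >&2
        return 1
    fi
    # -a preserves permissions and timestamps, as in step 3
    cp -a "$solr_home/en" "$solr_home/$code"
}

# Example call, assuming the paths used in this chapter:
# add_language_core /var/www/solr-4.1.0/example/solr de
```

The function refuses to overwrite an existing core folder, so it can be run again safely after adding further languages.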
5. Change to the /var/www/solr-4.1.0/example/ directory and run the command:
sudo java -jar start.jar
The Solr administration dashboard should now be visible in a browser opened at http://localhost:8983/solr/ if you are using the default settings.
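If no browser is available on the server, the core status can also be checked from the command line. The following is a minimal sketch, assuming curl is installed; the function name solr_status_url is hypothetical, and the /admin/cores path is taken from the adminPath attribute shown in solr.xml above.

```shell
# Hypothetical helper: build the core admin STATUS URL for a Solr base URL.
# The /admin/cores path matches the adminPath attribute in solr.xml.
solr_status_url() {
    # ${1%/} drops one trailing slash so the URL never contains "//admin"
    printf '%s/admin/cores?action=STATUS&wt=json\n' "${1%/}"
}

# Example (requires curl and a running Solr instance):
# curl -s "$(solr_status_url http://localhost:8983/solr)"
```

A response listing the en core (and any language cores you added) suggests that Solr has picked up the Newscoop configuration.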
6. Open /var/lib/newscoop/application/configs/application.ini-dist with your editor of choice and uncomment this line (by removing the semi-colon at the beginning of the line):
listener[] = search.indexer.article
7. Newscoop looks for Solr by default at http://localhost:8983/solr. If your Solr installation is running on a different address or port, you can override this by changing the following line in the same application.ini-dist file:
search.solr_server = 'http://<url>:<port>/solr'
Make sure that line is present for both the production and cli environments.
8. Newscoop content must be indexed in Solr before search can return any results. You can run the following command manually to get some data for testing:
cd /var/lib/newscoop
php scripts/newscoop.php index:update
In production environments you would set up a cron job to run the same command periodically, so that your Solr index is updated with any new or changed articles in Newscoop.
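One way to set up such a cron job is sketched below. The function name newscoop_index_cron_line and the five-minute interval are assumptions for illustration only; choose an interval that suits how often your site publishes.

```shell
# Hypothetical helper: print a crontab line that runs the indexer every N minutes.
#   $1 = interval in minutes (an assumed value, tune it to your site)
#   $2 = Newscoop installation directory
newscoop_index_cron_line() {
    printf '*/%s * * * * cd %s && php scripts/newscoop.php index:update\n' "$1" "$2"
}

# Example: print the entry, then paste it into the crontab
# (crontab -e) of the user that runs Newscoop.
# newscoop_index_cron_line 5 /var/lib/newscoop
```

The printed line uses the user crontab format, so it has no user field; an entry placed in /etc/cron.d/ would need one.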
Preparing your Newscoop templates
Instead of list_search_results you have to use list_search_results_solr in your Newscoop templates for Solr search results to be displayed. The parameters are:
| Parameter | Description |
|---|---|
| q | Query string. Default value: $_GET['q']. |
| qf | Fields to query. Can be title, type, webcode, authors, topics, keywords, or a custom field name. Separate multiple fields with spaces, e.g. qf="title full_text". You can also boost the relevancy of any field with the ^number operator, e.g. qf="title^3.4". Default value: title. |
| rows | Number of rows to be returned. Use for pagination. Default value: 10. |
| start | Offset of the first row to return. Use for pagination. Default value: 0. |
| fq | Optional filter query. Can be used for results filtering e.g. fq="type:news". |
| sort | Set sorting of results. Default value: score desc. |
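These parameter names match Solr's own query parameters, so it can help to try a query against Solr directly before debugging a template. The sketch below rests on several assumptions that are not confirmed by this chapter: that the core answers on the standard /select handler, that the edismax query parser is available, and that Newscoop passes the parameters through largely unchanged. The function name solr_query is hypothetical.

```shell
# Hypothetical debugging helper: send a query straight to one Solr core.
#   $1 = core URL, for example http://localhost:8983/solr/en
#   $2 = q  (query string)
#   $3 = qf (fields to query, with optional ^boost)
#   $4 = fq (filter query)
# Assumes the core exposes /select and supports defType=edismax.
solr_query() {
    curl -s -G "$1/select" \
        --data-urlencode "q=$2" \
        --data-urlencode "defType=edismax" \
        --data-urlencode "qf=$3" \
        --data-urlencode "fq=$4" \
        --data-urlencode "rows=10" \
        --data-urlencode "start=0" \
        --data-urlencode "sort=score desc" \
        --data-urlencode "wt=json"
}

# Example (requires curl, a running Solr and an indexed en core):
# solr_query http://localhost:8983/solr/en election 'title^3 full_text' 'type:news'
```

The rows, start and sort values are fixed to the defaults from the table; curl's --data-urlencode option takes care of escaping spaces and the ^ character.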
The search form
Here is an example of a search form you could use in a Newscoop template.
{{ form_search_solr id="search" class="hidden-phone" }}
  {{ form_text name="q" value=$smarty.get.q }}
  {{ form_submit name="" value=" " }}
{{ /form_search_solr }}
Listing the results
Here is an example of template code that lists search results, including article titles, author names, images and photographer credits:
{{ list_search_results_solr fq="type:news" qf="title^5 deck^3 full_text" start=$smarty.get.start }}
{{ if $gimme->current_list->at_beginning }}
<ul>
{{ /if }}
  <li class="news_item {{ cycle values="odd,even" }}">
    {{ image rendition="thumb" }}
    <img src="{{ $image->src }}" alt="{{ $image->caption }} (photo: {{ $image->photographer }})" />
    <span>{{$gimme->section->name}}</span>
    {{/image}}
    <div class="content">
      <h2 class="title"><a href="{{url options="article"}}"> {{$gimme->article->title}}</a></h2>
      <h5 class="author">{{list_article_authors}} {{$gimme->author->name}} {{/list_article_authors}}</h5>
      <p>{{$gimme->article->deck|strip_tags|truncate:200:"...":false}}</p>
    </div>
  </li>
{{ if $gimme->current_list->at_end }}
</ul>
{{ /if }}
{{ /list_search_results_solr }}





