Newscoop 4.1 Cookbook

Solr search integration

In Newscoop 4.1.0 or later, there is optional support for Solr, the open source enterprise search platform from the Apache Lucene project (http://lucene.apache.org/solr/). Solr runs on Java, and so installation requires additional steps beyond a standard Newscoop installation on a typical LAMP web server.

1. First, you must install a Java environment. On Debian or Ubuntu GNU/Linux you can do this with the command:

sudo apt-get install openjdk-6-jre 

2. Download Apache Solr from http://www.apache.org/dyn/closer.cgi/lucene/solr/4.1.0 and unpack the tarball in the directory you want to run it from, such as /var/www/ in this example:

$ cp solr-4.1.0.tgz /var/www/
$ cd /var/www
$ tar xvzf solr-4.1.0.tgz

3. Copy Newscoop's configuration for Solr into the Solr installation directory. If you are using the .deb packages of Newscoop, those files are found under the /var/lib/newscoop/example/solr/ directory:

sudo cp -a /var/lib/newscoop/example/solr/* /var/www/solr-4.1.0/example/solr/ 

4. To index languages other than the default, English, edit the file /var/www/solr-4.1.0/example/solr/solr.xml and add a <core> entry for each language you are using (the name of the core must be the ISO two-letter language code). Then make a copy of the en folder for each additional core, naming each copy after its core.

<cores adminPath="/admin/cores" defaultCoreName="en" host="${host:}" hostPort="${jetty.port:}" hostContext="${hostContext:}" zkClientTimeout="${zkClientTimeout:15000}">
    <core name="en" instanceDir="en" />
</cores>
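The copying in step 4 can be scripted. The following is a minimal sketch, not part of Newscoop: the function name copy_cores and the language codes de and fr are hypothetical, and the Solr path is the one assumed in this example. It copies the en core directory once per language and prints a <core> line for each, modelled on the en entry, for you to paste into solr.xml by hand.

```shell
#!/bin/sh
# Sketch only: SOLR_HOME and the language list are assumptions for illustration.
SOLR_HOME="${SOLR_HOME:-/var/www/solr-4.1.0/example/solr}"

# copy_cores HOME LANG...
# Copies HOME/en to HOME/LANG for each language code and prints the
# matching <core> entry to add to solr.xml.
copy_cores() {
    cores_home="$1"
    shift
    for lang in "$@"; do
        if [ -e "$cores_home/$lang" ]; then
            echo "# $cores_home/$lang already exists, not copied" >&2
        else
            cp -a "$cores_home/en" "$cores_home/$lang" || return 1
        fi
        echo "<core name=\"$lang\" instanceDir=\"$lang\" />"
    done
}

# Only run against a real installation; "de fr" is a hypothetical language list.
if [ -d "$SOLR_HOME/en" ]; then
    copy_cores "$SOLR_HOME" de fr
else
    echo "No en core found under $SOLR_HOME" >&2
fi
```

Run it as a user with write access to the Solr directory, and replace de fr with the language codes your publication actually uses.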

5. Change to the /var/www/solr-4.1.0/example/ directory and run the command:

sudo java -jar start.jar 

The Solr administration dashboard should now be visible in a browser opened at http://localhost:8983/solr/ if you are using the default settings.
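On a server without a desktop browser, the same check can be made from the command line. This sketch assumes curl is installed and that Solr uses the default address shown above; the function name solr_is_up is hypothetical.

```shell
#!/bin/sh
# Assumed default address from this guide; override with SOLR_URL if yours differs.
SOLR_URL="${SOLR_URL:-http://localhost:8983/solr/}"

# solr_is_up URL
# Succeeds if an HTTP request to URL completes without an HTTP error status.
solr_is_up() {
    # -s: quiet, -f: treat HTTP errors (4xx/5xx) as failure,
    # -o /dev/null: discard the page body, --max-time: give up after 5 seconds
    curl -s -f -o /dev/null --max-time 5 "$1"
}

if solr_is_up "$SOLR_URL"; then
    echo "Solr answered at $SOLR_URL"
else
    echo "No answer from $SOLR_URL (is start.jar still running?)"
fi
```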

6. Open /var/lib/newscoop/application/configs/application.ini-dist with your editor of choice and uncomment this line (by removing the semi-colon at the beginning of the line):

listener[] = search.indexer.article 

7. Newscoop looks for Solr by default at http://localhost:8983/solr. If your Solr installation is running on a different address or port, you can override this by changing the following line in /var/lib/newscoop/application/configs/application.ini-dist (the same file as in step 6):

search.solr_server = 'http://<url>:<port>/solr' 

Make sure you add that line for both the production and cli environments.
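To confirm the edits from steps 6 and 7 without opening the file, you can list the relevant lines. This is a minimal sketch using grep; the function name show_solr_settings is hypothetical, and the default path is the one given in step 6. A line that still starts with a semi-colon is commented out.

```shell
#!/bin/sh
# Path from step 6; override with INI if your installation differs.
INI="${INI:-/var/lib/newscoop/application/configs/application.ini-dist}"

# show_solr_settings FILE
# Prints, with line numbers, every line that sets the article indexer listener
# or search.solr_server, whether or not it is commented out with ";".
show_solr_settings() {
    grep -n -E '^[[:space:]]*;?[[:space:]]*(listener\[\][[:space:]]*=[[:space:]]*search\.indexer\.article|search\.solr_server[[:space:]]*=)' "$1"
}

if [ -f "$INI" ]; then
    show_solr_settings "$INI" || echo "No Solr settings found in $INI"
else
    echo "Config file not found: $INI"
fi
```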

8. Newscoop content must be indexed in Solr before search can return any results. You can run the following command manually to get some data for testing:

cd /var/lib/newscoop
php scripts/newscoop.php index:update

In production environments you would set up a cron job to run the same command periodically, so that your Solr index is updated with new and changed articles from Newscoop.
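As an illustration, a crontab entry for this could look like the line built below. The 15-minute interval is an assumption, not a Newscoop recommendation; choose one that matches how often your content changes. The sketch only prints the entry, so that you can paste it into the crontab of a user allowed to run the Newscoop scripts (for example with crontab -e).

```shell
#!/bin/sh
# Assumed installation path (from step 8) and an assumed 15-minute schedule.
NEWSCOOP_DIR="/var/lib/newscoop"
CRON_LINE="*/15 * * * * cd $NEWSCOOP_DIR && php scripts/newscoop.php index:update"

# Print the entry; nothing is installed automatically.
printf '%s\n' "$CRON_LINE"
```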

Preparing your Newscoop templates

Instead of list_search_results you have to use list_search_results_solr in your Newscoop templates for Solr search results to be displayed. The parameters are:

Parameter Description
q Query string. Default value: $_GET['q'].
qf Fields to query. Can be title, type, webcode, authors, topics, keywords, or a custom field name. Separate multiple fields with spaces, e.g. qf="title full_text". You can also boost the relevancy of any field with the ^number operator, e.g. qf="title^3.4". Default value: title.
rows Number of rows to be returned. Use for pagination. Default value: 10.
start Offset of the first row to return. Use for pagination. Default value: 0.
fq Optional filter query. Can be used for results filtering e.g. fq="type:news".
sort Set sorting of results. Default value: score desc.
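
The rows and start parameters together give pagination: with rows results per page, page n (counting from 1) begins at row (n - 1) × rows. A small sketch of that arithmetic follows; the function name page_start is hypothetical and not part of Newscoop.

```shell
#!/bin/sh
# page_start PAGE ROWS
# Prints the zero-based "start" value for a one-based page number.
page_start() {
    echo $(( ($1 - 1) * $2 ))
}

page_start 1 10   # first page
page_start 3 10   # third page
```

With 10 rows per page this prints 0 and 20. The result listing later in this chapter reads the offset from the request with start=$smarty.get.start, so a link to page 3 would carry start=20 in its query string.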

The search form

Here is an example of a search form you could use in a Newscoop template.

{{ form_search_solr id="search" class="hidden-phone" }}
  {{ form_text name="q" value=$smarty.get.q }}
  {{ form_submit name="" value=" " }}
{{ /form_search_solr }}

Listing the results

Here is example template code that lists search results, including article titles, author names, images and photographer credits:

{{ list_search_results_solr fq="type:news" qf="title^5 deck^3 full_text" start=$smarty.get.start }}
  {{ if $gimme->current_list->at_beginning }}
  <ul>
  {{ /if }}
    <li class="news_item {{ cycle values="odd,even" }}">
      {{ image rendition="thumb" }}
      <img src="{{ $image->src }}" alt="{{ $image->caption }} (photo: {{ $image->photographer }})" />
      <span>{{$gimme->section->name}}</span>
      {{/image}}

      <div class="content">
        <h2 class="title"><a href="{{url options="article"}}"> {{$gimme->article->title}}</a></h2>
        <h5 class="author">{{list_article_authors}}
        {{$gimme->author->name}}
        {{/list_article_authors}}</h5>
        <p>{{$gimme->article->deck|strip_tags|truncate:200:"...":false}}</p>
      </div>
    </li>
  {{ if $gimme->current_list->at_end }}
  </ul>
  {{ /if }}
{{ /list_search_results_solr }}

