Newscoop 4 Cookbook

Search engine optimisation (SEO)

Search Engine Optimisation is about improving the way your content is visible on the Internet. This is often understood as making your publication show up as highly as possible in search results. But limiting SEO to tricking search engines would be missing the point. Think about SEO as part of the service that you provide to your readers, not just a mechanism to jump the queue in search algorithms.

Imagine you have published an article about the impact the fall of the Berlin Wall has had on urban planning in that city today. It is named "Right in the middle" and because your web design uses big, trendy letters this short title just looks really good. Your Newscoop template is using the article name in the title tag in the header of the HTML document.

Imagine a potential reader who is typing "Berlin Wall" into their favourite search engine. Amongst the results, somewhere, your article shows up. The search engine will display the content of the title tag in the long list of results. What are the chances that the reader would click "Right in the middle" when looking for specific information about the Berlin Wall? The reader would probably be more likely to click "Fall of Berlin Wall heats up property speculation 20 years later".

This descriptive content increases the chances that readers will click on your article. At the same time, search engines value the content of the title tag highly. This little bit of extra work is likely to catapult your page upwards in the ranks of search results. Where the old title tag did not even mention the Berlin Wall, your new title tag does, and provides additional keywords that will have an impact on your article's ranking and your publication's visibility.

This chapter will help you to make the most of your publication's most valuable asset: your content. The following examples cover a number of small modifications to your templates and other parts of your website which can deliver improved page rank and visibility. The examples focus on SEO practices involving your publication or template structure, with a few journalistic guidelines.

Creating descriptive page titles

Add the field "seo_title" to your article type. This field can be displayed with $gimme->article->seo_title in the header region of your document.

<head>
    <title>{{ $gimme->article->seo_title }}</title>
</head>

However, if the journalist forgot to fill in this field, the title tag of the page would be empty. So you should present a fallback option. A simple way of doing this, providing a reasonable solution for section pages and the home page at the same time, would be:

<head>
    <title>{{ strip }}
        {{ if $gimme->article->seo_title|trim !== "" }}
            {{ $gimme->article->seo_title|escape:'html'|trim }} |
        {{ else }}
            {{ $gimme->article->name|escape:'html'|trim }} |
        {{ /if }}
        &nbsp;{{ $gimme->section->name }} in {{ $gimme->publication->name }}
    {{ /strip }}</title>
</head>

The modifiers trim and escape:'html' make sure the content is clean and safe to embed in HTML: trim removes surrounding whitespace, and escape:'html' converts characters such as <, > and & into HTML entities. If the seo_title field is not filled in, the article name is displayed instead. On a section page, the article values are not displayed, provided that you link to the section using options="section".

Use the "description" meta tag

The description is a summary of what your article is about. The description meta tag goes into the header of your document. Many times, the text in this description will be given as an introduction to the page in a search result. The meta tag looks like this:

<meta name="description" content="...">

Ideally, you should add a field to the Article Type that holds the description content. If this field is empty, you should use text from the main text of the article. A custom description will often be more inviting to a reader, in a list of search results, than the first lines of the main text.

Because the description will most probably come from a WYSIWYG textarea field, it is important to remove any HTML markup with the strip_tags modifier. Opinions on the ideal length for meta descriptions vary. In the following example, we limit the description to the first 150 characters of the article's main text if no custom description has been provided.

<meta name="description" content="{{ strip }}
    {{ if $gimme->article->description_tag|strip_tags|trim !== "" }}
        {{ $gimme->article->description_tag|strip_tags|escape:'html'|trim }}
    {{ else }}
        {{ $gimme->article->full_text|strip_tags|trim|truncate:150|escape:'html' }}
    {{ /if }}
{{ /strip }}" />

Human readable URLs reflecting the content

Information in the URL describing the content of the page is valued highly by search engines. You can control the URL for each issue and section, setting short names. So instead of the section number "/12/", this part of the URL might read "/culture/". You can find these options in the Newscoop administration interface. Select "Settings" in the list of issues and sections.

The article content can be reflected in three different human readable ways in the URL. You can select the option to use the article title, article keywords, or topics linked with the article. If your publication requires it, you can also create a combination of these options. The configuration for the URL display is done in the administration interface under "Configure Publication".

Here are some examples of what these URLs could look like:

  • by article title - http://yoursite.com/en/mar2011/posts/4/Healthy-options-for-your-sweet-tooth.htm (the article title is "Healthy options for your sweet tooth")
  • by article keywords - http://yoursite.com/en/mar2011/posts/4/healthy-options.htm (the article has keywords "healthy" and "options")
  • by article topics - http://yoursite.com/en/mar2011/posts/4/health-dine-wine-tomato-garlic-bread.htm (the article has topics "health", "dine", "wine", "tomato", "garlic" and "bread")
  • by combining some - or all - of these options

Structure heading tags properly

Heading tags (h1, h2, h3, ...) reflect the hierarchy of the content on a page. This is how search engines read them, so you should design your page in the same way for humans. For example, when designing a page, don't use heading tags to control the layout.

HTML 5 is not very different from HTML 4 when it comes to SEO, so the rules are almost the same:

  • Use only one H1 element on any page
  • You can use any number of H2, H3, H4, H5, H6 elements on any page, as long as they follow this hierarchy
  • Use <ul> or <ol> tags for lists
  • For menus, use the <ul> tag in HTML 4 and the <nav> tag in HTML 5
  • Use <div> tags for styling blocks inside a template
  • Use text tags like <p> or <span> only for content

The logic for using only one H1 element is derived from the fact that search engines identify <h1> tags as page titles. Sometimes, search engines ignore <title> tags because they have been abused by webmasters.

HTML 5 also introduces new tags like <header>, <footer>, <nav>, <article>, <aside> and <section>. Search engines disqualify the use of multiple <header> tags if they are positioned one after another, but not if they are used as headers for each <article> tag. The same thing happens for <nav> and <footer> tags. A page can have multiple <article>, <aside> and <section> tags, each of these containing just one <header>, <footer> and <nav> tag.
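
As a rough illustration of these rules, the skeleton below shows one possible structure for the body of an article page in HTML 5. The heading texts, section names and link targets are invented for this sketch and do not come from any Newscoop template package. Wrapping the menu's <ul> in a <nav> element is a common convention, not a requirement.

```html
<body>
    <!-- Main menu: a list of links, wrapped in <nav> in HTML 5 -->
    <nav>
        <ul>
            <li><a href="/en/culture/">Culture</a></li>
            <li><a href="/en/politics/">Politics</a></li>
        </ul>
    </nav>

    <article>
        <header>
            <!-- The only h1 on the page: the article headline -->
            <h1>Right in the middle</h1>
        </header>

        <!-- h2 and h3 follow the hierarchy of the content, not the layout -->
        <h2>Urban planning after the Wall</h2>
        <p>Text of the first part of the article.</p>

        <h3>Property speculation</h3>
        <p>Text of a subsection of the first part.</p>

        <h2>The city today</h2>
        <p>Text of the second part of the article.</p>

        <footer>
            <p>Published in the Culture section.</p>
        </footer>
    </article>
</body>
```

Reading only the headings of this page gives a usable outline of the article, which is a quick way to check a template: if the outline makes no sense, the heading tags are probably being used for layout.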

XML Sitemap for your publication

Providing a sitemap in a specific XML format will make it easy for search engines to gain access to your content. The XML sitemap delivers all content that you wish to be indexed in a machine readable file.

Providing a sitemap also makes sure that search engines will find all of your content. A simple example: if you are using Flash to link from one page to another, that link is not followed, because it is invisible to spiders (search engine robots). Listing such "invisible" pages in the sitemap helps search engines to find them.

In order to create a sitemap for your publication, see the chapter about XML, RSS, KML and sitemaps.

Unique URLs: the canonical tag

Canonical tags have one important purpose: to tell search engines what the "clean" URL of the page is. The canonical tag sits in the header of your page. It was introduced in February 2009 by Google, Yahoo and Microsoft and it looks like this:

<link rel="canonical" href="http://www.example.com/" /> 

This is meant to put an end to the issues related to duplicate content. In short: duplicate content was used by some sites to increase their page rank. To prevent this kind of spamming, search engines rated domains with duplicate content lower. But any CMS will need different URLs for the same page, for example when passing on a parameter in the URL for browsing history, login, related items and others. The canonical tag now allows sites to make sure they are not ranked lower because they produce some duplicate content. Using the canonical tag can therefore help your pages rank higher.

You need to adjust the following example to the template and folder names you are using for your publication.

{{ if $gimme->template->name == "package_name/article.tpl" }}
    <link rel="canonical" href="{{ url options="article" }}" />
{{ /if }}
{{ if $gimme->template->name == "package_name/section.tpl" }}
    <link rel="canonical" href="{{ url options="section" }}" />
{{ /if }}
{{ if $gimme->template->name == "package_name/index.tpl" }}
    <link rel="canonical" href="http://{{ $gimme->publication->site }}" />
{{ /if }}
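
To illustrate the intended effect, assume a hypothetical article that can be reached both at its plain URL and with an extra parameter appended. The addresses and the parameter name below are invented for this sketch. The aim is that both variants of the page carry the same tag in the header, so that search engines treat them as one page:

```html
<!-- Requested as http://www.example.com/en/mar2011/posts/4/healthy-options.htm -->
<!-- or as        http://www.example.com/en/mar2011/posts/4/healthy-options.htm?from=newsletter -->
<link rel="canonical" href="http://www.example.com/en/mar2011/posts/4/healthy-options.htm" />
```

It is worth checking the generated source of a few pages in your own publication to confirm that the canonical URL does not itself contain such parameters.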

Additional checklist for your journalists and editors

The following list is not relevant for making templates. But while you are working on SEO, you might as well pass on some tips to your colleagues who are contributing content. At the end of the day, their input will guide the audience to your site.

Explain to the journalists and editors that their input into SEO could dramatically increase the readership of their articles on your publication's site. A little extra effort increases advertiser value, extends the shelf life of the article, and makes the journalist who wrote it much more famous.

When writing descriptions (for both articles and images)...

  • Summarize the content accurately. Below are a few pointers that might help journalists and editors create a good description.
  • Write unique descriptions. While this might appear impossible in large publications with thousands of articles and a large contributing staff, keeping this idea in your head will help avoid being boring and generic.
  • Write for your audience, not for a search engine. Avoid writing a description tag that bears no resemblance to the content of the article. Don't write up a list of keywords, but form a sentence or two.
  • Do not use the same description across your site. This could actually be worse than using no description tag at all. Imagine all search results in a list saying: "Simply the best magazine in the world".

When writing an article or image description, or providing a custom SEO title, include the following elements:

  • Who or what is being shown? What is the name of the person, the animal, the building, the event? "Bird" is better than nothing, but "Eagle" would be better
  • Where is it happening? "Eagle in the sky" is good, "Eagle in the sky above the Alps" is better
  • Why is it happening? Why are you writing about it? Are you writing about an "Endangered Eagle in the sky above the Alps"?
  • When is this happening? Are we looking at an "Endangered Eagle in the sky above the Alps in Spring"?

You don't need to go overboard with building endless descriptions. Prioritize according to your story. The above questions are a good way to get to the essence quickly. In the end, you might settle for "Endangered Eagle flies in the Alps" - this captures the story better and would attract far more readers than "Bird" would ever do.

Make use of the image alt description

When adding an image, do not leave the alt attribute empty. Firstly, screen reader programs for people with visual impairments rely on this information. Secondly, image searches on the Internet will categorise and rank images based on information in the alt attribute. Readers finding your publication through image searches may be more common than you might think.
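
A small sketch of the difference, using the eagle example from the checklist above. The image path is a placeholder invented for this illustration:

```html
<!-- Empty alt text: only appropriate for purely decorative images.
     Screen readers and image searches learn nothing from it. -->
<img src="/images/eagle.jpg" alt="">

<!-- Descriptive alt text: says who, what and where -->
<img src="/images/eagle.jpg" alt="Endangered eagle flies in the Alps">
```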

Link text should relate to the page it links to

When linking to a page, make sure the link text is related to the content of the page you link to. "You can download Newscoop here" is bad. "Download Newscoop for free" is better.
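
In markup, the two variants might look like the sketch below. It assumes that only the word "here" is linked in the first sentence, and the download address is a placeholder, not the real Newscoop download page:

```html
<!-- Weak: the link text says nothing about the target page -->
<p>You can download Newscoop <a href="http://www.example.com/download">here</a>.</p>

<!-- Better: the link text describes the target page -->
<p><a href="http://www.example.com/download">Download Newscoop for free</a></p>
```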