Open Translation Tools


Intellectual Property (IP) is an analogy for describing legal ownership of ideas. Common types of intellectual property include copyrights, trademarks, patents, industrial design rights and trade secrets. Copyright is the legal area that applies most to Open Translation, because in the United States translations are deemed "derivative works," which may infringe copyright when made without permission. However, software patents for translation tools or machine translation algorithms may also be relevant, depending on a number of factors.

Outside the United States, whether or not translations are deemed "derivative works" is hotly debated and varies widely by jurisdiction. Note that part of the debate centers on the nature of translations themselves: are translations simply word-for-word substitutions from one language into another, or are they nuanced interpretations of the meaning of texts (as well as faithfulness where possible to specific words used) from one language to another? In the former case, an argument could be made that a translation is not a derivative work; in the latter case, it is virtually certain that the translation is a derivative work. This distinction matters because your perspective has an impact on the tools (including licenses) and goals of any open translation project. Note that many people have an expansive view of "translation" which includes format-shifting (such as rendering the content for disabled users) and other types of changes to the original work, which again suggests that translations are best understood as derivative works from an intellectual property perspective.

Why License?

Licensing means controlling the rights for use and redistribution of your created work, and it is very important. In the United States, by default you own an "all rights reserved" copyright to any work that you create. For translation, republication, and sharing over the web, you need to understand exactly which licensing rights you have and which of those rights you want to waive in order to further your project.

For example, if you waive certain aspects of your copyright (from an "all rights reserved" to a "some rights reserved" model), it can be beneficial to the quantity, quality, scale and success of your translation project.

Not only does copyright apply to translatable content, but also to the software tools that aid in the translation of that content. When we refer to 'Open Translation Tools', for example, we are generally referring to software tools licensed in a way that allows the free use, reuse, and alteration of the software.

Common Types of Licenses

Choosing the proper license for your software tool or translatable content is extremely important. Below are some common types of licenses and their distinctions.

Full Copyright


The default "all rights reserved."

Copyright was designed to give authors legal protection by granting them exclusive rights to their work for a limited time. The original copyright term in the United States was 14 years, renewable once for a further 14, but corporations have pushed forward legislation that has expanded the term to the lifetime of the author plus 70 years after the author's death. After this time, the work falls into the public domain, the free zone of sharing where everyone is free to use the work. Note that although all signatories of the Berne Convention for the Protection of Literary and Artistic Works have a limited (if overly long) duration for copyright, not all countries unambiguously deposit works out of copyright into the public domain.

Copyright protects content owners from having others steal their work and treat it as their own. However, many content creators unintentionally prevent their work from spreading because they do not communicate (through licenses) which specific uses of their work they wish to allow, such as the right to save an image, document, or other file from the Internet.

Permissive Licenses


"All wrongs reserved."

Permissive licenses allow the work to be re-purposed. From a legal perspective, permissive licenses often exist merely to disclaim warranty. This removes all liability for the copyright holder: if it breaks, you can't say it's our fault. By default a copyright has "all rights reserved." An easy way to think about permissive licenses is "all wrongs reserved." This kind of license says "we don't care about copyright." Some examples of software projects that use a permissive license are the Apache HTTP Server and the Berkeley Software Distribution (BSD).

Permissive licenses can be good because they provide free and open source software, and free content, all of which are unencumbered by any significant restrictions on uses or users. Permissive licenses can be bad because modified versions of the software (for example), often improvements, do not have to be free like the original. For example, Apple's proprietary, closed-source operating system Mac OS X is based in part on the Berkeley Software Distribution. In this view, Apple is basically "free-riding" on the backs of the diligent coders who create BSD.

Some important free tools released under this license are Optical Character Recognition (OCR) tools such as Tesseract and OCRopus, both released under the Apache license. Google currently funds and uses both of these tools in its large book scanning project. Note that the decision steps regarding the use of permissive licenses often vary considerably depending on the nature of the content--the differences among software and other types of content are particularly important.

Copyleft Licenses


"Sharing is caring."

The "problem" of future versions not being free is solved by copyleft licenses. 'Copyleft' as an idea is a largely ethical, philosophical, and political movement that seeks to free ideas from the constraints of intellectual property law. According to the proponents of copyleft, duplication via communication is part of the very essence of what it means to have an idea, that is to say, what it means to share an idea. As they say, "sharing is the nature of creation."

The earliest widely used copyleft license is the GPL. Written in 1989 by Richard Stallman, a computer scientist long associated with MIT, the license was designed to ensure the future success of Stallman's project: GNU. GNU, or GNU's Not Unix (a snarky recursive acronym, I might add), is a massive collaboration project in free, community-developed, community-maintained software.

The GNU project has grown enormously, and the GPL was quickly adopted by free software projects across the world. Today, many open source projects publish their code under the GPL, which stands for GNU General Public License. However, by the copyleft measure, the freest license on the Internet today is arguably not the GPL but the Affero GPL (AGPL), which also requires that source code be offered to users who interact with the software over a network. This license is only used for a few projects, such as Laconica.

Since FLOSS Manuals is designed as a documentation service for Free (as in Libre) and Open Source Software, our text is published under the GPL so that it can be redistributed with the free software that it is a reference for. This text, since it covers some open source tools, is released under the GPL as well.

Creative Commons Licenses

"Saving the world from failed sharing."

Creative Commons (CC) licenses are a family of copyright licenses first published on December 16, 2002 by the Creative Commons organization, a U.S. non-profit corporation founded in 2001 by a Stanford law professor named Lawrence Lessig. The goal of the Creative Commons organization is to enable effective sharing of copyrighted content. "Larry" has written several books on US copyright law; his latest is called Remix: Making Art and Commerce Thrive in the Hybrid Economy. He is also featured in a fantastic documentary called RiP: a remix manifesto. If you're too young to drink in the States, watch the movie. If you're too old to get smashed like a kid in the States, read the book. If you are in neither category, do both. Then you will understand.

Attributes of CC Licenses

The Creative Commons organization sought to create a common language for licenses, so that they would be readable and immediately understandable by anyone, not just lawyers. They created a spectrum of free licenses, with varying freedoms controlled by various attributes. These attributes can be mixed and matched to create a custom license. Not all combinations are possible since some attributes are mutually exclusive. Here are the various licenses composed of the four basic attributes:

Attribution (CC-BY)

Attribution (BY) comes by default with all CC licenses; the only exception is the CC0 waiver, which is not a license. This attribute says "give credit where credit is due." Copying is only permitted when the author or authors of the original work are properly attributed. You may choose to have attribution be the only stipulation (CC BY license), in which case the work can then be repurposed for anything. This is essentially a permissive license.

Attribution Share-Alike (CC-BY-SA)


Share-Alike (SA) means that any derivatives of the original work must be shared under exactly the same license as the original. This option allows for the freedom to 'remix,' but backed by a principle of reciprocity: I share with you, you share with everyone else. The CC BY-SA license also requires that you properly attribute the original author, like all CC licenses. This is a copyleft license.

Attribution No-Derivatives (CC-BY-ND)


No-Derivatives (ND), like the rest of these terms, is fairly straightforward. If present, you may not make derivative works of the original work. The right to 'remix' is reserved by the original author, who may wish that the work not be mutilated or misrepresented. While this term can be used in a free license as it allows for copying, licenses using the ND term are less free than permissive or copyleft licenses, because they make for works that are sterile and not generative. I use the terms "sterile" and "generative" in the Jonathan Zittrain sense.

Because no derivatives are allowed, the ND and SA terms cannot be combined. Note that the lack of permission to make derivative works means that the work may not be translated (under most circumstances). As such, Open Translation projects are not likely to either use or recommend the use of the ND term.

Attribution Non-Commercial (CC-BY-NC-[ND/SA])


The Non-Commercial (NC) attribute is a stipulation that you may reuse the work provided that it is only used for non-commercial purposes. The author reserves the right to monetize and commercialize the work. This means the work cannot be sold by any other person or corporate entity, not even a non-profit corporation. However, the work may be shared freely and, depending on the actual license chosen, adapted (including translated) or modified without permission. If the share-alike (SA) term is also used, then those modifications must be relicensed and made available using exactly the same license. If the no derivatives (ND) term is used, then the original work can only be copied and distributed, not adapted, and then only non-commercially. As with the ND term, licenses with this attribute are less free than permissive or copyleft licenses.

It is worth noting that the interpretation of the non-commercial restriction is not clear-cut. In addition, most proponents of copyleft licensing for software (generally with the GPL) find the non-commercial constraint to be highly discriminatory to legitimate open business models and believe that the NC term should be avoided at all costs.
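The mix-and-match rules above can be sketched in a few lines of Python. This is only an illustration of the rules as described in this chapter, not an official Creative Commons tool: the function names (license_name, allows_translation, all_licenses) are invented for this example, and it assumes that BY is always present, that ND and SA are the only mutually exclusive pair, and that translation counts as making a derivative work.

```python
from itertools import combinations

# Optional terms that can be added to Attribution (BY).
# BY itself is always present in a CC license.
OPTIONAL_TERMS = ("NC", "ND", "SA")


def license_name(terms):
    """Return a name such as 'CC BY-NC-SA' for a set of optional terms."""
    terms = set(terms)
    unknown = terms - set(OPTIONAL_TERMS)
    if unknown:
        raise ValueError(f"unknown terms: {sorted(unknown)}")
    # No derivatives means there is nothing to share alike.
    if {"ND", "SA"} <= terms:
        raise ValueError("ND and SA cannot be combined")
    ordered = ["BY"] + [t for t in OPTIONAL_TERMS if t in terms]
    return "CC " + "-".join(ordered)


def allows_translation(terms):
    """Treat translation as a derivative work: only ND forbids it."""
    return "ND" not in set(terms)


def all_licenses():
    """List every valid combination of the optional terms."""
    names = []
    for size in range(len(OPTIONAL_TERMS) + 1):
        for combo in combinations(OPTIONAL_TERMS, size):
            if {"ND", "SA"} <= set(combo):
                continue  # skip the mutually exclusive pair
            names.append(license_name(combo))
    return names


if __name__ == "__main__":
    for name in all_licenses():
        print(name)
```

Under these assumptions the sketch yields six licenses, and only the two that carry the ND term rule out translation, which is why Open Translation projects tend to avoid them.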

The CC0 Waiver

"No rights reserved."

The CC0 waiver is not a license, but is rather a tool that allows a copyright holder, to the extent possible, to release all restrictions on a copyrighted work worldwide. CC0 was created (at least in part) in response to database rights (known as "sui generis" rights) in the European Union. Because the law treats databases differently in the European Union, the Creative Commons organization combined a public domain dedication with a waiver that releases any and all ownership rights, including those on data. The CC0 waiver is the closest thing to a public domain dedication, as you waive any possible rights.

The Public Domain

The public domain is the free realm of sharing where the original author, if known, retains no rights at all over the work. Some freedom-lovers who seek no recognition for their work simply release it into the public domain with a tool called a public domain dedication. Additionally, when the copyright term expires for a particular work, that work enters the public domain (at least in the United States). Since work in the public domain lacks any copyright protection, it may be leveraged by corporations hoping to profit from the work, as well as anyone else. This is common for reprinting of out-of-copyright texts like Alice in Wonderland, Shakespeare, and other classics. Once a work is in the public domain, the original work can never be copyrighted by anyone, regardless of any copyright claims on new renditions.