Importing Data into CiviCRM
Most organisations have data in sources outside CiviCRM, such as previously used database platforms, spreadsheets created on the fly for specific events or other purposes, and email address books. Because manually entering large amounts of data can be tedious, CiviCRM provides a way do import data en masse if the source can export it into some common format such as a Comma Separated Version (CSV) file.
Considerations before importing
It might be useful to think about your pre-existing data in the same way as the contents of a house or apartment when you need to move. People often use moving as a chance to say, "Do we really need this? This stuff is too old; let's trash it and get some new stuff once we have moved in."
To apply this metaphor to your data, look for fields that have no purpose, such as old organisational divisions that you've abandoned or office locations in a facility you no longer occupy.
Some old data will have continuing value. For instance, one organisation in a financial pinch decided to use an old list of founding donors who had not given money for many years. It turned out that these lapsed donors still had strong emotional ties to the organisation that they had founded and they came to its financial rescue. In that case, saving old data was crucial.
Moving to a new living space doesn't just provide an opportunity to evaluate what's really important to keep and what can be left behind; it also gives you a chance to clean up everything that you do decide to take with you. Just as you wouldn't pack up dusty picture frames and dirty dishes because that would make your nice new clean place as dirty as your old place, you'll want to clean up your data before moving it into CiviCRM so that you're starting off with as clean and useful a database as possible.
In planning for a move to CiviCRM, prepare to spend a good amount of time looking at your old data; standardising how different elements of contacts' records are stored (e.g. are states entered as NY or New York?) and deleting obvious duplicates, accidental entries, outdated information, and corrupted records.
Preparing to import data
Importing data requires considerable attention and care, so we'll present some concepts here that you should know before you start your first import. You can import both core and custom data for contacts, as well as data for event attendance, activities, memberships and contributions. This chapter focuses solely on the import process for contacts, though the processes for other data are similar.
There are two ways to import data:
from CSV files. Most database and spreadsheet applications (e.g. OpenOffice.org Calc, Google Spreadsheets, Microsoft Excel) can create and manipulate files in this format. It is often easier to view and clean your data when it's in a CSV file than while it's still inside your old database.
Each column in your CSV file will map to a field in CiviCRM, so make sure you use a different column for every distinct bit of information.
Depending on your country or region, fields in your CSV files might be separated by semicolons (;) instead of commas. If so, you'll need to change the Import/Export Field Separator value in the CiviCRM Localization settings by going to the navigation menu and choosing Administer > Configure > Global Settings > Localization.
- from another SQL or MySQL database stored on the same server, using an SQL query.
If you do not have a clear understanding of your existing data and how it will map to CiviCRM fields, you will experience frustrations and problems when you try to import the data. Please read about each type of data in other sections of this CiviCRM Manual and visit the CiviCRM online documentation for more information: http://wiki.civicrm.org/confluence/display/CRMDOC/Importing+Data
The following rules and recommendations will help you to import data with minimal problems:
- Always test your data import with a small subset of your records. After importing the test set, visit the records within CiviCRM and ensure that the data was imported and functions as you expected.
- It can be helpful to create a test contact that has every attribute you've defined in your existing data set. Then import the contact and check results to ensure that CiviCRM correctly represents all the data.
- When you map the columns or fields from your source data to CiviCRM fields during the import, CiviCRM can save this field mapping as an import map for future use. This is helpful if you will be importing multiple files with the same structure. To save an import map for future use, click the "Save this field mapping" check-box at the bottom of the Match Fields screen of the import wizard and enter an appropriate name and description. To reuse a saved import map, select it from the Load Saved Field Mapping dropdown menu on the Choose Data Source screen (step 1) of the import wizard.
- If your imports are timing out or taking too long, try splitting up the imports into smaller batches. If you have the appropriate permissions on your web server, you can also increase the memory_limit and max_execution_time values in the file php.ini.
- You can add all of the contacts imported in an import to new or existing groups or tags. All of the contacts in a single import will be given the same groups and tags. This limitation has a couple effects on your import:
- Make sure that you assign groups and tags that are applicable to every contact in the imported set. If you need to assign groups or tags on a contact-by-contact basis, import contacts in small, discrete batches in which all contacts share the same tags and groups. Alternatively, you can create searchable custom data fields in CiviCRM that contain the groups and tags that you want to assign to imported contacts. After the import you can run searches on those fields and use the "Add Contacts to Group" or "Tag Contacts" batch actions on the search results.
- You can use this feature to manage the import. Consider adding contacts to a new group or tag that indicates what batch of imports the contacts were a part of, thereby allowing you to easily identify when a contact was imported and undo an entire import if necessary.
- CiviCRM stores first names and last names in separate fields, so these should appear as separate columns in your CSV file. The same goes for city and postal code/zip code. Most spreadsheet programs contain tools that automate the process of splitting text across fields.
- Ensure that your country names are expressed in the same way as they are in CiviCRM, i.e., 'United States', not 'United States of America', and 'United Kingdom', not 'Britain'.
- If you are importing multiple locations, the first location will be set as the primary location address. You may want to move your columns around to ensure that the desired location becomes the Primary Location. You may also need to split your import so that some records have one type of record as their Primary Location, while others have a different one.
- If you are importing data into multi-choice (e.g. check-box or radio button) custom fields, your data source can use either the label (what's visible to the user in the CiviCRM front end) or the value (what's actually stored in the database for that choice). CiviCRM will recognise it and import it appropriately. When importing into multi-choice core data fields, you can specify only the value(s) in your data source, not the label.
- If you are updating multiple choice options, new values will replace the entire field. For example, if you update the value of the Colors field to be "orange" for a contact that currently has Colors set to "blue", the result will be that Colors is set to orange, not orange and blue.
- Make sure your data source uses an accepted date format and that you select the same date format on the Choose Data Source screen of the import wizard.
- Make sure any name prefixes and suffixes you use have been set up in the administration interface (go to: Administer > Option Lists in the navigation menu).
- If you plan to do additional imports of related data that's associated with your contact data, e.g. contribution data, event participation data, membership data, you can make things easier by ensuring that your contact records have unique IDs that are also associated with the related data. When you do the initial import of your contact data, import these unique IDs and map them to CiviCRM's External ID field, so that you can then use your original (or legacy data) IDs to match to the contact records records for later imports of the related data.
Setting up a CSV file for importing
Example of spreadsheet .csv format
When thinking about setting up your spreadsheet, think about the data that you are collecting and plan out your column headings. Keep in mind that you may need to create more than one .csv file and perform multiple imports before you are finished.
If you plan to import related data that pertains to a specific contact, e.g. event participant information, contribution data, etc., you will need to make sure that each contact record has a unique identifier or the contact record should have First Name, Last Name and Email, so that you can link their related data during later imports. If you have unique ID, you would map the ID to CiviCRM's External Identifier on import.
Running an import
The import process has four steps.
Step 1: Setup
Setup lets you specify the basic details of your import, including the source of the data. Data can come from either a CSV file, or an SQL query of a database on your server. A check-box lets you indicate whether the first row of your file contains column headers.
Imports use the default strict rule to decide whether a contact record is a duplicate (refer to the Deduping and Merging chapter in this section for information on duplicate matching rules in CiviCRM). You can specify what action to take when an import encounters a duplicate:
- Skip: skip the duplicate contact, i.e. leave the original record as it is.
- Update: update the original record with the database fields from the import data. Fields that are not included in the import data will be left as they are.
- Fill: fill in the additional contact data, if it contains fields that are missing or blank in the original records, and leave fields which currently have values as they are.
- No Duplicate Checking: this inserts all valid records without comparing them to existing contact records for possible duplicates.
Import mappings tell CiviCRM how the fields of data in your import file correspond to the fields in CiviCRM. The first time you import from a particular data source, it's a good idea to check the box to "Save this field mapping" at the bottom of the page before continuing. The saved mapping can then be easily reused the next time similar data is imported, by requesting that it be loaded at this step.
Step 2: Match the fields
If you had column headings in your file, these headings will appear in the first column on the left-hand side of the Field Map, while the next two columns show two rows of data in your file to be imported, and the fourth column is the Matching CiviCRM Field. If you loaded an import mapping in Step 1, your choices will be reflected here. You can change them if they are inappropriate for this import.
The matching CiviCRM fields include standard CiviCRM data such as First Name and Last Name as well as any custom data fields that have been configured for use with contact records on your site. Match the fields by clicking the dropdown list and selecting the appropriate data. For example, if the heading of the second column in your input is Surname, you should choose Last Name as your Matching CiviCRM Field.
Select "- do not import -" for any columns in the import file that you don't want to import into CiviCRM.
If you have a saved mapping for a specific set of spreadsheet columns, and your spreadsheet layout has changed (for instance, you need to import additional fields, so you add the appropriate columns of data in the spreadsheet), you can modify and save the field mapping. One tip to ease the mapping process when you need to import additional fields is to place the additional columns of data in your import spreadsheet to the right of the columns you've previously mapped in CiviCRM. This allows you to use the existing saved field mapping to map the initial import fields, and then continue mapping the new data fields.
Note that if you add new data columns in your spreadsheet and do not position the columns AFTER the columns you previously mapped, you then can't use the saved mapping and will have to map all your import fields again.
Once you've mapped your fields, you can decide if you want to keep the original saved mapping unchanged, or check the box to "Update this field mapping" to include the new field mappings.
Step 3: Preview
This screen previews the results of importing your data, reports the number of rows to be imported, and allows you to double check your field matches.
If some of the rows in your spreadsheet contain data that doesn't match CiviCRM's requirements for one or more fields, you'll see an error message with a count of the invalid rows (see the screenshot below). Click the Download Errors link and review the errors reported in the downloaded file, so you can fix them before doing the import.
At the bottom of the form, you can choose to add the contacts to an existing group, import to a new group, create a new tag, or tag imported records. Adding imported records to a separate group is strongly recommended in order to be able to quickly find the imports and, if necessary, delete and reimport them.
Step 4: Summary
The final screen reports the successful imports along with Duplicate Contacts and Errors. If you have set the import to add all contacts to a Group or Tag, you can click through to see your imported contact records.
At this point it makes sense to check to make sure that your import has worked as expected. Search for the contacts that you just imported and examine their fields and custom data to make sure all is as expected.
Importing relational data
We have just described the process of importing one data file. But what about if you want to import related data, like parent child relationships, activities, contributions, etc.? For each type of data you want to import, you will need to import a seperate CSV file.
CiviCRM has specific tools for importing related contact data and a set of specific import tools for contributions, memberships, event participation etc. (and you should see specific chapters for details of how to use these tools). To import relationships, you should run multiple contact imports.
For example if we want to import data for children and then for both parents, we run three imports, one for the child, one for the father and one for the mother.
We first import the child remembering to include an external identifier that we can use to match the child to their parents. We then import the father, and then the mother, as related contacts, linking them to the child using the child's external identifier.
In the example below we have one CSV file which contains father and mother information. We use this CSV file twice as part of the import. Have a look at the fields below to understand what is happening.
We are linking the father to the original child using the external identifier and are then importing the related father name using the 'Child of' relationship type.
When the import is done, go back and verify the data by searching for the parent and examining the relationship tab. They should have a relationship linking them to the child.
You can then repeat this process for the mother, and also for other relationships as necessary.
For many organisations, an important element of cleaning your data is standardising addresses. In the US, this means conform to conventions defined by the United States Postal Service's Standards for Addresses. Standardising how addresses are entered into CiviCRM will allow for more accurate search results when searching by address, as CiviCRM can parse addresses based on the USPS standards if you choose to do so. To find out more about how Address Parsing is handled and used in CiviCRM, refer to the Installation chapter of the Configuration section of this manual. When adding or editing contacts, you will enter and edit such address elements as street number, street name, and Apt/Unit/Suite number according to these standards.
To find out more details about the USPS' Standards for addresses, refer to their Publication 28 athttp://pe.usps.com/text/pub28/welcome.htm orhttp://pe.usps.com/cpim/ftp/pubs/Pub28/pub28.pdf for the pdf version.