California Cancer Registry
Volume 2, Issue 1
Geocoding and Data Quality
The practice of geocoding, or associating a text address to its geospatial location and coordinates, has
become a critical tool in the collection and analysis of cancer data. At the central cancer registry, we
geocode each patient’s street address at the time of diagnosis. During a patient’s tumor abstraction,
registrars make their best effort to collect the highest quality tumor data. However, I would like to
advocate that more emphasis be placed on collecting the most accurate patient address information.
Both the North American Association of Central Cancer Registries (NAACCR) and the National Cancer
Institute (NCI) require their members to geocode tumor patient data to the census tract level. NAACCR
and NCI assess the completeness of geocoded data on a yearly basis and include this assessment in
evaluating overall data quality. To meet these stringent data quality measures, we need
complete and accurate address information.
In the latest iteration of geocoding for the NCI data submission, various data issues were identified
related to address data. Unfortunately, the correction of address data creates a huge workload for
central registry staff. In preparation for the NCI SEER data submission in October 2013, more than 80
hours of work time were spent on manually reviewing and correcting address information at the
central registry; many more hours were most likely spent at the regional registry level. This is in
addition to the thousands of dollars spent to outsource geocoding activities to external vendors who
manually correct and geocode problem addresses. Many of the issues identified, such as misspellings
or abbreviations of street names, could have been avoided. To mitigate this problem in the future, below
are some helpful reminders when entering address data (from CCR Volume I, Section III.2.5.2 Number
and Street at DX):
Directions (e.g., North, West) and street types (e.g., Avenue, Road) may be abbreviated (e.g.,
N MAIN ST). However, do not abbreviate a direction that is the name of a street (e.g., 123
If the address is longer than the allowable 60 characters, omit less important information,
such as apartment number or space number.
If a street or postal box address cannot be determined, enter “UNKNOWN” in the field.
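As an illustration only, the length and "UNKNOWN" reminders above could be checked at data entry with something like the following sketch. The helper name prepare_street_address, the separate unit argument, and the upper-casing (modeled on the N MAIN ST example) are assumptions made for this example and are not part of CCR Volume I or any registry software.

```python
MAX_LEN = 60  # allowable field length quoted in the reminders above


def prepare_street_address(street, unit=""):
    """Hypothetical helper: build a Number and Street at DX value.

    street: house number and street name as abstracted
    unit:   apartment or space number, treated as less important
    """
    # Collapse repeated whitespace; upper-casing is an assumption.
    street = " ".join(street.upper().split())
    unit = " ".join(unit.upper().split())

    # No street or postal box address could be determined.
    if not street:
        return "UNKNOWN"

    full = f"{street} {unit}".strip()
    if len(full) <= MAX_LEN:
        return full

    # Too long: omit the less important unit information first.
    if len(street) <= MAX_LEN:
        return street

    # Still too long: leave the decision to the registrar.
    raise ValueError("address exceeds 60 characters; shorten manually")


print(prepare_street_address("123 n main st", "apt 4"))
print(prepare_street_address(""))
```

A check like this does not catch misspelled street names, which still need to be verified against the medical record.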
The CCR Volume I is a great resource for determining how address data should be collected from the
medical record. Below are a few important guidelines for collecting accurate address information:
Enter the address of the patient's Usual Residence on the date of the initial diagnosis. Usual
Residence is where the patient lives and sleeps most of the time and is not necessarily the
same as the legal or voting residence.
If both a street address and a P.O. Box are given, use the street address.
For military personnel and their families living on base, the address is that of the base. For
