Wishlist: Postal Address Validation

Bug #1569889 reported by Josh Stompro
This bug affects 7 people
Affects: Evergreen
Status: Triaged
Importance: Wishlist
Assigned to: Unassigned

Bug Description

Hello, something that I have always wanted in an ILS is built-in postal address validation. This bug report is just to have a place to discuss this issue.

Postal address validation is the process of making sure that a postal address (where mail is delivered) is in the USPS database of delivery points. https://en.wikipedia.org/wiki/Postal_address_verification

The validation can be done at the point of entry, such as in the patron registration form, the self registration form, or from my account in the catalog. It can also be done in batch at a later date, or as a delayed backend process.

The built-in ZIP code lookup is useful, but cannot handle certain situations, like ZIP codes that contain more than one city/county, which comes up all the time for several of our locations that are at the edge of counties. The address validation could replace the ZIP code lookup for sites that choose to use it.
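To illustrate why a plain ZIP lookup falls short, here is a minimal Python sketch. It is not how Evergreen's lookup works; the function names and the sample ZIP codes, cities, and counties are made up for illustration. The point is only that a ZIP code that maps to more than one city/county pair cannot be auto-filled safely.

```python
from collections import defaultdict

def build_zip_index(rows):
    """Group (zip, city, county) rows by ZIP, keeping each pair once."""
    index = defaultdict(list)
    for zip_code, city, county in rows:
        entry = (city, county)
        if entry not in index[zip_code]:
            index[zip_code].append(entry)
    return dict(index)

def lookup_city_county(index, zip_code):
    """Return (city, county) only when the ZIP maps to exactly one pair.

    Returns None for unknown ZIPs and for ZIPs shared by several
    cities or counties, where a person or a full address check must decide.
    """
    matches = index.get(zip_code, [])
    if len(matches) == 1:
        return matches[0]
    return None

# Hypothetical sample data, not real postal data.
SAMPLE_ROWS = [
    ("00001", "Anytown", "North County"),
    ("00002", "Edgeville", "North County"),
    ("00002", "Edgeville", "South County"),  # ZIP straddles a county line
]

index = build_zip_index(SAMPLE_ROWS)
unambiguous = lookup_city_county(index, "00001")
ambiguous = lookup_city_county(index, "00002")
```

Here `unambiguous` is a single pair, while `ambiguous` is None because the ZIP alone does not identify the county.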

There are numerous vendors that supply software and APIs to provide the validation. I've played around with one called smartystreets.com in the past, since they provided free access for non-profits. It doesn't look like they provide that anymore, but it does look like they have free accounts with 250 lookups a month.

For my own project I think I'll work on doing a batch update of all current addresses to clean up our database. It might make sense to add some fields to the actor.usr_address table to better track validation status for each address. There is a bunch of extra address metadata returned when an address is verified that may make sense to store.

Josh

Tags: patron
Kathy Lussier (klussier)
Changed in evergreen:
status: New → Triaged
importance: Undecided → Wishlist
Elaine Hardy (ehardy)
tags: added: patron webstaffclient wishlist
Katherine Dannehl (kdannehlpails) wrote :

My library system is experiencing major consequences from bad mailing address data entry. We just had 10% of our recent mailers come back because of incorrect/nonexistent addresses and mail forwarding issues. We had to pay for all of the return postage.

If anyone has experience with cleaning up mailing address data, I'd love to know how you did it. I'm new to SPARK and haven't found good documentation on batch updating patron records.

I am also here to second Josh and say that I would love to see an integrated postal address validation in the ILS.

Elizabeth McKinney (ebethmdg) wrote :

PINES conducted a patron database cleanup a few years ago. We contracted with Unique Management and Emerald Data Networks to do patron deduplication and address correction on our patron db of ~3 million. We had to manually look at a subset of potential dups. It was not horribly expensive and well worth doing.

Elizabeth McKinney (ebethmdg) wrote :

Experian offers postal and email address validation services. These services provide a way to validate identity. If we develop a connector to this service, it could allow for a more effective way to register patrons than the old driver's license and a piece of mail.

Oh, I will also mention as an addendum to my previous message about patron DB cleanup, Unique Management offers address correction services coupled with their overdue notice service.

tags: removed: webstaffclient wishlist
Millissa (millissam) wrote :

My director was asking about this recently and I was wondering if there had been any new development for this feature.

Terran McCanna (tmccanna) wrote :

I'm not aware of anyone working on this in Evergreen yet. PINES is currently contracting with a third party vendor (Quipu) to do postal address checks for self-registrations. Currently we're just having them verify the addresses are valid and Quipu is putting them into the USPS format. (Identity validation is beyond our budget at this time, unfortunately.)

There is no technical reason (just development time and funding reasons) for Evergreen not to be able to do this without relying on an intermediate vendor, though - the USPS has a full set of public APIs that are freely available: https://www.usps.com/business/web-tools-apis/

Jeff Godin (jgodin) wrote :

Terran-

The USPS APIs you refer to have the following line in their TOS:

> * User agrees to use the USPS Web site, APIs and USPS data to facilitate USPS shipping transactions only.

(this is in a small text box on the registration page, at https://registration.shippingapis.com/ , which is itself referenced in the FAQ at https://www.usps.com/business/web-tools-apis/#dev )

That would seem to rule out our use of those APIs in Evergreen or in relation to patron data, except perhaps in the limited case of home delivery services sent through USPS.

USPS has a variety of other data products available, including raw data files.

There are also other providers with APIs, many of which are themselves purchasing the data from USPS.

Another option we've looked at is utilizing US Census and local government GIS data, or even the Google Geocoding API (starting at $5 per 1,000 requests), but that focuses more on "where do you live/own property" and not as much on "can I deliver mail to this address".

I'm interested in improving how Evergreen handles both of those things, and will likely continue to research options.

Jeff Godin (jgodin) wrote :

Noting for the record that upon reviewing the terms of the Google Geocoding API, I'm reminded that it's likely not suitable for our use without a privately-negotiated ("contact Sales") license.

Jeff Godin (jgodin) wrote :

The $17 per 1,000 requests Address Validation API is more likely to be suitable for our needs, but like the Google Geocoding API(s), it has some rather strict terms that apply to storing/caching the data, batch requests, and seem to require that you notify end users that they are bound to Google terms of service and privacy policies.

https://developers.google.com/maps/documentation/address-validation

(Those downstream / flow-down requirements may be common with other services. We may find that it isn't just limited to Google APIs.)

Josh Stompro (u-launchpad-stompro-org) wrote :

We have been using Smarty for a few years to validate addresses in batch, and to produce a report on mailing addresses that fail validation. We pay $588 a year for 60K lookups a year.

They also offer an autocomplete API, to fill in correct addresses as they are entered. I really want that feature for staff and customers. It seems that a significant percentage of bad addresses are because of data entry typos.

I'm using a couple of scripts that grab new mailing addresses that haven't been validated recently and submit them to Smarty using a batch tool they have, then load the results into a DB table. Another script then reviews the results, updates the addresses that can be standardized, and lets us know which ones need to be reviewed. We have a system of adding certain messages to the patron's account if the address seems to be correct when checking with the customer.
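For anyone wanting to try a similar workflow, the select-then-triage logic could be sketched roughly as below. This is not the actual scripts described above and it does not call any vendor API. The `Address` and `Result` classes, the `needs_validation` and `triage` functions, and the 365-day re-check window are all hypothetical, and the validation results are assumed to have already been loaded from whatever batch tool is in use.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

@dataclass
class Address:
    # Hypothetical stand-in for a patron mailing address row.
    id: int
    street: str
    city: str
    zip_code: str
    last_validated: Optional[date] = None

@dataclass
class Result:
    # Hypothetical shape of one validation result loaded from a batch run.
    address_id: int
    deliverable: bool
    standardized_street: Optional[str] = None

def needs_validation(addr, today, max_age_days=365):
    """True if the address was never validated or the last check is stale."""
    if addr.last_validated is None:
        return True
    return today - addr.last_validated > timedelta(days=max_age_days)

def triage(addresses, results, today):
    """Apply standardized results; collect everything else for staff review.

    Returns (updated_ids, review_ids).
    """
    by_id = {a.id: a for a in addresses}
    updated, review = [], []
    for r in results:
        addr = by_id[r.address_id]
        if r.deliverable and r.standardized_street:
            addr.street = r.standardized_street
            addr.last_validated = today
            updated.append(addr.id)
        else:
            review.append(addr.id)
    return updated, review
```

Keeping a last-validated date per address is what makes the "haven't been validated recently" selection possible, and it is also the kind of extra field that could live alongside the address record.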

We have also been using our State's poll finder database to perform some validations. That gets us accurate city limits info. But it is just ranges of street addresses, so it isn't great for 100% validation. It did let me find all the USPS-validated addresses with wrong county info, though. That is just a CSV file that we have to query.

Josh

Jeff Godin (jgodin) wrote :

While geocoding is not address validation (and vice versa), I wanted to mention that the US Census Bureau has a recently-new-to-me geocoding API for residential addresses, which includes options to return geographic subdivisions such as township / city / etc. Many Evergreen libraries may track this information using manually-updated patron stat cats.

The geocoder also returns a weakly-normalized "matched address", though again not nearly as authoritative as USPS addressing data (direct or through a reseller), and with less tolerance for certain mismatches.

There's potential in this bug or a sibling bug for us to populate that kind of data automatically, and apparently for free, without needing to pay per lookup, or needing to build a geocoder based on the US Census data files.

Of course, there are some tradeoffs, and some might want to point at their own locally-hosted geocoder.

More information on the offering from the US Census Bureau is available at https://geocoding.geo.census.gov/geocoder/
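As a rough illustration of what a lookup against that service might look like, the Python sketch below only builds the request URL and makes no network call. The endpoint path (`locations` or `geographies` followed by `onelineaddress`) and the `address`, `benchmark`, `vintage`, and `format` parameters reflect my reading of the Census geocoder documentation and should be checked against the current docs. The function name and the sample address are made up.

```python
from urllib.parse import urlencode

# Base URL from the Census geocoder page linked above.
BASE = "https://geocoding.geo.census.gov/geocoder"

def census_geocode_url(address, with_geographies=False,
                       benchmark="Public_AR_Current",
                       vintage="Current_Current"):
    """Build a single-line-address lookup URL for the Census geocoder.

    with_geographies=True asks for geographic subdivisions in addition
    to the matched address. Parameter names and default values here are
    assumptions to verify against the official documentation.
    """
    params = {"address": address, "benchmark": benchmark, "format": "json"}
    if with_geographies:
        params["vintage"] = vintage
        return_type = "geographies"
    else:
        return_type = "locations"
    return f"{BASE}/{return_type}/onelineaddress?{urlencode(params)}"

# Hypothetical address, for illustration only.
url = census_geocode_url("1 Main St, Anytown, MN", with_geographies=True)
```

Making the base URL a constant also leaves room for sites that would rather point at their own locally-hosted geocoder.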
