Newsgroups: comp.infosystems.gis
From: jguthrie@wnet.gov.edmonton.ab.ca (James Guthrie)
Subject: Re: Database structures for Street Addresses
To: rpl@netcom.com
Organization: WinterNet
Date: Thu, 25 Jan 1996 19:17:07 GMT

Address databases are much more complex than people think, and their end
use determines much of the needed complexity: address matching, address
lookup, spatial allocation, spatial analysis, thematic analysis, etc. all
have special address-related characteristics.

Keep in mind that an address is only an identifier for one or more
entities (land, structures, occupancies, events, etc.) over some range of
time. The address has to be associated with the thing(s) or events to
which it applies in an explicit relationship. The name is not the thing
it describes: just as John Smith could be several people and a good
bottle of beer, 10310 102 Ave. NW, Edmonton, AB. T5J 2X6 could describe
many different things depending on context.

Regarding the format of address fields, we have: SuiteName or
SuiteNumber, HouseNumber, HouseNumberSuffix, (StreetName, StreetDirection,
StreetType) or (StreetNumber, StreetNumberSuffix, StreetType), Quadrant.
Beyond this there is the handling of municipality name and postal code
(ZIP in the US). Then there are NamedLocations, which are often used as
surrogate addresses.

As is pointed out in the earlier posting, Soundex or some other spelling
and format standardization is vital when using uncontrolled lists of
addresses and attempting to match them up or link them to spatial
locations.

I would suggest that you look through the proceedings of URISA
Conferences, where some 30 years of experience in the field are covered.
Some authors are: P. Eichleberger, W. Hurst, W. Huxhold, and myself
(J. Guthrie).

Careful planning can result in flexible and powerful systems;
oversimplification can result in frustration and disappointment.

In article <rplDLKDMI.FsE@netcom.com>, rpl@netcom.com (Robert Laudati) wrote:
>John Lorenz (lorenz@netcom.com) wrote:
>
>: Personally, I don't see the need for so many fields. I believe all that
>: is required are two fields, one for the street number and the other for
>: the complete street name.
>: This will provide compatibility between
>: databases that have complete addresses (parcels, service addresses) and
>: databases that only use street names (street light inventory, sign
>: inventory, etc.).
>
>John:
>
>I thought many others would respond to your post, but since the replies
>have been few (unless you've received personal emails), I'll jump in.
>
>Your reasoning is valid only if your GIS is to serve merely as a viewable
>one, e.g., clicking on a street and presenting the information as you cite
>as an example. However, if you want to perform other tasks such as geocoding,
>911 incident location, etc., you need a more detailed address structure.
>
>For example, you are given a dataset of recent fires, including an address,
>date, and amount of damage. You want to create a GIS point data set in order
>to determine whether a new distribution of fire stations could have alleviated
>the destruction. Unfortunately, the fire dataset was typed in by an
>insurance salesman who was not aware of your address conventions. So the
>list looks like:
>
>1200 West Av        1/2/94     10000.00
>1202 West           1/2/94      8000.00
>100 Lafayette       3/16/95    20000.00
>98A W Tyler St      5/20/95      100.00
>98 B Tyler St       5/20/95     2000.00
>1322 Bell Rd        6/14/95    75000.00
>16 Mayfield         unknown     4500.00
>1400 Bell Street    6/20/95    60000.00
>
>As you see, there are several inconsistent addresses that your "complete"
>address will not catch. Without the address fields parsed out, your GIS
>may have "West Avenue", so neither of the first two records will match.
>If "Lafayette" is really "Lafayette Square" again there is no match. Which
>"Bell" is correct? With distinct fields, you can at least find all the
>"Bell" street names and make the determination.
>
>Particularly if your area has similar street names with different types
>such as the Bell example, or East/West or North/South prefixes, expect that
>incoming data from other agencies will be inconsistent with respect to these
>address components.
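
The parsing step described in the quoted text can be sketched roughly as
follows. This is only an illustration, not anyone's production parser: the
lookup tables, the Address field names, and the rule that a lone
non-direction letter after the number is a house-number suffix are all
assumptions made for this example. It splits the fire-list addresses into
components and groups them by street name, so that every "Bell" candidate
can be reviewed together.

```python
import re
from collections import defaultdict
from typing import Dict, List, NamedTuple

# Hypothetical lookup tables; a real system needs a fuller, local list.
STREET_TYPES = {"av": "AVE", "ave": "AVE", "avenue": "AVE",
                "st": "ST", "street": "ST",
                "rd": "RD", "road": "RD",
                "sq": "SQ", "square": "SQ"}
DIRECTIONS = {"n": "N", "s": "S", "e": "E", "w": "W",
              "north": "N", "south": "S", "east": "E", "west": "W"}


class Address(NamedTuple):
    number: str
    suffix: str
    direction: str
    name: str
    street_type: str


def parse_address(text: str) -> Address:
    """Split a free-form street address into distinct components."""
    tokens = text.replace(".", "").split()
    m = re.fullmatch(r"(\d+)([A-Za-z]?)", tokens[0]) if tokens else None
    if m is None:
        raise ValueError(f"no house number in {text!r}")
    number, suffix = m.group(1), m.group(2).upper()
    rest = tokens[1:]
    # Assumption: a lone letter that is not a direction ("98 B Tyler St")
    # is a detached house-number suffix.
    if (not suffix and len(rest) > 1 and len(rest[0]) == 1
            and rest[0].isalpha() and rest[0].lower() not in DIRECTIONS):
        suffix = rest.pop(0).upper()
    # Strip a trailing street type, but never the last remaining token.
    street_type = ""
    if len(rest) > 1 and rest[-1].lower() in STREET_TYPES:
        street_type = STREET_TYPES[rest.pop().lower()]
    # Strip a leading direction only if a name is left ("West Av" keeps WEST).
    direction = ""
    if len(rest) > 1 and rest[0].lower() in DIRECTIONS:
        direction = DIRECTIONS[rest.pop(0).lower()]
    return Address(number, suffix, direction, " ".join(rest).upper(), street_type)


def group_by_name(lines: List[str]) -> Dict[str, List[Address]]:
    """Collect parsed addresses under their normalized street name."""
    groups: Dict[str, List[Address]] = defaultdict(list)
    for line in lines:
        addr = parse_address(line)
        groups[addr.name].append(addr)
    return groups


SAMPLE = ["1200 West Av", "1202 West", "100 Lafayette", "98A W Tyler St",
          "98 B Tyler St", "1322 Bell Rd", "16 Mayfield", "1400 Bell Street"]

if __name__ == "__main__":
    for addr in group_by_name(SAMPLE)["BELL"]:
        print(addr)
```

With the fields separated, "1322 Bell Rd" and "1400 Bell Street" land in the
same group but keep different street types, which is exactly the ambiguity a
person then has to resolve.
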
>
>Finally, most commercial geocoding algorithms use distinct fields to obtain
>"near matches" and to give more/less weight to particular address components.
>As a result, you may be forced to split your fields anyway.
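
To make the "near match" and weighting idea concrete, here is a rough
sketch that combines Soundex on the street name with per-component
weights. It is not taken from any commercial geocoder: the WEIGHTS values,
the 0.8 credit for a phonetic-only name match, the 0.5 credit for a missing
component, and the dictionary keys are all invented for illustration. The
soundex function follows the usual American Soundex rules (first letter
plus three digits, with h and w not separating duplicate codes).

```python
from typing import Dict

# Standard Soundex consonant groups, mapped to digits 1 through 6.
CODES: Dict[str, str] = {}
for digit, letters in enumerate(("bfpv", "cgjkqsxz", "dt", "l", "mn", "r"), start=1):
    for ch in letters:
        CODES[ch] = str(digit)


def soundex(word: str) -> str:
    """Return the 4-character American Soundex code, or "" for no letters."""
    letters = [c for c in word.lower() if c.isalpha()]
    if not letters:
        return ""
    out = letters[0].upper()
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        code = CODES.get(ch, "")
        if code and code != prev:
            out += code
        if ch not in "hw":  # h and w do not break a run of equal codes
            prev = code
    return (out + "000")[:4]


# Hypothetical weights: the street name counts most, type and direction less.
WEIGHTS = {"name": 0.6, "street_type": 0.2, "direction": 0.2}


def name_score(a: str, b: str) -> float:
    """1.0 for an exact name match, 0.8 (assumed) for a Soundex-only match."""
    if a.upper() == b.upper():
        return 1.0
    if soundex(a) and soundex(a) == soundex(b):
        return 0.8
    return 0.0


def component_score(a: str, b: str) -> float:
    """1.0 if equal, 0.5 (assumed) if one side is missing, else 0.0."""
    if a == b:
        return 1.0
    if not a or not b:
        return 0.5
    return 0.0


def match_score(candidate: Dict[str, str], reference: Dict[str, str]) -> float:
    """Weighted similarity between two parsed addresses, from 0.0 to 1.0."""
    score = WEIGHTS["name"] * name_score(candidate["name"], reference["name"])
    for field in ("street_type", "direction"):
        score += WEIGHTS[field] * component_score(
            candidate.get(field, ""), reference.get(field, ""))
    return score


if __name__ == "__main__":
    bell_rd = {"name": "BELL", "street_type": "RD", "direction": ""}
    bell_st = {"name": "BELL", "street_type": "ST", "direction": ""}
    incoming = {"name": "BEL", "street_type": "ST", "direction": ""}
    print(match_score(incoming, bell_rd), match_score(incoming, bell_st))
```

A misspelled "Bel St" still scores higher against Bell Street than against
Bell Road, because the name earns phonetic credit against both while the
street type agrees with only one. A record with no type at all scores the
same against both, which flags it for manual review.
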

Part of Geometry in Action, a collection of applications of computational geometry.
David Eppstein, Theory Group, ICS, UC Irvine.