Harnessing the BGN

One source of data that I feel is under-utilized is the feature-name data from the U.S. Board on Geographic Names (BGN), distributed by the USGS.  The BGN files are a (mostly) comprehensive list of official feature names, covering most of the United States.  The features are organized by state, by county, and by feature type, and each one has a latitude and a longitude.

The only drawback to the BGN data is that it comes as pipe-delimited text files (the ‘|’ character is called ‘pipe’).  For example, here are the first four entries from the Hawaii file:

247074|Pacific Ocean|Sea|CA|06|Mendocino|045|391837N|1235041W|39.3102778|-123.8447222|||||0|0|Mendocino|01/19/1981|06/21/2010
358293|Pearl and Hermes Atoll|Island|HI|15|Honolulu|003|275000N|1755000W|27.8333333|-175.8333333|||||0|0|Unknown|09/30/2003|
358294|Laysan Island|Island|HI|15|Honolulu|003|254615N|1714415W|25.7708333|-171.7375|||||3|10|Unknown|09/30/2003|
358295|Barking Sands|Beach|HI|15|Kauai|007|220418N|1594652W|22.0716667|-159.7811111|||||0|0|Kekaha|02/06/1981|

(Yes, ladies and gentlemen, the Pacific Ocean is in Mendocino County, California.)
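
To see how the fields line up, here is a small awk sketch that pulls the feature name, feature type, and decimal coordinates out of each record.  The column positions (2, 3, 10 and 11) are read off the sample entries above, so treat them as an assumption about this particular file layout; the function name bgn_coords is made up for illustration.

```shell
# Hypothetical helper: print name, feature type, latitude and longitude
# (tab-separated) for each pipe-delimited record read from stdin.
# Assumed layout, taken from the sample entries: $2 = name, $3 = type,
# $10 = decimal latitude, $11 = decimal longitude.
bgn_coords() {
    awk 'BEGIN { FS = "|"; OFS = "\t" } { print $2, $3, $10, $11 }'
}

# Example: one record from the sample, written on a single line.
printf '%s\n' '358295|Barking Sands|Beach|HI|15|Kauai|007|220418N|1594652W|22.0716667|-159.7811111|||||0|0|Kekaha|02/06/1981|' | bgn_coords
```

Running it over the whole file is a quick way to check that the coordinates land in the columns you expect before converting anything.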

This format, while containing a lot of great information, is not exactly ArcGIS friendly.  Fortunately, awk makes it really easy to transform this file into a comma-separated file, which can then be manipulated in a spreadsheet.  Since there are 20 columns in each entry (whether the column is populated or not), the awk script will look something like this:

awk 'BEGIN {FS = "|"; OFS = ","} {print $1,$2,$3,$4,$5,$6,$7,$8,$9,$10,$11,$12,$13,$14,$15,$16,$17,$18,$19,$20}' HI_Features_20101203.txt > HIfeatures-awk.txt

That’s all there is to it.  The file can now be loaded into a database and plotted on a map.
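
One caveat with a straight pipe-to-comma swap: if any field happens to contain a comma or a double quote, the columns will shift when the file is opened in a spreadsheet.  The sketch below is one possible way around that, assuming the usual CSV convention of wrapping such fields in double quotes and doubling any embedded quotes.  The function name bgn_to_csv and the output file name in the comment are hypothetical.

```shell
# Hypothetical helper: convert pipe-delimited records on stdin to CSV,
# quoting any field that contains a comma or a double quote.
bgn_to_csv() {
    awk 'BEGIN { FS = "|"; OFS = "," }
    {
        for (i = 1; i <= NF; i++) {
            if ($i ~ /[",]/) {
                gsub(/"/, "\"\"", $i)   # double any embedded quotes
                $i = "\"" $i "\""       # wrap the field in quotes
            }
        }
        $1 = $1                         # force awk to rebuild the record with OFS
        print
    }'
}

# Possible usage (file names are placeholders):
#   bgn_to_csv < HI_Features_20101203.txt > HIfeatures.csv
```

Because it loops over NF rather than naming each column, this version also keeps working if a file has more or fewer than 20 columns.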

Download BGN data here!
