Welcome! This is a project of the World Historical Gazetteer (WHG): an interactive map and reader of A Gazetteer of the World, a seven-volume reference book published by A. Fullarton & Co. in 1856, describing tens of thousands of places across the globe as they were known in the mid-19th century. The title page credited only “a Member of the Royal Geographical Society”; the editor is now identified as George Godfrey Cunningham, a partner in the firm.
A gazetteer is a geographical dictionary: an alphabetical list of places, each with a short description (where it is, what kind of place it is, its population, nearby features, and so on). We have turned the original printed pages into data you can search, map, and read.
Nobody typed this in by hand. Every step is automatic and uses freely available AI models run on our own computers (no per-use fees, nothing sent to outside services):
Each place is also given a feature type (city, river, mountain, ruin…) chosen from the Getty Art & Architecture Thesaurus (AAT), a shared published vocabulary in which every term has a stable web address. Using such shared identifiers instead of free-text labels is what lets this data be linked to other datasets (the idea behind Linked Open Data) rather than sitting in an isolated silo.
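To make the idea concrete, here is a minimal sketch of how a shared identifier lets two datasets line up even when their human-readable labels differ. It is an illustration only, not the project's actual code: the `Place` class, the `link_by_type` helper, the sample records, and the term IDs are all hypothetical placeholders. Only the `http://vocab.getty.edu/aat/` address prefix reflects how Getty publishes AAT terms.

```python
from dataclasses import dataclass

# Getty publishes each AAT term at a stable address under this prefix.
AAT_BASE = "http://vocab.getty.edu/aat/"


def aat_uri(term_id: str) -> str:
    """Build the web address for an AAT term from its ID."""
    return AAT_BASE + term_id


# Placeholder IDs for illustration only; real IDs must be looked up in the AAT.
RIVER = aat_uri("PLACEHOLDER_RIVER_ID")
CITY = aat_uri("PLACEHOLDER_CITY_ID")


@dataclass(frozen=True)
class Place:
    """Hypothetical gazetteer record: a name plus a feature-type URI."""
    name: str
    feature_type_uri: str


def link_by_type(places, records):
    """For each place, list titles from another dataset with the same type URI."""
    by_uri = {}
    for record in records:
        by_uri.setdefault(record["type"], []).append(record["title"])
    return {p.name: by_uri.get(p.feature_type_uri, []) for p in places}


# Invented sample data: the other dataset labels rivers in French,
# but because it uses the same URI the match still works.
gazetteer = [Place("Thames", RIVER), Place("Lyon", CITY)]
other_dataset = [{"title": "Danube", "label": "fleuve", "type": RIVER}]

links = link_by_type(gazetteer, other_dataset)
print(links)
```

Had both datasets used free-text labels ("river" in one, "fleuve" in the other), the match would need translation or guesswork; comparing URIs needs neither.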
All of this is very much a work in progress. The feature typing and fact extraction are only as good as the instructions given to the language model, and can be improved simply by refining that prompt and re-running the extraction.
Over 116,000 places, most linked to a modern location, plus 1,774 statistical tables and 133 historical plates, all in a site that runs entirely in your browser, with no server doing the work as you click.
Please treat this as a research demonstrator, not an authority. Because every step is automatic:
So take everything here as a suggestion to check against the original scanned page (linked from every record), not as fact. Spotted an error? The Report link in any place popup or reader entry takes a few seconds and genuinely helps improve the data.
The original book is public domain (page scans via HathiTrust). The project grew out of Humphrey Southall's offer of his transcripts to the WHG and the discussion that followed, though it uses its own fresh OCR rather than those transcripts. The heavy computation ran on the HTC and H2P clusters at the University of Pittsburgh Center for Research Computing and Data (RRID:SCR_022735), supported by NIH award S10OD028483 and NSF award OAC-2117681. The code and full method are on GitHub.