File data/csv/db_docs.md artifact 558aaf40a1 part of check-in 18e3747a29
Toponymic Database Documentation
Overview
This database is designed to systematically store, analyze and eventually visualize historical and modern toponyms (place names). It takes a structured approach, focusing on spatial, linguistic, and temporal aspects. The database consists of six interconnected tables:
- spatial – geographical point locations and metadata
- water – non-point geographical features like rivers, streams, and lakes
- linguistic – attested toponyms, their variants, and linguistic attributes such as pronunciation and etymology
- temporal – historical changes and usage periods of toponyms
- experiment – a flexible structure for exploring additional data fields related to name changes
- sources – metadata about maps, books, and documents that contain toponym information
1. spatial table
Purpose:
Stores discrete geographic entities, their coordinates, and relevant metadata.
Columns:
ID
(int, PK): Unique identifier for each geographic object. Initially followed an alphabetical order based on the book by Дубанов И. С. «Топонимический словарь Чувашии. Географические названия и термины» (2023_jun_14_dubanov_toponymy), now it is basically free.LAT
,LON
(float, nullable): Estimated geographic coordinates derived from historical and modern maps (see http://www.etomesto.ru/ , https://retromap.ru/ and various other resources).OFFNAME
(string): The most "official" available name for the location (either current or historical).LANG
(string): The language of the name, using codes such asRUS
,CHU
,HIM
,MEM
,ERZ
,TAT
,UNK
.CLASS
(string, nullable): Specifies the toponymic category, e.g., oikonym (settlement), hydronym (water body), dromonym (road), etc.TYPE
(string, nullable): Specifies the sub-category, e.g., village, city (for oikonyms), river, stream (for hydronyms), etc.DISTRICT
(string, nullable): Modern administrative district (e.g., to differentiate places with the same name).DOUBT
(int, nullable): Certainty level of assigned coordinates (empty = "I'm sure", 1 = "The coordinates are doubtful").LANDMARK
(text, nullable): Relevant for microtoponyms; describes nearby visible features to help locate the object.COMMENTS
(text, nullable): Free-text notes by database contributors.OTHER
(text, nullable): Reserved for future use.
2. water table
Purpose:
Stores metadata about rivers, streams, lakes, and other non-point water bodies.
Columns:
Work In Progress
** 3. linguistic table**
Purpose:
Captures attested toponyms, their variations, and relevant linguistic metadata.
Columns:
ID
(int, PK): Unique identifier for each toponym entry, independent from the spatial table.SPATID
(int, FK → spatial.ID): Links the toponym to a geographic location.DOUSPAT
(int, nullable): Certainty of spatial link (empty = certain, 1 = doubtful).MAINID
(int, FK → linguistic.ID): Points to the main name in a cluster of interconnected names (sometimes grouping is necessary, many sources give several somehow connected names for a single spatial object and we want to reflect this connections in database).TOPONYM
(string): The attested name as recorded from sources.TOPFORMS
(text, nullable): Alternative spellings with worker comments (must contain the same language text info as theTOPONYM
field).DOUTOPO
(int, nullable): Certainty of the form inTOPONYM
only (empty = certain, 1 = doubtful).LANG
(string): The language of the toponym, following the predefined codes.DOULANG
(int, nullable): Certainty of the language of the toponym (empty = certain, 1 = doubtful).PRONUNC
(text, nullable): Reserved for pronunciation data (format TBD).DOUPRON
(int, nullable): Certainty of the pronunciation data (empty = certain, 1 = doubtful).ETYM
(text, nullable): Etymological explanation of the name's origin (always treated as doubtful).ORIGIN
(string, nullable): Language from which the name originated, e.g.CHU
,RUS
,TAT
,OTH
,UNK
(also always treated as doubtful).COMMENTS
(text, nullable): Free-text notes.OTHER
(text, nullable): Reserved for future use.
4. temporal table
Purpose:
Tracks historical changes in toponym usage over time.
Columns:
ID
(int, PK): Unique identifier for each historical record.LINGID
,LINGNAME
(Composite FK → linguistic (ID, TOPONYM)): References the linguistic entry.NESTID
(int, FK → temporal.ID): Used for grouping related toponyms (e.g., main village + merged settlements). Typically references the most current Russian toponym.STARTYEAR
(int): Earliest recorded use of the name (the year of first mention according to some source: book or map or something else), full source cited inFULLTEXT
.DOUSTART
(int, nullable): Certainty of the start year.ENDYEAR
(int, nullable): Last known use of the name (if still active, leave empty), full source cited inFULLTEXT
.DOUEND
(int, nullable): Certainty level for theENDYEAR
date (empty = "I'm sure", 1 = "It is doubtful").EVENT
(string): Type of historical event:MERGEIN
: Merged into another objectACTIVE
: Still in useRENAME
: RenamedCEASE
: Ceased to exist.
OBJID
(int, nullable),OBJNAME
(text, nullable): Contextual references:- ACTIVE: Leave both empty
- RENAME: Reference the new name, linguistic (ID, TOPONYM)
- MERGEIN: Reference the absorbing object's name, linguistic (ID, TOPONYM)
- CEASE: Leave both empty.
COMMENTS
(text, nullable): Free-text notes.OTHER
(text, nullable): Reserved.FULLTEXT
(text): A complete textual description of the historical event, providing transparency by explicitly referencing the relevant source(s), for instance, the source.ID (see 6. sources table) along with page numbers or other identifying details when applicable.
5. experiment table
Purpose:
Used for prototyping new fields and storing all details from sources about when a name was used.
Columns:
Work in progress.
6. sources table
Purpose:
Stores metadata for bibliographic or archival sources used in the temporal
table.
Columns:
ID
(string, PK): Unique identifier.TYPE
(string): Source type (BOOK
,ARTICLE
,MAP
,ARCHIVE
,WEB
).TITLE
(text): Short informal title of the source (book name, article title, map title etc).YEAR
(int, nullable): Year of publication or creation.CITATION
(text, nullable): Full bibliographic citation.URL
(text, nullable): Web link to the source (if available).DIGCOPY
(string, nullable): Is a digital copy stored in the project archive? (yes
orno
).COMMENTS
(text, nullable): Additional notes.OTHER
(text, nullable): Reserved.
Usage & Relationship with Temporal Table:
Each record in the temporal table should refer to one or more sources, cited via free text in the FULLTEXT
field.
Relationships and Interactions
SPATID
(linguistic) →ID
(spatial): Links names to placesLINGID
,LINGNAME
(temporal) → linguistic: Tracks historical shifts in usageOBJID
,OBJNAME
(temporal) → linguistic: Tracks renaming and mergers
Next Steps
- Finalize a format for the
PRONUNC
field. - Define conventions for populating
DOUBT
fields.
Commenting Convention
All comment fields should follow a consistent format. A contributor adds their initials (based on surname, name, patronymic), followed by a colon and a space. If building upon another’s comment, separate entries using |
(space-pipe-space).
Example:
SMY: в 27 км от райцентра, в лесах на правом берегу Суры, ныне не существует | IRI: На карте 1985 г. с retromap.ru это кордон Келейный, через реку от урочища Монашеское с более современных карт, примерно в месте впадения р. Чернушки в р. Бездну