File data/archive/db_experiment/v0.2/db_docs.md from the latest check-in
Toponymic Database Documentation
Overview
This database is designed to systematically store, analyze and eventually visualize historical and contemporary toponyms (place names) with a structured approach, focusing on spatial, linguistic, and temporal aspects. The structure consists of three interlinked tables:
- Spatial Table - Stores geographical locations and metadata.
- Linguistic Table - Captures toponyms, their variants, and linguistic attributes such as pronunciation and etymology.
- Temporal Table - Documents historical changes in toponyms over time.
1. Spatial Table
Purpose:
The spatial table contains unique geographic entities, their locations, and relevant metadata.
Columns:
- `ID` (int, PK): Unique identifier for each geographic object. Initially the IDs followed the order of an alphabetically sorted source table, `2023_jun_14_dubanov_toponymy`, based on a book (Дубанов И.С. «Топонимический словарь Чувашии. Географические названия и термины»); the order is now essentially arbitrary.
- `LAT`, `LON` (float): Estimated geographic coordinates derived from historical and modern maps (see http://www.etomesto.ru/, https://retromap.ru/ and various other resources).
- `OFFNAME` (string): The most "official" available name for the location (either current or historical).
- `LANG` (string): The language of the name, using codes such as `RUS`, `CHU`, `HIM`, `MEM`, `ERZ`, `TAT`, `UNK`.
- `CLASS` (string): Specifies the toponymic category, e.g., oikonym (settlement), hydronym (water body), dromonym (road), etc.
- `TYPE` (string): Specifies the sub-category, e.g., village, city (for oikonyms), river, stream (for hydronyms), etc.
- `DISTRICT` (string, nullable): Modern administrative district (e.g., to differentiate places with the same name).
- `DOUBT` (int, nullable): Certainty level of the assigned coordinates (empty = "I'm sure", 1 = "The coordinates are doubtful").
- `LANDMARK` (string, nullable): Relevant for microtoponyms; describes nearby visible features to help locate the object.
- `COMMENTS` (text, nullable): Free-text notes for human workers.
- `OTHER` (text, nullable): Reserved for future use.
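The column list above can be sketched as an SQLite schema. This is only an illustration under stated assumptions: the document does not name a database engine, so SQLite, the table name `spatial`, the type mapping (int to `INTEGER`, float to `REAL`, string/text to `TEXT`), and the `CHECK` on `DOUBT` are choices made here, and the two sample rows are invented placeholders rather than real data.

```python
import sqlite3

# Assumed mapping of the documented columns onto SQLite types.
# Columns not marked "nullable" in the documentation are declared NOT NULL.
SPATIAL_DDL = """
CREATE TABLE spatial (
    ID       INTEGER PRIMARY KEY,
    LAT      REAL NOT NULL,
    LON      REAL NOT NULL,
    OFFNAME  TEXT NOT NULL,
    LANG     TEXT NOT NULL,
    CLASS    TEXT NOT NULL,
    TYPE     TEXT NOT NULL,
    DISTRICT TEXT,
    DOUBT    INTEGER CHECK (DOUBT IS NULL OR DOUBT = 1),  -- empty = sure, 1 = doubtful
    LANDMARK TEXT,
    COMMENTS TEXT,
    OTHER    TEXT
)
"""


def doubtful_objects(conn):
    """Return (ID, OFFNAME) for every object whose coordinates are flagged as doubtful."""
    rows = conn.execute("SELECT ID, OFFNAME FROM spatial WHERE DOUBT = 1 ORDER BY ID")
    return rows.fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(SPATIAL_DDL)

# Invented placeholder rows, for illustration only.
conn.executemany(
    "INSERT INTO spatial (ID, LAT, LON, OFFNAME, LANG, CLASS, TYPE, DOUBT) "
    "VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
    [
        (1, 56.0, 47.0, "Example Village", "RUS", "oikonym", "village", None),
        (2, 55.5, 46.5, "Example Stream", "UNK", "hydronym", "stream", 1),
    ],
)

print(doubtful_objects(conn))  # [(2, 'Example Stream')]
```

Keeping `DOUBT` as "empty or 1" mirrors the convention described above; a stricter or richer certainty scale would need a different constraint.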
2. Linguistic Table
Purpose:
The linguistic table stores different names (and their variants) of the same geographic object across different languages and sources, etymological information and other linguistic metadata.
Columns:
- `ID` (int, PK): Unique identifier for each toponym entry, independent of the spatial table.
- `SPATID` (int, FK → Spatial.ID): Links the toponym to its corresponding spatial object.
- `DOUSPAT` (int, nullable): Certainty of the name's relation to the spatial object (empty = "I'm sure", 1 = "The spatial link is doubtful").
- `MAINID` (int, FK → Linguistic.ID): Points to the "main" name in a "nest" of interconnected toponyms. Grouping is sometimes necessary because many sources give several related names for a single spatial object, and we want to reflect these connections in the database.
- `TOPONYM` (string): The attested name as recorded from sources.
- `TOPFORMS` (text, nullable): Holds different attested spelling variations of the name from multiple sources.
- `DOUTOPO` (int, nullable): Certainty of the given `TOPONYM` form (empty = "I'm sure", 1 = "It is doubtful").
- `LANG` (string): The language of the toponym, following the predefined codes.
- `DOULANG` (int, nullable): Certainty of the language of the toponym (empty = "I'm sure", 1 = "It is doubtful").
- `PRONUNC` (text, nullable): Reserved for pronunciation information (format yet to be decided).
- `DOUPRON` (int, nullable): Certainty of the pronunciation information (empty = "I'm sure", 1 = "It is doubtful").
- `ETYM` (text, nullable): Etymological explanation of the name's origin. It is always treated as doubtful.
- `ORIGIN` (string, nullable): Language from which the name originated (e.g., `CHU`, `RUS`, `TAT`, `OTH`, `UNK`). This is always treated as doubtful too.
- `COMMENTS` (text, nullable): Free-text notes.
- `OTHER` (text, nullable): Reserved for future use.
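The "nest" grouping through `MAINID` may be easier to follow with a small query. The sketch below assumes SQLite and a trimmed-down `linguistic` table holding only the columns needed here. It also assumes that the main name of a nest points to itself (`MAINID = ID`), which the documentation does not state explicitly. The helper `nest` and all sample names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Trimmed sketch of the linguistic table: only the columns used for nest grouping.
# Foreign-key enforcement is left at SQLite's default (off) to keep the example small.
conn.execute("""
CREATE TABLE linguistic (
    ID      INTEGER PRIMARY KEY,
    SPATID  INTEGER NOT NULL,
    MAINID  INTEGER NOT NULL REFERENCES linguistic (ID),
    TOPONYM TEXT NOT NULL,
    LANG    TEXT NOT NULL
)
""")

# Invented placeholder rows: names 10 and 11 form one nest, name 12 stands alone.
# Assumption: the main name of a nest references itself through MAINID.
conn.executemany(
    "INSERT INTO linguistic (ID, SPATID, MAINID, TOPONYM, LANG) VALUES (?, ?, ?, ?, ?)",
    [
        (10, 1, 10, "Name A", "RUS"),
        (11, 1, 10, "Name B", "CHU"),
        (12, 1, 12, "Name C", "TAT"),
    ],
)


def nest(conn, main_id):
    """Return all names grouped under one main name, with the main name listed first."""
    rows = conn.execute(
        "SELECT ID, TOPONYM, LANG FROM linguistic "
        "WHERE MAINID = ? ORDER BY (ID = MAINID) DESC, ID",
        (main_id,),
    )
    return rows.fetchall()


print(nest(conn, 10))  # [(10, 'Name A', 'RUS'), (11, 'Name B', 'CHU')]
```

If main names were instead stored with an empty `MAINID`, the `WHERE` clause would need an extra `OR ID = ?` branch.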
3. Temporal Table
Purpose:
This table records historical changes in toponyms, tracking their usage over time.
Columns:
- `ID` (int, PK): Unique identifier for each temporal record.
- `LINGID`, `LINGNAME` (composite FK → Linguistic (ID, TOPONYM)): Together, these fields reference a specific toponym in the linguistic table.
- `START` (int): Earliest recorded use of the name (the source text goes into the `FULLTEXT` column).
- `DOUSTART` (int): Certainty level for the `START` date (empty = "I'm sure", 1 = "It is doubtful").
- `END` (int, nullable): The last recorded use of the name. If the object still exists, this field remains empty. The source text is stored in the `FULLTEXT` column.
- `DOUEND` (int, nullable): Certainty level for the `END` date (empty = "I'm sure", 1 = "It is doubtful").
- `EVENT` (string): Categorization of the historical change, with predefined values:
  - `MERGEIN`: The object was absorbed into another.
  - `ACTIVE`: The name is still in use.
  - `RENAME`: The object was renamed.
  - `CEASE`: The object ceased to exist.
- `OBJID` (int, nullable), `OBJNAME` (text, nullable): Context-dependent, according to `EVENT`:
  - `ACTIVE`: Both empty.
  - `RENAME`: Store the new name of the object as a reference to Linguistic (ID, TOPONYM).
  - `MERGEIN`: Store the absorbing object, again as a reference to Linguistic (ID, TOPONYM).
  - `CEASE`: Both empty.
- `COMMENTS` (text, nullable): Free-text notes.
- `OTHER` (text, nullable): Reserved for future use.
- `FULLTEXT` (text): Full textual information about the historical event, preserving source transparency.
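The composite foreign keys and the `EVENT`-dependent rule for `OBJID`/`OBJNAME` could be enforced by the database itself. The following is a minimal sketch assuming SQLite; the `CHECK` constraint, the helper `add_event`, the reading of `START` as a year, and all sample values are assumptions made for illustration, and only a subset of the documented columns is included. `END` is a reserved word in SQLite, so both date columns are quoted. SQLite also requires an explicit `UNIQUE (ID, TOPONYM)` on the parent table before a composite foreign key can reference that pair.

```python
import sqlite3

SCHEMA = """
CREATE TABLE linguistic (
    ID      INTEGER PRIMARY KEY,
    TOPONYM TEXT NOT NULL,
    UNIQUE (ID, TOPONYM)          -- needed so (ID, TOPONYM) can be a composite FK target
);
CREATE TABLE temporal (
    ID       INTEGER PRIMARY KEY,
    LINGID   INTEGER NOT NULL,
    LINGNAME TEXT NOT NULL,
    "START"  INTEGER NOT NULL,
    "END"    INTEGER,
    EVENT    TEXT NOT NULL,
    OBJID    INTEGER,
    OBJNAME  TEXT,
    FULLTEXT TEXT NOT NULL,
    FOREIGN KEY (LINGID, LINGNAME) REFERENCES linguistic (ID, TOPONYM),
    FOREIGN KEY (OBJID, OBJNAME)   REFERENCES linguistic (ID, TOPONYM),
    -- ACTIVE/CEASE leave the object reference empty; RENAME/MERGEIN require it.
    CHECK (
        (EVENT IN ('ACTIVE', 'CEASE') AND OBJID IS NULL AND OBJNAME IS NULL)
        OR (EVENT IN ('RENAME', 'MERGEIN') AND OBJID IS NOT NULL AND OBJNAME IS NOT NULL)
    )
);
"""

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
conn.executescript(SCHEMA)
conn.executemany(
    "INSERT INTO linguistic (ID, TOPONYM) VALUES (?, ?)",
    [(1, "Old Name"), (2, "New Name")],  # invented placeholder names
)


def add_event(conn, row):
    """Insert one temporal row; return False if a constraint rejects it."""
    try:
        conn.execute(
            'INSERT INTO temporal (LINGID, LINGNAME, "START", EVENT, OBJID, OBJNAME, FULLTEXT) '
            "VALUES (?, ?, ?, ?, ?, ?, ?)",
            row,
        )
        return True
    except sqlite3.IntegrityError:
        return False


results = [
    # Valid rename: both references resolve to existing (ID, TOPONYM) pairs.
    add_event(conn, (1, "Old Name", 1900, "RENAME", 2, "New Name", "placeholder source text")),
    # Rejected by the CHECK: ACTIVE must leave OBJID/OBJNAME empty.
    add_event(conn, (2, "New Name", 1900, "ACTIVE", 1, "Old Name", "placeholder source text")),
    # Rejected by the composite FK: (1, 'Wrong Name') does not exist in linguistic.
    add_event(conn, (1, "Wrong Name", 1900, "CEASE", None, None, "placeholder source text")),
]
print(results)  # [True, False, False]
```

Because any `EVENT` value outside the four predefined ones fails both branches of the `CHECK`, the same constraint also restricts `EVENT` to the documented vocabulary.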
Relationships and Interactions
- `SPATID` (Linguistic) → `ID` (Spatial): Links toponyms to geographic locations.
- `LINGID` (Temporal) → `ID` (Linguistic) and `LINGNAME` (Temporal) → `TOPONYM` (Linguistic): Tracks historical changes of a specific name.
- `OBJID` (Temporal) → `ID` (Linguistic) and `OBJNAME` (Temporal) → `TOPONYM` (Linguistic): Records name changes and mergers specifically.
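To show how these links combine, the sketch below joins the three tables to list the name history of one spatial object. It assumes SQLite, trimmed-down versions of the tables containing only the columns used in the join, and invented placeholder rows; the helper `name_history` is hypothetical and the `START`/`END` values are read as years.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Trimmed tables with invented placeholder rows, for illustration only.
conn.executescript("""
CREATE TABLE spatial (ID INTEGER PRIMARY KEY, OFFNAME TEXT, LAT REAL, LON REAL);
CREATE TABLE linguistic (ID INTEGER PRIMARY KEY, SPATID INTEGER, TOPONYM TEXT, LANG TEXT);
CREATE TABLE temporal (
    ID INTEGER PRIMARY KEY, LINGID INTEGER, LINGNAME TEXT,
    "START" INTEGER, "END" INTEGER, EVENT TEXT
);

INSERT INTO spatial VALUES (1, 'Place B', 56.0, 47.0);
INSERT INTO linguistic VALUES (1, 1, 'Place A', 'CHU');
INSERT INTO linguistic VALUES (2, 1, 'Place B', 'RUS');
INSERT INTO temporal VALUES (1, 1, 'Place A', 1800, 1900, 'RENAME');
INSERT INTO temporal VALUES (2, 2, 'Place B', 1900, NULL, 'ACTIVE');
""")


def name_history(conn, spatial_id):
    """List (TOPONYM, LANG, START, END, EVENT) for one spatial object, oldest first."""
    rows = conn.execute(
        """
        SELECT l.TOPONYM, l.LANG, t."START", t."END", t.EVENT
        FROM spatial AS s
        JOIN linguistic AS l ON l.SPATID = s.ID
        JOIN temporal AS t ON t.LINGID = l.ID AND t.LINGNAME = l.TOPONYM
        WHERE s.ID = ?
        ORDER BY t."START"
        """,
        (spatial_id,),
    )
    return rows.fetchall()


for record in name_history(conn, 1):
    print(record)
# ('Place A', 'CHU', 1800, 1900, 'RENAME')
# ('Place B', 'RUS', 1900, None, 'ACTIVE')
```

The second join condition uses both `LINGID` and `LINGNAME`, matching the composite reference described above.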
Next Steps
- Finalize the format for the `PRONUNC` field.
- Make a separate table for the list of written sources.