Artifact 1504b94df9f16f59f84b9111107c6eb4998dea5a415631f7f1f100ac2d61d78d:
- File data/archive/db_experiment/v0.2/db_docs.md, part of check-in [0eb324393e] at 2025-04-24 17:10:24 on branch trunk: archive 'db_experiment' directory and english language database documentation (user: sisrahtak, size: 6612)
- File data/csv/db_experiment/v0.2/db_docs.md, part of check-in [fbda0e251e] at 2025-02-11 13:17:10 on branch trunk: 2025 Feb 11 transfer TOPFORMS column to linguistic table (user: sisrahtak, size: 6612)
Toponymic Database Documentation
Overview
This database is designed to systematically store, analyze, and eventually visualize historical and contemporary toponyms (place names), focusing on their spatial, linguistic, and temporal aspects. It consists of three interlinked tables:
- Spatial Table - Stores geographical locations and metadata.
- Linguistic Table - Captures toponyms, their variants, and linguistic attributes such as pronunciation and etymology.
- Temporal Table - Documents historical changes in toponyms over time.
1. Spatial Table
Purpose:
The spatial table contains unique geographic entities, their locations, and relevant metadata.
Columns:
- ID(int, PK): Unique identifier for each geographic object. Initially the IDs followed the order of an alphabetically sorted source table, 2023_jun_14_dubanov_toponymy, which is based on the book Дубанов И.С. «Топонимический словарь Чувашии. Географические названия и термины». New IDs are now assigned in no particular order.
- LAT, LON(float): Estimated geographic coordinates derived from historical and modern maps (see http://www.etomesto.ru/, https://retromap.ru/, and other resources).
- OFFNAME(string): The most "official" available name for the location (either current or historical).
- LANG(string): The language of the name, using codes such as RUS, CHU, HIM, MEM, ERZ, TAT, UNK.
- CLASS(string): Specifies the toponymic category, e.g., oikonym (settlement), hydronym (water body), dromonym (road), etc.
- TYPE(string): Specifies the sub-category, e.g., village, city (for oikonyms), river, stream (for hydronyms), etc.
- DISTRICT(string, nullable): Modern administrative district (e.g., to differentiate places with the same name).
- DOUBT(int, nullable): Certainty level of assigned coordinates (empty = "I'm sure", 1 = "The coordinates are doubtful").
- LANDMARK(string, nullable): Relevant for microtoponyms; describes nearby visible features to help locate the object.
- COMMENTS(text, nullable): Free-text notes for human workers.
- OTHER(text, nullable): Reserved for future use.
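The column list above could be rendered as a relational table roughly as follows. This is only a sketch: it assumes SQLite via Python's standard sqlite3 module, and the SQL types, the NOT NULL constraints, and the CHECK on DOUBT are assumptions inferred from the descriptions, not part of the documented design. The inserted row is invented for illustration.

```python
import sqlite3

# Hypothetical SQLite rendering of the spatial table. Column names follow the
# documentation; types and constraints are assumptions.
SPATIAL_DDL = """
CREATE TABLE spatial (
    ID       INTEGER PRIMARY KEY,
    LAT      REAL NOT NULL,
    LON      REAL NOT NULL,
    OFFNAME  TEXT NOT NULL,
    LANG     TEXT NOT NULL,
    CLASS    TEXT NOT NULL,
    TYPE     TEXT NOT NULL,
    DISTRICT TEXT,
    DOUBT    INTEGER CHECK (DOUBT IS NULL OR DOUBT = 1),  -- empty = sure, 1 = doubtful
    LANDMARK TEXT,
    COMMENTS TEXT,
    OTHER    TEXT
)
"""

conn = sqlite3.connect(":memory:")
conn.execute(SPATIAL_DDL)

# Invented example row, not taken from the real data.
conn.execute(
    "INSERT INTO spatial (ID, LAT, LON, OFFNAME, LANG, CLASS, TYPE, DOUBT) "
    "VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
    (1, 56.0, 47.0, "Example village", "RUS", "oikonym", "village", None),
)

# Count the objects whose coordinates are flagged as doubtful.
doubtful = conn.execute("SELECT COUNT(*) FROM spatial WHERE DOUBT = 1").fetchone()[0]
print(doubtful)
```

Storing "sure" as NULL rather than 0 mirrors the "empty" convention described for DOUBT, so a CSV export with blank cells maps onto the table directly.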
2. Linguistic Table
Purpose:
The linguistic table stores different names (and their variants) of the same geographic object across different languages and sources, etymological information and other linguistic metadata.
Columns:
- ID(int, PK): Unique identifier for each toponym entry, independent from the spatial table.
- SPATID(int, FK → Spatial.ID): Links the toponym to its corresponding spatial object.
- DOUSPAT(int, nullable): Certainty of name relation to the spatial object (empty = "I'm sure", 1 = "The spatial link is doubtful").
- MAINID(int, FK → Linguistic.ID): Points to the "main" name in a "nest" of interconnected toponyms. Grouping is sometimes necessary: many sources give several related names for a single spatial object, and the database should reflect these connections.
- TOPONYM(string): The attested name as recorded from sources.
- TOPFORMS(text, nullable): Holds different attested spelling variations of the name from multiple sources.
- DOUTOPO(int, nullable): Certainty of the given TOPONYM form (empty = "I'm sure", 1 = "It is doubtful").
- LANG(string): The language of the toponym, following the predefined codes.
- DOULANG(int, nullable): Certainty of the language of the toponym (empty = "I'm sure", 1 = "It is doubtful").
- PRONUNC(text, nullable): Reserved for pronunciation information (format yet to be decided).
- DOUPRON(int, nullable): Certainty of the pronunciation information (empty = "I'm sure", 1 = "It is doubtful").
- ETYM(text, nullable): Etymological explanation of the name's origin. This information is always treated as doubtful.
- ORIGIN(string, nullable): Language from which the name originated (e.g., CHU, RUS, TAT, OTH, UNK). This value is always treated as doubtful as well.
- COMMENTS(text, nullable): Free-text notes.
- OTHER(text, nullable): Reserved for future use.
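To illustrate how SPATID and MAINID tie names together, here is a minimal sketch using Python's sqlite3 module. It keeps only a few of the columns listed above, and it assumes that the main name of a nest points to itself through MAINID; the documentation does not state this, so treat it as one possible convention. The helper nest() and all names in the rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
conn.executescript("""
CREATE TABLE spatial (ID INTEGER PRIMARY KEY, OFFNAME TEXT NOT NULL);
CREATE TABLE linguistic (
    ID      INTEGER PRIMARY KEY,
    SPATID  INTEGER NOT NULL REFERENCES spatial(ID),
    MAINID  INTEGER REFERENCES linguistic(ID),
    TOPONYM TEXT NOT NULL,
    LANG    TEXT NOT NULL
);
""")

# Invented rows: one spatial object with three names, two of them in one nest.
conn.execute("INSERT INTO spatial VALUES (1, 'Name A')")
conn.executemany(
    "INSERT INTO linguistic (ID, SPATID, MAINID, TOPONYM, LANG) VALUES (?, ?, ?, ?, ?)",
    [
        (10, 1, 10, "Name A", "RUS"),  # main name, assumed to reference itself
        (11, 1, 10, "Name B", "CHU"),  # belongs to the nest of entry 10
        (12, 1, 12, "Name C", "TAT"),  # a separate nest for the same object
    ],
)

def nest(conn, main_id):
    """Return every toponym whose MAINID points at main_id, ordered by ID."""
    rows = conn.execute(
        "SELECT TOPONYM FROM linguistic WHERE MAINID = ? ORDER BY ID", (main_id,)
    )
    return [name for (name,) in rows]

print(nest(conn, 10))
```

With foreign keys switched on, a linguistic row whose SPATID has no matching spatial ID is rejected at insert time, which is one way to keep the two tables consistent.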
3. Temporal Table
Purpose:
This table records historical changes in toponyms, tracking their usage over time.
Columns:
- ID(int, PK): Unique identifier for each temporal record.
- LINGID, LINGNAME(Composite FK → Linguistic (ID, TOPONYM)): Together, these fields reference a specific toponym in the linguistic table.
- START(int): Earliest recorded use of the name (source text goes to the FULLTEXT column).
- DOUSTART(int): Certainty level for the START date (empty = "I'm sure", 1 = "It is doubtful").
- END(int, nullable): The last recorded use of the name. If the object still exists, this field remains empty. The source text is stored in the FULLTEXT column.
- DOUEND(int, nullable): Certainty level for the END date (empty = "I'm sure", 1 = "It is doubtful").
- EVENT(string): Categorization of the historical change, with predefined values:
  - MERGEIN: The object was absorbed into another.
  - ACTIVE: The name is still in use.
  - RENAME: The object was renamed.
  - CEASE: The object ceased to exist.
- OBJID(int, nullable), OBJNAME(text, nullable): Context-dependent; the meaning is determined by EVENT:
  - ACTIVE: Both empty.
  - RENAME: Identifies the new name of the object, as a reference to Linguistic (ID, TOPONYM).
  - MERGEIN: Identifies the absorbing object, again as a reference to Linguistic (ID, TOPONYM).
  - CEASE: Both empty.
- COMMENTS(text, nullable): Free-text notes.
- OTHER(text, nullable): Reserved for future use.
- FULLTEXT(text): Full textual information about the historical event, preserving source transparency.
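The dependency between EVENT and the OBJID/OBJNAME pair can be checked mechanically. The function below is a small sketch of such a check in plain Python; check_temporal_row is a hypothetical helper, and treating an empty string the same as a missing value is an assumption made for data loaded from CSV.

```python
from typing import List, Optional

# Event codes as listed in the documentation.
EVENTS_WITH_TARGET = {"RENAME", "MERGEIN"}    # OBJID/OBJNAME must be filled
EVENTS_WITHOUT_TARGET = {"ACTIVE", "CEASE"}   # OBJID/OBJNAME must be empty

def check_temporal_row(
    event: str, objid: Optional[int], objname: Optional[str]
) -> List[str]:
    """Return a list of problems; an empty list means the row is consistent."""
    problems = []
    if event in EVENTS_WITH_TARGET:
        if objid is None or not objname:
            problems.append(f"{event} requires both OBJID and OBJNAME")
    elif event in EVENTS_WITHOUT_TARGET:
        if objid is not None or objname:
            problems.append(f"{event} requires OBJID and OBJNAME to be empty")
    else:
        problems.append(f"unknown EVENT value: {event!r}")
    return problems

print(check_temporal_row("RENAME", 11, "New Name"))  # consistent row
print(check_temporal_row("CEASE", 11, "New Name"))   # target given where none is allowed
```

Returning a list of messages instead of raising lets a caller run the check over every row and report all inconsistencies at once.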
Relationships and Interactions
- SPATID(Linguistic) → ID(Spatial): Links toponyms to geographic locations.
- LINGID(Temporal) → ID(Linguistic) and LINGNAME(Temporal) → TOPONYM(Linguistic): Tracks historical changes of a specific name.
- OBJID(Temporal) → ID(Linguistic) and OBJNAME(Temporal) → TOPONYM(Linguistic): Identifies the new name after a rename or the absorbing object after a merger.
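The relationships above can be exercised with a join across all three tables. The following is a sketch under several assumptions: SQLite through Python's sqlite3 module, reduced column sets, a UNIQUE (ID, TOPONYM) constraint on the linguistic table (SQLite needs one for a composite foreign key to target that pair), and invented rows. START is quoted only as a precaution against clashes with SQL keywords.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enable FK enforcement in SQLite
conn.executescript("""
CREATE TABLE spatial (
    ID INTEGER PRIMARY KEY, LAT REAL, LON REAL, OFFNAME TEXT
);
CREATE TABLE linguistic (
    ID      INTEGER PRIMARY KEY,
    SPATID  INTEGER NOT NULL REFERENCES spatial(ID),
    TOPONYM TEXT NOT NULL,
    UNIQUE (ID, TOPONYM)              -- target of the composite foreign keys
);
CREATE TABLE temporal (
    ID       INTEGER PRIMARY KEY,
    LINGID   INTEGER NOT NULL,
    LINGNAME TEXT NOT NULL,
    "START"  INTEGER NOT NULL,
    EVENT    TEXT NOT NULL,
    OBJID    INTEGER,
    OBJNAME  TEXT,
    FOREIGN KEY (LINGID, LINGNAME) REFERENCES linguistic(ID, TOPONYM),
    FOREIGN KEY (OBJID, OBJNAME)   REFERENCES linguistic(ID, TOPONYM)
);
-- Invented data: one place renamed from 'Old Name' to 'New Name'.
INSERT INTO spatial VALUES (1, 56.0, 47.0, 'New Name');
INSERT INTO linguistic VALUES (10, 1, 'Old Name'), (11, 1, 'New Name');
INSERT INTO temporal VALUES (100, 10, 'Old Name', 1850, 'RENAME', 11, 'New Name');
INSERT INTO temporal VALUES (101, 11, 'New Name', 1900, 'ACTIVE', NULL, NULL);
""")

# Name history with coordinates: temporal -> linguistic -> spatial.
rows = conn.execute("""
    SELECT t."START", t.EVENT, l.TOPONYM, t.OBJNAME, s.LAT, s.LON
    FROM temporal AS t
    JOIN linguistic AS l ON l.ID = t.LINGID AND l.TOPONYM = t.LINGNAME
    JOIN spatial AS s ON s.ID = l.SPATID
    ORDER BY t."START"
""").fetchall()

for row in rows:
    print(row)
```

Because OBJID and OBJNAME are NULL for ACTIVE and CEASE events, the second foreign key is simply not checked for those rows, which matches the "both empty" rule.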
Next Steps
- Finalize the format for the PRONUNC field.
- Create a separate table for the list of written sources.