Chuvash Toponymy Database v.0.1: Artifact [8f27e620b9]

Artifact 8f27e620b93127862fdc97bc09610d7a2740d3210ed0ff8c2a5a026e38311c9b:

File data/csv/db_experiment/v0.2/db_docs.md — part of check-in [9d5675a10d] at 2025-02-10 17:32:43 on branch trunk — 2025 Feb 10 add database documentation for v0.2 (user: sisrahtak, size: 6558) [annotate] [blame] [check-ins using]

Toponymic Database Documentation

Overview

This database is designed to systematically store, analyze and eventually visualize historical and contemporary toponyms (place names) with a structured approach, focusing on spatial, linguistic, and temporal aspects. The structure consists of three interlinked tables:

Spatial Table - Stores geographical locations and metadata.
Linguistic Table - Captures toponyms, their various variants and linguistic attributes like pronunciation and etymology.
Temporal Table - Documents historical changes in toponyms over time.

1. Spatial Table

Purpose:

The spatial table contains unique geographic entities, their locations, and relevant metadata.

Columns:

ID (int, PK): Unique identifier for each geographic object. In the very beginning the order followed an original alphabetically sorted table, 2023_jun_14_dubanov_toponymy, based on a book (Дубанов И.С. «Топонимический словарь Чувашии. Географические названия и термины») order, now it is basically free.
LAT, LON (float): Estimated geographic coordinates derived from historical and modern maps (see http://www.etomesto.ru/ , https://retromap.ru/ and various other resources).
OFFNAME (string): The most "official" available name for the location (either current or historical).
LANG (string): The language of the name, using codes such as RUS, CHU, HIM, MEM, ERZ, TAT, UNK.
CLASS (string): Specifies the toponymic category, e.g., oikonym (settlement), hydronym (water body), dromonym (road), etc.
TYPE (string): Specifies the sub-category, e.g., village, city (for oikonyms), river, stream (for hydronyms), etc.
DISTRICT (string, nullable): Modern administrative district (e.g., to differentiate places with the same name).
DOUBT (int, nullable): Certainty level of assigned coordinates (empty = "I'm sure", 1 = "The coordinates are doubtful").
LANDMARK (string, nullable): Relevant for microtoponyms; describes nearby visible features to help locate the object.
COMMENTS (text, nullable): Free-text notes for human workers.
OTHER (text, nullable): Reserved for future use.

2. Linguistic Table

Purpose:

The linguistic table stores different names (and their variants) of the same geographic object across different languages and sources, etymological information and other linguistic metadata.

Columns:

ID (int, PK): Unique identifier for each toponym entry, independent from the spatial table.
SPATID (int, FK → Spatial.ID): Links the toponym to its corresponding spatial object.
DOUSPAT (int, nullable): Certainty of name relation to the spatial object (empty = "I'm sure", 1 = "The spatial link is doubtful").
MAINID (int, FK → Linguistic.ID): Represents the "main" name in a "nest" of interconnected toponyms (sometimes grouping is necessary, many sources give several somehow connected names for a single spatial object and we want to reflect this connections in database).
TOPONYM (string): The attested name as recorded from sources.
DOUTOPO (int, nullable): Certainty of the given orthographic form (empty = "I'm sure", 1 = "It is doubtful").
LANG (string): The language of the toponym, following the predefined codes.
DOULANG (int, nullable): Certainty of the language of the toponym (empty = "I'm sure", 1 = "It is doubtful").
PRONUNC (text, nullable): Reserved for pronunciation information (format yet to be decided).
DOUPRON (int, nullable): Certainty of the pronunciation information (empty = "I'm sure", 1 = "It is doubtful").
ETYM (text, nullable): Etymological explanation of the name's origin. It is always doubtful.
ORIGIN (string, nullable): Language from which the name originated (e.g., CHU, RUS, TAT, OTH, UNK). This one is always doubtful too.
COMMENTS (text, nullable): Free-text notes.
OTHER (text, nullable): Reserved for future use.

3. Temporal Table

Purpose:

This table records historical changes in toponyms, tracking their usage over time.

Columns:

ID (int, PK): Unique identifier for each temporal record.
LINGID, LINGNAME (Composite FK → Linguistic (ID, TOPONYM)): Together, these fields reference a specific toponym in the linguistic table.
TOPFORMS (text, nullable): Holds different attested spelling variations of the name from multiple sources.
START (int): Earliest recorded use of the name (source text goes to FULLTEXT column).
DOUSTART (int): Certainty level for the START date (empty = "I'm sure", 1 = "It is doubtful").
END (int, nullable): The last recorded use of the name. If the object still exists, this field remains empty. The source text is stored in the FULLTEXT column.
DOUEND (int, nullable): Certainty level for the END date (empty = "I'm sure", 1 = "It is doubtful").
EVENT (string): Categorization of historical change, with predefined values:
- MERGEIN: The object was absorbed into another.
- ACTIVE: The name is still in use.
- RENAME: The object was renamed.
- CEASE: The object ceased to exist.
OBJID (int, nullable), OBJNAME (text, nullable): Context-dependent:
- ACTIVE: Both empty.
- RENAME: Stores the information about the new name of the object, it is a reference to Linguistic (ID, TOPONYM).
- MERGEIN: Stores the information about the absorbing object, again, it is a reference to Linguistic (ID, TOPONYM).
- CEASE: Both empty.
COMMENTS (text, nullable): Free-text notes.
OTHER (text, nullable): Reserved for future use.
FULLTEXT (text): Full textual information about the historical event, preserving source transparency.

Relationships and Interactions

SPATID (Linguistic) → ID (Spatial): Links toponyms to geographic locations.
LINGID (Temporal) → ID (Linguistic) and LINGNAME (Temporal) → TOPONYM (Linguistic): Tracks historical changes of a specific name.
OBJID (Temporal) → ID (Linguistic) and OBJNAME (Temporal) → TOPONYM (Linguistic): Records specifically name changes or mergers.

Next Steps

Finalize the format for the PRONUNC field.