ADDED   data/csv/db_docs.md
Index: data/csv/db_docs.md
==================================================================
--- /dev/null
+++ data/csv/db_docs.md
@@ -0,0 +1,148 @@
+**Toponymic Database Documentation**
+
+## Overview
+This database is designed to systematically store, analyze and eventually visualize historical and modern toponyms (place names). It takes a structured approach, focusing on spatial, linguistic, and temporal aspects. The database consists of six interconnected tables:
+
+1. **spatial** – geographical point locations and metadata
+2. **water** – non-point geographical features like rivers, streams, and lakes
+3. **linguistic** – attested toponyms, their variants, and linguistic attributes such as pronunciation and etymology
+4. **temporal** – historical changes and usage periods of toponyms
+5. **experiment** – a flexible structure for exploring additional data fields related to name changes
+6. **sources** – metadata about maps, books, and documents that contain toponym information
+
+---
+
+## **1. spatial table**
+
+### **Purpose:**
+Stores discrete geographic entities, their coordinates, and relevant metadata.
+
+### **Columns:**
+- **`ID`** *(int, PK)*: Unique identifier for each geographic object. Initially followed an alphabetical order based on the book by Дубанов И. С. «Топонимический словарь Чувашии. Географические названия и термины» (2023_jun_14_dubanov_toponymy), now it is basically free.
+- **`LAT`, `LON`** *(float, nullable)*: Estimated geographic coordinates derived from historical and modern maps (see http://www.etomesto.ru/ , https://retromap.ru/ and various other resources).
+- **`OFFNAME`** *(string)*: The most "official" available name for the location (either current or historical).
+- **`LANG`** *(string)*: The language of the name, using codes such as `RUS`, `CHU`, `HIM`, `MEM`, `ERZ`, `TAT`, `UNK`.
+- **`CLASS`** *(string, nullable)*: Specifies the toponymic category, e.g., oikonym (settlement), hydronym (water body), dromonym (road), etc.
+- **`TYPE`** *(string, nullable)*: Specifies the sub-category, e.g., village, city (for oikonyms), river, stream (for hydronyms), etc.
+- **`DISTRICT`** *(string, nullable)*: Modern administrative district (e.g., to differentiate places with the same name).
+- **`DOUBT`** *(int, nullable)*: Certainty level of assigned coordinates (empty = "I'm sure", 1 = "The coordinates are doubtful").
+- **`LANDMARK`** *(text, nullable)*: Relevant for microtoponyms; describes nearby visible features to help locate the object.
+- **`COMMENTS`** *(text, nullable)*: Free-text notes by database contributors.
+- **`OTHER`** *(text, nullable)*: Reserved for future use.
+
+---
+
+## **2. water table**
+
+### **Purpose:**
+Stores metadata about rivers, streams, lakes, and other non-point water bodies.
+
+### **Columns:**
+Work In Progress
+
+---
+
+## ** 3. linguistic table**
+
+### **Purpose:**
+Captures attested toponyms, their variations, and relevant linguistic metadata.
+
+### **Columns:**
+- **`ID`** *(int, PK)*: Unique identifier for each toponym entry, independent from the spatial table.
+- **`SPATID`** *(int, FK → spatial.ID)*: Links the toponym to a geographic location.
+- **`DOUSPAT`** *(int, nullable)*: Certainty of spatial link (empty = certain, 1 = doubtful).
+- **`MAINID`** *(int, FK → linguistic.ID)*: Points to the main name in a cluster of interconnected names (sometimes grouping is necessary, many sources give several somehow connected names for a single spatial object and we want to reflect this connections in database).
+- **`TOPONYM`** *(string)*: The attested name as recorded from sources.
+- **`TOPFORMS`** *(text, nullable)*: Alternative spellings with worker comments (must contain **the same language text** info as the **`TOPONYM`** field).
+- **`DOUTOPO`** *(int, nullable)*: Certainty of the form in `TOPONYM` only (empty = certain, 1 = doubtful).
+- **`LANG`** *(string)*: The language of the toponym, following the predefined codes.
+- **`DOULANG`** *(int, nullable)*: Certainty of the language of the toponym (empty = certain, 1 = doubtful).
+- **`PRONUNC`** *(text, nullable)*: Reserved for pronunciation data (format TBD).
+- **`DOUPRON`** *(int, nullable)*: Certainty of the pronunciation data (empty = certain, 1 = doubtful).
+- **`ETYM`** *(text, nullable)*: Etymological explanation of the name's origin (always treated as doubtful).
+- **`ORIGIN`** *(string, nullable)*: Language from which the name originated, e.g. `CHU`, `RUS`, `TAT`, `OTH`, `UNK` (also always treated as doubtful).
+- **`COMMENTS`** *(text, nullable)*: Free-text notes.
+- **`OTHER`** *(text, nullable)*: Reserved for future use.
+
+---
+
+## **4. temporal table**
+
+### **Purpose:**
+Tracks historical changes in toponym usage over time.
+
+### **Columns:**
+- **`ID`** *(int, PK)*: Unique identifier for each historical record.
+- **`LINGID`, `LINGNAME`** *(Composite FK → linguistic (ID, TOPONYM))*: References the linguistic entry.
+- **`NESTID`** *(int, FK → temporal.ID)*: Used for grouping related toponyms (e.g., main village + merged settlements). Typically references the most current Russian toponym.
+- **`STARTYEAR`** *(int)*: Earliest recorded use of the name (the year of first mention according to some source: book or map or something else), full source cited in `FULLTEXT`.
+- **`DOUSTART`** *(int, nullable)*: Certainty of the start year.
+- **`ENDYEAR`** *(int, nullable)*: Last known use of the name (if still active, leave empty), full source cited in `FULLTEXT`.
+- **`DOUEND`** *(int, nullable)*: Certainty level for the `ENDYEAR` date (empty = "I'm sure", 1 = "It is doubtful").
+- **`EVENT`** *(string)*: Type of historical event:
+  - `MERGEIN`: Merged into another object
+  - `ACTIVE`: Still in use
+  - `RENAME`: Renamed
+  - `CEASE`: Ceased to exist.
+- **`OBJID`** *(int, nullable)*, **`OBJNAME`** *(text, nullable)*: Contextual references:
+  - **ACTIVE**: Leave both empty
+  - **RENAME**: Reference the new name, linguistic (ID, TOPONYM)
+  - **MERGEIN**: Reference the absorbing object's name, linguistic (ID, TOPONYM)
+  - **CEASE**: Leave both empty.
+- **`COMMENTS`** *(text, nullable)*: Free-text notes.
+- **`OTHER`** *(text, nullable)*: Reserved.
+- **`FULLTEXT`** *(text)*: A complete textual description of the historical event, providing transparency by explicitly referencing the relevant source(s), for instance, the source.ID (see **6. sources table**) along with page numbers or other identifying details when applicable.
+
+---
+
+## **5. experiment table**
+
+### **Purpose:**
+Used for prototyping new fields and storing all details from sources about when a name was used.
+
+### **Columns:**
+*Work in progress.*
+
+---
+
+## **6. sources table**
+
+### **Purpose:**
+Stores metadata for bibliographic or archival sources used in the `temporal` table.
+
+### **Columns:**
+- **`ID`** *(string, PK)*: Unique identifier.
+- **`TYPE`** *(string)*: Source type (`BOOK`, `ARTICLE`, `MAP`, `ARCHIVE`, `WEB`).
+- **`TITLE`** *(text)*: Short informal title of the source (book name, article title, map title etc).
+- **`YEAR`** *(int, nullable)*: Year of publication or creation.
+- **`CITATION`** *(text, nullable)*: Full bibliographic citation.
+- **`URL`** *(text, nullable)*: Web link to the source (if available).
+- **`DIGCOPY`** *(string, nullable)*: Is a digital copy stored in the project archive? (`yes` or `no`).
+- **`COMMENTS`** *(text, nullable)*: Additional notes.
+- **`OTHER`** *(text, nullable)*: Reserved.
+
+### **Usage & Relationship with Temporal Table**:
+Each record in the temporal table should refer to one or more sources, cited via free text in the `FULLTEXT` field.
+
+---
+
+## **Relationships and Interactions**
+- **`SPATID` (linguistic) → `ID` (spatial)**: Links names to places
+- **`LINGID`, `LINGNAME` (temporal) → linguistic**: Tracks historical shifts in usage
+- **`OBJID`, `OBJNAME` (temporal) → linguistic**: Tracks renaming and mergers
+
+---
+
+## **Next Steps**
+- Finalize a format for the `PRONUNC` field.
+- Define conventions for populating `DOUBT` fields.
+
+---
+
+## **Commenting Convention**
+All comment fields should follow a consistent format. A contributor adds their initials (based on surname, name, patronymic), followed by a colon and a space. If building upon another’s comment, separate entries using ` | ` (space-pipe-space).
+
+Example: 
+```
+SMY: в 27 км от райцентра, в лесах на правом берегу Суры, ныне не существует | IRI: На карте 1985 г. с retromap.ru это кордон Келейный, через реку от урочища Монашеское с более современных карт, примерно в месте впадения р. Чернушки в р. Бездну
+```