Artifact d76519318c197f06b223b601efe456b040b3d04091a2df058c58fa2a51dd7d97:
- File tests/test_docs_v0.5.md — part of check-in [da53d8a18a] at 2025-03-05 17:46:33 on branch trunk — 2025 Mar 05 fix test_docs_v0.5.md (user: sisrahtak, size: 2003)
Structure
tests/
├── __init__.py # Marks this as a Python package
├── config.py # Configuration (e.g., database connection details)
├── schema.py # Table creation and schema definitions
├── csv_loader.py # CSV loading logic
├── validators.py # Validation checks (schema, logic, data quality)
└── main.py # Entry point to tie everything together
__init__.py
Empty; its only job is to mark tests/ as a Python package.
config.py
This file holds configuration details, such as database connection parameters and CSV file paths.
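As a rough illustration of what such a module might contain, here is a minimal sketch. The class name `DbConfig`, the `CSV_PATHS` mapping, and the host and port defaults are assumptions made for this example; the original does not show the file's contents.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class DbConfig:
    """Connection parameters; every default here is a placeholder."""
    name: str = "your_db"
    user: str = "your_user"
    password: str = ""
    host: str = "localhost"  # assumed; the original does not name a host
    port: int = 5432         # PostgreSQL's default port

    def as_kwargs(self) -> Dict[str, object]:
        """Return keyword arguments in the form psycopg2.connect() accepts."""
        return {
            "dbname": self.name,
            "user": self.user,
            "password": self.password,
            "host": self.host,
            "port": self.port,
        }


# Hypothetical CSV locations, keyed by the dataset names used on the command line.
CSV_PATHS: Dict[str, str] = {
    "spatial": "/path/to/Spatial.csv",
    "linguistic": "/path/to/Linguistic.csv",
    "temporal": "/path/to/Temporal.csv",
    "sources": "/path/to/Sources.csv",
}
```

With psycopg2 installed, a connection could then be opened with `psycopg2.connect(**DbConfig().as_kwargs())`.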
schema.py
This file defines the PostgreSQL table schemas, including auto-incrementing IDs and CHECK constraints.
csv_loader.py
Handles CSV parsing and loading into PostgreSQL.
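The loading step could look something like the sketch below. The function names `read_rows` and `insert_rows`, and the column names in the usage lines, are hypothetical. The only psycopg2-specific detail is the `%s` placeholder style, and the cursor is passed in so that the parsing part can be exercised without a database.

```python
import csv
import io
from typing import Iterable, List, Sequence, TextIO, Tuple


def read_rows(handle: TextIO, columns: Sequence[str]) -> List[Tuple[str, ...]]:
    """Parse CSV text into tuples ordered by `columns`.

    Raises ValueError if the header lacks any expected column.
    """
    reader = csv.DictReader(handle)
    header = reader.fieldnames or []
    missing = [c for c in columns if c not in header]
    if missing:
        raise ValueError(f"CSV is missing columns: {missing}")
    return [tuple(row[c] for c in columns) for row in reader]


def insert_rows(cursor, table: str, columns: Sequence[str],
                rows: Iterable[Tuple[str, ...]]) -> None:
    """Bulk-insert rows through a DB-API cursor.

    Values go through %s placeholders (the style psycopg2 uses). The table
    and column names are interpolated directly, so they must come from
    trusted code, never from user input.
    """
    placeholders = ", ".join(["%s"] * len(columns))
    sql = f"INSERT INTO {table} ({', '.join(columns)}) VALUES ({placeholders})"
    cursor.executemany(sql, list(rows))


if __name__ == "__main__":
    # Hypothetical column names; adjust them to the real CSV headers.
    sample = io.StringIO("ID,LAT,LON\n1,48.85,2.35\n2,51.50,-0.12\n")
    print(read_rows(sample, ["ID", "LAT", "LON"]))
```

Keeping parsing separate from insertion means a malformed header is reported before any row reaches the database.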
validators.py
Contains all validation logic (schema, logic, and data-quality checks), adapted for PostgreSQL syntax.
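One common way to organize SQL-based validation is to write each rule as a query that returns the offending rows, so an empty result means the rule passed. The sketch below assumes that approach; the table names, column names, and the two example rules are invented for illustration and are not taken from the actual validators.py.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical checks: each query returns the rows that violate a rule.
CHECKS: Dict[str, str] = {
    "duplicate coordinates":
        "SELECT lat, lon, COUNT(*) FROM spatial "
        "GROUP BY lat, lon HAVING COUNT(*) > 1",
    "linguistic rows without a spatial match":
        "SELECT l.id FROM linguistic AS l "
        "LEFT JOIN spatial AS s ON s.id = l.spatial_id "
        "WHERE s.id IS NULL",
}


def run_checks(cursor, checks: Optional[Dict[str, str]] = None) -> Dict[str, List[Tuple]]:
    """Run each query and return only the checks that found offending rows."""
    failures: Dict[str, List[Tuple]] = {}
    for name, query in (checks or CHECKS).items():
        cursor.execute(query)
        rows = cursor.fetchall()
        if rows:
            failures[name] = rows
    return failures
```

Because `run_checks` only relies on `execute` and `fetchall`, any DB-API cursor works, including a psycopg2 one. An empty dictionary means every check passed.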
main.py
Ties everything together and runs the validation.
Key Changes and Notes
PostgreSQL Integration:
- Replaced sqlite3 with psycopg2 for PostgreSQL connectivity.
- Added SERIAL for auto-incrementing IDs (the closest PostgreSQL counterpart to SQLite’s auto-assigned INTEGER PRIMARY KEY).
- Moved some checks (e.g., coordinate ranges, allowed LANG/EVENT values) into schema-level CHECK constraints, leveraging PostgreSQL’s capabilities.
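To make the SERIAL and CHECK points concrete, a schema module along these lines is one possibility. The table names, column names, and the allowed language codes are placeholders chosen for this sketch; the real schema.py is not shown in the original.

```python
# Hypothetical DDL: names and allowed values are placeholders.
SPATIAL_DDL = """
CREATE TABLE IF NOT EXISTS spatial (
    id  SERIAL PRIMARY KEY,
    lat DOUBLE PRECISION NOT NULL CHECK (lat BETWEEN -90 AND 90),
    lon DOUBLE PRECISION NOT NULL CHECK (lon BETWEEN -180 AND 180)
)
"""

LINGUISTIC_DDL = """
CREATE TABLE IF NOT EXISTS linguistic (
    id         SERIAL PRIMARY KEY,
    spatial_id INTEGER NOT NULL REFERENCES spatial (id),
    lang       TEXT NOT NULL CHECK (lang IN ('en', 'fr'))
)
"""

# spatial comes first because linguistic holds a foreign key to it.
ALL_DDL = (SPATIAL_DDL, LINGUISTIC_DDL)


def create_tables(cursor) -> None:
    """Execute each CREATE TABLE statement through a DB-API cursor."""
    for ddl in ALL_DDL:
        cursor.execute(ddl)
```

With the constraints in the schema, PostgreSQL rejects an out-of-range coordinate or an unknown language code at insert time, so the same rule does not have to be repeated in application code.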
Modular Design:
- Separated concerns into configuration, schema, loading, validation, and execution.
- Each module can be independently tested or extended.
Error Handling:
- Added basic database connection error handling.
- Closes the connection in a finally block, so it is released even when an error occurs.
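The pattern described above can be sketched as follows. The helper name `run_with_connection` is hypothetical. The connection factory is passed in as a callable so the sketch does not depend on psycopg2 being installed; in real use it might be `lambda: psycopg2.connect(**kwargs)`.

```python
import sys
from typing import Any, Callable, Optional


def run_with_connection(connect: Callable[[], Any],
                        work: Callable[[Any], Any]) -> Optional[Any]:
    """Open a connection, run work(conn), and always close the connection.

    Commits on success, rolls back on failure, and returns None if anything
    went wrong (including a failed connection attempt).
    """
    conn = None
    try:
        conn = connect()
        result = work(conn)
        conn.commit()
        return result
    except Exception as exc:
        # With psycopg2 this would usually be narrowed to psycopg2.Error.
        if conn is not None:
            conn.rollback()
        print(f"Database error: {exc}", file=sys.stderr)
        return None
    finally:
        # Runs on every path, so the connection is never leaked.
        if conn is not None:
            conn.close()
```

Initializing `conn` to None before the try block matters: if `connect()` itself fails, the finally clause still runs and must not call `close()` on a connection that was never opened.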
Execution:
Run the script with:
python main.py \
    --spatial /path/to/Spatial.csv \
    --linguistic /path/to/Linguistic.csv \
    --temporal /path/to/Temporal.csv \
    --sources /path/to/Sources.csv \
    --db-name your_db \
    --db-user your_user \
    --db-password your_password
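For reference, the flags in the command above could be parsed with argparse roughly as follows. The flag names come from the command itself. Marking them as required, the empty default password, and the help strings are assumptions of this sketch.

```python
import argparse
from typing import List, Optional


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the flags shown in the example command."""
    parser = argparse.ArgumentParser(
        description="Load CSV files into PostgreSQL and validate them."
    )
    for name in ("spatial", "linguistic", "temporal", "sources"):
        parser.add_argument(
            f"--{name}", required=True, help=f"Path to the {name} CSV file"
        )
    parser.add_argument("--db-name", required=True, help="Database name")
    parser.add_argument("--db-user", required=True, help="Database user")
    # Assumed optional; an empty default suits password-less local setups.
    parser.add_argument("--db-password", default="", help="Database password")
    return parser


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse argv, or sys.argv[1:] when argv is None."""
    return build_parser().parse_args(argv)
```

argparse turns the hyphen in `--db-name` into an underscore, so the value is read as `args.db_name`.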
Customization:
- Adjust column names in main.py to match your CSV files exactly.
- Add more validation rules in validators.py as needed.
- Modify schema.py if your constraints differ.