# Structure
tests/
├── __init__.py # Marks this as a Python package
├── config.py # Configuration (e.g., database connection details)
├── schema.py # Table creation and schema definitions
├── csv_loader.py # CSV loading logic
├── validators.py # Validation checks (schema, logic, data quality)
└── main.py # Entry point to tie everything together
## __init__.py
Empty.
## config.py
This file holds configuration details, such as database connection parameters and CSV file paths.
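A minimal sketch of what such a config module could look like. The `DbConfig` class, its field names, and the `data/` paths are hypothetical; only the CSV file names come from the command shown later in this document. The `PG*` variables are the standard libpq environment variables.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class DbConfig:
    """Connection settings (hypothetical container, not the original code)."""
    name: str
    user: str
    password: str
    host: str = "localhost"
    port: int = 5432

    @classmethod
    def from_env(cls, env=None):
        """Build a config from libpq-style environment variables."""
        env = os.environ if env is None else env
        return cls(
            name=env["PGDATABASE"],
            user=env["PGUSER"],
            password=env.get("PGPASSWORD", ""),
            host=env.get("PGHOST", "localhost"),
            port=int(env.get("PGPORT", "5432")),
        )

    def as_connect_kwargs(self):
        """Keyword arguments in the form psycopg2.connect() accepts."""
        return {
            "dbname": self.name,
            "user": self.user,
            "password": self.password,
            "host": self.host,
            "port": self.port,
        }


# Assumed default locations; adjust to wherever the CSV files actually live.
CSV_PATHS = {
    "spatial": "data/Spatial.csv",
    "linguistic": "data/Linguistic.csv",
    "temporal": "data/Temporal.csv",
    "sources": "data/Sources.csv",
}
```

Keeping the settings in one frozen dataclass means the other modules can receive a single object instead of reading globals.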
## schema.py
This file defines the PostgreSQL table schemas with proper constraints.
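As an illustration, a schema module along these lines would keep the DDL as strings and apply it through an open connection. The table names, column names, and allowed `lang` values below are assumptions made for the sketch; the real schema may differ.

```python
# Illustrative DDL only: columns and allowed values are assumptions.
SPATIAL_DDL = """
CREATE TABLE IF NOT EXISTS spatial (
    id        SERIAL PRIMARY KEY,
    latitude  DOUBLE PRECISION NOT NULL CHECK (latitude BETWEEN -90 AND 90),
    longitude DOUBLE PRECISION NOT NULL CHECK (longitude BETWEEN -180 AND 180)
)
"""

LINGUISTIC_DDL = """
CREATE TABLE IF NOT EXISTS linguistic (
    id         SERIAL PRIMARY KEY,
    spatial_id INTEGER NOT NULL REFERENCES spatial (id),
    lang       TEXT NOT NULL CHECK (lang IN ('en', 'fr'))
)
"""

# Parent tables first, so foreign keys can resolve.
ALL_DDL = (SPATIAL_DDL, LINGUISTIC_DDL)


def create_tables(conn):
    """Run every CREATE TABLE statement on an open DB-API connection."""
    with conn.cursor() as cur:
        for ddl in ALL_DDL:
            cur.execute(ddl)
    conn.commit()
```

`create_tables` expects a connection whose cursors work as context managers, which psycopg2 connections provide.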
## csv_loader.py
Handles CSV parsing and loading into PostgreSQL.
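One possible shape for the loader, shown as a sketch rather than the original implementation. The function names `iter_rows`, `build_insert`, and `load_csv` are hypothetical. Table and column names are interpolated into the SQL, so they must come from trusted code, never from user input; the values themselves go through `%s` placeholders.

```python
import csv


def iter_rows(lines, columns):
    """Yield one tuple per CSV record, in `columns` order.

    `lines` is any iterable of text lines (an open file works).
    Empty fields become None so they are stored as SQL NULL.
    """
    reader = csv.DictReader(lines)
    missing = [c for c in columns if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing CSV columns: {missing}")
    for row in reader:
        yield tuple(row[c] or None for c in columns)


def build_insert(table, columns):
    """Build a parameterized INSERT using psycopg2-style %s placeholders."""
    placeholders = ", ".join(["%s"] * len(columns))
    return f"INSERT INTO {table} ({', '.join(columns)}) VALUES ({placeholders})"


def load_csv(conn, path, table, columns):
    """Insert every row of the CSV at `path`; return the number of rows."""
    with open(path, newline="", encoding="utf-8") as fh:
        rows = list(iter_rows(fh, columns))
    with conn.cursor() as cur:
        cur.executemany(build_insert(table, columns), rows)
    conn.commit()
    return len(rows)
```

Splitting parsing from inserting lets the parsing step be tested without a database.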
## validators.py
Contains all validation logic, adapted for PostgreSQL syntax.
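A sketch of how data-quality checks could be written as small functions that each return a list of problem messages. The function names and the checks chosen here are assumptions. One PostgreSQL-specific detail: before PostgreSQL 16, a subquery in `FROM` must have an alias, hence `AS dup`. As with the loader, table and column names are assumed to be trusted identifiers.

```python
def _scalar(conn, sql):
    """Run a query that returns a single value and return that value."""
    with conn.cursor() as cur:
        cur.execute(sql)
        return cur.fetchone()[0]


def check_not_null(conn, table, column):
    """Report rows where `column` is NULL."""
    n = _scalar(conn, f"SELECT COUNT(*) FROM {table} WHERE {column} IS NULL")
    return [f"{table}.{column}: {n} NULL value(s)"] if n else []


def check_unique(conn, table, column):
    """Report values of `column` that occur more than once."""
    n = _scalar(
        conn,
        f"SELECT COUNT(*) FROM ("
        f"SELECT {column} FROM {table} GROUP BY {column} HAVING COUNT(*) > 1"
        f") AS dup",
    )
    return [f"{table}.{column}: {n} duplicated value(s)"] if n else []


def run_checks(conn, checks):
    """Run (function, table, column) triples and collect all messages."""
    problems = []
    for check, table, column in checks:
        problems.extend(check(conn, table, column))
    return problems
```

Returning messages instead of raising lets a single run report every problem at once.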
## main.py
Ties everything together and runs the validation.
# Key Changes and Notes
## PostgreSQL Integration
- Replaced sqlite3 with psycopg2 for PostgreSQL connectivity.
- Used SERIAL PRIMARY KEY for auto-incrementing IDs (the closest PostgreSQL counterpart to SQLite’s INTEGER PRIMARY KEY, which assigns row IDs automatically).
- Moved some checks (e.g., coordinate ranges, allowed LANG/EVENT values) into schema-level CHECK constraints, so PostgreSQL rejects invalid rows at insert time.
## Modular Design
- Separated concerns into configuration, schema, loading, validation, and execution.
- Each module can be independently tested or extended.
## Error Handling
- Added basic database connection error handling.
- Closes the connection in a finally block, so it is released even when an error occurs.
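The pattern described above can be sketched as follows. To keep the example runnable without a database, the connection factory and the exception class are passed in as parameters; `run_with_connection` and its parameter names are hypothetical, not the original code.

```python
import sys


def run_with_connection(connect, work, db_error=Exception):
    """Open a connection, run `work(conn)`, and always close the connection.

    connect:  zero-argument callable returning a DB-API connection.
    work:     callable that receives the open connection.
    db_error: exception class treated as a database failure.
    Returns 0 on success and 1 on a database error.
    """
    conn = None
    try:
        conn = connect()
        work(conn)
        return 0
    except db_error as exc:
        print(f"Database error: {exc}", file=sys.stderr)
        return 1
    finally:
        # conn is still None if connect() itself failed.
        if conn is not None:
            conn.close()
```

With psycopg2 this would be called as `run_with_connection(lambda: psycopg2.connect(dbname=..., user=..., password=...), work, db_error=psycopg2.Error)`.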
## Execution
Run the script with:
```
python main.py \
--spatial /path/to/Spatial.csv \
--linguistic /path/to/Linguistic.csv \
--temporal /path/to/Temporal.csv \
--sources /path/to/Sources.csv \
--db-name your_db \
--db-user your_user \
--db-password your_password
```
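The flags in the command above imply an argument parser roughly like the sketch below. The flag names are taken from the command; marking every flag as required and the help texts are assumptions.

```python
import argparse


def build_parser():
    """Parser for the flags used in the command above."""
    parser = argparse.ArgumentParser(
        description="Load CSV files into PostgreSQL and validate them."
    )
    for name in ("spatial", "linguistic", "temporal", "sources"):
        parser.add_argument(
            f"--{name}", required=True, help=f"Path to the {name} CSV file"
        )
    parser.add_argument("--db-name", required=True, help="Database name")
    parser.add_argument("--db-user", required=True, help="Database user")
    parser.add_argument("--db-password", required=True, help="Database password")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    # argparse converts dashes to underscores: --db-name becomes args.db_name
    print(args.db_name, args.spatial)
```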
## Customization
- Adjust column names in main.py to match your CSV files exactly.
- Add more validation rules in validators.py as needed.
- Modify schema.py if your constraints differ.