Data Stack
Overview
The Data Stack provides the persistence layer for the HealthFlow NDP platform, including relational databases, NoSQL databases, caching, and message queues.
Stack Architecture
Components
1. PostgreSQL
Purpose: Primary relational database for core NDP data
Version: PostgreSQL 15.x
Usage:
- Prescription records
- Patient registry
- Healthcare Provider Registry (HPR)
- Dispense records
- Transactional data
Key Features:
- ACID compliance
- Advanced indexing
- Full-text search
- JSON support (JSONB)
- Row-level security
- Streaming replication for HA
2. MySQL
Purpose: Legacy database for pharmacy and medicine data
Version: MySQL 8.0.x
Usage:
- Pharmacy registry
- Medicine directory (national drug database)
- EDA integration data
- Legacy system compatibility
Key Features:
- InnoDB storage engine
- ACID transactions
- Master-slave replication
- Point-in-time recovery
3. MongoDB
Purpose: Document database for audit and compliance logs
Version: MongoDB 6.x
Usage:
- Audit trail storage
- Compliance logging
- Event history
- Analytics data
- Unstructured data storage
Key Features:
- Flexible schema
- High write throughput
- Horizontal scalability
- Aggregation pipeline
- Time-series collections
4. Redis
Purpose: In-memory cache and session store
Version: Redis 7.x
Usage:
- API response caching
- Session management
- Rate limiting counters
- Real-time data
- Pub/sub messaging
- Distributed locks
Key Features:
- Sub-millisecond latency
- Multiple data structures
- Persistence options (RDB/AOF)
- Redis Sentinel for HA
- Cluster mode support
5. Apache Kafka
Purpose: Distributed event streaming platform
Version: Kafka 3.x
Usage:
- Event-driven architecture
- Service-to-service messaging
- Audit log streaming
- Real-time data pipelines
- Analytics event collection
Key Features:
- High throughput
- Fault tolerance
- Durable storage
- Stream processing
- Multiple consumers
Kubernetes Manifests
Namespace
apiVersion: v1
kind: Namespace
metadata:
name: data-stack
labels:
name: data-stack
stack: infrastructurePostgreSQL StatefulSet
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: postgresql
namespace: data-stack
spec:
serviceName: postgresql
replicas: 1
selector:
matchLabels:
app: postgresql
template:
metadata:
labels:
app: postgresql
spec:
containers:
- name: postgresql
image: postgres:15-alpine
ports:
- containerPort: 5432
name: postgres
env:
- name: POSTGRES_DB
value: "ndp"
- name: POSTGRES_USER
value: "ndp_user"
- name: POSTGRES_PASSWORD
valueFrom:
secretKeyRef:
name: postgresql-secret
key: password
- name: PGDATA
value: /var/lib/postgresql/data/pgdata
volumeMounts:
- name: postgresql-storage
mountPath: /var/lib/postgresql/data
resources:
limits:
cpu: "4"
memory: 8Gi
requests:
cpu: "2"
memory: 4Gi
livenessProbe:
exec:
command:
- /bin/sh
- -c
- pg_isready -U ndp_user
initialDelaySeconds: 30
periodSeconds: 10
readinessProbe:
exec:
command:
- /bin/sh
- -c
- pg_isready -U ndp_user
initialDelaySeconds: 5
periodSeconds: 5
volumeClaimTemplates:
- metadata:
name: postgresql-storage
spec:
accessModes: ["ReadWriteOnce"]
storageClassName: gp3
resources:
requests:
storage: 500GiPostgreSQL Service
apiVersion: v1
kind: Service
metadata:
name: postgresql
namespace: data-stack
spec:
type: ClusterIP
selector:
app: postgresql
ports:
- port: 5432
targetPort: 5432
protocol: TCPMySQL StatefulSet
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: mysql
namespace: data-stack
spec:
serviceName: mysql
replicas: 1
selector:
matchLabels:
app: mysql
template:
metadata:
labels:
app: mysql
spec:
containers:
- name: mysql
image: mysql:8.0
ports:
- containerPort: 3306
name: mysql
env:
- name: MYSQL_ROOT_PASSWORD
valueFrom:
secretKeyRef:
name: mysql-secret
key: root-password
- name: MYSQL_DATABASE
value: "pharmacy"
- name: MYSQL_USER
value: "pharmacy_user"
- name: MYSQL_PASSWORD
valueFrom:
secretKeyRef:
name: mysql-secret
key: password
volumeMounts:
- name: mysql-storage
mountPath: /var/lib/mysql
resources:
limits:
cpu: "2"
memory: 4Gi
requests:
cpu: "1"
memory: 2Gi
livenessProbe:
exec:
command:
- mysqladmin
- ping
- -h
- localhost
initialDelaySeconds: 30
periodSeconds: 10
readinessProbe:
exec:
command:
- mysql
- -h
- localhost
- -u
- root
- -p${MYSQL_ROOT_PASSWORD}
- -e
- SELECT 1
initialDelaySeconds: 5
periodSeconds: 5
volumeClaimTemplates:
- metadata:
name: mysql-storage
spec:
accessModes: ["ReadWriteOnce"]
storageClassName: gp3
resources:
requests:
storage: 100GiRedis StatefulSet
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: redis
namespace: data-stack
spec:
serviceName: redis
replicas: 1
selector:
matchLabels:
app: redis
template:
metadata:
labels:
app: redis
spec:
containers:
- name: redis
image: redis:7-alpine
command:
- redis-server
- --requirepass
- $(REDIS_PASSWORD)
- --appendonly
- "yes"
ports:
- containerPort: 6379
name: redis
env:
- name: REDIS_PASSWORD
valueFrom:
secretKeyRef:
name: redis-secret
key: password
volumeMounts:
- name: redis-storage
mountPath: /data
resources:
limits:
cpu: "2"
memory: 4Gi
requests:
cpu: "500m"
memory: 2Gi
livenessProbe:
exec:
command:
- redis-cli
- -a
- $(REDIS_PASSWORD)
- ping
initialDelaySeconds: 30
periodSeconds: 10
readinessProbe:
exec:
command:
- redis-cli
- -a
- $(REDIS_PASSWORD)
- ping
initialDelaySeconds: 5
periodSeconds: 5
volumeClaimTemplates:
- metadata:
name: redis-storage
spec:
accessModes: ["ReadWriteOnce"]
storageClassName: gp3
resources:
requests:
storage: 20GiMongoDB StatefulSet
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: mongodb
namespace: data-stack
spec:
serviceName: mongodb
replicas: 1
selector:
matchLabels:
app: mongodb
template:
metadata:
labels:
app: mongodb
spec:
containers:
- name: mongodb
image: mongo:6
ports:
- containerPort: 27017
name: mongodb
env:
- name: MONGO_INITDB_ROOT_USERNAME
value: "admin"
- name: MONGO_INITDB_ROOT_PASSWORD
valueFrom:
secretKeyRef:
name: mongodb-secret
key: root-password
- name: MONGO_INITDB_DATABASE
value: "audit"
volumeMounts:
- name: mongodb-storage
mountPath: /data/db
resources:
limits:
cpu: "2"
memory: 4Gi
requests:
cpu: "1"
memory: 2Gi
livenessProbe:
exec:
command:
- mongo
- --eval
- db.adminCommand('ping')
initialDelaySeconds: 30
periodSeconds: 10
readinessProbe:
exec:
command:
- mongo
- --eval
- db.adminCommand('ping')
initialDelaySeconds: 5
periodSeconds: 5
volumeClaimTemplates:
- metadata:
name: mongodb-storage
spec:
accessModes: ["ReadWriteOnce"]
storageClassName: gp3
resources:
requests:
storage: 100GiService Dependencies
Environment Variables
PostgreSQL
| Variable | Description | Required |
|---|---|---|
POSTGRES_DB | Database name | Yes |
POSTGRES_USER | Database user | Yes |
POSTGRES_PASSWORD | User password | Yes |
PGDATA | Data directory path | Yes |
MySQL
| Variable | Description | Required |
|---|---|---|
MYSQL_ROOT_PASSWORD | Root password | Yes |
MYSQL_DATABASE | Initial database | No |
MYSQL_USER | Application user | No |
MYSQL_PASSWORD | User password | No |
MongoDB
| Variable | Description | Required |
|---|---|---|
MONGO_INITDB_ROOT_USERNAME | Root username | Yes |
MONGO_INITDB_ROOT_PASSWORD | Root password | Yes |
MONGO_INITDB_DATABASE | Initial database | No |
Redis
| Variable | Description | Required |
|---|---|---|
REDIS_PASSWORD | Authentication password | Yes |
Deployment Instructions
1. Create Namespace and Secrets
# Create namespace
kubectl create namespace data-stack
# Create PostgreSQL secret
kubectl create secret generic postgresql-secret \
--from-literal=password=$(openssl rand -base64 32) \
--namespace data-stack
# Create MySQL secret
kubectl create secret generic mysql-secret \
--from-literal=root-password=$(openssl rand -base64 32) \
--from-literal=password=$(openssl rand -base64 32) \
--namespace data-stack
# Create MongoDB secret
kubectl create secret generic mongodb-secret \
--from-literal=root-password=$(openssl rand -base64 32) \
--namespace data-stack
# Create Redis secret
kubectl create secret generic redis-secret \
--from-literal=password=$(openssl rand -base64 32) \
--namespace data-stack2. Deploy Databases
# Deploy PostgreSQL
kubectl apply -f postgresql/statefulset.yaml
kubectl apply -f postgresql/service.yaml
# Deploy MySQL
kubectl apply -f mysql/statefulset.yaml
kubectl apply -f mysql/service.yaml
# Deploy MongoDB
kubectl apply -f mongodb/statefulset.yaml
kubectl apply -f mongodb/service.yaml
# Deploy Redis
kubectl apply -f redis/statefulset.yaml
kubectl apply -f redis/service.yaml3. Verify Deployment
# Check pod status
kubectl get pods -n data-stack
# Check persistent volumes
kubectl get pvc -n data-stack
# Check services
kubectl get svc -n data-stack
# Test PostgreSQL connection
kubectl exec -it -n data-stack postgresql-0 -- psql -U ndp_user -d ndp
# Test MySQL connection
kubectl exec -it -n data-stack mysql-0 -- mysql -u root -p
# Test MongoDB connection
kubectl exec -it -n data-stack mongodb-0 -- mongosh -u admin -p
# Test Redis connection
kubectl exec -it -n data-stack redis-0 -- redis-cli -a <password> pingDatabase Initialization
Schema Management
Database schemas are managed by the application services themselves and will evolve during the development cycle.
- PostgreSQL schemas are managed by the application microservices using migration tools (e.g., Flyway, Liquibase)
- MySQL schemas are versioned with the application code
- MongoDB collections are created dynamically by the application
Refer to individual service documentation for current schema definitions.
Backup Strategy
Automated Backups
apiVersion: batch/v1
kind: CronJob
metadata:
name: postgresql-backup
namespace: data-stack
spec:
schedule: "0 2 * * *" # Daily at 2 AM
jobTemplate:
spec:
template:
spec:
containers:
- name: backup
image: postgres:15-alpine
command:
- /bin/sh
- -c
- |
pg_dump -h postgresql -U ndp_user ndp | \
gzip > /backup/ndp-$(date +%Y%m%d-%H%M%S).sql.gz
env:
- name: PGPASSWORD
valueFrom:
secretKeyRef:
name: postgresql-secret
key: password
volumeMounts:
- name: backup-storage
mountPath: /backup
restartPolicy: OnFailure
volumes:
- name: backup-storage
persistentVolumeClaim:
claimName: backup-pvcMonitoring & Alerts
Key Metrics to Monitor
PostgreSQL:
- Connection count
- Transaction rate
- Query duration
- Cache hit ratio
- Replication lag
MySQL:
- Connection count
- Slow query log
- InnoDB buffer pool usage
- Replication status
MongoDB:
- Operation latency
- Document count
- Write concern
- Replication lag
Redis:
- Memory usage
- Hit rate
- Evicted keys
- Connected clients
Kafka:
- Message rate
- Consumer lag
- Partition count
- Disk usage
Resource Requirements
Estimates
These are rough estimates. Monitor actual usage and adjust accordingly.
| Service | CPU | Memory | Storage | IOPS |
|---|---|---|---|---|
| PostgreSQL | 4 cores | 8 GB | 500 GB | 3000 |
| MySQL | 2 cores | 4 GB | 100 GB | 1000 |
| MongoDB | 2 cores | 4 GB | 100 GB | 1000 |
| Redis | 2 cores | 4 GB | 20 GB | 500 |
| Kafka | 2 cores | 4 GB | 50 GB | 1000 |
Troubleshooting
PostgreSQL Won't Start
# Check logs
kubectl logs -n data-stack postgresql-0
# Check permissions
kubectl exec -n data-stack postgresql-0 -- ls -la /var/lib/postgresql/data
# Reset data directory (CAUTION: destroys data)
kubectl delete pvc -n data-stack postgresql-storage-postgresql-0Redis Out of Memory
# Check memory usage
kubectl exec -n data-stack redis-0 -- redis-cli -a <password> INFO memory
# Clear all keys (CAUTION)
kubectl exec -n data-stack redis-0 -- redis-cli -a <password> FLUSHALLNext Steps
- Monitoring Stack - Setup observability
- Discovery Stack - Deploy Consul & Vault
- Application Stack - Deploy NDP services