Data Stack

Overview

The Data Stack provides the persistence layer for the HealthFlow NDP platform, including relational databases, NoSQL databases, caching, and message queues.

Stack Architecture

Components

1. PostgreSQL

Purpose: Primary relational database for core NDP data

Version: PostgreSQL 15.x

Usage:

Prescription records
Patient registry
Healthcare Provider Registry (HPR)
Dispense records
Transactional data

Key Features:

ACID compliance
Advanced indexing
Full-text search
JSON support (JSONB)
Row-level security
Streaming replication for HA

2. MySQL

Purpose: Legacy database for pharmacy and medicine data

Version: MySQL 8.0.x

Usage:

Pharmacy registry
Medicine directory (national drug database)
EDA integration data
Legacy system compatibility

Key Features:

InnoDB storage engine
ACID transactions
Master-slave replication
Point-in-time recovery

3. MongoDB

Purpose: Document database for audit and compliance logs

Version: MongoDB 6.x

Usage:

Audit trail storage
Compliance logging
Event history
Analytics data
Unstructured data storage

Key Features:

Flexible schema
High write throughput
Horizontal scalability
Aggregation pipeline
Time-series collections

4. Redis

Purpose: In-memory cache and session store

Version: Redis 7.x

Usage:

API response caching
Session management
Rate limiting counters
Real-time data
Pub/sub messaging
Distributed locks

Key Features:

Sub-millisecond latency
Multiple data structures
Persistence options (RDB/AOF)
Redis Sentinel for HA
Cluster mode support

5. Apache Kafka

Purpose: Distributed event streaming platform

Version: Kafka 3.x

Usage:

Event-driven architecture
Service-to-service messaging
Audit log streaming
Real-time data pipelines
Analytics event collection

Key Features:

High throughput
Fault tolerance
Durable storage
Stream processing
Multiple consumers

Kubernetes Manifests

Namespace

yaml

apiVersion: v1
kind: Namespace
metadata:
  name: data-stack
  labels:
    name: data-stack
    stack: infrastructure

PostgreSQL StatefulSet

yaml

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgresql
  namespace: data-stack
spec:
  serviceName: postgresql
  replicas: 1
  selector:
    matchLabels:
      app: postgresql
  template:
    metadata:
      labels:
        app: postgresql
    spec:
      containers:
      - name: postgresql
        image: postgres:15-alpine
        ports:
        - containerPort: 5432
          name: postgres
        env:
        - name: POSTGRES_DB
          value: "ndp"
        - name: POSTGRES_USER
          value: "ndp_user"
        - name: POSTGRES_PASSWORD
          valueFrom:
            secretKeyRef:
              name: postgresql-secret
              key: password
        - name: PGDATA
          value: /var/lib/postgresql/data/pgdata
        volumeMounts:
        - name: postgresql-storage
          mountPath: /var/lib/postgresql/data
        resources:
          limits:
            cpu: "4"
            memory: 8Gi
          requests:
            cpu: "2"
            memory: 4Gi
        livenessProbe:
          exec:
            command:
            - /bin/sh
            - -c
            - pg_isready -U ndp_user
          initialDelaySeconds: 30
          periodSeconds: 10
        readinessProbe:
          exec:
            command:
            - /bin/sh
            - -c
            - pg_isready -U ndp_user
          initialDelaySeconds: 5
          periodSeconds: 5
  volumeClaimTemplates:
  - metadata:
      name: postgresql-storage
    spec:
      accessModes: ["ReadWriteOnce"]
      storageClassName: gp3
      resources:
        requests:
          storage: 500Gi

PostgreSQL Service

yaml

apiVersion: v1
kind: Service
metadata:
  name: postgresql
  namespace: data-stack
spec:
  type: ClusterIP
  selector:
    app: postgresql
  ports:
  - port: 5432
    targetPort: 5432
    protocol: TCP

MySQL StatefulSet

yaml

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mysql
  namespace: data-stack
spec:
  serviceName: mysql
  replicas: 1
  selector:
    matchLabels:
      app: mysql
  template:
    metadata:
      labels:
        app: mysql
    spec:
      containers:
      - name: mysql
        image: mysql:8.0
        ports:
        - containerPort: 3306
          name: mysql
        env:
        - name: MYSQL_ROOT_PASSWORD
          valueFrom:
            secretKeyRef:
              name: mysql-secret
              key: root-password
        - name: MYSQL_DATABASE
          value: "pharmacy"
        - name: MYSQL_USER
          value: "pharmacy_user"
        - name: MYSQL_PASSWORD
          valueFrom:
            secretKeyRef:
              name: mysql-secret
              key: password
        volumeMounts:
        - name: mysql-storage
          mountPath: /var/lib/mysql
        resources:
          limits:
            cpu: "2"
            memory: 4Gi
          requests:
            cpu: "1"
            memory: 2Gi
        livenessProbe:
          exec:
            command:
            - mysqladmin
            - ping
            - -h
            - localhost
          initialDelaySeconds: 30
          periodSeconds: 10
        readinessProbe:
          exec:
            command:
            - mysql
            - -h
            - localhost
            - -u
            - root
            - -p${MYSQL_ROOT_PASSWORD}
            - -e
            - SELECT 1
          initialDelaySeconds: 5
          periodSeconds: 5
  volumeClaimTemplates:
  - metadata:
      name: mysql-storage
    spec:
      accessModes: ["ReadWriteOnce"]
      storageClassName: gp3
      resources:
        requests:
          storage: 100Gi

Redis StatefulSet

yaml

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: redis
  namespace: data-stack
spec:
  serviceName: redis
  replicas: 1
  selector:
    matchLabels:
      app: redis
  template:
    metadata:
      labels:
        app: redis
    spec:
      containers:
      - name: redis
        image: redis:7-alpine
        command:
        - redis-server
        - --requirepass
        - $(REDIS_PASSWORD)
        - --appendonly
        - "yes"
        ports:
        - containerPort: 6379
          name: redis
        env:
        - name: REDIS_PASSWORD
          valueFrom:
            secretKeyRef:
              name: redis-secret
              key: password
        volumeMounts:
        - name: redis-storage
          mountPath: /data
        resources:
          limits:
            cpu: "2"
            memory: 4Gi
          requests:
            cpu: "500m"
            memory: 2Gi
        livenessProbe:
          exec:
            command:
            - redis-cli
            - -a
            - $(REDIS_PASSWORD)
            - ping
          initialDelaySeconds: 30
          periodSeconds: 10
        readinessProbe:
          exec:
            command:
            - redis-cli
            - -a
            - $(REDIS_PASSWORD)
            - ping
          initialDelaySeconds: 5
          periodSeconds: 5
  volumeClaimTemplates:
  - metadata:
      name: redis-storage
    spec:
      accessModes: ["ReadWriteOnce"]
      storageClassName: gp3
      resources:
        requests:
          storage: 20Gi

MongoDB StatefulSet

yaml

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mongodb
  namespace: data-stack
spec:
  serviceName: mongodb
  replicas: 1
  selector:
    matchLabels:
      app: mongodb
  template:
    metadata:
      labels:
        app: mongodb
    spec:
      containers:
      - name: mongodb
        image: mongo:6
        ports:
        - containerPort: 27017
          name: mongodb
        env:
        - name: MONGO_INITDB_ROOT_USERNAME
          value: "admin"
        - name: MONGO_INITDB_ROOT_PASSWORD
          valueFrom:
            secretKeyRef:
              name: mongodb-secret
              key: root-password
        - name: MONGO_INITDB_DATABASE
          value: "audit"
        volumeMounts:
        - name: mongodb-storage
          mountPath: /data/db
        resources:
          limits:
            cpu: "2"
            memory: 4Gi
          requests:
            cpu: "1"
            memory: 2Gi
        livenessProbe:
          exec:
            command:
            - mongo
            - --eval
            - db.adminCommand('ping')
          initialDelaySeconds: 30
          periodSeconds: 10
        readinessProbe:
          exec:
            command:
            - mongo
            - --eval
            - db.adminCommand('ping')
          initialDelaySeconds: 5
          periodSeconds: 5
  volumeClaimTemplates:
  - metadata:
      name: mongodb-storage
    spec:
      accessModes: ["ReadWriteOnce"]
      storageClassName: gp3
      resources:
        requests:
          storage: 100Gi

Service Dependencies

Environment Variables

PostgreSQL

Variable	Description	Required
`POSTGRES_DB`	Database name	Yes
`POSTGRES_USER`	Database user	Yes
`POSTGRES_PASSWORD`	User password	Yes
`PGDATA`	Data directory path	Yes

MySQL

Variable	Description	Required
`MYSQL_ROOT_PASSWORD`	Root password	Yes
`MYSQL_DATABASE`	Initial database	No
`MYSQL_USER`	Application user	No
`MYSQL_PASSWORD`	User password	No

MongoDB

Variable	Description	Required
`MONGO_INITDB_ROOT_USERNAME`	Root username	Yes
`MONGO_INITDB_ROOT_PASSWORD`	Root password	Yes
`MONGO_INITDB_DATABASE`	Initial database	No

Redis

Variable	Description	Required
`REDIS_PASSWORD`	Authentication password	Yes

Deployment Instructions

1. Create Namespace and Secrets

bash

# Create namespace
kubectl create namespace data-stack

# Create PostgreSQL secret
kubectl create secret generic postgresql-secret \
  --from-literal=password=$(openssl rand -base64 32) \
  --namespace data-stack

# Create MySQL secret
kubectl create secret generic mysql-secret \
  --from-literal=root-password=$(openssl rand -base64 32) \
  --from-literal=password=$(openssl rand -base64 32) \
  --namespace data-stack

# Create MongoDB secret
kubectl create secret generic mongodb-secret \
  --from-literal=root-password=$(openssl rand -base64 32) \
  --namespace data-stack

# Create Redis secret
kubectl create secret generic redis-secret \
  --from-literal=password=$(openssl rand -base64 32) \
  --namespace data-stack

2. Deploy Databases

bash

# Deploy PostgreSQL
kubectl apply -f postgresql/statefulset.yaml
kubectl apply -f postgresql/service.yaml

# Deploy MySQL
kubectl apply -f mysql/statefulset.yaml
kubectl apply -f mysql/service.yaml

# Deploy MongoDB
kubectl apply -f mongodb/statefulset.yaml
kubectl apply -f mongodb/service.yaml

# Deploy Redis
kubectl apply -f redis/statefulset.yaml
kubectl apply -f redis/service.yaml

3. Verify Deployment

bash

# Check pod status
kubectl get pods -n data-stack

# Check persistent volumes
kubectl get pvc -n data-stack

# Check services
kubectl get svc -n data-stack

# Test PostgreSQL connection
kubectl exec -it -n data-stack postgresql-0 -- psql -U ndp_user -d ndp

# Test MySQL connection
kubectl exec -it -n data-stack mysql-0 -- mysql -u root -p

# Test MongoDB connection
kubectl exec -it -n data-stack mongodb-0 -- mongosh -u admin -p

# Test Redis connection
kubectl exec -it -n data-stack redis-0 -- redis-cli -a <password> ping

Database Initialization

Schema Management

Database schemas are managed by the application services themselves and will evolve during the development cycle.

PostgreSQL schemas are managed by the application microservices using migration tools (e.g., Flyway, Liquibase)
MySQL schemas are versioned with the application code
MongoDB collections are created dynamically by the application

Refer to individual service documentation for current schema definitions.

Backup Strategy

Automated Backups

yaml

apiVersion: batch/v1
kind: CronJob
metadata:
  name: postgresql-backup
  namespace: data-stack
spec:
  schedule: "0 2 * * *"  # Daily at 2 AM
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: backup
            image: postgres:15-alpine
            command:
            - /bin/sh
            - -c
            - |
              pg_dump -h postgresql -U ndp_user ndp | \
              gzip > /backup/ndp-$(date +%Y%m%d-%H%M%S).sql.gz
            env:
            - name: PGPASSWORD
              valueFrom:
                secretKeyRef:
                  name: postgresql-secret
                  key: password
            volumeMounts:
            - name: backup-storage
              mountPath: /backup
          restartPolicy: OnFailure
          volumes:
          - name: backup-storage
            persistentVolumeClaim:
              claimName: backup-pvc

Monitoring & Alerts

Key Metrics to Monitor

PostgreSQL:

Connection count
Transaction rate
Query duration
Cache hit ratio
Replication lag

MySQL:

Connection count
Slow query log
InnoDB buffer pool usage
Replication status

MongoDB:

Operation latency
Document count
Write concern
Replication lag

Redis:

Memory usage
Hit rate
Evicted keys
Connected clients

Kafka:

Message rate
Consumer lag
Partition count
Disk usage

Resource Requirements

Estimates

These are rough estimates. Monitor actual usage and adjust accordingly.

Service	CPU	Memory	Storage	IOPS
PostgreSQL	4 cores	8 GB	500 GB	3000
MySQL	2 cores	4 GB	100 GB	1000
MongoDB	2 cores	4 GB	100 GB	1000
Redis	2 cores	4 GB	20 GB	500
Kafka	2 cores	4 GB	50 GB	1000

Troubleshooting

PostgreSQL Won't Start

bash

# Check logs
kubectl logs -n data-stack postgresql-0

# Check permissions
kubectl exec -n data-stack postgresql-0 -- ls -la /var/lib/postgresql/data

# Reset data directory (CAUTION: destroys data)
kubectl delete pvc -n data-stack postgresql-storage-postgresql-0

Redis Out of Memory

bash

# Check memory usage
kubectl exec -n data-stack redis-0 -- redis-cli -a <password> INFO memory

# Clear all keys (CAUTION)
kubectl exec -n data-stack redis-0 -- redis-cli -a <password> FLUSHALL

Next Steps

Monitoring Stack - Setup observability
Discovery Stack - Deploy Consul & Vault
Application Stack - Deploy NDP services

Data Stack ​

Overview ​

Stack Architecture ​

Components ​

1. PostgreSQL ​

2. MySQL ​

3. MongoDB ​

4. Redis ​

5. Apache Kafka ​

Kubernetes Manifests ​

Namespace ​

PostgreSQL StatefulSet ​

PostgreSQL Service ​

MySQL StatefulSet ​

Redis StatefulSet ​

MongoDB StatefulSet ​

Service Dependencies ​

Environment Variables ​

PostgreSQL ​

MySQL ​

MongoDB ​

Redis ​

Deployment Instructions ​

1. Create Namespace and Secrets ​

2. Deploy Databases ​

3. Verify Deployment ​

Database Initialization ​

Backup Strategy ​

Automated Backups ​

Monitoring & Alerts ​

Key Metrics to Monitor ​

Resource Requirements ​

Troubleshooting ​

PostgreSQL Won't Start ​

Redis Out of Memory ​

Next Steps ​

Data Stack

Overview

Stack Architecture

Components

1. PostgreSQL

2. MySQL

3. MongoDB

4. Redis

5. Apache Kafka

Kubernetes Manifests

Namespace

PostgreSQL StatefulSet

PostgreSQL Service

MySQL StatefulSet

Redis StatefulSet

MongoDB StatefulSet

Service Dependencies

Environment Variables

PostgreSQL

MySQL

MongoDB

Redis

Deployment Instructions

1. Create Namespace and Secrets

2. Deploy Databases

3. Verify Deployment

Database Initialization

Backup Strategy

Automated Backups

Monitoring & Alerts

Key Metrics to Monitor

Resource Requirements

Troubleshooting

PostgreSQL Won't Start

Redis Out of Memory

Next Steps