Troubleshooting
This guide helps you diagnose and resolve common issues with the wazuh-dfn service.
Monitoring
Check service status:
sudo systemctl status wazuh-dfn
View logs:
sudo tail -f /opt/wazuh-dfn/logs/wazuh-dfn.log
# Or with journalctl if using systemd:
sudo journalctl -fu wazuh-dfn
Common Issues and Solutions
Startup Issues
Service fails to start
Check for configuration errors:
# Run with --print-config-only to validate the configuration
sudo -u wazuh /opt/wazuh-dfn/venv/bin/wazuh-dfn --print-config-only --config /opt/wazuh-dfn/config/config.toml
# Check logs for specific errors
sudo tail -n 100 /opt/wazuh-dfn/logs/wazuh-dfn.log
Python version errors
Verify Python version:
# Check Python version in virtualenv
source /opt/wazuh-dfn/venv/bin/activate
python -V  # Should show Python 3.12.x or later
# If incorrect, recreate virtualenv with correct Python version
# (--clear deletes the existing environment contents, so installed packages must be reinstalled afterwards)
python3.12 -m venv /opt/wazuh-dfn/venv --clear
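The version check above can also be scripted, for example in a pre-start hook. The following is a minimal sketch, assuming the interpreter reports its version in the usual `Python X.Y.Z` form; `python_version_ok` is a hypothetical helper name and not part of wazuh-dfn.

```shell
# Succeeds (exit status 0) when a "Python X.Y.Z" string is version 3.12 or later.
# python_version_ok is a hypothetical helper, not shipped with wazuh-dfn.
python_version_ok() (
    version=${1#Python }     # strip the leading "Python "
    major=${version%%.*}     # text before the first dot
    rest=${version#*.}       # text after the first dot
    minor=${rest%%.*}        # text before the next dot
    [ "$major" -gt 3 ] || { [ "$major" -eq 3 ] && [ "$minor" -ge 12 ]; }
)
```

It could be called as `python_version_ok "$(/opt/wazuh-dfn/venv/bin/python -V)" || echo "virtualenv needs a newer Python"`.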
Connection Issues
Wazuh socket connection failures
Check socket path and permissions:
# Verify socket exists
sudo ls -l /var/ossec/queue/sockets/queue
# Verify socket permissions
sudo ls -la /var/ossec/queue/sockets/
# Make sure wazuh user can access the socket
sudo usermod -a -G wazuh wazuh
Kafka connectivity issues
Test connection to Kafka broker:
# Basic connectivity test
telnet kafka.example.org 443
# Advanced test with kcat/kafkacat
kcat -b kafka.example.org:443 -X security.protocol=ssl \
  -X ssl.ca.location=/opt/wazuh-dfn/certs/dfn-ca.pem \
  -X ssl.certificate.location=/opt/wazuh-dfn/certs/dfn-cert.pem \
  -X ssl.key.location=/opt/wazuh-dfn/certs/dfn-key.pem \
  -L
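On hosts where telnet is not installed, the basic connectivity test can be approximated with bash's built-in `/dev/tcp` redirection. This is a sketch only: it assumes `bash` and the coreutils `timeout` command are available, `tcp_reachable` is a hypothetical helper name, and it checks only that a TCP connection opens, not that the TLS handshake or Kafka protocol works.

```shell
# tcp_reachable HOST PORT [TIMEOUT_SECONDS]
# Succeeds when a TCP connection to HOST:PORT can be opened within the timeout.
# Hypothetical helper; relies on bash's /dev/tcp feature and coreutils timeout.
tcp_reachable() {
    # Inside bash -c, $0 is the host and $1 is the port passed after the script.
    timeout "${3:-5}" bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}
```

For example, `tcp_reachable kafka.example.org 443 || echo "broker port not reachable"`.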
Certificate issues
Validate certificate permissions and expiration:
# Check certificate permissions
ls -l /opt/wazuh-dfn/certs/
# Check certificate expiration
openssl x509 -enddate -noout -in /opt/wazuh-dfn/certs/dfn-cert.pem
# Verify certificate chain
openssl verify -CAfile /opt/wazuh-dfn/certs/dfn-ca.pem /opt/wazuh-dfn/certs/dfn-cert.pem
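The expiration check can be turned into a yes/no test with the `-checkend` option of `openssl x509`, which is convenient for cron jobs or monitoring. A minimal sketch follows; `cert_valid_for` is a hypothetical helper name, and the 30-day threshold in the usage line is an arbitrary choice.

```shell
# cert_valid_for CERT_FILE DAYS
# Succeeds when the certificate is still valid DAYS days from now.
# Hypothetical helper built on `openssl x509 -checkend <seconds>`.
cert_valid_for() {
    openssl x509 -checkend "$(( $2 * 86400 ))" -noout -in "$1" >/dev/null
}
```

For example, `cert_valid_for /opt/wazuh-dfn/certs/dfn-cert.pem 30 || echo "certificate expires within 30 days"`.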
Alert Processing Issues
No alerts being processed
Check alert file and permissions:
# Verify alert file exists and is being updated
sudo ls -la /var/ossec/logs/alerts/alerts.json
sudo tail -f /var/ossec/logs/alerts/alerts.json
# Check if wazuh user can read the file (print only the first line, since the file can be large)
sudo -u wazuh head -n 1 /var/ossec/logs/alerts/alerts.json
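Whether the alert file "is being updated" can also be checked non-interactively by looking at its modification time. The sketch below uses `find -mmin`; `alerts_file_fresh` is a hypothetical helper name, and the five-minute window in the usage line is an assumption that should be adapted to the expected alert rate.

```shell
# alerts_file_fresh FILE MINUTES
# Succeeds when FILE is a regular file modified less than MINUTES minutes ago.
# Hypothetical helper; find prints the path only if the -mmin test matches.
alerts_file_fresh() {
    [ -n "$(find "$1" -maxdepth 0 -type f -mmin "-$2" 2>/dev/null)" ]
}
```

For example, `sudo sh -c '. ./helpers.sh; alerts_file_fresh /var/ossec/logs/alerts/alerts.json 5'`, where `helpers.sh` stands for whatever file holds the function.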
Alerts queued but not sent
Check Kafka connection and worker status:
# Look for specific error patterns in logs
grep "Error" /opt/wazuh-dfn/logs/wazuh-dfn.log
grep "Kafka" /opt/wazuh-dfn/logs/wazuh-dfn.log
# Check for failed alerts if storage is enabled
ls -la /opt/wazuh-dfn/failed-alerts/
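When several patterns need checking, counting matches per pattern gives a quicker overview than reading each grep output in turn. This is a small sketch; `log_summary` is a hypothetical helper name, and the patterns are matched case-sensitively, exactly like the grep commands above.

```shell
# log_summary LOG_FILE PATTERN...
# Prints "PATTERN: COUNT" for each pattern, counting matching lines in LOG_FILE.
# Hypothetical helper, not part of wazuh-dfn.
log_summary() (
    log=$1
    shift
    for pattern in "$@"; do
        # grep -c prints 0 but exits non-zero when nothing matches, so ignore the status.
        count=$(grep -c -- "$pattern" "$log" 2>/dev/null) || true
        printf '%s: %s\n' "$pattern" "${count:-0}"
    done
)
```

For example, `log_summary /opt/wazuh-dfn/logs/wazuh-dfn.log Error Kafka` prints one count per pattern.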
Asyncio-Specific Issues
Task cancellation warnings
Task cancellation warnings in the logs usually occur during shutdown. If they appear during normal operation, search the logs for related errors:
# Look for task-related errors
grep "task" /opt/wazuh-dfn/logs/wazuh-dfn.log
# Check for worker errors
grep "worker" /opt/wazuh-dfn/logs/wazuh-dfn.log
High CPU usage
High CPU usage may indicate an infinite loop or a blocking operation running inside the asyncio event loop:
# Monitor CPU usage (-d, joins multiple PIDs with commas, as top -p expects)
top -p "$(pgrep -d, -f wazuh-dfn)"
# Adjust worker count to match your system's CPU cores
# Edit in config.toml:
# [misc]
# num_workers = <number_of_cores>
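To find the core count for the `num_workers` setting mentioned above, the value can be read from the system and printed as a ready-to-paste fragment. This is only a sketch: `suggest_num_workers` is a hypothetical helper name, and one worker per core is the starting point suggested by this guide, not a measured optimum.

```shell
# Prints a config.toml fragment with num_workers set to the number of online CPU cores.
# Hypothetical helper; falls back to getconf where nproc is unavailable.
suggest_num_workers() {
    cores=$(nproc 2>/dev/null || getconf _NPROCESSORS_ONLN)
    printf '[misc]\nnum_workers = %s\n' "$cores"
}
```

Running `suggest_num_workers` prints the `[misc]` table header followed by a `num_workers` line that can be copied into config.toml.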
Queue overflow warnings
Queue overflow warnings indicate that alert processing cannot keep up with the incoming volume:
# Check for overflow messages
grep "overflow" /opt/wazuh-dfn/logs/wazuh-dfn.log
# Increase workers or queue size in config.toml
# [misc]
# num_workers = 20  # Increase for more parallel processing
# [wazuh]
# json_alert_queue_size = 200000  # Increase queue capacity
Performance Tuning
Increasing throughput
Optimize for high-volume environments:
# config.toml
[misc]
num_workers = 20  # More workers for parallel processing

[wazuh]
json_alert_queue_size = 200000  # Larger queue
json_alert_file_poll_interval = 0.5  # More frequent checks

[kafka]
# TOML inline tables use "=" between key and value, not ":"
producer_config = { "batch.size" = 32768, "linger.ms" = 5 }  # Tune Kafka batching
Reducing memory usage
Optimize for resource-constrained environments:
# config.toml
[misc]
num_workers = 4  # Fewer workers

[wazuh]
json_alert_queue_size = 50000  # Smaller queue
json_alert_file_poll_interval = 2.0  # Less frequent checks

[log]
interval = 1800  # Reduce logging frequency
Diagnostic Procedures
Run with detailed logging:
sudo -u wazuh /opt/wazuh-dfn/venv/bin/wazuh-dfn --log-level DEBUG --config /opt/wazuh-dfn/config/config.toml
Verify environment variables:
# For systemd service
sudo systemctl show wazuh-dfn -p Environment
# For troubleshooting
env | grep -E 'DFN_|WAZUH_|KAFKA_|LOG_|MISC_'
Check for memory leaks:
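# Watch memory usage over time
# (-e lists all processes, not only those attached to the current terminal;
#  cmd is placed last so it is not truncated; "[w]azuh-dfn" keeps grep from matching itself)
watch -n 5 'ps -eo pid,ppid,%mem,%cpu,cmd --sort=-%mem | grep "[w]azuh-dfn"'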
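A leak shows up as a steady upward trend, which is easier to judge from recorded samples than from a live view. The following sketch records resident memory at fixed intervals; it assumes a Linux host with `/proc` mounted, and `sample_rss` is a hypothetical helper name.

```shell
# sample_rss PID SAMPLES INTERVAL_SECONDS
# Prints "EPOCH_SECONDS RSS_KB" once per sample, reading VmRSS from /proc/PID/status.
# Hypothetical helper; exits with status 1 if the process disappears.
sample_rss() (
    pid=$1 samples=$2 interval=$3
    i=0
    while [ "$i" -lt "$samples" ]; do
        rss_kb=$(awk '/^VmRSS:/ {print $2}' "/proc/$pid/status" 2>/dev/null)
        [ -n "$rss_kb" ] || exit 1
        printf '%s %s\n' "$(date +%s)" "$rss_kb"
        i=$((i + 1))
        # Sleep between samples, but not after the last one.
        if [ "$i" -lt "$samples" ]; then sleep "$interval"; fi
    done
)
```

For example, `sample_rss "$(pgrep -o -f wazuh-dfn)" 120 30 > rss.log` collects one hour of samples; a value that keeps growing under steady load is worth investigating.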
Test alert processing manually:
# Process a single alert file for testing
sudo -u wazuh /opt/wazuh-dfn/venv/bin/wazuh-dfn --wazuh-json-alert-file /path/to/test_alert.json