In a Nutshell (🌰)
- Follow structured diagnostic approaches to identify issues quickly
- Check connection settings for device-related problems
- Verify Docker and container configurations for container issues
- Review playbook syntax and permissions for automation errors
- Use SSM's built-in diagnostic tools for troubleshooting
Diagnostic Process
To begin troubleshooting your SSM issue, choose the category that best matches your problem:
SSM Installation Issues
Common symptoms: Install errors, MongoDB AVX issues. Solutions: See [Installation Guide](/docs/getting-started/installation), check prerequisites, review Docker logs.
Device Connectivity Issues
Common symptoms: SSH connection fails, device shows as offline. Solutions: See [Device Management](/docs/user-guides/devices/management), verify network connectivity, check SSH credentials.
Container/Docker Issues
Common symptoms: Containers won't start, socket hangup errors. Solutions: See [Container Management](/docs/user-guides/containers/management), check Docker permissions, review container logs.
Performance Issues
Common symptoms: Resource limits reached, slow database operations. Solutions: See [Performance Optimization Tips](/docs/reference/docker-configuration#performance-optimization-tips).
When you encounter an issue with Squirrel Servers Manager, following a structured diagnostic process helps you identify and resolve problems efficiently:
- Identify the Symptom - Determine the specific error or unexpected behavior
- Isolate the Component - Pinpoint which component (device, container, playbook) is affected
- Check Logs - Review relevant logs to find error messages
- Test Connectivity - Verify network connections to affected components
- Apply Solution - Implement the appropriate fix based on your findings
- Verify Resolution - Confirm the issue has been resolved
Decision Tree for Troubleshooting
Common Issues and Solutions
Device Connection Problems
If you're having trouble connecting to a device:
- Symptom: Device shows as offline or connections time out
- Possible Causes:
  - Incorrect SSH credentials
  - Firewall blocking connections
  - Network connectivity issues
  - SSH service not running on the device
Solutions:
- Verify SSH credentials in SSM device configuration
- Check firewall settings to ensure SSH port is open
- Test direct SSH connection from SSM server to the device
- Verify the SSH service is running on the target device
# Test SSH connection from command line
ssh user@device-ip -p port
# Check SSH service status on device (the unit is named ssh on Debian and Ubuntu)
systemctl status sshd
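The manual check above can also be wrapped in a helper that probes a device without prompting for input. This is a minimal sketch and not part of SSM: `probe_ssh` and the `SSH_BIN` override are hypothetical names introduced here. It relies on `ssh` exiting with status 255 for its own errors (connection or authentication problems), and on `BatchMode=yes` so the probe fails rather than waiting for a password.

```shell
# probe_ssh USER HOST [PORT]
# Hypothetical helper: runs a no-op command over SSH and classifies the result.
# SSH_BIN may name an alternative ssh command (defaults to "ssh").
probe_ssh() (
  user=$1
  host=$2
  port=${3:-22}
  # BatchMode disables password prompts; ConnectTimeout bounds the wait.
  if "${SSH_BIN:-ssh}" -o BatchMode=yes -o ConnectTimeout=5 \
       -p "$port" "$user@$host" true >/dev/null 2>&1; then
    rc=0
  else
    rc=$?
  fi
  case $rc in
    0)   echo "ok" ;;
    255) echo "ssh-error" ;;         # refused, timed out, or authentication failed
    *)   echo "remote-error:$rc" ;;  # SSH connected, but the remote command failed
  esac
  return "$rc"
)
# Example usage (not executed here):
#   probe_ssh user device-ip 22
```

An `ssh-error` result points back to the causes listed above (credentials, firewall, network, SSH service). Because the probe runs in batch mode, it only succeeds with key-based authentication.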
Docker Engine Connectivity
If SSM can't connect to Docker on a device:
- Symptom: Docker containers not visible or Docker operations fail
- Possible Causes:
  - Docker not installed or running
  - Docker API not exposed
  - TLS certificates issue
  - Docker socket permissions problem
Solutions:
- Verify Docker is installed and running on the device:
docker --version
systemctl status docker
- Check Docker socket permissions:
ls -la /var/run/docker.sock
- For TLS connections, verify certificates are valid:
openssl x509 -noout -text -in /path/to/cert.pem
- If using TCP, ensure the Docker daemon is configured for remote access
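Two of the checks above lend themselves to small helpers. The sketch below is illustrative only: `user_in_group` and `cert_valid_for` are hypothetical names, and the group check assumes the common Linux setup where `/var/run/docker.sock` is owned by `root:docker`, so that membership in the `docker` group grants access. Run it on the device as the user SSM connects with.

```shell
# user_in_group GROUP "GROUP LIST"
# Returns 0 when GROUP appears as a whole word in the space-separated list.
user_in_group() {
  case " $2 " in
    *" $1 "*) return 0 ;;
    *)        return 1 ;;
  esac
}

# cert_valid_for CERT_FILE SECONDS
# Returns 0 when the certificate will still be valid SECONDS from now.
cert_valid_for() {
  openssl x509 -noout -checkend "$2" -in "$1" >/dev/null
}

# Example usage (not executed here):
#   user_in_group docker "$(id -nG)" || echo "user is not in the docker group"
#   cert_valid_for /path/to/cert.pem 604800 || echo "certificate expires within 7 days"
```

If your socket uses a different owner or group, compare against the group shown by `ls -la /var/run/docker.sock` instead of `docker`.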
Container Management Issues
When containers can't be started, stopped, or managed:
- Symptom: Container operations fail with error messages
- Possible Causes:
  - Insufficient privileges
  - Resource constraints
  - Port conflicts
  - Volume mount issues
Solutions:
- Check container logs for specific error messages:
docker logs container_name
- Verify available resources (disk space, memory, CPU):
df -h
free -m
top
- Check for port conflicts:
netstat -tulpn | grep PORT_NUMBER
# On systems without netstat, ss accepts the same flags
ss -tulpn | grep PORT_NUMBER
- Verify volume mount paths exist and have correct permissions
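For the disk space check, reading `df` output by eye gets tedious on hosts with many mounts. The following is a minimal sketch, assuming the POSIX `df -P` column layout (usage in column 5, mount point in column 6) and mount points without spaces; `mounts_over_threshold` is a hypothetical helper name.

```shell
# mounts_over_threshold PERCENT
# Reads `df -P` output on stdin and prints "<mount> <usage>" for every
# filesystem whose usage is at or above PERCENT.
mounts_over_threshold() {
  awk -v limit="$1" '
    NR > 1 {
      usage = $5
      sub(/%$/, "", usage)
      if (usage + 0 >= limit + 0) print $6, $5
    }'
}

# Example usage (not executed here):
#   df -P | mounts_over_threshold 90
```

No output means no filesystem has reached the threshold. A nearly full filesystem under Docker's data directory is a frequent reason for containers failing to start.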
Playbook Execution Failures
If playbooks fail to execute properly:
- Symptom: Playbook execution errors or unexpected results
- Possible Causes:
  - Syntax errors in playbook
  - Missing variables or inventory
  - Insufficient privileges on target hosts
  - Network connectivity issues
Solutions:
- Check playbook syntax with ansible-playbook --syntax-check
- Review any missing variables or incorrect inventory entries
- Run playbook in verbose mode to see detailed execution:
ansible-playbook -vvv playbook.yml
- Verify target hosts are reachable and credentials are correct
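When several playbooks are involved, the syntax check can be run over all of them in one pass. This is a sketch rather than an SSM feature: `check_playbooks` and the `ANSIBLE_PLAYBOOK` override are hypothetical names, and the function only wraps the `ansible-playbook --syntax-check` command mentioned above.

```shell
# check_playbooks FILE...
# Runs a syntax check on each playbook and prints OK or FAIL per file.
# Returns non-zero if at least one playbook fails the check.
# ANSIBLE_PLAYBOOK may name an alternative command (defaults to "ansible-playbook").
check_playbooks() (
  failed=0
  for playbook in "$@"; do
    if "${ANSIBLE_PLAYBOOK:-ansible-playbook}" --syntax-check "$playbook" \
         >/dev/null 2>&1; then
      echo "OK   $playbook"
    else
      echo "FAIL $playbook"
      failed=$((failed + 1))
    fi
  done
  [ "$failed" -eq 0 ]
)

# Example usage (not executed here):
#   check_playbooks playbooks/*.yml
```

For any file reported as `FAIL`, rerun the syntax check on that file alone to see the full error message.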
Common Error Messages
| Error Message | Likely Cause | Solution |
|---|---|---|
| SSH connection failed: Connection refused | SSH service not running or firewall blocking | Check SSH service, verify port, check firewall |
| Error: connection error: desc = "transport: Error while dialing dial tcp: lookup [hostname] no such host" | DNS resolution failure | Check hostname/IP configuration |
| Error response from daemon: Get "https://registry-1.docker.io/v2/" | Docker registry connectivity issue | Check internet connection, verify registry credentials |
| Error: Unable to find image '[image]' locally | Image not available | Pull image manually, check registry connectivity |
| Error: Ansible inventory missing or invalid | Incorrect inventory configuration | Verify inventory file format and host entries |
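The table can be turned into a quick lookup for use in scripts or when scanning logs. The sketch below is illustrative: `suggest_fix` is a hypothetical helper, and it matches only on the distinctive fragment of each message listed above, so real messages with different wording will fall through to the default branch.

```shell
# suggest_fix "ERROR MESSAGE"
# Hypothetical helper: prints the suggested solution for a known error message.
# Returns 1 when the message matches none of the listed patterns.
suggest_fix() {
  case "$1" in
    *"Connection refused"*)
      echo "Check SSH service, verify port, check firewall" ;;
    *"no such host"*)
      echo "Check hostname/IP configuration" ;;
    *"registry-1.docker.io"*)
      echo "Check internet connection, verify registry credentials" ;;
    *"Unable to find image"*)
      echo "Pull image manually, check registry connectivity" ;;
    *"inventory missing or invalid"*)
      echo "Verify inventory file format and host entries" ;;
    *)
      echo "No known match; review the logs for more detail"
      return 1 ;;
  esac
}

# Example usage (not executed here):
#   suggest_fix "SSH connection failed: Connection refused"
```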
Log Locations
SSM logs are essential for troubleshooting. Here's where to find them:
- SSM Server Logs: /var/log/ssm/server.log or Docker container logs
- Ansible Logs: /var/log/ssm/ansible/ or within SSM interface
- Docker Container Logs: Accessible via docker logs container_name
- Client-side Logs: Browser console or within SSM interface
- Device Logs: System logs on the managed devices (/var/log/syslog, /var/log/messages)
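Logs from these locations are often long, so it helps to narrow them to the lines that matter. This is a minimal sketch: `filter_errors` is a hypothetical helper, and the keywords it matches (`error`, `fail`, `fatal`) are an assumption about how problems are worded, not a documented SSM log format.

```shell
# filter_errors [COUNT]
# Reads log text on stdin and prints the last COUNT (default 20) lines that
# mention an error, failure, or fatal condition (case-insensitive).
filter_errors() {
  grep -iE 'error|fail|fatal' | tail -n "${1:-20}"
}

# Example usage (not executed here):
#   docker logs --tail 500 container_name 2>&1 | filter_errors 10
#   filter_errors < /var/log/ssm/server.log
```

The `2>&1` in the first example matters because `docker logs` replays the container's stderr stream on stderr, which a plain pipe would skip.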
Built-in Diagnostic Tools
SSM provides several built-in diagnostic tools that can help identify and resolve issues:
Device Diagnostic Tool
- Navigate to the device's configuration page
- Click on the "Diagnostic" tab
- Run the connection test to verify SSH connectivity
- Check individual service tests (Docker, Proxmox, etc.)
Network Connectivity Test
The network connectivity test can help identify network-related issues:
- Access the device settings
- Run the network diagnostic
- Review results for any connection problems
Docker Engine Diagnostic
For Docker-related issues:
- Go to the device's Docker configuration
- Run the Docker engine test
- Check for connectivity, API version compatibility, and permissions
Getting Support
When reporting an issue, include:
- Detailed description of the problem
- Steps to reproduce the issue
- Relevant logs and error messages
- SSM version and installation method
- Device information (OS, Docker version, etc.)
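Part of this information can be gathered with a short script run on the affected device. The sketch below is an assumption-laden convenience, not an SSM tool: `collect_support_info` is a hypothetical name, it reads `PRETTY_NAME` from `/etc/os-release` where that file exists, and it does not detect the SSM version, which you need to add by hand.

```shell
# collect_support_info
# Prints basic environment details to paste into an issue report.
collect_support_info() {
  echo "Kernel: $(uname -sr)"
  if [ -r /etc/os-release ]; then
    # Source the file in a subshell so its variables do not leak into this shell.
    echo "OS: $(. /etc/os-release && echo "${PRETTY_NAME:-unknown}")"
  else
    echo "OS: unknown"
  fi
  if command -v docker >/dev/null 2>&1; then
    echo "Docker: $(docker --version 2>/dev/null || echo 'installed, but version check failed')"
  else
    echo "Docker: not installed"
  fi
}

# Example usage (not executed here):
#   collect_support_info > support-info.txt
```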
Admin Password Recovery
If you are unable to log in as the administrator and need to reset your password, you can do so by accessing the MongoDB database directly:
# Connect to the MongoDB container
docker exec -it mongo-ssm mongosh
// The following commands run inside mongosh, where comments start with //
// Switch to the SSM database
use ssm
// Reset password (replace with your email)
db.users.updateOne(
{ email: "your.email@example.com" },
{ $set: { password: "$2b$10$CZt6MqBEVu8abVXel6mnn.A6AJuWlI8qKpPyTZ6TYWLm2jCr7HvdG" } }
)
This resets the password to Password123! (the exclamation mark is part of the password). Be sure to change it immediately after logging in.
MongoDB Authentication
If your MongoDB instance is configured with authentication, you may need to authenticate first:
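# For authenticated MongoDB (set DB_USER, DB_USER_PWD, and DB_AUTH_SOURCE in your shell first, or substitute the actual values)
docker exec -it mongo-ssm mongosh -u "$DB_USER" -p "$DB_USER_PWD" --authenticationDatabase "$DB_AUTH_SOURCE"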
# Then switch to SSM database and reset password as shown above
For more details on MongoDB authentication configuration, see the MongoDB Authentication Guide.