Azure SSH Connection Failure Troubleshooting
Initial State
- Client: Local Windows (OpenSSH client)
- Server: Cloud Platform Linux VM (Azure)
- Symptoms: The local SSH client fails to connect.
- Known: The cloud platform’s built-in Bastion (web terminal) can log in normally.
Investigation Module 1: Server Log Analysis
- Issue: The sshd service logs (journalctl -u sshd) show multiple warnings and errors:
  - Deprecated option RhostsRSAAuthentication (and other Deprecated warnings)
  - error: Unable to load host key: /etc/ssh/ssh_host_dsa_key
- Investigation Process:
  - Analyzed the Deprecated warnings: Confirmed these are old configuration items in sshd_config. They are “warnings”, not “errors”; the service ignores them and runs normally.
  - Analyzed Unable to load host key: Confirmed that the DSA key is an obsolete key type that is not generated by default on new systems. The service skips this key and loads the others (RSA, ED25519) normally.
- Conclusion: These entries in the server logs are not the cause of the connection failure.
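Once these two kinds of entries have been ruled out, it can help to hide them so that anything else in the log stands out. The following is a minimal sketch; the function name filter_known_noise is hypothetical, and the match strings are taken from the log lines quoted above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: drop the two kinds of entries already ruled out above
# (deprecated-option warnings and the missing DSA host key), reading log
# text from stdin and printing everything else.
filter_known_noise() {
  grep -v -e 'Deprecated option' -e 'Unable to load host key'
}

# Usage on the server (unit name sshd, as used in this write-up):
#   sudo journalctl -u sshd --no-pager | filter_known_noise
```

Whatever remains after filtering is what still needs an explanation.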
Investigation Module 2: Network Connectivity Test
Issue: Is traffic between the local client and port 22 on the server blocked by a firewall (the cloud platform NSG or the system firewall)?
Investigation Process:
Run a port connectivity test in a local Windows PowerShell terminal:
Test-NetConnection -ComputerName [Server IP] -Port 22
Conclusion:
- The command returned TcpTestSucceeded : True.
- This shows that the network path is clear, and that both the cloud platform firewall (NSG) and the server's system firewall allow traffic on port 22.
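The same check can be made from a bash shell (for example on Linux, macOS, or WSL), which is not the setup used in this investigation. This is only a sketch: the function name port_open is hypothetical, and it assumes a bash build with /dev/tcp support and the coreutils timeout command.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if a TCP connection to host:port can be
# opened within 5 seconds, fail otherwise.
port_open() {
  local host=$1 port=$2
  # bash treats /dev/tcp/HOST/PORT as "open a TCP connection";
  # timeout bounds the attempt so a silently dropped packet does not hang.
  timeout 5 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}

# Usage:
#   port_open [Server IP] 22 && echo "port 22 reachable" || echo "port 22 blocked"
```

A refused connection fails immediately, while a firewall that drops packets only fails once the timeout expires.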
Investigation Module 3: Authentication Investigation
Issue: The network path is clear, so why does the connection fail?
Investigation Process (1): Analyzing Server Real-time Logs (Misleading Information)
- Run sudo journalctl -f -u sshd on the server for real-time monitoring.
- Attempt a login from the local machine.
- The server logs show Failed password for root from [Unknown IP].
- Analysis: The username (root) and source IP ([Unknown IP]) in the logs don’t match the local client (azureuser, [Local IP]).
- Conclusion: This is a “brute force” attack from the internet. It is background noise, unrelated to the login failure under investigation.
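To avoid being misled by such noise, the live log can be narrowed to lines that mention the expected user or the client's own public IP. A minimal sketch, assuming the hypothetical function name own_attempts and that the local public IP is already known:

```shell
#!/usr/bin/env bash
# Hypothetical helper: keep only log lines (from stdin) that contain the
# given user name or source IP as a whole word.
own_attempts() {
  local user=$1 ip=$2
  # -F: treat patterns as fixed strings (dots in the IP are literal);
  # -w: whole-word match, so 1.2.3.4 does not match inside 11.2.3.45.
  grep -wF -e "$user" -e "$ip"
}

# Usage on the server:
#   sudo journalctl -f -u sshd | own_attempts azureuser [Local IP]
```

If nothing shows up while a login attempt is in progress, the attempt is probably failing before authentication is reached, which points the investigation at the client side.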
Investigation Process (2): Analyzing Local Detailed Logs (Root Cause)
- Run ssh -vvv azureuser@[Server IP] on the local client.
- Key information from the logs:
  - Connection established. (the TCP connection was established successfully)
  - Remote protocol version 2.0, remote software version OpenSSH_9.6p1 Ubuntu... (the server runs a recent OpenSSH version)
  - identity file F:\\...\\centos_tydi.pem type -1 (the client is using a .pem key)
  - The log hangs at the SSH2_MSG_KEXINIT sent step and the connection eventually fails.
- Analysis: .pem keys are typically RSA keys used with the older ssh-rsa (SHA-1) signature algorithm. Recent OpenSSH versions (8.8 and later, which includes 9.6p1) disable this algorithm by default for security reasons. The client and server cannot agree on an algorithm, so the negotiation fails.
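One way to check this analysis is to look at which public key algorithms the client will actually offer. The sketch below assumes a recent OpenSSH client, where ssh -G prints the effective configuration including a pubkeyacceptedalgorithms line; the function name has_ssh_rsa is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if the SHA-1 based "ssh-rsa" algorithm
# appears as an exact entry in a comma-separated algorithm list.
has_ssh_rsa() {
  case ",$1," in
    *,ssh-rsa,*) return 0 ;;
    *)           return 1 ;;
  esac
}

# Usage on the client:
#   algos=$(ssh -G azureuser@[Server IP] | awk '$1 == "pubkeyacceptedalgorithms" {print $2}')
#   has_ssh_rsa "$algos" && echo "ssh-rsa enabled" || echo "ssh-rsa disabled"
#
# The key type inside the .pem file can be checked with:
#   ssh-keygen -l -f path/to/key.pem
```

The exact-entry match matters because names such as rsa-sha2-256 or ssh-rsa-cert-v01@openssh.com contain similar text but are different algorithms.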
Solution (1): Upgrade Key (Recommended)
- Generate a new ed25519 key pair (ssh-keygen -t ed25519).
- Log in via Bastion and add the new ed25519 public key to the server’s ~/.ssh/authorized_keys.
- Modify the local ~/.ssh/config so that IdentityFile points to the new ed25519 private key.
- Result: Login successful.
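The steps above can be sketched as follows. The host alias azure-vm, the key file name id_ed25519_azure, and the function name ssh_host_block are all assumptions made for illustration; IdentitiesOnly yes is an optional extra that makes the client offer only the listed key.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a ~/.ssh/config stanza for one host.
ssh_host_block() {
  local alias=$1 host=$2 user=$3 key=$4
  printf 'Host %s\n    HostName %s\n    User %s\n    IdentityFile %s\n    IdentitiesOnly yes\n' \
    "$alias" "$host" "$user" "$key"
}

# Usage on the client (file name and alias are assumptions):
#   ssh-keygen -t ed25519 -f ~/.ssh/id_ed25519_azure
#   cat ~/.ssh/id_ed25519_azure.pub   # paste into ~/.ssh/authorized_keys via Bastion
#   ssh_host_block azure-vm [Server IP] azureuser '~/.ssh/id_ed25519_azure' >> ~/.ssh/config
#   ssh azure-vm
```

Keeping the old .pem entry out of this stanza ensures the client does not fall back to the legacy key.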
Solution (2): Support Legacy Key (Not Recommended)
Modify the local ~/.ssh/config file and add the following for this host:
HostKeyAlgorithms +ssh-rsa
PubkeyAcceptedAlgorithms +ssh-rsa
Result: This forces the local client to enable the old algorithm; login also succeeds.
Investigation Module 4: Periodic “Connection Timeout” Issue
Issue: After a successful login with the new key, a later login attempt occasionally fails with “connection timeout”.
Investigation Process:
- Discovered that the user connects via VPN.
- Analysis: The VPN service assigns a different exit IP address on each reconnection.
- The cloud platform firewall (NSG) rule was set to allow only a single IP (the IP from the last login).
- When the VPN IP changes, the new IP is blocked by the NSG, causing the “connection timeout”.
Solution:
- Modify the cloud platform NSG inbound rules: set the SSH (port 22) “Source” to Any (0.0.0.0/0).
