SSH Connection Failure Troubleshooting

Initial State

  • Client: Local Windows (OpenSSH client)
  • Server: Cloud Platform Linux VM (Azure)
  • Symptoms: Connections from the local SSH client fail.
  • Known: The cloud platform’s built-in Bastion (web terminal) can log in normally.

Investigation Module 1: Server Log Analysis

  • Issue: sshd service logs (journalctl -u sshd) show multiple warnings and errors.
    • Deprecated option RhostsRSAAuthentication (and other Deprecated warnings)
    • error: Unable to load host key: /etc/ssh/ssh_host_dsa_key
  • Investigation Process:
    1. Analyzed the Deprecated warnings: Confirmed these refer to old configuration items in sshd_config. They are “warnings”, not “errors”: the service ignores the options and runs normally.
    2. Analyzed Unable to load host key: Confirmed that DSA is an obsolete key type whose host key is not generated by default on new systems. The service skips this key and loads the others (RSA, ED25519) normally.
  • Conclusion: These server log entries are not the cause of the connection failure.
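
The triage above can be sketched as a small filter that sets aside the two startup messages already judged harmless, so that anything else in the log stands out. This is a minimal sketch in Python, assuming the log lines are available as plain strings. BENIGN_PATTERNS and split_benign are illustrative names, not part of sshd.

```python
import re

# Startup messages that this investigation found to be harmless.
# Illustrative list; extend it for your own environment.
BENIGN_PATTERNS = [
    re.compile(r"Deprecated option \S+"),
    re.compile(r"Unable to load host key: \S*ssh_host_dsa_key"),
]

def split_benign(lines):
    """Return (benign, other): lines matching a benign pattern, and the rest."""
    benign, other = [], []
    for line in lines:
        if any(p.search(line) for p in BENIGN_PATTERNS):
            benign.append(line)
        else:
            other.append(line)
    return benign, other
```

Only the lines returned in `other` need a closer look.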

Investigation Module 2: Network Connectivity Test

  • Issue: Is the network path between the local client and port 22 on the server blocked by a firewall (cloud platform NSG or system firewall)?

  • Investigation Process:

    1. Execute port connectivity test in local Windows PowerShell terminal:

      Test-NetConnection -ComputerName [Server IP] -Port 22
  • Conclusion:

    • The command returned TcpTestSucceeded : True.
    • This shows that the TCP path is clear from the client’s current IP: both the cloud platform firewall (NSG) and the server’s system firewall allow traffic on port 22.
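
For clients without PowerShell, the same kind of check can be approximated with a plain TCP connect. This is a minimal sketch in Python using only the standard library. tcp_port_open is an illustrative helper name, and like Test-NetConnection it only confirms that the TCP handshake completes, not that SSH authentication will succeed.

```python
import socket

def tcp_port_open(host, port=22, timeout=5.0):
    """Return True if a TCP handshake to host:port completes within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

Call it as `tcp_port_open("[Server IP]", 22)` with the real server address substituted for the placeholder.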

Investigation Module 3: Authentication Investigation

  • Issue: The network path is clear, so why does the connection fail?

  • Investigation Process (1): Analyzing Server Real-time Logs (Misleading Information)

    1. Run sudo journalctl -f -u sshd on server for real-time monitoring.
    2. Attempt login from local machine.
    3. Server logs show Failed password for root from [Unknown IP].
    4. Analysis: The username (root) and source IP ([Unknown IP]) in these entries do not match the local client (azureuser, [Local IP]).
    5. Conclusion: These entries are “brute force” login attempts from the internet. They are background noise, unrelated to the login failure under investigation.
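
The comparison in steps 4 and 5 can be sketched as a filter that keeps only the failed-password entries that match our own username and source IP. This is a minimal sketch in Python. The regular expression assumes the usual sshd wording ("Failed password for USER from IP ..."), and relevant_failures is an illustrative name.

```python
import re

# Matches "Failed password for root from 203.0.113.5 ..." and the
# "Failed password for invalid user admin from ..." variant.
FAILED_RE = re.compile(
    r"Failed password for (?:invalid user )?(?P<user>\S+) from (?P<ip>\S+)"
)

def relevant_failures(lines, user, client_ip):
    """Keep failed-password lines whose user and source IP match our client."""
    hits = []
    for line in lines:
        m = FAILED_RE.search(line)
        if m and m.group("user") == user and m.group("ip") == client_ip:
            hits.append(line)
    return hits
```

An empty result for the local user and IP means the failures in the log belong to someone else, which is the situation described above.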
  • Investigation Process (2): Analyzing Local Detailed Logs (Root Cause)

    1. Run ssh -vvv azureuser@[Server IP] on local client.
    2. Key information from logs:
      • Connection established. (the TCP connection succeeded)
      • Remote protocol version 2.0, remote software version OpenSSH_9.6p1 Ubuntu... (server is new version OpenSSH)
      • identity file F:\\...\\centos_tydi.pem type -1 (locally using a .pem key)
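      • The log stalls after the SSH2_MSG_KEXINIT sent step and the connection eventually fails.
    3. Analysis: .pem is only a file format; keys distributed this way are typically RSA keys, which older setups use with the ssh-rsa (RSA with SHA-1) signature algorithm. OpenSSH has disabled ssh-rsa by default since release 8.8 for security reasons, so the 9.6p1 server does not accept it. Client and server cannot agree on a signature algorithm (this is not a cipher mismatch), causing the negotiation to fail.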
  • Solution (1): Upgrade Key (Recommended)

    1. Generate a new ed25519 key pair (ssh-keygen -t ed25519).
    2. Log in via Bastion and append the new ed25519 public key to the server’s ~/.ssh/authorized_keys.
    3. Edit the local ~/.ssh/config so that IdentityFile points to the new ed25519 private key.
    4. Result: Login successful.
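
Step 2 can be done by hand in the Bastion terminal. If it needs to be repeatable, a sketch along the following lines appends the key only when it is missing and keeps the restrictive permissions that sshd normally expects. This is a minimal sketch in Python. add_authorized_key is an illustrative helper, and the path and key text passed to it are supplied by the caller.

```python
import os

def add_authorized_key(authorized_keys_path, public_key_line):
    """Append public_key_line unless already present. Return True if added."""
    key = public_key_line.strip()
    ssh_dir = os.path.dirname(authorized_keys_path)
    if ssh_dir:
        # Create ~/.ssh if needed; an existing directory is left untouched.
        os.makedirs(ssh_dir, mode=0o700, exist_ok=True)

    content = ""
    if os.path.exists(authorized_keys_path):
        with open(authorized_keys_path, "r", encoding="utf-8") as f:
            content = f.read()
    if key in (line.strip() for line in content.splitlines()):
        return False  # Key already authorized; nothing to do.

    with open(authorized_keys_path, "a", encoding="utf-8") as f:
        # Avoid gluing the new key onto an unterminated last line.
        if content and not content.endswith("\n"):
            f.write("\n")
        f.write(key + "\n")
    os.chmod(authorized_keys_path, 0o600)
    return True
```

Run it on the server with the path to ~/.ssh/authorized_keys and the single line from the new .pub file.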
  • Solution (2): Support Legacy Key (Not Recommended)

    1. Edit the local ~/.ssh/config file and add the following under this host’s entry:

      HostKeyAlgorithms +ssh-rsa
      PubkeyAcceptedAlgorithms +ssh-rsa
    2. Result: The local client re-enables the legacy algorithm, and login also succeeds.
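
To confirm which algorithms the client will actually use for a host after editing ~/.ssh/config, the effective configuration printed by ssh -G can be inspected. This is a minimal sketch in Python. It assumes the output has one lowercase keyword followed by its value per line, with algorithm lists comma-separated, as recent OpenSSH clients print. parse_ssh_g and algorithm_enabled are illustrative names.

```python
def parse_ssh_g(output):
    """Parse `ssh -G`-style text into a dict of lowercase keyword -> value."""
    options = {}
    for line in output.splitlines():
        keyword, _, value = line.strip().partition(" ")
        if keyword:
            # Some keywords can repeat (e.g. identityfile); keep the first.
            options.setdefault(keyword.lower(), value.strip())
    return options

def algorithm_enabled(options, keyword, algorithm):
    """True if algorithm is in the comma-separated list stored for keyword."""
    return algorithm in options.get(keyword.lower(), "").split(",")
```

The text can be captured with, for example, `subprocess.run(["ssh", "-G", host], capture_output=True, text=True).stdout`, then checked with `algorithm_enabled(opts, "PubkeyAcceptedAlgorithms", "ssh-rsa")`.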


Investigation Module 4: Periodic “Connection Timeout” Issue

  • Issue: After logging in successfully with the new key, a later login attempt occasionally fails with “connection timeout”.

  • Investigation Process:

    1. Discovered user connects via VPN.
    2. Analysis: VPN service assigns different exit IP addresses on each reconnection.
    3. The cloud platform firewall (NSG) rule allowed only a single source IP (the IP from the last successful login).
    4. When the VPN exit IP changes, the NSG blocks traffic from the new IP, and the client sees a “connection timeout”.
  • Solution:

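    1. Modify the cloud platform NSG inbound rule for SSH (port 22) and set its “Source” to Any (0.0.0.0/0). This restores access from any VPN exit IP, at the cost of exposing port 22 to the whole internet, including the brute-force sources seen in Module 3.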
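
The effect of the source restriction can be illustrated with a small check of whether a client IP falls inside the ranges an inbound rule allows. This is a minimal sketch in Python using the standard ipaddress module. It models only the source-matching idea, not the Azure NSG API, and the addresses in the usage note are documentation placeholders.

```python
import ipaddress

def source_allowed(client_ip, allowed_sources):
    """Return True if client_ip falls inside any of the allowed CIDR ranges."""
    ip = ipaddress.ip_address(client_ip)
    return any(
        ip in ipaddress.ip_network(src, strict=False)
        for src in allowed_sources
    )
```

With a rule that allows only 203.0.113.10/32, a client arriving from 198.51.100.20 after a VPN reconnect is not matched, while 0.0.0.0/0 matches every IPv4 source.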