It could be a failing drive. If you have a spare drive you could reinstall on there and import your guests. Could save hours of troubleshooting software ghosts.
Thanks, I'll try that. Something else odd: I just realised I can reach it over Tailscale but not over the local network, so that narrows it down a bit. I'll give the spare drive a go first, though.
Hmm, are you running Tailscale on the Proxmox VE host itself rather than in a container/VM?
I'm running it on both the host and the container
There you go :) I’ve had a Pi running Tailscale that was not reachable on its local IP whenever Tailscale was started (it was still accessible on its Tailscale IP). When I was using Tailscale for site-to-site connectivity (subnet router) I ran it in an LXC container on PVE, so I’d advise you try that. Avoiding installation of additional software on the hypervisor seems like a smart idea - whenever I can, I put stuff in containers/VMs.
If it’s a failing drive, running dmesg will tell you that.
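A rough sketch of how you could sift the dmesg output for disk trouble instead of reading it all by eye. The pattern list is only my guess at common kernel messages for storage faults, not an exhaustive or official list, and the names find_disk_errors and scan_dmesg are made up for this example:

```python
import re
import subprocess

# Illustrative patterns only (an assumption, not a complete list of
# kernel messages that indicate a failing drive).
DISK_ERROR_PATTERNS = [
    r"I/O error",
    r"Medium Error",
    r"EXT4-fs error",
    r"ata\d+.*failed command",
    r"mmc\d+:.*error",
]
_DISK_ERROR_RE = re.compile("|".join(DISK_ERROR_PATTERNS), re.IGNORECASE)


def find_disk_errors(log_text):
    """Return the log lines that match any of the patterns above."""
    return [line for line in log_text.splitlines() if _DISK_ERROR_RE.search(line)]


def scan_dmesg():
    """Run dmesg (usually needs root) and return the suspicious lines."""
    result = subprocess.run(["dmesg"], capture_output=True, text=True, check=False)
    return find_disk_errors(result.stdout)
```

Calling scan_dmesg() as root on the host and printing what comes back should be enough. An empty list does not prove the drive is healthy; it only means none of these particular patterns showed up.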
Good point.
Very niche case, but I saw pretty much the same thing.
I was running from an SD card on the iDSDM in a Dell server. The module was flaky and I'd have to power it all off and pop the SD card out and back in then bring it back up.
Unlikely to be the same thing, but maybe some breadcrumbs to start your search
Side note: do not install Proxmox on cheap storage media
Hardware issues are often the cause. I'll put my advice as a top comment.
Good point. All my VMs were on a ZFS array.
It seemed like enough of the system was in RAM that it could still run, but anything that needed the root file system was out of luck.
I wonder if it is the same issue. I installed on an eMMC module, which isn't recommended by Proxmox, but I thought I'd get away with it. I might try reinstalling to another drive and see if it still happens.
Huh. Maybe it is related then. For what it's worth I ended up moving to a real drive, still solid state, a while ago and it's been working fine ever since.
If it's only the main OS not responding to SSH, and the web portal just disappears after a couple of minutes of use, then your router might be handing out the IP address that Proxmox is using. Proxmox uses a static IP address, so it will not change once set up. Your router, on the other hand, may reuse that address for another device if it was never reserved specifically for the Proxmox machine. In other words, two devices end up fighting for the same IP address on your network.
This is what happened to me. It took a while to figure out, until someone suggested checking the IPs and MAC addresses on the router. No issues since fixing it there. Just reserve the address on the router, reset the device that is using the same address (if you can figure out which one it is), and reset the router. Make sure you are reserving the address for the correct machine by checking the MAC address.
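If your router lets you copy out its client list, something like this can spot the clash for you. It is a minimal sketch: the (IP, MAC) pair format, the function name find_ip_conflicts and the addresses below are all made up for illustration, so adapt it to whatever your router actually exports:

```python
from collections import defaultdict


def find_ip_conflicts(entries):
    """Return {ip: sorted MACs} for every IP seen with more than one MAC.

    `entries` is an iterable of (ip, mac) pairs, a hypothetical format
    standing in for whatever the router's client list gives you.
    """
    macs_by_ip = defaultdict(set)
    for ip, mac in entries:
        # Normalise case so the same MAC written two ways counts once.
        macs_by_ip[ip].add(mac.lower())
    return {ip: sorted(macs) for ip, macs in macs_by_ip.items() if len(macs) > 1}


# Made-up example data: two different devices claiming 192.168.1.10.
clients = [
    ("192.168.1.10", "AA:BB:CC:00:00:01"),
    ("192.168.1.10", "aa:bb:cc:00:00:02"),
    ("192.168.1.20", "AA:BB:CC:00:00:03"),
]
print(find_ip_conflicts(clients))
```

Any IP that shows up in the result is being claimed by more than one device, and the MACs listed tell you which ones to track down.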
Thanks, I have already reserved the IP address for my host.
Physically log into it and check the "vitals" (logs and resource usage)
If I had to speculate I would say that your network card has faulty firmware.