I'm looking into ways to get vGPU to work on VMware with the NVIDIA Tesla series, but as far as retail cards go, you will be hamstrung by the lack (or rarity) of SR-IOV support.
For now I just use some low-end Quadro GPUs passed through to VMs running Docker, which then carves them up on a per-container basis.
Microsoft has GPU-P, as you found, which is in Hyper-V on Windows 11 (maybe 10) and Windows Server 2025, and I believe it works on retail cards.
For Proxmox, you have the vgpu-unlock script, which will work for some consumer NVIDIA GPUs. I've heard of ways of getting this to work on XCP-ng as well.
You've got two options I can think of:
As others have alluded to, split DNS. You need something handling DNS resolution internally that allows you to add custom records. You'll need to add a record of type "A" for your Immich hostname pointing to the internal IP where Immich sits.
Since you have Immich published to your public IP, you can use hairpin NAT. Whether it works is a lucky dip with routers, and only some make it configurable. This will allow you to hit Immich via your public IP, and the router will "hairpin" the traffic out to the WAN interface and back in. This is how I do it so I don't make a spaghetti mess of DNS records.
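If you want to tell which of the two options a given client is actually using, look at what IP the Immich hostname resolves to from inside your network. Here is a rough Python sketch of that check. The function name `route_for` and the example addresses are made up for illustration, and it only classifies an address you have already looked up; it does no DNS lookup itself.

```python
import ipaddress


def route_for(ip_text):
    """Classify an already-resolved address (hypothetical helper).

    "internal": the name resolved to a private address, so split DNS
                is answering and traffic goes straight to the host.
    "public":   the name resolved to a public address, so reaching it
                from inside the LAN depends on hairpin NAT working.
    """
    addr = ipaddress.ip_address(ip_text)
    return "internal" if addr.is_private else "public"
```

So if the hostname resolves to something like 192.168.1.50 from a LAN client, split DNS is doing the work; if it resolves to your public IP, you are relying on the router to hairpin.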
That said, a failure to resolve DNS suggests neither of those is the actual problem. Do you have a domain registered and DNS records pointing to your public IP? Does it resolve fine outside your network? If yes, then something may be wrong with your internal network's DNS resolution.
Also worth noting: if you only just created the records in public DNS and then tried to resolve them straight away, they may not have propagated yet, and your DNS resolver will cache the "record doesn't exist" result for some time (the most I've seen is a couple of hours).
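To check the resolution questions above from a machine on each side of your router, something like this minimal Python sketch would do. It asks whatever resolver the OS is configured to use, so a cached "doesn't exist" answer will show up as an empty result. The function name `resolve` and the hostname in the comment are placeholders, not anything from your setup.

```python
import socket


def resolve(hostname):
    """Return the unique addresses the system resolver gives for hostname.

    Returns an empty list when resolution fails (for example NXDOMAIN,
    a cached negative answer, or no reachable DNS server).
    """
    try:
        infos = socket.getaddrinfo(hostname, None)
    except socket.gaierror:
        return []
    # Each entry ends with a sockaddr tuple whose first item is the IP.
    return sorted({info[4][0] for info in infos})


# Example (placeholder hostname): print(resolve("immich.example.com"))
```

Run it once on your LAN and once from outside (mobile data, for instance). An empty list inside but a real address outside points at your internal DNS rather than at the public records.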