The others are correct: the problem is the software. You are right to use memory requests and limits. The limit is the most memory the pod is allowed to use, and the hope is that the other pods won't all be using their full limits at once.
The scheduler makes sure that the memory requests of all the pods on a given node sum to no more than the node's allocatable memory. So you can of course set your pod's request to the highest amount of RAM it will ever need, but that memory is then reserved for that pod and can't be scheduled for anything else, even during downtime.
K8s does allow overprovisioning of RAM through the limits, though, because it assumes pods will not need their full limit all the time, which is what you are seeing.
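To make the requests-versus-limits arithmetic concrete, here is a rough sketch in Python. The node size and the pod numbers are made up for illustration, and the check is a simplification of what the scheduler does: it only shows that requests have to fit on the node while limits are allowed to add up to more.

```python
# Hypothetical numbers for illustration only, all in MiB.
NODE_ALLOCATABLE_MIB = 8192

# (name, memory request, memory limit) for the pods on one node; all made up.
pods = [
    ("api", 2048, 4096),
    ("worker", 3072, 6144),
    ("cache", 2048, 2048),
]


def fits(pods, allocatable):
    """Simplified placement check: only the requests count."""
    return sum(request for _, request, _ in pods) <= allocatable


def limit_overcommit(pods, allocatable):
    """Summed limits divided by allocatable memory; above 1.0 is overcommitted."""
    return sum(limit for _, _, limit in pods) / allocatable


print("requests fit:", fits(pods, NODE_ALLOCATABLE_MIB))
print("limit overcommit ratio:", limit_overcommit(pods, NODE_ALLOCATABLE_MIB))
```

With these numbers the requests fit on the node, but the limits add up to more than the node has, so things only work as long as the pods don't all spike at the same time.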
What you can do is set a priority class on the pod, so that when it spikes and the node doesn't have enough RAM, Kubernetes prefers to evict some other, lower-priority pod instead of yours. That makes the other pods more volatile, of course.
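As a sketch of what the priority class setup could look like, the Python below builds a PriorityClass and a pod that references it, then prints them as JSON, which `kubectl apply -f` reads as well as YAML. Every name, the image, the priority value and the memory sizes are placeholders I picked for the example, not values from your cluster.

```python
import json

# All names and values below are hypothetical; adjust them to your cluster.
priority_class = {
    "apiVersion": "scheduling.k8s.io/v1",
    "kind": "PriorityClass",
    "metadata": {"name": "spiky-workload"},
    "value": 100000,  # higher value means higher priority
    "globalDefault": False,
    "description": "For the pod whose memory usage spikes.",
}

pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "spiky-app"},
    "spec": {
        # Ties the pod to the priority class defined above.
        "priorityClassName": priority_class["metadata"]["name"],
        "containers": [
            {
                "name": "app",
                "image": "example.com/spiky-app:latest",  # placeholder image
                "resources": {
                    "requests": {"memory": "1Gi"},  # what the scheduler reserves
                    "limits": {"memory": "4Gi"},  # the most the container may use
                },
            }
        ],
    },
}


def to_manifest(*objects):
    """Wrap the objects in a v1 List and serialize it as JSON."""
    return json.dumps(
        {"apiVersion": "v1", "kind": "List", "items": list(objects)}, indent=2
    )


print(to_manifest(priority_class, pod))
```

Keeping the request low and the limit high is what lets the pod spike without reserving the full 4Gi all the time; the priority class only influences which pod gets picked when the node runs short.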
There are many options at your disposal; you'll have to decide what works best for your use case.