TL;DR at the bottom, last line, and I will gladly reward whoever can help with this. It's been a few months of trying every other day to get this to stop! LOL
I'm running with only 512 MB of RAM and 768 MB of swap.
I know this is not a lot, but it seems to be able to handle up to 20 simultaneous users with no issue, and I never get a crash for any other reason.
My issue is that at 2:35 AM, VirtualMin does SOMETHING that brings the whole server down: I can't SSH in, etc.
I have to log in to my Vultr.com control panel and manually reboot to get everything to come back online.
I know it is something VirtualMin does, because:
1. It does it on 4 different servers, which are in no way related, one is even a minimal install for low memory systems, and that one crashes less, but still at the same time.
2. It is definitely something scheduled to run at that time. I figured this out because when Daylight Saving Time started, the servers started crashing at 1:35 instead of 2:35, which was a huge indicator that something runs at a fixed time.
There is no log file (that I am aware of) giving any details as to why. Every error log I find shows nothing.
I could guess it may be the automatic updating, but I'm fairly sure I disabled that on one server for a week, and it still went down about 2 days later.
Finally, weirdly, one of my servers does not go down. Same setup (1 core, 512 MB RAM, 768 MB swap, etc)
If anyone has any ideas as to what the heck is happening at 2:35/1:35 DST every few days that takes my servers down, I'd love to hear them.
I don't know if it helps but I am running VirtualMin free, and paid, on Ubuntu 16.04. There are 14 servers. All of them suffer from this issue.
I have uninstalled every module taking up more than 100 MB at any given time, except MySQL and Apache.
Sites all work perfectly.
Thanks in advance.
TL;DR - servers crash completely at 1:35AM~ every few days. Something grabs too much RAM at that time, I think.
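Since no log shows what happens before the freeze, one way to test the RAM theory is to record memory figures every minute and read the last entries after the reboot. Below is a minimal sketch that parses the standard `/proc/meminfo` format. The function names and the numbers in `SAMPLE` are made up for illustration, not taken from any of these servers.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of {field: value in kB}."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[key.strip()] = int(parts[0])
    return info


def summarize(info):
    """One-line summary (in MB) suitable for appending to a log file."""
    return "MemAvailable={} MB SwapFree={} MB".format(
        info.get("MemAvailable", 0) // 1024,
        info.get("SwapFree", 0) // 1024,
    )


def read_meminfo(path="/proc/meminfo"):
    """Read the live values on a Linux box (not called in this demo)."""
    with open(path) as handle:
        return parse_meminfo(handle.read())


# Hypothetical values, only to show the format being parsed.
SAMPLE = """\
MemTotal:         500000 kB
MemFree:           20480 kB
MemAvailable:      51200 kB
SwapTotal:        786432 kB
SwapFree:         102400 kB
"""

print(summarize(parse_meminfo(SAMPLE)))
```

Run once a minute from cron with `summarize(read_meminfo())` plus a timestamp appended to a file, the last few lines before 2:35 should show whether free memory and swap were actually running out.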
If it's some sort of scheduled problem, have you thought of checking if it's in the cron tab list of tasks, something happening in that time range? You don't mention it, so I'm asking just in case.
https://www.liquidweb.com/kb/how-to-display-list-all-jobs-in-cron-crontab/ and, looking better, https://stackoverflow.com/questions/134906/how-do-i-list-all-cron-jobs-f...
0,5,10,15,20,25,30,35,40,45,50,55 * * * * /etc/webmin/status/monitor.pl
58 2 * * * /etc/webmin/webalizer/webalizer.pl /var/log/virtualmin/XXXXX.com_access_log
0 1 * * * /sbin/reboot
48 3 * * * /etc/webmin/package-updates/update.pl
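As a rough cross-check of those four entries against the crash times, here is a small sketch that reports which crontab lines are scheduled at a given hour and minute. It is a simplified matcher, not a full cron parser: it only looks at the minute and hour fields, ignores day and month fields (all `*` here), and assumes purely numeric fields. The helper names are made up for this example.

```python
def expand_field(field, lo, hi):
    """Expand a numeric cron field ('*', '*/5', '0,5,10', '1-3') into a set of ints."""
    values = set()
    for part in field.split(","):
        step = 1
        if "/" in part:
            part, step_text = part.split("/", 1)
            step = int(step_text)
        if part == "*":
            start, end = lo, hi
        elif "-" in part:
            first, last = part.split("-", 1)
            start, end = int(first), int(last)
        else:
            start = end = int(part)
        values.update(range(start, end + 1, step))
    return values


def jobs_at(crontab_text, hour, minute):
    """Return commands whose minute and hour fields match the given time."""
    matches = []
    for line in crontab_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # skip anything that is not a five-field schedule plus command
        minute_f, hour_f, _dom, _mon, _dow, command = fields
        if (minute in expand_field(minute_f, 0, 59)
                and hour in expand_field(hour_f, 0, 23)):
            matches.append(command)
    return matches


# The four entries posted above, one per line.
CRONTAB = """\
0,5,10,15,20,25,30,35,40,45,50,55 * * * * /etc/webmin/status/monitor.pl
58 2 * * * /etc/webmin/webalizer/webalizer.pl /var/log/virtualmin/XXXXX.com_access_log
0 1 * * * /sbin/reboot
48 3 * * * /etc/webmin/package-updates/update.pl
"""

for hh, mm in [(1, 35), (2, 35)]:
    print("{:02d}:{:02d}".format(hh, mm), jobs_at(CRONTAB, hh, mm))
```

For these entries, the only job scheduled at either 1:35 or 2:35 is `/etc/webmin/status/monitor.pl`, which runs every five minutes anyway. The sketch does not cover the `/etc/cron.daily` scripts, whose run time is set separately in the system crontab.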
This is all I have, and I don't install anything after VirtualMin other than WordPress and Roundcube, so unless they are doing something on EVERY server despite all having completely different (and mostly empty) plugins and configs, I'm stumped.
I did however test a blank WordPress install once, and it still went down 2 days later, exactly at 1:35 AM.
Thanks for the help though definitely. :)
No idea then, it's weird O_o Here's to hoping you get it fixed eventually, mate.
That said, no harm in trying: have you thought of blaming your hosting company, sending them a ticket, telling them all that, and hoping they go "oh right, our fault, let us move your rack to a working room"?
There are a lot more default cron jobs than the ones you've listed. Have you checked webmin > system > scheduled cron jobs?
root Yes /etc/cron.daily/popularity-contest /etc/cron.daily/bsdmainutils /etc/cron.daily/spamassassin /etc/cron.daily/ntp /etc/cron.daily/mdadm /etc/cron.daily/mlocate /etc/cron.daily/logrotate /etc/cron.daily/apache2 /etc/cron.daily/dpkg /etc/cron.daily/apport /etc/cron.daily/apt-compat /etc/cron.daily/cracklib-runtime /etc/cron.daily/man-db /etc/cron.daily/passwd /etc/cron.daily/update-notifier-common /etc/cron.daily/quota /etc/cron.daily/apt-show-versions /etc/cron.daily/webalizer
root Yes /etc/cron.weekly/fstrim /etc/cron.weekly/man-db /etc/cron.weekly/update-notifier-common
root Yes test -x /etc/cron.daily/popularity-contest && /etc/cron.daily/popularity-contest ...
root Yes [ -x /usr/lib/php/sessionclean ] && /usr/lib/php/sessionclean
root Yes if [ -x /usr/share/mdadm/checkarray ] && [ $(date +%d) -le 7 ]; then /usr/share/ ...
root Yes /etc/webmin/status/monitor.pl
root Yes /etc/webmin/webalizer/webalizer.pl /var/log/virtualmin/cutielust.com_access_log
root Yes /sbin/reboot
root Yes /etc/webmin/package-updates/update.pl
www-data No [ -x /usr/share/awstats/tools/update.sh ] && /usr/share/awstats/tools/update.sh
www-data No [ -x /usr/share/awstats/tools/buildstatic.sh ] && /usr/share/awstats/tools/build ...
Nothing there that would cause random server downtime, especially if a daily reboot is one of the cron jobs.
For some reason it didn't copy the whole list
/etc/cron.daily/popularity-contest /etc/cron.daily/bsdmainutils /etc/cron.daily/spamassassin /etc/cron.daily/ntp /etc/cron.daily/mdadm /etc/cron.daily/mlocate /etc/cron.daily/logrotate /etc/cron.daily/apache2 /etc/cron.daily/dpkg /etc/cron.daily/apport /etc/cron.daily/apt-compat /etc/cron.daily/cracklib-runtime /etc/cron.daily/man-db /etc/cron.daily/passwd /etc/cron.daily/update-notifier-common /etc/cron.daily/quota /etc/cron.daily/apt-show-versions /etc/cron.daily/webalizer