For the past couple of months, all my sites have gone down at approximately 3am on Sundays. I only recently realised it was Apache, as I would usually restart the whole server in a panic when I woke up at around 8am. Now I can run "service httpd restart", which starts Apache and gets it working again.
The thing is, the server has service monitoring, so if it sees Apache down it should run "service httpd restart" automatically. When Apache goes down at other times this works, and I get a notice to say Apache is down. But I don't get that notice when this happens, and the restart does not work either. The Webmin status/monitoring page does show Apache as down, though.
But anyway, the main issue: why would Apache go down at 3am on Sundays? I can't see any cron job of mine at that time that could break it. I am wondering if it's one of the system cron jobs. Other websites describing this issue indicate it could be due to log rotation, because this happened to someone else when a log rotation kicked in. But I can't get to the bottom of that either, and I don't know what to look at.
Does anyone know what runs at 3am on Sunday, so I can pinpoint a service that may cause it?
Thanks
My first thought is it could be the out of memory process killer.
We can try and guess stuff all day. What OS, and server specs do you have? Lots of active sites? If VPS, what type?
You can look in the various /etc/cron* directories and find the system automated jobs.
I would recommend installing the sysstat package (on Debian it needs to be enabled after install). You can then use the 'sar' command to see a history of resource usage for the last day.
It’s CentOS 7. I keep it all updated.
I don’t think it’s OOM. I upgraded the server the other day due to OOM. I went from 8GB to 16GB, and the OOM kills I was getting at random times stopped.
This is clockwork. 3am every Sunday.
Is it happening around the time logrotate is running?
You may want to take a look at this thread here to see if any of these suggestions help... I'd suggest starting with Comment #10:
https://www.virtualmin.com/comment/794993#comment-794993
Hello Andrey
Do you have any suggestions for my issue https://www.virtualmin.com/node/65359 ? In the past you have always been able to point me to the right solution. Thanks
OK, will take a look. How can I determine if it’s around log rotate? I looked at the cron jobs but there is nothing at 3am Sunday. Where could I look to see what runs at that time, such as log rotate?
If it's not a system job you can look at user crontabs in /var/spool/cron/crontabs.
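The two suggestions so far (the /etc/cron* directories and the user crontabs) can be checked in one pass. Below is a minimal Python sketch, not an official tool: find_cron_files is a hypothetical helper name. On CentOS the per-user crontabs normally sit directly in /var/spool/cron, while /var/spool/cron/crontabs is the Debian-style location, so the sketch looks in both.

```python
from pathlib import Path

# Assumed locations: the /etc/cron* entries named in the thread, plus the two
# common user-crontab spools (RHEL/CentOS style and Debian style).
SYSTEM_PATTERN = "etc/cron*"
USER_SPOOLS = ("var/spool/cron", "var/spool/cron/crontabs")


def find_cron_files(root="/"):
    """Return a sorted list of regular files that cron or anacron may read."""
    base = Path(root)
    found = set()
    for entry in base.glob(SYSTEM_PATTERN):
        if entry.is_file():              # e.g. /etc/crontab
            found.add(entry)
        elif entry.is_dir():             # e.g. /etc/cron.d, /etc/cron.daily
            found.update(p for p in entry.iterdir() if p.is_file())
    for spool in USER_SPOOLS:
        spool_dir = base / spool
        if spool_dir.is_dir():
            found.update(p for p in spool_dir.iterdir() if p.is_file())
    return sorted(found)
```

Calling find_cron_files() as root and printing each path gives a list of files to read through; reading the spool directories usually needs root privileges.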
I can't see anything running at 3am Sunday in the cron jobs. There is nothing in /var/spool/cron/crontabs.
I can only imagine it's one of the cron jobs below calling some script, and that script has some check to run at 3am Sunday. Where are the log rotate schedules and commands, so I can see if it has to do with that?
Here are the cron jobs. I have had to create a cron job after 3am Sunday to restart Apache due to it going down:
/usr/lib64/sa/sa1 1 1 At cron time */10 * * * *
/usr/lib64/sa/sa2 -A Every day at 23:53
/usr/sbin/csf --lfd restart > /dev/null 2>&1 Every day at 0:00
/usr/local/maldetect/maldet --mkpubpaths >> /dev/null 2>&1 At cron time */5 * * * *
/usr/sbin/csf -u Every day at 4:55
"/etc/cron.hourly/awstats
/etc/cron.hourly/0anacron" Every hour at 01 past the hour
systemctl try-restart atop Every day at 0:00
sudo bash /root/.loggly/file-monitoring-cron-mariadb.sh At cron time */10 * * * *
/etc/webmin/status/monitor.pl At cron time 0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38,40,42,44,46,48,50,52,54,56,58 * * * *
service httpd restart At cron time 30,45 3 * * 0
wget -q -O - https://www.somedomain.co.uk/wp-admin/admin-ajax. ... At cron time 0,15,30,45 * * * *
/opt/rh/rh-php70/root/bin/php -q /home/server1/public_html/wp-cron.php At cron time 0,15,30,45 * * * *
/opt/rh/rh-php70/root/bin/php -q /home/server2/public_html/wp-cron.php At cron time 10,25,40,55 * * * *
/opt/rh/rh-php70/root/bin/php -q /home/server3/public_html/wp-cron.php At cron time 10,25,40,55 * * * *
/opt/rh/rh-php70/root/bin/php -q /home/server4/public_html/wp-cron.php At cron time 15,30,45,0 * * * *
sh /home/server5/backup_to_s3.sh > /dev/null 2>&1 Every day at 2:00
/opt/rh/rh-php70/root/bin/php -q /home/server6/public_html/wp-cron. ... At cron time 15,30,45,0 * * * *
/opt/rh/rh-php70/root/bin/php -q /home/server7/public_html/wp-cro ... At cron time 15,30,45,0 * * * *
wget -O - http://somedomain.co.uk/?ACT=100 >> /dev/null Every day at 8:00
/opt/rh/rh-php70/root/bin/php -q /home/server8/public_html/wp-cron.php At cron time 07,22,37,52 * * * *
/opt/rh/rh-php70/root/bin/php -q /home/server9/public_html/wp-cron.php At cron time 10,25,40,55 * * * *
/opt/rh/rh-php70/root/bin/php -q /home/server10/public_html/wp-cron ... At cron time 0,15,30,45 * * * *
/opt/rh/rh-php70/root/bin/php -q /home/server11/public_html/wp-cron ... At cron time 15,30,45,0 * * * *
/opt/rh/rh-php70/root/bin/php -q /home/server12/public_html/wp-cron.php At cron time 0,15,30,45 * * * *
/opt/rh/rh-php70/root/bin/php -q /home/server13/public_html/wp-cron.php At cron time 20,35,50,05 * * * *
php /home/server14/public_html/update_analytics.php Daily (at midnight)
/opt/rh/rh-php70/root/bin/php -q /home/server15/public_html/wp-cron.php At cron time 10,25,40,55 * * * *
/opt/rh/rh-php70/root/bin/php -q /home/server16/public_html/wp-cron.php At cron time 05,20,35,50 * * * *
/opt/rh/rh-php70/root/bin/php -q /home/server17/public_html/wp-cron.php At cron time 05,20,35,50 * * * *
/opt/rh/rh-php70/root/bin/php -q /home/server18/public_html/wp-cron.p ... At cron time 10,25,40,55 * * * *
/opt/rh/rh-php70/root/bin/php -q /home/server19/public_html/wp-cron.php At cron time 10,25,40,55 * * * *
/opt/rh/rh-php70/root/bin/php -q /home/server20/public_html/wp-cron.php At cron time 5,20,35,50 * * * *
/opt/rh/rh-php70/root/bin/php -q /home/server21/public_html/wp-cron.php At cron time 20,35,50,05 * * * *
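The list above includes /etc/cron.hourly/0anacron. On a stock CentOS 7 install that hourly hook is usually what starts anacron, which then runs /etc/cron.daily (where the logrotate job normally lives) once the hour falls inside START_HOURS_RANGE in /etc/anacrontab. The Python sketch below parses anacrontab text to pull out that window. The sample text is modelled on typical CentOS 7 defaults and is an assumption, not a copy of this server's file; parse_anacrontab and earliest_start_hour are hypothetical helper names.

```python
import re

# Sample text modelled on a typical CentOS 7 /etc/anacrontab. This is an
# assumption for illustration; check the real file on the affected server.
SAMPLE_ANACRONTAB = """\
SHELL=/bin/sh
RANDOM_DELAY=45
START_HOURS_RANGE=3-22

#period in days   delay in minutes   job-identifier   command
1         5       cron.daily      nice run-parts /etc/cron.daily
7         25      cron.weekly     nice run-parts /etc/cron.weekly
@monthly  45      cron.monthly    nice run-parts /etc/cron.monthly
"""


def parse_anacrontab(text):
    """Split anacrontab text into settings (dict) and jobs (list of dicts)."""
    settings, jobs = {}, []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        assignment = re.match(r"^([A-Z_]+)\s*=\s*(.*)$", line)
        if assignment:
            settings[assignment.group(1)] = assignment.group(2)
            continue
        period, delay, job_id, command = line.split(None, 3)
        jobs.append({"period": period, "delay_min": int(delay),
                     "id": job_id, "command": command})
    return settings, jobs


def earliest_start_hour(settings):
    """First hour of START_HOURS_RANGE, or None if the setting is absent."""
    value = settings.get("START_HOURS_RANGE")
    return int(value.split("-")[0]) if value else None
```

With the sample text, the earliest start hour is 3, which would line up with jobs from /etc/cron.daily firing shortly after 3am. Whether the real file on this server has the same values needs checking.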
In Webmin's Logfile Rotation there are LOADS set to a Weekly period, but with no specific time, just Weekly. So is the time for Weekly set somewhere else, and could it be 3am Sunday? How can I find out whether all the Weekly log rotations run at 3am Sunday?
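One likely reason only "Weekly" is shown is that logrotate has no clock time for it. According to the logrotate man page, a weekly log is rotated when the current weekday is lower than the weekday of the last rotation, or when more than a week has passed; the actual time of day is simply whenever the daily logrotate job happens to run. With Sunday counted as day 0, the first run of each week normally falls on a Sunday. Here is a minimal Python sketch of that rule. The function names are made up for illustration, and the exact behaviour can differ between logrotate versions.

```python
from datetime import datetime, timedelta


def c_weekday(moment):
    """Weekday numbered as in C's tm_wday: Sunday=0 ... Saturday=6."""
    return moment.isoweekday() % 7


def weekly_rotation_due(now, last_rotated):
    """Apply the 'weekly' rule as described in the logrotate man page."""
    new_week = c_weekday(now) < c_weekday(last_rotated)
    over_a_week = now - last_rotated > timedelta(days=7)
    return new_week or over_a_week
```

For example, weekly_rotation_due(datetime(2019, 3, 24, 3, 20), datetime(2019, 3, 20, 3, 20)) is true, because 24 March 2019 is a Sunday and the last rotation in that example was on a Wednesday. The dates are arbitrary and not taken from the server in question.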
We recently had a similar problem on Virtualmin.com and it came from having low disk space. So, you might check to make sure you've got plenty of free space...ours was because backups were filling the user-available space. We never noticed it was a problem because there was still ~10GB free, but our backups are that big. ;-)
Log rotation can also take a bit of disk space (though much less, obviously, so may be still worth checking if it is happening during/because of log rotation).
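The point about disk space is that the free space has to be compared with the size of whatever is about to be written, not judged as a percentage. A small Python sketch of that check follows; has_room_for is a hypothetical helper name and the byte count in the usage note is only an example.

```python
import shutil


def has_room_for(path, needed_bytes):
    """True if the filesystem holding `path` has at least `needed_bytes` free."""
    return shutil.disk_usage(path).free >= needed_bytes
```

For instance, has_room_for("/home", 10 * 1024**3) asks whether a 10 GiB backup would still fit on the filesystem that holds /home.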
Most of the jobs Virtualmin schedules are done using Webmin's Scheduled Functions feature, which is more efficient and can be forced to run sequentially rather than herding as can happen with cron jobs (e.g. if all users had a job at 3AM, they'd all fire up at once and possibly saturate disk bandwidth, even if the "nice" level was set to a low priority for all of them). You can see the scheduled jobs in Webmin->Webmin Configuration->Webmin Scheduled Functions. I dunno if any of those are relevant to your specific case, but if you're not seeing a cron job at 3AM on Sunday, that'd be the next place to look.
Just finding the job that's triggering it won't fix it...you still need to figure out why it isn't coming back up after a reload or a restart. Could be disk space but could be something else. Logs from the time when it tries to restart and fails might help.
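To pull out the log lines from around the failed restart, something like the Python sketch below could help. It assumes the default Apache 2.4 error log timestamp layout and an English locale for the day and month names; lines_near is a hypothetical helper name, and a custom ErrorLogFormat would need a different pattern.

```python
import re
from datetime import datetime, timedelta

# Default Apache 2.4 error_log lines start with a stamp such as
# [Sun Mar 24 03:20:01.123456 2019]; a custom ErrorLogFormat would need a
# different pattern.
STAMP = re.compile(r"^\[(\w{3} \w{3} \d{2} \d{2}:\d{2}:\d{2}\.\d{6} \d{4})\]")


def lines_near(lines, center, minutes=15):
    """Yield log lines stamped within +/- `minutes` of `center`."""
    window = timedelta(minutes=minutes)
    for line in lines:
        match = STAMP.match(line)
        if not match:
            continue  # skip continuation lines and anything unstamped
        stamp = datetime.strptime(match.group(1), "%a %b %d %H:%M:%S.%f %Y")
        if abs(stamp - center) <= window:
            yield line
```

Feeding it the lines of the Apache error log (commonly /var/log/httpd/error_log on CentOS, although individual virtual hosts may log elsewhere) with center set to 3am on the Sunday in question should narrow things down to the restart attempt.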
Did you try testing a manual reload? I don't remember exactly what it was, but I had something similar with another control panel (cron job or log rotate, no errors, couldn't find the cause, and so on). In that config I changed the automatic reload after that kind of job to a restart. (Here in the EU that box has almost no visitors at night, so that was possible on a production box without too many problems.)
But if there is such a difference between reload and restart, then you have an extra hint about where to search.
No, I am not familiar with that manual reload. Are you able to explain a bit more, please? Although I would not want to be doing manual reloads all the time; it should be automated with no downtime.
Forgot to say, I have lots of servers, and this is only happening on CentOS 7. None of my CentOS 6 servers have this issue. They all had the same setup process using Virtualmin. And it only started 1 or 2 months ago.
@Joe yes, plenty of space. I only recently upgraded from an 8GB to a 16GB server. I am using 45% of the space.
Please do some more web searching and read the forum rules and so on. You are only now posting versions, which...
Also, people giving support put time into it, so it would show more respect if those asking for support did too. ;)
A reload, also known as a graceful restart, was giving us trouble on some servers with another control panel. Manual testing is just testing; I am not saying you should do it every time! If a manual graceful restart does not work, then you know to search in that area. Looking up a how-to link with a web search cost me time, which you .... ;) Whether support is free or even paid, that doesn't mean the people here trying to help should have to guess or roll dice..... https://httpd.apache.org/docs/2.4/stopping.html
And then look for the same how-to for CentOS 7.
I'm Dutch/German, so typing here costs me effort... And Virtualmin support helps you even with the free versions, which is very nice, but please help them a bit too by giving full information, and sometimes answer questions from other free users if you know the answer. With the free versions we have to do it together; that is common practice. ;)
Oh yeah, the main reason I am pointing to this is the following passage from that page (with a graceful restart this can sometimes happen):
You should also be wary of other potential race conditions, such as using rotatelogs style piped logging. Multiple running instances of rotatelogs attempting to rotate the same logfiles at the same time may destroy each other's logfiles.
Hey, if you don't know the answer, just don't reply. Maybe someone who does can assist. I have spent hours and hours researching this issue on the web before I came here. You have no idea how much time I have put in, so I take offence that you imply I have not done any. When I ask Virtualmin for support, it is always because I have already spent a lot of time trying myself without success.
When I have asked for support in the past, I have stated I am OK to pay and purchased a ticket, but someone declined it and refunded it. It's important to me to get this fixed, and I can pay. I am not looking for free support.
But I did think it would be simple for someone who knows about log rotation to tell me where the 3am time is controlled. It must be set somewhere; then I could go and look into that command.
What sets up log rotation on Linux? If it has nothing to do with Virtualmin, I can go elsewhere to ask the question, if providing support is an issue here.
No offence meant; I was only going by the information in this topic.
But still, the restart versus reload test (does a graceful restart work, yes or no) could give a clue. With a graceful restart there could be a race condition with the log rotations, which is a more common cause of such problems.
I gave you links for how to restart or reload and so on, plus a quote about that kind of problem, in case that is what it is. My reply was to your question about how to test a manual reload, which is really easy to find on the web; that explains my previous post, which was a direct answer to that question. To find a solution you first have to know where the cause is; otherwise it is trial and error and hoping to find one.
I also did answer your question to me about reload/graceful and why this could matter..... And I did write that my problem with it was on another control panel, but it is a common issue and therefore maybe also possible on Virtualmin.
That's OK.
I think I was a bit confused by your reply, maybe by your English (sorry! :) ). I think I understand now. You are suggesting that when it goes down I try a graceful restart first, not a full restart?
I know how to restart httpd or graceful restart, no problem, I just did not understand "test manual reload".
It may take me some time. I have a cron job that does a full Apache restart at various times after 3am. I will need to change these to graceful, so it tries graceful several times, then finishes off at maybe 4:30am with a full restart in case graceful did not work. I am not going to be present at 3am, you see, and I don't want to wait until I wake up to try a graceful restart manually! So I will use cron to automate it.
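The "graceful first, full restart as a fallback" plan could be scripted along these lines and called from cron. This is only a Python sketch under stated assumptions: the unit name httpd.service is the CentOS 7 default, the stock unit's reload action is assumed to perform a graceful restart, and recover_apache and run are hypothetical names. Note that systemctl reload fails when the unit is not running, which is why the fallback matters.

```python
import subprocess

UNIT = "httpd.service"  # CentOS 7 unit name; an assumption for other distros


def run(args):
    """Run a command and return its exit status."""
    return subprocess.run(args, check=False).returncode


def recover_apache(runner=run):
    """Try a graceful reload first; fall back to a full restart.

    Returns "reload", "restart", or "failed" to say which step succeeded.
    """
    reloaded = runner(["systemctl", "reload", UNIT]) == 0
    if reloaded and runner(["systemctl", "is-active", "--quiet", UNIT]) == 0:
        return "reload"
    if runner(["systemctl", "restart", UNIT]) == 0:
        return "restart"
    return "failed"
```

The runner parameter exists so the logic can be tried with a fake command runner before it is trusted on a live server. Logging the returned string on each run would also record whether graceful alone was ever enough.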
Thanks
Good luck! If so (graceful not working), then you have to find out why. If graceful does work, you have a workaround for the time being that is also better than a restart. ;)
Due to the clocks changing by an hour (I guess), Apache went down at 4am today. So my auto restart did not work, as it's scheduled before 4am.
This means that when I got up, I could run "service httpd graceful". When I did, I got this error:
Job for httpd.service invalid.
But I can see that that happens when Apache is stopped anyway.
Yup, sorry, so it should. But if you test just once while Apache is still running, does the graceful restart work? (I mean, to check that the problem is not with the graceful restart itself but with what happens after the log rotation or so. Maybe some kind of delay could be built into a script for that; I don't know how myself.)
When Apache is running, graceful restart works fine
Hmm, I am out now. https://www.virtualmin.com/comment/794993#comment-794993
You can scroll up there to read reason.
I have managed to fix this now. I found another website that basically stated that having too many separate log rotation scripts, one for each vhost, makes it crash. I disabled all the vhost log rotations and then created one single log rotate script that does them all at the same time rather than each one separately, so Apache should only be restarted once at the end, not after every one.
I think this is what the link above implies, BUT this really should be addressed; you can't leave a faulty process in there.
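For anyone wanting to try the same fix, here is a rough Python sketch that builds one combined logrotate stanza of the kind described. The poster's actual script was not shown, so everything concrete here is an assumption: the log paths, the reload command, the number of rotations kept, and the helper name build_logrotate_stanza. The key directive is sharedscripts, which makes the postrotate script run once for the whole stanza instead of once per matched log.

```python
def build_logrotate_stanza(log_globs, reload_command, keep=5):
    """Return one logrotate stanza covering every glob in `log_globs`.

    `sharedscripts` makes the postrotate script run once for the whole
    stanza instead of once per matched log file.
    """
    lines = [" ".join(log_globs) + " {"]
    lines += [
        "    weekly",
        f"    rotate {keep}",
        "    compress",
        "    missingok",
        "    notifempty",
        "    sharedscripts",
        "    postrotate",
        f"        {reload_command}",
        "    endscript",
        "}",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical inputs: the log location and reload command are assumptions,
# not values quoted by the poster.
EXAMPLE = build_logrotate_stanza(
    ["/var/log/virtualmin/*_access_log", "/var/log/virtualmin/*_error_log"],
    "/bin/systemctl reload httpd.service > /dev/null 2>&1 || true",
)
```

Printing EXAMPLE gives text that could be reviewed and then saved under /etc/logrotate.d/, after the per-vhost rotations have been disabled so that the same files are not rotated twice.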