Submitted by aremdee on Sun, 10/11/2015 - 18:19 Pro Licensee
Seems I spoke too soon, unless this is another issue. MySQL service seemed to stop running (Wordpress sites failed to connect to database). I couldn't connect to Webmin either so I thought I would do a reboot. However the server didn't boot but got stuck on "fsck exited with status code 4". A quick Google showed me how to manually run fsck, which I did and after quite a few errors, the system started again but the wordpress sites again failed to connect to their databases. I logged in via PuTTY and ran "service mysql restart" and everything started working again. I could log into webmin again too. My question is what logs do I look into to see what may have caused the problem?
Status:
Active
Comments
Submitted by andreychek on Sun, 10/11/2015 - 19:04 Comment #1
Howdy -- unfortunately, you could be seeing the beginning of a hard disk failure. That could cause all the problems you're experiencing.
Is this a VPS, or dedicated server? If it's a dedicated server, is it using RAID?
Also, what errors did you run into while running the fsck?
Lastly, what is the output of this command:
dmesg | tail -30
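As a rough companion to the command above, the kernel messages could also be narrowed down to lines that usually accompany disk or filesystem trouble. This is only a sketch: the function name `fs_errors` is made up for illustration, and the pattern list is an assumption based on common ext4 and block-layer messages, not an exhaustive set.

```shell
#!/bin/sh
# Hypothetical helper: keep only kernel log lines (read from stdin) that
# commonly point at filesystem or disk problems.
# The patterns are illustrative assumptions, not a complete list.
fs_errors() {
    grep -iE 'ext[234]-fs error|i/o error|remounting filesystem read-only|aborting journal' || true
}
```

It would be used as `dmesg | fs_errors`. A benign line such as "EXT4-fs (sda1): re-mounted. Opts: ...errors=remount-ro" is deliberately not matched, since it only reports mount options.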
Submitted by aremdee on Sun, 10/11/2015 - 20:09 Pro Licensee Comment #2
Hi,
No it's a VPS (in that it's a virtual machine under hyper-v). There are two other VM's on the same host, a windows VM and another Linux VM (the original Ubuntu server). Neither of these are displaying the same issue though. The fsck errors are too numerous to list (plus I can't remember them, I just went yes to fix them), is there a log I can access for that? Haven't rebooted the machine for a while but things went funny after the webmin update.
Output below
[ 6.431063] systemd-udevd[208]: starting version 215
[ 7.501616] input: Power Button as /devices/LNXSYSTM:00/LNXPWRBN:00/input/input3
[ 7.501625] ACPI: Power Button [PWRF]
[ 7.577334] hv_vmbus: registering driver hyperv_fb
[ 7.578257] hyperv_fb: Screen resolution: 1152x864, Color depth: 32
[ 7.584749] Console: switching to colour frame buffer device 144x54
[ 7.686099] input: PC Speaker as /devices/platform/pcspkr/input/input4
[ 7.688612] hv_utils: Registering HyperV Utility Driver
[ 7.688616] hv_vmbus: registering driver hv_util
[ 7.690291] piix4_smbus 0000:00:07.3: SMBus base address uninitialized - upgrade BIOS or use force_addr=0xaddr
[ 7.695039] hv_vmbus: registering driver hyperv_keyboard
[ 7.695682] input: AT Translated Set 2 keyboard as /devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A03:00/device:07/VMBUS:01/vmbus_0_4/serio2/input/input5
[ 8.020536] psmouse serio1: alps: Unknown ALPS touchpad: E7=12 00 64, EC=12 00 64
[ 8.224139] psmouse serio1: trackpoint: failed to get extended button data
[ 9.004617] Adding 2170876k swap on /dev/sda5. Priority:-1 extents:1 across:2170876k FS
[ 9.039823] AVX version of gcm_enc/dec engaged.
[ 9.044719] alg: No test for __gcm-aes-aesni (__driver-gcm-aes-aesni)
[ 9.068024] alg: No test for crc32 (crc32-pclmul)
[ 9.196447] EXT4-fs (sda1): re-mounted. Opts: data=ordered,grpquota,errors=remount-ro,usrquota
[ 12.304179] psmouse serio1: trackpoint: IBM TrackPoint firmware: 0x01, buttons: 0/0
[ 12.305880] input: TPPS/2 IBM TrackPoint as /devices/platform/i8042/serio1/input/input6
[ 15.204052] systemd-journald[206]: Received request to flush runtime journal from PID 1
[ 21.864197] RPC: Registered named UNIX socket transport module.
[ 21.864201] RPC: Registered udp transport module.
[ 21.864202] RPC: Registered tcp transport module.
[ 21.864203] RPC: Registered tcp NFSv4.1 backchannel transport module.
[ 22.074266] FS-Cache: Loaded
[ 22.128345] FS-Cache: Netfs 'nfs' registered for caching
[ 22.433884] Installing knfsd (copyright (C) 1996 okir@monad.swb.de).
[ 7711.963266] traps: php5-cgi[14587] general protection ip:70e439 sp:7ffcb361d570 error:0 in php5-cgi[400000+7e1000]
Submitted by andreychek on Sun, 10/11/2015 - 20:12 Comment #3
Unfortunately, fsck doesn't log any errors.
If you run it again and see errors, let us know some of the ones you see there.
As far as log files go, you could try looking at /var/log/syslog, /var/log/kern.log, and /var/webmin/miniserv.error.
You may also want to keep an eye on that dmesg output to see if it shows errors in the future.
What kind of VPS is it that you have there? Is it OpenVZ? If so, could you share your /proc/user_beancounters file?
Submitted by aremdee on Sun, 10/11/2015 - 20:41 Pro Licensee Comment #4
It's a Hyper-V Virtual Machine. Full blown server running Debian Linux, but in a virtualised environment (so maybe it isn't a VPS). I've just checked the Host machine's event viewer and noticed it complaining that the Integrated Services on the server (host) is 6 whereas the client is 5.1. There doesn't appear to be a way to fix this from another Google search as the Linux OS is supposed to have LIS built in. So I'm not sure if this is the problem or not. Ubuntu is OK and so maybe that's the reason it reboots fine. Might have to find a forum somewhere that can help with Hyper-V integration if this is the problem and it continues.
Submitted by andreychek on Sun, 10/11/2015 - 22:31 Comment #5
Ah, you did indeed mention hyper-v above.
I'd definitely suggest reviewing the logs above, looking back to roughly the time you began having some recent problems. In particular, the logs "syslog" and "kern.log" might have the best info.
Also, you may want to make sure you're using the most recent packages -- and in particular, make sure you're using the most recent kernel version for your distro.
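One way to review the logs around the time a problem began is to pull out a single day's entries from the traditional syslog-format files. The sketch below assumes the usual "Mon DD HH:MM:SS" line prefix; the function name `log_on_day` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print lines for one calendar day from
# traditional syslog-format files.
# $1 = month abbreviation (e.g. Oct), $2 = day of month, rest = log files.
# Single-digit days are padded with an extra space in these files,
# so one or more spaces are allowed between month and day.
log_on_day() {
    month=$1
    day=$2
    shift 2
    grep -hE "^${month} +${day} " "$@" || true
}
```

For example, `log_on_day Oct 11 /var/log/syslog /var/log/kern.log | tail -50` would show the last entries written that day. Older entries may have moved to rotated files such as `/var/log/syslog.1`.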
Submitted by aremdee on Sun, 10/11/2015 - 23:13 Pro Licensee Comment #6
Mmmm. The trouble was apparent when I arrived at work at around 8. Snippet from syslog below.
Oct 11 19:55:01 debian CRON[6956]: (root) CMD (/etc/webmin/status/monitor.pl)
Oct 12 08:32:35 debian rsyslogd: [origin software="rsyslogd" swVersion="8.4.2" x-pid="672" x-info="http://www.rsyslog.com"] start
We jump from 19:55 on the 11th to 8:32 on the 12th. The 8:32 time is after the problem was fixed using a manual fsck.
A snippet from "kern.log".
Oct 9 22:02:53 debian kernel: [1571658.828657] sd 0:0:0:0: Warning! Received an indication that the operating parameters on this target have changed. The Linux SCSI layer does not automatically adjust these parameters.
Oct 10 22:40:13 debian kernel: [1660298.841532] traps: php5-cgi[61349] general protection ip:70e439 sp:7ffc2a235180 error:0 in php5-cgi[400000+7e1000]
Oct 12 08:32:35 debian kernel: [ 0.000000] Initializing cgroup subsys cpuset
I can't see anything obvious but my knowledge is pretty limited.
Ran uname -a and that returned debian 3.16.0-4 which looks like the latest kernel. Maybe it was just a hold over from the webmin update. If there are any other logs you can think of that may help let me know.
Thanks for your help by the way.
Submitted by andreychek on Sun, 10/11/2015 - 23:54 Comment #7
Well, your server seems to be seeing some pretty significant low level issues, at the filesystem level or below.
This error here is concerning:
Warning! Received an indication that the operating parameters on this target have changed. The Linux SCSI layer does not automatically adjust these parameters.
That seems to indicate that the Linux kernel thinks something "underneath" it has changed, such as what could happen if a VPS setting was modified. But if that error occurs regularly, it may indicate a kernel bug or VPS bug of some sort.
That looks a bit like this particular issue here:
https://social.technet.microsoft.com/Forums/windowsserver/en-US/8807f61c...
Is there anything else "heavy" that's running at the time when this occurs? Such as backups of some sort?
If it happens again, what I'd suggest doing is to run the "mount" command.
The users in the above forum suggested their filesystem was made read-only when that error occurred... if that's the case in your situation, that may explain why things like MySQL stopped working.
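To make the "mount" check quicker to read during an incident, the output could be filtered for disk-backed filesystems that are mounted read-only. This is a sketch only: `ro_disks` is a hypothetical name, and it assumes the usual `mount` layout of "DEVICE on MOUNTPOINT type FSTYPE (options)".

```shell
#!/bin/sh
# Hypothetical helper: read `mount` output on stdin and report /dev/*
# filesystems whose option list contains the "ro" flag.
# "errors=remount-ro" is just a policy setting and is not matched;
# pseudo-filesystems such as tmpfs are ignored because some are
# legitimately read-only.
ro_disks() {
    awk '$1 ~ /^\/dev\// && $NF ~ /^\((ro|.*,ro)[,)]/ {
        print $1, "on", $3, "is read-only"
    }'
}
```

Running `mount | ro_disks` prints nothing while everything is healthy. A line such as "/dev/sda1 on / is read-only" would suggest the kernel remounted the root filesystem after an error, which would also explain services like MySQL failing to write.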
Submitted by aremdee on Sun, 10/11/2015 - 23:58 Pro Licensee Comment #8
Below are some snippets from webmin miniserv.error logs
[11/Oct/2015:15:33:56 +1100] [121.214.130.40] Document follows : This web server is running in SSL mode. Try the URL https://debian.host.net.au:10000/ instead.
Error: Server no longer exists! Warning: something's wrong at /usr/share/webmin/authentic-theme/authentic.pl line 8.
Mind you I get the same error message from Ubuntu except for the Error: Server no longer exist! That's a little different but maybe that was at the time webmin shutdown after the update.
Submitted by aremdee on Mon, 10/12/2015 - 01:57 Pro Licensee Comment #9
I looked at the Microsoft blog and the problems reported there were seemingly solved by doing a reboot. The difference with our server is that a reboot actually seemed to spark the issue. We backup every day (well every night using Altaro backup software), and backup three VM's, one a windows VM and two Linux VM's. The Debian one was the problem machine. But have been backing up these machines without fail for the last few weeks. Were there any significant changes with webmin being updated that may have affected the Debian install? That was the only change that I can see. There were a couple of package updates but nothing significant last week. Debian is pretty stable and needs infrequent updating, which is one reason I chose it over other distros.
If it happens again I may have to revert to Ubuntu or CentOS for backup reasons.
Submitted by andreychek on Mon, 10/12/2015 - 10:38 Comment #10
The issue wouldn't be related to Webmin or other user-facing services.
Your server seems to be experiencing some pretty low level issues.
I'd be looking at lower level things. On the host, there is hyper-v itself and hardware, but since the other VM's aren't experiencing an issue, those don't stand out to me either.
On your Debian server, I'm suspicious of the kernel. But I'm also curious what other packages on there were updated recently, as there could possibly be other low-level tools in use there.
While we normally recommend against this sort of thing for stability reasons, since you're already experiencing stability issues, you might consider looking into whether moving to a non-standard kernel might be worth a try.
For example, perhaps the kernel from Debian Testing.
Now, you'd want to be able to revert back to this current setup if that doesn't work, so if you choose to do that, I'd highly recommend having an excellent backup of your entire server. But if you get stuck and just aren't sure what else to change -- I personally might try that, before migrating everything to a new distro.
However, I might wait until you see this issue again, and if you do, see if those same errors show up in the syslog file. Also, use the "mount" command to see if the filesystem is in readonly mode.
That will help us determine if you're experiencing the issue in the link I shared above.
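To judge whether the SCSI warning quoted earlier in the thread recurs regularly, its occurrences could be counted per day across the kernel logs. A minimal sketch, assuming traditional syslog-format files; `scsi_warning_days` is a made-up name.

```shell
#!/bin/sh
# Hypothetical helper: count, per day, how often the SCSI
# "operating parameters ... have changed" warning appears in the
# given log files. Output order is not guaranteed.
scsi_warning_days() {
    grep -h 'operating parameters on this target have changed' "$@" |
        awk '{ count[$1 " " $2]++ }
             END { for (d in count) print d ": " count[d] }'
}
```

For example, `scsi_warning_days /var/log/kern.log /var/log/kern.log.1` prints one line per day on which the warning was logged. If those days line up with the nightly backups, that would be worth mentioning.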
Submitted by aremdee on Mon, 10/12/2015 - 17:13 Pro Licensee Comment #11
Well all good this morning. Server was backed up last night using Altaro, as well as the internal backup through webmin. I can log in to webmin without issues and all websites seem to be working ok.
Pasted below is the output from mount
sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime)
proc on /proc type proc (rw,nosuid,nodev,noexec,relatime)
udev on /dev type devtmpfs (rw,relatime,size=10240k,nr_inodes=633729,mode=755)
devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000)
tmpfs on /run type tmpfs (rw,nosuid,relatime,size=1017300k,mode=755)
/dev/sda1 on / type ext4 (rw,relatime,quota,usrquota,grpquota,errors=remount-ro,data=ordered)
securityfs on /sys/kernel/security type securityfs (rw,nosuid,nodev,noexec,relatime)
tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev)
tmpfs on /run/lock type tmpfs (rw,nosuid,nodev,noexec,relatime,size=5120k)
tmpfs on /sys/fs/cgroup type tmpfs (ro,nosuid,nodev,noexec,mode=755)
cgroup on /sys/fs/cgroup/systemd type cgroup (rw,nosuid,nodev,noexec,relatime,xattr,release_agent=/lib/systemd/systemd-cgroups-agent,name=systemd)
pstore on /sys/fs/pstore type pstore (rw,nosuid,nodev,noexec,relatime)
cgroup on /sys/fs/cgroup/cpuset type cgroup (rw,nosuid,nodev,noexec,relatime,cpuset)
cgroup on /sys/fs/cgroup/cpu,cpuacct type cgroup (rw,nosuid,nodev,noexec,relatime,cpu,cpuacct)
cgroup on /sys/fs/cgroup/devices type cgroup (rw,nosuid,nodev,noexec,relatime,devices)
cgroup on /sys/fs/cgroup/freezer type cgroup (rw,nosuid,nodev,noexec,relatime,freezer)
cgroup on /sys/fs/cgroup/net_cls,net_prio type cgroup (rw,nosuid,nodev,noexec,relatime,net_cls,net_prio)
cgroup on /sys/fs/cgroup/blkio type cgroup (rw,nosuid,nodev,noexec,relatime,blkio)
cgroup on /sys/fs/cgroup/perf_event type cgroup (rw,nosuid,nodev,noexec,relatime,perf_event)
systemd-1 on /proc/sys/fs/binfmt_misc type autofs (rw,relatime,fd=23,pgrp=1,timeout=300,minproto=5,maxproto=5,direct)
mqueue on /dev/mqueue type mqueue (rw,relatime)
hugetlbfs on /dev/hugepages type hugetlbfs (rw,relatime)
debugfs on /sys/kernel/debug type debugfs (rw,relatime)
rpc_pipefs on /run/rpc_pipefs type rpc_pipefs (rw,relatime)
Submitted by andreychek on Mon, 10/12/2015 - 18:06 Comment #12
I'd be interested in the output of "mount" only when experiencing a problem, that will show if the filesystem is in read-only mode.
You could also determine that by running "dmesg | tail -30" most likely.
Submitted by aremdee on Sun, 03/13/2016 - 22:53 Pro Licensee Comment #13
Well it happened again but this time I did a screen capture of the mount command. However when I look before the mount command it seems the server actually runs out of memory. This is odd as most of the time it runs at around 50% usage. I would notice if the memory was fast depleting as I check daily.
After running the mount command I also ran fsck /dev/sda1 to repair the drive. Sorry I didn't take note of the fixes but there was a heap of them. I didn't reboot the server before I did all this so not sure if the out of memory issue would have resolved itself doing the reboot. Would memory issues cause corruption of files on the drive? Perhaps this has been the problem all along. The server has 5Gb of memory assigned to it, but I'm not even sure that this is the problem. I can send through the image of the terminal output, just can't see anything obvious where I can attach it.
Submitted by andreychek on Sun, 03/13/2016 - 23:17 Comment #14
Sorry to hear you're still experiencing this issue!
Running out of memory would not cause filesystem issues though. That would cause processes to be killed off by the Linux kernel, and would prevent their usage until they are restarted. It wouldn't cause filesystem corruption though.
And yeah, 5GB of RAM should be more than plenty.
If you run out of RAM again, you may want to review your running processes to see what's eating up all the memory. You can do that with "ps auxw". While that's a different issue than what's causing the filesystem corruption, it would be good to sort out what's causing that.
Lastly -- attachments may not work at the moment, unfortunately, though you could always upload the image somewhere else, including on your server, and give us a link to that.
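Since the raw "ps auxw" listing can be long, one option is to reduce it to the processes holding the most resident memory. The sketch below assumes the standard `ps aux` column layout, where the sixth column is RSS in KiB and the eleventh is the command; `top_rss` is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical helper: read `ps aux` style output on stdin and print the
# N processes with the largest resident set size (default 10).
# Output columns: RSS in MB (rounded down), PID, command name.
top_rss() {
    n=${1:-10}
    awk 'NR > 1 { printf "%d MB\t%s\t%s\n", $6 / 1024, $2, $11 }' |
        sort -rn |
        head -n "$n"
}
```

Usage would be `ps auxw | top_rss 5`. This only shows memory resident in RAM at that moment, so it needs to be run while the problem is happening.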
Submitted by aremdee on Wed, 03/30/2016 - 16:43 Pro Licensee Comment #15
Thanks for getting back. The system doesn't notify me anymore when there is a response. Link to the screen grab is below.
Debian_Mount_Output_13-3-2016.png
Hopefully this will give some light to what's happening. Thanks again.
Submitted by andreychek on Wed, 03/30/2016 - 16:59 Comment #16
Yeah it looks like the filesystem is indeed going into read-only mode, due to filesystem errors.
That doesn't identify what exactly the issue is, but it does mean that you're seeing some disk level issues.
I was previously suggesting you try a newer kernel on your Linux system... I would definitely suggest that still. However, the other thought I had, are there updates to Hyper-V available? It wouldn't hurt to try updating that as well.
Submitted by aremdee on Wed, 03/30/2016 - 21:56 Pro Licensee Comment #17
Interestingly enough we also have an Ubuntu VM that never shows this behaviour, but it does get rebooted quite often for updates. The Debian VM usually never needs to reboot (except when this happens) and this lock up usually occurs after 60 days or so of uptime. I'm thinking that every month during windows updates (which always require a reboot) I could just make it a practice to reboot the Debian server too and see if the problem presents itself again.
But having said all that, just had the server nearly run out of virtual memory. There was still plenty of ram left though, close to 60%. I will add that the symptoms prior to an emergency reboot were database connectivity issues with Wordpress sites. The reboot settled things down though. Could this be a MySQL issue?
As for Hyper-V on the host server, it is always updated so I would assume Hyper-V is updated also during these monthly patches too. I noticed today that there are a few optional updates (about 24 of them) that were posted on the 15th of this month.
Submitted by andreychek on Wed, 03/30/2016 - 21:58 Comment #18
Unfortunately, you'd have to look while the RAM is being used, in order to determine what's using it. After the problem has stopped, there's nothing that would show what was using it.
However, what you could always do is setup monitoring to notify you when there's a problem... and when that occurs, you could take a look and review the situation.
One way to do that would be to use Webmin -> Others -> System and Server Status. There, you can setup various monitors, including RAM monitoring.
However, I don't think the issue you're seeing with filesystem corruption is RAM related -- but it would also be good to resolve the RAM usage.
-Eric
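Alongside the Webmin monitor, a small script run from cron could flag high swap usage so the process list can be captured while the problem is still visible. This is a sketch under assumptions: `swap_alert` and the 60% default are made up for illustration, and it relies on the `SwapTotal` and `SwapFree` fields of `/proc/meminfo`.

```shell
#!/bin/sh
# Hypothetical helper: print a warning when swap usage exceeds a threshold.
# $1 = threshold in percent (default 60, an arbitrary choice)
# $2 = meminfo file to read (default /proc/meminfo)
swap_alert() {
    awk -v limit="${1:-60}" '
        /^SwapTotal:/ { total = $2 }
        /^SwapFree:/  { free = $2 }
        END {
            if (total == 0) exit 0          # no swap configured
            used = 100 * (total - free) / total
            if (used > limit)
                printf "swap usage %.0f%% exceeds %d%%\n", used, limit
        }' "${2:-/proc/meminfo}"
}
```

The function prints nothing while usage is below the limit, so any output can be treated as the trigger to save a `ps auxw` listing for later review.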
Submitted by aremdee on Wed, 03/30/2016 - 22:54 Pro Licensee Comment #19
After the reboot virtual ram has steadily increased, real memory usage is 42%, virtual memory usage is 48%. Virtual server count is 63. The Ubuntu server on the other hand is running at 37% real memory, 1% virtual memory and has 28 virtual servers running. When you mentioned the kernel I noticed that Ubuntu is running Linux 3.16.0-67-generic on x86_64 whereas Debian is using Linux 3.16.0-4-amd64 on x86_64. So it looks like Ubuntu is using a later kernel (unless I'm not understanding the kernel numbering system).
So maybe I should be using a later kernel. Just unsure how to go about it.
Submitted by andreychek on Wed, 03/30/2016 - 23:03 Comment #20
Well, I'm just taking a bit of a guess here.
All I can say for certain is, if you're seeing filesystem errors, something very wrong is occurring, at a low level.
That's different than the RAM issues you are seeing. If RAM usage is going up, you would want to monitor what processes are using more RAM.
But as far as filesystem errors goes -- you're looking at low-level problems there. Is there a hyper-v update available? If so, that might be an easier place to start.
Submitted by aremdee on Wed, 03/30/2016 - 23:28 Pro Licensee Comment #21
They are optional updates and the hyper-v ones seem more to do with a Windows VM. I keep wondering why Ubuntu doesn't suffer from the same problem if it's a Hyper-V issue. What I can do if it happens again is copy the Debian VM first before fixing it and then maybe run it up without network connectivity to see what file system errors there are. If I fix the original VM then I can keep customers happy and investigate the other one to see what the problem may be.
As for the memory issue, I'm running htop in putty at the moment but can't make head nor tail of some of the figures. Is there a command that gives me application usage or something similar in webmin that can do the same?
Submitted by aremdee on Wed, 03/30/2016 - 23:42 Pro Licensee Comment #22
Don't worry just found it :)
Submitted by aremdee on Wed, 03/30/2016 - 23:57 Pro Licensee Comment #23
PID 1234 mysql 1.39 GB.
Not sure if it's swap that it's using. Is there a way to check this please? And if it is, is there a way to get it use real memory instead?
Submitted by andreychek on Thu, 03/31/2016 - 09:18 Comment #24
Well, there isn't a simple way to see what process is using swap.
Though if the goal is to lower swap usage, it might be best just to lower RAM usage in general.
What is the output of these two commands:
free -m
ps auxwf
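For what it's worth, a rough per-process view of swap can be pieced together from the `VmSwap` field that the kernel exposes in `/proc/<pid>/status`. The sketch below assumes that field is present on the running kernel; `swap_by_process` is a hypothetical name, and the optional first argument exists only so the function can be pointed at a different directory for testing.

```shell
#!/bin/sh
# Hypothetical helper: list the processes using the most swap, based on
# the VmSwap line in each /proc/<pid>/status file.
# $1 = proc root (default /proc), $2 = number of rows (default 10)
# Output columns: swap in kB, PID, process name.
swap_by_process() {
    root=${1:-/proc}
    for status in "$root"/[0-9]*/status; do
        [ -r "$status" ] || continue
        awk '/^Name:/   { name = $2 }
             /^Pid:/    { pid = $2 }
             /^VmSwap:/ { swap = $2 }
             END { if (swap > 0) printf "%d kB\t%s\t%s\n", swap, pid, name }' \
            "$status" 2>/dev/null
    done | sort -rn | head -n "${2:-10}"
}
```

Run as root so that every process is readable. Kernel threads have no `VmSwap` line and are skipped.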
Submitted by aremdee on Thu, 03/31/2016 - 16:34 Pro Licensee Comment #25
Again swap got higher than 60% before I finished up last night. As I was concerned it would run out of virtual memory over night, I thought about the website I was working on when all this started. So as a test I disabled it and then enabled it again. Almost instantly swap went down to a respectable 20% and memory usage down to around 30%. However it all started to climb again but after 8 or so hours had gotten to around 52% virtual memory and memory to 40%. So this morning I disabled and enabled it again (about an hour and half ago) and the same thing happened. Almost instant drop again and we are now running at 42% memory usage and 29% virtual memory usage.
Have you ever heard of something like this happening? Is it possible for a website to cause these issues?
Submitted by aremdee on Thu, 03/31/2016 - 16:43 Pro Licensee Comment #26
Output from free -m
             total       used       free     shared    buffers     cached
Mem:          4967       4782        184       2425         31       2653
-/+ buffers/cache:       2097       2869
Swap:         2119        644       1475
The output from ps auxwf is fairly large. Would you prefer that in a text file?
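As a reading aid for the numbers above: in this older `free` layout, the memory actually held by applications is "used" minus "buffers" minus "cached", which is what the "-/+ buffers/cache" row reports. A small sketch that does the arithmetic, assuming that layout (newer versions of `free` print different columns); `free_summary` is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical helper: summarise old-style `free -m` output read from stdin.
# Assumes the column order: total used free shared buffers cached.
free_summary() {
    awk '
        /^Mem:/  { total = $2; used = $3; buffers = $6; cached = $7 }
        /^Swap:/ { stotal = $2; sused = $3 }
        END {
            if (total == 0) exit 1          # no Mem: line seen
            app = used - buffers - cached
            printf "applications: %d MB of %d MB (%.0f%%)\n", app, total, 100 * app / total
            if (stotal > 0)
                printf "swap: %d MB of %d MB (%.0f%%)\n", sused, stotal, 100 * sused / stotal
        }'
}
```

With the figures posted above, 4782 - 31 - 2653 gives 2098 MB (free itself shows 2097 because it rounds from kB), or about 42% of 4967 MB, and swap comes to 644 of 2119 MB, about 30%. Usage would be `free -m | free_summary`.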
Submitted by aremdee on Mon, 04/04/2016 - 21:41 Pro Licensee Comment #27
Still unsure what is going on here with virtual memory. However there were 52 updates yesterday that I installed and overnight Virtual Mem stayed at around 40 - 45%. During the day it has steadily increased to 77%. Not sure why unless it's MySQL?
Should I start a new ticket for this? Any help appreciated.
Submitted by andreychek on Mon, 04/04/2016 - 22:30 Comment #28
The output I'm seeing above isn't too unusual.
Of course if you're running out of swap that is unusual. But since the above looks okay, it would be difficult to say what exactly is causing the problem, we'd need to see it when the RAM usage is more of a problem.
However, it can't hurt to tweak MySQL to lower how much RAM it's using.
A simple way to do that would be to re-run the setup wizard (in System Settings). When doing that, try choosing one of the lower memory options for MySQL.
Submitted by aremdee on Tue, 04/05/2016 - 17:57 Pro Licensee Comment #29
Thanks for that. Did about 52 updates yesterday for the Debian server. Virtual usage started to rise again so I did the usual trick of disabling a virtual server and enabling it again. That dropped it all back to respectable levels. I didn't check last night, but did this morning and I'm happy to report that Virtual memory is running at 29% and memory is 43%. Maybe there was an issue that was fixed in yesterday's patches. Will keep an eye on it and see how it goes. Will also put into practice my new policy of rebooting the server during windows updates and see if the file system error returns. If it doesn't then that one can be crossed off the list too.
Submitted by andreychek on Wed, 04/06/2016 - 10:00 Comment #30
I just wanted to follow up, are things still looking good on your server there?
Submitted by aremdee on Thu, 04/07/2016 - 02:42 Pro Licensee Comment #31
Virtual memory is still going up and down, but hasn't maxed out like the other day. It's at 69% at present. Perhaps the problem stems from the fact that I didn't assign enough virtual memory when I set the server up. Ubuntu has roughly the same amount of virtual memory as it has ram and its virtual memory is running at 0%.
When I set Debian up it defaulted to about half. However memory usage always drops significantly whenever I disable a virtual server, not sure what processes are happening when that occurs though. But I will edit this to qualify the statement, not every domain disabling will drop virtual memory.
Submitted by aremdee on Fri, 04/15/2016 - 02:24 Pro Licensee Comment #32
Just had to restart webmin. Virtual memory was getting high at the time however it wasn't catastrophic. Have just gone through the logs for around that period of time and copied and pasted into notepad. There is quite a bit in there about the kernel and hyper v stuff too. Have uploaded the text file for you to look at as I can't see if there's an issue or not.
https://www.rdweb.com.au/mem_issue_15-4-2016.txt
Not sure if this is related to the corruption that occurs once every few months. Was about to ask the question about uninstalling clamav as it's mentioned a couple of times and that question was, as the server doesn't handle email, can it be safely removed? That would free up a little bit of memory too.
Also need to add that this morning I rebooted the host machine to install the latest server updates. There were quite a few, around 32 in total.
Thanks for the help.
Submitted by andreychek on Fri, 04/15/2016 - 09:13 Comment #33
Ah if you aren't using email, you may want to disable the ClamAV and SpamAssassin services. Both of those can use a decent amount of RAM.
I wouldn't remove them altogether, but disabling the service should do the trick.
The text file you shared does suggest that you're experiencing low memory situations.
Submitted by aremdee on Sun, 04/17/2016 - 00:54 Pro Licensee Comment #34
Any tips how to permanently disable both. The only one that shows up under status is spamassassin which is already marked as disabled, but clam isn't listed. If it isn't running, any idea why it consumes around 600MB as shown in the processes list?
Submitted by andreychek on Sun, 04/17/2016 - 09:09 Comment #35
If the feature is disabled in Virtualmin, that doesn't necessarily mean it's not running, it just means Virtualmin isn't allowed to use it.
To disable a service, you can go into Webmin -> System -> Bootup and Shutdown, and there you can see the services that are running, and you can optionally disable them.
Submitted by aremdee on Mon, 04/18/2016 - 03:04 Pro Licensee Comment #36
Did that and it effectively shut it down. However when I refresh the page it shows clamav-daemon and clamav-daemon.socket as both starting at boot. Can't seem to stop them from starting at boot (even though they aren't running). Also Bind is back too. It's consuming around 400MB. Can I do the disable thing to it as well? These Linux boxes don't handle DNS, that's still handled by the Windows server.
Submitted by andreychek on Mon, 04/18/2016 - 09:36 Comment #37
By default, BIND is used for all DNS lookups.
However, if you edit /etc/resolv.conf and change the nameserver there to use your ISP's nameservers, it would then be possible to disable it.
For anything you're disabling, you'd want to make sure you're selecting the disable now and on boot option on the Bootup and Shutdown screen.
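On a systemd-based release, the same "disable now and on boot" step can presumably also be done from the shell with `systemctl stop` and `systemctl disable`. The sketch below is a dry run by default and only prints the commands; the function name `disable_units` and the `RUN` switch are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: stop each unit and prevent it from starting at boot.
# Dry run by default: the commands are printed, not executed.
# Set RUN=1 in the environment to actually run them (needs root).
disable_units() {
    for unit in "$@"; do
        for action in stop disable; do
            if [ "${RUN:-0}" = 1 ]; then
                systemctl "$action" "$unit"
            else
                echo "systemctl $action $unit"
            fi
        done
    done
}
```

For ClamAV that might look like `disable_units clamav-daemon.socket clamav-daemon.service`, listing the socket unit first because a socket that is still active can start its service again on demand. Afterwards, `systemctl is-enabled clamav-daemon.service` reports whether the change took effect.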
Submitted by aremdee on Mon, 04/18/2016 - 17:49 Pro Licensee Comment #38
Thanks for that. BIND isn't running on the ubuntu server so I assume that disabling it on debian shouldn't have any effect either. Having said all that I seem to recall that BIND was disabled earlier, see this thread. However it is still running as a process after reboots. I don't think the server uses it as it isn't ticked under System Settings>Features and Plugins. Is it safe then to stop and not have it start up after a reboot?
The other thing is that clamav-daemon, clamav-daemon.socket continually stay as "Yes" for start at boot, despite numerous attempts of selecting and clicking "Disable Now and On Boot". So I tried clicking on the service and opening up the page "Edit Systemd Service for clamav-daemon.socket", selecting Start at boot time=No and hitting save. That throws me to an error page that says "Failed to save systemd service : No systemd configuration entered" message and I have to return to previous page. But if I go back to "Bootup and Shutdown" the service is still listed as start at boot = yes. Maybe I need to add something to systemd to get it to run but am unsure. Any ideas?
Thanks again Roger
Submitted by andreychek on Mon, 04/18/2016 - 21:19 Comment #39
Whether or not Virtualmin is using BIND -- you'd definitely want to ensure that 127.0.0.1 isn't listed as a nameserver in /etc/resolv.conf. That can be listed there, even if Virtualmin isn't using BIND. If that's in there, that would cause DNS lookups to fail if BIND were stopped.
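Before stopping BIND, the resolver configuration could be checked for a loopback nameserver entry. A minimal sketch; `uses_local_resolver` is a hypothetical name, and treating `::1` as equivalent to `127.0.0.1` is an assumption added here.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit status 0) if the given resolv.conf
# lists a loopback address as a nameserver. Commented-out lines are ignored.
# $1 = file to check (default /etc/resolv.conf)
uses_local_resolver() {
    grep -Eq '^[[:space:]]*nameserver[[:space:]]+(127\.|::1)' "${1:-/etc/resolv.conf}"
}
```

It could be used as `uses_local_resolver && echo "local BIND is still in use for lookups"`. If that message appears, the nameserver line would need changing before BIND is disabled, otherwise DNS lookups on the server would fail.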
Submitted by aremdee on Tue, 04/19/2016 - 06:17 Pro Licensee Comment #40
Have done that thank you and disabled BIND from running now and at reboot. CLAM on the other hand is still persistent in wanting to start at boot. Any ideas as to what may be causing this? Again your help is appreciated.
Should I start another ticket for this issue? Seems we are going off track a little.
Roger
Submitted by andreychek on Tue, 04/19/2016 - 08:28 Comment #41
Hmm, what distro/version is it that you're using there? (my apologies if I asked that already, though I didn't see it above at a glance)
Submitted by aremdee on Tue, 04/19/2016 - 17:05 Pro Licensee Comment #42
Debian Linux 8.
Submitted by aremdee on Fri, 04/22/2016 - 20:25 Pro Licensee Comment #43
I have setup monitoring as you suggested earlier. Last night I received two emails, the first one said "Monitor on debian.host.net.au for 'MySQL Database Server' has detected that the service has gone down at 22/Apr/2016 23:30". The second email came shortly after "Monitor on debian.host.net.au for 'MySQL Database Server' has detected that the service has gone back up at 22/Apr/2016 23:35". The Server logs show at that time the below entry.
Is it correct to assume that MySQL is the culprit that is consuming Virtual Memory? If so do you know how I can force it to use RAM over Virtual Memory or a way I can increase Swap on a running server?
Thanks
Submitted by JamieCameron on Fri, 04/22/2016 - 21:57 Comment #44
Usually when a process is killed for the system being out of memory, it is the culprit - as Linux always kills the largest process.
If you have swap configured, that will be used automatically as a fallback.
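To confirm that a service such as MySQL was killed for lack of memory, the kernel log can be searched for out-of-memory messages. The sketch below is hedged: the patterns are based on typical kernel wording, which varies between versions, and `oom_events` is a made-up name.

```shell
#!/bin/sh
# Hypothetical helper: show kernel out-of-memory events.
# Reads the files given as arguments, or stdin when none are given.
# The message patterns are assumptions based on common kernel wording.
oom_events() {
    grep -hE 'invoked oom-killer|Out of memory|Killed process [0-9]+' "$@" || true
}
```

Either `oom_events /var/log/kern.log` or `dmesg | oom_events` would work. The "Killed process" line names the victim, which can then be compared against the time the monitor reported the service going down.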
Submitted by aremdee on Fri, 04/22/2016 - 22:04 Pro Licensee Comment #45
Thanks Jamie. I assumed that swap is virtual memory, but I may be wrong. Can you enlighten me please? Also when I setup the server it defaulted to 2GB of virtual memory, perhaps I need to increase that amount. Is there a way of doing that on an already running server?
Thanks again.
Submitted by andreychek on Sat, 04/23/2016 - 00:12 Comment #46
That's correct, swap is virtual memory.
I'm surprised your server is having so much trouble with RAM, you have about 7GB including swap.
What you could always try is to go into System Settings, and re-run the setup wizard.
In there is an option to specify which MySQL config to use. You may want to try setting it to use the smallest MySQL config, which will significantly reduce how much RAM it uses.
After that, restart MySQL, and see if that helps with your issue.
Submitted by aremdee on Mon, 04/25/2016 - 19:28 Pro Licensee Comment #47
Yep I re ran the installer a while back when you suggested it but didn't restart MySQL. Think I have had a reboot since then too. When I re-ran the installer I selected 1GB for MySQL however last night it was running at 2.3GB. I'm thinking that I need to increase the size of the swap partition or alternatively setup a new server with a much larger swap and move everyone over to that server. I could run that as a GPL and when I'm done switch it over to the Pro version. Any thoughts please.
Submitted by andreychek on Mon, 04/25/2016 - 22:28 Comment #48
Needing more than 2GB of swap, when you already have about 5GB of RAM, seems like quite a bit. It really shouldn't need anything near that... and in general I'd suggest we try to configure your server so that it doesn't feel the need to use as much swap.
I mean, you could certainly do that, but in my opinion I'd work to figure out what's going on to cause it to need so much RAM.
You really may want to try the smallest MySQL size, at least for the time being (and then yeah, you may need to manually restart MySQL afterwards).
Another thing you could try is to edit /etc/apache2/apache2.conf, and set "MaxClients" to something smaller... it generally defaults to 150 or more, you might want to try setting it to say, 30 or 40 just for the moment (and then restart Apache).
That will make it so that a sudden flood of traffic can't bring your server down.
After making those changes and restarting the services (or rebooting), could you share what "ps auxf" shows?
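The directive is not always in apache2.conf itself: it can sit in an included file, and Apache 2.4 renamed MaxClients to MaxRequestWorkers. A minimal sketch for locating it, assuming the Debian layout under /etc/apache2 (`find_worker_limit` is a hypothetical helper name):

```sh
# Hypothetical helper: lists every uncommented MaxClients or
# MaxRequestWorkers line below the given Apache config directory.
# Output is "file:line-number:directive", as printed by grep -n.
find_worker_limit() {
    grep -rnE '^[[:space:]]*(MaxClients|MaxRequestWorkers)[[:space:]]' "${1:-/etc/apache2}"
}

# Typical use:
#   find_worker_limit              # searches /etc/apache2
#   find_worker_limit /some/dir    # searches another directory
```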
Submitted by aremdee on Mon, 04/25/2016 - 23:28 Pro Licensee Comment #49
It seems weird, but whatever is using swap is not using RAM, as RAM usage seems to hover around 35% to 45%. Swap goes up and down like a yo-yo. Will reducing RAM for MySQL put more load onto the processor? I also tried to find "MaxClients" in apache2.conf but can't find it.
Submitted by aremdee on Mon, 04/25/2016 - 23:30 Pro Licensee Comment #50
MySQL is running at 1.21GB at the moment, steadily increasing along with swap.
Real memory: 4.85 GB total / 2.88 GB free / 2.63 GB cached Swap space: 2.07 GB total / 884.90 MB free
Submitted by aremdee on Tue, 04/26/2016 - 00:09 Pro Licensee Comment #51
OK, I just re-ran the installer. MySQL is running at 714MB. Real memory: 4.85 GB total / 2.78 GB free / 2.62 GB cached Swap space: 2.07 GB total / 864.32 MB free.
Swap is still high even though MySQL has dropped by about 500MB.
ps auxf output
Submitted by andreychek on Tue, 04/26/2016 - 00:24 Comment #52
It's not abnormal to see something using swap in that case -- it just means something that was previously moved to swap is still there.
I think changing MySQL for the time being is the best thing to do until we're sure you aren't running low on memory.
Now, it is going to grow, but it should slow down at some point.
From here, we just need to review what else is using RAM.
Which version of Debian is it you're using there? That will help me understand what your Apache config should look like so we can modify it to allow fewer connections.
Also, could you run the "ps auxwf" again and share that?
Submitted by aremdee on Tue, 04/26/2016 - 00:39 Pro Licensee Comment #53
Just as a comparison Ubuntu is running as below
Real memory: 4.84 GB total / 3.50 GB free / 2.10 GB cached Swap space: 5 GB total / 4.97 GB free
MySQL is 2.60GB.
Submitted by andreychek on Tue, 04/26/2016 - 00:43 Comment #54
There are a lot of variables that contribute to how much RAM is used by a server. It's not surprising that another system would be using a much different amount of RAM.
We'll need to see the output to the "ps" command above, as well as the exact Debian version, to offer some additional input on all that.
Submitted by aremdee on Tue, 04/26/2016 - 01:58 Pro Licensee Comment #55
Hi Eric,
It's just above. But if you need the link again then please see below.
ps auxf output
Also the server is running Debian Jessie (8).
Thanks
Submitted by andreychek on Tue, 04/26/2016 - 10:27 Comment #56
Ah, my bad, looks like I missed your link!
Okay, so MySQL is definitely looking good there. It's not actually using that full 700MB, some of that is shared. It may really be in the 500's.
It looks like pulsewayd is using a decent amount of RAM (though, again, it should be okay, and having 6GB of RAM should be plenty for you to use tools like it).
Are you using Mailman? If not, you may want to disable that service, and then shut off the feature in Virtualmin.
However, after reviewing what you have there, I do think I'd focus on how many people are allowed to connect to Apache at one time, and possibly on how much RAM each PHP process is taking.
If there are any PHP modules installed that you don't need, you may want to consider removing or disabling them. Each installed module takes up RAM during every Apache request.
As far as the Apache config goes -- it looks like they may have moved the files around where those configuration items are stored.
What is the output of this command:
ls /etc/apache2/mods-enabled | grep mpm
If it's just one file, could you also paste in the contents of that file?
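To see roughly how much RAM each kind of process is taking, resident memory can be totalled per command name. This is a sketch with a hypothetical helper name (`rss_by_command`); because RSS counts shared pages once per process, the totals overstate real usage, which is the same caveat as the shared MySQL memory mentioned above.

```sh
# Hypothetical helper: reads "RSS_KB COMMAND" pairs on stdin and prints
# the total resident memory per command in MB, largest first.
# Only the first word of the command name is used.
rss_by_command() {
    awk '{ kb[$2] += $1 }
         END { for (c in kb) printf "%d %s\n", kb[c] / 1024, c }' | sort -rn
}

# Typical use on a live system:
#   ps -e -o rss= -o comm= | rss_by_command | head
```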
Submitted by aremdee on Tue, 04/26/2016 - 17:55 Pro Licensee Comment #57
Hi Eric,
Various websites on the server send email notifications and submissions, but it doesn't receive email, so I'm not sure if Mailman is what is used for this purpose. If it isn't using a lot of RAM I might leave that. But then again, looking at the list of Features and Plugins, it isn't even enabled, so maybe I can get rid of it.
Output from "ls /etc/apache2/mods-enabled | grep mpm"
> ls /etc/apache2/mods-enabled | grep mpm
mpm_prefork.conf
mpm_prefork.load
Submitted by andreychek on Tue, 04/26/2016 - 18:03 Comment #58
You would know if you were using Mailman... it's not used on most systems (we're actually beginning the process of disabling this by default).
The Apache config file I'm interested in seeing is "mpm_prefork.conf" -- can you paste in the contents of that file?
Submitted by aremdee on Tue, 04/26/2016 - 18:23 Pro Licensee Comment #59
mpm_prefork.conf
# prefork MPM
# StartServers: number of server processes to start
# MinSpareServers: minimum number of server processes which are kept spare
# MaxSpareServers: maximum number of server processes which are kept spare
# MaxRequestWorkers: maximum number of server processes allowed to start
# MaxConnectionsPerChild: maximum number of requests a server process serves
<IfModule mpm_prefork_module>
StartServers 5
MinSpareServers  5
MaxSpareServers 10
MaxRequestWorkers  150
MaxConnectionsPerChild  0
</IfModule>
# vim: syntax=apache ts=4 sw=4 sts=4 sr noet
Submitted by andreychek on Tue, 04/26/2016 - 18:30 Comment #60
Try setting the "MaxRequestWorkers" parameter to "40", and then restart Apache, and see if that helps.
Submitted by aremdee on Tue, 04/26/2016 - 20:09 Pro Licensee Comment #61
Done. Real memory: 4.85 GB total / 3.53 GB free / 1.99 GB cached Swap space: 2.07 GB total / 1.81 GB free. However I don't believe it has impacted the system as it was around the same before I restarted Apache. The file I modified was in the mods-available directory. Does that mean it's active or inactive if it's in that directory?
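On Debian-style Apache layouts, a file in mods-available only takes effect when a symlink to it exists in mods-enabled (normally created with a2enmod). Since mpm_prefork.conf shows up in the mods-enabled listing earlier in this thread, the edited file should be the active one. A small sketch for checking this, with `module_conf_status` as a hypothetical helper name:

```sh
# Hypothetical helper: reports whether a module config file is enabled,
# i.e. reachable through mods-enabled, and which file it resolves to.
module_conf_status() {
    conf_name=$1
    apache_root=${2:-/etc/apache2}
    link="$apache_root/mods-enabled/$conf_name"
    if [ -e "$link" ]; then
        echo "enabled: $(readlink -f "$link")"
    else
        echo "not enabled"
    fi
}

# Typical use:
#   module_conf_status mpm_prefork.conf
```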
Submitted by andreychek on Tue, 04/26/2016 - 19:36 Comment #62
That looks excellent. Yeah, your memory usage isn't too bad, when the system is recently rebooted or the services restarted. It seems to be running into periods of high resource usage though... so the goal of what we're doing is to make sure it can withstand those periods of time (and also, prevent over-usage).
Let's keep an eye on that and see how things go for a few days. Let us know what you see!
Submitted by aremdee on Tue, 04/26/2016 - 20:09 Pro Licensee Comment #63
Good idea and many thanks for your help. Will keep an eye on things. At the moment we are running at Real memory: 4.85 GB total / 3.15 GB free / 2.95 GB cached Swap space: 2.07 GB total / 1.80 GB free, which I think is slightly better than an hour ago.
Submitted by andreychek on Tue, 04/26/2016 - 20:18 Comment #64
After a few days, we can see how things look. At that point, if you aren't experiencing problems, we can review how much RAM is being used, and look into increasing the RAM MySQL is using, and/or increasing the max connections Apache is allowed to use.
Submitted by aremdee on Tue, 04/26/2016 - 22:26 Pro Licensee Comment #65
Have also just stopped mailman with no apparent problems (forms on websites seem to still send through ok). So we are now cruising at - Real memory: 4.85 GB total / 3.01 GB free / 2.81 GB cached Swap space: 2.07 GB total / 1.88 GB free. I know it isn't a couple of days yet but hey this is looking promising. If swap remains low for the whole day I'll be stoked.
Submitted by aremdee on Sun, 05/01/2016 - 23:20 Pro Licensee Comment #66
Happy to report that swap is less. It's not 0% but runs from around 10% to 50% but hasn't maxed out for a while (currently 44%). So this has helped but hasn't completely eliminated swap usage. MySQL is running at 2.20GB
mysql 34326 0.9 2.6 2310636 135768 ? Sl Apr27 69:55 _ /usr/sbin/mysqld --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib/mysql/plugin --user=mysql --log-error=/var/log/mysql/error.log --pid-file=/var/run/mysqld/mysqld.pid --socket=/var/run/mysqld/mysqld.sock --port=3306
The only errors I have had was a notification that services had shutdown and restarted after a few minutes. This was on Sunday morning and was regarding 10 sites. I couldn't find any corresponding errors in the Apache logs so maybe it was just an aberration with the service monitor.
Submitted by andreychek on Sun, 05/01/2016 - 23:55 Comment #67
Hmm, Sunday morning is the log rotation, it's possible that it was noticing that Apache was restarting a few times during the early hours.
What does "free -m" show now?
And can you share your "ps auxwf" output again? Thanks!
Submitted by aremdee on Mon, 05/02/2016 - 00:57 Pro Licensee Comment #68
> free -m
            total      used      free    shared   buffers    cached
Mem:         4967      4710       256      2666        16      2784
-/+ buffers/cache:      1909      3057
Swap:        2119       990      1129
Submitted by aremdee on Mon, 05/02/2016 - 01:07 Pro Licensee Comment #69
And the result from ps_auxf
ps_auxf output
Submitted by andreychek on Mon, 05/02/2016 - 10:15 Comment #70
Hmm, and if you restart MySQL with "service mysql restart", what is the output of these two commands:
ps auxw | grep mysql
free -m
MySQL is using a bit more RAM than I'd expect on your system, especially since we've given it a config file designed to run in a smaller footprint.
But, it does indeed seem a bit better.
I'd also be curious to see the output of these commands:
free -m
service apache2 restart
free -m
That will be a view of how much RAM your server is using before and after Apache is restarted. If that makes a big difference, another option will be to switch away from using a PHP Execution Mode that stores processes in memory.
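When comparing the before and after output, the two numbers that matter most are the "used" figure on the "-/+ buffers/cache" line (memory held by programs, not by the page cache) and the "used" figure on the Swap line. The sketch below pulls out both; it assumes the older `free` layout shown in this thread (newer versions drop the "-/+ buffers/cache" line), and `mem_summary` is a hypothetical helper name.

```sh
# Hypothetical helper: reads "free -m" output on stdin and prints the
# memory used by applications (excluding buffers/cache) and the swap in use.
mem_summary() {
    awk '/^-\/\+ buffers\/cache:/ { apps = $3 }
         /^Swap:/                 { swap = $3 }
         END { printf "apps_used_mb=%d swap_used_mb=%d\n", apps, swap }'
}

# Typical use around a restart:
#   free -m | mem_summary
#   service apache2 restart
#   free -m | mem_summary
```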
Submitted by aremdee on Mon, 05/02/2016 - 17:20 Pro Licensee Comment #71
ps auxw | grep mysql output
root    31542 0.0 0.0  4336  756 ?       S   08:17  0:00 sh -c (ps auxw | grep mysql) 2>&1
root    31543 0.0 0.0  4336  104 ?       S   08:17  0:00 sh -c (ps auxw | grep mysql) 2>&1
root    31545 0.0 0.0 11128  976 ?       S   08:17  0:00 grep mysql
root    33919 0.0 0.0  4336  288 ?       S   Apr27  0:00 /bin/sh /usr/bin/mysqld_safe
mysql   34326 0.9 2.3 2310636 119252 ?     Sl  Apr27 81:34 /usr/sbin/mysqld --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib/mysql/plugin --user=mysql --log-error=/var/log/mysql/error.log --pid-file=/var/run/mysqld/mysqld.pid --socket=/var/run/mysqld/mysqld.sock --port=3306
free -m output
            total      used      free    shared   buffers    cached
Mem:         4967      4457       509      2348       102      2818
-/+ buffers/cache:      1536      3431
Swap:        2119       570      1549
Will do the other shortly.
Submitted by andreychek on Mon, 05/02/2016 - 17:23 Comment #72
Could you share the above info again, but restart the MySQL service first?
You can restart it with the command "service mysql restart".
Submitted by aremdee on Tue, 05/03/2016 - 01:36 Pro Licensee Comment #73
Have restarted the Apache service. The free -m output beforehand was the same as earlier; the output after the restart is below.
            total      used      free    shared   buffers    cached
Mem:         4967      1392      3574       167       110       685
-/+ buffers/cache:       596      4370
Swap:        2119       360      1759
Will do the MySQL restart later.
Submitted by aremdee on Tue, 05/03/2016 - 01:46 Pro Licensee Comment #74
> ps auxw | grep mysql
root    30912 0.0 0.0  4336  768 ?       S   16:40  0:00 sh -c (ps auxw | grep mysql) 2>&1
root    30913 0.0 0.0  4336  104 ?       S   16:40  0:00 sh -c (ps auxw | grep mysql) 2>&1
root    30915 0.0 0.0 11128 1028 ?       S   16:40  0:00 grep mysql
root    33919 0.0 0.0  4336   44 ?       S   Apr27  0:00 /bin/sh /usr/bin/mysqld_safe
mysql   34326 0.9 2.6 2310636 132528 ?     Sl  Apr27 86:43 /usr/sbin/mysqld --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib/mysql/plugin --user=mysql --log-error=/var/log/mysql/error.log --pid-file=/var/run/mysqld/mysqld.pid --socket=/var/run/mysqld/mysqld.sock --port=3306
Restarted service
> ps auxw | grep mysql
root    31355 0.1 0.0  4336 1612 ?       S   16:41  0:00 /bin/sh /usr/bin/mysqld_safe
mysql   31762 11.3 1.9 599764 99200 ?       Sl  16:41  0:01 /usr/sbin/mysqld --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib/mysql/plugin --user=mysql --log-error=/var/log/mysql/error.log --pid-file=/var/run/mysqld/mysqld.pid --socket=/var/run/mysqld/mysqld.sock --port=3306
root    31818 0.0 0.0 21704 2484 ?       S   16:41  0:00 /bin/bash /etc/mysql/debian-start
root    31843 0.0 0.0  6248 1788 ?       S   16:41  0:00 xargs -i /usr/bin/mysql --defaults-file=/etc/mysql/debian.cnf --skip-column-names --silent --batch --force -e {}
root    31857 0.0 0.0 104412 4568 ?       S   16:41  0:00 /usr/bin/mysql --defaults-file=/etc/mysql/debian.cnf --skip-column-names --silent --batch --force -e select count(*) into @discard from `information_schema`.`PARTITIONS`
root    31869 0.0 0.0  4336  796 ?       S   16:41  0:00 sh -c (ps auxw | grep mysql) 2>&1
root    31870 0.0 0.0  4336  104 ?       S   16:41  0:00 sh -c (ps auxw | grep mysql) 2>&1
root    31872 0.0 0.0 11132 1028 ?       S   16:41  0:00 grep mysql
But forgot to do the free -m beforehand. Output after is below.
> free -m
            total      used      free    shared   buffers    cached
Mem:         4967      4795       172      2380        41      2783
-/+ buffers/cache:      1970      2996
Swap:        2119       918      1201
Hasn't made any difference to usage from what I can see.
Submitted by aremdee on Wed, 05/04/2016 - 02:04 Pro Licensee Comment #75
Heads up. Swap is on the increase again.
> free -m
            total      used      free    shared   buffers    cached
Mem:         4967      4788       178      2491        19      2744
-/+ buffers/cache:      2023      2943
Swap:        2119      1300       819
Submitted by andreychek on Wed, 05/04/2016 - 10:54 Comment #76
Okay, try this:
free -m
# Restart Apache
free -m
# Restart MySQL
free -m
I have a sneaking suspicion one or both of the above two services are contributing to the issue you're seeing; the above should help us understand which.
Submitted by aremdee on Wed, 05/04/2016 - 17:13 Pro Licensee Comment #77
Swap has dropped over night. Do you want me to run these tests now or later tonight when swap will probably grow?
Submitted by andreychek on Wed, 05/04/2016 - 17:37 Comment #78
Only run the commands above if/when there is a problem.
Submitted by aremdee on Thu, 05/05/2016 - 22:43 Pro Licensee Comment #79
It's doing it again.
> free -m
            total      used      free    shared   buffers    cached
Mem:         4967      4212       754      2074        63      2401
-/+ buffers/cache:      1747      3219
Swap:        2119      1431       688
Restart Apache
> free -m
            total      used      free    shared   buffers    cached
Mem:         4967      1024      3942        75        68       434
-/+ buffers/cache:       521      4446
Swap:        2119       297      1822
Restart MySQL
> free -m
            total      used      free    shared   buffers    cached
Mem:         4967      3242      1724      1178        84      1996
-/+ buffers/cache:      1162      3804
Swap:        2119       209      1910
Submitted by andreychek on Thu, 05/05/2016 - 22:55 Comment #80
Okay, according to that, it's looking like an Apache related issue. Apache and the processes it's spawning are using a large amount of RAM.
There are two things we can do to help --
One, we can reduce how many processes it's allowed to spawn, as well as have it restart individual workers more often to ensure they aren't growing too large.
To do that, try setting the following in the mpm_prefork.conf file:
StartServers 5
MinSpareServers  5
MaxSpareServers 10
MaxRequestWorkers  30
MaxConnectionsPerChild  150
The other thing is something we can't help with as much: seeing if you can reduce the PHP modules and Apache modules that are being loaded.
Only you can do that though, as we don't know which ones are being used on your server.
You can see what Apache modules are being loaded in /etc/apache2/mods-enabled.
And the PHP modules are loaded in /etc/php5/conf.d.
If you find some you think you don't need, try temporarily removing them, restart Apache, and then ensure that your websites still work properly.
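A common rule of thumb for choosing MaxRequestWorkers is to divide the RAM you are willing to give Apache (and the PHP processes it spawns) by the typical size of one worker. The sketch below only does that arithmetic; the numbers in the example are made up, and `estimate_max_workers` is a hypothetical helper name.

```sh
# Rule-of-thumb sketch: how many workers fit into a given memory budget.
estimate_max_workers() {
    budget_mb=$1        # RAM set aside for Apache and PHP, in MB
    per_worker_mb=$2    # typical resident size of one worker, in MB
    if [ "$per_worker_mb" -le 0 ]; then
        echo "per-worker size must be positive" >&2
        return 1
    fi
    echo $(( budget_mb / per_worker_mb ))
}

# Example with made-up numbers: 2400 MB budget, 80 MB per worker
#   estimate_max_workers 2400 80    # prints 30
```

With FCGId in use, each busy worker may also keep a php5-cgi process alive, so the per-worker figure probably needs to include that as well.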
Submitted by aremdee on Thu, 05/05/2016 - 23:41 Pro Licensee Comment #81
Currently in mpm_prefork.conf we have
<IfModule mpm_prefork_module>
StartServers 5
MinSpareServers  5
MaxSpareServers 10
MaxRequestWorkers  40
MaxConnectionsPerChild  0
</IfModule>
Submitted by andreychek on Thu, 05/05/2016 - 23:47 Comment #82
Yup, understood! Replace those values with the ones I shared. That should help the issue you're experiencing.
Submitted by aremdee on Fri, 05/06/2016 - 00:09 Pro Licensee Comment #83
Have made those changes and restarted Apache. Will wait and see. Thank you.
Submitted by aremdee on Fri, 05/06/2016 - 21:24 Pro Licensee Comment #84
Received a low memory alert this morning at 4:03am. At 4:10am received another email stating free memory was ok again. When I saw these emails I tried to login to webmin only to be greeted by the "site not found" error message. The websites running on the server were ok, just webmin. I logged in via SSH and restarted webmin and all is good. Question, would low memory cause webmin to fail?
Submitted by andreychek on Fri, 05/06/2016 - 22:00 Comment #85
Hmm, what output do you receive if you run the command "dmesg | tail -50"?
Submitted by aremdee on Fri, 05/06/2016 - 22:20 Pro Licensee Comment #86
I get this
> dmesg | tail -50
[3173814.617639] [19207]   33 19207   93264     674    135    1805            0 apache2
[3173814.617640] [19208] 1056 19208   84108    3574    117      43            0 php5-cgi
[3173814.617642] [19211]   33 19211   93211     604    134    1810            0 apache2
[3173814.617643] [19215] 1056 19215   84119    3583    117      45            0 php5-cgi
[3173814.617645] [19220] 1056 19220   84119    3568    114      60            0 php5-cgi
[3173814.617646] [19223] 1056 19223   84108    3572    117      44            0 php5-cgi
[3173814.617648] [19224] 1056 19224   84108    3609    116       8            0 php5-cgi
[3173814.617649] [19226] 1056 19226   84108    3536    117      80            0 php5-cgi
[3173814.617650] [19227] 1056 19227   84108    3608    118       8            0 php5-cgi
[3173814.617652] [19233] 1056 19233   84108    3360    116     256            0 php5-cgi
[3173814.617653] [19236]   33 19236   93202     596    134    1805            0 apache2
[3173814.617655] [19241] 1056 19241   83112    2322    108     209            0 php5-cgi
[3173814.617656] [19245] 1056 19245   82917    2263    109     104            0 php5-cgi
[3173814.617658] [19248] 1056 19248   82917    1870    110     499            0 php5-cgi
[3173814.617659] [19253] 1056 19253   82903    2200    102     189            0 php5-cgi
[3173814.617660] [19255] 1056 19255   82984    2195    105     203            0 php5-cgi
[3173814.617662] [19258] 1056 19258   82984    2202    104     176            0 php5-cgi
[3173814.617663] [19262] 1056 19262   82984    2299    104      80            0 php5-cgi
[3173814.617665] [19263] 1056 19263   84108    3419    115     197            0 php5-cgi
[3173814.617666] [19266] 1056 19266   85068   10147    119      50            0 php5-cgi
[3173814.617667] [19273] 1056 19273   82984    1976    103     402            0 php5-cgi
[3173814.617669] [19282] 1056 19282   82550    1797    103     240            0 php5-cgi
[3173814.617670] [19304] 1056 19304   81897    1363     94      16            0 php5-cgi
[3173814.617672] [19317]  108 19317  218120   14155     76       9            0 mysqld
[3173814.617673] [19345]    0 19345   10589      82     26      25            0 cron
[3173814.617675] [19347]    0 19347    1084      22      7       0            0 sh
[3173814.617676] [19348]    0 19348   56505   11620     81      25            0 monitor.pl
[3173814.617678] [19401]    0 19401   38275    5906     79   14354            0 /usr/share/webm
[3173814.617679] [19441]    0 19441   42709   11830     86   12943            0 /usr/share/webm
[3173814.617681] [19483]   33 19483   93193     581    134    1807            0 apache2
[3173814.617682] [19489]   33 19489   93193     581    134    1807            0 apache2
[3173814.617684] [19490]   33 19490   93193     581    134    1807            0 apache2
[3173814.617685] [19491]   33 19491   93193     581    134    1807            0 apache2
[3173814.617687] [19492]   33 19492   93175     881    132    1834            0 apache2
[3173814.617688] [19493]   33 19493   93175     884    132    1834            0 apache2
[3173814.617689] [19494]   33 19494   93175     881    132    1834            0 apache2
[3173814.617691] [19495]   33 19495   93175     884    132    1834            0 apache2
[3173814.617692] [19497]   33 19497   93175     985    132    1834            0 apache2
[3173814.617693] [19498]   33 19498   93175     983    132    1834            0 apache2
[3173814.617695] [19499]   33 19499   93175     985    132    1834            0 apache2
[3173814.617696] [19500]   33 19500   93175     985    132    1834            0 apache2
[3173814.617697] [19501]   33 19501   93175     985    132    1834            0 apache2
[3173814.617699] [19502]   33 19502   93175     983    132    1834            0 apache2
[3173814.617700] [19503]   33 19503   93175     985    132    1834            0 apache2
[3173814.617701] [19504]   33 19504   93175     985    132    1834            0 apache2
[3173814.617703] Out of memory: Kill process 19441 (/usr/share/webm) score 13 or sacrifice child
[3173814.617737] Killed process 19441 (/usr/share/webm) total-vm:170836kB, anon-rss:47088kB, file-rss:232kB
[3176455.767563] traps: php5-cgi[12435] general protection ip:70eb29 sp:7ffdaadd3920 error:0 in php5-cgi[400000+7e2000]
[3183658.234774] traps: php5-cgi[29145] general protection ip:70eb29 sp:7ffd443fd170 error:0 in php5-cgi[400000+7e2000]
[3201785.253753] traps: php5-cgi[41364] general protection ip:70eb29 sp:7fff01e74260 error:0 in php5-cgi[400000+7e2000]
Submitted by andreychek on Sun, 05/08/2016 - 14:31 Comment #87
Yeah it is looking like you're still ending up in low memory situations. In addition to removing unneeded Apache and PHP modules, we could further tweak "MaxRequestWorkers". Is that currently set to 30?
And how many domains do you have on your server? I'm tempted to suggest switching your domains from FCGID to CGI, which will make it so that there aren't any processes being stored in memory, which may be contributing to the issue.
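The per-process table the kernel prints before an OOM kill can be totalled by process name to see who held the memory at that moment. This sketch assumes the task-dump column order of kernels from this era (total_vm, rss, nr_ptes, swapents, oom_score_adj, name), that rss is counted in 4 kB pages, and that process names contain no spaces; `oom_rss_by_name` is a hypothetical helper name.

```sh
# Hypothetical helper: reads dmesg text on stdin and prints
# "NAME RESIDENT_KB" per process name, largest first.
oom_rss_by_name() {
    awk '
        # Task-dump rows end with:
        #   uid tgid total_vm rss nr_ptes swapents oom_score_adj name
        NF >= 8 && $(NF-1) ~ /^-?[0-9]+$/ && $(NF-2) ~ /^[0-9]+$/ &&
        $(NF-3) ~ /^[0-9]+$/ && $(NF-4) ~ /^[0-9]+$/ && $(NF-5) ~ /^[0-9]+$/ {
            kb[$NF] += $(NF-4) * 4    # rss is in pages; 4 kB pages assumed
        }
        END { for (n in kb) printf "%s %d\n", n, kb[n] }
    ' | sort -k2,2nr
}

# Typical use on a live system:
#   dmesg | oom_rss_by_name
```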
Submitted by aremdee on Sun, 05/08/2016 - 18:47 Pro Licensee Comment #88
MaxRequestWorkers is currently set to 40 as per your instructions from last week. Can drop it lower if you think that will help. Also not too sure as to which Apache and PHP modules aren't needed. If I can generate a list can you recommend ones to disable? Also will changing websites from FCGID to CGI impact their speed?
Thanks again.
Submitted by andreychek on Sun, 05/08/2016 - 18:50 Comment #89
Well, let's start here -- can you double-check that you've implemented the settings mentioned in Comment #80 above?
Submitted by aremdee on Sun, 05/08/2016 - 19:08 Pro Licensee Comment #90
That should have been 30 by the way. Current mpm_prefork.conf is :
<IfModule mpm_prefork_module>
StartServers 5
MinSpareServers  5
MaxSpareServers 10
MaxRequestWorkers  30
MaxConnectionsPerChild  150
</IfModule>
Submitted by aremdee on Sun, 05/08/2016 - 19:13 Pro Licensee Comment #91
Also forgot to mention that we are currently running 69 virtual servers on the Debian box. Was thinking of swapping around and moving everything to the Ubuntu server, then the 33 currently on it to the debian server and swapping over the pro licence to ubuntu. Would this work do you think?
Submitted by aremdee on Mon, 05/09/2016 - 21:07 Pro Licensee Comment #92
Did the webmin update yesterday and had to restart webmin as it crashed. Noticed this morning that things were nice and stable. Real memory was getting up to around 45% and swap was lingering at 19%. However in the space of an hour, swap has increased to 56% and real memory has dropped to 37%. Must be something that is moving stuff from real memory to swap. Is there a way to increase swap on a running system? Maybe that will solve the problem.
Submitted by andreychek on Mon, 05/09/2016 - 23:58 Comment #93
You're seeing something fairly odd going on there... I'm not sure what it is. I haven't seen a system with 6GB of real memory, and 2GB of swap, constantly struggle to not run out of RAM.
Yes, you could add additional swap, but that could create some serious performance issues on your system there.
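For reference, swap can be added to a running system with a swap file, without repartitioning. The sketch below is one common way to do it and keeps the performance caveat above in mind; the path and size are placeholders, and `add_swapfile` with its DRY_RUN switch is a hypothetical helper, not an existing tool.

```sh
# Hypothetical helper: creates and enables a swap file.
# With DRY_RUN=1 it only prints the commands instead of running them.
add_swapfile() {
    swap_path=$1
    size_mb=$2
    run() {
        if [ "${DRY_RUN:-0}" = 1 ]; then echo "$*"; else "$@"; fi
    }
    run dd if=/dev/zero of="$swap_path" bs=1M count="$size_mb" &&
    run chmod 600 "$swap_path" &&
    run mkswap "$swap_path" &&
    run swapon "$swap_path"
}

# Preview first, then run as root:
#   DRY_RUN=1 add_swapfile /swapfile 2048
#   add_swapfile /swapfile 2048
# To keep the swap file across reboots, an /etc/fstab line such as
#   /swapfile none swap sw 0 0
# is the usual approach.
```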
You also asked if Ubuntu would make a difference... I'm unfortunately not sure yet (though, since that does come with a different kernel, I am somewhat interested in seeing if it makes a difference... but moving to a new distro is a big step).
Personally, I'd keep digging into the processes running on your server, and work on ways to reduce memory usage.
For starters, just to see if it helps, you could try changing "MaxRequestWorkers" to "20" rather than "30". And make sure that Apache is restarted afterwards.
This is somewhat extreme, but you could always setup a cron job to restart the Apache and MySQL processes once a day, which should keep memory usage down as well.
Submitted by aremdee on Tue, 05/10/2016 - 00:26 Pro Licensee Comment #94
I could just migrate virtual servers one at a time and see how it goes. Once the load is taken off the Debian server I might be able to balance things a little better. Tomorrow is patch Tuesday here in Oz (even though it's Wednesday). Might reboot the debian server at that time or Thursday morning to see if that settles things down a bit. Uptime is currently 40 days. Could also shut it off and assign more real memory to it.
Submitted by aremdee on Mon, 05/30/2016 - 01:34 Pro Licensee Comment #95
Ok so we are there again. I had thought that for the last two weeks that we were on top of memory usage. However today I received notification from the server that memory was down. About seven minutes later it notified me that all was good. Memory at the moment is running at
So, any idea as to what I can trim to see if that makes a difference? It must be something to do with Apache, as restarting that always drops usage dramatically. As I stated earlier, I could migrate sites to the Ubuntu server. It's not being stressed. So the question is, if I migrate just one virtual server at a time, is that found in "Virtualmin>virtual_server_name>server configuration>transfer virtual server"?
Thanks.
Submitted by andreychek on Mon, 05/30/2016 - 08:23 Comment #96
If the problem gets better when restarting Apache, that means it could be related to either Apache or PHP.
If you didn't already, my suggestion would be to review all of your Apache and PHP modules that you're using, and disable ones you aren't using.
I suppose an option that could help prevent this from occurring is to try restarting Apache each night from within cron. Maybe there is a misbehaving module that is using up too much memory, restarting Apache could help keep that under control.
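A nightly restart from cron could look like the fragment below. It is a sketch only: the file name and the 04:30 schedule are assumptions, and the full path to `service` is given because cron runs with a minimal PATH.

```
# /etc/cron.d/restart-apache  (hypothetical file name)
# m  h  dom mon dow user command
30   4  *   *   *   root /usr/sbin/service apache2 restart >/dev/null 2>&1
```

Picking a time when traffic is lowest keeps the brief interruption from being noticed.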
Submitted by aremdee on Mon, 05/30/2016 - 17:44 Pro Licensee Comment #97
I restarted Apache manually last night and yes, that settled things down a bit. A quick comparison of the Ubuntu and Debian Apache modules shows the same ones enabled and disabled. Not sure if there is anything there I can prune.
I'll ask again. If I want to migrate a virtual server to the Ubuntu box from the Debian box, do I do that through "Virtualmin>virtual_server_name>server configuration>transfer virtual server"? I'm thinking of moving some non-critical VS's to Ubuntu (it needs rebooting more often than Debian) and sharing the load a bit that way. I must be reaching some sort of limit on Debian with the way swap is behaving. It would also be interesting to see if Ubuntu starts behaving the same if I put more load on it. If it doesn't, then perhaps I'll transfer the Pro license to it and use the Debian box for a few critical sites.
Submitted by andreychek on Mon, 05/30/2016 - 17:53 Comment #98
Sure, we'd definitely suggest using the distro you feel most comfortable with. I personally have had excellent luck on Ubuntu, and haven't seen the issues you're describing. But it could be something with the particular hardware combination or other unusual issue.
To migrate to a new server, these instructions here can assist with that:
https://www.virtualmin.com/documentation/system/migrate
Submitted by andreychek on Mon, 05/30/2016 - 17:54 Comment #99
Oh, and you also may want to compare PHP modules, which you can do by looking in /etc/php5/conf.d/.
Submitted by aremdee on Mon, 05/30/2016 - 18:18 Pro Licensee Comment #100
Thanks for that. I don't want to migrate all domains, just one at a time though. There's a feature under Virtualmin>virtual_server>Server Configuration>Transfer Virtual Server. Will this transfer that domain to the other box? That way I can move them one at a time and see how things travel as I do so. Call it a work in progress.
Will also look into the PHP modules and do a comparison. Report back shortly.
Submitted by andreychek on Mon, 05/30/2016 - 18:29 Comment #101
Ah right, you did mention that earlier.
You could either modify the instructions for moving one domain at a time, or you could use the GUI option you mentioned, either would work fine.
Submitted by aremdee on Mon, 05/30/2016 - 18:30 Pro Licensee Comment #102
Mmmm can't find /etc/php5/conf.d/. The directory php5 only contains apache2, cgi, cli and mods-available directories, no files. Same on both servers.
Submitted by andreychek on Mon, 05/30/2016 - 18:41 Comment #103
Okay, how about this -- what is the output of this command:
dpkg -l 'php5-*'
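If you save that command's output from each server, a short script can do the comparison for you. This is only a sketch in Python: the function names are made up for illustration, and the sample package rows and version strings are invented, not taken from either server. It counts only rows whose status is "ii" (installed), so a removed package with leftover config ("rc") is ignored.

```python
from typing import Dict, List, Tuple


def installed_packages(dpkg_output: str) -> Dict[str, str]:
    """Map package name -> version for rows marked 'ii' (installed)."""
    packages = {}
    for line in dpkg_output.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[0] == "ii":
            name = fields[1].split(":")[0]  # drop an arch suffix such as ":amd64"
            packages[name] = fields[2]
    return packages


def compare_packages(
    left_output: str, right_output: str
) -> Tuple[List[str], List[str], Dict[str, Tuple[str, str]]]:
    """Return (only on left, only on right, packages whose versions differ)."""
    left = installed_packages(left_output)
    right = installed_packages(right_output)
    only_left = sorted(set(left) - set(right))
    only_right = sorted(set(right) - set(left))
    version_diff = {
        name: (left[name], right[name])
        for name in sorted(set(left) & set(right))
        if left[name] != right[name]
    }
    return only_left, only_right, version_diff


# Invented sample rows in the shape of `dpkg -l` output (not real data).
UBUNTU_SAMPLE = """\
ii  php5-cli    5.5.9-1  amd64  command-line interpreter
ii  php5-gd     5.5.9-1  amd64  GD module
ii  php5-mysql  5.5.9-1  amd64  MySQL module
"""
DEBIAN_SAMPLE = """\
ii  php5-cli    5.6.20-0  amd64  command-line interpreter
ii  php5-curl   5.6.20-0  amd64  CURL module
ii  php5-mysql  5.6.20-0  amd64  MySQL module
rc  php5-gd     5.6.20-0  amd64  GD module
"""

if __name__ == "__main__":
    only_ubuntu, only_debian, differing = compare_packages(UBUNTU_SAMPLE, DEBIAN_SAMPLE)
    print("Only on Ubuntu:", only_ubuntu)
    print("Only on Debian:", only_debian)
    print("Different versions:", differing)
```

With real data you would replace the two sample strings with the contents of the saved output files.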
Submitted by aremdee on Mon, 05/30/2016 - 18:45 Pro Licensee Comment #104
Ubuntu
Debian
Submitted by andreychek on Mon, 05/30/2016 - 18:49 Comment #105
Yeah that all does seem fairly normal.
Sorry I'm not sure what else the problem might be... you could certainly try migrating some domains to another server to see if that makes a difference.
Submitted by aremdee on Mon, 05/30/2016 - 19:16 Pro Licensee Comment #106
Tried to transfer one virtual server over to ubuntu but it fails with this error : "Failed to transfer server : Failed to contact Virtualmin on destination system"
Submitted by andreychek on Mon, 05/30/2016 - 19:35 Comment #107
Is there anything that would be preventing ports 10000-10010 from being accessible from your Ubuntu server?
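One way to check is to attempt a plain TCP connection to each port from the other machine. The sketch below uses only Python's standard socket module; the function names are illustrative, and the hostname in the usage note is a placeholder. A successful connect only shows the port is reachable, not that Webmin will accept the login.

```python
import socket
from typing import Iterable, List


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def open_ports(host: str, ports: Iterable[int], timeout: float = 3.0) -> List[int]:
    """Return the subset of ports that accepted a connection."""
    return [port for port in ports if port_is_open(host, port, timeout)]
```

Run on the source server, something like open_ports("ubuntu.example.lan", range(10000, 10011)) would list which of the ports answered, where ubuntu.example.lan stands in for the destination's real name or IP.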
Submitted by aremdee on Mon, 05/30/2016 - 19:49 Pro Licensee Comment #108
Not that I'm aware of. Can access virtualmin from a browser on that port number. Just did a Google and am looking at swappiness. It's set to 60 on both systems. What if I reduced the amount on the debian box to 40 to see if that has an impact on swap size?
Submitted by andreychek on Mon, 05/30/2016 - 20:06 Comment #109
RAM or swap should not be the cause of the problem you're seeing... the error is indicating that Webmin on your server is having difficulty accessing Webmin on the remote server.
A way around that would just be to create a backup file of your domain, copy that backup file to your other server, and then restore it on the other server. That's essentially what the transfer option you're attempting to use does.
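As a rough sketch of those three steps, the Python below only builds the command lines and prints them; it does not run anything. The domain, file path and host are placeholders, and the flags shown (--domain, --all-features, --dest, --source) are how I understand Virtualmin's backup-domain and restore-domain commands to work, so treat them as an assumption and check each command's --help output on your own system first.

```python
import shlex
from typing import List


def backup_command(domain: str, dest: str) -> List[str]:
    """Command to run on the source server (flags assumed, verify with --help)."""
    return ["virtualmin", "backup-domain", "--domain", domain,
            "--all-features", "--dest", dest]


def copy_command(path: str, remote_host: str, remote_path: str) -> List[str]:
    """Copy the backup file to the destination server over SSH."""
    return ["scp", path, f"root@{remote_host}:{remote_path}"]


def restore_command(domain: str, source: str) -> List[str]:
    """Command to run on the destination server (flags assumed, verify with --help)."""
    return ["virtualmin", "restore-domain", "--domain", domain,
            "--all-features", "--source", source]


if __name__ == "__main__":
    # Placeholder values for illustration only.
    domain = "example.com"
    backup_file = "/root/example.com.tar.gz"
    steps = [
        backup_command(domain, backup_file),
        copy_command(backup_file, "ubuntu.example.lan", "/root/"),
        restore_command(domain, backup_file),
    ]
    for step in steps:
        print(shlex.join(step))
```

Printing rather than executing keeps this safe to try; once the commands look right they can be pasted into a root shell on the appropriate server.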
Submitted by aremdee on Mon, 05/30/2016 - 20:08 Pro Licensee Comment #110
Just used telnet to port 10000 and it connected. So that's not the issue. Could it be that root access is needed yet by default root is denied to external connections?
Submitted by andreychek on Mon, 05/30/2016 - 20:21 Comment #111
Root logins shouldn't be denied by default... but if root isn't able to log in from your Ubuntu server, that could cause the problem you're experiencing.
Submitted by aremdee on Mon, 05/30/2016 - 20:44 Pro Licensee Comment #112
I think that root is denied from external logins. I can't SSH in using the root account, I have to log in using another account with sudo access on the Ubuntu box. I can then change to root once logged in (which I rarely do). So can I backup and restore using the LAN IP addresses of both boxes rather than their external facing IP addresses?
Submitted by andreychek on Mon, 05/30/2016 - 21:35 Comment #113
You would need to temporarily set a password for root, if you wish to use the Transfer Virtual Server feature in the GUI.
The other option is to just generate a Virtualmin backup of that domain, copy it, and then restore it on the other server.
As far as what IP address you can use to access the remote system -- I unfortunately don't know the answer to that, it would depend on your network architecture. However, you could certainly try using the internal LAN IP to see if that works, I don't believe there is any kind of need to use an external IP.
Submitted by aremdee on Mon, 05/30/2016 - 21:45 Pro Licensee Comment #114
Yep have managed to get root access remotely working. Can log in from the shop but the transfer still fails. The issue I can see from the logs is that the ip address of the debian box resolves to a different hostname rather than the one it should. Using the DNS on the windows box appears to be the problem in this instance. If I can get it not to do a DNS lookup for that ip address I reckon it would work. Is there a way to whitelist an ip address in SSHD that you know of?
Submitted by andreychek on Mon, 05/30/2016 - 22:05 Comment #115
You really might want to consider just making a backup and restoring that on the other server, that's likely to be a much simpler process.
You shouldn't need to whitelist any IP addresses in SSH though. However, you could always try SSH'ing to the Debian server from the Ubuntu one just to ensure that works.
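If the suspicion is that an IP address resolves back to the wrong hostname, that can be checked directly from the machine doing the lookup. This is a minimal sketch using Python's standard socket.gethostbyaddr; the function names and the addresses in the usage note are illustrative, and the lookup is passed in as a parameter only so the check can be exercised without touching real DNS.

```python
import socket
from typing import Callable, List, Tuple


def reverse_names(ip: str) -> List[str]:
    """Return the hostname and aliases the resolver gives for an IP, or []."""
    try:
        name, aliases, _addresses = socket.gethostbyaddr(ip)
    except OSError:
        # socket.herror and socket.gaierror are both subclasses of OSError.
        return []
    return [name] + list(aliases)


def check_reverse(
    ip: str,
    expected: str,
    lookup: Callable[[str], List[str]] = reverse_names,
) -> Tuple[bool, List[str]]:
    """Return (whether expected is among the reverse names, the names found)."""
    names = [name.lower().rstrip(".") for name in lookup(ip)]
    return expected.lower().rstrip(".") in names, names
```

For example, check_reverse("192.0.2.10", "debian.example.lan") run on the Ubuntu box would show which name its resolver actually returns for that (placeholder) address.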
Submitted by aremdee on Mon, 05/30/2016 - 22:34 Pro Licensee Comment #116
Nailed it. Working now. Transfer seems to be more elegant as it backs up and restores with one click on the new system without user intervention. Like it. Will now proceed to move a few sites over and see how things pan out. Will keep you posted.
Submitted by aremdee on Tue, 05/31/2016 - 00:39 Pro Licensee Comment #117
Not sure what effect this will have but I have also set swappiness = 30 instead of the default 60. Different sites give different takes on this option: some say to leave it as is, while others say that on systems with sufficient RAM, lowering it may be beneficial to the overall performance of the system. After changing it swap usage has dropped with a corresponding increase in RAM usage. However it's not a dramatic climb nor decrease, swap dropped from 75% to 63% and RAM has increased from 38% to 45%. Hopefully there will be no implications from doing this. Tomorrow I'm intending to move 14 sites off and run them on the Ubuntu server. May have to increase RAM on the Debian server too, perhaps another 5GB which will bring it to 10 in total. Have to shut the VM down to do this so am a little loath to do it.
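To track those percentages over time without relying on a dashboard, the figures can be computed from /proc/meminfo. The sketch below is an assumption about how one might do it, not how Webmin calculates its numbers: it treats "used" RAM as MemTotal minus MemAvailable (falling back to MemFree), and the sample values are invented to land on round percentages.

```python
from typing import Dict


def parse_meminfo(text: str) -> Dict[str, int]:
    """Parse /proc/meminfo-style text into {field: value}, values mostly in kB."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts:
            values[key.strip()] = int(parts[0])
    return values


def used_percent(total_kb: int, free_kb: int) -> float:
    """Percentage of total that is not free; 0.0 when total is zero."""
    if total_kb == 0:
        return 0.0
    return 100.0 * (total_kb - free_kb) / total_kb


def memory_summary(text: str) -> Dict[str, float]:
    """Return RAM and swap usage percentages from meminfo text."""
    info = parse_meminfo(text)
    ram_free = info.get("MemAvailable", info["MemFree"])
    return {
        "ram_used_pct": round(used_percent(info["MemTotal"], ram_free), 1),
        "swap_used_pct": round(used_percent(info["SwapTotal"], info["SwapFree"]), 1),
    }


# Invented sample values (not taken from either server).
SAMPLE_MEMINFO = """\
MemTotal:        5000000 kB
MemFree:          500000 kB
MemAvailable:    2750000 kB
SwapTotal:       2000000 kB
SwapFree:         740000 kB
"""

if __name__ == "__main__":
    print(memory_summary(SAMPLE_MEMINFO))
```

On a live system the text would come from reading /proc/meminfo, and the current swappiness value can be read the same way from /proc/sys/vm/swappiness.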
Submitted by aremdee on Tue, 05/31/2016 - 02:10 Pro Licensee Comment #118
Damn, that really did nothing in real terms. Swap is now at 88%. Going to have to restart Apache. But before I do here is the output from ps auxf
Submitted by aremdee on Tue, 05/31/2016 - 02:12 Pro Licensee Comment #119
And here is the output after the restart.
Submitted by andreychek on Tue, 05/31/2016 - 07:47 Comment #120
Yeah there really seem to be a lot of PHP processes hanging around from the FCGID caching.
You really might want to try setting some domains to use CGI rather than FCGID. Yes, FCGID is generally a bit faster due to that caching, but it also uses more memory :-)
I personally set most of my servers to use CGI.
That's an option you can change in Server Configuration -> Website Options.
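To decide which domains to switch first, it can help to total up how much resident memory each user's PHP processes are holding. The sketch below parses saved ps aux (or ps auxf) output; it assumes the standard column layout where RSS is the sixth column in KiB, matches any command containing "php", and uses invented sample rows rather than the output posted above.

```python
from typing import Dict, Tuple


def php_memory_by_user(ps_output: str) -> Dict[str, Tuple[int, int]]:
    """Map user -> (number of PHP processes, total RSS in KiB)."""
    totals: Dict[str, Tuple[int, int]] = {}
    for line in ps_output.splitlines()[1:]:  # skip the header row
        fields = line.split(None, 10)  # the 11th field is the full command
        if len(fields) < 11:
            continue
        user, rss_kb, command = fields[0], fields[5], fields[10]
        if "php" in command and rss_kb.isdigit():
            count, total = totals.get(user, (0, 0))
            totals[user] = (count + 1, total + int(rss_kb))
    return totals


# Invented sample rows in the shape of `ps auxf` output (not real data).
SAMPLE_PS = r"""USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         1  0.0  0.1  28000  4000 ?        Ss   May01   0:05 /sbin/init
site1     2001  0.5  1.2 300000 60000 ?        S    01:00   0:02  \_ /usr/bin/php5-cgi
site1     2002  0.4  1.1 300000 55000 ?        S    01:01   0:01  \_ /usr/bin/php5-cgi
site2     2101  0.2  0.9 290000 45000 ?        S    01:02   0:01  \_ /usr/bin/php5-cgi
"""

if __name__ == "__main__":
    ranked = sorted(php_memory_by_user(SAMPLE_PS).items(), key=lambda item: -item[1][1])
    for user, (count, rss_kb) in ranked:
        print(f"{user}: {count} processes, {rss_kb / 1024:.1f} MiB resident")
```

RSS counts shared pages once per process, so the totals overstate real usage somewhat, but the ranking is still a reasonable guide to which sites hold the most memory.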
Submitted by aremdee on Tue, 05/31/2016 - 19:21 Pro Licensee Comment #121
Good spot. Am in the process of changing them all. As all sites are Wordpress sites, I can add the WP Super Cache plugin on all of them to speed things up if need be (although this can present problems with clients making changes that don't show up). If this doesn't help swap I still have a list of servers to move :-)
Will report back when done after 24 hours. Thanks for your patience.
Submitted by aremdee on Wed, 06/01/2016 - 17:51 Pro Licensee Comment #122
Brilliant. Memory is sitting at 23% and swap at 12%. Been pretty rock solid all the time. Trade off may be a bit higher CPU usage but apart from that no out of memory error messages at all. I actually switched three sites back to FCGID yesterday as they were ones that had a fair amount of traffic. Also haven't rebooted the server like I said I would, it's been up and running for 62 days.
Many thanks.
Submitted by andreychek on Wed, 06/01/2016 - 18:11 Comment #123
That's great to hear, thanks for letting us know how things are going!
Submitted by aremdee on Mon, 06/13/2016 - 22:28 Pro Licensee Comment #124
All has been running sweetly, however this morning it all happened again: the server went into read-only mode. This time I grabbed some screen shots of what was going on as I was trying to get it fixed. Thankfully it does respond well to self repair but I just don't know what the root cause is. Those screen captures are now on the server and can be found at the link below. Seems it happened at around 23:00 last night as the next entry in the logs was after I restarted the server. The last log entry for last night is "Jun 13 23:10:30 debian postfix/anvil[59011]: statistics: max connection count 1 for (submission:168.103.85.18) at Jun 13 23:00:49". Not sure if this means anything though. The next entry is "Jun 14 11:40:51 debian rsyslogd: [origin software="rsyslogd" swVersion="8.4.2" x-pid="627" x-info="http://www.rsyslog.com"] start". The backup of the server started last night at 22:46 and ran for 13 minutes. The Ubuntu server was backed up after this and started at 22:59. Can't see any logs that indicate what may have transpired.
https://www.rdweb.com.au/DebianOutput14-6-2016_1.png
For the rest of them just substitute the last digit with 2 through 7. Not sure if I got enough screen captures, if not will have to wait till next time.
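Since logging stops once the filesystem goes read-only, the longest silence in the log is a fair estimate of when it happened. The sketch below finds that gap in traditional syslog-format lines. It assumes the timestamp is the first 15 characters, that month names are in English, and that the year is supplied by the caller because syslog lines omit it. The first and last sample lines are the two quoted above; the middle one is made up for contrast.

```python
from datetime import datetime, timedelta
from typing import Iterable, Optional, Tuple


def parse_stamp(line: str, year: int) -> Optional[datetime]:
    """Parse a leading 'Mon DD HH:MM:SS' syslog timestamp, or return None."""
    try:
        return datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")
    except ValueError:
        return None


def largest_gap(
    lines: Iterable[str], year: int
) -> Optional[Tuple[timedelta, datetime, datetime]]:
    """Return (gap, last entry before it, first entry after it) for the longest silence."""
    previous = None
    best = None
    for line in lines:
        stamp = parse_stamp(line, year)
        if stamp is None:
            continue  # skip lines without a recognisable timestamp
        if previous is not None:
            gap = stamp - previous
            if best is None or gap > best[0]:
                best = (gap, previous, stamp)
        previous = stamp
    return best


SAMPLE_LOG = [
    "Jun 13 23:10:30 debian postfix/anvil[59011]: statistics: max connection count 1 "
    "for (submission:168.103.85.18) at Jun 13 23:00:49",
    "Jun 13 23:10:31 debian example[1]: made-up line for contrast",
    'Jun 14 11:40:51 debian rsyslogd: [origin software="rsyslogd" swVersion="8.4.2" '
    'x-pid="627" x-info="http://www.rsyslog.com"] start',
]

if __name__ == "__main__":
    gap, before, after = largest_gap(SAMPLE_LOG, 2016)
    print(f"Longest silence: {gap}, from {before} to {after}")
```

Feeding it the lines of the full syslog file instead of the sample list would show whether the silence really began around 23:10 or whether other logs kept writing for longer.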
Submitted by andreychek on Mon, 06/13/2016 - 22:37 Comment #125
That generally means that something is occurring on the disk to cause a problem.
I understand that this is a VPS, but could there be a bad sector on the host that's causing you some trouble? I think something along these lines is the most likely issue.
You're looking at some kind of very low level problem there though. Either there's a hard drive problem, or potentially a kernel problem or bug.
If there's some way you could run a disk scan on the host, that would be my recommendation as to where to start.
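Inside the guest, it is also possible to spot the read-only state as soon as it happens by watching /proc/mounts. The sketch below is an illustration only: it looks at entries whose device starts with /dev/ and reports those carrying the ro mount option. The sample text is invented, though the option names mirror the EXT4 remount line in the dmesg output earlier in this thread. Note that errors=remount-ro is a policy setting and is present even while the filesystem is still writable.

```python
from typing import List, Tuple


def read_only_mounts(mounts_text: str) -> List[Tuple[str, str, str]]:
    """Return (device, mount point, fs type) for /dev/ filesystems mounted read-only."""
    result = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue
        device, mount_point, fs_type, options = fields[:4]
        # Compare whole options so "errors=remount-ro" is not mistaken for "ro".
        if device.startswith("/dev/") and "ro" in options.split(","):
            result.append((device, mount_point, fs_type))
    return result


# Invented samples in the format of /proc/mounts.
HEALTHY = "/dev/sda1 / ext4 rw,relatime,errors=remount-ro,data=ordered 0 0\n"
REMOUNTED = (
    "/dev/sda1 / ext4 ro,relatime,errors=remount-ro,data=ordered 0 0\n"
    "tmpfs /sys/fs/cgroup tmpfs ro,nosuid,nodev,noexec,mode=755 0 0\n"
)

if __name__ == "__main__":
    print("healthy:", read_only_mounts(HEALTHY))
    print("remounted:", read_only_mounts(REMOUNTED))
```

Run from cron against the real /proc/mounts, a non-empty result could trigger an email, which would at least pin down the time of the next occurrence.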
Submitted by aremdee on Tue, 06/14/2016 - 00:01 Pro Licensee Comment #126
I've checked the drive for errors but there don't appear to be any. Event viewer doesn't show any hardware failures either. Usually any underlying hard drive issue will show up there.