Hi,
I've currently got one server running Virtualmin Pro, and I'd like to get a second box to use as backup mail/DNS and possibly a hot-spare web server.
Has anyone got any advice on how this might work: best configurations, synchronisation, etc.?
Hey Chris,
The <i>very first</i> Virtualmin installation three years ago was a fully redundant configuration. It wasn't easy, but it's gotten easier since then. ;-)
There are at least three orthogonal issues you have to deal with:
[ol]
[*]Failure detection and takeover[/*]
[*]Synchronization of configuration and data[/*]
[*]Notification of problems[/*]
[/ol]
Webmin has help for you on a couple of these, though you need an extra software component.
The first issue is failover. For this, I used Heartbeat, and I have zero complaints about it. It works beautifully, simply, reliably, and it has a nice Webmin module. You can get it here:
http://linux-ha.org/download/index.html
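To give a flavor of it, a basic two-node Heartbeat setup is driven by a couple of small text files. Roughly something like this (hostnames and the shared IP are made-up examples, and you also need a shared /etc/ha.d/authkeys on both nodes):

# /etc/ha.d/ha.cf (identical on both nodes)
keepalive 2
deadtime 30
bcast eth0
node primary.example.com
node spare.example.com
auto_failback on

# /etc/ha.d/haresources -- the primary normally owns the shared IP and Apache
primary.example.com 192.168.1.100 httpd

When the primary stops answering, the spare brings up 192.168.1.100 and starts the listed services; when the primary comes back, auto_failback hands them back.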
Synchronization of configuration and data can be done in roundabout ways with Webmin...but I think the ideal choice is rsync. You can configure it to run over ssh with a public key on the sending machine, so it can happen via a cron job. On the system I set up, I ran rsync every hour.
This is probably an area we ought to address in Webmin, but it's a pretty wide-open task and there are dozens of ways to go about it, depending on your focus. It's also easy to screw things up on the secondary machine--many configuration files are specific to the box, even if the boxes are identically configured and are expected to behave identically after a failover. Network details can't change, for example. I believe I had two or three subdirectories of /etc that I excluded from my rsync process, and otherwise included all of /etc, all of /home, and the /var/named directory (depending on OS and configuration, this may be in /etc or elsewhere).
I'll think about doing something like this for Webmin. For a very specific purpose (like a Virtualmin hosting server), it's probably easy enough to select the right bits and then use the remote function API to copy the right stuff over on a schedule, possibly using rsync to reduce the size of transmissions (now we're getting fancy!).
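For anyone who wants a concrete starting point, here's a rough sketch of the kind of hourly rsync-over-ssh job I mean, run from the spare's crontab (the hostname, paths and exclude list are made-up examples; the host-specific files you need to exclude will differ per OS and setup):

#!/bin/sh
# Pull config and data from the primary onto this hot spare over ssh.
# Run from cron, e.g.:  0 * * * * /usr/local/sbin/sync-from-primary.sh
PRIMARY=primary.example.com

# /etc, minus a few host-specific bits (network config, ssh host keys)
rsync -az --delete \
  --exclude '/sysconfig/network-scripts/' \
  --exclude '/ssh/' \
  -e ssh root@$PRIMARY:/etc/ /etc/

# user data and DNS zone files
rsync -az --delete -e ssh root@$PRIMARY:/home/ /home/
rsync -az --delete -e ssh root@$PRIMARY:/var/named/ /var/named/

Note that --delete plus a wrong exclude list can clobber the spare's own host-specific files, so test against a scratch machine before trusting it.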
The final bit is notifications, since you need to know about failover events so you can fix whatever went wrong on the primary. I believe the System and Server Status module covers this nicely. Both servers can monitor each other and send notifications in the event of trouble. Once, when I didn't want to use Heartbeat, I used this module all by itself to detect failure and take over the IP automagically. It worked just fine, but it seems like a "when all you've got is a hammer" kind of solution. ;-)
This would be a good area for a meta-module (like Virtualmin...stepping out of the traditional role of Webmin modules that address only one service or configuration file and actually perform actions on many modules in order to ease a specific task--more complicated to write and much more specific in function, but more useful in the circumstances for which it was written). Maybe some third party will step up to the plate and tackle this one. It's a big job, though, and Jamie and I have our work scheduled out for at least two or three months.
Chris/Joe,
Any movement on DRBD as a possible solution to mirror two Virtualmin servers? Just wondering.
Thanks!
...this thread is pushing 3 years old.
It would be ideal if fail-over and load-balancing could be achieved in the context of Virtualmin. Of course there are manual options as outlined in this post...but it seems virtualmin is very, very close to a clustered, load-balanced, fail-over solution.
A how-to a bit more fleshed out would be totally awesome and raise the bar even higher on an already outstanding piece of software.
I agree -- I have one client that asked me the very same thing just the other day.
Hi Joe,
Seems like Heartbeat and DRBD (http://www.drbd.org/) might be a good combination for creating a failover pair.
I'll have to find some time myself to set up a couple of test servers.
Hey Chris,
DRBD looks like a great tool. Scary, but really cool.
I've not yet needed 1-3 minute resync times, but that's pretty darned impressive, and I'll definitely look into it more. It'd be problematic to implement for Webmin/Virtualmin in a generic way, since it seems intimately tied to Linux (so we couldn't ship it as a standard component for all Virtualmin users)...but it's darned cool nonetheless.
So, yes, it looks like this might be a nice mechanism for synching your user data, especially if you need that kind of resync time. Careful of your network settings, and any other host-specific files, though, as it seems like it wouldn't handle exclusions the way rsync does. Be sure to let us know how it turns out!
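For anyone who wants to see what it looks like, a DRBD resource definition is fairly compact. Going by the docs, something along these lines (the resource name, device, hostnames and addresses here are all made up):

# /etc/drbd.d/home.res -- mirror the block device backing /home
resource home {
  protocol C;                       # synchronous replication
  device    /dev/drbd0;
  disk      /dev/sdb1;              # backing partition on each node
  meta-disk internal;

  on primary.example.com {
    address 192.168.1.10:7789;
  }
  on spare.example.com {
    address 192.168.1.11:7789;
  }
}

The filesystem on /dev/drbd0 then only gets mounted on whichever node Heartbeat makes primary, so host-specific files outside that device still need separate handling.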
Yes, with DRBD and Heartbeat you can set up an HA solution for Virtualmin, including MySQL, mail, etc.
And it works well!
That sounds great - is there a "howto?" Or perhaps a rough outline?
Thanks!!!
I'm interested in this also. I've had a look at the DRBD page and it looks like Chinese to me.
I think I know the answer to this already but it might be worth getting an official answer...
If we're running Virtualmin Pro in such an arrangement, how many server licenses are required?
I talked with Joe - he explained to me that for a "hot swap" the requirement is one license, but for multiple servers with fail-over capability AND load balancing you'd need a valid license for each load-balanced server.
Joe also mentioned that a "hot-swap" feature was under consideration as part of the setup for Virtualmin. I am really hoping that this sort of feature is added to Virtualmin, as it would make it all the more attractive to the enterprise market.
Besides, fail-over and load balancing are essential for high-traffic sites IMHO. There have been several suggested methods in this thread (DRBD for one) and different techniques for fail-over (like putting parts of the filesystem on redundant NFS and having two Apache web servers serve the Virtualmin content).
I am not sure what best practice is here - hopefully more interest in this kind of feature set will encourage the developers to enhance the already powerful cluster tools to this level of functionality.
In any case, when I have the time, I plan on experimenting with this:
http://wiki.centos.org/HowTos/Ha-Drbd
and this:
http://howtoforge.com/high_availability_loadbalanced_apache_cluster
But my time resources are very limited...and there are considerations to deal with (like ensuring that Virtualmin is only accessed on the "master" machine(s), possible issues with DNS/IP, and so on).
Anyways - hopefully this thread will congeal into some sound solutions.
If Virtualmin Pro had an option to run redundant servers / load-balancing servers that could take over in case one machine crashed, I sure would invest in the needed amount of licenses faster than you could say PayPal ;)
<div class='quote'>If Virtualmin Pro had an option to run redundant servers / load-balancing servers that could take over in case one machine crashed, I sure would invest in the needed amount of licenses faster than you could say PayPal ;)</div>
I've talked about this a few times in the forums.
It's definitely not the fear of not making any money on it that prevents us from tackling it.
The problem is that "load balancing" is meaningless without scaling of the whole stack, and it's pretty much impossible to scale an application without the application being designed to scale. Hot spare <i>is</i> something that we can provide, and we will, in the not-too-distant future...this is probably not more than two or three months away, actually.
But let's talk load balancing and scaling, since it does come up so much:
Let's first note that a simple HTML website doesn't need load balancing. Apache, without any performance tweaks at all, can serve several hundred HTML requests per second on any modern hardware. Say, 500 requests/second, to be conservative...that's 43 million requests per day. Google and Yahoo serve more than that...you and I don't. Since that's the case, we're talking about scaling applications, not simple web data.
For example, if we needed to scale Virtualmin.com (which is a Joomla+FlySpray+a few custom components installation) to several machines, we would have many problems to solve, and load balancing would be the easiest of all of them.
We'd have to solve the database problem first, because nothing works without the database. So, we need multiple backend databases with replication. OK, so the simplest solution is one-write/many-read databases. So, now we have to make Joomla (and all of its wildly varying applications) use the database in a way that allows it to work in a one-write/many-read environment. We can inject memcached into the chain as a stopgap, but we still have to make the app know about the new database arrangement. So, three months' worth of code spelunking, and we can make Joomla and its apps aware of multiple databases.
But how would Virtualmin solve that? It simply can't--no tool can. We can't modify the 85 applications we install automatically, and we certainly can't modify <i>your</i> custom applications to work against multiple databases. We can, and already do, manage cluster tables within MySQL--you can already create redundant databases in Virtualmin today. But a lot of the other stuff is going to be specific to your application and your deployment.
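Just to make the one-write/many-read piece concrete, the MySQL half of it is classic master/slave replication; a minimal sketch (server IDs, addresses and credentials are placeholders):

# master my.cnf
[mysqld]
server-id = 1
log-bin   = mysql-bin

# read-only slave my.cnf
[mysqld]
server-id = 2
read_only = 1

# then, on the slave:
#   CHANGE MASTER TO MASTER_HOST='192.168.1.10', MASTER_USER='repl', MASTER_PASSWORD='secret';
#   START SLAVE;

The replication itself is the easy half; the hard part, as above, is teaching the application to send its writes to the master and its reads to the slaves.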
So, even if we add a load balancing component, which we might (as I've mentioned a few times in the past, my previous company built scalability tools, and some of the Squid module in Webmin was developed for my old company, along with a half dozen other modules related to scalability and fault tolerance), scalability is a problem that no single product can solve. Your app has to be built from the ground up to scale, or refactored in significant ways in order to shoehorn scalability in.
The links above talk about the simple part--they assume that the backends are identical and static. Throw a real application into the mix (the only thing slow enough to <i>need</i> load balancing) and suddenly things get a lot more complicated.
So, let's break it into pieces when talking about these components, because they are very different problems...and some are much wider ranging than Virtualmin could address.
And, a hot spare of a bunch of simple websites is definitely something we will add to Virtualmin in the near future.
The hot-spare option would in fact be good enough for me - just a shame it's 2-3 months out, because that means I've got to work something out myself in the meantime.
My entire setup consists of pairs of servers - for every server I set up, I set up a twin server. I'd really like a hot-spare solution for the twin, so it is automatically duplicated at certain intervals and takes over when it needs to.
So, thinking about the hot-spare solution, it would be nice to be able to set up "pairs" or "twins" in Virtualmin so you could have cluster control of the hot spares too.
Just thinking out loud here...
If the only goal is the Apache side of it (web content, not mail, not DNS, and not _really_ database), where all you are doing is replicating home directories and databases directly...it seems like you could pull this off.
In my context, by far the primary need is a "hot spare." I am glad to hear that is coming down the pipeline. DRBD seems like a great candidate for creating "spares" (and it sounds like users have already been doing this...)
Not to be too simplistic - but if a method exists to replicate a Virtualmin server in the context of a "hot spare", it seems like the logical first step in solidifying not only fail-over but load balancing as well. That's why conceptually I personally tend to group these together.
From a usage perspective - if database-driven text content with lean/normal images is what you are serving, then load balancing is overkill except for sites getting millions of hits (a la Yahoo and Google). In our case we are serving large (200+ GB) .flv files, as well as many audio files from other hosted servers (10-20 MB MP3s each). Having load-balanced servers handling requests for files of this size will improve performance - media-rich environments (more and more common these days) will benefit from this.
If the Virtualmin team is going to tackle the "hot spare" feature (again - that is awesome), what is the next logical step to achieving a load-balanced environment for strictly port-80 requests? With what time I can spare (read: sleep I can go without) I'd like to research it and contribute. Maybe an itemized list of the issues for strictly port-80 web requests? If I am reading this correctly, it's the database?
Sincerely,
-Jeremy
I've been working on the "hot spare" concept.
Here is a general concept:
http://books.google.com/books?id=wNzltxkWiGAC&pg=PA75&lpg=PA75&a...
...from the "Linux Server Hacks" book. Basically a bash script that uses an rsync configuration file to copy everything from one server (or servers) to the other. I've modified the files to work >somewhat< in Virtualmin.
The current problem I have is that the user IDs on serverA are different from the user IDs on serverB, so files are copied over but the users do not exist correctly on serverB. It's kind of funky.
Right now I'm running a backup on serverA and then recreating it on serverB, making sure that users and UIDs/GIDs are created to match the originals. Then I think the rsync from serverA->serverB will work correctly...
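For what it's worth, that backup/restore step can be scripted with Virtualmin's command-line API; roughly something like this (paths are just examples, and the exact flags may vary by version, so double-check them against the command-line docs first):

# on serverA: back up all virtual servers with all features
virtualmin backup-domain --all-domains --all-features \
  --dest /backup/all-domains.tar.gz

# copy it across, then on serverB: recreate the domains and their users
# before letting rsync loose on /home
scp /backup/all-domains.tar.gz serverB:/backup/
virtualmin restore-domain --source /backup/all-domains.tar.gz \
  --all-domains --all-features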
After I've perfected the process, I'll post a few things if there are interested parties. In the meantime - it's the 2-3 month mark since the initial posting of the "hot-swap" feature request - so my labor may be in vain. I would not mind frankly...
Cheers!
-merlynx
I've not had the chance to spend time working on this solution. I was told months ago that the development team was putting this feature on their radar...that was nearly a year ago. There are tons of feature requests and bugs to iron out in such a complex tool as Virtualmin...I don't know if there is any deadline on such a feature.
I've experimented with DRBD - this solution takes a lot of foresight IMHO, but is an excellent one according to those implementing it.
I've gotten some great responses from this thread:
<a href='http://www.virtualmin.com/forums/virtualmin/clustering-virtualmin.html' target='_blank'>http://www.virtualmin.com/forums/virtualmin/clustering-virtualmin.html</a>
Including a sample/example implementation of fail-over that is live and working. It takes some significant time and tweaking to make it work (at least, for me), and I've not had the time to really implement the feasible solutions mentioned above. I believe either the DRBD solution or the shared-directory solution(s) would work for anyone - IMHO they are simply not easy to set up for the average Linux admin. Clearly, there is more than one way to skin this cat, but the solutions are currently outside the context of Virtualmin and depend on your own methods and administrative ingenuity.
Good luck!
There are a few things sort of simmering on this capability. VM2 is going public pretty soon, which gives an over-arching "one server manages many" capability--which is kind of necessary for marshaling resources for failover and other sorts of high availability.
Eric has been itching to work on documentation and processes for backup servers and such, and when the new website launches (meaning he's finished with a bunch of other docs I've got him working on this month), I'll set him loose on that, and he and I will work together on coming up with requirements for the major dev work that would need to go into it on Jamie's side. The process of documenting what goes into it will make it more apparent what kinds of things Virtualmin and/or VM2 and/or Webmin can do to make those tasks simpler, more automatic, and more fool-proof.
As I've mentioned in several threads on the topic, the really hard work will never be within Virtualmin's purview. If you want your applications to scale, then your applications have to be designed to scale...and Virtualmin can't automatically make them scale.
The things we can do, however, include things like MySQL replication, shared data via ZFS or GFS, IP takeover, and possibly load balancing via mod_proxy_balancer--all of which are challenging in their own right, and I know a lot of folks who've had a hard time with them. Those are all big projects, however, and they kind of all have to be designed together, or they won't fit together very well at the end of it all. (This is also a problem. If the infrastructure we design doesn't fit exactly with the way people want to do things, they won't use it. We always design a lot of flexibility into our products, but this is an area where the full stack has to be pretty precisely designed.)
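Just to illustrate that last piece, the Apache side of mod_proxy_balancer is only a few lines. A minimal sketch (the backend addresses are placeholders, and mod_proxy plus mod_proxy_balancer have to be loaded):

<Proxy balancer://webfarm>
    BalancerMember http://192.168.1.21:80
    BalancerMember http://192.168.1.22:80
</Proxy>
ProxyPass        / balancer://webfarm/
ProxyPassReverse / balancer://webfarm/

And that's the easy bit--everything behind it (shared storage, sessions, databases) is where the real design work lives.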
Also, I think I need to mention why this has remained a backburner project (beyond being really complicated to implement): you guys are <i>way</i> in the minority among our customers. There are maybe a dozen folks, out of a couple thousand, who have expressed an interest in clustering and failover and such...and none of our large hosting provider customers and potential customers have even mentioned it. You guys, and Eric and Jamie and I, think this stuff is cool because it's really interesting and fun to play with, and we talk about it quite a bit...but the market isn't demanding it.
I'm afraid we may have to make this a proprietary option that costs extra in order to make it feasible to build and maintain--would having to spend some extra money make this feature less appealing? And how much extra would be too much?
Right now, it's looking like, at a bare minimum, it's going to be a Virtualmin+VM2-only feature, so you'll be buying at least one license of each (Virtualmin can be used on a hot spare at no extra cost--though if you begin doing load balancing, we'll probably want you to buy another Virtualmin license). VM2 in this particular configuration will probably only be $198 (or free if you have more than five Virtualmin licenses). If an extra plugin, costing maybe another $98 or $198 or even more, were needed, would this be cost-prohibitive? (So, now we're talking about $296 or $396 or more on top of your Virtualmin licenses, in order to get high availability and load balancing.)
I don't know if the math will work out, even at the higher price I mentioned...but since we do know that it's a big project, requiring involvement of a lot of people and a lot of software (including a lot of packages not provided by a stock CentOS install), and the userbase is far more limited than for a lot of other capabilities, I do know we're going to have to figure out how to make it pay for itself. For some reason it never really occurred to me that we could just say, "Hey guys, if you want it, we'll build it, but we're going to want you to pay for it." We've always just thrown all new features into Virtualmin, for free, and assumed that the increased sales would make it worthwhile...but the more I do the math on this one, the more I realize this model is <i>not</i> going to work for clustering and high availability. ;-)
<b>alessice wrote:</b>
<div class='quote'>Yes, with DRBD and Heartbeat you can set up an HA solution for Virtualmin, including MySQL, mail, etc.
And it works well!</div>
Yes, that works very well, although it requires quite a sysadmin and webmaster skill set, a good test bed where you can try pulling all the cables out (one at a time, preferably :D), and then a very good datacenter capable of providing the required hardware. And lots of time to design, implement and test. But at the end of the trip, it works well.
I don't think that it's really Virtualmin Pro's task to manage such a setup, as it's highly dependent on the hardware configuration, and actually when you replicate/fail over redundantly, you are failing over Virtualmin Pro along with the server anyway.
But when you start this kind of setup, you start to balance loads, and quickly have many Virtualmin Pro instances (and licenses as well, which is really cool for the great Virtualmin team) to manage.
Now that clustering of servers is solved, our main hurdle is to manage and keep in sync all the configs of the Webmin and Virtualmin Pro instances around. We have only a few at this time, but if things go well, there might be ...more, ...hopefully.
Before going into clustering management, I believe that an architectural change is required in Virtualmin (not necessarily Webmin as a first step, although that would make sense as well). Here is an idea that might work:
- Having one (redundant/replicated through clustering) instance of VirtualMin Client, where all the web-frontend happens.
- Having one instance of VirtualMin Server per host
- "clustering" the VirtualMin Client with each of the VirtualMin Server that it manages
A little bit like you would do with X11 and NFS, to mention a couple of examples.
The single VirtualMin Client is the interface for admins and all customers of all servers, and keeps a single coherent database of sites. It also remotely administers the sites.
That would make migrating a single site from one server to another seamless, at least in the user interface, make balancing loads between servers in the cluster according to each site's load easier, integrate well with cloud computing offerings, and hugely simplify administration of systems.
Also, a tool in the main control panel to quickly see which site is suddenly eating up a lot of CPU or MySQL resources would be really nice. That would help load balancing as well...
Any hope that VM2 will solve that? :)
I don't want to comment on the prices you mentioned at this stage, as I'm not sure I understand what the final bill would be and what's required.
Can't edit my previous message, as I hit another bug in Fireboard.
It was just to say that my quote was wrong as well; I initially wanted to quote Joe on VM2 and clustering :D
Anyway, my reply holds :D
I would be very satisfied with a hot-swap solution.
A replication process that targets another Virtualmin system and literally copies everything from server A to server B. Something you can set up as a cron job and configure in the cluster interface or something like that. The secondary server, in my case, does not even need to be "on-line"; it is just a system ready to take the place of the primary *when*, not *if*, it goes down.
Including a load-balancing solution in this setup seems logical, but with the inner workings of it all, it is clearly not a simple matter.
I would like to try VM2 in this context, but in my testing, our older hardware is not really ready (IMHO) to handle a virtualized or para-virtualized (Xen) solution. It's just not that powerful and most of it is around 5 years old. I don't know, I've not looked into VM2 in detail, so perhaps I am wrong here.
I find it interesting that with an enterprise product, more people are not requesting this kind of functionality. I suppose the systems administrators who have this on their minds are the same ones who have the skill set to implement such a solution without a tool like Virtualmin making it simpler for them.
Thanks for your thoughts.
The client-server architecture could allow not only easy migration of sites, but also cloning between hosts, and replication/backup/archival purposes, and in that case be very efficient even on old hardware, using rsync or rsync diffs for backup + archival.
Just some thoughts on how this client-server feature could appeal to more customers: anyone with more than one server to manage with Virtualmin Pro ;)
You can start from here: http://www.linux-ha.org/DRBD
Ciao
>>>Bump<<<
Can we get a status update on the "hot swap" feature?
It would be awesome if some of the community members who have configured failover and load balancing could build a document that got officially ratified in some way...we could refer to it here, and let me tell you, I'd be willing to test-implement it in a "heartbeat", as I've been banging my blockhead reading documents about this...
Hey there!
I'd be interested in a hot spare environment too because customers keep asking me about this feature.
DRBD is very nice, but you have to establish this kind of solution manually, because there are no modules in Webmin/Virtualmin for active/passive "clustering" so far.
Any news on that?
Hi all... I managed to get this working quite nicely; it's been stable for a couple of weeks now and the failover works well so far..
HowTo on my blog.. http://safestream.net/blog/?p=1
Hope this is of use...
Wow, that's a pretty thorough writeup.
Glad to hear you got it working, and thanks for sharing your work! -Eric
Sorry, the blog moved.. here is the new link for the howto..
http://safestream.net/cms/
Doesn't work for me - is the site still available?
Sorry, I've been away for a while.. here is the link:
http://data-server.org/blog
Oh, and I solved the sync problem with system files using csync2, which is also provided by the developers of DRBD..
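In case it helps anyone, csync2 is driven by one small config file; something roughly along these lines (the hostnames and file list are just examples, not my exact setup):

# /etc/csync2.cfg
group vmin
{
    host web1.example.com web2.example.com;
    key  /etc/csync2.key_vmin;

    include /etc/passwd;
    include /etc/group;
    include /etc/shadow;
    include /etc/postfix;
    include /etc/apache2;

    exclude *~ .*;

    auto younger;
}

A cron job running "csync2 -x" on each node then keeps the listed files in sync.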
any CentOS equivalent?
Hello Hizar and others as well. I have started on a highly available and load-balanced setup for Virtualmin, and I am looking for guides and experience, so any URLs or links would help me a lot.
Hizar, I tried to reach your blog, but I am unable to view it at all. Can you please let me know how I can have a look at your howto?
Thanks in advance
with regards
Hi,
Would it be too crazy just to do a find and replace of the IP address?
So: first an rsync, and then,
for each original IP in a file, replace it with the hot-spare IP.
Or are there many other things to configure?
To me it looks like the only difference from the original server is the IP address.
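Something like this is roughly what I have in mind, run on the spare after the rsync (the addresses and directories are just placeholders):

#!/bin/sh
# Rewrite the primary's IP to the hot spare's IP in copied config files.
# Test on copies first!
OLDIP=203.0.113.10
NEWIP=203.0.113.11

grep -rl "$OLDIP" /etc/apache2 /etc/postfix /var/named 2>/dev/null |
while read f; do
    sed -i "s/$OLDIP/$NEWIP/g" "$f"
done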
Well, it comes down to the specifics of how you intend on doing all that -- but something along those lines could certainly work, sure. It's certainly worth pursuing -- just make sure to test it before having to rely on it ;-)
-Eric
I've just dropped the idea that I could get a master-master setup, so I'm now focusing on a master-slave setup.
This 'kfsmd' http://www.linux.com/archive/feature/124903 could be the trigger for the rsync script.
Maybe there is a way for the user to have a master-master setup via a config file in their home dir, to tell kfsmd what to monitor on the slave, with a daemon on the master to process that.
For now I've only quickly found a server-specific IP in the Apache config, but if someone has a list of all the files that contain IPs (or could potentially contain one), that would be nice. If I'm missing something, please let me know.
The main focus of this solution would be a very cheap and simple way to get acceptable redundancy. Again, the main focus is to reduce cost and to use the cheap VPS servers that are available.
So 100% reliability is not the aim, but rather something acceptable for a good price.
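If kfsmd turns out to be awkward, the same trigger idea can be done with inotifywait from inotify-tools; a minimal sketch (the watched path and target host are placeholders, and re-running a full rsync on every event is crude but cheap to set up):

#!/bin/sh
# Push /home to the hot spare whenever something changes in it,
# instead of waiting for the next cron run.
SPARE=spare.example.com

inotifywait -m -r -e modify,create,delete,move --format '%w%f' /home |
while read changed; do
    rsync -az --delete -e ssh /home/ root@$SPARE:/home/
done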
I recently started playing around with Linodes. It's been a lot of fun. I even bought a Cloudmin license to see how far it would go. I'd love to "run a cloud" with Cloudmin, but Amazon seems kinda pricey compared to $200 for a quad-core dedicated server. Linodes are great, but Webmin seems a better way to cluster them than Cloudmin.
Just my $.02
Hi,
Still after an easy solution.
I found http://code.google.com/p/lsyncd/ and a Python script for IPs:
import os, sys

def getFiles(dir):
    foundFiles = []
    if dir[-1:] == "/":
        dir = dir[0:-1]
    for x in os.listdir(dir):
        if os.path.isdir(dir + "/" + x):
            foundFiles.extend(getFiles(dir + "/" + x))
        else:
            # You can replace/comment out this check if you want to; it'll
            # only pick up "html" files if you don't.
            if x.endswith("html"):
                foundFiles.append(dir + "/" + x)
    return foundFiles

def dryrun(file):
    # Same as fixFile(), but with the writes commented out, so it only
    # reports which files would be changed.
    infile = open(file, "r")
    text = infile.read()
    if text.find(sys.argv[2]) != -1:
        #infile.close()
        #infile = open(file, "w")
        text = text.replace(sys.argv[2], sys.argv[3])
        print "Replaced strings in " + file
        #infile.write(text)
    infile.close()

def fixFile(file):
    # "r+" didn't work right here (it acted as append, not overwrite), so
    # the file is closed and reopened for writing. Hackish, but it works.
    infile = open(file, "r")
    text = infile.read()
    if text.find(sys.argv[2]) != -1:
        infile.close()
        infile = open(file, "w")
        text = text.replace(sys.argv[2], sys.argv[3])
        print "Replaced strings in " + file
        infile.write(text)
    infile.close()

# This program doesn't have a concept of input checking; you'd better know
# how to use it, don't look at me!
if len(sys.argv) != 4:
    print "Usage: %s directory oldstring newstring" % sys.argv[0]
    sys.exit(1)

files = getFiles(sys.argv[1])
for file in files:
    dryrun(file)  # switch to fixFile(file) to actually rewrite the files
If anybody has any experience with XtreemFS or GlusterFS, I would be interested in that.
regards,
These are excellent tips, thanks for the inspiration!
@jessec:
I'm using GlusterFS myself for various purposes. I use it on two NAS servers to basically provide my redundant SAN, both using RAID 5 arrays for the physical storage, with RAID 1 over GlusterFS.
My next feat is actually to attempt to set up Pacemaker to provide failover NFS access to the GlusterFS mounts on the two NAS servers, because GlusterFS itself doesn't support quotas.
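For anyone curious, the replicated volume itself is only a couple of commands (the hostnames and brick paths here are made up):

# on nas1, after both NAS boxes can see each other:
gluster peer probe nas2.example.com
gluster volume create sanvol replica 2 \
    nas1.example.com:/export/brick1 nas2.example.com:/export/brick1
gluster volume start sanvol

# clients then mount it with the native client (or NFS):
mount -t glusterfs nas1.example.com:/sanvol /mnt/san

The Pacemaker/NFS failover on top of that is the part that still needs the careful design.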
Has anyone done something similar in production for deploying an HA website?
In the lab, I set up two KVM web servers using Proxmox VE 1.9. The KVM images for both web servers are stored using a DRBD storage type, done following the Proxmox DRBD wiki.
On each KVM guest, I added a second hard drive of the same size. I then used the extra hard drive to create a DRBD block device to hold the web content served by both web servers. To allow concurrent access to the web content from either web server, I added a cluster file system called OCFS2 on the DRBD resource.
In front of the web servers I have another KVM VM running ZenLoadBalancer to divert web traffic in case one of the web servers dies.
All seems to work, but I still have to load-test both web servers to simulate real traffic. The other thing I still need to do is have Pacemaker handle the auto-mount of the DRBD resource drive. When I reboot the web servers there is an issue where apache2 fails to start during boot because the DRBD resource mounted on /home is not available yet. To get around this I currently use the Webmin module "System and Server Status" to monitor Apache and then have it start apache2.
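For the mount-ordering problem, the usual approach is to let Pacemaker own both the filesystem and Apache, with an order constraint between them. A rough crm shell sketch (resource names and paths are hypothetical, and a real dual-primary DRBD/OCFS2 stack also needs the DRBD master/slave resource plus cloned DLM/O2CB resources, which are left out here):

primitive p_fs_home ocf:heartbeat:Filesystem \
    params device="/dev/drbd0" directory="/home" fstype="ocfs2" \
    op monitor interval="20s"
primitive p_apache ocf:heartbeat:apache \
    params configfile="/etc/apache2/apache2.conf" \
    op monitor interval="30s"
colocation col_apache_with_home inf: p_apache p_fs_home
order ord_home_before_apache inf: p_fs_home p_apache

That way apache2 is only started once /home is actually mounted, and the System and Server Status workaround isn't needed.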