Tuesday, October 26, 2010

High Availability systems under Linux

文章閱讀筆記


#
What is HA?
High Availability is what it says it is.
Something that is Highly Available.

#
The service runs on a machine, and redirecting the service and requests to another healthy machine is the art of High Availability.

#
How will this work
if master node fails, then the slave node may take over its ip address and start serving the requests.This method is called IP takeover.

#
How do clusters talk
They will talk to each other over a serial cable and over a cross link Ethernet cable (for redundancy, serial cable or Ethernet cable may fail) and check each others heartbeat.
The program to monitor the heartbeats of the cluster nodes is called... guess...heartbeat.
heartbeat is available at http://www.linux-ha.org/download/
The program for ip address take over is called fake and is integrated in heartbeat.

#
What about data integrity issues
When service httpd moves from node1 to node2 it does not see the same data. I loose all the files that I was creating with my httpd CGI's.

Two Answers:
1. You should never write to file from your CGI's. (use a network database instead.. MySQL is pretty good)
2. You can attach the two nodes to a central external SCSI storage, and make sure that only one is talking to it at one time, and also make sure that you change the SCSI id of the host card on machine a to 6 and leave on machine b 7 or vice -versa.

You can run GFS (Global File System, see below in resources) over FC which allows you to have transparent access to the storage from all machines as if they were local storage.

#
What about active/active cluster

You can easily build an Active/Active server if you have a good storage system that allows concurrent access.
Examples are Fibrechannel and GFS.

If you are content with Network filesystems such as NFS, you may use that, but I would not suggest that.

Anyway, you can map serviceA to clustnode1 and serviceB to clustnode2 example of my haresource file

clustnode2 172.23.2.13 mysql
clustnode1 172.23.2.14 ldap
clustnode2 172.23.2.15 cyrus

I use GFS for storage so I don't have a problem with concurrent access to data and can run as many services as is manageable by these machines.
Here clustnode2 is the master for mysql and cyrus which clustnode1 is the master for ldap.
If clustnode2 goes down then clustnode1 takes over all the ip addresses and the services.

No comments: