linuxwebcluster.com

Frequently Asked Questions

 

Q: When I add a diskless cluster node and try to boot it, I get an error connecting to the cluster filesystem. It reads:

----------Error!-----------
Unsuccessful connection to cluster shared root environment.
There is probably something wrong with either the network connection
or the cluster filesystem. Check the network cables, switch, and
make sure that gluster-server is running on the master:
/etc/init.d/gluster-server status

A: Aside from the issues mentioned in the error message itself, the most common cause of this error is a fuse (Filesystem in USErspace) module on the client that no longer matches the running kernel. This can happen whenever the kernel changes, usually as the result of something like a yum update. The fuse module then has to be rebuilt against the new kernel. Make sure the current kernel source is installed and linked to /usr/src/linux, then change to the directory where you unpacked our software (by default, a directory named "distrib") and run the "rebuild.sh" script. In about 5 minutes, the fuse modules and cluster filesystem source code will have been rebuilt against the new kernel and a new boot image created.
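Before running the rebuild, it can help to confirm that /usr/src/linux really points at source for the running kernel. The following is only a sketch and is not part of the shipped software: the function name kernel_source_matches is hypothetical, and it assumes that the name of the source directory contains the kernel release string, as kernel source directories on CentOS / RHEL typically do.

```shell
# Hypothetical helper: succeed if the kernel source link points at a directory
# whose path contains the given kernel release string.
#   $1 = path to the link (for example /usr/src/linux)
#   $2 = kernel release   (for example the output of `uname -r`)
kernel_source_matches() {
    link="$1"
    release="$2"
    # A missing or dangling link can never match.
    [ -e "$link" ] || return 1
    # Resolve the link to the real directory it points at (GNU readlink).
    target=$(readlink -f "$link") || return 1
    case "$target" in
        *"$release"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Possible use, run from the directory that contains "distrib":
#   kernel_source_matches /usr/src/linux "$(uname -r)" && (cd distrib && ./rebuild.sh)
```

If the check fails, install or relink the kernel source first; rebuilding against the wrong source would be expected to produce the same boot error again.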

 


 

 

Q: How do I get support?

A: The cluster management software is commercially licensed. Trial versions are available that will run for 60 days. Please contact us for support while you are evaluating the software, and check the documentation on this website. For licensed customers, support is available either via annual contract or on hourly terms. On-site support may be possible for licensed customers. Please inquire if this would be helpful to you or your organization. Also, our work is only possible because of the great open source software we started with. Each project that is bundled with our software has free support options such as mailing lists and online documentation, and links to the project homepages can be found wherever the projects are mentioned on this website.


 

Q: Can I use 32 bit or 64 bit versions of Linux?

A: Either will work. If you use a 64 bit version of CentOS / RHEL, then ALL of your cluster nodes must be 64 bit capable. The cluster operates as a shared root where all nodes run the same operating system concurrently. If you use the standard 32 bit version, then 64 bit nodes will run fine but will not be able to address more than 4 GB of memory each.
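One way to check whether an x86 machine is 64 bit capable is to look for the "lm" (long mode) CPU flag in /proc/cpuinfo. This is a minimal sketch; the function name cpu_is_64bit_capable is hypothetical and it only applies to x86 hardware.

```shell
# Hypothetical helper: succeed if the cpuinfo file lists the "lm" flag.
#   $1 = cpuinfo file to read (defaults to /proc/cpuinfo)
cpu_is_64bit_capable() {
    awk '/^flags/ { for (i = 1; i <= NF; i++) if ($i == "lm") found = 1 }
         END { exit !found }' "${1:-/proc/cpuinfo}"
}

# Possible use on a machine you plan to add as a node:
#   cpu_is_64bit_capable && echo "64 bit capable" || echo "32 bit only"
```

The flag is compared as a whole word, so related flags such as lahf_lm do not count as a match.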

 


 

 

Q: How can I tell that my cluster is up and running?

A: Type the cluster IP address into a web browser. This should direct you to the web interface, where you can use the Status / Performance Graphs tab to view live information in Ganglia, or the Monitoring tab to see the state of the cluster monitoring system. If the web interface does not load, log in to the cluster through SSH or the console and make sure tomcat and terracotta are running:

service tomcat status
service terracotta status
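The two status checks can also be wrapped in a small loop that prints only the services that are not running. This is a sketch under one assumption: that the init scripts follow the usual convention of returning a non-zero exit status when the service is stopped. The function name failed_services and the STATUS_CMD override are hypothetical; the override exists only so the loop can be tried without a live cluster.

```shell
# Hypothetical helper: print the name of every listed service whose
# "status" action exits non-zero. Prints nothing if all are running.
failed_services() {
    for svc in "$@"; do
        # Uses `service` unless STATUS_CMD names a replacement command.
        ${STATUS_CMD:-service} "$svc" status >/dev/null 2>&1 || printf '%s\n' "$svc"
    done
}

# Possible use on the master:
#   failed_services tomcat terracotta
```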


Q: Does the cluster have centralized logging?

A: Yes. Lots of information is logged to /var/log/messages on the master server. This is the first place to look when you want to see what the cluster is doing. All client nodes log to the master as well, so an event on any node will appear in this file in real time. An easy way to keep tabs on what's happening is to leave the log file scrolling on the screen with the `tail` command:

tail -f /var/log/messages
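Because every node logs to the same file, it is sometimes useful to follow only one node. The sketch below assumes the traditional syslog line layout, in which the fourth whitespace-separated field is the originating host name. The function name lines_from_node and the node name node02 are hypothetical.

```shell
# Hypothetical helper: pass through only the syslog lines whose host field
# (fourth column) equals the given node name.
#   $1 = node name as it appears in the log
lines_from_node() {
    awk -v node="$1" '$4 == node'
}

# Possible use on the master:
#   tail -f /var/log/messages | lines_from_node node02
```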


Q: How can I watch the cluster load?

A: There are several ways. First, you can use the web interface under the Status / Performance Graphs tab. This gives good information about each node, including processes, memory usage, and network usage. For Java web applications, use the Terracotta Admin Console. This is a Java application that ships with Terracotta, and it reports web application information such as transactions / sec / node and cache hits. Start it by navigating to /cluster/terracotta/bin and running ./admin.sh. Another method is to use the 'top' utility just as with any other Linux server. This will show you which applications, if any, are stressing the server.



Q: How can I add or remove a node?

A: The easiest way is to log in to the web interface and use the Manage Nodes page. Just click the “Add Node” or “Remove Node” button and answer the prompts about what name and IP address to use. For more information, read the documentation on adding and removing nodes. Don't forget to add the node name to /cluster/http_nodes if you want it to process http connections.
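The last step can be scripted so that a name is never added twice. This is a sketch only, and it rests on an assumption that is not confirmed here: that /cluster/http_nodes holds one node name per line. Check the format of your own file before using anything like this. The function name add_http_node and the node name node03 are hypothetical.

```shell
# Hypothetical helper: append a node name to a node list file unless the
# exact name is already present. ASSUMES one node name per line.
#   $1 = node list file, $2 = node name
add_http_node() {
    file="$1"
    node="$2"
    # -x matches whole lines only, -F treats the name as a literal string.
    grep -qxF -- "$node" "$file" 2>/dev/null || printf '%s\n' "$node" >> "$file"
}

# Possible use on the master:
#   add_http_node /cluster/http_nodes node03
```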



Q: I want my master server to process http requests along with all the other cluster nodes. After adding it to /cluster/http_nodes, it still does not join the load balancing pool. Why not?

A: The master server does not turn off and on like the compute nodes, and only checks /cluster/http_nodes when load balancing services start. Either restart LVS like this (you'll need to restart your compute nodes afterward and let them rejoin, so don't do this in production):

service ipvsadm restart

or run the following command to immediately include the master in load balancing. First, ensure that your application is running correctly! For example, if the master's hostname is bigserver.local and the virtual IP address is 10.0.0.99, run:

/sbin/ipvsadm -a -t 10.0.0.99:80 -r bigserver.local -g 
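To confirm that the master actually joined the pool, the LVS table can be listed with `ipvsadm -L -n`, where each real server appears on a line beginning with "->". The sketch below scans such a listing for one real server. The function name has_real_server is hypothetical, and so is the master address 10.0.0.5 (the -n option prints numeric addresses, so the master's IP is needed rather than its hostname).

```shell
# Hypothetical helper: read an `ipvsadm -L -n` listing on stdin and succeed
# if it contains the given real server.
#   $1 = real server as IP:port (for example 10.0.0.5:80)
has_real_server() {
    awk -v real="$1" '$1 == "->" && $2 == real { found = 1 }
                      END { exit !found }'
}

# Possible use on the master:
#   /sbin/ipvsadm -L -n | has_real_server 10.0.0.5:80 && echo "master is in the pool"
```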

 



Q: When I click on Status / Performance Graphs in the web interface, it fails to open the new window or the window is blank.

A: This is most commonly caused by an incorrect /etc/hosts file. The web interface tries to open Ganglia on the IP listed there, and an incorrect IP will result in this particular problem. Another common cause is a pop-up blocker preventing the new window from opening.
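A quick way to see which address /etc/hosts gives for a name is to look it up directly and compare the result with the address really configured on the master. This is a minimal sketch; the function name hosts_ip_for is hypothetical, and it assumes the entry that matters is the one for the master's own hostname (bigserver.local is used here only as an example name).

```shell
# Hypothetical helper: print the first IP address that a hosts file maps
# to the given name, ignoring comments.
#   $1 = host name, $2 = hosts file (defaults to /etc/hosts)
hosts_ip_for() {
    awk -v name="$1" '
        { sub(/#.*/, "") }    # strip comments before looking at the fields
        { for (i = 2; i <= NF; i++) if ($i == name) { print $1; exit } }
    ' "${2:-/etc/hosts}"
}

# Possible use on the master:
#   hosts_ip_for bigserver.local
```

If the printed address is not the one the master is reachable on, correct the entry in /etc/hosts and reload the web interface.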



Q: How do I change the user or password on the web interface?

A: The user and password are specified in /cluster/tomcat/auth, a plain text file which is readable only by root. Change them by changing the values in this file.
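After editing the file, it is worth confirming that it is still closed to everyone but its owner, since some editors and copy operations reset permissions. The sketch below checks only the permission bits, using GNU stat; the function name no_group_or_other_access is hypothetical, and ownership by root would still need to be checked separately.

```shell
# Hypothetical helper: succeed if the file grants no permissions at all to
# group or other (for example mode 600 or 400).
#   $1 = file to check
no_group_or_other_access() {
    # GNU stat prints the octal mode, such as 600 or 644.
    mode=$(stat -c '%a' "$1") || return 1
    case "$mode" in
        ?00) return 0 ;;
        *) return 1 ;;
    esac
}

# Possible use as root on the master:
#   no_group_or_other_access /cluster/tomcat/auth || chmod 600 /cluster/tomcat/auth
```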



Q: What kinds of applications can I load balance on the cluster?

A: The cluster is designed to load balance web applications. In particular, it is pre-configured with Terracotta and Tomcat to be an exceptional Java application cluster back end. It can serve http applications right out of the box, and could in theory support any application based on IP connections. Because of the cluster filesystem, applications that are more CPU / memory intensive and less I/O intensive will perform the best.

An extremely I/O intensive application (like a database) will perform worse on the cluster than on a standalone server, and may not work at all. These types of applications are not designed to be load balanced by a connection-oriented IP load balancer, so they are not a good fit for load balancing on the cluster. However, they will still work fine for failover, and a two-node MySQL high availability cluster (note: high availability is NOT the same as load balancing) is a common use for ClusterMaker.



Q: What happens when my trial license expires?

A: The trial software is internally limited to a fixed time period. When your trial expires, the cluster will stop working and you'll see log messages stating that the trial period has expired. You can contact us to request an extension or purchase the full license.



 


Copyright 2010    RapidScale Clusters, LLC