I have had the opportunity to test the next release of Sun Cluster, the one that will support Solaris 10 zones. It looks rather impressive, in my opinion…
1. What do you need?
At least two machines, some shared storage, the Sun Cluster software, and the new “zones” Sun Cluster data agents.
2. What’s the main idea?
The main idea is to make it as transparent as possible that you are running your clustered application inside a Zone. This is achieved by letting the Sun Cluster software run in the global zones and perform its traditional job: global device management, volume management, private interconnect management, and so on. The to-be-clustered application itself is installed inside a Zone whose entire directory tree (the zonepath) resides on the shared storage. This Zone is started by one node at a time. New fault monitors detect zone failures and instruct the Sun Cluster framework either to restart the zone or to fail it over to the other node.
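As a sketch, such a zone configuration might look like this (the zone name and zonepath below are made-up examples, not taken from the product documentation):

```shell
# Illustrative only: define a zone whose zonepath sits on shared storage,
# so whichever node currently owns the storage can boot it.
zonecfg -z appzone <<'EOF'
create
set zonepath=/global/appzone   # directory tree on the shared storage
set autoboot=false             # the cluster, not the node, decides where it boots
commit
EOF
```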
3. What is this zone agent?
Up to now, checking the sanity of the application was done by means of specialized, Sun-provided Sun Cluster data agents, which provide the glue between the application and the Sun Cluster framework. The Sun Cluster software also checks the sanity of the nodes themselves. Now that the application runs in a Zone, we also need to monitor the Zone and take action at the Zone level instead of the node level. For instance, failing over to the other node involves stopping the zone on the first node, deporting and importing the device group, and starting the zone on the other node.
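Done by hand, that failover sequence would look roughly like this (the zone, device-group and node names are illustrative):

```shell
# On the node currently running the zone:
zoneadm -z appzone halt              # stop the zone
# Move the device group holding the zonepath to the other node:
scswitch -z -D appzone-dg -h node2
# On node2, where the shared storage is now imported:
zoneadm -z appzone boot              # start the zone there
```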
The other job of the zone agent is to monitor the application itself.
4. The zone boot component
The component will be registered and added to a Sun Cluster resource group to take care of the monitoring of the zone. It performs sanity checks to make sure that the zone is doing fine. If something is wrong with the zone, or if instructed to do so by Sun Cluster, it organizes the shutdown, restart or failover of the zone.
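The registration could be sketched with the classic Sun Cluster commands; the resource-type and script names below are pure assumptions, since the shipping agent may well name things differently:

```shell
# Hypothetical: wire a zone boot component into a resource group.
scrgadm -a -g zone-rg                      # create a resource group for the zone
scrgadm -a -t SUNW.gds                     # register a (generic) resource type
scrgadm -a -j zone-boot-rs -g zone-rg -t SUNW.gds \
    -x Start_command="/opt/zoneagent/bin/start_zone appzone" \
    -x Stop_command="/opt/zoneagent/bin/stop_zone appzone"
scswitch -Z -g zone-rg                     # bring the resource group online
```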
5. Application Monitoring in the zone
Two components can be used (registered and added to a Sun Cluster resource group) to monitor the application. The first one is very simple: we only have to supply the scripts that will be used to start, stop and monitor the application.
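A monitor script for this simple component could be as small as the following sketch (the daemon name is a placeholder; the convention is simply "exit 0 if healthy"):

```shell
#!/bin/sh
# Illustrative probe, run from the global zone: report the application's
# health to the framework through the exit code.
if /usr/bin/pgrep -z appzone mydaemon >/dev/null 2>&1; then
    exit 0    # process found inside the zone: healthy
else
    exit 1    # not found: let the cluster restart or fail over
fi
```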
The second one implements the integration with SMF, the Service Management Facility. SMF is capable of starting, stopping and monitoring an application (called a service). So we can benefit from the additional features of SMF just by telling the cluster which FMRI (Fault Management Resource Identifier) it needs to supervise. When bringing the resource online, the component starts the service through the SMF framework.
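For example, assuming a hypothetical FMRI, supervision through SMF boils down to commands like:

```shell
# Start the service via SMF and check its state (the FMRI is illustrative):
svcadm enable svc:/application/myapp:default
svcs -H -o state svc:/application/myapp:default   # "online" when healthy
```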
6. IP address
Unique IP address: where in the past Sun Cluster had to manage the failover of the application IP address between the two nodes with a resource of type “Logical Hostname”, this is now optional. If the configuration of the zone is identical on both nodes (very easy to accomplish), the “active” node boots the zone with its one unique zone IP address. This is possible because the zone will only be started by one node at a time.
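In a sketch, the address simply becomes part of the zone configuration, entered identically on both nodes (the NIC name and address below are examples):

```shell
# Illustrative: the zone carries its own IP; no Logical Hostname resource.
zonecfg -z appzone <<'EOF'
add net
set physical=bge0            # the same interface name must exist on both nodes
set address=192.168.10.50    # the zone's unique address, on whichever node boots it
end
commit
EOF
```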
Strong points:
- Easier installation and management: many applications have to be installed on each node of a typical cluster. Because zones provide an application environment that doesn’t depend on the underlying physical infrastructure, the application now just has to be installed in the Zone, which will happily be booted by each node. Bringing an application to Sun Cluster is now much easier.
- Taking advantage of the better security and resource-management flexibility provided by the Zones feature.
- Taking advantage of the self-healing (automatic service restart) provided by the SMF feature. This supposes that your application has already been converted into an SMF service. Many of them are or will be available by the time companies start to evaluate the product.