Grid Computing

From Ggl's wiki

Purpose

This page is an attempt to implement a virtual datacenter architecture based on a Linux grid.

Principles

The main principles are:

  • Using virtualization
  • Loosely coupled tiers
  • Distributed computing and storage
  • Optimized power consumption
  • Cheap but efficient high-availability

Implementation

I'll begin with a simple architecture with multiple layers:

  • Routing with Quagga or a Cisco emulator (but routing is somewhat out of the scope of this project)
  • L2-7 switching: you might think about a Cisco 650x, another switch chassis, or a bladecenter with an L2-7 card to do the job. My first thought was to use HAProxy, but in the end I'll start with UltraMonkey, which is based on LVS (Linux Virtual Server) and Heartbeat.
  • OS base: HLFS/BLFS + Debian packaging. I like the Debian tools (dpkg + aptitude), but I may take a look at Conary and also Smart. BTux might be an inspiration ;).
  • Virtualization with Xen, which is production grade, but I'll be closely watching KVM.
  • Front end and caching with HTTP server software such as Apache and the appropriate modules. For caching, Varnish seems to be a wise choice. As I want to use virtual appliances and ready-to-use Ruby on Rails and Django web framework platforms, I may use Swiftiply.
  • Web Server with Apache.
  • Database: MySQL and PostgreSQL. When I said loosely coupled earlier, I meant that you should not be stuck with a single database server or engine.
  • Filesystem: Lustre at first, but I'll be closely watching Ceph.
  • Storage: Linux iSCSI SAN or the cheaper ATA over Ethernet

Component                    Mainstream    Alternative
Load balancing               UltraMonkey   HAProxy
OS base                      LFS/BLFS      HLFS
Virtualization / hypervisor  Xen           KVM
Caching reverse proxy        Varnish       ?
Database                     MySQL         PostgreSQL
Filesystem                   Lustre        Ceph
Storage                      iSCSI         ATA over Ethernet

Ok, that's a lot of things to do! Maybe the most valuable parts in this work will be:

  • Automation
  • Centralized administration with configuration management software like Cfengine, Puppet or Bcfg2
  • Centralized monitoring with Zabbix or Centreon
  • Security

Roadmap

Where to begin?

There are three main parts:

  • cluster/grid management
    • job scheduler
    • monitoring
    • configuration management
    • security infrastructure
    • load balancing and high availability
  • host base
    • installer
    • OS foundations
    • configuration
  • VM appliances
    • configuration and creation
    • on-demand provisioning

Some docs on cluster/grid management:

  • HOWTO Torque/Maui - grid scheduler and resource manager
  • Cluster Resources Documentation
  • HOWTO Configure Gentoo Linux for Clustering
  • How To Set Up A Loadbalanced High-Availability Apache Cluster
  • LVS HA
  • Eddie
  • Ganglia

The first step is to have a simple virtualized system with:

  • a custom built host base (a master image)
  • a custom built guest virtual appliance base (a master)
  • iSCSI storage and a Lustre filesystem

Building the host base image

The host base is the foundation. Some work that's done here will be re-used for guests as well. Basically, the host provides hardware access to guests. It also manages common tasks that can be shared by guests.

Organisation

Just a quick overview of the organisation. We need build environments for host and guest appliances. As in a development process with continuous integration, it is good practice to build and test each modification made to the configuration. It is also a good idea to track modifications with an SCM. I'll use Hg (Mercurial); feel free to use another one.

  • A build machine with a directory for hosts and guests: a Debian system with the HLFS tools.
  • A testing machine with Xen that boots host and guest appliances once they are built.

I need a way to automate this, so I'll write a script (I might call it build_appliance, for example) that:

  • builds the HLFS system
  • makes the appliance image
  • copies it to the testing directory
  • launches the tests
  • stores the test logs
  • sends a report that mainly says whether it worked or not
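
The steps above can be sketched as a small pipeline runner. This is only a minimal sketch, not the actual build_appliance script: the function signature, the step names and the log layout are assumptions made for illustration, and the real build, image, copy and test commands have to be supplied by the caller.

```python
import subprocess
import time
from pathlib import Path


def build_appliance(steps, log_dir):
    """Run each (name, command) step in order and keep one log file per step.

    Stops at the first failing step. Returns (ok, report_text).
    The step names and commands are hypothetical and chosen by the caller.
    """
    log_dir = Path(log_dir)
    log_dir.mkdir(parents=True, exist_ok=True)
    lines = []
    ok = True
    for name, command in steps:
        started = time.time()
        result = subprocess.run(
            command, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
        )
        # Store the combined output so a failed build can be inspected later.
        (log_dir / f"{name}.log").write_text(result.stdout)
        status = "OK" if result.returncode == 0 else "FAILED"
        lines.append(f"{name}: {status} ({time.time() - started:.1f}s)")
        if result.returncode != 0:
            ok = False
            break
    # The last line is the short "did it work or not" summary.
    lines.append("RESULT: " + ("success" if ok else "failure"))
    return ok, "\n".join(lines)
```

A caller would pass a list such as `[("build_hlfs", [...]), ("make_image", [...]), ("copy_to_testing", [...]), ("run_tests", [...])]`, where each command is an argument list, and then mail or archive the returned report. Stopping at the first failure keeps the report short and avoids testing an image that was never built.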

OS Base

This is the starting point. I believe a secure-by-default base is valuable, so I'm not convinced that building a plain LFS is the right choice. Starting directly with HLFS avoids wasting time migrating from an LFS install later.

HLFS brings a hardened toolchain to LFS. Hardening compilation flags will be used with gcc, as well as a custom glibc. Although it talks about Fedora, this article lists security features that have to be included in the OS base. The kernel must also be patched with PaX/grsecurity.

Network

Each host has at least two network interfaces. The configuration covers VLANs, interfaces, IP addresses, netmasks, the default gateway and routes. The firewall policy is default drop: incoming and outgoing network connections must be explicitly authorized.
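
As an illustration of the default-drop policy, here is a minimal sketch that renders a ruleset in the format read by iptables-restore. The choice of iptables is an assumption (this page does not name a firewall tool), and the function name and the example ports are hypothetical.

```python
def render_ruleset(allow_in=(), allow_out=()):
    """Return an iptables-restore ruleset with a default-drop policy.

    allow_in / allow_out are iterables of (protocol, port) pairs,
    for example ("tcp", 22). Both lists are assumptions for illustration.
    """
    lines = [
        "*filter",
        # Default policy: drop everything that is not explicitly accepted.
        ":INPUT DROP [0:0]",
        ":FORWARD DROP [0:0]",
        ":OUTPUT DROP [0:0]",
        # Loopback traffic and packets of already accepted connections.
        "-A INPUT -i lo -j ACCEPT",
        "-A OUTPUT -o lo -j ACCEPT",
        "-A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT",
        "-A OUTPUT -m state --state ESTABLISHED,RELATED -j ACCEPT",
    ]
    for proto, port in allow_in:
        lines.append(f"-A INPUT -p {proto} --dport {port} -m state --state NEW -j ACCEPT")
    for proto, port in allow_out:
        lines.append(f"-A OUTPUT -p {proto} --dport {port} -m state --state NEW -j ACCEPT")
    lines.append("COMMIT")
    return "\n".join(lines) + "\n"
```

For example, `render_ruleset(allow_in=[("tcp", 22)], allow_out=[("udp", 123)])` would authorize incoming SSH and outgoing NTP only. Guest traffic that crosses the host would need explicit FORWARD rules, which this sketch leaves out.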

Default software and services

OpenSSH, NTP client, configuration management agent, HIDS agent, monitoring agent.
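
One way to keep this list honest is to check each built image against it. The sketch below is a hypothetical helper, not part of any existing tool: it maps each required role to a set of package names that could fill it, and reports the roles no installed package covers. The package names are placeholders to adapt to the actual distribution and to the agents finally chosen.

```python
# Role -> package names that would satisfy it (placeholder names, assumptions).
REQUIRED_ROLES = {
    "ssh server": {"openssh-server"},
    "ntp client": {"ntp", "openntpd", "chrony"},
    "configuration management agent": {"cfengine", "puppet", "bcfg2"},
    "hids agent": {"ossec-hids-agent"},
    "monitoring agent": {"zabbix-agent", "ganglia-monitor"},
}


def missing_roles(installed_packages, required=REQUIRED_ROLES):
    """Return the sorted list of roles that no installed package fulfils."""
    installed = set(installed_packages)
    return sorted(
        role for role, candidates in required.items() if not installed & candidates
    )
```

Such a check could run as one of the tests launched after the image is built, failing the build when the returned list is not empty.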

Filesystem

Lustre on an ATA over Ethernet or iSCSI SAN drive.
