Parallels Virtualization Engineer

Posted: 2010-08-17 18:53:00 by Alasdair Keyes


At Daily.co.uk we've recently released our Parallels Virtuozzo-based VPS systems. Parallels' container-based virtualization is a very low-overhead system which allows a very high ratio of VPS guest OSes per physical hardware node.

As part of this I attended a two-day course plus various online tutorials and exams and I am now a Parallels Certified Virtualization Engineer and Parallels Certified Automation Engineer.

Which is nice.

If you are using Parallels and wish to look at getting certified with their technologies you can check them out at http://www.parallels.com/uk/support/training/


If you found this useful, please feel free to donate via bitcoin to 1NT2ErDzLDBPB8CDLk6j1qUdT6FmxkMmNz

OS Deduplication with LessFS

Posted: 2010-07-18 16:41:36 by Alasdair Keyes


Working in hosting means that I often have to deal with large amounts of data, either as customer files on a shared hosting platform or as VHDs on a virtual platform.

Storing lots of data in any form is a pain. It's unwieldy, hard to back up and hard to move, and storage costs a lot: not only the money spent on disks but, if it's not done in a smart way, the cabinet space and the extra cooling and power for servers that exist just for their storage capacity. With the recent evolution of cheap SAN hardware with iSCSI networking, data storage for small companies is getting easier; however, the amount of storage being used by any company is always increasing faster than you'd like.

Many SAN providers offer deduplication, which can save a lot of space. However, do we really need to buy SAN technology to get the benefits of deduplication? Many companies like ours have lots of servers and lots of data which we would certainly like to slim down. I started investigating Open Source deduplication and found a couple of contenders.

First of all is lessfs, an Open Source deduplication filesystem written using FUSE. I thought I'd give it a go and it seems pretty good, so I thought I'd run a quick tutorial on how to use it. For this demo, I used Ubuntu 10.04 LTS x86_64. I chose this over my favourite, CentOS, as its software and libraries are far more current, which helps when installing dependencies. This was installed into a VMware VM with a 3GB ext3 / partition and a 10GB ext3 partition mounted on /data.

We could stop there, but it's worth investigating the power of deduplication and seeing how lessfs stacks up. lessfs creates a stats file in your mount at .lessfs/lessfs_stats, which shows how much space has been used to store each file and how much has been saved by its deduplication. Let's have a look at the file

root@dedup:~# cat /mnt/less/.lessfs/lessfs_stats
  INODE             SIZE  COMPRESSED_SIZE  FILENAME
      7                0                0  lessfs_stats

Let's check how much data is being used by lessfs at the moment

root@dedup:~#  du -hs /data
64M      /data

This space is just lessfs's database and other storage mechanisms. We'll write some data and see how it goes. Let's create a 200MB file full of zeros; this should be easily dedup'd as it's all identical.

root@dedup:~# dd if=/dev/zero of=/mnt/less/zero bs=1M count=200
root@dedup:~# ls -alh /mnt/less/zero
-rw-r--r-- 1 root root 200M 2010-07-16 19:40 /mnt/less/zero

You can see that the system sees it as using 200MB. What does lessfs think we've used?

root@dedup:~# head /mnt/less/.lessfs/lessfs_stats
  INODE             SIZE  COMPRESSED_SIZE  FILENAME
      7                0                0  lessfs_stats
      8        209715200             1628  zero

That's pretty good, 200MB stored in under 2K! But lessfs could be lying. As we know, the actual data is stored in /data, so how much has that grown?

root@dedup:~# du -hs /data
64M      /data

Absolutely nothing, it's looking good. It's worth storing some real-world data...

root@dedup:~# scp -r 192.168.1.1:/Aerosmith /mnt/less/

This will allow some dedup, but let's see what happens if we store the same data again in a different folder

root@dedup:~# scp -r 192.168.1.1:/Aerosmith /mnt/less/Aerosmith_copy
root@dedup:~# du -hs /mnt/less/Aerosmith/
249M    /mnt/less/Aerosmith/
root@dedup:~# du -hs /mnt/less/Aerosmith_copy/
249M    /mnt/less/Aerosmith_copy/

That's about 500MB we've copied to that filesystem, yet we're only using 308MB in total in the /data folder (and 64MB of that was there before we started)

root@dedup:~# du -hs /data/
308M    /data/

If we check the lessfs_stats file, we actually seem to see a slight increase in compressed size over the original size. I'm not sure if this is a calculation issue or one related to block size etc. However, you wouldn't expect much compression from MP3s anyway, as they are already compressed.

     45          3727130          3727159  08 - Cryin'.mp3
     46          2942524          2942547  05 - Don't Stop.mp3
     47          3301450          3301476  01 - Eat The Rich.mp3

The interesting bit is when we check the section related to the duplicate set of files...

    131          3727130                0  08 - Cryin'.mp3
    132          2942524                0  05 - Don't Stop.mp3
    133          3301450                0  01 - Eat The Rich.mp3

It is deduplicating perfectly. It has worked out that the files are copies, and the copies use little to no space on disk.
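To put a single number on the whole filesystem rather than reading the stats file row by row, the two size columns can be summed. This is only a minimal sketch: the function name lessfs_totals is made up for illustration, and it assumes the stats file keeps the four-column layout shown above (INODE, SIZE, COMPRESSED_SIZE, FILENAME).

```shell
# Sum the SIZE and COMPRESSED_SIZE columns of a lessfs_stats file.
# Prints "<logical bytes> <stored bytes>" on one line.
# lessfs_totals is a hypothetical helper name, not part of lessfs.
lessfs_totals() {
    # $1 = path to the stats file, e.g. /mnt/less/.lessfs/lessfs_stats
    awk '
        # Only data rows have numeric SIZE and COMPRESSED_SIZE fields,
        # so this test also skips the header line.
        $2 ~ /^[0-9]+$/ && $3 ~ /^[0-9]+$/ {
            logical += $2
            stored  += $3
        }
        END { printf "%.0f %.0f\n", logical, stored }
    ' "$1"
}
```

Running lessfs_totals /mnt/less/.lessfs/lessfs_stats before and after a copy gives a quick feel for how much of the new data was actually stored. It is still worth comparing against du -hs /data, as that is the space really consumed on disk.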

There are plenty of options available with lessfs. Tweaking the config file allows for encryption, dynamic defrag, transaction logging (to enable recovery after a crash) and different levels of compression through a choice of compression systems (BZIP, GZIP, Deflate etc).

As with anything like this, I wouldn't use it on data I couldn't afford to lose until I was very sure it was stable. However, it shows plenty of promise for what lies ahead in the world of data storage.


Apache Caching

Posted: 2010-04-18 15:51:10 by Alasdair Keyes


With the release of Apache 2.2, Apache's built-in caching modules (mod_cache, mod_disk_cache and mod_mem_cache) became "production ready", although I'm not too sure about that, as I'll discuss.

I implemented Apache's mod_disk_cache on a production server and it seemed to work very well. Essentially, each request is hashed into a file path on your web server. This is very useful if you serve content from a remote share such as NFS, as it allows the web server to skip all that unneeded network activity just to send the same 2KB image or CSS document 20,000 times. Compared to memory, disk is relatively slow, but by further comparison NFS or other remote shares will crawl if you're handling lots of small documents.

However, I did find out that WordPress sites have problems when mod_disk_cache is used. It seems to generate collisions: people visit different pages in your WordPress site but keep getting served the same page from the cache. Often this seems to be the RSS feed for that site.

At first I thought that I'd just have to increase the path depth and size (the CacheDirLevels and CacheDirLength directives). However, even when increasing these to the maximum and flushing the cache, the problems seemed to stay. As a result, I skipped the disk cache and stuck with mod_mem_cache.

Obviously memory is much, much faster than disk, although much less abundant: a modern server usually won't arrive with less than a 100GB disk, but unless you spend serious cash you won't have more than 8-16GB RAM. After implementing mod_mem_cache the problems with WordPress disappeared. The only explanations I can think of are that either mod_mem_cache wasn't caching the requests and they all went to the data store, or Apache's disk cache implementation didn't have enough path size/depth to avoid collisions, whereas its memory hashing didn't have the same path limitations and so was able to handle the requests successfully.

Unfortunately, there doesn't seem to be a way to view the requests stored in memory. When disk cache is being used you can look through the hashed locations on the filesystem /var/myapachecache/abc/def/123/345/.... but there's no such help when you're just using memory caching.
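When chasing a suspected collision in the disk cache, it helps to find which cache entry holds a given URL. The sketch below assumes that mod_disk_cache stores each entry's metadata in a file ending in .header and that the request URL appears in that file as readable text; check both against your own cache directory first. The function name find_cached_url is made up for illustration.

```shell
# List the cache header files under a cache root that mention a string.
# Assumption: entries have a "*.header" file containing the URL as text.
find_cached_url() {
    # $1 = cache root (e.g. /var/myapachecache)
    # $2 = fixed string to look for (e.g. part of a URL)
    # grep -a treats the (partly binary) header files as text,
    # -F matches the string literally, -l prints only file names.
    find "$1" -type f -name '*.header' -exec grep -l -a -F -e "$2" {} +
}
```

For example, find_cached_url /var/myapachecache /feed/ would list the entries that refer to a feed URL. If a page URL and the feed turn up in the same entry, that would fit the collision behaviour described above.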

Anyway, long story short, memory caching works wonders and, depending on your site, disk caching might work wonders too, but if you seem to get collisions, just ditch it.


Exim

Posted: 2010-03-07 23:40:34 by Alasdair Keyes


For the past 4 years or so, I've used Sendmail as my MTA of choice, no real reason for it except that it was the default on most Linux distros. However, over the same time, all the mail systems I've had to maintain professionally have been Exim. I'm in the process of migrating my dedicated server over to a Daily VPS solution... mainly because I created the whole system so I trust it :)

Anyway, I've decided that instead of using Sendmail, which is pretty horrid, clunky and not easily (in my opinion) extendible, I'd change my MTA to Exim and Dovecot.

Although MTAs are a necessary evil as email's not going anywhere... I hate them. I hate MTAs, I hate configuring them, I hate tweaking them, and although I think Exim's configurability is outstanding, it can be a real pain setting up a mail server, so I thought I'd outline how to set up a basic Exim mail server (under CentOS 5). As I only have ~20 domains which don't change very often, I've decided to stick with file-based mail configuration; you can use a SQL backend, but there's no need for it on my system. This setup will hold mailboxes and forwards and allow users to send mail. So...

Install exim and if necessary remove any other MTA on the machine (Sendmail,Postfix,etc). Also install saslauthd for authentication and dovecot for mail collection.

yum remove sendmail -y;
yum install exim dovecot saslauthd -y

Create a folder to hold the mail routing information on a per-domain basis. In this folder we will create files with the same name as the domains we wish to handle mail for and in each file we will place the mail routing information

mkdir /etc/exim/mail_configs
chown root:mail /etc/exim/mail_configs

Create the file /etc/exim/mail_configs/example.com to hold information for a domain example.com

al : al@localhost
* : :fail: Unknown User
group : al[at]gmail[dot]com,al[at]hotmail[dot]com

The above tells Exim to deliver al@example.com to the local user al, to forward group@example.com to a Gmail and a Hotmail address, and to reject all other addresses. That's all that's required for the per-domain setup; now we just have to configure Exim.

In /etc/exim/exim.conf change the following...

Tell Exim that the names of all the files in /etc/exim/mail_configs should be treated as the domains we handle locally, by changing

domainlist local_domains = @ : localhost : localhost.localdomain

to

domainlist local_domains = @ : localhost : localhost.localdomain : dsearch;/etc/exim/mail_configs

Tell Exim how to find local users, place this block as the first entry in Exim's router configuration

my_aliases:
      driver = redirect
      allow_defer
      allow_fail
      domains = dsearch;/etc/exim/mail_configs
      data = ${expand:${lookup{$local_part}lsearch*@{/etc/exim/mail_configs/$domain}}}
      retry_use_local_part
      pipe_transport   = address_pipe
      file_transport   = address_file
      no_more
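To see what the lsearch* lookup in that router does with a per-domain file, here is a rough imitation in shell: try the exact local part first, then fall back to the * entry. It is a simplified sketch and not Exim's own parser (it ignores continuation lines and quoted keys), and the function name lookup_alias is made up for illustration.

```shell
# Imitate the exact-key-then-"*" behaviour of the lookup on one
# per-domain file. lookup_alias is a hypothetical helper name.
lookup_alias() {
    # $1 = per-domain file (e.g. /etc/exim/mail_configs/example.com)
    # $2 = local part to look up (e.g. al)
    awk -v key="$2" '
        /^[ \t]*(#|$)/ { next }                  # skip comments and blank lines
        {
            pos = index($0, ":")                 # key ends at the first colon
            if (pos == 0) next
            k = substr($0, 1, pos - 1)
            v = substr($0, pos + 1)
            gsub(/^[ \t]+|[ \t]+$/, "", k)       # trim both sides
            gsub(/^[ \t]+|[ \t]+$/, "", v)
            if (!found && tolower(k) == tolower(key)) { exact = v; found = 1 }
            if (!have_default && k == "*") { fallback = v; have_default = 1 }
        }
        END {
            if (found) print exact
            else if (have_default) print fallback
            else exit 1                          # nothing matched at all
        }
    ' "$1"
}
```

With the example.com file above, lookup_alias /etc/exim/mail_configs/example.com al prints al@localhost, while an unknown local part falls through to the * entry and prints :fail: Unknown User. On a live server, exim -bt al@example.com asks Exim itself how it would route the address, which is the real test.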

Tell Exim to allow plaintext authentication when users send emails through the server. Enter this under the begin authenticators section of exim.conf

begin authenticators

PLAIN:
  driver                     = plaintext
  server_set_id              = $auth2
  server_prompts             = :
  server_condition           = ${if saslauthd{{$2}{$3}{smtp}} {1}}
  server_advertise_condition = ${if def:tls_cipher }

LOGIN:
  driver                     = plaintext
  server_set_id              = $auth1
  server_prompts             = <| Username: | Password:
  server_condition           = ${if saslauthd{{$1}{$2}{smtp}} {1}}
  server_advertise_condition = ${if def:tls_cipher }

Because we're using plaintext, force users who want to send mail to use TLS, otherwise they'll just get a relay denied error. Enter this under the main Exim config section

auth_advertise_hosts = ${if eq {$tls_cipher}{}{}{*}}

By default Exim will store messages in /var/mail/$user as a regular spool. I want to use Maildir storage, so change the local_delivery transport to use Maildir

local_delivery:
  driver = appendfile
  file = /var/mail/$local_part
  delivery_date_add
  envelope_to_add
  return_path_add
  group = mail
  mode = 0660

to

local_delivery:
  driver = appendfile
  directory = $home/Maildir
  maildir_format
  maildir_use_size_file
  delivery_date_add
  envelope_to_add
  return_path_add

That's Exim sorted. Now we tell saslauthd to look at the /etc/shadow file for authentication and not PAM. Edit /etc/sysconfig/saslauthd and change MECH=pam to MECH=shadow.
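If you are building more than one box, the MECH change can be scripted rather than edited by hand. This is a minimal sketch: the function name set_saslauthd_mech is made up for illustration, and it assumes the file contains a single line starting with MECH= as on the CentOS 5 install described here.

```shell
# Rewrite the MECH= line of the saslauthd sysconfig file in place.
# set_saslauthd_mech is a hypothetical helper name.
set_saslauthd_mech() {
    # $1 = mechanism to use (e.g. shadow)
    # $2 = config file, defaulting to the path used in this post
    conf="${2:-/etc/sysconfig/saslauthd}"
    tmp="$(mktemp)" || return 1
    # Write the edited copy to a temp file, then copy the contents back
    # so the original file keeps its owner and permissions.
    sed "s/^MECH=.*/MECH=$1/" "$conf" > "$tmp" && cat "$tmp" > "$conf"
    status=$?
    rm -f "$tmp"
    return $status
}
```

Calling set_saslauthd_mech shadow makes the same edit as above. Once saslauthd has been restarted, the testsaslauthd tool that ships alongside it should let you check a username and password against the smtp service before involving Exim.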

Finally tell Dovecot that we're using Maildir and not mbox. Of course this step isn't necessary if you want to use mbox. Edit /etc/dovecot.conf and set

mail_location = maildir:~/Maildir

Restart the lot

service exim restart;
service saslauthd restart;
service dovecot restart;


BASHing things up

Posted: 2009-11-24 01:20:23 by Alasdair Keyes


Anyone who uses Linux will most likely be familiar with BASH, the Bourne Again SHell. On the whole it is fantastic out of the box and requires no customisation to be useful; however, there are a few tiny things that niggle at me which I have finally looked into sorting, so I thought I might share...

If you work as part of a team to administer servers, you will no doubt have found that BASH's standard history logging can be a bit lacking. If two people are logged in under the same user (such as root), whoever logs out last will have their history saved to the ~/.bash_history file, and the user who logged out first has their changes wiped out. This is because when you log out, BASH saves your history by completely overwriting the file. This can be a great pain, especially when you need to log back in and look at what you did. To change this, set the following in your ~/.bashrc, which tells BASH to append your changes instead of overwriting. Be warned, you will have to keep an eye on the size of your history file.

shopt -s histappend

The second annoyance I have is that if your network connection to the remote machine dies or the box freezes/reboots, your last BASH session's history is not saved, as it only saves on logout. To get BASH to save after every command, add the following

PROMPT_COMMAND='history -a'
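Putting the two settings together with a cap on the history size gives a ~/.bashrc fragment along these lines. It is a sketch and not a drop-in: the HISTSIZE and HISTFILESIZE numbers are arbitrary example values, and HISTTIMEFORMAT is an optional extra that records when each command was run.

```shell
# Append to ~/.bash_history on exit instead of overwriting it (BASH only).
if [ -n "${BASH_VERSION:-}" ]; then
    shopt -s histappend
fi

# Flush each command to the history file as soon as the prompt returns,
# keeping any PROMPT_COMMAND that was already set and not adding it twice.
case "${PROMPT_COMMAND:-}" in
    *'history -a'*) ;;
    *) PROMPT_COMMAND="history -a${PROMPT_COMMAND:+; $PROMPT_COMMAND}" ;;
esac

# Cap the history so the appended file does not grow without limit.
HISTSIZE=10000            # commands kept in memory per session (example value)
HISTFILESIZE=20000        # lines kept in ~/.bash_history (example value)

# Optional: store a timestamp with each entry, shown by the history command.
HISTTIMEFORMAT='%F %T '
```

The timestamps are handy on a shared root account, as they give you a fighting chance of working out who ran what and when.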

I hope this helps those with similar issues.


cpan2rpm

Posted: 2009-09-12 18:46:56 by Alasdair Keyes


When developing new applications, I often find that I make use of CPAN (Comprehensive Perl Archive Network) http://search.cpan.org and use modules in applications that I write.

The standard way of installing modules is by using cpan from the command line

# cpan -i Perl::Module

This is fine when installing the module on just one server, but if you have a cluster or the module needs to be distributed to other machines, running cpan on each is time consuming and difficult to script, especially if the module needs to be installed automatically on newly built machines.

The answer for this is cpan2rpm. It holds no surprises: it takes a module from CPAN and builds an RPM for it, which you can then add to a YUM repository.

Download it here http://sourceforge.net/projects/cpan2rpm/ I'm going with the RPM as it makes installation so much easier

It has a number of dependencies, mostly perl modules, but also rpm-build, so make sure you have that installed

# wget -O cpan2rpm-2.027-1.noarch.rpm http://sourceforge.net/projects/cpan2rpm/files/cpan2rpm/2.027/cpan2rpm-2.027-1.noarch.rpm/download
# yum install rpm-build
# rpm -ivh cpan2rpm-2.027-1.noarch.rpm

There are many options for it; you can add author information and also sign the generated RPM, but for simple use that's not necessary. Something as simple as this will do

# cpan2rpm --no-sign Perl::Module
-- cpan2rpm - Ver: 2.027 --
Upgrade check
Fetch: HTTP

-- module: Perl::Module --
Found perl-module-1.00.tar.gz
...
...
...
+ exit 0
RPM: /usr/src/redhat/RPMS/noarch/perl-Perl-Module-1.00.noarch.rpm
SRPM: /usr/src/redhat/SRPMS/perl-Perl-Module-1.00.src.rpm
-- Done --
[1]+  Terminated              perl_module

You can then install the RPM on your machine

# rpm -ivh /usr/src/redhat/RPMS/noarch/perl-Perl-Module-1.00.noarch.rpm
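As the whole point is to avoid doing this by hand on every machine, the build step can be wrapped in a small loop over a list of modules. This is a minimal sketch: the CPAN2RPM variable and both function names are made up for illustration, and the only cpan2rpm option used is the --no-sign one shown above.

```shell
# Hypothetical override so the loop can be dry-run (e.g. CPAN2RPM=echo);
# it defaults to the real tool.
CPAN2RPM="${CPAN2RPM:-cpan2rpm}"

# Map a module name to the package name seen in the output above,
# e.g. Perl::Module -> perl-Perl-Module
rpm_name_for() {
    printf 'perl-%s\n' "$(printf '%s' "$1" | sed 's/::/-/g')"
}

# Build an unsigned RPM for every module given, stopping at the first failure.
build_module_rpms() {
    for module in "$@"; do
        echo "Building $(rpm_name_for "$module") from $module"
        $CPAN2RPM --no-sign "$module" || return 1
    done
}
```

Calling build_module_rpms Perl::Module Other::Module builds each in turn. The resulting RPMs can then be copied into your repository directory and the metadata rebuilt with createrepo, so that new machines pick the modules up through yum.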


/etc/shadow hash generation in shell

Posted: 2009-08-17 17:25:51 by Alasdair Keyes


I've been playing about with system templating recently: configuring a base Linux system, then allowing it to be customised for future roll-out without starting the machine up. I found the following little trick for generating password hashes as they appear within the /etc/shadow file. It uses the mkpasswd binary. There are many ways to generate the hash using openssl and various password apps, but this was the simplest I came across.

I don't believe this mkpasswd is in the CentOS yum repo, but it is available from Ubuntu's apt repo as part of the whois package (apt-get install whois).

MD5 hashed passwords, as found by default on Red Hat/CentOS systems

# mkpasswd -m md5 password saltsalt
$1$saltsalt$qjXMvbEw8oaL.CzflDtaK/

SHA512 hashed passwords as found on Ubuntu

# mkpasswd -m sha-512 password saltsaltsaltsalt
$6$saltsaltsaltsalt$bcXJ8qxwY5sQ4v8MTl.0B1jeZ0z0JlA9jjmbUoCJZ.1wYXiLTU.q2ILyrDJLm890lyfuF7sWAeli0yjOyFPkf0
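When templating it is useful to check what kind of hash you are about to drop into a shadow file. The fields are separated by $ characters: an id for the scheme (1 for MD5, 5 for SHA-256, 6 for SHA-512), then the salt, then the hash itself. The two helper names below are made up for illustration, and the salt helper assumes the simple three-field form shown above with no extra rounds= field.

```shell
# Report the hashing scheme of a shadow-style hash from its id prefix.
# shadow_hash_scheme is a hypothetical helper name.
shadow_hash_scheme() {
    # $1 = hash field as it appears in /etc/shadow
    case "$1" in
        '$1$'*) echo md5 ;;
        '$5$'*) echo sha-256 ;;
        '$6$'*) echo sha-512 ;;
        *)      echo unknown; return 1 ;;
    esac
}

# Print the salt of a hash in the simple $id$salt$hash form.
# Splitting on "$" gives: empty field, id, salt, hash.
shadow_hash_salt() {
    printf '%s\n' "$1" | cut -d'$' -f3
}
```

Remember to single-quote the hash when passing it on the command line, otherwise the shell will try to expand the $ parts as variables.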

