A look at CockroachDB

Posted: 2017-05-20 13:10:28 by Alasdair Keyes

Direct Link | RSS feed


CockroachDB has been floating around for a few years and version 1.0 has just been released ready for production. I had been aware of the system for some time but had never really played with it, so I decided now would be as good a time as any to prod it a little.

This article won't be any kind of how-to because their own documentation (available at https://www.cockroachlabs.com/docs/) is fantastic and if you do look into using CockroachDB, it is by far the best place to start.

My main DB experience is with configuring, maintaining and developing on MySQL (although I've slowly been using Postgresql on projects due to the advanced features it provides) so these two RDMS systems are my benchmark going forward. (Within the MySQL/Postgresql labels I'm also including the add-ons and enterprise tools such as Percona for MySQL, and EnterpriseDB for Postgresql).

As the name 'CockroachDB' suggests, the system is designed to be hard to kill, being able to provide a scalable, fault tolerant, distributed DB solution which will continue running with multiple nodes missing.

For any system nowadays, high availability is not a 'nice to have' feature or requirement to consider later, but something that requires careful thought and planning from the outset; for even the most basic set-up. For many companies this is often just setting up a MySQL Master/Slave (Or Master/Master if you're a sadist and into hacky solutions) or Postgresql's streaming replication to "kind-of-sort-of" get some duplication of data and redundancy. Although this does provide some quick wins over a single node setup, in a modern platform that needs to minimise downtime and remove risk of data loss it is not a good solution. Postgres has some solutions such as PG-Pool, EDB Failover Manager, PgBouncer etc, but these are still tacked on and from experience, not a solution that I would want to force my business to rely on.

It's with this experience that I've been waiting for something like CockroachDB. On top of this, it's good to see that 'old-fashioned' Relational databases are still getting new blood after the fast increase of NoSQL systems over the last 10 years.

From having a play about these are the key things that poppet out at me (but I'm sure there are many others)

Any node can be used to run SQL queries

With clusters such as MySQL's NDB, there are data nodes and SQL nodes. Clients can only run queries via the SQL nodes and I've always thought this a limitation that you are not utilising your cluster to the full. With CockroachDB, if your node is running, you can connect to it and run SQL queries against the data stored on it. You will need to look at some way of managing connections, the simple way is with HAProxy and they even provide a way of making the HAProxy config automatically for you https://www.cockroachlabs.com/docs/manual-deployment.html###step-5-set-up-haproxy-load-balancers

Easy to scale

And when I say easy.... I mean easy. With any real-world use-case you will be tweaking and configuring your system to use many more switches, but in testing, I just started a node and told it which cluster to join and it just joined, synced data and became usable within seconds.

cockroach start --insecure --host=cdbnode02 --join=cdbnode01 --background

And that's it.

Note: The --insecure switch just allows you to run a local cluster without generating TLS CA/Client certs etc, this would not be used in a live environment

Web Interface

I'm not usually a fan of pretty interfaces for server applications, often they sacrifice the brevity and conciseness of a command line for very little benefit, however CockroachDB starts a web interface by default when the node starts... and it's fantastic. The interface is clean and easily understandable. You can view DB logs, statistics, cluster information, node details all through one screen. With DB systems, interfaces like this usually require you installing some bulky Java app or paying a fortune for their 'Enterprise' tools, but this is neither and invaluable for monitoring the health and performance of your cluster.

CockroachDB uses Postgres interface

Any web developer will have got used to using MySQL/Postgresql/MSSQL/other RDMS client libraries for their chosen language and it can take some time for a new DB to get a mature, reliable library. With CockroachDB this is not an issue. The system is designed to be compatible with Postgresql, so you can use the existing libraries for your language to get stuck right in https://www.cockroachlabs.com/docs/install-client-drivers.html

This is also a benefit for users should CockroachDB not succeed. A company can go down the CockroachDB road early on and be secure in the knowledge that even if it doesn't succeed or shuts up shop after 5 years, there is a migration path to Postgresql and their application will require no big re-write. This will really help adoption of Cockroach early on and is a great decision by CokcroachLabs.

Simplicity of installation

I can only speak for Linux, but I assume Mac/Windows is the same. Everything is available in one binary and installation is to download this binary and place it into /usr/local/bin/ (or elsewhere, should you prefer) and that is it. The same binary provides server tools and client tools all in one. If I were to use this in production, I would likely use BASH aliases or similar to split out the client/server functionality, but this means that upgrades are a doddle and it would be good to have.

Simplicity of configuration and setup

CockroachDB have taken a interesting path with configuration.... there are no config files. Everything is configured using command line switches. There is no SysV Init/Upstart/Systemd shipped with it, setup is controlled by creating a startup script with all your settings and then placing it into a VCS.

One thing to note is that CockroachDB uses $HOME as a base for creating/storing files by default.

Simplicity of Upgrades

Going back to the simplicity of a single binary. CockroachDB is designed to use a rolling upgrade path (See here for more information https://www.cockroachlabs.com/docs/upgrade-cockroach-version.html). To upgrade, you just rollout the new binary and restart.

Simplicity of backups

Most MySQL/Postgresql clusters usually have a slave set aside solely for backups. This is a node that receives updates from the master but accepts no other client connections so it can be used to backup data without causing locking/load issues. This is the same with CockroachDB, add another node, firewall off client IPs and set up a backup cron https://www.cockroachlabs.com/docs/back-up-data.html

I will continue to play about with Cockroach, and I have not had to use it in anger or run any performance benchmarks against it so I have no idea how it competes with other RDMS', but for a first look, it is outstanding.

One thing that I am impressed with is how stable it feels, the availability of tools such as the web interface, the easy of set-up and configuration really lends itself to feeling like an extremely mature and safe system. And although it's not at all logical to make decisions on a hunch; feeling safe and confident in software will be 9/10s of the battle when a team/company is deciding if it should implement a specific database solution.

I'm extremely excited about the idea of an easy to use, reliable HA database system such as CockroachDB, the only worry I have is that in a cloud-driven era where lots of people are already invested into platforms such as MySQL on AWS, it will be hard for CockroachDB to get a foot in the door. It would however be fantastic for an in-house system.

Note: Beware that Cockroach does send scrubbed diagnostics information back to CockroachLabs, see here for information on how to stop it https://www.cockroachlabs.com/docs/diagnostics-reporting.html

Note #2: Reading the FAQ is a must, there are some use-cases where CockroachDB is not suitable https://www.cockroachlabs.com/docs/frequently-asked-questions.html


If you found this useful, please feel free to donate via bitcoin to 1NT2ErDzLDBPB8CDLk6j1qUdT6FmxkMmNz

Linux Desktop Firewall and VPN

Posted: 2017-04-29 22:06:05 by Alasdair Keyes

Direct Link | RSS feed


I use Linux Mint as my OS on my Laptop as well as OpenVPN for all external traffic.

The Ubuntu/Mint Network manager can be instructed to connect to a VPN when the network is started up, which is great for privacy however there are three instances I've noticed when this falls short.

There have been a few instances where these have occurred and it meant I was sending out traffic insecurely until I noticed.

To combat this I set UFW to automatically reject all packets on the OUTPUT chain. This means my laptop is unable to send any packets over any network device (as long as the firewall is running. I then updated my UFW firewall with the following rules into /etc/ufw/user.rules to allow outbound connections for specific devices etc.

# Allow LXC containers to send traffic out on the LXC bridge
-A ufw-user-output -o lxcbr0 -j ACCEPT
# Allow LXC containers to send traffic onto their virtual ethernet device
-A ufw-user-output -o veth+ -j ACCEPT

### Allow traffic out through the OpenVPN tun0 interface
-A ufw-user-output -o tun0 -j ACCEPT

### Allow traffic to my VPN host
-A ufw-user-output -o wlp8s0 -p tcp --dport 1194 -d 9.8.7.6 -j ACCEPT

### Allow traffic out to my local networks
-A ufw-user-output -d 192.168.0.0/24 -j ACCEPT

### Allow traffic out to virtualbox network devices 
-A ufw-user-output -o vboxnet+ -j ACCEPT

Additional rules will be required into your /etc/ufw/user6.rules.

Now if VPN doesn't connect or drops out unexpectedly, I lose connectivity but I won't be sending out unsecured traffic and I can just reconnect.


If you found this useful, please feel free to donate via bitcoin to 1NT2ErDzLDBPB8CDLk6j1qUdT6FmxkMmNz

Mozilla Observatory - How safe is your site

Posted: 2017-04-14 22:41:29 by Alasdair Keyes

Direct Link | RSS feed


Someone on the Nottingham Linux User Group posted about Mozilla Observatory today.

If you're a developer/sysadmin for any website it's worth checking out. It checks the security HTTP headers that your site returns and grades it accordingly.

I was getting a B this afternoon and after a crash course in Referrer Policy and Content Security Policy I managed to get it up to an A+.

My site doesn't accept user posted content so the XSS security this provides isn't too important, however if your site does accept user submitted content, then it really is critical that you implement this. XSS is still one of the most common WebApp vulnerabilities, and if you can force the browser to help limit the damage it means you can worry less about any bugs that creep into your code.


If you found this useful, please feel free to donate via bitcoin to 1NT2ErDzLDBPB8CDLk6j1qUdT6FmxkMmNz

Slimming down

Posted: 2017-03-24 20:24:16 by Alasdair Keyes

Direct Link | RSS feed


This evening the site has just been migrated from a mostly static using Template Toolkit's ttree with the occasional PHP/Perl script to provide dynamic content to a site built with..

The server is still running NGINX and PHP 5.6

The site is ready to run on PHP 7, however Debian still only provides 5.6. As soon as that's updated, I'll be running twice as fast.

If there's any odd behaviour 404/500/502 type errors, please let me know.

P.S. Whilst writing this, I noticed that Doctrin Project and Twig don't have an HTTPS site.... come on guys!


If you found this useful, please feel free to donate via bitcoin to 1NT2ErDzLDBPB8CDLk6j1qUdT6FmxkMmNz

Firejail

Posted: 2017-03-22 23:15:59 by Alasdair Keyes

Direct Link | RSS feed


I recently had some problems with some software on my laptop calling home and receiving an invalid response, this then caused the software to stop working correctly. Until this is resolved, I really want to keep on using the software. After testing in a VM with the network disabled, I realised that if it was unable to call home then it continued to work correctly.

A Virtualbox VM works fine and with the Vbox tools installed I have bi-directional copy/paste etc but it's not an elegant solution and the VM overhead is much greater than the native application.

From this I found out about the firejail tool. This is shipped in the standard Ubuntu repos and provides a great deal of sandboxing utilities that I was unaware of.

For me the --net=none argument was suitable. This creates a new unconnected network namespace before executing the app and restricting it's network access to localhost only.

$ firejail --net=none mytroublesomeapp

This is incredibly useful and a tool I will be making much more use of in future.

If you wish to test, try some of the following.

firejail --net=none firefox
firejail --net=none ping google.co.uk

The man pages show what other options are available too. It's well worth a look


If you found this useful, please feel free to donate via bitcoin to 1NT2ErDzLDBPB8CDLk6j1qUdT6FmxkMmNz

bashfunc - Bash Function Library

Posted: 2017-03-08 15:27:11 by Alasdair Keyes

Direct Link | RSS feed


I've been doing some systems scripting in BASH the past couple of days and often find myself recoding the same functionality over and over, not just at work but home too. So I decided that I'd write a library to cover some common functionality I find myself needing.

All functions are explained in the README.md and working examples are in bashfunc_examples.sh in the repo.

It's designed for use with BASH 4 and up. Test it out and let me know if there's other common functionality that could be added. I'm currently adding to it quite frequently as my requirements grow so keep checking for new versions.

https://github.com/alasdairkeyes/bashfunc


If you found this useful, please feel free to donate via bitcoin to 1NT2ErDzLDBPB8CDLk6j1qUdT6FmxkMmNz

RSS Feed

Posted: 2017-02-23 22:44:30 by Alasdair Keyes

Direct Link | RSS feed


I've finally got round to re-implementing RSS feed on the site again. Links are here ^^^. Or here RSS feed


If you found this useful, please feel free to donate via bitcoin to 1NT2ErDzLDBPB8CDLk6j1qUdT6FmxkMmNz

Trello all the things

Posted: 2017-02-11 23:36:51 by Alasdair Keyes

Direct Link | RSS feed


I've used Trello a number of times for work projects and I've always enjoyed using it. It's simplicity is the key to it's usefulness.

I've now moved onto using it personally too. Previously my todos were on a Memo app on my phone. Now I mainly use Trello on my laptop, but also the App on my phone for when I'm out and everything goes on there.

The act of moving cards from Todo to Done fills me with more pleasure than it really should.... but it keeps me productive! It's well worth looking to move to it if you're the kind of person that makes a lot of lists.


If you found this useful, please feel free to donate via bitcoin to 1NT2ErDzLDBPB8CDLk6j1qUdT6FmxkMmNz

MySQL encrypted client password storage

Posted: 2017-02-10 14:27:55 by Alasdair Keyes

Direct Link | RSS feed


For years I've been using MySQL's ~/.my.cnf file to automatically manage logins for databases. However it's never sat well with me due to the fact that the file is plain text and even though you can restrict access with 0600 permissions, it's never good to have a password stored in plaintext.

I've recently been working on a MySQL 5.7 cluster and needed access to the production slave database and this issue raised it's head again. However as of MySQL 5.6, there is the option to store login details encrypted using mysql_config_editor

This tool allows you to setup profiles to access servers and store the details encrypted.

For example my previous ~/.my.cnf/ file might have

[mysql]
username=al
password=ComplexPassword

I could then access mysql like so...

# mysql
mysql> 

Now you define a profile so for the above example use

# mysql_config_editor set --login-path=localhost --host=localhost --user=root --password
Enter Password: <enter password>

--login-path is just a name and can be anything you like.

I can now login by specifying the login path

# mysql --login-path=localhost

What's nice is that you don't need to specify all the details, if you had a production and beta environment both with multiple servers you could run the following with different passwords and then supply the hostname on the command line

# mysql_config_editor set --login-path=production --user=root --password
Enter Password: <enter password>
# mysql_config_editor set --login-path=beta --user=root --password
Enter Password: <enter password>
# mysql --login-path=production -h proddb3
mysql>

The data is now stored in ~/.mylogin.cnf and is not readable

# cat ~/.mylogin.conf
<<JUMBLEDMESS>>

If you want to make backups or see what profiles you have, you can use

# mysql_config_editor print --all
[production]
user = root
password = *****
[beta]
user = root
password = *****

Removing profiles is as easy as

# mysql_config_editor remove --login-path=production


If you found this useful, please feel free to donate via bitcoin to 1NT2ErDzLDBPB8CDLk6j1qUdT6FmxkMmNz

TDD Deciphered

Posted: 2017-02-09 12:16:58 by Alasdair Keyes

Direct Link | RSS feed


I recently happened upon this website about how to use TDD when building a project. Although it's written for PHP and PHPUnit, the premise can be applied to any language. The great thing about this site over others is that it actually shows TDD on the lifecycle of a valid project, not just using trivial one off examples.

https://tdd-deciphered.com/


If you found this useful, please feel free to donate via bitcoin to 1NT2ErDzLDBPB8CDLk6j1qUdT6FmxkMmNz

IT Consultancy Services

I'm now available for IT consultancy and software development services - Cloudee LTD.



Happy user of Digital Ocean (Affiliate link)


Version:master-11be96eee2


Validate HTML 5