mohangk.org/blog

Thoughts, somewhat explained

July 6, 2011
by admin
0 comments

Paramiko SSHClient.exec_command timeout workaround

For some reason Paramiko’s SSHClient.exec_command provides no way to set a timeout on the command it runs.

For example, despite passing timeout=3 to client.connect, the code below will not time out if the connection to the server is flaky. The timeout given to connect only covers the initial TCP connect, not the reads that follow.

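import getpass
import paramiko

client = paramiko.SSHClient()
client.load_system_host_keys()
# use the public API rather than assigning to the private _policy attribute
client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
# host and port are assumed to be defined elsewhere
client.connect(host, int(port), username=getpass.getuser(),
               timeout=3)
stdin, stdout, stderr = client.exec_command('ls -l /tmp')
print(stdout.read())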

 

I found a nice workaround in a comment on a blog post by a chap who was having the same problem - http://www.stillhq.com/commentform.cgi?post=python/paramiko/000004 and I am blogging it here for posterity. It involves overriding the exec_command method to accept a timeout, like so:

import getpass
import paramiko

class mySSHClient(paramiko.SSHClient):
    ## override the exec_command method to accept a timeout (in seconds)
    def exec_command(self, command, bufsize=-1, timeout=None):
        chan = self._transport.open_session()
        # the timeout applies to blocking reads and writes on this channel
        chan.settimeout(timeout)
        chan.exec_command(command)
        stdin = chan.makefile('wb', bufsize)
        stdout = chan.makefile('rb', bufsize)
        stderr = chan.makefile_stderr('rb', bufsize)
        return stdin, stdout, stderr

client = mySSHClient()
client.load_system_host_keys()
client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
# host and port are assumed to be defined elsewhere
client.connect(host, int(port), username=getpass.getuser())
stdin, stdout, stderr = client.exec_command('ls -l /tmp', timeout=3)
print(stdout.read())

Works like a charm!
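One thing to keep in mind is that a timeout has to surface somewhere. As far as I can tell, a stalled read on the channel raises socket.timeout, so the caller should be ready to catch it. Below is a minimal sketch of that, assuming the client is an instance of the mySSHClient class above. The helper name run_with_timeout is my own and not part of Paramiko.

```python
import socket


def run_with_timeout(client, command, timeout):
    """Run `command` and return its stdout, or None if the read timed out.

    `client` is assumed to be an instance of the mySSHClient class above,
    i.e. its exec_command accepts a `timeout` keyword argument.
    """
    stdin, stdout, stderr = client.exec_command(command, timeout=timeout)
    try:
        return stdout.read()
    except socket.timeout:
        # Assumption: a stalled channel read surfaces as socket.timeout.
        return None
```

Returning None keeps the sketch short; in real code you would probably want to log the failure or retry. Newer Paramiko releases also appear to accept a timeout argument on exec_command directly, so check your version before subclassing.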

June 1, 2011
by admin
0 comments

Python threads resulting in higher than usual loads on StormOnDemand cloud servers

We have a Python application that polls directories using threads and inotify watchers. We have always run this application on a cloud server provided by Voxel.

We are currently testing a different cloud server provider, StormOnDemand. When we ran our application there, the load averages were a lot higher than on the Voxel cloud server, despite the specs being about the same (see below for more details on the setup). We also made sure that the server was not handling any other load during testing.

I have written a simple test application (test_threads.py) that simulates the issue we are seeing by starting threads that loop, sleeping for a user-defined time on each iteration. It takes 2 parameters: the number of threads to start and the interval period.

When I run “python test_threads.py 50 0.1” for about 10 minutes, I get the following.
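The original test_threads.py is not reproduced in this post. The following is only a minimal sketch of a script matching the description above, so the function names, the stop event, the counter and the fixed 10 minute run time are all my assumptions rather than the real script.

```python
import sys
import threading
import time


def worker(interval, stop_event, counter, lock):
    """Sleep for `interval` seconds per iteration until asked to stop."""
    while True:
        time.sleep(interval)
        with lock:
            counter[0] += 1  # count iterations so there is something to report
        if stop_event.is_set():
            break


def run(num_threads, interval, duration):
    """Start `num_threads` sleeping threads, let them loop for `duration`
    seconds, then stop them and return the total number of iterations."""
    stop_event = threading.Event()
    lock = threading.Lock()
    counter = [0]
    threads = [
        threading.Thread(target=worker,
                         args=(interval, stop_event, counter, lock))
        for _ in range(num_threads)
    ]
    for t in threads:
        t.start()
    time.sleep(duration)
    stop_event.set()
    for t in threads:
        t.join()
    return counter[0]


if __name__ == "__main__" and len(sys.argv) == 3:
    # e.g. python test_threads.py 50 0.1
    total = run(int(sys.argv[1]), float(sys.argv[2]), duration=600)
    print("total iterations: %d" % total)
```

While it runs, watch the load average with uptime in another terminal.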

Load average results:

StormOnDemand - $ uptime 18:46:22 up  7:29,  6 users,  load average: 4.43, 4.16, 2.93

Voxel - $ uptime 18:48:14 up 9 days, 15:09,  9 users,  load average: 0.51, 0.47, 0.43

The load average on the StormOnDemand server is a lot higher.

Python version:

StormOnDemand – 2.6.5

Voxel – 2.6.5

Server spec:

StormOnDemand – 8 x Intel(R) Xeon(R) CPU E5506 @ 2.13GHz; 16GB RAM; 230GB HDD (Storm Bare Metal servers)

Voxel – 7 x Intel(R) Xeon(R) CPU L5640 @ 2.27GHz; 14GB RAM; 200GB HDD (VoxCloud servers)

OS:

StormOnDemand – Ubuntu 10.04 – 2.6.36-rc8101910 #1 SMP Tue Oct 19 19:18:34 UTC 2010 x86_64 GNU/Linux

Voxel – Ubuntu 10.04 –  2.6.32-31-server #61-Ubuntu SMP Fri Apr 8 19:44:42 UTC 2011 x86_64 GNU/Linux

Virtualisation method:

StormOnDemand – Not 100% sure, but I think they use Xen

Voxel – Not sure

It is still unclear why there is this difference in load averages between the two providers. I can only guess that it might have something to do with the way the virtualisation is being done. The StormOnDemand servers come with custom compiled kernels. We have also done some testing on GoGrid and do not seem to have this issue there.

One of the reasons we were considering the StormOnDemand servers was that we were having I/O bottlenecks at certain times of the day on Voxel. StormOnDemand Bare Metal Servers don’t have those issues because the server is dedicated to you, and true enough we were very happy with the I/O performance. Too bad we discovered this other issue; we will probably need to drop them off the list.

 

November 23, 2009
by admin
6 Comments

My foss.my 2009 talk – “Tomboy WebSync Explained”

This was a talk that I gave during the foss.my 2009 conference. It covers a bit of background on Tomboy before focusing on the web based sync that shipped with Tomboy 1.0 and the accompanying sync server implementation, Snowy.

I managed to do a demo of 2 Tomboy instances (one running in a VM) syncing to a common Snowy instance.

The crowd was a bit thin as the other track had Brian Aker giving a talk on Gearman – but the talk sparked interest and I had a couple of engaging hallway conversations on the subject.

On the whole I enjoyed myself at foss.my, meeting and chatting with people. The grassroots, community driven aspect of the conference always ensures that there is a lot more signal than noise. I look forward to foss.my 2010.

October 3, 2009
by admin
1 Comment

Tomboy Addin – Developing with MonoDevelop

Introduction

The following guide is meant to be an addendum to the guide to creating addins on the Tomboy wiki. It shows how to create and compile the addin using MonoDevelop. Although written based on MonoDevelop for the Mac, I believe the steps should be the same for any other platform running MonoDevelop. I also assume that you have already downloaded the Tomboy source from its git repository.

Step 1: Load the tomboy solution into MonoDevelop. When checking out from git, in the root folder there will be a file called Tomboy.sln (On the Mac use Tomboy-mac.sln). Open this up in MonoDevelop. Upon doing that your navigator will have the list of projects that come with Tomboy.
step1_started_monodevelop

Step 2: Add the addin as a new project to the Tomboy solution. Right click on the Tomboy solution and select “Add > Add New Project”.

step2_add_project.jpg

Step 3: Select to create a C# Library project. Enter the name of the addin as the name of the project, and the location of the project's parent folder.

step3_project_settings.jpg

Step 4: You should be able to see your new addin project within the navigator window. There will be some files auto generated that you can remove.

step4_delete_unnecessary_files.jpg

Step 5: Create the two files InsertDateTimeAddin.cs and InsertDateTime.addin.xml as instructed by the tutorial and make sure they are contained within our newly created InsertDateTime project. You can either do this by right clicking the InsertDateTime project and selecting “Add > New Files” or by copying the files into the project directory and then adding them to the project via “Add > Add Files”.

step5_add_or_new_files_into_project.jpg

Step 6: Add the required references (dependencies) for the addin project. To do this, right click the “References” folder in the project and select “Edit References”.

step6_edit_references.jpg

Step 7: When the dialog box pops up, make sure the relevant packages, projects and assemblies are added as references. You can see the list of references I have added by looking at the screenshot below. I based the list on the other addin projects.

step7_add_references.jpg

Step 8: Once added all the references should be visible as children to the references folder in the navigator.

step8_references_list.jpg

Step 9: Modify the options of the addin project by bringing up its “Options” dialog box by right clicking on the project and selecting “Options”.

step9_set_project_options.jpg

Step 10: When the “Options” dialog box appears, under “Build > General” ensure that the “Compile Target” is set to “Library” and that the “Runtime version” is set to “Mono/ .NET 2.0.”

step10_options_setting.jpg

Step 11: Under “Build > Output” of the “Options” dialog box make sure that the output path is set to the appropriate path. To determine the right path I would suggest looking at the other addins' settings as a guide to what this value should be.

step11_options_setting2.jpg

Step 12: The final configuration step is to ensure that MonoDevelop knows that InsertDateTime.addin.xml is to be added as a resource within the built dll. To do this, right click on InsertDateTime.addin.xml and select “Build Action > Embed as resource”.

step12_set_xml_as_resource.jpg

Step 13: Finally – build the complete project via the main menu option “Build > Build All”.

step13_build.jpg

 

September 26, 2009
by admin
0 comments

mod_deflate – Apache HTTP Server

mod_deflate – Apache HTTP Server
“The mod_deflate module also provides a filter for decompressing a gzip compressed request body. In order to activate this feature you have to insert the DEFLATE filter into the input filter chain using SetInputFilter or AddInputFilter. Now if a request contains a Content-Encoding: gzip header, the body will be automatically decompressed. Few browsers have the ability to gzip request bodies. However, some special applications actually do support request compression, for instance some WebDAV clients.” Implementing request compression support on the server seems relatively straightforward. What's needed is client support.
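To give an idea of what the client side might involve, here is a minimal sketch in Python that gzips a request body and sets the Content-Encoding: gzip header mentioned in the quote. The function name, the use of PUT and the example URL are assumptions for illustration only. The request is only built here, not sent.

```python
import gzip
import urllib.request


def build_gzip_request(url, body):
    """Build (but do not send) a PUT request whose body is gzip compressed.

    `body` must be bytes. The server is assumed to have the DEFLATE input
    filter enabled so that it decompresses the body on arrival.
    """
    compressed = gzip.compress(body)
    headers = {
        "Content-Encoding": "gzip",
        "Content-Type": "application/octet-stream",
    }
    return urllib.request.Request(url, data=compressed,
                                  headers=headers, method="PUT")
```

Passing the returned object to urllib.request.urlopen would send it. Whether the server accepts it depends entirely on its input filter configuration.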

September 25, 2009
by admin
0 comments

Simple londiste replication setup

Caveat

This is a very simple and straightforward setup. It is a setup of londiste where I only install londiste on one machine, the provider (or master), and not on the subscriber (or slave). I have checked that this setup is fine, but it would make less sense once you have more than one subscriber. Most of the information here was obtained from the excellent Londiste tutorial. Please read that tutorial as well. This is purely an elaboration.

Assumptions

1. Master server – with database “db”.

2. Slave server -  with database “db_slave”.

3. We will be installing Londiste/Skytools only on the master server.

4. The root londiste folder will be ~/londiste, where all config files will be stored in ~/londiste/etc , log files will be stored in ~/londiste/log and PID files will be stored in ~/londiste/pid. It is assumed that all these directories have already been created.
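If the directories from assumption 4 do not exist yet, something along these lines would create them. This is just a sketch; the function name is made up, and the etc, log and pid names come from the layout described above.

```python
from pathlib import Path


def make_londiste_dirs(root="~/londiste"):
    """Create the etc, log and pid directories under `root` if missing."""
    root = Path(root).expanduser()
    created = []
    for name in ("etc", "log", "pid"):
        path = root / name
        path.mkdir(parents=True, exist_ok=True)  # no error if it exists
        created.append(path)
    return created
```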

Overview

Londiste is actually a component of the Skytools package, which contains some Postgres modules, a couple of Python modules, and Python based admin scripts.

With regards to Londiste there are 2 main components to be set up:

1. Ticker – this is to be installed into the master “db”.

2. Londiste – replication engine – one instance run for every slave database.

It is recommended that the ticker be run on the master database and that one replication engine instance be run for every one slave database, on the slave database. But since this setup is a simple one master, one slave setup, I am only going to install everything on the master database.

Installation – dependencies

The following is based on Debian/Ubuntu – sorry Fedora/RedHat/Centos folk.

Apart from the build tools (which you should be able to install easily via a sudo apt-get install build-essential), the following dependencies are required as well:

sudo apt-get install postgresql-server-dev-8.3

sudo apt-get install python-psycopg2

sudo apt-get install python-dev

Download the skytools tar package from http://pgfoundry.org/projects/skytools/

Installation  – Installing from a package by creating the deb from the source package

To build the deb, install the following dependencies.

sudo apt-get install yada pbuilder devscripts fakeroot

Then, untar the skytools package and from within it do the following

tar -xvzf  ./skytools-2.1.9.tar.gz

cd ./skytools-2.1.9

make deb83

There is also a “make deb84” to build a deb for PostgreSQL 8.4.

You will end up with 2 deb packages for your architecture in the directory that contains your skytools directory (the parent dir of skytools-2.1.9). For example, since my machine was a 64 bit machine, the packages generated by the make process were

skytools_2.1.9_amd64.deb

skytools-modules-8.3_2.1.9_amd64.deb

Install both these packages

sudo dpkg -i skytools-modules-8.3_2.1.9_amd64.deb skytools_2.1.9_amd64.deb

Installation -  From source

sudo make install

sudo python setup.py install

Setup of slave database

Dump only the schema of the master database (just the table structure, without the data itself) and load the result into an empty slave database. You can get the schema dump by doing the following.

pg_dump -U postgres -s --schema=public db > db_schema.sql
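The post does not show the load step. The sketch below builds both the dump command and one possible load command using psql. The psql invocation, the function name and the slavehost hostname are assumptions on my part, so adjust them to your setup. It only constructs the argument lists and runs nothing.

```python
def schema_copy_commands(master_db, slave_db, slave_host,
                         dump_file="db_schema.sql", user="postgres"):
    """Return (dump_cmd, load_cmd) as argument lists for subprocess.

    dump_cmd writes a schema-only dump of the public schema to `dump_file`;
    load_cmd feeds that file to psql against the (empty) slave database.
    """
    dump_cmd = ["pg_dump", "-U", user, "-s", "--schema=public",
                "-f", dump_file, master_db]
    load_cmd = ["psql", "-U", user, "-h", slave_host,
                "-d", slave_db, "-f", dump_file]
    return dump_cmd, load_cmd
```

Each list can be passed to subprocess.check_call, for example check_call(dump_cmd) followed by check_call(load_cmd).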

Configuration and installation of ticker on master

The first component that we will be setting up is the ticker. From what I understand you will set up one ticker for every database that you plan to replicate.

Create a config file ~/londiste/etc/ticker.ini as follows. Edit and replace with your own settings as necessary.

[pgqadm]

job_name = db_ticker

# this is your standard psycopg2 connection string

db = dbname=db user=postgres password=dbpass

# how often to run maintenance [seconds]

maint_delay = 600

# how often to check for activity [seconds]

loop_delay = 0.1

logfile = ~/londiste/log/%(job_name)s.log

pidfile = ~/londiste/pid/%(job_name)s.pid

After doing that run

pgqadm.py ~/londiste/etc/ticker.ini install

Caveat for python-2.6 - You will need to change the name of the variable "as" on line 576 of the file /usr/local/lib/python2.6/dist-packages/pgq/status.py - "as" is a reserved word in Python 2.6. I believe this has been rectified in version 2.1.10, which has just come out.
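To see why a variable called "as" breaks things, the small check below tries to compile an assignment to it. It is only a demonstration of the reserved word problem, not the actual contents of status.py, and the replacement name as_ is merely an example.

```python
import keyword


def compiles(source):
    """Return True if `source` is syntactically valid Python."""
    try:
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:
        return False


def is_reserved(name):
    """Return True if `name` is a reserved word in the running Python."""
    return keyword.iskeyword(name)
```

On an interpreter where "as" is reserved, compiles("as = 1") is False while compiles("as_ = 1") is True, which is why renaming the variable is enough.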

Configuration and installation of londiste

In the tutorial (http://wiki.postgresql.org/wiki/Londiste_Tutorial) it is recommended that the replication daemon be run on the subscriber. Hence you would normally be setting this bit up on the subscriber machine that would also have londiste installed. You would end up with one config file for every subscriber. However this guide assumes that everything is being run from one machine.

Create a config file ~/londiste/etc/p-to-s.ini as follows. Edit and replace with your own settings as necessary.

[londiste]

# this needs to be globally unique; multiple slaves would require different names

job_name = p_to_s

provider_db = host=localhost dbname=db user=postgres password=dbpass

# this would be different for every slave

subscriber_db = host=slavehost dbname=db_slave user=postgres password=dbpass

# it will be used as sql ident so no dots/spaces

# common queue name for a common provider; all subscribers of the same provider should hence use the same value

pgq_queue_name = londiste.replica

logfile = ~/londiste/log/%(job_name)s.log

pidfile = ~/londiste/pid/%(job_name)s.pid

Upon doing this we install the londiste component to both the provider and subscriber databases as follows

londiste.py ~/londiste/etc/p-to-s.ini provider install

londiste.py ~/londiste/etc/p-to-s.ini subscriber install

Keep in mind that for our setup we only have one subscriber. If you have more than one you would need to repeat the installation for the rest of the subscribers.
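With several subscribers you end up with one config file per subscriber, each with its own job_name and subscriber_db but sharing the provider and queue name. A rough sketch of generating those files follows. The function name, the file naming scheme and the example connection strings in the usage are assumptions, and it only renders text without writing anything to disk.

```python
TEMPLATE = """[londiste]
job_name = {job_name}
provider_db = {provider_db}
subscriber_db = {subscriber_db}
pgq_queue_name = {queue}
logfile = ~/londiste/log/%(job_name)s.log
pidfile = ~/londiste/pid/%(job_name)s.pid
"""


def render_configs(provider_db, subscribers, queue="londiste.replica"):
    """Return {filename: config text} with one entry per subscriber.

    `subscribers` maps a unique job_name to that subscriber's connection
    string. All configs share the same provider and queue name.
    """
    configs = {}
    for job_name, subscriber_db in sorted(subscribers.items()):
        configs[job_name + ".ini"] = TEMPLATE.format(
            job_name=job_name,
            provider_db=provider_db,
            subscriber_db=subscriber_db,
            queue=queue,
        )
    return configs
```

Each rendered file would then get its own “subscriber install” and “replay -d” run.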

Launch replication

Once the relevant londiste installation is complete we begin the database replication as follows:

londiste.py ~/londiste/etc/p-to-s.ini replay -d

You will need to ensure that the replication has started before proceeding to the next steps. Check the logs. Again, if you had more than one subscriber you would run the replay for each subscriber.

Adding tables for replication

Once the replication is running, you will need to add the tables and sequences that you want replicated on the provider. On the subscriber you can pick and choose which of those tables and sequences you would like to replicate. For this example I am going to assume that you want to replicate all the tables and sequences, so instead of listing them individually I am going to add them all at once.

To add all the tables and sequences to be replicated on the provider, do the following

londiste.py ~/londiste/etc/p-to-s.ini provider add --all

londiste.py ~/londiste/etc/p-to-s.ini provider add-seq --all

To add all the tables and sequences to be replicated on the subscriber, do the following

londiste.py ~/londiste/etc/p-to-s.ini provider tables | xargs londiste.py ~/londiste/etc/p-to-s.ini subscriber add

londiste.py ~/londiste/etc/p-to-s.ini provider seqs | xargs londiste.py ~/londiste/etc/p-to-s.ini subscriber add-seq

By this stage the replication process should have started. In the londiste log file, as you add tables you should see entries as follows:

2009-07-28 17:45:37,914 22880 INFO Adding public.table1

You should also see log entries such as

2009-07-28 17:46:15,096 22886 INFO Starting full copy of public.table1

that would indicate the initial data transfer has started.

To check that the replication is really running you can use the compare command that will compare the provider row count and subscriber row count for the replicated tables, as follows:

londiste.py ~/londiste/etc/p-to-s.ini compare

Summary

That’s it! Hope this was useful for someone. Londiste is bloody easy to get started with and works like a charm. I have been using pretty much the configuration described above for one of our large databases that gets frequent inserts and updates, and I have not had a single problem since installing it slightly over 2 months ago. The manual management of column changes in the database can be a bit tedious, but I am looking to Londiste 3.0 to sort that out.

References

Londiste tutorial – http://wiki.postgresql.org/wiki/Londiste_Tutorial

Really helpful and friendly mailing list – http://pgfoundry.org/mailman/listinfo/skytools-users

Londiste presentation from pgcon – Useful explanation and diagrams of both Londiste 2.0 and 3.0 – http://www.pgcon.org/2009/schedule/attachments/101_Londiste3.pdf

Appendix

How to correctly temporarily disable and resume replication in londiste 2.1

Proper way to rotate logs for replay and ticker daemons?