
Posts about python (old posts, page 1)

Using server-side objects with XML-RPC

XML-RPC is a very handy little standard. It's straightforward, lightweight, and implementations exist for pretty much every language out there.

One thing I've found a bit lacking, however, is that it's kind of a pain to deal with objects. The standard itself supports some very basic types like strings, integers, arrays, and structures, but there's no way of handling more complicated types. A project I've been working on involves manipulating server-side objects over a network connection, so I figured that XML-RPC would be a good place to start.

I believe I've come up with a good way of allowing clients to build proxies for server side objects, while still being compatible with regular XML-RPC implementations.

I've been experimenting a little bit with an XML-RPC server written in Python, and a JavaScript client (using jsolait). I considered adding a new type of parameter (<object>?), but decided against it since it would break older implementations. Instead, the client and server agree to a few conventions:

  • A two-tuple whose first element begins with "types." should be interpreted as a reference to an object on the server, where the first element specifies the type of the object, and the second is a unique identifier for that object.
  • The server exposes object methods as functions with names of the format "typeMethods.typeName.methodName".

When the client receives a two-tuple object reference, it can look in the list of methods supported by the server and create a new object with wrappers for all the appropriate class methods bound to that object. For example, the following Python code:

class MyClass:
    def doSomething(self, x, y):
        pass

def makeObj():
    return MyClass()

would be exposed as these XML-RPC functions:

makeObj(): returns an object reference, ("types.MyClass", objectId)

typeMethods.MyClass.doSomething(objectRef, x, y): objectRef should be ("types.MyClass", objectId)
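To make this concrete, here's a minimal, self-contained sketch of what the server side might look like. The names make_reference, resolve, and registry are illustrative, not part of any real XML-RPC library; a real server would register doSomething_dispatch under the name "typeMethods.MyClass.doSomething" using SimpleXMLRPCServer's register_function.

```python
# Sketch: expose instances as ("types.ClassName", objectId) references.
# make_reference, resolve, and registry are made-up names for illustration.

registry = {}       # objectId -> live instance on the server
next_id = [0]       # simple counter for generating objectIds

def make_reference(obj):
    """Store obj and return the two-tuple reference the client sees."""
    object_id = next_id[0]
    next_id[0] += 1
    registry[object_id] = obj
    return ("types." + obj.__class__.__name__, object_id)

def resolve(ref):
    """Turn a ("types.X", objectId) reference back into the instance."""
    type_name, object_id = ref
    return registry[object_id]

class MyClass:
    def doSomething(self, x, y):
        return x + y

def makeObj():
    return make_reference(MyClass())

# A real server would register this as "typeMethods.MyClass.doSomething":
def doSomething_dispatch(ref, x, y):
    return resolve(ref).doSomething(x, y)
```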

When the client sees a two-tuple of the form ("types.MyClass", objectId), it can create a new object along the lines of:

var o = {
    "objectId": objectId,
    "typeName": "types.MyClass",
    "doSomething": function(x, y) {
        return typeMethods.MyClass.doSomething([this.typeName, this.objectId], x, y);
    }
};

(JavaScript isn't my strong suit, so I apologize if this isn't exactly right. Hopefully the intent is clear!)

So now you've got a first-class object in your client, with methods that behave just like you would expect! You can now write:

o.doSomething(x,y);

instead of something along the lines of:

serverproxy.MyClass_doSomething(objectId, x, y);

Using the system.listMethods() function to get a list of all methods supported by the server enables you to bind all of a type's methods to an object.
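In Python terms (rather than JavaScript), that binding step might look something like the sketch below. build_proxy and call_server are made-up names, not part of any real XML-RPC library; in practice call_server would wrap a call through an XML-RPC client proxy, and method_names would come from system.listMethods().

```python
# Sketch: given the method names reported by system.listMethods(), bind
# every "typeMethods.<Type>.<method>" entry onto a client-side proxy.

def build_proxy(ref, method_names, call_server):
    """Create an object whose methods forward to the server.

    ref is a ("types.X", objectId) two-tuple; call_server(name, *args)
    stands in for the actual XML-RPC call.
    """
    type_name, object_id = ref                  # e.g. ("types.MyClass", 7)
    short_type = type_name.split(".", 1)[1]     # "MyClass"
    prefix = "typeMethods." + short_type + "."
    proxy = type("Proxy", (object,), {})()
    proxy.typeName = type_name
    proxy.objectId = object_id
    for name in method_names:
        if name.startswith(prefix):
            short = name[len(prefix):]
            def call(*args, _name=name):
                # Pass the object reference as the first argument,
                # just like the server expects.
                return call_server(_name, (type_name, object_id), *args)
            setattr(proxy, short, call)
    return proxy
```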

Generating objectIds is application specific, so I won't go into that here. I would like to see a generic way for a user to extend Python's SimpleXMLRPCServer to marshal and unmarshal new data types. The pickle methods (__getstate__, __setstate__) seem promising, but those are intended to serialize the entire representation of an object, not simply a reference to the object.

Re: Re: Ruby and Python compared

On his blog, Ian Bicking responds to the article, Ruby and Python compared. While there is much in the latter that is uninformed as to what Python is capable of, the most important point I got from Ian's post was:

An important rule in the Python community is: we are all consenting adults. That is, it is not the responsibility of the language designer or library author to keep people from doing bad things. It is their responsibility to prevent people doing bad things accidentally. But if you really want to do something bad, who are we to say you are wrong? It's your program. Maybe you even have a good reason.

I think this should be the motto of any module developer: "Keep people from doing bad things accidentally." It's impossible to keep a developer from shooting himself in the foot if he really wants to, so don't try too hard. Your job is to enable users of your code, not restrict them.

I've heard many C++ / Java programmers complain that Python isn't object oriented because it doesn't offer private or protected data for classes. In a perfect world, all libraries and modules would be perfectly designed and there would be no need to go mucking with the internals of a module you didn't write. Back here in the real world, APIs are often not as well thought out as they should be. In Python (and, I'm guessing, in Ruby as well) you can muck about with the internals of classes or objects if you have to. The alternative is getting the upstream package fixed and distributed everywhere before you can deploy your application.
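As a tiny (made-up) illustration: a double leading underscore in Python only name-mangles an attribute; it doesn't actually prevent access when you really need it.

```python
# The Account class is just an example. __balance is name-mangled to
# _Account__balance, which discourages casual access but forbids nothing.
class Account(object):
    def __init__(self):
        self.__balance = 0      # stored as _Account__balance

a = Account()
a._Account__balance = 100       # "mucking with internals" is allowed
```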

splittar 0.2

I've finally finished version 0.2 of splittar. In addition to some bug fixes, I've added a few new features:

  • Generate files with more consistent file sizes
  • Keep multiple files open so that smaller files can be added to archives with a bit of room left
  • Ability to tweak how splittar estimates how well a file will compress. This is a weighted average between the actual file size, and the file size multiplied by the current compression ratio
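The estimation heuristic in the last point can be sketched like this; the function name and the weight parameter are illustrative, not splittar's actual API.

```python
# Sketch of the weighted-average size estimate described above.
def estimate_compressed_size(file_size, compression_ratio, weight):
    """Blend the raw file size with size * observed compression ratio.

    weight=0.0 trusts the raw size entirely; weight=1.0 trusts the
    compression ratio observed so far.
    """
    return (1.0 - weight) * file_size + weight * (file_size * compression_ratio)
```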

Python Warts, part 1 - pointless modules

Ian Bicking has inspired me to start keeping track of the little things in Python that annoy me a little bit whenever I run into them. They don't really get in the way of development, but it would be nice if I didn't have to deal with them at all :)

Maybe I'm just slow, but I didn't know about the os.statvfs() function until just yesterday. Until then, I'd been parsing the output of 'df -k /mnt/drive' to find out how much free space a drive has!

Of course, os.statvfs() is pretty much useless by itself. Try it out:

>>> import os
>>> os.statvfs(".")
(4096, 4096, 19107656L, 2773425L, 1802799L, 9715712L, 8529635L, 8529635L, 0, 255)

Great. What does that mean? I need to import the statvfs module to be able to usefully interpret this data. That module contains nothing other than a set of constants that index into the above tuple. So to get the free disk space, I would do something like:

import os, statvfs

s = os.statvfs(".")
freebytes = s[statvfs.F_BSIZE] * s[statvfs.F_BAVAIL]

Note that I'm not sure if statvfs.F_BSIZE or statvfs.F_FRSIZE is the proper entry to use, and the Python documentation doesn't clear it up for me.

There are really two things that bug me about this. The first is that the only reason for the statvfs module's existence is to make the os.statvfs() function useful. I would prefer that os.statvfs() returned a dictionary with meaningful keys, or an object with meaningful attributes. The stat module is another violator; it exists to make os.stat() useful.

The second thing that bothers me is that I don't really think something like os.statvfs() is the best way to be calculating the free disk space in Python. Maybe I'll rant about this more in a future post, but functions and modules in Python that exactly mirror the underlying C library bug me. Sure, there may be times when you need access to the raw system call, but I can imagine that many people just want to know how much free space there is in a certain directory. Something along the lines of os.freespace(dir) would be a welcome addition to the standard library.
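For what it's worth, a sketch of that hypothetical os.freespace() is only a few lines on top of os.statvfs(). (On later Pythons the result has named attributes, which also makes the statvfs constants module unnecessary; per POSIX, f_frsize is the fragment size the block counts are measured in, while f_bsize is the preferred I/O size.)

```python
import os

# Sketch of the os.freespace(dir) convenience wished for above;
# the function name is made up, not part of the standard library.
def freespace(path):
    """Return free bytes available to an unprivileged user at path."""
    s = os.statvfs(path)
    return s.f_frsize * s.f_bavail
```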

splittar version 0.1

I've just uploaded my first version of splittar - a small utility that will create tar files for you, but limit the size of each one.

I wrote this because I couldn't find anything that would generate tar files of a certain size for me. Splitting one giant tar file isn't acceptable, because then recovering any data from one piece requires reassembling all of them into the original file, which in my case can mean several tens of gigabytes.

You can check it out at https://atlee.ca/blog/software/splittar

Munin plugin for Shorewall accounting

I wrote this little script to monitor traffic on various machines at work. We use Shorewall to set up all the netfilter rules, traffic shaping, etc. It also makes it easy to set up accounting rules for different types of traffic.

We use Munin to track all sorts of things over time. The script below is a Munin plugin that will create a graph with one data series for each of the chains defined in your shorewall accounting file.

Put this script into /etc/munin/plugins and call it something like shorewall_accounting, and then add this in /etc/munin/plugin-conf.d/munin-node:

[shorewall_accounting]
user root

The name in between the square brackets should match the name of the file you saved the script in. The script needs to run as root in order to get access to iptables.

Edit Jan 20, 2006: Some minor bugfixes to the script have now been included. The shorewall accounting chains are now output in alphabetical order, and the regexp has been fixed to catch very large numbers.

#!/usr/bin/python
# shorewall_accounting
# A munin plugin for tracking traffic as recorded by shorewall accounting rules
# Written by Chris AtLee
# Released under the GPL v2
import sys, commands, re

accountingLineExp = re.compile(r"^\s*\d+\s+(\d+)\s+(\w+).*$")



def getBytesByChain():
    trafficCmd = "shorewall -x show accounting"
    status, output = commands.getstatusoutput(trafficCmd)
    if status != 0:
        raise OSError("Error running command (%s)[%i]: %s" % (trafficCmd, status, output))
    chains = {}
    for line in output.split("\n"):
        m = accountingLineExp.match(line)
        if m is not None:
            target = m.group(2)
            bytes = int(m.group(1))
            if target in chains:
                chains[target] += bytes
            else:
                chains[target] = bytes
    retval = []
    chainNames = chains.keys()
    chainNames.sort()
    for name in chainNames:
        retval.append((name, chains[name]))
    return retval


if len(sys.argv) > 1:
    if sys.argv[1] == "autoconf":
        print "yes"
        sys.exit(0)
    elif sys.argv[1] == "config":
        print "graph_title Shorewall accounting"
        print "graph_category network"
        print "graph_vlabel bits per ${graph_period}"
        for chain,bytes in getBytesByChain():
            print "%s.min 0" % chain
            print "%s.type DERIVE" % chain
            print "%s.label %s" % (chain, chain)
            print "%s.cdef %s,8,*" % (chain, chain)
        sys.exit(0)


for chain, bytes in getBytesByChain():
    print "%s.value %i" % (chain, bytes)

TurboGears on Debian

I was really impressed with the 20 minute wiki demo using TurboGears, so I spent a little bit of time today trying to get it running on my laptop, which is running Debian (sid). While I really like the motivation behind EasyInstall / setuptools / eggs, the implementation isn't quite there yet...

I spent quite a bit of time fighting with it, since I didn't want to install these packages into /usr/lib/python2.4/site-packages. My first thought was to install everything into my home directory somehow, but I never figured out whether that was possible. It seems that Python doesn't look at .pth files outside of certain directories, resulting in errors like this when trying to run turbogears-admin:

Traceback (most recent call last):
  File "/home/catlee/python2.4/site-packages/turbogears-admin.py", line 4, in ?
    import pkg_resources
ImportError: No module named pkg_resources

So, the way I got it working was to give myself write permissions on /usr/local/lib/python2.4/site-packages, create ~/.pydistutils.cfg with this:

[easy_install]
install-dir=/usr/local/lib/python2.4/site-packages
site-dirs=/usr/local/lib/python2.4/site-packages
script-dir=/home/catlee/bin

and then run the ez_setup.py bootstrap script to install TurboGears. Everything seems to work for now...

Python 2.4 and new decorators

I always get excited when a new release of Python is announced; there are always all sorts of goodies to play around with. One of the things I've been following for a while is PEP 318, which provides an easier way to specify class methods, static methods, or whatever other kind of method you want. Basically, syntactic sugar for wrapping methods.

I have to say that among the various proposals for exactly what the syntax should be, the "@decorator" syntax is my least favourite. The '@' character has a distinctly un-Pythonic feel; it seems closer to Perl or Ruby. Having the decorator specified on the preceding line also doesn't make sense to me: a decorator is part of the function declaration, so it should be on the same line.
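For comparison, this is how the '@' syntax that ended up in 2.4 looks, using the built-in classmethod as the decorator (Point is just a made-up example class):

```python
# The decorator sits on the line before the def it wraps.
class Point(object):
    def __init__(self, x, y):
        self.x, self.y = x, y

    @classmethod
    def origin(cls):
        # Alternate constructor: called on the class, not an instance.
        return cls(0, 0)
```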

The pre- or post-argument [decorator, ...] syntax really appeals to me: it puts the list of decorators on the same line as the function declaration, and it is obviously something other than the list of arguments to the function.

Other 2.4 goodies that appeal to me include some nifty keywords to the list sort method ("key" in particular) and generator expressions.
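Both fit in a couple of lines:

```python
# Two of the 2.4 additions mentioned above: the sort() "key" argument
# and generator expressions.
words = ["banana", "Apple", "cherry"]
words.sort(key=str.lower)               # case-insensitive sort
total = sum(len(w) for w in words)      # no intermediate list needed
```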

Now if only the path module would make it into the standard Python library...

Just my humble python-user's opinion.