PHP and Python

My new job involves a lot more PHP, which seems closely related to Perl. There are a number of observations that I would make in the comparison between PHP and Python.

1. PHP is more verbose. In particular, PHP requires braces around code blocks, semicolons at the end of statements, and a $ prefix on variable names. Python requires only a colon at the end of certain lines.

2. Both have good libraries. However, the ability of Python to easily interface to shared libraries written in any language, and not specifically written for Python, is a big advantage, in my opinion.

3. PHP has more unexpected gotchas. I hadn’t noticed this when writing in Python (the absence of a problem is less noticeable than the presence of the same problem). One example I came across today:

is_file() tests for the existence of a file, as you would expect by the name. But it is cached. So if you are waiting for a file to disappear, and you repeatedly call is_file to test this, then you will wait for a long time. You need to call clearstatcache() between each call to get an honest answer.

Of course, the documentation states this. But it also states that file_exists (an equivalent function) is also cached. But it appears not to be (PHP v5.3.3):

<?php

print "Waiting for test.txt to appear\n";
while (!is_file("test.txt"))
{
}

print "Waiting for test.txt to disappear\n";
while (True)
{
    usleep(200000); // 0.2 seconds; sleep() takes whole seconds, so sleep(.2) would not pause at all
    if (!is_file("test.txt"))
    {
        print "is_file detected disappearance\n";
        exit(0);
    }
    elseif (!file_exists("test.txt"))
    {
        print "file_exists detected disappearance\n";
        exit(0);
    }
}

?>

Running this code (and creating, then deleting file test.txt) invariably results in the detection by file_exists, even though I have biased it to let is_file test it first.
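For comparison, here is a minimal Python sketch of the same wait-for-disappearance loop. It assumes Python 3; the helper name wait_until_gone and the temporary file are hypothetical, chosen only for illustration. os.path.isfile asks the operating system afresh on every call, so there is no cache to clear.

```python
import os
import tempfile
import time


def wait_until_gone(path, poll_seconds=0.2, max_polls=25):
    """Poll until path is no longer a regular file.

    Returns True if the file vanished within max_polls checks.
    """
    for _ in range(max_polls):
        if not os.path.isfile(path):
            return True
        time.sleep(poll_seconds)
    return False


# Create a throwaway file, confirm it is visible, then delete it.
fd, path = tempfile.mkstemp()
os.close(fd)
seen_before = os.path.isfile(path)
os.remove(path)

# The very first poll should already notice the deletion.
gone = wait_until_gone(path)
print(seen_before, gone)
```

The max_polls limit is there only so that the sketch cannot loop forever; a real script might wait indefinitely, as the PHP version does.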

So ideally you need accurate documentation. But I think even better is to write in a way that requires minimal documentation.

Posted in Python, Uncategorized

SendKeys in Linux

SendKeys is a useful function in the Windows API which enables a program to send keystrokes. I was looking for an equivalent in Linux, first in Ubuntu at home, and then in Fedora 13 at work.

For Ubuntu (Hardy Heron, 8.04), I found xautomation, which worked fine. For many (but not all) of the non alpha-numeric keystrokes, you need to use the name of the key – there is a helpful list here.

I tried xautomation with Fedora 13, but it was not happy. So I had a look around for an alternative. One possibility (alternatives were hard to find) was xdotool. Of course, the syntax for sending keystrokes is slightly different! It is slightly more consistent in that non-alphanumerics all need to be sent using the name of the key, so the helpful list (above) is still helpful!

I’ll try to post some examples at a later date.
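In the meantime, here is a rough sketch of how the two tools might be driven from Python. It assumes Python 3.3 or later, that xautomation provides the xte command (used as xte 'str ...' and xte 'key ...'), and that xdotool is used as xdotool type ... and xdotool key .... The helper names are hypothetical, and the commands are only executed if the tool is installed and send() is called from within an X session.

```python
import shutil
import subprocess


def xdotool_commands(text, key_name):
    """Build xdotool command lines that type text, then press a named key."""
    return [["xdotool", "type", text], ["xdotool", "key", key_name]]


def xte_commands(text, key_name):
    """Build the equivalent xte (xautomation) command lines."""
    return [["xte", "str " + text], ["xte", "key " + key_name]]


def send(commands):
    """Run the commands if the tool is on the PATH; return True if they ran."""
    tool = commands[0][0]
    if shutil.which(tool) is None:
        return False
    for command in commands:
        subprocess.check_call(command)
    return True


# Only build and show the command lines here; nothing is sent.
print(xdotool_commands("hello", "Return"))
print(xte_commands("hello", "Return"))
```

Calling send(xdotool_commands("hello", "Return")) would type into whichever window has the focus, so it is left out of the sketch.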

Posted in Software

Strange behaviour using ctypes in IronPython

I came across some slightly unexpected behaviour with ctypes in IronPython. Maybe this is the wrong way to look at it – I should be amazed that ctypes works at all in IronPython. But I thought I would document it here.

This deals with DLLs, and so is Windows-centric. I need to find out how / if ctypes works on Linux.

Start with a trivial DLL. It contains a single function, nAdd, which takes an array of integers, and a length of the array, and returns the sum of the array elements:

#ifdef __cplusplus
extern "C"
{
#endif


int nAdd(int * pnData, int nDataLen)
{
    int nTotal = 0;
    for (int i = 0; i < nDataLen; ++i)
    {
        nTotal = nTotal + pnData[i];
    }

    return nTotal;
}


#ifdef __cplusplus
}
#endif

and build it using MSYS / MinGW:

gcc -c strange.cpp
gcc -fPIC -shared -o strange.dll strange.o

Now let's call the function in the DLL, using ctypes:

import ctypes
import sys

# define an array of 3 integers 
array3type = ctypes.c_int * 3
array3 = array3type()
array3[0] = 4
array3[1] = 5
array3[2] = 21

# define an array of 2 integers 
array2type = ctypes.c_int * 2
array2 = array2type()
array2[0] = 65
array2[1] = 34


def call3then2_noTypes():
    print("call3then2_noTypes")
    dll = ctypes.cdll.LoadLibrary("strange.dll")

    dll.nAdd.restype = ctypes.c_int
    #dll.nAdd.argtypes = (ctypes.POINTER(ctypes.c_int), ctypes.c_int)

    try:
        print("  %d" % dll.nAdd(array3, 3))
        print("  %d" % dll.nAdd(array2, 2))
    except Exception:
        e = sys.exc_info()[1]
        print("  %s" % e)
    print("")


def call2then3_noTypes():
    print("call2then3_noTypes")
    dll = ctypes.cdll.LoadLibrary("strange.dll")

    #dll.nAdd.restype = ctypes.c_int
    #dll.nAdd.argtypes = (ctypes.POINTER(ctypes.c_int), ctypes.c_int)

    try:
        print("  %d" % dll.nAdd(array2, 2))
        print("  %d" % dll.nAdd(array3, 3))
    except Exception:
        e = sys.exc_info()[1]
        print("  %s" % e)
    print("")
    
    
call3then2_noTypes()
call2then3_noTypes()

For Python 2.5+ and Python 3.x, this correctly prints the following:

call3then2_noTypes
  30
  99

call2then3_noTypes
  99
  30

So the question is, what will it do in IronPython 2.6?

The answer is unexpected (at least, to me):

call3then2_noTypes
  30
  expected c_long_Array_3, got c_long_Array_2

call2then3_noTypes
  99
  expected c_long_Array_2, got c_long_Array_3

It seems to be trying to learn the signature of the function from the first call, but unfortunately, in this case, getting it wrong. Note that the first call to the function was successful in each case – it is the second call that is failing.

Two of the following three methods work for Python 2.5+, Python 3.x and IronPython 2.6:

import ctypes
import sys

# define an array of 3 integers 
array3type = ctypes.c_int * 3
array3 = array3type()
array3[0] = 4
array3[1] = 5
array3[2] = 21

# define an array of 2 integers 
array2type = ctypes.c_int * 2
array2 = array2type()
array2[0] = 65
array2[1] = 34

def call():
    print("call")
    dll = ctypes.cdll.LoadLibrary("strange.dll")

    dll.nAdd.restype = ctypes.c_int
    dll.nAdd.argtypes = (ctypes.POINTER(ctypes.c_int), ctypes.c_int)

    try:
        print("  %d" % dll.nAdd(array3, 3))
        print("  %d" % dll.nAdd(array2, 2))
    except Exception:
        e = sys.exc_info()[1]
        print("  %s" % e)
    print("")


def call_voidPointer():
    print("call_voidPointer")
    dll = ctypes.cdll.LoadLibrary("strange.dll")

    dll.nAdd.restype = ctypes.c_int
    dll.nAdd.argtypes = (ctypes.c_void_p, ctypes.c_int)

    print("  %d" % dll.nAdd(array3, 3))
    print("  %d" % dll.nAdd(array2, 2))
    print("")


def call_voidPointer_addressOf():
    print("call_voidPointer_addressOf")
    dll = ctypes.cdll.LoadLibrary("strange.dll")

    dll.nAdd.restype = ctypes.c_int
    dll.nAdd.argtypes = (ctypes.c_void_p, ctypes.c_int)

    print("  %d" % dll.nAdd(ctypes.addressof(array3), 3))
    print("  %d" % dll.nAdd(ctypes.addressof(array2), 2))
    print("")


call()
call_voidPointer()
call_voidPointer_addressOf()

Extra hint: the output for IronPython 2.6 is as follows:

call
  expected c_long, got c_long_Array_3

call_voidPointer
  30
  99

call_voidPointer_addressOf
  30
  99
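
The explicit-signature approach can also be tried without building strange.dll. The sketch below is a stand-in under stated assumptions: it replaces the DLL's nAdd with a Python callback (py_add, a hypothetical name) wrapped with ctypes.CFUNCTYPE, so it only shows how a POINTER(c_int) parameter accepts arrays of different lengths in CPython. It does not reproduce the IronPython behaviour described above.

```python
import ctypes

# Same signature as nAdd in the DLL: int nAdd(int *pnData, int nDataLen)
SumFunc = ctypes.CFUNCTYPE(
    ctypes.c_int, ctypes.POINTER(ctypes.c_int), ctypes.c_int
)


def py_add(data, length):
    """Python stand-in for nAdd: sum the first `length` ints behind `data`."""
    return sum(data[i] for i in range(length))


# Wrapping the Python function gives a callable with a fixed C signature.
n_add = SumFunc(py_add)

array3 = (ctypes.c_int * 3)(4, 5, 21)
array2 = (ctypes.c_int * 2)(65, 34)

# Arrays of either length are converted to int* by the declared signature.
total3 = n_add(array3, 3)
total2 = n_add(array2, 2)
print(total3, total2)
```

Because the signature names a pointer type rather than a particular array type, the length of the array passed in does not matter.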

Posted in Python, Uncategorized

ctypes and addressof

Here’s a gotcha I came across in ctypes a while ago. Have a look at the code below, and see if you can work out what it will print:

import ctypes

int_array_type = ctypes.c_int * 2
pint_array_type = ctypes.POINTER(ctypes.c_int)


def display_values(pointer_list):
    #print("pointer_list = %s" % pointer_list)
    for i in range(3):
        pint_array = ctypes.cast(pointer_list[i], pint_array_type)

        print("values of pointer_list[%d] are [%d, %d]" % (i, pint_array[0], pint_array[1]) )


def wrong():
    pointer_list = []
    for i in range(3):
        int_array = int_array_type()
        for j in range(2):
            int_array[j] = i + j

        pointer_list.append(ctypes.addressof(int_array))

    display_values(pointer_list)


wrong()

If you’re using Python 2.x or Python 3.x, then you might be surprised. Your values may vary, but I get:

values of pointer_list[0] are [2, 3]
values of pointer_list[1] are [11187064, 0]
values of pointer_list[2] are [2, 3]

(If you’re using IronPython 2.6.1, the code works, and does not display this particular problem).

Uncommenting the print statement at the top of display_values gives us a bit more of a clue. Your values will vary, but mine produces:
pointer_list = [11187064, 11187144, 11187064]

It took me ages to work out what was going on here. The problem is that storing the address of a ctypes object (in this case int_array) is not enough to keep that object alive. The addressof function returns a plain integer, not a Python reference, so when int_array is rebound on the next loop iteration, nothing refers to the previous array any more, and its memory can be freed and reused. In my run, the memory happened to be reused every other time.

So one way round this problem would be to keep a reference to it in some other way. There are many ways to do this, but one example would be:

import ctypes

int_array_type = ctypes.c_int * 2
pint_array_type = ctypes.POINTER(ctypes.c_int)


def display_values(pointer_list):
    #print("pointer_list = %s" % pointer_list)
    for i in range(3):
        pint_array = ctypes.cast(pointer_list[i], pint_array_type)

        print("values of pointer_list[%d] are [%d, %d]" % (i, pint_array[0], pint_array[1]) )


def right():
    pointer_list = []
    safe_store = []
    for i in range(3):
        int_array = int_array_type()
        for j in range(2):
            int_array[j] = i + j

        pointer_list.append(ctypes.addressof(int_array))
        safe_store.append(int_array)

    display_values(pointer_list)


right()

This produces the correct output:
values of pointer_list[0] are [0, 1]
values of pointer_list[1] are [1, 2]
values of pointer_list[2] are [2, 3]
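
Another way to arrange this, sketched below under the assumption of CPython, is to store the arrays themselves and only derive the addresses when they are needed. The names IntArray2 and build_arrays are hypothetical; the point is simply that the list holds real references, so the memory behind each address stays valid for as long as the list exists.

```python
import ctypes

IntArray2 = ctypes.c_int * 2
IntPointer = ctypes.POINTER(ctypes.c_int)


def build_arrays():
    """Return a list of three 2-element int arrays: [0, 1], [1, 2], [2, 3]."""
    arrays = []
    for i in range(3):
        arrays.append(IntArray2(i, i + 1))
    return arrays


# The list keeps every array alive.
arrays = build_arrays()

# Addresses are taken only while the owning objects still exist.
addresses = [ctypes.addressof(a) for a in arrays]

# Read the values back through the raw addresses, as display_values does.
values = [ctypes.cast(addr, IntPointer)[0:2] for addr in addresses]
print(values)
```

Since all three arrays are alive at the same time, the three addresses are necessarily distinct.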

Posted in Python, Software

Python decorators

I have been learning about decorators recently. It is interesting functionality, but also one of those areas where the issues just keep appearing, the more you look at it.

There are some good articles on it here and here

To my mind, a key requirement of decorators, after the need to be useful, is to be as unintrusive as possible. So the decorated function or method, and more particularly, users of the decorated function or method should not notice the decoration.

Some of the issues, in order of increasing complexity:

  • functions and methods
  • return values
  • function / method arguments and keyword arguments
  • attributes such as __name__ and __doc__
  • decorator arguments
  • multiple decorators
  • introspection

I don’t have all the answers yet, and I’ll write more in a later post.

An example implementation shows some of the points of interest. This is a decorator with arguments implemented using a function.

from functools import wraps

def decoratorWithArguments(decoratorArgument = None):
    def wrap(functionOrMethod):
        # Support naive introspection
        @wraps(functionOrMethod)
        def wrappedFunctionOrMethod(*args, **kwargs):
            print("Before %s" % str(decoratorArgument) )
            ret = functionOrMethod(*args, **kwargs)
            print("After %s" % str(decoratorArgument) )
            return ret

        return wrappedFunctionOrMethod
    return wrap


@decoratorWithArguments("Decorator Argument Value")
def functionExample():
    """ functionExample docstring """
    return "functionExample"


class ClassExample:
    @decoratorWithArguments()  # defaults to None
    def methodExample(self):
        """ methodExample docstring """
        return "methodExample"


print(functionExample())
classInstance = ClassExample()
print(classInstance.methodExample())

Notes:

  • the wraps function from functools copies across various attributes such as the docstring.
  • *args and **kwargs should cope with both positional arguments and keyword arguments
  • Because decorators work differently with and without arguments, even though this decorator will work without any arguments, I still need to use () when decorating methodExample.
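
The first note can be checked with a small experiment. This is only a sketch with hypothetical names (with_wraps, without_wraps, greet, shout); it compares a wrapper that uses functools.wraps with one that does not.

```python
from functools import wraps


def with_wraps(func):
    @wraps(func)  # copy __name__, __doc__ and friends onto the wrapper
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper


def without_wraps(func):
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper


@with_wraps
def greet():
    """Return a greeting."""
    return "hello"


@without_wraps
def shout():
    """Return a loud greeting."""
    return "HELLO"


# greet still looks like greet; shout now looks like the anonymous wrapper.
print(greet.__name__, greet.__doc__)
print(shout.__name__, shout.__doc__)
```

Without wraps, the decorated function reports the wrapper's name and has no docstring, which is exactly the kind of intrusion a decorator should avoid.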

It took me an age to understand how decoratorWithArguments could possibly work. It’s a bit like Russian dolls:

  • The outer function, decoratorWithArguments, takes the decorator arguments. It returns the inner function, wrap (a closure).
  • Having unwrapped the outer function, we now have the returned inner function (wrap). This can then be called with the function or method that we want to decorate. It returns the inner function, wrappedFunctionOrMethod.
  • wrappedFunctionOrMethod can then be called. It will be called in place of the original decorated function or method.

Note that the decoration process takes care of all of the above for us. It’s still nice to understand how it can work.
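
The three steps can also be performed by hand, which may make the Russian dolls easier to see. This is a minimal sketch; tagged, plain and sugared are hypothetical names, and the decorator simply prefixes the return value with its argument.

```python
from functools import wraps


def tagged(label=None):
    """Outer doll: takes the decorator argument and returns wrap."""
    def wrap(func):
        """Middle doll: takes the function to decorate."""
        @wraps(func)
        def wrapped(*args, **kwargs):
            """Inner doll: called in place of the original function."""
            return "[%s] %s" % (label, func(*args, **kwargs))
        return wrapped
    return wrap


def plain():
    return "plain"


# Step 1: call the outer function with the decorator argument.
decorator = tagged("demo")
# Step 2: call the returned function with the function to decorate.
decorated = decorator(plain)
# Step 3: call the wrapped function in place of the original.
result_manual = decorated()


# The @ syntax performs steps 1 and 2 for us.
@tagged("demo")
def sugared():
    return "plain"


result_sugar = sugared()
print(result_manual, result_sugar)
```

Both routes produce the same result; the @ form just hides the two intermediate calls.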

Posted in Python, Software

Ubuntu upgrade

I wanted to install an Ubuntu upgrade on a very old machine, from 8.04 LTS (Hardy Heron) to 10.04 LTS (Lucid Lynx). I like Ubuntu, so please do not take this post as a moan about it.

(See here for more information on Ubuntu releases.)

The download of the packages went fine, but the upgrade froze at about 75%, which is not a good thing, as there is no cancel option at this point. Sure enough, on the reboot, the system was not happy.

It was at this point that I discovered the CD-player on the computer was not working either (a hardware issue rather than anything to do with the upgrade).

Luckily, I had an old copy of 6.06 LTS (Dapper Drake) that I could use to back up important data.

I then looked around for non-CD installation options. There seemed to be 3 options:

  • Install from ISO mounted on a partition
  • Install from ISO placed on a drive
  • Install from the network

This is not an area that I am overly familiar with, so problems that I had are likely to be PEBKAC.

The method that I eventually got working is the install from the network. Here are a few hints:

  • The install can appear to hang in places. This usually means that it is taking a long time. I got quicker response times later in the evening
  • It offers the option of installing lots of extra software towards the end. Don’t! Because your system will not be left in a working state if anything goes wrong at this point. Just install the necessary software later on, once you have a working system. The install process is much more robust once Ubuntu is installed.
  • Take care to get the right linux and initrd.gz files. These exist, for each Ubuntu version, for a reboot from the network, an install from the network, and many other purposes.

I also learned a lot about fdisk and GRUB (the bootloader). GRUB is a very usable piece of software. I will write more about it in a later post.

I managed to get Lucid Lynx installed, but it is too slow on the machine (256MB of memory, so maybe not too surprising). Hardy Heron continues to be my default version, and is pleasantly responsive.

Posted in Software

Python whitespace

I was asked in an interview yesterday what I think of Python white space.

Python is unusual (though not unique), in that the indentation of your code is significant, and affects the way that your code is compiled and run.

I know that when I first used Python, I was ambivalent about this feature.  But as a seasoned user of Python, I think it is a good feature.

Firstly, the way we lay out code is often designed to make it more readable for humans.  But if the way the code runs ignores this layout, then there is a danger that bugs will be introduced through this disconnect.

Secondly, the fact that whitespace is significant allows Python to eliminate braces (or other code grouping statements): so you avoid all the coding standards “discussions” on the correct style of braces, which can occupy huge amounts of (unproductive) time, and adversely impact team morale before you have even started.

Here’s an example in C:

#include <stdio.h>

int main(void)
{
    int a = 1;
    int b = 2;

    if (a == 1) {
        printf("a is 1\n");
    }

    if (a==1)
    {
        printf("a is 1 still\n");
    }
    if (a==1)
        {
        printf("a is 1 yet again\n");
        }

    if (a==1 && b==3)
        printf("a is 1 for the last time\n");
        printf("b is 3\n");
}

All three of the bracing styles that I have encountered are displayed here. (I won’t inflame the discussion by stating my preference, only to say that the example above is inconsistent in its use of braces, and that is not desirable.) The final section is potentially misleading to a developer, although the compiler will not complain: only the first printf belongs to the if, so “b is 3” is printed regardless of the condition.

The common point of all three styles of braces is that the code within the braces is indented, and the Python approach keeps that indentation while dropping the braces. Here is the equivalent Python example:

a = 1
b = 2

if a == 1:
    print("a is 1")

if a==1:
    print("a is 1 still")

if a==1:
    print("a is 1 yet again")

if a==1 and b==3:
    print("a is 1 for the last time")
print("b is 3")

Note that, because whitespace is significant, it is essential that everyone on the team ensures that tabs are converted to spaces, or their tab indent is set to the same value. In my experience, the former is preferable.
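
As a small illustration, assuming Python 3 (Python 2 is more lenient here), the interpreter refuses source whose indentation mixes tabs and spaces inconsistently. The source string below is a made-up example, compiled from a string so that the mixed indentation is visible as escape characters.

```python
# A block whose first line is indented with a tab and whose second line
# uses eight spaces. The two look identical in an editor with tab width 8.
source = "if True:\n\tx = 1\n        y = 2\n"

try:
    compile(source, "<example>", "exec")
    error_name = None
except SyntaxError as exc:
    # TabError is a subclass of IndentationError, itself a SyntaxError.
    error_name = type(exc).__name__

print(error_name)
```

Converting tabs to spaces in the editor avoids this class of error entirely, which is one reason I prefer that option.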

Posted in Python, Software