Code Reviews with Git

A few weeks ago at work we improved our code review process by using Git more effectively. Previously a code review happened after the topic branch was merged into master. This obviously was not very effective as changes could have broken master without a proper review and there was less incentive to perform as careful of a review since the code was "working" already. This was carried over from when we were using SVN until we realized we were no longer forced to work in the dark ages.

Since we were already using Git we could easily change our workflow for a better review process. Once a topic branch was ready for review we could push a remote branch. Our remote branches take the form /review/{initials}/{topic}

To push a new remote branch:

{% highlight bash %} git push origin {branch name}:/review/{initials}/{branch name} {% endhighlight %}

And then we would move the Kanban card to the review column and find someone to review our changes.

When the code has been reviewed and any necessary changes made the reviewer will merge them into master.

{% highlight bash %} git checkout master git merge --no-ff --no-commit /review/{initials}/{branch name} git commit -s {% endhighlight %}

The merge command turns off fast-forward merging and commiting so when we commit with a -s we can sign-off on the changes. This shows who reviewed the code. We don't permit anyone to push their own changes without review although Git doesn't prevent you from changing the committer or the sign-off.

Then finally remove the remote branch by pushing an empty branch over it and delete your local copy of the branch.

{% highlight bash %} git push origin :/review/{initials}/{branch name} git branch -d /review/{initials}/{branch name} {% endhighlight %}


Plug v0.1.2 Bug Fixes

This release fixes a bug cause by linking all plug service instances to the installed plug. Runit uses a ./supervise in the service's directory to maintain state which would be clobbered when multiple services link to the same plug.

Now the virtualenv is copied to /srv/plug/plug_instances/ and linked into runit.

There is also a fix for uninstalling plugs leaving orphaned processes. Now Plug will stop the processes before removing them to prevent this.


Plug v0.1.1

Release v0.1.1 adds an uninstall command to Plug that takes a --plug= option and removes the virtualenv and all runit links.

You can get Plug on PyPi and try it out. As always, report any issues.


TestHTTPServer v0.1.2 - Beta Status!

Release v0.1.2 includes the ability to set custom response headers and the __version__ package attribute! This release should be "complete" for my own needs and for common uses. If you see anything you need open an issue and I can probably get it updated and packaged in a day or so.

After I use the package a bit more I will bump the version and switch it to production status.


TestHTTPServer v0.1.1

I've push v0.1.1 of TestHTTPServer. This release adds the ability to handle any method as well as storing request headers and content for all methods.

For v0.1.2 I should be adding more to the server reponse options.


Google Music

I've been using Google Music quite a bit lately. It was rather painless to setup although the upload process can take a while. The service works out well with my Macbook Air since I only have a 64GB SSD and don't want to fill my drive with music I may or may not be listening to very frequently.

The only issue for me is that the Flash performance can frequently cause studdering if I am doing too much in Chrome at the same time.


Plug v0.1.0

I've pushed the first version of Plug. You can download it with pip

pip install plug

This is mostly to help me manually test Plug by using it in a few of my own projects and getting together a list of glaring issues.


TestHTTPServer v0.1.0

I've pushed a new PyPi package TestHTTPServer based on some work at AWeber where I needed to test processes that make web requests.

This probably shouldn't be used right away as it was from memory that I created the package. It will be expanded to record more of the request data and handle more requests.


Packaging

Roughly 2 weeks ago I started Plug which aims to create a package format for Python daemons. The project started after seeing how Supervisor handles 150+ processes.

A current project at work can easily have many daemon processes with differing number of running instances that may need to be adjusted frequently. Deploying with Supervisor can be a problem given the amount of time Supervisor would take to start/stop processes.

Plug installs each package into a virtualenv then uses runit to manage each daemon instance.

I have a prototype version working now with a packaged version to come in the next few weeks after giving it more testing.

The biggest issue is after watching To Package or Not to Package I am falling in more of the "to package" crowd and despite Plug being a packaging solution smells a bit too much of NIH.


Worker Process

Part of a current project at work involves writing many stand-alone processes to handle events from RabbitMQ. The worker processes are managed by Supervisor but should gracefully handle SIGTERM and generally follow a common pattern.

The main design is splitting out how worker's should run from what they are supposed to be doing. Extending the BaseWorker class will give a main classmethod used to start new workers.

The BaseWorker class is split from the WorkerRunner class giving a common interface for all workers and Unixy interaction.

You can take a look at what I have so far on the projects page.


New Feed

I've added an atom feed for all your newsreading pleasure!


Python TDD with Dingus - A Markdown Function

Here is a walk through for testing a simple function that will convert a directory of markdown files to a directory of HTML files

It should take 2 paths, the source directory and the output directory.

Acceptance Test

In acceptance tests you will want to test as large a feature as possible. The test in this case will assert that a file in the src directory is converted to HTML in the output directory.

Since we want this to create the files on disk we will need to import os.path for some helper functions.

{% highlight python %} import os.path {% endhighlight %}

Then we import the function we plan on testing.

{% highlight python %} from markdown_processor import process_markdown {% endhighlight %}

From there we can begin by creating the test class that will handle all of our setup and our assertion.

{% highlight python %} class WhenRunningProcessor(object):

    @classmethod
    def setup_class(cls):
        cls.src_dir = 'src_example'
        cls.target_dir = 'target_dir'

        process_markdown(cls.src_dir, cls.target_dir)

{% endhighlight %}

This creates a class WhenRunningProcessor That inherits from object. Before each test case Nose allows us to run code to setup the test. In this case we use the @classmethod decorator and setup_class. This function will be run once before all tests in this class. Acceptance tests will take longer to run then unit tests and usually do not require the same level of isolation.

Then we define the src_dir and target_dir since we will be using them a few times.

Finally we run the function we plan on testing passing in the src_dir and target_dir.

Now we can write the acceptance test for this function

{% highlight python %} def should_have_html_hello_world(self): file_path = os.path.join(self.target_dir, 'hello_world.html') content = open(file_path, 'r').read() assert '

Hello World!

' in content {% endhighlight %}

Our test is checking that the text '<p>Hello World!</p>' is in a file in the output directory. This requires some fixture data in the source directory which is only

Hello World!

The file_path is the our target_dir folder and the hello_world.html file. The file is read and then an assertion checking that the test exists.

The whole test file will look like this:

{% highlight python %} import os.path

from markdown_processor import process_markdown

class WhenRunningProcessor(object):

    @classmethod
    def setup_class(cls):
        cls.src_dir = 'src_example'
        cls.target_dir = 'target_dir'

        process_markdown(cls.src_dir, cls.target_dir)

    def should_have_html_hello_world(self):
        file_path = os.path.join(self.target_dir, 'hello_world.html')
        content = open(file_path, 'r').read()
        assert '<p>Hello World!</p>' in content

{% endhighlight %}

Unit Tests

In our unit tests we will test how we plan to implement this functionality. The mocking library Dingus will allow us to isolate the our function from the OS and our other libraries. After the function is run we can test to make sure the code works how we expected it to.

First will will import everything we need for the test

{% highlight python %} from dingus import Dingus, DingusTestCase

from markdown_processor import process_markdown
import markdown_processor as mod

{% endhighlight %}

The dingus library provides us with a Dingus class which we will use to assert what our function is doing and the DingusTestCase will automatically isolate our function.

As well as the function we want to test we also import the module to help us make assertions about what goes on outside the function.

Now we can setup a base class to use for our tests. This will hold the common elements we use for our tests.

{% highlight python %} class BaseProcessing(DingusTestCase(process_markdown)):

    def setup(self):
        super(BaseProcessing, self).setup()
        self.src_dir = Dingus('src_dir')
        self.target_dir = Dingus('target_dir')

        mod.os.listdir.return_value = ['hello_world.markdown']
        mod.os.path.splitext.return_value = ('hello_world', 'markdown')

        self.md = mod.markdown.Markdown()

{% endhighlight %}

Our BaseProcessing class inherits from DingusTestCase while passing in our function. This will isolate our function and replace everything around it with Dingus objects. The setup method will be run before each test, unlike the acceptance tests where setup_class was only run once. We once again setup our src_dir and target_dir but this time use Dingus objects while passing in a helpful name. In this case the strings for our directories shouldn't be modified but this allows you to verify any operations performed on the arguments.

Next we set the return values of some function we plan on using. Remember everything but our function is a Dingus which only return a new Dingus when called. The listdir function will return a list of entries in the directory. In this case we just want to return a list with 1 file. We will need to get the filename without an extension so we will set the return_value to be a tuple of the name and extension.

The last part is setting a shorter name for the Markdown instance.

Test Cases

Of first test is to assert that we see if the target directory exists.

{% highlight python %} def should_check_existance_of_target_dir(self): assert mod.os.path.calls('exists', self.target_dir) {% endhighlight %}

We will use os.path.exists to check. The assert uses the .calls method to see if os.path.exists was called with our target directory.

Next will want to assert that a markdown instance is created

{% highlight python %} def should_create_markdown_instance(self): assert mod.markdown.calls( 'Markdown', extensions=['codehilite'] ).once() {% endhighlight %}

This time we check that the object is called with the extension 'codehilite' and that it is only called once.

Now we need to get the files we plan on converting so we will use the os.listdir that uses our mocked return value.

{% highlight python %} def should_find_markdown_files(self): assert mod.os.calls('listdir', self.src_dir) {% endhighlight %}

Finally we can test that our file is actually converted by markdown

{% highlight python %} def should_join_source(self): assert mod.os.path.calls('join', self.src_dir, 'hello_world.markdown')

def should_join_target(self):
    assert mod.os.path.calls('join', self.target_dir, 'hello_world.html')

def should_convert_files(self):
    in_file = mod.os.path.join()
    out_file = mod.os.path.join()
    assert self.md.calls('convertFile', in_file, out_file)

{% endhighlight %}

We first assert that our function properly creates the paths to use for the input and output files then calls markdown's convertFile function. The in_file and out_file are actually the same dingus in this case, the join() will return an object we can use to test with.

Now that the tests are written we can add the classes to run them.

{% highlight python %} class WhenProcessingMarkdown(BaseProcessing):

    def setup(self):
        BaseProcessing.setup(self)
        mod.os.path.exists.return_value = False
        process_markdown(self.src_dir, self.target_dir)

    def should_create_target_directory(self):
        assert mod.os.calls('mkdir', self.target_dir)

{% endhighlight %}

WhenProcessingMarkdown inherits from our BaseProcessing class including all of the test cases. In our setup this time we want to make sure we run BaseProcessing.setup with our instance and set the return value of os.path.exists. After that we run our function.

Since our output directory doesn't exist in this test it will want to make sure that we try to create it.

Because we need to test the other case where the directory already exists we create a new test class where that is true.

{% highlight python %} class WhenProcessingMarkdownAndDirectoryExists(BaseProcessing):

    def setup(self):
        BaseProcessing.setup(self)
        mod.os.path.exists.return_value = True
        process_markdown(self.src_dir, self.target_dir)

    def should_not_create_target_directory(self):
        assert not mod.os.calls('mkdir')

{% endhighlight %}

This is very similar except that the exists() return value is True so we will want to assert that we do not call mkdir at all.

Implementation

Our tests are complete and we can move on to our implementation.

{% highlight python %} import os

import markdown

def process_markdown(src_dir, target_dir):
    "Converts files from :param src_dir: to html in the :param target_dir:"
    if not os.path.exists(target_dir):
        os.mkdir(target_dir)

    md = markdown.Markdown(extensions=['codehilite'])

    for file in os.listdir(src_dir):
        name, ext = os.path.splitext(file)

        in_file = os.path.join(src_dir, file)
        out_file = os.path.join(target_dir, name + '.html')
        md.convertFile(in_file, out_file)

{% endhighlight %}

The implementation shouldn't be surprising after planning it out. Our function takes 2 arguments, possibly creates the target directory, then converts each file from our source directory.

You can download the full source


Test Sizes

At AWeber we tend to write tests. A lot of them. I spend more time writing tests then features. My belief is that if a line of code isn't tested it wasn't important enough to write in the first place. You won't know when it breaks and now you're delivering broken software.

Testing isn't covered much in school. If your projects are only 2 weeks long and disappear after you've handed them in, testing doesn't prove it's worth. When tests are mentioned it's never anything practical, never how to test certain features, what parts are even worth testing?

My current project has 3 sets of tests, unit, integration, and acceptance tests. Google has small, medium, and large tests which roughly correspond to what we write.

Acceptance Tests

These are written first to tell us what we want from a feature. Usually these are structure as interaction stories of the Cucumber variety.

{% highlight bash %}
When a customer has a list of subscribers
And sends a message to the list
The subscribers receive the message
{% endhighlight %}

A common example, definitely a feature you want working. The tests are a little more complicated than that, including assertions to make sure certain steps execute as anticipated.

Integration tests

Integration tests started as a need to test various states of our database and queuing systems. They generally involve testing components to make sure they behave when everything isn't perfect.

What happens when a worker dies in the middle of a job?

Can another worker pick up the job immediately or does it require human intervention?

These stories enumerate the states our models can occupy. When an event comes in from the queue it may be the first, a duplicate, or a re-fired event from a failed worker. A simplified case for sending an email that can be either :

{% highlight bash %}
New
Trying to send
Sending
Sent
{% endhighlight %}

Our tests classes would cover the cases:

{% highlight bash %}
When sending a new email
When sending an email that is trying to send
When sending an email that is sending
When sending a sent email
{% endhighlight %}

Once we have the tests we can figure out which states are possible to recover from. These cases are much more difficult to cover in acceptance tests. They would require excessive test setup and extensive knowledge of our system.

Unit tests

We write unit tests to verify our implementation. This means mocking out as much as possible and isolating individual objects and functions. We use Dingus as our mocking library. At this point the feature is well defined and important test cases are covered, we can focus on how we want to solve them.

{% highlight bash %}
When sending an email
If email is New
Generate and send an email

When sending an email
If email is Trying to send, Sending, or Sent
Do not try and send the email
{% endhighlight %}

With tests of each size we gain greater confidence that our code is working and you can use them in a month when you forget how or why things were done this way.