Posts about Python

PyCon Canada 2018

I'm very happy to have had the opportunity to attend and speak at PyCon Canada here in Toronto last week.

PyCon has always been a very well organized conference. There is a wide range of talks available, even on topics not directly related to Python. I've attended PyCon events before, but never the Canadian one!

My talk was titled How Mozilla uses Python to Build and Ship Firefox. The slides are available here if you're interested. I believe the sessions were recorded, but they're not yet available online. I was happy with the attendance at the session, and the questions during and after the talk.

As part of the talk, I mentioned how Release Engineering is a very distributed team. Afterwards, many people had followup questions about how to work effectively with remote teams, which gave me a great opportunity to recommend John O'Duinn's new book, Distributed Teams.

Some other highlights from the conference:

  • CircuitPython: Python on hardware I really enjoyed learning about CircuitPython, and the work that Adafruit is doing to make programming and electronics more accessible.

  • Using Python to find Russian Twitter troll tweets aimed at Canada A really interesting dive into 3 million tweets that FiveThirtyEight made available for analysis.

  • PEP 572: The Walrus Operator My favourite quote from the talk: "Dictators are people too!" If you haven't followed Python governance, Guido stepped down as BDFL (Benevolent Dictator for Life) after the PEP was resolved. Dustin focused much of his talk on how we in the Python community, and more generally in tech, need to treat each other better.

  • Who's There? Building a home security system with Pi & Slack A great example of how you can get started hacking on home automation with really simple tools.

  • Froilán Irzarry's Keynote talk on the second day was really impressive.

  • You Don't Need That! Design patterns in Python My main takeaway from this was that you shouldn't try and write Python code as if it were Java or C++ :) Python has plenty of language features built-in that make many classic design patterns unnecessary or trivial to implement.

  • Numpy to PyTorch Really neat to learn about PyTorch, and leveraging the GPU to accelerate computation.

  • Flying Python - A reverse engineering dive into Python performance Made me want to investigate Balrog performance, and also look at ways we can improve Python startup time. Some neat tips about examining disassembled Python bytecode.

  • Working with Useless Machines Hilarious talk about (ab)using IoT devices.

  • Gathering Related Functionality: Patterns for Clean API Design I really liked his approach for creating clean APIs for things like class constructors. He introduced a module called variants which lets you write variants of a function / class initializer to support varying types of parameters. For example, a common pattern is to have a function that takes either a string path to a file, or a file object. Instead of having one function that supports both types of arguments, variants allows you to make distinct functions for each type, but in a way that makes it easy to share underlying functionality and also not clutter your namespace.
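To make the path-versus-file-object idea from that last talk concrete, here's a minimal sketch in plain Python. It deliberately does not use the variants module's own API; it only imitates the shape of the idea by hanging alternate entry points off a primary function. The names count_lines, from_file and from_path are made up for illustration.

```python
import io


def count_lines(text):
    """Primary form: count the lines in a string."""
    return len(text.splitlines())


def _count_lines_from_file(fobj):
    """Variant for callers that already hold an open file object."""
    return count_lines(fobj.read())


def _count_lines_from_path(path):
    """Variant for callers that only have a path on disk."""
    with open(path, encoding="utf-8") as fobj:
        return _count_lines_from_file(fobj)


# Attach the variants to the primary function so the module namespace
# only exposes one public name: count_lines, count_lines.from_file, ...
count_lines.from_file = _count_lines_from_file
count_lines.from_path = _count_lines_from_path

print(count_lines("a\nb"))
print(count_lines.from_file(io.StringIO("a\nb\nc")))
```

Each entry point accepts exactly one kind of argument, and all of them funnel into the same underlying function.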

PyCon 2016 report

I had the opportunity to spend last week in Portland for PyCon 2016. I'd like to share some of my thoughts and some pointers to good talks I was able to attend. The full schedule can be found here and all the videos are here.

Monday

Brandon Rhodes' Welcome to PyCon was one of the best introductions to a conference I've ever seen. Unfortunately I can't find a link to a recording... What I liked about it was that he made everyone feel very welcome to PyCon and to Portland. He explained some of the simple (but important!) practical details like where to find the conference rooms, how to take transit, etc. He noted that for the first time, they have live transcriptions of the talks being done and put up on screens beside the speaker slides for the hearing impaired.

He also emphasized the importance of keeping questions short during Q&A after the regular sessions. "Please form your question in the form of a question." I've been to way too many Q&A sessions where the person asking the question took the opportunity to go off on a long, unrelated tangent. For the most part, this advice was followed at PyCon: I didn't see very many long winded questions or statements during Q&A sessions.

Machete-mode Debugging

(abstract; video)

Ned Batchelder gave this great talk about using Python's language features to debug problematic code. He ran through several examples of tricky problems that could come up, and how to use things like monkey patching and the debug trace hook to find out where the problem is. One piece of advice I liked was when he said that it doesn't matter how ugly the code is, since it's only going to last 10 minutes. The point is to get the information you need out of the system the easiest way possible, and then you can undo your changes.
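As a rough illustration of the "ugly but temporary" approach (my own sketch, not an example from the talk), here's a throwaway monkey patch that records who is calling a function and can be undone afterwards. The helper name spy_on and the load_config function are hypothetical.

```python
import functools
import json
import traceback


def spy_on(module, name):
    """Monkey patch module.name so that each call records its call stack.

    Returns (calls, undo): the list of recorded stacks and a function
    that restores the original attribute.
    """
    original = getattr(module, name)
    calls = []

    @functools.wraps(original)
    def wrapper(*args, **kwargs):
        # Drop the last frame, which is this wrapper itself.
        calls.append(traceback.extract_stack()[:-1])
        return original(*args, **kwargs)

    setattr(module, name, wrapper)

    def undo():
        setattr(module, name, original)

    return calls, undo


def load_config():
    return json.loads('{"debug": true}')


calls, undo = spy_on(json, "loads")
load_config()
undo()

# The last frame of each recorded stack is the direct caller of json.loads.
print([stack[-1].name for stack in calls])
```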

Refactoring Python

(abstract; video)

I found this session pretty interesting. We certainly have lots of code that needs refactoring!

Security with object-capabilities

(abstract; video; slides)

I found this interesting, but a little too theoretical. Object capabilities are a completely different approach from access control lists for modeling security and permissions. It was hard for me to see how we could apply this to the systems we're building.

Awaken your home

(abstract; video)

A really cool intro to the Home Assistant project, which integrates all kinds of IoT type things in your home. E.g. Nest, Sonos, IFTTT, OpenWrt, light bulbs, switches, automatic sprinkler systems. I'm definitely going to give this a try once I free up my raspberry pi.

Finding closure with closures

(abstract; video)

A very entertaining session about closures in Python. Does Python even have closures? (yes!)
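For anyone who wants to see the "yes!" for themselves, here's a minimal sketch (mine, not from the session): the inner function keeps access to count after make_counter has returned.

```python
def make_counter():
    count = 0

    def increment():
        nonlocal count  # rebind the variable in the enclosing scope
        count += 1
        return count

    return increment


counter = make_counter()
counter()
counter()
print(counter())                              # 3
print(counter.__closure__[0].cell_contents)   # 3, the captured variable
```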

Life cycle of a Python class

(abstract; video)

Lots of good information about how classes work in Python, including some details about meta-classes. I think I understand meta-classes better after having attended this session. I still don't get descriptors though!

(I hope Mike learns soon that __new__ is pronounced "dunder new" and not "under under new"!)

Deep learning

(abstract; video)

Very good presentation about getting started with deep learning. There are lots of great libraries and pre-trained neural networks out there to get started with!

Building protocol libraries the right way

(abstract; video)

I really enjoyed this talk. Cory Benfield describes the importance of keeping a clean separation between your protocol parsing code, and your IO. It not only makes things more testable, but makes code more reusable. Nearly every HTTP library in the Python ecosystem needs to re-implement its own HTTP parsing code, since all the existing code is tightly coupled to the network IO calls.
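Here's a small sketch of what that separation can look like, using a toy newline-delimited protocol rather than HTTP. The class name LineProtocol and its receive_data method are my own invention for illustration; the point is only that the parser accepts bytes and returns events without ever touching a socket.

```python
class LineProtocol:
    """Parses newline-terminated messages. Knows nothing about sockets."""

    def __init__(self):
        self._buffer = b""

    def receive_data(self, data):
        """Feed raw bytes in, get a list of complete lines back."""
        self._buffer += data
        # Everything before the last newline is complete; the remainder
        # stays in the buffer until more data arrives.
        *lines, self._buffer = self._buffer.split(b"\n")
        return [line.decode("utf-8") for line in lines]


proto = LineProtocol()
print(proto.receive_data(b"hel"))
print(proto.receive_data(b"lo\nwor"))
print(proto.receive_data(b"ld\n"))
```

Whatever does the actual IO (a blocking socket, asyncio, or a unit test feeding in byte strings) just calls receive_data with whatever it read.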

Tuesday

Guido's Keynote

(video)

Some interesting notes in here about the history of Python, and a look at what's coming in 3.6.

Click

(abstract; video)

An intro to the click module for creating beautiful command line interfaces.

I like that click helps you to build testable CLIs.

HTTP/2 and asynchronous APIs

(abstract; video)

A good introduction to what HTTP/2 can do, and why it's such an improvement over HTTP/1.x.

Remote calls != local calls

(abstract; video)

Really good talk about failing gracefully. He covered some familiar topics like adding timeouts and retries to things that can fail, but also introduced to me the concept of circuit breakers. The idea with a circuit breaker is to prevent talking to services you know are down. For example, if you have failed to get a response from service X the past 5 times due to timeouts or errors, then open the circuit breaker for a set amount of time. Future calls to service X from your application will be intercepted, and will fail early. This can avoid hammering a service while it's in an error state, and works well in combination with timeouts and retries of course.
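A minimal sketch of the circuit breaker idea, under my own assumptions: the class and parameter names are hypothetical, the threshold is a simple consecutive-failure count, and after the cool-down a single further failure re-opens the breaker. It isn't taken from the talk or from any particular library.

```python
import time


class CircuitOpenError(Exception):
    """Raised when a call is refused because the breaker is open."""


class CircuitBreaker:
    def __init__(self, max_failures=5, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.clock = clock  # injectable so tests don't need to sleep
        self.failures = 0
        self.opened_at = None

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                raise CircuitOpenError("service marked as down; failing early")
            # Cool-down elapsed: allow a trial call, but stay one failure
            # away from opening again.
            self.opened_at = None
            self.failures = self.max_failures - 1
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            raise
        self.failures = 0
        return result
```

You'd wrap each remote call as breaker.call(fetch, url), with one breaker per remote service, and combine it with timeouts and retries inside fetch itself.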

I was thinking quite a bit about Ben's redo module during this talk. It's a great module for handling retries!

Diving into the wreck

(abstract; video)

A look into diagnosing performance problems in applications. Some neat tools and techniques introduced here, but I felt he blamed the DB a little too much :)

Wednesday

Magic Wormhole

(abstract; video; slides)

I didn't end up going to this talk, but I did have a chance to chat with Brian before. magic-wormhole is a tool to safely transfer files from one computer to another. Think scp, but without needing ssh keys set up already, or direct network flows. Very neat tool!

Computational Physics

(abstract; video)

How to do planetary orbit simulations in Python. Pretty interesting talk; he introduced me to Feynman, and to some of the important characteristics of the simulation methods he covered.

Small batch artisanal bots

(abstract; video)

Hilarious talk about building bots with Python. Definitely worth watching, although unfortunately it's only a partial recording.

Gilectomy

(abstract; video)

The infamous GIL is gone! And your Python programs only run 25x slower!

Larry describes why the GIL was introduced, what it does, and what's involved with removing it. He's actually got a fork of Python with the GIL removed, but performance suffers quite a bit when run without the GIL.

Lars' Keynote

(video)

If you watch only one video from PyCon, watch this. It's just incredible.

Diving into python logging

Python has a very rich logging system. It's very easy to add structured or unstructured log output to your python code, and have it written to a file, or output to the console, or sent to syslog, or to customize the output format.

We're in the middle of re-examining how logging works in mozharness to make it easier to factor-out code and have fewer mixins.

Here are a few tips and tricks that have really helped me with python logging:

There can be only more than one

Well, there can be only one logger with a given name. There is a special "root" logger with no name. Multiple getLogger(name) calls with the same name will return the same logger object. This is an important property because it means you don't need to explicitly pass logger objects around in your code. You can retrieve them by name if you wish. The logging module is maintaining a global registry of logging objects.

You can have multiple loggers active, each specific to its own module or even class or instance.
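A quick way to convince yourself of the registry behaviour:

```python
import logging

a = logging.getLogger("foo")
b = logging.getLogger("foo")
print(a is b)  # True: the second call returns the already-registered logger

# Loggers with different names are distinct objects.
print(logging.getLogger("foo.bar") is a)  # False

# Calling getLogger() with no name returns the root logger.
print(logging.getLogger() is logging.root)  # True
```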

Each logger has a name, typically the name of the module it's being used from. A common pattern you see in python modules is this:

```python
# in module foo.py
import logging
log = logging.getLogger(__name__)
```

This works because inside foo.py, __name__ is equal to "foo". So inside this module the log object is specific to this module.

Loggers are hierarchical

The names of the loggers form their own namespace, with "." separating levels. This means that if you have loggers called foo.bar and foo.baz, you can do things on logger foo that will impact both of the children. In particular, you can set the logging level of foo to show or ignore debug messages for both submodules.

```python
# Let's enable all the debug logging for all the foo modules
import logging
logging.getLogger('foo').setLevel(logging.DEBUG)
```

Log messages are like events that flow up through the hierarchy

Let's say we have a module foo.bar:

```python
import logging
log = logging.getLogger(__name__)  # __name__ is "foo.bar" here

def make_widget():
    log.debug("made a widget!")
```

When we call make_widget(), the code generates a debug log message. Each logger in the hierarchy has a chance to output something for the message, ignore it, or pass the message along to its parent.

The default configuration for loggers is to have their levels unset (or set to NOTSET). This means the logger will just pass the message on up to its parent. Rinse & repeat until you get up to the root logger.

So if the foo.bar logger hasn't specified a level, the message will continue up to the foo logger. If the foo logger hasn't specified a level, the message will continue up to the root logger.
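One way to watch this level lookup in action is getEffectiveLevel(), which reports the level a logger ends up using once it has walked up to the nearest ancestor with an explicit level. The logger names demo_app and demo_app.widgets below are made up for the example.

```python
import logging

root = logging.getLogger()
app = logging.getLogger("demo_app")
widgets = logging.getLogger("demo_app.widgets")

root.setLevel(logging.WARNING)  # the root logger's default level

# No level set on the child: it uses the nearest configured ancestor's.
print(widgets.level == logging.NOTSET)                  # True
print(widgets.getEffectiveLevel() == logging.WARNING)   # True

# Setting a level on the parent changes what the child lets through.
app.setLevel(logging.DEBUG)
print(widgets.getEffectiveLevel() == logging.DEBUG)     # True
print(widgets.isEnabledFor(logging.DEBUG))              # True
```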

This is why you typically configure the logging output on the root logger; it typically gets ALL THE MESSAGES!!! Because this is so common, there's a dedicated method for configuring the root logger: logging.basicConfig()

This also allows us to use mixed levels of log output depending on where the messages are coming from:

```python
import logging

# Enable debug logging for all the foo modules
logging.getLogger("foo").setLevel(logging.DEBUG)

# Configure the root logger to log only INFO calls, and output to the console
# (the default)
logging.basicConfig(level=logging.INFO)

# This will output the debug message
logging.getLogger("foo.bar").debug("ohai!")
```

If you comment out the setLevel(logging.DEBUG) call, you won't see the message at all.

exc_info is teh awesome

All the built-in logging calls support a keyword called exc_info, which, if it isn't false, causes the current exception information to be logged in addition to the log message. e.g.:

```python
import logging
logging.basicConfig(level=logging.INFO)

log = logging.getLogger(__name__)

try:
    assert False
except AssertionError:
    log.info("surprise! got an exception!", exc_info=True)
```

There's a special case for this, log.exception(), which is equivalent to log.error(..., exc_info=True)

Python 3.2 introduced a new keyword, stack_info, which will output the current stack up to the line making the logging call. Very handy to figure out how you got to a certain point in the code, even if no exceptions have occurred!
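Here's a small sketch of stack_info in use. To keep the example self-contained it writes to an in-memory stream instead of the console; the logger name stack_demo is arbitrary.

```python
import io
import logging

# Capture output in a string so it's easy to inspect afterwards.
stream = io.StringIO()
log = logging.getLogger("stack_demo")
log.setLevel(logging.INFO)
log.propagate = False
log.addHandler(logging.StreamHandler(stream))


def make_widget():
    log.info("how did we get here?", stack_info=True)


make_widget()
output = stream.getvalue()
print(output)
```

The message is followed by a "Stack (most recent call last):" section listing the frames that led to the log call, ending in make_widget.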

"No handlers found..."

You've probably come across this message, especially when working with 3rd party modules. What this means is that you don't have any logging handlers configured, and something is trying to log a message. The message has gone all the way up the logging hierarchy and fallen off the...top of the chain (maybe I need a better metaphor).

```python
import logging
log = logging.getLogger()
log.error("no log for you!")
```

outputs:

```
No handlers could be found for logger "root"
```

There are two things that can be done here:

  1. Configure logging in your module with basicConfig() or similar

  2. Library authors should add a NullHandler at the root of their module to prevent this. See the cookbook and this blog for more details here.
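For the second option, a minimal sketch of what a library would do, assuming a hypothetical package called mylib:

```python
import logging

# In the library's top-level package, e.g. mylib/__init__.py
# ("mylib" is a made-up name for this example).
logging.getLogger("mylib").addHandler(logging.NullHandler())


def do_work():
    # Submodule loggers such as mylib.worker inherit the NullHandler
    # through the hierarchy, so this never triggers the warning.
    logging.getLogger("mylib.worker").warning("something looks off")


do_work()
```

The application using the library is still free to configure real handlers; the NullHandler just swallows messages when it hasn't.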

Want more?

I really recommend that you read the logging documentation and cookbook which have a lot more great information (and are also very well written!) There's a lot more you can do, with custom log handlers, different output formats, outputting to many locations at once, etc. Have fun!

Stuff I learned this weekend - vim, python and more!

Call me strange, but I actually enjoy spending time reading up on programming tools that I use regularly. I think of programming tools as tools in the same way that a hammer or a saw is a tool. They both help you to get a job done. You need to learn how to use them properly. You need to keep tools well maintained. Sometimes you need to throw a tool away and get a new one.

For my professional and personal programming I spend 99% of my time writing python with vim, and so I really enjoy learning more about them.

Stuff I learned about vim:

How I boosted my vim - lots of great vim tips (how did I not know about :set visualbell until now???) and plugins, which introduced me to...

nerdtree - for file browsing in vim. It also reminded me to make use of the command-t plugin I had installed a while back.

surround - for giving you the ability to work with the surroundings for text objects. Ever wanted to easily add quotes to a word, or change double quotes surrounding a string to single quotes? I know you have - so go install this plugin now!

snipmate - lets you define lots of predefined snippets for various languages. Now in python I can type "def<tab>" and bam! I get a basic function definition.

I wasn't able to get to PyCon US 2012 this year, so I'm very happy that the sessions were all recorded.

The art of subclassing - great tips on how to do subclassing well in python.

why classes aren't always what you want - I liked how he emphasized that you should always be open to refactoring your code. Usually making your own exception classes is a bad idea...however one great nugget buried in there was if you can't decide if you should raise a KeyError, AttributeError or TypeError (for example), make a class that inherits from all 3 and raise that. Then consumers can catch what makes sense to them instead of guessing.

introduction to metaclasses - metaclasses aren't so scary after all!

nice framework for building gevent services I liked the simple examples here. It introduces the ginkgo framework, which I'm hoping to have some time to play with soon.
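The multiple-inheritance exception trick mentioned above can be sketched like this. MissingFieldError and get_field are hypothetical names of my own, not something from the talk.

```python
class MissingFieldError(KeyError, AttributeError, TypeError):
    """Hypothetical error that callers may catch as any of its bases."""


def get_field(record, name):
    try:
        return record[name]
    except KeyError:
        raise MissingFieldError(name) from None


# Each consumer catches whichever base class makes sense to them.
for expected in (KeyError, AttributeError, TypeError):
    try:
        get_field({}, "colour")
    except expected as exc:
        print(f"caught as {expected.__name__}: {exc!r}")
```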

How RelEng uses mercurial quickly and safely

Release Engineering uses hg a lot. Every build or test involves code from at least one hg repository.

Last year we started using some internal mirrors and, at the same time, adopted the hg share extension across the board. Both had a big impact on the load on hg and on the time to clone/update local working copies.

I think what we've done is pretty useful and resilient to various types of failure, so I hope this blog post is helpful for others trying to automate processes involving hg!

The primary tool we're using for hg operations is called hgtool (available from our tools repo). Yes, we're very inventive at naming things.

hgtool's basic usage is to be given the location of a remote repository, a local directory, and usually a revision. Its job is to make sure that the local directory contains a clean working copy of the repository at the specified revision.

First of all, you don't need to worry about doing an 'hg clone' if the directory doesn't exist, or 'hg pull' if it does exist. This simplifies a lot of build logic!

Next, we've built support for mirrors into hgtool. You can pass one or more mirror repositories to the tool with '--mirror', and it will attempt to pull/clone from the mirrors before trying to pull/clone from the primary repository. At Mozilla we have several internal hg mirrors that we use to reduce load on the primary public-facing hg servers.

To improve the case when you need to do a full clone, we've added support for importing an hg bundle to initialize the local repository rather than doing a full clone from the mirror or master repositories. You can pass one or more bundle urls with '--bundle'. hgtool will download and import the bundle, and then pull in new changesets from the mirrors and master repositories.

Finally, hgtool supports the 'hg share' extension. If you specify a base directory for shared repositories, all of the above operations will be run on a locally shared repository first, and then the working copy will be created with 'hg share', and updated to the correct revision.

There are all kinds of fallback behaviours specified, like if you fail to import a bundle, try to clone from a mirror; then if you fail to clone from a mirror, try to clone from the master. These fallbacks have resulted in a far more resilient build process.
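The fallback chain is easy to picture as "try each source in order until one works". The sketch below is not hgtool's actual code; fetch_with_fallbacks and the stub functions are hypothetical stand-ins for importing a bundle, cloning from a mirror, and cloning from the master.

```python
def fetch_with_fallbacks(sources):
    """Try each (label, fetch) pair in order; return the first label that works.

    Raises RuntimeError if every source fails.
    """
    errors = []
    for label, fetch in sources:
        try:
            fetch()
        except Exception as exc:  # a real tool would catch narrower errors
            errors.append(f"{label}: {exc}")
            continue
        return label
    raise RuntimeError("all sources failed: " + "; ".join(errors))


# Stubs standing in for the real hg operations.
def import_bundle():
    raise IOError("bundle download failed")


def clone_from_mirror():
    pass


def clone_from_master():
    pass


used = fetch_with_fallbacks([
    ("bundle", import_bundle),
    ("mirror", clone_from_mirror),
    ("master", clone_from_master),
])
print(used)  # mirror
```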

Investigating hg performance

(caveat lector: this is a long post with lots of shell snippets and output; it's mostly a brain dump of what I did to investigate performance issues on hg.mozilla.org. I hope you find it useful. Scroll to the bottom for the summary.)

Everybody knows that pushing to try can be slow. But why?

While waiting for my push to try to complete, I wondered what exactly was slow.

I started by cloning my own version of try:


$ hg clone http://hg.mozilla.org try

destination directory: try

requesting all changes

adding changesets

adding manifests

adding file changes

added 95917 changesets with 447521 changes to 89564 files (+2446 heads)

updating to branch default

53650 files updated, 0 files merged, 0 files removed, 0 files unresolved

Next I instrumented hg so I could get some profile information:


$ sudo vi /usr/local/bin/hg

python -m cProfile -o /tmp/hg.profile /usr/bin/hg "$@"

Then I timed how long it took me to check what would be pushed:


$ time hg out ssh://localhost//home/catlee/mozilla/try

hg out ssh://localhost//home/catlee/mozilla/try  0.57s user 0.04s system 54% cpu 1.114 total

That's not too bad. Let's check our profile:


import pstats

pstats.Stats("/tmp/hg.profile").strip_dirs().sort_stats('time').print_stats(10)

Fri Dec  9 00:25:02 2011    /tmp/hg.profile


         38744 function calls (37761 primitive calls) in 0.593 seconds

   Ordered by: internal time
   List reduced from 476 to 10 due to restriction 

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
       13    0.462    0.036    0.462    0.036 {method 'readline' of 'file' objects}
        1    0.039    0.039    0.039    0.039 {mercurial.parsers.parse_index2}
       40    0.031    0.001    0.031    0.001 revlog.py:291(rev)
        1    0.019    0.019    0.019    0.019 revlog.py:622(headrevs)
   177/70    0.009    0.000    0.019    0.000 {__import__}
     6326    0.004    0.000    0.006    0.000 cmdutil.py:15(parsealiases)
       13    0.003    0.000    0.003    0.000 {method 'read' of 'file' objects}
       93    0.002    0.000    0.008    0.000 cmdutil.py:18(findpossible)
     7212    0.001    0.000    0.001    0.000 {method 'split' of 'str' objects}
  392/313    0.001    0.000    0.007    0.000 demandimport.py:92(_demandimport)

The top item is readline() on file objects? I wonder if that's socket operations. I'm ssh'ing to localhost, so it's really fast. Let's add 100ms latency:


$ sudo tc qdisc add dev lo root handle 1:0 netem delay 100ms

$ time hg out ssh://localhost//home/catlee/mozilla/try

hg out ssh://localhost//home/catlee/mozilla/try  0.58s user 0.05s system 14% cpu 4.339 total


import pstats

pstats.Stats("/tmp/hg.profile").strip_dirs().sort_stats('time').print_stats(10)

Fri Dec  9 00:42:09 2011    /tmp/hg.profile


         38744 function calls (37761 primitive calls) in 2.728 seconds

   Ordered by: internal time
   List reduced from 476 to 10 due to restriction 

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
       13    2.583    0.199    2.583    0.199 {method 'readline' of 'file' objects}
        1    0.054    0.054    0.054    0.054 {mercurial.parsers.parse_index2}
       40    0.028    0.001    0.028    0.001 revlog.py:291(rev)
        1    0.019    0.019    0.019    0.019 revlog.py:622(headrevs)
   177/70    0.010    0.000    0.019    0.000 {__import__}
       13    0.006    0.000    0.006    0.000 {method 'read' of 'file' objects}
     6326    0.002    0.000    0.004    0.000 cmdutil.py:15(parsealiases)
       93    0.002    0.000    0.006    0.000 cmdutil.py:18(findpossible)
  392/313    0.002    0.000    0.008    0.000 demandimport.py:92(_demandimport)
     7212    0.001    0.000    0.001    0.000 {method 'split' of 'str' objects}

Yep, definitely getting worse with more latency on the network connection.

Oh, and I'm using a recent version of hg:


$ hg --version

Mercurial Distributed SCM (version 2.0)



$ echo hello | ssh localhost hg -R /home/catlee/mozilla/try serve --stdio

145

capabilities: lookup changegroupsubset branchmap pushkey known getbundle unbundlehash batch stream unbundle=HG10GZ,HG10BZ,HG10UN httpheader=1024

This doesn't match what hg.mozilla.org is running:


$ echo hello | ssh hg.mozilla.org hg -R /mozilla-central serve --stdio  

67

capabilities: unbundle lookup changegroupsubset branchmap stream=1

So it must be using an older version. Let's see what mercurial 1.6 does:


$ mkvirtualenv hg16

New python executable in hg16/bin/python

Installing setuptools...



(hg16)$ pip install mercurial==1.6

Downloading/unpacking mercurial==1.6
  Downloading mercurial-1.6.tar.gz (2.2Mb): 2.2Mb downloaded
...



(hg16)$ hg --version

Mercurial Distributed SCM (version 1.6)



(hg16)$ echo hello | ssh localhost /home/catlee/.virtualenvs/hg16/bin/hg -R /home/catlee/mozilla/mozilla-central serve --stdio

75

capabilities: unbundle lookup changegroupsubset branchmap pushkey stream=1

That looks pretty close to what hg.mozilla.org claims it supports, so let's time 'hg out' again:


(hg16)$ time hg out ssh://localhost//home/catlee/mozilla/try

hg out ssh://localhost//home/catlee/mozilla/try  0.73s user 0.04s system 3% cpu 24.278 total

tl;dr

Finding missing changesets between two local repositories is 6x slower with hg 1.6 (from 4 seconds with hg 2.0 to 24 seconds with hg 1.6). Add a few hundred people and machines hitting the same repository at the same time, and I imagine things can get bad pretty quickly.

Some further searching reveals that mercurial does support a faster method of finding missing changesets in "newer" versions, although I can't figure out exactly when this change was introduced. There's already a bug on file for upgrading mercurial on hg.mozilla.org, so hopefully that improves the situation for pushes to try.

The tools we use every day aren't magical; they're subject to normal debugging and profiling techniques. If a tool you're using is holding you back, find out why!

cURL and paste

cURL and paste...two great tastes that apparently don't go well at all together!

I've been writing a bunch of simple wsgi apps lately, some of which handle file uploads.

Take this tiny application:


import webob



def app(environ, start_response):
    req = webob.Request(environ)
    req.body_file.read()
    return webob.Response("OK!")(environ, start_response)


import paste.httpserver

paste.httpserver.serve(app, port=8090)

Then throw some files at it with cURL:


[catlee] % for f in $(find -type f); do time curl -s -o /dev/null --data-binary @$f http://localhost:8090; done

curl -s -o /dev/null --data-binary @$f http://localhost:8090  0.00s user 0.00s system 0% cpu 1.013 total

curl -s -o /dev/null --data-binary @$f http://localhost:8090  0.01s user 0.00s system 63% cpu 0.013 total

curl -s -o /dev/null --data-binary @$f http://localhost:8090  0.01s user 0.00s system 64% cpu 0.012 total

curl -s -o /dev/null --data-binary @$f http://localhost:8090  0.01s user 0.00s system 81% cpu 0.015 total

curl -s -o /dev/null --data-binary @$f http://localhost:8090  0.01s user 0.00s system 0% cpu 1.014 total

curl -s -o /dev/null --data-binary @$f http://localhost:8090  0.00s user 0.00s system 0% cpu 1.009 total

Huh? Some files take a second to upload?

I discovered after much digging, and rewriting my (more complicated) app several times, that the problem is that cURL sends an extra "Expect: 100-continue" header. This is supposed to let a web server respond with "100 Continue" immediately or reject an upload based on the request headers.

The problem is that paste's httpserver doesn't send the "100 Continue" response by default, and so cURL will wait for a second before giving up and sending the rest of the request.

The magic to turn this off is the '-0' option to cURL, which forces HTTP/1.0 mode:


[catlee] % for f in $(find -type f); do time curl -0 -s -o /dev/null --data-binary @$f http://localhost:8090; done

curl -0 -s -o /dev/null --data-binary @$f http://localhost:8090  0.00s user 0.00s system 66% cpu 0.012 total

curl -0 -s -o /dev/null --data-binary @$f http://localhost:8090  0.01s user 0.00s system 64% cpu 0.012 total

curl -0 -s -o /dev/null --data-binary @$f http://localhost:8090  0.00s user 0.01s system 58% cpu 0.014 total

curl -0 -s -o /dev/null --data-binary @$f http://localhost:8090  0.01s user 0.00s system 66% cpu 0.012 total

curl -0 -s -o /dev/null --data-binary @$f http://localhost:8090  0.00s user 0.00s system 59% cpu 0.013 total

curl -0 -s -o /dev/null --data-binary @$f http://localhost:8090  0.01s user 0.00s system 65% cpu 0.012 total

self-serve builds!

Do you want to be able to cancel your own try server builds?

Do you want to be able to re-trigger a failed nightly build before the RelEng sheriff wakes up?

Do you want to be able to get additional test runs on your build?

If you answered an enthusiastic YES to any or all of these questions, then self-serve is for you.

self-serve was created to provide an API to allow developers to interact with our build infrastructure, with the goal being that others would then create tools against it. It's still early days for this self-serve API, so just a few caveats:

  • This is very much pre-alpha and may cause your computer to explode, your keg to run dry, or may simply hang.
  • It's slower than I want. I've spent a bit of time optimizing and caching, but I think it can be much better. Just look at shaver's bugzilla search to see what's possible for speed. Part of the problem here is that it's currently running on a VM that's doing a few dozen other things. We're working on getting faster hardware, but didn't want to block this pre-alpha-rollout on that.
  • You need to log in with your LDAP credentials to work with it.
  • The HTML interface is teh suck. Good thing I'm not paid to be a front-end webdev! Really, the goal here wasn't to create a fully functional web interface, but rather to provide a functional programmatic interface.
  • Changing build priorities may run afoul of bug 555664...haven't had a chance to test out exactly what happens right now if a high priority job gets merged with a lower priority one.

That being said, I'm proud to be able to finally make this public. Documentation for the REST API is available as part of the web interface itself, and the code is available as part of the buildapi repository on hg.mozilla.org

https://build.mozilla.org/buildapi/self-serve

Please be gentle!

Any questions, problems or feedback can be left here, or filed in bugzilla.

Just who am I talking to? (verifying https connections with python)

Did you know that python's urllib module supports connecting to web servers over HTTPS? It's easy!

import urllib

data = urllib.urlopen("https://www.google.com").read()

print data

Did you also know that it provides absolutely zero guarantees that your "secure" data isn't being observed by a man-in-the-middle?

Run this:


from paste import httpserver

def app(environ, start_response):
    start_response("200 OK", [])
    return "Thanks for your secrets!"


httpserver.serve(app, host='127.0.0.1', port='8080', ssl_pem='*')

This little web app will generate a random SSL certificate for you each time it's run. A self-signed, completely untrustworthy certificate.

Now modify your first script to look at https://localhost:8080 instead. Or, for more fun, keep it pointing at google and mess with your IP routing to redirect google.com:443 to localhost:8080.


iptables -t nat -A OUTPUT -d google.com -p tcp --dport 443 -j DNAT --to-destination 127.0.0.1:8080

Run your script again, and see what it says.

Instead of the raw HTML of google.com, you now get "Thanks for your secrets!". That's right, python will happily accept without complaint or warning the random certificate generated by this little python app pretending to be google.com.

Sometimes you want to know who you're talking to, you know?


import httplib, socket, ssl, urllib2

def buildValidatingOpener(ca_certs):
    class VerifiedHTTPSConnection(httplib.HTTPSConnection):
        def connect(self):
            # overrides the version in httplib so that we do
            #    certificate verification
            sock = socket.create_connection((self.host, self.port),
                                            self.timeout)
            if self._tunnel_host:
                self.sock = sock
                self._tunnel()

            # wrap the socket using verification with the root
            #    certs in ca_certs
            self.sock = ssl.wrap_socket(sock,
                                        self.key_file,
                                        self.cert_file,
                                        cert_reqs=ssl.CERT_REQUIRED,
                                        ca_certs=ca_certs,
                                        )

    # wraps https connections with ssl certificate verification
    class VerifiedHTTPSHandler(urllib2.HTTPSHandler):
        def __init__(self, connection_class=VerifiedHTTPSConnection):
            self.specialized_conn_class = connection_class
            urllib2.HTTPSHandler.__init__(self)

        def https_open(self, req):
            return self.do_open(self.specialized_conn_class, req)

    https_handler = VerifiedHTTPSHandler()
    url_opener = urllib2.build_opener(https_handler)

    return url_opener


opener = buildValidatingOpener("/usr/lib/ssl/certs/ca-certificates.crt")

req = urllib2.Request("https://www.google.com")

print opener.open(req).read()

Using this new validating url opener, we can make sure we're talking to someone with a validly signed certificate. With our IP redirection in place, or pointing at localhost:8080 explicitly, we get a certificate invalid error. We still don't know for sure that it's google (could be some other site with a valid ssl certificate), but maybe we'll tackle that in a future post!
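For comparison, here's a rough sketch of the same idea on Python 3, where the standard library can do both certificate and hostname verification for you. The function names build_context and build_validating_opener are my own; ssl.create_default_context and urllib.request.HTTPSHandler are the standard library pieces doing the work.

```python
import ssl
import urllib.request


def build_context(ca_file=None):
    """Create an SSL context that verifies certificates and hostnames.

    ca_file is an optional path to a CA bundle; when omitted, the
    system's default trust store is used.
    """
    return ssl.create_default_context(cafile=ca_file)


def build_validating_opener(context):
    """Return a urllib opener that uses the given SSL context for HTTPS."""
    handler = urllib.request.HTTPSHandler(context=context)
    return urllib.request.build_opener(handler)


context = build_context()
opener = build_validating_opener(context)

print(context.verify_mode == ssl.CERT_REQUIRED)  # True
print(context.check_hostname)                    # True

# opener.open("https://www.google.com").read() would now raise an error
# if the certificate is untrusted or was issued for a different host.
```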

Faster try builds!

When we run a try build, we wipe out the build directory between each job; we want to make sure that every user's build has a fresh environment to build in.

Unfortunately this means that we also wipe out the clone of the try repo, and so we have to re-clone try every time.

On Linux and OSX we were spending an average of 30 minutes to re-clone try, and on Windows 40 minutes. The majority of that is simply 'hg clone' time, but a good portion is due to locks: we need to limit how many simultaneous build slaves are cloning from try at once, otherwise the hg server blows up.

Way back in September, Steve Fink suggested using hg's share extension to make cloning faster.

Then in November, Ben Hearsum landed some changes that paved the way to actually turning this on.

Today we've enabled the share extension for Linux (both 32 and 64-bit) and OSX 10.6 builds on try. Windows and OSX 10.5 are coming too; we need to upgrade hg on the build machines first.

Average times for the 'clone' step are down to less than 5 minutes now.

This means you get your builds 25 minutes faster! It also means we're not hammering the try repo so badly, and so hopefully won't have to reset it for a long long time.

We're planning on rolling this out across the board, so nightly builds get faster, release builds get faster, clobber builds get faster, etc...

Enjoy!