Here at HumanGeo we use a lot of Python, and it is tons of fun. Python is a great language for writing beautiful and functional code amazingly fast, and it is most definitely my favorite language to use both privately and professionally. However, even though it is a wonderful language, Python can be painfully slow. Luckily, there are some amazing tools to help profile your code so that you can keep your beautiful code fast.
When I started working here at HumanGeo, I was tasked with taking a program that took many hours to run, finding the bottlenecks, and then doing whatever I could to make it run faster. I used many tools, including
cProfile, PyCallGraph (source), and even PyPy (an alternate, fast, interpreter for Python), to determine the best ways to optimize the program. I will go through how I used all of these programs, except for PyPy (which I ruled out to maintain interpreter consistency in production), and how they can help even the most seasoned developers find ways to better optimize their code.
Disclaimer: do not prematurely optimize! I'll just leave this here.
Let's talk about some of the handy tools that you can use to profile Python code.
The CPython distribution comes with two profiling tools, profile and cProfile. Both share the same API and should act the same; however, the former has greater runtime overhead, so we shall stick with cProfile for this post.
cProfile is a handy tool for getting a nice, greppable profile of your code, and for getting a good idea of where the hot parts of your code are. Let's look at some sample slow code:
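The original snippet is not reproduced in this copy of the post; a minimal stand-in consistent with the description below (an `expensive` function built on `time.sleep`) might look like this. The function names and loop count are assumptions:

```python
import time

def expensive():
    """Simulate a slow computation by sleeping repeatedly."""
    result = 0
    for i in range(10):
        time.sleep(0.1)  # pretend to do real work
        result += i
    return result

def main():
    print(expensive())

if __name__ == "__main__":
    main()
```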
Here we are simulating a long-running program by calling time.sleep and pretending that the result matters. Let's profile this and see what we find:
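Assuming the script above is saved as `slow.py` (a hypothetical filename), cProfile can be invoked straight from the command line:

```shell
python -m cProfile slow.py
```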
Now, this is very trivial code, but I find the default output not as helpful as it could be. The list of calls is sorted alphabetically, which has no importance to us; I would much rather see the list sorted by number of calls, or by cumulative run time. Luckily, the -s argument exists, and we can sort the list and see the hot parts of our code right now!
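With the same hypothetical `slow.py`, sorting by cumulative time looks like:

```shell
python -m cProfile -s cumulative slow.py
```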
Ah! Now we see that the hot code is in our expensive function, which ends up calling time.sleep enough times to cause an annoying slowdown.
The list of valid arguments to the
-s parameter can be found in the Python documentation. Make sure to use the output option,
-o, if you want to save these results to a file!
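As a sketch of that save-and-inspect workflow: a profile written to a file (here via cProfile.run's second argument, equivalent to -o on the command line) can be read back with the standard pstats module. The filename profile.out is just an example:

```python
import cProfile
import pstats

# Profile a snippet and write the raw stats to a file
# (equivalent to passing -o on the command line).
cProfile.run("sum(range(100000))", "profile.out")

# Load the saved stats, sort by cumulative time, and print
# the ten most expensive entries.
stats = pstats.Stats("profile.out")
stats.sort_stats("cumulative").print_stats(10)
```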
With the basics down, let's look at some other ways that we can find hot code using profiling tools.
PyCallGraph can be seen as a visual extension of cProfile, where we can follow the flow of the code with a nice Graphviz image to look through. PyCallGraph is not part of the standard Python installation, but it can be simply installed with pip:
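The install command itself is missing from this copy of the post; with pip it would presumably be:

```shell
pip install pycallgraph
```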
We can run this graphical profiler with the following command:
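The command is not preserved in this copy of the post; PyCallGraph's documented CLI form, using the same hypothetical `slow.py` script, is:

```shell
pycallgraph graphviz -- ./slow.py
```

This writes a `pycallgraph.png` image to the current directory (it requires Graphviz to be installed).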
Under the hood, PyCallGraph uses cProfile, so the numbers should be the same as the data we got from cProfile; however, the benefit of PyCallGraph is in its ability to show the relationships of the functions being called.
Let's look at what that graph looks like:
This is so handy! It shows the flow of the program, and nicely notifies us of each function, module, and file that the program runs through, along with runtime and number of calls. Running this in a big application generates a large image, but with the coloring, it is quite easy to find the code that matters. Here is a graph from the PyCallGraph documentation, showing the flow of code involving complex regular expression calls:
What can we do with this information?
Once we determine the cause of the slow code, we can choose a proper course of action to speed it up. Let's talk about some possible solutions to slow code, given the issue at hand.
If you find your code is heavily input/output dependent, including sending many web requests, then you may be able to solve this problem by using Python's standard threading module. Non-I/O-related threading is not well suited to Python, due to CPython's GIL, which precludes it from using more than one core at a time for CPU-bound tasks.
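As a sketch of that approach, a thread pool lets I/O-bound tasks overlap in time; the `fetch` function and URLs below are made up, with `time.sleep` standing in for network latency:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fetch(url):
    """Stand-in for an I/O-bound task such as a web request."""
    time.sleep(0.2)  # simulate network latency
    return f"response from {url}"

urls = [f"http://example.com/{i}" for i in range(8)]

start = time.time()
with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(fetch, urls))
elapsed = time.time() - start

# The eight 0.2 s "requests" overlap while each thread waits on I/O,
# so this finishes in far less time than a sequential loop's 1.6 s.
print(f"{len(results)} responses in {elapsed:.2f}s")
```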
You know what they say, once you decide to use regular expressions to fix a problem, you now have two problems. Regular expressions are hard to get right, and hard to maintain. I could write a whole separate blog post on this (and I will not, regexes are hard, and there are much better posts than one that I could write), but I will give a few quick tips:
- Avoid .*; greedy catchalls are slow, and using character classes as much as possible can help with this
- Do not use regex! Many regexes can be replaced with simple string methods, such as str.endswith. Check out the str documentation for tons of great info.
- Use re.VERBOSE! Python's regex engine is great and super helpful, use it!
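To make those tips concrete, here is a small sketch (the data and pattern are made up): a plain string method replacing a regex for a simple suffix check, and a precompiled pattern written with re.VERBOSE so it stays readable:

```python
import re

filenames = ["report.csv", "notes.txt", "data.csv"]

# A string method beats a regex for a simple suffix check.
csvs = [f for f in filenames if f.endswith(".csv")]

# When a regex is genuinely needed, compile it once and use
# re.VERBOSE so the pattern can carry its own comments.
VERSION = re.compile(r"""
    (?P<major>\d+) \.   # major version number
    (?P<minor>\d+)      # minor version number
""", re.VERBOSE)

match = VERSION.search("release 3.11 is out")
print(csvs, match.group("major"), match.group("minor"))
```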
That's all I will say on regexes; there are some great posts all over the internet if you want more information.
In the case of the code I was profiling, we were running a Python function tens of thousands of times in order to stem English words. The best part about finding that this was the culprit was that this kind of operation is easily cacheable. We were able to save the results of this function, and in the end, the code ran 10 times as fast as it did before. Creating a cache in Python is super easy.
This technique is called memoization, and it can be implemented as a decorator, which is then easily applied to Python functions:
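The original cache code is not preserved in this copy of the post; a typical memoize decorator along these lines (the `memoize` and `stem` names are illustrative, and `stem` only fakes real stemming) would be:

```python
import functools

def memoize(func):
    """Cache results keyed by the function's positional arguments."""
    cache = {}

    @functools.wraps(func)
    def wrapper(*args):
        if args not in cache:
            cache[args] = func(*args)  # compute once per argument tuple
        return cache[args]

    return wrapper

@memoize
def stem(word):
    """Stand-in for an expensive operation such as stemming a word."""
    print(f"stemming {word!r}")  # only printed on a cache miss
    return word.rstrip("s")

stem("words")  # computed and cached
stem("words")  # served from the cache; nothing is printed
```

In modern Python, the standard library's functools.lru_cache decorator provides the same behavior without hand-rolling the cache dictionary.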
Now if we run this function multiple times, then the result will only be computed once.
This was a great speedup for the project, and the code runs without a hitch.
Disclaimer: make sure this is only used for pure functions! If memoization is used for functions with side effects, such as I/O, then caching can have unintended results.
If your code is not readily memoizable, your algorithm is not something crazy like O(n!), and your profile is 'flat' (that is, there are no obvious hot sections of code), then maybe you should look into another runtime or language. PyPy is a great option, as is writing a C extension for the meat of your algorithm. Luckily, what I was working on did not come to this, but the option is there if it is needed.
Profiling code can help you understand the flow of the project, where the hot code is, and what you can do as a developer to speed it up. Python profiling tools are great, super easy to use, and in-depth enough to help you get to the root of the problem fast. Python is not meant to be a fast language, but that does not mean that you should be writing slow code! Take charge of your algorithms, do not forget to profile, and never prematurely optimize.
We’re hiring! If you’re interested in geospatial, big data, social media analytics, Amazon Web Services (AWS), visualization, and/or the latest UI and server technologies, drop us an e-mail at firstname.lastname@example.org.