Introduction

This is a tutorial for getting started with Python in a scientific environment. If you wonder why you should care, please refer to the enthusiastic page. I work on Windows, so these tutorials refer to Windows systems. Since Python is also available for OS X and Linux, it won’t be hard for the reader to figure out the alternative route in the few platform-specific steps.

Goal

At the end of this tutorial, you will have a Python system ready for scientific use. You will learn:

  • Differences between Python versions
  • How to install new modules/libraries
  • What are the common development environments
  • What are the most common Python libraries used for science

Note: if you want to follow the easy way, you can install the Enthought Python Distribution.

Installing Python

Python is an interpreter, a program that reads a text file and then acts on each line of code. A Python script is a text file (with a .py extension) that is read by the Python interpreter (an executable file, e.g. python.exe). Everything else is the Python environment: a suite of modules, programs, editors, batch files and links to help the programmer.

First one needs to install the Python suite.
Go to www.python.org and download version 2.7 (32-bit) for your system. (Note: picking the 32-bit build is a valid and safe option even on 64-bit systems.)
Unix systems should have Python already installed. Verify that the version is 2.7.
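
If you prefer to check the version from inside Python rather than from the banner, here is a minimal sketch. The two helper names are my own, chosen for illustration, and the code is written so that it runs on both Python 2.7 and Python 3.

```python
import sys


def describe_python(version_info=sys.version_info):
    """Return the version as a 'major.minor' string, e.g. '2.7'."""
    return "%d.%d" % (version_info[0], version_info[1])


def is_python_27(version_info=sys.version_info):
    """Return True if the given version tuple is a 2.7 release."""
    return tuple(version_info[:2]) == (2, 7)


# Report on the interpreter that is running this script.
print("Running Python " + describe_python())
print("Interpreter location: " + str(sys.executable))
print("Is this 2.7? " + str(is_python_27()))
```

The interpreter location is handy when more than one Python is installed and you are not sure which one the python command starts.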

What’s up with Python 3? Python had been evolving slowly, always maintaining backward compatibility. Only in recent times (8ish years ago) was it decided to make more substantial changes to the language, and a new major release was born. Since Python has a huge user base, both versions 2 and 3 are supported until most users have made the transition. At the time of writing, the main scientific libraries have just started offering Python 3 compatible versions, but many have not updated yet, so it’s better to play it safe and stick with 2.7.

Now try to execute the python command line. Either look in the installed programs, or open the Command Prompt and type python. You should see something similar to

[code]
Python 2.7 (#1, Feb 28 2010, 00:02:06)
Type "help", "copyright", "credits" or "license" for more information.
>>>
[/code]

This is it. I won’t go into Python syntax at this point. Refer to the next tutorial, or to the official introduction.

Trick: Add the directory that contains ‘python.exe’ to the system path, so that typing python works from any directory in the Command Prompt.

Trick: Verify that double-clicking on a .py file executes it as a Python script. If you are on Windows, you’ll probably just see a flash of a DOS window. This is because once the script is over, the operating system closes the window immediately. A quick fix is to end the script with a blocking command (e.g. ‘raw_input()’ or ‘while True: pass’). Later on we will introduce some development environments that take care of this problem.
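
The quick fix can be sketched as a complete script. This is only one way to do it: the function names are hypothetical, the lookup of raw_input/input is there so the same file runs on Python 2.7 and 3, and the pause is skipped when no console is attached.

```python
import sys


def wait_for_enter(read_line=None):
    """Block until the user presses Enter, so the window stays open."""
    if read_line is None:
        try:
            read_line = raw_input  # Python 2 name
        except NameError:
            read_line = input      # Python 3 name
    read_line("Press Enter to close this window...")


def main():
    # Placeholder for the real work of the script.
    print("Hello from a double-clicked script")


if __name__ == "__main__":
    main()
    # Only pause when an interactive console is attached.
    if sys.stdin is not None and sys.stdin.isatty():
        wait_for_enter()
```

Compared with ‘while True: pass’, waiting for Enter does not keep the processor busy while the window is open.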

Installing New Modules

Even though “Python comes with batteries included”, that is, the standard library is extensive and rich, the default installation is not sufficient for most scientific applications. Rather than reinventing the wheel, one can rely on external modules or libraries (sets of modules) that extend the language.

Every Python module is a Python file containing the new functionality. We’ll see in the next tutorial how to actually load modules. The only requirement for using a module is to tell Python where it is located: you can either copy the module file into the working directory (i.e. the directory where the interpreter is executed) or add its directory to the Python path, even during execution, and start using it immediately.
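
To make the “add it to the Python path during execution” option concrete, here is a small self-contained sketch. The module name lab_helpers_demo and its double function are made up for the example; the script writes the module into a temporary directory first, so you can run it as is.

```python
import os
import sys
import tempfile

# Create a throwaway directory holding a tiny module (hypothetical name).
module_dir = tempfile.mkdtemp()
module_file = os.path.join(module_dir, "lab_helpers_demo.py")
with open(module_file, "w") as handle:
    handle.write("def double(x):\n    return 2 * x\n")

# Tell Python where the module lives by extending the path at run time.
sys.path.append(module_dir)

# From now on the module can be imported like any other.
import lab_helpers_demo

print(lab_helpers_demo.double(21))
```

In everyday use the module file already exists, so only the sys.path.append line and the import are needed.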

Python also has a ‘Lib’ directory, and in it a ‘site-packages’ directory, which is the designated location for third-party libraries. All modules in that directory are automatically on the path and accessible to every instance of Python.
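
If you are not sure where this directory is on your machine, Python can tell you. A minimal sketch using the standard sysconfig module; the exact location printed depends on your installation and operating system.

```python
import sys
import sysconfig

# 'purelib' is the directory where pure-Python third-party
# packages are installed for this interpreter.
site_packages = sysconfig.get_paths()["purelib"]
print("Third-party libraries go in: " + site_packages)

# The full list of directories Python searches when importing.
for entry in sys.path:
    print(entry)
```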

Often one needs something more sophisticated (e.g. a library may require configuration depending on the OS, or may be based on some C++ code that needs to be compiled). For this there is ‘distutils’, and it comes in 2 flavors.

The simplest thing you can come across is the standalone installer: it is built for a specific Python version and OS. You just double-click on it and follow the wizard. You can try it (on Windows) with the Python Imaging Library.

If a standalone installer is not provided, the library comes with a setup.py that contains the installation script. To install, you need to call (from the command prompt, in the directory that contains setup.py)

[code]
python setup.py install
[/code]

and it will install the library in the proper third-party library directory. For more installation options, refer to the readme.

This is it for installing new modules, but there is an additional tool that makes managing libraries easier. setuptools is a module that installs a couple of scripts in the Python directory (e.g. ‘python27/scripts’). If you call

[code]
python easy_install.py name_of_the_module
[/code]

the script will search several databases for a module with the given name and install it. If it fails, you can refer to its errors to complete the installation (e.g. a missing dependency). It doesn’t work 100% of the time, but it’s a very useful tool. It is even more useful for updating installed modules, which you can do by typing

[code]
python easy_install.py --upgrade name_of_the_module
[/code]

Science Libraries

Now that you have all the tools to upgrade your vanilla Python distribution, let’s look at the essential libraries for scientists.

NumPy: it is “the fundamental package for scientific computing with Python. It contains among other things:

  • a powerful N-dimensional array object
  • sophisticated (broadcasting) functions
  • tools for integrating C/C++ and Fortran code
  • useful linear algebra, Fourier transform, and random number capabilities”

Almost every other scientific module is based on this one. One of its main features is that it brings MATLAB-style array manipulation into Python.
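
To give a taste of what array manipulation looks like, here is a minimal sketch, assuming NumPy is already installed. The numbers are made up for illustration.

```python
import numpy as np

# A small 2-D array: 2 rows (samples) by 3 columns (measurements).
data = np.array([[1.0, 2.0, 3.0],
                 [4.0, 5.0, 6.0]])

# Arithmetic acts on every element at once, with no explicit loop.
doubled = data * 2

# Broadcasting: the 3 column means are subtracted from each row.
column_means = data.mean(axis=0)
centered = data - column_means

print(data.shape)
print(doubled)
print(centered)
```

The last operation combines a 2x3 array with a 3-element array. NumPy stretches the smaller one across the rows, which is what “broadcasting” in the list above refers to.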

SciPy: it contains advanced numerical routines, among them discrete Fourier transforms (FFTs), optimization, regression and ODE integration. Most of these tools are compiled Fortran or C programs, so they run fast. SciPy is also the name of the community for the scientific use of Python.

Matplotlib: it enables graphical output, plots and graphs. It can be used as a simple mathematical plotter, with a syntax similar to MATLAB (pylab). In addition, it provides an extremely powerful API for more complex applications.

Try to install all 3 libraries with any of the methods from the previous section. We will learn a little bit about how to use them in the next tutorial, but you should always refer to their documentation to learn how to use them.
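
Once you have tried, a quick way to see which installations succeeded is to attempt the imports. A minimal sketch; the check_modules helper is my own name, and not every module exposes a __version__ attribute, hence the "unknown" fallback.

```python
import importlib


def check_modules(names):
    """Map each module name to its version string, or None if missing."""
    report = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            report[name] = None
        else:
            report[name] = getattr(module, "__version__", "unknown")
    return report


results = check_modules(["numpy", "scipy", "matplotlib"])
for name in sorted(results):
    if results[name] is None:
        print(name + ": NOT installed")
    else:
        print(name + ": installed, version " + str(results[name]))
```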

Choosing the Environment

The last thing to prepare is the development environment, that is, which editor and workflow you want to use for coding. I will list here a few options that were very convenient in my experience in the lab, but they are neither the only ones nor necessarily the best. In the next tutorial we will employ each of these tools, but for now let’s just list them:

IDLE

Notepad++ (or any other augmented text-editor)

IPython

Spyder

Eclipse