Beginning Python for Bioinformatics
A Bit of Python
Now that we've covered some of the basics of molecular biology, let's take a look at Python to see how its language features can be used to deal with biological research data. We mentioned that DNA, RNA, and proteins are all linear sequences that can be easily represented in a computer-friendly fashion. Python has several built-in structures for handling sequences. Three that we will look at are strings, lists, and dictionaries. To do that, we first need to introduce the Python shell.
There are a couple of different ways to run Python code. One way, which
should be familiar to anyone with experience in another language, is to enter
lines of Python code into a text file and save it with a .py extension. That
program file can then be run from an operating system prompt, or by
double-clicking on the file, depending on your setup. The other way is to
interact with the Python interpreter in a Python shell, where you can enter
lines of code, hit return, and get an immediate response back from Python.
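As a rough illustration of the first approach, the sketch below shows what a small program file might contain. The file name dna_example.py is hypothetical, chosen only for this example. Unlike the shell, a script does not echo the value of an expression on its own, so the values are printed explicitly.

```python
# dna_example.py (hypothetical file name, used only for illustration)
# Run it from an operating system prompt with: python dna_example.py

# Store a short DNA sequence as a string.
dna = 'CTGACCACTTTACGAGGTTAGC'

# A script must print values explicitly; the shell would echo them for us.
print(dna)

# len() reports how many characters (bases) the string holds.
dna_length = len(dna)
print(dna_length)
```

Typing the same assignment into a Python shell and then entering just the variable name would display the value without any call to print.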
The Python shell is a great environment in which to learn the Python language and to explore new programming concepts. There are even graphical Python shells that will colorize your code, pop up a list of autocompletion options as you type, display all the variables currently available to your program, and help out in any number of other ways. The Python shell that we will use here is called PyCrust, and it comes with the wxPython GUI toolkit.
When you start a Python shell, you will be prompted to enter a line of Python
code. The main prompt is ">>> " (without the quotes). If
the Python code you are entering requires more than one line, subsequent lines
will display the secondary prompt of "... " (again without the quotes). Let's see what this
looks like in PyCrust.

The initial view of the PyCrust shell.
After we've entered some examples of Python code in the PyCrust shell, it may look like this:

A popup listing available methods for the 'dna' object.
Python Strings
Let's take a look at the example code in more detail. The first thing we did
was to create a string and assign it to a variable. Strings in Python are
sequences of characters. You create a string literal by enclosing the
characters in single ('), double (") or triple
(''' or """) quotes. In the example we assigned the
string literal CTGACCACTTTACGAGGTTAGC to the variable named dna.
>>> dna = 'CTGACCACTTTACGAGGTTAGC'
Then we simply typed the name of the variable, and Python responded by displaying the value of that variable, surrounding the value with quotes to remind us that the value is a string.
>>> dna
'CTGACCACTTTACGAGGTTAGC'
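The quoting styles described above can be compared directly. The following is a minimal sketch; the variable names single, double, triple, and two_lines are made up for this example. It suggests that the choice of quote character does not change the resulting string, and that triple quotes additionally allow a literal to span more than one line.

```python
# The same sequence written with each quoting style.
single = 'CTGACCACTTTACGAGGTTAGC'
double = "CTGACCACTTTACGAGGTTAGC"
triple = '''CTGACCACTTTACGAGGTTAGC'''

# All three literals produce equal strings.
all_equal = (single == double == triple)
print(all_equal)

# A triple-quoted literal may span lines; the line break becomes
# a newline character inside the string.
two_lines = """CTGACCACTTT
ACGAGGTTAGC"""
print(len(two_lines))

# Removing the newline recovers the original sequence.
rejoined = two_lines.replace('\n', '')
print(rejoined == single)
```

Because the embedded newline counts as a character, the two-line version is one character longer than the 22-base sequence.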
A Python string has several built-in capabilities. One of them is the
ability to return a copy of itself with all lowercase letters. These
capabilities are known as methods. To invoke a method of an object, use the
dot syntax. That is, you type the name of the variable (which in this case is
a reference to a string object) followed by the dot (.) operator, then the name
of the method followed by opening and closing parentheses.
>>> dna.lower()
'ctgaccactttacgaggttagc'
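The same dot syntax works for other string methods. The sketch below tries a few that seem useful for sequence data: upper(), count(), and find(). The variable names are illustrative only. Note that these methods return new values and leave the original string untouched.

```python
dna = 'CTGACCACTTTACGAGGTTAGC'

# lower() returns a lowercase copy; the original string is unchanged.
lowered = dna.lower()
print(lowered)
print(dna)

# upper() is the counterpart of lower().
restored = lowered.upper()
print(restored == dna)

# count() reports how many times a substring occurs.
adenine_count = dna.count('A')
print(adenine_count)

# find() returns the zero-based position of the first match,
# or -1 if the substring does not occur.
ttt_position = dna.find('TTT')
missing_position = dna.find('NNN')
print(ttt_position, missing_position)
```

For this sequence, count('A') finds five occurrences and find('TTT') reports position 8, counting from zero.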
You can access part of a string using the indexing operator s[i]. Indexing
begins at zero, so s[0] returns the first character in the string, s[1] returns
the second, and so on.
>>> dna[0]
'C'
>>> dna[1]
'T'
>>> dna[2]
'G'
>>> dna[3]
'A'
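Because indexing starts at zero, the last character sits at position len(s) - 1 rather than len(s). The following sketch, with illustrative variable names, shows that boundary along with Python's negative indexes, which count back from the end of the string. Negative indexing goes slightly beyond what the text above covers and is included only as a convenience.

```python
dna = 'CTGACCACTTTACGAGGTTAGC'

# The first character is at index 0.
first = dna[0]

# The last character is at index len(dna) - 1 ...
last = dna[len(dna) - 1]

# ... which can also be written with a negative index.
also_last = dna[-1]
print(first, last, also_last)

# Asking for an index past the end raises IndexError.
try:
    dna[len(dna)]
    out_of_range = False
except IndexError:
    out_of_range = True
print(out_of_range)

# Walk over the first four positions, pairing each index with its base.
first_four = [(i, dna[i]) for i in range(4)]
print(first_four)
```

The list built at the end reproduces the four lookups from the shell session above as index and character pairs.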
The final line in our screenshot shows PyCrust's autocompletion feature, whereby a list of valid methods (and properties) of an object is displayed when a dot is typed following an object variable. As you can see, Python strings have many built-in capabilities that you can experiment with in the Python shell. Now let's look at one of the other Python sequence types, the list.