Many people use Python because, like other scripting languages, it is a portable, platform-independent and general-purpose language that can perform the same tasks as the database-centric, proprietary 4GL tools supplied by database vendors. Like 4GL tools, Python lets you write programs that access, display and update the information in the database with minimal effort. Unlike many 4GLs, Python also gives you a variety of other capabilities, such as parsing HTML, making socket connections and encrypting data.
Possible applications for the Python DB-API include:
To solve the problem, a Special Interest Group (or SIG) for databases was formed. People interested in using Python for a given application form a SIG of their own: they meet on an Internet mailing list where they discuss the topic, exchange ideas, write code (and debug it), and eventually produce a finished product. (Sounds a lot like the development process for the Linux kernel, doesn't it?)
After some discussion, the Database SIG produced a specification for a consistent interface to relational databases -- the DB-API. Thanks to this specification, there's only one interface to learn. Porting code to use a different database product is much simpler, often requiring the change of only a few lines.
The database modules written before the Database SIG are still around and don't match the specification -- the mSQL module is the most commonly used one. These modules will eventually be rewritten to comply with the DB-API; it's just a matter of the maintainers finding the time to do it.
For an example database, we'll use a small table designed to track
the attendees for a series of seminars. (See listing 1.) The Seminars table lists the
seminars being offered; an example row is (1, Python Programming, 200,
15)
. Each row contains a unique identifying ID number (1, in this
case), the seminar's title (Python Programming), its cost ($200), and
the number of places still open (15). The Attendees table lists the
name of each attendee, the seminar that he or she wishes to attend,
and whether the fee has been paid. If someone wants to
attend more than one seminar, there will be more than one row with that
person's name, with each row having a different seminar number and
payment status.
ID | title | cost | places_left |
---|---|---|---|
1 | Python Programming | 200 | 15 |
2 | Intro to Linux | 100 | 32 |
3 | Socket Programming | 475 | 7 |
4 | Web Commerce | 300 | 26 |
name | seminar | paid |
---|---|---|
Albert | 1 | no |
Albert | 4 | no |
Beth | 1 | yes |
Chuck | 2 | no |
Dale | 4 | yes |
Elaine | 3 | yes |
Elaine | 1 | yes |
Felix | 4 | no |
The examples below will use the soliddb module, which supports accessing SOLID databases from Python. SOLID is a product from Solidtech that was reviewed by Bradley Willson in LJ, September, 1997. I'm not trying to cover CGI or Tkinter programming, so only the commands to access the database are presented here, in the same manner as if typed directly into the Python interpreter.
The next step is to create an object that represents a database connection. The object has the same name as the module. The information required to open a connection, and its format, varies for different databases. Usually it includes a user name and password, and some indication of how to find the database server, such as a TCP/IP hostname. If you're using the free trial version of SOLID, Unix pipes are the only method available to connect to the server, so the code is:
>>> import soliddb >>> db = soliddb.soliddb('UPipe SOLID', 'amk', 'mypassword'); >>> db <Solid object at 809bf10>
Next, you should create a cursor object. A cursor object acts as a handle for a given SQL query; it allows retrieval of one or more rows of the result, until all the matching rows have been processed. For simple applications that do not need more than one query at a time, it's not necessary to use a cursor object since database objects support all the same methods as cursor objects. We'll deliberately use cursor objects in the following example. (For more on beginning SQL, see At The Forge by Reuven Lerner in LJ, Octoboer, 1997.)
Cursor objects provide an execute()
method that accepts a
string containing an SQL statement to be performed. This, in turn,
causes the database server to create a set of rows that match the
query.
The results are retrieved by calling a method whose name begins with
'fetch', which returns one or more matching rows, or None if there
are no more rows to retrieve. The fetchone()
method
always returns a single row, while fetchmany()
returns a
small number of rows, and fetchall()
returns all the rows
that match.
For example, to list all the seminars being offered:
>>>cursor = db.cursor(); >>> # List all the seminars >>> cursor.execute('select * from Seminars') >>> cursor.fetchall() [(4, 'Web Commerce', 300.0, 26), (1, 'Python Programming', 200.0, 15), (3, 'Socket Programming', 475.0, 7), (2, 'Intro to Linux', 100.0, 32), ]A row is represented as a Python tuple, so the first row returned is
(4, 'Web Commerce', 300.0, 26)
. Notice that the rows
aren't returned in sorted order; to do that, the query has to be
slightly different (just add 'order by ID' to the SQL query). Because
they return multiple rows, the fetchmany()
and
fetchall()
methods return a list of tuples. It's also
possible to manually iterate through the results using the
fetchone()
method and looping until it returns
None
, as in this example which lists all the attendees
for seminar 1:
>>> cursor.execute('select * from Attendees where seminar=1') >>> while (1): ... attendee = cursor.fetchone() ... if attendee == None: break ... print attendee ... ('Albert', 1, 'no') ('Beth', 1, 'yes') ('Elaine', 1, 'yes')
SQL also lets you write queries that operate on multiple tables, as in this query, which lists the seminars that Albert will be attending.
>>> cursor.execute("""select Seminars.title ... from Seminars, Attendees ... where Attendees.name = 'Albert' ... and Seminars.ID = Attendees.seminar""") >>> cursor.fetchall() [('Python Programming',), ('Web Commerce',)]
Now that we can get information out of the database, it's time to
start modifying it by adding new information. Changes are made by
using the SQL insert and update statements. Just like queries, the SQL
command is passed to the execute()
method of a cursor object.
places_left
value for their seminar should be
decreased by one, since there's room for one less person. SQL has no
way to combine two commands, so this requires two
execute()
calls. But what if something happens and the
second command isn't executed?--because the computer crashed, the
network died, or there was a typo in the Python program? The database
is now inconsistent: an attendee has been added, but the
places_left
column for that seminar is now wrong.
Most databases offer transactions as a solution for this problem. A transaction is a group of commands, such that either all of them are executed, or none of them are. Programs can issue several SQL commands as part of a transaction, and then commit them, (i.e. tell the database to apply all these changes simultaneously"). Alternatively, the program can decide that something's wrong, and roll back the transaction, which says "Forget all these changes."
For databases that support transactions, the Python interface
silently starts a transaction when the cursor is created. The
commit()
method commits the updates made using that
cursor, and the rollback()
method discards them. Each method then starts a new transaction. Some databases don't have
transactions, but simply apply all changes as they're executed. On
these databases, commit()
does nothing, but you should
still call it in order to be
compatible with those
databases that do support transactions.
Listing 2 is a Python function tries to get all this right by committing the transaction once both operations have been performed.
def addAttendee(name, seminar, paid): # Fetch the number of places left, and the cost cursor.execute('select places_left from Seminars where ID=%s' % (seminar),) (places_left, )=cursor.fetchone() # If there if places_left==0: raise ValueError, 'No more places left in seminar'+str(seminar) # Add a row to the Attendees table cursor.execute("insert into Attendees values (?, ?, ?)", (name, seminar, paid) ) # Decrease the places_left count of the selected seminar cursor.execute("update Seminars set places_left = %i where ID=%s" % (places_left-1, seminar) ) # Commit the changes db.commit()
Calling this function is simple:
addAttendee('George', 4, 'yes')
We can verify that the change was performed by checking the listing for seminar #4, and listing its attendees. This produces the following output:
Seminars: 4 'Web Commerce' 300.0 25 Attendees: Albert 4 no Dale 4 yes Felix 4 no George 4 yesNote that this function is still buggy if more than one process or thread tries to execute it at the same time. Database programming can be potentially quite complex. With this standardized interface, it's not difficult to write all kinds of Python programs that act as easy-to-use front ends to a database.
Solidtech Web site: http://www.solidtech.com
DB-SIG Web page: http://www.python.org/sigs/db-sig/
C.J. Date, "An Introduction to Database Systems", 1995: Addison-Wesley, ISBN 0-201-54329-X.