Finding duplicate records in a books to prisoners database application

High on my list of neglected tech projects is the Testament books to prisoners database web application.  This is the database program that projects like the Midwest Pages to Prisoners Project use to track packages sent and returned and books requested, in the hopes of avoiding delays in delivering books to incarcerated people and of providing the metrics that grant providers like.

One of the design challenges has to do with duplicate records.  Recipients of books are identified by their state/federal department of correction (DOC) number (if they’re in a state or federal prison – most jails don’t use ID numbers), their state of incarceration and their name.  I assume that the database was designed originally to minimize barriers for the book project volunteers, so both the name and DOC# are free text fields.  JavaScript is used to match existing records based on the DOC#, but there is still a large possibility for duplicate records.

The reason for duplicate records is that both the person writing to request books and the volunteer entering the request may write the name and/or DOC# inconsistently.  For instance, the state may store the DOC# in their database as A-123456, but the incarcerated person may write it as A123456, A-123-456, or just 123456.  Volunteers who don’t know about this and aren’t careful may not check beforehand for an existing record.

This is probably preventable through more sophisticated validation, but we still need a way to find duplicates in the existing records.  As this application is written in the Django framework, I want to try to use the Django API to find matches.

At first thought, it seems like I will have to iterate through each inmate record and check if there is a duplicate record.  This seems pretty slow, but I can’t think of a better way to do this.  At this point, there aren’t so many records that this approach will fail, but it would be nice to do something slicker.
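One possibly slicker alternative to checking every record against every other is to make a single pass and group records by a normalized key.  This is only a rough sketch: the (id, state, DOC#) tuples stand in for whatever the real inmate model looks like, and reducing the DOC# to its digits is my assumption about what counts as "the same" number.

```python
from collections import defaultdict


def normalize_doc_number(doc_number):
    """Reduce a DOC number to its digits, e.g. "A-123-456" -> "123456"."""
    return "".join(c for c in doc_number if c.isdigit())


def group_possible_duplicates(records):
    """Group record ids whose state and normalized DOC number agree.

    `records` is an iterable of (record_id, state, doc_number) tuples,
    a hypothetical stand-in for rows pulled from the inmate table.
    """
    groups = defaultdict(list)
    for record_id, state, doc_number in records:
        digits = normalize_doc_number(doc_number)
        if digits:  # skip records with no usable DOC number
            groups[(state.upper(), digits)].append(record_id)
    # Only keys seen more than once point at possible duplicates.
    return dict((key, ids) for key, ids in groups.items() if len(ids) > 1)


records = [
    (1, "IN", "A-123456"),
    (2, "IN", "A123456"),
    (3, "in", "123456"),
    (4, "OH", "123456"),
    (5, "IN", "987654"),
    (6, "IN", ""),
]
duplicates = group_possible_duplicates(records)
```

Each record is touched once, so this stays quick as the table grows.  The tradeoff is that dropping the letter prefix can group two different people (say A-123456 and B-123456 in the same state), so the groups are candidates for a human to review, not automatic merges.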

The other problem is how to match a duplicate.  One approach might be to build a regexp for the DOC# (for instance, match either the first character or omit it, allow dashes or spaces between all characters, …) and then use the iregex field lookup to try to find matches.  One challenge with this is that the current Testament codebase is using Django 0.97 (I think) and iregex is only available starting in 1.0.  Maybe it’s time we updated our code anyway.
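Here is a minimal sketch of what building that regexp could look like.  The function name and the exact rules are my assumptions: it treats any leading letters as the optional part, and allows any run of dashes or spaces between characters.

```python
import re

SEPARATOR = r"[- ]*"  # any run of dashes or spaces between characters


def doc_number_regex(doc_number):
    """Build a pattern matching loose spellings of one DOC number."""
    chars = [c for c in doc_number if c.isalnum()]
    if not chars:
        raise ValueError("DOC number has no letters or digits")
    # Split off leading letters, which writers sometimes leave out.
    split = 0
    while split < len(chars) and chars[split].isalpha():
        split += 1
    if split == len(chars):  # all letters: treat nothing as optional
        split = 0
    prefix = SEPARATOR.join(re.escape(c) for c in chars[:split])
    body = SEPARATOR.join(re.escape(c) for c in chars[split:])
    if prefix:
        return "^(%s%s)?%s$" % (prefix, SEPARATOR, body)
    return "^%s$" % body


pattern = doc_number_regex("A-123456")
matcher = re.compile(pattern, re.IGNORECASE)
```

With Django 1.0 or later, the pattern string could then be handed to the database with something like Inmate.objects.filter(doc_number__iregex=pattern).  The model and field names there are guesses rather than the actual Testament names, and regexp dialects vary a bit between database backends, so the pattern sticks to basic syntax.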

There is also the Python difflib module that can compute deltas between strings.  However, it seems like this would slow things down even further because you would have to load each inmate object and then use difflib to compare the DOC#s.  I assume that the previous approach would be faster because the regexp matching happens at the database level.
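For comparison, a rough sketch of the difflib approach, working on plain strings instead of inmate objects.  The 0.9 threshold and the normalization step are arbitrary choices of mine, not anything tested against real data.

```python
import difflib


def normalize(doc_number):
    """Drop dashes, spaces, and case: "a-123 456" -> "A123456"."""
    return "".join(c for c in doc_number if c.isalnum()).upper()


def similar_pairs(doc_numbers, threshold=0.9):
    """Return (a, b, ratio) for every pair at or above the threshold."""
    pairs = []
    for i, a in enumerate(doc_numbers):
        # Compare each number only with the ones after it, so every
        # pair is checked once.
        for b in doc_numbers[i + 1:]:
            ratio = difflib.SequenceMatcher(
                None, normalize(a), normalize(b)).ratio()
            if ratio >= threshold:
                pairs.append((a, b, ratio))
    return pairs


pairs = similar_pairs(["A-123456", "123456", "B-987654"])
```

This makes the cost visible: every DOC# is compared with every other one in Python, so the work grows with the square of the number of records, which is the slowdown I was worried about.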

Django: Querying data from the Python shell

I needed to get some stats for some research that we’re doing and was happy to see that you can use Django and the Python shell to query Testament data in a way that’s database independent.  It’s a little unintuitive if you’re thinking in SQL mode, but it is usable and super-helpful.  I wanted to share it with y’all in case you need to quickly pull stats or examine info.

Print the prison name, city, and state of all prisons that received a package sent by the Midwest Pages to Prisoners Project from 2009-01-01 to 2009-03-22:

geoff@btp:/var/www/testament/testament_trunk/btp$ python manage.py shell
>>> import datetime
>>> from core.models import Prison, Package
>>> start_date = datetime.date(2009, 1, 1)
>>> end_date = datetime.date(2009, 3, 22)
>>> prisons = Prison.objects.filter(package__sent_on__range=(start_date, end_date), package__group__username__exact='mwpp').distinct()
>>> for prison in prisons:
...     print("%s %s, %s" % (prison.name, prison.city, prison.state))

Pages vision

A college student who is working with Pages as part of one of their classes asked me what I thought the organization’s long term vision was.  It was a good question, and this is what I responded with, though I feel like it’s just a starting point.

> What is Pages long term vision?
> What type of social impact would you like to see Pages make(prisoner
> rights/awareness/literacy)?
Really, these are two parts to the same question, so I'll answer it as
such.  I think Pages' long term vision is a world where everyone has the
knowledge, perspective, and skills to live an interesting, dignified
life.  Pages' focus is on making sure that incarcerated and formerly
incarcerated people are included as part of everyone.

Like any social movement, community project, or nonprofit, I think that
our vision includes envisioning a day when our project doesn't need to
exist.  I will be the first to admit that the model of the prison book
project is not a particularly efficient way to get books to incarcerated
people.  Sadly, for many, it is the only way that they can get access to
the knowledge and perspective that they want, which is why our project
continues to do the work of sending free books to incarcerated people.
The aspect of the project that moves us toward a world where we are no
longer necessary is the volunteer experience of service and of reading
the letters from incarcerated people and, hopefully, complicating the
volunteer's perception of incarcerated people and incarceration.  

I want people who volunteer with us to be able to think and make decisions
more rationally when it comes to community safety, crime, incarceration and
incarcerated people.  I also do work with a local group called Decarcerate
Monroe County which is resisting jail expansion in this county.  In some
ways it addresses some of the same issues as Pages, but more fundamentally,
because it aims to make one government entity spend money on empowering
people instead of incarcerating them and finding solutions to keep people
out of jail.  However, in doing this organizing, attending county meetings,
and talking with people who have conflicting ideas, it makes me appreciate
Pages because I feel like people who volunteer here get a perspective that
helps them think beyond the cultural stereotypes about crime,
incarceration, and the people affected by incarceration.  At Pages, we're
not trying to portray every incarcerated person as an innocent victim of
the system. Instead, through their letters, we're letting incarcerated
people speak for themselves in the hopes that those reading the letters
will at least appreciate them as individuals.  Ultimately, I hope the
experience of volunteering with Pages will make people realize that
incarceration in this country, as it stands now, is not working very well
for anyone - whether it's the incarcerated or the community at large. 
Hopefully people will keep this perspective in mind when they are voting,
working, and involved in their community.

cash rules everything around me

I’m soliciting funds again.  This time, to support a project that I work closely with called the Midwest Pages to Prisoners Project.  We send free reading material to people incarcerated in juvenile facilities, prisons, and jails throughout the Midwest, Florida, and Arizona.  Our primary means of support is through the donation of books and of money to cover buying books and the ever-rising cost of postage.  To put things in perspective, it costs between $3 and $7 to send one package of free books to an incarcerated person.  With each biweekly mailing, we send 100-200 packages, so the expenses grow quickly.  Though this might seem like a costly and not-so-efficient way to get reading material to people, it is often the only access to information, ideas, and mental stimulation that many incarcerated people have.  This Saturday, I will be bowling in a Bowl-A-Thon fundraiser and I would appreciate your sponsorship.  You can read more about the event and donate via PayPal.

Thanks for your support,

Jail Book Group

I’m trying to be better about posting what I’ve been doing lately.  Last night, the book group I’m facilitating through Pages and New Leaf New Life in the “therapeutic” block of the local jail met for the second time and we picked the book that we’re going to read, A Walk in the Woods by Bill Bryson.  We did a rough vote and there wasn’t an overwhelming consensus, so I’m going to bring in a few copies of the other books that I brought up as options, including Me Talk Pretty One Day by David Sedaris, Oryx and Crake by Margaret Atwood, and The Golden Compass by Philip Pullman.

Doing the group is challenging.  Some people are extroverted and seem to love to talk about themselves and their experiences.  Some are the exact opposite.  I feel like we’re also fighting the difficult dynamic of being a “group” amidst a lot of other mandatory groups that the men have to go through all day.  I think, at the end of the day, some just aren’t feeling another group.  People in the block are respectful and quiet, but the whole jail is noisy.  There are lots of interruptions, like meds and the church group that comes in without notice to provide worship services.  I’m still getting my balance as a facilitator, trying to make it more clear why I’m there and what I’m doing, and trying to get past the reasonable distrust that some of the guys have for people like me.

In spite of all the challenges, we had a short discussion about a piece of writing titled The Best Time in My Life and many shared a memory or description of places and eras that they had seen pass.  For some it was rock quarries in southern Indiana, for another being towed around on an old car hood in his tiny hometown, and for another it was the closure of a vital youth center in his Chicago neighborhood.

Defiance, Ohio Audio Files

I finally posted audio files from the recent Defiance, Ohio record The Fear, The Fear, The Fear to the web.  After seeing how El-Iqaa distributed his recent release, and with the encouragement of others, I decided to make the audio available as both a free download of 128Kbps or Ogg/Vorbis files and a donation-requested 320Kbps or FLAC file download where the proceeds get paypalled to the Midwest Pages to Prisoners Project.



So I’m still pretty transfixed on the population growth map from the New York Times that I wrote about a while ago.  I came across these two maps that were pretty damn intriguing.  One is of prison proliferation over the last century, and one is about privatized prisons and immigration enforcement.  It’s amazing all the things beyond geography that can be represented with maps.

I ended up using one of the maps in a flyer for Pages’ upcoming Pack-A-Thon event.

craft night for boxcar prom @ microcosm hq (nw corner of 3rd + rogers). 4p.

hey y'all!  come help make prom decorations so we can make the  
bluebird look sweeeet!  this is the biggest fundraiser of the year for  
both boxcar and pages, so we really need your help!

- 4pm on sunday (we'll go for hours, so show up whenever you can)
- sparky's house/microcosm headquarters (north-west corner of 3rd & rogers)
- bring any craft supplies you have (paint, markers, paper, cardboard,  

we are also building a decorating committee for the day of the prom.   
decorating starts at noon on saturday the 7th.  if you're into it,  
please e-mail me or sign up at boxcar.  yeah!


boxcar technology wg project ideas

  • Mass inventory edit feature for Boxcar inventory.  This would allow a user to change inventory details for multiple inventory entries all at once.  Essentially, anything that can be changed in the inventory edit form could be changed in bulk.
  • Barcode-based inventory entry
  • Integrate POS and inventory
  • Make pages card catalog/inventory electronic

bloomington veteran resources

Someone wrote Pages asking for literature for veterans.  He said that incarcerated vets are pretty neglected by the government/military.  So, I’m trying to collect contact info for veterans groups so I can ask them about getting literature to make available to incarcerated people through Pages.

IU Office of Veterans Affairs
Georgann Wilson

Monroe County Veterans’ Affairs Office
(812) 349-2568