Introducing Clarify

Source has a blog post that Derek Willis and I wrote about Clarify, a small Python library for parsing data from Clarity Election Night Reporting (ENR) systems.

This was developed during OpenNews’ Elections Code Convening, which was really fun, productive and a compelling alternative to hackathons as a model for collaborative hacking.


Cleaner Data

Along with David Eads, I gave a talk on cleaning data based on my experience working with convictions and elections data at the Chicago School of Data Conference. The session detail page is here and my slides are here. You might find my speaker notes helpful. They’re in the GitHub repo for the slide deck.

Image: Mud Volcanos creative commons licensed (BY-ND) flickr photo by “Caveman Chuck” Coker


New music for back to school

Prince (the other one)

I woke up this morning to the din of the first day of school. Even though I've been out of school for a while, the fall still has a sense of beginning to me, even as the summer comes to a close.

Out of curiosity, I looked up the current band of a friend I’ve fallen out of touch with and was instantly pulled in by what I heard. Have you ever been haunted by a song and found yourself playing and replaying it a dozen times to see if the lyrics match up with the music’s powerful first impression? That’s what happened to me when I heard “How ya been feelin'”, a track from a forthcoming 7″ from Austin’s Prince. It has such a great combination of energy and sadness. The lyrics are direct, but use imagery that evokes something more than the words. The other recordings on the band’s Bandcamp page are pretty great as well.

Una Bèstia Incontrolable and Iron Lung

Last week I saw Una Bèstia Incontrolable and Iron Lung play, and I enjoyed hearing two bands play interesting heavy music that’s still grounded in punk and hardcore idioms. That’s hard to find, because at heavy shows the way we engage with the music can feel so predictable compared with the music itself.

I really enjoyed this Noisey interview with Iron Lung because it revealed some surprising influences that I hadn’t listened to very intently including Flipper’s “Generic Flipper” and Rudimentary Peni’s “Death Church”.

Dystopian Society

Some dear friends just moved to Florence, Italy and I was curious what was going on in local music. This death rock band is the first thing that came up when I googled “Florence Italy DIY punk”.


Tools for visualizing network graphs

Tools for extracting structured data from a PDF file

These are tools that have been suggested to me for extracting structured data from PDF files:


New music for early August

Richard The Third by Richard Album and The Singles

I play in a cover band with the drummer of the Singles and he had a great one-liner describing this band which was something like “theatric power pop”. This is their new record and they’re on tour now.

Black Rainbow

They played Chicago this week and it was one of the best performances I’ve seen in a while. Direct but not boring punk music.

Sorrows and In School

Chicago’s queer punk fest, Fed Up Fest, was a few weeks ago. It was really great. Good music, and a vibe that felt fun and purposeful. While I was excited to finally get to see Limp Wrist, even if they hadn’t played their unannounced set, I would have been satisfied to see some awesome bands that I had never heard of before. Two of my favorites were New York hardcore bands Sorrows and In School.


Lock straps

I love my Soma Porteur Rack and I love how it helps me take weight off of my body and onto my bike.

However, my small u-lock didn’t securely fit on the rack with a single bungee cord.

So, I sewed these straps out of 1 1/2″ velcro.

I’m interested in seeing if the velcro makes lock-ups more annoying and how the velcro holds up to getting wet and dirty.


The Che Cafe vs. profit-seeking models at public universities


The Che Cafe, a collectively run all-ages music venue in San Diego, is facing closure by university administrators. Luckily, a court has temporarily halted their eviction.

It’s a nice venue and I’ve played there with Defiance, Ohio a number of times. Long-running, all-ages, DIY spaces are important, but this paragraph from a press release about the court order connects the cafe with larger dynamics around the financialization of public higher education playing out in so many of our communities and lives.

Arguably, the real reason for the lease termination is economic. And this is why non-students and the broader community should care and join this push to preserve the venue, even if you have never attended or heard of it before. The University administration has shifted to decisions rooted in valuing revenue-generation and profit-seeking above all else. The Che Facility does not bring in windfall profits for the University. It stands in contrast to a Starbuck’s licensed cafe, or a parking lot where each space brings in hundreds of dollars, or even to a new science building that can house researchers securing grant dollars from which the University can take a sizeable cut. The social spaces the University seems to prefer are privately operated, profit-driven and not dedicated to providing practical educational opportunities, self development and creative expression and growth that more traditional spaces like the Che Cafe affords.


Fuzzy-matching strategies

This is a list of strategies for doing quick fuzzy matches that I’m summarizing from a thread that started on June 9, 2014 on the NICAR-L mailing list.

Fuzzy Lookup Excel Add-on

This add-on created by Microsoft can be downloaded here.

It reportedly runs into trouble when trying to match ~3000 records with another ~3000 records.

Increasing the threshold from its default to a higher value might provide better performance.
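To make the idea of a similarity threshold concrete, here is a minimal Python sketch using the standard library's difflib. It is not how the Excel add-on works internally; the function name best_matches and the 0.8 default are placeholders chosen for illustration.

```python
from difflib import SequenceMatcher


def best_matches(names_a, names_b, threshold=0.8):
    """For each name in names_a, find the most similar name in names_b.

    Only pairs whose similarity ratio (0.0 to 1.0) meets the threshold
    are returned, as (name_a, name_b, score) tuples.
    """
    matches = []
    for a in names_a:
        best_name, best_score = None, 0.0
        for b in names_b:
            # Compare case-insensitively; ratio() is 1.0 for identical strings.
            score = SequenceMatcher(None, a.lower(), b.lower()).ratio()
            if score > best_score:
                best_name, best_score = b, score
        if best_score >= threshold:
            matches.append((a, best_name, best_score))
    return matches


if __name__ == "__main__":
    for match in best_matches(["Jon Smith"], ["John Smith", "Jane Doe"]):
        print(match)
```

A sketch like this compares every pair, so matching ~3000 records against ~3000 records means about 9 million comparisons. A higher threshold only makes the matches stricter here; it does not reduce the number of comparisons.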

Reconcile CSV

Reconcile CSV is a project of Open Knowledge Labs that is described as:

Reconcile-csv is a reconciliation service for OpenRefine running from a CSV file. It uses fuzzy matching to match entries in one dataset to entries in another dataset, helping to introduce unique IDs into the system – so they can be used to join your data painlessly.

MySQL’s Soundex() function
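Soundex reduces a name to a short phonetic code so that similar-sounding names compare equal. The sketch below implements the classic four-character American Soundex in Python to show the idea. MySQL's SOUNDEX() differs in some details (for example, it can return codes longer than four characters), so treat this as an illustration rather than a reimplementation of the MySQL function.

```python
# Map each coded consonant to its Soundex digit.
CODES = {}
for digit, letters in enumerate(["BFPV", "CGJKQSXZ", "DT", "L", "MN", "R"], start=1):
    for letter in letters:
        CODES[letter] = str(digit)


def soundex(name):
    """Return the four-character American Soundex code for a name."""
    letters = [c for c in name.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    result = letters[0]
    prev = CODES.get(letters[0], "")
    for c in letters[1:]:
        if c in "HW":
            # H and W do not separate letters that share a code.
            continue
        code = CODES.get(c, "")
        if code and code != prev:
            result += code
        # Vowels reset prev, so a repeated code after a vowel is kept.
        prev = code
    # Pad with zeros and truncate to four characters.
    return (result + "000")[:4]


if __name__ == "__main__":
    for name in ["Robert", "Rupert", "Ashcraft"]:
        print(name, soundex(name))
```

Two names are candidate matches when their codes are equal, which is the same comparison you would make on the output of a database Soundex function.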

OpenRefine

Dan Nguyen provided this recipe for OpenRefine:

If you’re looking for non-Excel/database solutions…you can also do it by hand with OpenRefine.

  1. Combine both lists into one file with a single name column
  2. Import it into Refine
  3. Create a second column called “refined_name_key” that is a duplicate of the original name field
  4. Cluster and de-dupe using Refine’s text-clustering
  5. Export out (into something like a CSV)
  6. Import this table into your existing setup
  7. Join the name fields of the two original tables against the “refined_name_key”
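Refine's default text-clustering method is key collision with a "fingerprint" key. The steps above can be approximated outside of Refine with a rough Python sketch like the one below. The function names are placeholders, and the normalization is a simplification of what Refine does, so expect differences on messy input.

```python
import re
import unicodedata
from collections import defaultdict


def fingerprint(value):
    """Build a normalized key: lowercase, ASCII-only, no punctuation,
    with unique tokens sorted alphabetically."""
    text = unicodedata.normalize("NFKD", value.strip().lower())
    # Drop accents and any other non-ASCII characters.
    text = text.encode("ascii", "ignore").decode("ascii")
    # Remove punctuation but keep word characters and whitespace.
    text = re.sub(r"[^\w\s]", "", text)
    tokens = sorted(set(text.split()))
    return " ".join(tokens)


def cluster(names):
    """Group names that share the same fingerprint key."""
    groups = defaultdict(list)
    for name in names:
        groups[fingerprint(name)].append(name)
    return dict(groups)


if __name__ == "__main__":
    for key, members in cluster(["Smith, John", "John Smith", "Jane Doe"]).items():
        print(key, members)
```

The fingerprint plays the role of the “refined_name_key” column: names from both lists that end up with the same key can be joined on it.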

Paxata

http://www.paxata.com/


Rewriting URLs for static files using PHP’s built-in webserver

I don’t particularly like coding in PHP, but I do think WordPress works well for building websites for small organizations in certain use cases. PHP’s built-in webserver, which was added in PHP 5.4, helps make PHP web development feel closer to my flow using other languages and frameworks. In particular, it removes the overhead and context switch of having to configure an instance of a webserver like Apache or Nginx for local development.

One feature of the built-in webserver is that you can define a “router” script to segment out serving of static assets or to direct certain paths to a CMS’ main PHP file.

There are lots of examples of making a router script that will work for one’s particular environment. I used this one for WordPress, because it’s just what came up first in my Google search.

However, I ran into trouble when I was trying to develop locally on a multi-site WordPress instance that used path prefixes rather than subdomains to identify certain blogs. For instance, /blog-1/ would go to one blog while /blog-2/ would go to another. I needed to replicate the functionality of these Apache rewrite rules that would remove the blog prefix from the path:

RewriteRule  ^([_0-9a-zA-Z-]+/)?(wp-.*) $2 [L]
RewriteRule  ^([_0-9a-zA-Z-]+/)?(.*\.php)$ $2 [L]

The first rule caused the most problems since I needed to return a static file at a path different from the one reflected in the request URL. I found the answer in this example from the built-in server docs on handling unsupported file types.

To rewrite the path of a static file, you need to:

  • Use a regex to update the path.
  • Figure out the mime type of the file and set the appropriate header.
  • Read the contents of the file and return them.

My finished router.php looks like this:

<?php
$root = $_SERVER['DOCUMENT_ROOT'];
chdir($root);
$path = '/' . ltrim(parse_url($_SERVER['REQUEST_URI'], PHP_URL_PATH), '/');

// Do some URL rewriting.  The patterns are anchored like the Apache rules,
// and the elseif mirrors their [L] flag: once the first rule matches, the
// second one is not applied.
if (preg_match('/^\/([_0-9a-zA-Z-]+\/)?(wp-.*)/', $path, $matches)) {
  $path = '/' . $matches[2];
  if (is_file($root . $path) && strpos($path, '.php') === false) {
    // The rewritten path is to a non-PHP file.  It's probably a static asset
    // or theme asset.  Set the mime type, then load the file and return it.
    // The mime type lookup needs the full filesystem path, not the URL path.
    header('Content-Type: ' . mime_content_type($root . $path));
    readfile($root . $path);
    return true;
  }
} elseif (preg_match('/^\/([_0-9a-zA-Z-]+\/)?(.*\.php)$/', $path, $matches)) {
  // The path is to some PHP file.  Remove the leading blog prefix.
  // Logic below will load this PHP file.
  $path = '/' . $matches[2];
}

set_include_path(get_include_path() . PATH_SEPARATOR . __DIR__);
if (file_exists($root . $path)) {
  // A directory requested without a trailing slash maps to its index.php.
  if (is_dir($root . $path) && substr($path, -1) !== '/') {
    $path = rtrim($path, '/') . '/index.php';
  }
  if (strpos($path, '.php') === false) {
    // Not a PHP file: let the built-in server handle the request as-is.
    return false;
  } else {
    chdir(dirname($root . $path));
    require_once $root . $path;
  }
} else {
  // Nothing on disk matches, so hand the request to WordPress.
  include_once 'index.php';
}