Simplifying CouchDB View Management

We use CouchDB as part of our storage layer for CloudObjects. CouchDB is a schema-less, document-oriented database. Database entries, or documents, are JSON objects. By default, these documents are accessed through their identifiers. Unlike a traditional RDBMS, a NoSQL database like CouchDB does not have a default query language such as SQL for looking up entries by anything other than their primary key. The alternative offered by CouchDB is views, which represent a map-reduce approach to querying. A map function looks at each new or modified document and emits data under a different key into a separate index for the view. Consumers of the database can then efficiently query it by this new key through the view. An optional reduce function can be used to further aggregate data. This article only deals with the map part.
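To make the idea concrete, here is a minimal in-memory sketch of what a view does conceptually. It is not how CouchDB works internally (the real index is a persistent B-tree that is updated incrementally), and the names buildView, queryView and the sample documents are made up for illustration:

```javascript
// Conceptual simulation of a CouchDB view; NOT CouchDB's actual implementation.
// buildView, queryView and the sample documents are hypothetical.

// Run a map function over every document and collect the emitted rows.
function buildView(docs, mapFn) {
    const rows = [];
    for (const doc of docs) {
        // CouchDB provides emit() to map functions as a global;
        // in this sketch it is passed in explicitly as second argument.
        mapFn(doc, (key, value) => rows.push({ id: doc._id, key, value }));
    }
    // View rows are kept sorted by key, which makes key lookups efficient.
    rows.sort((a, b) => (a.key < b.key ? -1 : a.key > b.key ? 1 : 0));
    return rows;
}

// Look up all rows that were emitted under the given key.
function queryView(rows, key) {
    return rows.filter((row) => row.key === key);
}

const sampleDocs = [
    { _id: 'user-1', email: 'alice@example.com' },
    { _id: 'user-2', email: 'bob@example.com' },
    { _id: 'note-1', text: 'no email here' }
];

// Map function: index documents by email, skip documents without one.
const byEmail = (doc, emit) => {
    if (doc.email) {
        emit(doc.email, null);
    }
};

const emailView = buildView(sampleDocs, byEmail);
console.log(queryView(emailView, 'bob@example.com'));
```

The document without an email address emits nothing and therefore never shows up in the view, while the other two can now be found by a key that is not their identifier.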

CouchDB views are defined with custom JavaScript code, which makes them very powerful: you have a complete programming language at your disposal for defining the view's map function. The CouchDB engine calls your function for each document, and inside the function you can call emit() for each entry that you want to add to the view, so a document can map to zero, one or more keys. The map functions are then added to a design document, which, despite its special role, is basically just a JSON object in your database like any of your custom documents.

Having the map function embedded in a JSON document makes it difficult to maintain: you have to escape your code correctly, and the newlines and comments that make code legible to a developer get in the way. The most efficient way of storing your functions in the database is not the easiest way to create and edit them. For lack of tooling, here’s what I used to do: write a function, escape it using search and replace, copy and paste it into the JSON document, remove newlines and finally update that JSON in the database using the CouchDB GUI or API. Needless to say, after editing your map functions once or twice, this is the kind of process that leaves you unsatisfied as a developer and makes you look into automation.
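The manual escaping step is exactly what a JSON encoder automates. As a small illustration (the view name by_tag and the function body are made up), the following JavaScript embeds a commented, multi-line function into a design document structure and lets JSON.stringify take care of quotes and newlines:

```javascript
// Hypothetical map function source, as it would be written in its own file.
const mapSource = [
    'function (doc) {',
    '    // Index posts by tag',
    '    if (doc.type === "post") {',
    '        emit(doc.tag, null);',
    '    }',
    '}'
].join('\n');

// Design document skeleton; "by_tag" is an illustrative view name.
const tagDesignDoc = {
    language: 'javascript',
    views: { by_tag: { map: mapSource } }
};

// JSON.stringify escapes the double quotes and newlines in the source.
const encoded = JSON.stringify(tagDesignDoc);
console.log(encoded);

// Parsing the JSON again yields the original source, unchanged.
console.log(JSON.parse(encoded).views.by_tag.map === mapSource);
```

Encoding alone already removes the error-prone part of the process. Minifying the function first additionally strips comments and whitespace, so the stored document stays compact.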

I didn’t want to invest in complicated external tooling, so I thought of just writing a small script to do this myself. The idea behind it was this:

  1. Store all map functions for your views in a directory, one file for each. In these files, JavaScript can be written without limitations, and comments can be used to explain the code. These files go into source control.
  2. The script iterates through all files in this directory, minifies them and builds the design document automatically.

I wrote the script in PHP so I could use a JS minification library that I was already used to, Matthias Mullie’s CSS & JS minifier “minify”. My script is a single PHP file with 32 lines of code (including comments and empty lines) and an accompanying composer.json that is used to install the library. These are stored in the repository where I’m using them because they are too small to be considered their own project. However, I’m sharing them with you in this post and also as a GitHub gist so you can adapt them to your own project if you find them helpful.

Here’s the composer.json:

{
	"require": {
		"matthiasmullie/minify" : "1.3.21"
	}
}

And here’s my script, compileDesignDocument.php:

<?php

require_once "vendor/autoload.php";

use MatthiasMullie\Minify\JS;

$baseName = rtrim(isset($argv[1]) ? $argv[1] : '', '/'); // tolerate a missing argument or trailing slash
if (!is_dir($baseName)) {
    echo "Could not find directory ".$baseName.".\n";
    exit(1);
}
$output = ['language' => 'javascript'];

$baseDir = opendir($baseName);
while (false !== ($entry = readdir($baseDir))) {
    $elements = explode(".", $entry);
    if (!isset($elements[2]) || $elements[2] != 'js')
        continue; // ignore non-JS files

    // Load JS file for minification
    $minifier = new JS(file_get_contents($baseName."/".$entry));
    if ($elements[1] == 'map') {
        // Add map function
        $output['views'][$elements[0]]['map'] = $minifier->minify();
        echo "views -> ".$elements[0]." -> map added.\n";
    } else {
        echo "Nothing to do with ".$entry.".\n";
    }
}

file_put_contents($baseName.".json", json_encode($output, JSON_PRETTY_PRINT));
echo "Written ".$baseName.".json.\n";

As you can see, the script takes a single argument, the base name, which serves both as the name of the directory that contains the map functions and as the filename (plus the .json extension) of the generated design document. Files inside that directory follow the naming convention view name + “.map” + “.js”, for example sample_view.map.js. The script is easily extensible, for example to incorporate reduce functions as well.
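Here is one way such an extension could look, sketched in JavaScript rather than PHP to show just the naming logic. It assumes that reduce functions follow a matching convention, view name + “.reduce” + “.js”. That convention and the helper addToDesignDocument are assumptions for illustration, not part of the script above:

```javascript
// Sketch of the file-name convention extended to reduce functions.
// Assumption: reduce functions live in "<view name>.reduce.js" files.
const KNOWN_KINDS = ['map', 'reduce'];

// Adds one (already minified) function to the design document.
// Returns true if the file name matched the convention, false otherwise.
function addToDesignDocument(designDoc, fileName, source) {
    const parts = fileName.split('.');
    if (parts.length !== 3 || parts[2] !== 'js' || !KNOWN_KINDS.includes(parts[1])) {
        return false; // not "<view name>.<map|reduce>.js"
    }
    const [viewName, kind] = parts;
    designDoc.views = designDoc.views || {};
    designDoc.views[viewName] = designDoc.views[viewName] || {};
    designDoc.views[viewName][kind] = source;
    return true;
}

const extendedDoc = { language: 'javascript' };
addToDesignDocument(extendedDoc, 'sample_view.map.js', 'function(doc){emit(doc.type,1)}');
// "_sum" is one of CouchDB's built-in reduce functions.
addToDesignDocument(extendedDoc, 'sample_view.reduce.js', '_sum');
const readmeAdded = addToDesignDocument(extendedDoc, 'README.md', 'ignored');
console.log(JSON.stringify(extendedDoc, null, 4));
```

Because the middle part of the file name doubles as the property name inside the view, supporting another kind of function only means adding it to the list of known kinds.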

To further increase automation, and since I generally do not want to add files to source control that can easily be regenerated, I also created a makefile so I can install the library and regenerate my design document with a single make command. Here’s an example of what such a makefile could look like:

.PHONY: all

all: vendor design_mydb.json

vendor: composer.lock
	composer install

design_mydb.json: vendor design_mydb/sample_view.map.js design_mydb/another_view.map.js
	php compileDesignDocument.php design_mydb

Let’s finish with an example. Here’s a map function, stored as sample_view.map.js:

function (doc) {
    // Sample function
    for (k in doc.values) {
        // Emit a key for each of the values
        emit(doc.values[k], doc._id);
    }
}

After running my script, the generated, minified output looks like this, ready for upload into CouchDB:

{
    "language": "javascript",
    "views": {
        "sample_view": {
            "map": "function(doc){for(k in doc.values){emit(doc.values[k],doc._id)}}"
        }
    }
}
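The upload itself can be automated too. CouchDB stores a design document under /{db}/_design/{name} and accepts it via HTTP PUT; when the document already exists, its current _rev has to be included. The sketch below only builds the request. The server URL, the database and design document names and the helper buildUploadRequest are placeholders for illustration, and authentication is left out:

```javascript
// Builds the HTTP request for storing a design document in CouchDB.
// buildUploadRequest, the URL and the names used below are hypothetical.
function buildUploadRequest(serverUrl, dbName, designName, designDoc, currentRev) {
    // Shallow copy so the caller's object is not modified.
    const body = Object.assign({}, designDoc);
    if (currentRev) {
        body._rev = currentRev; // required when replacing an existing document
    }
    return {
        url: serverUrl.replace(/\/+$/, '') + '/' + encodeURIComponent(dbName) +
            '/_design/' + encodeURIComponent(designName),
        options: {
            method: 'PUT',
            headers: { 'Content-Type': 'application/json' },
            body: JSON.stringify(body)
        }
    };
}

const uploadRequest = buildUploadRequest('http://localhost:5984/', 'mydb', 'mydesign', {
    language: 'javascript',
    views: { sample_view: { map: 'function(doc){emit(doc._id,null)}' } }
});
console.log(uploadRequest.url);

// With Node.js 18+ (global fetch) the request could be sent like this:
// fetch(uploadRequest.url, uploadRequest.options).then((res) => console.log(res.status));
```

Keeping the request construction separate from the network call makes it easy to inspect what would be sent before pointing the script at a real database.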

I hope you enjoyed this article and that it has inspired you to add more automation to your software development, because a simple script can often go a long way in simplifying a developer’s life. This was also the first post in which I’ve shared a look behind the scenes of our own development here at CloudObjects. Let me know if you are interested in more content like this!

by Lukas Rosenstock