Posts Tagged ‘Open Source’

Datadotgc.ca – A Drupal case study: Part 2

Colin Calnan | Wednesday, June 23rd, 2010

This is the second part of a Drupal case study on integrating the CKAN data repository with Drupal 6. Part 1 covered the following:

  • What is CKAN?
  • CKAN’s API
  • The Foundation
  • The Build
  • Theming
  • Homepage Chart

Caching

API calls are expensive, particularly when they return large amounts of data. To avoid exhausting the CKAN API with requests and to keep the site responsive, I decided to leverage Drupal’s caching mechanisms and cached pretty much everything I could, within reason. The chart, tag cloud, tag lists, ministry lists, the list of all packages and every individual package are cached. The catch with caching on this site is that if a package gets updated on the CKAN instance, the Drupal site needs to find out promptly and clear the appropriate caches so that the most recent data can be retrieved.

For caching I created a table called ‘cache_ckan’ that stores everything I need. To create this table I reused the schema of Drupal’s existing cache table and declared it in the .install file in my module directory.

/**
 * Implementation of hook_install().
 */
function ckan_install() {
  drupal_install_schema('ckan');
}
 
/**
 * Implementation of hook_uninstall().
 */
function ckan_uninstall() {
  drupal_uninstall_schema('ckan');
}
 
/**
 * Implementation of hook_schema().
 */
function ckan_schema() {
  $schema = array();
  $schema['cache_ckan'] = drupal_get_schema_unprocessed('system', 'cache');
  return $schema;
}

The schema is installed, and the table created, the first time the module is enabled; hook_install() does not run again on later enables.

What is stored in the cache_ckan table?

Various items are stored in the cache table:

  1. The Homepage chart data
  2. Tag lists
  3. Ministry lists
  4. List of all datasets

Let’s take the list of all packages as an example. I covered how I implemented the paging in my previous post. Because this list is paginated, every page needs to be cached to keep the site fast. As the paging mechanism is already implemented, it’s just a case of creating a cache table entry (ckan:all{page-number}) for each page, and then checking for its existence when loading the page.

// If cached data exists for this page, use it.
if (($cache = cache_get('ckan:all' . $page, 'cache_ckan')) && !empty($cache->data)) {
  $results = $cache->data;
}
else {
  $ckan = ckan_ckan();

  // Work out where this page starts in the full list (page 0 starts at 0).
  $items_per_page = variable_get('ckan_items_per_page', 4);
  $offset = $page ? $page * $items_per_page : 0;
  // Limit the request to one page of items.
  $limit = $items_per_page;

  try {
    $results = $ckan->advancedSearch(array('groups' => 'canadagov', 'all_fields' => '1', 'offset' => $offset, 'limit' => $limit));
  }
  catch (Exception $e) {
    return $e->getMessage();
  }

  // The API call worked: log it and cache the results for this page.
  watchdog('ckan', 'Called CKAN API for list of all packages');
  cache_set('ckan:all' . $page, $results, 'cache_ckan');
}

This method is very simple and very effective. It means the pages load lightning fast and only one page of data at a time is retrieved.
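
To make the paging arithmetic concrete, here is a minimal standalone sketch of how a page number maps to a cache ID and to the offset/limit pair sent to CKAN. The helper name ckan_page_window() is hypothetical and not part of the module, and it assumes pages are numbered from 0. Unlike the snippet above, it normalizes a missing page to 0, so the first page would be cached as ckan:all0 rather than ckan:all.

```php
<?php
/**
 * Hypothetical helper (not part of the module): map a zero-based page
 * number to the cache ID and the offset/limit pair for the API call.
 */
function ckan_page_window($page, $items_per_page) {
  // A missing or negative page is treated as the first page.
  $page = max(0, (int) $page);
  return array(
    'cid' => 'ckan:all' . $page,
    'offset' => $page * $items_per_page,
    'limit' => $items_per_page,
  );
}

// With 4 items per page, page 2 starts at record 8.
$window = ckan_page_window(2, 4);
```

Each page therefore costs at most one API call until the cache is cleared.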

How does the cache get cleared/updated?

Datasets/Packages change all the time on the CKAN instance, so how do you make sure that the Drupal site has the most current data? This module has two ways of managing that.

1. Using hook_form to redirect to CKAN

As the CKAN nodes on Drupal are created on the fly and hold very little information, there is really no need to access the edit form for these nodes. Whenever an admin user clicks the edit tab on the node, they are automatically redirected to the appropriate CKAN package editing screen. hook_form is called to retrieve the form that is displayed when one attempts to “create/edit” an item. For CKAN content types, the user is redirected to the CKAN instance instead.

/**
 * Implementation of hook_form
 *
 * Redirect the user to ca.ckan.net package edit screen on edit
 */
function ckan_form(&$node, $form_state) {
  if ($node->type == 'ckan') {
    // The node body is used as the CKAN package name in the edit URL.
    drupal_goto('http://ca.ckan.net/package/edit/' . $node->body);
  }
}

When the CKAN form is submitted, CKAN redirects back to the Drupal site at a specific URL that tells Drupal to call CKAN again, get the package information and repopulate the node. To clarify, the process is:

  1. Redirect http://www.datadotgc.ca/node/X/edit to http://ca.ckan.net/package/edit/{name_of_X}
  2. On save of CKAN Package, redirect to http://www.datadotgc.ca/{special_url}/{name_of_X}
  3. Load the node with {name_of_X}
  4. Call CKAN to get the (updated) data for Package {name_of_X}
  5. Save the node with updated data
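
Steps 3 to 5 can be sketched in plain PHP as follows. This is an illustration only, since the module's real callback is not shown in this post: ckan_refresh_package() and the three callables are hypothetical stand-ins for loading the node by package name, calling the CKAN API, and saving the node, and 'budget-2010' is a made-up package name.

```php
<?php
/**
 * Hypothetical sketch of steps 3-5: load the local record for a package,
 * fetch fresh data from CKAN, and save the merged result.
 *
 * @param string   $name  Package name taken from the callback URL.
 * @param callable $load  Returns the stored record (array) or NULL.
 * @param callable $fetch Returns the current CKAN data for the package.
 * @param callable $save  Persists the updated record.
 */
function ckan_refresh_package($name, $load, $fetch, $save) {
  $record = call_user_func($load, $name);
  if ($record === NULL) {
    // Nothing to refresh for an unknown package.
    return FALSE;
  }
  // Fresh CKAN values overwrite the stored ones.
  $record = array_merge($record, call_user_func($fetch, $name));
  call_user_func($save, $record);
  return $record;
}

// Usage with in-memory stand-ins instead of Drupal and CKAN.
$store = array('budget-2010' => array('name' => 'budget-2010', 'title' => 'Old title'));
$load = function ($name) use (&$store) {
  return isset($store[$name]) ? $store[$name] : NULL;
};
$fetch = function ($name) {
  return array('title' => 'New title');
};
$save = function ($record) use (&$store) {
  $store[$record['name']] = $record;
};
$updated = ckan_refresh_package('budget-2010', $load, $fetch, $save);
```

In the real module the three callables would correspond to Drupal's node loading and saving functions and a call through the CKAN client.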

2. Using cron and an Atom feed

CKAN provides an Atom feed of recent updates to the Packages. Cron checks this feed every time it runs. If the feed has changed since the last cron run, then we know there have been updates and we clear all of the caches.

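/**
 * Implementation of hook_cron().
 */
function ckan_cron() {
  // Get the md5sum of the current Atom feed. md5_file() returns FALSE if
  // the feed cannot be read, in which case we leave the caches alone.
  $current_feed = md5_file('http://ca.ckan.net/revision/list?format=atom');
  if ($current_feed === FALSE) {
    watchdog('ckan', 'Could not read the CKAN Atom feed');
    return;
  }
  watchdog('ckan', 'Current feed md5: ' . $current_feed);
  // Retrieve the previously stored md5sum. Default to an empty string so
  // the first run stores a hash; with the current hash as the default, the
  // comparison below could never fire unless the variable was set elsewhere.
  $previous_feed = variable_get('ckan_atom_feed_md5', '');
  watchdog('ckan', 'Previous feed md5: ' . $previous_feed);

  // If there have been changes
  if ($current_feed != $previous_feed) {
    watchdog('ckan', 'Atom feed has updated, clearing caches');
    // Flush all the caches.
    cache_clear_all('*', 'cache_ckan', TRUE);
    // Store the new md5 for the next run.
    variable_set('ckan_atom_feed_md5', $current_feed);
  }
}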
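
The comparison step can also be read on its own, away from Drupal. The following is a minimal sketch under stated assumptions: ckan_feed_changed() is a hypothetical helper, the feed body is passed in as a string (or FALSE when the fetch failed), and the sample feed strings are invented. It treats a failed fetch and a first run as "no change".

```php
<?php
/**
 * Hypothetical helper: decide whether the feed changed since the last run.
 *
 * @param string|false $body          Feed contents, or FALSE if the fetch failed.
 * @param string|null  $previous_hash Hash stored on the previous run, if any.
 * @return array array(changed?, hash to store)
 */
function ckan_feed_changed($body, $previous_hash) {
  if ($body === FALSE) {
    // A failed fetch is not a change; keep the old hash.
    return array(FALSE, $previous_hash);
  }
  $hash = md5($body);
  // On the very first run there is nothing to compare against.
  $changed = ($previous_hash !== NULL && $hash !== $previous_hash);
  return array($changed, $hash);
}

// Invented feed bodies: the second revision differs from the first.
list($changed, $hash) = ckan_feed_changed('<feed>rev 2</feed>', md5('<feed>rev 1</feed>'));
```

Whether a first run should clear the caches or not is a design choice; clearing once too often is harmless, while missing an update is not.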

Tag cloud creation

I borrowed some code from the Tagadelic module to build the tag cloud:

/**
 * Build a tag cloud based on the settings provided
 *
 * @return	String	A themed list of weighted tags
 */
function ckan_tag_cloud() {
	// If there is cached data
	if(($cache = cache_get('ckan:tags', 'cache_ckan')) && !empty($cache->data)) {
		$results = unserialize($cache->data);	
	} else {
		$ckan = ckan_ckan();
		$results = $ckan->getTagCount();
		watchdog('ckan', 'Called CKAN API for tag cloud');
		cache_set('ckan:tags', serialize($results), 'cache_ckan');
	}
 
	// Let's sort them by weight first off.
	// Each row is array(tag name, count).
	$weight = array();
	foreach ($results as $key => $row) {
		$weight[$key] = $row[1];
	}
	array_multisort($weight, SORT_DESC, $results);
 
	// Now let's get the top X number of tags
	$results = array_slice($results, 0, variable_get('ckan_tagcloud_total', 40));
 
	// Now build the tags
	$tags = ckan_tag_build_weighted($results);
	// Sort them
	$tags = ckan_tag_sort($tags);
	// Theme them
	$output = theme('ckan_weighted_tags', $tags);
	return $output;
}
 
/**
 * Theme function that renders the HTML for the tags
 * @ingroup themable
 */
function theme_ckan_weighted_tags($tags) {
  $output = '';
  foreach ($tags as $tag) {
    $output .= l($tag['name'], 'data/tag/'.$tag['name'], array('attributes' => array('class' => "tagcloud level".$tag['weight'], 'rel' => 'tag'))) ." \n";
  }
  return $output;
}
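
The ckan_tag_build_weighted() function is not shown above. As a rough sketch of what such a function could do, the following standalone PHP maps tag counts onto a fixed number of levels using a logarithmic scale, modeled on the weighting approach used in Tagadelic. The function name, the $steps parameter and the sample data are assumptions for illustration, not the module's code.

```php
<?php
/**
 * Hypothetical sketch: turn array(name, count) rows into weighted tags.
 * Weights run from 1 to $steps on a logarithmic scale.
 */
function example_tag_build_weighted(array $rows, $steps = 6) {
  if (empty($rows)) {
    return array();
  }
  // Use log(count) so a few very popular tags do not flatten the rest.
  $logs = array();
  foreach ($rows as $key => $row) {
    $logs[$key] = log(max(1, $row[1]));
  }
  $min = min($logs);
  $max = max($logs);
  // Slightly widen the range so the largest tag rounds down to $steps.
  $range = max(0.01, $max - $min) * 1.0001;
  $tags = array();
  foreach ($rows as $key => $row) {
    $tags[] = array(
      'name' => $row[0],
      'count' => $row[1],
      'weight' => 1 + (int) floor($steps * ($logs[$key] - $min) / $range),
    );
  }
  return $tags;
}

// Invented sample data: counts of 100, 10 and 1 give weights 6, 3 and 1.
$tags = example_tag_build_weighted(array(
  array('health', 100),
  array('budget', 10),
  array('maps', 1),
));
```

The resulting weight is what the theme function above turns into a CSS class such as level6, so the stylesheet decides how large each level appears.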

Using the CKAN Search API for all lists

OK, so what’s this all about? CKAN has some nice API calls, like /api/rest/package, that return a list of Packages. However, these return the name/id of each Package ONLY. For our listings we wanted other data, such as the tags attached to the Package as well as a brief description.

The only way to get this data was to do a search API call /api/search/package and pass some extra parameters, in this case all_fields=1 and department={name of Ministry}.

all_fields=1 tells the search to return all Package fields, not just the name/id, just as if you had called /api/rest/package/PACKAGE-REF.

department={name of Ministry} tells the search to return all packages that have a department of {name of Ministry}. The lovely folks at CKAN added this functionality for us on request.

What does this look like? Well, it’s pretty simple really: call the advancedSearch() function, pass it a few parameters, and it returns all the data you need. Here’s the function itself:

public function advancedSearch($parameters) {
	// Build the query string from the key/value pairs.
	$querystring = '';
	foreach ($parameters as $key => $value) {
		$querystring .= $key . '=' . urlencode($value) . '&';
	}
	$results = $this->transfer('api/search/package?' . $querystring);
	// A missing or zero count is treated as an error.
	if (!$results->count) {
		throw new CkanException("Search Error");
	}
	return $results;
}

And here is that function being called for the list of Ministry Packages. The offset and limit are for the paging mechanism:

// Call the function
$results = $ckan->advancedSearch(array('department' => $ministry, 'all_fields' => '1', 'offset' => $offset, 'limit' => $limit));
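
For reference, the request path that advancedSearch() builds can be reproduced with PHP's built-in http_build_query(). This is a standalone sketch, not the client library's code: example_search_path() is a hypothetical helper and 'Health Canada' is just a sample department value.

```php
<?php
/**
 * Hypothetical helper: build the search path for a set of parameters.
 */
function example_search_path(array $parameters) {
  // Pass '&' explicitly so the arg_separator.output ini setting is ignored.
  return 'api/search/package?' . http_build_query($parameters, '', '&');
}

$path = example_search_path(array(
  'department' => 'Health Canada',
  'all_fields' => '1',
  'offset' => 0,
  'limit' => 4,
));
// $path is 'api/search/package?department=Health+Canada&all_fields=1&offset=0&limit=4'
```

Compared with concatenating in a loop, this avoids the trailing '&' and encodes keys as well as values.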

There’s a lot more functionality in this module, more than I can go through in a blog post, or even five posts. If you’re trying to integrate Drupal with a CKAN instance and are not sure where to start, then please leave a comment and I’ll get back in touch.

Drupal vs WordPress: Which one is right for you?

Lauren Bacon | Monday, November 9th, 2009

Here at Raised Eyebrow, while we have experimented with dozens of Content Management Systems (CMSs), these days we mostly build websites using either Drupal or WordPress.

Why these two, out of the thousands of content management systems available? They share several key qualities:

  • They’re open-source projects. Over the past few years, Raised Eyebrow has increasingly turned to open-source software options because of the flexibility and security they offer.
  • Both WordPress and Drupal boast huge communities of developers and widespread adoption; those are important things to look at when working with open-source software, because we like to see a critical mass of people who are invested in making the software better, both on the coding side and from the end-user perspective.
  • They offer a rich and robust feature set, both within the core CMS and in terms of the plugins (or in Drupal parlance, modules) that are available — plugins and modules help us extend the base functionality of your site with features such as photo galleries, event calendars, interactive forms, shopping carts, and so on.
  • Perhaps the most compelling reason we’ve chosen these two, though, is that our clients like using them. The interfaces are user-friendly; the software is reliable; and the basic functions that our clients need (from uploading a file attachment to creating new pages and blog posts) are available, easy to use, and intuitive. (I won’t claim there is nothing I would change if I could wave a magic wand, but of the CMSs we’ve tested, these two are far and away at the top of the heap.)

So how do you choose which one is appropriate for your project? Drupal & WordPress are very different systems, with different strengths and weaknesses. Here’s a quick overview of some of the distinguishing features of each CMS.

Drupal

Drupal welcome screen

Community focus: Drupal has extensive functionality for allowing people to interact with one another via your website. Creating accounts; logging in to access special content — or create their own; connecting with one another — all of these are possible with a Drupal site, so if your short- or long-range plans include turning your website into a social hub for your visitors, Drupal is a better choice.

Editing a page in Drupal

Editing is seamless: In Drupal, if you have administrative privileges, and you are logged in, you can edit your content simply by navigating to the page you want to update, and clicking an unobtrusive “Edit” tab. Many people find this a particularly intuitive approach to site editing. (Not only that, but Drupal is so profoundly customizable that if you want to, you can create custom themes for different areas of your site — so your back-end could look totally different from your front-end, should you feel so inspired.)

Specialized content types in Drupal

Built for dynamic content: Drupal has some very clever ways of cross-categorizing content, so if you have the kind of website where you want content to appear in multiple places based on various categories you assign to it, Drupal may be just right for you. And it’s often the better choice for managing complex kinds of content, where a simple 2-field “Title” and “Body” editing screen won’t suffice.

Highly modular & extensible: The underlying architecture of Drupal is quite flexible, and the CMS can be adapted for a wide variety of purposes. Drupal is like a Swiss Army knife or a food processor: it is many tools in one, and you can choose to use it for one task or several. WordPress is much more specific in its function: it does a handful of things and does them very well, but it isn’t the right tool for every job. (On the other hand, if you need a simple site, Drupal may be overkill, and you could spend a lot of time turning off the features you don’t want.)

Greater investment required up front: Drupal’s out-of-the-box configuration is somewhat limiting, and most people prefer to customize it pretty heavily. This requires not only a solid understanding of HTML and CSS, but also of PHP and of Drupal’s underlying architecture, which has a fairly steep learning curve. As a result, Drupal sites tend to cost more to set up, though the initial investment is well worth it if you plan to extend your site’s functionality to take advantage of Drupal’s flexibility.

WordPress

WordPress's editing screen looks quite different from your site's front-end. This is the screen I see while editing the blog post you're reading.

Built for blogging: I personally find Drupal’s blogging capabilities somewhat limited — for example, creating blog category lists, tag clouds and date-based archives is rather onerous in Drupal, whereas in WordPress they take a matter of minutes to set up. WordPress was first developed as blogging software, and it shows: its blogging features are well thought-through and have been polished by years of improvements.

WordPress's Media Library gives you easy access to all the files you've uploaded to your site: images, PDFs, media files, etc.

Easy-to-use file management: WordPress’s “Media Library” feature allows you to browse through all the files you’ve uploaded to your site — images, PDFs, multimedia files, whatever they might be — in a clean, attractive & easy-to-use interface. It makes managing your files and inserting them into your blog posts and site pages a much easier task.

Smart spam filtering: Because of WordPress’s blogging focus, the developers had to pay close attention to managing spam. (Blogs attract a lot of spam via comments and pingbacks.) WordPress comes bundled with spam-filtering software that does a remarkably good job — and moreover, its comment-management features are well thought-out and simple to use.

The WordPress Dashboard gives you an overview of activity on your blog or website.

Quick to install and configure: WordPress is famous for its 5-minute install, and it really does live up to its name. Although that doesn’t mean you’ll have a fully-functioning website in 5 minutes, it works well “out of the box” for most simple sites & blogs. As a result it is often less costly than Drupal to set up.

Easy to theme: Both Drupal and WordPress have a great deal of flexibility with regard to visual design — you can make a site built in either CMS look beautiful via either free templates or by applying your own custom design. However, theming a Drupal site is a much bigger task than theming a WordPress site, unless you are simply going to download a free theme and slap it on your site. If you want to be able to tweak design details, in our experience, that’s a much faster job in WordPress.

If you aren’t planning to use WordPress’s blogging features, navigating through the CMS can be a little confusing, because blog posts are the primary focus in the menus, and page editing is less prominent. In this sense, its focus on blogging can be a weakness as well as a strength.

WordPress keeps your site’s back-end (that’s the area where you create & edit content) totally separate from the front-end (the part your visitors see). Some people (like yours truly) prefer this approach, where content is more or less divorced from presentation, whereas others prefer Drupal’s integrated editing options. In my experience, this is a highly subjective preference, and it’s worth trying both to see which feels better to you.

In Summary

As with most decisions about your website, you are well advised to consider the long-range goals of your site before selecting the CMS you’re going to use. If you foresee a highly dynamic website with complex content types, and/or community features such as member login areas, multiple blogs, or user-created content, Drupal may be better suited to the job. On the other hand, if your content management needs are relatively straightforward, or if you intend to have a blog-centered website, WordPress could be just right for you.

Still have questions? Please feel free to leave them in the comments & we’ll do our best to answer them.

Contribute to the Community? Yes You Can!

Colin Calnan | Thursday, October 22nd, 2009

I’d been a little skeptical about the idea that anyone can contribute to the Open Source community by giving a little help now and again. This skepticism came from the flaming given for asking ‘newbie’ questions, asking a question in the wrong room or from suffering the raging ego of a well seasoned and highly adored contributor. Today however, I feel much more positive about my ability to give back to Drupal and the Open Source community, and it all came from a simple (or maybe not so simple) thank you. Here’s the story:

Yesterday afternoon, while tearing my hair out over some tricky views problem that I could not find a solution to, I logged in to IRC, which I don’t usually do, to see if I could get some help.

In the #drupal room I asked my question and as usual didn’t receive any replies (which is why I don’t usually log in). It’s a bit of a difficult VIEWS problem and those usually don’t get answered quickly in IRC. While waiting for a response I noticed a question that was similar to mine but not as complex. I took some time (2 minutes) to review the code and I was able to speculate on what the problem was. I posted a proposed solution and it worked. I didn’t really think much of it until later on when I received an email via drupal.org from the person I helped asking me for my e-mail address. I sent it on and this morning I awoke to the email below. Until now I had been wary of devoting my time to hanging around in the IRC channels. This experience has changed my feelings on this quite a bit and I hope to spend a little more of my time in there, when I have it, helping others. In this case, it definitely did pay to help others, but the thank you email was more than enough, it made my day, my week, my month and maybe even my year :) Here’s to contributing more to the community. It doesn’t take much, and just a simple thank you for the help, when given, makes all the difference.

Hi Colin,

Thanks. I am in Fort Smith, NT just on the Alberta/NWT border directly north of Edmonton.
I am a Fire Scientist, I model Wildfire on the landscape. Yesterday I was completing the Drupal Integration part of a multi-year government project, most of us running on little or no sleep and less patience. I CVS updated VIEWS and it became way more strict about the PHP code and broke my drupal. I had been begging for assistance in different irc drupal chats for about 45 mintues when you came along. It really was like saving our team, when the view worked, the entire fire science team cheered. We just wanted to say a real heartfelt thanks.
I am sending you, $20 via email, its a small gesture, as our team knows that $20 will buy a friday case of beer anywhere in canada except north of Norman Wells NWT ;-) Perhaps you can claim to be the first in your office to have been mailed a case of beer by the drupal community – LOL

 

