Fixing O'Reilly Safari Books Online annoyances, part 2

As I said in my last post, I like O'Reilly's Safari Online, but I find some of the presentation a little annoying. Apart from the grotty font choices, my other bugbear is the awful "Additional reading" section that appears at the bottom of many of the pages. The blurb says:

Safari has identified sections in other books that relate directly to this selection using Self-Organizing Maps (SOM), a type of neural network algorithm. SOM enables us to deliver related sections with higher quality results than traditional query-based approaches allow.

but high-quality it isn't. For example, whilst viewing a page about SQL queries I got a suggestion that I go and read a chapter in a C# book!

Whilst I could fix the font problem with Firefox's userContent.css file, I couldn't use that method to excise the "Additional reading" cruft: although it was in a div, that div had neither a class nor an id, so there was nothing for a CSS selector to hook onto. This looked like a job for Greasemonkey.

Greasemonkey is a Firefox extension that lets you run user-defined JavaScript over each page as the browser loads it, so you can modify the page content before it is displayed. Earlier versions of Greasemonkey had some serious security flaws, but these have been fixed in the current version.

With Greasemonkey installed, removing the cruft was a snap: each section I wanted to chop out was enclosed in a div, as I said, and contained an h4 heading whose content was the string "Additional reading". As I was going to have to use Greasemonkey anyway, I reimplemented the CSS hack I described in my last post in the same script. The advantage is that individual Greasemonkey scripts can be enabled and disabled, whereas userContent.css always applies. The following script did the trick:

// ==UserScript==
// @name		Safari Books Online cleanup
// @namespace		http://bleaklow.com/greasemonkey
// @description		Fix fonts and remove 'Additional reading' section from Safari pages
// @include		http://*.safaribooksonline.com/*
// ==/UserScript==

/* Make the fonts readable. */
GM_addStyle(' \
	.docText, .docList { \
		font-family:	sans-serif	!important; \
		font-size:	medium		!important; \
	} \
	.docFootnote, .docItemizedlist { \
		font-family:	sans-serif	!important; \
		font-size:	small		!important; \
	} \
	tt, pre, code, .docMonoFont { \
		font-family:	monospace	!important \
	} \
');

/* Remove cruft from the top of the page. */
var node;
var nodes = document.evaluate(
    '//tr[@class="toplogo"]/../..',
    document, null, XPathResult.ORDERED_NODE_SNAPSHOT_TYPE, null);
if (nodes.snapshotLength == 2) {
	node = nodes.snapshotItem(0).parentNode;
	node.parentNode.removeChild(node);
	node = nodes.snapshotItem(1);
	node.parentNode.removeChild(node);
}

/* Remove the stupid "Additional reading" section. */
node = document.evaluate(
    '//h4[.="Additional reading"]/..',
    document, null, XPathResult.ANY_UNORDERED_NODE_TYPE, null).singleNodeValue;
if (node != null) {
	node.parentNode.removeChild(node);
}

/* Remove the book cover and details and replace with a simple heading. */
node = document.evaluate(
    '(//a[@title="Book Cover"])[1]/../..',
    document, null, XPathResult.ANY_UNORDERED_NODE_TYPE, null).singleNodeValue;
if (node != null) {
	var td_cover = node.childNodes[0];
	var tbody_info = node.childNodes[1].childNodes[0].childNodes[0];
	node.removeChild(td_cover);
	var title =
	    tbody_info.childNodes[0].childNodes[0].childNodes[0].innerHTML;
	while (tbody_info.childNodes.length > 1) {
		tbody_info.removeChild(tbody_info.lastChild);
	}
	tbody_info.childNodes[0].innerHTML = "<h2>" + title + "</h2>";
}

The important bits to note are the @include, which restricts this script to just the Safari website, and the use of document.evaluate with an XPath expression to find the appropriate part of the DOM to tweak.
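
The find-and-remove pattern used for the "Additional reading" section can be pulled out into a small helper. This is only a sketch: removeFirstMatch is a name I have made up rather than anything Greasemonkey provides, and it asks for the first match in document order (XPathResult.FIRST_ORDERED_NODE_TYPE, whose numeric value is 9) instead of the ANY_UNORDERED_NODE_TYPE used in the script above. The document is passed in as a parameter so the helper can be tried out against a stand-in object outside a browser.

```javascript
// Numeric value of XPathResult.FIRST_ORDERED_NODE_TYPE in the DOM XPath API.
const FIRST_ORDERED_NODE_TYPE = 9;

/**
 * Remove the first node (in document order) matching `xpath` from `doc`.
 * Returns true if a node was found and removed, false otherwise.
 * Hypothetical helper for illustration; not part of Greasemonkey.
 */
function removeFirstMatch(doc, xpath) {
    const result = doc.evaluate(xpath, doc, null, FIRST_ORDERED_NODE_TYPE, null);
    const node = result.singleNodeValue;
    // Nothing matched, or the match is already detached from the tree.
    if (node === null || node.parentNode === null) {
        return false;
    }
    node.parentNode.removeChild(node);
    return true;
}

// Only touch the page when a real document exists (browser or user script).
if (typeof document !== "undefined") {
    removeFirstMatch(document, '//h4[.="Additional reading"]/..');
}
```

The trailing /.. in the XPath expression steps from the matching h4 up to its enclosing div, which is why the whole section disappears rather than just the heading.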

Categories : Web, Tech