
Decay in the Amazon APIs

In Chapter 17 of Pro Web 2.0 Mashups, I created a mashup of an Amazon wishlist and Google Spreadsheets. When I returned to examine my code last night, I learned that it no longer worked. Why? First, the Amazon Ecommerce API morphed into the Amazon Product Advertising API, which explains why I was puzzled that the API wasn't listed where I expected it to be. Unfortunately, Amazon, in its infinite and inscrutable wisdom, also decided to kill the ListLookup operation, the one call I depended on to retrieve the contents of my Amazon wishlist. (I'm not alone in having applications broken by this change.)

So what to do now? Interestingly enough, someone just announced a JSON feed service for a given wishlist; see, for example, Jeff Bezos' wishlist and mine (in JSON). I hope it stays around. How does it work, given the demise of the ListLookup operation? My guess is that some sort of screen-scraping is going on.

Mapper for serializing XML ElementTree with SQLAlchemy

This morning I've been working out how to configure a SQLAlchemy mapper to let me store an XML blob in a relational database. Storing the XML element is a bit of a hack right now; ultimately, I want to map all the relevant pieces of the XML to appropriate Python object attributes. But until I figure that mapping out, I'm saving the XML so that, in theory, I can avoid making yet another API call.

I think I may have found relevant examples to guide me. SQLAlchemy comes with a set of examples: see the docs and code (v 0.6.1) — specifically, the examples for serializing XML. The first approach (using PickleType) is simple and expedient, a good enough one to start with. I'll come back to study the others.
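To make the PickleType approach concrete, here's a minimal sketch; the table and column names below are made up for illustration and aren't my actual schema:

```python
# A minimal sketch of the PickleType approach (hypothetical table and
# column names): the parsed ElementTree is pickled into a single column.
import xml.etree.ElementTree as ET

from sqlalchemy import Column, Integer, PickleType, String, create_engine
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()

class ApiResponse(Base):
    __tablename__ = 'api_responses'
    id = Column(Integer, primary_key=True)
    source = Column(String)     # which API call produced this blob
    doc = Column(PickleType)    # the whole ElementTree, pickled as-is

engine = create_engine('sqlite://')   # in-memory database for the demo
Base.metadata.create_all(engine)

tree = ET.ElementTree(ET.fromstring('<wishlist><item>book</item></wishlist>'))

with Session(engine) as session:
    session.add(ApiResponse(source='ListLookup', doc=tree))
    session.commit()

with Session(engine) as session:
    saved = session.query(ApiResponse).first()
    item_text = saved.doc.getroot().find('item').text

print(item_text)
```

Once the real attribute mapping is worked out, the pickled column can be replaced with proper columns; until then, keeping the whole tree means the original API call doesn't need to be repeated.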


LinkedIn API: First steps using Python

My enthusiasm for LinkedIn increased dramatically once I learned that LinkedIn had opened up its API to the public at large. What is still unclear to me is how much the API allows one to get data in and out of LinkedIn. One of the best ways to find out: dive in and see what we can learn.

In this post, I describe some first steps you can take to learn to access the LinkedIn API with Python (a favorite programming language of mine and of many others):

  1. Get oriented by looking at the main page for the API, which in this case is the LinkedIn Developer Network.
  2. You'll need a set of developer keys for each application, which you can get by registering an application. You'll be asked to log in with your LinkedIn user account email/password. If you don't yet have a LinkedIn user account, sign up for one. (Using the API doesn't require a separate developer account.)
  3. I found a few tricky bits in the registration process. First of all, you're going to have to remind yourself of (or teach yourself for the first time) the basics of OAuth, an open protocol used by LinkedIn (and other websites) to authorize users. (I won't attempt to provide such a tutorial here. A set of slides from LinkedIn does a pretty good job of giving an overview of OAuth.) The second tricky part was that I had forgotten whether OAuth could support desktop applications. It turns out that you get three options for the type of app you are registering: desktop, web, and mobile. In my case, I registered mine as a desktop app.
  4. When you are finished registering your application, you will get two important parameters for your app: the OAuth (consumer) key and secret. You will need these two parameters in your Python application.
  5. You can choose to work with the API directly at the HTTP protocol level or look for API libraries that wrap the protocol. I searched for such a Python library and found pylinkedin (a bare-bones but functional library), whose source you can get via Mercurial by hg clone pylinkedin
  6. Install the library with the usual python setup.py install. pylinkedin was tested on Python 2.5, but so far I've found it to work on Python 2.6 as well. Note that the library requires that you have the oauth Python library installed (available via svn at
  7. You will need to correct a small bug that I found in pylinkedin (in the parsing of companies) — you'll need to edit the file according to the instructions I posted.

Now we're ready to run the following code, which displays the first and last names of your LinkedIn "connections" (the people in your immediate circle). Remember the key and secret you got when you registered your application: plug them into the following program. (Note that a browser window should open, prompting you to log in to LinkedIn. You'll then have to enter a code that LinkedIn gives you to let the program access your LinkedIn data.)

# insert your application KEY and SECRET
API_KEY = "[YOUR API KEY]"
SECRET_KEY = "[YOUR SECRET]"

import webbrowser

from linkedin import LinkedIn

li = LinkedIn(API_KEY, SECRET_KEY)

token = li.getRequestToken(None)

# prompt the user in a web browser to log in to LinkedIn and then
# enter the verification code that LinkedIn displays
auth_url = li.getAuthorizeUrl(token)
webbrowser.open(auth_url)
validator = raw_input("Enter token: ")

# exchange the request token and verification code for an access token
access_token = li.getAccessToken(token, validator)

# list all connections

connections = li.connections_api.getMyConnections(access_token)

print "number of connections: ", len(connections)
for c in connections:
    print c.firstname, " ", c.lastname

There are obviously improvements to make in this starter code — but this should get us all on our way.

Acknowledgements: Thanks to LinkedIn for the API and to Max Lynch for pylinkedin, which made getting into the API much easier!


Zotero REST API: early developments

Readers of my book know that I'm an avid user of social bookmarking and online bibliographic systems. As a big fan of Zotero ("a free, easy-to-use Firefox extension to help you collect, manage, and cite your research sources"), I have been looking for ways to integrate Zotero (both the client and the web front end) further into my workflows. Specifically, I would like to experiment with using Zotero in my teaching this semester to help my students and me share references.

I've played with scripting Zotero from within the web browser using Chickenfoot. However, a fully fleshed-out Zotero REST API will, I think, be even more helpful in tying Zotero to other applications. There is a Zotero REST API under development, though it is currently limited and not well publicized.

The best publicly available documentation of the API as it stands, as far as I know, is Jeremy Boggs' phpZotero code. Reading it taught me how to make some basic calls.

(Note: I currently have two separate Zotero accounts: a personal one (rdhyee) and one for my course (mixingandremixinginfo). Ideally, I'd manage only one account, but I have two because, from what I understand, there is no fine-grained privacy control that would let me keep some references private while leaving most public. Moreover, it seems that the REST API can read only public items for now — so I'll be applying it to mixingandremixinginfo.)

Let me write out what you can do with the API to get started.

  1. Create a key at  Zotero | Settings > Feeds/API (Though I've used a key, I don't see any indication of key usage in my panel — is this a bug?)
  2. Go to the Zotero privacy tab to check "Publish Entire Library" and "Publish Notes" to make the items (and optionally notes) visible to your API key.
  3. Figure out the userID for the account. You can do so by doing an HTTP GET on {username}, e.g., — see

    <?xml version="1.0" encoding="utf-8"?>
    <entry xmlns="" xmlns:zapi="">
        <link rel="self" type="application/atom+xml"
        <link rel="alternate" type="text/html"
        <content type="application/xml">
                <zapi:affiliation><![CDATA[UC Berkeley]]></zapi:affiliation>
                <zapi:bio><![CDATA[<p>This is a public account associated with the Mixing and Remixing Information course at the UC Berkeley School of Information</p>]]></zapi:bio>
                <zapi:disciplines><![CDATA[Information Science and Technology]]></zapi:disciplines>
                <zapi:location><![CDATA[Berkeley, CA]]></zapi:location>
                <zapi:realname><![CDATA[Mixing and Remixing Information]]></zapi:realname>

    and then parse out the id element (e.g., 119961)

  4. Now, to get items and collections, note that the base URL for the API is
    and that you can use two parameters, username and apiKey, as in
    curl -v -k -X GET "{username}&apiKey={apiKey}"
    to return publicly visible items.
  5. You can get data about a specific item:

    curl -v -k -X GET "{userid}/items/{itemid}?username={username}&apiKey={apiKey}"

    e.g., curl -v -k -X GET "{apiKey}"

  6. Jeremy's phpZotero code hints at other things you can read from the API, including user collections and their items, that I've not tried accessing. However, doing so should be straightforward.
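To sketch step 3 in code: the userID can be parsed out of the Atom entry with Python's ElementTree. (The XML below is a trimmed, assumed stand-in for the real response; only the numeric id, 119961, comes from my actual account.)

```python
# Parse the numeric userID out of a (trimmed, assumed) Atom entry like
# the one returned for a username lookup.
import xml.etree.ElementTree as ET

atom_entry = '''<?xml version="1.0" encoding="utf-8"?>
<entry xmlns="http://www.w3.org/2005/Atom">
    <title>mixingandremixinginfo</title>
    <id>http://zotero.org/users/119961</id>
</entry>'''

ATOM = '{http://www.w3.org/2005/Atom}'

entry = ET.fromstring(atom_entry)
user_uri = entry.find(ATOM + 'id').text
user_id = user_uri.rstrip('/').split('/')[-1]
print(user_id)   # -> 119961
```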

Don't consider what I've written here definitive by any means, since I don't know of any publicly available official documentation. I'm eager to see the further development of the API, including the ability to access more data (about users, groups, etc.) and the ability to write to the Zotero servers.

Comments welcome!


Updates to Chapter 1

Some updates I'm making to Chapter 1

  1. Re: LibraryLookup Bookmarklet -> .   The LibraryLookup Project is a good page to use as a reference for the project.
  2. -> (thanks to

Creating book XHTML from DocBook

Ever since Pro Web 2.0 Mashups came out, I've wanted to get the book on the web. Publishing PDFs was a start — but I have envisioned developing a full-blown web application, a book that could interact with my readers and be self-correcting and self-updating. It's only appropriate that a book about APIs and mashups should itself embody the techniques described in it!

Well, it's going to be a while until I get there — but I'm happy to have taken the next step: publishing an (X)HTML version of the book. The canonical version of the book ended up being a series of QuarkXPress files. I had written some Python appscript programs to convert the book to a simple homebrew XML representation but didn't have sufficient time to take it all the way to DocBook. I hired Liza Daly's Threepress Consulting to do the bulk of the conversion to DocBook, leaving me some labor-intensive details to fit my budget. (BTW, Liza is a great person to work with, very smart and responsive to my queries.)

With the book in DocBook, I have been using oXygen 11 to edit the files and transform them into XHTML. I hunted around for CSS files for DocBook-derived XHTML but found surprisingly few options. The one I'm currently settling on to get me started is the stylesheet for FreeBSD documentation (see the style in action on the FreeBSD site). I'll definitely want to customize the stylesheet to give the book the look and feel I desire — but I'm happy with the starting point.


a rough XHTML version of my book is now live

I'm pleased to announce that I've posted an  XHTML version of Pro Web 2.0 Mashups:  Remixing Data and Web Services. This should make it much easier for folks to use my book.  It also opens up some good opportunities for me to update the book, which I plan to do as I teach my Mixing and Remixing Information course this semester.

A warning: the XHTML translation has some technical flaws that I'm working to fix.  However, I figured that it was better to put a working version up and then fix it along the way.

Enjoy and let me know what you think!


Google Mashup Editor will be shut down

I'm sad about the announcement that Google is shutting down its Mashup Editor — one reason being that I spent a fair amount of effort writing about it in Chapter 11 of my book. Oh well. Google App Engine is touted as a suitable and more powerful alternative to the Mashup Editor — but I have to agree with the comment that the simplicity of the Mashup Editor was a virtue.


Services built upon Amazon EC2

According to How To: Getting Started with Amazon EC2 –, the following services are built on top of EC2 (and thereby perhaps make scaling up EC2 easier):

What other ones are out there?

S3Ajax: creating buckets and uploading keys

Continuing on from Mashup Guide :: listing keys with S3Ajax, here I present a Chickenfoot script that creates a bucket and uploads a file. Specifically, it creates a bucket named raymondyeetest and uploads a file (D:\Document\PersonalInfoRemixBook\examples\ch16\Exported Items.rdf from my WinXP machine) to the key exportitems.rdf in that bucket:


var AWSAccessKeyId = "[AWSAccessKeyId]";
var AWSSecretAccessKey = "[AWSSecretAccessKey]";

S3Ajax.URL = 'http://s3.amazonaws.com';
S3Ajax.DEBUG = true;
S3Ajax.KEY_ID = AWSAccessKeyId;
S3Ajax.SECRET_KEY = AWSSecretAccessKey;

// function to read the contents of a file given a file:// URL

function read_file_contents(aFileURL) {

  var ios = Components.classes["@mozilla.org/network/io-service;1"]
              .getService(Components.interfaces.nsIIOService);
  var url = ios.newURI(aFileURL, null, null);

  if (!url || !url.schemeIs("file")) throw "Expected a file URL.";
  var bFile = url.QueryInterface(Components.interfaces.nsIFileURL).file;
  var istream = Components.classes["@mozilla.org/network/file-input-stream;1"]
                  .createInstance(Components.interfaces.nsIFileInputStream);
  istream.init(bFile, -1, -1, false);
  var bstream = Components.classes["@mozilla.org/binaryinputstream;1"]
                  .createInstance(Components.interfaces.nsIBinaryInputStream);
  bstream.setInputStream(istream);
  var bytes = bstream.readBytes(bstream.available());
  return bytes;
}

// create a bucket

var newBucketName = 'raymondyeetest';

S3Ajax.createBucket(newBucketName, function() {
    output("created a bucket: " + newBucketName);
}, function () {
    output("error in createBucket");
});

// add a key to the bucket

var fileURL = 'file:///D:/Document/PersonalInfoRemixBook/examples/ch16/Exported%20Items.rdf';
var content = read_file_contents(fileURL);

S3Ajax.put(newBucketName, "exportitems.rdf", content,
    {
        content_type: "application/xml; charset=UTF-8",
        meta: {'creator': 'Raymond Yee'},
        acl: "public-read"
    },
    function(req) {
        output("Upload succeeded");
    },
    function(req, obj) {
        output("Upload failed");
    });
A few things to note about this code:

  • read_file_contents() doesn't strike me as a terribly elegant way of reading the contents of a file — but it's what I have working for now.
  • A tricky part of getting this to work was noticing that in FF 3.x, charset=UTF-8 is automatically tacked on to the Content-Type HTTP request header in an XMLHttpRequest. I don't know how to change the charset (or whether you can), or why UTF-8 is being tacked on. But figuring out the charset was crucial to getting this working, since the Content-Type HTTP header is used in the calculation of the Amazon S3 signature.
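To illustrate that last point: S3's (pre-Signature-Version-4) request signing computes an HMAC-SHA1 over a string that includes the Content-Type header verbatim, so a signature computed for application/xml will not match a request actually sent with application/xml; charset=UTF-8. A quick Python sketch, with a made-up secret key and date:

```python
# Sketch of S3's (V2-era) string-to-sign: the Content-Type header is
# part of the signed string, so any change to it changes the signature.
import base64
import hashlib
import hmac

def sign_s3_request(secret_key, method, content_md5, content_type, date, resource):
    # string-to-sign per the S3 REST authentication scheme
    # (no x-amz- headers in this simplified example)
    string_to_sign = '\n'.join([method, content_md5, content_type, date, resource])
    digest = hmac.new(secret_key.encode(), string_to_sign.encode(),
                      hashlib.sha1).digest()
    return base64.b64encode(digest).decode()

date = 'Tue, 27 Mar 2007 19:36:42 +0000'        # made-up date
resource = '/raymondyeetest/exportitems.rdf'

sig_plain = sign_s3_request('MADE-UP-SECRET', 'PUT', '',
                            'application/xml', date, resource)
sig_utf8 = sign_s3_request('MADE-UP-SECRET', 'PUT', '',
                           'application/xml; charset=UTF-8', date, resource)

print(sig_plain != sig_utf8)   # the two signatures differ
```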