Chapter 2 analyzes what makes Flickr a mashup platform par excellence, so that you can learn how to remix a specific application and exploit the features that make it so remixable. The chapter compares and contrasts Flickr with other remixable platforms: del.icio.us, Google Maps, and amazon.com. On my plate is writing the second draft of Chapter 2. Besides correcting small-scale errors, refining the prose, and giving the chapter a jazzier and more accurate title, my focus is on providing more detail about mashups that could actually be created from the features I write about. I write that "a goal of this chapter is to train you [readers] to deconstruct applications for their remix and mashup potential." While I do spell out in substantial detail the ways URLs are constructed and organized for Flickr, amazon.com, Google Maps, and del.icio.us, I still need to describe how to generalize these ideas to other circumstances and suggest possible mashups that can be built.
Here are some other issues to work out:
URLs as little languages and the connection to REST
I spend a lot of effort in Chapter 2 on the notion of "URLs as little languages to understand and to speak." I think that it's easy for experienced programmers to take these ideas about URLs (e.g., Hacking the URL) for granted. But I want to show the importance of being able to link to specific resources. For instance, LibraryLookup depends on being able to point to a book by constructing a URL based on an ISBN. If you can't easily link to a resource, you are going to be hard-pressed to reuse it, especially if there is no formal API. (Note: Some library catalogues use odd session-dependent cookies that make it difficult to forge such a URL to the book. You can sometimes manage to create a URL that will work, at least temporarily, through a multi-step screen-scraping process, in contrast to just dropping an ISBN into a URL.)
Having a simple URL to represent a specific resource makes one of the simplest mashup design patterns possible: you substitute some parameters and get the corresponding web page. For websites that don't have formal APIs, such URLs are the closest one comes to a programming interface. (Sometimes, even if there is an API, it is simpler to use the human user interface URL and do a bit of screen-scraping. And sometimes the API does not cover the functionality that you care about, in which case having access to the URL is the only way to go.)
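The substitution pattern described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the catalog host and the `isbn` query parameter are hypothetical, and a real library catalog will have its own URL language that you would need to work out first.

```python
from urllib.parse import quote

# Hypothetical URL template for a library catalog; real catalogs differ.
CATALOG_TEMPLATE = "http://catalog.example.org/search?isbn={isbn}"


def normalize_isbn(raw: str) -> str:
    """Drop hyphens and spaces so the ISBN can be placed in a URL."""
    return "".join(ch for ch in raw if ch.isalnum()).upper()


def book_url(isbn: str, template: str = CATALOG_TEMPLATE) -> str:
    """Substitute an ISBN into a URL template to point at one book."""
    return template.format(isbn=quote(normalize_isbn(isbn)))


print(book_url("0-596-52926-0"))
# http://catalog.example.org/search?isbn=0596529260
```

Swapping in a different template is all it should take to "speak" another catalog's URL language, provided that catalog accepts an ISBN directly in the URL.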
I have a sense that there are deep connections between RESTful architecture and the importance of little URL languages, but I can't put my finger on the specific connections. I just ordered a copy of Leonard Richardson and Sam Ruby's RESTful Web Services to help me better understand REST. Here are some impressions that I have about REST and believe to be correct:
- A fundamental idea behind REST is using URLs to represent resources.
- If the website that you are trying to mash up is truly RESTful, then figuring out the structure of its URLs is akin to figuring out how resources are named in the application, that is, what the "nouns" are.
- There would be pretty strong continuity between the structure of the human-facing website and any API in a RESTful site.
- Coherent, clean URL languages correlate with good REST design.
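One way to start reading a site's URL language is to break its URLs into path segments and query parameters and look for the "nouns" among them. The following is a minimal sketch using Python's standard library. The helper name `describe_url` is my own, and the second URL is a made-up example rather than a real site.

```python
from urllib.parse import urlparse, parse_qs


def describe_url(url: str) -> dict:
    """Split a URL into the pieces that usually carry its 'nouns'."""
    parts = urlparse(url)
    return {
        "host": parts.netloc,
        # Path segments often name resources, e.g. photos / tags / flower.
        "path_segments": [seg for seg in parts.path.split("/") if seg],
        # Query parameters often select or filter a resource.
        "parameters": parse_qs(parts.query),
    }


print(describe_url("http://www.flickr.com/photos/tags/flower/"))
print(describe_url("http://www.example.com/search?q=flower&page=2"))
```

Running this over a handful of URLs collected while browsing a site is a quick way to see which segments stay fixed and which vary with the resource being shown.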
Identifiers as glue
I want to strengthen my description of how to use identifiers, tags, and search terms to correlate similar or identical things within and across websites and applications. Think about the use of an ISBN in LibraryLookup, and of latitude and longitude in Google Maps and Flickr: those identifiers, and other broadly used ways of describing things, connect websites together.
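As a rough sketch of latitude and longitude acting as glue, the code below pulls a location out of a photo's tags and turns it into a map URL. It assumes the tags follow a convention of the form `geo:lat=...` and `geo:lon=...`, and that the map site accepts a `q=lat,lon` query parameter. Both are assumptions to check against the actual sites, and the function names are mine.

```python
from typing import List, Optional, Tuple
from urllib.parse import urlencode


def extract_location(tags: List[str]) -> Optional[Tuple[float, float]]:
    """Return (lat, lon) if the tags carry both coordinates, else None."""
    values = {}
    for tag in tags:
        key, sep, value = tag.partition("=")
        if sep and key in ("geo:lat", "geo:lon"):
            try:
                values[key] = float(value)
            except ValueError:
                continue  # ignore tags whose value is not a number
    if "geo:lat" in values and "geo:lon" in values:
        return values["geo:lat"], values["geo:lon"]
    return None


def map_url(lat: float, lon: float,
            base: str = "http://maps.google.com/maps") -> str:
    """Build a map URL for a coordinate pair (the q= form is assumed)."""
    return base + "?" + urlencode({"q": f"{lat},{lon}"})


location = extract_location(["geotagged", "geo:lat=37.8", "geo:lon=-122.27"])
if location is not None:
    print(map_url(*location))
```

The same pair of numbers is meaningful to both sites, which is what lets one site's data be placed on the other's map.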
How the mashups we studied in Chapter 1 make use of the techniques of Chapter 2
To make the three mashups we studied in Chapter 1, their creators had to understand the functioning of the constituent applications they were recombining. For instance:
- For LibraryLookup, Jon Udell needed to understand the use of ISBNs as identifiers among library catalogs and other book-oriented websites (such as amazon.com and other bookstores). You can then use the ISBN (and speak the URL languages of various library catalogs) to glue these websites together via JavaScript. (There are some challenges: it was difficult for Udell to craft a totally user-friendly system for easily creating the LibraryLookup bookmarklet just for your library.)
- For GMiF, a Greasemonkey script (which is very much about remixing the existing user interface of an application), CK Yuan needed to understand the user interface of Flickr in order to insert the GMap icon among the other icons, and also how others had exploited user tagging to hold location data (a hack that Flickr ultimately productized as machine tags). Moreover, on a prosaic level, you have to understand how to form URLs for each of the pictures.
- housingmaps.com depends on craigslist, which has no formal API. Hence, Paul Rademacher had to parse the HTML and understand the URL structure of craigslist: what cities are covered, and how to make use of the RSS and supplement that data with screen-scraping.
What you get by studying the application and not just the API
My point is that developers need to understand apps as end-users too, and not just jump into the API. Learn the application first; if you are an experienced developer and user of these types of applications, it won't take that long. It's worth the investment of time. Why not just jump into the API?
- You're more likely to make a more useful mashup by availing yourself of knowledge as an end-user
- You can plug the mashup into the context of how users are already using the application
- You understand what is currently missing from the application and can be improved
- You see hooks into the application that are not necessarily obvious from the API alone
- You can more easily make sense of the API when you know what the key data entities are and some of the functionality around them. For each one you can ask how it might be reflected in the API.
Looking for signs of mashability; ties to further chapters
Chapter 2 is also a prelude to the chapters that immediately follow it, which cover the elements of a website that make it more remixable. Indeed, those topics are the basis of a checklist of questions to pose in assessing the mashability/remixability/recombinatorial potential of an application:
- Are tags used to describe resources on the website? (Described in greater detail in Chapter 3)
- Are RSS and other syndication feeds available? (We will deal with this issue in greater depth in Chapter 4)
- Do you see functionality for integrating with weblogs? (Chapter 5)
- Is there an API for the application? (Chapters 6, 7, and 8)
In addition, you would look for the existence of browser toolbars, desktop clients, and mobile interfaces that interact with the website. They not only show that the website is remixable but often show how you can remix it. (I will have to give specific examples here in the chapter, but I have some already installed in my own browser: the del.icio.us Firefox extension and Amazon S3 Firefox Organizer (S3Fox).)
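The checklist question about syndication feeds is one that can be partly automated. The sketch below scans a page's HTML for `<link rel="alternate">` elements whose type is RSS or Atom, which is the usual feed autodiscovery convention. The class and function names are my own, the sample HTML is invented, and a page can still offer feeds without advertising them this way.

```python
from html.parser import HTMLParser

FEED_TYPES = {"application/rss+xml", "application/atom+xml"}


class FeedLinkFinder(HTMLParser):
    """Collect href values of <link rel="alternate"> feed elements."""

    def __init__(self):
        super().__init__()
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        rel = (attr.get("rel") or "").lower().split()
        mime = (attr.get("type") or "").lower()
        if "alternate" in rel and mime in FEED_TYPES and attr.get("href"):
            self.feeds.append(attr["href"])


def find_feeds(html: str) -> list:
    """Return the feed URLs advertised in an HTML document."""
    finder = FeedLinkFinder()
    finder.feed(html)
    return finder.feeds


SAMPLE = (
    '<html><head>'
    '<link rel="alternate" type="application/rss+xml" href="/feed.rss">'
    '<link rel="stylesheet" href="/style.css">'
    '</head><body></body></html>'
)
print(find_feeds(SAMPLE))  # ['/feed.rss']
```

An empty result is only a hint, not proof, that a site has no feeds; browsing the site as an end-user remains the more reliable check.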
Data formats, nouns, and verbs
"What is the underlying data format?" and the related question "What are the core entities or resources in the website?" are useful questions to pose when studying an application. To use a grammatical analogy, what are the "nouns"? When we then look at the functionality that exists around those entities, we are asking what the "verbs" are. If there is an API, it will make a lot more sense once you have a feel for what those entities and their functionality are.