
Some useful references for Chapter 17 on office suites

Fodder for Chapter 17 on office suites when I get to writing it in detail:

    "We definitely want to build out APIs, especially for the spreadsheets side, as spreadsheets are more data-oriented, but maybe also for he word processor," Google product manager Jonathan Rochelle said. "People will be able to do mashups with our tools for other things, and not be stuck behind our dev cycle for everything they want."

Posting early drafts of Introduction and Chapter 1

Today, I start posting early drafts of the Introduction and Chapter 1 of my book. I'm keen on hearing feedback from those of you who would like to read early drafts of my book. The latest versions of chapters will be linked from the table of contents.

Although my book will be fully released under a Creative Commons license once my publisher Apress and I finish off the first edition ("v 1.0"), I won't apply the license to the pre-v1.0 drafts. I don't want too much reuse of the materials until we've hit the level of quality of the first edition.

I will admit to being a bit nervous about posting early drafts publicly since the drafts are still rough around the edges (if not more deeply flawed!) but I will put some faith in the open source philosophy embodied in the mantra of Release Early, Release Often. One of my goals in writing this book is to create a community resource. The more input I can get from the community, the better. Moreover, I'd like to get as many improvements and fix as many mistakes ("bugs") as I can before the book gets committed to paper. As with software, it's easier and less expensive to make changes earlier in the process.

A note about the publishing process I'm using here:

  • I'm using the standard Apress process of writing books in Microsoft Word. (I'm sure with the heavy representation of open source topics that there are many Apress authors who use another tool but I've decided to go with Word. I wrote my dissertation with LaTeX and I figure that it would be instructive to learn the ins and outs of Microsoft Office in the process of writing a book. I'm crossing my fingers that I won't have major Office corruption problems.)
  • I will start with the easiest way for me to post my manuscript, that is, as a series of PDFs. I will move towards posting the manuscript in more flexible formats, such as transforming the book into a wiki that we can keep up to date.
  • Although I don't have a good structure for aggregating discussion around chapters yet, I decided to experiment a bit with Google Docs and Zoho Writer. To that end, in addition to the PDFs, I'm uploading my draft chapters to the two systems.

I admire the commenting system available at the Django Book site and hope to have a system in place that allows for inline annotation of my book. For now, I offer three places to leave comments:

  • For the Google Docs version, I can add collaborators who would like to edit or comment directly on the manuscript. (Send me email.)
  • For Zoho, people can tack on comments to the document.
  • We can use the commenting system in WordPress.

I look forward to using the Zoho APIs and the Google Docs API (which I'm sure will eventually be there.)

You will note the figures are currently missing from Chapter 1. Though I have a strong argument for posting these pictures under fair use, I'm in the process of securing all the copyright clearances just to ease everyone's minds about including screenshots in the book.

Fixing smart quote problem in WordPress

By default, WordPress displays apostrophes and double quotation marks as smart quotes. This causes major problems for displaying programming code. A search for a way to turn off the smart quoting led to a number of possible solutions:

I settled on the solution in the first article, which is to create a small plugin that removes the wptexturize filter from a number of places.
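
The plugin stops WordPress from curling quotes when it renders a post. Code that has already been copied out of a rendered page still carries the curly quotes, so a small clean-up step can help. What follows is a minimal Python sketch of that clean-up, not the plugin itself; the function name straighten_quotes is made up for illustration, and it handles only the four quote characters, not the other substitutions wptexturize makes.

```python
# Map typographic ("smart") quotes back to the plain ASCII quotes that code needs.
SMART_TO_PLAIN = str.maketrans({
    "\u2018": "'",   # left single quotation mark
    "\u2019": "'",   # right single quotation mark / apostrophe
    "\u201c": '"',   # left double quotation mark
    "\u201d": '"',   # right double quotation mark
})


def straighten_quotes(text):
    """Return text with curly quotes replaced by straight ones (hypothetical helper)."""
    return text.translate(SMART_TO_PLAIN)


# A line of code as it might look after smart quoting.
mangled = "print(\u201chello\u201d)"
print(straighten_quotes(mangled))
```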

Mashup vs Remix

By the end of this week, I plan to post a draft of Chapter 1 ("Learning from a Study of Specific Mashups") for public comment. I'm a bit scared to do so since I see so many flaws in what I've written so far — and am wary of having even more pointed out by others! I also know, however, that the more intelligent and constructive feedback I get on my manuscript before the words get committed to print, the better it will be for me, the book, and ultimately my readers.

I currently have a sidebar that discusses nomenclature: specifically, what is the relationship between the words mashup and remix in the context of my book, which is about "web mashups", the reuse and recombination of digital content? I'm tempted to use the terms loosely and interchangeably rather than draw a tight distinction between them. But I'm not ready to go in that direction without taking another close look at the issue.

At this point, I am making some tentative conclusions, based largely on what I've read on Wikipedia.

  • mashup and remix are terms that have their origin in popular music.
  • roughly speaking, a remix is an alternate version of a song while a mashup brings together elements of two or more songs.
  • there seems to be a tussle on the article on web mashups (technically, "Mashup (web application hybrid)") as to whether an application which is a mashup has to be a "web application" or can be any type of "application".

At this point, I will say that if I wanted to make the parallels from popular music hold up for digital applications, I would use remix to talk about scenarios that are about reusing or repackaging data without combining it with other content (e.g., using the Flickr API to make a web page that has only Flickr images) while reserving mashup to refer to combinations of data from a variety of sources (e.g., combining Flickr photos with photos from Yahoo! Photos). But the lines are fuzzy and, imho, not worth the effort to draw too carefully.

Chapter 8 on the programmable web browser, Javascript, and AJAX

Starting today, I will be writing much more often on this weblog to narrate the progress of my mashup book. The writing has been going well, but needless to say, there's so much more to do. This week, I am working on two fronts: cleaning up Chapter 1, an overview of mashups, and drafting Chapter 8 on the programmable Web browser, Javascript, and AJAX. Although I am writing many, many words in a word processor (many of which I hope will make it into the final draft of my book), I long to write shorter pieces, which will facilitate the development of the book. That's why I'm taking time out of the book to weblog a bit.

Let me tell you a bit about Chapter 8, whose working title is "Learning Ajax/JavaScript widgets and their APIs." In thinking about the chapter today, I realize that the big idea I want to get at is that the modern web browser is programmable and hence, is a rich platform for mashing up data and services. As a connoisseur of mashups, I would want to figure out all the different ways in which I could extend, change, subvert, and customize the web browser, which is the dominant client-side platform for exchanging information on the Internet. The possibilities are astounding for customization both in how a web server host communicates with others and how you as an end-user could process communications coming at you.

A specific example, and certainly not a surprising one, to cover in Chapter 8 is Google Maps, which I call (without great precision of wording) an Ajax widget: Ajax, because it involves the constant and fluid interchange of data between the browser and the server executed through JavaScript, and a widget, because one can use Google Maps without knowing all the inner workings of Ajax. That is, you can use it at a high level of abstraction. (I use Google Maps here as a specific instance of Ajax widgets, but I cover it again in greater detail in other chapters to emphasize its mapping (functional) aspect rather than its technical implementation.)

Ajax is a rich subject, as can be seen in the myriad books that have been published recently on the subject. I would like to put Ajax in the larger context of the programmable Web browser. Here I will admit to struggling with how to piece together a chapter that I believe should at least mention, if not plumb the depths of, the following:

  • both how an "ideal" W3C DOM-standards compliant browser works and how various browsers actually work in various areas: how Javascript is implemented, how the object models behave, CSS, events, etc.
  • Javascript-based APIs and widgets such as Google Maps: what they are and how to use any or all of them.
  • non-browser environments for Javascript, such as Google Gadgets, Yahoo Widgets, Adobe Acrobat
  • extension mechanisms in browsers (Firefox addons, Safari, IE, Opera)
  • Javascript and browser debugging tools like Firebug
  • Javascript libraries: how they relate and what can be intermixed — and which ones are tied to which web programming frameworks.
  • what people have done already on all these fronts using Javascript and remixing the browser
  • how to write Javascript and Javascript widgets that can be reused by other people, including cross-platform Javascript
  • ideas of what you can do in terms of mashups

I obviously would not be able to cover all these topics, nor should I even try! What I plan to actually cover as a way into this big list of topics is the following:

  • the latest versions of Firefox, instead of looking in depth at all browsers, old and new
  • the Yahoo UI Library, as a specific example of a packaged Javascript library
  • a walk-through of how to use Firefox + Firebug / Javascript Shell + YUI Connection Manager
  • how to build a Google Maps example, as a way to get into Ajax widgets in general
  • how to build a basic AJAX call to Flickr
  • how to write a simple Greasemonkey script (to lay the foundation for understanding Google Maps in Flickr, a major example in the book)
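
To make the "basic AJAX call to Flickr" item concrete: whatever the client, the call comes down to requesting a URL on the Flickr REST endpoint with a method name, an API key, and a few parameters. The chapter itself would do this from JavaScript in the browser; the sketch below uses Python only to match the other code on this weblog, and it only builds the URL rather than fetching it. The function name flickr_search_url and the key "YOUR_API_KEY" are placeholders, and a real Flickr API key is assumed.

```python
from urllib.parse import urlencode

# Flickr's REST endpoint; every API method is selected via the "method" parameter.
FLICKR_REST = "https://api.flickr.com/services/rest/"


def flickr_search_url(api_key, tags, per_page=5):
    """Build the URL for a flickr.photos.search request (hypothetical helper)."""
    params = {
        "method": "flickr.photos.search",
        "api_key": api_key,      # placeholder; a real key is required
        "tags": tags,            # comma-separated list of tags
        "per_page": per_page,
        "format": "json",        # ask for JSON instead of the default XML
        "nojsoncallback": 1,     # plain JSON rather than a JSONP wrapper
    }
    return FLICKR_REST + "?" + urlencode(params)


print(flickr_search_url("YOUR_API_KEY", "berkeley,blossoms"))
```

Dropping the nojsoncallback parameter makes Flickr wrap the JSON in a function call, which is the form a script tag in the browser can consume.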

Spring Break, the Book, and Amazon.com

Because next week is spring break at UC Berkeley, I have a bit more breathing room to work on my book. While I need to turn in the first draft of Chapter 8 (on AJAX and Javascript) next week, my current priority is to recalibrate the schedule for the book as a whole. I'm pleased to see that my book already shows up on the Apress site as well as on Amazon.com. It's pleasing and a bit scary at the same time as I deal with the myriad details of finishing up the writing!

Notelets for 2007.03.15

For a long time, the Firebug extension would not display any CSS info. I fixed that problem last week:

I'm glad that Flickr has introduced Collections. (See my collections, for instance.) Pieces of the API are coming (Flickr Services: Flickr API: flickr.collections.getInfo), but it's not all there yet (yws-flickr: Collections). I was hoping, however, that the collections would also be able to contain other people's pictures.

Web 2.0 & Mashups: How People can Tap into the "Grid" for Fun & Profit » SlideShare is a good short presentation on mashups.

OpenOffice.org Training, Tips, and Ideas is a blog with lots of tips on how to use OpenOffice.org more effectively.

I'd love to take in the code4lib video for the 2007 conference. Sorry I couldn't make it, since I had a blast at the 2006 meeting.

Blossoms and the Berkeley winter (spring)?

Some nice blossoms in Berkeley….

(I'm demonstrating the blogging of multiple Flickr photos with Flock with this post)

Notelets for 2007.02.19

I may have to get in the business of parsing Excel spreadsheets, making use of information I find at OpenOffice.org's Documentation of the Microsoft Excel File Format (Excel Versions 2, 3, 4, 5, 95, 97, 2000, XP, 2003), sc: Spreadsheet Project, and Microsoft Excel – Wikipedia, the free encyclopedia.

QEDWiki is ready to try. Will it make mashups easier, even trivial, to create?

I just posted a query on the Flickr discussion group: WSDL from Flickr: unorthodox SOAP invocation?:

    I've been interested in generating WSDL from the Flickr reflection methods to generate a library that could better keep up with the changes in the Flickr APIs. However, I've run into a problem that stems from what I believe to be either unorthodox SOAP syntax in Flickr — or just the limitations in my own understanding of WSDL and SOAP.

Let's see what I hear back.
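
Whatever the answer on SOAP, the first step in generating a library from reflection is simply to enumerate the methods. The following is a minimal sketch of that step, assuming the REST/XML response of flickr.reflection.getMethods has the shape shown in SAMPLE. The sample is hand-written for illustration, not captured output, and the helper names are made up.

```python
import xml.etree.ElementTree as ET

# Hand-written stand-in for a flickr.reflection.getMethods response (assumed shape).
SAMPLE = """<rsp stat="ok">
  <methods>
    <method>flickr.photos.search</method>
    <method>flickr.reflection.getMethods</method>
    <method>flickr.reflection.getMethodInfo</method>
  </methods>
</rsp>"""


def list_methods(xml_text):
    """Return the method names found in a reflection response."""
    root = ET.fromstring(xml_text)
    if root.get("stat") != "ok":
        raise ValueError("Flickr reported an error")
    return [m.text for m in root.iter("method")]


def group_by_namespace(names):
    """Group names such as flickr.photos.search under their prefix (flickr.photos)."""
    groups = {}
    for name in names:
        prefix = name.rsplit(".", 1)[0]
        groups.setdefault(prefix, []).append(name)
    return groups


for prefix, names in group_by_namespace(list_methods(SAMPLE)).items():
    print(prefix, len(names))
```

Each prefix would map naturally onto a class or module in a generated library, with flickr.reflection.getMethodInfo supplying the arguments for each method.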

Extracting text from a Word document

Although I'm writing my mashup book in Microsoft Word, I'd like to publish it in a variety of forms, including HTML, several flavors of XML, PDF, and wiki markup. There are various ways to extract content out of my Word documents, including Word macros, external scripts using the COM interface, or saving the Word 2003 documents as Word XML. I'm partial to using Python to do some simple extraction of text as a first step:

   
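import win32com.client  # from the pywin32 package; needs Word installed locally

wd = win32com.client.Dispatch("Word.Application")  # start (or attach to) Word via COM
# In a raw string, single backslashes are already literal
doc = wd.Documents.Open(r'D:\Document\PersonalInfoRemixBook\858Xch05__.doc')
print(doc.Content.Text)  # plain text of the whole document
doc.Close(False)  # close without saving changes
wd.Quit()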

I've not been able to find complete reference documentation for the Word 2003 object model. Word 2003 Object Model was a blank page for me under "Objects". Best, probably, to look at documentation for Office XP.
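
The Word XML route mentioned above avoids COM altogether: once a chapter is saved as Word 2003 XML, any XML parser can pull out the text. Here is a minimal sketch, assuming the Word 2003 "wordml" namespace and the usual paragraph (w:p), run (w:r), and text (w:t) elements. The SAMPLE document is a hand-written stand-in for a real saved file, and the paragraphs helper is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace used by Word 2003 XML (WordprocessingML 2003).
W = "http://schemas.microsoft.com/office/word/2003/wordml"

# Hand-written stand-in for a chapter saved as Word 2003 XML.
SAMPLE = f"""<w:wordDocument xmlns:w="{W}">
  <w:body>
    <w:p><w:r><w:t>Chapter 5</w:t></w:r></w:p>
    <w:p><w:r><w:t>Hello, </w:t></w:r><w:r><w:t>world.</w:t></w:r></w:p>
  </w:body>
</w:wordDocument>"""


def paragraphs(xml_text):
    """Return the text of each w:p element, joining the text of its runs."""
    root = ET.fromstring(xml_text)
    result = []
    for p in root.iter(f"{{{W}}}p"):
        runs = [t.text or "" for t in p.iter(f"{{{W}}}t")]
        result.append("".join(runs))
    return result


for line in paragraphs(SAMPLE):
    print(line)
```

Unlike the COM approach, this runs on any platform and keeps paragraph boundaries explicit, which should make a later conversion to HTML or wiki markup easier.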