Latest Trends in Website Optimization for Web 2.0

Optimization trends are changing along with the latest technologies and Web 2.0 standards.

Compress Site Graphics: Use PNG (Not GIF) for Transparency

For transparent images, use the PNG format; for non-transparent (photographic) images, use JPG.

PNGs were designed to be a superior replacement for the Graphic Interchange Format (GIF). GIFs are limited to 256 colors (8-bit color palette), one level of transparency, and the Lempel-Ziv-Welch (LZW) compression algorithm that was patented by UNISYS. In most cases, PNG files from the same source images are smaller than corresponding GIFs. PNGs use the “deflate” compression algorithm, which is 10 to 30% more efficient than LZW compression.

By design PNGs have some advantages over GIF images. PNGs offer more choices in color depths than GIFs, including 8-bit (256 colors), 24-bit (8 bits per channel), and 48-bit (16 bits per channel) truecolor allowing for greater color precision and smoother transitions. When you add an alpha channel, PNGs allow for up to 64 bits per channel. PNGs can have index color transparency (one color) or alpha transparency (multiple levels) useful for smooth shadow transitions over other images. In summary, the advantages of PNGs over GIFs are:

  • Alpha channels (multilevel transparency)
  • Variable bit depths
  • Cross-platform gamma and color correction
  • Two-dimensional interlacing
  • More efficient lossless compression (LZ77 vs. LZ78+)

Using Image Sprites

If you use a lot of background images in CSS, it’s good practice to put all of those images onto one big canvas. You can then set background-position in CSS to pull out the image you want from the big image. The advantage is that instead of making numerous HTTP requests for a page, the browser only needs to make one request for the big image, which speeds up load time. Some people create a separate sprite for each group of images, for example one sprite for navigation images, one for logo images, one for footer images, and so on, but there is no reason why you can’t combine all of your images, whether navigation, icons, or footer graphics, into a single sprite.
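
Here is a minimal sketch of how a sprite is used; the file name icons-sprite.png, the 16x16 icon size, and the offsets are hypothetical:

.icon {
    background-image: url('icons-sprite.png'); /* one combined image holding all icons */
    background-repeat: no-repeat;
    width: 16px;
    height: 16px;
}
.icon-home   { background-position: 0 0; }      /* first 16x16 icon in the sprite */
.icon-search { background-position: -16px 0; }  /* second icon, 16px further along in the sprite */
.icon-rss    { background-position: -32px 0; }  /* third icon */

An element such as <span class="icon icon-search"></span> then displays only the search icon, yet every icon on the page comes from a single download.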

Learn how to create an image sprite here.

Minify and Pack Your JS and CSS files

Instead of loading multiple JS and CSS files on a website, serve a single combined JS file and a single combined CSS file. There are many tools available to achieve this.

Here is one such tool:

http://code.google.com/p/minify/

You can use this tool to combine and minify your JavaScript and CSS files before serving them.
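
As a rough illustration (the file names are hypothetical), the change in the page itself looks like this:

<!-- Before: three separate HTTP requests for scripts -->
<script type="text/javascript" src="/js/jquery.js"></script>
<script type="text/javascript" src="/js/menu.js"></script>
<script type="text/javascript" src="/js/tracking.js"></script>

<!-- After: one combined, minified file -->
<script type="text/javascript" src="/js/site.min.js"></script>

The same applies to style sheets: serve one combined .css file instead of several <link> tags.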

Caching Files on the Server

Here is a simple caching method you can use in an .htaccess file (it requires Apache’s mod_headers module). It sets the Cache-Control header so the browser keeps certain components in its cache and retrieves them from there rather than making a new HTTP request every time.

#604800  = 1 week in seconds
<FilesMatch "\.(gif|jpg|jpeg|png)$">
Header set Cache-Control "max-age=604800"
</FilesMatch>

#86400 = 1 day in seconds
<FilesMatch "\.(js|css)$">
Header set Cache-Control "max-age=86400"
</FilesMatch>
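
If your host also has mod_expires enabled, a similar result can be achieved with Expires headers. This is only a sketch and assumes that module is available; note that the JavaScript MIME type can vary (text/javascript vs. application/x-javascript) depending on the server configuration:

# Requires mod_expires
ExpiresActive On
ExpiresByType image/gif  "access plus 1 week"
ExpiresByType image/jpeg "access plus 1 week"
ExpiresByType image/png  "access plus 1 week"
ExpiresByType text/css   "access plus 1 day"
ExpiresByType application/x-javascript "access plus 1 day"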

Multiple Domains

If you are targeting a high-volume site, it is worth serving all images, JS, and CSS from separate subdomains.

This helps the browser download the files in parallel.

For example, if your domain name is domain.com:

  • use js.domain.com for all your JavaScript files
  • use img.domain.com for all your images
  • use css.domain.com for all your style sheets
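
In the HTML, that simply means pointing asset URLs at those subdomains (a sketch; the file names are hypothetical):

<!-- Style sheet from the css subdomain -->
<link rel="stylesheet" type="text/css" href="http://css.domain.com/site.css" />

<!-- Image from the img subdomain -->
<img src="http://img.domain.com/logo.png" alt="Logo" />

<!-- Script from the js subdomain -->
<script type="text/javascript" src="http://js.domain.com/site.min.js"></script>

Browsers limit the number of parallel connections per hostname, so spreading assets across a few hostnames lets more downloads run at once; keep the number of extra hostnames small, because each one adds a DNS lookup.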

Benchmark and Test

Always benchmark, test, and optimize further whenever possible. I use Firebug’s network monitoring panel and YSlow for this purpose.

What is Web 2.0?

What Is Web 2.0?

Design Patterns and Business Models for the Next Generation of Software

by Tim O’Reilly
09/30/2005

The bursting of the dot-com bubble in the fall of 2001 marked a turning point for the web. Many people concluded that the web was overhyped, when in fact bubbles and consequent shakeouts appear to be a common feature of all technological revolutions. Shakeouts typically mark the point at which an ascendant technology is ready to take its place at center stage. The pretenders are given the bum’s rush, the real success stories show their strength, and there begins to be an understanding of what separates one from the other.

The concept of “Web 2.0” began with a conference brainstorming session between O’Reilly and MediaLive International. Dale Dougherty, web pioneer and O’Reilly VP, noted that far from having “crashed”, the web was more important than ever, with exciting new applications and sites popping up with surprising regularity. What’s more, the companies that had survived the collapse seemed to have some things in common. Could it be that the dot-com collapse marked some kind of turning point for the web, such that a call to action such as “Web 2.0” might make sense? We agreed that it did, and so the Web 2.0 Conference was born.

In the year and a half since, the term “Web 2.0” has clearly taken hold, with more than 9.5 million citations in Google. But there’s still a huge amount of disagreement about just what Web 2.0 means, with some people decrying it as a meaningless marketing buzzword, and others accepting it as the new conventional wisdom.

This article is an attempt to clarify just what we mean by Web 2.0.

In our initial brainstorming, we formulated our sense of Web 2.0 by example:

Web 1.0 –> Web 2.0

DoubleClick –> Google AdSense
Ofoto –> Flickr
Akamai –> BitTorrent
mp3.com –> Napster
Britannica Online –> Wikipedia
personal websites –> blogging
evite –> upcoming.org and EVDB
domain name speculation –> search engine optimization
page views –> cost per click
screen scraping –> web services
publishing –> participation
content management systems –> wikis
directories (taxonomy) –> tagging (“folksonomy”)
stickiness –> syndication

The list went on and on. But what was it that made us identify one application or approach as “Web 1.0” and another as “Web 2.0”? (The question is particularly urgent because the Web 2.0 meme has become so widespread that companies are now pasting it on as a marketing buzzword, with no real understanding of just what it means. The question is particularly difficult because many of those buzzword-addicted startups are definitely not Web 2.0, while some of the applications we identified as Web 2.0, like Napster and BitTorrent, are not even properly web applications!) We began trying to tease out the principles that are demonstrated in one way or another by the success stories of web 1.0 and by the most interesting of the new applications.

1. The Web As Platform

Like many important concepts, Web 2.0 doesn’t have a hard boundary, but rather, a gravitational core. You can visualize Web 2.0 as a set of principles and practices that tie together a veritable solar system of sites that demonstrate some or all of those principles, at a varying distance from that core.

Figure 1 shows a “meme map” of Web 2.0 that was developed at a brainstorming session during FOO Camp, a conference at O’Reilly Media. It’s very much a work in progress, but shows the many ideas that radiate out from the Web 2.0 core.

For example, at the first Web 2.0 conference, in October 2004, John Battelle and I listed a preliminary set of principles in our opening talk. The first of those principles was “The web as platform.” Yet that was also a rallying cry of Web 1.0 darling Netscape, which went down in flames after a heated battle with Microsoft. What’s more, two of our initial Web 1.0 exemplars, DoubleClick and Akamai, were both pioneers in treating the web as a platform. People don’t often think of it as “web services”, but in fact, ad serving was the first widely deployed web service, and the first widely deployed “mashup” (to use another term that has gained currency of late). Every banner ad is served as a seamless cooperation between two websites, delivering an integrated page to a reader on yet another computer. Akamai also treats the network as the platform, and at a deeper level of the stack, building a transparent caching and content delivery network that eases bandwidth congestion.

Nonetheless, these pioneers provided useful contrasts because later entrants have taken their solution to the same problem even further, understanding something deeper about the nature of the new platform. Both DoubleClick and Akamai were Web 2.0 pioneers, yet we can also see how it’s possible to realize more of the possibilities by embracing additional Web 2.0 design patterns.

Let’s drill down for a moment into each of these three cases, teasing out some of the essential elements of difference.

Netscape vs. Google

If Netscape was the standard bearer for Web 1.0, Google is most certainly the standard bearer for Web 2.0, if only because their respective IPOs were defining events for each era. So let’s start with a comparison of these two companies and their positioning.

Netscape framed “the web as platform” in terms of the old software paradigm: their flagship product was the web browser, a desktop application, and their strategy was to use their dominance in the browser market to establish a market for high-priced server products. Control over standards for displaying content and applications in the browser would, in theory, give Netscape the kind of market power enjoyed by Microsoft in the PC market. Much like the “horseless carriage” framed the automobile as an extension of the familiar, Netscape promoted a “webtop” to replace the desktop, and planned to populate that webtop with information updates and applets pushed to the webtop by information providers who would purchase Netscape servers.

In the end, both web browsers and web servers turned out to be commodities, and value moved “up the stack” to services delivered over the web platform.

Google, by contrast, began its life as a native web application, never sold or packaged, but delivered as a service, with customers paying, directly or indirectly, for the use of that service. None of the trappings of the old software industry are present. No scheduled software releases, just continuous improvement. No licensing or sale, just usage. No porting to different platforms so that customers can run the software on their own equipment, just a massively scalable collection of commodity PCs running open source operating systems plus homegrown applications and utilities that no one outside the company ever gets to see.

At bottom, Google requires a competency that Netscape never needed: database management. Google isn’t just a collection of software tools, it’s a specialized database. Without the data, the tools are useless; without the software, the data is unmanageable. Software licensing and control over APIs–the lever of power in the previous era–is irrelevant because the software never need be distributed but only performed, and also because without the ability to collect and manage the data, the software is of little use. In fact, the value of the software is proportional to the scale and dynamism of the data it helps to manage.

Google’s service is not a server–though it is delivered by a massive collection of internet servers–nor a browser–though it is experienced by the user within the browser. Nor does its flagship search service even host the content that it enables users to find. Much like a phone call, which happens not just on the phones at either end of the call, but on the network in between, Google happens in the space between browser and search engine and destination content server, as an enabler or middleman between the user and his or her online experience.

While both Netscape and Google could be described as software companies, it’s clear that Netscape belonged to the same software world as Lotus, Microsoft, Oracle, SAP, and other companies that got their start in the 1980’s software revolution, while Google’s fellows are other internet applications like eBay, Amazon, Napster, and yes, DoubleClick and Akamai.

DoubleClick vs. Overture and AdSense

Like Google, DoubleClick is a true child of the internet era. It harnesses software as a service, has a core competency in data management, and, as noted above, was a pioneer in web services long before web services even had a name. However, DoubleClick was ultimately limited by its business model. It bought into the ’90s notion that the web was about publishing, not participation; that advertisers, not consumers, ought to call the shots; that size mattered, and that the internet was increasingly being dominated by the top websites as measured by MediaMetrix and other web ad scoring companies.

As a result, DoubleClick proudly cites on its website “over 2000 successful implementations” of its software. Yahoo! Search Marketing (formerly Overture) and Google AdSense, by contrast, already serve hundreds of thousands of advertisers apiece.

Overture and Google’s success came from an understanding of what Chris Anderson refers to as “the long tail,” the collective power of the small sites that make up the bulk of the web’s content. DoubleClick’s offerings require a formal sales contract, limiting their market to the few thousand largest websites. Overture and Google figured out how to enable ad placement on virtually any web page. What’s more, they eschewed publisher/ad-agency friendly advertising formats such as banner ads and popups in favor of minimally intrusive, context-sensitive, consumer-friendly text advertising.

The Web 2.0 lesson: leverage customer self-service and algorithmic data management to reach out to the entire web, to the edges and not just the center, to the long tail and not just the head.

A Platform Beats an Application Every Time

In each of its past confrontations with rivals, Microsoft has successfully played the platform card, trumping even the most dominant applications. Windows allowed Microsoft to displace Lotus 1-2-3 with Excel, WordPerfect with Word, and Netscape Navigator with Internet Explorer.

This time, though, the clash isn’t between a platform and an application, but between two platforms, each with a radically different business model: On the one side, a single software provider, whose massive installed base and tightly integrated operating system and APIs give control over the programming paradigm; on the other, a system without an owner, tied together by a set of protocols, open standards and agreements for cooperation.

Windows represents the pinnacle of proprietary control via software APIs. Netscape tried to wrest control from Microsoft using the same techniques that Microsoft itself had used against other rivals, and failed. But Apache, which held to the open standards of the web, has prospered. The battle is no longer unequal, a platform versus a single application, but platform versus platform, with the question being which platform, and more profoundly, which architecture, and which business model, is better suited to the opportunity ahead.

Windows was a brilliant solution to the problems of the early PC era. It leveled the playing field for application developers, solving a host of problems that had previously bedeviled the industry. But a single monolithic approach, controlled by a single vendor, is no longer a solution, it’s a problem. Communications-oriented systems, as the internet-as-platform most certainly is, require interoperability. Unless a vendor can control both ends of every interaction, the possibilities of user lock-in via software APIs are limited.

Any Web 2.0 vendor that seeks to lock in its application gains by controlling the platform will, by definition, no longer be playing to the strengths of the platform.

This is not to say that there are not opportunities for lock-in and competitive advantage, but we believe they are not to be found via control over software APIs and protocols. There is a new game afoot. The companies that succeed in the Web 2.0 era will be those that understand the rules of that game, rather than trying to go back to the rules of the PC software era.

Not surprisingly, other web 2.0 success stories demonstrate this same behavior. eBay enables occasional transactions of only a few dollars between single individuals, acting as an automated intermediary. Napster (though shut down for legal reasons) built its network not by building a centralized song database, but by architecting a system in such a way that every downloader also became a server, and thus grew the network.

Akamai vs. BitTorrent

Like DoubleClick, Akamai is optimized to do business with the head, not the tail, with the center, not the edges. While it serves the benefit of the individuals at the edge of the web by smoothing their access to the high-demand sites at the center, it collects its revenue from those central sites.

BitTorrent, like other pioneers in the P2P movement, takes a radical approach to internet decentralization. Every client is also a server; files are broken up into fragments that can be served from multiple locations, transparently harnessing the network of downloaders to provide both bandwidth and data to other users. The more popular the file, in fact, the faster it can be served, as there are more users providing bandwidth and fragments of the complete file.

BitTorrent thus demonstrates a key Web 2.0 principle: the service automatically gets better the more people use it. While Akamai must add servers to improve service, every BitTorrent consumer brings his own resources to the party. There’s an implicit “architecture of participation”, a built-in ethic of cooperation, in which the service acts primarily as an intelligent broker, connecting the edges to each other and harnessing the power of the users themselves.

2. Harnessing Collective Intelligence

The central principle behind the success of the giants born in the Web 1.0 era who have survived to lead the Web 2.0 era appears to be this, that they have embraced the power of the web to harness collective intelligence:

  • Hyperlinking is the foundation of the web. As users add new content, and new sites, it is bound in to the structure of the web by other users discovering the content and linking to it. Much as synapses form in the brain, with associations becoming stronger through repetition or intensity, the web of connections grows organically as an output of the collective activity of all web users.
  • Yahoo!, the first great internet success story, was born as a catalog, or directory of links, an aggregation of the best work of thousands, then millions of web users. While Yahoo! has since moved into the business of creating many types of content, its role as a portal to the collective work of the net’s users remains the core of its value.
  • Google’s breakthrough in search, which quickly made it the undisputed search market leader, was PageRank, a method of using the link structure of the web rather than just the characteristics of documents to provide better search results.
  • eBay’s product is the collective activity of all its users; like the web itself, eBay grows organically in response to user activity, and the company’s role is as an enabler of a context in which that user activity can happen. What’s more, eBay’s competitive advantage comes almost entirely from the critical mass of buyers and sellers, which makes any new entrant offering similar services significantly less attractive.
  • Amazon sells the same products as competitors such as Barnesandnoble.com, and they receive the same product descriptions, cover images, and editorial content from their vendors. But Amazon has made a science of user engagement. They have an order of magnitude more user reviews, invitations to participate in varied ways on virtually every page–and even more importantly, they use user activity to produce better search results. While a Barnesandnoble.com search is likely to lead with the company’s own products, or sponsored results, Amazon always leads with “most popular”, a real-time computation based not only on sales but other factors that Amazon insiders call the “flow” around products. With an order of magnitude more user participation, it’s no surprise that Amazon’s sales also outpace competitors.

Now, innovative companies that pick up on this insight and perhaps extend it even further, are making their mark on the web:

  • Wikipedia, an online encyclopedia based on the unlikely notion that an entry can be added by any web user, and edited by any other, is a radical experiment in trust, applying Eric Raymond’s dictum (originally coined in the context of open source software) that “with enough eyeballs, all bugs are shallow,” to content creation. Wikipedia is already in the top 100 websites, and many think it will be in the top ten before long. This is a profound change in the dynamics of content creation!
  • Sites like del.icio.us and Flickr, two companies that have received a great deal of attention of late, have pioneered a concept that some people call “folksonomy” (in contrast to taxonomy), a style of collaborative categorization of sites using freely chosen keywords, often referred to as tags. Tagging allows for the kind of multiple, overlapping associations that the brain itself uses, rather than rigid categories. In the canonical example, a Flickr photo of a puppy might be tagged both “puppy” and “cute”–allowing for retrieval along natural axes generated by user activity.
  • Collaborative spam filtering products like Cloudmark aggregate the individual decisions of email users about what is and is not spam, outperforming systems that rely on analysis of the messages themselves.
  • It is a truism that the greatest internet success stories don’t advertise their products. Their adoption is driven by “viral marketing”–that is, recommendations propagating directly from one user to another. You can almost make the case that if a site or product relies on advertising to get the word out, it isn’t Web 2.0.
  • Even much of the infrastructure of the web–including the Linux, Apache, MySQL, and Perl, PHP, or Python code involved in most web servers–relies on the peer-production methods of open source, in themselves an instance of collective, net-enabled intelligence. There are more than 100,000 open source software projects listed on SourceForge.net. Anyone can add a project, anyone can download and use the code, and new projects migrate from the edges to the center as a result of users putting them to work, an organic software adoption process relying almost entirely on viral marketing.

The lesson: Network effects from user contributions are the key to market dominance in the Web 2.0 era.

Blogging and the Wisdom of Crowds

One of the most highly touted features of the Web 2.0 era is the rise of blogging. Personal home pages have been around since the early days of the web, and the personal diary and daily opinion column around much longer than that, so just what is the fuss all about?

At its most basic, a blog is just a personal home page in diary format. But as Rich Skrenta notes, the chronological organization of a blog “seems like a trivial difference, but it drives an entirely different delivery, advertising and value chain.”

One of the things that has made a difference is a technology called RSS. RSS is the most significant advance in the fundamental architecture of the web since early hackers realized that CGI could be used to create database-backed websites. RSS allows someone to link not just to a page, but to subscribe to it, with notification every time that page changes. Skrenta calls this “the incremental web.” Others call it the “live web”.

Now, of course, “dynamic websites” (i.e., database-backed sites with dynamically generated content) replaced static web pages well over ten years ago. What’s dynamic about the live web are not just the pages, but the links. A link to a weblog is expected to point to a perennially changing page, with “permalinks” for any individual entry, and notification for each change. An RSS feed is thus a much stronger link than, say a bookmark or a link to a single page.
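
To make the mechanism concrete, a minimal RSS 2.0 feed looks roughly like this (the titles and URLs are placeholders); an aggregator re-fetches this document and notices when new <item> entries appear:

<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <title>Example Weblog</title>
    <link>http://example.com/</link>
    <description>A frequently updated page</description>
    <item>
      <title>Newest entry</title>
      <link>http://example.com/archive/newest-entry</link> <!-- the permalink -->
      <pubDate>Fri, 30 Sep 2005 10:00:00 GMT</pubDate>
    </item>
  </channel>
</rss>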

The Architecture of Participation

Some systems are designed to encourage participation. In his paper, The Cornucopia of the Commons, Dan Bricklin noted that there are three ways to build a large database. The first, demonstrated by Yahoo!, is to pay people to do it. The second, inspired by lessons from the open source community, is to get volunteers to perform the same task. The Open Directory Project, an open source Yahoo competitor, is the result. But Napster demonstrated a third way. Because Napster set its defaults to automatically serve any music that was downloaded, every user automatically helped to build the value of the shared database. This same approach has been followed by all other P2P file sharing services.

One of the key lessons of the Web 2.0 era is this: Users add value. But only a small percentage of users will go to the trouble of adding value to your application via explicit means. Therefore, Web 2.0 companies set inclusive defaults for aggregating user data and building value as a side-effect of ordinary use of the application. As noted above, they build systems that get better the more people use them.

Mitch Kapor once noted that “architecture is politics.” Participation is intrinsic to Napster, part of its fundamental architecture.

This architectural insight may also be more central to the success of open source software than the more frequently cited appeal to volunteerism. The architecture of the internet, and the World Wide Web, as well as of open source software projects like Linux, Apache, and Perl, is such that users pursuing their own “selfish” interests build collective value as an automatic byproduct. Each of these projects has a small core, well-defined extension mechanisms, and an approach that lets any well-behaved component be added by anyone, growing the outer layers of what Larry Wall, the creator of Perl, refers to as “the onion.” In other words, these technologies demonstrate network effects, simply through the way that they have been designed.

These projects can be seen to have a natural architecture of participation. But as Amazon demonstrates, by consistent effort (as well as economic incentives such as the Associates program), it is possible to overlay such an architecture on a system that would not normally seem to possess it.

RSS also means that the web browser is not the only means of viewing a web page. While some RSS aggregators, such as Bloglines, are web-based, others are desktop clients, and still others allow users of portable devices to subscribe to constantly updated content.

RSS is now being used to push not just notices of new blog entries, but also all kinds of data updates, including stock quotes, weather data, and photo availability. This use is actually a return to one of its roots: RSS was born in 1997 out of the confluence of Dave Winer’s “Really Simple Syndication” technology, used to push out blog updates, and Netscape’s “Rich Site Summary”, which allowed users to create custom Netscape home pages with regularly updated data flows. Netscape lost interest, and the technology was carried forward by blogging pioneer Userland, Winer’s company. In the current crop of applications, we see, though, the heritage of both parents.

But RSS is only part of what makes a weblog different from an ordinary web page. Tom Coates remarks on the significance of the permalink:

It may seem like a trivial piece of functionality now, but it was effectively the device that turned weblogs from an ease-of-publishing phenomenon into a conversational mess of overlapping communities. For the first time it became relatively easy to gesture directly at a highly specific post on someone else’s site and talk about it. Discussion emerged. Chat emerged. And – as a result – friendships emerged or became more entrenched. The permalink was the first – and most successful – attempt to build bridges between weblogs.

In many ways, the combination of RSS and permalinks adds many of the features of NNTP, the Network News Protocol of the Usenet, onto HTTP, the web protocol. The “blogosphere” can be thought of as a new, peer-to-peer equivalent to Usenet and bulletin-boards, the conversational watering holes of the early internet. Not only can people subscribe to each others’ sites, and easily link to individual comments on a page, but also, via a mechanism known as trackbacks, they can see when anyone else links to their pages, and can respond, either with reciprocal links, or by adding comments.

Interestingly, two-way links were the goal of early hypertext systems like Xanadu. Hypertext purists have celebrated trackbacks as a step towards two way links. But note that trackbacks are not properly two-way–rather, they are really (potentially) symmetrical one-way links that create the effect of two way links. The difference may seem subtle, but in practice it is enormous. Social networking systems like Friendster, Orkut, and LinkedIn, which require acknowledgment by the recipient in order to establish a connection, lack the same scalability as the web. As noted by Caterina Fake, co-founder of the Flickr photo sharing service, attention is only coincidentally reciprocal. (Flickr thus allows users to set watch lists–any user can subscribe to any other user’s photostream via RSS. The object of attention is notified, but does not have to approve the connection.)

If an essential part of Web 2.0 is harnessing collective intelligence, turning the web into a kind of global brain, the blogosphere is the equivalent of constant mental chatter in the forebrain, the voice we hear in all of our heads. It may not reflect the deep structure of the brain, which is often unconscious, but is instead the equivalent of conscious thought. And as a reflection of conscious thought and attention, the blogosphere has begun to have a powerful effect.

First, because search engines use link structure to help predict useful pages, bloggers, as the most prolific and timely linkers, have a disproportionate role in shaping search engine results. Second, because the blogging community is so highly self-referential, bloggers paying attention to other bloggers magnifies their visibility and power. The “echo chamber” that critics decry is also an amplifier.

If it were merely an amplifier, blogging would be uninteresting. But like Wikipedia, blogging harnesses collective intelligence as a kind of filter. What James Surowiecki calls “the wisdom of crowds” comes into play, and much as PageRank produces better results than analysis of any individual document, the collective attention of the blogosphere selects for value.

While mainstream media may see individual blogs as competitors, what is really unnerving is that the competition is with the blogosphere as a whole. This is not just a competition between sites, but a competition between business models. The world of Web 2.0 is also the world of what Dan Gillmor calls “we, the media,” a world in which “the former audience”, not a few people in a back room, decides what’s important.

3. Data is the Next Intel Inside

Every significant internet application to date has been backed by a specialized database: Google’s web crawl, Yahoo!’s directory (and web crawl), Amazon’s database of products, eBay’s database of products and sellers, MapQuest’s map databases, Napster’s distributed song database. As Hal Varian remarked in a personal conversation last year, “SQL is the new HTML.” Database management is a core competency of Web 2.0 companies, so much so that we have sometimes referred to these applications as “infoware” rather than merely software.

This fact leads to a key question: Who owns the data?

In the internet era, one can already see a number of cases where control over the database has led to market control and outsized financial returns. The monopoly on domain name registry initially granted by government fiat to Network Solutions (later purchased by Verisign) was one of the first great moneymakers of the internet. While we’ve argued that business advantage via controlling software APIs is much more difficult in the age of the internet, control of key data sources is not, especially if those data sources are expensive to create or amenable to increasing returns via network effects.

Look at the copyright notices at the base of every map served by MapQuest, maps.yahoo.com, maps.msn.com, or maps.google.com, and you’ll see the line “Maps copyright NavTeq, TeleAtlas,” or with the new satellite imagery services, “Images copyright Digital Globe.” These companies made substantial investments in their databases (NavTeq alone reportedly invested $750 million to build their database of street addresses and directions. Digital Globe spent $500 million to launch their own satellite to improve on government-supplied imagery.) NavTeq has gone so far as to imitate Intel’s familiar Intel Inside logo: Cars with navigation systems bear the imprint, “NavTeq Onboard.” Data is indeed the Intel Inside of these applications, a sole source component in systems whose software infrastructure is largely open source or otherwise commodified.

The now hotly contested web mapping arena demonstrates how a failure to understand the importance of owning an application’s core data will eventually undercut its competitive position. MapQuest pioneered the web mapping category in 1995, yet when Yahoo!, and then Microsoft, and most recently Google, decided to enter the market, they were easily able to offer a competing application simply by licensing the same data.

Contrast, however, the position of Amazon.com. Like competitors such as Barnesandnoble.com, its original database came from ISBN registry provider R.R. Bowker. But unlike MapQuest, Amazon relentlessly enhanced the data, adding publisher-supplied data such as cover images, table of contents, index, and sample material. Even more importantly, they harnessed their users to annotate the data, such that after ten years, Amazon, not Bowker, is the primary source for bibliographic data on books, a reference source for scholars and librarians as well as consumers. Amazon also introduced their own proprietary identifier, the ASIN, which corresponds to the ISBN where one is present, and creates an equivalent namespace for products without one. Effectively, Amazon “embraced and extended” their data suppliers.

Imagine if MapQuest had done the same thing, harnessing their users to annotate maps and directions, adding layers of value. It would have been much more difficult for competitors to enter the market just by licensing the base data.

The recent introduction of Google Maps provides a living laboratory for the competition between application vendors and their data suppliers. Google’s lightweight programming model has led to the creation of numerous value-added services in the form of mashups that link Google Maps with other internet-accessible data sources. Paul Rademacher’s housingmaps.com, which combines Google Maps with Craigslist apartment rental and home purchase data to create an interactive housing search tool, is the pre-eminent example of such a mashup.

At present, these mashups are mostly innovative experiments, done by hackers. But entrepreneurial activity follows close behind. And already, one can see that for at least one class of developer, Google has taken the role of data source away from Navteq and inserted themselves as a favored intermediary. We expect to see battles between data suppliers and application vendors in the next few years, as both realize just how important certain classes of data will become as building blocks for Web 2.0 applications.

The race is on to own certain classes of core data: location, identity, calendaring of public events, product identifiers and namespaces. In many cases, where there is significant cost to create the data, there may be an opportunity for an Intel Inside style play, with a single source for the data. In others, the winner will be the company that first reaches critical mass via user aggregation, and turns that aggregated data into a system service.

For example, in the area of identity, PayPal, Amazon’s 1-click, and the millions of users of communications systems, may all be legitimate contenders to build a network-wide identity database. (In this regard, Google’s recent attempt to use cell phone numbers as an identifier for Gmail accounts may be a step towards embracing and extending the phone system.) Meanwhile, startups like Sxip are exploring the potential of federated identity, in quest of a kind of “distributed 1-click” that will provide a seamless Web 2.0 identity subsystem. In the area of calendaring, EVDB is an attempt to build the world’s largest shared calendar via a wiki-style architecture of participation. While the jury’s still out on the success of any particular startup or approach, it’s clear that standards and solutions in these areas, effectively turning certain classes of data into reliable subsystems of the “internet operating system”, will enable the next generation of applications.

A further point must be noted with regard to data, and that is user concerns about privacy and their rights to their own data. In many of the early web applications, copyright is only loosely enforced. For example, Amazon lays claim to any reviews submitted to the site, but in the absence of enforcement, people may repost the same review elsewhere. However, as companies begin to realize that control over data may be their chief source of competitive advantage, we may see heightened attempts at control.

Much as the rise of proprietary software led to the Free Software movement, we expect the rise of proprietary databases to result in a Free Data movement within the next decade. One can see early signs of this countervailing trend in open data projects such as Wikipedia, the Creative Commons, and in software projects like Greasemonkey, which allow users to take control of how data is displayed on their computer.

4. End of the Software Release Cycle

As noted above in the discussion of Google vs. Netscape, one of the defining characteristics of internet era software is that it is delivered as a service, not as a product. This fact leads to a number of fundamental changes in the business model of such a company:

  1. Operations must become a core competency. Google’s or Yahoo!’s expertise in product development must be matched by an expertise in daily operations. So fundamental is the shift from software as artifact to software as service that the software will cease to perform unless it is maintained on a daily basis. Google must continuously crawl the web and update its indices, continuously filter out link spam and other attempts to influence its results, continuously and dynamically respond to hundreds of millions of asynchronous user queries, simultaneously matching them with context-appropriate advertisements.

It’s no accident that Google’s system administration, networking, and load balancing techniques are perhaps even more closely guarded secrets than their search algorithms. Google’s success at automating these processes is a key part of their cost advantage over competitors.

It’s also no accident that scripting languages such as Perl, Python, PHP, and now Ruby, play such a large role at web 2.0 companies. Perl was famously described by Hassan Schroeder, Sun’s first webmaster, as “the duct tape of the internet.” Dynamic languages (often called scripting languages and looked down on by the software engineers of the era of software artifacts) are the tool of choice for system and network administrators, as well as application developers building dynamic systems that require constant change.

  2. Users must be treated as co-developers, in a reflection of open source development practices (even if the software in question is unlikely to be released under an open source license.) The open source dictum, “release early and release often” in fact has morphed into an even more radical position, “the perpetual beta,” in which the product is developed in the open, with new features slipstreamed in on a monthly, weekly, or even daily basis. It’s no accident that services such as Gmail, Google Maps, Flickr, del.icio.us, and the like may be expected to bear a “Beta” logo for years at a time.

Real time monitoring of user behavior to see just which new features are used, and how they are used, thus becomes another required core competency. A web developer at a major online service remarked: “We put up two or three new features on some part of the site every day, and if users don’t adopt them, we take them down. If they like them, we roll them out to the entire site.”

Cal Henderson, the lead developer of Flickr, recently revealed that they deploy new builds up to every half hour. This is clearly a radically different development model! While not all web applications are developed in as extreme a style as Flickr, almost all web applications have a development cycle that is radically unlike anything from the PC or client-server era. It is for this reason that a recent ZDnet editorial concluded that Microsoft won’t be able to beat Google: “Microsoft’s business model depends on everyone upgrading their computing environment every two to three years. Google’s depends on everyone exploring what’s new in their computing environment every day.”

While Microsoft has demonstrated enormous ability to learn from and ultimately best its competition, there’s no question that this time, the competition will require Microsoft (and by extension, every other existing software company) to become a deeply different kind of company. Native Web 2.0 companies enjoy a natural advantage, as they don’t have old patterns (and corresponding business models and revenue sources) to shed.

A Web 2.0 Investment Thesis

Venture capitalist Paul Kedrosky writes: “The key is to find the actionable investments where you disagree with the consensus”. It’s interesting to see how each Web 2.0 facet involves disagreeing with the consensus: everyone was emphasizing keeping data private, Flickr/Napster/et al. make it public. It’s not just disagreeing to be disagreeable (pet food! online!), it’s disagreeing where you can build something out of the differences. Flickr builds communities, Napster built breadth of collection.

Another way to look at it is that the successful companies all give up something expensive but considered critical to get something valuable for free that was once expensive. For example, Wikipedia gives up central editorial control in return for speed and breadth. Napster gave up on the idea of “the catalog” (all the songs the vendor was selling) and got breadth. Amazon gave up on the idea of having a physical storefront but got to serve the entire world. Google gave up on the big customers (initially) and got the 80% whose needs weren’t being met. There’s something very aikido (using your opponent’s force against them) in saying “you know, you’re right–absolutely anyone in the whole world CAN update this article. And guess what, that’s bad news for you.”

Nat Torkington

5. Lightweight Programming Models

Once the idea of web services became au courant, large companies jumped into the fray with a complex web services stack designed to create highly reliable programming environments for distributed applications.

But much as the web succeeded precisely because it overthrew much of hypertext theory, substituting a simple pragmatism for ideal design, RSS has become perhaps the single most widely deployed web service because of its simplicity, while the complex corporate web services stacks have yet to achieve wide deployment.

Similarly, Amazon.com’s web services are provided in two forms: one adhering to the formalisms of the SOAP (Simple Object Access Protocol) web services stack, the other simply providing XML data over HTTP, in a lightweight approach sometimes referred to as REST (Representational State Transfer). While high value B2B connections (like those between Amazon and retail partners like ToysRUs) use the SOAP stack, Amazon reports that 95% of the usage is of the lightweight REST service.
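
To illustrate the difference (with a hypothetical endpoint and fields, not Amazon’s actual API), the REST style is just an HTTP GET that returns plain XML, while the SOAP style wraps the same request in an envelope:

<!-- REST style: GET http://api.example.com/products?keyword=web+2.0 returns -->
<products>
  <product>
    <title>Example Product</title>
    <price>19.99</price>
  </product>
</products>

<!-- SOAP style: the request itself is an XML envelope POSTed to a single endpoint -->
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <ProductSearch>
      <Keyword>web 2.0</Keyword>
    </ProductSearch>
  </soap:Body>
</soap:Envelope>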

This same quest for simplicity can be seen in other “organic” web services. Google’s recent release of Google Maps is a case in point. Google Maps’ simple AJAX (Javascript and XML) interface was quickly decrypted by hackers, who then proceeded to remix the data into new services.

Mapping-related web services had been available for some time from GIS vendors such as ESRI as well as from MapQuest and Microsoft MapPoint. But Google Maps set the world on fire because of its simplicity. While experimenting with any of the formal vendor-supported web services required a formal contract between the parties, the way Google Maps was implemented left the data for the taking, and hackers soon found ways to creatively re-use that data.

There are several significant lessons here:

  1. Support lightweight programming models that allow for loosely coupled systems. The complexity of the corporate-sponsored web services stack is designed to enable tight coupling. While this is necessary in many cases, many of the most interesting applications can indeed remain loosely coupled, and even fragile. The Web 2.0 mindset is very different from the traditional IT mindset!
  2. Think syndication, not coordination. Simple web services, like RSS and REST-based web services, are about syndicating data outwards, not controlling what happens when it gets to the other end of the connection. This idea is fundamental to the internet itself, a reflection of what is known as the end-to-end principle.
  3. Design for “hackability” and remixability. Systems like the original web, RSS, and AJAX all have this in common: the barriers to re-use are extremely low. Much of the useful software is actually open source, but even when it isn’t, there is little in the way of intellectual property protection. The web browser’s “View Source” option made it possible for any user to copy any other user’s web page; RSS was designed to empower the user to view the content he or she wants, when it’s wanted, not at the behest of the information provider; the most successful web services are those that have been easiest to take in new directions unimagined by their creators. The phrase “some rights reserved,” which was popularized by the Creative Commons to contrast with the more typical “all rights reserved,” is a useful guidepost.

Innovation in Assembly

Lightweight business models are a natural concomitant of lightweight programming and lightweight connections. The Web 2.0 mindset is good at re-use. A new service like housingmaps.com was built simply by snapping together two existing services. Housingmaps.com doesn’t have a business model (yet)–but for many small-scale services, Google AdSense (or perhaps Amazon associates fees, or both) provides the snap-in equivalent of a revenue model.

These examples provide an insight into another key web 2.0 principle, which we call “innovation in assembly.” When commodity components are abundant, you can create value simply by assembling them in novel or effective ways. Much as the PC revolution provided many opportunities for innovation in assembly of commodity hardware, with companies like Dell making a science out of such assembly, thereby defeating companies whose business model required innovation in product development, we believe that Web 2.0 will provide opportunities for companies to beat the competition by getting better at harnessing and integrating services provided by others.

6. Software Above the Level of a Single Device

One other feature of Web 2.0 that deserves mention is the fact that it’s no longer limited to the PC platform. In his parting advice to Microsoft, long time Microsoft developer Dave Stutz pointed out that “Useful software written above the level of the single device will command high margins for a long time to come.”

Of course, any web application can be seen as software above the level of a single device. After all, even the simplest web application involves at least two computers: the one hosting the web server and the one hosting the browser. And as we’ve discussed, the development of the web as platform extends this idea to synthetic applications composed of services provided by multiple computers.

But as with many areas of Web 2.0, where the “2.0-ness” is not something new, but rather a fuller realization of the true potential of the web platform, this phrase gives us a key insight into how to design applications and services for the new platform.

To date, iTunes is the best exemplar of this principle. This application seamlessly reaches from the handheld device to a massive web back-end, with the PC acting a

s a local cache and control station. There have been many previous attempts to bring web content to portable devices, but the iPod/iTunes combination is one of the first such applications designed from the ground up to span multiple devices. TiVo is another good example.

iTunes and TiVo also demonstrate many of the other core principles of Web 2.0. They are not web applications per se, but they leverage the power of the web platform, making it a seamless, almost invisible part of their infrastructure. Data management is most clearly the heart of their offering. They are services, not packaged applications (although in the case of iTunes, it can be used as a packaged application, managing only the user’s local data.) What’s more, both TiVo and iTunes show some budding use of collective intelligence, although in each case, their experiments are at war with the IP lobby’s. There’s only a limited architecture of participation in iTunes, though the recent addition of podcasting changes that equation substantially.

This is one of the areas of Web 2.0 where we expect to see some of the greatest change, as more and more devices are connected to the new platform. What applications become possible when our phones and our cars are not consuming data but reporting it? Real time traffic monitoring, flash mobs, and citizen journalism are only a few of the early warning signs of the capabilities of the new platform.

7. Rich User Experiences

As early as Pei Wei’s Viola browser in 1992, the web was being used to deliver “applets” and other kinds of active content within the web browser. Java’s introduction in 1995 was framed around the delivery of such applets. JavaScript and then DHTML were introduced as lightweight ways to provide client side programmability and richer user experiences. Several years ago, Macromedia coined the term “Rich Internet Applications” (which has also been picked up by open source Flash competitor Laszlo Systems) to highlight the capabilities of Flash to deliver not just multimedia content but also GUI-style application experiences.

However, the potential of the web to deliver full scale applications didn’t hit the mainstream till Google introduced Gmail, quickly followed by Google Maps, web based applications with rich user interfaces and PC-equivalent interactivity. The collection of technologies used by Google was christened AJAX, in a seminal essay by Jesse James Garrett of web design firm Adaptive Path. He wrote:

“Ajax isn’t a technology. It’s really several technologies, each flourishing in its own right, coming together in powerful new ways. Ajax incorporates:

  • standards-based presentation using XHTML and CSS;
  • dynamic display and interaction using the Document Object Model;
  • data interchange and manipulation using XML and XSLT;
  • asynchronous data retrieval using XMLHttpRequest;
  • and JavaScript binding everything together.”

Web 2.0 Design Patterns

In his book, A Pattern Language, Christopher Alexander prescribes a format for the concise description of the solution to architectural problems. He writes: “Each pattern describes a problem that occurs over and over again in our environment, and then describes the core of the solution to that problem, in such a way that you can use this solution a million times over, without ever doing it the same way twice.”

  1. The Long Tail
    Small sites make up the bulk of the internet’s content; narrow niches make up the bulk of the internet’s possible applications. Therefore: Leverage customer self-service and algorithmic data management to reach out to the entire web, to the edges and not just the center, to the long tail and not just the head.
  2. Data is the Next Intel Inside
    Applications are increasingly data-driven. Therefore: For competitive advantage, seek to own a unique, hard-to-recreate source of data.
  3. Users Add Value
    The key to competitive advantage in internet applications is the extent to which users add their own data to that which you provide. Therefore: Don’t restrict your “architecture of participation” to software development. Involve your users both implicitly and explicitly in adding value to your application.
  4. Network Effects by Default
    Only a small percentage of users will go to the trouble of adding value to your application. Therefore: Set inclusive defaults for aggregating user data as a side-effect of their use of the application.
  5. Some Rights Reserved
    Intellectual property protection limits re-use and prevents experimentation. Therefore: When benefits come from collective adoption, not private restriction, make sure that barriers to adoption are low. Follow existing standards, and use licenses with as few restrictions as possible. Design for “hackability” and “remixability.”
  6. The Perpetual Beta
    When devices and programs are connected to the internet, applications are no longer software artifacts, they are ongoing services. Therefore: Don’t package up new features into monolithic releases, but instead add them on a regular basis as part of the normal user experience. Engage your users as real-time testers, and instrument the service so that you know how people use the new features.
  7. Cooperate, Don’t Control
    Web 2.0 applications are built of a network of cooperating data services. Therefore: Offer web services interfaces and content syndication, and re-use the data services of others. Support lightweight programming models that allow for loosely-coupled systems.
  8. Software Above the Level of a Single Device
    The PC is no longer the only access device for internet applications, and applications that are limited to a single device are less valuable
    than those that are connected. Therefore: Design your application from the get-go to integrate services across handheld devices, PCs, and internet servers.

AJAX is also a key component of Web 2.0 applications such as Flickr, now part of Yahoo!, 37signals’ applications basecamp and backpack, as well as other Google applications such as Gmail and Orkut. We’re entering an unprecedented period of user interface innovation, as web developers are finally able to build web applications as rich as local PC-based applications.

Interestingly, many of the capabilities now being explored have been around for many years. In the late ’90s, both Microsoft and Netscape had a vision of the kind of capabilities that are now finally being realized, but their battle over the standards to be used made cross-browser applications difficult. It was only when Microsoft definitively won the browser wars, and there was a single de-facto browser standard to write to, that this kind of application became possible. And while Firefox has reintroduced competition to the browser market, at least so far we haven’t seen the destructive competition over web standards that held back progress in the ’90s.

We expect to see many new web applications over the next few years, both truly novel applications, and rich web reimplementations of PC applications. Every platform change to date has also created opportunities for a leadership change in the dominant applications of the previous platform.

Gmail has already provided some interesting innovations in email, combining the strengths of the web (accessible from anywhere, deep database competencies, searchability) with user interfaces that approach PC interfaces in usability. Meanwhile, other mail clients on the PC platform are nibbling away at the problem from the other end, adding IM and presence capabilities. How far are we from an integrated communications client combining the best of email, IM, and the cell phone, using VoIP to add voice capabilities to the rich capabilities of web applications? The race is on.

It’s easy to see how Web 2.0 will also remake the address book. A Web 2.0-style address book would treat the local address book on the PC or phone merely as a cache of the contacts you’ve explicitly asked the system to remember. Meanwhile, a web-based synchronization agent, Gmail-style, would remember every message sent or received, every email address and every phone number used, and build social networking heuristics to decide which ones to offer up as alternatives when an answer wasn’t found in the local cache. Lacking an answer there, the system would query the broader social network.

A Web 2.0 word processor would support wiki-style collaborative editing, not just standalone documents. But it would also support the rich formatting we’ve come to expect in PC-based word processors. Writely is a good example of such an application, although it hasn’t yet gained wide traction.

Nor will the Web 2.0 revolution be limited to PC applications. Salesforce.com demonstrates how the web can be used to deliver software as a service, in enterprise scale applications such as CRM.

The competitive opportunity for new entrants is to fully embrace the potential of Web 2.0. Companies that succeed will create applications that learn from their users, using an architecture of participation to build a commanding advantage not just in the software interface, but in the richness of the shared data.

Core Competencies of Web 2.0 Companies

In exploring the seven principles above, we’ve highlighted some of the principal features of Web 2.0. Each of the examples we’ve explored demonstrates one or more of those key principles, but may miss others. Let’s close, therefore, by summarizing what we believe to be the core competencies of Web 2.0 companies:

  • Services, not packaged software, with cost-effective scalability
  • Control over unique, hard-to-recreate data sources that get richer as more people use them
  • Trusting users as co-developers
  • Harnessing collective intelligence
  • Leveraging the long tail through customer self-service
  • Software above the level of a single device
  • Lightweight user interfaces, development models, AND business models

The next time a company claims that it’s “Web 2.0,” test their features against the list above. The more points they score, the more they are worthy of the name. Remember, though, that excellence in one area may be more telling than some small steps in all seven.

Tim O’Reilly
O’Reilly Media, Inc., tim@oreilly.com
President and CEO

50 Essential Strategies For Creating A Successful Web 2.0 Product


posted Monday, 26 January 2009

I am fortunate enough to spend a lot of time looking at various online products and services in the development stage, mostly of the Web 2.0 variety, meaning they use one or more of the principles in the Web 2.0 set of practices. I've been doing this for going on four years now, and what's fascinating to me, despite the enormous amount of knowledge we've accumulated on how to create modern Web applications, is how many of the same lessons are learned over and over again.

Wouldn't it be handy if we had a cheat sheet that combined many of these lessons into one convenient list? In that vein, I recently sat down to capture some of the most important lessons I've learned over the last few years, along with some of the thinking that went into them.

The Web Community Gets Smarter Every Time It Builds A Product

If there’s one thing that the Web has taught us it’s that the network gets smarter by virtue of people using it and product development is no exception. Not only do we have examples of great online applications and systems to point to and use for best practices, but the latest tools, frameworks, development platforms, APIs, widgets, and so on, which are largely developed today in the form of open source over the Internet, tend to accumulate many of these new best practices. I’ve lauded everything from frameworks like Rails, Cake PHP, and Grails to online community platforms like Drupal and Joomla as examples of guiding solutions that can be vital springboards for the next great Web product or service.

However, most of the success of an online product, Web 2.0 or otherwise, comes from two things: its software architecture and its product design. It's also the case that the story of any product is a story of tens of thousands of little decisions made throughout the life of the product, of which only a key (and heartbreakingly small) set will make much of a difference to its success on the network. The list of strategies below tells part of the story of which decisions will make that critical difference.

What then is software architecture and product design when it comes to today’s Web applications? The good news: They’re often the same as they’ve always been, albeit just a bit more extreme, though there are some additions for the 2.0 era as well:

Software architecture determines a Web application’s fundamental structure and properties: Resilience, scalability, adaptability, reliability, changeability, maintainability, extensibility, security, technology base, standards compliance, and other key constraints, and not necessarily in that order.

Product design determines a Web application’s observable function: Usability, audience, feature set, capabilities, functionality, business model, visual design, and more. Again, not necessarily in priority order.

Doing both of these top-level product development activities well, striking a healthy balance between them (one often dominates the other), and doing it with a small team, requires people with deep and multidisciplinary backgrounds in creating successful products across this extensive set of practice areas. These people are often hard to find and extremely valuable. This means it’s also not likely you’ll be able to easily put together a team with all the capabilities that are needed from the outset.

Be prepared from the outset for on-the-job learning and study, relying on tools and products that embody best practices, and replicating only the best designs and ideas (while being very conscientious not to steal IP.)

Balancing Software Architecture and Product Design in Web 2.0 Applications

In this way, I’ve collected a set of strategies that address the most common issues that I see come up over and over again as online products go to market. I’ve decided to share these with you so we can continue to teach the network, and consequently ourselves, a little bit more about how to make extraordinary Web applications that can really make a difference in the marketplace.

This of course is just my experience and is not intended to be a complete list of Web 2.0 strategies. However, I think most people will find it a valuable perspective and a useful cross-check in their product design and development. And please keep in mind that this list is for Web 2.0 applications, not necessarily static Web sites or traditional online Web presences, though there is much here that can be applied to them to make them more useful and successful as well.

Finally, a good number of these strategies are not specifically Web 2.0 concepts. They are on the list because they are pre-requisites to many Web 2.0 approaches and to any successful product created with software and powered by people.

Please add your own strategies in comments below for anything that I’ve missed.

50 Strategies For Creating A Successful Web 2.0 Product

1. Start with a simple problem. All of the most successful online services start with a simple premise and execute on it well with great focus. This could be Google with its command-line search engine, Flickr with photo sharing, or Digg with user-generated news. State your problem simply: “I make it easier to do X”. Focus on solving it elegantly and simply, and only add features carefully. Over time, complexity will become the enemy of both your product design and your software architecture, so start with as much focus as you can muster.

2. Create prototypes as early as possible. Get your idea into a working piece of software as quickly as possible. The longer you take to go through one entire cycle, the more unknown work you have ahead of you. Not producing software also means that you are not getting better and better at turning the work of your team into the most important measurable output: Functioning software. Throughout the life of your product, turning your ideas into software as quickly and inexpensively as possible will be one of the most important activities to get right.

3. Get people on the network to work with the product prototype rapidly and often. The online world today is fundamentally people-centric. If your product isn't about them and how it makes their lives better, your product really doesn't matter. And if they're not using your Web application as soon as possible, you just don't know if you are building the right product. Constant, direct feedback from real people is the most important input to your product design after your idea itself. Don't wait months for this to happen; get a beta out to the world, achieve marketplace contact in weeks, or at most a few months, and watch carefully what happens. This approach is sometimes called Web 2.0 Development.

4. Release early and release often. Don't get caught up in the massive release cycle approach, no matter how appealing it may be. Large releases let you push off until tomorrow work that should be done today. They also create too much change at once and often have too many dependencies, further driving up the size of the release. Small releases almost always work better and are easier to manage, though they can require a bit more operations overhead. Done right, your online product will iterate smoothly and improve faster and more regularly than your competitors'. Some online products, notably Flickr, are on record as saying they push new releases to production up to several times a day. This is a development velocity that many new startups have trouble appreciating or don't know how to enable. Agile software development processes are a good model to start with, and these and even more extreme methods have worked well in the Web 2.0 community for years.

5. Manage your software development and operations to real numbers that matter. One often unappreciated issue with software is its fundamentally intangible nature. Combine that with human nature, which is to manage to what you can see, and you can have a real problem. There is a reason why software development has such a variable nature in terms of time, budget, and resources. Make sure you have as many real numbers as possible to manage to: Who is making how many commits a week to the source repository, how many registered users are there on a daily basis, what does the user analytics look like, which product features are being used most/least this month, what are the top 5 complaints of customers, and so on. All of these are important key performance indicators that far too many startups don’t manage and respond to as closely as they should.

6. Gather usage data from your users and feed it back into product design as often as possible. Watch what your users do live with your product: what they click on, what they try to do with it, what they don't use, and so on. You will be surprised; they will do things you never expected, have trouble with features that seem easy to you, and not understand parts of your product that seemed obvious. Gather this data often and feed it back into your usability and information architecture processes. Some Web application teams do this almost daily, others look at click-stream analytics once a quarter, and some don't do it at all. Guess who is shaping their product faster and in the right direction?

7. Put off irreversible architecture and product design decisions as long as possible. Get in the habit of asking "How difficult will it be to change our mind about this later?" Choosing a programming language, Web framework, relational database design, or software interface tends to be a one-way decision that is hard to undo. Picking a visual design, logo, layout, or analytics tool generally is not. Consequently, while certain major decisions must be made up front, be vigilant for seemingly innocuous decisions that will be difficult to reverse. Not all of these will be a big deal, but it's all too often a surprise to many people where the architecture needs to remain malleable. Reduce unpleasant surprises by always asking this question.

8. Choose the technologies later and think carefully about what your product will do first. First, make sure your ideas will work on the Web. I've seen too many startups with ideas that will work in software but not on the Web. Second, Web technologies often have surprising limits: Ajax can't handle video or audio, and Flash is hard to make work with SEO, for example. Choosing a technology too early will constrain what is possible later on. That said, you have to choose as rapidly as you can within this constraint, since you need to build prototypes and the initial product as soon as you are able.

9. When you do select technologies, consider current skill sets and staff availability. New, trendy technologies can have major benefits, including higher levels of productivity and compelling new capabilities, but they also make it harder to find people who are competent with them. Having staff learn new technology on the job can be painful, expensive, and risky. Older technologies are in a similar boat; you can find people who know them, but those people most likely won't want to work with them. This means the middle of the road is often the best place to be when it comes to selecting technology, though you all too often won't have a choice, depending on what your staff already knows or because of the prerequisites of specific technologies that you have to use.

10. Balance programmer productivity with operational costs. Programming time is the most expensive part of product creation up front while operations is after you launch. Productivity-oriented platforms such as Ruby on Rails are very popular in the Web community to drive down the cost of product development but can have significant run-time penalties later when you are supporting millions of users. I’ve previously discussed the issues and motivations around moving to newer programming languages and platforms designed for the modern Web, and I encourage you to read it. Productivity-oriented platforms tend to require more operational resources during run-time, and unlike traditional software products, the majority of the cost of operations falls upon the startup. Be aware of the cost and scale of the trade-offs since every dollar you save on the development productivity side translates into a run-time cost forever after on the operations side.

11. Variability in productivity among programmers and among development platforms each spans an order of magnitude. Combined, your choice of programming talent and software development platform can have a 100x overall effect on product development productivity. This means that some teams can ship a product in as little as 3 months and some projects won't ever ship, at least not without truly prohibitive time and resource requirements. While there are a great many inputs to an Internet startup that will help or hinder it (take a look at Paul Graham's great 18 Mistakes That Kill Startups for a good list), these are two of the most central and variable: who is developing the product and what development platform they are using. Joel Spolsky's write-up on programmer productivity remains one of the best explorations of this issue. It usually turns out that paying a bit more for the right developer can mean tremendous output gains. On the other side of the coin, choosing a development platform not designed for creating modern Web applications is another decision that can sap your team of productivity while they spend months retrofitting it for the features they'll need to make it work properly in today's Internet world.

12. Plan for testing to be a larger part of the software development process than it is for non-Web applications. Cross-browser testing, usability testing, and performance/load testing are much bigger issues for Web applications than for many non-Web applications. Having to test thoroughly in a half-dozen to a dozen browser types can be an unexpected tax on the time and cost of creating a Web product. Doing adequate load testing is another item that often waits until the end, the very worst time to find out where the bottlenecks in your architecture are. Plan to test more than usual. Insist on automated unit and integration tests that build up over time and run without having to pay developers or testers to do it manually.

13. Move beyond traditional application hosting. Single Web-server hosting models are not going to suffice for your 2.0 applications. Reliability, availability, and scalability are essential and must be designed into your run-time architecture and supported by your hosting environment. 3Tera, Amazon's Elastic Compute Cloud, and Google's App Engine are three compelling, yet very different, solutions to the hosting problem. Either way, grid and cloud approaches to hosting will help you meet your growth and scalability requirements while managing your costs.

14. Have an open source strategy. This has two important aspects. One, developing and hosting a product built with open source software (the ubiquitous LAMP stack) is almost always much less expensive than using commercial software, and it is what most online products use. There are certainly commercial licenses that have fair terms for online services, but almost none of them will match the cost of free. This is one reason why you won't find Windows or Oracle embedded in very many Web 2.0 services. Two, you'll have to decide whether to release your own product as open source or commercial open source. This has everything to do with what your product does and how it does it, but an increasing number of hosted Web 2.0 products are releasing their offerings as open source to appeal to customers, particularly business customers. Done right, open sourcing can negate arguments about the size of your company while enlisting many 3rd party developers to help enrich and improve your product.

15. Consider mobile users as important as your regular browser customers. Mobile devices will ultimately form the majority of your user base as the capability and adoption of smartphones, Internet tablets, laptops, and netbooks ushers in mobile Web use as the dominant model. Having an application strategy as well as well-supported applications for the iPhone, Android, and RIM platforms is essential for most Web products these days. By the time you get to market, mobile will be even more important than it is now. Infoworld confirmed today, in fact, that wireless enterprise development will be one of 2009’s bright spots.

16. Search is the new navigation, so make it easy to use in your application. A new user has 5-10 seconds to find what they want from your site or application. Existing users want to get directly to what they need without going through layers of menu items and links. Search is the fastest way to provide random-access navigation. Therefore, offer search across data, community, and help at a minimum. A search box should be on the main page and, indeed, on every page of the modern Web application, as in the minimal sketch below.
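
A site-wide search box doesn't need to be elaborate. As a minimal sketch (the /search URL and the q parameter name are placeholders, not a prescribed API), something like this on every page is enough to start with:

<form action="/search" method="get">
  <!-- One text field and one button; the action URL and field name are placeholders -->
  <input type="text" name="q" size="30" />
  <input type="submit" value="Search" />
</form>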

17. Whenever users can provide data to your product, enable them to. Harnessing collective intelligence is the most central high-level principle of Web 2.0 applications. To be a major online competitor, getting your millions of users to build a valuable data set around the clock is the key to success. Many product designers look at this too narrowly, and usually at a small set of data. Keep a broad view of this and look for innovative ways to gather information; everything from explicit contributions to the database of intentions can feed your architecture of participation.

18. Offer an open API so that your Web application can be extended by partners around the world. I’ve covered this topic many times in the past and if you do it right, your biggest customers will soon become 3rd party Web applications building upon your data and functionality. Critically, offering an API converts your online product into an open platform with an ecosystem of 3rd party partners. This is just one of many ways to realize Jakob’s law, as is the next item.
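
To make the idea concrete, here is a rough sketch of how a page might consume a JSON API over plain XMLHttpRequest. The /api/items.json endpoint and its response shape are made up for illustration; a cross-domain caller would need JSONP or a server-side proxy.

<script type="text/javascript">
// Hypothetical same-domain JSON API call; the endpoint and fields are placeholders
var xhr = new XMLHttpRequest();
xhr.open("GET", "/api/items.json?tag=web20", true);
xhr.onreadystatechange = function () {
  if (xhr.readyState === 4 && xhr.status === 200) {
    // Older browsers need the json2.js shim for JSON.parse
    var items = JSON.parse(xhr.responseText);
    alert("Fetched " + items.length + " items");
  }
};
xhr.send(null);
</script>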

19. Make sure your product can be spread around the Web by users: provide widgets, badges, and gadgets. If your application has any success at all, your users will want to take it with them and use your features elsewhere. This is often low-effort but can drive enormous growth and adoption; think about YouTube's badge, or the embed sketch below.
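
As a sketch of what such an embed snippet might look like (the widgets.example.com URL and the renderBadge function are hypothetical, not a real service), the user pastes a couple of lines into their own site:

<!-- Hypothetical badge embed a user would paste into their own blog or site -->
<div id="example-badge"></div>
<script type="text/javascript" src="http://widgets.example.com/badge.js"></script>
<script type="text/javascript">
  // badge.js would define renderBadge(); both names are placeholders
  renderBadge("example-badge", { user: "alice" });
</script>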

20. Create features to make the product distribute virally. The potency of this is similar to the widgets above; everything from simple e-mail friend invites to importing contact lists and social graphs from other Web apps is a critical way to ensure that users can bring the people they want into the application, driving more value for them and for you.

21. The link is the fundamental unit of thought on the Web, so richly link-enable your applications. Links are what make the Web so special and fundamentally make it work. Ensuring your application is URL-addressable in a granular way, especially if you have a rich user experience, is vital to participating successfully on the Web. The Web's link ecosystem is enormously powerful: it is needed for bookmarking, link sharing and propagation, and advertising, it makes SEO work, it drives your page rank, and much more. Your overall URL structure should be thought out and clean; look to Flickr and del.icio.us for good examples, or the rewrite sketch below.
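
If your pages are generated by a script, a short Apache mod_rewrite rule in .htaccess can expose clean, granular URLs. This is only a sketch: photo.php and its id parameter are placeholders for whatever your application actually uses.

# Map /photos/12345 onto an internal script; photo.php and "id" are placeholders
RewriteEngine On
RewriteRule ^photos/([0-9]+)/?$ photo.php?id=$1 [L,QSA]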

22. Create an online user community for your product and nurture it. Online communities are a way to engage passionate users to provide feedback, support, promotion, evangelism, and countless other useful outcomes. While this is usually standard fare now with online products, too many companies don't start it early enough or give it enough resources, despite the benefits it confers in customer support, user feedback, and free marketing, to name just three. Investing in online community approaches is ultimately one of the least expensive aspects of your product, no matter the upfront cost. Hire a good community manager and set them to work.

23. Offer an up-to-date, clean, compelling application design. Attractive applications inherently draw new customers to try them, and attractiveness is a prerequisite for good usability and user experience. Visual and navigational unattractiveness and complexity are also the enemy of product adoption. Finally, using the latest designs and modes provides visual cues that convey that the product is timely and informed. A good place to start to make sure you're using the latest user experience ideas and trends is Smashing Magazine's 2009 Web Design survey.

24. Load-time and responsiveness matter, so measure and optimize for them on a regular basis. This is not a glamorous aspect of Web applications, but it's a fundamental that is impossible to ignore. Every extra second a key operation takes, such as the main page load or a major feature interaction, makes it more likely that a customer will go looking for a faster product. On the Web, time is literally money, and building high-speed user experiences is essential. Rich Internet Application technologies such as Ajax and Flash, if used wisely, can help make an application feel as fast as the most responsive desktop application. Using content distribution networks and regional hosting centers can help as well.
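
Dedicated profiling tools give far more detail, but even a rough sketch like the following JavaScript timer gives you a number to track from release to release (the alert is only for illustration; in real use you would log the value instead):

<script type="text/javascript">
// Rough page-load timer: put this as early as possible in the <head>
var pageStart = new Date().getTime();
window.onload = function () {
  // Report elapsed milliseconds once the page and its assets have loaded
  var elapsed = new Date().getTime() - pageStart;
  alert("Page loaded in " + elapsed + " ms"); // replace with logging in real use
};
</script>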

25. User experience should follow a “complexity gradient.” Novice users will require a simple interface but will want an application's capabilities to become more sophisticated over time as they become more skilled in using it. Offering advanced features that become available when a user is ready, but stay hidden until then, allows a product to grow with the user and keeps them engaged instead of sending them off looking for a more advanced alternative.

26. Monetize every page view. There is no excuse for not making sure every page drives bottom-line results for your online business. Some people will disagree with this recommendation, and advertising can often seem overly commercial early in a product's life. However, though a Web application should never look like a billboard, simple approaches like one-line sponsorships or even public service messages are good ways to maximize the business value of the product, and there are other innovative approaches as well.

27. Users’ data belongs to them, not you. This is a very hard strategy for some to accept and you might be able to get away with bending this rule for a while, that is, until some of your users want to move their data elsewhere. Data can be a short-term lock-in strategy, but long-term user loyalty comes from treating them fairly and avoiding a ‘Roach Motel’ approach to user data (“they can check-in their data, but they can’t check out.”) Using your application should be a reversible process and users should have control of their data. See DataPortability.org for examples of how to get started with this.

28. Go to the user, don't only make them come to you. The aforementioned APIs and widgets help with this but are not sufficient. To drive strong user adoption, you have to be everywhere else on the Web that you can be. This means the usual advertising, PR, and press outreach, but it also means creating Facebook applications, OpenSocial gadgets, and enabling use from mashups. These methods can often be more powerful than all the traditional ways combined.

29. SEO is as important as ever, so design for it. One of the most important streams of new users will be people coming in from search engines looking for exactly what you have. This stream is free and quite large if you ensure your data is URL-addressable and can be found by search engine Web crawlers. Your information architecture should be deeply SEO-friendly and highly granular.

30. Know thy popular Web standards and use them. From a consumer or creator standpoint, the data you exchange with everyone else will be in some format or another, and the usefulness of that data or protocol will be in direct proportion to how well-known and accepted the standard is. This generally means using CSS, Javascript, XHTML, HTTP, ATOM, RSS, XML, JSON, and so on. Following open standards enables the maximum amount of choice, flexibility, time-to-market, access to talent pools, and many other benefits over time, for both you and your customers.

31. Understand and apply Web-Oriented Architecture (WOA). The Web has a certain way that it works best, and understanding how HTTP works at a deep level is vital for getting the most out of the unique power that the Internet has to offer. But HTTP is just the beginning of this way of thinking about the Web and how to use its intrinsic power to be successful with it. This includes knowing why and how link structure, network effects, SEO, API ecosystems, mashups, and other aspects of the Web are key to making your application flourish. It's important to note that your internal application architecture is likely not fundamentally Web-oriented itself (because most software development platforms are not Web-oriented), and you'll have to be diligent in enabling a WOA model in your Web-facing product design. The bottom line: non-Web-oriented products tend not to fare very well because they fail to take advantage of the very things that have made the Web itself so successful.

32. Online products that build upon enterprise systems should use open SOA principles. Large companies building their first 2.0 products will often use existing IT systems and infrastructure that already have the data and functionality they need. Although they will often decouple and cache them for scalability and performance, the connection itself is best made using the principles of SOA. That doesn't necessarily mean traditional SOA products and standards, although it could; often, more Web-oriented methods work better. What does this really mean? Stay away from proprietary integration methods and use the most open models you can find, understanding that the back-end of most online products will be consumed by more than just your front-end (see the API discussion above for a fuller exploration).

33. Strategically use feeds and syndication to enable deep content distribution. This is another way to use Jakob’s Law to increase unintended uses and consumption of an application from other sites and ecosystems. Feeds enable many beneficial use cases such as near real-time perception of fresh data in your application from across the Web in feed readers, syndication sites, aggregators, and elsewhere. Like many other techniques here, knee-jerk use of feeds won’t drive much additional usage and adoption, but carefully designing feeds to achieve objectives like driving new customers back to the application directly from the feed can make a big difference. Failing to offer useful feeds is one of the easiest ways to miss out on business opportunities while giving your competitors an edge.
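
At minimum, make your feeds discoverable. A sketch of feed auto-discovery links in the page head (the /feed URLs are placeholders for wherever your feeds actually live):

<link rel="alternate" type="application/rss+xml"  title="Example RSS feed"  href="http://www.example.com/feed/rss" />
<link rel="alternate" type="application/atom+xml" title="Example Atom feed" href="http://www.example.com/feed/atom" />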

34. Build on the shoulders of giants; don't recreate what you can source from elsewhere. Today's Internet applications usually require too much functionality to be cost-effectively built by a single effort. Typically, an application will source dozens of components and pieces of external functionality from 3rd parties. These could be off-the-shelf libraries, or they could be the live use of another site's API, the latter of which has become one of the most interesting new business models in the Web 2.0 era. The general rule of thumb: unless it's a strategic capability of your application, try hard to source it from elsewhere before you build it; 3rd party sources are already more hardened and robust, less expensive, and have fewer defects than any initial code you could produce. Get used to doing a rapid build-vs-buy evaluation for each major component of your application.

35. Register the user as soon as possible. One of the most valuable aspects of your online product will be its registered user base. Make sure your application gives users a good reason to register and that the process is as painless as possible. Each additional step or input field will increase abandonment of the process, and you can always ask for more information later. Consider making OpenID the default login, with your local user database as a 2nd tier, to make the process even easier and more comfortable for the user.

36. Explicitly enable your users to co-develop the product. I call this concept Product Development 2.0, and it's one of the most potent ways to create a market-leading product by engaging the full capabilities of the network. The richest source of creative input you will have is your audience of passionate, engaged users. This can be enabled via simple feedback forms, harvested from surveys and online community forums, via services such as GetSatisfaction, or as the ingredients to mashups and user-generated software. As you'll see below, you can even open the code base or provide a plug-in approach and open APIs to allow motivated users and 3rd parties to contribute working functionality. Whichever of these you do, you'll find the innovation and direction they provide to be key to making your product the richest and most robust it can be. A significant percentage of the top online products in the world take advantage of this key 2.0 technique.

37. Provide the legal and collaborative foundations for others to build on your data and platform. A good place to start is to license as much of your product as you can via Creative Commons or another licensing model that is less restrictive and more open than copyright or patents. Unfortunately, this is something that 20th century business models around law, legal precedent, and traditional product design are ill-equipped to support, and you'll have to look at what other market leaders are doing with IP licensing that is working. Giving others explicit permission up front to repurpose and reuse your data and functionality in their own products can be essential to driving market share and success. Another good method is to let your users license their data as well; Flickr is famous for doing this. It's important to understand that this is now the Some Rights Reserved era, not the All Rights Reserved era. So openly license what you have for others to use; the general rule of thumb is that the more you give away, the more you'll get back, as long as you have a means of exercising control. This is why open APIs have become as popular as they have: they are essentially “IP-as-a-service”, and poorly behaving partners or licensees can be dealt with quickly and easily.

38. Design your product to build a strong network effect. The concept of the network effect is something I’ve covered here extensively before and it’s one of the most important items in this list. At their most basic, Web 2.0 applications are successful because they explicitly leverage network effects successfully. This is the underlying reason why most of the leading Internet companies got so big, so fast. Measuring network effects and driving them remains one of the most poorly understood yet critical aspects of competing successfully online. The short version: It’s extremely hard to fight an established network effect (particularly because research has shown them to be highly exponential). Instead, find a class of data or a blue ocean market segment for your product and its data to serve.

39. Know your Web 2.0 design patterns and business models. The fundamental principles of Web 2.0 were all identified and collected together for a good reason. Each principle is something that must be considered carefully in the design of your product, given how they can magnify your network effect. Your development team must understand them and know why they're important, especially what outcomes they will drive in your product and business. It's the same with Enterprise 2.0 products: there is another, related set of design principles (which I've summarized as FLATNESSES) that makes them successful as well. And as with everything on this list, you don't apply 2.0 principles reflexively; they need to be used intelligently and for good reason.

40. Integrate a coherent social experience into your product. Social systems tend to have a much more pronounced network effect (Reed’s Law) than non-social systems. Though no site should be social without a good reason, it turns out that most applications will benefit from having a social experience. What does this mean in practice? In general, social applications let users perceive what other users are doing and actively encourage them to interact, work together, and drive participation through social encouragement and competition. There is a lot of art to the design of the social architecture of an online product, but there is also an increasing amount of science. Again, you can look at what successful sites are doing with their social interaction but good places to start are with user profiles, friends lists, activity streams, status messages, social media such as blogs and microsharing, and it goes up from there. Understand how Facebook Connect and other open social network efforts such as OpenSocial can help you expand your social experience.

41. Understand your business model and use it to drive your product design. Too many Web 2.0 applications hope that they will create large amounts of traffic and then find someone interested in acquiring them. Alternatively, some products charge too much up front and prevent themselves from reaching critical mass. While over-thinking your exit strategy or trying to determine your ultimate business model before you do anything isn't good either, too many startups never sit down and do the rigorous thinking about how to make their business a successful one in the nearer term. Take a look at Andrew Chen's How To Create a Profitable Freemium Startup for a good example of a framework for doing some of this business model planning. Taking the current economic downturn into account, and making sure you address how your offering can help people and businesses in the current business climate, will also help right now.

42. Embrace emergent development methods. While a great many of the Web's best products had a strong product designer with a clear vision who truly understood his or her industry, the other half of the equation that often gets short shrift is the quality of emergent design through open development. This captures the innate crowdsourcing aspects of ecosystem-based products, specifically those that have well-defined points of connectedness with external development inputs and 3rd party additions. Any Web application has some emergent development if it takes development inputs or supports extensibility via 3rd party plug-ins, widgets, open APIs, open source contributions, and so on. The development (and a good bit of the design) of the product then “emerges” as a function of multiple inputs. Though there is still some top-down control, in essence the product becomes more than the sum total of its raw inputs. Products like Drupal and Facebook are good examples of this, with thousands of plug-ins or 3rd party apps that have been added to them by other developers.

43. It's all about usability, usability, and usability. I've mentioned usability before in this list, but I want to make it a first-class citizen. Nothing will be a more imposing barrier to adoption than people not understanding how your product works. Almost nothing on this list will work until the usability of your application is a given. And hands down, the most common mistake I see is Web developers creating user experiences in isolation. If you're not funded to have a usability lab (and you probably should be, at some level), then you need to grab every friend and family member you have and watch how they use your application for the first time. Do this again for every release that makes user experience changes. You will change a surprising number of assumptions and hear feedback that you desperately need to hear before you invest any more in a new user experience approach. This is now true even if you're developing enterprise applications for the Web.

44. Security isn't an afterthought. It's a sad fact that far too much of a successful startup's time will be spent on security issues. Once you are popular, you will be the target of every so-called script kiddie with a grudge or with the desire to get at your customer data. Software vulnerabilities are numerous, and the surface area of modern Web apps is large. You not only have your own user experience but also your API, widgets, semantic Web connections, social networking applications, and other points of attack. Put aside time and budget for regular vulnerability assessments. You can't afford a public data spill or exploit due to a security hole that compromises your users' data, or you may well find yourself with a lot of departing customers. Web 2.0 applications also need unique types of security systems, from rate limiters that prevent valuable user-generated data from being systematically scraped from the site (vital to “maintaining control of unique and hard-to-re-create datasets”) to monitoring software that screens for objectionable or copyrighted contributions.
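
As an illustration of the rate-limiter idea, here is a minimal per-IP sketch in PHP. It assumes the APC extension is available and is not production-grade; a real implementation would use a shared store such as memcached, and the limits and key scheme shown are placeholders.

<?php
// Minimal per-IP rate limiter sketch; limits and key scheme are illustrative only
$limit  = 100;             // maximum requests allowed...
$window = 60;              // ...per 60-second window
$key    = 'rate:' . $_SERVER['REMOTE_ADDR'] . ':' . floor(time() / $window);

apc_add($key, 0, $window); // create the counter for this window if it doesn't exist
$count = apc_inc($key);    // atomically increment and read it

if ($count > $limit) {
    header('HTTP/1.1 503 Service Unavailable');
    header('Retry-After: ' . $window);
    exit('Rate limit exceeded');
}
?>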

45. Stress test regularly and before releases. It's a well-known saying in the scalability business that your next bottleneck is hiding just behind your last high-water mark. Data volumes and loads that work fine in the lab should be tested at expected production levels before launch. The Web 2.0 industry is rife with examples of companies that went down the first time they got a good traffic spike. That's the very worst time to fail, since it's your best chance of getting a strong initial network effect, and failing then may forever reduce your ultimate success. Know your volume limits and ceilings with each release and prepare for the worst.
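
For a quick first impression before reaching for a full load-testing suite, Apache's bundled ab (ApacheBench) tool is enough to find obvious ceilings; the URL and the numbers below are placeholders:

# 1,000 requests, 50 at a time, against a representative page
ab -n 1000 -c 50 http://www.example.com/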

46. Backups and disaster recovery: know your plan. This is another unglamorous but essential aspect of any online product. How often are backups being made of all your data? Are the backups tested? Are they kept offsite? If you don't know the answers, the chances that you'll survive a major event are not high.
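
A sketch of the simplest possible answer for a typical LAMP setup is a nightly crontab entry that dumps the database and compresses it. The user, password handling, database name, and paths here are all placeholders, and a backup only counts once you have actually restored from it successfully:

# Nightly 2 a.m. MySQL dump; credentials, database name, and paths are placeholders
0 2 * * * mysqldump -u backup_user -pSECRET example_db | gzip > /backups/example_db-$(date +\%F).sql.gz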

47. Good Web products understand that there is more than the Web. Do you have a desktop widget for Vista or the Mac? Might you benefit from offering an Adobe AIR client version of your application? How about integration and representation in virtual worlds and games? How about linkages to RFID or GPS sensors? Startups thinking outside the box might even create their own hardware device if it makes sense (see Chumby and the iPhone/iPod for examples). If one thing is certain, it's that the next generation of successful Web startups will only partially resemble what we see today. Successful new online products will take advantage of “software above the level of a single device” and deliver compelling combinations of elements into entirely new products that are as useful and innovative as they are unexpected. A great Web 2.0 product often has a traditional Web application as only part of its overall design; see the Doritos Crash the Super Bowl campaign for just one small example of this.

48. Look for emerging areas on the edge of the Web. These are the spaces that have plenty of room for new players and new ideas, where network effects aren't overly established and marketshare is there for the taking. What spaces are these? The Semantic Web seems to be coming back with all-new approaches (I continue to be amazed at how much appears about this topic on http://delicious.com/popular/web3.0 these days). Open platform virtual worlds such as Second Life were hot a few years ago and may be again. Mobile Web applications are extremely hot today but slated to get overcrowded this year as everyone plans a mobile application for phone platforms. What is coming after this? That is less clear, but those who are watching closely will benefit the most.

49. Plan to evolve over time, for a long time. The Web never sits still. Users change, competitors improve, what’s possible continues to expand as new capabilities emerge in the software and hardware landscape. In the Perpetual Beta era, products are never really done. Never forget that, continue to push yourself, or be relegated to a niche in history.

50. Continually improve yourself and your Web 2.0 strategies. While process improvement is one of those lip-service topics that most people will at least admit to aspiring to, few have the time and resources to carry it out on a regular basis. But without that introspection on our previous experience we wouldn't have many of the “aha” moments that drove our industry forward at various points in time. Without explicit attempts at improvement, we might not have developed the ideas that became object-oriented languages, search engine marketing, Web 2.0, or even the Internet itself. This list itself is about that very process and encapsulates a lot of what we've learned in the last 4 years. Consequently, if you're not sitting down and making your own list from your own experiences, you're much more likely to repeat past history, never mind raise the bar. As I'm often fond of saying: civilization progresses when we take something that was formerly hard to do and make it easy to do. Take the time, capture your lessons learned, and improve your strategies.

What else is missing here? Please contribute your own 2.0 strategies in comments below:

You can also get help with these strategies from a Web 2.0 assessment, get a deeper perspective on these ideas at Web 2.0 University, or attend our upcoming Economics 2.0 workshop at Web 2.0 Expo SF on March 31st, 2009.

Pre-Loading Images

The Cache

If you're not familiar with the idea of a cache, allow me to explain. Every webpage you view, along with its contents, gets saved in a special part of your hard disk called a cache (pronounced "cash"). The next time you visit that page, the images are taken from the cache instead of being downloaded again. This means they appear faster and make way for new things that may have to be downloaded for the first time. You may have had a run through your 'Temporary Internet Files' folder in Windows; that's Internet Explorer's cache. Netscape keeps its cache in its program folder.
So, on your first visit you have to download a load of stuff, but on every visit thereafter you’re just pulling stuff up from the cache. This is also how it’s possible to read websites offline; they’re just being read off your hard drive.

The idea of pre-loading images is to load them and put them in the cache before they’re even needed. This means that when they are called for they’ll appear almost immediately. This property is most important with things like navigation graphics and image rollovers. You can guess what images your reader might need and load them in advance, in the background so they’ll never see it happen.

The Script

So, let’s get the script. This is done in JavaScript, by the way.

<script type="text/javascript">
<!-- hide from non-JavaScript browsers

Image1 = new Image(150,20)
Image1.src = "pic1.gif"

Image2 = new Image(10,30)
Image2.src = "pic2.gif"

Image3 = new Image(72,125)
Image3.src = "pic3.gif"

// End hiding -->
</script>

That script would pre-load images 1 to 3. If you’ve used image flips before you can see that that code has a pre-load built into it. If you’ve never seen one of these run before, here’s what it all means:

  • Image1= new Image is the first step of setting up a new script effect for an image
  • (150,20) are the width and height, respectively, of this image
  • Image1.src = "pic1.gif" gives the source of the image.

Place the script near the top of your page, in the head if you want. This will ensure it runs early as the page loads.


Pre-loading HTML

If you're looking to pre-load HTML files, I'd have to tell you that the JavaScript methods available are not great. You can see them in this tutorial. In any case, the concept of pre-loading a whole page seems a bit strange: why have a large extra download on a page when the reader may not need it? If you must pre-load a page, the method I would use is to open a 1x1 iframe on your page with the next page inside it, as in the snippet below. Clever.
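
For completeness, here is what that hidden-iframe approach might look like; next.html is a placeholder for whichever page you expect the reader to visit next:

<!-- Pre-load next.html into the cache via an effectively invisible iframe -->
<iframe src="next.html" width="1" height="1" frameborder="0" style="visibility: hidden;"></iframe>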

Online Script Generator for Preloading Images

right here: http://javascript.internet.com/generators/preload-images.html

Preload – Rollover Script Generator. Fill in the empty form fields, three for each preload/rollover effect you wish to use. The first field is for the initial image which will be shown on your page. You can use a relative URL for the image, such as alien1.gif, or you can use the full URL, like http://webdeveloper.com/javascript/alien1.gif. The second field is where you’ll place the URL for the rollover image–the one that will be pre-loaded. Again, you can use a relative or a full URL. The third blank is where you’ll place the URL for the hyperlink itself, such as http://www.webdeveloper.com. Continue in this fashion until you have listed all the images you wish to use as mouseovers (you can create five pre-loads/mouseovers using the generator), and then click the button on the bottom of the page.

 

Surviving the Economic Slowdown

Exadel… Everything for Web 2.0

——————————————————————————–

Hello

All of today’s news is focusing on the current global financial crisis. Conserving capital dollars is the primary concern for executives right now. Most companies are looking for savings in every aspect of their business. Organizations are under increasing pressure to make their finances last longer as they work their way through these turbulent times.

How can you do more with less? How can Exadel help you?

In spite of the economic slowdown you can continue to grow your business, and manage your budget by partnering with Exadel. Now that companies are facing liquidity issues and low consumer sentiment, it is time to think about how Exadel’s offshore technical services can help you reduce operating costs.

Partnering with Exadel is an effective approach for:

· Improved cost efficiencies (generally 3:1)

· Higher return on your investment

· Specialized technical expertise

· High quality development

We have a comprehensive technical portfolio including skills in developing Rich Internet Applications (RIA), leveraging technologies like JSF, RichFaces, Seam, Spring and Flex. We know RichFaces better than anyone else; we created it, and continue to create new RichFaces components.

Some of our global clients include Deutsche Bank, Bank of America, Wolters Kluwer, Sears, Samsung, and ABN AMRO Bank, to name a few. Our service offerings include product development, testing and support.

Though apprehension about engaging in new partnerships is understandable given the prospect of a global macroeconomic meltdown, the question you should ask yourself is not whether you can afford to partner with Exadel, but whether you can afford not to.

Thank you,

John Erdiakoff

Exadel Sales

john@exadel.com

925-602-5555

888-4EXADEL

Informal Learning: Is your company using these Web 2.0 tools?

 

 

Harness the power of informal
learning technology

And tap these inexpensive tools for knowledge sharing
& just-in-time learning

Download the white paper now

SumTotal

Power. Simplicity. No compromises.

Web 2.0 isn’t new any more, but its application to corporate learning is new to many organizations. Companies are just discovering the benefits of tapping tools like wikis, blogs, and even social networking to facilitate informal learning between employees.

Download the white paper, “Harnessing the Power of Informal Learning Technology” and learn:

  • How Web 2.0 has shifted gears toward social learning
  • How informal learning drives business benefits
  • Real-world examples of informal learning solutions

Download the white paper now


Special Entities

The following table gives the character entity reference, decimal character reference, and hexadecimal character reference for markup-significant and internationalization characters. Glyphs of the characters are available at the Unicode Consortium.

With the exception of HTML 2.0's &quot;, &amp;, &lt;, and &gt;, these entities are all new in HTML 4.0 and may not be supported by old browsers. Support in recent browsers is good.

Character | Entity | Decimal | Hex
quotation mark = APL quote | &quot; | &#34; | &#x22;
ampersand | &amp; | &#38; | &#x26;
less-than sign | &lt; | &#60; | &#x3C;
greater-than sign | &gt; | &#62; | &#x3E;
Latin capital ligature OE | &OElig; | &#338; | &#x152;
Latin small ligature oe | &oelig; | &#339; | &#x153;
Latin capital letter S with caron | &Scaron; | &#352; | &#x160;
Latin small letter s with caron | &scaron; | &#353; | &#x161;
Latin capital letter Y with diaeresis | &Yuml; | &#376; | &#x178;
modifier letter circumflex accent | &circ; | &#710; | &#x2C6;
small tilde | &tilde; | &#732; | &#x2DC;
en space | &ensp; | &#8194; | &#x2002;
em space | &emsp; | &#8195; | &#x2003;
thin space | &thinsp; | &#8201; | &#x2009;
zero width non-joiner | &zwnj; | &#8204; | &#x200C;
zero width joiner | &zwj; | &#8205; | &#x200D;
left-to-right mark | &lrm; | &#8206; | &#x200E;
right-to-left mark | &rlm; | &#8207; | &#x200F;
en dash | &ndash; | &#8211; | &#x2013;
em dash | &mdash; | &#8212; | &#x2014;
left single quotation mark | &lsquo; | &#8216; | &#x2018;
right single quotation mark | &rsquo; | &#8217; | &#x2019;
single low-9 quotation mark | &sbquo; | &#8218; | &#x201A;
left double quotation mark | &ldquo; | &#8220; | &#x201C;
right double quotation mark | &rdquo; | &#8221; | &#x201D;
double low-9 quotation mark | &bdquo; | &#8222; | &#x201E;
dagger | &dagger; | &#8224; | &#x2020;
double dagger | &Dagger; | &#8225; | &#x2021;
per mille sign | &permil; | &#8240; | &#x2030;
single left-pointing angle quotation mark | &lsaquo; | &#8249; | &#x2039;
single right-pointing angle quotation mark | &rsaquo; | &#8250; | &#x203A;
euro sign | &euro; | &#8364; | &#x20AC;

Entities for Symbols and Greek Letters — Special Characters

The following table gives the character entity reference, decimal character reference, and hexadecimal character reference for symbols and Greek letters. Glyphs of the characters are available at the Unicode Consortium.

These entities are all new in HTML 4.0 and may not be supported by old browsers. Support in recent browsers is good.

Character | Entity | Decimal | Hex
Latin small f with hook = function = florin | &fnof; | &#402; | &#x192;
Greek capital letter alpha | &Alpha; | &#913; | &#x391;
Greek capital letter beta | &Beta; | &#914; | &#x392;
Greek capital letter gamma | &Gamma; | &#915; | &#x393;
Greek capital letter delta | &Delta; | &#916; | &#x394;
Greek capital letter epsilon | &Epsilon; | &#917; | &#x395;
Greek capital letter zeta | &Zeta; | &#918; | &#x396;
Greek capital letter eta | &Eta; | &#919; | &#x397;
Greek capital letter theta | &Theta; | &#920; | &#x398;
Greek capital letter iota | &Iota; | &#921; | &#x399;
Greek capital letter kappa | &Kappa; | &#922; | &#x39A;
Greek capital letter lambda | &Lambda; | &#923; | &#x39B;
Greek capital letter mu | &Mu; | &#924; | &#x39C;
Greek capital letter nu | &Nu; | &#925; | &#x39D;
Greek capital letter xi | &Xi; | &#926; | &#x39E;
Greek capital letter omicron | &Omicron; | &#927; | &#x39F;
Greek capital letter pi | &Pi; | &#928; | &#x3A0;
Greek capital letter rho | &Rho; | &#929; | &#x3A1;
Greek capital letter sigma | &Sigma; | &#931; | &#x3A3;
Greek capital letter tau | &Tau; | &#932; | &#x3A4;
Greek capital letter upsilon | &Upsilon; | &#933; | &#x3A5;
Greek capital letter phi | &Phi; | &#934; | &#x3A6;
Greek capital letter chi | &Chi; | &#935; | &#x3A7;
Greek capital letter psi | &Psi; | &#936; | &#x3A8;
Greek capital letter omega | &Omega; | &#937; | &#x3A9;
Greek small letter alpha | &alpha; | &#945; | &#x3B1;
Greek small letter beta | &beta; | &#946; | &#x3B2;
Greek small letter gamma | &gamma; | &#947; | &#x3B3;
Greek small letter delta | &delta; | &#948; | &#x3B4;
Greek small letter epsilon | &epsilon; | &#949; | &#x3B5;
Greek small letter zeta | &zeta; | &#950; | &#x3B6;
Greek small letter eta | &eta; | &#951; | &#x3B7;
Greek small letter theta | &theta; | &#952; | &#x3B8;
Greek small letter iota | &iota; | &#953; | &#x3B9;
Greek small letter kappa | &kappa; | &#954; | &#x3BA;
Greek small letter lambda | &lambda; | &#955; | &#x3BB;
Greek small letter mu | &mu; | &#956; | &#x3BC;
Greek small letter nu | &nu; | &#957; | &#x3BD;
Greek small letter xi | &xi; | &#958; | &#x3BE;
Greek small letter omicron | &omicron; | &#959; | &#x3BF;
Greek small letter pi | &pi; | &#960; | &#x3C0;
Greek small letter rho | &rho; | &#961; | &#x3C1;
Greek small letter final sigma | &sigmaf; | &#962; | &#x3C2;
Greek small letter sigma | &sigma; | &#963; | &#x3C3;
Greek small letter tau | &tau; | &#964; | &#x3C4;
Greek small letter upsilon | &upsilon; | &#965; | &#x3C5;
Greek small letter phi | &phi; | &#966; | &#x3C6;
Greek small letter chi | &chi; | &#967; | &#x3C7;
Greek small letter psi | &psi; | &#968; | &#x3C8;
Greek small letter omega | &omega; | &#969; | &#x3C9;
Greek small letter theta symbol | &thetasym; | &#977; | &#x3D1;
Greek upsilon with hook symbol | &upsih; | &#978; | &#x3D2;
Greek pi symbol | &piv; | &#982; | &#x3D6;
bullet = black small circle | &bull; | &#8226; | &#x2022;
horizontal ellipsis = three dot leader | &hellip; | &#8230; | &#x2026;
prime = minutes = feet | &prime; | &#8242; | &#x2032;
double prime = seconds = inches | &Prime; | &#8243; | &#x2033;
overline = spacing overscore | &oline; | &#8254; | &#x203E;
fraction slash | &frasl; | &#8260; | &#x2044;
script capital P = power set = Weierstrass p | &weierp; | &#8472; | &#x2118;
blackletter capital I = imaginary part | &image; | &#8465; | &#x2111;
blackletter capital R = real part symbol | &real; | &#8476; | &#x211C;
trade mark sign | &trade; | &#8482; | &#x2122;
alef symbol = first transfinite cardinal | &alefsym; | &#8501; | &#x2135;
leftwards arrow | &larr; | &#8592; | &#x2190;
upwards arrow | &uarr; | &#8593; | &#x2191;
rightwards arrow | &rarr; | &#8594; | &#x2192;
downwards arrow | &darr; | &#8595; | &#x2193;
left right arrow | &harr; | &#8596; | &#x2194;
downwards arrow with corner leftwards = carriage return | &crarr; | &#8629; | &#x21B5;
leftwards double arrow | &lArr; | &#8656; | &#x21D0;
upwards double arrow | &uArr; | &#8657; | &#x21D1;
rightwards double arrow | &rArr; | &#8658; | &#x21D2;
downwards double arrow | &dArr; | &#8659; | &#x21D3;
left right double arrow | &hArr; | &#8660; | &#x21D4;
for all | &forall; | &#8704; | &#x2200;
partial differential | &part; | &#8706; | &#x2202;
there exists | &exist; | &#8707; | &#x2203;
empty set = null set = diameter | &empty; | &#8709; | &#x2205;
nabla = backward difference | &nabla; | &#8711; | &#x2207;
element of | &isin; | &#8712; | &#x2208;
not an element of | &notin; | &#8713; | &#x2209;
contains as member | &ni; | &#8715; | &#x220B;
n-ary product = product sign | &prod; | &#8719; | &#x220F;
n-ary summation | &sum; | &#8721; | &#x2211;
minus sign | &minus; | &#8722; | &#x2212;
asterisk operator | &lowast; | &#8727; | &#x2217;
square root = radical sign | &radic; | &#8730; | &#x221A;
proportional to | &prop; | &#8733; | &#x221D;
infinity | &infin; | &#8734; | &#x221E;
angle | &ang; | &#8736; | &#x2220;
logical and = wedge | &and; | &#8743; | &#x2227;
logical or = vee | &or; | &#8744; | &#x2228;
intersection = cap | &cap; | &#8745; | &#x2229;
union = cup | &cup; | &#8746; | &#x222A;
integral | &int; | &#8747; | &#x222B;
therefore | &there4; | &#8756; | &#x2234;
tilde operator = varies with = similar to | &sim; | &#8764; | &#x223C;
approximately equal to | &cong; | &#8773; | &#x2245;
almost equal to = asymptotic to | &asymp; | &#8776; | &#x2248;
not equal to | &ne; | &#8800; | &#x2260;
identical to | &equiv; | &#8801; | &#x2261;
less-than or equal to | &le; | &#8804; | &#x2264;
greater-than or equal to | &ge; | &#8805; | &#x2265;
subset of | &sub; | &#8834; | &#x2282;
superset of | &sup; | &#8835; | &#x2283;
not a subset of | &nsub; | &#8836; | &#x2284;
subset of or equal to | &sube; | &#8838; | &#x2286;
superset of or equal to | &supe; | &#8839; | &#x2287;
circled plus = direct sum | &oplus; | &#8853; | &#x2295;
circled times = vector product | &otimes; | &#8855; | &#x2297;
up tack = orthogonal to = perpendicular | &perp; | &#8869; | &#x22A5;
dot operator | &sdot; | &#8901; | &#x22C5;
left ceiling = APL upstile | &lceil; | &#8968; | &#x2308;
right ceiling | &rceil; | &#8969; | &#x2309;
left floor = APL downstile | &lfloor; | &#8970; | &#x230A;
right floor | &rfloor; | &#8971; | &#x230B;
left-pointing angle bracket = bra | &lang; | &#9001; | &#x2329;
right-pointing angle bracket = ket | &rang; | &#9002; | &#x232A;
lozenge | &loz; | &#9674; | &#x25CA;
black spade suit | &spades; | &#9824; | &#x2660;
black club suit = shamrock | &clubs; | &#9827; | &#x2663;
black heart suit = valentine | &hearts; | &#9829; | &#x2665;
black diamond suit | &diams; | &#9830; | &#x2666;

Web Site Optimization: 13 Simple Steps

Earlier this year, Steve Souders from the Yahoo! Performance team published a series of front-end performance optimization “rules” for optimizing a page.

This tutorial takes a practical, example-based approach to implementing those rules. It’s targeted towards web developers with a small budget, who are most likely using shared hosting, and working under the various restrictions that come with such a setup. Shared hosts make it harder to play with Apache configuration — sometimes it’s even impossible — so we’ll take a look at what you can do, given certain common restrictions, and assuming your host runs PHP and Apache.

The tutorial is divided into four parts:

  1. basic optimization rules
  2. optimizing assets (images, scripts, and styles)
  3. optimizations specific to scripts
  4. optimizations specific to styles

Credits and Suggested Reading

The article is not going to explain Yahoo!’s performance rules in detail, so you’d do well to read through them on your own for a better understanding of their importance, the reasoning behind the rules, and how they came to be. Here’s the list of rules in question:

  1. Make fewer HTTP requests
  2. Use a Content Delivery Network
  3. Add an Expires header
  4. Gzip components
  5. Put CSS at the top
  6. Move scripts to the bottom
  7. Avoid CSS expressions
  8. Make JavaScript and CSS external
  9. Reduce DNS lookups
  10. Minify JavaScript
  11. Avoid redirects
  12. Remove duplicate scripts
  13. Configure ETags

You can read about these rules on the Yahoo! Developer Network site. You can also check out the book “High Performance Web Sites” by Steve Souders, and the performance research articles on the YUI blog by Tenni Theurer.

Basic Optimization Rules

Decrease Download Sizes

Decreasing download sizes isn’t even in Yahoo!’s list of rules — probably because it’s so obvious. However, I don’t think it hurts to reiterate the point — let’s call it Rule #0.

When we look at a simple web page we see:

  • some HTML code
  • different page components (assets) referenced by the HTML

The assets are images, scripts, styles, and perhaps some external media such as Flash movies or Java applets (remember those?). So, when it comes to download sizes, you should aim to have all the assets as lightweight as possible — advice which also extends to the page’s HTML content. Creating lean HTML code often means using better (semantic) markup, which also overlaps with the SEO (search engine optimization) efforts that are a necessary part of the site creation process. As most professional web developers know, a key characteristic of good markup is that it only describes the content, not the presentation of the page (no layout tables!). Any layout or presentational elements should be moved to CSS.

Here’s an example of a good approach to HTML markup for a navigation menu:

<ul id="menu">
  <li><a href="home.html">Home</a></li>
  <li><a href="about.html">About</a></li>
  <li><a href="contact.html">Contact</a></li>
</ul>

This sort of markup should provide “hooks” to allow for the effective use of CSS and make the menu look however you want it to — whether that means adding fancy bullets, borders, or rollovers, or placing the menu items into a horizontal menu. The markup is minimal, which means there are fewer bytes to download; it’s semantic, meaning it describes the content (a navigation menu is a list of links); and finally, being minimal, it also gives you an SEO advantage: it’s generally agreed that search engines prefer a higher content-to-markup ratio in the pages that they index.

Once you’re sure your markup is lightweight and semantic, you should go through your assets and make sure they are also of minimal size. For example, check whether it’s possible to compress images more without losing too much quality, or to choose a different file format that gives you better compression. Tools such as PNGOUT and pngcrush are a good place to start.

Make Fewer HTTP Requests

Making fewer HTTP requests turns out to be the most important optimization technique, with the biggest impact. If your time is limited, and you can only complete one optimization task, pick this one. HTTP requests are generally the most “expensive” activity that the browser performs while displaying your page. Therefore, you should ensure that your page makes as few requests as possible.

How can you go about that while maintaining the richness of your pages?

  • Combine scripts and style sheets: Do you have a few <script> tags in your head? Well, merge the .js files into one and save your visitors some round trips; then do the same with the CSS files.
  • Use image sprites: This technique allows you to combine several images into one and use CSS to show only the part of the image that’s needed. When you combine five or ten images into a single file, already you’re making a huge saving in the request/response overhead.
  • Avoid redirects: a redirect adds another client-server round trip, so instead of processing your page immediately after receiving the initial response, the browser will have to make another request and wait for the second response.
  • Avoid frames: if you use frames, the browser has to request at least three HTML pages, instead of just one — those of the frameset as well as each of the frames.

You’ve got the basics now. In summary, make your page and its assets smaller in size, and use fewer assets by combining them wherever you can. If you concentrate on this aspect of optimization only, you and your visitors will notice a significant improvement.

Now let’s explore some of the Yahoo! recommendations in more detail, and see what other optimizations can be made to improve performance.

Optimizing Assets

Use a Content Delivery Network

A Content Delivery Network (CDN) is a network of servers in different geographical locations. Each server has a copy of a site’s files. When a visitor to your site requests a file, the file is delivered from the nearest server (or the one that’s experiencing the lightest load at the time).

This setup can have a significant impact on your page’s overall performance, but unfortunately, using a CDN can be pricey. As such, it’s probably not something you’d do for a personal blog, but it may be useful when a client asks you to build a site that’s likely to experience high volumes of traffic. Some of the most widely known CDN providers are Akamai and Amazon, through its S3 service.

There are some non-profit CDNs in the market; check the CDN Wikipedia article to see if your project might qualify to use one of them. For example, one free non-profit peer-to-peer CDN is Coral CDN, which is extremely easy to integrate with your site. For this CDN, you take a URL and append “nyud.net” to the hostname. Here’s an example:

http://example.org/logo.png

becomes:

http://example.org.nyud.net/logo.png

Host Assets on Different Domains but Reduce DNS Lookups

After your visitor’s browser has downloaded the HTML for a page and figured out that a number of components are also needed, it begins downloading those components. Browsers restrict the number of simultaneous downloads that can take place; as per the HTTP/1.1 specification, the limit is two assets per domain.

Because this restriction exists on a per-domain basis, you can use several domains (or simply use subdomains) to host your assets, thus increasing the number of parallel downloads. Most shared hosts will allow you to create subdomains. Even if your host places a limit on the number of subdomains you can create (some restrict you to a maximum of five), it’s not that important, as you won’t need to utilize too many subdomains to see some noticeable performance improvements.

However, as Rule #9 states, you should also reduce the number of DNS lookups, because these can also be expensive. For every domain or subdomain that hosts a page asset, the browser will need to make a DNS lookup. So the more domains you have, the more your site will be slowed down by DNS lookups. Yahoo!’s research suggests that two to four domains is an optimal number, but you can decide for yourself what’s best for your site.

As a general guideline, I’d suggest you use one domain to host HTML pages and two other domains for your assets. Here’s an example:

  • www.sitepoint.com – hosts only HTML (and maybe content images)
  • i1.sitepoint.com – hosts JS, CSS, and some images
  • i2.sitepoint.com – hosts most of the site’s images

Different hosting providers will probably offer different interfaces for creating subdomains, and ideally they should provide you with an option to specify the directory that holds the files for the subdomain. For example, if your canonical domain is www.sitepoint.com, and it points to /home/sitepoint/htdocs, ideally you should be able to create the subdomain i1.sitepoint.com (either via an administration control panel or by creating a symbolic link in the file system) and point it to the same folder, /home/sitepoint/htdocs. This way, you can keep all files in the same location, just as they are in your development environment, but reference them using a subdomain.

However, some hosts may prevent you from creating subdomains, or may restrict your ability to point to particular locations on the file system. In such cases, your only real option is to physically copy the assets to the new location. Don’t be tempted to create some kind of redirect in this case — it will only make things worse, as it creates two requests for each image.

If your hosting provider doesn’t allow subdomains at all, you always have the option of buying more domains and using them purely to host assets — after all, that’s what a lot of big sites do. Yahoo! uses the domain yimg.com, Amazon has images-amazon.com, and SitePoint has sitepointstatic.com. If you own several sites, or manage the hosting of your client’s sites, you might consider buying two domains, such as yourdomain-i1.com and yourdomain-i2.com, and using them to host the components for all the sites you maintain.

Place Assets on a Cookie-free Domain

If you set a lot of cookies, the request headers for your pages will increase in size, since those cookies are sent with each request. Additionally, your assets probably don’t use the cookies, so all of this information could be repeatedly sent to the client for no reason. Sometimes, those headers may even be bigger than the size of the asset requested — these are extreme cases of course, but it happens. Consider downloading those small icons or smilies that are less than half a kB, and requesting them with 1kB worth of HTTP headers.

If you use subdomains to host your assets, you need to make sure that the cookies you set are for your canonical domain name (e.g. www.example.org) and not for the top-level domain name (e.g. example.org). This way, your asset subdomains will be cookie-free. If you’re attempting to improve the performance of an existing site, and you’ve already set your cookies on the top-level domain, you could consider the option of hosting assets on new domains, rather than subdomains.
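
For example, if your application sets cookies from PHP, a minimal sketch of scoping them to the canonical host might look like this (the cookie name and value here are purely illustrative):

<?php
// Scope the cookie to the full www host name so that asset subdomains
// such as i1.example.org and i2.example.org stay cookie-free.
// Must be called before any output is sent.
setcookie(
    'member_id',        // example cookie name
    $memberId,          // value set elsewhere in your application
    time() + 86400,     // expires in one day
    '/',                // path
    'www.example.org'   // canonical host, NOT .example.org
);
?>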

Split the Assets Among Domains

It’s completely up to you which assets you decide to host on i1.example.org and which you decide to host on i2.example.org — there’s no clear directive on this point. Just make sure you don’t randomize the domain on each request, as this will cause the same assets to be downloaded twice — once from i1 and once from i2.

You could aim to split your assets evenly by file size, or by some other criterion that makes sense for your pages. You may also choose to put all content images (those that are included in your HTML with <img /> tags) on i1 and all layout images (those referenced by CSS’s background-image:url()) on i2, although in some cases this solution may not be optimal. In such cases, the browser will download and process the CSS files and then, depending on which rules need to be applied, will selectively download only images that are needed by the style sheet. The result is that the images referenced by CSS may not download immediately, so the load on your asset servers may not be balanced.

The best way to decide on splitting assets is by experimentation; you can use Firebug’s Net panel to monitor the sequence in which assets download, then decide how you should spread components across domains in order to speed up the download process.
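
If you’d like the split to happen automatically while staying stable between requests, one possible approach (a sketch only, with placeholder host names) is to derive the host from the file name:

<?php
// Map a file name to one of the asset subdomains deterministically,
// so the same asset is always requested from the same host.
function asset_host($filename) {
    $hosts = array('i1.example.org', 'i2.example.org');
    $index = abs(crc32($filename)) % count($hosts);
    return 'http://' . $hosts[$index];
}

// example use
echo '<img src="' . asset_host('logo.png') . '/logo.png" alt="Logo" />';
?>

Because the mapping depends only on the file name, a given image never alternates between hosts, so it is downloaded and cached just once.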

Configure DNS Lookups on Forums and Blogs

Since you should aim to have no more than four DNS lookups per page, it may be tricky to integrate third-party content such as Flickr images or ads that are hosted on a third-party server. Also, hotlinking images (by placing on your page an <img /> tag whose src attribute points to a file on another person’s server) not only steals bandwidth from the other site, but also harms your own page’s performance, causing an extra DNS lookup.

If your site contains user-generated content (as do forums, for example), you can’t easily prevent multiple DNS lookups, since users could potentially post images located anywhere on the Web. You could write a script that copies each image from a user’s post to your server, but that approach can get fairly complicated.

Aim for the low-hanging fruit. For example, the phpBB forum software lets you configure whether users hotlink their avatar images or upload them to your server. Requiring uploaded avatars will result in better performance for your site.

Use the Expires Header

For best performance, your static assets should be exactly that: static. This means that there should be no dynamically generated scripts or styles, or <img> tags pointing to scripts that generate dynamic images. If you had such a need — for example, you wanted to generate a graphic containing your visitor’s username — the dynamic generation could be taken “offline” and the result cached as a static image. In this example, you could generate the image once, when the member signs up. You could then store the image on the file system, and write the path to the image in your database. An alternative approach might involve scheduling an automated process (a cron job, in UNIX) that generates dynamic components and saves them as static files.
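
As a rough sketch of the sign-up example, the one-off generation step could use PHP’s GD extension (assuming GD is available on your host; the canvas size, text, and path below are placeholders):

<?php
// Run once, at sign-up time -- not on every page view.
$username = 'newmember';                            // example value
$image    = imagecreatetruecolor(200, 30);          // 200x30 px canvas
$bg       = imagecolorallocate($image, 255, 255, 255);
$fg       = imagecolorallocate($image, 0, 0, 0);
imagefilledrectangle($image, 0, 0, 199, 29, $bg);
imagestring($image, 4, 5, 8, 'Welcome, ' . $username, $fg);

// Save as a static file; store this path in the database and
// reference it like any other static image from then on.
imagepng($image, '/home/sitepoint/htdocs/img/welcome_' . $username . '.png');
imagedestroy($image);
?>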

Having assets that are entirely static allows you to set the Expires header for those files to a date that is far in the future, so that when an asset is downloaded once, it’s cached by the browser and never requested again (or at least not for a very long time, as we’ll see in a moment).

Setting the Expires header in Apache is easy: add an .htaccess file that contains the following directives to the root folder of your i1 and i2 subdomains:

ExpiresActive On
ExpiresDefault "modification plus 10 years"

The first of these directives enables the generation of the Expires header. The second sets the expiration date to 10 years after the file’s modification date, which translates to 10 years after you copied the file to the server. You could also use the setting “access plus 10 years”, which will expire the file 10 years after the user requests the file for the first time.

If you want, you can even set an expiration date per file type:

ExpiresActive On
ExpiresByType application/x-javascript "modification plus 2 years"
ExpiresByType text/css "modification plus 5 years"

For more information, check the Apache documentation on mod_expires.

Name Assets

The problem with the technique that we just looked at (setting the Expires header to a date that’s far into the future) occurs when you want to modify an asset on that page, such as an image. If you just upload the changed image to your web server, new visitors will receive the updated image, but repeat visitors won’t. They’ll see the old cached version, since you’ve already instructed their browser never to ask for this image again.

The solution is to modify the asset’s name — but it comes with some maintenance hurdles. For example, if you have a few CSS definitions pointing to img.png, and you modify the image and rename it to img2.png, you’ll have to locate all the points in your style sheets at which the file has been referenced, and update those as well. For bigger projects, you might consider writing a tool to do this for you automatically.

You’ll need to come up with a naming convention to use when naming your assets. For example, you might:

  • Append an epoch timestamp to the file name, e.g. img_1185403733.png.
  • Use the version number from your source control system (cvs or svn for example), e.g. img_1.1.png.
  • Manually increment a number in the file name (e.g. when you see a file named img1.png, simply save the modified image as img2.png).

There’s no one right answer here — your decision will depend on your personal preference, the specifics of your pages, the size of the project and your team, and so on.
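
If you go with the timestamp convention, a small helper along these lines can build the versioned name for you (getVersionedName is just an illustrative name, and the file must exist on disk for filemtime to work):

<?php
// img.png becomes something like img_1185403733.png
function getVersionedName($file) {
    $info = pathinfo($file);
    return $info['filename'] . '_' . filemtime($file) . '.' . $info['extension'];
}

// example use
$new_file = getVersionedName('img.png');
copy('img.png', $new_file);   // deploy the renamed copy alongside the original
?>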

If you use CVS, here’s a little PHP function that can help you extract the version from a file stored in CVS:

function getVersion($file) {
    $cmd = 'cvs log -h %s';
    $cmd = sprintf($cmd, $file);
    exec($cmd, $res);
    $version = trim(str_replace('head: ', '', $res[3]));
    return $version;
}

// example use
$file = 'img.png';
$new_file = 'img_' . getVersion($file) . '.png';

Serve gzipped Content

Most modern browsers understand gzipped (compressed) content, so a well-performing page should aim to serve all of its content compressed. Since most images, swf files and other media files are already compressed, you don’t need to worry about compressing them.

You do, however, need to take care of serving compressed HTML, CSS, client-side scripts, and any other type of text content. If you make XMLHttpRequests to services that return XML (or JSON, or plain text), make sure your server gzips this content as well.

If you open the Net panel in Firebug (or use LiveHTTPHeaders or some other packet sniffer), you can verify that the content is compressed by looking for a Content-Encoding header in the response, as shown in the following example:

Example request:

GET /2.2.2/build/utilities/utilities.js HTTP/1.1
Host: yui.yahooapis.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.5) Gecko/20070713 Firefox/2.0.0.5
Accept-Encoding: gzip,deflate

Example response:

HTTP/1.x 200 OK
Last-Modified: Wed, 18 Apr 2007 17:36:33 GMT
Vary: Accept-Encoding
Content-Type: application/x-javascript
Content-Encoding: gzip
Cache-Control: max-age=306470616
Expires: Sun, 16 Apr 2017 00:01:52 GMT
Date: Mon, 30 Jul 2007 21:18:16 GMT
Content-Length: 22657
Connection: keep-alive

In this request, the browser informed the server that it understands gzip and deflate encodings (Accept-Encoding: gzip,deflate) and the server responded with gzip-encoded content (Content-Encoding: gzip).

There’s one gotcha when it comes to serving gzipped content: you must make sure that proxies do not get in your way. If an ISP’s proxy caches your gzipped content and serves it to all of its customers, chances are that someone with a browser that doesn’t support compression will receive your compressed content.

To avoid this you can use the Vary: Accept-Encoding response header to tell the proxy to cache this response only for clients that send the same Accept-Encoding request header. In the example above, the browser said it supports gzip and deflate, and the server responded with some extra information for any proxy between the server and client, saying that gzip-encoded content is okay for any client that sends the same Accept-Encoding content.

There is one additional problem here: some browsers (IE 5.5, IE 6 SP 1, for instance) claim they support gzip, but can actually experience problems reading it (as described on the Microsoft downloads site, and the support site). If you care about people using these browsers (they usually account for less than 1% of a site’s visitors) you can use a different header — Cache-Control: Private — which eliminates proxy caching completely. Another way to prevent proxy caching is to use the header Vary: *.

To gzip or to Deflate?

If you’re confused by the two Accept-Encoding values that browsers send, think of deflate as being just another method for encoding content that’s less popular among browsers. It’s also less efficient, so gzip is preferred.

Make Sure you Send gzipped Content

Okay, now let’s see what you can do to start serving gzipped content in accordance with what your host allows.

Option 1: mod_gzip for Apache Versions Earlier than 2

If you’re using Apache 1.2 or 1.3, the mod_gzip module is available. To verify the Apache version, you can check Firebug’s Net panel and look for the Server response header of any request. If you can’t see it, check your provider’s documentation or create a simple PHP script to echo this information to the browser, like so:

<?php echo apache_get_version(); ?>

In the Server header signature, you might also be able to see the mod_gzip version, if it’s installed. It might look something like this:

Server: Apache/1.3.37 (Unix) mod_gzip/1.3.26.1a.....

Okay, so we’ve established that we want to compress all text content, PHP script output, static HTML pages, JavaScripts and style sheets before sending them to the browser. To implement this with mod_gzip, create in the root directory of your site an .htaccess file that includes the following:

mod_gzip_on Yes
mod_gzip_item_include mime ^application/x-javascript$
mod_gzip_item_include mime ^application/json$
mod_gzip_item_include mime ^text/.*$
mod_gzip_item_include file .html$
mod_gzip_item_include file .php$
mod_gzip_item_include file .js$
mod_gzip_item_include file .css$
mod_gzip_item_include file .txt$
mod_gzip_item_include file .xml$
mod_gzip_item_include file .json$
Header append Vary Accept-Encoding

The first line enables mod_gzip. The next three lines set compression based on MIME-type. The next section does the same thing, but on the basis of file extension. The last line sets the Vary header to include the Accept-Encoding value.

If you want to send the Vary: * header, use:

Header set Vary *

Note that some hosting providers will not allow you to use the Header directive. If this is the case, hopefully you should be able to substitute the last line with this one:

mod_gzip_send_vary On

This will also set the Vary header to Accept-Encoding.

Be aware that there might be a minimum size condition on gzip, so if your files are too small (less than 1kb, for example), they might not be gzipped even though you’ve configured everything correctly. If this problem occurs, your host has decided that the gzipping process overhead is unnecessary for very small files.

Option 2: mod_deflate for Apache 2.0

If your host runs Apache 2 you can use mod_deflate. Despite its name, mod_deflate also uses gzip compression. To configure mod_deflate, add the following directives to your .htaccess file:

AddOutputFilterByType DEFLATE text/html text/css text/plain text/xml application/x-javascript application/json
Header append Vary Accept-Encoding

Option 3: php.ini

Ideally we’d like Apache to handle the gzipping of content, but unfortunately some hosting providers might not allow it. If your hosting provider is one of these, it might allow you to use custom php.ini files. If you place a php.ini file in a directory, it overrides the PHP configuration settings for that directory and its subdirectories.

If you can’t use Apache’s mod_gzip or mod_deflate modules, you might still be able to compress your content using PHP. In order for this solution to work, you’ll have to configure your web server so that all static HTML, JavaScript and CSS files are processed by PHP. This means more overhead for the server, but depending on your host, it might be your only option.

Add the following directives in your .htaccess file:

AddHandler application/x-httpd-php .css
AddHandler application/x-httpd-php .html
AddHandler application/x-httpd-php .js

This will ensure that PHP will process these (otherwise static) files. If it doesn’t work, you can try renaming the files to have a .php extension (like example.js.php, and so on) to achieve the same result.

Now create a php.ini file in the same directory with the following content:

[PHP]
zlib.output_compression = On
zlib.output_compression_level = 6
auto_prepend_file = "pre.php"
short_open_tag = 0

This enables compression and sets the compression level to 6. Values for the compression level range from 0 to 9, where 9 is the best (and slowest) compression. The auto_prepend_file setting nominates a file called pre.php to be executed at the beginning of every script, as if you had typed <?php include "pre.php"; ?> at the top of every script. You’ll need this file in order to set Content-Type headers, because some browsers might not like it when you send a CSS file that has, for example, a text/html content type header.

The short_open_tag setting is there to disable PHP short tags (<? ... ?>, as compared to <?php ... ?>). This is important because PHP will attempt to treat the <?xml tag in your HTML as PHP code.

Finally, create the file pre.php with the following content:

<?php
$path = pathinfo($_SERVER['SCRIPT_NAME']);
if ($path['extension'] == 'css') {
    header('Content-type: text/css');
}
if ($path['extension'] == 'js') {
    header('Content-type: application/x-javascript');
}
?>

This script will be executed before every file that has a .php, .html, .js or .css file extension. For HTML and PHP files, the default Content-Type text/html is okay, but for JavaScript and CSS files, we change it using PHP’s header function.

Option 3 (Variant 2): PHP Settings in .htaccess

If your host allows you to set PHP settings in your .htaccess file, then you no longer need a php.ini file to configure your compression settings. Instead, set the PHP settings in .htaccess using php_value (and php_flag).

Looking at the modified example from above, we would have the same pre.php file, no php.ini file, and a modified .htaccess that contained the following directives:

AddHandler application/x-httpd-php .css
AddHandler application/x-httpd-php .html
AddHandler application/x-httpd-php .js

php_flag zlib.output_compression on
php_value zlib.output_compression_level 6
php_value auto_prepend_file "pre.php"
php_flag short_open_tag off

Option 4: In-script Compression

If your hosting provider allows neither php_value directives in your .htaccess file nor a custom php.ini file, your last resort is to modify the scripts to manually include the common pre.php file that will take care of the compression. This is the least-preferred option, but sometimes you may have no other alternative.

If this is your only option, you’ll either be using an .htaccess file that contains the directives outlined in Option 3 above, or you’ll have had to rename every .js and .css file (and .xml, .html, etc.) to have a .php extension. At the top of every file, add <?php include "pre.php"; ?> and create a file called pre.php that contains the following content:

<?php
ob_start("ob_gzhandler");
$path = pathinfo($_SERVER['SCRIPT_NAME']);
if ($path['extension'] == 'css') {
    header('Content-type: text/css');
}
if ($path['extension'] == 'js') {
    header('Content-type: application/x-javascript');
}
?>

As I indicated, this is the least favorable option of all — you should try Option 1 or 2 first, and if they don’t work, consider Option 3 or 4, or a combination of both, depending on what your host allows.

Once you’ve established the degree of freedom your host permits, you can use the technique that you’ve employed to compress your static files to implement all of your Apache-related settings. For example, earlier I showed you how to set the Expires header. Well, guess what? Some hosts won’t allow it. If you find yourself in this situation, you can use PHP’s header function to set the Expires header from your PHP script.

To do so, you might add to your pre.php file something like this:

<?php
header("Expires: Mon, 25 Dec 2017 05:00:00 GMT");
?>
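
If you’d rather not hard-code a date, a variation on the same idea computes an expiry relative to the current time and sets a matching Cache-Control header (the one-year figure is only an example):

<?php
// One year, expressed in seconds.
$oneYear = 365 * 24 * 60 * 60;
header('Expires: ' . gmdate('D, d M Y H:i:s', time() + $oneYear) . ' GMT');
header('Cache-Control: max-age=' . $oneYear);
?>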

Disable ETags

Compared to the potential hassles that can be encountered when implementing the rule above, the application of this rule is very easy. You just need to add the following to your .htaccess file:

FileETag None

Note that this rule applies to sites that are in a server farm. If you’re using a shared host, you could skip this step, but I recommend that you do it regardless because:

  • Hosts change their machines for internal purposes.
  • You may change hosts.
  • It’s so simple.

Use CSS Sprites

Using a technique known as CSS sprites, you can combine several images into a single image, then use the CSS background-position property to show only the image you need. The technique is not intended for use with content images (those that appear in the HTML in <img /> tags, such as photos in a photo gallery), but is intended for use with ornamental and decorative images. These images will not affect the fundamental usability of the page, and are usually referenced from a style sheet in order to keep the HTML lean (Rule #0).

Let’s look at an example. We’ll take two images. The first is help.png; the second is rss.png. From these, we’ll create a third image, sprite.png, which contains both images.

Combining two image files into a single image

The resulting image is often smaller in size than the sum of the two files’ sizes, because the overhead associated with an image file is included only once. To display the first image, we’d use the following CSS rule:

#help {
    background-image: url(sprite.png);
    background-position: -8px -8px;
    width: 16px;
    height: 16px;
}

To display the second image, we’d use the following rule:

#rss {
    background-image: url(sprite.png);
    background-position: -8px -40px;
    width: 16px;
    height: 16px;
}

At first glance, this technique might look a bit strange, but it’s really useful for decreasing the number of HTTP requests. The more images you combine this way, the better, because you’re cutting the request overhead dramatically. For an example of this technique in use “in the wild”, check out the sprite images used on the Yahoo! and Google homepages.

In order to produce sprite images quickly, without having to calculate pixel coordinates, feel free to use the CSS Sprites Generator tool that I’ve developed. And for more information about CSS sprites, be sure to read Dave Shea’s article, titled CSS Sprites: Image Slicing’s Kiss of Death.

Use Post-load Pre-loading and Inline Assets

If you’re a responsible web developer, you’re probably already adhering to the separation of concerns and using HTML for your content, CSS for presentation and JavaScript for behavior. While these distinct parts of a page should be kept in separate files at all times, for performance reasons you might sometimes consider breaking the rule on your index (home) page. The homepage should always be the fastest page on your site — many first-time visitors may leave your site, no matter what content it contains, if they find the homepage slow to load.

When a visitor arrives at your homepage with an empty cache, the fastest way to deliver the page is to have only one request and no separate components. This means having scripts and styles inline (gasp)! It’s actually possible to have inline images as well (although it’s not supported in IE) but that’s probably taking things too far. Apart from being semantically incorrect, using inline scripts and styles prevents those components from being cached, so a good strategy will be to load components in the background after the home page has loaded — a technique with the slightly confusing name of post-load preloading. Let’s see an example.

Let’s suppose that the file containing your homepage is named home.html, that numerous other HTML files containing content are scattered throughout your site, and that all of these content pages use a JavaScript file, mystuff.js, of which only a small part is needed by the homepage.

Your strategy might be to take the part of the JavaScript that’s used by the homepage out of mystuff.js and place it inline in home.html. Then, once home.html has completed loading, make a behind-the-scenes request to pre-load mystuff.js. This way, when the user hits one of your content pages, the JavaScript has already been delivered to the browser and cached.

Once again, this technique is used by some of the big boys: both Google and Yahoo! have inline scripts and styles on their homepages, and they also make use of post-load preloading. If you visit Google’s homepage, it loads some HTML and one single image — the logo. Then, once the home page has finished loading, there is a request to get the sprite image, which is not actually needed until the second page loads — the one displaying the search results.

The Yahoo search page performs conditional pre-loading — this page doesn’t automatically load additional assets, but waits for the user to start typing in the search box. Once you’ve begun typing, it’s almost guaranteed that you’ll submit a search query. And when you do, you’ll land on a search results page that contains some components that have already been cached for you.

Preloading an image can be done with a simple line of JavaScript:

new Image().src='image.png';

For preloading JavaScript files, use the DOM to create a new <script> tag, like so:

var js = document.createElement('script');
js.src = 'mystuff.js';
document.getElementsByTagName('head')[0].appendChild(js);

Here’s the CSS version:

var css = document.createElement('link');
css.href = 'mystyle.css';
css.rel = 'stylesheet';
document.getElementsByTagName('head')[0].appendChild(css);

In the first example, the image is requested but never used, so it doesn’t affect the current page. In the second example, the script is added to the page, so as well as being downloaded, it will be parsed and executed. The same goes for the CSS — it, too, will be applied to the page. If this is undesirable, you can still pre-load the assets using XMLHttpRequest.

JavaScript Optimizations

Before diving into the JavaScript code and micro-optimizing every function and every loop, let’s first look at what big-picture items we can tackle easily that might have a significant impact on a site’s performance. Here are some guidelines for improving the impact that JavaScript files have on your site’s performance:

  1. Merge .js files.
  2. Minify or obfuscate scripts.
  3. Place scripts at the bottom of the page.
  4. Remove duplicates.

Merge .js Files

As per the basic rules, you should aim for your JavaScripts to make as few requests as possible; ideally, this also means that you should have only one .js file. This task is as simple as taking all .js script files and placing them into a single file.

While a single-file approach is recommended in most cases, sometimes you may derive some benefit from having two scripts — one for the functionality that’s needed as soon as the page loads, and another for the functionality that can wait for the page to load first. Another situation in which two files might be desirable is when your site makes use of a piece of functionality across multiple pages — the shared scripts could be stored in one file (and thus cached from page to page), and the scripts specific to that one page could be stored in the second file.

Minify or Obfuscate Scripts

Now that you’ve merged your scripts, you can go ahead and minify or obfuscate them. Minifying means removing everything that’s not necessary — such as comments and whitespace. Obfuscating goes one step further and involves renaming and rearranging functions and variables so that their names are shorter, making the script very difficult to read. Obfuscation is often used as a way of keeping JavaScript source a secret, although if your script is available on the Web, it can never be 100% secret. Read more about minification and obfuscation in Douglas Crockford’s helpful article on the topic.

In general, if you gzip the JavaScript, you’ll already have made a huge gain in file size, and you’ll only obtain a small additional benefit by minifying and/or obfuscating the script. On average, gzipping alone can result in savings of 75-80%, while gzipping and minifying can give you savings of 80-90%. Also, when you’re changing your code to minify or obfuscate, there’s a risk that you may introduce bugs. If you’re not overly worried about someone stealing your code, you can probably forget obfuscation and just merge and minify, or even just merge your scripts only (but always gzip them!).

An excellent tool for JavaScript minification is JSMin, which also has a PHP port, among others. One obfuscation tool is Packer — a free online tool that, incidentally, is used by jQuery.

Changing your code in order to merge and minify should become an extra, separate step in the process of developing your site. During development, you should use as many .js files as you see fit, and then when the site is ready to go live, substitute your “normal” scripts with the merged and minified version. You could even develop a tool to do this for you automatically. Below, I’ve included an example of a small utility that does just this. It’s a command-line script that uses the PHP port of JSMin:

<?php
include 'jsmin.php';

array_shift($argv);
foreach ($argv as $file) {
    echo '/* ', $file, ' */';
    echo JSMin::minify(file_get_contents($file)), "\n";
}
?>

Really simple, isn’t it? You can save it as compress.php and run it as follows:

$ php compress.php source1.js source2.js source3.js > result.js

This will combine and minify the files source1.js, source2.js, and source3.js into one file, called result.js.

The script above is useful when you merge and minify as a step in the site deployment process. Another, lazier option is to do the same on the fly — check out Ed Eliot’s blog post, and this blog post by SitePoint’s Paul Annesley for some ideas.
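
To give you a feel for the on-the-fly approach, here’s a bare-bones sketch of a combiner script (the combine.php name and the file whitelist are assumptions; a real version would also send Expires headers and probably minify):

<?php
// combine.php -- serve several scripts as one gzipped response.
// Reference it with <script type="text/javascript" src="combine.php"></script>
ob_start('ob_gzhandler');
header('Content-type: application/x-javascript');

$files = array('source1.js', 'source2.js', 'source3.js');  // fixed whitelist, not user input
foreach ($files as $file) {
    echo "/* $file */\n";
    readfile($file);
    echo "\n";
}
?>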

Many third-party JavaScript libraries are provided in their uncompressed form as well as in a minified version. You can therefore download and use the minified versions provided by the library’s creator, and then only worry about your own scripts. Something to keep in mind is the licensing of any third-party library that you use. Even though you might have combined and minified all of your scripts, you should still retain the copyright notices of each library alongside the code.

Place Scripts at the Bottom of the Page

The third rule of thumb to follow regarding JavaScript optimization is that the script should be placed at the bottom of the page, as close to the ending </body> tag as possible. The reason? Well, due to the nature of scripts (they could potentially change anything on a page), browsers block all other downloads when they encounter a <script> tag. So until a script is downloaded and parsed, no other downloads will be initiated.

Placing the script at the bottom is a way to avoid this negative blocking effect. Another reason to have as few <script> tags as possible is that the browser initiates its JavaScript parsing engine for every script it encounters. This can be expensive, and therefore parsing should ideally only occur once per page.

Remove Duplicates

Another guideline regarding JavaScript is to avoid including the same script twice. It may sound like strange advice (why would you ever do this?) but it happens: if, for example, a large site used multiple server-side includes that included JavaScript files, it’s conceivable that two of these might double up. The duplicate script would cause the browser’s parsing engine to be started twice and possibly (in some IE versions) even request the file for the second time. Duplicate scripts might also be an issue when you’re using third party libraries. Let’s suppose you had a carousel widget and a photo gallery widget that you downloaded from different sites, and they both used jQuery. In this case you’d want to make sure that you didn’t include jQuery twice by mistake. Also, if you use YUI, make sure you don’t include a library twice by including, for example, the DOM utility (dom-min.js), the Event utility (event-min.js) and the utilities.js library, which contains both DOM and Event.
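
If your pages are assembled from many server-side includes, one way to guard against accidental duplicates is to route every script include through a single helper that remembers what it has already printed. Here’s a sketch (script_tag is just an illustrative name):

<?php
// Emit a <script> tag only the first time a given URL is requested.
function script_tag($src) {
    static $seen = array();
    if (isset($seen[$src])) {
        return '';   // already on the page -- skip it
    }
    $seen[$src] = true;
    return '<script type="text/javascript" src="' . $src . '"></script>';
}

// example use: the second call outputs nothing
echo script_tag('jquery.js');
echo script_tag('jquery.js');
?>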

CSS Optimizations

Merge and Minify

For your CSS files you can follow the guidelines we discussed for JavaScripts: minify and merge all style sheets into a single file to minimize download size and the number of HTTP requests taking place. Merging all files into one is a trivial task, but the job of minification may be a bit harder, especially if you’re using CSS hacks to target specific browsers — since some hacks exploit parsing bugs in the browsers, they might also trick your minifier utility.

You may decide not to go through the hassle of minifying style sheets (and the associated re-testing after minification). After all, if you decide to serve the merged and gzipped style sheet, that’s already a pretty good optimization.

If you do decide to minify CSS, apart from the option of minifying manually (simply removing comments and whitespace), you can use some of the available tools, such as CSSTidy, PEAR’s HTML_CSS library (http://pear.php.net/package/HTML_CSS/), or SitePoint’s own Dust-me Selectors Firefox plugin.
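
As a very rough illustration of what such a tool does, the sketch below concatenates a few style sheets and strips comments and extra whitespace. It’s deliberately naive (it will mangle some CSS hacks), so treat it as a starting point and re-test, or rely on the tools above; the file names are examples only.

<?php
$files  = array('layout.css', 'typography.css', 'colors.css');
$output = '';
foreach ($files as $file) {
    $output .= file_get_contents($file);
}
$output = preg_replace('!/\*.*?\*/!s', '', $output);   // strip /* ... */ comments
$output = preg_replace('/\s+/', ' ', $output);         // collapse whitespace
file_put_contents('all.min.css', trim($output));
?>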

Place Styles at the Top of the Page

Your single, gzipped (and optionally minified) style sheet is best placed at the beginning of the HTML file, in the <head> section — which is where you’d usually put it anyway. The reason is that most browsers (Opera is an exception) won’t render anything on the page until all the style sheets are duly downloaded and parsed. Additionally, none of the images referenced from the CSS will be downloaded unless the CSS parsing is complete. So it’s better to include the CSS as early in the page as possible.

You might think about distributing images across different domains, though. Images linked from the CSS won’t be downloaded until later, so in the meantime, your page can use the available download window to request content images from the domain that hosts the CSS images and is temporarily “idle”.

Ban Expressions

IE allows JavaScript expressions in CSS, like this one:

#content {
    left: expression(document.body.offsetWidth)
}

You should avoid JavaScript expressions for a number of reasons. First of all, they’re not supported by all browsers. They also harm the “separation of concerns”. And, when it comes to performance, expressions are bad because they’re recalculated every time the page is rendered or resized, or simply when you roll your mouse over the page. There are ways to make expressions less expensive — you can cache values after they’re initially calculated, but you’re probably better off simply to avoid them.

Tools for Performance Optimization

A number of tools can help you in your performance optimization quest. Most importantly, you’d want to monitor what’s happening when the page is loaded, so that you can make informed decisions. Firebug’s Net panel and LiveHTTPHeaders, both mentioned earlier in this article, are good utilities to start with.

Summary

Whew! If you’ve made it this far, you now know quite a lot about how to approach a site optimization project (and more importantly, how to build your next web site with performance in mind). Remember the general rule of thumb that, when it comes to optimization, you should concentrate on the items with the biggest impact, as opposed to “micro-optimizing”.

You may choose not to implement all the recommendations discussed above, but you can still make quite a difference by focusing on the really low-hanging fruit, such as:

  • making fewer HTTP requests by combining components — JavaScript files, style sheets and images (by using CSS Sprites)
  • serving all textual content, including HTML, scripts, styles, XML, JSON, and plain text, in a gzipped format
  • minifying scripts and styles, placing scripts at the bottom of the page and style sheets at the top
  • using separate cookie-free domains for your components

Good luck with your optimization efforts — it’s very rewarding when you see the results!