At a “fork” in the superhighway

Yesterday I was reading this article from TechRepublic, which prompted me to write a blog post for Trust IV. I’ve decided to reproduce the article here so that some of my more regular readers might find it useful.

Browsers have come a long way since Tim Berners-Lee developed his first browser (WorldWideWeb) in 1990. In 1993 NCSA Mosaic became the de facto browser for early adopters of the Internet, and it is credited with popularising the world wide web. Many of today’s browsers have evolved from this early browser, as shown in my simplified* “browser family tree”. (Click image to download .pptx file)

Browser evolution

At the heart of each browser is the browser engine. This is the code that takes content from HTML, CSS, JavaScript and image files and displays it on your screen. The browser engine is invisible to most users, but it matters: changes in browser engine are important for both functional and non-functional testers. Functional testers may come across a website that “works” in one browser but not in another, or that behaves differently across browsers or browser versions. Performance testers may encounter sites that perform well in one browser, but not in another.

This week the Mozilla Foundation announced that it was collaborating with Samsung to develop a new browser engine (Servo) to take advantage of the greater performance offered by multi-core architectures (which are now common in PCs, laptops and even smartphones). At the same time, Google has announced that it will “fork” the WebKit engine (currently used by Chrome, Konqueror and Safari) and develop a new engine (Blink).

Why does this matter to testers (and to you)? Early performance testing was all about making simple requests for page content, then timing how long the web server took to respond to the client request. Web pages were often static, and they rendered quickly in either IE or Firefox (the prevalent browsers from 2000 onwards). Internet connections were slow, and the bulk of the time a user spent waiting was download time.
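To put that in concrete terms, here is a minimal sketch of the “old way” of measuring: request the page and time the download, with no rendering involved. This is my own illustration rather than how any particular tool works, it assumes Python with the requests library, and the URL is a placeholder.

```python
# Minimal sketch of "old way" timing: fetch the page content and measure how
# long the server takes to return it. No parsing or rendering happens here.
# The URL is a placeholder, not one from the original post.
import time
import requests

url = "https://www.example.com/"

start = time.perf_counter()
response = requests.get(url, timeout=30)
elapsed = time.perf_counter() - start

print(f"Status: {response.status_code}")
print(f"Bytes downloaded: {len(response.content)}")
print(f"Download time: {elapsed:.2f} s")  # says nothing about render time
```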

Nowadays things are different. Multiple browsers are available (some optimised for mobile use), which means that the same web server may serve different content depending on the browser or browser version. Some users are likely to be on high-speed internet connections, while others will be using 3G or EDGE on a mobile device. As the number of browsers increases it is still possible to test in “the old way”, but testing in this way is becoming less and less valid. I often find that my clients are interested in realistic page response times for users, rather than simply the time taken to download content.
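As a quick aside on the “different content per browser” point, a sketch like the one below shows how the same URL can return different payloads depending on the User-Agent header. The URL and User-Agent strings are placeholders I have made up for illustration.

```python
# Illustrative sketch: fetch the same URL with different User-Agent headers to
# see whether the server returns different content for desktop vs mobile.
# The URL and User-Agent strings are placeholders only.
import requests

url = "https://www.example.com/"
user_agents = {
    "desktop": "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 Chrome/27.0 Safari/537.36",
    "mobile": "Mozilla/5.0 (iPhone; CPU iPhone OS 6_0 like Mac OS X) AppleWebKit/536.26 Mobile/10A5376e",
}

for label, ua in user_agents.items():
    response = requests.get(url, headers={"User-Agent": ua}, timeout=30)
    print(f"{label}: status={response.status_code}, bytes={len(response.content)}")
```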

As an example of the response-time question, I used a private instance of WebPageTest to measure the page response time for the TrustIV website. For a new user with no cached content, the page downloaded in 1.8 seconds, but it was not visually complete (from a user’s perspective) until 2.9 seconds had elapsed. Which of these response times would (or should) I report if this were a performance test?
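If you want to pull those two numbers programmatically rather than read them off the screenshot below, something like the following sketch should work against a private WebPageTest instance. The endpoint and field names reflect my understanding of the WebPageTest JSON results API and may differ between versions; the host name and test ID are placeholders.

```python
# Hedged sketch: read document-complete time vs visually-complete time from a
# WebPageTest result. Endpoint and field names are assumptions based on the
# public WebPageTest JSON API; host and test ID are placeholders.
import requests

WPT_HOST = "http://my-private-wpt.example.com"   # placeholder private instance
TEST_ID = "130405_XX_ABC"                        # placeholder test ID

result = requests.get(f"{WPT_HOST}/jsonResult.php",
                      params={"test": TEST_ID}, timeout=30).json()

first_view = result["data"]["runs"]["1"]["firstView"]
print(f"Document complete: {first_view['loadTime'] / 1000.0:.1f} s")
print(f"Visually complete: {first_view['visualComplete'] / 1000.0:.1f} s")
```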

WebPageTest screenshot

With low-end performance test tools, all I could report would be the page download times. This is fine in many cases, where a period of iterative testing is carried out to improve performance and find bottlenecks. But what if there’s a problem in client-side code? Unless my test tool takes the time to render the content, I’ll be misreporting true end-user performance to the client.
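One way to capture render time rather than just download time is to drive a real browser and read the W3C Navigation Timing API. The sketch below is illustrative only; it assumes Selenium with a local chromedriver, and the URL is a placeholder.

```python
# Hedged sketch: drive a real browser and read the Navigation Timing API, so
# the measurement includes parsing and rendering work, not just the download.
# Assumes Selenium and chromedriver are installed; the URL is a placeholder.
from selenium import webdriver

driver = webdriver.Chrome()
try:
    driver.get("https://www.example.com/")
    nav_start, response_end, load_end = driver.execute_script(
        "var t = window.performance.timing;"
        "return [t.navigationStart, t.responseEnd, t.loadEventEnd];"
    )
    # responseEnd = last byte of the main document; loadEventEnd = onload fired
    print(f"Download (responseEnd):  {(response_end - nav_start) / 1000.0:.2f} s")
    print(f"Rendered (loadEventEnd): {(load_end - nav_start) / 1000.0:.2f} s")
finally:
    driver.quit()
```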

Choosing a test tool
Some of the higher-end performance test tools, such as LoadRunner, SilkPerformer, SOASTA and NeoLoad, render content using one or other of the browser engines. This gives an indication of true page load times, but not all test tools can do this. It’s important to fully understand your clients’ browser types and the limitations of your test tools before you try to advise your customers on probable end-user response times. This is even more true now that there are six browser engines “in the wild”, rather than the four that we knew about last week.

I’m looking forward to hearing from the major test tool vendors about how they’ll adapt their test tools now we’re at yet another “fork in the road”.

Fork in the road

*I based my “browser family tree” on information from a number of sources, including the Wikipedia “timeline of web browsers” and the wonderful interactive graphic from “evolution of the web”.

Using Buffer to schedule “Tweets”

If you’re a user of social media, you may have heard of the social media tool called Buffer. In case you haven’t, it’s basically a sharing and scheduling tool which “buffers” outbound tweets and then sends them later. I use it for Twitter, but it can also be connected to Google+, Facebook, LinkedIn and so on (if you have a premium account). For me, the free one works just fine.

Buffer differs from other Twitter scheduling apps because you don’t have to schedule each tweet manually; you set a timetable for when you want tweets to go out and Buffer posts them automatically at the predetermined times.

You can fill your buffer in a variety of ways:

  • Emailing tweets to a private email address
  • Via the web interface at https://bufferapp.com
  • From your smartphone
  • Through Firefox, Chrome or Safari browser plugins (not IE … yet).

This is great, and I’ve noticed more activity and engagement from my followers since I started using Buffer. Unfortunately, I couldn’t think of a way to populate multiple Twitter accounts without paying for the premium version… until now.

 

Buffer screenshot

As you can see from the screenshot above, I have access to three Twitter accounts: my own account @richardbishop, my company account @TrustIV and the @VivitWorldwide account. By logging into Twitter in three different browsers, I can stay connected to each of them and use the different plugins to populate my Buffer.

In the screenshot above I used Chrome, Chrome Canary and Firefox, with the Buffer plugin installed in each of them. This allows me to paste tweets between the browsers when I want the same message in multiple accounts, or to post to an individual Twitter feed when I don’t. Each of my accounts has a different schedule, and it is possible to share my non-personal Buffer account with the other users of the @TrustIV and @VivitWorldwide Twitter accounts.

The dangers of extrapolation

As a performance tester I’m often asked to predict the behaviour of a client application based on test results. This is often difficult, something I was reminded of when I saw this xkcd cartoon recently.

Extrapolation

I’m often asked to predict how many users a website will handle based on test results. For example, I may run a test on a cut-down version of a production system: the test system may have 2 web servers while the larger-scale production system has 5.

If our 2-server system can handle 200 users, why isn’t it safe to assume that the 5-server system will handle 500 users?

Test Environment

Using extrapolation to predict the scalability or performance of a system is rarely reliable in performance testing. The diagram below illustrates some of the reasons why.

 

Production environment

 

The “real” network is invariably more complex than the test environment; the network topology or application architecture may be different. On top of this, the patterns of user behaviour may not match predictions, meaning that the performance tests were unrealistic.

A few key differences:

  • Network connection speeds:
    Some users connect via slower mobile networks, which can hold connections open for longer and affect overall system performance. Mobile users may also consume a disproportionate number of connections because of the higher network latencies.
    Other, local users may connect at faster “LAN” speeds. I once worked in an office where a network upgrade brought the Exchange mail servers down. Until the upgrade, the mail servers had their traffic “nicely throttled” by a slow network; once this bottleneck was removed, the servers couldn’t handle the higher throughput.
  • Network load balancing: ideally this should be the same in production and test environments. In my experience it rarely is!
  • Sources of load: In our example test environment, we may only simulate user load on the webserver and database server. But what about other load on these systems? In a world increasingly built on SOA principles, what else is communicating with your database servers, contributing to network traffic or accessing your shared storage?
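To make the non-linearity behind the 200-user/500-user question concrete, here is a rough sketch (my own addition, not part of the original argument) using Gunther’s Universal Scalability Law. The contention and coherency coefficients are invented purely for illustration; in practice you would fit them to measured data.

```python
# Hedged sketch: Gunther's Universal Scalability Law illustrates why capacity
# rarely scales linearly with server count. The sigma (contention) and kappa
# (coherency) coefficients below are made up for illustration only.

def usl_capacity(n, sigma=0.05, kappa=0.01):
    """Relative capacity of n servers versus a single server."""
    return n / (1 + sigma * (n - 1) + kappa * n * (n - 1))

two_server_users = 200
users_per_capacity_unit = two_server_users / usl_capacity(2)

for n in (2, 5):
    print(f"{n} servers -> ~{users_per_capacity_unit * usl_capacity(n):.0f} users")

# With these made-up coefficients, 5 servers support noticeably fewer than 500
# users, because contention and coherency costs grow as the system scales out.
```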

You may feel that in this article I’m arguing against performance testing. Far from it: performance testing is vital; I’m only highlighting some of its pitfalls. Testing will help you to find and fix problems before your “real users” do, which is exactly what you want.

As well as more robust testing, you need to do the following:

  • System monitoring: check real performance against predicted performance.
  • User monitoring: check real behaviour against predicted behaviour.
  • Repeat tests once you have this new data.
  • Test early and test often. Repeat tests over and over again throughout the SDLC.
  • Test your application from “all angles”, consider the use of stubs, test harnesses or service virtualisation technology to supplement your performance test tool.