Saturday, November 16, 2013

Website unusual behavior

An e-commerce website owner normally expects the site to be always available, running smoothly and providing an easy experience for the customer.
With a commercial platform (open source or paid), this is the minimum service level expected.
But as this is not a purely ethereal world, things are a little more complicated: herds of issues can run around and make the journey tougher (or stop it altogether).
All these rock & roll experiences tend to lower the conversion rate (possibly down to zero) and challenge the very essence of this sales channel.

We can sketch a few main categories of unusual behavior (there are many more):
* Technical issues (access, availability, response time…)
* Architecture/connectivity issues (plugin connections, external components rights, data flows…)
* Navigation issues (page sequences, funnel scenarios, backward/forward navigation…)  
* Design issues (device adaptation, browser support, links, calls to action, screen positioning…)
* Unexpected customer behavior (repeating the same action, weird navigation, unusual booked/bought quantities…)

While the e-commerce website is operating, the site owner/manager must be made aware of unusual behavior: not as in Prokofiev’s « Peter and the Wolf », where everything is shouted as if it were the death of the website, but through clear, structured signals that allow acting as fast as possible.
Our last Thursday hack night (thursdhack) focused on how to categorize these problems and find easily manageable solutions for each category.

Most technical problems are quite well managed by a multitude of tools (commercial or open source), processes and checklists.
These include embedded platform watchdogs and added technical components that play the role of a user running scenarios and reporting the misses. But someone still has to understand what is expected, implement the scenarios and build the communication dialog between the tool and the site owner. Most of the available tools are powerful but very complex to implement compared with the very simple needs. A large share of e-commerce websites simply rely on repairing things once they break. Solutions are provided in platform ecosystems (Magento and the watchdog plugin, same with Shopify, Bigcommerce, Prestashop).

Expect the unexpected

Our main goal was to focus on unexpected customer behavior. This global approach may help us detect many live problems (as long as the customer is connected).
Focusing on the sequence of steps in the customer journey provides a large number of indicators that may reveal that the website is not working correctly. It has the great value of dealing with reality.
What are the most needed actions? When are they used? How do they interact? How can we predict them, run the right tests and plan possible detours?
Comparing an individual journey against aggregated customer behavior adds a powerful way to manage thresholds, check for excesses and build pertinent notification alerts.
Each call from the e-commerce website is tracked and stored asynchronously using a single cruxbase tag (which may handle a full pack of services), either centrally or locally (to check the quality of the dialog with the server and provide a trace of local behavior).

Some of the analyzed issues are listed below:
- Abnormal page sequences (based on history or on customer segmentation)
- Slower page sequence calls (rhythm analysis)
- Delays in calls to action
- Unoptimized flow
- Multiple identical calls to action
- Abnormal quantity selection for a product
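
Two of the checks in this list can be sketched in a few lines of Python. This is only an illustration, not the code from the hack night: the function names, the event names, the repeat limit and the "five times the median" threshold are all assumptions chosen for the example.

```python
from statistics import median


def repeated_actions(events, max_repeats=3):
    """Return actions that occur more than max_repeats times in a row."""
    flagged = []
    run_action, run_length = None, 0
    for action in events:
        if action == run_action:
            run_length += 1
        else:
            run_action, run_length = action, 1
        # Report each over-long run only once, when it first crosses the limit.
        if run_length == max_repeats + 1:
            flagged.append(action)
    return flagged


def abnormal_quantity(quantity, history, factor=5.0):
    """Flag a quantity far above the median of past quantities for a product.

    `history` stands for the aggregated behavior (quantities chosen by other
    customers); `factor` is an arbitrary threshold picked for this sketch.
    """
    if not history:
        return False  # nothing to compare against yet
    return quantity > factor * median(history)


# Hypothetical journey: the customer clicks "add_to_cart" four times in a row.
journey = ["view", "add_to_cart", "add_to_cart", "add_to_cart", "add_to_cart", "checkout"]
print(repeated_actions(journey))
print(abnormal_quantity(40, [1, 2, 1, 3, 2]))
```

The median is used rather than the mean so that one earlier outlier in the history does not raise the threshold for everybody else.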

Using existing tools

To solve our problem for the night, we selected one service (Google Analytics) and a workflow engine (either as a service with Amazon SWF or as a component with Bonitasoft).
Managing Google Analytics (or other analytical tools) is not a simple task, as the tagging has to be relevant, detailed, simple to implement and usable. That is one of the major reasons why marketing tool platforms like cruxbase have to generate the analytic tagging system, either through a point-and-click interface or using a simple vocabulary. Customer segmentation (implicitly provided by the browser, explicitly caught from the e-commerce platform or added by the marketing layer) is a mandatory exercise. With the right analytic parameters added, it becomes possible to have a real-time, summarized, numerical view of page sequences, links used and calls to action selected. These are the foundation for comparisons and behavior analysis. These numbers are extracted and stored asynchronously at each server call (with cache acceleration to limit uninteresting analytic calls).
The complete customer navigation is captured by a recorder, stored locally and transferred to the server (on each formal dialog or at each stage), then plugged into the BPM workflow (together with the analytical data), which processes all current and recent sessions to run the scenarios
and decide whether unusual behavior is occurring.
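
As a rough illustration of the kind of rule such a workflow could run, here is a minimal Python sketch that compares one recorded session against the page transitions seen in earlier sessions. It does not use Amazon SWF, Bonitasoft or Google Analytics; the page names, the function names and the `min_seen` threshold are assumptions made for the example.

```python
from collections import Counter


def transition_counts(sessions):
    """Count page-to-page transitions over past sessions."""
    counts = Counter()
    for pages in sessions:
        # zip(pages, pages[1:]) yields consecutive (from_page, to_page) pairs.
        counts.update(zip(pages, pages[1:]))
    return counts


def unusual_transitions(session, counts, min_seen=1):
    """Return the transitions of a session seen fewer than min_seen times before."""
    return [step for step in zip(session, session[1:]) if counts[step] < min_seen]


# Hypothetical history of recorded navigations.
history = [
    ["home", "product", "cart", "checkout"],
    ["home", "search", "product", "cart"],
]
counts = transition_counts(history)

# A session jumping straight from a product page to checkout was never seen.
print(unusual_transitions(["home", "product", "checkout"], counts))
```

In practice the history would probably be kept per customer segment, so that a sequence that is normal for returning customers is not flagged for them.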

Experience vs Check lists

This approach to unusual behavior is founded much more on real experience than on listing the many potential problems that have to be solved. It uses available numbers and services and acts analytically instead of producing reports.
Storing organized, full navigation journeys (with additional local events) enhances the ability to build a very reactive and productive solution for understanding what is happening on the e-commerce website, at minimum cost.

Wednesday, November 13, 2013

E-commerce site first glance

E-commerce site first glance and basic vocabulary

The footprint of an e-commerce website is a strategic element for understanding how it is perceived, how people connect to it and what they first understand when they reach it.

There are zillions of analytic tools, web agency products and paid apps, all offering to give a picture of the reputation of a target website.

There are even free online diagnosis websites providing lists of key data that summarize all the technical key points, mashing up multiple components to build scoreboards, piling up numbers and showing off their ability to run computations. It is a normalized approach for very generic targets (one size fits all), with an acrobatic focus on grabbing as much attention as possible.

Our last Thursday hack night was built around the very small, site-specific elements that can alert an e-commerce site owner that the website is not in line with expectations.
Our main motto here at Cruxbase is to build layers on extremely simple, versatile components to reach a focused target.

The goals for the night were:
- How do (soon-to-be) customers get their first glance of the website when using a search engine?
- How does the vocabulary of the first displayed page synthesize the "spirit" of the e-commerce site?

We did quite an exhaustive search of the available working tools, which provide a wide range of numbers based on multiple data flows: from backlink counts to SNS analysis, from loading speed to word counting. You name your idea, you get a tool.
Some are based on Google components (with the insane registration and connection process… but wanting a Google Analytics data flow makes you very humble and ready for voluntary servitude. Thanks to Etienne de la Boétie).
Some provide a full set of open source applications; others deliver a paid version (for a few dollars) based on the same components.
You use what you decide to use. It is what we may call the « Auberge espagnole » metaphor.

We decided to select two major components as the skeleton of the night hack to reach our targets:
- Search autocomplete (suggestions)
- Page site scraping and word density

Keywords suggestion (autocomplete)

When looking for a site, the customer often types the name of the target company or product in the search toolbar.
The search engine (Google, Bing…) starts suggesting words based on the entered data. First, this speeds up typing (only a few keystrokes are needed to reach the full name); second, it gives a synthesized view of what the community usually types.
This step is very interesting because it provides associated words linked with the site name, from geography to global sentiment. Managed in a smart semantic dictionary, these words easily give a first-step tendency (playing with re-suggestion based on alternates enhances the process).

Both Google and Bing provide a simple call that returns either XML or JSON answers, which can be stored, evaluated and relaunched.

Google : or
Just note the « hl » parameter, which provides suggestions in the chosen language (suggestions may be different for a French speaker).
It returns an XML-formatted answer.

Bing :
It returns a JSON-formatted string.

There are lots of other suggestion engines based on search strings, managed by Yahoo, baidu, rednano, altavista… But our main commitment was simply to catch what the user sees when typing the company/product name: to capture the first feeling.
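
Once a JSON answer has been stored, extracting the associated words is straightforward. The Python sketch below makes no network call and names no endpoint. It assumes the answer has the shape `[query, [suggestion, ...]]` (the layout used by OpenSearch-style suggestion services); the real answers from each engine should be checked against that assumption. The sample payload, the brand name and the function names are invented for the example.

```python
import json
from collections import Counter


def parse_suggestions(payload):
    """Parse a JSON answer assumed to be shaped like [query, [suggestion, ...]]."""
    data = json.loads(payload)
    query, suggestions = data[0], data[1]
    # Keep only plain strings, in case an engine mixes in other structures.
    return query, [s for s in suggestions if isinstance(s, str)]


def associated_words(query, suggestions):
    """Count the words that the suggestions add to the typed query."""
    typed = set(query.lower().split())
    words = Counter()
    for suggestion in suggestions:
        for word in suggestion.lower().split():
            if word not in typed:
                words[word] += 1
    return words


# Made-up payload for a fictional brand, standing in for a stored answer.
payload = '["acme shoes", ["acme shoes paris", "acme shoes reviews", "acme shoes paris opening hours"]]'
query, suggestions = parse_suggestions(payload)
print(associated_words(query, suggestions).most_common(3))
```

Each frequent associated word can then be appended to the original query and sent again, which is one way to implement the re-suggestion step described above.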

Displayed Page Vocabulary

After selecting the right link (based on the suggestions), the (soon-to-be) customer lands on a first page, which represents the first contact with the e-commerce site. This first contact may be very focused (a landing page), very specific (a product page) or global (the home page). In every case, it is essential to capture all the text, comments, titles, tags and rich snippets in order to synthesize the vocabulary on offer and then judge whether it is fitting and appropriate.

It is simply a matter of scraping the page, extracting the valuable data (with or without stripping the HTML tags), then organizing efficient storage to filter out noise words, check for duplicates (singular/plural) and retrieve or rebuild grouped words.
Frequencies are then analyzed and the keywords sorted, so they can be followed and enhanced over time. All this work is done using very common PHP classes or files (such as the open source class.html2Text.php and class.keywordDensity.php…)
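
The steps above can be sketched as follows. The hack itself used PHP classes; this version uses only Python's standard library, purely to illustrate the pipeline (strip tags, drop noise words, merge plurals, count). The stop-word list, the plural rule and the function names are simplistic assumptions, not what those PHP classes do.

```python
import re
from collections import Counter
from html.parser import HTMLParser

# Tiny illustrative noise-word list; a real one would be much longer.
STOP_WORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "for", "our", "is", "on", "with"}


class TextExtractor(HTMLParser):
    """Collect text content, skipping what sits inside <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)


def normalize(word):
    """Naive singular/plural merge: drop a trailing 's' on words over 3 letters."""
    return word[:-1] if len(word) > 3 and word.endswith("s") else word


def keyword_density(html, top=5):
    """Return (word, count, share of kept words) for the most frequent words."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    words = re.findall(r"[a-z]+", " ".join(parser.parts).lower())
    kept = [normalize(w) for w in words if w not in STOP_WORDS]
    total = len(kept)
    if not total:
        return []
    return [(w, n, n / total) for w, n in Counter(kept).most_common(top)]


page = ("<html><head><title>Red shoes</title><style>p{color:red}</style></head>"
        "<body><h1>Shoes for runners</h1><p>Our red shoe is the best shoe.</p>"
        "<script>var shoes = 1;</script></body></html>")
for word, count, share in keyword_density(page):
    print(word, count, round(share, 2))
```

The plural rule is deliberately crude (it would turn "dress" into "dres") and the regular expression only keeps unaccented a-z letters, so a French page would need a proper stemmer and a wider character class.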

What else ?

Once the vocabulary is extracted, the magic can start! Going back to keyword suggestion, it is now possible to check the competitors, the market impacts and the associations. The real business of marketing research and analysis can now start, based on these two very simple but powerful tools. Best of all, it can be fully automated with real-time, rule-based analysis of the results.