Sunday, December 20, 2015

Roundup of Latest Features

Cloud9 Charts Product Updates
 
LATEST FEATURES
 
 
We've got lots of new goodies to help you go from data to insights in a snap, from your NoSQL, SQL, file-based and other sources.

DATA TO INSIGHTS
 
Complete Analytics Platform for Unstructured and Structured Data
MULTI-DATASOURCE JOINS
Data in multiple sources? Learn how to effectively leverage your "polyglot" architecture to join data across multiple database systems using our Multi-Datasource Join capability.
 


REST API CONNECTIVITY
Connect to any REST API, query it using a SQL-like syntax on multi-structured data and visualize it.
 


PREDICTIONS
Apply predictions on any of your data. We'll back-test the data against a number of prediction models automatically to get the best fit.
 


NEW INTEGRATIONS
Native integrations to derive insights instantly: DataStax, ElasticSearch, Spark.

If you are a MarkLogic user, our XQuery generator is here by popular demand.

Also check out query generators for MongoDB, DataStax, SQL, Spark and others.
 


COHORT ANALYTICS
Cohort analytics can be useful for determining churn and behaviour across user groups, but can also be notoriously difficult to do. Learn how we simplify it so you can build your cohort analysis easily.
 


NEW VISUALIZATIONS
Choose from over 20 visualization types, including gauges, grid heatmaps, enhanced geo options and several more.


Friday, December 11, 2015

RESTful services, Part 2 - Mixpanel

In the last blog entry in this series, we introduced our integration for connecting to any REST API, querying it using a SQL-like syntax and generating custom visualizations from it. The example used a live Parse account, and we've since had some requests for the same for Mixpanel.

This post outlines connecting to and querying Mixpanel data using our REST API interface to build custom visualizations from it. Whereas the Parse API is simple to connect to, query and track, Mixpanel's APIs are not particularly easy to work with. This is partly due to the way authentication is handled: the parameters to each API call must be ordered alphabetically and signed using MD5.

The high level steps are:
  1. Determine your API Key and API secret. This can be found under the Accounts --> Projects section of your Mixpanel account. 
  2. Determine what you'd like to query from Mixpanel from their documentation: https://mixpanel.com/docs/api-documentation/data-export-api
  3. Build the URL, and generate a signature with the parameters.
  4. Query and post process the nested data using Cloud9QL.

Let's take a live Mixpanel demo account to make it easier to follow along. This example determines the count of events over the past day. The API documentation for this can be found here: https://mixpanel.com/docs/api-documentation/data-export-api#events-top

1. Visit https://cloud9charts.com/rest-api. Enter http://mixpanel.com/api/2.0/ as the host. 
(If you are an existing user, this is also accessible within the REST datasource section of your account.)

2. Leave the Headers section empty. Set the end point to events/top.

3. Now comes the tricky part. To build the URL, we need to compute an MD5 hash over the alphabetically ordered parameters.

Let's build up the URL with basic params first: api_key=a8d3995557ce4cea1ae92ec03314a620&type=unique

Determine the epoch expiry date for the request. For Dec 1, 2016, this is 1480550400 (see www.epochconverter.com). Adding this to the URL, we get: api_key=a8d3995557ce4cea1ae92ec03314a620&type=unique&expire=1480550400
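If you'd rather compute the epoch value programmatically than use a converter site, a quick Python check:

```python
import calendar
from datetime import datetime

# Epoch seconds for Dec 1, 2016 00:00 UTC, used as the request expiry.
expire = calendar.timegm(datetime(2016, 12, 1).timetuple())
print(expire)  # 1480550400
```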

Next comes the sig field, which is the MD5 hash for the request:
  • Build a string with the URL parameters in ascending order, concatenated together without the & separator: api_key=a8d3995557ce4cea1ae92ec03314a620expire=1480550400type=unique
  • Concatenate the Mixpanel API secret to it: api_key=a8d3995557ce4cea1ae92ec03314a620expire=1480550400type=unique6a2532b187735ea4d531f4b7dfc7a7d2
  • MD5 this string (for example, via http://www.md5.net/md5-generator/) to generate a hash; this will be the sig field for the request. MD5 result: befa480e75440ef4b8141f2664ba6753
  • Putting it all together, the resulting URL parameter looks like this: api_key=a8d3995557ce4cea1ae92ec03314a620&type=unique&expire=1480550400&sig=befa480e75440ef4b8141f2664ba6753
4. Add the above into the URL params field at https://cloud9charts.com/rest-api
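The signing steps above can be sketched in Python, using the demo key and secret shown in this example:

```python
import hashlib

# Demo credentials from the example above.
api_key = "a8d3995557ce4cea1ae92ec03314a620"
api_secret = "6a2532b187735ea4d531f4b7dfc7a7d2"

params = {"api_key": api_key, "type": "unique", "expire": "1480550400"}

# Concatenate key=value pairs in alphabetical order with no '&' separator,
# append the API secret, then MD5 the whole string to produce the sig.
to_sign = "".join("%s=%s" % (k, params[k]) for k in sorted(params)) + api_secret
sig = hashlib.md5(to_sign.encode("utf-8")).hexdigest()

# Final URL parameters include the computed sig field.
url_params = "&".join("%s=%s" % (k, v) for k, v in params.items()) + "&sig=" + sig
print(url_params)
```

Substitute your own API key and secret to sign requests against your account.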


5. Now we are ready to query Mixpanel to fetch the results. The returned data looks something like this:


{"events": [{"amount": 2,
             "event": "funnel",
             "percent_change": -0.35635745999582824},
            {"amount": 75,
             "event": "pages",
             "percent_change": -0.20209602478821687},
            {"amount": 2, "event": "projects", "percent_change": 1.0}],
 "type": "unique"}


To unwind the nested events array above, add the following into the Cloud9QL Query field:

select expand(events);
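For reference, expand(events) performs the same unwind you could do by hand; a minimal Python sketch using the response shown above:

```python
# A response shaped like the Mixpanel result above.
response = {
    "type": "unique",
    "events": [
        {"amount": 2, "event": "funnel", "percent_change": -0.356},
        {"amount": 75, "event": "pages", "percent_change": -0.202},
        {"amount": 2, "event": "projects", "percent_change": 1.0},
    ],
}

# expand(events) unwinds the nested array into one row per element.
rows = [dict(e) for e in response["events"]]
print(rows[0]["event"])  # funnel
```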

Click on Show me to visualize the results. 
Tuesday, December 8, 2015

Analyzing data from Parse & other RESTful services


Cloud9 Charts provides connectivity to third-party REST-enabled services, allowing you to query, transform, visualize, analyze and track that data.

In the following example, we'll connect to Parse to query and derive visualizations from it.

To see it in action visit https://www.cloud9charts.com/rest-api -- this is pre-configured with a live Parse URL, authentication headers and the end point to query from. 



If you are a Parse user, you can plug in your own end point and authentication parameters. (To determine your Parse auth keys, see https://parse.com/docs/rest/guide#queries.)

Parse returns data in an embedded JSON form, with results as the top-level JSON element. For example:



{
  "results": [
    {
      "createdAt": "2015-12-09T02:12:05.185Z",
      "customer": "Costco",
      "objectId": "SF5hWccArl",
      "opened": 98240,
      "sent": 604585,
      "updatedAt": "2015-12-09T02:12:05.185Z",
      "week": "2015-04-13"
    },
    {
      "createdAt": "2015-12-09T02:12:05.190Z",
      "customer": "eHarmony",
      "objectId": "QuVeGh6hkA",
      "opened": 81738,
      "sent": 557202,
      "updatedAt": "2015-12-09T02:12:05.190Z",
      "week": "2015-04-13"
    },
    ...
  ]
}

Let's say that we are looking for the sent count by customer, by week, across this dataset to see the weekly trends for each customer. For this, we'll need to:
  • Unwind the top-level results array.
  • Aggregate the sent count by week and customer.
So how do we make this happen? Using Cloud9QL.

The above can be done using the following Cloud9QL:


select expand(results);
select customer,date(week) as week, sent group by customer, week

The first statement unwinds the array from the top-level results object. The second calculates the weekly totals for each customer.
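To make the aggregation concrete, here is a small Python sketch of what the chained statements compute (the third row is made-up illustrative data, not from the live Parse account):

```python
from collections import defaultdict

# Rows as they look after expanding the top-level "results" array.
results = [
    {"customer": "Costco", "week": "2015-04-13", "sent": 604585},
    {"customer": "eHarmony", "week": "2015-04-13", "sent": 557202},
    {"customer": "Costco", "week": "2015-04-20", "sent": 610001},
]

# Group by (customer, week) and total the sent counts, mirroring the
# second Cloud9QL statement's group by.
totals = defaultdict(int)
for row in results:
    totals[(row["customer"], row["week"])] += row["sent"]

print(totals[("Costco", "2015-04-13")])  # 604585
```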

Click on 'Show Me' to instantly visualize the data. 
To learn more about tracking query results, multi-datasource joins and a variety of use cases, see our documentation.


Saturday, December 5, 2015

MarkLogic XQuery Generator

Due to popular demand from our MarkLogic users, we are pleased to unveil XQuery generation capabilities. This allows MarkLogic users to interactively discover and auto-generate XQuery using a point and click interface and build interactive, embeddable visualizations and dashboards.

Highlights:
  • Unlock insights from multi-structured MarkLogic data instantly
  • Discover collections
  • Auto-generate XQuery using a point and click interface
  • Native integration (without ODBC) that maintains the multi-structured view of the data
  • Eliminates complex ETL processing and shoehorning the data into relational form for BI
  • Instant visualizations, dashboards and share/embed easily
See for yourself at https://www.cloud9charts.com/marklogic. This is pre-configured with a live MarkLogic database that you can interact with to build dynamic queries against the data.
Additional documentation: https://www.cloud9charts.com/marklogic






Tuesday, November 24, 2015

Recent Press Highlighting Cloud9 Charts

Two noteworthy articles about Cloud9 Charts have been recently published -- one by The New Stack and another by 451 Research.

The New Stack’s recent article, “Cloud9 Charts Unveils Analytics-as-a-Service Offering,” features a conversation with Cloud9 Charts’ CEO, Jay Gopalakrishnan, about the process of traditionally building BI tools, and how Cloud9 Charts has transformed the entire process into a simpler, faster and more efficient approach.

Information technology research and advisory company, 451 Research, published a review of Cloud9 Charts titled, “Cloud9 Charts emerges with reporting and visualization service for multi-structured data.” The research report goes into details about Cloud9 Charts’ product and current position in the market, its strategy and competitors in the analytics space. The full article is gated to subscribers.

Both articles emphasize the challenge of working with multi-structured data and how Cloud9 Charts solves that problem.

Special thanks to Susan Hall at The New Stack and Krishna Roy at 451 Research.


Wednesday, November 4, 2015

Introduction to Prediction Modeling

Cloud9 Charts now offers predictive analytics capabilities that can be applied to any data.

The data is passed through a variety of prediction models automatically - including Moving Average Models, Exponential Smoothing, Regressions and others - to determine the best fit based on historical data. 

Example: Let's take monthly stock prices for Amazon. The monthly prices look like this:

Date, Price
06/01/16, 719.14
05/01/16, 683.85
04/01/16, 598.00
03/01/16, 579.00
02/01/16, 574.81
01/05/16, 633.79
12/01/15, 679.36
11/2/15, 628.35
10/1/15, 625.90
9/1/15, 511.89
8/3/15, 512.89
7/1/15, 536.15
6/1/15, 434.09
5/1/15, 429.23
4/1/15, 421.78
3/2/15, 372.10
2/2/15, 380.16
1/2/15, 354.53


1. Copy and paste the above dataset into https://www.cloud9charts.com/docs/predictive-analytics.html

2. Apply the following Cloud9QL into the query section:
select predict(Price, Date, 12/01/2015,1m,2)

This will back-test the dataset to determine the model with the best fit and then predict the price for future dates. (I hasten to add that historical data may not always be a true indicator of future prices!)
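The back-testing idea can be sketched as follows; this is a simplified illustration using the 2015 prices above, with two toy models, and is not the actual Cloud9 Charts implementation:

```python
# Simplified back-testing sketch: hold out the last few points, score each
# candidate model on them, and keep the model with the lowest error.
prices = [354.53, 380.16, 372.10, 421.78, 429.23, 434.09,
          536.15, 512.89, 511.89, 625.90, 628.35, 679.36]
train, holdout = prices[:-2], prices[-2:]

def moving_average(history, n=3):
    # Forecast the next value as the mean of the last n observations.
    return sum(history[-n:]) / float(n)

def naive(history):
    # Forecast the next value as the last observation.
    return history[-1]

def backtest(model):
    # Walk forward over the holdout, accumulating absolute forecast error.
    history, error = list(train), 0.0
    for actual in holdout:
        error += abs(model(history) - actual)
        history.append(actual)
    return error

models = {"moving_average": backtest(moving_average), "naive": backtest(naive)}
best = min(models, key=models.get)
print(best)  # naive
```

The best-fit model (here, the naive one on this short, trending series) is then used to forecast future dates.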

Results:



Similar to the example above, you can predict any metric across any of your data.


Monday, October 26, 2015

Cohort Analysis made easy

Cohort analytics tracks groups of users with similar characteristics over time, and can be a great indicator of product effectiveness and user engagement.


Some Examples:
  • Percentage of users who signup that remain active over time
  • Customers by cohort over time
  • Revenues by cohort over time
Thomas Tunguz from Redpoint has written some great blog posts on cohort analytics and how to interpret them effectively.


Cohort analysis can be notoriously difficult to perform within the database. In this blog post, we'll walk step by step through generating cohort analysis using Cloud9 Charts, going from raw data to the following cohort heatmap grid in a few easy steps.

The example focuses on signups that are active over a period of time. (For cohort analysis from event based data, see our documentation.)


Following is a sample of the data that we'll be working with. Assume that the data contains two fields that can be obtained using a database query: a registration date, when the user first signed up for the service, and a transaction date, when the user performed a transaction. For example:

registration_date, transaction_date
07/01/2014 18:01:16 PDT, 07/01/2014 18:02:18 PDT
07/01/2014 18:03:50 PDT, 07/01/2014 18:06:06 PDT
07/02/2014 01:43:30 PDT, 07/02/2014 01:45:50 PDT
07/02/2014 01:56:27 PDT, 07/02/2014 01:57:40 PDT
…

Our objective here is to determine the monthly transaction cohorts based on the signup month.

Calculating cohorts is as simple as applying the Cloud9QL below. In case you are unfamiliar, Cloud9QL is a post processor with SQL-like familiarity that can be used for analytics post-processing on top of your database queries.


select month(transaction_date) as transaction_date, month(registration_date) as registration_date;

select cohort(transaction_date, registration_date as Cohort Date, 1m as Cohort Period), count(*) as Total  group by Cohort Date, Cohort Period  order by Cohort Date desc, Cohort Period asc;

select date_format(cohort date,MMM-yy) as cohort date, cohort period, total; 
Here we are chaining three Cloud9QL statements, where the results of the first are passed into the second and so on. To take a deeper dive:

select month(transaction_date) as transaction_date, month(registration_date) as registration_date;
This truncates the dates to the start of the month, since we are interested in monthly cohorts. The second statement applies the cohort function, which counts the results for n months after each signup month:


select cohort(transaction_date, registration_date as Cohort Date, 1m as Cohort Period), count(*) as Total  group by Cohort Date, Cohort Period  order by Cohort Date desc, Cohort Period asc;
The last query in the chain adjusts the date format for display purposes:

select date_format(cohort date,MMM-yy) as cohort date, cohort period, total; 
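Conceptually, the chained statements boil down to the following Python sketch; the rows here are made-up month-truncated pairs for illustration, not the Cloud9QL engine:

```python
from collections import Counter
from datetime import date

# (registration_month, transaction_month) pairs, already truncated to
# month starts as in the first Cloud9QL statement.
rows = [
    (date(2014, 7, 1), date(2014, 7, 1)),
    (date(2014, 7, 1), date(2014, 8, 1)),
    (date(2014, 7, 1), date(2014, 8, 1)),
    (date(2014, 8, 1), date(2014, 8, 1)),
]

def months_between(start, end):
    # Cohort period: whole months elapsed since the signup month.
    return (end.year - start.year) * 12 + (end.month - start.month)

# Count transactions per (cohort date, cohort period), mirroring the
# cohort(...) ... group by in the second statement.
cohorts = Counter((reg, months_between(reg, txn)) for reg, txn in rows)
print(cohorts[(date(2014, 7, 1), 1)])  # 2
```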
Last Step: Change the visualization type to a heat map grid.

In a few simple steps, we've obtained a visual cohort grouping of the data. If you need help determining cohorts for your own data, please feel free to contact us.