Thursday, September 24, 2015

Introducing Spark Support - Cloud9 Charts


Today, we are excited to announce immediate availability of Cloud9 Charts native support for Apache Spark SQL giving businesses using Spark to get real time analytics for both on-premises and the cloud.

Apache Spark, the fast engine for processing big data projects, allows businesses to connect to as simply and fast as possible, and across the widest variety of data sources. Now with Spark, Cloud9 Charts provides the power and immediate value demanded by today’s businesses with a native approach to business intelligence sources on SQL and NoSQL sources, making it one of the simplest ways to add SQL, NoSQL, and a blend of source data in one system. 

Benefits of the new offering include:
  • Instant Visualizations: Generate shareable, embeddable dashboards. 
  • Access data from Spark RDD directly. Eliminate complex ETL processes into relational databases for BI purposes. 
  • Native integration with SparkSQL.
  • Data Discovery: Discover tables and fields. 
  • Query Generation: Automatically generate and explore Spark SQL queries and aggregations.
  • Joins: Blend and join data from data in Spark as well as other NoSQL and SQL datastores.
  • Advanced cloud or on-premise analytics include data aggregations, calculations, date bucketing, predictive analytics and more. 

See it in action at booth #117 at the Cassandra Summit. Alternatively, visit https://www.cloud9charts.com/spark to analyze a live dataset, generate Spark SQL and visualize the data.  

Wednesday, September 23, 2015

Cassandra Native Business Intelligence


Stop by our Booth #117 at Cassandra Summit this week, 9/22-9/24 in Santa Clara!



Originally posted on DataStax.com

DataStax Enterprise, built on Apache Cassandra, is synonymous for scalability and high availability while delivering the flexibility required for modern Web, mobile and IOT applications. Cloud9 Charts provides a native Business Intelligence platform that enables data discovery, CQL query generation, analysis, multi-datasource joins, along with easily shareable, embeddable dashboards from data stored in DataStax Enterprise.


Analytics and Business Intelligence from data within DataStax Enterprise today generally means writing MapReduce and ETL processes to extract the relevant data out of DataStax Cassandra and moving them into a relational database, to then provision a traditional BI tool to derive insights and reporting from it. This has significant drawbacks, since data integration, data modeling and database administration specialities requires an excessive amount of engineering effort and coordination of different tools and teams.


Cloud9 Charts offers a dramatically simpler path. By enabling queries on data stored in DataStax Enterprise and Cassandra directly with the ability to store and incrementally track the results to instant visualizations, enterprises can go from data to insights and react to data changes quickly without engineering efforts around MapReduce, ETL processes and relational database schema definitions and storage.


Let's take a 10-minute hands-on look at generating visual insights from a live DataStax Enterprise/Cassandra database (as well as some simple predictive analytics on it):


Macintosh HD:Users:gopalaj61:Dropbox:c9 Screenshots:Screen Shot 2015-05-23 at 11.18.32 AM.png


Overview of the Solution:
  • Eliminates complex ETL into relational databases for BI purposes.
  • Native integration with CQL (without shoehorning the data in relational ODBC form).
  • Data Discovery: Connect and discover keyspaces, tables, fields & key detection (primary partition, clustering and secondary index keys).
  • Query Generation: Automatically generate and explore CQL queries and aggregations using a visual interface.
  • Joins: Blend and join data across multiple DataStax Enterprise databases and other NoSQL and SQL datastores.
  • Optional seamless warehousing  of query results with incremental tracking.
  • Advanced Analytics: aggregations, calculations, date bucketing, predictions ….
  • Instant Visualizations: Generate shareable, embeddable dashboards.
  • Connect from the Cloud or on-premise with cloud or on-premise deployment modes.


1. Getting Started: Go to www.cloud9charts.com and click on ‘Get Started’ button to sign up (it’s free to get started).


2. Connect: Click on Data sources from the left hand settings menu, then select Datastax as shown in the animated GIF below. Use the default settings point to a live demo DataStax Enterprise database hosted by us.


Note: If you prefer to connect to your own DataStax Enterprise cluster and the database is inaccessible from outside your network, use our agent to facilitate secure connectivity to it.  


3. Queries & Insights: Now, let’s generate some insights from a demo dataset. This database contains a demo dataset of email campaign details; in the following example, we’ll determine the Total Sent, Delivered and Opened on a weekly basis from it, while showcasing the following along the way:
  • Table, Keyspace, Field and Key discovery
  • Auto-generate queries
  • Aggregations
  • Visualizations


Click on Configure Queries.
i. Expand the ‘Query Generator’ section to determine tables within this DataStax Enterprise Cassandra database, along with identifying partition keys, cluster keys and secondary index columns.
ii.  Select cloud9_demo as the database.
iii. CQL queries are automatically generated when you use the Query Generator. Click on Preview to immediately see a preview of all data in this table.
iv. Select Sent, Delivered and Opened from the metrics dropdown. Click on each to display aggregation options. Select Sum on each.
v. On the Dimensions option, select Date, then click on it to select Weekly aggregation.
Now we have the sum of Sent, Delivered and Opened on a weekly basis.


The auto-generated queries so far looks like this:
CQL:
select "date", "sent", "delivered", "opened"
from "cloud9_demo"
limit 10000


Cloud9QL: This is a post processor with a SQL like familiarity that enables easy aggregations and a range of more advanced analytics capabilities. It’s not a replacement for CQL, but rather complements it with a range of powerful Business Intelligence and reporting capabilities that CQL is not designed for. Learn more about Cloud9QL at www.cloud9charts.com/docs/cloud9QL.html


The generated Cloud9QL looks like the following, which processes the results returned from CQL to an aggregation of Sent, Delivered and Opened totals on a weekly basis.


select SUM(sent) as Sum of sent,
 SUM(delivered) as Sum of delivered,
 SUM(opened) as Sum of opened,
 WEEK(date) as Week of  date
group by WEEK(date)
vi.  Click on Preview to instantly visualize the results.


vii. Save the Results. The results can be either be seamlessly saved into our schemaless data warehouse for fast access (with auto-update and incremental query/upsert capabilities that are beyond the scope of this post), or executed against the datastore in real-time.  


4. Dashboard: Click on Dashboards. Drag & drop the newly created visualization into the dashboard. The dashboard can be easily shared, embedded and filtered on.
5. Predictions: A range of analytics capabilities are part of the Cloud9 Charts platform, including seamless predictive analytics that backtests the data across a number of algorithms to automatically determine the best fit.


In the following example, we’ll put together predictions for the Messages sent on a monthly basis for the next 6 months.   
  1. Clone the widget.
  2. Add a Cloud9QL filter (see the GIF) with the following:
select predict(Sum of Sent,Week of  date, 10/01/2015, 1m, 6)
  1. Save. Our new widget will be updated with the predictions for 6 months starting from October.


Summary
In a few simple steps, we’ve derived insights and predictions from a DataStax Enterprise cluster with easily customizable, shareable and embeddable dashboards.                                                          


Resources
Instantly connect and derive insights from our demo DataStax Enterprise cluster: https://www.cloud9charts.com/datastax


Join DataStax Enterprise data with other data sources: https://www.cloud9charts.com/docs/multi-datasource-joins.html


Post process CQL results to aggregate, date bucketing and other: http://cloud9charts.com/docs/cloud9QL.html

Wednesday, September 16, 2015

DataStax Partnership & Cassandra Certification


Cloud9 Charts Announces Validated Business Intelligence Solution for DataStax Enterprise
Certified Business Intelligence Solution Partners With DataStax
OAKLAND, Calif. (Sept. 15, 2015) – Cloud9 Charts, Inc., today announced a partnership with DataStax and a validated native BI solution for DataStax Enterprise. The solution provides joint customers native Business Intelligence capabilities to instantly visualize, analyze and query data on Apache Cassandra™.

“Cloud9 Charts and DataStax have worked closely together to deliver an enterprise grade BI platform that provides instant, actionable insights from their data in Cassandra," said Jay Gopalakrishnan, Cloud9 Charts founder and CEO. “This provides technical and business users clarity into their data within Cassandra like never before, without being constrained by complex ETL processes that shoehorn data into relational form.”

The Cloud9 Charts solution for DataStax Apache Cassandra™ allows users to create instant visualizations from Cassandra data in easily shareable, embeddable dashboards. The solution includes a CQL Query generator and enables discovery services that enable schema and field discovery with primary partition and clustering key, along with secondary index key detection capabilities. Cloud9 Charts for DataStax also supports aggregations, date bucketing, built-in prediction modelling and other advanced analytics capabilities including multi-dataset joins within Cassandra, as well as join capabilities across heterogeneous databases.

“In today’s highly connected online economy, modern applications generate massive distributed data sets for enterprises to manage, often across multiple data centers and clouds,” said Matt Rollender, vice president of Cloud Strategy at DataStax. “Enterprises require a secure, always-on, managed database platform solution so they can focus on running their business, and not their database infrastructure.”

DataStax delivers Apache Cassandra™ in a database platform purpose-built for the performance and availability demands for web, mobile and Internet of Things applications, with more than 500 customers and 30 of the top Fortune 100 companies worldwide. Cloud9 Charts provides a BI platform on DataStax Enterprise to instantly discover, visualize and analyze data from DataStax as well blend data from other NoSQL and structured sources.
To learn more about how Cloud9 Charts and DataStax work together, visit the Cloud9 Charts booth #117 at Cassandra Summit 2015 in Santa Clara, California September 22-24, 2015.

Resources
Cloud9 Charts blog: http://blog.cloud9charts.com



Wednesday, September 2, 2015

How to choose the best visualization


Choosing the right visualization is a key interface into your data and something we spend a lot of time thinking about for our customers. Here are some simple guidelines to help clear the path to find the best visualization to use.

When choosing the best way to portray your data, you must first figure out what it is that you would like to show:


  • Comparison/relationship
  • Distribution
  • Trend
  • Composition


A comparison/relationship chart is used to compare multiple values to each other (highest vs lowest, old values vs new values, etc.). They can also be used to find correlations between two sets of data, outliers, and where the data clusters. Comparison/relationship charts are commonly used to answer questions such as “How do the sales values of this year compare to the sales values from last year?”, “What is the most popular product we are offering?”, and “What is the relationship between sales and holiday season?”

Visualization options to use for comparison/relationship:


  • Bar chart
  • Line graph
  • Data grid
  • Area chart
  • Column chart
  • Funnel chart
  • Scatter plot
  • Bubble plot




A distribution chart is used to visualize the dispersal of data points. When the data is shaped it makes it easier to identify characteristics of the data; outliers, normal ranges, tendency, etc. It is used to answer questions such as “What is the distribution of ages among customers?” and “How are the calls being distributed in our call center?”

Visualization options to use for distribution:


  • Column chart
  • Bubble chart
  • Scatter chart
  • Line graph
  • Spline graph




A trend chart is used to understand the trend over time of a variable or multiple variables. Common questions are “How did our stock price do over the course of the year?” and “What days did we have the most customers enter our store?”

Visualization options to use for trend:


  • Line chart
  • Column chart
  • Data grid
  • Spline chart









A composition chart is used to see how various parts comprise the whole and how a total value can be divided into shares. Commons questions are “What are the individual product sales as a percentage of the whole revenue?” or “How much of the market do we control in this region?”

Visualization options to use for composition:

  • Column chart
  • Bar chart
  • Pie chart
  • Stacked chart
  • Stacked area chart
  • Geo – heat maps
  • Geo – marker pins
  • Geo – marker heat maps
  • Geo – regions
  • Data grid heat map







Cloud9 Charts currently offers 19 different visualization options that allow our customers to go from data to visualizations in a snap.