Thursday, September 24, 2015

Introducing Spark Support - Knowi

Native Analytics on Apache Spark

Today, we are excited to announce the immediate availability of Knowi's native support for Apache Spark SQL, giving businesses that use Spark real-time analytics both on-premises and in the cloud.

Apache Spark, the fast engine for big data processing, lets businesses work with their data as simply and quickly as possible, across a wide variety of data sources. With Spark support, Cloud9 Charts takes a native approach to business intelligence on SQL and NoSQL sources, making it one of the simplest ways to analyze SQL data, NoSQL data, and a blend of both in one system.

Benefits of the new offering include:
  • Instant Visualizations: Generate shareable, embeddable dashboards. 
  • Access data from Spark RDD directly. Eliminate complex ETL processes into relational databases for BI purposes. 
  • Native integration with SparkSQL.
  • Data Discovery: Discover tables and fields. 
  • Query Generation: Automatically generate and explore Spark SQL queries and aggregations.
  • Joins: Blend and join data in Spark with data from other NoSQL and SQL datastores.
  • Advanced Analytics: Cloud or on-premise analytics, including data aggregations, calculations, date bucketing, predictive analytics and more.

See it in action at booth #117 at the Cassandra Summit. Alternatively, visit to analyze a live dataset, generate Spark SQL and visualize the data.  

Wednesday, September 23, 2015

Cassandra Native Business Intelligence

Native Analytics on DataStax Enterprise Eliminates the Need to Move Your Data

Originally posted on

DataStax Enterprise, built on Apache Cassandra, is synonymous with scalability and high availability while delivering the flexibility required for modern Web, mobile and IoT applications. Knowi provides a native Business Intelligence platform that enables data discovery, CQL query generation, analysis and multi-datasource joins, along with easily shareable, embeddable dashboards, from data stored in DataStax Enterprise.

Today, analytics and Business Intelligence on data within DataStax Enterprise generally means writing MapReduce and ETL processes to extract the relevant data from Cassandra and move it into a relational database, then provisioning a traditional BI tool on top of it for insights and reporting. This has significant drawbacks: the data integration, data modeling and database administration involved require an excessive amount of engineering effort and coordination across different tools and teams.

Knowi offers a dramatically simpler path. It queries data stored in DataStax Enterprise and Cassandra directly, can store and incrementally track the results, and turns them into instant visualizations. Enterprises can go from data to insights and react to data changes quickly, without engineering effort around MapReduce, ETL processes, or relational database schema definitions and storage.

Let's take a 10-minute hands-on look at generating visual insights from a live DataStax Enterprise/Cassandra database (as well as some simple predictive analytics on it):


Overview of the Solution:
  • Eliminates complex ETL into relational databases for BI purposes.
  • Native integration with CQL (without shoehorning the data in relational ODBC form).
  • Data Discovery: Connect and discover keyspaces, tables, fields & key detection (primary partition, clustering and secondary index keys).
  • Query Generation: Automatically generate and explore CQL queries and aggregations using a visual interface.
  • Joins: Blend and join data across multiple DataStax Enterprise databases and other NoSQL and SQL datastores.
  • Optional seamless warehousing of query results with incremental tracking.
  • Advanced Analytics: aggregations, calculations, date bucketing, predictions and more.
  • Instant Visualizations: Generate shareable, embeddable dashboards.
  • Connect from the Cloud or on-premise with cloud or on-premise deployment modes.

1. Getting Started: Go to and click on the ‘Get Started’ button to sign up (it’s free to get started).

2. Connect: Click on Data sources from the left-hand settings menu, then select DataStax as shown in the animated GIF below. Keep the default settings, which point to a live demo DataStax Enterprise database hosted by us.

Note: If you prefer to connect to your own DataStax Enterprise cluster and the database is inaccessible from outside your network, use our agent to facilitate secure connectivity to it.  

3. Queries & Insights: Now, let’s generate some insights from a demo dataset. This database contains a demo dataset of email campaign details; in the following example, we’ll determine the Total Sent, Delivered and Opened on a weekly basis from it, while showcasing the following along the way:
  • Table, Keyspace, Field and Key discovery
  • Auto-generate queries
  • Aggregations
  • Visualizations

Click on Configure Queries.
i. Expand the ‘Query Generator’ section to discover the tables within this DataStax Enterprise Cassandra database and to identify partition keys, clustering keys and secondary index columns.
ii.  Select cloud9_demo as the database.
iii. CQL queries are automatically generated when you use the Query Generator. Click on Preview to immediately see a preview of all data in this table.
iv. Select Sent, Delivered and Opened from the metrics dropdown. Click on each to display aggregation options. Select Sum on each.
v. On the Dimensions option, select Date, then click on it to select Weekly aggregation.
Now we have the sum of Sent, Delivered and Opened on a weekly basis.

The auto-generated CQL query so far looks like this:
select "date", "sent", "delivered", "opened"
from "cloud9_demo"
limit 10000

Cloud9QL: This is a post-processor with a SQL-like syntax that enables easy aggregations and a range of more advanced analytics capabilities. It’s not a replacement for CQL; rather, it complements CQL with Business Intelligence and reporting capabilities that CQL is not designed for. Learn more about Cloud9QL at

The generated Cloud9QL looks like the following. It takes the results returned by the CQL query and aggregates them into Sent, Delivered and Opened totals on a weekly basis.

select SUM(sent) as Sum of sent,
 SUM(delivered) as Sum of delivered,
 SUM(opened) as Sum of opened,
 WEEK(date) as Week of  date
group by WEEK(date)
vi.  Click on Preview to instantly visualize the results.

vii. Save the Results. The results can either be seamlessly saved into our schemaless data warehouse for fast access (with auto-update and incremental query/upsert capabilities that are beyond the scope of this post), or executed against the datastore in real-time.
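
To make the post-processing step concrete, here is a minimal Python sketch of the kind of weekly roll-up the generated Cloud9QL describes: rows returned by the CQL query are grouped by week, and Sent, Delivered and Opened are summed per group. This is an illustration only, not how Cloud9QL is implemented. The function name weekly_totals, the sample rows and the use of ISO week numbers are assumptions made for the example; the week boundaries that Cloud9QL's WEEK() uses may differ.

```python
from collections import defaultdict
from datetime import date

METRICS = ("sent", "delivered", "opened")


def weekly_totals(rows):
    """Sum each metric per (ISO year, ISO week) bucket.

    `rows` stands in for the raw result set of the CQL query; each row is a
    dict with a `date` plus the three metric columns (hypothetical shape).
    """
    totals = defaultdict(lambda: dict.fromkeys(METRICS, 0))
    for row in rows:
        iso = row["date"].isocalendar()
        bucket = (iso[0], iso[1])  # (ISO year, ISO week number)
        for metric in METRICS:
            totals[bucket][metric] += row[metric]
    # Return plain dicts ordered by week for predictable display.
    return {bucket: dict(totals[bucket]) for bucket in sorted(totals)}


# Made-up sample rows, standing in for the demo email campaign data.
rows = [
    {"date": date(2015, 9, 21), "sent": 100, "delivered": 95, "opened": 40},
    {"date": date(2015, 9, 23), "sent": 50, "delivered": 48, "opened": 20},
    {"date": date(2015, 9, 28), "sent": 70, "delivered": 66, "opened": 30},
]
result = weekly_totals(rows)
for bucket, sums in result.items():
    print(bucket, sums)
```

The first two sample rows fall in the same ISO week and are combined into one bucket, while the third row starts a new one. That is the same shape as the weekly sums shown in the preview.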

4. Dashboard: Click on Dashboards. Drag & drop the newly created visualization into the dashboard. The dashboard can be easily shared, embedded and filtered on.
5. Predictions: A range of analytics capabilities are part of the Knowi platform, including seamless predictive analytics that backtests the data across a number of algorithms to automatically determine the best fit.

In the following example, we’ll put together predictions for the Messages sent on a monthly basis for the next 6 months.   
  1. Clone the widget.
  2. Add a Cloud9QL filter (see the GIF) with the following:
select predict(Sum of Sent,Week of  date, 10/01/2015, 1m, 6)
  3. Save. Our new widget will be updated with the predictions for 6 months starting from October.
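
For readers curious what "backtesting across a number of algorithms" can look like in general, the following Python sketch holds back the most recent points of a series, lets a few simple forecasters predict them, and keeps the one with the lowest error. It is a generic illustration under stated assumptions, not the Knowi implementation: the three candidate models, the mean-absolute-error metric, the 3-point holdout and the sample monthly_sent values are all hypothetical.

```python
def mean_forecast(history, steps):
    """Predict the historical average for every future step."""
    avg = sum(history) / len(history)
    return [avg] * steps


def last_value_forecast(history, steps):
    """Predict that the most recent value simply repeats."""
    return [history[-1]] * steps


def linear_trend_forecast(history, steps):
    """Fit a least-squares line to the history and extend it.

    Needs at least two points, otherwise the slope is undefined.
    """
    n = len(history)
    x_mean = (n - 1) / 2
    y_mean = sum(history) / n
    denom = sum((x - x_mean) ** 2 for x in range(n))
    slope = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(history)) / denom
    intercept = y_mean - slope * x_mean
    return [intercept + slope * (n + i) for i in range(steps)]


# Hypothetical candidate set; a real system would likely try richer models.
MODELS = {
    "mean": mean_forecast,
    "last_value": last_value_forecast,
    "linear_trend": linear_trend_forecast,
}


def backtest_error(model, series, holdout):
    """Mean absolute error of `model` on the last `holdout` points."""
    train, actual = series[:-holdout], series[-holdout:]
    predicted = model(train, holdout)
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / holdout


def pick_best_model(series, holdout=3):
    """Backtest every candidate and return the best name plus all errors."""
    errors = {name: backtest_error(fn, series, holdout) for name, fn in MODELS.items()}
    return min(errors, key=errors.get), errors


# Made-up monthly totals of messages sent.
monthly_sent = [100, 110, 120, 130, 140, 150, 160, 170, 180]
best_name, errors = pick_best_model(monthly_sent)
forecast = MODELS[best_name](monthly_sent, 6)  # next 6 months
print(best_name, forecast)
```

On this steadily growing sample series the linear trend model reproduces the held-back points exactly, so it is selected and then refit on the full series to produce the six-month forecast.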

In a few simple steps, we’ve derived insights and predictions from a DataStax Enterprise cluster with easily customizable, shareable and embeddable dashboards.                                                          

Instantly connect and derive insights from our demo DataStax Enterprise cluster:

Join DataStax Enterprise data with other data sources:

Post-process CQL results with aggregations, date bucketing and more:

Wednesday, September 16, 2015

DataStax Partnership & Cassandra Certification

Cloud9 Charts Announces Validated Business Intelligence Solution for DataStax Enterprise
Certified Business Intelligence Solution Partners With DataStax
OAKLAND, Calif. (Sept. 15, 2015) – Cloud9 Charts, Inc., today announced a partnership with DataStax and a validated native BI solution for DataStax Enterprise. The solution provides joint customers native Business Intelligence capabilities to instantly visualize, analyze and query data on Apache Cassandra™.

“Cloud9 Charts and DataStax have worked closely together to deliver an enterprise grade BI platform that provides instant, actionable insights from their data in Cassandra,” said Jay Gopalakrishnan, Cloud9 Charts founder and CEO. “This provides technical and business users clarity into their data within Cassandra like never before, without being constrained by complex ETL processes that shoehorn data into relational form.”

The Cloud9 Charts solution for DataStax Apache Cassandra™ allows users to create instant visualizations from Cassandra data in easily shareable, embeddable dashboards. The solution includes a CQL query generator and discovery services for schemas and fields, with detection of primary partition keys, clustering keys and secondary index keys. Cloud9 Charts for DataStax also supports aggregations, date bucketing, built-in prediction modeling and other advanced analytics capabilities, including multi-dataset joins within Cassandra and joins across heterogeneous databases.

“In today’s highly connected online economy, modern applications generate massively distributed data sets for enterprises to manage, often across multiple data centers and clouds,” said Matt Rollender, vice president of Cloud Strategy at DataStax. “Enterprises require a secure, always-on, managed database platform solution so they can focus on running their business, and not their database infrastructure.”

DataStax delivers Apache Cassandra™ in a database platform purpose-built for the performance and availability demands of web, mobile and Internet of Things applications, with more than 500 customers and 30 of the top Fortune 100 companies worldwide. Cloud9 Charts provides a BI platform on DataStax Enterprise to instantly discover, visualize and analyze data from DataStax, as well as blend data from other NoSQL and structured sources.
To learn more about how Cloud9 Charts and DataStax work together, visit the Cloud9 Charts booth #117 at Cassandra Summit 2015 in Santa Clara, California, September 22-24, 2015.

Cloud9 Charts blog: