Monday, May 7, 2018

Knowi Product Update Q1 2018


To see these exciting new capabilities in action, please join Lorraine Williams, Head of Success at Knowi, for a web demo on Wednesday, May 16th at 10:30 AM PT.

Below are the highlights of our product updates for this quarter.  Enjoy!


Ready. Set. Data Prep.

We now offer a full data preparation suite that helps guide you through the steps necessary to ensure that your data is machine learning ready.

After selecting the training dataset and your prediction variable, the system provides statistical distribution information for each metric, along with an easy-to-use wizard that guides you through a series of steps to prepare your data for analysis.

These steps include verifying and amending the data types, identifying and dealing with outliers and missing values, rescaling attributes using either normalization or standardization techniques, creating discrete groupings of your data and, finally, creating dummy variables. You can then select model features manually or let the system select them for you.
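As a rough sketch of what those steps amount to outside Knowi, here is a hedged Python/pandas example. The dataset and column names are hypothetical, and this is an illustration of the general technique, not Knowi's implementation:

```python
import numpy as np
import pandas as pd

# Hypothetical training data; the columns are made up for illustration
df = pd.DataFrame({
    "age":    [25.0, 32.0, 47.0, 51.0, 23.0, 999.0],   # 999 is a bad sentinel value
    "income": [40000.0, 52000.0, None, 88000.0, 39000.0, 41000.0],
    "plan":   ["basic", "pro", "pro", "basic", "basic", "pro"],
})

# 1. Verify/amend data types
df["age"] = df["age"].astype(float)

# 2. Outliers and missing values: mask impossible ages, then impute medians
df.loc[df["age"] > 120, "age"] = np.nan
df = df.fillna(df.median(numeric_only=True))

# 3. Rescale attributes (standardization: zero mean, unit variance)
for col in ("age", "income"):
    df[col] = (df[col] - df[col].mean()) / df[col].std()

# 4. Discrete groupings (binning into quantiles)
df["income_band"] = pd.qcut(df["income"], q=2, labels=["low", "high"])

# 5. Dummy variables for categorical attributes
df = pd.get_dummies(df, columns=["plan"])
```

The wizard automates exactly this kind of pipeline, with the added step of feature selection at the end.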

Learn More >

Join Our Community

We have revamped the Knowi documentation site to include a Community portal where you can post and answer how-to questions, common getting-started questions, and even the occasional undocumented "feature". We're excited to give you a channel to share your ideas, provide expertise, and send us feedback on features and wishlist items.

Please join the Community by going to our docs page and creating a post or leaving a comment.

The 'What's New' section contains our product release notes and the 'Knowledge Base' section contains the product documentation.

Lovin' Query Management

Query Revert

The system now has the ability to revert to a prior version of a query from the Query History page. When reverted, the existing query will be overwritten with the selected version. 

Knowi - Show Query History

Query Cancel

The system now supports the ability to cancel a running query. The behavior of this action depends upon the datasource type and may not actually kill the query on the database itself. However, the thread executing the query will be interrupted, allowing it to disconnect and return. A new run of the same query can then be executed.
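Conceptually, the interrupt-and-return behavior resembles cooperative cancellation. A hedged Python sketch of that pattern (a hypothetical illustration, not Knowi's actual implementation):

```python
import threading

class QueryRunner:
    """Runs a query in batches and checks a cancel flag between fetches."""

    def __init__(self):
        self._cancelled = threading.Event()

    def cancel(self):
        # Interrupt the executing thread's loop; the query on the
        # database itself may keep running until it times out there.
        self._cancelled.set()

    def run(self, fetch_batch):
        rows = []
        while not self._cancelled.is_set():
            batch = fetch_batch()
            if not batch:
                break
            rows.extend(batch)
        # Disconnect and return whatever was gathered (possibly nothing)
        return rows
```

The key point mirrors the release note: cancellation frees the client-side thread so a new run can start, even if the database finishes the old query on its own schedule.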

Knowi - Query Cancel

Pivot! Pivot! Pivot!

We've added a new Pivot table visualization. This supports easy pivoting of data without the need for the Cloud9QL transpose function of the regular Data Grid. Simply drag the available fields into the following areas to slice and dice the data accordingly: 
  • Columns 
  • Rows 
  • Values 
  • Filters 
Knowi - Pivot Table Visualization
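For readers who think in code, the same Columns/Rows/Values/Filters idea can be sketched with pandas (the data here is hypothetical):

```python
import pandas as pd

sales = pd.DataFrame({
    "region":  ["West", "West", "East", "East"],
    "quarter": ["Q1", "Q2", "Q1", "Q2"],
    "revenue": [100, 120, 90, 110],
})

# Rows -> index, Columns -> columns, Values -> values,
# Filters -> a boolean mask applied before pivoting
pivot = sales[sales["revenue"] > 0].pivot_table(
    index="region", columns="quarter", values="revenue", aggfunc="sum"
)
```

Dragging fields between the four areas in the Knowi UI corresponds to changing these parameters.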

Other Cool Stuff

Widget Sharing

Prior to this release, widgets could not be shared on their own unless they were part of a shared dashboard. This has changed: widgets can now be shared in isolation with both users and groups, in either View or Edit mode.

Login As (For Admin Users)

Admin users can now log in as another user on their team. Once logged in, they can perform any action that the accessed account has access to. When logged in as another user, the system will display the accessed account's username at the top of the screen.

API Pagination

The API pagination function has been modified to also support a next-page URL parameter. This enhances the current method of using a page token, such as a page number or a specific flag.
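The difference between the two schemes can be sketched in Python: with next-page-URL pagination, each response carries the URL of the following page instead of the client computing a token. The `fetch_page` function and `next_page` key below are assumptions for illustration, not Knowi's actual API:

```python
def fetch_all(fetch_page, first_url):
    """Follow next-page links until the API stops returning one.

    fetch_page(url) is assumed to return a dict shaped like
    {"results": [...], "next_page": "<url or None>"}.
    """
    results, url = [], first_url
    while url:
        page = fetch_page(url)
        results.extend(page["results"])
        url = page.get("next_page")
    return results
```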

Data Explorer

We have added the ability to expand/collapse the sample fields within the Data Explorer at all nested levels.

Tuesday, April 10, 2018

4 Simple Steps to Create Reports and Drive Actions from REST APIs

Real-time Customer Support Team Performance Reporting Using Zendesk APIs

APIs are everywhere these days. While most SaaS products now offer API access, interacting with those APIs to gather insights and drive actions typically requires custom implementations to manage and maintain.

This post walks through how to query and drive actions from APIs with Knowi, using a real-world use case with Zendesk APIs to alert our customer success team.

Knowi is a new kind of analytics platform designed for modern data architectures, including:

  • Calling API endpoints
  • Prepping and transforming the results
  • Multi-source joins to combine results with other API calls and/or NoSQL/SQL databases
  • Visualizing and triggering actions on the data
  • Integrating machine learning for forecasting, anomaly detection, and predictive analytics

Use Case

Our customer success team needed alerts on tickets under certain conditions.  For example, new tickets that have not been responded to for over 3 hours.

At a high level, this boils down to:
  • Retrieve Ticket data from Zendesk
  • Join with another REST call in Zendesk to retrieve Submitter data
  • Transform/Manipulate the data (using Cloud9QL, a SQL-like syntax, to filter the results)
  • Setup a Trigger Notification into Slack with a list of tickets that require attention

Connecting to Zendesk API

The first step is setting up a REST datasource in Knowi by specifying a base URL (https://<yourdomain>), along with the authentication method. Zendesk supports basic user/password auth, so we'll use that. Knowi will pass along the authentication as part of the API calls in subsequent requests.

Knowi REST API Query

Calling Endpoints

Retrieving Ticket Data

The API endpoint is
GET /tickets.json

The API returns tickets in a nested JSON array, with associated details for each ticket.

 "tickets": [
     {
       "id":      35436,
       "subject": "Help I need somebody!"
     },
     {
       "id":      20057623,
       "subject": "Not just anybody!"
     }
 ]
To query this in Knowi:
  • Specify the end-point to query. In this case, it’s /tickets
  • Pass along URL params sort_by=created_at&sort_order=desc&include=comment_count to get the most recent tickets first and include the comment count in the ticket.
  • Use Cloud9QL to manipulate the results. For example, to get the total tickets by date from the results, the syntax would be:

select expand(tickets);
select count(*) as count, date(created_at) as week
group by date(created_at)

This expands the nested array to calculate total tickets by date. This can be immediately visualized and added into a dashboard (for sharing and embedding).
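As a point of comparison, the same expand-and-count logic in plain Python (the ticket payload below is hypothetical):

```python
from collections import Counter

response = {
    "tickets": [
        {"id": 35436,    "subject": "Help I need somebody!", "created_at": "2018-03-01T10:00:00Z"},
        {"id": 20057623, "subject": "Not just anybody!",     "created_at": "2018-03-01T15:30:00Z"},
        {"id": 20057900, "subject": "Another ticket",        "created_at": "2018-03-02T09:00:00Z"},
    ]
}

# expand(tickets) -> one record per ticket;
# date(created_at) -> truncate the timestamp to a day bucket
counts = Counter(t["created_at"][:10] for t in response["tickets"])
```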

Knowi REST API data visualization

To get the new tickets that have not been responded to:

1. Retrieve Ticket data from Zendesk

Use the following Cloud9QL to get the relevant fields from the nested tickets array:
select expand(tickets);
select id as ticket_id, subject, created_at
order by ticket_id desc;
Knowi REST API data transformation

2. Join with another REST call in Zendesk to retrieve Submitter info

Knowi supports joins with other API endpoints (or disparate databases). Since the tickets contain only a submitter id, we get the submitter name/email from a second endpoint (/users) and join it with the first set of results.
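The join itself is a classic lookup of each ticket's submitter id against the /users results. In plain Python terms (the record shapes below are hypothetical):

```python
tickets = [
    {"ticket_id": 35436,    "submitter_id": 7},
    {"ticket_id": 20057623, "submitter_id": 8},
]
users = [
    {"id": 7, "name": "Ada",   "email": "ada@example.com"},
    {"id": 8, "name": "Linus", "email": "linus@example.com"},
]

# Inner join on tickets.submitter_id == users.id
users_by_id = {u["id"]: u for u in users}
joined = [
    {**t,
     "submitter_name":  users_by_id[t["submitter_id"]]["name"],
     "submitter_email": users_by_id[t["submitter_id"]]["email"]}
    for t in tickets
    if t["submitter_id"] in users_by_id
]
```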

3. Post-Join Data Transformation

The last step of the data processing is to filter the results down to what we are looking for: tickets in open status that were not created by internal users. In addition, we only want to receive alerts during business hours, Pacific Time.
Knowi REST API joins

4. Action: Setup a Trigger notification into Slack

With this data, we are now ready to drive an action. Knowi provides Triggers to drive calls into other APIs, emails, or Slack. In this case, we'll send a Slack notification to our Customer Success team with the list of tickets that require attention attached.

Triggering actions from REST APIs with Knowi

Knowi Slack Alert Configuration
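Outside Knowi, the same kind of notification can be driven through a Slack incoming webhook. A minimal sketch, where the webhook URL and ticket record shapes are assumptions for illustration:

```python
import json
from urllib import request

def build_slack_payload(tickets):
    """Build the Slack message body listing tickets awaiting a response."""
    lines = [f"#{t['ticket_id']}: {t['subject']}" for t in tickets]
    return {"text": "Tickets awaiting response:\n" + "\n".join(lines)}

def notify_slack(webhook_url, tickets):
    """POST the payload to a Slack incoming webhook (URL is a placeholder)."""
    req = request.Request(
        webhook_url,
        data=json.dumps(build_slack_payload(tickets)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    return request.urlopen(req)
```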


To recap: we pulled support tickets from Zendesk through the Zendesk Ticket API and visualized tickets by date. We then joined in Submitter information to create a blended dataset that we could use for alerting, and created an alert that sends a list of open tickets exceeding our 3-hour response threshold to our customer support Slack channel.

Try it on your own APIs. See our REST API Playground to point to your own.


Thursday, March 15, 2018

Tableau and MongoDB Analytics: Why It's a Bad Marriage

The Challenges of MongoDB Analytics with Tableau

First off, we are not here to bash Tableau. Tableau is an excellent analytics tool for structured relational data. Our point here is that data has moved beyond well-understood structured data and now includes semi-structured and unstructured data stored in newer NoSQL databases, like MongoDB. Trying to use analytics tools that are architecturally committed to relational structures for analytics on MongoDB (NoSQL) is the definition of putting a square peg in a round hole.

To be fair, Tableau was developed before NoSQL and before Big Data. It's designed to understand SQL and nothing else, making analytics on newer data sources a significant challenge. To do it, customers typically perform data discovery somewhere else to model the data so they can build transformations and mappings to load it into a relational structure like a MySQL table. This is accomplished either through ETL or using ODBC drivers that "map" unstructured data into a table-like structure, which is exactly what the MongoDB BI Connector does.

MongoDB BI Connector Description


In case you missed it, the BI Connector is moving data out of MongoDB into MySQL tables so Tableau will work. MongoDB is a powerful database solution for modern data. Moving data out of MongoDB into MySQL for analytics seems to somewhat defeat the purpose of the investment.

Questioning whether Tableau is the right analytics tool to meet your future needs should be top of mind as Big Data becomes deeply integrated into your operational analytics. As you begin to explore how to leverage advanced analytics and machine learning to develop new products and services, these systems are already complicated; it's hard to imagine how legacy BI tools stay relevant when they add so much unnecessary complexity, overhead, and cost while limiting data availability and potentially impacting data fidelity.

As the future gets closer, we are seeing the early signs of a shift: a new wave of innovation in the analytics space, driven by business expectations for instant insights and the fundamental change in data brought by cloud, Big Data and, most recently, the Internet of Things.

Data engineers and business teams want the same instant data discovery and self-service analytics with MongoDB data. They don't understand the complexities behind why Tableau and other SQL-based analytics tools struggle to work with MongoDB the way they do with a MySQL database, and because of that, they have little patience for waiting weeks or months for MongoDB data to be made available for analytics.

This lack of patience, combined with business leaders' expectation that analytics will drive decision making at all levels of their organizations, means a fundamental shift is underway in how data and analytics teams integrate modern unstructured and semi-structured data into their analytics architecture.

There is growing intolerance for building heavy ETL processes to move, transform, prep and load data into a staging area. In addition to slowing projects down, the cost of changes is high, making experimentation less likely to happen. The trend is towards simplifying data architectures with native integration to these modern data stores, like MongoDB, Cassandra, Couchbase, etc.

Today, in many cases, going native means building custom code and processes, which limits the number of teams that can access the data. Again, this is pushing analytics tools to step up and manage data from new data sources differently, without requiring it to be moved and transformed back into relational structures.

As I mentioned, we are at the early stages of the next wave of innovation in analytics, where you will see changes in how analytics platforms interact with newer data sources and handle structured, semi-structured and unstructured data in the same way. Only then will business teams be able to fully leverage their data, experiment with new insights and machine learning, and create data-driven actionable intelligence.


Our mission at Knowi is to simplify and shorten the distance between data and insights for all data: unstructured, structured and multi-structured. To accomplish this, we believe you need to: a) leave data where it is, and b) enable data engineers to explore all data without any restrictions that result from mapping it to a relational structure.

We are a certified MongoDB partner and the only analytics partner to natively integrate. No ETL. No ODBC drivers. No proprietary query language.

You can play around with a NYC Restaurant dataset in our MongoDB sandbox to see for yourself how nice it is not to have to move your data out of MongoDB to analyze it.

We also natively integrate with most other leading NoSQL, SQL, and RDBMS data sources, as well as REST APIs, enabling data engineers to create blended datasets and visualizations in minutes.

Download our solution guide: Why Native Matters

Friday, February 23, 2018

Will Cockroaches and Data Silos Be the Only Things Left? Part II

Data Services is the Answer!  Yes, but...

data services for analytics
In our last post, we talked about using a data warehouse strategy as one of the ways to break down data silos across multiple departments and systems.  Building a data warehouse is a traditional way to tackle the data silo problem.  However, successful projects take serious organizational commitment and months of development time.  With data moving faster than ever and business teams increasingly looking for the ability to rapidly experiment with analytics, the wait time for a data warehouse often leads business teams to move on before the warehouse can even be deployed.  As a result, business and IT teams are looking for other ways to unify data across disparate data sources.

The second option is to build a data services layer where data engineers can query disparate repositories, including unstructured and structured data, to build blended datasets for business teams. There are many advantages to this approach over building a data warehouse.

Benefits of Data Services vs. Data Warehouse Approach

No moving data

The elimination of ETL processes, or of custom extract-and-load jobs to wrangle data from your various sources, transform it into a relational structure and load it into your data warehouse, is arguably the most significant benefit of a data services strategy over a data warehouse strategy. Data services, by their nature, do not move data from the source systems and allow you to blend data to create virtual datasets. For example, you can pull data from your MySQL database, blend it with data from your Cassandra data store, and create a new dataset for use in analytics. However, there is another factor to consider: if you're looking at data virtualization solutions, many still require data to be transformed into a common relational structure before it can be used. Data has evolved beyond well-understood relational models, so forcing it to conform adds cost and complexity; choose your solution wisely.

Free advice: Make sure the data services layer in your solution natively supports (no drivers to install) unstructured and semi-structured formats from NoSQL and REST API sources, so you can avoid the need to transform your data and shoehorn it back into a relational structure. Even if you are only using structured data today, that may not be the case tomorrow, as most new data that is interesting to explore is semi-structured or unstructured.

Experimentation and agility

Most business teams and leaders understand that analytics can make the difference between profit and loss, or between beating the competition and taking a beating. As analytics becomes critical to a company's ability to compete, agility in building new data pipelines also becomes critical. With a data services layer that natively integrates with unstructured and structured data sources, you give your data teams the ability to rapidly discover and experiment without the overhead of updating schemas and ETL processes. By unshackling them from a pre-defined schema, they can transition to an iterative, agile development model for building data analytics products and work closely with the business to rapidly experiment and refine. Analytics products built in this manner are much more effective at moving business teams towards data-driven decision making because they deliver exactly what teams need, much faster. If business teams have to wait weeks or months for their change requests to be acted upon, they will have moved on.

Free advice:  Ask yourself how difficult it is to add a field, a table, a new data source into your existing analytics architecture.  If the answer is, "I'd rather have a root canal" then you might have a problem.  Your data services layer should resolve this problem, not contribute to it. Be sure you're not adding barriers to experimentation by forcing conformance to a relational structure when you have semi-structured or unstructured data sources in your stack.

Reduced cost of ownership

All things considered, the data architecture when using a data services layer should be less complex than that of a data warehouse simply because you are not moving data and no pre-defined schemas are used.  With reduced complexity comes a reduction in costs to build and maintain the architecture as you need fewer resources to develop the data pipeline and the cost to make changes is relatively low.  

Free advice: I can't emphasize enough that simplification goes out the window as soon as you start transforming unstructured data back into a relational structure, so this benefit assumes native integration with no use of drivers, etc. Building the integration may sound difficult, but there are tools out there that have already solved the native integration problem. We are one of them, but there are others.

In our humble opinion, the need to move, flatten, transform and apply structure to unstructured data should be a thing of the past. We are evidence of an emerging wave of new analytics tools leading the way to a future of data analytics where business self-service, experimentation, and data agility thrive. Come catch the wave with us! Sign up for a free trial here.

Thursday, January 18, 2018

Knowi Product Update Q4 2017

To see these exciting new capabilities in action, please join Lorraine Williams, Head of Success at Knowi, for a web demo on Wednesday, January 31st at 11:00 AM PT.

Lovin' Query Management

Our query capabilities are at the heart of what makes Knowi different. We constantly add capabilities, but in the past few weeks we've focused our efforts and added a number of enhancements:

Join Builder
In addition to performance improvements to the Join functionality, we now provide query join assistance from within the Query page. The supported join types are listed along with auto-detected key field candidates for each side of the join. You can, of course, still enter the join criteria manually.

Join Query Help
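Auto-detecting join key candidates generally comes down to comparing field names (and, in practice, sampled values) on both sides of the join. A naive sketch of the name-matching part, which is an illustration and not Knowi's actual algorithm:

```python
def join_key_candidates(left_fields, right_fields):
    """Suggest join key pairs: exact name matches, plus id-style
    near-matches like 'submitter_id' on one side vs 'id' on the other."""
    left, right = set(left_fields), set(right_fields)
    # Exact name matches are the strongest candidates
    candidates = [(f, f) for f in sorted(left & right)]
    # Then look for prefixed id-style matches across the two sides
    for lf in sorted(left - right):
        for rf in sorted(right - left):
            if lf.endswith("_" + rf) or rf.endswith("_" + lf):
                candidates.append((lf, rf))
    return candidates
```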
Save Query as Draft
You now have the ability to save a query in progress without creating an associated dataset and widget.  Go for that coffee break!

View Query Change History
Wondering who messed up your query? Wonder no more: you can now view an audit history of query changes. This is only applicable if you have edit permissions for the query in question. From the Query Listing page, a history icon is now available; when clicked, it shows the username and timestamp for each change.

Query Filters Suggestions
Query filter auto-suggestions and hit-list filter capabilities can now be accessed from within the Query Builder itself.

Join Post-Processing
You can now apply Cloud9QL functions to a dataset post join.

Preview Data at Each Join Step
You can now preview dataset results at each join step.  This can be especially useful when you have multiple join steps.

We're Getting Slacky

Slack integration allows you to send messages to your Slack channel(s) when an alert condition is triggered. When the condition is met, we'll send a message to the predefined channel(s), including an attachment of the full data or the conditional data, depending on the options selected.

Stranger Danger

Enterprise data security is top of mind for everyone.  Whenever we can, we leverage new security capabilities from our database partners as quickly as possible.

SSL Support
We now support SSL-enabled MarkLogic and Datastax/Cassandra.

Role-based access control (RBAC) Support
We now support RBAC in Couchbase 5.0.

Access Control List
The system now supports the ability to create white and black lists of datasource assets (tables/collections/indexes). This allows the datasource creator to specify which assets are available to subsequent queries. The datasources that currently support the ACL functionality are Elasticsearch, Oracle, and Knowi Elasticstore.

Other Cool Stuff

Email Reporting Improvements
Parametrized Report Templates
The Email Report function has been enhanced to pass in user-level query filters, ensuring that only the data the recipient is allowed to see is contained within the report. Any dataset attachments also adhere to the passed-in parameters.

Analyze Grid Formatting
A number of usability enhancements were made, including:
  • Ability to view statistical data for numerical columns
  • Added formatting options for numeric and date columns: currency, date, percent and decimal places
  • Ability to resize columns
  • Added Count option for column aggregation
  • Added 'does not equal' as an operand in the conditional grid formatting options 
Embed API Formatting
An option has been added into the JS Embed API that allows for auto-sizing of content based upon the full height of the dashboard. 

New Datasources
Added support for:
  • Apache Hive
  • Couchbase 5.0
Cloud9QL Enhancements
Cloud9QL Function AutoComplete
When adding a function in Analyze or preview modes, the system now shows a dropdown list of available Cloud9QL functions, along with autocomplete capability.

A new Cloud9QL function has been added that allows you to control the display of numerical values. The format is NUMBER_FORMAT(<number>,<format>); for example: select number_format(clicks,##,###.00) as Number of Clicks

If your data is a JSON string, the PARSE function can be used to convert it into an object which can then be further manipulated and processed.
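The same idea, expressed in Python terms (the JSON string below is hypothetical):

```python
import json

# A field whose value arrives as a JSON string...
raw = '{"user": {"name": "Ada", "tags": ["admin", "ops"]}}'

# ...becomes a navigable object after parsing -- analogous to
# applying PARSE to the field in Cloud9QL
parsed = json.loads(raw)
name = parsed["user"]["name"]
```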

There is also a new function that provides an alternate value to be used in case the specified field doesn't exist or its value is NULL.