Friday, December 22, 2017

Why the World Needs Another Business Analytics Tool

Future of business analytics

Time to Rethink Business Analytics Architectures 

I said to a friend not too long ago that I was going to do a new Business Intelligence startup. His response: "Just what the world needs, another BI solution." After telling him to stop being a nitwit, I realized I would have to answer the question: why does the world need another business analytics tool? Well, it doesn't. Not in the sense you are thinking, anyway. Let me explain.

When someone says BI or data analytics tool, I would hazard a guess you think about data visualizations and dashboards. However, to get to the point where you can build visualizations and create awesome-looking business dashboards, your data has already been moved, transformed, aggregated, moved, joined, and moved again until it finally lands in a prepped relational form in a SQL-friendly database. Traditionally, BI tools have left much of the heavy lifting for data analytics to middleware and data integration tools, like Talend, which extract-transform-load (ETL) data from various sources into a staging/reporting area.

That worked great ten years ago when data was relational, structured and prepped, but the data stack has completely changed in the last seven years. Now you've got SQL databases co-existing with workload-optimized NoSQL databases. You've got Elasticsearch for searches on large sets of data and MongoDB for storing general-purpose semi-structured data, along with REST APIs. At the same time, with over 40 years of history, relational databases aren't going away. They are going to remain in the enterprise for the foreseeable future.

So while data itself has massively evolved in the past decade, business analytics tools, even newer Cloud BI solutions, have not. They are still architected for smaller, structured, prepped relational datasets. The result is the fragmentation of enterprise data architectures into various analytics, data integration, and data prep solutions that bridge the gap between what traditional BI tools can handle and the reality of modern data stacks that include structured, semi-structured and unstructured data.

Data architectures are fragmented

Now, look at what you're trying to accomplish with your data analytics in the next few years. Most enterprises understand their data is becoming a valuable asset. How well you leverage it will positively or negatively impact your future competitiveness. Competitive advantage will come from transitioning to a data-driven enterprise, creating new data products and services and driving actions with real-time analytics. But to get there, your BI tools have to work with data that is essential to you no matter its source, size or speed.

I know almost all the existing BI tools claim to support modern data with their "native" connectors and drivers, etc. The reality is these drivers use ODBC frameworks, which were built 20 years ago for relational data. The whole point of their existence is to provide a translation layer BI tools can understand. What I mean by "understand" is a column-and-row model. But that model is no longer universally applicable because data is no longer structured this way. Trying to use these drivers for unstructured or semi-structured data is like putting a square peg in a round hole.

Enterprises that have a multitude of data sources are still ironing out how to get a unified view of all their data to support data agility and, more importantly, experimentation. I would argue that to achieve the level of data agility required for digital transformation, you have to significantly reduce, if not eliminate, the ETL processes and tools involved. Once you can provide an enterprise-wide unified view of data, it becomes the fundamental building block for predictive analytics and machine learning, natural language queries, prescriptive actions, and more. The difference from what we see today is that business analytics innovations are applied enterprise-wide, not just in one or two departments for specific use cases.

In short, the world doesn't need another BI tool; it needs an analytics platform that completely rethinks data and analytics architectures for modern data. Where ETL is minimized, if not eliminated. Where any kind of data can be analyzed and insights are visualized instantly, anywhere. Where business users interact with business analytics naturally and where data drives actions at all levels of your organization. Where companies can embed analytics easily to drive new monetization opportunities using their data. Where historical data can be seamlessly combined with Machine Learning to drive insights and actions.

Business leaders understand that analytics can transform their business. Now it's time for analytics vendors to build the platform to get them there. Our vision for Knowi is to lead the next wave of data analytics solutions that completely change how enterprises build, interact with, predict from and monetize their data.

Thursday, November 2, 2017

Real World Healthcare Analytics Dashboard Examples

It can be hard to find real-world examples of how organizations use analytics and dashboards to manage specific aspects of their business.  Recently, our partner Sagence walked through the dashboards they built for Shirley Ryan AbilityLab for denials management and data quality monitoring.

Denials Management

Using visualizations, claims managers at Shirley Ryan AbilityLab can point to patterns of denials and work with payers to uncover the root cause.  They can also use these identified patterns to predict higher-risk claims and start working with payers early in the process to avoid final review denials.

Healthcare analytics dashboard for denials management
Denials Management Dashboard
Note: Not Reflective of Actual Numbers
When claims managers have all the information about denials at their fingertips, it shifts the conversation with payers from anecdotal to fact-based.  In this 10-min video, you can see the different dashboards and how claims managers use each one to reduce denials.

Data Quality Monitoring

The challenge of monitoring data quality across multiple departments and different systems is significant but critical for analytics.  Shirley Ryan AbilityLab wanted to take an innovative approach to data quality monitoring by building a single dashboard for analysts to monitor data quality across the network.  

Healthcare analytics dashboard for data quality monitoring
Data Quality Monitoring Dashboard
Note: Not Reflective of Actual Numbers
With data quality threshold alerts and drill-downs, an analyst can identify where data quality issues are increasing and work with departments to adjust processes or conduct additional training. In this 10-min video, see the full dashboard and hear an explanation of how each visualization is working to help improve data quality.

You can download the Sagence and Shirley Ryan AbilityLab customer story here.  It details the full solution architecture and additional use cases.

Tuesday, October 17, 2017

Knowi Product Update Q3 2017

Knowi Product Update Q3 2017

You can see the exciting new capabilities described below in action. Lorraine Williams, Head of Success at Knowi, demonstrated them recently and we recorded it. To watch the replay, click the button below.

Register for product update webinar

Expanded Machine Learning Capabilities 

For supervised learning, you can now select algorithms for either classification or regression models, meaning you can predict continuous values (e.g., housing prices in Boston) or predict categories or classes (e.g., the likelihood of a person defaulting on a credit card payment).

Stats, Stats Everywhere
The system now allows you to view statistical metadata about your datasets, such as the total number of rows and columns, max. and min. values, mean and standard deviation. The dataset overview can be viewed by selecting the bar chart icon on the Analyze Grid. For more detailed analysis, pairwise scatterplots of the interaction of each data variable with its peers are also available from the overview.
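The statistics shown in the overview are standard summary measures. Here's a minimal, illustrative Python sketch (not Knowi's internal implementation) of computing that kind of dataset overview with just the standard library:

```python
import statistics

def dataset_overview(rows, numeric_columns):
    """Summarize a list-of-dicts dataset: row/column counts plus
    min, max, mean and standard deviation per numeric column."""
    overview = {"rows": len(rows), "columns": len(rows[0]) if rows else 0}
    for col in numeric_columns:
        values = [r[col] for r in rows if r.get(col) is not None]
        overview[col] = {
            "min": min(values),
            "max": max(values),
            "mean": statistics.mean(values),
            "stdev": statistics.stdev(values) if len(values) > 1 else 0.0,
        }
    return overview

data = [{"price": 10.0, "qty": 1}, {"price": 20.0, "qty": 3}, {"price": 30.0, "qty": 2}]
print(dataset_overview(data, ["price"]))
```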
Knowi Data Statistics

Filter Like You Mean It

Generating filter values based upon a separate query 
The system now supports filtering based on the results of another query. This dynamic filtering capability is achieved by first creating a query that returns possible filter values and then selecting the database icon next to the add/remove filter buttons. Clicking this option sets the auto-suggestions based on the secondary query results.

Knowi filter values

Setting the Filter Audience
The system now offers options when setting filters at both the Dashboard and the Widget level. You can set a personal filter that is only seen by you.  Admins and Dashboard owners can set a global filter, which acts as a default filter for all users, and admins can reset filters, which resets any personal filters back to the global default set by the Admin or Owner.

Multi-value user filter support
An admin can now add user-specific filter parameters to a user's profile to be passed into queries upon user login.
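As a rough illustration of the idea, user-specific filter values stored on a profile can be substituted into a query template at login time. The function and query below are hypothetical sketches, not Knowi's actual API or syntax:

```python
def apply_user_filters(query_template, user_filters):
    """Substitute user-specific filter values into a query template.
    A multi-value filter is rendered as a comma-separated IN list."""
    rendered = {}
    for key, value in user_filters.items():
        if isinstance(value, (list, tuple)):
            rendered[key] = ", ".join(f"'{v}'" for v in value)
        else:
            rendered[key] = f"'{value}'"
    return query_template.format(**rendered)

query = "SELECT * FROM claims WHERE region IN ({regions})"
print(apply_user_filters(query, {"regions": ["West", "Midwest"]}))
```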

Being RESTful

Added the ability to add paging to a REST-API datasource.  The system will automatically loop through multiple pages to collect data when paging tokens are defined.
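The paging pattern itself is straightforward: keep requesting the next page until the source stops returning a continuation token. A hedged Python sketch, with `fetch_page` standing in for a real REST call and the token/field names assumed:

```python
def collect_all_pages(fetch_page):
    """Loop through pages until no continuation token is returned."""
    results, token = [], None
    while True:
        page = fetch_page(token)
        results.extend(page["items"])
        token = page.get("next")
        if token is None:
            return results

# Mock REST endpoint returning three pages of data.
pages = {None: {"items": [1, 2], "next": "p2"},
         "p2": {"items": [3, 4], "next": "p3"},
         "p3": {"items": [5]}}
print(collect_all_pages(lambda t: pages[t]))
```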

The system now supports the concept of a Loop JOIN. This type of join allows you to execute and retrieve the results for the first part of the join and, for each row in the resulting set, extract the template value, update the second query in the join, execute it, then combine the result with the current row.
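In plain Python, the Loop JOIN described above can be sketched roughly like this (the names and query format are illustrative, not Knowi's syntax):

```python
def loop_join(first_results, second_query_template, run_query):
    """For each row of the first query, fill its values into a templated
    second query, run it, and merge each returned row with the current row."""
    joined = []
    for row in first_results:
        query = second_query_template.format(**row)
        for extra in run_query(query):
            joined.append({**row, **extra})
    return joined

# Toy stand-ins for the two queries in the join.
customers = [{"id": 1}, {"id": 2}]
orders = {"orders for 1": [{"total": 10}], "orders for 2": [{"total": 25}]}
result = loop_join(customers, "orders for {id}", lambda q: orders[q])
print(result)
```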

Other Cool Stuff

Adding Steps to the Ad-hoc Grid
After creating your query and previewing the data returned in the ad-hoc grid, you can now add multiple steps to the same data query workflow.

Learn More
Grid Formatting  A new feature allows for alignment of data in the Data Grid widget type. The data grid also supports conditional formatting of colors based on content value. Any formatting applied to the data in the grid is carried through into subsequent PDF exports.

Learn More 
Automated Dashboard Sharing There may be cases when any asset a user creates needs to be automatically shared with other groups. In such cases, you can apply an 'Automatic Share to Group' setting that automatically publishes any assets created by the user to those groups, making them available to other users.

Learn More
New Datasources Knowi has added native integration with Snowflake, a cloud-based SQL data warehouse.

Learn More
New Visualizations: Threshold
This visualization allows for the simple tracking of your key metrics. A user can:
  • Select the metric to monitor
  • Enter a threshold value
  • Choose the display color for when the metric is <= the threshold
  • Choose the display color for when the metric is > the threshold
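The rule above is simple enough to sketch in a couple of lines of Python (a toy illustration, with assumed default colors):

```python
def threshold_color(metric, threshold, below_color="green", above_color="red"):
    """Pick one color when the metric is <= the threshold, another when above."""
    return below_color if metric <= threshold else above_color

print(threshold_color(95, 100))   # within threshold
print(threshold_color(120, 100))  # above threshold
```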
Data Summary
Display the data in summary form. (Ex. Total messages delivered, opened, etc.)

Friday, October 6, 2017

Will Cockroaches and Data Silos Be the Only Things Left?

Destroying data silos is a quest any organization transitioning to data-driven must undertake.   Some see success while others fail after valiant (i.e. expensive) efforts.  Even for those that think they've succeeded in killing off their data silos, the need to stay vigilant is ever present because data silos are like cockroaches. You think you've killed them all, so you relax for just a minute, and they're back!

In this series, we'll discuss some options for eliminating your existing data silos, how to ensure new ones don't pop up and, finally, how to make the most of your data once it's unified.
  • The first option is to build a data warehouse.  Here you bring select data from select systems into a central repository where the data is normalized and prepped.
  • The second option is to build a data services layer where data engineers (technical users) can query disparate repositories and deliver a variety of blended data sets.
  • The third option is a hybrid that includes both a data warehouse and a data services layer.
  • The fourth option is a "data lake" where all data is moved into a massively scalable repository, like Hadoop or Spark, and tools are placed on top to enable querying.

Bridging Your Data Silos Using a Data Warehouse

Let's talk healthcare for a minute.  Healthcare data is ugly.  It's big.  It's a mix of structured and unstructured data.  It must be secured.  It's stored in a variety of different systems.  The combination of these traits makes sharing healthcare data a bit of a nightmare for even the most technically sophisticated hospital networks.  However, the upside of being able to efficiently share data across multiple departments is better patient outcomes, a reduction in claim denials, improved financial performance, etc., so it's worth the effort.  Let's take a look at what Shirley Ryan AbilityLab did with Sagence Consulting (a Knowi partner) to break down their data silos and implement a solution that enables data sharing across multiple departments within their hospital network.

The Shirley Ryan AbilityLab, formerly the Rehabilitation Institute of Chicago (RIC), is the #1-ranked global leader in physical medicine and rehabilitation for adults and children with the most severe, complex conditions — from traumatic brain and spinal cord injury to stroke, amputation, and cancer-related impairment. Shirley Ryan AbilityLab is the first-ever “translational” research hospital in which clinicians, scientists, innovators, and technologists work together in the same space, 24/7, surrounding patients, discovering new approaches and applying (or “translating”) research real time.

Obviously, data plays a core role in their mission but was often locked in disparate repositories across the hospital, limiting the ability of administrators and clinicians to fully leverage it.  A textbook example of data silos limiting an otherwise sophisticated data-driven culture.

AbilityLab decided to implement a healthcare data warehouse strategy to serve data to all their departments, from patient outcomes to finance.  They selected Sagence Consulting to assist them in building the first iteration of the data warehouse.

Before they coded their first query, the Sagence and AbilityLab team spent considerable time planning.  Data warehouses take time to develop, so doing the right preparation upfront is essential.

Set Impactful but Achievable Goals
This can be summed up in the old adage "Don't try to boil the ocean."  A critical factor for success in a data warehouse project is to build something that actually makes things better for people.  This means giving people access to data they didn't have before or making it significantly easier and faster for them to access existing data.  I know it sounds obvious, but you'd be surprised.

At the same time, be careful not to get too far over your skis and try to deliver something so "revolutionary" that it requires specialized technology or skills to implement or use.  If you keep saying to the team, "I know this sounds complicated, but it will change everything if we can do it," stop. Step back. Rethink.

Get Buy-in at All Levels
I hear a lot that "we've got management buy-in and executive-level sponsorship," so teams think they are all set and that once the data warehouse is up, people will line up to get their user accounts.  Well... not so much.  Change is hard for most people, especially those who perceive their roles as data gurus, the keepers of "the spreadsheet."  These people are incredibly vital to the success of your project, so dismiss them at your own peril.

They can help you understand where the bodies are buried when it comes to data-related processes.  They know what data is good and what data is bad and, usually, why.  The key is to show them how the technology will make their lives better so they can start using data to further their goals rather than spending all their time collecting, cleaning and preparing data for others to use. They will become the data warehouse's greatest advocates and help with the significant task of change management.

Understand Current State of Available Data
This step can take the longest because it often morphs into a data quality and data entry process analysis exercise.   Data quality is the elephant in the room when it comes to building a data warehouse or any kind of data analytics platform, for that matter.  You want to start off your new data warehouse with pristinely accurate and complete data.  Good luck with that.

Did I mention data is ugly?  Naturally, some cleansing and improvement of data must happen, but don't get obsessed with making every field complete and every piece of information validated.  Your time is better spent addressing the root cause of the data quality issues and adjusting data collection and data entry processes.  This will resolve data quality issues in the long term.   With a couple of concentrated efforts to address legacy data issues, your data quality will get there.

Build Processes People Can Actually Follow
That gets to my last point: data collection and data entry processes.  If you have data quality issues, they can probably be traced to requiring people to enter too much data into too many systems.  Wherever possible, automate integration between systems.  For data that must be entered, keep the amount required to a minimum, at least in the beginning.  Expecting people to enter data into one system and then turn around and enter similar data into another is not going to help your data quality issues.

If you cannot automate the integration, try to reduce the number of systems that need the data and use the data warehouse to provide a centralized view of information vs. each system.  I know business needs often dictate a different path but think about how you can leverage your data warehouse to actually minimize the amount of data that is duplicated across systems.

Sagence Consulting are experts in data, so they helped AbilityLab create a strategy that resulted in the successful deployment of an enterprise data warehouse built on PostgreSQL within six months of kicking off the project.  Knowi provides the analytics and visualizations for the embedded dashboards used by multiple departments across the Shirley Ryan AbilityLab hospital network.

We recently did a webinar with Sagence where they went through in detail the architecture they deployed to support Shirley Ryan's healthcare data warehouse.  In the webinar, the team from Sagence walked through three different use cases, including managing research project financials, claims management and data quality management.

Monday, September 25, 2017

Advanced Machine Learning on Big Data

Ever wished your data could give you a heads up? Tired of looking in the rearview mirror when it comes to business intelligence? We designed a product that adapts to your data, intelligently, so you can take action today.

Knowi provides Machine Learning models to drive action by combining AI & BI. We call this Adaptive Intelligence and it can only be found at

Learn More: Sign up for a 21-day free trial here.

Tuesday, August 22, 2017

IoT Time Series Data with InfluxDB and Knowi

InfluxDB is a Time Series NoSQL database specifically built for use cases centered around IoT/sensor data. This is a brief introduction into InfluxDB as well as analytics on InfluxDB using Knowi.  


Internet of Things -- the umbrella term for network-aware devices, wearables, homes, cars and other connected devices -- appears to be shedding its hype and starting to come of age. The projections are certainly eye-popping, with an expected 50-200 billion connected devices by 2020, $1.7 trillion in spending, and 173 million wearable devices by 2019.

Given the volume and velocity of sensor data, NoSQL databases are often the preferred option for storing it. Commonly used data stores include Cassandra, HDFS, Graphite and, more recently, Druid and InfluxDB.


InfluxDB is an open source database, created and supported by the InfluxData team. It can be downloaded here.


InfluxDB does not require a schema defined upfront. Queries are enabled using a nice SQL-like syntax with easy extensions for time-based queries. Write and Query APIs are exposed through a REST API.
Given its focus on time series, data is always stored with a timestamp component, which acts like a primary index.

It uses a Line Protocol format, with measurements (like a SQL table), tags (like an indexed column) and fields (an unindexed column).

Data Format:
<measurement>[,<tag-key>=<tag-value>...] <field-key>=<field-value>[,<field2-key>=<field2-value>...] [unix-nano-timestamp]

stock,symbol=AAPL bid=127.46,ask=127.48 1434067467100293230

In the example:
stock is the measurement
symbol is a tag
bid and ask are fields
The final number is the Unix timestamp in nanoseconds (optional: InfluxDB will auto-insert the time for the record)
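As an illustration, assembling a line-protocol string from those pieces can be sketched in Python (simplified: real line protocol has additional escaping and type-suffix rules this ignores):

```python
def to_line_protocol(measurement, tags, fields, timestamp=None):
    """Build a simplified InfluxDB line-protocol string:
    <measurement>[,tag=value...] field=value[,field=value...] [timestamp]"""
    tag_part = "".join(f",{k}={v}" for k, v in tags.items())
    field_part = ",".join(f"{k}={v}" for k, v in fields.items())
    line = f"{measurement}{tag_part} {field_part}"
    if timestamp is not None:
        line += f" {timestamp}"
    return line

print(to_line_protocol("stock", {"symbol": "AAPL"},
                       {"bid": 127.46, "ask": 127.48},
                       1434067467100293230))
```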


InfluxDB is straightforward to download and setup.

a. Download the appropriate package here.
b. Start the service.

To import some data:
Download test data: curl > NOAA_data.txt
(Water level data from the National Oceanic and Atmospheric Administration)

Insert: influx -import -path=NOAA_data.txt -precision=s


Queries can be executed against InfluxDB using InfluxQL, a SQL-like syntax for querying data stored in InfluxDB. Example:

SELECT * FROM mydb WHERE time > now() - 1d

This returns all tags and fields for the mydb measurement for the last day.


InfluxData provides Chronograf, a visualization tool to interface with InfluxDB. This might be an option to consider for time series charts if you are planning to store all your data in InfluxDB alone.   

Knowi provides native integration with InfluxDB to query, aggregate, visualize and analyze data from InfluxDB, but it also allows multi-datasource joins with your polyglot persistence architectures (data across various NoSQL and SQL databases, REST APIs and files), all the way to producing interactive dashboards for non-technical users.

To see InfluxQL in action:

  1. Go to the InfluxDB Instant Visualization page at Knowi. It's configured with a live InfluxDB database and InfluxDB queries.
Note: The query generator section can be used to discover measurements, tags & fields and build basic InfluxQL dynamically.

  2. Click on Show me for instant visualizations.
  3. Experiment with your own time series InfluxQL queries against the data.

Click here to learn more about Knowi in the IoT space.


Monday, August 21, 2017

How Lemonaid Health Monitors Performance with Knowi

Our friends at Lemonaid Health use Knowi to manage their day-to-day operations and deliver business performance reporting across their teams and provider partners. Here is a little summary of their story.

If you are not familiar with Lemonaid Health, they are an innovator in telehealth services. As of April 2017, their team of doctors treats things like sinus infections, UTIs, acne and acid reflux, and writes prescriptions for birth control pills.   Their mission is to make it easy to access affordable, high-quality healthcare.

THE CHALLENGE:   As new health services or new States roll out, understanding service performance in near real-time is critical for Lemonaid Health. They had two challenges: 1) ensuring people had the latest information and 2) a heavy reliance on the engineering team to create and run SQL queries and then transcribe the results to Excel spreadsheets.

“We fall under HIPAA so a cloud-based solution had to have the levels of security we need to be sure our patient data was protected,” says Simon Williams, Chief Technology Officer at Lemonaid Health.

Lemonaid began using a combination of Excel and native SQL queries but found this did not provide the near real-time performance reporting they wanted and was too taxing on engineering resources who had other priorities.

THE SOLUTION:  Lemonaid selected Knowi "because it was cloud-based and the fact the data would stay on premise," says Williams.  A hybrid deployment enabled a cloud implementation of Knowi connected to on-premise instances of MySQL and MongoDB. Lemonaid's commercial partners access Knowi directly, but filters limit their view to only their own data.  Within the dashboards, their partners have full capabilities to run ad-hoc queries, change filters or visualization types, and save them as custom reports. In addition to direct access for near real-time reporting, Lemonaid extensively uses Knowi's sharing capabilities to send out PDF reports to offline users, ensuring everyone, company-wide, has access to up-to-the-moment performance data.  This flexibility in data delivery methods was essential in transitioning the organization to be data-driven.

THE RESULTS:  The initial goal of creating a company-wide unified view of all business performance data was quickly achieved.  "To be able to put up some real-time dashboards that showed what was going on at any point in time was key business information," says Simon Williams.   "… [It] helped create a great buzz in the office as we displayed it on a large screen and could see counters going up as we served new patients."

Once the engineering team built some foundational queries, “it was then easy for the business users to go in and look at the data however they wanted by State, service, doctor, day.”  The Lemonaid engineering team did not have to dedicate any additional time to generating reports or views.  Business users were taught to fish for the data themselves with Knowi.

Lemonaid quickly recognized the tremendous value of the data they collected for their partners, and they used Knowi to build and deliver a new class of data products for their commercial partners. Their partners now have near real-time data about how they are servicing their patients, demographics information about those patients and patient and service volume by State. 

To read or download Lemonaid's full story, head over to our resources page here.  

Monday, July 17, 2017

Add Predictive Analytics on Your Dataset in 3 Steps

Knowi Predictive Analytics brings machine learning capabilities to every Data Engineer.

Add Predictive Analytics on Your Datasets
Integrate machine learning into your analytics workflows and drive data directed actions.

The platform provides two options:
  • Out-of-the-box predictive analytics capabilities that test a dataset against a variety of forecasting models to determine the model best suited to the data, i.e., the one with the least Sum of Absolute Errors (SAE).
  • Built-in predictive and Machine Learning algorithms that can be plugged into data workflows.
This post focuses on the first option, taking a hands-on look at how it works.  We'll take monthly stock prices for Amazon and determine predicted values over a three-month period starting in July, in a few simple steps. No signup is required to follow along.

Models used include:
  • Simple Exponential, Double Exponential, Triple Exponential Smoothing Models
  • Moving Averages and Weighted Moving Averages
  • Naive Forecasting Model
  • Regression and Polynomial Regression Model
  • Multiple Linear Regression Model
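To illustrate the selection idea (a toy sketch, not Knowi's implementation, with only two simple models shown), each candidate forecaster can be scored by its Sum of Absolute Errors over the history, and the lowest-SAE model kept:

```python
def naive_forecast(history, i):
    """Naive model: predict the last observed value."""
    return history[i - 1]

def moving_average(history, i, window=3):
    """Predict the mean of the last `window` observations."""
    window_vals = history[max(0, i - window):i]
    return sum(window_vals) / len(window_vals)

def sae(model, history):
    """Sum of Absolute Errors of one-step-ahead predictions over the history."""
    return sum(abs(history[i] - model(history, i)) for i in range(1, len(history)))

def best_model(history, models):
    """Return the name of the model with the least SAE on the history."""
    return min(models, key=lambda name: sae(models[name], history))

prices = [354.53, 380.16, 372.10, 421.78, 429.23, 434.09, 536.15]
models = {"naive": naive_forecast, "moving_average": moving_average}
print(best_model(prices, models))
```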

The monthly stock prices look like this:

Date, Price
06/01/16, 719.14
05/01/16, 683.85
04/01/16, 598.00
03/01/16, 579.00
02/01/16, 574.81
01/05/16, 633.79
12/01/15, 679.36
11/2/15, 628.35
10/1/15, 625.90
9/1/15, 511.89
8/3/15, 512.89
7/1/15, 536.15
6/1/15, 434.09
5/1/15, 429.23
4/1/15, 421.78
3/2/15, 372.10
2/2/15, 380.16
1/2/15, 354.53


1. Copy and paste the above dataset into

2. Click on Show me. The data will be parsed and visualized immediately. 

3. To perform predictions:

   i. Click on Analyze from the menu of the time series chart. This opens up an Analysis mode.
  ii. Drag date into the Grouping field. 
  iii. Click on the 'Add a derived Field' option. Enter a name ("Predictions", for example) and, in the operation, enter predict(price,date,07/01/2016,1m,3). This will choose the best model based on the historical accuracy of each model to determine the projected prices over a three-month period, on a monthly basis. 

That's it! In a few simple steps, you can apply predictive analytics on any of your own datasets. Enjoy!

To get started with Knowi Predictive Analytics, head to and try it out for free.

Additional Machine Learning Resources:

Predictive Analysis docs:
Advanced Machine Learning (AI) Capabilities:
All Documentation: