Logi Analytics | Business Intelligence and Analytics Blog
As data-driven applications become increasingly complex every day, it’s necessary to build in smarter programmable checks into the system to help you add, audit, and update business rules over time. With Logi Info Studio’s new Procedure.Else and Procedure.Switch elements, you have the ability to easily create powerful and robust If, Then, Else statements. These conditional expressions help to build an easy-to-understand visual representation of your business rules, which can be easily changed and audited over time.
These elements extend the power of the Process Task elements in Logi Info to notify users, upload data, change data, switch and set new values, and much more. To demonstrate how easy-to-use these elements are, I built a Tic-Tac-Toe game using Logi Info.
Tic-Tac-Toe is played on a 3×3 matrix represented in the form of a board. The rules are simple: the first player to place three of their markers in a row (horizontally, vertically, or diagonally) wins. To show you how the new Procedure.Else and Procedure.Switch elements work, we’ll take a look at the Process Task elements used to create this game in Logi Info Studio.
The first step to building Tic-Tac-Toe is to model out the system interactions and what is required to build a robust system of checks and balances. We know that a series of eight different combinations on the board will result in a win. If neither of the two players is able to match any of these conditions, the game results in a stalemate, or tie.
Logi’s conditional branching elements allow you to map out the entire Tic-Tac-Toe game in three, easy-to-understand Process Task branches—using these new conditional elements as the backbone. Here’s how I did it.
Building the Report Definition:
The first step is to create a game board where the player interactions can take place. In Tic-Tac-Toe, this requires a 3×3 matrix where each square has a unique value of 1 through 9. This value is passed as the sqNumber parameter, the ID of the square. Using a Data Multi Column List element, with its Multi-list Direction attribute set to Across, produces a 3×3 matrix as shown below.
Now that we have a game board, we can create the Process Task elements required to play Tic-Tac-Toe. Note that this document does not cover building the report definition known as the game board, but the report definition itself contains many Note elements to help you fully understand the building blocks.
Creating the Process Task of X or O:
Whenever a player clicks on a square in the board, the first Process Task element to be called is the tXorO Task element. This task does two primary things: it determines which player clicked the square and sets that square’s value, then switches the turn to the other player.
This is built by using the following elements:
Procedure.Switch – This element retrieves the current player’s turn and passes that value into the child Procedure.Switch Case elements. Here, we are determining which player clicked on a square, and then which Procedure.Switch Case element should be evaluated based on that value.
Each Procedure.Switch Case element is looking for a value of ‘1’ or ‘2’. If that value is found, then the next path is taken to determine which Procedure.If element should be executed. Each Procedure.If element checks which sqNumber value was passed.
When the sqNumber value matches the Expression attribute of a Procedure.If element and the expression evaluates to True, that element updates the parameter tracking whose turn it is and sets the new value of that square to an X or O.
This Procedure.If element then calls on the next Process Task element, tSqCheck, which is where we check the status of all the squares and look for a possible winner.
You’ll notice a Procedure.Switch Else element at the bottom. If for some reason neither Procedure.Switch Case element finds a player 1 or player 2 value, this element will push the player back to the game board report definition with all the previous parameter values. As with all If/Then/Else statements, and programming in general, the goal is to code for, and protect against, the unknown.
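Expressed in ordinary code, the tXorO task is a switch on the player turn wrapped around per-square ifs. Here is a minimal Python sketch of that control flow; the parameter names playerTurn and the sq-prefixed square values are assumptions for illustration, and the next task, tSqCheck, is stubbed:

```python
def t_sq_check(params):
    """Placeholder for the next Process Task, tSqCheck (the win check)."""
    return params

def t_x_or_o(params):
    """Sketch of the tXorO Process Task: set the clicked square, flip the turn."""
    turn = params["playerTurn"]        # Procedure.Switch on the current turn
    square = params["sqNumber"]        # sqNumber: ID of the clicked square, 1-9
    if turn == "1":                    # Procedure.Switch Case '1'
        params["sq" + square] = "X"    # Procedure.If on sqNumber sets the square
        params["playerTurn"] = "2"     # hand the turn to player 2
    elif turn == "2":                  # Procedure.Switch Case '2'
        params["sq" + square] = "O"
        params["playerTurn"] = "1"
    else:                              # Procedure.Switch Else: unknown turn value,
        return params                  # send the player back to the board unchanged
    return t_sq_check(params)          # hand off to the win-check task
```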
Creating the Process Task to Check for a Winner:
The next Process Task to build is called tSqCheck. This Task element checks for the eight different ways in which a player can win, for both X and O. Within Tic-Tac-Toe, the eight ways to win are the three rows, the three columns, and the two diagonals.
Winning solutions are grouped together by the square in which we check for a win first. For example, the player can win by selecting the top three squares, and we know that it doesn’t matter in which order or direction those squares were selected. So we are only concerned with the first row and first column cells: one, two, three, four, and seven. These determine our groups and guide us when creating the next step of our conditions.
In this Process Task, tSqCheck, we look for any winning grouping by using a Procedure.If element where the Expression attribute is the following simplified condition:
(Square ID is equal to X) AND
(Square ID is equal to X) AND
(Square ID is equal to X)
The same is run for O.
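In code terms, the whole of tSqCheck reduces to eight three-square checks per marker plus a tie check. The sketch below assumes the board is a mapping from square IDs 1 through 9 to 'X', 'O', or empty; only the A1 label (the top row) is taken from the text, and the other group labels are invented for illustration:

```python
# The eight winning triples, grouped by the first square checked
# (squares one, two, three, four, and seven).
WIN_GROUPS = {
    "A1": (1, 2, 3),  # top row (the label A1 comes from the article)
    "A2": (1, 5, 9),  # diagonal from square one
    "A3": (1, 4, 7),  # left column
    "B1": (2, 5, 8),  # middle column
    "C1": (3, 6, 9),  # right column
    "C2": (3, 5, 7),  # diagonal from square three
    "D1": (4, 5, 6),  # middle row
    "E1": (7, 8, 9),  # bottom row
}

def t_sq_check(squares):
    """Sketch of tSqCheck: squares maps square IDs 1-9 to 'X', 'O', or ''."""
    for marker in ("X", "O"):                       # the same check runs for X and O
        for group, (a, b, c) in WIN_GROUPS.items():
            if squares[a] == squares[b] == squares[c] == marker:
                return {"win": "1", "winGroup": group, "winner": marker}
    if all(squares[i] for i in range(1, 10)):       # every square filled, no winner
        return {"win": "0"}                         # a tie
    return None                                     # Procedure.Else: game continues
```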
If this Procedure.If element evaluates to True, which it does in the above example, then it will run the appropriate Process Task element for the winner.
Building on this overview, we can see that for each winning group there is a separate Procedure.If element for X and O respectively. For instance, to win the top row, the following Procedure.If element’s Expression attribute needs to evaluate to True for X to win.
If this is True, then the last Process Task element is called, tUpdateWinValueX or tUpdateWinValueO for player 2. Passed into this last Process Task element is a new parameter called winGroup where we pass in the group of squares that won. Following the example above, Group A won with the top row of boxes so we pass a value of A1.
In the event that all squares are filled with an X or O but no one is the winner, the last Procedure.If element is used with the following Expression attribute:
This merely checks that each square is not empty. Since this element is placed at the end of the element stack and no further win checks are made, this equates to a tie. The player is then returned to the game board with a new parameter ‘win’ set to ‘0’. This will be used in the Conditional attributes of the report definition’s elements to determine what to show in the case of a tie.
However, if not a single Procedure.If element evaluates to True, then the Procedure.Else element is triggered. This element takes all the currently selected square values and the player turn values, and then returns the player to the game board. Unless there is a win or tie, this is the path most taken by the players so the game can progress.
Our last Process Task elements, tUpdateWinValueX or tUpdateWinValueO, are used for some housekeeping and to update tracked wins. The Process Task used is dependent on the identified winner.
The Default Request Parameters element retrieves the Cookie value for xWins and increments it by one. If no Cookie value is present, then it will be null + 1, which equals 1.
The Procedure Set Cookie Vars element writes the updated xWins value to the local player’s browser cookie cache.
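The null + 1 = 1 behavior is easy to mirror in code. A small sketch, using a plain dict to stand in for the browser cookie cache:

```python
def bump_x_wins(cookies):
    """Read the xWins cookie, add one, and write it back.
    A missing cookie behaves like null, so null + 1 yields 1."""
    current = cookies.get("xWins")  # None when no cookie is present yet
    cookies["xWins"] = (int(current) if current is not None else 0) + 1
    return cookies["xWins"]
```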
As this Process Task comes to an end, we pass three new values to the report definition: ‘win’ to tell the Conditional attributes that a win occurred, ‘winGroup’ to state which squares were part of the winning selection, and ‘winner’ to record whether player one or player two won.
Mapping Out the Rules
Below you will find a flow of the steps which show the rules of Tic-Tac-Toe within a Logi Info Studio application.
We outlined the Process Task elements that power a Tic-Tac-Toe game, but there is much more that can be done with this process element. I invite you to not only learn from this report definition but to also expand on it to further your Logi skills—you can add additional rules or combine the Process Task elements into a much more compact and streamlined statement. You might even find a whole new way to power your business rules.
Machine learning, artificial intelligence, and predictive analytics are all becoming an integral part of the healthcare industry. Predictive analytics can provide real-time insights and recommended actions to vastly improve patient care. Predictive analytics allows medical professionals to gain a 360-degree view around critical aspects of healthcare.
One example of predictive analytics in healthcare is improving care for high-risk patients. A whopping fifty percent of healthcare spending can be attributed to just five percent of patients. Predictive health analytics can help doctors identify factors that can lead to chronic conditions down the road, ultimately improving patient outcomes and reducing long-term healthcare costs.
How Predictive Health Analytics Helps High-Risk Patients
A predictive analytics application can analyze data from a wide variety of patients—including healthy individuals, those with chronic illnesses, and many people in between. These factors may include regular checkup schedules, smoking, family medical history, occupational hazards, health insurance status, age, and location. It can also consider the efficacy of various preventive measures for those patients.
Next, the predictive analytics solution identifies common factors among high-risk patients. In particular, it can look for early signs that the patient is becoming high risk. It can also identify which treatments are most effective and what the best timing is to introduce those treatments.
Recommended Actions to Take
Early identification of patients who are likely to develop high-risk conditions can help medical practitioners prevent conditions from getting worse in the first place.
Predictive modeling in healthcare may recommend some actions, including:
Encouraging the patient to adopt some new healthy habits
Suggesting medication that may alleviate the condition if taken early enough
Offering after-hours appointments for patients who can’t make it to the office during the day
Identifying unnecessary tests and procedures to eliminate in the course of the patient’s care
Coordinating with other providers and healthcare facilities to provide focused treatment paths for specific chronic illnesses
Adding healthcare navigators or other people who can help the patient get the services he or she needs
These actions, driven by data insights, can lead to better patient outcomes and improved care overall. The information can also be used to target groups that are at risk of developing chronic conditions, allowing physicians to come up with early intervention treatment plans.
Predictive health analytics should be viewed as a tool for informing decisions, not a quick fix that can immediately resolve problems. Continuously feeding new information into your analytics solution can improve predictions and allow your organization to identify trends in the data.
Product managers and software development leaders are increasingly recognizing the benefits of adopting embedded analytics, as it helps them provide customers with valuable data insights and stay competitive in the market. While the benefits are readily apparent, choosing a solution from the dizzying array of products on the market can be difficult, even overwhelming.
In order to truly embrace the benefits of buying analytics technology, the data architecture of your embedded analytics solution should happily coexist with your existing technology environment. Otherwise, you risk further burdening already overtaxed developers and platform support staff.
In addition to freeing up resources for your core product development, an architecture-compatible solution will provide appropriate business intelligence security and reduce operational costs. It should also give you the ability to scale and support large user communities in a cost-efficient manner, support multi-tenancy for SaaS deployments, and the ability to run on various cloud hosting platforms with cloud data sources.
#2. Outputs and Development Process
The embedded analytics product you choose should efficiently produce all of the reports, dashboards, and other types of data visualizations your customers require, while allowing for easy customization. A platform that cannot deliver on this requirement will cause your developers just as many headaches as one that is incompatible due to architecture and technology.
Many engineering teams have struggled to keep up with customer requests for report creation and customization, and these ongoing requests impact core product development as well as time-to-market for each release. An embedded analytics platform that offers self-service reporting can reduce or eliminate this drain on development resources. Even if you’re not ready to offer self-service options to customers now, choose a solution that has this capability so you can address internal agility and market demands in the future.
#3. Pricing and Licensing Models
The cost of your embedded analytics solution should align with your existing product’s pricing and licensing, ideally providing revenue growth as well as cost savings.
Depending upon the analytics vendors you evaluate, there are a number of licensing models available, ranging from term-based licenses to annual subscriptions. Pricing plans are sometimes based on factors such as user counts or product revenue. Ensure that you have a clear view of current and potential go-to-market packaging and pricing for your core product before you decide on a platform provider.
Logi SecureKey Authentication technology provides integration for applications requiring secure access to a Logi application or embedded Logi components. SecureKey allows for a single sign-on environment but provides enhanced security. User authorization is established and requests are made to a Logi application or embedded components using a special SecureKey.
Logi’s SecureKey authentication is provided for use in scenarios where a Logi application or Logi components are integrated with or embedded into a non-Logi application or parent application, which conducts the initial user authentication.
Sometimes called single sign-on, this arrangement allows users to log in once to the parent application and have their security information propagated to other integrated applications—creating a seamless and secure user experience. This, of course, means that users must not be able to bypass the parent application and access the other applications directly.
In the stateless world of web applications, this requires some special mechanisms to ensure security. These mechanisms are provided for Logi applications through our SecureKey technology. The authentication part of SecureKey Authentication refers to the method used by the two applications to verify the validity of requests from users. This is a secondary authentication that occurs separately from the typical standard user login authentication.
There are four steps involved in a typical SecureKey authentication transaction:
In the following example scenario, a Logi application is embedded within a parent application, and the parent application initially authenticates the user.
The user successfully logs into the parent application. The exact method it uses to determine that a user is valid is irrelevant to this process, except that user roles and/or rights, if applicable, should be retrieved as part of the authentication.
After the parent application authenticates the user, the page that includes the embedded Logi application loads. The parent application sends the Logi application a request for a SecureKey token. The request includes the user ID, any roles and/or rights, and the IP address of the user’s machine. When it receives a request for a SecureKey token, the Logi application first verifies that the request came from a valid source. This is done by automatically determining the parent application server IP address and ensuring that it’s in a pre-configured list of approved addresses. If so, the Logi application stores the user information from the request and returns a unique SecureKey token to the parent application.
The parent application sends the original report request to the Logi application, appending the SecureKey token. Upon receiving the request, the Logi application automatically determines the client browser IP address and uses it, along with the token, to authenticate the request. The client browser IP address is used to verify that each report request for the Logi application is coming from a source that has been authenticated by the parent application. This is the IP address of the end-user computer used to request reports through the parent application; it’s passed in the URL of every request for a SecureKey. If authenticated, a session is started for the user, and any roles and rights stored in step two and associated with the token are used to authorize access to the correct reports, data, and other resources. The requested report is then generated and returned to the parent application.
When the Logi application session is started for the user, the SecureKey token expires and cannot be used again. Any subsequent reports requested from within the context of the Logi application will use the user’s session to control access to reports, data, and resources. The user then communicates directly from his or her browser to the Logi application for reports, and the parent application is no longer involved as long as the Logi application session persists. If the user’s Logi application session ends and he or she subsequently wants to access reports again, the process of requesting and using a new SecureKey token must be repeated.
Once a SecureKey token is used, it cannot be used again in a different session, and sharing tokens across different clients is impossible: the client browser IP address is embedded in the key when it’s created, making it valid only for the client that made the initial request.
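To make the lifecycle concrete, here is a conceptual Python sketch of the token handshake: issuance gated by the parent server’s IP, redemption gated by the client browser’s IP, and single use. All class and method names are illustrative, not Logi’s actual API:

```python
import secrets

class SecureKeyBroker:
    """Conceptual sketch of the SecureKey handshake (names are invented)."""

    def __init__(self, approved_server_ips):
        self.approved = set(approved_server_ips)  # pre-configured approved servers
        self.pending = {}                         # token -> stored user information

    def request_token(self, server_ip, user_id, roles, client_ip):
        # Step 2: verify the request came from an approved parent application server
        if server_ip not in self.approved:
            raise PermissionError("request not from an approved address")
        token = secrets.token_urlsafe(16)
        self.pending[token] = {"user": user_id, "roles": roles, "client_ip": client_ip}
        return token

    def redeem(self, token, client_ip):
        # Step 3: the client browser IP must match the one stored with the token
        info = self.pending.get(token)
        if info is None or info["client_ip"] != client_ip:
            return None
        # Step 4: the token is single-use; it expires once the session starts
        del self.pending[token]
        return info
```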
Typically, the information associated with the SecureKey token (including username, roles, rights, browser IP address, etc.) is stored as a web server application scope variable. However, if you’re using a clustered server configuration, you can specify a physical folder for temporary storage of SecureKey information as XML files. Using a shared network folder for this purpose allows common access to the SecureKey information by clustered servers. This shared folder is specified in the SecureKey Shared Folder attribute of the Security element in a Logi application’s settings definition. The SecureKey files created by this option are deleted whenever the automatic temporary cache file cleanup occurs.
Starting with version 12.5, you can save the SecureKey token information in supported databases, including SQL Server, Oracle, MySQL, and PostgreSQL. The temporary values are stored in a database table, RD SecureKey, and the table is automatically created in the connection’s database the first time a SecureKey is made.
If the embedded analytics solution is DevOps-friendly, it will continue handling security with the same methods your DevOps team is currently using. You won’t have to recreate or replicate BI security information in two different places. The rising trend of DevSecOps teams is an indicator of how essential security is to DevOps teams everywhere. Consider the following:
Authorization controls. Whatever DevOps is currently using—credentials, enterprise services account, or something else—the embedded analytics platform should adapt to it.
Standard-based SSO. It’s important for user experience that embedded analytics applications play within a single sign-on (SSO) environment—and important for DevOps that they do it based on standards. Any custom elements divert DevOps resources to figuring out SSO peculiarities and decrease the odds of being able to quickly debug any problems.
Defense against known vulnerabilities. Look for an embedded analytics vendor who is already protecting against common security vulnerabilities that any web application will face.
Incident visibility. If there is a security incident, does the platform provide detailed event logging? DevOps should be able to quickly find the Who, What, and When of everything going on at the time.
A DevOps-friendly embedded analytics solution will not force you to use a proprietary data store, replicate data, or change your current data schemas. You should be able to store your data in place via relational databases, web services, or your own proprietary solution. Applications incorporating analytics should be able to use data in your existing systems and deliver strong performance across disparate sources.
A DevOps-friendly embedded analytics solution will work in any environment. That means applications will work on clouds, in containers, and across mixed and changing deployment environments. These may include Windows or Linux; on-premises, cloud, and hybrid architectures; and mobile devices and browsers. It also means applications can be deployed in a container, or parts of them dispersed across multiple containers.
If the embedded analytics solution is DevOps-friendly, you should be able to deploy it into your current architecture using standard methods (with minimal architecture-specific steps). For instance, it may be as simple as installing an ASP.NET application on an IIS server or a Java application on an Apache Tomcat server.
#5. Release Cycles
A DevOps-friendly embedded analytics solution won’t drag release cycles. If you’re moving toward continuous integration and continuous delivery (CI/CD), you want to control exactly what gets deployed and when through your build pipelines. Does the embedded analytics platform allow you to deploy smaller size incremental update packages when broader changes aren’t needed?
Sustainable Innovation and Differentiation
Your embedded analytics solution will affect how frequently you release standout software, how competitive you are, and how sustainable your advantage is over time. In summary, look for one that:
Utilizes your technology—including security frameworks and tech stack architecture—as it is
Leverages your existing processes so you can build and release application updates faster
Offers flexible scaling so it grows with your business
Predictive analytics tools are powered by several different models and algorithms that can be applied to a wide range of use cases. Determining what predictive analytics models are best for your company is key to getting the most out of a predictive analytics solution and leveraging data to make insightful decisions.
For example, consider a retailer looking to reduce customer churn. They might not be served by the same predictive analytics models used by a hospital predicting the volume of patients admitted to the emergency room in the next ten days.
What are the most common predictive analytics models? And what predictive algorithms are most helpful to fuel them? In this post, we give an overview of the most popular types of predictive models and algorithms that are being used to solve business problems today.
Top 5 Predictive Analytics Models
The classification model is, in some ways, the simplest of the several types of predictive analytics models we’re going to cover. It puts data in categories based on what it learns from historical data.
Classification models are best for answering yes-or-no questions, providing broad analysis that’s helpful for guiding decisive action. They can answer questions such as:
For a retailer, “Is this customer about to churn?”
For a loan provider, “Will this loan be approved?” or “Is this applicant likely to default?”
For an online banking provider, “Is this a fraudulent transaction?”
The breadth of possibilities with the classification model—and the ease by which it can be retrained with new data—means it can be applied to many different industries.
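As a toy illustration of classification, a one-rule model (a decision stump) can learn a yes/no churn answer from historical data. The feature and all numbers below are invented:

```python
def train_stump(examples):
    """Train a one-rule yes/no classifier (a decision stump).
    examples: list of (numeric_feature, label) pairs, label in {"yes", "no"}.
    Picks the threshold that best separates the labels in the history."""
    best = (None, 0)
    values = sorted(v for v, _ in examples)
    for t in [(a + b) / 2 for a, b in zip(values, values[1:])]:
        correct = sum((v > t) == (label == "yes") for v, label in examples)
        if correct > best[1]:
            best = (t, correct)
    return best[0]

# Hypothetical history: feature = days since the customer's last login,
# label = whether that customer churned.
history = [(2, "no"), (5, "no"), (9, "no"), (40, "yes"), (55, "yes"), (80, "yes")]
threshold = train_stump(history)
predict = lambda days: "yes" if days > threshold else "no"
```

Real classification models use many features and far richer learners, but the retraining story is the same: feed in new labeled history and the threshold moves.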
The clustering model sorts data into separate, nested smart groups based on similar attributes. If an ecommerce shoe company is looking to implement targeted marketing campaigns for their customers, they could go through the hundreds of thousands of records to create a tailored strategy for each individual. But is this the most efficient use of time? Probably not. Using the clustering model, they can quickly separate customers into similar groups based on common characteristics and devise strategies for each group at a larger scale.
Other use cases of this predictive analytics model might include grouping loan applicants into “smart buckets” based on loan attributes, identifying areas in a city with a high volume of crime, and benchmarking SaaS customer data into groups to identify global patterns of use.
One of the most widely used predictive analytics models, the forecast model deals in metric value prediction, estimating numeric value for new data based on learnings from historical data.
This model can be applied wherever historical numerical data is available. Scenarios include:
A SaaS company can estimate how many customers they are likely to convert within a given week.
A call center can predict how many support calls they will receive per hour.
A shoe store can calculate how much inventory they should keep on hand in order to meet demand during a particular sales period.
The forecast model also considers multiple input parameters. If a restaurant owner wants to predict the number of customers she is likely to receive in the following week, the model will take into account factors that could impact this, such as: Is there an event close by? What is the weather forecast? Is there an illness going around?
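At its simplest, metric value prediction can be sketched as a least-squares trend fit over past values; real forecast models weigh many such inputs at once. A minimal stdlib Python version:

```python
def forecast_next(history):
    """Fit a least-squares linear trend to past values and predict the next one.
    A minimal stand-in for a forecast model (real ones combine many inputs)."""
    n = len(history)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(history) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, history)) / \
            sum((x - mean_x) ** 2 for x in xs)
    intercept = mean_y - slope * mean_x
    return intercept + slope * n  # extrapolate one step past the history

# Hypothetical weekly signups for the past five weeks:
print(forecast_next([100, 110, 120, 130, 140]))  # trend adds 10/week, so 150.0
```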
The outliers model is oriented around anomalous data entries within a dataset. It can identify anomalous figures either by themselves or in conjunction with other numbers and categories.
Recording a spike in support calls, which could indicate a product failure that might lead to a recall
Finding anomalous data within transactions, or in insurance claims, to identify fraud
Finding unusual information in your NetOps logs and noticing the signs of impending unplanned downtime
The outlier model is particularly useful for predictive analytics in retail and finance. For example, when identifying fraudulent transactions, the model can assess not only amount, but also location, time, purchase history and the nature of a purchase (i.e., a $1000 purchase on electronics is not as likely to be fraudulent as a purchase of the same amount on books or common utilities).
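A minimal outlier check can be sketched with z-scores: flag any value that sits too many standard deviations from the mean. The transaction amounts below are invented, and the 2.5 cutoff is a common rule of thumb, not a fixed standard:

```python
from statistics import mean, stdev

def flag_outliers(amounts, cutoff=2.5):
    """Return the values whose z-score exceeds the cutoff.
    A simple stand-in for an outlier model on, say, transaction amounts."""
    m, s = mean(amounts), stdev(amounts)
    return [a for a in amounts if abs(a - m) / s > cutoff]
```

Production outlier models also weigh context (location, time, purchase category) rather than the amount alone, as the fraud example above describes.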
Time Series Model
The time series model comprises a sequence of data points captured over time, using time as the input parameter. For example, it might use the last year of data to develop a numerical metric and predict the next three to six weeks of data from that metric. Use cases for this model include the number of daily calls received in the past three months, sales for the past 20 quarters, or the number of patients who showed up at a given hospital in the past six weeks. It is a potent means of understanding the way a singular metric develops over time, with a level of accuracy beyond simple averages. It also takes into account seasons of the year or events that could impact the metric.
If the owner of a salon wishes to predict how many people are likely to visit his business, he might turn to the crude method of averaging the total number of visitors over the past 90 days. However, growth is not always static or linear, and the time series model can better model exponential growth and better align the model to a company’s trend. It can also forecast for multiple projects or multiple regions at the same time instead of just one at a time.
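The contrast with crude averaging can be sketched with Holt’s linear-trend smoothing, a basic time-series method that tracks both a level and a trend, so a growing series forecasts above both its average and its last point. The smoothing constants here are illustrative, not tuned:

```python
def holt_forecast(series, alpha=0.8, beta=0.8, steps=1):
    """Holt's linear-trend exponential smoothing: update a level and a trend
    as each observation arrives, then project the trend forward."""
    level, trend = series[0], series[1] - series[0]
    for y in series[2:]:
        prev_level = level
        level = alpha * y + (1 - alpha) * (level + trend)   # smooth the level
        trend = beta * (level - prev_level) + (1 - beta) * trend  # smooth the trend
    return level + steps * trend
```

On a doubling series like [10, 20, 40, 80], a plain average (37.5) sits far below the last observation, while the trend-aware forecast projects continued growth.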
Common Predictive Algorithms
Overall, predictive analytics algorithms can be separated into two groups: machine learning and deep learning.
Machine learning involves structured data of the kind we see in a table. Algorithms for it come in both linear and nonlinear varieties. Linear algorithms train more quickly, while nonlinear algorithms are better optimized for the problems they are likely to face (which are often nonlinear).
Deep learning is a subset of machine learning that is particularly popular for dealing with audio, video, text, and images.
With machine learning predictive modeling, there are several different algorithms that can be applied. Below are some of the most common algorithms that are being used to power the predictive analytics models described above.
Random Forest is perhaps the most popular classification algorithm, capable of both classification and regression. It can accurately classify large volumes of data.
The name “Random Forest” is derived from the fact that the algorithm is a combination of decision trees. Each tree depends on the values of a random vector sampled independently with the same distribution for all trees in the “forest.” Each one is grown to the largest extent possible.
Predictive analytics algorithms try to achieve the lowest error possible by either using “boosting” (a technique which adjusts the weight of an observation based on the last classification) or “bagging” (which creates subsets of data from training samples, chosen randomly with replacement). Random Forest uses bagging. If you have a lot of sample data, instead of training with all of them, you can take a subset and train on that, and take another subset and train on that (overlap is allowed). All of this can be done in parallel. Multiple samples are taken from your data to create an average.
While individual trees might be “weak learners,” the principle of Random Forest is that together they can comprise a single “strong learner.”
The popularity of the Random Forest model is explained by its various advantages:
Accurate and efficient when running on large databases
Multiple trees reduce the variance and bias of a smaller set or single tree
Resistant to overfitting
Can handle thousands of input variables without variable deletion
Can estimate what variables are important in classification
Provides effective methods for estimating missing data
Maintains accuracy when a large proportion of the data is missing
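Bagging and majority voting can be sketched in a few lines: train each weak learner on a bootstrap sample (drawn with replacement, so overlap is allowed), then vote. This toy uses one-split stumps on a single feature instead of full decision trees, so it is an illustration of the principle, not a real Random Forest:

```python
import random

def stump(sample):
    """Fit a one-split 'tree' to (value, label) pairs, label in {0, 1}:
    threshold at the midpoint between the two class means. A weak learner."""
    pos = [v for v, l in sample if l == 1]
    neg = [v for v, l in sample if l == 0]
    t = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2
    return lambda v: 1 if v > t else 0

def toy_forest(data, n_trees=25, seed=0):
    """Bagging: train each stump on a bootstrap sample, predict by majority vote."""
    rng = random.Random(seed)
    trees = []
    for _ in range(n_trees):
        sample = [rng.choice(data) for _ in data]   # bootstrap: overlap allowed
        if len({l for _, l in sample}) < 2:         # need both classes to split
            continue
        trees.append(stump(sample))
    return lambda v: 1 if sum(t(v) for t in trees) * 2 > len(trees) else 0
```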
Generalized Linear Model (GLM) for Two Values
The Generalized Linear Model (GLM) is a more complex variant of the General Linear Model. It builds on the latter model’s comparison of the effects of multiple variables on continuous variables, then draws from an array of different distributions to find the “best fit” model.
Let’s say you are interested in learning customer purchase behavior for winter coats. A regular linear regression might reveal that for every negative degree difference in temperature, an additional 300 winter coats are purchased. While it seems logical that another 2,100 coats might be sold if the temperature goes from 9 degrees to 3, it seems less logical that if it goes down to -20, we’ll see the number increase to the exact same degree.
The Generalized Linear Model would narrow down the list of variables, likely suggesting that there is an increase in sales beyond a certain temperature and a decrease or flattening in sales once another temperature is reached.
The advantage of this algorithm is that it trains very quickly. The response variable can have any form of exponential distribution type. The Generalized Linear Model is also able to deal with categorical predictors, while being relatively straightforward to interpret. On top of this, it provides a clear understanding of how each of the predictors is influencing the outcome, and is fairly resistant to overfitting. However, it requires relatively large data sets and is susceptible to outliers.
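The coat example can be made concrete by comparing a plain linear fit with the same cold-weather trend passed through a logistic link, which lets sales flatten as the market saturates. All coefficients and the market size are invented for illustration:

```python
import math

def linear(temp):
    """Plain linear fit: every degree colder adds 300 coats, without limit."""
    return 3000 - 300 * temp

def glm_logistic(temp, market_size=9000):
    """The same trend through a logistic link: sales rise as temperature
    falls, then flatten toward the total market size instead of growing
    by the exact same amount per degree forever."""
    return market_size / (1 + math.exp(0.25 * temp))

# A six-degree drop near the middle of the range moves sales far more than
# the same six-degree drop deep in the cold, where the curve has flattened.
mid_lift = glm_logistic(3) - glm_logistic(9)
cold_lift = glm_logistic(-20) - glm_logistic(-14)
```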
Gradient Boosted Model (GBM)
The Gradient Boosted Model produces a prediction model composed of an ensemble of decision trees (each one of them a “weak learner,” as was the case with Random Forest), before generalizing. As its name suggests, it uses the “boosted” machine learning technique, as opposed to the bagging used by Random Forest. It is used for the classification model.
The distinguishing characteristic of the GBM is that it builds its trees one tree at a time. Each new tree helps to correct errors made by the previously trained tree—unlike in the Random Forest model, in which the trees bear no relation. It is very often used in machine-learned ranking, as in the search engines Yahoo and Yandex.
Via the GBM approach, data is more expressive, and benchmarked results show that the GBM method is often preferable in terms of overall predictive accuracy. However, because it builds each tree sequentially, it also takes longer to train. That said, its slower performance is considered to lead to better generalization.
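The tree-after-tree idea can be sketched in a few dozen lines, assuming decision stumps as the weak learners and squared-error residuals; the data and hyperparameters here are hypothetical.

```python
def best_stump(xs, residuals):
    """Find the split that best fits the residuals with two constant leaves."""
    best = None
    for split in sorted(set(xs)):
        left = [r for x, r in zip(xs, residuals) if x <= split]
        right = [r for x, r in zip(xs, residuals) if x > split]
        if not left or not right:
            continue
        lmean = sum(left) / len(left)
        rmean = sum(right) / len(right)
        err = (sum((r - lmean) ** 2 for r in left)
               + sum((r - rmean) ** 2 for r in right))
        if best is None or err < best[0]:
            best = (err, split, lmean, rmean)
    _, split, lmean, rmean = best
    return lambda x: lmean if x <= split else rmean

def gradient_boost(xs, ys, n_trees=50, lr=0.3):
    """Each new stump fits the residual errors left by the stumps before it."""
    f0 = sum(ys) / len(ys)          # start from the mean prediction
    stumps = []
    preds = [f0] * len(xs)
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = best_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + lr * stump(x) for p, x in zip(preds, xs)]
    return lambda x: f0 + lr * sum(s(x) for s in stumps)

model = gradient_boost([1, 2, 3, 4, 5, 6], [5, 5, 5, 20, 20, 20])
```

Note the contrast with Random Forest: there the trees are built independently and averaged, while here each stump only sees what the previous ones got wrong.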
K-Means Clustering
A highly popular, high-speed algorithm, K-means involves placing unlabeled data points in separate groups based on similarities. This algorithm is used for the clustering model. For example, Tom and Rebecca are in group one and John and Henry are in group two. Tom and Rebecca have very similar characteristics but Rebecca and John have very different characteristics. K-means tries to figure out what the common characteristics are for individuals and groups them together. This is particularly helpful when you have a large data set and are looking to implement a personalized plan—this is very difficult to do with one million people.
In the context of predictive analytics for healthcare, a sample size of patients might be placed into five separate clusters by the algorithm. One particular group shares multiple characteristics: they don’t exercise, they have an increasing hospital attendance record (three times one year and then ten times the next year), and they are all at risk for diabetes. Based on the similarities, we can proactively recommend a diet and exercise plan for this group.
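The assign-then-recenter loop at the heart of K-means fits in a short, library-free sketch; the 2-D points below (think exercise level vs. hospital visits) are hypothetical.

```python
import random

def kmeans(points, k, iters=20, seed=0):
    """Group 2-D points into k clusters by nearest centroid."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            # assign each point to its closest centroid
            i = min(range(k),
                    key=lambda j: (p[0] - centroids[j][0]) ** 2
                                + (p[1] - centroids[j][1]) ** 2)
            clusters[i].append(p)
        for i, c in enumerate(clusters):
            if c:  # recompute the centroid as the cluster mean
                centroids[i] = (sum(p[0] for p in c) / len(c),
                                sum(p[1] for p in c) / len(c))
    return centroids, clusters

# Two well-separated groups of patients, three points each.
pts = [(1, 9), (2, 8), (1, 8), (8, 1), (9, 2), (8, 2)]
centroids, clusters = kmeans(pts, 2)
```

With well-separated data like this, the algorithm recovers the two groups regardless of which points it starts from, which is why it scales so well to the million-person case described above.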
Prophet
The Prophet algorithm is used in the time series and forecast models. It is an open-source algorithm developed by Facebook, used internally by the company for forecasting.
The Prophet algorithm is of great use in capacity planning, such as allocating resources and setting sales goals. Fully automated forecasting algorithms have historically been inconsistent and inflexible, which has made this process difficult to automate well. On the other hand, manual forecasting requires hours of labor by highly experienced analysts.
Prophet isn’t just automatic; it’s also flexible enough to incorporate heuristics and useful assumptions. The algorithm’s speed, reliability and robustness when dealing with messy data have made it a popular alternative algorithm choice for the time series and forecasting analytics models. Both expert analysts and those less experienced with forecasting find it valuable.
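Prophet itself ships as the open-source `prophet` Python package; as a library-free illustration of the additive trend-plus-seasonality idea it automates, here is a toy decomposition in plain Python. The synthetic sales series and the seven-day period are assumptions for the example.

```python
def decompose(series, period=7):
    """Split a series into a linear trend plus a repeating seasonal pattern."""
    n = len(series)
    # least-squares line through the points (the trend component)
    xbar = (n - 1) / 2
    ybar = sum(series) / n
    slope = (sum((i - xbar) * (y - ybar) for i, y in enumerate(series))
             / sum((i - xbar) ** 2 for i in range(n)))
    trend = [ybar + slope * (i - xbar) for i in range(n)]
    # average detrended value at each position in the cycle (the seasonal component)
    detrended = [y - t for y, t in zip(series, trend)]
    seasonal = [sum(detrended[i::period]) / len(detrended[i::period])
                for i in range(period)]
    return slope, seasonal

# Synthetic daily sales: a gentle upward trend plus a weekend bump.
sales = [10 + 0.5 * d + (6 if d % 7 in (5, 6) else 0) for d in range(28)]
slope, seasonal = decompose(sales)
```

Prophet fits the same kind of components (trend, weekly and yearly seasonality, holidays) far more robustly; the point here is only to show what those components are.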
How do you determine which predictive analytics model is best for your needs? You need to start by identifying what predictive questions you are looking to answer, and more importantly, what you are looking to do with that information. Consider the strengths of each model, as well as how each of them can be optimized with different predictive analytics algorithms, to decide how to best use them for your organization.
Logi Predict is the only predictive analytics tool built for product managers and developers, made to embed in existing applications. Watch a free demo today.
If you’re like most people, when faced with a text-heavy document, you may end up skimming the content or not reading it at all. The same holds true for analytics dashboards: If you include too much content or present that content in an overly complex design, your end users may not bother using your analytics at all. Content is key—but sometimes less is more.
How do you present the right information in the right way? You need to simplify what you show and how you show it. Follow these four dashboard design best practices to enhance your analytics user experience today:
#1: Show the relevant information.
Consider the information your end users truly need, and reduce what you show on your dashboard down to a few key ideas. If you’re designing a dashboard for a sales team, for example, don’t include visualizations on marketing campaign channels. Placing sales and marketing metrics on the same page will only confuse users with information they don’t really need.
#2: Choose the right visualizations for your data.
When designing dashboards, application teams can choose from a variety of different chart and graph types to display data. But not all data visualizations will work for every dataset.
Here are some common data visualization types and the best datasets for each:
Tabular formats are best used when exact numbers must be known. Numbers are presented in rows and columns, and may contain summary information. This format is not conducive to spotting trends or comparing sets of data, and the presentation becomes unwieldy with larger datasets.
Line charts are best to show continuous data and trends over time. Line charts are set against a common scale; you can also add a trend line or goal line to show performance against a set benchmark.
Bar charts are best used when showing comparisons between categories. The bars can be plotted either horizontally or vertically. Horizontal bar charts often show rank comparisons, usually with the largest bar on top. Vertical bar charts or column charts are often used to show multiple dimensions on a chart or a cross-tabular chart.
Pie charts are best used to compare a percentage of the whole. Pie charts make it easy to understand the relative importance of values, but when there are more than five sections it can become difficult to compare the results.
#3: Hide some content.
Every piece of information doesn’t need to appear on your dashboard all at once. Don’t be afraid to let users drill down into some data points. Utilize icons, pop-up windows, sliding trays, and other expandable areas to show longer blocks of text as users dive deeper, instead of displaying everything from the main page. Selectively showing dashboard content has the added bonus of speeding up load times—and as any developer knows, poor performance is a killer for application engagement.
#4: Use iconography.
Content isn’t limited to data and charts. To support a great user experience, application teams are using icons in the navigation panes and in reports. Icons are typically small graphic images, sometimes accompanied by a one- or two-word description. They help users easily navigate analytics, understand exactly what they’re looking at, and quickly discern what action to take—enabling users to work more efficiently.
You don’t have to create icons from scratch. Graphics libraries such as Font Awesome make it easy to add scalable vector graphics (SVG) to your application. Font Awesome is a free font and CSS framework full of SVG icons that you can scale and customize—so if you resize them, they don’t lose or stretch pixels and you won’t see a change in picture quality.
Predictive health analytics is playing a vital role in healthcare organizations, providing real-time insights to an industry struggling to find ways of improving patient care. Data has become a vital tool in delivering insights that help drive critical decisions in these efforts.
One common use case of predictive analytics for healthcare is the ability to predict and reduce the number of missed appointments. Missed appointments cost the healthcare industry an average of $150 billion annually, which works out to roughly $200 for every unused hour-long appointment slot.
Predictive modeling can help providers identify the patients most likely to skip out on an appointment. By analyzing historical data from patients who made their appointments, patients who missed them, and patients who rescheduled, a predictive analytics application can find common factors in patients who often miss appointments. Predictive health analytics can also recommend actions that a healthcare provider can take in order to reduce the number of missed appointments.
How Predictive Health Analytics Helps Avoid Missed Appointments
Here’s an example of how predictive analytics can help healthcare providers reduce the number of missed appointments. First, the predictive analytics solution analyzes patient data from past appointments. It looks at factors including patient location, age, occupation, family size, and more to find similarities among patients who missed their appointments in the past. It may also analyze other data such as weather conditions, traffic patterns, time of day, wait time length, the availability of self-scheduling tools for patients, and the number of locations where that doctor takes appointments.
Predictive modeling in healthcare can analyze vast amounts of data and find factors that correlate with missed appointments. For example, maybe patients who live more than 15 minutes away and who have scheduled an appointment during rush hour are more likely to skip. Or perhaps patients who self-schedule their appointments are more likely to make it on time compared to patients who call the office to schedule.
Next, the predictive analytics application considers new patients who have upcoming appointments. It compares data from these patients to the historical data, and makes a prediction about who is likely to miss their appointment.
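The compare-new-patients-to-history step can be sketched very simply: bucket past appointments by profile, compute the historical no-show rate per profile, and score an upcoming appointment by its bucket. The feature buckets (distance over 15 minutes, rush-hour slot) and all the data below are hypothetical.

```python
from collections import defaultdict

def noshow_rates(history):
    """Rate of missed appointments per (far-away, rush-hour) profile."""
    counts = defaultdict(lambda: [0, 0])  # profile -> [missed, total]
    for distance_min, rush_hour, missed in history:
        key = (distance_min > 15, rush_hour)
        counts[key][0] += missed
        counts[key][1] += 1
    return {k: missed / total for k, (missed, total) in counts.items()}

# Hypothetical past appointments: (minutes away, rush-hour slot?, missed?)
history = [
    (25, True, 1), (30, True, 1), (20, True, 0),
    (5, False, 0), (10, False, 0), (8, False, 1),
]
rates = noshow_rates(history)
risk = rates[(True, True)]  # a new far-away, rush-hour patient's risk score
```

A real predictive application would use a proper classifier over many more features, but the shape of the prediction is the same: historical patterns in, a per-patient risk score out.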
Actions to Take
Predictive analytics can also recommend actions the healthcare facility can take to decrease the number of appointment no-shows.
These actions might include:
Investing in better self-scheduling software
Setting up automated reminders via email, text, or phone call
Establishing a timeline for staff to follow up with every scheduled appointment
Making it possible for patients to fill out time-consuming paperwork and forward relevant documentation ahead of time
Instructing staff on ways of engaging patients during their visit so they don’t feel abandoned while waiting for their appointments
Given data’s high demand and complex landscape, data architecture has become increasingly important for organizations that are embarking on any data-driven project, especially embedded analytics. There are many ways to approach data architecture. You may skip some approaches altogether, or use two simultaneously.
To determine which data architecture solution is best for you, consider the pros and cons of these seven most common approaches:
Transactional Databases
The starting point for many application development teams is the ubiquitous transactional database, which runs most production systems. Transactional databases are row stores, with each record/row keeping relevant information together. They are known for very fast read/write updates and high data integrity. As soon as data hits the transactional database, it is available for analytics. The main downside of transactional databases is structure: They’re not designed for optimal analytics queries, which creates a multitude of performance issues.
Bottom Line: Using transactional databases for embedded analytics makes sense if you already have them in place, but you will eventually run into limitations and need workarounds.
Views or Stored Procedures
Typically, when developers start noticing problems with their transactional systems, they may opt to create some views or stored procedures. Views create the appearance of a table as a result set of a stored query. While views only showcase the data, stored procedures allow you to execute SQL statements on the data. This approach simplifies the SQL needed to run analytics and allows users to filter the information they want to see. However, views or stored procedures typically make performance worse.
Bottom Line: When it comes to embedded analytics, views or stored procedures risk creating lags and affecting your application’s response time.
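A minimal sketch of the view idea, using Python’s built-in sqlite3 module (table and column names are hypothetical). Note that every query against the view still re-runs the stored aggregation, which is why views simplify the SQL without improving performance.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER, region TEXT, amount REAL);
    INSERT INTO orders VALUES (1, 'east', 100), (2, 'east', 50), (3, 'west', 75);
    -- the view hides the raw table behind a stored query
    CREATE VIEW region_sales AS
        SELECT region, SUM(amount) AS total FROM orders GROUP BY region;
""")
rows = conn.execute("SELECT * FROM region_sales ORDER BY region").fetchall()
# rows -> [('east', 150.0), ('west', 75.0)]
```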
Aggregate Tables or Materialized Views
Application development teams may opt to create aggregate tables or materialized views as another workaround to using views or stored procedures. With an aggregate table, you can create a summary table of the data you need by running a “Group By” SQL query. In a materialized view, you can store query results in a table or database. Aggregate tables and materialized views improve query performance because you don’t need to aggregate the data for every query. But, the downside is that you need to figure out when and how to update the tables, as well as how to distinguish between updates versus new transactions.
Bottom Line: Pre-aggregated tables and materialized views will help with performance, but you do need to stay organized and put strict processes in place to keep the aggregates up to date.
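The difference from a plain view is that the summary is computed once and stored. A sketch with sqlite3 (schema and data hypothetical):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE events (user_id INTEGER, day TEXT, clicks INTEGER);
    INSERT INTO events VALUES (1, '2024-01-01', 3), (1, '2024-01-01', 2),
                              (2, '2024-01-01', 5), (1, '2024-01-02', 4);
    -- materialize the summary once, instead of aggregating on every query
    CREATE TABLE daily_clicks AS
        SELECT day, SUM(clicks) AS total_clicks FROM events GROUP BY day;
""")
summary = conn.execute("SELECT * FROM daily_clicks ORDER BY day").fetchall()
# summary -> [('2024-01-01', 10), ('2024-01-02', 4)]
```

Queries against `daily_clicks` are now cheap, but the table goes stale the moment a new row lands in `events`, which is exactly the refresh discipline the Bottom Line above warns about.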
Replication of Transactional Databases
Replication offloads analytics queries from the production database to a replicated copy of the database. It requires copying and storing data in more than one site or node, so all of the analytics users share the same information. Because many databases have built-in replication facilities, this is easier to implement than other data architecture approaches—and replication removes analytical load from the production database. However, the main issue with replication is the lag between a new transaction hitting the database and that data being available in the replicated table.
Bottom Line: Replication is one of the easier approaches to implement and keeps analytical load off production, but the replication lag makes it a poor fit when users need up-to-the-minute data.
Caching
With caching, you can preprocess complex and slow-running queries so the resulting data is easier to access when the user requests the information. The cached location could be in memory, another table in the database, or a file-based system where the resulting data is stored temporarily. Caching can help with performance where queries are repeated and is relatively easy to set up in most environments. But, if you have multiple data sources, ensuring consistency and scheduling of cache refreshes can be complex.
Bottom Line: Caching can be a quick fix for improving embedded analytics performance, but the complexity of multiple sources and data latency issues may lead to limitations over time.
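An in-memory query cache can be as simple as a memoized function. The sketch below uses Python’s functools.lru_cache over a sqlite3 query (schema and data hypothetical); the call counter only exists to show that the second lookup never touches the database.

```python
import sqlite3
from functools import lru_cache

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [("east", 100.0), ("west", 75.0), ("east", 50.0)])

calls = {"count": 0}

@lru_cache(maxsize=32)
def region_total(region):
    """Aggregate query; repeated calls are served from the in-memory cache."""
    calls["count"] += 1
    row = conn.execute("SELECT SUM(amount) FROM sales WHERE region = ?",
                       (region,)).fetchone()
    return row[0]

first = region_total("east")   # hits the database
second = region_total("east")  # served from cache; no query runs
```

The catch is visible even here: if a new `sales` row arrives, the cached total is stale until the cache is invalidated, which is the consistency problem the Bottom Line above describes.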
Data Warehouses or Data Marts
For a more sophisticated data architecture, application development teams may turn to data warehouses or marts. Data warehouses are central repositories of integrated data from one or more disparate sources, while data marts contain a subset of a data warehouse designed for a specific reason (e.g., isolating data related to a particular line of business). They both allow you to organize your data in a way that simplifies query complexity and significantly improves query performance. However, designing a data structure for particular use cases can be complex, especially if you’re not familiar with the schema and ETL tools involved.
Bottom Line: Data warehouses and data marts are designed for faster analytics and response times, but implementation will take more time and be more complex.
Modern Analytics Databases
Modern analytics databases are typically columnar structures or in-memory structures. In columnar structures, data is stored at a granular column level in the form of many files, making it faster to query. For in-memory structures, the data is loaded into the memory, which makes reading/writing dramatically faster than a disk-based structure. Modern analytics databases provide improved performance on data load as well as optimal query performance, which is important if you have large volumes of data. But, a big downside is the significant learning curve associated with switching to a modern analytics database. Also, unlike transactional databases, analytics databases perform updates and deletions poorly.
Bottom Line: The modern analytics database is optimal for faster queries and dealing with large volumes of data, but it requires specialized skills and can be costly to implement.
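The row-versus-column layout can be illustrated in a few lines of plain Python. This is a conceptual sketch, not a real column store: the point is only that an analytics query like “sum one column” reads a single contiguous list in the columnar layout, instead of touching every field of every record.

```python
# Row store: one tuple per record; summing a column touches every record.
rows = [(1, "east", 100.0), (2, "west", 75.0), (3, "east", 50.0)]
row_total = sum(r[2] for r in rows)

# Column store: one list per column; the same scan reads one list end to end.
columns = {
    "id": [1, 2, 3],
    "region": ["east", "west", "east"],
    "amount": [100.0, 75.0, 50.0],
}
col_total = sum(columns["amount"])
```

The flip side is also visible: updating or deleting one record means touching every column list, which is why analytics databases handle updates and deletions poorly.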
Operational reports provide key metrics in precise formats and layouts. By embedding these reports in the applications end users already utilize, companies can deliver insights in context and aid users in their day-to-day decision-making.
But like any new application feature, if you take shortcuts, you’ll inevitably pay for it later. Avoiding the most common mistakes will ensure your operational reports deliver the right information, in the right format, and in the right context at all times.
Mistake #1: Ignoring Your Users
It’s essential to collect end user requirements before you start designing reports. If possible, conduct interviews with all your key stakeholders and user groups to understand their goals and needs. You’ll want to answer questions such as: What is the purpose of the operational report? What type of data do you want to present? How do you want to deliver the report?
Mistake #2: Designing Before You Think
After gathering user requirements, the next step is to decide what you want the report to look like. Remember that report formats are not mutually exclusive. If your end users need to interact a lot with the report, an interactive operational report is a good option. Or, if you need a precise layout for print or PDF distribution, you may choose to build a pixel-perfect static report (which requires formatting control down to the individual pixel). You may even opt for a combination of the two: build an interactive report that can be exported to print/PDF, but doesn’t need to be as precise as it would be with pixel-perfect reporting.
Mistake #3: Forgetting About Data Complexity
Now that you know what types of reports you’re building, you need to consider the data that will be presented in them. What are your data sources? What’s the size of the dataset? How often will data be delivered? Also consider what data you want to actually show. Displaying data in the same raw format as it’s stored in the database can be visually confusing. Finally, think about where you want to make the calculations on the raw data. High-performance applications often do calculations in the database before the data is pulled into the reporting tool.
Mistake #4: Under- and Overusing Interactive Elements
You don’t need to put everything in one single report. Don’t overload users with dozens of options to drill down, filter, search, pivot, and slice and dice their data on every screen. The best operational reports give users just enough interactivity controls to meet their needs—and no more. Refer back to your requirements to see what your users want to achieve, then think about the interactivity controls you can provide to let them do everything they need to do.
Mistake #5: Failing to Reuse Elements
The secret of smart operational report design is focusing on reusability. This is especially helpful if you make a lot of reports for the same customer or build a general report for different customers. The best operational reports are set up to support reusable components, allowing you to quickly scale to meet new requirements. In operational reporting, you can reuse multiple elements such as headers and footers, frequently used data connections, data visualizations, and more. All of these elements can be placed into reusable sections called “bands,” which may be repeated or nested as needed.
Mistake #6: Using the Wrong Data Visualizations
Different data visualizations convey different types of information. End users should get the information in a format that helps them make sense of it quickly. For example, tabular formats are good at showing exact values, while line charts are best to show continuous data and trends over time. Bar charts show comparisons between categories, while pie charts are best used to compare a percentage of the whole.