IBM Data Science Experience Blog
IBM Data Science Experience supports the data scientist community to learn, create, and collaborate.
4y ago
In a typical data science workflow, the initial steps are to identify relevant data sources and request data from various departments and source systems. The bulk of the time is then spent cleaning and transforming the data before models can be built.
Feature engineering pain points
In most environments, data scientists have to wait for experts to provide the requested data. Depending on the effort involved and on competing priorities, this process can take weeks or months before data is available for analysis. The business impacts of these delays include crippled analytics projects, miss…
4y ago
Use IoT data in IBM Streams Flows for billing and alerts
As more devices become internet-enabled, harnessing their data to provide value for consumers is becoming an essential strategy for many businesses. Some utility companies, for example, offer smart meters that send usage readings from homes and businesses, improving accuracy and enabling remote reporting. Since smart meters provide more fine-grained usage data, some companies offer discounts to customers whose consumption falls during off-peak periods. Processing streaming data from thousands of smart devices is a good fit for Str…
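The off-peak discount logic described above can be sketched in plain Python (outside Streams Flows). The rates and the peak window below are illustrative assumptions, not values from the post:

```python
def bill(readings, peak_rate=0.20, offpeak_rate=0.12, peak_hours=range(8, 20)):
    """Compute a bill from smart-meter readings.

    readings: list of (hour_of_day, kwh) tuples.
    Rates and the peak window are illustrative assumptions.
    """
    total = 0.0
    for hour, kwh in readings:
        # Off-peak hours are charged at the discounted rate
        rate = peak_rate if hour in peak_hours else offpeak_rate
        total += kwh * rate
    return round(total, 2)

# One day's readings: (hour, kWh consumed in that hour)
readings = [(2, 1.0), (9, 2.0), (18, 3.0), (22, 1.5)]
print(bill(readings))  # 1.3
```

A streaming job would apply the same per-reading rate lookup to each incoming tuple rather than to a finished list.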
4y ago
Exploring the power of IBM Analytics Engine to read data from Cloud Object Storage and stream data from Apache Hadoop HDFS
In this tutorial you will learn how to connect to IBM Analytics Engine (IAE) and run Spark and Hadoop jobs from a notebook to read data from IBM Cloud Object Storage (COS) and stream data from Apache Hadoop HDFS on Data Science Experience (DSX).
A. Connect to IBM Analytics Engine from Data Science Experience.
Follow this video for provisioning and adding the IAE service to DSX.
You can either create a new Analytics Engine service or select an existing one if you have…
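As a hedged sketch of the COS read path: reading from IBM Cloud Object Storage via Stocator typically means setting a few Hadoop configuration properties and addressing objects with cos:// URLs. The service name `myCos`, the endpoint, and the exact property names are assumptions that may vary by Stocator version and credential type:

```python
def stocator_conf(service_name, endpoint, access_key, secret_key):
    # Typical Hadoop property names used by Stocator for HMAC credentials;
    # confirm against your Stocator version's documentation.
    prefix = f"fs.cos.{service_name}"
    return {
        f"{prefix}.endpoint": endpoint,
        f"{prefix}.access.key": access_key,
        f"{prefix}.secret.key": secret_key,
    }

def cos_url(bucket, path, service_name):
    # Stocator addresses objects as cos://<bucket>.<service>/<object>
    return f"cos://{bucket}.{service_name}/{path}"

conf = stocator_conf("myCos",
                     "s3.us-south.cloud-object-storage.appdomain.cloud",
                     "<ACCESS_KEY>", "<SECRET_KEY>")

# In a notebook with a SparkContext `sc`, you would apply each pair:
#   for k, v in conf.items():
#       sc._jsc.hadoopConfiguration().set(k, v)
# and then read, e.g.:
#   df = spark.read.csv(cos_url("mybucket", "data.csv", "myCos"))
print(cos_url("mybucket", "data.csv", "myCos"))  # cos://mybucket.myCos/data.csv
```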
4y ago
Last year we made data science a team sport with IBM Data Science Experience, our award-winning IDE for analytics. This summer we brought to market IBM Watson Machine Learning, which allows companies to put models into production with easy model management and full workflow automation. We have now grown these two products into a platform, adding new features.
Continuous learning — your models should always improve
Models today are difficult and time-consuming to maintain and to keep up to date. With Watson Machine Learning it is possible to automate the retraining of models and to monitor how t…
4y ago
Combine the strengths of IBM Data Platform and Esri ArcGIS to rapidly build machine learning models with high-resolution geospatial data. IBM and Esri offer a powerful suite of tools for building first-class geospatial applications. In this post, you'll learn how to leverage deep learning models with Watson Machine Learning and Watson Visual Recognition in your applications. Watson Machine Learning offers a powerful Python API built for controlling the machine learning lifecycle programmatically. You can build, save, and deploy a variety of models. In this post, we'll save a retrained TensorFl…
4y ago
In DSX, we use projects to organize resources such as data, notebooks, models, and connections. To interact with these assets easily, we now have project-lib alongside the object storage APIs. Project-lib is a programmatic interface for working with the data stored in your project's object storage. It lets you access all your project assets, including files, connections, and metadata.
In this blog, we will explore the project-lib library in a Python notebook.
Set-Up Your Project
Project-lib is pre-installed on DSX and can be imported into a notebook in a few simple steps:
Click the More option (three dots) from…
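A minimal sketch of the read path: on DSX, the inserted project token cell wires up project-lib, whose `get_file` call returns a file-like object for a project asset. Here an in-memory buffer stands in for that object so the snippet runs anywhere; the project id, token, and file name are placeholders:

```python
import io
import csv

# On DSX, the inserted project token cell would give you roughly:
#   from project_lib import Project
#   project = Project(sc, '<project_id>', '<project_token>')
#   data = project.get_file('sales.csv')   # file-like object for the asset
# For illustration, a BytesIO stands in for what get_file returns:
data = io.BytesIO(b"id,amount\n1,10\n2,20\n")

# Parse the file-like object exactly as you would the project-lib result
rows = list(csv.DictReader(io.TextIOWrapper(data, encoding="utf-8")))
print(len(rows), rows[0]["amount"])  # 2 10
```

In a real notebook you would typically hand the same file-like object to pandas (`pd.read_csv(data)`) instead of the csv module.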
4y ago
In this blog post we will try to predict chronic kidney disease using various attributes collected from hospitals. Chronic kidney disease (CKD) is a condition characterized by a gradual loss of kidney function over time, which may lead to kidney failure.
Data Source:
We are using the UCI Chronic Kidney Disease data set from the Data Science Experience community. You can get data into your project in two steps:
1. Go to the Data Science Experience community. You can also navigate to the community from DSX by clicking the Community tab on the top panel.
2. Select the data set from the community and…
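A hedged sketch of the modeling step: the post uses the UCI CKD data set, but here a tiny synthetic table with illustrative columns (age, blood pressure, serum creatinine) stands in so the pipeline is self-contained. The model choice and split below are assumptions, not necessarily the post's exact approach:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 200
# Synthetic stand-in for the CKD attributes (columns are illustrative)
X = np.column_stack([
    rng.integers(20, 80, n),     # age
    rng.normal(80, 10, n),       # blood pressure
    rng.normal(1.2, 0.6, n),     # serum creatinine
])
# In this toy data the CKD label is driven by elevated creatinine
y = (X[:, 2] > 1.4).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)
print("test accuracy:", round(clf.score(X_te, y_te), 2))
```

With the real UCI data, the same fit/score pattern applies once the categorical attributes are encoded and missing values imputed.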
4y ago
In this article, you will learn how to bring data into RStudio in Data Science Experience from IBM Cloud Object Storage (COS), and write data from RStudio back into IBM Cloud Object Storage, using ‘sparklyr’ and ‘Stocator’ to work with Spark.
1. First, connect to the Spark service using sparklyr's spark_connect function. You can refer to this post for details.

library(sparklyr)
library(dplyr)

# list available Spark kernels
kernels <- list_spark_kernels()
kernels

# connect to a Spark kernel
sc <- spark_connect(config = kernels[3])
2. Next, set up credentials for Object Storage. The easiest way is to get the c…