Posts Tagged ‘data science at scale’

An Overview of Data Profiling

Introduction: Data from various business contexts and processes rarely lend themselves to useful analysis as soon as they are collected. While the sources of such data can be diverse, such as machine logs, sensor data, social networking data, and e-commerce data, there’s no established and uniform way to process diverse…

Time Series Analysis of Sensor Data

The IoT revolution brings with it a glut of sensor data, which in turn poses challenges in statistical modeling and scalable machine learning for sensor data analysis. This is an overview of sensor data analysis. We look at sensor characteristics such as bias, linearity, stability, precision and accuracy. We also cover gage analysis approaches and time series analysis in its univariate and multivariate forms, using models such as AR and VAR. We look at caveats…

Calling out the big data scientist

“Data science” is a popular term, and one in the ascendancy in Gartner’s Hype Cycle for Emerging Technologies 2014. Its meaning varies depending on whom you ask. One way to deal with subjective interpretations is to crowdsource the answer and pick the most popular interpretations, provided there is enough data. Recently, a data scientist (who else?) at LinkedIn attempted to define the term “data scientist” using data from profiles of people who have the phrase “data scientist” across