Implied Probabilities
Data Mining in MATLAB
2y ago
Oddsmakers provide a useful prediction mechanism for many subjects of interest. Beyond sports, they host prediction markets for events in politics, entertainment, current events and other fields. There are some subtleties, however, in converting payout odds to probabilities. Note that payout odds (also called payoff odds or house odds) are expressed in this article as fractional odds. Casinos and online bookmakers use several other popular formats, such as decimal odds and American odds, which simply present the same information in a different way. Payout ..read more
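As a rough illustration of the conversion mentioned above, the sketch below turns fractional payout odds of a-to-b (a profit of a for every b staked) into the implied probability b / (a + b). The function names are hypothetical, and the rescaling step assumes that one of the subtleties involved is the bookmaker's margin, which makes the raw probabilities of all outcomes sum to more than 1. The excerpt is truncated, so this may not match the article's own treatment.

```python
def implied_probability(numerator, denominator):
    """Convert fractional payout odds numerator/denominator to a probability.

    Odds of 3/1 mean a profit of 3 for every 1 staked, so the break-even
    probability of winning is 1 / (3 + 1) = 0.25.
    """
    if numerator < 0 or denominator <= 0:
        raise ValueError("odds must have numerator >= 0 and denominator > 0")
    return denominator / (numerator + denominator)


def remove_margin(probabilities):
    """Rescale raw implied probabilities so that they sum to 1.

    Assumption: the excess over 1 is the bookmaker's margin and is spread
    proportionally across outcomes. Other adjustment schemes exist.
    """
    total = sum(probabilities)
    if total <= 0:
        raise ValueError("probabilities must sum to a positive value")
    return [p / total for p in probabilities]


# Two-outcome market quoted at 1/1 and 1/3 (hypothetical figures).
raw = [implied_probability(1, 1), implied_probability(1, 3)]
fair = remove_margin(raw)
```

Here `raw` is `[0.5, 0.75]`, which sums to 1.25; after rescaling the two values are approximately 0.4 and 0.6.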
A New Dawn for Local Learning Methods?
4y ago
The relentless improvement in speed of computers continues. While some technical barriers to this progress have begun to emerge, exploitation of parallelism has actually increased the rate of acceleration for many purposes, especially in applied mathematical fields such as data mining. Interestingly, new, powerful hardware has been put to the task of running ever more baroque algorithms. Feedforward neural networks, once trained over several days, now train in minutes on affordable desktop hardware. Over time, ever fancier algorithms have been fed to these machines: boosting, support vector m ..read more
Data Analytics Summit III at Harrisburg University of Science and Technology
4y ago
Harrisburg University of Science and Technology (Harrisburg, Pennsylvania) has just finished hosting Data Analytics Summit III. This is a multi-day event featuring a mix of presenters from the private sector, government and government-related businesses, and academia, spanning research, practice and more visionary ("big picture") topics. The theme was "Analytics Applied: Case Studies, Measuring Impact, and Communicating Results". Regrettably, I was unable to attend this time because I was traveling for business, but I was at Data Analytics Summit II, which was held in December of 2015. If y ..read more
Geographic Distances: A Quick Trip Around the Great Circle
4y ago
Recently, I wanted to calculate the distance between locations on the Earth. Having found a handy solution, I thought readers might be interested. In my situation, location data included ZIP codes (American postal codes). Also available to me was a look-up table of the latitude and longitude of the geometric centroid of each ZIP code. Since the areas identified by ZIP codes are usually geographically small, and making the "close enough" assumption that this planet is perfectly spherical, trigonometry will allow distance calculations which are, for most purposes, precise enough. Given the latitude and ..read more
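One common way to carry out such a spherical distance calculation is the haversine formula, sketched below. The excerpt is truncated before it shows the formula actually used, so this is only one possible choice. The function name and the mean Earth radius of 6371 km are assumptions made for illustration.

```python
import math

# Assumption: mean Earth radius in kilometres under the spherical model.
EARTH_RADIUS_KM = 6371.0


def great_circle_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    delta_phi = phi2 - phi1
    delta_lambda = math.radians(lon2 - lon1)

    # Haversine of the central angle between the two points.
    a = (math.sin(delta_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(delta_lambda / 2) ** 2)

    # Clamp to 1.0 to guard against rounding error near antipodal points.
    central_angle = 2 * math.asin(min(1.0, math.sqrt(a)))
    return EARTH_RADIUS_KM * central_angle
```

With a ZIP code look-up table, the two centroid coordinate pairs would be passed in directly, for example `great_circle_km(lat_a, lon_a, lat_b, lon_b)`.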
Four Books Worth Owning
4y ago
Below are listed four books on statistics which I feel are worth owning. They largely take a "traditional" statistics perspective, as opposed to a machine learning/data mining one. With the exception of "The Statistical Sleuth", these are less like textbooks than guide-books, with information reflecting the experience and practical advice of their respective authors. Comparatively few of their pages are devoted to predictive modeling; rather, they cover a host of topics relevant to the statistical analyst: sample size determination, hypothesis testing, assumptions, sampling technique, etc ..read more
Re-Coding Categorical Variables
4y ago
Categorical variables as candidate predictors pose a distinct challenge to the analyst, especially when they exhibit high cardinality (a large number of distinct values). Numerical models (for instance linear regression and most neural networks) cannot accept these variables directly as inputs, since operations between categories and numbers are not defined. It is sometimes advantageous (even necessary) to re-code such variables as one or more numeric dummy variables, with each new variable containing a 0 or 1 value indicating the presence (1) or absence (0) of one distinct value. This often ..read more
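The 0/1 re-coding described above can be sketched in a few lines. The helper name `dummy_code` and the sample data are hypothetical. The sketch creates one indicator column per distinct value, which is exactly what becomes unwieldy when cardinality is high.

```python
def dummy_code(values):
    """Return one 0/1 indicator column per distinct category.

    The result maps each distinct value to a list as long as the input,
    holding 1 where the observation equals that value and 0 elsewhere.
    """
    categories = sorted(set(values))
    return {
        category: [1 if value == category else 0 for value in values]
        for category in categories
    }


# Hypothetical categorical predictor with two distinct values.
colors = ["red", "blue", "red"]
dummies = dummy_code(colors)
```

For this input, `dummies` is `{"blue": [0, 1, 0], "red": [1, 0, 1]}`. For models with an intercept, one of the columns is often dropped to avoid redundancy.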
MATLAB as (Near-)PseudoCode
4y ago
In "Teaching Data Science in English (Not in Math)", the Feb-08-2016 entry of his Web log, "The Datatist", Charles Givre criticizes the use of specialized math symbols (capital sigma for summation, etc.) and Greek letters as being confusing, especially to newcomers to the field. He offers, as an example, the following, traditional definition of sum of squared errors:
He suggests that "English" (pseudo-code) be used instead, such as the following:
residuals_squared = (actual_values - predictions) ^ 2
RSS = sum( residuals_squared )
Although there are some flaws in this particular compari ..read more
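For comparison, the quoted pseudo-code can be made runnable with very little change. The sketch below is a plain Python rendering, not the MATLAB version the article presumably goes on to discuss. The function name and sample numbers are hypothetical.

```python
def residual_sum_of_squares(actual_values, predictions):
    """Sum of squared differences between actual and predicted values."""
    if len(actual_values) != len(predictions):
        raise ValueError("inputs must have the same length")
    # Square each residual element by element, then add them up.
    residuals_squared = [
        (actual - predicted) ** 2
        for actual, predicted in zip(actual_values, predictions)
    ]
    return sum(residuals_squared)


# Hypothetical data: residuals are 1, 0 and -2, so the squares are 1, 0 and 4.
rss = residual_sum_of_squares([3, 5, 7], [2, 5, 9])
```

Here `rss` is 5. The squaring step has to be applied element by element, a detail the pseudo-code leaves implicit.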
Probability: A Halloween Puzzle
4y ago
Introduction
Though Halloween is months away, I found the following interesting and thought readers might enjoy examining my solution. Recently, I was given the following probability question to answer:
Halloween Probability Puzzler
The number of trick-or-treaters knocking on my door in any five minute interval between 6 and 8pm on Halloween night is distributed as a Poisson with a mean of 5 (ignoring time effects). The number of pieces of candy taken by each child, in addition to the expected one piece per child, is distributed as a Poisson with a mean of 1. What is the minimum number of ..read more
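The setup described so far can be explored by simulation. The sketch below uses only the quantities stated in the puzzle: 24 five-minute intervals between 6 and 8pm, a Poisson(5) number of children per interval, and one piece plus a Poisson(1) number of extra pieces per child. The question itself is cut off in this excerpt, so the code only estimates the distribution of total candy taken in a night and does not attempt to answer it. All function names are hypothetical, and the Poisson sampler uses Knuth's multiplication method, which is adequate for small means like these.

```python
import math
import random


def poisson_draw(rng, mean):
    """Draw one Poisson variate using Knuth's multiplication method."""
    threshold = math.exp(-mean)
    count = 0
    product = rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_night(rng, intervals=24, mean_children=5.0, mean_extra=1.0):
    """Total pieces of candy taken during one simulated evening."""
    total = 0
    for _ in range(intervals):
        children = poisson_draw(rng, mean_children)
        for _ in range(children):
            # Each child takes one piece plus a Poisson number of extras.
            total += 1 + poisson_draw(rng, mean_extra)
    return total


def simulate_totals(trials, seed=0):
    """Simulate many evenings with a seeded generator for repeatability."""
    rng = random.Random(seed)
    return [simulate_night(rng) for _ in range(trials)]


totals = simulate_totals(2000, seed=1)
average_total = sum(totals) / len(totals)
```

The theoretical mean is 24 × 5 × (1 + 1) = 240 pieces, so `average_total` should land close to that value. Sorting `totals` gives empirical quantiles for whatever threshold the full question asks about.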