John D. Cook Consulting Blog on Feedspot

John D. Cook Consulting Blog

The Endeavor gives readers a sense of how they might combine their programming skills with business, and/or use those skills to solve real-world problems. It is the popular computer science-themed blog of John Cook, a former math professor and programmer who has transitioned into consulting. He helps companies make better decisions by taking advantage of the data they have, combining it with..

Logistic regression quick takes

by John

2d ago

This post is a series of quick thoughts related to logistic regression. It starts with this article on moving between logit and probability scales.

***

Logistic regression models the probability of a yes/no event occurring. It gives you more information than a model that simply tries to classify yeses and nos. I advised a client to move from an uninterpretable classification method to logistic regression and they were so excited about the result that they filed a patent on it. It’s too late to patent logistic regression, but they filed a patent on the application of logistic regression to thei ..read more
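
As a rough illustration of moving between the logit and probability scales mentioned above, here is a minimal Python sketch. The function names logit and inv_logit are chosen for this example and are not taken from the post.

```python
import math

def logit(p):
    """Map a probability in (0, 1) to the log-odds (logit) scale."""
    return math.log(p / (1.0 - p))

def inv_logit(x):
    """Map a log-odds value back to a probability (the logistic function)."""
    # Branch on the sign so that exp() is only called on non-positive
    # arguments and cannot overflow for large |x|.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

if __name__ == "__main__":
    for p in (0.1, 0.5, 0.9):
        x = logit(p)
        print(p, x, inv_logit(x))
```

The round trip inv_logit(logit(p)) should return p up to floating point error.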

Numerical application of mean value theorem

by John

2d ago

Suppose you’d like to evaluate the function u(z) = (e^z − 1 − z)/z^2 for small values of z, say z = 10^−8. This example comes from [1]. The Python code

    from numpy import exp

    def f(z): return (exp(z) - 1 - z)/z**2

    print(f(1e-8))

prints -0.607747099184471. Now suppose you suspect numerical difficulties and compute your result to 50 decimal places using bc -l.

    scale = 50
    z = 10^-8
    (e(z) - 1 - z)/z^2

Now you get u(z) = .50000000166666667…. This suggests the original calculation was completely wrong. What’s going on? For small z, e^z ≈ 1 + z and so we lose precision when directly evaluating the numer ..read more
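
One way to sidestep the cancellation for small z is to sum the Taylor series of the function directly. This is a swapped-in technique for illustration, not necessarily the approach the post goes on to describe, and the names f_naive and f_series are made up for this sketch. It relies on the expansion (e^z − 1 − z)/z^2 = 1/2 + z/3! + z^2/4! + …, which is only a sensible choice when |z| is small.

```python
import math

def f_naive(z):
    """Direct evaluation; loses precision for small z due to cancellation."""
    return (math.exp(z) - 1 - z) / z**2

def f_series(z, terms=20):
    """Sum the first `terms` terms of the series sum_{k>=0} z^k / (k+2)!."""
    total = 0.0
    term = 0.5  # k = 0 term: z^0 / 2!
    for k in range(terms):
        total += term
        term *= z / (k + 3)  # turn z^k/(k+2)! into z^(k+1)/(k+3)!
    return total

if __name__ == "__main__":
    z = 1e-8
    print("naive: ", f_naive(z))
    print("series:", f_series(z))
```

For moderate z, where the direct formula is accurate, the two functions should agree closely, which gives a quick sanity check on the series.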

Numerical differentiation with a complex step

by John

2d ago

The most obvious way to approximate the derivative of a function numerically is to use the definition of derivative and stick in a small value of the step size h. f′ (x) ≈ ( f(x + h) − f(x) ) / h. How small should h be? Since the exact value of the derivative is the limit as h goes to zero, the smaller h is the better. Except that’s not true in computer arithmetic. When h is too small, f(x + h) is so close to f(x) that the subtraction loses precision. One way to see this is to think of the extreme case of h so small that f(x + h) equals f(x) to machine precision. Then the derivative is approxi ..read more
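
The complex step idea named in the title can be sketched as follows. For a function that is analytic and real on the real axis, f′(x) ≈ Im f(x + ih) / h, and because there is no subtraction, h can be made extremely small. The test function and the helper names below are assumptions made for this illustration.

```python
import cmath
import math

def forward_difference(f, x, h):
    """Forward difference (f(x + h) - f(x)) / h from the definition."""
    return (f(x + h) - f(x)).real / h

def complex_step(f, x, h=1e-20):
    """Complex step estimate Im f(x + ih) / h; f must accept complex input."""
    return f(complex(x, h)).imag / h

def f(z):
    # Example function (an assumption for this sketch): exp(z) * sin(z)
    return cmath.exp(z) * cmath.sin(z)

if __name__ == "__main__":
    x = 1.0
    exact = math.exp(x) * (math.sin(x) + math.cos(x))
    print("exact:              ", exact)
    print("forward, h = 1e-8:  ", forward_difference(f, x, 1e-8))
    print("forward, h = 1e-20: ", forward_difference(f, x, 1e-20))
    print("complex step:       ", complex_step(f, x))
```

With h = 1e-20 the forward difference collapses to zero, since x + h equals x in floating point, while the complex step estimate stays accurate.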

MCMC and the coupon collector problem

by John

2d ago

Bob Carpenter wrote today about how Markov chains cannot thoroughly cover high-dimensional spaces, and that they do not need to. That’s kinda the point of Monte Carlo integration. If you want systematic coverage, you can/must sample systematically, and that’s impractical in high dimensions. Bob gives the example that if you want to get one integration point in each quadrant of a 20-dimensional space, you need a million samples. (2^20 to be precise.) But you may be able to get sufficiently accurate results with a lot less than a million samples. If you wanted to be certain to have one sample in ..read more
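
To connect this to the coupon collector problem in the title: if samples landed uniformly and independently among n equally likely cells (an idealizing assumption made for this sketch), the expected number of samples needed to hit every cell at least once is n times the nth harmonic number. The function name expected_draws is hypothetical.

```python
import math

def expected_draws(n):
    """Expected number of uniform random draws to see all n outcomes: n * H_n."""
    harmonic = math.fsum(1.0 / k for k in range(1, n + 1))
    return n * harmonic

if __name__ == "__main__":
    n = 2**20  # number of orthants in 20 dimensions
    print(f"cells: {n}, expected draws: {expected_draws(n):,.0f}")
```

For n = 2^20 this comes to roughly 15 million draws on average, well over the 2^20 points that systematic sampling would use, and an expected count is still not a guarantee of full coverage.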

Up and down the abstraction ladder

by John

5d ago

It’s easier to go up a ladder than to come down, literally and metaphorically. Gian-Carlo Rota made a profound observation on the application of theory. One frequently notices, however, a wide gap between the bare statement of a principle and the skill required in recognizing that it applies to a particular problem. This isn’t quite what he said. I made two small edits to generalize his actual statement. He referred specifically to the application of the inclusion-exclusion principle to problems in combinatorics. Here is his actual quote from [1]. One frequently notices, however, a wide gap ..read more

Making documents with makefiles

by John

1w ago

I learned to use the command line utility make in the context of building C programs. The program make reads an input file to tell it how to make things. To make a C program, you compile the source files into object files, then link the object files together. You can tell make what depends on what, so it will not do any unnecessary work. If you tell it that a file foo.o depends on a file foo.c, then it will rebuild foo.o if that file is older than the file foo.c that it depends on. Looking at file timestamps allows make to save time by not doing unnecessary work. It also makes it easier to dic ..read more
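
The timestamp rule described above can be mimicked in a few lines of Python. This is only an illustration of the idea, not how make is implemented, and the function name needs_rebuild is hypothetical.

```python
import os

def needs_rebuild(target, source):
    """Return True if target is missing or older than the source it depends on."""
    if not os.path.exists(target):
        return True
    return os.path.getmtime(target) < os.path.getmtime(source)

if __name__ == "__main__":
    # foo.o and foo.c are the file names used in the text; foo.c must exist
    # in the current directory for the comparison to be made.
    if os.path.exists("foo.c"):
        print("rebuild foo.o?", needs_rebuild("foo.o", "foo.c"))
```

A real build tool applies this check across the whole dependency graph, rebuilding a target when any of its prerequisites is newer.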

Applied abstraction

by John

1w ago

“Good general theory does not search for the maximum generality, but for the right generality.” — Saunders Mac Lane One of the benefits of a scripting language like Python is that it gives you generalizations for free. For example, take the function sorted. If you give it a list of integers, it will return a list of numerically sorted integers. But you could also give it a list of strings, and it will return a list of alphabetically sorted strings. In fact, you can give it more general collections of more general things. The user gets generality for free: more functionality without comp ..read more
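
A quick illustration of the point about sorted, using inputs invented for this example:

```python
# The same built-in handles different element types and different containers.
sorted_ints = sorted([3, 1, 2])                   # list of integers
sorted_strs = sorted(["pear", "apple", "fig"])    # list of strings
sorted_keys = sorted({"b": 2, "a": 1})            # iterating a dict yields its keys
sorted_mixed = sorted((2.5, 1, 3))                # tuple mixing ints and floats

if __name__ == "__main__":
    print(sorted_ints)   # [1, 2, 3]
    print(sorted_strs)   # ['apple', 'fig', 'pear']
    print(sorted_keys)   # ['a', 'b']
    print(sorted_mixed)  # [1, 2.5, 3]
```

In each case the result is a new list, and the only requirement is that the elements can be compared with one another.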

Fizz buzz walk

by John

1w ago

I ran across a graphic yesterday made by taking a sequence of steps of the same length, turning left on the nth step if n is prime, and otherwise continuing in the same direction. Here’s my recreation of the first 1000 steps: You can see that in general it makes a lot of turns at first and then turns less often as the density of primes thins out. I wondered what the analogous walk would look like for Fizz Buzz. There are several variations on Fizz Buzz, and the one that produced the most interesting visualization was to turn left when a number either is divisible by 7 or contains the digit 7 ..read more
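
A minimal sketch of the turning rule described above, under assumptions the excerpt does not settle: each left turn is taken to be 90 degrees, the turn happens before the nth step is taken, and the walk starts at the origin heading in the positive x direction. The names turns_left and walk are hypothetical, and this is not the author's code.

```python
def turns_left(n):
    """Rule from the text: n is divisible by 7 or contains the digit 7."""
    return n % 7 == 0 or "7" in str(n)

def walk(num_steps):
    """Return the visited points as complex numbers, starting at the origin."""
    position = 0 + 0j
    direction = 1 + 0j  # assumed initial heading: positive x direction
    path = [position]
    for n in range(1, num_steps + 1):
        if turns_left(n):
            direction *= 1j  # assumed 90 degree counterclockwise turn
        position += direction
        path.append(position)
    return path

if __name__ == "__main__":
    path = walk(1000)
    print(len(path), "points, final position:", path[-1])
```

The real and imaginary parts of each point give x and y coordinates that can be handed to any plotting library.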

Closed-form solutions to nonlinear PDEs

by John

2w ago

The traditional approach to teaching differential equations is to present a collection of techniques for finding closed-form solutions to ordinary differential equations (ODEs). These techniques seem completely unrelated [1] and have arcane names such as integrating factors, exact equations, variation of parameters, etc. Students may reasonably come away from an introductory course with the false impression that it is common for ODEs to have closed-form solutions because it is common in the class. My education reacted against this. We were told from the beginning that differential equations ra ..read more

Choosing a Computer Language for a Project

by Wayne Joubert

2w ago

Julia. Scala. Lua. TypeScript. Haskell. Go. Dart. Various computer languages new and old are sometimes proposed as better alternatives to mainstream languages. But compared to mainstream choices like Python, C, C++ and Java (cf. Tiobe Index)—are they worth using? Certainly it depends a lot on the planned use: is it a one-off small project, or a large industrial-scale software application? Yet even a one-off project can quickly grow to production-scale, with accompanying growing pains. Startups sometimes face a growth crisis when the nascent code base becomes unwieldy and must be refactored or ..read more
