The Endeavour - Statistical consulting, expert applied math, data analysis.
Math Blog by John D. Cook. I help companies make better decisions by taking advantage of the data they have, combining it with expert opinion, creating mathematical models, overcoming computational difficulties, and interpreting the results. My blog The Endeavour has short, self-contained articles on a variety of topics, technical and non-technical.
I’ve run into potential polynomials a lot lately. Most sources I’ve seen are either unclear on how they are defined or use a notation that doesn’t sit well in my brain. Also, they’ll either give an implicit or explicit definition but not both. Both formulations are important. The implicit formulation suggests how potential polynomials arise in practice. The explicit formulation shows how you’d actually compute them when you need to.
The potential polynomials A_{r,n} are homogeneous polynomials of degree r in n variables, and they are usually defined implicitly.
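The implicit definition I have in mind is the ordinary form, roughly following Comtet's Advanced Combinatorics; take the exact indexing and normalization here as my assumption rather than a universal convention:

```latex
\left(1 + \sum_{k=1}^{\infty} x_k t^k\right)^{r}
  = 1 + \sum_{n=1}^{\infty} A_{r,n}(x_1, \dots, x_n)\, t^n
```

Note that the exponent r need not be a positive integer; the definition makes sense for arbitrary r.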
Does anyone know why they’re called “potential” polynomials? Is there some analogy with a physical potential?
By the way, potential polynomials are called “ordinary” in the same sense that Bell polynomials and generating functions are called ordinary: to contrast with exponential forms that insert factorial scaling factors. See the previous post for details.
Bell polynomials come up naturally in a variety of contexts: combinatorics, analysis, statistics, etc. Unfortunately, the variations on Bell polynomials are confusingly named.
Analogy with differential equations
There are Bell polynomials of one variable and Bell polynomials of several variables. The latter are sometimes called “partial” Bell polynomials. In differential equations, “ordinary” means univariate (ODEs) and “partial” means multivariate (PDEs). So if “partial” Bell polynomials are the multivariate form, you might assume that “ordinary” Bell polynomials are the univariate form. Unfortunately that’s not the case.
It seems that the “partial” in the context of Bell polynomials was taken from differential equations, but the term “ordinary” was not. Ordinary and partial are not opposites in the context of Bell polynomials. A Bell polynomial can be ordinary and partial!
Analogy with generating functions
“Ordinary” in this context is the opposite of “exponential,” not the opposite of “partial.” The analogy is with generating functions, not differential equations. The ordinary generating function of a sequence multiplies the nth term by x^n and sums. The exponential generating function of a sequence multiplies the nth term by x^n/n! and sums. For example, the ordinary generating function for the Fibonacci numbers is

x / (1 − x − x²)

while the exponential generating function is

(exp(φx) − exp(ψx)) / √5

where φ = (1 + √5)/2 and ψ = (1 − √5)/2 are the two roots of x² = x + 1.
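As a quick sanity check, SymPy can expand the ordinary generating function x/(1 − x − x²) and confirm that its power series coefficients are the Fibonacci numbers:

```python
from sympy import symbols, series, fibonacci

x = symbols('x')
ogf = x / (1 - x - x**2)   # ordinary generating function of the Fibonacci numbers

# Expand as a power series and read off the coefficients of x^0, x^1, ...
s = series(ogf, x, 0, 10).removeO()
coeffs = [s.coeff(x, n) for n in range(10)]
print(coeffs)  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```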
The definitions of exponential and ordinary Bell polynomials are complicated (you can find them on Wikipedia, for example), but the difference between the two that I wish to point out is that the former divides the kth polynomial argument by k! while the latter does not. They also differ by a scaling factor. The exponential form of B_{n,k} has a factor of n! where the ordinary form has a factor of k!.
“Ordinary” as a technical term
Based on the colloquial meaning of “ordinary” you might assume that it is the default. And indeed that is the case with generating functions. Without further qualification, generating function means ordinary generating function. You’ll primarily see explicit references to ordinary generating functions in a context where it’s necessary to distinguish them from some other kind of generating function. Usually the word “ordinary” will be written in parentheses to emphasize that an otherwise implicit assumption is being made explicit. In short, “ordinary” and “customary” correspond in the context of generating functions.
But in the context of Bell polynomials, it seems that the exponential form is more customary. At least in some sources, an unqualified reference to Bell polynomials refers to the exponential variety. That’s the case in SymPy where the function bell() implements exponential Bell polynomials and there is no function to compute ordinary Bell polynomials.
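For example, here is bell() in action, computing both a Bell number and an exponential partial Bell polynomial B_{n,k}:

```python
from sympy import bell, symbols

# Bell number B_4: the number of partitions of a 4-element set
print(bell(4))   # 15

# Exponential partial Bell polynomial B_{4,2}(x1, x2, x3),
# which equals 3*x2**2 + 4*x1*x3
x1, x2, x3 = symbols('x1 x2 x3')
print(bell(4, 2, (x1, x2, x3)))
```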
I often use SVG images because they look great on a variety of devices, but most email clients won’t display that format. If you subscribe by email, there’s always a link to open the article in a browser and see all the images.
I also have a monthly newsletter. It highlights the most popular posts of the month and usually includes a few words about what I’ve been up to.
I have 17 Twitter accounts that post daily on some topic and one personal account.
The three most popular are CompSciFact, AlgebraFact, and ProbFact. These accounts are a little broader than the names imply. For example, if I run across something that doesn’t fall neatly into another mathematical category, I’ll post it to AlgebraFact. Miscellaneous applied math tends to end up on AnalysisFact.
I don’t keep up with replies to my topical accounts, but I usually look at replies to my personal account. If you want to be sure I get your message, please call or send me email.
You can find a list of all accounts and their descriptions here.
Here’s my contact info.
My phone number isn’t on there. It’s 832.422.8646. If you’d like, you can import my contact info as a vCard or use the QR code below.
A “squircle” is a sort of compromise between a square and circle, but one that differs from a square with rounded corners. It’s a shape you’ll see, for example, in some of Apple’s designs.
A straight line has zero curvature, and a circle of radius r has curvature 1/r. So in a rounded square the curvature jumps from 0 to 1/r where a flat side meets a circular corner. But in the figure above there is no abrupt change in curvature; instead there is a smooth transition. More on that in just a bit.
To get a smoother transition from the corners to the flat sides, designers use what mathematicians would call L^p balls, curves satisfying

|x|^p + |y|^p = 1

in rectangular coordinates or

r = (|cos θ|^p + |sin θ|^p)^(−1/p)

in polar coordinates.
When p = 2 we get a circle. For p > 2 we get the square version of a superellipse: as p increases, the corners get sharper, pushing out toward the corners of a square. Some sources define the squircle to be the case p = 4 while others say p = 5. The image at the top of the post uses p = 4. The larger p is, the closer the figure becomes to a square.
To show how the curvature changes, let’s plot the curvature on top of the squircle. The inner curve is the squircle itself, and radial distance to the outer curve is proportional to the curvature.
Here’s the plot for p = 4.
And here’s the plot for p = 5.
If we were to make the same plot for a rounded square, the curvature would be zero over the flat sides and jump to some constant value over the circular caps. We can see above that the curvature is largest over the corners but continuously approaches zero toward the middle of the sides.
Suppose you have a differential equation of the form

y″ = f(x) y.

If the function f(x) is constant, the differential equation is easy to solve in closed form. But when it is not constant, it is very unlikely that closed form solutions exist. But there may be useful closed form approximate solutions.
There is no linear term in the equation above, but there is a standard change of variables to eliminate a linear term and put any second order linear equation with variable coefficients in the form above.
The Liouville-Green approximate solutions are linear combinations of the functions

f(x)^(−1/4) exp(±∫ √f(x) dx).
The constant of integration doesn’t matter. It would only change the solution by a constant multiple, so any indefinite integral will do.
To try out the Liouville-Green approximation, let's look at the equation

y″ = x² y.

Here f(x) = x² and so the LG approximate solutions are

y = A x^(−1/2) exp(x²/2) + B x^(−1/2) exp(−x²/2).
Let's see how these solutions compare to a numerical solution of the equation. To make things more definite, we need initial conditions. Let's say A = B = 1, which corresponds to initial conditions y(1) = 2.25525 and y′(1) = −0.0854354.
Here’s a plot of the L-G approximation and a numerical solution.
Here’s the Mathematica code that was used to create the plot above.
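The Mathematica code isn't reproduced here, but the same comparison can be sketched in Python. This is my own construction, not the post's code; it assumes SciPy's solve_ivp for the numerical solution:

```python
import numpy as np
from scipy.integrate import solve_ivp

# y'' = x^2 y, written as a first-order system [y, y']
def rhs(x, y):
    return [y[1], x**2 * y[0]]

# Liouville-Green approximation with A = B = 1
def lg(x):
    return x**-0.5 * (np.exp(x**2 / 2) + np.exp(-x**2 / 2))

x0, x1 = 1.0, 3.0
y0 = [2.25525, -0.0854354]          # initial conditions from above
xs = np.linspace(x0, x1, 201)
sol = solve_ivp(rhs, (x0, x1), y0, t_eval=xs, rtol=1e-10, atol=1e-12)

rel_err = np.abs(sol.y[0] - lg(xs)) / np.abs(sol.y[0])
print(rel_err.max())                # LG tracks the true solution within a modest relative error
```

Plotting sol.y[0] and lg(xs) with matplotlib reproduces the kind of comparison shown in the plot.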
Yesterday I said on Twitter “Time to see whether practice agrees with theory, moving from LaTeX to Python. Wish me luck.” I got a lot of responses to that because it describes the experience of a lot of people. Someone asked if I’d blog about this. The content is confidential, but I’ll talk about the process. It’s a common pattern.
I’m writing code based on a theoretical result in a journal article.
First the program gets absurd results. Theory pinpoints a bug in the code.
Then the code confirms my suspicions of an error in the paper.
The code also uncovers, not an error per se, but an important missing detail in the paper.
Then code and theory differ by about 1%. I'm uncertain whether this is theoretical approximation error or a subtle bug.
Then I try the code for different arguments, ones for which theory predicts less approximation error, and now code and theory differ by 0.1%.
Then I have more confidence in (my understanding of) the theory and in my code.
Hermann Weyl is my great model. He used to write beautiful literature. Reading it was a joy because he put a lot of thought into it. Hermann Weyl wrote a book called The Classical Groups, very famous book, and at the same time a book appeared by Chevalley called Lie Groups. Chevalley’s book was compact, all the theorems are there, very precise, Bourbaki-style definitions. If you want to learn about it, it’s all there. Hermann Weyl’s book was the other way around. It is discursive, expansive, it’s a bigger book, it tells you much more than you need to know. It’s the kind of book you read ten times. Chevalley’s book you read once. …
In the introduction he explains that he’s German, he’s writing in a foreign language, he apologizes saying he is writing in a language that was “not sung by the gods at my cradle.” You can’t get any more poetic than that. I’m a great admirer of his style, of his exposition and his mathematics. They go together.
Here’s the portion of the preface that Atiyah is quoting, where Weyl apologizes eloquently for his lack of eloquence in writing English.
The gods have imposed upon my writing the yoke of a foreign tongue that was not sung at my cradle.
“Was dies heissen will, weiss jeder,
Der im Traum pferdlos geritten,”

(“What this means, everyone knows who has ridden horseless in a dream,”)

I am tempted to say with Gottfried Keller. Nobody is more aware than myself of the attendant loss of vigor, ease and lucidity of expression.
Photographer Bernhard Kreutzer came by while I was interviewing Atiyah at the first Heidelberg Laureate Forum in 2013.
Which is more accurate? Let’s make a couple of plots to see.
First here are the results on the linear scale.
Note that the short series is far more accurate than the long series! The differences are more dramatic on the log scale. There you can see that you get more correct significant figures from the short series as the angle approaches zero.
What’s going on? Shouldn’t you get more accuracy from a longer Taylor series approximation?
Yes, but there’s an error in our code. The leading coefficient in long_series is wrong in the 4th decimal place. That small error in the most important term outweighs the benefit of adding more terms to the series. The simpler code, implemented correctly, is better than the more complicated code with a small error.
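The code from the post isn't shown above, but the effect is easy to reproduce. Here's a hypothetical version of the bug: a correct 3-term series versus a 5-term series whose leading coefficient is wrong in the 4th decimal place (the exact typo in the original code may have differed; this illustrates the principle):

```python
import math

def short_series(x):
    # Correct 3-term Taylor series for sin(x)
    return x - x**3/6 + x**5/120

def long_series(x):
    # 5-term series, but the leading coefficient is 0.9999 instead of 1
    return 0.9999*x - x**3/6 + x**5/120 - x**7/5040 + x**9/362880

x = 0.01
print(abs(short_series(x) - math.sin(x)))  # tiny: the first omitted term is x^7/5040
print(abs(long_series(x) - math.sin(x)))   # about 1e-6, dominated by the bad leading term
```

Near zero the flawed long series has relative error around 10^-4 no matter how many correct higher-order terms it carries.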
Focus on the leading term. Until it’s right, the rest of the terms don’t matter.
Here’s an apparent paradox. You’ll hear that Monte Carlo methods are independent of dimension, and that they scale poorly with dimension. How can both statements be true?
The most obvious way to compute multiple integrals is to use product methods, analogous to the way you learn to compute multiple integrals by hand. Unfortunately the amount of work necessary with this approach grows exponentially with the number of variables, and so the approach is impractical beyond modest dimensions.
The idea behind Monte Carlo integration, or integration by darts, is to estimate the area under a (high dimensional) surface by taking random points and counting how many fall below the surface. More details here. The error goes down like the reciprocal of the square root of the number of points. The error estimate has nothing to do with the dimension per se. In that sense Monte Carlo integration is indeed independent of dimension, and for that reason is often used for numerical integration in dimensions too high to use product methods.
Suppose you’re trying to estimate what portion of a (high dimensional) box some region takes up, and you know a priori that the proportion is in the ballpark of 1/2. You could refine this to a more accurate estimate with Monte Carlo, throwing darts at the box and seeing how many land in the region of interest.
But in practice, the regions we want to estimate are small relative to any box we’d put around them. For example, suppose you want to estimate the volume of a unit ball in n dimensions by throwing darts at the unit cube in n dimensions. When n = 10, only about 1 dart in 400 will land in the ball. When n = 100, only about 1 dart in 10^70 will land inside the ball. (See more here.) It’s very likely you’d never have a dart land inside the ball, no matter how much computing power you used.
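A quick simulation confirms the 10-dimensional figure. The setup here is the natural one, uniform darts in the cube [−1, 1]^10:

```python
import numpy as np
from math import pi

rng = np.random.default_rng(42)
n, N = 10, 1_000_000

# Uniform darts in the cube [-1, 1]^n
darts = rng.uniform(-1.0, 1.0, size=(N, n))
inside = np.sum(np.sum(darts**2, axis=1) <= 1.0)

print(inside / N)                 # roughly 1/400 = 0.0025
print(pi**5 / 120 / 2**10)        # exact ratio: ball volume over cube volume
```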
If no darts land inside the region of interest, then you would estimate the volume of the region of interest to be zero. Is this a problem? Yes and no. The volume is very small, and the absolute error in estimating a small number to be zero is small. But the relative error is 100%.
(For an analogous problem, see this discussion about the approximation sin(θ) ≈ θ for small angles. It’s not just saying that one small number is approximately equal to another small number: all small numbers are approximately equal in absolute error! The small angle approximation has small relative error.)
If you want a small relative error in Monte Carlo integration, and you usually do, then you need to come up with a procedure that puts a larger proportion of integration points in the region of interest. One such technique is importance sampling, so called because it puts more samples in the important region. The closer the importance sampler fits the region of interest, the better the results will be. But you may not know enough a priori to create an efficient importance sampler.
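As a small illustration of the idea (a standard textbook example, not from the post), here is importance sampling for the Gaussian tail probability P(Z > 4), where naive sampling would almost never land in the region of interest:

```python
import numpy as np
from math import erfc, sqrt

rng = np.random.default_rng(0)
N = 100_000

# Proposal: N(4, 1), centered on the "important" region x > 4
x = rng.normal(4.0, 1.0, size=N)

# Weight = target density / proposal density
#        = phi(x) / phi(x - 4) = exp(8 - 4x)
weights = np.exp(8.0 - 4.0 * x)
estimate = np.mean((x > 4.0) * weights)

truth = 0.5 * erfc(4 / sqrt(2))   # P(Z > 4) for a standard normal
print(estimate, truth)
```

With 100,000 samples the importance sampling estimate lands within a few percent of the true value of about 3.2 × 10^-5; naive sampling would typically see only a handful of hits.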
So while the absolute accuracy of Monte Carlo integration does not depend on dimension, the problems you want to solve with Monte Carlo methods typically get harder with dimension.
Spheres and balls are examples of common words that take on a technical meaning in math, as I wrote about here. Recall that the unit sphere in n dimensions is the set of points with distance 1 from the origin. The unit ball is the set of points of distance less than or equal to 1 from the origin. The sphere is the surface of the ball.
Integrating a polynomial in several variables over a ball or sphere is easy. For example, take the polynomial xy² + 5x²z² in three variables. The integral of the first term, xy², is zero. If any variable in a term has an odd exponent, then the integral of that term is zero by symmetry. The integral over half of the sphere (ball) will cancel out the integral over the opposite half of the sphere (ball). So we only need to be concerned with terms like 5x²z².
Now in n dimensions, suppose the exponents of x1, x2, …, xn are a1, a2, …, an respectively. If any of the a‘s are odd, the integral over the sphere or ball will be zero, so we assume all the a‘s are even. In that case the integral over the unit sphere is simply

2 B(b1, b2, …, bn)

where

B(b1, b2, …, bn) = Γ(b1) Γ(b2) ⋯ Γ(bn) / Γ(b1 + b2 + ⋯ + bn)

is the multivariate beta function and for each i we define bi = (ai + 1)/2. When n = 2 then B is the (ordinary) beta function.
Note that the integral over the unit sphere doesn’t depend on the dimension of the sphere.
The integral over the unit ball is

2 B(b1, b2, …, bn) / (a1 + a2 + ⋯ + an + n)

which is proportional to the integral over the sphere, where the proportionality constant depends on the sum of the exponents (the original exponents, the a‘s, not the b‘s) and the dimension n.
Note that if we integrate the constant polynomial 1 over the unit sphere, we get the surface area of the unit sphere, and if we integrate it over the unit ball, we get the volume of the unit ball.
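Folland's formulas are short enough to implement directly. Here's a sketch; the surface area and volume of the unit sphere and ball in 3 dimensions drop out as the special case of the constant polynomial 1:

```python
from math import gamma, pi, isclose

def integrate_over_sphere(a):
    """Integral of x1^a1 * ... * xn^an over the unit sphere in n dimensions.
    Assumes all exponents are even (otherwise the integral is zero)."""
    b = [(ai + 1) / 2 for ai in a]
    num = 2.0
    for bi in b:
        num *= gamma(bi)
    return num / gamma(sum(b))

def integrate_over_ball(a):
    """Integral of the same monomial over the unit ball."""
    return integrate_over_sphere(a) / (sum(a) + len(a))

# Constant polynomial 1 in 3 dimensions: surface area and volume
print(integrate_over_sphere([0, 0, 0]))  # ~12.566 = 4*pi
print(integrate_over_ball([0, 0, 0]))    # ~4.189  = 4*pi/3

# The term 5 x^2 z^2 from the example above, integrated over the unit ball
print(5 * integrate_over_ball([2, 0, 2]))
```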
You can find a derivation of the integral results above in Folland’s paper [1]. The proof is basically Liouville’s trick for integrating the normal distribution density, but backward. Instead of going from rectangular to polar coordinates, you introduce a normal density and go from polar to rectangular coordinates.

[1] Gerald B. Folland, How to Integrate a Polynomial over a Sphere. The American Mathematical Monthly, Vol. 108, No. 5 (May 2001), pp. 446–448.