Combining multiply-imputed datasets, never easy
Andrew Gelman
by Andrew
7h ago
Thomas Hühn writes: I’m thinking about doing a Bayesian analysis of a very small subset of PISA or TIMSS data. Those large-scale education surveys do not report students’ achievement scores as single numbers, but they report five or ten numbers, so-called plausible values. Those plausible values have been sampled from a constructed probability distribution. The user guides and methodology papers strongly warn against taking those five plausible values as five observations, and also against taking the mean of those five plausible values and doing statistical analysis on that. Instead you’re sup ..read more
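The excerpt cuts off before saying what the user guides recommend, so the following is only a sketch of the conventional approach for multiply-imputed data (Rubin's combining rules), not necessarily what the post goes on to suggest. It assumes the same analysis has been run once per plausible value, giving one point estimate and one squared standard error per run; the function name `pool_rubin` and the numbers in the usage lines are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool per-imputation results with Rubin's combining rules.

    estimates: one point estimate per plausible value (hypothetical input)
    variances: the squared standard error of each of those estimates
    Returns (pooled_estimate, pooled_standard_error).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two estimates and matching variances")
    pooled = mean(estimates)        # average of the per-run estimates
    within = mean(variances)        # average sampling variance
    between = variance(estimates)   # variance across runs, divisor m - 1
    total = within + (1 + 1 / m) * between
    return pooled, sqrt(total)


# Made-up results from five runs, one per plausible value.
est, se = pool_rubin([502.1, 499.8, 501.3, 500.6, 498.9],
                     [4.0, 4.2, 3.9, 4.1, 4.0])
print(f"pooled estimate {est:.2f}, standard error {se:.2f}")
```

The point of the between-run term is that the spread across plausible values is itself uncertainty about the student's score, which is lost if the values are averaged before the analysis.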
Two kings, a royal, a knight, and three princesses walk into a bar . . . (Dude from Saudi Arabia accuses the lords of AI of not giving him enough credit.)
Andrew Gelman
by Andrew
1d ago
Roger Critchlow points us to this post from Jürgen Schmidhuber, “How 3 Turing Awardees Republished Key Methods and Ideas Whose Creators They Failed to Credit.” The whole thing is too long and detailed to follow—it’s like one of those pieces of outsider art with scribblings all over the margins of an elaborate painting on an already-patterned fabric—but I did follow one of the links in the post which pointed to this: This reminds me of Stan’s pedantic mode where it gives a warning if you do something like define a variable named sigma without constraining it to be positive or if you assign a ..read more
Who wrote the music for In My Life? Three Bayesian analyses
Andrew Gelman
by Andrew
2d ago
A Beatles fan pointed me to this news item from a few years ago, “A Songwriting Mystery Solved: Math Proves John Lennon Wrote ‘In My Life.'” This surprised me, because in his memoir, Many Years from Now, Paul McCartney very clearly stated that he, Paul, wrote it. Also, the news report is from NPR. Who you gonna trust, NPR or Paul McCartney? The question pretty much answers itself. But I was curious, so I read on: Over the years, Lennon and McCartney have revealed who really wrote what, but some songs are still up for debate. The two even debate between themselves — their memories seem to diff ..read more
Bayesian Workflow, Causal Generalization, Modeling of Sampling Weights, and Time: My talks at Northwestern University this Friday and the University of Chicago on Monday
Andrew Gelman
by Andrew
3d ago
Fri 3 May 2024, 11am at Chambers Hall, Ruan Conference Room – lower level: Audience Choice: Bayesian Workflow / Causal Generalization / Modeling of Sampling Weights The audience is invited to choose among three possible talks: Bayesian Workflow: The workflow of applied Bayesian statistics includes not just inference but also building, checking, and understanding fitted models. We discuss various live issues including prior distributions, data models, and computation, in the context of ideas such as the Fail Fast Principle and the Folk Theorem of Statistical Computing. We also consider some ex ..read more
Does this study really show that lesbians and bisexual women die sooner than straight women? Disparities in Mortality by Sexual Orientation in a Large, Prospective JAMA Paper
Andrew Gelman
by Andrew
3d ago
This recently-published graph is misleading but also has the unintended benefit of revealing a data problem: Jrc brought it up in a recent blog comment. The figure is from an article published in the Journal of the American Medical Association, which states: This prospective cohort study examined differences in time to mortality across sexual orientation, adjusting for birth cohort. Participants were female nurses born between 1945 and 1964, initially recruited in the US in 1989 for the Nurses’ Health Study II, and followed up through April 2022. . . . Compared with heterosexual participants ..read more
Job Ad: Spatial Statistics Group Lead at Oak Ridge National Laboratory
Andrew Gelman
by Bob Carpenter
4d ago
Robert Stewart, of Oak Ridge National Lab (ORNL), who we met at StanCon, is looking to fill the following role: ORNL Job ad: Group Leader for Spatial Statistics It’s a research group leader position with an emphasis on published research that’s relevant for the Department of Energy (the group that runs the national labs). Robert came and gave a talk here at Flatiron Institute and the scale and range of problems they’re tasked with solving is really interesting (e.g., inferring the height, construction, and use of every human-constructed building in the world). They’ve got access to the compu ..read more
Boris and Natasha in America: How often is the wife taller than the husband?
Andrew Gelman
by Andrew
4d ago
Shane Frederick, who sometimes sends me probability puzzles, sent along this question: Among married couples, what’s your best guess about how often the wife is taller than the husband? • 1 in 10 • 1 in 40 • 1 in 300 • 1 in 5000 I didn’t want to cheat so I tried to think about this one without doing any calculations or looking anything up. My two sources of information were my recollection of the distributions of women’s and men’s heights (this is in Regression and Other Stories) and whatever I could dredge up from my personal experiences with friends and couples I see walking on the street ..read more
“Often enough, scientists are left with the unenviable task of conducting an orchestra with out-of-tune instruments”
Andrew Gelman
by Andrew
5d ago
Gaurav Sood writes: Often enough, scientists are left with the unenviable task of conducting an orchestra with out-of-tune instruments. They are charged with telling a coherent story about noisy results. Scientists defer to the demand partly because there is a widespread belief that a journal article is the appropriate grouping variable at which results should “make sense.” To tell coherent stories with noisy data, scientists resort to a variety of underhanded methods. The first is simply squashing the inconvenient results—never reporting them or leaving them to the appendix or couching the r ..read more
Evaluating MCMC samplers
Andrew Gelman
by Bob Carpenter
6d ago
I’ve been thinking a lot about how to evaluate MCMC samplers. A common way to do this is to run one or more iterations of your contender against a baseline of something simple, something well understood, or more rarely, the current champion (which seems to remain NUTS, though we’re open to suggestions for alternatives). Reporting comparisons of averages (and uncertainty) Then, what do you report? What I usually see is a report of averages over runs, such as average effective sample size per gradient eval. Sometimes I’ll see medians, but I like averages better here as it’s a fairer indication o ..read more
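As a rough illustration of reporting averages over runs together with their uncertainty, here is a minimal sketch. It assumes each run yields one number (for example, effective sample size per gradient evaluation) and that contender and baseline runs are independent; the function names and the numbers are hypothetical and do not come from the post.

```python
from math import sqrt
from statistics import mean, stdev


def mean_and_se(values):
    """Average over runs and the standard error of that average."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two runs to estimate uncertainty")
    return mean(values), stdev(values) / sqrt(n)


def compare(contender, baseline):
    """Difference in average performance and its standard error.

    Assumes the two sets of runs are independent, so the variances of the
    two averages add.
    """
    m_c, se_c = mean_and_se(contender)
    m_b, se_b = mean_and_se(baseline)
    return m_c - m_b, sqrt(se_c ** 2 + se_b ** 2)


# Made-up per-run values of ESS per gradient evaluation.
contender_runs = [0.031, 0.027, 0.035, 0.029]
baseline_runs = [0.024, 0.026, 0.022, 0.025]
diff, se = compare(contender_runs, baseline_runs)
print(f"difference in averages: {diff:.4f} +/- {se:.4f}")
```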
“When are Bayesian model probabilities overconfident?” . . . and we’re still trying to get to meta-Bayes
Andrew Gelman
by Andrew
6d ago
Oscar Oelrich, Shutong Ding, Måns Magnusson, Aki Vehtari, and Mattias Villani write: Bayesian model comparison is often based on the posterior distribution over the set of compared models. This distribution is often observed to concentrate on a single model even when other measures of model fit or forecasting ability indicate no strong preference. Furthermore, a moderate change in the data sample can easily shift the posterior model probabilities to concentrate on another model. We document overconfidence in two high-profile applications in economics and neuroscience. To shed more light on t ..read more
