I set up a new data analysis blog
Realizations in Biostatistics
3y ago
Well, I tried to write a blog post using the RStudio R Markdown system and utterly failed. So I set up a system where I could write from RStudio: a GitHub Pages blog at randomjohn.github.io. There I can easily write and publish posts involving data analysis ..read more
Windows 10 anniversary update includes a whole Linux layer - this is good news for data scientists
3y ago
If you are on Windows 10, no doubt you have heard that Microsoft included the bash shell in its 2016 Windows 10 anniversary update. What you may not know is that this is much, much more than just the bash shell. This is a whole Linux layer that enables you to use Linux tools, and does away with a further layer like Cygwin (which requires a special dll). However, you will only get the bash shell out of the box. To enable the whole Linux layer, follow instructions here. Basically, this involves enabling developer mode then enabling the Linux layer feature. In the process, you will download some ..read more
Which countries have Regrexit?
3y ago
This doesn't have a lot to do with the bio part of biostatistics, but it is an interesting data analysis that I just started. In the wake of the Brexit vote, there is a petition for a redo. The data for the petition is here, in JSON format. Fortunately, in R, working with JSON data is pretty easy. You can easily download the data from the link and put it into a data frame. I start on that here, with the RJSONIO package, ggplot2, and a version of the petition I downloaded on 6/26/16. One question I had was whether all the signers are British. Fortunately, the petition collects the place of residence ..read more
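As a rough illustration of the idea (the post itself uses R with RJSONIO and ggplot2, not Python), here is a minimal sketch that tabulates signatures by country from petition JSON. The nested key names (`data`, `attributes`, `signatures_by_country`, `signature_count`) and the sample numbers are assumptions made up for the example, not taken from the post or the downloaded file.

```python
import json

# Hypothetical sample in an ASSUMED layout; the counts are invented for illustration.
SAMPLE = """
{"data": {"attributes": {"signatures_by_country": [
  {"name": "France", "code": "FR", "signature_count": 60},
  {"name": "United Kingdom", "code": "GB", "signature_count": 900},
  {"name": "Spain", "code": "ES", "signature_count": 40}
]}}}
"""


def country_shares(raw_json):
    """Return (country, count, share of all signatures), largest count first."""
    rows = json.loads(raw_json)["data"]["attributes"]["signatures_by_country"]
    total = sum(row["signature_count"] for row in rows)
    ranked = sorted(rows, key=lambda row: row["signature_count"], reverse=True)
    return [
        (row["name"], row["signature_count"], row["signature_count"] / total)
        for row in ranked
    ]


for name, count, share in country_shares(SAMPLE):
    print(f"{name}: {count} ({share:.1%})")
```

Sorting by count makes it easy to see at a glance how much of the total comes from outside the UK, which is the question the excerpt raises.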
Little Debate: defining baseline
3y ago
In an April 30, 2015 note in Nature (vol 520, p. 612), Jeffrey Leek and Roger Peng point out that p-values get intense scrutiny, while all the decisions that lead up to the p-values get little debate. I wholeheartedly agree, and so I'm creating a Little Debate series to shine some light on these tiny decisions that may not get a lot of press. Yet these tiny decisions can have a big influence on statistical analysis. Because my focus here is mainly biostatistics, most of these ideas will be placed in the setting of clinical trials. Defining baseline seems like an easy thing to do, and con ..read more
Simulating a Weibull conditional on time-to-event is greater than a given time
3y ago
Recently, I had to simulate time-to-event for subjects who have been on a study and are still ongoing at the time of a data cut, and who therefore remain at risk of an event (e.g. progressive disease, cardiac event, death). This requires simulating from a conditional Weibull distribution. To do this, I created the following function:
# simulate conditional Weibull conditional on survival > T ---------------
# reliability function is exp{-((T+t)/b)^a} / exp{-(T/b)^a} = 1-F(t)
# n = number of points to return
# shape = shape parm of weibull
# scale = scale parm of weibull (default 1)
# t is minimum (default ..read more
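The excerpt cuts off before the function body, so what follows is not the author's R code. It is a minimal Python sketch of one standard way to draw such values, by inverting the conditional reliability function: setting U = exp{-((T+t)/b)^a} / exp{-(T/b)^a} with U uniform and solving gives T + t = b * ((T/b)^a - log U)^(1/a). The function name, the argument names, and the choice to return total event times (rather than the extra time beyond T) are assumptions for illustration.

```python
import math
import random


def sample_conditional_weibull(n, shape, scale=1.0, t_min=0.0, rng=None):
    """Draw n event times X ~ Weibull(shape, scale) conditional on X > t_min.

    Uses inverse-transform sampling on the conditional reliability function.
    Returns total times (already-survived time plus additional time).
    """
    if rng is None:
        rng = random.Random()
    # Cumulative hazard already "used up" by surviving to t_min.
    base = (t_min / scale) ** shape
    draws = []
    for _ in range(n):
        u = 1.0 - rng.random()  # uniform on (0, 1], so log(u) is finite
        draws.append(scale * (base - math.log(u)) ** (1.0 / shape))
    return draws


# Example: 5 subjects still event-free at time 3 (shape 1.5, scale 2).
print(sample_conditional_weibull(5, shape=1.5, scale=2.0, t_min=3.0,
                                 rng=random.Random(42)))
```

With `t_min=0` this reduces to ordinary Weibull sampling, and with `shape=1` the additional time beyond `t_min` is exponential with mean `scale`, which gives two quick sanity checks.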
Talk to Upstate Data Science Group on Caret
3y ago
Last night I gave an introduction and demo of the caret R package to the Upstate Data Science group, meeting at Furman University. It was fairly well attended (around 20 people), and well received. It was great to get out of my own comfort zone a bit (since graduate school, I've only really given talks on some topic in biostatistics) and meet statisticians, computer scientists, and other sorts of data scientists from many different fields. This is a relatively new group, and given the interest over the last couple of months or so I think this has been sorely needed in the Upstate Sout ..read more
Even the tiniest error messages can indicate an invalid statistical analysis
3y ago
The other day, I was reading in a data set in R, and the function indicated that there was a warning about a parsing error on one line. I went ahead with the analysis anyway, but that small parsing error kept bothering me. I thought it was just one line of goofed up data, or perhaps a quote in the wrong place. I finally opened up the CSV file in a text editor, and found that the reason for the parsing error was that the data set was duplicated within the CSV file. The parsing error resulted from the reading of the header twice. As a result, anything I did afterward was suspect. Word to the wi ..read more
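A cheap guard against this particular failure is to scan the file for extra copies of the header row before analyzing it. The sketch below is a hypothetical Python check (the post describes reading the data in R), and the helper name `find_repeated_headers` is invented for the example.

```python
import csv
import io


def find_repeated_headers(lines):
    """Return the 1-based line numbers where the header row appears again."""
    reader = csv.reader(lines)
    header = next(reader, None)
    if header is None:  # empty input, nothing to check
        return []
    repeats = []
    for row in reader:
        if row == header:
            repeats.append(reader.line_num)
    return repeats


# A tiny made-up file whose contents were accidentally pasted twice.
sample = io.StringIO("id,value\n1,2.5\n2,3.1\nid,value\n1,2.5\n2,3.1\n")
repeats = find_repeated_headers(sample)
if repeats:
    print("Header repeated at line(s):", repeats)
```

A non-empty result suggests the file contains concatenated or duplicated exports and should be inspected before any analysis is trusted.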
Statisticians ruin the day again, this time with a retraction
3y ago
Authors retract second study about medical uses of honey (Retraction Watch): For the second time, authors of manuscripts have had to retract their papers because of serious data analysis errors. While the details of the actual errors are scant, we do know that the article was published, a company tried to replicate the results but failed, the journal editor employed a third-party statistician who found serious errors in the data analysis, and the errors were serious enough that the paper, to stay accepted, would have had to go through a major revision and further peer r ..read more
The thirty-day trial
3y ago
Steve Pavlina wrote about a self-help technique called the thirty-day trial. To perform the technique, you commit to 30 days of some new habit, such as quitting smoking or writing in a journal. The idea is that it’s psychologically easier to commit to something for 30 days than to make a permanent change, but after the 30 days you have broken the addiction to old habits and have the perspective to decide whether to continue with the new habit, go back, or go in a completely different direction. For activities like journaling or quitting smoking, this technique might work. After all, psychologist Jeremy Dean anno ..read more
Statistics: P values are just the tip of the iceberg : Nature News & Comment
3y ago
Statistics: P values are just the tip of the iceberg : Nature News & Comment: This article is very important. Yes, p-values reported in the literature (or in your own research) need scrutiny, but so does every step in the analysis process, starting with the design of the study ..read more
