Win-Vector Blog
Win-Vector Blog is a data science blog which gives tips and tricks for using R programming in data science.
Win-Vector Blog
4y ago
Just a heads-up, Nina and I are working on re-structuring and updating the website. In particular we are finally moving to https.
Please don’t be alarmed if things are in flux, and some links break. We are managing both winvector.com and win-vector.com, so the new no-dash URL is not a forgery.
win-vector.com is the preferred format of the URLs, but it will take us a few days to safely migrate back to that structure ..read more
Win-Vector Blog
4y ago
One of the chapters that we are especially proud of in Practical Data Science with R is Chapter 7, “Linear and Logistic Regression.” We worked really hard to explain the fundamental principles behind both methods in a clear and easy-to-understand form, and to document diagnostics returned by the R implementations of lm and glm.
For the second edition, we added a new section on regularization of linear models, and how to fit regularized linear models with glmnet.
So if you are looking for a good introduction to the principles and practice of linear models in R, we hope you check out Practical D ..read more
Win-Vector Blog
4y ago
We have an exciting new article to share: Don’t Feel Guilty About Selecting Variables.
If you are at all interested in the probabilistic justification of important data science techniques, such as variable selection or pruning, this should be an informative and fun read.
“Data Science” is often criticized with the common slur “if it has science in the name it isn’t a science.” Data science is in fact a science for the following reason: it has empirical content. That is, there are methods that are used because we can confirm they work.
However, data science when done well also has a mathematica ..read more
Win-Vector Blog
4y ago
A kind reader recently shared the following comment on the Practical Data Science with R 2nd Edition live-site.
Thanks for the chapter on data frames and data.tables. It has helped me overcome an obstacle freeing me from a lot of warnings telling me my data table was not a real . It reduced the calculation time for a scenario in modelStudio from 30 minutes to 7 minutes. Following the advice in your book is helping me a lot with understanding R and the models you can create with R: Thanks
This is exactly what we were hoping for when we added Chapter 5 Data engineering and data shaping to the ..read more
Win-Vector Blog
4y ago
Data science is often a case of bringing the tools to the problems and data, instead of insisting on bringing the problems and data to the tools.
To support cross-language data science we have been working on cross-language tools, documentation, and training.
For example:
The vtreat data preparation package for supervised machine learning is available both for R users and for Python users. Video lectures: advanced data preparation for R users, and advanced data preparation for Python users.
We have task-oriented cross-linked documentation:
Regression: R regression example, f ..read more
Win-Vector Blog
4y ago
Deal of the Day May 10: Half off Practical Data Science with R, Second Edition. Use code dotd051020au at https://bit.ly/2xLRPCk ..read more
Win-Vector Blog
4y ago
Thank you very much Why R? for being awesome hosts. We are really pleased with how your virtual MeetUp went. For those who missed it here is a link ..read more
Win-Vector Blog
4y ago
Nina Zumel and John Mount will be speaking on advanced data preparation for supervised machine learning at the Why R? Webinar Thursday, May 7, 2020.
This is at 8pm in the GMT+2 timezone, which for us is 11AM Pacific Time. Hope to see you there ..read more
Win-Vector Blog
4y ago
Here are a few isolation-inspired “applications” (in the theoretical or mathematical sense of the term) of the spicy soup combinatorial design.
By “application” we mean another abstract or mathematical problem that is solved by our tools. This is how the word “application” is used in mathematics and theoretical computer science, and it is a bit at odds with how the word is commonly used. In particular, it implies no claim of practicality at real-world scales.
Screening for strong pesticides/poisons
Strong pesticides or poisons likely have the “union property” (a mixture has the pr ..read more