Site re-Org
Win-Vector Blog
by John Mount
4y ago
Just a heads-up, Nina and I are working on re-structuring and updating the website. In particular we are finally moving to https. Please don’t be alarmed if things are in flux, and some links break. We are managing both winvector.com and win-vector.com, so the new no-dash URL is not a forgery. win-vector.com is the preferred format of the URLs, but it will take us a few days to safely migrate back to that structure ..read more
Linear and Logistic Regression in Practical Data Science with R 2nd Edition
Win-Vector Blog
by Nina Zumel
4y ago
One of the chapters that we are especially proud of in Practical Data Science with R is Chapter 7, “Linear and Logistic Regression.” We worked really hard to explain the fundamental principles behind both methods in a clear and easy-to-understand form, and to document diagnostics returned by the R implementations of lm and glm. For the second edition, we added a new section on regularization of linear models, and how to fit regularized linear models with glmnet. So if you are looking for a good introduction to the principles and practice of linear models in R, we hope you check out Practical D ..read more
Don’t Feel Guilty About Selecting Variables
Win-Vector Blog
by John Mount
4y ago
We have an exciting new article to share: Don’t Feel Guilty About Selecting Variables. If you are at all interested in the probabilistic justification of important data science techniques, such as variable selection or pruning, this should be an informative and fun read. “Data Science” is often criticized with the common slur “if it has science in the name it isn’t a science.” Data science is in fact a science for the following reason: it has empirical content. That is, there are methods that are used because we can confirm they work. However, data science when done well also has a mathematica ..read more
Data engineering and data shaping in Practical Data Science with R 2nd Edition
Win-Vector Blog
by John Mount
4y ago
A kind reader recently shared the following comment on the Practical Data Science with R 2nd Edition live-site: “Thanks for the chapter on data frames and data.tables. It has helped me overcome an obstacle freeing me from a lot of warnings telling me my data table was not a real . It reduced the calculation time for a scenario in modelStudio from 30 minutes to 7 minutes. Following the advice in your book is helping me a lot with understanding R and the models you can create with R: Thanks” This is exactly what we were hoping for when we added Chapter 5 Data engineering and data shaping to the ..read more
General Data Science Means Cross-Language Tools, Training, and Documentation
Win-Vector Blog
by John Mount
4y ago
Data science is often a case of bringing the tools to the problems and data, instead of insisting on bringing the problems and data to the tools. To support cross-language data science we have been working on cross-language tools, documentation, and training. For example: the vtreat data preparation package for supervised machine learning is available both for vtreat R users and for vtreat Python users. Video lectures: advanced data preparation for R users video, and advanced data preparation for Python users video. We have task-oriented cross-linked documentation: Regression: R regression example, f ..read more
Deal of the Day May 10: Half off Practical Data Science with R, Second Edition
Win-Vector Blog
by John Mount
4y ago
Deal of the Day May 10: Half off Practical Data Science with R, Second Edition. Use code dotd051020au at https://bit.ly/2xLRPCk ..read more
Thank you “Why R?” for Being Awesome Hosts
Win-Vector Blog
by John Mount
4y ago
Thank you very much, Why R?, for being awesome hosts. We are really pleased with how your virtual MeetUp went. For those who missed it, here is a link ..read more
Nina and John Speaking at Why R? Webinar Thursday, May 7, 2020
Win-Vector Blog
by John Mount
4y ago
Nina Zumel and John Mount will be speaking on advanced data preparation for supervised machine learning at the Why R? Webinar Thursday, May 7, 2020. This is at 8pm in the GMT+2 timezone, which for us is 11AM Pacific Time. Hope to see you there ..read more
Some Applications of The Spicy Soup Test
Win-Vector Blog
by John Mount
4y ago
Here are a few isolation-inspired “applications” (in the theoretical or mathematical sense of the term) of the spicy soup combinatorial design. Now by “application” we mean: another abstract or mathematical problem that is solved by our tools. This is how the word “application” is used in mathematics and theoretical computer science, and it is a bit at odds with how the word “application” is commonly used. In particular: it implies no claim of practicality at real-world scales. Screening for strong pesticides/poisons: Strong pesticides or poisons likely have the “union property” (a mixture has the pr ..read more
