Machined Learnings
Machined Learning by Paul Mineiro, from Microsoft Cloud & Information Services Lab
1y ago
When I was coming up, conferences were a mature technology for disseminating high-quality scientific results faster than journals. Smaller units of work (“one paper, one idea”) and focused peer review were among the key ingredients. Unfortunately the process is now broken. arXiv helps fill the gap, but remains caveat emptor, and high-quality journals are still slow.
The Inevitability of Game Theory
I've played arguably too much poker in my life, but I did learn some transferable skills. At the poker table, you love to see your opponent upset, because p ..read more
1y ago
This slide is from Leon Bottou's amazing ICML 2015 keynote.
You should contemplate this image on a regular basis.
In the above slide Leon Bottou outlines three distinct, valid, and complementary ways of attempting to understand the world. In AI all of these approaches are used, but with different emphasis as trends change. Arguably, there was too much of the left-hand side in the “if it's not convex it's crap” days, but then the deep learning revolution pushed many all the way to the right. In 2014 I asked a famous theoretician (who will remain anonymous) at a Q ..read more
3y ago
Here's an example of a research question that started as a practical concern and ended up having science-fiction levels of potential.
A Practical Question and Answer
Contextual bandits have developed from research prototype to maturing industrial technology. CBs are practical because they incorporate the partial feedback nature of decision making in the real world, while sidestepping difficult issues of credit assignment and planning. In other words: you get a reward for the action you've taken, but you don't know what the reward would have been if you had done something else (partial feedback ..read more
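To make the partial feedback idea in the excerpt above concrete, here is a minimal sketch of a logging loop in which only the chosen action's reward is ever recorded. The environment, the uniform-random logging policy, and the name `simulate_partial_feedback` are all assumptions made for illustration; none of them come from the original post.

```python
import random


def simulate_partial_feedback(n_rounds=5, n_actions=3, seed=0):
    """Return a log of (context, action, probability, reward) tuples.

    Hypothetical toy environment: every action has a reward in each round,
    but only the reward of the action actually taken is written to the log.
    """
    rng = random.Random(seed)
    log = []
    for _ in range(n_rounds):
        context = rng.random()
        # The full reward vector exists in the environment...
        full_rewards = [rng.random() for _ in range(n_actions)]
        # ...but the (assumed) uniform-random logging policy picks one action,
        action = rng.randrange(n_actions)
        probability = 1.0 / n_actions
        # and only that action's reward is observed (partial feedback).
        log.append((context, action, probability, full_rewards[action]))
    return log


log = simulate_partial_feedback()
```

Each record keeps the probability with which the action was chosen, which is what later makes counterfactual estimates from such a log possible.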
4y ago
This blog post is about improved off-policy contextual bandit learning via distributional robustness. I'll provide some theoretical background and also outline the implementation in Vowpal Wabbit. Some of this material is in a NeurIPS expo talk video, and additional material is in the accepted paper.
Motivation
In off-policy learning in contextual bandits, our goal is to produce the best policy possible from historical data, and we have no control over the historical logging policy which generated the data. (Note that production systems that run in closed-loop configurations nonetheless are in pract ..read more
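As background for the off-policy setting described above, the following is a minimal sketch of the standard inverse propensity scoring (IPS) estimate of a policy's value from logged data. It is not the distributionally robust method the post goes on to describe, and it does not use Vowpal Wabbit; the function name `ips_value`, the record layout, and the toy data are assumptions for illustration only.

```python
def ips_value(logged, target_policy):
    """Estimate a target policy's average reward from logged bandit data.

    logged: list of (context, action, logging_prob, reward) tuples, where
        logging_prob is the probability the logging policy gave to `action`.
    target_policy: function mapping a context to a list of action probabilities.
    """
    if not logged:
        raise ValueError("need at least one logged example")
    total = 0.0
    for context, action, logging_prob, reward in logged:
        # Reweight each observed reward by how much more (or less) likely
        # the target policy is to take the logged action.
        weight = target_policy(context)[action] / logging_prob
        total += weight * reward
    return total / len(logged)


# Hypothetical toy log: two actions, logged uniformly at random.
logged = [
    (0.2, 0, 0.5, 1.0),
    (0.7, 1, 0.5, 0.0),
    (0.9, 1, 0.5, 1.0),
    (0.1, 0, 0.5, 0.0),
]


def always_action_1(context):
    return [0.0, 1.0]


estimate = ips_value(logged, always_action_1)
```

On this toy log the two records where action 1 was taken each get weight 2 and the others get weight 0, so the estimate is (0 + 2) / 4 = 0.5.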
4y ago
If you need to solve a convex optimization problem nowadays, you are in great shape. Any problem of the form $$
\begin{alignat}{2}
&\!\inf_z & \qquad & f(z) \\
& \text{subject to} & & h(z) = 0 \\
& & & g(z) \preceq 0
\end{alignat}
$$ where $f$ and $g$ are convex and $h$ is affine can be attacked by several excellent freely available software packages: my current favorite is cvxpy, which is a joy to use. If you have a lot of variables and not a lot of constraints, you can instead solve a dual problem. It ends up looking like $$
\begin{alignat}{2}
&\!\sup_ ..read more
5y ago
ICLR 2019 was reminiscent of the early NeurIPS days (sans skiing): a single track of talks, vibrant poster sessions, and a large mid-day break. The Tuesday morning talks were on climate change, modeling proteins, generating music, and modeling the visual cortex. Except for climate change, these were all hot topics at NeurIPS in the late 1990s. History doesn't repeat, but it does rhyme. My favorite talk was by Pierre-Yves Oudeyer, whose research in curiosity-based learning spans both human subjects and robotics. Pierre's presentation was an entertaining tour de force of cognitive science ..read more