NLPositionality: Characterizing Design Biases of Datasets and Models
Carnegie Mellon University | Machine Learning Blog
by Jenny Liang
1M ago
TL;DR: Design biases in NLP systems, such as performance differences for different populations, often stem from their creators’ positionality, i.e., views and lived experiences shaped by identity and background. Despite the prevalence and risks of design biases, they are hard to quantify because researcher, system, and dataset positionality are often unobserved. We introduce NLPositionality, a framework for characterizing design biases and quantifying the positionality of NLP datasets and models. We find that datasets and models align predominantly with Western, White, college-educated, and you ..read more
On Noisy Evaluation in Federated Hyperparameter Tuning
Carnegie Mellon University | Machine Learning Blog
by Kevin Kuo
4M ago
Evaluating models in federated networks is challenging due to factors such as client subsampling, data heterogeneity, and privacy. These factors introduce noise that can affect hyperparameter tuning algorithms and lead to suboptimal model selection. Hyperparameter tuning is critical to the success of cross-device federated learning applications. Unfortunately, federated networks face issues of scale, heterogeneity, and privacy, which introduce noise in the tuning process and make it difficult to faithfully evaluate the performance of various hyperparameters. Our work (MLSys’23) explores key so ..read more
Creative Robot Tool Use with Large Language Models
Carnegie Mellon University | Machine Learning Blog
by Peide Huang
4M ago
TLDR: We introduce RoboTool, enabling robots to use tools creatively with large language models, which solves long-horizon hybrid discrete-continuous planning problems with environment- and embodiment-related constraints. Tool use is an essential hallmark of advanced intelligence. Some animals can use tools to achieve goals that are infeasible without tools. For example, crows solve a complex physical puzzle using a series of tools, and apes use a tree branch to crack open nuts or fish termites with a stick. Beyond using tools for their intended purpose and following established procedures ..read more
Peer Reviews of Peer Reviews: A Randomized Controlled Trial and Other Experiments
Carnegie Mellon University | Machine Learning Blog
by Nihar Shah
5M ago
Authors: Alexander Goldberg, Ivan Stelmakh, Kyunghyun Cho, Alice Oh, Alekh Agarwal, Danielle Belgrave, and Nihar Shah. Is it possible to reliably evaluate the quality of peer reviews? We study peer reviewing of peer reviews driven by two primary motivations: (i) Incentivizing reviewers to provide high-quality reviews is an important open problem. The ability to reliably assess the quality of reviews can help design such incentive mechanisms. (ii) Many experiments in the peer-review processes of various scientific fields use evaluations of reviews as a “gold standard” for investigating polici ..read more
Supporting Human-AI Collaboration in Auditing LLMs with LLMs
Carnegie Mellon University | Machine Learning Blog
by Charvi Rastogi
7M ago
Illustration depicting the process of a human and a large language model working together to find failure cases in a (not necessarily different) large language model. Overview: In the era of ChatGPT, where people increasingly take assistance from a large language model (LLM) in day-to-day tasks, rigorously auditing these models is of utmost importance. While LLMs are celebrated for their impressive generality, on the flip side, their wide-ranging applicability renders the task of testing their behavior on each possible input practically infeasible. Existing tools for finding test cases that LLM ..read more
Navigating to Objects in the Real World
Carnegie Mellon University | Machine Learning Blog
by Theo Gervet
10M ago
Empirical study: We evaluated three approaches for robots to navigate to objects in six visually diverse homes. TLDR: Semantic navigation is necessary to deploy mobile robots in uncontrolled environments like our homes, schools, and hospitals. Many learning-based approaches have been proposed in response to the lack of semantic understanding of the classical pipeline for spatial navigation. But learned visual navigation policies have predominantly been evaluated in simulation. How well do different classes of methods work on a robot? We present a large-scale empirical study of semantic visual ..read more
Validating Large Language Models with ReLM
Carnegie Mellon University | Machine Learning Blog
by Michael Kuchnik
11M ago
ReLM enables writing tests that are guaranteed to come from the set of valid strings, such as dates. Without ReLM, LLMs are free to complete prompts with non-date answers, which are difficult to assess. TL;DR: While large language models (LLMs) have been touted for their ability to generate natural-sounding text, there are concerns around potential negative effects of LLMs such as data memorization, bias, and inappropriate language. We introduce ReLM (MLSys ’23), a system for validating and querying LLMs using standard regular expressions. We demonstrate via validation tasks on memorization, b ..read more
On Privacy and Personalization in Federated Learning: A Retrospective on the US/UK PETs Challenge
Carnegie Mellon University | Machine Learning Blog
by Ken Liu
1y ago
TL;DR: We study the use of differential privacy in personalized, cross-silo federated learning (NeurIPS’22), explain how these insights led us to develop a 1st place solution in the US/UK Privacy-Enhancing Technologies (PETs) Prize Challenge, and share challenges and lessons learned along the way. If you are feeling adventurous, check out the extended version of this post with more technical details! How can we be better prepared for the next pandemic? Patient data collected by groups such as hospitals and health agencies is a critical tool for monitoring and preventing the spread of disease. U ..read more
TIDEE: An Embodied Agent that Tidies Up Novel Rooms using Commonsense Priors
Carnegie Mellon University | Machine Learning Blog
by Gabriel Sarch
1y ago
Example of embodied commonsense reasoning. A robot proactively identifies a remote on the floor and knows it is out of place without instruction. Then, the robot figures out where to place it in the scene and manipulates it there. For robots to operate effectively in the world, they should be more than explicit step-by-step instruction followers. Robots should take actions in situations when there is a clear violation of the normal circumstances and be able to infer relevant context from partial instruction. Consider a situation where a home robot identifies a remote control which has fallen t ..read more
Are Model Explanations Useful in Practice? Rethinking How to Support Human-ML Interactions.
Carnegie Mellon University | Machine Learning Blog
by Valerie Chen
1y ago
Figure 1. This blog post discusses the effectiveness of black-box model explanations in aiding end users to make decisions. We observe that explanations do not in fact help with concrete applications such as fraud detection and paper matching for peer review. Our work further motivates novel directions for developing and evaluating tools to support human-ML interactions. Model explanations have been touted as crucial information to facilitate human-ML interactions in many real-world applications where end users make decisions informed by ML predictions. For example, explanations are thought to ..read more
