How to Evaluate Jailbreak Methods: A Case Study with the StrongREJECT Benchmark
The Berkeley Artificial Intelligence Research Blog
by
2w ago
When we began studying jailbreak evaluations, we found a fascinating paper claiming that you could jailbreak frontier LLMs simply by translating forbidden prompts into obscure languages. Excited by this result, we attempted to reproduce it and found something unexpected. The paper in question claimed an impressive 43% success rate in jailbreaking GPT-4 by translating forbidden prompts into Scots Gaelic (Yong et al., 2023). To showcase their method, the authors asked GPT-4 to provide instructions for building a homemade explosive device using household materials. They translated the prompt into ..read more
Visit website
Are We Ready for Multi-Image Reasoning? Launching VHs: The Visual Haystacks Benchmark!
The Berkeley Artificial Intelligence Research Blog
by
1M ago
Humans excel at processing vast arrays of visual information, a skill that is crucial for achieving artificial general intelligence (AGI). Over the decades, AI researchers have developed Visual Question Answering (VQA) systems to interpret scenes within single images and answer related questions. While recent advancements in foundation models have significantly closed the gap between human and machine visual processing, conventional VQA has been restricted to reason about only single images at a time rather than whole collections of visual data. This limitation poses challenges in more complex ..read more
Visit website
TinyAgent: Function Calling at the Edge
The Berkeley Artificial Intelligence Research Blog
by
3M ago
The ability of LLMs to execute commands through plain language (e.g. English) has enabled agentic systems that can complete a user query by orchestrating the right set of tools (e.g. ToolFormer, Gorilla). This, along with the recent multi-modal efforts such as the GPT-4o or Gemini-1.5 model, has expanded the realm of possibilities with AI agents. While this is quite exciting, the large model size and computational requirements of these models often requires their inference to be performed on the cloud. This can create several challenges for their widespread adoption. First and foremost, uploa ..read more
Visit website
Modeling Extremely Large Images with xT
The Berkeley Artificial Intelligence Research Blog
by
6M ago
As computer vision researchers, we believe that every pixel can tell a story. However, there seems to be a writer’s block settling into the field when it comes to dealing with large images. Large images are no longer rare—the cameras we carry in our pockets and those orbiting our planet snap pictures so big and detailed that they stretch our current best models and hardware to their breaking points when handling them. Generally, we face a quadratic increase in memory usage as a function of image size. Today, we make one of two sub-optimal choices when handling large images: down-sampling or cr ..read more
Visit website
2024 BAIR Graduate Directory
The Berkeley Artificial Intelligence Research Blog
by
6M ago
Every year, the Berkeley Artificial Intelligence Research (BAIR) Lab graduates some of the most talented and innovative minds in artificial intelligence and machine learning. Our Ph.D. graduates have each expanded the frontiers of AI research and are now ready to embark on new adventures in academia, industry, and beyond. These fantastic individuals bring with them a wealth of knowledge, fresh ideas, and a drive to continue contributing to the advancement of AI. Their work at BAIR, ranging from deep learning, robotics, and natural language processing to computer vision, security, and much more ..read more
Visit website
The Shift from Models to Compound AI Systems
The Berkeley Artificial Intelligence Research Blog
by
7M ago
AI caught everyone’s attention in 2023 with Large Language Models (LLMs) that can be instructed to perform general tasks, such as translation or coding, just by prompting. This naturally led to an intense focus on models as the primary ingredient in AI application development, with everyone wondering what capabilities new LLMs will bring. As more developers begin to build using LLMs, however, we believe that this focus is rapidly changing: state-of-the-art AI results are increasingly obtained by compound systems with multiple components, not just monolithic models. For example, Google’s AlphaC ..read more
Visit website
Ghostbuster: Detecting Text Ghostwritten by Large Language Models
The Berkeley Artificial Intelligence Research Blog
by
10M ago
The structure of Ghostbuster, our new state-of-the-art method for detecting AI-generated text. Large language models like ChatGPT write impressively well—so well, in fact, that they’ve become a problem. Students have begun using these models to ghostwrite assignments, leading some schools to ban ChatGPT. In addition, these models are also prone to producing text with factual errors, so wary readers may want to know if generative AI tools have been used to ghostwrite news articles or other sources before trusting them. What can teachers and consumers do? Existing tools to detect AI-generated t ..read more
Visit website
Asymmetric Certified Robustness via Feature-Convex Neural Networks
The Berkeley Artificial Intelligence Research Blog
by
10M ago
Asymmetric Certified Robustness via Feature-Convex Neural Networks TLDR: We propose the asymmetric certified robustness problem, which requires certified robustness for only one class and reflects real-world adversarial scenarios. This focused setting allows us to introduce feature-convex classifiers, which produce closed-form and deterministic certified radii on the order of milliseconds. Figure 1. Illustration of feature-convex classifiers and their certification for sensitive-class inputs. This architecture composes a Lipschitz-continuous feature map $\varphi$ with a learned convex functio ..read more
Visit website
Goal Representations for Instruction Following
The Berkeley Artificial Intelligence Research Blog
by
11M ago
Goal Representations for Instruction Following A longstanding goal of the field of robot learning has been to create generalist agents that can perform tasks for humans. Natural language has the potential to be an easy-to-use interface for humans to specify arbitrary tasks, but it is difficult to train robots to follow language instructions. Approaches like language-conditioned behavioral cloning (LCBC) train policies to directly imitate expert actions conditioned on language, but require humans to annotate all training trajectories and generalize poorly across scenes and behaviors. Meanwhile ..read more
Visit website
Rethinking the Role of PPO in RLHF
The Berkeley Artificial Intelligence Research Blog
by
11M ago
Rethinking the Role of PPO in RLHF TL;DR: In RLHF, there’s tension between the reward learning phase, which uses human preference in the form of comparisons, and the RL fine-tuning phase, which optimizes a single, non-comparative reward. What if we performed RL in a comparative way? Figure 1: This diagram illustrates the difference between reinforcement learning from absolute feedback and relative feedback. By incorporating a new component - pairwise policy gradient, we can unify the reward modeling stage and RL stage, enabling direct updates based on pairwise responses. Large Language Models ..read more
Visit website

Follow The Berkeley Artificial Intelligence Research Blog on FeedSpot

Continue with Google
Continue with Apple
OR