GPT-4 + Stable-Diffusion = ?: Enhancing Prompt Understanding of Text-to-Image Diffusion Models with Large Language Models
BAIR
4d ago
TL;DR: Text Prompt -> LLM -> Intermediate Representation (such as an image layout) -> Stable Diffusion -> Image. Recent advancements in text-to-image generation with diffusion models have yielded remarkable results in synthesizing highly realistic and diverse images. However, despite their impressive capabilities, diffusion models such as Stable Diffusion often struggle to accurately follow prompts that require spatial or common-sense reasoning. The following figure lists four scenarios in which Stable Diffusion falls short in generating images that accurately correspond to …
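Below is a minimal sketch of the two-stage pipeline in the TL;DR, assuming a generic LLM call and a layout-conditioned image generator; the function names and the JSON layout schema are hypothetical, and the actual method uses its own prompting format and layout-to-image module.

import json

def query_llm(prompt: str) -> str:
    """Placeholder for an LLM (e.g., GPT-4) prompted to return a scene layout
    as JSON: a list of objects with bounding boxes in [0, 1] coordinates."""
    return json.dumps([
        {"object": "a gray cat", "box": [0.05, 0.55, 0.45, 0.95]},
        {"object": "a red ball", "box": [0.60, 0.70, 0.85, 0.95]},
    ])

def generate_image(prompt: str, layout: list) -> None:
    """Placeholder for a layout-conditioned diffusion model (e.g., a Stable
    Diffusion variant guided per bounding box) that renders the final image."""
    for obj in layout:
        print(f"placing {obj['object']!r} at {obj['box']}")

prompt = "a gray cat sitting to the left of a red ball"
layout = json.loads(query_llm(prompt))  # Text Prompt -> LLM -> layout
generate_image(prompt, layout)          # layout -> diffusion model -> image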
Interactive Fleet Learning
BAIR
1M ago
Figure 1: “Interactive Fleet Learning” (IFL) refers to robot fleets in industry and academia that fall back on human teleoperators when necessary and continually learn from them over time. In the last few years we have seen an exciting development in robotics and artificial intelligence: large fleets of robots have left the lab and entered the real world. Waymo, for example, has over 700 self-driving cars operating in Phoenix and San Francisco and is currently expanding to Los Angeles. Other industrial deployments of robot fleets include applications like e-commerce order fulfillment at Amazon …
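A toy sketch of the IFL loop described in the caption, assuming a confidence-based fallback rule and a fixed pool of human supervisors; the threshold, allocation rule, and function names are illustrative, not the formulation used in the post.

import random

NUM_ROBOTS, NUM_HUMANS, THRESHOLD = 5, 2, 0.7

def policy(obs):
    """Current robot policy: returns an action and a confidence score."""
    return "autonomous_action", random.random()

def human_teleop(obs):
    """Placeholder for a human supervisor taking over and acting."""
    return "corrective_action"

demonstrations = []
for _ in range(100):
    observations = [f"obs_{i}" for i in range(NUM_ROBOTS)]
    requests = []
    for i, obs in enumerate(observations):
        action, confidence = policy(obs)
        if confidence < THRESHOLD:      # robot is unsure: ask for help
            requests.append((confidence, i, obs))
    # Allocate the scarce human supervisors to the least-confident robots.
    for _, i, obs in sorted(requests)[:NUM_HUMANS]:
        demonstrations.append((obs, human_teleop(obs)))
# A real system would periodically retrain the policy on `demonstrations`.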
Fully Autonomous Real-World Reinforcement Learning with Applications to Mobile Manipulation
BAIR
4M ago
Reinforcement learning provides a conceptual framework for autonomous agents to learn from experience, analogously to how one might train a pet with treats. But practical applications of reinforcement learning are often far from natural: instead of using RL to learn through trial and error by actually attempting the desired task, typical RL applications use a separate (usually simulated) training phase. For example, AlphaGo did not learn to play Go by competing against thousands of humans, but rather by playing against itself in simulation. While this kind of simulated training is appealing for …
Keeping Learning-Based Control Safe by Regulating Distributional Shift
BAIR
9M ago
To regulate the distribution shift experienced by learning-based controllers, we seek a mechanism for constraining the agent to regions of high data density throughout its trajectory (left). Here, we present an approach which achieves this goal by combining features of density models (middle) and Lyapunov functions (right). In order to make use of machine learning and reinforcement learning in controlling real-world systems, we must design algorithms which not only achieve good performance, but also interact with the system in a safe and reliable manner. Most prior work on safety-critical control …
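A simplified sketch of the idea of keeping the agent in high-density regions: among candidate actions, prefer ones whose predicted next state does not decrease a learned log-density estimate (a Lyapunov-style non-decrease condition). The density model, dynamics model, and candidate set below are toy placeholders, not the construction from the post.

import numpy as np

def log_density(state):
    """Stand-in for a learned data-density model; here density peaks at 0."""
    return -np.sum(state ** 2)

def predict_next(state, action):
    """Stand-in for a learned dynamics model."""
    return state + 0.1 * action

def safe_action(state, candidate_actions):
    current = log_density(state)
    # Keep actions that do not decrease the log-density of the next state.
    admissible = [a for a in candidate_actions
                  if log_density(predict_next(state, a)) >= current]
    pool = admissible if admissible else candidate_actions
    # Among the remaining actions, pick the one reaching the densest state.
    return max(pool, key=lambda a: log_density(predict_next(state, a)))

state = np.array([0.8, -0.3])
candidates = [np.array(a) for a in ([1.0, 0.0], [-1.0, 0.0], [0.0, 1.0])]
print(safe_action(state, candidates))  # steers back toward the data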
Reverse engineering the NTK: towards first-principles architecture design
BAIR
9M ago
Foundational works showed how to find the kernel corresponding to a wide network. We find the inverse mapping, showing how to find the wide network corresponding to a given kernel. Deep neural networks have enabled technological wonders ranging from voice recognition to machine translation to protein engineering, but their design and application are nonetheless notoriously unprincipled. The development of tools and methods to guide this process is one of the grand challenges of deep learning theory. In Reverse Engineering the Neural Tangent Kernel, we propose a paradigm for bringing some principle …
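For background, the neural tangent kernel of a parameterized network $f_\theta$ is the standard object (this is general background, not the post's specific construction):

$$\Theta(x, x') = \nabla_\theta f_\theta(x)^\top \, \nabla_\theta f_\theta(x')$$

In the infinite-width limit this kernel becomes deterministic and stays essentially fixed during training, which is the sense in which a wide network "corresponds to" a kernel; the post's contribution runs in the opposite direction, constructing a wide architecture whose kernel matches a prescribed one.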
Why do Policy Gradient Methods work so well in Cooperative MARL? Evidence from Policy Representation
BAIR
11M ago
In cooperative multi-agent reinforcement learning (MARL), policy gradient (PG) methods are typically believed to be less sample efficient than value decomposition (VD) methods, since PG is on-policy while VD is off-policy. However, some recent empirical studies demonstrate that with proper input representation and hyper-parameter tuning, multi-agent PG can achieve surprisingly strong performance compared to off-policy VD methods. Why do PG methods work so well? In this post, we will present concrete analysis to show that in certain scenarios, e.g., environments with a highly multi-modal …
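As a toy illustration of the multi-modality issue (an illustrative example, not necessarily the game analyzed in the post): in a cooperative game where two agents are rewarded only for choosing different actions, there are two optimal modes, and no additive value decomposition Q_tot(a1, a2) = Q1(a1) + Q2(a2) can represent the payoff exactly, while independent per-agent policies can still express either optimum.

import itertools
import numpy as np

R = np.array([[0.0, 1.0],
              [1.0, 0.0]])   # reward is 1 iff the two agents' actions differ

# Least-squares fit of Q1(a1) + Q2(a2) to R over all four joint actions.
# Columns of A correspond to Q1(0), Q1(1), Q2(0), Q2(1).
A = np.array([[1, 0, 1, 0],
              [1, 0, 0, 1],
              [0, 1, 1, 0],
              [0, 1, 0, 1]], dtype=float)
b = np.array([R[a1, a2] for a1, a2 in itertools.product(range(2), repeat=2)])
q, *_ = np.linalg.lstsq(A, b, rcond=None)
print("best additive fit error:", np.sum((A @ q - b) ** 2))  # > 0: no exact fit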
FIGS: Attaining XGBoost-level performance with the interpretability and speed of CART
The Berkeley Artificial Intelligence Research Blog
1y ago
FIGS (Fast Interpretable Greedy-tree Sums): A method for building interpretable models by simultaneously growing an ensemble of decision trees in competition with one another. Recent machine-learning advances have led to increasingly complex predictive models, often at the cost of interpretability. We often need interpretability, particularly in high-stakes applications such as clinical decision-making; interpretable models help with tasks such as identifying errors, leveraging domain knowledge, and making speedy predictions. In this blog post we’ll cover FIGS, a new method …
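A short usage sketch, assuming the imodels package (which implements FIGS) exposes a scikit-learn-style FIGSClassifier; the class name and the max_rules argument should be checked against the package documentation.

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from imodels import FIGSClassifier  # assumed import path; see imodels docs

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = FIGSClassifier(max_rules=10)   # cap total rules to stay interpretable
model.fit(X_train, y_train)
print("test accuracy:", model.score(X_test, y_test))
print(model)                           # displays the learned sum of small trees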
The Berkeley Crossword Solver
BAIR
1y ago
We recently built the Berkeley Crossword Solver (BCS), the first computer program to beat every human competitor in the world’s top crossword tournament. The BCS combines neural question answering and probabilistic inference to achieve near-perfect performance on most American-style crossword puzzles, like the one shown below. Figure 1: Example American-style crossword puzzle. Crosswords are challenging for humans and computers alike. Many clues are vague or underspecified and can’t be answered until crossing constraints are taken into account. While some clues are similar to factoid question answering …
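A toy illustration of why crossing constraints matter: two slots cross at their first letter, each has independently scored answer candidates, and the joint choice must agree on the crossing letter. The candidates and scores are made up; the real BCS scores candidates with neural question answering and combines them with probabilistic inference over the whole grid.

import itertools

across = [("OBOE", 0.6), ("ARIA", 0.4)]   # top across candidate starts with 'O'
down   = [("ACRE", 0.7), ("OGRE", 0.3)]   # top down candidate starts with 'A'

best = max(
    ((a, d, sa * sd)
     for (a, sa), (d, sd) in itertools.product(across, down)
     if a[0] == d[0]),                    # the crossing letter must agree
    key=lambda t: t[2],
)
print(best)   # ('ARIA', 'ACRE', 0.28): the constraint overrules 'OBOE'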
Rethinking Human-in-the-Loop for Artificial Augmented Intelligence
BAIR
1y ago
Figure 1: In real-world applications, we think there exists a human-machine loop where humans and machines mutually augment each other. We call it Artificial Augmented Intelligence. How do we build and evaluate an AI system for real-world applications? In most AI research, the evaluation of AI methods involves a training-validation-testing process. The experiments usually stop when the models have good testing performance on the reported datasets, because the real-world data distribution is assumed to be modeled by the validation and testing data. However, real-world applications are usually …
Designing Societally Beneficial Reinforcement Learning Systems
BAIR
1y ago
Deep reinforcement learning (DRL) is transitioning from a research field focused on game playing to a technology with real-world applications. Notable examples include DeepMind’s work on controlling a nuclear fusion reactor and on improving YouTube video compression, or Tesla’s attempt to use a method inspired by MuZero for autonomous vehicle behavior planning. But the exciting potential for real-world applications of RL should also come with a healthy dose of caution: for example, RL policies are well known to be vulnerable to exploitation, and methods for safe and robust policy development are an active …