Keyframer: Empowering Animation Design using Large Language Models
Apple Machine Learning Research
by
3d ago
Large language models (LLMs) have the potential to impact a wide range of creative domains, as exemplified in popular text-to-image generators like DALLĀ·E and Midjourney. However, the application of LLMs to motion-based visual design has not yet been explored and presents novels challenges such as how users might effectively describe motion in natural language. Further, many existing generative design tools lack support for iterative refinement of designs beyond prompt engineering. In this paper, we present Keyframer, a design tool that leverages the code generation capabilities of LLMs to ..read more
Visit website
Efficient ConvBN Blocks for Transfer Learning and Beyond
Apple Machine Learning Research
by
3d ago
Convolution-BatchNorm (ConvBN) blocks are integral components in various computer vision tasks and other domains. A ConvBN block can operate in three modes: Train, Eval, and Deploy. While the Train mode is indispensable for training models from scratch, the Eval mode is suitable for transfer learning and beyond, and the Deploy mode is designed for the deployment of models. This paper focuses on the trade-off between stability and efficiency in ConvBN blocks: Deploy mode is efficient but suffers from training instability; Eval mode is widely used in transfer learning but lacks efficiency. To ..read more
Visit website
Resource-constrained Stereo Singing Voice Cancellation
Apple Machine Learning Research
by
1w ago
We study the problem of stereo singing voice cancellation, a subtask of music source separation, whose goal is to estimate an instrumental background from a stereo mix. We explore how to achieve performance similar to large state-of-the-art source separation networks starting from a small, efficient model for real-time speech separation. Such a model is useful when memory and compute are limited and singing voice processing has to run with limited look-ahead. In practice, this is realised by adapting an existing mono model to handle stereo input. Improvements in quality are obtained by tuning ..read more
Visit website
Scalable Pre-training of Large Autoregressive Image Models
Apple Machine Learning Research
by
3w ago
This paper introduces AIM, a collection of vision models pre-trained with an autoregressive objective. These models are inspired by their textual counterparts, i.e., Large Language Models (LLMs), and exhibit similar scaling properties. Specifically, we highlight two key findings: (1) the performance of the visual features scale with both the model capacity and the quantity of data, (2) the value of the objective function correlates with the performance of the model on downstream tasks. We illustrate the practical implication of these findings by pre-training a 7 billion parameter AIM on 2 ..read more
Visit website
Acoustic Model Fusion for End-to-end Speech Recognition
Apple Machine Learning Research
by
3w ago
Recent advances in deep learning and automatic speech recognition (ASR) have enabled the end-to-end (E2E) ASR system and boosted its accuracy to a new level. The E2E systems implicitly model all conventional ASR components, such as the acoustic model (AM) and the language model (LM), in a single network trained on audio-text pairs. Despite this simpler system architecture, fusing a separate LM, trained exclusively on text corpora, into the E2E system has proven to be beneficial. However, the application of LM fusion presents certain drawbacks, such as its inability to address the domain ..read more
Visit website
User-level Differentially Private Stochastic Convex Optimization: Efficient Algorithms with Optimal Rates
Apple Machine Learning Research
by
3w ago
We study differentially private stochastic convex optimization (DP-SCO) under user-level privacy where each user may hold multiple data items. Existing work for user-level DP-SCO either requires super-polynomial runtime or requires number of users that grows polynomially with the dimensionality of the problem. We develop new algorithms for user-level DP-SCO that obtain optimal rates, run in polynomial time, and require a number of users that grow logarithmically in the dimension. Moreover, our algorithms are the first to obtain optimal rates for non-smooth functions in polynomial time. These ..read more
Visit website
Investigating Salient Representations and Label Variance Modeling in Dimensional Speech Emotion Analysis
Apple Machine Learning Research
by
3w ago
Representations from models such as Bidirectional Encoder Representations from Transformers (BERT) and Hidden units BERT (HuBERT) have helped to achieve state-of-the-art performance in dimensional speech emotion recognition. Both HuBERT, and BERT models generate fairly large dimensional representations, and such models were not trained with emotion recognition task in mind. Such large dimensional representations result in speech emotion models with large parameter size, resulting in both memory and computational cost complexities. In this work, we investigate the selection of representations ..read more
Visit website
Large-scale Training of Foundation Models for Wearable Biosignals
Apple Machine Learning Research
by
3w ago
Tracking biosignals is crucial for monitoring wellness and preempting the development of severe medical conditions. Today, wearable devices can conveniently record various biosignals, creating the opportunity to monitor health status without disruption to one's daily routine. Despite the widespread use of wearable devices and existing digital biomarkers, the absence of curated data with annotated medical labels hinders the development of new biomarkers to measure common health conditions. In fact, medical datasets are usually small in comparison to other domains, which is an obstacle for ..read more
Visit website
Co-ML: Collaborative Machine Learning Model Building for Developing Dataset Design Practices
Apple Machine Learning Research
by
3w ago
Machine learning (ML) models are fundamentally shaped by data, and building inclusive ML systems requires significant considerations around how to design representative datasets. Yet, few novice-oriented ML modeling tools are designed to foster hands-on learning of dataset design practices, including how to design for data diversity and inspect for data quality. To this end, we outline a set of four data design practices (DDPs) for designing inclusive ML models and share how we designed a tablet-based application called Co-ML to foster the learning of DDPs through a collbaborative ML model ..read more
Visit website
Hybrid Model Learning for Cardiovascular Biomarkers Inference
Apple Machine Learning Research
by
1M ago
This paper was accepted at the workshop Deep Generative Models for Health at NeurIPS 2023. Cardiovascular diseases (CVDs) are a major global health concern, making the longitudinal monitoring of cardiovascular biomarkers vital for early diagnosis and intervention. A core challenge is the inference of cardiac pulse parameters from pulse waves, especially when acquired from wearable sensors at peripheral body locations. Traditional machine learning (ML) approaches face hurdles in this context due to the scarcity of labeled data, primarily sourced from clinical settings. Simultaneously, physical ..read more
Visit website

Follow Apple Machine Learning Research on FeedSpot

Continue with Google
Continue with Apple
OR