Federated Learning With Differential Privacy for End-to-End Speech Recognition
Apple Machine Learning Research
While federated learning (FL) has recently emerged as a promising approach to training machine learning models, its exploration in the domain of automatic speech recognition (ASR) remains preliminary. Moreover, FL does not inherently guarantee user privacy and requires differential privacy (DP) for robust privacy guarantees. However, we are not aware of prior work applying DP to FL for ASR. In this paper, we aim to bridge this research gap by formulating an ASR benchmark for FL with DP and establishing the first baselines. First, we extend the existing …
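The abstract does not include implementation details, but the core mechanism for combining DP with federated learning is well established: clip each client's model update to a fixed L2 norm and add calibrated Gaussian noise before averaging. A minimal sketch of one such round (the clip norm, noise multiplier, and function names are illustrative assumptions, not the paper's configuration):

```python
import numpy as np

def dp_fedavg_round(client_updates, clip_norm=1.0, noise_multiplier=1.0):
    """One round of DP federated averaging (illustrative sketch).

    client_updates:   list of 1-D numpy arrays, one model-update vector per client.
    clip_norm:        per-client L2 clipping bound C.
    noise_multiplier: sigma/C ratio controlling the privacy/utility trade-off.
    """
    clipped = []
    for u in client_updates:
        norm = np.linalg.norm(u)
        # Scale each update down so its L2 norm is at most clip_norm.
        clipped.append(u * min(1.0, clip_norm / (norm + 1e-12)))

    total = np.sum(clipped, axis=0)
    # Gaussian noise calibrated to the clipping bound (Gaussian mechanism).
    noise = np.random.normal(0.0, noise_multiplier * clip_norm, size=total.shape)
    return (total + noise) / len(client_updates)

# Example: three clients, toy 4-dimensional "model updates".
updates = [np.random.randn(4) for _ in range(3)]
new_global_delta = dp_fedavg_round(updates, clip_norm=0.5, noise_multiplier=1.1)
```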
Samplable Anonymous Aggregation for Private Federated Data Analytics
Apple Machine Learning Research
We revisit the problem of designing scalable protocols for private statistics and private federated learning when each device holds its own private data. Locally differentially private algorithms require little trust but are (provably) limited in their utility. Centrally differentially private algorithms can allow significantly better utility but require a trusted curator. This gap has led to significant interest in the design and implementation of simple cryptographic primitives that can allow central-like utility guarantees without having to trust a central server. Our first contribution is to …
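As a rough illustration of how a cryptographic aggregation primitive can give central-like utility without a trusted curator, the sketch below shows additive secret sharing over a finite field: each device splits its value into two uniformly random shares, one per server, and only the sum over all devices is ever reconstructed. The field modulus and function names are assumptions for illustration; the paper's actual protocol adds verification, sampling, and anonymity properties not shown here.

```python
import secrets

PRIME = 2**61 - 1  # field modulus (illustrative choice)

def share(value):
    """Split an integer into two additive shares modulo PRIME."""
    s1 = secrets.randbelow(PRIME)
    s2 = (value - s1) % PRIME
    return s1, s2

# Each device secret-shares its private value between two servers.
values = [3, 7, 12]
server_a, server_b = [], []
for v in values:
    a, b = share(v)
    server_a.append(a)
    server_b.append(b)

# Each server only sees uniformly random shares; their local sums
# combine to the true total without revealing any individual value.
total = (sum(server_a) + sum(server_b)) % PRIME
assert total == sum(values)
```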
Ferret-v2: An Improved Baseline for Referring and Grounding
Apple Machine Learning Research
While Ferret seamlessly integrates regional understanding into the Large Language Model (LLM) to facilitate its referring and grounding capability, it has certain limitations: it is constrained by the pre-trained fixed visual encoder and fails to perform well on broader tasks. In this work, we unveil Ferret-v2, a significant upgrade to Ferret, with three key designs. (1) Any-resolution grounding and referring: a flexible approach that effortlessly handles higher image resolution, improving the model's ability to process and understand images in greater detail. (2) Multi-granularity visual …
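The abstract does not spell out how any-resolution processing is implemented. A common pattern in recent multimodal LLMs, and an assumption here rather than Ferret-v2's exact recipe, is to split a high-resolution image into fixed-size tiles that are encoded alongside a downsampled global view:

```python
from PIL import Image

def any_resolution_views(image, tile=336):
    """Split a high-resolution image into fixed-size tiles plus a global view.

    Returns a low-resolution global image and a list of tile crops; each crop
    can then be encoded separately by a fixed-resolution visual encoder.
    (Tile size 336 is an illustrative choice, not the paper's setting.)
    """
    w, h = image.size
    global_view = image.resize((tile, tile))
    tiles = []
    for top in range(0, h, tile):
        for left in range(0, w, tile):
            crop = image.crop((left, top, min(left + tile, w), min(top + tile, h)))
            tiles.append(crop.resize((tile, tile)))
    return global_view, tiles

# Usage: a 1000x800 photo yields one global view and a 3x3 grid of tiles.
# global_img, crops = any_resolution_views(Image.open("photo.jpg"))
```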
PINE: Efficient Norm-Bound Verification for Secret-Shared Vectors
Apple Machine Learning Research
Secure aggregation of high-dimensional vectors is a fundamental primitive in federated statistics and learning. A two-server system such as Prio allows for scalable aggregation of secret-shared vectors. Adversarial clients might try to manipulate the aggregate, so it is important to ensure that each (secret-shared) contribution is well-formed. In this work, we focus on the important and well-studied goal of ensuring that each contribution vector has bounded Euclidean norm. Existing protocols for ensuring bounded-norm contributions either incur a large communication overhead, or only allow for …
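To make the setting concrete, the sketch below shows the client side of such a system: a contribution vector is clipped to a Euclidean norm bound and then additively secret-shared between two servers. Enforcing the bound here relies on an honest client; the point of the paper's protocol is to let the servers verify the bound on the shares themselves, which this sketch does not attempt. The field modulus, fixed-point scaling, and names are illustrative assumptions.

```python
import secrets
import numpy as np

PRIME = 2**61 - 1   # field modulus (illustrative)
SCALE = 10**6       # fixed-point scaling for real-valued entries

def clip_and_share(vec, norm_bound=1.0):
    """Clip a real vector to an L2 bound, then additively secret-share it."""
    norm = np.linalg.norm(vec)
    clipped = vec * min(1.0, norm_bound / (norm + 1e-12))
    # Fixed-point encode, then split every coordinate into two random shares.
    encoded = [int(round(x * SCALE)) % PRIME for x in clipped]
    share_a = [secrets.randbelow(PRIME) for _ in encoded]
    share_b = [(x - a) % PRIME for x, a in zip(encoded, share_a)]
    return share_a, share_b

# An honest client clips before sharing; PINE's goal is to let the servers
# check the norm bound without reconstructing the vector.
a, b = clip_and_share(np.random.randn(8), norm_bound=1.0)
```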
Towards Automated Accessibility Report Generation for Mobile Apps
Apple Machine Learning Research
Many apps have basic accessibility issues, like missing labels or low contrast. Automated tools can help app developers catch basic issues, but they can be laborious to run or require writing dedicated tests. In this work, we developed a system to generate accessibility reports from mobile apps through a collaborative process with accessibility stakeholders at Apple. Our method combines varied data collection methods (e.g., app crawling, manual recording) with an existing accessibility scanner. Many such scanners are based on single-screen scanning, and a key problem in whole-app accessibility …
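One of the basic checks such a scanner performs can be stated precisely: WCAG defines the contrast ratio between two colors as (L1 + 0.05) / (L2 + 0.05), where L1 and L2 are the larger and smaller relative luminances. A minimal sketch of that single check (the threshold handling and function names are illustrative; the paper's system combines many checks with app crawling):

```python
def relative_luminance(rgb):
    """WCAG relative luminance for an sRGB color given as 0-255 ints."""
    def channel(c):
        c = c / 255.0
        return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4
    r, g, b = (channel(c) for c in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

def contrast_ratio(fg, bg):
    """WCAG contrast ratio between foreground and background colors."""
    l1, l2 = sorted((relative_luminance(fg), relative_luminance(bg)), reverse=True)
    return (l1 + 0.05) / (l2 + 0.05)

# Example: mid-gray text on white fails the 4.5:1 threshold for normal text.
ratio = contrast_ratio((150, 150, 150), (255, 255, 255))
low_contrast = ratio < 4.5
```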
International Conference on Machine Learning (ICML) 2024
Apple Machine Learning Research
Projected Language Models: A Large Model Pre-Segmented Into Smaller Ones
Apple Machine Learning Research
This paper has been accepted at the Foundation Models in the Wild workshop at ICML 2024. Large language models are versatile tools but are not suitable for small inference budgets. Small models have more efficient inference, but their lower capacity means that their performance can be good only if one limits their scope to a specialized domain. This paper explores how to get a small language model with good specialized accuracy, even when specialization data is unknown during pretraining. We propose a novel architecture, projected networks (PN). PN is a high-capacity network whose parameters …
On a Neural Implementation of Brenier's Polar Factorization
Apple Machine Learning Research
In 1991, Brenier proved a theorem that generalizes the polar decomposition for square matrices -- factored as PSD × unitary -- to any vector field $F:\mathbb{R}^d\rightarrow\mathbb{R}^d$. The theorem, known as the polar factorization theorem, states that any field $F$ can be recovered as the composition of the gradient of a convex function $u$ with a measure-preserving map $M$, namely $F=\nabla u \circ M$. We propose a practical implementation of this far-reaching theoretical result, and explore possible uses within machine learning. The theorem is closely related …
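For reference, the standard statement of the theorem on a bounded domain with the Lebesgue measure (background material, not quoted from the paper):

```latex
% Brenier's polar factorization theorem (1991), standard formulation.
% Let \Omega \subset \mathbb{R}^d be bounded with normalized Lebesgue
% measure \mu, and let F \in L^2(\Omega;\mathbb{R}^d) be non-degenerate.
% Then there exist a convex function u and a \mu-preserving map M with
\[
  F = \nabla u \circ M ,
  \qquad M_{\#}\mu = \mu ,
\]
% and \nabla u is the optimal-transport (Brenier) map pushing \mu
% forward to the law of F, i.e. (\nabla u)_{\#}\mu = F_{\#}\mu .
```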
Whispering Experts: Toxicity Mitigation in Pre-trained Language Models by Dampening Expert Neurons
Apple Machine Learning Research
An important issue with Large Language Models (LLMs) is their undesired ability to generate toxic language. In this work, we show that the neurons responsible for toxicity can be determined by their power to discriminate toxic sentences, and that toxic language can be mitigated by reducing their activation levels proportionally to this power. We propose AUROC adaptation (AURA), an intervention that can be applied to any pre-trained LLM to mitigate toxicity. As the intervention is proportional to the ability of each neuron to discriminate toxic content, it is free of any model-dependent …
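The abstract describes the mechanism concretely enough to sketch: score each neuron by the AUROC of its activations as a detector of toxic vs. non-toxic sentences, then scale down the activations of the most discriminative neurons. The scaling rule below (dampening in proportion to how far the AUROC exceeds chance) is an illustrative choice, not necessarily the exact AURA formula:

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def neuron_toxicity_scores(activations, is_toxic):
    """AUROC of each neuron's activation as a detector of toxic sentences.

    activations: array of shape (num_sentences, num_neurons), e.g. max-pooled
                 activations of one MLP layer over each sentence.
    is_toxic:    binary labels of shape (num_sentences,).
    """
    return np.array([roc_auc_score(is_toxic, activations[:, j])
                     for j in range(activations.shape[1])])

def dampening_factors(auroc):
    """Per-neuron multipliers: 1.0 at chance level, shrinking toward 0 as the
    neuron becomes a better toxicity discriminator (illustrative rule)."""
    excess = np.clip(auroc - 0.5, 0.0, 0.5) / 0.5   # 0 at chance, 1 at AUROC=1
    return 1.0 - excess

# At inference time, multiply that layer's activations elementwise by the
# factors, leaving non-discriminative neurons untouched.
acts = np.random.rand(200, 64)
labels = np.random.randint(0, 2, size=200)
factors = dampening_factors(neuron_toxicity_scores(acts, labels))
damped = acts * factors
```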
Revealing the Utilized Rank of Subspaces of Learning in Neural Networks
Apple Machine Learning Research
In this work, we study how well the learned weights of a neural network utilize the space available to them. This notion is related to capacity, but additionally incorporates the interaction of the network architecture with the dataset. Most learned weights appear to be full rank, and are therefore not amenable to low-rank decomposition. This deceptively implies that the weights are utilizing the entire space available to them. We propose a simple data-driven transformation that projects the weights onto the subspace where the data and the weight interact. This preserves the functional mapping …
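A rough way to read the proposed transformation (the exact construction is an assumption here, not taken from the truncated abstract): collect the inputs that actually reach a weight matrix, find the subspace they span via an SVD, and project the weights onto that subspace before measuring rank:

```python
import numpy as np

def utilized_rank(W, X, energy=0.99):
    """Rank of a weight matrix after projecting onto the data subspace.

    W:      weight matrix of shape (out_dim, in_dim).
    X:      matrix of layer inputs, shape (num_samples, in_dim).
    energy: fraction of input variance defining the data subspace
            (illustrative threshold).
    """
    # Principal directions of the inputs that feed this layer.
    _, s, vt = np.linalg.svd(X - X.mean(axis=0), full_matrices=False)
    k = int(np.searchsorted(np.cumsum(s**2) / np.sum(s**2), energy)) + 1
    P = vt[:k].T                      # (in_dim, k) basis of the data subspace
    W_proj = W @ P @ P.T              # restrict W to directions the data occupies
    return np.linalg.matrix_rank(W), np.linalg.matrix_rank(W_proj)

# A random 64x64 weight matrix looks full rank, but if its inputs live on a
# low-dimensional manifold, only part of that rank is actually exercised.
W = np.random.randn(64, 64)
X = np.random.randn(500, 8) @ np.random.randn(8, 64)   # inputs span ~8 dims
print(utilized_rank(W, X))
```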
