Google Developers Experts on Feedspot

[Mar 2024] ML Community — Highlights and Achievements

Google Developers Experts

by Nari Yoon

2d ago

Let’s explore highlights and accomplishments of the vast Google Machine Learning communities over the month. We appreciate all the activities and commitment by the community members. Without further ado, here are the key highlights!

Featured Stories

ML Developer’s Journey

AI/ML GDE Rubens Zimbres (Brazil) shared how he increased the number of readers and followers on his Medium channel. From Mar 2023 to Mar 2024 (1 year), readers have increased by 700% and followers increased by 800%! One of his articles, Augmenting Gemini-1.0-Pro with Kno ...

AI Chef: Turning Food Photos into Recipes with Gemini Vision Pro in Colab

Google Developers Experts

by Esther Irawati Setiawan

4d ago

Have you ever stared at a photo of a delicious dish and wondered what it was or how to make it? With the power of AI and image recognition, that question can now be a thing of the past. In this article, we’ll explore how to leverage Gemini Vision Pro, a large language model from Google AI, alongside Colaboratory (Colab), a free Jupyter Notebook environment in the cloud, to generate recipes based on an image.

What is Gemini Vision Pro?

Gemini Vision Pro is a cutting-edge vision model from Google AI capable of understanding and interpreting visual content. It can analyze images and ext ...

Using Gemini 1.5 Pro to create video trailers

Google Developers Experts

by Dimitre Oliveira

4d ago

Taking advantage of Gemini’s multimodal input to create trailers for any video.

On February 15 this year, Google announced the release of Gemini 1.5. This new version brought many improvements: on top of impressive gains in the language domain, the model can process a huge input context of up to 1 million tokens. Even better, it was trained as a multimodal model, which means it can natively process text, images, audio, or video. This combination of different input types and huge context got me excited about the opportunity to process long videos, so I rev ...

[ML Story] Multi-modal LLMs made easy: photo & video reasoning with Gemini 1.5 Pro

Google Developers Experts

by Gabriel Moreira

1w ago

Screenshot of Google AI Studio with the Gemini 1.5 Pro model selected, and my multi-modal prompt with my L.A. trip photos

How fast Generative AI technologies have evolved! People are impressed by multi-modal LLMs that can understand and generate text, images, videos, and audio using a single end-to-end model. In this post, I demonstrate how to use a great recent multimodal LLM, Gemini 1.5 Pro, for the use case of generating a blog post solely from photos and videos taken on a trip. In the end, I also talk briefly about some popular multi-modal LLM architectures and public ...

Leveraging Gemini 1.5 Multimodal model (Generative AI) for Software development

Google Developers Experts

by Monika Kumar Jethani

1w ago

Image Source: https://dataedo.com/asset/img/banners/blog/cartoons.png

Google recently launched the Gemini 1.5 Pro model, a mid-sized multimodal model optimised for scaling across a wide range of tasks. In this blog, we will learn how the Gemini 1.5 Pro model can help us during software development. This blog is an improved and more recent version of my previous blog, How Generative AI improves the productivity of Software developers. All examples in this blog use the freeform prompt in Google AI Studio and the Gemini 1.5 Pro model. Below are some of the ways in which Gemini 1.5 Pro can help sof ...

Fine Tuning Gemma-2b to Solve Math Problems

Google Developers Experts

by Rubens Zimbres

1w ago

Mathematical word problem-solving has long been recognized as a complex task for small language models (SLMs). To reach a good level of performance with these models, researchers often train SLMs to generate Python code or use ensembling techniques combined with consensus or majority voting. The challenge here is to use Google’s Gemma model, with less than 2 billion parameters and with safeguards against generating code, to solve these Grade School Math problems. Here I will use Microsoft’s Orca-Math dataset, a high-quality synthetic dataset of 200K math problems obtained through a multi ...

[ML Story] Unlock your ideas with Gemini: Hands-on guide

Google Developers Experts

by Nitin Tiwari

1w ago

In March 2024, Google’s Gemini models marked their first quarter in action, and since their release they have been the new talking point.

Image Source: Google Images

While Generative AI isn’t something new, let’s quickly have a look at how Google’s language models have gradually evolved over the years.

Language models over the years

Word2Vec laid the foundation for the spotlight that today’s language models, including Gemini and other state-of-the-art (SOTA) models, currently enjoy. In this blog, I will guide you through the Gemini models and illustrate their practical appl ...

[ML Story] Asking questions to images with Gemini Pro Vision

Google Developers Experts

by Mikaeri Ohana

1w ago

Just a few months after exploring PaLM APIs (which are now legacy!), it’s time to give Gemini Pro Vision a shoutout!

HSBC Rain Vortex, Changi Airport, Singapore

Introduction

Recently, the Artificial Intelligence field has witnessed an accelerated pace of innovation, with new models being unveiled rapidly. Multimodality, in particular, has become a focal point of discussion this year, illustrating how the integration of diverse data types and input modalities is revolutionizing technology. These multimodal models, which synergize visual, textual, and sometimes auditory data, are leading this tr ...

Cybersecurity in AI: Transfer Learning as an Attack Vector

Google Developers Experts

by Rubens Zimbres

1w ago

In the ever-evolving landscape of cybersecurity, the emergence of Artificial Intelligence (AI) presents both opportunities and challenges. One such challenge arises in the form of Transfer Learning as an Attack Vector. Transfer learning attacks represent a sophisticated approach wherein adversaries leverage pre-existing models to craft and deploy malicious interventions, circumventing traditional defense mechanisms. This text delves into the intricacies of such attacks, exploring their phases, methodologies, and potential impacts. By dissecting real-world scenarios and offering comprehensive ...

Creating a Corporate Assistant with Gemini Pro

Google Developers Experts

by Nathaly Alarcón

2w ago

In today’s globalized world, effective communication in English is crucial for professional success. However, language barriers can hinder collaboration and the flow of information. On this occasion, we will explore the advantages of generative artificial intelligence as a support tool in our day-to-day work with simple tasks at the corporate communication level.

To build this assistant we will use:

Gemini Pro: The most powerful and multimodal model that Google has released.
Google AI Studio: Web tool for quickly prototyping Generative AI applications.
Streamlit: Allows us to create web app ...
