Predictive Hacks
We're a team of Data Scientists, passionate about what we are doing and we love getting our hands dirty. The blog acts as a resource with insights, tutorials, and case studies. Our goal is to contribute to the Data Science community by helping people find valuable tips and answers in the data, effectively and quickly.
Predictive Hacks
1M ago
In this tutorial, we will show you how to use the Transformers library from HuggingFace to build chatbot pipelines. Let’s start by installing the transformers library:
pip install transformers
Once we install the library, we can move on. We will work with the ‘blenderbot-400M-distill’ model from Meta. This is a small open-source model (700 MB) that performs relatively well. Let’s start with the pipeline.
from transformers import pipeline
chatbot = pipeline(task="conversational",
model="./models/facebook/blenderbot-400M-distill")
At this point, I will initiate the conv ..read more
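Under the hood, the conversational pipeline works with Conversation objects that track past user inputs and generated responses across turns. As a rough, model-free illustration of that bookkeeping (the class below is a simplified stand-in, not the real transformers API):

```python
# Simplified stand-in for the bookkeeping a conversational pipeline does:
# it keeps alternating user inputs and model responses so that each new
# turn is generated with the full history as context. (Illustration only;
# the real generation happens inside pipeline(...).)

class SimpleConversation:
    def __init__(self):
        self.past_user_inputs = []
        self.generated_responses = []

    def add_user_input(self, text):
        self.past_user_inputs.append(text)

    def add_bot_response(self, text):
        self.generated_responses.append(text)

    def history(self):
        # Interleave turns in the order they happened.
        turns = []
        for user, bot in zip(self.past_user_inputs, self.generated_responses):
            turns.append(("user", user))
            turns.append(("bot", bot))
        return turns

conv = SimpleConversation()
conv.add_user_input("What is the capital of France?")
conv.add_bot_response("Paris.")
print(conv.history())
# [('user', 'What is the capital of France?'), ('bot', 'Paris.')]
```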
Predictive Hacks
2M ago
Named Entity Recognition (NER) is a natural language processing (NLP) technique used to identify and classify named entities within a text into predefined categories such as the names of persons, organizations, locations, dates, quantities, monetary values, percentages, and more. The primary goal of NER is to extract and categorize specific entities mentioned in unstructured text data to better understand the underlying information and relationships within the text.
NER involves several steps:
Tokenization: Breaking down the text into individual words or tokens.
Part-of-Speech Tagging: Assign ..read more
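The steps above can be illustrated with a deliberately toy example. Real NER models learn entity patterns from data, but a rule-based sketch shows the input-to-(entity, label) shape of the task; the regex rules here are illustrative only:

```python
import re

# Toy illustration of the NER idea: scan the text and tag spans with
# hand-written rules. Real NER systems (spaCy, transformers) learn these
# patterns; this only demonstrates the shape of the output.
RULES = [
    (re.compile(r"\$\d+(?:\.\d+)?"), "MONEY"),
    (re.compile(r"\b\d{4}\b"), "DATE"),
    (re.compile(r"\b[A-Z][a-z]+ (?:Inc|Corp|Ltd)\b"), "ORG"),
]

def toy_ner(text):
    entities = []
    for pattern, label in RULES:
        for match in pattern.finditer(text):
            entities.append((match.group(), label))
    return entities

print(toy_ner("Acme Inc paid $250 in 2021."))
# [('$250', 'MONEY'), ('2021', 'DATE'), ('Acme Inc', 'ORG')]
```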
Predictive Hacks
2M ago
In this tutorial, we will talk about different ways to split loaded documents into smaller chunks using LangChain. This process is tricky: the question of one document may land in one chunk and the answer in another, which is a problem for retrieval models. There is a lot of nuance in how you split the chunks to ensure that you group semantically relevant parts together. The core principle behind all text splitters in LangChain is dividing the text into chunks of a certain size, with some overlap between them.
Chunk size refers to the size ..read more
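The core principle described above, fixed-size chunks with some overlap, can be sketched at the character level. This is a minimal illustration; LangChain's actual splitters also respect separators such as newlines:

```python
def split_text(text, chunk_size, chunk_overlap):
    """Split text into chunks of at most chunk_size characters, with
    chunk_overlap characters shared between consecutive chunks. A
    character-level sketch of what LangChain's text splitters do."""
    if chunk_overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break
    return chunks

print(split_text("abcdefghij", chunk_size=4, chunk_overlap=2))
# ['abcd', 'cdef', 'efgh', 'ghij']
```

Note how each chunk repeats the last two characters of the previous one; that overlap is what keeps a question and its answer from being cleanly severed at a chunk boundary.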
Predictive Hacks
2M ago
In retrieval augmented generation (RAG), an LLM retrieves contextual documents from an external dataset as part of its execution, which enables us to ask questions about the context of the documents. These documents can be plain text files, PDFs, URLs and even videos, like YouTube videos.
In this tutorial, we will show you how to load a YouTube video using LangChain.
Source: Deeplearning.ai
YouTube Video Loading with LangChain
The YouTube loader enables users to extract text from videos, which is useful for asking questions about your favorite videos or lectures. This loader in ..read more
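Once documents are loaded, the retrieval step that RAG performs over them can be caricatured with simple word overlap. Real systems use embeddings and vector stores; this sketch only shows the query-scores-documents shape of retrieval:

```python
def retrieve(query, documents, top_k=1):
    """Naive retrieval for illustration: score each document by the
    number of query words it shares, and return the best matches."""
    query_words = set(query.lower().split())
    scored = []
    for doc in documents:
        overlap = len(query_words & set(doc.lower().split()))
        scored.append((overlap, doc))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:top_k]]

docs = [
    "Paris is the capital of France.",
    "The transcript of the lecture covers gradient descent.",
]
print(retrieve("what does the lecture cover", docs))
# ['The transcript of the lecture covers gradient descent.']
```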
Predictive Hacks
3M ago
The Assistants API enables the creation of AI assistants integrated into your applications. These assistants follow instructions and use models, tools, and knowledge to address user queries. Presently, the Assistants API supports three categories of tools: Code Interpreter, Retrieval, and Function calling. Future plans include additional tools developed by OpenAI and the flexibility to contribute your own tools to the platform.
In this tutorial, we will show you how to create Retrieval Assistants using the Python SDK. Before you start coding, we strongly ..read more
Predictive Hacks
8M ago
In this blog post, we’ll teach you how to create dynamic forms based on user input using Streamlit’s session state function.
Streamlit’s session state lets you share variables throughout a user’s session. We’ll use one such variable to track the app’s state, which changes when the user hits the submit button.
In this article, we’re going to develop an application that generates forms based on a numerical input. To start, we’ll set up the session state variable and initialize it to 0. Then, we’ll create a function that the submit button triggers to update this variable. This process forms the c ..read more
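Streamlit itself can't run outside an app, but the pattern the post describes can be sketched in plain Python: initialize a session-state key once, then let the submit callback update it. Here `st.session_state` is emulated with a dict, and the key name `num_forms` is illustrative:

```python
# Plain-Python emulation of the Streamlit pattern: st.session_state is a
# dict-like store that persists across reruns, and a submit-button
# callback mutates it. The key name "num_forms" is hypothetical.
session_state = {}

def init_state():
    # In Streamlit: st.session_state.setdefault("num_forms", 0)
    session_state.setdefault("num_forms", 0)

def on_submit(n):
    # Wired to the submit button; updates the tracked state.
    session_state["num_forms"] = n

init_state()
on_submit(3)

# On the next rerun, the app would render this many forms.
forms = [f"form_{i}" for i in range(session_state["num_forms"])]
print(forms)
# ['form_0', 'form_1', 'form_2']
```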
Predictive Hacks
9M ago
ChatGPT’s knowledge is limited to its training data, which has a cutoff in 2021. This means we cannot extract information about events that occurred after the cutoff. However, we can integrate Wikipedia with ChatGPT. Let’s go straight to an example: our goal is to extract information about Juancho Hernangomez, the new star of Panathinaikos BC.
The wikipedia Python Package
We will need to install the wikipedia Python package by running:
pip install wikipedia
From the wikipedia package, we will use the WikipediaLoader, which has the following arguments:
query: you ..read more
Predictive Hacks
9M ago
In a previous tutorial, we showed you how to work with LangChain Prompt Templates. Prompt templates allow us to automate tasks and work much more efficiently. The good news is that you can save your templates as JSON objects, allowing you to reuse them later or share them with others. In this tutorial, we will show you how to save and load prompt templates. First, let’s start with a simple prompt template:
from langchain.llms import OpenAI
from langchain.chat_models import ChatOpenAI
from langchain.prompts import PromptTemplate
llm = OpenAI(model_name='text-davinci-003 ..read more
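The save/load idea can be sketched without LangChain installed: a prompt template is just a format string plus its input variable names, which serializes naturally to JSON. LangChain's `PromptTemplate.save` produces a similar file, but the field names and template below are illustrative, not its exact schema:

```python
import json
import os
import tempfile

# A prompt template reduced to its essentials: a format string and the
# names of its inputs. Field names here are illustrative.
template = {
    "template": "Tell me a {adjective} joke about {topic}.",
    "input_variables": ["adjective", "topic"],
}

# Save the template as a JSON object so it can be reused or shared.
path = os.path.join(tempfile.gettempdir(), "prompt.json")
with open(path, "w") as f:
    json.dump(template, f)

# Load it back later and render it with concrete values.
with open(path) as f:
    loaded = json.load(f)

prompt = loaded["template"].format(adjective="funny", topic="pandas")
print(prompt)
# Tell me a funny joke about pandas.
```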
Predictive Hacks
9M ago
Sigma Computing is a cloud analytics platform that uses a familiar spreadsheet interface to give business users instant access to explore and get insights from their cloud data warehouse. In this tutorial, I will share with you my experience with Date Range parameters that I found really challenging.
Sigma has a collection of “Control Elements”, such as “TEXT BOX”, “LIST VALUES”, “SLIDER”, “RANGE SLIDER”, “DATE”, “SWITCH”, “DRILL DOWN” and “TOP N”.
These elements can be used as filters or as SQL parameters for the dashboards. In this tutorial, we will focus on the Date range control type.
As w ..read more
Predictive Hacks
9M ago
In the majority of cases, LLM applications don’t pass user input directly to an LLM. Instead, they use a larger piece of text known as a “prompt template” that combines the user input with additional context related to the specific task. Prompt templates encapsulate all the logic needed to transform user input into a fully formatted prompt. Let’s start with some prompt templates:
Single Input Prompt
from langchain.llms import OpenAI
from langchain.chat_models import ChatOpenAI
from langchain.prompts import PromptTemplate
llm = OpenAI(model_name='text-davinci-003')
chat = ChatOpenAI()
sing ..read more
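Stripped of LangChain, a single-input prompt template amounts to a parameterized string with declared inputs. The class name and example template below are illustrative, not LangChain's API:

```python
class SimplePromptTemplate:
    """Bare-bones version of the prompt-template idea: a format string
    plus declared input variables, rendered into a full prompt on demand."""

    def __init__(self, template, input_variables):
        self.template = template
        self.input_variables = input_variables

    def format(self, **kwargs):
        # Fail loudly if a declared input is missing.
        missing = set(self.input_variables) - set(kwargs)
        if missing:
            raise KeyError(f"missing input variables: {missing}")
        return self.template.format(**kwargs)

single_input = SimplePromptTemplate(
    template="Summarize the following text in one sentence:\n{text}",
    input_variables=["text"],
)
print(single_input.format(text="LangChain wraps LLM calls."))
```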