Towards Data Science
Sharing concepts, ideas and codes on data science. Towards Data Science Inc. is a corporation registered in Canada. It provides a platform for thousands of people to exchange ideas and to expand their understanding of data science.
4h ago
PYTHON PROGRAMMING
Even quite complicated Python comprehensions can be more readable than the corresponding for loops.
Python comprehensions allow for powerful computations in loops, even nested ones. Photo by Önder Örtel on Unsplash
Python comprehensions (list, dictionary, and set comprehensions, as well as generator expressions) are a powerful form of syntactic sugar in Python. You can read about them in the following articles:
A Guide to Python Comprehensions
Building Comprehension Pipelines in Python
Python comprehensions have two great advantages when compared to the co ..read more
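As a small illustration of the claim that a comprehension can read more clearly than the equivalent nested loops, here is a minimal sketch. The `matrix` data and all variable names are made up for this example and do not come from the article.

```python
# Hypothetical data: a small matrix of integers.
matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

# Nested for loops: collect the squares of the even numbers.
squares_loop = []
for row in matrix:
    for value in row:
        if value % 2 == 0:
            squares_loop.append(value ** 2)

# The same computation as a single list comprehension.
# The for clauses appear in the same order as the nested loops above.
squares_comp = [value ** 2 for row in matrix for value in row if value % 2 == 0]

# The other comprehension forms follow the same pattern.
parity = {value: value % 2 == 0 for row in matrix for value in row}  # dictionary
remainders = {value % 3 for row in matrix for value in row}          # set
total = sum(value for row in matrix for value in row)                # generator expression

print(squares_comp)
```

Both versions build the same list; the comprehension states the result, the iteration, and the filter in one expression.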
8h ago
There are times when brevity is a blessing; sometimes you just need to figure something out quickly to move ahead with your day. More often than not, though, if you’d like to truly learn about a new topic, there is no substitute for spending some time with it.
This is where our Deep Dives excel: these articles tend to be on the longer side (some of them could easily become a short book!), but they reward readers with top-notch writing, nuanced explanations, and a well-rounded approach to the question or problem at hand. We’ve published some excellent articles in this category recently, an ..read more
21h ago
Fabric Madness part 3
Image by author and ChatGPT. “Design an illustration, featuring a Paralympic basketball player in action, this time the theme is on data pipelines” prompt. ChatGPT, 4, OpenAI, 15 April 2024. https://chat.openai.com.
In the previous post, we discussed how to use Notebooks with PySpark for feature engineering. While Spark offers a lot of flexibility and power, it can be quite complex and requires a lot of code to get started. Not everyone is comfortable writing code or has the time to learn a new programming language, which is where Dataflow Gen2 comes in.
Wh ..read more
23h ago
The Promotion Playbook
Whether you are just starting out or aspiring to make another leap, ensure that you are ready and that your boss knows it
Image generated with Imagen by the Author
How come John, with fewer years of experience, is promoted more quickly than Mark, who has been with the company since before John went to university and is doing so much heavy lifting? Why does Sarah get the most promising projects while Ben, who proposes so many ideas, is still on the bench?
I have been there and asked similar questions, and sometimes, I felt that somebody was undeservingly getting ahead in t ..read more
23h ago
Learn about the structure of LangChain pipelines, callbacks, how to create custom callbacks and integrate them into your pipelines for improved monitoring
Callbacks are an important functionality that helps with monitoring/debugging your pipelines. In this note, we cover the basics of callbacks and how to create custom ones for your use cases. More importantly, through examples, we also develop an understanding of the structure/componentization of LangChain pipelines and how that plays into the design of custom callbacks.
This note assumes basic familiarity with LangChain and how pipelines in ..read more
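To make the idea of a custom monitoring callback concrete, here is a library-free sketch of the general pattern. It deliberately does not use LangChain's API: `PipelineCallback`, `RecordingCallback`, `run_pipeline`, and the two step functions are all hypothetical names invented for this illustration.

```python
from typing import Callable, List


class PipelineCallback:
    """Hypothetical base class: every hook is a no-op by default."""

    def on_step_start(self, name: str, value: object) -> None:
        pass

    def on_step_end(self, name: str, result: object) -> None:
        pass


class RecordingCallback(PipelineCallback):
    """Custom callback that records each event for later inspection."""

    def __init__(self) -> None:
        self.events: List[str] = []

    def on_step_start(self, name: str, value: object) -> None:
        self.events.append(f"start {name}: {value!r}")

    def on_step_end(self, name: str, result: object) -> None:
        self.events.append(f"end {name}: {result!r}")


def run_pipeline(
    steps: List[Callable[[str], str]],
    value: str,
    callbacks: List[PipelineCallback],
) -> str:
    """Run the steps in order, notifying every callback around each step."""
    for step in steps:
        for callback in callbacks:
            callback.on_step_start(step.__name__, value)
        value = step(value)
        for callback in callbacks:
            callback.on_step_end(step.__name__, value)
    return value


def strip_text(text: str) -> str:
    return text.strip()


def shout(text: str) -> str:
    return text.upper()


recorder = RecordingCallback()
output = run_pipeline([strip_text, shout], "  hello ", [recorder])
print(output)
for event in recorder.events:
    print(event)
```

The pipeline itself knows nothing about logging; monitoring is added purely by passing in a callback object, which is the property that makes callbacks useful for debugging.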
23h ago
Exploring causality with Python. Difference-in-differences
Photo by Scott Graham on Unsplash
Establishing causality is one of the most essential, and most often neglected, areas of modern analytics. In an upcoming series of articles, I would like to describe and highlight the tools we use most in our causal inference workshop.
Causal inference 101
Let’s start by defining causal inference. I will use Scott Cunningham’s definition from the Mixtape book.
He defines it as the study of estimating the impact of events and choices on a given outcome of interest. We are trying to establish the cause-and-ef ..read more
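Since the title names difference-in-differences, a minimal sketch of the basic two-group, two-period estimate may help. The `observations` data below are invented for illustration, and the estimate only has a causal reading under the usual parallel-trends assumption (that both groups would have changed by the same amount without the treatment).

```python
from statistics import mean

# Hypothetical observations: (group, period, outcome).
observations = [
    ("treated", "pre", 10.0), ("treated", "pre", 12.0),
    ("treated", "post", 18.0), ("treated", "post", 20.0),
    ("control", "pre", 9.0), ("control", "pre", 11.0),
    ("control", "post", 12.0), ("control", "post", 14.0),
]


def group_mean(group: str, period: str) -> float:
    """Average outcome for one group in one period."""
    return mean(
        outcome
        for g, p, outcome in observations
        if g == group and p == period
    )


# Change over time within each group.
treated_change = group_mean("treated", "post") - group_mean("treated", "pre")
control_change = group_mean("control", "post") - group_mean("control", "pre")

# Difference-in-differences: the treated group's change minus the
# control group's change.
did_estimate = treated_change - control_change
print(did_estimate)
```

Subtracting the control group's change removes the part of the treated group's change that would have happened anyway, which is the core of the method.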
23h ago
Climate change. Source: Canva
Climate change is a frustrating topic. Politicians are not committed to doing anything meaningful about it. And most people like you and me feel powerless and don’t really know what to do to help.
Nonetheless, climate change is happening, and it’s likely accelerating (as we’ll see in the data later in this blog post). We seem to be living in a world where every summer is warmer than the last.
As a millennial, sometimes I seriously wonder whether it’s fair to bring children into this world if they are doomed to suffer from the future climate apocalyp ..read more
1d ago
Pushing RL Boundaries: Integrating Foundational Models, e.g. LLMs and VLMs, into Reinforcement Learning
In-Depth Exploration of Integrating Foundational Models such as LLMs and VLMs into the RL Training Loop
Authors: Elahe Aghapour, Salar Rahili
Overview:
With the rise of the transformer architecture and high-throughput compute, training foundational models has turned into a hot topic recently. This has led to promising efforts to either integrate or train foundational models to enhance the capabilities of reinforcement learning (RL) algorithms, signaling an exciting direction for the fi ..read more
1d ago
How to become a data science “unicorn”
This is the first article in a larger series on “Full Stack Data Science” (FSDS). Although there are distinct roles for different aspects of a machine learning (ML) project, there is often a need for someone who can manage and implement projects end-to-end. This is what we can call a full-stack data scientist. In this article, I will introduce FSDS and discuss its 4 Hats.
Photo by Amanda Jones on Unsplash
What is a Full Stack Data Scientist?
When I first learned data science (5+ years ago), data engineering and ML engineering were not as widespre ..read more
1d ago
Master Advanced Information Retrieval: Cutting-edge Techniques to Optimize the Selection of Relevant Documents with LangChain to Create Excellent RAGs
Content Table
· Introduction
· Vector Store Creation
· Method: Naive Retriever
· Method: Parent Document Retriever
· Method: Self Query Retriever
∘ Query Constructor
∘ Query Translator
· Method: Contextual Compression Retriever (Reranking)
· Conclusion
Introduction
Let’s briefly recall what the three words behind the acronym RAG mean:
Retrieval: The main objective of a RAG is to collect the most relevant documents/chunks regarding the ..read more
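The retrieval step can be sketched without any library: score each chunk against the query and keep the best matches. This is a toy bag-of-words version, not the article's LangChain retriever; `tokenize`, `cosine`, `retrieve`, and the sample `chunks` are all hypothetical and chosen only for illustration. A real RAG would typically use embeddings and a vector store instead of word counts.

```python
import re
from collections import Counter
from math import sqrt
from typing import List


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query: str, chunks: List[str], k: int = 2) -> List[str]:
    """Return the k chunks most similar to the query."""
    query_counts = Counter(tokenize(query))
    ranked = sorted(
        chunks,
        key=lambda chunk: cosine(query_counts, Counter(tokenize(chunk))),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical document chunks.
chunks = [
    "Comprehensions build lists, sets and dictionaries in Python.",
    "A retriever selects the chunks most relevant to a query.",
    "Difference-in-differences compares changes between two groups.",
]

top = retrieve("which chunks are relevant to my query", chunks, k=1)
print(top)
```

Only the second chunk shares any words with the query, so it is the one returned.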