Carnival Of Mathematics
George Shakan Mathematical Research Blog
by George Shakan
4M ago
In this somewhat different post, I am hosting the long-running Carnival of Mathematics. First I’ll talk about 223 (the issue number) and then I’ll round up some mathematical posts from December 2023. It’s primetime we talk about 223. First of all, it is a lucky prime, and it is unknown whether there are infinitely many of those. Writing the number 223 as a sum of fifth powers requires 37 terms, more than any other number. This is an instance of Waring’s problem, which has a rich history going back to Diophantus nearly 2000 years ago. Also, 223 is the number of permutations on 6 elements that have a strong f ..read more
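The claim about fifth powers can be checked with a small dynamic program. This is a minimal sketch, not code from the post; the function name `min_fifth_powers` is hypothetical. It finds the fewest positive fifth powers that sum to a given number, the same way one solves a coin-change problem.

```python
def min_fifth_powers(n):
    """Fewest positive fifth powers (1, 32, 243, ...) that sum to n."""
    # Collect every fifth power that does not exceed n.
    powers = []
    k = 1
    while k ** 5 <= n:
        powers.append(k ** 5)
        k += 1

    # best[t] = fewest fifth powers summing to t (coin-change style DP).
    best = [0] * (n + 1)
    for total in range(1, n + 1):
        best[total] = 1 + min(best[total - p] for p in powers if p <= total)
    return best[n]


print(min_fifth_powers(223))  # 37
```

The only fifth powers below 223 are 1 and 32, so the best one can do is 223 = 6 · 32 + 31 · 1, which uses 6 + 31 = 37 terms.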
Temperature in Natural Language Processing
George Shakan Mathematical Research Blog
by George Shakan
8M ago
In Machine Learning, and in particular Generative AI, temperature is a useful hyperparameter for tuning model outputs. In this post, we will make precise the following. Temperature is a parameter developers can use to alter outputs from Large Language Models. With a higher temperature we get a greater variety of outputs. We will also see why changing the temperature is useful. Let’s start with an example (the code I used is at the end of this post). Consider the following prompt, taken from Google’s Minerva paper. A line parallel to $y = 4x + 6$ passes through $(5, 10)$. What is the $y$-coordinate of the point where this line crosse ..read more
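One common way temperature enters a language model is by dividing the model's raw scores (logits) by the temperature before applying softmax. The sketch below assumes that formulation; the function name and the example logits are hypothetical and are not taken from the post's own code.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Convert raw scores to probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    largest = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - largest) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.1]  # hypothetical scores for three candidate tokens
cold = softmax_with_temperature(logits, temperature=0.5)
warm = softmax_with_temperature(logits, temperature=1.0)
hot = softmax_with_temperature(logits, temperature=2.0)

for name, probs in (("T=0.5", cold), ("T=1.0", warm), ("T=2.0", hot)):
    print(name, [round(p, 3) for p in probs])
```

A low temperature concentrates probability on the top-scoring token, while a high temperature flattens the distribution, which is why sampling at a higher temperature produces more varied outputs.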
Singular Value Decomposition and PCA
George Shakan Mathematical Research Blog
by George Shakan
9M ago
Principal Component Analysis (PCA) is a popular technique in machine learning for dimension reduction. It can be derived from Singular Value Decomposition (SVD), which we will discuss in this post. We will cover the math, an example in Python, and finally some intuition. The Math: SVD asserts that any $n \times d$ matrix $A$ can be written as $A = U \Sigma V^T$, where $\Sigma$ is $n \times d$ and is rectangular diagonal, that is $\Sigma_{ij} = 0$ for $i \neq j$. $U$ and $V$ are orthogonal, that is $U^T U = I$ and $V^T V = I$. This allows us to derive PCA. We are given a data matrix, $A$, that is an $n \times d$ matrix where each of the $n$ rows is a data point with $d$ features. First, we translate each column of s ..read more
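The link between SVD and PCA can be sketched in one line. The notation here is an assumption rather than the post's own: $A$ is the $n \times d$ data matrix whose columns have already been centered to mean zero, and $A = U \Sigma V^T$ is its SVD with $U^T U = I$. The sample covariance matrix then factors as

$$
\frac{1}{n-1} A^T A = \frac{1}{n-1} V \Sigma^T U^T U \Sigma V^T = V \left( \frac{\Sigma^T \Sigma}{n-1} \right) V^T .
$$

Since $\Sigma^T \Sigma$ is diagonal, the columns of $V$ are eigenvectors of the covariance matrix, that is, the principal directions, and the variance along the $i$-th direction is $\sigma_i^2 / (n-1)$, where $\sigma_i$ is the $i$-th singular value. The data expressed in these directions is $A V = U \Sigma$.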
The Square Root Cancellation Heuristic
George Shakan Mathematical Research Blog
by George Shakan
9M ago
In the first equation of the popular Attention Is All You Need paper (see also this blog post), the authors write $\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\left(\frac{Q K^T}{\sqrt{d_k}}\right) V$. In this post we are going to discuss where the $\sqrt{d_k}$ comes from, leading us to some classical Probability Theory. We will first talk about the math with some examples and then quickly make the connection. The principle of square root cancellation also appears in the Batch Norm Paper, Neural Network weight initialization (see also Xavier Uniform), and elsewhere. The Math: Let be a $d$-dimensional vector with entries that are either or Then we may write We will be interested in the si ..read more
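Square root cancellation is easy to observe numerically. The simulation below is a sketch under a stated assumption: the vector's entries are independent and equal to $+1$ or $-1$ with equal probability (the excerpt does not preserve the exact setup). The function name `rms_of_sign_sums` is hypothetical. The sum of $d$ such entries could be as large as $d$, but its typical size is only about $\sqrt{d}$.

```python
import math
import random


def rms_of_sign_sums(d, trials=2000, seed=0):
    """Root-mean-square of the sum of d independent random +1/-1 entries."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    total_sq = 0
    for _ in range(trials):
        s = sum(rng.choice((-1, 1)) for _ in range(d))
        total_sq += s * s
    return math.sqrt(total_sq / trials)


for d in (16, 64, 256):
    rms = rms_of_sign_sums(d)
    # The last column should stay close to 1 as d grows.
    print(d, round(rms, 2), round(rms / math.sqrt(d), 2))
```

Under the same assumption, the dot product of two random sign vectors of dimension $d_k$ is itself a sum of $d_k$ random signs, so dividing by $\sqrt{d_k}$ keeps its typical size near 1 regardless of the dimension.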
How Does ChatGPT read?
George Shakan Mathematical Research Blog
by George Shakan
9M ago
How would ChatGPT read the infamous “Hello, World!”? Does it see each character sequentially: H e l l o , W o r l d ! Or maybe it sees each word as well as the punctuation: Hello , World ! By the end of this post we will have a full understanding of this. On the way, we will learn about Unicode, UTF-8, and byte pair encoding (BPE). In order to understand how ChatGPT sees data, we have to understand the data on which it is trained. The majority of the data used to train GPT-3 comes from the Common Crawl dataset, which is text scraped from the internet. Thus we turn our attention to understand ..read more
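To make UTF-8 and byte pair encoding concrete, here is a toy sketch. It is not ChatGPT's actual tokenizer: the helper names and the new token id 256 are hypothetical, and only a single merge step is shown. The idea is to start from raw UTF-8 bytes and repeatedly replace the most frequent adjacent pair with a new token.

```python
from collections import Counter


def most_frequent_pair(tokens):
    """Return the adjacent pair of tokens that occurs most often."""
    pairs = Counter(zip(tokens, tokens[1:]))
    return pairs.most_common(1)[0][0]


def merge_pair(tokens, pair, new_id):
    """Replace each non-overlapping occurrence of `pair` with `new_id`, left to right."""
    merged = []
    i = 0
    while i < len(tokens):
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) == pair:
            merged.append(new_id)
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    return merged


# "Hello, World!" is 13 ASCII characters, so UTF-8 gives 13 bytes.
hello_bytes = list("Hello, World!".encode("utf-8"))
print(hello_bytes)

# One BPE merge on a toy string with a clearly repeated pair ("aa").
toy = list("aaabdaaabac".encode("utf-8"))
pair = most_frequent_pair(toy)           # (97, 97), the bytes for "aa"
toy_merged = merge_pair(toy, pair, 256)  # 256 is the first id past the byte range
print(len(toy), len(toy_merged))         # 11 9
```

A real tokenizer learns tens of thousands of such merges from a large corpus and then applies them in order, so common words end up as single tokens while rare strings fall back to smaller pieces.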
Using AI to Write Math
George Shakan Mathematical Research Blog
by George Shakan
10M ago
Unfortunately (or perhaps, fortunately), we are still far from the days when we can ask a computer to write proofs for us. However, there are tools available today that can concretely assist with writing mathematics. I made a video on the topic, with a blog post below. While I left research mathematics some time ago, I still find myself typing up some math from time to time. For me this has become a bit easier with GitHub Copilot. I am now using VSCode as my text editor to write LaTeX. VSCode is a free and powerful IDE used by millions of software engineers. With it, one can access GitHub Copilot ..read more
Truthfulness of AI models like Chat GPT
George Shakan Mathematical Research Blog
by George Shakan
10M ago
Large language models have improved and become more mainstream. I discuss this in the video below with AI Researcher Xavier Garcia ..read more
I started a YouTube Channel
George Shakan Mathematical Research Blog
by George Shakan
10M ago
My First Video: I am happy to announce that I just posted my first video to a new YouTube Channel. The result is an interview with Xavier Garcia about how ChatGPT works. What’s Next? I’ll continue to post videos on Machine Learning, Data Science, and perhaps other topics. I plan to do more interviews, individual videos about relevant topics, as well as some educational material. The Process: I was already meeting with Xavier regularly to discuss Machine Learning topics, and so I thought to myself that other people might benefit from our discussions. Our discussion for the video was perha ..read more
Can Chat-GPT Do Math?
George Shakan Mathematical Research Blog
by George Shakan
10M ago
ChatGPT is an impressive new AI chatbot released by OpenAI. Impressive applications of it can be found all over the internet. But can it do math? By math, we do not mean simply performing computations. Its own design ensures that there will be computational problems it will be unable to solve. What I am more interested in is whether it can solve problems that require some mathematical reasoning. To choose our problems, we use the MMLU dataset. Galactica, a large language model recently released by Meta AI, has achieved some good results on this dataset. Their findings are in section 5.3 of their pape ..read more
My experience with Computer Proofs
George Shakan Mathematical Research Blog
by George Shakan
10M ago
Image generated by DALL-E. To acclimate myself to Computer Proofs, I aimed to write down some basic theorems from my area of research in Lean. I chose Lean primarily because of the very active and helpful community. In fact, everything I did in Lean was done alongside Yael Dillies, whom I met through Zulip. In what follows, I will discuss how I went about this and what I took away from the experience. First of all, what is Lean? It is an interactive theorem prover (like Isabelle, Coq, and others). For our purposes, it is a way to input mathematical statements and their corresponding p ..read more
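To give a flavor of what entering a statement and its proof looks like, here is a minimal sketch in Lean 4 syntax. The choice of Lean 4 is an assumption (the post may have used a different version), and the theorem names are hypothetical. In each case the statement follows the colon and the proof follows `:=`.

```lean
-- Statement: addition of natural numbers is commutative.
-- Proof: appeal to the core library lemma `Nat.add_comm`.
theorem add_comm_example (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b

-- Statement: from a proof of `p ∧ q` we can produce a proof of `q ∧ p`.
-- Proof: take the two components of `h` and pair them in the other order.
theorem and_swap_example (p q : Prop) (h : p ∧ q) : q ∧ p :=
  ⟨h.2, h.1⟩
```

Lean checks each proof against its statement, so a file that compiles without errors contains only verified theorems.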