
Salmon Run
Swimming upstream on the technology tide, one technology at a time. A collection of articles, tips, and random musings on application development and system design.
1w ago
Welcome to Part III of my review of the Biomedical Artificial Intelligence (BMI 702) course, part of Harvard's Foundations of Biomedical Informatics 2023 Spring session, taught by Prof. Marinka Zitnik and her team. If you want to check out my previous two reviews in this series, they are listed below: BMI 702 Review Part I and BMI 702 Review Part II (Graph Learning). As the title of my post ...
1M ago
I attended the Haystack US 2023 Search Relevance conference last week. It was a great opportunity to share ideas and techniques around search and search relevance, as well as to catch up with old friends and acquaintances and make new ones. I was there only for the two days of the actual conference, but there were events before and after the conference as well. The full talk schedule ...
1M ago
This week I continue with the review of the papers suggested in the Biomedical Artificial Intelligence (BMI 702) course, specifically the Graph Learning (M3) module. There are 7 papers in the first week (2 required, 5 optional) and 5 in the second week (2 required, 3 optional). In this post I will attempt to enumerate my high-level takeaways from this module and summarize these 12 papers so you can ...
2M ago
I recently moved to our Health Markets division as part of an internal restructuring. While it is essentially a lateral shift, there are subtle differences in the kind of work I will do going forward versus what I have been doing at Elsevier so far. At my previous position at Labs, the focus of work was more on the use of technology to solve business problems of other teams such as those in ...
3M ago
2022 has come and gone, and without a single blog post from my end. To be fair, my blogging output has been steadily decreasing over the last few years, so you would be justified in thinking of it as a somewhat inevitable trend. In other words, we had a good run, etc. Thinking back, one possible reason for my decreasing output is that my previous job was more product focused and my current one is ...
1y ago
In July this year, a group of us on the TWIML Slack Channel came together and participated in the Flax/JAX Community Week organized by Hugging Face and Google Cloud. Our project was about fine-tuning the CLIP Model from OpenAI with the RSICD (Remote Sensing Image Captioning Dataset), and ended up placing third. The code for the project is available on GitHub at arampacha/CLIP-rsicd if you are ...
2y ago
Even though I am from India and my mother tongue is Bengali, and I speak, read, and write both Hindi and Bengali almost as well as English, in my career with Natural Language Processing (NLP) I have worked exclusively with English. This is probably not that uncommon, because until recently, English was the language where most NLP work happened, and to a lesser extent some of the major European ...
2y ago
Some time back I wrote a post about Tricks to improve performance of CIFAR-10 classifier, based on things I learned from New York University's Deep Learning with PyTorch course taught by Yann LeCun and Alfredo Canziani. The tricks I covered were conveniently located on a single slide in one of the lectures. Shortly thereafter, I learned of a few more tricks that were mentioned in passing, so I ...
2y ago
No, not the scooter :-). I meant Vespa.AI, a search engine that supports structured search, text search, and approximate vector search. While Vespa's vector search functionality was probably built in response to search engines incorporating vector-based signals into their ranking algorithms, there are many ML/NLP pipelines as well that can benefit from vector search, i.e., the ability to find ...
2y ago
Some time back, I found myself thinking of different data augmentation strategies for unbalanced datasets, i.e., datasets in which one or more classes are over-represented compared to the others, and wondering how these strategies stack up against one another. So I decided to set up a simple experiment to compare them. This post describes the experiment and its results. The dataset I chose for this ...