Simon Willison's Weblog
Simon Willison is a British technologist and open-source advocate who runs Simon Willison's Weblog. The blog covers a wide range of topics related to software development, data engineering, and open data. Simon writes about his experiences working with web technologies, databases, and programming languages, and shares his insights on best practices, emerging trends, and industry news.
5h ago
[...] by default Heroku will spin up multiple dynos in different availability zones. It also has multiple routers in different zones so if one zone should go completely offline, having a second dyno will mean that your app can still serve traffic.
— Richard Schneeman
11h ago
But where the company once limited itself to gathering low-hanging fruit along the lines of “what time is the super bowl,” on Tuesday executives showcased generative AI tools that will someday plan an entire anniversary dinner, or cross-country move, or trip abroad. A quarter-century into its existence, a company that once proudly served as an entry point to a web that it nourished with traffic and advertising revenue has begun to abstract that all away into an input for its large language models.
— Casey Newton
11h ago
PaliGemma model README
One of the more overlooked announcements from Google I/O yesterday was PaliGemma, an openly licensed VLM (Vision Language Model) in the Gemma family of models.
The model accepts an image and a text prompt. It outputs text, but that text can include special tokens representing regions on the image. This means it can return both bounding boxes and fuzzier segment outlines of detected objects, behavior that can be triggered using a prompt such as "segment puffins".
You can try it out on Hugging Face.
It's a 3B model, making it feasible to run on consumer hardware.
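PaliGemma returns detections as special location tokens in its text output. As a rough sketch of how those can be decoded, assuming the format described in the model README (each box is four `<locNNNN>` tokens giving y_min, x_min, y_max, x_max, normalised to 0–1023, followed by a label):

```python
import re

def parse_boxes(model_output: str, width: int, height: int):
    """Parse PaliGemma-style detection output into pixel bounding boxes.

    Assumes each detection is four <locNNNN> tokens (y_min, x_min,
    y_max, x_max, each normalised to 0-1023) followed by a label,
    with multiple detections separated by ";".
    """
    boxes = []
    pattern = r"((?:<loc\d{4}>){4})\s*([^<;]*)"
    for coords, label in re.findall(pattern, model_output):
        y0, x0, y1, x1 = (int(v) for v in re.findall(r"<loc(\d{4})>", coords))
        boxes.append({
            "label": label.strip(),
            # Scale normalised coordinates into pixel space (x0, y0, x1, y1).
            "box": (
                round(x0 / 1024 * width),
                round(y0 / 1024 * height),
                round(x1 / 1024 * width),
                round(y1 / 1024 * height),
            ),
        })
    return boxes
```

For example, `parse_boxes("<loc0256><loc0128><loc0512><loc0896> puffin", 1024, 1024)` yields one box labelled `puffin` spanning (128, 256) to (896, 512).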
Via Robo …
14h ago
Managing your work in the API platform with Projects
New OpenAI API feature: you can now create API keys for "projects" that can have a monthly spending cap. The UI for that limit says:
If the project's usage exceeds this amount in a given calendar month (UTC), subsequent API requests will be rejected
You can also set custom token-per-minute and request-per-minute rate limits for individual models.
I've been wanting this for ages: this means it's finally safe to ship a weird public demo on top of their various APIs without risk of accidental bankruptcy if the demo goes viral!
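The cap itself is enforced by OpenAI on the server side, but the behaviour is easy to picture. Here is a hypothetical client-side sketch of the same idea (all names here are illustrative, not part of the OpenAI API):

```python
class SpendCapExceeded(Exception):
    pass

class ProjectBudget:
    """Hypothetical local mirror of a per-project monthly spend cap.

    The real cap is enforced server-side: once a project's usage for
    the calendar month (UTC) exceeds the limit, subsequent API
    requests are rejected. This sketch just tracks estimated spend
    locally and raises before making a call.
    """

    def __init__(self, monthly_cap_usd: float):
        self.monthly_cap_usd = monthly_cap_usd
        self.spent_usd = 0.0

    def charge(self, estimated_cost_usd: float) -> None:
        # Refuse the request if it would push spend past the cap.
        if self.spent_usd + estimated_cost_usd > self.monthly_cap_usd:
            raise SpendCapExceeded(
                f"would exceed ${self.monthly_cap_usd:.2f} monthly cap"
            )
        self.spent_usd += estimated_cost_usd
```

A demo that has spent $6 of a $10 cap would accept a $3 request but reject a $5 one, which is exactly the property that makes viral public demos safe.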
Via @romainhuet
14h ago
Monday's OpenAI announcement of their new GPT-4o model included some intriguing new features:
Creepily good improvements to the ability to both understand and produce voice (Sam Altman simply tweeted "her"), and to be interrupted mid-sentence
New image output capabilities that appear to leave existing models like DALL-E 3 in the dust - take a look at the examples, they seem to have solved consistent character representation AND reliable text output!
They also made the new 4o model available to paying ChatGPT Plus users, on the web and in their apps.
But, crucially, those big new features were …
17h ago
If we want LLMs to be less hype and more of a building block for creating useful everyday tools for people, AI companies' shift away from scaling and AGI dreams to acting like regular product companies that focus on cost and customer value proposition is a welcome development.
— Arvind Narayanan
17h ago
How to PyCon
Glyph’s tips on making the most out of PyCon. I particularly like his suggestion that “dinners are for old friends, but lunches are for new ones”.
I’m heading out to Pittsburgh tonight, and giving a keynote (!) on Saturday. If you see me there please come and say hi!
Via Lobste.rs
1d ago
The MacBook Airs are Apple’s best-selling laptops; the iPad Pros are Apple’s least-selling iPads. I think it’s as simple as this: the current MacBook Airs have the M3, not the M4, because there isn’t yet sufficient supply of M4 chips to satisfy demand for MacBook Airs.
— John Gruber
2d ago
Context caching for Google Gemini
Another new Gemini feature announced today. Long context models enable answering questions against large chunks of text, but the price of those long prompts can be prohibitive—$3.50/million for Gemini Pro 1.5 up to 128,000 tokens and $7/million beyond that.
Context caching offers a price optimization, where the long prefix prompt can be reused between requests, halving the cost per prompt but at an additional cost of $4.50 / 1 million tokens per hour for context cache storage.
Given that hourly extra charge this isn’t a default optimization for all cases, but …
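The break-even point falls out of the prices quoted above. Assuming "halving the cost per prompt" means cached input is billed at $1.75/million instead of $3.50/million (Gemini Pro 1.5, prompts up to 128K tokens), plus $4.50 per million tokens per hour of cache storage:

```python
# Break-even sketch for Gemini context caching, using the prices quoted
# above. The halved cached rate is an assumption based on "halving the
# cost per prompt".
UNCACHED_PER_M = 3.50      # $/million input tokens, uncached
CACHED_PER_M = 1.75        # $/million input tokens, cached (assumed half)
STORAGE_PER_M_HOUR = 4.50  # $/million tokens per hour of cache storage

def hourly_cost(prefix_tokens: int, requests_per_hour: int, cached: bool) -> float:
    """Total hourly input cost of re-sending a long prefix with each request."""
    m = prefix_tokens / 1_000_000
    if not cached:
        return requests_per_hour * m * UNCACHED_PER_M
    # Cached: cheaper per request, but you pay for storage every hour.
    return requests_per_hour * m * CACHED_PER_M + m * STORAGE_PER_M_HOUR

# Caching wins once the per-request saving outpaces the storage charge:
# 4.50 / (3.50 - 1.75) ≈ 2.6 requests per hour, regardless of prefix size.
assert hourly_cost(100_000, 2, cached=True) > hourly_cost(100_000, 2, cached=False)
assert hourly_cost(100_000, 3, cached=True) < hourly_cost(100_000, 3, cached=False)
```

So with a long shared prefix, caching pays for itself at roughly three or more requests per hour; below that, the storage charge eats the saving.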
2d ago
llm-gemini 0.1a4
A new release of my llm-gemini plugin adding support for the Gemini 1.5 Flash model that was revealed this morning at Google I/O.
I'm excited about this new model because of its low price. Flash is $0.35 per 1 million tokens for prompts up to 128K tokens and $0.70 per 1 million tokens for longer prompts - up to a million tokens now and potentially two million at some point in the future. That's 1/10th of the price of Gemini Pro 1.5, cheaper than GPT-3.5 ($0.50/million) and only a little more expensive than Claude 3 Haiku ($0.25/million) …
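The tiered pricing is simple to model. A minimal sketch, assuming (as the per-prompt wording suggests) that the higher rate applies to the whole prompt once it crosses the 128K threshold:

```python
def flash_prompt_cost(prompt_tokens: int) -> float:
    """Estimate Gemini 1.5 Flash input cost in USD.

    Assumes the tier is chosen by total prompt size: $0.35/million
    tokens for prompts up to 128K tokens, $0.70/million for longer
    prompts, applied to the whole prompt.
    """
    rate = 0.35 if prompt_tokens <= 128_000 else 0.70
    return prompt_tokens / 1_000_000 * rate
```

A 100K-token prompt comes out at $0.035, while a full million-token prompt costs $0.70 - still remarkably cheap for that much context.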