Quoting Richard Schneeman
Simon Willison's Weblog
by Simon Willison
5h ago
[...] by default Heroku will spin up multiple dynos in different availability zones. It also has multiple routers in different zones so if one zone should go completely offline, having a second dyno will mean that your app can still serve traffic. — Richard Schneeman
Quoting Casey Newton
11h ago
But where the company once limited itself to gathering low-hanging fruit along the lines of “what time is the super bowl,” on Tuesday executives showcased generative AI tools that will someday plan an entire anniversary dinner, or cross-country move, or trip abroad. A quarter-century into its existence, a company that once proudly served as an entry point to a web that it nourished with traffic and advertising revenue has begun to abstract that all away into an input for its large language models. — Casey Newton
PaliGemma model README
11h ago
One of the more overlooked announcements from Google I/O yesterday was PaliGemma, an openly licensed VLM (Vision Language Model) in the Gemma family of models. The model accepts an image and a text prompt. It outputs text, but that text can include special tokens representing regions on the image. This means it can return both bounding boxes and fuzzier segment outlines of detected objects, behavior that can be triggered using a prompt such as "segment puffins". You can try it out on Hugging Face. It's a 3B model, making it feasible to run on consumer hardware.

Via Robo [...]
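Those region tokens need a little post-processing to become usable coordinates. A minimal sketch, under the assumption that the model emits four <locNNNN> tokens per detected box (each NNNN in 0–1023, in y_min, x_min, y_max, x_max order, normalized to the image size — check the model README for the exact scheme):

```python
import re

def parse_boxes(text, width, height, bins=1024):
    """Turn runs of '<locNNNN>' tokens in model output into pixel bounding
    boxes (x_min, y_min, x_max, y_max), scaled to the given image size."""
    values = [int(v) for v in re.findall(r"<loc(\d{4})>", text)]
    boxes = []
    # Consume the location values four at a time: y_min, x_min, y_max, x_max.
    for i in range(0, len(values) - 3, 4):
        y0, x0, y1, x1 = values[i : i + 4]
        boxes.append((
            round(x0 / bins * width), round(y0 / bins * height),
            round(x1 / bins * width), round(y1 / bins * height),
        ))
    return boxes

print(parse_boxes("<loc0256><loc0256><loc0768><loc0768> puffin", 1024, 1024))
# [(256, 256, 768, 768)]
```

The token format here is an assumption based on how the model is described; the segmentation outputs use additional special tokens beyond these.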
Managing your work in the API platform with Projects
14h ago
New OpenAI API feature: you can now create API keys for "projects" that can have a monthly spending cap. The UI for that limit says:

If the project's usage exceeds this amount in a given calendar month (UTC), subsequent API requests will be rejected

You can also set custom token-per-minute and request-per-minute rate limits for individual models. I've been wanting this for ages: this means it's finally safe to ship a weird public demo on top of their various APIs without risk of accidental bankruptcy if the demo goes viral!

Via @romainhuet
ChatGPT in "4o" mode is not running the new features yet
14h ago
Monday's OpenAI announcement of their new GPT-4o model included some intriguing new features:

- Creepily good improvements to the ability to both understand and produce voice (Sam Altman simply tweeted "her"), and to be interrupted mid-sentence
- New image output capabilities that appear to leave existing models like DALL-E 3 in the dust - take a look at the examples, they seem to have solved consistent character representation AND reliable text output!

They also made the new 4o model available to paying ChatGPT Plus users, on the web and in their apps. But, crucially, those big new features were [...]
Quoting Arvind Narayanan
17h ago
If we want LLMs to be less hype and more of a building block for creating useful everyday tools for people, AI companies' shift away from scaling and AGI dreams to acting like regular product companies that focus on cost and customer value proposition is a welcome development. — Arvind Narayanan
How to PyCon
17h ago
Glyph’s tips on making the most out of PyCon. I particularly like his suggestion that “dinners are for old friends, but lunches are for new ones”. I’m heading out to Pittsburgh tonight, and giving a keynote (!) on Saturday. If you see me there please come and say hi!

Via Lobste.rs
Quoting John Gruber
1d ago
The MacBook Airs are Apple’s best-selling laptops; the iPad Pros are Apple’s least-selling iPads. I think it’s as simple as this: the current MacBook Airs have the M3, not the M4, because there isn’t yet sufficient supply of M4 chips to satisfy demand for MacBook Airs. — John Gruber
Context caching for Google Gemini
2d ago
Another new Gemini feature announced today. Long context models enable answering questions against large chunks of text, but the price of those long prompts can be prohibitive: $3.50/million tokens for Gemini Pro 1.5 up to 128,000 tokens, and $7/million beyond that. Context caching offers a price optimization: the long prefix prompt can be reused between requests, halving the cost per prompt, but at an additional cost of $4.50/million tokens per hour for context cache storage. Given that hourly extra charge this isn’t a default optimization for all cases, but [...]
llm-gemini 0.1a4
2d ago
A new release of my llm-gemini plugin adding support for the Gemini 1.5 Flash model that was revealed this morning at Google I/O. I'm excited about this new model because of its low price. Flash is $0.35 per 1 million tokens for prompts up to 128K tokens and $0.70 per 1 million tokens for longer prompts - up to a million tokens now and potentially two million at some point in the future. That's 1/10th of the price of Gemini Pro 1.5, cheaper than GPT 3.5 ($0.50/million) and only a little more expensive than Claude 3 Haiku ($0.35/million [...]
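The price comparison works out like this, using only the per-million-token prompt prices quoted above (prompts up to 128K; longer-prompt tiers ignored for simplicity):

```python
# Prompt prices quoted in the post, in dollars per million tokens.
FLASH = 0.35   # Gemini 1.5 Flash, prompts up to 128K tokens
PRO = 3.50     # Gemini Pro 1.5, prompts up to 128K tokens
GPT_35 = 0.50  # GPT 3.5

def prompt_cost(tokens, price_per_million):
    """Dollar cost of a single prompt at a given per-million-token price."""
    return tokens / 1_000_000 * price_per_million

# A 100K-token prompt under each model's pricing:
print(round(prompt_cost(100_000, FLASH), 3))  # 0.035
print(round(prompt_cost(100_000, PRO), 3))    # 0.35
print(round(PRO / FLASH))                     # 10 - Flash is 1/10th of Pro
```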
