At WWDC this year, Apple announced a team-up with OpenAI. ChatGPT will be embedded in many places in the next iOS version, serving up writing assistance, generating custom Genmoji, and making Siri smarter.
And it costs you nothing!
But does it cost Apple nothing?
According to reports, Apple is not paying OpenAI one thin dime for this integration. So why would OpenAI do this? After all, it requires enormous processing power, management time, development work, etc. It’s easy to understand Apple’s motivations: it gets leading-edge AI integrated into its platforms.
But what does OpenAI get?
The Data Wall
In Leopold Aschenbrenner’s landmark series of essays, Situational Awareness, the “data wall” is explained:
There is a potentially important source of variance for all of this: we’re running out of internet data. That could mean that, very soon, the naive approach to pretraining larger language models on more scraped data could start hitting serious bottlenecks.
Frontier models are already trained on much of the internet. Llama 3, for example, was trained on over 15T tokens. Common Crawl, a dump of much of the internet used for LLM training, is >100T tokens raw, though much of that is spam and duplication (e.g., a relatively simple deduplication leads to 30T tokens, implying Llama 3 would already be using basically all the data). Moreover, for more specific domains like code, there are many fewer tokens still, e.g. public github repos are estimated to be in low trillions of tokens.
Training LLMs requires vast amounts of data, and in practical terms, we’re out of it. All the large publicly available datasets have already been trained on to death, and clever tricks like sucking in every Reddit post or every tweet have already been done. There are ebooks, YouTube, and other sources, but the core problem is that there is only so much data.
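To make the quoted numbers concrete, here is a rough back-of-envelope sketch in Python. The token figures come straight from the Situational Awareness excerpt above; the 10x scale-up for a hypothetical next-generation model is purely an illustrative assumption, not a claim from the essay.

```python
# Back-of-envelope check of the "data wall" argument, using the figures
# quoted above from Situational Awareness. All numbers are rough estimates.

COMMON_CRAWL_RAW_TOKENS = 100e12   # >100T tokens of raw web text
COMMON_CRAWL_DEDUPED    = 30e12    # ~30T tokens after simple deduplication
LLAMA_3_TRAINING_TOKENS = 15e12    # Llama 3 was trained on over 15T tokens

used_fraction = LLAMA_3_TRAINING_TOKENS / COMMON_CRAWL_DEDUPED
print(f"Llama 3 alone used ~{used_fraction:.0%} of the deduplicated web")

# Assumption for illustration only: suppose the next frontier model wants
# 10x the training data. The open web simply does not have it.
next_gen_tokens = 10 * LLAMA_3_TRAINING_TOKENS
shortfall = next_gen_tokens - COMMON_CRAWL_DEDUPED
print(f"A 10x scale-up would need ~{next_gen_tokens / 1e12:.0f}T tokens, "
      f"~{shortfall / 1e12:.0f}T more than the deduplicated web provides")
```

On those rough numbers, even one more order-of-magnitude jump in training data cannot come from the public web, which is exactly why a fresh, private source of data is so valuable.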
Now Apple is presenting OpenAI with a vast new storehouse of data: everything its hundreds of millions of customers do on their iPhones, iPads, Watches, and Macs.
Remember, Sam Altman is a Liar
Now, of course, they won’t be training on Apple user data. Of course not.
Also remember that at every turn, Sam Altman has ignored safety, rules, and limits. That’s why OpenAI’s board briefly canned him.
If you honestly believe that your Apple-originated data is not going to be used to improve ChatGPT’s models, I have a bridge in New York I’d like to sell you.
The official story is that Apple is acting as a brand ambassador for OpenAI…wink, wink.