(LifeSiteNews) — With the update of its amusingly named “privacy policy” on July 1, Google has announced it is going to take practically everything you have ever posted on the internet to improve its own artificial intelligence model.
Google’s Large Language Model (LLM) is called “Bard.” These artificial intelligences are “trained” using data input. Yours.
How have things changed?
Formerly, Google’s privacy policy indicated a more limited harvesting and application of user data. As tech news source Gizmodo reported on Monday:
‘Google uses information to improve our services and to develop new products, features and technologies that benefit our users and the public,’ the new Google policy says. ‘For example, we use publicly available information to help train Google’s AI models and build products and features like Google Translate, Bard, and Cloud AI capabilities.’
Fortunately for history fans, Google maintains a history of changes to its terms of service. The new language amends an existing policy, spelling out new ways your online musings might be used for the tech giant’s AI tools.
Yet the new changes are of a different order of magnitude.
Previously, Google said the data would be used “for language models,” rather than “AI models,” and where the older policy just mentioned Google Translate, Bard and Cloud AI now make an appearance.
This is an unusual clause for a privacy policy. Typically, these policies describe ways that a business uses the information that you post on the company’s own services. Here, it seems Google reserves the right to harvest and harness data posted on any part of the public web, as if the whole internet is the company’s own AI playground. Google did not immediately respond to a request for comment.
By its own admission, Google now intends to take everything it can. As Gizmodo put it, the tech giant, whose parent company is Alphabet, “reserves the right to scrape just about everything you post online to build its AI tools. If Google can read your words, assume they belong to the company now, and expect that they’re nesting somewhere in the bowels of a chatbot.”
Why does this matter?
Big data is big business. Aside from the risible use of the term “privacy” to describe a process it will never disclose, and in a manner which is likely illegal anyway, Big Tech is using Big Data to magnify its own power and profit.
This data is all the information that has been uploaded about you on any public website. The value of such a vast data pool lies in its use: the building of ever better models for the prediction, mimicry, and shaping of human behavior.
As the models “learn” from human input, including preferences, political views, purchase history, and public information such as identity and family, they become more precise at producing content that will influence emotions and decision making in the human population.
Google seeks to build the biggest picture of humanity on earth. As with ChatGPT, another leading LLM, the model will never be made public. This is done not only for reasons of copyright, but also because it is said to make AI “safer.”
According to another article from Gizmodo, the result of this secrecy is the opposite of safety.
In a report focused on Google competitor ChatGPT, Gizmodo argued that safety is compromised by secrecy over the protocols which govern the behavior of these artificial intelligence models.
The co-founder of OpenAI, Ilya Sutskever, gave his view that it was “wrong” for the company to have previously released its training data. One major early investor in OpenAI was Elon Musk, who now laments that the company has departed from its intended purpose of remaining transparent and “open source.”
Quoted in The Verge on March 15, Sutskever said:
These models are very potent and they’re becoming more and more potent.
At some point it will be quite easy, if one wanted, to cause a great deal of harm with those models. And as the capabilities get higher it makes sense that you don’t want to disclose them.
One major reason why these companies will not release their training data is that it is legally dubious to harvest it in the first place.
The story above relates to a class action lawsuit brought by a California legal firm. The case is known as the Clarkson suit, after the name of the legal practice bringing the action, and Gizmodo reported its skeleton argument:
The central claim in the Clarkson lawsuit is that OpenAI’s entire business model is based on theft.
The lawsuit specifically accuses the company of creating its products using “stolen private information, including personally identifiable information, from hundreds of millions of internet users, including children of all ages, without their informed consent or knowledge.”
As for the “safety” claim, Gizmodo had more:
After OpenAI released GPT-4, AI security researchers at Adversa AI conducted some simple prompt injection attacks to find out how they can manipulate the AI. These prompts trick the AI into overriding its own safeguards. The AI could then create an edited article to, for example, explain how to best destroy the world. In a much more pertinent example for our demented political environment, Adversa AI researchers could also get the AI to write an edited article using subversive text and dog whistles to attack LGBTQ+ people.
Without knowing where GPT-4 derives its information from, it’s harder to understand where the worst harms lie. University of Washington computational linguistics professor Emily Bender wrote on Twitter that this has been a constant problem with OpenAI going back to 2017. She said OpenAI is ‘willfully ignoring the most basic risk mitigation strategies, all while proclaiming themselves to be working towards the benefit of humanity.’
Chat models such as Bing by Microsoft and Google’s Bard have the same potential dangers.
Bender put the case succinctly in March:
Without clear and thorough documentation of what is in the dataset and the properties of the trained model, we are not positioned to understand its biases and other possible negative effects, to work on how to mitigate them, or [to assess the] fit between model and use case.
— Emily Bender on Mastodon (@emilymbender) March 14, 2023
In addition, given the growing and often invisible intrusion of AI systems into everyday life, and without knowledge of the inherent bias in the training of these models, we are subject to the brainchildren of a very small number of people, whose ideological agenda is certain to frame that of the models of human intelligence they are building.
We can infer the values being transmitted to our new robot overlords from the messaging of Big Tech.
Replacing reality
Determining what – or who – is real in the Liberal ideology we inhabit will become immensely more challenging as machine behaviors mimic human traits, themselves increasingly shaped by algorithms – or patterned on machines.
As cybersecurity firm Malwarebytes Labs reported on July 5:
With so many AI tools doing things like falsely claiming that people have written articles or just running into copyright trouble generally, we have no real way to know if this will actually improve anything.
You may have had some objections to search engines making bank from content you post online, but there is some positive return there in the form of your content being placed in front of people. Now we have AI spam posing a threat to said engines, while your content is potentially being monetized twice over with new AI policies coming into force.
The questions of copyright, of who owns your private life, and of whether anyone on the internet is really human are all far more open than the training models of the algorithms which will shape your future.
The dead internet theory – which posits that most activity on the internet is not made by humans but by automated scripts called “bots” – appears to be a more likely prediction of future content than the mere “conspiracy” it was once dubbed.
The impact is also being felt on other new media platforms. Recently Twitter saw severe account limitations and broken features, explained by Elon Musk as an attempt to combat aggressive data harvesting – or “scraping” – presumably by Google and others.
To address extreme levels of data scraping & system manipulation, we’ve applied the following temporary limits:
– Verified accounts are limited to reading 6000 posts/day
– Unverified accounts to 600 posts/day
– New unverified accounts to 300/day
— Elon Musk (@elonmusk) July 1, 2023
The measures have since been lifted, but they appear to have been partly a response to actions taken by Mark Zuckerberg’s Twitter competitor, Threads. Elon Musk’s lawyer, Alex Spiro, sent a letter accusing the Facebook billionaire’s company of industrial espionage and theft.
That the issue of data scraping is a problem for the powerful is some consolation, as it forces the issue into the open. With legal battles looming, and the terrifying potential of AI developing in the dark, there is hope that the broader conflict of interest will come to the fore: Who owns you, and what do they owe you for it?