reCAPTCHA as a surveillance tool: how does Google track users? (video, 18 min)

The video on the CHUPPL channel takes on reCAPTCHA, a tool that supposedly protects us from bots. The host asks how it can recognize human activity without the traditional checks, such as reading distorted words or picking out photos of motorcycles. He also asks how reCAPTCHA became a tool of mass surveillance and corporate greed. Because it is present on more than 12.5 million websites, many of us encounter it every day, often without being aware of its darker side.

While researching the tool, the host came across various side effects that contribute to the surveillance of US residents. The video raises controversial issues such as secret FISA orders and reCAPTCHA's links to the NSA. According to the video, reCAPTCHA is not effective at blocking bots, yet it collects data on mouse movements and user interactions, which lets Google gather more information about people's activity online. This raises questions about who we really are in the eyes of corporations and what consequences come with using such popular tools.
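The transcript below describes the checkbox test as looking at the cursor path just before the click: a scripted cursor tends to travel in a straight line, while a human hand wobbles. As a rough illustration only, the following Python sketch scores a path by dividing the straight-line distance by the distance actually travelled. The function names, the sample paths, and the 0.999 threshold are all assumptions made for this example; Google does not publish how reCAPTCHA scores movement, and this is not its algorithm.

```python
import math

# Hypothetical threshold: paths straighter than this are treated as scripted.
STRAIGHTNESS_THRESHOLD = 0.999


def path_straightness(points):
    """Return direct distance / travelled distance (1.0 means perfectly straight)."""
    if len(points) < 2:
        raise ValueError("need at least two points")
    travelled = sum(math.dist(a, b) for a, b in zip(points, points[1:]))
    if travelled == 0:
        return 1.0  # the cursor never moved
    return math.dist(points[0], points[-1]) / travelled


def looks_scripted(points, threshold=STRAIGHTNESS_THRESHOLD):
    """Toy heuristic: flag paths that are almost perfectly straight."""
    return path_straightness(points) >= threshold


# A cursor moved by simple linear interpolation, as in the transcript's "robot" case.
scripted = [(i * 10.0, i * 5.0) for i in range(11)]

# A hand-written wobbly path with the same start and end points (illustrative data).
human_like = [(0, 0), (12, 9), (19, 6), (33, 21), (41, 17),
              (58, 33), (66, 29), (80, 44), (91, 43), (100, 50)]

print(round(path_straightness(scripted), 3), looks_scripted(scripted))
print(round(path_straightness(human_like), 3), looks_scripted(human_like))
```

The interpolated path scores at or very near 1.0, while the wobbly one scores noticeably lower. A script that adds jitter to its path would pass this particular check, so the sketch shows the idea and nothing more.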

The data gathered from users' interactions with reCAPTCHA shows that, over time, users have been asked to do more and more unpaid work validating data for Google. According to the study cited in the video, people have spent more than 819 million hours solving reCAPTCHA challenges, which the authors value at $6.1 billion of unpaid labor. Even so, the class action lawsuit over these practices was dropped, in large part because the time each individual user spends on the challenges is so small.
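To put the two headline figures side by side, the short Python sketch below derives the hourly rate they imply and converts the hours into years. Only the two input numbers come from the video; the function names are made up for this example, and since the source says "more than" 819 million hours, the results are rough, not exact.

```python
HOURS_SPENT = 819_000_000          # "more than 819 million hours" (from the video)
UNPAID_LABOR_USD = 6_100_000_000   # "$6.1 billion of unpaid labor" (from the video)


def implied_hourly_rate(hours, total_usd):
    """Dollar value per hour implied by the two totals."""
    return total_usd / hours


def hours_to_years(hours):
    """Convert hours to years of continuous time, using 365-day years."""
    return hours / (24 * 365)


rate = implied_hourly_rate(HOURS_SPENT, UNPAID_LABOR_USD)
years = hours_to_years(HOURS_SPENT)

print(f"Implied rate: ${rate:.2f} per hour")        # Implied rate: $7.45 per hour
print(f"Continuous time: about {years:,.0f} years")  # about 93,493 years
```

The implied rate of roughly $7.45 per hour is simply the quotient of the two published totals; the video does not state which wage figure the study actually used.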

The host also notes that although Google continues to maintain that data collected through reCAPTCHA is not used for personalized advertising, many of his findings suggest this may not be true. A search through Google's terms of service turns up ambiguous wording that can be interpreted in a way that lets the company keep collecting data for purposes that are hard to pin down. Combined with the secret surveillance court and the government agencies described in the video, all of this can lead to privacy risks and to abuse by various parties.

Finally, it is worth noting the CHUPPL video's statistics: so far it has gathered 228,314 views and 17,000 likes, which points to real interest in the topic. Viewers are not only curious but also worried about what they might discover about tools they use every day. The video prompts reflection on how corporations and governments can watch us and what that will mean for our privacy in the future.

Timeline summary

  • 00:00 Introduction to the question of how reCAPTCHA works.
  • 00:15 Explanation of reCAPTCHA's presence across the web.
  • 00:37 Questioning how effective reCAPTCHA is at stopping bots.
  • 00:52 Revelation that the narrative around reCAPTCHA is misleading.
  • 01:10 Examination of reCAPTCHA's mouse-movement tracking.
  • 01:18 Discussion of the controversy around the tool, linking it to corporate greed.
  • 01:45 Question of why a robot cannot tick the box.
  • 01:55 The challenge placed in the context of the antitrust trial.
  • 02:19 Details of the mouse-movement analysis.
  • 03:40 Claim that reCAPTCHA has turned into spyware.
  • 03:56 reCAPTCHA's complex data collection.
  • 04:23 Mention of the secretive legal framework surrounding surveillance.
  • 06:10 Question about Google's data collection practices.
  • 07:27 Background on how reCAPTCHA functions.
  • 07:49 Direct warning about reCAPTCHA's surveillance nature.
  • 08:15 A look into data broker practices.
  • 10:07 Discussion of the class action lawsuit against Google.
  • 12:00 Analysis of Google's ambiguous data-use terms.
  • 13:42 Explanation of the implications of government data requests.
  • 15:10 Closing thoughts on reCAPTCHA's implications beyond bots.
  • 17:10 Concerns about the future of data privacy.

Transcription

How does this checkbox know that I'm not a robot? I didn't click any motorcycles, traffic lights, I didn't even type in distorted words, and yet it knew. This infamous tech is called reCAPTCHA, and when it comes to reach, few tools rival its presence across the web. It's on 12 and a half million websites, quietly sitting on pages that you visit every day. And it's actually not very good at stopping bots. Which of course led me to the question, if it's not stopping bots, what is it doing? This simple question sent us down a rabbit hole that was deeper and more complicated than we ever imagined. The box test isn't really about the box. A single checkbox. It turns out reCAPTCHA isn't what we think it is. And the public narrative around reCAPTCHA is an impossibly small sliver of the truth. And by accepting that sliver as the full truth, we've all been misled. Facts, your mouse movement. For months, we followed the data. We examined glossed over research, and uncovered evidence that most people don't know exists. This isn't the story of an inconsequential box. It's the story of a seemingly innocent tool, and how it became a gateway for corporate greed and mass surveillance. We found buried lawsuits, whispers of the NSA, and echoes of Edward Snowden. This is the story of the future of the internet, and who's trying to control it. Why can't a robot work out how to tick a box marked, I'm not a robot? In the largest antitrust trial in the United States in 25 years. Let's start here. What you've been told is wrong. Journalists told you such a small sliver of the truth, that I would consider it to be deceptive. Why can't a robot just click, I'm not a robot? The box test isn't really about the box. It actually tracks your mouse movements right before you check the box. See, if you run code to make an object move to a certain point, like a cursor, the simplest version will make it move in a straight line. Robots move like this. But humans naturally aren't that accurate. 
We don't move in a perfectly straight line. But humans move kind of like this. Humans tend to move their mice in wiggly, imperfect ways. And that is what this version of CAPTCHA was looking for. Okay. Mouse macro software. Find image. Look for the checkbox. Move mouse to checkbox. And make it click. Play. I mean, it's so easy. It's so easy to test this. Okay, you're probably thinking, why does any of this matter? I agree with you. I did agree with you. I actually halted this investigation for a few weeks because I thought it was quite boring. Until I went to renew my passport. Passportstatus.state.gov. I got a CAPTCHA. Not a checkbox. Not fire hydrants. But the old one. And I clicked it. And it took me here. reCAPTCHA seems to have become a spyware. It might be also that its primary purpose is doxing of US residents and spying on everyone else. I don't know what led me to do it. It felt like my mouse was moving itself. An entire page dedicated to documenting the horrors of reCAPTCHA. Alleging national security implications for the US and foreign governments. Its ability to dox users. Mentioning secret FISA orders. The same type of orders that Edward Snowden risked his life to warn us about. It is one of the most secretive places in the world. Who put this together? Anonymous. If you're a webnative journalist looking to get in touch, we doubt you're going to have a hard time figuring out who we are anyway. This felt like a key. Left in plain sight, whispering, "There's a door nearby, and it's meant to be open." This is what we're good at. This is what we do. It felt like someone on the other side already knew that. As if they'd been waiting for someone to come along and notice. Okay, let's get this out of the way. 
reCAPTCHA is not, and really has never been, very good at stopping bots. reCAPTCHA was founded in part by this guy in 2007. Version 1 looked like this. In 2009, Google bought reCAPTCHA. But in 2012, hackers were able to get bots through with a 99.1% success rate. V2 dropped in 2014. It was the first time we saw the infamous checkbox. This is also when it allegedly became spyware. In 2017, version 2 was cracked with an 85% success rate. The code to do so was made public, and still works to this day. 2018 was the launch of V3. According to researchers at UC Irvine, there's practically no difference between V2 and V3. A few months after the launch of V3, it was beaten with a 97% success rate. Google doesn't tell us how reCAPTCHA works, besides using an "advanced risk analysis engine." But these hackers, they spelled it out. Do you believe that reCAPTCHA should be thrown out? Yeah, I think it's time to deprecate this technology. This is Dr. Andrew Searles, and he's the lead author of this study. And so I essentially developed a mirror of reCAPTCHA that would make you have to solve multiple in a row and would tell you that you're wrong regardless of what you said. And showed some of the data that you can actually scrape. You can scrape so much data from somebody's user interaction from a website. Whoa, this has been my big question. What data is Google collecting? They can collect all sorts of data, right? Any information, any keystroke, any click, IP address, user agent string, all the websites, all the browsing history, cookie information. reCAPTCHA takes a pixel-by-pixel fingerprint of your browser. A real-time map of everything you do on the internet. I think it's like 10 million websites employ this technology. They essentially get access to any user interaction on that webpage. Are you saying that Google can see anything that anyone is doing on any website that has reCAPTCHA embedded in it? There's a very good chance. They have that capability, is what I will say. 
reCAPTCHA doesn't need to be good at stopping bots because it knows who you are. The new reCAPTCHA runs in the background, is invisible, and only shows challenges to bots or suspicious users. If there's any part of this video you should listen to, it's this. Stop making dinner, stop scrolling on your phone, and please listen. When I tell you that reCAPTCHA is watching you, I'm not saying that in some abstract, metaphorical way. Right now, reCAPTCHA is watching you. It knows that you're watching me, and it doesn't want you to know. It's not just Google. Now there is this whole sketchy market about brokers, advertising brokers that like to purchase large amounts of people's information. These brokers have information on probably all of us. They get this data from a variety of sources. Government records, like your birth certificate, your voter registration, court documents, or social media. The posts you've made, the posts you've liked, the quizzes you've taken, the websites you've visited. They package all of this together, and they sell it. They're not just selling this to advertisers. Many are selling it to whoever will pay for it. So I use a tool called Delete.me. They're also graciously sponsoring this portion of today's video. Delete.me goes through the strenuous steps of getting your data removed from hundreds of these data broker websites. Just in my first month, Delete.me searched 3,500 listings for my personal information, and they found data on me on 55 different data broker websites. And then they got it removed on my behalf. My favorite thing about this is that it's not a one-time deal. Delete.me routinely checks different data broker websites to see if they post anything new on me, and then they get it removed. I don't have to do a thing. Because they've sponsored this portion of today's video, Delete.me is offering a 20% discount to you if you use the link down below. It's joindelete.me.com.chuple20 or you can use code CHUPLE20 at checkout. 
So if you don't like the idea of corporations selling your personally identifiable information for financial gain, and you see the potential risks of the data economy, use that link. You'll get 20% off, and you'll get your first report back in seven days. Thank you to Delete.me. I'm seriously, genuinely, really impressed with what you all do. In January of 2015, a class action lawsuit was filed against Google. The allegations claimed that every time someone solved a reCAPTCHA challenge, they weren't just proving that they were human, but they were performing unpaid labor for Google without their knowledge. Look at this. Typing "morning overlooks" would get you a passing grade, obviously, but so would "adorning overlooks," and "pouring overlooks," even "egg overlooks," "horse overlooks," "blue man group overlooks." All of them will get you a passing grade and confirm you're a human. reCAPTCHA is testing you on this word alone. It has no idea what this one is. You, on the other hand, do. And when you submit the word "morning," you are fulfilling a contractual obligation that Google has with one of its clients, like the New York Times. And when you do this, you are helping to train one of Google's AI data sets. As one researcher says, you are doing unpaid labor that directly benefits the world's most powerful surveillance corporation. And even though the judge acknowledged this to be true, the class action lawsuit was dropped. In large part because the time involved for each user is extremely small. As a part of Dr. Searles's study, they estimated that users have spent more than 819 million hours taking these tests, which results in $6.1 billion of unpaid labor. Maybe we could find a way to turn a blind eye to all of this, as long as the tests served their purpose, but that doesn't seem to be the case. Google has said that they don't use the data collected from reCAPTCHA for targeted advertising, which actually scares me a bit more. 
If not for targeted ads, which is their whole business model, why is Google acting like an intelligence agency? So I dove into Google's 32,000 word terms of service. When you're writing a legal document, syntax matters. They only collect data as necessary. But because of this comma, the words "as necessary" only apply here. Not here. The word "improve," "general security purposes," those are keywords where you're like, okay, I see you clearly say it's not being used for personalized advertising. Congratulations, great. What does "improve" mean in that context? What does that "security purposes" entail? That's where Google is the art of saying nothing. This is Zach. He's a privacy watchdog, but he's also the co-creator of this website, a central hub dedicated to documenting the US's antitrust case against Google. Google has allowed themselves this "general security purposes" allowance to use the decisions and the data it collects for anything that they deem as security related. In 2015, Google failed an audit by the United Kingdom's ICO, saying that Google is too vague when describing how it uses personal data gathered from its services and products. Is sharing to the government considered security related? It probably could be defined that way by a creative lawyer. Check this out. If you want to submit a tip to the FBI, you're met with this notice acknowledging your right to anonymity. But even though the State Department doesn't use reCAPTCHA, the FBI and the NSA do. The federal court of 11 judges with the power to allow government to conduct electronic surveillance on you. And if they want to know who submitted the anonymous report, Google has to tell them. But the Foreign Intelligence Surveillance Court is anything but public. It's here somewhere inside this sprawling federal court complex. This is a court so secret, we don't even know exactly where it convenes inside the building. 
These companies, when they engage in a clandestine relationship with the government, as soon as they begin cooperating in part, they will ultimately end up cooperating in full. Look at this. I realize most people aren't submitting anonymous tips to the FBI. But listen. Earlier this month, 404 Media got its hands on internal emails from the Secret Service. They confirmed that the intelligence agency used a technology called LocateX, which uses location data harvested from ordinary apps installed on phones. Because users agreed to an opaque Terms of Service page, the Secret Service believes it doesn't need a warrant. Despite those apps often not saying that their data may end up with the authorities. Any information, any keystroke, any click, IP address, all the websites, all the browsing history. A real-time map of everything you do on the internet. It knows who you are. Google has essentially created a digital tollbooth with reCAPTCHA. They hide in the background, watching whatever you do before, during, and after you interact with it. Then it puts emphasis on the question, which human is this user? Rather than the ordinary, is this user human? If it can figure out who you are, it lets you through. It's also why this happened. Why can't a robot just click, I'm not a robot? Robots move like this. Okay, for the checkbox. It's so easy to test this. If Google can't figure out exactly who you are, let's say you clear your cookies, you're browsing incognito, maybe you're using a privacy-focused browser. They flag you as suspicious. You're grouped in with the bots, who are actually better at solving reCAPTCHAs than you are. And you still have to pay the toll in a different way. Today, that means labeling Google's machine learning data sets. In my experience, the more that I try to hide my identity from Google, the more data sets they make me train. Finally, if you reject those two prior options, reCAPTCHA, the tollbooth, stops you right there. 
The internet isn't open for you. In short, reCAPTCHA isn't about bots. It's about you. And US v. Google, what's the top-down on this situation? Here in the United States, a group of attorneys general and the Department of Justice are basically suing Google for a variety of monopolistic practices. And so the government's response was, we want you, Google, to get out of the advertising industry. So Google, as you can imagine that being one of the primary sources of revenue, said, okay, it looks like we're going to trial over this. I realize the irony of posting this video on YouTube, which is wholly owned by Google. We reached out to Google for comment. And by the time of filming this, they still haven't responded. So if they have, you'll see it on the screen somewhere right now. Google has all this power. What they have currently is empowering the data broker ecosystem, tons of this data flowing into government and data broker hands. What's interesting is, if Google is forced to sell their ad tech space, it may actually make privacy outcomes worse in the short term. Depending on who they sell it to, that company may be more evil than Google. And thanks again to Delete.me for sponsoring a portion of this video. Thanks so much to our Patreon supporters. You are my pride and joy, my muse. If you want to support us, hang out with us on Discord. I'd love to see you over there. Thanks for watching.