YouTube as Infinite Storage? (film, 5min)
BK Binary set out to test the idea of using YouTube as an infinite storage medium for data. The basic premise is that YouTube allows for unlimited video uploads, theoretically enabling the storage of limitless data files encoded within videos. The author considered various methods to carry out this process, with the first idea involving manipulating the mp4 file structure to embed raw data. However, he quickly discovered that the complexity of the mp4 file structure leads to numerous issues regarding data compression and corruption, complicating the task significantly.
BK Binary's next idea was to store the data visually. Any file on a computer is a sequence of 0s and 1s, so by assigning a color to each binary value (for example, black for 0 and white for 1) he could draw a file's contents as an image. He chose a 1080p canvas because YouTube supports that resolution: at one bit per pixel, a 1920x1080 frame holds 2,073,600 bits, or about 260,000 bytes. His own implementation stores 2 bits per pixel by using more colors, which doubles the data per frame and shortens the final video. The frames are then assembled into a video with libraries such as OpenCV2 and FFmpeg.
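The bit-to-pixel mapping described above can be sketched in a few lines of Python. This is a minimal illustration, not the author's code: the names `bit_at` and `encode_frame`, the most-significant-bit-first order, and the zero padding of a short final chunk are all assumptions, and it uses the simpler one-bit-per-pixel scheme rather than the 2-bits-per-pixel variant.

```python
WIDTH, HEIGHT = 1920, 1080
BYTES_PER_FRAME = WIDTH * HEIGHT // 8  # 259,200 bytes at 1 bit per pixel

BLACK, WHITE = 0, 255  # grayscale values standing in for bit 0 and bit 1


def bit_at(chunk: bytes, x: int, y: int, width: int = WIDTH) -> int:
    """Return the bit that belongs at pixel (x, y) of the canvas."""
    index = y * width + x                      # position of the bit in the chunk
    byte_index, bit_offset = divmod(index, 8)  # which byte, which bit inside it
    if byte_index >= len(chunk):
        return 0                               # assumption: pad short chunks with 0s
    # Shift the wanted bit to the rightmost position, then mask it off.
    return (chunk[byte_index] >> (7 - bit_offset)) & 1


def encode_frame(chunk: bytes, width: int = WIDTH, height: int = HEIGHT):
    """Turn one chunk of the file into a height x width grid of pixel values."""
    return [
        [WHITE if bit_at(chunk, x, y, width) else BLACK for x in range(width)]
        for y in range(height)
    ]


# Tiny 4x2 canvas: the byte 0xA5 is 10100101 in binary.
print(encode_frame(b"\xa5", width=4, height=2))
# [[255, 0, 255, 0], [0, 255, 0, 255]]
```

Writing the resulting grids out as video frames would be the job of a library such as OpenCV or FFmpeg and is left out of this sketch.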
However, creating the video is only half the challenge. To retrieve the encoded information, BK Binary had to construct a decoder that operates in reverse to the encoder. He explains that the decoder splits the video into frames and reads the encoded data before writing it to an output file. Both processes require similar calculations, hence the author refrains from delving into the details of the decoder, allowing more focus on the practical limitations of the project.
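A decoder for a single frame might look like the sketch below. It assumes the frame has already been extracted from the video as rows of grayscale values, that bits were written most-significant-bit first, and that any pixel at or above a mid-gray threshold counts as a 1. The name `decode_frame` and the threshold value are illustrative choices, not taken from the author's code.

```python
def decode_frame(frame, n_bytes: int, threshold: int = 128) -> bytes:
    """Read n_bytes back out of a frame given as rows of grayscale pixels."""
    # Flatten the frame row by row, mapping bright pixels to 1 and dark to 0.
    bits = [1 if pixel >= threshold else 0 for row in frame for pixel in row]
    if n_bytes * 8 > len(bits):
        raise ValueError("frame is too small to hold n_bytes")

    out = bytearray()
    for i in range(n_bytes):
        value = 0
        for bit in bits[i * 8:(i + 1) * 8]:
            value = (value << 1) | bit  # rebuild the byte, MSB first
        out.append(value)
    return bytes(out)


# Slightly noisy pixels still decode to the byte 0xA5 (10100101).
noisy = [[250, 3, 240, 10], [5, 251, 12, 200]]
print(decode_frame(noisy, 1))  # b'\xa5'
```

The caller has to know how many bytes the last frame really carries, for example by storing the file length separately. How the author handles that is not stated in the video.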
Despite successfully converting a file to a video and back, BK Binary highlights several factors that make the method impractical for real-world use. The key issue is video compression: it alters individual pixels, so data stored at one or two bits per pixel does not survive. To get a lossless round trip he had to enlarge each bit into a block at least four pixels wide, so each bit occupies 16 pixels. Under his setup, the resulting video is about 20x larger than the original file, which makes the concept more of an experiment than a practical storage solution. He expected this outcome and has ideas for reducing the overhead, but he emphasizes that the project's purpose was to explore the feasibility of file-to-video conversion, a goal he accomplished.
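The cost of spreading each bit over a 4x4 block of pixels, as described in the transcription, can be illustrated with a short sketch. The helper names, the one-bit-per-block assumption, and the averaging read-back are hypothetical. The 20x figure itself depends on the codec and settings, so it cannot be reproduced from this arithmetic alone.

```python
def frame_capacity_bytes(width=1920, height=1080, block=4, bits_per_block=1):
    """Bytes one frame can carry when each symbol fills a block x block square."""
    blocks = (width // block) * (height // block)
    return blocks * bits_per_block // 8


def upscale(bits_grid, block=4):
    """Repeat every bit block x block times so it survives lossy compression."""
    frame = []
    for row in bits_grid:
        pixel_row = [255 if bit else 0 for bit in row for _ in range(block)]
        frame.extend(list(pixel_row) for _ in range(block))
    return frame


def read_block(frame, bx, by, block=4, threshold=128):
    """Recover one bit by averaging all pixels of its block."""
    total = sum(
        frame[by * block + dy][bx * block + dx]
        for dy in range(block)
        for dx in range(block)
    )
    return 1 if total / (block * block) >= threshold else 0


print(frame_capacity_bytes(block=1))  # 259200 bytes per frame at 1 bit per pixel
print(frame_capacity_bytes(block=4))  # 16200 bytes per frame with 4x4 blocks
```

With these assumptions, a 1080p frame carries 16 times less data than the one-bit-per-pixel scheme, which is why the video becomes so much longer for the same file.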
As of this writing, the YouTube video has garnered 777,661 views and 39,519 likes, indicating that the idea has indeed captured attention. Fans of technology and innovation may appreciate BK Binary's unique approach to repurposing the platform. The author invites viewers to check out his code hosted on GitHub and encourages feedback or suggestions for further project development. As technology continues to evolve, there may well be better methods to utilize YouTube as a data storage medium.
Timeline summary
- Introduction to using YouTube for infinite cloud storage.
- Explaining the concept of uploading unlimited videos to store data.
- Theory behind encoding and decoding files into videos.
- Challenges of overwriting video data in mp4 format.
- Complications like compression and error correction in mp4.
- Visual data storage by encoding binary values.
- Setting a canvas size for storing information.
- Storing the first 260,000 bytes of the file into an image.
- Details on writing bits to corresponding colors for visual storage.
- Creating video frames using libraries for processing.
- Need for a decoder to retrieve stored information.
- Discussing the impracticality of the method due to video compression.
- The increase in file size when converting to video format.
- Conclusion about the project's goals and outcomes.
- Invitation to view code and provide feedback.
Transcription
Using YouTube as an infinite cloud storage medium isn't a new idea, but I wanted to test the functionality and practicality of it myself. The basic idea is that YouTube lets you upload unlimited amounts of videos, and therefore infinite amounts of data. If you were to somehow encode a file into a video as well as decode that video back to the original file, you'd theoretically be able to store as many files as you want, completely for free.
There's a couple different ways I thought of going about this. If you were to analyze the file structure of an mp4 and sneak in raw bytes of the file you want to encode in such a way that you could retrieve the data back, you would be golden. Seems easy enough, right? Just write over the parts of the mp4 that contain the video data, since it wouldn't matter what the video was displaying. Turns out it's not that easy. The file structure for an mp4 is surprisingly complex, and there's lots of issues that arise from simply overwriting the video part of an mp4. You have to account for compression, error correction codes, corruption, and so much more.
Which gave rise to my second idea, storing the data visually. We know any file on a computer is just stored as a sequence of 1s and 0s on your disk. If we take those binary values and assign a color to them, let's just say 0 is black and 1 is white, we can store any file visually, given a big enough canvas. We can assign a canvas to be a discrete chunk of information of a file, and store the canvases sequentially in a video file to put all of the information into video format.
Okay great, so now we have the basic idea of the video encoder, so let's look at some of the code. First thing we need to do is define a canvas size for the information we're going to be storing. Since YouTube allows for 1080p video, I chose that resolution. If we do some calculation, we can store 1920x1080 bits of information in one canvas, or about 260,000 bytes.
Next, we'll take the first 260,000 bytes of the file we intend to encode, store them, then create the image represented by those bytes. We loop through each XY position of the canvas, writing the corresponding bit to each pixel. We calculate which bit to write with a little bit of binary math. First, we take the corresponding byte with this equation, then bit shift the byte to the right and take the rightmost bit with this equation. This is the bit we need to write at this specific XY position. We then convert the bit to its corresponding color and put it on the canvas.
In my implementation of this, I stored 2 bits per pixel of canvas, assigning these values to different colors. This allows us to double the amount of information stored in each frame, shortening the final video. After all this, we take each of those images and use them as frames of a video, giving us an output like this. Luckily, instead of having to create a video file from scratch and editing the data directly, we have libraries such as OpenCV2, FFmpeg, and LodePNG to do this process for us.
Well, that's the end of this. We've successfully stored whatever file we want as a video. But how do we get that information back? Well, we have to write a decoder as well. The decoder works essentially the same way as the encoder, but in reverse. We take the video, split it up into all its different frames, then read the data from each of the frames and write the corresponding data to an output file. The math encoding for the decoder works in essentially the same way as the encoder, so I won't go too far into detail for it.
Now, before you go and drop your Google Drive subscription, there are quite a few factors that make this method completely impractical and more of just a fun proof-of-concept project. Storing with the method I've just talked about completely destroys the file because of one factor, video compression.
Video compression is a necessary component of video storage, since completely uncompressed video takes up vast amounts of data. So rather than storing a bit or multiple bits in each pixel of each frame, we have to expand the size each bit takes up. To achieve lossless conversion, I've found I need at least four-pixel-wide information blocks, meaning each bit takes up 16 pixels, which drastically increases storage size. With this factored in, with my setup, a file sees about a 20x increase in size when converted to its video counterpart.
I knew this would be the result of the project before I started, but 20x still seems a little bit ludicrous to me. There are a few ideas I have to bring this factor down a fair bit, but I'm not going to bother to implement them, since the point of this project was just to successfully convert a file to a video and back to a file again, which is something I've achieved. But more importantly, just to complete a fun project.
Anyways, if you want to look at the code, the GitHub link will be in the description, and if you have any questions or tips on improvement, let me know, and thanks for watching.