It turns out that a decentralized way of accessing information is more beneficial and efficient for humankind. Originally, information systems were designed around the client-server paradigm, which was completely aligned with a centralized mindset of sharing things. However, with the advent of Blockchain, there is a renewed focus on a decentralized model of information storage, access, and distribution.
IPFS probably emerged from the same thought process. It was developed by Protocol Labs as a new protocol suite for exchanging information over the world wide web. While still in its early days, IPFS shows a lot of promise considering the burgeoning rate at which the Internet is growing. One of the key indicators of this growth is the size of a web page over the years.
HTTP, the de facto protocol for transmitting web pages and their associated artifacts such as images and scripts, is struggling to keep pace with this growth. The obvious reason is that HTTP's centralized, location-based design was never built for today's scale of Internet traffic. While the end user may not notice this directly, system administrators, developers, and application engineers bear the brunt of poorly performing web pages.
HTTP's limitation lies in the fact that it is dependent on a location. Think of the traditional landline telephone: it is location dependent. If you have it at home, then for someone to reach you, you must be at home. HTTP works in the same way. For someone to access a web resource over HTTP, the HTTP server must be reachable at a fixed location (IP address) and must be available, which means that the server should be up and running.
One of the major problems with a fixed location is latency. Since the location of the server is fixed, clients that are far away from it experience high latency. This problem has long been recognized and addressed using CDNs. A CDN caches the content of a web resource and serves it from the server nearest to the user by doing intelligent routing based on DNS.
A CDN solves the latency problem by making an entire copy of the web content and serving it from the nearest location. However, this does not remove the dependency on location itself.
But why is this dependence on location such a big issue?
Server load - In the early days of the Internet, it was all about static HTML pages with a few images and script and style files. Back then, web pages were pretty simple and very lightweight. Now, with the burgeoning size of web pages, along with rich media files and other web resources, file sizes are huge. This puts additional load on the servers to serve the files without any interruption.
The existing ways of handling HTTP requests work fine, but with rising Internet traffic, the costs keep escalating: managing servers at an optimum load to handle user requests, and provisioning disk space to accommodate the content.
IPFS takes a radical approach that is partly similar to the concept of a CDN. While a CDN follows a distributed storage approach for entire files, IPFS proposes distributed storage and addressing for file segments.
IPFS envisages a network of nodes that store not entire file copies, but file segments. Further, each file segment is uniquely identified by a cryptographically generated hash, which ensures that any duplicate of a file segment can point to the original segment rather than being stored again.
How does this help?
The onus of transferring a file is now distributed among multiple servers, known as peers. There is no longer a single server serving the file. Instead, multiple peers store the file segments and cooperate to serve the file. With this arrangement, a client requesting a file handshakes with multiple serving peers to obtain the file segments and sequences them to form the complete file. For small files, this may be overkill. However, for larger files, it can significantly reduce the load on any individual serving peer.
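The idea of fetching segments from several peers and sequencing them back into a file can be illustrated with a small sketch. This is a toy simulation, not real IPFS: the "peers" are in-memory dictionaries, whereas IPFS actually exchanges blocks over the network via its Bitswap protocol.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor

def make_segments(data, size=4):
    # Split data into fixed-size chunks and key each by its SHA-256 hash.
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    return [(hashlib.sha256(c).hexdigest(), c) for c in chunks]

original = b"hello from a peer-to-peer network"
segments = make_segments(original)
manifest = [h for h, _ in segments]          # ordered list of segment hashes

# Distribute the segments round-robin across three simulated peers.
peers = [dict(), dict(), dict()]
for i, (h, c) in enumerate(segments):
    peers[i % 3][h] = c

def fetch(segment_hash):
    # Ask each peer until one has the requested segment.
    for p in peers:
        if segment_hash in p:
            return p[segment_hash]
    raise KeyError(segment_hash)

# Fetch segments in parallel, then reassemble them in manifest order.
with ThreadPoolExecutor(max_workers=3) as ex:
    parts = list(ex.map(fetch, manifest))
reassembled = b"".join(parts)
print(reassembled == original)  # True
```

The manifest (the ordered list of hashes) is what lets the client verify and sequence segments regardless of which peer supplied each one.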
Every file is composed of uniquely identifiable segments, each with a unique hash address. This way, redundant copies of a segment can simply point to the segment's unique address, so content replication is limited and redundant storage is avoided. For smaller files, redundant copies do not hurt disk space, but for large media and document files, uncontrolled redundant storage does pose a storage and backup capacity problem.
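The deduplication benefit of hash-addressed segments can be shown with a minimal sketch of a content-addressed store. This is an illustration of the principle only; the chunk size and file contents are arbitrary choices for the example.

```python
import hashlib

# Minimal content-addressed store: segments are keyed by the SHA-256 hash
# of their bytes, so identical segments are stored exactly once.
store = {}

def put_segment(data: bytes) -> str:
    h = hashlib.sha256(data).hexdigest()
    store.setdefault(h, data)   # a duplicate segment maps to the same key
    return h

def chunk(data: bytes, size: int = 8):
    return [data[i:i + size] for i in range(0, len(data), size)]

# Two files that share a common prefix (e.g. a common header).
file_a = b"shared header|unique body A"
file_b = b"shared header|unique body B"

manifest_a = [put_segment(c) for c in chunk(file_a)]
manifest_b = [put_segment(c) for c in chunk(file_b)]

# The identical leading chunks of both files share one stored copy.
print(manifest_a[0] == manifest_b[0])                   # True
print(len(store) < len(manifest_a) + len(manifest_b))   # True: space saved
```

Each file is described by its manifest of hashes, while the store holds each unique segment once, which is exactly why duplicated large files stop multiplying storage costs.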
So there you are. IPFS would enable a whole new way of communicating and exchanging information between devices connected to the Internet. It is no longer a client-server-centric architecture. It is a peer-to-peer network, with distributed content delivery based on content addressability rather than on server addressability.
This content-based addressing is complemented by a companion protocol called IPNS, which allows users to publish a mutable name that points to their content, just like a domain name points to a web hosting server. Further, peers perform different roles in coordinating the storage and retrieval of files.
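The relationship between immutable content hashes and a mutable IPNS-style name can be sketched as follows. This is a toy model, not the real IPNS protocol: in actual IPNS the name is derived from a key pair and the record is cryptographically signed, whereas here a plain dictionary stands in for the name system.

```python
import hashlib

content_store = {}   # immutable: hash -> bytes
name_records = {}    # mutable: name -> current content hash

def add_content(data: bytes) -> str:
    # Content is addressed by the hash of its bytes; a new version
    # of the content necessarily gets a new address.
    h = hashlib.sha256(data).hexdigest()
    content_store[h] = data
    return h

def publish(name: str, content_hash: str):
    # Repoint the stable name at the latest content hash
    # (real IPNS signs this record with the publisher's key).
    name_records[name] = content_hash

def resolve(name: str) -> bytes:
    return content_store[name_records[name]]

site_v1 = add_content(b"<h1>My site, v1</h1>")
publish("my-site", site_v1)

site_v2 = add_content(b"<h1>My site, v2</h1>")
publish("my-site", site_v2)   # same name, new content hash

print(resolve("my-site"))     # the v2 content
```

The name stays stable while the content hash underneath it changes, which is what makes updatable websites possible on top of immutable content addressing.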
Many IPFS services have emerged in the recent past. Here is a list of a few popular IPFS file hosting services that you can use for storing content. Before you decide to try out any service, be sure to check its level of privacy protection on IPFS.
- A Google Drive-like UI for storing your files on IPFS.
- Another IPFS hosting service that additionally supports an SDK for programmatically managing file storage on its IPFS cloud.
- A service to organize and share your photos on IPFS-backed storage.
It is highly unlikely that IPFS will overtake HTTP and become the default protocol for file and content exchange over the Internet. That is not going to happen in the coming years, and even if IPFS inches closer to displacing HTTP, it cannot achieve that all alone. Just as the Internet has evolved over the years around a centralized ideology, the underlying protocols that carry Internet traffic have evolved in a similar fashion. Hence, changing the entire underlying infrastructure to a decentralized mindset requires a humongous effort. Maybe the advent of SDN and NFV, along with some intelligent routing protocols, can aid in the large-scale adoption of IPFS or similar protocols built with a decentralized ideology.
However, on a smaller scale, IPFS does have some potential worth exploring. Here are possible use cases where IPFS might excel over HTTP without requiring too much change to the Internet as a whole.
1. File Sharing for DApps: DApps are a new phenomenon, thanks to the popularity of Blockchain. A DApp is an application whose back-end code runs on a decentralized peer-to-peer network. If a DApp requires file storage, those files can be hosted on IPFS. As an example, if there is a DApp for search and analysis of legal documents, then the document copies can be stored on a few IPFS nodes for reliable and faster access.
2. Localized Geographical Sharing: Similar to the DApp scenario, if a set of users from one geographical area share some files, those files can be hosted on IPFS for better access. IPFS's performance may not scale well if file segments have to be fetched from peers across the entire global span of the Internet. Hence, localized hosting is a better option and can save a lot of bandwidth for the ISP as well.
3. Archival Systems: IPFS excels for large files, where the burden of transferring a file is shared by multiple peers. Hence, it could be the ideal choice for hosting file archives. Corporate intranets and media houses that have to store and retain huge volumes of large files can leverage this technology for a more efficient archival system, compared to traditional archival methods that consume more space, raising costs and making retrieval inefficient.
If you are keen to take a deep dive into IPFS then try out the official IPFS guide to launch your own local IPFS node.
We will keep monitoring this exciting technology and bring you new services and use cases conceived around IPFS. As more and more people adopt it, we hope that a few years from now, IPFS and other similar distributed storage technologies will replace the existing protocols and make the Internet faster and more open.
Shyam is the Creator-in-Chief at RadioStudio. He is a technology buff and is passionate about bringing forth emerging technologies to showcase their true potential to the world. Shyam guides the team at RadioStudio, a bunch of technoholiks, to imagine, conceptualize and build ideas around emerging trends in information and communication technologies.