Decentralizing DSpace Backups using IPFS

With the Archive token (ARCH) deployed and https://knowledgearc.io live, the team is now focussed on implementing our decentralized archiving ecosystem.

In the beginning…

KnowledgeArc.Network is already under development. We have spent the past year tokenizing various academic material. Using a combination of ERC721 non-fungible tokens, IPFS and OrbitDB, we have been developing an open-source, distributed archiving ecosystem that can be implemented by anyone. Using our Archive token, community members will be incentivized to participate in the creation of a truly open, decentralized academic platform.

The next iteration

Everything up until now has been research and prototyping. Now it is time to implement real-world solutions that can be used in production.

Our first implementation will be decentralized backups for our existing archiving platform. The KnowledgeArc Platform is built upon multiple open source technologies, the central pillar being DSpace, an open source digital asset management system that provides an out-of-the-box archiving solution. However, the backup solutions available for this technology are centralized and often costly. Our aim is to provide a distributed solution which is cost effective and engages the community, both users and providers.

Development has only just begun, but we are already deploying a number of IPFS nodes coordinated through IPFS Cluster to provide persistent, decentralized storage for DSpace assets. We are also developing tools which integrate with the DSpace platform; these tools will push assets directly to IPFS Cluster in real time, ensuring they are backed up with minimal chance of data loss. More importantly, backups will not rely on one or two big data repositories, each of which would be a single point of failure.
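To make the idea of pushing an asset to the cluster more concrete, here is a minimal sketch in Python. It is not our integration code. It assumes an IPFS Cluster peer whose REST API listens on 127.0.0.1:9094 and accepts a multipart upload on its /add endpoint; check both against your own cluster configuration. The function name build_add_request is hypothetical, and the sketch only builds the request so it can be inspected before anything is sent.

```python
import urllib.request
import uuid
from pathlib import Path


def build_add_request(asset_path, api_url="http://127.0.0.1:9094"):
    """Build (but do not send) a multipart POST that uploads one asset.

    Assumption: the cluster's REST API is reachable at api_url and
    exposes an /add endpoint that accepts multipart/form-data.
    """
    path = Path(asset_path)
    boundary = uuid.uuid4().hex  # random separator between form parts

    # A single-part multipart body carrying the raw bytes of the asset.
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        (
            'Content-Disposition: form-data; name="file"; '
            f'filename="{path.name}"\r\n'
        ).encode(),
        b"Content-Type: application/octet-stream\r\n\r\n",
        path.read_bytes(),
        f"\r\n--{boundary}--\r\n".encode(),
    ])

    return urllib.request.Request(
        f"{api_url}/add",
        data=body,
        method="POST",
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
    )
```

Passing the returned request to urllib.request.urlopen would perform the upload against a running cluster. A real-time integration would call something like this whenever DSpace stores a new asset, with retries and authentication added.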

What else can be backed up?

DSpace not only stores assets; metadata is key to describing what those assets represent. This metadata should also be backed up in a distributed manner, but current backup solutions do not provide tools for robust metadata backup. Metadata is as important to the archive as the assets it describes, and ensuring its permanence is vital to a complete backup plan. This is where OrbitDB plays a key role…

OrbitDB is a distributed database built on top of IPFS. Once we have implemented asset backups using IPFS, next on our roadmap will be pushing metadata to OrbitDB.
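As a rough illustration of what a metadata record might look like before it is handed to a distributed document store, here is a small Python sketch. It does not talk to OrbitDB. The function name metadata_document, the "_id" key and the checksum field are all assumptions made for this example, and the sample field names simply follow the Dublin Core style that DSpace uses for item metadata.

```python
import hashlib
import json


def metadata_document(handle, fields):
    """Return a record describing one item's metadata.

    handle: the item's identifier, used here as the document key.
    fields: a mapping of metadata field name to a list of values.
    """
    # Sort the field names so that identical metadata always serializes
    # to the same bytes. The order of values within a field is kept
    # because it can be meaningful (for example, author order).
    normalized = {key: list(fields[key]) for key in sorted(fields)}
    canonical = json.dumps(
        normalized, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return {
        "_id": handle,  # hypothetical key name for the document store
        "metadata": normalized,
        # The checksum lets a backup tool detect whether metadata changed.
        "checksum": hashlib.sha256(canonical.encode("utf-8")).hexdigest(),
    }
```

Because the checksum depends only on the metadata itself, a backup tool could compare it against the previously stored value and push a new record only when something has actually changed.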

Incentivizing the community

An obvious use-case for the Archive token is accepting it as a form of payment for backing up assets to the decentralized cluster. However, we also envisage providers being able to connect their own IPFS nodes to our backup ecosystem and receive compensation in Archive. Users can rest assured that their data is truly distributed and providers can actively participate in backups and be rewarded for their efforts.

We envisage a marketplace where providers can register their IPFS Cluster nodes and users can rent backup space. When a provider is selected, they are added to the list of backup nodes the user controls and the user’s data is written to the provider’s infrastructure. Users will be able to select the provider nodes they want, allowing them to decide how many replicas they need and in which locations they would like them stored. Archive will be used to pay for resources and providers can price their nodes as they see fit. Users can choose the best offer for them, whether that means the lowest cost or the most reliable provider.
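The selection logic described above could be sketched along these lines. This is only an illustration of the idea, not a design for the marketplace: the Provider fields, the reliability score and the function names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Provider:
    name: str
    region: str
    price_per_gb: float  # price in Archive per GB, set by the provider
    reliability: float   # hypothetical score between 0.0 and 1.0


def select_providers(providers, replicas, regions=None, prefer="price"):
    """Pick `replicas` providers, optionally limited to certain regions."""
    candidates = [
        p for p in providers if regions is None or p.region in regions
    ]
    if len(candidates) < replicas:
        raise ValueError("not enough providers match the requested regions")

    if prefer == "price":
        # Cheapest first; ties go to the more reliable provider.
        def rank(p):
            return (p.price_per_gb, -p.reliability)
    elif prefer == "reliability":
        # Most reliable first; ties go to the cheaper provider.
        def rank(p):
            return (-p.reliability, p.price_per_gb)
    else:
        raise ValueError("prefer must be 'price' or 'reliability'")

    return sorted(candidates, key=rank)[:replicas]


def backup_cost(selected, size_gb):
    """Total cost in Archive of storing size_gb on every selected provider."""
    return sum(p.price_per_gb * size_gb for p in selected)
```

A user wanting two replicas in Europe from the most reliable providers would call select_providers(providers, 2, regions={"eu"}, prefer="reliability") and pass the result to backup_cost to see what the backup would cost in Archive.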

The ultimate goal

So what’s the point of working on something like backups? Isn’t this about blockchain, decentralization and the democratization of data? Isn’t this about removing the middleman (or middleware) and replacing it with something that works anywhere, anytime?

Academic institutions can be reluctant to embrace nascent technologies. This is understandable, as they invest huge amounts in the technology they implement and need to be able to commit to solutions which can stand the test of time. Archives are a perfect example: organizations have to think in terms of decades when it comes to storing academic material, which must be immutable and permanent.

Introducing a decentralized backup system based upon IPFS and OrbitDB allows institutions to “dip their toe” into the new “distributed web” and gives them time to evaluate the full impact of blockchain and cryptocurrency technologies on their archiving requirements; they can implement decentralized technologies without having to fully commit to them. And when the time is right, these organizations will be able to migrate to this new archiving ecosystem without the headache of a major shift in technology.
