Archives and InterPlanetary Linked Data: The Perfect Fit

Image by Pete Linforth from Pixabay

Decentralised, Peer-To-Peer, Blockchain-Based Academic Archiving & Publishing Ecosystem

We’re well underway with our development of KnowledgeArc Network, the decentralised, peer-to-peer, blockchain-based academic archiving and publishing ecosystem.

Latest Decentralised Technologies

Employing the latest decentralised technologies such as the InterPlanetary File System (IPFS), OrbitDB and Ethereum, we’re solving the need for truly permanent repository solutions: immutable, censorship-resistant and built to last.

The InterPlanetary Linked Data (IPLD) model provides us with a new way to describe the information stored on KnowledgeArc Network, along with the metadata that describes it.

IPLD is the data model at the heart of IPFS, a decentralised peer-to-peer file storage system.

Location-Based vs. Content-Based Addressing

The brilliance of IPFS comes from the way it references what it stores: instead of the traditional method of using a location-based address to point to a document or file, IPFS identifies the file by a unique identifier, or hash, derived from its contents. This is known as content addressing. Still not clear?

Here’s a brief example:

Let’s say I have some very important research I’ve undertaken and I want to distribute it to as many people on the web as I can reach.

Using some kind of web-publishing software, I upload this important research in a PDF I’ve called “my-academic-research-that-will-change-the-world.pdf” from my computer to a web site somewhere: https://my.uni.edu

I put it in a folder called “research-papers”.

I then distribute this file to everyone, with the web address https://my.uni.edu/research-papers/my-academic-research-that-will-change-the-world.pdf.

It’s critical that this research is easily accessible, unchangeable, and built to last for generations.

There are a number of problems with this solution:

  • What happens if the web site address changes?
  • What happens if the web site gets moved?
  • What happens if the web site administrator decides to change the research papers path to something else?
  • What happens if this file gets deployed to other web sites?
  • How do I ensure people know that the file my-academic-research-that-will-change-the-world.pdf contains MY research and hasn’t been changed, manipulated or completely replaced?

Basically, the problem we have is that this important research cannot be guaranteed to:

  • be accessible using the same address
  • be unique
  • be tamper-resistant

IPFS Solves These Problems

IPFS solves these problems by generating an identifier or “hash” based on the contents of the file. The probability of two different files producing the same hash is so vanishingly small that every hash can be treated as unique.

By generating a hash based on the contents of my-academic-research-that-will-change-the-world.pdf, I can now locate this file no matter where it’s stored.

Additionally, IPFS includes a network of computers which can store this file so that it can be found from the network rather than from one computer (for example, https://my.uni.edu).
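
Real IPFS identifiers are multihash-based CIDs, but the underlying idea is easy to demonstrate. Here is a minimal Python sketch using a plain SHA-256 digest as a stand-in for a CID:

```python
import hashlib

def content_id(data: bytes) -> str:
    """Derive an identifier from the bytes themselves (content addressing)."""
    return hashlib.sha256(data).hexdigest()

paper = b"My academic research that will change the world..."
print(content_id(paper))   # the same bytes give the same ID on any machine

# A single changed byte yields a completely different identifier,
# so any tampering is immediately detectable.
print(content_id(paper + b"!"))
```

Because the identifier is derived from the bytes themselves, every host serving the same bytes serves the same address, and any modified copy is instantly recognisable as a different document.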

Linked Data – Decentralised

IPLD is another part of this decentralised picture: it links pieces of content-addressed data together.

Again, taking our research ‘my-academic-research-that-will-change-the-world.pdf’, I can now add some metadata which I can link to the document.

Metadata might include:

  • the title of the paper
  • authors
  • keywords
  • an abstract from the paper
  • other important descriptive information

Additionally, I can extend the linked data to include whole profiles of the authors, or link keywords to other data so that similar works can be found. I can even store links to revisions, so the full change history of the paper can be retrieved from its most current version.

And The Benefit Of All Of This?

  1. Everything is uniquely identified based on its content.
  2. Corruption is easy to detect, because any change to the content changes its identifier.
  3. It’s censorship-resistant, because any tampered copy produces a whole new set of identifiers.
  4. And it’s easy to locate, because the identifier never changes.

In this model, CIDs are content identifiers. Metadata is captured as individual documents and can be compared for changes. Items store references to assets and metadata, and a chain of items provides a version history. Each keyword is also a data block, and multiple keywords can be linked to metadata.
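
To make this concrete, here is a toy Python sketch of that model. It is an illustration only: real IPLD blocks are encoded as DAG-CBOR or DAG-JSON with proper CIDs, whereas this toy serialises to JSON and hashes with SHA-256, and the PDF CID shown is hypothetical.

```python
import hashlib, json

store: dict[str, dict] = {}  # toy block store: CID -> block

def put(block: dict) -> str:
    """Store a block and return its content-derived identifier."""
    cid = hashlib.sha256(json.dumps(block, sort_keys=True).encode()).hexdigest()
    store[cid] = block
    return cid

pdf_cid = "bafy...pdf"  # CID of the uploaded PDF (hypothetical)

# Keywords are data blocks in their own right, linkable from any metadata.
kw1 = put({"keyword": "decentralisation"})
kw2 = put({"keyword": "archiving"})

# Metadata is a separate document linking to the keyword blocks.
meta = put({
    "title": "My academic research that will change the world",
    "authors": ["A. Researcher"],
    "keywords": [kw1, kw2],
})

# An item ties asset and metadata together; 'previous' links form a version chain.
v1 = put({"asset": pdf_cid, "metadata": meta, "previous": None})
v2 = put({"asset": pdf_cid, "metadata": meta, "previous": v1})

# Walk the chain from the newest version back through its history.
cur = v2
while cur is not None:
    item = store[cur]
    print(cur[:12], "->", item["previous"][:12] if item["previous"] else None)
    cur = item["previous"]
```

Changing any block, a keyword, the metadata or the asset, changes its identifier, which in turn changes every block that links to it; the history is tamper-evident all the way down.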

Follow The Developments

Follow our progress by checking out our code at GitLab

+Follow Knowledge Arc Network on LinkedIn

Join the Knowledge Arc Network discussion on Telegram

Covid-19: Mixed Messages & Mistrust (Information Series Part III)

Photo by visuals on Unsplash

During the Coronavirus pandemic, we have witnessed various examples of misinformation and narrative “flip-flopping” from a number of governing bodies, leading to confusion and a growing public mistrust of policy decisions.

Endless Examples of Misinformation

Since the start of this current health crisis, we’ve seen endless examples of misinformation from various bodies. This feeds mistrust of many official information channels, sometimes with good reason, but often to our detriment.

The crisis is also an opportunity for some to exploit fears and vulnerabilities, pushing an agenda that benefits a small minority.

Common Misleading Statements

Whether it was the common claim at the outset of the publicity surrounding the Coronavirus that it was “a mere ’flu”, or current claims that certain jurisdictions have the virus “well under control”, lines of misinformation create confusion at best. Ultimately, however, they harm society.

Mixed Messages

We need to address the mixed messages and baseless information, and allow doctors, researchers, academics and scientists to construct an impartial view of the disease:

  • What it looks like
  • How it spreads
  • How to combat it
  • How to eradicate it

We need free-flow of quality information, quantitative research and the sharing of ideas: a platform where scientists can publish their research objectively, without it being manipulated by others looking to exploit the current crisis.

Freely Distribute Information

KnowledgeArc aims to provide a way to freely distribute information without it being censored or filtered, and without silencing those who speak the truth. People need a source of truth: a trustless mechanism to validate and verify the veracity of information, especially as we see the rise of #deepfake and other mechanisms of truth subversion.

KnowledgeArc Network is working towards this goal using peer-to-peer decentralized storage and blockchain technology. It aims to provide a link back to the original source of information: research, findings, discoveries and opinions, all immutable and permanent on the blockchain.

Misinformation Will Be Easily Identified, Tracked And Isolated

We build on IPFS, OrbitDB and Ethereum: all information is verified for authenticity and integrity. Users are able to build tools around this technology stack to collate information around similar topics. These collection (or subject) repositories allow people to disseminate information about issues such as the Coronavirus without the truth being blurred or filtered. Misinformation can be easily identified, tracked and isolated.

In the meantime, look to our scientific and medical experts for an objective picture of the current crisis. Listen to trusted institutions, such as our universities, which collect, collate and verify important information that can be relied on by the wider global community.

+Follow Knowledge Arc Network on LinkedIn

The Tokenomics of Knowledge

Photo by Timo Volz on Unsplash

Academic research is a noble cause which adds to the repository of public knowledge. But those who undertake academic research take on a lot of personal responsibility and, ultimately, a lot of risk.

  • Risky research can result in career ruin
  • Costly research may fail to raise the necessary funding
  • New discoveries may supersede existing findings

Creators should be directly incentivised to push the boundaries of human knowledge, but existing processes financially reward the big players while the authors generally miss out.

What if there was a way for researchers to recuperate personal and financial costs directly? Maybe even generate revenue from their work? Could researchers generate financial value from their work, even during the research process?

Research Tokenomics

Tokenomics introduces a new method of revenue generation or self-funding without the need for an intermediary or “middle-man”. Just as cryptocurrencies take the bank out of a transaction between two parties, research tokenisation would take corporate funders and publishers out of the academic process.

Micro-Payments for Cited Work

One example of this would be micro-payments for cited work. When authors publish their work, the findings are often used by other researchers in their own studies to validate certain assumptions, building upon the work of others rather than creating concepts from scratch.

Research tokenomics would transfer a small amount of tokens to the original authors of the work every time it is referenced. The more useful or applicable the research, the more it is cited and the more tokens the authors can expect to earn. (Think of BAT* for content producers but in the academic space.)

(*BAT is Brave browser’s token. You can earn BAT by either watching ads or by authoring content. Others can contribute BAT when they consume content. This can either be a one-off payment or some kind of ongoing subscription. Instead of Google getting revenue for you consuming ads, or for you posting your content to Facebook who then monetise it, the end users are directly rewarded.)

The KnowledgeArc Network platform deploys smart contracts which track the citations of academic works and generate tokens, which are paid out to the original producers.
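
The contracts themselves are still in development, so purely as an illustration, here is a toy Python model of the accounting such a contract might perform. The reward rate and all names are hypothetical:

```python
from collections import defaultdict

REWARD_PER_CITATION = 1  # hypothetical ARCH paid per citation

class CitationLedger:
    """Toy model of citation-driven micro-payments (not the real contract)."""

    def __init__(self):
        self.authors = {}                 # work ID -> list of author addresses
        self.balances = defaultdict(int)  # author address -> tokens accrued

    def register(self, work_id, authors):
        self.authors[work_id] = authors

    def cite(self, citing_work, cited_work):
        # Each recorded citation pays out to every author of the cited work.
        for author in self.authors[cited_work]:
            self.balances[author] += REWARD_PER_CITATION

ledger = CitationLedger()
ledger.register("paper-1", ["0xAlice"])
ledger.register("paper-2", ["0xBob"])
ledger.cite(citing_work="paper-2", cited_work="paper-1")
print(ledger.balances["0xAlice"])  # 1: the cited author earns with every citation
```

The more a work is cited, the more its authors accrue, with no publisher in the middle of the payment.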

Researchers Could Raise Tokens Before Research Is Completed

Potentially, researchers could even raise tokens before and during the research process, introducing a funding dimension to the tokenomic model.

Ultimately, authors could be rewarded for the huge burden they take on as creators of knowledge.

Find out more about how KnowledgeArc Network is revolutionising how researchers can directly profit from their work.

+Follow KnowledgeArc Network on LinkedIn

Covid-19 & The Paywall Dilemma (Information Series Part II)

As the Coronavirus crisis deepens, quality information is critical to individual, community, state and national preparedness. Staying informed should be easy in the “digital age”, and it is, but at a considerable cost, both financial and in terms of human health.

Some very large publishers have built substantial revenue streams by restricting access to valuable data. Using paywalls and subscription services, these organisations charge handsomely for material they do not author.

As the “middleman” they can charge sizeable access fees which are too costly for most individuals and smaller institutions, especially in developing countries.

Subscriptions also require a large upfront payment, something that’s unattractive to someone simply looking for a particular piece of information.

In recent years, there’s been growing concern around the monetisation of academic research which is:

a) in the best interests of the public

b) funded by the public purse

Europe has been taking a strong stance on ensuring that publicly funded academic research is available for free, and there has been increased scrutiny of the limitations that paywalls and other subscription-based models place on access to medical and other scientific research.

And the Coronavirus has only reinforced the negative impact of paywalls on the dissemination of life-saving information, and the real-world consequences for people’s ability to find quality research.

Researchers and authors do need to be compensated for their efforts, but opportunistic “middle-men” should not be entitled to profiteer off the hard work of others.

+Follow us on LinkedIn

What Covid-19 Has Taught Us About Knowledge Management (Information Series Part I)

One thing the Coronavirus outbreak has shown us is that quality information, grounded in quantitative research and professional recommendations, is key to ensuring the public is well-informed and fully educated about a wide-scale health issue (or any issue, for that matter).

Photo by 🇨🇭 Claudio Schwarz | @purzlbaum on Unsplash

Subject repositories (or discipline repositories) collect information, based on academic research, about a particular subject or area of interest. They provide a one-stop shop for quality information, collating educational material, findings and other supporting documentation in a single location. Subject repositories should use well-researched scholarly information; this information should be verified for authenticity, and its source should be easy to trace.

Subject repositories are even more important in a decentralised world. Information can be stored and hosted across a number of disparate systems, which is perfect for circumventing the influence of nefarious parties looking to control the narrative, or to benefit from playing up or playing down its impact… But by its very nature, decentralised data is difficult to find, search across and draw meaningful conclusions from.

In a decentralised world, subject repositories will be the gathering points for information from a wide range of sources. It will be more important than ever to attach a pseudonymous path to the original material, ensuring both the integrity and truthfulness of the data while protecting the privacy of the source, especially in regimes which single out or punish purveyors of quality scientific information.

KnowledgeArc Network offers some mind-blowing alternatives to the way ‘the asset of knowledge’ has been managed to date…

When data is archived on a blockchain the information remains:

1. Immutable – the data can never be changed or corrupted

2. Persistent – it will last forever

3. Unique – there is no other information like this, it’s the single source

4. Open – the data is publicly accessible so others can build on the knowledge created

Come +Follow KnowledgeArc Network on LinkedIn for the latest tech and team articles, giving information-power (and potentially tokens) back to the academic researcher and ultimately the communities (who need it most).

August in review – Knowledge Identifiers

This August, we’ve been quietly working on our next product: Knowledge Identifiers, a decentralized proof of ownership for scholarly works. Knowledge Identifiers work like existing systems such as Digital Object Identifiers (DOI) and Handle.net, but are not controlled by a single authority. Instead, a combination of smart contracts, decentralized file storage and database systems, and traditional web apps will power this new permanent identification solution.
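
The functional specification is still being drafted (see below), but the core idea, a registry that permanently maps an identifier to a content hash under its owner’s sole control, can be sketched in a few lines of Python. Everything here, including the identifier scheme and addresses, is hypothetical:

```python
class IdentifierRegistry:
    """Toy model of a Knowledge Identifier registry (hypothetical design)."""

    def __init__(self):
        self.records = {}  # identifier -> {"owner": address, "target": content hash}

    def register(self, identifier, owner, target_cid):
        if identifier in self.records:
            raise ValueError("identifier already taken")  # identifiers are permanent
        self.records[identifier] = {"owner": owner, "target": target_cid}

    def resolve(self, identifier):
        return self.records[identifier]["target"]

    def update(self, identifier, caller, new_cid):
        # Only the owner may repoint the identifier to a new version.
        if self.records[identifier]["owner"] != caller:
            raise PermissionError("only the owner can update this identifier")
        self.records[identifier]["target"] = new_cid

registry = IdentifierRegistry()
registry.register("ka:paper-1", owner="0xAlice", target_cid="bafy...v1")
registry.update("ka:paper-1", caller="0xAlice", new_cid="bafy...v2")
print(registry.resolve("ka:paper-1"))  # resolves to the latest version
```

Implemented as a smart contract rather than an in-memory dictionary, the same rules would hold with no single authority able to seize or delete an identifier.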

We will be making our functional specification for the development of Knowledge Identifiers publicly available, and we will post the link to this document shortly. We welcome participation in developing this new solution.

You can also follow our technical progress on Gitlab.

Other Developments for August in Review

We continue to discuss the coming decentralization of archived information with key players in the industry. We seek their expertise on how legacy systems function and how they can be improved through the use of blockchain technologies.

July 2019 in review

July may have been light on news, but there were a lot of developments that will improve KnowledgeArc.Network’s technology moving forward.

Using ARCH to cover Ethereum network costs

We have been investigating the concept of zero gas charges for our upcoming smart contracts. This means that you will not have to hold Ether, the default currency for handling any transaction fees on the Ethereum blockchain, when dealing with our smart contracts. Instead, all fees will be handled using Archive (ARCH) tokens which should aid in onboarding new users to the decentralized archive.
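
One common pattern for token-denominated fees (mentioned here purely as an illustration, not a committed design) is the meta-transaction: a relayer pays the Ether gas cost and recoups it from the user in tokens. A toy Python sketch of the accounting, with a hypothetical exchange rate:

```python
ARCH_PER_GAS_UNIT = 2  # hypothetical gas-to-ARCH exchange rate

def relay_transaction(user_arch_balance: int, gas_used: int) -> int:
    """Relayer pays the Ether gas; the user is charged in ARCH instead."""
    fee = gas_used * ARCH_PER_GAS_UNIT
    if user_arch_balance < fee:
        raise ValueError("insufficient ARCH to cover the network fee")
    return user_arch_balance - fee  # the user never needs to hold Ether

# A simple transfer costs 21,000 gas on Ethereum.
print(relay_transaction(user_arch_balance=100_000, gas_used=21_000))  # 58000
```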

OrbitDB CLI

One of our developers has been working with the OrbitDB community to develop another way to communicate with the decentralized database system. For developers and technical users, you can find out more at https://github.com/orbitdb/go-orbit-db/.

Knowledge Identifiers

We’re working on a decentralized digital asset identification system using Ethereum smart contracts and OrbitDB. Knowledge Identifiers will provide an alternative to existing, centralized solutions such as Handle.net and DOI.

Such a system will provide immutable, permanent identification of digital assets, collections and even users in a trustless way, which means users won’t be beholden to a single point of failure; instead, they will be able to manage their identifiers on-chain with no third-party dependency.

This opens up exciting new use cases; identifiers will no longer simply be permanent links to an item. Instead they could potentially open up licensing, citation and other opportunities.

June 2019 in review

June was an important month in the evolution of KnowledgeArc.Network. We review some of the highlights from the month.

Whitepaper

We released our whitepaper early in June. This was an important step; even though we had been developing features and software for over two years, the whitepaper captured the reason behind KnowledgeArc.Network and distilled what our ecosystem is all about at a higher level.

Deploying our whitepaper to IPFS also highlighted our commitment to distributed technologies.

Exchange Listings

We’re committed to decentralization, distribution and democracy. Therefore, we are excited to see our cryptocurrency, Archive (ARCH), listed on two decentralized exchanges: SwitchDex and Ethermium.

We hope this will make it easier for our community to obtain Archive for ongoing development in the KnowledgeArc.Network ecosystem.

OrbitDB

It’s important for decentralized applications to move forward, and to be actively developed and supported. However, with dApps and other distributed applications being nascent technologies, not all of the underlying architecture is ready for production. As is often the case, software is still going through active development and requires a lot of resources to get it to a stable, production-ready state. This can mean that projects look stagnant even though developers are hard at work on various, related projects.

KnowledgeArc.Network uses IPFS as its underlying storage mechanism. On top of this sits OrbitDB, a decentralized, peer-to-peer database system which uses IPFS for replication. OrbitDB is a powerful technology and will be one of the cornerstones of the new Web3, playing a role similar to the one MySQL played for the first generation of the web.

OrbitDB will be KnowledgeArc.Network’s decentralized storage layer, storing metadata and other supporting information. The ecosystem will be able to replicate these OrbitDB data stores as well as combine them to form larger databases.
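
OrbitDB itself is a JavaScript (and now Go) project, so purely to illustrate the replicate-and-combine idea, here is a toy Python sketch in which each store maps a content ID to a metadata record:

```python
def merge_stores(*stores: dict) -> dict:
    """Combine replicated metadata stores into one federated database.

    Records are keyed by content ID, so identical entries replicated
    from different archives deduplicate naturally.
    """
    federated: dict = {}
    for store in stores:
        federated.update(store)
    return federated

archive_a = {"bafy...a1": {"title": "Paper A", "archive": "University A"}}
archive_b = {"bafy...b1": {"title": "Paper B", "archive": "University B"}}

# One federated view over both archives, built from replicas alone.
print(merge_stores(archive_a, archive_b))
```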

OrbitDB is under active development. That is why we have contributed time and resources to assist with the success of this project. Some of our work includes co-contributing to the HTTP API and field manual as well as maintaining the Go implementation of OrbitDB.

The KnowledgeArc.Network Working Group

We have started a working group, a place for advisors and experts to discuss ways to decentralize archiving, peer review and journalling.

During June, we invited some project managers and librarians who work in the archiving space to join our working group and we welcome these new members. We hope to expand this group of experts and look forward to seeing what insights they can provide to this new ecosystem.

Taking back ownership of your data

The convenience of hosted solutions for digital assets and archiving can hide a major problem: do you control the data you own? KnowledgeArc.Network’s decentralized architecture ensures you are in full control of your data.

Do you really own your data?

Hosting digital assets in the cloud has become a popular and cost-effective solution. But what happens when you decide the host you are with is no longer providing the level of service you expect?

You may think migration is as simple as your existing host dumping the data out to a backup file and making it available for your new provider to restore. Unfortunately, the reality isn’t that simple; closed source applications often have proprietary formats which make them difficult or even impossible to import into other systems.

On the other hand, some open source systems are customized, but the customizations might not be publicly available, so backups only capture a subset of your data. For example, there are archive hosting providers who have built multi-tenant data storage on top of a single application. Databases in such a system cannot simply be lifted and re-implemented on other infrastructure. This results in broken features and crucial data being excluded from the system.

Even when a migration from one system to another runs smoothly, complex backups and time-consuming debugging are often required. Export/import tools need constant maintenance, but with niche products such as digital asset systems, maintenance of these ancillary tools is often neglected.

A distributed solution

The KnowledgeArc.Network platform makes centralized storage obsolete. Data is replicated in multiple locations whilst still being owned by the original creator.

Replication allows application managers, developers and system administrators to build a variety of user experiences on top of the data. There is no need to set up complex data structures, import and export data, or work around missing data. Instead, the user simply replicates an existing database and works directly on top of it.

Data can also remain private even though it is stored in a public way. By encrypting data, the owner is the only one with access to the information and can grant other users varying degrees of control. For example, certain users might only be able to read data, while others might be able to update existing data but not delete it.
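
As a minimal sketch of this encrypt-before-publishing idea, here is a symmetric-key example using Python’s cryptography package. The actual encryption scheme and key-management approach on KnowledgeArc.Network may well differ:

```python
from cryptography.fernet import Fernet  # pip install cryptography

# The owner generates a key; only holders of the key can read the record.
key = Fernet.generate_key()
cipher = Fernet(key)

record = b'{"title": "Private working paper", "status": "draft"}'
ciphertext = cipher.encrypt(record)

# The ciphertext can safely be replicated across a public network...
print(ciphertext[:32])

# ...while anyone the owner shares the key with can still read the data.
print(cipher.decrypt(ciphertext))
```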

Centralized vs decentralized

Recently there has been a move to more centralized archiving solutions. Instead of disparate systems talking to one another or federated systems being established to support a “go-to” repository of information, a number of governments and bureaucracies are pushing for everything to be centralized. This results in a stagnation of innovation and, more importantly, a single point of failure.

Figure 1: Legacy Archives

KnowledgeArc.Network’s decentralized databases will capture the best of both worlds: every archive is unique, but its records can easily be merged into a single, federated archive. This federated archive can then be replicated further, so that multiple user interfaces can be created on top of the same data.

KnowledgeArc.Network captures the best of every model. Decentralized, independent databases provide institutions with full control and ownership of their data. Federated archives simply merge distributed databases into a single data store. And, finally, the entire community can build their own user experiences on top of any archived data by simply replicating an existing database.

Figure 2: Decentralized Archive

The Decentralized Archive Journey Begins

At KnowledgeArc.Network, we believe that the publishing, dissemination and archiving of information needs to fundamentally change.

Information should be open and public. It should also incentivize a decentralized community to participate in the creation, review, licensing, verification and archiving of information.

A democratized ecosystem for peer review

A single entity should not control and decide what quality content can or cannot be peer reviewed and published. Large, well-funded institutions should not receive preferential treatment over smaller, less-funded ones. Instead, we believe the entire community can actively participate in the review and publishing process, deciding on the inclusion of a work based on its merits rather than the size of an institution’s reach and influence.

Your data held for ransom

The convenience of a third-party hosting provider can often mean you give up control of your data. If you decide to change hosts or move information to in-house infrastructure, you are reliant on your existing host to hand over all your data. Depending on your agreement with your host, it may not be possible to salvage it all.

KnowledgeArc.Network uses decentralized technologies to store, sign and verify your archived information. An archiving provider can no longer hold your data exclusively; you and others can replicate your data, even if it is private, whether it is to another hosting provider, an in-house server or even your local computer.

Multiple copies of the data also ensure there isn’t a single point of failure.

Incentivizing the community

Current solutions incentivize and reward middlemen, but it is the authors, reviewers, end-users and developers who create all of the information from which these middlemen profit.

KnowledgeArc.Network aims to incentivize the community, with revenue going directly to the participants of the ecosystem. Citation and licensing revenue will flow directly to the creators of works archived in the ecosystem through the use of automated agreements (smart contracts). Community members will conduct peer review, with smart contracts providing remuneration directly. Developers will have access to the entire system and will be able to create tools and processes which directly benefit all users. And users will be able to directly reward content creators for their contribution to the ecosystem.

Alternative statistics and metrics could even result in additional earnings for content creators as impact factor is monetized.

KnowledgeArc.Network whitepaper

We distilled our vision into our official whitepaper which is available for download.

Active development

The whitepaper is not the start of our development cycle. KnowledgeArc.Network has been in development for 2 years and momentum is growing.

We are integrating various technologies with our archiving platform and ecosystem and cultivating partnerships with other blockchain systems which we have identified as key to the evolution of the KnowledgeArc.Network ecosystem.

Tokenomics

The utility token, Archive (ARCH), powers transactions within the decentralized KnowledgeArc.Network ecosystem.

Community members participating in the ecosystem will be able to earn tokens directly: authors through citations and licensing, peer reviewers through verifying the authenticity of works, developers by extending functionality and providing customizations, and resource providers by providing solutions such as backups and hosting applications.

We are working on ways to make using Archive as easy as possible and are incentivizing key archiving players to embrace KnowledgeArc.Network and blockchain technologies to replace redundant solutions and methodologies.