Identity and Trust

By Adrian Zidaritz

Original: 02/09/20
Revised: no

In the background article Graphs of Data we introduced AWI systems, AI systems built around graphs which have people as their nodes. Each of these AWI systems provides some service to its users, for example banking, social media, or social security benefits. We have seen some problems in the previous articles, in particular the use of AI to fake data or fake entire accounts, and the lack of trust that users have in some AWI systems, especially social media. In this article we discuss some solutions, based on AI, and how to restore and even strengthen that trust between users and the services that they connect to. Since trust goes in both directions, we also discuss how AI is being used to strengthen the ways we identify ourselves when connecting to those services. For the purposes of this article, we will distinguish between two types of services we connect to: businesses and the government.

Identifying yourself, that is, showing that you are present at the time the service is requested, either online or in person, is increasingly done with biometrics (fingerprint, palm reading, or face recognition). We have seen how far AI has advanced in its ability to provide biometrics which uniquely identify an individual. Deep learning techniques provide a facial biometric, a finger biometric, a palm biometric, an iris biometric, a voice biometric, and even a brain biometric, as we have seen in the background article AI Versus Human Intelligence.

The issues around the use of biometrics are even more potent in national graphs, which we idealized in the same article Graphs of Data. The digital twins in those graphs may have financial data or medical data included. AI algorithms enhance those graphs with additional information which is statistically obtained from the relationships between the digital twins. Given that data is the most important asset emerging in a world populated by AI systems, a number of questions emerge about the means of gathering that data, the means of moving it around, the means of monetizing it, the privacy and security of that data, and so on. So in these national graphs, the issue of identity resolution and the issue of trust are even more potent.

Let's look at an example of such use of AI for identity verification. Yann LeCun, the inventor of convolutional neural networks and the leader of Facebook's AI lab for many years, talks about the use of these neural networks in identity applications. The neural networks are trained on millions of images of people's faces and they learn how to recognize someone from a picture. That facial biometric (the feature vector extracted from a few pictures of someone) is stored with the service provider, and when you log in, the camera takes a picture of you from which a vector is also extracted. If that vector matches the one stored, you're in. Obviously the services won't use an MRI machine to get a brain biometric and compare it with the one stored. Yet!
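The matching step described above can be sketched in a few lines. This is only an illustration under stated assumptions: the 4-dimensional vectors are made up (real face embeddings come from a trained network and have hundreds of dimensions), and the names `enroll`, `verify`, and the threshold of 0.9 are hypothetical choices, not part of any particular product.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors (1.0 means identical direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def enroll(vectors):
    """Average the vectors extracted from a few pictures into one stored template."""
    n = len(vectors)
    return [sum(component) / n for component in zip(*vectors)]

def verify(stored_template, login_vector, threshold=0.9):
    """Accept the login only if the new vector is close enough to the stored one."""
    return cosine_similarity(stored_template, login_vector) >= threshold

# Toy vectors standing in for the output of a face-recognition network (assumption).
enrolled = enroll([
    [0.9, 0.1, 0.3, 0.5],
    [1.0, 0.0, 0.2, 0.6],
    [0.8, 0.2, 0.4, 0.4],
])
same_person = [0.95, 0.05, 0.3, 0.5]
other_person = [0.1, 0.9, 0.7, 0.0]

print(verify(enrolled, same_person))   # the login picture resembles the template
print(verify(enrolled, other_person))  # a different face points in another direction
```

The threshold trades convenience against security: a lower value admits more impostors, a higher value rejects more legitimate users.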

When a service is requested online, the AI could even be used to detect mood, for example to detect that you are not being forced by someone else to request a service and that it is your intention to do so; we have seen examples of this mood reading in the background article AI Versus Human Intelligence. A much stronger way of verifying your identity when you request a service is for AI to challenge the person requesting the service with questions based on the graphs in which they appear as a node; it is very difficult to steal the graph topology around the node and the properties of the relationships.
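A minimal sketch of such a graph-based challenge follows, assuming the service holds a graph whose edges carry relationship properties. The graph contents, the question wording, and the function names are all hypothetical; a real system would also avoid naming the neighbors in the question itself, since that leaks part of the topology.

```python
# Hypothetical slice of a service's graph: each edge carries relationship properties.
GRAPH = {
    "alice": {
        "bob": {"relation": "coworker"},
        "carol": {"relation": "sibling"},
        "dave": {"relation": "landlord"},
    },
}

def make_challenges(graph, node):
    """Build (question, expected_answer) pairs from the edges around a node."""
    return [
        (f"What is your relationship to {neighbor}?", props["relation"])
        for neighbor, props in sorted(graph[node].items())
    ]

def passes_challenge(graph, node, answers, required_correct=2):
    """Accept the requester if enough answers match the stored edge properties."""
    correct = 0
    for question, expected in make_challenges(graph, node):
        given = answers.get(question, "").strip().lower()
        if given == expected:
            correct += 1
    return correct >= required_correct

genuine = {
    "What is your relationship to bob?": "Coworker",
    "What is your relationship to carol?": "sibling",
}
impostor = {"What is your relationship to bob?": "friend"}

print(passes_challenge(GRAPH, "alice", genuine))
print(passes_challenge(GRAPH, "alice", impostor))
```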

The second problem with graph merging is that of identity resolution: if data about you originates somewhere else, how do we know that it refers to you? Could someone (the government, for example) request the merging of information about you, say after you disappear? This is an interesting technical problem; suffice it to say, though, that the most important aspect of this kind of merging is less a technical matter and more a policy matter.

Both businesses and the government provide services of interest to us. Some of these graphs contain financial information about us, or medical information, or employment information, or social relationship information, in other words sensitive information, so it is clear that access to that data needs to be authorized and the use of that information has to be trusted.

The identifying information may be stored in identity instruments, like a driver license, a passport, a health insurance card, etc., or it may be presented online in the form of a (username, password) pair. The concepts to keep in mind in this article therefore are: an individual, a service provided, an identity instrument, and the way the individual identifies to the service provided with the use of the identity instrument. (We know that the individual is represented by a node in the graph that the service controls, and we will see that the connections in those graphs represent some of the strongest ways in which the service can ask for identification before access is granted, but those techniques are more theoretical and not in use yet, so we only mention them towards the end.) Here is a succinct presentation from IBM:

The central aim of a strong identity is to establish trust in that identity. Blockchain is the modern vehicle of establishing trust. You may have heard of blockchain as the technology behind more trendy applications like cryptocurrencies, but some of its important applications have nothing to do with cryptocurrencies.

The application we are interested in is digital identity. A blockchain is a linked list of chronologically ordered records, with each link based on cryptography. Because each record is cryptographically tied to the one before it, a blockchain is effectively immutable: no party can quietly modify an existing record, because any change breaks the links that follow it. Once a record is added to a blockchain, it is part of it forever.
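The linked-list structure can be illustrated with a small sketch in which each record stores the SHA-256 hash of the previous one. This is a toy under stated assumptions: the field names and the helper functions `append_block` and `is_valid` are invented for illustration, and a real blockchain adds digital signatures, consensus, and replication across many machines.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first block

def block_hash(index, timestamp, data, prev_hash):
    """Hash the block contents together with the hash of the previous block."""
    payload = json.dumps([index, timestamp, data, prev_hash])
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

def append_block(chain, timestamp, data):
    """Add a record whose link to the past is the previous block's hash."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_HASH
    index = len(chain)
    block = {
        "index": index,
        "timestamp": timestamp,
        "data": data,
        "prev_hash": prev_hash,
        "hash": block_hash(index, timestamp, data, prev_hash),
    }
    chain.append(block)
    return block

def is_valid(chain):
    """Recompute every hash and check every link; any edit breaks the chain."""
    prev_hash = GENESIS_HASH
    for block in chain:
        if block["prev_hash"] != prev_hash:
            return False
        expected = block_hash(block["index"], block["timestamp"],
                              block["data"], block["prev_hash"])
        if block["hash"] != expected:
            return False
        prev_hash = block["hash"]
    return True

chain = []
append_block(chain, "2020-02-09T10:00:00", "identity record created")
append_block(chain, "2020-02-09T10:05:00", "address attribute added")
append_block(chain, "2020-02-09T10:09:00", "address attribute verified")
print(is_valid(chain))
```

Changing the `data` of any earlier block makes `is_valid` return `False`, which is the sense in which the records are tamper-evident.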

Blockchains for a specific application form a database which is open and decentralized. It is important to realize that in a blockchain, data itself can be openly read by anyone or it can be encrypted and its reading be allowed only by those for whom the data is intended.

Because of the openness and distribution of a blockchain, and because of the financial origin of the term, a blockchain is often referred to as a distributed public ledger. Blockchains also allow smart contracts, which are nothing but programs, to be executed when the various actions within the ledger have met certain criteria; the contract may specify, for example, a financial transaction between two parties, which can be executed without any central authority having to approve it.
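To make the idea of a smart contract concrete, here is a minimal sketch of a program that pays a seller only once a delivery event appears on the ledger. Everything here is a hypothetical simplification: the `Ledger` class, the event tuples, and the contract function are invented for illustration, and real platforms run such code on every validating node rather than in one process.

```python
class Ledger:
    """Toy in-memory ledger: balances plus an append-only list of events."""

    def __init__(self, balances):
        self.balances = dict(balances)
        self.events = []

    def record(self, event):
        self.events.append(event)

    def transfer(self, sender, receiver, amount):
        if self.balances[sender] < amount:
            raise ValueError("insufficient funds")
        self.balances[sender] -= amount
        self.balances[receiver] += amount
        self.record(("transfer", sender, receiver, amount))

def delivery_payment_contract(ledger, buyer, seller, amount, required_event):
    """Pay the seller once the required event is on the ledger; return True if paid now."""
    already_paid = ("transfer", buyer, seller, amount) in ledger.events
    if required_event in ledger.events and not already_paid:
        ledger.transfer(buyer, seller, amount)
        return True
    return False

ledger = Ledger({"alice": 100, "bob": 0})
delivered = ("delivered", "order-17")

before = delivery_payment_contract(ledger, "alice", "bob", 50, delivered)
ledger.record(delivered)  # the criterion is now met
after = delivery_payment_contract(ledger, "alice", "bob", 50, delivered)
again = delivery_payment_contract(ledger, "alice", "bob", 50, delivered)
print(before, after, again, ledger.balances)
```

No approving authority appears anywhere in the code: the payment is triggered purely by the state of the ledger.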

Let's begin with a summary understanding of blockchain so we can proceed for now (we'll add more details later):

But before we embark on a more in-depth discussion of blockchain and its applications to the design of a strong identity, let's look at two examples of nations which have fully embraced the technology. These two examples will anchor our future discussion and provide a point of reference when we later discuss the technology and its relationship with AI in the U.S.

The first example, Estonia, may even allow us a glimpse into that idea of Congress-on-a-chip, which we covered in the article AI and Liberal Democracy. Estonia is proof that a cooperative structure between government, businesses, and individuals can be successfully implemented, and it is quite apparent that this success reflects back into the well-being of all three.

One might argue that Estonia is a small nation and can therefore institute these kinds of initiatives, but that similar initiatives would be difficult to implement in larger, more complicated nations. This is not so: businesses in the U.S. are making increasing use of blockchain, and China is fully embracing blockchain at the governmental level and making it part of its strategic push for high-tech dominance:

There is a deep connection between AI and blockchain because blockchain facilitates the openness and the availability of data; as we know by now, AI needs large data sets to learn from. Blockchain helps unlock the data from its silos and make it openly available in flexible ways, in which the producers of that data decide which parts of the data to expose and who is allowed to see them; blockchain is a means of democratizing data, which is essential for AI's success.

Source: AI and Blockchain: A Disruptive Integration

Digital identity is not the only application that needs the trust established by the blockchain ledger. Another very sensitive area where trust is needed is healthcare. Here is an example of the use of a blockchain ledger by the Centers for Disease Control and Prevention: IBM Blockchain to create a ledger for electronic health records.

Both digital identity records and medical records are records of value, not just information. The distinguished Stanford physicist Shoucheng Zhang, whom we lost in December of 2018, explains in the next video clip how the next version of the Internet, the so-called Internet of Things (IoT), will transport value, not just information, and how blockchain is the technology to accomplish this transport of value.

Other than digital identity and medical records, value could also mean financial transactions, property titles, etc. Zhang's video clip is 58 minutes in its entirety and we only show here the part about blockchain.

But the reader may decide to watch it in its entirety; it is a very technical and intense presentation of the cross-fertilization between quantum computing, AI, and blockchain, which would be quite valuable for the understanding of the main subject of our website; it may require several shorter viewings at different times.

Strong Identity And Trust Are Achievable

There are many applications of blockchain, and as we have seen in the last video, it is meant to allow anything of value to be recorded and tracked over the Internet, not just cryptocurrencies: financial transactions, medical data, identity information, etc. The decentralization of data means that a consensus method must be used before any transaction is allowed to be added to a chain. There are many such consensus methods used in blockchain, depending on the application. Cryptocurrencies such as Bitcoin use public blockchains and a consensus method named proof-of-work, performed by special nodes in the network called miners. Proof-of-work is how miners solve a certain cryptographic puzzle that allows a new block to be added to a chain, how they share their solution, and how other parties can verify that solution, which finalizes the addition of the block to the chain. The first miner to solve the puzzle receives a small payment. Here is a more detailed explanation of how this process works:
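As a rough illustration of the puzzle miners solve, the sketch below searches for a number (a nonce) that makes the SHA-256 hash of the block data start with a given number of zero hex digits. The block contents, the `|` separator, and the difficulty of 4 are assumptions made for the example; production systems compare the hash against a numeric target and use a far higher difficulty.

```python
import hashlib

def pow_hash(block_data, nonce):
    """Hash the block data together with a candidate nonce."""
    return hashlib.sha256(f"{block_data}|{nonce}".encode("utf-8")).hexdigest()

def mine(block_data, difficulty):
    """Try nonces in order until the hash starts with `difficulty` zero hex digits."""
    prefix = "0" * difficulty
    nonce = 0
    while not pow_hash(block_data, nonce).startswith(prefix):
        nonce += 1
    return nonce

def verify(block_data, nonce, difficulty):
    """Checking a claimed solution takes a single hash, however long mining took."""
    return pow_hash(block_data, nonce).startswith("0" * difficulty)

DIFFICULTY = 4  # about 16**4 = 65,536 attempts on average
block_data = "prev_hash=abc123;tx=alice->bob:5"  # hypothetical block contents

nonce = mine(block_data, DIFFICULTY)
print(nonce, pow_hash(block_data, nonce))
print(verify(block_data, nonce, DIFFICULTY))
```

The asymmetry is the point: finding the nonce requires many attempts, while any other party can confirm it with one call to `verify`.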

The need for private blockchains appeared later; in a private blockchain, permissions to access the blockchain are added to the blockchain itself. Such blockchains are very useful for identity applications, as we shall see below. Other applications use hybrid blockchains, which contain public data and also private data that only certain users with the right permissions can see. We look now at a particular implementation of blockchain by IBM, called IBM's Hyperledger, based on the open source Hyperledger project, which uses such private blockchains. Hyperledger uses a different method of validating transactions, not proof-of-work; it is better suited to many transactions among a smaller number of participants, whereas the original proof-of-work is the opposite: fewer transactions but a very large number of participants.

Hyperledger is used to create business networks and also to establish a strong digital identity, which is our ultimate goal in covering the subject of identity. A strong identity in a blockchain means an identity which is controlled only by the party identified, does not leak extra information to unspecified parties, and is fully trusted by all parties who use the identity. In other words, what we will see is that blockchain and its implementation in Hyperledger offer an uncompromising solution that meets all needs for a strong identity.

Let's see a different type of application, to deepen our appreciation for the technology before we zero in on the application of digital identity. The speaker points to the existence of an Achilles' heel in the food industry, namely how difficult it is to trace the source of a particular food. He gives two known examples, the outbreak of E. coli due to contaminated spinach and the contamination of peanut paste. The FDA stopped all consumption of spinach for two weeks until the contamination was tracked to one producer, on one farm, for one particular day. That's because there was no quick and trusted trace mechanism for all suppliers. Walmart will use the traceability and transparency supported by the blockchain technology of Hyperledger to trace the entire supply chain of its food to its source in seconds.

Finally, we will see the use of this technology for digital identity and trust. Because the graph of an AWI system contains sensitive data about each node (the digital twin of someone), the questions of identity and of trust are central ones. The system must protect the identity of the node, prevent fraudulent entry into the system in the name of that node, and ensure that when data originating from other sources arrives with incomplete identity information about the node, it is merged into the node only after some strong identity resolution is performed on that data.
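The merge-only-after-resolution rule can be sketched as follows. This is a deliberately simple illustration, not a description of how any AWI system actually resolves identities: the identity fields, their weights, the threshold of 0.7, and the function names are all assumptions chosen for the example.

```python
# Hypothetical weights: how much each agreeing identifier contributes to a match.
IDENTITY_WEIGHTS = {"name": 0.2, "birth_date": 0.3, "email": 0.25, "phone": 0.25}

def match_score(node, record):
    """Score how strongly an incoming record identifies the node (0.0 to 1.0)."""
    score = 0.0
    for field, weight in IDENTITY_WEIGHTS.items():
        value = record.get(field)
        if value is None:
            continue  # incomplete identity information: no evidence either way
        if value != node["identity"].get(field):
            return 0.0  # a conflicting identifier rules the match out
        score += weight
    return score

def merge_if_resolved(node, record, threshold=0.7):
    """Merge the record's attributes into the node only if identity is resolved."""
    if match_score(node, record) < threshold:
        return False
    node["attributes"].update(record.get("attributes", {}))
    return True

twin = {
    "identity": {"name": "Alice Example", "birth_date": "1980-01-01",
                 "email": "alice@example.org", "phone": "555-0100"},
    "attributes": {},
}
strong = {"name": "Alice Example", "birth_date": "1980-01-01",
          "email": "alice@example.org", "attributes": {"employer": "Acme"}}
weak = {"name": "Alice Example", "attributes": {"employer": "Globex"}}
conflicting = {"name": "Alice Example", "birth_date": "1975-06-30",
               "email": "alice@example.org", "attributes": {"employer": "Initech"}}

print(merge_if_resolved(twin, strong))       # three identifiers agree
print(merge_if_resolved(twin, weak))         # a name alone is not enough
print(merge_if_resolved(twin, conflicting))  # the birth date disagrees
print(twin["attributes"])
```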

Governments in particular should make sure that the identity of your digital twin cannot be attacked. In this respect, the U.S. trails significantly, and with regard to the financial credit bureaus your identity protection qualifies as a national disgrace, as the Equifax break-in has proven. Because of the higher income level of the U.S. population relative to most of the rest of the world and the unrestricted high connectivity of the U.S. population to the Internet, there is an increased urgency for the U.S. government to develop appropriate defenses for its citizens. Most of the technologies used in the e-government of Estonia were invented in the U.S., but the adoption of these technologies here has lagged behind. The U.S. places 11th in the ranking of e-government adoption among nations of the world.

There are initiatives (coming from the industry) though to leverage this concept of a distributed ledger in digital identity applications. Here is one such initiative:

Hopefully, the use of this technology will eliminate the most damaging form of attack, complete identity theft. If you look at the U.S. government services as another large graph providing services to its citizens, the weakness around the Social Security Number as the solitary key that opens the entry to that graph becomes apparent. It is imperative that the social security graph contain truthful data; imagine what happens when some properties attached to a node in that graph are fraudulently modified. The AI technology to solve these problems with the SSN and the credit bureaus exists but the political will does not. These are all archaic systems. Bad actors can easily change your address to a different one, call credit companies and freeze you out of your credit, file income taxes and claim a refund in your name.

The U.S. Postal Service is another example of an archaic system badly needing a revision. Hopefully, the proposed identity scheme you saw above will defeat change-of-address attacks, in which someone who has stolen your identity puts a stop on mail at your old address and follows it with a change-of-address request. The Postal Service sends two notices announcing the change on the date requested, one to the old address and one to the new address. But the attacker can beat you to it, and you will not get the change-of-address notice.

The anguish and suffering brought about by this inability to act is especially hard among the elderly, who cannot easily cope with a loss of identity. Should identity theft be treated as 3rd degree murder and be punished accordingly?

You would think that all citizens of the world should be interested in a global digital identity. But this is not so; there will be fierce political opposition to it. And again, most of that opposition comes from the "freedom" camp:

When confronted with the challenges brought by AI, world governments will eventually be forced to cooperate in establishing a strong worldwide identity system, and hopefully the scourge of identity theft will be eradicated.

There are few areas where the need for truthful data is more critical than the identity of humans. Setting the foundation of a truthful, secure, and transparent world identity system may be just the backbone of what we need to start doing in the age of AI, especially because robust identity resolution is needed by these powerful AWI systems, in which the main node of interest to you is your digital twin.