High-Level Machine Intelligence (HLMI) has been defined as “achieved when unaided machines can accomplish every task better and more cheaply than human workers” (Grace et al., 2018), but is this definitional approach appropriate?
I don’t think so. The definition requires a threefold change: specifying human-level versus human-like intelligence, specifying a finite set of tasks, and specifying how we judge ‘better’.
At present, Professor Boden (2015) describes artificial intelligence as a “bag of highly specialised tricks”, seeing the advent of true ‘human-level’ intelligence as a distant threat. Present AI is sometimes labelled an ‘idiot savant’: it drastically outperforms humans in specific tasks, but this performance does not extend to the many tasks which humans complete on a day-to-day basis with so little cognitive effort that a child, or even an infant, could master them. Diversity is thus key: human-level intelligence requires that people can learn models and skills and apply them to arbitrary new tasks and goals. For a machine to beat a human at Go, a neural network must be trained on hundreds of millions of games. While it can be successful in this domain-specific scenario, consider now changing the objective function: to losing on purpose, to being the last player to pick up a stone, or even to beating the opponent by only just enough not to embarrass them. While a human player could adapt to these new situations quickly, an AI model would require substantial retraining and reconfiguring. A crucial aspect of human intelligence is therefore transfer learning. One change to the question must state that a single AI agent trained on one set of data, however large this may be, can adapt to different objective functions and different constraints. Otherwise AI can remain “a bag of highly specialised tricks”, where each separately trained model can excel at just one task but not across the board; yet diversity is a key component of human-level intelligence. Humans also have the ability to learn from their weaknesses and improve in the future. It may then be required that a machine can not only perform the task better than a human worker, but continually improve its own performance, for example by rewriting its own Python scripts, analogous to the human process of self-development.
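To make this concrete, the toy Python sketch below (nothing like a real Go engine, and with purely illustrative function names) shows how the objective function is baked in at training time: each new goal sends the machine back to be retrained from scratch, while a human player simply reinterprets the rules.

```python
# A minimal sketch: the objective function is a training-time ingredient,
# so a new goal means retraining from scratch. All names here
# (train_policy, reward_win, ...) are illustrative, not from any real system.
import random

def reward_win(final_margin):
    """Original objective: reward winning by any margin."""
    return 1.0 if final_margin > 0 else 0.0

def reward_lose_on_purpose(final_margin):
    """Changed objective: reward losing -- same game, opposite goal."""
    return 1.0 if final_margin < 0 else 0.0

def reward_narrow_win(final_margin):
    """Changed objective: win, but only just, so as not to embarrass the opponent."""
    return 1.0 if 0 < final_margin <= 2 else 0.0

def train_policy(reward_fn, episodes=10_000):
    """Toy 'training': learn which final margin the given reward function prefers."""
    value = {m: 0.0 for m in range(-5, 6)}      # candidate final margins
    counts = {m: 1 for m in range(-5, 6)}
    for _ in range(episodes):
        m = random.choice(list(value))          # explore a margin at random
        r = reward_fn(m)                        # evaluated under THIS objective only
        counts[m] += 1
        value[m] += (r - value[m]) / counts[m]  # running average of reward
    return max(value, key=value.get)            # 'policy' = preferred margin

print(train_policy(reward_win))              # settles on a winning margin
print(train_policy(reward_lose_on_purpose))  # must be retrained end-to-end
print(train_policy(reward_narrow_win))       # ...and retrained yet again
```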
Yudkowsky (2008) considers the nexus of artificial intelligence and existential risk arising from convergent instrumental goals, avoiding the trap of anthropomorphising AI. Human-level ≠ human-like. Lake et al. (2015) are advocates of ‘building machines that learn and think like people’. They consider incorporating intuitive physics and psychology. This richer starting point would allow technologies such as neural networks to converge to human performance with fewer training examples, making decisions that are not only human-level but human-like. Grace et al. (2018) subscribe somewhat to this view by asking experts when a machine will be able to beat the best human Go players with a similar number of training examples to a human, in the tens of thousands, rather than requiring hundreds of millions of games. While using the human brain as our best example of developed intelligence can provide fruitful research ventures, requiring human-level intelligence to be human-like is overly restrictive. If we abided by truly human-like learning, the commonly used technique of backpropagation would fail the test: as Crick (1989) famously noted, it requires information to be transmitted backwards along the axon, an impossible process in the reality of neuronal function. A less puritanical criticism is to ask why machines must think like humans if they achieve the same outcome. Whilst the paper requires an artificial Go player to have the same experience as a human player, it does not specify that an artificial musician must have learned from the same number of songs as Taylor Swift to imitate her writing style. Nor do we require a neural network to classify images using a lens, a cornea and an optic nerve. Admittedly the black-box nature of machine learning algorithms is an area requiring study, but our definition of human-level intelligence must be clear about whether it is required to be human-like intelligence and whether we want to enforce this stricture. Despite its sophistication, human intelligence relies on behavioural biases and heuristics which can give rise to irrational or racially and sexually discriminatory actions, raising the philosophical question of what human-level intelligence really means and whether mimicking its imperfections is a desirable development path to take.
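For readers unfamiliar with Crick’s objection, here is a minimal NumPy sketch of backpropagation in a two-layer network (an illustration under my own simplifying assumptions, not any particular published model). The backward pass sends the error signal back through the very weights used in the forward pass, which is exactly the step real neurons cannot perform.

```python
# A minimal two-layer network trained by backpropagation (NumPy only).
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3))         # 4 examples, 3 input features
y = rng.normal(size=(4, 1))         # targets
W1 = rng.normal(size=(3, 5)) * 0.1  # input -> hidden weights
W2 = rng.normal(size=(5, 1)) * 0.1  # hidden -> output weights

for step in range(200):
    # Forward pass: information flows input -> hidden -> output.
    h = np.tanh(x @ W1)
    y_hat = h @ W2
    loss = ((y_hat - y) ** 2).mean()

    # Backward pass: the error flows output -> hidden -> input,
    # reusing W2 "in reverse" -- the biologically implausible step.
    d_out = 2 * (y_hat - y) / len(x)
    dW2 = h.T @ d_out
    d_hidden = (d_out @ W2.T) * (1 - h ** 2)   # error sent back along W2
    dW1 = x.T @ d_hidden

    W1 -= 0.1 * dW1
    W2 -= 0.1 * dW2

print(f"final loss: {loss:.4f}")
```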
One can easily come up with a set of tasks in which we do not require AI to perform better. To list a few: do we require an AI to dance better, to go to the toilet better, or to offer companionship for the elderly better? As Kai-Fu Lee, a leading Chinese AI specialist and AI optimist, notes, some tasks, especially those requiring empathy and compassion, are profoundly human and can stay that way. Reaching human-level intelligence need not be limited by developing human-level emotional capacity if such capabilities are not required in the tasks AI must perform. In fact, in the literature on the future of capitalism, advocates of AI hope for a digital socialism in which humans maintain their comparative advantage over machines in a subset of non-automated tasks requiring exactly those aspects of human nature which cannot easily be coded or trained into a machine’s learning process. We thus require only a subset of tasks, perhaps 95%, leaving the remainder for human workers.
Being ‘better’ at a task is measurable in a number of different ways. AI may reach human-level or even superhuman performance at certain tasks but retain subpar performance in other components. The cost component has been specified here, but vagueness in detail creates vagueness in prediction. If an AI can do what a human can do but at 1,000 times the hourly wage, this is clearly sub-optimal. However, stating that an AI must be ‘cheaper’ than one human worker is also naive if a machine replaces more than one worker, i.e. has a higher than 1:1 replacement ratio. This can be overcome by referring to human workers in the plural. Yet vagueness remains in the term ‘better’, introducing scope for different interpretations of this survey question. Does better mean quicker, more accurate, or making more efficient use of resources? To illustrate, consider the following personal example. After being in a road accident last week and suffering a few broken bones, I have lost the use of my arm. My human capability to type this blog post is severely limited. Instead I have used voice dictation software for speech-to-text recognition. On the one hand, this technology is faster, cheaper and less demanding of external resources compared to dictating to a fellow human. Yet, on the other, it cannot offer me grammatical, topical or semantic advice, nor does it recognise less frequently used words such as ‘Bayesian’, ‘a priori’ or ‘Nick Bostrom’. Equally, unlike a human, it does not understand whether I am making a true statement, so it cannot warn me to validate claims or delete certain sentences. When weighing up whether this technology is ‘better’ than human help, on which metrics should we put more weight? Critically, our parameterisation of the definition depends on our primary concern and so should be treated as domain-specific.
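To make the domain-specific point concrete, the sketch below shows one way such a comparison could be parameterised. The metrics, scores and weights are invented purely for illustration; the verdict on whether the machine is ‘better’ flips as soon as the weights change.

```python
# A hedged sketch of parameterising 'better': the verdict depends entirely on
# domain-specific weights. All scores and weights are made-up illustrative
# numbers, not measurements of any real system.

def is_better(ai_scores, human_scores, weights):
    """Weighted comparison: True if the AI's weighted score exceeds the human's."""
    ai = sum(weights[m] * ai_scores[m] for m in weights)
    human = sum(weights[m] * human_scores[m] for m in weights)
    return ai > human

# Illustrative 0-1 scores for the dictation example above.
ai_scores    = {"speed": 0.9, "cost": 0.9, "accuracy": 0.6, "semantic_advice": 0.1}
human_scores = {"speed": 0.5, "cost": 0.3, "accuracy": 0.8, "semantic_advice": 0.9}

# Domain A: quick note-taking -- weight speed and cost heavily.
note_taking = {"speed": 0.4, "cost": 0.4, "accuracy": 0.15, "semantic_advice": 0.05}
# Domain B: drafting a careful essay -- weight accuracy and advice heavily.
essay_draft = {"speed": 0.1, "cost": 0.1, "accuracy": 0.4, "semantic_advice": 0.4}

print(is_better(ai_scores, human_scores, note_taking))  # True: the AI is 'better' here
print(is_better(ai_scores, human_scores, essay_draft))  # False: the human is 'better' here
```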
Considering all of these points, I would change the definition to address the changes above: HLMI is achieved when a single unaided machine, without task-specific retraining, can accomplish a specified share of tasks, say 95%, better and more cheaply than human workers, with ‘better’ judged on domain-specific metrics. To better confine interpretations of the requirements, I offer one example of domain bifurcation: the question could be asked separately for tasks that do not require empathy or compassion and for those that do, with the latter left, as Lee suggests, to human workers.
While these alternative definitions do mitigate some problems of vagueness and variability of interpretation, they do not remove them entirely. The unknown nature of undeveloped technologies advancing on an uncertain timeline inevitably renders the question of when AI will reach high-level machine intelligence definitionally ambiguous to some degree.