IQ of a Physical System
Celestine Preetham Lawrence
University of Twente, The Netherlands
[email protected]
August 15, 2015
To be presented at the 6th International Workshop on Physics and Computation (PC2015) at the Conference on Unconventional Computation & Natural Computation 2015
Abstract
Natural computing was recently demonstrated on a disordered network of nanoparticles by researchers at the University of Twente¹. For future experiments on such rich physical networks, it is useful to define a computational capacity. To that end, we introduce a phenomenological quantity called the IQ of a physical system. By considering intelligence as the capacity to relate patterns to data, we derive an expression (IQ = number of bits of Information + Integration). We then study the IQ trends for a physical system (the network of nanoparticles). We find that the IQ of our physical system is better than that of random systems under practical constraints on thinking coverage (limited by energy, size, noise, etc.). These methods can also be used to compare the IQs of different physical systems, and thus offer insights into designing artificial brains. For instance, we show how the IQ of a neural network scales up with the number of neurons.
1 Theoretical proposals
1.1 Motivation
Researchers are working hard to build computers that are as efficient as the brain for tasks like pattern recognition. To create such artificial brains, several physical systems comprising novel materials and architectures are being explored. A few relevant questions to ask are: a) What is the total computational capacity of a physical system? b) How should this physical system be configured? c) How does it compare with other physical systems like the human brain? To answer these questions, we propose a theory to calculate the IQ (Intelligence Quotient) of a physical system.
¹ S.K. Bose, C.P. Lawrence, Z. Liu, K.S. Makarenko, R.v. Damme, H.J. Broersma and W.G.v.d. Wiel. Evolution of a Designless Nanoparticle Network into Reconfigurable Boolean Logic. Nature Nanotechnology, manuscript accepted on 13th August, 2015.
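[Figure 1.1 panel labels: the question "Is it a triangle?" applied to a Datum gives YES/NO; a black box takes a Datum and a Pattern and returns a Decision (YES/NO); in the physical system, input voltages and control voltages give an output current that is thresholded to YES/NO.]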
Figure 1.1: A phenomenology of intelligence, a corresponding black-box model, and its practical implementation on a physical system.
1.2 Intelligence
Intelligence is the ability to recognize patterns in data (Fig.1.1). The input-output characteristics of any physical system can be translated into Datum-Pattern (DP) relations. The IQ is thus determined from the table of DP relations.
1.3 Datum-Pattern relation over discrete symbols
An elementary DP relation and its characteristic parameters are shown in Fig. 1.2 a,b. Here D is the number of distinct Data, P the number of distinct Patterns and R the number of relations.
Information is defined as the harmonic mean of the number of distinct Data and the number of distinct Patterns, Information = 2DP/(D + P). This choice is motivated by two observations: 1) the more Data or Patterns there are, the higher the information content; 2) a single Datum related to multiple Patterns, or a single Pattern related to multiple Data, does not convey much more information.
Integration = (number of relations / number of distinct Data) − 1 = (R − D)/D. This choice is motivated by two observations: 1) Integration is maximal when a Pattern is related to every possible subset of Data; 2) Integration is zero when no set of Data is related to a shared Pattern (R = D).
1.4 IQ
IQ = number of bits of Information + Integration:

$$\mathrm{IQ} = \log_2\!\left(\frac{2DP}{D+P}\right) + \log_2\!\left(\frac{R}{D} - 1\right). \tag{1.1}$$

On simplifying,

$$\mathrm{IQ} = 1 + \log_2 P + \log_2 (R - D) - \log_2 (D + P).$$
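As a rough illustration of Eq. 1.1, the following Python sketch computes Information, Integration and IQ from the three counts D, P and R. The function names are hypothetical and not part of the original work. The logarithm of the Integration is not finite when R ≤ D, so the sketch returns negative infinity in that case; this handling is an assumption of the sketch, not a prescription from the text.

```python
import math


def information(d, p):
    """Harmonic mean of the number of distinct Data and Patterns: 2DP/(D+P)."""
    return 2 * d * p / (d + p)


def integration(d, r):
    """Relations per distinct Datum, minus one: (R - D)/D."""
    return (r - d) / d


def iq(d, p, r):
    """Eq. 1.1: log2(Information) + log2(Integration).

    Assumption of this sketch: when R <= D the Integration is not positive
    and its logarithm is not finite, so negative infinity is returned.
    """
    if r <= d:
        return float("-inf")
    return math.log2(information(d, p)) + math.log2(integration(d, r))


def iq_simplified(d, p, r):
    """Simplified form: 1 + log2 P + log2 (R - D) - log2 (D + P)."""
    if r <= d:
        return float("-inf")
    return 1 + math.log2(p) + math.log2(r - d) - math.log2(d + p)


# Marks D = 2, P = 3, R = 3 from Fig. 1.2: Inf. = 2.4, Int. = 0.5
print(information(2, 3), integration(2, 3), iq(2, 3, 3))
```

For the marks D = 2, P = 3, R = 3 listed in Fig. 1.2, the two forms of the IQ agree, which is a quick check on the simplification.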
[Figure 1.2: panels a to h show DP relation diagrams over symbols such as Triangle, Circle, Rectangle, Red, Blue, Line, Shape, Flag and Japan/Cricket, each accompanied by its marks.]
Legend: D = # of distinct data, P = # of distinct patterns, R = # of relations, Information (Inf.) = 2DP/(D+P), Integration (Int.) = (R-D)/D.
Marks shown in the panels:
D      P      R      Inf.   Int.
2      2      2      2      0
3      3      3      3      0
4      4      4      4      0
2      3      3      2.4    0.5
3      8      7      4.4    1.33
4      9      9      5.5    1.25
3+N    8+N    7+N    N+3    4/N
Figure 1.2: a) A DP relation table for an elementary system. b) Marks that determine the IQ of a DP relation. c) A system with higher Information. d) A system with higher Integration. e) Two systems with equal IQ. They can answer the same set of questions, such as 1) Is it Red? 2) Is it a Red Triangle? However, the second table can answer conceptual questions substantially faster, e.g. What color is it? Such questions only need to be checked for a relation within the class of patterns corresponding to 'color'. This factor becomes more prominent for large tables. f) A highly integrated system, with a pattern related to all possible combinations of data. It also has 2 distinct patterns for the same combination of data (Japan/Cricket). Those patterns can act as linkers when encountering newer experiences and facilitate g) "thinking out of the box". On adding a new experience that is linkable to the previous system, we see that the number of relations can increase faster than the number of patterns. However, in the case of h) adding unrelated experiences, integration decays due to an insufficient rise in the number of new relations.
[Figure 1.3 panel labels: Normalize I-V measurements (columns Vin1 ..., Vc1 ..., Iout); Slice experiences into Datum/Pattern (Datum from the Vin columns, Pattern from the Vc and Iout columns); Select experiences; Label vectors (binning); Assign meaning.]
Figure 1.3: Steps involved in going from experimental measurements on a physical system to a DP relation. To use the full thinking capacity of this physical system, we must select a sufficiently high number of bits of experiences and relate them using an optimum binning threshold to cover the entire space (thinking coverage = bits*threshold). In practice, however, we might be constrained to operate in the low-coverage regime, because high coverage requires more energy, more physical space, a higher SNR, etc. See the appendix for further explanation.
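The steps of Fig. 1.3 can be sketched in Python for the simplest case of hard, non-overlapping bins. Everything specific here is an assumption made for illustration: the helper names, the choice of labelling each normalized coordinate by floor(value / threshold), and the three toy experiences (one input voltage, one control voltage, one output current), which are not the measured data.

```python
import math


def bin_label(vector, threshold):
    """Label a normalized vector by the index of the bin each coordinate falls in.

    Assumption of this sketch: bins are hard and of width `threshold`.
    """
    return tuple(math.floor(x / threshold) for x in vector)


def dp_counts(experiences, n_inputs, threshold):
    """Slice each experience into (Datum, Pattern), bin both, and count D, P, R.

    The first `n_inputs` entries of a row form the Datum (input voltages);
    the remaining entries form the Pattern (control voltages and output current).
    """
    relations = set()
    for row in experiences:
        datum = bin_label(row[:n_inputs], threshold)
        pattern = bin_label(row[n_inputs:], threshold)
        relations.add((datum, pattern))
    data = {d for d, _ in relations}
    patterns = {p for _, p in relations}
    return len(data), len(patterns), len(relations)


# Toy normalized experiences (hypothetical): (Vin, Vc, Iout)
toy = [(0.28, 0.79, 0.39), (0.29, 0.70, 0.80), (0.09, 0.63, 0.49)]

for threshold in (0.25, 0.5, 2.0):
    print(threshold, dp_counts(toy, n_inputs=1, threshold=threshold))
```

With a very large threshold every experience falls into one bin, so D = P = R = 1, which matches the overthinking limit described in the appendix.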
1.5 Calculating D, P, R for continuous experiences
At a high level, we can transform the IV characteristics (continuous experiences) into a table of DP relations (as seen in Fig. 1.2) by following the scheme shown in Fig. 1.3. When working with continuous systems, however, experiences in overlapping bins cannot be assigned a whole meaning (one that contributes a full +1 to D, P or R). To work around this, we introduce the notions of clarity and confusion. For an illustration, see Fig. 1.4.
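The clarity rule stated in the caption of Fig. 1.4 (clarity of a Pattern = 1/number of Patterns inside the bin centred on it) can be sketched as follows. The Euclidean distance, the inclusive comparison with the threshold and the coordinates of the four Patterns are assumptions of this sketch; the coordinates were chosen only so that the neighbour counts match the clarities shown in the figure.

```python
import math


def clarity(centre, patterns, threshold):
    """Clarity of the Pattern at `centre`: 1 / (number of Patterns inside its bin).

    Assumption: a Pattern is inside the bin if its Euclidean distance to the
    bin centre is at most `threshold`. The centre itself always counts.
    """
    inside = sum(1 for other in patterns if math.dist(centre, other) <= threshold)
    return 1.0 / inside


def effective_pattern_count(patterns, threshold):
    """P = sum of the clarities of all Patterns."""
    return sum(clarity(c, patterns, threshold) for c in patterns)


# Hypothetical 2D positions for the four Patterns of Fig. 1.4
patterns = {
    "triangle": (0.90, 0.90),
    "darkblue": (0.10, 0.10),
    "blue": (0.10, 0.18),
    "lightblue": (0.10, 0.26),
}
points = list(patterns.values())

for name, point in patterns.items():
    print(name, clarity(point, points, threshold=0.1))
print("P =", effective_pattern_count(points, threshold=0.1))
```

With these positions the blue Pattern shares its bin with both neighbours (clarity 1/3), the darkblue and lightblue Patterns each share with blue only (clarity 1/2), and the triangle is alone (clarity 1), so P sums to about 2.33.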
2 Experimental Results
1. As explained in section 1, we obtain a table of DP relations by performing electrical measurements on a nanoparticle network². We find (see Fig. 2.1) that, as expected, the IQ drops when the system is unconscious (low number of bits), overthinking (high value of bits*threshold) or underthinking (low value of bits*threshold). The IQ of our physical system is better than that of random systems under practical constraints on thinking coverage (which is limited by energy, size, noise, etc.).
² To improve the IQ of the system, we only consider the regime where single-electron physics dominates. Therefore, a cluster of a few hundred nanoparticles of size 20 nm was trapped within an area of 200 nm and cooled down to 300 mK. On the surrounding electrodes, we applied a set of 8 voltages of up to ~100 mV and measured an output current of up to ~1 nA. Four voltages were treated as inputs and the other four as controls. The DP relation is then obtained by following the procedure in Fig. 1.3.
[Figure 1.4: Patterns plotted against Data, each Pattern labelled with its clarity: triangle (1), blue (1/3), darkblue (1/2), lightblue (1/2).]
Figure 1.4: Confusion and Clarity. Four distinct IV measurements are converted to a table of DP vectors (plotted in a 2D hyperplane). For a binning threshold = 0.1, the table has 3 distinct Data and 4 Patterns (1 clear + 3 confused). The clarity of a Pattern defining the bin centre = 1/# of Patterns inside the bin. Here, D = 3, P = Σ clarity(Pattern) = 2.33 and R = 4. Note that here R > P because the relations are still preserved amidst the fuzzy patterns that lack clarity. In contrast, the clear DP relations in Fig. 1.2 will always have R ≤ P.
[Figure 2.1: three image plots of IQ as a function of threshold (0.0 to 1.0) and bits: a) IQ_rand, b) IQ_phy, c) IQ_phy − IQ_rand, with a dashed line where IQ_phy = IQ_rand.]
Figure 2.1: (a) The image plot shows the IQ of random systems for various thresholds and numbers of bits. The contour lines IQ_rand are obtained by fitting the function IQ = α*bits − γ*(bits*threshold − β)^2. (b) Similar plots for the physical system, with contour lines IQ_phy. (c) This image plot shows the difference in IQ between the physical system and random systems. The dashed line satisfies IQ_phy = IQ_rand and, as expected from the fit, the physical system usually has a higher IQ in the region below it. That region also overlaps with the low-coverage region (thinking coverage = bits*threshold). Hence, our physical system has a better IQ under constraints on the possible thinking coverage.
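To make the shape of the fit function in Fig. 2.1 concrete, the sketch below evaluates IQ = α*bits − γ*(bits*threshold − β)^2 and the threshold at which it peaks for a fixed number of bits. The function names and the parameter values are hypothetical, chosen only for illustration; they are not the fitted values. The sketch also assumes γ > 0, so that the quadratic term penalises coverage on either side of β.

```python
def iq_fit(bits, threshold, alpha, beta, gamma):
    """Fit function from the Fig. 2.1 caption, with coverage = bits * threshold."""
    coverage = bits * threshold
    return alpha * bits - gamma * (coverage - beta) ** 2


def critical_threshold(bits, beta):
    """Threshold that maximises the fit at fixed bits (assuming gamma > 0).

    The quadratic penalty vanishes when bits * threshold equals beta.
    """
    return beta / bits


# Hypothetical parameters, for illustration only
alpha, beta, gamma = 1.0, 2.0, 0.5
bits = 8

t_star = critical_threshold(bits, beta)
print(t_star, iq_fit(bits, t_star, alpha, beta, gamma))
print(iq_fit(bits, 0.5, alpha, beta, gamma))  # higher coverage than critical
print(iq_fit(bits, 0.0, alpha, beta, gamma))  # zero coverage
```

In this form β plays the role of the critical thinking coverage: at fixed bits, the fitted IQ falls off on both sides of bits*threshold = β.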
Figure 2.2: IQ landscape of a feedforward neural network.
2. In Fig. 2.2, we show that the IQ of an artificial neural network increases with the number of neurons.
3 Avenues for future research
In future work, we plan to compare the nanoparticle network to other intelligent systems (such as artificial neural networks or living neural networks). It would also be interesting to derive the maximum achievable IQ by considering physical limits to computation³, and then compare it to the performance of our nanoparticle network (e.g. IQ vs temperature, number of NPs, number of control electrodes). We still need to show how well the IQ serves as a measure of the ability to solve practical problems (both by simulations on artificial neural networks and by experiments on physical systems).
4 Acknowledgments
Notions of integrated information as intelligence are inspired by existing literature, mainly ⁴. Experiments on nanoparticles were performed by S.K.B and C.P.L. W.G.v.d.W and H.J.B conceived the project and supervised. We thank the anonymous reviewers for their stimulating comments.
³ Lloyd, S. (2000). Ultimate physical limits to computation. Nature, 406(6799), 1047-1054.
⁴ Oizumi M, Albantakis L, Tononi G (2014). From the Phenomenology to the Mechanisms of Consciousness: Integrated Information Theory 3.0. PLoS Comput Biol 10(5): e1003588. doi:10.1371/journal.pcbi.1003588
5 Appendix
5.1 Critical Thinking
For a system with N bits selected,
• Lowering the threshold to 0 means that the system is underthinking (D = P = R = 2^N, IQ = 0 because nothing is inter-related).
• Increasing the threshold to ∞ means that the system is overthinking (D = P = R = 1, IQ = 0 because everything is inter-related).
• In between, there exists a critical threshold which yields IQmax.
Similarly, for a system with a fixed threshold,
• Using very few bits means that the system is unconscious.
• Beyond a certain number of bits, using more is wasteful because D, P, R saturate⁵ once the system is already fully conscious.
• In between, there exists a critical number of bits which yields IQmax.
Hence there also exists a critical thinking coverage = bits*threshold. Under practical constraints, however, we may prefer to operate at a slightly lower coverage, because
• higher thresholds require more space to operate upon the same amount of information;
• more bits cost more energy, require more space and need a higher SNR.
Thus, instead of thinking through the entire space of the physical system, it may be better to interconnect many low-coverage modules of it. Evolutionary algorithms may be useful to identify many such high-IQ, low-coverage regimes.
⁵ In the case of random systems, however, R keeps decreasing due to overthinking.