Technologies for an Intelligent Web
Francis Heylighen
Center Leo Apostel, Vrije Universiteit Brussel

What is intelligence?
- capacity for problem-solving in the widest sense
- problem = difference between the perceived and the preferred situation
  - input = perception, output = plan for action
- problem-solving =
  - efficiently exploring a mental map
  - includes interpretation, search, inference, decision-making, etc.
  - selecting the adequate combination of resources to go from the present state to the desired state
- requires a mental map, or knowledge
  - representation of problem states and resources

Collective intelligence
- synergy
  - when the group can find more or better solutions than the sum of the solutions found by all members individually
- requires a Collective Mental Map (CMM)
  - integrated sum of all individual knowledge
  - read/write access for all people
- no individual or computer can store a CMM for humanity
  - external, shared memory
  - requires a distributed representation and search
  - must self-organize: no centralized control possible
- the “web” can be made to function as a CMM

Global network - Global Brain?

The web as a collective mental map
- distributed knowledge system
  - sum total of individual contributions
  - coherent because of its interlinking
- global neural network
  - Web pages as neurons
  - hyperlinks as synapses
- problem-solving support
  - helps the user collect the resources that solve their problem
  - e.g. “find me …”
    - a second-hand video recorder
    - the quickest way to travel from here to there
    - the treatment that tackles symptoms
    - information about growing blueberries

Hypertext network

Network of Nodes and Links

Web as network of resources
- Nodes are any resources that can help solve problems
  - web documents
  - computer programs or databases
  - software agents
  - products: fridges, TVs, phones, ...
  - people
  - organizations, public or commercial
- Links are relations between resources
  - hyperlinks
  - people having access to other people/devices/organizations ...
  - relations between databases or programs

Links as relations
- links can have types
  - e.g. “is author of”, “cites”, “lives in”, “works for”, “is a type of” ...
- links can have weights
- link weights measure degree of association
  - reflects the effort needed for one resource to “access” or “connect to” the other (less effort = stronger link)
  - e.g. order in which telephone numbers are listed in cellular phone memory
    - first ones are easier to access
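
Typed, weighted links can be sketched as a small data structure. This is only an illustration: the `Link` class, the `neighbors` helper, and all names and weights below are hypothetical and do not come from the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    source: str
    target: str
    link_type: str  # e.g. "cites", "works for"
    weight: float   # degree of association; higher = easier to access


def neighbors(links, node, link_type=None):
    """Return (target, weight) pairs reachable from node, strongest first.

    If link_type is given, only links of that type are followed.
    """
    out = [
        (link.target, link.weight)
        for link in links
        if link.source == node and (link_type is None or link.link_type == link_type)
    ]
    return sorted(out, key=lambda pair: pair[1], reverse=True)


# Hypothetical example data.
links = [
    Link("Alice", "Paper A", "is author of", 0.9),
    Link("Paper A", "Paper B", "cites", 0.4),
    Link("Alice", "Brussels", "lives in", 0.7),
    Link("Alice", "University X", "works for", 0.8),
]

print(neighbors(links, "Alice"))
print(neighbors(links, "Alice", link_type="lives in"))
```

Sorting by weight mirrors the phone-memory example: the strongest associations are listed first and are therefore the easiest to reach.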

Metasystem Transitions in the Brain
- one-to-one communication
  - direct transmission
  - traditional media: phone, post, ...
- many-to-many communication
  - integrating and processing different signals
  - this is the level of the present web
- learning
  - creating/adapting connections from experience
- thought
  - exploring combinations never experienced together
- discovery
  - developing new concepts, rules and models

Learning Webs
- let the web learn from the way it is used
  - optimize connection between initial and desired states
- assumption: users go from a web page to a relevant page
- when a link between two pages is used, its weight is increased
  - unused links are correspondingly weakened
- indirect links too are reinforced
  - if the user goes A -> B and B -> C, then A -> C is also reinforced
  - creates shortcuts for often travelled paths
- turns the web into an associative network
  - the more associated the nodes, the stronger their connection
  - organization similar to the brain
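
A minimal sketch of these learning rules, assuming one simple way to combine them. The function name `learn_from_path` and the values of `bonus`, `transitive_bonus` and `decay` are illustrative assumptions, not the parameters used by the author.

```python
from collections import defaultdict


def learn_from_path(weights, path, bonus=1.0, transitive_bonus=0.5, decay=0.99):
    """Update link weights in place from one user's path through the web.

    weights: defaultdict(float) mapping (source, target) -> weight.
    path:    list of pages in the order the user visited them.
    """
    # Every known link decays a little; links that are used below
    # get more than that back, so unused links weaken relative to them.
    for key in list(weights):
        weights[key] *= decay
    # Direct links: each step A -> B along the path is reinforced.
    for a, b in zip(path, path[1:]):
        weights[(a, b)] += bonus
    # Indirect links: A -> B -> C also reinforces the shortcut A -> C.
    for a, c in zip(path, path[2:]):
        if a != c:
            weights[(a, c)] += transitive_bonus
    return weights


weights = defaultdict(float)
learn_from_path(weights, ["A", "B", "C"])
print(dict(weights))
```

Repeated over many users, often travelled paths accumulate strong shortcut links while rarely used links fade.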

The Learning Web Experiment
- performed by Johan Bollen and myself
- 150 most frequent English nouns
- each word gets one web page
- each page is linked randomly to 10 other pages/words
- users are asked to choose the best association out of 10
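
The initial state of such an experiment could be set up roughly as follows. The actual noun list is not given in the slides, so placeholder words are used here; `build_random_web` and the fixed `seed` are assumptions made for illustration.

```python
import random


def build_random_web(words, links_per_page=10, seed=0):
    """Give every word a page linked to randomly chosen other words."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    web = {}
    for word in words:
        others = [w for w in words if w != word]  # a page never links to itself
        web[word] = rng.sample(others, links_per_page)
    return web


# Placeholder vocabulary standing in for the 150 most frequent English nouns.
words = [f"noun{i}" for i in range(150)]
web = build_random_web(words)
print(len(web), len(web["noun0"]))
```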

The Learning Web Experiment

Results from the experiment

Associative Network from Experiment

Hebbian Rule for Web Learning
- Connection is strengthened in proportion to joint activation
- Activation = degree of “usefulness” for user
  - explicit evaluation by user
  - implicit evaluation derived from
    - duration of visit
    - bookmarking, saving, printing, ordering, etc.
- Joint activation = usage by same user
  - product of activation degrees
  - activation can be negative -> link weakened
    - if user dislikes resource
  - activation decays exponentially
  - reinforcement decays with interval between usages
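
One way to read this rule is: weight change = rate x activation of A x activation of B x exponential decay over the time between the two usages. The sketch below assumes exactly that form; the `rate` and `half_life` parameters and their default values are hypothetical.

```python
import math


def hebbian_update(weight, act_a, act_b, dt, rate=0.1, half_life=60.0):
    """Return the new weight of link A -> B after one joint usage.

    act_a, act_b: activation (usefulness) of the two resources for the
                  same user; may be negative if the user disliked one.
    dt:           time between the two usages, in the same unit as half_life.
    """
    # Exponential decay: the reinforcement halves every half_life units.
    decay = math.exp(-math.log(2) * dt / half_life)
    # Joint activation is the product of the two activation degrees.
    return weight + rate * act_a * act_b * decay


print(hebbian_update(0.5, 1.0, 1.0, dt=0.0))    # used together, both liked
print(hebbian_update(0.5, 1.0, -1.0, dt=0.0))   # second resource disliked
print(hebbian_update(0.5, 1.0, 1.0, dt=600.0))  # long interval, weak effect
```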

Spreading activation
- Associative networks can be explored in parallel
  - users can only move sequentially between nodes
- “input” nodes can be activated simultaneously
  - activation follows associative links to other nodes
  - these are in turn activated, proportionally to link strength
- thus, activation spreads over a semantic neighborhood
- primitive form of “thinking”
  - exploring different combinations of concepts
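
The mechanism can be sketched in a few lines. The `spread` function, the `retention` parameter, and the link weights in the example graph are assumptions chosen for illustration; the node names are borrowed from the illustration slide.

```python
def spread(graph, activation, steps=1, retention=0.0):
    """Spread activation through a weighted graph.

    graph:      {node: {neighbor: link_weight}}
    activation: {node: activation_level} for the input nodes
    retention:  fraction of its own activation a node keeps each step
    """
    for _ in range(steps):
        # Each node keeps only a fraction of its current activation...
        nxt = {node: level * retention for node, level in activation.items()}
        # ...and passes activation on in proportion to link strength.
        for node, level in activation.items():
            for neighbor, weight in graph.get(node, {}).items():
                nxt[neighbor] = nxt.get(neighbor, 0.0) + level * weight
        activation = nxt
    return activation


# Hypothetical weights.
graph = {
    "river": {"water": 0.8, "bank": 0.5},
    "money": {"bank": 0.7, "rate": 0.4},
    "bank": {"financial institution": 0.6, "river edge": 0.6},
}

# Two input nodes are activated simultaneously.
result = spread(graph, {"river": 1.0, "money": 1.0})
print(sorted(result.items(), key=lambda item: -item[1]))
```

Because "bank" receives activation from both input nodes at once, it ends up more activated than any node reached from only one of them, which a user following links one at a time could not discover in a single step.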

Spreading activation illustration
[Figure: associative network with node labels sea, bird, river, water, river edge, ground, seagull, bank, support, building, financial institution, sit, money, rate]


Personalized Recommendations
- agent collects appreciated items
  - e.g. liked pages, music records, concepts
- by spreading activation from these elements, the agent tries to find associated items, e.g.
  - related pages, similar records
  - pages related to all concepts
    - e.g. “paper”, “work”, “room” -> “office”
- the agent “recommends” the most activated items
  - these are most likely to please the user
- similar to collaborative filtering
  - recommend items appreciated by people with similar tastes
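
A rough sketch of such an agent, using a single spreading step from the liked items. The `recommend` function and the association weights are hypothetical; only the “paper”, “work”, “room” -> “office” example comes from the slide.

```python
def recommend(graph, liked, top_n=3):
    """Recommend the items most activated by one spreading step from liked items."""
    scores = {}
    for item in liked:
        for neighbor, weight in graph.get(item, {}).items():
            if neighbor not in liked:  # do not recommend what is already liked
                scores[neighbor] = scores.get(neighbor, 0.0) + weight
    # Highest activation first; ties broken alphabetically for stable output.
    return sorted(scores, key=lambda name: (-scores[name], name))[:top_n]


# Hypothetical association weights.
graph = {
    "paper": {"office": 0.6, "book": 0.7},
    "work": {"office": 0.8, "job": 0.9},
    "room": {"office": 0.5, "house": 0.8},
}

print(recommend(graph, ["paper", "work", "room"], top_n=1))
```

No single liked concept points most strongly to “office”, but it is the only item associated with all three, so its summed activation is the highest.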

Finding attractors
- If spreading is repeated many times, activation concentrates in “attractors” of the network
  - densely connected clusters of nodes
- equivalent to calculating eigenvectors of the linking matrix
- Application: finding “communities”
  - related pages on a subject
  - e.g. Kleinberg, CLEVER project
- Application: determining authority
  - Google’s PageRank algorithm
  - most “attractive” pages are most authoritative
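
Repeated spreading can be sketched as a power iteration. The code below is a simplified PageRank-style iteration written for illustration, not Google's actual implementation; the `damping` value, the handling of pages without outgoing links, and the example graph are assumptions.

```python
def authority_scores(links, damping=0.85, iterations=50):
    """Approximate the dominant eigenvector of the link matrix by repeated spreading.

    links: {page: [pages it links to]}
    """
    nodes = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Small uniform share so that activation never dies out completely.
        nxt = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = links.get(node, [])
            if targets:
                # Activation is divided equally over the outgoing links.
                share = damping * rank[node] / len(targets)
                for target in targets:
                    nxt[target] += share
            else:
                # Page without outgoing links: spread evenly over all pages.
                for target in nodes:
                    nxt[target] += damping * rank[node] / n
        rank = nxt
    return rank


# Hypothetical four-page web in which three pages link to C.
links = {"A": ["C"], "B": ["C"], "C": ["A"], "D": ["C"]}
scores = authority_scores(links)
print(max(scores, key=scores.get))
```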

Spreading Authority

Ill-Structured Problems
- User in general cannot formulate problem/goal/preferences
  - only vague associations
  - e.g. diarrhoea, constipation, cramps, colon, gas, bloating ...
  - implicit problem: “How to cure Irritable Bowel Syndrome?”
  - activate symptom resources
  - let activation spread
  - find most authoritative documents that solve problem
- The web “thinks ahead” of the user
  - takes into account implicit signs of interest
  - suggests solutions to problems the user may not even be aware of

The Semantic Web
- Spreading activation diffuses or ends up in attractors
  - loss of information with respect to initial state
- Constrained spreading activation = inference
  - follow only specific link or node types
  - allows activation to spread in a much more focused way
- Answering structured queries
  - e.g. lady works for client, lives in Washington, has son that goes to Princeton
    - link types “employed by”, “address”, “child of”, “studies at”, ...
  - e.g. appointment with nearest plumber within free hours
- Requires consensual ontologies
  - explicit taxonomies of types and their relations
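
The structured query from the slide can be sketched as following only specific link types over a set of (subject, link type, object) triples. The people, the employer name, and the helper functions below are hypothetical; only the four link types come from the slide.

```python
# Hypothetical facts as (subject, link_type, object) triples.
triples = {
    ("Ann", "employed by", "ClientCo"),
    ("Ann", "address", "Washington"),
    ("Tom", "child of", "Ann"),
    ("Tom", "studies at", "Princeton"),
    ("Eve", "employed by", "ClientCo"),
    ("Eve", "address", "Boston"),
}


def subjects(triples, link_type, target):
    """All subjects connected to target by a link of the given type."""
    return {s for s, t, o in triples if t == link_type and o == target}


def find_person(triples, employer, city, school):
    """People employed by employer, living in city, with a child studying at school."""
    candidates = subjects(triples, "employed by", employer) & subjects(triples, "address", city)
    students = subjects(triples, "studies at", school)
    # Keep only candidates who have at least one child among the students.
    return {p for p in candidates if subjects(triples, "child of", p) & students}


print(find_person(triples, "ClientCo", "Washington", "Princeton"))
```

Each constraint narrows the set of activated nodes instead of letting activation diffuse, which is why the link types must be agreed on beforehand in a shared ontology.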

Collective Development of Ontologies
- Ontological categories must be formal, unambiguous
  - very hard to develop manually
- Clustering
  - put “similar” items into same category
  - from soft associations to hard categories
- Bootstrapping
  - concepts defined by relations with other concepts
    - represented as column vectors of association matrix
  - concepts more similar if associations overlap more
  - similarity s can be calculated as the dot product of the two concepts' vectors
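
The formula from the original slide is not available here. In the notation assumed for this sketch, with m[k][a] the association strength in row k, column a of the association matrix, the dot product of the columns for concepts a and b is s(a, b) = sum over k of m[k][a] * m[k][b]. The concepts and weights below are hypothetical.

```python
def similarity(matrix, a, b):
    """Dot product of columns a and b of an association matrix (list of rows)."""
    return sum(row[a] * row[b] for row in matrix)


concepts = ["cat", "dog", "pet", "car", "road"]
# matrix[i][j] = hypothetical association strength from concept i to concept j.
matrix = [
    [0.0, 0.5, 0.9, 0.0, 0.0],  # cat
    [0.5, 0.0, 0.8, 0.0, 0.1],  # dog
    [0.9, 0.8, 0.0, 0.0, 0.0],  # pet
    [0.0, 0.0, 0.0, 0.0, 0.9],  # car
    [0.0, 0.1, 0.0, 0.9, 0.0],  # road
]

cat, dog, car = concepts.index("cat"), concepts.index("dog"), concepts.index("car")
print(similarity(matrix, cat, dog))
print(similarity(matrix, cat, car))
```

Here “cat” and “dog” come out as similar because both are strongly associated with “pet”, while “cat” and “car” share no associations. A clustering step could then group concepts whose similarity exceeds some threshold.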

Knowledge Discovery
- Web can autonomously create new knowledge
  - clustering -> new categories or concepts
  - rule: if (concept), then (other concept)
    - e.g. if banana, then yellow; if fire and gas, then explosion
  - system of concepts and rules -> knowledge
- Example: medical syndrome
  - huge database of persons, symptoms, treatments, etc.
  - clustering on the basis of symptoms -> distinguishing syndromes
  - correlating syndromes, treatments and outcomes -> finding best treatment for given syndrome
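
The slides do not say how such rules would be derived. One common approach, used here purely as an assumption, is to estimate the confidence of a rule “if A, then B” as the fraction of records containing A that also contain B. The records below are invented for illustration.

```python
def rule_confidence(records, antecedent, consequent):
    """Fraction of records containing all antecedent concepts that also contain the consequent."""
    antecedent = set(antecedent)
    matching = [record for record in records if antecedent <= record]
    if not matching:
        return 0.0  # no evidence for or against the rule
    return sum(1 for record in matching if consequent in record) / len(matching)


# Hypothetical records of concepts observed together.
records = [
    {"banana", "yellow"},
    {"banana", "yellow", "sweet"},
    {"banana", "green"},
    {"fire", "gas", "explosion"},
    {"fire", "water"},
]

print(rule_confidence(records, {"banana"}, "yellow"))
print(rule_confidence(records, {"fire", "gas"}, "explosion"))
print(rule_confidence(records, {"fire"}, "explosion"))
```

Rules whose confidence is high enough could be kept as candidate knowledge, for example a link from a cluster of symptoms to the treatment with the best recorded outcomes.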

Conclusion
- web can be seen as network of nodes and links
  - nodes = resources
- new links can be learned implicitly from usage
  - makes the web more efficient, intuitive, dense, ...
- network can be explored through spreading activation
  - allows vague, intuitive, unstructured queries
- ontologies can be used to structure web
  - allows concrete, explicit queries
- new structures can be mined from implicit relations
  - allows creation of ontologies, knowledge discovery