5.1 Rumelhart Semantic Network#

Note, the code in this notebook trains a network and may take a while to run. To save time, you can run the full notebook now while reading the introduction. Then, avoid rerunning each cell as you work through the notebook, since that might reset the training.

Introduction#

Multilayer NN in Psychology: Semantics#

Our minds are full of associations, presumably implemented as connection strengths between concepts. But associations can have a wide variety of structure – they are not merely symmetric (e.g. “dog” and “mail carrier” are associated but have an asymmetric relationship – the mail carrier is less likely to bite the dog than vice versa). Semantic networks are graphs used to model the relationships between concepts in the mind, and are a commonly used representation of knowledge. Their vertices are concepts and their edges represent connections between concepts. A concept that is a noun is connected to its characteristics, but also to the categories it is a member of. Nouns that share a category (e.g. birds and dogs are both animals) each have a connection to that category, and share a higher-order connection to any category to which that shared category itself belongs (e.g. animals belong to the category of living things).

In this way, semantic nets can contain a deductive logical hierarchy. One of the key features arising from this hierarchy is the “inheritance” property, which states that a concept belonging to a subcategory will inherit the properties of that subcategory and, by extension, any category to which the subcategory belongs.

To clarify, consider hearing the phrase “Blue-Footed Booby” for the first time. Initially, it contains very little information. If you are then told that it is the name of a bird, you’ll know that it must necessarily have feathers, a beak, feet, and eyes. You will also know that it is a living thing, and therefore grows and is comprised of cells. This is the inheritance property at work.
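To make the inheritance property concrete in code, here is a minimal sketch (illustrative only, not part of Rumelhart's model) of a semantic network as a Python dictionary, in which a concept's properties are collected by walking up its "is-a" links:

# A minimal sketch of property inheritance in a semantic net.
# The graph and property lists are illustrative examples.
is_a = {'canary': 'bird', 'bird': 'animal', 'animal': 'living thing'}
properties = {'canary': ['is yellow'],
              'bird': ['has feathers', 'can fly'],
              'animal': ['can move'],
              'living thing': ['can grow']}

def inherited_properties(concept):
    # Collect the concept's own properties plus those of every ancestor category
    props = []
    while concept is not None:
        props.extend(properties.get(concept, []))
        concept = is_a.get(concept)  # walk up the "is-a" link; None at the root
    return props

print(inherited_properties('canary'))
# ['is yellow', 'has feathers', 'can fly', 'can move', 'can grow']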

[Figure: example of a semantic network]

David Rumelhart used neural networks to model semantic networks, recreating structured knowledge representations using hidden layers and backpropagation learning.

In order to represent the semantic net computationally, Rumelhart came up with a particularly clever structure of a multi-layer network, with two input layers, two hidden layers, and four output layers.

[Figure: Rumelhart's semantic network architecture]

When networks of this structure were trained, it was found that the representational nodes, when adequately trained, came to encode distinguishing features of their inputs. They formed internal representations of the concepts similar to those created in the human mind.

Let us explore his idea by building it ourselves.

Installation and Setup#

%%capture
%pip install psyneulink

import numpy as np
import psyneulink as pnl
import matplotlib.pyplot as plt 

Rumelhart’s Semantic Network#

The first thing we will do is create a data set with examples and labels that we can train the net on. We’ll be using a data set similar to the one in Rumelhart’s original work on the semantic net (Rumelhart, Hinton, and Williams, 1986).

The “nouns” inputs are provided to you in a list, as are the “relations” (“is”, “has”, and “can”). The four output groups are: “I”, which is simply the list of nouns itself; “is”, a list of the largest categories and attributes a noun can belong to or have; “has”, its potential physical attributes; and “can”, the abilities it may possess.

The network takes as input a pair consisting of a noun and a relation, which forms a priming statement such as “daisy is a”. Its response should include the term “daisy” itself, a list of what a daisy is, and lists of what a daisy has and can do. This mimics the brain’s priming response, in which activation spreads to concepts related to the one explicitly stated.

This is a slightly simplified model of Rumelhart’s network, which learned associations through internal structure only, not through explicit updating for every output.

# Stimuli and Relations

nouns = ['oak', 'pine', 'rose', 'daisy', 'canary', 'robin', 'salmon', 'sunfish']
relations = ['is', 'has', 'can']
is_list = ['living', 'living thing', 'plant', 'animal', 'tree', 'flower', 'bird', 'fish', 'big', 'green', 'red',
           'yellow']
has_list = ['roots', 'leaves', 'bark', 'branches', 'skin', 'feathers', 'wings', 'gills', 'scales']
can_list = ['grow', 'move', 'swim', 'fly', 'breathe', 'breathe underwater', 'breathe air', 'walk', 'photosynthesize']
descriptors = [nouns, is_list, has_list, can_list]

truth_nouns = np.identity(len(nouns))

truth_is = np.zeros((len(nouns), len(is_list)))

truth_is[0, :] = [1, 1, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0]
truth_is[1, :] = [1, 1, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0]
truth_is[2, :] = [1, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0]
truth_is[3, :] = [1, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0] # <--- Exercise 1a What does this row represent?
truth_is[4, :] = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1]
truth_is[5, :] = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1]
truth_is[6, :] = [1, 1, 0, 1, 0, 0, 0, 1, 1, 0, 1, 0]
truth_is[7, :] = [1, 1, 0, 1, 0, 0, 0, 1, 1, 0, 0, 0]

truth_has = np.zeros((len(nouns), len(has_list)))

truth_has[0, :] = [1, 1, 1, 1, 0, 0, 0, 0, 0]
truth_has[1, :] = [1, 1, 1, 1, 0, 0, 0, 0, 0]
truth_has[2, :] = [1, 1, 0, 0, 0, 0, 0, 0, 0]
truth_has[3, :] = [1, 1, 0, 0, 0, 0, 0, 0, 0]
truth_has[4, :] = [0, 0, 0, 0, 1, 1, 1, 0, 0]
truth_has[5, :] = [0, 0, 0, 0, 1, 1, 1, 0, 0]
truth_has[6, :] = [0, 0, 0, 0, 0, 0, 0, 1, 1] # <--- Exercise 1b What does this row represent?
truth_has[7, :] = [0, 0, 0, 0, 0, 0, 0, 1, 1]

truth_can = np.zeros((len(nouns), len(can_list)))

truth_can[0, :] = [1, 0, 0, 0, 0, 0, 0, 0, 1]
truth_can[1, :] = [1, 0, 0, 0, 0, 0, 0, 0, 1] # <--- Exercise 1c What does this row represent?
truth_can[2, :] = [1, 0, 0, 0, 0, 0, 0, 0, 1]
truth_can[3, :] = [1, 0, 0, 0, 0, 0, 0, 0, 1]
truth_can[4, :] = [1, 1, 0, 1, 1, 0, 1, 1, 0]
truth_can[5, :] = [1, 1, 0, 1, 1, 0, 1, 1, 0]
truth_can[6, :] = [1, 1, 1, 0, 1, 1, 0, 0, 0]
truth_can[7, :] = [1, 1, 1, 0, 1, 1, 0, 0, 0]

truths = [[truth_nouns], [truth_is], [truth_has], [truth_can]]
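As a quick sanity check, we can confirm that each truth table has one row per noun and one column per descriptor (optional, but cheap insurance against typos in the rows above):

# Optional sanity check: each truth table is (number of nouns) x (number of descriptors)
assert truth_nouns.shape == (len(nouns), len(nouns))
assert truth_is.shape == (len(nouns), len(is_list))
assert truth_has.shape == (len(nouns), len(has_list))
assert truth_can.shape == (len(nouns), len(can_list))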

🎯 Exercise 1

In the above code, three rows are marked with comments. For each of these rows, identify the noun it represents and explain what the row encodes.

Exercise 1a

Remember:

nouns = ['oak', 'pine', 'rose', 'daisy', 'canary', 'robin', 'salmon', 'sunfish']
relations = ['is', 'has', 'can']
is_list = ['living', 'living thing', 'plant', 'animal', 'tree', 'flower', 'bird', 'fish', 'big', 'green', 'red',
           'yellow']
has_list = ['roots', 'leaves', 'bark', 'branches', 'skin', 'feathers', 'wings', 'gills', 'scales']
can_list = ['grow', 'move', 'swim', 'fly', 'breathe', 'breathe underwater', 'breathe air', 'walk', 'photosynthesize']

What does the row truth_is[3, :] represent?

truth_is[3, :] = [1, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0]
✅ Solution 1a

The 3 in truth_is[3, :] refers to the 4th noun in the list of nouns (remember indexing in Python starts with 0), which is “daisy”.

[1, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0] can be interpreted as a list of 12 booleans, where 1 means true and 0 means false. We map this to the is_list, which is a list of 12 concepts (a code snippet that performs this mapping follows the list below).

That means:

  • 1: “daisy” is living

  • 1: “daisy” is a living thing

  • 1: “daisy” is a plant

  • 0: “daisy” is not an animal

  • 0: “daisy” is not a tree

  • 1: “daisy” is a flower

  • 0: “daisy” is not a bird

  • 0: “daisy” is not a fish

  • 0: “daisy” is not big

  • 0: “daisy” is not green

  • 0: “daisy” is not red

  • 0: “daisy” is not yellow
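The same reading can be produced programmatically by zipping the row with is_list:

# Read the row off programmatically (uses the lists and truth_is defined above)
for attribute, flag in zip(is_list, truth_is[3, :]):
    print(f'"daisy" is {attribute}: {bool(flag)}')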

Exercise 1b

Remember:

nouns = ['oak', 'pine', 'rose', 'daisy', 'canary', 'robin', 'salmon', 'sunfish']
relations = ['is', 'has', 'can']
is_list = ['living', 'living thing', 'plant', 'animal', 'tree', 'flower', 'bird', 'fish', 'big', 'green', 'red',
           'yellow']
has_list = ['roots', 'leaves', 'bark', 'branches', 'skin', 'feathers', 'wings', 'gills', 'scales']
can_list = ['grow', 'move', 'swim', 'fly', 'breathe', 'breathe underwater', 'breathe air', 'walk', 'photosynthesize']

What does the row truth_has[6, :] represent?

truth_has[6, :] = [0, 0, 0, 0, 0, 0, 0, 1, 1]
✅ Solution 1b
  • salmon does not have roots

  • salmon does not have leaves

  • salmon does not have bark

  • salmon does not have branches

  • salmon does not have skin

  • salmon does not have feathers

  • salmon does not have wings

  • salmon has gills

  • salmon has scales

Exercise 1c

Remember:

nouns = ['oak', 'pine', 'rose', 'daisy', 'canary', 'robin', 'salmon', 'sunfish']
relations = ['is', 'has', 'can']
is_list = ['living', 'living thing', 'plant', 'animal', 'tree', 'flower', 'bird', 'fish', 'big', 'green', 'red',
           'yellow']
has_list = ['roots', 'leaves', 'bark', 'branches', 'skin', 'feathers', 'wings', 'gills', 'scales']
can_list = ['grow', 'move', 'swim', 'fly', 'breathe', 'breathe underwater', 'breathe air', 'walk', 'photosynthesize']

What does the row truth_can[1, :] represent?

truth_can[1, :] = [1, 0, 0, 0, 0, 0, 0, 0, 1]
✅ Solution 1c
  • pine can grow

  • pine cannot move

  • pine cannot swim

  • pine cannot fly

  • pine cannot breathe

  • pine cannot breathe underwater

  • pine cannot breathe air

  • pine cannot walk

  • pine can photosynthesize

Create the Input Set#

Here, we will create the input set for the nouns and relations, using one-hot encoding to represent each. Additionally, we will append a single constant input to each vector, which will serve as a bias input for the network.

def gen_input_vals(nouns, relations):
    # One-hot encode each noun, then append a constant 1 to each vector as a bias input
    rumel_nouns_bias = np.vstack((np.identity(len(nouns)), np.ones((1, len(nouns))))).T

    # Do the same for each relation
    rumel_rels_bias = np.vstack((np.identity(len(relations)), np.ones((1, len(relations))))).T

    return rumel_nouns_bias, rumel_rels_bias

nouns_onehot, rels_onehot = gen_input_vals(nouns, relations)

r_nouns = np.shape(nouns_onehot)[0]
c_nouns = np.shape(nouns_onehot)[1]
r_rels = np.shape(rels_onehot)[0]
c_rels = np.shape(rels_onehot)[1]

print('Nouns Onehot:\n', nouns_onehot)

print('\nRelations Onehot:\n', rels_onehot)

print('\nNouns Shape:\n', np.shape(nouns_onehot))

print('\nRelations Shape:\n', np.shape(rels_onehot))
Nouns Onehot:
 [[1. 0. 0. 0. 0. 0. 0. 0. 1.]
 [0. 1. 0. 0. 0. 0. 0. 0. 1.]
 [0. 0. 1. 0. 0. 0. 0. 0. 1.]
 [0. 0. 0. 1. 0. 0. 0. 0. 1.]
 [0. 0. 0. 0. 1. 0. 0. 0. 1.]
 [0. 0. 0. 0. 0. 1. 0. 0. 1.]
 [0. 0. 0. 0. 0. 0. 1. 0. 1.]
 [0. 0. 0. 0. 0. 0. 0. 1. 1.]]

Relations Onehot:
 [[1. 0. 0. 1.]
 [0. 1. 0. 1.]
 [0. 0. 1. 1.]]

Nouns Shape:
 (8, 9)

Relations Shape:
 (3, 4)

🎯 Exercise 2

Exercise 2a

This is the tensor nouns_onehot that was generated by the function gen_input_vals. What does the marked line represent?

[[1. 0. 0. 0. 0. 0. 0. 0. 1.]
 [0. 1. 0. 0. 0. 0. 0. 0. 1.]
 [0. 0. 1. 0. 0. 0. 0. 0. 1.]
 [0. 0. 0. 1. 0. 0. 0. 0. 1.] 
 [0. 0. 0. 0. 1. 0. 0. 0. 1.]
 [0. 0. 0. 0. 0. 1. 0. 0. 1.] # <-- Exercise 2a
 [0. 0. 0. 0. 0. 0. 1. 0. 1.]
 [0. 0. 0. 0. 0. 0. 0. 1. 1.]]
✅ Solution 2a

The marked line represents the biased input for robin. The 1 in the 6th position (index 5) represents the activation for robin, and the 1 in the 9th position is the bias input.

Exercise 2b

There seems to be a pattern in the shapes of the tensors:

  • nouns_onehot has a shape of (8, 9)

  • rels_onehot has a shape of (3, 4)

The general form of the shape seems to be (i, i+1)

Can you explain what i is and why the tensors take this shape?

✅ Solution 2b

The i is the number of elements. Without a bias term, the tensor would have shape (i, i) and look like an identity matrix. Since we append a bias term to each vector, the length of each vector increases by 1 while the number of vectors stays i. This is why the shape of the tensors is (i, i+1).
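A minimal sketch of the same construction for an arbitrary i (equivalent to the vstack-and-transpose approach used in gen_input_vals):

# Illustration of the (i, i+1) shape: i one-hot vectors, each with a trailing bias of 1
i = 4  # arbitrary example size
onehot_with_bias = np.hstack([np.identity(i), np.ones((i, 1))])
print(onehot_with_bias.shape)  # (4, 5), i.e. (i, i+1)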

Create the Network#

Now, we can set up the network according to Rumelhart’s architecture (see picture above).

First, let’s create the layers:

# Create the input layers

input_nouns = pnl.TransferMechanism(name="input_nouns",
                                    input_shapes=c_nouns,
                                    )

input_relations = pnl.TransferMechanism(name="input_relations",
                                        input_shapes=c_rels,
                                        )

# Create the hidden layers. Here we use a logistic activation function
# The number of hidden units is taken directly from Rumelhart's paper
hidden_nouns_units = 9
hidden_mixed_units = 16

hidden_nouns = pnl.TransferMechanism(name="hidden_nouns",
                                     input_shapes=hidden_nouns_units,
                                     function=pnl.Logistic()
                                     )

hidden_mixed = pnl.TransferMechanism(name="hidden_mixed",
                                     input_shapes=hidden_mixed_units,
                                     function=pnl.Logistic()
                                     )

# Create the output layers. Here we use a logistic activation function

output_I = pnl.TransferMechanism(name="output_I",
                                 input_shapes=len(nouns),
                                 function=pnl.Logistic()
                                 )

output_is = pnl.TransferMechanism(name="output_is",
                                  input_shapes=len(is_list),
                                  function=pnl.Logistic()
                                  )

output_has = pnl.TransferMechanism(name="output_has",
                                   input_shapes=len(has_list),
                                   function=pnl.Logistic()
                                   )

output_can = pnl.TransferMechanism(name="output_can",
                                   input_shapes=len(can_list),
                                   function=pnl.Logistic()
                                   )

Then, we can create the projections between the layers. We will use randomly initialized matrices to connect the mechanisms to each other.

# Seed for reproducibility
np.random.seed(42)

# First, we create the projections from the input layers

# input_nouns -> hidden_nouns
map_in_nouns_h_nouns = pnl.MappingProjection(
    matrix=np.random.rand(c_nouns, hidden_nouns_units),
    name="map_in_nouns_h_nouns",
    sender=input_nouns,
    receiver=hidden_nouns
)

# input_relations -> hidden_mixed
map_in_rel_h_mixed = pnl.MappingProjection(
    matrix=np.random.rand(c_rels, hidden_mixed_units),
    name="map_in_rel_h_mixed",
    sender=input_relations,
    receiver=hidden_mixed
)

# Then, we create the projections between the hidden layers

# hidden_nouns -> hidden_mixed
map_h_nouns_h_mixed = pnl.MappingProjection(
    matrix=np.random.rand(hidden_nouns_units, hidden_mixed_units),
    name="map_h_nouns_h_mixed",
    sender=hidden_nouns,
    receiver=hidden_mixed
)

# Finally, we create the projections from the hidden layers to the output layers

# hidden_mixed -> output_I
map_h_mixed_out_I = pnl.MappingProjection(
    matrix=np.random.rand(hidden_mixed_units, len(nouns)),
    name="map_h_mixed_out_I",
    sender=hidden_mixed,
    receiver=output_I
)

# hidden_mixed -> output_is
map_h_mixed_out_is = pnl.MappingProjection(
    matrix=np.random.rand(hidden_mixed_units, len(is_list)),
    name="map_h_mixed_out_is",
    sender=hidden_mixed,
    receiver=output_is
)

# hidden_mixed -> output_can
map_h_mixed_out_can = pnl.MappingProjection(
    matrix=np.random.rand(hidden_mixed_units, len(can_list)),
    name="map_h_mixed_out_can",
    sender=hidden_mixed,
    receiver=output_can
)

# hidden_mixed -> output_has
map_h_mixed_out_has = pnl.MappingProjection(
    matrix=np.random.rand(hidden_mixed_units, len(has_list)),
    name="map_h_mixed_out_has",
    sender=hidden_mixed,
    receiver=output_has
)

Finally, we create the composition. Here, we use pnl.AutodiffComposition to build a network capable of learning. This subclass of pnl.Composition is designed primarily for the efficient training of feedforward neural networks. It is mostly restricted to supervised learning, specifically backpropagation.

We then add all nodes and projections to the network via its pathways argument.

learning_rate = 1
# Create the AutoDiffComposition with the pathways

rumelhart_net = pnl.AutodiffComposition(
    pathways=[[input_nouns, map_in_nouns_h_nouns, hidden_nouns],
              [input_relations, map_in_rel_h_mixed, hidden_mixed],
              [hidden_nouns, map_h_nouns_h_mixed, hidden_mixed],
              [hidden_mixed, map_h_mixed_out_I, output_I],
              [hidden_mixed, map_h_mixed_out_is, output_is],
              [hidden_mixed, map_h_mixed_out_has, output_has],
              [hidden_mixed, map_h_mixed_out_can, output_can]],
    learning_rate=learning_rate
)

rumelhart_net.show_graph(output_fmt='jupyter')

Training the Network#

We aim to train our network on input pairs consisting of a noun and a relation.

On each run, we want to:

  1. Set the targets for the relevant outputs to their corresponding truth tables.

  2. Set the targets for the irrelevant outputs to zero.

Understanding the Concept

Imagine someone asks you which attributes from the “has” list a robin possesses. You would likely answer correctly by listing relevant attributes. However, you wouldn’t mention anything from the “can” list, since those attributes are not relevant to the question.

Why is this important?

This mirrors how we naturally learn. In elementary school, when asked a simple, closed-form question, we weren’t rewarded for giving a response that was related but didn’t actually answer the question.

For example:

Q: Can a canary fly?

True but irrelevant answer: A canary is a yellow living thing with wings that can breathe air.

While this statement is true, it does not directly answer the question. Our goal is to train the network in the same way: to focus on relevant responses.
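Concretely, for the input pair (“canary”, “has”), the targets would look like the following sketch (using the mechanisms and truth tables defined above; the training loop below builds these dictionaries for every noun-relation pair):

# Sketch: targets for the single input pair ("canary", "has").
# Only the "has" output gets the true attribute vector; "is" and "can" are zeroed.
canary = nouns.index('canary')
targets_for_canary_has = {
    output_I: truth_nouns[canary],        # identity output is always trained
    output_is: np.zeros(len(is_list)),    # irrelevant for a "has" question
    output_has: truth_has[canary],        # relevant: skin, feathers, wings
    output_can: np.zeros(len(can_list)),  # irrelevant for a "has" question
}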

Note, the following code will take a while to run.

# First, let's create zero arrays for the irrelevant outputs.
is_ir = np.zeros(len(is_list))
has_ir = np.zeros(len(has_list))
can_ir = np.zeros(len(can_list))

# We will train the model for `n_epochs` epochs and `tot_reps` repetitions
n_epochs = 5
tot_reps = 50

# Repeat the training for `tot_reps` repetitions
for reps in range(tot_reps):
    
    # Loop through each noun
    for noun in range(len(nouns)):
        # Here, we will store the inputs and targets in dictionaries
        inputs_dict = {
            input_nouns: [],
            input_relations: []
        }

        # Here, we will store the respective targets in dictionaries
        targets_dict = {
            output_I: [],
            output_is: [],
            output_has: [],
            output_can: []
        }

        # Loop through each relation
        for i in range(len(relations)):
            # Append the input to the dictionary
            inputs_dict[input_nouns].append(nouns_onehot[noun])
            inputs_dict[input_relations].append(rels_onehot[i])

            # Append the targets to the dictionary
            targets_dict[output_I].append(truth_nouns[noun])
            targets_dict[output_is].append(truth_is[noun] if i == 0 else is_ir)  # i == 0 -> "is" relation 
            targets_dict[output_has].append(truth_has[noun] if i == 1 else has_ir)  # i == 1 -> "has" relation
            targets_dict[output_can].append(truth_can[noun] if i == 2 else can_ir)  # i == 2 -> "can" relation
        
        # Train the network for `n_epochs`
        result = rumelhart_net.learn(
            inputs={
                'inputs': inputs_dict,
                'targets': targets_dict,
                'epochs': n_epochs
            },
            execution_mode=pnl.ExecutionMode.PyTorch
        )
    
    # Print a dot for each repetition to track progress
    print('.', end='')
    # Print a new line every 10 repetitions
    if (reps + 1) % 10 == 0:
        print()
..........
..........
..........
..........
..........

The above block of code trains the network using a set of nested loops. The innermost loop pairs the current noun with each of the three relations, building the training inputs and the targets for its “is”, “has”, and “can” relations, along with the identity (“I”) output.

After constructing the dictionaries, the middle loop, which iterates over the nouns, trains the network on those dictionaries for n_epochs.

The outermost loop simply repeats this training over all nouns for a set number of repetitions.

You are encouraged to experiment with changing the number of repetitions and epochs to see how the network learns best.

Evaluating the Network#

We now print the losses of the network over time.

exec_id = rumelhart_net.default_execution_id
losses = rumelhart_net.parameters.torch_losses.get(exec_id)

plt.xlabel('Update Number')
plt.ylabel('Loss Value')
plt.title('Losses for the Rumelhart Network')
plt.plot(losses)
print('last loss was: ', losses[-1])
last loss was:  [0.01592901]
[Figure: loss curve, “Losses for the Rumelhart Network”]

Inference#

On top of the loss, let’s test what output the network gives for a given input.

# Pick a random id (note: randint's upper bound is exclusive,
# so we pass len(...) unchanged to allow the last element to be chosen)

noun_id = np.random.randint(0, len(nouns))

relations_id = np.random.randint(0, len(relations))

input_dict = {
    input_nouns: [nouns_onehot[noun_id]],
    input_relations: [rels_onehot[relations_id]]
}

answer = rumelhart_net.run(input_dict)

print(f'What {relations[relations_id]} {nouns[noun_id]}?')

print(f'{nouns[np.argmax(answer[0])]}...')

for idx, el in enumerate(answer[1]):
    if el > 0.5:
        print(f'is {is_list[idx]}')
        
for idx, el in enumerate(answer[2]):
    if el > 0.5:
        print(f'has {has_list[idx]}')
        
for idx, el in enumerate(answer[3]):
    if el > 0.5:
        print(f'can {can_list[idx]}')
What is salmon?
salmon...
is living
is living thing
is animal
is fish
is big
is red

🎯 Exercise 3

When asked the question “What is a robin?”, answering “a robin can fly” is not the most relevant response, but it is not as false as saying “a robin is a fish”.

Try training the following network so that it sometimes answers “a robin can fly”, even though that is not the most relevant response.

rumelhart_net_with_ir = pnl.AutodiffComposition(
    pathways=[[input_nouns, map_in_nouns_h_nouns, hidden_nouns],
              [input_relations, map_in_rel_h_mixed, hidden_mixed],
              [hidden_nouns, map_h_nouns_h_mixed, hidden_mixed],
              [hidden_mixed, map_h_mixed_out_I, output_I],
              [hidden_mixed, map_h_mixed_out_is, output_is],
              [hidden_mixed, map_h_mixed_out_has, output_has],
              [hidden_mixed, map_h_mixed_out_can, output_can]],
    learning_rate=learning_rate
)
# implement your code here
✅ Solution 3

Instead of setting the values of the irrelevant outputs to zero, we “discount” them by a factor of .5. This way, the network learns to give the most weight to the relevant outputs, while still giving some weight to outputs that are irrelevant but true.

# We will train the model for `n_epochs` epochs and `tot_reps` repetitions
n_epochs = 5
tot_reps = 50

# Repeat the training for `tot_reps` repetitions
for reps in range(tot_reps):
    
    # Loop through each noun
    for noun in range(len(nouns)):
        # Here, we will store the inputs and targets in dictionaries
        inputs_dict = {
            input_nouns: [],
            input_relations: []
        }

        # Here, we will store the respective targets in dictionaries
        targets_dict = {
            output_I: [],
            output_is: [],
            output_has: [],
            output_can: []
        }

        # Loop through each relation
        for i in range(len(relations)):
            # Append the input to the dictionary
            inputs_dict[input_nouns].append(nouns_onehot[noun])
            inputs_dict[input_relations].append(rels_onehot[i])

            # Append the targets to the dictionary
            targets_dict[output_I].append(truth_nouns[noun])
            targets_dict[output_is].append(truth_is[noun] if i == 0 else truth_is[noun]*.5)   # "is": full target if relevant, else discounted by .5
            targets_dict[output_has].append(truth_has[noun] if i == 1 else truth_has[noun]*.5)  # "has": full target if relevant, else discounted by .5
            targets_dict[output_can].append(truth_can[noun] if i == 2 else truth_can[noun]*.5)  # "can": full target if relevant, else discounted by .5
        
        # Train the network for `n_epochs`
        result = rumelhart_net_with_ir.learn(
            inputs={
                'inputs': inputs_dict,
                'targets': targets_dict,
                'epochs': n_epochs
            },
            execution_mode=pnl.ExecutionMode.PyTorch
        )
    
    # Print a dot for each repetition to track progress
    print('.', end='')
    # Print a new line every 10 repetitions
    if (reps + 1) % 10 == 0:
        print()
..........
..........
..........
..........
..........

Evaluating the Network#

exec_id = rumelhart_net_with_ir.default_execution_id
losses = rumelhart_net_with_ir.parameters.torch_losses.get(exec_id)

plt.xlabel('Update Number')
plt.ylabel('Loss Value')
plt.title('Losses for the Rumelhart Network (with irrelevant)')
plt.plot(losses)
print('last loss was: ', losses[-1])
last loss was:  [0.00798728]
[Figure: loss curve, “Losses for the Rumelhart Network (with irrelevant)”]

After training, examine the network’s output for a given input.

# Pick a random id (again passing len(...) so the last element can be chosen)

noun_id = np.random.randint(0, len(nouns))

relations_id = np.random.randint(0, len(relations))

input_dict = {
    input_nouns: [nouns_onehot[noun_id]],
    input_relations: [rels_onehot[relations_id]]
}

answer = rumelhart_net_with_ir.run(input_dict)

print(f'What {relations[relations_id]} {nouns[noun_id]}?')

print(f'{nouns[np.argmax(answer[0])]}...')

for idx, el in enumerate(answer[1]):
    if el > 0.5:
        print(f'is {is_list[idx]}')
        
for idx, el in enumerate(answer[2]):
    if el > 0.5:
        print(f'has {has_list[idx]}')
        
for idx, el in enumerate(answer[3]):
    if el > 0.5:
        print(f'can {can_list[idx]}')
What has rose?
rose...
is flower
has roots
has leaves
can grow

Probe the Associations#

We can also probe the associations with each noun by providing only the noun as input to the network. This gives us a sense of the strength of the associations between that noun and each set of outputs.

noun_id = np.random.randint(0, len(nouns))  # upper bound exclusive, so the last noun can be chosen

input_n = {input_nouns: nouns_onehot[noun_id]}

# Associations when only relevant targets are trained
noun_out_I, noun_out_is, noun_out_has, noun_out_can = rumelhart_net.run(input_n)

# Associations when also irrelevant targets are trained
noun_out_I_with_ir, noun_out_is_with_ir, noun_out_has_with_ir, noun_out_can_with_ir = rumelhart_net_with_ir.run(input_n)
plt.plot(noun_out_I)
plt.xticks(np.arange(8),nouns)
plt.title(nouns[noun_id])
plt.ylabel('strength of association')
plt.show()


plt.plot(noun_out_is)
plt.xticks(np.arange(len(is_list)),is_list,rotation=35)
plt.title(nouns[noun_id] +' is')
plt.ylabel('strength of association')
plt.show()
plt.plot(noun_out_has)
plt.xticks(np.arange(len(has_list)),has_list,rotation=35)
plt.title(nouns[noun_id]+ ' has')
plt.ylabel('strength of association')
plt.show()
plt.plot(noun_out_can)
plt.xticks(np.arange(len(can_list)),can_list,rotation=35)
plt.title(nouns[noun_id]+' can')
plt.ylabel('strength of association')
plt.show()
[Figures: association strengths of the base network for the sampled noun (identity, “is”, “has”, and “can” outputs)]
plt.plot(noun_out_I_with_ir)
plt.xticks(np.arange(8),nouns)
plt.title(f'{nouns[noun_id]} (with irrelevant)')
plt.ylabel('strength of association')
plt.show()


plt.plot(noun_out_is_with_ir)
plt.xticks(np.arange(len(is_list)),is_list,rotation=35)
plt.title(f'{nouns[noun_id]} is (with irrelevant)')
plt.ylabel('strength of association')
plt.show()
plt.plot(noun_out_has_with_ir)
plt.xticks(np.arange(len(has_list)),has_list,rotation=35)
plt.title(f'{nouns[noun_id]} has (with irrelevant)')
plt.ylabel('strength of association')
plt.show()
plt.plot(noun_out_can_with_ir)
plt.xticks(np.arange(len(can_list)),can_list,rotation=35)
plt.title(f'{nouns[noun_id]} can (with irrelevant)')
plt.ylabel('strength of association')
plt.show()
[Figures: association strengths of the network trained with discounted irrelevant targets (identity, “is”, “has”, and “can” outputs)]

🎯 Exercise 4

Exercise 4a

Compare the associations between the two networks. What do you observe?

✅ Solution 4a

The network trained with discounted (rather than zeroed) irrelevant targets has overall stronger associations. This is because it is not penalized as heavily for activating outputs that are true but irrelevant, and can thus learn stronger associations between the nouns and the relations.

Note: Setting the irrelevant targets to a small nonzero value rather than 0 can help the network learn and converge faster, since it can assign weight to associated values even when they are not relevant to the exact input.
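One way to see this, assuming both networks above have been trained, is to overlay their loss curves:

# Overlay the two loss curves to compare convergence speed
losses_base = rumelhart_net.parameters.torch_losses.get(rumelhart_net.default_execution_id)
losses_ir = rumelhart_net_with_ir.parameters.torch_losses.get(rumelhart_net_with_ir.default_execution_id)

plt.plot(losses_base, label='zero targets for irrelevant outputs')
plt.plot(losses_ir, label='targets discounted by 0.5')
plt.xlabel('Update Number')
plt.ylabel('Loss Value')
plt.legend()
plt.show()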

Exercise 4b

Try setting the discount factor to different values and try to predict the associations (you can lower the number of training trials). What do you observe?

✅ Solution 4b

The smaller the discount factor (i.e., the stronger the discounting), the weaker the resulting associations.