Toward an Integrative Approach for Making Sense Distinctions
3.1. Formal Polysemy
Formal polysemy is perhaps the most obvious and widely used method for making sense distinctions in practice, as it relies principally on human reasoning³. Here, distinctions are made on the basis of the definitions of concepts. For example, definitions of “wing” such as “a movable organ for flying” and “a stage area out of sight of the audience”⁴ are obviously different to any lexicographer and would be widely recognized as distinct senses by any speaker of English. These distinctions have mostly been formalized through the use of ontologies, with some success (Curtis et al., 2006; Prokofyev et al., 2013). In this case, the method is highly successful: the first sense refers to a physical object and the second to a location, and these are distinct elements in ontologies such as DOLCE (Gangemi et al., 2002), where the former would be a Non-agentive physical object (NAPO) and the latter a Place (PL). This distinction in the top-level hierarchy of DOLCE indicates that these are two distinct senses. Similarly, SUMO (Niles and Pease, 2003)⁵ is an ontology of top-level categories; however, it provides much wider coverage of the language, as it consists of 20,000 terms and mappings to 117,000 WordNet senses, compared with DOLCE's 100 terms. In SUMO, these two senses of wing are also placed differently in the hierarchy, with the two immediate superclasses being “limb” and “room,” both of which are ultimately subclasses of the general idea of “object.”
That being said, relying solely on hierarchical distinctions to separate senses is not sufficient. Firstly, there may be senses that a lexicographer would like to distinguish but that have no taxonomic distinction: “wing” meaning “one of the horizontal airfoils on either side of the fuselage of an airplane” would also be a NAPO in the DOLCE ontology. To solve this, lexicographers are advised to use definitions that consist not only of a genus (the class of something, e.g., a “sheep” is an animal) but also of at least one differentia (a unique characteristic of the concept, such as having wool) (Hartmann and James, 1998, p. 44). However, there is no clear idea of which differentiae would constitute a meaningful sense distinction, so it is hard to decide when to make one. The second problem is more significant: there are many cases where large differences in the genus of a term might not naturally constitute a secondary sense. For example, “rock” can refer to a single piece of rock, which would be a NAPO in the DOLCE taxonomy, but also to a material, which would be an amount of matter (M) in the DOLCE taxonomy⁶. This is an instance of systematic polysemy, where we coerce a reference to a material into a reference to an object made of that material; for example, if I say “bring me the M,” where M is a material, you understand that as the object made of that material. As such, this distinction could be considered unnecessary, yet most dictionaries happily make it for the word “rock,” while they are much less likely to do so for more specific words such as “crystal.” For example, Merriam-Webster, Oxford, and Wiktionary make this distinction for “rock” but not “crystal,” with English WordNet being one of the few that does so for both words.
SUMO has classes (“substance” and “corpuscular object”) that allow this distinction to be stated explicitly, although the current mapping maps both WordNet senses of “rock” to the same concept, which is subsumed by the class “substance,” leaving the distinction implicit in the mappings. This reading assumes that named concepts in SUMO are analogous to dictionary senses; however, SUMO is a formal model, and treating it as a dictionary misses many of its features. As a formal ontology, SUMO does not need to make this distinction explicitly, as it can be inferred by its associated NLP system, SigmaNLP, in a process analogous to the systematic polysemy discussed below. That is, a reader who reads a dictionary entry that does not make a substance-object distinction is capable of inferring the implicit sense, and as such the human reasoning described by the generative lexicon is analogous to the artificial intelligence method of SigmaNLP. For the same reason, many dictionaries do not make the substance-object distinction explicitly, and neither does SUMO; instead, it provides formal definitions of concepts, such as the one in Figure 3, which states that a rock is a solid composed of one or more minerals.

Figure 3
SUMO's modeling of “rock”.
At this point, it is important to introduce the school of thought that views different senses as related in some cases and therefore as derivable from each other in a predictable or systematic way. Consider, for instance, the related senses of “chicken” (animal) and “chicken” (meat), a derivational process which can, in principle, be replicated for any animal. That is, given a noun that denotes an animal (has the sense “animal”), we can predict in a systematic way that this noun can also denote the meat of this animal (has the sense “meat”). We can identify a similar pattern for plants and their fruits (cherry tree and cherry fruit), for plants or animals and the material that can be made from them (cotton plant and cotton material; crocodile animal and crocodile material), and for many others; see Apresjan (1974) or CORELEX (Buitelaar, 1998) for other examples.
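The regularity described above can be written down as a small rule table. The following is a minimal sketch, assuming a hand-picked set of patterns taken only from the examples in the text; the names `REGULAR_PATTERNS` and `derive_senses` are illustrative and do not come from CORELEX or any other resource.

```python
# Illustrative rule table: base semantic type -> sense types derivable from it.
# The patterns mirror the examples in the text (animal/meat, plant/fruit,
# plant or animal/material); they are not an exhaustive inventory.
REGULAR_PATTERNS = {
    "animal": ["meat", "material"],
    "plant": ["fruit", "material"],
}


def derive_senses(lemma, base_type):
    """Return (lemma, sense type) pairs predicted by the regular patterns."""
    derived = REGULAR_PATTERNS.get(base_type, [])
    return [(lemma, base_type)] + [(lemma, sense) for sense in derived]


print(derive_senses("chicken", "animal"))
print(derive_senses("cotton", "plant"))
```

Note that such a table predicts a derived sense for every noun of the base type, regardless of whether that sense is attested in usage.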
Given that this process can be viewed as systematic, researchers in lexical semantics who have studied this phenomenon have termed it “systematic polysemy” (Nunberg, 1992), “regular polysemy” (Apresjan, 1974), or “logical polysemy” (Pustejovsky and Bouillon, 1995). In the case of Pustejovsky, the “logical” nature of this form of polysemy arises from the predictable manner in which different senses can be generated from an underlying abstract meaning representation, known as “qualia structure” (Pustejovsky, 1998). Qualia structure represents the meaning of a noun by way of four “qualia roles” (Formal, Constitutive, Telic, and Agentive), each of which represents a core semantic aspect of the noun. According to Pustejovsky's Generative Lexicon theory, word senses can be dynamically generated from this representation on the basis of compositional semantic requirements. Also relevant in this context is the notion of “type coercion,” introduced by Pustejovsky to explain how a word can acquire a different sense (i.e., “change semantic type”) if the compositional semantic structure to which it contributes requires this.
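To make the four qualia roles and the idea of type coercion more concrete, the following is a minimal sketch. The entry for “novel” follows a commonly cited textbook illustration of the Generative Lexicon rather than anything in this paper, and the class `Qualia` and function `coerce_to_event` are hypothetical names introduced here.

```python
from dataclasses import dataclass


@dataclass
class Qualia:
    formal: str        # what kind of thing it is
    constitutive: str  # what it is made of or consists of
    telic: str         # its purpose or function
    agentive: str      # how it comes into being


# Hypothetical entry, used only for illustration.
NOVEL = Qualia(formal="book", constitutive="narrative",
               telic="read", agentive="write")


def coerce_to_event(qualia):
    """Events a noun can stand for when a verb such as "begin" needs an event."""
    return [qualia.telic, qualia.agentive]


# Candidate readings of "begin a novel": begin reading it or begin writing it.
print(coerce_to_event(NOVEL))
```

In this sketch the noun keeps a single lexical entry, and the event readings are generated on demand rather than listed as separate senses.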
In order to fully expand a formal model of senses, we need to consider not only the genus, through theories such as DOLCE, but also the differentiae, so that the different qualities of a sense are taken into account. One approach may be to use a formal language such as Abstract Meaning Representation (Banarescu et al., 2013, AMR), which has been shown to be a robust method for the representation of semantics. It is possible that robust approaches for parsing (Blloshmi et al., 2020) could be applied to existing dictionary definitions to allow for more formal reasoning. However, there remain two major issues for the more formal analysis of meaning by a language such as AMR. Firstly, there are many equivalent phrasings of the same definition; for example, “father-in-law” could be defined by several definitions, including:
father of spouse
male parent of spouse
father of wife or husband
male parent of wife or husband
Secondly, as we will show below with the analysis of “fish,” the actual set of differentiae can vary in importance and it is hard to check which differentiae are essential to the sense.
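The first issue, equivalent phrasings, can be illustrated with a naive normalization step. This is only a sketch under strong assumptions: the expansion table `EXPANSIONS` is hand-written for this one example, it operates on word tokens rather than on AMR graphs, and it is not a method proposed in the paper.

```python
# Hand-written, hypothetical expansion table covering only this example.
EXPANSIONS = {
    "father": "male parent",
    "spouse": "wife or husband",
}


def canonical(definition):
    """Expand each known word into its more primitive paraphrase."""
    return " ".join(EXPANSIONS.get(token, token) for token in definition.split())


definitions = [
    "father of spouse",
    "male parent of spouse",
    "father of wife or husband",
    "male parent of wife or husband",
]

# All four phrasings collapse to one canonical form.
print({canonical(d) for d in definitions})
```

Even for this small case the table has to be built by hand, which suggests why deciding equivalence automatically across a whole dictionary is difficult.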
3.2. Cognitive Polysemy
As sense distinctions are fundamentally a function of cognitive action, it makes sense to look for evidence of sense distinctions in cognitive experiments. The most direct way to do this with current technology is to measure brain activity using methods such as Functional Magnetic Resonance Imaging (fMRI) and see whether there are distinctions between different senses of words. Copland et al. (2007) did exactly this by looking at differences in brain activation between two senses of “bank” using priming concepts such as “money” and “river,” concluding that there are clear differences between these two senses. This supports the hypothesis that there are cognitive differences in how we approach homonyms. However, it is less clear whether the more subtle sense distinctions that lexicographers make can be clearly distinguished with such technologies, and similar studies have difficulty in detecting such distinctions because semantically related words “recruit similar regions” of the brain (Sachs et al., 2011). Other approaches, such as priming or directly asking participants, have also been investigated in the context of semantic ambiguity (Hino et al., 2006), but these have not yet been successfully applied to the task of making sense distinctions. Of course, asking participants whether they think words have the same meaning would find sense distinctions directly, but this is unlikely to be financially viable for all senses in a modern lexicographic workflow.
Given the challenges with such research, much work has attempted to understand cognitive connections between concepts in the brain by means of word association games, which are an effective and cheap way to measure cognitive associations (Szalay and Deese, 1978). Recently, a large database of such associations called the “Small World of Words” (De Deyne et al., 2019, SWOW) has been introduced, which allows us to directly study the associations made by thousands of speakers of English and 14 other languages. The natural method for this analysis is to look for clusters within the graph by means of community detection algorithms (Fortunato, 2010), which detect highly connected subgraphs. It is natural to suppose that these clusters would correspond to senses within the graph; for example, an ambiguous word like “bank” is connected to many other words that are also closely related to each other, such as “money,” “account,” “teller,” and “save.” Meanwhile, there are other connections listed in SWOW that have no further connections to this cluster, such as “river” and “water,” as well as smaller senses such as “piggy” and “sperm.” As such, it seems that such an analysis will naturally lead to the detection of homonyms, but it is less clear that more subtle sense distinctions can be inferred.
A recent study by Branco et al. (2020) has shown that graph-based analysis using the SWOW word association norm database can outperform even state-of-the-art word embedding models at predicting word similarity and provide competitive performance on tasks such as natural language inference with state-of-the-art methods. Although it is clear that the database does effectively capture sense distinctions that are widely used, there are also reasons to be sceptical about the information in this database for the task of making sense distinctions.
Firstly, it is clear that the word association database contains a large proportion of collocations and this introduces a bias in the database; for example, “bank” is the most used term when primed with the word “piggy” but the converse is much less frequent, that is, “piggy” is rarely suggested for the prime “bank.” Secondly, there are word senses, such as “bank” meaning a “flight manoeuvre,” that have no clear relation to any of the word associations. Further, it seems possible that some sense distinctions may be difficult to capture with word associations because they refer to unlexicalized concepts, such as certain kinds of movements. We also note that these databases normally do not distinguish different parts of speech in their data, so it is necessary to disaggregate the senses by part of speech as well. So, it may be the case that certain sense distinctions cannot be detected with this cognitive approach. The study of Branco et al. (2020) calls for “a unified account of lexical semantics,” and it seems that there are certainly strong synergies between the cognitive approach and the distributional method described in the next section, as both are able to effectively detect collocations. In fact, there has already been some work on automatically inferring word association norms (Reyes-Magaña et al., 2020) based on distributional word embedding models, and the success of this suggests that the information captured by the models is very similar.
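The asymmetry between a prime and its response can be made explicit by comparing forward and backward association strength. The sketch below assumes a simple definition of strength (the share of all responses to a cue) and uses invented counts; neither the numbers nor the names `RESPONSES` and `strength` come from SWOW.

```python
from collections import Counter

# Invented cue -> response counts for illustration only; not SWOW figures.
RESPONSES = {
    "piggy": Counter({"bank": 40, "pig": 35, "pink": 25}),
    "bank": Counter({"money": 60, "account": 20, "river": 15, "piggy": 5}),
}


def strength(cue, response):
    """Proportion of all responses to `cue` that were `response`."""
    counts = RESPONSES[cue]
    return counts[response] / sum(counts.values())


forward = strength("piggy", "bank")   # how often "piggy" elicits "bank"
backward = strength("bank", "piggy")  # how often "bank" elicits "piggy"
print(forward, backward)
```

A large gap between the two values is one possible signal that an edge reflects a collocation rather than a symmetric semantic relation.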
Another approach that holds some promise is the mapping of the senses directly with areas of the brain such as in the work of Kocoń and Maziarz (2021), where the areas of the brain are directly connected with the semantic graph of WordNet. The addition of such connections allows for a graph representation that performs better at NLP tasks than just the semantic network alone. As such, it seems that direct mapping of semantic senses with cognitive regions can be helpful in building semantic networks and thus making sense distinctions.
3.3. Distributional Polysemy
The distributional hypothesis that “you shall know a word by the company it keeps” (Firth, 1957) has quickly become the dominant paradigm within computational linguistics and natural language processing. In particular, this has been due to the emergence of word embedding models such as word2vec (Mikolov et al., 2018). These models rely on the distributional context of a word and convert it to a vector form that is readily usable for a wide range of further applications. In this way, these models can be considered more advanced versions of the collocation-based methods that are commonly used to make sense distinctions; they offer more discriminative power at the cost of results that are difficult to interpret and explain. The first word embedding models simply generated a single vector for each word, ignoring heteronyms, part-of-speech, and other distinctions that a lexicographer would typically make. However, it was quickly seen that such models were limited by not identifying senses, and attempts were made to produce distinct vectors for each sense based on existing sense catalogues such as WordNet, e.g., the AutoExtend method of Rothe and Schütze (2015). More recently, contextual word embedding models, most notably the BERT model (Devlin et al., 2019), have become popular, and these models create a distinct vector for each occurrence of a word. Recent studies have shown that these vectors are easily clustered into broad sense distinctions such as homonyms (Nair et al., 2020), but they have also shown that finer-grained sense distinctions are much less obvious from these vectors.
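The idea of clustering per-occurrence vectors into broad senses can be sketched as follows. The three-dimensional vectors are toy stand-ins, not output from BERT, and the greedy threshold grouping is a deliberately simple substitute for the clustering methods used in the cited studies; `OCCURRENCES`, `cosine`, and `cluster_occurrences` are hypothetical names.

```python
import math

# Toy vectors standing in for contextual embeddings of "fish" in four contexts.
OCCURRENCES = {
    "fish and chips": (0.9, 0.1, 0.2),
    "grilled fish": (0.8, 0.2, 0.1),
    "fish swim upstream": (0.1, 0.9, 0.3),
    "a school of fish": (0.2, 0.8, 0.4),
}


def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))


def cluster_occurrences(vectors, threshold=0.8):
    """Greedily add each occurrence to the first cluster whose seed is similar."""
    clusters = []
    for context, vec in vectors.items():
        for cluster in clusters:
            if cosine(vec, vectors[cluster[0]]) >= threshold:
                cluster.append(context)
                break
        else:
            clusters.append([context])
    return clusters


for cluster in cluster_occurrences(OCCURRENCES):
    print(cluster)
```

With these toy values the food contexts and the animal contexts fall into separate groups; with real contextual embeddings the separation for fine-grained senses is reported to be far less clean.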
Two particularly interesting works from the same research group have also shed light on the connection between contextual word embeddings and word senses. Firstly, Scarlini et al. (2020) showed that the use of an existing sense catalogue such as BabelNet (Navigli and Ponzetto, 2012), which is based on WordNet and Wikipedia as principal sources, can improve the quality of the sense embeddings created. Secondly, Generationary (Bevilacqua et al., 2020) is a system that can infer natural language definitions from contextual word embeddings, and it was shown that the definitions were very similar to those given in a traditional dictionary. As such, it seems clear that much information is captured by these models and that they can be an effective method for defining sense distinctions, but due to the opaque nature of these vectors it can be hard to explain the results of such systems. A common approach is to reduce these high-dimensional vectors to a 2-dimensional space so they can be visualized easily, by means of a method such as t-Distributed Stochastic Neighbor Embedding (t-SNE) (Van der Maaten and Hinton, 2008). That being said, such a representation is too simplistic to make good sense distinctions, as we will see below in Figures 4 and 5.
Figure 4
Visualizations of BERT embeddings for different uses of wings.
Figure 5
Visualizations of BERT embeddings for different uses of fish.
While distributional methods may have the potential to be a nearly universal solution to making sense distinctions, the method still has some weaknesses. Firstly, distributional models often struggle with less frequent senses, especially if the less frequent sense is not primarily used in specific collocations. Secondly, the interaction of systematic polysemy and distributional models is not clear. For example, systematic polysemy such as the food-animal distinction is clear in a distributional model, as “fish” may co-occur with words like “chips” or “swim” and these can easily be separated to deduce the two senses; yet for organization-building distinctions, such as for “school,” it is less clear whether there are co-occurring words that would make this distinction. Finally, these distributional models tend to be black boxes whose results are not easy to explain, so it is challenging to see how they may be accepted by working lexicographers alongside their other tools.
3.4. Intercultural Polysemy
A final principle used to make sense distinctions is to look at evidence from other languages. For example, the homonymous senses of “bank” can easily be distinguished, as there are very few other languages that use the same word for both a financial bank and a river bank. Similarly, languages that do not make a food-animal distinction can use the fact that English makes a distinction such as “mutton”/“sheep” as evidence for such distinctions. However, the question here is whether it really makes sense to rely on another language to make a sense distinction. For instance, a great number of languages distinguish lexically between male and female role words, e.g., “teacher” must be translated with respect to the gender of the people being referred to in German, French, Spanish, Italian, and many other languages, and it does not seem that this is a difference that a speaker of a language such as English would consider important.
For bilingual dictionaries, the notion of distinct senses is dependent on the nature of the translations into the other language. That is, an English-German dictionary would not list the translation of “fish” to “Fisch” twice to account for the food-animal polysemy, while an English-Spanish dictionary would have to, as Spanish has two translations (“pez” and “pescado”) according to this distinction. In the context of monolingual dictionaries, it seems much less certain whether such sense distinctions are appropriate and represent real distinctions that would be made by native speakers of that language. Further, the data we have presented here is mostly on European languages, and the effects of using languages from different families need further investigation, although it seems likely that sense distinctions would be clearer across very different languages.
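The “fish” example can be expressed as a small grouping exercise: usages that receive different translations in a language are evidence for a split in that language pair. This is a minimal sketch using only the translations mentioned in the text; the usage labels and the names `TRANSLATIONS` and `split_by_language` are illustrative.

```python
from collections import defaultdict

# (usage of English "fish", target language, translation), as described in the text.
TRANSLATIONS = [
    ("animal", "de", "Fisch"),
    ("food", "de", "Fisch"),
    ("animal", "es", "pez"),
    ("food", "es", "pescado"),
]


def split_by_language(rows):
    """For each language, group usages by the translation they receive."""
    groups = defaultdict(lambda: defaultdict(set))
    for usage, lang, word in rows:
        groups[lang][word].add(usage)
    return {lang: dict(words) for lang, words in groups.items()}


for lang, words in split_by_language(TRANSLATIONS).items():
    print(lang, len(words), "translation(s):", sorted(words))
```

Here German yields a single group and Spanish yields two, which mirrors why the two bilingual dictionaries would structure the entry differently.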
On the other hand, translation data is abundant due to the existence of large parallel corpora used in machine translation, as well as large multilingual lexical resources such as BabelNet (Navigli and Ponzetto, 2012) and Apertium (Forcada et al., 2011; Gracia et al., 2018). As such, it seems natural that these resources can provide important evidence for translation, and an approach by means of translation graphs and clustering algorithms could be highly effective. It should also be noted that the use of parallel texts has already been shown to be an effective method for distinguishing senses, and the “one homonym per translation” hypothesis (Hauer and Kondrak, 2020) closely matches the “one sense per discourse” model (Gale et al., 1992) already used as a principle for making sense distinctions.
3.5. Metaphors and Metonyms
We should also note the limits of methodologies for making sense distinctions as a dictionary cannot truly cover all usages of a sense that may occur in the corpus. This is due to the productive nature of language and the fact that new senses are continuously being created. It is common, especially in poetic language, to introduce metaphorical senses that are unlikely to be found in dictionaries. This is often the process by which new words are created as described by Nunberg (1987):
Metaphors begin their lives as novel poetic creations with marked rhetorical effects, whose comprehension requires a special imaginative leap. As time goes by, they become a part of general usage, their comprehension becomes more automatic, and their rhetorical effect is dulled.
As an example, the English word “overwhelm” has gradually lost its original meaning of “to flood (over),” to the point that the antonym “underwhelm” could enter common usage and many native speakers even speculate on the original meaning⁷. Such metaphors are conventionalized and can be considered a case of non-systematic polysemy; however, many other metaphors are productive and impossible to capture with a fixed list of senses in a dictionary. Metonymy is a very similar process, distinguished by the fact that the new sense is mapped to a concept within the same domain (Gibbs, 1999), and is thus generally less of a conscious decision by the author than metaphor. Recent work has shown strong results in the detection (Shutova, 2015; Zayed et al., 2020a) and the interpretation of metaphors (Zayed et al., 2020b). Therefore, a system for making sense distinctions should also be aware of metaphor and metonymy and be able to explain this to the user.
3.6. An Integrative Approach
The issue of sense distinctions is one of primary importance for lexicographers, and the idea of dictionaries as authorities in language is undermined by the wide variety of sense catalogues found in different dictionaries. The idea that this is simply due to distinctions between “splitters” and “lumpers” seems questionable, as there is a lot of variance in the number of senses a dictionary has, and computational users of dictionaries have been highly critical of the inventories in extant resources, especially WordNet⁸. As such, it would be highly useful to have a corpus-based system that lexicographers could use to analyse the meanings of words and form them into clusters.
The current state of the art in distributional semantic methods, especially with respect to contextual word embedding methods, seems able to satisfy this goal. However, a number of limitations exist, including that the black-box output of this method is not easy to translate into a word sense catalogue. Therefore, it seems that the power of the distributional method could be used to infer a formal representation of senses, such as by combining the methodology of Generationary with more formal approaches such as Abstract Meaning Representation. These methods could be further augmented with information from cognitive databases such as Small World of Words as well as multilingual parallel texts, but there are still major practical challenges in this. Moreover, it is necessary that such a system is adaptable to the needs of a given lexicographic project and can be tuned by means of parameters that fit the goals of the particular dictionary. In the remainder of this paper, we will look at how the theories can be applied for making sense distinctions and sketch the possibilities and challenges of combining them.
4.1. Formal Analysis
The formal analysis of the words is based on the existing definitions given in dictionaries as there is still much to be examined about more computational approaches such as Abstract Meaning Representation. We first look at the genera of “fish” in Table 1 and we list the noun and verb senses of the word as they appear in WordNet. These are then further distinguished in a more formal ontology, in this case SUMO (Niles and Pease, 2001).
Table 1
WordNet senses with SUMO mappings for the noun and verb “fish.”
| POS | WordNet sense | Direct hypernym | Indirect hypernyms | SUMO Mappings |
|---|---|---|---|---|
| Noun | Any of various mostly cold-blooded aquatic vertebrates usually having scales and breathing through gills | Aquatic_vertebrate | Craniate, vertebrate < chordate < animal, animate_being, beast, brute, creature, fauna < being, organism < animate_thing, living_thing < unit, whole < object, physical_object < physical_entity < entity | Fish (equivalent mapping) |
| Noun | The flesh of fish used as food | Food, solid_food | Solid < matter < physical_entity < entity | FishMeat (subsuming mapping) |
| Verb | Catch or try to catch fish or shellfish | Catch, grab, take_hold_of | < Clutch, prehend, seize < get_hold_of, take | Fishing (equivalent mapping) |
| Verb | Seek indirectly | Look_for, search, seek | | Investigating (subsuming mapping) |
In contrast to “fish,” the word “wing” has multiple unrelated senses in WordNet, as can be seen in Table 2. Most of these are clearly differentiated by the genus, with there being seven main senses identified: organ, artefact, grouping, hockey player, meat, flight formation, and addition. Some of these categories, however, seem somewhat arbitrary; a “wing” in the sense of a flight formation (10.), while semantically related to “flank” (6.), does not share any hypernyms with the latter, apart from the highly abstract “entity.” In other words, a formal analysis, at least when considered from the viewpoint of ontology and hypernymy, does not always seem to clearly and neatly capture the semantic inter-relationship between word senses.
Table 2
WordNet senses with SUMO mappings for the noun “wing” categorized according to hypernyms.
| WordNet sense | Direct hypernym | Indirect hypernyms | SUMO Mappings |
|---|---|---|---|
| 1. A movable organ for flying | Organ (a fully differentiated structural and functional unit in an animal that is specialized for some particular function) | Piece < thing < physical_entity < entity | Organ (subsuming mapping) |
| 2. Wing (one of the horizontal airfoils on either side of the fuselage of an airplane) | Airfoil, aerofoil, control surface, surface (a device that provides reactive force when in motion relative to the surrounding air; can lift or control a plane in flight) | Device < instrumentality, instrumentation < artefact, artifact < unit, whole < object, physical_object < physical_entity < entity | WingDevice (subsuming mapping) |
| 3. Wing, offstage, backstage (a stage area out of sight of the audience) | Stage (a large platform on which people can stand and can be seen by an audience) | Platform < horizontal_surface, level < surface < artefact, artifact < unit, whole < … | PerformanceStageWing (equivalent mapping) |
| 4. Fender, wing (a barrier that surrounds the wheels of a vehicle to block splashing water or mud) | Barrier | Impediment, impedimenta, obstructer, obstruction, obstructor < construction, structure < artefact, artifact < unit, whole < … | EngineeringComponent (subsuming mapping) |
| 5. Wing (a unit of military aircraft) | Air unit (a military unit that is part of the airforce) | Force, military_force, military_group, military_unit < social_unit, unit < organization, organisation < social_group < group, grouping < abstract_entity, abstraction < entity | Organization (subsuming mapping) |
| 6. Flank, wing (the side of military or naval formation) | Formation (an arrangement of people or things acting as a unit) | Arrangement < group, grouping < abstract_entity, abstraction < entity | GroupOfPeople (subsuming mapping) |
| 7. A group within a political party or legislature or other organization that holds distinct views or has a particular function | social group (people sharing some social relation) | Group, grouping < abstract_entity, abstraction < entity | Group (subsuming mapping) |
| 8. Wing (a hockey player stationed in a forward position on either side) | Hockey player, ice-hockey player (an athlete who plays hockey) | Athlete, jock < contestant < individual, mortal, person, somebody, someone, soul < being, organism (1) < Animate_thing, living_thing < unit, whole < object, physical_object < physical_entity < entity (2) < Causal_agency, causal_agent, cause < physical_entity < entity | HockeyPlayer (subsuming mapping) |
| 9. Wing (the wing of a fowl) | Helping, portion, serving (an individual quantity of food or drink taken as part of a meal) | Small_indefinite_amount, small_indefinite_quantity < indefinite_quantity < amount, measure, quantity < abstract_entity, abstraction < entity | PoultryMeat (subsuming mapping) |
| 10. Wing ((in flight formation) a position to the side and just to the rear of another aircraft) | Place, position | Point < location < object, physical_object < physical_entity < entity | PositionalAttribute (subsuming mapping) |
| 11. Annex, annexe, extension, wing (an addition that extends a main building) | Addition, add-on, improver (a component that is added to something to improve it) | Component, constituent, element < part, portion < object, physical_object < physical_entity < entity | BuildingUnit (subsuming mapping) |
Non-top-level hypernyms shared across senses are displayed in bold.
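The observation that senses 6 and 10 share no hypernym other than “entity” can be checked mechanically from the rows of Table 2. The sketch below transcribes three chains by hand, keeping only the first synonym of each synset; the names `CHAINS` and `shared_hypernyms` are illustrative.

```python
# Hypernym chains transcribed from Table 2 (first synonym of each synset only).
CHAINS = {
    5: ["air_unit", "force", "social_unit", "organization", "social_group",
        "group", "abstract_entity", "entity"],
    6: ["formation", "arrangement", "group", "abstract_entity", "entity"],
    10: ["place", "point", "location", "object", "physical_entity", "entity"],
}


def shared_hypernyms(a, b):
    """Hypernyms common to two senses, in the order of the first sense's chain."""
    other = set(CHAINS[b])
    return [h for h in CHAINS[a] if h in other]


print(shared_hypernyms(6, 10))  # flank vs. flight-formation position
print(shared_hypernyms(5, 6))   # military air unit vs. flank
```

For the three chains transcribed here, the two military senses overlap from “group” upward, while the flight-formation sense meets them only at the root.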
We also analyzed the differentiae, in this case for the most frequent sense of “fish” as an animal; this is summarized in Table 3. The only differentia that all the dictionaries we looked at agreed on was that fish live in water; other aspects are missed by the definitions in one or more dictionaries. As such, it is clear that we cannot count on a definitive set of differentiae to distinguish between senses of words; that is, the fact that three dictionaries mention scales does not indicate that the lexicographer is trying to define a distinct sense. In addition, we include the SUMO definition (Niles and Pease, 2003), where the formal axioms can be paraphrased as “a cold blooded vertebrate that inhabits water, disjoint from amphibians, and reptiles.” It is also worth noting that none of these definitions is scientifically correct, as fish may be (partly) warm-blooded and not all fish have scales and a tail (Nelson et al., 2016), although of course the role of a lexicographer is to capture the general usage of a language, not technical distinctions. However, this emphasizes a clear challenge with a formal approach to sense distinctions: when there is such wide variation in differentiae, it is difficult to infer by any automatic process which particular criteria are essential to the meaning.
Table 3
Analysis of listed differentiae for the sense of “fish” as an animal in different dictionaries.
| Cold-blooded* | Aquatic | Vertebrate | Fins | Gills | Scales* | Tail* | |
|---|---|---|---|---|---|---|---|
| English WordNet | ✓ | ✓ | ✓ | ✓ | ✓ | ||
| Wiktionary | ✓ | ✓ | ✓ | ✓ | ✓ | ||
| Merriam-Webster | ✓ | ✓ | ✓ | ✓ | ✓ | ||
| Lexico.com | ✓ | ✓ | ✓ | ✓ | ✓ | ||
| Cambridge | ✓ | ✓ | ✓ | ||||
| Dictionary.com | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | |
| Collins | ✓ | ✓ | ✓ | ||||
| SUMO | ✓ | ✓ | ✓ |
4.2. Cognitive Analysis
For cognitive analysis, we took the Small World of Words graph and extracted the subgraph consisting of only the words directly connected to the word we are studying. That is, we took the subgraph consisting of all words that have a forward or backward association to the words “wing” or “fish”; we also discarded all terms that had fewer than three associations in the dataset. We then applied the Girvan-Newman community detection methodology (Girvan and Newman, 2002) to find the main clusters within the graph. The main clusters that were extracted are shown in Table 4.
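The procedure just described can be sketched in plain Python as follows. Several points are assumptions rather than details given in the text: the target word itself is left out of the subgraph, the threshold is applied to individual association counts, edges are treated as undirected and unweighted, and only the first Girvan-Newman split is computed. The counts in `ASSOCIATIONS` are invented and are not SWOW data, and all function names are hypothetical.

```python
from collections import defaultdict, deque

# (cue, response, count) triples with invented counts; these are not SWOW data.
ASSOCIATIONS = [
    ("bank", "money", 30), ("account", "bank", 9), ("teller", "bank", 5),
    ("bank", "river", 8), ("water", "bank", 3), ("bank", "it", 2),
    ("money", "account", 12), ("teller", "money", 4), ("account", "teller", 3),
    ("river", "water", 20), ("water", "money", 3),
]


def ego_subgraph(target, triples, min_count=3):
    """Undirected graph over the neighbours of `target`, without `target` itself."""
    kept = [(c, r) for c, r, n in triples if n >= min_count]
    neighbours = ({r for c, r in kept if c == target}
                  | {c for c, r in kept if r == target})
    adj = {w: set() for w in neighbours}
    for c, r in kept:
        if c in adj and r in adj:
            adj[c].add(r)
            adj[r].add(c)
    return adj


def edge_betweenness(adj):
    """Shortest-path edge betweenness (Brandes-style accumulation)."""
    score = defaultdict(float)
    for source in adj:
        dist, order, queue = {source: 0}, [], deque([source])
        paths, preds = defaultdict(float), defaultdict(list)
        paths[source] = 1.0
        while queue:  # breadth-first search counting shortest paths
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    paths[w] += paths[v]
                    preds[w].append(v)
        dep = defaultdict(float)
        for w in reversed(order):  # push path counts back toward the source
            for v in preds[w]:
                share = paths[v] / paths[w] * (1.0 + dep[w])
                score[frozenset((v, w))] += share
                dep[v] += share
    return score


def components(adj):
    """Connected components of the graph as a list of sets."""
    seen, result = set(), []
    for start in adj:
        if start in seen:
            continue
        comp, stack = set(), [start]
        while stack:
            v = stack.pop()
            if v not in comp:
                comp.add(v)
                stack.extend(adj[v] - comp)
        seen |= comp
        result.append(comp)
    return result


def girvan_newman_split(adj):
    """Remove highest-betweenness edges until the graph gains a component."""
    adj = {v: set(ns) for v, ns in adj.items()}
    start = len(components(adj))
    while len(components(adj)) == start:
        score = edge_betweenness(adj)
        if not score:
            break
        v, w = max(score, key=score.get)
        adj[v].discard(w)
        adj[w].discard(v)
    return components(adj)


graph = ego_subgraph("bank", ASSOCIATIONS)
print(sorted(sorted(c) for c in girvan_newman_split(graph)))
```

On this toy input the weak “water” to “money” link is the edge removed, leaving a financial cluster and a river cluster. A library implementation would normally be preferable for real data; the hand-written version is used here only to keep the example self-contained.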
Table 4
Girvan-Newman cluster analysis of the Small World of Words dataset.
| Cluster | Associated words |
|---|---|
| Clusters for “wing” | |
| 1 | Bird, butterfly, dragonfly, feathers, flap, flutter, fly |
| 2 | Air, airplane, flew, flight, plane, propeller |
| 3 | Bat, man |
| 4 | Chicken, feather |
| 5 | Left, right |
| 6 | Angel |
| 7 | Gull |
| Clusters for “fish” | |
| 1 | Algae, amphibian, anchovy, animal, aqua, aquarium, barracuda, boat, boating, boats, brine, brook, calamari, catfish, chowder, clam, coral, crab, crayfish, creek, dive, diver, diving, dolphin, downstream, eat, eel, fighting, filter, fin, fingers, fish tank, fishy, flatfish, fleshy, flipper, flop, flounder, flying, fresh, go, goldfish, gull, harbor, harpoon, heron, hunt, Japan, jelly, kelp, lagoon, lake, lobster, Maine, mammals, marine, marines, mermaid, mollusk, mussel, ocean, octopus, oily, otter, oyster, pacific, paella, pelican, pet, pets, pie, pier, pike, poach, pond, porpoise, prawn, reef, river, sail, sailing, salamander, salmon, salty, sardine, sashimi, scales, scallop, scaly, scuba, sea, seafood, seagull, seahorse, seal, seaman, seashell, seaside, seaweed, shark, shell, shrimp, sinker, slimy, slippery, snorkel, sole, spawn, spear, spiny, squid, squish, starfish, stick, stickleback, stingray, stream, sucker, sushi, swim, swimmer, swish, sword, swordfish, taco, tadpole, tank, trout, tuna, underwater, upstream, water, whale, wharf, worm |
| 2 | Bait, bass, carp, cast, casting, catch, catchy, caught, cod, fishing, fishing pole, hook, hooked, lure, net, Norway, perch, reel, rod, tackle, tarp, troll, unhook, worms |
| 3 | Batter, chips, Friday, fried, fry, grilled, smoked |
| 4 | Bouillon, broil, dish, escargot, f, food, frying, grill, gulp, gut, market, nibble, plaice, platter, plenty, poison, protein, raw, rotten, rotting, skate, skillet, smelly, smelt, stinky |
| 5 | Angle, beta, England, Omega |
| 6 | Chip, flake, flaky, scale |
| 7 | Dill, herring, Sweden, Swedish |
| 8 | Choral, mainstream, school |
| 9 | Backbone, guts, rumble |
For “wing,” we see major clusters corresponding to some of the main homonyms of the word, with Cluster 1 referring to the part of an animal, Cluster 2 to a part of a plane, and Cluster 5 to a political orientation. Cluster 4 is suggestive of the food sense through the association with “chicken,” although “feather” is probably not associated with this sense. The third cluster is probably erroneous, arising from the strong association between “bat” and “man”; however, it does detect the sense of a “wing man.” It is not clear why “angel” wings are so distinct from other animal wings11. Cluster 7 identifies wing as a shape, as in “gull-wing doors.” As such, we can see that the cognitive analysis has identified six main senses of “wing,” but other smaller senses are not detected. Senses such as that in “to wing it” are probably not found because the association between “wing” and “it” had only two instances in SWOW and so was filtered out. There were several other highly useful associations at this level, such as “Buffalo” and “food” to support the food sense and “commander” to support the “wing man” sense; however, there were also many noisy relations, such as “little” (perhaps due to the Jimi Hendrix song) and “prayer” (from the idiom “wing and a prayer”).
The analysis of “fish” yields many more associates, and there appears to be more information available for this word. We see at least the three main homonyms of “fish,” with the first cluster referring to the animal sense, the second cluster to the activity of catching fish, and the third and fourth to the food senses. It is interesting to note that this analysis hints at a strong cognitive distinction between “fish and chips” as a specific dish and “fish” as a more general meat; it would certainly be interesting to see whether this is also found in other languages. The remaining clusters are less clear and may in part be due to errors made by the annotators of SWOW, e.g., the strong association with “choral” is almost certainly due to annotators misreading it as “coral.” It should also be noted that some of the clusters, such as Cluster 5, have only a very weak association between their elements, and further investigation of the algorithm would help in detecting senses here. Similarly, we note that Cluster 7 is created for the sense of “Swedish Fish,” a candy in the US, but this cluster then pulls in other Sweden-related words from other senses. It is also interesting to note how certain words support different senses; for example, Japan and Maine are associated with the animal sense, whereas Norway is associated with the fishing sense. Similarly, some species of fish are thought of more as animals, some as food (e.g., “plaice,” “skate”), and some as targets of recreational fishing (e.g., “carp,” “bass”).
4.3. Distributional Analysis
For the distributional analysis, we use the BERT model as the basis. This model was selected because it has been shown to perform strongly across a wide range of sense-related tasks (Bevilacqua et al., 2020; Nair et al., 2020); however, given the rapid development of language models, results may well improve over the next few years as stronger models are developed. We took the Gloss Tag corpus that is released as part of Princeton WordNet (Fellbaum, 2010), as this corpus annotates each word with its sense in WordNet, allowing us to analyze each individual occurrence. Note that this corpus consists of annotated definitions from Princeton WordNet; however, we simply treat each definition as a free-standing sentence. The senses, their definitions in Princeton WordNet, and their frequencies in the corpus are given in Table 5. While these definitions may not fully reflect standard language usage, the corpus was chosen because it is sufficiently large and annotated to a very high quality, with a focus on less frequent senses. We also note that “very few large annotated datasets [for WSD] are available” (Taghipour and Ng, 2015), and semi-automatically constructed datasets would be risky for this detailed analysis. For each sentence of the corpus containing either the word “fish” or “wing,” we applied the BERT model, taking the sum of the last four hidden layers as the embedding, and extracted the vector associated with the target word. We then applied a t-SNE projection to these vectors; the results are shown in Figures 4 and 5.
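The pooling step, summing the last four hidden layers and reading off the vector at the target word, can be sketched with plain NumPy. The sketch assumes that the per-layer hidden states for one sentence have already been obtained and stacked into an array of shape (layers, tokens, dimensions); in the Hugging Face Transformers library, for instance, these are returned when a model is called with `output_hidden_states=True`. The function name `target_word_vector` is hypothetical, and averaging over several word pieces is an assumption of this sketch rather than something stated in the analysis.

```python
import numpy as np


def target_word_vector(hidden_states, token_positions):
    """Sum the last four layers and return the vector of the target word.

    hidden_states: array-like of shape (num_layers, num_tokens, hidden_size)
    token_positions: positions of the word piece(s) that make up the target;
        if there are several, their vectors are averaged (an assumption).
    """
    hidden_states = np.asarray(hidden_states, dtype=float)
    if hidden_states.ndim != 3 or hidden_states.shape[0] < 4:
        raise ValueError("expected at least four layers of (tokens, dims) states")
    summed = hidden_states[-4:].sum(axis=0)  # shape: (num_tokens, hidden_size)
    return summed[list(token_positions)].mean(axis=0)  # shape: (hidden_size,)


# Toy stand-in for model output: 5 layers, 3 tokens, 2 dimensions.
toy_states = np.arange(5 * 3 * 2).reshape(5, 3, 2)
vector = target_word_vector(toy_states, [1])
print(vector)
```

The vectors collected in this way, one per occurrence, would then be stacked into a matrix and passed to a projection method such as the `TSNE` class in scikit-learn; that step is omitted here because its output depends on the random seed and perplexity settings.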
Table 5
Senses appearing in the Princeton WordNet Gloss Tag corpus.
| Sense | Definition | Frequency |
|---|---|---|
| Wing.n.01 | A movable organ for flying (one of a pair) | 156 |
| Wing.n.02 | One of the horizontal airfoils on either side of the fuselage of an airplane | 17 |
| Wing.n.03 | A stage area out of sight of the audience | 2 |
| Wing.n.04 | A unit of military aircraft | 2 |
| Wing.n.08 | A group within a political party or legislature or other organization that holds distinct views or has a particular function | 1 |
| Wing.n.09 | The wing of a fowl | 1 |
| fish.n.01 | Any of various mostly cold-blooded aquatic vertebrates usually having scales and breathing through gills | 414 |
| Fish.n.02 | The flesh of fish used as food | 62 |
| Fish.v.01 | Seek indirectly | 1 |
| Fish.v.02 | Catch or try to catch fish or shellfish | 27 |
For the word “wing,” we see that the most common noun senses are clustered in the bottom right-hand corner of Figure 4, although there are a few outliers. Notably, the sentence “an artificial fly that has wings extending back beyond the crook of the fishhook” appears near the top of the graph and may indicate a distinct sense, as it is not related to aviation. Minor senses are mostly in the bottom left-hand corner of the diagram; however, there is not really enough information to make an informed decision.
Figure 5 is much more complex because more data is available for this word. As in the previous plot, the most frequent sense appears in all parts of the t-SNE plot, but here it is more clearly clustered. Part of the reason for this is the different forms occurring in the text, as the lemma “fish” appears as “fish,” “fishes,” “fishing,” and “fished” in the corpus. The large cluster on the bottom left-hand side corresponds to the form “fishes,” and the small cluster of verb forms on the far right-hand side corresponds to the form “fishing.” For the second sense of “fish,” we see most of the instances clustered in the top right-hand corner, suggesting that this sense is mostly being identified correctly. However, there is a large cluster containing both “fish” as animal and “fish” as meat senses in the bottom corner. An analysis of the definitions suggests that these examples concern the catching of fish; for example, we have sentences such as “someone whose occupation is catching fish” (fish as animal) and “a small house where smoke is used to cure meat or fish” (fish as meat). As such, we see that the distributional method focuses more on the context of the word than on its formal genus.
It is also important to note that although the clusters are apparent with the annotation, for the most part they would not be obvious without the sense labels and clusters often contain examples of multiple senses. As such, it is not so clear how useful such an unsupervised approach would be to lexicographers and this explains why automatic word sense induction, while a useful tool, cannot solely solve the issue of making word sense distinctions.
4.4. Intercultural Analysis
For the intercultural analysis we take the Apertium graph of translations (Gracia et al., 2018) as the basis of our analysis. We plot the immediate neighborhoods of the word “fish” based on the translations given in this resource in Figure 6. The Apertium graph is quite incomplete and for many languages there are no translations, yet we are able to see several large cycles that correspond to some of the senses that we would expect. Firstly, we have a cycle of “fiŝo” (Esperanto) → “poisson” (French) → “peis” (Occitan) → “pescado” (Spanish) → “peix” (Catalan)/“pescado” (Galician), which corresponds to the food sense and an overlapping cycle of “pez” (Spanish) → “poisson” (French) → “fiŝo” (Esperanto)/“peix” (Catalan), which corresponds to the animal sense. We also see another cycle corresponding to the act of catching fish created by the cycle of “pescar” (Spanish) → “pescar” (Galician) → “pescar” (Catalan) → “faenar” (Spanish). This shows that it is possible to detect senses using intercultural evidence; however a more complete database of interlingual correspondences, perhaps automatically constructed from a parallel corpus, would be essential to provide useful and clear methodologies for making sense distinctions.
In the same vein, we construct the graph of translations of the word “wing” (noun) in the Apertium dictionaries, as depicted in Figure 7. Although many of the translations are associated with wing as a means of flight, as in “ala,” “aile,” and “á” in Catalan, French, and Galician, respectively, others correspond to different senses, such as “eskadro” in Esperanto, which refers to a squadron. Similarly, “kazel” in Breton would refer to a wing as an extension of a building. In the current version of the data, we could not retrieve any translations for “wing” as a verb.
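The idea of reading senses off cycles in a translation graph, as described for “fish,” can be sketched as follows. The sketch assumes the NetworkX library and uses a fundamental cycle basis as an inexpensive stand-in for a full cycle enumeration, so it need not return exactly the cycles named in the text. The edge list is a toy example loosely modelled on those cycles; it is not extracted from Apertium, and individual edges may not exist in that resource. The function name `translation_cycles` is hypothetical.

```python
import networkx as nx


def translation_cycles(pairs):
    """Return a fundamental cycle basis of an undirected translation graph.

    `pairs` is an iterable of ((lang, word), (lang, word)) translation links.
    Each cycle is returned as a set of (lang, word) nodes.
    """
    graph = nx.Graph()
    graph.add_edges_from(pairs)
    return [set(cycle) for cycle in nx.cycle_basis(graph)]


# Toy translation links (illustrative only, not real Apertium data).
toy_pairs = [
    # noun translations, forming two overlapping cycles
    (("eo", "fiŝo"), ("fr", "poisson")),
    (("fr", "poisson"), ("oc", "peis")),
    (("oc", "peis"), ("es", "pescado")),
    (("es", "pescado"), ("ca", "peix")),
    (("ca", "peix"), ("eo", "fiŝo")),
    (("es", "pez"), ("fr", "poisson")),
    (("es", "pez"), ("eo", "fiŝo")),
    # verb translations, forming a separate cycle
    (("es", "pescar"), ("gl", "pescar")),
    (("gl", "pescar"), ("ca", "pescar")),
    (("ca", "pescar"), ("es", "faenar")),
    (("es", "faenar"), ("es", "pescar")),
]
cycles = translation_cycles(toy_pairs)
for cycle in cycles:
    print(sorted(cycle))
```

Because the verb links form their own connected component, their cycle is recovered as a unit, whereas the two noun cycles share nodes and the basis may report them in different but equivalent combinations. This mirrors the observation that overlapping cycles in a sparse graph are harder to separate into senses.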
4.5. Unified Analysis
For the word “fish,” we see that all methods are able to distinguish the animal meaning, the meat meaning, and the action of fishing, suggesting that these three senses are widely used and clearly distinguished. The metaphorical verbal sense (as in “fishing for compliments”) is only seen formally, but this is probably due to its low frequency. For distributional and intercultural methods it may be possible to find this distinction with more data, but it is less clear how a cognitive method could be further extended; that is, it is not certain that distinguishing word associations such as “compliments” would start to appear when asking more users. The methods also suggest some other distinctions that do not arise naturally from the formal approach, such as fish as a specific English dish (in the cognitive analysis) and fish as something to be hunted (in the distributional analysis). The analysis of differentiae also shows the limits of formal methods, as it is difficult to arrive at a definition that can be widely agreed on and easily formalized, as we saw in the example of “fish,” where the dictionary definitions differ significantly from each other, from SUMO, and from the ichthyological definition. While dictionaries are made for different purposes, and in this case it is possible that all dictionaries should have adopted the scientific definition, it is difficult to see how this could be generalized to other senses when there are more subtle distinctions in senses created by phenomena such as systematic polysemy.
For “wing,” we have even less agreement between the models, with the distinction between an animal's wing and a plane's wing being the only solid distinction, and this does not appear in the intercultural analysis. We carried out a further check on multilingual resources, which did not find any language that makes the distinction lexically. Meanwhile, the formal analysis shows a large number of senses with clear distinctions, but this is not supported by the corpus-based analysis. As such, it can be said that “wing” is a word that can be easily coerced into new senses and requires a more nuanced formal theory such as that of the generative lexicon.
Overall, formal approaches allow for many sense distinctions to be made, but when these are not backed by corpus evidence, this leads to senses that are unnatural to users of the dictionary12, especially if many of these senses are rare metaphors or semantic shifts. Cognitive approaches seem to be quite well adapted to making sense distinctions, but there are questions about how they may scale to more infrequent sense distinctions. Distributional methods hold much promise, but the poor results and the black-box nature of these approaches raise questions about how useful they can be for a lexicographer. Finally, intercultural methods showed some effectiveness, but the Apertium resource used here is too small and incomplete, and new, large translation graphs would be necessary to fully validate this method. Further, some obvious sense distinctions are simply nonexistent in this analysis, while other sense distinctions appear that would not be obvious to a native speaker of the language in question.
Abstract
Word senses are the fundamental unit of description in lexicography, yet it is rarely the case that different dictionaries reach any agreement on the number and definition of senses in a language. With the recent rise in natural language processing and other computational approaches there is an increasing demand for quantitatively validated sense catalogues of words, yet no consensus methodology exists. In this paper, we look at four main approaches to making sense distinctions: formal, cognitive, distributional, and intercultural and examine the strengths and weaknesses of each approach. We then consider how these may be combined into a single sound methodology. We illustrate this by examining two English words, “wing” and “fish,” using existing resources for each of these four approaches and illustrate the weaknesses of each. We then look at the impact of such an integrated method and provide some future perspectives on the research that is necessary to reach a principled method for making sense distinctions.
Footnotes
We include synecdoches as a class of metonymy.
We distinguish in this case between the results of cognitive processes (in this case logic) and the analysis of cognitive processes in the next section. All language is the result of a cognitive method so all methods could be considered to some degree cognitive.
Definitions from English WordNet.
SUMO can be browsed online at https://www.ontologyportal.org/
We note the DOLCE-WordNet mapping is incomplete so we are inferring the corresponding top-level categories. For SUMO, there is already a formal mapping of all WordNet senses.
In the “10 Things I Hate About You” film (1999), the character Chastity Church asks, “I know you can be underwhelmed and you can be overwhelmed, but can you ever just be whelmed?”
We do not believe that other dictionaries are more principled, but are simply less widely adopted in these kinds of works.
Six of the first 100 matches were actually nouns.
We speculate this may be due to the mythological and religious connotations of this word or perhaps due to the practice of making shapes in snow or even a Polish pastry.
As an example, Wiktionary defines “fish” as both “an easy victim for swindling” and “a bad poker player.” The verb is also defined with several very similar meanings including “to hunt fish,” “to search (a body of water) for something other than fish,” “to use as bait when fishing,” and “To (attempt to) find or get hold of an object by searching among other objects”.



