Dataset: 9.3K articles from Wikipedia (CC BY-SA).

Made by DATEXIS (Data Science and Text-based Information Systems) at Beuth University of Applied Sciences Berlin

Deep Learning Technology: Sebastian Arnold, Betty van Aken, Paul Grundmann, Felix A. Gers and Alexander Löser. Learning Contextualized Document Representations for Healthcare Answer Retrieval. The Web Conference 2020 (WWW'20)

Funded by The Federal Ministry for Economic Affairs and Energy; Grant: 01MD19013D, Smart-MD Project, Digital Technologies


Word salad

Abstract

Word salad is a "confused or unintelligible mixture of seemingly random words and phrases", most often used to describe a symptom of a neurological or mental disorder. The words may or may not be grammatically correct, but are semantically confused to the point that the listener cannot extract any meaning from them. The term is used in psychiatry, in theoretical linguistics (to describe a type of grammatical acceptability judgment by native speakers), and in computer programming (to describe textual randomization).

In psychiatry

Word salad may describe a symptom of neurological or psychiatric conditions in which a person attempts to communicate an idea, but words and phrases that may appear to be random and unrelated come out in an incoherent sequence instead. Often, the person is unaware that he or she did not make sense. It appears in people with dementia and schizophrenia, as well as after anoxic brain injury. Clang associations are especially characteristic of mania, as seen in bipolar disorder, as a somewhat more severe variation of flight of ideas. In extreme mania, the patient's speech may become incoherent, with associations markedly loosened, thus presenting as a veritable word salad.

It may present as:

- Clanging, a speech pattern that follows rhyming and other sound associations rather than meaning

- Graphorrhea, a written version of word salad that is more rarely seen than logorrhea in people with schizophrenia

- Logorrhea, a mental condition characterized by excessive talking (incoherent and compulsive)

- Receptive aphasia

- Schizophasia, a mental condition characterized by incoherent babbling (compulsive or intentional, but nonsensical)

In computing

Word salad can be generated by a computer program for entertainment purposes by inserting randomly chosen words of the same type (nouns, adjectives, etc.) into template sentences with missing words, a game similar to Mad Libs. The video game company Maxis used this technique in their seminal SimCity 2000 to create an in-game "newspaper" for entertainment; the columns were composed by taking a vague story structure and using randomization to insert various nouns, adjectives, and verbs, generating seemingly unique stories.
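The template-filling approach described above can be sketched in a few lines of Python. This is a minimal illustration, not the method used by any particular game: the word lists, the templates, and the names `WORDS`, `TEMPLATES`, `fill_template` and `headline` are all invented for this example.

```python
import random

# Hypothetical word lists and templates, invented for illustration only.
WORDS = {
    "noun": ["mayor", "llama", "power plant", "subway"],
    "adjective": ["mysterious", "local", "overdue", "giant"],
    "verb": ["approves", "abandons", "celebrates", "investigates"],
}
TEMPLATES = [
    "The {adjective} {noun} {verb} the {noun}.",
    "Every {noun} {verb} the {adjective} {noun}.",
]

def fill_template(template, rng):
    """Replace each {slot} with a randomly chosen word of the matching type."""
    out = template
    for kind, choices in WORDS.items():
        placeholder = "{" + kind + "}"
        # Replace one occurrence at a time so repeated slots get independent words.
        while placeholder in out:
            out = out.replace(placeholder, rng.choice(choices), 1)
    return out

def headline(rng):
    """Pick a random template and fill every slot in it."""
    return fill_template(rng.choice(TEMPLATES), rng)

print(headline(random.Random()))
```

Each call yields a grammatical but arbitrary sentence, because only the word class of each slot is constrained, not its meaning.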

Another way of generating meaningless text is mojibake, also called "Buchstabensalat" ("letter salad") in German, in which seemingly random text results from a character encoding incompatibility that causes one set of characters to be replaced by another. The effect is more pronounced in languages where each character represents a word, such as Chinese, than in languages where each character represents a letter.
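A small Python sketch can reproduce the effect by encoding text in one character encoding and decoding it in another. The UTF-8 and Latin-1 pairing and the sample strings are choices made for this example; other mismatched pairs garble text in different ways.

```python
# Encode as UTF-8, then wrongly decode the bytes as Latin-1.
original = "Käse"
garbled = original.encode("utf-8").decode("latin-1")
print(garbled)  # the two bytes of "ä" show up as two separate characters: KÃ¤se

# The damage is reversible as long as no bytes were lost along the way.
repaired = garbled.encode("latin-1").decode("utf-8")
print(repaired == original)

# Each of these Chinese characters occupies three bytes in UTF-8, so a
# whole word turns into three unrelated Latin-1 characters.
cjk = "文字"
cjk_garbled = cjk.encode("utf-8").decode("latin-1")
print(len(cjk), "characters become", len(cjk_garbled))
```

Latin-1 is used here because it assigns a character to every byte value, so the wrong decoding never raises an error and silently produces nonsense instead.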

More serious attempts to produce nonsense automatically stem from Claude Shannon's seminal 1948 paper "A Mathematical Theory of Communication", in which progressively more convincing nonsense is generated first by choosing letters and spaces randomly, then according to the frequency with which each character appears in some sample of text, then respecting the likelihood that the chosen letter appears after the preceding one or two in the sample text, and then applying similar techniques to whole words. Its most convincing nonsense is generated by second-order word approximation, in which words are chosen by a random function weighted by the likelihood that each word follows the preceding one in normal text:

THE HEAD AND IN FRONTAL ATTACK ON AN ENGLISH WRITER THAT THE CHARACTER OF THIS POINT IS THEREFORE ANOTHER METHOD FOR THE LETTERS THAT THE TIME OF WHO EVER TOLD THE PROBLEM FOR AN UNEXPECTED.
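The letter-level stages of this progression can be sketched as follows. This is an illustrative approximation rather than Shannon's original procedure: the tiny `SAMPLE` text and the function names are assumptions made for the example, and a realistic run would use a much larger sample.

```python
import random
from collections import Counter, defaultdict

# Hypothetical sample text; any longer corpus would give better results.
SAMPLE = ("the quick brown fox jumps over the lazy dog and then the dog "
          "chases the fox around the old barn until the sun goes down")

def zero_order(n, rng):
    """Letters and spaces chosen uniformly at random."""
    alphabet = "abcdefghijklmnopqrstuvwxyz "
    return "".join(rng.choice(alphabet) for _ in range(n))

def first_order(sample, n, rng):
    """Characters chosen independently, weighted by their frequency in the sample."""
    counts = Counter(sample)
    chars = list(counts)
    return "".join(rng.choices(chars, weights=[counts[c] for c in chars], k=n))

def second_order(sample, n, rng):
    """Each character chosen according to how often it follows the previous one."""
    followers = defaultdict(Counter)
    for a, b in zip(sample, sample[1:]):
        followers[a][b] += 1
    out = [rng.choice(sample)]
    while len(out) < n:
        nxt = followers[out[-1]]
        if not nxt:  # character never followed by anything: restart at random
            out.append(rng.choice(sample))
            continue
        chars = list(nxt)
        out.append(rng.choices(chars, weights=[nxt[c] for c in chars])[0])
    return "".join(out)

rng = random.Random()
for generate in (zero_order,
                 lambda n, r: first_order(SAMPLE, n, r),
                 lambda n, r: second_order(SAMPLE, n, r)):
    print(generate(40, rng))
```

The output tends to look more word-like at each stage, since each stage preserves more of the statistical structure of the sample.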

Markov chains can be used to generate random but somewhat human-looking sentences. This technique is used in some chatbots, especially on IRC networks.
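A minimal word-level Markov chain generator might look like the sketch below. The `CORPUS` text and the names `build_chain` and `generate` are assumptions made for this example; real chatbots typically train on far larger logs and often condition on more than one preceding word.

```python
import random
from collections import defaultdict

# Hypothetical training text, invented for illustration only.
CORPUS = ("the cat sat on the mat and the dog sat on the rug "
          "while the cat watched the dog and the dog watched the bird")

def build_chain(text):
    """Map each word to the list of words that follow it in the text."""
    words = text.split()
    chain = defaultdict(list)
    for current, following in zip(words, words[1:]):
        # Duplicates are kept, so frequent successors are picked more often.
        chain[current].append(following)
    return chain

def generate(chain, start, max_words, rng):
    """Walk the chain from `start`, choosing each next word at random."""
    out = [start]
    while len(out) < max_words:
        options = chain.get(out[-1])
        if not options:  # no known successor: stop early
            break
        out.append(rng.choice(options))
    return " ".join(out)

chain = build_chain(CORPUS)
print(generate(chain, "the", 12, random.Random()))
```

Every adjacent pair of words in the output occurs somewhere in the training text, which is why the result reads as locally plausible even though the sentence as a whole has no meaning.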

Nonsensical phrasing can also be generated for more malicious reasons, such as Bayesian poisoning, which attempts to counter Bayesian spam filters with strings of words that have a high probability of being collocated in English, with no concern for whether the result makes sense grammatically or logically.
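The effect of such padding on a filter can be illustrated with a toy naive Bayes score. This is a deliberately simplified sketch: the per-word likelihoods in `LIKELIHOODS` are invented numbers, the function `spam_probability` is hypothetical, and real filters estimate these values from training mail and handle unknown words more carefully.

```python
import math

# Hypothetical likelihoods (P(word | spam), P(word | ham)), invented for this sketch.
LIKELIHOODS = {
    "viagra": (0.20, 0.001),
    "free": (0.15, 0.02),
    "meeting": (0.005, 0.10),
    "report": (0.005, 0.08),
    "thanks": (0.01, 0.12),
}

def spam_probability(words, prior_spam=0.5):
    """Combine per-word evidence in log-odds form, ignoring unknown words."""
    log_odds = math.log(prior_spam / (1 - prior_spam))
    for word in words:
        if word in LIKELIHOODS:
            p_spam, p_ham = LIKELIHOODS[word]
            log_odds += math.log(p_spam / p_ham)
    return 1 / (1 + math.exp(-log_odds))

spam = ["free", "viagra"]
# The same message padded with words that are common in legitimate mail.
poisoned = spam + ["meeting", "report", "thanks"]

print(round(spam_probability(spam), 3))
print(round(spam_probability(poisoned), 3))
```

Under these made-up numbers the padded message scores far lower than the original, because each innocuous word contributes evidence toward legitimate mail regardless of whether the padding forms a meaningful sentence.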