Acronyms

For many years I have been obsessed with languages, including natural human languages, constructed alternatives to natural languages, computer programming languages, mathematics, and music. All of these can be called languages, in a sense, and to me they all seem related.

But they all seem rather imperfect or incomplete, as well.

As someone who studied linguistics, I see computer programming languages and mathematics as deficient in a way -- they lack any phonetic, phonological, or phonemic component. I've always had trouble explaining this lack to engineering colleagues with no linguistic background, but I feel that in some way the sound of a language is an utterly vital part of it. Linguists have almost always held that writing is merely transcription, that the real language is the part spoken and listened to.

But as someone who spent years writing computer programs for a living, I also sense a deficiency in natural languages, which seem much too vague, arbitrary, and perhaps even contradictory, in comparison with mathematics and programming languages. Imagine, if you can, a computer with only a natural language interface. How easy would it be to tell it to perform some very detailed calculation or elaborate sequence of steps?

And then there is music. Clearly music is essentially sound, and as with natural languages the printed transcription is clearly just that, just a notation for music, not the music itself. But music seems to lack a lexical or semantic component. It is true that some pieces of "program music" like Smetana's Moldau have an intended mapping into the real world, but most of the great pieces of music are rather abstract.

If you have tried to imagine programming a computer through a natural language interface, as I suggested above, now try to imagine programming one by whistling to it.

As I've mused on these thoughts over the years I've come around to the belief that mathematics and music are rather complementary, one lacking sound and the other lacking a real semantics. But they are both rather pure or ideal extremes.

And so for a long time I have thought of mathematics and music as parts of one whole language, as components of one ideal language.

Stimulated by this idea, I have studied the history of artificial languages, including the many ideal language schemes intended to produce an ideal philosophical or international language for human use. At one time I studied formal logic, but my clearest memory of that phase is of cutting classes in logic to read up on the descriptor languages used by librarians, which seemed much closer to reality.

Perhaps the most interesting idea I encountered during years of reading about artificial languages was Wilhelm von Humboldt's notion that there is in fact some underlying ideal language, and that what we call natural languages (or national languages) are merely the continual and unconscious expressions of national culture in this underlying medium.

Attempting to formalize Humboldt's idea led me to propose a linear model of human language use, based on a psychology somewhat like associationism. My graduate work involved exploring that model, using computer methods.

Years before encountering Humboldt's ideas I had a moment of insight myself in which I suddenly found a way of defining an ideal language that could be entirely non-arbitrary. There were actually two moments of insight separated by a couple of hours in the course of a very long night.

The first insight was the sudden realization that it would be possible to define a language that consisted entirely of acronyms -- a language in which each word was an acronym made from other words in the language. The second insight, after a couple of hours deep thought about the first, was that the definition of an acronymic language could be satisfied by any one of an infinite number of languages, all possessing the properties I had defined -- but that there could be (must be!) one distinguished member of this class of languages, an optimum or best language.

When I first came up with the idea for an acronymic language, some twenty years ago, I thought of it as an ideal language for humans to use, a replacement for natural languages -- something like Esperanto. This is still a possibility that interests me, but I have been downplaying it somewhat lately, preferring instead to discuss it as a convenient descriptor language for information retrieval.

Let's do a little requirements analysis on the idea of a descriptor language for information retrieval.

The most important requirement for a descriptor language is a close relationship between the form and meaning of terms -- close enough that machines can easily arrange things according to similarity of meaning, by simply arranging them according to similarity of form.

An example of arrangement according to similarity of form is a simple alphabetical sort. If we sort the words of English in alphabetical order, do we at the same time sort them by meaning? No! English is clearly not a good descriptor language. By contrast, try sorting terms in the Dewey Decimal System according to form by using an ordinary ASCII alphanumeric sort, like the Sort command in MS-DOS. The result does indeed arrange the terms according to their meaning, or very nearly so.
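The contrast can be sketched in a few lines of Python. The codes below are made-up, Dewey-style examples rather than an official excerpt, but they show the point: sorting codes by form groups them by subject, while sorting the corresponding English words by form interleaves mathematics and music.

```python
# Toy illustration: form-sorting of classification codes vs. English words.
# The codes are invented in Dewey style, not quoted from the real system.

dewey = {
    "512": "algebra",    # mathematics
    "516": "geometry",   # mathematics
    "782": "aria",       # music
    "787": "guitar",     # music
}

def sort_by_form(items):
    """Plain alphanumeric sort -- arrangement purely by similarity of form."""
    return sorted(items)

codes = sort_by_form(dewey)              # mathematics codes first, then music
subjects = [dewey[c] for c in codes]     # ['algebra', 'geometry', 'aria', 'guitar']

words = sort_by_form(dewey.values())     # ['algebra', 'aria', 'geometry', 'guitar']
# The alphabetized words alternate between mathematics and music:
# for English, similarity of form no longer tracks similarity of meaning.
```

Sorting the codes keeps each subject area contiguous; sorting the words does not.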

It is important to be more specific about this relationship. The ability to sort meanings by sorting forms comes from the implication

    Similarity of Form Implies Similarity of Meaning.

What about the converse? Can we say that

    Similarity of Meaning Implies Similarity of Form?

As will be shown below, this is a much more difficult property to realize. If we do have the implication going in both ways, then we can say that there exists an isomorphism between form and meaning. This is clearly desirable, but hard to achieve. But we do need at least the one-way implication, or homomorphism, from form to meaning.

Another important requirement for a descriptor language is a mechanical process or algorithm for assigning a descriptor to a particular text. It must be possible for a machine to read a text file and produce a descriptor that correctly describes that text, without human intervention. Without this, the human labour involved in using the descriptor language would be far too great for the language to be anything more than a tool for librarians.

If this automatic processing of texts is essentially a linear operation, then it follows from the well-known properties of linear operators that similar texts will always be mapped to similar descriptors, which gives us a homomorphism from meaning to form. I think that the converse probably holds true as well: if we do have an isomorphism between meaning and form (and not just the one-way homomorphism), then we could perform this automatic processing of text into descriptors by a linear operation.
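The linearity argument can be made concrete with a toy example; the matrix and vectors below are arbitrary illustrative values, not part of any real text-to-descriptor map. For a linear map A we have A(x) - A(y) = A(x - y), so the distance between two outputs is bounded by a constant multiple of the distance between the inputs: similar inputs cannot be mapped to wildly dissimilar outputs.

```python
# Sketch: a linear "text -> descriptor" map keeps similar inputs similar.
import math

def mat_vec(A, x):
    """Multiply matrix A (list of rows) by vector x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def dist(u, v):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

A = [[0.5, 0.1],
     [0.2, 0.4]]        # an arbitrary small linear map, for illustration

x = [1.0, 2.0]          # two nearby "texts", represented as vectors
y = [1.1, 2.1]

gap_in = dist(x, y)                          # distance between the texts
gap_out = dist(mat_vec(A, x), mat_vec(A, y)) # distance between their descriptors
# Because A(x) - A(y) = A(x - y), gap_out shrinks as gap_in shrinks.
```

Here the output gap is strictly smaller than the input gap, as the bound predicts for this particular A.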

I will note here without further elaboration that the processing of text written in an acronymic language to form a descriptor which is essentially an acronym can be considered a linear operation.

The idea that the ideal descriptor language be realized through acronyms is clearly not a requirement but a design scheme.

Nevertheless, the idea of an acronymic language quickly leads to a real requirement, which I called the Summary Property. Imagine a sequence of words in an acronymic language. To summarize this sequence of words, we need only take the initial letter of each word and form these letters into a word. In an acronymic language the resulting word is a good approximation in meaning to the original sequence, and can therefore serve as a summary of it.
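The mechanics of the Summary Property are simple enough to sketch in a few lines (the example words are arbitrary placeholders, not words of any actual acronymic language):

```python
# Sketch of the Summary Property: the summary of a word sequence is its acronym.

def summarize(words):
    """Return the acronym of a sequence of words: their initial letters joined."""
    return "".join(w[0] for w in words if w)

phrase = ["true", "random", "binary", "logic"]
acronym = summarize(phrase)   # the one-word summary of the phrase

# Summaries nest: summarizing a list of summaries yields a still shorter
# approximation, which is what makes the property useful at every scale.
nested = summarize([summarize(["top", "right"]), summarize(["bottom", "left"])])
```

In a genuinely acronymic language the resulting word would itself be a word of the language, approximating the meaning of the whole sequence.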

A very brief summary may indeed serve as a descriptor, but the Summary Property is a much stronger requirement than the ones we have previously given for descriptors. A descriptor need only indicate what some text says. A summary also indicates how the text says it.

Whether or not we use an acronymic language for our ideal descriptor language, I believe a Summary Property is essential. The Internet and World Wide Web are giving us access to more and more text. Even now we cannot read and understand all of the text that is relevant to our needs, and the future will make the problem worse, unless it provides tools for automatically creating brief summaries for us.

We also will need some mechanical means, or algorithm, for dealing with large numbers of descriptors, by summarizing groups of them. So a very general Summary Property is definitely a requirement.

For now, let us suspend our requirements analysis with just the three basic requirements:

1. An isomorphism between meaning and form, or at the very least a homomorphism from form to meaning.

2. A mechanical process for processing text and automatically generating a descriptor for it.

3. A mechanical process for summarizing text, including texts which consist simply of lists of descriptors.

In articles posted on the Internet a few years ago, I described a process that used vectors to indicate content. I suggested that texts be mapped into a vector space, and that the searching and storage of articles be related to that vector space. The only difference is that I am now speaking of a descriptor language, where before I spoke of vectors. In fact, nothing has changed, since at the innermost heart of the process, I still envision a vector space. The space of possible descriptors is no more than an intermediate layer between the human being and the vector space.

This is rather similar to the use of assembly language to program a computer. The human being writes expressions like JSR GETINPUT, or ANDX count,y and each of these expressions stands for a simple machine language instruction. Human beings can and have programmed directly in machine code by writing hexadecimal numbers instead of these mnemonics, but it is much easier to write in the more easily understood assembly language.

What I have been doing is working towards such a goal by experimenting with vector spaces that encode meaning or content, and by trying to develop a descriptor language that would provide an easy way for humans to work with such a vector space.

Let us assume that we have already defined a suitable vector space for information retrieval -- that is to say, let us assume that someone has achieved what I have been trying so hard to do. Given a well-defined vector space, we are now faced with the problem of talking about individual vectors -- individual points in that space. The normal mathematical way of talking about a vector is to list its coordinates relative to some set of coordinate axes. That would mean describing the vector as a list of numbers, e.g.

{1.6 7.3 9.2 -0.5 0.0 -0.9 6.5 6.3 -5.3 1.6 6.6 -3.1 1.7 0.0 0.2}.

This is just not a practical means of communication. Lists of numbers are just too hard to deal with.

But suppose we could accurately specify a vector by using a sequence of letters that form a pronounceable word, like any of the words in this sentence. Surely that would be an excellent way to talk about the vectors in this vector space. But is it possible? I think it is possible, and I have worked out several ways to do it. What I am going to do here is explain one of those methods -- not the best one, perhaps, but certainly the easiest to explain.

Before I begin I must make an observation about precision. The list of numbers given above seems to precisely pick out one vector from a 15-dimensional vector space, but anyone in the physical sciences would object that it uses only two significant figures for each coordinate, and therefore only delimits a small volume or neighbourhood within the vector space. That is correct -- and I think that is all we can ever hope to do. We can increase the precision and therefore pick out a smaller volume of space, but we can never hope to specify single vectors.

Let me contrast this lack of precision with an achievable goal, which is a lack of ambiguity. As I use these terms, precision refers to the size of the volume of space described, but ambiguity means that a given descriptor actually describes two or more distinct volumes of space.

Many words in natural languages like English are ambiguous in that they have two or more distinct meanings. The word `table', for example, may mean a flat board with several legs, that we eat off of, or it may mean a multi-column list on a computer, or it may mean the act of laying aside a proposed bill in the legislature. Each of these meanings would correspond to some separate volume or neighbourhood in the vector space. A good descriptor language would have separate descriptors for each of these volumes of space, and would therefore be unambiguous, as I use the term. But each descriptor would still have some imprecision, since it picks out a volume of space rather than a single vector.

With this distinction noted, I will now describe the simplest scheme for describing a vector by means of a word or sequence of words.

The simplest scheme bears a family resemblance to Hebrew in that it ignores vowels in writing words, but adds them for speaking purposes. Let us suppose we ignore A, E, I, O, U, and Y in writing words, and thus have 20 consonants. The basic idea is to operate in a 10-dimensional space, so that each consonant used in a word represents a significant component of the vector along the corresponding coordinate axis in one direction or the other.

The acronymic property comes from a decrease in weighting as we move from leftmost to rightmost letter in the word.

As an example, suppose we consider trying to specify a point on the 2-dimensional page or computer screen by using an alphabet of four letters: T (short for Top), B (short for Bottom), L (short for Left), and R (short for Right). We take as the origin the center of the screen or page.

The letter T, by itself, means "somewhere in the top half of the screen"; B means somewhere in the bottom half; L, somewhere on the left half; and R, somewhere on the right. TR means somewhere in the top right quadrant, LB or BL, somewhere in the bottom left quadrant, and so on.

But now look at each of the 4 quadrants and imagine dividing them in exactly the same way as you divided the whole screen. Therefore TRBL means "the bottom left subquadrant of the top right quadrant", and TRBLTL means "the top left subsubquadrant of the bottom left subquadrant of the top right quadrant".

It should be obvious that we can specify any point on the screen this way, to any finite level of precision.
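The quadrant decoding described above can be sketched directly. I assume here a unit square with the origin at the centre, x running left to right and y bottom to top; each letter halves the relevant dimension.

```python
# Sketch: decode a T/B/L/R string into the rectangle of the screen it names.
# The screen is taken as the square [-1, 1] x [-1, 1], origin at the centre.

def decode(code):
    """Return ((x_lo, x_hi), (y_lo, y_hi)) for a string over T, B, L, R."""
    x_lo, x_hi = -1.0, 1.0
    y_lo, y_hi = -1.0, 1.0
    for c in code.upper():
        if c == "T":                      # keep the top half
            y_lo = (y_lo + y_hi) / 2
        elif c == "B":                    # keep the bottom half
            y_hi = (y_lo + y_hi) / 2
        elif c == "L":                    # keep the left half
            x_hi = (x_lo + x_hi) / 2
        elif c == "R":                    # keep the right half
            x_lo = (x_lo + x_hi) / 2
        else:
            raise ValueError("expected one of T, B, L, R")
    return (x_lo, x_hi), (y_lo, y_hi)
```

Each added letter halves one dimension of the remaining rectangle, so `TRBL' names the bottom left subquadrant of the top right quadrant, exactly as described above.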

Suppose we add the letters `F' for front or foremost, and `H' for hindmost (B and R being already in use); then we can specify any point in a 3-dimensional cube by using strings of these six letters. FRTHLB would be the "hindmost left bottom subcube of the foremost right top subcube".

As it happens, talking about square quadrants and subcubes is misleading, since the most efficient use of this form of notation is to use non-square rectangles and non-cubic rectilinear solids. If we use cubes for the three dimensional case, then FRT, RTF, TFR, and TRF all mean the same thing. It is much more efficient to use a uniformly decreasing set of weights in which each successive letter represents a slightly smaller portion of the original space. If we do that, then RFT and TFR represent quite different volumes of space. RFT is a rectilinear solid that is short and wide, while TFR is one that is tall and narrow.

By having each successive letter weighted less than the previous one, we can be sure that letters to the right of a sequence are less important than those to the left. To obtain an approximation of a sequence by a shorter one, it suffices to truncate the sequence to the right, leaving letters to the left intact. The most extreme case is one in which the sequence of letters is truncated to the single initial letter. The initial letter of a word in this language is therefore the best single letter approximation to that word -- a property that is fundamental to an acronymic language.
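The weighted variant can be sketched as follows. The axis and sign assignments, and the halving ratio, are illustrative assumptions on my part, not a fixed part of the scheme; the point is only that letter order now matters, and that truncating on the right yields a coarser approximation of the same point.

```python
# Sketch of the decreasing-weight scheme: letter i contributes a step of
# size ratio**i along its axis, so RFT and TFR name different points, and
# truncation keeps the dominant (leftmost) contributions.

AXES = {"R": (0, +1), "L": (0, -1),   # x axis: right / left
        "T": (1, +1), "B": (1, -1),   # y axis: top / bottom
        "F": (2, +1), "H": (2, -1)}   # z axis: foremost / hindmost

def decode_weighted(word, ratio=0.5):
    """Map a word over R,L,T,B,F,H to a point in 3-space."""
    point = [0.0, 0.0, 0.0]
    for i, c in enumerate(word.upper()):
        axis, sign = AXES[c]
        point[axis] += sign * ratio ** i
    return point
```

With these weights, `RFT' puts most of its displacement along x while `TFR' puts most of it along y, so the two words name quite different regions; and the first coordinate of `RFT' is exactly that of its one-letter truncation `R'.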

Early descriptor languages used by librarians, like the Dewey Decimal system, might be better described as descriptor vocabularies, since they used only single word descriptors. More recent descriptor languages used pairs of single word descriptors, but none that I know of has actually used long sequences of descriptor words resembling the sequences of words in a natural language. But as will be seen, this is very important.

By simply adding letters to a single word descriptor, we decrease the volume of space and increase the precision at the same time. The word TRF specifies a large volume of space, and TRFFRBTHL specifies a much smaller one. We can also consider that TRF is not very precise and TRFFRBTHL is much more precise. But what if we want to specify a large volume of space very precisely, or a small volume of space, but with less precision? We need to somehow decouple volume and precision, so we can specify any volume with any degree of precision.

The answer is to use sequences of words instead of single words. Without going into details here, it should suffice to say that a long sequence of short words can specify a small volume without much precision, and a short sequence of long words can specify a large volume very precisely. And it is worth noting that the acronymic properties hold: a single word approximation to a sequence of words is the acronym of that sequence.

There are many more details that could be given at this point, but what I have said here should be enough to suggest that an acronymic descriptor language can easily be made once a suitable vector space is defined. That is the hard part.

Briefly, what I have been doing is linking together meanings according to the synonymity of words that express them.

First, I needed to find a good notation for meaning. As the "table" example mentioned above indicates, words in English are ambiguous, so they do not provide a good way of indicating meaning. But pairs of English words are much better. The word "table" is ambiguous, and so is the word "list", which may mean a kind of table, or the verb describing a boat that is tilting to one side. But the pairs "table, list", "table, desk", "table, delay" all have different meanings, as do the pairs "list, table" and "list, tilt".

Pairs of words form an adequate notation for meanings, and where they fail, triples of words will succeed. Using pairs of words it is possible to define a form of similarity between meanings based on synonymity of words as follows: if both of the words in one pair are listed as synonyms of both of the words in the other pair, then the two pairs of words describe similar meanings. This holds true with very few exceptions, and can be strengthened by using triples instead of pairs.
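The pair-similarity test just described can be sketched directly. The tiny synonym table below is an invented illustration, not an excerpt from any real thesaurus; the rule itself is the one stated above: two pairs are similar when every word of one is a synonym of every word of the other.

```python
# Sketch of similarity between meanings, where a meaning is a pair of words.

SYNONYMS = {
    "table": {"list", "chart", "desk"},
    "list":  {"table", "chart", "roll"},
    "chart": {"table", "list", "roll"},
    "roll":  {"table", "list", "chart"},
    "tilt":  {"lean", "slant", "list"},
}

def synonymous(a, b):
    """True if either word is listed as a synonym of the other."""
    return b in SYNONYMS.get(a, set()) or a in SYNONYMS.get(b, set())

def similar_pairs(pair1, pair2):
    """Two word pairs describe similar meanings when both words of one
    pair are synonymous with both words of the other."""
    return all(synonymous(a, b) for a in pair1 for b in pair2)
```

So "table, list" comes out similar to "chart, roll" (the list-of-items sense), but not to "list, tilt" (the leaning-boat sense), which is exactly the disambiguation the pairs are meant to provide.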

What I have done is to define several thousand meanings by using pairs of one-syllable words. The restriction to one-syllable words was made in part to limit storage requirements and in part to allow simpler analysis of word forms. I was able to create a network structure (what mathematicians call a connected graph), by using the definition of similarity given above. In this graph, meanings (defined by pairs of words) are the nodes, and nodes representing similar meanings are joined by edges (or arcs).

This graph can be considered as embedded in a vector space of many dimensions. The problem is to find a definition of that vector space, and then to find a 10-dimensional approximation to it. There are well-known mathematical ways of doing this, but I have not quite succeeded in accomplishing it, partially because of the very large size of the graph, which defeats any straightforward approach. Nevertheless, I am quite confident it can be done, and hope to have it available soon.
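One standard technique of the kind alluded to above is spectral embedding: take eigenvectors of a matrix derived from the graph as coordinates for the nodes. I do not claim this is the only route, and the four-node graph below is a toy; but the sketch, using simple power iteration in pure Python, shows the essential effect, which is that mutually similar nodes receive nearby coordinates.

```python
# Sketch: one coordinate of a spectral embedding of a tiny meaning graph,
# computed by power iteration on the adjacency matrix.
import math

def mat_vec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def normalize(v):
    n = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / n for x in v]

def leading_eigenvector(A, iters=200):
    """Power iteration: converges to the dominant eigenvector of A."""
    v = normalize([1.0] * len(A))
    for _ in range(iters):
        v = normalize(mat_vec(A, v))
    return v

# Toy graph: nodes 0, 1, 2 are mutually similar meanings (a triangle);
# node 3 is connected to nothing.
A = [[0, 1, 1, 0],
     [1, 0, 1, 0],
     [1, 1, 0, 0],
     [0, 0, 0, 0]]

coords = leading_eigenvector(A)
# The three mutually similar nodes get (nearly) equal coordinates, while
# the unconnected node is pushed to zero: similar meanings land together.
```

A full embedding would use several eigenvectors at once to get the 10 or so dimensions wanted here, and the difficulty the text mentions is real: for a graph of many thousands of nodes this straightforward approach becomes expensive.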

Once the vector space is well-defined, I should be able to make a simple acronymic descriptor language from it, and this would be of very great value in organizing the data that is available on the Internet. A lot of work will remain to be done, but I feel that I am working on something very fundamental.


Copyright © 1998 Douglas P. Wilson  

