writing about my life

Archive for the ‘Natural Language Processing’ Category

Simple Python Project: Markov Text

with 2 comments

Consider this sentence:

“the cat sat on the mat”

We can observe the following about it:

the word “the” is followed by “cat” and “mat”
the word “cat” is followed by “sat”
the word “sat” is followed by “on”
the word “on” is followed by “the”

So from this sentence we can construct a dictionary like this:

catsat = {"the":["cat","mat"],
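
The excerpt cuts off mid-dictionary, but the table of followers listed above can be built mechanically. Here is a minimal sketch, assuming words are simply split on whitespace; the function name `build_followers` is made up for illustration and is not taken from the full post.

```python
from collections import defaultdict


def build_followers(text):
    """Map each word to the list of words that directly follow it."""
    words = text.split()
    followers = defaultdict(list)
    # Pair every word with its successor: (the, cat), (cat, sat), ...
    for current, following in zip(words, words[1:]):
        followers[current].append(following)
    return dict(followers)


catsat = build_followers("the cat sat on the mat")
print(catsat)
```

The last word, “mat”, gets no entry because nothing follows it in the sentence.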

Read the rest of this entry »


Written by Luke Dunn

December 31, 2015 at 9:09 am

Five more unusual adjectives



seem merely the customary platitudinous british holding up of horrified hands at american slavery

joan rattled on with the platitudinous originality of youth

Read the rest of this entry »

Written by Luke Dunn

March 5, 2014 at 9:00 am

Five unusual adjectives from Gutenberg



1. swollen and distended or congested
2. (of language or style) tediously pompous or bombastic

he was never so happy as when he was wrapping up some commonplace thought in a garment of sonorous but turgid rhetoric

i not only committed to memory the more turgid poems of the late lord byron

Read the rest of this entry »

Written by Luke Dunn

March 5, 2014 at 8:21 am

The second rap: questionnaire typology

with one comment

All human knowledge is about questions.

  • questions = problems in a high school calculus class
  • questions = problems in an IQ test

This new system could revolutionise IQ testing, because you could ask the subject about things they are truly interested in, rather than expecting everyone to be interested in trivial logic puzzles or puzzles involving shapes. This would be fairer, and moreover less prone to cultural bias, where students fail to answer optimally simply because the question didn’t interest them, rather than because they assessed the question and found it too hard!

A type 3/4 questionnaire in an educational setting would pinpoint the optimal learning modality of a student, so their learning could be accelerated by issuing questions and problems that stretch them in just the right ways. Not to mention the benefit that would result from making university exams into type 3/4 questionnaires, with a suitably designed rating system to maximise achievement and fairness of assessment. Read the rest of this entry »

Written by Luke Dunn

June 29, 2012 at 4:12 pm

About Questionnaires and classifying/implementing them


Part 1 – a brief Taxonomy

As far as I know, this is fairly new.

Type i questionnaires are just a linear sequence of simple, usually yes/no or 1-10 etc. questions placed, say, onto a single web form. They may result in a string of digits representing the answer set and are easy to implement.
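
To make the “string of digits” idea concrete, here is a small sketch of a type i questionnaire. The questions are invented for illustration, and the scale is assumed to run 1-9 rather than 1-10 so that every answer fits in a single digit.

```python
# Hypothetical questions; each is either a yes/no item or a 1-9 scale item.
QUESTIONS = [
    ("Do you read blogs?", "yesno"),
    ("How much do you enjoy puzzles? (1-9)", "scale"),
    ("Do you write code?", "yesno"),
]


def encode_answers(questions, answers):
    """Turn one answer per question into a string with one digit per answer."""
    if len(answers) != len(questions):
        raise ValueError("expected exactly one answer per question")
    digits = []
    for (_text, kind), answer in zip(questions, answers):
        if kind == "yesno":
            digits.append("1" if answer else "0")
        elif kind == "scale":
            if not 1 <= answer <= 9:
                raise ValueError("scale answers must be between 1 and 9")
            digits.append(str(answer))
        else:
            raise ValueError(f"unknown question kind: {kind}")
    return "".join(digits)


print(encode_answers(QUESTIONS, [True, 7, False]))
```

Because the sequence is fixed, the position of each digit is enough to tell which question it answers.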

Type ii questionnaires are more subtle because the questions form a tree, and different branches of the tree are selected based on previous answers. This allows for what are called contingency questions. Think “do you believe in a god?”: if “yes”, the succeeding questions ask for more theological details; if “no”, the succeeding questions ask about the nature of the person’s non-theistic beliefs, e.g. about the nature of the universe they believe in. Once a branch has been chosen the sequence stays type i for a while, until it hits another branch point. Read the rest of this entry »
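
One possible way to represent such a tree is as nested dictionaries. This is only a sketch: the follow-up questions, the `"*"` wildcard key (meaning “go here whatever the answer”, which models a type i stretch) and the function name `walk` are all assumptions made for illustration.

```python
# Each node holds a question and a mapping from answers to the next node.
TREE = {
    "question": "Do you believe in a god?",
    "branches": {
        "yes": {
            "question": "Which tradition do you follow?",
            # "*" means the next question does not depend on the answer.
            "branches": {
                "*": {
                    "question": "How often do you attend services?",
                    "branches": {},
                }
            },
        },
        "no": {
            "question": "What do you believe about the nature of the universe?",
            "branches": {},
        },
    },
}


def walk(tree, answer_fn):
    """Ask questions from the root down, following the branch each answer selects."""
    path = []
    node = tree
    while node is not None:
        answer = answer_fn(node["question"])
        path.append((node["question"], answer))
        branches = node["branches"]
        # Prefer an exact match on the answer, else fall back to the wildcard.
        node = branches.get(answer, branches.get("*"))
    return path


# Scripted answers stand in for a real respondent.
scripted = {
    "Do you believe in a god?": "no",
    "What do you believe about the nature of the universe?": "it is purely physical",
}
for question, answer in walk(TREE, scripted.get):
    print(question, "->", answer)
```

Swapping `scripted.get` for a function that calls `input()` would turn the same walk into an interactive questionnaire.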

Written by Luke Dunn

June 26, 2012 at 3:27 pm

Sentiment Ticker – The Rambling Bit


I have recently entertained a whole series of connected thoughts inspired by the idea of the sentiment ticker. To summarise, this is a device that uses keyword analysis to read the emotions of people who write content online. Because there’s so much content online, it may be possible to gauge the moods and feelings of whole groups, nations or the world. The technique is in its infancy, but leading hedge funds already use it to try to predict stock prices, so where the money goes, the rest may follow.

Coincidentally, at the same time I started listening to an audiobook of Isaac Asimov’s Foundation. I soon observed there was a synchronicity. Author’s note: I don’t believe such coincidences are magical or psychic in origin, as Jung thought, but they are still an interesting mental event and even a literary tool.

The sentiment ticker and opinion mining in general would definitely qualify as embryonic Asimovian Psychohistorical tools. Governments and organisations could use them for prediction and other research. A way to formalise events in current affairs and link them with observed sentiment trends would be very powerful. If connections could be found then you may have the beginning of actual equations.

The other connected subject is the real, as opposed to SF, discipline of Psychohistory. It uses psychotherapy techniques to try to understand the motivation of nations, groups and particularly political leaders. It is fascinating, and its primary conclusion is that child rearing is critical for the future of the species. This is because psychological damage to children propagates into damaged, maladapted adults who act neurotically, create conflict and perpetuate their damage in the world. The sentiment mining approach could be a very powerful tool for a modern de Mausian psychohistorian, because huge volumes of textual data could be sifted for emotion words and phrases which correspond to psychohistorical patterns. Incidentally, the baroque violence and depraved imagery of newspaper political cartoons are currently one of the richest veins for sentiment mining by psychohistorians, but an image processor that could do this job would currently be very hard to build.

Read the rest of this entry »

Written by Luke Dunn

December 7, 2011 at 5:49 pm

New Concept – The Sentiment Ticker

with 2 comments

The Concept

The sentiment ticker is a small widget that sits on your computer’s desktop, much like the conventional stock ticker. This widget connects to a central server cluster run by the provider. It provides a graphical/numerical display of various indices of market, political and consumer mood and feeling. The central servers constantly poll blogs, tweets, journalistic articles and press releases and perform sentiment analysis on this online content. Sentiment mining means looking for clusters of mood and emotion words in certain contexts. This has been shown to be useful, most notably recently when the Arab Spring was “predicted” (in hindsight) by analysing the sentiments of millions of bloggers, tweeters etc. A statistically significant spike was found in Middle East mood just before the Spring started.
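
As a rough illustration of what counting mood and emotion words could look like, here is a toy sketch. The two word lists are tiny and invented, the scoring formula is just one simple choice, and the names `mood_score` and `mood_index` are hypothetical; a real service would need a much larger lexicon and some handling of context.

```python
import re

# Tiny invented lexicons, for illustration only.
POSITIVE = {"hope", "happy", "optimistic", "calm", "confident"}
NEGATIVE = {"angry", "afraid", "fear", "unrest", "furious"}


def mood_score(text):
    """Score one document from -1.0 (all negative words) to 1.0 (all positive)."""
    words = re.findall(r"[a-z']+", text.lower())
    positive = sum(word in POSITIVE for word in words)
    negative = sum(word in NEGATIVE for word in words)
    total = positive + negative
    if total == 0:
        return 0.0  # no mood words found, so treat the document as neutral
    return (positive - negative) / total


def mood_index(documents):
    """Average the per-document scores into a single index value."""
    scores = [mood_score(doc) for doc in documents]
    return sum(scores) / len(scores) if scores else 0.0


posts = [
    "People are angry and afraid after the announcement",
    "I feel calm and confident about next year",
]
print(mood_index(posts))
```

A ticker would recompute an index like this at intervals over freshly polled content and plot how it changes over time.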


A stock ticker is a small program that stays open on your screen with real-time updated information on stock prices and the state of the various markets. It usually shows a graph or two and the price information you have set it up to show. Traders use them constantly.

Read the rest of this entry »

Written by Luke Dunn

December 6, 2011 at 1:53 pm