Writing about my life. When I'm well it's math and code… But when the schizy demon rises it's prose and poetry.

Posts Tagged ‘Python’

The Collatz Problem


I played around with this code, which is built for speed. It’s so simple that there’s little point in splitting it into functions, since that would bring no gain in readability. I will be writing more code to generate OEIS sequences in future, because it’s a nice pastime. I included extensive comments as a loose attempt at Knuth’s ‘literate programming’ style, so that documentation and code form a more self-explanatory whole.



Written by Luke Dunn

December 23, 2018 at 3:36 pm

Posted in Math, programming


Simple Python Project: Markov Text


Consider this sentence:

“the cat sat on the mat”

We can see the following about it:

the word “the” is followed by “cat” and “mat”
the word “cat” is followed by “sat”
the word “sat” is followed by “on”
the word “on” is followed by “the”

So from this sentence we can construct a dictionary like this:

catsat = {"the": ["cat", "mat"], "cat": ["sat"],
          "sat": ["on"], "on": ["the"]}
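
As a minimal sketch (not the post's own code), a dictionary like this could be built from any text by pairing each word with the word that follows it. The function names `build_chain` and `generate` are hypothetical, and the random walk in `generate` is only an assumption about how such a dictionary might be used to produce Markov text.

```python
import random
from collections import defaultdict


def build_chain(text):
    """Map each word to the list of words that follow it in `text`."""
    words = text.split()
    chain = defaultdict(list)
    # Pair every word with its successor: (the, cat), (cat, sat), ...
    for current, following in zip(words, words[1:]):
        chain[current].append(following)
    return dict(chain)


def generate(chain, start, max_words=10, rng=random):
    """Walk the chain from `start`, picking a random follower at each step."""
    word = start
    output = [word]
    for _ in range(max_words - 1):
        followers = chain.get(word)
        if not followers:  # dead end: nothing ever followed this word
            break
        word = rng.choice(followers)
        output.append(word)
    return " ".join(output)


catsat = build_chain("the cat sat on the mat")
print(catsat)
# {'the': ['cat', 'mat'], 'cat': ['sat'], 'sat': ['on'], 'on': ['the']}
print(generate(catsat, "the"))  # output varies from run to run
```

Note that “mat” never appears as a key, because nothing follows it in the sentence, so any walk that reaches it stops there.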


Written by Luke Dunn

December 31, 2015 at 9:09 am

Knowledge Discovery


I wanted my chatbot to respond to input statements with random facts that are somehow relevant to those statements. The idea is to let the bot search the input for a term it ‘knows’ about and then give back a fact about that term. The schools’ Wikipedia that I had downloaded by torrent looked useful for this. I converted it into plain text with a script and set about writing some code to strip out sentences or phrases, then to index those against a selection of sub-phrases which loosely correspond to the subject of each sentence. A dictionary would be fine for this, and since the whole Wikipedia was about 150MB of text it should be doable.

My first approach was to let the bot converse about famous people. The naive way to extract this kind of information was to look for two or three consecutive capitalised terms and assume that each such run is a proper name.

As I soon found, this code returned a lot of data, some of which were names and some not. I considered my options and decided it was fine to keep any capitalised entity, since countries, cities, elements and so on are equally suited to having facts about them. The first word of every sentence is also capitalised, so I taught my program to ignore those. A small rate of error, such as when the input text has incorrect capitalisation or when a sentence begins with a proper name, was hard to avoid. This seems to me to be nigh universal in NLP: the input data is so unstructured that, assuming your program is not an AI, no algorithm can cover every bizarre case of language. Being right 98% of the time is pretty useful, though.
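
The capitalised-run heuristic described above might look something like the following minimal sketch. It is not the original script: the function name `capitalised_entities`, the crude sentence splitting on `.`, `!` and `?`, and the `max_words` cut-off are all assumptions made for illustration.

```python
import re


def capitalised_entities(text, min_words=1, max_words=3):
    """Return runs of consecutive capitalised words, skipping sentence-initial words."""
    entities = []
    # Crude sentence split: break after ., ! or ? followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        words = sentence.split()
        run = []
        # words[1:] drops the first word, which is capitalised in any sentence.
        # The trailing "" is a sentinel that flushes a run ending the sentence.
        for word in words[1:] + [""]:
            token = word.strip(".,;:!?\"'()")
            if token[:1].isupper():
                run.append(token)
            else:
                if min_words <= len(run) <= max_words:
                    entities.append(" ".join(run))
                run = []
    return entities


sample = "The physicist Isaac Newton was born in England. He later moved to London."
print(capitalised_entities(sample))
# ['Isaac Newton', 'England', 'London']

# The known weakness: a sentence that starts with a proper name loses its first word.
print(capitalised_entities("Isaac Newton studied optics."))
# ['Newton']
```

The second call shows the kind of unavoidable error mentioned above, where a sentence-initial name is partly discarded along with the ordinary sentence-initial capital.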

Once I had the names I built a dictionary of ‘facts’ for each name, and then my bot would have something it could talk about. Yeeha! So far, so not AI.
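
A hedged sketch of such a fact dictionary and the lookup step follows. The names `build_fact_index` and `reply` are hypothetical, the sample sentences are made-up stand-ins for the converted Wikipedia text, and simple substring matching is an assumption rather than the method the bot necessarily used.

```python
import random
from collections import defaultdict


def build_fact_index(sentences, known_terms):
    """Map each known term to the sentences that mention it."""
    index = defaultdict(list)
    for sentence in sentences:
        for term in known_terms:
            if term in sentence:
                index[term].append(sentence)
    return dict(index)


def reply(user_input, index, rng=random):
    """Return a random fact about the first known term found in the input, else None."""
    lowered = user_input.lower()
    for term, facts in index.items():
        if term.lower() in lowered:
            return rng.choice(facts)
    return None


# Stand-in data; the real input would be sentences from the plain-text dump.
sentences = [
    "Isaac Newton formulated the laws of motion.",
    "London is the capital of England.",
]
index = build_fact_index(sentences, ["Isaac Newton", "London"])

print(reply("tell me about london", index))
# London is the capital of England.
print(reply("hello there", index))
# None
```

Storing a list of sentences per term keeps the lookup a plain dictionary access, which matches the idea that a dictionary is enough for a corpus of this size.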


Written by Luke Dunn

November 24, 2010 at 4:08 pm