Posts Tagged ‘chatbot’
AIML is a chatbot framework invented by Dr Richard Wallace. Sites like Pandorabots let users build their own bots with this technology. That’s the easy route, if you want it.
The first stop for building a real chatbot in Python is PyAIML, which can be downloaded here.
AIML (Artificial Intelligence Markup Language) is an XML-based format for encoding a chatbot’s “brain”. It was developed by Richard Wallace, and the resulting bot, ALICE, was the best of its time.
You can also download the standard ALICE brain here.
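To give a feel for what an AIML “brain” looks like, here is a minimal sketch that reads a tiny hand-written AIML snippet using only the Python standard library. This is not PyAIML’s API and it is not the ALICE brain: the two categories, the `load_categories` and `respond` helpers, and the default reply are all made up for illustration, and the matching is exact only, with none of AIML’s wildcards or recursion.

```python
import xml.etree.ElementTree as ET

# A tiny hand-written AIML document (illustrative, not from the ALICE set).
AIML_SOURCE = """
<aiml version="1.0.1">
  <category>
    <pattern>HELLO</pattern>
    <template>Hi there!</template>
  </category>
  <category>
    <pattern>WHAT IS YOUR NAME</pattern>
    <template>My name is Demo.</template>
  </category>
</aiml>
"""

def load_categories(source):
    """Map each <pattern> text to its <template> text."""
    root = ET.fromstring(source.strip())
    return {
        category.findtext("pattern").strip(): category.findtext("template").strip()
        for category in root.iter("category")
    }

def respond(categories, user_input, default="I do not understand."):
    """Exact-match lookup only; real AIML also supports wildcards and recursion."""
    # Normalise the input the way AIML patterns are written: upper case,
    # single spaces, no trailing punctuation.
    key = " ".join(user_input.upper().split()).strip("?!. ")
    return categories.get(key, default)

categories = load_categories(AIML_SOURCE)
print(respond(categories, "hello"))
print(respond(categories, "What is your name?"))
```

Each `<category>` pairs a `<pattern>` to match with a `<template>` to reply with, which is the basic building block of an AIML brain.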
I wanted my chatbot to respond to input statements with random facts that are somehow relevant to those statements. The bot would search the input for a term it ‘knows’ about and then give back a fact about that term. My copy of the schools’ Wikipedia, which I had torrented, looked useful for this. I converted it into plain text with a script and set about writing some code to strip out sentences or phrases, then to index those against a selection of sub-phrases which loosely correspond to the subject of each sentence. A dictionary would be fine for this, and since the whole schools’ Wikipedia came to about 150MB of text, it should be doable.
My first approach was to let the bot converse about famous people. The naive way to extract this kind of information was to look for two or three consecutive capitalised terms and assume that each such run is a proper name.
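The naive approach can be sketched roughly as follows. This is a reconstruction under my own assumptions, not the original code: the regular expression, the `find_candidate_names` helper and the sample text are all illustrative.

```python
import re

# Two or three consecutive capitalised words, e.g. "Isaac Newton".
NAME_PATTERN = re.compile(r"\b[A-Z][a-z]+(?: [A-Z][a-z]+){1,2}\b")

def find_candidate_names(text):
    """Return every run of two or three capitalised words found in the text."""
    return NAME_PATTERN.findall(text)

sample = (
    "The physicist Isaac Newton was born in Lincolnshire. "
    "Later Marie Curie won two prizes."
)
print(find_candidate_names(sample))
# prints ['Isaac Newton', 'Later Marie Curie']
```

Note the second result: the sentence-initial word “Later” gets swept up with the name that follows it.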
As I soon found, this approach returned a lot of data, some of which were names and some of which were not. I considered my options and decided it was fine to keep any capitalised entity, since countries, cities, elements and so on were just as suitable to have facts about them. The first word of every sentence is also capitalised, so I taught my program to ignore those. A small rate of error, such as when the input text has incorrect capitalisation or when a sentence begins with a proper name, was hard to avoid. This seems to me to be nigh universal in NLP: the input data is so unstructured that, assuming your program is not an AI, no algorithm can cover every bizarre case of language. 98% is pretty useful though.
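A minimal sketch of the refined rule (keep any capitalised entity, but skip the first word of each sentence) might look like this. The sentence splitter is deliberately crude, and the names `find_entities`, `SENTENCE_SPLIT` and `CAPITALISED_RUN` are my own, assumed for illustration.

```python
import re

# Crude sentence splitter: break on whitespace that follows ., ! or ?
SENTENCE_SPLIT = re.compile(r"(?<=[.!?])\s+")
# One or more consecutive capitalised words.
CAPITALISED_RUN = re.compile(r"\b[A-Z][a-z]+(?: [A-Z][a-z]+)*\b")

def find_entities(text):
    """Collect capitalised runs, ignoring the first word of every sentence."""
    entities = []
    for sentence in SENTENCE_SPLIT.split(text.strip()):
        # Drop the first word, which is capitalised whatever it is.
        _, _, rest = sentence.partition(" ")
        entities.extend(CAPITALISED_RUN.findall(rest))
    return entities

sample = (
    "The capital of France is Paris. "
    "Later Marie Curie studied there. "
    "Helium is an element."
)
print(find_entities(sample))
# prints ['France', 'Paris', 'Marie Curie']
```

“Helium” is lost because it opens its sentence, which is exactly the kind of small error that is hard to avoid with a rule this simple.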
Once I had the names, I built a dictionary of ‘facts’ for each name, and then my bot had something it could talk about. Yeeha! So far, so not AI.
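Putting the pieces together, the fact dictionary and the lookup could be sketched like this. Again, this is an illustration under stated assumptions rather than the code I actually ran: the three-sentence corpus, the `build_fact_index` and `reply` helpers, and the fallback message are all hypothetical.

```python
import random
import re
from collections import defaultdict

ENTITY = re.compile(r"\b[A-Z][a-z]+(?: [A-Z][a-z]+)*\b")

def build_fact_index(sentences):
    """Map each capitalised entity (lower-cased) to the sentences mentioning it."""
    index = defaultdict(list)
    for sentence in sentences:
        _, _, rest = sentence.partition(" ")  # ignore the sentence-initial word
        for entity in set(ENTITY.findall(rest)):
            index[entity.lower()].append(sentence)
    return index

def reply(index, user_input, rng=random):
    """Return a random fact about the longest known term found in the input."""
    lowered = user_input.lower()
    for term in sorted(index, key=len, reverse=True):  # prefer longer terms
        if re.search(r"\b" + re.escape(term) + r"\b", lowered):
            return rng.choice(index[term])
    return "I don't know anything about that."

# Stand-in corpus; the real input would be sentences from the plain-text dump.
corpus = [
    "The physicist Isaac Newton formulated the laws of motion.",
    "In 1687 Isaac Newton published the Principia.",
    "The capital of France is Paris.",
]
index = build_fact_index(corpus)
print(reply(index, "Tell me about isaac newton"))  # one of the two Newton facts
print(reply(index, "What about Paris?"))
```

Lower-casing the keys means the bot still finds a term when the user does not bother with capitals, and trying longer terms first stops a short term from beating a longer, more specific one.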