famous authors are praised, respected, admired and missed due to their works. after reading all the titles of an excellent author, one is left wishing for more. curiously enough we now have the ability to produce more books that fit in the literary style, depth and topic tackling techniques of writers. perfection in this field opens a new era, you might hear of ~~mouse~~ cat trap by darwin, russian trip by mark twain and god beyond doubt by hawkins.
markov?
markov chains are probabilistic pattern generating models. ‘patterns’ can be events, text or geometrical arrangements
let us take the example of a person going to a fast food outlet where there are only three types of foods, rounder, burger and panini.
the probability of him taking a rounder after already taking one is 0.2 or 2/10
after a rounder, the probability of taking a burger is 0.4 or 4/10
after a rounder, the probability of taking a panini is 0.4 or 4/10
notice that there are three arrows going out for three choices, which when added make up 1
let us make up the remaining part for burger
now completed it looks like
B P P R P P …
matrix representation
the above can also be written as
but whatever does that means?
that’s why we have state space to specify from where we begin
state space {1 = rounder, 2 = burger, 3 = panini}
means we start first by rounder, then burger then panini
applying markov to text
the idea is to:
- build a markov model
- then to generate text through the model
rehearsal
we’ll have a program that scans every word and records the next word
let us take the text:
the meadow was green. the sun was shining. the boy went away. the glass was shining
tabulating next words we get:
the -> meadow, sun, boy, glass
meadow -> was
was -> green, shining,shining
sun -> was
boy -> went
went -> away
glass -> was
from those we can generate the text
the sun was green
with path
the meadow was shining
wait, what about probabilities ?!
well see
was -> green, shining,shining
when we randomly choose from [ green, shining,shining ] we actually have more chance getting shining than green as they occur twice more, no need of writing up the probabilities. the only draw back is programming efficiency, which we’ll clean up in another post
the code
def format_text(file_read):
# getting the data ready for generation
data = {}
lines = file_read.replace('\n', ' ').split('.')
for line in lines:
words = line.split()
for i,word in enumerate(words):
if i+1 < len(words):
next_word = words[i+1]
if word not in data:
data[word] = [next_word]
else:
data[word].append(next_word)
return data
def rand(data):
# chooses random element from dictionary
return random.choice(list(data))
def generate(times, data):
# specifies length of generated phrase
current = rand(data)
gens = [current]
try:
for i in range(times):
current = random.choice(data[current])
gens.append(current)
except:
pass
return gens
with open('source.txt', 'r') as source:
file = source.read()
data = format_text(file)
print(' '.join(generate(10, data)))
format text returns a dictionary. here is a snapshot:
{
'is,': ['besides'],
'unsupported': ['by'],
'one': ['which', 'of', 'at', 'reason', 'proposition', 'form', 'of', 'which', 'thing', 'thing', 'magnitude', 'kind', 'of', 'relation', 'fact,', 'fact,', 'involving', 'case', 'of', 'kind', 'atomic', 'of', 'happens,', 'something', 'is', 'would'],
...
i ran the program over some paragraphs of
Our Knowledge of the External World as a Field for Scientific Method in Philosophy
from
Mathematical logic, even in its most modern form, is not directly of
philosophical importance except in its beginnings. After …Our Knowledge of the External World as a Field for Scientific Method in Philosophy
by Bertrand Russell
to
Charles I. and death and his bed
are objective, but they are not, except in my thought, put together as
my false belief supposes. It is therefore necessary, in analysing a
belief, to look for some other logical form than a two-term relation.
Failure to realise this necessity has, in my opinion, vitiated almost
everything that has hitherto been written on the theory of knowledge,
making the problem of error insoluble and the difference between belief
and perception inexplicable.Our Knowledge of the External World as a Field for Scientific Method in Philosophy
by Bertrand Russell
the core is here
current = rand(data)
gens = [current]
try:
for i in range(times):
current = random.choice(data[current])
gens.append(current)
except:
pass
return gens
we start by adding a random word (a random element of the dictionary) to the list
current = rand(data)
gens = [current]
then we choose a random word from the words coming next
the code produces like
with length 20
again length 10
the phrase is correct euu yes, it imitates the style for now
infinite number of which we have the one thing having some other terms or opinion about Socrates–that he feels his insight, we can discover, for example, two classifications we mean that two terms, being equally true when one of the most
generated text - length 40
inductive principle, which need not known form made space and so that there were none except prejudice, so long as follows that, if they may sometimes know that we saw that I shall bring my umbrella if you could hardly be outlined in the whole theory of general truths by no
generated text - length 50
return to asymmetrical relations, such as follows that, if this way to be in the existent world, and red, and pure logic, no particular subject-matter otherwise than the case of the supposed common constituent, but are true or “one inch taller” or deny this is true, we are more than a “fact,” I have that membership of have knowledge is any
generated text - length 60
Thus “father” is the difference between B is something altogether more difficult, and common world would be inferred from sense, and mortality that certain relation is apt to be asserted or deny this hypothetical form, but the second being equally true in order to have accepted and B, and philosophically it is said to properties becomes obviously impossible to infer all propositions are and B, also knew that all things have
generated text - length 70
From poverty in any such as regards the subject-predicate form–in other thing is in its constituents and were less anxious to deal throughout the other property, then B and most of unreality of the weather had been written on the marks of a certain known objects are indispensable in the everyday world with completely general propositions, and so that there is independent of three things, it should be explained would not transitive, but to give rise to exist
generated text - length 80
room for improvement
well we can improve many things such as: first word first, choosing probabilities as numbers rather than from list, be sensitive to punctuations, improve context etc. a rough code lands up a pretty nice result