Refactoring your code
There was once a time when I had a deep affection for mathematics. I think we fell apart sometime during college but my quest to learn programming has renewed this relationship.
A major concept in computer science that I’m still getting a hang of is refactoring. It’s very similar to factorization in mathematics where you have an expression that you want to reduce to “basic building blocks”. E.g. if “x = a + b + c” and “y = a + b” then you can factor “x = y + c”.
This same concept lends itself to computer programming by the name of refactoring. Similarly this process involves rearranging a program to facilitate code re-use and improve the readability of your code.
Here’s an example from today. I’m working on a procedure “add_page_to_index” that takes three inputs and updates the index to include all of the words found in the page content by adding the url to the associated word’s url list. If that’s a little confusing (or worded terribly) this problem set is one of many on the way to building a search engine. In this example I am looking for some content (word) and want to find all the url links associated with that word:
index = []
def add_to_index(index,keyword,url):
for entry in index:
if entry[0] == keyword:
entry[1].append(url)
return
index.append([keyword,[url]])
def add_page_to_index(index,url,content):
a = content.split()
#print a
for word in a:
for entry in index:
if entry[0] == word:
entry[1].append(url)
return
index.append([word,[url]])
add_page_to_index(index,'fake.text',"This is a test")
print index
The previous problem was already used to find a procedure that adds keywords to an index with their associated urls, “add_to_index” (which is the first part of the above code). I initially started this problem by rewriting the first part to include the “content” or words in the second part.
After some initial trial and error the second part of my code looked like this:
def add_page_to_index(index,url,content):
a = content.split()
#print a
for word in a:
add_to_index(index,word,url)
Unbeknownst to me I should have just used refactoring the whole time as I already had written what was needed previously. Eventually I saw that I could refactor the code and was able to solve the problem. I still feel very new to this but it gets me excited every time I notice small things like that. I think I’m slowly getting there.