• Home
  • About
  • Archives
  • Book
  • Contact me
  • Photos
  • Projects
  • Talks
Subscribe: Posts | Comments | E-mail
  • ArticlesArticles which I authored
  • GSOCGoogle Summer of code archives
  • HacksExperiments
  • LifeIn and around life
  • Open SourceFree and Open Source Software
  • PardusContributions with Pardus Project

Sarath Lakshman

Posted on November 28, 2011 - READER COMMENTS (14)

Preparing for your first-job interviews

Articles General Interview NEW Thoughts
FacebookGoogle BuzzIdenti.caShare


It has been a long time since i wrote a blog entry. Here is some interesting piece for final year computer science students.

Getting a job is one of the happiest things in the life of a final year guy. I also had such wonderful moments during my final year. I would like to share some bytes of info that can help you out.






University/College studies and your first job


In the long four years of engineering course, you study a lot of things. Lot of junk and little good. In reality very few subjects really help and are useful for a computer science job. For getting a job, you will need even fewer set of subjects among them. Hence, finding a job is easy. But you have to master the subjects that you love. I will list out the few subjects that will help you find a job.

Data structures, Algorithms and Analysis, Operating Systems, Computer Networks, C Programming, Object Oriented Programming, Database Management systems, Compilers (Optional), Distributed Computing (Optional), Microprocessors (Optional)

The perspective of current education system and the industry goals diverge very much. In colleges, we study for obtaining some marks. The teachers also have the goal to help their students to obtain marks rather than learning something that may help gain the ability to solve computer science programs using their skillsets. In industry, the ultimate goal is to produce good quality software within short span of time. That means, the skill requirements are good coding skills, ability to understand and solve problems algorithmically with analysis to validate and come up with optimal and practical solution. To grow up as a software engineer that meets the industry requirements, the education system at college fails.

You have to put good self effort to gain the skillsets and the passion for learning. You should have good coding skills with proficiency in one or more programming languages, understanding of standard algorithm techniques at basic level. Let us have a run through objectives during preparation for a job.


Preparing your CV


Curriculum Vitae is important while applying for a job. Your CV is a blueprint of your personality. It is the first phase during a recruitment that gives the recruiters an overall view of your skillset and your background. Hence, it is worth to spend few days in preparing your CV. Note the following things while preparing your CV.

Use a different layout from other candidates who are applying for job along with you. Make yourself different from others. Write a career objective that states your interest. If you are applying for a specific role, Write a highlights section as the first section in your CV. Highlights should list out important achievements and your skillsets in bullets. This section is indented for HR who screens your resume. The next section can be ‘Skills’, which lists out your skill sets and programming languages. Write in an order separated by commas such that less proficient technologies should come last. You should also mention the operating systems you are familiar with. The next section can be Achievements. But you can move this section to the end of the resume if you think that you do not have considerable good technical achievements for highlighting. The next section should be Projects. This is the most important section in your resume. You should spend some time in writing this section. You should specify the title of the project, duration (optional), the technologies used, a summary of the project describing the project in few words, and project highlights section. The project highlight section should list key features about your project listed as bullets. Write them in an impressive manner stating the facts properly.

Eg.

1. Implemented a H.263 video streaming library for android 2.3

2. Implemented video frames collector and device mapper that converts video stream to v4l linux device

3. Tested on Android 2.3 and found that 25 % performance improvement than the bundled video library

If you have lot of numbers to showcase the benchmarking or awesomeness of your project. that is great. If your project bagged some awards or deployed for some good numbers of users, mention that figures and achievements in the highlights.

I would prefer to order the project in the order of significance rather than chronological order. After the project section, have an achievements section which lists all technical and non-technical achievements. You can split your achievements section into subsections like publications, events, etc. Write the educational background section as the last section of your resume. Because it is the most insignificant portion of a resume if you are looking for a good computer science job. It states few figures that indicate your marks which is not an indicator of your knowledge.


Tips for interview preparation


You should acquire sound knowledge about few of the subjects listed in the earlier section of this article. Usually recruitment process consists of a written test paper followed by a couple of interviews. Focus of your preparation should be based on the job position you are applying for. If you are preparing for a developer interview, you should be sound with Data Structures and Algorithms. You might be bored with the subject since you attended some boring series of lectures from colleges to grab marks. Try again approaching the subject in a different manner. You will definitely enjoy it. Start reading the book ‘Programming Pearls’ before you start preparing. It will give you a wonderful insight you never had before. I will focus on developer interviews.

Companies usually ask some technical aptitude questions, puzzles and questions from the subjects along with the test paper. Most of the questions will be repeated. Search for programmer interview aptitude questions and brain teasers for programmers on Google. You will find a good list. Try to solve them. Learning algorithms are not hard. But understanding the use cases and ability to apply them to solve problems require some effort and practice. Whenever you learn an algorithm, try to implement it using a programming language. For practicing algorithms to coding, you should use some highly object oriented and simple languages. The best language you can learn is python. You can practice coding algorithms with Python without any implementation complexity and it will look like a pseudo code. Learn python today. Seriously it won’t take more than a day to learn things that you will require to implement algorithms. C is a great language. Implementing certain Data structures or algorithms in C gives you a good experience to code well in C. I will write some notes on few algorithms that you should try implement in C. Algorithm analysis for important algorithms should be understood. You should know the worst case complexity (Find out the worst cases in the case of a particular algorithm), Best Case complexity (Also the best case) and the average case complexity (Also the average case example). Space complexity of the algorithm should be known in order to select the best algorithm according to problem environment.

 

Data Structures and Algorithms

Binary Search

This is a very important algorithm you can apply at many different problem environments you never expect.

Understand the runtime complexity for the Binary Search and understand how to derive the complexity from algorithm. Note that it can be applied for only sorted lists. Learn how to apply Binary Search in a Rotated Sorted List by using recursion and how the algorithm complexity varies. Implement Binary Search using C for a list of strings. You should familiarize the terms in place sorting, stable sorting and also should understand which sorts come into these categories.

Insertion Sort

Understand the algorithm and derivation of algorithm analysis. Compare it with Card game in which we move the cards to the suitable position. Keep the example in mind and apply to similar problems. In Insertion sort, we sort the element set by consequently moving the current element to the appropriate position in an already sorted set.

QuickSort
QuickSort is a very important sorting technique. It uses divide and conquer technique. Divide the given set of elements into smaller sets recursively and apply comparison and swap. When comparison and swap is performed to formulated smaller sets, it results in the larger sorted set. Learn algorithm analysis. Learn how to find kth smallest element by modifying the Quicksort algorithm in O(nk) complexity. Find out mean of a set of elements in O(n) by modifying the above problem.

MergeSort
Mergesort is also a divide and conquer sorting technique. The concept is to merge two sorted list to obtain larger sorted set. By dividing the given array into smaller subset by recursion, smaller subsets are formed. Merging the subsets from lower level to higher level, we obtain sorted array. Learn algorithm analysis. Note that merge sort takes an extra array. Hence this is not an in place sorting. It has O(n) space complexity. You should practice problems related to merge sort. Eg. You are given two sorted arrays with size n and 2n. The second array contain n elements in the positions 0 to n-1. Now without using extra space, formulate the elements in 1st and 2nd array into 2nd array and return a sorted array of size 2n.

HeapSort
Heapsort is an interesting sorting technique. Heap is a tree in which parent node is always >= child nodes (called as max-heap) or parent node is always <= child nodes (called as min-heap). This is the basic property for a heap. Let us have an overview of how to create a heap and manage it. When we need to add an element into an existing heap, we add the element as root or to the rightmost bottom element in the heap. Then apply heapify operation. Heapify operation can be of two types: shiftup and shift down. When we add a new element to an existing heap as root element, we perform shift down operation. Shift down operation performs a traversal from root level to bottom level, at each level of traversal, it compares whether heap property is violated, if so it will perform swap between parent node and child node to obey the heap property. Hence the element we added as root will move to the accurate position when the traversal reaches the bottom level. If we add a new element to the bottom right element, we need to perform a shift up to position the element to the right position. We traverse from parent in the bottom level to the root, by checking the heap property at each level and swapping elements to meet the heap property, we get a balanced heap when traversal reaches the top element.

Heapsort makes use of these operations to obtain a sorted set. Let us assume we have a heap (1,n). the root element will be the highest value (max-heap). Hence it will be the last element in the sorted list. We swap the root and the last element. Now the heap property is lost. But the nth position of array has the correct element in the sorted list. So we exclude nth element and heapify the heap(1,n-1) by using shift down operation. Because the root element is the one breaking the heap balance. After doing heapify 2nd time, we get 2nd highest element as root element. Now swap root element with n-1 element. Hence n-1, nth elements are 2nd largest and largest elements. Now exclude the n-1 and nth element, heapify heap(1,n-2). Follow the procedure until the newly formed heap size become one. You will get a sorted list. Read the chapter Heaps from Programming Pearls (It will give you a wonderful insight). Practice the problems: Find kth largest element from a given unsorted array. Implement priority queues.

External Sorting

External sorting is an important sorting technique used when the amount of data we need to process is greater than the available memory. For eg, we have 1GB of integers and 256MB of RAM. Hence it is clear that we cannot load entire list of numbers into ram and perform in memory sorting. External sorting techniques are to be used to solve this problem. K-Way merging is one of the simplest methods to solve the problem of RAM < data size. We can split the data into K parts. The part split is performed such that each split is less than the size of RAM. Then we can sort each part individually using any sorting algorithm. Then we can perform a special type of merging to obtain sorted output. Let us see how to perform the merge.

For eg, we split the data into 4 parts and we individually sorted them. Then take the first element from each 4 sorted lists and sort them and find out the lowest element. It will be the first element in the sorted output. Add it to the new list called full sorted array.

Now, from the array from which we obtained the lowest element, take the next element, sort the list again and find out the second lowest element. From the array we obtained 2nd lowest element pop out the next element and sort again to find out the third lowest element. Proceed the process until all arrays becomes empty or one array remains few elements. If an array remains unempty add those elements in the order to the full sorted array. Have a look at implementation code (http://code.google.com/p/kway/)

Bit array technique for solving RAM < Data problem for sorting counting numbers

In a 32 bit system an integer takes 32bits to store an integer and a 64 bit system takes 64bits to store a number. But for storing counting numbers, we can use bit vectors which are formulated by using 64 or 32 bits in an integer. If we set 0th bit in 32 bit we can represent it as 1. If we set 2nd position, we can represent it as 2. If we define an integer array of size N, we can actually represent 32*N numbers using that integer array. In order to sort large number of unique counting numbers we can use, bitsorting by setting and clearing bit positions. If we need to represent 1 to 68 numbers we need only 68 bits. We can represent it using an integer array of size 3. Ie, 32*3 = 96 bits. To set 68th bit, we know that 68th bit is situated in the array offset 2. To obtain the array offset, divide the number by 32. (68/32 = 2). Now we need to know which bit position needs to be set in the 32 bits available in array[2]. For that, findout modulus by 32. 68%32 = 4. Hence set the 4th bit in the array[2]. This can be performed without division and modulus operators by using bit shift operators.

i=68
array[i>>5] |= 1 << (i & 0x1F)

Here we find out i/32 using right shift operator (Each right shift causes division by two. Five times rightshift = division by 2^5 (32) ). By using AND operation with 0x1F, we get 5 Least significant bits, the value of 5 LSB returns in the position in 32 bits. Hence we shift 1 towards that much positions to left and is ORed to do the bit set operation.

Have a look at the implementation code. http://cm.bell-labs.com/cm/cs/pearls/bitsort.c

 

Bit manipulation problems

By using bit manupulation, we can do lot of tricks over numbers. See http://graphics.stanford.edu/~seander/bithacks.html for lot of interesting bit twiddling hacks.

One of the very common problems, is counting the number of set bits in a number.

int count=0

while (n){
    n=n&(n-1)
    count++
}

n&(n-1) will return a number obtained by setting rightmost set bit in number n to zero.

Another problem is to check whether a number is power of 2. For a number which is power of 2, there will be only one set bit in the number. Hence if we do n=n&(n-1), we will obtain zero. Using a single line operation we can identify power of 2 or not.

Hashing

Hash is an important data structure that can be used to solve different problems. When you are asked to find the number of occurrence of numbers in a given list of numbers, you can simply use hash for solving the problem. Iterate through the list of numbers, like:

for (n in numbers)
{
    hash[n] = 0
}

for (n in numbers)
{
    hash[n] = hash[n] + 1
}

We can solve many problems in O(n) using hash.
Implementing a hashtable in C is not easy at a first attempt. Try to code yourself a hashtable in C using pointer to pointer.

Binary Tree and Traversals

Binary trees are common interview questions. There are lot of BT based questions. Have understanding of common questions like the following.

* Difference between full binary tree and complete binary tree

* Find out Maximum/Minimum height of a tree (Recursive and Non-Recursive)

* What is the maximum number of elements in a tree with height H.

* Nth smallest/largest element in a binary tree

* Algorithm to find out Least Common Ancestor (LCA)

For your information, Least Common Ancestor is the common node in a binary tree which is obtained by traversing from two selected leaf nodes to the root element.

Linked list problems

- Reversing linked list

Linked lists are also very common interviewer question. First practice to be done for linked list problem is to write a linked list structure in C yourself and implement linked list traversal. Then add functions to reverse the linked list in place as well as by creating new linked list. If you do not want a new linked list, but you only need to print the elements of linked list in reverse order, use a recursive function that can do recursive calls till the end of linked list and print the elements.
Eg.

void reverse(struct linked_list *list)
{
   if (list->next!=NULL)
       reverse(list->next);

   printf("%s\n", list->element);
}

- Cycle in a linked list

Test for cycle/loop in a linked list is a commonly asked problem. You can initialize two variables as start node for linked list and traverse in a while loop such that while loop ends when one of them becomes null or both variables becomes equal. In the while loop, we traverse two variables with different speed.
(varA=varA->next, varB=varB->next->next)

Have a look at well explained tutorial, ?http://ostermiller.org/find_loop_singly_linked_list.html

Tree traversals

It is very important to understand all the tree traversals and implementation.

1. Preorder traversal

2. Post order traversal

3. Inorder traversal

Traversals can be easily implemented using recursion. But interviewers might ask about non-recursive algorithm. In that case, use stack based algorithm to explain inorder traversal. You can easily implement inorder traversal using stack.

Graph Traversals

Graph traversals are commonly asked in interviews. Have the understanding of Shortest path algorithms.

Depth First Search

In depth first search initially traversal goes deep into deepest node and traversal proceeds. You can use a stack to implement depth first search or else ?you can use recursion to implement this.

Breadth First Search

In breadthwise traversal, you can use it to print the tree in the sorted order. You can use the same algorithm used for depth first search by changing stack into queue to obtain the algorithm for BFS.

Dynamic programming
Dynamic programming is an important algorithm technique to solve a large problem by splitting into smaller overlapping problems. When overlapping small problems are solved, the larger problem solution is obtained. Problems like finding shortest path can be solved using dynamic programming. It usually involves using a storage of subsolutions so that they are used in solving bigger problems which overaps the subsolutions. The dynamic programming is difficult to identify as well as apply to solve problem scenarios. It requires considerable spending of time to learn and master it. When you look into some problems and look at its solutions, you may feel it is not that hard. But when you are given a different problem you may not be even able to identify it can be solved using dynamic programming. Even if you identify, you will find hard to code the problem solution. Hence, give considerable time to work on this one.

Try to learn the problem to find subsets of a set using dynamic programming

Trie data structure
Trie is an interesting data structure that can be used to implement autocomplete feature. You can read more about trie from my older blog post. (Implementing autocomplete with trie data structure)

Conceptual Questions

Lot of conceptual questions are being asked during interviews. It will test your basic knowledge and understanding. Find some of the commonly asked topics

* CPU Scheduling algorithms

* Layers of TCP and OSI network stack

* Understand how Virtual Memory/Paging works

* Understand what happens when you enter a URL on web browser and how website is loaded

* Understand how a computer boots and explain the story

* What is the difference between 32 bit and 64 bit machine and OS

* Understanding TCP/UDP protocol

* Understanding ARP/RARP

* Understand DHCP

* Understand DNS (Recursive, Iterative resolution)

* Understand how email works (POP, SMTP)

* Understand Web 2.0, REST, Thrift, RPC

* Understand IPV4 vs IPV6


Books/References


1. Programming Pearls
Programming pearls is a great book you should read as a computer enthusiast. You will be inspired to learn about data structures and algorithms.

2. Cracking the Code interview
It is a nice book consisting of lots of interesting questions

3. Glassdoor.com
Glassdoor is a great website consisting of lots of questions being asked for different companies. As a first programming exercise, write a perl/python/bash script to parse questions into a text file. I had a python script that I had written long time ago. (Lost that somewhere)


Choosing your first job


Every job interview is a great experience. In my life, i had attended three job inteviews and ended up in receiving 3 offers. Each of the interviews were different experiences. Once you face interviews build positive approach in finding feedback yourself. When you receive multiple offers, put enough effort to understand about what you are going to do with each of the job offers your receive. Choose the job you love to do, so that you never have to work a day in life. Thanks and all the wishes.

You can find few posts about interviews from this blog here, http://www.sarathlakshman.com/category/interview/
I dedicate this blog post to all my juniors in Computer Science Dept, Model Engineering College, Cochin

Image credit: http://www.flickr.com/photos/stevefrog8/

Posted on April 1, 2011 - READER COMMENTS (8)

The story of my job interviews with Taggle.com and Yahoo!

Interview Life Thoughts
FacebookGoogle BuzzIdenti.caShare


It has been a while since I thought of writing my previous job interview experiences with different companies.

Taggle

Taggle.com came to our campus in month of July 2010. It was CTO, Tej Arora who came to the campus for the recruitment. First of all there was a Presentation about Taggle Internet ventures and how it works.

Taggle.com is a group buying website where you get goods for reduced prices, with greater than 50 % off when there are a group of people to buy it. We had a objective multiple choice test of around 40 questions. It consisted of few aptitude questions, data structure questions, etc. It was a good question paper. By evening 5 pm, the result of the technical test came out. There were around eleven guys shortlisted for the next programming test. The eleven selected candidates were send for the programming round. We were allowed to write code on our own laptop and use any programming language we liked. He gave us two set of questions. Set 1 consisted of 1 difficult question and other set consisted of 2 easy questions. We were able to choose one set for coding. I chose the question to implement text auto completion functionality (set 1) and wrote the code in Python. He verified my program and told me to wait and come back once the programming round is completed by others. My friend Fayaz also had written autocomplete functionality. There were other two girls Nishita Suresh and Legena P.K who had worked on the other set of problems. Four of us had personal interviews. He didn’t ask me any technical interview questions but we had a very friendly conversation about the work and benefits at Taggle Internet ventures.

Once interviews were completed, the results were announced. Fayaz and Me got placed in Taggle.com.

Yahoo!

Yahoo! came to our campus on October 30th, 2010. It was a day before seventh semester university exams started. Yahoo was considered as the superstar company that comes to MEC campus with highest pay and perks. The day when placement cell announced ‘Yahoo’ is visiting campus, everyone looked with wow. Placement cell members gave us the info that Yahoo! is going to recruit for Service Engineering team where they look for guys who live and feed in UNIX environment. In the following days placement cell posted specifications and info on what they are looking for and their requirements. There were a lot of XMECians working in Yahoo and they send us some materials they studied during their time. Everyone started seriously preparing for Yahoo with lot of effort. I also wanted to get into Yahoo. It was the time I was working on my book and I had hectic schedules. Some of my seniors who got into Yahoo were famous for Shell Scripting and sed. So I had thought of seriously looking into SED. I spend few days on SED and AWK. It was really nice writing sed scripts, which looks very awkward but performs incredible text processing operations in single line of code. To brush up my shell scripting skills, I went through the first draft of my incomplete book. But, that helped me a lot to fix bugs in my book. I also brushed up few conceptual things like How E-mail works, Networking basics, etc. The day of Yahoo interview came. The cut-off percentage was 70%. There was a presentation on Yahoo! and what they are looking for? Benefits and perks at yahoo. Then we attended the screening test. The test consisted of few aptitude questions, lot of Perl questions, networking questions, questions from OS scheduling, SQL and few other things. But it was not that difficult. After the test, in about an hour the results were out. 15 guys were in. The next was programming round.

They gave two questions, To write an intruder detection system script by parsing the auth.log log file and program for generating random sequence of n numbers from single random seed. I wrote the script for intrusion detection and basic implementation of random sequence generator (I had uncertainty about the question and what I had done was slightly different from what they had meant). After the programming round, they shortlisted four candidates. Joju John Joseph, Subeen N, Neha Mahadevan and me. They announced that there will be three rounds of interviews (two technical and 1 HR round).

My turn for the interview came. They scanned through my Resume and were impressed with my work and Book. Interviewers asked about my interests. I told them that I live in GNU/Linux. One of the interviewer asked me to narrate the story of a computer from the time we press power button until it boots up. I had a long narration of the story of computer boot ups including in-depth explanation of Ramdisk and all (Actually Linux boot was one of my favorite things which I had worked on). Then he asked few questions like What happens when a user browse a Web page, DNS query, DNS records, and few other questions. The interview went through topics like GDB, Core file, Debugging, Killing processes, Init, Signals, Orphaned processes, SSH, SSH Auto-login, and many other questions. I don’t recall most of them. At the end of the interview, they told that they were pretty impressed and satisfied. The next round of interview was HR interview. It was very friendly in nature asking me usual HR questions. He was busy noting down my details on a form during the interview. After the HR interview I went for the second technical interview. They told that there is nothing to ask and we were having friendly conversation about college and environment. The interviewer told me about his college, Yahoo recruitment experience and few things about work environment. When my interviews were over, I had to wait outside with Placement cell volunteers until the three rounds of interview gets finished for other three guys. It took lot of hours. Finally they announced the result. Joju John Joseph and I got the placement offers for Service Engineering team. Neha and Subeen got internship offers.

I will write about my Zynga interview experience soon. Stay tuned!

Posted on March 28, 2011 - READER COMMENTS (6)

How to apply for Google Summer of Code – A byte of note

GSOC Open Source Thoughts
FacebookGoogle BuzzIdenti.caShare


Google Summer of Code organizations list has been published. Soon, within april 8th you will be able to submit your applications. Google summer of code is a premier Open Source programs managed by Google. The certificate of Google Summer of code adds higher market value to your resume. You will be receiving certificate stating 3 months Student Developer for Google and also you will be receiving a good paycheck of $5000.

I participated thrice (2008,2009,2010) with different organizations. I would like to give you few advices regarding how to apply and participate.

There are 171 open source organizations got accepted by Google for GSOC 2011. You can see the list from:
http://socghop.appspot.com/program/accepted_orgs/google/gsoc2011

Each of the organizations will accept few projects (No of projects as per decided by google). No of projects for each organization varies according to the market value of organizations. You can submit one or more (I prefer 3 applications) to same or different organizations.

Each of the organizations will have there project ideas page. You can select an idea that interests you or propose a new idea to them. (But already listed idea has more changes to get accepted – I feel so).

Each student who get accepted will be assigned with a mentor from the corresponding organization. Mentor will be a very experienced guy who is already contributing to the projects. In my case during my first summer of code, surprisingly my mentor John Palmeri is the author DBus-IPC, Gnome Executive and Author of GNOME network manager. You can get such kind of exposure during the project.

How to apply for a project ?

Go through the accepted organization list, Click on the organization, Go to the ideas page. You can find many ideas listed with details (Prerequisite, Difficulty level, Expected mentor, Contact person, Related information URL, URL of mailing list, etc). If you find some idea interesting, invest some time researching about the background, and related technologies related to the idea. Once you gain basic information about what you exactly need to do with project and what technologies you should use, contact the mentor or person listed.

To collect necessary details about the project and get the background of technology, look into mailing list associated with the project/organization. You can also subscribe to the mailing list.

Contacting organization or associated person:

e-mail :
Write properly what idea you want to work on, your background and how you wanted to work on the project. Include as much as details on how you will implement the project and ask necessary questions.
IRC Chat:
If you are not familiar with IRC, IRC is a type of group chat system widely used in Open Source development environment. Each organization will have a channel (Eg. #pardus-devel – pardus linux developers channel), where you can login and chat with a registered nickname. For open source project, the chat server will be irc.freenode.net.
To use IRC, you can install XChat client using: sudo apt-get install xchat
or use the webchat interface :? http://webchat.freenode.net/
For basics of IRC chat, read http://www.irchelp.org/irchelp/irctutorial.html
You can use the assigned contact person’s nickname to contact him over IRC.
IRC is a public chat where many are interacting each other. Please do not spam or ask stupid questions like (I am newbie. I want to participate in GSOC. Plzzz help.)
Write proper words rather than using chat language like ( u thr. plz. i gt ths idea frm). Do not flood the channel. Be very decent and formal over the chat channel.
Before you ask questions, I strictly recommend read this article to following the hacking culture:

http://www.catb.org/~esr/faqs/smart-questions.html

How do each organization select Project proposals ?

From my experience, there will be a group of people from the GSOC organization who review the project proposals. They will have an internal voting system. They vote induvidually and rank the list of proposals. According to the number of project slots assigned for the organization, they select the higher ranked proposals. The most higher preference is always to get successful completion of projects. So they look heavily at the strength of your project proposal. The project proposals that consist of as much strong details to support the fact that you are going to complete the project get accepted.

How to write a project proposal ?

Usually you will find a specified format for the project proposal for each of the organization. Once the application period opens, you will come to know about the format.
To have strong proposal I recommend to include the following things:

  1. Your background. Showcase your abilities (Even if you haven’t done much. Market what you have)
  2. Do enough research on project ideas and arrive at list of technologies you want to use (Include libraries required, dependencies, challenges, diificulties). This should show much of research and effort you have invested in the project.
  3. List out the features of the idea you are going to implement and how it benefits. (You can discuss with the mentor and arrive with features and benefits)
  4. If you had contacted the mentor before and had positive conversations about the project and discussions. He might also talk about you to other individuals in the organization. You can talk the public organization IRC channel, it gives lot of attention to you from the members who are going to vote for your proposal. If you have discussed technical details and give confidence in your ability, that will be positive to you while they vote.
  5. Prototype (Optional)
    Prototypes are bonus points for a proposal. You can create GUI prototype designs, code prototype designs, etc if necessary. (Last time, I coded a command Pardus Linux Live installer in 500 LOC, served as prototype)
  6. Timeline
    Timeline is a must for a project proposal. You should indicate how you are going to spend three months time to completion and development of project.
    A good format will be :
    Week 1 (May X – May Y) – Development of M module. Customization of C using T tool and commit changes…etc
  7. Version control
    You should be familiar with some version control systems like SVN, GIT, BAAZAR, etc
    Here is a basic How to on SVN:?http://betterexplained.com/articles/a-visual-guide-to-version-control/
    You should let then know that you are comfortable with the organization specific version control.
    Before writing the proposal you should check out code from the corresponding branches related to your project idea.

For example, if you are contributing to Webcam app – Cheese, you should check out code of cheese using Git version control and build the application from source code and try out things.

All the best with your project proposals.

Posted on March 5, 2011 - READER COMMENTS (5)

Implementing a spell checker

Articles Interview
FacebookGoogle BuzzIdenti.caShare


Following my previous post on autocomplete, here comes a spell checker program. First of all I would like to warn you that this is the worst and inefficient spell checker you have ever seen. The most slowest and CPU expensive :) . I wrote this program long ago for fun. I thought of sharing the code I wrote.

The purpose of this post is to show how to write a spell checker in few lines by using the Levenshtein distance algorithm. This is not a context sensitive spell checker. The following spell checker program checks each word, and if it has spell errors it will list out some word suggestions.

Levenshtein algorithm is used to calculate distance between two words Levenshtein distance. It is defined as minimum number of edits needed to transform one string into the other, with the allowable edit operations being insertion, deletion, or substitution of a single character. See wikipedia for more details on the algorithm.

This spell checker program uses a dictionary file. Every Unix like OS comes with a default dictionary file (in the directory /usr/share/dict/). Make use of that. To spell check a word, following program calculate Levenshtein distance by comparing all words in the dictionary. If the Levenshtein distance equal to one, those words will be listed as word suggestions. If the distance is zero, that means the word is a valid dictionary word. Checking across all the words in the dictionary is very expensive. Hence in order to reduce the number of comparisons needed, we use a subset of words in the dictionary having word length equal to, plus one and minus one with respect to the word we are checking for spell error. For implementing that, we have built a hash called wordsmap (python dictionary). The key used for the hash is the word length and the value assigned is a list of words of words having the word length as the key.

#!/usr/bin/env python

import sys

def build_matrix(n,m):

    '''Build a nxm matrix with first row = [0,1,2..m-1] and first col = [0,1,2..n-1]'''

    matrix = [[i for i in range(m)]]
    row = [0 for i in range(m)]
    for i in range(1,n):
        matrix.append(row[:])
        matrix[i][0] = i

    return matrix

def print_matrix(mat):
    '''Print matrix in formatted rows'''
    for col in mat:
        print col

def distance(src,dest):
    '''Calculate Levenshtein distance'''
    n = len(src)
    m = len(dest)

    matrix = build_matrix(n+1,m+1)
    for i in range(1,n+1):
        for j in range(1,m+1):
            cost = int( (ord(src[i-1]) - ord(dest[j-1]))!= 0)
            v1 = matrix[i-1][j] + 1
            v2 = matrix[i][j-1] + 1
            v3 = matrix[i-1][j-1] + cost

            matrix[i][j] = min(v1,v2,v3)
    return matrix[n][m]



def dictparse(filename):
    '''Parse words from dictionary and build a hash for reduce word comparisons'''
    f = open(filename)
    wordsmap = {}
   
    for word in f:
       
        word = word.strip('\r\n')
        length = len(word)
        if not wordsmap.has_key(length):
            wordsmap[length] = []
   
        wordsmap[length].append(word)

    return wordsmap


def spellcheck(word,wordsmap):
    '''Perform spellcheck and provide word suggestions'''
    minword = ''
    errors = []
    length = len(word)
    if length == 1:
        return None
    selected_set = wordsmap[length] + wordsmap[length+1]
    dist = 1000
    if word in wordsmap[length]:
        return None

    for w in selected_set:
        calcdist = distance(word,w)

        if calcdist < 2:
            errors.append(w)

    return errors

if __name__ == '__main__':

    wordsmap = dictparse('/usr/share/dict/words')

    line = raw_input("Input Line: ")
    for w in line.split(' '):
        result = spellcheck(w,wordsmap)

        if result != None:
            print w,"(",",".join(result),")",
        else:
            print w,

Let us do a test run with the program:

$ python spellchecker.py
Input Line: prackers break and stel code
prackers ( crackers ) break and stel ( seel,skel,steg,stem,sten,step,stet,stew,stey,steal,steel,stela,stele,stell ) code

Write your comments here.

Posted on March 3, 2011 - READER COMMENTS (5)

Implementing autocomplete with trie data structure

Articles Interview
FacebookGoogle BuzzIdenti.caShare


We have seen autocomplete functionality in many applications and devices. Have you ever thought of implementing it with an efficient data structure? If not let us learn the data structure trie and implement a basic auto complete using python.

According to wikipedia:
In computer science, a trie, or prefix tree, is an ordered tree data structure that is used to store an associative array where the keys are usually strings. Unlike a binary search tree, no node in the tree stores the key associated with that node; instead, its position in the tree shows what key it is associated with. All the descendants of a node have a common prefix of the string associated with that node, and the root is associated with the empty string. Values are normally not associated with every node, only with leaves and some inner nodes that correspond to keys of interest.

Let us go through requirements for an autocompletion functionality:
* Search should be fast
* Memory should be efficiently used
* Easy to build the data structure

We use a dictionary as data store for storing the words. When we type few letters of a word, we can do a regular expression match each time through the dictionary text file and print the words that starting the letters. Using the command grep we can print the words starting with letters as follows.

$ grep "^an" dictionary.txt

But it is a quick and dirty hack. But it is very inefficient if we need to use it in a large scale because, it needs to read the entire words in the dictionary and compare character by character each time. It is costly operation.

Trie data structure is a tree with each node consisting of one letter as data and pointer to the next node. It uses Finite Deterministic automation. That means, it uses state to state transition.

This is a trie for keys “A”, “to”, “tea”, “ted”, “ten”, “i”, “in”, and “inn”. When we need to do auto complete for the starting characters, “te”, we need to get output tea, ted and ten. Instead of checking regular expression match for all the words in the database, it will make use of transitions. First character is t. Then in the root element, it will make transition for ‘t’ so that it will reach the node with data ‘t’, then at node ‘t’, it will make transition for next node ‘e’. At that point, we need to follow all paths from node ‘e’ to leaf nodes so that we can get the paths t->e->a, t->e->d and t->e->n. This is the basic algorithm behind implementing an efficient auto complete.

Implementation of auto complete was a question asked for programming round of Taggle.com recruitment process held at our campus. I implemented it in python with Trie and I got placed :) .

Here is the python program I wrote:

#!/usr/bin/env python
import sys

class Node:
    def __init__(self):
        self.next = {}  #Initialize an empty hash (python dictionary)
        self.word_marker = False
        # There can be words, Hot and Hottest. When search is performed, usually state transition upto leaf node is peformed and characters are printed.
        # Then in this case, only Hottest will be printed. Hot is intermediate state. Inorder to mark t as a state where word is to be print, a word_marker is used


    def add_item(self, string):
        ''' Method to add a string the Trie data structure'''
       
        if len(string) == 0:
            self.word_marker = True
            return
       
        key = string[0] #Extract first character
        string = string[1:] #Create a string by removing first character

        # If the key character exists in the hash, call next pointing node's add_item() with remaining string as argument
        if self.next.has_key(key):
            self.next[key].add_item(string)
        # Else create an empty node. Insert the key character to hash and point it to newly created node. Call add_item() in new node with remaining string.
        else:
            node = Node()
            self.next[key] = node
            node.add_item(string)


    def dfs(self, sofar=None):
        '''Perform Depth First Search Traversal'''
       
        # When hash of the current node is empty, that means it is a leaf node.
        # Hence print sofar (sofar is a string containing the path as character sequences through which state transition occured)
        if self.next.keys() == []:
            print "Match:",sofar
            return
           
        if self.word_marker == True:
            print "Match:",sofar

        # Recursively call dfs for all the nodes pointed by keys in the hash
        for key in self.next.keys():
            self.next[key].dfs(sofar+key)

    def search(self, string, sofar=""):
        '''Perform auto completion search and print the autocomplete results'''
        # Make state transition based on the input characters.
        # When the input characters becomes exhaused, perform dfs() so that the trie gets traversed upto leaves and print the state characters
        if len(string) > 0:
            key = string[0]
            string = string[1:]
            if self.next.has_key(key):
                sofar = sofar + key
                self.next[key].search(string,sofar)
               
            else:
                print "No match"
        else:
            if self.word_marker == True:
                print "Match:",sofar

            for key in self.next.keys():
                self.next[key].dfs(sofar+key)


def fileparse(filename):
    '''Parse the input dictionary file and build the trie data structure'''
    fd = open(filename)

    root = Node()  
    line = fd.readline().strip('\r\n') # Remove newline characters \r\n

    while line !='':
        root.add_item(line)
        line = fd.readline().strip('\r\n')

    return root



if __name__ == '__main__':

    if len(sys.argv) != 2:
        print "Usage: ", sys.argv[0], "dictionary_file.txt"
        sys.exit(2)
       
    root  = fileparse(sys.argv[1])

    print "Input:",
    input=raw_input()
    root.search(input)

You can run it as:

$ python autocomplete.py dictionary.txt
Input: an
Match: and
Match: angry
Match: angle
Match: animal
Match: answer
Match: ant
Match: any

You can download the source files from here: autocomplete.tar.gz

Note: The image of trie used in this post is taken from wikipedia.

Posted on February 14, 2011 - READER COMMENTS (33)

Win a Linux Shell Scripting Cookbook from Packt Publishing

Life
FacebookGoogle BuzzIdenti.caShare


Here is an opportunity to win a copy of my book, Linux Shell Scripting Cookbook published by Packt Publishing. Three winners will be selected to win a copy of e-book. Read on for more details on how to receive a copy.

About the Book

I hope you had read about my book from book release post. In case you missed the announcement, Linux Shell Scripting Cookbook is a book consisting a collection of 119 shell scripting recipes along with descriptions organized into nine different chapters according to the usage and subject. If you are a beginner or an intermediate user who wants to master the skill of quickly writing scripts to perform various tasks without reading the entire man pages, this book is for you. You can start writing scripts and one-liners by simply looking at the similar recipe and its descriptions without any working knowledge of shell scripting or Linux. Intermediate/advanced users as well as system adminstrators/ developers and programmers can use this book as a reference when they face problems while coding.

This book is written in cookbook style and it offers learning through recipes with examples and illustrations. Each recipe contains step-by-step instructions about everything necessary to execute a particular task. The book is designed so that you can read it from start to end for beginner?s or just open up any chapter and start following the recipes as a reference for advanced users.

If you would like to learn more about the book check out my book page in this website or check the Packt publishing website.

How to Enter for a Chance to Win ?

All you have to do is comment on why you would like to have a copy of the book in the comments. For a comment in the comment box of this page, you will be given with a single token. You can also tweet about the book by clicking the tweet button in top of this post. If you tweet about the book, you will be given three tokens and you have more chances to win a copy of the book. Once you tweet, leave a comment with URL of your tweet status.

Giveaway Details

The duration of this giveaway is for 7 days. The giveaway ends on February 21, 2011 after which the commenting on this post will be disabled. Three winners will be randomly selected from the ones who have commented by considering the number of tokens for each candidate. Ensure that you use a valid email in the comments form so that Packt publishing can contact you if you win. Winners will be announced on Tuesday, February 22, 2011 and also contacted via email about their prize. Good luck!

Write a comment to win an e-book

Results

Contest is over. Results are out

Congratulations to the winners.
Abhishek Patil, Praveen Kumar and Subeen N will receive e-books :)

« Older Entries
Newer Entries »
Articles General Interview NEW Thoughts

Preparing for your first-job interviews

It has been a long time since i wrote a blog entry. Here is some interesting piece for final year computer science students. Getting a job is one of the happiest things in the life of a final year guy. I also had such wonderful moments during my final year. I would like to share [...]

Articles

SED Explained!

Article: Tutorial on SED scripting First published in Linux For You magazine [Opengurus] April 2011 Download PDF and read.

Articles Interview

Implementing a spell checker

Following my previous post on autocomplete, here comes a spell checker program. First of all I would like to warn you that this is the worst and inefficient spell checker you have ever seen. The most slowest and CPU expensive . I wrote this program long ago for fun. I thought of sharing the code [...]

Articles Interview

Implementing autocomplete with trie data structure

We have seen autocomplete functionality in many applications and devices. Have you ever thought of implementing it with an efficient data structure? If not let us learn the data structure trie and implement a basic auto complete using python. According to wikipedia: In computer science, a trie, or prefix tree, is an ordered tree data [...]

Articles

Make – a Power Tool for Developers

Article: Tutorial on GNU Make First published in Linux For You magazine [Developers] September 2010 Download PDF and read.

View The Archives
  • About

    Sarath Lakshman is a Hactivist of Free and Open Source Software from Kerala.
    Read more about him.
  • My Book

    Solve real-world shell scripting problems with over 110 simple but incredibly effective recipes.



  • Follow

  • Random Photos

    Arun Raghavan - Optimising applications
Hackfest @ MEC
    Location: Near Model Engineering College Canteen
  • Tweets

    • Preparing for your first-job interviews:
      http://t.co/SBdRl4At
      2011/11/30 23:31
    • is down. Having some issues with hosting account. I will update when it is back.
      http://t.co/Hj3u1qm1
      2011/11/29 11:59
    • Blog post: Preparing for your first-job interviews:
      http://t.co/SBdRl4At
      2011/11/29 09:47
    • Packt Publishers interviewed me.
      http://t.co/CMrvOPh
      2011/09/08 00:21
  • Calendar

    May 2012
    M T W T F S S
    « Nov    
     123456
    78910111213
    14151617181920
    21222324252627
    28293031  
  • Archives

  • Blogroll

    • FOSS.IN
    • GNU Vision Blog
    • Hiran Effects
    • J5′s blog
    • Pardus planet
    • Praveen Arimbrathodiyil’s blog
    • Santhosh Thottingal
    • SLYNUX GNU Operating System
    • St Josephs HSS, Thalassery – Alumni
    • Swaroop CH
    • TT’s Jottings-Blog of VU2SWX
  • Tags

    algorithm automation bangalore bash bash scripting bug code college contribution define development facebook fedora foss fossmeet freedom free sms freesoftware Friends fun gnome gnu google google summer of code hack hacking hacks internet interview joy kde 4.1.2 kochi Life linux mec microsoft new year night nitc pardus pitivi python script summer of code video editor
Copyright © 2005 - 2010 Sarath Lakshman
Powered by Wordpress 3.04