Saturday, May 6, 2017

Story of a code review

To whom it may concern

Yesterday, when we sat down to do a Junior's Code Review,
that calamity of an evening of sorrow crashed down on us like wrath.

When we opened that monster in the IDE and looked at it,
that is, at that cacophonous symphony of Lines of Code,

all our Experience went up in a puff, like dust,
and the whole intoxication of Leadership drained from our head.

Is it Spaghetti Code, is it biryani, or is it Soup?
What we would call khichdi (a mishmash)... is, according to its author, OOP.

There is not even a visible link between the View and the Model.
Even Mr. Bean would not write Java Beans like these.

If a Design change every now and then was the whole point,
then, my dear, why not name the Project "Chameleon"?

Nowhere did the Task say to stuff the files full.
Why write useless Methods when they were never going to be called?

One Function, an impostor in disguise, swallowed a whole hour.
We took it for a Recurrence. It was a Nested Loop.

Cramming the Maven Nature in everywhere is heresy.
In the canon law of programming, such Architecture is heresy.

Wait a minute! What did I just say... where is the heresy here?
It would indeed be heresy, but where is the Architecture at all?

The one whose Role was Guest was handed full Access.
What was to be Encrypted was Hashed to death instead.

Who knows how the Live Server will breathe under the load.
One Package, two Classes and ten thousand lines!!!

We can tell from the sprawl of the Function
that it was surely copied As is from Stack Overflow.

When, after a hundred labours, the Compiler produced a result,
a heap of red and yellow Warnings lay before our eyes.

Somewhere in the Back-ground a mother's prayer must be running.
Coughing, limping, weeping, the Code finally Ran.

Not a single Exception will manage to escape.
There are barbed-wire checkposts of Try-catch and Finally.

"Once the Exception is Caught, why Throw it?"
Tell me, my friend, should we cry or laugh at this logic?

I say with certainty: had she read the Debug Trace,
Ada of Lovelace would have had a heart attack.

O Lord, I repent of ever casting an eye on such again.
From exercises of this kind, a hundred times over: beware henceforth.

Tuesday, April 12, 2016

Titanic: A case study for predictive analysis on R (Final)

In our previous attempts at the Titanic survival competition, we tried to predict whether a passenger is likely to survive, using some statistics and machine learning models to classify the passengers.

In our final part, we will push our limits using advanced machine learning models, including Random Forests, Neural Networks, Support Vector Machines and other algorithms, and see how long we can torture our data before it confesses.

Let's resume from where we left off. We start with an implementation of the Random Forest method of classification. In short, this model grows many decision trees, each on a random sample of the data, and lets the trees vote on the class of each record; the majority vote becomes the prediction. This way overfitting, the common issue with decision trees, is mitigated.

> library(randomForest)
> formula <- as.factor(Survived) ~ Sex + Pclass + FareGroup + SibSp + Parch + Embarked + HasCabin + AgePredicted + AgeGroup 
> set.seed(seed)
> rf_fit <- randomForest(formula, data=dataset[dataset$Dataset == 'train',], importance=TRUE, ntree=100, mtry=1)
> varImpPlot(rf_fit)
> testset$PredSurvived <- predict(rf_fit, dataset[dataset$Dataset == 'test',])
> submit <- data.frame(PassengerId=testset$PassengerId, Survived=testset$PredSurvived)

> write.csv(submit, file="rforest.csv", row.names=FALSE)

The results were not as promising as expected: this algorithm did not improve on our earlier score. This suggests that overfitting was not what was holding our decision tree model back.
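
To make the voting idea behind a random forest concrete, here is a minimal sketch in base R. The vote matrix is made up for illustration and does not come from the Titanic data; `tree_votes` and `majority_vote` are hypothetical names, and the real randomForest package does considerably more than this.

```r
# Hypothetical votes from five trees (rows) for three passengers (columns);
# 1 = survived, 0 = did not survive. These numbers are invented.
tree_votes <- matrix(c(1, 0, 1,
                       1, 0, 0,
                       0, 0, 1,
                       1, 1, 1,
                       1, 0, 0),
                     nrow = 5, byrow = TRUE)

# Majority vote: predict survival when more than half of the trees vote 1.
majority_vote <- function(votes) {
  as.integer(colMeans(votes) > 0.5)
}

forest_prediction <- majority_vote(tree_votes)
print(forest_prediction)
```

A single tree can be badly wrong about a passenger (the fourth tree here disagrees with the rest on the second passenger), but the combined vote smooths that out.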

This is the point where we rethink our data. So far we have noticed that missing Age is an important factor and that some records are missing Fare and Embarked; we have also extracted Title from the names, and derived a boolean variable, HasCabin, from the seemingly useless Cabin variable.

Now let's have a look at Ticket.

> unique(dataset$Ticket)

Notice anything? We see some strings like PC, CA, SOTON, PARIS, etc. Without actually knowing what these represent, how about clipping off the digits and extracting only this part? Here's how we'll do it (you'll need to install the stringr package if it's missing):

> library(stringr)
> dataset$TicketPart <- NULL
> dataset$TicketPart <- str_replace_all(dataset$Ticket, "[^[:alpha:]]", "")

> dataset$TicketPart <- as.factor(dataset$TicketPart)
> plot(table(dataset$TicketPart[dataset$TicketPart != '']))

The plot reveals that some parts appear frequently. These might hint at where the passenger is coming from.
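
As a quick check of what the pattern keeps and drops, here is a small sketch on a few sample ticket strings. It uses base R's `gsub` in place of `str_replace_all` so it runs without extra packages; the sample values and the names `tickets`, `ticket_part` and `prefix_counts` are only for illustration.

```r
# Sample ticket strings in the style of the Ticket column (illustrative values).
tickets <- c("PC 17599", "A/5 21171", "STON/O2. 3101282", "113803")

# Drop every character that is not a letter; purely numeric tickets become "".
ticket_part <- gsub("[^[:alpha:]]", "", tickets)
print(ticket_part)

# Count how often each non-empty prefix occurs.
prefix_counts <- table(ticket_part[ticket_part != ""])
print(prefix_counts)
```

Note that punctuation and digits inside the prefix are removed too, so a ticket such as "STON/O2. 3101282" ends up as "STONO".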

Next, we can use SibSp and Parch to determine the size of the family on board. The thought behind this is that passengers with more family members on board may have had more support, which could affect their chances of survival.

> dataset$FamilySize <- dataset$SibSp + dataset$Parch + 1
# +1 for the passenger himself

Torturing the data further, we'll explore the Name variable again. We notice that apart from Title, we can also extract Surname, since names are in the format [Surname], [Title] [Given Names]

> dataset$Surname <- sapply(dataset$Name, FUN=function(x) {strsplit(as.character(x), split='[,.]')[[1]][1]})
> dataset$Surname <- as.factor(sub(' ', '', dataset$Surname))

> dataset$Surname <- factor(dataset$Surname)

We are only interested in larger families, so we collapse every family of size 2 or less, and every FamilyID that occurs in 2 or fewer records, into a single 'Small' level:

> dataset$FamilyID <- paste(as.character(dataset$FamilySize), dataset$Surname, sep="")
> dataset$FamilyID[dataset$FamilySize <= 2] <- 'Small'
> famIDs <- data.frame(table(dataset$FamilyID))
> famIDs <- famIDs[famIDs$Freq <= 2,]
> dataset$FamilyID[dataset$FamilyID %in% famIDs$Var1] <- 'Small'
> dataset$FamilyID <- factor(dataset$FamilyID)
> plot(table(dataset$FamilyID[dataset$FamilyID != 'Small']))

As the plot shows, several records share the same FamilyID, that is, the same surname and family size. This suggests that Surname, combined with FamilySize, does a reasonable job of grouping passengers of the same family.
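
The FamilySize, Surname and FamilyID steps above can be traced on a tiny example. This is a sketch on an invented data frame (the names and counts in `toy` are hypothetical, not taken from the Titanic data), using the same thresholds as the code above.

```r
# Invented passengers: three members of one family, plus two others.
toy <- data.frame(
  Name  = c("Khan, Mr. Ali", "Khan, Mrs. Sara", "Khan, Miss. Noor",
            "Smith, Mr. John", "Lee, Mr. Tom"),
  SibSp = c(1, 1, 0, 0, 3),
  Parch = c(1, 1, 2, 0, 0),
  stringsAsFactors = FALSE
)

# Family size: siblings/spouses + parents/children + the passenger.
toy$FamilySize <- toy$SibSp + toy$Parch + 1

# Surname is everything before the first comma or period.
toy$Surname <- sapply(toy$Name,
                      function(x) strsplit(x, split = "[,.]")[[1]][1],
                      USE.NAMES = FALSE)

# Combine size and surname, then collapse small or rarely seen families.
toy$FamilyID <- paste0(toy$FamilySize, toy$Surname)
toy$FamilyID[toy$FamilySize <= 2] <- "Small"
counts <- table(toy$FamilyID)
rare <- names(counts)[counts <= 2]
toy$FamilyID[toy$FamilyID %in% rare] <- "Small"

print(toy[, c("Name", "FamilySize", "FamilyID")])
```

In this toy data the Lee passenger reports a family of four, but only one such record is present, so the second filter folds that row into 'Small' as well.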

There can be many ways to continue torturing this data, but we will now limit ourselves to these variables only.

Now we'll apply SVM and other models (wildly) and see what combination of variables worked for us.

> library(e1071)
# svm() comes from the e1071 package
> formula <- as.factor(Survived) ~ Sex + AgeGroup + Pclass + FareGroup + SibSp + Parch + Embarked + Title + FamilySize + FamilyID + HasCabin + TicketPart
> svm_fit <- svm(formula, data=dataset[dataset$Dataset == 'train',])
# predict on the test rows of dataset, which carry the newly derived columns
> testset$PredSurvived <- predict(svm_fit, dataset[dataset$Dataset == 'test',])
> submit <- data.frame(PassengerId=testset$PassengerId, Survived=testset$PredSurvived)
> write.csv(submit, file="svm.csv", row.names=FALSE)
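
Before uploading each submission file, it can help to estimate accuracy locally by holding back part of the training rows. The following is a minimal sketch in base R, not part of the original script: `accuracy` and `split_holdout` are hypothetical helpers, the labels are made up, and the 891 rows and the 20% holdout fraction are assumptions.

```r
# Fraction of predictions that match the known labels.
accuracy <- function(actual, predicted) {
  stopifnot(length(actual) == length(predicted))
  mean(actual == predicted)
}

# Randomly set aside a fraction of the row indices for validation.
split_holdout <- function(n_rows, holdout_fraction = 0.2) {
  holdout <- sample(n_rows, size = floor(n_rows * holdout_fraction))
  list(train = setdiff(seq_len(n_rows), holdout), holdout = holdout)
}

set.seed(42)               # arbitrary seed, only to make the split repeatable
idx <- split_holdout(891)  # assumed number of training rows

# Made-up labels and predictions for five held-out passengers.
actual    <- c(1, 0, 0, 1, 1)
predicted <- c(1, 0, 1, 1, 0)
holdout_accuracy <- accuracy(actual, predicted)
print(holdout_accuracy)
```

In practice one would fit each model on the rows in `idx$train`, predict on the rows in `idx$holdout`, and compare the resulting accuracies across variable combinations.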

I continued my experiments with different models, not all of which can be described here; for ease, I have created this R script and partitioned the code into functions.

What should be noted is that later in my experiments, I used Age and Fare instead of their discretized variables for accuracy (yes, the execution time increases as a result).

Here are the results from some of the models:

Random Forest:


Neural Networks:


For our experiments so far, CForest proved to be the top performer. But please don't stop here; apply your own ideas of twisting and squashing data to gain more accuracy.

For now, I guess this series should serve as a good starter on predictive analytics. You can find a variety of other problems to participate in and polish your analytics spectacles.

Please feel free to comment and maybe share your score...

Tuesday, November 17, 2015

Do's and don'ts for Team Leaders - 1

7 do's for Team Leaders

Say you've just been promoted to a team lead position and have no clue what to do, or received no training from your previous leader in this regard.
Well, here's a little something for you. This is by no means a thorough guide, but rather a kick-starter.
These 7 points below are an extract of what I've experienced as a Team leader of smart and versatile software engineers.

1. Be friendly and supportive:

Simplest and most general rule that fits everywhere. Hard to achieve though.
Friendliness requires openness and transparency. Try to make work fun without compromising on your company's mandate.

Dine with your team, talk about stuff other than work. Share jokes and inspirational quotes/videos. Make your presence comfortable. A good leader knows his players' interests, current goals, future plans, daily activities, pet names, favorite pizza toppings... you get the point.

Being supportive means you should always be available, set examples by doing and be the easiest person to communicate with. When you need something, ask about their availability and go to their desk; when they need your help, make them a priority and, again, go to their desk.

2. Believe in them:

Believe in your team mates more than they believe in themselves. Often, those who lack experience also find it hard to estimate their own capabilities. One of my former seniors had me do things I could not believe I was able to do at the time, by breaking the job into unit tasks.

Here, by believing, I do not mean just pretending or saying that you believe in them. Have faith, and be courageous to assign them critical tasks if they are capable. Make your team mates realize the importance of such tasks and why you think they have more than half the chance to handle them. This practice builds confidence. A critical task may involve some risk and failing might cost, but the damage of not doing so may and will cost much more.

But be careful not to be unrealistic; a lioness will encourage her cubs to hunt a deer, but won't let them go near a Gaur. I once made the mistake of assigning one of my best players a task without proper training. The result was a deadline unmet, a deliverable missed and an unhappy client.

3. Set vision centered goals:

Have a clear vision in your mind about why you are leading your team. What do you want them to be? For example, you can have a vision of training people to have your skills, plus their own skills, minus your weaknesses. Once you have this, set timely goals according to their skills, areas of improvement and interests.

A bright player on my team is very dedicated and hard working. She needed improvement in stress handling, so I set specific goals that put her under just as much stress as she could handle; while keeping this constant, I occasionally assigned her additional tasks. The assumption was that this experiment would raise her stress threshold. Within a few months she showed significant improvement.

One of the most gifted players on my team is never troubled by work and has always delivered quality stuff at supersonic pace. The downside is that her efforts are limited to making software; the rest of the teams have not benefited from her experience. So, her goal for the year is to identify a common problem related to software engineering processes in the organization... and fix it.

You cannot exercise this well if you are not friendly and supportive. Also, if you are not sincere with them or put yourself before them, you will hesitate to set goals that might ask more from you than from them. If such is the case, you're on the verge of being a bad example.

4. Be honest:

Honesty is the first chapter in the book of wisdom.
[Thomas Jefferson]

You can be manipulative in the name of diplomacy, hide the facts and call it strategy, and still be successful - temporarily. Eventually, your strategy will fall flat on its face when your mates follow in your footsteps (what goes around comes around).

Fabricating project deadlines, making promises about appraisals and promotions you know won't happen, and making up stories to turn down leave requests are the most common forms of dishonesty in the industry.

My manager clearly gives me project goals and deadlines at once and leaves Yes/No up to me. The only occasions he heard "No" from me were when he too admitted that the job was impossible. I follow the same practice and without keeping any buffer, my team almost always gets things ready in due time.

4. Have communication protocols:

Miscommunication is the root of all business mishaps. You may use "efficient" to mean someone who executes tasks quickly, while someone else may use the same term for someone who gets things done in the most optimal way.

Now, I'm not suggesting that you write a dictionary and cram it into your memory. Communication gets stronger with time; all you need to do is be consistent in what you mean when you use a particular phrase. My manager does not often use "unacceptable", but when he does, it's red alert - always. To point out a mistake casually, he uses "messed up".

Build your protocols just like that. A few examples are:
- "hang on": I'm coming to you personally
- "wait a minute": I'll be right back
- "what do you think?": you told me a problem I'm sure you can solve yourself
- "will get back to you": I'll follow up when I have some progress on this matter
- "by the end of the week": by Friday, before 5:00pm (closing time)
- "it is important": it must be done to meet the goals
- "it is urgent": do it now or it can never be done

Do not use vague words. "It should be ready": is it "it is ready" or "it is not ready"? Using ambiguous terms to stall people is unprofessional, unethical and disrespectful (humour is another thing). Leave that to your politicians.

Following this principle will eventually build a strong communication bridge, which will relieve you from most of your worries.

5. Appreciate and criticize:

Successful people have control over their emotions. You are human, so you will get aggressive sometimes - inevitably. But remember Vince Lombardi's rule of thumb: "Praise in public; criticize in private". Appreciation should be loud and casual, and should not be a prologue to a lecture on how to improve more. Also, be responsive, not reactive (a leaf out of The 7 Habits of Highly Effective People by Stephen Covey).

Criticism should be constructive: point out mistakes and explain ways to correct them. If you're angry, cool down. Give yourself a moment. See if the person is already doing his best. If such is the case, it is not fair to criticize; you should either manage resources better or provide more training. Try to ignore minor issues. If you are fault tolerant, your team will not hesitate to report to you if a mishap occurs. I once deleted the database of a live server (repeat) a live server. My manager (now CEO) is a cool captain, so I informed him immediately. The whole of the next day (an off day), he sat with me and we recovered about 99% of the lost data. Had he been an intolerant dictator, I would have tried to cover up and the damage would've been much worse.

It is a good habit to give your team more credit than is due for good performance; if they're under-performing or have messed something up, own their mistake. Take some of the blame off them. I've seen a manager do exactly the opposite to her team. Half of her team left her; the rest spent their time making up cover stories, until she was let go by the company the same year.

6. Welcome mistakes:

You are not what you are without making some mistakes. Making a mistake should not cause embarrassment to anyone in your environment. Rather, encourage your team by sharing your own mistakes, how you corrected them and what lessons, good and bad, you learned. Mistakes will happen, but won't be repeated if people start sharing them with others. I shared the story of my blunder on the live database with the whole department, and now backing up databases before any kind of change is a common practice throughout the team.

7. Performance feedback:

Performance feedback is an effective way to improve yourself. Establish a mechanism to provide and receive feedback to and from your team members. They should see you as a person who welcomes criticism and responds positively to it. In my company, some seniors encourage their teams to appraise them too. This works well if you have a history of not abusing your authority in difficult times. On the other hand, some people are reluctant due to lack of confidence in you; it is not always easy to tell your senior about his flaws. You'll have to find your own way of learning how your players think you can do a better job of leading them.

One of my team members would not give any feedback on my performance. Later, I learnt the hard way that my strict response to her mistakes had damaged her performance rather than improved it. When settling this, I made her realize that the lack of feedback from her had let the issue escalate.

I personally like the feedback process in academia, where pupils are asked by the administration to fill in a questionnaire to evaluate the performance of their instructors each semester.

These are a few lessons I've learnt in my not-so-long tenure as a team leader. I've tried to keep them general, but no guidelines are applicable as is to all circumstances and people. For example, a military officer will definitely agree with #4 and reject #6. Next, we'll have a look at some "don'ts", in sha Allah.

Agree/disagree, please leave comments if you think it needs improvement or something is terribly wrong...