
Posts

Showing posts from 2015

Do's and don'ts for Team Leaders - 1

7 do's for Team Leaders Say you've just been promoted to a team lead position and have no clue, or have received no training from your previous leader in this regard. Well, here's a little something for you. This is by no means a thorough guide, but rather a kick-starter. The 7 points below are an extract of what I've experienced as a team leader of smart and versatile software engineers. 1. Be friendly and supportive: The simplest and most general rule, one that fits everywhere, yet it is hard to achieve. Friendliness requires openness and transparency. Try to make work fun without compromising on your company's mandate. Dine with your team and talk about things other than work. Share jokes and inspirational quotes/videos. Make your presence comfortable. A good leader knows his players' interests, current goals, future plans, daily activities, pet names, favorite pizza toppings... you get the point. Being supportive means you should always be available…

Playing in Amazon's Clouds - Introduction to Elastic Computing Cloud - Part 2

Connecting to the Cloud Previously, we looked at how to configure an EC2 instance on AWS. If you're not sure what that sentence was about, click here. In this post, we'll look at some ways to connect to your EC2 instance and try out an example. I'm assuming you already know how to get to the EC2 console page from the AWS home page. From there, go to the Running Instances link to check your instances. You should see something like this: Right now, we have only one instance of the t2.micro configuration, running on the public IP address shown under Public IP. We will first create an alarm to make sure we do not hit our cap while experimenting. Click the Alarm icon under Alarm Status. You should see a pop-up to configure an alarm. We are interested in making sure that the CPU usage stays under a certain limit. Let's create an alarm: we want to generate an email alert whenever our instance is consuming over 90% of its processing power for 1 hour or more. We…
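If you would rather script this step than click through the console, here is a minimal sketch using the paws package (an R SDK for AWS). The instance ID and SNS topic ARN below are hypothetical placeholders, and the parameter set is my reading of the CloudWatch PutMetricAlarm API rather than anything taken from the console walkthrough, so adjust it to your own setup.

library(paws)                                   # install.packages("paws") if needed
cw <- cloudwatch()                              # assumes AWS credentials are already configured
cw$put_metric_alarm(
  AlarmName          = "ec2-high-cpu",          # any name you like
  MetricName         = "CPUUtilization",
  Namespace          = "AWS/EC2",
  Statistic          = "Average",
  Period             = 3600,                    # evaluate over a one-hour window
  EvaluationPeriods  = 1,
  Threshold          = 90,                      # alarm when average CPU is 90% or more
  ComparisonOperator = "GreaterThanOrEqualToThreshold",
  Dimensions         = list(list(Name = "InstanceId", Value = "i-0123456789abcdef0")),  # hypothetical instance ID
  AlarmActions       = list("arn:aws:sns:us-east-1:111122223333:email-alerts")          # hypothetical SNS topic for the email
)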

Playing in Amazon's Clouds - Introduction to Elastic Computing Cloud - Part 1

A really brief intro... A researcher trying to run an extremely computation-hungry experiment? An app developer unsure of how much data you'll be collecting from users? A student tasked to build your FYP (final year project) on a distributed computing environment? Or just an ordinary techie trying to catch up with the world? If you're any of these, you cannot escape the fact that cloud computing is storming in and you have to engage with it actively. Adopt it, or perish. I'm a newbie (or rather a wannabe) in this massive web of computing, here just to share some experiences I'm having - successes and failures. First of all, cloud computing is nothing new; it has been around for decades under names like Grid computing and Distributed computing. It was business people who came up with a catchy name to attract business. The idea behind distributed computing is simple: we create a network of computers that…

Titanic: A case study for predictive analysis on R (Part 4)

Working with the Titanic data set picked from Kaggle.com's competition, we predicted passenger survival with 79.426% accuracy in our previous attempt. This time, we will try to learn the missing values instead of simply plugging in the mean or median. Let's start with Age. Looking at the available data, we can hypothesize that Age correlates with attributes like Title, Sex, Fare and HasCabin. Also note that we previously created the variable AgePredicted; we will use it here to identify which records were filled previously.
> age_train <- dataset[dataset$AgePredicted == 0, c("Age","Title","Sex","Fare","HasCabin")]
> age_test <- dataset[dataset$AgePredicted == 1, c("Title","Sex","Fare","HasCabin")]
> formula <- Age ~ Title + Sex + Fare + HasCabin
> rp_fit <- rpart(formula, data=age_train, method="class")
> PredAge <- predict(rp_fit, newdata=age_test)
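To round out the idea, here is a minimal sketch of writing the learned ages back into the data, assuming the age_train, age_test and dataset objects above. Since Age is a continuous variable, the sketch grows a regression tree (method="anova") instead of the classification tree shown above, and uses a new name rp_fit_reg so it doesn't clobber rp_fit.

> library(rpart)                                                        # tree models
> rp_fit_reg <- rpart(Age ~ Title + Sex + Fare + HasCabin,
                      data=age_train, method="anova")                   # regression tree for a numeric target
> PredAge <- predict(rp_fit_reg, newdata=age_test)                      # numeric age estimates
> dataset$Age[dataset$AgePredicted == 1] <- PredAge                     # write the learned ages back
> summary(dataset$Age)                                                  # sanity check: no NAs, sensible range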

Titanic: A case study for predictive analysis on R (Part 3)

In our previous attempt, we applied some machine learning techniques to our data and predicted the values of the target variable using the AgeGroup, Sex, Pclass and Embarked attributes. Now we will explore other attributes further and see how much information we can extract. This time, instead of keeping the test set apart, we will merge it into the training data set. This will let us see the complete range of values for each attribute, in case some values appear only in the test set:
> dataset$Dataset <- 'train'
> testset$Dataset <- 'test'
> testset$Survived <- 0
> dataset <- rbind(dataset, testset[,c(1,13,2:12)])
This may look like a strange way to merge two data sets, so here's some explanation. The first two lines add a Dataset column that records whether each row comes from the training set or the test set. The third line adds a Survived column to testset (filled with a placeholder 0), so that both data frames have identical columns. The last line merges the two, reordering testset's columns so they line up with dataset's…
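The index-based column shuffle works, but it breaks silently if columns are ever added or reordered. As an alternative, here is a minimal name-based sketch under the same assumptions (dataset and testset end up with the same column names once Survived and Dataset are added):

> testset$Dataset <- 'test'
> testset$Survived <- 0                                   # placeholder, the real labels are unknown
> dataset$Dataset <- 'train'
> dataset <- rbind(dataset, testset[, names(dataset)])    # align columns by name, not position
> table(dataset$Dataset)                                  # confirm both sets are present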

Titanic: A case study for predictive analysis on R (Part 2)

Previously, we successfully classified passengers as "will survive" or "will not survive" with 76.5% accuracy using Gender alone. We will now extend our experiment and include other attributes in order to improve our accuracy. Let's resume. Pre-processing Real data is never in ideal form; it always needs some pre-processing. We will fill missing values, extract some extra information from the available variables and convert continuous-valued fields into discrete-valued ones. First, let's have a look at how the Age variable is distributed. We will use the parameter useNA="always" in the table function:
> table(dataset$Age, useNA="always")
We see 177 missing values (NAs). We will fill them with the mean value of Age as a straightforward solution. The commands below store TRUE/FALSE values in a vector bad by checking whether the value of Age is available for each record, then store the mean value of Age wherever it is missing…
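Here is a minimal sketch of that fill step, assuming dataset is the training data frame and using the vector name bad mentioned above:

> bad <- is.na(dataset$Age)                            # TRUE where Age is missing
> dataset$Age[bad] <- mean(dataset$Age, na.rm=TRUE)    # replace NAs with the mean age
> sum(is.na(dataset$Age))                              # should now be 0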