Cloud Technologies, Workshops and Conferences

Cloud Computing – Part 2 #cloud #virtualization

I attended Bangalore Barcamp 2012 yesterday and got an opportunity to meet OpenStack developers. One of them gave a wonderful analogy for cloud computing, and I want to share it with you all.

Suppose you have a desktop with a particular hardware configuration. You install an operating system, and you can boot only one operating system at a time. Moreover, that operating system consumes only part of your resources, and the rest is left unused.

Now consider the concept of a virtual machine. On the same hardware we put a layer (the hypervisor) on top, such that it hides the details of the underlying hardware. With the help of this layer you can install multiple operating systems, irrespective of the type of OS, and you can run them all simultaneously too. Not just that, you can also specify how much RAM and hard disk space you want to allocate to a particular OS. Is it not cool..?? 🙂 Yes, it is a super cool feature, implemented in products such as VMware Player and VirtualBox. There is a separate post on installing Ubuntu using VMware Player.

Hypervisors provide the means to logically divide a single physical server or blade, allowing multiple operating systems to run securely on the same CPU and increasing CPU utilization. Some of the hypervisors in the market are KVM (backed by Red Hat), Xen (used by Amazon), ESXi from VMware and Hyper-V from Microsoft.

Where hardware partitioning allows for hardware consolidation, hypervisors allow for flexibility in how the virtual resources are defined and managed, which makes them the more commonly used consolidation solution.

But now the issue is that even in this design we can expect some unused resources in each of the OSes, and this is where the concept of CLOUD comes in, to utilize the remaining resources too. Ah.. Developers grow greedy, right.. 😛

In a cloud you set up a network of machines with varying configurations: one machine might have 2 TB of storage, another might have 32 GB of RAM, and a few might have very modest configurations. The provider combines all these resources, and when we ask for a cloud service it gives us a share of them, sized according to the customer’s need. This process is called spawning instances, and all of them are accessed via the internet.

An instance is nothing but a set of resources. Cloud providers spawn instances in fixed sizes, just like T-shirt sizes (Small, Medium, Large, XL, XXL, etc.), and name them micro instances, small instances, large instances, and so on. There is also the concept of an image, which is nothing but a bootable image of an operating system. Whenever a customer chooses a particular type of instance (like Ubuntu, Solaris, Fedora, Windows, etc.), the appropriate image is loaded.
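To make the analogy concrete, here is a minimal Python sketch of pooling machines and spawning a "T-shirt sized" instance from the pool. The flavor names, their sizes, the machine list and the `Pool` class are all made up for illustration; real providers define their own sizes and do far more bookkeeping.

```python
from dataclasses import dataclass

# Hypothetical flavor sizes as (RAM in GB, disk in GB); not any provider's real numbers.
FLAVORS = {
    "micro": (1, 10),
    "small": (2, 20),
    "large": (8, 80),
}


@dataclass
class Pool:
    """Combined free resources of all machines in the cloud."""
    ram_gb: int
    disk_gb: int

    def spawn(self, flavor: str, image: str) -> dict:
        """Carve an instance of the given flavor out of the pool."""
        ram, disk = FLAVORS[flavor]
        if ram > self.ram_gb or disk > self.disk_gb:
            raise RuntimeError("not enough free resources in the pool")
        self.ram_gb -= ram
        self.disk_gb -= disk
        # The image is just the bootable OS loaded onto the instance.
        return {"flavor": flavor, "image": image, "ram_gb": ram, "disk_gb": disk}


# Assumed machines as (RAM in GB, disk in GB): one disk-heavy, one RAM-heavy, one small.
machines = [(4, 2000), (32, 500), (2, 100)]
pool = Pool(
    ram_gb=sum(ram for ram, _ in machines),
    disk_gb=sum(disk for _, disk in machines),
)

instance = pool.spawn("small", image="ubuntu")
print(instance)
print(pool)
```

The point of the sketch is only that the customer sees a flavor and an image, never the individual machines behind the pool.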

Workshops and Conferences

Cloud Computing – Part 1

Cloud computing. A catchy word, isn’t it..?? It simply refers to accessing applications via the Internet instead of having them installed on your computer. So the funda is all about remote servers.

Without even being aware of it, most of us have already been using some cloud-enabled programs. Web-based email is the best-known example. If you have, for example, a Hotmail or Gmail account, you’re using a cloud-based program. These programs store nothing on your computer. Instead, you log on to their servers, enter your credentials, and read and send emails.

One good thing about these programs is that they (with a few exceptions) don’t reside on your computer. That means they aren’t eating up disk space. Also, you don’t have to worry about downloading updates; the program is always current. Most importantly, though, true cloud apps don’t force you to store your data locally on your computer. Instead, you save your work on their servers. This allows you to access your work from anywhere, anytime, as long as you have an internet connection.

Cloud Service Model

Platform as a Service (PaaS) is the future of the cloud! In 2011, we got to witness many acquisitions and announcements, including the acquisition of Heroku and the announcement of CloudFoundry by VMware. In fact, the PaaS space is broadly divided among the .NET, Java and LAMP platforms. Though there is no serious competition to Microsoft Windows Azure in the form of a .NET-based PaaS, there is huge competition among the Java PaaS players, including Google App Engine, VMware CloudFoundry, Red Hat OpenShift and Heroku. Amazon is also vying for this space through its Elastic Beanstalk offering, and Oracle has announced its own Java PaaS. So Java developers have a wide range of PaaS offerings to choose from. Interestingly, the same set of players is adding support for PHP, Python, Ruby and Node.js. For example, Heroku has added support for Ruby, Node.js, Clojure, Python and Scala. The same is the case with CloudFoundry, which claims that it can run PHP (through AppFog), Ruby, Node.js and Scala! Microsoft also wants developers to believe that they can run their Java and PHP applications on Windows Azure.

Infrastructure as a Service (IaaS) is the side of cloud computing that is most useful for big companies. They won’t need to install a copy of Windows on each of their nodes or configure servers themselves. In fact, they won’t need to own servers at all. They order servers and pre-configured networks online and access them online. This implies huge savings for the company on network set-up costs, and a wage loss for present-day network staff. Examples: Amazon Web Services, Cloudo, Free Zoho, eyeOS.

Software as a Service (SaaS) offerings are real pieces of software that you can access directly through the internet; no, you don’t need to install anything on your computer. In a few years, or rather months, you will not be ‘installing’ Microsoft Office, antivirus software, media players or anything else on your computer. You will simply open your browser, go to the cloud service vendor and run the application directly! We have amazing examples of this service, like Google Chrome apps, Android and iPhone apps, etc.

Ah.. Too much theory..Grr.. One last Gyaan

Types of cloud computing

1. Public cloud : A public cloud can be accessed by anyone. It is what most people mean when they say cloud computing. Examples: Amazon Web Services, Google App Engine and Microsoft Azure.

2. Private cloud : A private cloud is meant exclusively for a particular organisation and cannot be accessed by anyone else. It is thus a data centre that provides hosted services to a limited set of users. Private clouds are more secure but more expensive than public clouds, since you have to purchase the storage capacity and services required.

3. Hybrid cloud : A hybrid cloud links public and private clouds; for example, the database is on the private cloud and the applications are managed on the public one. This is a way to stay secure and still get the maximum resources available. It is considered a fault-tolerant architecture, since a failure in the private cloud services can be compensated for by the public cloud services.

4. Community cloud : Organizations from a specific community share information on the same cloud, which is managed by themselves or a third party and hosted by a service provider.

This was more of a theoretical explanation. For the geeks out there, I have a more technical post rolling in next.. 😉


What is the point with Hadoop…???

Whenever I have a chitchat or a formal talk with a BI or analytics person, the most widely asked question is

‘What is the point of Hadoop?’


It is a more fundamental question than ‘what analytic workloads is Hadoop used for’ and really gets to the heart of uncovering why businesses are deploying or considering deploying Apache Hadoop. There are three core roles:

  • Big data storage: Hadoop as a system for storing large, unstructured data sets
  • Big data integration: Hadoop as a data ingestion/ETL layer
  • Big data analytics: Hadoop as a platform for new exploratory analytic applications

Much of the attention for Apache Hadoop use cases focuses on the innovative analytic applications it has enabled and on high-profile adoption at Web properties. Initial adoption of Hadoop at traditional enterprises and later adopters, however, is more likely to be triggered by the first two roles. Indeed, there are some good examples of these three roles representing an adoption continuum.

We also see the multiple roles playing out at a vendor level, with regards to strategies for Hadoop-related products. Oracle’s Big Data Appliance, for example, is focused very specifically on Apache Hadoop as a pre-processing layer for data to be analyzed in Oracle Database.

While Oracle focuses on Hadoop’s ETL role, it is no surprise that the other major incumbent vendors showing interest in Hadoop can be grouped into three main areas:

  • Storage vendors
  • Existing database/integration vendors
  • Business intelligence/analytic vendors

This is just a small example I took to showcase how the major DATA players are slowly adopting this new technology and harnessing its capabilities to retain their position in the list of major players.

Natural Language Processing(NLP)

Treebank Tag-set

Here are the most important tags used in POS tagging

POS Tag | Description | Example
CC | coordinating conjunction | and
CD | cardinal number | 1, third
DT | determiner | the
EX | existential there | there is
FW | foreign word | d’oeuvre
IN | preposition/subordinating conjunction | in, of, like
JJ | adjective | green
JJR | adjective, comparative | greener
JJS | adjective, superlative | greenest
LS | list marker | 1)
MD | modal | could, will
NN | noun, singular or mass | table
NNS | noun, plural | tables
NNP | proper noun, singular | John
NNPS | proper noun, plural | Vikings
PDT | predeterminer | both the boys
POS | possessive ending | friend’s
PRP | personal pronoun | I, he, it
PRP$ | possessive pronoun | my, his
RB | adverb | however, usually, naturally, here, good
RBR | adverb, comparative | better
RBS | adverb, superlative | best
RP | particle | give up
TO | to | to go, to him
UH | interjection | uhhuhhuhh
VB | verb, base form | take
VBD | verb, past tense | took
VBG | verb, gerund/present participle | taking
VBN | verb, past participle | taken
VBP | verb, non-3rd person singular present | take
VBZ | verb, 3rd person singular present | takes
WDT | wh-determiner | which
WP | wh-pronoun | who, what
WP$ | possessive wh-pronoun | whose
WRB | wh-adverb | where, when
Natural Language Processing(NLP)

What is Part of Speech Tagging or POS tagging?

POS tagging is the process of marking up a word in a text as corresponding to a particular part of speech, based on both its definition and its context. Before we dive deeper into POS tagging, it’s important to know about the parts of speech. There are mainly 8 parts of speech that sort words into different categories. Here is a short summary.

Part of speech | Function or “job” | Example words | Example sentences
Verb | action or state | (to) be, have, do, like, work, sing, can, must | … is a web site. I like …
Noun | thing or person | pen, dog, work, music, town, London, teacher, John | This is my dog. He lives in my house. We live in London.
Adjective | describes a noun | a/an, the, 69, some, good, big, red, well, interesting | My dog is big. I like big dogs.
Adverb | describes a verb, adjective or adverb | quickly, silently, well, badly, very, really | My dog eats quickly. When he is very hungry, he eats really quickly.
Pronoun | replaces a noun | I, you, he, she, some | Tara is Indian. She is beautiful.
Preposition | links a noun to another word | to, at, after, on, but | We went to school on Monday.
Conjunction | joins clauses or sentences or words | and, but, when | I like dogs and I like cats. I like cats and dogs. I like dogs but I don’t like cats.
Interjection | short exclamation, sometimes inserted into a sentence | oh!, ouch!, hi!, well | Ouch! That hurts! Hi! How are you? Well, I don’t know.

A Part-Of-Speech Tagger (POS Tagger) is a piece of software that reads text in some language and assigns a part of speech to each word (and other tokens, such as punctuation). Let’s take an example.

Let the input to the POS tagger be:

The strongest rain ever recorded in India shut down the financial hub of Mumbai, snapped communication lines, closed airports and forced thousands of people to sleep in their offices or walk home during the night, officials said today.

Then the output of the POS tagger should look like,

The/DT strongest/JJS rain/NN ever/RB recorded/VBN in/IN India/NNP
shut/VBD down/RP the/DT financial/JJ hub/NN of/IN Mumbai/NNP ,/,
snapped/VBD communication/NN lines/NNS ,/, closed/VBD airports/NNS
and/CC forced/VBD thousands/NNS of/IN people/NNS to/TO sleep/VB in/IN
their/PRP$ offices/NNS or/CC walk/VB home/NN during/IN the/DT night/NN
,/, officials/NNS said/VBD today/NN ./.

Here the NN tag refers to a singular or mass noun, JJ refers to an adjective, and so on. To know more about the tags, see the Treebank tag-set post.
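Once you have output in this word/TAG form, it is easy to work with programmatically. The following is a small Python sketch (not a tagger itself) that splits the first part of the tagged output above into (word, tag) pairs, counts the tags and picks out the nouns. The function name `parse_tagged` is my own choice for illustration.

```python
from collections import Counter

# First part of the tagged output shown above.
tagged = (
    "The/DT strongest/JJS rain/NN ever/RB recorded/VBN in/IN India/NNP "
    "shut/VBD down/RP the/DT financial/JJ hub/NN of/IN Mumbai/NNP ,/,"
)


def parse_tagged(text):
    """Split 'word/TAG' tokens into (word, tag) pairs."""
    pairs = []
    for token in text.split():
        # rpartition splits on the LAST slash, so a token like ',/,' still works.
        word, _, tag = token.rpartition("/")
        pairs.append((word, tag))
    return pairs


pairs = parse_tagged(tagged)
tag_counts = Counter(tag for _, tag in pairs)

# Every noun tag (NN, NNS, NNP, NNPS) starts with 'NN'.
nouns = [word for word, tag in pairs if tag.startswith("NN")]

print(pairs[:3])
print(tag_counts.most_common(3))
print(nouns)
```

Grouping by tag prefix works here because the Treebank tags for related categories share a prefix (NN for nouns, VB for verbs, JJ for adjectives).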


Personal Encounters

10 Lessons from Einstein

Einstein, the father of modern physics, is one of the favorite personalities of most high school folks, not just because of his mass–energy equivalence formula E = mc² but also because of his childhood stories, which they have to read as part of the curriculum. I personally became a crazy fan of Einstein after reading the book “At The Speed of Light” by Prof. Chandrashekar.

Knowingly or unknowingly, all of us would have read the stories related to relativity (e.g. the freely falling lift, the space journey of the twin sisters, etc.). Einstein has truly made a huge impact on many young, talented folks. I just want to share a few lessons from Einstein with all my fellow fans of the ‘Man of Relativity’ on the occasion of his 133rd birthday on March 14.

1. Follow Your Curiosity “I have no special talent. I am only passionately curious.”

2. Perseverance is Priceless “It’s not that I’m so smart; it’s just that I stay with problems longer.”

3. Focus on the Present “Any man who can drive safely while kissing a pretty girl is simply not giving the kiss the attention it deserves.”

4. The Imagination is Powerful “Imagination is everything. It is the preview of life’s coming attractions. Imagination is more important than knowledge.”

5. Make Mistakes “A person who never made a mistake never tried anything new.”

6. Live in the Moment “I never think of the future – it comes soon enough.”

7. Create Value “Strive not to be a success, but rather to be of value.”

8. Don’t be repetitive “Insanity: doing the same thing over and over again and expecting different results.”

9. Knowledge Comes From Experience “Information is not knowledge. The only source of knowledge is experience.”

10. Learn the Rules and Then Play Better “You have to learn the rules of the game. And then you have to play better than anyone else.”

“The most beautiful thing we can experience is the mysterious. It is the source of all true art and all science. He to whom this emotion is a stranger, who can no longer pause to wonder and stand rapt in awe, is as good as dead: his eyes are closed.”

Source: Dumb little man



My 7th-semester lab exams were scheduled for 22 Nov 2011. To work on the programs at my hostel, I struggled to configure Perl to execute my “Web Programming Lab” programs. It took a hell of a lot of time, but I finally got it working. So I thought of sharing the configuration steps, which might be useful for those who are stuck just like I was a few days ago.
Perl is a scripting language developed by Larry Wall in 1987. It has been popular since its release for its simplicity in text processing.
Here I chose WAMP, a package of independently created programs that includes Apache (web server), MySQL (open source database) and PHP as its principal components.
OK, let’s start with the step-by-step instructions.
STEP 1: Download and install WAMP (a 2.x version).
STEP 2: Similarly, download and install ActivePerl 5.10.0 build 1005 from the ActiveState website.
STEP 3: Right-click the WAMP server icon in the Windows taskbar and select the “Put Offline” option, or else select “Stop All Services”. Once all WAMP services are stopped, right-click the WAMP server icon again, select Apache, and open the httpd.conf file.
STEP 4: Now we need to make some changes in this httpd.conf file. Let’s do them one by one:
a) Scroll down and look for the line “Options Indexes FollowSymLinks” and replace it with “Options Indexes FollowSymLinks Includes ExecCGI”



b) Scroll down and look for the line “#AddHandler cgi-script .cgi” and replace it with these two lines:
AddHandler cgi-script .cgi
AddHandler cgi-script .pl



c) Now look for the line “DirectoryIndex index.php index.php3 index.html index.htm” and add index.cgi to this line.



STEP 5: The server is now configured and ready to run Perl and CGI scripts. Next, add an additional repository and install the MySQL driver from it:
1. Open a command prompt and type
“ppm repo add uwinnipeg”

figure: screen of ppm installation

2. After the “uwinnipeg” repository is added successfully, install DBD-mysql by typing
“ppm install DBD-mysql”
Hmm, now we’re done with the configuration. Try writing some simple Perl scripts and save them in C:\wamp\bin\apache\Apache2.2.11\cgi-bin\
To run a script, open the browser and type the URL http://localhost/cgi-bin/ followed by your program name.

NOTE: Please make sure that no other process is already listening on port 80, otherwise Apache will not be able to start on it.
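If you are not sure whether port 80 is free, a quick check is to try connecting to it before starting WAMP. Here is a small Python sketch for that (Python is used only for the check, it is not part of the WAMP/Perl setup); the helper name `port_in_use` is my own, and it assumes the service, if any, is listening on 127.0.0.1.

```python
import socket


def port_in_use(port, host="127.0.0.1"):
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(1.0)
        # connect_ex returns 0 on success instead of raising an exception.
        return s.connect_ex((host, port)) == 0


if __name__ == "__main__":
    if port_in_use(80):
        print("Port 80 is busy: stop the other program before starting Apache.")
    else:
        print("Port 80 looks free.")
```

Run it while all WAMP services are stopped; if it still reports the port as busy, some other program is holding port 80.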

Workshops and Conferences

BDotNet-“Bangalore .NET” User Group

BDotNet, the Bangalore .NET user group, was born 8 years ago when .NET users in Bangalore rightly identified the need to form a community to share and exchange their knowledge of rapidly growing technologies. Mr. Kashinath and Mr. Vic Parmar, the UG Leads of BDotNet, have always been the motivation behind the successful UG meets and Community TechEds.

BDotNet has been constantly supporting young and talented minds in order to motivate them and point them in the right direction. I have been very fortunate to be part of this community. Right from the day I joined, I have been able to keep myself updated on the newest technologies.
I’m a 7th-semester student at Bangalore Institute of Technology. When I approached the BDotNet members (Vic, Kashinath, Lohith and Amar) to give sessions for students, they wholeheartedly accepted my request and agreed to come to my college to give sessions on the Microsoft Developer Platform.

figure: Metro Style Flyer Created by Amar Nityananda

Yesterday, November 5, we had sessions on Windows 8 (by Vic Parmar), HTML 5 and CSS 3 (by Lohith) and Windows Phone 7 (by Amar N). Some 250 students registered for the event, and we got a huge response from them about the sessions and about BDotNet too.
It was all possible because of the BDotNet community and its members, who always find time in their busy schedules to contribute to the community by sharing their knowledge.
I’m looking forward to seeing BDotNet become much more popular, so that all the techies and geeks out there get to know about the community and the kind of contribution it makes to society, and benefit from the regular sessions conducted by BDotNet.

Bigdata, Workshops and Conferences

25th CSI Student Convention

CSI, the Computer Society of India, conducted its 25th student convention at R.V. College of Engineering on 13th and 14th October 2011. I got an opportunity to be part of the convention and present our paper, entitled “Map/Reduce Algorithm Performance Analysis in Computing Frequency of Tweets”, along with my co-author Nagashree.

The convention was a great time for all the students, who came from all across the state to learn about the latest trends in the field of Information Technology. It was also a wonderful platform for innovative young minds to share their ideas and innovations. Students from different places in Karnataka took part in the convention and presented their papers.

Hadoop and map/reduce being my area of interest, we decided to present a paper on “Map/Reduce Algorithm Performance Analysis” so that more students get to know about this emerging technology. We were given just 10 minutes to present our paper, impress the judges and communicate our ideas to our peers who were present at the convention. It was a wonderful experience to present a paper in front of the eminent professionals who were judging the event, though we were quite nervous as it was our first ever paper presentation. The day became much more memorable when we got to know that we had won 3rd place for our presentation.

Here is the abstract of our presentation:

  Abstract of Paper presentation

Title: Map/Reduce Algorithm Performance Analysis in Computing Frequency of Tweets


This paper proposes a method to extract tweets from Twitter and analyses the efficiency of the Map/Reduce algorithm on the Hadoop framework, with the aim of achieving maximum performance.

New research in cloud computing has shown that implementing MapReduce influences not only performance but also the reliability of storage management.

For about a decade, distributed computing was considered more complex to handle than expanding the memory of a single-node cluster, since inter-process communication (IPC) had to be used to communicate between the nodes. This was tedious to implement, as the communication code could run longer than the computation itself. But now Apache Hadoop offers a more scalable and reliable platform for implementing distributed computing. Through this paper we have analysed that the Map/Reduce algorithm run on Hadoop influences performance significantly while handling a huge data set stored on different nodes of a multi-node cluster.

Aim of the study

Cloud computing is the future, and it will focus more on distributed computing. In order to evaluate the features offered by Hadoop for cloud computing, a huge unstructured data set is required. The present study investigated those questions.

The main focus of the study was to analyse the performance of the Map/Reduce algorithm in computing the frequency of tweets.


A Python script of about 6 to 10 lines was used to extract people’s tweets, taking its input from the Twitter search API. Tweets were extracted continuously for about 1 week, resulting in a data set that piled up to 50 MB.

The study was carried out in two parts. The first part was extracting tweets as mentioned above, and the second was implementing a customized Map/Reduce algorithm to compute the frequency of tweets on a particular keyword (say “Anna Hazare”).
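To give a feel for the second part, here is a minimal Python sketch of the map, shuffle and reduce steps for counting tweets that mention a keyword. It is not the code from our paper and it does not run on Hadoop; it runs in a single process, the sample tweets are made up, and the function names `mapper`, `shuffle` and `reducer` are chosen just for illustration.

```python
from collections import defaultdict


def mapper(tweet, keywords):
    """Map step: emit (keyword, 1) for every keyword found in the tweet."""
    text = tweet.lower()
    for keyword in keywords:
        if keyword.lower() in text:
            yield keyword, 1


def shuffle(pairs):
    """Shuffle step: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reducer(key, values):
    """Reduce step: sum the counts for one key."""
    return key, sum(values)


# Made-up sample tweets, standing in for the extracted data set.
tweets = [
    "Anna Hazare begins fast in Delhi",
    "Traffic is terrible today",
    "Support for anna hazare grows",
]
keywords = ["Anna Hazare"]

mapped = [pair for tweet in tweets for pair in mapper(tweet, keywords)]
counts = dict(reducer(key, values) for key, values in shuffle(mapped).items())
print(counts)
```

On a real cluster the framework takes care of the shuffle step and runs many mappers and reducers in parallel on different nodes; only the map and reduce logic has to be written by hand.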


It was found that this approach offers a more reliable method of analysing huge data sets than the classic methods.

Here are the slides of our presentation:

Finally, after the presentation, I got to know that Hadoop is the platform used for India’s UID (Aadhaar card) project, and I felt proud to have some knowledge of it.