How does a Search Engine work?

You’ve been using search engines for a long time now.

You probably used one to find this article. But do you know how they work? This post will give you a basic understanding of how search engines work to find all the information you wouldn’t be able to find on your own.

Developing a search engine is complicated and demands enormous resources. Long before Google started hosting vast amounts of data in Gmail and Google Drive, the company needed to invest heavily in infrastructure just to sustain its initial search service. Suffice it to say, most people cannot design and maintain their own web search engines, and there is a reason only a few key players dominate the arena. Yet regardless of which search provider you turn to, the way search engines work is largely the same.

1. Crawling

First, developers create Internet bots capable of reading hyperlinks and HTML. These web crawlers browse the Internet looking for web sites in order to index them. They are provided with a list of URLs to start with, and the bot scans all of the hyperlinks on each of these pages and adds them to the list of URLs to visit next. Crawlers can only download a certain number of pages within a given time period, which is why search results sometimes suggest websites that have since been deleted or removed. The websites these web crawlers visit are indexed according to the information the bots can gather by reading each site’s HTML.
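The crawling loop described above can be sketched in a few lines of Python. This is only a rough illustration, not how any real search engine does it: the `crawl` function, the `LinkExtractor` class and the tiny `FAKE_WEB` dictionary (which stands in for real HTTP requests) are all made up for this example.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(seed_urls, fetch, max_pages=100):
    """Breadth-first crawl starting from seed_urls.

    fetch(url) must return the page's HTML, or None if the page is gone.
    max_pages models the limit on how much a crawler can download.
    """
    frontier = deque(seed_urls)   # URLs waiting to be visited
    seen = set(seed_urls)         # URLs already queued, to avoid loops
    pages = {}
    while frontier and len(pages) < max_pages:
        url = frontier.popleft()
        html = fetch(url)
        if html is None:
            continue
        pages[url] = html
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            absolute = urljoin(url, href)  # resolve relative links
            if absolute not in seen:
                seen.add(absolute)
                frontier.append(absolute)
    return pages


# Hypothetical in-memory "web" used instead of real network requests.
FAKE_WEB = {
    "http://example.com/": '<a href="/a">A</a> <a href="/b">B</a>',
    "http://example.com/a": '<a href="/b">B</a>',
    "http://example.com/b": '<a href="/">home</a> <a href="/missing">gone</a>',
}

pages = crawl(["http://example.com/"], FAKE_WEB.get)
print(sorted(pages))
```

The link to /missing is queued like any other, but it is skipped once the fetch comes back empty, which is roughly what happens when a crawler meets a deleted page.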

2. Indexing

Indexing speeds up the process of retrieving relevant webpages. Searching an index is a substantially less intensive and time-consuming task than searching all of the possible websites. Imagine having to send out new web crawlers to discover the answers to every search inquiry. It would be akin to sending explorers out into the wilderness blind rather than using a map.

Indexes must be stored, and considerable time and effort goes into keeping indexes up-to-date, but the trade-off in speed makes the effort worth it.
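To get a feel for why an index is so much faster, here is a minimal sketch of an inverted index in Python: a map from each word to the set of pages that contain it. The page addresses and texts in `DOCS` are invented for illustration, and real indexes store far more than this.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Map every word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in documents.items():
        for word in tokenize(text):
            index[word].add(url)
    return index


# Hypothetical crawled pages (URL -> extracted text).
DOCS = {
    "http://example.com/crawl": "Crawlers follow links and feed the index",
    "http://example.com/index": "An index makes search fast",
    "http://example.com/ads": "Search engines sell ads",
}

index = build_index(DOCS)
print(sorted(index["search"]))
```

Answering "which pages mention this word?" is now a single dictionary lookup instead of a scan over every page.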

3. Searching

Searching is the most visible step of the process. When you type a term into Google, Bing, Yahoo, DuckDuckGo, or any number of other search engines, they search their respective indexes for the best results. These results are then displayed on search engine results pages (SERPs).
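As a rough sketch of the lookup step, the Python below answers a query from a small index and orders the results. The ranking rule here (count how often the query words appear on each page) is an assumption made purely for illustration; real engines use far more elaborate ranking. The pages in `DOCS` are hypothetical.

```python
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Map every word to a counter of {url: occurrences on that page}."""
    index = defaultdict(Counter)
    for url, text in documents.items():
        for word in tokenize(text):
            index[word][url] += 1
    return index


def search(index, query):
    """Return URLs containing every query word, best match first."""
    terms = tokenize(query)
    if not terms:
        return []
    # Only pages that contain all of the query words are candidates.
    candidates = set(index.get(terms[0], {}))
    for term in terms[1:]:
        candidates &= set(index.get(term, {}))
    # Toy score: total occurrences of the query words on the page.
    scores = {url: sum(index[t][url] for t in terms) for url in candidates}
    return sorted(scores, key=lambda url: (-scores[url], url))


# Hypothetical pages (URL -> extracted text).
DOCS = {
    "a.example/crawl": "Web crawlers follow links. Crawlers feed the index.",
    "b.example/index": "An index maps words to pages so search is fast.",
    "c.example/ads": "Search engines sell ads next to search results.",
}

index = build_index(DOCS)
print(search(index, "search"))
print(search(index, "search index"))
```

Notice that no crawling happens at query time: everything is answered from the index that was built beforehand.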

4. Making Money

Since search engines are free, how were some able to grow into such massive companies? Like much of the Internet, search engines make money by selling ad space. Some advertisers pay to have their products or services ranked higher in search results. Some engines, such as Google, run search-related ads alongside their general search results. As search engines became the most popular way of finding information on the internet, they also became the most popular way of advertising online. Potential consumers may not read the same websites, but they do turn to the same search engines.

Final Thoughts

The process is immensely complex, and there is a great demand out there for people who understand the intricacies of how search engines work. This guide just scratches the surface, but there’s a wealth of information out there. Thanks to search engines, finding that information is easier than ever.

The Google Story

Let me start off with a few questions.
Who are the Google Guys?
Why did they start Google?
The answers to these and other compelling questions make up the content of this
eminently readable, highly entertaining account of the birth and phenomenal
growth of one of today’s leading technology companies: GOOGLE.

In 1998, after reluctantly dropping out of the doctoral program at Stanford
University, Sergey Brin and Larry Page founded Google on some very basic
principles that remain at the heart of the company’s success today.

To quote: “Google’s transcendent and seemingly human qualities give it special appeal to an amazingly
wide range of computer users, from experts to novices, who trust the brand that has become an
extension of their brains.”

Google is so innately “human” because the programmers behind its functionality have remained true to
the founders’ vision of a search engine whose focus is entirely on the end user.  They favor “pull”
technology and marketing versus “push” and believe that the quality of their product will compel their
users to “tell a friend”.

Again, to quote: “Google grew in popularity and recognition without spending a dime.”

Of course, Google eventually needed to attract investment capital, and it did, but Brin and Page have
never compromised their integrity and vision. The result is a company that went public in 2004 at a
price of $85 a share and now trades at well over $450 a share.
Boom! That’s sheer success in its shares.

Though they are millionaires several times over, the founders have remained personally involved in
nearly every aspect of the business.
To illustrate the sense of humor that pervades the entire corporate culture: in August 2005, Google sold
14,159,265 additional shares in a secondary stock offering.
Why the unusual number of shares?
They represent the first eight digits after the decimal point of pi (3.14159265), completely appropriate
for a company run by two mathematicians.
Now, that’s passion for mathematics and algorithms.

From an auspicious misspelling (of “googol”) to an entity that for millions of people around the
world has become synonymous with the Internet, Google has transformed the way we search for
information. And the founders’ “Don’t Be Evil” motto continues to ensure that they won’t sell out to the
mass marketers who want to capture the hapless web searcher; their marketing and advertising strategy is
unique in the industry.

Thus it is that what began as a misspelling of a very large number (a “googol”) has been so embraced by
the world that it has become a verb in the dictionary, as we Google our way around the Internet on a
daily basis.

So, Google your way into Google in everything that is Google.

GOOOOOOOOOOOOOOOOOOOOGLE !!!!!!!!!

Does your IP address betray you?

Google is obviously creepy, in a way that is both good and bad.

Whenever you visit http://www.google.com from India, Google redirects you to http://www.google.co.in.

It also offers your search in the local languages spoken around you, even when you are surfing thousands of miles away from the Google server.

Have you ever wondered why or how this happens?

I looked into it, and believe me, I was flabbergasted.

What happens is that whenever you contact a Google server, it traces the request back to your approximate geographical location using your IP address. Sounds crazy? Yes you will be…

In other words, users pay with their privacy to browse its servers (virtually the Internet).

It’s not very difficult to find out a user’s IP address. Here is a real example: you can easily find your system’s public IP address using websites such as whatismyip.com, whatsmyip.org, whatismyipaddress.com, etc. If we can find the IP address of our own systems this easily, Google can do it in a fraction of a second, since every request you send carries your IP address. Google has built advanced systems to do exactly this. Depending on the user’s location, the search is altered to suit the user’s needs. Google search can be very different for each city or town.

Your results will likely differ when you reach the Google servers from two different cities.
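To make the idea concrete, here is a toy sketch in Python of how an IP address can be mapped to a place: look the address up in a table of network ranges. Everything in `GEO_TABLE` is made up (the ranges are reserved documentation addresses and the cities are arbitrary), and this is not a description of how Google actually does it.

```python
import ipaddress

# Hypothetical lookup table: reserved documentation ranges paired
# with arbitrary cities, purely for illustration.
GEO_TABLE = [
    (ipaddress.ip_network("192.0.2.0/24"), "Chennai, IN"),
    (ipaddress.ip_network("198.51.100.0/24"), "London, GB"),
    (ipaddress.ip_network("203.0.113.0/24"), "Sydney, AU"),
]


def locate(ip_text):
    """Return the place whose network range contains the address, or None."""
    address = ipaddress.ip_address(ip_text)
    for network, place in GEO_TABLE:
        if address in network:
            return place
    return None


print(locate("198.51.100.42"))
```

A server that knows the place can then choose a local domain, language or set of results for that visitor.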

So, once you know that you are not private, you need a countermeasure.

And boom, there’s a tool that can outwit Google’s servers.

That’s Tor, short for “The Onion Router”.

Let me run you through some of Tor’s features:

Tor is free software and an open network that helps you defend against a form of network surveillance that threatens personal freedom and privacy, confidential business activities and relationships.

You can find out more about it on Google or on duckduckgo.com (a site that helps you search anonymously).

In Ubuntu:

To install Tor from the terminal: $ sudo apt-get install tor

To check where Tor is located on the system: $ whereis tor

To find out your system’s IP address and other details, you can use websites such as whatismyip.com, whatsmyip.org, whatismyipaddress.com, etc.

Then check whether Tor is running with this terminal command: $ ps -ae | grep tor

To see all network connections, try the netstat command: $ netstat -ant | grep 9050

Port 9050 is where Tor listens for connections.
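If you would rather check from a script than read netstat output, a small Python sketch like the one below can test whether anything is accepting connections on that port. The helper name `is_port_open` is my own, and the check only tells you that something is listening on 127.0.0.1:9050, not that it is actually Tor.

```python
import socket


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out or unreachable: nothing usable is listening.
        return False


if is_port_open("127.0.0.1", 9050):
    print("Something is listening on port 9050.")
else:
    print("Nothing is listening on port 9050; Tor may not be running.")
```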

Browser configuration for Tor anonymity:
In Firefox:
Edit -> Preferences -> Advanced -> Network -> Settings
Manual proxy configuration:
SOCKS host: 127.0.0.1, port: 9050

Now, if we check our IP address, we are shown a bogus/random location.

So now our location is concealed, and our traffic is bounced through several random computers. The only disadvantage is that connecting to and loading websites, and other online activities, take more time.

But it goes a long way toward countering the privacy issues.

Enjoy anonymity!

GOOBUNTU

 

It means GOOGLE + UBUNTU = GOOBUNTU.

Thursday, August 30, 2012:  It’s no revelation that Google uses Linux on its desktops and servers. But it’s not a commonly known thing that the search engine giant uses a modified version of Ubuntu Linux called Goobuntu. People outside Google were hardly aware of what Goobuntu is all about till now, but here is anything and everything that you always wanted to know about Google’s version of Ubuntu.

Thomas Bushnell, a tech lead at Google who is responsible for managing and distributing Linux to Google’s corporate desktops, explained what Goobuntu is at LinuxCon.

If you want to know whether you can download it or not, the answer is both yes and no. Talking about Goobuntu, Bushnell said that it is simply a ‘light skin over standard Ubuntu’. Google bases its version of the operating system on the long-term support (LTS) releases of Ubuntu. This means that if you want a flavour of it, you can download the latest LTS version of Ubuntu, 12.04.1, and it will give you ‘almost’ the same feel.

According to a report, Google prefers the LTS versions because the two years between releases are much more workable than the six-month cycle of ordinary Ubuntu releases. Also, the search engine giant believes in updating and replacing its hardware every two years, so the LTS version works for it perfectly.

Google prefers Ubuntu but has not stopped its employees from working on Mac or Windows. Bushnell says, “All the Googlers (Google employees) have a choice to use the tools of their choice. But using Goobuntu is encouraged at Google and all our development tools are for Ubuntu.”

Bushnell also spoke about choosing Ubuntu over other operating systems like Fedora or openSUSE. He said, “Ubuntu’s release flow is superb and Canonical provides great support to the software.”

It is worth mentioning here that Google is not only a consumer or contributor of Ubuntu, it is also a paying customer for Canonical’s Ubuntu Advantage support program. Commenting about the Unity interface of Ubuntu, Bushnell said, “Googlers are all over the map when it comes to choosing interfaces. Some like to use GNOME, others like KDE, X-Window or X-Terms. Some users stick to Unity as it reminds them of the Mac. We at Google see a lot of Mac lovers moving to Unity.” Google has not made any default Goobuntu interface.

Goobuntu users at Google number in the ‘tens of thousands’, including graphic designers, engineers, management, and sales people. The company uses apt and Puppet as desktop administration tools to manage all these Goobuntu desktops. This gives Google’s desktop management team the power to control and manage their PCs. It is also important because a single reboot can cost the company a million dollars per instance, said the official.

But desktop problems can happen on Linux as well. Bushnell clarifies, “Hope is not a strategy. PCs can crash any day.” Google has a ‘special’ strategy for this. He said, “Active monitoring essential to avoid such cases. We face challenging demands at Google and we push workstations to their limits. We have to work with rapidly moving development cycles.”

And to top it all, Google has to be very strict about security requirements. Bushnell said, “Google is a target to everyone and everybody wants to hack us.” So the company has banned some programs that are part of Ubuntu as potential security risks. Any program that “calls home” to an outside server is banned at Google. The company also uses its own in-house network authentication for user PCs, because the company is ‘such a high profile security target.’

Google satisfies its need for top-of-the-line security and high-end PC performance, simultaneously remaining flexible to meet the desktop needs of all kinds of employees with Ubuntu as its preferred operating system. As Bushnell aptly says, “You’d be a fool to use anything but Linux.”

A Day without GOOGLE

Google owns our Internet

Whether we like it or not, “Google” is almost synonymous with “Internet”. Year after year, Google has developed enough online applications for us to be able to do anything we need to by using only Google.

Now we have a search engine, an online email service, an IM service, a blogging platform, photo and video sharing applications, a feed reader, an online word processor, an encyclopedia, a web site creator, an online directory and even an Internet browser!

Not to mention its pay-per-click advertising system, its main source of revenue.

Now look at the list above. It’s pretty long, isn’t it? Well, you know what? It’s only 10% of the programs, applications, widgets or services Google provides today.

Internet without Google?

Now the question is: would the Internet be possible without Google?

Well, the natural response is yes, it would. But it would be an Internet without its best search engine, without Gmail, YouTube, Google Video, Google Reader, Picasa, Blogger, Orkut, Google Maps, Google Docs, Google Analytics, Google AdSense, Google Chrome, Google News, … Google, Google, Google!

What a mess! It seems almost impossible to work without Google!

But it isn’t. That’s because, fortunately, for almost every service Google provides, there is an alternative service waiting to be discovered. So basically it all depends on your will to make a change. Are we ready to give up all these services and start One day without Google?

Whatever you do, Google is our Big Brother.

Anything we do on the Internet or in “real life”, Google knows about. Maybe this sounds a bit exaggerated, but think about it:

Google knows what we are looking for. Where we are from. What we are interested in. What we’re blogging about. What we are reading. What we advertise. What we click on. What our hobbies are. And much more scary stuff that you would not want to know about.

That’s because we use all these Google services that seem to be tailored to our needs and that require us to hand over personal and technical information.

Are we ready to give them up and start One day without Google?

Be careful what you wish for, Google knows it.

Google knows everything related to what you need!

Funny, I mentioned the word “tailored”… As we’ve seen in the previous point, Google is our Big Brother, i.e. He knows pretty much everything about you. And of course He knows everything related to what you need.

Now that kind of information gives Google tremendous power. Still haven’t figured out why? Google can develop the exact application or service to fulfill your most ardent needs. So basically, while Google works on creating new needs, He also develops new ways to address them. That’s kind of easy peasy lemon squeezy. For Him. And for us too, apparently. Unless we start One day without Google.

No need to study or explore anymore; there’s Google now.

Here, the issue is more complex than it seems at first sight. With today’s computers and Internet, access to information is freer than ever. And Google showed up to organize all that information and serve it on demand. That makes it an excellent way to quickly find information, learn about things you wouldn’t otherwise know about and find an answer to virtually any question.

But an issue arises from all this: why should I study anymore, since I can find everything on the Internet? And the problem is, for many people this question doesn’t have an obvious answer.

So there is a big risk that we will forget how to handwrite (word processors rulz!), will forget the multiplication table (google it!), will forget anything related to literature (lol, leave me alone M8!), history (there’s wiki), natural sciences and so on.

The problem is, sooner or later we won’t be able to discern what is true and what is false. Since all our information sources have moved online, the truth will lie on the first page of the SERPs (search engine results pages). And just like that…

…Your senses, typical of all living organisms, will all be annihilated.

Can we start One day without Google?