
Search Engine Basics

Search Engine Basics contents:

- Internet
- Website Architecture
- How the Search Engine works
- Search Engines
- Website designing Basics
- Website Domain
- Website Hosting

Internet

The Internet is a global system of interconnected computer networks that use the standard Internet protocol suite (TCP/IP) to serve billions of users worldwide. It is a network of networks that consists of millions of private, public, academic, business, and government networks, of local to global scope, that are linked by a broad array of electronic, wireless and optical networking technologies.

"The global communication network that allows almost all computers worldwide to connect and exchange information."

Internet: the single worldwide computer network that interconnects other computer networks, on which end-user services, such as World Wide Web sites or data archives, are located, enabling data and other information to be exchanged.

The Internet allows greater flexibility in working hours and location, especially with the spread of unmetered high-speed connections. The Internet can be accessed almost anywhere by numerous means, including through mobile Internet devices. Mobile phones, datacards, handheld game consoles and cellular routers allow users to connect to the Internet wirelessly. Within the limitations imposed by small screens and other limited facilities of such pocket-sized devices, the services of the Internet, including email and the web, may be available. Service providers may restrict the services offered, and mobile data charges may be significantly higher than other access methods.


Website Architecture


Website architecture is the organization and structure of website information. It is a phase that begins after a website plan has been documented, but before the website is developed. If the website was properly planned before you start drafting the architecture, you will already have a clear idea of the website.

Website Structure

Developing a website network
The structure of a website is like the skeleton or nervous system in the human body. Every joint or synapse is connected into a network of mechanical or electrical links, which in turn makes us who and what we are. In the same way, a website should be connected through a network of links into something that gives your site form and function. The basic website layout shown below is a simplified example of such a network.

Website structure and navigation
The structure of your site is composed of the different sections of your website and navigation within those sections. It is the framework that shapes your site and defines your website navigation scheme. If you develop a sound website structure everything else will fall into place.

Navigation and website structure
The key to the success of your website's structure is the ease with which your visitors can navigate the site. A general rule of thumb is that it should take no more than two clicks for a visitor to find what they are looking for.
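The click rule above can be checked mechanically once the site's links are written down. The following is a minimal sketch, not a full site auditor: it assumes a hypothetical site map (the `SITE` dictionary and its paths are invented for illustration) and uses a breadth-first search to find how many clicks each page is from the home page.

```python
from collections import deque

def click_depths(links, home):
    """Return the minimum number of clicks from `home` to each reachable page."""
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:  # first visit is the shortest path in BFS
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths

# Hypothetical site map: each page maps to the pages it links to.
SITE = {
    "/": ["/products", "/about"],
    "/products": ["/products/widgets"],
    "/products/widgets": ["/products/widgets/blue"],
    "/about": [],
}

depths = click_depths(SITE, "/")
# Pages that break the two-click rule of thumb.
too_deep = sorted(page for page, depth in depths.items() if depth > 2)
print(too_deep)
```

Any page listed in `too_deep` is a candidate for a direct link from a higher-level page.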

How the Search Engine works

A search engine operates in the following order:

  • Web crawling
  • Indexing
  • Searching
Running a web crawler is a challenging task. There are tricky performance and reliability issues and, even more importantly, there are social issues. Crawling is the most fragile application since it involves interacting with hundreds of thousands of web servers and various name servers, which are all beyond the control of the system.

Search Engine Process
In order to scale to hundreds of millions of web pages, Google has a fast distributed crawling system. A single URLserver serves lists of URLs to a number of crawlers. Both the URLserver and the crawlers are implemented in Python. Each crawler keeps roughly 300 connections open at once. This is necessary to retrieve web pages at a fast enough pace. At peak speeds, the system can crawl over 100 web pages per second using four crawlers. This amounts to roughly 600K per second of data.
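The figures in this paragraph can be combined into a quick back-of-the-envelope check. This is only a sketch of the arithmetic: it assumes "600K per second" means roughly 600,000 bytes per second, and the 100-million-page target is an illustrative number, not one taken from the text.

```python
# Figures quoted in the paragraph above.
pages_per_second = 100
bytes_per_second = 600 * 1000  # assumption: "600K" read as 600,000 bytes
crawlers = 4
connections_per_crawler = 300

# Average size of a fetched page implied by the two throughput figures.
avg_page_bytes = bytes_per_second / pages_per_second

# Connections held open across all crawlers at once.
total_connections = crawlers * connections_per_crawler

# Pages fetched in one day if the peak rate could be sustained.
pages_per_day = pages_per_second * 60 * 60 * 24

# Illustrative target: days needed to fetch 100 million pages at peak rate.
days_for_100_million = 100_000_000 / pages_per_day

print(avg_page_bytes, total_connections, pages_per_day, round(days_for_100_million, 1))
```

Under these assumptions an average page is about 6 KB, and even at peak speed a crawl of 100 million pages takes well over a week.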

It turns out that running a crawler which connects to more than half a million servers, and generates tens of millions of log entries generates a fair amount of email and phone calls. Because of the vast number of people coming on line, there are always those who do not know what a crawler is, because this is the first one they have seen.
Web search engines work by storing information about many web pages, which they retrieve from the html itself. These pages are retrieved by a Web crawler (sometimes also known as a spider) — an automated Web browser which follows every link on the site. Exclusions can be made by the use of robots.txt. The contents of each page are then analyzed to determine how it should be indexed (for example, words are extracted from the titles, headings, or special fields called meta tags). Data about web pages are stored in an index database for use in later queries. A query can be a single word. The purpose of an index is to allow information to be found as quickly as possible. Some search engines, such as Google, store all or part of the source page (referred to as a cache) as well as information about the web pages.
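To make the crawl-and-analyze step concrete, here is a minimal sketch using Python's standard `html.parser` module. It assumes a tiny hand-written page (the `PAGE` string and the `example.com` URLs are made up), collects the links a crawler would follow next, and extracts words only from the title, headings, and meta tags. Real search engines analyze far more than this.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin

class PageScanner(HTMLParser):
    """Collect outgoing links and words from titles, headings, and meta tags."""
    INDEXED_TAGS = {"title", "h1", "h2", "h3"}

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []   # absolute URLs to crawl next
        self.words = []   # words to hand to the indexer
        self._depth = 0   # > 0 while inside a title or heading

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            self.links.append(urljoin(self.base_url, attrs["href"]))
        elif tag == "meta" and attrs.get("name") in ("keywords", "description"):
            self.words.extend(self._tokenize(attrs.get("content") or ""))
        elif tag in self.INDEXED_TAGS:
            self._depth += 1

    def handle_endtag(self, tag):
        if tag in self.INDEXED_TAGS and self._depth:
            self._depth -= 1

    def handle_data(self, data):
        if self._depth:
            self.words.extend(self._tokenize(data))

    @staticmethod
    def _tokenize(text):
        return re.findall(r"[a-z0-9]+", text.lower())

# Hypothetical page used only for this illustration.
PAGE = """<html><head><title>Search Basics</title>
<meta name="keywords" content="crawler, index">
</head><body><h1>How Crawling Works</h1>
<p>Body text is ignored in this sketch.</p>
<a href="/about.html">About</a>
<a href="http://example.org/x">Elsewhere</a></body></html>"""

scanner = PageScanner("http://example.com/index.html")
scanner.feed(PAGE)
print(scanner.links)
print(scanner.words)
```

The relative link is resolved against the page's own URL, which is what lets a crawler keep following links from one page to the next.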


Indexing the Web:

Parsing: Any parser which is designed to run on the entire Web must handle a huge array of possible errors. These range from typos in HTML tags to kilobytes of zeros in the middle of a tag, non-ASCII characters, HTML tags nested hundreds deep, and a great variety of other errors that challenge anyone's imagination to come up with equally creative ones. For maximum speed, instead of using YACC to generate a CFG parser, we use flex to generate a lexical analyzer which we outfit with its own stack. Developing this parser which runs at a reasonable speed and is very robust involved a fair amount of work.
Indexing Documents into Barrels: After each document is parsed, it is encoded into a number of barrels. Every word is converted into a wordID by using an in-memory hash table, the lexicon. New additions to the lexicon hash table are logged to a file. Once the words are converted into wordIDs, their occurrences in the current document are translated into hit lists and are written into the forward barrels. The main difficulty with parallelization of the indexing phase is that the lexicon needs to be shared. Instead of sharing the lexicon, we took the approach of writing a log of all the extra words that were not in a base lexicon, which we fixed at 14 million words. That way multiple indexers can run in parallel and then the small log file of extra words can be processed by one final indexer.
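The lexicon and forward barrel can be pictured with a few lines of Python. This is a simplified sketch, not the system described above: the `Lexicon` class and `index_document` function are hypothetical names, everything lives in one in-memory dictionary rather than in barrels on disk, and a "hit" is reduced to a bare word position (an assumption made for brevity).

```python
import re

class Lexicon:
    """In-memory hash table that assigns each new word the next wordID."""
    def __init__(self):
        self.word_ids = {}

    def word_id(self, word):
        if word not in self.word_ids:
            self.word_ids[word] = len(self.word_ids)
        return self.word_ids[word]

def index_document(doc_id, text, lexicon, forward_barrel):
    """Record, for one document, the positions (hits) of every wordID."""
    hits = {}
    for position, word in enumerate(re.findall(r"[a-z0-9]+", text.lower())):
        hits.setdefault(lexicon.word_id(word), []).append(position)
    forward_barrel[doc_id] = hits

lexicon = Lexicon()
forward = {}  # docID -> {wordID: [positions]}
index_document(1, "web search engine", lexicon, forward)
index_document(2, "search the web", lexicon, forward)
print(lexicon.word_ids)
print(forward)
```

The forward barrel is keyed by document, which is convenient while parsing but not for answering queries, since a query starts from a word rather than from a document.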
Sorting: In order to generate the inverted index, the sorter takes each of the forward barrels and sorts it by wordID to produce an inverted barrel for title and anchor hits and a full text inverted barrel. This process happens one barrel at a time, thus requiring little temporary storage. Also, we parallelize the sorting phase to use as many machines as we have simply by running multiple sorters, which can process different buckets at the same time. Since the barrels don't fit into main memory, the sorter further subdivides them into baskets which do fit into memory based on wordID and docID. Then the sorter loads each basket into memory, sorts it, and writes its contents into the short inverted barrel and the full inverted barrel.
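The sorting step amounts to regrouping the same records by wordID instead of by docID. The sketch below assumes a small forward barrel that fits in memory (the `FORWARD` data and the `invert_barrel` name are illustrative), so it skips the subdivision into baskets and the split into short and full barrels.

```python
def invert_barrel(forward_barrel):
    """Sort forward-barrel records by wordID, then docID, and group by wordID."""
    records = sorted(
        (word_id, doc_id, positions)
        for doc_id, hits in forward_barrel.items()
        for word_id, positions in hits.items()
    )
    inverted = {}
    for word_id, doc_id, positions in records:
        inverted.setdefault(word_id, []).append((doc_id, positions))
    return inverted

# Hypothetical forward barrel: docID -> {wordID: [positions]}.
FORWARD = {
    1: {0: [0], 1: [1], 2: [2]},
    2: {1: [0], 3: [1], 0: [2]},
}

inverted = invert_barrel(FORWARD)
# Answering a one-word query is now a single lookup by wordID.
docs_with_word_1 = [doc_id for doc_id, _ in inverted[1]]
print(inverted)
print(docs_with_word_1)
```

Once the index is inverted, finding every document that contains a word no longer requires scanning all documents.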


Searching:
The goal of searching is to provide quality search results efficiently. Many of the large commercial search engines seemed to have made great progress in terms of efficiency. Therefore, we have focused more on quality of search in our research, although we believe our solutions are scalable to commercial volumes with a bit more effort.

Search Engines

Search Engines
A web search engine is designed to search for information on the World Wide Web and FTP servers. The search results are generally presented in a list of results often referred to as SERPs, or "search engine results pages". The information may consist of web pages, images, and other types of files. Some search engines also mine data available in databases or open directories. Unlike web directories, which are maintained only by human editors, search engines also maintain real-time information by running an algorithm on a web crawler.

Google: Google Inc. (NASDAQ: GOOG) is an American multinational Internet and software corporation specializing in Internet search, cloud computing, and advertising technologies. It hosts and develops a number of Internet-based services and products,[5] and generates revenue mainly from advertising through its AdWords program.[6][7] The company was founded by Larry Page and Sergey Brin while they were both attending Stanford University.
Google
Google was first incorporated as a privately held company on September 4, 1998, and its initial public offering followed on August 19, 2004. At that time Larry Page, Sergey Brin, and Eric Schmidt agreed to work together at Google for 20 years, until the year 2024.[8] The company's mission statement from the outset was "to organize the world's information and make it universally accessible and useful",[9] and the company's unofficial slogan is "Don't be evil".[10][11] In 2006, the company moved to its current headquarters in Mountain View, California.
Google's rapid growth since its incorporation has triggered a chain of products, acquisitions, and partnerships beyond the company's core web search engine. The company offers online productivity software, such as the Gmail email service, the Google Docs office suite, and the Google+ social networking service. Google's products extend to the desktop as well, with applications such as the Google Chrome web browser, the Picasa photo organization and editing software, and the Google Talk instant messaging application. Google leads the development of the Android mobile operating system, as well as the Google Chrome OS browser-only operating system,[12] found on specialized netbooks called Chromebooks.

Yahoo: Yahoo! Inc. (NASDAQ: YHOO) is an American multinational internet corporation headquartered in Sunnyvale, California, United States. The company is perhaps best known for its web portal, search engine (Yahoo! Search), Yahoo! Directory, Yahoo! Mail, Yahoo! News, Yahoo! Groups, Yahoo! Answers, advertising, online mapping (Yahoo! Maps), video sharing (Yahoo! Video), and social media websites and services. It is one of the largest websites in the United States.

Yahoo
Yahoo! was founded by Jerry Yang and David Filo in January 1994 and was incorporated on March 1, 1995. On January 13, 2009, Yahoo! appointed Carol Bartz, former executive chairman of Autodesk, as its new chief executive officer and a member of the board of directors. On September 6, 2011, Bartz was removed from her position at Yahoo! by chairman Roy Bostock, and CFO Tim Morse was named Interim CEO of the company.


MSN
MSN: The concept for MSN was created by the Advanced Technology Group at Microsoft, headed by Nathan Myhrvold. MSN was originally conceived as a dial-up online content provider like America Online, supplying proprietary content through an artificial folder-like interface integrated into Windows 95's Windows Explorer file management program. Categories on MSN appeared like folders in the file system.
MSN (originally The Microsoft Network) is a collection of Internet sites and services provided by Microsoft. The Microsoft Network debuted as an online service and Internet service provider on August 24, 1995, to coincide with the release of the Windows 95 operating system.[2]
The range of services offered by MSN has changed since its initial release in 1995. MSN was once a simple online service for Windows 95, an early experiment in interactive multimedia content on the Internet, and one of the most popular dial-up Internet service providers.
Microsoft used the MSN brand name to promote numerous popular web-based services in the late 1990s, such as Hotmail and Messenger, before reorganizing many of them in 2005 under another brand name, Windows Live. MSN's Internet portal, MSN.com, is currently the 14th most visited domain name on the Internet.

Bing
Bing: Bing (formerly Live Search, Windows Live Search, and MSN Search) is a web search engine (advertised as a "decision engine") from Microsoft. Bing was unveiled by Microsoft CEO Steve Ballmer on May 28, 2009 at the All Things Digital conference in San Diego. It went fully online on June 3, 2009, with a preview version released on June 1, 2009. Notable changes include the listing of search suggestions as queries are entered and a list of related searches (called "Explore pane") based on semantic technology from Powerset, which Microsoft purchased in 2008. On July 29, 2009, Microsoft and Yahoo! announced a deal in which Bing would power Yahoo! Search. All Yahoo! Search global customers and partners are expected to have made the transition by early 2012.

In August 2011, Bing announced it is working on new back-end search infrastructure, with the goal of delivering faster and slightly more relevant search results for users. Known as "Tiger," the new index-serving technology is being incorporated into Bing globally starting in August 2011.

Ask
Ask: Ask (known as Ask Jeeves in the UK) is a Q&A-focused search engine founded in 1996 by Garrett Gruener and David Warthen in Berkeley, California. The original software was implemented by Gary Chevsky from his own design. Warthen, Chevsky, Justin Grant, and others built the early AskJeeves.com website around that core engine. Three venture capital firms, Highland Capital Partners, Institutional Venture Partners, and The RODA Group, were early investors. Ask.com is currently owned by InterActiveCorp under the NASDAQ symbol IACI. In late 2010, facing insurmountable competition from Google, the company outsourced its web search technology to an unspecified third party and returned to its roots as a question and answer site. Doug Leeds was promoted from president to CEO in January 2011.

AOL
AOL: AOL Inc. (NYSE: AOL, stylized as "Aol.", formerly known as America Online) is an American global Internet services and media company. Founded in 1983 as Control Video Corporation, it has franchised its services to companies in several nations around the world or set up international versions of its services. AOL is headquartered at 770 Broadway in New York City, but has many offices in cities throughout North America, such as Atlanta, Baltimore, Beverly Hills, Boston, Chicago, Detroit, Dulles, Mountain View, and San Francisco. It also has offices in London and Seattle.
AOL is best known for its online software suite, also called AOL, that allowed customers to access the world's largest "walled garden" online community and eventually reach out to the Internet as a whole. At its peak, AOL's membership was over 30 million members worldwide, most of whom accessed the AOL service through the AOL software suite. In 2000 AOL and Time Warner merged under the name AOL Time Warner. The merger was not fruitful and on May 28, 2009, Time Warner announced that it would spin off AOL into a separate public company. The spinoff occurred on December 9, 2009, ending the eight-year relationship between the two companies.


Website designing Basics
- Launch the website only when it is fully functional.
- Use a consistent style and the same formatting for all inside pages.
- Don't bother to show ads on your main website.
- For testing your website, use a subdirectory rather than the main domain.
- Build the website on a popular open source platform for better exposure and easy upgrades.
- Make the website for your visitors and not just for the spiders.
- Make sure the website conveys its purpose clearly.
- Make use of robots.txt, urllist.txt, and .htaccess for greater control and exposure.
- Put statistics (analytics) code on all pages for easy analysis.
- Avoid popup or pop-in ads as far as possible.
- Keep the website fast loading, with a minimum of scripts.
- Select the best domain name and the best hosting provider.
- Prefer simple Flash or GIF animation to a full-page Flash website.
- Submit your website to all major search engines.
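As an illustration of the robots.txt tip and the advice to test in a subdirectory, the sketch below uses Python's standard `urllib.robotparser` module to check what a well-behaved crawler may fetch. The rules, the `/test/` and `/admin/` paths, the `ExampleBot` name, and the `example.com` URLs are all hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that keeps crawlers out of the test and admin areas.
ROBOTS_TXT = """\
User-agent: *
Disallow: /test/
Disallow: /admin/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())  # parse the rules without any network access

home_allowed = parser.can_fetch("ExampleBot", "http://example.com/index.html")
test_allowed = parser.can_fetch("ExampleBot", "http://example.com/test/draft.html")
print(home_allowed, test_allowed)
```

Note that robots.txt is only a request to crawlers, so it is not a substitute for access control on pages that must stay private.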

Website Domain:
A domain name registry is a database of all domain names registered in a top-level domain. A registry operator, also called a network information center (NIC), is the part of the Domain Name System (DNS) of the Internet that keeps the database of domain names and generates the zone files which convert domain names to IP addresses. Each NIC is an organisation that manages the registration of domain names within the top-level domains for which it is responsible, controls the policies of domain name allocation, and technically operates its top-level domain. It is potentially distinct from a domain name registrar.

Domain names are managed under a hierarchy headed by the Internet Assigned Numbers Authority (IANA), which manages the top of the DNS tree by administrating the data in the root nameservers.
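The hierarchy can be made visible by reading a domain name from right to left. The following sketch only manipulates strings and performs no DNS lookups. The `dns_path` function name is hypothetical, and `www.example.com` is a placeholder name.

```python
def dns_path(domain):
    """List the zones passed through from the DNS root down to `domain`."""
    labels = domain.rstrip(".").lower().split(".")
    path = ["."]  # the root zone, at the top of the DNS tree
    # Walk the labels from right (top-level domain) to left (host name).
    for i in range(len(labels) - 1, -1, -1):
        path.append(".".join(labels[i:]) + ".")
    return path

path = dns_path("www.example.com")
print(path)
```

In this picture the first entry is the root managed by IANA, the second is the top-level domain run by a registry operator, and the remaining entries are names registered or created beneath it.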


Website Hosting:
A web hosting service is a type of Internet hosting service that allows individuals and organizations to make their own website accessible via the World Wide Web. Web hosts are companies that provide space on a server they own or lease for use by their clients as well as providing Internet connectivity, typically in a data center. Web hosts can also provide data center space and Internet connectivity for servers they do not own that are located in their data center. This is called colocation, also known as Housing in Latin America or France.

The scope of web hosting services varies greatly. The most basic is web page and small-scale file hosting, where files can be uploaded via File Transfer Protocol (FTP) or a Web interface. The files are usually delivered to the Web "as is" or with little processing. Many Internet service providers (ISPs) offer this service free to their subscribers. People can also obtain Web page hosting from other, alternative service providers. Personal web site hosting is typically free, advertisement-sponsored, or inexpensive. Business web site hosting often has a higher expense.

Single page hosting is generally sufficient only for personal web pages. A complex site calls for a more comprehensive package that provides database support and application development platforms (e.g. PHP, Java, Ruby on Rails, ColdFusion, and ASP.NET). These facilities allow the customers to write or install scripts for applications like forums and content management. For e-commerce, SSL is also highly recommended.

The host may also provide an interface or control panel for managing the Web server and installing scripts as well as other modules and service applications like e-mail. Some hosts specialize in certain software or services (e.g. e-commerce). They are commonly used by larger companies to outsource network infrastructure to a hosting company.

