Creating a web page




Abstract

The main purpose of the paper is to identify the techniques for improving the search probability of the website for a taxi business by making the website more suitable and apparent to search engines like Google, MSN Live and Yahoo.

In order to do so the paper identifies that the website should be based on key words and tags that can aid in increasing the ranking of the website on the search engines. Specific software can be used to enforce and identify keywords for the theme of the website. Another option discussed in detail in the paper is increasing the number of website links on the internet in order to increase the website ranking.

The search optimization technology and strategies adopted by Google and other search engines are discussed, and the use of Meta tags and search engine crawlers is identified. The different search engines are compared in terms of their structure, behavior and search algorithms.

The paper also outlines how improving the SEO of the website can increase traffic to the website and, concurrently, the revenue and sales of the products and services offered on the site.

Acknowledgement

Some of my closest friends stand out above all others in deserving my gratitude and appreciation, though my poor choice of words is hardly the equal of their gifts to me. These people are David Bowes, my project supervisor for providing me with the relevant direction; Nitun, a PHP site developer and my friends who helped me through the technicalities of the project; Naheem-Ul-Haq for his constant support and most of all my wife for being understanding and helping me through all the peaks and troughs of my life.

Additionally, I would like to thank everyone else who assisted and helped me in attaining my goal of conducting this project.

Introduction

Search Engine Optimization (SEO) is the process of updating and adjusting a site and its content in order to make it more easily accessible to search engines. It involves comprehending how the search engines operate, identifying the criteria for being the top ranked site on these search engines, and then making the corresponding changes to the website so that it ranks among the topmost results when searched for by certain keywords.

“What it involves basically is making sure that your Web site is search engine friendly. What that means is that these search engines have what they call spiders, and what these spiders do is look for keywords, and it gives relevance to certain text within certain tags. The search engine spider assumes that if you have a keyword within your title, header and link tags, then those words must be important and they have a lot to do with what your site is about. In order to rank high in these search engines, you want to make sure that that word is prominent within certain tags within your site.” (Moore, 2000)

Normally search engines use programs called crawlers which visit the web pages from time to time and collect data according to a pre-specified format which may include code used to build the site as well as the data posted on it. This is called indexing.

Adjustments can be made to the website even after the web pages have been built. This gives them a dynamic nature and therefore more of a chance of being ranked favorably. Moreover, even after SEO has been performed for a web page, it may take up to a month for the results to be reflected in the search engine results.

Search engine optimization is becoming increasingly popular amongst businesses, especially those which run click-and-mortar or online operations. Businesses want their website to be among the top ten ranked results when a customer enters a keyword into the search engine. This is why companies have started formally hiring SEO consultants to improve the chances of their website attracting more customers through higher ranking and visibility on the net. In a similar context, an article by Claire Armitt in 2005 stated that high street retailers had not been able to attract large numbers of customers to their web sites because they lacked search engine optimization for their online operations.

“British shopping giants Tesco, Debenhams, John Lewis, Boots and Topshop were among 18 retailers whose search engine optimization (SEO) strategies were scrutinized in January 2005. Screen Pages found almost three-quarters of top brands did not have broad coverage or consistent amounts of exposure of their Web pages on major search engines. By measuring key aspects of SEO, the extent to which search engines could read and index pages, relevance of results and the number of direct links to a site, Screen Pages found that only half of retailers appeared to have an SEO strategy. Of these, only five appeared in page-one search results on Google when generic search terms were used. Just two retailers made it into the top three results when the search term that best described their business was typed in.” (Armitt, 2005)

A success story for a business employing SEO for its online operations is the case of Ginueva Villa. She had a struggling kitchen and cabinet website which was not attracting many customers. Her business was of the brick and mortar type, and its website ranked low on the Google search engine. Today, however, her business KitchenCabinetMart.com is a success story.

This came about when the owner adopted an SEO implementation strategy. The owner of the business used a methodological approach to search engine optimization to make the best of the non-paid advertising that it could get from the internet and the search engines. The result of all this effort is that the website now ranks in the top results on the first page of Google’s search engine whenever a customer or visitor types the words ‘kitchen cabinets’. The owner herself commented that she is overwhelmed with the response and that the business she is getting now is much more than she could ever handle alone as a single entrepreneur.

Main chapters

2.1. Structure of the Search engines

There are basically three types of search engines: crawler-based search engines, human powered directories and hybrid search engines.

2.1.1. Crawler-based search engines

Crawler-based search engines, such as Google, create their listings automatically. They sift through different websites, match them against the keywords the searcher has written and display them in order of relevance. If the pages being searched have been changed, the engine will automatically record the changes periodically.

Crawler-based search engines have three major elements. The first of these is known as the spider. It is a program which surfs the website, reading its content and following links within the website to view the entire site. It revisits the site periodically in order to update its status and the keywords used within it.

All the data viewed by the spider programs is sent to the second major element, which is called the index. This is like a huge library containing copies of all the websites that have been viewed by the spider program. It is updated whenever the spider program sends new information concerning a certain website. Sometimes it takes a while for a website to be indexed; therefore the new information is not available for searching until indexing has been done. Google has been widely successful in creating fast indexing algorithms for its search engine.

The third element of a search engine is the actual search engine software itself. This is the software that actually sifts through all the information in the Index and displays the information relevant to the key words as defined by the searcher.
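The three elements described above can be sketched in a few lines. This is only a toy illustration under stated assumptions: the in-memory PAGES dictionary stands in for the web, and the function names spider, build_index and search are hypothetical, not taken from any real search engine.

```python
from collections import defaultdict

# Hypothetical in-memory "web": URL -> (page text, outgoing links).
PAGES = {
    "a.example/home": ("cheap taxi booking", ["a.example/fares"]),
    "a.example/fares": ("taxi fares and airport transfers", []),
}

def spider(start_url, pages):
    """Follow links from start_url and yield (url, text) for each reachable page."""
    seen, queue = set(), [start_url]
    while queue:
        url = queue.pop(0)
        if url in seen or url not in pages:
            continue
        seen.add(url)
        text, links = pages[url]
        yield url, text
        queue.extend(links)

def build_index(crawled):
    """Map each word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in crawled:
        for word in text.lower().split():
            index[word].add(url)
    return index

def search(index, query):
    """Return the URLs that contain every word of the query, sorted by URL."""
    words = query.lower().split()
    if not words:
        return []
    hits = set.intersection(*(index.get(w, set()) for w in words))
    return sorted(hits)

index = build_index(spider("a.example/home", PAGES))
print(search(index, "taxi fares"))
```

A real engine would rank the matching pages by relevance; this sketch only shows how the spider feeds the index and how the search software reads from it.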

2.1.2. Human Powered Directories:

A human powered directory depends on humans for its listings. It does not sift through the entire World Wide Web. Editors of the directory write reviews or reports on websites, and the directory search program matches the keywords being searched for against these reports. Changes made on a website do not affect its listing. Listings are based on the quality of the site’s content. Different directories may give a certain website different listings, as their editors differ. Commonly, a well written and well made website will receive a better review, and therefore a better listing, than a poorly made website. Examples of human powered directories are the Open Directory and EBSCO host.

2.1.3. Hybrid search engines:

Hybrid search engines like MSN and Yahoo combine crawler-based search engines and human powered directories. Which type of listing is favored depends on the individual engine; for example, MSN favors human-made listings over those produced by crawlers.

2.1.4. The Structure of Google search engine:

Google is considered the best and most popular search engine in use today. Many other search engines use structures which take root in Google’s. Therefore Google can be considered the best representative of the search engines available. The Google search engine is mostly implemented in C++ to maximize efficiency.

Web crawling in Google is done by several spider programs. URL servers send lists of URLs to be fetched to the crawlers. The fetched web pages are then sent to the store server, which compresses them and stores them in a repository. Every web page has an associated ID number called a docID, which is assigned whenever a new URL is parsed out of a web page. The indexing function is performed by the indexer and the sorter. The indexer reads the repository, uncompresses the documents and parses them. The documents are then converted into sets of word occurrences called hits. The hits record the words and their positions within the document. The indexer distributes these hits into a set of “barrels”, creating a partially sorted forward index. The indexer also parses out all the links in every web page and stores important information about them in an anchors file.

This file contains enough information to determine where each link points from and to, and the text of the link. The URL resolver reads the anchors file and converts relative URLs into absolute URLs and in turn into docIDs. It puts the anchor text into the forward index, associated with the docID that the anchor points to. It also generates a database of links, which are pairs of docIDs. The links database is used to compute PageRanks for all the documents. The sorter takes the barrels, which are sorted by docID, and re-sorts them by wordID to generate an inverted index. A program called DumpLexicon takes this list together with the lexicon produced by the indexer and generates a new lexicon to be used by the searcher. The searcher is run by a web server and uses the lexicon built by DumpLexicon together with the inverted index and the PageRanks to answer queries.
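The URL resolver step lends itself to a short sketch. The anchors list, the example hosts and the helper names doc_id and resolve below are hypothetical and chosen only for illustration; the real anchors file format is not described in this paper. The sketch resolves relative URLs, assigns docIDs and produces the links database of docID pairs.

```python
from urllib.parse import urljoin

# Hypothetical anchors file: (source URL, href as written in the page, anchor text).
anchors = [
    ("http://taxi.example/index.html", "fares.html", "our fares"),
    ("http://taxi.example/index.html", "http://maps.example/", "route map"),
    ("http://taxi.example/fares.html", "/index.html", "home"),
]

doc_ids = {}  # absolute URL -> docID

def doc_id(url):
    """Assign the next free docID the first time a URL is seen."""
    if url not in doc_ids:
        doc_ids[url] = len(doc_ids)
    return doc_ids[url]

def resolve(anchors):
    """Return (links, anchor_text): docID pairs and anchor text per target docID."""
    links, anchor_text = [], {}
    for source, href, text in anchors:
        target = urljoin(source, href)  # relative URL -> absolute URL
        src_id, dst_id = doc_id(source), doc_id(target)
        links.append((src_id, dst_id))
        anchor_text.setdefault(dst_id, []).append(text)
    return links, anchor_text

links, anchor_text = resolve(anchors)
print(links)
print(anchor_text)
```

Note that the anchor text is stored against the page the link points to, not the page containing the link, which matches the description above.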

2.1.5. How Google Indexes and Ranks Pages:

Indexing: Any information that is to be indexed must first be checked for errors. These range from typos in HTML tags to kilobytes of zeros in the middle of a tag, non-ASCII characters, HTML tags nested hundreds deep and much more. This is done by a parsing program. Maximum speed is needed to find and handle errors in the millions of websites “parsed” every day. Therefore Google uses flex to generate a lexical analyzer which is outfitted with its own stack.

After each website and document is parsed, it has to be encoded into a number of barrels. Every word within the parsed information is converted into a wordID by using an in-memory hash table, which is the lexicon hash table. New additions to the lexicon hash table are logged to a file. Once the words are converted into wordID’s, their occurrences in the current document are translated into hit lists and are written into the forward barrels. Google takes the approach of writing a log of all the extra words that were not in a base lexicon, which is fixed at 14 million words. This way multiple indexers can run in parallel and then the small log file of extra words can be processed by one final indexer.

In order to generate the index, the sorter takes each of the forward barrels and sorts it by wordID to produce an inverted barrel for title and anchor hits and a full text inverted barrel. This process happens one barrel at a time, thus requiring little temporary storage. Google also parallelizes the sorting phase to use as many machines as are available, simply by running multiple sorters which can process different buckets at the same time. Since the barrels do not fit into main memory, the sorter further subdivides them, based on wordID and docID, into baskets which do fit into memory. The sorter then loads each basket into memory, sorts it and writes its contents into the short inverted barrel and the full inverted barrel. In this way the maximum amount of information is indexed in the minimum amount of time using the minimum amount of active memory.
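The path from words to an inverted index can be illustrated with a minimal sketch. It assumes a single in-memory barrel and whitespace tokenization; the names word_id, forward_barrel and invert are hypothetical, and the parallel sorters, baskets and on-disk formats described above are deliberately left out.

```python
from collections import defaultdict

lexicon = {}  # word -> wordID (the in-memory hash table)

def word_id(word):
    """Return the wordID for a word, adding it to the lexicon if it is new."""
    return lexicon.setdefault(word, len(lexicon))

def forward_barrel(docs):
    """Build (docID, wordID, positions) entries, ordered by docID."""
    barrel = []
    for doc, text in sorted(docs.items()):
        hits = defaultdict(list)  # wordID -> positions of the word in this document
        for pos, word in enumerate(text.lower().split()):
            hits[word_id(word)].append(pos)
        barrel.extend((doc, wid, positions) for wid, positions in sorted(hits.items()))
    return barrel

def invert(barrel):
    """Re-sort a forward barrel by wordID to obtain an inverted index."""
    inverted = defaultdict(list)
    for doc, wid, positions in sorted(barrel, key=lambda e: (e[1], e[0])):
        inverted[wid].append((doc, positions))
    return dict(inverted)

docs = {0: "taxi to airport", 1: "airport taxi taxi"}
inverted = invert(forward_barrel(docs))
print(inverted[lexicon["taxi"]])
```

Each entry of the inverted index lists, for one wordID, the documents containing that word together with the hit positions, which is what the searcher needs to answer a keyword query.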

Page ranking: The link graph of the web is an important resource that has largely gone unused in existing web search engines. Google has created maps containing as many as 518 million of these hyperlinks. These maps allow rapid calculation of a web page’s “PageRank”, an objective measure of its citation importance that corresponds well with people’s subjective idea of importance. Because of this correspondence, PageRank is an excellent way to prioritize the results of web keyword searches. For most popular subjects, a simple text matching search that is restricted to web page titles performs admirably when PageRank prioritizes the results. For the type of full text searches in the main Google system, PageRank also helps a great deal. PageRank is based on a complex algorithm and on assumptions about what a random surfer might click on. Following this algorithm, websites are ranked in order of their content and their relevance to the keywords matched by the lexical analyzer.

2.1.6. Scalability of Google:

Google has been designed to be scalable in the near term to a goal of 100 million web pages. Disks and machines have been bought to handle such a large amount. All of the time consuming parts of the system, which include the crawlers, indexers and sorters, are parallelized and run in roughly linear time. It is also expected that most of the data structures will deal gracefully with the expansion. However, at 100 million web pages Google will be testing the limits of most, if not all, of the operating systems used to run the engine (currently Google is run on both Solaris and Linux). Addressable memory, the number of open file descriptors, network sockets, bandwidth and similar limits may make the system unwieldy. Expanding to a lot more than 100 million pages would greatly increase the complexity of Google’s system.

2.1.7. Google search optimization tools versus MSN and Yahoo

Google is considered the leading search engine. Its search engine optimization technology is compared here to that of MSN and Yahoo through the tools they offer. Google uses the following tools, which allow its search to be thorough and maximize relevance.

“Google sitemap” helps to determine if Google has problems indexing a website.
“Adwords keywords” tool shows keywords related to an entered key word, web page or website
“AdWords Traffic Estimator” estimates the bid price required to rank #1 on 85% of Google AdWords ads near searches on Google, and how much traffic an AdWords ad would drive
“Google Suggest” auto completes search queries based on the most common searches starting with the characters or words you have entered
“Google Trends” shows multi-year search trends
“Google Sets” creates semantically related keyword sets based on keywords that are entered
“Google Zeitgeist” shows quickly rising and falling search queries
“Google related sites” shows sites that Google thinks are related to the user’s site
“Google related word search” shows terms semantically related to a keyword

MSN has a wide array of new and interesting search optimization tools. Their biggest limiting factor is MSN’s limited share of the search market. Some of the tools are as follows.

“Keyword Search Funnel Tool” shows terms that people search for before or after they search for a particular keyword
“Demographic Prediction Tool” predicts the demographics of searchers by keyword or site visitors by website
“Online Commercial Intention Detection Tool” estimates the probability of a search query or web page being commercial or informational-transactional
“Search Result Clustering Tool” clusters search results based on related topics

Yahoo has a number of useful search engine optimization tools. However, its tools cause lags during searches and also limit server memory, which causes Yahoo to fall short of Google.

“Overture Keyword Selector Tool” shows prior month search volumes across Yahoo! and their search network.
“Overture View Bids Tool” displays the top ads and bid prices by keyword in the Yahoo! Search Marketing ad network.
“Yahoo! Site Explorer” shows which pages Yahoo! has indexed from a site and which pages they know of that link to pages on your site.
“Yahoo! Mindset” shows you how Yahoo! can bias search results more toward informational or commercial search results.
“Yahoo! Advanced Search Page” makes it easy to look for .edu and .gov back links
“Yahoo! Buzz” shows current popular searches

2.2. Comprehending and Comparing the Google Algorithm

The main focus of the paper is on the Google search engine. “Being at the top of Google is probably the most important factor in your whole marketing plan online,” says Chris Winfield, president of 10e20 LLC. Aside from this, Google handles 3 billion search inquiries a month. Google is the leader among the search engines currently operating and is therefore the main interest for websites and for performing search engine optimization. The main characteristics of Google’s search and ranking methodology are stated below.

Text matching system: Google finds the web pages whose contents match the text entered in the search box by the user. In doing so Google gives more weight to keywords in headings; therefore, for SEO, the most important keyword should be contained in the heading.

Google does not use Meta tags for searching pages. Instead it uses the first few lines of the web page; therefore, for search engine optimization, one should put the relevant text in the first few lines of the web page.

Google considers keyword density by scanning the whole web page and determining how many times the keyword is used or referenced in the body of the web page. Therefore, to improve the ranking of the page, it is important to make more use of the keywords and so improve the keyword density of the page.
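Keyword density can be made concrete with a small calculation. The sketch below assumes density means the share of words in the body that equal the keyword; the exact measure Google uses is not stated in this paper, and the function name keyword_density and the sample text are hypothetical.

```python
import re

def keyword_density(text, keyword):
    """Fraction of the words in text that equal keyword (case-insensitive)."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if not words:
        return 0.0
    return words.count(keyword.lower()) / len(words)

body = "Taxi booking made easy. Book a taxi online or call our taxi line."
print(round(keyword_density(body, "taxi"), 3))
```

Under this definition a page can raise its density either by using the keyword more often or by trimming unrelated text.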

Google also assigns a substantial amount of weight to keywords which appear in the header tags of the web page, according to the headers’ internal rankings. This means that keywords in H6 have the lowest rank and weight, while keywords in H1 are given the most priority and are ranked highest.

Similarly, Google assigns more weight to keywords that are highlighted or in bold face than to keywords written in the normal typeface. Therefore, in order to optimize the page for the search engine, we should focus on highlighting, capitalizing and bolding keywords in the contents of the web page.

The Google search engine also uses a page rank concept for ranking the web pages in its list of search results. For this it uses the Hilltop page algorithm, which is discussed later in the section. A sample of the technique is as follows. Suppose your site asd.com is linked from xcv.com. Google counts this as a vote for your site by xcv.com, and your site’s page rank improves. The ranking scale runs from 1 to 10: 1 is a low rank, meaning a less popular site, while ranks 7 to 10 are considered high, meaning the site is recommended by a large number of other sites. The Hilltop algorithm uses a graph to find the page rank of a site; the degree of a node is considered when assigning a rank to the website.

Google is most appreciated and liked by online programmers and search engine optimizers because, unlike other search engines which use sketchy and informal techniques for page ranking, Google actually publishes the format for web page ranking. Aside from this, the search engine is more relevant and faster than the others currently operating online. For instance, if a website is not yet listed with Google, submitting its URL to the search engine can get it listed in as little as two weeks. Google also re-indexes an already listed site every month, and even more frequently if the website is using Pagerank™. Compared to Yahoo and MSN Live, re-indexing is more frequent for Google.

2.2.1. The Google Ranking Algorithm

The Google ranking algorithm is based on two parts. The first part is used to search the pages relevant to the index field that has been entered into the search engine by the visitor, and the second part deals with the ranking system. This system is patented and is called Pagerank™.

Google assigns weight to a tag when it finds the required keywords in it. The best practice is to have the keywords in the tag along with a maximum of 40 other words, in order to maximize the weight assigned. Google also does not assign importance to Meta tags. These are tags which are present on every page and are not directly visible to visitors who come across the web page. Meta tags are often abused by web programmers, who tend to fill them with keywords even if the content on the web page is not related to them. This reduces the reliability of Meta tags as a description of the web page. As a result, Google does not base the indexing of the website on the Meta tags alone, unlike the way Yahoo and MSN Live conduct indexing. Instead Google checks the first few lines of the content on the web page to determine the theme of the content. Aside from this it also checks for the keywords in the text. This technique is called determining the keyword density. The higher the keyword density of the web page, the more probable it is that Google will acknowledge the site and index it well against the matching requirements.

Aside from this Google also assigns considerable weight to words which appear in the middle of the header tags. However the priority of weight assigned depends on the ranking of the header. According to this the words in the H1 are more relevant to Google than those which may appear in H4 or H6.

It has already been established that Google also assigns more importance and weight to words that appear in bold face compared to words that appear in the normal font and face type. Similarly according to the Google algorithm the search engine also assigns more weight to words that appear in the caption of the images which may be posted on the webpage. Concurrently the search engine also searches for the links that point to the webpage. If the links pointing to the webpage are high in number then it is possible that the Google search engine will assign a higher weight to the website.
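The tag-based weighting described in this section can be sketched with Python’s standard HTML parser. The numeric weights below are invented for illustration: the paper only states that H1 outranks H6 and that bold text outranks normal text. The class KeywordScorer and the function score are hypothetical names.

```python
from html.parser import HTMLParser

# Hypothetical weights: only the ordering (H1 > ... > H6, bold > normal) comes
# from the text; the numbers themselves are assumptions.
TAG_WEIGHTS = {"h1": 6, "h2": 5, "h3": 4, "h4": 3, "h5": 2, "h6": 1, "b": 2, "strong": 2}

class KeywordScorer(HTMLParser):
    """Sum a weight for every occurrence of a keyword, based on the enclosing tags."""

    def __init__(self, keyword):
        super().__init__()
        self.keyword = keyword.lower()
        self.open_tags = []
        self.score = 0

    def handle_starttag(self, tag, attrs):
        self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag in self.open_tags:
            # Close the most recent matching open tag and anything nested in it.
            idx = len(self.open_tags) - 1 - self.open_tags[::-1].index(tag)
            del self.open_tags[idx:]

    def handle_data(self, data):
        count = data.lower().split().count(self.keyword)
        # Use the heaviest enclosing tag; plain text counts with weight 1.
        weight = max((TAG_WEIGHTS.get(t, 1) for t in self.open_tags), default=1)
        self.score += count * weight

def score(html, keyword):
    scorer = KeywordScorer(keyword)
    scorer.feed(html)
    scorer.close()
    return scorer.score

page = "<h1>Taxi service</h1><p>Book a <b>taxi</b> or a taxi cab</p>"
print(score(page, "taxi"))
```

With these assumed weights the same keyword contributes more when it sits in a top-level heading than when it appears in body text, which is the behavior the section attributes to Google.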

The Pagerank code is used by Google to rank the different indexed web pages and sites that are listed on it. The basis of the system is that if web page B is linked from another web page A, then Google assumes that web page B has important and relevant information on it. As a result it counts the link from web page A to web page B as a vote for B, and therefore ranks web page B higher in comparison to other websites.

The Pagerank scale ranges from 1 to 10 as stated on the Google toolbar and 1 to 7 according to the Google directory. The least important site is assigned the rank of 1 and the most important sites are ranked from 7 to 10 depending on the system being used to determine the rank.

The most apparent way of determining the relevancy of a website for Google would be to simply count the number of links to the website and then assign the rank. The website with the most links leading to it would then be ranked highest and the website with the fewest links leading to it ranked lowest in the Google search results. However, this is not the case, as Google does not simply rely on a large number of links to a website to rank it. It provides a cross check by assigning more weight to websites that are linked to by others which are already ranked high and carry considerable weight. This means that when high quality websites with high weight and ranking refer to another website, they tend to increase the rank and weight of the linked website as well. Google itself states that “Google looks at more than the sheer volume of votes, or links a page receives; it also analyzes the page that casts the vote. Votes cast by pages that are themselves “important” weigh more heavily and help to make other pages “important.””

In addition to the rating and ranking of the websites that link to a site, the Google search engine also takes into account the number of links which lead to the site, so that the site is treated as well linked and authoritative rather than simply as highly ranked.

The Google page rank formula is PR(A) = (1-d) + d(PR(t1)/C(t1) + … + PR(tn)/C(tn)), and it takes into account the factors that have been discussed here. It additionally takes into account the damping factor d, which determines the amount of PR that can actually be passed on when one website posts a link to another website. This factor is set at 0.85, so a website passes on slightly less than its own PR.

As a result, Pagerank helps boost the ranking of a page once other websites link and refer to it. In the formula, A signifies the web page in context, t1 to tn are the web pages referring to it, and C(t1) is the total number of outgoing links that t1 has. The transferable share of PR, as mentioned above, is 0.85.
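The formula PR(A) = (1-d) + d(PR(t1)/C(t1) + … + PR(tn)/C(tn)) can be evaluated by repeated substitution. The sketch below is a minimal illustration, not Google’s implementation: the three-page link graph, the starting value of 1.0 for every page and the fixed number of iterations are all assumptions, and every page is assumed to have at least one outgoing link.

```python
def pagerank(links, d=0.85, iterations=50):
    """Iterate PR(A) = (1 - d) + d * sum(PR(t) / C(t)) over the pages t linking to A."""
    pages = set(links) | {p for targets in links.values() for p in targets}
    pr = {page: 1.0 for page in pages}  # assumed starting value
    for _ in range(iterations):
        new_pr = {}
        for page in pages:
            # Each linking page t contributes its PR divided by its link count C(t).
            incoming = sum(
                pr[src] / len(targets)
                for src, targets in links.items()
                if page in targets
            )
            new_pr[page] = (1 - d) + d * incoming
        pr = new_pr
    return pr

# Hypothetical link graph: page -> pages it links to.
links = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
ranks = pagerank(links)
for page, value in sorted(ranks.items()):
    print(page, round(value, 3))
```

In this example page C receives links from both A and B and ends up with the highest value, while B, which is linked only from A and shares A’s vote with C, ends up lowest.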

The Pagerank system however is not a perfect solution for ranking the website as it is biased towards certain directories like Dmoz.org, Looksmart and Yahoo. The Dmoz.org is an open directory project which was initiated and set up by humans to keep records of the websites. The PR for the website can be increased tenfold by listing onto these free directories. The Dmoz.org directory is particularly important as Google does not even take into account the PR rating of the individual page and instead increases the total PR rating of the websites listed in this directory. The reason for this is that Google uses its own version of ODP for the web directory specifically managed by Google itself.

For Looksmart and Yahoo, however, the Google system assigns a slightly higher PR compared to the normally assigned and transferable PR value. Therefore, in order to improve the ranking of a website in the Google search results, the website can be listed on these free online directories. However, Google requires the quality of the web page to be of a superior category as well. It tends to compare the code of the website as well as analyze the content on it; as a result, faulty and nonsensical SEO is not fruitful when dealing with the Google search engine.

Whenever traditional algorithms are used to rank websites and documents for the queries posted in a search engine, the search engine is unable to distinguish spam websites and web pages, so that the resulting ranking is of poor quality. Companies have already invested in programs which can distinguish spam from actual and relevant information in search queries, and some of these approaches are discussed below.

The first approach is ranking web pages based on human classification. This involves hiring human editors and giving them the task of manually associating keywords with the documents and websites on the web. These associated websites are then ranked according to the query keywords entered in the search engine. However, this process is very slow and very expensive, as the amount of data to be processed on the internet increases by the minute. As a result this is not a comprehensive solution to the problem of online spam. Nevertheless, companies like Yahoo and Mining Company are investing in this technique to make their search engine results as spam free as possible.

The second option is to rank the pages on the basis of how much the information is used. Under this approach, information is collected about the specific queries being made and the pages being viewed by the population online. This information is used to determine which pages are visited most when a specific query is entered into the search engine. The advantage of this type of system is that its results are more relevant; the drawbacks are that a lot of information needs to be stored, which increases the costs of memory and storage. Moreover, the technique is open to spamming and will also slow down the search engine considerably. This type of system is used by DirectHit.

The third category relates to ranking websites according to their connectivity. This takes into account the hyperlinks, direct referrals and links between web pages and websites. This option is based on the assumption that a website referring to another website deems it important for the viewer to visit the referred website, as the information on that website might be important to note. Aside from this, it is also assumed that legitimate, high quality websites refer to other, even better websites. An example of this kind of system is the one used by Google. Its page ranking system, which is currently patented and goes by the name of Pagerank, works according to the same principles. “PageRank is an algorithm to rank pages based on assumption b. It computes a query-independent authority score for every page on the Web and uses this score to rank the result set. Since PageRank is query-independent it cannot by itself distinguish between pages that are authoritative in general and pages that are authoritative on the query topic. In particular a web-site that is authoritative in general may contain a page that matches a certain query but is not an authority on the topic of the query. In particular, such a page may not be considered valuable within the community of users who author pages on the topic of the query.” (Bharat & Mihaila, 2001)

An alternative to Google’s Pagerank in the connectivity based ranking category is Topic Distillation. Under Topic Distillation, the World Wide Web is searched for pages which relate to the query and have the query keywords in their topics. Pages which do not have the query keywords in their topics tend to be ignored. A specific algorithm then determines the score for the websites and pages in the subset of the World Wide Web collected in the first part of this technique. The score of a web page is determined by summing all the weights of its incoming connectivity links to arrive at a single resultant figure. This determines the weight of the specific website. The disadvantage of this type of system is that it requires pages on the query topic to already exist in order to operate on them.

2.2.2. The Hilltop Algorithm

The Hilltop algorithm is closely associated with the Google search engine. Understanding the Hilltop search algorithm therefore helps us to understand the workings of the Google system and how a website can be adjusted so that it ranks in the top 10 results when a relevant keyword-based query is processed by the Google search engine.

The Hilltop algorithm has two main phases: expert lookup and target ranking. Expert lookup determines which pages are expert pages on a certain topic. An expert page provides links to many non-affiliated pages on the same topic. Pages are said to be non-affiliated if they are written by authors who do not belong to the same group or organization. Expert documents are thus identified as those which provide recommendations and links to other web pages. These expert pages are usually highly ranked and of superior quality in structure and programming.

The information on these pages is highly relevant, and the links they provide are treated as official and authorized recommendations. To find the expert pages, the algorithm examines each pair of linked web pages and determines whether their authors belong to the same organization. “The affiliation relation is transitive: if A and B are affiliated and B and C are affiliated then we take A and C to be affiliated even if there is no direct evidence of the fact. In practice, this may cause some non-affiliated hosts to be classified as affiliated. This may also happen, for example, if multiple, independent web sites are hosted by the same service provider. However, this is acceptable since this relation is intended to be conservative.” (Bharat & Mihaila, 2001)
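The transitive affiliation relation in the quotation can be sketched with a union-find structure. This is only an illustration: how the direct evidence of affiliation is obtained is not covered here, so the pairs are simply passed in, and the host names and function names are hypothetical.

```python
def affiliation_groups(hosts, direct_pairs):
    """Group hosts so that affiliation is transitive (union-find)."""
    parent = {host: host for host in hosts}

    def find(host):
        # Follow parent pointers to the group representative, shortening the path.
        while parent[host] != host:
            parent[host] = parent[parent[host]]
            host = parent[host]
        return host

    # Merge the groups of every pair with direct evidence of affiliation.
    for a, b in direct_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    groups = {}
    for host in hosts:
        groups.setdefault(find(host), set()).add(host)
    return list(groups.values())


def affiliated(a, b, groups):
    """Two hosts are affiliated when they fall into the same group."""
    return any(a in group and b in group for group in groups)


# Hypothetical hosts: A-B and B-C have direct evidence, A-C does not.
groups = affiliation_groups(["A", "B", "C", "D"], [("A", "B"), ("B", "C")])
print(affiliated("A", "C", groups))  # A and C are grouped through B
print(affiliated("A", "D", groups))
```

Because groups are merged whenever any direct evidence exists, the sketch errs on the side of calling hosts affiliated, which matches the conservative intent stated in the quotation.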

After the experts are selected, the expert pages are classified according to their themes and index category. The expert pages are indexed only on the basis of the key phrases included in their content. The expert pages are then ranked according to their expert score when producing the result of the original query.

The second phase of the Hilltop algorithm is target ranking. This deals with the authority a web page has on the query topic, and ranks web pages by their degree of authority on the search topic. Specifically, results that have authority on all the fields of the user query are ranked higher and more prominently. For ease of processing, targets are identified as those pages which have at least two non-affiliated expert pages referring to them.
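The rule that a target needs at least two non-affiliated experts pointing to it can be sketched as below. This is a simplified illustration, not the full target-ranking phase: each expert is assumed to carry a precomputed organization label, so counting distinct labels counts mutually non-affiliated experts. All names and data are hypothetical.

```python
def qualifying_targets(expert_links, organisation, minimum=2):
    """Return targets linked to by at least `minimum` non-affiliated experts.

    expert_links maps an expert page to the set of pages it links to;
    organisation maps an expert page to its affiliation label.
    """
    supporters = {}
    for expert, targets in expert_links.items():
        for target in targets:
            # Experts from the same organisation count only once per target.
            supporters.setdefault(target, set()).add(organisation[expert])
    return {target for target, orgs in supporters.items() if len(orgs) >= minimum}


# Hypothetical experts: e1 and e3 belong to the same organisation.
expert_links = {"e1": {"t1", "t2"}, "e2": {"t1"}, "e3": {"t2"}}
organisation = {"e1": "org-a", "e2": "org-b", "e3": "org-a"}
targets = qualifying_targets(expert_links, organisation)
print(targets)
```

Here t2 is linked to by two experts but both belong to the same organization, so only t1 qualifies as a target.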

Compared with the algorithms used by search directories and other search engines, Hilltop delivers more relevant results. Moreover, it is more widely applicable and can be used for queries that have a broad range of fields and a large number of key phrases to compare.

Hilltop is one of the main concepts underlying the Google search engine’s algorithm, which allows dynamic queries to be processed while the number of websites and web pages online keeps growing at an exponential rate. “Google has obviously not implemented Hilltop in its pure form, but rather uses the principles of topical communities and authority in its algorithm. Likewise, other search engines such as MSN and Yahoo! are not using Hilltop per se, but rather similar algorithmic features. Thus when I mention ‘Hilltop’ I am referring to not just the specific paper published by Bharat and Mihaila, but also to the fundamental theory upon which any authority-based link popularity algorithm is based. This theory applies to Topic-Sensitive PageRank, etc.” (Hagans, ‘Link Building for Hilltop’)

In essence, the Hilltop algorithm uses the link structure of a community of topics, which is compared with the query topic to determine the relevancy of the results, so that the search results are displayed in order from most relevant to least relevant. It also overcomes a flaw of the PageRank system patented by Google, which assigns each page an absolute measure of importance.

The algorithm classifies some web pages as expert documents and others as authorities. An expert document is any web page which provides links to many non-affiliated pages, while an authority is a page to which the expert web pages link.

A related problem is that it is difficult for the Hilltop algorithm to offer a long-term solution for search engines. This is mostly because many people who develop websites and post information on the web do not update it, even though new, updated websites may be created. Under the topicality method, the older sites may rank as more relevant in the search while providing outdated information. “The nature of the World Wide Web dictates that it will take time for a new Web site to get links from within its topical community. Many hubs such as resource lists or niche directories are only updated periodically with new links. Still others are static pages that will never be changed. Then there is the “human factor.” It takes time for a Web site to be recognized as valuable, and for webmasters to trust it enough to link to it. Older authority sites and hubs also tend to link to other older authority sites, creating a sort of self-perpetuating authority set” (Hagans, ‘Link Building for Hilltop’) As a result, it is important for a search algorithm to be able to distinguish outdated data on web pages from relevant data.

Another problem with the topicality-based technique is that it is very difficult for new websites to become expert websites and documents, as they are new to the World Wide Web. As a result, even if they provide more relevant information for the query, they may be ranked low because of the topicality-based nature of the search algorithm used by the search engine.

The challenge that exists in determining the best algorithm for search engines is to make sure that the most relevant data is displayed for the query, and that searching according to the fields in the query is as fast as possible without requiring a substantial amount of memory on the part of the server or the individual user. Aside from this, the search algorithm should also detect spam and show only spam-free web pages in the results.

2.3. Things to avoid while conducting SEO

Spamming search engines is a hollow attempt to increase a website’s ranking. While black hat Search Engine Optimization spamming techniques may increase a website’s ranking temporarily, they produce severe long-term effects which are mostly undesirable for the spammer himself. Spamming not only frustrates the searcher but also damages the reputation of the search engines, and the major search engines do not take this lightly: they penalize or even ban websites that use underhanded or irrelevant techniques to boost the value and listing of their website. The most common practices that are considered punishable are as follows and should always be avoided.

2.3.1. Duplicate domains:

One of the most common tricks is to create multiple websites with different domain names that host the same content, in order to boost the listing and value of a website. This clutters search results with identical data, which frustrates the searcher who opens different websites only to discover that they are exact copies of each other. Instead, people should use proper keywords on their web pages, based on the parameters of the search engine in which they want to increase their website’s value.

2.3.2. Door way pages:

The creation of doorway pages is a similar technique. Here web designers create several pages within the same website that host identical content, in order to increase the number of search hits the site receives from crawler-based engines. The remedy for this sort of web abuse is to focus on altering and optimizing the keywords on the actual page.

2.3.3. Keyword camouflage:

Duplicating keywords on a website and then hiding them by using the same colour as the background was once one of the techniques most abused by web designers to fool crawler-based search engines. People should avoid this and try to make the content of their website as descriptive as possible for their viewers.

2.3.4. Link farming:

Link farming is also considered a shady way to increase hits to a website. Links to the website are created through illegitimate linking websites. Proper methods should be used to increase a website’s link popularity; shying away from paying a popular website to host a link is considered a wasted opportunity. As a general rule, sites should not be linked to if they are unknown.

2.3.5. “Meta tag spamming:

In the early days, search engines read the Meta keyword, description, and other tags. Based on the content of those tags, the search engine determined what your page was about and where to rank it in the SERPs. Unfortunately, people took advantage of this and filled their Meta tags with the same words repeated to make the search engine think that the page had more content than actually existed. This practice misleads the user and the search engine.

For example, if we wanted to bring in Linux users to the developerWorks main page, but lacked any Linux content, we might do this: <meta name="keywords" content="Linux, IBM, Linux, developer, tutorials, IBM, developer, Linux, tutorial, tutorial, tutorials, resources, Linux, tutorials, developer" />. The user will be disappointed when he or she clicks on the site listed in the SERP and is shown the developerWorks main page, which might have one Linux tutorial in the rotation for the week, but is certainly not focused specifically on Linux tutorials the same way as the developerWorks Linux section. So many people practiced the black hat Search Engine Optimization technique of meta tag spamming, search engines no longer use the information in meta tags to rank pages.

2.3.6. Alternate tag stuffing:

Misuse of alternate tags is also black hat Search Engine Optimization because it gives the user and the search engine misleading information about a graphic. For example, a graphic with the alt tag stuffed with keywords on the developerWorks Linux page might read as follows: <img alt="Linux, IBM, Linux, developer, tutorials, IBM, developer, Linux, tutorial, tutorial, tutorials, resources, Linux, tutorials, developer" />. Although the Linux page might be about Linux tutorials, ensure that the graphic itself sends clear information to the page reader about its content. Otherwise, it’s a misuse of the alt tag.”(Jeanette Banks, Feb 2006)

2.3.7. Avoid Spamming:

Earlier it was possible to get away with spamming the search engines. However, now that the search engines have specifically started to look out for spammers and to blacklist them, spamming and the black hat SEO strategies that lead to it are not recommended.

“ Search engine spamming attempts usually center around being top ranked for extremely popular keywords. It can try and fight that battle against other sites, but then be prepared to spend a lot of time each week, if not each day, defending your ranking. That effort usually would be better spent on networking and alternative forms of publicity, described below. If the practical reasons aren’t enough, how about some ethical ones? The content of most web pages ought to be enough for search engines to determine relevancy without webmasters having to resort to repeating keywords for no reason other than to try and “beat” other web pages. The stakes will simply keep rising, and users will also begin to hate sites that undertake these measures. Consider search engine spamming against spam email. No one likes spam email, and sites that use spam email services often face a backlash from those on the receiving end. Sites that spam search engines degrade the value of search engine listings. As the problem has grown, these sites now face the same backlash that spam email generates.” (Sullivan, 2007)

2.4. Techniques for Search Engine Optimization

The following section provides insight into how to conduct search engine optimization in order to improve the ranking of a website in the search results displayed by the major search engines. However, as per the requirements of the paper, the main focus is again the Google search engine.

The factors that need to be focused on when adjusting the website for increasing its ranking and for conducting Search Engine Optimization are:

2.4.1. Identify the appropriate target Keywords:

The website should be built around a specific theme and should make use of relevant keywords that are core to the theme. Each web page of the website will have its own keywords. Almost all search engines use the keyword method, in combination with other methods, to identify and rank web pages. Therefore, in order to attract customers and visitors to the website and increase its ranking in the search results, it is important to identify the keywords which appropriately describe the subject of the web page and use them as often as possible in the contents of the web page.

These key phrases should usually be more than two words in length, as this reduces the amount of competition that the website will face in the search process. Moreover, “Great links do little if you don’t lace your site with good search phrases. Search, after all, is how most potential customers find you. You can use tools such as Wordtracker to find the phrases that people tend to type when looking for a specific product. But this has its shortcomings. Curiously, popular phrases aren’t always the best. One way to find the most effective phrases is to buy ads on Google’s paid search side. You bid on keywords related phrases, such as “discount kitchen cabinets” -to ensure that your ad ranks high. Then, using Google Analytics or another system, track which phrases best convert into sales. The terms that work for paid search typically work in organic search. “Learn that early,” Boser advises, “or you’re going to waste a lot of time and money.”(Sloan, 2007)

2.4.2. Location:

One of the main constituents of the search algorithm of all major search engines is the location of the keywords. Most search engines check whether the keywords are present in the meta tags of the web page. This is especially true of the Yahoo and MSN Live search engines. The Google search engine, however, checks whether the keywords are present in the contents of the web page. Special importance is given if the keywords appear near the top of the web page.

The best location for the keywords is the initial section of the website’s content. Aside from this, key phrases can be formally stated in the meta tags of the web page as well as in the title tag. A small number of the most relevant key phrases should be used in the title tag, because the number and relevance of the words in the title tag result in a much higher weight being assigned by the search engines.

The captions for images should contain the key phrases wherever appropriate, as the search engines cannot read graphics and pictures and therefore cannot determine what a picture or graphic is about. Describing the context of the image or graphic in its caption helps the search engine assign the web page a higher weight.

The heading structure of the website should be properly organized and understood. Placing the relevant key phrases in the header tags can increase the probability of getting a higher weight from the search engines. This happens because the search engines assign more weight to keywords in headers, and in particular according to the priority of the tags: keywords in the H1 tag are assigned more weight than text in the H4 or the H6 tag.
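A page can be checked for where a key phrase appears with a short script. The sketch below uses Python's standard `html.parser` module to report whether a phrase occurs in the title, in any header tag, or in image alt text. It only reports locations; it does not attempt to reproduce any search engine's weighting, which is not public. The class name, function name and sample page are hypothetical.

```python
from html.parser import HTMLParser


class LocationAudit(HTMLParser):
    """Collect the text found in the title, header tags and image alt text."""

    TRACKED = {"title", "h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__()
        self.current = None  # tracked tag we are currently inside, if any
        self.text = {}       # location name -> list of text chunks

    def handle_starttag(self, tag, attrs):
        if tag in self.TRACKED:
            self.current = tag
        elif tag == "img":
            alt = dict(attrs).get("alt") or ""
            self.text.setdefault("img alt", []).append(alt)

    def handle_endtag(self, tag):
        if tag == self.current:
            self.current = None

    def handle_data(self, data):
        if self.current:
            self.text.setdefault(self.current, []).append(data)


def phrase_locations(html, phrase):
    """Return the sorted list of tracked locations that contain the phrase."""
    audit = LocationAudit()
    audit.feed(html)
    phrase = phrase.lower()
    return sorted(
        place for place, chunks in audit.text.items()
        if any(phrase in chunk.lower() for chunk in chunks)
    )


# Hypothetical page, for illustration only.
page = (
    "<html><head><title>Milton Keynes Taxi</title></head><body>"
    "<h1>Milton Keynes Taxi</h1><h4>Contact us</h4>"
    '<img src="cab.jpg" alt="Airport taxi">'
    "<p>Book a Milton Keynes taxi today.</p></body></html>"
)
print(phrase_locations(page, "milton keynes taxi"))
print(phrase_locations(page, "airport taxi"))
```

A phrase that shows up only in body paragraphs returns an empty list, which is a quick signal that it could be moved into the title or a header.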

2.4.3. Frequency:

Use the keywords that have been identified as many times as possible in the content of the web page. This means that, when using the keywords, no more than 4-5 non-keyword phrases should be used, to keep their integrity intact for the search engines.

2.4.4. Post only Relevant Content:

As mentioned earlier the key phrases should be relevant to the topic of the web page and the content on it. This is because irrelevant content will be treated as spam by the search engines which will result in the website getting blocked by the search engines.

Aside from this, the strategy is to put as much HTML text on the web page as possible. The graphics placed on the site are not readable or recognizable by the search engines; as a result, make sure to assign captions to all relevant graphics and pictures, and describe the graphics in HTML using the key phrases wherever possible. It is also important to make sure that the key phrases are easily visible. Some people try to hide keywords in the web page in order to increase its keyword density, but the search engines are aware of such techniques and have adjusted their algorithms so that text which is not clearly visible is not indexed. Moreover, put more focus on expanding the references in the text to the key phrases, to maximize the chances of the site being related to the query.

2.4.5. Off The Page Factors

Through the use of crawlers and spiders, the search engines have been able to distinguish between web pages which are authentic in nature and web pages which are heavily manipulated by web programmers to attract increased traffic without actually providing any relevant information, product or service to visitors. The off-page factors that influence the ranking criteria include link analysis and click-based evaluation. “Off the page factors are those that a webmasters cannot easily influence. Chief among these is link analysis. By analyzing how pages link to each other, a search engine can both determine what a page is about and whether that page is deemed to be “important” and thus deserving of a ranking boost. In addition, sophisticated techniques are used to screen out attempts by webmasters to build “artificial” links designed to boost their rankings.” (Sullivan, 2007)

Click-based evaluation determines how many people are actually accessing the website. Websites and web pages which do not attract many clicks or visitors are ranked lower on the search scale, as the search engines deem them to be of less importance. Therefore, to ensure that the website is ranked among the top results on the first page, it is also important to keep attracting customers and visitors to the website.

2.4.6. Making way for Inbound Links

Almost all search engines make use of link-based analysis. This takes the form of ranking higher those web pages which are linked to by other websites, since such links increase the integrity and importance of the content of the web page in the eyes of the search engine. The key factor here is to get links from superior-quality, highly ranked websites, as this increases the chances of the website also being ranked in a higher slot. “Here’s one simple means to find those good links. Go to the major search engines. Search for your target keywords. Look at the pages that appear in the top results. Now visit those pages and ask the site owners if they will link to you. Not everyone will, especially sites that are extremely competitive with yours. However, there will be non-competitive sites that will link to you – especially if you offer to link back.” (Sullivan, 2007) This is a good method, as web pages backed by authorities are ranked higher than stand-alone or randomly referenced and linked ones.

Other methods which are fast gaining popularity due to their success prompt the user to perform a task and access the web page. These actionable methods take the form of link baiting. Link-baiting techniques include having very relevant content on another page of the website, providing links to an internal online tools section, or providing a link to a downloadable item on the website.

“Obtaining links from quality hubs and authorities is easier said than done. One can however use certain methods to get links quickly. These methods include but are not limited to: offering to swap links; submitting a relevant, well-written press release; submitting a relevant, well-written article with your Web site’s URL hyperlinked and embedded in the copy; offering to buy or rent a links; and, of course, writing a lot of great content” (Hagans, ‘Link Building for Hilltop’)

2.4.7. Maintain and Update the Website/ Web page:

For long-term positive results from SEO, make sure to check the listing of the website every two weeks or so, as the listing and ranking tend to change as more and more information is posted on the internet. Also keep an eye open for any problems that might occur with the graphics, the structure of the website, the links leading to and from the website, and the downloadable items, as these can become affected and corrupted as time passes.

The common errors that occur due to the passage of time are:

Broken links: these are pages which can no longer be found or accessed via the posted link. Broken links can suggest to the search engine that the website is not updated often, and as a result the search engine can rank the website much lower than fresh and updated sites. Aside from this, errors like ‘Page Not Found’ can be highly annoying for visitors, who might not come back to the website and might tell their friends that it is an inactive website. As a result, the traffic coming to the website can fall over time.
Missing Images: It is also possible for links to images to malfunction. This problem can affect the structure of the page and increase its loading time, which again can annoy visitors.
Large Documents: If documents are very large, they too can take a long time to load. Moreover, on web pages that are too long, search engine spiders and crawlers may be unable to access and go through the entire contents, thereby reducing the overall ranking of the website or web page.
No Sitemap: An HTML-based site map can direct the audience as well as the search engine to the different web pages, making them all accessible. With a graphical site map, or no site map at all, the website loses the chance of getting the search engines to view and access all the information contained on it.
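The first two errors in the list above, broken links and missing images, can be checked for routinely. The following is a minimal sketch under stated assumptions: it collects link and image URLs from a page with Python's standard `html.parser`, and asks a caller-supplied function for each URL's HTTP status. That function (`get_status` here) is hypothetical; in practice it could wrap a real HTTP request. The sample page and status table are invented for illustration.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the targets of <a href=...> and <img src=...> on a page."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            self.urls.append(attrs["href"])
        elif tag == "img" and attrs.get("src"):
            self.urls.append(attrs["src"])


def find_broken(html, get_status):
    """Return the URLs whose status code signals an error (400 or above)."""
    collector = LinkCollector()
    collector.feed(html)
    return [url for url in collector.urls if get_status(url) >= 400]


# Hypothetical page and canned status codes standing in for real requests.
page = (
    '<a href="/fares.html">Fares</a> '
    '<a href="/old-page.html">Old offers</a> '
    '<img src="/img/cab.jpg" alt="cab">'
)
statuses = {"/fares.html": 200, "/old-page.html": 404, "/img/cab.jpg": 404}
broken = find_broken(page, lambda url: statuses.get(url, 404))
print(broken)
```

Passing the status lookup in as a function keeps the sketch runnable without network access and makes it easy to swap in a real checker later.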

Aside from this, if the content placed on the site is highly volatile and keeps changing, then keep updating the website as well as the key phrases. It is also important, when making changes to the website, to resubmit it to the search engines.

Also make sure to keep updating the website, as the search engines have developed their search algorithms to be very specific and can determine how often a website changes. If the interval between changes is more than 6 months, they might visit it only once or twice a year to re-index it, which can lower the ranking of the website. Therefore it is important to keep changing and updating the content of the website, so that the search engines know that the website is highly active and visit it often to re-index the web pages.

2.4.8. Registering with Directories and Search Engines:

Another option that is available for increasing the ranking of the websites is by formally registering with certain directories as well as with the search engines. By registering with the Google search engine through the option of ‘add a URL’ the website can be listed in the search engine records in as little a time as two weeks.

Aside from this, the ranking of the website in the search engines, especially Google, can be increased by formally registering with directories such as Yahoo, Looksmart or the online directory Dmoz.org. This is because Google’s search algorithm assigns PR points to websites listed in these directories without regard for the individual PR ranking of the websites. This method is not preferable for long-term high ranking, but it can be used to get a high ranking initially and then maintain the position by improving the SEO strategies being employed.

For long-term results, it is best to focus the search engine optimization strategy on encouraging the crawlers and spiders used by Google to visit the website and continuously index and rank it.

If SEO is being conducted specifically for the Google search engine, then it must be borne in mind that Google mostly obtains the results for its ranking and indexing through its crawlers and spiders. Aside from this, the Google spider is able to identify and index only the text on the page.

Moreover it is only possible for Google to index those files which have the following as their extension: html, pdf, ps, wk1, wk2, wk3, wk4, wk5, wki, wks, wku, lwp, mw, xls, ppt, doc, wks, wps, wdb, wri, rtf, swf, ans, and txt.
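When preparing files for the site, the list above can be used as a simple check. The sketch below tests a file name against exactly the extensions listed in this paper; the function name and the sample file names are hypothetical, and the list itself reflects the paper's claim rather than any current specification.

```python
import os

# Extensions taken from the list in this paper.
INDEXABLE = {
    "html", "pdf", "ps", "wk1", "wk2", "wk3", "wk4", "wk5", "wki", "wks",
    "wku", "lwp", "mw", "xls", "ppt", "doc", "wps", "wdb", "wri", "rtf",
    "swf", "ans", "txt",
}


def is_indexable(filename):
    """Check a file name's extension against the list, ignoring case."""
    extension = os.path.splitext(filename)[1].lstrip(".").lower()
    return extension in INDEXABLE


for name in ["fares.html", "prices.PDF", "booking.exe"]:
    print(name, is_indexable(name))
```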

The Google indexing system also places special emphasis on page content as well as link popularity, which is not characteristic of other search engines. This means that the ranking of a web page is determined partly by the content posted on it and partly by the quantity and quality of the links that are posted on the site. The quality and ranking of the websites that provide inbound links to the website are also important, as they can encourage the search engine to give the website a higher ranking if they have themselves been ranked highly in the Google search engine.

Google itself mentions that it determines the results for a user’s query based on more than 100 factors. It is also known that Google uses the Hilltop algorithm to an extent and utilizes its own patented PageRank system to analyze the link structure and assign ranks to websites based on link analysis. Aside from this, the Google search engine also conducts hypertext-matching analysis to determine the web pages that are relevant to the query.

The search engine optimization checklist developed by Krol in 2004 states that:

Website developers should determine the primary objectives for making the website, i.e., is the purpose of the website to provide information or to conduct business via customer interaction?
In order to have a highly ranked website, the website owners, developers and managers should dedicate all available resources to optimizing the website for the search engines.
When developing the websites an initial keyword, key phrase list should also be developed beforehand to determine the theme for the web pages
A content strategy should be implemented which focuses on supporting the keywords and also provides for updating and adding more relevant content to the website
Get the website listed on the search engines and make sure that the web site is accessible by the search engines.
Keep on checking the website for any technical problems that might occur and fix these problems as soon as possible.
Make sure to submit the main key words and key phrases for the website to all the search engines
Try to focus on increasing the PR of the website

2.4.9. Achieving High Ranking without Chasing Algorithms:

Until now we have only discussed methods by which we can improve the ranking of the website by making it more favorable to the algorithm. There are other ways in which the ranking of a website can be increased. The following steps provide some ways of increasing the ranking of the website without chasing algorithms.

Stop obsessing about the search engine ranking as the search engines are not the real target audience for the website. They do not buy the products or avail the services that might be depicted on the website. Instead the web pages can be set up and structured for the benefit of the target audience so that word of mouth and the high quality experience on the part of the target audience pulls in more traffic.

Even if the algorithms keep on changing their criteria, keep the key phrase of the website consistent and repetitive in the website. Do not hesitate to use the key phrases liberally in the website wherever relevant and applicable.

Make sure that the site revolves around the key phrases that have been determined, and do not deviate from these phrases. Additionally, make sure that each web page has its own key phrases.

Make sure to write quality content on the website. Keep updating the information that has been posted and make it approachable and easy to navigate and read for the target audience. Aside from this, keep things interesting, as this will bring a lot of traffic to the website. Moreover, the focus should be on the keywords and phrases that have already been determined. Theme-building software available on the net can also be used for this purpose.

Lastly, you should also recognize that both the on-page factors listed in this paper and the off-page factors identified in this context are important. You should not forgo one for the other, and should instead make use of both for optimum results and a highly ranked web page.

2.5. Methodology

2.5.1. Keyword Research

The first and most important step I followed was to find the right keywords. To achieve this I used some tools available on the market; using these tools I found which keywords were best for my site and then designed the web pages around these keywords.

First I needed to know how keyword research works. It is a kind of supply-and-demand-based strategy: how many people search for a particular keyword, and how many websites supply an answer to their demand.

It is a five-step process:

Identify all possible keywords that a searcher on the internet might use.
Organize them by the number of searches versus the number of websites.
Choose the keywords which have the highest demand and the lowest supply.
Analyze your competitors on the search engine for the keywords you have chosen, and how well they use these keywords.
Design your web pages around these keywords.
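Steps 2 and 3 above can be sketched as a small calculation. The function below ranks candidate keywords by their supply-to-demand ratio, lowest first, so that phrases with many searches and few competing websites come out on top. The function name is hypothetical, and the figures are invented purely for illustration; they are not the measurements made for this project.

```python
def rank_keywords(stats):
    """Order keywords by supply/demand ratio, lowest (best) first.

    stats maps a keyword to (searches this month, websites with a keyword match).
    """
    rows = []
    for keyword, (demand, supply) in stats.items():
        if demand > 0:  # skip keywords nobody searches for
            rows.append((supply / demand, keyword))
    return [keyword for ratio, keyword in sorted(rows)]


# Invented example figures, NOT real measurements.
candidates = {
    "milton keynes taxis": (1200, 30000),
    "taxi": (90000, 50000000),
    "airport transfer": (4000, 900000),
}
ranking = rank_keywords(candidates)
print(ranking)
```

With these invented numbers the specific three-word phrase ranks ahead of the broad single word, which illustrates why longer, more targeted phrases face less competition.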

Keyword	Demand (Searches this month)	Supply (Websites with a keyword match)	R/S (Supply/Demand)

I found the most appropriate keyword to be “milton keynes taxis”, which is used by more searchers and offered by fewer web pages. So I decided to take it as the master keyword for my site. The other keywords I used include “Airport Transfer”, “24 Hour Taxi”, “airport transfer taxi service”, “Reliable airport pickup” and “Airport Taxis”.

I used these keywords in my site and got the best results. The image below shows how I used these keywords in my site.

Fig. Snapshot of my site

2.5.2. Keyword Density

The next step I applied was to increase the density of the selected keywords. I increased the density of all the keywords which I selected in step 1. The table below provides information about the density of the keywords.

Keyword	Count	Density
milton keynes taxi	5	0.65%
airport to hotel	5	0.65%
milton keynes taxis	5	0.65%
taxi milton keynes	3	0.39%
private hire taxi	2	0.26%
airport taxi milton	2	0.26%
milton keynes the	2	0.26%
hire taxi company	2	0.26%
milton keynes milton	2	0.26%
hire milton keynes	2	0.26%
milton keynes airport	2	0.26%
hotel cab service	2	0.26%
birmingham airport taxi	2	0.26%
manchester airport transfer	2	0.26%
airport transfer manchester	2	0.26%
hotel taxi service	2	0.26%

As shown above, I increased the keyword density in my site, including most of the keywords I selected in step 1.
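
A density table of this kind can be reproduced with a short script. The sketch below is an assumption about how such figures are typically calculated, not the exact method of the tool used for this project: it counts every three-word phrase on the page and divides each count by the total number of words. Under that definition, a count of 5 at 0.65% would correspond to a page of roughly 770 words. The function name phrase_density is hypothetical.

```python
import re
from collections import Counter


def phrase_density(text, n=3):
    """Return {phrase: (count, density_percent)} for every n-word phrase.

    Density is assumed here to be count / total words * 100.
    """
    words = re.findall(r"[a-z0-9]+", text.lower())
    total = len(words)
    if total < n:
        return {}
    # Slide a window of n words across the text and count each phrase.
    counts = Counter(
        " ".join(words[i:i + n]) for i in range(total - n + 1)
    )
    return {
        phrase: (count, 100.0 * count / total)
        for phrase, count in counts.items()
    }
```

Because every window of three words is counted, phrases that span two sentences also show up, which may explain entries such as “milton keynes the” in the table.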

2.5.3. Bold Keyword and Title

In this step I used the selected keywords as titles, writing them in bold in the title line. For example, in the figure above I used the keywords “Milton Keynes Taxi” and “Airport taxi” as titles and wrote them in bold. This is intended to increase the search credibility of my site. I also wrote the other keywords in bold and tried to increase their density.
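
Marking keywords in bold can be automated when the page text is generated by a script. The following is a minimal sketch, assuming the text is plain (it does not try to skip existing HTML tags) and using a hypothetical helper name, bold_keywords. It wraps each occurrence of a keyword in b tags, matching without regard to case and preferring longer phrases over shorter ones.

```python
import re


def bold_keywords(text, keywords):
    """Wrap every occurrence of the given keywords in <b>...</b> tags."""
    if not keywords:
        return text
    # Try longer phrases first so "milton keynes taxi" wins over "taxi".
    ordered = sorted(keywords, key=len, reverse=True)
    pattern = re.compile(
        "|".join(re.escape(keyword) for keyword in ordered),
        re.IGNORECASE,
    )
    return pattern.sub(lambda match: f"<b>{match.group(0)}</b>", text)
```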

2.5.4. Keywords are Used in Meta Tags

I used all of these keywords, together with a description, in the meta tags of my site. Meta tags are used by some search engines, such as Yahoo and MSN.

<meta name="description" content="This site is developed to provide a taxi service in Milton Keynes town; the business includes pick-up and drop-off facilities.">

<meta name="keywords" content="milton keynes taxi, airport to hotel, milton keynes taxis, taxi milton keynes, private hire taxi, airport taxi milton, milton keynes the, hire taxi company, milton keynes airport, birmingham airport taxi, hotel taxi service">
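
In standard HTML, a description and a keyword list are written as meta elements with name and content attributes inside the head of the page. The sketch below shows one way such tags could be generated from a description and a list of keywords. The function name build_meta_tags is hypothetical, and escaping the values with html.escape is a precaution added here so that characters such as quotes or ampersands do not break the markup.

```python
from html import escape


def build_meta_tags(description, keywords):
    """Return the description and keywords meta tags as HTML strings."""
    # Drop empty entries and surrounding spaces before joining.
    keyword_list = ", ".join(k.strip() for k in keywords if k.strip())
    return [
        f'<meta name="description" content="{escape(description, quote=True)}">',
        f'<meta name="keywords" content="{escape(keyword_list, quote=True)}">',
    ]


tags = build_meta_tags(
    "Taxi service in Milton Keynes",
    ["milton keynes taxi", "airport to hotel", "milton keynes taxis"],
)
```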

2.5.5. Keywords are Used in H1 Tags

From the keywords selected in step 1, I chose the most important ones and used them in the Title tag and the H1 tags of my site.

<Title> milton keynes taxi, airport to hotel, milton keynes taxis, taxi milton keynes, private hire taxi, airport taxi milton, milton keynes the, hire taxi company, milton keynes airport, birmingham airport taxi, hotel taxi service </Title>

<H1> Milton Keynes Taxi </H1>

<H1> Airport Taxis </H1>
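
Whether a keyword actually appears in the Title and H1 tags of a page can be checked with Python's standard html.parser module. This is a minimal sketch for illustration: the class name HeadingCollector and the function keyword_placement are hypothetical, and the check is a simple case-insensitive substring match.

```python
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Collect the text found inside <title> and <h1> elements."""

    def __init__(self):
        super().__init__()
        self._current = None
        self.found = {"title": [], "h1": []}

    def handle_starttag(self, tag, attrs):
        # HTMLParser reports tag names in lower case.
        if tag in self.found:
            self._current = tag
            self.found[tag].append("")

    def handle_endtag(self, tag):
        if tag == self._current:
            self._current = None

    def handle_data(self, data):
        if self._current is not None:
            self.found[self._current][-1] += data


def keyword_placement(html, keyword):
    """Report whether the keyword occurs in any title or h1 element."""
    collector = HeadingCollector()
    collector.feed(html)
    collector.close()
    wanted = keyword.lower()
    return {
        tag: any(wanted in text.lower() for text in texts)
        for tag, texts in collector.found.items()
    }
```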

2.5.6. Making Links and Improving Page Rank

In this step I started building links to my site from other sites, which increases my site's page rank. Link building is a continuous process; it takes a long time to build links and improve the page rank of any site. Generally, building 100 links takes 10 to 15 days. I am therefore continuously involved in this process and try to build 10 links per day, so that over time my site's rank should improve.
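
Why inbound links can raise a page's rank can be illustrated with a simplified version of the PageRank calculation described by Brin and Page. This is a textbook-style sketch, not the ranking actually used by Google in production. The page names, the damping factor of 0.85 and the iteration count are assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank. links maps each page to the pages it links to."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with the same small base share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its rank on equally to the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank evenly.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical link graphs: the taxi site before and after gaining links.
before = pagerank({
    "directory": ["hotel"],
    "hotel": ["directory"],
    "taxi": ["directory"],
})
after = pagerank({
    "directory": ["hotel", "taxi"],
    "hotel": ["directory", "taxi"],
    "taxi": ["directory"],
})
```

In this toy graph the taxi page has no inbound links at first, so it keeps only the base share; once the two other pages link to it, its score rises.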

These are the steps I followed to improve the search credibility of my taxi site, which will help my business.

Conclusion and Evaluation

The report gives an overview of the structure of the Google search engine, as well as the structure of search engines that use human-powered directories. Google's parameters have been researched, and how to implement them to maximize the ranking of a website has been discussed. Implementing these parameters (keyword placing, HTML tagging, stamp collecting, context relevance, avoiding frame links, avoiding dynamic door blocks, etc.) on a site can result in the site ranking at the top of the Google results page, thereby adding value to the site and its contents. The general parameters of other search engines have also been covered; implementing them would give similar results.

The search engine technologies of Google, Yahoo and MSN have also been discussed. Furthermore, various Search Engine Optimization techniques that help to boost the value of a website on a search engine such as Google have been researched. Search Engine Optimization spamming techniques (duplicate domains, keyword camouflage, doorway pages, link farming, Meta tag spamming, alternate tag spamming) have also been discussed and their consequences listed.

These parameters have been implemented to boost the ranking of a website for a taxi business in Milton Keynes.

For the implementation, the following techniques have been used:

Use of white hat link building techniques, such as asking permission to host a link to the taxi business website on a popular website, to improve the page rank of the taxi website.
Keywords describing the taxi business site, such as “Milton Keynes Taxi”, “Taxi”, “Milton”, “Keynes” and “Milton Keynes”, are used repeatedly in the headings and body text, in bold and italic. The HTML tags have also been changed to include the keywords. This helps increase the site's searchability on the net.
Tools were sourced from the market and the internet, bought, and used to find better keywords.

These activities, together with the implementation of Google's parameters and the various rank-improving techniques, have increased the taxi business site's overall ranking in search engines.

When the keywords “Milton Keynes taxis” are entered into the Google search engine, the upgraded taxi business site appears on the first page among thirty nine million five thousand other results. A problem appears, however, when the keywords are arranged in a different order or phrased differently, as in “taxis in Milton Keynes”: the site then appears on the second page among three million ninety three thousand results. The reason behind this anomaly therefore has to be found, and measures taken to ensure that the site is ranked on the first page.

By utilizing the SEO strategies described above, the website for the Milton Keynes taxi service can be optimized so that it ranks in the top 10 Google results when any combination of ‘Milton’, ‘Keynes’ and ‘Taxi’ is typed into the search engine. The main purpose of achieving a higher ranking in the Google results is to increase the number of visitors, and thereby the incoming traffic, to the website. This will improve sales for the taxi shuttle business, as customers will be able to order their shuttle taxi in advance according to their own schedule.

This strategy would also enable the business to gain the upper hand and an edge over others in the business by having a prominent and successful online option for its operations.

Bibliography

Sullivan, D., 2007, ‘How Search Engines Work’, available at: http://searchenginewatch.com/showPage.html?page=2168031
Singhal, A., Kaszkiel, M., ‘A Case Study in Web Search using TREC Algorithms’ available at: http://www10.org/cdrom/papers/317/
Jansen, B.J., 2000, ‘An Investigation into the Use of Simple Queries on Web IR Systems’, Information Research 2000, available at: http://jimjansen.tripod.com/academic/pubs/ir2000/ir2000.html
Yuwono, B., 1995, A ‘World Wide Web Resource Discovery System’, WWW4 Conference, available at: http://www.w3.org/Conferences/WWW4/Papers/66/
Rappoport, A., 1999, ‘Report from the 1999 Search Engines Meeting’, available at: http://www.searchtools.com/slides/searchengines1998/index.html
Navarro-Prieto, R., Scaife, M., Rogers, Y., 1999, ‘Cognitive Strategies in Web Searching’, Proceedings of the Human Factors & the Web conference, available at: http://zing.ncsl.nist.gov/hfweb/proceedings/navarro-prieto/index.html
Bharat, K., Mihaila, A. G., 2001, ‘When Experts Agree: Using Non-Affiliated Experts to Rank Popular Topics’, available at:http://www10.org/cdrom/papers/474/index.html
Cho, J., Garcia-Molina, H., Paepcke, A., Raghavan, S., 2001, ‘Searching the Web ‘, Stanford University, available at: http://dbpubs.stanford.edu:8090/pub/showDoc.Fulltext?lang=en&doc=2000-37&format=pdf&compression
Cho, J., Garcia-Molina, H., Page, L., 1998, ‘Efficient Crawling Through URL Ordering’, Stanford University, available at: http://dbpubs.stanford.edu:8090/pub/showDoc.Fulltext?lang=en&doc=1998-51&format=pdf&compression
Hagans, A., ‘Link Building for Hilltop’, available at: http://www.seohelpandinfo.com/link-building-for-hilltop-a282.html
Brin, S., Page, L., ‘The Anatomy of a Large-Scale Hypertextual Web Search Engine’, available at: http://infolab.stanford.edu/~backrub/google.html
Nobles, R., 2005, ‘Chasing the Search Engines Algorithms. . . Should you or shouldn’t you?’ available at: http://www.searchengineguide.com/aws/2005/1005_aws1.html
2004, ‘The Google Hilltop Algorithm’, available at: http://www.rankforsales.com/search-engine-algorithms/google-hilltop-algorithm.html
‘Search Engine Optimization’, Search Engine Guide, available at: http://www.searchengineguide.com/optimization.html
Fonseca, B., Golgher, P., Possas, B., RibeiroNeto, B., ‘Concept Based Interactive Query Expansion’, available at: http://homepages.dcc.ufmg.br/~nivio/papers/cikm05-fullpaper.pdf
Van Dijck, P., 2003, ‘Better Search Engine Design: Beyond Algorithms’, available at: http://www.onlamp.com/pub/a/onlamp/2003/08/21/better_search_engine.html
Banks, L. J., 2006, ‘Improve your standing in search engines’, Developer Works, available at: http://www.ibm.com/developerworks/web/library/wa-seo1.html?ca=drs-
Hawn, C., 2003, ‘Schmoozing with the Enemy’, Fast Company, Issue 76, p33, 2/3p, available at: http://web.ebscohost.com/bsi/detail?vid=1&hid=107&sid=cdf5b45d-4f5d-4048-9954-e7d5febd4a07%40sessionmgr104
2006, ‘Dancing with Google’s spiders’, Economist, , Vol. 378 Issue 8468, Special Section p14-15, available at: http://web.ebscohost.com/bsi/detail?vid=1&hid=107&sid=6e20d6f7-fc22-4882-b91b-4db3a692ba80%40sessionmgr109
Armitt, C., 2005, ‘Retailers lose consumers due to poor online search strategies’, New Media Age, p8-8, 1/3p, available at http://web.ebscohost.com/bsi/detail?vid=1&hid=107&sid=4de4f735-a745-4028-9320-788f4a21f4c4%40sessionmgr104
Sloan, P., 2007, ‘How to Scale Mt. Google’, Business 2.0, Vol. 8 Issue 4, p54-54, 1p, 1c, available at: http://web.ebscohost.com/bsi/detail?vid=1&hid=107&sid=9acda171-60bc-48b8-9128-0fdd5ff7879b%40sessionmgr108
Tehrani, R., 2005, ‘Search Engine Marketing, VoIP and Other News from the Telecom Front’, Customer Interaction Solutions, Vol. 23 Issue 8, p12-14, 2p, available at: http://web.ebscohost.com/bsi/detail?vid=1&hid=107&sid=525abef8-fcb9-4a90-a811-d38b19859a1a%40sessionmgr109
Cooper, S., 2004, ‘Search Engine Optimization’, Entrepreneur, Vol. 32 Issue 12, p76-76, 1/2p, available at: http://web.ebscohost.com/bsi/detail?vid=1&hid=6&sid=8949fe99-9951-447d-af73-06056c3ba185%40sessionmgr7
2005, ‘Web Optimization Resources’, Association Management, Supplement, p12-12, 1/2p, available at: http://web.ebscohost.com/bsi/detail?vid=1&hid=4&sid=ef46172c-6eff-41b8-a8b6-3caef7e99567%40sessionmgr3
Caramia, M., Felici, G., Pezzoli, A., ‘Improving search results with data mining in a thematic search engine’, available at: http://web.ebscohost.com/bsi/detail?vid=1&hid=4&sid=f3e285dd-1a0b-413f-92e8-b2a11f96c7a4%40sessionmgr9
Sen, R., 2005, ‘Optimal Search Engine Marketing Strategy’, International Journal of Electronic Commerce, Vol. 10 Issue 1, p9-25, 17p, available at: http://web.ebscohost.com/bsi/detail?vid=1&hid=4&sid=26630223-5598-44d3-b79b-f5423d7aca31%40SRCSM1
Stone, B., 2005, ‘Hotwiring Your Search Engine’, Newsweek, Vol. 146 Issue 25, p52-54, 2p, 2c, available at: http://web.ebscohost.com/bsi/detail?vid=1&hid=4&sid=94e26bfc-21c3-432b-9dee-1569e2f7e317%40sessionmgr7
Krol, C., 2004, ‘Search marketing still not optimized’, B to B, Vol. 89 Issue 13, p1-32, 2p, available at: http://web.ebscohost.com/bsi/detail?vid=1&hid=6&sid=cabfd781-299d-4015-b405-3e87d398943a%40SRCSM2
Green, D. C., 2003, ‘Search Engine Marketing: Why it benefits us all’, Business Information Review, Vol. 20 Issue 4, p195-202, 8p, available at: http://web.ebscohost.com/bsi/detail?vid=1&hid=4&sid=98f733b2-3a97-4793-ad4d-1668d3f042fc%40sessionmgr3
Krol, C., 2003, ‘Search firms move toward full service’, B to B, Vol. 88 Issue 12, p1-33, 2p, available at: http://web.ebscohost.com/bsi/detail?vid=1&hid=6&sid=92dbd587-1ee3-4959-a469-8f914298edc8%40SRCSM2
Moore, R., 2000, ‘Veteran offers tips to optimize search engine exposure’, B to B, Vol. 85 Issue 12, p21, 1/4p, available at: http://web.ebscohost.com/bsi/detail?vid=1&hid=6&sid=29623555-f840-4aa4-8ff0-ba7fd51ad4d4%40SRCSM2
Crosman, P., 2006, ‘Three Ways to Drive Traffic from Search Engines’, Intelligent Enterprise, Vol. 9 Issue 12, p9-9, 1p, available at: http://web.ebscohost.com/bsi/detail?vid=1&hid=4&sid=d2720c9e-e1b4-4449-88f1-1654d96b33ee%40sessionmgr7
Guha, S., Koudas, N., Kyuseok S., 2006, ‘Approximation and Streaming Algorithms for Histogram Construction Problems’, ACM Transactions on Database Systems, Vol. 31 Issue 1, p396-438, 43p, available at: http://web.ebscohost.com/bsi/detail?vid=1&hid=4&sid=28d7b393-d3cc-40f8-bd32-95855bb107c9%40sessionmgr9
Feng, X., Huang, H., 2005, ‘A Fuzzy-Set-Based Reconstructed Phase Space Method for Identification of Temporal Patterns in Complex Time Series’, IEEE Transactions on Knowledge & Data Engineering, Vol. 17 Issue 5, p601-613, 13p, available at: http://web.ebscohost.com/bsi/detail?vid=1&hid=4&sid=49fd945f-3266-485c-8e6c-17f183e5c62d%40SRCSM1
Bannan, K. J., 2004, ‘Choosing the Right SEM Provider’, B to B, Vol. 89 Issue 15, p18-20, 2p, available at: http://web.ebscohost.com/bsi/detail?vid=1&hid=4&sid=9e547984-9183-4eff-82bf-8a17b2244f6e%40SRCSM1
Bannan, K. J., 2004, ‘Entrepreneur learns why it’s best to optimize site before it launches’, B to B, Vol. 89 Issue 15, p19-19, 1/2p, available at: http://web.ebscohost.com/bsi/detail?vid=1&hid=6&sid=4a0dd8c3-dc6b-48cb-83fb-8dce143b8b11%40sessionmgr3
2004, ‘Search optimization tactics to avoid’, B to B, Vol. 89 Issue 9, p19-19, 1/6p, available at: http://web.ebscohost.com/bsi/detail?vid=1&hid=6&sid=957fa234-bfbe-41c6-b69f-9ebc995c9134%40sessionmgr7
2003, ‘Seeking Search Engine Optimization’, Catalog Age, Vol. 20 Issue 1, p11, available at: http://web.ebscohost.com/bsi/detail?vid=1&hid=4&sid=562b1ef7-63a7-4e5c-97e9-d2c266a485b6%40sessionmgr8
Malaga, R. A., 2007, ‘The Value of Search Engine Optimization: An Action Research Project at a New E-Commerce Site’, Journal of Electronic Commerce in Organizations, Vol. 5 Issue 3, p68-82, 15p, 2 charts, 3 graphs, available at: http://web.ebscohost.com/bsi/detail?vid=1&hid=4&sid=aa2a54ca-cc7b-414b-85d6-7a03a5f8cf72%40sessionmgr2
Bell, R., 2007, ‘Prepare for Successful Search-Engine Optimization’, Franchising World, Vol. 39 Issue 5, p19-21, 3p, available at: http://web.ebscohost.com/bsi/detail?vid=1&hid=6&sid=25c12185-1d42-4afc-be18-b96040e4cd8b%40sessionmgr3
Kay, R., 2007, ‘Search Engine Optimization’, Computerworld, Vol. 41 Issue 23, p40-40, 1p, available at: http://web.ebscohost.com/bsi/detail?vid=1&hid=6&sid=67859f81-5ba4-4a04-94fa-5fbb22beeedd%40sessionmgr9
Ferranti, M., 2005, ‘The Evolution of Search: Chapter Two’, Brandweek, Vol. 46 Issue 27, p18-18, 1p, available at: http://web.ebscohost.com/bsi/detail?vid=1&hid=6&sid=7b338241-4044-4911-9c43-04f2d009f9fb%40sessionmgr7
Scoviak, M., 2005, ‘Optimize the Internet’, Hotels, Vol. 39 Issue 3, p45-50, 4p, 4c, available at: http://web.ebscohost.com/bsi/detail?vid=1&hid=6&sid=1fef5f53-a8ff-49f9-8e15-44471887c1c9%40sessionmgr8
Grinney, J., 2005, ‘How to target local audience’, B to B, Vol. 90 Issue 2, p21-21, 1/8p, available at: http://web.ebscohost.com/bsi/detail?vid=1&hid=6&sid=8a6f14f9-57f2-4d82-a274-628982d6803e%40SRCSM1
Spring, T., 2004, ‘Search Tangles’, PC World, Vol. 22 Issue 8, p24-26, 3p, 1c, available at: http://web.ebscohost.com/bsi/detail?vid=1&hid=6&sid=9d6c937e-9d13-43f7-a97e-2a8fdb97b629%40sessionmgr9
Krol, C., 2006, ‘Search pros now specializing’, B to B, Vol. 91 Issue 7, p1-42, 2p, available at: http://web.ebscohost.com/bsi/detail?vid=1&hid=2&sid=744cf768-2acf-4eb2-8bcd-2a069a12b038%40sessionmgr7
Bannan, K. J., 2006, ‘Full Optimization’, B to B, Vol. 91 Issue 17, p19-19, 2p, 1c, available at: http://web.ebscohost.com/bsi/detail?vid=1&hid=6&sid=73229f3a-2670-45dc-a59c-6bcfc7304a47%40sessionmgr3
Sweatt, A., 2005, ‘On the Web’, Modern Machine Shop, Vol. 78 Issue 4, p34-34, 1p, available at: http://web.ebscohost.com/bsi/detail?vid=1&hid=6&sid=0d28362c-2cec-4b2e-8e80-7765bf15dc26%40SRCSM2
Arnold, S. E., 2005, ‘Relevence and the End of Objective Hits’, Online, Vol. 29 Issue 5, p16-21, 6p, 4c, available at: http://web.ebscohost.com/bsi/detail?vid=1&hid=6&sid=a74c4143-c953-425c-b1e1-0d84c32a4390%40sessionmgr8
Brinkmeyer, J., 2005, ‘Standing out online’, Sales & Marketing Management, Vol. 157 Issue 6, p14-14, 1/2p, 1c, available at: http://web.ebscohost.com/bsi/detail?vid=1&hid=6&sid=e648b46b-feca-4531-a7ac-fc3e07ef19b8%40sessionmgr8
Robinson, K., 2005, ‘7 simple techniques ready site for search’, Marketing News, Vol. 39 Issue 6, p17-29, 3p, available at: http://web.ebscohost.com/bsi/detail?vid=1&hid=6&sid=52da367c-a8fd-4386-99f3-afc21ce032b5%40sessionmgr3
Leonard-Wilkinson, T. A., 2004, ‘How to Get Higher Search Engine Rankings’, Intercom, Vol. 51 Issue 3, p47-48, 2p, available at: http://web.ebscohost.com/bsi/pdf?vid=2&hid=2&sid=174a7fc5-aada-437e-867c-bff467c566f6%40sessionmgr8
Vuduc, R., Demmel, J. W., Bilmes, J. A., 2004, ‘Statistical Models for Empirical Search Based Performance Tuning’, International Journal of High Performance Computing Applications, Vol. 18 Issue 1, p65-94, 30p, available at: http://web.ebscohost.com/bsi/detail?vid=1&hid=2&sid=b7385ace-37e9-4406-996c-ce077c2840bf%40sessionmgr2
Grant, G., 2003, ‘Search optimization campaigns build brand’, Marketing News, Vol. 37 Issue 20, p22-24, 2p, available at: http://web.ebscohost.com/bsi/detail?vid=1&hid=2&sid=f2a30c1d-0c67-4f94-ac01-3fc7ce21bcf3%40sessionmgr8
Leonard-Wilkinson, T. A., 2002, ‘Search Engine Optimization: Keywords That Work’, Intercom, Vol. 49 Issue 4, p39, 3p, 1c, available at: http://web.ebscohost.com/bsi/detail?vid=1&hid=2&sid=80a38b5b-4174-4e6d-aa15-e3048ed0fc39%40SRCSM1
Krol, C., 2006, ‘Search spending spree continues’, B to B, Vol. 91 Issue 5, p20-24, 5p, 3 charts, 2c, available at: http://web.ebscohost.com/bsi/detail?vid=1&hid=6&sid=b6ca513d-cf8a-41f1-b73a-8455a398765f%40SRCSM2
Plosker, G., 2006, ‘Search Engine Strategies Returns to San Jose’, Information Today, Vol. 23 Issue 9, p42-42, 1p, 4c, available at: http://web.ebscohost.com/bsi/detail?vid=1&hid=6&sid=eea7952d-955d-4de9-a4ad-262c5495f0c9%40SRCSM1
Bobula, J., 2005, ‘With keyword selection, more can mean more’, B to B, Vol. 90 Issue 10, p23-23, 1/6p, available at: http://web.ebscohost.com/bsi/detail?vid=1&hid=6&sid=e8665cb0-f214-44ff-9215-a8f99be3cb7f%40SRCSM2
Bannan, K. J., 2003, ‘Make your site search engine friendly’, B to B, Vol. 88 Issue 7, p20, 1/5p, available at: http://web.ebscohost.com/bsi/detail?vid=1&hid=2&sid=64b100a2-5110-44c5-a1b4-c75a6e76877e%40sessionmgr2
Wooding, J., 2002, ‘Getting serious with search engines’, NZ Business, Vol. 16 Issue 1, p8, 2p, 1c, available at: http://web.ebscohost.com/bsi/detail?vid=1&hid=2&sid=0d183d3b-2a6e-4dac-af2f-55a6503a8ab8%40sessionmgr9
Greenberg, K., 2000, ‘Search Patterns’, Adweek, Vol. 41 Issue 37, p70, 3p, available at: http://web.ebscohost.com/bsi/detail?vid=1&hid=6&sid=949bf0ae-1eec-47c1-8030-d07a9efa88d7%40sessionmgr2
2006, ‘Problem Solved’, B to B, Vol. 91 Issue 6, p20-20, 2/3p, available at: http://web.ebscohost.com/bsi/detail?vid=1&hid=2&sid=5ceaae56-c608-400c-a1d3-9802254c2c17%40sessionmgr2
Mickey, B., 2006, ‘Search Engine Optimization’, Folio: The Magazine for Magazine Management, Vol. 35 Issue 6, p29-29, 1p, 1c, available at: http://web.ebscohost.com/bsi/detail?vid=1&hid=6&sid=8fe4a74d-fdbb-4500-ae67-7507de7d893a%40sessionmgr3
Krol, C., 2006, ‘Searching for search marketers’, B to B, Vol. 91 Issue 1, p1-32, 2p, available at: http://web.ebscohost.com/bsi/detail?vid=1&hid=6&sid=3f6c4cf5-a9ac-4220-9c07-05af2178e141%40sessionmgr7
Bentley, R., 2005, ‘Search parties’, Travel Weekly: The Choice of Travel Professionals, Issue 1791, p64-65, 2p, 2c, available at: http://web.ebscohost.com/bsi/detail?vid=1&hid=6&sid=20749c01-26e4-4293-b5ba-6f59af7b29bf%40sessionmgr7