Wouldn't it be nice to be able to put some code in your web site to tell the search engine spider crawlers to make your site number one? Unfortunately a robots.txt file or robots meta tag won't do that, but they can help the crawlers index your site better and block out the undesirable ones.

First, a little terminology:

Search Engine Spiders or Crawlers - A web crawler (also known as a web spider) is a program which browses the World Wide Web in a methodical, automated manner. Web crawlers are mainly used to create a copy of all the visited pages for later processing by a search engine, which will index the downloaded pages to provide fast searches.

A web crawler is one type of bot, or software agent. In general, it starts with a list of URLs to visit. As it visits these URLs, it identifies all the hyperlinks in the page and adds them to the list of URLs to visit, recursively browsing the Web according to a set of policies.
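The crawl loop described above can be sketched in a few lines of Python. This is only a minimal illustration, not how any real search engine works: the names `LinkExtractor`, `crawl` and `PAGES` are made up for this example, and the "web" is a small in-memory dictionary so the sketch runs without network access. The only policies it applies are "visit each URL once" and "stop after max_pages".

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(seed_urls, fetch, max_pages=100):
    """Visit pages breadth-first, starting from seed_urls.

    fetch(url) must return the page's HTML as a string, or None if the
    page is unavailable. Returns the URLs in the order they were visited.
    """
    frontier = deque(seed_urls)  # the list of URLs still to visit
    seen = set(seed_urls)        # every URL ever added to the list
    visited = []
    while frontier and len(visited) < max_pages:
        url = frontier.popleft()
        html = fetch(url)
        if html is None:
            continue
        visited.append(url)
        extractor = LinkExtractor()
        extractor.feed(html)
        for href in extractor.links:
            link = urljoin(url, href)  # resolve relative links
            if link not in seen:
                seen.add(link)
                frontier.append(link)
    return visited


# Hypothetical in-memory "web" (example.com is a placeholder domain).
PAGES = {
    "http://example.com/": '<a href="/about.html">About</a> <a href="/news.html">News</a>',
    "http://example.com/about.html": '<a href="/">Home</a>',
    "http://example.com/news.html": '<a href="/about.html">About</a>',
}

order = crawl(["http://example.com/"], PAGES.get)
print(order)
```

A well-behaved crawler would also check robots.txt before each fetch; that step is left out here to keep the loop itself visible.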

Robots.txt - The robots exclusion standard or robots.txt protocol is a convention to prevent cooperating web spiders and other web robots from accessing all or part of a website. The information specifying the parts that should not be accessed is kept in a file called robots.txt in the top-level directory of the website.

The robots.txt protocol is purely advisory, and relies on the cooperation of the web robot, so marking an area of your site out of bounds with robots.txt does not guarantee privacy. Many web site administrators have been caught out trying to use the robots file to make private parts of a website invisible to the rest of the world. However the file is necessarily publicly available and is easily checked by anyone with a web browser.
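Because the file always sits at the top level of the site, its address can be worked out from any page on that site. Here is a small sketch of that, using Python's standard `urllib.parse.urljoin`; the helper name `robots_txt_url` and the example.com address are just placeholders for illustration.

```python
from urllib.parse import urljoin


def robots_txt_url(page_url):
    """Return the address of robots.txt for the site that hosts page_url."""
    # A path starting with "/" replaces the whole path of the page URL,
    # so the result always points at the top-level directory.
    return urljoin(page_url, "/robots.txt")


print(robots_txt_url("http://example.com/blog/2008/post.html"))
# prints http://example.com/robots.txt
```

Typing that address into a browser is all it takes to read any site's robots.txt, which is why the file is a poor place to hide anything.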

The robots.txt patterns are matched by simple substring comparisons from the start of the path, so care should be taken to make sure that patterns matching directories have the final '/' character appended: otherwise all files with names starting with that substring will match, rather than just those in the directory intended.
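The effect of the missing '/' is easy to see with Python's standard `urllib.robotparser` module. The sketch below is only an illustration: the `/private` directory, the file names, the helper `allowed` and the crawler name `ExampleBot` are all made up, and other crawlers may implement matching with extra features such as wildcards.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_lines, path, agent="ExampleBot"):
    """Return True if the given robots.txt lines let agent fetch path."""
    parser = RobotFileParser()
    parser.parse(robots_lines)
    return parser.can_fetch(agent, path)


without_slash = ["User-agent: *", "Disallow: /private"]
with_slash = ["User-agent: *", "Disallow: /private/"]

# Without the final '/', anything starting with "/private" is blocked,
# including a file that merely shares the prefix.
print(allowed(without_slash, "/private/notes.html"))   # False
print(allowed(without_slash, "/private-offers.html"))  # False

# With the final '/', only the directory's contents are blocked.
print(allowed(with_slash, "/private/notes.html"))      # False
print(allowed(with_slash, "/private-offers.html"))     # True
```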

Meta Tag - Meta tags are used to provide structured data about data.

In the early 2000s, search engines veered away from reliance on Meta tags, as many web sites used inappropriate keywords, or were keyword stuffing to obtain any and all traffic possible.

Some search engines, however, still take Meta tags into some consideration when delivering results. In recent years, search engines have become smarter, penalizing websites that are cheating (by repeating the same keyword several times to get a boost in the search ranking). Instead of going up in rankings, these websites will go down in rankings or, on some search engines, will be kicked off the search engine completely.

Index a site - The act of crawling your site and gathering information.

How can the robots.txt file and meta tag help you?

In the robots.txt you can tell the undesirable 'web crawlers' to leave your web site alone, and give helpful hints to the ones you want to crawl your site. Below is an example of how to forbid a web crawler from crawling your site:

# this identifies the wayback machine
User-agent: ia_archiver
Disallow: /

ia_archiver is the crawler name for the Wayback Machine that you may have heard of, and the / after Disallow tells ia_archiver not to index any of your site. The # allows you to write comments to yourself so you can keep track of what you typed.

Type the lines above into Notepad on your computer and save the file to the root directory of your web site as robots.txt. Web crawlers look for this file first at a web site before doing anything else. This helps the crawler to do its job, and helps the web site administrator tell the spider what to do. Say for instance you have some content that you don't want the crawlers to see (like duplicate content for different visitor referrer pages).

You can stop crawlers from indexing the 'duplicate' directory by typing this into your robots.txt file.

User-agent: *
Disallow: /duplicate/

The * after User-agent says that this rule applies to all crawlers, and /duplicate/ after Disallow tells all crawlers to ignore this directory and not index it. Each User-agent line and its commands form a set, and there must be a blank line between one set and the next in order for it to work correctly. So this is how you would put the previous two commands into a robots.txt file:

# this identifies the wayback machine
User-agent: ia_archiver
Disallow: /

User-agent: *
Disallow: /duplicate/
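If you want to check what a file like this actually permits before uploading it, Python's standard `urllib.robotparser` can read the rules from a string. This is only a rough check under a few assumptions: `ExampleBot` is a made-up crawler name standing in for "everyone else", example.com is a placeholder domain, and real crawlers may interpret edge cases differently from this parser.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
# this identifies the wayback machine
User-agent: ia_archiver
Disallow: /

User-agent: *
Disallow: /duplicate/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# ia_archiver is shut out of the whole site.
print(parser.can_fetch("ia_archiver", "http://example.com/index.html"))          # False

# Every other crawler may index the site, except the /duplicate/ directory.
print(parser.can_fetch("ExampleBot", "http://example.com/index.html"))           # True
print(parser.can_fetch("ExampleBot", "http://example.com/duplicate/page.html"))  # False
```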

One thing to note that is very important: anyone can access the robots.txt file of a site. So if you have information that you don't want anyone to see, don't put it into the robots.txt file. If the directory that you don't want anyone to see is not linked to from your web site, the crawlers won't index it anyway.

Another way to block indexing of your site is to put a meta tag into the page. It looks like this:

<meta name="robots" content="noindex,nofollow">

You put this into the <head> tag of your web page. This line tells the robot crawlers not to index (search) the page and not to follow any of the hyperlinks on the page. As another example, <meta name="robots" content="noindex,follow"> tells the robot crawlers not to index the page, but to follow the hyperlinks on this page.
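To see how a crawler might read these instructions, here is a minimal sketch using Python's standard `html.parser`. The class `RobotsMetaParser` and the helpers `may_index` and `may_follow` are names invented for this example, and the sketch only understands the noindex and nofollow values discussed above; real crawlers recognize more directives.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() == "robots":
            content = attributes.get("content") or ""
            for directive in content.split(","):
                directive = directive.strip().lower()
                if directive:
                    self.directives.add(directive)


def robots_directives(html):
    """Return the set of robots meta directives found in the page."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


def may_index(html):
    return "noindex" not in robots_directives(html)


def may_follow(html):
    return "nofollow" not in robots_directives(html)


page = '<html><head><meta name="robots" content="noindex,follow"></head><body></body></html>'
print(may_index(page))   # False
print(may_follow(page))  # True
```

A page with no robots meta tag at all comes out as indexable and followable in this sketch.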

Did You Know That Google Has Its Own Meta Tag?

It looks like this:

<meta name="googlebot" content="noindex,nofollow,noarchive">

This tells the Google robot crawler not to index the page, not to follow any of the links, and not to store cached versions of your web site. You will want this done if you update the content on your site frequently. This prevents the web user from seeing outdated content that isn't refreshed because it is stored in the cache.

You can use the meta tag to communicate explicitly with Google's robots to avoid problems, or if you are optimizing your site for Google's search engine.



