Default fields returned by crawler
Internal Links API default crawler fields
Three fields are required to build a basic rich link object: URL, Title, and Text.
Graphite's Internal Links API crawler will also pick up optional fields when the metadata is found within the page being crawled. These fields, when found, will also be included in the API response.
Minimum required fields
URL
The URL of the page that will be provided as the link URL. If not declared, the Internal Links API will take the canonical tag (<link rel=canonical> inside <head>) as the link URL. If the canonical tag is not present, the Internal Links API will take the last URL the crawler sees after following all redirects.
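The fallback order for the link URL can be sketched as a small function. This is only an illustration of the precedence described above, not Graphite's implementation; the function name and its parameters are hypothetical.

```python
from typing import Optional


def resolve_link_url(
    declared: Optional[str],
    canonical: Optional[str],
    final_url: str,
) -> str:
    """Pick the link URL using the documented fallback order.

    All three parameter names are illustrative assumptions:
    - declared: a URL explicitly declared for the page, if any
    - canonical: the href of <link rel=canonical> inside <head>, if any
    - final_url: the last URL seen by the crawler after all redirects
    """
    if declared:
        return declared   # 1. an explicitly declared URL wins
    if canonical:
        return canonical  # 2. otherwise fall back to the canonical tag
    return final_url      # 3. otherwise use the post-redirect URL
```

For example, a page with no declared URL and a canonical tag pointing to https://example.com/post would resolve to that canonical URL, even if the crawler reached the page through a URL with tracking parameters.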
Title
The title of the page that will be provided as the link title. If not declared, the Internal Links API will take the first <h1> element found on the page as the link title. If no <h1> is found, the Open Graph og:title property or the page's <title> tag is taken, in that order of precedence, depending on their existence.
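To make the title precedence concrete, here is a minimal sketch built on Python's standard html.parser module. It assumes the order declared title, first <h1>, og:title, then <title>; the class and function names are hypothetical, and the crawler's actual parsing may differ.

```python
from html.parser import HTMLParser
from typing import Optional


class TitleCandidates(HTMLParser):
    """Collects the first <h1>, the og:title property, and the <title> text."""

    def __init__(self) -> None:
        super().__init__()
        self.h1: Optional[str] = None
        self.og_title: Optional[str] = None
        self.title: Optional[str] = None
        self._capture: Optional[str] = None  # tag whose text is being collected
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "meta" and attr.get("property") == "og:title":
            if self.og_title is None:
                self.og_title = attr.get("content")
        elif tag in ("h1", "title") and getattr(self, tag) is None:
            # Start collecting text for the first <h1> or <title> only.
            self._capture = tag
            self._buffer = []

    def handle_data(self, data):
        if self._capture:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == self._capture:
            # Collapse runs of whitespace; treat empty text as missing.
            text = " ".join("".join(self._buffer).split())
            setattr(self, tag, text or None)
            self._capture = None


def resolve_title(html: str, declared: Optional[str] = None) -> Optional[str]:
    """Return the link title: declared, then <h1>, then og:title, then <title>."""
    if declared:
        return declared
    parser = TitleCandidates()
    parser.feed(html)
    parser.close()
    return parser.h1 or parser.og_title or parser.title
```

A page with both an <h1> and an og:title would resolve to the <h1> text under this sketch; removing the <h1> would surface the og:title instead.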
Text
(multiple allowed) The page's textual content. The Internal Links API uses this to find related pages when building related links. If not declared, the Internal Links API will take the text content within the page <body>; remember that the <body> may contain sections representing noise, such as headers, footers, and sidebars.
Optional fields included when found
Image
The link thumbnail URL, that is, the image that should be used as the link image. Provide an image of the desired size, because the Internal Links API does not store or process images. If not declared, the Internal Links API will take the Open Graph og:image property if present.
Description
The link description. This text is typically used to show a snippet of the page content. If not declared, the Internal Links API will take the Open Graph og:description property or the HTML <meta name="description"> element, in that order of precedence, depending on their existence.
Modified Time
The page's modified time. It is an ISO 8601 timestamp string. If not declared, the Internal Links API will take the Open Graph article:modified_time property if present.
Published Time
The page's published time. It is an ISO 8601 timestamp string. If not declared, the Internal Links API will take the Open Graph article:published_time property if present.
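Because both time fields are ISO 8601 strings, a consumer of the API response can parse them with the standard library. The sketch below is an illustration only: the function name is hypothetical, treating timestamps without an offset as UTC is an assumption, and older Python versions accept only a subset of ISO 8601 in datetime.fromisoformat.

```python
from datetime import datetime, timezone


def parse_page_time(value: str) -> datetime:
    """Parse an ISO 8601 timestamp string and normalize it to UTC."""
    parsed = datetime.fromisoformat(value)
    if parsed.tzinfo is None:
        # Assumption: a timestamp without a UTC offset is treated as UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc)


# Example values, such as might appear in article:published_time
# and article:modified_time.
published = parse_page_time("2024-03-15T09:30:00+02:00")
modified = parse_page_time("2024-03-16T08:00:00+00:00")
```

Normalizing to UTC makes the two values directly comparable, for instance to check that a page's modified time is not earlier than its published time.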
Author
(multiple allowed) The author name of an article-like page (blog post, recipe, etc.). If not declared, the Internal Links API will take the Open Graph article:author property or the HTML <meta name="author"> element, in that order of precedence, depending on their existence. It can be declared multiple times.
Custom fields
Graphite's Internal Links API also supports custom fields. Custom properties can be added by using the prefix graphite:custom, that is, graphite:custom:{property}.
For more information about adding custom fields, please review our Structured Data Specification for custom metadata.
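As a rough illustration of the graphite:custom:{property} naming scheme, the sketch below builds a custom property name and picks custom fields out of a flat mapping of property names to values. The mapping shape and both helper names are assumptions made for this example; the Structured Data Specification defines how custom metadata is actually declared.

```python
CUSTOM_PREFIX = "graphite:custom:"


def custom_property_name(prop: str) -> str:
    """Build the full property name for a custom field, e.g. 'category'."""
    return f"{CUSTOM_PREFIX}{prop}"


def extract_custom_fields(metadata: dict) -> dict:
    """Return only the custom fields, keyed by the part after the prefix.

    `metadata` is a hypothetical flat mapping of property name to value.
    """
    return {
        name[len(CUSTOM_PREFIX):]: value
        for name, value in metadata.items()
        if name.startswith(CUSTOM_PREFIX) and len(name) > len(CUSTOM_PREFIX)
    }


page_metadata = {
    "og:title": "Weeknight Pasta",
    custom_property_name("category"): "recipes",
    custom_property_name("difficulty"): "easy",
}
custom_fields = extract_custom_fields(page_metadata)
```

Here custom_fields would contain only the category and difficulty entries, with the graphite:custom: prefix stripped from the keys.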