
All features described in this chapter are available only in the advanced edition of the Recommender Engine. BASIC authentication is enabled by default. Use the customer ID as the username and the license key as the password. The license key is displayed in the upper right corner of the Admin GUI (https://admin.yoochoose.net) after you log in with your registration credentials.

The Recommender Engine needs up-to-date information from the customer's web presence to generate personalized recommendations for a user profile. To obtain this information, an event tracking process must collect events such as clicks, purchases and consumes. This is described in detail in chapter 1. Tracking Events.

In addition to the events collected by the Recommender (see chapter 1. Tracking Events), the Recommender Engine can use external information about the products. The website owner must upload this information to the Recommender Engine. Some examples:

  • Product price - Products cheaper than the specified threshold can be filtered out from recommendations (see user guide chapter 9a. General Filters for more information).
  • Availability time period - Products will be recommended only in the specified time window.
  • Custom attributes - Custom attributes make it possible to group recommendations and narrow the results. For example "I am interested only in non-food products" or "show me only the news related to my city". See user guide chapter 10. Sub-Models for more information.

Item Format

The XML format for an item import is the same for the PULL and PUSH interface. Here is an example:

<items version="1">
    <!-- version is mandatory and always "1" -->
    <item id="102" type="1">
        <description>the item's description</description>
        <price currency="EUR">122</price>
        <!-- price in cents as integer value (e.g. EUR-Cent) -->
        <validfrom>2011-01-01T00:00:00</validfrom>
        <!-- items will be recommended only in the specified time window -->
        <validto>2021-01-01T00:00:00</validto>
        <categorypaths>
            <categorypath>/8/4/5</categorypath>
            <!-- one product can be in multiple categories -->
            <categorypath>/84</categorypath>
            <categorypath>/1/847</categorypath>
        </categorypaths>
        <content>
            <content-data key="title">
                <![CDATA[ ... ]]>
            </content-data>
            <content-data key="abstract">
                <![CDATA[ ... ]]>
            </content-data>
        </content>
        <attributes>
            <!-- this section should only contain values that are distinct and limited to build prefiltered models -->
            <attribute key="author" value="Max Mustermann" />
            <attribute key="agency" value="dpa" />
            <!-- If no type value given NOMINAL (a limited list of values) is assumed -->
            <attribute key="vendor" value="BOSCH" />
            <!-- if NUMERIC is chosen, the value must be numeric as well. -->
            <attribute key="weight" value="100" type="NUMERIC" />
        </attributes>
    </item>
</items>
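
The document above can also be generated programmatically. The following is a minimal sketch in Python using only the standard library; the function name build_item_xml and its parameters are hypothetical and not part of the YOOCHOOSE API. It emits the elements in the same order as the example above, on the assumption that the schema expects that order.

```python
import xml.etree.ElementTree as ET


def build_item_xml(item_id, item_type, price_cents=None, currency="EUR",
                   category_paths=(), attributes=()):
    """Return an <items> document (as a string) containing a single item.

    attributes is an iterable of (key, value, type) tuples; type may be None.
    """
    # version is mandatory and always "1"
    items = ET.Element("items", version="1")
    item = ET.SubElement(items, "item", id=str(item_id), type=str(item_type))

    if price_cents is not None:
        price = ET.SubElement(item, "price", currency=currency)
        price.text = str(price_cents)

    if category_paths:
        paths = ET.SubElement(item, "categorypaths")
        for path in category_paths:
            ET.SubElement(paths, "categorypath").text = path

    if attributes:
        attrs = ET.SubElement(item, "attributes")
        for key, value, attr_type in attributes:
            node = ET.SubElement(attrs, "attribute", key=key, value=str(value))
            if attr_type is not None:  # an omitted type means NOMINAL
                node.set("type", attr_type)

    return ET.tostring(items, encoding="unicode")


xml_doc = build_item_xml(
    102, 1,
    price_cents=122,
    category_paths=["/8/4/5", "/84"],
    attributes=[("vendor", "BOSCH", None), ("weight", 100, "NUMERIC")],
)
print(xml_doc)
```

Only id and type are required, so build_item_xml(102, 1) yields a valid "empty" item.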

Implicit vs. explicit update of categorypaths

Product attributes that can be uploaded over the data import interface include the category path(s) in which the product is located. As described in chapter 1. Tracking Events, the category path can also be updated by "click" events. If products are regularly uploaded over the data import interface, the click event should not contain the category path information. Otherwise, if for example a product is clicked in the "TopSeller" section, the category path "%2FTopSeller" is sent with the event, which is usually not desired because the product is originally located under "%2FGarden%2F".

Enabling both update methods for the category path is possible, but it has the following side effects:

  • Every new category path attached to a "click" event is appended to the product's list of categories.
  • Each product import overwrites the category paths collected so far.
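
The two side effects can be illustrated with a toy model. This is only a sketch of the behaviour described above, not the engine's implementation; the class CategoryPaths and its methods are hypothetical, paths are shown in decoded form, and it is an assumption that a path already known is not appended a second time.

```python
class CategoryPaths:
    """Toy model of the category paths stored for one product."""

    def __init__(self):
        self.paths = []

    def on_click(self, category_path):
        # A click event appends a path that is not yet known (assumption:
        # an already known path is not added twice).
        if category_path not in self.paths:
            self.paths.append(category_path)

    def on_import(self, category_paths):
        # A product import replaces everything collected so far.
        self.paths = list(category_paths)


product = CategoryPaths()
product.on_import(["/Garden/"])
product.on_click("/TopSeller")
after_click = list(product.paths)    # the click added "/TopSeller"
product.on_import(["/Garden/"])
after_import = list(product.paths)   # the import removed it again
print(after_click, after_import)
```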

 

XML schema definition

The current schema can be seen under https://admin.yoochoose.net/api/00000/item/schema.xsd

The following table briefly describes the elements and attributes.

| Name/Attribute | Description | Mandatory | Type |
|---|---|---|---|
| id | The id of an item | yes | Integer |
| type | The type of an item | yes | Integer |
| description | Some additional information about an item | no | String |
| price | The price of an item (e.g. in EUR cents) | no | Integer (see below) |
| currency | Currency used, by default EUR is assumed | no | ISO 4217 |
| validfrom | Defines together with validto the "lifespan" of an item. If NULL or not available, the item is considered as valid | no | ISO 8601 (see below) |
| validto | Defines together with validfrom the "lifespan" of an item. If NULL or not available, the item is considered as valid | no | ISO 8601 |
| categorypath | A logical (website) navigation path how the item can be reached in the customer's system | no | String, separated with "%2F" (encoded /) |
| categoryid | The category ID. Deprecated. Use "categorypath" instead! | no | Integer |

Empty Products

All elements and attributes except the item type and the item id are optional. It is possible to upload a product without any additional information, e.g. if the random recommendation model is used or if recommendations are to be boosted and filtered on the fly. See user guide chapter 5. Recommendation Models for more information about the different model types. In this case the Recommender Engine will randomly recommend the imported products even if there were no events related to these products. This is useful for a news agency, where new products (=news) are published very frequently.

Field Formats

The key attribute of the <attribute> and <content-data> elements must contain only letters, numbers and underscores.
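
A key can be checked before upload with a simple pattern. This is a sketch; the helper is_valid_key is hypothetical, and it assumes that "letters" means ASCII letters, which the text above does not state explicitly.

```python
import re

# Assumption: only ASCII letters, digits and underscores are accepted.
_KEY_PATTERN = re.compile(r"[A-Za-z0-9_]+")


def is_valid_key(key):
    """Return True if key consists only of letters, numbers and underscores."""
    return _KEY_PATTERN.fullmatch(key) is not None


print(is_valid_key("author"), is_valid_key("item-color"))
```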

Price is formatted as the amount in "cents", for example 1234 for 12 Euros and 34 Cents. If the currency has no "cents" part, the amount is given in the main currency unit, for example 12 for 12 Japanese Yen. See ISO 4217 [1] to check whether the selected currency has a "cents" part.
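
The conversion from a decimal amount to the integer price value can be sketched as follows. The function to_price_value and the lookup table are hypothetical; the table lists only three currencies with their number of minor-unit digits as defined in ISO 4217 and would have to be extended for other currencies.

```python
from decimal import Decimal

# Minor-unit digits per ISO 4217 (excerpt for illustration only).
MINOR_UNIT_DIGITS = {"EUR": 2, "USD": 2, "JPY": 0}


def to_price_value(amount, currency="EUR"):
    """Convert a decimal amount given as a string to the integer price value."""
    digits = MINOR_UNIT_DIGITS[currency]
    scaled = Decimal(amount) * (Decimal(10) ** digits)
    # Reject amounts that cannot be expressed in whole minor units.
    if scaled != scaled.to_integral_value():
        raise ValueError(f"{amount} has too many decimal places for {currency}")
    return int(scaled)


print(to_price_value("12.34", "EUR"))  # 1234
print(to_price_value("12", "JPY"))     # 12
```

The amount is passed as a string so that no binary floating-point rounding can affect the result.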

Validity dates are formatted as specified for the XSD dateTime format [2], without a time zone. The default time zone of the mandator is used.
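
A value in this format can be produced as sketched below. The helper format_validity is hypothetical. It simply drops any time zone information, so the caller is assumed to have converted the value to the mandator's default time zone beforehand.

```python
from datetime import datetime


def format_validity(moment):
    """Format a datetime as YYYY-MM-DDTHH:MM:SS without a time zone."""
    # The time zone is dropped, not converted: the value is expected to be
    # in the mandator's default time zone already.
    return moment.replace(tzinfo=None, microsecond=0).isoformat()


print(format_validity(datetime(2011, 1, 1)))  # 2011-01-01T00:00:00
```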

Attributes

It is also possible to define custom (numeric or nominal) attributes in the <attributes> section. The default type of every attribute is "NOMINAL", i.e. the values of an attribute are treated as distinct when compared while calculating a content-based model (ADVANCED solution). If you add the XML attribute type="NUMERIC" to the <attribute> element, the Recommender Engine treats the values as numbers on a scale. This means that a size of 4 is closer to a size of 5 than to a size of 1. If the attribute size is of type NOMINAL, the values are simply different and have no "distance-based similarity".

<attribute key="size" value="4" type="NUMERIC" /> 
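
The difference between the two types can be illustrated with two simple distance functions. This is only a sketch of the idea; the actual similarity measure used by the Recommender Engine is not documented here, and both function names are hypothetical.

```python
def nominal_distance(a, b):
    # NOMINAL values are either equal or different; there is nothing in between.
    return 0 if a == b else 1


def numeric_distance(a, b):
    # NUMERIC values lie on a scale, so closer numbers are more similar.
    return abs(float(a) - float(b))


print(numeric_distance("4", "5"), numeric_distance("4", "1"))  # 1.0 3.0
print(nominal_distance("4", "5"), nominal_distance("4", "1"))  # 1 1
```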

Another typical example is the color of an item. To insert it in the store, the following line should be a child of the <attributes> element.

<attribute key="color" value="green" />
Attribute keys are case sensitive. It is possible to have multiple attributes with the same name but different types, for example the size as a number (40.5) and as a code ("L").

Content

Any content data can be loaded in the content part of the item. The content is used only for full-text analysis models; it cannot be used in sub-models (see user guide chapter 10. Sub-Models).

<content-data key="abstract"> <![CDATA[ ... ]]></content-data>

Push Interface

The Recommender provides a REST interface that accepts an item in XML-format. The following URL describes the interface.

https://admin.yoochoose.net/api/[customerid]/item

It can be used to POST item information within the request's body into the store and to show, update or delete items directly. The parameters that are used in the call are described in the table.

The URL

https://admin.yoochoose.net/api/[customerid]/item/[itemtypeid]/[itemid]

is the direct link to fetch item data.

| Parameter name | Description | Values |
|---|---|---|
| customerid | This is a reference to the account of the customer. It will be provided by YOOCHOOSE. | String |
| itemid | A unique identifier for the item that is used as a reference to identify the item in the database of the customer | Numeric |
| itemtypeid | Describes the type of the item id. Usually it is fixed to 1, but if the customer uses more than one type of items, this has to refer to the corresponding item type | Numeric |

Different HTTP methods can be used to create, update, delete or retrieve items located in the YOOCHOOSE data store. The following table gives an overview of the methods and their functions.

| HTTP Method | Description | Result |
|---|---|---|
| POST | If the body contains valid XML data, the item will be persisted. The item is not directly available but scheduled to be inserted. If the XML content cannot be validated, the server will send a Bad Request status code. | 202 (Accepted), 400 (Bad Request) |
| GET | This method retrieves all information that is stored in the database for the given item id. If the item is not found, the status code 404 is returned. | 200 (OK), 404 (Not Found) |
| DELETE | Deletes all information that is related to the item id that has been sent. There is no need to send a body element. The item is not deleted directly but scheduled to be removed from the data store. | 202 (Accepted), 404 (Not Found) |

The body of a request to import data using the above interface must contain a valid XML document.
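
A request against the push interface can be prepared as sketched below with the Python standard library. The function build_item_request is hypothetical. Two points are assumptions rather than documented behaviour: that item bodies are sent with the Content-Type text/xml (taken from the transfer example later in this chapter), and that DELETE uses the same direct item URL as GET.

```python
import base64
import urllib.request

BASE_URL = "https://admin.yoochoose.net/api"


def build_item_request(method, customer_id, license_key,
                       item_type=None, item_id=None, xml_body=None):
    """Prepare (but do not send) a request for the item push interface."""
    url = f"{BASE_URL}/{customer_id}/item"
    if item_type is not None and item_id is not None:
        # Direct link to a single item, used for GET (and assumed for DELETE).
        url += f"/{item_type}/{item_id}"

    # BASIC authentication: customer ID as username, license key as password.
    credentials = f"{customer_id}:{license_key}".encode("utf-8")
    headers = {"Authorization": "Basic " + base64.b64encode(credentials).decode("ascii")}

    data = None
    if xml_body is not None:
        data = xml_body.encode("utf-8")
        headers["Content-Type"] = "text/xml"  # assumption, see lead-in

    return urllib.request.Request(url, data=data, headers=headers, method=method)


# Placeholder credentials; replace with the real customer ID and license key.
post_request = build_item_request(
    "POST", "00000", "my-license-key",
    xml_body='<items version="1"><item id="102" type="1"/></items>',
)
get_request = build_item_request("GET", "00000", "my-license-key", item_type=1, item_id=102)
print(post_request.get_method(), post_request.full_url)
print(get_request.get_method(), get_request.full_url)
```

The prepared request can be sent with urllib.request.urlopen(post_request). Note that urlopen raises urllib.error.HTTPError for the 400 and 404 results listed in the table.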

Pull/Bulk Interface

The bulk interface is mainly used to provide an initial data import to initialize the Recommender Engine. It can also be used to reset and re-initialize the data store.

To enable the upload, one must provide a download location from which the Recommender System can retrieve an .xml document containing the item data. To transfer a large amount of data, the export should be split into smaller valid .xml files of no more than 20 MB each. Only the XML format is supported. The export file must be publicly accessible, e.g.

http://cms.customer.de/path-to-the-service/download-folder/<filename>.xml

To initiate the import, the customer invokes the following trigger service, where the value of the 'url' parameter is the URL-encoded address of the export file. If more than one file is provided for download, the trigger service has to be invoked once for each file with the corresponding URL.

The following is an example of a GET request, which triggers the upload of an export file.

https://import.yoochoose.net/api/[customerid]/item/upload?url=<url>

 

| Parameter | Description | Values |
|---|---|---|
| customerid | This is a reference to the account of the customer. It will be provided by YOOCHOOSE. | String |
| url | The URL from where the export file can be downloaded by YOOCHOOSE service components. | URL-encoded String |

After the trigger service is invoked, downloading the file and further processing are scheduled asynchronously. Therefore, the usual response code for this service call is 202 (Accepted). The download will take place within the next few minutes.

Currently, authentication is not supported for fetching customer downloads. As a workaround, you could include the credentials in the url value. A better solution is to use cryptic file names and delete the export files after a while.
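
Building the trigger URLs for the bulk interface, one per export file, can be sketched as follows. The function names, the customer ID and the export file locations are hypothetical placeholders.

```python
from urllib.parse import urlencode

IMPORT_BASE_URL = "https://import.yoochoose.net/api"


def build_upload_trigger_url(customer_id, export_url):
    """Return the trigger URL for one export file; the url value is URL-encoded."""
    query = urlencode({"url": export_url})
    return f"{IMPORT_BASE_URL}/{customer_id}/item/upload?{query}"


def build_trigger_urls(customer_id, export_urls):
    # The trigger service has to be invoked once per export file.
    return [build_upload_trigger_url(customer_id, url) for url in export_urls]


trigger_urls = build_trigger_urls("00000", [
    "http://cms.customer.de/export/items-1.xml",
    "http://cms.customer.de/export/items-2.xml",
])
for trigger_url in trigger_urls:
    print(trigger_url)
```

Each resulting URL is then requested with GET; a successful call is answered with 202 (Accepted).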

 

Transfer items

The method transfers an item from one id to another id. The attributes of the old item are NOT moved or merged. If you rely on attributes for e.g. filtering based on prices, the new item must be reimported.

All related historical user data is rewritten to point to the new item. The old item is wiped including all attributes. The authentication is based on BASIC AUTH with customerId and license-key.

POST https://import.yoochoose.net/api/[customerid]/transferitems
Content-Type: text/xml

<transfers>
	<transfer>
		<sourceitem id="1234" type="1"/>
		<targetitem id="6789" type="1"/>
	</transfer>
</transfers>
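
The request body for one or more transfers can be generated as sketched below. The function build_transfers_xml is hypothetical; it only builds the XML shown above and does not send it.

```python
import xml.etree.ElementTree as ET


def build_transfers_xml(transfers):
    """Build a <transfers> document.

    transfers is an iterable of ((source_id, source_type), (target_id, target_type)).
    """
    root = ET.Element("transfers")
    for (source_id, source_type), (target_id, target_type) in transfers:
        transfer = ET.SubElement(root, "transfer")
        ET.SubElement(transfer, "sourceitem", id=str(source_id), type=str(source_type))
        ET.SubElement(transfer, "targetitem", id=str(target_id), type=str(target_type))
    return ET.tostring(root, encoding="unicode")


transfer_body = build_transfers_xml([((1234, 1), (6789, 1))])
print(transfer_body)
```

Because attributes of the old item are not moved, the target item should be re-imported afterwards if filtering relies on its attributes.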

 


External Links

[1] ISO 4217 - Currency Code - Wikipedia
http://en.wikipedia.org/wiki/ISO_4217
[2] XML Schema Part 2: Datatypes Second Edition; Chapter 3.2.7 dateTime
http://www.w3.org/TR/xmlschema-2/#dateTime

 



