Object Overview
The HTTP input connector object allows AlchemyPoint Server to download content from remote WWW servers. This allows AlchemyPoint to automatically poll specific web sites for updates, retrieve site content at regular intervals, etc.
This object utilizes AlchemyPoint's underlying Connector framework and thus supports a variety of rate limiting and concurrency options.
Downloaded web content can be automatically processed/transformed or utilized as triggers for starting other processes using Bindings.
Example Object Definitions
... directly within the top-level ServerConfiguration object:
Minimal Configuration:
<HTTPInputConnector name="WebIn1"
url="http://www.test.com/latest_updates.html"/>
Overview:
The above example is an HTTP input connector that retrieves the URL 'http://www.test.com/latest_updates.html' at a regular interval (every 900 seconds, the default).
Complex Configuration (most supported attributes shown):
<HTTPInputConnector name="WebIn1" tags="connector http input"
description="example input connector for retrieving content from a remote HTTP server"
url="http://www.test.com/cgi-bin/post-script"
method="post"
messageBody="post_data=getUpdate"
rateLimitDuration="60"
rateLimitPipelineMax="1"
connectRateLimit="5"
reconnectAttempts="5"
reconnectInterval="60"
reconnectExpBackoff="true"/>
Overview:
The above example is an HTTP input connector that makes an HTTP Post to the URL 'http://www.test.com/cgi-bin/post-script', sending the Post data 'post_data=getUpdate'. The example does not set the 'interval' attribute, so the request repeats at the default interval (every 900 seconds).
This connector is rate limited to 5 HTTP requests every 60 seconds, and will retry a failed request up to 5 times. The first HTTP retry is made after a 60 second delay, with the delay doubling on each subsequent attempt (120 seconds, 240, ...).
Supported Attributes
General Attributes:
- name - (usage: required): Public name of the configuration object. This name must be globally-unique across the entire AlchemyPoint ServerConfiguration definition.
- description - (usage: optional): Textual description of the configuration object.
- tags - (usage: optional): Space-delimited list of one or more tags used to describe the configuration object.
Network Attributes:
- url - (usage: required): The HTTP URL accessed by this InputConnector. This URL represents the WWW address of the web server / content accessed by the Connector.
- method - (usage: optional, default: "get"): The HTTP operation 'method' to utilize when accessing a WWW server. Supported methods are:
  - get - HTTP Get (this method retrieves content from the web server).
  - head - HTTP Head (this method retrieves content information from the web server, but not the content data).
  - post - HTTP Post (this method posts content to the web server).
Object Attributes:
- messageBody - (usage: mixed): HTTP entity data that should be sent to a remote web server as part of each HTTPInputConnector operation.
  - This option is required if the "method" parameter is set to "post".
  - This option cannot be used if the "method" parameter is set to "get" or "head".
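As an illustration of the 'method' and 'messageBody' rules above, the following sketch shows a connector that issues HTTP Head requests and therefore omits 'messageBody'. The connector name "HeadCheck1" is a made-up value for illustration; the URL is reused from the minimal example.

```xml
<!-- Hypothetical connector: "head" retrieves content information only,
     so no messageBody attribute is given -->
<HTTPInputConnector name="HeadCheck1"
    url="http://www.test.com/latest_updates.html"
    method="head"/>
```

A "post" connector would instead add a 'messageBody' attribute, as in the complex example earlier on this page.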
Rate Limit Attributes:
- interval - (usage: optional, default: "900"): The number of seconds to wait between executing each Input operation. A value of "0" configures the Connector to only perform the Input operation once, and never again.
- rateLimitDuration - (usage: optional, default: "0"): The duration (in seconds) utilized by the 'connectRateLimit' parameter. A value of "0" disables all rate limits for this Connector.
- connectRateLimit - (usage: optional, default: "0" [unlimited]): The number of TCP connections that may be created by this Connector within the 'rateLimitDuration' period.
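A minimal sketch of how the rate limit attributes above might be combined, based only on the attribute descriptions in this section. The connector names and numeric values are illustrative assumptions, not recommended settings.

```xml
<!-- Hypothetical: poll every 300 seconds, allowing at most
     10 TCP connections within any 60 second rate limit period -->
<HTTPInputConnector name="PolledIn1"
    url="http://www.test.com/latest_updates.html"
    interval="300"
    rateLimitDuration="60"
    connectRateLimit="10"/>

<!-- Hypothetical: interval="0" performs the Input operation once
     and never again -->
<HTTPInputConnector name="OneShotIn1"
    url="http://www.test.com/latest_updates.html"
    interval="0"/>
```

Leaving 'rateLimitDuration' at its default of "0" disables rate limiting, so 'connectRateLimit' only has an effect when a non-zero duration is also configured.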
Automatic Reconnect Attributes:
- reconnectAttempts - (usage: optional, default: "0"): The number of times (0-1000) this Connector should attempt to retrieve HTTP content before giving up.
- reconnectInterval - (usage: optional, default: "1"): The interval (1+) in seconds that the Connector should delay between retry attempts for HTTP operations.
- reconnectExpBackoff - (usage: optional, default: "true"): Boolean setting that indicates whether the Connector should exponentially increase the reconnection interval upon each successive retry attempt (60 seconds -> 120 -> 240 -> ...).
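The reconnect attributes above can be sketched as follows. The connector name and values are illustrative assumptions. With exponential backoff disabled, the three retries would presumably each wait the fixed 30 second interval; with the default of "true", the doubling pattern described for 'reconnectExpBackoff' would give delays of 30, 60, and 120 seconds.

```xml
<!-- Hypothetical: retry up to 3 times, waiting a fixed 30 seconds
     between attempts (exponential backoff turned off) -->
<HTTPInputConnector name="RetryIn1"
    url="http://www.test.com/latest_updates.html"
    reconnectAttempts="3"
    reconnectInterval="30"
    reconnectExpBackoff="false"/>
```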
Content Delivery Attributes:
- deliveryMode - (usage: optional, default: "chunk_big"): Specifies the manner (buffered or "streaming") in which content is delivered to its requested AlchemyPoint endpoint (ProtocolConnector, ProtocolListener, etc.). Supported "deliveryMode" settings are listed below:
  - buffer - Buffer the complete resource before delivering.
  - chunk - Support chunked delivery only if the resource is chunked and its length isn't known.
  - chunk_big - Support chunked delivery if the resource is chunked, its length isn't known, and the buffer highwater mark has been reached.
  - chunk_all - Support chunked delivery of any resource that is chunked, regardless of size.
  - chunk_force - Force chunked delivery regardless of the resource's size or whether it is chunked.
  - chunk_force_big - Support chunked delivery if the buffer highwater mark has been reached.
  - chunk_all_force_big - Support chunked delivery if the resource is chunked (regardless of size), or if the buffer highwater mark has been reached.
- highwater - (usage: optional, default: "100000"): Specifies a highwater mark at which to stop buffering and start streaming data to its requested AlchemyPoint endpoint.
  - This option may only be specified when the Connector "deliveryMode" parameter is set to "chunk_big", "chunk_force_big", or "chunk_all_force_big".
- chunkSize - (usage: optional, default: "4096"): Specifies the size of data "chunks" (in bytes) that should be streamed to the requested AlchemyPoint endpoint.
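To make the content delivery attributes above concrete, here is a hedged sketch of two connectors. The connector names and the 'highwater' and 'chunkSize' values are illustrative assumptions. Note that 'highwater' appears only on the connector whose "deliveryMode" is one of the modes that permits it.

```xml
<!-- Hypothetical: buffer the complete resource before delivering it -->
<HTTPInputConnector name="BufferedIn1"
    url="http://www.test.com/latest_updates.html"
    deliveryMode="buffer"/>

<!-- Hypothetical: switch to streaming in 8192-byte chunks once the
     highwater mark of 250000 has been reached -->
<HTTPInputConnector name="StreamedIn1"
    url="http://www.test.com/latest_updates.html"
    deliveryMode="chunk_force_big"
    highwater="250000"
    chunkSize="8192"/>
```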
Supported Child Configuration Elements
SSL:
- SSLClientConfig - (usage: optional): Specifies local x.509 client certificate, private key, trusted x.509 certificate store, and other SSL-related configuration options.
Proxy Settings:
- ProxyConfig - (usage: optional): Specifies the local proxy configuration (HTTP, SOCKS4, SOCKS5, etc.) that should be utilized when making outbound connections.
WWW Bindings:
- Bindings - (usage: optional): Bindings object, containing one or more WebBinding configuration elements. These bindings define the manner in which retrieved HTTP content is acted upon, used as triggers for starting other processes, etc.
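The following skeleton sketches how a child element might be nested inside the connector, assuming child configuration elements are written as ordinary nested XML elements. Only the element names come from this page. The attributes of WebBinding are not covered in this section, so a comment stands in for them.

```xml
<HTTPInputConnector name="WebIn1"
    url="http://www.test.com/latest_updates.html">
  <Bindings>
    <!-- one or more WebBinding elements go here; their attributes
         are not covered in this section -->
  </Bindings>
</HTTPInputConnector>
```

SSLClientConfig and ProxyConfig would presumably be placed at the same level as Bindings.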
Issues
Unresolved documentation issues: intervals, timeouts, concurrency options.