Downloading web resources with http.stream - basics

From MorphOS Library

Revision as of 19:09, 26 August 2010 by Krashan (Minimal example: More text.)

Grzegorz Kraszewski

Introduction

The http.stream class is one of the Reggae stream classes, in other words a data source. In a chain of Reggae objects, an http.stream instance is always the first object and has only one port, an output. An http.stream object may also be used standalone, not connected to anything, just to retrieve any data resource reachable via the HTTP protocol, specifically with its GET request. From this point of view, http.stream is simply an embeddable HTTP/1.1 client with a simple yet powerful API. A brief list of its features is given below:

  • Socket API encapsulation. http.stream completely isolates the application (and its programmer) from bsdsocket.library and the TCP/IP stack. Only very basic knowledge of TCP/IP is needed to use http.stream successfully.
  • Unlike bsdsocket.library base instances, http.stream objects may be shared between processes (with the only exception that an object must be disposed by the process which created it).
  • The class has a built-in parser of HTTP response headers.
  • The class also has an easy-to-use HTTP request header builder, so custom fields may be added to the header.
  • HTTP proxies are supported.
  • The class supports chunked transfer and media streaming over HTTP.
  • Optional user agent spoofing is possible.
  • When connecting, HTTP redirections may be followed automatically.
  • The class is able to handle streams longer than 4 GB.
  • Easy protocol debugging via MediaLogger.
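To give an idea of what the class hides from the application, here is a minimal sketch of formatting an HTTP/1.1 GET request by hand in plain C. The helper build_get_request() and its parameters are hypothetical and are not part of http.stream; the exact header fields the class sends are not specified in this text, so the field set below (Host, an optional custom field, Connection) is only an assumption chosen for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, NOT part of http.stream.
   Formats a minimal HTTP/1.1 GET request into 'out'.
   'extra' is an optional custom header field without the trailing CRLF,
   for example "Accept: text/html"; pass NULL to skip it.
   Returns the request length, or -1 if the host is empty or the buffer
   is too small. */
int build_get_request(char *out, size_t size, const char *host,
                      const char *path, const char *extra)
{
    int n;

    assert(out != NULL && host != NULL && path != NULL);

    if (strlen(host) == 0) return -1;   /* HTTP/1.1 requires a Host field */

    if (extra != NULL)
        n = snprintf(out, size,
                     "GET %s HTTP/1.1\r\nHost: %s\r\n%s\r\nConnection: close\r\n\r\n",
                     path, host, extra);
    else
        n = snprintf(out, size,
                     "GET %s HTTP/1.1\r\nHost: %s\r\nConnection: close\r\n\r\n",
                     path, host);

    /* snprintf() reports the length it wanted to write; treat truncation as failure */
    if (n < 0 || (size_t)n >= size) return -1;
    return n;
}
```

With http.stream none of this is written by the application; the request is built by the class, and custom fields are added through its request header builder.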

Minimal example

If we skip all error handling, the whole process of downloading data via the HTTP protocol reduces to three lines of code:


#define DATA_LENGTH 7465       /* just example value */

UBYTE buffer[DATA_LENGTH];     /* place for data */
Object *http;

http = NewObject(NULL, "http.stream", MMA_StreamName, "www.morphzone.org", TAG_END);
DoMethod(http, MMM_Pull, 0, buffer, DATA_LENGTH);
DisposeObject(http);


We assume here that the http.stream class has been loaded previously with OpenLibrary() (see Opening and closing individual classes). The code will download the first 7465 bytes of the MorphZone main page (HTML code), assuming there is no error. This assumption is rather risky, because a network operation can fail for numerous reasons. If it does, NewObject() returns NULL, and we end up calling a method on a NULL pointer and then disposing it, which can even crash the application. For this reason http.stream offers a few ways of handling errors. They will be discussed later; for now, minimal error handling means checking the NewObject() result against NULL. This is used in a simple example that downloads the first 1000 bytes of a resource specified on the command line and dumps them to the console. Note that using this program for binary resources (like images) may result in rather weird output. I recommend running this example along with MediaLogger, to learn the protocol debugging features of http.stream.
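The NULL check itself can be sketched as below. This is only an illustration of the pattern, not the full program: the function name fetch_start() is hypothetical, opening the class with OpenLibrary() is omitted, and the name of the Reggae header defining MMA_StreamName and MMM_Pull is not given in this text, so it is left as a comment. The block after #else contains placeholder stand-ins (with made-up values) whose only purpose is to let the sketch compile and be exercised outside MorphOS.

```c
#include <assert.h>
#include <stddef.h>

#ifdef __MORPHOS__
#include <proto/intuition.h>      /* NewObject(), DisposeObject() */
#include <clib/alib_protos.h>     /* DoMethod() */
/* plus the Reggae header defining MMA_StreamName and MMM_Pull
   (its name is not given in this text) */
#else
/* Hypothetical host stand-ins; they only mimic the call shapes. */
typedef void Object;
typedef unsigned char UBYTE;
#define TAG_END        0UL
#define MMA_StreamName 1UL        /* placeholder value */
#define MMM_Pull       2UL        /* placeholder value */
static int stub_fail;             /* set to 1 to simulate a failed NewObject() */
static int stub_disposed;         /* counts DisposeObject() calls */
static int stub_object;
static Object *NewObject(void *cl, const char *id, ...)
{ (void)cl; (void)id; return stub_fail ? NULL : &stub_object; }
static unsigned long DoMethod(Object *obj, unsigned long method, ...)
{ assert(obj != NULL); (void)method; return 0; }
static void DisposeObject(Object *obj)
{ assert(obj != NULL); stub_disposed++; }
#endif

#define DATA_LENGTH 1000          /* first 1000 bytes, as in the text */

/* Returns 1 if the stream object was created and a pull was attempted,
   0 if NewObject() failed. 'buffer' must hold DATA_LENGTH bytes. */
int fetch_start(const char *url, UBYTE *buffer)
{
    Object *http = NewObject(NULL, "http.stream", MMA_StreamName, url, TAG_END);

    if (http == NULL) return 0;   /* nothing to pull from, nothing to dispose */

    DoMethod(http, MMM_Pull, 0, buffer, DATA_LENGTH);
    DisposeObject(http);
    return 1;
}
```

The point of the early return is that neither DoMethod() nor DisposeObject() is ever reached with a NULL object; errors occurring after a successful NewObject() are not covered by this check.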