Taglists

From MorphOS Library

Revision as of 10:07, 26 October 2011 by Krashan (talk | contribs) (Structures: Contents++;)

Grzegorz Kraszewski


This article in other languages: Polish


What Is a Taglist?

A taglist is an array of "key-value" pairs. The key is always a 32-bit integer number and is called a tag. The value also has a size of 32 bits. It may be an integer, or a pointer to any structure or object. Taglists are commonly used in the MorphOS API for passing a variable number of arguments, usually sets of attributes with their values. A few special key values are used for array termination, concatenation and item skipping. A set of functions in the utility.library may be used for taglist traversing, filtering, searching, copying etc.

Every pair in a taglist is a TagItem structure, defined in <utility/tagitem.h>:

struct TagItem
{
  ULONG ti_Tag;
  ULONG ti_Data;
};

To be self descriptive, every taglist, being just a plain C array, must have some kind of termination. It is very similar to the string null-termination idea. The termination is done with a TagItem having its ti_Tag field set to TAG_END (which happens to be defined as zero). The ti_Data value of the terminating TagItem is ignored, it is usually set to zero too. The illustration below shows a simple taglist:


Taglists1.png


This taglist may be created with the following code:

double x = 3.438872763e+17;
Object *obj = NewObject( /* ... */ );

struct TagItem mytaglist[] = {
  { Tag1, 2837 },
  { Tag2, (ULONG)&x },
  { Tag3, (ULONG)"The quick brown fox..." },
  { Tag4, (ULONG)obj },
  { TAG_END, 0 }
};


Passing Taglists to Functions

Building taglists as global or local variables is not very convenient. That is why almost every MorphOS API function getting a taglist as one of its arguments has two forms. The first one accepts a pointer to the taglist. The second one is a variadic args macro building the taglist on the program stack dynamically. Such function pairs are named according to one of the two conventions:

  1. SomeFunctionA() takes a pointer to a taglist, SomeFunction() builds the taglist dynamically.
  2. SomeFunctionTagList() takes a pointer to a taglist, SomeFunctionTags() builds the taglist dynamically.


Continuing the above example, one can pass the example taglist to SomeFunction() in two ways:

SomeFunctionTagList(mytaglist);
SomeFunctionTags(Tag1, 2837, Tag2, (ULONG)&x, Tag3, (ULONG)"The quick brown fox...", Tag4,
 (ULONG)obj, TAG_END);

Of course in the second case variable mytaglist need not be defined anywhere. Note that for every taglist-based function the taglist is the last argument. There may be some plain arguments before it. It is a common practice to omit ti_Data for the terminator (it is ignored anyway).


Special Tags

Special tags are used to avoid copying large blocks of data when taglists are manipulated. As taglists are plain arrays, operations like removing or inserting elements or merging taglists involve copying their large fragments. Special tags allow for doing these operations by only changing a few tags. Of course this comes at a price. Taglists with special tags (other than TAG_END) may be only manipulated with a set of functions from utility.library. All these functions detect special tags and interpret them properly. It should be noted, that even if taglists are created by hand and do not contain special tags, such tags may appear later, when taglists are passed to system functions. It is therefore recommended to always use utility.library tools to manipulate taglists, especially the taglist iterator named NextTagItem().

TAG_END

I've mentioned this tag already, it works as a taglist terminator. Its value is 0.

TAG_IGNORE

This special tag is used to logically remove a tag item from a taglist without copying large blocks of data. When a tag is to be removed, it is replaced with TAG_IGNORE, tags after it need not to be moved. Any taglist processing function will ignore the tag and advance to the next one immediately.

Taglists 3.png

TAG_MORE

This tag is used for joining taglists. Joining of plain arrays would require counting their size, allocating memory area and copying all tagitems. It is much simplier with TAG_MORE. If one wants to append taglist B at the end of taglist A, he replaces the terminator of taglist A with TAG_MORE and sets the data field to the address of taglist B. Then, when any taglist manipulation function sees this special tag, it fetches the next tag from the address. In effect taglists are logically joined into one.

Taglists 2.png

TAG_SKIP

This rarely used tag is used to logically remove larger fragments from a taglist. When encountered, it directs the taglist iterator to skip a specified number of following tags. The number is specified in the data field of TAG_SKIP.


Taglists 4.png


When using TAG_SKIP one should take care to not skip past the taglist terminator. Taglist processing functions are not protected against such a bug and will try to process random data. Results are unpredictable.

Traversing Taglists With NextTagItem()

NextTagItem() function from the utlilty.library is the basic taglist iterator. It should be always used to traverse taglists, as it supports all the special tags discussed above. Every call to the function returns an address of the next item in the taglist, of course special items are "executed" and not returned. The argument of the function is an address of variable being a pointer to the current tag. This pointer is initialized manually to the start of taglist and should not be modified later. A basic loop for taglist browsing is organized as follows:

struct TagItem *tag, *tagptr;

tagptr = my_taglist;  // initialization to the start of taglist to be traversed

while (tag = NextTagItem(&tagptr))
{
  /* do something with 'tag' */
}

NextTagItem() returns NULL when it encounters TAG_END terminator.


Taglists Processing

Any processing of taglists can be done with just NextTagItem() and manual manipulation of tagitems. On the other hand utility.library provides set of functions for typical operations on taglists. Using them saves time and avoids bugs, which may be possibly made in manually written code. Advanced taglist processing is usually not needed in typical aplications, but comes handy, when taglists are used for data storage.

It should be noted, that this article does not replace autodocs of utility.library functions. These autodocs have been completely rewritten and heavily extended in the MorphOS SDK (compared to Amiga ones). This text is just an overview of possibilities, while autodocs provide detailed descriptions.

Finding Tags and Data

Finding particular tag in a taglist is relatively simple task. It can be done with FindTagItem() function. It returns the first occurence of specified tag. If a tag is expected to appear multiple times, the function may be called in a loop, taking its result as an argument for the next call.

struct TagItem *tag = my_taglist;

while (tag = FindTagItem(SOME_TAG, tag))
{
  /* do something with 'tag' */

  tag++;  /* start from the next tag after the one just found */
}

As all other functions described below, FindTagItem() automatically interprets special tags, so they are never returned to the caller. That is why incrementing the tag pointer in the above loop is always safe.

In many cases (like constructors of MUI classes) we are interested in data of particular tag, and also want a default value if the tag is not present in a taglist. These two operations may be merged into one call to GetTagData() function. It takes a tag, the default value and a taglist as arguments. If the tag is found, its data is returned, if not the result is the default value. Here is an example, the code:

if (tag = FindTagItem(SOME_TAG, taglist)) value = tag->ti_Data; 
else value = DEF_VALUE;

reduces to

value = GetTagData(SOME_TAG, DEF_VALUE, taglist);

The last function related to finding is TagInArray(). It does not operate on a taglist however. It simply checks if a tag is found in a zero-terminated array of tags. This is not a taglist, just plain array of ULONG-s.

Creation and Copying

The simplest way of creating a tagilst is to declare it as an array, be it global or local variable. It has been shown already in this article. If a taglist should be allocated dynamically, one can use any general purpose memory allocation function provided by exec.library. An example with AllocVec():

struct TagItem *taglist;    /* taglist with 7 items (including terminator) */

taglist = (struct TagItem*)AllocVec(7 * sizeof(struct TagItem), MEMF_ANY);

The utility.library provides also a special function for taglist allocation named AllocateTagItems(), which is in fact nothing more than a wrapper on general purpose memory allocation function. It also clears allocated memory block, so it can be treated as an empty taglist (as TAG_END is 0). Taglist allocated with AllocateTagItems() must be freed by FreeTagItems().

struct TagItem *fresh_list;

if (fresh_list = AllocateTagItems(2))
{
  fresh_list[0].ti_Tag = SOME_TAG;   /* Need not to initialize terminator at [1] */
  fresh_list[0].ti_Data = 123456;    /* as the taglist is cleared to all zeros. */

  /* more opreations on 'fresh_list' */

  FreeTagItems(fresh_list);
}

A copy of an existing taglist can be created with CloneTagItems(). This function is not a simple wrapper, as taglist containing special tags cannot be copied as a memory block. For doing a proper copy, the function traverses the taglist (using NextTagItem()) two times. First it counts plain tags. Then it allocates a memory block for a clone and does second pass over original, copying tags one by one. Order of tags is preserved, so when we iterate both original and clone with NextTagItem(), we get the same results. The difference is that the clone is stripped off of all special tags. All tags ignored and skipped are not cloned, also separate fragments chained with TAG_MORE are merged into one continuous block. The only special tag present in the clone is, of course, TAG_END. As CloneTagItems() allocates memory for the copy, the copy must be freed later with FreeTagItems(). CloneTagItems() is in fact a memory allocation function, so its result must be always checked against NULL.

struct TagItem *original, *copy;

if (copy = CloneTagItems(original))
{
  /* do something with 'copy' */

  FreeTagItems(copy);
}

There is also a taglist copying function for making copy into a ready buffer. It is a bit misleadingly named RefreshTagItemClones(). In fact the function works the same as CloneTagItems(), but uses a destination buffer passed as an argument instead of allocating a new one. The copy is stripped off of ignored and skipped tags and also defragmented, the same as for CloneTagItems(). It should be noted that RefreshTagItemClones() does not check destination buffer size, so if the original list does not fit into the buffer, memory will be corrupted.

struct TagItem copy[7];  /* I'm so sure original list has no more than 6 non-special items */
                         /* + terminator. If it is longer, total armageddon will be started. */

RefreshTagItemClones(copy, original);

The function cannot fail, so it has no result.

Filtering and Mapping

Utility.library provides four functions for filtering and mapping tags and data. Two of them, FilterTagItems() and MapTagItems() operate on tags considering only their identifiers. The second two, ApplyTagChanges() and FilterTagChanges() operate on tags also considering their data.

Filtering Tags by Identifier

The FilterTagItems() function allows for removing tags from taglist according to specified array of tags. Tag removal is done by changing its identifier to TAG_IGNORE. This function has two modes of operation. The array of tags may work either as array of allowed ones (all others are filtered out), or array of ones to be rejected (all others are kept). It is illustrated on the diagram below. Tag identifiers have been marked with colors. On the left there is original taglist (the same in both diagrams). The center one is an array of filtered tags. It does not contain TagItem structures, but tag identifiers, so it is just an array of ULONGs. The original taglist after performing the operation is shown on the right.


Taglists5.png


Filtering a taglist with TAGFILTER_AND mode. All tags not present in the tag array are rejected.


Taglists6.png


Filtering a taglist with TAGFILTER_NOT mode. All tags present in the tag array are rejected.
Tag Mapping

In the process of mapping, a set of tag identifiers is being replaced by another set according to a map. Tags not included in the map may be either kept or removed. The map is just another taglist. Tag identifiers of map taglist are source identifiers. Data of map taglist are destination identifiers. Mapping replaces source identifiers with destination ones. The same as for tag filtering, tag mapping does not change data fields. Diagrams below show the process of mapping. A set of "red tags" is mapped to set of "blue tags", while other are either leaved intact or removed, depending on argument passed to MapTags(). Source taglist before mapping is shown on the left, the map in center and source after mapping on the right.


Taglists7.png


Mapping red tags to blue tags with keeping unmapped ones (MAP_KEEP_NOT_FOUND).


Taglists8.png


Mapping red tags to blue tags with removing unmapped ones (MAP_REMOVE_NOT_FOUND).


An interesting possibility is mapping ordinary tags to special ones. Special tags cannot be mapped, because MapTags() reads the source with NextTagItem(), so special tags are interpreted rather than mapped. On the other hand, any ordinary tag may be mapped to a special one. The most useful special map destination is TAG_IGNORE, which makes MapTags() to act the same as FilterTags(), but with the first function one can map some tags and filter other ones in one go. Mapping to TAG_SKIP and TAG_MORE is not that common, but may be useful in special cases (for example replacing a single tag with taglist or switching off tag sequences). MapTags() does not allow for mapping to TAG_END, TAG_IGNORE is used instead.

Filtering Tags Data

The ApplyTagChanges() function works on two taglists. The first one is the master, the second is a list of changes. Every tag in master is searched for in the list of changes. If the tag is found, the data of the tag in master list is set to the data in change list. The FilterTagChanges() is more advanced version of the previous function. Applying changes to the master is optional in this case (controlled by a function argument). Additionally the change list is modified. Tags specifying no change, it means found in both the master and the change list and having the same data, are removed from the change list.

The diagram below illustrates principle of working of ApplyTagChanges(). Tags marked green are not subject of changes, and are left intact. Tags in red are possible subject of changes. Data, which are different in the master list and change list are marked blue.


Taglists9.png


Similar diagram explains FilterTagChanges(). It is a bit more complicated, as the function has additional parameter. This boolean parameter controls applying changes to the master taglist. Then one can only learn about changes without applying them.


Taglists10.png


Data Conversion

Data conversion functions automate data exchange between taglists and C-style structures and bitfields. Bitfields are easier, so let's start with them.

Bitfields

A bitfield is very effective way of storing boolean (TRUE or FALSE) information. One boolean TagItem ocuppies 64 bits in memory, while it may be stored in one bit in a bitfield. Also checking a bit value is much faster than checking tag value. The PackBoolTags() function converts a taglist into bitfield. Parameters of this function are a bit similar to MapTags(). There is a source taglist and a map taglist. The map however contains bitmasks for the bitfield instead of tag identifiers. Boolean taglist processing starts from setting the initial bitfield value (taken from the first argument of the function). Then every tag in the map is searched in the source taglist. If found, its boolean value is read and controls how corresponding bitmask is applied to the bitfield. For TRUE, all 1-s from the mask are set in the bitfield. For FALSE all 1-s from the mask are cleared in the bitfield.


Taglists11.png


In the usual case, the mask of every tag has only one bit set, so tags are just mapped to single bits of the bitfield. It is possible to use multi-bit masks for tags, even overlapping ones. It should be noted however, that overlapped masks make the final result dependent on order of tags in the source taglist.

One may be surprised that utility.library does not provide an inverse function to PackBoolTags(). There are a few reasons for this:

  • For the general case of multi-bit and possibly overlapping masks, the inverse function is not possible.
  • For the strict case of single bit, not overlapping masks, the inverse is easy to write.
  • To some extent, the task of unpacking may be performed with UnpackStructureTags().
Structures

A pair of two functions: PackStructureTags() and UnpackStructureTags() can be used to convert a taglist to almost arbitrarily defined data structure and vice versa. Because they were poorly documented in AmigaOS 3.x autodocs, they were rarely used in the past. MorphOS SDK contains a rewritten from scratch documentation of these functions, based on gathered knowledge and extensive tests of MorphOS and AmigaOS 3.x implementations (which are fully compatible).

The PackStructureTags() function moves data stored in a taglist to a data structure. The process is controlled by pack table. The table contains encoded storing instructions, like offsets and sizes of structure fields as well as source tag identifiers. Supported field widths are single bits as well as 8-, 16- and 32-bit fields, signed or unsigned. The pack table is best constructed using a set of construction macros. These macros automatically generate pack instructions, field offsets etc. taking into account things like field padding inserted by the compiler.

Tag identifiers for data to be packed are assumed to start from a defined value called tag base. Tag base definition should be the first entry of the pack table and is generated with PACK_STARTTABLE macro. It is also assumed, that offsets between the base and all the tags are no bigger than 1024. Then only offsets are stored in the pack table (using 10 bits out of 32 available), so packing instruction for one tag fits in a single ULONG. If, for some reason, data tags are not in a single offset range, the tag base may be changed with PACK_NEWOFFSET macro. Finally PACK_ENDTABLE macro ends the packing table.