Taglists

From MorphOS Library

Revision as of 14:43, 26 October 2011 by Krashan (talk | contribs) (Tag Mapping: Grammar.)

Grzegorz Kraszewski


This article in other languages: Polish


What Is a Taglist?

A taglist is an array of "key-value" pairs. The key is always a 32-bit integer number and is called a tag. The value also has a size of 32 bits. It may be an integer, or a pointer to any structure or object. Taglists are commonly used in the MorphOS API for passing a variable number of arguments, usually sets of attributes with their values. A few special key values are used for array termination, concatenation and item skipping. A set of functions in the utility.library may be used for taglist traversing, filtering, searching, copying etc.

Every pair in a taglist is a TagItem structure, defined in <utility/tagitem.h>:

struct TagItem
{
  ULONG ti_Tag;
  ULONG ti_Data;
};

To be self descriptive, every taglist, being just a plain C array, must have some kind of termination. It is very similar to the string null-termination idea. The termination is done with a TagItem having its ti_Tag field set to TAG_END (which happens to be defined as zero). The ti_Data value of the terminating TagItem is ignored, it is usually set to zero too. The illustration below shows a simple taglist:


Taglists1.png


This taglist may be created with the following code:

double x = 3.438872763e+17;
Object *obj = NewObject( /* ... */ );

struct TagItem mytaglist[] = {
  { Tag1, 2837 },
  { Tag2, (ULONG)&x },
  { Tag3, (ULONG)"The quick brown fox..." },
  { Tag4, (ULONG)obj },
  { TAG_END, 0 }
};


Passing Taglists to Functions

Building taglists as global or local variables is not very convenient. That is why almost every MorphOS API function getting a taglist as one of its arguments has two forms. The first one accepts a pointer to the taglist. The second one is a variadic args macro building the taglist on the program stack dynamically. Such function pairs are named according to one of the two conventions:

  1. SomeFunctionA() takes a pointer to a taglist, SomeFunction() builds the taglist dynamically.
  2. SomeFunctionTagList() takes a pointer to a taglist, SomeFunctionTags() builds the taglist dynamically.


Continuing the above example, one can pass the example taglist to SomeFunction() in two ways:

SomeFunctionTagList(mytaglist);
SomeFunctionTags(Tag1, 2837, Tag2, (ULONG)&x, Tag3, (ULONG)"The quick brown fox...", Tag4,
 (ULONG)obj, TAG_END);

Of course in the second case variable mytaglist need not be defined anywhere. Note that for every taglist-based function the taglist is the last argument. There may be some plain arguments before it. It is a common practice to omit ti_Data for the terminator (it is ignored anyway).


Special Tags

Special tags are used to avoid copying large blocks of data when taglists are manipulated. As taglists are plain arrays, operations like removing or inserting elements or merging taglists involve copying their large fragments. Special tags allow for doing these operations by only changing a few tags. Of course this comes at a price. Taglists with special tags (other than TAG_END) may be only manipulated with a set of functions from utility.library. All these functions detect special tags and interpret them properly. It should be noted, that even if taglists are created by hand and do not contain special tags, such tags may appear later, when taglists are passed to system functions. It is therefore recommended to always use utility.library tools to manipulate taglists, especially the taglist iterator named NextTagItem().

TAG_END (TAG_DONE)

I've mentioned this tag already, it works as a taglist terminator. There is no difference between TAG_END and TAG_DONE, both are defined as 0 and may be used alternatively. The first name is used through this article just for being shorter.

TAG_IGNORE

This special tag is used to logically remove a tag item from a taglist without copying large blocks of data. When a tag is to be removed, it is replaced with TAG_IGNORE, tags after it need not to be moved. Any taglist processing function will ignore the tag and advance to the next one immediately.

Taglists 3.png

TAG_MORE

This tag is used for joining taglists. Joining of plain arrays would require counting their size, allocating memory area and copying all tagitems. It is much simplier with TAG_MORE. If one wants to append taglist B at the end of taglist A, he replaces the terminator of taglist A with TAG_MORE and sets the data field to the address of taglist B. Then, when any taglist manipulation function sees this special tag, it fetches the next tag from the address. In effect taglists are logically joined into one.

Taglists 2.png

TAG_SKIP

This rarely used tag is used to logically remove larger fragments from a taglist. When encountered, it directs the taglist iterator to skip a specified number of following tags. The number is specified in the data field of TAG_SKIP.


Taglists 4.png


When using TAG_SKIP one should take care to not skip past the taglist terminator. Taglist processing functions are not protected against such a bug and will try to process random data. Results are unpredictable.

TAG_USER

While TAG_USER is not really a special tag, it is listed here, because it can be found in many source codes. It is defined as $80000000, so it divides tag space into halves. The idea behind it is that system tags (tags being part of the system API) are located in the lower half, while tags defined by applications are always in the upper half. In fact this is only a convention, not enforced and not conistently followed (for example all MUI tags or even Intuition tags are in the user half), as non-special tags are always interpreted in some context. Of course values of special tags must not be used, so values 0 - 3 are reserved (and treating values up to at least 255 as reserved is very good idea too).

Traversing Taglists With NextTagItem()

NextTagItem() function from the utlilty.library is the basic taglist iterator. It should be always used to traverse taglists, as it supports all the special tags discussed above. Every call to the function returns an address of the next item in the taglist, of course special items are "executed" and not returned. The argument of the function is an address of variable being a pointer to the current tag. This pointer is initialized manually to the start of taglist and should not be modified later. A basic loop for taglist browsing is organized as follows:

struct TagItem *tag, *tagptr;

tagptr = my_taglist;  // initialization to the start of taglist to be traversed

while (tag = NextTagItem(&tagptr))
{
  /* do something with 'tag' */
}

NextTagItem() returns NULL when it encounters TAG_END terminator.


Taglists Processing

Any processing of taglists can be done with just NextTagItem() and manual manipulation of tagitems. On the other hand utility.library provides set of functions for typical operations on taglists. Using them saves time and avoids bugs, which may be possibly made in manually written code. Advanced taglist processing is usually not needed in typical aplications, but comes handy, when taglists are used for data storage.

It should be noted, that this article does not replace autodocs of utility.library functions. These autodocs have been completely rewritten and heavily extended in the MorphOS SDK (compared to Amiga ones). This text is just an overview of possibilities, while autodocs provide detailed descriptions.

Finding Tags and Data

Finding particular tag in a taglist is relatively simple task. It can be done with FindTagItem() function. It returns the first occurence of specified tag. If a tag is expected to appear multiple times, the function may be called in a loop, taking its result as an argument for the next call.

struct TagItem *tag = my_taglist;

while (tag = FindTagItem(SOME_TAG, tag))
{
  /* do something with 'tag' */

  tag++;  /* start from the next tag after the one just found */
}

As all other functions described below, FindTagItem() automatically interprets special tags, so they are never returned to the caller. That is why incrementing the tag pointer in the above loop is always safe.

In many cases (like constructors of MUI classes) we are interested in data of particular tag, and also want a default value if the tag is not present in a taglist. These two operations may be merged into one call to GetTagData() function. It takes a tag, the default value and a taglist as arguments. If the tag is found, its data is returned, if not the result is the default value. Here is an example, the code:

if (tag = FindTagItem(SOME_TAG, taglist)) value = tag->ti_Data; 
else value = DEF_VALUE;

reduces to

value = GetTagData(SOME_TAG, DEF_VALUE, taglist);

The last function related to finding is TagInArray(). It does not operate on a taglist however. It simply checks if a tag is found in a zero-terminated array of tags. This is not a taglist, just plain array of ULONG-s.

Creation and Copying

The simplest way of creating a tagilst is to declare it as an array, be it global or local variable. It has been shown already in this article. If a taglist should be allocated dynamically, one can use any general purpose memory allocation function provided by exec.library. An example with AllocVec():

struct TagItem *taglist;    /* taglist with 7 items (including terminator) */

taglist = (struct TagItem*)AllocVec(7 * sizeof(struct TagItem), MEMF_ANY);

The utility.library provides also a special function for taglist allocation named AllocateTagItems(), which is in fact nothing more than a wrapper on general purpose memory allocation function. It also clears allocated memory block, so it can be treated as an empty taglist (as TAG_END is 0). Taglist allocated with AllocateTagItems() must be freed by FreeTagItems().

struct TagItem *fresh_list;

if (fresh_list = AllocateTagItems(2))
{
  fresh_list[0].ti_Tag = SOME_TAG;   /* Need not to initialize terminator at [1] */
  fresh_list[0].ti_Data = 123456;    /* as the taglist is cleared to all zeros. */

  /* more opreations on 'fresh_list' */

  FreeTagItems(fresh_list);
}

A copy of an existing taglist can be created with CloneTagItems(). This function is not a simple wrapper, as taglist containing special tags cannot be copied as a memory block. For doing a proper copy, the function traverses the taglist (using NextTagItem()) two times. First it counts plain tags. Then it allocates a memory block for a clone and does second pass over original, copying tags one by one. Order of tags is preserved, so when we iterate both original and clone with NextTagItem(), we get the same results. The difference is that the clone is stripped off of all special tags. All tags ignored and skipped are not cloned, also separate fragments chained with TAG_MORE are merged into one continuous block. The only special tag present in the clone is, of course, TAG_END. As CloneTagItems() allocates memory for the copy, the copy must be freed later with FreeTagItems(). CloneTagItems() is in fact a memory allocation function, so its result must be always checked against NULL.

struct TagItem *original, *copy;

if (copy = CloneTagItems(original))
{
  /* do something with 'copy' */

  FreeTagItems(copy);
}

There is also a taglist copying function for making copy into a ready buffer. It is a bit misleadingly named RefreshTagItemClones(). In fact the function works the same as CloneTagItems(), but uses a destination buffer passed as an argument instead of allocating a new one. The copy is stripped off of ignored and skipped tags and also defragmented, the same as for CloneTagItems(). It should be noted that RefreshTagItemClones() does not check destination buffer size, so if the original list does not fit into the buffer, memory will be corrupted.

struct TagItem copy[7];  /* I'm so sure original list has no more than 6 non-special items */
                         /* + terminator. If it is longer, total armageddon will be started. */

RefreshTagItemClones(copy, original);

The function cannot fail, so it has no result.

Filtering and Mapping

Utility.library provides four functions for filtering and mapping tags and data. Two of them, FilterTagItems() and MapTagItems() operate on tags considering only their identifiers. The second two, ApplyTagChanges() and FilterTagChanges() operate on tags also considering their data.

Filtering Tags by Identifier

The FilterTagItems() function allows for removing tags from taglist according to specified array of tags. Tag removal is done by changing its identifier to TAG_IGNORE. This function has two modes of operation. The array of tags may work either as array of allowed ones (all others are filtered out), or array of ones to be rejected (all others are kept). It is illustrated on the diagram below. Tag identifiers have been marked with colors. On the left there is original taglist (the same in both diagrams). The center one is an array of filtered tags. It does not contain TagItem structures, but tag identifiers, so it is just an array of ULONGs. The original taglist after performing the operation is shown on the right.


Taglists5.png


Filtering a taglist with TAGFILTER_AND mode. All tags not present in the tag array are rejected.


Taglists6.png


Filtering a taglist with TAGFILTER_NOT mode. All tags present in the tag array are rejected.
Tag Mapping

In the process of mapping, a set of tag identifiers is being replaced by another set according to a map. Tags not included in the map may be either kept or removed. The map is just another taglist. Tag identifiers of map taglist are source identifiers. Data fields of map taglist are destination identifiers. Mapping replaces source identifiers with destination ones. The same as for tag filtering, tag mapping does not change data fields. Diagrams below show the process of mapping. A set of "red tags" is mapped to set of "blue tags", while other are either left intact or removed, depending on argument passed to MapTags(). Source taglist before mapping is shown on the left, the map in center and source after mapping on the right.


Taglists7.png


Mapping red tags to blue tags with keeping unmapped ones (MAP_KEEP_NOT_FOUND).


Taglists8.png


Mapping red tags to blue tags with removing unmapped ones (MAP_REMOVE_NOT_FOUND).


An interesting possibility is mapping ordinary tags to special ones. Special tags cannot be mapped, because MapTags() reads the source with NextTagItem(), so special tags are interpreted rather than mapped. On the other hand, any ordinary tag may be mapped to a special one. The most useful special map destination is TAG_IGNORE, which makes MapTags() to act the same as FilterTags(), but with the first function one can map some tags and filter other ones in one go. Mapping to TAG_SKIP and TAG_MORE is not that common, but may be useful in special cases (for example replacing a single tag with taglist or switching off tag sequences). MapTags() does not allow for mapping to TAG_END, TAG_IGNORE is used instead.

Filtering Tags Data

The ApplyTagChanges() function works on two taglists. The first one is the master, the second is a list of changes. Every tag in master is searched for in the list of changes. If the tag is found, the data of the tag in master list is set to the data in change list. The FilterTagChanges() is more advanced version of the previous function. Applying changes to the master is optional in this case (controlled by a function argument). Additionally the change list is modified. Tags specifying no change, it means found in both the master and the change list and having the same data, are removed from the change list.

The diagram below illustrates principle of working of ApplyTagChanges(). Tags marked green are not subject of changes, and are left intact. Tags in red are possible subject of changes. Data, which are different in the master list and change list are marked blue.


Taglists9.png


Similar diagram explains FilterTagChanges(). It is a bit more complicated, as the function has additional parameter. This boolean parameter controls applying changes to the master taglist. Then one can only learn about changes without applying them.


Taglists10.png


Data Conversion

Data conversion functions automate data exchange between taglists and C-style structures and bitfields. Bitfields are easier, so let's start with them.

Bitfields

A bitfield is very effective way of storing boolean (TRUE or FALSE) information. One boolean TagItem ocuppies 64 bits in memory, while it may be stored in one bit in a bitfield. Also checking a bit value is much faster than checking tag value. The PackBoolTags() function converts a taglist into bitfield. Parameters of this function are a bit similar to MapTags(). There is a source taglist and a map taglist. The map however contains bitmasks for the bitfield instead of tag identifiers. Boolean taglist processing starts from setting the initial bitfield value (taken from the first argument of the function). Then every tag in the map is searched in the source taglist. If found, its boolean value is read and controls how corresponding bitmask is applied to the bitfield. For TRUE, all 1-s from the mask are set in the bitfield. For FALSE all 1-s from the mask are cleared in the bitfield.


Taglists11.png


In the usual case, the mask of every tag has only one bit set, so tags are just mapped to single bits of the bitfield. It is possible to use multi-bit masks for tags, even overlapping ones. It should be noted however, that overlapped masks make the final result dependent on order of tags in the source taglist.

One may be surprised that utility.library does not provide an inverse function to PackBoolTags(). There are a few reasons for this:

  • For the general case of multi-bit and possibly overlapping masks, the inverse function is not possible.
  • For the strict case of single bit, not overlapping masks, the inverse is easy to write.
  • To some extent, the task of unpacking may be performed with UnpackStructureTags().
Structures

A pair of two functions: PackStructureTags() and UnpackStructureTags() can be used to convert a taglist to almost arbitrarily defined data structure and vice versa. Because they were poorly documented in AmigaOS 3.x autodocs, they were rarely used in the past. MorphOS SDK contains a rewritten from scratch documentation of these functions, based on gathered knowledge and extensive tests of MorphOS and AmigaOS 3.x implementations (which are fully compatible).

The PackStructureTags() function moves data stored in a taglist to a data structure. The process is controlled by pack table. The table contains encoded storing instructions, like offsets and sizes of structure fields as well as source tag identifiers. Supported field widths are single bits as well as 8-, 16- and 32-bit fields, signed or unsigned. The pack table is best constructed using a set of construction macros defined in <utility/pack.h>. These macros automatically generate pack instructions, field offsets etc. taking into account things like field padding inserted by the compiler. If for some reason macros cannot be used, pack table entry layout is described in the autodoc for PackStructureTags().

Tag identifiers for data to be packed are assumed to start from a defined value called tag base. Tag base definition should be the first entry of the pack table and is generated with PACK_STARTTABLE macro. It is also assumed, that offsets between the base and all the tags are no bigger than 1024. Then only offsets are stored in the pack table (using 10 bits out of 32 available), so packing instruction for one tag fits in a single ULONG. If, for some reason, data tags are not in a single offset range, the tag base may be changed with PACK_NEWOFFSET macro. Finally PACK_ENDTABLE macro ends the packing table.

Structure entries, except of single bits, are defined with PACK_ENTRY macro, taking 5 arguments. The first one is the tag base, the second is the identifier of the tag containing data for the field. Tag offset is calculated automatically from them, saving the need of manual bit masking. The next two arguments describe the field, they are structure C type name and field name. Offset of the field is then determined automatically. This takes into account implicit field padding and also works properly for structures declared with __attribute__ ((packed)), commonly used for data loaded from files (like headers or records). The last, fifth argument contains control flags, defining field size.

struct Foo
{
  ULONG Bar;
  WORD Blah;
  BYTE Xyz;
};

#define FOO_TAGBASE  TAG_USER
#define FOO_BAR      (FOO_TAGBASE + 0)
#define FOO_BLAH     (FOO_TAGBASE + 1)
#define FOO_XYZ      (FOO_TAGBASE + 2)

ULONG PackTable[] = {
  PACK_STARTTABLE(FOO_TAGBASE),
  PACK_ENTRY(FOO_TAGBASE, FOO_BAR,  Foo, Bar,  PKCTRL_ULONG | PKCTRL_PACKUNPACK),
  PACK_ENTRY(FOO_TAGBASE, FOO_BLAH, Foo, Blah, PKCTRL_WORD | PKCTRL_PACKUNPACK),
  PACK_ENTRY(FOO_TAGBASE, FOO_XYZ,  Foo, Xyz,  PKCTRL_BYTE | PKCTRL_PACKUNPACK),
  PACK_ENDTABLE
}

The example above defines some simple structure, a set of tags and a pack table for data exchange between a taglist and the structure. The same table may be used in both directions, as it has been declared as fully symetric (PKCTRL_PACKUNPACK). It is possible to use some tags only while packing taglist to structure (PKCTRL_PACK) or only while unpacking back to taglist (PKCTRL_UNPACK). Desired default values may be stored in the structure before unpacking. If a tag assigned to some field is not found in the source taglist, the field is not changed. If the same table is used for packing and unpacking, it is important to take care of signedness of fields. It does not matter for packing, as ti_Data field is just truncated by discarding higher bytes when stored in 8- or 16-bit field. When unpacking however, fields marked as signed will be properly sign-extended, while unsigned ones will be extended with zeros.

Single bits of fields may be packed and unpacked too. Unlike PackBoolTags(), PackStructureTags() does not allow for multi-bit masks, because pack instruction stores just bit number. On the other hand structure function supports also 8- and 16-bit fields. It is also more flexible in bit handling. Bit may be set according to the value of a boolean tag (with optional negation), it may be also set or cleared based just on the tag presence (tag data is ignored in this case). It is controlled by PACK_BIT / PACK_FLIPBIT and PSTF_EXISTS flags:

  • PACK_BIT – sets bit according to tag data,
  • PACK_FLIPBIT – sets bit according to negation of tag data,
  • PACK_BIT | PSTF_EXISTS – sets bit if tag is present in the taglist,
  • PACK_FLIPBIT | PSTF_EXISTS – clears bit if tag is present in the taglist.

Bits are defined using PACK_BYTEBIT, PACK_WORDBIT and PACK_LONGBIT macros depending on the width of bitfield. Here is a simple example:

struct FooBits
{
  ULONG LongField;    // bitfields should be always
  UBYTE ShortField;   // declared as unsigned
};

#define FOOBITS_TAGBASE   TAG_USER
#define FOO_LongA         (FOOBITS_TAGBASE + 0)
#define FOO_LongB         (FOOBITS_TAGBASE + 1)
#define FOO_ShortC        (FOOBITS_TAGBASE + 2)
#define FOO_ShortD        (FOOBITS_TAGBASE + 3)

ULONG PackTable[] = {
{
  PACK_STARTTABLE(FOOBITS_TAGBASE),
  PACK_LONGBIT(FOOBITS_BASE, FOO_LongA, FooBits, LongField,
    PKCTRL_PACKUNPACK | PKCTRL_BIT, 23),
  PACK_LONGBIT(FOOBITS_BASE, FOO_LongB, FooBits, LongField,
    PKCTRL_PACKUNPACK | PKCTRL_FLIPBIT, 19),
  PACK_BYTEBIT(FOOBITS_BASE, FOO_ShortC, FooBits, ShortField,
    PKCTRL_PACKUNPACK | PKCTRL_BIT | PSTF_EXISTS, 6),
  PACK_BYTEBIT(FOOBITS_BASE, FOO_ShortD, FooBits, ShortField,
    PKCTRL_PACKUNPACK | PKCTRL_FLIPBIT | PSTF_EXISTS, 2),
 PACK_ENDTABLE
};

The code above assigns tags to bits as follows (note that bit 0 is always the least significant one):

  • FOO_LongA is assigned to bit 23 of LongField, tag data is stored in bit,
  • FOO_LongB is assigned to bit 19 of LongField, tag data is negated and then stored in bit,
  • FOO_ShortC is assigned to bit 6 of ShortField, bit is set if tag is found,
  • FOO_ShortD is assigned to bit 2 of ShortField, bit is cleared if tag is found.

Not all of these possibilities are symmetrically available while unpacking to a taglist. UnpackStructureTags() ignores PSTF_EXISTS and PKCTRL_FLIPBIT and simply sets tag data field to FALSE or TRUE depending on the bit.